
Number Theory Notes

This document provides an introduction to the mathematical concepts covered in the course MATH 312: An Introduction to Number Theory. It begins with a table of contents listing 34 topics covered in the course, ranging from the integers and mathematical induction to cryptography. The introduction defines some basic notation and properties regarding integers, such as closure, commutativity, and the well-ordering principle. It also provides examples to illustrate these properties.


MATH 312

AN INTRODUCTION TO
NUMBER THEORY

NUNO FREITAS and ADELA GHERGA

November 22, 2017


The University of British Columbia
An Introduction to Number Theory is an introductory undergraduate text designed to initi-
ate the study of the integer numbers together with some of their elementary properties and
applications. The exposition is aimed at students who have some (but not necessarily much)
experience with reading and writing proofs. Specifically, it will be assumed that students are
familiar with basic techniques of mathematical proof and reasoning such as induction and
proof by contradiction. We will introduce basic concepts of number theory, such as prime
numbers, factorization, and congruences, as well as some of their applications, particularly
to cryptography.

Contents

1. The Integers
2. Mathematical Induction
3. Divisibility
4. Representation of Integers
5. The Greatest Common Divisor
6. The Euclidean Algorithm
7. Prime Numbers
8. The Fundamental Theorem of Arithmetic
9. The Least Common Multiple
10. Primes of the Form 4k + 3
11. Linear Diophantine Equations
12. Irrational Numbers
13. Congruences
14. Fast Modular Exponentiation
15. The Congruence Method
16. Linear Congruences in One Variable
17. The Chinese Remainder Theorem
18. Applications of Congruences
19. Wilson’s Theorem
20. Fermat’s Little Theorem
21. Primality Testing, Pseudoprimes, and Carmichael Numbers
22. Euler’s φ-Function and Euler’s Theorem
23. Arithmetic Functions
24. Formulas for the Functions φ, τ and σ
25. Perfect Numbers and Mersenne Primes
26. Primitive Roots
27. Primitive Roots for Primes
28. Index Arithmetic and Discrete Logarithms
29. Nonlinear Diophantine Equations
30. Pythagorean Triples
31. Fermat’s Last Theorem and Infinite Descent
32. Fermat Factorization
33. The Pollard p − 1 Factorization Method
34. Cryptography

1. The Integers

In this section, we will recall some basic notation and properties of the integers. Throughout
the remainder of this book, we will use this information as axioms without further expla-
nation. The properties listed here are not necessarily independent; that is, it is possible to
prove some of these properties from the others.
We will denote the integer numbers by
Z = {. . . , −3, −2, −1, 0, 1, 2, 3, 4, . . .}.

Given a, b ∈ Z we will write a + b for their sum and a ⋅ b for their product. We also denote the
product by ab. We will write a − b to denote a + (−b).
The positive integers are the numbers
Z>0 = {1, 2, 3, 4, . . .}.
Given two integers a, b, we will say that a is greater than b if a − b ∈ Z>0 ; this is denoted a > b.
We also say that b is smaller than a, writing b < a.
The integers satisfy the following properties:

Closure: If a, b ∈ Z then a + b ∈ Z and ab ∈ Z.

Commutativity: If a, b ∈ Z then a + b = b + a and ab = ba.

Associativity: If a, b, c ∈ Z then (a + b) + c = a + (b + c) and (ab)c = a(bc).

Distributivity: If a, b, c ∈ Z then a(b + c) = ab + ac.

Identity: If a ∈ Z then a + 0 = 0 + a = a and 1 ⋅ a = a ⋅ 1 = a.

Additive inverse: For all a ∈ Z there is a unique element x ∈ Z such that a + x = 0. We
denote x by −a and call it the additive inverse of a.

Cancellation law for multiplication: If a, b, c ∈ Z, c ≠ 0 and ca = cb then a = b.

Trichotomy law: If a ∈ Z then exactly one of the following holds:


(i) a < 0, (ii) a = 0, (iii) a > 0.
Closure for the positive integers: If a and b are positive integers then a + b and a ⋅ b are
positive integers.

Example 1.1. We will use the axioms above to show that, for all a ∈ Z, we have a ⋅ 0 = 0.
Indeed, since 0 is the identity element for addition we have 0 + 0 = 0, hence
a ⋅ 0 = a ⋅ (0 + 0) = a ⋅ 0 + a ⋅ 0
by the distributivity property. Adding the inverse of a ⋅ 0 to both sides and applying the
associativity law, we obtain
0 = a ⋅ 0 − a ⋅ 0 = a ⋅ 0 + (a ⋅ 0 − a ⋅ 0) = a ⋅ 0 + 0 = a ⋅ 0,
where the first equality comes from the definition of inverse and the last by the identity
element of addition. Thus a ⋅ 0 = 0, as desired.

The Well Ordering Principle (WOP): Every non-empty subset S ⊂ Z>0 of the positive
integers contains a least element.
That is, given a non-empty subset S of Z>0 , there is an m ∈ S such that m ≤ n for all n ∈ S. The
following examples illustrate this property.
Examples 1.2.
(1) Given S = Z>0 , the smallest element of S is 1.
(2) Let S be the set of positive even integers. Then 2 is the smallest element of S.
(3) Let S be the set of all prime numbers. Then 2 is the smallest element of S.
Remark 1.3. The WOP does not hold for other sets of numbers like Q or R. Indeed, consider
the set
S = {1/n ∶ n ∈ Z>0 }.
This is a non-empty set of positive elements which does not have a smallest element in either
Q or R.

For an integer k we will write Z>k to denote the set of integers greater than k. Similarly, we
will also write Z≥k , Z<k , Z≤k or Z≠k to denote the sets with the natural analogous definition.
We conclude this section by defining the rational numbers Q as fractions of integers. For-
mally, we have the following definition.
Definition 1.4. Consider pairs (p, q) where p, q ∈ Z and q ≠ 0 and the following equivalence
relation on them: two such pairs (p, q) and (p′ , q ′ ) are equivalent if and only if pq ′ = p′ q in Z.
We define the rational numbers Q as the set of equivalence classes for this relation. The
equivalence class of (p, q) is denoted by the fraction p/q.

Exercises.
Exercise 1.5. Let a, b ∈ Z with ab = 0. Show that either a = 0 or b = 0.
Exercise 1.6. Let a, b, c ∈ Z with a < b. Show that a + c < b + c.
Exercise 1.7. Let a, b, c ∈ Z with a < b and c > 0. Show that ac < bc.

2. Mathematical Induction

In this section, we recall the first and second principles of mathematical induction, an im-
portant proof technique. The first principle is also known as weak induction while the second
is also known as strong induction, because it seems to use a stronger assumption (compare
parts (b) of Theorems 1 and 2). However, they are equivalent; we shall see in Theorem 3
that they are both equivalent to the Well Ordering Principle.
Theorem 1 (First Principle of Mathematical Induction).
Let m be an integer and S a subset of Z satisfying
(a) m ∈ S and
(b) if k ≥ m and k ∈ S then k + 1 ∈ S.
Then S contains all integers greater or equal to m, that is S = Z≥m .

Proof. Let m = 1 and S be as in the statement. Assume, for contradiction, that there exists an
integer greater than or equal to m = 1 which is not in S. Then the set of positive integers which
are not in S is non-empty. By the WOP, this set has a minimal element s. Since 1 ∈ S, we
have that s ≠ 1 so that s is a positive integer strictly greater than 1. Now, the integer s − 1
is a positive integer smaller than s. By minimality of s, we must have that s − 1 ∈ S. Then,
from property (b), it follows that s = (s − 1) + 1 ∈ S, a contradiction.
Finally, let m be any integer and S as in the statement; we will reduce this situation to the
case m = 1 and apply the previous paragraph. Indeed, consider the translated set
S ′ = {k − m + 1 ∣ k ∈ S}.
Since m ∈ S we have 1 ∈ S ′ . Let k ≥ 1 be in S ′ . Then, there is k0 ≥ m in S such that
k = k0 − m + 1; since S satisfies (b) we have k0 + 1 ∈ S, hence k + 1 = (k0 + 1) − m + 1 is in
S ′ . We conclude that S ′ satisfies (a) and (b) with m = 1, so by the first part of the proof we
have S′ = Z≥1 . Then S = Z≥m , as desired. □
Theorem 2 (Second Principle of Mathematical Induction).
Let m be an integer and S a subset of Z satisfying
(a) m ∈ S and
(b) if k ≥ m and {m, m + 1, m + 2, . . . , k} ⊂ S then k + 1 ∈ S.
Then S contains all integers greater or equal to m, that is S = Z≥m .

Proof. Let m and S be as in the statement. Consider the set T of all the integers n ≥ m such
that every integer in the interval [m, n] belongs to S. In particular, m ∈ T .
Suppose that n ∈ T . Then {m, m + 1, m + 2, . . . , n} ⊂ S by definition of T . The hypotheses on
S now imply n + 1 ∈ S. Then all the integers in the interval [m, n + 1] are in S, so n + 1 ∈ T
also.
We have shown that T satisfies hypothesis (a) and (b) of Theorem 1, so T contains all the
integers greater than or equal to m. Since T ⊂ S the same is true for S, as desired. □
In practice, induction is used to show that a statement is true for all integers ≥ m for some
m ∈ Z≥0 . This is done via two steps. The first step is the base case, where we prove the
desired statement is true for n = m. The second is the induction step, where, assuming that
the desired statement is true for n (the induction hypothesis), we prove it is also true for n+1.
Letting S denote the set of integers n ≥ m for which the statement is true, these two steps
show S satisfies (a) and (b) of Theorem 1. Hence, S must contain all the integers ≥ m.
Note that the only difference between strong and weak induction is that in strong induction,
the induction hypothesis becomes that the statement is true for all integers in the interval
[m, n]. We now give a few examples.
Proposition 1. Let n ∈ Z>0 . The sum of the first n positive integers is given by the formula
∑_{k=1}^{n} k = n(n + 1)/2.

Proof. Let S ⊂ Z be the set of positive integers for which the formula holds. We use weak
induction on S.
Base: Let n = 1. Then ∑_{k=1}^{1} k = 1 = 1 ⋅ (1 + 1)/2, so 1 ∈ S.
Hypothesis: Suppose that the formula holds for some n ≥ 1, that is n ∈ S.
Step: We will show that n + 1 ∈ S. We have that
∑_{k=1}^{n+1} k = ∑_{k=1}^{n} k + (n + 1) = n(n + 1)/2 + n + 1
= (n(n + 1) + 2n + 2)/2 = (n^2 + 3n + 2)/2 = (n + 1)(n + 2)/2
= (n + 1)((n + 1) + 1)/2,
where in the second equality we have used the induction hypothesis. This shows that n + 1 ∈ S.
We conclude that S satisfies both properties (a) with m = 1 and (b) in Theorem 1, therefore
S = Z≥1 , as desired. □
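As a quick sanity check of Proposition 1 (not a substitute for the induction proof), the following Python sketch compares the closed formula with the direct sum for small values of n. The function name sum_first is only illustrative and does not come from the notes.

```python
def sum_first(n: int) -> int:
    """Closed formula n(n + 1)/2 for 1 + 2 + ... + n (Proposition 1)."""
    return n * (n + 1) // 2

# Compare the formula with the direct sum for the first few values of n.
for n in range(1, 50):
    assert sum_first(n) == sum(range(1, n + 1))
```

A finite check like this can only catch mistakes in the formula; the statement for all n still needs the induction argument above.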
Proposition 2. Consider the geometric series ∑_{k=0}^{∞} a r^k where a, r ∈ R with r ≠ 1. For
n ∈ Z≥0 , its partial sum is given by the formula
∑_{k=0}^{n} a r^k = a (1 − r^{n+1})/(1 − r).

Proof. We will use weak induction to prove the case a = 1.
Base: Let n = 0. Then ∑_{k=0}^{0} r^k = 1 = (1 − r^{0+1})/(1 − r).
Hypothesis: Suppose that the formula holds for some n ≥ 0.
Step: We have that
∑_{k=0}^{n+1} r^k = ∑_{k=0}^{n} r^k + r^{n+1} = (1 − r^{n+1})/(1 − r) + r^{n+1}
= (1 − r^{n+1} + (1 − r) r^{n+1})/(1 − r) = (1 − r^{n+1} + r^{n+1} − r^{n+2})/(1 − r) = (1 − r^{n+2})/(1 − r).
It follows that the formula holds for a = 1. Finally, multiply the above formula on both sides
by a to obtain the general case. □
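The partial sum formula of Proposition 2 can also be checked numerically. The sketch below uses exact rational arithmetic (Python's fractions module) to avoid rounding issues; the name geometric_partial_sum and the sample values a = 3, r = 1/2 are assumptions chosen only for illustration.

```python
from fractions import Fraction

def geometric_partial_sum(a: Fraction, r: Fraction, n: int) -> Fraction:
    """Closed formula a(1 - r^(n+1))/(1 - r), valid only for r != 1."""
    return a * (1 - r ** (n + 1)) / (1 - r)

a, r = Fraction(3), Fraction(1, 2)
for n in range(0, 10):
    direct = sum(a * r ** k for k in range(n + 1))  # term-by-term sum
    assert direct == geometric_partial_sum(a, r, n)
```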
Example 2.1 (Strong induction). We will show that any amount of postage more than one
cent can be formed using just two-cent and three-cent stamps.
Base: For n = 2 cents we use one two-cent stamp; for n = 3 we use one three-cent stamp.
Hypothesis: Let n ≥ 3 and suppose that every amount of postage in the interval [2, n] can
be formed using two-cent and three-cent stamps.
Step: We can write n + 1 = 2 + (n − 1). By the induction hypothesis, n − 1 ≥ 2 can be obtained
by using just two-cent and three-cent stamps; then n + 1 can be obtained by using those
stamps plus an extra two-cent stamp.
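The induction in Example 2.1 is constructive, so it translates directly into a recursive procedure. The following Python sketch is one possible rendering, with the hypothetical function name stamps; it returns how many two-cent and three-cent stamps to use.

```python
def stamps(n: int):
    """Return (twos, threes) with 2*twos + 3*threes == n, for n >= 2."""
    if n < 2:
        raise ValueError("postage must be at least 2 cents")
    if n == 2:
        return (1, 0)              # base case: one two-cent stamp
    if n == 3:
        return (0, 1)              # base case: one three-cent stamp
    twos, threes = stamps(n - 2)   # induction hypothesis applied to n - 2
    return (twos + 1, threes)      # add one extra two-cent stamp

print(stamps(7))   # (2, 1), since 7 = 2 + 2 + 3
```

The two base cases in the code correspond to the two base cases of the proof: the recursive step jumps back by 2, so both n = 2 and n = 3 are needed to start.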
Theorem 3. The Well Ordering Principle is equivalent to both weak and strong induction.

Proof. We have seen in the previous proofs that WOP implies weak induction and that weak
induction implies strong induction. We will now show that strong induction implies WOP.
Suppose there exists a subset S ⊂ Z>0 without a smallest element. We will prove that S is empty. Let
T be the complement of S in Z>0 . That is, T is the set of positive integers which are not in
S.
Clearly, 1 ∈ T ; otherwise 1 ∈ S would be the smallest element of S, since 1 is the smallest positive
integer. Let n ≥ 1 and write Sn = {1, . . . , n}. Suppose Sn ⊂ T , hence Sn ∩ S = ∅. Therefore, if
n + 1 ∈ S then n + 1 is the smallest integer in S, which is a contradiction to our hypothesis, so
n + 1 ∉ S. We conclude that n + 1 ∈ T , hence T satisfies properties (a) and (b) of Theorem 2 with m = 1
and we have T = Z>0 by strong induction. Thus S = ∅, as desired. □

Exercises.
Exercise 2.2. Let n ∈ Z>0 . Use induction to show that the sum of the squares of the first n
positive integers is given by the formula
∑_{k=1}^{n} k^2 = n(n + 1)(2n + 1)/6.
Exercise 2.3. Define a sequence x_1 , x_2 , . . . by
x_1 = 1,
x_2 = 3,
x_{k+2} = 3x_{k+1} − 2x_k for k ≥ 1.
Use induction to show that for all positive integers n, we have x_n = 2^n − 1.

3. Divisibility

Definition 3.1. Let a, b ∈ Z. We say that a divides b, denoted a ∣ b, if there exists c ∈ Z such
that b = a ⋅ c. In this case, we also say that a is a factor of b and b is a multiple of a. We
write a ∤ b to denote that a does not divide b.
Examples 3.2.
(1) 3 ∣ 6 since 6 = 3 ⋅ c with c = 2.
(2) 3 ∤ 5 since 5 = 3 ⋅ c forces c = 5/3 ∉ Z.
(3) a = 1 ⋅ a = (−1)(−a) ⇒ ±1, ±a divide a.
(4) 0 = a ⋅ 0 ⇒ a ∣ 0 ∀a ∈ Z.
(5) b = 0 ⋅ c ⇒ b = 0. That is, only 0 is divisible by 0.
Remark 3.3. From (4) in the example above, it follows that 0 ∣ 0. However, the fraction 0/0
makes no sense as a rational number.

In subsequent sections, we will need some simple properties of divisibility, which we now
state and prove.
Proposition 3. Let a, b, c be integers. If a ∣ b and b ∣ c then a ∣ c.

Proof. By hypothesis, b = ab0 and c = bc0 for some b0 , c0 ∈ Z. Then,
c = bc0 = (ab0 )c0 = a(b0 c0 ), so a ∣ c. □
Proposition 4. Let a, b, c, m, n ∈ Z. If c ∣ a and c ∣ b then c ∣ ma + nb.

Proof. By hypothesis, a = ca0 and b = cb0 for some a0 , b0 ∈ Z. Then,
ma + nb = m(ca0 ) + n(cb0 ) = c(ma0 + nb0 ), so c ∣ ma + nb. □

An expression of the form ma + nb as in the previous proposition is called an (integral) linear
combination of a and b. A very useful consequence of this proposition is that if a and b are
divisible by an integer d, then their sum and difference are also divisible by d.
Corollary 1. Let a, b, c ∈ Z. If c ∣ a and c ∣ b, then c ∣ a + b and c ∣ a − b.

The above examples and definitions give a precise meaning to exact division, but we are
also used to division with a remainder. For example, we know that 4 fits into 15 exactly 3
times, with a remainder of 3. The following theorem makes this idea precise.
Theorem 4 (Division Algorithm/Division with Remainder). Let n, a ∈ Z with a > 0. Then
there exist unique q, r ∈ Z such that
n = q ⋅ a + r, where 0 ≤ r < a.
We say that q is the quotient and r the remainder of the division of n by a.
Proof. The proof consists of two parts: first we find some q, r with the desired properties
and then we prove they are unique with those properties. Let n, a ∈ Z with a > 0.
Existence. Consider
T = {m ∈ Z≥0 ∣ m = n − ka for some k ∈ Z},
that is, the set of non-negative integers that differ from n by a multiple of a. Note that
T ≠ ∅ because we can choose a negative k with large enough absolute value to make m ≥ 0.
If 0 ∈ T , let r = 0; otherwise T ⊂ Z>0 and, by the WOP, T has a smallest element, which we
call r. In either case r is the smallest element of T and, in particular,
we have 0 ≤ r = n − qa for some q ∈ Z by definition of T .
This gives our candidates for r and q. It remains to show that r < a. Indeed, suppose
r ≥ a > 0. Then
r − a = n − (q + 1)a ≥ 0 ⟹ r − a ∈ T with 0 ≤ r − a < r.
This contradicts the fact that r is the smallest element of T , hence r < a, as desired.
Uniqueness. Suppose, in addition to q and r, there exist q ′ and r′ with
n = q ′ ⋅ a + r′ where 0 ≤ r′ < a.
Now
n = q ⋅ a + r = q ′ ⋅ a + r′ with 0 ≤ r, r′ < a.
Suppose first r = r′ . Then (q − q′ )a = 0 and, since a ≠ 0, we have q = q′ . We conclude that,
to finish the proof, we need to show r = r′ . We proceed by contradiction.
Suppose WLOG that r′ > r. Then r′ − r = (q − q′ )a > 0 is a positive multiple of a, so r′ − r ≥ a. But then
a > r′ ≥ r′ − r ≥ a,
a contradiction. Hence r = r′ , as desired. □
Corollary 2. Let n, a ∈ Z with a > 0. Then a ∣ n if and only if the remainder of the division
of n by a is r = 0.
Examples 3.4.
(1) Take n = 6 and a = 3; then 6 = 2 ⋅ 3 + 0. That is q = 2 and r = 0.
(2) Take n = 30 and a = 7; then 30 = 4 ⋅ 7 + 2. That is q = 4 and r = 2.
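For readers who want to experiment, the quotient and remainder of Theorem 4 can be computed with Python's built-in divmod. The wrapper name divide below is hypothetical; the sketch relies on the fact that, for a positive divisor, Python's floor division returns a remainder in the range 0 ≤ r < a, including when n is negative.

```python
def divide(n: int, a: int):
    """Return (q, r) with n == q*a + r and 0 <= r < a, for a > 0."""
    if a <= 0:
        raise ValueError("a must be positive")
    q, r = divmod(n, a)   # for a > 0, floor division gives 0 <= r < a
    return q, r

print(divide(6, 3))     # (2, 0)
print(divide(30, 7))    # (4, 2)
print(divide(-30, 7))   # (-5, 5), since -30 = (-5)*7 + 5
```

The last call illustrates that the remainder stays non-negative for negative n, as the theorem requires.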

Exercises.
Exercise 3.5. Let n ∈ Z. Prove that 5 ∣ n^5 − n.
Exercise 3.6. Let n ∈ Z. Is it true that 4 ∣ n^4 − n? Provide a proof or counterexample.

4. Representation of Integers

When writing down integers, we typically use decimal notation, also called ‘base 10’. For
example, 37465 means that
37465 = 3 ⋅ 10^4 + 7 ⋅ 10^3 + 4 ⋅ 10^2 + 6 ⋅ 10 + 5 ⋅ 10^0 .
We have also heard that computers use ‘base 2,’ representing numbers by using only a series
of 1’s and 0’s. For instance, 36 can be written as
36 = 1 ⋅ 2^5 + 0 ⋅ 2^4 + 0 ⋅ 2^3 + 1 ⋅ 2^2 + 0 ⋅ 2^1 + 0 ⋅ 2^0 ,
or more simply, 36 = (100100)_2 . Here, (100100)_2 is the collection of the coefficients in front
of the powers of 2 in the representation of 36. Of course, these coefficients can only be
either 1 or 0, since, for example 2 = 1 ⋅ 2^1 + 0 ⋅ 2^0 , or more concisely, 2 = (10)_2 .
The following theorem makes this notion precise and shows that other bases, aside from 10
and 2, may also be used.
Theorem 5. Let b ≥ 2 be an integer. Every positive integer n can be uniquely written in
base b. More precisely, n can be uniquely written as
n = a_k b^k + a_{k−1} b^{k−1} + ⋯ + a_1 b + a_0 with a_k ≠ 0 and 0 ≤ a_i ≤ b − 1 for i = 0, . . . , k.
We denote n in base b by (a_k a_{k−1} . . . a_1 a_0 )_b .

Proof. The proof uses strong induction and is divided into two parts: first we prove the
existence of a description of n as in the statement and then we show that such a description
is unique. Note that in the base step of induction, we must consider several cases. This
is because these cases are all independent of each other, in contrast to the induction step,
where each case follows from previous cases.
Existence.
Base: For the cases n = 1, . . . , b − 1, take k = 0 and a_0 = n.
Hypothesis: There exists a description in base b for all positive integers less than n.
Step: Suppose n ≥ b. We divide n by b using the division algorithm (Theorem 4) to obtain
n = b ⋅ q + a_0 with 0 ≤ a_0 ≤ b − 1.
Note that 1 ≤ q < n, so by the induction hypothesis
q = c_s b^s + c_{s−1} b^{s−1} + ⋯ + c_0 with c_s ≠ 0 and 0 ≤ c_i ≤ b − 1.
Then
n = b ⋅ q + a_0 = b(c_s b^s + c_{s−1} b^{s−1} + ⋯ + c_0 ) + a_0 = c_s b^{s+1} + ⋯ + c_0 b + a_0 .
Taking k = s + 1 and a_i = c_{i−1} for i = 1, . . . , k we obtain the claimed description.
Uniqueness. Suppose
(4.1) n = a_k b^k + ⋯ + a_1 b + a_0 = a′_l b^l + ⋯ + a′_1 b + a′_0 with a_k , a′_l ≠ 0 and 0 ≤ a_i , a′_i ≤ b − 1.

Base: If n ≤ b − 1, then k = l = 0 and a′_0 = a_0 = n.
Hypothesis: There is a unique description in base b for all positive integers less than n.
Step: Suppose n ≥ b. From equation (4.1) we see that both a_0 and a′_0 satisfy the properties
of being the remainder of the division of n by b. From the uniqueness part of Theorem 4 we
conclude a_0 = a′_0 . We thus have
(n − a_0 )/b = a_k b^{k−1} + ⋯ + a_2 b + a_1 = a′_l b^{l−1} + ⋯ + a′_2 b + a′_1
and since 1 ≤ (n − a_0 )/b < n, by the induction hypothesis, we have k = l and
a_k = a′_l , . . . , a_1 = a′_1 ,
completing the proof. □
Example 4.2. Let n = 67.
(1) n = (67)_10 since 67 = 6 ⋅ 10 + 7 ⋅ 10^0
(2) n = (232)_5 since 67 = 2 ⋅ 5^2 + 3 ⋅ 5 + 2 ⋅ 5^0
(3) n = (2111)_3 since 67 = 2 ⋅ 3^3 + 1 ⋅ 3^2 + 1 ⋅ 3 + 1 ⋅ 3^0
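The existence part of the proof of Theorem 5 is effectively an algorithm: divide by b, record the remainder as the last digit, and repeat with the quotient. The Python sketch below follows that idea iteratively; the function name to_base is an assumption made for illustration.

```python
def to_base(n: int, b: int) -> list:
    """Digits [a_k, ..., a_1, a_0] of n in base b, most significant first."""
    if n <= 0 or b < 2:
        raise ValueError("need n > 0 and b >= 2")
    digits = []
    while n > 0:
        n, a0 = divmod(n, b)   # n = b*q + a0 with 0 <= a0 <= b - 1
        digits.append(a0)      # remainders appear least significant first
    return digits[::-1]

print(to_base(67, 10))  # [6, 7]
print(to_base(67, 5))   # [2, 3, 2]
print(to_base(67, 3))   # [2, 1, 1, 1]
print(to_base(36, 2))   # [1, 0, 0, 1, 0, 0]
```

Returning a list of digits rather than a string keeps the sketch valid for bases larger than 10, where a single character per digit would require a choice of symbols.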

Exercises.
Exercise 4.3. Convert (101001000)_2 to base 7.
Exercise 4.4. Consider a balance scale with 2 pans, A and B. Let k ∈ Z>0 . Show that any
weight not exceeding 2^k − 1 that is placed on pan A may be measured by placing on pan B
a subset of the weights {1, 2, 2^2 , . . . , 2^{k−1} }.

5. The Greatest Common Divisor

Definition 5.1. Let a, b ∈ Z not both zero. The greatest common divisor of a and b is the
largest positive integer d such that d ∣ a and d ∣ b. We denote it by (a, b) or gcd(a, b). When
(a, b) = 1, we say that a and b are coprime.

Since the set of positive divisors of n and −n are the same, it is clear that
(−a, b) = (a, −b) = (−a, −b) = (a, b).
Therefore, we can restrict the coming discussion to non-negative integers a, b.
Examples 5.2.
(1) The set of all common divisors of 12 and 18 is {1, 2, 3, 6}, so (12, 18) = gcd(12, 18) = 6.
(2) For all a > 0, since a ∣ 0, we have (a, 0) = a.

The following theorem provides an alternative description of the greatest common divisor.
Theorem 6. Let a, b ∈ Z not both zero. Then (a, b) is the smallest positive integral linear
combination of a and b. That is, the smallest positive integer of the form
ax + by where x, y ∈ Z.

Proof. Let a, b ∈ Z be non-negative and not both 0. Consider


I = {ax + by ∣ x, y ∈ Z},
that is, the set of all integral linear combinations of a, b. Clearly, ±a, ±b ∈ I, so that I contains
positive integers. By the WOP, let d be the smallest such positive integer. By definition
of I, we have d = ax0 + by0 , for some x0 , y0 ∈ Z. To complete the proof, we must show that
d = (a, b).
Let n be a common divisor of a and b. By Proposition 4, we have n ∣ ax + by for all x, y ∈ Z,
i.e. n divides all the elements of I. In particular, n ∣ d. Choosing n = (a, b) we conclude that
(a, b) ≤ d.
Suppose for a moment that d divides all the elements of I. In particular, d ∣ a and d ∣ b so
d ≤ (a, b) by definition of (a, b). Since (a, b) ≤ d ≤ (a, b), we may conclude that d = (a, b).
To finish the proof, we will now show that d divides every element of I. Indeed, let
n ∈ I. Dividing n by d with the division algorithm gives
n = q ⋅ d + r, 0 ≤ r < d, q ∈ Z.

We claim that I is closed under addition and multiplication by scalars. More precisely, if
x, y ∈ I and λ ∈ Z then x + y and λx belong to I. In particular, −qd ∈ I and, since r = n + (−qd)
with n ∈ I, we conclude r ∈ I. Thus r = 0, otherwise I would contain a positive number
smaller than d. It follows that d ∣ n, as desired.
We now prove the claim. Let x = ax0 + bx1 and y = ay0 + by1 be elements of I and λ ∈ Z.
Then,
x + y = ax0 + bx1 + ay0 + by1 = a(x0 + y0 ) + b(x1 + y1 ) ∈ I
and
λx = λ(ax0 + bx1 ) = a(λx0 ) + b(λx1 ) ∈ I,
as claimed. □
Examples 5.3.

(1) (5, 7) = 10 ⋅ 5 + (−7) ⋅ 7 = 1.


(2) (3, 15) = 3 ⋅ 6 + (−1) ⋅ 15 = 3 ⋅ 1 + 0 ⋅ 15 = 3.

The second example illustrates that the values x, y given by Theorem 6 are not unique. In
what follows, we look at how to find all the possible choices for x, y.
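Theorem 6 can be explored experimentally by listing linear combinations ax + by over a small box of coefficients and taking the smallest positive value. The Python sketch below does this by brute force; the function name and the search bound are assumptions for illustration, and the bound must be large enough for some pair (x, y) realizing the gcd to lie inside the box.

```python
from math import gcd

def smallest_positive_combination(a: int, b: int, bound: int = 20) -> int:
    """Smallest positive value of a*x + b*y with |x|, |y| <= bound."""
    values = [a * x + b * y
              for x in range(-bound, bound + 1)
              for y in range(-bound, bound + 1)]
    return min(v for v in values if v > 0)

# Every value a*x + b*y is a multiple of gcd(a, b), so the minimum can
# never be smaller than the gcd; it equals the gcd once the box is big enough.
for a, b in [(5, 7), (3, 15), (12, 18), (803, 154)]:
    assert smallest_positive_combination(a, b, bound=30) == gcd(a, b)
```

This search is quadratic in the bound and only meant as an illustration of the statement, not as a practical way to compute greatest common divisors.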
Corollary 3. Let a, b ∈ Z not both zero. If (a, b) = 1, then ax + by = 1 for some x, y ∈ Z.

Proof. This is a direct consequence of Theorem 6. □


Corollary 4. Let a, b ∈ Z not both zero. Every common divisor of a and b divides (a, b).

Proof. Let d be a common divisor of a, b. We have a = da′ and b = db′ for some a′ , b′ ∈ Z.
From Theorem 6, there are x, y ∈ Z such that
(a, b) = ax + by = d(a′ x) + d(b′ y) = d(a′ x + b′ y) ⟹ d ∣ (a, b). □
Corollary 5. Let a, a′ , b, b′ ∈ Z satisfy a = da′ and b = db′ where d = (a, b). Then (a′ , b′ ) = 1.

Proof. From Theorem 6, there are x0 , y0 ∈ Z such that
ax0 + by0 = d ⇐⇒ d(a′ x0 ) + d(b′ y0 ) = d ⟹ a′ x0 + b′ y0 = 1.
By Theorem 6, (a′ , b′ ) is the smallest positive integer that can be written as a linear
combination of a′ and b′ . It follows that (a′ , b′ ) = 1. □

The notion of greatest common divisor also makes sense for more than two integers.
Definition 5.4. Let a1 , a2 , . . . , an ∈ Z not all zero. The greatest common divisor of a1 , . . . , an ,
denoted gcd(a1 , . . . , an ) or (a1 , . . . , an ), is the largest positive integer dividing all the ai .
When (a1 , . . . , an ) = 1 we say that the ai are coprime and if (ai , aj ) = 1 for all i ≠ j, we say
they are pairwise coprime.
Example 5.5. Note that 7 ∤ 24, 7 ∤ 60 and 7 is the unique prime factor of 49 = 7^2 , so
(24, 60, 49) = 1; however, (24, 60) = 12. This shows that 24, 60 and 49 are coprime but not
pairwise coprime.
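The distinction in Example 5.5 is easy to check by machine. The Python sketch below uses math.gcd from the standard library; the helper names gcd_many and pairwise_coprime are hypothetical. It computes the gcd of several integers by folding the two-argument gcd over the list, which relies on the identity gcd(a1, . . . , an) = gcd(a1, gcd(a2, . . . , an)) shown in the proof of Proposition 5.

```python
from functools import reduce
from itertools import combinations
from math import gcd

def gcd_many(*nums: int) -> int:
    """gcd of several integers, folding the two-argument gcd."""
    return reduce(gcd, nums)

def pairwise_coprime(*nums: int) -> bool:
    """True when every pair of distinct entries has gcd 1."""
    return all(gcd(x, y) == 1 for x, y in combinations(nums, 2))

print(gcd_many(24, 60, 49))          # 1
print(gcd(24, 60))                   # 12
print(pairwise_coprime(24, 60, 49))  # False
```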

We complete this section with a useful generalization of Corollary 4.


Proposition 5. Let k ≥ 2 and a1 , . . . , ak ∈ Z≠0 . Every common divisor of all the ai divides
their greatest common divisor (a1 , . . . , ak ).
Proof. We will use induction.
Base: Suppose k = 2. The result follows directly by Corollary 4.
Hypothesis: Assume the result is true for any set of k ≥ 2 non-zero integers.
Step: Suppose d ∣ ai where ai ≠ 0 for i = 1, . . . , k + 1. In particular, d divides a1 and, by the
induction hypothesis, d divides gcd(a2 , a3 , . . . , ak+1 ). Then, by Corollary 4 it also divides
gcd(a1 , gcd(a2 , a3 , . . . , ak+1 )).

To complete the proof, we will now show that gcd(a1 , gcd(a2 , a3 , . . . , ak+1 )) = gcd(a1 , a2 , . . . , ak+1 ).
Indeed, let d0 be a common divisor of all the ai . In particular, by the induction hypothesis,
d0 divides gcd(a2 , . . . , ak+1 ), and since d0 also divides a1 , we have that
d0 ∣ gcd(a1 , gcd(a2 , a3 , . . . , ak+1 ))
by Corollary 4. By choosing d0 = gcd(a1 , . . . , ak+1 ), we conclude that
gcd(a1 , gcd(a2 , a3 , . . . , ak+1 )) ≥ gcd(a1 , a2 , . . . , ak+1 )
by definition of the GCD. Conversely, suppose d0 divides a1 and gcd(a2 , . . . , ak+1 ); hence
d0 also divides a2 , . . . ak+1 . It follows that d0 divides gcd(a1 , a2 , . . . , ak+1 ), and as above, we
conclude that
gcd(a1 , gcd(a2 , a3 , . . . , ak+1 )) ≤ gcd(a1 , a2 , . . . , ak+1 ).
□

Exercises.
Exercise 5.6. Let a, b be coprime integers not both zero. Determine with proof the possible
values of (a^2 + b^2 , a + b).
Note: You may use the fact that every integer has a prime divisor (Lemma 2 below).

6. The Euclidean Algorithm

In Example 5.2 we have computed (12, 18) = 6 by first listing all common divisors of 12
and 18. We now compute (18, 30) in the same way. Indeed, the positive divisors of 30 are
{1, 2, 3, 5, 6, 10, 15, 30} and those of 18 are {1, 2, 3, 6, 9, 18}. Then their common divisors are
{1, 2, 3, 6}, therefore (30, 18) = 6. Though this method is effective, it is not practical when
dealing with large numbers. In this section, we introduce the Euclidean algorithm which,
given integers a, b, allows one to compute (a, b) in an efficient way. We will first need the
following auxiliary result.
Lemma 1. Let a, b ∈ Z with a ≥ b > 0. Suppose
a = q ⋅ b + r with q, r ∈ Z.
Then (a, b) = (b, r).

Proof. Let c be a common divisor of a and b. Since r = a − q ⋅ b, we have c ∣ r by Proposition 4,
so that c is a common divisor of b and r. Conversely, suppose c is a common divisor of b and
r. Then c divides a = b ⋅ q + r, thus it is also a common divisor of a and b. We conclude that
a, b and b, r have the same set of common divisors. Then (a, b) = (b, r), as desired. □
Theorem 7 (The Euclidean Algorithm). Let a, b ∈ Z with a ≥ b > 0. By the Division
Algorithm, there exist q1 , r1 ∈ Z such that
a = bq1 + r1 , 0 ≤ r1 < b.
If r1 > 0, there exist (again, by the Division Algorithm) q2 , r2 ∈ Z such that
b = r1 q2 + r2 , 0 ≤ r2 < r1 .
If r2 > 0, there exist (again, by the Division Algorithm) q3 , r3 ∈ Z such that
r1 = r2 q3 + r3 , 0 ≤ r3 < r2 .
Continue this process. Then rn = 0 for some n. Moreover, if n > 1, then (a, b) = rn−1 ; if
n = 1, we have (a, b) = b.

Proof. Note that the ri ≥ 0 satisfy b > r1 > r2 > r3 > . . . . If rn ≠ 0 for all n, then we obtain an infinite
strictly decreasing sequence of positive integers, which is impossible. Thus rn = 0 for some
n ≥ 1.
Suppose n > 1. Repeated applications of Lemma 1 give
(a, b) = (b, r1 ) = (r1 , r2 ) = . . . = (rn−1 , rn ) = (rn−1 , 0) = rn−1 ,
as desired. If n = 1, then r1 = 0 and b ∣ a, thus (a, b) = b. □
Example 6.1. For a = 30, b = 18, we compute
(1) 30 = 1 ⋅ 18 + 12, so r1 = 12 ⟹ (30, 18) = (18, 12).
(2) 18 = 1 ⋅ 12 + 6, so r2 = 6 ⟹ (18, 12) = (12, 6).
(3) 12 = 2 ⋅ 6 + 0, so r3 = 0 ⟹ (12, 6) = (6, 0) = 6.
Thus (30, 18) = 6, as expected.
Example 6.2. Compute (803, 154):
(1) 803 = 154 ⋅ 5 + 33, so r1 = 33 ⟹ (803, 154) = (154, 33).
(2) 154 = 33 ⋅ 4 + 22, so r2 = 22 ⟹ (154, 33) = (33, 22).
(3) 33 = 22 ⋅ 1 + 11, so r3 = 11 ⟹ (33, 22) = (22, 11).
(4) 22 = 11 ⋅ 2 + 0, so r4 = 0 ⟹ (22, 11) = (11, 0) = 11.
Thus (803, 154) = 11.
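The procedure of Theorem 7 can be sketched in a few lines of Python. The function name euclid is illustrative; each pass of the loop replaces the pair (a, b) by (b, r), which preserves the gcd by Lemma 1, and the loop stops when the remainder reaches 0.

```python
def euclid(a: int, b: int) -> int:
    """Greatest common divisor of a >= b > 0 via repeated division."""
    if not (a >= b > 0):
        raise ValueError("expected a >= b > 0")
    while b != 0:
        a, b = b, a % b    # Lemma 1: (a, b) = (b, r) where r = a mod b
    return a               # last non-zero remainder (or b itself if b | a)

print(euclid(30, 18))    # 6
print(euclid(803, 154))  # 11
```

Python's standard library already provides math.gcd; the loop above is only meant to mirror the steps of the theorem.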

Recall that (a, b) is the smallest positive integer of the form ax+by with x, y ∈ Z (Theorem 6).
The following method, called back substitution, allows one to find x0 , y0 ∈ Z such that
(a, b) = ax0 + by0 .
This method is also known as the extended Euclidean algorithm since it mostly consists of
reversing the steps of the Euclidean algorithm. We illustrate this with a few examples.
Example 6.3. In Example 6.2, we computed (803, 154) = 11. We can reverse the steps of
the Euclidean algorithm as follows:
(803, 154) = 11 = 33 − 22 = 33 − (154 − 33 ⋅ 4) = 33 ⋅ 5 − 154
= (803 − 154 ⋅ 5) ⋅ 5 − 154 = 803 ⋅ 5 − 154 ⋅ 26
= 803 ⋅ 5 + 154 ⋅ (−26),
hence
(803, 154) = 803 ⋅ 5 + 154 ⋅ (−26) ⟹ x0 = 5 and y0 = −26.
Example 6.4. Compute (154, 35) and x0 , y0 satisfying 154x0 + 35y0 = (154, 35).
First apply the Euclidean Algorithm:
(1) 154 = 4 ⋅ 35 + 14, so r1 = 14 ⟹ (154, 35) = (35, 14);
(2) 35 = 2 ⋅ 14 + 7, so r2 = 7 ⟹ (35, 14) = (14, 7);
(3) 14 = 2 ⋅ 7 + 0, so r3 = 0 ⟹ (14, 7) = (7, 0) = 7,
to conclude (154, 35) = 7. Now we apply back substitution:
(154, 35) = 7 = 35 − 2 ⋅ 14 = 35 − 2 ⋅ (154 − 4 ⋅ 35)
= 35 ⋅ 9 + 154 ⋅ (−2) = 154 ⋅ (−2) + 35 ⋅ 9.
That is x0 = −2 and y0 = 9.
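Back substitution can be organized recursively: if d = b ⋅ x + r ⋅ y and r = a − q ⋅ b, then d = a ⋅ y + b ⋅ (x − q ⋅ y). The Python sketch below implements this under the assumption that a and b are non-negative and not both zero; the function name extended_euclid is hypothetical, and the coefficients it returns are one valid choice among infinitely many.

```python
def extended_euclid(a: int, b: int):
    """Return (d, x, y) with d = (a, b) = a*x + b*y, for a, b >= 0 not both 0."""
    if b == 0:
        return a, 1, 0                 # (a, 0) = a = a*1 + 0*0
    q, r = divmod(a, b)                # a = q*b + r
    d, x, y = extended_euclid(b, r)    # d = b*x + r*y
    return d, y, x - q * y             # substitute r = a - q*b

print(extended_euclid(803, 154))  # (11, 5, -26)
print(extended_euclid(154, 35))   # (7, -2, 9)
```

Checking the first output by hand: 803 ⋅ 5 − 154 ⋅ 26 = 4015 − 4004 = 11.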
Proposition 6. Let a and b be non-zero integers satisfying a ∣ b and b ∣ a.
Then, a = b or a = −b. In particular, if a and b are positive, then a = b.

Proof. By hypothesis we have a = bk and b = ak′ for some integers k, k′ ∈ Z. Then,
a = bk = akk′ ⟹ kk′ = 1 ⟹ k = k′ = 1 or k = k′ = −1
and we have a = b or a = −b accordingly. The last statement is clear since a = −b would force
a and b to have different signs. □

Exercises.
Exercise 6.5. Use the Euclidean algorithm to prove that 7 has no expression as an integral
linear combination of 18209 and 19043.
Exercise 6.6. Use the Euclidean algorithm and back substitution to find two rational
numbers with denominators 11 and 13, respectively, and a sum of 7/143.

7. Prime Numbers

The prime numbers function as the ‘building blocks’ of the integers in the sense that they
cannot be divided any further.
Definition 7.1. Let p > 1 be an integer. Then p is a prime number if its only positive
divisors are 1 and p. An integer n > 1 which is not prime is called composite.
Examples 7.2.
(1) 2, 3, 5, 7 are prime numbers.
(2) 6 = 2 ⋅ 3 is composite.
(3) 34052881 is a prime.
(4) 2^74207281 − 1 is the largest prime number known as of May 2017. It is a number with
22338618 digits.

The last examples show that there are enormous prime numbers. In fact, a theorem of
Euclid states that there are infinitely many primes. It is a consequence of this theorem
(see Theorem 8) that we can always find larger and larger primes. Before we prove Euclid’s
theorem, we need to introduce the following important auxiliary result.
Lemma 2. Every integer n > 1 has a prime divisor.

Proof. Let n > 1 be an integer. If n is prime, since n ∣ n, then n is its own prime divisor
and we are done. Suppose, for contradiction, that some composite number has no prime divisor;
by the WOP, we may take n to be the smallest such number. Then, there are integers a, b such that
n = a ⋅ b with 1 < a, b < n.
By minimality of n, there exists a prime p dividing a (if a is itself prime, take p = a); that is
a = pa′ for some a′ ∈ Z. Then,
n = ab = p(a′ b), so that p ∣ n, a contradiction. □
Theorem 8 (Euclid). There are infinitely many prime numbers.

Proof of Euclid’s Theorem. Suppose, for contradiction, that there are only finitely many
prime numbers. Denote them p1 , p2 , . . . , pk and consider the number
n = p1 p2 ⋯pk + 1.
By Lemma 2, n has a prime divisor p, hence p = pi for some i. Since p divides n and p1 p2 ⋯pk ,
from Corollary 1, p divides the difference
n − p1 p2 ⋯pk = 1
which is impossible. Hence there are infinitely many primes. ∎
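
To see the construction in Euclid's proof at work, the following Python sketch pretends that 2, 3, 5, 7, 11, 13 is the complete list of primes and inspects n = p1 p2 ⋯pk + 1. The helper name smallest_prime_factor and the choice of list are ours, for illustration only.

```python
from math import prod

def smallest_prime_factor(n: int) -> int:
    """Return the smallest prime divisor of an integer n > 1 (it exists by Lemma 2)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d          # the smallest divisor d > 1 of n is necessarily prime
        d += 1
    return n                  # no proper divisor was found, so n itself is prime

primes = [2, 3, 5, 7, 11, 13]     # pretend this is the complete list of primes
n = prod(primes) + 1              # n = p1 p2 ... pk + 1
p = smallest_prime_factor(n)

print(n, p)                       # 30031 59
print(p in primes)                # False: p is a prime missing from the list
```

Note that n itself need not be prime (here 30031 = 59 ⋅ 509); the proof only uses that every prime divisor of n lies outside the list.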

The following two theorems are classical results on the distribution of prime numbers. They
are beyond the scope of these notes, so we restrict ourselves to their statements.
Theorem 9 (The Prime Number Theorem). Let π(x) denote the function giving the number
of primes ≤ x. Then, as x tends to infinity, the ratio of π(x) to x/log(x) tends to 1, where log
denotes the natural logarithm.
Theorem 10 (Dirichlet Density Theorem). Let a, b ∈ Z satisfy (a, b) = 1. Then, there are
infinitely many primes of the form a + bk with k ∈ Z.
In later discussions about cryptographic applications, it will be clear that it is important to
find and use extremely large primes. Given a large odd integer it can be very hard to decide
if it is a prime number, therefore tests distinguishing between primes and composite integers
will be crucial. The most basic such test is trial division; the following proposition tells us
that, given an integer n > 1, we need only test its divisibility by all the primes up to √n. If n
is not divisible by any of these primes, then n must be a prime number.

Proposition 7. Let n be composite. Then n has a prime divisor p ≤ √n.

Proof. Since n is composite, we have n = a ⋅ b for 1 < a, b < n. WLOG, suppose b ≥ a. Suppose
a > √n. Then
n = a ⋅ b > √n ⋅ √n = n,
a contradiction. Hence a ≤ √n. In particular, the prime factors of a are ≤ √n and, since
they are also prime factors of n, the result follows. ∎

This method, though effective, is not practical when n is large. In later sections, we shall
study alternative methods to deal with such cases.
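
A minimal Python sketch of trial division is given below. For simplicity it tests every integer d with 2 ≤ d ≤ √n rather than only the primes in that range, which gives the same answer at the cost of some redundant divisions. The function name is_prime is our own choice.

```python
from math import isqrt

def is_prime(n: int) -> bool:
    """Trial division: n > 1 is prime if no integer d with 2 <= d <= sqrt(n) divides it."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):   # isqrt(n) is the integer part of sqrt(n)
        if n % d == 0:
            return False               # d is a proper divisor, so n is composite
    return True

print([n for n in range(2, 40) if is_prime(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
```

The loop performs about √n divisions in the worst case, which illustrates why the method becomes impractical for integers with hundreds of digits.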

Exercises.
Exercise 7.3. Using Euclid’s proof that there are infinitely many primes, show that the
n-th prime p_n does not exceed 2^{2^{n−1}} whenever n is a positive integer. Conclude that when n
is a positive integer, there are at least n + 1 primes less than 2^{2^n}.

8. The Fundamental Theorem of Arithmetic

The main objective of this section is to prove the following result, which justifies the expres-
sion ‘the primes are the building blocks of the integers’.
Theorem 11 (The Fundamental Theorem of Arithmetic). Let n ≠ 0, ±1 be an integer. Then
n has a prime factorization of the form
n = ± p_1^{a_1} ⋯ p_r^{a_r}, a_i ≥ 1,
where the p_i are distinct prime numbers. Furthermore, up to the order of the p_i , this
factorization is unique.

We remark that, however familiar this statement sounds, it is non-trivial. Suppose, for
example, that instead of the integers, we work only with the even integers. In this
setting, the numbers 6, 10, 30, 50 are ‘primes’, in the sense that they cannot be decomposed
into a product of smaller even numbers. Moreover, we have 300 = 10 ⋅ 30 = 6 ⋅ 50, showing that
the number 300 has two different ‘prime decompositions’ in the universe of even numbers.
To prove the FTA some preparation is required.
Lemma 3. Let a, b ∈ Z>0 satisfy (a, b) = 1. If a ∣ bc, then a ∣ c.

Proof. By hypothesis, there exists k ∈ Z such that bc = ak. Additionally, by Corollary 3, we
have
(a, b) = 1 = ax + by for some x, y ∈ Z.
Then,
c = cax + cby = a(cx) + (ak)y = a(cx + ky)
so that a ∣ c as required. ∎
Remark 8.1. The condition (a, b) = 1 in Lemma 3 is necessary. Indeed, if a = 6, b = 3, and
c = 4, we have 6 ∣ 3 ⋅ 4 = 12 but 6 ∤ 4 and 6 ∤ 3.
Corollary 6. Let a1 , . . . , an , p be integers with p prime. If p ∣ a1 ⋯an then p ∣ ai for some i.

Proof. We will use induction on the number n of integers ai .
Base: Let n = 1. If p ∣ a1 , then p ∣ ai for i = 1.
Hypothesis: Assume that, for any n integers a1 , . . . , an , if p ∣ a1 ⋯an then p ∣ ai for some i.
Step: We consider now n + 1 integers a1 , . . . , an , an+1 . Suppose p ∣ a1 ⋯an an+1 = (a1 ⋯an ) ⋅ an+1 .
If (p, a1 ⋯an ) = 1 then, by Lemma 3, we have p ∣ an+1 . Suppose now (p, a1 ⋯an ) ≠ 1.
Since p is prime, we have p ∣ a1 ⋯an and, by the induction hypothesis, we have p ∣ ai for some
i = 1, . . . , n as desired. ∎

We are now in position to prove the Fundamental Theorem of Arithmetic.


Proof of FTA. The proof is divided into two parts. Namely, we first prove that a prime
factorization exists and then we show this factorization is unique. For n < −1 the result
follows from the factorization of −n, so let n > 1 be an integer.
Existence.
If n is prime, then taking p1 = n and a1 = 1 gives the desired factorization.
Suppose n is composite. For contradiction, suppose n is the smallest integer without a prime
decomposition. We have
n = a ⋅ b with 1 < a, b < n,
and, by minimality of n, we have a = p1 . . . pk and b = q1 . . . qt where the pi and qj are primes.
(Here we allow repetition of primes in these factorizations.) Thus
n = a ⋅ b = p1 . . . pk ⋅ q1 ⋯qt ,
is a prime factorization for n, a contradiction.
Uniqueness. Suppose n = p1 . . . ps = q1 . . . qt are two prime decompositions of n. After
cancelling common factors and relabeling the remaining primes, we obtain
p1 ⋯ps′ = q1 ⋯qt′ with 0 ≤ s′ ≤ s, 0 ≤ t′ ≤ t.
We will now show by contradiction that s′ = t′ = 0. This means that the previous equality
is 1 = 1; that is, the initial decompositions of n are the same up to ordering of the prime
factors. Indeed, suppose there is at least one prime on the left side, that is s′ ≥ 1 and p1 ≠ 1.
Then, t′ ≥ 1 and pi ≠ qj for all i, j since common primes were cancelled out. Moreover, since
p1 divides the product on the right hand side, by Corollary 6, we have p1 ∣ qj for some j.
Since qj is prime, we must have p1 = qj , contradicting the fact that pi ≠ qj for all i, j. Thus
s′ = t′ = 0. ∎
Example 8.2.
756 = 2 ⋅ 378 = 2 ⋅ 2 ⋅ 189 = 2 ⋅ 2 ⋅ 3 ⋅ 63
= 2 ⋅ 2 ⋅ 3 ⋅ 7 ⋅ 3 ⋅ 3 = 2^2 ⋅ 3^3 ⋅ 7.
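
The step-by-step splitting in this example can be mechanised. Below is a short Python sketch (our own, not part of the notes) that factors an integer n > 1 by repeatedly dividing out the smallest remaining divisor; the function name factorize and the dictionary output format are assumptions made for illustration.

```python
def factorize(n: int) -> dict[int, int]:
    """Return {prime: exponent} for an integer n > 1, by repeated trial division."""
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:                 # divide out p as often as possible
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                             # the remaining cofactor has no divisor up to its square root, so it is prime
        factors[n] = factors.get(n, 0) + 1
    return factors

print(factorize(756))    # {2: 2, 3: 3, 7: 1}
```

By the uniqueness part of the Fundamental Theorem of Arithmetic, any other factoring strategy must return the same primes and exponents.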
Proposition 8. Let n ∈ Z>0 have prime factorization n = p_1^{a_1} ⋯ p_r^{a_r}. Suppose that d ∣ n. Then,
the prime factorization of d is of the form
d = p_1^{b_1} ⋯ p_r^{b_r} with 0 ≤ b_i ≤ a_i.

Proof. Let n > 0 and d ∣ n. We have n = dk for some integer k. Clearly, any prime divisor
of d is a prime divisor of n, so d = p_1^{b_1} ⋯ p_r^{b_r} with b_i ≥ 0. Suppose, for contradiction, that
b_i > a_i for some i; WLOG i = 1. Then, b_1 − a_1 ≥ 1 and
n = dk ⇐⇒ p_1^{a_1} ⋯ p_r^{a_r} = (p_1^{b_1} ⋯ p_r^{b_r})k ⇐⇒ p_2^{a_2} ⋯ p_r^{a_r} = p_1 (p_1^{b_1 − a_1 − 1} p_2^{b_2} ⋯ p_r^{b_r} k),
showing that p_1 divides the left hand side, which is impossible because p_i ≠ p_1 for all
i ≥ 2. Thus, b_1 ≤ a_1 , as desired. ∎

Exercises.
Exercise 8.3. An integer n > 0 is a square if there is c ∈ Z such that n = c^2 . A square-free
integer is an integer that is not divisible by any squares other than 1. Show that every
positive integer can be written as the product of a square (possibly 1) and a square-free
integer.
Exercise 8.4. An integer n is called powerful if, whenever a prime p divides n, p^2 also
divides n. Show that every powerful number can be written as the product of a square and a
cube (i.e. an integer of the form c^3 for some integer c).

9. The Least Common Multiple

Definition 9.1. Let a1 , a2 , . . . , ak ∈ Z>0 . The least common multiple of a1 , a2 , . . . , ak is the
smallest positive integer that is divisible by all of the ai . We denote this by lcm(a1 , . . . , ak ).
Remark 9.2. Let a1 , a2 , . . . , ak ∈ Z>0 . Note that lcm(a1 , . . . , ak ) is the smallest positive mul-
tiple of all of the ai . In addition, since the product a1 a2 ⋯ak > 0 is a common multiple of all
the ai , the Well Ordering Principle guarantees that lcm(a1 , . . . , ak ) exists.
Examples 9.3.
(1) The positive multiples of 2 and 3 are, respectively, {2, 4, 6, 8, . . . } and {3, 6, 9, . . . }.
Then, lcm(2, 3) = 6.
(2) The positive multiples of 6 and 9 are, respectively, {6, 12, 18, . . . } and {9, 18, 27, . . . }.
Then, lcm(6, 9) = 18.
Proposition 9. Let a_1, a_2, ..., a_k ∈ Z>0 have prime decompositions
a_i = p_1^{s_{i,1}} ⋯ p_n^{s_{i,n}} for i = 1, ..., k,
where s_{i,j} ≥ 0 for all i, j and the p_j are distinct primes. Then,
(a_1, a_2, ..., a_k) = p_1^{min(s_{i,1})} ⋯ p_n^{min(s_{i,n})} and lcm(a_1, a_2, ..., a_k) = p_1^{max(s_{i,1})} ⋯ p_n^{max(s_{i,n})},
where min(s_{i,j}) and max(s_{i,j}) denote the minimum and maximum element of the set of
exponents {s_{1,j}, ..., s_{k,j}} respectively.

Proof. Let p be a prime and let p^s denote the largest power of p dividing (a_1, ..., a_k). For
i = 1, ..., k, write
a_i = p^{e_i} m_i with (m_i, p) = 1.
Since p^s ∣ (a_1, ..., a_k), we have p^s ∣ a_i for all i. By Proposition 8, we have s ≤ min(e_1, ..., e_k).
Conversely, p^{min(e_1, ..., e_k)} ∣ a_i for all i. It now follows from Proposition 5 that
p^{min(e_1, ..., e_k)} ∣ (a_1, ..., a_k),
hence s = min(e_1, ..., e_k), as desired. Repeating this argument for each p_j establishes
(a_1, a_2, ..., a_k) = p_1^{min(s_{i,1})} ⋯ p_n^{min(s_{i,n})}.

Now, to prove the second part of the proposition, write ℓ = lcm(a_1, a_2, ..., a_k) and
ℓ′ = p_1^{max(s_{i,1})} ⋯ p_n^{max(s_{i,n})}.
If s_{i,j} denotes the exponent of the j-th prime, p_j, in the decomposition of a_i,
a_i = p_1^{s_{i,1}} ⋯ p_n^{s_{i,n}} for i = 1, ..., k,
then clearly p_j^{s_{i,j}} ∣ ℓ′ for all i, j. Therefore, ℓ′ is a multiple of all the a_i. Since ℓ is the smallest
multiple of all of the a_i, this means that ℓ ≤ ℓ′.
Suppose now that ℓ < ℓ′. Clearly, ℓ does not have any prime factor different from the p_j.
Indeed, suppose ℓ contained a prime factor different from p_1, ..., p_n, say
ℓ = p_1^{b_1} ⋯ p_n^{b_n} ⋅ p^b,
for some integers b_1, ..., b_n, b and p a prime distinct from p_1, ..., p_n. Since the a_i are made up
only of the primes {p_1, ..., p_n} and ℓ denotes the smallest multiple of all of the a_i, dropping
p from ℓ would yield a smaller multiple of the a_i, a contradiction.
Now, if
ℓ < ℓ′ = p_1^{max(s_{i,1})} ⋯ p_n^{max(s_{i,n})},
it is because one of the exponents in the factorization of ℓ, say (WLOG) the exponent of p_1,
is strictly smaller than max(s_{i,1}). Suppose max(s_{i,1}) = s_{r,1} for some 1 ≤ r ≤ k. But then, the
above implies that a_r ∤ ℓ because the exponent s_{r,1} of p_1 in the factorization of a_r is strictly
larger than the exponent of p_1 in ℓ. This is a contradiction. Thus ℓ = ℓ′ as desired. ∎

The particular case of only two integers a, b is very useful and deserves to be highlighted.
Proposition 10. Let a, b ∈ Z>0 have prime decompositions
a = p_1^{a_1} ⋯ p_n^{a_n} and b = p_1^{b_1} ⋯ p_n^{b_n},
where a_i, b_i ≥ 0, and the p_i are distinct primes. Then,
(i) (a, b) = p_1^{min(a_1,b_1)} ⋯ p_n^{min(a_n,b_n)}.
(ii) lcm(a, b) = p_1^{max(a_1,b_1)} ⋯ p_n^{max(a_n,b_n)}.
(iii) a ⋅ b = (a, b) ⋅ lcm(a, b).

Proof. Parts (i) and (ii) follow by Proposition 9 with k = 2.
We will prove (iii). Let a, b ∈ Z>0 have prime decompositions
a = p_1^{a_1} ⋯ p_n^{a_n} and b = p_1^{b_1} ⋯ p_n^{b_n},
where a_i, b_i ≥ 0 and the p_i are the primes dividing a or b.
Note that a_i + b_i = min(a_i, b_i) + max(a_i, b_i). Then, from (i) and (ii) we have
(a, b) ⋅ lcm(a, b) = (p_1^{min(a_1,b_1)} ⋯ p_n^{min(a_n,b_n)}) ⋅ (p_1^{max(a_1,b_1)} ⋯ p_n^{max(a_n,b_n)})
= p_1^{min(a_1,b_1)+max(a_1,b_1)} ⋯ p_n^{min(a_n,b_n)+max(a_n,b_n)}
= p_1^{a_1+b_1} ⋯ p_n^{a_n+b_n} = (p_1^{a_1} ⋯ p_n^{a_n})(p_1^{b_1} ⋯ p_n^{b_n})
= a ⋅ b. ∎
Remark 9.4. Unlike parts (i) and (ii), which have an analogous version for more than two
integers (Proposition 9), part (iii) of Proposition 10 does not generalize in the most direct
way; this is illustrated by the exercises at the end of this section.
Example 9.5. We will compute (756, 2205) and lcm(756, 2205).
We have the prime factorizations 756 = 2^2 ⋅ 3^3 ⋅ 5^0 ⋅ 7^1 and 2205 = 2^0 ⋅ 3^2 ⋅ 5^1 ⋅ 7^2 , hence
(756, 2205) = 2^0 ⋅ 3^2 ⋅ 5^0 ⋅ 7^1 = 63 and lcm(756, 2205) = 2^2 ⋅ 3^3 ⋅ 5^1 ⋅ 7^2 = 26460.
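
The computation in this example can be checked with a few lines of Python. The sketch below simply applies the min/max exponent formulas to the exponent vectors written out above; the variable names are ours.

```python
from math import prod

# Exponent vectors over the primes 2, 3, 5, 7.
primes = [2, 3, 5, 7]
exp_a = [2, 3, 0, 1]   # 756  = 2^2 * 3^3 * 5^0 * 7^1
exp_b = [0, 2, 1, 2]   # 2205 = 2^0 * 3^2 * 5^1 * 7^2

# gcd takes the minimum exponent of each prime, lcm takes the maximum.
gcd_ab = prod(p ** min(s, t) for p, s, t in zip(primes, exp_a, exp_b))
lcm_ab = prod(p ** max(s, t) for p, s, t in zip(primes, exp_a, exp_b))

print(gcd_ab, lcm_ab)                   # 63 26460
print(gcd_ab * lcm_ab == 756 * 2205)    # True, as predicted by Proposition 10 (iii)
```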
We note that the above formulas for (a, b) and lcm(a, b) are great theoretical tools but not
practical for computation when large values of a and b are involved. Both formulas require
one to compute the prime factorization of both a and b, which is a very hard problem
computationally. Instead, we compute ab and use the Euclidean algorithm to compute (a, b)
and the third formula to find lcm(a, b).
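
As a sketch of this factorization-free approach, the following Python code computes (a, b) with the Euclidean algorithm and then obtains lcm(a, b) from a ⋅ b = (a, b) ⋅ lcm(a, b). The function names gcd and lcm are our own (Python's standard library also offers math.gcd).

```python
def gcd(a: int, b: int) -> int:
    """Euclidean algorithm for non-negative integers a, b, not both zero."""
    while b != 0:
        a, b = b, a % b      # replace (a, b) by (b, remainder of a divided by b)
    return a

def lcm(a: int, b: int) -> int:
    """lcm of positive integers, using a * b = (a, b) * lcm(a, b)."""
    return a * b // gcd(a, b)

print(gcd(756, 2205), lcm(756, 2205))   # 63 26460
```

No prime factorization is needed here, which is what makes this route practical for large inputs.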
Proposition 11. Let a1 , a2 . . . , an ∈ Z. Then lcm(a1 , a2 , . . . , an ) = lcm(a1 , lcm(a2 , . . . , an )).

Proof. This follows directly from the formula for the least common multiple in Proposition 9. ∎
Proposition 12. Let n, a1 , . . . , ak ∈ Z. Suppose that ai ∣ n for all i. Then lcm(a1 , . . . , ak ) ∣ n.

Proof. Clearly, ai ∣ n for all i if lcm(a1 , . . . , ak ) ∣ n since ai ∣ lcm(a1 , . . . , ak ). For the other
direction, we use induction on k ≥ 2.
Base: Let k = 2 and let a_1, a_2, n be integers such that both a_1 and a_2 divide n. Then, we can write their
factorizations as follows
a_1 = p_1^{e_1} ⋯ p_r^{e_r}, a_2 = p_1^{b_1} ⋯ p_r^{b_r}, n = p_1^{c_1} ⋯ p_r^{c_r} with e_i, b_i, c_i ≥ 0
and p_i distinct primes. From Proposition 8, we have e_i, b_i ≤ c_i, hence max(e_i, b_i) ≤ c_i for all i.
Thus, by Propositions 10 and 8, we conclude lcm(a_1, a_2) ∣ n.
Hypothesis: The result is true for some k ≥ 2 and any k integers a_i.
Step: Suppose a_i ∣ n for 1 ≤ i ≤ k + 1 and write ℓ = lcm(a_1, ..., a_k). Then, a_{k+1} ∣ n and by
hypothesis ℓ ∣ n, therefore lcm(ℓ, a_{k+1}) ∣ n by the base case. Now from Proposition 11 we
have lcm(ℓ, a_{k+1}) = lcm(a_1, ..., a_{k+1}) ∣ n, as desired. ∎
Proposition 13. Let a1 , . . . , an ∈ Z be pairwise coprime. Then lcm(a1 , . . . , an ) = a1 ⋯an .

Proof. We use induction on n.


Base: Let a1 , a2 be coprime. Then (a1 , a2 ) = 1 and Proposition 10 (iii) gives lcm(a1 , a2 ) =
a1 a2 .
Hypothesis: Assume lcm(a1 , . . . , an ) = a1 ⋯an for any choice of n pairwise coprime integers.
Step: Let a1 , . . . , an+1 be pairwise coprime integers. We have,
lcm(a1 , . . . , an+1 ) = lcm(lcm(a1 , . . . , an ), an+1 ) = lcm(a1 ⋯an , an+1 ) = a1 ⋯an+1 ,
where the first equality follows from Proposition 11, the second by induction hypothesis, and
the third by the base case (because (a1 ⋯an , an+1 ) = 1). ∎
Proposition 14. Let n1 , n2 be coprime integers. If d ∣ n1 n2 , then there are unique integers
d1 ∣ n1 and d2 ∣ n2 such that d = d1 d2 . Conversely, any such product is a divisor of n1 n2 .

Proof. Consider the prime factorizations


n_1 = p_1^{a_1} ⋯ p_k^{a_k} and n_2 = q_1^{b_1} ⋯ q_r^{b_r}.
Since (n_1, n_2) = 1 we have that p_i ≠ q_j for all i, j and the prime factorization of n_1 n_2 is
n_1 n_2 = p_1^{a_1} ⋯ p_k^{a_k} q_1^{b_1} ⋯ q_r^{b_r}.
Let d be a divisor of n_1 n_2. Then, by Proposition 8, we have
d = p_1^{s_1} ⋯ p_k^{s_k} q_1^{e_1} ⋯ q_r^{e_r},
where 0 ≤ s_i ≤ a_i and 0 ≤ e_j ≤ b_j. Now let d_1 = (n_1, d) and d_2 = (n_2, d). From
Proposition 10 (i), we have
d_1 = p_1^{s_1} ⋯ p_k^{s_k} and d_2 = q_1^{e_1} ⋯ q_r^{e_r},
which clearly satisfy d_i ∣ n_i and d = d_1 d_2. Suppose now d = d′_1 d′_2 and d′_i ∣ n_i. Since (n_1, n_2) = 1
we have also (d′_1, n_2) = 1 and (d′_2, n_1) = 1, therefore d′_i = (d, n_i) = d_i, showing the
decomposition d = d_1 d_2 is unique.
Conversely, let d_1 and d_2 be divisors of n_1 and n_2, respectively. Then, by Proposition 8 we
have
d_1 = p_1^{s_1} ⋯ p_k^{s_k} and d_2 = q_1^{e_1} ⋯ q_r^{e_r},
where 0 ≤ s_i ≤ a_i and 0 ≤ e_j ≤ b_j. Moreover, from the same proposition and the prime
factorizations of the products d_1 d_2 and n_1 n_2 (which are the product of the factorizations of
each factor) we conclude that d_1 d_2 ∣ n_1 n_2, as desired. ∎

Exercises.
Exercise 9.6. Show that, abc = gcd(a, b, c) lcm(a, b, c) does not hold for general a, b, c ∈ Z
by finding a counterexample.
Exercise 9.7. Prove that, for all a, b, c ∈ Z>0 , we have abc = gcd(bc, ac, ab) lcm(a, b, c).

10. Primes of the Form 4k + 3

In this section, we will prove the following particular case of Dirichlet’s Density Theorem
(Theorem 10).
Theorem 12. There are infinitely many primes of the form 4k + 3 for k ∈ Z.

We will need the following two auxiliary results.


Lemma 4. Let n be an integer. Then n is of the form 4k, 4k + 1, 4k + 2, or 4k + 3. In
particular, if n is odd, then it is of the form 4k + 1 or 4k + 3.

Proof. Dividing n by 4 via the division algorithm (Theorem 4) yields


n = 4k + r, 0 ≤ r ≤ 3.
Clearly, the four possible forms in the statement are in correspondence with the value of r.
Suppose now n = 4k or n = 4k + 2. Then 2 ∣ n, hence n is even. ∎
Lemma 5. Let a, b be integers of the form 4k + 1. Then ab is also of the form 4k + 1.

Proof. Write a = 4k_a + 1 and b = 4k_b + 1. Then,
a ⋅ b = (4k_a + 1)(4k_b + 1) = 16k_a k_b + 4k_a + 4k_b + 1 = 4(4k_a k_b + k_a + k_b ) + 1. ∎

Proof of Theorem 12. We will proceed using proof by contradiction. Indeed, suppose there
are only finitely many primes of the form 4k + 3. Denote these primes p0 = 3, p1 , p2 , . . . , ps
and consider the number
Q = 4p1 p2 ⋯ps + 3.
Clearly, 2 ∤ Q, hence the prime factorization of Q (which exists by Theorem 11) contains
only odd primes. By Lemma 4, the primes in this factorization are all of the form 4k + 1 or
4k + 3. If all the primes occurring in the prime factorization of Q are of the form 4k + 1, by
Lemma 5, we conclude that Q is also of the form 4k + 1. Here, Q is of the form 4k + 3, so
that there is at least one prime factor of Q which is of the form 4k + 3.
Let p be a prime factor of Q of the form 4k + 3. Thus p = pi for some i. If p = 3, then 3 ∣ (Q − 3) = 4p1 ⋯ps ,
a contradiction because none of the primes 2, p1 , . . . , ps equals 3. If p = pi ≠ 3, then
p ∣ (Q − 4p1 ⋯ps ) = 3, a contradiction. Hence there are
infinitely many primes of the form 4k + 3. ∎
Example 10.1. The first few values of 4k + 3 are 3, 7, 11, 15, 19, 23, 27, so that clearly the
formula generates both primes and composite numbers. Theorem 12 guarantees that we will
always find larger and larger values of k giving rise to new primes.
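
The construction used in the proof of Theorem 12 can be tried numerically. In the Python sketch below we pretend that 3, 7, 11, 19 are all the primes of the form 4k + 3, form Q = 4p1 p2 p3 + 3 with p1 , p2 , p3 = 7, 11, 19, and look at the prime factors of Q. The helper prime_factors and the chosen list are our own, for illustration only.

```python
from math import prod

def prime_factors(n: int) -> list[int]:
    """Distinct prime divisors of an integer n > 1, by trial division."""
    found = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            found.append(p)
            while n % p == 0:     # remove every copy of p before moving on
                n //= p
        p += 1
    if n > 1:                     # the remaining cofactor is prime
        found.append(n)
    return found

known = [7, 11, 19]               # pretend these, together with 3, are all primes of the form 4k + 3
Q = 4 * prod(known) + 3           # Q = 4 p1 p2 p3 + 3
new = [p for p in prime_factors(Q) if p % 4 == 3]

print(Q, prime_factors(Q))        # 5855 [5, 1171]
print(new)                        # [1171]: of the form 4k + 3, yet neither 3 nor in `known`
```

Here Q has one prime factor of the form 4k + 1 and one of the form 4k + 3, in line with Lemma 5: the factors cannot all be of the form 4k + 1.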

Exercises.
Exercise 10.2. Give a counterexample to show that Lemma 5 is false if we replace 4k + 1
by 4k + 3.

11. Linear Diophantine Equations

Definition 11.1. Any equation with one or more variables to be solved in the integers is
called a Diophantine Equation.
Examples 11.2. The equations
3x = 1, 2x + 2y = 3, x^2 + z^2 = y^2
are Diophantine equations when we are only interested in integer solutions. For example,
the first equation has solution x = 1/3. However, viewed as a Diophantine equation in Z,
this equation has no solutions.
Definition 11.3. Let a1 , . . . , an ∈ Z≠0 . A Diophantine equation of the form
a1 x1 + a2 x2 + ⋯ + an xn = b, with b ∈ Z
is a linear Diophantine equation in n variables x1 , . . . , xn .
Examples 11.4.

(1) 3x = 1 and 2x + 2y = 3 are linear.
(2) x^2 + z^2 = y^2 and 3xy = 10 are non-linear.

Our objective in this section is to prove Theorem 14 which gives a complete resolution of
linear Diophantine equations in two variables. The case of one variable follows directly from
the definition of divisibility.
Theorem 13. Let a, b ∈ Z with a ≠ 0. The equation ax = b has a unique solution if and only
if a ∣ b. When a solution exists, necessarily, it is given by x = b/a.
Theorem 14. Let a, b, c ∈ Z, with a, b ≠ 0. Write d = (a, b). Consider the equation
(11.5) ax + by = c.

(A) The equation (11.5) has an integer solution (x0 , y0 ) if and only if d ∣ c.
(B) Suppose d ∣ c so that there is a solution (x0 , y0 ) by part (A). Then, all the solutions
to (11.5) are given by the formulas
x = x0 + (b/d) t, y = y0 − (a/d) t with t ∈ Z.

Proof. We have a = da′ and b = db′ with a′ , b′ ∈ Z. By Corollary 5 we know that (a′ , b′ ) = 1.
We will now prove part (A). Suppose first that ax + by = c has a solution (x0 , y0 ). Then,
ax0 + by0 = c ⇐⇒ d(a′ x0 ) + d(b′ y0 ) = d(a′ x0 + b′ y0 ) = c ⟹ d ∣ c.

Conversely, suppose d ∣ c. That is, c = dt with t ∈ Z. From Theorem 6, we know there are
x1 , y1 ∈ Z such that
ax1 + by1 = d ⇐⇒ a(tx1 ) + b(ty1 ) = dt = c.
Then, x0 = tx1 , y0 = ty1 is a solution to ax + by = c.
We now prove (B). Suppose d ∣ c and (x0 , y0 ) is a solution to ax + by = c. Let t ∈ Z. We
compute
a(x0 + (b/d)t) + b(y0 − (a/d)t) = ax0 + (ab/d)t + by0 − (ab/d)t = ax0 + by0 = c,
showing that the formula in the statement produces solutions to ax + by = c. To finish the
proof, it remains to show that all solutions are given by the formula above. Let (x1 , y1 ) be
another solution. We define the quantities t_x = x1 − x0 and t_y = y1 − y0 and compute
a t_x + b t_y = ax1 − ax0 + by1 − by0 = (ax1 + by1 ) − (ax0 + by0 ) = c − c = 0.
Then,
b t_y = −a t_x ⇐⇒ d(b/d) t_y = −d(a/d) t_x where d = (a, b)
⇐⇒ b′ t_y = −a′ t_x , where a′ = a/d, b′ = b/d.
Since (a′ , b′ ) = 1, by Lemma 3, we have b′ ∣ t_x , that is t_x = b′ t for some t ∈ Z. Then,
b′ t_y = −a′ b′ t ⟹ t_y = −a′ t. Therefore,
x1 = x0 + t_x = x0 + b′ t = x0 + (b/d)t and y1 = y0 + t_y = y0 − a′ t = y0 − (a/d)t,
showing that (x1 , y1 ) is obtained from (x0 , y0 ) by the formula in the statement, as desired. ∎
Example 11.6. We will solve the equation 154x + 35y = 7.
In Example 6.4, we have computed d = (154, 35) = 7 and, since d ∣ 7, there exist solutions
by part (A) of Theorem 14. Indeed, in the same example, we also computed the particular
solution (x0 , y0 ) = (−2, 9). Therefore, by part (B) of Theorem 14, the general solution is
given by
x = −2 + 5t, y = 9 − 22t, t ∈ Z.
In particular, taking t = 1 gives the particular solution (x1 , y1 ) = (3, −13).
Example 11.7. Consider the equation 154x + 35y = 24.
Since d = (154, 35) = 7 ∤ 24, there are no solutions by part (A) of Theorem 14.
Example 11.8. Consider the equation 154x + 35y = 21.
Since d = (154, 35) = 7 ∣ 21, this equation has solutions in Z. Example 11.6 shows that
154x + 35y = 7 has the solution x1 = −2 and y1 = 9. Then, 154x + 35y = 21 has the solution
x0 = 3x1 = −6, y0 = 3y1 = 27. We conclude that the general solution is given by
x = −6 + 5t, y = 27 − 22t for t ∈ Z.
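
The procedure of Theorem 14 can be sketched in Python as follows. We assume a, b > 0 and use an iterative form of the extended Euclidean algorithm in place of back substitution; the function names extended_gcd and solve_linear_diophantine, and the tuple returned, are our own conventions.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = (a, b) and a*x + b*y = d, for positive a, b."""
    x0, y0, x1, y1 = 1, 0, 0, 1        # invariants: a0*x0 + b0*y0 = a and a0*x1 + b0*y1 = b
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def solve_linear_diophantine(a: int, b: int, c: int):
    """Solve a*x + b*y = c. Returns (x0, y0, dx, dy), meaning that all solutions are
    x = x0 + dx*t, y = y0 - dy*t with t in Z, or None when (a, b) does not divide c."""
    d, x, y = extended_gcd(a, b)
    if c % d != 0:
        return None                    # part (A): no solution unless d | c
    k = c // d
    return x * k, y * k, b // d, a // d    # part (B): step sizes b/d and a/d

print(solve_linear_diophantine(154, 35, 7))    # (-2, 9, 5, 22)
print(solve_linear_diophantine(154, 35, 21))   # (-6, 27, 5, 22)
print(solve_linear_diophantine(154, 35, 24))   # None
```

For other inputs the particular solution returned may differ from one found by hand, but by part (B) it describes the same set of solutions.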

Exercises.
Exercise 11.9. A shopper spends a total of $5.49 for oranges, which cost 18¢ each, and
grapefruit, which cost 33¢ each. What is the minimum number of pieces of fruit the shopper
could have bought?

12. Irrational Numbers

Any element in the set of rational numbers Q is denoted as a fraction, a/b, where a, b ∈ Z with
b ≠ 0. By cancelling out the common factors of a and b we may obtain another fraction,
a′ /b′ . This new fraction a′ /b′ represents the same rational number a/b = a′ /b′ , but with a′
and b′ now coprime. Recall the inclusions Z ⊂ Q ⊂ R.
Definition 12.1. We say that a real number x ∈ R is irrational if x ∉ Q.

We will prove the following standard fact using two techniques we have learned so far.

Theorem 15. The number √2 is irrational.

Proof 1. Suppose, for contradiction, that √2 is rational. Then, by definition of the rationals,
√2 = a/b with a, b positive integers. Consider the set
S = {k√2 ∣ k and k√2 are positive integers}.
Note that this set is non-empty since a = b√2 ∈ S. It follows by the WOP that there exists
a smallest positive element in S, say s = t√2 with t ∈ Z>0 . We claim that s√2 − s ∈ S and
s√2 − s < s, obtaining a contradiction with minimality of s, and completing the proof.
Indeed, note that s ∈ S is an integer by definition of S. Additionally, since t ∈ Z>0 , it follows
that
s√2 = t√2 ⋅ √2 = 2t and s − t
are integers. Moreover,
s√2 − s = s√2 − t√2 = (s − t)√2,
so that s√2 − s ∈ S, provided that we show s − t is positive. This is the same as showing that
(s − t)√2 is positive, which is true because
(s − t)√2 = s√2 − s = s(√2 − 1) and (√2 − 1), s > 0.
Therefore, s√2 − s ∈ S and since √2 − 1 < 1 we also have s√2 − s < s, as desired. ∎

Proof 2. Suppose, for contradiction, that √2 is rational. Then, by definition of the rationals,
√2 = a/b with a, b coprime positive integers. Hence,
√2 = a/b ⟹ 2b^2 = a^2 ⟹ 2 ∣ a
because 2 is a prime dividing the product a^2 = a⋅a, so it divides one of the factors. Therefore,
a = 2k for some k ∈ Z and substituting into the above gives
2b^2 = a^2 = (2k)^2 ⇐⇒ b^2 = 2k^2 ⟹ 2 ∣ b,
showing that both a, b are divisible by 2, contradicting the fact that (a, b) = 1. ∎

The following theorem provides a criterion to decide if a number is irrational.


Theorem 16. Let f (x) = x^n + c_{n−1} x^{n−1} + ⋯ + c_1 x + c_0 be a polynomial with coefficients c_i ∈ Z.
Suppose that the real number α satisfies f (α) = 0. Then α is either an integer or irrational.
Proof. Let α ∈ R satisfy f (α) = 0. If α is irrational we are done. Suppose that α = a/b ∈ Q
with (a, b) = 1; we shall show that α ∈ Z. That is, b = ±1.
From f (α) = 0, we see that
(a/b)^n + c_{n−1} (a/b)^{n−1} + ⋯ + c_1 (a/b) + c_0 = 0
and, multiplying by b^n , we obtain
a^n + c_{n−1} a^{n−1} b + ⋯ + c_1 a b^{n−1} + c_0 b^n = 0 ⇐⇒ a^n = b(−c_{n−1} a^{n−1} − ⋯ − c_1 a b^{n−2} − c_0 b^{n−1}),
showing that b ∣ a^n . Now, if b ≠ ±1, any prime factor of b is also a prime factor of a, a
contradiction with (a, b) = 1. Thus b = ±1 and α = ±a ∈ Z, as desired. ∎
Corollary 7. Let a, m ∈ Z>0 satisfy a ≠ k^m for k ∈ Z so that the real number ᵐ√a is not an
integer. Then, ᵐ√a is irrational.

Proof. The number ᵐ√a satisfies f (ᵐ√a) = 0 where f (x) = x^m − a. By hypothesis, ᵐ√a ∉ Z,
then ᵐ√a is irrational by Theorem 16. ∎

Using this corollary, we can easily give examples of irrational numbers; in particular, we
obtain another proof of Theorem 15.
Example 12.2. The numbers √2, √6, ³√5 and ¹⁰√19 are irrational.

Exercises.
Exercise 12.3. Show that √5 + √3 is irrational.
Exercise 12.4. Show that log_2 3 is an irrational number.

13. Congruences

Definition 13.1. Let a, b, m ∈ Z with m > 0. We say that a is congruent to b modulo m if
and only if m ∣ a − b. We will write a ≡ b (mod m) to denote that a and b are congruent
modulo m and a ≢ b (mod m) if they are not. We call m the congruence modulus.
Examples 13.2.
(a) 9 ≡ 3 (mod 3) because 3 ∣ (9 − 3) = 6.
(b) 7 ≡ 1 (mod 2) because 2 ∣ (7 − 1).
(c) 8 ≡ 0 (mod 2) because 2 ∣ 8 − 0 = 8.
(d) If n = 4k + 3, then 4 ∣ (n − 3) and n ≡ 3 (mod 4).
(e) If n = 4k + 1, then 4 ∣ (n − 1) and n ≡ 1 (mod 4).
(f) For all a, b ∈ Z we have a ≡ b (mod 1) because 1 ∣ a − b.
(g) a ≡ 1 (mod 2) for every odd integer a = 2k + 1.
(h) a ≡ 0 (mod 2) for every even integer a = 2k.

Using the language of congruences we can often state theorems in a more compact way; in
particular, we can now rephrase Lemma 5 and Theorem 12 as follows:
Lemma 6. Let a, b ∈ Z satisfy a, b ≡ 1 (mod 4). Then ab ≡ 1 (mod 4).
Theorem 17. There are infinitely many primes p such that p ≡ 3 (mod 4).

We will show that Lemma 6 is a special case of a general basic property of congruences (see
Corollary 10), but first we need to introduce other elementary properties and definitions.
Proposition 15. Let m ∈ Z>0 . Then, the relation of congruence modulo m is an equivalence
relation in Z. More precisely, for all a, b, c ∈ Z, we have
(i) a ≡ a (mod m) (reflexivity);
(ii) a ≡ b (mod m) ⟹ b ≡ a (mod m) (symmetry);
(iii) a ≡ b, b ≡ c (mod m) ⟹ a ≡ c (mod m) (transitivity).

Proof. Let a, b, c, m ∈ Z with m > 0.


(i) Clearly m ∣ (a − a) = 0, so a ≡ a (mod m).
(ii) Since a ≡ b (mod m), we have a − b = mk for some k ∈ Z. Then b − a = m(−k), so that
m ∣ b − a ⇐⇒ b ≡ a (mod m).
(iii) We have a − b = mk1 , b − c = mk2 for k1 , k2 ∈ Z; then
a − c = (a − b) + (b − c) = mk1 + mk2 = m(k1 + k2 ) ⇐⇒ a ≡ c (mod m).
∎

Fix a congruence modulus m > 0. Since the relation of congruence mod m is an equivalence
relation, it divides Z into disjoint equivalence classes. The equivalence class of an integer a
is the set of integers which are congruent to a modulo m. We call it the congruence class of
a mod m and denote it by [a]. That is, for an integer a, we have
[a] ∶= {x ∈ Z ∶ x ≡ a (mod m)}.
We say that a is a representative of the class; we can choose any element y ∈ [a] as a
representative, in which case we have [y] = [a]. This is illustrated by the following examples.
Example 13.3. Let m = 4. We have Z = [0] ∪ [1] ∪ [2] ∪ [3], where
[0] = {x ∈ Z ∶ x ≡ 0 (mod 4)} = {x ∈ Z ∶ x − 0 = 4k with k ∈ Z}
= {. . . , −8, −4, 0, 4, 8, . . . }
[1] = {x ∈ Z ∶ x ≡ 1 (mod 4)} = {x ∈ Z ∶ x − 1 = 4k with k ∈ Z}
= {x ∈ Z ∶ x = 1 + 4k with k ∈ Z} = {. . . , −7, −3, 1, 5, 9, . . . }
[2] = {. . . , −6, −2, 2, 6, . . . }
[3] = {. . . , −5, −1, 3, 7, 11, . . . }
In particular, [0] = [−4], [1] = [9], [2] = [6] and [3] = [−1].
Example 13.4. Let m = 3. We have Z = [0] ∪ [1] ∪ [2], where
[0] = {. . . , −6, −3, 0, 3, 6, . . . }
[1] = {. . . , −5, −2, 1, 4, 7, . . . }
[2] = {. . . , −4, −1, 2, 5, 8, . . . }
In particular, [0] = [3], [1] = [−2] and [2] = [−1].
Example 13.5. Let m = 2. We have Z = [0] ∪ [1], where
[0] = { even integers }
[1] = { odd integers }
In particular, [0] = [4] and [1] = [3].

It follows from the previous discussion that every integer a belongs to a unique congruence
class modulo m. Given a, the next proposition determines the smallest non-negative
representative of the congruence class [a].
Proposition 16. Let a, m ∈ Z with m > 0. Then a ≡ r (mod m), where r is the remainder
of the division of a by m.
In particular, [a] = [r] and a is congruent to exactly one integer in {0, 1, 2, . . . , m − 1}.

Proof. Let S = {0, 1, 2, . . . , m − 1}. By the division algorithm we obtain


a=m⋅q+r with r ∈ S.
Then, a − r = m ⋅ q ⇐⇒ a ≡ r (mod m), which proves the first statement.
Suppose now a ≡ r1 (mod m) and a ≡ r2 (mod m) with r1 , r2 ∈ S. Then, r1 ≡ r2 (mod m)
and m ∣ r1 − r2 . Moreover, since r1 , r2 are in S, we see that −(m − 1) ≤ r1 − r2 ≤ m − 1. Thus
the two conditions
m ∣ r1 − r2 , and − (m − 1) ≤ r1 − r2 ≤ m − 1
are simultaneously satisfied. Since the only multiple of m in the range [−(m − 1), m − 1] is
zero, we must have r1 − r2 = 0, hence r1 = r2 . ∎
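
In Python, the remainder of Proposition 16 is what the % operator returns when the modulus is positive, including for negative a. The sketch below (names are ours) computes least non-negative residues and groups a window of integers into their congruence classes modulo 4, reproducing the lists of Example 13.3.

```python
def least_residue(a: int, m: int) -> int:
    """Smallest non-negative representative of [a] modulo m > 0."""
    return a % m      # for m > 0, Python's % always returns a value in {0, 1, ..., m - 1}

for a in (9, -7, 11, -1):
    print(a, least_residue(a, 4))     # residues 1, 1, 3, 3 respectively

# Group the integers -8, ..., 11 by their residue modulo 4.
m = 4
classes = {r: [x for x in range(-8, 12) if x % m == r] for r in range(m)}
print(classes[1])                     # [-7, -3, 1, 5, 9]
```

Some other languages (C or Java, for instance) return a negative remainder for negative a, so this shortcut is specific to Python's convention.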
The following two results follow from the theory so far. We highlight them in a format that
we will use several times later on.
Corollary 8. Let a, b, m ∈ Z with m > 0. If a ≡ b (mod m) and 0 ≤ a, b ≤ m − 1, then a = b.
Corollary 9. Let a, m ∈ Z with m > 0. Suppose a ≡ r (mod m) with 0 ≤ r ≤ m − 1. Then,
(a, m) = 1 ⇐⇒ (r, m) = 1.

Proof. We have a = mk + r for some integer k. Then (a, m) = (m, r) by Lemma 1. ∎


Definition 13.6. A set S ⊂ Z such that every integer is congruent mod m to exactly one
integer in S is called a complete residue system modulo m.
Example 13.7. From Proposition 16 it follows that S = {0, 1, . . . , m − 1} is a complete
residue system modulo m.
Definition 13.8. We define Z/mZ, the integers modulo m, to be the set of congruence
classes modulo m, i.e.
Z/mZ ∶= {[0], [1], . . . , [m − 1]}.
Example 13.9. Let m = 3. Then,
Z/3Z = {[0], [1], [2]} = {[3], [7], [2]},
where the second equality holds because we have only changed the representative of the
congruence classes [0] and [1].

In what follows, we will see that Z/mZ has properties similar to the integers. In particular,
we shall soon define addition and multiplication in Z/mZ. However, let us first observe a
very important difference between Z/mZ and Z: there is no cancellation law in Z/mZ. More
precisely, in the integers we have
if a, b ∈ Z satisfy ab = 0, then a = 0 or b = 0
whilst, for example, in Z/4Z we have
2 ⋅ 2 ≡ 4 ≡ 0 (mod 4) and 2 ≢ 0 (mod 4).

To define addition and multiplication in Z/mZ we will need the following result.
Theorem 18. Let m ∈ Z>0 . Suppose a ≡ b (mod m) and c ≡ d (mod m). Then,
(i) a + c ≡ b + d (mod m);
(ii) a − c ≡ b − d (mod m);
(iii) ac ≡ bd (mod m).

Proof. By hypothesis, we have b = a + km and d = c + k ′ m for some k, k ′ ∈ Z.


(i) Adding the two equalities gives
a + c + (k + k ′ )m = b + d ⇐⇒ (a + c) − (b + d) = m(−k − k ′ )
⇐⇒ a + c ≡ b + d (mod m).
(ii) Similar to (i).
(iii) We compute
bd = (a + km)(c + k′m) = ac + ak′m + ckm + kk′m^2
⇐⇒ bd − ac = m(ak′ + ck + kk′m)
⇐⇒ bd ≡ ac (mod m). ∎
Corollary 10. Lemma 6 holds.

Proof. Take m = 4 and let a, c ∈ Z satisfy a, c ≡ 1 (mod 4). Then, by part (iii) of Theorem 18
with b = d = 1, we conclude ac ≡ 1 ⋅ 1 ≡ 1 (mod 4), as desired. ∎
Example 13.10. Let m = 5. To compute 49^2 (mod 5) we calculate that 49^2 = 2401 = 480⋅5+1
and by Proposition 16, it follows that 49^2 ≡ 1 (mod 5). However, Theorem 18 allows for the
much quicker calculations
49^2 ≡ 4^2 ≡ 16 ≡ 1 (mod 5) or 49^2 ≡ (−1)^2 ≡ 1 (mod 5).
Remark 13.11. Theorem 18 does not extend to exponents. That is,
c ≡ d (mod m) ⇏ a^c ≡ a^d (mod m).
For example, taking m = 3, a = 2, d = 3, and c = 6, we have 3 ≡ 6 (mod 3) but
2^3 ≡ 8 ≡ 2 (mod 3) and 2^6 ≡ 2^3 ⋅ 2^3 ≡ 2 ⋅ 2 ≡ 1 (mod 3)
are not congruent mod 3.
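
Both the theorem and the warning about exponents are easy to test numerically. The Python sketch below checks Theorem 18 by brute force on a small window of integers (the function name and the window size are our own choices) and then evaluates the powers from the remark with the built-in three-argument pow.

```python
def theorem_18_holds(m: int, bound: int = 12) -> bool:
    """Brute-force check of Theorem 18 for all a, b, c, d in [-bound, bound]."""
    rng = range(-bound, bound + 1)
    for a in rng:
        for b in rng:
            if (a - b) % m != 0:          # keep only pairs with a ≡ b (mod m)
                continue
            for c in rng:
                for d in rng:
                    if (c - d) % m != 0:  # keep only pairs with c ≡ d (mod m)
                        continue
                    if (a + c - b - d) % m or (a - c - b + d) % m or (a * c - b * d) % m:
                        return False
    return True

print(theorem_18_holds(5))                   # True
print(49**2 % 5, 4**2 % 5, (-1)**2 % 5)      # 1 1 1

# Exponents may NOT be reduced modulo m: 3 ≡ 6 (mod 3), yet the powers differ.
print(pow(2, 3, 3), pow(2, 6, 3))            # 2 1
```

A finite check of this kind is of course no substitute for the proof; it only confirms the statement on the sampled range.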

We can now define arithmetic operations in Z/mZ.


Definition 13.12. Define the addition, multiplication and multiplication by scalar opera-
tions in Z/mZ as follows. Let [r], [s] ∈ Z/mZ and λ ∈ Z.
Addition: [r] + [s] ∶= [r + s];
Multiplication: [r] ⋅ [s] ∶= [r ⋅ s];
Multiplication by scalar: λ ⋅ [r] ∶= [λ ⋅ r].

Note that, in the above definitions, we use the concrete representatives r, s ∈ Z to calculate
the result of the operation. For example, for m = 5, we have [2] + [3] = [2 + 3] = [5], but,
since [2] = [7] and [3] = [−7], in order for the definition to make sense, we also need that
[7] + [−7] = [7 + (−7)] = [0] is equal to [5], which is the case. Clearly, for the operations to
be well defined, we need a similar compatibility for any other choice of representatives. This
is the content of the next proposition.
Proposition 17. The operations in Definition 13.12 are well defined. That is, their output
is independent of the choice of representatives.

Proof. Let r′ ∈ [r] and s′ ∈ [s], that is, r′ ≡ r (mod m) and s′ ≡ s (mod m). Then, by part (i)
of Theorem 18, we have
r + s ≡ r′ + s′ (mod m) ⇐⇒ [r + s] = [r′ + s′ ],
hence
[r] + [s] ∶= [r + s] = [r′ + s′ ] =∶ [r′ ] + [s′ ].
This shows that addition is well defined. Similar arguments show that the other operations
are also well defined. □
Example 13.13. We can write tables of addition and multiplication in Z/mZ. For example,
the table of addition in Z/3Z:
+ [0] [1] [2]
[0] [0] [1] [2]
[1] [1] [2] [0]
[2] [2] [0] [1]
Example 13.14. In Z/7Z, we have [3] ⋅ [6] = [18] = [4] = [−10], but since [3] = [10] and
[6] = [−1] we also have, more directly, [3] ⋅ [6] = [10] ⋅ [−1] = [−10].

We have mentioned that, in general, there is no cancellation law in Z/mZ. For example,
when m = 4, we have 2 ⋅ 2 ≡ 0 (mod 4) and 2 ≢ 0 (mod 4). The following is another example
of the failure of the cancellation law. For all a, b ∈ Z we have 6a ≡ 6b (mod 3), because both
sides of the congruence are congruent to 0 since 3 ∣ 6. If we just cancel out the 6 (like we do
in Z), we get a ≡ b (mod 3) for all a, b, which of course is false since 1 ≢ 2 (mod 3).
The following lemma can be interpreted as a cancellation law in Z/mZ where we allow
changing the congruence modulus.
Lemma 7. Let a, b, c, m ∈ Z with m > 0 and c ≠ 0. Write d = (c, m). Then,
c ⋅ a ≡ c ⋅ b (mod m) ⇐⇒ a ≡ b (mod m/d).
Proof. Suppose first a ≡ b (mod m/d). That is, a − b = (m/d) ⋅ k for some k ∈ Z. Then,
da − db = m ⋅ k ⇐⇒ (c/d)(da − db) = (c/d)mk ⇐⇒ ca − cb = m((c/d)k) ⟹ ca ≡ cb (mod m).
Conversely, suppose ca ≡ cb (mod m). That is, ca − cb = mk for some k ∈ Z. Dividing by d, we also
have
(m/d) ⋅ k = (c/d)(a − b) with (c/d, m/d) = 1,
where the second condition follows from Corollary 5. Then, Lemma 3 implies that
(m/d) ∣ (a − b) ⇐⇒ a ≡ b (mod m/d).
□
Example 13.15. From Lemma 7, for all a, b ∈ Z, we have
6a ≡ 6b (mod 3) ⇐⇒ a ≡ b (mod 3/(3, 6)) ⇐⇒ a ≡ b (mod 1),
which is true (see Examples 13.2 (f)).
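
Lemma 7 can also be checked numerically on small cases. The following Python sketch is an illustration only and not part of the proof; the function name lemma7_holds is our own choice. It compares the two sides of the equivalence by brute force.

```python
from math import gcd

def lemma7_holds(a: int, b: int, c: int, m: int) -> bool:
    """Check c*a ≡ c*b (mod m)  <=>  a ≡ b (mod m/d), where d = gcd(c, m)."""
    d = gcd(c, m)
    left = (c * a - c * b) % m == 0      # c*a ≡ c*b (mod m)
    right = (a - b) % (m // d) == 0      # a ≡ b (mod m/d)
    return left == right

# Exhaustive check over a small range (c != 0 and m > 0, as in the lemma).
all_ok = all(
    lemma7_holds(a, b, c, m)
    for m in range(1, 13)
    for c in range(-6, 7) if c != 0
    for a in range(-5, 6)
    for b in range(-5, 6)
)
print(all_ok)  # True
```

A finite check of this kind is of course no substitute for the proof, but it is a quick way to catch a misremembered modulus.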

Exercises.
Exercise 13.16. Let m ∈ Z>0 and let [r], [s] ∈ Z/mZ. Prove that multiplication, as defined
by
[r] ⋅ [s] ∶= [r ⋅ s],
is a well-defined operation on Z/mZ.
Exercise 13.17. Prove or disprove that {−39, 72, −23, 50, −15, 63, −52} is a complete residue
system modulo 7.
Exercise 13.18. Find a complete residue system modulo 7 consisting entirely of even inte-
gers.
Exercise 13.19. Determine all least positive integers k modulo 16 satisfying k ≡ 2 (mod 4).

14. Fast Modular Exponentiation

In this section, we describe an efficient procedure to deal with exponentials modulo m. More
precisely, given a, k, m ∈ Z with m, k ≥ 2, we will describe how to compute a^k (mod m)
quickly.
The following method, known as fast modular exponentiation, consists of 3 main steps.
Step 1: Write the exponent in base 2. That is,
k = 2^(r1) + 2^(r2) + ⋯ + 2^(rl), r1 > r2 > ⋯ > rl ≥ 0.
Step 2: For all powers of 2 which are less than or equal to 2^(r1), compute
a (mod m), a^2 (mod m), a^4 (mod m), . . . , a^(2^(r1)) (mod m)
by successively squaring and reducing the result modulo m.
Step 3: Compute
a^k = a^(2^(r1) + 2^(r2) + ⋯ + 2^(rl)) ≡ a^(2^(r1)) ⋅ a^(2^(r2)) ⋅ . . . ⋅ a^(2^(rl)) (mod m),
where we use the values computed in Step 2 to obtain the right hand side of the congruence.
Example 14.1. Compute 7^51 (mod 17).
Step 1: 51 = 2^5 + 2^4 + 2 + 1 = 32 + 16 + 2 + 1.
Step 2:
7 ≡ 7 (mod 17), 7^2 ≡ 49 ≡ 15 ≡ −2 (mod 17),
7^4 ≡ (−2)^2 ≡ 4 (mod 17), 7^8 ≡ 4^2 ≡ 16 ≡ −1 (mod 17),
7^16 ≡ (−1)^2 ≡ 1 (mod 17), 7^32 ≡ 1^2 ≡ 1 (mod 17).

Step 3:
7^51 = 7^(32+16+2+1) = 7^1 ⋅ 7^2 ⋅ 7^16 ⋅ 7^32 ≡ 7 ⋅ (−2) ⋅ 1 ⋅ 1 ≡ −14 ≡ 3 (mod 17).
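
The three steps of fast modular exponentiation can be sketched in Python as follows. This is a minimal illustration under our own naming (fast_pow_mod is not a name used in these notes). It reads the binary digits of k from the least significant end, so Steps 1 to 3 are interleaved rather than carried out one after the other.

```python
def fast_pow_mod(a: int, k: int, m: int) -> int:
    """Compute a^k (mod m) by repeated squaring, for k >= 0 and m > 0."""
    if k < 0 or m <= 0:
        raise ValueError("expected k >= 0 and m > 0")
    result = 1 % m      # running product of the selected powers (0 if m = 1)
    square = a % m      # holds a^(2^i) mod m at the i-th pass
    while k > 0:
        if k & 1:       # the current binary digit of k is 1
            result = (result * square) % m
        square = (square * square) % m
        k >>= 1         # move on to the next binary digit
    return result

print(fast_pow_mod(7, 51, 17))  # 3, as in Example 14.1
```

Python's built-in pow(a, k, m) computes the same residue, which gives a convenient cross-check.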
Example 14.2. In this example, we demonstrate why working with base 2 is efficient in
fast modular exponentiation. Suppose we want to compute 7^51 (mod 17) using base 3, for
instance.
We first compute 51 in base 3, obtaining
51 = 3^3 + 2 ⋅ 3^2 + 2 ⋅ 3 = 27 + 18 + 6.

Now, by successively taking cubes and reducing modulo 17, we obtain
7 ≡ 7 (mod 17), 7^3 ≡ 343 ≡ 3 (mod 17),
7^9 ≡ 3^3 ≡ 27 ≡ 10 (mod 17), 7^27 ≡ 10^3 ≡ 14 (mod 17).

Now, since
7^51 = 7^(27+18+6) = 7^27 ⋅ 7^18 ⋅ 7^6 ≡ 14 ⋅ (7^9)^2 ⋅ (7^3)^2 ≡ 14 ⋅ 10^2 ⋅ 3^2 (mod 17),
we see that the above pre-calculations are insufficient, since we must also compute (7^9)^2
and (7^3)^2 modulo 17. Note that this is due to the fact that representations of integers in
base 3 (or any other base ≥ 3) can have coefficients other than 0 and 1, which is not the case
in base 2.

Exercises.
Exercise 14.3. Find the least positive residue of each of the following.
(a) 3^10 (mod 11)
(b) 5^16 (mod 17)
(c) 2^12 (mod 13)
(d) 3^22 (mod 23)
(e) Can you propose a theorem from the above congruences?

15. The Congruence Method

Before we proceed with the study of congruences, in this section, we will describe an appli-
cation of congruences to the solution of Diophantine equations.
The following method, called the congruence method, may sometimes be used to conclude
that certain Diophantine equations have no solutions in Z. The idea behind this method is
that if an equation is satisfied in Z, then it has to be satisfied modulo m for all m > 0. If,
however, we can find a value of m for which it is not satisfied mod m, then we can conclude
that there are no solutions in Z. We illustrate this with two examples.
Example 15.1. We will show that 3x^3 + 2 = y^2 has no integer solutions. Indeed, suppose
there are x0, y0 ∈ Z satisfying 3x0^3 + 2 = y0^2. Since every integer is congruent to itself (see
Proposition 15 (i)) we conclude that, for all integers m > 0, we have the congruence
(15.2) y0^2 ≡ y0^2 = 3x0^3 + 2 (mod m).
In particular, taking m = 3, we have
y0^2 ≡ 2 (mod 3),
where we have used the fact that 3 ≡ 0 (mod 3).
On the other hand, every integer is congruent modulo 3 to one of {0, 1, 2}; in particular,
y0 ≡ 0, 1 or 2 (mod 3) and we respectively obtain
y0^2 ≡ 0, 1, 4 ≡ 0, 1, 1 (mod 3).
Thus y0^2 ≢ 2 (mod 3) and the integer solution x0, y0 cannot exist, since otherwise
y0 would satisfy an impossible congruence.
We note that there can be solutions mod m for other values of m. For example, if instead
we work modulo m = 2, from (15.2) we obtain
3x0^3 + 2 ≡ y0^2 (mod 2) ⇐⇒ x0^3 ≡ y0^2 (mod 2),
which is satisfied whenever x0 ≡ y0 (mod 2). For example, take x0 = y0 = 1. This shows that
the existence of solutions mod m says nothing about the existence of solutions in Z.
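
For small moduli, the search for a suitable m can be automated. The Python sketch below is an illustration only; the helper name solvable_mod is hypothetical. It tests by brute force over all residue pairs whether 3x^3 + 2 ≡ y^2 (mod m) has a solution.

```python
def solvable_mod(m: int) -> bool:
    """Return True if 3x^3 + 2 ≡ y^2 (mod m) has a solution (x, y)."""
    return any(
        (3 * x**3 + 2 - y**2) % m == 0
        for x in range(m)
        for y in range(m)
    )

# Moduli 2 <= m <= 10 modulo which the congruence has no solution.
obstructions = [m for m in range(2, 11) if not solvable_mod(m)]
print(obstructions)  # [3, 6, 9]
```

Any single modulus in this list already rules out integer solutions. The moduli 6 and 9 appear only because they are multiples of 3, so m = 3 is the natural choice.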
Example 15.3. We will show that 20y^2 + 2x = 3 has no integer solutions. Indeed, suppose
x0, y0 ∈ Z is a solution. Taking m = 2 and arguing as in the previous example, we get
20y0^2 + 2x0 ≡ 3 (mod 2) ⇐⇒ 0 ≡ 1 (mod 2),
which is impossible. If instead we take m = 5, we obtain
(15.4) 20y0^2 + 2x0 ≡ 3 (mod 5) ⇐⇒ 2x0 ≡ 3 (mod 5).
We have that x0 ≡ 0, 1, 2, 3, 4 (mod 5) which implies, respectively,
2x0 ≡ 0, 2, 4, 1, 3 (mod 5),
so x0 ≡ 4 (mod 5) satisfies (15.4) and there is no contradiction. We conclude that every
integer x0 = 4 + 5k satisfies the congruence equation (15.4). However, we have shown above
that no integer x0 will satisfy the original equation in Z.

Exercises.
Exercise 15.5. Prove or disprove the following statements
(a) The Diophantine equation 3x^2 − 7y^2 = 2 has no integral solutions.
(b) The Diophantine equation x^2 + y^2 + 1 = 4z has no integral solutions.

16. Linear Congruences in One Variable

Let a, b, m ∈ Z with m > 0. Here we consider congruence equations of the form


(16.1) ax ≡ b (mod m),
which are called linear congruences in one variable. Note that in Example 15.3 we have
already found a congruence of this type. More precisely, we have shown that the congruence
equation 2x ≡ 3 (mod 5) admits the unique solution x ≡ 4 (mod 5). We now give further
examples.
Example 16.2. Consider the linear congruence equation 10x ≡ 3 (mod 4). We have
x ≡ 0, 1, 2, 3 (mod 4) ⟹ 10x ≡ 0, 10, 20, 30 ≡ 0, 2, 0, 2 (mod 4),
respectively. Hence this equation has no solutions.
Example 16.3. We consider 3x ≡ 9 (mod 6). Note that 9 ≡ 3 (mod 6) and
x ≡ 0, 1, 2, 3, 4, 5 (mod 6) ⟹ 3x ≡ 0, 3, 6, 9, 12, 15 ≡ 0, 3, 0, 3, 0, 3 (mod 6),
hence there are three non-congruent solutions
x ≡ 1 (mod 6), x ≡ 3 (mod 6), x ≡ 5 (mod 6).

The previous examples show that an equation of the form (16.1) can have sets of solutions
with different behaviours. This is explained by the following theorem.
Theorem 19. Let a, b, m ∈ Z, with m > 0. Write d = (a, m).

(i) If d ∤ b then the equation (16.1) has no solutions.


(ii) Suppose d ∣ b. Then equation (16.1) has exactly d non-congruent solutions modulo m,
which are given by
x ≡ x0 − (m/d) ⋅ t (mod m), where 0 ≤ t ≤ d − 1,
and x0 is a particular solution.

Proof.

(i) Suppose for contradiction that d ∤ b and that x0 ∈ Z satisfies ax0 ≡ b (mod m). By definition of
congruence, there exists some y0 ∈ Z such that
ax0 − b = my0 ⇐⇒ ax0 + m(−y0) = b,
meaning that ax + my = b has the solution (x0, −y0). Then (a, m) = d ∣ b by Theorem 14, a contradiction.
(ii) Suppose d ∣ b. Then ax − my = b has solutions by Theorem 14. Let (x0, y0) be a
particular solution. By Theorem 14, the general solution is
x = x0 − (m/d) ⋅ t, y = y0 − (a/d) ⋅ t, t ∈ Z,
which gives all the integer solutions satisfying ax ≡ b (mod m).
To finish the proof, we must show that the above formula for x produces exactly d
incongruent values modulo m. Indeed, suppose we choose t1 , t2 ∈ Z giving the same
value for x modulo m, that is,
x0 − (m/d) ⋅ t1 ≡ x0 − (m/d) ⋅ t2 (mod m).
From here, we see that
x0 − (m/d) ⋅ t1 ≡ x0 − (m/d) ⋅ t2 (mod m) ⇐⇒ (m/d)(t2 − t1) ≡ 0 ≡ (m/d) ⋅ 0 (mod m)
⇐⇒ t2 − t1 ≡ 0 (mod m/(m, m/d)) by Lemma 7
⇐⇒ t1 ≡ t2 (mod d),
where we used (m, m/d) = m/d in the last step. In other words, for t1, t2 giving the
same value of x modulo m, we must have that t1 ≡ t2 (mod d). Therefore taking
t ∈ {0, 1, . . . , d − 1} gives the desired d non-congruent solutions mod m.
□
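
Theorem 19 translates directly into a procedure for solving ax ≡ b (mod m). The following Python sketch is one possible implementation under our own naming (extended_gcd and solve_linear_congruence are not names from these notes). It finds a particular solution with the extended Euclidean algorithm and then lists the d solutions. It uses x0 + (m/d)t rather than x0 − (m/d)t, which produces the same set of residues modulo m.

```python
def extended_gcd(a: int, b: int):
    """Return (g, x, y) with g = gcd(a, b) >= 0 and a*x + b*y = g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:  # normalise the sign so that g >= 0
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y

def solve_linear_congruence(a: int, b: int, m: int):
    """Return the sorted residues x in [0, m) with a*x ≡ b (mod m), m > 0."""
    d, u, _ = extended_gcd(a, m)      # a*u + m*v = d
    if b % d != 0:
        return []                     # case (i): d does not divide b
    x0 = (u * (b // d)) % m           # a particular solution
    step = m // d
    return sorted((x0 + t * step) % m for t in range(d))

print(solve_linear_congruence(10, 3, 4))  # []        (Example 16.2)
print(solve_linear_congruence(3, 9, 6))   # [1, 3, 5] (Example 16.3)
print(solve_linear_congruence(2, 3, 5))   # [4]       (Example 15.3)
```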

We highlight the following special case.


Corollary 11. Let a, m ∈ Z, with m > 0. The congruence equation
ax ≡ 1 (mod m)
has exactly one solution modulo m if and only if (a, m) = 1.

The solutions to the congruence in this corollary will play a crucial role in everything that
follows, so they deserve a special name.
Definition 16.4. Let a, m ∈ Z with m > 0 and (a, m) = 1. We call any integer solution of
the congruence ax ≡ 1 (mod m) an inverse of a modulo m.

Suppose that x0 ∈ Z is an inverse of a mod m. Then ax0 ≡ 1 (mod m) and we have the
following equalities in Z/mZ
[ax0 ] = [1] ⇐⇒ [a] ⋅ [x0 ] = [1].
Suppose x1 is another inverse of a mod m. Corollary 11 shows that x1 is congruent to x0
mod m so that [x1 ] = [x0 ]. In other words, the inverse of a mod m is unique when viewed
as an element of Z/mZ. This is summarized by the following definition.
Definition 16.5. Let a, m ∈ Z with m > 0 and (a, m) = 1. The congruence class [x0] in
Z/mZ which satisfies [a] ⋅ [x0] = [1] is called the inverse of [a] in Z/mZ. We denote it by
[a]^(−1).
Remark 16.6. We note that the use of the term ‘inverse’ and the notation a^(−1) is analogous
to that of the real numbers. Indeed, for all a ∈ R≠0, we call 1/a the ‘inverse’ of a, which we
also denote as a^(−1). This number is also the unique number satisfying a ⋅ (1/a) = 1.

In practice, despite the fact that a^(−1) makes no sense as an integer, we write a^(−1) (mod m) to
denote the smallest positive representative of the congruence class [a]^(−1). As an example, in
the following tables, we list the inverses modulo m = 10 and m = 5.
Examples 16.7.
(1) For m = 10,
a (mod 10)        0  1  2  3  4  5  6  7  8  9
a^(−1) (mod 10)   −  1  −  7  −  −  −  3  −  9
(2) For m = 5,
a (mod 5)        0  1  2  3  4
a^(−1) (mod 5)   −  1  3  2  4

Note that there are many integers a ≢ 0 (mod 10) which are not invertible, while for m = 5,
only those congruent to zero have no inverse. This behavior for m = 5 holds more generally
for all prime numbers.
Corollary 12. Let a, p ∈ Z with p a prime and a ≢ 0 (mod p). Then a has an inverse mod p.

Proof. Since p is prime and a ≢ 0 (mod p) we have (a, p) = 1. The result follows from
Corollary 11. □

To find the inverses for m = 5, 10 we only have to try a few possibilities due to the small size
of m. In general, to compute a^(−1) (mod m), we need to solve the linear Diophantine equation
ax + my = 1 using the Euclidean Algorithm and back substitution.
Example 16.8. We will compute 17^(−1) (mod 55). Here, we need to solve 17x ≡ 1 (mod 55).
We will do this by finding x0 , y0 satisfying 17x0 + 55y0 = 1, because taking this equality
mod 55 gives precisely 17x0 ≡ 1 (mod 55). This means that x0 (mod 55) will be the inverse
of 17 mod 55 that we are looking for. First, we find (17, 55) using the Euclidean Algorithm:
55 = 17 ⋅ 3 + 4
17 = 4 ⋅ 4 + 1
4 = 1 ⋅ 4 + 0,
so (17, 55) = 1. Secondly, we find a solution (x0 , y0 ) to 17x + 55y = (17, 55) = 1 using back
substitution:
(17, 55) = 1 = 17 − 4 ⋅ 4 = 17 − 4(55 − 17 ⋅ 3)
= 17 − 4 ⋅ 55 + 12 ⋅ 17
= 17 ⋅ 13 − 55 ⋅ 4,
so x0 = 13 and y0 = −4. We conclude that 17 ⋅ 13 ≡ 1 (mod 55) and
[17]^(−1) = [13] in Z/55Z.
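
The Euclidean Algorithm with back substitution can be carried out in a single loop by tracking the coefficient of a alongside each remainder. The Python sketch below is an illustration under our own naming (mod_inverse is not a name from these notes). It returns the representative of [a]^(−1) in the range 0 ≤ x < m.

```python
def mod_inverse(a: int, m: int) -> int:
    """Return x in [0, m) with a*x ≡ 1 (mod m); raise if (a, m) != 1."""
    if m <= 0:
        raise ValueError("expected m > 0")
    r0, r1 = a % m, m
    x0, x1 = 1, 0   # invariants: r0 ≡ a*x0 and r1 ≡ a*x1 (mod m)
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        x0, x1 = x1, x0 - q * x1
    if r0 != 1:     # r0 is now gcd(a, m)
        raise ValueError("a is not invertible modulo m")
    return x0 % m

print(mod_inverse(17, 55))  # 13, as in Example 16.8

# Reproduce the table of inverses modulo 10 from Examples 16.7.
table = {}
for a in range(10):
    try:
        table[a] = mod_inverse(a, 10)
    except ValueError:
        table[a] = None   # no inverse
print(table)
```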
Proposition 18. Let a, m be coprime integers with m > 0 and let k ∈ Z>0.
Then (a^k)^(−1) ≡ (a^(−1))^k (mod m).

Proof. We use induction on k.

Base: For k = 1 the result is clear.
Hypothesis: Suppose the result holds for some k ≥ 1.
Step: We have
(a^(k+1))^(−1) ≡ (a^k ⋅ a)^(−1) ≡ (a^k)^(−1) ⋅ a^(−1) ≡ (a^(−1))^k ⋅ a^(−1) ≡ (a^(−1))^(k+1) (mod m),
as desired, where we used Exercise 16.10 for the second congruence. □

In other words, the previous proposition shows that mod m the inverse of a power is the
same power of the inverse; therefore, we may use the notation a^(−k) (mod m) to denote both
(a^k)^(−1) and (a^(−1))^k mod m.

Exercises.
Exercise 16.9. Prove that the inverse of the inverse of a modulo m is a. More precisely, let
a^(−1) be an inverse of a modulo m and prove that (a^(−1))^(−1) ≡ a (mod m).
Exercise 16.10. Let a^(−1) be an inverse of a modulo m and let b^(−1) be an inverse of b modulo
m. Prove that a^(−1) b^(−1) is an inverse of ab modulo m.
Exercise 16.11. Find all least non-negative incongruent solutions of 623x ≡ 511 (mod 679).

17. The Chinese Remainder Theorem

In the previous section, we studied a single congruence in one variable, so it is natural to
wonder if something can be said about several congruences in one variable. As motivation,
let us consider the following problem:
“Find a positive integer having remainder 2 when divided by 3, remainder 1
when divided by 4, and remainder 3 when divided by 5.”
In the language of congruences, this problem translates into finding a positive integer
solution x to the following system of congruences
    x ≡ 2 (mod 3)
    x ≡ 1 (mod 4)
    x ≡ 3 (mod 5).

The Chinese Remainder Theorem (CRT) is a tool that allows us to solve this and many
other systems of congruences.
Theorem 20 (Chinese Remainder Theorem). Let n1, n2, . . . , nk ∈ Z>0 be pairwise coprime
and b1, b2, . . . , bk ∈ Z. Consider the system of congruences
(17.1)
    x ≡ b1 (mod n1)
    x ≡ b2 (mod n2)
    ⋮
    x ≡ bk (mod nk).

Write m = n1 n2 ⋯ nk. Then the system has a solution, and any two solutions x, x′ to the system
satisfy x ≡ x′ (mod m), that is, there is a unique solution modulo m.

Before we give a proof of CRT, let us analyze the congruence equations


x ≡ 3 (mod 7) and x ≡ 2 (mod 3).
From the first congruence, we have x = 3 + 7k for some k ∈ Z. Substituting this equation into
the second congruence yields
x = 3 + 7k ≡ 2 (mod 3) ⇐⇒ k ≡ 2 (mod 3).
Then k = 2 + 3t for t ∈ Z and replacing this for k gives
x = 3 + 7k = 3 + 7(2 + 3t) = 17 + 21t.
In particular, taking t = 0 and t = 1 gives, respectively, x = 17 and x = 38, and we can
easily double-check that these values satisfy the two initial congruences. Note that 21 is the
modulus predicted by the conclusion of CRT. Hence x = 17 + 21t ≡ 17 (mod 21) must be the
unique solution modulo 21 predicted by CRT.
For the proof of CRT, we will need the following basic fact, which will come of use in the
remainder of these notes.
Proposition 19. Let a, b, m, n ∈ Z with m, n > 0 and n ∣ m. If a ≡ b (mod m), then
a ≡ b (mod n).
Proof. We have m = nm′ for some m′ ∈ Z and, for some k ∈ Z,
a − b = mk = (nm′)k ⟹ n ∣ a − b ⇐⇒ a ≡ b (mod n).
□

We will now prove the Chinese Remainder Theorem.

Proof of CRT. This proof has two parts. Namely, we first show that a solution exists by
constructing it explicitly, and then we show that this solution is unique modulo (n1 n2 ⋯nk ).
Existence. Let m = n1 n2 ⋯nk and mi = m/ni . Since (ni , nj ) = 1 for all i ≠ j, we have
(mi , ni ) = 1, therefore the congruence equation mi y ≡ 1 (mod ni ) has a solution yi . Consider
the integer
x = b1 m1 y1 + b2 m2 y2 + ⋯ + bk mk yk
and observe that ni ∣ mj for all j ≠ i. Proposition 19 now implies
x ≡ 0 + 0 + ⋯ + bi mi yi + ⋯ + 0 ≡ bi mi yi (mod ni )
≡ bi (mod ni ),
where the last congruence follows because mi yi ≡ 1 (mod ni ).
Uniqueness. Suppose x, x′ ∈ Z are two solutions to the system in the statement of CRT. This
means that x ≡ bi ≡ x′ (mod ni ) for all i, and hence ni ∣ x − x′ for all i. From Proposition 12
we conclude that x − x′ is divisible by lcm(n1 , n2 , . . . , nk ). Since ni are pairwise coprime,
Proposition 13 tells us that
m = n1 n2 ⋯nk = lcm(n1 , n2 , . . . , nk ).
Then m ∣ (x − x′) ⇐⇒ x ≡ x′ (mod m) as desired. □

We extract the following useful consequence of CRT.


Corollary 13. Let n1, n2, . . . , nk ∈ Z>0 be pairwise coprime. Then the systems
    x ≡ 1 (mod n1)              x ≡ −1 (mod n1)
    x ≡ 1 (mod n2)     and      x ≡ −1 (mod n2)
    ⋮                           ⋮
    x ≡ 1 (mod nk)              x ≡ −1 (mod nk)
have respectively the unique solution x ≡ 1 (mod n1 ⋯ nk) and x ≡ −1 (mod n1 ⋯ nk).

Proof. Clearly x = 1 and x = −1 satisfy the above systems, respectively. It follows from the
uniqueness part of the CRT that there are no other solutions modulo (n1 ⋯ nk). □

We observe that the proof of the CRT is an effective proof. That is, the proof of existence
provides us with a method to compute the solution x mod n1 ⋯nk .
Corollary 14. Consider a system of congruences as in (17.1). Let m = n1 n2 ⋯nk and
mi = m/ni . Since (ni , nj ) = 1 for all i ≠ j, we have (mi , ni ) = 1, so that mi y ≡ 1 (mod ni )
has a solution yi . Then
x = b1 m1 y1 + b2 m2 y2 + ⋯ + bk mk yk
is a solution to (17.1).
We illustrate this method with a few examples.
Example 17.2. Consider again
x ≡ 3 (mod 7) and x ≡ 2 (mod 3).
In the notation of the theorem and its proof we have b1 = 3, b2 = 2,
n1 = 7, n2 = 3, m = 3 ⋅ 7 = 21, m1 = m/n1 = 3, m2 = m/n2 = 7
and for i = 1, 2 we have to solve mi y ≡ 1 (mod ni). Indeed,
i = 1: 3y ≡ 1 (mod 7) ⟹ y1 ≡ 5 (mod 7),
i = 2: 7y ≡ 1 (mod 3) ⟹ y2 ≡ 1 (mod 3).
Thus
x = b1 m1 y1 + b2 m2 y2 ≡ 3 ⋅ 3 ⋅ 5 + 2 ⋅ 7 ⋅ 1 ≡ 45 + 14 ≡ 17 (mod 21),
as expected.
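
The construction in Corollary 14 is straightforward to program. The Python sketch below is an illustration under our own naming (crt is not a name from these notes). It assumes Python 3.8 or later, where the built-in pow(mi, -1, ni) returns an inverse of mi modulo ni and math.prod is available.

```python
from math import gcd, prod

def crt(residues, moduli):
    """Return the unique x in [0, m) with x ≡ b_i (mod n_i) for all i,
    where m is the product of the pairwise coprime moduli n_i."""
    if len(residues) != len(moduli):
        raise ValueError("residues and moduli must have the same length")
    for i in range(len(moduli)):
        for j in range(i + 1, len(moduli)):
            if gcd(moduli[i], moduli[j]) != 1:
                raise ValueError("moduli must be pairwise coprime")
    m = prod(moduli)
    x = 0
    for b_i, n_i in zip(residues, moduli):
        m_i = m // n_i
        y_i = pow(m_i, -1, n_i)   # solves m_i * y ≡ 1 (mod n_i)
        x += b_i * m_i * y_i
    return x % m

print(crt([3, 2], [7, 3]))         # 17, as in Example 17.2
print(crt([2, 1, 3], [3, 4, 5]))   # 53, the motivating problem of this section
```

The second call answers the problem posed at the start of the section: 53 leaves remainders 2, 1 and 3 when divided by 3, 4 and 5, respectively.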
Example 17.3. Find 17^(−1) (mod 55). We have to solve 17x ≡ 1 (mod 55). Since 55 = 5 ⋅ 11,
by Proposition 19, any solution to the previous congruence will also satisfy the following
congruences
    17x ≡ 1 (mod 5)               2x ≡ 1 (mod 5)
    17x ≡ 1 (mod 11)     ⇐⇒      6x ≡ 1 (mod 11).
Observe that the latter system is not yet ready to be solved using CRT, because the variable
x appears with coefficients different from 1. To make the coefficients equal to 1, we have to
multiply each equation by the corresponding inverse. Note that 3 ⋅ 2 ≡ 1 (mod 5), hence
2x ≡ 1 (mod 5) ⇐⇒ x ≡ 3 (mod 5),
and using 6 ⋅ 2 ≡ 1 (mod 11), we proceed similarly for the second congruence. This leads to
the equivalent system
    x ≡ 3 (mod 5)
    x ≡ 2 (mod 11)
to which we can now apply the CRT. In this case, we have
n1 = 5, n2 = 11, b1 = 3, b2 = 2,
so
m = 5 ⋅ 11 = 55, m1 = m/n1 = 11, m2 = m/n2 = 5
and we have to solve mi y ≡ 1 (mod ni) for i = 1, 2. Indeed,
i = 1: 11y ≡ 1 (mod 5) ⟹ y1 = 1,
i = 2: 5y ≡ 1 (mod 11) ⟹ y2 = −2,
and the solution is given by
x ≡ b1 m1 y1 + b2 m2 y2 (mod m)
≡ 3 ⋅ 11 ⋅ 1 + 2 ⋅ 5 ⋅ (−2) (mod 55)
≡ 33 − 20 ≡ 13 (mod 55)
as computed in Example 16.8.
Example 17.4. In Section 14 we computed 8^10003 (mod 105) by using fast modular
exponentiation; here, we give an alternative calculation using CRT. We want to find an integer
x ≡ 8^10003 (mod 105) such that 0 ≤ x < 105. In particular, since 105 = 3 ⋅ 5 ⋅ 7, by Proposition 19,
we know that x satisfies
    x ≡ 8^10003 (mod 3)
    x ≡ 8^10003 (mod 5)
    x ≡ 8^10003 (mod 7)
and applying CRT will give the number we need. Before we proceed, we will simplify the
congruences above. First note that
    8 ≡ −1 (mod 3)             x ≡ (−1)^10003 ≡ −1 (mod 3)
    8 ≡ −2 (mod 5)     ⟹     x ≡ (−2)^10003 ≡ r (mod 5)
    8 ≡ 1 (mod 7)              x ≡ 1^10003 ≡ 1 (mod 7).
To find r, we observe (−2)^4 ≡ 16 ≡ 1 (mod 5), thus
x ≡ r ≡ (−2)^10003 ≡ (−2)^10000 ⋅ (−2)^3 ≡ ((−2)^4)^2500 ⋅ (−2)^3 ≡ 1 ⋅ (−8) ≡ 2 (mod 5).
Therefore, we have to apply CRT to the congruences
x ≡ −1 (mod 3), x ≡ 2 (mod 5), x ≡ 1 (mod 7).
In this case, we have b1 = −1, b2 = 2, b3 = 1,
n1 = 3, n2 = 5, n3 = 7, m = 105, m1 = 35, m2 = 21, m3 = 15
and we need to solve the congruences
35y ≡ 1 (mod 3), 21y ≡ 1 (mod 5), 15y ≡ 1 (mod 7).
We can take, respectively, the solutions y1 ≡ −1, y2 ≡ 1 and y3 ≡ 1, from which we obtain
x ≡ b1 m1 y1 + b2 m2 y2 + b3 m3 y3 (mod m)
≡ (−1) ⋅ 35 ⋅ (−1) + 2 ⋅ 21 ⋅ 1 + 1 ⋅ 15 ⋅ 1 (mod 105)
≡ 35 + 42 + 15 ≡ 92 (mod 105).
Remark 17.5. Since −1 ≡ 2 (mod 3), we could have rewritten the system in the previous
example as
x ≡ 2 (mod 3), x ≡ 2 (mod 5), x ≡ 1 (mod 7)
and grouped the first two congruences together into
x ≡ 2 (mod 15), x ≡ 1 (mod 7)
and applied CRT with these two congruences instead.

Exercises.
Exercise 17.6. Solve the following ancient Indian problem: If eggs are removed from a
basket 2, 3, 4, 5 and 6 at a time, there remain respectively, 1, 2, 3, 4 and 5 eggs. But if the
eggs are removed 7 at a time, no eggs remain. What is the least number of eggs that could
have been in the basket?

18. Applications of Congruences

Here we will explore a couple of applications of the theory we have developed so far.

18.1. Divisibility Tests. Here we will prove practical criteria to decide when a given in-
teger n is divisible by 3, 9, 11, or a power of 2. In particular, we will understand why the
following well known fact is true.
“A number is divisible by 3 if the sum of its digits is divisible by 3.”
Proposition 20. Let n ∈ Z>0 . Then n is divisible by 3 or 9 if and only if the sum of its
digits (in base 10) is divisible by 3 or 9, respectively.

Proof. Let q = 3 or q = 9. We have
10 ≡ 1 (mod q) ⟹ 10^k ≡ 1 (mod q) for all k > 0.
In base 10,
n = a_k ⋅ 10^k + a_(k−1) ⋅ 10^(k−1) + ⋯ + a_1 ⋅ 10 + a_0, a_k ≠ 0
≡ a_k + a_(k−1) + ⋯ + a_1 + a_0 (mod q).
Therefore,
q ∣ n ⇐⇒ n ≡ 0 (mod q)
⇐⇒ a_k + a_(k−1) + ⋯ + a_1 + a_0 ≡ 0 (mod q)
⇐⇒ q ∣ a_k + a_(k−1) + ⋯ + a_1 + a_0,
as desired. □
Proposition 21. Let n ∈ Z>0 . Then n is divisible by 11 if and only if the alternating sum
of its digits (in base 10) is divisible by 11.

Proof. Note that
10 ≡ −1 (mod 11) ⟹ 10^k ≡ (−1)^k (mod 11) for all k > 0.
In base 10, we have
n = a_k ⋅ 10^k + a_(k−1) ⋅ 10^(k−1) + ⋯ + a_1 ⋅ 10 + a_0, a_k ≠ 0
≡ a_k (−1)^k + a_(k−1) (−1)^(k−1) + ⋯ + a_2 − a_1 + a_0 (mod 11),
therefore
11 ∣ n ⇐⇒ n ≡ 0 (mod 11)
⇐⇒ a_k (−1)^k + a_(k−1) (−1)^(k−1) + ⋯ + a_2 − a_1 + a_0 ≡ 0 (mod 11)
⇐⇒ 11 ∣ a_k (−1)^k + a_(k−1) (−1)^(k−1) + ⋯ + a_2 − a_1 + a_0,
as desired. □
Proposition 22. Let n, k ∈ Z>0. Then, n is divisible by 2^k if and only if the integer obtained
from the last k digits (in base 10) of n is divisible by 2^k.
Proof. We have
10 ≡ 0 (mod 2) ⟹ 10^j ≡ 0 (mod 2^j) for all j > 0,
and hence 10^i ≡ 0 (mod 2^j) for all i ≥ j.
From n = a_r ⋅ 10^r + ⋯ + a_1 ⋅ 10 + a_0, we obtain
n ≡ a_(j−1) ⋅ 10^(j−1) + ⋯ + a_1 ⋅ 10 + a_0 (mod 2^j).
The number on the right hand side of this congruence has base 10 representation (a_(j−1) ⋯ a_1 a_0)_10.
Taking j = k, this is the integer obtained from the last k digits of n, so 2^k divides n if and only if
2^k divides that integer, as desired. □
Examples 18.1.
(1) Let n = 4127835. Consider
S = sum of the digits of n = 4 + 1 + 2 + 7 + 8 + 3 + 5 = 30.
Since 3 ∣ S but 9 ∤ S, we conclude that 3 ∣ n but 9 ∤ n.
(2) Let n = 723160823. We have,
S = alternating sum of the digits of n = 7 − 2 + 3 − 1 + 6 − 0 + 8 − 2 + 3 = 22.
Then 11 ∣ n.
(3) Let n = 33678924. We have,
S = 3 − 3 + 6 − 7 + 8 − 9 + 2 − 4 = −4,
so that 11 ∤ n.
(4) Let n = 32688048. Since
2 ∣ 8, 4 ∣ 48, 8 ∣ 048, 16 ∣ 8048, 32 ∤ 88048,
we may conclude that 2, 4, 8, 16 ∣ n and 32 ∤ n.
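
The three divisibility criteria above can be sketched in Python as follows. The function names are our own and serve only as an illustration. Each test works with the decimal digits of n rather than with n itself, and the alternating sum is taken from the units digit, as in the proof of Proposition 21.

```python
def digits(n: int):
    """Decimal digits of a positive integer, most significant first."""
    return [int(ch) for ch in str(n)]

def divisible_by_3_or_9(n: int, q: int) -> bool:
    """Proposition 20: q is 3 or 9; test the digit sum."""
    return sum(digits(n)) % q == 0

def divisible_by_11(n: int) -> bool:
    """Proposition 21: test the alternating sum a_0 - a_1 + a_2 - ..."""
    reversed_digits = digits(n)[::-1]
    alternating = sum(d if i % 2 == 0 else -d
                      for i, d in enumerate(reversed_digits))
    return alternating % 11 == 0

def divisible_by_power_of_2(n: int, k: int) -> bool:
    """Proposition 22: test the integer formed by the last k digits."""
    last_k = int(str(n)[-k:])
    return last_k % 2**k == 0

print(divisible_by_3_or_9(4127835, 3), divisible_by_3_or_9(4127835, 9))  # True False
print(divisible_by_11(723160823), divisible_by_11(33678924))             # True False
print([divisible_by_power_of_2(32688048, k) for k in range(1, 6)])
# [True, True, True, True, False]
```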

18.2. The ISBN10 Code. In this section, we will apply congruences to describe the ISBN10
code and some of its properties. An ISBN10 code is a sequence of 10 digits, a1 , a2 ,. . . ,a10 ,
used to identify books, where
(i) 0 ≤ ai ≤ 9 for i = 1, . . . , 9;
(ii) a10 is an integer mod 11, where the letter X is used to denote 10 (mod 11).
An ISBN10 code is called valid if
S = ∑_{i=1}^{10} i ⋅ ai ≡ 0 (mod 11).

Examples 18.2.
(1) The code 0 − 321 − 50031 − 8 is valid because it satisfies
S = 1 ⋅ 0 + 2 ⋅ 3 + 3 ⋅ 2 + 4 ⋅ 1 + 5 ⋅ 5 + 6 ⋅ 0 + 7 ⋅ 0 + 8 ⋅ 3 + 9 ⋅ 1 + 10 ⋅ 8
≡ 16 + 49 + 89 ≡ 5 + 5 + 1 ≡ 0 (mod 11)
(2) The code 1 − 100 − 00000 − X is invalid since
S = 1 ⋅ 1 + 2 ⋅ 1 + 10 ⋅ 10 ≡ 103 ≡ 4 ≡/ 0 (mod 11).
Proposition 23. Let a1, a2, . . . , a9 be integers such that 0 ≤ ai ≤ 9 for i = 1, . . . , 9 and take
a10 ≡ ∑_{i=1}^{9} i ⋅ ai (mod 11), with 0 ≤ a10 ≤ 10,
where we write X for a10 if a10 ≡ 10 (mod 11). Then, a1 a2 ⋯ a10 is a valid ISBN10 code.

Proof.
S = ∑_{i=1}^{10} i ⋅ ai = (∑_{i=1}^{9} i ⋅ ai) + 10 a10 ≡ (∑_{i=1}^{9} i ⋅ ai) + 10 (∑_{i=1}^{9} i ⋅ ai) = 11 (∑_{i=1}^{9} i ⋅ ai) ≡ 0 (mod 11).
□
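
The validity condition and the check digit of Proposition 23 can be sketched in Python as follows. This is an illustration under our own naming; the three functions below are hypothetical helpers and not part of any ISBN library. Hyphens and spaces in the input are ignored.

```python
def isbn10_values(code: str):
    """Convert an ISBN10 string into the list [a1, ..., a10]."""
    chars = [c for c in code if c not in "- "]
    if len(chars) != 10:
        raise ValueError("an ISBN10 code has exactly 10 symbols")
    values = []
    for position, c in enumerate(chars, start=1):
        if position == 10 and c in "Xx":
            values.append(10)            # X denotes 10 (mod 11)
        elif c in "0123456789":
            values.append(int(c))
        else:
            raise ValueError("unexpected symbol: " + c)
    return values

def is_valid_isbn10(code: str) -> bool:
    """Check S = sum of i * a_i ≡ 0 (mod 11)."""
    values = isbn10_values(code)
    return sum(i * a for i, a in enumerate(values, start=1)) % 11 == 0

def check_digit(first_nine: str) -> str:
    """Return a10 as in Proposition 23, written as a digit or X."""
    ds = [int(c) for c in first_nine if c in "0123456789"]
    if len(ds) != 9:
        raise ValueError("expected exactly 9 digits")
    a10 = sum(i * a for i, a in enumerate(ds, start=1)) % 11
    return "X" if a10 == 10 else str(a10)

print(is_valid_isbn10("0-321-50031-8"))  # True
print(is_valid_isbn10("1-100-00000-X"))  # False
print(check_digit("0-321-50031"))        # 8
```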

Suppose that an ISBN10 code x = x1 ⋯x10 is transmitted and the code y = y1 ⋯y10 is received;
the transmission is successful if x = y. We say that y contains a single error if there exists a
single value of j such that
∀i ≠ j we have xi = yi and yj = xj + a with − 10 ≤ a ≤ 10, a ≠ 0.
We say that y contains a transposition error if there are j ≠ k such that
xj ≠ xk , yj = xk , yk = xj and yi = xi ∀i ≠ j, k.
Proposition 24. The ISBN10 code detects both single errors and transposition errors.

Proof. Let x be a valid code so that Sx = ∑_{i=1}^{10} i ⋅ xi ≡ 0 (mod 11).

We will assume that a single error has occurred in the transmission and will show that the
received code y is not valid. Indeed, let j and a be as described above and compute
Sy = ∑_{i=1}^{10} i ⋅ yi = ∑_{i≠j} i ⋅ yi + j ⋅ yj = ∑_{i≠j} i ⋅ xi + jxj + ja = Sx + j ⋅ a ≡ ja (mod 11).

Since 11 is prime and 11 ∤ j and 11 ∤ a (as 1 ≤ j ≤ 10 and 1 ≤ ∣a∣ ≤ 10), we have
Sy ≡ ja ≢ 0 (mod 11),
hence y is not valid.
Suppose now that a transposition error has occurred in the transmission. We will show that
y is not valid. Indeed, let j, k be as described above and compute
Sy = ∑_{i=1}^{10} i ⋅ yi
= ∑_{i≠j,k} i ⋅ yi + k yk + j yj
= ∑_{i≠j,k} i ⋅ xi + k xj + j xk
= ∑_{i=1}^{10} i ⋅ xi + k xj + j xk − k xk − j xj
= Sx + (k − j)(xj − xk) ≡ 0 + (k − j)(xj − xk) (mod 11).
Since 1 ≤ ∣k − j∣, ∣xj − xk∣ ≤ 10 and 11 is a prime we conclude that 11 ∤ (k − j)(xj − xk), hence
Sy ≢ 0 (mod 11), as desired. □
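
Proposition 24 can be illustrated by exhaustively corrupting one valid code. The Python sketch below uses our own naming and is only an experiment on a single code, not a proof. It counts the single errors and the transposition errors that would go undetected, and by the proposition both counts should be zero.

```python
def weighted_sum(values):
    """Return S = sum of i * a_i for i = 1..10, reduced mod 11."""
    return sum(i * a for i, a in enumerate(values, start=1)) % 11

def undetected_single_errors(code):
    """Count single-symbol changes that still give a valid code."""
    count = 0
    for j in range(10):
        largest = 10 if j == 9 else 9   # only a10 may equal 10 (written X)
        for new in range(largest + 1):
            if new != code[j]:
                corrupted = code[:j] + [new] + code[j + 1:]
                if weighted_sum(corrupted) == 0:
                    count += 1
    return count

def undetected_transpositions(code):
    """Count swaps of two distinct symbols that still give a valid code."""
    count = 0
    for j in range(10):
        for k in range(j + 1, 10):
            if code[j] != code[k]:
                swapped = code[:]
                swapped[j], swapped[k] = swapped[k], swapped[j]
                if weighted_sum(swapped) == 0:
                    count += 1
    return count

valid_code = [0, 3, 2, 1, 5, 0, 0, 3, 1, 8]   # the code from Examples 18.2 (1)
print(weighted_sum(valid_code))               # 0
print(undetected_single_errors(valid_code))   # 0
print(undetected_transpositions(valid_code))  # 0
```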

Exercises.
Exercise 18.3. Suppose that n = 81294358X. Write down a digit in the slot marked X so
that n is divisible by
(a) 11
(b) 9
(c) 4
Exercise 18.4. Suppose that one digit, indicated with a question mark, in each of the
following ISBN10 codes has been smudged and cannot be read. What should this missing
digit be?
(a) 0 − 19 − 8?3804 − 9
(b) ? − 261 − 05073 − X

19. Wilson’s Theorem

Theorem 21 (Wilson’s Theorem). Let p be a prime. Then (p − 1)! ≡ −1 (mod p).

In order to prove this theorem, we will need the following result. Note that this result is also
relevant on its own.
Lemma 8. Let a, p ∈ Z with p a prime and a invertible mod p, that is, p ∤ a. Then
a ≡ a^(−1) (mod p) if and only if a ≡ ±1 (mod p).

Proof. Suppose first that a ≡ ±1 (mod p). Recall that a^(−1) is an integer satisfying a ⋅ a^(−1) ≡ 1 (mod p).
Since 1 ⋅ 1 ≡ 1 (mod p) and (−1)(−1) ≡ 1 (mod p) we conclude, in both cases, that a ≡ a^(−1) (mod p).
Conversely, suppose a ≡ a^(−1) (mod p). Multiplying both sides by a then yields
a^2 ≡ 1 (mod p) ⇐⇒ a^2 − 1 = pk, for some k ∈ Z
⇐⇒ p ∣ (a − 1)(a + 1)
⟹ p ∣ (a − 1) or p ∣ (a + 1) by Corollary 6
⇐⇒ a ≡ 1 (mod p) or a ≡ −1 (mod p).
□
Remark 19.1. In Lemma 8, the condition that p is prime is necessary. For example, take
a = 3 and p = 8; since 3 ⋅ 3 = 9 ≡ 1 (mod 8), we have a^(−1) ≡ 3 ≡ a (mod 8) but 3 ≢ ±1 (mod 8).

Before we prove Wilson’s theorem, let us verify it via an example. This example illustrates
the main idea of the proof.
Example 19.2. Let p = 7. Wilson’s theorem tells us that (7 − 1)! = 6! ≡ −1 (mod 7). We
now verify this by direct computation.
6! = 6 ⋅ 5 ⋅ 4 ⋅ 3 ⋅ 2 ⋅ 1
= 1 ⋅ 6 ⋅ (2 ⋅ 4) ⋅ (3 ⋅ 5)
≡ 1 ⋅ 6 ⋅ 1 ⋅ 1 ≡ −1 (mod 7),
as expected. In the second equality, we note that we have reordered the integers in the
product. This reordering pairs the numbers in the brackets with their inverses mod p.

Proof of Wilson’s Theorem. For p = 2, 3 the theorem holds. Indeed,


(2 − 1)! = 1 ≡ −1 (mod 2) and (3 − 1)! = 2 ≡ −1 (mod 3).

Suppose p > 3 is prime. We know that every a ≢ 0 (mod p) has an inverse a^(−1) which is
unique in the range 1 ≤ a^(−1) ≤ p − 1. Also, by Lemma 8, only 1 and p − 1 are their own inverses.
Therefore the set S = {2, . . . , p − 2} contains p − 3 > 0 elements which can be grouped into
(p − 3)/2 pairs of the form {a, a^(−1)}. This is the generalization of the situation in Example 19.2,
where we have the pairs {a = 2, a^(−1) = 4} and {a = 3, a^(−1) = 5}.
Now, the product of the elements of S satisfies
2 ⋅ 3 ⋅ . . . ⋅ (p − 2) ≡ (2 ⋅ 2^(−1))(3 ⋅ 3^(−1)) ⋯ ≡ 1 (mod p).
Multiplying this congruence by 1 on the left and p − 1 on the right gives
(p − 1)! = 1 ⋅ (2 ⋅ 3 ⋅ . . . ⋅ (p − 2))(p − 1) ≡ 1 ⋅ 1 ⋅ (p − 1) ≡ −1 (mod p).
□
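
The pairing argument in the proof can be made concrete for small primes. The Python sketch below is an illustration under our own naming (inverse_pairs is hypothetical). It assumes Python 3.8 or later for pow(a, -1, p), groups the elements of {2, . . . , p − 2} into pairs {a, a^(−1)}, and then checks Wilson's theorem directly.

```python
from math import factorial

def inverse_pairs(p: int):
    """Pair each a in {2, ..., p-2} with its inverse modulo the prime p."""
    pairs = []
    seen = set()
    for a in range(2, p - 1):
        if a in seen:
            continue
        inv = pow(a, -1, p)       # inverse of a modulo p
        pairs.append((a, inv))
        seen.update((a, inv))
    return pairs

print(inverse_pairs(7))   # [(2, 4), (3, 5)], as in Example 19.2

# Direct check of (p - 1)! ≡ -1 (mod p) for a few small primes.
for p in [2, 3, 5, 7, 11, 13]:
    assert factorial(p - 1) % p == p - 1
```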

Exercises.
Exercise 19.3. For each of the following congruences, find the least nonnegative integer x
that satisfies it.
(a) 60!/31! ≡ x (mod 31)
(b) 59!/30! ≡ x (mod 31)

20. Fermat’s Little Theorem

Theorem 22 (Fermat’s Little Theorem). Let p be a prime. If a ∈ Z satisfies (a, p) = 1, then
a^(p−1) ≡ 1 (mod p).

Proof. Let a ∈ Z be coprime to p and consider the sequence of integers


(20.1) a, 2a, 3a, . . . , (p − 1)a.

Claim: The integers in (20.1) are all distinct mod p and not congruent to zero mod p.
It follows from the claim that the sequence
a (mod p), 2a (mod p), . . . , (p − 1)a (mod p)
is comprised of p − 1 distinct integers in the interval [1, p − 1]. Hence, they must be the
integers 1, 2, . . . , p − 1 in some order (i.e. multiplication by a mod p is reordering them).
Therefore, by taking the product mod p of the elements in (20.1), we obtain
a ⋅ (2a) ⋅ (3a) ⋯ ((p − 1)a) ≡ 1 ⋅ 2 ⋅ 3 ⋯ (p − 1) (mod p)
= (p − 1)! (mod p).
Since we also have
a ⋅ (2a) ⋅ (3a) ⋯ ((p − 1)a) ≡ a^(p−1) (1 ⋅ 2 ⋅ 3 ⋯ (p − 1)) = a^(p−1) (p − 1)! (mod p),
it follows that
a^(p−1) (p − 1)! ≡ (p − 1)! (mod p).
Now, by Wilson’s theorem, we conclude that
a^(p−1) (−1) ≡ −1 (mod p) ⇐⇒ a^(p−1) ≡ 1 (mod p),
as desired. To complete the proof, it remains to prove the claim.
Proof of Claim: Suppose ka ≡ k′a (mod p). Note that a^(−1) exists since (a, p) = 1. Then,
multiplying the previous congruence by a^(−1), we obtain
ka ≡ k′a (mod p) ⇐⇒ k(a ⋅ a^(−1)) ≡ k′(a ⋅ a^(−1)) (mod p) ⟹ k ≡ k′ (mod p) ⟹ k = k′,
where the last implication follows from Corollary 8 because 1 ≤ k, k′ ≤ p − 1. Finally, since
p ∤ a and p ∤ k, we conclude ka ≢ 0 (mod p) for all ka in (20.1), completing the proof. □

We have the following three corollaries of FLT.


Corollary 15. Let p be prime and let a be any integer. Then a^p ≡ a (mod p).

Proof. Let a ∈ Z. If p ∣ a, then p ∣ a^p and we have a ≡ 0 ≡ a^p (mod p).
Suppose p ∤ a. Then (a, p) = 1 and a^(p−1) ≡ 1 (mod p) by FLT. Multiplying both sides by a
gives a^p ≡ a (mod p), as desired. □
Corollary 16. Let p be prime and a ∈ Z coprime to p. Suppose d ≡ e (mod p − 1).
Then a^d ≡ a^e (mod p).
Proof. If d = e then a^d = a^e and the result is trivial. WLOG, suppose that d > e.
We have d − e = (p − 1)k for some k ∈ Z>0. Thus
a^d = a^(e+(p−1)k) = a^e ⋅ (a^(p−1))^k ≡ a^e ⋅ 1^k ≡ a^e (mod p),
where we used FLT to conclude a^(p−1) ≡ 1 (mod p). □
Examples 20.2.
(1) Compute 3^{201} (mod 11). By Fermat’s Little Theorem, we have 3^{10} ≡ 1 (mod 11), hence
3^{201} = (3^{10})^{20} ⋅ 3 ≡ 1^{20} ⋅ 3 ≡ 3 (mod 11).
(2) Compute 2^{180} (mod 89). Note that p = 89 is prime and p − 1 = 88. By Corollary 16, since
180 ≡ 4 (mod 88), we have 2^{180} ≡ 2^4 ≡ 16 (mod 89).
(3) Exercise 3.5 follows directly from Corollary 15 and the definition of congruence, that is
5 ∣ n^5 − n.
(4) Note that FLT and Corollary 15 do not hold for a non-prime modulus; indeed, we have
3^4 ≡ 1 ≢ 3 (mod 4); this is a reformulation of Exercise 3.6.
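The computations in Examples 20.2 can be checked numerically. The following is a minimal sketch, not part of the original notes; the helper name fermat_reduce is hypothetical, and it relies only on Python's built-in three-argument pow.

```python
def fermat_reduce(a, d, p):
    """Compute a^d mod p for a prime p not dividing a.

    Following Corollary 16, the exponent d is first reduced mod p - 1.
    (Hypothetical helper, written for illustration only.)
    """
    return pow(a, d % (p - 1), p)


# Example 20.2 (1) and (2)
print(fermat_reduce(3, 201, 11))
print(fermat_reduce(2, 180, 89))

# Example 20.2 (3): 5 divides n^5 - n for every integer n (checked on a small range)
print(all((n**5 - n) % 5 == 0 for n in range(-50, 51)))

# Example 20.2 (4): the modulus 4 is not prime, and 3^4 mod 4 is 1, not 3
print(pow(3, 4, 4))
```

Reducing the exponent first is not needed for speed here, since pow already uses fast modular exponentiation; it is done only to mirror the statement of Corollary 16.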

Exercises.
Exercise 20.3. Let p and q be distinct odd prime numbers with p − 1 ∣ q − 1. If a ∈ Z with
(a, pq) = 1, prove that a^{q−1} ≡ 1 (mod pq).

21. Primality Testing, Pseudoprimes, and Carmichael Numbers

Given a positive integer n it is important to decide if it is a prime number. From Proposition 7
we know that it is enough to test divisibility of n by primes up to √n; if no such prime
divides n we conclude that n is a prime number. This test, however, is not practical when
n is very large, so other tests are needed. In this section we will describe how the theorems
in Sections 19 and 20 can be used to obtain more efficient primality tests.
We start by showing that the converse of Wilson’s theorem provides a primality test.
Proposition 25. Let n ∈ Z>1 satisfy (n − 1)! ≡ −1 (mod n). Then n is a prime number.

Proof. Suppose that n is a composite number such that (n − 1)! ≡ −1 (mod n). In particular,
say n factors into n = a ⋅ b where 1 < a, b < n. We observe that a ≤ n − 1, so a ∣ (n − 1)!.
Moreover,
(n − 1)! ≡ −1 (mod n) ⇐⇒ n ∣ (n − 1)! + 1.
Lastly, since a ∣ n and n ∣ (n − 1)! + 1, we have a ∣ (n − 1)! + 1, and in particular, a divides the
difference,
a ∣ ((n − 1)! + 1 − (n − 1)!) = 1 Ô⇒ a = 1.
This is a contradiction, hence n is prime. K
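As an illustration of Proposition 25 together with Wilson’s theorem, here is a small sketch (the function name is_prime_wilson is an assumption made for this example). It reduces the factorial mod n after every multiplication, and still needs about n multiplications.

```python
def is_prime_wilson(n):
    """Return True exactly when (n - 1)! ≡ -1 (mod n), for an integer n > 1."""
    factorial = 1
    for k in range(2, n):
        factorial = (factorial * k) % n  # keep the running product reduced mod n
    return factorial == n - 1            # n - 1 represents -1 mod n


print([n for n in range(2, 30) if is_prime_wilson(n)])
```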

We remark that this proposition, together with Wilson’s theorem, shows that the condition
(n − 1)! ≡ −1 (mod n) is equivalent to n being prime. This can be very helpful for theoretical
arguments, but in practice it is not a good test: computing (n − 1)! mod n directly takes about
n multiplications, which is infeasible when n is very large.
The following test is much better in practice.
Theorem 23 (Fermat’s Test). Let n, b ∈ Z_{>1} with 1 < b < n.
If b^{n−1} ≢ 1 (mod n), then n is composite.

Proof. If n is prime, we have (b, n) = 1 and b^{n−1} ≡ 1 (mod n) by FLT. ∎


Example 21.1. Consider n = 91. Since 2^{91−1} ≡ 64 (mod 91) and 64 ≢ 1 (mod 91), Fermat’s
test implies that 91 is composite; indeed 91 = 13 ⋅ 7.

Unlike the condition (n − 1)! ≡ −1 (mod n), Fermat’s test does not characterize prime numbers. That is,
the converse of the theorem is false: b^{n−1} ≡ 1 (mod n)
does not necessarily mean that n is prime, as the following example illustrates.
Example 21.2. Taking n = 341 and b = 2, we observe that 2^{340} ≡ 1 (mod 341) but
341 = 11 ⋅ 31 so that 341 is not prime.
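Fermat’s test is a single modular exponentiation. The sketch below (the name fermat_test is hypothetical) reproduces Examples 21.1 and 21.2 and also tries the base 3 for n = 341.

```python
def fermat_test(n, b):
    """Return True if n passes Fermat's test in base b, i.e. b^(n-1) ≡ 1 (mod n).

    A result of False proves that n is composite; True proves nothing by itself.
    """
    return pow(b, n - 1, n) == 1


print(pow(2, 90, 91))       # the residue used in Example 21.1
print(fermat_test(91, 2))   # 91 fails, so it is composite
print(fermat_test(341, 2))  # 341 passes in base 2 although 341 = 11 * 31
print(fermat_test(341, 3))  # a different base exposes 341
```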

Composite numbers which pass Fermat’s test deserve a special name.


Definition 21.3. If n ∈ Z_{>1} is composite and satisfies b^{n−1} ≡ 1 (mod n) for some 1 < b < n,
we say that n is a pseudoprime to the base b.
Examples 21.4.
(1) 2^{340} ≡ 1 (mod 341) but 341 = 11 ⋅ 31, hence 341 is a pseudoprime for base b = 2.
(2) 341 is not a pseudoprime for base b = 3. Indeed, 3^{30} ≡ 1 (mod 31) by Fermat’s Little
Theorem, therefore
3^{340} ≡ (3^{30})^{11} ⋅ 3^{10} ≡ 1^{11} ⋅ 3^{10} (mod 31)
and since
3^{10} ≡ (3^3)^3 ⋅ 3 ≡ (−4)^3 ⋅ 3 ≡ 25 (mod 31),
we conclude 3^{340} ≢ 1 (mod 31). Because 31 ∣ 341, it follows from Proposition 19 that
3^{340} ≢ 1 (mod 341), as desired.

The previous example shows that 341 passes Fermat’s test in base 2 but not in base 3. It is
natural to wonder if there are composite integers n that pass Fermat’s test in every base coprime to n.
Definition 21.5. We call an integer n > 1 a Carmichael number if it is a pseudoprime for
every base b ≥ 2 such that (n, b) = 1.

It is not easy to prove that Carmichael numbers actually exist. The following theorem
classifies them, allowing us to decide if an integer is a Carmichael number without checking
the definition. For now, we will only prove one implication of the theorem, as the other
direction (Theorem 44) requires the notion of primitive roots, which will only be introduced
in Section 26.
Definition 21.6. We say that an integer n is squarefree if no square number divides it. In
particular, the prime factorization of n contains only primes with exponent one.
Theorem 24 (Korselt). A composite positive integer n is a Carmichael number if and only
if
(i) n is squarefree and
(ii) if p ∣ n is prime then p − 1 ∣ n − 1.

Proof. For now, we will only prove one implication. Suppose (i) and (ii) hold for n and let
b ∈ Z satisfy (b, n) = 1.
From (i), we have n = p_1 ⋯ p_k with p_i distinct primes. Then (b, p_i) = 1 for i = 1, . . . , k.
From (ii), we have, for i = 1, . . . , k, that n − 1 = (p_i − 1)k_i for some k_i ∈ Z. Then
b^{n−1} ≡ (b^{p_i −1})^{k_i} ≡ 1^{k_i} ≡ 1 (mod p_i),
where the second congruence follows from FLT. Therefore, the system of congruences
x ≡ 1 (mod p_1)
⋮
x ≡ 1 (mod p_k)
has the solution x = b^{n−1}. Clearly, x = 1 is also a solution to the above system. From
the uniqueness part of CRT, we have b^{n−1} ≡ 1 (mod n), where n = p_1 ⋯ p_k. This shows that n is a
pseudoprime for base b. Since b was an arbitrary base coprime to n, we conclude that this holds for
every b with 1 < b < n and (b, n) = 1, so that n is a Carmichael number. ∎
Remark 21.7. In the previous proof, we could replace CRT by the following argument. For
all i = 1, . . . , k, we have b^{n−1} ≡ 1 (mod p_i), hence p_i ∣ b^{n−1} − 1. Then lcm(p_1, . . . , p_k) ∣ b^{n−1} − 1 by
Proposition 12. Since the p_i are distinct primes,
lcm(p_1, . . . , p_k) = p_1 ⋯ p_k = n,
hence b^{n−1} ≡ 1 (mod n).
Example 21.8. The number 561 is the smallest Carmichael number. Indeed, 561 = 3 ⋅ 11 ⋅ 17
and 3 − 1 = 2, 11 − 1 = 10, and 17 − 1 = 16 all divide 561 − 1 = 560 = 2^4 ⋅ 5 ⋅ 7.
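Conditions (i) and (ii) of Korselt’s criterion are easy to test once n is factored. The following sketch uses trial-division factoring, which is only suitable for small n; the names prime_factorization and satisfies_korselt are assumptions made for this illustration.

```python
def prime_factorization(n):
    """Return {prime: exponent} for n >= 1, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def satisfies_korselt(n):
    """Check that n is composite, squarefree, and p - 1 divides n - 1 for every prime p | n."""
    factors = prime_factorization(n)
    composite = sum(factors.values()) >= 2
    squarefree = all(e == 1 for e in factors.values())
    return composite and squarefree and all((n - 1) % (p - 1) == 0 for p in factors)


print([n for n in range(2, 2000) if satisfies_korselt(n)])
```

The first value printed is 561, in agreement with Example 21.8.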

To conclude this section, we describe a primality test which is a refinement of Fermat’s test.

21.1. Miller’s Test. Let n > 0 be odd and suppose n passes Fermat’s test for the base b ≥ 2.
That is,
b^{n−1} ≡ 1 (mod n).
Write x = b^{(n−1)/2} (mod n). If n is prime, since x^2 ≡ b^{n−1} ≡ 1 (mod n), it follows from Lemma 8
that x ≡ ±1 (mod n). So, if b^{(n−1)/2} ≢ ±1 (mod n), then n is composite.
Suppose we failed to conclude that n is composite in the previous step. If b^{(n−1)/2} ≡ 1 (mod n)
and n − 1 is divisible by 4, then we can repeat the argument with y = b^{(n−1)/4}.
Indeed, y^2 ≡ b^{(n−1)/2} ≡ 1 (mod n) implies y ≡ ±1 (mod n) if n is prime. Then, if we have
b^{(n−1)/4} ≢ ±1 (mod n) we conclude that n is composite. If we fail again to conclude that n is
composite we can repeat this procedure as long as (n − 1)/2^k is an integer and b^{(n−1)/2^{k−1}} ≡ 1 (mod n).
Example 21.9. We have seen that n = 561 is the smallest Carmichael number. In other
words,
b^{560} ≡ 1 (mod 561) for all b ≥ 2 satisfying (b, n) = 1.
Let b = 5. Then 5^{280} ≡ 67 ≢ ±1 (mod 561) so that n is composite by Miller’s test.
Let b = 2; we have 2^{280} ≡ 1 (mod 561) but 2^{140} ≡ 67 ≢ ±1 (mod 561) and we conclude again
that n is composite. Note, however, that depending on the base b we may need a different
number of steps in Miller’s test.

There are composite integers which fool the test, and we often refer to these integers as strong
pseudoprimes.
Example 21.10. Let n = 2047 = 23 ⋅ 89. Then
2^{2046} = (2^{11})^{186} = (2048)^{186} ≡ 1 (mod 2047),
so n is a pseudoprime in base b = 2. Moreover,
(n − 1)/2 = 1023 and 2^{1023} = (2^{11})^{93} = 2048^{93} ≡ 1 (mod 2047).
Since 1023 is odd, the procedure stops here without detecting that n is composite,
so 2047 fools Miller’s Test for base b = 2.

It is convenient to summarize the conditions under which Miller’s test fails.


Definition 21.11. Let n ∈ Z_{>2} be odd. Write n − 1 = 2^s t, where s ≥ 1 and t is odd. We say
that n passes Miller’s test for base b if either b^t ≡ 1 (mod n) or b^{2^j t} ≡ −1 (mod n) for some
j in 0 ≤ j ≤ s − 1.
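Definition 21.11 translates almost line by line into code. The sketch below assumes n is odd and greater than 2; the function name passes_miller is hypothetical. It computes b^t once and then squares repeatedly, which visits b^{2^j t} for j = 0, . . . , s − 1.

```python
def passes_miller(n, b):
    """Return True if the odd integer n > 2 passes Miller's test for base b."""
    # Write n - 1 = 2^s * t with t odd.
    s, t = 0, n - 1
    while t % 2 == 0:
        s += 1
        t //= 2
    x = pow(b, t, n)              # x = b^t mod n
    if x == 1 or x == n - 1:      # b^t ≡ 1, or the case j = 0
        return True
    for _ in range(s - 1):        # the cases j = 1, ..., s - 1
        x = (x * x) % n
        if x == n - 1:
            return True
    return False


print(passes_miller(561, 5), passes_miller(561, 2))  # Example 21.9
print(passes_miller(2047, 2))                        # Example 21.10
```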

We have seen that Carmichael numbers fool Fermat’s test for every base. The following
theorem, which we will not prove, shows that this is not possible for Miller’s test.
Theorem 25. Let n ∈ Z_{>0} be odd and composite. Then n fools Miller’s test for at most (n − 1)/4
bases b such that 1 ≤ b ≤ n − 1.

Based on this theorem, there is the following very practical primality test.
Theorem 26 (Rabin’s probabilistic test). Let n ∈ Z_{>0} be odd and composite. Choose
b_1, . . . , b_k ∈ Z independently and uniformly at random such that 1 < b_i ≤ n − 1. Then the
probability that n passes Miller’s test for all b_i is less than 1/4^k.
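A sketch of how Theorem 26 is used in practice is given below. It is an illustration only: the names passes_miller and is_probably_prime, the default number of rounds, and the handling of small and even n are all assumptions of this example, and the bases are drawn with Python's standard random module.

```python
import random


def passes_miller(n, b):
    """Miller's test for an odd n > 2 and base b (as in Definition 21.11)."""
    s, t = 0, n - 1
    while t % 2 == 0:
        s += 1
        t //= 2
    x = pow(b, t, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = (x * x) % n
        if x == n - 1:
            return True
    return False


def is_probably_prime(n, rounds=20):
    """Return False if n is certainly composite, True if n passed every round.

    For composite odd n, Theorem 26 bounds the chance of a wrong True by 1/4^rounds.
    """
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    for _ in range(rounds):
        b = random.randrange(2, n)  # random base with 1 < b <= n - 1
        if not passes_miller(n, b):
            return False
    return True


print(is_probably_prime(2**61 - 1), is_probably_prime(561))
```

A prime n passes for every base, so the answer False is always correct; only the answer True carries a (very small) probability of error.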

Exercises.
Exercise 21.12. Prove that 1729 is a Carmichael number.
Exercise 21.13. Use Miller’s Test in base b = 2 to show that 1729 is composite.

22. Euler’s φ-Function and Euler’s Theorem

Fermat’s Little Theorem tells us that the (p − 1)-th power of any integer coprime to p is congruent
to one mod p. In this section we will study Euler’s theorem which generalizes this idea
to any congruence modulus m. In other words, for any fixed m, Euler’s theorem determines
y > 0 (depending on m) such that, for all a ∈ Z coprime to m, we have a^y ≡ 1 (mod m).
To state Euler’s theorem we first need to introduce a very important function.
Definition 22.1. The Euler φ-function is the function φ ∶ Z>0 → Z>0 defined by
φ(n) = # {x ∈ Z ∶ 1 ≤ x ≤ n and (x, n) = 1} .
In words, it counts the number of positive integers up to n that are coprime to n.
Examples 22.2.
(1) φ(1) = φ(2) = 1;
(2) φ(3) = 2 since both 1 and 2 are coprime to 3;
(3) φ(6) = 2 since, from {1, 2, 3, 4, 5, 6}, only 1 and 5 are coprime to 6;
(4) For any prime p, since p ∤ x if x < p, we have
φ(p) = # {x ∈ Z ∶ 1 ≤ x ≤ p and (x, p) = 1} = # {x ∈ Z ∶ 1 ≤ x ≤ p − 1} = p − 1.
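Definition 22.1 can be evaluated directly by counting. The following sketch (with the hypothetical name phi) is slow for large n but convenient for checking small examples such as those in Examples 22.2.

```python
from math import gcd


def phi(n):
    """Euler's φ-function computed from the definition, for n >= 1."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


print([phi(n) for n in (1, 2, 3, 6)])
print(all(phi(p) == p - 1 for p in (2, 3, 5, 7, 11, 13)))
```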
Theorem 27 (Euler). Let a, m ∈ Z with m > 0 and (a, m) = 1. Then,
a^{φ(m)} ≡ 1 (mod m).

Observe that, as a direct consequence of Example 22.2 (4) and Euler’s theorem, we recover
FLT.
Corollary 17. Let p be a prime and let a ∈ Z be coprime to p. Then φ(p) = p − 1 and a^{p−1} ≡ 1 (mod p).

Proof of Euler’s Theorem. Let a ∈ Z satisfy (a, m) = 1. From the definition of φ(m), there
are φ(m) distinct positive integers, a_1, . . . , a_{φ(m)}, such that a_i ≤ m and (a_i, m) = 1. Consider
the sequence of integers
(22.3) a ⋅ a_1, a ⋅ a_2, . . . , a ⋅ a_{φ(m)}.
Claim. The integers in (22.3) are all distinct mod m, satisfy (a ⋅ a_i, m) = 1, and are not
congruent to zero mod m.
It follows from the claim that the mod m sequence
a ⋅ a_1 (mod m), a ⋅ a_2 (mod m), . . . , a ⋅ a_{φ(m)} (mod m)
is made of φ(m) distinct integers in the interval [1, m − 1] which are coprime to m (by
Proposition 16). Since the integers with these properties are a_1, a_2, . . . , a_{φ(m)}, we conclude that the
mod m sequence must be the integers a_1, a_2, . . . , a_{φ(m)} in some order (i.e. multiplication by
a is reordering them). Therefore, by taking their product, we get
(a ⋅ a_1) ⋅ (a ⋅ a_2) ⋯ (a ⋅ a_{φ(m)}) ≡ a_1 ⋅ a_2 ⋯ a_{φ(m)} (mod m)
⇐⇒ a^{φ(m)} (a_1 a_2 ⋯ a_{φ(m)}) ≡ a_1 a_2 ⋯ a_{φ(m)} (mod m).
Write A = a_1 a_2 ⋯ a_{φ(m)}. Clearly, (A, m) = 1, therefore A is invertible mod m, and multiplying
the last congruence by A^{−1} yields
a^{φ(m)} ≡ 1 (mod m),
as desired. To complete the proof, we now prove the claim.
Proof of Claim. Suppose a ⋅ a_i ≡ a ⋅ a_j (mod m). Since (a, m) = 1, the inverse a^{−1} exists so
we can cancel the a in the previous congruence to obtain
a_i ≡ a_j (mod m) with 0 ≤ a_i, a_j ≤ m − 1.
It now follows from Corollary 8 that a_i = a_j. Suppose (a ⋅ a_i, m) > 1 for some i. Then there
exists a prime p such that p ∣ a a_i and p ∣ m; hence (p ∣ a and p ∣ m) or (p ∣ a_i and p ∣ m). This
implies (a, m) > 1 or (a_i, m) > 1, a contradiction. We conclude (a ⋅ a_i, m) = 1. Clearly a ⋅ a_i ≢ 0
(mod m), otherwise m ∣ a ⋅ a_i, completing the proof. ∎
Remark 22.4. Since Euler’s theorem implies FLT (see Corollary 17), the previous proof, when
restricted to m = p a prime, must also provide a proof of Fermat’s Little Theorem. Indeed,
comparing both proofs, we see that the main difference is that instead of using Wilson’s
theorem, we used the fact that A = a_1 a_2 ⋯ a_{φ(m)} is invertible. Of course A is invertible in the
proof of FLT, since there A = (p − 1)! ≡ −1 (mod p) by Wilson’s theorem.
Definition 22.5. A set of integers with φ(m) elements which are coprime to m such that
no two of them are congruent modulo m is called a reduced residue system modulo m.

We extract the following corollary from the previous proof.


Corollary 18. Let a, m ∈ Z with m > 0 and (a, m) = 1. If {a_1, a_2, . . . , a_{φ(m)}} is a reduced
residue system modulo m, then so is {a ⋅ a_1, a ⋅ a_2, . . . , a ⋅ a_{φ(m)}}.
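The key step in the proof of Euler’s theorem, restated in Corollary 18, is that multiplication by a permutes a reduced residue system. The sketch below checks this for small moduli m > 1; the helper names reduced_residues and multiply_system are hypothetical.

```python
from math import gcd


def reduced_residues(m):
    """The standard reduced residue system mod m: 1 <= x <= m with (x, m) = 1."""
    return [x for x in range(1, m + 1) if gcd(x, m) == 1]


def multiply_system(a, m):
    """Multiply every element of the standard system by a, reduce mod m, and sort."""
    return sorted((a * x) % m for x in reduced_residues(m))


m, a = 15, 7
print(reduced_residues(m))
print(multiply_system(a, m))                # the same residues, reordered
print(pow(a, len(reduced_residues(m)), m))  # a^φ(m) mod m
```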

22.1. A Formula for φ. The following theorem gives a formula to compute φ(n). We will
prove this formula in Section 24, after studying arithmetic functions in Section 23. For the moment, we
are interested in using the formula to illustrate different kinds of calculations involving the
function φ(n).
Theorem 28. Let n ∈ Z_{>1} have factorization n = p_1^{a_1} ⋯ p_k^{a_k}, a_j ≥ 1 and p_j distinct primes.
Then, φ(n) is given by the formula
(22.6) φ(n) = n (1 − 1/p_1) (1 − 1/p_2) ⋯ (1 − 1/p_k) = ∏_{j=1}^{k} p_j^{a_j −1} (p_j − 1).

Examples 22.7.
(1) φ(100) = φ(2^2 ⋅ 5^2) = 100 (1 − 1/2) (1 − 1/5) = 40.
(2) We will determine the last two (decimal) digits of 3^{50}, i.e. 3^{50} (mod 100). By Euler’s
Theorem, we have 3^{40} = 3^{φ(100)} ≡ 1 (mod 100). Then
3^{50} = 3^{40} ⋅ 3^{10} ≡ 1 ⋅ 3^{10} ≡ 3^4 ⋅ 3^4 ⋅ 3^2 ≡ (−19)^2 ⋅ 9 ≡ 49 (mod 100).
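Formula (22.6) gives a much faster way to compute φ(n) than counting, provided n can be factored. The sketch below factors n by trial division (an assumption suitable only for small n); the name phi_from_factorization is hypothetical.

```python
def phi_from_factorization(n):
    """Compute φ(n) as the product of p^(a-1) * (p - 1) over the prime powers p^a of n."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            power = 1
            while n % p == 0:      # strip the full power p^a from n
                n //= p
                power *= p
            result *= (power // p) * (p - 1)
        p += 1
    if n > 1:                      # one prime factor with exponent 1 is left
        result *= n - 1
    return result


print(phi_from_factorization(100))  # Example 22.7 (1)
print(pow(3, 40, 100))              # Euler's theorem for a = 3, m = 100
print(pow(3, 50, 100))              # the last two digits in Example 22.7 (2)
```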
Example 22.8. Find all integers n > 0 satisfying φ(n) = 1. We know from Examples 22.2
that φ(2) = φ(1) = 1. We will now show that no other integer has this property.
Write n = p_1^{a_1} ⋯ p_k^{a_k} for the prime factorization of n and suppose φ(n) = 1. Then, from the
formula (22.6), we obtain
φ(n) = ∏_{i=1}^{k} p_i^{a_i −1} (p_i − 1) = 1,
which implies (p_i − 1) ∣ 1 for all i. That is, p_i = 2 for all i. Hence, if n ≠ 1, we have n = 2^{a_1}
with a_1 ≥ 1 and therefore
φ(n) = 2^{a_1 −1} = 1 ⟹ a_1 = 1 ⟹ n = 2.
Example 22.9. Find all integers n > 0 satisfying φ(n) = 3.
Write n = p_1^{a_1} ⋯ p_k^{a_k} for the prime factorization of n and suppose φ(n) = 3. Then, from the
formula (22.6), we get
φ(n) = ∏_{i=1}^{k} p_i^{a_i −1} (p_i − 1) = 3,
which implies p_i − 1 ∣ 3 for all i. Thus p_i − 1 = 1 or p_i − 1 = 3 for all i. Note that the second
case is impossible, because p_i = 4 is not a prime. We conclude that p_i = 2 for all i. Hence,
if n ≠ 1, we have n = 2^{a_1} with a_1 ≥ 1, therefore φ(n) = 2^{a_1 −1} = 3, which is impossible. In
addition, n = 1 clearly has φ(1) ≠ 3 so that there are no solutions to the equation.
Example 22.10. Find all integers n > 0 satisfying φ(n) = 8.
Let n have prime factorization n = p_1^{a_1} ⋯ p_k^{a_k} and suppose φ(n) = 8. If, for some j, p_j > 9
then, by the formula (22.6), we have φ(n) ≥ p_j − 1 > 8, a contradiction. If p_j = 7 ∣ n then
p_j − 1 = 6 ∣ φ(n) = 8, another contradiction. We conclude that n = 2^a ⋅ 3^b ⋅ 5^c, where a, b, c ≥ 0.
Suppose b ≥ 2. Then 3^{b−1} ∣ φ(n) = 8, a contradiction. Thus b = 0 or b = 1. Similarly, if c ≥ 2 we
get 5^{c−1} ∣ φ(n) = 8, a contradiction, hence c = 0 or c = 1.
We now have the following cases, according to the possible values of b and c:
(1) b = c = 0: then n = 2^a.
(a) if a ≥ 1, we have
φ(n) = 2^{a−1} = 8 ⟹ a = 4 ⟹ n = 16;
(b) if a = 0 then n = 1, hence φ(n) = 1 ≠ 8 and n is not a solution.
(2) b = 0, c = 1: then n = 2^a ⋅ 5.
(a) if a ≥ 1,
φ(n) = 2^{a−1} ⋅ 4 = 8 ⟹ a = 2 ⟹ n = 20;
(b) if a = 0 then n = 5 ⟹ φ(n) = 4 ≠ 8.
(3) b = 1, c = 0: then n = 2^a ⋅ 3.
(a) if a ≥ 1,
φ(n) = 2^{a−1} ⋅ 2 = 8 ⟹ a = 3 ⟹ n = 24;
(b) if a = 0 then n = 3 ⟹ φ(n) = 2 ≠ 8.
(4) b = c = 1: then n = 2^a ⋅ 3 ⋅ 5.
(a) if a ≥ 1,
φ(n) = 2^{a−1} ⋅ 2 ⋅ 4 = 8 ⟹ a = 1 ⟹ n = 30;
(b) if a = 0 then n = 15 ⟹ φ(15) = (3 − 1)(5 − 1) = 8.
We conclude that φ(n) = 8 has solutions n = 15, 16, 20, 24, 30.
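The case analysis of Example 22.10 can be cross-checked by brute force. The argument in the example shows that any solution has the form 2^a ⋅ 3^b ⋅ 5^c with a ≤ 4 and b, c ≤ 1, so n ≤ 16 ⋅ 15 = 240; the sketch below searches up to that bound. The names phi and solve_phi are hypothetical, and the bound 240 is justified only for the value 8.

```python
from math import gcd


def phi(n):
    """Euler's φ-function from the definition."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def solve_phi(value, bound):
    """All n with 1 <= n <= bound and φ(n) = value (bound must be justified separately)."""
    return [n for n in range(1, bound + 1) if phi(n) == value]


print(solve_phi(8, 240))
```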

Exercises.
Exercise 22.11. Find a reduced residue system modulo each integer below
(i) 15
(ii) 18
(iii) p, where p is a prime number
(iv) 2^n, where n is a positive integer
(v) For each of (i) and (ii), give another solution sharing exactly one element with your
previous solution
Exercise 22.12. Prove that 9^8 ≡ 1 (mod 16) by following the steps in the proof of Euler’s
Theorem.

23. Arithmetic Functions

We have already encountered in Section 22.1 a very important function, the Euler φ-function.
In this section, we will study other relevant functions in number theory; in particular, we’ll
focus on those which are ‘multiplicative’, a property that sometimes allows us to derive formulas
for the functions we consider.
Definition 23.1. A function whose domain is Z>0 is called an arithmetic function.
Examples 23.2.
(1) f (n) = 1 for all n ∈ Z>0 ;
(2) f (n) = n for all n ∈ Z>0 ;
(3) φ(n), the Euler φ-function;
(4) τ (n) = the number of positive divisors of n;
(5) σ(n) = the sum of the positive divisors of n.
Example 23.3. The positive divisors of 6 are {1, 2, 3, 6}. Therefore,
τ (6) = 4 and σ(6) = 1 + 2 + 3 + 6 = 12.
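For small n, the functions τ and σ can be computed straight from their definitions by listing divisors. A minimal sketch, with hypothetical helper names divisors, tau and sigma:

```python
def divisors(n):
    """All positive divisors of n >= 1, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def tau(n):
    """Number of positive divisors of n."""
    return len(divisors(n))


def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(divisors(n))


print(divisors(6), tau(6), sigma(6))  # Example 23.3
```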
Definition 23.4. Let f be an arithmetic function. We say that f is multiplicative if, for all
n1 , n2 ∈ Z>0 satisfying (n1 , n2 ) = 1, we have
f (n1 ⋅ n2 ) = f (n1 ) ⋅ f (n2 )
and we say f is completely multiplicative if
f (n1 ⋅ n2 ) = f (n1 ) ⋅ f (n2 ) for all n1 , n2 ∈ Z>0 .

Clearly, both the constant function f(n) = 1 and the identity function f(n) = n are
completely multiplicative. We shall prove that the three functions φ, τ, and σ are multiplicative.
We begin by showing that φ is multiplicative, which is a key ingredient to later establish the
formula (22.6).
Theorem 29. The Euler φ-function is multiplicative.

Proof. Let n1 , n2 be positive and coprime integers. By definition, we have


φ(n1 n2 ) = # {x ∈ Z ∶ 1 ≤ x ≤ n1 n2 and (x, n1 n2 ) = 1} .
We want to show
φ(n1 ⋅ n2 ) = φ(n1 ) ⋅ φ(n2 ).
To prove this equality, we will count the elements in the set above in such a way that the
desired result becomes clear. The integers we need to count are between 1 and n1 n2 . Begin
by writing the positive integers up to n1 n2 in the form

1       n1 + 1    2n1 + 1    ⋯    (n2 − 1)n1 + 1
2       n1 + 2    2n1 + 2    ⋯    (n2 − 1)n1 + 2
⋮       ⋮         ⋮          ⋯    ⋮
r       n1 + r    2n1 + r    ⋯    (n2 − 1)n1 + r
⋮       ⋮         ⋮          ⋯    ⋮
n1      2n1       3n1        ⋯    n1 n2.
We now identify the integers in the above list which are coprime to n1 n2 .
Suppose 1 ≤ r ≤ n1 and (r, n1 ) = d > 1. Then all the elements in the r-th row are divisible
by d, hence are not coprime to n1 n2 . We conclude that all the integers coprime to n1 n2
belong to the rows whose first number is coprime to n1 . Since there are n1 rows, there are
precisely φ(n1 ) rows containing integers coprime to n1 n2 . To finish the proof, it remains to
show that each of these φ(n1 ) rows contains exactly φ(n2 ) integers coprime to n1 n2 .
Suppose (r, n1 ) = 1. It follows that the numbers in the r-th row are coprime to n1 . Indeed, if
d > 1 divides both n1 and a number of the form kn1 +r, then it also divides r, a contradiction.
Therefore, an integer in the r-th row is coprime to n1 n2 if and only if it is coprime to n2 .
We claim that the n2 elements in the r-th row are all distinct mod n2 . Thus, exactly φ(n2 )
of them are coprime to n2 by Corollary 9. Since these integers are also coprime to n1 , they
are coprime to n1 n2 . There are φ(n1 ) rows, each containing φ(n2 ) integers coprime to n1 n2 ,
hence φ(n1 n2 ) = φ(n1 ) ⋅ φ(n2 ), as desired.
We now prove the claim. The elements of the r-th row are the integers kn1 + r with 0 ≤ k ≤ n2 − 1. Suppose that
kn1 + r ≡ k′n1 + r (mod n2) where 0 ≤ k, k′ ≤ n2 − 1.
Since (n1, n2) = 1 there exists an inverse of n1 mod n2 and we have
kn1 + r ≡ k′n1 + r (mod n2) ⇐⇒ k ≡ k′ (mod n2) ⟹ k = k′,
where the last implication follows from Corollary 8. ∎

The following theorem will play a central role in proving that τ and σ are multiplicative
functions. It gives a method to construct a new multiplicative function from a given
multiplicative function.
Theorem 30. Let f be an arithmetic function and define the arithmetic function F by
F(n) = ∑_{d∣n, d>0} f(d), for all n ∈ Z_{>0}.

If f is multiplicative, then F is multiplicative.

Proof. Let n_1, n_2 > 0 be coprime integers. We want to show that
F(n_1 n_2) = F(n_1) ⋅ F(n_2).
Since (n_1, n_2) = 1, from Proposition 14 we know that the divisors d of n_1 n_2 are exactly the
integers of the form d = d_1 d_2, where (d_1, d_2) = 1, d_1 ∣ n_1, d_2 ∣ n_2. Then,
F(n_1 n_2) = ∑_{d∣n_1 n_2, d>0} f(d) = ∑_{d_1∣n_1, d_2∣n_2, d_1>0, d_2>0} f(d_1 d_2) = ∑_{d_1∣n_1, d_2∣n_2, d_1>0, d_2>0} f(d_1) f(d_2)
= (∑_{d_1∣n_1, d_1>0} f(d_1)) (∑_{d_2∣n_2, d_2>0} f(d_2)) = F(n_1) F(n_2),
where we used that f(d_1 d_2) = f(d_1) f(d_2) as f is multiplicative. ∎


Theorem 31. The functions σ(n) and τ(n) are multiplicative.
Proof. Note that we can write τ and σ as
τ(n) = ∑_{d∣n, d>0} 1 and σ(n) = ∑_{d∣n, d>0} d.
In other words, they are of the form F as in Theorem 30 where we choose f(n) = 1 and
f(n) = n, respectively. Since these two functions f are multiplicative, the result now follows
from Theorem 30. ∎
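Multiplicativity of φ, τ and σ can be spot-checked numerically on coprime pairs. This is only evidence on a finite range, not a proof; all names in the sketch below are hypothetical.

```python
from math import gcd


def phi(n):
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def tau(n):
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def sigma(n):
    return sum(d for d in range(1, n + 1) if n % d == 0)


def multiplicative_up_to(f, bound):
    """Check f(a*b) == f(a)*f(b) for all coprime pairs 1 <= a, b <= bound."""
    return all(
        f(a * b) == f(a) * f(b)
        for a in range(1, bound + 1)
        for b in range(1, bound + 1)
        if gcd(a, b) == 1
    )


print([multiplicative_up_to(f, 30) for f in (phi, tau, sigma)])
# The coprimality hypothesis matters: compare tau(4) with tau(2) * tau(2).
print(tau(4), tau(2) * tau(2))
```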

Exercises.
Exercise 23.5. Prove that a completely multiplicative arithmetic function is completely
determined by its values at prime numbers.
Exercise 23.6. Let n ∈ Z with n > 0. Define an arithmetic function ρ by ρ(1) = 1 and
ρ(n) = 2^m where m is the number of distinct prime factors dividing n. Prove that ρ is
multiplicative but not completely multiplicative.

24. Formulas for the Functions φ, τ and σ

Let f be a multiplicative arithmetic function and let n > 1 be an integer with prime
factorization n = p_1^{a_1} p_2^{a_2} ⋯ p_k^{a_k}. Then
f(n) = f(p_1^{a_1}) f(p_2^{a_2}) ⋯ f(p_k^{a_k}).
Thus, to determine the formula for f, it suffices to determine a formula for f(p_i^{a_i}) and take
the product.
Lemma 9. Let p be a prime and a ≥ 1. Then,
φ(p^a) = p^a − p^{a−1} = p^a (1 − 1/p).
In particular, φ(p) = p − 1.

Proof. Since p is prime,
(n, p^a) = 1 ⇐⇒ (n, p) = 1 ⇐⇒ p ∤ n,
hence
φ(p^a) = #{x ∈ Z ∶ 1 ≤ x ≤ p^a and (x, p^a) = 1} = #{x ∈ Z ∶ 1 ≤ x ≤ p^a and p ∤ x}.
The positive multiples of p which are ≤ p^a are the numbers of the form kp for 1 ≤ k ≤ p^{a−1}.
In particular, there are p^{a−1} of them and we conclude φ(p^a) = p^a − p^{a−1}, as desired. ∎

We are now in position to prove the formula for φ(n).


Theorem 32. Let n ∈ Z_{>1} have factorization n = p_1^{a_1} ⋯ p_k^{a_k}, a_j ≥ 1 and p_j distinct primes.
Then, φ(n) is given by the formula
φ(n) = n ∏_{j=1}^{k} (1 − 1/p_j) = ∏_{j=1}^{k} p_j^{a_j −1} (p_j − 1).

Proof. From Lemma 9 and the fact that φ is multiplicative, we have
φ(n) = φ(p_1^{a_1}) φ(p_2^{a_2}) ⋯ φ(p_k^{a_k})
= p_1^{a_1} (1 − 1/p_1) ⋯ p_k^{a_k} (1 − 1/p_k)
= p_1^{a_1} ⋯ p_k^{a_k} (1 − 1/p_1) ⋯ (1 − 1/p_k)
= n ∏_{j=1}^{k} (1 − 1/p_j).
The second expression in the statement follows since p_j^{a_j} (1 − 1/p_j) = p_j^{a_j −1} (p_j − 1). ∎
Lemma 10. Let p be a prime and a ≥ 1. Then,
τ(p^a) = a + 1 and σ(p^a) = (1 − p^{a+1}) / (1 − p).
Proof. The positive divisors of p^a are {1, p, . . . , p^a}. Clearly τ(p^a) = a + 1 as there are
a + 1 such divisors. Moreover, the formula for the sum of terms in a geometric progression
(Proposition 2) gives
σ(p^a) = 1 + p + ⋯ + p^a = (1 − p^{a+1}) / (1 − p). ∎
Theorem 33. Let n ∈ Z_{>1} have prime factorization n = p_1^{a_1} ⋯ p_k^{a_k}. Then,
τ(n) = ∏_{i=1}^{k} (a_i + 1) and σ(n) = ∏_{i=1}^{k} (1 − p_i^{a_i +1}) / (1 − p_i).

Proof. The result follows from Lemma 10 and the fact that τ and σ are multiplicative
functions. ∎
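Proposition 26. An integer n > 0 is prime if and only if σ(n) = 1 + n.

Proof. For n = 1 we have σ(1) = 1 ≠ 2, and 1 is not prime. For n > 1, the divisors 1 and n are
distinct, so σ(n) ≥ 1 + n, with equality if and only if n has no other positive divisors.
Furthermore, n > 1 is not a prime if and only if the set of
its positive divisors contains at least one element c such that 1 < c < n, in which case
σ(n) ≥ 1 + n + c > n + 1. ∎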
Example 24.1. Let n = 100 = 2^2 ⋅ 5^2. Then τ(n) = (2 + 1)(2 + 1) = 9 and
σ(n) = (2^3 − 1)/(2 − 1) ⋅ (5^3 − 1)/(5 − 1) = 7 ⋅ 31 = 217.
Theorem 34. Let n ∈ Z_{>0}. Then
∑_{d∣n, d>0} φ(d) = n.

Proof. Since φ is multiplicative, Theorem 30 shows that
F(n) = ∑_{d∣n, d>0} φ(d)
is a multiplicative function. In other words, F(n) = F(p_1^{a_1}) ⋯ F(p_k^{a_k}), where n = p_1^{a_1} ⋯ p_k^{a_k} is the
prime factorization of n. Lastly, we observe that
F(p^a) = ∑_{0≤i≤a} φ(p^i) = 1 + (p − 1) + (p^2 − p) + ⋯ + (p^a − p^{a−1}) = p^a,
and therefore F(n) = p_1^{a_1} ⋯ p_k^{a_k} = n, as desired. ∎
Example 24.2. The positive divisors of 12 are {1, 2, 3, 4, 6, 12} and we have
φ(1) = φ(2) = 1, φ(3) = φ(4) = φ(6) = 2, φ(12) = 4.
Finally, we check 1 + 1 + 2 + 2 + 2 + 4 = 12, as expected.
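The identity of Theorem 34 is easy to confirm on a range of values. A short sketch, assuming a definition-based phi and a hypothetical helper divisor_phi_sum:

```python
from math import gcd


def phi(n):
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def divisor_phi_sum(n):
    """Sum of φ(d) over the positive divisors d of n."""
    return sum(phi(d) for d in range(1, n + 1) if n % d == 0)


print([phi(d) for d in (1, 2, 3, 4, 6, 12)], divisor_phi_sum(12))  # Example 24.2
print(all(divisor_phi_sum(n) == n for n in range(1, 200)))
```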

Exercises.
Exercise 24.3. Show that φ, τ and σ are not completely multiplicative by providing a
counterexample in each case.
Exercise 24.4. Characterize the positive integers for which τ (n) is odd.
Exercise 24.5. Characterize the positive integers n such that
(i) φ(n) is odd
(ii) 4 ∣ φ(n)
Exercise 24.6. Let p be a prime such that p+2 is also a prime (these are called twin primes).
Prove that σ(p + 2) = σ(p) + 2.

25. Perfect Numbers and Mersenne Primes

In this section, we study the relationship between so-called perfect numbers and Mersenne
primes. We begin with a few definitions.
Definition 25.1. An integer n > 0 is called perfect if σ(n) = 2n.
Definition 25.2. Let n > 1 be an integer. We call the integer M_n = 2^n − 1 the n-th Mersenne
number. If M_n is prime, we call it a Mersenne prime.
Examples 25.3.
(1) The positive divisors of n = 6 are {1, 2, 3, 6} and we have
σ(6) = 1 + 2 + 3 + 6 = 12 = 2 ⋅ 6,
so 6 is a perfect number;
(2) The positive divisors of n = 28 are {1, 2, 4, 7, 14, 28} and we have
σ(28) = 1 + 2 + 4 + 7 + 14 + 28 = 56 = 2 ⋅ 28,
so 28 is a perfect number.
(3) M_5 = 2^5 − 1 = 31 is a Mersenne prime.
(4) M_7 = 2^7 − 1 = 127 is a Mersenne prime.
(5) M_11 = 2^{11} − 1 = 2047 = 23 ⋅ 89 is not prime.

There are no known odd perfect numbers (as of 2012, none had been found up to 10^{1500}). It is
also unknown if there are infinitely many even
ones, but the next theorem shows there is a one-to-one correspondence between Mersenne
primes and even perfect numbers.
Theorem 35. Let n ∈ Z_{>0}. Then n is an even perfect number if and only if
n = 2^{p−1} (2^p − 1) with 2^p − 1 a Mersenne prime.

Proof. We first suppose n = 2^{p−1} (2^p − 1), where 2^p − 1 is a Mersenne prime, and show that
n must be an even perfect number. Since 2^p − 1 is a prime number, we have
σ(2^p − 1) = (2^p − 1) + 1 = 2^p by Proposition 26. We now compute
σ(n) = σ(2^{p−1} (2^p − 1)) = σ(2^{p−1}) σ(2^p − 1) = σ(2^{p−1}) 2^p,
where we used the fact that σ is multiplicative and (2^{p−1}, 2^p − 1) = 1. Now, from Lemma 10,
it follows that
σ(n) = ((2^p − 1)/(2 − 1)) ⋅ 2^p = (2^p − 1) ⋅ 2^p = 2 (2^{p−1} (2^p − 1)) = 2n.
Conversely, suppose now n is an even perfect number. Write n = 2^a ⋅ b, where a, b ∈ Z_{>0}, b is
odd, and a ≥ 1. Since σ is multiplicative, by Lemma 10,
σ(n) = σ(2^a) σ(b) = ((2^{a+1} − 1)/(2 − 1)) σ(b) = (2^{a+1} − 1) σ(b).
Since n is perfect,
σ(n) = 2n = 2 (2^a ⋅ b) = 2^{a+1} b.
Hence, since 2^{a+1} and 2^{a+1} − 1 are coprime, we have
(2^{a+1} − 1) σ(b) = 2^{a+1} b ⟹ 2^{a+1} ∣ σ(b) ⇐⇒ σ(b) = 2^{a+1} c with c > 0
and it follows that
(2^{a+1} − 1) σ(b) = (2^{a+1} − 1) 2^{a+1} c = 2^{a+1} b ⟹ (2^{a+1} − 1) c = b.
We claim that c = 1. Then b = 2^{a+1} − 1 and σ(b) = 2^{a+1} = b + 1, hence b is prime by
Proposition 26. Thus n = 2^a b = 2^a (2^{a+1} − 1) with 2^{a+1} − 1 a prime, as desired.
We will now prove the claim. Suppose c > 1. Since (2^{a+1} − 1) c = b, we see that b has at least
the three distinct positive divisors 1, c, and b. Thus σ(b) ≥ 1 + b + c, but
σ(b) = 2^{a+1} c = 2^{a+1} c − c + c = (2^{a+1} − 1) c + c = b + c,
a contradiction. ∎
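Theorem 35 can be illustrated by generating even perfect numbers from small Mersenne primes and comparing with a direct search. This sketch uses trial division for primality and a divisor sum up to √n, both adequate only for small inputs; the function names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality check for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def sigma(n):
    """Sum of positive divisors, pairing each divisor d <= sqrt(n) with n // d."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def even_perfect_numbers(max_p):
    """Numbers 2^(p-1) * (2^p - 1) for 2 <= p <= max_p with 2^p - 1 prime."""
    return [2 ** (p - 1) * (2**p - 1) for p in range(2, max_p + 1) if is_prime(2**p - 1)]


print(even_perfect_numbers(7))
print([n for n in range(1, 10000) if sigma(n) == 2 * n])  # direct search for perfect numbers
```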

To conclude this section, we establish the following two properties of Mersenne numbers.
Theorem 36. Let n ∈ Z_{>1}. If M_n is prime then n is prime.

Proof. We prove the contrapositive. That is, suppose n is composite, so n = a ⋅ b with
1 < a, b < n. We have
2^n − 1 = 2^{ab} − 1 = (2^a − 1) (2^{a(b−1)} + 2^{a(b−2)} + 2^{a(b−3)} + ⋯ + 2^a + 1),
with both factors > 1. Thus M_n is not prime. ∎

From Example 25.3 (5), we see that M_11 is not a prime despite the fact that 11 is a prime.
The next theorem shows that, in this kind of situation, the divisors of M_p cannot be arbitrary.
For this result, we require the following lemma.
Lemma 11. Let a and b be positive integers. Then (2^a − 1, 2^b − 1) = 2^{(a,b)} − 1.

Proof. Write D = (2^a − 1, 2^b − 1) and d = (a, b). We want to show that D = 2^d − 1.
By Theorem 6, there exist x, y ∈ Z such that
d = ax + by.
Since D = (2^a − 1, 2^b − 1), we have
2^a ≡ 1 (mod D), and 2^b ≡ 1 (mod D),
so
2^d = 2^{ax+by} = (2^a)^x (2^b)^y ≡ (1^x)(1^y) ≡ 1 (mod D).
That is, D ∣ (2^d − 1).
Conversely, since d ∣ a, we have that 2^d − 1 ∣ 2^a − 1 (as in the proof of Theorem 36). Similarly,
as d ∣ b, we have 2^d − 1 ∣ 2^b − 1, and so 2^d − 1 ∣ (2^a − 1, 2^b − 1). That is, 2^d − 1 ∣ D.
Since D and 2^d − 1 are positive and satisfy D ∣ 2^d − 1 and 2^d − 1 ∣ D, Proposition 6 implies
(2^a − 1, 2^b − 1) = D = 2^d − 1 = 2^{(a,b)} − 1,
as desired. ∎
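Lemma 11 is simple to test on small exponents with the standard library gcd. A minimal sketch (the name lemma11_holds is hypothetical):

```python
from math import gcd


def lemma11_holds(a, b):
    """Check (2^a - 1, 2^b - 1) == 2^(a, b) - 1 for positive integers a, b."""
    return gcd(2**a - 1, 2**b - 1) == 2 ** gcd(a, b) - 1


print(gcd(2**12 - 1, 2**18 - 1), 2**6 - 1)
print(all(lemma11_holds(a, b) for a in range(1, 41) for b in range(1, 41)))
```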
Theorem 37. Let p be an odd prime and d a positive divisor of M_p = 2^p − 1. Then d ≡ 1 (mod 2p).
Proof. Since the product of two numbers q_1, q_2 ≡ 1 (mod 2p) satisfies q_1 q_2 ≡ 1 (mod 2p), it is
enough to prove the theorem for the prime factors of M_p (any other divisor d > 1 will be a product
of prime factors, and d = 1 satisfies the congruence trivially).
Let q ∣ M_p be prime. Then q is odd, since M_p is odd. By FLT, we have
2^{q−1} ≡ 1 (mod q) ⇐⇒ q ∣ 2^{q−1} − 1,
and, by Corollary 4, we also have q ∣ (2^p − 1, 2^{q−1} − 1). Now, Lemma 11 gives
(2^p − 1, 2^{q−1} − 1) = 2^{(p,q−1)} − 1 ⟹ q ∣ 2^{(p,q−1)} − 1,
so 2^{(p,q−1)} − 1 ≠ 1. We conclude (p, q − 1) ≠ 1 and, since p is prime, we have p ∣ q − 1. Thus
q − 1 = pk′, and k′ = 2k is even because q − 1 is even and p is odd. That is, q = 1 + 2pk, as desired. ∎
Example 25.4. Is M_23 = 2^{23} − 1 = 8388607 a prime? By the previous theorem, we need
only test divisibility by primes of the form q = 46k + 1. The smallest such prime is 47, and
dividing M_23 by this number shows M_23 = 47 ⋅ 178481. So, M_23 is not a Mersenne prime.
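Theorem 37 narrows the search for factors of M_p to candidates of the form 2pk + 1. The sketch below tries every such candidate up to √M_p (not only the prime ones, which is slightly wasteful but still correct, because the smallest divisor greater than 1 is prime and has this form). The function name is hypothetical and p is assumed to be an odd prime.

```python
def smallest_factor_of_mersenne(p):
    """For an odd prime p, return the smallest prime factor q <= sqrt(M_p) of M_p = 2^p - 1,
    or None if there is none (in which case M_p is prime)."""
    m = 2**p - 1
    q = 2 * p + 1                 # candidates are q = 2pk + 1, k = 1, 2, ...
    while q * q <= m:
        if m % q == 0:
            return q
        q += 2 * p
    return None


print(smallest_factor_of_mersenne(23))  # Example 25.4
print(smallest_factor_of_mersenne(11))  # Example 25.3 (5)
print(smallest_factor_of_mersenne(7))   # M_7 = 127 is prime
```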

Exercises.
Exercise 25.5. Let n ∈ Z with n > 1. Then n is said to be almost perfect if σ(n) = 2n − 1.
Show that, for k ∈ Z_{>0}, the number 2^k is almost perfect.

26. Primitive Roots

We know from Euler’s theorem that a^{φ(m)} ≡ 1 (mod m) for any integer a coprime to m > 0.
Therefore, it is natural to ask if, for fixed m > 0, there exists a positive integer x < φ(m) such that for
all integers a coprime to m we have a^x ≡ 1 (mod m). For example, for m = 8 we can easily
compute that
1^2 ≡ 1, 3^2 = 9 ≡ 1, 5^2 = 25 ≡ 1, 7^2 = 49 ≡ 1 (mod 8),
showing that x = 2 < φ(8) = 4 has the desired property. Inverting the question, we are
interested in understanding when the smallest value of x with the above property is x = φ(m).
The complete answer to this question is provided by the Primitive Root Theorem (see
Theorem 40).
We begin with the following definition.
Definition 26.1. Let a, m ∈ Z with m > 0 and (a, m) = 1. The order of a modulo m, denoted
ord_m(a), is the least positive integer n such that a^n ≡ 1 (mod m).
Example 26.2. Let m = 7, a = 3. We have
3^1 ≡ 3, 3^2 ≡ 2, 3^3 ≡ 6, 3^4 ≡ 4, 3^5 ≡ 5, 3^6 ≡ 1 (mod 7),
so ord_7(3) = 6 = φ(7). Similarly, we can compute the order of every integer a coprime to 7:
a (mod 7)    1  2  3  4  5  6
ord_7(a)     1  3  6  3  6  2
Example 26.3. Let m = 8. Here, φ(8) = 4 and the order of an integer a coprime to 8 is
given in the table
a (mod 8)    1  3  5  7
ord_8(a)     1  2  2  2
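The tables in Examples 26.2 and 26.3 can be reproduced by computing successive powers until 1 appears. The following sketch assumes m > 1 and (a, m) = 1; the function name order is hypothetical.

```python
from math import gcd


def order(a, m):
    """Least n > 0 with a^n ≡ 1 (mod m), for m > 1 and (a, m) = 1."""
    if m <= 1 or gcd(a, m) != 1:
        raise ValueError("order is defined here only for m > 1 and (a, m) = 1")
    n, power = 1, a % m
    while power != 1:
        power = (power * a) % m
        n += 1
    return n


print([order(a, 7) for a in range(1, 7)])     # second row of the table in Example 26.2
print([order(a, 8) for a in (1, 3, 5, 7)])    # second row of the table in Example 26.3
```

The loop terminates because Euler’s theorem guarantees that some power of a is congruent to 1.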

It is clear from Euler’s theorem that, for any a coprime to m, we have ord_m(a) ≤ φ(m). In
the examples above, we see that ord_7(3) = ord_7(5) = φ(7), while for m = 8 there is no a with
the maximal order φ(8) = 4. However, in both cases, all the orders occurring are divisors of
φ(m). This is a general property.
Proposition 27. Let a, m ∈ Z such that m > 0 and (a, m) = 1, and let n ∈ Z_{>0}. Then
a^n ≡ 1 (mod m) if and only if ord_m(a) ∣ n. In particular, ord_m(a) ∣ φ(m).

Proof. Suppose first that a^n ≡ 1 (mod m) for some n > 0. Dividing n by ord_m(a) via the
division algorithm yields
n = ord_m(a) q + r, 0 ≤ r < ord_m(a).
Then
a^n = (a^{ord_m(a)})^q ⋅ a^r ≡ 1^q ⋅ a^r ≡ a^r ≡ 1 (mod m),
where we used the definition of order and our assumption. Suppose r ≠ 0. Since ord_m(a) is the
smallest positive integer for which a^{ord_m(a)} ≡ 1 (mod m) and 0 < r < ord_m(a), we obtain a contradiction.
So r = 0 and hence ord_m(a) ∣ n.
Suppose now ord_m(a) ∣ n. That is, n = ord_m(a) ⋅ k for some k ∈ Z. Thus
a^n = (a^{ord_m(a)})^k ≡ 1^k ≡ 1 (mod m). ∎
Example 26.4. Let m = 11 and a = 2. We have φ(11) = 10, so ord_11(2) ∈ {1, 2, 5, 10} by
Proposition 27. We compute
2^1 ≡ 2 (mod 11), 2^2 ≡ 4 (mod 11), 2^5 ≡ 32 ≡ 10 (mod 11)
and since none of these are congruent to 1 mod 11, it follows that ord_11(2) = 10. Note that,
by using Proposition 27, we avoided computing 2^3, 2^4, 2^6, 2^7, 2^8, 2^9 (mod 11).
Definition 26.5. Let a, m ∈ Z with m > 0 and (a, m) = 1. We say that a is a primitive root
modulo m if ord_m(a) is maximal, that is ord_m(a) = φ(m).
Examples 26.6. From the examples above, we already know the following:
(1) 3 and 5 are primitive roots modulo 7;
(2) 2 is a primitive root modulo 11;
(3) There are no primitive roots modulo 8.
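Definition 26.5 suggests a direct search for primitive roots: compute the order of every unit and keep those of order φ(m). The sketch below does this for small m > 1; the names order and primitive_roots are hypothetical.

```python
from math import gcd


def order(a, m):
    """Least n > 0 with a^n ≡ 1 (mod m), assuming m > 1 and (a, m) = 1."""
    n, power = 1, a % m
    while power != 1:
        power = (power * a) % m
        n += 1
    return n


def primitive_roots(m):
    """All primitive roots a modulo m with 1 <= a < m, for m > 1."""
    units = [x for x in range(1, m) if gcd(x, m) == 1]  # there are φ(m) of these
    return [a for a in units if order(a, m) == len(units)]


print(primitive_roots(7))
print(primitive_roots(8))
print(2 in primitive_roots(11))
```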

We see now that the discussion in the first paragraph of this section can be summarized into
the question: which integers admit primitive roots? The answer is given by Theorem 40, for
which we will not give a complete proof. In the remainder of this section, we will need the
following result.
Proposition 28. Let a, m ∈ Z, m > 0, (a, m) = 1.
(i) For i, j ∈ Z, we have a^i ≡ a^j (mod m) ⇐⇒ i ≡ j (mod ord_m(a)).
(ii) For i > 0, we have
ord_m(a^i) = ord_m(a) / (ord_m(a), i).

Proof. (i) Since (a, m) = 1, we know a^{−1} (mod m) exists and
a^i ≡ a^j (mod m) ⇐⇒ a^i ⋅ (a^{−1})^j ≡ a^j ⋅ a^{−j} ≡ 1 (mod m) ⇐⇒ a^{i−j} ≡ 1 (mod m).
By Proposition 27, this holds if and only if ord_m(a) ∣ i − j, that is, i ≡ j (mod ord_m(a)).


(ii) Note that a^{i⋅ord_m(a^i)} = (a^i)^{ord_m(a^i)} ≡ 1 (mod m), so that ord_m(a) ∣ i ⋅ ord_m(a^i) by
Proposition 27. We claim that i ⋅ ord_m(a^i) = lcm(ord_m(a), i). In this case,
i ⋅ ord_m(a^i) = lcm(ord_m(a), i) = i ⋅ ord_m(a) / (ord_m(a), i)
by Proposition 10 (iii), therefore
ord_m(a^i) = ord_m(a) / (ord_m(a), i)
as required.
We now prove the claim. Suppose that ord_m(a) ∣ ik for some k > 0. By Proposition 27,
(a^i)^k = a^{ik} ≡ 1 (mod m), hence ord_m(a^i) ∣ k, again by Proposition 27. It follows that
k = ord_m(a^i) is the smallest k > 0 such that ik is a multiple of both i and ord_m(a). That
is, i ⋅ ord_m(a^i) = lcm(ord_m(a), i), as claimed. □
Corollary 19. Let a be a primitive root mod m and S = {1, a, a^2, . . . , a^{φ(m)−1}}. Then the
set S is a reduced residue system mod m.

Proof. The set S has φ(m) elements. Additionally, because a is a primitive root, (a, m) = 1
so that all of these elements are coprime to m. It remains to show that no two of them are
congruent mod m.
Suppose that a^i ≡ a^j (mod m) for some a^i, a^j ∈ S. Then i ≡ j (mod ord_m(a)) by
Proposition 28 (i). Then i = j by Corollary 8 since ord_m(a) = φ(m) and 0 ≤ i, j ≤ φ(m) − 1. □
Example 26.7. Let m = 7 and a = 3, which is a primitive root mod 7. For 1 ≤ i ≤ 6, in
Example 26.2, we computed 3^i (mod 7) and obtained the second row of the table
i             1  2  3  4  5  6
3^i (mod 7)   3  2  6  4  5  1
ord_7(3^i)    6  3  2  3  6  1
The second row is a reduced residue system mod 7, as predicted by the previous corollary. To obtain
the third row we can, for example, apply the formula in Proposition 28 (ii). For instance,
to determine ord_7(2), we compute
ord_7(2) = ord_7(3^2) = ord_7(3) / (ord_7(3), 2) = 6 / (6, 2) = 6/2 = 3.
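The table can be reproduced by comparing, for each exponent i, the order computed directly with the value given by the formula of Proposition 28 (ii). The following Python sketch does this; the function names order and order_rows are hypothetical, and the order is found by naive repeated multiplication.

```python
from math import gcd

def order(a, m):
    """Smallest n >= 1 with a^n ≡ 1 (mod m); assumes gcd(a, m) == 1."""
    n, power = 1, a % m
    while power != 1 % m:
        power = power * a % m
        n += 1
    return n

def order_rows(a, m):
    """Rows (i, a^i mod m, order computed directly, order from Proposition 28 (ii))."""
    d = order(a, m)
    rows = []
    for i in range(1, d + 1):
        value = pow(a, i, m)
        rows.append((i, value, order(value, m), d // gcd(d, i)))
    return rows

for row in order_rows(3, 7):
    print(row)  # the last two entries of every row agree
```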
Corollary 20. Let m be an integer admitting a primitive root. Then there are φ(φ(m))
non-congruent primitive roots mod m.

Proof. Let r be a primitive root mod m, so ord_m(r) = φ(m). By Corollary 19, any other
primitive root must be congruent to r^i for some i such that 1 ≤ i ≤ φ(m). By Proposition 28 (ii),
r^i is a primitive root if and only if
φ(m) = ord_m(r^i) = ord_m(r) / (ord_m(r), i) ⇐⇒ (ord_m(r), i) = 1.
By the definition of φ, there are φ(ord_m(r)) = φ(φ(m)) such i, giving the desired result. □
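For small moduli the count in Corollary 20 can be checked by listing all primitive roots by brute force. The sketch below is an illustration under that assumption; the helper names are ours and nothing here is optimized.

```python
from math import gcd

def euler_phi(m):
    """Number of integers in 1..m coprime to m."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

def order(a, m):
    """Smallest n >= 1 with a^n ≡ 1 (mod m); assumes gcd(a, m) == 1."""
    n, power = 1, a % m
    while power != 1 % m:
        power = power * a % m
        n += 1
    return n

def primitive_roots(m):
    """All primitive roots mod m among 1..m (empty list if there are none)."""
    phi = euler_phi(m)
    return [a for a in range(1, m + 1) if gcd(a, m) == 1 and order(a, m) == phi]

print(primitive_roots(7))                                  # [3, 5]
print(len(primitive_roots(11)), euler_phi(euler_phi(11)))  # 4 4
print(primitive_roots(8))                                  # []
```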

To understand which integers admit a primitive root it is convenient to first understand why
certain integers cannot have a primitive root.
Examples 26.8.
(1) For m = 15, we have φ(m) = φ(3)φ(5) = 2 ⋅ 4 = 8 and
a such that (a, 15) = 1    1   2   4   7   8  11  13  14
ord_15(a)                  1   4   2   4   4   2   4   2
(2) For m = 16, we have φ(16) = 8 and
a such that (a, 16) = 1    1   3   5   7   9  11  13  15
ord_16(a)                  1   4   4   2   2   4   4   2
(3) Recall there are no primitive roots mod 8.
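Order tables such as the ones in these examples are easy to recompute. The following Python sketch (with the hypothetical helper names order and orders) lists the order of every unit mod 15 and mod 16 and compares the largest order with φ(m).

```python
from math import gcd

def order(a, m):
    """Smallest n >= 1 with a^n ≡ 1 (mod m); assumes gcd(a, m) == 1."""
    n, power = 1, a % m
    while power != 1 % m:
        power = power * a % m
        n += 1
    return n

def orders(m):
    """Dictionary a -> ord_m(a) over all 1 <= a < m coprime to m."""
    return {a: order(a, m) for a in range(1, m) if gcd(a, m) == 1}

for m in (15, 16):
    table = orders(m)
    # len(table) equals phi(m); a primitive root would need this order.
    print(m, table, "largest order:", max(table.values()), "phi(m):", len(table))
```

In both cases the largest order is 4 while φ(m) = 8, so no primitive root exists.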

We shall shortly prove results explaining what is behind these examples, but first let us have
a closer look at the case m = 15. From Euler’s theorem we know that a^8 ≡ 1 (mod 15) when
(a, 15) = 1, hence, by Proposition 19, we have a^8 ≡ 1 (mod 3) and a^8 ≡ 1 (mod 5). Clearly,
from these two congruences we recover that a^8 ≡ 1 (mod 15) by CRT. However, Fermat’s little
theorem gives us the sharper congruences a^2 ≡ 1 (mod 3) and a^4 ≡ 1 (mod 5) which, after squaring the
first, lead to a^4 ≡ 1 (mod 15). Note that this is consistent with the orders in the table for
m = 15 in Examples 26.8. The following theorem generalizes this idea.
Theorem 38. Let m ∈ Z>0. Suppose m = kn where (k, n) = 1 and φ(k), φ(n) are even.
Then, for all a ∈ Z coprime to m, we have
a^{φ(m)/2} ≡ 1 (mod m).
In particular, there are no primitive roots modulo m.

Proof. Let ℓ = lcm(φ(k), φ(n)). By Euler’s Theorem, we have
a^{φ(k)} ≡ 1 (mod k) and a^{φ(n)} ≡ 1 (mod n),
hence
a^ℓ ≡ 1 (mod k) and a^ℓ ≡ 1 (mod n).
Thus a^ℓ ≡ 1 (mod m) by CRT. Finally, from Proposition 10 (iii), we have
ℓ ⋅ (φ(k), φ(n)) = φ(k)φ(n) ⟹ ℓ ∣ φ(k)φ(n)/2 = φ(m)/2,
where we used the fact that 2 ∣ (φ(k), φ(n)) as both φ(k), φ(n) are even. Then a^{φ(m)/2} ≡ 1 (mod m),
as desired. The last statement is clear from φ(m)/2 < φ(m). □
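As a sanity check of Theorem 38, the following Python sketch tests the congruence a^{φ(m)/2} ≡ 1 (mod m) for every unit a. The function name half_phi_exponent_works is our own; the moduli 12 = 4 ⋅ 3, 15 = 3 ⋅ 5, 20 = 4 ⋅ 5 and 21 = 3 ⋅ 7 all satisfy the hypotheses of the theorem, while 7 does not.

```python
from math import gcd

def euler_phi(m):
    """Number of integers in 1..m coprime to m."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

def half_phi_exponent_works(m):
    """True if a^(phi(m)/2) ≡ 1 (mod m) for every a coprime to m."""
    phi = euler_phi(m)
    return all(pow(a, phi // 2, m) == 1 for a in range(1, m) if gcd(a, m) == 1)

print([m for m in (12, 15, 20, 21) if half_phi_exponent_works(m)])  # [12, 15, 20, 21]
print(half_phi_exponent_works(7))  # False, since 7 has primitive roots
```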
Corollary 21. If m > 0 is divisible by two different odd primes, then there are no primitive
roots modulo m.

Proof. We can write m = p^d q^ℓ r, where p ≠ q are two odd primes, d, ℓ ≥ 1 and (r, p) = (r, q) = 1. Let
k = p^d and n = q^ℓ r, so that m = kn and (k, n) = 1. By the formula for φ, we also have that
φ(k) = p^{d−1}(p − 1) and φ(n) = q^{ℓ−1}(q − 1)φ(r)
are even, so we can apply Theorem 38. □
Corollary 22. If m is divisible by 4p with p an odd prime, then there are no primitive roots
modulo m.

Proof. We can write m = 2^d p^ℓ r with d ≥ 2, ℓ ≥ 1 and (2p, r) = 1. Let k = 2^d and n = p^ℓ r, so that m = kn
and (k, n) = 1. By the formula for φ, we also have that
φ(k) = 2^{d−1} ≠ 1 and φ(n) = p^{ℓ−1}(p − 1)φ(r)
are even, so we can apply Theorem 38. □
Theorem 39. Suppose m = 2^d, d ≥ 3. Then
(A) a^{2^{d−2}} ≡ 1 (mod m) for all a ∈ Z odd.
(B) There is no primitive root modulo m.
Proof. We will use induction on d ≥ 3 to prove (A).
Base: Let d = 3. Then m = 8 and 2^{d−2} = 2. We check
1^2 ≡ 1 (mod 8), 3^2 ≡ 9 ≡ 1 (mod 8), 5^2 ≡ 25 ≡ 1 (mod 8), 7^2 ≡ 49 ≡ 1 (mod 8).
Hypothesis: Suppose the result is valid for d − 1, that is, a^{2^{d−3}} ≡ 1 (mod 2^{d−1}).
Step: Let d > 3. The induction hypothesis is equivalent to a^{2^{d−3}} = 1 + 2^{d−1} k for some k ∈ Z.
Squaring both sides gives
a^{2^{d−2}} = (1 + 2^{d−1} k)^2 = 1 + 2^d k + 2^{2d−2} k^2,
which, since 2d − 2 ≥ d for d ≥ 3, implies
a^{2^{d−2}} ≡ 1 (mod 2^d).
This completes the proof of (A). Finally, from (A) and 2^{d−2} < φ(m) = 2^{d−1} it follows there is
no integer of order φ(m), proving (B). □
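Part (A) is easy to test numerically for the first few values of d. The Python sketch below does so; the function name theorem_39_holds is hypothetical, and the check is of course no substitute for the induction above.

```python
def theorem_39_holds(d):
    """Check a^(2^(d-2)) ≡ 1 (mod 2^d) for every odd residue a."""
    m = 2 ** d
    exponent = 2 ** (d - 2)
    return all(pow(a, exponent, m) == 1 for a in range(1, m, 2))

print(all(theorem_39_holds(d) for d in range(3, 13)))  # True
print(theorem_39_holds(2))  # False: 3 is a primitive root mod 4
```

The failure for d = 2 shows that the hypothesis d ≥ 3 cannot be dropped.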

Putting together these results we conclude that primitive roots may exist only for the integers
m = 1, 2, 4, p^d or 2p^d, where d ≥ 1 and p is an odd prime. The following theorem guarantees
that primitive roots exist for all such integers.
Theorem 40 (Primitive Root Theorem). Let m ∈ Z>0. Then a primitive root modulo m
exists if and only if m = 1, 2, 4, p^d or 2p^d, where d ≥ 1 and p is an odd prime.

This result is clear for m = 1, 2, 4 and, in the next section, we will prove it for m = p a prime.
To close this section, we will prove the above result for m = 2p^d assuming it to be true for
m = p^d. More precisely, we will prove that (1) implies (2) in the following result.
Theorem 41. Let p be an odd prime and d ≥ 1. Then,
(1) there exists a primitive root modulo p^d;
(2) there exists a primitive root modulo 2p^d.

Proof of part (2). Write n = 2p^d. Let r be a primitive root mod p^d, which exists by part (1).
Then (r, p^d) = 1 and, since r ≡ r + p^d (mod p^d), if r is even we replace it by r + p^d, which is
odd and is still a primitive root mod p^d. So, we can assume r is odd and (2p^d, r) = 1.
We aim to show that r is also a primitive root mod n. Note that φ(2p^d) = φ(2)φ(p^d) = φ(p^d).
By Proposition 27, we have ord_n(r) ∣ φ(2p^d) = φ(p^d). Moreover,
r^{ord_n(r)} ≡ 1 (mod n = 2p^d) ⟹ r^{ord_n(r)} ≡ 1 (mod p^d)
which, by Proposition 27, implies ord_{p^d}(r) = φ(p^d) ∣ ord_n(r). Now, we have shown that
ord_n(r) ∣ φ(p^d) and φ(p^d) ∣ ord_n(r), therefore ord_n(r) = φ(p^d) = φ(2p^d) because both ord_n(r) and
φ(p^d) are positive. We conclude that r is a primitive root modulo n = 2p^d. □
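The construction in this proof is explicit: keep r if it is odd, and otherwise replace it by r + p^d. A minimal Python sketch, assuming a primitive root mod p^d is already known (the function names are our own):

```python
from math import gcd

def euler_phi(m):
    """Number of integers in 1..m coprime to m."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

def is_primitive_root(r, m):
    """True if gcd(r, m) == 1 and the order of r mod m equals phi(m)."""
    if gcd(r, m) != 1:
        return False
    n, power = 1, r % m
    while power != 1 % m:
        power = power * r % m
        n += 1
    return n == euler_phi(m)

def lift_primitive_root(r, p, d):
    """Given a primitive root r mod p^d (p an odd prime), return one mod 2p^d."""
    pd = p ** d
    if not is_primitive_root(r, pd):
        raise ValueError("r must be a primitive root mod p^d")
    # Same residue class mod p^d, but forced to be odd.
    return r if r % 2 == 1 else r + pd

s = lift_primitive_root(2, 3, 2)     # 2 is a primitive root mod 9
print(s, is_primitive_root(s, 18))   # 11 True
```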

Exercises.
Exercise 26.9.
(a) Show that 2 is a primitive root modulo 19.
(b) How many incongruent primitive roots modulo 19 are there?
(c) By Euler’s Theorem, we know that a^18 ≡ 1 (mod 19) for any a coprime to 19. Explain
why a is not necessarily a primitive root modulo 19.
(d) Determine, with proof, a maximal set of incongruent primitive roots modulo 19.

27. Primitive Roots for Primes

The objective of this section is to prove the following result.
Theorem 42. Let p be a prime. Then there exists a primitive root modulo p.

Consider a polynomial f(x) = a_n x^n + a_{n−1} x^{n−1} + ⋯ + a_1 x + a_0 with integer coefficients a_i ∈ Z and a_n ≠ 0.
We call n the degree of f and we say that f is monic if a_n = 1. We call an integer c satisfying
f(c) ≡ 0 (mod m) a root of f modulo m.
Example 27.1. Let f(x) = x^3 + x + 1. We have
f (0) = 1 ≡ 1 (mod 2) and f (1) = 3 ≡ 1 (mod 2),
so f has no roots modulo 2. Now, working modulo 3, we obtain
f (0) = 1 ≡ 1 (mod 3), f (1) = 3 ≡ 0 (mod 3), f (2) = 11 ≡ 2 (mod 3),
showing that 1 is a root modulo 3 but 0 and 2 are not. We also have f (4) = 69 ≡ 0 (mod 3)
so 4 is another root of f modulo 3, but 4 ≡ 1 (mod 3) is congruent to the root we already
found. Since any integer is congruent mod 3 to 0, 1 or 2 we conclude that f has exactly one
root modulo 3.
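Since every integer is congruent to one of 0, 1, . . . , m − 1, the roots of a polynomial modulo m can be found by trying each residue. The following Python sketch does this with Horner's rule; the function name roots_mod and the coefficient ordering (highest degree first) are our own conventions.

```python
def roots_mod(coefficients, m):
    """Residues c in 0..m-1 with f(c) ≡ 0 (mod m).

    `coefficients` lists a_n, ..., a_1, a_0 (highest degree first).
    """
    def f(x):
        value = 0
        for c in coefficients:  # Horner's rule, reducing mod m at each step
            value = (value * x + c) % m
        return value
    return [c for c in range(m) if f(c) == 0]

f = [1, 0, 1, 1]          # x^3 + x + 1
print(roots_mod(f, 2))    # []
print(roots_mod(f, 3))    # [1]
```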

The polynomial f of the previous example has 0 and 1 roots modulo 2 and 3, respectively.
In both cases, the number of roots is smaller than the degree of f, which is 3. The following
lemma shows that, for a prime modulus, the number of roots never exceeds the degree.
Lemma 12 (Lagrange). Let p be a prime and let f be a monic polynomial of degree n ≥ 1 with
integer coefficients. Then, f has at most n incongruent roots modulo p.

Proof. We use induction on the degree n of f .


Base: For n = 1, the polynomial f(x) = x + a_0 has exactly one root mod p, namely x ≡ −a_0 (mod p).
Hypothesis: Suppose the statement is true for polynomials of degree n − 1. That is, every
monic polynomial of degree n − 1 has at most n − 1 roots mod p.
Step: Let f be a monic polynomial of degree n. For contradiction, assume f has n + 1 roots mod p.
Denote these roots by c_0, c_1, . . . , c_n. Therefore, f(c_k) ≡ 0 (mod p) for all k and c_i ≢ c_j (mod p) for
i ≠ j.
We compute
f(x) − f(c_0) = x^n + a_{n−1} x^{n−1} + ⋯ + a_1 x + a_0 − (c_0^n + a_{n−1} c_0^{n−1} + ⋯ + a_1 c_0 + a_0)
= x^n − c_0^n + a_{n−1}(x^{n−1} − c_0^{n−1}) + ⋯ + a_1(x − c_0).
Note that x^i − c_0^i = (x − c_0) h_{i−1}(x), where h_{i−1}(x) is a monic polynomial of degree i − 1, hence
f(x) − f(c_0) = (x − c_0) h_{n−1}(x) + a_{n−1}(x − c_0) h_{n−2}(x) + ⋯ + a_1(x − c_0)
= (x − c_0) g(x).
Here, g(x) is a monic polynomial of degree n − 1. Now, evaluating x at c_i in the previous
equality gives
f(c_i) − f(c_0) ≡ (c_i − c_0) g(c_i) (mod p) ⇐⇒ (c_i − c_0) g(c_i) ≡ 0 (mod p)
and, since p is prime, this implies
c_i − c_0 ≡ 0 (mod p) or g(c_i) ≡ 0 (mod p).
For i > 0, the first congruence cannot occur since c_i ≢ c_0 (mod p) by hypothesis. Hence
g(c_i) ≡ 0 (mod p) for all i = 1, . . . , n. Thus, g is monic of degree n − 1 and has n different roots mod p,
contradicting the induction hypothesis. We conclude that f has at most n roots mod p, as desired. □

We can now prove the following statement which implies Theorem 42.
Theorem 43. Let p be a prime and d ≥ 1 a divisor of p − 1. Then, there are φ(d) integers
a with 1 ≤ a ≤ p − 1 such that ord_p(a) = d.
In particular, there are φ(p − 1) primitive roots (mod p).

Proof. Let F (d) denote the number of integers a such that 1 ≤ a ≤ p − 1 and ordp (a) = d.
The proof is divided into two main parts:
(1) We will show that either F (d) = 0 or F (d) = φ(d);
(2) Using (1), we will show that F (d) = φ(d) when d ∣ p − 1.
We start by proving (1). If F (d) = 0 there are no integers of order d.
Suppose F(d) ≠ 0, so that there is at least one integer of order d. Note that any a of order d
is a root mod p of f(x) = x^d − 1; indeed, a^d ≡ 1 (mod p) ⇐⇒ f(a) ≡ a^d − 1 ≡ 0 (mod p).
Fix a of order d. Note that f(a^i) = (a^i)^d − 1 ≡ (a^d)^i − 1 ≡ 0 (mod p) and a^i ≢ a^j (mod p) if
i ≠ j are in the range 0 ≤ i, j ≤ d − 1. Then, a^0, a^1, . . . , a^{d−1} are d distinct mod p roots of f.
Since f has degree d, it follows from Lemma 12 that these are all the mod p roots of f.
We conclude that all the elements of order d are among the a^i and so we need to determine
how many a^i, 1 ≤ i ≤ d, have order d (recall that a^d ≡ a^0 (mod p)). By Proposition 28 (ii),
a^i has order d if and only if
d = ord_p(a^i) = ord_p(a) / (ord_p(a), i) = d / (d, i) ⇐⇒ (d, i) = 1,
which occurs for φ(d) values of i in 1 ≤ i ≤ d. Then F(d) = φ(d), as desired.
We will now prove (2). Since every a in 1 ≤ a ≤ p − 1 has a unique order d dividing φ(p) = p − 1
(by Proposition 27), we can group these elements based on their orders. In doing so, we see
that the total number of integers, p − 1, is equal to the sum of the numbers F(d) of integers
of each order d. Therefore,
p − 1 = ∑_{d ∣ p−1, d>0} F(d) = ∑_{d ∣ p−1, d>0} φ(d),
where the second equality follows from Theorem 34. We conclude that
∑_{d ∣ p−1, d>0} (F(d) − φ(d)) = ∑_{d ∣ p−1, F(d)=0} (F(d) − φ(d)) = − ∑_{d ∣ p−1, F(d)=0} φ(d) = 0.
Here, we used part (1) to discard all the terms such that F(d) ≠ 0 in the first equality. Since
φ(d) ≥ 1 for all d, we conclude that the last sum runs over the empty set, otherwise we have
a contradiction. Thus F(d) = φ(d) for all d ∣ p − 1. □
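The statement F(d) = φ(d) can be observed directly for a small prime by tallying the orders of all residues. The following Python sketch does this for p = 19; the helper names are hypothetical and everything is computed by brute force.

```python
from collections import Counter
from math import gcd

def euler_phi(n):
    """Number of integers in 1..n coprime to n."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def order(a, p):
    """Smallest n >= 1 with a^n ≡ 1 (mod p); assumes gcd(a, p) == 1."""
    n, power = 1, a % p
    while power != 1:
        power = power * a % p
        n += 1
    return n

def order_counts(p):
    """Counter d -> F(d), the number of 1 <= a <= p-1 with ord_p(a) = d."""
    return Counter(order(a, p) for a in range(1, p))

counts = order_counts(19)
for d in sorted(counts):
    print(d, counts[d], euler_phi(d))  # the last two columns agree
```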
Example 27.2. Find all integers of order 6 modulo 19.
We first show that 2 is a primitive root. We have φ(19) = 18, so ord_19(2) is one of
the values {1, 2, 3, 6, 9, 18}. We compute
2^1 ≡ 2, 2^2 ≡ 4, 2^3 ≡ 8, 2^6 ≡ 7, 2^9 ≡ 18 (mod 19).
Since none of these powers is congruent to 1 (mod 19), we conclude ord_19(2) = 18, as desired.
Since φ(6) = 2, by Theorem 43, there are two integers of order 6 mod 19. From Corollary 19
we know they are congruent mod 19 to 2^i for some 1 ≤ i ≤ 18. We need to find the values of
i in this interval satisfying
6 = ord_19(2^i) = ord_19(2) / (ord_19(2), i) = 18 / (18, i) ⇐⇒ (18, i) = 3.
Therefore i = 3 or i = 15, hence
2^3 ≡ 8 (mod 19) and 2^15 ≡ 12 (mod 19)
are the two elements of order 6.
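The method of this example works for any prime p once a primitive root r is known: the elements of order d are the powers r^i with (p − 1)/(p − 1, i) = d. A short Python sketch, where the function name elements_of_order is our own and r is assumed (not checked) to be a primitive root:

```python
from math import gcd

def elements_of_order(d, r, p):
    """Residues of order d mod p, assuming r is a primitive root mod the prime p."""
    n = p - 1
    # By Proposition 28 (ii), ord_p(r^i) = n / gcd(n, i).
    return sorted(pow(r, i, p) for i in range(1, n + 1) if n // gcd(n, i) == d)

print(elements_of_order(6, 2, 19))  # [8, 12]
```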

27.1. Carmichael Numbers, Revisited. Recall that a Carmichael number is a composite
integer n > 0 such that, for all a ∈ Z coprime to n, we have a^{n−1} ≡ 1 (mod n). In Section 21,
we introduced Korselt’s criterion, which classifies Carmichael numbers (see Theorem 24),
but we only proved one direction of this theorem. The proof of the other direction requires
the use of primitive roots. In this section, we finally complete the proof of Korselt’s criterion.
More precisely, we will show the following implication.
Theorem 44. Let n > 2 be a Carmichael number. Then,
(i) n is squarefree, i.e., n = p_1 ⋯ p_k with p_i distinct primes;
(ii) if p ∣ n is prime then p − 1 ∣ n − 1.

Proof. Let n > 2 be a Carmichael number and p ∣ n a prime factor. We can write n = p^k n′
with (p, n′) = 1 for some n′ ∈ Z. Note that to prove (i) we need to show that k = 1. By CRT,
the system of congruences
x ≡ 1 + p (mod p^k), x ≡ 1 (mod n′)
admits a solution, that is, there is an integer a satisfying
(27.3) a ≡ 1 + p (mod p^k), a ≡ 1 (mod n′).
We note that (a, n) = 1. Indeed, if a prime q ∣ (a, n) then either q = p or q is a prime factor of
n′. Reducing the first congruence mod q if q = p, or the second if q ∣ n′, leads to 0 ≡ 1 (mod q)
in both cases, a contradiction. Therefore, since n is a Carmichael number, we have
a^{n−1} ≡ 1 (mod n).
Suppose now k ≥ 2 so that p^2 ∣ n. Reducing this congruence mod p^2 gives
a^{n−1} ≡ 1 (mod p^2) ⟹ (1 + p)^{n−1} ≡ 1 (mod p^2)
where we used (27.3) in the implication above. We have, by the binomial theorem (see footnote 2), that
(1 + p)^{n−1} = (1 + p)(1 + p)⋯(1 + p) ≡ 1 + (n − 1)p (mod p^2).
Since p ∣ n, we also have
1 + (n − 1)p = 1 + np − p ≡ 1 − p (mod p^2),
therefore,
1 ≡ (1 + p)^{n−1} ≡ 1 + (n − 1)p ≡ 1 − p (mod p^2) ⟹ −p ≡ 0 (mod p^2),
which is impossible. Thus k = 1, completing the proof of (i).
We will now prove (ii). Let p ∣ n be a prime. Since n is squarefree by part (i), we have
(p, n/p) = 1. Let b be a primitive root mod p. This primitive root exists by Theorem 42. By
CRT, the system of congruences
x ≡ b (mod p), x ≡ 1 (mod n/p)
admits a solution. That is, there is an integer a satisfying
a ≡ b (mod p), a ≡ 1 (mod n/p).
An argument similar to the one above shows that (a, n) = 1 and, since n is a Carmichael number, we
have
a^{n−1} ≡ 1 (mod n) ⟹ a^{n−1} ≡ b^{n−1} ≡ 1 (mod p).
By Proposition 27 and since b is a primitive root mod p, we conclude
ord_p(b) = φ(p) = p − 1 ∣ n − 1,
completing the proof of (ii). □
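Both sides of Korselt's criterion can be compared numerically for small n. The Python sketch below tests the definition of a Carmichael number directly and, separately, the two conditions of Theorem 44. The function names are our own, and factoring is done by trial division, which is only reasonable for small inputs.

```python
from math import gcd

def prime_factors(n):
    """Return {prime: exponent} for n >= 2, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def satisfies_korselt(n):
    """Composite, squarefree, and p - 1 divides n - 1 for every prime p | n."""
    factors = prime_factors(n)
    composite = sum(factors.values()) >= 2
    squarefree = all(e == 1 for e in factors.values())
    return composite and squarefree and all((n - 1) % (p - 1) == 0 for p in factors)

def is_carmichael(n):
    """Composite n with a^(n-1) ≡ 1 (mod n) for all a coprime to n."""
    composite = sum(prime_factors(n).values()) >= 2
    return composite and all(
        pow(a, n - 1, n) == 1 for a in range(1, n) if gcd(a, n) == 1
    )

print([n for n in range(3, 2000) if is_carmichael(n)])      # [561, 1105, 1729]
print([n for n in range(3, 2000) if satisfies_korselt(n)])  # [561, 1105, 1729]
```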

Exercises.
Exercise 27.4. Show that, if f(x) is a polynomial of degree n with integer coefficients, and
p and q are prime numbers such that p ≠ q, then the congruence f(x) ≡ 0 (mod pq) has at
most n^2 incongruent solutions modulo pq.
Exercise 27.5.
(a) How many elements of order 6 modulo 17 are there?
(b) How many elements of order 4 modulo 17 are there?
(c) Find all elements of order 4 modulo 17 using the fact that 3 is a primitive root modulo 17.

2 The binomial theorem:
(x + y)^n = ∑_{k=0}^{n} \binom{n}{k} x^k y^{n−k}

28. Index Arithmetic and Discrete Logarithms

Let n be an integer admitting a primitive root. Recall that, if r is a primitive root, then the
set {1, r, r^2, . . . , r^{φ(n)−1}} is a reduced residue system mod n. In particular, for all a ∈ Z such
that (a, n) = 1 we have r^i ≡ a (mod n) for some i in the range 1 ≤ i ≤ φ(n).
Definition 28.1. Let r be a primitive root mod n and a ∈ Z coprime to n. The index of
a relative to r is the least positive integer i such that r^i ≡ a (mod n). We denote this by
ind_r a.
Example 28.2. We know that r = 3 is a primitive root mod n = 7. We have already
computed the first two rows of the following table. Using them we can determine all the
indices relative to 3 mod 7.
i             1  2  3  4  5  6
3^i (mod 7)   3  2  6  4  5  1

a             1  2  3  4  5  6
ind_3 a       6  2  1  4  5  3
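Building a table of indices amounts to listing the powers of r and inverting the table. A minimal Python sketch follows; the function name index_table is our own, and the function refuses to build a table when r is not a primitive root.

```python
from math import gcd

def index_table(r, n):
    """Dictionary a -> ind_r(a) for all units a mod n; r must be a primitive root."""
    phi = sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)
    table = {}
    power = 1
    for i in range(1, phi + 1):
        power = power * r % n
        table[power] = i          # r^i ≡ power (mod n), so ind_r(power) = i
    if len(table) != phi:
        raise ValueError("r is not a primitive root mod n")
    return table

indices = index_table(3, 7)
print([indices[a] for a in range(1, 7)])  # [6, 2, 1, 4, 5, 3]
```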
Remark 28.3. Note that the existence of a primitive root is necessary for the definition of
the index to make sense. For instance, consider n = 12 and r = 5:
i              1  2  3  4  5  6  7  8  9 10 11
5^i (mod 12)   5  1  5  1  5  1  5  1  5  1  5
This shows that there is no integer i such that 5^i ≡ a (mod 12) when a ≢ 1, 5 (mod 12). This occurs
because 5 is not a primitive root modulo 12. In fact, since there is no primitive root mod
12, the index does not make sense in this setting.

It is common to also refer to indices as discrete logs since they share properties similar to
those of the usual logarithms of real numbers. This is clear from the following proposition.
Proposition 29. Let r be a primitive root mod n. Let a, b ∈ Z be coprime to n and d ≥ 1.
(a) ind_r 1 ≡ 0 (mod φ(n));
(b) ind_r r ≡ 1 (mod φ(n));
(c) ind_r(ab) ≡ ind_r a + ind_r b (mod φ(n));
(d) ind_r(a^d) ≡ d ⋅ ind_r a (mod φ(n)).

Proof.
(a) By definition of the primitive root, we have r^{φ(n)} ≡ 1 (mod n) and no smaller positive
exponent i satisfies r^i ≡ 1 (mod n). Thus ind_r 1 = φ(n) ≡ 0 (mod φ(n)).
(b) Since i = 1 is the smallest positive exponent such that r^i ≡ r (mod n), we have
ind_r r = 1 ≡ 1 (mod φ(n)).
(c) By definition of index, we have
r^{ind_r(ab)} ≡ ab ≡ r^{ind_r a} ⋅ r^{ind_r b} ≡ r^{ind_r a + ind_r b} (mod n),
hence, by Proposition 28 (i), we have
ind_r(ab) ≡ ind_r a + ind_r b (mod ord_n(r) = φ(n)).
(d) Similarly to (c), we have
r^{ind_r(a^d)} ≡ a^d ≡ (r^{ind_r a})^d ≡ r^{d⋅ind_r a} (mod n),
hence, by Proposition 28 (i), ind_r(a^d) ≡ d ⋅ ind_r a (mod ord_n(r) = φ(n)). □

We will now see how index arithmetic can be used to solve certain congruence equations.
Let r be a primitive root mod n, let a, b, d ∈ Z with d ≥ 1 and a, b coprime to n, and consider
the congruence equation
a x^d ≡ b (mod n).
Any solution x is coprime to n, so we can rewrite this equation as r^{ind_r(a x^d)} ≡ r^{ind_r b} (mod n).
By Propositions 28 and 29, we also have
ind_r a + d ⋅ ind_r x ≡ ind_r b (mod φ(n)).
Relabeling y = ind_r x, a′ = d, and b′ = ind_r b − ind_r a, the equation transforms into the linear
congruence in one variable
a′ y ≡ b′ (mod φ(n)),
which we know how to solve using Theorem 19.
We will now solve some concrete equations in a couple of examples. For that we need to
have access to a table of indices.
Example 28.4. For n = 17, we check that r = 3 is a primitive root and compute the table
of indices relative to 3.
a          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
ind_3 a   16 14  1 12  5 15 11 10  2  3  7 13  4  9  6  8

In the next two examples, we will not refer to the previous table, though it should be
understood that we are using the results listed there.
Example 28.5. Determine all the integers satisfying 6x^12 ≡ 11 (mod 17).
We have φ(17) = 16. Taking indices on both sides gives
6x^12 ≡ 11 (mod 17) ⇐⇒ ind_3 6 + 12 ⋅ ind_3 x ≡ ind_3 11 (mod 16)
⇐⇒ 15 + 12 ⋅ ind_3 x ≡ 7 (mod 16)
⇐⇒ 12 ⋅ ind_3 x ≡ 8 (mod 16)
⇐⇒ 3 ⋅ ind_3 x ≡ 2 (mod 4)
⇐⇒ ind_3 x ≡ 2 (mod 4)
⇐⇒ ind_3 x ≡ 2, 6, 10, 14 (mod 16)
⇐⇒ x ≡ 3^{ind_3 x} ≡ 3^2, 3^6, 3^10, 3^14 (mod 17)
⇐⇒ x ≡ 9, 15, 8, 2 (mod 17).

Here, we have changed from modulus 16 to 4 using Lemma 7. See also Exercise 13.19.
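The index method of this example can be sketched in Python as follows. The function names are our own, r is assumed to be a primitive root mod n, and the linear congruence in the index is solved by simply trying every exponent, which is fine for a modulus this small.

```python
def index_table(r, n):
    """Dictionary a -> ind_r(a); assumes r is a primitive root mod n."""
    table, power, i = {}, r % n, 1
    while power not in table:     # the powers of r cycle after phi(n) steps
        table[power] = i
        power = power * r % n
        i += 1
    return table

def solve_power_congruence(a, d, b, n, r):
    """Solutions x mod n of a*x^d ≡ b (mod n), with a, b coprime to n."""
    ind = index_table(r, n)
    phi = len(ind)
    rhs = (ind[b % n] - ind[a % n]) % phi
    # Solve d*y ≡ ind(b) - ind(a) (mod phi(n)) for y = ind(x).
    exponents = [y for y in range(1, phi + 1) if (d * y - rhs) % phi == 0]
    return sorted(pow(r, y, n) for y in exponents)

print(solve_power_congruence(6, 12, 11, 17, 3))  # [2, 8, 9, 15]
```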
Example 28.6. Determine all the integers satisfying 7^x ≡ 6 (mod 17).
Taking indices on both sides we get
7^x ≡ 6 (mod 17) ⇐⇒ x ⋅ ind_3 7 ≡ ind_3 6 (mod 16)
⇐⇒ 11x ≡ 15 (mod 16)
⇐⇒ 33x ≡ 45 (mod 16)
⇐⇒ x ≡ 13 (mod 16).

We note that, in this example, the original congruence is mod 17 but the final description
of the integer solutions is mod 16.
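For a modulus this small, the exponent can also be found by trying successive powers of the base. The Python sketch below is a brute-force search under our own naming (discrete_log), not an efficient discrete logarithm algorithm.

```python
def discrete_log(base, target, n):
    """Smallest x >= 1 with base^x ≡ target (mod n), or None if there is none."""
    power = base % n
    for x in range(1, n):
        if power == target % n:
            return x
        power = power * base % n
    return None

x = discrete_log(7, 6, 17)
print(x)                    # 13
print(pow(7, x + 16, 17))   # 6, since exponents only matter mod 16
```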

The last two examples show that particular non-linear congruence equations can be solved
using indices. As for a general theorem, we will prove the following criterion to decide if
certain congruence equations have solutions.
Theorem 45. Let n be an integer admitting a primitive root. Let a, k ∈ Z with (a, n) = 1
and k ≥ 1. Consider the congruence equation
(28.7) x^k ≡ a (mod n).
Write d = (k, φ(n)). Then,
(a) if a^{φ(n)/d} ≢ 1 (mod n), then (28.7) has no solutions;
(b) if a^{φ(n)/d} ≡ 1 (mod n), then (28.7) has exactly d non-congruent solutions mod n.

Proof. Let r be a primitive root (mod n). We have
x^k ≡ a (mod n) ⇐⇒ k ⋅ ind_r x ≡ ind_r a (mod φ(n)).
By Theorem 19, the above linear congruence, in the variable y = ind_r x, has no solutions if
d ∤ ind_r a and d non-congruent solutions if d ∣ ind_r a. We show that the condition d ∣ ind_r a
is equivalent to a^{φ(n)/d} ≡ 1 (mod n), which proves (a) and (b) simultaneously. Indeed,
a^{φ(n)/d} ≡ 1 (mod n) ⇐⇒ ind_r(a^{φ(n)/d}) ≡ ind_r 1 ≡ 0 (mod φ(n))
⇐⇒ (φ(n)/d) ⋅ ind_r a ≡ 0 (mod φ(n))
⇐⇒ d ∣ ind_r a.
□

We finish this section with the following example.


Example 28.8. Decide how many non-congruent solutions the equation x^3 ≡ 6 (mod 7) has.
In the notation of Theorem 45, we have
a = 6, n = 7, k = 3, d = (3, φ(7)) = (3, 6) = 3,
and computing
a^{φ(n)/d} = 6^2 ≡ 36 ≡ 1 (mod 7),
we conclude that x^3 ≡ 6 (mod 7) has d = 3 non-congruent solutions mod 7. Indeed, direct
calculation shows the solutions are x ≡ 3, 5, 6 (mod 7).
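The criterion of Theorem 45 needs only one modular exponentiation, which the following Python sketch compares with a direct search. The function names are hypothetical, and count_kth_roots assumes (without checking) that n admits a primitive root and that (a, n) = 1.

```python
from math import gcd

def euler_phi(n):
    """Number of integers in 1..n coprime to n."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def count_kth_roots(a, k, n):
    """Number of solutions of x^k ≡ a (mod n) predicted by Theorem 45."""
    phi = euler_phi(n)
    d = gcd(k, phi)
    return d if pow(a, phi // d, n) == 1 else 0

def brute_force_roots(a, k, n):
    """All residues x in 0..n-1 with x^k ≡ a (mod n)."""
    return [x for x in range(n) if pow(x, k, n) == a % n]

print(count_kth_roots(6, 3, 7), brute_force_roots(6, 3, 7))  # 3 [3, 5, 6]
print(count_kth_roots(2, 3, 7), brute_force_roots(2, 3, 7))  # 0 []
```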
Exercises.
Exercise 28.9. Given that 3^18 ≡ 9 (mod 17), explain why ind_3(9) ≠ 18.
Exercise 28.10. Prove that the congruence x^5 ≡ 1 (mod 52579) has exactly one solution,
x ≡ 1 (mod 52579). Use the fact that 52579 is a prime.

29. Nonlinear Diophantine Equations

A Diophantine equation in one or more variables which is not linear in the sense of
Definition 11.3 is called nonlinear. In Section 11, we studied linear Diophantine equations
and completely solved the case of two variables. There is no analogous result for the
nonlinear case. Moreover, it is a theorem that there is no algorithm that will solve all nonlinear
Diophantine equations, and it is usually very hard to solve particular examples. Nevertheless,
there are many methods that can be used for particular instances or families; for example,
in Section 15, we used the congruence method to prove that 3x^3 + 2 = y^2 has no integer
solutions. Depending on the situation, other methods may find a partial or complete list of
solutions, and sometimes one finds all the solutions but has no proof that there are no more.
Example 29.1. The following are examples of famous nonlinear Diophantine equations:
(1) The Pythagorean Equation
x^2 + y^2 = z^2;
(2) The Fermat Equation
x^n + y^n = z^n, where n ≥ 3;
(3) The Pell equation
x^2 − ny^2 = 1,
where n ∈ Z>0 is not a square.

In the next two sections, we will solve two classical examples of nonlinear Diophantine
equations. More precisely, we will describe the complete set of solutions (a, b, c) satisfying
gcd(a, b, c) = 1 of the equations x^2 + y^2 = z^2 and x^4 + y^4 = z^4.

Exercises.
Exercise 29.2. Which of the following equations are nonlinear?
x^2 = 1, x + y + z = 4, xy = 3, y = 1, z + y^2 = 7

30. Pythagorean Triples

It is very well known that, given a right triangle, the square of the hypotenuse is equal to
the sum of the squares of the other two sides. This statement can be made more precise in
the following way.
Theorem 46 (Pythagoras’ theorem). Let x, y, z be the side lengths of a right triangle, where z
corresponds to the hypotenuse. Then, x^2 + y^2 = z^2.
Examples 30.1. Here are a few solutions to the Pythagorean equation:

(1) 1^2 + 1^2 = (√2)^2
(2) 3^2 + 4^2 = 5^2
(3) (−3)^2 + 4^2 = 5^2
(4) 9^2 + 12^2 = 15^2

We are interested in x^2 + y^2 = z^2 as a Diophantine equation, so example (1) above is not a
solution for us since √2 ∉ Z. Also, solutions (2) and (3) are related by a change of sign;
indeed, we can flip the sign of any variable and obtain a new solution. This occurs because
the exponents are even. Therefore, we will restrict ourselves to only positive values of x, y, z.
Definition 30.2. We call x, y, z ∈ Z a Pythagorean triple if x, y, z > 0 and x^2 + y^2 = z^2.

Note that solution (4) can be obtained by multiplying solution (2) by 3. In general, if x, y, z
is a Pythagorean triple, then
(dx)^2 + (dy)^2 = d^2(x^2 + y^2) = d^2 z^2 = (dz)^2,
so dx, dy, dz is also a Pythagorean triple for all d > 0. Conversely, suppose x, y, z is a
Pythagorean triple with a common factor (x, y, z) = d. Then, we can write
x = dx_0, y = dy_0, z = dz_0
to obtain
(dx_0)^2 + (dy_0)^2 = (dz_0)^2 ⟹ x_0^2 + y_0^2 = z_0^2
with (x_0, y_0, z_0) = 1. Thus we can restrict our attention to coprime triples.
Definition 30.3. We call a Pythagorean triple x, y, z primitive if (x, y, z) = 1.

We start by proving some elementary properties of Pythagorean triples.


Proposition 30. If x, y, z is a primitive Pythagorean triple then
(x, y) = (x, z) = (y, z) = 1.

Proof. Suppose (x, y) ≠ 1. Then p ∣ x and p ∣ y for some prime p. Thus p ∣ (x^2 + y^2) = z^2,
hence p ∣ z and (x, y, z) ≠ 1, a contradiction. Similarly for (x, z) and (y, z). □
Proposition 31. If x, y, z is a primitive Pythagorean triple then x ≢ y (mod 2). That is,
exactly one of x, y is odd and the other is even.
Proof. By the previous proposition, x, y are not both even, otherwise 2 ∣ (x, y).
Suppose x, y are both odd, i.e. x ≡ y ≡ 1 (mod 2). Then x^2 ≡ y^2 ≡ 1 (mod 4) and
z^2 = x^2 + y^2 ⟹ z^2 ≡ x^2 + y^2 ≡ 2 (mod 4),
which is impossible because
0^2 ≡ 0, 1^2 ≡ 1, 2^2 ≡ 0, 3^2 ≡ 1 (mod 4)
shows that 2 is not a square mod 4. □

30.1. Classification of Primitive Pythagorean Triples. Given a primitive Pythagorean
triple x, y, z, we know from Proposition 31 that x, y must have different parity. From the
symmetry of the equation x^2 + y^2 = z^2, we can further assume y to be even.
Theorem 47. The positive integers x, y, z form a primitive Pythagorean triple with even y
if and only if there are integers m, n > 0 such that
(1) (m, n) = 1;
(2) m > n;
(3) m and n have different parity, that is, m ≢ n (mod 2);
(4) the values of x, y, z are given by
x = m^2 − n^2, y = 2mn, z = m^2 + n^2.

Proof. Suppose m, n are positive integers satisfying (1), (2), (3) and (4). Let x, y, z be as in
(4). We compute
x^2 + y^2 = (m^2 − n^2)^2 + (2mn)^2 = m^4 − 2m^2 n^2 + n^4 + 4m^2 n^2
= m^4 + 2m^2 n^2 + n^4 = (m^2 + n^2)^2 = z^2,
so x, y, z form a Pythagorean triple; note that x > 0 by (2). Suppose x, y, z are not primitive.
That is, there exists a prime p such that p ∣ x, p ∣ y, and p ∣ z.
Since x = m^2 − n^2 ≡ m − n ≢ 0 (mod 2) by (3), it follows that x is odd, so p ≠ 2. Note also
that p divides both x + z = 2m^2 and z − x = 2n^2, so, since p is odd, p ∣ m and p ∣ n, contradicting (1). Thus,
x, y, z form a primitive Pythagorean triple.
Conversely, let x, y, z be a primitive Pythagorean triple with even y. From Proposition 30,
we know that (x, y) = (x, z) = (y, z) = 1. We also have
x^2 + y^2 = z^2 ⇐⇒ y^2 = z^2 − x^2 = (z − x)(z + x),
and dividing both sides by 4 we get
(y/2)^2 = ((z − x)/2) ⋅ ((z + x)/2).
Since x, z are odd and y is even, (z − x)/2, (z + x)/2 and y/2 are integers.
We note that ((z − x)/2, (z + x)/2) = 1. Indeed, suppose ((z − x)/2, (z + x)/2) = d > 1. Then d divides both the
sum and the difference of the two numbers, that is
d ∣ ((z − x)/2 + (z + x)/2) = z and d ∣ ((z + x)/2 − (z − x)/2) = x,
contradicting (x, z) = 1. Note also that z > x, so (z − x)/2 and (z + x)/2 are both positive. Now,
applying Proposition 32 (proved below) with a = (x + z)/2, b = (z − x)/2, and c = y/2, there are m, n ∈ Z>0 such that
(z − x)/2 = n^2 and (z + x)/2 = m^2.
Writing x, y, z in terms of m, n gives z = m^2 + n^2, x = m^2 − n^2 and
y^2 = z^2 − x^2 = (z − x)(z + x) = (2n^2)(2m^2) = 4m^2 n^2,
hence y = ±2mn. Since y > 0, we have y = 2mn, which proves (4). Since x > 0, we have m > n > 0,
proving (2). If a prime p divides n and m, then p divides x and z, contradicting (x, z) = 1; this
proves (1). Finally, to prove (3), suppose both m, n are odd. Then,
z ≡ 1 + 1 ≡ 0 (mod 2) and x ≡ 1 − 1 ≡ 0 (mod 2),
a contradiction with (x, z) = 1. This shows m, n are not both odd and, since they are coprime
by (1), they cannot be both even, proving (3). □
Example 30.4. Using Theorem 47, we can easily produce non-trivial primitive Pythagorean
triples. For example, taking m = 5, n = 2 gives
(x, y, z) = (21, 20, 29),
and taking m = 6, n = 5 gives
(x, y, z) = (11, 60, 61).
We can also take m = 3^10, n = 2^10 to obtain
(x, y, z) = (3485735825, 120932352, 3487832977).
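Theorem 47 translates directly into a procedure that enumerates primitive Pythagorean triples. The following Python sketch lists all of them with z up to a given bound; the function names triple and primitive_triples are our own.

```python
from math import gcd

def triple(m, n):
    """The triple (x, y, z) attached to m > n > 0 by the formulas in (4)."""
    return m * m - n * n, 2 * m * n, m * m + n * n

def primitive_triples(limit):
    """Yield the primitive Pythagorean triples (x, y, z) with y even and z <= limit."""
    m = 2
    while m * m + 1 <= limit:
        for n in range(1, m):
            # Conditions (1) and (3): coprime and of different parity.
            if (m - n) % 2 == 1 and gcd(m, n) == 1:
                x, y, z = triple(m, n)
                if z <= limit:
                    yield x, y, z
        m += 1

print(list(primitive_triples(30)))
# [(3, 4, 5), (5, 12, 13), (15, 8, 17), (7, 24, 25), (21, 20, 29)]
print(triple(3**10, 2**10))
# (3485735825, 120932352, 3487832977)
```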
Proposition 32. Let a, b, c ∈ Z with a, b > 0, (a, b) = 1 and ab = c^2. Then a and b are squares.
That is, there are positive integers m and n such that a = m^2 and b = n^2.

Proof. Since (−c)^2 = c^2 we may assume c > 0. Consider the prime factorizations
a = p_1^{e_1} ⋯ p_k^{e_k}, b = q_1^{s_1} ⋯ q_m^{s_m}, c = ℓ_1^{d_1} ⋯ ℓ_n^{d_n},
where the p_i are distinct primes, and similarly for the q_j and the ℓ_i (in this proof, m and n
denote the numbers of prime factors of b and c). From ab = c^2, we have
(p_1^{e_1} ⋯ p_k^{e_k})(q_1^{s_1} ⋯ q_m^{s_m}) = ℓ_1^{2d_1} ⋯ ℓ_n^{2d_n}.
Since (a, b) = 1 we have that p_i ≠ q_j for all i, j. By uniqueness of the prime factorization, we
conclude that both sides of the equation are the unique prime factorization of c^2. It follows
that n = k + m and, for all 1 ≤ i ≤ k and 1 ≤ j ≤ m, there are 1 ≤ z_i, w_j ≤ n such that
p_i^{e_i} = ℓ_{z_i}^{2d_{z_i}} and q_j^{s_j} = ℓ_{w_j}^{2d_{w_j}}.
Therefore,
a = ℓ_{z_1}^{2d_{z_1}} ⋯ ℓ_{z_k}^{2d_{z_k}} = (ℓ_{z_1}^{d_{z_1}} ⋯ ℓ_{z_k}^{d_{z_k}})^2,
hence a is a square. Similarly, b is a square. □
Remark 30.5. The hypotheses in Proposition 32 are necessary. Indeed, (−4)(−9) = 6^2 and
(−4, −9) = 1 but neither −4 nor −9 is a square. Moreover, (3 ⋅ 2^2) ⋅ 3 = 6^2 and both factors are
positive, but neither 3 ⋅ 2^2 nor 3 is a square. However, we also have (2^2)(3^2) = 6^2, where the
factors are positive, coprime and squares, as predicted by the proposition.
30.2. Geometric View of Pythagorean Triples. In this section we will interpret Pythagorean
triples in a geometric way as points of positive rational coordinates on the unit circle, which
we recall is defined by the equation x^2 + y^2 = 1.

[Figure: the unit circle together with the line y = t(x + 1), which passes through (−1, 0) and meets the circle again at (x_0, y_0).]

Suppose that a, b, c > 0 form a Pythagorean triple. Then, since c ≠ 0, we have
(30.6) a^2 + b^2 = c^2 ⇐⇒ (a/c)^2 + (b/c)^2 = 1,
which shows that the point (x_0, y_0) = (a/c, b/c) is on the unit circle. Conversely, if (x_0, y_0) is a
point on the unit circle with rational coordinates x_0, y_0 then, for some choice of a common
denominator c, we can write x_0 = a/c and y_0 = b/c and the equivalence (30.6) shows that a,
b, c, when positive, form a Pythagorean triple.
Example 30.7. The triple 3^2 + 4^2 = 5^2 gives rise to the point (3/5, 4/5) on the unit circle.

It follows from the previous discussion that we can describe Pythagorean triples by describing
the points on the unit circle having rational coordinates. We will now obtain an explicit
description of such points.
Consider the line y = t(x + 1), passing through the point (−1, 0) and a point (x_0, y_0) ≠ (−1, 0) on the
unit circle. This line has slope t = y_0/(x_0 + 1), which is a rational number if the coordinates x_0,
y_0 are rational. In particular, when (x_0, y_0) = (a/c, b/c) arises from a Pythagorean triple, the
slope t is rational. Conversely, if we intersect the unit circle with the line y = t(x + 1) for
a rational value of the slope t we obtain a point (x_0, y_0) with rational coordinates. Indeed,
the intersection points are those whose coordinates (x, y) satisfy both equations
x^2 + y^2 = 1 and y = t(x + 1).
Substituting the equation for y into the first equation leads to
x^2 + (t(x + 1))^2 = 1 ⇐⇒ x^2 − 1 + t^2(x + 1)^2 = 0 ⇐⇒ (x − 1)(x + 1) + t^2(x + 1)^2 = 0
⇐⇒ (x + 1)((x − 1) + t^2(x + 1)) = 0,
hence
x+1=0 or x − 1 + t2 (x + 1) = 0.
Solving this for x gives
1 − t2
x = −1 or x= .
1 + t2
Now, by replacing these values in the equation y = t(x + 1) we see that the corresponding y
coordinates are
1 − t2 2t
y = 0 or y = t ( 2
+ 1) = .
1+t 1 + t2
Therefore, the points of intersection of the line with the unit circle are
1 − t2 2t
(−1, 0) and (x0 , y0 ) = ( , ).
1 + t2 1 + t2
The first point was expected due to our construction of the line, while the second has rational
coordinates if the slope t is a rational number, as desired. Suppose now t = n/m is rational
with m > n > 0, so that 0 < t < 1 and both coordinates are positive. Then,
(x0, y0) = ((1 − (n/m)^2)/(1 + (n/m)^2), 2(n/m)/(1 + (n/m)^2)) = ((m^2 − n^2)/(m^2 + n^2), 2mn/(m^2 + n^2))
and by the argument in the beginning of this section, we obtain the Pythagorean triple
m^2 − n^2, 2mn, m^2 + n^2,
recovering the formulas in Theorem 47.
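The construction can be checked numerically. The following is a minimal sketch in Python, assuming a rational slope 0 < t < 1; the function names are illustrative and not part of these notes. It computes the second intersection point with exact rational arithmetic and clears denominators to read off a triple.

```python
from fractions import Fraction
from math import gcd


def point_from_slope(t):
    """Second intersection of the line y = t(x + 1) with the unit circle."""
    t = Fraction(t)
    x = (1 - t * t) / (1 + t * t)
    y = (2 * t) / (1 + t * t)
    return x, y


def triple_from_slope(t):
    """Pythagorean triple (a, b, c) read off from the rational point for slope t."""
    x, y = point_from_slope(t)
    assert x * x + y * y == 1  # the point really lies on the unit circle
    # Least common denominator of the two coordinates.
    c = x.denominator * y.denominator // gcd(x.denominator, y.denominator)
    return int(x * c), int(y * c), c


print(triple_from_slope(Fraction(1, 2)))  # slope 1/2 gives the point (3/5, 4/5)
```

Because the fractions are stored in lowest terms, this sketch always returns a triple with no common factor, which can be a proper divisor of (m^2 − n^2, 2mn, m^2 + n^2) when m and n are both odd.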

Exercises.
Exercise 30.8. Find formulas for the integers of all Pythagorean triples (x, y, z) with
z = y + 1.
Exercise 30.9. Use the classification of primitive Pythagorean triples to show that if (x, y, z)
is a PPT, then at least one of x, y, and z is divisible by 4.

31. Fermat’s Last Theorem and Infinite Descent

We have fully described the solutions to the equation x^2 + y^2 = z^2, so now it is natural to
consider
x^3 + y^3 = z^3, x^4 + y^4 = z^4, x^n + y^n = z^n, n ≥ 3
and ask if it is possible to describe all solutions. It took mathematicians around 350 years
to solve this problem. A proof was completed by Andrew Wiles in 1995 and was one
of the greatest mathematical achievements of the 20th century.
Theorem 48 (Fermat’s Last Theorem). Let a, b, c ∈ Z satisfy the Fermat equation
a^n + b^n = c^n
with n ≥ 3. Then abc = 0.

To prove FLT, it is enough to consider the case n = 4 or n = p, for p an odd prime. Indeed,
every n ≥ 3 is divisible by 4 or by an odd prime, so we can write n = ab where b = 4 or b = p
is an odd prime. Therefore, from a solution x0^n + y0^n = z0^n we get (x0^a)^b + (y0^a)^b = (z0^a)^b,
that is, a solution for exponent b. So, if we show that x^b + y^b = z^b has no solutions in non-zero
integers, then the original equation x^n + y^n = z^n also cannot have solutions in non-zero integers.
We shall shortly prove Fermat’s Last Theorem for n = 4 by combining Theorem 47 with a
method called infinite descent due to Fermat.
Let us first sketch the main idea of infinite descent. Suppose x0, y0, z0 is an integral solution
to the equation x^4 + y^4 = z^4 such that x0 y0 z0 ≠ 0. Starting from this solution, we construct
another solution to the same equation in non-zero integers x1, y1, z1 with the property that
0 < z1 < z0. Then, from x1, y1, z1, we construct another solution in non-zero integers x2, y2, z2
such that 0 < z2 < z1 < z0. Repeating this procedure, the values of zi form a strictly decreasing
infinite sequence of positive integers; this is clearly impossible, therefore the original solution
x0, y0, z0 cannot exist.
To illustrate infinite descent in a simpler situation, we will prove the following fact.

Theorem 49. The number √2 is not rational.

Proof. Suppose √2 ∈ Q. Then there are positive integers p and q such that √2 = p/q. Thus,
squaring both sides leads to
2 = p^2/q^2 ⇐⇒ 2q^2 = p^2 ⟹ 2 ∣ p^2 ⟹ 2 ∣ p,
so p = 2r for some r ∈ Z>0. Then,
2q^2 = (2r)^2 = 4r^2 ⇐⇒ q^2 = 2r^2,
hence, as above, 2 ∣ q, i.e. q = 2s for some s ∈ Z>0. Therefore,
p/q = 2r/2s = r/s, with 0 < r < p, 0 < s < q.
Now, starting from √2 = r/s and arguing as above, we get r′, s′ such that √2 = r′/s′ with
0 < r′ < r < p and 0 < s′ < s < q. Continuing this procedure generates a strictly decreasing
infinite sequence of positive integers, which is a contradiction. □
Finally, we will prove the following theorem, which implies FLT for n = 4.
Theorem 50 (Fermat). The equation x^4 + y^4 = z^2 has no solutions in non-zero integers.
Corollary 23. FLT holds for exponent n = 4.

Proof of Corollary. Suppose x0, y0, z0 is a solution in non-zero integers to x^4 + y^4 = z^4. Then,
x0^4 + y0^4 = z0^4 ⇐⇒ x0^4 + y0^4 = (z0^2)^2,
so that x0, y0, z0^2 is a solution in non-zero integers to x^4 + y^4 = z^2, contradicting Theorem 50.
□

Proof of Theorem 50. Suppose x1, y1, z1 satisfy x1^4 + y1^4 = z1^2 and x1 y1 z1 ≠ 0. Since the
exponents are even we can assume x1, y1, z1 > 0. Further, we can assume (x1, y1) = 1. Indeed, if
d = (x1, y1) and x1 = d x1′, y1 = d y1′, then (x1′, y1′) = 1, d^4 divides z1^2 so that d^2 ∣ z1, and
x1′, y1′, z1/d^2 also satisfies the equation, because
d^4 x1′^4 + d^4 y1′^4 = z1^2 ⇐⇒ x1′^4 + y1′^4 = (z1/d^2)^2.
We will show there is another solution x2, y2, z2 > 0 such that (x2, y2) = 1 and z2 < z1.
Note that
x1^4 + y1^4 = (x1^2)^2 + (y1^2)^2 = z1^2 and (x1^2, y1^2, z1) = 1,
so that x1^2, y1^2, z1 forms a primitive Pythagorean triple. Further, we know that, by swapping
x1 and y1 if necessary, we can assume y1^2 to be even, hence y1 is even and x1 odd. Then,
from Theorem 47, there are coprime integers m > n > 0 with different parity such that
x1^2 = m^2 − n^2, y1^2 = 2mn, z1 = m^2 + n^2.
In particular, x1^2 + n^2 = m^2 so that x1, n, m forms a primitive Pythagorean triple with n even
(because x1 is odd). Again from Theorem 47, there are coprime integers a > b > 0 with different
parity such that
x1 = a^2 − b^2, n = 2ab, m = a^2 + b^2.
We claim that m, a and b are squares, that is, there are positive integers z2, x2, y2 such that
m = z2^2, a = x2^2, b = y2^2
with (x2, y2) = 1 since (a, b) = 1. Finally, from m = a^2 + b^2 and the claim, we obtain
x2^4 + y2^4 = z2^2, meaning that x2, y2, z2 give a solution to the equation x^4 + y^4 = z^2
satisfying (x2, y2) = 1 and x2 y2 z2 ≠ 0. Furthermore,
0 < z2 ≤ z2^2 = m ≤ m^2 < m^2 + n^2 = z1,
as desired. A contradiction now follows by infinite descent as explained above.
We will now prove the claim. We have to show m, a, b are squares. Recall that
y1^2 = 2mn = m(2n), (m, 2n) = 1, m > 0, 2n > 0,
where (m, 2n) = 1 because m is coprime to n and odd (n is even and m, n have different parity);
hence m and 2n are squares by Proposition 32. Since 2n is an even square, there is an integer c > 0
such that 2n = 4c^2, hence n = 2c^2. Now
n = 2ab ⇐⇒ 2c^2 = 2ab ⟹ ab = c^2
and, since a and b are positive and coprime, by Proposition 32 they must be squares. □

Exercises.
Exercise 31.1. Prove that there is at most one square in any Pythagorean triple.

32. Fermat Factorization

Fermat factorization, named after Pierre de Fermat, is a factorization method based on the
representation of an odd integer as the difference of two squares
n = a^2 − b^2 = (a − b)(a + b)
and, if neither factor a − b nor a + b equals 1, this is a proper factorization of n.
Lemma 13. Let n ∈ Z>0 be odd. Then there is a one-to-one correspondence between factorizations
of n into two positive odd numbers and differences of squares that equal n.

Proof. Let n = ab, with a ≥ b > 0 odd. We set
s = (a + b)/2 and t = (a − b)/2.
Since a, b are both odd, we note that s, t ∈ Z. It follows that s^2 − t^2 = (s + t)(s − t) = ab = n.
Conversely, a difference of squares n = s^2 − t^2 with s > t ≥ 0 gives the factorization
n = (s + t)(s − t), which recovers a = s + t and b = s − t. □

Based on this lemma, to apply Fermat factorization, one tries various values t, hoping that
t^2 − n is a square. More precisely, for n > 0 odd, we apply the following steps:

(i) Find the smallest integer t ≥ √n.
(ii) Consider the sequence of numbers
t^2 − n, (t + 1)^2 − n, (t + 2)^2 − n, . . .
until a square s0^2 = (t + k)^2 − n is found.
(iii) Let t0 = t + k. We have
n = t0^2 − s0^2 = (t0 + s0)(t0 − s0).
The procedure in step (ii) will terminate since
n = ((n + 1)/2)^2 − ((n − 1)/2)^2 ⇐⇒ ((n + 1)/2)^2 − n = ((n − 1)/2)^2
and
(n + 1)/2 ≥ √n.
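Steps (i) to (iii) can be sketched in Python as follows. This is an illustrative sketch rather than part of the notes: the function name and the returned step count k are assumptions made for the example, and `math.isqrt` is used so that all arithmetic stays in exact integers.

```python
from math import isqrt


def fermat_factor(n):
    """Return (a, b, k) with n = a * b and a <= b, where k counts the increments of t."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    # Step (i): smallest integer t with t >= sqrt(n).
    t = isqrt(n)
    if t * t < n:
        t += 1
    k = 0
    while True:
        # Step (ii): test whether (t + k)^2 - n is a perfect square.
        s_squared = (t + k) ** 2 - n
        s = isqrt(s_squared)
        if s * s == s_squared:
            # Step (iii): n = t0^2 - s^2 = (t0 - s)(t0 + s).
            t0 = t + k
            return t0 - s, t0 + s, k
        k += 1


print(fermat_factor(6077))
```

The loop always terminates by the argument above; for a prime n it only stops at the trivial factorization 1 ⋅ n, after roughly n/2 iterations.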
Corollary 24. Successive applications of Fermat’s factorization will factor n completely. In
particular, if n = pq one application suffices.

Example 32.1. Take n = 6077. The smallest integer t such that t ≥ √n is t = 78. We
therefore compute the sequence of numbers (t + k)^2 − n for k ≥ 0 until a square is found.
t^2 − n = 78^2 − 6077 = 7
(t + 1)^2 − n = 79^2 − 6077 = 164
(t + 2)^2 − n = 80^2 − 6077 = 323
(t + 3)^2 − n = 81^2 − 6077 = 484 = 22^2.
We conclude that
6077 = 81^2 − 22^2 = (81 + 22)(81 − 22) = 103 ⋅ 59.
Example 32.2. Fermat’s factorization works best when n has factors which are close to
each other. Let us consider the extreme case, where n = pq with p, q being ‘twin primes’,
that is, p and q = p + 2 are consecutive odd numbers.
We first note that p < √n < p + 1. Indeed, suppose for contradiction that √n ≤ p. Then
n = √n √n ≤ p^2, which is impossible because n = pq = p^2 + 2p. Similarly, suppose √n ≥ p + 1.
It follows that n ≥ (p + 1)^2 = p^2 + 2p + 1 > p^2 + 2p = n, a contradiction. So t = p + 1 is the
smallest integer ≥ √n. Next, we compute the numbers (t + k)^2 − n for k ≥ 0 until a square is found.
Indeed, we see that
t^2 − n = (p + 1)^2 − p(p + 2) = p^2 + 2p + 1 − p^2 − 2p = 1 = 1^2
so we stop in one step. We conclude that
n = (p + 1)^2 − 1^2 = (p + 1 − 1)(p + 1 + 1) = pq.

Exercises.
Exercise 32.3. Using the Fermat factorization method, factor 8051.

33. The Pollard p − 1 Factorization Method

We will now introduce a factorization method due to John Pollard. Let n be a large integer
and compute R_k ≡ 2^(k!) (mod n) recursively using fast modular exponentiation and the formula
R_k ≡ R_{k−1}^k (mod n).
At each step, compute (R_k − 1, n) with the Euclidean Algorithm. Since 0 ≤ R_k ≤ n − 1, we
have R_k − 1 < n. Hence, if (R_k − 1, n) > 1 and R_k ≠ 1, we have found a proper divisor of n.
Why does this work? Suppose p is an odd prime dividing n and p − 1 ∣ k! for some k. Note this is
true at least for k ≥ p − 1. Hence, there exists a ∈ Z such that k! = (p − 1)a, and we have
2^(k!) = 2^((p−1)a) = (2^(p−1))^a ≡ 1^a ≡ 1 (mod p)
by Fermat’s Little Theorem. It follows that p divides 2^(k!) − 1. Since R_k ≡ 2^(k!) (mod n), we also have
R_k = 2^(k!) + bn
for some b ∈ Z. Then R_k − 1 = (2^(k!) − 1) + bn, which implies p ∣ (R_k − 1) since p ∣ n and
p ∣ (2^(k!) − 1). Therefore p ∣ (R_k − 1, n).
Example 33.1. Consider n = 10403. We compute
R_k (mod n)                              (n, R_k − 1)
R_2 ≡ 2^2 ≡ 4 (mod n)                    (n, 3) = 1
R_3 ≡ 4^3 ≡ 64 (mod n)                   (n, 63) = 1
R_4 ≡ 64^4 ≡ 7580 (mod n)                (n, 7579) = 1
R_5 ≡ 7580^5 ≡ 4438 (mod n)              (n, 4437) = 1
R_6 ≡ 4438^6 ≡ 6862 (mod n)              (n, 6861) = 1
R_7 ≡ 6862^7 ≡ 137 (mod n)               (n, 136) = 1
R_8 ≡ 137^8 ≡ 196 (mod n)                (n, 195) = 1
R_9 ≡ 196^9 ≡ 3619 (mod n)               (n, 3618) = 1
R_10 ≡ 3619^10 ≡ 9798 (mod n)            (n, 9797) = 101.
Since (n, 9797) = 101 > 1 we divide 10403 by 101 to get the factorization 10403 = 101 ⋅ 103.
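The recursion and the gcd test can be sketched in Python as below. This is a minimal illustration, not a production implementation: the function name, the `base` parameter and the cut-off `max_k` are assumptions made for the example.

```python
from math import gcd


def pollard_p_minus_1(n, base=2, max_k=1000):
    """Return (divisor, k) for the first k with 1 < gcd(R_k - 1, n) < n, else None."""
    r = base % n  # R_1 = base^(1!) mod n
    for k in range(2, max_k + 1):
        r = pow(r, k, n)  # R_k = R_{k-1}^k mod n, by fast modular exponentiation
        g = gcd(r - 1, n)
        if 1 < g < n:
            return g, k
        if g == n:
            # R_k = 1 (mod n): every prime factor was hit at once, so this base fails.
            return None
    return None


print(pollard_p_minus_1(10403))
```

Returning `None` when the gcd equals n makes the failure case explicit; in that situation one could retry with a different base.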

Note that a suitable k always exists (for instance k = p − 1), but a large k is not practical. The
Pollard p − 1 factorization method works well if we can find a small k such that p − 1 ∣ k! for
some prime p ∣ n. This is likely to happen when p − 1 has only small prime factors.
Example 33.2. In the previous example, n = 10403 has the prime factor p = 101. We note
that p − 1 = 100 = 2^2 ⋅ 5^2 and 100 ∣ k! for k ≥ 10, so the method finds a factor at k = 10.

Of course, we can replace 2 by any other base b ≥ 2. Lastly, we note that in practice, this
method is used after trial division by small primes and before harder methods (which are not
part of these notes!).

Exercises.
Exercise 33.3. Use the Pollard p − 1 method to find a divisor of 689.
34. Cryptography

Suppose two friends, Alice and Bob, wish to communicate over an insecure channel in such
a way that their opponent Eve cannot understand or change what is being said. To keep
their conversation secure, Alice and Bob must consider the tools they are using to ensure
that their messages are kept secret, as well as the possible attacks on these tools to find out
their weaknesses.
The information that Alice wants to send to Bob is called the plaintext. This is simply
data that can be read and understood without any special measures. Using a key, Alice will
encrypt the plaintext to obtain a ciphertext. To the unknowing observer, ciphertext appears
as unreadable gibberish. However, Bob, who knows the key, can decrypt the ciphertext to
obtain the original message from Alice. The following figure illustrates this process.

Plaintext ──Encryption──→ Ciphertext
Plaintext ←──Decryption── Ciphertext

Definition 34.1. Cryptography is the design and implementation of secure systems. Crypt-
analysis is the process of breaking secure systems. The science that encompasses both of
these ideas is called cryptology.

The above process requires that both Alice and Bob have access to this key. However,
this key needs to be kept secret otherwise third parties such as Eve can use the key to
decrypt their messages. Encryption algorithms which have this property are called symmetric
cryptosystems or private key cryptosystems. There is a form of cryptography which uses two
different types of keys, one which is publicly available and used for encryption whilst the
other is private and used for decryption. These latter types of cryptosystems are called
asymmetric cryptosystems or public key cryptosystems. We will return to these types of
cryptosystems later in this section.
In this section, we use the mathematical techniques that we have thus far learned to encrypt
and decrypt messages that we wish to be kept secret. We will describe some historical
encryption methods that were used in the pre-computer era to encrypt data, as well as the
attacks on them.
Definition 34.2. A cryptosystem is made up of
● P: the set of all plaintext messages,
● C: the set of all ciphertext messages,
● K: the set of all keys,
and a correspondence
k ↦ (Ek, Dk), for each k ∈ K
where
Ek : P → C, the Encryption function and
Dk : C → P, the Decryption function.
These functions satisfy
Dk(Ek(x)) = x, ∀x ∈ P.

In the private key cryptosystem described above, Eve wants to know what information
Bob and Alice are exchanging, and can attempt to decipher their messages and change the
information being sent between the two. To keep their messages secret from Eve, Alice and
Bob will first choose a random key k ∈ K. Then, to send a message to Bob over an insecure
channel, Alice will encrypt her message using Ek . That is, if the message is a string
x = x1 x2 ⋯xn ,
for some integer n > 0, where each xi ∈ P, then she will encrypt each xi as yi = Ek (xi ) and
send the resulting ciphertext
y = y1 y2 ⋯yn
to Bob. When Bob receives y, he deciphers it using Dk . Applying this protocol, their message
should remain secret from Eve, provided that she is not able to determine the key k. In the
following sections, we study classical cryptosystems based on congruences.

34.1. The Shift Cipher. When Julius Caesar sent messages to his generals, he did not
trust his messengers. So he replaced every A in his messages with a D, every B with an E,
and so on through the alphabet. Only someone who knew the “shift by 3” rule could decipher
his messages. This simple encryption algorithm is known as the Caesar cipher. Of course,
one could shift the alphabet by any arbitrary number. Such a generalization of Caesar’s
cipher is called a shift cipher.
Before describing this encryption algorithm, we must first translate the letters of the English
alphabet into numbers as follows.
A B C D E F G H I J K L M
0 1 2 3 4 5 6 7 8 9 10 11 12
N O P Q R S T U V W X Y Z
13 14 15 16 17 18 19 20 21 22 23 24 25
Note that we could extend this list by including symbols and numbers. For now, however,
we will just use the alphabet. In this case, P = C = K = Z/26Z. Let b ∈ K so that
b ∈ {0, 1, . . . , 25}.
Definition 34.3. The shift cipher is described via the correspondence
b ↦ Eb(x) = x + b (mod 26), Db(x) = x − b (mod 26)
where the key b ∈ K is fixed and secret.
Example 34.4. Suppose Alice wants to send the message “MEET AT FOUR” to Bob using
a shift cipher with the key b = 3. This plaintext may be represented numerically as
MEET AT FOUR → 12 04 04 19 00 19 05 14 20 17.
Applying the shift cipher E3(x) = x + 3 (mod 26) to each of the above numbers yields the
ciphertext
15 07 07 22 03 22 08 17 23 20 → PHHWDWIRXU,
where “PHHWDWIRXU” is the corresponding alphabetic representation of the ciphertext.
Hence, Alice sends the message “PHHWDWIRXU” to Bob.
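A shift cipher is short enough to write out in full. The sketch below is illustrative only (the function names are not from the notes); it assumes the A = 0, ..., Z = 25 encoding from the table above and silently drops spaces and punctuation, as in the example.

```python
import string

ALPHABET = string.ascii_uppercase  # A = 0, B = 1, ..., Z = 25


def shift_encrypt(plaintext, b):
    """Apply E_b(x) = x + b (mod 26) to every letter, ignoring other characters."""
    letters = [ch for ch in plaintext.upper() if ch in ALPHABET]
    return "".join(ALPHABET[(ALPHABET.index(ch) + b) % 26] for ch in letters)


def shift_decrypt(ciphertext, b):
    """Apply D_b(x) = x - b (mod 26), which is the shift by -b."""
    return shift_encrypt(ciphertext, -b)


print(shift_encrypt("MEET AT FOUR", 3))
print(shift_decrypt("PHHWDWIRXU", 3))
```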
Example 34.5. Using a shift cipher with the key b = 19, suppose Alice receives the message
BEHOXGBVDXEUTVDIEXTLXWHGMMXEETGRHGX
from Bob. Numerically, this corresponds to
01 04 07 14 23 06 01 21 03 23 04 20 19 21 03 08
04 23 19 11 23 22 07 06 12 12 23 04 04 19 06 17 07 06 23
To translate this back into plaintext, Alice uses the decryption function D19(x) = x − 19
(mod 26) to obtain
08 11 14 21 04 13 08 02 10 04 11 01 00 02 10 15
11 04 00 18 04 03 14 13 19 19 04 11 11 00 13 24 14 13 04,
so that Alice deciphers the message as
I LOVE NICKELBACK PLEASE DONT TELL ANYONE.

The shift cipher is easy to break as soon as one understands the statistics of the underlying
language, in our case English. The distribution of English letter frequencies is described in
the table below.
Letter Percentage Letter Percentage
A 8.2 N 6.7
B 1.5 O 7.5
C 2.8 P 1.9
D 4.2 Q 0.1
E 12.7 R 6.0
F 2.2 S 6.3
G 2.0 T 9.0
H 6.1 U 2.8
I 7.0 V 1.0
J 0.1 W 2.4
K 0.8 X 0.1
L 4.0 Y 2.0
M 2.4 Z 0.1
To break a shift cipher, we compute the frequencies of the letters in the ciphertext and
compare them with the frequencies obtained from English.
For instance, suppose Eve intercepts the ciphertext
PTLKPAHALHASVUKVUKYBNZLCLYFTVYUPUN.
Suppose further that she knows P, C and that an encryption function of the form
Eb(x) = x + b (mod 26)
was used. She wants to find b. To proceed, Eve begins by translating the ciphertext into its
numerical equivalent as
15 19 11 10 15 00 07 00 11 07 00 18 21 20 10 21 20 10 24 01 13 25 11 02 11 24 05 19 21 24 20 15 20 13.
Looking at the frequency of each letter appearing in the ciphertext, we note that the letters
L and U each occur four times. Since the most common letter in English is ‘E’,
it is reasonable to guess that L or U corresponds to E. Suppose first that E is encrypted
as U. That is,
Eb(4) = 4 + b ≡ 20 (mod 26) ⟹ b = 16.
Using this key, Eve decrypts the message as
25 03 21 20 25 10 17 10 21 17 10 02 05 04 20 05 04 20 08 11 23 09 21 12 21 08 15 03 05 08 04 25 04 23
which corresponds to
ZDVUZKRKVRKCFEUFEUILXJVMVIPDFIEZEX.
Of course, this is just nonsense so we suppose instead that E is encrypted as L. That is,
Eb(4) = 4 + b ≡ 11 (mod 26) ⟹ b = 7.
Using this key, Eve decrypts the message as
08 12 04 03 08 19 00 19 04 00 19 11 14 13 03 14 13 03 17 20 06 18 04 21 04 17 24 12 14 17 13 08 13 06.
This corresponds to
IMEDITATEATLONDONDRUGSEVERYMORNING
so that Eve deciphers the message as “I meditate at London Drugs every morning.”
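Eve's guessing procedure can be sketched in Python. The function names are illustrative, and the strategy shown (map each most frequent ciphertext letter to E) is only the first step of a frequency analysis; on short messages it may produce several candidates or none that work.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def candidate_keys(ciphertext, likely_plain="E"):
    """Keys b for which one of the most frequent ciphertext letters decrypts to likely_plain."""
    counts = Counter(ciphertext)
    top = max(counts.values())
    most_frequent = sorted(ch for ch, c in counts.items() if c == top)
    p = ALPHABET.index(likely_plain)
    return [(ALPHABET.index(ch) - p) % 26 for ch in most_frequent]


def shift_decrypt(ciphertext, b):
    """Apply D_b(x) = x - b (mod 26) to an all-uppercase ciphertext."""
    return "".join(ALPHABET[(ALPHABET.index(ch) - b) % 26] for ch in ciphertext)


intercepted = "PTLKPAHALHASVUKVUKYBNZLCLYFTVYUPUN"
for b in candidate_keys(intercepted):
    print(b, shift_decrypt(intercepted, b))
```

Since there are only 26 keys, trying every b and reading the outputs is an equally practical attack.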

34.2. The Affine Cipher. A generalization of the shift cipher is the affine cipher. In this
case, the key is (a, b, n) where a and n are coprime. We will denote the key simply by (a, b)
when the value of n is clear from the context.
Definition 34.6. The corresponding encryption function for the affine cipher is
(a, b) ↦ Ea,b(x) = ax + b (mod n).

Of course, when a = 1, we recover the shift cipher.

We note that these ciphers can also be broken by frequency analysis, but unlike the shift
cipher, we now need two pieces of information, namely two plaintext-ciphertext letter
correspondences. Indeed, we want to find
Da,b(y) = cy + d (mod n)
satisfying
Da,b(Ea,b(x)) ≡ x (mod n) ∀x ∈ P ⇐⇒ c(ax + b) + d = cax + cb + d ≡ x (mod n).
Since this congruence must hold for all x ∈ P, taking
x ≡ 0 (mod n) yields cb + d ≡ 0 (mod n), and
x ≡ 1 (mod n) yields ca + cb + d ≡ 1 (mod n).
Combining these, we obtain
c ≡ a^(−1) (mod n) and d ≡ −a^(−1) b (mod n),
where a^(−1) exists because a and n are coprime. However, since we do not have access to (a, b),
we cannot determine Da,b directly from these formulas and therefore need another way.
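For someone who does hold the key, the formulas for c and d translate directly into code. The following Python sketch is illustrative (function names are assumptions); it works on lists of residues and relies on the three-argument `pow` with exponent −1, which computes a modular inverse in Python 3.8 and later.

```python
from math import gcd


def affine_encrypt(xs, a, b, n=26):
    """Apply E(x) = a*x + b (mod n) to each residue in xs."""
    if gcd(a, n) != 1:
        raise ValueError("a must be coprime to n")
    return [(a * x + b) % n for x in xs]


def affine_decrypt(ys, a, b, n=26):
    """Apply D(y) = c*y + d (mod n) with c = a^(-1) and d = -a^(-1) * b."""
    c = pow(a, -1, n)  # modular inverse; raises ValueError if gcd(a, n) != 1
    d = (-c * b) % n
    return [(c * y + d) % n for y in ys]


plain = list(range(26))
cipher = affine_encrypt(plain, 11, 7)
print(affine_decrypt(cipher, 11, 7) == plain)
```

The key (11, 7) is chosen only for illustration; any a coprime to n works.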
Example 34.7. Suppose we intercepted ciphertext
23 16 07 03 25 08 06 25 10 17 20 07 24 10 12 05 20 08 17 25 12

08 06 25 23 25 07 12 25 08 06 25 04 05 11 07 21 25 23 05 10 08
06 25 23 08 07 12 23 06 17 16 25 20 08 25 12 16 12 17 23 25.
This corresponds to
XQHDZIGZKRUHYKMFUIRZMIGZXZHMZIG

ZEFLHVZXFKIGZXIHMXGRQZUIZMQMRXZ
where the most common letters are Z and I. The most frequent letters in English are E and
T, so we try
E → Z (4 → 25) and T → I (19 → 8) under Ea,b.
Therefore, Da,b(y) = cy + d must satisfy
Da,b(25) ≡ 4 ⇐⇒ 25c + d ≡ 4 (mod 26)
Da,b(8) ≡ 19 ⇐⇒ 8c + d ≡ 19 (mod 26).
Subtracting the second congruence from the first gives
17c ≡ −15 ≡ 11 (mod 26).
Since 17^(−1) ≡ 23 (mod 26), we obtain
c ≡ 11 ⋅ 23 ≡ 19 (mod 26)
d ≡ 4 − 25 ⋅ 19 ≡ 23 (mod 26)
hence Da,b(y) = 19y + 23 (mod 26). If decrypting the intercepted ciphertext message with
this function leads to meaningful text, we conclude that
E ↔ Z and T ↔ I
was the correct guess. Indeed, using (19, 23) as our decryption key, we obtain
18 15 00 02 04 19 07 04 05 08 13 00 11 05 17 14 13 19 08 04 17

19 07 04 18 04 00 17 04 19 07 04 21 14 24 00 06 04 18 14 05 19
07 04 18 19 00 17 18 07 08 15 04 13 19 04 17 15 17 08 18 04,
which corresponds to the plaintext
SPACE THE FINAL FRONTIER

THESE ARE THE VOYAGES OF THE STARSHIP ENTERPRISE.
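The key recovery in this example amounts to solving two linear congruences. A minimal Python sketch follows; the function names are illustrative, and the approach assumes the difference of the two ciphertext values is invertible modulo n (otherwise `pow` raises `ValueError`).

```python
import string

ALPHABET = string.ascii_uppercase


def recover_decryption_key(pair1, pair2, n=26):
    """Solve c*y1 + d = x1 and c*y2 + d = x2 (mod n); each pair is (ciphertext, plaintext)."""
    (y1, x1), (y2, x2) = pair1, pair2
    c = ((x1 - x2) * pow(y1 - y2, -1, n)) % n  # requires gcd(y1 - y2, n) = 1
    d = (x1 - c * y1) % n
    return c, d


def affine_apply(text, c, d, n=26):
    """Apply y -> c*y + d (mod n) letter by letter."""
    return "".join(ALPHABET[(c * ALPHABET.index(ch) + d) % n] for ch in text)


# Guess from frequency analysis: Z (25) decrypts to E (4), I (8) decrypts to T (19).
c, d = recover_decryption_key((25, 4), (8, 19))
print(c, d)
print(affine_apply("XQHDZIGZKRUHYKMFUIRZM", c, d))
```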


Sometimes we may need more than two pieces of information to break the affine cipher. Suppose
we enlarge our alphabet to 28 symbols by adding
blank space = 26
? = 27
and we use the affine cipher
f(x) = ax + b (mod 28).
We want to find
g(y) = cy + d (mod 28),
which is the decryption of f.
Example 34.8. Suppose we intercept the ciphertext
27 10 17 18 11 13 15 11 18 10 01 11 17 13 12 15 06 01 01 26 11 24 01 22 03 23
13 18 05 11 03 04 11 17 00 11 21 00 22 17 26 01 00 11 15 27 17 22 22 03 27 14
which corresponds to the message
?KRSLNPLSKBLRNMPGBB LYBWDXNSFLDELRALVAWR BALP?RWWD?O.
Suppose by frequency analysis, we know
g: blank space → D (26 → 3) and g: O → ? (14 → 27).
That is,
26c + d ≡ 3 (mod 28)
14c + d ≡ 27 (mod 28).
Subtracting the second congruence from the first gives
12c ≡ −24 ≡ 4 (mod 28).
Since (12, 28) = 4 divides 4, this congruence has 4 solutions, c ≡ 5, 12, 19, 26 (mod 28). As g
must be invertible, c must be coprime to 28, which leaves two candidates:
c ≡ 5, d ≡ 13 or c ≡ 19, d ≡ 13.

At this point we can

(1) Decipher the text with both and see if at least one makes sense;
(2) Continue with the next letter on the frequency analysis. Since L is the most frequent
letter in the ciphertext, we suppose that
g: L → blank space (11 → 26),
so that
g(11) = 11c + d ≡ 26 (mod 28) ⟹ c = 19, d = 13,
and we decrypt the message to obtain the plaintext
22 07 00 19 26 08 18 26 19 07 04 26 00 08 17 18 15 04 04 03 26 21 04 11 14 02
08 19 24 26 14 05 26 00 13 26 20 13 11 00 03 04 13 26 18 22 00 11 11 14 22 27,
corresponding to
WHAT IS THE AIRSPEED VELOCITY OF AN UNLADEN SWALLOW?
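When the linear congruence for c is not uniquely solvable, a brute-force search over all invertible keys is a simple alternative for such a small modulus. The Python sketch below is illustrative: the function names are assumptions, and the symbol string encodes the 28-symbol alphabet of this example (A to Z, blank space, question mark).

```python
from math import gcd
import string

SYMBOLS = string.ascii_uppercase + " ?"  # blank space = 26, ? = 27


def candidate_keys(pairs, n):
    """All (c, d) with gcd(c, n) = 1 and c*y + d = x (mod n) for every (y, x) in pairs."""
    return [(c, d)
            for c in range(n) if gcd(c, n) == 1
            for d in range(n)
            if all((c * y + d) % n == x for y, x in pairs)]


def decrypt(text, c, d):
    """Apply g(y) = c*y + d (mod 28) symbol by symbol."""
    n = len(SYMBOLS)
    return "".join(SYMBOLS[(c * SYMBOLS.index(ch) + d) % n] for ch in text)


two_guesses = [(26, 3), (14, 27)]          # blank -> D, O -> ?
three_guesses = two_guesses + [(11, 26)]   # L -> blank
print(candidate_keys(two_guesses, 28))
print(candidate_keys(three_guesses, 28))
```

With two guesses the search leaves two invertible candidates; the third guess singles out one of them.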
34.3. Exponential Ciphers.
Definition 34.9. Let p be a prime. We encrypt x (mod p) via the exponential cipher
f(x) = x^e (mod p),
where (p, e) is the encryption key with e ∈ Z such that (e, p − 1) = 1. The decryption
transformation is the exponential function
g(y) = y^d (mod p)
where d ≡ e^(−1) (mod p − 1).

Indeed, since d ≡ e^(−1) (mod p − 1), we have that de ≡ 1 (mod p − 1) and hence de = 1 + k(p − 1)
for some integer k. It follows that
g(f(x)) = g(x^e)
≡ (x^e)^d (mod p)
≡ x^(de) (mod p)
≡ x^(1+k(p−1)) (mod p)
≡ x^1 (x^(p−1))^k (mod p)
≡ x (mod p) by Fermat’s Little Theorem, since x ≢ 0 (mod p).
Note that if x ≡ 0 (mod p), then g(f(x)) = g(0) ≡ 0 (mod p) also.
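The identity g(f(x)) ≡ x can be confirmed by machine for a specific key. The Python sketch below is illustrative (function names are assumptions); it uses p = 2633 and e = 29, and checks the round trip for every residue modulo p.

```python
from math import gcd


def decryption_exponent(p, e):
    """Return d with e*d = 1 (mod p - 1); requires gcd(e, p - 1) = 1."""
    if gcd(e, p - 1) != 1:
        raise ValueError("e must be coprime to p - 1")
    return pow(e, -1, p - 1)  # modular inverse, Python 3.8+


def exp_encrypt(x, p, e):
    return pow(x, e, p)  # f(x) = x^e mod p


def exp_decrypt(y, p, d):
    return pow(y, d, p)  # g(y) = y^d mod p


p, e = 2633, 29
d = decryption_exponent(p, e)
print(all(exp_decrypt(exp_encrypt(x, p, e), p, d) == x for x in range(p)))
```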
To use this cipher, both Bob and Alice must know the key (p, e), which is kept secret. We
use the usual correspondence with a leading zero added (if necessary) so that every number
has two digits. That is
A B C D E F G H I J K L M
00 01 02 03 04 05 06 07 08 09 10 11 12
N O P Q R S T U V W X Y Z
13 14 15 16 17 18 19 20 21 22 23 24 25
Example 34.10.
E X A M P L E
04 23 00 12 15 11 04

Next, we group the resulting numbers into blocks of 2m digits, where 2m is the largest even
positive integer such that all blocks are < p. We choose our blocks in this way so that the
numerical value of each block does not get reduced modulo p. For instance, the word BB
corresponds to 0101, and the word LJ corresponds to 1110. If we choose p = 1009 and
2m = 4, then 0101 ≡ 1110 (mod p), so that BB is indistinguishable from LJ. In this case,
for this value of p, the correct choice of block length is 2m = 2. On the other hand, if
2525 < p < 252525, then m = 2. Note that the largest value of a 2-letter word is ZZ, which
corresponds to 2525.
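The blocking rule can be sketched in Python as follows. The function names are illustrative, and padding an incomplete final block with the letter Z (25) is an assumption made for this sketch.

```python
def letters_per_block(p):
    """Largest m such that the m-letter word ZZ...Z (digits 2525...25) is < p."""
    m = 0
    while int("25" * (m + 1)) < p:
        m += 1
    return m


def to_blocks(text, p, filler="Z"):
    """Encode letters as two-digit numbers and group them into blocks of 2m digits."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    m = letters_per_block(p)
    if m == 0:
        raise ValueError("p is too small to hold even one letter per block")
    while len(letters) % m:
        letters.append(filler)  # assumed padding convention
    digits = ["%02d" % (ord(ch) - ord("A")) for ch in letters]
    return ["".join(digits[i:i + m]) for i in range(0, len(digits), m)]


print(to_blocks("EXAMPLE", 2633))
print(to_blocks("BB", 1009))
```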
Example 34.11. Take p = 2633 and e = 29 so that (2632, 29) = 1 and m = 2. In the example
above, we group the blocks
0423 0012 1511 0425
where the last 25 is used to fill the last block so that every block has 4 digits. Using f(x) = x^29
(mod 2633) yields the following ciphertext
2437 2425 1729 0687.

In order to decrypt an exponential cipher, one must compute d satisfying ed ≡ 1 (mod p − 1)
and apply the function g(y) = y^d (mod p) to each block. Exponential ciphers resist
frequency analysis, making them harder to break than affine ciphers.
Indeed, suppose Eve knows p and that the plaintext x corresponds to ciphertext y. Then
she must solve for d in the equation
y^d ≡ x (mod p)
to obtain the decryption key d. By analogy with the real numbers, we call this the “discrete
log problem” because
d = log_y(x)
if we were working in R. In general, no efficient classical algorithm is known for computing
discrete logarithms. The simplest approach to compute d is to raise y to larger and larger
powers k until y^k ≡ x (mod p); however, such an algorithm is practical only for small primes
p.
More sophisticated algorithms exist, usually inspired by similar algorithms for integer
factorization. These algorithms run faster than the naive algorithm, but none of them is known
to run in polynomial time. In other words, this problem is believed to be very hard
computationally for large p; this is what makes exponential ciphers secure.
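The naive search is easy to state in code. The Python sketch below is illustrative (the function name is an assumption) and uses a deliberately tiny prime; its running time grows linearly in p, which is exponential in the number of digits of p.

```python
def naive_dlog(y, x, p):
    """Smallest k >= 1 with y^k = x (mod p), or None if no power of y equals x."""
    power = 1
    for k in range(1, p):
        power = (power * y) % p  # power = y^k mod p
        if power == x % p:
            return k
    return None


print(naive_dlog(2, 9, 11))  # powers of 2 mod 11: 2, 4, 8, 5, 10, 9, ...
print(naive_dlog(3, 2, 11))  # 3 has order 5 mod 11 and never reaches 2
```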
Example 34.12. Suppose that Eve intercepts the ciphertext
1207 2012 0214 1088 0034 1402 1795 1531 0155
0718 0931 2652 2186 2137 0186 1580 0884 2280.
Suppose that additionally, she knows that p = 2707 and that the plaintext 1802 corresponds
to 1207. To decrypt this message, she must solve
1207^d ≡ 1802 (mod 2707).
Using the naive algorithm, Eve must compute 1207^k for k ≥ 1 until 1207^k ≡ 1802 (mod 2707).
In doing so, she finds that this holds for any k in the set
{217, 463, 709, 955, 1201, 1447, 1693, 1939, 2185, 2431, 2677}.
In fact, by Fermat’s Little Theorem, there are infinitely many values k for which
1207^k ≡ 1802 (mod 2707). From here, Eve must test every one of these values as a potential
decryption key d until a sensible plaintext is found. For instance, d = 217 yields
1802 2050 2253 1567 1763 1213 2649 1794 0508
0301 1200 1058 0124 1730 0134 0266 0406 1104,
which is nonsense since 2050 cannot correspond to any pair of letters. Similarly, d = 463
yields
1802 2616 1809 0668 2524 2274 0958 1193 0508
0532 1200 2544 0124 2617 0432 1668 2271 1104,
where again 2616 cannot correspond to any pair of letters. Finally, using d = 709, Eve
computes
1802 1413 0418 0017 0409 2018 1912 2005 0508
1318 1200 0304 0124 1100 2524 1504 1415 1104,
which corresponds to the plaintext
SCONES ARE JUST MUFFINS MADE BY LAZY PEOPLE.

34.4. The RSA Cryptosystem. Up until now, we have looked at cryptosystems that
required both communicating parties to have a copy of the same secret key. There is a form
of cryptography which uses two different types of keys, one which is publicly available and
used for encryption whilst the other is private and used for decryption. These latter types
of cryptosystems are called asymmetric cryptosystems or public key cryptosystems. In this
section, we will discuss one of the first public key cryptosystems, RSA.
RSA is made of the initial letters of the surnames of Ron Rivest, Adi Shamir, and Leonard
Adleman, who first publicly described the algorithm in 1977 (their paper appeared in 1978).
The RSA algorithm is based on the difficulty of finding prime factors of large integers. In such
a system, any person can encrypt a message using the public key of the receiver, but such a
message can be decrypted only with the receiver’s private key. An analogy to this cryptosystem
is that of a locked mailbox with a mail slot. The mail slot is exposed and accessible to the
public: its location (the street address) is, in essence, the public key. Anyone knowing the
street address can go to the door and drop a written message through the slot. However, only
the person who possesses the key can open the mailbox and read the message.
Definition 34.13. To use an RSA cipher, each communicating party must choose two large
distinct primes p and q and an exponent e such that
1 < e < (p − 1)(q − 1) and (e, (p − 1)(q − 1)) = 1.
Let n = pq so that φ(n) = (p − 1)(q − 1) and define d ≡ e^(−1) (mod φ(n)). The (public)
encryption key is (n, e) and the corresponding encryption function is
Ek(x) = x^e (mod n).
The (private) decryption key is (n, d) with decryption function
Dk(x) = x^d (mod n).

The following theorem verifies that Dk does indeed recover the original message.
Theorem 51. We have Dk (Ek (x)) ≡ x (mod n).

Proof. We need to show x^(ed) ≡ x (mod n). By CRT, it is enough to show that
x^(ed) ≡ x (mod p) and x^(ed) ≡ x (mod q).
Suppose first that x ≡ 0 (mod p). Then
x^(ed) ≡ 0 ≡ x (mod p),
and we are done. Suppose now that x ≢ 0 (mod p). By construction, d ≡ e^(−1) (mod φ(n)) so
ed = 1 + φ(n)k = 1 + (p − 1)(q − 1)k
hence
x^(ed) ≡ x^(1+(p−1)(q−1)k) ≡ x (x^(p−1))^((q−1)k) ≡ x (mod p),
where the last equivalence follows by Fermat’s Little Theorem since (x, p) = 1. The same
argument holds for x^(ed) ≡ x (mod q), which completes the proof. □
Example 34.14. Take p = 11, q = 3. Then n = pq = 33 so that φ(n) = (p − 1)(q − 1) = 20.
Choose e = 3 and note that this is a valid choice since (e, (p − 1)(q − 1)) = (3, 20) = 1. In this
case, we find that d ≡ e^(−1) ≡ 7 (mod φ(n)). Hence, the public key is given by (n, e) = (33, 3)
and the private key is (n, d) = (33, 7).
Suppose we want to use this system to encrypt the message “This is an example.” Since
25 < n < 2525, after changing each letter into its corresponding 2-digit number, we group the
resulting numbers into blocks of 2m digits, where m = 1. This means that 2m = 2 is the
largest even positive integer such that all blocks are < n. Hence, we have
19 07 08 18 08 18 00 13 04 23 00 12 15 11 04
Now, for each of these 2-digit numbers x, we compute x^3 (mod 33) to obtain the ciphertext
integers
28 13 17 24 17 24 00 19 31 23 00 12 09 11 31.
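This toy example is small enough to reproduce in a few lines of Python. The sketch below is illustrative only (function names are assumptions, and primes this small offer no security); it derives the private key from p, q, e and checks that decryption undoes encryption.

```python
from math import gcd


def rsa_keys(p, q, e):
    """Return the public key (n, e) and the private key (n, d)."""
    n, phi = p * q, (p - 1) * (q - 1)
    if not (1 < e < phi and gcd(e, phi) == 1):
        raise ValueError("e must satisfy 1 < e < phi(n) and gcd(e, phi(n)) = 1")
    d = pow(e, -1, phi)  # d = e^(-1) mod phi(n), Python 3.8+
    return (n, e), (n, d)


def rsa_apply(blocks, key):
    """Raise every block to the key's exponent modulo n (used for both E_k and D_k)."""
    n, exponent = key
    return [pow(x, exponent, n) for x in blocks]


public, private = rsa_keys(11, 3, 3)
plain = [19, 7, 8, 18, 8, 18, 0, 13, 4, 23, 0, 12, 15, 11, 4]  # THIS IS AN EXAMPLE
cipher = rsa_apply(plain, public)
print(private)
print(cipher)
print(rsa_apply(cipher, private) == plain)
```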
Example 34.15. Consider the system (n, e) = (3127, 11) and suppose we want to encrypt
the message “Number theory is my favourite class.” Converting the plaintext into digits and
separating these digits into blocks yields
1320 1201 0417 1907 0414 1724 0818 1224 0500 2114 2017 0819 0402 1100 1818.
Of course, here, since n = 3127 and 2525 < n < 252525, we take m = 2 so that each block has
2m digits to ensure that each block is < n. Using our encryption key, we compute x^e (mod n)
for each block x. This gives the encrypted ciphertext
1464 2549 0702 1854 1122 2356 1196 2193 2150 0399 1611 1499 1988 0991 0100.

The security of RSA rests on the fact that factoring large integers is believed to be very hard
computationally. Indeed, suppose we factor n = pq, then we can compute φ(n) = (p − 1)(q − 1) and
since (n, e) is public we can find d ≡ e^(−1) (mod φ(n)) via the Euclidean Algorithm. To break RSA
we only need the value of d, which can be computed from φ(n) and e, and not necessarily the
factorization of n. However, the following argument shows that computing the value of φ(n)
is not simpler than factoring n. Indeed, suppose we know both n and φ(n), and say p > q. We have
(i) φ(n) = (p − 1)(q − 1) = pq − p − q + 1 ⇐⇒ p + q = pq − φ(n) + 1 = n − φ(n) + 1,
(ii) p − q = √((p + q)^2 − 4n).
Therefore, with the value of n and φ(n) we compute p + q using (i), then we use (ii) to
compute p − q. Finally we can determine p and q, computing
p = (1/2)((p + q) + (p − q))
q = (1/2)((p + q) − (p − q)),
showing that from knowing φ(n) we can factor n. There have, however, been successful
attacks on particular RSA implementations, which are avoided by setting up the system more
carefully. For example, the primes p and q should not be close because of Fermat
factorization (see Example 32.2); moreover, we should choose p and q such that p − 1 and q − 1
each have a large prime factor, to avoid a successful factorization of n = pq with the Pollard
p − 1 factorization method from Section 33.
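Steps (i) and (ii) can be sketched in Python as below. The function name is illustrative; the sketch assumes n is a product of two distinct primes and that the supplied value of φ(n) is correct, and it raises an error otherwise.

```python
from math import isqrt


def factor_from_phi(n, phi):
    """Recover (p, q) with p > q from n = p*q and phi = (p - 1)(q - 1)."""
    s = n - phi + 1           # (i): p + q
    disc = s * s - 4 * n      # (p + q)^2 - 4n = (p - q)^2
    if disc < 0:
        raise ValueError("inconsistent n and phi")
    r = isqrt(disc)           # (ii): p - q
    if r * r != disc:
        raise ValueError("inconsistent n and phi")
    return (s + r) // 2, (s - r) // 2


print(factor_from_phi(33, 20))
print(factor_from_phi(3127, 3016))
```

The second call uses the modulus n = 3127 = 59 ⋅ 53 from Example 34.15, for which φ(n) = 58 ⋅ 52 = 3016.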

Exercises.
Exercise 34.16. Consider an affine cipher with encryption key (a, b, 26). We say that a
letter with numerical value x is “fixed” if x is enciphered as x. Is it possible to choose a, b
with gcd(a, 26) = 1, so that there is
(a) exactly one fixed letter?
(b) exactly two fixed letters?
(c) exactly three fixed letters?
(d) exactly four fixed letters?
(e) exactly 13 fixed letters?
In each part, give a proof if the answer is “no,” and an example if the answer is “yes”.
Exercise 34.17.
(a) Consider the RSA encryption scheme with public encryption key (2623, 11). Encipher
the message PATIENCE IS A VIRTUE.
(b) Decipher the message 284 926 2489 445 662 2445 926 178 using the encryption key as in
Part (a).
Exercise 34.18. Suppose a cryptanalyst discovers a message P that is not relatively prime
to the enciphering modulus n = pq used in an RSA cipher. (He can confirm this by running
the Euclidean algorithm.) Show that the cryptanalyst can factor n.
