Number Theory Notes
AN INTRODUCTION TO NUMBER THEORY

Contents
1. The Integers
2. Mathematical Induction
3. Divisibility
4. Representation of Integers
5. The Greatest Common Divisor
6. The Euclidean Algorithm
7. Prime Numbers
19. Wilson’s Theorem
20. Fermat’s Little Theorem
21. Primality Testing, Pseudoprimes, and Carmichael Numbers
22. Euler’s φ-Function and Euler’s Theorem
23. Arithmetic Functions
24. Formulas for the Functions φ, τ and σ

1. The Integers
In this section, we will recall some basic notation and properties of the integers. Throughout
the remainder of this book, we will use this information as axioms without further expla-
nation. The properties listed here are not necessarily independent; that is, it is possible to
prove some of these properties from the others.
We will denote the set of integers by
Z = {. . . , −3, −2, −1, 0, 1, 2, 3, 4, . . .}.
Given a, b ∈ Z we will write a + b for their sum and a ⋅ b for their product. We also denote the
product by ab. We will write a − b to denote a + (−b).
We call positive integers the elements of the set
Z>0 = {1, 2, 3, 4, . . .}.
Given two integers a, b, we will say that a is greater than b if a − b ∈ Z>0 ; this is denoted a > b.
We also say that b is smaller than a, writing b < a.
The integers satisfy the following properties:
Example 1.1. We will use the axioms above to show that, for all a ∈ Z, we have a ⋅ 0 = 0.
Indeed, since 0 is the identity element for addition we have 0 + 0 = 0, hence
a ⋅ 0 = a ⋅ (0 + 0) = a ⋅ 0 + a ⋅ 0
by the distributivity property. Adding the inverse of a ⋅ 0 to both sides and applying the
associativity law, we obtain
0 = a ⋅ 0 − a ⋅ 0 = a ⋅ 0 + (a ⋅ 0 − a ⋅ 0) = a ⋅ 0 + 0 = a ⋅ 0,
where the first equality comes from the definition of inverse and the last by the identity
element of addition. Thus a ⋅ 0 = 0, as desired.
The Well Ordering Principle (WOP): Every non-empty subset S ⊂ Z>0 of the positive
integers contains a least element.
That is, given a non-empty subset S of Z>0 , there is an m ∈ S such that m ≤ n for all n ∈ S. The
following examples illustrate this property.
Examples 1.2.
(1) Given S = Z>0 , the smallest element of S is 1.
(2) Let S be the set of positive even integers. Then 2 is the smallest element of S.
(3) Let S be the set of all prime numbers. Then 2 is the smallest element of S.
Remark 1.3. The WOP does not hold for other sets of numbers like Q or R. Indeed, consider
the set
S = {1/n ∶ n ∈ Z>0 }.
This is a non-empty set of positive elements which does not have a smallest element in either
Q or R.
For an integer k we will write Z>k to denote the set of integers greater than k. Similarly, we
will also write Z≥k , Z<k , Z≤k or Z≠k to denote the sets with the natural analogous definition.
We conclude this section by defining the rational numbers Q as fractions of integers. For-
mally, we have the following definition.
Definition 1.4. Consider pairs (p, q) where p, q ∈ Z and q ≠ 0 and the following equivalence
relation on them: two such pairs (p, q) and (p′ , q ′ ) are equivalent if and only if pq ′ = p′ q in Z.
We define the rational numbers Q as the set of equivalence classes for this relation. The
equivalence class of (p, q) is denoted by the fraction p/q.
Exercises.
Exercise 1.5. Let a, b ∈ Z with ab = 0. Show that either a = 0 or b = 0.
Exercise 1.6. Let a, b, c ∈ Z with a < b. Show that a + c < b + c.
Exercise 1.7. Let a, b, c ∈ Z with a < b and c > 0. Show that ac < bc.
2. Mathematical Induction
In this section, we recall the first and second principles of mathematical induction, an im-
portant proof technique. The first principle is also known as weak induction while the second
is also known as strong induction, because it seems to use a stronger assumption (compare
parts (b) of Theorems 1 and 2). However, they are equivalent; we shall see in Theorem 3
that they are both equivalent to the Well Ordering Principle.
Theorem 1 (First Principle of Mathematical Induction).
Let m be an integer and S a subset of Z satisfying
(a) m ∈ S and
(b) if k ≥ m and k ∈ S then k + 1 ∈ S.
Then S contains all integers greater than or equal to m, that is, Z≥m ⊂ S.
Proof. Let m = 1 and S be as in the statement. Assume, for contradiction, that there exists an
integer greater than or equal to m = 1 which is not in S. Then the set of positive integers which
are not in S is non-empty. By the WOP, this set has a minimal element s. Since 1 ∈ S, we
have that s ≠ 1 so that s is a positive integer strictly greater than 1. Now, the integer s − 1
is a positive integer smaller than s. By minimality of s, we must have that s − 1 ∈ S. Then,
from property (b), it follows that s = (s − 1) + 1 ∈ S, a contradiction.
Finally, let m be any integer and S as in the statement; we will reduce this situation to the
case m = 1 and apply the previous paragraph. Indeed, consider the translated set
S ′ = {k − m + 1 ∣ k ∈ S}.
Since m ∈ S we have 1 ∈ S ′ . Let k ≥ 1 be in S ′ . Then, there is k0 ≥ m in S such that
k = k0 − m + 1; since S satisfies (b) we have k0 + 1 ∈ S, hence k + 1 = (k0 + 1) − m + 1 is in
S ′ . We conclude that S ′ satisfies (a) and (b) with m = 1, so by the first part of the proof we
have Z≥1 ⊂ S ′ . Since k ∈ S whenever k − m + 1 ∈ S ′ , it follows that Z≥m ⊂ S, as desired. □
Theorem 2 (Second Principle of Mathematical Induction).
Let m be an integer and S a subset of Z satisfying
(a) m ∈ S and
(b) if k ≥ m and {m, m + 1, m + 2, . . . , k} ⊂ S then k + 1 ∈ S.
Then S contains all integers greater than or equal to m, that is, Z≥m ⊂ S.
Proof. Let m and S be as in the statement. Consider the set T of all the integers n ≥ m such
that every integer in the interval [m, n] belongs to S. In particular, m ∈ T .
Suppose that n ∈ T . Then {m, m + 1, m + 2, . . . , n} ⊂ S by definition of T . The hypotheses on
S now imply n + 1 ∈ S. Then all the integers in the interval [m, n + 1] are in S, so n + 1 ∈ T
also.
We have shown that T satisfies hypotheses (a) and (b) of Theorem 1, so T contains all the
integers greater than or equal to m. Since T ⊂ S the same is true for S, as desired. □
In practice, induction is used to show that a statement is true for all integers ≥ m for some
m ∈ Z≥0 . This is done via two steps. The first step is the base case, where we prove the
desired statement is true for n = m. The second is the induction step, where, assuming that
the desired statement is true for n (the induction hypothesis), we prove it is also true for n+1.
Letting S denote the set of integers n ≥ m for which the statement is true, these two steps
show S satisfies (a) and (b) of Theorem 1. Hence, S must contain all the integers ≥ m.
Note that the only difference between strong and weak induction is that in strong induction,
the induction hypothesis becomes that the statement is true for all integers in the interval
[m, n]. We now give a few examples.
Proposition 1. Let n ∈ Z>0 . The sum of the first n positive integers is given by the formula
∑_{k=1}^{n} k = n(n + 1)/2.
Proof. Let S ⊂ Z be the set of positive integers for which the formula holds. We use weak
induction on S.
Base: Let n = 1. Then ∑_{k=1}^{1} k = 1 = 1 ⋅ (1 + 1)/2, so 1 ∈ S.
Hypothesis: Suppose that the formula holds for some n ≥ 1, that is n ∈ S.
Step: We will show that n + 1 ∈ S. We have that
∑_{k=1}^{n+1} k = ∑_{k=1}^{n} k + (n + 1) = n(n + 1)/2 + n + 1
= (n(n + 1) + 2n + 2)/2 = (n^2 + 3n + 2)/2 = (n + 1)(n + 2)/2
= (n + 1)((n + 1) + 1)/2,
where in the second equality we have used the induction hypothesis. This shows that n+1 ∈ S.
We conclude that S satisfies both properties (a) with m = 1 and (b) in Theorem 1, therefore
S = Z≥1 , as desired. □
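As a quick numerical sanity check (not a substitute for the induction proof), the closed form of Proposition 1 can be compared with a direct summation. The function name `sum_first` and the tested range are illustrative choices, not part of the notes.

```python
def sum_first(n: int) -> int:
    """Closed form n(n + 1)/2 from Proposition 1."""
    return n * (n + 1) // 2

# Compare the closed form with a direct summation for small n.
for n in range(1, 201):
    assert sum_first(n) == sum(range(1, n + 1))

print(sum_first(100))  # 5050
```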
Proposition 2. Consider the geometric series ∑_{k=0}^{∞} a r^k where a, r ∈ R with r ≠ 1. For
n ∈ Z≥0 , its partial sum is given by the formula
∑_{k=0}^{n} a r^k = a (1 − r^{n+1})/(1 − r).
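The partial-sum formula of Proposition 2 can be spot-checked with exact rational arithmetic. This is only a sketch: the helper name `geometric_partial_sum` and the sample values of a and r are assumptions made for illustration.

```python
from fractions import Fraction

def geometric_partial_sum(a, r, n):
    """Closed form a(1 - r^(n+1))/(1 - r), valid for r != 1."""
    return a * (1 - r ** (n + 1)) / (1 - r)

# Exact comparison against the term-by-term sum for a = 3, r = 1/2.
a, r = Fraction(3), Fraction(1, 2)
for n in range(0, 20):
    direct = sum(a * r ** k for k in range(n + 1))
    assert geometric_partial_sum(a, r, n) == direct
```

Using `Fraction` avoids floating-point rounding, so the equality test is exact.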
Proof. We have seen in the previous proofs that WOP implies weak induction and that weak
induction implies strong induction. We will now show that strong induction implies WOP.
Suppose there exists S ⊂ Z>0 without a smallest element. We will prove that S is empty. Let
T be the complement of S in Z>0 . That is, T is the set of positive integers which are not in
S.
Clearly 1 ∈ T , since otherwise 1 ∈ S would be the smallest element of S, as 1 is the smallest positive
integer. Let n ≥ 1 and write Sn = {1, . . . , n}. Suppose Sn ⊂ T , hence Sn ∩ S = ∅. Therefore, if
n + 1 ∈ S then n + 1 is the smallest integer in S, which is a contradiction to our hypothesis, so
n + 1 ∉ S. We conclude that n + 1 ∈ T , hence T satisfies properties (a) and (b) of Theorem 2 with m = 1
and we have T = Z>0 by strong induction. Thus S = ∅, as desired. □
Exercises.
Exercise 2.2. Let n ∈ Z>0 . Use induction to show that the sum of the squares of the first n
positive integers is given by the formula
∑_{k=1}^{n} k^2 = n(n + 1)(2n + 1)/6.
Exercise 2.3. Define a sequence x_1, x_2, . . . by
x_1 = 1,
x_2 = 3,
x_{k+2} = 3x_{k+1} − 2x_k for k ≥ 1.
Use induction to show that for all positive integers n, we have x_n = 2^n − 1.
3. Divisibility
Definition 3.1. Let a, b ∈ Z. We say that a divides b, denoted a ∣ b, if there exists c ∈ Z such
that b = a ⋅ c. In this case, we also say that a is a factor of b and b is a multiple of a. We
write a ∤ b to denote that a does not divide b.
Examples 3.2.
(1) 3 ∣ 6 since 6 = 3 ⋅ c with c = 2.
(2) 3 ∤ 5 since 5 = 3 ⋅ c forces c = 5/3 ∉ Z.
(3) a = 1 ⋅ a = (−1)(−a) ⇒ ±1, ±a divide a.
(4) 0 = a ⋅ 0 ⇒ a ∣ 0 ∀a ∈ Z.
(5) b = 0 ⋅ c ⇒ b = 0. That is, only 0 is divisible by 0.
Remark 3.3. From (4) in the example above, it follows that 0 ∣ 0. However, the fraction 0/0
makes no sense as a rational number.
In subsequent sections, we will need some simple properties of divisibility, which we now
state and prove.
Proposition 3. Let a, b, c be integers. If a ∣ b and b ∣ c then a ∣ c.
The above examples and definitions give a precise meaning to exact division, but we are
also used to division with a remainder. For example, we know that 4 fits into 15 exactly 3
times, with a remainder of 3. The following theorem makes this idea precise.
Theorem 4 (Division Algorithm/Division with Remainder). Let n, a ∈ Z with a > 0. Then
there exist unique q, r ∈ Z such that
n=q⋅a+r where 0 ≤ r < a.
We say that q is the quotient and r the remainder of the division of n by a.
Proof. The proof consists of two parts: first we find some q, r with the desired properties
and then we prove they are unique with those properties. Let n, a ∈ Z with a > 0.
Existence. Consider
T = {m ∈ Z≥0 ∣ m = n − ka for some k ∈ Z},
that is, the set of non-negative integers that differ from n by a multiple of a. Note that
T ≠ ∅ because we can choose a negative k with large enough absolute value to make m > 0.
Then we can choose r to be the smallest element of T : if 0 ∈ T then r = 0, and otherwise T is a
non-empty subset of Z>0 and the WOP applies. In particular,
we have 0 ≤ r = n − qa for some q ∈ Z by definition of T .
This gives our candidates for r and q. It remains to show that r < a. Indeed, suppose
r ≥ a > 0. Then
r − a = n − (q + 1)a ≥ 0 ⟹ r − a ∈ T with 0 ≤ r − a < r.
This contradicts the fact that r is the smallest element of T , hence r < a, as desired.
Uniqueness. Suppose, in addition to q and r, there exist q ′ and r′ with
n = q ′ ⋅ a + r′ where 0 ≤ r′ < a.
Now
n = q ⋅ a + r = q ′ ⋅ a + r′ with 0 ≤ r, r′ < a.
Suppose first r = r′ . Then (q − q ′ )a = 0 and, since a ≠ 0, we have q = q ′ . We conclude that,
to finish the proof, we need to show r = r′ . We proceed by contradiction.
Suppose WLOG that r′ > r. Then r′ − r = (q − q ′ )a > 0 implies r′ − r ≥ a but
a > r′ ≥ r′ − r ≥ a,
a contradiction. Hence r = r′ , as desired. □
Corollary 2. Let n, a ∈ Z with a > 0. Then a ∣ n if and only if the remainder of the division
of n by a is r = 0.
Examples 3.4.
(1) Take n = 6 and a = 3; then 6 = 2 ⋅ 3 + 0. That is q = 2 and r = 0.
(2) Take n = 30 and a = 7; then 30 = 4 ⋅ 7 + 2. That is q = 4 and r = 2.
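The quotient and remainder of Theorem 4 can be computed with Python's built-in `divmod`, which uses floor division and therefore returns a remainder in the range 0 ≤ r < a whenever a > 0, including for negative n. The wrapper name `division_with_remainder` is an illustrative assumption.

```python
def division_with_remainder(n: int, a: int) -> tuple[int, int]:
    """Return (q, r) with n = q*a + r and 0 <= r < a, for a > 0."""
    if a <= 0:
        raise ValueError("a must be positive")
    q, r = divmod(n, a)  # floor division: for a > 0 the remainder satisfies 0 <= r < a
    return q, r

print(division_with_remainder(6, 3))    # (2, 0)
print(division_with_remainder(30, 7))   # (4, 2)
print(division_with_remainder(-7, 3))   # (-3, 2), since -7 = (-3)*3 + 2
```

Note that languages whose integer division truncates toward zero would return a negative remainder in the last case, so the range condition of Theorem 4 would need an extra adjustment there.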
Exercises.
Exercise 3.5. Let n ∈ Z. Prove that 5 ∣ n^5 − n.
Exercise 3.6. Let n ∈ Z. Is it true that 4 ∣ n^4 − n? Provide a proof or counterexample.
4. Representation of Integers
When writing down integers, we typically use decimal notation, also called ‘base 10’. For
example, 37465 means that
37465 = 3 ⋅ 10^4 + 7 ⋅ 10^3 + 4 ⋅ 10^2 + 6 ⋅ 10 + 5 ⋅ 10^0.
We have also heard that computers use ‘base 2,’ representing numbers by using only a series
of 1’s and 0’s. For instance, 36 can be written as
36 = 1 ⋅ 2^5 + 0 ⋅ 2^4 + 0 ⋅ 2^3 + 1 ⋅ 2^2 + 0 ⋅ 2^1 + 0 ⋅ 2^0,
or more simply, 36 = (100100)_2. Here, (100100)_2 is the collection of the coefficients in front
of the powers of 2 in the representation of 36. Of course, these coefficients can only be
either 1 or 0, since, for example 2 = 1 ⋅ 2^1 + 0 ⋅ 2^0, or more concisely, 2 = (10)_2.
The following theorem makes this notion precise and shows that other bases, aside from 10
and 2, may also be used.
Theorem 5. Let b ≥ 2 be an integer. Every positive integer n can be uniquely written in
base b. More precisely, there are a unique integer k ≥ 0 and unique integers a_0, . . . , a_k such that
n = a_k b^k + a_{k−1} b^{k−1} + ⋯ + a_1 b + a_0 with a_k ≠ 0 and 0 ≤ a_i ≤ b − 1 for i = 0, . . . , k.
We denote n in base b by (a_k a_{k−1} . . . a_1 a_0)_b.
Proof. The proof uses strong induction and is divided into two parts: first we prove the
existence of a description of n as in the statement and then we show that such a description
is unique. Note that in the base step of induction, we must consider several cases. This
is because these cases are all independent of each other, in contrast to the induction step,
where each case follows from previous cases.
Existence.
Base: For the cases n = 1, . . . , b − 1, take k = 0 and a_0 = n.
Hypothesis: There exists a description in base b for all positive integers less than n.
Step: Suppose n ≥ b. We divide n by b using the division algorithm (Theorem 4) to obtain
n = b ⋅ q + a_0 with 0 ≤ a_0 ≤ b − 1.
Note that 1 ≤ q < n, so by the induction hypothesis
q = c_s b^s + c_{s−1} b^{s−1} + ⋯ + c_0 with c_s ≠ 0 and 0 ≤ c_i ≤ b − 1.
Then
n = b ⋅ q + a_0 = b(c_s b^s + c_{s−1} b^{s−1} + ⋯ + c_0) + a_0 = c_s b^{s+1} + ⋯ + c_0 b + a_0.
Taking k = s + 1 and a_i = c_{i−1} for i = 1, . . . , k we obtain the claimed description.
Uniqueness. Suppose
(4.1) n = a_k b^k + ⋯ + a_1 b + a_0 = a′_l b^l + ⋯ + a′_1 b + a′_0 with a_k, a′_l ≠ 0 and 0 ≤ a_i, a′_i ≤ b − 1.
a_k = a′_l, . . . , a_1 = a′_1,
completing the proof. □
Example 4.2. Let n = 67.
(1) n = (67)_10 since 67 = 6 ⋅ 10 + 7 ⋅ 10^0
(2) n = (232)_5 since 67 = 2 ⋅ 5^2 + 3 ⋅ 5 + 2 ⋅ 5^0
(3) n = (2111)_3 since 67 = 2 ⋅ 3^3 + 1 ⋅ 3^2 + 1 ⋅ 3 + 1 ⋅ 3^0
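The existence part of the proof of Theorem 5 is effectively an algorithm: divide by b, record the remainder as the last digit, and repeat with the quotient. A minimal sketch follows; the function name `to_base` and the list-of-digits output format are assumptions made for illustration.

```python
def to_base(n: int, b: int) -> list[int]:
    """Digits [a_k, ..., a_1, a_0] of n in base b (n > 0, b >= 2)."""
    if n <= 0 or b < 2:
        raise ValueError("need n > 0 and b >= 2")
    digits = []
    while n > 0:
        n, a = divmod(n, b)   # a is the next digit, n becomes the quotient q
        digits.append(a)
    return digits[::-1]       # digits were produced from a_0 upwards

print(to_base(67, 10))  # [6, 7]
print(to_base(67, 5))   # [2, 3, 2]
print(to_base(67, 3))   # [2, 1, 1, 1]
print(to_base(36, 2))   # [1, 0, 0, 1, 0, 0]
```

The loop replaces the induction of the proof: each pass handles the quotient q in place of n.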
Exercises.
Exercise 4.3. Convert (101001000)_2 to base 7.
Exercise 4.4. Consider a balance scale with 2 pans, A and B. Let k ∈ Z>0 . Show that any
weight not exceeding 2^k − 1 that is placed on pan A may be measured by placing on pan B
a subset of the weights {1, 2, 2^2, . . . , 2^{k−1}}.
5. The Greatest Common Divisor
Definition 5.1. Let a, b ∈ Z not both zero. The greatest common divisor of a and b is the
largest positive integer d such that d ∣ a and d ∣ b. We denote it by (a, b) or gcd(a, b). When
(a, b) = 1, we say that a and b are coprime.
Since the sets of positive divisors of n and −n are the same, it is clear that
(−a, b) = (a, −b) = (−a, −b) = (a, b).
Therefore, we can restrict the coming discussion to non-negative integers a, b.
Examples 5.2.
(1) The set of all common divisors of 12 and 18 is {1, 2, 3, 6}, so (12, 18) = gcd(12, 18) = 6.
(2) For all a > 0, since a ∣ 0, we have (a, 0) = a.
The following theorem provides an alternative description of the greatest common divisor.
Theorem 6. Let a, b ∈ Z not both zero. Then (a, b) is the smallest positive integral linear
combination of a and b. That is, the smallest positive integer of the form
ax + by where x, y ∈ Z.
We claim that I is closed under addition and multiplication by scalars. More precisely, if
x, y ∈ I and λ ∈ Z then x + y and λx belong to I. In particular, qd ∈ I and, since r = n − qd
with n ∈ I, we conclude r ∈ I. Thus r = 0, otherwise I would contain a positive number
smaller than d. It follows that d ∣ n, where n = (a, b).
We now prove the claim. Let x = ax0 + bx1 and y = ay0 + by1 be elements of I and λ ∈ Z.
Then,
x + y = ax0 + bx1 + ay0 + by1 = a(x0 + y0 ) + b(x1 + y1 ) ∈ I
and
λx = λ(ax0 + bx1 ) = a(λx0 ) + b(λx1 ) ∈ I,
as claimed. □
Examples 5.3.
The second example illustrates that the values x, y given by Theorem 6 are not unique. In
what follows, we look at how to find all the possible choices for x, y.
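The non-uniqueness of x, y in Theorem 6 can be observed by brute force on a small example. The sketch below searches a bounded window of coefficients; the pair (12, 18), the bound 5, and the helper name `bezout_pairs` are illustrative assumptions rather than the example from the notes.

```python
from math import gcd

def bezout_pairs(a: int, b: int, bound: int) -> list[tuple[int, int]]:
    """All (x, y) with |x|, |y| <= bound and a*x + b*y == gcd(a, b)."""
    d = gcd(a, b)
    return [(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if a * x + b * y == d]

pairs = bezout_pairs(12, 18, 5)
print(pairs)  # [(-4, 3), (-1, 1), (2, -1), (5, -3)]

# Within the same window, gcd(12, 18) = 6 is the smallest positive combination.
values = {12 * x + 18 * y for x in range(-5, 6) for y in range(-5, 6)}
print(min(v for v in values if v > 0))  # 6
```

Consecutive solutions differ by (3, −2), that is by (18/6, −12/6).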
Corollary 3. Let a, b ∈ Z not both zero. If (a, b) = 1, then ax + by = 1 for some x, y ∈ Z.
Proof. Let d be a common divisor of a, b. We have a = da′ and b = db′ for some a′ , b′ ∈ Z.
From Theorem 6, there are x, y ∈ Z such that
(a, b) = ax + by = d(a′ x) + d(b′ y) = d(a′ x + b′ y) ⟹ d ∣ (a, b). □
Corollary 5. Let a, a′ , b, b′ ∈ Z satisfy a = da′ and b = db′ where d = (a, b). Then (a′ , b′ ) = 1.
The notion of greatest common divisor also makes sense for more than two integers.
Definition 5.4. Let a1 , a2 , . . . , an ∈ Z not all zero. The greatest common divisor of a1 , . . . , an ,
denoted gcd(a1 , . . . , an ) or (a1 , . . . , an ), is the largest positive integer dividing all the ai .
When (a1 , . . . , an ) = 1 we say that the ai are coprime and if (ai , aj ) = 1 for all i ≠ j, we say
they are pairwise coprime.
Example 5.5. Note that 7 ∤ 24, 7 ∤ 60 and 7 is the unique prime factor of 49 = 7^2, so
(24, 60, 49) = 1; however, (24, 60) = 12. This shows that 24, 60 and 49 are coprime but not
pairwise coprime.
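The distinction between coprime and pairwise coprime in Example 5.5 can be checked directly. The sketch below folds the two-argument `math.gcd` over the list, which is one common way to compute the gcd of several integers; the helper names `gcd_many` and `pairwise_coprime` are assumptions.

```python
from functools import reduce
from itertools import combinations
from math import gcd

def gcd_many(*nums: int) -> int:
    """gcd of several integers, folding the two-argument gcd over the list."""
    return reduce(gcd, nums)

def pairwise_coprime(*nums: int) -> bool:
    """True when every pair of distinct entries has gcd 1."""
    return all(gcd(x, y) == 1 for x, y in combinations(nums, 2))

print(gcd_many(24, 60, 49))          # 1
print(gcd(24, 60))                   # 12
print(pairwise_coprime(24, 60, 49))  # False
```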
To complete the proof, we will now show that gcd(a1 , gcd(a2 , a3 , . . . , ak+1 )) = gcd(a1 , a2 , . . . , ak+1 ).
Indeed, let d0 be a common divisor of all the ai . In particular, by the induction hypothesis,
d0 divides gcd(a2 , . . . , ak+1 ), and since d0 also divides a1 , we have that
d0 ∣ gcd(a1 , gcd(a2 , a3 , . . . , ak+1 ))
by Corollary 4. By choosing d0 = gcd(a1 , . . . , ak+1 ), we conclude that
gcd(a1 , gcd(a2 , a3 , . . . , ak+1 )) ≥ gcd(a1 , a2 , . . . , ak+1 )
by definition of the GCD. Conversely, suppose d0 divides a1 and gcd(a2 , . . . , ak+1 ); hence
d0 also divides a2 , . . . ak+1 . It follows that d0 divides gcd(a1 , a2 , . . . , ak+1 ), and as above, we
conclude that
gcd(a1 , gcd(a2 , a3 , . . . , ak+1 )) ≤ gcd(a1 , a2 , . . . , ak+1 ).
□
Exercises.
Exercise 5.6. Let a, b be coprime integers not both zero. Determine with proof the possible
values of (a^2 + b^2, a + b).
Note: You may use the fact that every integer has a prime divisor (Lemma 2 below).
6. The Euclidean Algorithm
In Example 5.2 we have computed (12, 18) = 6 by first listing all common divisors of 12
and 18. We now compute (18, 30) in the same way. Indeed, the positive divisors of 30 are
{1, 2, 3, 5, 6, 10, 15, 30} and those of 18 are {1, 2, 3, 6, 9, 18}. Then their common divisors are
{1, 2, 3, 6}, therefore (30, 18) = 6. Though this method is effective, it is not practical when
dealing with large numbers. In this section, we introduce the Euclidean algorithm which,
given integers a, b, allows one to compute (a, b) in an efficient way. We will first need the
following auxiliary result.
Lemma 1. Let a, b ∈ Z with a ≥ b > 0. Suppose
a=q⋅b+r with q, r ∈ Z.
Then (a, b) = (b, r).
Proof. Note that the ri ≥ 0 satisfy r1 > r2 > r3 > . . . . If rn ≠ 0 for all n, then we obtain a
strictly decreasing sequence of positive integers, which is impossible. Thus rn = 0 for some
n ≥ 1.
Suppose n > 1. Repeated applications of Lemma 1 give
(a, b) = (b, r1 ) = (r1 , r2 ) = . . . = (rn−1 , rn ) = (rn−1 , 0) = rn−1 ,
as desired. If n = 1, then r1 = 0 and b ∣ a, thus (a, b) = b. □
Example 6.1. For a = 30, b = 18, we compute
(1) 30 = 1 ⋅ 18 + 12, so r1 = 12 ⟹ (30, 18) = (18, 12).
(2) 18 = 1 ⋅ 12 + 6, so r2 = 6 ⟹ (18, 12) = (12, 6).
(3) 12 = 2 ⋅ 6 + 0, so r3 = 0 ⟹ (12, 6) = (6, 0) = 6.
Thus (30, 18) = 6, as expected.
Example 6.2. Compute (803, 154):
(1) 803 = 154 ⋅ 5 + 33, so r1 = 33 ⟹ (803, 154) = (154, 33).
(2) 154 = 33 ⋅ 4 + 22, so r2 = 22 ⟹ (154, 33) = (33, 22).
(3) 33 = 22 ⋅ 1 + 11, so r3 = 11 ⟹ (33, 22) = (22, 11).
(4) 22 = 11 ⋅ 2 + 0, so r4 = 0 ⟹ (22, 11) = (11, 0) = 11.
Thus (803, 154) = 11.
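The steps of Examples 6.1 and 6.2 can be sketched as a short loop that applies Lemma 1 until the remainder is 0. The function name `euclid` and the printed trace format are illustrative assumptions.

```python
def euclid(a: int, b: int) -> int:
    """gcd of a >= b > 0 via repeated division with remainder (Lemma 1)."""
    while b != 0:
        q, r = divmod(a, b)
        print(f"{a} = {q} * {b} + {r}")
        a, b = b, r        # Lemma 1: (a, b) = (b, r)
    return a               # the last non-zero remainder

print(euclid(803, 154))
# 803 = 5 * 154 + 33
# 154 = 4 * 33 + 22
# 33 = 1 * 22 + 11
# 22 = 2 * 11 + 0
# 11
```

Because the remainders strictly decrease, the loop terminates; this is the same observation used in the proof above.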
Recall that (a, b) is the smallest positive integer of the form ax+by with x, y ∈ Z (Theorem 6).
The following method, called back substitution, allows one to find x0 , y0 ∈ Z such that
(a, b) = ax0 + by0 .
This method is also known as the extended Euclidean algorithm since it mostly consists of
reversing the steps of the Euclidean algorithm. We illustrate this with a few examples.
Example 6.3. In Example 6.2, we computed (803, 154) = 11. We can reverse the steps of
the Euclidean algorithm as follows:
(803, 154) = 11 = 33 − 22 = 33 − (154 − 33 ⋅ 4) = 33 ⋅ 5 − 154
= (803 − 154 ⋅ 5) ⋅ 5 − 154 = 803 ⋅ 5 − 154 ⋅ 26
= 803 ⋅ 5 + 154 ⋅ (−26),
hence
(803, 154) = 803 ⋅ 5 + 154 ⋅ (−26) ⟹ x0 = 5 and y0 = −26.
Example 6.4. Compute (154, 35) and x0 , y0 satisfying 154x0 + 35y0 = (154, 35).
First apply the Euclidean Algorithm:
(1) 154 = 4 ⋅ 35 + 14, so r1 = 14 ⟹ (154, 35) = (35, 14);
(2) 35 = 2 ⋅ 14 + 7, so r2 = 7 ⟹ (35, 14) = (14, 7);
(3) 14 = 2 ⋅ 7 + 0, so r3 = 0 ⟹ (14, 7) = (7, 0) = 7,
to conclude (154, 35) = 7. Now we apply back substitution:
(154, 35) = 7 = 35 − 2 ⋅ 14 = 35 − 2 ⋅ (154 − 4 ⋅ 35)
= 35 ⋅ 9 + 154 ⋅ (−2) = 154 ⋅ (−2) + 35 ⋅ 9.
That is x0 = −2 and y0 = 9.
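Back substitution can also be organized as a forward computation. The sketch below is an iterative variant of the extended Euclidean algorithm that carries the coefficients along with each remainder instead of substituting backwards afterwards; it is not the notes' method verbatim, and the name `extended_gcd` is an assumption.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = gcd(a, b) and a*x + b*y == d, for a, b >= 0."""
    # Invariants: old_r == a*old_x + b*old_y and r == a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

print(extended_gcd(154, 35))   # (7, -2, 9)
print(extended_gcd(803, 154))  # (11, 5, -26)
```

Each remainder is kept as an explicit combination of a and b, so the final coefficients are available as soon as the remainder reaches 0.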
Proposition 6. Let a and b be non-zero integers satisfying a ∣ b and b ∣ a.
Then, a = b or a = −b. In particular, if a and b are positive, then a = b.
Exercises.
Exercise 6.5. Use the Euclidean algorithm to prove that 7 has no expression as an integral
linear combination of 18209 and 19043.
Exercise 6.6. Use the Euclidean algorithm and back substitution to find two rational numbers
with denominators 11 and 13, respectively, and a sum of 7/143.
7. Prime Numbers
The prime numbers function as the ‘building blocks’ of the integers in the sense that they
cannot be divided any further.
Definition 7.1. Let p > 1 be an integer. Then p is a prime number if its only positive
divisors are 1 and p. An integer n > 1 which is not prime is called composite.
Examples 7.2.
(1) 2, 3, 5, 7 are prime numbers.
(2) 6 = 2 ⋅ 3 is composite.
(3) 34052881 is a prime.
(4) 2^74207281 − 1 is the largest prime number known as of May 2017. It is a number with
22338618 digits.
The last examples show that there are enormous prime numbers. In fact, a theorem of
Euclid states that there are infinitely many primes. It is a consequence of this theorem
(see Theorem 8) that we can always find larger and larger primes. Before we prove Euclid’s
theorem, we need to introduce the following important auxiliary result.
Lemma 2. Every integer n > 1 has a prime divisor.
Proof. Let n > 1 be an integer. If n is prime, since n ∣ n, then n is its own prime divisor
and we are done. It therefore suffices to consider composite numbers. Suppose, for contradiction,
that some composite number has no prime divisor, and let n be the smallest such number
(which exists by the WOP). Then, there are integers a, b such that
n = a ⋅ b with 1 < a, b < n.
Since a < n, either a is prime or, by minimality of n, a is composite with a prime divisor. In both
cases there exists a prime p dividing a; that is a = pa′ for some a′ ∈ Z. Then,
n = ab = p(a′ b), so that p ∣ n, a contradiction. □
Theorem 8 (Euclid). There are infinitely many prime numbers.
Proof of Euclid’s Theorem. Suppose, for contradiction, that there are only finitely many
prime numbers. Denote them p1 , p2 , . . . , pk and consider the number
n = p1 p2 ⋯pk + 1.
By Lemma 2, n has a prime divisor p, hence p = pi for some i. Since p divides n and p1 p2 ⋯pk ,
from Corollary 1, p divides the difference
n − p1 p2 ⋯pk = 1
which is impossible. Hence there are infinitely many primes. □
The following two theorems are classical results on the distribution of prime numbers. They
are beyond the scope of these notes, so we restrict ourselves to their statements.
Theorem 9 (The Prime Number Theorem). Let π(x) denote the function giving the number
of primes ≤ x. Then π(x) is asymptotic to x/log(x); that is, the ratio π(x)/(x/log(x)) tends to 1
as x tends to infinity.
Theorem 10 (Dirichlet Density Theorem). Let a, b ∈ Z satisfy (a, b) = 1. Then, there are
infinitely many primes of the form a + bk with k ∈ Z.
In later discussions about cryptographic applications, it will be clear that it is important to
find and use extremely large primes. Given a large odd integer it can be very hard to decide
if it is a prime number, therefore tests distinguishing between primes and composite integers
will be crucial. The most basic such test is trial division; the following proposition tells us
that, given an integer n > 1, we need only test its divisibility by all the primes up to √n. If n
is not divisible by any of these primes, then n must be a prime number.
Proposition 7. Let n be composite. Then n has a prime divisor p ≤ √n.
This method, though effective, is not practical when n is large. In later sections, we shall
study alternative methods to deal with such cases.
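Trial division as described above can be sketched in a few lines. For simplicity this version tests every integer d with 2 ≤ d ≤ √n rather than only the primes in that range, which is a simplification of the test and gives the same answer at the cost of extra work; the function name is an assumption.

```python
from math import isqrt

def is_prime_trial_division(n: int) -> bool:
    """Trial division: n > 1 is prime iff no d with 2 <= d <= sqrt(n) divides it."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):   # isqrt(n) is the integer part of sqrt(n)
        if n % d == 0:
            return False               # found a divisor, so n is composite
    return True

print([n for n in range(2, 30) if is_prime_trial_division(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The number of divisions grows like √n, which illustrates why the method becomes impractical for the very large integers used in cryptography.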
Exercises.
Exercise 7.3. Using Euclid’s proof that there are infinitely many primes, show that the
n-th prime p_n does not exceed 2^{2^{n−1}} whenever n is a positive integer. Conclude that when n
8. The Fundamental Theorem of Arithmetic
The main objective of this section is to prove the following result, which justifies the expres-
sion ‘the primes are the building blocks of the integers’.
Theorem 11 (The Fundamental Theorem of Arithmetic). Let n ≠ 0, ±1 be an integer. Then
n has a prime factorization of the form
n = ±p_1^{a_1} ⋯ p_r^{a_r}, a_i ≥ 1,
where the pi are distinct prime numbers. Furthermore, up to the order of the pi , this factor-
ization is unique.
We remark that, however familiar this statement sounds, it is non-trivial. Suppose, for
example, that instead of the integers, we work only with the even integers. In this
setting, the numbers 6, 10, 30, 50 are ‘primes’, in the sense that they cannot be decomposed
into a product of smaller even numbers. Moreover, we have 300 = 10 ⋅ 30 = 6 ⋅ 50, showing that
the number 300 has two different ‘prime decompositions’ in the universe of even numbers.
To prove the FTA some preparation is required.
Lemma 3. Let a, b ∈ Z>0 satisfy (a, b) = 1 and let c ∈ Z. If a ∣ bc, then a ∣ c.
Proof. Let n > 0 and d ∣ n. We have n = dk for some integer k. Clearly, any prime divisor
of d is a prime divisor of n, so d = p_1^{b_1} ⋯ p_n^{b_n} with b_i ≥ 0. WLOG suppose that b_1 > a_1. Then,
b_1 − a_1 ≥ 1 and
n = dk ⇐⇒ p_1^{a_1} ⋯ p_n^{a_n} = (p_1^{b_1} ⋯ p_n^{b_n})k ⇐⇒ p_2^{a_2} ⋯ p_n^{a_n} = p_1 (p_1^{b_1−a_1−1} p_2^{b_2} ⋯ p_n^{b_n} k),
showing that p_1 divides the left hand side, which is impossible because p_i ≠ p_1 for all
i ≥ 2. Thus, b_1 ≤ a_1, as desired. □
Exercises.
Exercise 8.3. An integer n > 0 is a square if there is c ∈ Z such that n = c^2. A square-free
integer is an integer that is not divisible by any squares other than 1. Show that every
positive integer can be written as the product of a square (possibly 1) and a square-free
integer.
Exercise 8.4. An integer n is called powerful if, whenever a prime p divides n, p^2 also
divides n. Show that every powerful number can be written as the product of a square and a
cube (i.e. an integer of the form c^3 for some integer c).
9. The Least Common Multiple
Proof. Let p be a prime and let p^s denote the largest power of p dividing (a_1, . . . , a_k). For
i = 1, . . . , k, write
a_i = p^{e_i} m_i with (m_i, p) = 1.
Since p^s ∣ (a_1, . . . , a_k), we have p^s ∣ a_i for all i. By Proposition 8, we have s ≤ min(e_1, . . . , e_k).
Conversely, p^{min(e_1,...,e_k)} ∣ a_i for all i. It now follows from Proposition 5 that
p^{min(e_1,...,e_k)} ∣ (a_1, . . . , a_k),
hence s = min(e_1, . . . , e_k), as desired. Repeating this argument for each of the primes p_1, . . . , p_n establishes
(a_1, a_2, . . . , a_k) = p_1^{min(s_{i,1})} ⋯ p_n^{min(s_{i,n})}.
Now, to prove the second part of the proposition, write ℓ = lcm(a_1, a_2, . . . , a_k) and
ℓ′ = p_1^{max(s_{i,1})} ⋯ p_n^{max(s_{i,n})}.
If s_{i,j} denotes the exponent of the j-th prime, p_j, in the decomposition of a_i,
a_i = p_1^{s_{i,1}} ⋯ p_n^{s_{i,n}} for i = 1, . . . , k,
then clearly p_j^{s_{i,j}} ∣ ℓ′ for all i, j. Therefore, ℓ′ is a multiple of all the a_i. Since ℓ is the smallest
positive multiple of all of the a_i, this means that ℓ ≤ ℓ′.
Suppose now that ℓ < ℓ′. Clearly, ℓ does not have any prime factor different from the p_i.
Indeed, suppose ℓ contained a prime factor different from p_1, . . . , p_n, say
ℓ = p_1^{b_1} ⋯ p_n^{b_n} ⋅ p^b,
for some integers b_1, . . . , b_n, b and p a prime distinct from p_1, . . . , p_n. Since the a_i are made up
only of the primes {p_1, . . . , p_n} and ℓ denotes the smallest multiple of all of the a_i, dropping
p from ℓ would yield a smaller multiple of the a_i, a contradiction.
Now, if
ℓ < ℓ′ = p_1^{max(s_{i,1})} ⋯ p_n^{max(s_{i,n})},
it is because one of the exponents in the factorization of ℓ, say (WLOG) the exponent of p_1,
is strictly smaller than max(s_{i,1}). Suppose max(s_{i,1}) = s_{r,1} for some 1 ≤ r ≤ k. But then, the
above implies that a_r ∤ ℓ because the exponent s_{r,1} of p_1 in the factorization of a_r is strictly
larger than the exponent of p_1 in ℓ. This is a contradiction. Thus ℓ = ℓ′ as desired. □
The particular case of only two integers a, b is very useful and deserves to be highlighted.
Proposition 10. Let a, b ∈ Z>0 have prime decompositions
a = p_1^{a_1} ⋯ p_n^{a_n} and b = p_1^{b_1} ⋯ p_n^{b_n},
where a_i, b_i ≥ 0, and the p_i are distinct primes. Then,
(i) (a, b) = p_1^{min(a_1,b_1)} ⋯ p_n^{min(a_n,b_n)}.
(ii) lcm(a, b) = p_1^{max(a_1,b_1)} ⋯ p_n^{max(a_n,b_n)}.
(iii) a ⋅ b = (a, b) ⋅ lcm(a, b).
Proof. This follows directly from the formula for the least common multiple in Proposition 9. □
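Proposition 10 can be illustrated by computing the greatest common divisor and least common multiple from prime factorizations, taking the minimum and maximum of the exponents. The sketch below factorizes by trial division, which is only reasonable for small inputs; all function names are assumptions.

```python
from collections import Counter

def factorize(n: int) -> Counter:
    """Prime factorization of n > 0 as {prime: exponent}, by trial division."""
    factors = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] += 1
            n //= d
        d += 1
    if n > 1:
        factors[n] += 1     # whatever remains is a prime factor
    return factors

def from_factors(factors) -> int:
    """Multiply out a {prime: exponent} mapping."""
    result = 1
    for p, e in factors.items():
        result *= p ** e
    return result

def gcd_lcm(a: int, b: int) -> tuple[int, int]:
    """(gcd, lcm) of a, b > 0 via min and max of exponents (Proposition 10)."""
    fa, fb = factorize(a), factorize(b)
    primes = set(fa) | set(fb)          # a Counter returns 0 for absent primes
    g = from_factors({p: min(fa[p], fb[p]) for p in primes})
    l = from_factors({p: max(fa[p], fb[p]) for p in primes})
    return g, l

g, l = gcd_lcm(12, 18)
print(g, l)                 # 6 36
assert 12 * 18 == g * l     # part (iii)
```

Part (iii) holds because min(a_i, b_i) + max(a_i, b_i) = a_i + b_i for every exponent pair.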
Proposition 12. Let n, a1 , . . . , ak ∈ Z. Suppose that ai ∣ n for all i. Then lcm(a1 , . . . , ak ) ∣ n.
Proof. Clearly, ai ∣ n for all i if lcm(a1 , . . . , ak ) ∣ n since ai ∣ lcm(a1 , . . . , ak ). For the other
direction, we use induction on k ≥ 2.
Base: Let a_1, a_2, n be integers such that both a_1 and a_2 divide n. Then, we can write their
factorizations as follows
a_1 = p_1^{e_1} ⋯ p_n^{e_n}, a_2 = p_1^{b_1} ⋯ p_n^{b_n}, n = p_1^{c_1} ⋯ p_n^{c_n} with e_i, b_i, c_i ≥ 0
and p_i distinct primes. From Proposition 8, we have e_i, b_i ≤ c_i, hence max(e_i, b_i) ≤ c_i for all i.
Thus, by Propositions 10 and 8, we conclude lcm(a_1, a_2) ∣ n.
Hypothesis: The result is true for some k ≥ 2 integers a_i.
Step: Suppose a_i ∣ n for 1 ≤ i ≤ k + 1 and write ℓ = lcm(a_1, . . . , a_k). Then, a_{k+1} ∣ n and by
hypothesis ℓ ∣ n, therefore lcm(ℓ, a_{k+1}) ∣ n by the base case. Now from Proposition 11 we
have lcm(ℓ, a_{k+1}) = lcm(a_1, . . . , a_{k+1}) ∣ n, as desired. □
Proposition 13. Let a1 , . . . , an ∈ Z be pairwise coprime. Then lcm(a1 , . . . , an ) = a1 ⋯an .
Exercises.
Exercise 9.6. Show that, abc = gcd(a, b, c) lcm(a, b, c) does not hold for general a, b, c ∈ Z
by finding a counterexample.
Exercise 9.7. Prove that, for all a, b, c ∈ Z>0 , we have abc = gcd(bc, ac, ab) lcm(a, b, c).
10. Primes of the Form 4k + 3
In this section, we will prove the following particular case of Dirichlet’s Density Theorem
(Theorem 10).
Theorem 12. There are infinitely many primes of the form 4k + 3 for k ∈ Z.
Proof of Theorem 12. We will proceed using proof by contradiction. Indeed, suppose there
are only finitely many primes of the form 4k + 3. Denote these primes p0 = 3, p1 , p2 , . . . , ps
and consider the number
Q = 4p1 p2 ⋯ps + 3.
Clearly, 2 ∤ Q, hence the prime factorization of Q (which exists by Theorem 11) contains
only odd primes. By Lemma 4, the primes in this factorization are all of the form 4k + 1 or
4k + 3. If all the primes occurring in the prime factorization of Q are of the form 4k + 1, by
Lemma 5, we conclude that Q is also of the form 4k + 1. Here, Q is of the form 4k + 3, so
that there is at least one prime factor of Q which is of the form 4k + 3.
Let p ∣ Q be of the form 4k + 3. Thus p = pi for some i. If p = 3, then 3 ∣ (Q − 3) = 4p1 ⋯ps ,
a contradiction. If p = pi ≠ 3, then p ∣ (Q − 4p1 ⋯ps ) = 3, a contradiction. Hence there are
infinitely many primes of the form 4k + 3. □
Example 10.1. The first few values of 4k + 3 are 3, 7, 11, 15, 19, 23, 27, so that clearly the
formula generates both primes and composite numbers. Theorem 12 guarantees that we will
always find larger and larger values of k giving rise to new primes.
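The values listed in Example 10.1 can be generated and sorted into primes and composites with a few lines. The helper `is_prime` is a plain trial-division test introduced here for illustration only.

```python
from math import isqrt

def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))

values = [4 * k + 3 for k in range(7)]
print(values)                                   # [3, 7, 11, 15, 19, 23, 27]
print([v for v in values if is_prime(v)])       # [3, 7, 11, 19, 23]
print([v for v in values if not is_prime(v)])   # [15, 27]
```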
Exercises.
Exercise 10.2. Give a counterexample to show that Lemma 5 is false if we replace 4k + 1
by 4k + 3.
11. Linear Diophantine Equations
Definition 11.1. Any equation with one or more variables to be solved in the integers is
called a Diophantine Equation.
Examples 11.2. The equations
3x = 1, 2x + 2y = 3, x^2 + z^2 = y^2
are Diophantine equations when we are only interested in integer solutions. For example,
the first equation has the rational solution x = 1/3. However, viewed as a Diophantine equation in Z,
this equation has no solutions.
Definition 11.3. Let a1 , . . . , an ∈ Z≠0 . A Diophantine equations of the form
a1 x1 + a2 x2 + ⋯ + an xn = b, with b ∈ Z
is a linear Diophantine equation in n variables x1 , . . . , xn .
Examples 11.4.
Our objective in this section is to prove Theorem 14 which gives a complete resolution of
linear Diophantine equations in two variables. The case of one variable follows directly from
the definition of divisibility.
Theorem 13. Let a, b ∈ Z with a ≠ 0. The equation ax = b has an integer solution if and only
if a ∣ b. When a solution exists, it is unique and is given by x = b/a.
Theorem 14. Let a, b, c ∈ Z, with a, b ≠ 0. Write d = (a, b). Consider the equation
(11.5) ax + by = c.
(A) The equation (11.5) has an integer solution (x0 , y0 ) if and only if d ∣ c.
(B) Suppose d ∣ c so that there is a solution (x0 , y0 ) by part (A). Then, all the solutions
to (11.5) are given by the formulas
x = x0 + (b/d)t, y = y0 − (a/d)t with t ∈ Z.
Proof. We have a = da′ and b = db′ with a′ , b′ ∈ Z. By Corollary 5 we know that (a′ , b′ ) = 1.
We will now prove part (A). Suppose first that ax + by = c has a solution (x0 , y0 ). Then,
ax0 + by0 = c ⇐⇒ d(a′ x0 ) + d(b′ y0 ) = d(a′ x0 + b′ y0 ) = c ⟹ d ∣ c.
Conversely, suppose d ∣ c. That is, c = dt with t ∈ Z. From Theorem 6, we know there are
x1 , y1 ∈ Z such that
ax1 + by1 = d ⇐⇒ a(tx1 ) + b(ty1 ) = dt = c.
Then, x0 = tx1 , y0 = ty1 is a solution to ax + by = c.
We now prove (B). Suppose d ∣ c and (x0 , y0 ) is a solution to ax + by = c. Let t ∈ Z. We
compute
a(x0 + (b/d)t) + b(y0 − (a/d)t) = ax0 + (ab/d)t + by0 − (ab/d)t = ax0 + by0 = c,
showing that the formula in the statement produces solutions to ax + by = c. To finish the
proof, it remains to show that all solutions are given by the formula above. Let (x1 , y1 ) be
another solution. We define the quantities t_x = x1 − x0 and t_y = y1 − y0 and compute
a t_x + b t_y = ax1 − ax0 + by1 − by0 = (ax1 + by1 ) − (ax0 + by0 ) = c − c = 0.
Then,
b t_y = −a t_x ⇐⇒ d(b/d) t_y = −d(a/d) t_x , where d = (a, b)
⇐⇒ b′ t_y = −a′ t_x , where a′ = a/d, b′ = b/d.
Since (a′ , b′ ) = 1, by Lemma 3, we have b′ ∣ t_x , that is, t_x = b′ t for some t ∈ Z. Then,
b′ t_y = −a′ b′ t ⟹ t_y = −a′ t. Therefore,
x1 = x0 + t_x = x0 + b′ t = x0 + (b/d)t and y1 = y0 + t_y = y0 − a′ t = y0 − (a/d)t,
showing that (x1 , y1 ) is obtained from (x0 , y0 ) by the formula in the statement, as desired.
□
Example 11.6. We will solve the equation 154x + 35y = 7.
In Example 6.4, we have computed d = (154, 35) = 7 and, since d ∣ 7, there exist solutions
by part (A) of Theorem 14. Indeed, in the same example, we also computed the particular
solution (x0 , y0 ) = (−2, 9). Therefore, by part (B) of Theorem 14, the general solution is
given by
x = −2 + 5t, y = 9 − 22t, t ∈ Z.
In particular, taking t = 1 gives the particular solution (x1 , y1 ) = (3, −13).
Example 11.7. Consider the equation 154x + 35y = 24.
Since d = (154, 35) = 7 ∤ 24, there are no solutions by part (A) of Theorem 14.
Example 11.8. Consider the equation 154x + 35y = 21.
Since d = (154, 35) = 7 ∣ 21, this equation has solutions in Z. Example 11.6 shows that
154x + 35y = 7 has the solution x1 = −2 and y1 = 9. Then, 154x + 35y = 21 has the solution
x0 = 3x1 = −6, y0 = 3y1 = 27. We conclude that the general solution is given by
x = −6 + 5t, y = 27 − 22t for t ∈ Z.
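The procedure used in Examples 11.6 to 11.8 can be sketched in Python as follows. This is a minimal illustration under our own naming (`extended_gcd` and `solve_linear_diophantine` are not part of the notes): it computes a particular solution with the extended Euclidean Algorithm and returns the data of the general solution from part (B) of Theorem 14, assuming a, b ≠ 0.

```python
def extended_gcd(a: int, b: int) -> tuple:
    """Return (g, x, y) with a*x + b*y == g, where |g| = gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_linear_diophantine(a: int, b: int, c: int):
    """Solve a*x + b*y = c over the integers (a, b nonzero).

    Returns None if there is no solution. Otherwise returns
    (x0, y0, dx, dy), meaning that all solutions are
    x = x0 + dx*t, y = y0 - dy*t for t in Z, with dx = b/d, dy = a/d.
    """
    d, x1, y1 = extended_gcd(a, b)
    if d < 0:  # normalise so that d = gcd(a, b) > 0
        d, x1, y1 = -d, -x1, -y1
    if c % d != 0:
        return None  # part (A): no solution unless d divides c
    t = c // d
    return x1 * t, y1 * t, b // d, a // d
```

For Example 11.8, `solve_linear_diophantine(154, 35, 21)` returns `(-6, 27, 5, 22)`, which encodes x = −6 + 5t, y = 27 − 22t, while the equation of Example 11.7 yields `None`.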
Exercises.
Exercise 11.9. A shopper spends a total of $5.49 for oranges, which cost 18¢ each, and
grapefruit, which cost 33¢ each. What is the minimum number of pieces of fruit the shopper
could have bought?
12. Irrational Numbers
Any element of the set of rational numbers Q can be written as a fraction a/b, where a, b ∈ Z with
b ≠ 0. By cancelling out the common factors of a and b we may obtain another fraction,
a′ /b′ . This new fraction a′ /b′ represents the same rational number a/b = a′ /b′ , but with a′
and b′ now coprime. Recall the inclusions Z ⊂ Q ⊂ R.
Definition 12.1. We say that a real number x ∈ R is irrational if x ∉ Q.
We will prove the following standard fact using two techniques we have learned so far.
Theorem 15. The number √2 is irrational.
Proof 1. Suppose, for contradiction, that √2 is rational. Then, by definition of the rationals,
√2 = a/b with a, b positive integers. Consider the set
S = {k√2 ∣ k and k√2 are positive integers}.
Note that this set is non-empty since a = b√2 ∈ S. It follows by the WOP that there exists
a smallest positive element in S, say s = t√2 with t ∈ Z>0 . We claim that s√2 − s ∈ S and
s√2 − s < s, obtaining a contradiction with minimality of s, and completing the proof.
Indeed, note that s ∈ S is an integer by definition of S. Additionally, since t ∈ Z>0 , it follows
that
s√2 = t√2 ⋅ √2 = 2t and s − t
are integers. Moreover,
s√2 − s = s√2 − t√2 = (s − t)√2,
so that s√2 − s ∈ S, provided that we show s − t is positive. This is the same as showing that
(s − t)√2 is positive, which is true because
(s − t)√2 = s√2 − s = s(√2 − 1) and (√2 − 1), s > 0.
Therefore, s√2 − s ∈ S and since √2 − 1 < 1 we also have s√2 − s < s, as desired. □
Proof 2. Suppose, for contradiction, that √2 is rational. Then, by definition of the rationals,
√2 = a/b with a, b coprime positive integers. Hence,
√2 = a/b ⟹ 2b^2 = a^2 ⟹ 2 ∣ a
because 2 is a prime dividing the product a^2 = a⋅a, so it divides one of the factors. Therefore,
a = 2k for some k ∈ Z and, substituting this into the above gives
2b^2 = a^2 = (2k)^2 ⇐⇒ b^2 = 2k^2 ⟹ 2 ∣ b,
showing that both a, b are divisible by 2, contradicting the fact that (a, b) = 1. □
Using this corollary, we can easily give examples of irrational numbers; in particular, we
obtain another proof of Theorem 15.
Example 12.2. The numbers √2, √6, ∛5 and 19^{1/10} (the 10th root of 19) are irrational.
Exercises.
Exercise 12.3. Show that √5 + √3 is irrational.
Exercise 12.4. Show that log_2 3 is an irrational number.
13. Congruences
Using the language of congruences we can often state theorems in a more compact way; in
particular, we can now rephrase Lemma 5 and Theorem 12 as follows:
Lemma 6. Let a, b ∈ Z satisfy a, b ≡ 1 (mod 4). Then ab ≡ 1 (mod 4).
Theorem 17. There are infinitely many primes p such that p ≡ 3 (mod 4).
We will show that Lemma 6 is a special case of a general basic property of congruences (see
Corollary 10), but first we need to introduce other elementary properties and definitions.
Proposition 15. Let m ∈ Z>0 . Then, the relation of congruence modulo m is an equivalence
relation in Z. More precisely, for all a, b, c ∈ Z, we have
(i) a ≡ a (mod m) (reflexivity);
(ii) a ≡ b (mod m) ⟹ b ≡ a (mod m) (symmetry);
(iii) a ≡ b, b ≡ c (mod m) ⟹ a ≡ c (mod m) (transitivity).
Fix a congruence modulus m > 0. Since the relation of congruence mod m is an equivalence
relation, it divides Z into disjoint equivalence classes. The equivalence class of an integer a
is the set of integers which are congruent to a modulo m. We call it the congruence class of
a mod m and denote it by [a]. That is, for an integer a, we have
[a] ∶= {x ∈ Z ∶ x ≡ a (mod m)}.
We say that a is a representative of the class; we can choose any element y ∈ [a] as a
representative, in which case we have [y] = [a]. This is illustrated by the following examples.
Example 13.3. Let m = 4. We have Z = [0] ∪ [1] ∪ [2] ∪ [3], where
[0] = {x ∈ Z ∶ x ≡ 0 (mod 4)} = {x ∈ Z ∶ x − 0 = 4k with k ∈ Z}
= {. . . , −8, −4, 0, 4, 8, . . . }
[1] = {x ∈ Z ∶ x ≡ 1 (mod 4)} = {x ∈ Z ∶ x − 1 = 4k with k ∈ Z}
= {x ∈ Z ∶ x = 1 + 4k with k ∈ Z} = {. . . , −7, −3, 1, 5, 9, . . . }
[2] = {. . . , −6, −2, 2, 6, . . . }
[3] = {. . . , −5, −1, 3, 7, 11, . . . }
In particular, [0] = [−4], [1] = [9], [2] = [6] and [3] = [−1].
Example 13.4. Let m = 3. We have Z = [0] ∪ [1] ∪ [2], where
[0] = {. . . , −6, −3, 0, 3, 6, . . . }
[1] = {. . . , −5, −2, 1, 4, 7, . . . }
[2] = {. . . , −4, −1, 2, 5, 8, . . . }
In particular, [0] = [3], [1] = [−2] and [2] = [−1].
Example 13.5. Let m = 2. We have Z = [0] ∪ [1], where
[0] = { even integers }
[1] = { odd integers }
In particular, [0] = [4] and [1] = [3].
It follows from the previous discussion that every integer a belongs to a unique congruence
class modulo m. Given a, the next proposition determines the smallest non-negative
representative of the congruence class [a].
Proposition 16. Let a, m ∈ Z with m > 0. Then a ≡ r (mod m), where r is the remainder
of the division of a by m.
In particular, [a] = [r] and a is congruent to exactly one integer in {0, 1, 2, . . . , m − 1}.
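Proposition 16 is easy to experiment with. In the Python sketch below (the function names are ours), we rely on the fact that Python's `%` operator returns a value in {0, 1, . . . , m − 1} whenever m > 0, even for negative a, so it computes exactly the remainder r of the proposition.

```python
def least_residue(a: int, m: int) -> int:
    """Smallest non-negative representative of the class [a] modulo m."""
    if m <= 0:
        raise ValueError("the modulus m must be positive")
    return a % m  # in Python this lies in {0, ..., m - 1} for m > 0


def same_class(a: int, b: int, m: int) -> bool:
    """True when [a] = [b] in the sense of congruence modulo m."""
    return least_residue(a, m) == least_residue(b, m)


# The identities listed in Examples 13.3 and 13.4:
print(same_class(0, -4, 4), same_class(3, -1, 4), same_class(1, -2, 3))
```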
In what follows, we will see that Z/mZ has properties similar to the integers. In particular,
we shall soon define addition and multiplication in Z/mZ. However, let us first observe a
very important difference between Z/mZ and Z: there is no cancellation law in Z/mZ. More
precisely, in the integers we have
if a, b ∈ Z satisfy ab = 0, then a = 0 or b = 0
whilst, for example, in Z/4Z we have
2 ⋅ 2 ≡ 4 ≡ 0 (mod 4) and 2 ≢ 0 (mod 4).
To define addition and multiplication in Z/mZ we will need the following result.
Theorem 18. Let m ∈ Z>0 . Suppose a ≡ b (mod m) and c ≡ d (mod m). Then,
(i) a + c ≡ b + d (mod m);
(ii) a − c ≡ b − d (mod m);
(iii) ac ≡ bd (mod m).
Proof. Take m = 4 and let a, c ∈ Z satisfy a, c ≡ 1 (mod 4). Then, by part (iii) of Theorem 18
with b = d = 1, we conclude ac ≡ 1 ⋅ 1 ≡ 1 (mod 4), as desired. □
Example 13.10. Let m = 5. To compute 49^2 (mod 5) we calculate that 49^2 = 2401 = 480⋅5+1
and by Proposition 16, it follows that 49^2 ≡ 1 (mod 5). However, Theorem 18 allows for the
much quicker calculations
49^2 ≡ 4^2 ≡ 16 ≡ 1 (mod 5) or 49^2 ≡ (−1)^2 ≡ 1 (mod 5).
Remark 13.11. Theorem 18 does not extend to exponents. That is,
c ≡ d (mod m) ⇏ a^c ≡ a^d (mod m).
For example, taking m = 3, a = 2, d = 3, and c = 6, we have 3 ≡ 6 (mod 3) but
2^3 ≡ 8 ≡ 2 (mod 3) and 2^6 ≡ (2^3)^2 ≡ 2^2 ≡ 4 ≡ 1 (mod 3)
are not congruent mod 3.
Note that, in the above definitions, we use the concrete representatives r, s ∈ Z to calculate
the result of the operation. For example, for m = 5, we have [2] + [3] = [2 + 3] = [5], but,
since [2] = [7] and [3] = [−7], in order for the definition to make sense, we also need that
[7] + [−7] = [7 + (−7)] = [0] is equal to [5], which is the case. Clearly, for the operations to
be well defined, we need a similar compatibility for any other choice of representatives. This
is the content of the next proposition.
Proposition 17. The operations in Definition 13.12 are well defined. That is, their output
is independent of the choice of representatives.
Proof. Let r′ ∈ [r] and s′ ∈ [s], that is, r′ ≡ r (mod m) and s′ ≡ s (mod m). Then, by part (i)
of Theorem 18, we have
r + s ≡ r′ + s′ (mod m) ⇐⇒ [r + s] = [r′ + s′ ],
hence
[r] + [s] ∶= [r + s] = [r′ + s′ ] =∶ [r′ ] + [s′ ].
This shows that addition is well defined. Similar arguments show that the other operations
are also well-defined. □
Example 13.13. We can write tables of addition and multiplication in Z/mZ. For example,
the table of addition in Z/3Z:
+ [0] [1] [2]
[0] [0] [1] [2]
[1] [1] [2] [0]
[2] [2] [0] [1]
Example 13.14. In Z/7Z, we have [3] ⋅ [6] = [18] = [4] = [−10], but since [3] = [10] and
[6] = [−1] we also have, more directly, [3] ⋅ [6] = [10] ⋅ [−1] = [−10].
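Tables such as the one in Example 13.13 can be generated mechanically. The following Python sketch (the name `operation_table` is ours) represents each class [r] by its least non-negative representative r and reduces the result of the operation modulo m.

```python
import operator


def operation_table(m: int, op) -> list:
    """Entry [r][s] is the least non-negative representative of [r] op [s]."""
    return [[op(r, s) % m for s in range(m)] for r in range(m)]


addition_mod_3 = operation_table(3, operator.add)
multiplication_mod_7 = operation_table(7, operator.mul)

for row in addition_mod_3:
    print(row)
```

The entry `multiplication_mod_7[3][6]` equals 4, in agreement with Example 13.14, and `(10 * -1) % 7` gives the same value, which illustrates the independence of the choice of representatives.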
We have mentioned that, in general, there is no cancellation law in Z/mZ. For example,
when m = 4, we have 2 ⋅ 2 ≡ 0 (mod 4) and 2 ≢ 0 (mod 4). The following is another example
of the failure of the cancellation law. For all a, b ∈ Z we have 6a ≡ 6b (mod 3), because both
sides of the congruence are congruent to 0 since 3 ∣ 6. If we just cancel out the 6 (like we do
in Z), we get a ≡ b (mod 3) for all a, b, which of course is false since 1 ≢ 2 (mod 3).
The following lemma can be interpreted as a cancellation law in Z/mZ where we allow
changing the congruence modulus.
Lemma 7. Let a, b, c, m ∈ Z with m > 0 and c ≠ 0. Write d = (c, m). Then,
c ⋅ a ≡ c ⋅ b (mod m) ⇐⇒ a ≡ b (mod m/d).
Proof. Suppose first a ≡ b (mod m/d). That is, a − b = (m/d) ⋅ k for some k ∈ Z. Then,
da − db = m ⋅ k ⇐⇒ (c/d)(da − db) = (c/d)mk ⇐⇒ ca − cb = m((c/d)k) ⇐⇒ ca ≡ cb (mod m).
Conversely, suppose ca ≡ cb (mod m). That is, ca − cb = mk for some k ∈ Z. Therefore, we also
have
(m/d) ⋅ k = (c/d)(a − b) with (c/d, m/d) = 1,
where the second condition follows from Corollary 5. Then, Lemma 3 implies that
(m/d) ∣ a − b ⇐⇒ a ≡ b (mod m/d).
□
Example 13.15. From Lemma 7, for all a, b ∈ Z, we have
6a ≡ 6b (mod 3) ⇐⇒ a ≡ b (mod 3/(3, 6)) ⇐⇒ a ≡ b (mod 1),
which is true (see Examples 13.2 (f)).
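Lemma 7 can be checked numerically on a finite range. The Python sketch below is a sanity check rather than a proof; the function name `lemma7_agrees` and the search bound are our own choices.

```python
from math import gcd


def lemma7_agrees(c: int, m: int, bound: int = 15) -> bool:
    """Compare both sides of Lemma 7 for all a, b with |a|, |b| <= bound."""
    d = gcd(c, m)
    reduced_modulus = m // d
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            left = (c * a - c * b) % m == 0          # ca ≡ cb (mod m)
            right = (a - b) % reduced_modulus == 0   # a ≡ b (mod m/d)
            if left != right:
                return False
    return True


print(lemma7_agrees(6, 3), lemma7_agrees(4, 10))
```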
Exercises.
Exercise 13.16. Let m ∈ Z>0 and let [r], [s] ∈ Z/mZ. Prove that multiplication, as defined
by
[r] ⋅ [s] ∶= [r ⋅ s],
is a well-defined operation on Z/mZ.
Exercise 13.17. Prove or disprove that {−39, 72, −23, 50, −15, 63, −52} is a complete residue
system modulo 7.
Exercise 13.18. Find a complete residue system modulo 7 consisting entirely of even inte-
gers.
Exercise 13.19. Determine all least positive integers k modulo 16 satisfying k ≡ 2 (mod 4).
14. Fast Modular Exponentiation
In this section, we describe an efficient procedure to deal with exponentials modulo m. More
precisely, given a, k, m ∈ Z with m, k ≥ 2, we will describe how to compute a^k (mod m)
quickly.
The following method, known as fast modular exponentiation, consists of 3 main steps.
Step 1: Write the exponent in base 2. That is,
k = 2^{r_1} + 2^{r_2} + ⋯ + 2^{r_l} , r_1 > r_2 > ⋯ > r_l .
Step 2: For all powers of 2 which are less than or equal to 2^{r_1} , compute
a (mod m), a^2 (mod m), a^4 (mod m), . . . , a^{2^{r_1}} (mod m)
by successively squaring and reducing the result modulo m.
Step 3: Compute
a^k = a^{2^{r_1} + 2^{r_2} + ⋯ + 2^{r_l}} ≡ a^{2^{r_1}} ⋅ a^{2^{r_2}} ⋅ . . . ⋅ a^{2^{r_l}} (mod m),
where we use the values computed in Step 2 to obtain the right hand side of the congruence.
Example 14.1. Compute 7^{51} (mod 17).
Step 1: 51 = 2^5 + 2^4 + 2 + 1 = 32 + 16 + 2 + 1.
Step 2:
7 ≡ 7 (mod 17), 7^2 ≡ 49 ≡ 15 ≡ −2 (mod 17),
7^4 ≡ (−2)^2 ≡ 4 (mod 17), 7^8 ≡ 4^2 ≡ 16 ≡ −1 (mod 17),
7^{16} ≡ (−1)^2 ≡ 1 (mod 17), 7^{32} ≡ 1^2 ≡ 1 (mod 17).
Step 3:
7^{51} = 7^{32+16+2+1} = 7^1 ⋅ 7^2 ⋅ 7^{16} ⋅ 7^{32} ≡ 7 ⋅ (−2) ⋅ 1 ⋅ 1 ≡ −14 ≡ 3 (mod 17).
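The three steps can be merged into a single loop that reads the binary digits of k from the least significant one. The following Python function is a sketch of this idea under our own naming (`mod_pow`); it assumes k ≥ 0 and m ≥ 2, and Python's built-in three-argument `pow` performs the same computation.

```python
def mod_pow(a: int, k: int, m: int) -> int:
    """Compute a**k mod m by repeated squaring (k >= 0, m >= 2)."""
    result = 1
    square = a % m            # holds a^(2^j) mod m at step j (Step 2)
    while k > 0:
        if k & 1:             # the current binary digit of k is 1 (Step 1)
            result = (result * square) % m   # multiply it in (Step 3)
        square = (square * square) % m
        k >>= 1               # move to the next binary digit
    return result


print(mod_pow(7, 51, 17))  # 3, as in Example 14.1
```

Only about log_2 k squarings and at most as many multiplications are needed, and every intermediate value stays below m^2.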
Example 14.2. In this example, we demonstrate why working with base 2 is efficient in
fast modular exponentiation. Suppose we want to compute 7^{51} (mod 17) using base 3, for
instance.
We first compute 51 in base 3, obtaining
51 = 3^3 + 2 ⋅ 3^2 + 2 ⋅ 3 = 27 + 18 + 6.
Now, since
7^{51} = 7^{27+18+6} = 7^{27} ⋅ 7^{18} ⋅ 7^6 ≡ 14 ⋅ (7^9)^2 ⋅ (7^3)^2 ≡ 14 ⋅ 10^2 ⋅ 3^2 (mod 17),
we see that the successive cubes 7^3 ≡ 3, 7^9 ≡ 10 and 7^{27} ≡ 14 (mod 17) are insufficient, since
we must also compute (7^9)^2 and (7^3)^2 modulo 17. Note that this is due to the fact that
representations of integers in base 3 (or any other base ≥ 3) can have coefficients other than
0 and 1, which is not the case in base 2.
Exercises.
Exercise 14.3. Find the least positive residue of each of the following.
(a) 3^{10} (mod 11)
(b) 5^{16} (mod 17)
(c) 2^{12} (mod 13)
(d) 3^{22} (mod 23)
(e) Can you propose a theorem from the above congruences?
15. The Congruence Method
Before we proceed with the study of congruences, in this section, we will describe an appli-
cation of congruences to the solution of Diophantine equations.
The following method, called the congruence method, may sometimes be used to conclude
that certain Diophantine equations have no solutions in Z. The idea behind this method is
that if an equation is satisfied in Z, then it has to be satisfied modulo m for all m > 0. If,
however, we can find a value of m for which it is not satisfied mod m, then we can conclude
that there are no solutions in Z. We illustrate this with two examples.
Example 15.1. We will show that 3x^3 + 2 = y^2 has no integer solutions. Indeed, suppose
there are x0 , y0 ∈ Z satisfying 3x0^3 + 2 = y0^2 . Since every integer is congruent to itself (see
Proposition 15 (i)) we conclude that, for all integers m > 0, we have the congruence
(15.2) y0^2 ≡ y0^2 = 3x0^3 + 2 (mod m).
In particular, taking m = 3, we have
y0^2 ≡ 2 (mod 3),
where we have used the fact that 3 ≡ 0 (mod 3).
On the other hand, every integer is congruent modulo 3 to one of {0, 1, 2}; in particular,
y0 ≡ 0, 1 or 2 (mod 3) and we respectively obtain
y0^2 ≡ 0, 1, 4 ≡ 0, 1, 1 (mod 3).
Thus y0^2 ≢ 2 (mod 3) and the integer solution x0 , y0 to equation (15.2) cannot exist, otherwise
y0 satisfies an impossible congruence.
We note that there can be solutions mod m for other values of m. For example, if instead
we work modulo m = 2, from (15.2) we obtain
3x0^3 + 2 ≡ y0^2 (mod 2) ⇐⇒ x0^3 ≡ y0^2 (mod 2),
which is satisfied whenever x0 ≡ y0 (mod 2). For example, take x0 = y0 = 1. This shows that
the existence of solutions mod m says nothing about the existence of solutions in Z.
Example 15.3. We will show that 20y^2 + 2x = 3 has no integer solutions. Indeed, suppose
x0 , y0 ∈ Z is a solution. Taking m = 2 and arguing as in the previous example, we get
20y0^2 + 2x0 ≡ 3 (mod 2) ⇐⇒ 0 ≡ 1 (mod 2),
which is impossible. If instead we take m = 5, we obtain
(15.4) 20y0^2 + 2x0 ≡ 3 (mod 5) ⇐⇒ 2x0 ≡ 3 (mod 5).
We have that x0 ≡ 0, 1, 2, 3, 4 (mod 5) which implies, respectively,
2x0 ≡ 0, 2, 4, 1, 3 (mod 5),
so x0 ≡ 4 (mod 5) satisfies (15.4) and there is no contradiction. We conclude that every
integer x0 = 4 + 5k satisfies the congruence equation (15.4). However, we have shown above
that no integer x0 will satisfy the original equation in Z.
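Finding a suitable modulus m is often a matter of trial and error, and a brute-force search over residues can help. The Python sketch below is our own illustration (the names `solvable_mod`, `obstructing_moduli`, `cubic` and `linear` are hypothetical). It writes each equation in two variables as "residual = 0" and tests all pairs of residues modulo m, which is enough because the residuals are polynomials.

```python
from itertools import product


def solvable_mod(residual, m: int) -> bool:
    """True if residual(x, y) ≡ 0 (mod m) for some residues x, y."""
    return any(residual(x, y) % m == 0
               for x, y in product(range(m), repeat=2))


def obstructing_moduli(residual, max_m: int) -> list:
    """Moduli 2 <= m <= max_m for which the congruence has no solution."""
    return [m for m in range(2, max_m + 1) if not solvable_mod(residual, m)]


def cubic(x: int, y: int) -> int:
    return 3 * x**3 + 2 - y**2       # Example 15.1: 3x^3 + 2 = y^2


def linear(x: int, y: int) -> int:
    return 20 * y**2 + 2 * x - 3     # Example 15.3: 20y^2 + 2x = 3


print(obstructing_moduli(cubic, 5))   # [3]
print(obstructing_moduli(linear, 6))  # [2, 4, 6]
```

A non-empty list proves that the equation has no integer solutions, whereas an empty list proves nothing, as the discussion of m = 2 and m = 5 above shows.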
Exercises.
Exercise 15.5. Prove or disprove the following statements
(a) The Diophantine equation 3x^2 − 7y^2 = 2 has no integral solutions.
(b) The Diophantine equation x^2 + y^2 + 1 = 4z has no integral solutions.
16. Linear Congruences in One Variable
The previous examples show that an equation of the form (16.1) can have sets of solutions
with different behaviours. This is explained by the following theorem.
Theorem 19. Let a, b, m ∈ Z, with m > 0. Write d = (a, m).
Proof.
(i) Suppose for contradiction that x0 ∈ Z satisfies ax0 ≡ b (mod m). By definition of con-
gruence, there exists some y0 ∈ Z such that
ax0 − b = my0 ⇐⇒ ax0 + m(−y0 ) = b,
meaning that ax+my = b has the solution (x0 , −y0 ). Then (a, m) = d ∣ b by Theorem 14.
(ii) Suppose d ∣ b. Then ax − my = b has solutions by Theorem 14. Let (x0 , y0 ) be a
particular solution. By Theorem 14, the general solution is
x = x0 − (m/d)t, y = y0 − (a/d)t, t ∈ Z,
which gives all the integer solutions satisfying ax ≡ b (mod m).
To finish the proof, we must show that the above formula for x produces exactly d
incongruent values modulo m. Indeed, suppose we choose t1 , t2 ∈ Z giving the same
value for x modulo m, that is,
x0 − (m/d)t1 ≡ x0 − (m/d)t2 (mod m).
From here, we see that
x0 − (m/d)t1 ≡ x0 − (m/d)t2 (mod m) ⇐⇒ (m/d)(t2 − t1 ) ≡ 0 ≡ (m/d) ⋅ 0 (mod m)
⇐⇒ t2 − t1 ≡ 0 (mod m/(m, m/d)) by Lemma 7
⇐⇒ t1 ≡ t2 (mod d),
where we used (m, m/d) = m/d in the last step. In other words, for t1 , t2 giving the
same value of x modulo m, we must have that t1 ≡ t2 (mod d). Therefore taking
t ∈ {0, 1, . . . , d − 1} gives the desired d non-congruent solutions mod m.
□
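The content of this proof translates directly into a small solver. The following Python sketch uses our own function name `solve_linear_congruence`; it divides the congruence ax ≡ b (mod m) by d = (a, m), inverts the reduced coefficient with the built-in `pow(x, -1, n)` (available from Python 3.8), and then lists the d solutions, which differ by multiples of m/d. Adding multiples of m/d instead of subtracting them, as in the proof, produces the same classes modulo m.

```python
from math import gcd


def solve_linear_congruence(a: int, b: int, m: int) -> list:
    """All x in {0, ..., m-1} with a*x ≡ b (mod m), in increasing order."""
    if m <= 0:
        raise ValueError("the modulus m must be positive")
    d = gcd(a, m)
    if b % d != 0:
        return []  # no solutions unless d divides b
    a_red, b_red, m_red = a // d, b // d, m // d
    # gcd(a_red, m_red) == 1, so a_red is invertible modulo m_red.
    x0 = 0 if m_red == 1 else (b_red * pow(a_red, -1, m_red)) % m_red
    # The d incongruent solutions modulo m correspond to t = 0, ..., d - 1.
    return [x0 + t * m_red for t in range(d)]


print(solve_linear_congruence(6, 3, 9))  # [2, 5, 8]
```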
The solutions to the congruence in this corollary will play a crucial role in everything that
follows, so they deserve a special name.
Definition 16.4. Let a, m ∈ Z with m > 0 and (a, m) = 1. We call any integer solution of
the congruence ax ≡ 1 (mod m) an inverse of a modulo m.
Suppose that x0 ∈ Z is an inverse of a mod m. Then ax0 ≡ 1 (mod m) and we have the
following equalities in Z/mZ
[ax0 ] = [1] ⇐⇒ [a] ⋅ [x0 ] = [1].
Suppose x1 is another inverse of a mod m. Corollary 11 shows that x1 is congruent to x0
mod m so that [x1 ] = [x0 ]. In other words, the inverse of a mod m is unique when viewed
as an element of Z/mZ. This is summarized by the following definition.
Definition 16.5. Let a, m ∈ Z with m > 0 and (a, m) = 1. The congruence class [x0 ] in
Z/mZ which satisfies [a] ⋅ [x0 ] = [1] is called the inverse of [a] in Z/mZ. We denote it by
[a]^{-1} .
Remark 16.6. We note that the use of the term ‘inverse’ and the notation a^{-1} is analogous
to that of the real numbers. Indeed, for all a ∈ R≠0 , we call 1/a the ‘inverse’ of a, which we
also denote as a^{-1} . This number is also the unique number satisfying a ⋅ (1/a) = 1.
In practice, despite the fact that a^{-1} makes no sense as an integer, we write a^{-1} (mod m) to
denote the smallest positive representative of the congruence class [a]^{-1} . As an example, in
the following tables, we list the inverses modulo m = 10 and m = 5.
Examples 16.7.
(1) For m = 10,
a (mod 10)        0  1  2  3  4  5  6  7  8  9
a^{-1} (mod 10)   −  1  −  7  −  −  −  3  −  9
(2) For m = 5,
a (mod 5)        0  1  2  3  4
a^{-1} (mod 5)   −  1  3  2  4
Note that there are many integers a ≢ 0 (mod 10) which are not invertible, while for m = 5,
only those congruent to zero have no inverse. This behavior for m = 5 holds more generally
for all prime numbers.
Corollary 12. Let a, p ∈ Z with p a prime and a ≢ 0 (mod p). Then a has an inverse mod p.
Proof. Since p is prime and a ≢ 0 (mod p) we have (a, p) = 1. The result follows from
Corollary 11. □
To find the inverses for m = 5, 10 we only have to try a few possibilities due to the small size
of m. In general, to compute a^{-1} (mod m), we need to solve the linear Diophantine equation
ax + my = 1 using the Euclidean Algorithm and back substitution.
Example 16.8. We will compute 17^{-1} (mod 55). Here, we need to solve 17x ≡ 1 (mod 55).
We will do this by finding x0 , y0 satisfying 17x0 + 55y0 = 1, because taking this equality
mod 55 gives precisely 17x0 ≡ 1 (mod 55). This means that x0 (mod 55) will be the inverse
of 17 mod 55 that we are looking for. First, we find (17, 55) using the Euclidean Algorithm:
55 = 17 ⋅ 3 + 4
17 = 4 ⋅ 4 + 1
4 = 1 ⋅ 4 + 0,
so (17, 55) = 1. Secondly, we find a solution (x0 , y0 ) to 17x + 55y = (17, 55) = 1 using back
substitution:
(17, 55) = 1 = 17 − 4 ⋅ 4 = 17 − 4(55 − 17 ⋅ 3)
= 17 − 4 ⋅ 55 + 12 ⋅ 17
= 17 ⋅ 13 − 55 ⋅ 4,
so x0 = 13 and y0 = −4. We conclude that 17 ⋅ 13 ≡ 1 (mod 55) and
[17]^{-1} = [13] in Z/55Z.
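The computation of Example 16.8 can be automated. The Python sketch below, with our own function name `mod_inverse`, runs the Euclidean Algorithm on a and m while tracking only the coefficient of a, which replaces the back substitution; it returns `None` when (a, m) ≠ 1. In Python 3.8 and later, the built-in call `pow(a, -1, m)` gives the same inverse for coprime a and m.

```python
def mod_inverse(a: int, m: int):
    """Least non-negative inverse of a modulo m, or None if (a, m) != 1."""
    if m <= 0:
        raise ValueError("the modulus m must be positive")
    # Invariant: old_x * a ≡ old_r (mod m) and x * a ≡ r (mod m).
    old_r, r = a % m, m
    old_x, x = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
    if old_r != 1:
        return None  # a and m are not coprime, so no inverse exists
    return old_x % m


print(mod_inverse(17, 55))                      # 13
print([mod_inverse(a, 10) for a in range(10)])  # the table for m = 10
```

The second line prints `[None, 1, None, 7, None, None, None, 3, None, 9]`, matching the first table of Examples 16.7.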
Proposition 18. Let a, m be coprime integers with m > 0 and let k ∈ Z>0 . Then
(a^k)^{-1} ≡ (a^{-1})^k (mod m).
In other words, the previous proposition shows that mod m the inverse of a power is the
same power of the inverse; therefore, we may use the notation a^{-k} (mod m) to denote both
(a^k)^{-1} and (a^{-1})^k mod m.
Exercises.
Exercise 16.9. Prove that the inverse of the inverse of a modulo m is a. More precisely, let
a^{-1} be an inverse of a modulo m and prove that (a^{-1})^{-1} ≡ a (mod m).
Exercise 16.10. Let a^{-1} be an inverse of a modulo m and let b^{-1} be an inverse of b modulo
m. Prove that a^{-1} b^{-1} is an inverse of ab modulo m.
Exercise 16.11. Find all least non-negative incongruent solutions of 623x ≡ 511 (mod 679).
17. The Chinese Remainder Theorem
The Chinese Remainder Theorem (CRT) is a tool that allows us to solve this and many
other systems of congruences.
Theorem 20 (Chinese Remainder Theorem). Let n1 , n2 , . . . , nk ∈ Z>0 be pairwise coprime
and b1 , b2 , . . . , bk ∈ Z. Consider the system of congruences
         x ≡ b1 (mod n1 )
         x ≡ b2 (mod n2 )
(17.1)       ⋮
         x ≡ bk (mod nk ).
Write m = n1 n2 ⋯nk . Then the system (17.1) has a solution, and any two solutions x, x′
satisfy x ≡ x′ (mod m), that is, there is a unique solution modulo m.
Proof of CRT. This proof has two parts. Namely, we first show that a solution exists by
constructing it explicitly, and then we show that this solution is unique modulo (n1 n2 ⋯nk ).
Existence. Let m = n1 n2 ⋯nk and mi = m/ni . Since (ni , nj ) = 1 for all i ≠ j, we have
(mi , ni ) = 1, therefore the congruence equation mi y ≡ 1 (mod ni ) has a solution yi . Consider
the integer
x = b1 m1 y1 + b2 m2 y2 + ⋯ + bk mk yk
and observe that ni ∣ mj for all j ≠ i. Proposition 19 now implies
x ≡ 0 + 0 + ⋯ + bi mi yi + ⋯ + 0 ≡ bi mi yi (mod ni )
≡ bi (mod ni ),
where the last congruence follows because mi yi ≡ 1 (mod ni ).
Uniqueness. Suppose x, x′ ∈ Z are two solutions to the system in the statement of CRT. This
means that x ≡ bi ≡ x′ (mod ni ) for all i, and hence ni ∣ x − x′ for all i. From Proposition 12
we conclude that x − x′ is divisible by lcm(n1 , n2 , . . . , nk ). Since ni are pairwise coprime,
Proposition 13 tells us that
m = n1 n2 ⋯nk = lcm(n1 , n2 , . . . , nk ).
Then m ∣ (x − x′ ) ⇐⇒ x ≡ x′ (mod m) as desired. □
Proof. Clearly x = 1 and x = −1 satisfy the above systems, respectively. It follows from the
uniqueness part of the CRT that there are no other solutions modulo (n1 ⋯nk ). □
We observe that the proof of the CRT is an effective proof. That is, the proof of existence
provides us with a method to compute the solution x mod n1 ⋯nk .
Corollary 14. Consider a system of congruences as in (17.1). Let m = n1 n2 ⋯nk and
mi = m/ni . Since (ni , nj ) = 1 for all i ≠ j, we have (mi , ni ) = 1, so that mi y ≡ 1 (mod ni )
has a solution yi . Then
x = b1 m1 y1 + b2 m2 y2 + ⋯ + bk mk yk
is a solution to (17.1).
We illustrate this method with a few examples.
Example 17.2. Consider again
x ≡ 3 (mod 7) and x ≡ 2 (mod 3).
In the notation of the theorem and its proof we have b1 = 3, b2 = 2,
n1 = 7, n2 = 3, m = 3 ⋅ 7 = 21, m1 = m/n1 = 3, m2 = m/n2 = 7
and for i = 1, 2 we have to solve mi y ≡ 1 (mod ni ). Indeed,
i = 1: 3y ≡ 1 (mod 7) ⟹ y1 ≡ 5 (mod 7),
i = 2: 7y ≡ 1 (mod 3) ⟹ y2 ≡ 1 (mod 3).
Thus
x = b1 m1 y1 + b2 m2 y2 ≡ 3 ⋅ 3 ⋅ 5 + 2 ⋅ 7 ⋅ 1 ≡ 45 + 14 ≡ 17 (mod 21),
as expected.
Example 17.3. Find 17^{-1} (mod 55). We have to solve 17x ≡ 1 (mod 55). Since 55 = 5 ⋅ 11,
by Proposition 19, any solution to the previous congruence will also satisfy the following
congruences
{ 17x ≡ 1 (mod 5), 17x ≡ 1 (mod 11) } ⇐⇒ { 2x ≡ 1 (mod 5), 6x ≡ 1 (mod 11) }.
Observe that the latter system is not yet ready to be solved using CRT, because the variable
x appears with coefficients different from 1. To make the coefficients equal to 1, we have to
multiply each equation by the corresponding inverse. Note that 3 ⋅ 2 ≡ 1 (mod 5), hence
2x ≡ 1 (mod 5) ⇐⇒ x ≡ 3 (mod 5),
and using 6 ⋅ 2 ≡ 1 (mod 11), we proceed similarly for the second congruence. This leads to
the equivalent system
{ x ≡ 3 (mod 5), x ≡ 2 (mod 11) }
to which we can now apply the CRT. In this case, we have
n1 = 5, n2 = 11, b1 = 3, b2 = 2,
so
m = 5 ⋅ 11 = 55, m1 = m/n1 = 11, m2 = m/n2 = 5
and we have to solve mi y ≡ 1 (mod ni ) for i = 1, 2. Indeed,
i = 1: 11y ≡ 1 (mod 5) ⟹ y1 = 1,
i = 2: 5y ≡ 1 (mod 11) ⟹ y2 = −2,
and the solution is given by
x ≡ b1 m1 y1 + b2 m2 y2 (mod m)
≡ 3 ⋅ 11 ⋅ 1 + 2 ⋅ 5 ⋅ (−2) (mod 55)
≡ 33 − 20 ≡ 13 (mod 55)
as computed in Example 16.8.
Example 17.4. In Section 14 we computed 8^{10003} (mod 105) by using fast modular
exponentiation; here, we give an alternative calculation using CRT. We want to find an integer
x ≡ 8^{10003} (mod 105) such that 0 ≤ x < 105. In particular, since 105 = 3 ⋅ 5 ⋅ 7, by Proposition 19,
we know that x satisfies
x ≡ 8^{10003} (mod 3)
x ≡ 8^{10003} (mod 5)
x ≡ 8^{10003} (mod 7)
and applying CRT will give the number we need. Before we proceed, we will simplify the
congruences above. First note that
8 ≡ −1 (mod 3) ⟹ x ≡ (−1)^{10003} ≡ −1 (mod 3)
8 ≡ −2 (mod 5) ⟹ x ≡ (−2)^{10003} ≡ r (mod 5)
8 ≡ 1 (mod 7) ⟹ x ≡ 1^{10003} ≡ 1 (mod 7).
To find r, we observe (−2)^4 ≡ 16 ≡ 1 (mod 5), thus
x ≡ r ≡ (−2)^{10003} ≡ (−2)^{10000} ⋅ (−2)^3 ≡ ((−2)^4)^{2500} ⋅ (−2)^3 ≡ 1 ⋅ (−8) ≡ 2 (mod 5).
Therefore, we have to apply CRT to the congruences
x ≡ −1 (mod 3), x ≡ 2 (mod 5), x ≡ 1 (mod 7).
In this case, we have b1 = −1, b2 = 2, b3 = 1,
n1 = 3, n2 = 5, n3 = 7, m = 105, m1 = 35, m2 = 21, m3 = 15
and we need to solve the congruences
35y ≡ 1 (mod 3), 21y ≡ 1 (mod 5), 15y ≡ 1 (mod 7).
We can take, respectively, the solutions y1 ≡ −1, y2 ≡ 1 and y3 ≡ 1, from which we obtain
x ≡ b1 m1 y1 + b2 m2 y2 + b3 m3 y3 (mod m)
≡ (−1) ⋅ 35 ⋅ (−1) + 2 ⋅ 21 ⋅ 1 + 1 ⋅ 15 ⋅ 1 (mod 105)
≡ 35 + 42 + 15 ≡ 92 (mod 105).
Remark 17.5. Since −1 ≡ 2 (mod 3), we could have rewritten the system in the previous
example as
x ≡ 2 (mod 3), x ≡ 2 (mod 5), x ≡ 1 (mod 7)
and grouped the first two congruences together into
x ≡ 2 (mod 15), x ≡ 1 (mod 7)
and applied CRT with these two congruences instead.
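The construction in the existence part of the proof, as restated in Corollary 14, can be sketched in Python as follows. The function name `crt` is ours; the inverses yi are obtained with the built-in `pow(mi, -1, ni)` (Python 3.8 or later), which raises a `ValueError` when mi and ni are not coprime, that is, when the moduli are not pairwise coprime.

```python
from math import prod


def crt(residues: list, moduli: list) -> int:
    """Least non-negative x with x ≡ b_i (mod n_i) for pairwise coprime n_i."""
    m = prod(moduli)
    x = 0
    for b_i, n_i in zip(residues, moduli):
        m_i = m // n_i
        y_i = pow(m_i, -1, n_i)      # solves m_i * y ≡ 1 (mod n_i)
        x += b_i * m_i * y_i
    return x % m


print(crt([3, 2], [7, 3]))          # 17, Example 17.2
print(crt([3, 2], [5, 11]))         # 13, Example 17.3
print(crt([-1, 2, 1], [3, 5, 7]))   # 92, Example 17.4
```

The call `crt([2, 1], [15, 7])` also returns 92, in line with Remark 17.5.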
Exercises.
Exercise 17.6. Solve the following ancient Indian problem: If eggs are removed from a
basket 2, 3, 4, 5 and 6 at a time, there remain respectively, 1, 2, 3, 4 and 5 eggs. But if the
eggs are removed 7 at a time, no eggs remain. What is the least number of eggs that could
have been in the basket?
18. Applications of Congruences
Here we will explore a couple of applications of the theory we have developed so far.
18.1. Divisibility Tests. Here we will prove practical criteria to decide when a given in-
teger n is divisible by 3, 9, 11, or a power of 2. In particular, we will understand why the
following well known fact is true.
“A number is divisible by 3 if the sum of its digits is divisible by 3.”
Proposition 20. Let n ∈ Z>0 . Then n is divisible by 3 or 9 if and only if the sum of its
digits (in base 10) is divisible by 3 or 9, respectively.
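Proposition 20 can be tested numerically before it is proved. The short Python sketch below uses our own helper names `digit_sum` and `digit_test` and simply applies the criterion to the sum of the decimal digits of n.

```python
def digit_sum(n: int) -> int:
    """Sum of the base-10 digits of a positive integer n."""
    return sum(int(ch) for ch in str(n))


def digit_test(n: int, divisor: int) -> bool:
    """Divisibility test of Proposition 20; divisor must be 3 or 9."""
    if divisor not in (3, 9):
        raise ValueError("the digit-sum test applies to 3 and 9 only")
    return digit_sum(n) % divisor == 0


# Compare the digit test with direct division on a range of integers.
print(all(digit_test(n, 9) == (n % 9 == 0) for n in range(1, 10000)))
```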
18.2. The ISBN10 Code. In this section, we will apply congruences to describe the ISBN10
code and some of its properties. An ISBN10 code is a sequence of 10 digits, a1 , a2 ,. . . ,a10 ,
used to identify books, where
(i) 0 ≤ ai ≤ 9 for i = 1, . . . , 9;
(ii) a10 is an integer mod 11, where the letter X is used to denote 10 (mod 11).
An ISBN10 code is called valid if
S = ∑_{i=1}^{10} i ⋅ ai ≡ 0 (mod 11).
Examples 18.2.
(1) The code 0−321−50031−8 is valid because it satisfies
S = 1 ⋅ 0 + 2 ⋅ 3 + 3 ⋅ 2 + 4 ⋅ 1 + 5 ⋅ 5 + 6 ⋅ 0 + 7 ⋅ 0 + 8 ⋅ 3 + 9 ⋅ 1 + 10 ⋅ 8
≡ 16 + 49 + 89 ≡ 5 + 5 + 1 ≡ 0 (mod 11).
(2) The code 1−100−00000−X is invalid since
S = 1 ⋅ 1 + 2 ⋅ 1 + 10 ⋅ 10 ≡ 103 ≡ 4 ≢ 0 (mod 11).
Proposition 23. Let a1 , a2 , . . . , a9 be integers such that 0 ≤ ai ≤ 9 for i = 1, . . . , 9 and take
a10 = ∑_{i=1}^{9} i ⋅ ai (mod 11),
where we write X for a10 if a10 ≡ 10 (mod 11). Then, a1 a2 ⋯a10 is a valid ISBN10 code.
Proof. Since a10 ≡ ∑_{i=1}^{9} i ⋅ ai (mod 11), we have
S = ∑_{i=1}^{10} i ⋅ ai = (∑_{i=1}^{9} i ⋅ ai ) + 10a10 ≡ (∑_{i=1}^{9} i ⋅ ai ) + 10 (∑_{i=1}^{9} i ⋅ ai ) = 11 (∑_{i=1}^{9} i ⋅ ai ) ≡ 0 (mod 11).
□
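The validity condition and the check symbol of Proposition 23 are easy to compute. The following Python sketch uses hypothetical function names of our own (`isbn10_digits`, `weighted_sum`, `is_valid_isbn10`, `check_symbol`) and ignores any separators in the input string.

```python
def isbn10_digits(code: str) -> list:
    """Parse an ISBN10 string into [a1, ..., a10]; separators are ignored."""
    symbols = [ch for ch in code if ch.isdigit() or ch in "Xx"]
    if len(symbols) != 10:
        raise ValueError("an ISBN10 code has exactly 10 symbols")
    digits = []
    for position, ch in enumerate(symbols, start=1):
        if ch in "Xx":
            if position != 10:
                raise ValueError("X may only appear in the last position")
            digits.append(10)
        else:
            digits.append(int(ch))
    return digits


def weighted_sum(code: str) -> int:
    """The sum S = 1*a1 + 2*a2 + ... + 10*a10."""
    return sum(i * a for i, a in enumerate(isbn10_digits(code), start=1))


def is_valid_isbn10(code: str) -> bool:
    return weighted_sum(code) % 11 == 0


def check_symbol(first_nine: str) -> str:
    """The symbol a10 of Proposition 23 for given a1, ..., a9."""
    digits = [int(ch) for ch in first_nine if ch.isdigit()]
    if len(digits) != 9:
        raise ValueError("exactly nine digits are required")
    a10 = sum(i * a for i, a in enumerate(digits, start=1)) % 11
    return "X" if a10 == 10 else str(a10)


print(weighted_sum("0-321-50031-8"), is_valid_isbn10("0-321-50031-8"))  # 154 True
print(check_symbol("032150031"))  # 8
```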
Suppose that an ISBN10 code x = x1 ⋯x10 is transmitted and the code y = y1 ⋯y10 is received;
the transmission is successful if x = y. We say that y contains a single error if there exists a
single value of j such that
∀i ≠ j we have xi = yi and yj = xj + a with − 10 ≤ a ≤ 10, a ≠ 0.
We say that y contains a transposition error if there are j ≠ k such that
xj ≠ xk , yj = xk , yk = xj and yi = xi ∀i ≠ j, k.
Proposition 24. The ISBN10 code detects both single errors and transposition errors.
We will assume that a single error has occurred in the transmission and will show that the
received code y is not valid. Indeed, let j and a be as described above and compute
S_y = ∑_{i=1}^{10} i ⋅ yi = ∑_{i=1, i≠j}^{10} i ⋅ yi + j ⋅ yj = ∑_{i=1, i≠j}^{10} i ⋅ xi + jxj + ja = S_x + j ⋅ a ≡ ja (mod 11).
Exercises.
Exercise 18.3. Suppose that n = 81294358X. Write down a digit in the slot marked X so
that n is divisible by
(a) 11
(b) 9
(c) 4
Exercise 18.4. Suppose that one digit, indicated with a question mark, in each of the
following ISBN10 codes has been smudged and cannot be read. What should this missing
digit be?
(a) 0 − 19 − 8?3804 − 9
(b) ? − 261 − 05073 − X
19. Wilson’s Theorem
In order to prove this theorem, we will need the following result. Note that this result is also
relevant on its own.
Lemma 8. Let a, p ∈ Z with p a prime and a invertible mod p, that is, p ∤ a. Then
a ≡ a^{-1} (mod p) if and only if a ≡ ±1 (mod p).
Proof. Suppose first that a ≡ ±1 (mod p). Recall that a^{-1} is an integer satisfying aa^{-1} ≡ 1 (mod p).
Since 1⋅1 ≡ 1 (mod p) and (−1)(−1) ≡ 1 (mod p) we conclude, in both cases, that a ≡ a^{-1} (mod p).
Conversely, suppose a ≡ a^{-1} (mod p). Multiplying both sides by a then yields
a^2 ≡ 1 (mod p) ⇐⇒ a^2 − 1 = pk, for some k ∈ Z
⇐⇒ p ∣ (a − 1)(a + 1)
⟹ p ∣ (a − 1) or p ∣ (a + 1) by Corollary 6
⇐⇒ a ≡ 1 (mod p) or a ≡ −1 (mod p).
□
Remark 19.1. In Lemma 8, the condition that p is prime is necessary. For example, take
a = 3 and p = 8; since 3 ⋅ 3 = 9 ≡ 1 (mod 8), we have a^{-1} ≡ 3 ≡ a (mod 8) but 3 ≢ ±1 (mod 8).
Before we prove Wilson’s theorem, let us verify it via an example. This example illustrates
the main idea of the proof.
Example 19.2. Let p = 7. Wilson’s theorem tells us that (7 − 1)! = 6! ≡ −1 (mod 7). We
now verify this by direct computation.
6! = 6 ⋅ 5 ⋅ 4 ⋅ 3 ⋅ 2 ⋅ 1
= 1 ⋅ 6 ⋅ (2 ⋅ 4) ⋅ (3 ⋅ 5)
≡ 1 ⋅ 6 ⋅ 1 ⋅ 1 ≡ −1 (mod 7),
as expected. In the second equality, we note that we have reordered the integers in the
product. This reordering pairs the numbers in the brackets with their inverses mod p.
Suppose p > 3 is prime. We know that every $a \not\equiv 0 \pmod{p}$ has an inverse $a^{-1}$ which is
unique in the range $1 \le a^{-1} \le p - 1$. Also, by Lemma 8, only 1 and p − 1 are their own inverses.
Therefore the set S = {2, . . . , p − 2} contains p − 3 > 0 elements which can be grouped into
(p − 3)/2 pairs of the form $\{a, a^{-1}\}$. This generalizes the situation in Example 19.2,
where we have the pairs $\{a = 2, a^{-1} = 4\}$ and $\{a = 3, a^{-1} = 5\}$.
Now, the product of the elements of S satisfies
\[ 2 \cdot 3 \cdots (p - 2) \equiv (2 \cdot 2^{-1})(3 \cdot 3^{-1}) \cdots \equiv 1 \pmod{p}. \]
Multiplying this congruence by 1 on the left and p − 1 on the right gives
\[ (p - 1)! = 1 \cdot (2 \cdot 3 \cdots (p - 2)) \cdot (p - 1) \equiv 1 \cdot 1 \cdot (p - 1) \equiv -1 \pmod{p}. \]
□
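The pairing argument can be reproduced by machine. The sketch below is only an illustration: the helper names are hypothetical, and the modular inverse is computed with Python's built-in `pow(a, -1, p)`, which requires Python 3.8 or later.

```python
def factorial_mod(n, m):
    """Compute n! mod m by reducing after every multiplication."""
    result = 1
    for k in range(2, n + 1):
        result = (result * k) % m
    return result


def inverse_pairs(p):
    """Group the integers 2, ..., p - 2 into pairs (a, a^{-1}) modulo the prime p."""
    pairs = []
    seen = set()
    for a in range(2, p - 1):
        if a not in seen:
            inv = pow(a, -1, p)  # modular inverse of a mod p
            pairs.append((a, inv))
            seen.update({a, inv})
    return pairs


print(inverse_pairs(7))                 # the pairs used in Example 19.2
for p in (5, 7, 11, 13):
    print(p, factorial_mod(p - 1, p))   # each value should equal p - 1, i.e. -1 mod p
```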
Exercises.
Exercise 19.3. For each of the following congruences, find the least nonnegative integer x
that satisfies it.
(a) $\frac{60!}{31!} \equiv x \pmod{31}$
(b) $\frac{59!}{30!} \equiv x \pmod{31}$
20. Fermat’s Little Theorem
Claim: The integers in (20.1) are all distinct mod p and not congruent to zero mod p.
It follows from the claim that the sequence
a (mod p), 2a (mod p), . . . , (p − 1)a (mod p)
is comprised of p − 1 distinct integers in the interval [1, p − 1]. Hence, they must be the
integers 1, 2, . . . , p − 1 in some order (i.e. multiplication by a mod p is reordering them).
Therefore, by taking the product mod p of the elements in (20.1), we obtain
\[ a \cdot (2a) \cdot (3a) \cdots (p - 1)a \equiv 1 \cdot 2 \cdot 3 \cdots (p - 1) = (p - 1)! \pmod{p}. \]
Since we also have
\[ a (2a)(3a) \cdots (p - 1)a \equiv a^{p-1} (1 \cdot 2 \cdot 3 \cdots (p - 1)) = a^{p-1} (p - 1)! \pmod{p}, \]
it follows that
\[ a^{p-1} (p - 1)! \equiv (p - 1)! \pmod{p}. \]
Now, by Wilson’s theorem, we conclude that
\[ a^{p-1} (-1) \equiv -1 \pmod{p} \iff a^{p-1} \equiv 1 \pmod{p}, \]
as desired. To complete the proof, it remains to prove the claim.
Proof of Claim: Suppose $ka \equiv k'a \pmod{p}$. Note that $a^{-1}$ exists since (a, p) = 1. Then,
multiplying the previous congruence by $a^{-1}$, we obtain
\[ ka \equiv k'a \pmod{p} \iff k(a a^{-1}) \equiv k'(a a^{-1}) \pmod{p} \implies k \equiv k' \pmod{p} \implies k = k', \]
where the last implication follows from Corollary 8 because 1 ≤ k, k′ ≤ p − 1. Finally, since
p ∤ a and p ∤ k, we conclude $ka \not\equiv 0 \pmod{p}$ for all ka in (20.1), completing the proof. □
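The key step of the proof, that multiplication by a permutes the nonzero residues mod p, is easy to observe on a small case. The following Python sketch uses p = 7 and a = 3 as arbitrary sample values; the function name is hypothetical.

```python
def multiples_mod_p(a, p):
    """Return the list a, 2a, ..., (p - 1)a reduced mod p."""
    return [(k * a) % p for k in range(1, p)]


p, a = 7, 3
row = multiples_mod_p(a, p)
print(row)                                # a rearrangement of 1, ..., p - 1
print(sorted(row) == list(range(1, p)))   # True when the multiples are a permutation
print(pow(a, p - 1, p))                   # a^(p-1) mod p, computed by fast exponentiation
```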
Exercises.
Exercise 20.3. Let p and q be distinct odd prime numbers with p − 1 ∣ q − 1. If a ∈ Z with
(a, pq) = 1, prove that $a^{q-1} \equiv 1 \pmod{pq}$.
21. Primality Testing, Pseudoprimes, and Carmichael Numbers
Proof. Suppose that n is a composite number such that (n − 1)! ≡ −1 (mod n). In particular,
say n factors into n = a ⋅ b where 1 < a, b < n. We observe that a ≤ n − 1, so a ∣ (n − 1)!.
Moreover,
(n − 1)! ≡ −1 (mod n) ⇐⇒ n ∣ (n − 1)! + 1.
Lastly, since a ∣ n and n ∣ (n − 1)! + 1, we have a ∣ (n − 1)! + 1, and in particular, a divides the
difference,
a ∣ ((n − 1)! + 1 − (n − 1)!) = 1 ⟹ a = 1.
This contradicts a > 1, hence n is prime. □
We remark that this proposition, together with Wilson’s theorem, shows that the condition
(n − 1)! ≡ −1 (mod n) is equivalent to n being prime. This can be very helpful for theoretical
arguments, but in practice it is not a good test: no efficient method for computing (n − 1)! mod n
is known, and the direct computation requires about n multiplications.
The following test is much better in practice.
Theorem 23 (Fermat’s Test). Let n, b ∈ Z>1 with 1 < b < n.
If $b^{n-1} \not\equiv 1 \pmod{n}$, then n is composite.
Unlike the condition (n − 1)! ≡ −1 (mod n), Fermat’s test does not characterize prime numbers. That is,
the converse of the theorem is false: $b^{n-1} \equiv 1 \pmod{n}$
does not necessarily mean that n is prime, as the following example illustrates.
Example 21.2. Taking n = 341 and b = 2, we observe that $2^{340} \equiv 1 \pmod{341}$ but
341 = 11 ⋅ 31, so 341 is not prime.
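Fermat’s test is cheap because modular exponentiation is fast. A minimal Python sketch follows; the function name is hypothetical, and the built-in three-argument `pow` performs the modular exponentiation.

```python
def passes_fermat_test(n, b):
    """Return True when b^(n-1) ≡ 1 (mod n); a False result proves n composite."""
    return pow(b, n - 1, n) == 1


print(passes_fermat_test(341, 2))  # 341 = 11 * 31 is composite, yet it passes in base 2
print(passes_fermat_test(341, 3))  # base 3 reveals that 341 is composite
print(passes_fermat_test(97, 5))   # a prime passes for every base 1 < b < n
```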
The previous example shows that 341 passes Fermat’s test in base 2 but not in base 3. It is
natural to wonder if there are integers n that pass Fermat’s test in every base coprime to n.
Definition 21.5. We call an integer n > 1 a Carmichael number if it is a pseudoprime for
every base b ≥ 2 such that (n, b) = 1.
It is not easy to prove that Carmichael numbers actually exist. The following theorem
classifies them, allowing us to decide if an integer is a Carmichael number without checking
the definition. For now, we will only prove one implication of the theorem, as the other
direction (Theorem 44) requires the notion of primitive roots, which will only be introduced
in Section 26.
Definition 21.6. We say that an integer n is squarefree if no square number divides it. In
particular, the prime factorization of n contains only primes with exponent one.
Theorem 24 (Korselt). A composite positive integer n is a Carmichael number if and only
if
(i) n is squarefree and
(ii) if p ∣ n is prime then p − 1 ∣ n − 1.
Proof. For now, we will only prove one implication. Suppose (i) and (ii) hold for n, and let
b ∈ Z satisfy (b, n) = 1.
From (i), we have $n = p_1 \cdots p_k$ with the $p_i$ distinct primes. Then $(b, p_i) = 1$ for i = 1, . . . , k.
From (ii), we have, for i = 1, . . . , k, that $n - 1 = (p_i - 1) k_i$ for some $k_i \in \mathbb{Z}$. Then
\[ b^{n-1} \equiv (b^{p_i - 1})^{k_i} \equiv 1 \pmod{p_i}, \]
where the second congruence follows from FLT. Therefore, the system of congruences
\[ \begin{cases} x \equiv 1 \pmod{p_1} \\ \quad \vdots \\ x \equiv 1 \pmod{p_k} \end{cases} \]
has the solution $x = b^{n-1}$. Clearly, x = 1 is also a solution to the above system. From
the uniqueness part of CRT, we have $b^{n-1} \equiv 1 \pmod{n}$, where $n = p_1 \cdots p_k$. This shows that n is a
pseudoprime for base b. Since b was an arbitrary integer coprime to n, this holds for every
base b ≥ 2 with (b, n) = 1, so that n is a Carmichael number. □
Remark 21.7. In the previous proof, we could replace CRT by the following argument. For
all i = 1, . . . , k, we have $b^{n-1} \equiv 1 \pmod{p_i}$, hence $p_i \mid b^{n-1} - 1$. Then $\mathrm{lcm}(p_1, \dots, p_k) \mid b^{n-1} - 1$ by
Proposition 12. Since the $p_i$ are distinct primes,
\[ \mathrm{lcm}(p_1, \dots, p_k) = p_1 \cdots p_k = n, \]
hence $b^{n-1} \equiv 1 \pmod{n}$.
Example 21.8. The number 561 is the smallest Carmichael number. Indeed, 561 = 3 ⋅ 11 ⋅ 17
and 3 − 1 = 2, 11 − 1 = 10, and 17 − 1 = 16 all divide $561 - 1 = 560 = 2^4 \cdot 5 \cdot 7$.
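Conditions (i) and (ii) of Theorem 24 are simple to test once n is factored. The sketch below is a hedged illustration using trial division, which is adequate only for small n; the helper names are hypothetical.

```python
def prime_factorization(n):
    """Trial-division factorization, returned as a dict {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def satisfies_korselt(n):
    """Check that n is composite, squarefree, and p - 1 divides n - 1 for each prime p | n."""
    factors = prime_factorization(n)
    is_composite = sum(factors.values()) >= 2
    is_squarefree = all(e == 1 for e in factors.values())
    divisibility = all((n - 1) % (p - 1) == 0 for p in factors)
    return is_composite and is_squarefree and divisibility


# Integers below 1200 meeting both conditions; the first one found is 561.
print([n for n in range(2, 1200) if satisfies_korselt(n)])
```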
To conclude this section, we describe a primality test which is a refinement of Fermat’s test.
21.1. Miller’s Test. Let n > 0 be odd and suppose n is a pseudoprime for the base b ≥ 2.
That is,
\[ b^{n-1} \equiv 1 \pmod{n}. \]
Write $x = b^{(n-1)/2} \pmod{n}$. If n is prime, since $x^2 \equiv b^{n-1} \equiv 1 \pmod{n}$, it follows from Lemma 8
that $x \equiv \pm 1 \pmod{n}$. So, if $b^{(n-1)/2} \not\equiv \pm 1 \pmod{n}$, then n is composite.
Suppose we failed to conclude that n is composite in the previous step. If $b^{(n-1)/2} \equiv 1 \pmod{n}$
and n − 1 is divisible by 4, then we can repeat the argument with $y = b^{(n-1)/4}$.
Indeed, $y^2 \equiv b^{(n-1)/2} \equiv 1 \pmod{n}$ implies $y \equiv \pm 1 \pmod{n}$ if n is prime. Then, if we have
$b^{(n-1)/4} \not\equiv \pm 1 \pmod{n}$, we conclude that n is composite. If we fail again to conclude that n is
composite, we can repeat this procedure as long as $(n-1)/2^k$ is an integer and $b^{(n-1)/2^{k-1}} \equiv 1 \pmod{n}$.
Example 21.9. We have seen that n = 561 is the smallest Carmichael number. In particular,
\[ b^{560} \equiv 1 \pmod{561} \text{ for all } b \ge 2 \text{ satisfying } (b, n) = 1. \]
Let b = 5. Then $5^{280} \equiv 67 \not\equiv \pm 1 \pmod{561}$, so that n is composite by Miller’s test.
Let b = 2; we have $2^{280} \equiv 1 \pmod{561}$ but $2^{140} \equiv 67 \not\equiv \pm 1 \pmod{561}$, and we conclude again
that n is composite. Note, however, that depending on the base b we may need a different
number of steps in Miller’s test.
There are integers which fool the test, and we often refer to these integers as strong pseudo-
primes.
Example 21.10. Let n = 2047 = 23 ⋅ 89. Then
\[ 2^{2046} = (2^{11})^{186} = 2048^{186} \equiv 1 \pmod{2047}, \]
so n is a pseudoprime in base b = 2. Moreover,
\[ \frac{n-1}{2} = 1023 \quad \text{and} \quad 2^{1023} = (2^{11})^{93} = 2048^{93} \equiv 1 \pmod{2047}, \]
so 2047 fools Miller’s Test for base b = 2.
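The halving procedure of Section 21.1 can be written as a short loop. The Python sketch below follows that description under the assumption that n is odd and n > 2; the function name and the choice of returning a boolean are illustrative.

```python
def millers_test(n, b):
    """Return False if the halving procedure proves n composite, True otherwise.

    Assumes n is odd and n > 2. A True result means only that n was not
    shown to be composite in base b.
    """
    if pow(b, n - 1, n) != 1:
        return False        # n already fails Fermat's test in base b
    e = n - 1
    while e % 2 == 0:
        e //= 2
        x = pow(b, e, n)
        if x == n - 1:
            return True     # reached -1: the procedure stops without a conclusion
        if x != 1:
            return False    # a square root of 1 other than ±1, so n is composite
    return True             # every power was 1 and the exponent is now odd


print(millers_test(561, 5))    # composite is detected at the exponent 280
print(millers_test(561, 2))    # composite is detected at the exponent 140
print(millers_test(2047, 2))   # 2047 = 23 * 89 is not detected in base 2
```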
j in 0 ≤ j ≤ s − 1.
We have seen that Carmichael numbers fool Fermat’s test for every base. The following
theorem, which we will not prove, shows that this is not possible for Miller’s test.
Theorem 25. Let n ∈ Z>0 be odd and composite. Then n fools Miller’s test for at most (n − 1)/4
bases b such that 1 ≤ b ≤ n − 1.
Based on this theorem, there is the following very practical primality test.
Theorem 26 (Rabin’s probabilistic test). Let n ∈ Z>0 be odd. Choose
b1 , . . . , bk ∈ Z independently and uniformly at random such that 1 < bi ≤ n − 1. If n is
composite, then the probability that it passes Miller’s test for all bi is less than $1/4^k$.
Exercises.
Exercise 21.12. Prove that 1729 is a Carmichael number.
Exercise 21.13. Use Miller’s Test in base b = 2 to show that 1729 is composite.
22. Euler’s φ-Function and Euler’s Theorem
Fermat’s Little Theorem tells us that the (p − 1)-th power of any integer coprime to p is con-
gruent to one mod p. In this section we will study Euler’s theorem which generalizes this idea
to any congruence modulus m. In other words, for any fixed m, Euler’s theorem determines
y > 0 (depending on m) such that, for all a ∈ Z coprime to m, we have $a^y \equiv 1 \pmod{m}$.
To state Euler’s theorem we first need to introduce a very important function.
Definition 22.1. The Euler φ-function is the function φ ∶ Z>0 → Z>0 defined by
φ(n) = # {x ∈ Z ∶ 1 ≤ x ≤ n and (x, n) = 1} .
In words, it counts the number of positive integers up to n that are coprime to n.
Examples 22.2.
(1) φ(1) = φ(2) = 1;
(2) φ(3) = 2 since both 1 and 2 are coprime to 3;
(3) φ(6) = 2 since, from {1, 2, 3, 4, 5, 6}, only 1 and 5 are coprime to 6;
(4) For any prime p, since p ∤ x if x < p, we have
φ(p) = # {x ∈ Z ∶ 1 ≤ x ≤ p and (x, p) = 1} = # {x ∈ Z ∶ 1 ≤ x ≤ p − 1} = p − 1.
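Definition 22.1 translates directly into a counting loop. The following Python sketch is a literal, unoptimized transcription; the function name `phi` is an illustrative choice.

```python
from math import gcd


def phi(n):
    """Count the integers x with 1 <= x <= n and gcd(x, n) = 1."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


print([phi(n) for n in range(1, 11)])
```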
Theorem 27 (Euler). Let a, m ∈ Z with m > 0 and (a, m) = 1. Then,
\[ a^{\varphi(m)} \equiv 1 \pmod{m}. \]
Observe that, as a direct consequence of Example 22.2 (4) and Euler’s theorem, we recover
FLT.
Corollary 17. Let p be a prime and let a ∈ Z with p ∤ a. Then φ(p) = p − 1 and $a^{p-1} \equiv 1 \pmod{p}$.
Proof of Euler’s Theorem. Let a ∈ Z satisfy (a, m) = 1. From the definition of φ(m), there
are φ(m) distinct positive integers, $a_1, \dots, a_{\varphi(m)}$, such that $a_i \le m$ and $(a_i, m) = 1$. Consider
the sequence of integers
\[ a \cdot a_1,\ a \cdot a_2,\ \dots,\ a \cdot a_{\varphi(m)}. \tag{22.3} \]
Claim. The integers in (22.3) are all distinct mod m, satisfy $(a \cdot a_i, m) = 1$, and are not
congruent to zero mod m.
It follows from the claim that the mod m sequence
\[ a \cdot a_1 \pmod{m},\ a \cdot a_2 \pmod{m},\ \dots,\ a \cdot a_{\varphi(m)} \pmod{m} \]
is made of φ(m) distinct integers in the interval [1, m − 1] which are coprime to m (by Proposition 16).
Since the integers with these properties are exactly $a_1, a_2, \dots, a_{\varphi(m)}$, we conclude that the
mod m sequence must be the integers $a_1, a_2, \dots, a_{\varphi(m)}$ in some order (i.e. multiplication by
a is reordering them). Therefore, by taking their product, we get
\[ (a \cdot a_1)(a \cdot a_2) \cdots (a \cdot a_{\varphi(m)}) \equiv a_1 a_2 \cdots a_{\varphi(m)} \pmod{m}
\iff a^{\varphi(m)} (a_1 a_2 \cdots a_{\varphi(m)}) \equiv a_1 a_2 \cdots a_{\varphi(m)} \pmod{m}. \]
Write $A = a_1 a_2 \cdots a_{\varphi(m)}$. Clearly, (A, m) = 1, therefore A is invertible mod m, and multiplying
the last congruence by $A^{-1}$ yields
\[ a^{\varphi(m)} \equiv 1 \pmod{m}, \]
as desired. To complete the proof, we now prove the claim.
Proof of Claim. Suppose $a \cdot a_i \equiv a \cdot a_j \pmod{m}$. Since (a, m) = 1, the inverse $a^{-1}$ exists so
we can cancel the a in the previous congruence to obtain
\[ a_i \equiv a_j \pmod{m} \text{ with } 0 \le a_i, a_j \le m - 1. \]
It now follows from Corollary 8 that $a_i = a_j$. Suppose $(a \cdot a_i, m) > 1$ for some i. Then there
exists a prime p such that $p \mid a a_i$ and p ∣ m; hence (p ∣ a and p ∣ m) or ($p \mid a_i$ and p ∣ m). This
implies (a, m) > 1 or $(a_i, m) > 1$, a contradiction. We conclude $(a \cdot a_i, m) = 1$. Clearly $a \cdot a_i \not\equiv 0 \pmod{m}$,
otherwise $m \mid a \cdot a_i$, completing the proof. □
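The reordering step in the proof can be observed on a small modulus. The Python sketch below uses m = 15 and a = 2 as arbitrary sample values; the function names are hypothetical.

```python
from math import gcd


def reduced_residues(m):
    """Return the integers a_1 < ... < a_phi(m) in [1, m] that are coprime to m."""
    return [x for x in range(1, m + 1) if gcd(x, m) == 1]


def multiply_residues(a, m):
    """Return the sequence a * a_i mod m, in the order of reduced_residues(m)."""
    return [(a * x) % m for x in reduced_residues(m)]


m, a = 15, 2
print(reduced_residues(m))
print(multiply_residues(a, m))                 # the same integers in a different order
print(pow(a, len(reduced_residues(m)), m))     # a^phi(m) mod m
```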
Remark 22.4. Since Euler’s theorem implies FLT (see Corollary 17), the previous proof, when
restricted to m = p a prime, must also provide a proof of Fermat’s Little Theorem. Indeed,
comparing both proofs, we see that the main difference is that instead of using Wilson’s
theorem, we used the fact that $A = a_1 a_2 \cdots a_{\varphi(m)}$ is invertible. Of course A is invertible in the
proof of FLT, since there $A = (p - 1)! \equiv -1 \pmod{p}$ by Wilson’s theorem.
Definition 22.5. A set of integers with φ(m) elements which are coprime to m such that
no two of them are congruent modulo m is called a reduced residue system modulo m.
22.1. A Formula for φ. The following theorem gives a formula to compute φ(n). We will
prove this formula in Section 23 when studying arithmetic functions. For the moment, we
are interested in using the formula to illustrate different kinds of calculations involving the
function φ(n).
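Theorem 28. Let n ∈ Z>1 have factorization $n = p_1^{a_1} \cdots p_k^{a_k}$, with $a_j \ge 1$ and the $p_j$ distinct primes.
Then φ(n) is given by the formula
\[ \varphi(n) = n \left(1 - \frac{1}{p_1}\right) \left(1 - \frac{1}{p_2}\right) \cdots \left(1 - \frac{1}{p_k}\right) = \prod_{j=1}^{k} p_j^{a_j - 1} (p_j - 1). \tag{22.6} \]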
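Formula (22.6) can be compared against the definition of φ on a range of inputs. The sketch below uses the product form of the formula together with trial division; both function names are hypothetical, and the approach is only suitable for small n.

```python
from math import gcd


def phi_by_counting(n):
    """Euler's function computed directly from Definition 22.1."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def phi_by_formula(n):
    """Euler's function as the product of p^(a-1) * (p - 1) over prime powers p^a of n."""
    result = 1
    m = n
    p = 2
    while p * p <= m:
        if m % p == 0:
            a = 0
            while m % p == 0:
                m //= p
                a += 1
            result *= p ** (a - 1) * (p - 1)
        p += 1
    if m > 1:              # one prime factor with exponent 1 is left over
        result *= m - 1
    return result


print(phi_by_formula(100), phi_by_counting(100))
```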
Examples 22.7.
Exercises.
Exercise 22.11. Find a reduced residue system modulo each integer below
(i) 15
(ii) 18
(iii) p, where p is a prime number
(iv) $2^n$, where n is a positive integer
(v) For each of (i) and (ii), give another solution sharing exactly one element with your
previous solution
Exercise 22.12. Prove that $9^8 \equiv 1 \pmod{16}$ by following the steps in the proof of Euler’s
Theorem.
23. Arithmetic Functions
We have already encountered in Section 22.1 a very important function, the Euler-φ function.
In this section, we will study other relevant functions in number theory; in particular, we’ll
focus on those which are ‘multiplicative’, a property that sometimes allows us to derive formulas
for the functions we consider.
Definition 23.1. A function whose domain is Z>0 is called an arithmetic function.
Examples 23.2.
(1) f (n) = 1 for all n ∈ Z>0 ;
(2) f (n) = n for all n ∈ Z>0 ;
(3) φ(n), the Euler φ-function;
(4) τ (n) = the number of positive divisors of n;
(5) σ(n) = the sum of the positive divisors of n.
Example 23.3. The positive divisors of 6 are {1, 2, 3, 6}. Therefore,
τ (6) = 4 and σ(6) = 1 + 2 + 3 + 6 = 12.
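The functions τ and σ can be computed straight from their definitions by listing divisors. The Python sketch below is a naive illustration with hypothetical function names.

```python
def positive_divisors(n):
    """Return the positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def tau(n):
    """Number of positive divisors of n."""
    return len(positive_divisors(n))


def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(positive_divisors(n))


print(positive_divisors(6), tau(6), sigma(6))
```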
Definition 23.4. Let f be an arithmetic function. We say that f is multiplicative if, for all
n1 , n2 ∈ Z>0 satisfying (n1 , n2 ) = 1, we have
f (n1 ⋅ n2 ) = f (n1 ) ⋅ f (n2 )
and we say f is completely multiplicative if
f (n1 ⋅ n2 ) = f (n1 ) ⋅ f (n2 ) for all n1 , n2 ∈ Z>0 .
Clearly, both the constant function f (n) = 1 and the identity function f (n) = n are completely
multiplicative. We shall prove that the three functions φ, τ , and σ are multiplicative.
We begin by showing that φ is multiplicative, which is a key ingredient to later establish the
formula (22.6).
Theorem 29. The Euler φ-function is multiplicative.
The following theorem will play a central role in proving that τ and σ are multiplicative
functions. This will yield a method to construct a new multiplicative function, provided
that we start with a multiplicative function.
Theorem 30. Let f be an arithmetic function and define the arithmetic function F by
\[ F(n) = \sum_{d \mid n,\ d > 0} f(d), \quad \forall n \in \mathbb{Z}_{>0}. \]
If f is multiplicative, then F is multiplicative.
\[ = \Bigg( \sum_{d_1 \mid n_1,\ d_1 > 0} f(d_1) \Bigg) \Bigg( \sum_{d_2 \mid n_2,\ d_2 > 0} f(d_2) \Bigg) = F(n_1) F(n_2), \]
In other words, they are of the form F as in Theorem 30 where we choose f (n) = 1 and
f (n) = n, respectively. Since these two functions f are multiplicative, the result now follows
from Theorem 30. □
Exercises.
Exercise 23.5. Prove that a completely multiplicative arithmetic function is completely
determined by its values at prime numbers.
Exercise 23.6. Let n ∈ Z with n > 0. Define an arithmetic function ρ by ρ(1) = 1 and
$\rho(n) = 2^m$ where m is the number of distinct prime factors dividing n. Prove that ρ is
multiplicative but not completely multiplicative.
24. Formulas for the Functions φ, τ and σ
Let f be a multiplicative arithmetic function and let n > 1 be an integer with prime
factorization $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$. Then
\[ f(n) = f(p_1^{a_1}) f(p_2^{a_2}) \cdots f(p_k^{a_k}). \]
Thus, to determine a formula for f , it suffices to determine a formula for $f(p_i^{a_i})$ and take
the product.
Lemma 9. Let p be a prime and a ≥ 1. Then,
\[ \varphi(p^a) = p^a - p^{a-1} = p^a \left(1 - \frac{1}{p}\right). \]
In particular, φ(p) = p − 1.
Proof. The result follows from Lemma 10 and the fact that τ and σ are multiplicative
functions. □
Proposition 26. An integer n > 0 is prime if and only if σ(n) = 1 + n.
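Proof. Since σ(1) = 1 and 1 is not prime, we may assume n > 1. Then σ(n) ≥ 1 + n, because 1 and n
are distinct divisors of n. If n is prime, its only positive divisors are 1 and n, so σ(n) = 1 + n.
If n is not prime, the set of its positive divisors contains at least one element c such that
1 < c < n, and therefore
σ(n) ≥ 1 + n + c > n + 1.
□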
Example 24.1. Let $n = 100 = 2^2 \cdot 5^2$. Then τ (n) = (2 + 1)(2 + 1) = 9 and
\[ \sigma(n) = \frac{2^3 - 1}{2 - 1} \cdot \frac{5^3 - 1}{5 - 1} = 7 \cdot 31 = 217. \]
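The computation in Example 24.1 follows a pattern that is easy to automate: one factor a + 1 for τ and one factor (p^(a+1) − 1)/(p − 1) for σ per prime power p^a. The Python sketch below assumes these prime-power formulas, as used in the example; the function names are hypothetical.

```python
def prime_factorization(n):
    """Trial-division factorization, returned as a dict {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def tau_formula(n):
    """Product of (a + 1) over the prime powers p^a in the factorization of n."""
    result = 1
    for a in prime_factorization(n).values():
        result *= a + 1
    return result


def sigma_formula(n):
    """Product of (p^(a+1) - 1) / (p - 1) over the prime powers p^a of n."""
    result = 1
    for p, a in prime_factorization(n).items():
        result *= (p ** (a + 1) - 1) // (p - 1)
    return result


print(tau_formula(100), sigma_formula(100))
```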
Theorem 34. Let n ∈ Z>0 . Then
\[ \sum_{d \mid n,\ d > 0} \varphi(d) = n. \]
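Before reading the proof, the identity of Theorem 34 can be tested numerically. The following Python sketch computes both sides naively; the function names are illustrative.

```python
from math import gcd


def phi(n):
    """Euler's function computed from its definition."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def divisor_sum_of_phi(n):
    """Sum of phi(d) over the positive divisors d of n."""
    return sum(phi(d) for d in range(1, n + 1) if n % d == 0)


print([(n, divisor_sum_of_phi(n)) for n in (1, 6, 12, 30)])
```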
a multiplicative function. In other words, $F(n) = F(p_1^{a_1}) \cdots F(p_k^{a_k})$, where $n = p_1^{a_1} \cdots p_k^{a_k}$ is the
prime factorization of n. Lastly, we observe that
\[ F(p^a) = \sum_{0 \le i \le a} \varphi(p^i) = 1 + (p - 1) + (p^2 - p) + \cdots + (p^a - p^{a-1}) = p^a, \]
Exercises.
Exercise 24.3. Show that φ, τ and σ are not completely multiplicative by providing a
counterexample in each case.
Exercise 24.4. Characterize the positive integers for which τ (n) is odd.
Exercise 24.5. Characterize the positive integers n such that
(i) φ(n) is odd
(ii) 4 ∣ φ(n)
Exercise 24.6. Let p be a prime such that p+2 is also a prime (these are called twin primes).
Prove that σ(p + 2) = σ(p) + 2.
25. Perfect Numbers and Mersenne Primes
In this section, we study the relationship between so-called perfect numbers and Mersenne
primes. We begin with a few definitions.
Definition 25.1. An integer n > 0 is called perfect if σ(n) = 2n.
Definition 25.2. Let n > 1 be an integer. We call the integer $M_n = 2^n - 1$ the n-th Mersenne
number. If $M_n$ is prime, we call it a Mersenne prime.
Examples 25.3.
(1) The positive divisors of n = 6 are {1, 2, 3, 6} and we have
σ(6) = 1 + 2 + 3 + 6 = 12 = 2 ⋅ 6,
so 6 is a perfect number;
(2) The positive divisors of n = 28 are {1, 2, 4, 7, 14, 28} and we have
σ(28) = 1 + 2 + 4 + 7 + 14 + 28 = 56 = 2 ⋅ 28,
so 28 is a perfect number.
(3) $M_5 = 2^5 - 1 = 31$ is a Mersenne prime.
(4) $M_7 = 2^7 - 1 = 127$ is a Mersenne prime.
(5) $M_{11} = 2^{11} - 1 = 2047 = 23 \cdot 89$ is not prime.
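Small perfect numbers and Mersenne primes can be found by brute force. The Python sketch below is an illustration with hypothetical function names; the primality check relies on Proposition 26 (n is prime exactly when σ(n) = 1 + n), and the search bounds are arbitrary.

```python
def sigma(n):
    """Sum of the positive divisors of n, pairing each divisor d with n // d."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def is_perfect(n):
    return sigma(n) == 2 * n


def is_prime(n):
    """Primality via Proposition 26: n is prime iff sigma(n) = 1 + n."""
    return n > 1 and sigma(n) == n + 1


def mersenne(n):
    return 2 ** n - 1


print([n for n in range(1, 10000) if is_perfect(n)])
print([n for n in range(2, 14) if is_prime(mersenne(n))])
```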
There are no known odd perfect numbers.¹ It is also unknown if there are infinitely many even
ones, but the next theorem shows there is a one-to-one correspondence between Mersenne
primes and even perfect numbers.
¹ As of 2012, it is known that there are no odd perfect numbers below $10^{1500}$.
Theorem 35. Let n ∈ Z>0 . Then n is an even perfect number if and only if
$n = 2^{p-1}(2^p - 1)$ with $2^p - 1$ a Mersenne prime.
Proof. We first suppose $n = 2^{p-1}(2^p - 1)$, where $2^p - 1$ is a Mersenne prime, and show that
n must be an even perfect number. Since $2^p - 1$ is a prime number, we have
$\sigma(2^p - 1) = (2^p - 1) + 1 = 2^p$ by Proposition 26. We now compute
\[ \sigma(n) = \sigma(2^{p-1}(2^p - 1)) = \sigma(2^{p-1}) \sigma(2^p - 1) = \sigma(2^{p-1}) 2^p, \]
where we used the fact that σ is multiplicative and $(2^{p-1}, 2^p - 1) = 1$. Now, from Lemma 10,
it follows that
\[ \sigma(n) = \left( \frac{2^p - 1}{2 - 1} \right) \cdot 2^p = (2^p - 1) \cdot 2^p = 2 \left( 2^{p-1}(2^p - 1) \right) = 2n. \]
Conversely, suppose now n is an even perfect number. Write $n = 2^a \cdot b$, where a, b ∈ Z>0 and b is
odd. Since σ is multiplicative, by Lemma 10,
\[ \sigma(n) = \sigma(2^a) \sigma(b) = \left( \frac{2^{a+1} - 1}{2 - 1} \right) \sigma(b) = (2^{a+1} - 1) \sigma(b). \]
Since n is perfect,
\[ \sigma(n) = 2n = 2 (2^a \cdot b) = 2^{a+1} b, \]
so we have
\[ (2^{a+1} - 1) \sigma(b) = 2^{a+1} b \implies 2^{a+1} \mid \sigma(b) \iff \sigma(b) = 2^{a+1} c \text{ with } c > 0 \]
and it follows that
\[ (2^{a+1} - 1) \sigma(b) = (2^{a+1} - 1) 2^{a+1} c = 2^{a+1} b \implies (2^{a+1} - 1) c = b. \]
To conclude this section, we establish the following two properties of Mersenne numbers.
Theorem 36. Let n ∈ Z>1 . If $M_n$ is prime then n is prime.
From Example 25.3 (5), we see that $M_{11}$ is not a prime despite the fact that 11 is a prime.
The next theorem shows that, in this kind of situation, the divisors of $M_p$ cannot be arbitrary.
For this result, we require the following lemma.
Lemma 11. Let a and b be positive integers. Then $(2^a - 1, 2^b - 1) = 2^{(a,b)} - 1$.
Exercises.
Exercise 25.5. Let n ∈ Z with n > 1. Then n is said to be almost perfect if σ(n) = 2n − 1.
Show that, for k ∈ Z>0 , the number $2^k$ is almost perfect.
26. Primitive Roots
We know from Euler’s theorem that $a^{\varphi(m)} \equiv 1 \pmod{m}$ for any integer a coprime to m > 0.
Therefore, it is natural to ask if, for fixed m > 0, there exists a positive integer x < φ(m) such that for
all integers a coprime to m we have $a^x \equiv 1 \pmod{m}$. For example, for m = 8 we can easily
compute that
\[ 1^2 \equiv 1, \quad 3^2 = 9 \equiv 1, \quad 5^2 = 25 \equiv 1, \quad 7^2 = 49 \equiv 1 \pmod{8}, \]
showing that x = 2 < φ(8) = 4 has the desired property. Inverting the question, we are interested
in understanding when the smallest value of x with the above property is x = φ(m).
The complete answer to this question is provided by the Primitive Root Theorem (see The-
orem 40).
We begin with the following definition.
Definition 26.1. Let a, m ∈ Z with m > 0 and (a, m) = 1. The order of a modulo m, denoted
$\mathrm{ord}_m(a)$, is the least positive integer n such that $a^n \equiv 1 \pmod{m}$.
Example 26.2. Let m = 7, a = 3. We have
\[ 3^1 \equiv 3, \quad 3^2 \equiv 2, \quad 3^3 \equiv 6, \quad 3^4 \equiv 4, \quad 3^5 \equiv 5, \quad 3^6 \equiv 1 \pmod{7}, \]
so $\mathrm{ord}_7(3) = 6 = \varphi(7)$. Similarly, we can compute the order of every integer a coprime to 7:
a (mod 7) 1 2 3 4 5 6
ord7 (a) 1 3 6 3 6 2
Example 26.3. Let m = 8. Here, φ(8) = 4 and the order of an integer a coprime to 8 is
given in the table
a (mod 8) 1 3 5 7
ord8 (a) 1 2 2 2
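The two tables above can be regenerated with a direct implementation of Definition 26.1. The Python sketch below multiplies by a until the value 1 is reached; the function names are hypothetical.

```python
from math import gcd


def multiplicative_order(a, m):
    """Return the least positive n with a^n ≡ 1 (mod m); requires gcd(a, m) = 1."""
    if gcd(a, m) != 1:
        raise ValueError("a must be coprime to m")
    n, power = 1, a % m
    while power != 1 % m:
        power = (power * a) % m
        n += 1
    return n


def order_table(m):
    """Map each a in [1, m - 1] coprime to m to its order modulo m."""
    return {a: multiplicative_order(a, m) for a in range(1, m) if gcd(a, m) == 1}


print(order_table(7))
print(order_table(8))
```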
It is clear from Euler’s theorem that, for any a coprime to m, we have ordm (a) ≤ φ(m). In
the examples above, we see that ord7 (3) = ord7 (5) = φ(7), while for m = 8 there is no a with
the maximal order φ(8) = 4. However, in both cases, all the orders occurring are divisors of
φ(m). This is a general property.
Proposition 27. Let a, m ∈ Z such that m > 0 and (a, m) = 1, and let n ∈ Z>0 . Then $a^n \equiv 1 \pmod{m}$
if and only if $\mathrm{ord}_m(a) \mid n$. In particular, $\mathrm{ord}_m(a) \mid \varphi(m)$.
Proof. Suppose first that $a^n \equiv 1 \pmod{m}$ for some n > 0. Dividing n by $\mathrm{ord}_m(a)$ via the
division algorithm yields
\[ n = \mathrm{ord}_m(a) \, q + r, \quad 0 \le r < \mathrm{ord}_m(a). \]
Then
\[ a^n = \left( a^{\mathrm{ord}_m(a)} \right)^q \cdot a^r \equiv 1^q \cdot a^r \equiv a^r \equiv 1 \pmod{m}, \]
where we used the definition of order and our assumption. Suppose r ≠ 0. Since $\mathrm{ord}_m(a)$ is the
smallest positive integer for which $a^{\mathrm{ord}_m(a)} \equiv 1 \pmod{m}$ and $0 < r < \mathrm{ord}_m(a)$, we obtain a contradiction.
So r = 0 and hence $\mathrm{ord}_m(a) \mid n$.
Suppose now $\mathrm{ord}_m(a) \mid n$. That is, $n = \mathrm{ord}_m(a) \cdot k$ for some k ∈ Z. Thus
\[ a^n = \left( a^{\mathrm{ord}_m(a)} \right)^k \equiv 1^k \equiv 1 \pmod{m}. \]
□
Example 26.4. Let m = 11 and a = 2. We have φ(11) = 10, so $\mathrm{ord}_{11}(2) \in \{1, 2, 5, 10\}$ by
Proposition 27. We compute
\[ 2^1 \equiv 2 \pmod{11}, \quad 2^2 \equiv 4 \pmod{11}, \quad 2^5 = 32 \equiv 10 \pmod{11} \]
and since none of these are congruent to 1 mod 11, it follows that $\mathrm{ord}_{11}(2) = 10$. Note that,
by using Proposition 27, we avoided computing $2^3, 2^4, 2^6, 2^7, 2^8, 2^9 \pmod{11}$.
Definition 26.5. Let a, m ∈ Z with m > 0 and (a, m) = 1. We say that a is a primitive root
modulo m if ordm (a) is maximal, that is ordm (a) = φ(m).
Examples 26.6. From the examples above, we already know the following:
(1) 3 and 5 are primitive roots modulo 7;
(2) 2 is a primitive root modulo 11;
(3) There are no primitive roots modulo 8.
We see now that the discussion in the first paragraph of this section can be summarized into
the question: which integers admit primitive roots? The answer is given by Theorem 40, for
which we will not give a complete proof. In the remainder of this section, we will need the
following result.
Proposition 28. Let a, m ∈ Z, m > 0, (a, m) = 1.
(i) For i, j ∈ Z, we have $a^i \equiv a^j \pmod{m} \iff i \equiv j \pmod{\mathrm{ord}_m(a)}$.
(ii) For i > 0, we have
\[ \mathrm{ord}_m(a^i) = \frac{\mathrm{ord}_m(a)}{(\mathrm{ord}_m(a), i)}. \]
Proof. The set S has φ(m) elements. Additionally, because a is a primitive root, (a, m) = 1
so that all of these elements are coprime to m. It remains to show that no two of them are
congruent mod m.
Suppose that $a^i \equiv a^j \pmod{m}$ for some $a^i, a^j \in S$. Then $i \equiv j \pmod{\mathrm{ord}_m(a)}$ by Proposition 28 (i).
Then i = j by Corollary 8 since $\mathrm{ord}_m(a) = \varphi(m)$ and 0 ≤ i, j ≤ φ(m) − 1. □
Example 26.7. Let m = 7 and a = 3, which is a primitive root mod 7. For 1 ≤ i ≤ 6, in
Example 26.2, we computed $3^i \pmod{7}$ and obtained the second row of the table
i              1 2 3 4 5 6
3^i (mod 7)    3 2 6 4 5 1
ord_7(3^i)     6 3 2 3 6 1
which is a reduced residue system mod 7, as predicted by the previous corollary. To obtain
the third row we can, for example, apply the formula in Proposition 28 (ii). For instance,
to determine $\mathrm{ord}_7(2)$, we compute
\[ \mathrm{ord}_7(2) = \mathrm{ord}_7(3^2) = \frac{\mathrm{ord}_7(3)}{(\mathrm{ord}_7(3), 2)} = \frac{6}{(6, 2)} = \frac{6}{2} = 3. \]
Corollary 20. Let m be an integer admitting a primitive root. Then there are φ(φ(m))
non-congruent primitive roots mod m.
Proof. Let r be a primitive root mod m, so $\mathrm{ord}_m(r) = \varphi(m)$. By Corollary 19, any other
primitive root must be congruent to $r^i$ for some i such that 1 ≤ i ≤ φ(m). Now $r^i$ is a
primitive root if and only if $\mathrm{ord}_m(r^i) = \mathrm{ord}_m(r) = \varphi(m)$ and, by Proposition 28 (ii), we have
\[ \mathrm{ord}_m(r^i) = \frac{\mathrm{ord}_m(r)}{(\mathrm{ord}_m(r), i)} = \mathrm{ord}_m(r) \iff (\mathrm{ord}_m(r), i) = 1. \]
Clearly, there are $\varphi(\mathrm{ord}_m(r)) = \varphi(\varphi(m))$ such i, giving the desired result. □
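The count φ(φ(m)) can be compared with a brute-force search for primitive roots. The Python sketch below is a naive illustration for small m; the function names are hypothetical.

```python
from math import gcd


def phi(n):
    """Euler's function computed from its definition."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def multiplicative_order(a, m):
    """Least positive n with a^n ≡ 1 (mod m); assumes gcd(a, m) = 1."""
    n, power = 1, a % m
    while power != 1 % m:
        power = (power * a) % m
        n += 1
    return n


def primitive_roots(m):
    """Return the integers in [1, m] that are coprime to m and have order phi(m)."""
    target = phi(m)
    return [a for a in range(1, m + 1)
            if gcd(a, m) == 1 and multiplicative_order(a, m) == target]


for m in (7, 8, 11):
    roots = primitive_roots(m)
    print(m, roots, len(roots), phi(phi(m)))
```

For m = 8 the list is empty, so the count φ(φ(8)) does not apply: the corollary only concerns integers that admit a primitive root.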
To understand which integers admit a primitive root it is convenient to first understand why
certain integers cannot have a primitive root.
Examples 26.8.
(1) For m = 15, we have φ(m) = φ(3)φ(5) = 2 ⋅ 4 = 8 and
a such that (a, 15) = 1 1 2 4 7 8 11 13 14
ord15 (a) 1 4 2 4 4 2 4 2
(2) For m = 16, we have φ(16) = 8 and
a such that (a, 16) = 1 1 3 5 7 9 11 13 15
ord16 (a) 1 4 4 2 2 4 4 2
(3) Recall there are no primitive roots mod 8.
We shall shortly prove results explaining what is behind these examples, but first let us have
a closer look at the case m = 15. From Euler’s theorem we know that $a^8 \equiv 1 \pmod{15}$ when
(a, 15) = 1, hence, by Proposition 19, we have $a^8 \equiv 1 \pmod{3}$ and $a^8 \equiv 1 \pmod{5}$. Clearly,
from these two congruences we recover that $a^8 \equiv 1 \pmod{15}$ by CRT. However, FLT gives
us the sharper congruences $a^2 \equiv 1 \pmod{3}$ and $a^4 \equiv 1 \pmod{5}$ which, after squaring the
first, leads to $a^4 \equiv 1 \pmod{15}$. Note that this is consistent with the orders in the table for
m = 15 in Examples 26.8. The following theorem generalizes this idea.
Theorem 38. Let m ∈ Z>0 . Suppose m = kn where (k, n) = 1 and φ(k), φ(n) are even.
Then, for all a ∈ Z coprime to m, we have
\[ a^{\varphi(m)/2} \equiv 1 \pmod{m}. \]
In particular, there are no primitive roots modulo m.
Proof. We can write $m = p^d q^{\ell} r$, where p ≠ q are two odd primes and (r, p) = (r, q) = 1. Let
$k = p^d$ and $n = q^{\ell} r$, so that m = kn and (k, n) = 1. By the formula for φ, we also have that
\[ \varphi(k) = p^{d-1}(p - 1) \quad \text{and} \quad \varphi(n) = q^{\ell - 1}(q - 1)\varphi(r) \]
are even, so we can apply Theorem 38. □
Corollary 22. If m is divisible by 4p with p an odd prime, then there are no primitive roots
modulo m.
Putting together these results we conclude that primitive roots may exist only for the integers
$m = 1, 2, 4, p^d$ or $2p^d$, where d ≥ 1 and p is an odd prime. The following theorem guarantees
that primitive roots exist for all such integers.
Theorem 40 (Primitive Root Theorem). Let m ∈ Z>0 . Then a primitive root modulo m
exists if and only if $m = 1, 2, 4, p^d$ or $2p^d$, where d ≥ 1 and p is an odd prime.
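The statement of the Primitive Root Theorem can be checked by brute force for small moduli. The Python sketch below compares an exhaustive search with the shape of m described in the theorem; all function names are hypothetical and the bound 80 is arbitrary.

```python
from math import gcd


def phi(n):
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def multiplicative_order(a, m):
    """Least positive n with a^n ≡ 1 (mod m); assumes gcd(a, m) = 1."""
    n, power = 1, a % m
    while power != 1 % m:
        power = (power * a) % m
        n += 1
    return n


def has_primitive_root(m):
    """Exhaustive search for an element of order phi(m)."""
    target = phi(m)
    return any(gcd(a, m) == 1 and multiplicative_order(a, m) == target
               for a in range(1, m + 1))


def predicted_by_theorem(m):
    """True when m is 1, 2, 4, p^d or 2p^d with p an odd prime and d >= 1."""
    if m in (1, 2, 4):
        return True
    if m % 2 == 0:
        m //= 2                 # strip a single factor 2
    if m % 2 == 0 or m == 1:
        return False            # divisible by 4, or nothing odd is left
    p = 3
    while m % p != 0:           # smallest divisor > 1 of an odd m is an odd prime
        p += 2
    while m % p == 0:
        m //= p
    return m == 1               # True exactly when the odd part is a power of p


print([m for m in range(1, 30) if has_primitive_root(m)])
print(all(has_primitive_root(m) == predicted_by_theorem(m) for m in range(1, 80)))
```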
This result is clear for m = 1, 2, 4 and, in the next section, we will prove it for m = p a prime.
To close this section, we will prove the above result for $m = 2p^d$ assuming it to be true for
$m = p^d$. More precisely, we will prove that (1) implies (2) in the following result.
Theorem 41. Let p be an odd prime and d ≥ 1. Then,
(1) there exists a primitive root modulo $p^d$;
(2) there exists a primitive root modulo $2p^d$.
Proof of part (2). Write $n = 2p^d$. Let r be a primitive root mod $p^d$, which exists by part (1).
Then $(r, p^d) = 1$ and, since $r \equiv r + p^d \pmod{p^d}$, if r is even we replace it by $r + p^d$, which is
odd. So, we can assume r is odd and $(2p^d, r) = 1$.
We aim to show that r is also a primitive root mod n. Note that $\varphi(2p^d) = \varphi(2)\varphi(p^d) = \varphi(p^d)$.
By Proposition 27, we have $\mathrm{ord}_n(r) \mid \varphi(2p^d) = \varphi(p^d)$. Moreover,
\[ r^{\mathrm{ord}_n(r)} \equiv 1 \pmod{2p^d} \implies r^{\mathrm{ord}_n(r)} \equiv 1 \pmod{p^d} \]
which, by Proposition 27, implies $\mathrm{ord}_{p^d}(r) = \varphi(p^d) \mid \mathrm{ord}_n(r)$. Now, we have shown that
$\mathrm{ord}_n(r) \mid \varphi(p^d)$ and $\varphi(p^d) \mid \mathrm{ord}_n(r)$, therefore $\mathrm{ord}_n(r) = \varphi(p^d) = \varphi(2p^d)$ because both $\mathrm{ord}_n(r)$ and
$\varphi(p^d)$ are positive. We conclude that r is a primitive root modulo $n = 2p^d$. □
Exercises.
Exercise 26.9.
(a) Show that 2 is a primitive root modulo 19.
(b) How many incongruent primitive roots modulo 19 are there?
(c) By Euler’s Theorem, we know that $a^{18} \equiv 1 \pmod{19}$ for any a coprime to 19. Explain
why a is not necessarily a primitive root modulo 19.
(d) Determine, with proof, a maximal set of incongruent primitive roots modulo 19.
27. Primitive Roots for Primes
The polynomial f of the previous examples has 0 and 1 roots modulo 2 and 3, respectively.
In both cases, the number of roots is smaller than the degree of f which is 3. The following
lemma shows this is a general fact.
Lemma 12 (Lagrange). Let p be a prime and let f be a monic polynomial of degree n with
integer coefficients. Then, f has at most n incongruent roots modulo p.
Note that $x^i - c_0^i = (x - c_0) h_{i-1}(x)$, where $h_{i-1}(x)$ is a monic polynomial of degree i − 1, hence
\[ f(x) - f(c_0) = (x - c_0) h_{n-1}(x) + a_{n-1} (x - c_0) h_{n-2}(x) + \cdots + a_1 (x - c_0) = (x - c_0) g(x). \]
Here, g(x) is a monic polynomial of degree n − 1. Now, evaluating at $x = c_i$ in the previous
equality gives
\[ f(c_i) - f(c_0) \equiv (c_i - c_0) g(c_i) \pmod{p} \iff (c_i - c_0) g(c_i) \equiv 0 \pmod{p} \]
and, since p is prime, this implies
\[ c_i - c_0 \equiv 0 \pmod{p} \quad \text{or} \quad g(c_i) \equiv 0 \pmod{p}. \]
For i > 0, the first congruence cannot hold since $c_i \not\equiv c_0 \pmod{p}$ by hypothesis. Hence
$g(c_i) \equiv 0 \pmod{p}$ for all i = 1, . . . , n. Thus, g has degree n − 1 and n different roots mod p, a
contradiction. We conclude that f has at most n roots mod p, as desired. □
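The role of the prime modulus in Lemma 12 becomes visible when roots are counted by exhaustive search. In the Python sketch below, the function name and the coefficient convention (leading coefficient first) are illustrative choices; the polynomial x^2 − 1 is used because of its link with Lemma 8 and Remark 19.1.

```python
def roots_mod(coeffs, m):
    """Return the roots in 0..m-1 of the polynomial with coefficients [a_n, ..., a_1, a_0]."""
    def evaluate(x):
        value = 0
        for c in coeffs:  # Horner's rule, reducing mod m at every step
            value = (value * x + c) % m
        return value
    return [x for x in range(m) if evaluate(x) == 0]


print(roots_mod([1, 0, -1], 7))  # x^2 - 1 modulo the prime 7: at most 2 roots
print(roots_mod([1, 0, -1], 8))  # x^2 - 1 modulo the composite 8: more roots than the degree
```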
We can now prove the following statement which implies Theorem 42.
Theorem 43. Let p be a prime and d ≥ 1 a divisor of p − 1. Then, there are φ(d) integers
a with 1 ≤ a ≤ p − 1 such that $\mathrm{ord}_p(a) = d$.
In particular, there are φ(p − 1) primitive roots (mod p).
Proof. Let F (d) denote the number of integers a such that 1 ≤ a ≤ p − 1 and ordp (a) = d.
The proof is divided into two main parts:
(1) We will show that either F (d) = 0 or F (d) = φ(d);
(2) Using (1), we will show that F (d) = φ(d) when d ∣ p − 1.
We start by proving (1). If F (d) = 0 there are no integers of order d.
Suppose F (d) ≠ 0, so that there is at least one integer of order d. Note that any a of order d
is a root mod p of $f(x) = x^d - 1$; indeed, $a^d \equiv 1 \pmod{p} \iff f(a) = a^d - 1 \equiv 0 \pmod{p}$.
Fix a of order d. Note that $f(a^i) = (a^i)^d - 1 = (a^d)^i - 1 \equiv 0 \pmod{p}$ and $a^i \not\equiv a^j \pmod{p}$ if
i ≠ j are in the range 0 ≤ i, j ≤ d − 1. Then, $a^0, a^1, \dots, a^{d-1}$ are d distinct mod p roots of f .
Since f has degree d, it follows from Lemma 12 that these are all the mod p roots of f .
We conclude that all the elements of order d are among the $a^i$ and so we need to determine
how many $a^i$, 1 ≤ i ≤ d − 1, have order d. Suppose $a^i$ has order d. Then, from
Proposition 28 (ii), we know that
\[ \mathrm{ord}_p(a^i) = \frac{\mathrm{ord}_p(a)}{(\mathrm{ord}_p(a), i)} \iff d = \frac{d}{(d, i)} \iff (d, i) = 1, \]
which occurs for φ(d) values of i in 1 ≤ i ≤ d − 1. Then F (d) = φ(d), as desired.
We will now prove (2). Since every a in 1 ≤ a ≤ p − 1 has a unique order d dividing φ(p) = p − 1
(by Proposition 27), we can group these elements based on their orders. In doing so, we see
that the total number of integers, p − 1, is equal to the sum of the number of integers F (d)
for each order d. Therefore,
\[ p - 1 = \sum_{d \mid p-1,\ d > 0} F(d) = \sum_{d \mid p-1,\ d > 0} \varphi(d), \]
where the second equality follows from Theorem 34. We conclude that
\[ \sum_{d \mid p-1,\ d > 0} (F(d) - \varphi(d)) = \sum_{d \mid p-1,\ F(d) = 0} (F(d) - \varphi(d)) = - \sum_{d \mid p-1,\ F(d) = 0} \varphi(d) = 0. \]
Here, we used part (1) to discard all the terms such that F (d) ≠ 0 in the first equality. Since
φ(d) ≥ 1 for all d, we conclude that the last sum runs over the empty set, otherwise we have
a contradiction. Thus F (d) = φ(d) for all d ∣ p − 1. □
Example 27.2. Find all integers of order 6 modulo 19.
We first show that 2 is a primitive root. We have φ(19) = 18, so the possible orders mod 19
are among the values {1, 2, 3, 6, 9, 18}. We compute
\[ 2^1 \equiv 2, \quad 2^2 \equiv 4, \quad 2^3 \equiv 8, \quad 2^6 \equiv 7, \quad 2^9 \equiv 18 \pmod{19}. \]
Since none of these powers is congruent to 1 mod 19, we conclude $\operatorname{ord}_{19} 2 = 18$, as desired.
Since φ(6) = 2, by Theorem 43, there are two integers of order 6 mod 19. From Corollary 19
we know they are congruent mod 19 to $2^i$ for some 1 ≤ i ≤ 18. We need to find the values of
i in this interval satisfying
\[ 6 = \operatorname{ord}_{19} 2^i = \frac{\operatorname{ord}_{19} 2}{(\operatorname{ord}_{19} 2, i)} = \frac{18}{(18, i)} \iff (18, i) = 3. \]
Therefore i = 3 or i = 15, hence
\[ 2^3 \equiv 8 \pmod{19} \quad \text{and} \quad 2^{15} \equiv 12 \pmod{19} \]
are the two order 6 elements.
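The computation in Example 27.2 can be checked numerically. The following is a minimal sketch, not part of the original notes; the helper name `order_mod` is a hypothetical choice, and the order is found by brute force, which is only sensible for small moduli.

```python
from math import gcd

def order_mod(a, p):
    """Multiplicative order of a modulo p, assuming gcd(a, p) == 1 (brute force)."""
    k, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        k += 1
    return k

p, r = 19, 2
assert order_mod(r, p) == p - 1  # 2 is a primitive root mod 19

# Exponents i in 1..18 with gcd(18, i) == 3 give the powers 2^i of order 6.
exponents = [i for i in range(1, p) if gcd(p - 1, i) == 3]
order_six = sorted(pow(r, i, p) for i in exponents)
print(exponents, order_six)  # [3, 15] [8, 12]
```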
Proof. Let n > 2 be a Carmichael number and p ∣ n a prime factor. We can write $n = p^k n'$
with (p, n′) = 1 for some n′ ∈ Z. Note that to prove (i) we need to show that k = 1. By CRT,
the system of congruences
\[ x \equiv 1 + p \pmod{p^k}, \qquad x \equiv 1 \pmod{n'} \]
admits a solution, that is, there is an integer a satisfying
\[ a \equiv 1 + p \pmod{p^k}, \qquad a \equiv 1 \pmod{n'}. \tag{27.3} \]
We note that (a, n) = 1. Indeed, if a prime q ∣ (a, n) then either q = p or q is a prime factor of
n′. Reducing the first congruence mod q if q = p, or the second if q ∣ n′, leads to 0 ≡ 1 (mod q)
in both cases, a contradiction. Therefore, since n is a Carmichael number, we have
\[ a^{n-1} \equiv 1 \pmod{n}. \]
Suppose now k ≥ 2, so that $p^2 \mid n$. Reducing this congruence mod $p^2$ gives
\[ a^{n-1} \equiv 1 \pmod{p^2} \implies (1 + p)^{n-1} \equiv 1 \pmod{p^2}, \]
where we used (27.3) in the implication above. We have, by the Binomial theorem, that
\[ (1 + p)^{n-1} = (1 + p)(1 + p)\cdots(1 + p) \equiv 1 + (n - 1)p \pmod{p^2}. \]
Since p ∣ n, we also have
\[ 1 + (n - 1)p = 1 + np - p \equiv 1 - p \pmod{p^2}, \]
therefore,
\[ 1 \equiv (1 + p)^{n-1} \equiv 1 + (n - 1)p \equiv 1 - p \pmod{p^2} \implies -p \equiv 0 \pmod{p^2}, \]
which is impossible. Thus k = 1, completing the proof of (i).
We will now prove (ii). Let p ∣ n be a prime. Since n is squarefree by part (i), we have
(p, n/p) = 1. Let b be a primitive root mod p. This primitive root exists by Theorem 42. By
CRT, the system of congruences
x ≡ b (mod p), x ≡ 1 (mod n/p)
admits a solution. That is, there is an integer a satisfying
a ≡ b (mod p), a ≡ 1 (mod n/p).
An argument similar to the one above shows that (a, n) = 1 and, since n is a Carmichael number, we
have
\[ a^{n-1} \equiv 1 \pmod{n} \implies a^{n-1} \equiv b^{n-1} \equiv 1 \pmod{p}. \]
By Proposition 27 and since b is a primitive root mod p, we conclude
\[ \operatorname{ord}_p b = \varphi(p) = p - 1 \mid n - 1, \]
completing the proof of (ii). □
Exercises.
Exercise 27.4. Show that, if f(x) is a polynomial of degree n with integer coefficients, and
p and q are prime numbers such that p ≠ q, then the congruence f(x) ≡ 0 (mod pq) has at
most $n^2$ incongruent solutions modulo pq.
Exercise 27.5.
(a) How many elements of order 6 modulo 17 are there?
(b) How many elements of order 4 modulo 17 are there?
(c) Find all elements of order 4 modulo 17 using the fact that 3 is a primitive root modulo 17.
28. Index Arithmetic and Discrete Logarithms
Let n be an integer admitting a primitive root. Recall that, if r is a primitive root, then the
set $\{1, r, r^2, \dots, r^{\varphi(n)-1}\}$ is a reduced residue system mod n. In particular, for all a ∈ Z such
that (a, n) = 1 we have $r^i \equiv a \pmod{n}$ for some i in the range 1 ≤ i ≤ φ(n).
Definition 28.1. Let r be a primitive root mod n and a ∈ Z coprime to n. The index of
a relative to r is the least positive integer i such that $r^i \equiv a \pmod{n}$. We denote this by
$\operatorname{ind}_r a$.
Example 28.2. We know that r = 3 is a primitive root mod n = 7. We have already
computed the first two rows of the following table. Using them we can determine all the
indices relative to 3 mod 7.
i             1  2  3  4  5  6
3^i (mod 7)   3  2  6  4  5  1
a             1  2  3  4  5  6
ind_3 a       6  2  1  4  5  3
Remark 28.3. Note that the existence of a primitive root is necessary for the definition of
the index to make sense. For instance, consider n = 12 and r = 5:
i              1  2  3  4  5  6  7  8  9  10  11
5^i (mod 12)   5  1  5  1  5  1  5  1  5  1   5
This shows that there is no integer i such that $5^i \equiv a \pmod{12}$ unless a ≡ 1 or 5 (mod 12). This occurs
because 5 is not a primitive root modulo 12. In fact, since there is no primitive root mod
12, the index does not make sense in this setting.
It is common to also refer to indices as discrete logs since they share properties similar to
those of the usual logarithms of real numbers. This is clear from the following proposition.
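Proposition 29. Let r be a primitive root mod n. Let a, b ∈ Z be coprime to n and d ≥ 1.
(a) $\operatorname{ind}_r 1 \equiv 0 \pmod{\varphi(n)}$
(b) $\operatorname{ind}_r r \equiv 1 \pmod{\varphi(n)}$
(c) $\operatorname{ind}_r (ab) \equiv \operatorname{ind}_r a + \operatorname{ind}_r b \pmod{\varphi(n)}$
(d) $\operatorname{ind}_r (a^d) \equiv d \cdot \operatorname{ind}_r a \pmod{\varphi(n)}$.
Proof.
(a) By definition of a primitive root, we have $r^{\varphi(n)} \equiv 1 \pmod{n}$ and no smaller positive
exponent i satisfies $r^i \equiv 1 \pmod{n}$. Thus $\operatorname{ind}_r 1 = \varphi(n) \equiv 0 \pmod{\varphi(n)}$.
(b) Since i = 1 is the smallest positive exponent such that $r^i \equiv r \pmod{n}$, we have
$\operatorname{ind}_r r = 1 \equiv 1 \pmod{\varphi(n)}$.
(c) By definition of index, we have
\[ r^{\operatorname{ind}_r (ab)} \equiv ab \equiv r^{\operatorname{ind}_r a} \cdot r^{\operatorname{ind}_r b} \equiv r^{\operatorname{ind}_r a + \operatorname{ind}_r b} \pmod{n}, \]
hence, by Proposition 28 (i) and since $\operatorname{ord}_n r = \varphi(n)$, we have
\[ \operatorname{ind}_r (ab) \equiv \operatorname{ind}_r a + \operatorname{ind}_r b \pmod{\varphi(n)}. \]
(d) Similarly to (c), we have
\[ r^{\operatorname{ind}_r (a^d)} \equiv a^d \equiv (r^{\operatorname{ind}_r a})^d \equiv r^{d \cdot \operatorname{ind}_r a} \pmod{n}, \]
hence, by Proposition 28 (i) and since $\operatorname{ord}_n r = \varphi(n)$, we have $\operatorname{ind}_r (a^d) \equiv d \cdot \operatorname{ind}_r a \pmod{\varphi(n)}$. □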
We will now see how index arithmetic can be used to solve certain congruence equations.
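Let r be a primitive root mod n, let a, b, d ∈ Z with d ≥ 1 and (a, n) = (b, n) = 1, and consider the congruence equation
\[ a x^d \equiv b \pmod{n}. \]
Any solution x is coprime to n, so we can rewrite this equation as $r^{\operatorname{ind}_r (a x^d)} \equiv r^{\operatorname{ind}_r b} \pmod{n}$. By Propositions 28 and 29, we
also have
\[ \operatorname{ind}_r a + d \cdot \operatorname{ind}_r x \equiv \operatorname{ind}_r b \pmod{\varphi(n)}. \]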
Relabeling y = indr x, a′ = d, and b′ = indr b − indr a, the equation transforms into the linear
congruence in one variable
a′ y ≡ b′ (mod φ(n)),
which we know how to solve using Theorem 19.
We will now solve some concrete equations in a couple of examples. For that we need to
have access to a table of indices.
Example 28.4. For n = 17, we check that r = 3 is a primitive root and compute the table
of indices relative to 3.
a 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
ind3 a 16 14 1 12 5 15 11 10 2 3 7 13 4 9 6 8
In the next two examples, we will not refer to the previous table, though it should be
understood that we are using the results listed there.
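The table of indices in Example 28.4 can be regenerated by listing successive powers of the primitive root. This is a small sketch under the assumption that r is a primitive root mod n and that φ(n) is supplied by the caller; the function name `index_table` is hypothetical.

```python
def index_table(r, n, phi):
    """Map each residue a = r^i mod n to its index i, for 1 <= i <= phi.

    Assumes r is a primitive root mod n and phi == phi(n).
    """
    table, x = {}, 1
    for i in range(1, phi + 1):
        x = (x * r) % n
        table[x] = i
    return table

ind3 = index_table(3, 17, 16)
print([ind3[a] for a in range(1, 17)])
# [16, 14, 1, 12, 5, 15, 11, 10, 2, 3, 7, 13, 4, 9, 6, 8]
```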
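Example 28.5. Determine all the integers satisfying $6x^{12} \equiv 11 \pmod{17}$.
We have φ(17) = 16. Taking indices on both sides gives
\begin{align*}
6x^{12} \equiv 11 \pmod{17} &\iff \operatorname{ind}_3 6 + 12 \cdot \operatorname{ind}_3 x \equiv \operatorname{ind}_3 11 \pmod{16} \\
&\iff 15 + 12 \cdot \operatorname{ind}_3 x \equiv 7 \pmod{16} \\
&\iff 12 \cdot \operatorname{ind}_3 x \equiv 8 \pmod{16} \\
&\iff 3 \cdot \operatorname{ind}_3 x \equiv 2 \pmod{4} \\
&\iff \operatorname{ind}_3 x \equiv 2 \pmod{4} \\
&\iff \operatorname{ind}_3 x \equiv 2, 6, 10, 14 \pmod{16} \\
&\iff x \equiv 3^{\operatorname{ind}_3 x} \equiv 3^2, 3^6, 3^{10}, 3^{14} \pmod{17} \\
&\implies x \equiv 9, 15, 8, 2 \pmod{17}.
\end{align*}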
Here, we have changed from modulus 16 to 4 using Lemma 7. See also Exercise 13.19.
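The reduction to a linear congruence can be sketched in code. The example below is illustrative only: `solve_linear_congruence` is a hypothetical helper, the indices are recomputed by brute force, and the modular inverse via `pow(a, -1, m)` assumes Python 3.8 or later.

```python
from math import gcd

def solve_linear_congruence(a, b, m):
    """All y in [0, m) with a*y ≡ b (mod m); empty list if there are none."""
    g = gcd(a, m)
    if b % g != 0:
        return []
    a_red, b_red, m_red = a // g, b // g, m // g
    y0 = (b_red * pow(a_red, -1, m_red)) % m_red  # inverse exists: gcd(a_red, m_red) == 1
    return [y0 + t * m_red for t in range(g)]

n, r, phi = 17, 3, 16
ind = {pow(r, i, n): i for i in range(1, phi + 1)}  # indices relative to 3

# 6 x^12 ≡ 11 (mod 17) becomes 12 y ≡ ind(11) - ind(6) (mod 16) with y = ind(x).
ys = solve_linear_congruence(12, (ind[11] - ind[6]) % phi, phi)
xs = sorted(pow(r, y, n) for y in ys)
print(ys, xs)  # [2, 6, 10, 14] [2, 8, 9, 15]
```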
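Example 28.6. Determine all the integers satisfying $7^x \equiv 6 \pmod{17}$.
Taking indices on both sides we get
\begin{align*}
7^x \equiv 6 \pmod{17} &\iff x \cdot \operatorname{ind}_3 7 \equiv \operatorname{ind}_3 6 \pmod{16} \\
&\iff 11x \equiv 15 \pmod{16} \\
&\iff 33x \equiv 45 \pmod{16} \\
&\iff x \equiv 13 \pmod{16}.
\end{align*}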
We note that, in this example, the original congruence is mod 17 but the final description
of the integer solutions is mod 16.
The last two examples show that particular non-linear congruence equations can be solved
using indices. As for a general theorem, we will prove the following criterion to decide if
certain congruence equations have solutions.
Theorem 45. Let n be an integer admitting a primitive root. Let a, k ∈ Z with (a, n) = 1
and k ≥ 1. Consider the congruence equation
\[ x^k \equiv a \pmod{n}. \tag{28.7} \]
Write d = (k, φ(n)). Then,
(a) if $a^{\varphi(n)/d} \not\equiv 1 \pmod{n}$, then (28.7) has no solutions;
(b) if $a^{\varphi(n)/d} \equiv 1 \pmod{n}$, then (28.7) has exactly d non-congruent solutions mod n.
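As a sanity check on the criterion in Theorem 45, one can compare its prediction with a brute-force count of solutions for a small modulus. The sketch below uses n = 17, k = 12 as an assumed test case; `count_kth_roots` is a hypothetical helper and the exhaustive search is only practical for small n.

```python
from math import gcd

def count_kth_roots(a, k, n):
    """Number of x in [0, n) with x^k ≡ a (mod n), by exhaustive search."""
    return sum(1 for x in range(n) if pow(x, k, n) == a % n)

n, phi, k = 17, 16, 12       # 17 is prime, so phi(17) = 16 and a primitive root exists
d = gcd(k, phi)              # d = 4

for a in range(1, n):
    predicted = d if pow(a, phi // d, n) == 1 else 0
    assert count_kth_roots(a, k, n) == predicted

print(d, [a for a in range(1, n) if pow(a, phi // d, n) == 1])
```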
29. Nonlinear Diophantine Equations
A Diophantine equation in one or more variables, which is not linear in the sense of Defini-
tion 11.3, is called nonlinear. In Section 11, we have studied linear Diophantine equations
and completely solved the case of two variables. There is no analogous result for the nonlinear
case. Moreover, it is a theorem that there is no algorithm that will solve all nonlinear
Diophantine equations, and it is usually very hard to solve particular examples. Nevertheless,
there are many methods that can be used for particular instances or families; for example,
in Section 15, we have used the congruence method to prove that $3x^3 + 2 = y^2$ has no integer
solutions. Depending on the situation, other methods may find a partial or complete list of
solutions and sometimes one finds all the solutions but has no proof there are no more.
Example 29.1. The following are examples of famous nonlinear Diophantine equations:
(1) The Pythagorean Equation
\[ x^2 + y^2 = z^2; \]
(2) The Fermat Equation
\[ x^n + y^n = z^n, \quad \text{where } n \ge 3; \]
(3) The Pell equation
\[ x^2 - ny^2 = 1, \]
where n ∈ Z>0 is not a square.
In the next two sections, we will solve two classical examples of nonlinear Diophantine
equations. More precisely, we will describe the complete set of solutions (a, b, c) satisfying
gcd(a, b, c) = 1 of the equations $x^2 + y^2 = z^2$ and $x^4 + y^4 = z^4$.
Exercises.
Exercise 29.2. Which of the following equations are nonlinear?
$x^2 = 1$, x + y + z = 4, xy = 3, y = 1, $z + y^2 = 7$
30. Pythagorean Triples
It is very well known that, given a right triangle, the square of the hypotenuse is equal to
the sum of the squares of the other two sides. This statement can be made more precise in
the following way.
Theorem 46 (Pythagoras' theorem). Let x, y, z be the sides of a right triangle, where z
corresponds to the hypotenuse. Then, $x^2 + y^2 = z^2$.
Examples 30.1. Here are a few solutions to the Pythagorean equation:
(1) $1^2 + 1^2 = (\sqrt{2})^2$
(2) $3^2 + 4^2 = 5^2$
(3) $(-3)^2 + 4^2 = 5^2$
(4) $9^2 + 12^2 = 15^2$
We are interested in integer solutions, so (1) is not a solution for us since $\sqrt{2} \notin \mathbb{Z}$. Also, solutions (2) and (3) are related by a change of sign;
indeed, we can flip the sign of any variable and obtain a new solution. This occurs because
the exponents are even. Therefore, we will restrict ourselves to only positive values of x, y, z.
Definition 30.2. We call x, y, z ∈ Z a Pythagorean triple if x, y, z > 0 and $x^2 + y^2 = z^2$.
Note that solution (4) can be obtained by multiplying solution (2) by 3. In general, if x, y, z
is a Pythagorean triple, then
\[ (dx)^2 + (dy)^2 = d^2 (x^2 + y^2) = d^2 z^2 = (dz)^2, \]
so dx, dy, dz is also a Pythagorean triple for all d > 0. Conversely, suppose x, y, z is a
Pythagorean triple with a common factor (x, y, z) = d. Then, we can write
\[ x = d x_0, \quad y = d y_0, \quad z = d z_0 \]
to obtain
\[ (d x_0)^2 + (d y_0)^2 = (d z_0)^2 \implies x_0^2 + y_0^2 = z_0^2 \]
with $(x_0, y_0, z_0) = 1$. Thus we can restrict our attention to coprime triples.
Definition 30.3. We call a Pythagorean triple x, y, z primitive if (x, y, z) = 1.
Proof. Suppose (x, y) ≠ 1. Then p ∣ x and p ∣ y for some prime p. Thus $p \mid (x^2 + y^2) = z^2$,
hence p ∣ z and (x, y, z) ≠ 1, a contradiction. Similarly for (x, z) and (y, z). □
Proposition 31. If x, y, z is a primitive Pythagorean triple then $x \not\equiv y \pmod{2}$. That is,
exactly one of x, y is odd and the other is even.
Proof. By the previous proposition, x, y are not both even, otherwise 2 ∣ (x, y).
Suppose x, y are both odd, i.e. x ≡ y ≡ 1 (mod 2). Then x, y ≡ 1, 3 (mod 4), so $x^2 \equiv y^2 \equiv 1 \pmod{4}$ and
\[ z^2 = x^2 + y^2 \implies z^2 \equiv x^2 + y^2 \equiv 2 \pmod{4}, \]
which is impossible, because the congruences
\[ 0^2 \equiv 0, \quad 1^2 \equiv 1, \quad 2^2 \equiv 0, \quad 3^2 \equiv 1 \pmod{4} \]
show that 2 is not a square mod 4. □
Proof. Suppose m, n are positive integers satisfying (1), (2), (3) and (4). Let x, y, z be as in
(4). We compute
\begin{align*}
x^2 + y^2 &= (m^2 - n^2)^2 + (2mn)^2 = m^4 - 2m^2 n^2 + n^4 + 4m^2 n^2 \\
&= m^4 + 2m^2 n^2 + n^4 = (m^2 + n^2)^2 = z^2,
\end{align*}
so x, y, z form a Pythagorean triple. Suppose x, y, z are not primitive. That is, there exists
a prime p such that p ∣ x, p ∣ y, and p ∣ z.
Since $x = m^2 - n^2 \equiv m - n \not\equiv 0 \pmod{2}$ by (3), it follows that x is odd, so p ≠ 2. Note also
that p divides both $x + z = 2m^2$ and $z - x = 2n^2$, so p ∣ m and p ∣ n, contradicting (1). Thus,
x, y, z form a primitive Pythagorean triple.
Conversely, let x, y, z be a primitive Pythagorean triple with even y. From Proposition 30,
we know that (x, y) = (x, z) = (y, z) = 1. We also have
\[ x^2 + y^2 = z^2 \iff y^2 = z^2 - x^2 = (z - x)(z + x), \]
and dividing both sides by 4 we get
\[ \left(\frac{y}{2}\right)^2 = \left(\frac{z - x}{2}\right)\left(\frac{z + x}{2}\right). \]
Since x, z are odd and y is even, $\frac{z-x}{2}$, $\frac{z+x}{2}$ and $\frac{y}{2}$ are integers.
The two factors on the right are positive and coprime: a common prime divisor would divide their sum z and their difference x, contradicting (x, z) = 1. Hence, by Proposition 32, there are positive integers m and n such that
\[ \frac{z - x}{2} = n^2 \quad \text{and} \quad \frac{z + x}{2} = m^2. \]
Writing x, y, z in terms of m, n gives $z = m^2 + n^2$, $x = m^2 - n^2$ and
\[ y^2 = z^2 - x^2 = (z - x)(z + x) = (2n^2)(2m^2) = 4m^2 n^2, \]
hence y = ±2mn. Choosing the positive value of y proves (4). Since x > 0, we have m > n > 0,
proving (2). If a prime p divides n and m, then p divides x and z, contradicting (x, z) = 1; this
proves (1). Finally, to prove (3), suppose both m, n are odd. Then,
\[ z \equiv 1 + 1 \equiv 0 \pmod{2} \quad \text{and} \quad x \equiv 1 - 1 \equiv 0 \pmod{2}, \]
a contradiction with (x, z) = 1. This shows m, n are not both odd and, since they are coprime
by (1), they cannot be both even, proving (3). □
Example 30.4. Using Theorem 47, we can easily produce non-trivial primitive Pythagorean
triples. For example, taking m = 5, n = 2 gives
(x, y, z) = (21, 20, 29),
and taking m = 6, n = 5 gives
(x, y, z) = (11, 60, 61).
We can also take $m = 3^{10}$, $n = 2^{10}$ to obtain
(x, y, z) = (3485735825, 120932352, 3487832977).
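The triples in Example 30.4 can be reproduced with a few lines of code. This is a minimal sketch of the formulas $x = m^2 - n^2$, $y = 2mn$, $z = m^2 + n^2$ used in the proof above; the function name `primitive_triple` and the explicit validity check on m, n (coprime, m > n > 0, opposite parity) are illustrative choices.

```python
from math import gcd

def primitive_triple(m, n):
    """Return (m^2 - n^2, 2mn, m^2 + n^2) for coprime m > n > 0 of opposite parity."""
    if not (m > n > 0 and gcd(m, n) == 1 and (m - n) % 2 == 1):
        raise ValueError("need coprime m > n > 0 of opposite parity")
    return m * m - n * n, 2 * m * n, m * m + n * n

for m, n in [(5, 2), (6, 5), (3**10, 2**10)]:
    x, y, z = primitive_triple(m, n)
    assert x * x + y * y == z * z and gcd(gcd(x, y), z) == 1
    print((x, y, z))
```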
Proposition 32. Let a, b, c ∈ Z with a, b > 0, (a, b) = 1 and $ab = c^2$. Then a and b are squares.
That is, there are positive integers m and n such that $a = m^2$ and $b = n^2$.
Proof. Since $(-c)^2 = c^2$ we may assume c > 0. Consider the prime factorizations
\[ a = p_1^{e_1} \cdots p_k^{e_k}, \quad b = q_1^{s_1} \cdots q_m^{s_m}, \quad c = \ell_1^{d_1} \cdots \ell_n^{d_n}, \]
where the $p_i$ are distinct primes and similarly for the $q_i$ and $\ell_i$. From $ab = c^2$, we have
\[ (p_1^{e_1} \cdots p_k^{e_k})(q_1^{s_1} \cdots q_m^{s_m}) = \ell_1^{2d_1} \cdots \ell_n^{2d_n}. \]
Since (a, b) = 1 we have that $p_i \ne q_j$ for all i, j. By uniqueness of the prime factorization, we
conclude that both sides of the equation are the unique prime factorization of $c^2$. It follows
that n = k + m and, for all 1 ≤ i ≤ k and 1 ≤ j ≤ m, there are $1 \le z_i, z_j \le n$ such that
\[ p_i^{e_i} = \ell_{z_i}^{2d_{z_i}}, \qquad q_j^{s_j} = \ell_{z_j}^{2d_{z_j}}. \]
Therefore,
\[ a = \ell_{z_1}^{2d_{z_1}} \cdots \ell_{z_k}^{2d_{z_k}} = \left(\ell_{z_1}^{d_{z_1}} \cdots \ell_{z_k}^{d_{z_k}}\right)^2, \]
hence a is a square. Similarly, b is a square. □
Remark 30.5. The hypotheses in Proposition 32 are necessary. Indeed, $(-4)(-9) = 6^2$ and
(−4, −9) = 1 but neither −4 nor −9 is a square. Moreover, $(3 \cdot 2^2) \cdot 3 = 6^2$ and both factors are
positive, but neither $3 \cdot 2^2$ nor 3 is a square. However, we also have $(2^2)(3^2) = 6^2$, where the
factors are positive, coprime and squares, as predicted by the proposition.
30.2. Geometric View of Pythagorean Triples. In this section we will interpret Pythagorean
triples in a geometric way as points of positive rational coordinates on the unit circle, which
we recall is defined by the equation $x^2 + y^2 = 1$.
[Figure: the unit circle, with the line y = t(x + 1) passing through the points (−1, 0) and $(x_0, y_0)$ on the circle.]
For integers a, b, c with c ≠ 0, we have
\[ a^2 + b^2 = c^2 \iff \left(\frac{a}{c}\right)^2 + \left(\frac{b}{c}\right)^2 = 1. \tag{30.6} \]
If $(x_0, y_0)$ is a point on the unit circle with rational coordinates $x_0, y_0$ then, for some choice of a common
denominator c, we can write $x_0 = a/c$ and $y_0 = b/c$ and the equivalence (30.6) shows that a,
b, c, when positive, form a Pythagorean triple.
Example 30.7. The triple $3^2 + 4^2 = 5^2$ gives rise to the point $(\frac{3}{5}, \frac{4}{5})$ on the unit circle.
It follows from the previous discussion that we can describe Pythagorean triples by describing
the points on the unit circle having rational coordinates. We will now obtain an explicit
description of such points.
Consider the line y = t(x + 1), passing through the point (−1, 0) and a point $(x_0, y_0)$ on the
unit circle. This line has slope $t = \frac{y_0}{x_0 + 1}$, which is a rational number if the coordinates $x_0$,
$y_0$ are rational. In particular, when $(x_0, y_0) = (\frac{a}{c}, \frac{b}{c})$ arises from a Pythagorean triple, the
slope t is rational. Conversely, if we intersect the unit circle with the line y = t(x + 1) for
a rational value of the slope t we obtain a point $(x_0, y_0)$ with rational coordinates. Indeed,
the intersection points are those whose coordinates (x, y) satisfy both equations
\[ x^2 + y^2 = 1 \quad \text{and} \quad y = t(x + 1). \]
Substituting the equation for y into the first equation leads to
\begin{align*}
x^2 + (t(x + 1))^2 = 1 &\iff x^2 - 1 + t^2 (x + 1)^2 = 0 \iff (x - 1)(x + 1) + t^2 (x + 1)^2 = 0 \\
&\iff (x + 1)\left((x - 1) + t^2 (x + 1)\right) = 0,
\end{align*}
hence
\[ x + 1 = 0 \quad \text{or} \quad x - 1 + t^2 (x + 1) = 0. \]
Solving this for x gives
\[ x = -1 \quad \text{or} \quad x = \frac{1 - t^2}{1 + t^2}. \]
Now, by replacing these values in the equation y = t(x + 1) we see that the corresponding y
coordinates are
\[ y = 0 \quad \text{or} \quad y = t \left( \frac{1 - t^2}{1 + t^2} + 1 \right) = \frac{2t}{1 + t^2}. \]
Therefore, the points of intersection of the line with the unit circle are
\[ (-1, 0) \quad \text{and} \quad (x_0, y_0) = \left( \frac{1 - t^2}{1 + t^2}, \frac{2t}{1 + t^2} \right). \]
The first point was expected due to our construction of the line, while the second has rational
coordinates if the slope t is a rational number, as desired. Suppose now t = n/m is rational
with m > n > 0, so that 0 < t < 1 and both coordinates are positive. Then,
\[ (x_0, y_0) = \left( \frac{1 - (n/m)^2}{1 + (n/m)^2}, \frac{2(n/m)}{1 + (n/m)^2} \right) = \left( \frac{m^2 - n^2}{m^2 + n^2}, \frac{2mn}{m^2 + n^2} \right) \]
and by the argument in the beginning of this section, we obtain the Pythagorean triple
\[ m^2 - n^2, \quad 2mn, \quad m^2 + n^2, \]
recovering the formulas in Theorem 47.
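The parametrization of rational points by the slope t can be tried out with exact rational arithmetic. The sketch below is illustrative; the function name `circle_point` is hypothetical, and the slope t = 1/2 is chosen because it yields the point of Example 30.7.

```python
from fractions import Fraction

def circle_point(t):
    """Second intersection of the line y = t(x + 1) with the unit circle."""
    t = Fraction(t)
    return (1 - t * t) / (1 + t * t), 2 * t / (1 + t * t)

x0, y0 = circle_point(Fraction(1, 2))
assert x0 * x0 + y0 * y0 == 1      # the point lies on the unit circle
assert y0 == Fraction(1, 2) * (x0 + 1)  # and on the line of slope 1/2
print(x0, y0)  # 3/5 4/5
```

Clearing the common denominator 5 recovers the triple 3, 4, 5.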
Exercises.
Exercise 30.8. Find formulas for the integers of all Pythagorean triples (x, y, z) with
z = y + 1.
Exercise 30.9. Use the classification of primitive Pythagorean triples to show that if (x, y, z)
is a PPT, then at least one of x, y, and z is divisible by 4.
31. Fermat’s Last Theorem and Infinite Descent
To prove FLT, it is enough to consider the case n = 4 or n = p, for p an odd prime. Indeed,
for a composite n ≥ 3 we can write n = ab where b = 4 or b = p is an odd prime. Therefore,
from a solution $x_0^n + y_0^n = z_0^n$ we get $(x_0^a)^b + (y_0^a)^b = (z_0^a)^b$, that is, a solution for exponent
b. So, if we show that $x^b + y^b = z^b$ has no solutions in non-zero integers, then the original
equation $x^n + y^n = z^n$ also cannot have solutions in non-zero integers.
We shall shortly prove Fermat’s Last Theorem for n = 4 by combining Theorem 47 with a
method called infinite descent due to Fermat.
Let us first sketch the main idea of infinite descent. Suppose $x_0, y_0, z_0$ is an integral solution
to the equation $x^4 + y^4 = z^4$ such that $x_0 y_0 z_0 \ne 0$. Starting from this solution, we construct
another solution to the same equation in non-zero integers $x_1, y_1, z_1$ with the property that
$0 < z_1 < z_0$. Then, from $x_1, y_1, z_1$, we construct another solution in non-zero integers $x_2, y_2, z_2$
such that $0 < z_2 < z_1 < z_0$. Repeating this procedure, the values of $z_i$ form a strictly decreasing
infinite sequence of positive integers; this is clearly impossible, therefore the original solution
$x_0, y_0, z_0$ cannot exist.
To illustrate infinite descent in a simpler situation, we will prove the following fact.
Theorem 49. The number $\sqrt{2}$ is not rational.
Proof. Suppose $\sqrt{2} \in \mathbb{Q}$. Then, there are positive integers p and q such that $\sqrt{2} = \frac{p}{q}$. Thus,
squaring both sides leads to
\[ 2 = \frac{p^2}{q^2} \iff 2q^2 = p^2 \implies 2 \mid p^2 \implies 2 \mid p, \]
so p = 2r for some r ∈ Z>0. Then,
\[ 2q^2 = (2r)^2 = 4r^2 \iff q^2 = 2r^2, \]
hence, as above, 2 ∣ q, i.e. q = 2s for some s ∈ Z>0. Therefore,
\[ \frac{p}{q} = \frac{2r}{2s} = \frac{r}{s}, \quad \text{with } 0 < r < p, \ 0 < s < q. \]
Now, starting from $\sqrt{2} = r/s$ and arguing as above, we get r′, s′ such that $\sqrt{2} = r'/s'$ with
0 < r′ < r < p and 0 < s′ < s < q. Continuing this procedure generates a strictly decreasing
infinite sequence of positive integers, which is a contradiction. □
Finally, we will prove the following theorem, which implies FLT for n = 4.
Theorem 50 (Fermat). The equation $x^4 + y^4 = z^2$ has no solutions in non-zero integers.
Corollary 23. FLT holds for exponent n = 4.
Proof of Theorem 50. Suppose $x_1, y_1, z_1$ satisfy $x_1^4 + y_1^4 = z_1^2$ and $x_1 y_1 z_1 \ne 0$. Since the exponents
are even we can assume $x_1, y_1, z_1 > 0$. Further, we can assume $(x_1, y_1) = 1$. Indeed, if
$d = (x_1, y_1)$ and $x_1 = d x_1'$, $y_1 = d y_1'$, then $(x_1', y_1') = 1$ and $x_1', y_1', z_1/d^2$ also satisfy the equation, because
\[ d^4 x_1'^4 + d^4 y_1'^4 = z_1^2 \iff x_1'^4 + y_1'^4 = \left( \frac{z_1}{d^2} \right)^2 \]
(note that $d^4 \mid z_1^2$ forces $d^2 \mid z_1$).
We will show there is another solution $x_2, y_2, z_2 > 0$ such that $(x_2, y_2) = 1$ and $z_2 < z_1$.
Note that
\[ x_1^4 + y_1^4 = (x_1^2)^2 + (y_1^2)^2 = z_1^2 \quad \text{and} \quad (x_1^2, y_1^2, z_1) = 1, \]
so that $x_1^2, y_1^2, z_1$ forms a primitive Pythagorean triple. Further, we know that, by swapping
$x_1^2$ and $y_1^2$ if necessary, we can assume $y_1^2$ to be even, hence $y_1$ is even and $x_1$ odd. Then,
from Theorem 47, there are coprime integers m > n > 0 with different parity such that
\[ x_1^2 = m^2 - n^2, \quad y_1^2 = 2mn, \quad z_1 = m^2 + n^2. \]
In particular, $x_1^2 + n^2 = m^2$ so that $x_1, n, m$ forms a primitive Pythagorean triple with n even.
Again from Theorem 47, there are coprime integers a > b > 0 with different parity such that
\[ x_1 = a^2 - b^2, \quad n = 2ab, \quad m = a^2 + b^2. \]
We claim that m, a and b are squares, that is, there are positive integers $z_2, x_2, y_2$ such that
\[ m = z_2^2, \quad a = x_2^2, \quad b = y_2^2 \]
with $(x_2, y_2) = 1$ since (a, b) = 1. Finally, from $m = a^2 + b^2$ and the claim, we obtain $x_2^4 + y_2^4 = z_2^2$,
meaning that $x_2, y_2, z_2$ give a solution to the equation $x^4 + y^4 = z^2$ satisfying $(x_2, y_2) = 1$ and
$x_2 y_2 z_2 \ne 0$. Furthermore,
\[ 0 < z_2 \le z_2^2 = m \le m^2 < m^2 + n^2 = z_1, \]
as desired. A contradiction now follows by infinite descent as explained above.
We will now prove the claim. We have to show m, a, b are squares. Recall that
\[ y_1^2 = 2mn = m(2n), \quad (m, 2n) = 1, \quad m > 0, \quad 2n > 0, \]
hence m and 2n are squares by Proposition 32. Since 2n is an even square, there is an integer c > 0
such that $2n = 4c^2$, hence $n = 2c^2$. Now
\[ n = 2ab \iff 2c^2 = 2ab \implies ab = c^2 \]
and, since a and b are positive and coprime, by Proposition 32 they must be squares. □
Exercises.
Exercise 31.1. Prove that there is at most one square in any Pythagorean triple.
32. Fermat Factorization
Fermat factorization, named after Pierre de Fermat, is a factorization method based on the
representation of an odd integer as the difference of two squares
\[ n = a^2 - b^2 = (a - b)(a + b) \]
and, if neither factor a − b nor a + b equals 1, this is a proper factorization of n.
Lemma 13. Let n ∈ Z>0 be odd. Then there is a one-to-one correspondence between factorizations
of n into 2 positive odd numbers and differences of squares that equal n.
Based on this lemma, to apply Fermat factorization, one tries various values t, hoping that
$t^2 - n$ is a square. More precisely, for n > 0 odd, we apply the following steps:
(i) Find the smallest integer $t \ge \sqrt{n}$.
(ii) Consider the sequence of numbers
\[ t^2 - n, \quad (t + 1)^2 - n, \quad (t + 2)^2 - n, \quad \dots \]
until a square $s_0^2 = (t + k)^2 - n$ is found.
(iii) Let $t_0 = t + k$. We have
\[ n = t_0^2 - s_0^2 = (t_0 + s_0)(t_0 - s_0). \]
The procedure in (ii) will terminate since
\[ n = \left( \frac{n + 1}{2} \right)^2 - \left( \frac{n - 1}{2} \right)^2 \iff \left( \frac{n + 1}{2} \right)^2 - n = \left( \frac{n - 1}{2} \right)^2 \]
and
\[ \frac{n + 1}{2} \ge \sqrt{n}. \]
Corollary 24. Successive applications of Fermat’s factorization will factor n completely. In
particular, if n = pq one application suffices.
Example 32.1. Take n = 6077. The smallest integer t such that $t \ge \sqrt{n}$ is t = 78. We
therefore compute the sequence of numbers $(t + k)^2 - n$ for k ≥ 0 until a square is found.
\begin{align*}
t^2 - n &= 78^2 - 6077 = 7 \\
(t + 1)^2 - n &= 79^2 - 6077 = 164 \\
(t + 2)^2 - n &= 80^2 - 6077 = 323 \\
(t + 3)^2 - n &= 81^2 - 6077 = 484 = 22^2.
\end{align*}
We conclude that
\[ 6077 = 81^2 - 22^2 = (81 + 22)(81 - 22) = 103 \cdot 59. \]
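Steps (i) to (iii) of Fermat factorization translate directly into a short loop. The following is a minimal sketch, assuming n is an odd positive integer; the function name `fermat_factor` is hypothetical, and `math.isqrt` (Python 3.8+) is used to test for perfect squares without floating point.

```python
from math import isqrt

def fermat_factor(n):
    """Return (t0 - s0, t0 + s0) with n = t0^2 - s0^2, for odd n > 0."""
    t = isqrt(n)
    if t * t < n:
        t += 1                 # smallest integer t >= sqrt(n)
    while True:
        s_squared = t * t - n
        s = isqrt(s_squared)
        if s * s == s_squared:  # t^2 - n is a perfect square
            return t - s, t + s
        t += 1

print(fermat_factor(6077))  # (59, 103)
```

For a prime n the loop only stops at t = (n + 1)/2, returning the trivial factorization (1, n), which reflects why the method works best when n has two factors close to each other.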
Example 32.2. Fermat’s factorization works best when n has factors which are close to
each other. Let us consider the extreme case, where n = pq with p, q being ‘twin primes’,
that is, p and q = p + 2 are consecutive odd numbers.
We first note that $p < \sqrt{n} \le p + 1$. Indeed, suppose for contradiction that $\sqrt{n} \le p$. Then
$n = \sqrt{n}\sqrt{n} \le p^2$, which is impossible because $n = pq = p^2 + 2p$. Similarly, suppose $\sqrt{n} > p + 1$.
It follows that $n > (p + 1)^2 = p^2 + 2p + 1 > p^2 + 2p = n$, a contradiction. So t = p + 1 is the smallest
integer $\ge \sqrt{n}$. Next, we compute the numbers $(t + k)^2 - n$ for k ≥ 0 until a square is found.
Indeed, we see that
\[ t^2 - n = (p + 1)^2 - p(p + 2) = p^2 + 2p + 1 - p^2 - 2p = 1 = 1^2 \]
so we stop in one step. We conclude that
\[ n = (p + 1)^2 - 1^2 = (p + 1 - 1)(p + 1 + 1) = pq. \]
Exercises.
Exercise 32.3. Using the Fermat factorization method, factor 8051.
33. The Pollard p − 1 Factorization Method
We will now introduce a factorization method due to John Pollard. Let n be a large integer
and compute $R_k \equiv 2^{k!} \pmod{n}$ recursively using fast modular exponentiation and the formula
\[ R_k \equiv R_{k-1}^k \pmod{n}. \]
At each step, compute $(R_k - 1, n)$ with the Euclidean Algorithm. Since $0 \le R_k \le n - 1$, we
have $R_k - 1 < n$. Hence, if $R_k \ne 1$ and $(R_k - 1, n) > 1$, we have found a proper divisor of n.
Why does this work? Suppose p is an odd prime dividing n and p − 1 ∣ k! for some k. Note this is true at least
for k ≥ p − 1. Hence, there exists a ∈ Z such that k! = (p − 1)a, and we have
\[ 2^{k!} = 2^{(p-1)a} = (2^{p-1})^a \equiv 1^a \equiv 1 \pmod{p} \]
by Fermat's Little Theorem. It follows that p divides $2^{k!} - 1$. Since $R_k \equiv 2^{k!} \pmod{n}$, we also have
\[ R_k = 2^{k!} + bn \]
for some b ∈ Z. Then $R_k - 1 = (2^{k!} - 1) + bn$, which implies $p \mid (R_k - 1)$ since p ∣ n and
$p \mid (2^{k!} - 1)$. Therefore $p \mid (R_k - 1, n)$.
Example 33.1. Consider n = 10403. We compute
R_k (mod n)                         (n, R_k − 1)
R_2 ≡ 2^2 ≡ 4 (mod n)               (n, 3) = 1
R_3 ≡ 4^3 ≡ 64 (mod n)              (n, 63) = 1
R_4 ≡ 64^4 ≡ 7580 (mod n)           (n, 7579) = 1
R_5 ≡ 7580^5 ≡ 4438 (mod n)         (n, 4437) = 1
R_6 ≡ 4438^6 ≡ 6862 (mod n)         (n, 6861) = 1
R_7 ≡ 6862^7 ≡ 137 (mod n)          (n, 136) = 1
R_8 ≡ 137^8 ≡ 196 (mod n)           (n, 195) = 1
R_9 ≡ 196^9 ≡ 3619 (mod n)          (n, 3618) = 1
R_10 ≡ 9798 (mod n)                 (n, 9797) = 101.
Since (n, 9797) = 101 > 1 we divide 10403 by 101 to get the factorization 10403 = 101 ⋅ 103.
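The recursion $R_k \equiv R_{k-1}^k \pmod{n}$ together with the gcd test can be sketched as follows. This is an illustrative implementation, not an optimized one; the function name `pollard_p_minus_1`, the default base 2 and the cutoff `max_k` are assumptions made for the example.

```python
from math import gcd

def pollard_p_minus_1(n, base=2, max_k=100):
    """Return (divisor, k) for the first k with 1 < gcd(R_k - 1, n) < n, else None."""
    r = base % n                  # R_1 = base^(1!) mod n
    for k in range(2, max_k + 1):
        r = pow(r, k, n)          # R_k = R_{k-1}^k mod n
        g = gcd(r - 1, n)
        if 1 < g < n:             # skip the degenerate case R_k = 1, where g = n
            return g, k
    return None

print(pollard_p_minus_1(10403))  # (101, 10)
```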
Note that a large k always exists but is not practical. The Pollard p − 1 factorization method
is good if we can find small k such that p − 1 ∣ k! for some p ∣ n. This is likely to happen
when p − 1 has small prime factors.
Example 33.2. In the previous example, n = 10403 has the prime factor p = 101. We note
that $p - 1 = 100 = 2^2 \cdot 5^2$ and 100 ∣ k! for k ≥ 10, so the method finds a factor in 10 steps.
Of course, we can replace 2 by any other base b ≥ 2. Lastly, we note that in practice, this method is
used after trial division by small primes and before harder methods (which are not part
of these notes!).
Exercises.
Exercise 33.3. Use the Pollard p − 1 method to find a divisor of 689.
34. Cryptography
Suppose two friends, Alice and Bob, wish to communicate over an insecure channel in such
a way that their opponent Eve cannot understand or change what is being said. To keep
their conversation secure, Alice and Bob must consider the tools they are using to ensure
that their messages are kept secret, as well as the possible attacks on these tools to find out
their weaknesses.
The information that Alice wants to send to Bob is called the plaintext. This is simply
data that can be read and understood without any special measures. Using a key, Alice will
encrypt the plaintext to obtain a ciphertext. To the unknowing observer, ciphertext appears
as unreadable gibberish. However, Bob, who knows the key, can decrypt the ciphertext to
obtain the original message from Alice. The following figure illustrates this process.
Plaintext  ──Encryption──→  Ciphertext
Plaintext  ←──Decryption──  Ciphertext
Definition 34.1. Cryptography is the design and implementation of secure systems. Crypt-
analysis is the process of breaking secure systems. The science that encompasses both of
these ideas is called cryptology.
The above process requires that both Alice and Bob have access to this key. However,
this key needs to be kept secret otherwise third parties such as Eve can use the key to
decrypt their messages. Encryption algorithms which have this property are called symmetric
cryptosystems or private key cryptosystems. There is a form of cryptography which uses two
different types of keys, one which is publicly available and used for encryption whilst the
other is private and used for decryption. These latter types of cryptosystems are called
asymmetric cryptosystems or public key cryptosystems. We will return to these types of
cryptosystems later in this section.
In this section, we use the mathematical techniques that we have thus far learned to encrypt
and decrypt messages that we wish to be kept secret. We will describe some historical
encryption methods that were used in the pre-computer era to encrypt data, as well as the
attacks on them.
Definition 34.2. A cryptosystem is made up of
● P: the set of all plaintext messages,
● C : the set of all ciphertext messages,
● K: the set of all keys,
and a correspondence
\[ k \mapsto (E_k, D_k), \quad \text{for each } k \in K, \]
where
E_k : P → C is the encryption function and
D_k : C → P is the decryption function.
These functions satisfy
\[ D_k(E_k(x)) = x, \quad \text{for all } x \in P. \]
In the private key cryptosystem described above, Eve wants to know what information
Bob and Alice are exchanging, and can attempt to decipher their messages and change the
information being sent between the two. To keep their messages secret from Eve, Alice and
Bob will first choose a random key k ∈ K. Then, to send a message to Bob over an insecure
channel, Alice will encrypt her message using Ek . That is, if the message is a string
x = x1 x2 ⋯xn ,
for some integer n > 0, where each xi ∈ P, then she will encrypt each xi as yi = Ek (xi ) and
send the resulting ciphertext
y = y1 y2 ⋯yn
to Bob. When Bob receives y, he deciphers it using Dk . Applying this protocol, their message
should remain secret from Eve, provided that she is not able to determine the key k. In the
following sections, we study classical cryptosystems based on congruences.
34.1. The Shift Cipher. When Julius Caesar sent messages to his generals, he did not
trust his messengers. So he replaced every A in his messages with a D, every B with an E,
and so on through the alphabet. Only someone who knew the “shift by 3” rule could decipher
his messages. This simple encryption algorithm is known as the Caesar cipher. Of course,
one could shift the alphabet by any arbitrary number. Such a generalization of Caesar’s
cipher is called a shift cipher.
Before describing this encryption algorithm, we must first translate the letters of the English
alphabet into numbers as follows.
A B C D E F G H I J K L M
0 1 2 3 4 5 6 7 8 9 10 11 12
N O P Q R S T U V W X Y Z
13 14 15 16 17 18 19 20 21 22 23 24 25
Note that we could extend this list by including symbols and numbers. For now, however,
we will just use the alphabet. In this case, P = C = K = Z/26Z. Let b ∈ K so that
b ∈ {0, 1, . . . , 25}.
Definition 34.3. The shift cipher is described via the correspondence
\[ b \mapsto (E_b, D_b), \qquad E_b(x) = x + b \pmod{26}, \quad D_b(x) = x - b \pmod{26}, \]
where the key b ∈ K is fixed and secret.
Example 34.4. Suppose Alice wants to send the message “MEET AT FOUR” to Bob using
a shift cipher with the key b = 3. This plaintext may be represented numerically as
MEET AT FOUR ⟶ 12 04 04 19 00 19 05 14 20 17.
Applying the shift cipher $E_3(x) = x + 3 \pmod{26}$ to each of the above numbers yields the
ciphertext
15 07 07 22 03 22 08 17 23 20 ⟶ PHHWDWIRXU,
where “PHHWDWIRXU” is the corresponding alphabetic representation of the ciphertext.
Hence, Alice sends the message “PHHWDWIRXU” to Bob.
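The shift cipher is easy to express in code using the letter-to-number table above (A = 0, ..., Z = 25). The sketch below is illustrative; the function names are hypothetical, and dropping spaces and other non-letters is an assumption that matches how the examples write the ciphertext.

```python
import string

ALPHABET = string.ascii_uppercase  # A=0, B=1, ..., Z=25

def shift_encrypt(plaintext, b):
    """Apply E_b(x) = x + b (mod 26) to each letter; non-letters are dropped."""
    letters = [ch for ch in plaintext.upper() if ch in ALPHABET]
    return "".join(ALPHABET[(ALPHABET.index(ch) + b) % 26] for ch in letters)

def shift_decrypt(ciphertext, b):
    """Apply D_b(x) = x - b (mod 26), i.e. encrypt with the key -b."""
    return shift_encrypt(ciphertext, -b)

print(shift_encrypt("MEET AT FOUR", 3))   # PHHWDWIRXU
print(shift_decrypt("PHHWDWIRXU", 3))     # MEETATFOUR
```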
Example 34.5. Using a shift cipher with the key b = 19, suppose Alice receives the message
BEHOXGBVDXEUTVDIEXTLXWHGMMXEETGRHGX
from Bob. Numerically, this corresponds to
01 04 07 14 23 06 01 21 03 23 04 20 19 21 03 08
04 23 19 11 23 22 07 06 12 12 23 04 04 19 06 17 07 06 23
To translate this back into plaintext, Alice uses the decryption function $D_{19}(x) = x - 19 \pmod{26}$
to obtain
08 11 14 21 04 13 08 02 10 04 11 01 00 02 10 15
11 04 00 18 04 03 14 13 19 19 04 11 11 00 13 24 14 13 04,
so that Alice deciphers the message as
I LOVE NICKELBACK PLEASE DONT TELL ANYONE.
The shift cipher is easy to break as soon as one understands the statistics of the underlying
language, in our case English. The distribution of English letter frequencies is described in
the table below.
Letter Percentage Letter Percentage
A 8.2 N 6.7
B 1.5 O 7.5
C 2.8 P 1.9
D 4.2 Q 0.1
E 12.7 R 6.0
F 2.2 S 6.3
G 2.0 T 9.0
H 6.1 U 2.8
I 7.0 V 1.0
J 0.1 W 2.4
K 0.8 X 0.1
L 4.0 Y 2.0
M 2.4 Z 0.1
To break a shift cipher, we compute the frequencies of the letters in the ciphertext and
compare them with the frequencies obtained from English.
For instance, suppose Eve intercepts the ciphertext
PTLKPAHALHASVUKVUKYBNZLCLYFTVYUPUN.
Suppose further that she knows P, C and that an encryption function of the form
\[ E_b(x) = x + b \pmod{26} \]
was used. She wants to find b. To proceed, Eve begins by translating the ciphertext into its
numerical equivalent as
15 19 11 10 15 00 07 00 11 07 00 18 21 20 10 21 20 10 24 01 13 25 11 02 11 24 05 19 21 24 20 15 20 13.
Looking at the frequency of each letter appearing in the ciphertext, we note that the letters
L and U each occur four times. Since the most common letter in English is ‘E’,
it is reasonable to guess that L or U corresponds to E. Indeed, suppose that E is encrypted
as U. That is,
\[ E_b(4) = 4 + b \equiv 20 \pmod{26} \implies b = 16. \]
Using this key, Eve decrypts the message as
25 03 21 20 25 10 17 10 21 17 10 02 05 04 20 05 04 20 08 11 23 09 21 12 21 08 15 03 05 08 04 25 04 23
which corresponds to
ZDVUZKRKVRKCFEUFEUILXJVMVIPDFIEZEX.
Of course, this is just nonsense so we suppose instead that E is encrypted as L. That is,
\[ E_b(4) = 4 + b \equiv 11 \pmod{26} \implies b = 7. \]
Using this key, Eve decrypts the message as
08 12 04 03 08 19 00 19 04 00 19 11 14 13 03 14 13 03 17 20 06 18 04 21 04 17 24 12 14 17 13 08 13 06.
This corresponds to
IMEDITATEATLONDONDRUGSEVERYMORNING
so that Eve deciphers the message as “I meditate at London Drugs every morning.”
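Eve's attack can be sketched as follows. This is an illustrative sketch under the same assumption Eve makes (the most frequent ciphertext letters are candidates for the encryption of E); the function names candidate_keys and shift_decrypt_text are hypothetical.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase  # A = 0, ..., Z = 25


def shift_decrypt_text(ciphertext, b):
    """Undo E_b(x) = x + b (mod 26) letter by letter."""
    return "".join(ALPHABET[(ALPHABET.index(ch) - b) % 26] for ch in ciphertext)


def candidate_keys(ciphertext, likely_plain="E", top=2):
    """Guess b by assuming each of the `top` most frequent letters encrypts `likely_plain`."""
    counts = Counter(ciphertext)
    return [
        (ALPHABET.index(letter) - ALPHABET.index(likely_plain)) % 26
        for letter, _ in counts.most_common(top)
    ]


ciphertext = "PTLKPAHALHASVUKVUKYBNZLCLYFTVYUPUN"
for b in candidate_keys(ciphertext):
    # Eve reads each candidate and keeps the one that is meaningful English.
    print(b, shift_decrypt_text(ciphertext, b))
```

The final judgement (which candidate is English) is left to the reader here, exactly as in the example; automating it would need a scoring rule that the notes do not describe.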
34.2. The Affine Cipher. A generalization of the shift cipher is the affine cipher. In this
case, the key is (a, b, n) where a and n are coprime. We will denote the key simply by (a, b)
when the value of n is clear from the context.
Definition 34.6. The corresponding encryption function for the affine cipher is
(a, b) ↦ E_{a,b}(x) = ax + b (mod n).
08 06 25 23 25 07 12 25 08 06 25 04 05 11 07 21 25 23 05 10 08
06 25 23 08 07 12 23 06 17 16 25 20 08 25 12 16 12 17 23 25.
This corresponds to
XQHDZIGZKRUHYKMFUIRZMIGZXZHMZIG
ZEFLHVZXFKIGZXIHMXGRQZUIZMQMRXZ
where the most common letters are Z and I. The most frequent letters in English are E and
T, so we try
E ↦ Z and T ↦ I under E_{a,b}, that is,
E_{a,b}(4) = 25 and E_{a,b}(19) = 8.
Therefore, D_{a,b}(y) = cy + d must satisfy
D_{a,b}(25) ≡ 4 (mod 26) and D_{a,b}(8) ≡ 19 (mod 26), that is,
25c + d ≡ 4 (mod 26),
8c + d ≡ 19 (mod 26).
Subtracting the second congruence from the first gives
17c ≡ −15 ≡ 11 (mod 26).
Since 17^{−1} ≡ 23 (mod 26), we obtain
c ≡ 11 ⋅ 23 ≡ 19 (mod 26),
d ≡ 4 − 25 ⋅ 19 ≡ 23 (mod 26),
hence D_{a,b}(y) = 19y + 23 (mod 26). If decrypting the intercepted ciphertext message with
this function leads to meaningful text, we conclude that
E ↔ Z and T ↔ I
was the correct guess. Indeed, using (19,23) as our decryption key, we obtain
18 15 00 02 04 19 07 04 05 08 13 00 11 05 17 14 13 19 08 04 17
19 07 04 18 04 00 17 04 19 07 04 21 14 24 00 06 04 18 14 05 19
07 04 18 19 00 17 18 07 08 15 04 13 19 04 17 15 17 08 18 04,
which corresponds to the plaintext
SPACE THE FINAL FRONTIER THESE ARE THE VOYAGES OF THE STARSHIP ENTERPRISE.
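The key recovery above amounts to solving two linear congruences. A minimal Python sketch follows; the function names are hypothetical, and the three-argument pow with exponent −1 (modular inverse) assumes Python 3.8 or later.

```python
import string

ALPHABET = string.ascii_uppercase  # A = 0, ..., Z = 25


def recover_affine_decryption(y1, x1, y2, x2, n=26):
    """Solve c*y1 + d = x1 and c*y2 + d = x2 (mod n) for (c, d).

    Needs gcd(y1 - y2, n) = 1; otherwise pow raises ValueError.
    """
    c = (x1 - x2) * pow((y1 - y2) % n, -1, n) % n
    d = (x1 - c * y1) % n
    return c, d


def affine_apply(text, c, d, n=26):
    """Apply y -> c*y + d (mod n) to every letter of text."""
    return "".join(ALPHABET[(c * ALPHABET.index(ch) + d) % n] for ch in text)


# Guess from the example: ciphertext Z (25) is E (4), ciphertext I (8) is T (19).
c, d = recover_affine_decryption(25, 4, 8, 19)
ciphertext = (
    "XQHDZIGZKRUHYKMFUIRZMIGZXZHMZIG"
    "ZEFLHVZXFKIGZXIHMXGRQZUIZMQMRXZ"
)
print(c, d)
print(affine_apply(ciphertext, c, d))
```

If the guessed letter pairs were wrong, the same code would still return some (c, d), but the decrypted text would not be meaningful, which is the check described above.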
Indeed, since d ≡ e^{−1} (mod p − 1), we have that de ≡ 1 (mod p − 1) and hence de = 1 + k(p − 1)
for some integer k. It follows that
g(f(x)) = g(x^e)
≡ (x^e)^d (mod p)
≡ x^{de} (mod p)
≡ x^{1+k(p−1)} (mod p)
≡ x ⋅ (x^{p−1})^k (mod p)
≡ x (mod p) by FLT, since x ≢ 0 (mod p).
Note that if x ≡ 0 (mod p), then g(f(x)) = g(0) = 0 ≡ x (mod p) also.
To use this cipher, both Bob and Alice must know the key (p, e), which is kept secret. We
use the usual correspondence between letters and numbers, adding a leading zero where
necessary so that every number has two digits. That is,
A B C D E F G H I J K L M
00 01 02 03 04 05 06 07 08 09 10 11 12
N O P Q R S T U V W X Y Z
13 14 15 16 17 18 19 20 21 22 23 24 25
Example 34.10.
E X A M P L E
04 23 00 12 15 11 04
Next, we group the resulting numbers into blocks of 2m digits, where 2m is the largest
even positive integer such that every possible block is < p. We choose our blocks in this way so that the
numerical value of each block does not get reduced modulo p. For instance, the word BB
corresponds to 0101, and the word LK corresponds to 1110. If we choose p = 1009 and
2m = 4, then 0101 ≡ 1110 (mod p), so that BB is indistinguishable from LK. In this case,
for this value of p, the correct choice of block length is 2m = 2. On the other hand, if
2525 < p < 252525, then m = 2. Note that the largest value of a 2-letter word is ZZ, which
corresponds to 2525.
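The rule for choosing the block length can be written as a short loop. This is only a sketch of the rule stated above; the function name letters_per_block is hypothetical.

```python
def letters_per_block(p):
    """Largest m such that the number 2525...25 (m copies of 25) is < p.

    Returns 0 when even the one-letter word Z (25) is not < p.
    """
    m = 0
    largest = 25  # numerical value of the word Z
    while largest < p:
        m += 1
        largest = largest * 100 + 25  # value of the word with one more Z
    return m


print(letters_per_block(1009))  # 1, so blocks have 2m = 2 digits
print(letters_per_block(2633))  # 2, so blocks have 2m = 4 digits

# The collision from the text: with p = 1009, the 4-digit blocks 0101 and 1110 coincide.
print(101 % 1009 == 1110 % 1009)  # True
```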
Example 34.11. Take p = 2633 and e = 29 so that (2632, 29) = 1 and m = 2. In the example
above, we group the blocks
0423 0012 1511 0425
where the last 25 is used to fill the last block so that every block has 4 digits. Using f(x) = x^{29}
(mod 2633) yields the following ciphertext
2437 2425 1729 0687.
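Example 34.11 can be reproduced with Python's built-in modular exponentiation. The sketch below is illustrative: the helper to_blocks and the choice of Z as padding letter (matching the trailing 25 in the example) are assumptions of this sketch, and pow(e, -1, p - 1) requires Python 3.8 or later.

```python
import string
from math import gcd

ALPHABET = string.ascii_uppercase  # A = 0, ..., Z = 25


def to_blocks(text, m, pad="Z"):
    """Encode letters as 2-digit numbers and group them m letters per block."""
    letters = [ch for ch in text.upper() if ch in ALPHABET]
    while len(letters) % m:
        letters.append(pad)  # fill the last block, as the example does with 25
    blocks = []
    for i in range(0, len(letters), m):
        value = 0
        for ch in letters[i:i + m]:
            value = value * 100 + ALPHABET.index(ch)
        blocks.append(value)
    return blocks


p, e = 2633, 29
assert gcd(e, p - 1) == 1
d = pow(e, -1, p - 1)  # decryption exponent, d = e^{-1} (mod p - 1)

plain = to_blocks("EXAMPLE", 2)             # [423, 12, 1511, 425]
cipher = [pow(x, e, p) for x in plain]      # f(x) = x^e (mod p)
recovered = [pow(y, d, p) for y in cipher]  # g(y) = y^d (mod p)

print(" ".join(f"{y:04d}" for y in cipher))  # 2437 2425 1729 0687
print(recovered == plain)                    # True
```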
34.4. The RSA Cryptosystem. Up until now, we have looked at cryptosystems that
required both communicating parties to have a copy of the same secret key. There is a form
of cryptography which uses two different types of keys, one which is publicly available and
used for encryption whilst the other is private and used for decryption. These latter types
of cryptosystems are called asymmetric cryptosystems or public key cryptosystems. In this
section, we will discuss the world’s first public key cryptosystem, RSA.
The name RSA comes from the initial letters of the surnames of Ron Rivest, Adi Shamir, and Leonard
Adleman, who first publicly described the algorithm in 1978. The security of the RSA algorithm is based
on the difficulty of finding the prime factors of large integers. In such a system, any person can
encrypt a message using the public key of the receiver, but such a message can be decrypted
only with the receiver’s private key. An analogy to this cryptosystem is that of a locked
mailbox with a mail slot. The mail slot is exposed and accessible to the public; its location
(the street address) is, in essence, the public key. Anyone knowing the street address can
go to the door and drop a written message through the slot. However, only the person who
possesses the key can open the mailbox and read the message.
Definition 34.13. To use an RSA cipher, each communicating party must choose two large
distinct primes p and q and an exponent e such that
1 < e < (p − 1)(q − 1) and (e, (p − 1)(q − 1)) = 1.
Let n = pq so that φ(n) = (p − 1)(q − 1) and define d ≡ e^{−1} (mod φ(n)). The (public)
encryption key is (n, e) and the corresponding encryption function is
E_k(x) = x^e (mod n).
The (private) decryption key is (n, d) with decryption function
D_k(x) = x^d (mod n).
The following theorem verifies that D_k does indeed recover the original message.
Theorem 51. We have D_k(E_k(x)) ≡ x (mod n).
Proof. We need to show x^{ed} ≡ x (mod n). By CRT, it is enough to show that
x^{ed} ≡ x (mod p) and x^{ed} ≡ x (mod q).
Suppose first that x ≡ 0 (mod p). Then
x^{ed} ≡ 0 ≡ x (mod p),
and we are done. Suppose now that x ≢ 0 (mod p). By construction, d ≡ e^{−1} (mod φ(n)), so
ed = 1 + φ(n)k = 1 + (p − 1)(q − 1)k
for some integer k, hence
x^{ed} ≡ x^{1+(p−1)(q−1)k} ≡ x ⋅ (x^{p−1})^{(q−1)k} ≡ x (mod p),
where the last congruence follows by FLT since (x, p) = 1. The same argument holds for
x^{ed} ≡ x (mod q), which completes the proof. ∎
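For small parameters, the statement of the theorem can also be checked exhaustively by computer. The sketch below is a sanity check, not a proof; the function name rsa_roundtrip_holds is hypothetical, the test primes are small illustrative choices, and pow(e, -1, phi) assumes Python 3.8 or later.

```python
from math import gcd


def rsa_roundtrip_holds(p, q, e):
    """Check D_k(E_k(x)) = x for every residue x modulo n = p*q."""
    n, phi = p * q, (p - 1) * (q - 1)
    assert gcd(e, phi) == 1, "e must be coprime to (p - 1)(q - 1)"
    d = pow(e, -1, phi)  # d = e^{-1} (mod phi(n))
    return all(pow(pow(x, e, n), d, n) == x for x in range(n))


# Includes the residues x that are NOT coprime to n (multiples of p or q),
# which is exactly the case the proof treats separately.
print(rsa_roundtrip_holds(11, 3, 3))  # True
```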
Example 34.14. Take p = 11, q = 3. Then n = pq = 33 so that φ(n) = (p − 1)(q − 1) = 20.
Choose e = 3 and note that this is a valid choice since (e, (p − 1)(q − 1)) = (3, 20) = 1. In this
case, we find that d ≡ e^{−1} ≡ 7 (mod φ(n)). Hence, the public key is given by (n, e) = (33, 3)
and the private key is (n, d) = (33, 7).
Suppose we want to use this system to encrypt the message “This is an example.” Since
25 < n < 2525, after changing each letter into its corresponding 2-digit number, we group the
resulting numbers into blocks of 2m digits, where m = 1. This means that 2m = 2 is the
largest positive integer such that all blocks are < n. Hence, we have
19 07 08 18 08 18 00 13 04 23 00 12 15 11 04
Now, for each of these 2-digit numbers x, we compute x^3 (mod 33) to obtain the ciphertext
integers
28 13 17 24 17 24 00 19 31 23 00 12 09 11 31.
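The computations of Example 34.14 can be sketched as follows. The function mod_inverse is a hypothetical helper written for this illustration; it computes d from e and φ(n) with the extended Euclidean algorithm.

```python
def mod_inverse(a, m):
    """Inverse of a modulo m via the extended Euclidean algorithm."""
    old_r, r = a, m
    old_s, s = 1, 0  # invariant: old_s * a = old_r (mod m)
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
    if old_r != 1:
        raise ValueError("a is not invertible modulo m")
    return old_s % m


p, q, e = 11, 3, 3
n, phi = p * q, (p - 1) * (q - 1)
d = mod_inverse(e, phi)  # 7

# "THIS IS AN EXAMPLE" as 2-digit blocks (m = 1).
plain = [19, 7, 8, 18, 8, 18, 0, 13, 4, 23, 0, 12, 15, 11, 4]
cipher = [pow(x, e, n) for x in plain]
print(" ".join(f"{y:02d}" for y in cipher))
# 28 13 17 24 17 24 00 19 31 23 00 12 09 11 31

# Decrypting with the private exponent recovers the plaintext blocks.
print([pow(y, d, n) for y in cipher] == plain)  # True
```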
Example 34.15. Consider the system (n, e) = (3127, 11) and suppose we want to encrypt
the message “Number theory is my favourite class.” Converting the plaintext into digits and
separating these digits into blocks yields
1320 1201 0417 1907 0414 1724 0818 1224 0500 2114 2017 0819 0402 1100 1818.
Here, since n = 3127 satisfies 2525 < n < 252525, we take m = 2, so that each block has
2m = 4 digits and every block is < n. Using our encryption key, we compute x^e (mod n)
for each block x. This gives the encrypted ciphertext
1464 2549 0702 1854 1122 2356 1196 2193 2150 0399 1611 1499 1988 0991 0100.
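The block conversion in Example 34.15 can be sketched in Python as below. Two things are assumptions of this sketch and do not appear in the notes: the helper name text_to_blocks, and the factorization 3127 = 53 ⋅ 59, which is used only to compute a decryption exponent for a round-trip check (pow with exponent −1 needs Python 3.8 or later).

```python
import string

ALPHABET = string.ascii_uppercase  # A = 0, ..., Z = 25


def text_to_blocks(text, m):
    """Drop non-letters, encode letters as 2-digit numbers, group m letters per block.

    Assumes the number of letters is a multiple of m (true for this message).
    """
    codes = [ALPHABET.index(ch) for ch in text.upper() if ch in ALPHABET]
    blocks = []
    for i in range(0, len(codes), m):
        value = 0
        for code in codes[i:i + m]:
            value = value * 100 + code
        blocks.append(value)
    return blocks


n, e = 3127, 11
plain = text_to_blocks("Number theory is my favourite class.", 2)
print(" ".join(f"{x:04d}" for x in plain))

cipher = [pow(x, e, n) for x in plain]
print(" ".join(f"{y:04d}" for y in cipher))

# Assumption of this sketch: n = 53 * 59, so phi(n) = 52 * 58 = 3016.
p, q = 53, 59
d = pow(e, -1, (p - 1) * (q - 1))
print([pow(y, d, n) for y in cipher] == plain)  # True
```

The second printed line can be compared with the ciphertext blocks listed in the example.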
RSA is considered secure because factoring large integers is computationally very hard.
Indeed, if we could factor n = pq, then we could compute φ(n) = (p − 1)(q − 1) and, since (n, e)
is public, find d ≡ e^{−1} (mod φ(n)) via the Euclidean Algorithm. To break RSA we
only need the value of d, which can be computed from φ(n) and e without necessarily knowing the
factorization of n. However, the following argument shows that computing the value of φ(n)
is not simpler than factoring n. Indeed, suppose we know both n and φ(n). We have
(i) φ(n) = (p − 1)(q − 1) = pq − p − q + 1 ⇐⇒ p + q = pq − φ(n) + 1 = n − φ(n) + 1,
(ii) p − q = √((p + q)^2 − 4n) (taking p > q).
Therefore, with the values of n and φ(n) we compute p + q using (i), then we use (ii) to
compute p − q. Finally, we can determine p and q by computing
p = ((p + q) + (p − q))/2,
q = ((p + q) − (p − q))/2,
showing that from knowing φ(n) we can factor n. There have, however, been successful
attacks on RSA, but these issues were addressed by taking more care when setting up an
implementation. For example, the primes p and q should not be close together because of Fermat
factorization (see Example 32.2); moreover, we should choose p and q such that p − 1 and q − 1
have large prime factors, to avoid a successful factorization of n = pq by the Pollard p − 1 factorization
method from Section 33.
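The argument that φ(n) reveals the factorization of n translates directly into code. The following sketch implements steps (i) and (ii); the function name factor_from_phi is hypothetical, and the test values are the small moduli from the examples in this section.

```python
from math import isqrt


def factor_from_phi(n, phi):
    """Recover (p, q) with p >= q from n = p*q and phi = (p - 1)(q - 1)."""
    s = n - phi + 1           # (i): p + q
    disc = s * s - 4 * n      # (ii): (p - q)^2
    diff = isqrt(disc)        # raises ValueError if disc is negative
    if diff * diff != disc:
        raise ValueError("phi is not consistent with n = p*q")
    p, q = (s + diff) // 2, (s - diff) // 2
    if p * q != n:
        raise ValueError("phi is not consistent with n = p*q")
    return p, q


print(factor_from_phi(33, 20))  # (11, 3)
```

Since every step is a handful of integer operations and one integer square root, anyone who can compute φ(n) can factor n essentially for free, which is the point of the argument above.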
Exercises.
Exercise 34.16. Consider an affine cipher with encryption key (a, b, 26). We say that a
letter with numerical value x is “fixed” if x is enciphered as x. Is it possible to choose a, b
with gcd(a, 26) = 1, so that there is
(a) exactly one fixed letter?
(b) exactly two fixed letters?
(c) exactly three fixed letters?
(d) exactly four fixed letters?
(e) exactly 13 fixed letters?
In each part, give a proof if the answer is “no,” and an example if the answer is “yes”.
Exercise 34.17.
(a) Consider the RSA encryption scheme with public encryption key (2623, 11). Encipher
the message PATIENCE IS A VIRTUE.
(b) Decipher the message 284 926 2489 445 662 2445 926 178 using the encryption key as in
Part (a).
Exercise 34.18. Suppose a cryptanalyst discovers a message P that is not relatively prime
to the enciphering modulus n = pq used in an RSA cipher. (He can confirm this by running
the Euclidean algorithm.) Show that the cryptanalyst can factor n.