Mat 102

This document contains lecture notes for a MAT102 course. It covers various mathematical topics that will be important foundations and tools for the course, including the quadratic formula, inequalities, sets and functions, logic, induction, bijections and cardinality, divisibility, and modular arithmetic. It aims to introduce these concepts at a deeper level than secondary school and provide examples to illustrate their applications to solving problems.

Uploaded by

Souvik Ghosh
Copyright
© © All Rights Reserved

MAT102 Lecture Notes

© Tyler Holden, 2016-

Contents

1 Motivating Problems

2 Mathematical Infrastructure
2.1 Quadratic Formula
2.2 Bounding Arguments
2.2.1 Inequalities
2.2.2 The Arithmetic-Geometric Mean Inequality
2.2.3 Absolute Values
2.3 Sets and Set Building
2.3.1 Relations on Sets
2.3.2 Operations on Sets
2.3.3 Functions Between Sets
2.3.4 Properties of Functions
2.4 Ordered Fields
2.4.1 The Field Axioms
2.4.2 Ordered Fields
2.4.3 Complete Fields

3 Mathematical Logic
3.1 Mathematical Predicates
3.2 Universal and Existential Quantifiers
3.3 And, Or, Not
3.3.1 Negating Quantifiers
3.4 Implications
3.4.1 Negating an Implication
3.5 Contradiction (Reductio ad absurdum)
3.6 A Rigmarole of Random Results
3.6.1 Some Number Theory

4 Induction
4.1 Summation and Product Notation
4.2 More General Induction
4.2.1 Recursion
4.3 Fallacies

5 Bijections and Cardinality
5.1 Injective and Surjective Functions
5.2 Inverse Functions
5.3 Cardinality

6 Divisibility and Modular Arithmetic
6.1 Divisibility and Primes
6.1.1 Prime Divisibility and its Implications
6.2 The Euclidean Algorithm
6.3 Linear Diophantine Equations
6.4 Relations on Sets
6.5 Modular Arithmetic

1 Motivating Problems

The kinds of questions we will be considering in this course are not those amenable to rote memorization
or procedural algorithms, like those with which secondary school is replete. We will instead
be looking at questions that require you to think critically: to take knowledge you have already
acquired and apply it to solve a problem you have never seen before.
Critical thinking is hard! Do not be discouraged if you cannot do it at first. Like playing
a musical instrument or learning a sport, it is something that can be learned with practice and
dedication.
To this effect, let’s look at some famous mathematical problems.
Example 1.1

Suppose that a standard 8 × 8 chessboard has two diagonally opposite corners removed. Is
it possible to tile the chessboard with dominoes? More precisely, given 31 dominoes of size
2 × 1, is it possible to cover the chessboard in dominoes such that no two overlap?

Figure 1: A chessboard with two diagonal pieces removed. Notice that by necessity, those two
pieces are of the same color.

Another topic of great appeal to the layman is the notion of infinity. Did you know that there
are many different kinds of infinity? In fact, there are infinitely many different kinds of infinity,
with no infinity being the largest infinity. However, the infinity which enumerates the infinities is
larger than any infinity which it enumerates. That’s confusing eh?
Let’s start with a more reasonable example:
Example 1.2

There are as many whole numbers (like 1, 2, 3, 4, . . .) as rational numbers (like
1/2, 17/4, −22/883), but there are strictly more real numbers than these two.

Or how about this famous anecdote:

Example 1.3

One day Gauss’ teacher asked his class to add together all the numbers from 1 to 100,
assuming that this task would occupy them for quite a while. He was shocked when young
Gauss, after a few seconds' thought, wrote down the answer 5050.
More generally, what is the sum

1 + 2 + 3 + 4 + · · · + (n − 1) + n

for any natural number n? What if we change it to

$1^2 + 2^2 + 3^2 + 4^2 + \cdots + (n - 1)^2 + n^2$?

We will be able to answer all of these questions and more by the end of this course.
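
Before any theory is developed, the sums in Example 1.3 can at least be evaluated by brute force for a given n. The following Python sketch is an illustration only: the function names `sum_first` and `sum_squares` are our own, and direct addition gives no insight into the general pattern that the course goes on to establish.

```python
def sum_first(n: int) -> int:
    """Return 1 + 2 + ... + n by direct addition."""
    return sum(range(1, n + 1))


def sum_squares(n: int) -> int:
    """Return 1^2 + 2^2 + ... + n^2 by direct addition."""
    return sum(k * k for k in range(1, n + 1))


print(sum_first(100))   # Gauss' answer from the anecdote: 5050
print(sum_squares(4))   # 1 + 4 + 9 + 16 = 30
```

Trying a few values of n is a reasonable way to guess a formula, but only a proof can confirm it for every natural number.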

2 Mathematical Infrastructure

In order to discuss proofs, we will need raw materials, things like numbers, functions, and sets.
You should already be familiar with some of the basics, but here we will introduce these items in
a little more depth.

2.1 Quadratic Formula

The quadratic formula represents an interesting starting point. Consider the equation

\[ ax^2 + bx + c = 0 \]

for constants a, b, c with a ≠ 0. The student is familiar with the famous quadratic formula, which tells us that
the solutions to this equation are given by

\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]

But as far as most of you are concerned, this is some mysterious quantity that your high school
teachers materialized out of thin air. So where does it come from?
The answer is really quite simple, although there is some tricky algebra to be done. We are
taught when we are younger how to “complete the square,” which is to convert

$ax^2 + bx + c$ into something of the form $\alpha(x - \beta)^2 + \gamma$.

This is useful for graphing the quadratic, or maybe determining the apex of the corresponding

parabola. It is also very useful for finding the roots, since

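\begin{align*}
\alpha(x - \beta)^2 + \gamma = 0 &\iff \alpha(x - \beta)^2 = -\gamma \\
&\iff (x - \beta)^2 = -\frac{\gamma}{\alpha} \\
&\iff (x - \beta) = \pm\sqrt{-\frac{\gamma}{\alpha}} \\
&\iff x = \beta \pm \sqrt{-\frac{\gamma}{\alpha}}. \tag{2.1}
\end{align*}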

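Well now, that looks pretty darn similar to the quadratic formula, except that we need to write
α, β, γ in terms of a, b, c. So let’s complete the square on $ax^2 + bx + c$, where we find that
\begin{align*}
ax^2 + bx + c &= a\left(x^2 + \frac{b}{a}x\right) + c && \text{factor out the } a \\
&= a\left(x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} - \frac{b^2}{4a^2}\right) + c && \text{squaring half the coefficient of } x \\
&= a\left(x^2 + \frac{b}{a}x + \frac{b^2}{4a^2}\right) - \frac{b^2}{4a} + c && \text{pulling the } b^2/(4a^2) \text{ term out} \\
&= \underbrace{a}_{\alpha}\Big(x + \underbrace{\frac{b}{2a}}_{-\beta}\Big)^2 + \underbrace{\frac{4ac - b^2}{4a}}_{\gamma}.
\end{align*}
Comparing with $\alpha(x - \beta)^2 + \gamma$, we read off $\alpha = a$, $\beta = -\frac{b}{2a}$, and $\gamma = \frac{4ac - b^2}{4a}$.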

Substituting these values of α, β, γ into (2.1) gives us the quadratic formula.


The discriminant of the quadratic is the term $D := b^2 - 4ac$ located under the square root
sign in the quadratic formula. The sign of this term tells us precisely how many real roots a
quadratic has: if D > 0 then there are precisely two distinct roots, if D = 0
then there is a single repeated root, and if D < 0 then there are no real roots.

Figure 2: The various graphs of the parabola $ax^2 + bx + c$ depending on the value of the discriminant
D (panels, left to right: D > 0, D = 0, D < 0). There are 1 + sgn(D) roots of the parabola.
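
The case analysis on the discriminant translates directly into code. The following Python sketch is a minimal illustration, assuming real coefficients with a ≠ 0; the function name `real_roots` is our own and is not part of the notes.

```python
import math


def real_roots(a: float, b: float, c: float) -> list[float]:
    """Real roots of a*x^2 + b*x + c = 0 (a != 0), in increasing order."""
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    d = b * b - 4 * a * c            # the discriminant D
    if d < 0:
        return []                    # D < 0: no real roots
    if d == 0:
        return [-b / (2 * a)]        # D = 0: a single repeated root
    s = math.sqrt(d)                 # D > 0: two distinct roots
    return sorted([(-b - s) / (2 * a), (-b + s) / (2 * a)])


print(real_roots(1, -3, 2))   # x^2 - 3x + 2 = (x - 1)(x - 2)
print(real_roots(1, 2, 1))    # x^2 + 2x + 1 = (x + 1)^2
print(real_roots(1, 0, 1))    # x^2 + 1 has no real roots
```

The length of the returned list is exactly the number of real roots, matching the count 1 + sgn(D) in the caption of Figure 2.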

2.2 Bounding Arguments

To discuss the notion of length, we need to be able to compare relative sizes. This leads us to the
notion of inequalities. For example, we know that 2 < 4 or that −5 ≤ 0, or even that e < π. This

is known as the total ordering of the real numbers, which we will discuss in more detail later.¹

2.2.1 Inequalities

This becomes significantly more complicated when we are determining more general rules; that is,
rules that hold when we are unable to use specific numbers. You may take the following as axioms
(though in reality they need to be proven):

Proposition 2.1

If a, b, c are real numbers then the following hold:

1. If a < b and c > 0 then ca < cb,

2. $a^2 \ge 0$,

3. If a ≥ 0 there is a unique non-negative number d such that $d^2 = a$. We often write
$d = \sqrt{a}$ to explicitly denote the relationship between d and a.

4. If a < b and b < c then a < c.

Proposition 2.1 (3) in particular is difficult to prove, and requires something called the
Completeness Axiom.

Exercise: How does Proposition 2.1 change if the less-than signs are changed to
less-than-or-equal signs?

We can use these basic tools to build more sophisticated results.


Proposition 2.2

For any real numbers a, b we have $a^2 + b^2 \ge 2ab$.

Proof. Note that $(a - b)^2 \ge 0$ by property (2). Expanding the square gives

\begin{align*}
(a - b)^2 \ge 0 &\iff a^2 - 2ab + b^2 \ge 0 \\
&\iff a^2 + b^2 \ge 2ab,
\end{align*}

which is what we wanted to show.

Example 2.3

Suppose that a, b, c ≥ 0 and b + c ≥ 2. Show that $(a + b + c)^2 \ge 4a + 4bc$.

¹ The proper definition of the inequality is as follows: We say that a > b if a − b is a positive number. Using this,
try to prove the facts about inequalities, for example Proposition 2.1 (1) and (4).

Solution. Starting with the left hand side, we expand to get

\[ (a + b + c)^2 = a^2 + b^2 + c^2 + 2ab + 2bc + 2ac. \]

We need to find some way of using the fact that b + c ≥ 2. Notice that by clever factoring, and since a ≥ 0, we can
write
\[ 2ab + 2ac + 2bc = 2a(b + c) + 2bc \ge 4a + 2bc \]

so that our inequality becomes

\[ (a + b + c)^2 \ge a^2 + b^2 + c^2 + 4a + 2bc. \]

We must somehow convert the $a^2 + b^2 + c^2 + 2bc$ into something that looks like 4bc. By Proposition
2.2, we know that $b^2 + c^2 \ge 2bc$ so

\[ a^2 + (b^2 + c^2) + 2bc \ge a^2 + 4bc \ge 4bc \]

where in the last inequality we have used the fact that $a^2 \ge 0$. Putting this all together,

\[ (a + b + c)^2 \ge 4a + 4bc \]

exactly as required.
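
A random spot check is no substitute for the proof, but it is a quick way to catch an algebra slip in a bounding argument. The Python sketch below samples triples satisfying the hypotheses of Example 2.3 and tests the inequality; the helper name `holds`, the sampling range and the rounding tolerance are all our own assumptions.

```python
import random


def holds(a: float, b: float, c: float) -> bool:
    """Check (a + b + c)^2 >= 4a + 4bc, with a small tolerance for rounding."""
    return (a + b + c) ** 2 >= 4 * a + 4 * b * c - 1e-9


random.seed(0)
samples = []
while len(samples) < 10_000:
    a, b, c = (random.uniform(0, 10) for _ in range(3))
    if b + c >= 2:                 # keep only triples satisfying the hypothesis
        samples.append((a, b, c))

print(all(holds(a, b, c) for a, b, c in samples))
print(holds(1, 0, 0))              # b + c < 2 here, and the inequality fails
```

The second line of output shows that the hypothesis b + c ≥ 2 cannot simply be dropped.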

Proposition 2.4
If a, b are real numbers such that 0 < a < b, then $a^2 < ab < b^2$ and $0 < \sqrt{a} < \sqrt{b}$.

Proof. Let’s start by showing that $a^2 < ab < b^2$. We know that 0 < a < b and since a > 0 we can
multiply through by a, preserving the inequality, to get

\[ 0 < a^2 < ab. \]

Similarly, since b > 0 we can multiply through 0 < a < b by b and preserve the inequality, to get

\[ 0 < ab < b^2. \]

Combining both inequalities together gives $a^2 < ab < b^2$ as required.


To show that $0 < \sqrt{a} < \sqrt{b}$, note that the assumption a < b can be written as b − a > 0.
Thinking of this as a difference of squares allows us to write

\[ b - a = (\sqrt{b} - \sqrt{a})(\sqrt{b} + \sqrt{a}) > 0. \]

As both $\sqrt{a}, \sqrt{b} > 0$, so too is their sum, showing that $\sqrt{a} + \sqrt{b} > 0$. Thus $\sqrt{b} - \sqrt{a} > 0$ as well,
or equivalently $\sqrt{a} < \sqrt{b}$.

2.2.2 The Arithmetic-Geometric Mean Inequality

One of the more famous inequalities in mathematics is the Arithmetic-Geometric Mean Inequality.
In order to discuss it, we must remind ourselves of the arithmetic mean and the geometric mean.
Definition 2.5
If a, b are two real numbers, then the
Arithmetic Mean is $\frac{a + b}{2}$, Geometric Mean is $\sqrt{ab}$.

The arithmetic mean is usually what is meant when we talk about an average. However, there
are cases in which the geometric mean can also be interpreted as an average. For example, say that
you have an investment of $100 which grows at a rate of 5% the first year and 8% in the second
year. After two years the value of your investment is
$100 × (1.05) × (1.08) = $100 × (1.134) = $113.40
It is often more convenient to discuss such an investment in terms of its effective annual rate, which
is the hypothetical fixed rate at which your investment would have accrued the same final value. If that
rate corresponds to a yearly growth factor r, then we need to solve

\[ r \times r = 1.134 \quad\Rightarrow\quad r = \sqrt{1.05 \times 1.08} \approx 1.065. \]
The number 1.065 is precisely the geometric mean of 1.05 and 1.08.
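
The investment calculation is easy to reproduce. The Python sketch below recomputes the final value and the effective yearly growth factor from the two rates in the text; the variable names are our own.

```python
import math

growth = [1.05, 1.08]                          # yearly growth factors from the text
final_value = 100 * math.prod(growth)          # value of $100 after two years
r = math.prod(growth) ** (1 / len(growth))     # geometric mean of the factors

print(round(final_value, 2))
print(round(r, 3))
print(round(100 * r * r, 2))                   # growing at rate r twice gives the same value
```

Note that the arithmetic mean of the two factors, 1.065 exactly, is slightly larger than r; the two only agree here after rounding to three decimal places.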
Given a collection of n numbers $a_1, \ldots, a_n$, we know their arithmetic mean is given by
\[ \frac{a_1 + a_2 + \cdots + a_n}{n}, \]
while the geometric mean of that same group of numbers is given by
\[ \sqrt[n]{a_1 a_2 \cdots a_n}. \]

Theorem 2.6: Arithmetic-Geometric Mean Inequality (AM-GM)

For any real numbers a, b we have

\[ ab \le \left(\frac{a + b}{2}\right)^2 \]

with equality if and only if a = b. If in addition a, b ≥ 0 then

\[ \sqrt{ab} \le \frac{a + b}{2}. \]

Note that this theorem generalizes to

\[ \sqrt[n]{a_1 \cdots a_n} \le \frac{a_1 + \cdots + a_n}{n} \]
for non-negative $a_1, \ldots, a_n$, but we will focus on the proof when n = 2.

Proof. To present the proof directly would likely lead to confusion, since at some point it will
appear as though we arbitrarily add a term. Instead, let’s work backwards to see if we can reduce
our inequality to a simpler statement. Observe that
\begin{align*}
ab \le \left(\frac{a + b}{2}\right)^2 &\iff 4ab \le a^2 + b^2 + 2ab \\
&\iff 2ab \le a^2 + b^2
\end{align*}

and we know that this last inequality is correct by Proposition 2.2. A proper proof would now consist
of tracing back through this system of equivalences to arrive at the desired result.

Warning!

What I did in the proof above was create a chain of logically equivalent statements,
eventually arriving at a result which I knew to be true, thus implying that every statement
in the chain is true. I did not assume that the first inequality was true!

Students are often tempted to prove inequalities in a similar fashion, but fail to ensure
that every statement is logically equivalent, or beg the question by assuming that the
end result is true. For example, if you complete a proof by concluding 0 = 0, then your
proof is almost certainly wrong.

Here’s an interesting alternate proof. Fix two positive real numbers a, b and construct a semicircle
whose diameter AB has length a + b, as shown in Figure 3. Let D be the point on AB where the segments
of length a and b meet, let C be the point on the semicircle directly above D, and let h be the length
of the perpendicular segment DC.

Figure 3: An alternate proof of the AM-GM. (Labels: diameter AB split at D into lengths a and b; height h at D.)

As an angle inscribed in a semicircle, ∠ACB is a right angle. This in turn implies that triangle △ADC is
similar to triangle △CDB. As these are similar, the ratios of their side-lengths are equal; namely,

\[ \frac{a}{h} = \frac{h}{b} \quad\Rightarrow\quad h^2 = ab \quad\Rightarrow\quad h = \sqrt{ab}. \]

Thus the height h is precisely the geometric mean. Compare this to the red line, whose length is
the radius (a + b)/2. Note that by construction, h can never be longer than the radius, and they
will be equal precisely when a = b.

Example 2.7

A farmer is given 120 metres of fencing and wishes to make a rectangular pasture which
encloses the maximum amount of area. Show that the largest such pasture is obtained with
a square.

Figure 4: The farmer is making a rectangular pasture, but has a fixed amount of fencing. (Labels: side lengths a and b, Area = ab, Perimeter = 2a + 2b.)

Solution. Consider a rectangle with side lengths a, b as shown in Figure 4. The perimeter of this
rectangle is the fence, of which the farmer has 120 metres, so 120 = 2a + 2b. The area is ab, which
seems to work well with the AM-GM, so were we to plug this in directly we would get
\[ ab \le \left(\frac{a + b}{2}\right)^2. \]

But we know 120 = 2a + 2b, so dividing everything by 4 gives (a + b)/2 = 30, and so
\[ ab \le \left(\frac{a + b}{2}\right)^2 = 30^2 = 900. \]

At this point, we only know that 900 is an upper bound for the area: It could be the case that the
true upper bound is some smaller number. One of the powerful aspects of the AM-GM is that it
gives a condition on the inequality to be saturated; namely, when a = b. Setting a = b gives us
a square, and if we want to solve for the precise values of a and b, we have 120 = 2a + 2b = 4a
showing that a = b = 30.
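
The conclusion of Example 2.7 can be checked by trying many rectangles with the same perimeter. The Python sketch below is an illustration only; the constant `PERIMETER`, the helper `area` and the half-metre grid of candidate side lengths are our own choices.

```python
PERIMETER = 120


def area(a: float) -> float:
    """Area of the rectangle with one side a and perimeter PERIMETER."""
    b = PERIMETER / 2 - a          # from 120 = 2a + 2b
    return a * b


# Try side lengths 0, 0.5, 1, ..., 60.
candidates = [k / 2 for k in range(0, 121)]
best = max(candidates, key=area)
print(best, area(best))
```

A grid search like this can only suggest the answer; it is the equality condition in the AM-GM that shows no rectangle, on or off the grid, does better than the square.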

Example 2.8

Find the minimum of the function $f(x) = x^3 + \frac{4}{x^3}$ for x > 0.

Solution. If you know calculus, this problem can be done using optimization techniques. However,
it requires a great deal of time and energy to develop that infrastructure, whereas this problem can
be solved with the AM-GM. How would we realize this? To find the minimum, we need a bound
of the form f (x) ≥ m, where m is some constant. Note that if we multiply the two components of

f, we would indeed get a constant. To see if this goes anywhere, let $a = x^3$ and $b = 4/x^3$, so that
ab = 4, and
\[ \left(\frac{a + b}{2}\right)^2 = \left(\frac{x^3 + 4/x^3}{2}\right)^2 = \frac{(x^3 + 4/x^3)^2}{4}. \]
The AM-GM then says that
\[ 4 \le \frac{(x^3 + 4/x^3)^2}{4} \quad\Rightarrow\quad 16 \le (x^3 + 4/x^3)^2 \quad\Rightarrow\quad 4 \le x^3 + \frac{4}{x^3}, \]
where the last step uses that $x^3 + 4/x^3$ is positive when x > 0.
So f(x) ≥ 4. As with all inequalities though, this might not be a good lower bound; maybe we
can do better. We need to check that this value of the lower bound is actually achieved. By the
AM-GM, we know that equality occurs when a = b, which in this case gives us
\[ x^3 = \frac{4}{x^3} \quad\Rightarrow\quad (x^3)^2 = 4 \]
so $x = \sqrt[6]{4} = \sqrt[3]{2}$, or more conveniently $x^3 = 2$. From here it’s easy to see that when $x = \sqrt[3]{2}$,
$f(\sqrt[3]{2}) = 4$ as required.
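
Both halves of the argument in Example 2.8, the lower bound and the point where it is attained, can be checked numerically. The Python sketch below assumes x > 0 and samples a grid of our own choosing; it illustrates the claim rather than proving it.

```python
def f(x: float) -> float:
    return x**3 + 4 / x**3


x_star = 2 ** (1 / 3)                        # the claimed minimiser, the cube root of 2
grid = [k / 1000 for k in range(1, 5001)]    # x = 0.001, 0.002, ..., 5.000

print(f(x_star))                             # approximately 4
print(min(f(x) for x in grid))               # never drops below 4, up to rounding
```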

2.2.3 Absolute Values

Absolute values are used to measure distances.


Definition 2.9
If x is a real number, then the absolute value of x is
\[ |x| = \begin{cases} x & \text{if } x \ge 0 \\ -x & \text{if } x < 0. \end{cases} \]

Note that |x| ≥ 0, for if x is positive the absolute value does nothing, while if x is negative, the
absolute value adds on yet another negative sign, making it positive.
One can think of the absolute value as measuring the distance of a number from zero. For example,
the numbers 4 and −4 should both be a distance of 4 from 0. We can also use absolute values to
measure the distance between two numbers. If x, y are real numbers, the distance from x to y is
|x − y|.

Figure 5: The real line from −5 to positive 5. We would like to define a system of measurement
such that the red bars have the same length and the blue bars have the same length.

We can use absolute values to describe intervals of real numbers. For example, the collection of
x which satisfy |x| < 2 are those such that −2 < x < 2, or the interval (−2, 2). In a similar vein,
the collection of x such that |x − 1| < 3 are those that are “within a distance of 3 from the number
1.” You can probably guess that this amounts to the interval (−2, 4), but to see it more precisely
note that
|x − 1| < 3 ⇔ −3 < x − 1 < 3 ⇔ −2 < x < 4.

Proposition 2.10

If x, y are real numbers, then

1. x ≤ |x|,

2. |xy| = |x||y|,

3. $\sqrt{x^2} = |x|$,

4. |x + y| ≤ |x| + |y| (Triangle Inequality).

Proof. We will prove (4) and leave the others as an exercise. Since $x^2 = |x|^2$, $y^2 = |y|^2$, and
2xy ≤ 2|x||y| we have
\[ x^2 + 2xy + y^2 \le |x|^2 + 2|x||y| + |y|^2 \iff (x + y)^2 \le (|x| + |y|)^2. \]
Taking the square root of both sides, which preserves the inequality (compare Proposition 2.4), and using (3)
gives |x + y| ≤ |x| + |y| as required.

Exercise: When is the triangle inequality actually an equality?

Example 2.11

If |x − 1| < 1, find an M such that $|x^2 + x - 2| < M$.

Figure 6: Determining some number larger than $|x^2 + x - 2|$ when |x − 1| < 1. (Plot of $y = |x^2 + x - 2|$ over the region |x − 1| < 1.)

Solution. Graphically, this question may be interpreted as in Figure 6. Note that we can write
\[ |x^2 + x - 2| = |x + 2||x - 1| < |x + 2| \quad \text{since } |x - 1| < 1. \]
When |x − 1| < 1 we have 0 < x < 2, so to make this look like something involving x + 2 we add 2
to everything, giving 2 < x + 2 < 4, which implies that |x + 2| < 4. Hence $|x^2 + x - 2| < 4$ when
|x − 1| < 1.

Example 2.12

Find an M > 0 such that
\[ \left| \frac{x^3 - x - 3}{x^4 + 1} \right| \le M \]

whenever |x| < 2.

Solution. Looking at the numerator, we can use the triangle inequality to write
\[ |x^3 - x - 3| \le |x|^3 + |x| + |3| < 2^3 + 2 + 3 = 13. \]
For the denominator, we have to be more careful. Recall that when we take reciprocals of positive numbers, an
inequality changes direction, so we want to bound $|x^4 + 1|$ from below. Indeed,
\[ |x^4 + 1| = x^4 + 1 \ge 1, \]
since $x^4 \ge 0$. Combining everything together gives
\[ \left| \frac{x^3 - x - 3}{x^4 + 1} \right| \le \frac{13}{1} = 13. \]

Of course, any M larger than 13 will also work, like M = 20. Note that no M smaller than 3 can work,
since the expression equals 3 at x = 0.
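
A numerical spot check is a cheap way to test a proposed bound M before trusting it. The Python sketch below (the helper name `g` and the sampling grid are our own choices) evaluates the expression at many points with |x| < 2 and reports the largest value seen. Since the expression equals 3 at x = 0, any valid M must be at least 3.

```python
def g(x: float) -> float:
    """The expression |(x^3 - x - 3) / (x^4 + 1)| from Example 2.12."""
    return abs((x**3 - x - 3) / (x**4 + 1))


# Sample points strictly between -2 and 2; k = 50_000 gives x = 0.
xs = [-2 + 4 * k / 100_000 for k in range(1, 100_000)]
largest = max(g(x) for x in xs)

print(g(0))        # 3.0
print(largest)     # a little above 3, comfortably below 13
```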

2.3 Sets and Set Building

A set is any collection of distinct objects.² Some examples of sets might include
the alphabet = {a, b, c, . . . , x, y, z} ,
Universities in Toronto = {UofT, Ryerson, York} ,
The Kardashian Sisters = {Kim, Khloe, Kourtney} .
We use the symbol ‘∈’ (read as ‘in’) to talk about when an element is in a set; for example,
1 ∈ {1, 2, 3} but ☺ ∉ {dog, cat}.
Each of the previous examples was a finite set, as it consisted of only a finite number of
elements. A set can also have infinitely many elements. In such instances, it is inconvenient to
write out every element of the set so we use set builder notation. Herein, if P is a proposition on
the set S, such that for each x ∈ S, P (x) is either true or false, then one can define the set
{x ∈ S : P (x)}
which consists of all the elements in S which make P true. For example, if M is the set of months
in the year, then
{m ∈ M : m has 31 days} = {January, March, May, July, August, October, December} .
This was an example where the resulting set was still finite, but it still demonstrates the compactness
of set builder notation.
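
Set builder notation has a close analogue in programming: a set comprehension. The Python sketch below mirrors the months example, with the dictionary `DAYS` (day counts in a non-leap year) introduced by us to play the role of the proposition P.

```python
# Number of days in each month of a non-leap year.
DAYS = {
    "January": 31, "February": 28, "March": 31, "April": 30,
    "May": 31, "June": 30, "July": 31, "August": 31,
    "September": 30, "October": 31, "November": 30, "December": 31,
}

M = set(DAYS)                                  # the set of months

# {m in M : m has 31 days}
thirty_one = {m for m in M if DAYS[m] == 31}
print(sorted(thirty_one))
```

The comprehension reads almost word for word like {m ∈ M : m has 31 days}: an ambient set, followed by a condition each element must satisfy.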
The following are some important infinite sets that we will see throughout the course:
² This is not true, since it is possible to define objects called classes, but we will not worry about this too much in
this context.

• The naturals³ N = {1, 2, 3, . . .},

• The integers Z = {. . . , −2, −1, 0, 1, 2, . . .},

• The rationals Q = {p/q : p, q ∈ Z, q ≠ 0, gcd(p, q) = 1},

• The reals R (the set of all infinite decimal expansions).

Special subsets of the real numbers are the intervals. The mathematical definition of the interval
is somewhat complicated, but you’re likely familiar with them. Notationally, we write

(a, b) = {x ∈ R : a < x < b}    (−∞, b) = {x ∈ R : x < b}
(a, b] = {x ∈ R : a < x ≤ b}    (−∞, b] = {x ∈ R : x ≤ b}
[a, b) = {x ∈ R : a ≤ x < b}    (a, ∞) = {x ∈ R : x > a}
[a, b] = {x ∈ R : a ≤ x ≤ b}    [a, ∞) = {x ∈ R : x ≥ a}

Definition 2.13
If a, b ∈ Z, we say that a|b (read a divides b), if there exists some integer k such that b = ak.

Divisibility is something we’ll explore in great detail later in the course, and forms the foundation
of a field of mathematics called number theory. Number theory is really interested in the primes,
with which you should be familiar. Just in case you need a refresher, let’s recall the definition
below:
Definition 2.14
Let a be an integer. We say that a is a prime number if a > 1 and the only positive integers which divide a are
a and 1. We say that a is even if 2|a, and odd otherwise.
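
Definitions 2.13 and 2.14 can be turned into code almost verbatim. The Python sketch below is a naive illustration, not an efficient primality test; the function names are our own, and we assume the usual convention that a prime is an integer greater than 1 whose only positive divisors are 1 and itself.

```python
def divides(a: int, b: int) -> bool:
    """True if a | b, i.e. b = a*k for some integer k."""
    if a == 0:
        return b == 0              # 0*k is always 0, so 0 divides only 0
    return b % a == 0


def is_prime(a: int) -> bool:
    """True if a > 1 and no d with 1 < d < a divides a."""
    return a > 1 and all(not divides(d, a) for d in range(2, a))


def is_even(a: int) -> bool:
    return divides(2, a)


print([p for p in range(20) if is_prime(p)])
print(divides(3, 12), divides(5, 12), is_even(-6))
```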

2.3.1 Relations on Sets

We can also talk about subsets, which are collections of items in a set and indicated with a ‘⊆’
sign. For example, if P is the set of prime numbers, then P ⊆ Z, since every element on the left
(a prime number) is also an element of the right (an integer). Similarly, one has N ⊆ Z ⊆ Q ⊆ R.
Note that if A is a set, A ⊆ A.
There is a particular distinguished set, known as the empty set and denoted by ∅, which contains
no elements. Recalling the definition of a vacuous truth, it is not too hard to convince oneself that
the empty set is a subset of every set!

Exercise: Determine the subset relations for the following sets:

1. $S = \{x \in R : x = 2n, n \in Z\}$,
2. $T = \{x \in R : x = a - \frac{1}{2}, \forall a \in N\}$,
3. $U = \{x \in Q : x = \frac{p}{2^n}, \gcd(p, k) = 1\}$,
4. $V = \{x \in Z : x = 3^n, n \in N\}$.


³ Some mathematicians believe that 0 is a natural number. I am personally undecided, and always just choose
whichever version is more convenient.

Two sets are equal when they contain precisely the same elements. In practice, showing that
two sets are equal is usually done by mutual subset inclusion: if A, B are sets then
A=B ⇔ A ⊆ B and B ⊆ A

Example 2.15

Consider the sets


A = {n ∈ N : n = 4k + 1 for some k ∈ Z} ,
B = {n ∈ N : n = 4k − 3 for some k ∈ Z}
Show that A = B.

Solution. Let’s begin by showing that A ⊆ B. Let n ∈ A, so that there exists an integer k such that
n = 4k + 1. Notice that we can equivalently write n = 4(k + 1) − 3, and k + 1 is again an integer, showing
that n ∈ B. Since n was arbitrary, we conclude that every element in A is also in B, so A ⊆ B.
Conversely, if n ∈ B then n = 4k − 3 for some integer k. We can write this as n = 4k − 3 = 4(k − 1) + 1,
showing that n ∈ A. Since n was arbitrary, every element of B is also an element of A, so B ⊆ A.
Both inclusions imply that A = B, as required.
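
Mutual subset inclusion can be tested on a finite portion of the two sets. The Python sketch below assumes N = {1, 2, 3, . . .} and lets k range over all integers; the cut-off `LIMIT` is our own choice. The last line records why the range of k matters: if k were restricted to N, the element 1 = 4(1) − 3 would lie in B but could not be written as 4k + 1.

```python
LIMIT = 200   # only elements n <= LIMIT are compared

# N is taken to be {1, 2, 3, ...}; k ranges over the integers (an assumption).
ks = range(-LIMIT, LIMIT + 1)
A = {4 * k + 1 for k in ks if 1 <= 4 * k + 1 <= LIMIT}
B = {4 * k - 3 for k in ks if 1 <= 4 * k - 3 <= LIMIT}

print(A <= B and B <= A)        # mutual inclusion, so A == B up to LIMIT
print(sorted(A)[:5])

# With k restricted to {1, 2, 3, ...}, the number 1 is no longer of the form 4k + 1.
print(1 in {4 * k + 1 for k in range(1, LIMIT)})
```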

Example 2.16

Let $A = \{x \in R : x^2 - 1 < 0\}$ and B = (−1, 1). Show that A = B.

Solution. Let x ∈ A so that $x^2 - 1 < 0$. This means that $x^2 < 1$, which we can solve to get
x ∈ (−1, 1). Hence x ∈ B, and A ⊆ B.
Conversely, if x ∈ (−1, 1) then we know $x^2 < 1$, so $x^2 - 1 < 0$, showing that x ∈ A and B ⊆ A.
Both inclusions show that A = B.

2.3.2 Operations on Sets

Union and Intersection Let S be a set and choose two sets A, B ⊆ S. We define the union of
A and B to be
A ∪ B = {x ∈ S : x ∈ A or x ∈ B}
and the intersection of A and B to be
A ∩ B = {x ∈ S : x ∈ A and x ∈ B} .

Example 2.17

Determine the union and intersection of the following two sets:

A = {x ∈ R : x > 1} , B = {x ∈ R : −1 < x < 2} .

Figure 7: Left: The union A ∪ B of two sets is the collection of all elements which are in either set (though
remember that elements of sets are distinct, so we do not permit duplicates). Right: The intersection A ∩ B
of two sets consists of all elements which are common to both sets.

Solution. By definition, one has


A ∪ B = {x ∈ R : x ∈ A or x ∈ B} = {x ∈ R : x > 1 or −1 < x < 2}
= {x ∈ R : x > −1} ,
A ∩ B = {x ∈ R : x ∈ A and x ∈ B} = {x ∈ R : x > 1 and −1 < x < 2}
= {x ∈ R : 1 < x < 2} .
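
Infinite sets of real numbers cannot be stored element by element, but they can be modelled by their membership tests, and then "or" and "and" do the work of ∪ and ∩. The Python sketch below follows Example 2.17; the function names are our own.

```python
def in_A(x: float) -> bool:
    return x > 1


def in_B(x: float) -> bool:
    return -1 < x < 2


def in_union(x: float) -> bool:
    return in_A(x) or in_B(x)          # x in A or x in B


def in_intersection(x: float) -> bool:
    return in_A(x) and in_B(x)         # x in A and x in B


for x in (-1, 0, 1, 1.5, 2, 3):
    print(x, in_union(x), in_intersection(x))
```

On these sample points the two tests agree with x > −1 and 1 < x < 2 respectively, which is what the solution found.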

Let I ⊆ N be an indexing set: Given a collection of sets $\{A_i\}_{i \in I}$ in S, one can take the
intersection or union over the entire collection, and this is often written as
\[ \bigcup_{i \in I} A_i = \{x \in S : \text{there is an } i \in I,\ x \in A_i\}, \qquad \bigcap_{i \in I} A_i = \{x \in S : \text{for every } i \in I,\ x \in A_i\}. \]

Example 2.18

Consider the set {x ∈ R : sin(x) > 0}. Write this set as an infinite union of intervals.

Solution. We are well familiar with the fact that sin(x) > 0 on (0, π), (2π, 3π), (4π, 5π), etc. If we
let the interval $I_n = (2n\pi, (2n + 1)\pi)$ then the aforementioned intervals are $I_0$, $I_1$, and $I_2$. We can
convince ourselves that sin(x) > 0 on any of the $I_n$, and hence
\[ \{x \in R : \sin(x) > 0\} = \bigcup_{n \in Z} I_n = \bigcup_{n \in Z} (2n\pi, (2n + 1)\pi). \]

Example 2.19

Define $I_n = (0, \frac{1}{n}) \subseteq R$. Determine $I = \bigcap_{n \in N} I_n$.

Solution. By definition, I consists of the elements which are in $I_n$ for every n ∈ N. We claim that
I cannot consist of any positive real number. Indeed, if p > 0 then there exists n ∈ N such that
$\frac{1}{n} < p$, which means that $p \notin I_k$ for all k ≥ n, and hence p cannot be in I. Since I has no positive real
numbers, and certainly cannot contain any non-positive real numbers, we conclude that I = ∅.
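
The key step in Example 2.19 is that every p > 0 is eventually excluded: there is some n with 1/n < p. The Python sketch below makes that witness explicit for rational p, using exact fractions to avoid rounding; the function name `witness` and the formula n = ⌊1/p⌋ + 1 are our own choices, one of many that would work.

```python
import math
from fractions import Fraction


def witness(p: Fraction) -> int:
    """Return a natural number n with 1/n < p, for a positive rational p."""
    if p <= 0:
        raise ValueError("p must be positive")
    return math.floor(1 / p) + 1     # n > 1/p, hence 1/n < p


for p in (Fraction(2), Fraction(1), Fraction(3, 10), Fraction(1, 1000)):
    n = witness(p)
    print(p, n, Fraction(1, n) < p)
```

However small p is chosen, the function produces an n for which p fails to lie below 1/n, so p cannot belong to every interval in the intersection.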

Exercise: Let $I_n = (-n, n) \subseteq R$ for n ∈ N. Determine both $\bigcup_n I_n$ and $\bigcap_n I_n$.

Figure 8: The complement $A^c$ of a set A with respect to S is the set of all elements which are in S
but not in A.

Complement If A ⊆ S then the complement of A with respect to S is the set of all elements of S which are
not in A; that is,
\[ A^c = \{x \in S : x \notin A\}. \]

Example 2.20

Determine the complement of $I = \bigcup_{n \in Z} (2n\pi, (2n + 1)\pi)$ from Example 2.18, with respect to
R.

Solution. Since I contains all the open intervals of the form $(2n\pi, (2n + 1)\pi)$ we expect its complement
to contain everything else. Namely,
\[ I^c = \bigcup_{n \in Z} [(2n - 1)\pi, 2n\pi]. \]

Example 2.21

For any sets A and B, let A \ B = {x ∈ A : x ∉ B}. Show that $A \setminus B = A \cap B^c$.

Solution. We begin by showing that $A \setminus B \subseteq A \cap B^c$. Let x ∈ A \ B, so that we know x ∈ A but
x ∉ B. Since x ∉ B we know that $x \in B^c$, and since x ∈ A and $x \in B^c$ we know $x \in A \cap B^c$. This
shows that $A \setminus B \subseteq A \cap B^c$.
The reverse direction is almost identical. Let $x \in A \cap B^c$ so that x ∈ A and $x \in B^c$. The
statement $x \in B^c$ is equivalent to saying that x ∉ B, so x ∈ A and x ∉ B implies that x ∈ A \ B.
Both inclusions give the equality $A \setminus B = A \cap B^c$, as required.

Exercise:

1. Show that $(A \cup B)^c = A^c \cap B^c$,

2. Show that $(A \cap B)^c = A^c \cup B^c$,

3. Verify that $I^c = \bigcap_{n \in Z} (2n\pi, (2n + 1)\pi)^c$ is an equivalent solution for Example 2.20.
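
Before attempting the proofs, it can help to see the identities in the exercise hold on a small concrete example. The Python sketch below uses a finite ambient set S and two subsets chosen by us; checking one example is evidence, not a proof.

```python
S = set(range(10))          # ambient set
A = {0, 1, 2, 3, 4}
B = {3, 4, 5, 6}


def complement(X: set) -> set:
    """Complement of X with respect to the ambient set S."""
    return S - X


print(complement(A | B) == complement(A) & complement(B))   # (A ∪ B)^c = A^c ∩ B^c
print(complement(A & B) == complement(A) | complement(B))   # (A ∩ B)^c = A^c ∪ B^c
print(A - B == A & complement(B))                           # A \ B = A ∩ B^c (Example 2.21)
```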

Example 2.22

Let A, B, C be sets. Show that (B \ A) ∪ (C \ A) = (B ∪ C) \ A.

Solution. We must show two inclusions, so let’s start with (⊆). Let x ∈ (B \ A) ∪ (C \ A), so that
x ∈ (B \ A) or x ∈ (C \ A). For our first case, suppose x ∈ B \ A, in which case x ∈ B and x ∉ A.
But as x ∈ B, we know x ∈ B ∪ C, so x ∈ B ∪ C and x ∉ A implies x ∈ (B ∪ C) \ A. Precisely the
same reasoning holds if we take x ∈ C \ A, so (B \ A) ∪ (C \ A) ⊆ (B ∪ C) \ A.
Conversely, suppose x ∈ (B ∪ C) \ A, so that x ∈ B ∪ C and x ∉ A. Since x ∈ B ∪ C, we
know either x ∈ B or x ∈ C. Suppose for now that x ∈ B. Since x ∈ B and x ∉ A, we know
x ∈ (B \ A), so x ∈ (B \ A) ∪ (C \ A). Exactly the same reasoning holds if we assume x ∈ C, so
(B ∪ C) \ A ⊆ (B \ A) ∪ (C \ A).
Both inclusions give equality, as required.

Cartesian Product The Cartesian product of two sets A and B is the collection of ordered
pairs, one from A and one from B; namely,

A × B = {(a, b) : a ∈ A, b ∈ B} .

A geometric way to think about this (which does not generalize well) is to visualize the Cartesian product as sticking a
copy of B onto each element of A, or vice-versa. For our purposes, the main use of the product
will be to define higher dimensional spaces. For example, we know that we can represent the plane
$R^2$ as the set of ordered pairs $R^2 = \{(x, y) : x, y \in R\}$, while three dimensional space is the set of ordered
triples $R^3 = \{(x, y, z) : x, y, z \in R\}$. In this sense, we see that $R^2 = R \times R$ and $R^3 = R \times R \times R$, which
motivates the more general definition of $R^n$ as the set of ordered n-tuples

\[ R^n = \underbrace{R \times \cdots \times R}_{n\text{-times}}. \]
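
For finite sets the Cartesian product can be listed explicitly. The Python sketch below uses `itertools.product` from the standard library on two small sets of our own choosing, and then forms a threefold product as a finite stand-in for R × R × R.

```python
from itertools import product

A = {1, 2, 3}
B = {"x", "y"}

AxB = set(product(A, B))                   # all ordered pairs (a, b)
print(len(AxB) == len(A) * len(B))         # one pair per choice of a and b
print((1, "x") in AxB, ("x", 1) in AxB)    # order matters

cube = set(product({0, 1}, repeat=3))      # {0,1} x {0,1} x {0,1}, as flat triples
print(len(cube))
```

Note that `product` with `repeat=3` returns flat triples (a, b, c) rather than nested pairs ((a, b), c), which is the same identification the notes make when writing R × R × R.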

Exercise: We have swept some things under the rug in defining $R^n$, largely because the true
nature is technical and boring. There is no immediate reason to suspect that R × R × R
should be well defined: we first need to check that the Cartesian product is associative; that
is, (R × R) × R = R × (R × R). By definition, the left-hand-side is

(R × R) × R = {((a, b), c) : (a, b) ∈ R × R, c ∈ R}

while the right-hand-side is

R × (R × R) = {(a, (b, c)) : a ∈ R, (b, c) ∈ R × R} .

Syntactically, neither of these looks the same as R^3 = {(a, b, c) : a, b, c ∈ R}, but nonetheless
they all define the same data.
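The distinction between nested pairs and flat triples can be made concrete on a finite set. The sketch below uses a two-element stand-in for R (an assumption, since R itself cannot be enumerated) and shows that the three sets differ as written but carry the same data once the brackets are flattened. The helper names are illustrative.

```python
from itertools import product

R = {0, 1}  # a tiny finite stand-in for the real numbers

left = {((a, b), c) for a, b, c in product(R, repeat=3)}   # (R × R) × R
right = {(a, (b, c)) for a, b, c in product(R, repeat=3)}  # R × (R × R)
flat = set(product(R, repeat=3))                            # R^3

def flatten_left(t):
    """Send ((a, b), c) to (a, b, c)."""
    (a, b), c = t
    return (a, b, c)

def flatten_right(t):
    """Send (a, (b, c)) to (a, b, c)."""
    a, (b, c) = t
    return (a, b, c)

print(left == right)                                # the nested forms differ
print({flatten_left(t) for t in left} == flat)      # but flatten to R^3
print({flatten_right(t) for t in right} == flat)
```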

Exercise: Let S^1 = {(x, y) : x^2 + y^2 = 1} ⊆ R^2 be the unit circle. What familiar shape is
S^1 × S^1?

2.3.3 Functions Between Sets

Given two sets A, B, a function f : A → B is a map which assigns to every point in A a unique
point of B. If a ∈ A, we usually denote the corresponding element of B by f(a). When specifying
the function, one may write a ↦ f(a). The set A is termed the domain, while B is termed the
codomain.
Some examples of functions are as follows:

1. Define f : N × N → N, (m, n) ↦ 3^m 2^n. For example, f(2, 2) = 3^2 · 2^2 = 36.

2. If Poly_Q is the set of all polynomials with rational coefficients, π : Poly_Q → Q given by
π(p) = p(0) is a function. For example, if p(x) = (1/2)x^2 − x + 17/18 then π(p) = 17/18.

3. Consider the function

f : R → Q,   f(x) = 1/q if x ∈ Q with x = p/q and gcd(p, q) = 1,   f(x) = 0 otherwise.

If you have taken calculus before, this is an example of a function which is continuous on
R \ Q and discontinuous on Q.

The graph of a function f : A → B is the set

Γ(f ) = {(x, f (x)) ∈ A × B} .

When f : R → R, this coincides with the notion of a graph with which you are familiar.
It is important to note that not every element of B needs to be hit by f ; that is, B is not
necessarily the range of f . Rather, B represents the ambient space to which f maps. Also, if either


of the domain or codomain changes, the function itself changes. This is because the data of the
domain and codomain are intrinsic to the definition of a function. For example, f : R → R given
by f(x) = x^2 is a different function than g : R → [0, ∞), g(x) = x^2.
Definition 2.23
Let f : A → B be a function.

1. If U ⊆ A, then we define the image of U to be

f (U ) = {y ∈ B : ∃x ∈ U, f (x) = y} = {f (x) : x ∈ U } .

2. If V ⊆ B, we define the pre-image of V to be

f^{-1}(V) = {x ∈ A : f(x) ∈ V} .

[Figure: a function f : A → B carrying a subset U ⊆ A to its image f(U) ⊆ B.]

Note that despite being written as f^{-1}(V), the preimage of a set does not say anything about
the existence of an inverse function.
Example 2.24

Let f : R → R be specified by f(x) = x^2. Determine f([0, 1]) and f^{-1}(f([0, 1])).

Solution. By definition, one has

f([0, 1]) = {f(x) : x ∈ [0, 1]} = [0, 1].

On the other hand, since f([0, 1]) = [0, 1] we know that f^{-1}(f([0, 1])) = f^{-1}([0, 1]), for which

f^{-1}([0, 1]) = {x ∈ R : f(x) ∈ [0, 1]} = [−1, 1].
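The same phenomenon, that f^{-1}(f(U)) can be strictly larger than U, is easy to see on a finite domain. This is a minimal sketch assuming the integers −3, ..., 3 as a stand-in for R; the function names `image` and `preimage` are illustrative, not notation from the notes.

```python
def image(f, U):
    """f(U) = {f(x) : x in U} for a finite collection U."""
    return {f(x) for x in U}

def preimage(f, domain, V):
    """f^{-1}(V) = {x in domain : f(x) in V} for a finite domain."""
    return {x for x in domain if f(x) in V}

domain = range(-3, 4)          # finite stand-in for R: -3, ..., 3

def square(x):
    return x * x

U = {0, 1, 2}
fU = image(square, U)                   # {0, 1, 4}
back = preimage(square, domain, fU)     # {-2, -1, 0, 1, 2}
print(fU, back, U <= back)
```

Note that `preimage` never inverts `square`; it only filters the domain, which mirrors the remark that f^{-1}(V) makes sense without an inverse function.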

Example 2.25

Let f(x) = x^2/(1 + x^2). Show that f(R) = [0, 1).

Solution. First we notice that for any x ∈ R, f(x) ≥ 0. Indeed, since x^2 ≥ 0 we have 1 + x^2 > 0, giving

x^2/(1 + x^2) ≥ 0.


When x = 0 we do in fact have f(x) = 0, so this inequality is saturated. We also have f(x) < 1,
since

1 + x^2 > x^2, and so x^2/(1 + x^2) < 1.

These two facts together imply that f(R) ⊆ [0, 1). To show the other direction, we must show that
every element of [0, 1) is equal to f(x) for some x ∈ R. Let y ∈ [0, 1) and notice that

f(x) = y ⇔ x^2/(1 + x^2) = y
       ⇔ x^2 = y + x^2 y
       ⇔ x^2 (1 − y) = y
       ⇐ x = √(y/(1 − y)).

Notice that it was necessary for 0 ≤ y < 1 to ensure that the term y/(1 − y) under the square root
is well-defined and non-negative. Since this value of x maps to y, we have [0, 1) ⊆ f(R), and
equality then follows from the double inclusion.
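The formula x = √(y/(1 − y)) from Example 2.25 can be checked numerically. The sketch below is only a floating-point spot check on a few sample values of y (chosen arbitrarily), with a small tolerance to absorb rounding error.

```python
import math

def f(x):
    return x * x / (1 + x * x)

def solve(y):
    """Return an x >= 0 with f(x) = y, assuming 0 <= y < 1."""
    return math.sqrt(y / (1 - y))

samples = [0.0, 0.1, 0.5, 0.9, 0.999]
errors = [abs(f(solve(y)) - y) for y in samples]
print(all(error < 1e-9 for error in errors))
```

Calling `solve(1.0)` raises `ZeroDivisionError`, which matches the fact that 1 is not in the image of f.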

Example 2.26

If f : R → R is given by f(x) = 2|x + 1|/(3|x| + 2), show that f(R) = [0, 1].

Solution. We need to show a double subset inclusion, for which we start with (⊆). Suppose y ∈
f(R), so that y = f(x) for some x ∈ R; namely,

y = 2|x + 1|/(3|x| + 2).

The numerator is non-negative and the denominator is positive, so y ≥ 0. Applying the triangle
inequality gives us

y = 2|x + 1|/(3|x| + 2) ≤ (2|x| + 2)/(3|x| + 2) ≤ (3|x| + 2)/(3|x| + 2) = 1,

so that y ≤ 1. Both inequalities show that y ∈ [0, 1], so f(R) ⊆ [0, 1].
For the other direction, there are a few arguments that could be made. First of all, note that
f(−1) = 0 and f(0) = 1, showing that the endpoints of [0, 1] are actually achieved.

1. Since f is continuous on [−1, 0], the Intermediate Value Theorem implies that every value
in [0, 1] is achieved, and so [0, 1] ⊆ f(R).

2. On [−1, 0] we know both x ≤ 0 and x + 1 ≥ 0, so |x| = −x and |x + 1| = x + 1. Let y ∈ [0, 1],
so that

y = (2x + 2)/(−3x + 2) ⇔ x = (2y − 2)/(3y + 2).

Since 0 ≤ y ≤ 1, this x lies in [−1, 0], so f(x) = y. Hence [0, 1] ⊆ f(R).


In either case, both inclusions give the desired equality. 

Example 2.27

Let f : R^3 → R^2 be given by f(x, y, z) = (x, y). If

S^2 = {(x, y, z) ∈ R^3 : x^2 + y^2 + z^2 = 1},

determine f(S^2).

Solution. Let (a, b, c) ∈ S^2 so that a^2 + b^2 + c^2 = 1. The image of this point under f is
f(a, b, c) = (a, b). Since c^2 ≥ 0, it must be the case that a^2 + b^2 ≤ 1, and so
f(S^2) ⊆ D^2 = {(x, y) ∈ R^2 : x^2 + y^2 ≤ 1}. We claim that this is actually an equality; that is,
f(S^2) = D^2. In general, to show that two sets A and B are equal, we need to show A ⊆ B and
B ⊆ A. As we have already shown that f(S^2) ⊆ D^2, we must now show that D^2 ⊆ f(S^2).

Let (a, b) ∈ D^2 so that a^2 + b^2 ≤ 1. Let c = √(1 − a^2 − b^2), which is well-defined by hypothesis.
Then a^2 + b^2 + c^2 = 1 so that (a, b, c) ∈ S^2, and f(a, b, c) = (a, b). Thus f(S^2) = D^2.

Exercise: Let f : R^3 → R^2 be the function given in Example 2.27. Determine f^{-1}(D^2).

Example 2.28

Let f : X → Y be a function, with A, B ⊆ X. Show that f (A ∩ B) ⊆ f (A) ∩ f (B).

Solution. Let y ∈ f(A ∩ B) so that y = f(x) for some x ∈ A ∩ B. Since x ∈ A ∩ B we know
that x ∈ A and x ∈ B. This in turn implies that f(x) ∈ f(A) and f(x) ∈ f(B), so that
y = f(x) ∈ f(A) ∩ f(B).

Exercise: Does the converse to Example 2.28 hold? More precisely, is it the case that
f (A) ∩ f (B) ⊆ f (A ∩ B), and therefore the two sets are actually equal?

2.3.4 Properties of Functions

Definition 2.29
A function f : [a, b] → R is said to be

1. increasing on [a, b] if whenever x_1, x_2 ∈ [a, b] satisfy x_1 < x_2, then f(x_1) ≤ f(x_2). We
say that f is strictly increasing on [a, b] if x_1 < x_2 implies that f(x_1) < f(x_2).

2. decreasing on [a, b] if whenever x_1, x_2 ∈ [a, b] satisfy x_1 < x_2, then f(x_1) ≥ f(x_2). We
say that f is strictly decreasing on [a, b] if x_1 < x_2 implies that f(x_1) > f(x_2).


Definition 2.30
A function f : R → R is said to be bounded if there exists an M > 0 such that |f (x)| ≤ M
for all x ∈ R.

Furthermore, note that if f, g : A → B are functions, then anything that can be done to points
in B can be done to f and g, by defining the operations in a pointwise fashion. For example, if
f, g : R → R, then since we can add/multiply in the codomain R, we can similarly perform these
actions on f, g as
(f + g)(x) = f (x) + g(x), (f g)(x) = f (x)g(x).

Example 2.31

Show that if f, g : R → R are bounded, then f + g is bounded as well.

Solution. Since both functions are bounded, there exist M_1 > 0 and M_2 > 0 such that |f(x)| ≤
M_1 and |g(x)| ≤ M_2 for all x ∈ R. Define M = M_1 + M_2, which we claim will work for the sum
f + g. Indeed, by the triangle inequality, for any x ∈ R we have

|f(x) + g(x)| ≤ |f(x)| + |g(x)| ≤ M_1 + M_2 = M

which is what we wanted to show.

2.4 Ordered Fields

Chances are you have seen the real numbers R before. In fact, you might even think that you
have a good understanding of the real numbers. The reality is, the real numbers are actually an
incredibly subtle and difficult object with which to play. In this section, I will show you examples
of other objects, called fields, which have similar properties to the real numbers.

2.4.1 The Field Axioms

Fields are actually very complicated mathematical objects that have a lot of underlying structure.
This means that in order to tell you what a field does, I must enumerate a great many axioms.
Definition 2.32
Given a set S, a (closed) binary operator is a function b : S × S → S.

The definition of a binary operator is somewhat self-explanatory. Binary describes the number
2, so a binary operator is something which operates on two elements of S and produces another
element of S. The additional adjective closed is used to describe the fact that the output of elements
in S remains in S.
For example, multiplication and addition of integers are both closed binary operators. We abuse


notation somewhat, and write

+ : Z × Z → Z, (a, b) ↦ a + b,        × : Z × Z → Z, (a, b) ↦ a × b.

However, notice that division is not a closed binary operator, since dividing two integers need not
give back an integer. For example, 1, 2 ∈ Z, but ÷(1, 2) = 1/2 is not an integer.
Definition 2.33
A field is any set F equipped with two closed binary operators ⊕, ⊗, called addition and
multiplication respectively, such that for any x, y, z ∈ F we have

1. [Associativity] x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z and x ⊗ (y ⊗ z) = (x ⊗ y) ⊗ z,

2. [Commutativity] x ⊕ y = y ⊕ x and x ⊗ y = y ⊗ x,

3. [Identity] There exist distinct elements 0_F and 1_F of F such that for any x ∈ F, x ⊕ 0_F = x
and x ⊗ 1_F = x,

4. [Additive Inverses] For any x ∈ F there exists n ∈ F such that x ⊕ n = 0_F. We usually
write n = −x.

5. [Multiplicative Inverses] For any non-zero element x ∈ F, there exists r ∈ F such that
x ⊗ r = 1_F. We usually write r = x^{-1}.

6. [Distributivity] x ⊗ (y ⊕ z) = (x ⊗ y) ⊕ (x ⊗ z).

The distributivity property is essential, since it says that addition and multiplication play
together nicely; that is, they are compatible. Out of laziness, we will write · for ⊗, + for ⊕, and
simply 0 and 1 for the identities from now on.

1. The real numbers R and the rational numbers Q are both fields (check this as best you can).
However, Z and N are not fields. In the case of N, elements such as 1 do not have additive
inverses. In the case of Z, elements such as 2 do not have multiplicative inverses.

2. Define a binary operator on N called the modulo operation, where a mod b is the remainder
when a is divided by b. For example,

5 mod 2 = 1, 8 mod 3 = 2, 19 mod 5 = 4, 72 mod 10 = 2

Consider the set F2 = {0, 1} where addition and multiplication are done modulo 2; that is,

a + b = (a + b) mod 2, a · b = (a · b) mod 2.

This is a field, with addition and multiplication tables given by

    + | 0 1        · | 0 1
    --+----        --+----
    0 | 0 1        0 | 0 0
    1 | 1 0        1 | 0 1


Similarly, F3 = {0, 1, 2} with addition and multiplication done modulo 3 is a field, with
addition and multiplication tables

    + | 0 1 2        · | 0 1 2
    --+------        --+------
    0 | 0 1 2        0 | 0 0 0
    1 | 1 2 0        1 | 0 1 2
    2 | 2 0 1        2 | 0 2 1

However, F4 = {0, 1, 2, 3} with addition and multiplication given modulo 4 is not a field, as
we will show in Example 2.37.

3. [Advanced Example] Let P1(F2) be the polynomials of degree at most one with coefficients in F2,
where x satisfies the identity x^2 + x + 1 = 0:

P1(F2) = {ax + b : a, b ∈ F2, x^2 + x + 1 = 0}.

This is a field with precisely four elements, {0, 1, x, x + 1}. Addition and multiplication tables
are given by

     +   | 0    1    x    x+1          ·   | 0    1    x    x+1
    -----+--------------------        -----+--------------------
     0   | 0    1    x    x+1          0   | 0    0    0    0
     1   | 1    0    x+1  x            1   | 0    1    x    x+1
     x   | x    x+1  0    1            x   | 0    x    x+1  1
     x+1 | x+1  x    1    0            x+1 | 0    x+1  1    x
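The tables for arithmetic modulo n in item 2 can be generated mechanically, and the multiplicative inverse axiom can be tested by brute force. This is an illustrative sketch; the function names `tables` and `every_nonzero_has_inverse` are assumptions, and the check covers only axiom 5, not the full list of field axioms.

```python
def tables(n):
    """Addition and multiplication tables for {0, ..., n-1} modulo n."""
    add = [[(a + b) % n for b in range(n)] for a in range(n)]
    mul = [[(a * b) % n for b in range(n)] for a in range(n)]
    return add, mul

def every_nonzero_has_inverse(n):
    """True when each non-zero a has some r with a * r = 1 modulo n."""
    return all(
        any((a * r) % n == 1 for r in range(1, n))
        for a in range(1, n)
    )

add3, mul3 = tables(3)
for row in mul3:
    print(row)

for n in (2, 3, 4):
    print(n, every_nonzero_has_inverse(n))
```

The check succeeds for n = 2 and n = 3 and fails for n = 4, where 2 has no multiplicative inverse.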

Example 2.34

If F is a field, show that for any x ∈ F , x · 0 = 0.

Solution. Notice that

x · 0 = x · (0 + 0)                         by (3)
      = (x · 0) + (x · 0)                   by (6)

Let −(x · 0) be the additive inverse of x · 0, guaranteed to exist by (4), so

0 = (x · 0) + [−(x · 0)]
  = [(x · 0) + (x · 0)] + [−(x · 0)]        from above
  = (x · 0) + [(x · 0) + [−(x · 0)]]        by (1)
  = (x · 0) + 0                             by (4)
  = x · 0                                   by (3)

as required.

Example 2.35

If F is a field then the identity elements are unique.


Solution. Let z be any additive identity element, so that z + x = x for all x ∈ F. In particular, we
have that z + x = x = x + 0. Let −x be the additive inverse of x, so that
0 = x + (−x) = (z + x) + (−x) = z + (x + (−x)) = z + 0 = z
showing that necessarily, z = 0. A similar argument holds for the multiplicative identity.

Example 2.36

If F is a field then inverse elements are unique.

Solution. Let x ∈ F, and take n, m to be additive inverses of x, so that x + n = 0 = x + m. Notice that

n = 0 + n = (x + m) + n = (x + n) + m = 0 + m = m

showing that n = m and demonstrating uniqueness of inverses. A similar argument holds for
multiplicative inverses.

Example 2.37

If F is a field and x, y ∈ F satisfy xy = 0, then either x = 0 or y = 0.

Solution. Suppose that xy = 0. If x = 0 then the result certainly holds, so assume
that x is non-zero. Since x is non-zero, it has a multiplicative inverse
x^{-1}, which implies
y = (x^{-1} · x)y = x^{-1} · (x · y) = x^{-1} · 0 = 0
where the last equality is Example 2.34, showing that y = 0.

Example 2.37 shows that F4 cannot be a field. Indeed, in F4 we have that 2 · 2 = 0 but 2 is not
zero. If F4 were a field, then Example 2.37 says that one of the factors must be zero, and that
is not the case.
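Pairs of non-zero elements whose product is zero, such as 2 · 2 in arithmetic modulo 4, can be found by exhaustive search. The sketch below lists them for a few moduli; the function name `zero_divisors` is an assumption for illustration.

```python
def zero_divisors(n):
    """Pairs (a, b) of non-zero residues with a * b = 0 modulo n."""
    return [
        (a, b)
        for a in range(1, n)
        for b in range(1, n)
        if (a * b) % n == 0
    ]

print(zero_divisors(4))   # [(2, 2)]
print(zero_divisors(3))   # []
print(zero_divisors(6))
```

By Example 2.37, any modulus for which this list is non-empty cannot give a field.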
Example 2.38

Let F3 = {0, 1, x} be a field with three elements. Determine 1 + x^{-1}.

Solution. We claim that 1 + x^{-1} = 0 (look at the chart in the examples above). To show this, let’s
assume that it’s not the case and show that something weird happens.
If 1 + x^{-1} ≠ 0, then it must have a multiplicative inverse, and this inverse is either 1 or x. If its
inverse is 1, then
1 · (1 + x^{-1}) = 1 ⇒ 1 + x^{-1} = 1 ⇒ x^{-1} = 0.
But this would imply that 1 = x · x^{-1} = x · 0 = 0, which cannot happen.
If the inverse is x, then
x · (1 + x^{-1}) = 1 ⇒ x + 1 = 1 ⇒ x = 0
which is also not possible, since x is non-zero. We conclude that 1 + x^{-1} cannot be inverted, so
it must be 0.


2.4.2 Ordered Fields

We have already looked at an ordering on the real numbers, where by definition we say that a < b
if b − a > 0. We generalize this notion as follows:
Definition 2.39
If F is a field, a subset P ⊆ F is a positive set if

1. [Closure] For any x, y ∈ P , x + y ∈ P and xy ∈ P ,

2. [Trichotomy] For any non-zero x ∈ F, exactly one of x ∈ P or −x ∈ P holds.

Notice that the set of positive real numbers is a positive set in R. If F admits a positive set
P , we can define an ordering on F by saying that x < y if y − x ∈ P . By y − x we of course
mean y + (−x), but we shall be sloppy with that notation henceforth. Any field endowed with an
ordering is said to be an ordered field.
Proposition 2.40

Let F be an ordered field.

1. If x < y and y < z then x < z.

2. If c > 0 and x < y then cx < cy.

3. If x < y and u < v then x + u < y + v.

Proof. The proofs effectively imitate what we do in R.

1. By definition, we have that y − x ∈ P and z − y ∈ P. Hence

z − x = z + (−y + y) − x = (z − y) + (y − x) ∈ P,

since both z − y and y − x lie in P and P is closed under addition. Since z − x ∈ P we
conclude x < z.

2. Note that c > 0 is equivalent to c − 0 = c ∈ P, so we know that c ∈ P. Furthermore, we
know that y − x ∈ P. Hence

cy − cx = c(y − x) ∈ P

using the closure of P under multiplication.

3. We know that y − x ∈ P and v − u ∈ P, so

(y + v) − (x + u) = (y − x) + (v − u) ∈ P

by the closure of P under addition.


Ordered fields must have infinitely many elements. Assume that 1 ∈ P . Since P is closed under
addition, by repeatedly using Proposition 2.40(3), we must have

0 < 1 < 1 + 1 < 1 + 1 + 1 < 1 + 1 + 1 + 1 < ···

and this must go on forever. If our field only has finitely many elements, then at some point this
process must begin to cycle back on old numbers, allowing us to show something along the lines of
x < x, which is not possible.

2.4.3 Complete Fields

Definition 2.41
If F is an ordered field, we say that a subset S ⊆ F is bounded from above if there exists an
element M ∈ F such that for all x ∈ S, x ≤ M. In this case, we say that M is an upper
bound of S.

A set which is bounded from above has many possible upper bounds. For example, the set
S = {1/n : n ∈ N} ⊆ Q is bounded from above, with an upper bound of 2. But 3, or 4, or in fact
any rational number larger than 2 is also an upper bound for S.
This pattern is typical. If M is an upper bound for S and M < N, then for every x ∈ S we
have
x ≤ M < N
showing that N is also an upper bound for S. An interesting question which naturally arises is
then “Is there a least upper bound?”
Definition 2.42: The Completeness Axiom
An ordered field F is complete if whenever a non-empty set S ⊆ F is bounded from above, there
exists a least upper bound of S.

For example, the set S = {x ∈ Q : x^2 < 2} is certainly bounded above, since x < 2 for all x ∈ S.
However, this set does not have a least upper bound in the rational numbers. Therefore, Q is not
complete. However, notice that R is a complete ordered field, and in fact is constructed in such a
manner as to guarantee that it is complete.
It is possible to show that R is in essence the only complete ordered field, in the sense that any
other field which is complete and ordered is essentially just R in disguise.

3 Mathematical Logic

3.1 Mathematical Predicates

A logical predicate is a statement about objects in a universe S which evaluates to either true or false. We will
denote predicates by a capital letter, such as P , in which case P (x) is read as “x satisfies property
P ” or some other equivalent sentence.


Simple examples of predicates include

P (x) “x is a dog,”
P (x) “x has a birthday today,”
P (x, y) “x and y have the same calculus lecture,”
P (x, y, z) “The sum of x and y is greater than z,”

where we have left the choice of universe to context.


As demonstrated above, there is no limit on the number of objects discussed in a predicate and,
given an explicit description of x (or y and z as appropriate), one can assign a value of true or false
to each predicate. In the first instance where P (x) reads “x is a dog,” it is hopefully clear that
P(Dalmatian) is true, while P(Beluga) is false.
Not everything is a mathematical statement. For example, ‘15^2’ is not a statement: there is no
way to assign a value of true or false to ‘15^2’.
Example 3.1

If x, y ∈ N we say that x is divisible by y if there is no remainder when we divide x by
y. Consider the predicate P(x, y) representing “x is divisible by y.” Evaluate the truth of
P(x, y) on the following pairs (x, y):

(5, 2), (35, 5), (0, 1), (1, 0).

Solution. The statement P(5, 2) is that “5 is divisible by 2.” This is false, for the division yields
5 = 2 · 2 + 1, leaving a remainder of 1. On the other hand, P(35, 5) is true, since 35 = 5 · 7 with no
remainder. P(0, 1) is also true as 0 = 0 · 1 and in fact, P(0, y) would be true regardless of which
non-zero number we had chosen for y. The only contentious example occurs when trying to evaluate
P(1, 0), since we cannot divide by zero. In this case, we adhere to the convention that no number is
divisible by zero, so that P(1, 0) is false. As in the previous case, P(x, 0) would be false regardless
of our choice of x.
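A predicate is, in programming terms, a function returning true or false. The sketch below encodes the divisibility predicate of Example 3.1, including the convention that no number is divisible by zero; the function name `divisible` is an assumption for illustration.

```python
def divisible(x, y):
    """P(x, y): "x is divisible by y".

    Follows the convention of the text that nothing is divisible by zero.
    """
    if y == 0:
        return False
    return x % y == 0

pairs = [(5, 2), (35, 5), (0, 1), (1, 0)]
for x, y in pairs:
    print((x, y), divisible(x, y))
```

The explicit test for `y == 0` is needed because `x % 0` raises `ZeroDivisionError` in Python rather than returning a truth value.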

Remark 3.2 You might be disturbed at the idea of writing false mathematical statements,
such as “5 is divisible by 2.” Morality aside, it is important to realize that we can write false
statements in English as well. For example, the statement “Pigs can fly” is obviously false,
but this does not prevent me from writing it down. While we will eventually endeavour to
only write true statements, for the moment it is important to consider false statements as
well.

3.2 Universal and Existential Quantifiers

Quantifiers allow us to discuss the number of objects which satisfy a predicate. If we wish to discuss
every element of a set, we use the universal quantifier ∀, read as “for all.” To state that an element
in a set exists, we use the existential quantifier ∃, read as “there exists.”


When combined with a predicate P , we can assign truth values to quantified statements. For
example, let S be a universe of discourse. The statement ∀x ∈ S, P (x) will be true precisely when
P (x) is true for every element in S. On the other hand, ∃x ∈ S, P (x) will be true as long as a
single element of S makes P(x) true.[4]
The addition of quantifiers allows us to make statements such as the following:

• Every cow has a favourite radio station.

• There is a black horse.

• In every sport, there exists someone who breaks the rules.

• There is one textbook that every class uses.

These last two examples have multiple quantifiers. Can you spot them?
Example 3.3

Determine whether each of the quantified statements is true or false.

1. ∀x ∈ N, x^2 ≥ 0,

2. ∃x ∈ R, x = √(−1),

3. ∀x ∈ Q, ∀y ∈ Q, x + y ∈ Q,

4. ∃x ∈ N, ∃y ∈ N, x/y ∈ N.

Solution.

1. This statement is true, since squaring any non-zero real number results in a positive number,
while 0^2 = 0.

2. This statement is false. If such an x existed, it would also satisfy x^2 = −1. By our comment
in part 1, the square of a real number is never negative, leading us to a contradiction.

3. This statement is true. Write x = a/b and y = c/d with a, c ∈ Z and b, d ∈ N, so that

x + y = a/b + c/d = (ad + bc)/(bd).

Since ad + bc ∈ Z and bd ∈ N, this is also a rational number.

4. This statement is true. For example, by setting x = 4 and y = 2 we have x/y = 4/2 = 2,
which is also a natural number.
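On a finite universe, the two quantifiers correspond to Python's built-in `all` and `any`. The sketch below is only an illustration on the integers −5, ..., 5 (an assumed stand-in; the statements of Example 3.3 range over infinite sets and still need proofs). The helper names `for_all` and `exists` are illustrative.

```python
S = range(-5, 6)   # a small finite universe of discourse

def for_all(universe, P):
    """Truth value of "for all x in universe, P(x)"."""
    return all(P(x) for x in universe)

def exists(universe, P):
    """Truth value of "there exists x in universe with P(x)"."""
    return any(P(x) for x in universe)

squares_nonnegative = for_all(S, lambda x: x * x >= 0)
has_root_of_minus_one = exists(S, lambda x: x * x == -1)
has_negative_of_three = exists(S, lambda x: x + 3 == 0)

print(squares_nonnegative, has_root_of_minus_one, has_negative_of_three)
```

A single witness (here x = −3) is enough to make an existential statement true, while a universal statement must survive every element of the universe.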

Notice how the above solutions demonstrated the truth of quantified statements. To show
that ∃x ∈ S, P(x), we find a single example of an x ∈ S which makes P(x) true. To show that
∀x ∈ S, P(x) is more subtle. Rather than try to demonstrate P(x) for every x, we choose an
arbitrary x ∈ S. If P(x) is true for an arbitrary x, then it must be true for every x.

[4] A simple mnemonic for remembering which symbol corresponds to which quantifier is that “for all”
looks like an upside down A which stands for ALL, and “there exists” looks like a backwards E which
stands for EXISTS.

Doubly quantified statements must be treated with caution. One may freely interchange two
adjacent quantifiers of the same type, but not of different type. For example, the statements

∀x ∈ Q, ∀y ∈ Q, x + y ∈ Q is logically equivalent to ∀y ∈ Q, ∀x ∈ Q, x + y ∈ Q,

and
∃x ∈ N, ∃y ∈ N, x/y ∈ N is logically equivalent to ∃y ∈ N, ∃x ∈ N, x/y ∈ N.

However, interchanging existential and universal quantifiers can lead to serious trouble.
Example 3.4

Consider the statements


∀x ∈ R, ∃y ∈ R, x + y = 0 (3.1)
and
∃x ∈ R, ∀y ∈ R, x + y = 0. (3.2)
Compare these expressions by translating them as follows:

1. Convert the mathematical notation into English.

2. Turn the sentence derived above into a simple sentence, which does not involve any
variables.

3. Evaluate whether each statement is true or false.

Solution. We start with equation (3.1) for which a direct translation of the notation into English
gives us

“For all x in the real numbers, there exists y in the real numbers, (such that) x + y = 0.”

This is fine but not very enlightening. By recognizing that x + y = 0 is equivalent to x = −y, we
could also re-interpret this sentence as saying “For every real number there is another real number
which is its negative.” Dropping the superfluous words we arrive at the intuitive statement

“Every real number has a negative.”

This statement is certainly true: given a real number a, we can construct its negative to be −a.
Looking at (3.2), we have
∃x ∈ R, ∀y ∈ R, x + y = 0. (3.3)

Using the same translation process as above, the corresponding simple sentence is given by

“There is an element which is the negative of every real number.”

This says there is a number to which we can add any other number and always get zero.
Certainly this is not true! If it were, then there would be a number n such that n + a = 0 and
n + b = 0 for any real numbers a and b. Equating these expressions, we would find that n + a = n + b
which in turn implies that a = b. This would force all real numbers to be equal, which is nonsense.
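The effect of swapping the quantifiers in Example 3.4 can be observed on a finite universe by nesting `all` and `any` in the two possible orders. This sketch assumes the integers −3, ..., 3 as the universe, chosen because it is closed under taking negatives.

```python
U = range(-3, 4)   # finite universe: -3, ..., 3

# "For all x there exists y with x + y = 0": y may depend on x.
forall_exists = all(any(x + y == 0 for y in U) for x in U)

# "There exists x such that for all y, x + y = 0": one x must work for every y.
exists_forall = any(all(x + y == 0 for y in U) for x in U)

print(forall_exists, exists_forall)
```

The nesting order of the two loops is exactly the order of the quantifiers: the inner search is rerun for each value chosen by the outer one.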

Example 3.4 teaches us that changing the order of the quantifiers significantly changes the logical
statement, and hence the truth of that statement. To borrow a term from the computer scientists,
universal quantifiers admit a ‘scope’ to the existential quantifiers they precede. For example, the
statement ∀x, ∃y, P (x, y) means that the choice of y is allowed to depend upon x. The statement
∃y, ∀x, P (x, y) does not confer this dependence: the choice of y must work for every x.
Example 3.5

Let S be the set of all students in a classroom, and B(a, b) be the statement “student a has
the same birthday as student b.” Write the mathematical statements

∀a ∈ S, ∀b ∈ S, B(a, b), ∀a ∈ S, ∃b ∈ S, B(a, b)

∃a ∈ S, ∀b ∈ S, B(a, b), ∃a ∈ S, ∃b ∈ S, B(a, b)


in plain language.

Solution. We may interpret each of the statements as follows:


∀a ∈ S, ∀b ∈ S, B(a, b) Every student (a) in the classroom has the same birthday
as every other student (b) in the classroom
∀a ∈ S, ∃b ∈ S, B(a, b) For each student (a) in the classroom, there exists some
other student (b) in the classroom with the same birthday.
∃a ∈ S, ∀b ∈ S, B(a, b) There exists a student (a) who has the same birthday as
every other student (b) in the classroom.
∃a ∈ S, ∃b ∈ S, B(a, b) There exists a student (a) in the classroom who has the same
birthday as another student (b) in the classroom.


3.3 And, Or, Not

Three operations allow us to form new predicates from old: The AND operation, also known
as conjunction; the OR operation, also known as disjunction; and the NOT operation, also known as
negation.

• AND: When we link two predicates using an AND statement, both predicates must evaluate
to true for the new predicate to be true. Notationally, the statement P(x) AND Q(x) is
written P(x) ∧ Q(x).
For example, if our universe is Z, let E(x) represent “x is even” and P(y) represent “y is
positive.” The statement E(x) ∧ P(y) is then “x is even and y is positive,” so that
E(2) ∧ P(2) is true, E(4) ∧ P(−1) is false,
E(3) ∧ P(1) is false, E(1) ∧ P(−1) is false.
As demonstrated, if either E(x) or P(y) is false, then E(x) ∧ P(y) will also be false.
• OR: An OR statement will evaluate to true when at least one of the component predicates
is true. The statement P(x) OR Q(x) is written P(x) ∨ Q(x). With E and P as above,
E(2) ∨ P(2) is true, E(4) ∨ P(−1) is true,
E(3) ∨ P(1) is true, E(1) ∨ P(−1) is false.
The only way that E(x) ∨ P(y) is false is if both E(x) and P(y) are false.
• NOT: Finally, negation does not link predicates but rather acts on a single predicate and
negates its truth value. The statement NOT P(x) is written ¬P(x). For example, if E(x)
is “x is even,” then ¬E(x) is “x is not even.” Similarly, ¬P(x) is “x is not positive,” and we
have
¬E(2) is false, ¬E(3) is true, ¬P(4) is false, ¬P(−1) is true.

All of these rules can be tricky to remember when trying to absorb the information from the
written word. By organizing the truth data of each operation into a truth table, we have a quick and
easy way of seeing the structure of each logical statement. To facilitate writing out these tables, let
T denote TRUE and F denote FALSE. The truth tables for all three operations are given in Table
1.
        AND                  OR                NOT
    P  Q  P ∧ Q          P  Q  P ∨ Q          P  ¬P
    T  T    T            T  T    T            T   F
    T  F    F            T  F    T            F   T
    F  T    F            F  T    T
    F  F    F            F  F    F

Table 1: The truth tables for the AND, OR, and NOT operations.

Example 3.6

Let O(x) represent “x is odd” and E(x) represent “x is even.” Compute O(x) ∧ E(x), O(x) ∨
E(x), ¬E(x) and ¬O(x) when x = 1 and x = 2.

Solution. We begin with the case x = 1:

O(1) ∧ E(1)   “1 is both odd and even.” This statement is FALSE, as 1 is certainly not even.
              This is the second row of the AND truth table.
O(1) ∨ E(1)   “1 is either odd or even.” This statement is TRUE, since 1 is odd. This is the
              second row of the OR truth table.
¬E(1)         “1 is not even.” This statement is TRUE as the number 1 is not even.
¬O(1)         “1 is not odd.” This statement is FALSE as the number 1 is certainly odd.


Now on to the case for x = 2, which is very similar to the x = 1 case.

O(2) ∧ E(2)   “2 is both odd and even.” This statement is FALSE as 2 is not odd. This is the
              third row of the AND truth table.
O(2) ∨ E(2)   “2 is either odd or even.” This statement is TRUE since 2 is even. This is the
              third row of the OR truth table.
¬E(2)         “2 is not even.” This is FALSE, since 2 is certainly even.
¬O(2)         “2 is not odd.” This is TRUE, since 2 is not odd.

Example 3.7

Create the truth table corresponding to the statement ¬(P ∧ Q) ∨ R.

Solution. While it is possible to create the truth table immediately, this is prone to mistakes. By
breaking down the truth table into several smaller tables, we obtain a clearer picture and our
solution is more robust to error. We start by examining the ¬(P ∧ Q) predicate.

P Q P ∧Q ¬(P ∧ Q)
T T T F
T F F T
F T F T
F F F T

Now we add in the disjunction with R into an expanded truth table, which gives

    P  Q  R  ¬(P ∧ Q)  ¬(P ∧ Q) ∨ R
    T  T  T     F            T
    T  T  F     F            F
    T  F  T     T            T
    T  F  F     T            T
    F  T  T     T            T
    F  T  F     T            T
    F  F  T     T            T
    F  F  F     T            T

We conclude that the statement ¬(P ∧ Q) ∨ R is always true except in the case where (P, Q, R) is
(T, T, F ). 
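Truth tables for compound statements can also be generated by enumerating every assignment of truth values. The following sketch does this for ¬(P ∧ Q) ∨ R; the helper names `truth_table` and `show` are assumptions for illustration, and rows are listed with T before F to match the order used in the tables of this section.

```python
from itertools import product

def truth_table(expr, arity):
    """Rows (inputs..., result), listed with T before F."""
    return [row + (expr(*row),)
            for row in product([True, False], repeat=arity)]

def show(row):
    """Render a row of booleans using the letters T and F."""
    return " ".join("T" if value else "F" for value in row)

table = truth_table(lambda p, q, r: (not (p and q)) or r, 3)
for row in table:
    print(show(row))

# Collect the assignments (P, Q, R) that make the statement false.
false_rows = [row[:3] for row in table if not row[3]]
print(false_rows)
```

The same helper handles any number of variables: a statement in n variables produces 2^n rows.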

Showing that two logical statements are equivalent can be done by showing that they have the
same truth table, as the following Proposition demonstrates.


Proposition 3.8

Let P, Q be propositions. The negations of the AND and OR statements are as follows:

¬ (P ∧ Q) = (¬P ) ∨ (¬Q), ¬ (P ∨ Q) = (¬P ) ∧ (¬Q).

Proof. It suffices to show that the expressions have equivalent truth tables. We will give the result
for the first identity, and leave the second as an exercise. The truth tables for the negation of the
AND statement are as follows

    P  Q  P ∧ Q  ¬(P ∧ Q)          P  Q  ¬P  ¬Q  (¬P) ∨ (¬Q)
    T  T    T       F              T  T   F   F       F
    T  F    F       T              T  F   F   T       T
    F  T    F       T              F  T   T   F       T
    F  F    F       T              F  F   T   T       T

The resulting values of the truth table are identical, showing that these statements are in fact
equivalent.
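Comparing truth tables row by row is a finite procedure, so it can be automated. The sketch below checks both identities of Proposition 3.8 and, for contrast, one pair of statements that are not equivalent; the function name `equivalent` is an assumption for illustration.

```python
from itertools import product

def equivalent(lhs, rhs, arity=2):
    """Two statements are equivalent when they agree on every row."""
    return all(lhs(*row) == rhs(*row)
               for row in product([True, False], repeat=arity))

first = equivalent(lambda p, q: not (p and q),
                   lambda p, q: (not p) or (not q))
second = equivalent(lambda p, q: not (p or q),
                    lambda p, q: (not p) and (not q))

# A tempting but incorrect "rule": negate each part and keep the AND.
mistaken = equivalent(lambda p, q: not (p and q),
                      lambda p, q: (not p) and (not q))

print(first, second, mistaken)
```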

3.3.1 Negating Quantifiers

To develop intuition for negating quantifiers, let’s think about how we would disprove a statement
involving a quantifier. For example, the universally quantified statement “every horse is black”
may be disproved by showing that there exists a non-black horse. Mathematically, if P(x) is “x is
a black horse,” then

the negation of ∀x, P(x) is ∃x, ¬P(x).

The existentially quantified statement “there exists a pink horse” is disproved by showing that
“every horse is not pink.” Mathematically, if P (x) is the statement “x is a pink horse,” then

the negation of ∃x, P (x) is ∀x, ¬P (x).

By thinking about the case of a general predicate P , the negation rules above still apply.
Example 3.9

Consider the mathematical statement ∀x ∈ R, x < x^2. Determine whether this sentence is
true or false, and write the negation of this sentence.

Solution. This sentence is false. For example, if x = 1/2 then x^2 = 1/4, showing that x > x^2. The
negation of this sentence is
∃x ∈ R, x ≥ x^2.

Notice that our counter-example satisfies the negation of our sentence, as we would expect. 
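The relationship between a counterexample and the negated statement can be seen by searching a finite sample. This sketch assumes a grid of rational numbers between −2 and 2 in steps of 1/4 (an arbitrary choice) and uses exact `Fraction` arithmetic to avoid rounding issues.

```python
from fractions import Fraction

candidates = [Fraction(n, 4) for n in range(-8, 9)]   # -2, -7/4, ..., 2

# The universal statement "x < x^2" restricted to the sample.
statement = all(x < x * x for x in candidates)

# Its negation: "there exists x with x >= x^2".
negation = any(x >= x * x for x in candidates)

# Counterexamples to the statement are exactly witnesses of the negation.
witnesses = [x for x in candidates if x >= x * x]

print(statement, negation, Fraction(1, 2) in witnesses)
```

On this sample the witnesses are the points of the grid lying in [0, 1], which includes the counterexample x = 1/2 used in the solution.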


Example 3.10

Negate the sentence “Every real number has a negative.”

Solution. From Example 3.4 we know that the given sentence can be stated mathematically as
∀x ∈ R, ∃y ∈ R, x + y = 0.
Applying our rules for negation, the negation of this sentence becomes
∃x ∈ R, ∀y ∈ R, x + y ≠ 0.
Translating this back into an English sentence, we have “There is a real number which has no
negative.”

3.4 Implications

At the core of mathematical statements are implications, which consist of ‘if-then’ statements. A
typical theorem contains a hypothesis and a conclusion, such that IF the hypothesis is TRUE,
THEN the conclusion is TRUE. This is a conditional statement that requires us to first check
the truth of the hypothesis before we can ascertain the truth of the conclusion, and is called an
implication because the veracity of the first statement implies the veracity of the second.
To frame this mathematically, let P and Q be predicates. The statement “If P then Q,” or
alternatively “P implies Q,” is written P ⇒ Q and has truth table given by Table 2.

IMPLICATION
P Q P ⇒Q
T T T
T F F
F T T
F F T

Table 2: The truth table for the implication P ⇒ Q.

Carefully consider the bottom two rows of Table 2, which are known as vacuous truths. The idea
is that a universally false hypothesis P can have any implication it wants, since that implication
will never be tested. For example, consider the statement

“If pigs can fly, then the sky is black.”

This is a true statement because the hypothesis “pigs can fly” is false, so the implication can
never be violated. This may seem artificial and contrived, but vacuous truths appear in
mathematics frequently, so it is important to be aware of how they are handled.
Example 3.11

Let D(x) be the predicate “x is a dog” and let A(x) be the predicate “x is an animal.”
Consider the truth of the implications D(x) ⇒ A(x), A(x) ⇒ D(x) and ¬A(x) ⇒ ¬D(x).


Solution. The arguments are given below:

D(x) ⇒ A(x) This is the statement “If x is a dog then it is an animal” or “All
dogs are animals” and is TRUE.
A(x) ⇒ D(x) This is the statement “If x is an animal then it is a dog” or “All
animals are dogs.” A cat is an animal which is not a dog, so this
implication must be FALSE.
¬A(x) ⇒ ¬D(x) This is the statement “If x is not an animal then it is not a dog,”
and is a TRUE sentence. Indeed, if x is not an animal then it
could not be a dog. If it were a dog, then by our first implication,
it would be an animal and this would be a contradiction.

Definition 3.12
Let P and Q be predicates with the implication P ⇒ Q.

• The contrapositive of P ⇒ Q is the statement ¬Q ⇒ ¬P .

• The converse of P ⇒ Q is the statement Q ⇒ P .

Example 3.11 shows that the converse of a true statement is not necessarily true. As for the
contrapositive, we have the following result:
Proposition 3.13

If P and Q are predicates, then P ⇒ Q and ¬Q ⇒ ¬P are logically equivalent.

Proof. You should try proving this result on your own before proceeding further.
The truth table for the contrapositive ¬Q ⇒ ¬P is as follows:

P Q ¬P ¬Q ¬Q ⇒ ¬P
T T F F T
T F F T F
F T T F T
F F T T T

Comparing this to the truth table for P ⇒ Q given in Table 2, we see that the truth values are
identical as required.
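The comparison of truth tables can also be checked mechanically. The following is a minimal sketch, not part of the original notes: the helper names implies and equivalent are hypothetical, and the code simply enumerates all truth assignments to confirm that P ⇒ Q agrees with its contrapositive but not with its converse.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def equivalent(f, g, num_vars: int = 2) -> bool:
    """Return True when f and g agree on every assignment of truth values."""
    return all(
        f(*values) == g(*values)
        for values in product([True, False], repeat=num_vars)
    )


# P => Q compared with its contrapositive (not Q) => (not P).
contrapositive_ok = equivalent(
    lambda p, q: implies(p, q),
    lambda p, q: implies(not q, not p),
)

# P => Q compared with its converse Q => P.
converse_ok = equivalent(
    lambda p, q: implies(p, q),
    lambda p, q: implies(q, p),
)

if __name__ == "__main__":
    print("contrapositive equivalent:", contrapositive_ok)
    print("converse equivalent:", converse_ok)
```

The same equivalent helper could be pointed at any pair of two-variable statements, for instance the De Morgan identities from earlier in this section.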

When both P ⇒ Q and its converse Q ⇒ P are true, we write P ⇔ Q and say that “P is true if
and only if Q is true.” This means that the statements P and Q are logically equivalent: whatever
truth value P (x) has, Q(x) will have the same. It is difficult at this point to give examples of “if
and only if” statements that are not just trivial restatements of one another, but some examples
might include:


• An integer n ∈ Z is even if and only if n/2 is an integer.

• An integer n ∈ Z is divisible by 10 if and only if its ones digit is 0.

• A triangle is isosceles if and only if exactly two of its angles are equal.

The words ‘necessary’ and ‘sufficient’ are often used to indicate the direction of an implication.
If P and Q are predicates, then

• “P is a necessary condition for Q” is the implication Q ⇒ P ,

• “P is a sufficient condition for Q” is the implication P ⇒ Q,

• “P is necessary and sufficient for Q” is the statement P ⇔ Q.

Example 3.14

Determine the truth table for the statement (P ∨ Q) ⇒ (¬Q ∧ R)

Solution. Building our truth table in parts, one can find

P Q R P ∨Q ¬Q ∧ R (P ∨ Q) ⇒ (¬Q ∧ R)
T T T T F F
T T F T F F
T F T T T T
T F F T F F
F T T T F F
F T F T F F
F F T F T T
F F F F F T
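A table like this is easy to get wrong by hand, so it can be worth generating it by brute force. The sketch below is an illustration only (the function names are assumptions, not part of the notes); it lists the rows in the same order as the table above, with T before F.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def truth_table():
    """Rows (P, Q, R, P or Q, (not Q) and R, (P or Q) => ((not Q) and R))."""
    rows = []
    for p, q, r in product([True, False], repeat=3):
        left = p or q
        right = (not q) and r
        rows.append((p, q, r, left, right, implies(left, right)))
    return rows


if __name__ == "__main__":
    print("P Q R  P∨Q  ¬Q∧R  (P∨Q)⇒(¬Q∧R)")
    for row in truth_table():
        print(" ".join("T" if value else "F" for value in row))
```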

Example 3.15

Let n ∈ N. Show that n is even if and only if n^2 is even.

Proof. We begin with the (⇒) direction, and assume that n is even so that n = 2k for some k ∈ N.
Squaring n gives
n^2 = 4k^2 = 2(2k^2)

showing that n^2 is also even.


To prove the (⇐) direction, we will proceed by contrapositive. Assume that n is not even, so
that n = 2k + 1 for some integer k ≥ 0. Squaring n gives

n^2 = (2k + 1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1

showing that n^2 is odd.

3.4.1 Negating an Implication

In Example 3.11 we found that the statement A(x) ⇒ D(x), read as “Every animal is a dog,” was
false. To show that it was false we used the example of a cat, which is an animal but is not a dog.
More generally, if P and Q are predicates in a universe S, we say that x ∈ S is a counter-example
to P ⇒ Q if P (x) is true but Q(x) is not true; that is, P (x) ∧ ¬Q(x) is true. Counter-examples are
exactly how implications are negated.
Proposition 3.16

If P and Q are predicates, the negation of the implication P ⇒ Q is the statement P ∧ ¬Q.

Proof. The truth table for P ∧ ¬Q is given below, along with P ⇒ Q for reference

P Q ¬Q P ∧ ¬Q P Q P ⇒Q
T T F F T T T
T F T T T F F
F T F F F T T
F F T F F F T

Comparison of the last column of each table shows that the tables are negations of one another,
proving the result.

Example 3.17

Negate the following sentence: “If x is a duck, then x likes peanut butter.”

Solution. Here we have the predicates P(x) = “x is a duck” and Q(x) = “x likes peanut butter,” with
the sentence above being the implication P ⇒ Q. The negation of this implication is P ∧ ¬Q, or
“x is a duck and x does not like peanut butter.”

Example 3.18

In calculus, we say that lim_{x→c} f(x) exists if

∃L ∈ R, ∀ε > 0, ∃δ > 0, ∀x ∈ R, |x − c| < δ ⇒ |f(x) − L| < ε.

Negate this sentence to determine the mathematical statement that a limit does not exist.


Solution. Applying our rules for negation, the limit does not exist if the following sentence is
satisfied:
∀L ∈ R, ∃ε > 0, ∀δ > 0, ∃x ∈ R, |x − c| < δ and |f(x) − L| ≥ ε.

3.5 Contradiction (Reductio ad absurdum)

Let P and Q be predicates, and consider the problem of showing that the statement T : P ⇒ Q is
true. A proof by contradiction proceeds by assuming that T is false (or ¬T is true), and showing
that something bad happens. More specifically, if R is some other predicate which may not be
directly related to P or Q, then
¬T ⇒ (R ∧ ¬R) (3.4)

is a true statement. Here we recall that for any predicate R, R ∧ ¬R is always false, so we have
shown that the assumption that ¬T is true leads to a contradiction.
We can use a truth table to verify that the truth of T perfectly corresponds with the truth of
(3.4). Indeed

T R ¬T R ∧ ¬R ¬T ⇒ (R ∧ ¬R)
T T F F T
T F F F T
F T T F F
F F T F F

Hence one can prove that T is true using Equation (3.4).


There is a small group of mathematicians that objects to using proof by contradiction, the
so-called constructivists. One of the consequences of using contradiction proofs is that one can show
an object exists without being able to construct it (by assuming it doesn’t exist and arriving at
a contradiction). In fact, there are cases where we can show that something exists, but have no
examples of it.
Proposition 3.19

In the decimal expansion of π, one of the digits {0, 1, 2, . . . , 8, 9} occurs infinitely often.

Proof. For the sake of contradiction, assume that each of the above digits occurs only finitely many
times in the decimal expansion of π. Let Ni be the number of times the digit i appears, so that
the decimal expansion of π consists of

N0 + N1 + · · · + N9

digits. As each Ni is finite, so too is this sum. If π has only a finite decimal expansion, it is
necessarily rational. Since we know that π is not rational, this is a contradiction and we conclude
that some digit must occur infinitely often.


Note that this proof is not constructive: we do not know which digit occurs infinitely often. In
fact, it is an open problem whether each digit occurs infinitely often.
Proposition 3.20

If A and B are sets, then A ∩ (B \ A) = ∅.

Proof. For the sake of contradiction, assume that A ∩ (B \ A) ≠ ∅, so that there exists an
element x ∈ A ∩ (B \ A). By definition of the intersection, x ∈ A and x ∈ B \ A. However,
x ∈ B \ A implies that x ∉ A, contradicting the fact that x ∈ A. Hence A ∩ (B \ A) = ∅.

Proposition 3.21

The number √2 is irrational.

Proof. For the sake of contradiction, assume that √2 is rational and write √2 = p/q where
gcd(p, q) = 1; that is, p/q is in lowest terms. Hence q√2 = p and by squaring both sides we
get
2q^2 = p^2.
Notice that 2q^2 is even, and so therefore p^2 must also be even. By Example 3.15 we know that p
is therefore also even, so p = 2k for some k ∈ N. Substituting this back into our equation, we get

2q^2 = (2k)^2 = 4k^2 ⇔ q^2 = 2k^2

so that similarly, q^2 is even. This implies that q is even, so q = 2ℓ. However, this is a contradiction.
We assumed that p and q were written in lowest terms, but have demonstrated that both are even.
Hence √2 is not rational and so must be irrational.

Example 3.22

Show that there are no natural solutions to the equation x^2 − 4y^2 = 7.

Solution. Suppose for the sake of contradiction that a solution exists. Note that we can factor the
left hand side, giving x^2 − 4y^2 = (x − 2y)(x + 2y) = 7. Since 7 is prime, its two factors are either
−1, −7 or 1, 7, but we can throw away the negative factors since x + 2y > 0.
Thus we either have x − 2y = 1 and x + 2y = 7, or x − 2y = 7 and x + 2y = 1. In either case, if
we add the two equations together we get x = 4. In the first case, this implies that y = 3/2 which
is not possible. In the latter case, y = −3/2, which is also not possible. Hence we’ve arrived at a
contradiction, and no solutions can exist. 

Example 3.23

Show that x^3 + x^2 = 1 has no rational solutions.


Solution. Suppose that a solution exists, and write it in lowest terms as x = a/b. Substituting in
we get
a^3/b^3 + a^2/b^2 = 1 ⇒ a^3 + a^2 b = b^3.
Now we have three cases: Either both a, b are odd; a is even and b is odd; or a is odd and b is even.
Note that both cannot be even, as we’ve assumed a/b is in lowest terms.

1. If both are odd, then a^3, a^2 b, and b^3 are all odd. But this cannot happen, since if a^3 and a^2 b
are odd, then a^3 + a^2 b is even.
2. If a is even and b is odd, a^3 is even, a^2 b is even, and b^3 is odd. This leads to the same problem,
as a^3 + a^2 b is then even.
3. If a is odd and b is even, then a^3 is odd, a^2 b is even, and b^3 is even. But then a^3 + a^2 b is odd,
which is a contradiction.

Thus there cannot be any rational solutions to the given equation. 

3.6 A Rigmarole of Random Results

Let’s practice doing some proofs!


Example 3.24

Show that the function f(x) = 3x/(x + 4) is injective; namely, if a ≠ b then f(a) ≠ f(b).

Solution. Proceeding by contrapositive, it is sufficient to show that whenever f(a) = f(b) then
a = b:
f(a) = f(b) ⇔ 3a/(a + 4) = 3b/(b + 4)
⇔ 3a(b + 4) = 3b(a + 4)
⇔ 3ab + 12a = 3ab + 12b
⇔ 12a = 12b
⇔ a = b.
This is precisely what we wanted to show, so the result follows.

Example 3.25

There exist irrational a and b such that a^b is rational.

Solution. We know that √2 is irrational (see Proposition 3.21). If √2^√2 is rational
we are done, setting a = b = √2. Otherwise, it is irrational, in which case

(√2^√2)^√2 = √2^(√2 · √2) = √2^2 = 2

works, with a = √2^√2 and b = √2.

3.6.1 Some Number Theory

Definition 3.26
If a, b ∈ Z we say that a|b (read “a divides b”) if there exists k ∈ Z such that ak = b.

For example, 5 | 35 since 5 · 7 = 35, while 2 ∤ 5 since there is no integer k for which 2k = 5.

Proposition 3.27

If a|b and a|c, then for any m, n ∈ Z, a|(mb + nc).

Proof. Our hypotheses indicate that a|b and a|c, so there exist k, ℓ ∈ Z such that ak = b and aℓ = c.
Using these equations, we can write

mb + nc = m(ak) + n(aℓ) = a(mk + nℓ)

showing that a|(mb + nc) as required.

Proposition 3.28

If a|b and a|(b + c) then a|c.

Proof. By assumption, there exist k, ℓ ∈ Z such that ak = b and aℓ = b + c. Using the latter equation,
we can write
c = aℓ − b = aℓ − ak = a(ℓ − k)
showing that a|c as required.

Recall that p ∈ Z is a prime if its only factors are 1 and p. A number which is not prime is
called composite, and necessarily has non-trivial factors other than 1 and itself.
Proposition 3.29

Every natural number can be written as a product of primes.

Proof. For the sake of contradiction, assume that not every natural can be written as a product of
primes. In particular, there must be a smallest such number, say n. This number cannot be prime
itself, otherwise it is trivially a product of primes, so n is necessarily composite and can be written
as n = rs for 1 < r ≤ s < n.
Both r, s < n, and since n is the smallest number that cannot be written as a product of primes,
both r and s must be writable as products of primes. However, combining those primes then gives


a decomposition of n into a product of primes, which contradicts our assumption. We conclude the
result, as required.

Theorem 3.30: Euclid’s Proof of the Infinitude of Primes

There are infinitely many prime numbers.

Solution. For the sake of contradiction, assume that there are only finitely many primes, and list
them as p_1, p_2, ..., p_n. Consider the number x = p_1 p_2 ⋯ p_n + 1. This number is larger than any of
the given primes, and hence cannot be prime itself.
We claim that x cannot be written as a product of prime numbers. Indeed, suppose that p_k
were a factor of x, so that p_k | x. Since p_k | p_1 p_2 ⋯ p_n, by Proposition 3.28 we would have p_k | 1, and
this is not possible. Hence no prime can be a factor of x, and so x cannot be written as a product
of primes. This contradicts Proposition 3.29, so our original assumption must have been false; that
is, there are infinitely many primes.
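To see the key step of the argument on a concrete list, the sketch below takes the first six primes, forms x = p_1 p_2 ⋯ p_n + 1, and checks that x leaves remainder 1 on division by each listed prime. This is an illustration under assumed names (smallest_prime_factor is a hypothetical helper using plain trial division); it does not replace the proof, it only shows that any prime factor of x must lie outside the list.

```python
from math import prod


def smallest_prime_factor(n: int) -> int:
    """Return the smallest prime factor of n >= 2 by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


primes = [2, 3, 5, 7, 11, 13]
x = prod(primes) + 1

# Every listed prime leaves remainder 1, so none of them divides x.
remainders = [x % p for p in primes]

# Hence the smallest prime factor of x is a prime missing from the list.
new_prime = smallest_prime_factor(x)

if __name__ == "__main__":
    print("x =", x)
    print("remainders:", remainders)
    print("a prime not in the list:", new_prime)
```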

4 Induction

Mathematical induction is a proof technique used to show that a result holds for every natural
number n ∈ N. It operates on the domino principle, by creating a chain of implications which extends
to every natural number. For example, suppose N is our universe and we want to show P(n)
is true for every n ∈ N. If we can show that P(1) is true, and that P(k) ⇒ P(k + 1) for every k, then
the result holds for any n. This is precisely because

P (1) ⇒ P (2) ⇒ P (3) ⇒ · · · ⇒ P (n − 1) ⇒ P (n) ⇒ · · ·

and since P (1) is true, so too is every P (n) thereafter.

Mathematical Induction
Let P be some predicate. If P (1) is true, and P (k) ⇒ P (k + 1) for any k, then P (n) is true
for all n ∈ N

Thus mathematical induction consists of two steps. The first is to demonstrate the base case
that P (1) is true. The second is to invoke the induction hypothesis that P (k) is true for some k,
and demonstrate that P (k) ⇒ P (k + 1).
Example 4.1

Show, using mathematical induction, that 2n + 2 ≤ 4n for all integers n ≥ 1.

Solution.

1. Base Case: The smallest number for which this occurs is n = 1, and in this case we have
2n + 2 = 4 and 4n = 4, so the result holds in the base case.


2. Induction Step: Assume that 2k + 2 ≤ 4k for some natural number k. We want to show
that 2(k + 1) + 2 ≤ 4(k + 1). Indeed, notice that
4(k + 1) = 4k + 4
≥ (2k + 2) + 2 (using the induction hypothesis 4k ≥ 2k + 2, together with 4 ≥ 2)
= 2k + 4 = 2(k + 1) + 2.
We conclude from the induction principle that 2k + 2 ≤ 4k for all k ∈ N.

Example 4.2

Show that for every n ∈ N, 2n ≤ 2^n.

Solution.

1. Base Case: When n = 1 one has 2 ≤ 2 which is a true statement, so the base case holds.
2. Induction Step: Assume that for some n we know that 2n ≤ 2^n. Now
2^{n+1} = 2(2^n)
≥ 2(2n) = 4n (by the induction hypothesis)
≥ 2n + 2 (by Example 4.1)
= 2(n + 1)
which is what we wanted to show.

Example 4.3

Show that the triangle inequality extends to more than two variables; namely,

|x_1 + x_2 + ⋯ + x_n| ≤ |x_1| + |x_2| + ⋯ + |x_n|.

Solution. The base case occurs when n = 2, since this is the first instance in which the inequality
makes sense. We have already proven this though, so the base case is done.
Assume then that |x_1 + ⋯ + x_n| ≤ |x_1| + ⋯ + |x_n|, and notice that
|x_1 + ⋯ + x_n + x_{n+1}| = |(x_1 + ⋯ + x_n) + x_{n+1}|
≤ |x_1 + ⋯ + x_n| + |x_{n+1}| (by the base case)
≤ |x_1| + ⋯ + |x_n| + |x_{n+1}| (by the induction hypothesis)
giving the desired result. 

Example 4.4

Prove that for all positive integers k, 5 | 6^k − 1.


Solution.

1. Base Case: The simplest case is k = 1, for which we see that 6^k − 1 = 5. Clearly 5 | 5 since
5/5 = 1, so the base case is satisfied.

2. Induction Step: For some positive integer k, assume that 5 | 6^k − 1. By this hypothesis, we
know there is some integer d such that (6^k − 1)/5 = d. Consider 6^{k+1} − 1,
which we may write as

6^{k+1} − 1 = 6(6^k) − 1
= (1 + 5)(6^k) − 1
= 5(6^k) + (6^k − 1).

We claim that 5 divides this number. To see that this is the case, let us divide by 5 and see
what we get.

(6^{k+1} − 1)/5 = [5(6^k) + (6^k − 1)]/5
= 5 · 6^k/5 + (6^k − 1)/5
= 6^k + d (by the induction hypothesis).

This is clearly an integer, so 5 | 6^{k+1} − 1 as required.

Example 4.5: Bernoulli’s Inequality

Show that for all x ≥ −1 and n ∈ N, we have

(1 + x)^n ≥ 1 + nx.

Solution. When n = 1 we have 1 + x = 1 + nx so the inequality is true. Assume then that
(1 + x)^n ≥ 1 + nx, so that

(1 + x)^{n+1} = (1 + x)^n (1 + x)
≥ (1 + nx)(1 + x) = 1 + x + nx + nx^2
= 1 + (n + 1)x + nx^2
≥ 1 + (n + 1)x

where we have used the fact that 1 + x ≥ 0 to multiply the induction hypothesis by (1 + x), and
the fact that nx^2 ≥ 0. This is what we wanted to show, so the inequality is
true.

Definition 4.6
If S is a set, we denote by P(S) the power set of S, which is the collection of all subsets of
S.


Example 4.7

Show that if S is a finite set, then |P(S)| = 2^{|S|}.

Solution. There are actually many ways to show this is true. The simplest is the following: To
count the number of elements in P(S), note that each element x ∈ S is either in a subset, or not
in a subset. Hence each x has two possible states it can be in. The number of ways to choose a
state for every element is therefore 2^{|S|}, and we are done.
However, we can proceed by induction on |S| instead. If |S| = 0 then S = ∅, and
P(S) = {∅} has size 2^0 = 1. The base case is thus true.
Now assume that the number of subsets of a set with n elements is 2^n, and let S have n + 1
elements
S = {s_1, ..., s_n, s_{n+1}}.
Notice that every subset of S either contains s_{n+1} or does not. Of those that do contain s_{n+1},
there are 2^n possible subsets (corresponding to the subsets of {s_1, ..., s_n}). Similarly, of those that
do not contain s_{n+1} there are also 2^n such subsets. All together, there are 2^n + 2^n = 2^{n+1} such
subsets, as required.
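The induction step translates directly into a recursive construction: the subsets of {s_1, ..., s_{n+1}} are the subsets of {s_1, ..., s_n}, together with a second copy of each with s_{n+1} added. The following sketch (the function name power_set is an assumption made for illustration) builds the power set this way, so the doubling 2^n + 2^n = 2^{n+1} is visible in the code.

```python
def power_set(elements):
    """Return all subsets of the list `elements` as a list of frozensets.

    Mirrors the induction step: every subset either omits the last element
    or contains it, and both families have the same size.
    """
    if not elements:
        return [frozenset()]  # the only subset of the empty set
    *rest, last = elements
    without_last = power_set(rest)
    with_last = [subset | {last} for subset in without_last]
    return without_last + with_last


if __name__ == "__main__":
    for n in range(5):
        print(n, len(power_set(list(range(n)))))
```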

As a brief aside, one sometimes denotes the power set by 2^S for this reason. Hence 2^N and 2^R
are the power sets of N and R respectively. There is yet another reason why this notation is great.
In general, if A and B are sets, then
A^B = {f : B → A};
that is, the set of all functions from B to A. It is possible to show that the subsets of S are in
one-to-one correspondence with the functions f : S → {0, 1}, wherein if T ⊆ S, then we define
f_T(x) = 1 if x ∈ T, and f_T(x) = 0 if x ∉ T,
and so P(S) = {0, 1}^S = 2^S, where we identify 2 = {0, 1}, since this set has two elements.
Example 4.8

Show that for all n ∈ N a 2^n × 2^n chessboard with a single tile removed can be L-tiled; that
is, tiled by an L-shape consisting of three squares.

Solution. The base case is when n = 1, in which we are tasked with tiling a 2 × 2 chessboard with
one tile removed. This is immediately possible, so we are done.
Assume then that any 2^n × 2^n board with a single tile removed admits an L-tiling, and consider
a 2^{n+1} × 2^{n+1} board. Divide this board into four quarters, so that each quarter has dimension
2^n × 2^n. By rotating the board if necessary, assume that the missing tile is located in the upper-left
quadrant. We place our first tile as illustrated in Figure 9, so that it covers the cell nearest the
centre of the board in each of the other three quadrants. Note that, excluding the placement of
the first tile, every quadrant is now a 2^n × 2^n board with a single tile removed. By the induction
hypothesis, each board admits an L-tiling, so those tilings combined give the tiling of the
2^{n+1} × 2^{n+1} board.
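The induction argument is effectively a recursive algorithm. The sketch below is one possible rendering of it, under assumptions not in the notes: the function name l_tile, the grid representation, and the choice to handle the hole in any quadrant directly rather than rotating the board. Each call places one L-tile around the centre of the current board and recurses into the four quadrants.

```python
def l_tile(n, missing):
    """L-tile a 2^n x 2^n board (n >= 1) with the cell `missing` = (row, col) removed.

    Returns a grid in which the missing cell holds 0 and every other cell
    holds the positive id of the L-tile covering it.
    """
    size = 2 ** n
    grid = [[0] * size for _ in range(size)]
    counter = [0]  # mutable tile-id counter shared by the recursive calls

    def solve(top, left, size, hole_r, hole_c):
        if size == 1:
            return  # a single cell that is already a hole or already covered
        counter[0] += 1
        tile_id = counter[0]
        half = size // 2
        mid_r, mid_c = top + half, left + half
        # (quadrant top, quadrant left, row and column of its centre-adjacent cell)
        quadrants = [
            (top, left, mid_r - 1, mid_c - 1),
            (top, mid_c, mid_r - 1, mid_c),
            (mid_r, left, mid_r, mid_c - 1),
            (mid_r, mid_c, mid_r, mid_c),
        ]
        for q_top, q_left, centre_r, centre_c in quadrants:
            in_rows = q_top <= hole_r < q_top + half
            in_cols = q_left <= hole_c < q_left + half
            if in_rows and in_cols:
                # This quadrant already contains the hole.
                solve(q_top, q_left, half, hole_r, hole_c)
            else:
                # Cover the centre-adjacent cell with the new L-tile, then
                # treat that cell as this quadrant's removed tile.
                grid[centre_r][centre_c] = tile_id
                solve(q_top, q_left, half, centre_r, centre_c)

    solve(0, 0, size, missing[0], missing[1])
    return grid


if __name__ == "__main__":
    for row in l_tile(2, (0, 1)):
        print(" ".join(f"{cell:2d}" for cell in row))
```

Running it on a 4 × 4 board prints a grid of tile ids in which each id appears exactly three times and the removed cell shows 0.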


Figure 9: Left: The base case. A 2 × 2 board with a single tile removed is an L-shape, and so admits
an L-tiling. Right: The induction step. By placing the first tile as such, each of the quadrants is a
2^n × 2^n board with a single tile removed.

4.1 Summation and Product Notation

Sigma notation is used to make complicated sums much easier to write down. In particular, we use
a summation index to iterate through elements of a list and then sum them together. Consider the
expression

∑_{i=n}^{m} r_i (4.1)

which is read as “the sum from i = n to m of r_i.” The element i is known as the dummy or
summation index, n and m are known as the summation bounds, and r_i is the summand. In order
to decipher this rather cryptic notation, we adhere to the following algorithm:

1. Set i = n and write down r_i;

2. If i is equal to m then stop;

3. Otherwise add 1 to the index i, add r_i to the current sum, and go to step 2.

For those computer savvy students out there, this is nothing more than a for-loop. Interpreting
(4.1) we thus have
∑_{i=n}^{m} r_i = r_n + r_{n+1} + r_{n+2} + ⋯ + r_m.
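To make the for-loop remark concrete, here is a minimal sketch in Python. The function name sigma and the convention of passing the summand as a function r are assumptions made for this illustration.

```python
def sigma(r, n, m):
    """Compute r(n) + r(n + 1) + ... + r(m) with an explicit for-loop."""
    total = 0
    for i in range(n, m + 1):  # i runs over n, n + 1, ..., m inclusive
        total += r(i)
    return total


if __name__ == "__main__":
    print(sigma(lambda i: i, 1, 100))     # sum of the first 100 natural numbers
    print(sigma(lambda i: i * i, 3, 5))   # 3^2 + 4^2 + 5^2
```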


Example 4.9

Let k be a fixed positive integer. Show that

∑_{n=1}^{k} 1/(n(n + 1)) = k/(k + 1).

Solution. As always, we follow the program enumerated above.

1. Base Case: Here we check the easiest possible case, which corresponds to k = 1. When
k = 1 the left-hand-side becomes

∑_{n=1}^{1} 1/(n(n + 1)) = 1/(1(1 + 1)) = 1/2

while the right-hand-side is 1/(1 + 1) = 1/2. Clearly both sides agree, so the base case is satisfied.

2. Induction Step: Let k be some fixed but arbitrary number and assume that

∑_{n=1}^{k} 1/(n(n + 1)) = k/(k + 1).

We would like to show that the result holds for k + 1. It makes most sense to start by working
with the left-hand-side, since it will give us the most “flexibility.” Notice that

∑_{n=1}^{k+1} 1/(n(n + 1)) = 1/((k + 1)(k + 2)) + ∑_{n=1}^{k} 1/(n(n + 1))
= 1/((k + 1)(k + 2)) + k/(k + 1) (via the induction hypothesis)
= (1 + k(k + 2))/((k + 1)(k + 2)) = (k^2 + 2k + 1)/((k + 1)(k + 2)) (common denominator)
= (k + 1)^2/((k + 1)(k + 2)) = (k + 1)/(k + 2) (factoring and cancelling)
= (k + 1)/((k + 1) + 1).

This is precisely what we wanted to show, and so we are done. 
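A quick numerical check can catch algebra slips before one attempts the induction. The sketch below, with the hypothetical helper name partial_sum, uses exact rational arithmetic so that no rounding is involved. It checks the formula for the first few dozen values of k, which is evidence rather than a proof.

```python
from fractions import Fraction


def partial_sum(k: int) -> Fraction:
    """Exact value of the sum of 1/(n(n + 1)) for n = 1, ..., k."""
    return sum(Fraction(1, n * (n + 1)) for n in range(1, k + 1))


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, partial_sum(k), Fraction(k, k + 1))
```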

Example 4.10

Show that for any k ≥ 1 we have

∑_{n=1}^{k} n^2 = k(k + 1)(2k + 1)/6. (4.2)


Solution. The base case is k = 1, in which case we get

1^2 = 1 = (1 × 2 × 3)/6,

which is a true statement. Thus assume that (4.2) holds for some k, and notice that

∑_{n=1}^{k+1} n^2 = ∑_{n=1}^{k} n^2 + (k + 1)^2
= k(k + 1)(2k + 1)/6 + (k^2 + 2k + 1) (by the induction hypothesis)
= [(2k^3 + 3k^2 + k) + (6k^2 + 12k + 6)]/6
= (2k^3 + 9k^2 + 13k + 6)/6
= (k + 1)(k + 2)(2k + 3)/6,

which is precisely the correct equation for k + 1.

Pi notation works in precisely the same way, except that instead of adding we multiply:

∏_{i=1}^{n} r_i = r_1 r_2 r_3 ⋯ r_{n−1} r_n,

so for example, factorial notation can be expressed using Pi-notation as

n! = ∏_{i=1}^{n} i = 1 · 2 · 3 ⋯ n

or sometimes one sees double factorial notation

(2n)!! = ∏_{i=1}^{n} (2i) = 2 · 4 · 6 ⋯ (2n − 2) · (2n)
(2n + 1)!! = ∏_{i=0}^{n} (2i + 1) = 1 · 3 · 5 ⋯ (2n − 1) · (2n + 1)
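As with sums, a product in Pi notation is a loop that multiplies instead of adds. The sketch below is illustrative only: pi_product is a hypothetical name, and the two double factorials are written with the product bounds that make the last factor 2n and 2n + 1 respectively.

```python
from math import factorial, prod


def pi_product(r, n, m):
    """Compute r(n) * r(n + 1) * ... * r(m)."""
    return prod(r(i) for i in range(n, m + 1))


def even_double_factorial(n: int) -> int:
    """(2n)!! = 2 * 4 * ... * (2n)."""
    return pi_product(lambda i: 2 * i, 1, n)


def odd_double_factorial(n: int) -> int:
    """(2n + 1)!! = 1 * 3 * ... * (2n + 1)."""
    return pi_product(lambda i: 2 * i + 1, 0, n)


if __name__ == "__main__":
    print(pi_product(lambda i: i, 1, 5), factorial(5))
    print(even_double_factorial(3), odd_double_factorial(3))
```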

Example 4.11

Show that for every integer n ≥ 2,

∏_{r=2}^{n} (1 − 1/r^2) = (n + 1)/(2n). (4.3)

Solution. In the base case we have n = 2, so the left hand side is 1 − 1/4 = 3/4, while the right
hand side is (2 + 1)/(2 · 2) = 3/4. These are equal, so the base case holds.


Assume then that (4.3) holds for some n, so that

∏_{r=2}^{n+1} (1 − 1/r^2) = [∏_{r=2}^{n} (1 − 1/r^2)] · (1 − 1/(n + 1)^2)
= ((n + 1)/(2n)) · ((n^2 + 2n)/(n + 1)^2)
= n(n + 1)(n + 2)/(2n(n + 1)^2)
= (n + 2)/(2(n + 1))

exactly as desired.

4.2 More General Induction

The principle of induction can be extended beyond just N. For example, suppose we want to show
that the predicate P (n) is true for all even numbers greater than or equal to 10. Here the base case
is to demonstrate P(10), followed by P(2n) ⇒ P(2n + 2). This creates the chain of implications

P(10) ⇒ P(12) ⇒ P(14) ⇒ ⋯ ⇒ P(2n) ⇒ ⋯

demonstrating P(n) for all even numbers at least 10. This idea is easily generalized to any other
induction scheme. Of course, this can be seen as equivalent to ordinary induction by defining Q(n)
to be the statement “P(2n + 8) is true,” so that Q(1) is P(10).
An ostensibly different type of induction is that of strong induction. We again aim to create a
chain of implications, but now we make a stronger induction hypothesis.
Theorem 4.12: Strong Induction

Let P be some predicate. Suppose that P (1) is true, and moreover

P (1) ∧ P (2) ∧ · · · ∧ P (k) ⇒ P (k + 1)

for any k, then P (n) is true for all n ∈ N

Proof. We will proceed by using (normal) induction. Let Q(k) = P (1) ∧ · · · ∧ P (k). Since Q(1) =
P (1), the base case is true. Now assume that Q(n) is true. Since Q(n) ⇒ P (n + 1), then Q(n + 1)
is true as well. By induction, Q(n) holds for all n, and this is only possible if all P (n) are true, as
required.

Hence (Induction) ⇒ (Strong Induction). Moreover, since normal induction uses a weaker
hypothesis, we see that (Strong Induction) ⇒ (Induction). This shows that induction and strong
induction are actually equivalent.


Example 4.13

Show that any postage amount of at least 8 cents can be formed by using 3 cent and 5
cent stamps.

Solution. Let P(n) be the statement “A postage of n cents can be made of 3 and 5 cent stamps.”
As our base cases,

8 = 3(1) + 5(1), 9 = 3(3) + 5(0), 10 = 3(0) + 5(2),

so P(8), P(9) and P(10) are true.
Now assume that P(k) is true for all 8 ≤ k ≤ n, where n ≥ 10, for which we will show that P(n + 1)
is true. Indeed, notice that we can write n + 1 = (n − 2) + 3. Since 8 ≤ n − 2 ≤ n, our induction
hypothesis tells us that a postage of n − 2 cents can be resolved, say by r three-cent stamps and
s five-cent stamps, thus

n + 1 = (n − 2) + 3 = [3(r) + 5(s)] + 3 = 3(r + 1) + 5(s)

as required. 
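The strong-induction argument is constructive: it tells us how to build the stamps for n cents from the stamps for n − 3 cents. The following sketch follows that recipe. The function name stamps is an assumption for illustration, and the recursion is kept deliberately naive to mirror the proof.

```python
def stamps(n: int):
    """Return (r, s) with 3*r + 5*s == n for n >= 8, following the induction.

    The three base cases are 8, 9 and 10 cents; any larger amount adds one
    three-cent stamp to a solution for n - 3 cents.
    """
    if n < 8:
        raise ValueError("the argument only covers postage of at least 8 cents")
    base_cases = {8: (1, 1), 9: (3, 0), 10: (0, 2)}
    if n in base_cases:
        return base_cases[n]
    r, s = stamps(n - 3)
    return (r + 1, s)


if __name__ == "__main__":
    for n in range(8, 16):
        r, s = stamps(n)
        print(f"{n} = 3({r}) + 5({s})")
```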

Example 4.14

Consider a two-player game, consisting of two bowls of marbles. Each player takes a turn
removing any positive number of marbles from a single bowl. The player that removes the
last marble wins. Show that if both bowls have an identical number of marbles, the player
who goes second always has a winning strategy.

Solution. Let P (n) be the statement ”Player two wins when both bowls have n marbles.” The base
case is n = 1, in which both bowls have a single marble. Player one must remove at least one
marble from a single bowl. This leaves only one bowl with one marble, so player two wins.
Now assume that P(k) is true for all 1 ≤ k ≤ n, for which we will demonstrate P(n + 1). Player
One must go first, and so removes ℓ marbles (with 1 ≤ ℓ ≤ n + 1) from one bowl, leaving a bowl
with (n + 1) marbles, and one with (n + 1 − ℓ) marbles. Player Two now moves by removing ℓ
marbles from the other bowl, leaving each bowl with n + 1 − ℓ marbles. If ℓ = n + 1 then Player Two
has just removed the last marble and wins. Otherwise the game has been reduced to the game with
(n + 1 − ℓ) ≤ n marbles in each bowl, and we know P(n + 1 − ℓ), so Player Two has a winning strategy.
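For small bowls the claim can be confirmed by exhaustively searching the game tree. The sketch below is an independent brute-force check, not the mirroring strategy from the proof; the function name mover_wins is an assumption. A position is winning for the player about to move exactly when some move leads to a position that is losing for the opponent.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def mover_wins(a: int, b: int) -> bool:
    """True if the player about to move can force a win with bowls of size a and b."""
    # With both bowls empty there is no move: the previous player took the
    # last marble, so the player to move has already lost.
    for take in range(1, a + 1):
        if not mover_wins(a - take, b):
            return True
    for take in range(1, b + 1):
        if not mover_wins(a, b - take):
            return True
    return False


if __name__ == "__main__":
    for n in range(1, 8):
        # Equal bowls: the first player to move should lose.
        print(n, mover_wins(n, n))
```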

4.2.1 Recursion

Recursive sequences are those sequences whose elements depend explicitly upon previous entries.
For example, the well known Fibonacci sequence is defined as x_1 = x_2 = 1 and x_n = x_{n−1} + x_{n−2} for n ≥ 3.
Unfortunately, computing xn means computing xk for all k ≤ n. More appealing would be to find
a closed form solution for xn .
The problem of determining a closed form solution can be rather tricky, and is often relegated
to the realm of combinatorial enumeration. However, given a closed form we can use induction to
verify the result.


Example 4.15

Consider the recurrence relation x_1 = 3, x_2 = 7 and

x_k = 5x_{k−1} − 6x_{k−2}.

Show that x_k = 2^k + 3^{k−1}.

Solution. We will proceed by strong induction. The base cases are

x_1 = 2^1 + 3^0 = 3, x_2 = 2^2 + 3^1 = 7

which agree with our initial configuration. Now assume that x_k = 2^k + 3^{k−1} for all 1 ≤ k ≤ n,
where n ≥ 2. Examining x_{n+1} we have

x_{n+1} = 5x_n − 6x_{n−1} = 5(2^n + 3^{n−1}) − 6(2^{n−1} + 3^{n−2})
= 5(2^n + 3^{n−1}) − (3 · 2^n + 2 · 3^{n−1})
= 2 · 2^n + 3 · 3^{n−1}
= 2^{n+1} + 3^n

exactly as desired. 
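Before attempting an induction proof of a conjectured closed form, it is sensible to compare it against the recurrence for small indices. The sketch below does this for the sequence above; the names x_recursive and x_closed are hypothetical.

```python
def x_recursive(k: int) -> int:
    """Compute x_k from x_1 = 3, x_2 = 7 and x_k = 5*x_{k-1} - 6*x_{k-2}."""
    previous, current = 3, 7  # x_1 and x_2
    if k == 1:
        return previous
    for _ in range(k - 2):
        previous, current = current, 5 * current - 6 * previous
    return current


def x_closed(k: int) -> int:
    """The closed form 2^k + 3^(k-1)."""
    return 2 ** k + 3 ** (k - 1)


if __name__ == "__main__":
    for k in range(1, 8):
        print(k, x_recursive(k), x_closed(k))
```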

Example 4.16

Let x_1 = x_2 = 1 and x_n = x_{n−1} + x_{n−2} so that x_n is the Fibonacci sequence. Define

α_± = (1 ± √5)/2.

Show that x_n = (α_+^n − α_−^n)/√5.

Solution. First note that

1 + α_± = α_±^2

owing to the fact that these are the roots of the polynomial x^2 − x − 1. It can also be verified by
straightforward computation:

1 + α_± = (1 ± √5)/2 + 1 = (3 ± √5)/2

and

α_±^2 = (1 ± 2√5 + 5)/4 = (6 ± 2√5)/4 = (3 ± √5)/2.


We begin by checking the base cases:

(α_+ − α_−)/√5 = (1/(2√5)) [(1 + √5) − (1 − √5)]
= 2√5/(2√5) = 1

(α_+^2 − α_−^2)/√5 = (1/(4√5)) [(1 + √5)^2 − (1 − √5)^2]
= [(1 + 2√5 + 5) − (1 − 2√5 + 5)]/(4√5)
= 1,

so both base cases are satisfied. Assume then that x_k = (α_+^k − α_−^k)/√5 for all 1 ≤ k ≤ n, for which
we demonstrate the result for x_{n+1}. Indeed,

x_{n+1} = x_n + x_{n−1} = (α_+^n − α_−^n)/√5 + (α_+^{n−1} − α_−^{n−1})/√5
= [α_+^{n−1}(α_+ + 1) − α_−^{n−1}(α_− + 1)]/√5
= [α_+^{n−1} α_+^2 − α_−^{n−1} α_−^2]/√5
= (α_+^{n+1} − α_−^{n+1})/√5
as required. 
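The closed form can be compared against the recurrence numerically. The sketch below uses floating-point arithmetic, so the closed form is rounded to the nearest integer; this is an assumption of the illustration and is only reliable for moderately small n, where rounding error stays well below 1/2. The function names are hypothetical.

```python
from math import sqrt

ALPHA_PLUS = (1 + sqrt(5)) / 2
ALPHA_MINUS = (1 - sqrt(5)) / 2


def fib_recursive(n: int) -> int:
    """x_n from x_1 = x_2 = 1 and x_n = x_{n-1} + x_{n-2}, computed iteratively."""
    current, following = 1, 1  # x_1 and x_2
    for _ in range(n - 1):
        current, following = following, current + following
    return current


def fib_closed(n: int) -> int:
    """The closed form (alpha_+^n - alpha_-^n)/sqrt(5), rounded to an integer."""
    return round((ALPHA_PLUS ** n - ALPHA_MINUS ** n) / sqrt(5))


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, fib_recursive(n), fib_closed(n))
```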

4.3 Fallacies

One must think carefully about both the base case and induction step, and ensure that everything
is being done correctly. There are some subtly wrong arguments that can be made.
Example 4.17

Let x_0 = 0 and x_1 = 1 and define x_n = x_{n−1} + x_{n−2}. Every x_n is even.

Solution. Let P(n) be the statement “x_n is even.” Clearly x_0 = 0 is even, so the base case P(0) is true.
Now assume that P(k) is true for all 0 ≤ k ≤ n. Since x_{n+1} = x_n + x_{n−1} is the sum of two even
numbers, it is even.

This proof is certainly wrong, since x_1 = 1 and x_4 = 3, so what happened? Since x_n depends
upon both x_{n−1} and x_{n−2}, we must check two base cases, and the second base case x_1 = 1 is not even.
Example 4.18

All horses are the same color.


Solution. Let P(n) be the statement “Any group of n horses all have the same colour.” Clearly
P(1) is true since there is only a single horse in the collection. Now assume that any group of n
horses has the same colour, and let H = {h_1, ..., h_{n+1}} be a set of n + 1 horses. Break this into
the subsets
H_1 = {h_1, ..., h_n}, H_2 = {h_2, ..., h_{n+1}}.
Each set H_i has only n horses, so by the induction hypothesis, all horses in each group are the
same colour. Since H_1 ∩ H_2 ≠ ∅, there is a common horse in both H_1 and H_2, implying that every
horse in H = H_1 ∪ H_2 has the same colour.

Here the issue is the induction step. The assumption H_1 ∩ H_2 ≠ ∅ fails when n = 1, since in
that case H = {h_1, h_2}, making H_1 = {h_1} and H_2 = {h_2}.

5 Bijections and Cardinality

5.1 Injective and Surjective Functions

Injectivity is a powerful property of functions. In our context, it will facilitate discussion of inverse
functions, to be taken up in Section 5.2.
Definition 5.1
A function f : S → T is said to be injective or one-to-one if whenever f (s1 ) = f (s2 ) then
s1 = s2 .

The output of an injective function uniquely corresponds to the input; that is, the only way for
two outputs to be equal (f(s_1) = f(s_2)) is for the inputs to have also been equal (s_1 = s_2). This is
also alluded to in the phrase one-to-one.
When S and T are both subsets of the real numbers, one can test whether a function f is injective
by applying the Horizontal Line Test to the graph of f. A function satisfies the Horizontal Line
Test if whenever we draw a horizontal line in the plane, it intersects the graph of the
function at most once.
A third perspective is to view a function as a collection of arrows as in Figure 10. In this case,
a function is injective if every element of the codomain has at most one arrow pointing to it.
Example 5.2

Consider the functions f (x) = x^2 , g(x) = 1/x, and h(x) = 1 + x. Determine which, if any,
of these functions is injective.

Solution. We claim that f (x) = x^2 is not injective; that is, we can find two different points x1 ≠ x2
such that f (x1 ) = f (x2 ). Indeed, notice that f (−1) = (−1)^2 = 1 and f (1) = (1)^2 = 1 so that
f (−1) = f (1). If f (x) were injective, Definition 5.1 would imply that −1 = 1, and this is certainly
not the case. In fact, it is not too hard to convince ourselves that for any non-zero real number r,
we have f (r) = f (−r) since r^2 = (−r)^2 , but r ≠ −r.


Figure 10: If f : X → Y is an injective function, each element of the codomain Y has at most one
arrow pointing at it.

On the other hand, the function g is injective. Assume that g(x) = g(y), which by definition of
g tells us that 1/x = 1/y. Taking the reciprocal of both sides gives x = y, which is what
we wanted to show.
Finally, the function h(x) is injective, since if h(x) = h(y) then 1 + x = 1 + y. By subtracting
1 from both sides, we get x = y as required. 
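The conclusions of Example 5.2 can be spot-checked on a finite sample of inputs. The sketch below is only an illustration: the helper `is_injective_on` and the sample points are our own choices, and passing the test on a finite sample never proves injectivity, although a single collision does disprove it.

```python
def is_injective_on(f, sample):
    """Return True if f takes distinct values at the distinct points of `sample`."""
    seen = {}  # maps each output to the input that produced it
    for x in sample:
        y = f(x)
        if y in seen and seen[y] != x:
            return False  # two different inputs share an output
        seen[y] = x
    return True

sample = [-3, -2, -1, 1, 2, 3]
print(is_injective_on(lambda x: x * x, sample))  # False, since (-1)^2 = 1^2
print(is_injective_on(lambda x: 1 / x, sample))  # True on this sample
print(is_injective_on(lambda x: 1 + x, sample))  # True on this sample
```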

Proposition 5.3

If f : B → C and g : A → B are injective functions, then h = f ◦ g : A → C is also injective.

Solution. Assume that h(x) = h(y) for some x, y ∈ A. By definition of h we have f (g(x)) = f (g(y)).
Since the function f is injective, the only way f (m1 ) = f (m2 ) is if m1 = m2 , so f (g(x)) = f (g(y))
implies that g(x) = g(y). Since g is also injective, it must be the case that x = y. Thus we have
shown that if h(x) = h(y) then x = y, showing that h is injective as required. □

Proposition 5.4

Let f : B → C and g : A → B be functions. If f ◦ g is injective, then g is injective.

Solution. Assume that g(a1 ) = g(a2 ). Applying f to both sides gives f (g(a1 )) = f (g(a2 )). Since
the composition f ◦ g is injective, this means that a1 = a2 , which is what we wanted to show. □

The dual notion to an injective function is a surjective function, and this duality will be made
clearer in Section 5.2.


Definition 5.5
A function f : S → T is said to be surjective or onto if for every element t ∈ T , there is an
element s ∈ S such that f (s) = t.

Figure 11: If f : X → Y is surjective, then every element of the codomain has at least one arrow
pointing at it.

When thinking about surjective functions, the idea to keep in mind is that every element in T is
the image of something in S. Put another way, if f maps elements of S to elements of T , everything
in T is hit by something in S. If we describe a function as arrows between sets, a function f is
surjective if everything in T has an arrow pointing to it.
Example 5.6

Of the following functions which map R → R, determine which maps are surjective:

f (x) = x^2 , g(x) = 1/x, h(x) = 1 + x.

Solution. For functions R → R, a surjective function is the same as a function whose range is all of
R. The function f (x) = x^2 is therefore not surjective since the range of f (x) is [0, ∞). Similarly,
g(x) has range R \ {0}, so is not surjective. However, the function h(x) is surjective: if y is any real
number, we have that y is hit by y − 1, since

h(y − 1) = 1 + (y − 1) = y.

Thus the range of h(x) is all of R and we conclude that h(x) is surjective. 

Proposition 5.7

If f : B → C and g : A → B are surjective functions, then f ◦ g : A → C is also surjective.


Solution. Let c ∈ C be an arbitrary element, for which we need to find an a ∈ A such that
f (g(a)) = c. Since f is surjective, there exists b ∈ B such that f (b) = c. Since g is surjective, there
exists a ∈ A such that g(a) = b. Now f (g(a)) = f (b) = c as required. 

Leaving functions we might see in calculus, the following are some further examples:

1. The function f : R → R given by f (x) = sin(x) is neither injective nor surjective. Indeed,
sin(0) = sin(π) and f (R) = [−1, 1]. Restricting the domain cannot make f surjective onto R, but it
can ensure that f is injective. An interval of length π is the largest we can
take, and a common choice is [−π/2, π/2].
2. The function d : R → R^2 , x ↦ (x, x) is injective but not surjective. It is injective since if
d(x) = d(y) then (x, x) = (y, y) and equating either component gives x = y. On the other hand,
there is no point x in the domain such that d(x) = (0, 1).
3. The function p : R^2 → R, (x, y) ↦ x is surjective but not injective. It fails to be injective
since p(x, y1 ) = x = p(x, y2 ) for any y1 , y2 . On the other hand, if x0 ∈ R then p(x0 , 0) = x0 ,
showing that the map is surjective.
4. Let PolyR be the polynomials with real coefficients, and define ev0 : PolyR → R as ev0 (p) =
p(0). This map is surjective but not injective. Indeed, ev0 (x^2 + a) = a = ev0 (x + a) for any
a ∈ R, showing both surjectivity and the failure of injectivity at the same time.

Definition 5.8
A function f : S → T is bijective if it is both injective and surjective.

If f : S → T is injective, every element of T has at most one arrow pointing at it. If f is
surjective, then every element of T has at least one arrow pointing at it. If f is bijective (and hence
both injective and surjective), this must mean that every element of T has exactly one arrow pointing
at it. We’ve shown that compositions of injective/surjective functions are injective/surjective, so
it immediately follows that compositions of bijections are bijections.
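For functions between finite sets, the arrow-counting description translates directly into a check. The sketch below stores a function as a Python dictionary from inputs to outputs; the name `classify` and the sample functions are illustrative assumptions, not part of the notes.

```python
def classify(mapping, codomain):
    """Return (injective, surjective, bijective) for a function stored as a dict.

    `mapping` sends each element of the domain to an element of `codomain`.
    """
    outputs = list(mapping.values())
    injective = len(set(outputs)) == len(outputs)  # at most one arrow per target
    surjective = set(outputs) == set(codomain)     # at least one arrow per target
    return injective, surjective, injective and surjective

print(classify({"a": 1, "b": 2, "c": 3}, {1, 2, 3, 4}))  # (True, False, False)
print(classify({"a": 1, "b": 1, "c": 2}, {1, 2}))        # (False, True, False)
print(classify({"a": 1, "b": 2, "c": 3}, {1, 2, 3}))     # (True, True, True)
```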

5.2 Inverse Functions

The word “inverse” has many different meanings depending on the context in which it is used. For
example, what if we were to ask the student to find the inverse of the number 2? What does this
mean? To what are we taking the inverse? To properly understand this, we need to understand the
following: Given a binary operator (an operator which takes in two things and produces a single
thing in return, such as addition and multiplication of real numbers), we say that a number id
is the identity of that operator if operating against it does nothing to the input. In the case of
addition, the identity id+ will satisfy x + id+ = x for all possible x; for example,
2 + id+ = 2, −5 + id+ = −5.
Our experience tells us that id+ = 0. Similarly, for multiplication the identity id× will satisfy
x × id× = x for all x; for example,
3 × id× = 3, π × id× = π.


Again our experience tells us that id× = 1. We say that 0 is the additive identity and 1 is the
multiplicative identity.
Given an operator and an identity, we say that the inverse of x is an element which, when paired
against x, gives the identity. The additive inverse of 2 is the number y such that 2 + y = id+ = 0.
In this case y = −2, and more generally the additive inverse of n is −n. For multiplication, we can
convince ourselves that the multiplicative inverse of x is 1/x; for example, 2 × (1/2) = 1 = id× .
Notice that every real number has an additive inverse, while there is no multiplicative inverse
for the number 0. In general, one cannot be guaranteed that an inverse always exists.
If f, g : A → A, then function composition f ◦ g is another example of a binary operator. What
is the identity for this operation? Well, we would like a function id◦ : A → A such that

f (id◦ (x)) = f (x) = id◦ (f (x)).

The identity function is therefore the function id◦ (x) = x, the function which does nothing to the
argument! Therefore the inverse of a function f : A → A is another function f^{-1} : A → A such
that f ◦ f^{-1} = f^{-1} ◦ f = id◦ .
This conversation can be generalized for functions whose domain and codomain are not equal.
For example, if f : A → B then f^{-1} : B → A. However, we now require two identity functions,
id_A : A → A and id_B : B → B, such that

f (f^{-1} (y)) = id_B (y) = y, f^{-1} (f (x)) = id_A (x) = x.

Definition 5.9
Let f : S → T be a function. We say that g : T → S is a

• left-inverse of f if g(f (s)) = s for all s ∈ S,

• right-inverse of f if f (g(t)) = t for all t ∈ T ,

• inverse of f if it is both a left- and right-inverse. We denote the inverse of f as f^{-1} .

Injective functions and surjective functions have left- and right-inverses respectively, as demon-
strated in the following propositions:
Proposition 5.10

A function f : S → T is injective if and only if it has a left-inverse g : T → S.

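Proof. Let’s begin by assuming that f : S → T is injective. Define the function g : T → S as
follows: Let s0 ∈ S be any element and set

g(t) = s if f (s) = t for some s ∈ S,
g(t) = s0 otherwise.

Since f is injective, there is at most one s with f (s) = t, so g is well defined.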


If you reexamine Figure 10, the idea is to simply reverse each given arrow. However, anything
which does not already have an arrow pointing to it needs to map somewhere. Hence we choose an
arbitrary element s0 in the domain and map all those points to s0 . To see that g is a left inverse
of f , let s ∈ S, in which case g(f (s)) = s by definition of g.
Conversely, assume that f has a left inverse g : T → S so that g(f (s)) = s for any s ∈ S. Suppose
f (x) = f (y); we would like to show that x = y. By applying g to both sides we get

g(f (x)) = g(f (y)) ⇒ x=y

so that f is injective as required.
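For finite sets, the construction in the first half of the proof can be carried out literally: reverse every arrow, and send everything not hit by f to a fixed default element. The following is a sketch under that assumption; the function name `left_inverse` and the example data are ours.

```python
def left_inverse(mapping, codomain, default):
    """Build g : T -> S with g(f(s)) = s, for an injective f stored as a dict."""
    reversed_arrows = {t: s for s, t in mapping.items()}
    if len(reversed_arrows) != len(mapping):
        raise ValueError("f is not injective, so it has no left-inverse")
    # Elements of T not hit by f are sent to the arbitrary default element s0.
    return {t: reversed_arrows.get(t, default) for t in codomain}

f = {"a": 1, "b": 2, "c": 3}   # an injective function S -> T
T = [1, 2, 3, 4]
g = left_inverse(f, T, default="a")

print(all(g[f[s]] == s for s in f))  # True: g is a left-inverse of f
print(f[g[4]] == 4)                  # False: g is not a right-inverse
```

The second check fails because 4 is not in the range of f, which is exactly the failure of surjectivity.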

Proposition 5.11

A function f : S → T is surjective if and only if it has a right-inverse g : T → S.

Proof. We begin by assuming that f : S → T is surjective. For each t ∈ T , let P (t) denote the set⁵

P (t) = {s ∈ S : f (s) = t} ;

that is, P (t) consists of the elements of S which map to t. Since f is surjective, each set P (t) has at
least one element, so for each t ∈ T we choose⁶ an element s_t ∈ P (t). Define the function g : T → S
by g(t) = s_t , which we claim is a right-inverse to f . Indeed, f (g(t)) = f (s_t ) = t by definition of s_t ,
as required.
Conversely, assume that there is a function g : T → S such that f (g(t)) = t for all t ∈ T . We
want to show that for each t ∈ T there is an element s ∈ S such that f (s) = t. The function
g : T → S gives us a way of picking an element in S, so we choose the element s = g(t) ∈ S. It
then follows that f (s) = f (g(t)) = t as required.
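On finite sets the choice of s_t can be made explicitly, for instance by taking the first element of each pre-image. This is a sketch of that idea only; the name `right_inverse` and the rule "take the first element" are our own assumptions, and the input dictionary is assumed to take values in `codomain`.

```python
def right_inverse(mapping, codomain):
    """Build g : T -> S with f(g(t)) = t, for a surjective f stored as a dict."""
    preimages = {t: [] for t in codomain}   # P(t) for each t in T
    for s, t in mapping.items():
        preimages[t].append(s)
    if any(len(p) == 0 for p in preimages.values()):
        raise ValueError("f is not surjective, so it has no right-inverse")
    # Choose one element of each pre-image; here, simply the first one found.
    return {t: p[0] for t, p in preimages.items()}

f = {"a": 1, "b": 1, "c": 2}   # a surjective function S -> T
T = [1, 2]
g = right_inverse(f, T)

print(all(f[g[t]] == t for t in T))  # True: g is a right-inverse of f
print(g[f["b"]] == "b")              # False: g is not a left-inverse
```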

Injective functions are precisely those with left-inverses, and surjective functions are those with
right-inverses. This is the notion of duality we mentioned before. Definition 5.9 says that a function
f : S → T has an inverse if it has both a left- and a right-inverse, so we can combine Proposition
5.10 and Proposition 5.11 to get the following corollary:
Corollary 5.12

A function f : S → T is bijective if and only if it has a two-sided inverse f^{-1} : T → S.

Proposition 5.13

If f : A → B is invertible, its inverse is unique.

⁵ The set P (t) is called the pre-image of an element t under f , and is usually denoted by f^{-1} (t). However, this
is just notation and does not mean that an inverse function f^{-1} exists! To avoid possible confusion, we have chosen
not to use this notation for this proof.
⁶ Here we have had to use something called the Axiom of Choice. Not all mathematicians believe that such a
choice is allowed to be made, but the author is not one of those mathematicians.


Proof. Let f^{-1} : B → A and g : B → A be inverses to f , so that

f ◦ f^{-1} = id_B , f^{-1} ◦ f = id_A , f ◦ g = id_B , g ◦ f = id_A .

We now exploit the defining property of the identity function and associativity of function composition:

g = g ◦ id_B = g ◦ (f ◦ f^{-1} ) = (g ◦ f ) ◦ f^{-1} = id_A ◦ f^{-1} = f^{-1} .

Hence g = f^{-1} , showing that the inverse function is unique.

If the function is only left/right invertible, the left/right inverse need not be unique.
Example 5.14

If they exist, find the inverses of the following functions:

f : R → R, x ↦ x^2 ;    g : R \ {0} → R, x ↦ 1/x;    h : R → R, x ↦ 1 + x.

In each case, if the inverse does not exist, determine a necessary change to the function to
guarantee an inverse.

Solution. We start with h(x) = 1 + x as it is the easiest. Examples 5.2 and 5.6 demonstrated that
h(x) is both injective and surjective, and hence is bijective. By Corollary 5.12, we know there is
an inverse function h^{-1} : R → R such that (h ◦ h^{-1} )(x) = x and (h^{-1} ◦ h)(x) = x. The reader can
easily check that the desired inverse function is h^{-1} (x) = x − 1.
Now f (x) = x^2 is neither injective nor surjective, so certainly it will have neither a left- nor a
right-inverse. However, if we make a small change to the codomain of the function then we can
arrive at a partial answer. If f : R → [0, ∞) then we have a right-inverse r(x) = √x, since

(f ◦ r)(x) = (√x)^2 = x, (r ◦ f )(x) = √(x^2) = |x|.

Notice that r still fails to be a left-inverse. If we also change the domain so that f : [0, ∞) → [0, ∞)
then (r ◦ f )(x) = x since there are no negative values of x to realize the absolute value. With these
changes in place the function f is now invertible.
Finally, the function g(x) = 1/x is injective but fails to be surjective, since there is no value
of x ∈ R \ {0} such that g(x) = 0. This is the only value we miss, so by removing it from the codomain
we have a bijective function g : R \ {0} → R \ {0}, whose inverse is s(x) = 1/x:

(g ◦ s)(x) = 1/(1/x) = x, (s ◦ g)(x) = 1/(1/x) = x. □
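The identities used in this solution can be spot-checked numerically. The sketch below uses floating-point arithmetic, so comparisons involving square roots and reciprocals use a tolerance; the sample points are arbitrary choices of ours.

```python
import math

def f(x):
    return x * x

def r(x):          # candidate right-inverse of f : R -> [0, oo)
    return math.sqrt(x)

def h(x):
    return 1 + x

def h_inv(x):
    return x - 1

def g(x):          # g : R \ {0} -> R \ {0}, its own inverse
    return 1 / x

samples = [-3.0, -0.5, 0.25, 2.0, 9.0]

# h_inv is a two-sided inverse of h.
h_ok = all(h(h_inv(x)) == x and h_inv(h(x)) == x for x in samples)
# r is a right-inverse of f on [0, oo) ...
right_ok = all(math.isclose(f(r(x)), x) for x in samples if x >= 0)
# ... but (r o f)(x) = |x|, so r is not a left-inverse on all of R.
left_values = [r(f(x)) for x in samples]
# g composed with itself is the identity away from 0.
g_ok = all(math.isclose(g(g(x)), x) for x in samples)

print(h_ok, right_ok, g_ok)
print(left_values)
```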

The above example demonstrates how critical the domain and codomain are to the definition
of a function. By changing the domain and codomain, one can change whether the function is
injective, surjective, or bijective, and hence whether or not it is invertible. In practice, injectivity
is more critical than surjectivity. If f : A → B is injective, restricting the codomain to the range
of f does not result in any loss of information about the function. On the other hand, if f is not
injective, we have to restrict the domain to a subset on which the function is injective. This means
throwing away information about the function, which is less than ideal.


f : x ↦ x^2

(Co)domain              Injective   Surjective
R −→ R                  No          No
[0, ∞) −→ R             Yes         No
R −→ [0, ∞)             No          Yes
[0, ∞) −→ [0, ∞)        Yes         Yes

Warning

Many students confuse the notion of the function inverse with that of a reciprocal. The
reciprocal of a function f is the function 1/f , and is such that

f (x) × (1/f (x)) = 1.

The inverse of a function is such that (f ◦ f^{-1} )(x) = x. Because this mistake occurs so
frequently, we make the message loud and clear:

“The inverse of a function is not the same as the reciprocal of a function.”

Proposition 5.15

If f : B → C and g : A → B are both bijections, then (f ◦ g)^{-1} = g^{-1} ◦ f^{-1} .

Solution. Since inverses are unique, it suffices to show that g^{-1} ◦ f^{-1} is a two-sided inverse of f ◦ g.
Indeed,
(f ◦ g) ◦ (g^{-1} ◦ f^{-1} ) = f ◦ (g ◦ g^{-1} ) ◦ f^{-1} = f ◦ id_B ◦ f^{-1} = f ◦ f^{-1} = id_C
and similarly
(g^{-1} ◦ f^{-1} ) ◦ (f ◦ g) = g^{-1} ◦ (f^{-1} ◦ f ) ◦ g = g^{-1} ◦ id_B ◦ g = g^{-1} ◦ g = id_A
showing that (f ◦ g)^{-1} = g^{-1} ◦ f^{-1} as required. □
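The order reversal in Proposition 5.15 is easy to confirm on small finite bijections stored as dictionaries. The helper names `compose` and `invert` and the sample data below are illustrative assumptions.

```python
def compose(outer, inner):
    """Return the composition outer o inner for functions stored as dicts."""
    return {x: outer[inner[x]] for x in inner}

def invert(bijection):
    """Return the inverse of a bijection stored as a dict."""
    return {y: x for x, y in bijection.items()}

g = {1: "x", 2: "y", 3: "z"}       # g : A -> B
f = {"x": 30, "y": 10, "z": 20}    # f : B -> C

lhs = invert(compose(f, g))            # (f o g)^{-1} : C -> A
rhs = compose(invert(g), invert(f))    # g^{-1} o f^{-1} : C -> A
print(lhs == rhs)  # True
```

Composing in the other order, `compose(invert(f), invert(g))`, is not even defined here, since the outputs of g^{-1} lie in A rather than in C.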

5.3 Cardinality

The cardinality of a set S is how many elements are within the set, and is denoted |S|. When S
is finite, |S| is simple to define; however, the issue becomes trickier when S is an infinite set. For
example, we will see that |N| = |Z| = |Q|, which is surprising given that it looks as though Q is
much larger than N.
Consider the following situation: you are given two boxes, let’s call them box S and box T .
Each box contains an unknown number of rubber balls, and your job is to determine which box
contains more balls. One strategy is to reach into both boxes at the same time, withdrawing a
single ball from each. If, say, box S runs out of balls before box T , you know that box S contained
fewer balls than T . Thinking of the boxes S and T as sets, this says that |S| < |T |.
How does this help us in determining the size of sets? Let f : S → T be a function, thought of as
a collection of arrows from S to T . The function f must define exactly |S| arrows, one emanating

from each element of S. If |S| > |T | it is possible to define a surjective function f : S → T , but
not an injective function. For each object in S we must choose an object in T . Since |S| > |T | we
have enough arrows to hit every element in T , giving us a surjective function. On the other hand,
the pigeonhole principle tells us that at some point we are going to have to send two elements of S
to the same element of T , breaking injectivity.
One can imagine a similar situation if |S| < |T |, wherein one can define an injective function
from S to T , but not a surjective function. It is this idea for finite sets that allows us to define the
notion of cardinality in general.
Definition 5.16
Let S and T be sets. We say |S| ≤ |T | if there is an injective function S → T .

For example, if S = {1, 2, 3} and T = {−3, −6, −9, −12} we could define an injection f : S → T
by f (s) = −3s, showing that |S| ≤ |T |. This agrees with our usual notion of counting, since |S| = 3
and |T | = 4. However, we can extend this idea to infinite sets. For example, if S = {1, 2, 3, . . .}
and T = {−1, −2, −3, . . .} then |S| ≤ |T | via the injection s ↦ −s. Of course, in this latter case
we expect |S| = |T |, but we are not yet able to discuss these things.
Before going any further, let’s ground our definition in reality.
Proposition 5.17

If S = {s1 , . . . , sn } and T = {t1 , . . . , tm } are finite sets, then |S| ≤ |T | if and only if n ≤ m.

Proof. [⇐] Suppose n ≤ m and define a map f : S → T by f (si ) = ti . This map makes sense only
if n ≤ m, and moreover it is certainly injective. Hence |S| ≤ |T |.
[⇒] We prove the contrapositive: suppose n > m and let f : S → T be any function. The data of f
includes n outputs taken from only m possible values, so by the Pigeonhole principle at
least one of these outputs must be repeated; that is, there exist si ≠ sj such that
f (si ) = f (sj ). Hence no function S → T is injective.

Proposition 5.18

If S ⊆ T then |S| ≤ |T |.

Proof. Let ι : S → T be the inclusion function; that is, ι(s) = s. This function is certainly injective,
since if ι(s1 ) = ι(s2 ) then by definition, s1 = s2 . By Definition 5.16, |S| ≤ |T |.

The chain of inclusions N ⊆ Z ⊆ Q ⊆ R thus immediately implies that |N| ≤ |Z| ≤ |Q| ≤ |R|. Similarly,
|[0, 1]| ≤ |R|, and likewise for any other cardinality relation induced by the subset relation.
One can also use the notion of a surjection to compare cardinalities, as the following proposition
demonstrates:


Proposition 5.19

If S and T are non-empty sets and f : S → T is a surjection, then |T | ≤ |S|; that is, there
is an injection g : T → S.

Proof. Let f : S → T be surjective. By Proposition 5.11 we know that f has a right inverse;
namely, there is a function g : T → S such that f ◦ g = idT . The symmetry in this relationship
means that f is a left-inverse for g, which by Proposition 5.10 means that g is injective. Hence we
have an injective function from T to S, and |T | ≤ |S|.

Once again, we immediately get that if S = {s1 , . . . , sn } and T = {t1 , . . . , tm } are finite sets,
then there is a surjection from S to T if and only if m ≤ n. This leads us to the following definition:
Definition 5.20
If S and T are sets, then |S| = |T | if there is a bijection S → T .

If f : S → T is a bijection, it is both surjective and injective, so |S| ≤ |T | and |T | ≤ |S|. Thus
it makes sense to define equality of cardinality this way.
Note as well that this relationship is immediately transitive, so that if |A| = |B| and |B| = |C|,
then |A| = |C|. Indeed, if there is a bijection f : A → B and a bijection g : B → C, then g ◦ f : A → C
is a bijection.
Example 5.21

Consider the set

2N = {0, 2, 4, 6, . . .} .

Show that |2N| = |N|.

Solution. Certainly 2N ⊆ N, so that |2N| ≤ |N|, but we have been asked to go one step further.
Define the function f : N → 2N by f (n) = 2n, which we shall show is a bijection.
To see that it is injective, notice that if f (n) = f (m) then 2n = 2m. Dividing by 2 gives n = m
as required. To see that f is surjective, notice that every even number k ∈ 2N can be written
as k = 2m for some m ∈ N. Hence f (m) = 2m = k so f is surjective. Since there is a bijection
between N and 2N, we conclude that |N| = |2N|. 

Example 5.21 demonstrates that, unlike finite sets, infinite sets can have the same cardinality as
their subsets. This is just the first surprising statement in a plethora of unintuitive but interesting
results. A similar example is the following:
Example 5.22

Show that |(0, 1)| = |R| .


Solution. We need to find a bijective function that maps (0, 1) to R, or alternatively a bijection
from R to (0, 1), whose inverse then does the job. The latter is easy once one recognizes that
arctan : R → (−π/2, π/2) is a bijection. By shifting and rescaling the arctangent function,
we get
f : R → (0, 1), t ↦ (2 arctan(t) + π)/(2π),
which is a bijection, and its inverse f^{-1} : (0, 1) → R is the desired bijection. □
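As a numerical sanity check, one can take the rescaled arctangent t ↦ (2 arctan(t) + π)/(2π) from R to (0, 1) together with its inverse u ↦ tan(πu − π/2) and confirm that they undo one another on a few sample points. The function names below are ours, and floating-point round trips are only approximate, so a tolerance is used.

```python
import math

def to_unit_interval(t):
    """Map a real number t into the open interval (0, 1)."""
    return (2 * math.atan(t) + math.pi) / (2 * math.pi)

def from_unit_interval(u):
    """Inverse map from (0, 1) back to the real line."""
    return math.tan(math.pi * u - math.pi / 2)

samples = [-100.0, -1.0, 0.0, 0.5, 7.0]
images = [to_unit_interval(t) for t in samples]

print(all(0 < u < 1 for u in images))  # every image lies in (0, 1)
print(all(math.isclose(from_unit_interval(u), t, rel_tol=1e-9, abs_tol=1e-9)
          for t, u in zip(samples, images)))  # the round trip recovers t
```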

Exercise: Modify Example 5.22 to show that |(a, b)| = |R| for any a < b. What about the
closed interval [a, b]?

A subtle question at this point is whether knowing |S| ≤ |T | and |T | ≤ |S| tells us that |T | = |S|.
Remember, these “inequalities” are just notation – notation that happens to coincide with our usual
inequality on N, but we can’t say more than that in general.
Let’s think about this more. We’re asking if knowing that there is an injection f : S → T
(|S| ≤ |T |) and an injection g : T → S (|T | ≤ |S|) guarantees the existence of a bijection from S to
T . That’s not at all obvious.
Theorem 5.23: Cantor-Bernstein-Schroeder

If f : A → B and g : B → A are injective functions, then there is a bijection h : A → B, and
hence |A| = |B|.

The proof is somewhat involved, and so is omitted. It is worth emphasizing, however, that the
result is far from obvious.
Example 5.24

Show that |(0, 1)| = |[0, 1]|.

Solution. The inclusion map ι : (0, 1) → [0, 1], x ↦ x is an injection, so we need only construct
an injection in the other direction. Define f : [0, 1] → (0, 1) by f (t) = (1 + 2t)/4, and note that
f ([0, 1]) = [1/4, 3/4] ⊆ (0, 1). This function is injective, for if f (t1 ) = f (t2 ) then

(1 + 2t1 )/4 = (1 + 2t2 )/4 ⇒ 1 + 2t1 = 1 + 2t2 ⇒ t1 = t2 .

Thus by the Cantor-Bernstein-Schroeder theorem, there is a bijection between (0, 1) and [0, 1];
hence, |(0, 1)| = |[0, 1]|. □

Definition 5.25
A set S is said to be countable if |S| ≤ |N|. Equivalently, S is countable if there is an injective
function f : S ↪ N. We say that S is countably infinite if |S| = |N|.

Our goal is to show that Z and Q are countable, but that R is not. For the former, we need the
following result:


Proposition 5.26

A countable union of (pairwise disjoint) countable sets is countable.

Proof. Let {Ai }i∈I be a countable collection of pairwise disjoint countable sets, so I is countable, as is each Ai . Let
g : I ↪ N be an injective function, and for each Ai let fi : Ai ↪ N be an injective function. Define
the map

f : ⋃_{i∈I} Ai → N, a ↦ 2^{g(n)} 3^{fn (a)} for a ∈ An .

This map is well-defined because the An are pairwise disjoint, so each a lies in exactly one An .
Uniqueness of prime decompositions gives injectivity: the only way 2^n 3^m = 2^r 3^s
is if n = r and m = s. So if f (a) = f (b) with a ∈ An and b ∈ Ar , then g(n) = g(r) and fn (a) = fr (b);
injectivity of g gives n = r, and injectivity of fn then gives a = b. Thus a countable union of countable sets is countable.
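To see the encoding at work, consider the special case An = {n} × N, taking g(n) = n and fn (n, m) = m (these particular choices are our assumption, made only for illustration). The pair (n, m) is then sent to 2^n 3^m, and the pair can be recovered by counting factors of 2 and 3.

```python
def encode(n, m):
    """Send the pair (n, m) of natural numbers to 2^n * 3^m."""
    return 2 ** n * 3 ** m

def decode(code):
    """Recover (n, m) from 2^n * 3^m by counting the factors of 2 and of 3."""
    n = m = 0
    while code % 2 == 0:
        code //= 2
        n += 1
    while code % 3 == 0:
        code //= 3
        m += 1
    return n, m

pairs = [(n, m) for n in range(20) for m in range(20)]
codes = {encode(n, m) for n, m in pairs}

print(len(codes) == len(pairs))                        # no two pairs collide
print(all(decode(encode(n, m)) == (n, m) for n, m in pairs))
```

The map is injective but far from surjective: for example, 5 is never of the form 2^n 3^m. The argument only needs an injection.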

Theorem 5.27

The integers are the same size as the natural numbers: |Z| = |N|.

Proof. Note that

N × N = {(n, m) : n, m ∈ N} = ⋃_{n∈N} {n} × N

is a countable union of countable sets, and hence is countable itself by Proposition 5.26. Define the
map f : Z → N × N as n ↦ (|n|, sgn(n) + 1), where sgn(n) ∈ {−1, 0, 1} is the sign of n. As an
example of what this map does, we have

−5 ↦ (5, 0)    5 ↦ (5, 2)
−4 ↦ (4, 0)    4 ↦ (4, 2)
−3 ↦ (3, 0)    3 ↦ (3, 2)
−2 ↦ (2, 0)    2 ↦ (2, 2)
−1 ↦ (1, 0)    1 ↦ (1, 2)
        0 ↦ (0, 1)

so that the second number just keeps track of whether the number is negative, zero, or positive.
This map is injective, since if f (n) = f (m) then (|n|, sgn(n) + 1) = (|m|, sgn(m) + 1). Equality in the first
component, |n| = |m|, implies that n = ±m. Equality in the second component, sgn(n) = sgn(m),
then implies that n = m.
Since f is injective, we thus have that |Z| ≤ |N × N| ≤ |N|. On the other hand, since N ⊆ Z it
must be that |N| ≤ |Z|. By the Cantor-Bernstein-Schroeder theorem, these two inequalities give us that |N| = |Z|.
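The proof above only produces injections in both directions. For comparison, here is a standard explicit bijection between Z and N, which is a different technique from the argument in the proof: list the integers as 0, 1, −1, 2, −2, . . . The sketch assumes that 0 ∈ N, as the set 2N = {0, 2, 4, . . .} in Example 5.21 suggests, and the function names are ours.

```python
def to_natural(z):
    """Zigzag map Z -> N: 0, 1, -1, 2, -2, ... go to 0, 1, 2, 3, 4, ..."""
    return 2 * z - 1 if z > 0 else -2 * z

def to_integer(n):
    """Inverse map N -> Z: odd numbers go to positives, even numbers to non-positives."""
    return (n + 1) // 2 if n % 2 == 1 else -(n // 2)

print([to_natural(z) for z in (0, 1, -1, 2, -2)])  # [0, 1, 2, 3, 4]
print(all(to_integer(to_natural(z)) == z for z in range(-50, 51)))
print(all(to_natural(to_integer(n)) == n for n in range(101)))
```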

Theorem 5.28

The rational numbers are countably infinite: |Q| = |N|.

Proof. Consider the map f : Q → Z × N given by f (p/q) = (p, q) where the fraction p/q is in lowest
terms (if the fraction is negative, we always take the sign in the p-component). Again this map is
injective, since if f (p/q) = f (r/s) then (p, q) = (r, s) which is true only if p = r and q = s. Since Z

and N are both countable, so too is Z × N and so |Q| ≤ |Z × N| ≤ |N|. On the other hand, N ⊆ Q
so |N| ≤ |Q| and this gives us that |Q| = |N|.

One might expect that this pattern continues forever, and that every infinite set has the same
cardinality as the naturals. The real numbers are our first counterexample.
Theorem 5.29

The real numbers are strictly larger than the natural numbers, and so not countable: |R| >
|N|.

Proof. It is sufficient to show that the interval [0, 1] is not countable, since then certainly
all of R will be uncountable also. For the sake of contradiction, assume that the real numbers in [0, 1] are
countable and list them {r1 , r2 , r3 , . . .}. Write each ri in its decimal expansion as
ri = 0.d_{i1} d_{i2} d_{i3} d_{i4} · · · so that

r1 = 0.d_{11} d_{12} d_{13} d_{14} · · ·
r2 = 0.d_{21} d_{22} d_{23} d_{24} · · ·
r3 = 0.d_{31} d_{32} d_{33} d_{34} · · ·
r4 = 0.d_{41} d_{42} d_{43} d_{44} · · ·
⋮

Define a new number s as follows. Let s = 0.s1 s2 s3 s4 · · · where

si = 0 if d_{ii} = 1,
si = 1 if d_{ii} ≠ 1.

The number s is not in the list {r1 , r2 , r3 , . . .} by construction, since it differs from each ri in the
i-th decimal digit, but it is a real number in [0, 1]. This is a contradiction, since we assumed that we listed all of the
real numbers in [0, 1]. We conclude that the real numbers are not countable as required.
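The diagonal rule is mechanical enough to run on a finite truncation of such a list. In the sketch below each row is a string of decimal digits standing in for the first few digits of ri ; the rows themselves are made-up sample data, and the helper name `diagonal` is ours.

```python
def diagonal(rows):
    """Build the digit string s with s_i = 0 if the i-th digit of row i is 1, else 1."""
    return "".join("0" if row[i] == "1" else "1" for i, row in enumerate(rows))

rows = ["1415", "1111", "5000", "2718"]   # hypothetical digits of r1, r2, r3, r4
s = diagonal(rows)

print(s)  # 0011
# s differs from the i-th row in the i-th digit, so it equals none of the rows.
print(all(s[i] != rows[i][i] for i in range(len(rows))))
```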

So what goes wrong with the reals? The problem is that, given a fixed real number, there is
no reasonable way to say what the “next” real number is. In the case of N and Z it is easy, and in
the case of Q it is a bit tricky but still doable. But let’s say you start at the real number 0. What
is the next real number? 0.1? Why not 0.01 or 0.001 or 0.0001? I can put an arbitrary number
of zeroes before putting that 1, so it does not make sense to say “the next real number.” This is
precisely what breaks.
So is there a cardinal strictly between |N| and |R|? This turns out to be an incredibly deep and
subtle question, and one that cannot be settled with our standard set of axioms (the assertion that
no such cardinal exists is known as the continuum hypothesis). One must either
assume that there is such a cardinal, or assume there is no such cardinal; it cannot be proven either
way. However, there is a systematic way of taking a set, and getting a set with a strictly larger
cardinality.
Definition 5.30
Let S be a set, and define the power set P(S) to be the set of all subsets of S. The power
set is sometimes denoted 2^S .


Theorem 5.31: Cantor’s Theorem

If S is a set, then |S| < |P(S)|.

Proof. First we show that |S| ≤ |P(S)| by demonstrating an injection S → P(S). Define the
function f : S → P(S) by x ↦ {x}. This map is evidently injective.
The remainder of the proof is a generalization of the proof given in Theorem 5.29, and proceeds
by diagonalization. Assume for the sake of contradiction that a surjective function f : S → P(S) exists and define the set

D = {x ∈ S : x ∉ f (x)} ∈ P(S).

We claim D is not the image of any point in S. Indeed, for each element a ∈ S, either a ∈ f (a) or
a ∉ f (a). In the first case, if a ∈ f (a) then a ∉ D, so f (a) ≠ D. On the other hand, if a ∉ f (a)
then a ∈ D, again showing that D ≠ f (a). Thus D is not the image of any point, contradicting
the assumption that f was a surjection. Hence there is no bijection S → P(S), and we conclude that |S| < |P(S)|.
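For a small finite set the diagonal argument can be verified exhaustively: enumerate every function f : S → P(S) and check that the set D is never in its image. The sketch below does this for a three-element set, giving 8^3 = 512 functions; the helper names are ours.

```python
from itertools import combinations, product

def power_set(elements):
    """Return all subsets of `elements` as frozensets."""
    return [frozenset(c)
            for k in range(len(elements) + 1)
            for c in combinations(elements, k)]

def diagonal_set(f):
    """Return D = {x in S : x not in f(x)} for f : S -> P(S) stored as a dict."""
    return frozenset(x for x, subset in f.items() if x not in subset)

S = [0, 1, 2]
subsets = power_set(S)
# Every function S -> P(S): one choice of subset for each element of S.
all_functions = [dict(zip(S, images)) for images in product(subsets, repeat=len(S))]

always_missed = all(diagonal_set(f) not in f.values() for f in all_functions)
print(len(subsets), len(all_functions), always_missed)  # 8 512 True
```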

There is no greatest cardinal, since we can keep taking power sets to create a hierarchy of
cardinals. This leads to an interesting paradox: if there were a set of all sets, each of its subsets would
also be one of its elements, so it would contain its own power set, and yet by Cantor’s Theorem it must
have cardinality strictly smaller than that power set. This is resolved by declaring that the collection of all sets is
not a set, but rather a proper class. Proper classes are too large to be sets, and therefore do
not have the same restrictions as sets.

6 Divisibility and Modular Arithmetic

6.1 Divisibility and Primes

We introduced the basics of divisibility in Definition 3.26. This section will focus on this topic, and
other number theoretical ideas.
Recall that if a, b ∈ Z we say that a|b if there exists a k ∈ Z such that ak = b. You can think of
a as being a factor of b. Furthermore, we’ve already proven several useful facts, which we recount
below:

1. [Proposition 3.27] If a|b and a|c then for any m, n ∈ Z, a|(mb + nc).
2. [Proposition 3.28] If a|b and a|(b + c) then a|c.
3. [Proposition 3.29] Every integer greater than 1 can be written as the product of primes.
4. [Theorem 3.30] There are infinitely many prime numbers.

Definition 6.1
If a, b ∈ Z, we define the greatest common divisor of a and b, written gcd(a, b), to be the
largest positive integer that divides both a and b. More precisely,

gcd(a, b) = max {d ∈ Z : d > 0 and d|a and d|b} .


For example,
gcd(4, 6) = 2, gcd(15, 25) = 5, gcd(15, 33) = 3, gcd(17, 4) = 1.
Note that since every number divides 0, gcd(a, 0) = |a| for any non-zero a.
If p is a prime number, then gcd(p, a) = 1 unless a is a multiple of p. A somewhat more
interesting notion is that of coprimality:
Definition 6.2
If a, b ∈ Z, we say that a and b are coprime or relatively prime if gcd(a, b) = 1.

Theorem 6.3

If a, b ∈ Z and gcd(a, b) = d, then there exist m, n ∈ Z such that ma + nb = d.

We do not yet have the tools to prove Theorem 6.3, but the result is worth stating now
for perspective. In some cases, the m, n guaranteed by the theorem are simple to see. For example,

• gcd(17, 4) = 1, and 17(1) + (4)(−4) = 1,


• gcd(15, 25) = 5 and 15(−3) + 25(2) = 5,
• gcd(15, 33) = 3 and 15(−2) + 33(1) = 3.

These were easy enough to do by inspection, but what if we are asked to find gcd(1053, 481)?
Can you see this by simple inspection? The answer is 13, but is there a systematic way of deducing
this answer? Furthermore, how do we find the m, n guaranteed by Theorem 6.3? Here the answer
is
13 = 481(46) + 1053(−21)
but there is no way we can see that just by inspection! These are some of the questions we will
eventually answer.
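In the meantime, the claimed values are easy to verify by machine. The sketch below checks the stated combination and then runs one common formulation of the extended Euclidean algorithm; this is our own illustration (including the function name `bezout`), not necessarily the method developed later in these notes, and it assumes non-negative inputs.

```python
from math import gcd

def bezout(a, b):
    """Return (d, m, n) with d = gcd(a, b) = m*a + n*b, for non-negative a and b."""
    old_r, r = a, b      # remainders
    old_m, m = 1, 0      # coefficients of a
    old_n, n = 0, 1      # coefficients of b
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_m, m = m, old_m - q * m
        old_n, n = n, old_n - q * n
    return old_r, old_m, old_n

print(gcd(1053, 481))                  # 13
print(481 * 46 + 1053 * (-21))         # 13, the combination quoted above
d, m, n = bezout(1053, 481)
print(d, m * 1053 + n * 481)           # 13 13
```

The coefficients returned by `bezout` need not match the pair quoted in the text, since the m, n of Theorem 6.3 are not unique.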

6.1.1 Prime Divisibility and its Implications

Number theory really is the study of primes, and those prime numbers have special properties when
it comes to divisibility.
Proposition 6.4

If a|bc and gcd(a, b) = 1 then a|c.

Solution. Since a|bc we know there is a k ∈ Z such that ak = bc. Furthermore, by Theorem 6.3 we
know that there exist m, n ∈ Z such that am + bn = 1, since we assumed that gcd(a, b) = 1.
Multiplying through by c we get
c = amc + bcn = amc + (ak)n = a(mc + kn)
showing that a|c. □


Theorem 6.5

An integer p > 1 is prime if and only if whenever p|ab then p|a or p|b.

Proof. (⇒) Assume that p is prime and that p|ab. If p|a we are done, so assume that p does not
divide a, in which case gcd(a, p) = 1. By Proposition 6.4, it then follows that p|b.
(⇐) Conversely, we will proceed by the contrapositive; that is, we will show that if p is not
prime, then there exist a, b such that p|ab but neither p|a nor p|b. Assume that p is not a prime, so
it is necessarily composite. We can thus write p = rs for 1 < r ≤ s < p. Hence p|rs but p divides
neither r nor s, since both are positive and strictly smaller than p.

This fact is equivalent to being a prime number, and is in fact the definition of prime in higher
mathematics. It is a straightforward induction proof to show that if p|a_1 a_2 ⋯ a_n then p|a_i for some
i ∈ {1, . . . , n}.
In fact, we have used this divisibility fact before, but not in an obvious way. Recall in Example
3.15 we showed that “n is even if and only if n^2 is even.” Since 2 is a prime number, and being
divisible by 2 is equivalent to being even, this is the statement that “2|n if and only if 2|n^2.” This
generalizes:
Proposition 6.6

If n ∈ Z and p is prime, then p|n if and only if p|n^2.

Proof. (⇒) Suppose that p|n so that pk = n for some k ∈ Z. Squaring n we get n^2 = p^2 k^2 = p(pk^2)
so p|n^2.
(⇐) Conversely, suppose that p|n^2. Since p is a prime, Theorem 6.5 (applied with a = b = n) shows that p|n.

We used the fact that n is even if and only if n^2 is even to show that √2 is irrational. The same
proof now applies to show that √p is irrational for any prime p.

Theorem 6.7

If p is a prime, then √p is irrational.

Proof. For the sake of contradiction, assume that √p is rational, and write √p = a/b where gcd(a, b) =
1; that is, a/b is in lowest terms. Multiplying both sides by b and squaring gives b^2 p = a^2.
Certainly p|b^2 p and so p|a^2, which in turn implies that p|a. Hence we can write a = pk for some
k ∈ Z. Substituting this into b^2 p = a^2 gives

b^2 p = p^2 k^2 ⇒ b^2 = pk^2.

Once again p|pk^2 so p|b^2, showing that p|b, hence b = pℓ for some ℓ ∈ Z. This is a contradiction,
since
1 = gcd(a, b) = gcd(pk, pℓ)



and the right-hand side is at least p. Thus √p is irrational.

Theorem 6.8: The Fundamental Theorem of Arithmetic

Every positive integer greater than 1 can be uniquely expressed as a product of primes.

Proof. We have already shown that every integer greater than 1 can be written as a product of primes,
so it only remains to show that this decomposition is unique. For the sake of contradiction, assume
that there exists some integer n > 1 which can be expressed with two different prime factorizations, say

n = p_1 ⋯ p_k = q_1 ⋯ q_m,

where all the p_i and q_j are prime. Since p_1|n, it must also be the case that p_1|q_1 ⋯ q_m. Since p_1 is
a prime, by Theorem 6.5 (extended to products of several factors) we must have p_1|q_j for some j.
By reordering if necessary, let this be q_1. Since q_1 is also prime, the only way p_1|q_1 is if
p_1 = q_1. Cancelling p_1 and q_1 thus gives

p_2 p_3 ⋯ p_k = q_2 q_3 ⋯ q_m.

We repeat the same argument above, deducing that p_2 = q_2, p_3 = q_3, and generally that p_i = q_i.
Hence k = m and, up to reordering the factors, the two factorizations agree, contradicting the
assumption that they were different.

Example 6.9

The number log_36(105) is irrational.

Solution. For the sake of contradiction, assume log_36(105) = p/q where p, q ∈ Z, q ≠ 0, and
gcd(p, q) = 1. By definition of the logarithm, 36^{p/q} = 105, or equivalently 36^p = 105^q.
Factoring 36 and 105 into primes gives

(2^2 · 3^2)^p = (3 · 5 · 7)^q ⇒ 2^{2p} 3^{2p} = 3^q 5^q 7^q.

By the Fundamental Theorem of Arithmetic, it must be the case that 2p = 0, 2p = q, q = 0,
for which the only solution is p = q = 0. However, the number 0/0 is not rational, so this is a
contradiction.

6.2 The Euclidean Algorithm

We now develop an algorithm for determining the greatest common divisor of two numbers, together
with the integer linear combination that achieves that number.
Proposition 6.10: The Division Algorithm

If a, b ∈ Z with b > 0, there exist unique q, r ∈ Z such that a = qb + r where 0 ≤ r < b.


Proof. Assume that both a, b > 0; the case a ≤ 0 follows similarly. We proceed by
strong induction on a. As a base case, note that when a = 1 then
a = b · 0 + 1 if b > 1,
a = b · 1 + 0 if b = 1,
so the base case holds. Assume then that for all 1 ≤ n ≤ a, we can write n = bq + r for some
unique q and r with 0 ≤ r < b, and consider a + 1. If a + 1 ≤ b then the result is trivial, so
assume that a + 1 > b. Now 1 ≤ a + 1 − b ≤ a, so by the induction hypothesis
a + 1 − b = bq + r
for some unique q and r, so a + 1 = b(q + 1) + r. It must also be the case that q + 1 and r are
unique, for otherwise q and r would not be unique.
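
As a quick illustration of Proposition 6.10, the sketch below uses Python's built-in `divmod`, which floors the quotient and therefore returns a remainder in the range 0 ≤ r < b whenever b > 0, including for negative a. The wrapper name `division` is our own choice.

```python
def division(a, b):
    """Return (q, r) with a == q*b + r and 0 <= r < b, assuming b > 0."""
    if b <= 0:
        raise ValueError("b must be positive")
    # divmod floors the quotient, so the remainder is non-negative when b > 0
    q, r = divmod(a, b)
    return q, r

for a in (17, -17, 3, 0):
    q, r = division(a, 5)
    print(f"{a} = {q}*5 + {r}")
```

Note that for a = −17 the quotient is −4 and not −3, precisely so that the remainder stays non-negative.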

Proposition 6.11

If a, b, q, r ∈ Z satisfy a = qb + r then gcd(a, b) = gcd(b, r).

Proof. Write r = a − qb. If a = b = 0 then necessarily r = 0 and so the result is trivially true.
Therefore, assume that not both of a and b are zero, and set d = gcd(a, b). Since d|a and d|b then
d|(a − qb) = r by Proposition 3.27, so d is a common divisor of b and r. Now we must show that d is
in fact the greatest common divisor of b and r.
If c is any other common divisor of b and r, then by Proposition 3.27 we have c|(bq + r) = a, so c
is a common divisor of a and b. Since d is the greatest common divisor of a and b, it must be the
case that c ≤ d, showing that d is the greatest common divisor of b and r as required.

Theorem 6.12: The Euclidean Algorithm

Let a, b ∈ Z with b ≠ 0 and assume that b does not divide a. Consider the following
algorithm:

a = q_1 b + r_1, where 0 < r_1 < |b|
b = q_2 r_1 + r_2, where 0 < r_2 < r_1
r_1 = q_3 r_2 + r_3, where 0 < r_3 < r_2
⋮
r_{n−2} = q_n r_{n−1} + r_n, where 0 < r_n < r_{n−1}
r_{n−1} = q_{n+1} r_n + 0.

This algorithm must terminate in finitely many steps, and moreover gcd(a, b) = r_n.

Proof. The remainders form a strictly decreasing sequence of non-negative integers,
|b| > r_1 > r_2 > r_3 > ⋯ ≥ 0, and so must eventually reach zero, showing that the algorithm must
eventually terminate. Furthermore, by repeatedly applying Proposition 6.11 we get
gcd(a, b) = gcd(b, r_1) = gcd(r_1, r_2) = gcd(r_2, r_3) = ⋯ = gcd(r_{n−1}, r_n) = gcd(r_n, 0) = r_n.


Example 6.13

Find the greatest common divisor of 504 and 1155.

Solution. Proceeding with our Euclidean algorithm, we have


1155 = 504(2) + 147
504 = 147(3) + 63
147 = 63(2) + 21
63 = 21(3) + 0
By the Euclidean algorithm, it must then be the case that gcd(1155, 504) = 21. 
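
The computation above can be reproduced with a few lines of Python. This is a minimal sketch assuming a, b > 0; the function names `euclid_steps` and `gcd_from_steps` are our own.

```python
def euclid_steps(a, b):
    """Return the rows (dividend, divisor, quotient, remainder) produced by
    the Euclidean algorithm, assuming a, b > 0."""
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, b, q, r))
        a, b = b, r          # gcd(a, b) = gcd(b, r) by Proposition 6.11
    return steps

def gcd_from_steps(steps):
    """The gcd is the divisor on the last row, i.e. the last non-zero remainder."""
    return steps[-1][1]

rows = euclid_steps(1155, 504)
for a, b, q, r in rows:
    print(f"{a} = {b}({q}) + {r}")
print("gcd =", gcd_from_steps(rows))
```

Keeping the whole list of rows, rather than only the final answer, is convenient later when the equations are needed for back-substitution.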

Another trick to finding the greatest common divisor of two numbers is to use their prime
factorizations. Let a, b ∈ Z and, allowing powers to be zero if necessary, write a and b with the
same primes
a = p_1^{n_1} p_2^{n_2} ⋯ p_k^{n_k},   b = p_1^{m_1} p_2^{m_2} ⋯ p_k^{m_k}.
The greatest common divisor of a and b must therefore be
d = gcd(a, b) = p_1^{min{n_1, m_1}} p_2^{min{n_2, m_2}} ⋯ p_k^{min{n_k, m_k}}.

Example 6.14

Find the greatest common divisor of 2^{100} and 100^2.

Solution. The prime factorization of 2^{100} is just 2^{100}, while for 100^2 we get
100^2 = (2^2 · 5^2)^2 = 2^4 · 5^4.
The greatest common divisor is thus
d = gcd(2^{100}, 100^2) = 2^{min{4,100}} 5^{min{0,4}} = 2^4 = 16.
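
The min-of-exponents recipe can be sketched in Python as follows. The factorization here uses naive trial division, which is only practical for numbers with small prime factors; the names `factorize` and `gcd_by_factoring` are our own.

```python
from collections import Counter

def factorize(n):
    """Prime factorization of n >= 1 by trial division, as {prime: exponent}."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:                # whatever remains is itself prime
        factors[n] += 1
    return factors

def gcd_by_factoring(a, b):
    """gcd(a, b) as the product of common primes raised to the smaller exponent."""
    fa, fb = factorize(a), factorize(b)
    d = 1
    for p in fa.keys() & fb.keys():   # primes missing from either side contribute p**0
        d *= p ** min(fa[p], fb[p])
    return d

print(gcd_by_factoring(2**100, 100**2))
print(gcd_by_factoring(504, 1155))
```

For large inputs the Euclidean algorithm is far more efficient, since factoring is hard in general while repeated division is cheap.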

6.3 Linear Diophantine Equations

Definition 6.15
Given a, b, d ∈ Z, a Linear Diophantine Equation (in two variables) is any equation of the
form
ax + by = d.

Certainly there are rational solutions to this equation, but we are more interested in finding
integer solutions.
Theorem 6.16

If a, b, c ∈ Z, then ax + by = c has an integer solution if and only if gcd(a, b)|c.


Proof. In both directions, let d = gcd(a, b).

(⇒) Suppose that the equation has a solution, say x_0 and y_0, so that ax_0 + by_0 = c. Necessarily,
d|a and d|b so d|(ax_0 + by_0) = c.
(⇐) Suppose that d|c so that dk = c for some k ∈ Z. Applying the Euclidean algorithm to a
and b there is a sequence
a = q_1 b + r_1,
b = q_2 r_1 + r_2,
r_1 = q_3 r_2 + r_3,
⋮
r_{n−2} = q_n r_{n−1} + d.

Working back up through these equations allows us to write d in terms of a and b, so there are
integers x_0, y_0 such that ax_0 + by_0 = d. Multiplying through by k we get
a(kx_0) + b(ky_0) = dk = c
as required.

Example 6.17

In Example 6.13 we showed that gcd(504, 1155) = 21. Find a solution to the Diophantine
equation 504x + 1155y = 42.

Solution. Since gcd(504, 1155) = 21|42 we know that a solution exists. We found in Example 6.13
that
1155 = 504(2) + 147
504 = 147(3) + 63
147 = 63(2) + 21
Write the last line as 21 = 147 + 63(−2). We now solve each remaining equation for the remainder
to find 63 = 504 + 147(−3) and 147 = 1155 + 504(−2). Substituting these we get
21 = 147 + 63(−2)
= 147 + [504 + 147(−3)](−2) from 63 = 504 + 147(−3)
= 147(7) + 504(−2)
= [1155 + 504(−2)](7) + 504(−2) from 147 = 1155 + 504(−2)
= 1155(7) + 504(−16).
This is not the desired solution though, so we multiply through by 2 to get
504(−32) + 1155(14) = 42
as required. 
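
The back-substitution can be automated. The sketch below is one standard formulation of the extended Euclidean algorithm, which tracks the coefficients alongside the remainders instead of working backwards afterwards; it is an illustration assuming a, b ≥ 0, and the name `extended_gcd` is our own.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) and a*x + b*y == d, for a, b >= 0."""
    # Invariants: old_r == a*old_x + b*old_y  and  r == a*x + b*y
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

d, x, y = extended_gcd(504, 1155)
k = 42 // d                      # 42 is a multiple of d, so k is an integer
print(d, (k * x, k * y), 504 * k * x + 1155 * k * y)
```

The pair returned need not be the only one: any particular solution can be shifted to give others.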


There are in fact infinitely many solutions to a solvable Diophantine equation. How do we find
them in general?
Proposition 6.18

Suppose that a, b, c ∈ Z and (x_0, y_0) satisfy ax_0 + by_0 = c. If d = gcd(a, b) ≠ 0 then the
general solution to the Diophantine equation ax + by = c is given by
x = x_0 + n(b/d),   y = y_0 − n(a/d),   for all n ∈ Z.
Note that it does not matter whether you put the minus sign in the x term or the y term,
so long as they are opposite signs.

Proof. Subtract the equations ax_0 + by_0 = c and ax + by = c to find that a(x − x_0) + b(y − y_0) = 0.
Divide through by d and re-arrange to find that
(a/d)(x − x_0) = −(b/d)(y − y_0).
The student can show that gcd(a/d, b/d) = 1 (Good exercise!). Moreover, a/d divides the left hand
side, and so must also divide the right hand side. Since gcd(a/d, b/d) = 1, then by Proposition
6.4 we know that (a/d)|(y − y_0). Similarly, (b/d)|(x − x_0), so there exists n ∈ Z such that
x − x_0 = n(b/d) so that
−(b/d)(y − y_0) = (a/d)(x − x_0) = n · ab/d^2 ⇒ bd(y_0 − y) = anb.
Solving these two equations for x and y, we find that
x = x_0 + n(b/d),   y = y_0 − n(a/d).
Now in fact every such n works, since
ax + by = a(x_0 + n(b/d)) + b(y_0 − n(a/d)) = (ax_0 + by_0) + (anb/d − anb/d) = ax_0 + by_0 = c.

Example 6.19

Find the general solutions to 504x + 1155y = 42.

Solution. We have already found a particular solution; namely, 504(−32) + 1155(14) = 42.
Furthermore, d = gcd(504, 1155) = 21 and so the general solutions are of the form
x = −32 + n(1155/21) = −32 + 55n,   y = 14 − n(504/21) = 14 − 24n.
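
The following Python sketch generates a few members of this family and checks that each one really solves the equation. The helper name `general_solution` is our own.

```python
from math import gcd

def general_solution(a, b, x0, y0, n):
    """n-th solution of a*x + b*y = c, given one particular solution (x0, y0)."""
    d = gcd(a, b)
    return x0 + n * (b // d), y0 - n * (a // d)

a, b, c = 504, 1155, 42
for n in range(-2, 3):
    x, y = general_solution(a, b, -32, 14, n)
    print(n, (x, y), a * x + b * y == c)
```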

It is sometimes easier to normalize by the greatest common divisor before starting. Take
our carry-through example, where we want to solve 504x + 1155y = 42. Dividing through by the
greatest common divisor d = gcd(504, 1155) = 21 we get the equation
24x + 55y = 2.


Since gcd(24, 55) = 1 we can find a solution to 24x + 55y = 1, then multiply by 2. To do this, we
again use the Euclidean algorithm to find

55 = 24(2) + 7
24 = 7(3) + 3
7 = 3(2) + 1
3 = 1(3) + 0

(where of course we already knew the greatest common divisor is 1). Now working backwards

1 = 7 + 3(−2)
= 7 + [24 + 7(−3)](−2)
= 7(7) + 24(−2)
= [55 + 24(−2)](7) + 24(−2)
= 55(7) + 24(−16).

Multiplying by 2 gives us our solution 24(−32) + 55(14) = 2. Notice that (−32, 14) are the same
as the solutions we found above, as should be the case. Moreover, the general solutions are easily
read off as x = −32 + 55n and y = 14 − 24n, which is again the same solution we found above.
Example 6.20

Find all non-negative solutions to 504x + 1155y = 42, if they exist.

Solution. We know that the general solutions are of the form x = −32 + 55n and y = 14 − 24n.
We require both numbers to be non-negative, giving −32 + 55n ≥ 0 and 14 − 24n ≥ 0, which we can
solve for n to get
32/55 ≤ n ≤ 14/24.
There are no integers which satisfy this inequality, so there are no non-negative solutions.
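
The same bounding argument can be written as a short Python sketch. It assumes a, b > 0 and uses only integer arithmetic to compute the smallest and largest admissible n; the name `nonnegative_solutions` is our own.

```python
from math import gcd

def nonnegative_solutions(a, b, x0, y0):
    """All solutions (x, y) with x, y >= 0 in the family through (x0, y0),
    assuming a, b > 0."""
    d = gcd(a, b)
    step_x, step_y = b // d, a // d
    n_min = -(x0 // step_x)      # smallest n with x0 + n*step_x >= 0
    n_max = y0 // step_y         # largest n with y0 - n*step_y >= 0
    return [(x0 + n * step_x, y0 - n * step_y) for n in range(n_min, n_max + 1)]

print(nonnegative_solutions(504, 1155, -32, 14))   # the equation from the example
print(nonnegative_solutions(2, 3, 6, 0))           # 2x + 3y = 12, for comparison
```

When n_min exceeds n_max the range is empty, which is exactly the situation in the example.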

6.4 Relations on Sets

Given a set S, a relation on S is a way of comparing two elements of the set, or rather, specifying
their relationship. For example, equality is a relation, for when we write a = b we are specifying
a relationship between a and b. Similarly, a < b is a relation. Abstractly, if a, b ∈ S we write the
relation as aRb or R(a, b). This describes a heuristic, but is not a formal definition, which is the
following:
Definition 6.21
Given a set S, a relation on S is any subset S_R ⊆ S × S.

This seems pretty nebulous, but the subset precisely captures when elements are related. For
example, if S = Z and R is the relation <, then S_R = {(a, b) ∈ Z × Z : a < b}. Here we see that
(−2, 5) ∈ S_R since −2 < 5, while (15, 14) ∉ S_R since 15 ≮ 14.


Given a relation R as in the first paragraph, we can define the set S_R as follows: Define a
function r : S × S → {0, 1} such that
r(a, b) = 1 if aRb,
r(a, b) = 0 otherwise,

and set S_R = r^{−1}(1). Conversely, given a set S_R we say aRb if (a, b) ∈ S_R.
To distinguish between different types of relations, we define properties that the relation can
exhibit. If R is a relation on S then

• R is reflexive if for all a, aRa,

• R is transitive if whenever aRb and bRc then aRc,

• R is symmetric if whenever aRb then bRa,

• R is anti-symmetric if whenever aRb and bRa then a = b,

• R is total if for all a and b either aRb or bRa.

For example, equality a = b is reflexive (a = a is always true), transitive (if a = b and b = c
then a = c), and symmetric (when a = b then b = a), but not total. On the other hand, inequality
a ≤ b is reflexive (a ≤ a is always true), transitive (if a ≤ b and b ≤ c then a ≤ c), anti-symmetric
(a ≤ b and b ≤ a implies a = b), and total (either a ≤ b or b ≤ a).
Definition 6.22
A relation R on a set S is said to be an equivalence relation if R is reflexive, transitive, and
symmetric. It is said to be an order relation if R is reflexive, transitive, and anti-symmetric.

Example 6.23

Let X be a set, and define a relation on the power set P(X) by subset inclusion; namely
ARB if A ⊆ B. Determine which of the five previous properties are satisfied by this relation.

Solution. This relation is reflexive, since A ⊆ A is always true. It is transitive, since if A ⊆ B and
B ⊆ C then A ⊆ C. Finally, it is also anti-symmetric, for if A ⊆ B and B ⊆ A then A = B. Hence
subset inclusion is an order relation.
However, notice that inclusion is not total. Given two arbitrary subsets A, B ⊆ X, they need
not be related. It could be the case that A ⊆ B or B ⊆ A, but in general neither need
be true. We say that subset inclusion is a partial ordering.
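
On a finite set, the five properties can be checked by brute force. The sketch below does this for subset inclusion on the power set of X = {1, 2, 3}; the choice of X and the helper names `powerset` and `properties` are our own, and a relation is represented as a two-argument predicate.

```python
from itertools import combinations, product

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for k in range(len(xs) + 1) for c in combinations(xs, k)]

def properties(S, rel):
    """Check the five properties of the relation rel on the finite set S."""
    S = list(S)
    pairs = list(product(S, repeat=2))
    return {
        "reflexive": all(rel(a, a) for a in S),
        "transitive": all(rel(a, c) for a, b, c in product(S, repeat=3)
                          if rel(a, b) and rel(b, c)),
        "symmetric": all(rel(b, a) for a, b in pairs if rel(a, b)),
        "anti-symmetric": all(a == b for a, b in pairs if rel(a, b) and rel(b, a)),
        "total": all(rel(a, b) or rel(b, a) for a, b in pairs),
    }

subsets = powerset({1, 2, 3})
print(properties(subsets, lambda A, B: A <= B))   # <= on frozensets is subset inclusion
```

A finite check like this is evidence rather than a proof for general X, but it is a useful sanity test when deciding which properties to try to prove.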

Example 6.24

Show that divisibility on N is a partial order relation, though not on Z.


Solution. We need to check that divisibility is transitive, reflexive, and anti-symmetric, but
non-total. Being reflexive is immediate, since a|a for all a ∈ Z. Transitivity is true, for we have
shown that if a|b and b|c then a|c.
Anti-symmetry requires a bit of work, but isn’t too bad. Assume that a|b and b|a. If either is
zero, say a = 0, then the relation 0|b can only be true if b = 0, so a = b. Thus assume that neither
a nor b is zero, so there exist k, ℓ ∈ N such that ak = b and bℓ = a. Multiplying the first equation
by ℓ we get

akℓ = bℓ = a ⇒ a(kℓ − 1) = 0.

Since a ≠ 0 we must have kℓ = 1, which implies that k = ℓ = ±1. Since k, ℓ ∈ N, it must be
that k = ℓ = 1 so a = b. On the other hand, notice that if we are taking divisibility in Z, then
k = ℓ = −1 is also a possibility, giving a = −b; for instance 2|(−2) and (−2)|2 but 2 ≠ −2, so
anti-symmetry fails on Z. Finally, divisibility is not total, since for example neither 2|3 nor 3|2.

Example 6.25

Let S = R and let f : R → R be any function. Define a relation on R by saying that a ≅ b
if f(a) = f(b). Show that ≅ is an equivalence relation.

Solution. We need to show that ≅ is reflexive, transitive, and symmetric.

• [Reflexive] Let a ∈ R. Since functions have unique outputs, f(a) = f(a) showing that a ≅ a.

• [Transitive] Assume that a ≅ b and b ≅ c, so that f(a) = f(b) and f(b) = f(c). But then
f(a) = f(b) = f(c), so a ≅ c.

• [Symmetric] If a ≅ c then f(a) = f(c), or equivalently f(c) = f(a), showing that c ≅ a.

Definition 6.26
Given an equivalence relation ≅ on a set S, the equivalence class of an element a ∈ S is the
set
[a] = {x ∈ S : x ≅ a}.

If ∼ is an equivalence relation on S, the set of equivalence classes in S is sometimes denoted S/∼.


Example 6.27

Define an equivalence relation on R by saying that a ∼ b if a − b ∈ Z. What do elements of
an equivalence class look like?

Solution. Let’s start with a simple example, and look at the equivalence class [0]. By definition,

[0] = {x ∈ R : x ∼ 0} = {x ∈ R : x − 0 ∈ Z} = Z

so the equivalence class of 0 is precisely the integers. What about something like [1.5]?

[1.5] = {x ∈ R : x ∼ 1.5} = {x ∈ R : x − 1.5 ∈ Z}


So when will x − 1.5 be an integer? Precisely when the fractional part x − ⌊x⌋ of x is 0.5; for
example, 4.5 − 1.5 ∈ Z and −2.5 − 1.5 ∈ Z.
In general, the equivalence class [x] consists of all those real numbers that have the same
fractional part as x.

Exercise: Show that ∼ is an equivalence relation.

Theorem 6.28

If ∼ is an equivalence relation on S, then ∼ partitions S into disjoint equivalence classes;
that is, every element x ∈ S belongs to a unique equivalence class.

Proof. Certainly every element belongs to an equivalence class, since x ∈ [x] by reflexivity, so we
need only show that two distinct equivalence classes have empty intersection. Let [x] and [y] be
distinct equivalence classes, and for the sake of contradiction assume that [x] ∩ [y] ≠ ∅. Choose
z ∈ [x] ∩ [y], so that z ∼ x and z ∼ y. By symmetry and transitivity, x ∼ y, and hence [x] = [y],
which contradicts the fact that these were distinct equivalence classes. We conclude that distinct
equivalence classes are disjoint.
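
For a finite set the partition can be computed directly. The sketch below groups elements by comparing each one with a single representative of every class found so far, which is justified by transitivity and symmetry. As an illustration it uses the relation of Example 6.25 with the hypothetical choice f(a) = a^2 on the integers from −3 to 3; the name `equivalence_classes` is our own.

```python
def equivalence_classes(S, related):
    """Partition the finite collection S into classes of the equivalence
    relation `related` (a two-argument predicate)."""
    classes = []
    for x in S:
        for cls in classes:
            if related(x, cls[0]):   # one representative per class suffices
                cls.append(x)
                break
        else:
            classes.append([x])      # x starts a new class
    return classes

def f(a):
    return a * a

classes = equivalence_classes(range(-3, 4), lambda a, b: f(a) == f(b))
print(classes)
```

Every element lands in exactly one list, which is the content of Theorem 6.28 in this small case.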

6.5 Modular Arithmetic

Definition 6.29
If n ∈ N and a, b ∈ Z, we say that a ≡ b (mod n) (read: a is congruent to b mod n) if
n|(b − a).

For example, if n = 4 then 1 ≡ 29 (mod 4), since 4|(29 − 1) = 28. The congruence classes are
precisely
[0] = {. . . , −12, −8, −4, 0, 4, 8, 12, . . .}
[1] = {. . . , −11, −7, −3, 1, 5, 9, 13, . . .}
[2] = {. . . , −10, −6, −2, 2, 6, 10, 14, . . .}
[3] = {. . . , −9, −5, −1, 3, 7, 11, 15, . . .}

Proposition 6.30

Congruence mod n is an equivalence relation.

Proof. We need to show that congruence is reflexive, transitive, and symmetric.

• [Reflexive] Fix an a ∈ Z. Note that a − a = 0 and n|0, so a ≡ a (mod n).
• [Transitive] Suppose that a ≡ b (mod n) and b ≡ c (mod n), or equivalently n|(b − a) and
n|(c − b). Now
(c − a) = (c − b + b − a) = (c − b) + (b − a)


and since n divides each of the above terms, by Proposition 3.27, we know it divides the sum as
well. Thus n|(c − a), or a ≡ c (mod n).

• [Symmetric] Suppose that a ≡ b (mod n), so that n|(b − a). Then n|(a − b) = −(b − a), so
b ≡ a (mod n) as required.

Proposition 6.31

Fix an n ∈ N. If a ≡ r (mod n) and b ≡ s (mod n) then

1. (a + b) ≡ (r + s) (mod n).

2. ab ≡ rs (mod n)

Proof. Fix a, b, r, s and assume that a ≡ r (mod n) and b ≡ s (mod n). Hence n|(r − a) and
n|(s − b). Equivalently, there exist k, ℓ ∈ Z such that nk = r − a and nℓ = s − b.

1. Summing together our two equations above, we get

n(k + ℓ) = (r + s) − (a + b)

so n|[(r + s) − (a + b)] or equivalently, (a + b) ≡ (r + s) (mod n).

2. We can write

rs − ab = rs − rb + rb − ab = r(s − b) + (r − a)b = nℓr + nkb = n(ℓr + kb)

so n|(rs − ab) or equivalently, ab ≡ rs (mod n).
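
Proposition 6.31 can be spot-checked numerically. The sketch below tests both claims for every admissible quadruple drawn from a small range, with the modulus n = 7 chosen arbitrarily; the helper name `congruent` is our own.

```python
from itertools import product

def congruent(a, b, n):
    """a ≡ b (mod n), i.e. n divides b - a."""
    return (b - a) % n == 0

n = 7
values = range(-10, 11)
ok = all(
    congruent(a + b, r + s, n) and congruent(a * b, r * s, n)
    for a, r, b, s in product(values, repeat=4)
    if congruent(a, r, n) and congruent(b, s, n)
)
print(ok)
```

This is why reducing intermediate results modulo n never changes the final residue of a sum or product.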

For a fixed n ∈ N we define the congruence class of a ∈ Z to be the equivalence class

[a] = {x ∈ Z : x ≡ a (mod n)},

of which there are precisely n; namely [0], [1], [2], . . . , [n − 1]. We often denote this set of
equivalence classes by
Z_n = Z/nZ = {[0]_n, [1]_n, . . . , [n − 1]_n}.
According to Proposition 6.31, we can add and multiply congruence classes exactly as we would
integers, as long as we reduce modulo n. So for example, working modulo 4 we have

[3]_4 + [2]_4 = [5]_4 = [1]_4,   [3]_4 · [2]_4 = [6]_4 = [2]_4.

Exercise: Show that Z/pZ is a field if and only if p is a prime.

Example 6.32

What is the last digit in 4^{441}?


Solution. Note that the last digit d satisfies d ≡ 4^{441} (mod 10). Moreover,

4^{441} = (4^3)^{147} = 64^{147} ≡ 4^{147} (mod 10).

The same trick again gives

4^{147} = 64^{49} ≡ 4^{49} (mod 10).
Here we need to be a bit more clever. Rather than try to decompose this into 49 = 7 · 7, let's write
49 = 1 + 48 so that

4^{49} = 4 · 4^{48} = 4 · 64^{16} ≡ 4 · 4^{16} (mod 10)
≡ 4 · 256^4 (mod 10) ≡ 4 · 6^4 (mod 10)
≡ 4 · 36^2 (mod 10) ≡ 4 · 6^2 (mod 10)
≡ 4 · 6 (mod 10) ≡ 4 (mod 10).

Hence the last digit of 4^{441} is 4.
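
The idea of reducing at every step is what makes large powers tractable on a computer as well. The sketch below implements repeated squaring, a standard technique that differs from the hand computation above but relies on the same fact (Proposition 6.31), and compares the result with Python's built-in three-argument `pow`. The name `power_mod` is our own, and the sketch assumes exponent ≥ 0 and modulus > 1.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent % 2 == 1:                 # current binary digit is 1
            result = (result * base) % modulus
        base = (base * base) % modulus        # square for the next binary digit
        exponent //= 2
    return result

print(power_mod(4, 441, 10), pow(4, 441, 10))
```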

Example 6.33

Show that if a^2 − 2 is not divisible by 7, then a − 4 is not divisible by 7.

Solution. By contrapositive, assume that a − 4 is divisible by 7; that is, a ≡ 4 (mod 7). Then

a^2 ≡ 16 ≡ 2 (mod 7) ⇒ a^2 − 2 ≡ 0 (mod 7)

showing that a^2 − 2 is divisible by 7.
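
Because both conditions depend only on the residue of a modulo 7, the statement can also be confirmed by checking the seven residues 0, 1, . . . , 6. The following sketch does exactly that; the function name `implication_holds` is our own.

```python
def implication_holds(a):
    """If 7 does not divide a^2 - 2, then 7 does not divide a - 4."""
    premise = (a * a - 2) % 7 != 0
    conclusion = (a - 4) % 7 != 0
    return (not premise) or conclusion

# By Proposition 6.31 it is enough to test one representative per class mod 7.
print(all(implication_holds(a) for a in range(7)))
```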
