Mat 102
© Tyler Holden, 2016-
Contents
1 Motivating Problems
2 Mathematical Infrastructure
  2.1 Quadratic Formula
  2.2 Bounding Arguments
    2.2.1 Inequalities
    2.2.2 The Arithmetic-Geometric Mean Inequality
    2.2.3 Absolute Values
  2.3 Sets and Set Building
    2.3.1 Relations on Sets
    2.3.2 Operations on Sets
    2.3.3 Functions Between Sets
    2.3.4 Properties of Functions
  2.4 Ordered Fields
    2.4.1 The Field Axioms
    2.4.2 Ordered Fields
    2.4.3 Complete Fields
3 Mathematical Logic
  3.1 Mathematical Predicates
  3.2 Universal and Existential Quantifiers
  3.3 And, Or, Not
    3.3.1 Negating Quantifiers
  3.4 Implications
    3.4.1 Negating an Implication
  3.5 Contradiction (Reductio ad absurdum)
  3.6 A Rigmarole of Random Results
    3.6.1 Some Number Theory
4 Induction
  4.1 Summation and Product Notation
  4.2 More General Induction
    4.2.1 Recursion
  4.3 Fallacies
1 Motivating Problems
The kinds of questions we will be considering in this course are not those amenable to rote memorization
or procedural algorithms, with which secondary school is replete. We will instead
be looking at questions that require you to think critically, to use knowledge you have already
acquired, and to apply it to solve a problem you have never seen before.
Critical thinking is hard! Do not be discouraged if you cannot do it at first. Like playing
a musical instrument or learning a sport, it is something that can be learned with practice and
dedication.
To this end, let’s look at some famous mathematical problems.
Example 1.1
Suppose that a standard 8 × 8 chessboard has two diagonally opposite corners removed. Is
it possible to tile the chessboard with dominoes? More precisely, given 31 dominoes of size
2 × 1, is it possible to cover the chessboard in dominoes such that no two overlap?
Figure 1: A chessboard with two diagonal pieces removed. Notice that by necessity, those two
pieces are of the same color.
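A quick computation can make the remark in the caption concrete. The sketch below is only an illustration: the function name `colour_counts` and the convention of colouring square (row, col) by the parity of row + col are assumptions introduced here, not part of the notes. It counts how many squares of each colour remain once two diagonally opposite corners are removed.

```python
def colour_counts(removed):
    """Count the remaining squares of each colour on an 8 x 8 board.

    Assumption: square (row, col) has colour (row + col) % 2, which
    reproduces the usual alternating chessboard pattern.
    """
    counts = {0: 0, 1: 0}
    for row in range(8):
        for col in range(8):
            if (row, col) not in removed:
                counts[(row + col) % 2] += 1
    return counts


# Remove two diagonally opposite corners; both have the same colour.
print(colour_counts({(0, 0), (7, 7)}))  # {0: 30, 1: 32}
```

Since a 2 × 1 domino always covers two adjacent squares, and adjacent squares have opposite colours, the two counts are worth comparing against what 31 dominoes could cover.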
Another topic of great appeal to the layman is the notion of infinity. Did you know that there
are many different kinds of infinity? In fact, there are infinitely many different kinds of infinity,
with no infinity being the largest infinity. However, the infinity which enumerates the infinities is
larger than any infinity which it enumerates. That’s confusing eh?
Let’s start with a more reasonable example:
Example 1.2
Example 1.3
One day Gauss’ teacher asked his class to add together all the numbers from 1 to 100,
assuming that this task would occupy them for quite a while. He was shocked when young
Gauss, after a few seconds' thought, wrote down the answer 5050.
More generally, what are the sums
\[ 1 + 2 + 3 + 4 + \cdots + (n - 1) + n \]
and
\[ 1^2 + 2^2 + 3^2 + 4^2 + \cdots + (n - 1)^2 + n^2 \, ? \]
We will be able to answer all of these questions and more by the end of this course.
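Before proving anything, it can help to experiment. The following sketch adds the terms one by one and compares the totals with the well-known closed forms n(n + 1)/2 and n(n + 1)(2n + 1)/6. Those formulas are quoted here without proof, as an assumption to be tested numerically; the function names are hypothetical.

```python
def sum_first(n):
    """Add 1 + 2 + ... + n term by term."""
    return sum(range(1, n + 1))


def sum_squares(n):
    """Add 1^2 + 2^2 + ... + n^2 term by term."""
    return sum(k * k for k in range(1, n + 1))


print(sum_first(100))  # 5050, the answer Gauss wrote down

# Compare with the closed forms for many values of n.
for n in range(1, 200):
    assert sum_first(n) == n * (n + 1) // 2
    assert sum_squares(n) == n * (n + 1) * (2 * n + 1) // 6
```

Agreement for finitely many n is evidence, not a proof; it only tells us which statement is worth trying to prove.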
2 Mathematical Infrastructure
In order to discuss proofs, we will need raw materials, things like numbers, functions, and sets.
You should already be familiar with some of the basics, but here we will introduce these items in
a little more depth.
2.1 Quadratic Formula
The quadratic formula represents an interesting starting point. Consider the equation
\[ ax^2 + bx + c = 0 \]
for constants a, b, c with a ≠ 0. The student is familiar with the famous quadratic formula, which tells us that
the solutions to this equation are given by
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]
But as far as most of you are concerned, this is some mysterious quantity that your high school
teachers materialized out of thin air. So where does it come from?
The answer is really quite simple, although there is some tricky algebra to be done. We are
taught when we are younger how to “complete the square,” which is to convert
This is useful for graphing the quadratic, or maybe determining the apex of the corresponding
Well now, that looks pretty darn similar to the quadratic formula, except that we need to write
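α, β, γ in terms of a, b, c. So let’s complete the square on ax² + bx + c, where we find that
\begin{align*}
ax^2 + bx + c &= a\left(x^2 + \frac{b}{a}x\right) + c && \text{factor out the } a \\
&= a\left(x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} - \frac{b^2}{4a^2}\right) + c && \text{squaring half the coefficient of } x \\
&= a\left(x^2 + \frac{b}{a}x + \frac{b^2}{4a^2}\right) - \frac{b^2}{4a} + c && \text{pulling the } b^2/(4a^2) \text{ term out} \\
&= \underbrace{a}_{\alpha}\Bigl(x + \underbrace{\frac{b}{2a}}_{\beta}\Bigr)^2 + \underbrace{\frac{4ac - b^2}{4a}}_{\gamma}.
\end{align*}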
Figure 2: The various graphs of the parabola ax² + bx + c depending on the value of the discriminant
D. There are 1 + sgn(D) roots of the parabola.
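The completed-square form and the root count can be checked numerically. This is a minimal sketch, assuming real coefficients with a ≠ 0 and taking D = b² − 4ac as the discriminant; the function names `complete_square` and `real_roots` are hypothetical.

```python
import math


def complete_square(a, b, c):
    """Return (alpha, beta, gamma) with
    a*x**2 + b*x + c == alpha*(x + beta)**2 + gamma."""
    if a == 0:
        raise ValueError("a must be non-zero")
    return a, b / (2 * a), (4 * a * c - b * b) / (4 * a)


def real_roots(a, b, c):
    """Real solutions of a*x**2 + b*x + c = 0, in increasing order."""
    d = b * b - 4 * a * c  # the discriminant D
    if d < 0:
        return []
    if d == 0:
        return [-b / (2 * a)]
    r = math.sqrt(d)
    return sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])


print(complete_square(1, -2, -3))  # (1, -1.0, -4.0)
print(real_roots(1, -2, -3))       # [-1.0, 3.0]  (D > 0: two roots)
print(real_roots(1, 2, 1))         # [-1.0]       (D = 0: one root)
print(real_roots(1, 0, 1))         # []           (D < 0: no real roots)
```

The three calls to `real_roots` return 2, 1 and 0 roots respectively, in line with the count 1 + sgn(D) from the caption.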
2.2 Bounding Arguments
To discuss the notion of length, we need to be able to compare relative sizes. This leads us to the
notion of inequalities. For example, we know that 2 < 4 or that −5 ≤ 0, or even that e < π. This
is known as the total ordering of the real numbers, which we will discuss in more detail later.¹
2.2.1 Inequalities
This becomes significantly more complicated when we are determining more general rules; that is,
rules that hold when we are unable to use specific numbers. You may take the following as axioms
(though in reality they need to be proven):
Proposition 2.1
2. a² ≥ 0
Proposition 2.1 (3) in particular is difficult to prove, and requires something called the
Completeness Axiom.
Exercise: How does Proposition 2.1 change if the less-than-or-equal signs are changed to
less-than signs?
Proof. Note that (a − b)² ≥ 0 by property (2). Expanding the square gives
\[ (a - b)^2 \ge 0 \iff a^2 - 2ab + b^2 \ge 0 \iff a^2 + b^2 \ge 2ab, \]
Example 2.3
¹ The proper definition of the inequality is as follows: We say that a > b if a − b is a positive number. Using this,
try to prove the facts about inequalities, for example Proposition 2.1 (1) and (4).
We need to find some way of using the fact that b + c ≥ 2. Notice that by clever factoring, we can
write
\[ 2ab + 2ac + 2bc = 2a(b + c) + 2bc \ge 4a + 2bc, \]
so that
\[ (a + b + c)^2 \ge a^2 + b^2 + c^2 + 4a + 2bc. \]
We must somehow convert the a² + b² + c² + 2bc into something that looks like 4bc. By Proposition
2.2, we know that b² + c² ≥ 2bc so
\[ a^2 + b^2 + c^2 + 2bc \ge a^2 + 4bc \ge 4bc, \]
where in the last inequality we have used the fact that a² ≥ 0. Putting this all together,
\[ (a + b + c)^2 \ge 4a + 4bc \]
exactly as required.
Proposition 2.4
If a, b are real numbers such that 0 < a < b, then a² < ab < b² and 0 < √a < √b.
Proof. Let’s start by showing that a² < ab < b². We know that 0 < a < b and since a > 0 we can
multiply through by a, preserving the inequality, to get
\[ 0 < a^2 < ab. \]
Similarly, since b > 0 we can multiply into 0 < a < b and preserve the inequality, to get
\[ 0 < ab < b^2. \]
2.2.2 The Arithmetic-Geometric Mean Inequality
One of the more famous inequalities in mathematics is the Arithmetic-Geometric Mean Inequality.
In order to discuss it, we must remind ourselves of the arithmetic mean and the geometric mean.
Definition 2.5
If a, b are two real numbers, then the
\[ \text{Arithmetic Mean is } \frac{a + b}{2}, \qquad \text{Geometric Mean is } \sqrt{ab}. \]
The arithmetic mean is usually what is meant when we talk about an average. However, there
are cases in which the geometric mean can also be interpreted as an average. For example, say that
you have an investment of $100 which grows at a rate of 5% the first year and 8% in the second
year. After two years the value of your investment is
$100 × (1.05) × (1.08) = $100 × (1.134) = $113.40
It is often more convenient to discuss such an investment in terms of its effective annual rate, which
is the hypothetical fixed rate at which your investment would have accrued the same final value. If the
corresponding annual growth factor is r, then we need to solve
\[ r \times r = 1.134 \quad\Rightarrow\quad r = \sqrt{1.05 \times 1.08} \approx 1.065. \]
The number 1.065 is precisely the geometric mean of 1.05 and 1.08.
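The arithmetic in the investment example is easy to reproduce. The sketch below is an illustration under the same numbers ($100, growth of 5% then 8%); the helper `geometric_mean` is a hypothetical name, and `math.prod` requires Python 3.8 or later.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive numbers."""
    return math.prod(values) ** (1 / len(values))


factors = [1.05, 1.08]                  # yearly growth factors
final_value = 100 * math.prod(factors)  # value after two years
r = geometric_mean(factors)             # effective annual growth factor

print(round(final_value, 2))  # 113.4
print(round(r, 3))            # 1.065
# Growing at the fixed factor r for two years gives the same final value.
assert abs(100 * r * r - final_value) < 1e-9
```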
Given a collection of n numbers a_1, . . . , a_n, we know their arithmetic mean is given by
\[ \frac{a_1 + a_2 + \cdots + a_n}{n}, \]
while the geometric mean of that same group of numbers is given by
\[ \sqrt[n]{a_1 a_2 \cdots a_n}. \]
Proof. To present the proof directly would likely lead to confusion, since at some point it will
appear as though we arbitrarily add a term. Instead, let’s work backwards to see if we can reduce
our inequality to a similar statement. If our inequality holds then
\[ ab \le \left(\frac{a + b}{2}\right)^2 \iff 4ab \le a^2 + b^2 + 2ab \iff 2ab \le a^2 + b^2, \]
and we know that this last inequality is correct by Proposition 2.2. A proper proof would now consist
of tracing back through this chain of equivalences to arrive at the desired result.
Warning!
What I did in the proof above was create a chain of logically equivalent statements,
eventually arriving at a result which I knew to be true, thus implying that every statement
in the chain is true. I did not assume that the first inequality was true!
Students are often tempted to prove inequalities in a similar fashion, but fail to ensure
that every statement is logically equivalent, or beg the question by assuming that the
end result is true. For example, if you complete a proof by concluding 0 = 0, then your
proof is almost certainly wrong.
Here’s an interesting alternate proof. Fix two positive real numbers a, b and construct a semicircle
whose diameter is a + b, as shown in Figure 3. Let h be the length of the perpendicular segment rising
from the meeting point of the line segments of length a and b along the diameter up to the semicircle.
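[Figure 3: A semicircle over the segment AB, which is split at the point D into pieces of length a and b; the perpendicular at D has height h.]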
As an angle inscribed in a semicircle, ∠ACB is a right angle. This in turn implies that triangle △ADC is
similar to triangle △CDB. As these are similar, the ratios of their side-lengths are equal; namely,
\[ \frac{a}{h} = \frac{h}{b} \quad\Rightarrow\quad h^2 = ab \quad\Rightarrow\quad h = \sqrt{ab}. \]
Thus the height h is precisely the geometric mean. Compare this to the red line, whose length is
the radius (a + b)/2. Note that by construction, h will always be shorter than the radius, and they
will be equal precisely when a = b.
Example 2.7
A farmer is given 120 metres of fencing and wishes to make a rectangular pasture which
encloses the maximum amount of area. Show that the largest such pasture is obtained with
a square.
Figure 4: The farmer is making a rectangular pasture, but has a fixed amount of fencing. The rectangle
has side lengths a and b, area ab, and perimeter 2a + 2b.
Solution. Consider a rectangle with side lengths a and b, as shown in Figure 4. The perimeter of this
rectangle is the fence, of which the farmer has 120 metres, so 120 = 2a + 2b. The area is ab, which
seems to work well with the AM-GM, so were we to plug this in directly we would get
\[ ab \le \left(\frac{a + b}{2}\right)^2. \]
But we know 120 = 2a + 2b, so dividing everything by 4 gives (a + b)/2 = 30, and so
\[ ab \le \left(\frac{a + b}{2}\right)^2 = 30^2 = 900. \]
At this point, we only know that 900 is an upper bound for the area: It could be the case that the
true upper bound is some smaller number. One of the powerful aspects of the AM-GM is that it
gives a condition on the inequality to be saturated; namely, when a = b. Setting a = b gives us
a square, and if we want to solve for the precise values of a and b, we have 120 = 2a + 2b = 4a
showing that a = b = 30.
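The conclusion can be sanity-checked by brute force. The sketch below is an illustration only: it tries side lengths on a grid (an assumption, since the true problem allows any real side length) and reports the rectangle of largest area. The name `best_rectangle` and the grid size are hypothetical choices.

```python
def best_rectangle(perimeter, steps=1200):
    """Search a grid of side lengths a in [0, perimeter/2] for the
    rectangle a x b of largest area with 2a + 2b == perimeter."""
    half = perimeter / 2  # this is a + b
    candidates = [half * k / steps for k in range(steps + 1)]
    a = max(candidates, key=lambda side: side * (half - side))
    return a, half - a


a, b = best_rectangle(120)
print(a, b, a * b)  # 30.0 30.0 900.0
```

The search agrees with the AM-GM argument: the bound 900 is attained, and it is attained by the square.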
Example 2.8
Find the minimum of the function f(x) = x³ + 4/x³ for x > 0.
Solution. If you know calculus, this problem can be done using optimization techniques. However,
it requires a great deal of time and energy to develop that infrastructure, whereas this problem can
be solved with the AM-GM. How would we realize this? To find the minimum, we need a bound
of the form f (x) ≥ m, where m is some constant. Note that if we multiply the two components of
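f, we would indeed get a constant. To see if this goes anywhere, let a = x³ and b = 4/x³, so that
ab = 4, and
\[ \left(\frac{a + b}{2}\right)^2 = \left(\frac{x^3 + 4/x^3}{2}\right)^2 = \frac{(x^3 + 4/x^3)^2}{4}. \]
The AM-GM then says that
\[ 4 \le \frac{(x^3 + 4/x^3)^2}{4} \quad\Rightarrow\quad 16 \le (x^3 + 4/x^3)^2 \quad\Rightarrow\quad 4 \le x^3 + \frac{4}{x^3}. \]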
So f(x) ≥ 4. As with all inequalities though, this might not be a good lower bound; maybe we
can do better. We need to check that this value of the lower bound is actually achieved. By the
AM-GM, we know that equality occurs when a = b, which in this case gives us
\[ x^3 = \frac{4}{x^3} \quad\Rightarrow\quad (x^3)^2 = 4, \]
so x = ⁶√4 = ∛2, or more conveniently x³ = 2. From here it’s easy to see that when x = ∛2,
f(∛2) = 4 as required.
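A numerical check of this example, offered as a sketch rather than a proof: it evaluates f at the claimed minimiser and on a grid of positive sample points. Restricting to x > 0 is an assumption made here so that the AM-GM applies to a = x³ and b = 4/x³.

```python
def f(x):
    return x**3 + 4 / x**3


x_star = 2 ** (1 / 3)  # the cube root of 2
print(f(x_star))       # approximately 4

# On a grid of positive sample points, f never dips below 4
# (a small tolerance absorbs floating-point rounding).
samples = [k / 100 for k in range(1, 501)]
assert all(f(x) >= 4 - 1e-9 for x in samples)
```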
2.2.3 Absolute Values
Note that |x| ≥ 0, for if x is positive the absolute value does nothing, while if x is negative, the
absolute value adds on yet another negative sign, making it positive.
One can think of the absolute value as measuring the distance of a number from zero. For example,
the numbers 4 and −4 should both be a distance of 4 from 0. We can also use absolute values to
measure the distance between two numbers. If x, y are real numbers, the distance from x to y is
|x − y|.
Figure 5: The real line from −5 to positive 5. We would like to define a system of measurement
such that the red bars have the same length and the blue bars have the same length.
We can use absolute values to describe intervals of real numbers. For example, the collection of
x which satisfy |x| < 2 are those such that −2 < x < 2, or the interval (−2, 2). In a similar vein,
the collection of x such that |x − 1| < 3 are those that are “within a distance of 3 from the number
1.” You can probably guess that this amounts to the interval (−2, 4), but to see it more precisely
note that
|x − 1| < 3 ⇔ −3 < x − 1 < 3 ⇔ −2 < x < 4.
Proposition 2.10
1. x ≤ |x|,
2. |xy| = |x||y|,
3. √(x²) = |x|,
4. |x + y| ≤ |x| + |y|.
Proof. We will prove (4) and leave the others as an exercise. Since x² = |x|², y² = |y|², and
2xy ≤ 2|x||y| we have
\[ x^2 + 2xy + y^2 \le |x|^2 + 2|x||y| + |y|^2 \iff (x + y)^2 \le (|x| + |y|)^2. \]
Taking the square root of both sides gives |x + y| ≤ |x| + |y| as required.
Example 2.11
[Figure 6: The graph of y = |x² + x − 2|, with the region |x − 1| < 1 marked.]
Solution. Graphically, this question may be interpreted as in Figure 6. Note that we can write
\[ |x^2 + x - 2| = |x + 2||x - 1| < |x + 2| \quad \text{since } |x - 1| < 1. \]
When |x − 1| < 1 we have 0 < x < 2, so to make this look like something involving x + 2 we add 2
to everything, giving 2 < x + 2 < 4, which implies that |x + 2| < 4. Hence |x² + x − 2| < 4 when
|x − 1| < 1.
Example 2.12
Solution. Looking at the numerator, we can use the triangle inequality to write
\[ |x^3 - x - 3| \le |x|^3 + |x| + |3| < 2^3 + 2 + 3 = 13. \]
For the denominator, we have to be more careful. Recall that when we take reciprocals, an
inequality changes direction, so we want to bound |x⁴ + 1| from below. Indeed,
\[ |x^4 + 1| = x^4 + 1 \ge x^4 > 2^4 = 16. \]
Combining everything together gives
\[ \left| \frac{x^3 - x - 3}{x^4 + 1} \right| \le \frac{13}{16}. \]
2.3 Sets and Set Building
A set is any collection of distinct objects.² Some examples of sets might include
the alphabet = {a, b, c, . . . , x, y, z} ,
Universities in Toronto = {UofT, Ryerson, York} ,
The Kardashian Sisters = {Kim, Khloe, Kourtney} .
We use the symbol ‘∈’ (read as ‘in’) to talk about when an element is in a set; for example,
1 ∈ {1, 2, 3} but ☺ ∉ {dog, cat}.
Each of the previous examples were finite sets, as they consisted of only a finite number of
elements. A set can also have infinitely many elements. In such instances, it is inconvenient to
write out every element of the set so we use set builder notation. Herein, if P is a proposition on
the set S, such that for each x ∈ S, P (x) is either true or false, then one can define the set
{x ∈ S : P (x)}
which consists of all the elements in S which make P true. For example, if M is the set of months
in the year, then
{m ∈ M : m has 31 days} = {January, March, May, July, August, October, December} .
This was an example where the resulting set was still finite, but it still demonstrates the compactness
of set builder notation.
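Set builder notation has a close analogue in programming. As a minimal sketch, a Python set comprehension plays the role of {m ∈ M : m has 31 days}; the dictionary `DAYS_IN_MONTH` is an assumption introduced for the illustration (it uses 28 days for February, i.e. a non-leap year).

```python
# Assumed data: days per month in a non-leap year.
DAYS_IN_MONTH = {
    "January": 31, "February": 28, "March": 31, "April": 30,
    "May": 31, "June": 30, "July": 31, "August": 31,
    "September": 30, "October": 31, "November": 30, "December": 31,
}

M = set(DAYS_IN_MONTH)  # the set of months
# {m in M : m has 31 days}
long_months = {m for m in M if DAYS_IN_MONTH[m] == 31}

print(sorted(long_months))
print(len(long_months))  # 7
```

Here the predicate P(m) is the test `DAYS_IN_MONTH[m] == 31`, which is either true or false for each m ∈ M.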
The following are some important infinite sets that we will see throughout the course:
² This is not true, since it is possible to define objects called classes, but we will not worry about this too much in
this context.
Special subsets of the real numbers are the intervals. The mathematical definition of the interval
is somewhat complicated, but you’re likely familiar with them. Notationally, we write
Definition 2.13
If a, b ∈ Z, we say that a|b (read a divides b), if there exists some integer k such that b = ak.
Divisibility is something we’ll explore in great detail later in the course, and forms the foundation
of a field of mathematics called number theory. Number theory is really interested in the primes,
with which you should be familiar. Just in case you need a refresher, let’s recall the definition
below:
Definition 2.14
Let a be an integer. We say that a is a prime number if a > 1 and the only positive integers which divide a are
a and 1. We say that a is even if 2|a, and odd otherwise.
We can also talk about subsets, which are collections of items in a set and indicated with a ‘⊆’
sign. For example, if P is the set of prime numbers, then P ⊆ Z, since every element on the left
(a prime number) is also an element of the right (an integer). Similarly, one has N ⊆ Z ⊆ Q ⊆ R.
Note that if A is a set, A ⊆ A.
There is a particular distinguished set, known as the empty set and denoted by ∅, which contains
no elements. Recalling the definition of a vacuous truth, it is not too hard to convince oneself that
the empty set is a subset of every set!
2. T = {x ∈ R : x = a − 1/2, ∀a ∈ N},    4. V = {x ∈ Z : x = 3ⁿ, n ∈ N}.
³ Some mathematicians believe that 0 is a natural number. I am personally undecided, and always just choose
which version is more convenient.
Two sets are equal when they contain precisely the same elements. In practice, showing that
two sets are equal is usually done by mutual subset inclusion: if A, B are sets then
A=B ⇔ A ⊆ B and B ⊆ A
Example 2.15
Solution. Let’s begin by showing that A ⊆ B. Let n ∈ A, so that there exists a k such that
n = 4k + 1. Notice that we can equivalently write n = 4(k + 1) − 3, showing that n ∈ B. Since n
was arbitrary, we conclude that every element in A is also in B, so A ⊆ B.
Conversely, if n ∈ B then n = 4k − 3 for some k. We can write this as n = 4k − 3 = 4(k − 1) + 1,
showing that n ∈ A. Since n was arbitrary, every element of B is also an element of A, so B ⊆ A.
Both inclusions imply that A = B, as required.
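The double inclusion can be spot-checked on a finite window of integers. This sketch assumes, as the solution suggests, that A consists of the integers of the form 4k + 1 and B of those of the form 4k − 3, with k ranging over all integers; the membership tests `in_A` and `in_B` are hypothetical helpers.

```python
def in_A(n):
    """n = 4k + 1 for some integer k."""
    return (n - 1) % 4 == 0


def in_B(n):
    """n = 4k - 3 for some integer k."""
    return (n + 3) % 4 == 0


window = range(-100, 101)
# Every element of A in the window lies in B, and vice versa.
assert all(in_B(n) for n in window if in_A(n))
assert all(in_A(n) for n in window if in_B(n))
print(sorted(n for n in window if in_A(n))[:5])  # [-99, -95, -91, -87, -83]
```

A finite check like this cannot replace the proof, since both sets are infinite, but it is a cheap way to catch a mistaken claim before trying to prove it.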
Example 2.16
Solution. Let x ∈ A so that x² − 1 < 0. This means that x² < 1, which we can solve to get
x ∈ (−1, 1). Hence x ∈ B, and A ⊆ B.
Conversely, if x ∈ (−1, 1) then we know x² < 1, so x² − 1 < 0, showing that x ∈ A and B ⊆ A.
Both inclusions show that A = B.
Union and Intersection Let S be a set and choose two sets A, B ⊆ S. We define the union of
A and B to be
A ∪ B = {x ∈ S : x ∈ A or x ∈ B}
and the intersection of A and B to be
A ∩ B = {x ∈ S : x ∈ A and x ∈ B} .
Example 2.17
Figure 7: Left: The union of two sets is the collection of all elements which are in at least one of the two sets (though
remember that elements of sets are distinct, so we do not permit duplicates). Right: The intersection
of two sets consists of all elements which are common to both sets.
Let I ⊆ N be an indexing set: Given a collection of sets {A_i}_{i∈I} in S, one can take the
intersection or union over the entire collection, and this is often written as
\[ \bigcup_{i \in I} A_i = \{x \in S : \text{there is an } i \in I,\ x \in A_i\}, \qquad \bigcap_{i \in I} A_i = \{x \in S : \text{for every } i \in I,\ x \in A_i\}. \]
Example 2.18
Consider the set {x ∈ R : sin(x) > 0}. Write this set as an infinite union of intervals.
Solution. We are well familiar with the fact that sin(x) > 0 on (0, π), (2π, 3π), (4π, 5π), etc. If we
let the interval I_n = (2nπ, (2n + 1)π) then the aforementioned intervals are I_0, I_1, and I_2. We can
convince ourselves that sin(x) > 0 on any of the I_n, and hence
\[ \{x \in \mathbb{R} : \sin(x) > 0\} = \bigcup_{n \in \mathbb{Z}} I_n = \bigcup_{n \in \mathbb{Z}} (2n\pi, (2n + 1)\pi). \]
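The claimed description of the set can be tested on sample points. In this sketch the helper `in_union` (a hypothetical name) decides membership in the union by locating the only interval that could contain x; the sample grid is an arbitrary choice that stays away from multiples of π, where floating-point rounding could blur the comparison.

```python
import math


def in_union(x):
    """Is x in (2n*pi, (2n + 1)*pi) for some integer n?"""
    n = math.floor(x / (2 * math.pi))  # the only candidate index
    return 2 * n * math.pi < x < (2 * n + 1) * math.pi


samples = [k / 10 for k in range(-200, 201)]
# Membership in the union agrees with the sign test sin(x) > 0.
assert all((math.sin(x) > 0) == in_union(x) for x in samples)
print(in_union(1.0), in_union(4.0), in_union(-4.0))  # True False True
```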
Example 2.19
Define I_n = (0, 1/n) ⊆ R. Determine I = ⋂_{n∈N} I_n.
Solution. By definition, I consists of the elements which are in I_n for every n ∈ N. We claim that
I cannot contain any positive real number. Indeed, if p > 0 then there exists n ∈ N such that
1/n < p, which means that p ∉ I_k for all k ≥ n, and hence p cannot be in I. Since I has no positive real
numbers, and certainly cannot contain any non-positive real numbers, we conclude that I = ∅.
Exercise: Let I_n = (−n, n) ⊆ R for n ∈ N. Determine both ⋃_n I_n and ⋂_n I_n.
Figure 8: The complement of a set A with respect to S is the set of all elements which are in S
but not in A.
Complement If A ⊆ S then the complement of A with respect to S is the set of all elements of S which are
not in A; that is,
A^c = {x ∈ S : x ∉ A} .
Example 2.20
Determine the complement of I = ⋃_{n∈Z} (2nπ, (2n + 1)π) from Example 2.18, with respect to
R.
Solution. Since I contains all the open intervals of the form (2nπ, (2n + 1)π) we expect its complement
to contain everything else. Namely,
\[ I^c = \bigcup_{n \in \mathbb{Z}} [(2n - 1)\pi, 2n\pi]. \]
Example 2.21
Exercise:
Example 2.22
Solution. We must show two inclusions, so let’s start with (⊆). Let x ∈ (B \ A) ∪ (C \ A), so that
x ∈ (B \ A) or x ∈ (C \ A). For our first case, suppose x ∈ B \ A, in which case x ∈ B and x ∉ A.
But as x ∈ B, we know x ∈ B ∪ C, so x ∈ B ∪ C and x ∉ A implies x ∈ (B ∪ C) \ A. Precisely the
same reasoning holds if we take x ∈ C \ A, so (B \ A) ∪ (C \ A) ⊆ (B ∪ C) \ A.
Conversely, suppose x ∈ (B ∪ C) \ A, so that x ∈ B ∪ C and x ∉ A. Since x ∈ B ∪ C, we
know either x ∈ B or x ∈ C. Suppose for now that x ∈ B. Since x ∈ B and x ∉ A, we know
x ∈ (B \ A), so x ∈ (B \ A) ∪ (C \ A). Exactly the same reasoning holds if we assume x ∈ C, so
(B ∪ C) \ A ⊆ (B \ A) ∪ (C \ A).
Both inclusions give equality, as required.
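The identity (B \ A) ∪ (C \ A) = (B ∪ C) \ A can also be tested on randomly generated finite sets. This is a sketch, not a substitute for the proof: the universe {0, ..., 9}, the seed, and the helper names are all arbitrary assumptions. In Python, `-` is set difference and `|` is union.

```python
import random


def identity_holds(A, B, C):
    """Check (B \\ A) ∪ (C \\ A) == (B ∪ C) \\ A for finite sets."""
    return ((B - A) | (C - A)) == ((B | C) - A)


rng = random.Random(0)  # fixed seed so the run is repeatable
universe = range(10)


def random_subset():
    return {x for x in universe if rng.random() < 0.5}


trials = [(random_subset(), random_subset(), random_subset())
          for _ in range(1000)]
print(all(identity_holds(A, B, C) for A, B, C in trials))  # True
```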
Cartesian Product The Cartesian product of two sets A and B is the collection of ordered
pairs, one from A and one from B; namely,
A × B = {(a, b) : a ∈ A, b ∈ B} .
A geometric way (which does not generalize well) to visualize the Cartesian product is as sticking a
copy of B onto each element of A, or vice-versa. For our purposes, the main use of the product
will be to define higher dimensional spaces. For example, we know that we can represent the plane
R² as the set of ordered pairs R² = {(x, y) : x, y ∈ R}, while three dimensional space is the set of ordered
triples R³ = {(x, y, z) : x, y, z ∈ R}. In this sense, we see that R² = R × R and R³ = R × R × R, which
motivates the more general definition of Rⁿ as the set of ordered n-tuples
\[ \mathbb{R}^n = \underbrace{\mathbb{R} \times \cdots \times \mathbb{R}}_{n\text{-times}}. \]
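For finite sets the Cartesian product can be listed explicitly. The sketch below uses `itertools.product` from the Python standard library; the particular sets A and B are arbitrary examples chosen for illustration.

```python
from itertools import product

A = {1, 2}
B = {"x", "y", "z"}

# A x B = {(a, b) : a in A, b in B}
AxB = set(product(A, B))
print(len(AxB))         # 6, one pair for each choice of a and b
print((1, "x") in AxB)  # True
print(("x", 1) in AxB)  # False: the pairs are ordered

# The n-fold product, here {0, 1} x {0, 1} x {0, 1}.
cube = set(product({0, 1}, repeat=3))
print(len(cube))        # 8
```

The `repeat=3` form mirrors the definition of Rⁿ as an n-fold product, with the finite set {0, 1} standing in for R.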
Exercise: We have swept some things under the rug in defining Rⁿ, largely because the true
nature is technical and boring. There is no immediate reason to suspect that R × R × R
should be well defined: we first need to check that the Cartesian product is associative; that
is, (R × R) × R = R × (R × R). By definition, the left-hand-side is
Syntactically, neither of these looks the same as R³ = {(a, b, c) : a, b, c ∈ R}, but nonetheless
they all define the same data.
2.3.3 Functions Between Sets
Given two sets A, B, a function f : A → B is a map which assigns to every point in A a unique
point of B. If a ∈ A, we usually denote the corresponding element of B by f(a). When specifying
the function, one may write a ↦ f(a). The set A is termed the domain, while B is termed the
codomain.
Some examples of functions are as follows:
2. If Poly_Q is the set of all polynomials with rational coefficients, π : Poly_Q → Q given by
π(p) = p(0) is a function. For example, if p(x) = (1/2)x² − x + 17/18 then π(p) = 17/18.
If you have taken calculus before, this is an example of a function which is continuous on
R \ Q and discontinuous on Q.
When f : R → R, this coincides with the notion of a graph with which you are familiar.
It is important to note that not every element of B needs to be hit by f ; that is, B is not
necessarily the range of f . Rather, B represents the ambient space to which f maps. Also, if either
of the domain or codomain changes, the function itself changes. This is because the data of the
domain and codomain are intrinsic to the definition of a function. For example, f : R → R given
by f(x) = x² is a different function than g : R → [0, ∞), g(x) = x².
Definition 2.23
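Let f : A → B be a function.
1. If U ⊆ A, the image of U under f is the set
f(U) = {y ∈ B : ∃x ∈ U, f(x) = y} = {f(x) : x ∈ U} .
2. If V ⊆ B, the preimage of V under f is the set
f⁻¹(V) = {x ∈ A : f(x) ∈ V} .
[Figure: A function f : A → B carrying a subset U of A to its image f(U) in B.]
Note that despite being written as f⁻¹(V), the preimage of a set does not say anything about
the existence of an inverse function.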
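For a function on a finite domain, images and preimages can be computed directly from the definitions. The sketch below is illustrative: `image` and `preimage` are hypothetical helper names, and the squaring function on A = {−3, ..., 3} is an arbitrary example.

```python
def image(f, U):
    """f(U) = {f(x) : x in U}."""
    return {f(x) for x in U}


def preimage(f, domain, V):
    """f^{-1}(V) = {x in domain : f(x) in V}."""
    return {x for x in domain if f(x) in V}


A = set(range(-3, 4))  # the domain {-3, ..., 3}


def square(x):
    return x * x


U = {0, 1, 2}
print(sorted(image(square, U)))                       # [0, 1, 4]
print(sorted(preimage(square, A, image(square, U))))  # [-2, -1, 0, 1, 2]
```

Note that `preimage` never inverts `square`; it only tests each point of the domain. In this example the preimage of the image of U is strictly larger than U, even though squaring has no inverse function on A.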
Example 2.24
On the other hand, since f([0, 1]) = [0, 1] we know that f⁻¹(f([0, 1])) = f⁻¹([0, 1]) for which
Example 2.25
Solution. First we notice that for any x ∈ R, f(x) ≥ 0. Indeed, since x² ≥ 0 we have 1 + x² > 0, giving
\[ \frac{x^2}{1 + x^2} \ge 0. \]
When x = 0 we do in fact have f(x) = 0 so this inequality is saturated. Now we also have f(x) < 1,
since
\[ 1 + x^2 > x^2, \quad\text{and so}\quad \frac{x^2}{1 + x^2} < 1. \]
These two facts together imply that f(R) ⊆ [0, 1). To show the other direction, we must show that
every element of [0, 1) is equal to f(x) for some x ∈ R. Let y ∈ [0, 1) and notice that
\begin{align*}
f(x) = y &\iff \frac{x^2}{1 + x^2} = y \\
&\iff x^2 = y + x^2 y \\
&\iff x^2 (1 - y) = y \\
&\Longleftarrow x = \sqrt{\frac{y}{1 - y}}.
\end{align*}
Notice that it was necessary for 0 ≤ y < 1 to ensure that the term y/(1 − y) under the square root
is non-negative. Since this value of x maps to y, we have [0, 1) ⊆ f(R), and equality then follows from
the double inclusion.
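The formula for x found above can be checked numerically. This sketch assumes, as in the solution, that f(x) = x²/(1 + x²); the helper name `preimage_point` is hypothetical.

```python
import math


def f(x):
    return x**2 / (1 + x**2)


def preimage_point(y):
    """Return one x with f(x) = y, valid for 0 <= y < 1."""
    if not 0 <= y < 1:
        raise ValueError("y must lie in [0, 1)")
    return math.sqrt(y / (1 - y))


# Each sampled y in [0, 1) is hit by the x given by the formula.
for y in [0, 0.25, 0.5, 0.9, 0.999]:
    assert abs(f(preimage_point(y)) - y) < 1e-9
```

The guard on y reflects the remark in the text: outside [0, 1) the quantity y/(1 − y) is negative or undefined, so no such x exists.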
Example 2.26
If f : R → R is given by f(x) = 2|x + 1| / (3|x| + 2), show that f(R) = [0, 1].
Solution. We need to show a double subset inclusion, for which we start with (⊆). Suppose y ∈
f(R), so that y = f(x) for some x ∈ R; namely,
\[ y = \frac{2|x + 1|}{3|x| + 2}. \]
The numerator of y is non-negative and the denominator is positive, so y ≥ 0. Applying the triangle inequality
gives us
\[ y = \frac{2|x + 1|}{3|x| + 2} \le \frac{2|x| + 2}{3|x| + 2} \le \frac{3|x| + 2}{3|x| + 2} = 1, \]
so that y ≤ 1. Both inequalities show that y ∈ [0, 1], so f(R) ⊆ [0, 1].
For the other direction, there are a few arguments that could be made. First of all, note that
f(−1) = 0 and f(0) = 1, showing that the endpoints of [0, 1] are actually achieved.
1. Since f is continuous on [−1, 0], the Intermediate Value Theorem implies that every value
in [0, 1] is achieved, and so [0, 1] ⊆ f(R).
Example 2.27
S² = {(x, y, z) ∈ R³ : x² + y² + z² = 1},
determine f(S²).
Solution. Let (a, b, c) ∈ S² so that a² + b² + c² = 1. The image of this point under f is f(a, b, c) = (a, b).
It must be the case that a² + b² ≤ 1, and so f(S²) ⊆ D² = {(x, y) ∈ R² : x² + y² ≤ 1}. We
claim that this is actually an equality; that is, f(S²) = D². In general, to show that two sets A
and B are equal, we need to show A ⊆ B and B ⊆ A. As we have already shown that f(S²) ⊆ D²,
we must now show that D² ⊆ f(S²).
Let (a, b) ∈ D² so that a² + b² ≤ 1. Let c = √(1 − a² − b²), which is well-defined by hypothesis.
Then a² + b² + c² = 1 so that (a, b, c) ∈ S², and f(a, b, c) = (a, b). Thus D² ⊆ f(S²), and so f(S²) = D².
Example 2.28
Exercise: Does the converse to Example 2.28 hold? More precisely, is it the case that
f (A) ∩ f (B) ⊆ f (A ∩ B), and therefore the two sets are actually equal?
Definition 2.29
A function f : [a, b] → R is said to be
Definition 2.30
A function f : R → R is said to be bounded if there exists an M > 0 such that |f (x)| ≤ M
for all x ∈ R.
Furthermore, note that if f, g : A → B are functions, then anything that can be done to points
in B can be done to f and g, by defining the operations in a pointwise fashion. For example, if
f, g : R → R, then since we can add/multiply in the codomain R, we can similarly perform these
actions on f, g as
(f + g)(x) = f (x) + g(x), (f g)(x) = f (x)g(x).
Example 2.31
Solution. Since both functions are bounded, there exist M₁ > 0 and M₂ > 0 such that |f(x)| ≤
M₁ and |g(x)| ≤ M₂ for all x ∈ R. Define M = M₁ + M₂, which we claim will work for the sum
f + g. Indeed, for any x ∈ R we have
\[ |(f + g)(x)| = |f(x) + g(x)| \le |f(x)| + |g(x)| \le M_1 + M_2 = M. \]
2.4 Ordered Fields
Chances are you have seen the real numbers R before. In fact, you might even think that you
have a good understanding of the real numbers. The reality is, the real numbers are actually an
incredibly subtle and difficult object with which to play. In this section, I will show you examples
of other objects, called fields, which have similar properties to the real numbers.
2.4.1 The Field Axioms
Fields are actually very complicated mathematical objects that have a lot of underlying structure.
This means that in order to tell you what a field does, I must enumerate a great many axioms.
Definition 2.32
Given a set S, a (closed) binary operator is a function b : S × S → S.
The definition of a binary operator is somewhat self-explanatory. Binary describes the number
2, so a binary operator is something which operates on two elements of S and produces another
element of S. The additional adjective closed is used to describe the fact that the output of elements
in S remains in S.
For example, multiplication and addition of integers are both closed binary operators. We abuse
+ : Z × Z → Z, (a, b) ↦ a + b,        × : Z × Z → Z, (a, b) ↦ a × b.
However, notice that division is not a closed binary operator, since dividing two integers need not
give back an integer. For example, 1, 2 ∈ Z, but ÷(1, 2) = 1/2 is not an integer.
Definition 2.33
A field is any set F equipped with two closed binary operators ⊕, ⊗, called addition and
multiplication respectively, such that for any x, y, z ∈ F we have
1. [Associativity] x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z and x ⊗ (y ⊗ z) = (x ⊗ y) ⊗ z,
2. [Commutativity] x ⊕ y = y ⊕ x and x ⊗ y = y ⊗ x
3. [Identity] There exist distinct elements 0_F and 1_F of F such that for any x ∈ F, x ⊕ 0_F = x
and x ⊗ 1_F = x.
4. [Additive Inverses] For any x ∈ F, there exists y ∈ F such that x ⊕ y = 0_F. We usually write y = −x.
5. [Multiplicative Inverses] For any non-zero x ∈ F, there exists r ∈ F such that
x ⊗ r = 1_F. We usually write r = x⁻¹.
6. [Distributivity] x ⊗ (y ⊕ z) = (x ⊗ y) ⊕ (x ⊗ z).
The distributivity property is essential, since it says that addition and multiplication play
together nicely; that is, they are compatible. Out of laziness, we will write · for ⊗, + for ⊕, and
simply 0 and 1 for the identities from now on.
1. The real numbers R and the rational numbers Q are both fields (check this as best you can).
However, Z and N are not fields. In the case of N, elements do not have additive inverses. In
the case of Z, elements do not have multiplicative inverses.
2. Define a binary operator on N called the modulo operation, where a mod b is the remainder
when a is divided by b. For example,
Consider the set F2 = {0, 1} where addition and multiplication are done modulo 2; that is,
a + b = (a + b) mod 2, a · b = (a · b) mod 2.
+ | 0 1        · | 0 1
--+----        --+----
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1
Similarly, F3 = {0, 1, 2} with addition and multiplication done modulo 3 is a field, with
addition and multiplication tables
+ | 0 1 2        · | 0 1 2
--+------        --+------
0 | 0 1 2        0 | 0 0 0
1 | 1 2 0        1 | 0 1 2
2 | 2 0 1        2 | 0 2 1
However, F4 = {0, 1, 2, 3} with addition and multiplication given modulo 4 is not a field, as
we will show in Example 2.37.
3. [Advanced Example] Let P1(F2) be the polynomials of degree at most one with coefficients in F2,
subject to the identity x^2 + x + 1 = 0:
P1(F2) = {ax + b : a, b ∈ F2, x^2 + x + 1 = 0}.
This is a field with precisely four elements, {0, 1, x, x + 1}. Addition and multiplication tables
are given by
+   | 0    1    x    x+1        ·   | 0    1    x    x+1
----+-------------------        ----+-------------------
0   | 0    1    x    x+1        0   | 0    0    0    0
1   | 1    0    x+1  x          1   | 0    1    x    x+1
x   | x    x+1  0    1          x   | 0    x    x+1  1
x+1 | x+1  x    1    0          x+1 | 0    x+1  1    x
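The modular examples above can be checked mechanically. The following is a minimal sketch in Python, not part of the original notes: the helper names `tables` and `has_all_inverses` are my own, and the check only tests the multiplicative inverse axiom (the remaining axioms are inherited from integer arithmetic modulo n).

```python
def tables(n):
    """Return the (addition, multiplication) tables for {0, ..., n-1} modulo n."""
    add = [[(a + b) % n for b in range(n)] for a in range(n)]
    mul = [[(a * b) % n for b in range(n)] for a in range(n)]
    return add, mul


def has_all_inverses(n):
    """True if every non-zero a in {0, ..., n-1} has some r with a * r = 1 modulo n."""
    return all(any((a * r) % n == 1 for r in range(n)) for a in range(1, n))


for n in (2, 3, 4):
    add, mul = tables(n)
    print(f"n = {n}: every non-zero element invertible? {has_all_inverses(n)}")
    for row in mul:
        print("   ", row)
```

For n = 4 the element 2 has no inverse, which is one way to see that arithmetic modulo 4 fails the field axioms.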
Example 2.34
x · 0 = x · (0 + 0)   by (3)
      = (x · 0) + (x · 0)   by (6)
0 = (x · 0) + [−(x · 0)]
  = [(x · 0) + (x · 0)] + [−(x · 0)]   from above
  = (x · 0) + [(x · 0) + [−(x · 0)]]   by (1), where the inner bracket equals 0
  = x · 0   by (3)
as required.
Example 2.35
Solution. Let z be any additive identity element, so that z + x = x for all x ∈ F. In particular,
z + x = x = x + 0. Let −x be the additive inverse of x, so that
0 = x + (−x) = (z + x) + (−x) = z + (x + (−x)) = z + 0 = z,
showing that necessarily z = 0. A similar argument holds for the multiplicative identity.
Example 2.36
Example 2.37
Solution. Suppose that xy = 0. If x = 0 then the result certainly holds, so assume
that x is non-zero. Since x is non-zero, it has a multiplicative inverse
x^{-1}, which implies
y = (x^{-1} · x)y = x^{-1} · (x · y) = x^{-1} · 0 = 0,
showing that y = 0.
Example 2.37 shows that F4 cannot be a field. Indeed, in F4 we have that 2 · 2 = 0 but neither
of these is zero. If F4 were a field, then Example 2.37 says that one of these must be zero, and that
is not the case.
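To see concretely which moduli produce such "zero divisors", here is a small Python sketch (the function name `zero_divisors` is hypothetical, chosen for this illustration). It lists the pairs of non-zero residues whose product is 0 modulo n.

```python
def zero_divisors(n):
    """Pairs (a, b) of non-zero residues with a <= b and a * b = 0 modulo n."""
    return [(a, b) for a in range(1, n) for b in range(a, n) if (a * b) % n == 0]


for n in (2, 3, 4, 5):
    print(n, zero_divisors(n))
```

A non-empty list means that arithmetic modulo n cannot be a field, by the argument just given.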
Example 2.38
Solution. We claim that 1 + x^{-1} = 0 (look at the chart in the examples above). To show this, let’s
assume that it’s not the case and show that something weird happens.
If 1 + x^{-1} ≠ 0, then it must have a multiplicative inverse, and its inverse is either 1 or x. If its inverse
is 1, then
1 · (1 + x^{-1}) = 1 ⇒ 1 + x^{-1} = 1 ⇒ x^{-1} = 0.
But this would imply that 1 = x · x^{-1} = x · 0 = 0, which cannot happen.
If the inverse is x, then
x · (1 + x^{-1}) = 1 ⇒ x + 1 = 1 ⇒ x = 0,
which is also not possible since x is non-zero. We conclude that 1 + x^{-1} cannot be inverted, so it must be 0.
We have already looked at an ordering on the real numbers, where by definition we say that a < b
if b − a > 0. We generalize this notion as follows:
Definition 2.39
If F is a field, a subset P ⊆ F is a positive set if
Notice that the set of positive real numbers is a positive set in R. If F admits a positive set
P , we can define an ordering on F by saying that x < y if y − x ∈ P . By y − x we of course
mean y + (−x), but we shall be sloppy with that notation henceforth. Any field endowed with an
ordering is said to be an ordered field.
Proposition 2.40
z − x = z + (y − y) − x = (z − y) + (y − x) ∈ P, since z − y ∈ P and y − x ∈ P.
(y + v) − (x + u) = (y − x) + (v − u) ∈ P, since y − x ∈ P and v − u ∈ P,
by closure under addition.
Ordered fields must have infinitely many elements. Assume that 1 ∈ P. Since P is closed under
addition, by repeatedly using Proposition 2.40(3), we must have
0 < 1 < 1 + 1 < 1 + 1 + 1 < · · ·
and this must go on forever. If our field only has finitely many elements, then at some point this
process must begin to cycle back on old numbers, allowing us to show something along the lines of
x < x, which is not possible.
Definition 2.41
If F is an ordered field, we say that a subset S ⊆ F is bounded from above if there exists an
element M ∈ F such that for all x ∈ S, x ≤ M. In this case, we say that M is an upper
bound of S.
A set which is bounded from above has many possible upper bounds. For example, the set S = {1/n : n ∈ N} ⊆ Q
is bounded from above, with an upper bound of 2. But 3, or 4, or in fact any rational number larger than 2 is
also an upper bound for S.
This pattern is typical. If M is an upper bound for S and M < N, then for every x ∈ S we
have
x ≤ M < N
showing that N is also an upper bound for S. An interesting question which naturally arises is
then “Is there a least upper bound?”
Definition 2.42: The Completeness Axiom
An ordered field F is complete if whenever S ⊆ F is bounded from above, there exists a least
upper bound of S.
For example, the set S = {x ∈ Q : x^2 < 2} is certainly bounded above, since x < 2 for all x ∈ S.
However, this set does not have a least upper bound in the rational numbers. Therefore, Q is not
complete. However, notice that R is a complete ordered field, and in fact is constructed in such a
manner as to guarantee that it is complete.
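To get a feeling for why the set of rationals whose square is less than 2 has no least upper bound in Q, the following Python sketch uses exact rational arithmetic. The update rule u ↦ (u + 2/u)/2 is my own choice of illustration (it is not taken from the notes): whenever u is a positive rational with u^2 > 2, the new value is a strictly smaller positive rational whose square is still greater than 2, so it is again an upper bound.

```python
from fractions import Fraction


def smaller_upper_bound(u):
    """Given a rational u > 0 with u*u > 2, return a smaller rational whose square exceeds 2."""
    return (u + 2 / u) / 2


u = Fraction(2)          # 2 is an upper bound, as noted in the text
bounds = [u]
for _ in range(5):
    u = smaller_upper_bound(u)
    bounds.append(u)

for b in bounds:
    print(b, float(b))
```

Every entry of `bounds` is a rational upper bound, and each is smaller than the one before, so no rational upper bound of this kind can be the least one. This is only an illustration, not a complete proof.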
It is possible to show that R is in essence the only complete ordered field, in the sense that any
other field which is complete and ordered is essentially just R in disguise.
3 Mathematical Logic
A logical predicate is a statement about objects in a set S which evaluates to either true or false. We will
denote predicates by a capital letter, such as P , in which case P (x) is read as “x satisfies property
P ” or some other equivalent sentence.
P (x) “x is a dog,”
P (x) “x has a birthday today,”
P (x, y) “x and y have the same calculus lecture,”
P (x, y, z) “The sum of x and y is greater than z,”
Solution. The statement P (5, 2) is that “5 is divisible by 2.” This is false for the division yields
5 = 2 · 2 + 1, leaving a remainder of 1. On the other hand, P (35, 5) is true, since 35 = 5 · 7 with no
remainder. P (0, 1) is also true as 0 = 0·1 and in fact, this would be true regardless of which number
we had chosen for y. The only contentious example occurs when trying to evaluate P (1, 0), since
we cannot divide by zero. In this case, we adhere to the convention that no number is divisible by
zero, so that P (1, 0) is false. As in the previous case, P (x, 0) would be false regardless of our choice of
x.
Remark 3.2 You might be disturbed at the idea of writing false mathematical statements,
such as “5 is divisible by 2.” Morality aside, it is important to realize that we can write false
statements in English as well. For example, the statement “Pigs can fly” is obviously false,
but this does not prevent me from writing it down. While we will eventually endeavour to
only write true statements, for the moment it is important to consider false statements as
well.
Quantifiers allow us to discuss the number of objects which satisfy a predicate. If we wish to discuss
every element of a set, we use the universal quantifier ∀, read as “for all.” To state that an element
in a set exists, we use the existential quantifier ∃, read as “there exists.”
When combined with a predicate P , we can assign truth values to quantified statements. For
example, let S be a universe of discourse. The statement ∀x ∈ S, P (x) will be true precisely when
P (x) is true for every element in S. On the other hand, ∃x ∈ S, P (x) will be true as long as a
single element of S makes P (x) true.[4]
The addition of quantifiers allows us to make statements such as the following:
These last two examples have multiple quantifiers. Can you spot them?
Example 3.3
1. ∀x ∈ N, x^2 ≥ 0,
2. ∃x ∈ R, x = √(−1),
3. ∀x ∈ Q, ∀y ∈ Q, x + y ∈ Q,
4. ∃x ∈ N, ∃y ∈ N, x/y ∈ N.
Solution.
1. This statement is true, since squaring any non-zero real number results in a positive number, and 0^2 = 0.
2. This statement is false. If such an x existed, it would also satisfy x^2 = −1. By our comment
in part 1, the square of a real number is never negative, leading us to a contradiction.
3. This statement is true. Writing x = a/b and y = c/d with a, c ∈ Z and b, d ∈ N, we have
x + y = a/b + c/d = (ad + bc)/(bd).
Since ad + bc ∈ Z and bd ∈ N, this is also a rational number.
4. This statement is true. For example, by setting x = 4 and y = 2 we have x/y = 4/2 = 2,
which is also a natural number.
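On a finite universe, quantified statements like those above can be evaluated directly: Python's built-in `all` plays the role of ∀ and `any` plays the role of ∃. The sketch below is only an illustration. The finite ranges are my own stand-ins for the infinite sets N and R, so a finite check of a universal statement is evidence rather than a proof.

```python
S = range(-5, 6)     # small finite universe, chosen only for illustration
N = range(1, 11)     # the first few natural numbers

# "for all x in S, x^2 >= 0"
forall_square_nonneg = all(x * x >= 0 for x in S)

# "there exists x in S with x^2 = -1"
exists_square_minus_one = any(x * x == -1 for x in S)

# "there exist x, y in N with x / y in N", i.e. y divides x exactly
exists_quotient = any(x % y == 0 for x in N for y in N)

print(forall_square_nonneg, exists_square_minus_one, exists_quotient)
```

Note that a single witness settles an existential statement for good, while a failed search over a finite range does not by itself disprove one.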
Notice how the above solutions demonstrated the truth of quantified statements. To show
that ∃x ∈ S, P (x), we find a single example of an x ∈ S which makes P (x) true. To show that
∀x ∈ S, P (x) is more subtle. Rather than try to demonstrate P (x) for every x, we choose an
arbitrary x ∈ S. If P (x) is true for an arbitrary x, then it must be true for every x.
[4] A simple mnemonic for remembering which symbol corresponds to which quantifier is that “for all” looks like an
upside down A which stands for ALL, and “there exists” looks like a backwards E which stands for EXISTS.
Doubly quantified statements must be treated with caution. One may freely interchange two
adjacent quantifiers of the same type, but not of different types. For example, the statement
∀x ∈ Q, ∀y ∈ Q, x + y ∈ Q is logically equivalent to ∀y ∈ Q, ∀x ∈ Q, x + y ∈ Q,
and
∃x ∈ N, ∃y ∈ N, x/y ∈ N is logically equivalent to ∃y ∈ N, ∃x ∈ N, x/y ∈ N.
However, interchanging existential and universal quantifiers can lead to serious trouble.
Example 3.4
2. Turn the sentence derived above into a simple sentence, which does not involve any
variables.
Solution. We start with equation (3.1) for which a direct translation of the notation into English
gives us
“For all x in the real numbers, there exists y in the real numbers, (such that) x + y = 0.”
This is fine but not very enlightening. By recognizing that x + y = 0 is equivalent to x = −y, we
could also re-interpret this sentence as saying “For every real number there is another real number
which is its negative.” Dropping the superfluous words we arrive at the intuitive statement
This statement is certainly true: Given an integer a, we can construct its negative to be −a.
Looking at (3.2) we have
∃y ∈ Z, ∀x ∈ Z, x + y = 0. (3.3)
Using the same translation process as above, the corresponding simple sentence is given by
This says there is a number to which we can add any other number and always get zero.
Certainly this is not true! If it were, then there would be a number n such that n + a = 0 and
n + b = 0 for any integers a and b. Equating these expressions, we would find that n + a = n + b
which in turn implies that a = b. This would force all integers to be equal, which is nonsense.
Example 3.4 teaches us that changing the order of the quantifiers significantly changes the logical
statement, and hence the truth of that statement. To borrow a term from the computer scientists,
universal quantifiers admit a ‘scope’ to the existential quantifiers they precede. For example, the
statement ∀x, ∃y, P (x, y) means that the choice of y is allowed to depend upon x. The statement
∃y, ∀x, P (x, y) does not confer this dependence: the choice of y must work for every x.
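The effect of swapping quantifiers can be seen on a small finite universe. In this Python sketch the set S = {−3, ..., 3} is an assumption made purely for illustration, and P(x, y) is the predicate x + y = 0 discussed above. Nesting `any` inside `all` lets y depend on x, while nesting `all` inside `any` demands one y that works for every x.

```python
S = range(-3, 4)   # finite stand-in for the integers, symmetric about 0

# "for all x there exists y with x + y = 0": y may depend on x
forall_exists = all(any(x + y == 0 for y in S) for x in S)

# "there exists y such that for all x, x + y = 0": one y must work for every x
exists_forall = any(all(x + y == 0 for x in S) for y in S)

print(forall_exists, exists_forall)
```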
Example 3.5
Let S be the set of all students in a classroom, and B(a, b) be the statement “student a has
the same birthday as student b.” Write the mathematical statements
Three operations allow us to form new predicates from old: The AND operation, also known
as conjunction; the OR operation, also known as disjunction; and the NOT operation, also known as
negation.
• AND: When we link two predicates using an AND statement, both predicates must evaluate
to true for the new predicate to be true. Notationally, the statement P (x) AND Q(x) is
written P (x) ∧ Q(x).
For example, if our universe is Z, let E(x) represent “x is even” and P (y) represent “y is
positive.” The statement E(x) ∧ P (y) is then “x is even and y is positive,” so that
E(2) ∧ P (2) is true, E(4) ∧ P (−1) is false,
All of these rules can be tricky to remember when trying to absorb the information from the
written word. By organizing the truth data of each operation into a truth table, we have a quick and
easy way of seeing the structure of each logical statement. To facilitate writing out these tables, let
T denote TRUE and F denote FALSE. The truth tables for all three operations are given in Table
1.
AND                OR                 NOT
P  Q  P ∧ Q        P  Q  P ∨ Q        P  ¬P
T  T    T          T  T    T          T   F
T  F    F          T  F    T          F   T
F  T    F          F  T    T
F  F    F          F  F    F
Table 1: The truth tables for the AND, OR, and NOT operations.
Example 3.6
Let O(x) represent “x is odd” and E(x) represent “x is even.” Compute O(x) ∧ E(x), O(x) ∨
E(x), ¬E(x) and ¬O(x) when x = 1 and x = 2.
Example 3.7
Solution. While it is possible to create the truth table immediately, this is prone to mistakes. By
breaking down the truth table into several smaller tables, we obtain a clearer picture and our
solution is more robust to error. We start by examining the ¬(P ∧ Q) predicate.
P Q P ∧Q ¬(P ∧ Q)
T T T F
T F F T
F T F T
F F F T
Now we add in the disjunction with R into an expanded truth table, which gives
P Q R ¬(P ∧ Q) ¬(P ∧ Q) ∨ R
T T T F T
T T F F F
T F T T T
T F F T T
F T T T T
F T F T T
F F T T T
F F F T T
We conclude that the statement ¬(P ∧ Q) ∨ R is always true except in the case where (P, Q, R) is
(T, T, F ).
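Truth tables are easy to get wrong by hand, so it can be reassuring to generate them. The following Python sketch (the name `formula` is mine) enumerates all eight assignments of truth values to (P, Q, R) and records where ¬(P ∧ Q) ∨ R is false.

```python
from itertools import product


def formula(p, q, r):
    """The predicate NOT (P AND Q) OR R."""
    return (not (p and q)) or r


# One row per assignment of truth values, in the order TTT, TTF, ..., FFF.
rows = [(p, q, r, formula(p, q, r)) for p, q, r in product([True, False], repeat=3)]
false_rows = [(p, q, r) for p, q, r, value in rows if not value]

for p, q, r, value in rows:
    print(*("T" if v else "F" for v in (p, q, r, value)))
```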
Showing that two logical statements are equivalent can be done by showing that they have the
same truth table, as the following Proposition demonstrates.
Proposition 3.8
Let P, Q be propositions. The negations of the AND and OR statements are as follows:
¬(P ∧ Q) is equivalent to ¬P ∨ ¬Q,
¬(P ∨ Q) is equivalent to ¬P ∧ ¬Q.
Proof. It suffices to show that the expressions have equivalent truth tables. We will give the result
for the first identity, and leave the second as an exercise. The truth tables for the negation of the
AND statement are as follows:
P Q P ∧ Q ¬(P ∧ Q)        P Q ¬P ¬Q ¬P ∨ ¬Q
T T   T      F            T T  F  F    F
T F   F      T            T F  F  T    T
F T   F      T            F T  T  F    T
F F   F      T            F F  T  T    T
The resulting values of the truth table are identical, showing that these statements are in fact
equivalent.
To develop intuition for negating quantifiers, let’s think about how we would disprove a statement
involving a quantifier. For example, the universally quantified statement “every horse is black”
may be disproved by showing that there exists a non-black horse. Mathematically, if P (x) is “x is
a black horse,” then
the negation of ∀x, P (x) is ∃x, ¬P (x).
The existentially quantified statement “there exists a pink horse” is disproved by showing that
“every horse is not pink.” Mathematically, if P (x) is the statement “x is a pink horse,” then
the negation of ∃x, P (x) is ∀x, ¬P (x).
By thinking about the case of a general predicate P , the negation rules above still apply.
Example 3.9
Solution. This sentence is false. For example, if x = 1/2 then x^2 = 1/4, showing that x > x^2. The
negation of this sentence is
∃x ∈ R : x ≥ x^2.
Notice that our counter-example satisfies the negation of our sentence, as we would expect.
Example 3.10
Solution. From Example 3.4 we know that the given sentence can be stated mathematically as
∀x ∈ R, ∃y ∈ R, x + y = 0.
Applying our rules for negation, the negation of this sentence becomes
∃x ∈ R, ∀y ∈ R, x + y ≠ 0.
Translating this back into an English sentence, we have “There is a real number which has no
negative.”
3.4 Implications
At the core of mathematical statements are implications, which consist of ‘if-then’ statements. A
typical theorem contains a hypothesis and a conclusion, such that IF the hypothesis is TRUE,
THEN the conclusion is TRUE. This is a conditional statement that requires us to first check
the truth of the hypothesis before we can ascertain the truth of the conclusion, and is called an
implication because the veracity of the first statement implies the veracity of the second.
To frame this mathematically, let P and Q be predicates. The statement “If P then Q,” or
alternatively “P implies Q,” is written P ⇒ Q and has truth table given by Table 2.
IMPLICATION
P Q P ⇒Q
T T T
T F F
F T T
F F T
Carefully consider the bottom two rows of Table 2, which are known as vacuous truths. The idea
is that a universally false hypothesis P can have any implication it wants, since that implication
will never be tested. For example, consider the statement
This is a true statement because whenever pigs can fly then the sky is black. This may seem
artificial and contrived, but vacuous truths appear in mathematics frequently, so it is important to
be aware of how they are handled.
Example 3.11
Let D(x) be the predicate “x is a dog” and let A(x) be the predicate “x is an animal.”
Consider the truth of the implications D(x) ⇒ A(x), A(x) ⇒ D(x) and ¬A(x) ⇒ ¬D(x).
D(x) ⇒ A(x) This is the statement “If x is a dog then it is an animal” or “All
dogs are animals” and is TRUE.
A(x) ⇒ D(x) This is the statement “If x is an animal then it is a dog” or “All
animals are dogs.” A cat is an animal which is not a dog, so this
implication must be FALSE.
¬A(x) ⇒ ¬D(x) This is the statement “If x is not an animal then it is not a dog,”
and is a TRUE sentence. Indeed, if x is not an animal then it
could not be a dog. If it were a dog, then by our first implication,
it would be an animal and this would be a contradiction.
Definition 3.12
Let P and Q be predicates with the implication P ⇒ Q.
Example 3.11 shows that the converse of a true statement is not necessarily true. As for the
contrapositive, we have the following result:
Proposition 3.13
Proof. You should try proving this result on your own before proceeding further.
The truth table for the contrapositive ¬Q ⇒ ¬P is as follows:
P Q ¬P ¬Q ¬Q ⇒ ¬P
T T F F T
T F F T F
F T T F T
F F T T T
Comparing this to the truth table for P ⇒ Q given in Table 2, we see that the truth values are
identical as required.
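The same comparison can be carried out by brute force. In the Python sketch below, the helper `implies` encodes the truth table of P ⇒ Q as (not P) or Q, which matches Table 2 row by row. The check then compares an implication with its contrapositive ¬Q ⇒ ¬P and with the swapped implication Q ⇒ P on all four assignments.

```python
from itertools import product


def implies(p, q):
    """Truth value of P => Q: false only when P is true and Q is false."""
    return (not p) or q


assignments = list(product([True, False], repeat=2))

# P => Q versus (not Q) => (not P)
contrapositive_same = all(implies(p, q) == implies(not q, not p) for p, q in assignments)

# P => Q versus Q => P
converse_same = all(implies(p, q) == implies(q, p) for p, q in assignments)

print(contrapositive_same, converse_same)
```

The second comparison fails, in line with the earlier observation that "all animals are dogs" does not follow from "all dogs are animals".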
When both P ⇒ Q and its converse Q ⇒ P are true, we write P ⇔ Q and say that “P is true if
and only if Q is true.” This means that the statements P and Q are logically equivalent: whatever
truth value P (x) has, Q(x) will have the same. It is difficult at this point to give examples of “if
and only if” statements that are not just trivial restatements of one another, but some examples
might include:
• A triangle is isosceles if and only if exactly two of its angles are equal.
The words ‘necessary’ and ‘sufficient’ are often used to indicate the direction of an implication.
If P and Q are predicates, then
Example 3.14
P Q R P ∨ Q ¬Q ∧ R (P ∨ Q) ⇒ (¬Q ∧ R)
T T T T F F
T T F T F F
T F T T T T
T F F T F F
F T T T F F
F T F T F F
F F T F T T
F F F F F T
Example 3.15
Proof. We begin with the (⇒) direction, and assume that n is even so that n = 2k for some k ∈ N.
Squaring n gives
n^2 = 4k^2 = 2(2k^2)
To prove the (⇐) direction, we will proceed by contrapositive. Assume that n is not even, so
that n = 2k + 1 for some k ∈ N. Squaring n gives
n^2 = (2k + 1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1,
which is odd, so n^2 is not even.
In Example 3.11 we found that the statement A(x) ⇒ D(x), read as “Every animal is a dog,” was
false. To show that it was false we used the example of a cat, which is an animal but is not a dog.
More generally, if P and Q are predicates in a universe S, we say that x ∈ S is a counter-example
to P ⇒ Q if P (x) is true but Q(x) is not true; that is, P (x) ∧ ¬Q(x) is true. Counter-examples are
exactly how implications are negated.
Proposition 3.16
If P and Q are predicates, the negation of the implication P ⇒ Q is the statement P ∧ ¬Q.
Proof. The truth table for P ∧ ¬Q is given below, along with P ⇒ Q for reference
P Q ¬Q P ∧ ¬Q P Q P ⇒Q
T T F F T T T
T F T T T F F
F T F F F T T
F F T F F F T
Comparison of the last column of each table shows that the tables are negations of one another,
proving the result.
Example 3.17
Negate the following sentence: “If x is a duck, then x likes peanut butter.”
Solution. Here we have the predicates P (x) = “x is a duck” and Q(x) = “x likes peanut butter” with
the sentence above being the implication P ⇒ Q. The negation of this implication is P ∧ ¬Q, or
“x is a duck and x does not like peanut butter.”
Example 3.18
Negate this sentence to determine the mathematical statement that a limit does not exist.
Solution. Applying our rules for negation, the limit does not exist if the following sentence is
satisfied:
∀L ∈ R, ∃ε > 0, ∀δ > 0, ∃x ∈ R, |x − c| < δ and |f (x) − L| ≥ ε.
Let P and Q be predicates, and consider the problem of showing that the statement T : P ⇒ Q is
true. A proof by contradiction proceeds by assuming that T is false (or ¬T is true), and showing
that something bad happens. More specifically, if R is some other predicate which may not be
directly related to P or Q, then
¬T ⇒ (R ∧ ¬R) (3.4)
is a true statement. Here we recall that for any predicate R, R ∧ ¬R is always false, so we have
shown that the assumption that ¬T is true leads to a contradiction.
We can use a truth table to verify that the truth of T perfectly corresponds with the truth of
(3.4). Indeed
T R ¬T R ∧ ¬R ¬T ⇒ (R ∧ ¬R)
T T F F T
T F F F T
F T T F F
F F T F F
In the decimal expansion of π, one of the digits {0, 1, 2, . . . , 8, 9} occurs infinitely often.
Proof. For the sake of contradiction, assume that each of the above digits occurs only finitely many
times in the decimal expansion of π. Let Ni be the number of times the digit i appears, so that
the decimal expansion of π consists of
N0 + N1 + · · · + N9
digits. As each Ni is finite, so too is this sum. If π has only a finite decimal expansion, it is
necessarily rational. Since we know that π is not rational, this is a contradiction and we conclude
that some digit must occur infinitely often.
Note that this proof is not constructive: we do not know which digit occurs infinitely often. In
fact, it is an open problem whether each digit occurs infinitely often.
Proposition 3.20
Proof. For the sake of contradiction, assume that A ∩ (B \ A) ≠ ∅, so that there exists an
element x ∈ A ∩ (B \ A). By definition of the intersection, x ∈ A and x ∈ B \ A. However,
x ∈ B \ A implies that x ∉ A, contradicting the fact that x ∈ A. Hence A ∩ (B \ A) = ∅.
Proposition 3.21
The number √2 is irrational.
Proof. For the sake of contradiction, assume that √2 is rational and write √2 = p/q where
gcd(p, q) = 1; that is, p/q is in lowest terms. Hence q√2 = p and by squaring both sides we
get
2q^2 = p^2.
Notice that 2q^2 is even, and so therefore p^2 must also be even. By Example 3.15 we know that p
is therefore also even, so p = 2k for some k ∈ N. Substituting this back into our equation, we get
2q^2 = (2k)^2 = 4k^2 ⇔ q^2 = 2k^2
so that similarly, q^2 is even. This implies that q is even, so q = 2ℓ for some ℓ ∈ N. However, this is a contradiction.
We assumed that p and q were written in lowest terms, but have demonstrated that both are even.
Hence √2 is not rational and so must be irrational.
Example 3.22
Solution. Suppose for the sake of contradiction that a solution exists. Note that we can factor the
left hand side, giving x^2 − 4y^2 = (x − 2y)(x + 2y) = 7. Since 7 is prime, its two factors are either
−1, −7 or 1, 7, but we can throw away the negative factors since x + 2y > 0.
Thus we either have x − 2y = 1 and x + 2y = 7, or x − 2y = 7 and x + 2y = 1. In either case, if
we add the two equations together we get x = 4. In the first case, this implies that y = 3/2 which
is not possible. In the latter case, y = −3/2, which is also not possible. Hence we’ve arrived at a
contradiction, and no solutions can exist.
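A brute-force search is no substitute for the argument above, but it is a cheap sanity check. The Python sketch below assumes that the example concerns integer solutions of x^2 − 4y^2 = 7 (the statement of the example is inferred from its solution), and the search window of ±100 is an arbitrary choice.

```python
# Search a finite window of integer pairs for solutions of x^2 - 4y^2 = 7.
BOUND = 100  # arbitrary search radius, chosen only for illustration

solutions = [
    (x, y)
    for x in range(-BOUND, BOUND + 1)
    for y in range(-BOUND, BOUND + 1)
    if x * x - 4 * y * y == 7
]

print(solutions)
```

The search comes back empty, as the proof predicts.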
Example 3.23
Solution. Suppose that a solution exists, and write it in lowest terms as x = a/b. Substituting in
we get
a^3/b^3 + a^2/b^2 = 1 ⇒ a^3 + a^2 b = b^3.
Now we have three cases: Either both a, b are odd; a is even and b is odd; or a is odd and b is even.
Note that both cannot be even, as we’ve assumed a/b is in lowest terms.
1. If both are odd, then a^3, a^2 b, and b^3 are all odd. But this cannot happen, since if a^3 and a^2 b
are odd, then a^3 + a^2 b is even.
2. If a is even and b is odd, a^3 is even, a^2 b is even, and b^3 is odd. This leads to the same problem,
as a^3 + a^2 b is then even.
3. If a is odd and b is even, then a^3 is odd, a^2 b is even, and b^3 is even. But then a^3 + a^2 b is odd,
which is a contradiction.
Show that the function f(x) = 3x/(x + 4) is injective; namely, if a ≠ b then f(a) ≠ f(b).
Solution. Proceeding by contrapositive, it is sufficient to show that whenever f(a) = f(b) then
a = b:
f(a) = f(b) ⇔ 3a/(a + 4) = 3b/(b + 4)
⇔ 3a(b + 4) = 3b(a + 4)
⇔ 3ab + 12a = 3ab + 12b
⇔ 12a = 12b
⇔ a = b.
This is precisely what we wanted to show, so the result follows.
Example 3.25
Solution. We know that √2 is irrational (Proposition 3.21). If √2^√2 is rational
we are done, setting a = b = √2. Otherwise, it is irrational, in which case
(√2^√2)^√2 = (√2)^(√2 · √2) = (√2)^2 = 2
works, with a = √2^√2 and b = √2.
Definition 3.26
If a, b ∈ Z we say that a|b (read “a divides b”) if there exists k ∈ Z such that ak = b.
For example, 5|35 since 5 · 7 = 35, while 2 ∤ 5 since there is no integer k for which 2k = 5.
Proposition 3.27
Proof. Our hypotheses indicate that a|b and a|c, so there exist k, ℓ ∈ Z such that ak = b and aℓ = c.
Using these equations, we can write
Proposition 3.28
Proof. By assumption, there exist k, ℓ ∈ Z such that ak = b and aℓ = b + c. Using the latter equation,
we can write
c = aℓ − b = aℓ − ak = a(ℓ − k)
showing that a|c as required.
Recall that p ∈ Z is a prime if p > 1 and its only positive factors are 1 and p. A number which is not prime is
called composite, and necessarily has non-trivial factors other than 1 and itself.
Proposition 3.29
Proof. For the sake of contradiction, assume that not every natural number can be written as a product of
primes. In particular, there must be a smallest such number, say n. This number cannot be prime
itself, otherwise it is trivially a product of primes, so n is necessarily composite and can be written
as n = rs for 1 < r ≤ s < n.
Both r, s < n, and since n is the smallest number that cannot be written as a product of primes,
both r and s must be writable as products of primes. However, combining those primes then gives
a decomposition of n into a product of primes, which contradicts our assumption. We conclude the
result, as required.
Solution. For the sake of contradiction, assume that there are only finitely many primes, and list
them as p1 , p2 , . . . , pn . Consider the number x = p1 p2 · · · pn + 1. This number is larger than any of
the given primes, and hence cannot be prime itself.
We claim that x cannot be written as a product of prime numbers. Indeed, suppose that pk
were a factor of x, so that pk |x. Since pk |p1 p2 · · · pn , by Proposition 3.28 we would have pk |1, and
this is not possible. Hence no prime can be a factor of x, and so x cannot be written as a product
of primes. This contradicts Proposition 3.29, so our original assumption must have been false; that
is, there are infinitely many primes.
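The construction in this proof can be tried out on any finite list of primes. In the Python sketch below, the list [2, 3, 5, 7, 11, 13] is a hypothetical "complete list" chosen only for illustration, and `smallest_prime_factor` is a helper of my own using trial division.

```python
from math import prod


def smallest_prime_factor(m):
    """Smallest prime dividing the integer m >= 2, found by trial division."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m  # no divisor up to sqrt(m), so m itself is prime


primes = [2, 3, 5, 7, 11, 13]   # pretend this were the list of all primes
x = prod(primes) + 1            # the number considered in the proof
p = smallest_prime_factor(x)

print(x, p, [x % q for q in primes])
```

Every prime in the list leaves remainder 1 when dividing x, so the prime factor p found here must lie outside the list, which is exactly the contradiction the proof exploits.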
4 Induction
Mathematical induction is a proof technique used to show that a result holds for every natural
number. It operates on the domino principle, by creating a chain of implications which extends
to every natural number. For example, suppose N is our universe and we want to show that P (n)
is true for every n ∈ N. If we can show that P (1) is true, and that P (k) ⇒ P (k + 1) for every k, then the result
holds for any n. This is precisely because
P (1) ⇒ P (2) ⇒ P (3) ⇒ · · ·
Mathematical Induction
Let P be some predicate. If P (1) is true, and P (k) ⇒ P (k + 1) for any k, then P (n) is true
for all n ∈ N
Thus mathematical induction consists of two steps. The first is to demonstrate the base case
that P (1) is true. The second is to invoke the induction hypothesis that P (k) is true for some k,
and demonstrate that P (k) ⇒ P (k + 1).
Example 4.1
Solution.
1. Base Case: The smallest number for which this occurs is n = 1, and in this case we have
2n + 2 = 4 and 4n = 4, so the result holds in the base case.
2. Induction Step: Assume that 2k + 2 ≤ 4k for some natural number k. We want to show
that 2(k + 1) + 2 ≤ 4(k + 1). Indeed, notice that
4(k + 1) = 4k + 4
≥ (2k + 2) + 2   using the induction hypothesis 4k ≥ 2k + 2 (and 4 ≥ 2)
= 2k + 4 = 2(k + 1) + 2.
We conclude from the induction principle that 2k + 2 ≤ 4k for all k ∈ N.
Example 4.2
Solution.
1. Base Case: When n = 1 one has 2 ≤ 2 which is a true statement, so the base case holds.
2. Induction Step: Assume that for some n we know that 2n ≤ 2^n. Now
2^{n+1} = 2(2^n)
≥ 2(2n) = 4n   by the induction hypothesis
≥ 2n + 2   by Example 4.1
= 2(n + 1)
which is what we wanted to show.
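Before attempting an induction proof, it is often worth testing the claim on many values to make sure it is plausible. The Python sketch below checks the two inequalities 2n + 2 ≤ 4n and 2n ≤ 2^n for the first thousand natural numbers. The cutoff of 1000 is arbitrary, and a finite check like this never replaces the induction argument.

```python
LIMIT = 1000  # arbitrary cutoff, for illustration only

# The inequality from Example 4.1: 2n + 2 <= 4n
linear_ok = all(2 * n + 2 <= 4 * n for n in range(1, LIMIT + 1))

# The inequality proved above: 2n <= 2^n
exponential_ok = all(2 * n <= 2 ** n for n in range(1, LIMIT + 1))

print(linear_ok, exponential_ok)
```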
Example 4.3
Show that the triangle inequality extends to more than two variables; namely,
|x1 + · · · + xn | ≤ |x1 | + · · · + |xn |.
Solution. The base case occurs when n = 2, since this is the first instance in which the inequality
makes sense. We have already proven this though, so the base case is done.
Assume then that |x1 + · · · + xn | ≤ |x1 | + · · · + |xn |, and notice that
|x1 + · · · + xn + xn+1 | = |(x1 + · · · + xn ) + xn+1 |
≤ |x1 + · · · + xn | + |xn+1 | by the base case
≤ |x1 | + · · · + |xn | + |xn+1 | by the induction hypothesis
giving the desired result.
Example 4.4
Solution.
1. Base Case: The simplest case is k = 1, for which we see that 6^k − 1 = 5. Clearly 5|5 since
5/5 = 1, so the base case is satisfied.
2. Induction Step: For some positive integer k, assume that 5|6^k − 1. By this hypothesis,
there is some integer d such that (6^k − 1)/5 = d. Consider 6^{k+1} − 1
which we may write as
6^{k+1} − 1 = 6(6^k) − 1
= (1 + 5)(6^k) − 1
= 5(6^k) + (6^k − 1)
We claim that 5 divides this number. To see that this is the case, let us divide by 5 and see
what we get.
(1 + x)^n ≥ 1 + nx.
(1 + x)^{n+1} = (1 + x)^n (1 + x)
≥ (1 + nx)(1 + x) = 1 + x + nx + nx^2
= 1 + (n + 1)x + nx^2
≥ 1 + (n + 1)x
where we have used the fact that nx^2 ≥ 0. This is what we wanted to show, so the inequality is
true.
Definition 4.6
If S is a set, we denote by P(S) the power set of S, which is the collection of all subsets of
S.
Example 4.7
Solution. There are actually many ways to show this is true. The simplest is the following: To
count the number of elements in P(S), note that each element x ∈ S is either in a subset, or not
in a subset. Hence each x has two possible states it can be in. The number of ways to choose a state
for every element is therefore 2^{|S|}, and we are done.
However, we can proceed by induction on |S| instead. If |S| = 0 then S = ∅, and
P(S) = {∅} has size 2^0 = 1. The base case is thus true.
Now assume that the number of subsets of a set with n elements is 2^n, and let S have n + 1
elements
S = {s_1, . . . , s_n, s_{n+1}}.
Notice that every subset of S either contains s_{n+1} or does not. Of those that do contain s_{n+1},
there are 2^n possible subsets (corresponding to the subsets of {s_1, . . . , s_n}). Similarly, of those that
do not contain s_{n+1} there are also 2^n such subsets. All together, there are 2^n + 2^n = 2^{n+1} such
subsets, as required.
As a brief aside, one sometimes denotes the power set by 2^S for this reason. Hence 2^N and 2^R
are the power sets of N and R respectively. There is yet another reason why this notation is great.
In general, if A and B are sets, then
A^B = {f : B → A};
that is, the set of all functions from B to A. It is possible to show that the subsets of S are in
one-to-one correspondence with the functions f : S → {0, 1}, wherein if T ⊆ S, then we define
f_T(x) = 1 if x ∈ T, and f_T(x) = 0 if x ∉ T,
and so P(S) = {0, 1}^S = 2^S, where we identify 2 = {0, 1}, since this set has two elements.
Example 4.8
Show that for all n ∈ N a $2^n \times 2^n$ chessboard with a single tile removed can be L-tiled; that
is, tiled by an L-shape consisting of three squares.
Solution. The base case is when n = 1, in which we are tasked with tiling a 2 × 2 chessboard with
one tile removed. What remains is itself an L-shape, so this is immediately possible.
Assume then that any $2^n \times 2^n$ board with a single tile removed admits an L-tiling, and consider
a $2^{n+1} \times 2^{n+1}$ board with a single tile removed. Divide this board into four quarters, so that each quarter has dimension
$2^n \times 2^n$. By rotating the board if necessary, assume that the missing tile is located in the upper-left
quadrant. We place our first tile as illustrated in Figure 9. Note that, treating the squares covered by
the first tile as removed, every quadrant is now a $2^n \times 2^n$ board with a single tile removed. By the induction
hypothesis, each quadrant admits an L-tiling, so those tilings combined give the tiling of the $2^{n+1} \times 2^{n+1}$
board.
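The induction step is constructive, so it can be turned into a recursive procedure. The sketch below is one possible rendering, not the author's own: the name `l_tile`, the grid representation, and the labelling of tiles by integers are all assumptions made for illustration. Instead of rotating the board, it simply checks which quadrant contains the missing square.

```python
from typing import List


def l_tile(n: int, hole_row: int, hole_col: int) -> List[List[int]]:
    """Tile a 2^n x 2^n board with one square removed.

    Returns a grid in which -1 marks the removed square and every other
    cell holds the (positive) label of the L-shaped tile covering it.
    """
    size = 2 ** n
    board = [[0] * size for _ in range(size)]
    board[hole_row][hole_col] = -1
    next_label = [0]  # mutable counter shared by the recursive calls

    def fill(top: int, left: int, side: int, hr: int, hc: int) -> None:
        # Tile the side x side square at (top, left) whose cell (hr, hc) is already taken.
        if side == 1:
            return
        next_label[0] += 1
        label = next_label[0]
        half = side // 2
        mid_r, mid_c = top + half, left + half
        # Each quadrant, together with its cell that touches the centre of the square.
        quadrants = [
            (top, left, mid_r - 1, mid_c - 1),
            (top, mid_c, mid_r - 1, mid_c),
            (mid_r, left, mid_r, mid_c - 1),
            (mid_r, mid_c, mid_r, mid_c),
        ]
        for q_top, q_left, cr, cc in quadrants:
            if q_top <= hr < q_top + half and q_left <= hc < q_left + half:
                # This quadrant already has a taken cell: recurse directly.
                fill(q_top, q_left, half, hr, hc)
            else:
                # The central L-tile covers this quadrant's centre cell.
                board[cr][cc] = label
                fill(q_top, q_left, half, cr, cc)

    fill(0, 0, size, hole_row, hole_col)
    return board


if __name__ == "__main__":
    for row in l_tile(2, 0, 1):
        print(" ".join(f"{cell:2d}" for cell in row))
```

The three centre cells labelled in each call always sit in a common 2 × 2 block, so each label marks a genuine L-shape.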
Figure 9: Left: The base case. A 2 × 2 board with a single tile removed is an L-shape, and so admits
an L-tiling. Right: The induction step. By placing the first tile as such, each of the quadrants is a
$2^n \times 2^n$ board with a single tile removed.
4.1 Summation and Product Notation
Sigma notation is used to make complicated sums much easier to write down. In particular, we use
a summation index to iterate through elements of a list and then sum them together. Consider the
expression
$$\sum_{i=n}^{m} r_i \tag{4.1}$$
which is read as “the sum from $i = n$ to $m$ of $r_i$.” The element $i$ is known as the dummy or
summation index, $n$ and $m$ are known as the summation bounds, and $r_i$ is the summand. In order
to decipher this rather cryptic notation, we adhere to the following algorithm:
For those computer savvy students out there, this is nothing more than a for-loop. Interpreting
(4.1) we thus have
$$\sum_{i=n}^{m} r_i = r_n + r_{n+1} + r_{n+2} + \cdots + r_m.$$
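To make the for-loop remark concrete, here is a minimal sketch in Python. The name `sigma` and the choice to pass the summand as a function `r` are assumptions for illustration only.

```python
from typing import Callable


def sigma(r: Callable[[int], float], n: int, m: int) -> float:
    """Compute r(n) + r(n+1) + ... + r(m) with an explicit loop."""
    total = 0
    for i in range(n, m + 1):  # i is the summation index, running from n to m inclusive
        total += r(i)          # r(i) plays the role of the summand r_i
    return total


if __name__ == "__main__":
    print(sigma(lambda i: i, 1, 100))
```

If m < n the loop body never runs and the sum is 0, which matches the usual convention for an empty sum.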
Example 4.9
1. Base Case: Here we check the easiest possible case, which corresponds to $k = 1$. When
$k = 1$ the left-hand side becomes
$$\sum_{n=1}^{1} \frac{1}{n(n+1)} = \frac{1}{1(1+1)} = \frac{1}{2},$$
while the right-hand side is $\frac{1}{1+1} = \frac{1}{2}$. Clearly both sides agree, so the base case is satisfied.
2. Induction Step: Let $k$ be some fixed but arbitrary number and assume that
$$\sum_{n=1}^{k} \frac{1}{n(n+1)} = \frac{k}{k+1}.$$
We would like to show that the result holds for $k + 1$. It makes most sense to start by working
with the left-hand side, since it will give us the most “flexibility.” Notice that
$$\begin{aligned}
\sum_{n=1}^{k+1} \frac{1}{n(n+1)} &= \left.\frac{1}{n(n+1)}\right|_{n=k+1} + \sum_{n=1}^{k} \frac{1}{n(n+1)} \\
&= \frac{1}{(k+1)(k+2)} + \frac{k}{k+1} && \text{via the induction hypothesis} \\
&= \frac{1 + k(k+2)}{(k+1)(k+2)} = \frac{k^2 + 2k + 1}{(k+1)(k+2)} && \text{common denominator} \\
&= \frac{(k+1)^2}{(k+1)(k+2)} = \frac{k+1}{k+2} && \text{factoring and cancelling} \\
&= \frac{(k+1)}{(k+1)+1}.
\end{aligned}$$
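A formula proved by induction can also be sanity-checked numerically. The sketch below, which assumes exact rational arithmetic via Python's `fractions` module and uses the illustrative name `partial_sum`, compares both sides for small values of $k$. Such a check is no substitute for the proof, but it catches algebra slips.

```python
from fractions import Fraction


def partial_sum(k: int) -> Fraction:
    """Left-hand side: the sum of 1/(n(n+1)) for n = 1, ..., k, computed exactly."""
    return sum(Fraction(1, n * (n + 1)) for n in range(1, k + 1))


if __name__ == "__main__":
    for k in range(1, 6):
        # Print the left-hand side next to the claimed closed form k/(k+1).
        print(k, partial_sum(k), Fraction(k, k + 1))
```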
Example 4.10
$$\begin{aligned}
\sum_{k=1}^{n+1} k^2 &= \sum_{k=1}^{n} k^2 + (n+1)^2 \\
&= \frac{n(n+1)(2n+1)}{6} + (n^2 + 2n + 1) && \text{by Induction Hypothesis} \\
&= \frac{(2n^3 + 3n^2 + n) + (6n^2 + 12n + 6)}{6} \\
&= \frac{2n^3 + 9n^2 + 13n + 6}{6} \\
&= \frac{(n+1)(n+2)(2n+3)}{6},
\end{aligned}$$
which is precisely the correct equation for $n + 1$.
Pi notation works in precisely the same way, except that instead of adding we multiply:
$$\prod_{i=1}^{n} r_i = r_1 r_2 r_3 \cdots r_{n-1} r_n.$$
Example 4.11
Show that
$$\prod_{r=2}^{n} \left(1 - \frac{1}{r^2}\right) = \frac{n+1}{2n}. \tag{4.3}$$
Solution. In the base case we have n = 2, so the left hand side is 1 − 1/4 = 3/4, while the right
hand side is (2 + 1)/(2 · 2) = 3/4. These are equal, so the base case holds.
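$$\begin{aligned}
\prod_{r=2}^{n+1} \left(1 - \frac{1}{r^2}\right) &= \left[\prod_{r=2}^{n} \left(1 - \frac{1}{r^2}\right)\right] \left(1 - \frac{1}{(n+1)^2}\right) \\
&= \frac{n+1}{2n} \cdot \frac{n^2 + 2n}{(n+1)^2} \\
&= \frac{n(n+1)(n+2)}{2n(n+1)^2} \\
&= \frac{n+2}{2(n+1)}
\end{aligned}$$
exactly as desired.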
4.2 More General Induction
The principle of induction can be extended beyond just N. For example, suppose we want to show
that the predicate P (n) is true for all even numbers greater than or equal to 10. Here the base case
is to demonstrate P (10), followed by P (2n) ⇒ P (2n + 2). This creates the chain of implications
demonstrating P (n) for all even numbers at least 10. This idea is easily generalized to any other
induction scheme. Of course, this can be seen as equivalent to ordinary induction by letting Q(n) be the
statement “P(2n) is true.”
An ostensibly different type of induction is that of strong induction. We again aim to create a
chain of implications, but now we make a stronger induction hypothesis.
Theorem 4.12: Strong Induction
Proof. We will proceed by using (normal) induction. Let Q(k) = P (1) ∧ · · · ∧ P (k). Since Q(1) =
P (1), the base case is true. Now assume that Q(n) is true. Since Q(n) ⇒ P (n + 1), then Q(n + 1)
is true as well. By induction, Q(n) holds for all n, and this is only possible if all P (n) are true, as
required.
Hence (Induction) ⇒ (Strong Induction). Moreover, since normal induction uses a weaker
hypothesis, we see that (Strong Induction) ⇒ (Induction). This shows that induction and strong
induction are actually equivalent.
Example 4.13
Show that any postage amount greater than 8 cents can be formed by using 3 cent and 5
cent stamps.
Solution. Let P(n) be the statement “A postage of n cents can be made of 3 and 5 cent stamps.”
As our base cases, note that 8 = 3 + 5, 9 = 3 + 3 + 3, and 10 = 5 + 5, so P(8), P(9), and P(10) are true.
Now assume that P(k) is true for all 8 ≤ k ≤ n, where n ≥ 10, for which we will show that P(n + 1) is true.
Indeed, notice that we can write n + 1 = (n − 2) + 3. Since 8 ≤ n − 2 ≤ n, our induction hypothesis tells us that
a postage of n − 2 cents can be resolved, say by r three-cent stamps and s five-cent stamps. Thus
n + 1 = 3(r + 1) + 5s,
as required.
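The strong induction argument translates directly into a recursive procedure: to make n cents, make n − 3 cents and add one three-cent stamp. The following is a sketch under the assumption that the base amounts are 8, 9 and 10 cents; the function name `stamps` is hypothetical.

```python
from typing import Tuple


def stamps(n: int) -> Tuple[int, int]:
    """Return (r, s) with 3*r + 5*s == n, for any n >= 8."""
    if n < 8:
        raise ValueError("the argument only covers postage of at least 8 cents")
    # Assumed base cases: 8 = 3 + 5, 9 = 3 + 3 + 3, 10 = 5 + 5.
    base_cases = {8: (1, 1), 9: (3, 0), 10: (0, 2)}
    if n in base_cases:
        return base_cases[n]
    # Induction step: n = (n - 3) + 3, so reuse the answer for n - 3
    # and add one more three-cent stamp.
    r, s = stamps(n - 3)
    return r + 1, s


if __name__ == "__main__":
    for n in range(8, 16):
        print(n, stamps(n))
```

Three base cases are needed because the step reaches back three amounts, so each residue class modulo 3 must be started separately.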
Example 4.14
Consider a two-player game, consisting of two bowls of marbles. Each player takes a turn
removing any positive number of marbles from a single bowl. The player that removes the
last marble wins. Show that if both bowls have an identical number of marbles, the player
who goes second always has a winning strategy.
Solution. Let P(n) be the statement “Player Two wins when both bowls have n marbles.” The base
case is n = 1, in which both bowls have a single marble. Player One must remove the
marble from one of the bowls. This leaves only one bowl with one marble, so Player Two takes it and wins.
Now assume that P(k) is true for all 1 ≤ k ≤ n, for which we will demonstrate P(n + 1). Player
One must go first, and so removes ℓ marbles from one bowl, leaving a bowl with n + 1 marbles
and one with n + 1 − ℓ marbles. Player Two now moves by removing ℓ marbles from the other
bowl, leaving each bowl with n + 1 − ℓ marbles. If ℓ = n + 1, then Player Two has just removed the last
marble and wins. Otherwise the game has been reduced to the game with 1 ≤ n + 1 − ℓ ≤ n marbles
in each bowl, and we know P(n + 1 − ℓ), so Player Two has a winning strategy.
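The mirroring strategy can be tried out in a small simulation. This is only an illustrative sketch: Player One is assumed to play uniformly random legal moves, and the function name `play` and the `seed` parameter are invented for the example.

```python
import random


def play(n: int, seed: int = 0) -> int:
    """Simulate the game with n >= 1 marbles in each bowl; return the winner (1 or 2)."""
    rng = random.Random(seed)
    bowls = [n, n]
    while True:
        # Player One: any legal move (here chosen at random).
        bowl = rng.choice([i for i in (0, 1) if bowls[i] > 0])
        take = rng.randint(1, bowls[bowl])
        bowls[bowl] -= take
        if bowls == [0, 0]:
            return 1  # Player One removed the last marble
        # Player Two: mirror the move in the other bowl, restoring equal bowls.
        bowls[1 - bowl] -= take
        if bowls == [0, 0]:
            return 2  # Player Two removed the last marble


if __name__ == "__main__":
    print({play(n, seed) for n in range(1, 10) for seed in range(20)})
```

Before each of Player One's turns the bowls are equal and non-empty, so Player One can never empty both at once; this is the invariant that the induction hypothesis packages.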
4.2.1 Recursion
Recursive sequences are those sequences whose elements depend explicitly upon previous entries.
For example, the well known Fibonacci sequence is defined as $x_1 = x_2 = 1$ and $x_n = x_{n-1} + x_{n-2}$.
Unfortunately, computing $x_n$ means computing $x_k$ for all $k \leq n$. More appealing would be to find
a closed form solution for $x_n$.
The problem of determining a closed form solution can be rather tricky, and is often relegated
to the realm of combinatorial enumeration. However, given a closed form we can use induction to
verify the result.
Example 4.15
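$$x_k = 5x_{k-1} - 6x_{k-2}.$$
$$x_1 = 2^1 + 3^0 = 3, \qquad x_2 = 2^2 + 3 = 7$$
which agree with our initial configuration. Now assume that $x_k = 2^k + 3^{k-1}$ for all $1 \leq k \leq n$.
Examining $x_{n+1}$ we have
$$\begin{aligned}
x_{n+1} &= 5x_n - 6x_{n-1} \\
&= 5(2^n + 3^{n-1}) - 6(2^{n-1} + 3^{n-2}) \\
&= (5 \cdot 2^n - 3 \cdot 2^n) + (5 \cdot 3^{n-1} - 2 \cdot 3^{n-1}) \\
&= 2 \cdot 2^n + 3 \cdot 3^{n-1} \\
&= 2^{n+1} + 3^n,
\end{aligned}$$
exactly as desired.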
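The contrast between iterating a recurrence and evaluating a closed form is easy to see in code. The sketch below assumes the recurrence $x_k = 5x_{k-1} - 6x_{k-2}$ with $x_1 = 3$ and $x_2 = 7$, as the computations above indicate; the function names are chosen for illustration.

```python
def x_recursive(n: int) -> int:
    """Compute x_n by stepping through the recurrence x_k = 5 x_{k-1} - 6 x_{k-2}."""
    previous, current = 3, 7  # x_1 and x_2
    if n == 1:
        return previous
    for _ in range(n - 2):
        # Advance one step: (x_{k-1}, x_k) -> (x_k, x_{k+1}).
        previous, current = current, 5 * current - 6 * previous
    return current


def x_closed(n: int) -> int:
    """Evaluate the closed form x_n = 2^n + 3^(n-1) directly."""
    return 2 ** n + 3 ** (n - 1)


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, x_recursive(n), x_closed(n))
```

The loop needs every earlier term, while the closed form jumps straight to $x_n$.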
Example 4.16
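owing to the fact that these are the roots of the polynomial $x^2 - x - 1$. It can also be verified by
straightforward computation:
$$1 + \alpha_\pm = \frac{1 \pm \sqrt{5}}{2} + 1 = \frac{3 \pm \sqrt{5}}{2}$$
and
$$\alpha_\pm^2 = \frac{1 \pm 2\sqrt{5} + 5}{4} = \frac{6 \pm 2\sqrt{5}}{4} = \frac{3 \pm \sqrt{5}}{2}.$$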
4.3 Fallacies
One must think carefully about both the base case and induction step, and ensure that everything
is being done correctly. There are some subtly wrong arguments that can be made.
Example 4.17
Solution. Let P(n) be the statement “$x_n$ is even.” Clearly $x_0 = 0$ is even, so the base case P(0) is true.
Now assume that P(k) is true for all 0 ≤ k ≤ n. Since $x_{n+1} = x_n + x_{n-1}$ is the sum of two even
numbers, it is even.
This proof is certainly wrong, since $x_1 = 1$ and $x_3 = 3$, so what happened? Since $x_n$ depends
upon both $x_{n-1}$ and $x_{n-2}$, we must check two base cases.
Example 4.18
Solution. Let P(n) be the statement “Any group of n horses all have the same colour.” Clearly
P(1) is true since there is only a single horse in the collection. Now assume that any group of n
horses has the same colour, and let $H = \{h_1, \ldots, h_{n+1}\}$ be a set of n + 1 horses. Break this into
the subsets
$$H_1 = \{h_1, \ldots, h_n\}, \qquad H_2 = \{h_2, \ldots, h_{n+1}\}.$$
Each set $H_i$ has only n horses, so by the induction hypothesis, all horses in each group are the
same colour. Since $H_1 \cap H_2 \neq \emptyset$, there is a common horse in both $H_1$ and $H_2$, implying that every
horse in $H = H_1 \cup H_2$ has the same colour.
Here the issue is the induction step. The assumption $H_1 \cap H_2 \neq \emptyset$ fails when n = 1, since in
that case $H = \{h_1, h_2\}$, making $H_1 = \{h_1\}$ and $H_2 = \{h_2\}$.
5 Bijections and Cardinality
5.1 Injective and Surjective Functions
Injectivity is a powerful property of functions. In our context, it will facilitate discussion of inverse
functions, to be taken up in Section 5.2.
Definition 5.1
A function f : S → T is said to be injective or one-to-one if whenever f (s1 ) = f (s2 ) then
s1 = s2 .
The output of an injective function uniquely corresponds to the input; that is, the only way for
two outputs to be equal ($f(s_1) = f(s_2)$) is for the inputs to have also been equal ($s_1 = s_2$). This is
also alluded to in the phrase one-to-one.
When S and T are both subsets of the real numbers, one can test whether a function f is injective
by applying the Horizontal Line Test to the graph of f. A function satisfies the Horizontal Line
Test if whenever we draw a horizontal line in the plane, it intersects the graph of the
function at most once.
A third perspective is to view a function as a collection of arrows as in Figure 10. In this case,
a function is injective if every element of the codomain has at most one arrow pointing to it.
Example 5.2
Consider the functions $f(x) = x^2$, $g(x) = 1/x$, and $h(x) = 1 + x$. Determine which, if any,
of these functions is injective.
Solution. We claim that $f(x) = x^2$ is not injective; that is, we can find two different points $x_1 \neq x_2$
such that $f(x_1) = f(x_2)$. Indeed, notice that $f(-1) = (-1)^2 = 1$ and $f(1) = (1)^2 = 1$ so that
$f(-1) = f(1)$. If $f(x)$ were injective, Definition 5.1 would imply that −1 = 1, and this is certainly
not the case. In fact, it is not too hard to convince ourselves that for any non-zero real number r,
we have $f(r) = f(-r)$ since $r^2 = (-r)^2$, but $r \neq -r$.
Figure 10: If f : X → Y is an injective function, each element of the codomain Y has at most one
arrow pointing at it.
On the other hand, the function g is injective. Assume that g(x) = g(y), which by definition of
g tells us that 1/x = 1/y. By taking the reciprocal of both sides we get x = y, and this is what
we wanted to show.
Finally, the function h(x) is injective, since if h(x) = h(y) then 1 + x = 1 + y. By subtracting
1 from both sides, we get x = y as required.
Proposition 5.3
Solution. Assume that h(x) = h(y) for some x, y ∈ A. By definition of h we have f (g(x)) = f (g(y)).
Since the function f is injective, the only way f (m1 ) = f (m2 ) is if m1 = m2 , so f (g(x)) = f (g(y))
implies that g(x) = g(y). Since g is also injective, it must be the case that x = y. Thus we have
shown that if h(x) = h(y) then x = y, showing that h is injective as required.
Proposition 5.4
Solution. Assume that $g(a_1) = g(a_2)$. Applying f to both sides gives $f(g(a_1)) = f(g(a_2))$. Since
the composition f ◦ g is injective, this means that $a_1 = a_2$, which is what we wanted to show.
The dual notion to an injective function is a surjective function, and this duality will be made
clearer in Section 5.2.
Definition 5.5
A function f : S → T is said to be surjective or onto if for every element t ∈ T , there is an
element s ∈ S such that f (s) = t.
Figure 11: If f : X → Y is surjective, then every element of the codomain has at least one arrow
pointing at it.
When thinking about surjective functions, the idea to keep in mind is that every element in T is
the image of something in S. Put another way, if f maps elements of S to elements of T , everything
in T is hit by something in S. If we describe a function as arrows between sets, a function f is
surjective if everything in T has an arrow pointing to it.
Example 5.6
Of the following functions which map R → R, determine which maps are surjective:
$$f(x) = x^2, \qquad g(x) = \frac{1}{x}, \qquad h(x) = 1 + x.$$
Solution. For functions R → R, a surjective function is the same as a function whose range is all of
R. The function $f(x) = x^2$ is therefore not surjective since the range of f(x) is [0, ∞). Similarly,
g(x) has range R \ {0}, so is not surjective. However, the function h(x) is surjective: if y is any real
number, we have that y is hit by y − 1, since
h(y − 1) = 1 + (y − 1) = y.
Thus the range of h(x) is all of R and we conclude that h(x) is surjective.
Proposition 5.7
Solution. Let c ∈ C be an arbitrary element, for which we need to find an a ∈ A such that
f (g(a)) = c. Since f is surjective, there exists b ∈ B such that f (b) = c. Since g is surjective, there
exists a ∈ A such that g(a) = b. Now f (g(a)) = f (b) = c as required.
Leaving functions we might see in calculus, the following are some further examples:
1. The function f : R → R given by f(x) = sin(x) is neither injective nor surjective. Indeed,
sin(0) = sin(π) and f(R) = [−1, 1]. With codomain R, no amount of finagling can make f surjective, but we can
restrict the domain to ensure that f is injective. An interval of length π is the largest we can
take, and a common choice is [−π/2, π/2].
2. The function $d : R \to R^2$, x ↦ (x, x) is injective but not surjective. It is injective since if
d(x) = d(y) then (x, x) = (y, y) and equating any component gives x = y. On the other hand,
there is no point in the domain such that d(x) = (0, 1).
3. The function $p : R^2 \to R$, (x, y) ↦ x is surjective but not injective. It fails to be injective
since $p(x, y_1) = x = p(x, y_2)$ for any $y_1, y_2$. On the other hand, if $x_0 \in R$ then $p(x_0, 0) = x_0$,
showing that the map is surjective.
4. Let $\mathrm{Poly}_R$ be the polynomials with real coefficients, and define $\mathrm{ev}_0 : \mathrm{Poly}_R \to R$ as $\mathrm{ev}_0(p) = p(0)$.
This map is surjective but not injective. Indeed, $\mathrm{ev}_0(x^2 + a) = a = \mathrm{ev}_0(x + a)$ for any
a ∈ R, showing both surjectivity and the failure of injectivity at the same time.
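For functions between finite sets, both definitions can be checked by brute force. The helpers below are a sketch with invented names; they assume the domain is given as a collection of distinct elements, and they say nothing about functions on all of R beyond the finite samples tested.

```python
from itertools import product
from typing import Callable, Hashable, Iterable


def is_injective(f: Callable, domain: Iterable[Hashable]) -> bool:
    """True if no two distinct elements of `domain` share an image under f."""
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)


def is_surjective(f: Callable, domain: Iterable[Hashable], codomain: Iterable[Hashable]) -> bool:
    """True if every element of `codomain` is hit by some element of `domain`."""
    return set(codomain) <= {f(x) for x in domain}


if __name__ == "__main__":
    points = range(-2, 3)
    plane = list(product(points, points))
    # Finite analogues of the diagonal map d and the projection p from the list above.
    print(is_injective(lambda x: (x, x), points), is_surjective(lambda x: (x, x), points, plane))
    print(is_injective(lambda xy: xy[0], plane), is_surjective(lambda xy: xy[0], plane, points))
```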
Definition 5.8
A function f : S → T is bijective if it is both injective and surjective.
5.2 Inverse Functions
The word “inverse” has many different meanings depending on the context in which it is used. For
example, what if we were to ask the student to find the inverse of the number 2? What does this
mean? To what are we taking the inverse? To properly understand this, we need to understand the
following: Given a binary operator (an operator which takes in two things and produces a single
thing in return, such as addition and multiplication of real numbers), we say that a number id
is the identity of that operator if operating against it does nothing to the input. In the case of
addition, the operator will satisfy x + id+ = x for all possible x; for example,
2 + id+ = 2, −5 + id+ = −5.
Our experience tells us that id+ = 0. Similarly, for multiplication the identity id× will satisfy
x × id× = x for all x; for example,
3 × id× = 3, π × id× = π.
Again our experience tells us that id× = 1. We say that 0 is the additive identity and 1 is the
multiplicative identity.
Given an operator and an identity, we say that the inverse of x is an element which, when paired
against x, gives the identity. The additive inverse of 2 is the number y such that 2 + y = id+ = 0.
In this case y = −2, and more generally the additive inverse of n is −n. For multiplication, we can
convince ourselves that the multiplicative inverse of x is 1/x; for example, 2 × (1/2) = 1 = id× .
Notice that every real number has an additive inverse, while there is no multiplicative inverse
for the number 0. In general, one cannot be guaranteed that an inverse always exists.
If f, g : A → A, then function composition f ◦ g is another example of a binary operator. What
is the identity for this operation? Well, we would like a function $\mathrm{id}_\circ : A \to A$ such that
$f \circ \mathrm{id}_\circ = \mathrm{id}_\circ \circ f = f$ for every function f : A → A.
The identity function is therefore the function $\mathrm{id}_\circ(x) = x$, the function which does nothing to the
argument! Therefore the inverse of a function f : A → A is another function $f^{-1} : A \to A$ such
that $f \circ f^{-1} = f^{-1} \circ f = \mathrm{id}_\circ$.
This conversation can be generalized for functions whose domain and codomain are not equal.
For example, if f : A → B then $f^{-1} : B \to A$. However, we now require two identity functions,
$\mathrm{id}^A_\circ : A \to A$ and $\mathrm{id}^B_\circ : B \to B$, such that
$$f(f^{-1}(y)) = \mathrm{id}^B_\circ(y) = y, \qquad f^{-1}(f(x)) = \mathrm{id}^A_\circ(x) = x.$$
Definition 5.9
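Let f : S → T be a function. We say that g : T → S is a
1. left-inverse of f if g(f(s)) = s for all s ∈ S,
2. right-inverse of f if f(g(t)) = t for all t ∈ T,
3. inverse of f if it is both a left-inverse and a right-inverse of f.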
Injective functions and surjective functions have left- and right-inverses respectively, as
demonstrated in the following propositions:
Proposition 5.10
If you reexamine Figure 10, the idea is to simply reverse each given arrow. However, anything
which does not already have an arrow pointing to it needs to map somewhere. Hence we choose an
arbitrary element $s_0$ in the domain and map all those points to $s_0$. To see that g is a left inverse
of f, let s ∈ S, in which case g(f(s)) = s by definition of g.
Conversely, assume that f has a left inverse g : T → S so that g(f(s)) = s for any s ∈ S. Suppose
f(x) = f(y), for which we would like to show that x = y. By applying g to both sides we get
x = g(f(x)) = g(f(y)) = y, as required.
Proposition 5.11
Proof. We begin by assuming that f : S → T is surjective. For each t ∈ T, let P(t) denote the set [5]
P(t) = {s ∈ S : f(s) = t};
that is, P(t) consists of the elements of S which map to t. Since f is surjective, each set P(t) has at
least one element, so for each t ∈ T we choose [6] an element $s_t \in P(t)$. Define the function g : T → S
by $g(t) = s_t$, which we claim is a right-inverse to f. Indeed, $f(g(t)) = f(s_t) = t$ by definition of $s_t$,
as required.
Conversely, assume that there is a function g : T → S such that f (g(t)) = t for all t ∈ T . We
want to show that for each t ∈ T there is an element s ∈ S such that f (s) = t. The function
g : T → S gives us a way of picking an element in S, so we choose the element s = g(t) ∈ S. It
then follows that f (s) = f (g(t)) = t as required.
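For finite sets both constructions can be carried out explicitly. In the sketch below a function S → T is modelled as a Python dictionary, and the helper names `left_inverse` and `right_inverse` are invented for illustration; on a finite set the "choice" of a preimage is just the first one encountered, so no Axiom of Choice is needed.

```python
from typing import Dict, Hashable, Iterable


def left_inverse(f: Dict[Hashable, Hashable], codomain: Iterable[Hashable],
                 default: Hashable) -> Dict[Hashable, Hashable]:
    """Given an injective f, return g with g[f[s]] == s for every s in the domain."""
    if len(set(f.values())) != len(f):
        raise ValueError("f is not injective, so it has no left-inverse")
    # Points of the codomain that nothing maps to are sent to an arbitrary default.
    g = {t: default for t in codomain}
    for s, t in f.items():
        g[t] = s  # reverse the arrow s -> t
    return g


def right_inverse(f: Dict[Hashable, Hashable],
                  codomain: Iterable[Hashable]) -> Dict[Hashable, Hashable]:
    """Given a surjective f, return g with f[g[t]] == t for every t in the codomain."""
    g: Dict[Hashable, Hashable] = {}
    for s, t in f.items():
        g.setdefault(t, s)  # choose the first preimage of t that we meet
    if set(codomain) - set(g):
        raise ValueError("f is not surjective, so it has no right-inverse")
    return g


if __name__ == "__main__":
    print(left_inverse({1: "a", 2: "b"}, ["a", "b", "c"], default=1))
    print(right_inverse({1: "a", 2: "b", 3: "a"}, ["a", "b"]))
```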
Injective functions are precisely those with left-inverses, and surjective functions are those with
right-inverses. This is the notion of duality we mentioned before. Definition 5.9 says that a function
f : S → T has an inverse if it has both a left- and a right-inverse, so we can combine Proposition
5.10 and Proposition 5.11 to get the following corollary:
Corollary 5.12
Proposition 5.13
[5] The set P(t) is called the pre-image of an element t under f, and is usually denoted by $f^{-1}(t)$. However, this
is just notation and does not mean that an inverse function $f^{-1}$ exists! To avoid possible confusion, we have chosen
not to use this notation for this proof.
[6] Here we have had to use something called the Axiom of Choice. Not all mathematicians believe that such a
choice is allowed to be made, but the author is not one of those mathematicians.
If the function is only left/right invertible, the left/right inverse is certainly not unique.
Example 5.14
In each case, if the inverse does not exist, determine a necessary change to the function to
guarantee an inverse.
Solution. We start with h(x) = 1 + x as it is the easiest. Examples 5.2 and 5.6 demonstrated that
h(x) is both injective and surjective, and hence is bijective. By Corollary 5.12, we know there is
an inverse function $h^{-1} : R \to R$ such that $(h \circ h^{-1})(x) = x$ and $(h^{-1} \circ h)(x) = x$. The reader can
easily check that the desired inverse function is $h^{-1}(x) = x - 1$.
Now $f(x) = x^2$ is neither injective nor surjective, so certainly it will have neither a left nor a
right-inverse. However, if we make a small change to the codomain of the function then we can
arrive at a partial answer. If f : R → [0, ∞) then we have a right-inverse $r(x) = \sqrt{x}$:
$$(f \circ r)(x) = (\sqrt{x})^2 = x, \qquad (r \circ f)(x) = \sqrt{x^2} = |x|.$$
Notice that r still fails to be a left-inverse. If we also change the domain so that f : [0, ∞) → [0, ∞)
then (r ◦ f)(x) = x since there are no negative values of x to realize the absolute value. With these
changes in place the function f is now invertible.
Finally, the function $g(x) = \frac{1}{x}$ is injective but fails to be surjective, since there is no value
of x ∈ R \ {0} such that g(x) = 0. This is the only value we miss, so by removing it from the codomain we have a
bijective function, whose inverse is s(x) = 1/x:
$$(g \circ s)(x) = \frac{1}{1/x} = x, \qquad (s \circ g)(x) = \frac{1}{1/x} = x.$$
The above example demonstrates how critical the domain and codomain are to the definition
of a function. By changing the domain and codomain, one can change whether the function is
injective, surjective, or bijective, and hence whether or not it is invertible. In practice, injectivity
is more critical than surjectivity. If f : A → B is injective, restricting the codomain to the range
of f does not result in any loss of information about the function. On the other hand, if f is not
injective, we have to restrict the domain to a subset on which the function is injective. This means
throwing away information about the function, which is less than ideal.
f : x ↦ x^2

(Co)domain           | Injective | Surjective
R → R                | No        | No
[0, ∞) → R           | Yes       | No
R → [0, ∞)           | No        | Yes
[0, ∞) → [0, ∞)      | Yes       | Yes
Warning
Many students confuse the notion of the function inverse with that of a reciprocal. The
reciprocal of a function f is the function 1/f, and is such that
$$f(x) \times \frac{1}{f(x)} = 1.$$
The inverse of a function is such that $(f \circ f^{-1})(x) = x$. Because this mistake occurs so
frequently, we make the message loud and clear:
Proposition 5.15
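Solution. Since inverses are unique, it suffices to show that composing $g^{-1} \circ f^{-1}$ with $f \circ g$ on either side gives the identity function.
Indeed,
$$(f \circ g) \circ (g^{-1} \circ f^{-1}) = f \circ (g \circ g^{-1}) \circ f^{-1} = f \circ \mathrm{id}_B \circ f^{-1} = f \circ f^{-1} = \mathrm{id}_C$$
and similarly
$$(g^{-1} \circ f^{-1}) \circ (f \circ g) = \mathrm{id}_A,$$
showing that $(f \circ g)^{-1} = g^{-1} \circ f^{-1}$ as required.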
5.3 Cardinality
The cardinality of a set S is how many elements are within the set, and is denoted |S|. When S
is finite, |S| is simple to define; however, the issue becomes trickier when S is an infinite set. For
example, we will see that |N| = |Z| = |Q|, which is surprising given that it looks as though Q is
much larger than N.
Consider the following situation: you are given two boxes, let’s call them box S and box T .
Each box contains an unknown number of rubber balls, and your job is to determine which box
contains more balls. One strategy is to reach into both boxes at the same time, withdrawing a
single ball from each. If, say, box S runs out of balls before box T , you know that box S contained
fewer balls than T. Thinking of S and T as sets, this says that |S| < |T|.
How does this help us in determining the size of sets? Let f : S → T be a function, thought of as
a collection of arrows from S to T. The function f must define exactly |S| arrows, one emanating
from each element of S. If |S| > |T | it is possible to define a surjective function f : S → T , but
not an injective function. For each object in S we must choose an object in T . Since |S| > |T | we
have enough arrows to hit every element in T , giving us a surjective function. On the other hand,
the pigeonhole principle tells us that at some point we are going to have to send two elements of S
to the same element of T , breaking injectivity.
One can imagine a similar situation if |S| < |T |, wherein one can define an injective function
from S to T , but not a surjective function. It is this idea for finite sets that allows us to define the
notion of cardinality in general.
Definition 5.16
Let S and T be sets. We say |S| ≤ |T | if there is an injective function S → T .
For example, if S = {1, 2, 3} and T = {−3, −6, −9, −12} we could define an injection f : S → T
by f(s) = −3s, showing that |S| ≤ |T|. This agrees with our usual notion of counting, since |S| = 3
and |T| = 4. However, we can extend this idea to infinite sets. For example, if S = {1, 2, 3, . . .}
and T = {−1, −2, −3, . . .} then |S| ≤ |T| via the injection s ↦ −s. Of course, in this latter case
we expect |S| = |T|, but we are not yet able to discuss these things.
Before going any further, let’s ground our definition in reality.
Proposition 5.17
If S = {s1 , . . . , sn } and T = {t1 , . . . , tm } are finite sets, then |S| ≤ |T | if and only if n ≤ m.
Proof. [⇐] Suppose n ≤ m and define a map f : S → T by $f(s_i) = t_i$. This map makes sense only
if n ≤ m, and moreover it is certainly injective. Hence |S| ≤ |T|.
[⇒] By contrapositive, suppose n > m. Suppose for the sake of contradiction that an injective
function f : S → T exists. The data of f includes n outputs, so by the Pigeonhole principle at
least one of these outputs must be repeated; that is, there exist $s_i$ and $s_j$ such that $s_i \neq s_j$ and
$f(s_i) = f(s_j)$, but this contradicts the fact that f is an injection.
Proposition 5.18
If S ⊆ T then |S| ≤ |T |.
Proof. Let ι : S → T be the inclusion function; that is, ι(s) = s. This function is certainly injective,
since if ι(s1 ) = ι(s2 ) then by definition, s1 = s2 . By Definition 5.16, |S| ≤ |T |.
The hierarchy of counting thus immediately implies that |N| ≤ |Z| ≤ |Q| ≤ |R|. Similarly,
|[0, 1]| ≤ |R|, and likewise for any other cardinality relation induced by the subset relation.
One can also use the notion of a surjection to compare cardinalities, as the following example
demonstrates:
Proposition 5.19
If S and T are non-empty sets and f : S → T is a surjection, then |T| ≤ |S|; that is, there
is an injection g : T → S.
Proof. Let f : S → T be surjective. By Proposition 5.11 we know that f has a right inverse;
namely, there is a function g : T → S such that f ◦ g = idT . The symmetry in this relationship
means that f is a left-inverse for g, which by Proposition 5.10 means that g is injective. Hence we
have an injective function from T to S, and |T | ≤ |S|.
Once again, we immediately get that if S = {s1 , . . . , sn } and T = {t1 , . . . , tm } are finite sets,
then there is a surjection from S to T if and only if m ≤ n. This leads us to the following definition:
Definition 5.20
If S and T are sets, then |S| = |T | if there is a bijection S → T .
Solution. Certainly 2N ⊆ N, so that |2N| ≤ |N|, but we have been asked to go one step further.
Define the function f : N → 2N by f (n) = 2n, which we shall show is a bijection.
To see that it is injective, notice that if f (n) = f (m) then 2n = 2m. Dividing by 2 gives n = m
as required. To see that f is surjective, notice that every positive even number k can be written
as k = 2m for some m ∈ N. Hence f (m) = 2m = k so f is surjective. Since there is a bijection
between N and 2N, we conclude that |N| = |2N|.
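Although no program can check all of N, the bijection and its inverse are easy to write down and test on an initial segment. This sketch assumes, as the solution above does, that N denotes the positive integers; the names `double` and `halve` are illustrative.

```python
def double(n: int) -> int:
    """The map f : N -> 2N, n -> 2n."""
    return 2 * n


def halve(k: int) -> int:
    """The inverse map 2N -> N, defined only on positive even numbers."""
    if k <= 0 or k % 2 != 0:
        raise ValueError("halve is only defined on positive even numbers")
    return k // 2


if __name__ == "__main__":
    print([double(n) for n in range(1, 6)])
    print([halve(k) for k in range(2, 12, 2)])
```

Having a two-sided inverse is exactly what makes f a bijection: `halve(double(n)) == n` reflects injectivity and `double(halve(k)) == k` reflects surjectivity.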
Example 5.21 demonstrates that, unlike finite sets, infinite sets can have the same cardinality as
their subsets. This is just the first surprising statement in a plethora of unintuitive but interesting
results. A similar example is the following:
Example 5.22
Solution. We need to find a bijective function that maps (0, 1) to R, or alternatively a
bijection from R to (0, 1). The former is just as easy as the latter, once one recognizes that
arctan : R → (−π/2, π/2) is a bijection. By appropriately modifying the arctangent function,
we get that
$$f : R \to (0, 1), \qquad t \mapsto \frac{2\arctan(t) + \pi}{2\pi},$$
is a bijection, and its inverse is the desired bijection from (0, 1) to R.
Exercise: Modify Example 5.22 to show that |(a, b)| = |R| for any a < b. What about the
closed interval [a, b]?
A subtle question at this point is whether knowing |S| ≤ |T | and |T | ≤ |S| tells us that |T | = |S|.
Remember, these “inequalities” are just notation – notation that happens to coincide with our usual
inequality on N, but we can’t say more than that in general.
Let’s think about this more. We’re asking if knowing that there is an injection f : S → T
(|S| ≤ |T |) and an injection g : T → S (|T | ≤ |S|) guarantees the existence of a bijection from S to
T . That’s not at all obvious.
Theorem 5.23: Cantor-Bernstein-Schroeder
The proof is somewhat involved, and so is omitted. However, it’s worth pointing out that this
is not immediately obvious, and is difficult to prove.
Example 5.24
Solution. The inclusion map ι : (0, 1) → [0, 1], x ↦ x is an injection, so we need only construct
an injection in the other direction. Define f : [0, 1] → (0, 1) by f(t) = (1 + 2t)/4, and note that
f([0, 1]) = [1/4, 3/4] ⊆ (0, 1). This function is injective, for if $f(t_1) = f(t_2)$ then
$$\frac{1 + 2t_1}{4} = \frac{1 + 2t_2}{4} \;\Rightarrow\; 1 + 2t_1 = 1 + 2t_2 \;\Rightarrow\; t_1 = t_2.$$
Thus by the Cantor-Bernstein-Schroeder theorem, there is a bijection between (0, 1) and [0, 1];
hence, |(0, 1)| = |[0, 1]|.
Definition 5.25
A set S is said to be countable if |S| ≤ |N|. Equivalently, S is countable if there is an injective
function f : S ↪ N. We say that S is countably infinite if |S| = |N|.
Our goal is to show that Z and Q are countable, but that R is not. For the former, we need the
following result:
Proposition 5.26
Proof. Let $\{A_i\}_{i \in I}$ be a countable collection of countable sets, so I is countable, as is each $A_i$. Let
g : I ↪ N be an injective function, and for each $A_i$ let $f_i : A_i$ ↪ N be an injective function. Define
the map
$$f : \bigcup_{i \in I} A_i \to N, \qquad a \mapsto 2^{g(n)} 3^{f_n(a)}, \quad a \in A_n.$$
This map is well-defined by the uniqueness of prime decompositions and the fact that the $A_n$ are
pairwise disjoint. The same uniqueness condition gives injectivity; that is, the only way $2^n 3^m = 2^r 3^s$
is if n = r and m = s. Thus a countable union of countable sets is countable.
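The injectivity of the encoding rests on being able to recover both exponents from the product. The sketch below illustrates this for a pair of non-negative integers (i, j), standing in for the values g(n) and f_n(a); the names `encode` and `decode` are assumptions made for the example.

```python
from typing import Tuple


def encode(i: int, j: int) -> int:
    """Encode a pair of non-negative integers as 2^i * 3^j."""
    return 2 ** i * 3 ** j


def decode(code: int) -> Tuple[int, int]:
    """Recover (i, j) from 2^i * 3^j by stripping factors of 2 and then of 3."""
    if code < 1:
        raise ValueError("codes are positive integers")
    i = j = 0
    while code % 2 == 0:
        code //= 2
        i += 1
    while code % 3 == 0:
        code //= 3
        j += 1
    if code != 1:
        raise ValueError("not of the form 2^i * 3^j")
    return i, j


if __name__ == "__main__":
    print(encode(3, 2), decode(encode(3, 2)))
```

Because `decode` undoes `encode`, two different pairs can never share a code, which is the injectivity used in the proof.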
Theorem 5.27
The integers are the same size as the natural numbers: |Z| = |N|.
is a countable union of countable sets, and hence is countable itself by Proposition 5.26. Define the
map f : Z → N × N as n ↦ (|n|, sgn(n)). As an example of what this map does, we have
so that the second number just keeps track of whether the number is positive, negative, or zero.
This map is injective, since if f (n) = f (m) then (|n|, sgn(n)) = (|m|, sgn(m)). Equality in the first
component, |n| = |m|, implies that n = ±m. Equality in the second component, sgn(n) = sgn(m),
implies that n = m.
Since f is injective, we thus have that |Z| ≤ |N × N| ≤ |N|. On the other hand, since N ⊆ Z it
must be that |N| ≤ |Z|. Both inequalities, together with the Cantor-Bernstein-Schroeder theorem, give us that |N| = |Z|.
Theorem 5.28
Proof. Consider the map f : Q → Z × N given by f (p/q) = (p, q) where the fraction p/q is in lowest
terms (if the fraction is negative, we always take the sign in the p-component). Again this map is
injective, since if f (p/q) = f (r/s) then (p, q) = (r, s) which is true only if p = r and q = s. Since Z
and N are both countable, so too is Z × N and so |Q| ≤ |Z × N| ≤ |N|. On the other hand, N ⊆ Q
so |N| ≤ |Q| and this gives us that |Q| = |N|.
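The lowest-terms map from the proof can be sketched in Python. This assumes the standard library type `fractions.Fraction`, which stores every rational in lowest terms with a positive denominator, so the sign always sits in the numerator; the helper name `to_pair` is hypothetical.

```python
from fractions import Fraction


def to_pair(x: Fraction) -> tuple[int, int]:
    """Send p/q (in lowest terms, q > 0) to the pair (p, q)."""
    return (x.numerator, x.denominator)


print(to_pair(Fraction(2, 4)))   # 2/4 is reduced before the pair is read off
print(to_pair(Fraction(3, -6)))  # the sign is carried by the first component

# Distinct rationals in a small sample receive distinct pairs.
sample = {Fraction(p, q) for p in range(-10, 11) for q in range(1, 11)}
print(len({to_pair(x) for x in sample}) == len(sample))
```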
One might expect that this pattern continues forever, and that every infinite set has the same
cardinality as the naturals. The real numbers are our first counterexample.
Theorem 5.29
The real numbers are strictly larger than the natural numbers, and so not countable: |R| >
|N|.
Proof. It is sufficient to show that the interval [0, 1] is not countable, since then certainly
all of R will be uncountable also. For the sake of contradiction, assume that the real numbers in [0, 1] are
countable and list them $\{r_1, r_2, r_3, \ldots\}$. Write each $r_i$ in its decimal expansion as
$r_i = 0.d_{i1} d_{i2} d_{i3} d_{i4} \cdots$ so that
$$\begin{aligned}
r_1 &= 0.d_{11} d_{12} d_{13} d_{14} \cdots \\
r_2 &= 0.d_{21} d_{22} d_{23} d_{24} \cdots \\
r_3 &= 0.d_{31} d_{32} d_{33} d_{34} \cdots \\
r_4 &= 0.d_{41} d_{42} d_{43} d_{44} \cdots \\
&\ \ \vdots
\end{aligned}$$
Define a new number s as follows. Let $s = 0.s_1 s_2 s_3 s_4 \cdots$ where
$$s_i = \begin{cases} 0 & \text{if } d_{ii} = 1 \\ 1 & \text{if } d_{ii} \neq 1. \end{cases}$$
The number s is not in the list $\{r_1, r_2, r_3, \ldots\}$ by construction (think about this and you will see
it is true: s differs from $r_i$ in the i-th digit), but it is a real number in [0, 1]. This is a contradiction,
since we assumed that we listed all of the real numbers in [0, 1]. We conclude that the real numbers are not
countable as required.
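The diagonal construction can be tried on a finite list of digit strings. This is only an illustration of the rule defining s, not of the full infinite argument; the function name `diagonal` and the sample rows are assumptions made for the example.

```python
def diagonal(expansions: list[str]) -> str:
    """Build the digits s_i: 0 if the i-th digit of row i is 1, otherwise 1."""
    return "".join(
        "0" if row[i] == "1" else "1" for i, row in enumerate(expansions)
    )


# Hypothetical list: the first four digits of four decimal expansions.
rows = ["1415", "7282", "0110", "3339"]
s = diagonal(rows)
print(s)

# s disagrees with row i in position i, so it equals none of the rows.
print(all(s[i] != row[i] for i, row in enumerate(rows)))
```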
So what goes wrong with the reals? The problem is that, given a fixed real number, there is
no reasonable way to say what the “next” real number is. In the case of N and Z it is easy, and in
the case of Q it is a bit tricky but still doable. But let's say you start at the real number 0. What
is the next real number? 0.1? Why not 0.01 or 0.001 or 0.0001? I can put an arbitrary number
of zeroes before putting that 1, so it does not make sense to say “the next real number.” This is
precisely what breaks.
So is there a cardinal strictly between |N| and |R|? This turns out to be an incredibly deep and
subtle question, and one that cannot be settled with our standard set of axioms. One must either
assume that there is such a cardinal, or assume that there is no such cardinal; neither statement can be
proven. However, there is a systematic way of taking a set and getting a set with a strictly larger
cardinality.
Definition 5.30
Let S be a set, and define the power set P(S) to be the set of all subsets of S. The power
set is sometimes denoted $2^S$.
Theorem 5.31
For any set S, |S| < |P(S)|.
Proof. First we show that |S| ≤ |P(S)| by demonstrating an injection S → P(S). Define the
function f : S → P(S) by x ↦ {x}. This map is evidently injective.
The remainder of the proof is a generalization of the proof given in Theorem 5.29, and proceeds
by diagonalization. Assume that a surjective f : S → P(S) exists and define the set
$$D = \{x \in S : x \notin f(x)\} \in \mathcal{P}(S).$$
We claim D is not the image of any point in S. Indeed, for each element a ∈ S, either a ∈ f(a) or
a ∉ f(a). In the first case, if a ∈ f(a) then a ∉ D, so f(a) ≠ D. On the other hand, if a ∉ f(a)
then a ∈ D, again showing that D ≠ f(a). Thus D is not the image of any point, contradicting
the assumption that f was a surjection. We conclude that |S| < |P(S)|.
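For a small finite set, the diagonal set D can be computed for every possible function f : S → P(S). The sketch below assumes S = {0, 1, 2} and represents f as a dictionary; the helper names `power_set` and `diagonal_set` are chosen for illustration.

```python
from itertools import combinations, product


def power_set(s):
    """Return all subsets of s as frozensets."""
    items = sorted(s)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


def diagonal_set(s, f):
    """Return D = {x in s : x not in f(x)} for a map f given as a dict."""
    return frozenset(x for x in s if x not in f[x])


S = {0, 1, 2}
subsets = power_set(S)

# Enumerate all 8**3 functions S -> P(S); D is never one of the images.
never_hit = all(
    diagonal_set(S, dict(zip(sorted(S), images))) not in images
    for images in product(subsets, repeat=len(S))
)
print(never_hit)
```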
There is no greatest cardinal, since we can keep taking power sets to create a hierarchy of ever larger
cardinals. This leads to an interesting paradox: if the set of all sets existed, every one of its subsets
would be one of its elements, so it would contain its own power set, and yet its power set must
have strictly greater cardinality. This is resolved by declaring that the collection of all sets is
not a set, but rather a proper class. Proper classes are too large to be sets, and therefore are
not subject to the same restrictions as sets.
6 Divisibility and Modular Arithmetic

6.1 Divisibility and Primes

We introduced the basics of divisibility in Definition 3.26. This section will focus on this topic and
other number-theoretic ideas.
Recall that if a, b ∈ Z we say that a|b if there exists a k ∈ Z such that ak = b. You can think of
a as being a factor of b. Furthermore, we’ve already proven several useful facts, which we recount
below:
1. [Proposition 3.27] If a|b and a|c then for any m, n ∈ Z, a|(mb + nc).
2. [Proposition 3.28] If a|b and a|(b + c) then a|c.
3. [Proposition 3.29] Every integer can be written as the product of primes.
4. [Theorem 3.30] There are infinitely many prime numbers.
Definition 6.1
If a, b ∈ Z, we define the greatest common divisor of a and b, written gcd(a, b), to be the
largest positive integer that divides both a and b. More precisely,
$$\gcd(a, b) = \max\{d \in \mathbb{N} : d \mid a \text{ and } d \mid b\}.$$
For example,
gcd(4, 6) = 2, gcd(15, 25) = 5, gcd(15, 33) = 3, gcd(17, 4) = 1.
Note that since every number divides 0, gcd(a, 0) = |a|.
If p is a prime number, then gcd(p, a) = 1 unless a is a multiple of p. A somewhat more
interesting notion is that of coprimality:
Definition 6.2
If a, b ∈ Z, we say that a and b are coprime or relatively prime if gcd(a, b) = 1.
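Theorem 6.3
If a, b ∈ Z are not both zero, then there exist m, n ∈ Z such that am + bn = gcd(a, b).
We do not yet have the tools to prove Theorem 6.3, but the result is worth stating now
for perspective. In some cases, the m, n guaranteed by the theorem are simple to see. For example,
2 = 4(−1) + 6(1), 5 = 15(2) + 25(−1), 3 = 15(−2) + 33(1), 1 = 17(1) + 4(−4).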
These were easy enough to do by inspection, but what if we are asked to find gcd(1053, 481)?
Can you see this by simple inspection? The answer is 13, but is there a systematic way of deducing
this answer? Furthermore, how do we find the m, n guaranteed by Theorem 6.3? Here the answer
is
13 = 481(46) + 1053(−21)
but there is no way we can see that just by inspection! These are some of the questions we will
eventually answer.
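Both claims about 1053 and 481 are easy to confirm numerically. The check below uses `math.gcd` from the Python standard library; the variable names are chosen for illustration.

```python
from math import gcd

d = gcd(1053, 481)
combination = 481 * 46 + 1053 * (-21)

print(d)            # the greatest common divisor
print(combination)  # the integer combination quoted in the text
```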
Number theory really is the study of primes, and those prime numbers have special properties when
it comes to divisibility.
Proposition 6.4
If a, b, c ∈ Z satisfy a|bc and gcd(a, b) = 1, then a|c.
Proof. Since a|bc we know there is a k ∈ Z such that ak = bc. Furthermore, by Theorem 6.3 we
know that there exist m, n ∈ Z such that am + bn = 1, since we assumed that gcd(a, b) = 1.
Multiplying through by c we get
c = amc + bcn = amc + (ak)n = a(mc + kn)
showing that a|c.
Theorem 6.5
An integer p > 1 is prime if and only if, for all a, b ∈ Z, p|ab implies that p|a or p|b.
Proof. (⇒) Assume that p is prime and that p|ab. If p|a we are done, so assume that p does not
divide a, in which case gcd(a, p) = 1. By Proposition 6.4, it then follows that p|b.
(⇐) Conversely, we will proceed by the contrapositive; that is, we will show that if p is not
prime, then there exist a, b such that p|ab but p divides neither a nor b. Assume that p is not a prime, so
it is necessarily composite. We can thus write p = rs for 1 < r ≤ s < p. Hence p|rs but p divides
neither r nor s, since both are positive and strictly smaller than p.
This fact is equivalent to being a prime number, and is in fact the definition of prime in higher
mathematics. It is a straightforward induction proof to show that if $p \mid a_1 a_2 \cdots a_n$ then $p \mid a_i$ for some
i ∈ {1, …, n}.
In fact, we have used this divisibility fact before, but not in an obvious way. Recall in Example
3.15 we showed that “n is even if and only if $n^2$ is even.” Since 2 is a prime number, and being
divisible by 2 is equivalent to being even, this is the statement that “$2 \mid n$ if and only if $2 \mid n^2$.” This
generalizes:
Proposition 6.6
If p is a prime and n ∈ Z, then $p \mid n$ if and only if $p \mid n^2$.
Proof. (⇒) Suppose that p|n so that pk = n for some k ∈ Z. Squaring n we get $n^2 = p^2 k^2 = p(pk^2)$
so $p \mid n^2$.
(⇐) Conversely, suppose that $p \mid n^2$. Since p is a prime, Theorem 6.5 shows that p|n.
We used the fact that n is even if and only if $n^2$ is even to show that $\sqrt{2}$ is irrational. The same
proof now applies to show that $\sqrt{p}$ is irrational for any prime p.
Theorem 6.7
If p is a prime, then $\sqrt{p}$ is irrational.
Proof. For the sake of contradiction, assume that $\sqrt{p}$ is rational, and write $\sqrt{p} = a/b$ where gcd(a, b) =
1; that is, a/b is in lowest terms. Multiplying both sides by b and squaring gives $b^2 p = a^2$.
Certainly $p \mid b^2 p$ and so $p \mid a^2$, which in turn implies that p|a. Hence we can write a = pk for some
k ∈ Z. Substituting this into $b^2 p = a^2$ gives
$$b^2 p = p^2 k^2 \quad\Rightarrow\quad b^2 = p k^2.$$
Once again $p \mid pk^2$ so $p \mid b^2$, showing that p|b, hence b = pℓ for some ℓ ∈ Z. This is a contradiction,
since
$$1 = \gcd(a, b) = \gcd(pk, p\ell)$$
and the right hand side is at least p. Thus $\sqrt{p}$ is irrational.
Theorem 6.8
Every positive integer greater than 1 can be uniquely expressed as a product of primes.
Proof. We have already shown that every number can be written as a product of primes, so it only
remains to show that this decomposition is unique. For the sake of contradiction, assume that there
exists some integer n > 1 which can be expressed with two different prime factorizations, say
$$n = p_1 \cdots p_k = q_1 \cdots q_m,$$
where all the $p_i$ and $q_j$ are prime. Since $p_1 \mid n$, it must also be the case that $p_1 \mid q_1 \cdots q_m$. Since $p_1$ is
a prime, by Theorem 6.5 we must have $p_1 \mid q_j$ for some j. By reordering if necessary, let this be $q_1$.
Since $q_1$ is also prime, the only way $p_1 \mid q_1$ is if $p_1 = q_1$. Cancelling $p_1$ and $q_1$ thus gives
$$p_2 p_3 \cdots p_k = q_2 q_3 \cdots q_m.$$
We repeat the same argument above, deducing that $p_2 = q_2$, $p_3 = q_3$, and generally that $p_i = q_i$.
Hence k = m and, up to reordering the factors, the prime decomposition is unique.
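A prime factorization can be computed by trial division. The sketch below is one simple way to do so, not a method taken from the text; the name `factorize` is hypothetical, and the result is returned as a dictionary sending each prime to its exponent.

```python
def factorize(n: int) -> dict[int, int]:
    """Return the prime factorization of n > 1 as {prime: exponent}."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        # Divide out p as many times as possible before moving on.
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever remains has no divisor up to its square root, so it is prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


print(factorize(504))
print(factorize(1155))
```

Because the factorization is unique, any other correct method must return the same primes with the same exponents.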
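Example 6.9
Show that $\log_{36}(105)$ is irrational.
Solution. For the sake of contradiction, assume $\log_{36}(105) = p/q$ where gcd(p, q) = 1. By definition
of the logarithm, $36^{p/q} = 105$, or equivalently $36^p = 105^q$. Factoring 36 and 105 into primes gives
$$2^{2p} 3^{2p} = 3^q 5^q 7^q.$$
Since $\log_{36}(105) > 0$ we may take p, q ≥ 1, so 2 divides the left hand side but not the right hand side.
This contradicts the uniqueness of prime factorizations, so $\log_{36}(105)$ is irrational.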
6.2 The Euclidean Algorithm

We now develop an algorithm for determining the greatest common divisor of two numbers, together
with the integer linear combination that achieves that number.
Proposition 6.10: The Division Algorithm
If a, b ∈ Z with b > 0, there exist unique q, r ∈ Z such that a = qb + r where 0 ≤ r < b.
Proof. Assume that both a, b > 0; the case when a < 0 follows similarly. We proceed by
strong induction on a. As a base case, note that when a = 1 then
$$a = \begin{cases} b \cdot 0 + 1 & \text{if } b > 1 \\ b \cdot 1 + 0 & \text{if } b = 1, \end{cases}$$
so the base case holds. Assume then that for all 1 ≤ n ≤ a, we can write n = bq + r with 0 ≤ r < b for some
unique q and r, and consider a + 1. If a + 1 ≤ b then the result is trivial, so assume that a + 1 > b.
Now 1 ≤ (a + 1 − b) ≤ a so by the induction hypothesis
a + 1 − b = bq + r
for some unique q and r, so a + 1 = b(q + 1) + r. It must also be the case that q + 1 and r are
unique, for otherwise q and r would not be unique.
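The quotient and remainder can be found by repeated subtraction, which mirrors the step from a + 1 to a + 1 − b in the induction. The following sketch uses a hypothetical function `divide`, handles negative a by repeated addition, and compares the result with Python's built-in `divmod`, which for b > 0 also returns a remainder with 0 ≤ r < b.

```python
def divide(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == q * b + r and 0 <= r < b, assuming b > 0."""
    if b <= 0:
        raise ValueError("b must be positive")
    q, r = 0, a
    while r >= b:   # remove copies of b, as in the induction step
        r -= b
        q += 1
    while r < 0:    # for negative a, add copies of b instead
        r += b
        q -= 1
    return q, r


print(divide(17, 5))
print(divide(-7, 3))
print(all(divide(a, 7) == divmod(a, 7) for a in range(-50, 51)))
```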
Proposition 6.11
If a, b, q, r ∈ Z satisfy a = qb + r, then gcd(a, b) = gcd(b, r).
Proof. Write r = a − qb. If a = b = 0 then necessarily r = 0 and so the result is trivially true.
Therefore, assume that not both of a and b are zero, and set d = gcd(a, b). Since d|a and d|b then
d|(a − qb) = r by Proposition 3.27. Now we must show that d is in fact the greatest common divisor
of r and b.
If c is any other common divisor of b and r, then by Proposition 3.27 we have c|(bq + r) = a. Since d
is the greatest common divisor of a and b, it must be the case that c ≤ d, showing that d is the
greatest common divisor of b and r as required.
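Let a, b ∈ Z with b ≠ 0 and assume that b does not divide a. Consider the following
algorithm, in which the Division Algorithm is applied repeatedly:
$$\begin{aligned}
a &= q_1 b + r_1, & 0 < r_1 < b \\
b &= q_2 r_1 + r_2, & 0 < r_2 < r_1 \\
r_1 &= q_3 r_2 + r_3, & 0 < r_3 < r_2 \\
&\ \ \vdots \\
r_{n-2} &= q_n r_{n-1} + r_n, & 0 < r_n < r_{n-1} \\
r_{n-1} &= q_{n+1} r_n + 0.
\end{aligned}$$
This algorithm must terminate in finitely many steps, and moreover $\gcd(a, b) = r_n$.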
Proof. The remainders form a strictly decreasing sequence of non-negative integers, $b > r_1 > r_2 >
r_3 > \cdots \geq 0$, and so must eventually reach zero, showing that the algorithm must eventually terminate.
Furthermore, by repeatedly applying Proposition 6.11 we get
$$\gcd(a, b) = \gcd(b, r_1) = \gcd(r_1, r_2) = \gcd(r_2, r_3) = \cdots = \gcd(r_{n-1}, r_n) = \gcd(r_n, 0) = r_n.$$
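The chain of equalities above translates directly into a loop. The sketch below assumes positive inputs and records the non-zero remainders along the way; the function name `euclid` is chosen for illustration.

```python
def euclid(a: int, b: int) -> tuple[int, list[int]]:
    """Return (gcd(a, b), list of non-zero remainders) for positive a, b."""
    remainders = []
    while b != 0:
        # Replace (a, b) by (b, r), which has the same gcd by Proposition 6.11.
        a, b = b, a % b
        if b != 0:
            remainders.append(b)
    return a, remainders


print(euclid(1155, 504))
print(euclid(1053, 481))
```

The last non-zero remainder in each list is the greatest common divisor, exactly as the proof states.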
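Example 6.13
Determine gcd(504, 1155).
Solution. Applying the Euclidean algorithm we get
1155 = 504(2) + 147
504 = 147(3) + 63
147 = 63(2) + 21
63 = 21(3) + 0
so gcd(504, 1155) = 21.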
Another trick to finding the greatest common divisor of two numbers is to use their prime
factorizations. Let a, b ∈ Z and, allowing powers to be zero if necessary, write a and b with the
same primes
$$a = p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}, \qquad b = p_1^{m_1} p_2^{m_2} \cdots p_k^{m_k}.$$
The greatest common divisor of a and b must therefore be
$$d = \gcd(a, b) = p_1^{\min\{n_1, m_1\}} p_2^{\min\{n_2, m_2\}} \cdots p_k^{\min\{n_k, m_k\}}.$$
Example 6.14
Determine $\gcd(2^{100}, 100^2)$.
Solution. The prime factorization of $2^{100}$ is just $2^{100}$, while for $100^2$ we get
$$100^2 = (2^2 \cdot 5^2)^2 = 2^4 \cdot 5^4.$$
The greatest common divisor is thus
$$d = \gcd(2^{100}, 100^2) = 2^{\min\{4, 100\}} 5^{\min\{0, 4\}} = 2^4 = 16.$$
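The minimum-of-exponents rule is short to express in code. The sketch below assumes each factorization is supplied as a dictionary from primes to exponents, with a missing prime standing for exponent zero; the function name `gcd_from_factorizations` is hypothetical, and `math.gcd` is used only as an independent check.

```python
from math import gcd


def gcd_from_factorizations(fa: dict[int, int], fb: dict[int, int]) -> int:
    """Multiply p ** min(exponent in a, exponent in b) over the shared primes."""
    result = 1
    # A prime missing from either factorization has minimum exponent 0.
    for p in fa.keys() & fb.keys():
        result *= p ** min(fa[p], fb[p])
    return result


d = gcd_from_factorizations({2: 100}, {2: 4, 5: 4})
print(d)
print(d == gcd(2 ** 100, 100 ** 2))
```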
6.3 Linear Diophantine Equations

Definition 6.15
Given a, b, d ∈ Z, a Linear Diophantine Equation (in two variables) is any equation of the
form
ax + by = d.
Certainly there are rational solutions to this equation, but we are more interested in finding
integer solutions.
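Theorem 6.16
The linear Diophantine equation ax + by = c has an integer solution if and only if gcd(a, b) divides c.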
Working back up through these equations allows us to write d in terms of a and b, so there exist
$x_0, y_0$ such that $ax_0 + by_0 = d$. Multiplying through by k we get
$$a(kx_0) + b(ky_0) = dk = c$$
as required.
Example 6.17
In Example 6.13 we showed that gcd(504, 1155) = 21. Find a solution to the Diophantine
equation 504x + 1155y = 42.
Solution. Since gcd(504, 1155) = 21|42 we know that a solution exists. We found in Example 6.13
that
1155 = 504(2) + 147
504 = 147(3) + 63
147 = 63(2) + 21
Write the last line as 21 = 147 + 63(−2). We now solve each remaining equation for the remainder
to find 63 = 504 + 147(−3) and 147 = 1155 + 504(−2). Substituting these we get
21 = 147 + 63(−2)
= 147 + [504 + 147(−3)](−2) from 63 = 504 + 147(−3)
= 147(7) + 504(−2)
= [1155 + 504(−2)](7) + 504(−2) from 147 = 1155 + 504(−2)
= 1155(7) + 504(−16).
This is not the desired solution though, so we multiply through by 2 to get
504(−32) + 1155(14) = 42
as required.
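The back-substitution in this example can be mechanized. The sketch below uses the iterative form of the extended Euclidean algorithm, a standard technique that tracks the coefficients alongside the remainders rather than substituting backwards at the end; the function name `extended_gcd` is chosen for illustration.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) with a * x + b * y == g == gcd(a, b), for positive a, b."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Each remainder stays an integer combination of the original a and b.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


g, x0, y0 = extended_gcd(504, 1155)
print(g, x0, y0)
print(504 * x0 + 1155 * y0)

# Scale by 42 // g to solve 504x + 1155y = 42.
k = 42 // g
print(504 * (k * x0) + 1155 * (k * y0))
```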
There are in fact infinitely many solutions to a solvable Diophantine equation. How do we find
them in general?
Proposition 6.18
Suppose that a, b, c ∈ Z and $(x_0, y_0)$ satisfy $ax_0 + by_0 = c$. If d = gcd(a, b) ≠ 0 then the
general solution to the Diophantine equation ax + by = c is given by
$$x = x_0 + n\frac{b}{d}, \qquad y = y_0 - n\frac{a}{d}, \qquad \text{for all } n \in \mathbb{Z}.$$
Note that it does not matter whether you put the minus sign in the x term or the y term,
so long as they are opposing signs.
Proof. Subtract the equations $ax_0 + by_0 = c$ and ax + by = c to find that $a(x - x_0) + b(y - y_0) = 0$.
Divide through by d and re-arrange to find that
$$\frac{a}{d}(x - x_0) = -\frac{b}{d}(y - y_0).$$
The student can show that gcd(a/d, b/d) = 1 (Good exercise!). Moreover, a/d divides the left hand
side, and so must also divide the right hand side. Since gcd(a/d, b/d) = 1, then by Proposition
6.4 we know that $(a/d) \mid (y - y_0)$. Similarly, $(b/d) \mid (x - x_0)$, so there exists n ∈ Z such that
$x - x_0 = n(b/d)$ so that
$$-\frac{b}{d}(y - y_0) = \frac{a}{d}(x - x_0) = n\frac{ab}{d^2} \quad\Rightarrow\quad bd(y_0 - y) = anb.$$
Solving for x and y, we find that
$$x = x_0 + n\frac{b}{d}, \qquad y = y_0 - n\frac{a}{d}.$$
Now in fact every such n works, since
$$ax + by = a\left(x_0 + n\frac{b}{d}\right) + b\left(y_0 - n\frac{a}{d}\right) = (ax_0 + by_0) + \frac{anb}{d} - \frac{anb}{d} = ax_0 + by_0 = c.$$
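The general solution can be checked numerically for a range of n. The sketch below uses the equation 504x + 1155y = 42 with the particular solution (−32, 14) from Example 6.17; the function name `general_solution` is hypothetical.

```python
from math import gcd


def general_solution(a: int, b: int, x0: int, y0: int, n: int) -> tuple[int, int]:
    """Return the n-th solution (x0 + n * b/d, y0 - n * a/d), where d = gcd(a, b)."""
    d = gcd(a, b)
    return x0 + n * (b // d), y0 - n * (a // d)


solutions = [general_solution(504, 1155, -32, 14, n) for n in range(-5, 6)]

# Every member of the family satisfies the original equation.
print(all(504 * x + 1155 * y == 42 for x, y in solutions))
print(general_solution(504, 1155, -32, 14, 1))
```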
Example 6.19
Find all integer solutions to the Diophantine equation 504x + 1155y = 42.
Solution. We have already found a particular solution; namely, 504(−32) + 1155(14) = 42. Furthermore,
d = gcd(504, 1155) = 21 and so the general solutions are of the form
$$x = -32 + \frac{1155}{21}n = -32 + 55n, \qquad y = 14 - \frac{504}{21}n = 14 - 24n.$$
It is sometimes easier to normalize by the greatest common divisor before starting. Take
our carry-through example, where we want to solve 504x + 1155y = 42. Dividing through by the
greatest common divisor d = gcd(504, 1155) = 21 we get the equation
24x + 55y = 2.
Since gcd(24, 55) = 1 we can find a solution to 24x + 55y = 1, then multiply by 2. To do this, we
again use the Euclidean algorithm to find
55 = 24(2) + 7
24 = 7(3) + 3
7 = 3(2) + 1
3 = 1(3) + 0
(where of course we already knew the greatest common divisor is 1). Now working backwards
1 = 7 + 3(−2)
= 7 + [24 + 7(−3)](−2)
= 7(7) + 24(−2)
= [55 + 24(−2)](7) + 24(−2)
= 55(7) + 24(−16).
Multiplying by 2 gives us our solution 24(−32) + 55(14) = 2. Notice that (−32, 14) is the same
particular solution we found above, as should be the case. Moreover, the general solutions are easily
read off as x = −32 + 55n and y = 14 − 24n, which is again the same general solution we found above.
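Example 6.20
Determine whether the Diophantine equation 504x + 1155y = 42 has any solutions in which both x and y
are positive.
Solution. We know that the general solutions are of the form x = −32 + 55n and y = 14 − 24n.
We require both numbers to be positive, giving −32 + 55n > 0 and 14 − 24n > 0, which we can
solve for n to get
$$\frac{32}{55} < n < \frac{14}{24}.$$
Both bounds lie strictly between 0 and 1, so there are no integers which satisfy this inequality, and hence
there are no positive solutions.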
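A brute-force search agrees with this conclusion. The sketch below scans a finite window of n, which is an arbitrary choice made for illustration; outside any such window the argument in the text still applies, since x ≤ −32 whenever n ≤ 0 and y ≤ −10 whenever n ≥ 1.

```python
# Scan the family x = -32 + 55n, y = 14 - 24n for solutions with x, y > 0.
positive = [
    (x, y)
    for n in range(-1000, 1001)
    for (x, y) in [(-32 + 55 * n, 14 - 24 * n)]
    if x > 0 and y > 0
]
print(positive)
```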
6.4 Relations on Sets

Given a set S, a relation on S is a way of comparing two elements of the set, or rather, specifying
their relationship. For example, equality is a relation, for when we write a = b we are specifying
a relationship between a and b. Similarly, a < b is a relation. Abstractly, if a, b ∈ S we write the
relation as aRb or R(a, b). This describes a heuristic, but is not a formal definition, which is the
following:
Definition 6.21
Given a set S, a relation on S is any subset $S_R \subseteq S \times S$.
This seems pretty nebulous, but the subset precisely captures when elements are related. For
example, if S = Z and R is <, then $S_R = \{(a, b) \in \mathbb{Z} \times \mathbb{Z} : a < b\}$. Here we see that $(-2, 5) \in S_R$
since −2 < 5, while $(15, 14) \notin S_R$ since 15 ≮ 14.
Given a relation R as in the first paragraph, we can define the set $S_R$ as follows: Define a
function r : S × S → {0, 1} such that
$$r(a, b) = \begin{cases} 1 & \text{if } aRb \\ 0 & \text{otherwise} \end{cases}$$
and set $S_R = r^{-1}(1)$. Conversely, given a set $S_R$ we say aRb if $(a, b) \in S_R$.
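To distinguish between different types of relations, we define properties that the relation can
exhibit. If R is a relation on S then R is
• [Reflexive] if aRa for all a ∈ S.
• [Symmetric] if aRb implies bRa.
• [Anti-symmetric] if aRb and bRa together imply a = b.
• [Transitive] if aRb and bRc together imply aRc.
• [Total] if for all a, b ∈ S, either aRb or bRa.
A relation that is reflexive, anti-symmetric, and transitive is called an order relation, while one that
is reflexive, symmetric, and transitive is called an equivalence relation.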
Example 6.23
Let X be a set, and define a relation on the power set P(X) by subset inclusion; namely
ARB if A ⊆ B. Determine which of the five previous properties are satisfied by this relation.
Solution. This relation is reflexive, since A ⊆ A is always true. It is transitive, since if A ⊆ B and
B ⊆ C then A ⊆ C. Finally, it is also anti-symmetric, for if A ⊆ B and B ⊆ A then A = B. Hence
subset inclusion is an order relation.
However, notice that inclusion is not total. Given two arbitrary subsets A, B ⊆ X, they need
not be related: it could be the case that A ⊆ B or B ⊆ A, but in general neither need
be true. We say that subset inclusion is a partial ordering.
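These properties can be checked exhaustively on a small power set. The sketch below assumes X = {1, 2} purely for illustration and tests the relation A ⊆ B with Python's `<=` operator on frozensets; the helper names are hypothetical.

```python
from itertools import combinations


def power_set(x):
    """Return all subsets of x as frozensets."""
    items = sorted(x)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


def related(a, b):
    """The relation A R B, meaning that A is a subset of B."""
    return a <= b


P = power_set({1, 2})

reflexive = all(related(a, a) for a in P)
antisymmetric = all(
    a == b for a in P for b in P if related(a, b) and related(b, a)
)
transitive = all(
    related(a, c)
    for a in P for b in P for c in P
    if related(a, b) and related(b, c)
)
total = all(related(a, b) or related(b, a) for a in P for b in P)

print(reflexive, antisymmetric, transitive, total)
```

Totality fails here because {1} and {2} are not comparable, which is exactly the situation described in the solution.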
Example 6.24
Solution. We need to check that divisibility is transitive, reflexive, and anti-symmetric, but non-total.
Being reflexive is immediate, since a|a for all a ∈ Z. Transitivity is true, for we have shown
that if a|b and b|c then a|c.
Anti-symmetry requires a bit of work, but isn’t too bad. Assume that a|b and b|a. If either is
zero then, since the relation 0|x can only be true if x = 0, the other is zero as well, so a = b. Thus
assume that neither a nor b is zero, so there exist k, ℓ ∈ N such that ak = b and bℓ = a. Multiplying
the first equation by ℓ we get
akℓ = bℓ = a ⇒ a(kℓ − 1) = 0.
Since a ≠ 0 this forces kℓ = 1, and as k, ℓ ∈ N we must have k = ℓ = 1, so that a = b.
Example 6.25
• [Reflexive] Let a ∈ R. Since functions have unique outputs, f(a) = f(a) showing that a ≅ a.
• [Symmetric] If a ≅ c then f(a) = f(c), or equivalently f(c) = f(a), showing that c ≅ a.
Definition 6.26
Given an equivalence relation ≅ on a set S, the equivalence class of an element a ∈ S is the
set
[a] = {x ∈ S : x ≅ a}.
Solution. Let’s start with a simple example, and look at the equivalence class [0]. By definition,
[0] = {x ∈ R : x ∼ 0} = {x ∈ R : x − 0 ∈ Z} = Z
so the equivalence class [0] is precisely the integers. What about something like [1.5]? By definition,
[1.5] = {x ∈ R : x − 1.5 ∈ Z}.
So when will x − 1.5 be an integer? Precisely when the decimal part of x is 0.5; for example,
4.5 − 1.5 ∈ Z and −2.5 − 1.5 ∈ Z.
In general, the equivalence class [x] consists of all those real numbers that have the same decimal
component as x.
Theorem 6.28
If ∼ is an equivalence relation on a set S, then the equivalence classes of ∼ partition S; that is, every
element of S belongs to an equivalence class, and distinct equivalence classes are disjoint.
Proof. Certainly every element belongs to an equivalence class, so we need only show that two
distinct equivalence classes have no intersection. Let [x] and [y] be distinct equivalence classes, and
for the sake of contradiction assume that [x] ∩ [y] ≠ ∅. Choose z ∈ [x] ∩ [y], so that z ∼ x and z ∼ y.
By symmetry and transitivity, x ∼ y, so that [x] = [y], which contradicts the fact that these were
distinct equivalence classes. We conclude that distinct equivalence classes are disjoint.
6.5 Modular Arithmetic

Definition 6.29
If n ∈ N and a, b ∈ Z, we say that a ≡ b (mod n) (read: a is congruent to b mod n) if
n|(b − a).
For example, if n = 4 then 1 ≡ 29 (mod 4), since 4|(29 − 1) = 28. The congruence classes are
precisely
[0] = {…, −12, −8, −4, 0, 4, 8, 12, …}
[1] = {…, −11, −7, −3, 1, 5, 9, 13, …}
[2] = {…, −10, −6, −2, 2, 6, 10, 14, …}
[3] = {…, −9, −5, −1, 3, 7, 11, 15, …}
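The congruence classes modulo 4 can be generated by sorting a window of integers according to their remainders. The sketch below relies on Python's `%` operator, which for a positive modulus always returns a value in 0, …, n − 1, even for negative inputs; the function name `congruence_classes` and the window −12 to 15 are choices made for illustration.

```python
def congruence_classes(n: int, lo: int, hi: int) -> dict[int, list[int]]:
    """Group the integers lo..hi (inclusive) by their remainder modulo n."""
    classes: dict[int, list[int]] = {r: [] for r in range(n)}
    for k in range(lo, hi + 1):
        classes[k % n].append(k)
    return classes


classes = congruence_classes(4, -12, 15)
for r, members in classes.items():
    print(f"[{r}] contains {members}")
```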
Proposition 6.30
For each n ∈ N, congruence modulo n is an equivalence relation on Z.
and since n divides each of the above terms, by Proposition 3.27 we know it divides the sum as
well. Thus n|(c − a), or a ≡ c (mod n).
• [Symmetric] Suppose that a ≡ b (mod n), so that n|(b − a). Then n|(a − b) = −(b − a), so
b ≡ a (mod n) as required.
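Proposition 6.31
If a ≡ r (mod n) and b ≡ s (mod n), then
1. (a + b) ≡ (r + s) (mod n).
2. ab ≡ rs (mod n).
Proof. Fix a, b, r, s and assume that a ≡ r (mod n) and b ≡ s (mod n). Hence n|(r − a) and
n|(s − b). Equivalently, there exist k, ℓ ∈ Z such that nk = r − a and nℓ = s − b.
1. Adding these two equations gives
n(k + ℓ) = (r + s) − (a + b)
so n divides (r + s) − (a + b), which is the statement that (a + b) ≡ (r + s) (mod n).
2. We can write
rs − ab = (a + nk)(b + nℓ) − ab = n(aℓ + bk + nkℓ)
so n|(rs − ab), or ab ≡ rs (mod n).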
The equivalence classes under congruence modulo n are called congruence classes,
of which there are precisely n; namely [0], [1], [2], …, [n − 1]. We often denote this set of equivalence
classes by
$$\mathbb{Z}_n = \mathbb{Z}/n\mathbb{Z} = \{[0]_n, [1]_n, \ldots, [n-1]_n\}.$$
According to Proposition 6.31, we can add and multiply congruence classes exactly as we would
integers, as long as we reduce modulo n. So for example, working modulo 3 we have
[2] + [2] = [4] = [1] and [2] · [2] = [4] = [1].
Example 6.32
Solution. Note that the last digit d satisfies d ≡ 4441 (mod 10). Moreover,
Example 6.33
Solution. By contrapositive, assume that a − 4 is divisible by 7; that is, a ≡ 4 (mod 7). Then