Probability and Functions in Card Games

The love for math 5

Uploaded by

dirzegotre
Copyright
© © All Rights Reserved

7.2.

DEFINITION AND EXAMPLES 481

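[Figure: schematic diagram of g : A → B. The domain A = {a, b, c, d, e} and the codomain B = {1, 2, 3, 4, 5, 6} are drawn as ovals containing labeled dots, with an arrow from each element of A to its output in B.]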
However, this does somehow represent the idea of a function. In this picture,
we have represented the domain A by an oval, and the same with the codomain
B. The elements of A and B are represented by dots inside those ovals (and
they are labeled), and we have drawn arrows between those dots based on what
the function g : A → B does.
Mostly, this method is used to explore a certain property of a function and
perhaps construct a counterexample to a claim. By drawing some dots and
arrows and playing around with how they connect, we can perhaps develop the
underlying structure of an example; then, we can go back and assign some names
and formulas to the parts of the diagram and make the picture more rigorous.
We will use some schematic diagrams to illustrate some properties and con-
cepts as we proceed, but these will always be accompanied by a more rigorous
statement or description. We encourage you to employ a similar method.

7.2.5 Questions & Exercises


Remind Yourself
Answer the following questions briefly, either out loud or in writing. These
are all based on the section you just read, so if you can’t recall a specific
definition or concept or example, go back and reread that part. Making sure you
can confidently answer these before moving on will help your understanding and
memory!

(1) Write down the definition of a function, without looking it up. Then,
compare to our definition. Does yours convey the same information? If not,
what did you miss?

(2) What is the difference between the domain and the codomain of a func-
tion?

(3) What does it mean for a function to be well-defined?


482 CHAPTER 7. FUNCTIONS AND CARDINALITY

(4) What is the identity function and how is it defined?


(5) How can we prove that two functions are equal?

Try It
Try answering the following short-answer questions. They require you to actu-
ally write something down, or describe something out loud (to a friend/class-
mate, perhaps). The goal is to get you to practice working with new concepts,
definitions, and notation. They are meant to be easy, though; making sure you
can work through them will help you!
(1) Use proper notation to define a function that inputs an integer and outputs
the square root of its absolute value.
What is the domain of this function? What is its codomain?
(2) Use proper notation to define a function that inputs a pair of natural num-
bers and outputs their average (arithmetic mean).
What is the domain of this function? What is its codomain?
(3) Let A = {−2, −1, 0, 1, 2}. Let g : A → A be defined by ∀x ∈ A . g(x) = x² − 3.
Draw a schematic diagram to determine whether g is well-defined
or not. Is it?
(4) Let X be any set. Use proper notation to define a function that inputs a
subset of X and outputs that set’s complement (in the context of X).
What is the domain of this function? What is its codomain?
(5) Let B = {−1, 0, 1}. Let h : B → B be defined by ∀b ∈ B . h(b) = b³. What
special function is this equal to?
(6) Let f : Z × Z → N be defined by ∀(x, y) ∈ Z × Z . $f(x, y) = \frac{1}{2}\,|x + 1| \cdot |y|$. Is
this a well-defined function? Why or why not?

7.3 Images and Pre-images


7.3.1 Image: Definition and Examples
Think back to the definition of a function. We required that every input have
a unique output. This ensures that a function is defined everywhere on its
domain. What about the codomain, though? All we required there was that
all of the outputs belong to the codomain. We never said anything about “how
much” of the codomain is “covered”. The idea of the image of a function is to
capture exactly this notion. As we will see from some examples, it is not always
easy to determine precisely what the image of a function is, even when we know
what the codomain is. It is for this reason that we defined a function and its
codomain first, before introducing the image; so don’t think we were trying to
fool you or anything!
7.3. IMAGES AND PRE-IMAGES 483

Definition
Definition 7.3.1. Let A, B be sets and let f : A → B be a function. Let
X ⊆ A.
The image of X under the function f is written and defined as
Imf (X) = {b ∈ B | ∃a ∈ X . f (a) = b}
That is, the image of X under f is the set of all “outputs” that come from
“inputs” in the set X.
An equivalent way of writing this is
Imf (X) = {f (a) | a ∈ X}
(We will sometimes abbreviate the notation as just Im(X), when the function
is clearly identified and unambiguous, and consequently refer to the set as just
“the image of X”, instead of “the image of X under f ”.)
When we say the image of f , we mean the image of the entire domain, i.e.
Imf (A).
Notice that this is defined for any subset of the domain, X ⊆ A, so we can
talk about the image of any “piece” of the domain, or all of it. We will see some
examples now—and exercises later—that consider strict subsets X ⊂ A, as well
as A itself.

One Observation
Notice that
Imf (A) ⊆ B
no matter what f and A and B are. This follows by definition, since we used
set-builder notation to define the image via elements of B. In the next section,
we will explore what happens when Imf (A) = B.
For now, let’s practice identifying the images of certain functions. In some
cases, we will be provided with a function and its image and asked to verify this
claim, but in other cases, we will need to develop some techniques to figure out
what the image is in the first place!

Examples
Example 7.3.2. Define a function g : A → B by setting A = {a, b, c, d, e} and
B = {1, 2, 3, 4, 5, 6} and
g = {(a, 2), (b, 3), (c, 3), (d, 1), (e, 6)}
Define X1 = {a, b, c} and X2 = {a, c, e} and X3 = {c, d, e}.
You might notice that this is the same function we defined in the schematic
diagram in the last section! Let’s see that diagram again, because it can help
us identify the images in the following list.
[Figure: schematic diagram of g : A → B, with arrows a → 2, b → 3, c → 3, d → 1, e → 6. The elements 4 and 5 of B have no incoming arrows.]
(1) Img ({a}) = {2}
This is because g(a) = 2.
Notice the use of set brackets. We always find the image of a set, so writing
Img (a) would be incorrect.
(2) Img ({b, c}) = {3}
This is because g(b) = g(c) = 3.
(3) Img (X1 ) = {2, 3}
This is because g(b) = g(c) = 3 and g(a) = 2.
(4) Img (X2 ) = {2, 3, 6}
This is because g(a) = 2 and g(c) = 3 and g(e) = 6.
(5) Img (X3 ) = {1, 3, 6}
This is because g(c) = 3 and g(d) = 1 and g(e) = 6.
(6) Img (A) = {1, 2, 3, 6}
Looking at the set B in the schematic diagram, we see that these are the only
values that are “hit” by the function. Notice 4, 5 ∈ B but 4, 5 ∉ Img (A), so
Img (A) ⊂ B (a proper subset).
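
For a function on a small finite set, the images in this example can be checked mechanically. The following is only a sketch: it stores g as a Python dictionary, and the helper name `image` is our own choice for illustration, not notation from the text.

```python
# The function g from Example 7.3.2, stored as a dict (input -> output).
g = {"a": 2, "b": 3, "c": 3, "d": 1, "e": 6}
B = {1, 2, 3, 4, 5, 6}  # the codomain


def image(f, X):
    """Return Im_f(X) = {f(a) | a in X} for a finite function f given as a dict."""
    return {f[a] for a in X}


X1 = {"a", "b", "c"}
X2 = {"a", "c", "e"}
X3 = {"c", "d", "e"}

for name, X in [("{a}", {"a"}), ("X1", X1), ("X2", X2), ("X3", X3), ("A", set(g))]:
    print(name, "->", sorted(image(g, X)))

# Elements of the codomain that are never "hit" by g.
print("missed:", sorted(B - image(g, set(g))))
```

Notice that `image` always takes a set as its second argument, which mirrors the remark in item (1) about set brackets.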

Example 7.3.3. Consider the temperatures (in degrees Celsius) where water is
in its liquid state. Specifically, define the set

C = {x ∈ R | 0 < x < 100}

and define the function F : C → R by

$$\forall c \in C .\ F(c) = \frac{9}{5}\,c + 32$$

What is ImF (C)? What does it represent?


To approach questions like these, we must (a) identify a claim for what ImF (C)
is by defining a set, and then (b) prove that the set we defined is actually equal
to ImF (C), in the sense of sets. This means we will use a double-containment
argument!
Solution: Define S = {y ∈ R | 32 < y < 212}. (Notice that this represents the
set of temperatures (in degrees Fahrenheit) where water is in its liquid state.)
We claim S = ImF (C).

It is hard to give advice about how to come up with claims like this, in
general. Most often, this relies on some playing around with the function and
testing some values, and perhaps some insight about some other properties of the
function. In this specific case, we notice that this function is increasing; that is,
if we have two input values with c1 < c2 , then we know that F (c1 ) < F (c2 ). We
can glean this information from the graph of the function (see above) and/or
recognizing it is a linear polynomial. Accordingly, to identify the image, we
just have to consider the smallest and largest inputs and identify their outputs.
(Again, we can glean this information from a graph.) We find that
$$F(0) = 0 + 32 = 32 \quad\text{and}\quad F(100) = \frac{900}{5} + 32 = 212$$
From this, we defined the set S. (Also, notice that we had to use “<” in the
inequality because, in fact, 0 ∉ C, the domain!) We also give this set a name so
that we can refer to it later without implicitly claiming, already, that it is the
image. This is a somewhat subtle distinction, but an important one! Now, let’s
prove our claim.
Proof. First, we’ll prove ImF (C) ⊆ S. (In other words, we’ll prove that every
output of the function F actually satisfies the inequality in the definition of S.)
(To do this we will start with an arbitrary element of ImF (C), and appeal to
the definition of image to bring an element of the domain into play.)
Let y ∈ ImF (C) be arbitrary and fixed. By the definition of image, this means
∃x ∈ C such that F (x) = y. Let such an x be given.
By the definition of C, we know 0 < x < 100. By the definition of F , we know
$F(x) = \frac{9}{5}x + 32$. Since multiplying by a positive number and adding the same number to both
sides preserves inequalities, we can deduce that
$$\frac{9}{5} \cdot 0 + 32 < F(x) < \frac{9}{5} \cdot 100 + 32$$
and, simplifying, this tells us

$$32 < F(x) < 212$$

Thus, F (x) ∈ S, i.e. y ∈ S. Therefore, ImF (C) ⊆ S.


Second, we’ll prove S ⊆ ImF (C). (In other words, we’ll prove that every
element of S is actually “achieved” by the function F somehow. This amounts
to proving an existential claim, i.e. that some element of the domain exists.)
Let s ∈ S be arbitrary and fixed. By the definition of S, we know that s ∈ R
and 32 < s < 212. We need to prove that ∃c ∈ C . F (c) = s.
(By doing some scratch work on the side, that you can work through on your
own, we came up with the following idea. Just set an expression equal to s and
solve for c . . . )
Define $c = \frac{5}{9}(s - 32)$.
Let’s show c ∈ C. By using the information we have about s and manipulating
the inequalities, we observe that

$$\begin{aligned}
32 < s < 212 &\implies 0 < s - 32 < 180 \\
&\implies 0 < \frac{5}{9}(s - 32) < \frac{5}{9} \cdot 180 = 100 \\
&\implies 0 < c < 100
\end{aligned}$$

Since c ∈ R, certainly, this shows that c ∈ C.


Next, let’s show that F (c) = s. We observe that
$$F(c) = \frac{9}{5}\,c + 32 = \frac{9}{5}\left(\frac{5}{9}(s - 32)\right) + 32 = (s - 32) + 32 = s$$

Together, this shows that s ∈ ImF (C), as well. Thus, S ⊆ ImF (C).
Overall, by a double-containment argument, we conclude that S = ImF (C).
The second half of our proof is certainly the harder part, and this is generally
true of proofs like this. In coming up with a candidate c, we essentially have to
“undo” the process that the function F does and find an input c for our given
output s. In a case like this, where the function is a numerical/arithmetical
operation on real numbers, the best route is to set up the desired equality, like
$$\frac{9}{5}\,c + 32 = s$$
and solve the equality for c. This function is linear, so this process only produces
one such c but, in general, we might expect multiple values of c to work.
Ultimately, we only need one working value to complete this part of the proof,
so we can just select any one that works and use that as our claim. Sometimes,
though, this makes it harder to find such a value. It all depends on the example
at hand. Other times, we might be working with functions that aren’t defined
on sets of numbers, and we have to use some more abstract insight to come up
with a candidate element. Again, this all depends on the given situation, and
with practice, you’ll become much better at it!
Oh right, we asked what this image represents! Since the domain represented
the temperatures, in degrees Celsius, at which water is a liquid, the image
represents the temperatures, in degrees Fahrenheit, at which water is a liquid.
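
The second half of the proof above can be spot-checked numerically. This is only a sketch, not a replacement for the proof: it tests a handful of sample values of s that we picked ourselves, and it uses exact rational arithmetic (`fractions.Fraction`) so that the equality F(c) = s is not blurred by rounding. The function name `candidate` is hypothetical.

```python
from fractions import Fraction


def F(c):
    """Celsius to Fahrenheit: F(c) = (9/5) c + 32."""
    return Fraction(9, 5) * c + 32


def candidate(s):
    """The input proposed in the proof: c = (5/9)(s - 32)."""
    return Fraction(5, 9) * (s - 32)


# A few sample elements of S = {y | 32 < y < 212}; these are arbitrary choices.
samples = [Fraction(33), Fraction(98, 3), Fraction(100), Fraction(2119, 10)]

for s in samples:
    c = candidate(s)
    in_domain = 0 < c < 100   # is c an element of C?
    hits_s = F(c) == s        # does F send c to s?
    print(f"s = {s}: c = {c}, c in C: {in_domain}, F(c) == s: {hits_s}")
```

A check like this can catch an algebra slip in the scratch work, but only the double-containment argument shows that every element of S is achieved.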
Let’s look at another example of proving the image of a function is a partic-
ular set.
Example 7.3.4. Define f : R → R by

$$\forall x \in \mathbb{R} .\ f(x) = \frac{x^2}{1 + x^2}$$

Let’s determine the image, Imf (R), and prove our claim!
Here, again, we must use some outside strategies and intuition to identify the
image first. Using some techniques from calculus or algebra, we could plot the
graph of this function and try to guess the image. Try that if you’d like. You’ll
end up with this graph:

We can also recognize that the denominator is always exactly 1 greater than the
numerator and so, as x gets larger and larger, those two quantities get closer and
closer together, relatively speaking. (That is, their ratio approaches 1.) Also, the
numerator is nonnegative and the denominator is positive, so their ratio is at least 0.
In any event, we can piece our observations together and make the following claim:
Define the set


T = {y ∈ R | 0 ≤ y < 1}
We claim that T = Imf (R).
We now follow the same type of strategy we employed in the previous example.
Before we do so, let’s remember that the second part of that proof—showing the
claimed set is a subset of the image—was the harder part, and try to anticipate
what will happen there.
In that part, we will be working with an arbitrary element y ∈ T and we
will want to find an element x ∈ R that satisfies f (x) = y. To find such an x,
let’s set up the equality and try to solve for x:
$$\begin{aligned}
y = \frac{x^2}{1 + x^2} &\iff (1 + x^2)\,y = x^2 \\
&\iff y + y x^2 - x^2 = 0 \\
&\iff (y - 1)\,x^2 = -y \\
&\iff x^2 = \frac{y}{1 - y}
\end{aligned}$$
Now what? Can we be assured $\frac{y}{1-y} \in \mathbb{R}$, even? Can we be assured it’s nonnegative,
so that there exists such an x? What about the fact that there might be
two roots? Think about these potential issues and try to write your own version
of this proof before reading on for ours!

Proof. First, let’s prove Imf (R) ⊆ T .


Let y ∈ Imf (R) be arbitrary and fixed. By the definition of image, we know
∃x ∈ R such that f (x) = y. Let such an x be given.
Since x ∈ R, we know $x^2 \ge 0$ and $0 < x^2 + 1$. We can then deduce that
$0 < \frac{1}{x^2 + 1}$.
By multiplying the previous two inequalities (which we can do since all the
terms are non-negative) we may deduce that $0 \le \frac{x^2}{1 + x^2}$.

Next, we know that $0 \le x^2 < x^2 + 1$, so $\frac{x^2}{1 + x^2} < 1$, as well. (Note: why was it
important to point out that $x^2 \ge 0$? What can go wrong there?)
This shows that $0 \le \frac{x^2}{1 + x^2} < 1$. Since $y = f(x) = \frac{x^2}{1 + x^2}$, this is equivalent to
saying 0 ≤ y < 1.
Thus, y ∈ T , and so Imf (R) ⊆ T .

Second, let’s prove T ⊆ Imf (R).


Let y ∈ T be arbitrary and fixed. This means y ∈ R and 0 ≤ y < 1.
To show y ∈ Imf (R), as well, we must produce an x such that f (x) = y.
We claim that $x = \sqrt{\frac{y}{1-y}}$ works.
Notice that y ≥ 0, and y < 1 implies −y > −1 so 1 − y > 0. Thus, $\frac{y}{1-y} \ge 0$,
and so x ∈ R is well-defined as a square root, and x belongs to the domain R.
Next, notice that $x^2 = \frac{y}{1-y}$, and so
$$f(x) = \frac{x^2}{1 + x^2} = \frac{\frac{y}{1-y}}{1 + \frac{y}{1-y}} = \frac{\frac{y}{1-y}}{\frac{(1-y)+y}{1-y}} = \frac{\frac{y}{1-y}}{\frac{1}{1-y}} = \frac{y}{1-y} \cdot \frac{1-y}{1} = \frac{y}{1} = y$$

This shows that y ∈ Imf (R), and so T ⊆ Imf (R).


Overall, by a double-containment proof, we conclude that T = Imf (R).
Notice how we addressed the issues discussed before the proof. Yes, two
potential x values existed that would work (namely, the + and − square roots)
but we only needed one, so we just picked one (the positive one) and ran with
it.
(Questions: What if this function was defined only on the nonnegative real num-
bers? What about just the negative real numbers? How might that restriction
affect our choice?)
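
As a sanity check on the choice of x in this example, here is a small numerical sketch. It uses floating-point arithmetic, so equality is tested only up to a tolerance, and the sample values of y are our own arbitrary picks from T. The name `witness` is hypothetical.

```python
import math


def f(x):
    """f(x) = x^2 / (1 + x^2)."""
    return x**2 / (1 + x**2)


def witness(y):
    """The nonnegative input used in the proof; assumes 0 <= y < 1."""
    return math.sqrt(y / (1 - y))


# Arbitrary sample elements of T = {y | 0 <= y < 1}.
samples = [0.0, 0.25, 0.5, 0.9, 0.999]

for y in samples:
    x = witness(y)
    # Both square roots, x and -x, are sent to (approximately) y.
    ok_pos = math.isclose(f(x), y, rel_tol=1e-9, abs_tol=1e-12)
    ok_neg = math.isclose(f(-x), y, rel_tol=1e-9, abs_tol=1e-12)
    print(f"y = {y}: x = {x:.6f}, f(x) ~ y: {ok_pos}, f(-x) ~ y: {ok_neg}")
```

The check with −x illustrates the remark above that two inputs work for each y > 0, even though the proof only needs one of them.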
Example 7.3.5. Consider the function p : N × N → N defined by

∀(a, b) ∈ N × N . p(a, b) = ab + a

What is Imp (N × N)?


This example might feel a little trickier because the domain is a Cartesian
product of sets; that is, p inputs an ordered pair of natural numbers and outputs
a single natural number. A good approach in a situation like this is to just start
plugging in some values and seeing what happens. Consider the following table
of values as a way to get started, where the left column indicates values of a,
the top row indicates values of b, and the table entries are the values of p(a, b).

a \ b    1    2    3    4    5
  1      2    3    4    5    6
  2      4    6    8   10   12
  3      6    9   12   15   18
  4      8   12   16   20   24
  5     10   15   20   25   30

It looks like every natural number is “achieved” by the function p, except for 1.
Specifically, look at the top row of the array of values: there are all the natural
numbers except 1. Let’s use this insight in the following proof.
Proof. Let V = N − {1}. We claim V = Imp (N × N).
First, we prove Imp (N × N) ⊆ V . Let n ∈ Imp (N × N) be arbitrary and fixed.
This means n ∈ N and ∃(a, b) ∈ N × N such that p(a, b) = n. Let such (a, b) be
given.
This means n = ab + a. Since a, b ≥ 1, then ab ≥ 1 and so n = ab + a ≥ 2. By
the definition of V , this shows that n ∈ V .
Thus, Imp (N × N) ⊆ V .

(Try to write the next half of the proof before reading on and seeing ours! )

Second, we prove V ⊆ Imp (N × N). Let v ∈ V be arbitrary and fixed.

This means v ∈ N and v ≥ 2. Define (a, b) = (1, v − 1).
Notice that v − 1 ≥ 1, so v − 1 ∈ N and thus (a, b) ∈ N × N.
Also, notice that

p(a, b) = p(1, v − 1) = 1 · (v − 1) + 1 = v − 1 + 1 = v

Thus, p(a, b) = v, and so v ∈ Imp (N × N). Therefore, V ⊆ Imp (N × N).


By a double-containment proof, we have shown V = Imp (N × N).
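
The table-based exploration in this example is easy to automate. The sketch below tabulates p on a finite grid only (the bound `N = 30` is an arbitrary choice of ours), so it illustrates the claim rather than proving it. It follows the observation about the top row of the table: inputs of the form (1, b) already produce every natural number from 2 upward.

```python
def p(a, b):
    """p(a, b) = ab + a, for natural numbers a, b >= 1."""
    return a * b + a


N = 30  # arbitrary bound for the finite search

# All outputs produced by inputs in the N-by-N grid.
hit = {p(a, b) for a in range(1, N + 1) for b in range(1, N + 1)}

print("1 is achieved:", 1 in hit)

# Top row of the table: p(1, b) = b + 1, so v is achieved by (1, v - 1).
top_row_ok = all(p(1, v - 1) == v for v in range(2, N + 1))
print("every v from 2 to N is achieved by (1, v - 1):", top_row_ok)
```

A finite search like this is a good way to form a conjecture about an image; the double-containment proof is still what establishes it for all of N.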

7.3.2 Proofs about Images


You might have observed the following fact by playing around with some of the
examples we have seen. Either way, we can state and prove this claim
by working with the definition of image. Notice that it is a claim about an
arbitrary function; it holds no matter what f is!

Proposition 7.3.6. Let A, B be sets. Let f : A → B be a function. Let


S, T ⊆ A. Then,
Imf (S ∩ T ) ⊆ Imf (S) ∩ Imf (T )

Proof. Let z ∈ Imf (S ∩ T ) be arbitrary and fixed. This means ∃a ∈ S ∩ T such


that f (a) = z. Let such an a be given.
Since a ∈ S ∩ T , we know a ∈ S and a ∈ T .
Since a ∈ S and f (a) = z, we have z ∈ Imf (S), by the definition of image.
Likewise, since a ∈ T and f (a) = z, we have z ∈ Imf (T ).
We deduce that z ∈ Imf (S) ∩ Imf (T ), by the definition of intersection.
This shows the desired set containment.
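
For small finite sets, a containment like this can also be tested exhaustively. The sketch below is an illustration under our own assumptions: it takes A = {1, 2, 3} and a two-element set B with placeholder labels "u" and "v", enumerates every function f : A → B and every pair of subsets S, T ⊆ A, and checks the containment in each case. The helper names `image` and `subsets` are hypothetical.

```python
from itertools import combinations, product


def image(f, X):
    """Im_f(X) for a finite function f stored as a dict."""
    return {f[a] for a in X}


def subsets(elements):
    """All subsets of a finite collection, as Python sets."""
    elements = list(elements)
    return [set(c) for r in range(len(elements) + 1)
            for c in combinations(elements, r)]


A = [1, 2, 3]
B = ["u", "v"]  # placeholder labels for a two-element codomain

checked = 0
strict = 0
for outputs in product(B, repeat=len(A)):   # every function f : A -> B
    f = dict(zip(A, outputs))
    for S in subsets(A):
        for T in subsets(A):
            left = image(f, S & T)
            right = image(f, S) & image(f, T)
            assert left <= right            # the containment of the proposition
            checked += 1
            if left != right:
                strict += 1

print(checked, "cases checked;", strict, "of them have a strict containment")
```

An exhaustive check on one small pair of sets is evidence, not proof; the argument above covers every function on every pair of sets at once.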

Why didn’t we claim an equality here? It turns out that equality need not
hold, in fact! That is, there exists at least one function such that the reverse
containment—namely, Imf (S)∩Imf (T ) ⊆ Imf (S ∩T )—is False. We will provide
an example of such a function below.
(You should try to come up with an example of a function where this reverse
containment does hold. Together, we will have shown that one cannot make a
conclusion that necessarily holds about this containment!)
We will use a schematic diagram to come up with an example with the de-
sired properties. We will then use this to formally define a function and state its
properties, pointing out how they match what will be established in our claim.
We want to point out that employing this technique is perfectly valid, as long
as you go back and write down a formal definition afterwards. Turning in just
a schematic diagram as a “proof” is not rigorous enough, but this can certainly
help guide your intuition into producing fruitful ideas for a proof!
Furthermore, keep in mind that there is no need to construct the most com-
plicated or interesting counterexample in situations like this. If you’re trying
to disprove a universally-quantified statement, you just need one example that
works! In particular, don’t feel like you need to define a function that works
with numbers, using some formula. Sometimes, this will actually make your job
much harder! It’s typically the case that a counterexample can be made using
sets with just a few (i.e. two or three) elements each.

Example 7.3.7. We claim that there exist sets A, B, S, T and a function f : A →
B such that Imf (S) ∩ Imf (T ) ⊈ Imf (S ∩ T ). Let’s figure out how to construct
such an example. Based on our comment above, we are going to try to make
an example where the sets involve three or so elements. Let’s get the process
started by taking A to be {1, 2, 3} and defining f (1):

[Figure: schematic diagram of f : A → B so far, with a single arrow 1 → ⋆.]

Now, just to have a definition in hand, let’s choose S = {1, 2}. It seems like
it will be more reasonable to work with 2 elements in S, so we’ll make that
choice. Also, it seems like we should make f (1) ≠ f (2). Otherwise, Imf (S)
would contain only one element, and there would have been no point in making
S have two elements. So let’s define f (2), as well:
[Figure: schematic diagram of f : A → B with arrows 1 → ⋆ and 2 → □. The set S = {1, 2} is circled in A, and Imf (S) = {⋆, □} is circled in B.]
Now, we need to choose T . It will be interesting to have S ∩ T ≠ ∅, but it would
be hard to handle (perhaps) if T ⊇ S. So, let’s say T = {2, 3}. Then, we just
need to choose f (3). In considering each of these cases, look at the schematic
diagram above, and imagine drawing an arrow to represent f (3).
• What if f (3) = f (2) = □?
In this case, Imf (T ) = {□}, so Imf (S) ∩ Imf (T ) = {□}. But Imf (S ∩ T ) =
{□}, as well! This doesn’t work.
• What if f (3) is something else, like f (3) = ☺?
This doesn’t work either! We will have Imf (S) ∩ Imf (T ) = {□} = Imf (S ∩ T ).
• What if f (3) = f (1) = ⋆?
It looks like this works!
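[Figure: schematic diagram of f : A → B with arrows 1 → ⋆, 2 → □ and 3 → ⋆. The sets S = {1, 2} and T = {2, 3} are circled in A. In B, Imf (S) = Imf (T ) = {⋆, □}, while Imf (S ∩ T ) = {□}.]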
We have made it so that Imf (S) ∩ Imf (T ) is a strict superset of Imf (S ∩ T ).
Look back over our construction, and see if you understand our thought process.
What were the restrictions we had to conform to? Where did we have freedom
of choice? What did we decide to do?
We want to point out that this is absolutely not the only such example, though!
Try to come up with others!
Right now, all we have left to do is take the final diagram we constructed and
use it to define an example and then prove it works. Here we go!
Proof. Define A = {1, 2, 3} and B = {⋆, □}.

Define f : A → B by setting f (1) = ⋆, and f (2) = □, and f (3) = ⋆.
Define S = {1, 2} and T = {2, 3}.
Observe that S ∩ T = {2}, so Imf (S ∩ T ) = {f (2)} = {□}.
However, observe that Imf (S) = Imf (T ) = B, so Imf (S) ∩ Imf (T ) = B ≠ {□}.
Since ⋆ ∈ Imf (S) ∩ Imf (T ) but ⋆ ∉ Imf (S ∩ T ), this proves our claim.
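
Since this counterexample lives on sets with only a few elements, it can be verified directly. In the sketch below, the strings "star" and "box" are arbitrary stand-in labels of ours for the two elements of B; any two distinct labels would do. The helper name `image` is hypothetical.

```python
def image(f, X):
    """Im_f(X) for a finite function f stored as a dict."""
    return {f[a] for a in X}


# f(1) = f(3) is one element of B, and f(2) is the other one.
f = {1: "star", 2: "box", 3: "star"}
S = {1, 2}
T = {2, 3}

left = image(f, S) & image(f, T)   # Im_f(S) ∩ Im_f(T)
right = image(f, S & T)            # Im_f(S ∩ T)

print("Im(S) ∩ Im(T) =", sorted(left))
print("Im(S ∩ T)     =", sorted(right))
print("reverse containment holds:", left <= right)
```

The element "star" is the witness: it lies in both images but does not come from any input shared by S and T.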

We have now seen an example of how to prove a claim about arbitrary


functions and images, as well as how to construct a specific counterexample to
disprove a claim. In the exercises, you will be asked to solve similar problems.
Sometimes, you will need to figure out whether a claim is True or not. (Here, we
told you which claim was valid beforehand.) We recommend trying one of two
things: (1) Try to prove the claim, and see if it breaks down somewhere, or (2)
Try to construct a counterexample, and see if you have trouble doing so. If you
complete either task . . . well, hey, you figured it out! But if you’re struggling, it
might help you figure out what’s really going on.
Specifically, you will be asked to examine the claim we discussed above, but
with “∪” instead of “∩”. What do you think will happen? Go ahead and try it!

7.3.3 Pre-Image: Definition and Examples


A natural question you might have now is: What about going the other way?
Can we take a subset of the codomain and identify the elements whose outputs
“land” in that set? Of course! This next definition provides us a term for this
notion, and you’ll notice many similarities with the definition of image.

Definition
Definition 7.3.8. Let A, B be sets and let f : A → B be a function. Let Y ⊆ B.
The pre-image of Y under the function f is written and defined as

PreImf (Y ) = {a ∈ A | f (a) ∈ Y }

That is, the pre-image of Y under f is the set of all “inputs” that produce an
“output” in Y .
(We will sometimes abbreviate the notation as just PreIm(Y ), when the function
is clearly identified and unambiguous, and consequently refer to the set as just
“the pre-image of Y ”, instead of “the pre-image of Y under f ”.)

Think about this first: What is PreImf (B), where B is the entire codomain?
Look back at the definition: this is the set of all inputs (in A) whose outputs
“land” in B. That’s all of A, of course, since f is a well-defined function!
Accordingly, we will really only be working with sets Y ⊂ B, since those cases
are more interesting.
Examples
Example 7.3.9. This first example uses the same function we defined in the last
section when we discussed images. We’ll show you the schematic diagram again,
but spare you from re-defining all the details of the function. (See Example 7.3.2
for the details.)

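[Figure: schematic diagram of g : A → B, with arrows a → 2, b → 3, c → 3, d → 1, e → 6. The elements 4 and 5 of B have no incoming arrows.]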
Define Z1 = {1, 2, 3} and Z2 = {2, 3, 4} and Z3 = {4, 5, 6}.
Let’s identify the following pre-images and explain them.
(1) PreImg ({1}) = {d}
This is because g(d) = 1 and no other x ∈ A satisfies g(x) = 1.
(Note: We need to use set brackets here. “PreImg (1)” would make no sense.)
(2) PreImg ({4}) = ∅
This is because no x ∈ A satisfies g(x) = 4.
(3) PreImg (Z1 ) = {a, b, c, d}
This is because g(a) = 2, g(b) = g(c) = 3, and g(d) = 1, but no other x ∈ A
satisfies g(x) ∈ Z1 .
(4) PreImg (Z2 ) = {a, b, c}
This is because g(a) = 2 and g(b) = g(c) = 3, but no other x ∈ A satisfies
g(x) ∈ Z2 .
(5) PreImg (Z3 ) = {e}
This is because g(e) = 6, but no other x ∈ A satisfies g(x) ∈ Z3 .
(6) PreImg ({5}) = ∅
This is because ∀x ∈ A . g(x) ≠ 5.
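
As with images, the pre-images of a function on a small finite set can be computed directly from the definition. This is a minimal sketch that stores g as a dictionary; the helper name `preimage` is our own choice for illustration.

```python
# The function g from Example 7.3.2, stored as a dict (input -> output).
g = {"a": 2, "b": 3, "c": 3, "d": 1, "e": 6}


def preimage(f, Y):
    """Return PreIm_f(Y) = {a in A | f(a) in Y} for a finite function f given as a dict."""
    return {a for a in f if f[a] in Y}


Z1 = {1, 2, 3}
Z2 = {2, 3, 4}
Z3 = {4, 5, 6}

for name, Y in [("{1}", {1}), ("{4}", {4}), ("Z1", Z1), ("Z2", Z2), ("Z3", Z3), ("{5}", {5})]:
    print(name, "->", sorted(preimage(g, Y)))
```

Notice that the definition never needs to "invert" g: it only filters the inputs by where their outputs land.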
Example 7.3.10. Let f : R → R be the function defined by ∀x ∈ R . f (x) = x².
Let’s identify a few pre-images with this function. We will let you figure out
why our claims are valid, as well as how to explain and prove them, this time!
(1) PreImf ({1}) = {−1, 1}

(2) PreImf ({y ∈ R | y < 0}) = ∅

(3) PreIm({y ∈ R|y ≥ 0}) = R

(4) PreIm({y ∈ R | 0 < y < 1}) = {x ∈ R | −1 < x < 1 and x ≠ 0}

7.3.4 Proofs about Pre-Images


Notice that the following claim is one of equality. Compare this to Propo-
sition 7.3.6, which has a similar statement about images but it is only a set
containment. Interesting, right?

Proposition 7.3.11. Let A, B be sets. Let f : A → B be a function. Let


X, Y ⊆ B. Then,

PreImf (X ∩ Y ) = PreImf (X) ∩ PreImf (Y )

Notice how the proof below appeals directly to the formal definition of pre-
images. We will jump right in and prove both parts. The exercises will ask you
to investigate this claim with “∪” instead of “∩”.

Proof. Let x ∈ PreImf (X ∩ Y ) be arbitrary and fixed.


By the definition of pre-image, this means f (x) ∈ X ∩Y . Accordingly, f (x) ∈ X
and f (x) ∈ Y .
Since f (x) ∈ X, this means that x ∈ PreImf (X) (by the definition of pre-
image). Similarly, since f (x) ∈ Y , this means that x ∈ PreImf (Y ).
Thus, by the definition of intersection, we can deduce that x ∈ PreIm(X) ∩
PreIm(Y ).
This shows PreIm(X ∩ Y ) ⊆ PreIm(X) ∩ PreIm(Y ).

Next, let y ∈ PreIm(X) ∩ PreIm(Y ) be arbitrary and fixed.


By the definition of pre-image, this means y ∈ PreImf (X) and y ∈ PreImf (Y ).
Since y ∈ PreImf (X), we can deduce that f (y) ∈ X, by the definition of pre-
image. Similarly, since y ∈ PreImf (Y ), we can deduce that f (y) ∈ Y .
By the definition of intersection, this tells us f (y) ∈ X ∩ Y . Then, by the
definition of pre-image, this tells us y ∈ PreIm(X ∩ Y ).
This shows PreIm(X ∩ Y ) ⊇ PreIm(X) ∩ PreIm(Y ).
By a double-containment proof, we have proven the claim.
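
The equality in this proposition can be tested exhaustively on small sets, in the same spirit as a schematic diagram. The sketch below makes its own assumptions: A = {1, 2, 3}, a three-element codomain B with placeholder labels "u", "v", "w", and hypothetical helper names `preimage` and `subsets`. It checks every function f : A → B against every pair of subsets of B.

```python
from itertools import combinations, product


def preimage(f, Y):
    """PreIm_f(Y) for a finite function f stored as a dict."""
    return {a for a in f if f[a] in Y}


def subsets(elements):
    """All subsets of a finite collection, as Python sets."""
    elements = list(elements)
    return [set(c) for r in range(len(elements) + 1)
            for c in combinations(elements, r)]


A = [1, 2, 3]
B = ["u", "v", "w"]  # placeholder labels for the codomain

checked = 0
for outputs in product(B, repeat=len(A)):   # every function f : A -> B
    f = dict(zip(A, outputs))
    for X in subsets(B):
        for Y in subsets(B):
            # Equality, not just containment, in every case.
            assert preimage(f, X & Y) == preimage(f, X) & preimage(f, Y)
            checked += 1

print(checked, "cases checked; equality held in all of them")
```

Contrast this with the analogous search for images, where some cases give only a strict containment.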

You might read through this and think, “How does one come up with a proof
like this?” Well, there isn’t a whole lot of ingenuity behind a result like this. All
we did was appeal directly to definitions. Everything fell into place from there.
If you find yourself stuck while working on a problem, or you’re just unsure of
where to start . . . just write down the relevant definitions. Try to apply them
to the statement you’re trying to prove. See what happens!

A Proof with Pre-Images and Images

Let’s work on one result that involves both of the concepts we have introduced in
this section. We will prove one containment and ask you to disprove the other
one in the exercises.

Proposition 7.3.12. Let A, B be sets. Let f : A → B be a function. Let


Y ⊆ B. Then,

Imf (PreImf (Y )) ⊆ Y

Proof. Let b ∈ Imf (PreImf (Y )) be arbitrary and fixed.
By the definition of image, this means ∃a ∈ PreImf (Y ) . f (a) = b. Let such an
a be given.
Since a ∈ PreImf (Y ), this means f (a) ∈ Y , by the definition of pre-image.
Since b = f (a) and f (a) ∈ Y , this means b ∈ Y .
This proves the claim.
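
To see this containment in action, we can compose the two operations on the function g from Example 7.3.2 and test every subset Y of its codomain. This is only a sketch on one specific function; the helper names `image`, `preimage` and `subsets` are our own.

```python
from itertools import combinations

# The function g from Example 7.3.2, stored as a dict (input -> output).
g = {"a": 2, "b": 3, "c": 3, "d": 1, "e": 6}
B = [1, 2, 3, 4, 5, 6]


def image(f, X):
    """Im_f(X) for a finite function f stored as a dict."""
    return {f[a] for a in X}


def preimage(f, Y):
    """PreIm_f(Y) for a finite function f stored as a dict."""
    return {a for a in f if f[a] in Y}


def subsets(elements):
    """All subsets of a finite collection, as Python sets."""
    elements = list(elements)
    return [set(c) for r in range(len(elements) + 1)
            for c in combinations(elements, r)]


# Check Im_g(PreIm_g(Y)) ⊆ Y for each of the 64 subsets Y of B.
all_contained = all(image(g, preimage(g, Y)) <= Y for Y in subsets(B))
print("containment holds for every Y ⊆ B:", all_contained)
```

Printing `image(g, preimage(g, Y))` next to `Y` for a few choices of Y is a reasonable way to start exploring the reverse containment asked about in the exercises.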

7.3.5 Questions & Exercises


Remind Yourself

Answer the following questions briefly, either out loud or in writing. These
are all based on the section you just read, so if you can’t recall a specific
definition or concept or example, go back and reread that part. Making sure you
can confidently answer these before moving on will help your understanding and
memory!

(1) What are the differences between image and pre-image?

(2) Suppose f : A → B is a function. What is PreImf (B)?

(3) Suppose g : R → R is a function. Why is the expression Img (0) not a proper
statement? What do you think the writer of such an expression meant?

(4) Say f : A → B is a function and Y ⊆ B. What does it mean if PreImf (Y ) =
∅? Is this possible?

(5) Say f : A → B is a function and X ⊆ A. What does it mean if Imf (X) = ∅?
Is this possible?
7.4. PROPERTIES OF FUNCTIONS 497

Try It
Try answering the following short-answer questions. They require you to actu-
ally write something down, or describe something out loud (to a friend/class-
mate, perhaps). The goal is to get you to practice working with new concepts,
definitions, and notation. They are meant to be easy, though; making sure you
can work through them will help you!

(1) Let h : R − {−1} → R be defined by ∀x ∈ R − {−1} . $h(x) = \frac{x}{1+x}$.

Prove that Imh (R − {−1}) = R − {1}.


Then, define P = {y ∈ R | y > 1} and U = {x ∈ R | x < −1}.
Prove that PreImh (P ) = U .

(2) Let f : A → B be a function. Let S, T ⊆ A. For each of the following


claims, prove it must hold, or disprove it by finding a counterexample.

(a) Imf (S ∪ T ) ⊆ Imf (S) ∪ Imf (T )


(b) Imf (S ∪ T ) ⊇ Imf (S) ∪ Imf (T )

(3) Let f : A → B be a function. Let Y, Z ⊆ B. For each of the following


claims, prove it must hold, or disprove it by finding a counterexample.

(a) PreImf (Y ∪ Z) ⊆ PreImf (Y ) ∪ PreImf (Z)


(b) PreImf (Y ∪ Z) ⊇ PreImf (Y ) ∪ PreImf (Z)

(4) Look back at Proposition 7.3.12. Consider the reverse containment:



Imf (PreImf (Y )) ⊇ Y

Disprove the claim that this holds for any function f : A → B and any
Y ⊆ B by constructing a specific counterexample and proving that it works.

7.4 Properties of Functions


7.4.1 Surjective (Onto) Functions
You might be wondering something by now . . . If we can identify the image
of the domain under a given function, why bother with a codomain that’s any
“larger” than that set? Sure, f : R → R defined by f (x) = x² is a fine function,
but changing the codomain to just the nonnegative real numbers doesn’t really
affect anything. It might even make it better, because nothing in the codomain is
“missed” by the function! If you’re thinking this way, then you have anticipated
our next definition, which encapsulates precisely this property of a function:
when the codomain and the image of the domain are the same set.
Definition
Definition 7.4.1. Let A, B be sets and let f : A → B be a function. We say f
is a surjective function if and only if Imf (A) = B.
Equivalently, we just say “f is surjective” (adjectival form), or that “f is a
surjection” (nounal form).
(The word “onto” is a fairly commonly used synonym for this term, so we will
mention it here but won’t use it again. This is just in case you’ve seen this word
somewhere else.)
Referring back to the definition of image, we can state this property equivalently
in the form of a quantified statement:

f is surjective ⇐⇒ ∀b ∈ B. ∃a ∈ A. f (a) = b

That is, f is surjective if and only if every output has at least one corresponding
input.

Think for a minute about why the second form of this definition is really the
same as the first one. The property that Imf (A) = B is a statement about sets.
We already know that, by definition, Imf (A) ⊆ B (nothing in the image can fall
“outside” of the codomain), so this further property means that B ⊆ Imf (A),
as well. This is precisely what the second form of the definition says: every
element of the codomain satisfies the defining property of being an element of
the image.
Also, notice that nothing about the definition says the a we find to corre-
spond to a b must be unique! All this property requires is that, for every b ∈ B,
we can identify at least one a ∈ A that satisfies f (a) = b. There might be more
than one, there might be exactly one. It doesn’t matter, as long as there aren’t
none.
What does the property of being a surjection mean in terms of a schematic
diagram? Since every element of the codomain is “hit” by the function, this
means that every dot on the right-hand side of the schematic has an incoming
arrow. (Remember: this type of heuristic language is fine to keep in mind—
we are using it to help describe these concepts, after all—but this does not
constitute a proof. Any sentence of this sort that you use in a proof should
be accompanied by a more rigorous statement, using mathematical language
and/or logical symbols.) Why would we care about such a property? In general,
it can be difficult to declare exactly what the image of a function is, and we might
(at first) be able to only declare what the codomain is. Proving that, in fact, all
of the codomain elements are outputs of the function can be additional, helpful
information!

Negating the Definition


Typically, we will define a function and then ask: is this a surjection or not?
If we believe a function is a surjection, we should prove that by showing the
codomain and image are the same set. If we believe it is not a surjection, we
should prove that by finding a counterexample. Let’s look at the logical negation
of the statement that defines a surjective function:

¬(∀b ∈ B. ∃a ∈ A. f (a) = b) ⇐⇒ ∃b ∈ B. ∀a ∈ A. f (a) ≠ b.
That is, to prove a function f is not a surjection, we must find an element of
the codomain that is not an element of the image. This involves some scratch
work and intuition to identify such a b. From there, we must somehow show
that no possible a satisfies f (a) = b. We might argue this directly by taking an
arbitrary a ∈ A and explaining why f (a) ≠ b. Alternatively, we might argue
this by contradiction: assuming that there is an a ∈ A such that f (a) = b, we
seek a contradiction. Either of these approaches is reasonable, and they are
logically equivalent.
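
For functions between small finite sets, the negated definition can be checked by brute force. The following is only a sketch under our own assumptions: the names missed_outputs, is_surjective, and square are hypothetical, and the sets A and B are made up for illustration. It collects every b in the codomain for which no a satisfies f(a) = b. A computer search like this is scratch work, not a proof, and it says nothing about infinite sets.

```python
def missed_outputs(f, domain, codomain):
    """Return the codomain elements that are not f(a) for any a in the domain."""
    image = {f(a) for a in domain}      # the image of the domain, by brute force
    return set(codomain) - image        # every b with f(a) != b for all a


def is_surjective(f, domain, codomain):
    """A function on finite sets is surjective exactly when nothing is missed."""
    return len(missed_outputs(f, domain, codomain)) == 0


# Hypothetical example: squaring on a small finite domain.
A = [-2, -1, 0, 1, 2]
B = [0, 1, 2, 3, 4]


def square(a):
    return a * a


print(sorted(missed_outputs(square, A, B)))   # [2, 3]
print(is_surjective(square, A, [0, 1, 4]))    # True
```

Each element returned by missed_outputs plays the role of the b in the negated statement: it is a witness that the function is not a surjection onto that codomain.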

Examples
Let’s see these techniques in action with a few examples. For some of them, we
might be able to use some graphical intuition or try a few test cases to figure out
a guess, but ultimately we need to settle in and prove some logical statements
to validate our claims.
Example 7.4.2. Consider p : N × N → N defined by p(a, b) = ab. Is p surjective?
Yes, it is! It looks like we can just allow a to be 1, so that the function outputs
whatever b is. Let’s make this observation more formal with a proof:

Proof. Let n ∈ N be arbitrary and fixed. Define (a, b) = (1, n).


Notice that (1, n) ∈ N × N and p(1, n) = 1 · n = n.
Since n was arbitrary, this shows p is surjective.

Example 7.4.3. Let C be the set of all cars in the United States. Let S be the
set of all strings of letters and digits that are of length at most 7 (i.e. these are
the potential strings you might see on a car’s license plate).
Let f : C → S be defined by inputting a car and outputting its license plate
string. Is the function f a surjection?
No, definitely not! In case you weren’t aware, curse words are disallowed on
license plates! So certainly, there exist many strings of letters that you will
never see on a license plate in the United States. (We’ll let you provide some
examples on your own . . . )
Because we have exhibited an element of S that is not an element of Imf (C)—or,
at least, you thought of an example—we have shown that f is not a surjection.
Example 7.4.4. Let d : N × N → Z be the function defined by

∀(a, b) ∈ N × N. d(a, b) = a − b

Let’s determine whether d is a surjection and prove our claim. We might start
by trying some “small values” for the input variables a and b. In the table below,
the left column is a and the top row is b, and the entries are d(a, b) = a − b:

a \ b    1    2    3    4    5
  1      0   −1   −2   −3   −4
  2      1    0   −1   −2   −3
  3      2    1    0   −1   −2
  4      3    2    1    0   −1
  5      4    3    2    1    0

It looks like all of the integers z ∈ Z will appear in this table. However, they
don’t all appear in one particular row or column. Rather, it looks like all
the non-negative integers appear in the first column, while all the non-positive
integers appear in the first row. Let’s use these observations to write a proof.
We’ll take an arbitrary integer z ∈ Z and consider two cases; if z ≥ 0, we will
do one thing, and if z < 0, we will do something else. As long as we succeed in
both cases, we will have proven that d is a surjection.
Proof. We claim d is a surjection. Let z ∈ Z be arbitrary and fixed. WWTS
∃(a, b) ∈ N × N. d(a, b) = z. To do this, we consider two cases:
(1) Suppose z ≥ 0. Then define (a, b) = (z + 1, 1).
Since z ≥ 0, we know z + 1 ≥ 1 and so z + 1 ∈ N. This guarantees
(z + 1, 1) ∈ N × N.
Also, notice that d(z + 1, 1) = (z + 1) − 1 = z.
(2) Suppose z < 0. Then define (a, b) = (1, −z + 1).
Since z < 0, we know −z > 0 and so −z + 1 ≥ 2, meaning −z + 1 ∈ N. This
guarantees (1, −z + 1) ∈ N × N.
Also, notice that d(1, −z + 1) = 1 − (−z + 1) = z.
In either case, we are able to define (a, b) ∈ N × N such that d(a, b) = z. Since z ∈ Z was
arbitrary, this proves that d is surjective.
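
The two cases of this proof amount to a recipe for producing an input pair from a desired output. Here is a minimal sketch of that recipe, assuming (as the proof does) that N starts at 1; the name witness is our own. Checking a finite range of integers does not replace the proof, but it is a quick sanity check on the algebra.

```python
def d(a, b):
    """The function d(a, b) = a - b from the example."""
    return a - b


def witness(z):
    """Return a pair (a, b) of natural numbers (each >= 1) with d(a, b) == z."""
    if z >= 0:
        return (z + 1, 1)      # case (1) of the proof
    return (1, -z + 1)         # case (2) of the proof


# Sanity check on a finite range of integers (not a proof).
for z in range(-100, 101):
    a, b = witness(z)
    assert a >= 1 and b >= 1   # the pair really lies in N x N
    assert d(a, b) == z        # and it really maps to z
```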
Example 7.4.5. Let g : R − {−1} → R be the function defined by

∀x ∈ R − {−1}. g(x) = x/(1 + x)
(Notice why we have removed −1 from the domain. This ensures g is a well-
defined function!)
Let’s determine whether g is a surjection and prove our claim. As mentioned
before, we can do some scratch work to figure out our claim: we could try
plugging in some values of x, testing “extreme cases” by letting x get very close
to −1 or letting x grow larger and larger . . . All of this can help us plot a graph
of the function, or we can just use some graphing software:

Regardless, none of this proves anything! What it does do is help us observe
that this function g is not surjective. There seems to be a horizontal asymptote
at y = 1. That is, the function g never “reaches” 1, but rather gets infinitely
close. In terms of our new definition of surjectivity, this is decidedly a NO
answer!
Try to prove this now. How can you show that the element 1 ∈ R is not
an element of the image Img (R − {−1})? Try it! Then read on for our proof.

We will actually present two proofs here, for you to compare and contrast.
They both accomplish the same goal—showing g is not surjective—but one does
so by a contradiction method and the other by a direct method (using cases).
Which do you think is better? Did you come up with one of these? Which is
easier to read? We have no definitive opinion on these questions; they are both
equally valid proofs!

Proof 1 (Direct). Let x ∈ R − {−1} be arbitrary and fixed. WWTS that g(x) ≠ 1.
We consider two cases:
• Suppose x > −1. This means x + 1 > 0, and so 1/(x + 1) > 0. We also know
x + 1 > x (which is true for every x ∈ R).
By multiplying this inequality by the positive term 1/(x + 1), we deduce that
1 > x/(x + 1). Certainly, then, g(x) = x/(x + 1) ≠ 1.
• Suppose x < −1. This means x + 1 < 0, and so 1/(x + 1) < 0. We also know
x + 1 > x.
By multiplying this inequality by the negative term 1/(x + 1) and switching
the sign, we deduce that 1 < x/(x + 1). Certainly, then, g(x) = x/(x + 1) ≠ 1.

In either case g(x) ≠ 1. These cases cover all possibilities because x ∈ R − {−1}
was arbitrary (and we need not consider x = −1). This shows

1 ∉ Img (R − {−1})

so g is not a surjection.

Notice that this first proof proves an interesting qualitative observation about
the graph: that the function lies above the horizontal asymptote to the left of
x = −1 and below the asymptote to the right of x = −1.
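
The two inequalities established in the direct proof (g(x) > 1 when x < −1, and g(x) < 1 when x > −1) can be spot-checked numerically. The sketch below uses exact rational arithmetic from Python's fractions module to avoid rounding issues; the sample points are our own arbitrary choice, and a finite check like this is only scratch work, not a substitute for the proof.

```python
from fractions import Fraction


def g(x):
    """g(x) = x / (1 + x), defined for every x other than -1."""
    return x / (1 + x)


# Sample points on each side of x = -1, in steps of 1/10.
left = [Fraction(n, 10) for n in range(-100, -10)]    # -10 up to -1.1
right = [Fraction(n, 10) for n in range(-9, 101)]     # -0.9 up to 10

assert all(g(x) > 1 for x in left)     # strictly above y = 1 left of x = -1
assert all(g(x) < 1 for x in right)    # strictly below y = 1 right of x = -1
```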

Proof 2 (Contradiction). AFSOC that g is surjective. This means

∀y ∈ R. y ∈ Img (R − {−1})

In particular, then, we know 1 ∈ Img (R − {−1}), so ∃x ∈ R − {−1}. g(x) = 1.
Let such an x be given.
This means g(x) = x/(x + 1) = 1. Multiplying both sides by x + 1, we find x = x + 1.
Subtracting x from both sides, we find 0 = 1, clearly a contradiction.
Therefore, 1 ∉ Img (R − {−1}), so g is not a surjection.

Notice that this second proof does prove that g is not a surjection, but it doesn’t
add any other information about how the function behaves (like the previous
proof did).
Let’s move on from surjections and talk about a closely related property of
functions.

7.4.2 Injective (1-to-1) Functions


When trying to prove a function is surjective, we took an arbitrary element of the
codomain and had to find at least one element of the domain that corresponded
to the original element. Sometimes there is exactly one such element, sometimes
there are many, and sometimes there are none. What we will do now is consider
those functions that fall into the “exactly one” case. We won’t be presuming here
that functions are already surjective. Rather, we are imposing this condition:
we want there to be no more than one input for any given output. There might
be exactly one or there might be none, but there certainly aren’t two or more.
These types of functions are special enough that we give them a name.

Definition
Definition 7.4.6. Let A, B be sets and let f : A → B be a function. We say f
is an injective function if and only if it has the property that

∀a₁, a₂ ∈ A. a₁ ≠ a₂ =⇒ f (a₁) ≠ f (a₂)

Equivalently, we just say “f is injective” (adjectival form), or that “f is an
injection” (nounal form).
(The term “1-to-1”—sometimes written “1-1”—is a fairly commonly used syn-
onym for this word, so we will mention it here but won’t use it again. This is
just in case you’ve seen this term somewhere else.)
In other words, this defining property requires that “distinct inputs yield distinct
outputs”. Also, remembering that the contrapositive of a statement is logically
equivalent, we can express this property as

∀a₁, a₂ ∈ A. f (a₁) = f (a₂) =⇒ a₁ = a₂

This expresses the equivalent notion that “if two outputs are equal, they must
come from the same input”.

Think about how this definition conveys the notion we described above. Say
we have an injective function f : A → B, and let’s say we are given an element
b ∈ B. Does this definition say that there is at most one element x ∈ A such
that f (x) = b? What possibilities does the definition allow?

Motivation

Let’s motivate this by a particular application of functions. Think of a function
as a code-word machine for you to send and receive secret messages with a friend.
Your friend writes down a secret message, puts it in the encoder, and out pops
a scrambled code that he sends to you. Later, you receive this scrambled code.
You would really like to know that this code only came from at most one input
phrase. What if you try to decode it and it comes out with both I HATE YOU
and I LOVE YOU? What are you supposed to think then? Did your friend mean
to send you both messages? What a terrible code system you’ve designed if
both of those conflicting messages are encoded as the same scrambled message!
In this context, it would be much nicer to have an encoding function where
two distinct inputs can’t possibly give the same output. This is precisely the
defining property of being an injection.

Negating the Definition

It might be helpful to think about the property of being an injection in terms
of a schematic diagram, and in terms of the negation of the definition. Let’s
find that negation first:

¬(∀a₁, a₂ ∈ A. a₁ ≠ a₂ =⇒ f (a₁) ≠ f (a₂))
⇐⇒ ∃a₁, a₂ ∈ A. a₁ ≠ a₂ ∧ f (a₁) = f (a₂)


(Remember that the negation of P =⇒ Q is P ∧ ¬Q!)


This says a function is not injective if and only if we can find two distinct domain
elements that output the same codomain element.
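
On a finite domain, the negated definition suggests a direct search: look for two distinct inputs with the same output. The sketch below is one way to do this, under our own assumptions. The name find_collision is hypothetical, and the outputs of f are assumed to be hashable so they can serve as dictionary keys. If a pair is returned, it is exactly the kind of counterexample the negation asks for.

```python
def find_collision(f, domain):
    """Return a pair of distinct inputs with equal outputs, or None if there is none."""
    seen = {}                        # maps each output to the first input producing it
    for a in domain:
        out = f(a)
        if out in seen and seen[out] != a:
            return (seen[out], a)    # two distinct inputs, same output
        seen.setdefault(out, a)
    return None


# Hypothetical examples on small finite domains.
print(find_collision(lambda a: a * a, [-2, -1, 0, 1, 2]))   # (-1, 1)
print(find_collision(lambda a: 2 * a, range(10)))           # None
```

A result of None only says the function is injective on the finite domain that was searched.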
With that in mind, here are canonical examples of an injective and non-injective
function:

[Figure: two schematic diagrams of functions f : A → B, one labeled “injective” and one labeled “NOT injective”.]
The non-injective function has two distinct domain elements that output the
same codomain element, whereas the injective function avoids this situation. It
might feel a little odd to phrase a property in this kind of negative sense—a
function is only injective if it doesn’t have . . . —but this is actually somewhat
common in mathematics. (We will even see this idea later on when we talk
about infinite sets, which are just . . . sets that are not finite!) This negative
formulation is easy enough to remember, and we can always relate it to another,
positive formulation: an injective function has only 0 or 1 inputs corresponding
to any given output.

Examples
Let’s think about how to prove/disprove the injectivity of functions. As you
might guess, the first two versions of the definition given above are useful when
trying to show a function is injective: take two distinct elements of the domain
and show their outputs are different, or take two equal outputs and show they
came from equal inputs. The negation can also be used to show a function is
injective via a proof by contradiction. Also, the third version is useful when
proving a function is not injective: a counterexample amounts to finding two
distinct inputs with the same output.
Let’s see these techniques in action with a few examples. In fact, we will
use some of the same examples we looked at in the previous section about
surjections!
Example 7.4.7. Consider p : N × N → N defined by p(a, b) = ab. Is p injective?
By trying some particular values of (a, b), we can see that p is definitely not
an injection. Pick any number that has two different factorizations, like 12 =
3 · 4 = 2 · 6. By letting (a, b) = (3, 4) and (c, d) = (2, 6), we can easily prove
this claim. But we can do this even more easily, by noting that the order of the
coordinates of an element like (a, b) matters!

Proof. This function is not injective. Let (a, b) = (1, 2) and (c, d) = (2, 1).
Notice that (a, b) ≠ (c, d) because 1 ≠ 2. Also, notice that p(a, b) = 1 · 2 = 2
and p(c, d) = 2 · 1 = 2. Thus, p(a, b) = p(c, d). This shows that p is not
injective.

Example 7.4.8. Let C be the set of all cars in the United States. Let S be the
set of all strings of letters and digits that are of length at most 7 (i.e. these are
the potential strings you might see on a car’s license plate).
Let f : C → S be defined by inputting a car and outputting its license plate
string. Is the function f an injection?
No, we don’t think so! The same license plate string could appear on different
cars that are registered in different states. Now, we don’t have any examples of
this on hand, so this isn’t a totally formal proof, but hopefully you see the idea.
Could we amend the function definition to make it an injection? Sure, we
could try! Consider also defining L to be the set of U.S. states. Let the function
g : C → S × L be defined by inputting a car and outputting the ordered pair of
that car’s license plate string and home state. This will be an injection, because
no two cars in the same state can have the same plate. (Again, this is not really
a formal proof; we are just trying to illustrate the concept of injectivity with a
non-numerical example.)
Example 7.4.9. Let d : N × N → Z be the function defined by d(a, b) = a − b.
Determine whether d is an injection and prove your claim.
It turns out d is not an injection! Notice that a − b = (a + 1) − (b + 1). We
can use this to find a counterexample:
Consider the pairs (2, 1) ∈ N × N and (3, 2) ∈ N × N. Notice that d(2, 1) = 1
and d(3, 2) = 1. Since (2, 1) ≠ (3, 2) and yet d(2, 1) = d(3, 2), we conclude that
d is not an injection.
Example 7.4.10. Let F : P(N) → P(Z) be defined by

∀X ∈ P(N). F (X) = ⋃_{a ∈ X} {a, −a}

Do you see what this function does? (Can you explain why it’s even a well-
defined function?)
Let’s show you a few examples to give you an idea:

F ({1}) = ⋃_{a ∈ {1}} {a, −a} = {−1, 1}

F ({1, 3, 5}) = ⋃_{a ∈ {1,3,5}} {a, −a} = {−1, 1} ∪ {−3, 3} ∪ {−5, 5}
= {−5, −3, −1, 1, 3, 5}

F (∅) = ⋃_{a ∈ ∅} {−a, a} = ∅

F (N) = Z − {0}
We claim that F is an injection. Think about how to prove this before reading
our proof. In particular, think about the different strategies we might employ
here, based on the formal definition of injectivity. Might one strategy be more
fruitful than another?

Proof. WWTS F is an injection. Let X, Y ∈ P(N).

Suppose that X ≠ Y . WWTS F (X) ≠ F (Y ).
Since X ≠ Y , we have two cases: either X ⊈ Y or Y ⊈ X (or both).
Suppose X ⊈ Y . This means ∃n ∈ X. n ∉ Y . Let such an n be given.
Since n ∈ {−n, n} and n ∈ X, we see that n ∈ F (X), by the definition of F .
However, since n ∉ Y , we see that ∀a ∈ Y. n ∉ {−a, a}. This follows because
n ∉ Y (so ∀a ∈ Y. n ≠ a), as well as the fact that n ∈ N and Y ⊆ N, so
∀a ∈ Y. n ≠ −a (because −a < 0 < n).
Accordingly, n ∉ F (Y ). This shows that F (X) ≠ F (Y ).
In the other case, where Y ⊈ X, we can follow the exact same argument with
the roles reversed (i.e. switching X and Y in every step). This shows that
F (Y ) ≠ F (X).
Together, we have shown that ∀X, Y ∈ P(N). X ≠ Y =⇒ F (X) ≠ F (Y ).
This shows F is an injection.

Think about how this proof might go if we used a different technique. Say
we started by assuming X, Y ∈ P(N) and that F (X) = F (Y ). Can we deduce
that X = Y ?
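
To build some intuition for this function, we can experiment with it on a small piece of P(N). The sketch below is an illustration under our own assumptions: it only looks at subsets of {1, ..., 6}, the helper subsets is hypothetical, and frozenset is used so that sets can themselves be collected in a set. It confirms that no two of these 64 subsets share an output, and it also tests one candidate for "undoing" F, namely keeping only the positive elements of F(X). This is experimentation, not a proof about all of P(N).

```python
from itertools import combinations


def F(X):
    """F(X) is the union of {a, -a} over all a in X."""
    return frozenset(b for a in X for b in (a, -a))


def subsets(universe):
    """Yield every subset of a finite collection, as a frozenset."""
    items = sorted(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


all_subsets = list(subsets(range(1, 7)))      # the 64 subsets of {1, ..., 6}
images = {F(X) for X in all_subsets}

# Distinct inputs gave distinct outputs on this finite sample.
assert len(images) == len(all_subsets)

# A candidate way to recover X from F(X): keep the positive elements.
assert all(frozenset(b for b in F(X) if b > 0) == X for X in all_subsets)
```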

7.4.3 Proof Techniques for Jections


Let’s summarize the concepts of this section so far by presenting some proof
templates. These can be used when you are trying to prove/disprove that a
function is injective/surjective. We like using the shorthand “Jections” to refer
to these two function properties together.

Prove that f is surjective


1. Let b ∈ B be arbitrary and fixed.

2. Define a = ______.

3. Show that a ∈ A.

4. Show that f (a) = b.

5. This shows that b ∈ Imf (A). Since b was arbitrary, Imf (A) = B, so f is surjective.

Prove that f is not surjective


1. Define b = ______.

2. Show that b ∈ B.

3. Let a ∈ A be arbitrary and fixed.



4. Show that f (a) ≠ b.


(Alternatively, suppose f (a) = b and find a contradiction.)

5. This shows that ∃b ∈ B. b ∉ Imf (A), so f is not surjective.

Prove that f is injective


1. Let x, y ∈ A be arbitrary and fixed.

2. Suppose that f (x) = f (y).

3. Deduce that x = y.

Alternatively:

1. Let x, y ∈ A be arbitrary and fixed.

2. Suppose that x ≠ y.

3. Deduce that f (x) ≠ f (y).

Prove that f is not injective


1. Define x = ______ and define y = ______.

2. Show that x ∈ A and y ∈ A.

3. Show that x ≠ y.

4. Show that f (x) = f (y).

Prove that f is bijective


1. Prove that f is injective.

2. Prove that f is surjective.

7.4.4 Bijections
You might have guessed what we have been building towards here. Think about
the two main properties of functions we just studied: surjectivity and injectivity.
What happens when a function has both of these properties? What if a function
has the property that, for every element of the codomain, there is at least one
corresponding element in the domain (surjectivity) and there is also at most one
such element (injectivity)? That’s right: for every output, there is exactly one
input! This is an incredibly nice property to have, and will be the foundation
for our forthcoming discussion of cardinality (i.e. the size of a set). Let’s make
a definition and then discuss some examples.

Definition
Definition 7.4.11. Let A, B be sets and let f : A → B be a function. We say
f is a bijective function if and only if f is both injective and surjective.
Equivalently, we just say “f is bijective” (adjectival form), or that “f is a bi-
jection” (nounal form).
We will sometimes say that f is a bijection between the sets A and B, instead
of saying “from A to B”. (The reason for this will become clear in the next
section!)
Notice that this definition is, logically speaking, an AND statement. For the
moment, anyway, the only technique we have to prove a function is bijective
is to just prove it is surjective and prove it is injective. Similarly, to prove
a function is not bijective, we need to prove it is either not surjective or not
injective. (It might be that both properties fail, but one such proof is sufficient
to show a function is not bijective.) Rather than go over these same techniques
(which are nicely summarized right before this section), we will just point out
whether some of the examples we have seen thus far are bijections or not.
Example 7.4.12.
(a) Let p : N × N → N be the function defined by p(a, b) = ab.
We proved that p is surjective but not injective, so it is not a bijection.
(b) Let d : N × N → Z be the function defined by d(a, b) = a − b.
We proved that d is surjective but not injective, so it is not a bijection.
(c) Let g : R − {−1} → R be the function defined by

∀x ∈ R − {−1}. g(x) = x/(1 + x)
We proved that g is not surjective. (Specifically, we showed
1 ∉ Img (R − {−1}).) We will ask you in this section’s exercises to prove that g is an
injection, though. Together this means g is not a bijection.
However, consider defining h : R − {−1} → R − {1} by the same “rule” as
g, i.e. ∀x ∈ R − {−1}. h(x) = x/(1 + x).
We asked you to prove in the exercises of Section 7.3.5 that this function
satisfies Imh (R − {−1}) = R − {1}. This shows that h is a surjection.
Furthermore, we will ask you to prove in this section’s exercises that a
function defined in this way—by taking an injection, using the same “rule”,
and redefining the codomain to be the image—produces a bijection.
Together, all of this proves that h is a bijection from R − {−1} to R − {1}.
Example 7.4.13. Let’s look at one new example, specifically chosen to preview
some of the main ideas coming up ahead. Define E ⊆ N to be the set of all even
natural numbers; that is,

E = {e ∈ N | ∃k ∈ N. e = 2k}

Define the function d : N → E by d(n) = 2n. We claim d is a bijection.

Proof. First, let’s prove d is a surjection. Let e ∈ E be given.


By the definition of E, ∃k ∈ N such that e = 2k. Let such a k be given.
This tells us d(k) = 2k = e. Since e was arbitrary, we conclude that d is a
surjection.

Second, let’s prove d is an injection. Let m, n ∈ N and assume d(m) = d(n).


This means 2m = 2n. Canceling the 2s from both sides, we find that m = n.
Thus, d is an injection.
Together, this proves that d is a bijection.
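
A computer cannot check all of N, but we can watch this pairing at work on a finite window. The sketch below is an illustration under our own assumptions: the cutoff 50 is arbitrary, and N is taken to start at 1. It restricts d to {1, ..., 50} and compares its outputs to the even natural numbers up to 100.

```python
def d(n):
    """The doubling function d(n) = 2n from the example."""
    return 2 * n


CUTOFF = 50                                    # arbitrary finite window
window = range(1, CUTOFF + 1)                  # {1, ..., 50}
evens = {e for e in range(1, 2 * CUTOFF + 1) if e % 2 == 0}

outputs = [d(n) for n in window]

# No output is repeated: d is injective on the window.
assert len(set(outputs)) == len(outputs)

# Every even number up to 100 is hit: d maps the window onto those evens.
assert set(outputs) == evens
```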

We’ll motivate some future considerations by posing some questions: Does
it seem a little strange to you that there is a bijection between N and E, a set
that is a proper subset of N? Is it always possible to find a bijection between a
set and a subset of itself? Have we seen other examples of this situation before?

Motivation
The main idea behind a bijection f : A → B is that we can pair up the
elements of A and B and identify them with each other, one by one. This idea
follows from the definitions of both surjectivity and injectivity: every output
has exactly one corresponding input. Furthermore, think more carefully about
what we show in the proofs of such properties. In proving f is surjective, we
show we can “move” from the codomain back to the domain in at least one way,
and then in proving f is injective, we show that this is the only way to do it.
In a sense, we are showing how to “undo” the function f and reverse its action.
In fact, we are implicitly defining a new function from B back to A. Have you
previously talked about the inverse of a function? That is precisely what we
are rediscovering now! To make this notion of “moving back from the codomain
to the domain” rigorous enough, we need to have a brief discussion about how
to “combine” functions appropriately. Right after that, we will be able to give
a precise definition of what we mean by the inverse of a function, and relate
this to bijections. All of this happens in the next section.

7.4.5 Questions & Exercises


Remind Yourself
Answer the following questions briefly, either out loud or in writing. These
are all based on the section you just read, so if you can’t recall a specific
definition or concept or example, go back and reread that part. Making sure you
can confidently answer these before moving on will help your understanding and
memory!
(1) Write down a definition of surjective in terms of an image. Then, write
down a definition of surjective in terms of quantifiers.
(2) Describe two different ways of proving that a function is injective.
(3) Can a function be both injective and surjective? If so, give an example.
(4) Can a function be neither injective nor surjective? If so, give an example.
(5) Consider the following schematic diagrams. For each one, declare whether
or not it is a function; and, if it is, declare whether or not it is (a) an
injection and (b) a surjection.

Try It
Try answering the following short-answer questions. They require you to actu-
ally write something down, or describe something out loud (to a friend/class-
mate, perhaps). The goal is to get you to practice working with new concepts,
definitions, and notation. They are meant to be easy, though; making sure you
can work through them will help you!
(1) Suppose f : R → R is an increasing function; that is, suppose
∀x, y ∈ R. x < y =⇒ f (x) < f (y)
Prove that f must be injective.
Then, prove that f need not be surjective by defining an increasing function
that is not surjective.
(2) Let g : R − {−1} → R be the function defined by

∀x ∈ R − {−1}. g(x) = x/(1 + x)
Is g injective or not? Prove your claim.
(3) Give an example of a function f : P(N) → N that is surjective. Prove that
it is.
(Hint: Be careful about the fact that ∅ ∈ P(N). Also, consider looking at
Section 5.5.2 for some inspiration . . . )
(4) Give an example of a function F : N → P(N) that is injective. Prove that
it is.
Then, prove that your function F is not surjective.
(Note: Yes, we are asking you to prove your function is not surjective without
knowing what function you defined. We know we are right! You will learn
about our trick later in this chapter . . . )
(5) Suppose f : A → B and g : B → C are surjective functions. Prove that
g ◦ f : A → C is also surjective.

(6) Let f : A → B be an injective function. Define g : A → Imf (A) by setting
∀x ∈ A. g(x) = f (x). Prove that g is a bijection.

(7) Define F : R × R → R × R by F (x, y) = (x + y, 2x − y). Prove F is bijective.
(Hint: In your scratch work, you should try to solve a system of two
equations. See Section 1.3.2 for some suggestions about how to do that.)

(8) Let A, B be sets. Let g : A → B be an injection.
Let X ⊆ A. Let h : X → B be the function defined by ∀x ∈ X. h(x) = g(x).
(That is, h is defined by the same “rule” as g, but on a “restricted domain”.)
Prove that h is also an injection.

7.5 Compositions and Inverses


7.5.1 Composition of Functions
Motivation

Let’s think about the schematic interpretation of functions for a moment. Imag-
ine that we have a function f : A → B and we also have a function g : B → C,
defined like this:

[Figure: schematic diagrams of f : A → B and g : B → C.]

In a heuristic sense, f is like a “map” that gives us a particular route from
elements of A to elements of B, while g is like a “map” from elements of B to
elements of C. What would happen if we were to simply follow the “maps” one
after the other? That is, let’s combine the two by overlaying them,
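[Figure: the two schematic diagrams overlaid, showing f : A → B followed by g : B → C.]

and then simply travel from A all the way to C, cutting out the middle man:

[Figure: schematic diagram of the composition g ◦ f : A → C.]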
This seems like a reasonable thing to do, right? Yes, of course it is! Whenever
we have mathematical objects at our disposal, we’re always curious about how
we can reasonably combine them and manipulate them and generalize them. In
the case of functions, we call this combination a composition of functions. You
might notice that such a composition really only makes sense if the codomain
of the “first” function and the domain of the “second” function are the same.
This is incorporated in the following definition.

Definition
Definition 7.5.1. Let A, B, C be sets, and let f : A → B and g : B → C be
functions. Consider the function h : A → C defined by

∀a ∈ A. h(a) = g(f (a))

We say that h is the composition of g with f and we write h = g ◦ f .


We also shorten this terminology and say h is “g composed with f ”.

This incorporates all the ideas we mentioned above. It requires the
codomain of f (the “first” function applied) to be the domain of g (the “second”
function applied).
Another intuitive idea is to think of a function as a machine or a black box.
Elements of the domain go in and elements of the codomain come out. We don’t
necessarily know what the machine does; we only see what comes out. Now,
think of hooking up two machines, one for f and one for g; take the output of
f ’s machine and plug it into g’s machine. What comes out is an element of C.
We can take the work of these two machines and think of it as the work of one
bigger machine. This is what the composition g ◦ f does; it’s one larger machine
that takes the operations of two machines and does them in a specified order.
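
The "machine" picture translates naturally into code, where a function can take other functions as input and return a new one. The following is a minimal sketch; the name compose and the two sample functions are our own choices for illustration. It also shows that the order of composition matters.

```python
def compose(g, f):
    """Return the function g ∘ f, which sends a to g(f(a))."""
    def g_after_f(a):
        return g(f(a))      # apply f first, then feed its output to g
    return g_after_f


# Hypothetical sample machines.
def add_three(n):
    return n + 3


def square(n):
    return n * n


square_after_add = compose(square, add_three)    # n -> (n + 3)^2
add_after_square = compose(add_three, square)    # n -> n^2 + 3

print(square_after_add(2))    # 25
print(add_after_square(2))    # 7
```

Note that Python itself does not check that the codomain of the first function matches the domain of the second; that requirement from the definition is something we have to keep track of ourselves.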

Notation
Notice the ordering of the notation g ◦ f and how it compares to the order in
which we apply the functions: f comes first, and then g, i.e. g(f (a)). In words,
we would read “g(f (a))” as “g of f of a”. In fact, if you find yourself having
trouble remembering this order, here’s a recommendation: read the “◦” out
loud as “after”. Thus, h = g ◦ f would mean “g after f ”, because we take an
element a of A, apply f first, and then apply g.
It is also important to remember the notation of composed functions and to
distinguish the function g ◦ f itself from an application of the function g ◦ f to
some element x ∈ A. For instance, to write “g of f of x” using the “◦” notation,
we would write
(g ◦ f )(x)

because we are “hitting” the element x with the function g ◦ f . However, the
following notation makes no sense because it mixes up the ideas of functions
and elements:
g ◦ f (x)

Do you see the difference? The object f (x) is an element of B, the codomain of f .
But g is a function. What does it mean to compose a function with an element of
a set? This doesn’t work. Be careful with this, in general! This distinction will
be especially important when we have to compose several functions together,
like (h ◦ (g ◦ k) ◦ f )(z), where z is an element of f ’s domain, and f, g, h, k are
functions.

Examples
Example 7.5.2. Let C : R → R be defined by

∀x ∈ R. C(x) = x − 273.15

Let F : R → R be defined by

∀x ∈ R. F (x) = (9/5)x + 32

The function C converts a temperature from degrees Kelvin to degrees Celsius.
The function F converts from degrees Celsius to degrees Fahrenheit.
Then the function F ◦ C converts from degrees Kelvin to degrees Fahrenheit
directly. We can compose the “rules” for the functions and find a formula for
this direct conversion:
∀x ∈ R. (F ◦ C)(x) = F (C(x)) = F (x − 273.15)
= (9/5) · (x − 273.15) + 32 = (9/5)x − 459.67
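
The simplification here relies on the arithmetic (9/5) · 273.15 = 491.67 and 491.67 − 32 = 459.67. As a quick sanity check, the sketch below compares the composed conversion against the direct formula at a few sample temperatures. The sample values and the names F_after_C and direct are our own, and the comparison uses a small tolerance because floating-point arithmetic is inexact.

```python
import math


def C(x):
    """Kelvin to Celsius."""
    return x - 273.15


def F(x):
    """Celsius to Fahrenheit."""
    return 9 / 5 * x + 32


def F_after_C(x):
    """The composition F ∘ C: Kelvin to Fahrenheit in two steps."""
    return F(C(x))


def direct(x):
    """The simplified one-step formula for F ∘ C."""
    return 9 / 5 * x - 459.67


# Compare the two at some sample Kelvin temperatures.
for kelvin in [0.0, 273.15, 300.0, 373.15]:
    assert math.isclose(F_after_C(kelvin), direct(kelvin), abs_tol=1e-9)
```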
Example 7.5.3. Let f : R → Z be the function defined by
∀x ∈ R. f (x) = ⌊x⌋
(Recall that ⌊x⌋ is the floor of x: it is the largest integer z ∈ Z that satisfies
z ≤ x.) Let g : Z → N be the function defined by

∀z ∈ Z. g(z) = ⎧ −z       if z < 0
               ⎩ z + 1    if z ≥ 0

Let’s find g ◦ f . Notice that whenever x ∈ R satisfies x < 0, we will have
⌊x⌋ < 0, as well. Similarly, whenever x ∈ R satisfies x ≥ 0, we will have
⌊x⌋ ≥ 0. This tells us that the composition g ◦ f will also be a piece-wise
function:

∀x ∈ R. (g ◦ f )(x) = ⎧ −⌊x⌋       if x < 0
                      ⎩ ⌊x⌋ + 1    if x ≥ 0
Questions: Is this function injective? Surjective? Try to prove your claims!
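If you would like to experiment before answering, the following Python sketch may help. The names g_after_f and piecewise are our own labels. It computes g(f(x)) directly and compares it against the piece-wise rule on some sample inputs. It does not settle the questions above; it only lets you look at outputs.

```python
import math

def f(x):
    # f : R -> Z, the floor function
    return math.floor(x)

def g(z):
    # g : Z -> N, defined piece-wise
    return -z if z < 0 else z + 1

def g_after_f(x):
    # (g ∘ f)(x) computed straight from the definition of composition
    return g(f(x))

def piecewise(x):
    # The piece-wise rule for g ∘ f found above
    return -math.floor(x) if x < 0 else math.floor(x) + 1

samples = [-3.2, -1.0, -0.5, 0.0, 0.9, 1.0, 2.7]
assert all(g_after_f(x) == piecewise(x) for x in samples)
```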
Example 7.5.4. Define f : N → N and g : N → N and h : N → N by
∀n ∈ N. f(n) = n + 3
∀n ∈ N. g(n) = n²
∀n ∈ N. h(n) = 2n − 1

(Question: Are you sure these are well-defined functions? Why?)


We can find “rules” for the compositions g ◦ f and h ◦ g:
∀n ∈ N. (g ◦ f)(n) = g(f(n)) = g(n + 3) = (n + 3)² = n² + 6n + 9
∀n ∈ N. (h ◦ g)(n) = h(g(n)) = h(n²) = 2n² − 1
We can then use these to find a rule for a further composition, like h ◦ (g ◦ f ):
∀n ∈ N. [h ◦ (g ◦ f)](n) = h((g ◦ f)(n)) = h(n² + 6n + 9)
= 2(n² + 6n + 9) − 1 = 2n² + 12n + 17


Likewise, we can use these to find a rule for (h ◦ g) ◦ f :
∀n ∈ N. [(h ◦ g) ◦ f](n) = (h ◦ g)(f(n)) = (h ◦ g)(n + 3)
= 2(n + 3)² − 1 = 2(n² + 6n + 9) − 1
= 2n² + 12n + 17

Look at that, they’re the same rule! That is, we just proved that

(h ◦ g) ◦ f = h ◦ (g ◦ f )

in the sense of functions by showing that they yield the same output on every
allowable input.
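The computation in this example can also be spot-checked numerically. Below is a small Python sketch, where compose is a hypothetical helper of our own. It forms both groupings and compares them with the rule 2n² + 12n + 17 for the first hundred natural numbers. Checking finitely many inputs is only evidence; the algebra above is the actual proof.

```python
def compose(outer, inner):
    """Return outer ∘ inner, i.e. n -> outer(inner(n)). (Hypothetical helper.)"""
    return lambda n: outer(inner(n))

def f(n):
    return n + 3

def g(n):
    return n ** 2

def h(n):
    return 2 * n - 1

left = compose(compose(h, g), f)    # (h ∘ g) ∘ f
right = compose(h, compose(g, f))   # h ∘ (g ∘ f)

# Both groupings should agree with each other and with the derived rule.
for n in range(1, 101):
    assert left(n) == right(n) == 2 * n**2 + 12 * n + 17
```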

Composition is Associative
There was nothing particularly special about the functions f, g, h used in the
previous example. The result we obtained is actually true in general. The
following theorem and its proof will show this. We are proving that function
composition is associative. This means that whenever we have a string of
compositions, we can move the parentheses around at will; we know that the
order in which we apply the parentheses doesn’t matter.
Theorem 7.5.5. Let A, B, C, D be any sets. Let f : A → B and g : B → C
and h : C → D be functions. Then,

h ◦ (g ◦ f ) = (h ◦ g) ◦ f

Proof. WWTS that the outputs of the two functions h ◦ (g ◦ f ) and (h ◦ g) ◦ f
are the same, for every possible input.
Let x ∈ A be given. Applying the definition of composition, we see that

[h ◦ (g ◦ f)](x) = h((g ◦ f)(x)) = h(g(f(x)))

and

[(h ◦ g) ◦ f](x) = (h ◦ g)(f(x)) = h(g(f(x)))

Both expressions equal h(g(f(x))). Since x ∈ A was arbitrary, the two functions
agree on every input, so they are equal.

Compositions and Jections


Here’s something interesting to ponder now: What happens if we take the com-
position of two functions with a shared property? Does that property “carry
over”, as well? For instance, if we compose two injections, do we get another
injection? Does only one of the composed functions need to be an injection to
guarantee the composition is an injection?
Similarly, let’s say we have a composition of two functions. If we know the
composition is a surjection, can we necessarily deduce that one of the functions
we composed is also a surjection? Do they both need to be?
We will state and prove some claims about questions like these in this short
section. We will let you prove some related facts (or find appropriate
counterexamples, as the case may be) in the exercises, both for this section and
at the end of the chapter.
Proposition 7.5.6. Let A, B, C be sets and let f : A → B and g : B → C be
functions. If g ◦ f is injective, then f is necessarily injective.
(Notice that this doesn’t assume any properties of g; it doesn’t even have to
be injective, necessarily! As an exercise, try to find an example of functions
f : A → B and g : B → C such that g ◦ f is injective and g is injective, and also
an example where g ◦ f is injective but g is not injective.)
Proof. Let x, y ∈ A be given. Suppose f (x) = f (y). WWTS x = y.
Since g is a well-defined function, g(f (x)) = g(f (y)).
This means (g ◦ f )(x) = (g ◦ f )(y).
Since g ◦ f is injective, x = y. This was our goal, so the claim is proven.
It turns out that the converse of the claim we just proved is False. Since
that claim is one about all functions, disproving it requires us to produce a
counterexample.
Proposition 7.5.7. Let A, B, C be sets and let f : A → B and g : B → C be
functions. Suppose f is injective. Then it is not necessarily the case that g ◦ f
is injective.
Try doing some scratch work on your own to come up with a counterexample
before reading about ours. Remember that you don’t need to find the most
interesting or complicated one, nor do you necessarily need one defined by a
rule; you just need to be able to define one!
Proof. We will exhibit a counterexample.
Define A = {1, 2} and B = {♥, ♦} and C = {⋆}.
Define f by setting f (1) = ♥ and f (2) = ♦.
Notice f is injective because f (1) ≠ f (2).
Define g by setting g(♥) = g(♦) = ⋆.
Notice g ◦ f is defined by (g ◦ f )(1) = ⋆ and (g ◦ f )(2) = ⋆.
This shows g ◦ f is not injective, because (g ◦ f )(1) = (g ◦ f )(2) but 1 ≠ 2.
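Small counterexamples like this one are easy to test by machine. Here is a minimal Python sketch that stores each finite function as a dictionary. The strings "heart", "diamond" and "star" stand in for the symbols in the proof, and the helper names is_injective and compose are our own assumptions.

```python
def is_injective(func):
    """A finite function (dict) is injective when no output value repeats."""
    images = list(func.values())
    return len(images) == len(set(images))

def compose(g, f):
    """Return g ∘ f as a dict; assumes every output of f is a key of g."""
    return {x: g[f[x]] for x in f}

f = {1: "heart", 2: "diamond"}              # f : A -> B
g = {"heart": "star", "diamond": "star"}    # g : B -> C
g_after_f = compose(g, f)                   # g ∘ f : A -> C

assert is_injective(f)               # f is injective ...
assert not is_injective(g_after_f)   # ... but g ∘ f is not
```

Playing with dictionaries like these is one way to do the "scratch work" suggested above before writing a counterexample down formally.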

7.5.2 Inverses
Motivation
As we said before, a bijection f : A → B has a very nice property, in that f
“pairs off” the elements of the two sets, A and B. Given an element a ∈ A,
there is exactly one element b ∈ B that satisfies f (a) = b. This is because f is
a well-defined function. But we also know that a is the only domain element
associated with b in this way. This is because f is a bijection. Because of this
unique association in both directions, we can think of “reversing” the action of
f . Given an element b ∈ B, identify the a that would produce that b. This
is what an inverse function does. Here, we will define it in terms of function
composition and identity functions. This is also the reason we say a bijection
is between two sets as opposed to just from one set to the other; as soon as we
have it one way, we know we can have it the other way, too!
Before we see the definition, let’s quickly recall the definition of the identity
function that we saw before. It plays an important role in the forthcoming
definition of inverse.
Definition: Given a set X, the identity function IdX : X → X
is defined by ∀z ∈ X. IdX(z) = z.

Definition
Notice that the following definition doesn’t say anything about the functions
being bijections. This is purely a formal definition of what an inverse function
means. Afterwards, we will have to prove any claims about how inverses and
bijections are related.
Definition 7.5.8. Let f : A → B be a function. Suppose there is a function
g : B → A such that f ◦ g : B → B satisfies f ◦ g = IdB and g ◦ f : A → A
satisfies g ◦ f = IdA .
Then we say g is the inverse of f and write g = f⁻¹.
(Notice that some conditions are implicitly stated by the assumptions and con-
clusions in the definition above. Specifically, it must be that B = Imf (A), to
make sure g is a function. Likewise, A = Img (B).)

Example
Let’s look back at a function we saw before when we discussed bijections. With
your help in the exercises, we learned that this function is a bijection. Here, we
will find its inverse.
Example 7.5.9. Let h : R − {−1} → R − {1} be defined by

∀x ∈ R − {−1}. h(x) = x/(1 + x)
To find a candidate function that will be the inverse of h, it usually helps to set
the “rule” for h equal to some new variable, and then solve for x.
Here, let’s say h(x) = y. How can we “reverse” this process and identify what
x is, in terms of y? Observe that we can make some algebraic steps, as follows:
h(x) = y ⇐⇒ x/(1 + x) = y
⇐⇒ (1 + x)y = x
⇐⇒ xy + y = x
⇐⇒ y = x(1 − y)
⇐⇒ x = y/(1 − y)
This scratch work has given us a candidate for the inverse of h. We haven’t
proven anything with these observations! What we have to do now is make a
claim and then demonstrate, for the reader, all of the essential facts. Notice that,
in the proof below, we take care to define a new function H and use it to prove
that H = h⁻¹. It would be presumptuous and erroneous to define h⁻¹ and then work
with it. We are trying to show h has an inverse, so we can’t just declare it has
one at the beginning of our proof!

Proof. Define S = R − {−1} and T = R − {1} for convenient shorthand, so
h : S → T.
Let H : T → S be the function defined by ∀y ∈ T. H(y) = y/(1 − y).

First, let’s show that H is a well-defined function. For every y ∈ T , we know
y ≠ 1, so 1 − y ≠ 0. Thus, the fraction y/(1 − y) is a well-defined real number.
Furthermore, we can argue that y/(1 − y) ≠ −1. AFSOC that y/(1 − y) = −1. Then
multiplying through by 1 − y tells us y = y − 1, a clear contradiction.
Second, let’s show that H ◦ h = IdS . Let x ∈ S be given. Observe that

(H ◦ h)(x) = H(h(x)) = H(x/(1 + x))
= [x/(1 + x)] / [1 − x/(1 + x)] · (1 + x)/(1 + x) = x/((1 + x) − x)
= x/1 = x
Third, let’s show that h ◦ H = IdT . Let y ∈ T be given. Observe that

(h ◦ H)(y) = h(H(y)) = h(y/(1 − y))
= [y/(1 − y)] / [1 + y/(1 − y)] · (1 − y)/(1 − y) = y/((1 − y) + y)
= y/1 = y

Therefore, by the definition of inverse, H = h⁻¹.
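To see both identities in action, here is a short Python sketch using exact rational arithmetic from the standard fractions module. The sample lists are our own choice. It checks H(h(x)) = x and h(H(y)) = y on a handful of rational inputs; this illustrates the proof but does not replace it.

```python
from fractions import Fraction

def h(x):
    # h : R − {−1} -> R − {1}; the caller must avoid x = −1
    return x / (1 + x)

def H(y):
    # Candidate inverse H : R − {1} -> R − {−1}; the caller must avoid y = 1
    return y / (1 - y)

# A few rational sample points, excluding the forbidden input of each function.
candidates = [Fraction(n, d) for n in range(-6, 7) for d in (1, 2, 3)]
samples_S = [x for x in candidates if x != -1]
samples_T = [y for y in candidates if y != 1]

assert all(H(h(x)) == x for x in samples_S)   # H ∘ h acts as the identity on S
assert all(h(H(y)) == y for y in samples_T)   # h ∘ H acts as the identity on T
```

Exact fractions are used instead of floating-point numbers so that the equality tests are not disturbed by rounding.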

Checking Both Directions


Let’s say f : A → B is a function, and you have made a claim about f having
an inverse by defining a function g : B → A. It is extremely important that
you show both compositions yield the identity function; that is, you must show
both
f ◦ g = IdB and g ◦ f = IdA
You might occasionally forget to do so, or you just might not see why this is
necessary. To help you understand this importance, we have included Exercise
2 in Section 7.5.4 below. It asks you to find an example where “one way”
yields the identity function but the “other way” does not, so that the proposed
function is actually not an inverse. Try to find several examples, if you can.
The more striking you make this point, the better!

7.5.3 Bijective ⇐⇒ Invertible


As we have been hinting at all along, a bijective function has an inverse. This
claim’s converse holds, as well, so we can state and prove this if and only if
statement. The word in the section heading here—invertible—is often used to
mean “has an inverse”.

Theorem 7.5.10. Let A, B be any sets. Let f : A → B be a function. Then,

f is bijective ⇐⇒ f has an inverse f⁻¹ : B → A

Proof. ( =⇒ ) Assume f is bijective. This means f is surjective and injective.


We need to define an inverse function for f . Let’s define g : B → A as follows:
Let b ∈ B be given. Since f is surjective, we know ∃a ∈ A. f(a) = b. Let such
an a be given. Since f is injective, we know that

∀x ∈ A. x ≠ a =⇒ f(x) ≠ f(a) = b

That is, we know this a is the unique element of A that satisfies f (a) = b. Let’s
define g(b) = a. This is a well-defined function because of these observations.
Next, observe that (f ◦ g)(b) = f (g(b)) = f (a) = b, so f ◦ g = IdB .
Also, observe that (g ◦ f )(a) = g(f (a)) = g(b) = a, so g ◦ f = IdA .
Therefore, g = f⁻¹, so f has an inverse.

(⇐=) Assume f has an inverse function, f⁻¹ : B → A.

First, let’s show f is injective. Let a1 , a2 ∈ A be given. Observe that

f(a1) = f(a2) =⇒ f⁻¹(f(a1)) = f⁻¹(f(a2))        (f⁻¹ : B → A is a function)
=⇒ (f⁻¹ ◦ f)(a1) = (f⁻¹ ◦ f)(a2)        (definition of composition)
=⇒ IdA(a1) = IdA(a2)        (definition of inverse)
=⇒ a1 = a2        (definition of identity)

Thus, f is injective.
Second, let’s show f is surjective. Let b ∈ B be given. Since f⁻¹ is a function, we
know ∃a ∈ A. f⁻¹(b) = a. Let such an a be given. Then observe that

f⁻¹(b) = a =⇒ f(f⁻¹(b)) = f(a)        (f : A → B is a function)
=⇒ (f ◦ f⁻¹)(b) = f(a)        (definition of composition)
=⇒ IdB(b) = f(a)        (definition of inverse)
=⇒ b = f(a)        (definition of identity)

Thus, f is surjective. Since f is both injective and surjective, f is bijective.

Proving a Function is Bijective


This helpful theorem now provides us with another technique for proving that
a given function f : A → B is a bijection. Rather than proving f is an injection
and a surjection, we can just define a new function g : B → A and prove that it
is the inverse of f , i.e. g = f⁻¹. Then, this theorem applies and tells us that
f is a bijection! Depending on the context, one or the other of these strategies
might be easier to apply, or you might just be more comfortable with one of
them. Keep in mind that both strategies are viable, though!
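For functions between small finite sets, this second strategy can be carried out mechanically. The Python sketch below stores a function as a dictionary and tries to build its inverse; the name inverse_of is our own invention. It assumes every value of f lies in the given codomain, and it returns None exactly when no inverse exists.

```python
def inverse_of(f, codomain):
    """Try to build g with g ∘ f = Id and f ∘ g = Id for a finite function f.

    f is a dict from domain elements to codomain elements.
    Returns the inverse as a dict, or None if f is not a bijection.
    """
    g = {}
    for a, b in f.items():
        if b in g:
            return None          # two inputs share the output b: not injective
        g[b] = a
    if set(g) != set(codomain):
        return None              # some codomain element is never hit: not surjective
    return g

f = {1: "a", 2: "b", 3: "c"}
g = inverse_of(f, {"a", "b", "c"})

assert all(g[f[x]] == x for x in f)   # g ∘ f is the identity on the domain
assert all(f[g[y]] == y for y in g)   # f ∘ g is the identity on the codomain

not_injective = {1: "a", 2: "a"}
assert inverse_of(not_injective, {"a", "b"}) is None
```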

Inverse of an Inverse
The following corollary follows immediately from the theorem above. We call
it a corollary and not its own theorem because it doesn’t really assert anything
amazingly new; rather, its conclusion comes from applying the theorem above,
as you’ll see in the proof.

Corollary 7.5.11. Let A, B be any sets. Let f : A → B be a function.


If f is a bijection, then f⁻¹ exists and it is also a bijection.
Furthermore, (f⁻¹)⁻¹ = f.

Proof. Suppose f is a bijection. By the theorem above, f has an inverse,
f⁻¹ : B → A. This means f ◦ f⁻¹ = IdB and f⁻¹ ◦ f = IdA , by the definition
of inverse.
These are precisely the conditions that show (f⁻¹)⁻¹ = f , again by the definition
of inverse! This shows f⁻¹ has an inverse (namely, f itself) so the theorem
above tells us that f⁻¹ must be a bijection.

Inverse of a Composition
Before we move on to some exercises and the next section, let’s get your help in
putting together the main ideas of this chapter so far. Specifically, we are going
to state two results here. The proofs are left for you in the chapter exercises. By
working through those proofs, you will (a) solidify your understanding of many
of the concepts introduced so far—functions, jections, compositions, inverses—
and (b) obtain a helpful result about how to define the inverse of a composition
of functions!
Proposition 7.5.12. Let f : A → B and g : B → C be bijective functions.


Define h : A → C to be h = g ◦ f . Then h is also bijective.

Proof. Left for the reader as Problem 7.8.9.

Proposition 7.5.13. Let f : A → B and g : B → C be bijective functions.


Define h : A → C to be h = g ◦ f . Then h is invertible and h⁻¹ = f⁻¹ ◦ g⁻¹.

Proof. Left for the reader as Problem 7.8.10.

7.5.4 Questions & Exercises


Remind Yourself
Answer the following questions briefly, either out loud or in writing. These
are all based on the section you just read, so if you can’t recall a specific defi-
nition or concept or example, go back and reread that part. Making sure you
can confidently answer these before moving on will help your understanding and
memory!

(1) Is the composition of functions associative? (That is, does the order of
parentheses matter?) Why or why not?

(2) Is the composition of functions commutative? (That is, can we reverse


the order?) Why or why not?

(3) Suppose f : A → B and g : B → A are functions. How do we prove that


g = f⁻¹?

(4) Suppose f : A → B is a bijection. Is its inverse also a bijection?

Try It
Try answering the following short-answer questions. They require you to actu-
ally write something down, or describe something out loud (to a friend/class-
mate, perhaps). The goal is to get you to practice working with new concepts,
definitions, and notation. They are meant to be easy, though; making sure you
can work through them will help you!

(1) Let O be the set of odd natural numbers and let E be the set of even natural
numbers. Define a function f : O → E that is a bijection and prove that
it is so by finding its inverse.

(2) In this problem, we want you to construct an example that shows the im-
portance of verifying both compositions yield the identity function when
we’re trying to find the inverse of a function.
Define sets A, B and functions f : A → B and g : B → A such that

∀x ∈ A. g(f(x)) = x

but

∃y ∈ B. f(g(y)) ≠ y

(Suggestion: You might find an example where A and B both only have
one or two elements . . . Or, you might find an example where A = B = N.)

(3) Let U = {y ∈ R | −1 < y < 1} and I = {y ∈ R | −6 < y < 12}.
Let g : U → I be the function defined by ∀x ∈ U. g(x) = 9x + 3.
Prove that g is a bijection by finding g⁻¹.

(4) Define the function f : Z → N by

∀z ∈ Z. f(z) =
    −2z + 2    if z ≤ 0
    2z − 1     if z > 0

Prove that f is a bijection by finding f⁻¹.


(Hint: Your proposed inverse function will also be piece-wise defined. Be
careful about the cases that will then come up in your proof.)

(5) Challenge: Define I = {y ∈ R | −1 < y < 1}. Find a function f : I → R


that is a bijection and prove that it is.
(Hint: You do not need to use any trigonometric functions. Consider using
|x| somewhere in your expression . . . )

7.6 Cardinality
7.6.1 Motivation and Definition
One important reason for caring about bijections is that they allow us to
compare the sizes of sets! This is a notion for which you have some intuition. For
example, it’s pretty clear that the set

{1, 2, 3, 4, 5}

has 5 elements. It is finite. However, the set

N = {1, 2, 3, 4, 5, . . . }

is infinite. We also understand that Z is an infinite set. So are Q and R. What


are their sizes? Can we even compare them? How could we do so mathemat-
ically? What does it really mean to be an infinite set? Are there “different
infinities”?
Bijections “Pair” Elements


Let’s say there are 5 pens and 5 books on a table in front of us. But also, let’s
pretend that we didn’t know how to count them. How could we verify that
there are just as many pens as there are books? Instead of saying, “There are 5
pens and 5 books, and 5 = 5”, can we somehow show the set of books and the
set of pens has the same size, without knowing what that size is?
This is where a bijection comes into play. We can pair off the pens and
books one-by-one. We can line them up on the table and draw a line between
them, showing a correspondence between them. In the language of sets, we are
identifying a bijection between the set of pens and the set of books. This idea
is so important, that we want to impress it upon you with a quote:
In the land of Cardinality, the Bijection is King.
Imagine our study of cardinality is a journey through the Kingdom of Cardinal-
ity. In this Kingdom, we bow down to King Bijection, for he rules all. Only he
can tell us when two sets have the same cardinality, whether they be finite or
infinite.
Moreover, we really need to use this set terminology, because we will see
some surprising and counter-intuitive results. Using these formal definitions
and concepts will allow us to be rigorous and precise. The examples and results
we see might blow our minds a little bit (or a lot!), but having them rooted in
concepts we’ve already seen and theorems we’ve already proven lets us actually
believe these results, mathematically speaking!

Definitions and Notation


First, let’s define what it means to be finite.
Definition 7.6.1. Let S be any set. We say S is finite if and only if

∃n ∈ N ∪ {0} such that there exists a bijection f : S → [n]

In this case, we write |S| = n to indicate that the size of S is n.


Note: The empty set S = ∅ is finite, since [0] = ∅. This is why we said
n ∈ N ∪ {0} in the definition, and not just n ∈ N. The function f : ∅ → ∅
that is a bijection is simply the empty relation. (Remember that a function is a
relation!)
By definition, sets of the form [n] are finite. They are our standard examples
of finite sets, with size |[n]| = n. Thus, to show that a set S has size n, we need
to find a bijection between S and [n]. For example, consider the set {1, 3, 5}.
This clearly looks like it has size 3. We can show this by exhibiting the bijection
f : {1, 3, 5} → [3] defined by f (1) = 1 and f (3) = 2 and f (5) = 3.
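The definition of finite can be tried out directly on small examples. In the Python sketch below, bracket and is_bijection are hypothetical helper names of our own, and a function is stored as a dictionary. It confirms that the f just described is a bijection from {1, 3, 5} to [3], and that the empty function is a bijection from ∅ to [0].

```python
def bracket(n):
    """Return the set [n] = {1, 2, ..., n}; note that [0] is the empty set."""
    return set(range(1, n + 1))

def is_bijection(f, domain, codomain):
    """Check that the dict f is a bijection from domain to codomain."""
    if set(f) != set(domain):
        return False                            # f must be defined on all of the domain
    images = list(f.values())
    injective = len(images) == len(set(images))  # no output value repeats
    surjective = set(images) == set(codomain)    # every codomain element is hit
    return injective and surjective

f = {1: 1, 3: 2, 5: 3}
assert is_bijection(f, {1, 3, 5}, bracket(3))   # so |{1, 3, 5}| = 3
assert is_bijection({}, set(), bracket(0))      # the empty function witnesses |∅| = 0
```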
It’s interesting to think about whether a finite set could have two different
sizes. The definition technically doesn’t preclude this, but we can prove that
the size of a finite set is unique. Think about how to do that . . . We will do so
after a few more essential definitions.
Definition 7.6.2. Let S be any set. We say S is infinite if and only if S is


not finite.
That is, S is infinite if ∀n ∈ N ∪ {0}, every possible function f : S → [n] fails
to be a bijection.
When S is infinite, we use |S| to indicate the cardinality of the set.
It might seem silly to define infinite in this way—not finite—but it certainly
reflects the intuitive dichotomy between the two concepts. A set can’t be both
finite and infinite, so rather than come up with a way to categorize both of
them, let’s categorize one and define the other to be “anything else”.
Also, we do not write |S| = ∞ to indicate that a set is infinite. As we will
see very shortly, there are actually many different “levels” of infinite sets.
This might seem incredibly bizarre to you right now, but you will see what we
mean. Yes, there are different “sizes” of infinite sets, and we will use |S| to
indicate the cardinality of S so that we may compare it to that of other sets.
Writing |S| = ∞ would indicate there is only “one infinity”, and this is very
much incorrect.
Now, that said, we are mostly going to distinguish just two types of infinite
sets, for our purposes. We are doing this to show you some striking results
about the sets we are already familiar with, namely N and Z and Q and R. The
following definition tells us what these two types are.
Definition 7.6.3. Let S be any set.
We say S is countably infinite if and only if there exists a bijection f : S → N.
We say S is uncountably infinite (or just uncountable) if and only if S is
infinite and every function f : S → N fails to be a bijection.
Given an infinite set S, this definition establishes two possibilities for S,
based on how its cardinality |S| compares with N. We use the term countably
infinite because it represents why we intuitively think of N as infinite. The set
N has “a lot” of elements, so many so that if we tried to count them we would
never finish; however, the fact that we can even try to count them in this way
indicates something special. There is a 1st element of N, and a 2nd element, and
a 3rd, and . . . We can’t name them all in our lifetime, but we could program
a magical, immortal robot to print them out one-by-one. If we thought of a
natural number ahead of time, no matter how huge that number is, we know
the robot will eventually print out that number.
Perhaps we can’t do this with all infinite sets, though. This is what the
notion of an uncountably infinite set is meant to convey. Such a set is infinite,
so there is no correspondence with a set of the form [n], but it is also “so large”
that we cannot identify a “1st element” and a “2nd element” and a “3rd element”
and . . . This is what a bijection f : S → N would convey, a way to label all the
elements of S in a way that shows they are paired off with the natural numbers.
If we cannot do this, then the set is uncountably infinite. Now, you might not
believe that such sets exist! Don’t worry, we will show you some. For now, just
be aware of the distinction between countably and uncountably infinite: the


difference rests on whether a bijection with N exists.

Comparing Cardinalities
As we mentioned, when S is infinite, we use |S| to compare the cardinality of
S to that of other sets. We won’t write something like |S| = ∞. Rather, we will
write something like |S| = |T | to indicate that S and T have the same cardinality,
whatever that may be. We might also write something like |S| < |P | to indicate
P has a strictly larger cardinality than S. The following definition tells us how
the comparison of cardinalities is based on functions and, specifically, different
kinds of jections.
Definition 7.6.4. Let S, T be any sets.
• We write |S| = |T | if and only if there exists a bijection f : S → T .
In this case, we say S has the same cardinality as T .
• We write |S| ≤ |T | if and only if there exists an injection f : S → T .
In this case, we say S has cardinality at most |T |.
• We write |S| < |T | if and only if |S| ≤ |T | and |S| ≠ |T |.
In this case, we say S has a strictly smaller cardinality than T .
• We write |S| ≥ |T | if and only if there exists a surjection f : S → T .
In this case, we say S has cardinality at least |T |.
• We write |S| > |T | if and only if |S| ≥ |T | and |S| ≠ |T |.
In this case, we say S has a strictly larger cardinality than T .
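For small finite sets, these definitions can be tested by brute force: list every function from S to T and ask whether any of them has the required property. The Python sketch below does this; the function names are our own, and the approach is only feasible for very small sets, since there are |T| to the power |S| functions to examine.

```python
from itertools import product

def all_functions(S, T):
    """Yield every function S -> T as a dict (feasible only for small finite sets)."""
    S, T = sorted(S), sorted(T)
    for images in product(T, repeat=len(S)):
        yield dict(zip(S, images))

def is_injective(f):
    return len(set(f.values())) == len(f)

def is_surjective(f, T):
    return set(f.values()) == set(T)

def at_most(S, T):
    # |S| ≤ |T|: some injection S -> T exists
    return any(is_injective(f) for f in all_functions(S, T))

def at_least(S, T):
    # |S| ≥ |T|: some surjection S -> T exists
    return any(is_surjective(f, T) for f in all_functions(S, T))

def same_cardinality(S, T):
    # |S| = |T|: some bijection S -> T exists
    return any(is_injective(f) and is_surjective(f, T) for f in all_functions(S, T))

S = {1, 2}
T = {"a", "b", "c"}
assert at_most(S, T) and not at_least(S, T) and not same_cardinality(S, T)
```

Here |S| ≤ |T| holds but |S| = |T| fails, so |S| < |T| in the sense of the definition.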
Let’s explain the motivation behind these definitions in two different ways:
In general, f : A → B being an injection tells us |A| ≤ |B| and g : A → B
being a surjection tells us |A| ≥ |B|. Think about schematic diagrams for the
functions f and g to see why this definition makes sense. Having an injection
from A → B means we can definitely “pair” the elements of A to elements of B
without overlapping, but perhaps there are “many more” elements of B left over.
Likewise, having a surjection from A → B means we can definitely “cover” all of
B with elements of A, but maybe we had to overlap sometimes to do this, so A
could have “more” elements than B. Having both of these situations together
(i.e. a bijection from A to B) means that A and B actually have the same
cardinality: we can pair off all their elements. This is an intuitive explanation
to motivate these definitions, mind you. These types of explanations are not
rigorous proofs. But now that we have made these definitions, we can use them
to prove and disprove statements! To compare cardinalities of sets—even infinite
ones—we just need to find a function with an appropriate property. All of our
work in the rest of this chapter will be quite helpful in our journey through the
Kingdom of Cardinality.
Another way to think about these definitions is that “has the same cardi-
nality as” is an “equivalence relation” on the “set of all sets”. We have to put
quotes around these phrases because, as we explained in detail in Section 3.3.5
about Russell’s Paradox, there is no such thing as the “set of all sets”. Thus,
it doesn’t make mathematical sense in our context to talk about an equivalence
relation on that “set”. In some fuzzy sense, though, this is what’s going on:

• Given any set S, there is certainly a bijection with itself: the identity
function, IdS : S → S. This shows |S| = |S|, i.e. the “has the same
cardinality as” relation is “reflexive”.

• Suppose |S| = |T |, so there is a bijection f : S → T . Is there a bijection


g : T → S, as well? Why yes, we can use g = f⁻¹, of course! We know
that is also a bijection. This shows |T | = |S| via a bijection, as well, i.e.
the “has the same cardinality as” relation is “symmetric”.

• Suppose |S| = |T | = |U |, so there are bijections f : S → T and g : T → U .


Does there exist a bijection h : S → U , as well? Yes! The composition
g ◦ f is also a bijection (this is something you will prove/have proven in
the exercises). This shows |S| = |U | via a bijection, as well, i.e. the “has
the same cardinality as” relation is “transitive”.

Again, this is not exactly what’s going on, but it can really help you sort through
these difficult, abstract ideas. We are establishing a way to take any two sets and
compare their cardinalities using functions. All of the sets in the universe will be
“partitioned” into different “classes” based on their cardinalities. What’s truly
amazing is what we are about to prove for you: that there are infinitely-many
cardinalities.

Cantor’s Theorem
The following result and proof are due to the German mathematician Georg
Cantor from the mid- to late-1800s. By now, mathematicians have fully em-
braced the result and its consequences. However, at the time, this idea was so
controversial that some mathematicians refused to believe him. In time, though,
his work and ideas helped lead to the development of formal set theory.
The proof of this particular result is known as Cantor’s Diagonalization
Argument. We will use an argument like this later on, where we will point out
why it is like a “diagonal”. For now, we are more interested in the conclusion
of this theorem.

Theorem 7.6.5. Let S be any set. Then |S| < |P(S)|.

This says that the power set of a set always has strictly larger cardinality
than the set itself. This makes sense for finite sets. You discovered already
that the power set of [n] has 2ⁿ elements, i.e. |P([n])| = 2ⁿ. (You will prove
this by induction, using results about cardinality, in Problem 7.8.30.) We see,
indeed, that n < 2ⁿ for every n ∈ N. However, this theorem also asserts that
this relationship holds for infinite sets. Wow! Immediately, this tells us that
there is a whole chain of infinite sets, each one bigger than the previous one.
We can just keep taking the power set of what we had before:

|N| < |P(N)| < |P(P(N))| < |P(P(P(N)))| < ···

Let’s prove this theorem. The proof is very short and clever, so don’t worry
about how to come up with such an argument. Focus on understanding the
logical flow.

Proof. Let S be any set. AFSOC |S| ≥ |P(S)|.


This means there exists a function g : S → P(S) that is surjective.
Define T = {X ∈ S | X ∉ g(X)}. (This makes sense because, for any X ∈ S,
g(X) ∈ P(S), i.e. g(X) ⊆ S. Thus, either X ∈ g(X) or X ∉ g(X) must hold.)
Notice T ⊆ S, by a set-builder notation definition. This means T ∈ P(S).
Since g is surjective, ∃Y ∈ S such that g(Y ) = T . Let such a Y be given.
Now, is Y ∈ T ? We consider both cases:

• If Y ∈ T , then the definition of T says Y ∉ g(Y ). However, g(Y ) = T , so
this means Y ∉ T . This is a contradiction.

• If Y ∉ T , then the definition of T says Y ∈ g(Y ). However, g(Y ) = T , so
this means Y ∈ T . This is a contradiction.

In either case, both Y ∈ T and Y ∉ T hold. This is a contradiction.
Therefore, there exists no such surjection from S to P(S), i.e. |S| < |P(S)|.
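The set T built in this proof can be computed explicitly when S is small. The Python sketch below uses names of our own choosing. It runs through every one of the 8³ = 512 functions g : S → P(S) for S = {0, 1, 2}, builds the set T from the proof for each, and confirms that T is never an output of g. This is a finite illustration of the argument, not a substitute for it.

```python
from itertools import combinations, product

def power_set(s):
    """Return all subsets of the finite set s, each as a frozenset."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(s, g):
    """The set T = {x in s : x not in g(x)} from the proof above."""
    return frozenset(x for x in s if x not in g[x])

S = {0, 1, 2}
subsets = power_set(S)     # the 8 elements of P(S)
elements = sorted(S)

checked = 0
for images in product(subsets, repeat=len(elements)):
    g = dict(zip(elements, images))   # one function g : S -> P(S)
    T = diagonal_set(S, g)
    assert T not in g.values()        # T is always missed, so g is not surjective
    checked += 1
```

After the loop, checked counts how many functions were examined; every single one fails to be surjective, exactly as the theorem predicts.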

Look back at Exercise 4 in Section 7.4.5. Notice that we asked you to define a
function from N to P(N), and then we asked you to prove it was not surjective.
We didn’t have to know what your function was! Since we were aware of this
theorem, we knew you couldn’t possibly have defined a surjection!

Discussion: Axioms and Definitions


We want to make an admission. We have glossed over some details about what
constitutes a definition as opposed to a theorem, a result that needs to be proven from
fundamental assumptions. By definition (at least, in our context) an injection
and a surjection from A to B (in that direction, mind you) constitute sufficient
proof of equal cardinalities, which guarantees a bijection. Likewise, an injection
from A to B and one from B to A is sufficient to guarantee |A| = |B|, and so
there must be a bijection between A and B.
It is not totally obvious, though, why these claims should be true. Say we
have an injection from A to B and one from B to A. Does this guarantee a
bijection between the two sets? Well, one would hope! But this isn’t a proof.
This result is actually known as the Cantor-Schroeder-Bernstein Theorem:
Theorem 7.6.6 (Cantor-Schroeder-Bernstein). Suppose A, B are any sets, and


f : A → B and g : B → A are injections. Then there exists a bijection
h : A → B.
Yes, that is a theorem; it is not trivial! One of the proofs is, in fact, con-
structive: it provides an algorithmic method for constructing that bijection
h : A → B, using the two injections, f : A → B and g : B → A. For our
purposes—and for time and space restrictions—there is no need to separate
this out as a theorem, let alone one with a constructive proof. It is sufficient
to consider injections and surjections and their consequences vis-à-vis cardinal-
ities as definitions; these results “feel” intuitive and we can accept them. Just
realize, though, that we are basing them on rigorous mathematical knowledge.
If you are interested in learning about these subtleties and their consequences,
consider taking a course or reading a book about set theory.
In essence, the real issue is that we pre-supposed any two sets, A and B,
can have their cardinalities compared in some meaningful way, mathematically
speaking. That is, for any A and B, we have pre-supposed that we can somehow
declare that |A| ≤ |B| or |B| ≤ |A| makes sense (or both, perhaps, if the sets
are of “equal size”). But how can we guarantee one such comparison, or maybe
both, will always apply, for any two given sets? It’s not a trivial consideration!
In the context of this book, one of our axioms is that the cardinalities of any
two sets we consider can be compared. In the context of the mathematical
universe at large, though, this is something that needs to be proved from more
fundamental assumptions.

7.6.2 Finite Sets


Before moving into the somewhat bizarre (but fascinating!) world of infinite
sets, let’s focus on some results about finite sets. These results will be easier
to understand, intuitively, and will give us some good practice in working with
functions and their properties to prove facts about cardinalities.

Theorems
For each of these results, we will state a theorem/proposition/lemma, and either
prove it or have you help us with the proof via some exercises.
Theorem 7.6.7. Suppose A, B are disjoint finite sets. Then |A∪B| = |A|+|B|.
Play around with some examples to see why this claim is True. Do you see
why we need the sets to be disjoint for this to work? Can you prove this claim?
Remember that we want to find a bijection between the two sets . . .

Proof. Let A, B be finite sets that are disjoint.


We know ∃a, b ∈ N ∪ {0} and there exist bijections f : A → [a] and g : B → [b].

(That is, we suppose |A| = a and |B| = b). Let such a, b, f, g be given.
WWTS |A ∪ B| = |A| + |B| = a + b; that is, WWTS there is a bijection
h : A ∪ B → [a + b].
Define the function h : A ∪ B → [a + b] by setting, for every x ∈ A ∪ B,
\[ h(x) = \begin{cases} f(x) & \text{if } x \in A \\ g(x) + a & \text{if } x \in B \end{cases} \]

Notice that h is well-defined because A ∩ B = ∅, so every x ∈ A ∪ B satisfies


x ∈ A or x ∈ B and certainly not both. Also, 1 ≤ h(x) ≤ a for every x ∈ A,
and a + 1 ≤ h(x) ≤ a + b for every x ∈ B, so h(x) ∈ [a + b] for every x in the
domain of h.
We claim that the function H : [a + b] → A ∪ B, defined for every y ∈ [a + b] by
\[ H(y) = \begin{cases} f^{-1}(y) & \text{if } 1 \le y \le a \\ g^{-1}(y - a) & \text{if } a + 1 \le y \le a + b \end{cases} \]

is the inverse of h. If this holds, then we have proven that h is a bijection.


Let’s show that H is well-defined. Every y ∈ [a + b] satisfies exactly one of
the two inequalities given in the definition of H. Also, f and g were given
to be bijections, so f⁻¹ and g⁻¹ are well-defined functions (that are bijections
themselves, even). Furthermore, if a + 1 ≤ y ≤ a + b then 1 ≤ y − a ≤ b so
y − a ∈ [b] (the domain of g⁻¹).
Let’s show that h ◦ H = Id_{[a+b]}. Let y ∈ [a + b] be given. We have two cases.
(1) Suppose 1 ≤ y ≤ a; that is, suppose y ∈ [a]. Then,
\[ (h \circ H)(y) = h(H(y)) = h\big(f^{-1}(y)\big) = f\big(f^{-1}(y)\big) = \mathrm{Id}_{[a]}(y) = y \]
where we used the fact that f⁻¹(y) ∈ A.

(2) Suppose a + 1 ≤ y ≤ a + b; that is, suppose y − a ∈ [b]. Then,
\[ (h \circ H)(y) = h(H(y)) = h\big(g^{-1}(y - a)\big) = g\big(g^{-1}(y - a)\big) + a = \mathrm{Id}_{[b]}(y - a) + a = (y - a) + a = y \]
where we used the fact that g⁻¹(y − a) ∈ B.
In either case, we find (h ◦ H)(y) = y, and both cases are disjoint and cover all
possibilities.
Next, let’s show that H ◦ h = Id_{A∪B}. Let x ∈ A ∪ B be given. We have two
cases.
(1) Suppose x ∈ A. Then,
\[ (H \circ h)(x) = H(h(x)) = H\big(f(x)\big) = f^{-1}\big(f(x)\big) = \mathrm{Id}_A(x) = x \]
where we used the fact that f(x) ∈ [a].

(2) Suppose x ∈ B. Then,
\[ (H \circ h)(x) = H(h(x)) = H\big(g(x) + a\big) = g^{-1}\big((g(x) + a) - a\big) = g^{-1}\big(g(x)\big) = \mathrm{Id}_B(x) = x \]
where we have used the fact that g(x) ∈ [b] so a + 1 ≤ g(x) + a ≤ a + b.

In either case, we find (H ◦ h)(x) = x, and both cases are disjoint and cover all
possibilities.
Thus, H = h⁻¹, so h has an inverse. Therefore, h is a bijection.
Therefore, |A ∪ B| = |[a + b]| = a + b = |A| + |B|.
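
To see the construction from this proof in action, here is a minimal Python sketch. It assumes the finite sets are given as Python sets and picks the bijections f and g by simply listing the elements in some order; the helper name make_union_bijection is hypothetical and only for illustration.

```python
def make_union_bijection(A, B):
    """Build h : A ∪ B -> [a+b] and its inverse H for disjoint finite sets."""
    assert not (set(A) & set(B)), "A and B must be disjoint"
    A, B = list(A), list(B)                     # fix some ordering (an arbitrary choice)
    a = len(A)
    f = {x: i + 1 for i, x in enumerate(A)}     # a bijection A -> [a]
    g = {x: i + 1 for i, x in enumerate(B)}     # a bijection B -> [b]
    f_inv = {v: k for k, v in f.items()}
    g_inv = {v: k for k, v in g.items()}

    def h(x):
        # f(x) on A, and g(x) shifted up by a on B
        return f[x] if x in f else g[x] + a

    def H(y):
        # undo h: the first a labels belong to A, the rest to B
        return f_inv[y] if 1 <= y <= a else g_inv[y - a]

    return h, H

A = {"apple", "pear", "plum"}
B = {10, 20}
h, H = make_union_bijection(A, B)
labels = sorted(h(x) for x in A | B)            # every label in [5] is used exactly once
```

Checking that H(h(x)) = x for every x and h(H(y)) = y for every y on a small example mirrors the two-sided inverse argument above; it is only a sanity check, not a replacement for the proof.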

Corollary 7.6.8. Suppose S, T are finite sets and S ⊆ T . Then, |T − S| =


|T | − |S|.

Proof. Define U = T − S. Notice that U ∩ S = ∅ and, since S ⊆ T, we have
U ∪ S = T. Apply the above theorem to U and S to get
|U| + |S| = |U ∪ S| = |T|
then subtract |S| from both sides to get |T − S| = |U| = |T| − |S|.

You can use the two results above to prove the following generalization:

Proposition 7.6.9. Suppose A, B are finite sets. Then |A ∪ B| = |A| + |B| −


|A ∩ B|.

Proof. Left for the reader as Exercise 1 in Section 7.6.5.

Here’s another corollary to the theorem above.

Corollary 7.6.10. Suppose A1 , A2 , . . . , An are finite and pairwise-disjoint (re-


member this means any two of the sets are disjoint).
Then |A1 ∪ · · · ∪ An | = |A1 | + · · · + |An |.

Proof. Left for the reader as Exercise 2 in Section 7.6.5.
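
Before attempting those two proofs, it can help to test the claims on small examples. The following Python sketch does only that; the particular sets are made up for illustration, and a numerical check is of course not a proof.

```python
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}

# Proposition 7.6.9: |A ∪ B| = |A| + |B| - |A ∩ B|
lhs = len(A | B)
rhs = len(A) + len(B) - len(A & B)

# Corollary 7.6.10: sizes add up over pairwise-disjoint sets
parts = [{1, 2}, {3}, {4, 5, 6}]
pairwise_disjoint = all(
    not (S & T) for i, S in enumerate(parts) for T in parts[i + 1:]
)
union_size = len(set().union(*parts))
sum_of_sizes = sum(len(S) for S in parts)
```

Try changing the sets so that they overlap in different ways, and notice which equalities survive and which do not.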

You should also look at Problem 7.8.32 in this chapter’s exercises. There,
we guide you through a proof (by induction on two variables!) about the size
of the Cartesian product of two finite sets.

7.6.3 Countably Infinite Sets


Let’s move on to investigate the land of countably infinite sets. We will start
by talking about a famous thought experiment, named after the mathematician
David Hilbert.

The Hilbert Hotel


Let’s play make-believe. This will help us get a handle on infinite weirdness.
Pretend we own a hotel. There are countably infinitely many rooms in our
magical building. They are numbered as Room 1, Room 2, Room 3, . . . . That
is to say, our rooms are indexed by the set of natural numbers, N.
We want to accommodate as many people as we can (to make lots of money!) and
because our hotel is so swanky and accommodating, our guests are totally willing
to move to a new room whenever we ask them to. It just takes them a couple
of minutes to gather their belongings and walk down the hall to a new room.
We also have a loudspeaker system that allows us to communicate a message to
all of the guests at once.

• Suppose all the rooms are full. It’s a very busy weekend. One guy walks
into the lobby looking for a room. Can we squeeze him in? If not, why?
If so, how?

It turns out that we can! We can just shift all the guests down one room
and place this new guy into Room 1.
The catch, though, is to take advantage of our loudspeaker system. If
we had to go and knock on everyone’s door telling them to move down
one room, we would never actually finish; we would spend all of eternity
knocking on doors and delivering messages.
Instead, we make the following announcement:

Attention guests: If you find yourself in Room n, please move


to Room n + 1. Thank you!

After five minutes, the guests have all moved, and Room 1 is vacant for
our new guest.

Morally speaking, we have just verified that the set N and the set N ∪ {⋆}
have the same cardinality, for any particular object ⋆. In particular, say,
|N| = |N ∪ {0}|. Our hotel has only countably many rooms, and we have
accommodated one person associated with each natural number, as well as
one more person.

• It’s the next day. Our rooms are still full. Suppose a Scrabble convention
with countably infinitely many people shows up. The people are all wear-
ing nametags with natural numbers on them, so there is Person 1, Person
2, Person 3, . . . .
Can we accommodate these folks? How can we assign them rooms? How
do we move around the guests currently in the hotel?

It turns out that we can! The idea is to free up an infinite set of rooms.

Again, the catch is to do this by making one blanket announcement to all


of the guests at once, as opposed to knocking on everyone’s door.
We recognize that the set of even-numbered rooms and the set of odd-
numbered rooms are both infinite in size, so let’s make the current guests
in the hotel occupy the even-numbered rooms and assign the new guests
from the convention to the odd-numbered rooms. We make the following
announcement to the hotel guests via the loudspeaker:

Attention guests: If you find yourself in Room n, please move


to Room 2n. Thank you!

Then, we make the following announcement to the convention folks waiting


in the lobby:

Attention convention-goers: If you are wearing nametag number


n, please go to Room 2n − 1. Thank you!

After five minutes, every hotel guest has moved, and after another five
minutes, every convention-goer has found their room. Voilà!

Morally speaking, we have just verified that the union of two disjoint
countably infinite sets is countably infinite, as well. That is, we took the
set A of current hotel guests (notice A is countably infinite) and the set
B of convention guests waiting for rooms (notice B is countably infinite,
and notice that A ∩ B = ∅) and found a bijection between A ∪ B and N,
where N represents the set of Rooms.

• Now, suppose another convention shows up. They play Scrabble in a


different language, so they don’t want to be associated with the other
convention. How can we move folks around to get everyone a room?
We can do the exact same thing! It’s as if we were facing the same situation
as before, with just a full hotel and a countably infinite set of people
waiting for rooms.

• Now, suppose countably infinitely many conventions show up, each of


them not wanting to be associated with the others. Oh my!
Luckily, the hotel convention organizer has assigned every convention a
natural number, and each member within a convention gets a hat with
that number on it. Also, within each convention, each person is assigned
a natural number, and they wear a badge with that number. Thus, each
person has two forms of identification: a hat and a badge. So we have
Person 1 from Convention 1, and Person 3 from Convention 7, and Person
12 from Convention 8, and so on and so forth.
How do we rearrange all of these people in the hotel? Can we even do it?
How can we do it efficiently?

The catch here is that we cannot apply the same method as the previous
two cases over and over. Yes, we can squeeze in Convention 1 using that
method. After that’s done, we would squeeze in Convention 2. And so
on. But never would we get to all of the conventions. It’s the same
problem we had before where knocking on every individual door would
take forever to accomplish; we needed to send a message to everyone at
once. Likewise, here, we need to send a message to all of the hotel guests,
and then a message to all of the convention-goers waiting outside the door.
It needs to be a general “formula” about which room to go to.
If it helps, think of this from the other side of the situation. Pretend you
are in Convention x and you are Person y in that convention. You are
eagerly awaiting a comfortable bed to sleep in for the night. You want to
know exactly what room to go to. ASAP. You don’t want to wait around
and see all of the conventions ahead of you given rooms, one by one. You
want to all go in at once and find your corresponding rooms.

Here’s one way to do it. Let’s take advantage of the structure of the
prime numbers. We know there are countably infinitely many primes,
and that for any two different primes p and q (i.e. p ≠ q), it is true that
p^j ≠ q^k for any natural numbers j and k (because prime factorizations are
unique). With this in mind, we see that by assigning
individual conventions to the rooms that are powers of a corresponding
prime number, we can ensure that no two (potential) guests get assigned
to the same room. We make the following announcement to our current
hotel guests:
Attention guests: If you find yourself in Room n, please move
to Room 2^n. Thank you!
We then make the following announcement to the conventions waiting
outside the door:
Attention convention-goers:
If you are Person number k from Convention number 1, please
go to Room 3^k.
If you are Person number k from Convention number 2, please
go to Room 5^k.
If you are Person number k from Convention number 3, please
go to Room 7^k.
In general, if you are Person number k from Convention number
n, please go to the Room numbered by the (n + 1)-th prime
number raised to the k-th power.
Thank you!
(Note: We are assuming that all of our guests and potential guests are
math genii, and they can quickly figure out what the (n + 1)-th prime

number is and raise it to the k-th power. Otherwise, we wouldn’t want


them to stay at our luxurious, mathematical hotel in the first place!)
Notice that this guarantees everyone has a room all to themself. Nobody
has to share a room. However, it does leave many rooms empty. Who is
in Room 1? Room 6? Room 18? In general, can you characterize the set
of rooms that will be empty?

How could we have been more “efficient” about this? Is there a certain
announcement we could make so that all the rooms are filled?

Morally speaking, we just verified that N and N × N have the same car-
dinality. We had countably infinitely many conventions with countably
infinitely many people in each, so every person we wanted to accommodate
corresponds to an ordered pair of natural numbers, where the first coordi-
nate is their Person number and the second coordinate is their Convention
number. Since we were able to match this set of people with the set of
rooms (which corresponds to N), then we showed N × N is countable.
(Note: We actually “overdid” it and found a way to embed the set N × N
in a strict subset of N!)
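
The prime-power announcement can be simulated on a finite sample of guests. The sketch below is only an illustration under some assumptions: the function names are hypothetical, the primes are found by naive trial division, and we only look at 10 current guests and the first 5 people from each of the first 8 conventions.

```python
from itertools import count

def nth_prime(n):
    """Return the n-th prime (1-indexed) by trial division; fine for small n."""
    found = 0
    for candidate in count(2):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            found += 1
            if found == n:
                return candidate

def room_for_current_guest(n):
    """The guest currently in Room n moves to Room 2^n."""
    return 2 ** n

def room_for_convention_goer(convention, person):
    """Person k of Convention n goes to Room (the (n+1)-th prime)^k."""
    return nth_prime(convention + 1) ** person

# A finite sample: 10 current guests, and 5 people from each of 8 conventions.
rooms = [room_for_current_guest(n) for n in range(1, 11)]
rooms += [room_for_convention_goer(c, k)
          for c in range(1, 9) for k in range(1, 6)]

no_sharing = len(rooms) == len(set(rooms))
empty_up_to_20 = [r for r in range(1, 21) if r not in set(rooms)]
print(no_sharing, empty_up_to_20)
```

The sample is large enough that every prime power up to 20 is occupied, so the printed list shows exactly which of Rooms 1 through 20 stay empty under this scheme. Comparing that list against the questions above may suggest a pattern.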

This hopefully gives you a flavor for how to think about countable infinities.
One important point to keep in mind is that infinity is a cardinality, not a
number, in our context here. It’s not as if the natural numbers “keep going”
and there’s some magical number ∞ lying out there past them all. Here, we
refer to countably infinite as a cardinality; it represents how “big” something
is. It’s more like a magnitude than a position.

Examples
Let’s take some of the ideas conveyed by the Hilbert Hotel examples and
express them more formally. We’ll make use of injections and surjections and
bijections. (Oh my!) The following result will be helpful as we go along, so let’s
prove it now.

Lemma 7.6.11. Let S, T be any sets. Suppose S ⊆ T . Then |S| ≤ |T |.

Proof. Define the “identity function” f : S → T, given by ∀x ∈ S . f(x) = x.
Since S ⊆ T, this is a well-defined function.
(Note: We couldn’t technically define this as the usual identity function Id_S,
because the domain and codomain might not be equal sets; in essence, f does
the same action as Id_S but has a different codomain).
Notice that f is injective!
(Note: It’s not necessarily bijective, because it might be that S ≠ T.)
Since f is injective, this tells us that |S| ≤ |T|.

You might be wondering why we can’t conclude |S| < |T| here. Why is it “≤”
instead? Certainly, {1, 2} ⊆ {1, 2, 3} and |{1, 2}| = 2 < 3 = |{1, 2, 3}|. This is
true for finite sets, but as we shall see in this section, there are infinite sets
that have strict subsets of equal cardinality!
Example 7.6.12. Z is countably infinite:
We know N is countably infinite by definition. The identity function Id_N : N → N
is obviously a bijection, so N is countable.
In this example, we will prove that Z is countably infinite! To accomplish this,
we need to find a bijection f : Z → N. We will state one here and then prove
it is a bijection by finding its inverse. Before reading on, try to find a bijection
on your own! Maybe you’ll come up with a different function than ours! If you
need a hint for coming up with one, think about this: to prove an infinite set
is countably infinite, we want to find a way to start listing the elements one by
one. Try to find a pattern that identifies the “1st” integer, and then the “2nd”,
and then the “3rd”, . . .

Let’s define a function f : Z → N and then prove it is a bijection by identifying
f⁻¹.
Explicit bijection: We choose to define f : Z → N by setting, for every z ∈ Z,
\[ f(z) = \begin{cases} -2z + 2 & \text{if } z \le 0 \\ 2z - 1 & \text{if } z > 0 \end{cases} \]

We chose this function because it “pairs off” the integers with the natural
numbers like this:

   z:     . . . , −3, −2, −1,  0,  1,  2,  3, . . .
   f(z):  . . . ,  8,  6,  4,  2,  1,  3,  5, . . .

(That is, we are pairing off the even naturals with the non-positive integers, and
the odd naturals with the positive integers. Looking at this correspondence, we
can see how to “reverse” it. This is how we will find f’s inverse.)
Next, define F : N → Z by
\[ F(n) = \begin{cases} -\frac{n}{2} + 1 & \text{if } n \text{ is even} \\ \frac{n+1}{2} & \text{if } n \text{ is odd} \end{cases} \]

Let’s show F = f⁻¹. Let z ∈ Z be given. We have two cases.

• Suppose z ≥ 1. Then f(z) = 2z − 1. Notice that 2z − 1 ∈ N and 2z − 1 is
odd. This means
\[ (F \circ f)(z) = F(f(z)) = F(2z - 1) = \frac{(2z - 1) + 1}{2} = \frac{2z}{2} = z \]

• Suppose z ≤ 0. Then f(z) = −2z + 2. Notice that −2z ≥ 0 so −2z + 2 ≥ 2
so −2z + 2 ∈ N. Also, −2z + 2 is even. This means
\[ (F \circ f)(z) = F(f(z)) = F(-2z + 2) = -\frac{-2z + 2}{2} + 1 = -(-z + 1) + 1 = (z - 1) + 1 = z \]

In either case, (F ◦ f)(z) = z. This shows F ◦ f = Id_Z.

Next, let n ∈ N. We have two cases.

• Suppose n is even. Then F(n) = −n/2 + 1. Notice that n/2 ≥ 1 and so
−n/2 + 1 ≤ −1 + 1 = 0. This means
\[ (f \circ F)(n) = f(F(n)) = f\left(-\frac{n}{2} + 1\right) = -2\left(-\frac{n}{2} + 1\right) + 2 = \frac{2n}{2} - 2 + 2 = n \]

• Suppose n is odd. Then F(n) = (n + 1)/2. Notice that n + 1 ≥ 2 and so
(n + 1)/2 ≥ 1. This means
\[ (f \circ F)(n) = f(F(n)) = f\left(\frac{n+1}{2}\right) = 2\left(\frac{n+1}{2}\right) - 1 = \frac{2n + 2}{2} - 1 = (n + 1) - 1 = n \]

In either case, (f ◦ F)(n) = n. This shows f ◦ F = Id_N. Therefore, F = f⁻¹.


This shows that Z and N have the same cardinality, that |Z| = |N|. You might
feel like there are “twice as many” integers as naturals, but this is where your
intuition fails. We can pair up the elements of these two sets one-by-one, so they
must be of the same size! This is an example that shows you why the conclusion
of Lemma 7.6.11 is the best it can be. Here, N ⊂ Z (a strict subset) and yet
|N| = |Z|. This can only happen when we have infinite (not finite) sets, and here is
one such example.
(Later in this section, we will in fact prove that this is an equivalent way of
characterizing when a set is infinite: whether or not we can find a bijection
between the set and a strict subset of itself.)
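
As a sanity check on the algebra in this example, here is a short Python sketch of f and F as defined above. It tests the two compositions only on a finite window of integers and naturals (the window size is an arbitrary choice), so it illustrates the proof rather than replacing it.

```python
def f(z):
    """f : Z -> N, sending non-positive integers to evens and positives to odds."""
    return -2 * z + 2 if z <= 0 else 2 * z - 1

def F(n):
    """F : N -> Z, the claimed inverse of f."""
    return -(n // 2) + 1 if n % 2 == 0 else (n + 1) // 2

pairing = [f(z) for z in (-3, -2, -1, 0, 1, 2, 3)]   # matches the pairing shown above
left_ok = all(F(f(z)) == z for z in range(-50, 51))   # F ∘ f = Id on a window of Z
right_ok = all(f(F(n)) == n for n in range(1, 101))   # f ∘ F = Id on a window of N
```

Listing f(0), f(1), f(−1), f(2), f(−2), and so on is one way to see the "1st", "2nd", "3rd" integers in the enumeration.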
Example 7.6.13. N × N is countably infinite:
With the Hilbert Hotel discussion in the previous section, we essentially
argued for the fact that N × N has the same cardinality as N. When we had
countably-infinitely-many conventions, each with countably-infinitely-many
people in them, we could still fit them all into our hotel with
countably-infinitely-many rooms! That was more of an intuitive discussion, though,
so let’s formally prove this fact here. We will find an explicit bijection
between the two sets.

Rather than finding its inverse, though, we will prove it is surjective, and ask
for your help in showing that it is injective.
Explicit bijection: Define f : N × N → N by setting
\[ \forall (x, y) \in \mathbb{N} \times \mathbb{N} \,.\; f(x, y) = 2^{x-1}(2y - 1) \]
In proving that f is a bijection, we will be proving this fact:
Every natural number can be written uniquely as a power of 2 times
an odd natural number.
Look at the function we defined. It takes a pair of natural numbers and outputs
a power of 2 times an odd natural number. Proving this is a bijection shows that
it never outputs the same natural twice (injectivity) and every natural number
is an output of some pair (surjectivity). You might try playing around with the
function, plugging in some values and seeing what happens. Also, you might
try working “backwards”, trying to figure out what f⁻¹ might possibly do. For
instance, take your favorite n ∈ N. Can you express it as a power of 2 times an
odd? If n is odd, this is quite easy, since 2^0 = 1. For instance,
\[ 11 = 1 \cdot 11 = 2^0 \cdot (2 \cdot 6 - 1) = f(1, 6) \]
(Notice that we had to use x − 1 and 2y − 1 in the definition of f because we
are working with N, and 0 ∉ N.)
If n is even, we can just divide by 2 iteratively until we can’t anymore; what’s
left must be an odd number. For instance:
\[ 40 = 2 \cdot 20 = 4 \cdot 10 = 8 \cdot 5 = 2^3 \cdot (2 \cdot 3 - 1) = f(4, 3) \]
and
\[ 32 = 2 \cdot 16 = 2^2 \cdot 8 = 2^3 \cdot 4 = 2^4 \cdot 2 = 2^5 = 2^5 \cdot (2 \cdot 1 - 1) = f(6, 1) \]
This observation is crucial in proving that f is surjective:
f is surjective: We claim ∀n ∈ N . n ∈ Im_f(N × N). We prove this by a
“minimal criminal” argument.
BC: Notice that f(1, 1) = 2^0 · 1 = 1. Thus, 1 ∈ Im_f(N × N).
IH: Suppose we have n ∈ N − {1} that has no such representation as a power
of 2 times an odd, i.e. suppose n ∉ Im_f(N × N).
IS: We have two cases:
• If n is odd, then . . . well, n · 2^0 = n · 1 = n is such a representation. That
is, we know (n + 1)/2 ∈ N and we see that
\[ f\left(1, \frac{n+1}{2}\right) = 2^0 \cdot \left(2 \cdot \frac{n+1}{2} - 1\right) = 1 \cdot (n + 1 - 1) = n \]
so n ∈ Im_f(N × N). This contradicts our assumption that n ∉ Im_f(N × N)
so this case is not valid.

• If n is even, then consider n/2. AFSOC we have a representation of n/2 as
a power of 2 times an odd, i.e. suppose n/2 ∈ Im_f(N × N). This means
∃(x, y) ∈ N × N . f(x, y) = n/2. Let such (x, y) be given. Consider, then,
f(x + 1, y) (which is valid since x + 1 ∈ N, as well). We see that
\[ f(x + 1, y) = 2^{x} \cdot (2y - 1) = 2 \cdot 2^{x-1} \cdot (2y - 1) = 2 \cdot f(x, y) = 2 \cdot \frac{n}{2} = n \]
This shows we would have such a representation for n; i.e., in fact,
n ∈ Im_f(N × N). Again, this contradicts our assumption that n ∉ Im_f(N × N).
Thus, n/2 also has no such representation, i.e. n/2 ∉ Im_f(N × N).

We have shown that, supposing n is a counterexample to the claim, n/2 is a
smaller counterexample to the claim. By a “minimal criminal” argument (since
we proved our base case), we conclude that the claim holds for every n ∈ N.
This shows f is surjective.
(Note: You might want to look back at Section 5.5.1 to refresh your memory
about how “minimal criminal” arguments work.)
f is injective: You prove this! See Exercise 7.8.21.
Together, we have proven that f is a bijection, and so |N × N| = |N|. That
is, N × N, the set of all ordered pairs of natural numbers, is countably infinite.
Does this surprise you at all? Does it seem counter-intuitive? What do you
think might be true about the set N3 of all ordered triplets of natural numbers?
What do you think would happen if we took N × N × N · · · ? Think about these
ideas. Discuss them with your classmates, and try to prove something!
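
The “divide by 2 until you can’t anymore” idea can be written out directly. In the Python sketch below, f is the function from this example and f_inverse is a hypothetical name for the procedure that recovers the pair (x, y) from n; checking a finite range does not prove injectivity or surjectivity, but it is a useful experiment.

```python
def f(x, y):
    """f(x, y) = 2^(x-1) * (2y - 1) for natural numbers x, y >= 1."""
    return 2 ** (x - 1) * (2 * y - 1)

def f_inverse(n):
    """Recover (x, y) with f(x, y) = n by stripping factors of 2 from n."""
    x = 1
    while n % 2 == 0:      # each factor of 2 raises the exponent x - 1 by one
        n //= 2
        x += 1
    return x, (n + 1) // 2  # what is left is odd, so it equals 2y - 1

examples = [f(1, 6), f(4, 3), f(6, 1)]
round_trip = all(f(*f_inverse(n)) == n for n in range(1, 1001))
outputs = {f(x, y) for x in range(1, 9) for y in range(1, 9)}
```

The set outputs has 64 elements, one for each of the 64 input pairs, which is consistent with f never repeating a value.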

Example 7.6.14. N × N as a lattice:


Before moving on to another example, let’s show you one more way of thinking
about why |N × N| = |N|. This will be an intuitive explanation, more like
a description of how to define a bijection between the sets without actually
making the definition. However, it’s a common argument and is well worth
seeing.
The idea is to think of N × N as a lattice of points, like so:

[Figure: the points of N × N drawn as a lattice of dots, with both the horizontal and vertical axes labeled 1, 2, 3, 4, 5, 6.]

To show that this infinite grid of points is countably infinite, we can describe
a path that traverses all of the points (surjectivity!) exactly once (injectivity!)
and is indexed by the natural numbers (countably infinite!). That is, we can
just describe a way to traverse the whole grid in a series of steps; there will be
a “1st point” and a “2nd point” and so on.
The key observation to make is that the “northwestern” diagonals of this grid
are all finite. Start from the point (5, 1), for instance, and move upwards and
leftwards, diagonally. You will traverse over (4, 2) and (3, 3) and (2, 4) and (1, 5),
and then reach the boundary of the grid. This is true no matter where you start
along the bottom row of lattice points.
Let’s use this fact to label each lattice point with a natural number based on
(a) which diagonal it lies on, and (b) where it lies along that diagonal. We’ll
treat the diagonal starting at (1, 1) as the 1st diagonal, the one starting at (2, 1)
as the 2nd, and so on. This gives us the following labels:

 y
 6 |
 5 | 15
 4 | 10  14
 3 |  6   9  13
 2 |  3   5   8  12
 1 |  1   2   4   7  11
   +-------------------------
      1   2   3   4   5   6   x


We can see that every point in the lattice will lie on exactly one such diago-
nal. Furthermore, there are countably-infinitely-many such diagonals (they are
indexed by N) and there are only finitely-many points on each diagonal. This
means (as we will prove below) that the collection of all the points on the diag-
onals is countably infinite.
You ought to try formalizing this argument by writing down a function that
achieves the labeling we’ve demonstrated. Or, you could at least work with
a similar one that also works, i.e. you could move southeastwards instead, or
reverse the direction of alternate diagonals . . .
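
If you would like to compare your own formalization against something concrete, here is one possible version as a Python sketch. The names label and point are hypothetical; the formula assumes the diagonals are numbered and traversed exactly as in the labeled lattice above (the d-th diagonal starts at (d, 1) and moves up and to the left).

```python
def label(x, y):
    """Natural number assigned to the lattice point (x, y)."""
    d = x + y - 1                 # index of the diagonal through (x, y)
    before = d * (d - 1) // 2     # number of points on diagonals 1, ..., d - 1
    return before + y             # (x, y) is the y-th point along diagonal d

def point(n):
    """Lattice point that receives the label n (the reverse of label)."""
    d = 1
    while d * (d + 1) // 2 < n:   # find the diagonal containing the n-th point
        d += 1
    y = n - d * (d - 1) // 2
    return d - y + 1, y

bottom_row = [label(x, 1) for x in range(1, 6)]
left_column = [label(1, y) for y in range(1, 6)]
round_trip = all(label(*point(n)) == n for n in range(1, 501))
```

The bottom row and left column reproduce the labels in the picture, and the round trip suggests that label and point undo each other.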
Example 7.6.15. Q is countably infinite:
This result is one of the more striking examples of our intuition failing with
infinite sets and their cardinalities. Think about the elements of Q as laid out
on the real number line. They’re everywhere! In fact, look at Exercise 4.11.26;
there, you proved that the rationals are dense, and it is also true that they are
dense in R (i.e. between any two distinct real numbers lies a rational number).
Furthermore, the set of rationals seems so much larger than Z: between 0 and 1
alone, there lie infinitely many rational numbers! For these reasons, you might
believe that Q is uncountably infinite, but this is False.
In this example, we will present several arguments for this fact, especially be-
cause we realize it is so strange and striking.
(1) Intuitive argument:
Consider the following “representation” of Q as a union of sets:
Q “=” “N × N” ∪ “−”(N × N) ∪ {0}
In some sense, N × N corresponds to all the positive rationals. To see why,
just consider the function f : N × N → Q+ defined by f(x, y) = x/y. We
definitely output all positive rationals (so f is a surjection), but 4/2 = 2/1 so
this is not an injection. At least, this shows |N × N| ≥ |Q+| because f is a
surjection. Since N × N is countably infinite, and we certainly expect Q+ to
be infinite, this shows the positive rationals are countably infinite.
The set of negative rationals—let’s call it Q− —must have the same cardinal-
ity as the set of positive rationals—let’s call it Q+ . There is a clear bijection
between them: define g : Q+ → Q− by setting ∀q ∈ Q+ . g(q) = −q.
All this leaves out is 0 ∈ Q. The union of two countably infinite sets is also
countably infinite (as we will prove below), and adding on one more element
won’t change that. Thus, Q is countably infinite.
Mind you, this is quite “hand-wavey”. All of the “scare quotes” in the
“equation” above mean you should take this as just a heuristic argument,
and not a proof. However, there are ways to make all of these arguments
formal. Try working on this on your own!
(2) Listing Q:

Consider writing a computer program to print out all the positive rational
numbers in a list. What algorithm would you use? As long as you can
guarantee that your program will “eventually” succeed and print them all,
then you have shown Q can be enumerated one-by-one, so it must be count-
ably infinite. (Remember, this is why we use N as the canonical countably
infinite set: we can enumerate its elements one-by-one, we can count them.)
Here’s one way that we might write such a program: Follow the same “path
through the lattice” argument that we used with N × N in the previous
example. This time, though, just “skip over” any rational you have already
printed.
That is, we would print the pair (1, 1) ↔ 1 and then (2, 1) ↔ 2 and then
(1, 2) ↔ 1/2 and then (3, 1) ↔ 3 and then . . .
Aha! We have to omit writing (2, 2) ↔ 1. How did we know that? We see
that we already printed 1. How did we know that? We just looked over the
list of rationals we had already printed and checked to see if what we were
about to print has already appeared. If so, we move on; if not, we print it
and then move on.
In terms of the enumeration process, this just means that for every point in
the lattice we pass through, we have to check finitely-many things; namely,
we have to look over the finitely-large set of rationals we have already
printed. This means the printing process at any individual step will take “a
little longer” but not infinitely-longer. Thus, our program will eventually
print out every rational number; no matter which one you have in mind, we
will get to it in finite time.
(3) Q is at most countably infinite:
Here’s another argument about Q being countable. (If this feels like overkill,
that’s fine, just move on. We just know that this is a surprising result and
having a few ways of thinking about it might help!)
Consider this: We can definitely agree a priori that |Q| ≥ |N|. This follows
from the fact that Q ⊇ N. Now, the only question is whether or not these
cardinalities are equal. To reach that conclusion, we would need to find
either (a) an injection from Q to a countable set, or (b) a surjection from a
countable set to Q.
We will prove below that Z × N is countable. (That is, we will prove
generally that the Cartesian product of any two countably infinite sets is
also countably infinite.) We can then define the function f : Z × N → Q by
\[ \forall (z, n) \in \mathbb{Z} \times \mathbb{N} \,.\; f(z, n) = \frac{z}{n} \]
This is a surjection onto Q. It is definitely not injective (why not?) but
we don’t care. It shows that |Z × N| ≥ |Q|. Once we have proven that
|Z × N| = |N|, this will have shown that |N| ≥ |Q|. Combined with
|Q| ≥ |N| from above, we conclude |N| = |Q|.

(4) Stern-Brocot Tree:


There are other visual representations of Q, too! The Stern-Brocot Tree
is particularly enlightening. This idea was, in fact, first introduced and de-
veloped by a French watchmaker named Achille Brocot who was looking for
ways to approximate the measurements of gears he needed to make while
building watches. Around the same time (the 1850s and 1860s), the Ger-
man mathematician Moritz Stern developed the idea. It’s amazing to think
that a non-mathematician would independently develop this fascinating idea
to solve a real-world problem he was facing!
(Do not worry too much about the terminology of graphs and trees here. We
will not have occasion to talk about these much further, and are only intro-
ducing this as a helpful way to represent Q and demonstrate it is countably
infinite.)
The root of this tree is 1. (This is the number at the very top of the di-
agram.) The parent-child relation (the way to generate what lies below
a point in the tree) is defined in terms of continued fractions. (We won’t
describe here what that means; instead, we describe below how to construct
the tree.)
What happens with this setup is that any path through the tree from the
root to another node yields a sequence of rational numbers that are better
and better approximations to the ultimate node; furthermore, each succes-
sive rational in the sequence has a larger denominator than the previous
one. This is the property that motivated Monsieur Brocot. He needed to
determine how big to make two gears inside a watch so that the ratio of
their sizes was very close to a particular number. By working downwards
through this tree, he could find better approximate ratios to the number he
needed! Pretty cool, right?

To actually construct the tree, we find mediants. Given two rationals a/b and
c/d, the mediant of those two is defined as (a + c)/(b + d). (Notice that this is a
special object, the mediant; it is not the correct way to add two fractions!)
Each level of the tree consists of all the mediants made from consecutive
pairs of rationals in the level above; we don’t “count” the directly vertical
elements; they are just carried over for ease of reading and construction.
Also, notice that the fractions 0/1 and 1/0 (which is undefined, even!) are
included in the outside columns to help generate the elements on the outside
of each level.
(Play around with the properties of this tree, and read more about it. It is an
interesting mathematical object!)
We won’t prove here that this tree contains all the rational numbers, but
we think you can see why this is believable. Also, we think you can see why
the set of all nodes in this tree is countably infinite. Each level has only
finitely many nodes, and there are countably-infinitely-many levels.
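
Two of the arguments above are easy to experiment with on a computer. The Python sketch below is one possible reading of them, with hypothetical function names: positive_rationals follows the "path through the lattice, skipping repeats" idea from argument (2), treating the lattice point (x, y) as x/y, and next_level builds successive rows of the Stern-Brocot construction from argument (4) by inserting the mediant (a + c)/(b + d) between consecutive entries. Fractions in the tree are stored as (numerator, denominator) pairs so that the boundary entry 1/0 can be represented.

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Yield every positive rational exactly once by walking lattice diagonals."""
    seen = set()
    for d in count(1):                 # d-th diagonal: points with x + y = d + 1
        for y in range(1, d + 1):      # start at (d, 1), move up and to the left
            x = d + 1 - y
            q = Fraction(x, y)         # lattice point (x, y) stands for x / y
            if q not in seen:          # skip values that were already listed
                seen.add(q)
                yield q

def next_level(row):
    """Insert the mediant (a+c)/(b+d) between each consecutive pair in row."""
    out = [row[0]]
    for (a, b), (c, d) in zip(row, row[1:]):
        out += [(a + c, b + d), (c, d)]
    return out

first_ten = [str(q) for q in islice(positive_rationals(), 10)]

row = [(0, 1), (1, 0)]                 # boundary "fractions" 0/1 and 1/0 as pairs
for _ in range(3):
    row = next_level(row)
```

Note that the set of already-listed rationals grows without bound, which matches the remark that each step takes "a little longer" but never infinitely long. After three rounds, row holds 0/1, 1/3, 1/2, 2/3, 1/1, 3/2, 2/1, 3/1, 1/0.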

Theorems
Now we know that three of our standard sets of numbers—N and Z and Q—are
all countably infinite, as well as the set N × N. With the following theorems, we
will show you some ways to generate more countably infinite sets from existing
ones.
Let’s get you warmed up with one helpful result. It says that we can take a
countably infinite set and “tack on” finitely-many extra elements, and this keeps
the result countably infinite, as well.

Lemma 7.6.16. If A is countably infinite and B is finite and A ∩ B = ∅, then


A ∪ B is countably infinite.

Proof. Left for the reader as Exercise 7.8.19.


Hint: Try using a similar idea to our proof of Theorem 7.6.7.

Remark 7.6.17. Note: The assumption that A ∩ B = ∅ is not essential in this


Lemma, but it makes the proof easier.
When A ∩ B ≠ ∅, we can apply the result just proven to the set A − B (which
is countably infinite) and the set B − A (which is finite) to get the countably
infinite set (A − B) ∪ (B − A) (since they’re disjoint). We can then apply the
above result again to that set—(A−B)∪(B−A)—and A∩B to get the countably
infinite set
A ∪ B = (A − B) ∪ (B − A) ∪ (A ∩ B)
The next result says that this works with A, B both countably infinite, as
well.

Lemma 7.6.18. If A and B are countably infinite and A ∩ B = ∅, then A ∪ B
is countably infinite.

Proof. Since A and B are countably infinite, there exist bijections f : A → N
and g : B → N. Let such functions be given. We will use them to find a bijection
h : A ∪ B → N.
First, define the function p : N → Z − N by setting ∀n ∈ N . p(n) = −n + 1.
This is a bijection because p^{-1} : Z − N → N is given by p^{-1}(z) = −z + 1. (Check
this for yourself!)
Since p and g are bijections, we know p ◦ g : B → Z − N is a bijection, as well.
Next, we define the piece-wise function q : A ∪ B → Z by setting

∀x ∈ A ∪ B . q(x) = f(x)       if x ∈ A
                     p(g(x))    if x ∈ B

This is well-defined because A ∩ B = ∅. Furthermore, this is a bijection because
it is a bijection on each of the pieces (f maps A onto N, and p ◦ g maps B onto
Z − N), and the two images N and Z − N are disjoint and together make up Z.
(Again, check this for yourself to make sure it makes sense. Also, see
Exercise 7.8.31 which proves this, in generality.)
From previous work, we know how to find a bijection r : Z → N. (Remember
how we did that? Look back at Example 7.6.12!)
Finally, define h : A∪B → N by h = r ◦q. This is a composition of bijections, so
it is a bijection. This proves |A ∪ B| = |N|, i.e. A ∪ B is countably infinite.
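To see the pieces of this proof fit together, here is a small Python sketch on one concrete choice of sets. Everything specific here is our own assumption for illustration: A and B are taken to be two tagged copies of N (so they are disjoint), and `r` is just one possible bijection from Z to N, which may differ from the one in Example 7.6.12.

```python
def f(x):
    """Bijection A -> N, where A = {("a", n) : n in N}."""
    return x[1]


def g(x):
    """Bijection B -> N, where B = {("b", n) : n in N}."""
    return x[1]


def p(n):
    """Bijection N -> Z - N from the proof: 1, 2, 3, ... goes to 0, -1, -2, ..."""
    return -n + 1


def q(x):
    """Piece-wise bijection A ∪ B -> Z."""
    return f(x) if x[0] == "a" else p(g(x))


def r(z):
    """An assumed bijection Z -> N: positives go to evens, the rest to odds."""
    return 2 * z if z > 0 else 1 - 2 * z


def h(x):
    """The composition h = r ∘ q from A ∪ B to N."""
    return r(q(x))


values = sorted(h((tag, n)) for tag in ("a", "b") for n in range(1, 6))
print(values)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

With these choices, h sends ("a", n) to 2n and ("b", n) to 2n − 1, so the two sets are interleaved inside N.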
The next corollary says that we did not, in fact, need to assume that A∩B =
∅. It made the proof easier. We will ask you to prove this corollary.
Corollary 7.6.19. If A and B are countably infinite, then A ∪ B is countably
infinite.
Proof. Left for the reader as Exercise 7.8.20.
(Hint: Apply Lemma 7.6.18 to appropriately-chosen sets. . . )
This proves several cases about finding the union of sets. Let’s prove a result
about taking a Cartesian product.
Theorem 7.6.20. If A and B are countably infinite, then A × B is countably
infinite.
This one is actually easy to prove, but only because we’ve already proven a
result about a canonical set that is a Cartesian product and is countably infinite,
itself. Look at how we use N × N in the proof:
Proof. Suppose A, B are countably infinite. Then there exist bijections f : A →
N and g : B → N. Let such functions be given.
Define the function h : A × B → N × N by

∀(x, y) ∈ A × B . h(x, y) = (f(x), g(y))

We claim this is a bijection. Since f, g are invertible, we claim that H : N × N →
A × B given by

∀(k, ℓ) ∈ N × N . H(k, ℓ) = (f^{-1}(k), g^{-1}(ℓ))

satisfies H = h^{-1}.
To see why, notice that

∀(x, y) ∈ A × B . (H ◦ h)(x, y) = H(h(x, y)) = H(f(x), g(y))
                                 = (f^{-1}(f(x)), g^{-1}(g(y))) = (x, y)

and

∀(k, ℓ) ∈ N × N . (h ◦ H)(k, ℓ) = h(H(k, ℓ)) = h(f^{-1}(k), g^{-1}(ℓ))
                                 = (f(f^{-1}(k)), g(g^{-1}(ℓ))) = (k, ℓ)

so H ◦ h = Id_{A×B} and h ◦ H = Id_{N×N}. This shows H = h^{-1}.


Therefore, h is a bijection, and so |A × B| = |N × N| = |N|.
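Here is a quick Python sketch of the coordinate-wise idea in this proof. The particular sets are our own assumption: A is the set of even natural numbers and B is the set of perfect squares, with the obvious bijections to N.

```python
import math


def f(x):
    """Bijection from A = {2, 4, 6, ...} to N."""
    return x // 2


def g(y):
    """Bijection from B = {1, 4, 9, ...} to N."""
    return math.isqrt(y)


def h(x, y):
    """h : A × B -> N × N, applied coordinate by coordinate."""
    return (f(x), g(y))


def H(k, l):
    """The proposed inverse, built from the inverses of f and g."""
    return (2 * k, l * l)


print(h(6, 25))  # (3, 5)
print(H(3, 5))   # (6, 25)
```

Composing in either order returns what we started with, which is exactly the check H ◦ h = Id and h ◦ H = Id carried out in the proof.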

By applying induction to the two previous results, we can prove the following:

Corollary 7.6.21. Suppose A_1, . . . , A_n are countably infinite (where n ∈ N, so
we only have finitely many sets).
Then A_1 ∪ · · · ∪ A_n and A_1 × · · · × A_n are countably infinite.

Proof. Left for the reader as Exercise 7.8.22.

A Countable Union of Countable Sets is Countable


You might wonder now what happens when we take a union or product of a
countably infinite number of sets, each of which is countably infinite . . . Let’s
tackle the union case here. This result is so fundamental and important that
we’ve even reiterated it in the section title here!

Theorem 7.6.22. Suppose we have, for each n ∈ N, a countably infinite set
A_n. Then the set

A = ⋃_{n∈N} A_n = A_1 ∪ A_2 ∪ A_3 ∪ · · ·

is also countably infinite.

We will prove this in the case that the sets are pairwise-disjoint, and leave
the rest of the details to you.
Proof. Suppose we have, for each n ∈ N, a countably infinite set A_n. Furthermore,
suppose ∀i, j ∈ N . i ≠ j =⇒ A_i ∩ A_j = ∅. Define

A = ⋃_{n∈N} A_n

We claim A is countably infinite.


Since each A_n is countably infinite, we know there exists a bijection f_n : A_n → N,
for every n ∈ N. This lets us “number” the elements of every set A_n, based on
what the bijections f_n do. Furthermore, we have a number on the A_n sets (they
are indexed by N). In essence, then, we have a “numbering” of the elements of
A that corresponds to N × N. Let’s formally define this correspondence.
Let’s define a function F : A → N × N. Given any x ∈ A, we know
∃n ∈ N . x ∈ A_n and that this n is unique. (This follows because the given sets
were pairwise-disjoint.) Set F(x) = (n, f_n(x)).
We claim that F is a bijection. To see why, consider the function G : N × N → A
defined by

∀(a, b) ∈ N × N . G(a, b) = f_a^{-1}(b)

That is, G uses the first coordinate a to identify the set A_a, and then uses the
function f_a to identify the element of A_a that produced b ∈ N as an output.
(We will leave it to the reader to verify that, indeed, G = F^{-1}.)
This shows that |A| = |N × N| = |N|, so A is countably infinite.
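The functions F and G can be tried out on a concrete family of sets. The following Python sketch uses an example of our own choosing (it is not from the text): A_n is the set of natural numbers of the form 2^(n−1) · (2m − 1) with m ∈ N. These sets are pairwise-disjoint, their union is N, and f_n sends 2^(n−1) · (2m − 1) to m.

```python
def F(x):
    """Send x in A to (n, f_n(x)), where x lies in A_n."""
    n = 1
    while x % 2 == 0:   # strip factors of 2 to find which A_n contains x
        x //= 2
        n += 1
    return (n, (x + 1) // 2)   # the odd part 2m - 1 gets sent to m


def G(a, b):
    """Send (a, b) to the element of A_a that f_a maps to b."""
    return 2 ** (a - 1) * (2 * b - 1)


print(F(12))    # (3, 2), since 12 = 2^2 * 3 and 3 = 2*2 - 1
print(G(3, 2))  # 12
```

Checking G(F(x)) = x and F(G(a, b)) = (a, b) on a range of inputs is a nice sanity check before writing out the general verification that G = F^{-1}.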

In the case where the An sets are not necessarily pairwise-disjoint . . . we leave
this as Exercise 7.8.37.

Corollary 7.6.23. Suppose we have, for every n ∈ N, a nonempty finite set A_n.
Furthermore, suppose that these sets are pairwise-disjoint. Define

A = ⋃_{n∈N} A_n

Then A is countably infinite.

Proof. Left for the reader as Exercise 7.8.36.

This result is very powerful. Let’s see it applied to two examples.


Example 7.6.24. The set of all powers of primes:
Recall the Hilbert Hotel discussion, where we accommodated infinitely many
conventions of people that were each infinitely large. We sent people to rooms
corresponding to powers of primes. For every n ∈ N, define p_n to be the n-th
prime number. Then, for every n ∈ N, define

A_n = {p_n^k | k ∈ N}

which is the set of all powers of the n-th prime. The theorem above says that

⋃_{n∈N} A_n = {all powers of primes}

is countably infinite, as well. Indeed, we should have expected that because that
union is just an infinite subset of the natural numbers, which is countably
infinite itself!
Example 7.6.25. The set of all finite binary strings:
A binary string is defined to be an ordered list of 0s and 1s. A finite binary
string is one that is of finite length.
For example, the following are all finite binary strings:

0, 1, 101010, 10000000000000000001

For every n ∈ N, let’s define Fn to be the set of all binary strings of length n.
For instance,

F1 = { 0 , 1 }
F2 = { 00 , 01 , 10 , 11 }
F3 = { 000 , 001 , 010 , 100 , 011 , 101 , 110 , 111 }

and so on. (Notice that |F_n| = 2^n. Try to prove that!) Then, define the set of
all finite binary strings by

F = ⋃_{n∈N} F_n

An element of F must have come from some set in the big union; this means
that an arbitrary element x ∈ F is some binary string with some finite length.
That length could be a huuuuuuuge number, but it is finite. (This points out
the distinction between allowing something to be “arbitrarily large (but finite)”
and allowing something to be “infinite”.)
The point of this example is that F is countably infinite, according to the
theorem above! (Well, it follows from the corollary stated right after, actually.)
Contrast this with the set S of all infinite binary strings, which is—as we will
prove shortly—uncountably infinite. We will use these sets of binary strings
fairly often as examples!
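One way to make the countability of F tangible is to actually list its elements, one length at a time. The Python sketch below does this; the names `finite_binary_strings` and `position` are our own, and the ordering (shorter strings first, then in counting order within each length) is just one convenient choice.

```python
from itertools import count, islice, product


def finite_binary_strings():
    """Yield every finite binary string: all of F_1, then all of F_2, and so on."""
    for n in count(1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def position(s):
    """The natural number assigned to s by the listing above."""
    n = len(s)
    # 2^n - 2 strings are shorter than s; int(s, 2) strings of length n come before it.
    return (2 ** n - 2) + int(s, 2) + 1


print(list(islice(finite_binary_strings(), 6)))  # ['0', '1', '00', '01', '10', '11']
print(position("101010"))  # 105
```

Every finite string shows up at some finite position, however huge, which is the "arbitrarily large (but finite)" idea from the example.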

Passing Off To A “Limit”


We proved above that if A and B are countably infinite, then so are A ∪ B and
A × B. We also encouraged you to prove (by induction on the number of sets
in the union/product) that
A_1 ∪ A_2 ∪ · · · ∪ A_n = ⋃_{i∈[n]} A_i    and    ∏_{i∈[n]} A_i = A_1 × A_2 × · · · × A_n

are both countably infinite, as well, for any n ∈ N.


What do these results tell us, if anything, about

A_1 ∪ A_2 ∪ A_3 ∪ · · · = ⋃_{k∈N} A_k

and

A_1 × A_2 × A_3 × · · · = ∏_{k∈N} A_k
That is, what happens when we try to “jump to the limit” from having a finite
union/product (of arbitrarily large size, but still finite) to having an infinite
union/product? Can we make necessary conclusions? Can we find counterex-
amples?
The main idea is that “passing to a limit” does create some mathematical
object, but we can’t necessarily pre-suppose that this object has the exact same
properties as all of the objects in the sequence that defines that object.
Think about the finite sets [n], for every n. Each of them is finite, but “in
the limit” we get N which is not finite. So yes, we do get some object (another
set), but it doesn’t have to have the same properties.
The important theorem above shows that passing to the limit in the union
definitely preserves countability. As we will see in the next section, the
product definitely does not preserve countability. (In fact, even an infinite
product of finite sets, each with just two elements, is uncountable. Yikes!)
A similar notion appears in calculus. (We promised we would not use calculus,
but there is such a natural relationship between these ideas, so we feel compelled
to mention an easy example. If you don’t get anything out of this, no worries; if
you do, though, try to remember this connection and think about how it might
fundamentally change your view of everything you learned in calculus.)
Consider a limit, something like

lim_{x→∞} 1/x = 0

In what sense is this limit equal to 0? Why would we, as mathematicians over
the years, choose to define limits in this way? Formally, this limit makes sense
because of the quantified definition of a limit. Let P be the set of positive real
numbers. Then the definition of limit (applied to this example) says

∀ε ∈ P . ∃M ∈ N . ∀x ∈ P . x > M =⇒ |1/x − 0| < ε

That is, for any small positive threshold (ε > 0), we can find a specific cutoff
point (a large natural number M that depends on ε somehow) such that, for
every point after M, the function 1/x falls within that ε-threshold of the limit
point, zero.
Notice that this is very different than saying some nonsense like “1/∞ = 0”. That’s
not what’s going on. We never actually get to “plug in” the end of the limit
and evaluate it. The limit is defined in terms of quantifications, some things
that are happening for arbitrarily large values, but not for an infinite value.
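To see how the cutoff M can depend on ε in this example, here is a small Python sketch. The choice M = ⌈1/ε⌉ is our own (any larger natural number would also work), and the final check only samples integer points after M, so it is a spot check and not a proof.

```python
import math
from fractions import Fraction


def cutoff(eps):
    """A natural number M with 1/x < eps whenever x > M (one valid choice)."""
    return math.ceil(1 / eps)


eps = Fraction(3, 1000)
M = cutoff(eps)
print(M)  # 334

# Spot check: every sampled point after M lands within eps of the limit 0.
print(all(Fraction(1, x) < eps for x in range(M + 1, M + 1000)))  # True
```

Notice that the code never evaluates anything "at infinity"; it only ever answers the question "given ε, how far out is far enough?"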

7.6.4 Uncountable Sets


To start our discussion of uncountable sets, let’s prove a result we’ve mentioned
already. Specifically, we will prove that a countably infinite Cartesian product
of sets is uncountably infinite. Notice that we don’t even need to have the sets
be infinite: we can make them all finite with size 2! We will use this result in
the next part to demonstrate some examples of uncountable sets, including a
familiar set we already know . . .

An Uncountable Cartesian Product


Theorem 7.6.26. A countably infinite Cartesian product of sets with just two
elements is uncountably infinite. That is,
{0, 1}^N = {0, 1} × {0, 1} × {0, 1} × · · ·
is uncountably infinite.
Proof. AFSOC that this set {0, 1}^N is actually countably infinite. This means
we can find a bijection between this set and N; that is, we can identify a
correspondence between all the elements of this set and all of the natural numbers.
Thus, there is a 1st element of this set, the element that corresponds to 1; there
is a 2nd element of this set, the element that corresponds to 2; and so on.
We don’t know exactly what these elements are; we are just guaranteed that
this correspondence exists. Still, we can write out all the elements y_i of {0, 1}^N
in a list. Each y_i is an ordered, infinite list of 0s and 1s, so we can write them
like this:

1 ↔ (a_{1,1}, a_{1,2}, a_{1,3}, a_{1,4}, a_{1,5}, . . .) = y_1
2 ↔ (a_{2,1}, a_{2,2}, a_{2,3}, a_{2,4}, a_{2,5}, . . .) = y_2
3 ↔ (a_{3,1}, a_{3,2}, a_{3,3}, a_{3,4}, a_{3,5}, . . .) = y_3
4 ↔ (a_{4,1}, a_{4,2}, a_{4,3}, a_{4,4}, a_{4,5}, . . .) = y_4
5 ↔ (a_{5,1}, a_{5,2}, a_{5,3}, a_{5,4}, a_{5,5}, . . .) = y_5
⋮

Every value a_{i,j} is either 0 or 1. The i tells us which natural number we
correspond to (i.e. the vertical position in the list) and the j tells us which
coordinate we are in (i.e. the horizontal position in the list).
Since we have assumed the correspondence is a bijection, we know that this list
contains all of the elements of {0, 1}^N. To complete the contradiction argument,
we will construct an element of {0, 1}^N that is guaranteed to not appear in this
list! (This is a version of Cantor’s Diagonalization Argument.)
Let’s define the object x = (x_1, x_2, x_3, . . . ) by saying

x_i = 0    if a_{i,i} = 1
      1    if a_{i,i} = 0

That is, we are constructing x by going down the main diagonal of the grid of
elements (so we see all of the elements a_{i,i}) and switching the value from a 1 to
a 0, or vice-versa.
The following diagram is a specific example of how to do this, and is not part of
this more general proof. However, we are including it for the sake of illustration:

1 ↔ (1, 1, 0, 0, 1, . . .) = y_1
2 ↔ (1, 0, 0, 0, 1, . . .) = y_2
3 ↔ (0, 0, 1, 1, 0, . . .) = y_3
4 ↔ (1, 1, 0, 1, 1, . . .) = y_4
5 ↔ (0, 1, 1, 1, 0, . . .) = y_5
⋮

x = (0, 1, 0, 0, 1, . . .)

Why would we choose to do this? Well, think about whether or not the
object x could possibly belong to the list of elements above.
• Is x = y_1? No, because x is different from y_1 in their first coordinates. (In
our example, x_1 = 0 because a_{1,1} = 1.)
• Is x = y_2? No, because x is different from y_2 in their second coordinates.
(In our example, x_2 = 1 because a_{2,2} = 0.)
• Is x = y_3? No, because x is different from y_3 in their third coordinates.
(In our example, x_3 = 0 because a_{3,3} = 1.)
In general, for an arbitrary i ∈ N, we can guarantee that x and y_i differ in the
i-th coordinate. Accordingly, none of the y_i objects can be equal to this new
object x. That is,

(∀i ∈ N . x_i ≠ a_{i,i}) =⇒ (∀i ∈ N . x ≠ y_i)

But the way we defined x, it is just an ordered, infinite list of 0s and 1s, so it is
definitely an element of {0, 1}^N, itself.
This is a contradiction. We assumed we could list all the elements of our set,
but we then used this ordering to construct an element of our set that definitely
does not appear in the list. ××××
Therefore, {0, 1}^N is uncountably infinite.
Note: This is a very slick argument. It’s one of my favorite proofs in all of
mathematics. Cantor was a genius for coming up with it and, what’s even more
interesting, it’s actually fairly simple and memorable, as well. We believe that
you won’t forget this “go down the main diagonal and switch the values”
argument. The fact that we could even summarize the whole proof in nine words
like that is further indication of its brilliance.
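The "switch the diagonal" step is easy to try on the finite corner of the grid shown in the illustration above. This Python sketch (with our own function name `diagonal_flip`) only handles a finite square of 0s and 1s, so it illustrates the idea and is not a substitute for the proof.

```python
def diagonal_flip(rows):
    """Build x by flipping the i-th entry of the i-th row, for every i."""
    return [1 - rows[i][i] for i in range(len(rows))]


# The first five coordinates of y_1, ..., y_5 from the illustration.
rows = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 0],
]

x = diagonal_flip(rows)
print(x)  # [0, 1, 0, 0, 1]

# x differs from the i-th row in the i-th coordinate, so it matches no row.
print(all(x[i] != rows[i][i] for i in range(len(rows))))  # True
```

No matter which 0s and 1s we put in `rows`, the same check succeeds, and that is the heart of the argument.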

Corollary: A countably infinite product of any sets with at least two elements
each is uncountably infinite.
(Note: We really only need to say that none of the sets in the product are empty
and that only finitely many of them are allowed to have exactly one element.)

Examples
You might be wondering now: what types of sets are uncountably infinite? Do
we know any? Sure we do! Here are some examples.
Example 7.6.27. The set of all infinite binary strings:
You may have noticed that the set we used in the proof above, namely {0, 1}^N,
is “essentially” the set S of infinite binary strings! An element of {0, 1}^N is an
infinitely-long ordered list of coordinates, each of which is 0 or 1. An element of
S is an infinitely-long ordered list of 0s and 1s, but just without the parentheses
and commas. As such, there is a very natural bijection between the two (just
drop the parentheses and commas, or throw them back in), so we will identify
these two sets as the same.
We saw above in Example 7.6.25 that the set of all finite binary strings is
countably infinite. This latest result shows that the set of all infinite binary
strings is uncountably infinite. An alternate proof of this fact involves finding a
bijection between S and P(N), and then applying Cantor’s Theorem that says
|N| < |P(N)|. (See Exercise 7.8.33 for these details.)
Example 7.6.28. R is uncountably infinite:
This is our first example of a standard set of numbers that is uncountably
infinite. We can use the above result to prove this fact.
This claim makes some intuitive sense, since it “looks like” the real number line
is “so much bigger” than just N or Z. But we also saw that Q is countably
infinite, and there are tons of rational numbers scattered across the real number
line; in fact, between any two distinct real numbers there lie infinitely many rationals!
What we will see now is that, yes, it is true that R is uncountably infinite.
Furthermore, we will even show that R and P(N) are of the same “size” of
infinity; that is, we will show |R| = |P(N)|. (Remember that this is way more
informative than just saying both sets are uncountable; there are many levels of
uncountably infinite sets, we are just choosing not to talk about them too much
so we don’t hurt our brains.)
Morally speaking, the idea behind showing R is uncountably infinite, first of
all, is to relate R to the set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}^N. Every real number can
be expressed in decimal notation, which is just some ordered list of countably
infinitely many digits. There’s a decimal point in there somewhere, and there are
issues like 0.999999 · · · = 1, but those aren’t huge deals. Since we already saw
that even a “small” set like {0, 1} yields an uncountable set when we take its
product infinitely many times, then certainly a “bigger” set, like {0, 1, . . . , 9},
will also give an uncountable set, even factoring in those issues. This is the
intuitive argument you can carry around in your head and use to explain the
result to your friends. (In fact, this is the argument you will find in most
textbooks, as well.)
More formally, we can just prove that |R| = |P(N)|. This stronger result implies
that R is uncountably infinite (because Cantor’s Theorem tells us |N| < |P(N)|).
To do this, we will consider the set
I = {y ∈ R | 0 ≤ y ≤ 1}
which is the interval [0, 1] ⊆ R. We will show that
|{0, 1}^N| = |P(N)| = |I|
and then apply some results about bijections between intervals and R.
Consider the function f_1 : {0, 1}^N → I that takes in an infinite binary string,
puts a decimal point in front of all the 0s and 1s, and says, “Evaluate this
number as a decimal expansion”.
As an example, consider the element that is (1, 1, 0, 0, 1, 0, . . . ) where the rest
are 0s. Then

f_1(1, 1, 0, 0, 1, 0, . . . ) = 0.110010 . . ._DEC = 1/10^1 + 1/10^2 + 1/10^5 = 11001/100000

Notice that this is a function because any output is definitely a real number
(since it has a decimal expansion; we just provided it) and it is somewhere
between 0 and 1, since we put the decimal point in front. Furthermore, notice
that f_1 is an injection; two different infinite binary strings must be different
in some coordinate, so they yield two decimal expansions that differ somewhere
and, thus, cannot be the same real number. This shows that |{0, 1}^N| ≤ |I|.
Consider the function f_2 : {0, 1}^N → I that takes in an infinite binary string,
puts a decimal point in front of all the 0s and 1s, and says, “Evaluate this
number as a binary expansion”.
As an example, consider the same element as above. Then

f_2(1, 1, 0, 0, 1, 0, . . . ) = 0.110010 . . ._BIN = 1/2^1 + 1/2^2 + 1/2^5 = 25/32

Notice that this is a function because any output is definitely a real number;
just evaluate the resulting sum of fractions and it yields a real number between
0 and 1 (and even if the series is infinite, it is guaranteed to converge). For
example, the input of all 0s yields 0 as an output, and the input of all 1s yields
1 as an output since

1/2 + 1/4 + 1/8 + · · · = Σ_{k∈N} 1/2^k = 1
Furthermore, notice that f_2 is a surjection. This fact hinges on some external
knowledge about rational/irrational numbers; specifically, it is true that any
irrational number can be approximated by a sequence of dyadic rational numbers
(rationals whose denominators are powers of 2). We won’t state or prove these
results, but we think that by playing around with some examples, you’ll start to
see why this works. In fact, do some Googling for binary expansions of irrational
numbers and you’ll find some interesting results.
Since f_2 is a surjection, this shows |{0, 1}^N| ≥ |I|. Accordingly, we conclude
that |{0, 1}^N| = |I|. We also know |P(N)| = |{0, 1}^N| (see Exercise 7.8.33), so we
now know that |I| = |P(N)|.
The last step is to prove that |I| = |R|. Look at Exercise 5 in Section 7.5.4.
There, you found a bijection between the set J = {y ∈ R | −1 < y < 1} and R.
It is easy to find a bijection between J and the set K = {y ∈ R | 0 < y < 1} (try
it now!). This shows that |R| = |J| = |K|. Furthermore, K ⊆ I and they differ
by only two elements, 0 and 1, so |K| = |I|. Finally, this shows that |I| = |R|,
so we conclude that
|R| = |P(N)|

Look at the two arguments we mentioned:

• Considering the set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}^N, and

• Considering the set {0, 1}^N

Both arguments involved some knowledge about decimal expansions (and bi-
nary expansions). It seems there is no easy way around this, so we hope that
the results above are still convincing. In particular, you might want to play
around with the idea that f2 in the discussion above is a surjection but not
an injection. Can you convince yourself of these claims? Can you convince
someone else?
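If you want to experiment with these two functions, the following Python sketch evaluates a finite prefix of a string in either base, using exact fractions. The helper names `evaluate` and `gap` are our own, and working with finite prefixes is a simplification: the actual functions take infinite strings.

```python
from fractions import Fraction


def evaluate(bits, base):
    """Read the digits after the point in the given base: sum of bit_k / base^k."""
    return sum(Fraction(b, base ** k) for k, b in enumerate(bits, start=1))


prefix = [1, 1, 0, 0, 1, 0]
print(evaluate(prefix, 10))  # 11001/100000, the decimal reading
print(evaluate(prefix, 2))   # 25/32, the binary reading


def gap(n):
    """How far 0.0111...1 (n ones, in binary) falls short of 0.1 = 1/2."""
    return Fraction(1, 2) - evaluate([0] + [1] * n, 2)


print([gap(n) for n in (1, 2, 3)])  # [Fraction(1, 4), Fraction(1, 8), Fraction(1, 16)]
```

The gap is 1/2^(n+1), which shrinks to 0 as n grows. That is a hint for why the two different infinite strings 0111... and 1000... get sent to the same real number by the binary reading, so that function is not an injection.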

Theorems

Let’s see one result about uncountable sets. Then, we will state a final theorem
about infinite sets, in general, before moving onwards!

Lemma 7.6.29. Suppose A is uncountably infinite and B is countably infinite,
and B ⊆ A. Then A − B is uncountably infinite.

(Note: We don’t need to assume that B ⊆ A here. If this were not the case,
we would just consider A and B ∩ A as the sets, instead.)

Proof. Left for the reader as Exercise 5 in Section 7.6.5.
(Hint: Use a contradiction argument . . . )

Characterizing a Set as Infinite


To define infinite sets, we first defined finite sets, and then declared any set to
be infinite if it is not finite. The following theorem shows us that we could have
defined infinite in a different way. Namely, we can say a set is infinite if and
only if we can find a bijection to a proper subset of itself. First, let’s state and
prove this helpful lemma; we will need it in the proof of the theorem below.
Lemma 7.6.30. Let A be any set. Then, A is infinite ⇐⇒ there exists B ⊂ A
such that B is countably infinite.
Proof. The ⇐= direction is obvious. If A contains some infinite set as a subset,
it is also infinite.
The =⇒ direction is more interesting. Suppose A is infinite. Let ⋆ ∈ A be
some special element. We will take it out of consideration and construct a set
B that is countably infinite and does not contain ⋆ as an element. This will
guarantee B ⊂ A, with B ≠ A.
Consider A_1 = A − {⋆}. This set is also infinite, so we can choose some element
b_1 ∈ A_1.
Consider A_2 = A_1 − {b_1} = A − {⋆, b_1}. This set is also infinite, so we can
choose some element b_2 ∈ A_2.
Consider A_3 = A_2 − {b_2} = A − {⋆, b_1, b_2}. This set is also infinite, so we can
choose some element b_3 ∈ A_3.
We can continue this process forever. Define B = {b_1, b_2, b_3, . . . }. (Note: we
are “passing to a limit” here, but this is acceptable because we are not using
this to “preserve” any properties of B. We are merely constructing the object
B.)
Notice that B is countably infinite because there is an obvious bijection with
N (namely, b_i ↦ i, since the chosen elements are all distinct).
With this lemma in hand, we can state and prove the next result:
Theorem 7.6.31. Let A be any set. Then, A is infinite ⇐⇒ there exists
B ⊂ A such that there exists f : A → B that is bijective.
Proof. ( =⇒ ) Suppose A is infinite. We must identify a proper subset B ⊂ A
and a bijection f : A → B.
Since A ≠ ∅, take any x ∈ A. Consider B = A − {x}. Notice B ⊂ A.
We want to show there is a bijection f : A → B.
By Lemma 7.6.30 above, we know we can find a countably infinite strict subset
C ⊂ B. (Note: A is infinite, so B = A − {x} is also infinite, since we only
removed one element. If you need more convincing, AFSOC B is finite, so it
has some size; what, then, is the size of A?)
Since C is countably infinite, we can list the elements of C as {y1 , y2 , y3 , . . . }.
(Note: The idea is that there exists some bijection g : N → C, so we can let
y_1 = g(1) and y_2 = g(2) and so on.)
Define f : A → B by

∀y ∈ A . f(y) = y          if y ≠ y_i for all i ∈ N and y ≠ x
                y_1        if y = x
                y_{i+1}    if y = y_i for some i ∈ N

This is a bijection because we can identify its inverse function F : B → A, which
is

∀z ∈ B . F(z) = z          if z ≠ y_i for every i ∈ N
                x          if z = y_1
                y_{i−1}    if z = y_i for some i ∈ N − {1}

We will leave it as an exercise for the reader to verify that F = f^{-1}. (Draw a
picture to intuitively convince yourself, at the very least.)
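Here is a Python sketch of this "shift everyone down the list" bijection in one concrete case. The specific choices are our own assumptions: A = N, the removed element is x = 1 (called `X` in the code), and the countably infinite subset C of B = N − {1} is the set of even numbers, listed as y_i = 2i.

```python
X = 1  # the element removed from A = N to form B = N - {1}


def f(y):
    """The bijection A -> B: x goes to y_1, each y_i goes to y_(i+1), the rest stay put."""
    if y == X:
        return 2          # y_1 = 2
    if y % 2 == 0:
        return y + 2      # y_i = 2i is sent to y_(i+1) = 2i + 2
    return y              # odd numbers other than 1 are not in C and do not move


def F(z):
    """The proposed inverse B -> A."""
    if z == 2:
        return X          # y_1 goes back to x
    if z % 2 == 0:
        return z - 2      # y_i goes back to y_(i-1)
    return z


print([f(y) for y in range(1, 8)])  # [2, 4, 3, 6, 5, 8, 7]
```

Notice that 1 never appears as an output of f, so f really does land in the proper subset B, and applying F afterwards undoes every step.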
(⇐=) This direction claims that infinite sets are the only sets that have this
property. We will prove this claim by contrapositive. That is, we will show that
any finite set cannot have a bijection to a proper subset.
Suppose A is finite. This means it has a (unique) size, say n ∈ N. Consider an
arbitrary proper subset B ⊂ A. WWTS there cannot exist a bijection from A
to B.
AFSOC there is such a bijection f : A → B. Since B is finite and B ⊂ A, B
has some size m < n. Thus, there is a bijection g : B → [m]. Composing these
bijections, we get a bijection h : A → [m]. Thus, |A| = n and |A| = m, so
m = n. However, we also know m < n. This is a contradiction. ××××
(Note: we can also make this argument via the Pigeonhole Principle, which
we haven’t yet discussed but will soon. Essentially, we can’t have a bijection
p : [n] → [m] when n > m because there are “too few boxes” in which to stuff
n “pigeons”.)
In the context of solving a problem, perhaps you’ll want to argue that some
set is infinite. Rather than proving that you cannot possibly find a bijection
to any finite set, consider using this theorem! If you can identify a proper
subset and a bijection, then you have accomplished your goal, with the help of
this result.

7.6.5 Questions & Exercises


Remind Yourself
Answer the following questions briefly, either out loud or in writing. These
are all based on the section you just read, so if you can’t recall a specific defi-
nition or concept or example, go back and reread that part. Making sure you
can confidently answer these before moving on will help your understanding and
memory!

(1) When is a set finite?


(2) What are two ways to characterize when a set is infinite?
(3) What is the difference between countably and uncountably infinite? Give
two examples of each type.
(4) Given two countably infinite sets, A and B, what set operations can we
perform on them that are guaranteed to yield a countably infinite set? Might
any set operations on them yield a finite set?
(5) Is R × N countably or uncountably infinite? What about R − N?

Try It
Try answering the following short-answer questions. They require you to actu-
ally write something down, or describe something out loud (to a friend/class-
mate, perhaps). The goal is to get you to practice working with new concepts,
definitions, and notation. They are meant to be easy, though; making sure you
can work through them will help you!
(1) Prove Proposition 7.6.9. That is, prove: If A and B are finite sets, then

|A ∪ B| = |A| + |B| − |A ∩ B|

(2) Prove Corollary 7.6.10. That is, prove:


If A1 , . . . , An are finite and pairwise-disjoint, then

|A1 ∪ · · · ∪ An | = |A1 | + · · · + |An |

(3) Find the flaw in the following “spoof” that R is countably infinite:
Let S ⊆ R be the set defined by S = {y ∈ R | 0 ≤ y < 1}.
For every x ∈ S, define the set A_x = {x + z | z ∈ Z}.
(For example, A_{1/2} = {. . . , −3/2, −1/2, 1/2, 3/2, . . . }.)
Since Z is countably infinite, each set A_x is countably infinite.
Also, notice that

R = ⋃_{x∈S} A_x

This is a union of countably infinite sets, so R is also countably
infinite.
Be sure to point out any particular step that is incorrect, as well as why
that step is incorrect. Ideally, you should point out why the ultimate con-
clusion of the spoof is incorrect, but without just explicitly stating “R is
uncountable because we proved that”. Why is the incorrect step a misuse
of a result, and why is the conclusion of that particular step invalid?
(4) For each of the following desired situations, provide an example or state
that it is impossible.
For example, if the situation were “Finite sets A and B such that A ∪ B
has size 4”, an answer might be “Consider A = {1, 2} and B = {3, 4}.” If
the situation were, “For every x ∈ N, an infinite set S_x, such that ⋃_{x∈N} S_x is
finite”, the answer would be “Impossible”.
There is no need to prove your answers here; a good example should suffice.
(a) An uncountably infinite set A and a countably infinite set B such that
A ∩ B is finite.
(b) Uncountably infinite sets C and D such that C −D is countably infinite.
(c) Uncountably infinite sets E and F such that E − F is uncountably
infinite.
(d) For every x ∈ N, a countably infinite set S_x, such that ⋃_{x∈N} S_x is
uncountably infinite.
(e) For every y ∈ R, a countably infinite set T_y, such that ⋃_{y∈R} T_y is
countably infinite.
(5) Prove Lemma 7.6.29. That is, suppose A is uncountably infinite and B ⊆ A
is countably infinite; prove that A − B is uncountably infinite.
Use this result to explain why the set of irrational real numbers is uncount-
ably infinite.

7.7 Summary
Now, we have fully explored functions and their related properties! We saw
that a function is just a relation with a particular property. This desired prop-
erty corresponds to how we usually think of a function as having an “output” for
every possible “input”. We formalized these notions mathematically by defining
terminology like domain, codomain, and image. Further properties of functions
include injectivity and surjectivity. We saw many examples and non-examples
of functions with these properties, and discussed how to prove/disprove these
properties, relating back to our logical proof techniques.
The notion of a bijection has been particularly helpful and powerful. We
related this to the notion of an inverse function. Specifically, we saw and proved
that a function is bijective if and only if it has an inverse! This made for an
important result later on when we discussed cardinality, where “the bijection
is king”. The notion of “pairing off elements” helped us make sense of some of
the more wild and counter-intuitive results about the “sizes of sets”.
We characterized infinite sets as either countably infinite or uncountably
infinite. However, we also proved the historically significant result that is Cantor’s
Theorem, which shows that there are, in fact, infinitely-many cardinalities! For
our purposes here, it was sufficient to distinguish these two types of infinite
sets. We saw several examples of each, and proved some theorems about how to
create sets of specific cardinalities from others. Ultimately, we find these results
intriguing and mathematically instructive. From now on, though, we will be
focusing on finite sets only.

7.8 Chapter Exercises


These problems incorporate all of the material covered in this chapter, as well
as any previous material we’ve seen, and possibly some assumed mathematical
knowledge. We don’t expect you to work through all of these, of course, but
the more you work on, the more you will learn! Remember that you can’t truly
learn mathematics without doing mathematics. Get your hands dirty working
on a problem. Read a few statements and walk around thinking about them.
Try to write a proof and show it to a friend, and see if they’re convinced. Keep
practicing your ability to take your thoughts and write them out in a clear,
precise, and logical way. Write a proof and then edit it, to make it better. Most
of all, just keep doing mathematics!
Short-answer problems, that only require an explanation or stated answer with-
out a rigorous proof, have been marked with a .
Particularly challenging problems have been marked with a ?.

Problem 7.8.1. For each of the following “rules” and proposed domains and
codomains, determine whether the “rule” defines a well-defined function.
Explain your answer using examples, if necessary.
(a) Let a : Z − {1} → R be defined by $a(x) = \dfrac{x^2}{x - 1}$.
(b) Let b : Q → Q be defined by $b(x) = \sqrt{|x|}$.
(c) Let c : Z → Z be defined on every input x ∈ Z by outputting an s ∈ Z such
that x ≡ s mod 3.
(d) Let d : N → N be defined by $d(x) = \left\lfloor \dfrac{x}{10} \right\rfloor$.
(e) Let e : P(N) → P(Z) be defined by taking in a set of natural numbers and
outputting the set of all integer multiples of the least element of that set.

Problem 7.8.2. Consider the sets $\mathbb{R}^3 = \{(x, y, z) \mid x, y, z \in \mathbb{R}\}$ and
$\mathbb{R}^2 = \{(a, b) \mid a, b \in \mathbb{R}\}$.
Consider the function $f : \mathbb{R}^3 \to \mathbb{R}^2$ defined by $f(x, y, z) = (xz, yz)$. Is f
injective? Surjective? Prove your claims.

Problem 7.8.3. Define f : R → R and g : R → R by $f(x) = x + 1$ and
$g(x) = x^2 + x$.
Find formulas for the compositions f ◦ g and g ◦ f . (Notice that they are
different.)
Prove that neither of those compositions is injective.
Problem 7.8.4. Let f : Z → Z be given by f (x) = 2x − 3. Let g : Z → N be
given by g(z) = |z| + 4.
What is the domain of g ◦ f ? What is the codomain?
Write down a rule that defines g ◦ f . Is this function injective? Surjective?
What is $\mathrm{Im}_{g \circ f}(\mathbb{Z})$?
Prove your claims.
Problem 7.8.5. Each of the following rules defines a function from N × N → Z.
For each, determine whether the resulting function is injective or surjective, or
both, or neither. Prove your claims.
(a) $f_1(a, b) = a - b$
(b) $f_2(a, b) = 2a + 3b$
(c) $f_3(a, b) = a$
(d) $f_4(a, b) = a^2 - b^2$
(e) $f_5(a, b) = 2^a \cdot 3^b$
Problem 7.8.6. Define functions $f_1, f_2, f_3, f_4$ with domain N and codomain P(N)
with the following properties, or else explain why the desired properties are not
possible to achieve.
• $f_1$ is injective and not surjective
• $f_2$ is neither injective nor surjective
• $f_3$ is surjective and not injective
• $f_4$ is bijective
Problem 7.8.7. Consider the function f : Z × Z → Z × Z defined by

$$\forall (x, y) \in \mathbb{Z} \times \mathbb{Z} \;.\; f(x, y) = (y + 1, 3 - x)$$

Find a function F that is the inverse of f , and prove that it is. What does this
tell you about the function f ?
Problem 7.8.8. Define the set S = {x ∈ R | 0 < x < 1}. Define the function
g : S → R by

$$g(x) = \frac{2x - 1}{2x(1 - x)}$$

Prove that $\mathrm{Im}_g(S) = \mathbb{R}$.
(Hint: You’ll need to use the Quadratic Formula.)
Problem 7.8.9. Suppose f : A → B and g : B → C are functions.

(a) Suppose f, g are surjections. Prove that g ◦ f : A → C is also a surjection.

(b) Suppose f, g are injections. Prove that g ◦ f : A → C is also an injection.

(c) Suppose f, g are bijections. Prove that g ◦ f : A → C is also a bijection.

Problem 7.8.10. Suppose f : A → B and g : B → C are bijections. Define
h : A → C to be h = g ◦ f.
Prove that h is invertible and that $h^{-1} = f^{-1} \circ g^{-1}$.
(Hint: Use the Associativity of Function Composition.)
Problem 7.8.11. Let f : A → B and g : B → C be functions. Let X ⊆ A. Prove
that $\mathrm{Im}_{g \circ f}(X) = \mathrm{Im}_g(\mathrm{Im}_f(X))$.
Problem 7.8.12. Let f : A → B be a bijection, so $f^{-1} : B \to A$ is a function.
Let X ⊆ A. Prove that $\mathrm{Im}_f(X) = \mathrm{PreIm}_{f^{-1}}(X)$.
Problem 7.8.13. Let A, B be sets and let f : A → B be a function. Suppose
X, Y ⊆ A.

(a) Is it necessarily true that the following equality holds?

$$\mathrm{Im}_f(X \cup Y) = \mathrm{Im}_f(X) \cup \mathrm{Im}_f(Y)$$

State your claim and prove it.

(b) Is it necessarily true that the following equality holds?

$$\mathrm{Im}_f(X \cap Y) = \mathrm{Im}_f(X) \cap \mathrm{Im}_f(Y)$$

State your claim and prove it.

Problem 7.8.14. Let f : A → B be a function. Define the relation ∼ on B by
saying, for any x, y ∈ B,

$$x \sim y \iff \mathrm{PreIm}_f(\{x\}) = \mathrm{PreIm}_f(\{y\})$$

Explain why ∼ is an equivalence relation.
What are the equivalence classes?
Supposing that f is surjective, what are the equivalence classes?
Problem 7.8.15. Let f : A → B be a function. Define the relation ≈ on A by
saying, for any x, y ∈ A,

x ≈ y ⇐⇒ f (x) = f (y)

Is ≈ an equivalence relation? If so, prove it, and describe the equivalence classes.
If not, provide a counterexample.
Now, suppose that f is an injection. Is ≈ an equivalence relation? If so, prove
it, and describe the equivalence classes. If not, provide a counterexample.
Problem 7.8.16. Let f : A → B be a function, and let X, Y ⊆ A. Consider the
claim that $\mathrm{Im}_f(X) \cap \mathrm{Im}_f(Y) \subseteq \mathrm{Im}_f(X \cap Y)$. What is wrong with the following
“spoof” of that claim?

Let $z \in \mathrm{Im}_f(X) \cap \mathrm{Im}_f(Y)$. Since $z \in \mathrm{Im}_f(X)$, this means ∃a ∈ X
such that f(a) = z. Since $z \in \mathrm{Im}_f(Y)$, this means ∃a ∈ Y such that
f(a) = z. Since a ∈ X and a ∈ Y, this means a ∈ X ∩ Y. Since
f(a) = z, this means $z \in \mathrm{Im}_f(X \cap Y)$.

Provide a counterexample to show that the claim is, in fact, False.


Problem 7.8.17. Prove/disprove whether P(N) and P(Z) have the same cardi-
nality.
Problem 7.8.18. Fix an arbitrary n ∈ N. Consider the set [n] = {1, 2, 3, . . . , n}.
Let E be the set of subsets of [n] that have an even number of elements (like
∅ or {1, 4}), and let O be the set of subsets of [n] that have an odd number of
elements (like {5} or {1, 2, 3}).
Define a function p : E → O that is a bijection, and prove that it is a bijection.
(Hint: Write out some small cases, where n = 1 and n = 2 and n = 3. Then
try to generalize.)
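If writing out the small cases by hand gets tedious, a short program can list them for you. The following is only an exploration aid, not part of any proof. It is a minimal Python sketch, and the name even_and_odd_subsets is our own invention for illustration. It builds E and O for a given n and reports their sizes, so you can look for a pattern before you try to define p.

```python
from itertools import combinations


def even_and_odd_subsets(n):
    """Return (E, O): the even-sized and odd-sized subsets of [n] = {1, ..., n}."""
    universe = range(1, n + 1)
    evens, odds = [], []
    for size in range(n + 1):
        for subset in combinations(universe, size):
            # Sort each subset into E or O by the parity of its size.
            (evens if size % 2 == 0 else odds).append(frozenset(subset))
    return evens, odds


if __name__ == "__main__":
    for n in range(1, 5):
        E, O = even_and_odd_subsets(n)
        print(n, len(E), len(O))
```

Seeing that the two lists have the same length for small n is evidence that a bijection exists, but it does not tell you what the bijection is. That part is still up to you.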
Problem 7.8.19. Prove Lemma 7.6.16.
Hint: Try using a similar idea to our proof of Theorem 7.6.7: use the size of B
to “bump up” the bijection between A and N by a certain amount.
Problem 7.8.20. Prove Corollary 7.6.19. That is, suppose A and B are countably
infinite sets; prove that A ∪ B is countably infinite by applying Lemma 7.6.18
to appropriately-chosen sets.
Problem 7.8.21. Look back at Example 7.6.13. There, we defined f : N × N → N
by setting

$$\forall (x, y) \in \mathbb{N} \times \mathbb{N} \;.\; f(x, y) = 2^{x-1}(2y - 1)$$
Prove that f is injective.
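Before attempting the proof, you might want some reassurance that the claim is plausible. The sketch below (in Python, with a helper name is_injective_on_grid that we made up for this purpose) evaluates the function on a finite grid of inputs and checks that no output value is repeated. A finite check like this can never prove injectivity on all of N × N, but it can catch a mistaken formula quickly.

```python
def f(x, y):
    """The function from Problem 7.8.21: f(x, y) = 2^(x-1) * (2y - 1)."""
    return 2 ** (x - 1) * (2 * y - 1)


def is_injective_on_grid(func, bound):
    """Check that func gives distinct outputs on {1..bound} x {1..bound}."""
    seen = {}
    for x in range(1, bound + 1):
        for y in range(1, bound + 1):
            value = func(x, y)
            if value in seen:
                # Two different input pairs produced the same output.
                return False
            seen[value] = (x, y)
    return True


if __name__ == "__main__":
    print(is_injective_on_grid(f, 30))
```

Try the same helper on a function you know is not injective, such as (x, y) ↦ x + y, to convince yourself that the check can fail.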
Problem 7.8.22. Prove Corollary 7.6.21. That is, suppose we have finitely-many
sets $A_1, A_2, \dots, A_n$, where each set is countably infinite; prove that

$$A_1 \cup A_2 \cup \cdots \cup A_n$$

and

$$A_1 \times A_2 \times \cdots \times A_n$$

are both also countably infinite.

Problem 7.8.23. Consider the set A, defined by

A = {(a, b) ∈ N × N | a ≤ b}

Prove that A is countably infinite in two ways:

(1) By writing A as a union of sets and citing a result.

(2) By finding an explicit bijection between A and a countable set of your
choosing.

Problem 7.8.24. Define g : N × N → N by setting

$$\forall (x, y) \in \mathbb{N} \times \mathbb{N} \;.\; g(x, y) = (x + y)^2 + x$$

Prove that g is (a) injective and (b) not surjective.


Problem 7.8.25. Let A, B, C be sets. Let f : A → B and g : B → C and
h : B → C be functions.

(a) Suppose g = h. Is it necessarily True that g ◦ f = h ◦ f? Prove or disprove
this claim.

(b) Suppose g ◦ f = h ◦ f. Is it necessarily True that g = h? Prove or disprove
this claim.

Problem 7.8.26. Let A, B be finite sets, with |A| = |B| = n. Suppose f : A → B
is a function. Prove that

f is injective ⇐⇒ f is surjective
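To get a feel for why this equivalence should hold, you can test it exhaustively for small n. The sketch below is an illustration under our own assumptions: we take A = B = [n], we represent a function f by the tuple (f(1), ..., f(n)), and the name injective_iff_surjective is hypothetical. It is written in Python and checks every one of the n^n functions from [n] to [n].

```python
from itertools import product


def injective_iff_surjective(n):
    """Check, for every f : [n] -> [n], that f is injective iff f is surjective."""
    domain = range(1, n + 1)
    for outputs in product(domain, repeat=n):
        # outputs[i - 1] plays the role of f(i).
        injective = len(set(outputs)) == n
        surjective = set(outputs) == set(domain)
        if injective != surjective:
            return False
    return True


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, injective_iff_surjective(n))
```

Notice that the check relies on the domain and codomain having the same finite size. Your proof should make clear exactly where that hypothesis is used.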

Problem 7.8.27. Consider the following claim:

Suppose f : A → B and g : B → C are functions. Suppose g ◦ f : A → C is injective.
Then g is also injective.

What is wrong with the following “spoof” of this claim?

Suppose g ◦ f is an injection. We want to show g is an injection.
Let x, y ∈ B be given. Suppose g(x) = g(y).
We know ∃a, b ∈ A such that f (a) = x and f (b) = y.
Since g is a well-defined function, this means g(f (a)) = g(x) and
g(f (b)) = g(y).
Since g ◦ f is injective and g(f (a)) = g(f (b)), this means a = b.
Since f is a well-defined function, then f (a) = f (b).
This means x = y. Thus, g is injective.

Also, find a counterexample that shows the claim’s conclusion is incorrect.


Problem 7.8.28. Let a, b ∈ R be arbitrary and fixed. Suppose $a^2 + b^2 \neq 0$.
Consider the function f : R × R → R × R defined by

∀(x, y) ∈ R × R . f (x, y) = (ax − by, bx + ay)

Prove that f is a bijection by finding its inverse and proving that inverse is
correct.
Problem 7.8.29. Let A and B be finite sets and suppose |A| = |B|.
Suppose f : A → B is a function that is injective.
Prove that f must also necessarily be surjective by showing $\mathrm{Im}_f(A) = B$.
Problem 7.8.30. Let k ∈ N − {1} be given. Define

$$S_1 = \{X \in \mathcal{P}([k]) \mid k \notin X\}$$

and

$$S_2 = \{X \in \mathcal{P}([k]) \mid k \in X\}$$

(a) Prove that the sets $S_1$ and $S_2$ form a partition of $\mathcal{P}([k])$.
(b) Define a function $f_1 : S_1 \to \mathcal{P}([k - 1])$ that is a bijection and prove that it
is.
(c) Define a function $f_2 : S_2 \to \mathcal{P}([k - 1])$ that is a bijection and prove that it
is.
(d) Use what you proved in (a) and (b) and (c) to write an induction proof
that $\mathcal{P}([n])$ has $2^n$ elements, for every n ∈ N.
Note: Because of the restriction k ≥ 2 above, make n = 1 your base case, use
n = k ≥ 1 in your Induction Hypothesis, and prove the claim for n = k + 1
in the Induction Step.
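If you would like to see the sets in this problem concretely before proving anything, the following sketch may help. It is written in Python; the names power_set and split_by_membership are ours, chosen only for illustration. It builds the power set of [k] by brute force and splits it into the subsets that avoid k and the subsets that contain k.

```python
from itertools import combinations


def power_set(n):
    """Return all subsets of [n] = {1, ..., n} as a list of frozensets."""
    elements = range(1, n + 1)
    return [
        frozenset(chosen)
        for size in range(n + 1)
        for chosen in combinations(elements, size)
    ]


def split_by_membership(k):
    """Split the power set of [k] into (sets without k, sets containing k)."""
    subsets = power_set(k)
    without_k = [X for X in subsets if k not in X]
    with_k = [X for X in subsets if k in X]
    return without_k, with_k


if __name__ == "__main__":
    for k in range(2, 6):
        s1, s2 = split_by_membership(k)
        print(k, len(s1), len(s2), len(power_set(k)))
```

Compare the sizes printed for consecutive values of k. The pattern you see is what the induction in part (d) is meant to establish rigorously.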
Problem 7.8.31. Let A, B, C, D be sets, and suppose A ∩ B = C ∩ D = ∅.
Suppose f : A → C and g : B → D are bijections.
Define the piece-wise function h : A ∪ B → C ∪ D by setting

$$\forall x \in A \cup B \;.\; h(x) = \begin{cases} f(x) & \text{if } x \in A \\ g(x) & \text{if } x \in B \end{cases}$$

Explain why h is a well-defined function. Then, prove it is a bijection.


Problem 7.8.32. In this problem, you will prove that whenever A and B are
finite with |A| = a and |B| = b, it follows that |A × B| = ab. This will be
structured as a “double induction” proof on the two variables a, b ∈ N.
(a) Show that $|[1] \times [1]| = 1$. (This is very, very easy, but necessary.)

(b) Suppose n ∈ N and $|[1] \times [n]| = n$. Show that $|[1] \times [n + 1]| = n + 1$.

(c) Explain why (a) and (b) have shown that $\forall n \in \mathbb{N} \;.\; |[1] \times [n]| = n$.

(d) Suppose k ∈ N and suppose $\forall n \in \mathbb{N} \;.\; |[k] \times [n]| = kn$. Show that
$\forall n \in \mathbb{N} \;.\; |[k + 1] \times [n]| = (k + 1)n$.

(e) Explain why (c) and (d) have shown that $\forall k, n \in \mathbb{N} \;.\; |[k] \times [n]| = kn$.
(f) Explain why (e) proves the result stated in the problem description above.
Problem 7.8.33. Let S be the set of all infinite binary strings. (That is, elements
of S are infinitely-long strings of 0s and 1s.)
Find a bijection between S and P(N). Use this to prove that S is uncountably
infinite.
Problem 7.8.34. For each of the following sets, you are given its cardinality.
Prove that the given cardinality is correct by finding a bijection to a relevant
set and/or citing a result.
(Hint: If you don’t use some kind of inductive argument, your proof might not
be rigorous enough . . . )
(a) A is the set of all functions from N to N. Show that A is uncountably
infinite.
(Hint: Compare A with the set S of all functions from N to {1, 2}. Can
you explain why S is uncountably infinite? What does this say about A?
...)
(b) B is the set of all functions from N to N with the additional property that

$$\forall x \in \mathbb{N} \;.\; f(x + 1) = f(x) + 1$$

Show that B is countably infinite.

(c) C is the set of all functions from N to N with the additional properties that

$$\forall x \in \mathbb{N} \;.\; f(x + 1) = f(x) + 1$$
$$f(1) = 42$$

Show that C is finite, and has only one element.


Problem 7.8.35. Look back at Example 7.6.14 where we argued (informally)
that N × N is countably infinite by depicting the set as a lattice of points and
describing a countably infinite path that covers all its points.
Formalize this argument by defining a function f : N×N → N (or vice-versa) that
achieves the path we described (or a similar one) and proving it is a bijection.
Problem 7.8.36. Prove Corollary 7.6.23. That is, prove that a countably infinite
union of finite sets that are pairwise-disjoint is countably infinite.

Problem 7.8.37. . Consider Theorem 7.6.22, which states that a countably


infinite union of countably infinite sets is also countably infinite. In our proof,
we only considered the case where the given sets were pairwise-disjoint. In this
problem, you should prove the general case, where the sets are not necessarily
pairwise-disjoint.
(Hint: Consider the functions we used in our proof. Can you adapt them to
find a surjection from N × N to the union of the sets?)
Problem 7.8.38. Consider the set S of all infinite binary strings. We proved that
S is uncountably infinite before.
Consider the set T ⊆ S that is the set of all infinite binary strings with only
finitely many 1s.
In this problem, you will prove that T is, in fact, countably infinite!
(a) Consider the set $\mathbb{N}^k$ of all ordered k-tuples of natural numbers. (Note:
$\mathbb{N}^1 = \mathbb{N}$ and $\mathbb{N}^2 = \mathbb{N} \times \mathbb{N}$.)
Provide an inductive argument that shows that $\mathbb{N}^k$ is countably infinite for
every k ∈ N.
(Hint: This should be a pretty short proof. You should appeal to a result
proven in lecture about Cartesian products of countably infinite sets.)
(b) For every k ∈ N, let $T_k \subseteq T$ be the set of all infinite binary strings with
exactly k 1s.
Find a bijective (or, at least, injective) function from $T_k$ to $\mathbb{N}^k$. Explain
why your function is well-defined and is a bijection (or injection).
(c) Use (b) to deduce that $T_k$ is countably infinite. (Careful: If you found only
an injection, you should also explain why $T_k$ is not finite.)
(d) Express T as a union of sets and deduce that T is countably infinite.
(Hint: You’ll need to apply an important Theorem from lecture.)
(Side note: Think about the consequences of this result. With a simple bijection,
you can deduce that the set of all infinite binary strings with only finitely many
0s is also countably infinite. This means that the reason S, the set of all infinite
binary strings, is uncountably infinite is completely tied to the set of strings
with both infinitely many 1s and 0s. That set alone is big enough to make S
uncountable!)
Problem 7.8.39. (a) Let n ∈ N. Consider the set
S = {f : [n] → [n] | f is a bijection }
Show that S is closed under composition; that is, prove that
$$\forall f, g \in S \;.\; f \circ g \in S$$
(Hint: Cite a problem from this section of chapter exercises.)

(b) Consider the set

$$T = \{\, f : \mathbb{N} \to \mathbb{N} \mid f \text{ is a bijection and } \{i \in \mathbb{N} \mid f(i) \neq i\} \text{ is finite} \,\}$$

Show that T is also closed under composition.

(c) Show that T is closed under inverses; that is, prove that

$$\forall f \in T \;.\; f^{-1} \text{ exists} \wedge f^{-1} \in T$$

(d) Consider the set

$$U = \{\, f : \mathbb{N} \to \mathbb{N} \mid f \text{ is a bijection} \,\}$$

Prove that U is closed under inverses.

(e) Prove that

$$\forall f \in T \;.\; \forall g \in U - T \;.\; f \circ g \notin T \wedge g \circ f \notin T$$

(f) Find a counterexample to show that U − T is not closed under composition.

(g) Furthermore, given A ⊆ N with A finite, find functions f, g ∈ U − T such
that

$$\{i \in \mathbb{N} \mid (f \circ g)(i) \neq i\} = A$$

(h) What are the cardinalities of S, T, U ? If your answer is “finite”, also state
the size. If your answer is “infinite”, also state whether it is countable or
uncountable and prove your claim by finding a bijection to an appropriate
set or citing a relevant result.

7.9 Lookahead
In the next chapter, we will study combinatorics, the mathematical branch
of “counting things”. We saw in the section on cardinality that many results
about finite sets seemed rather intuitive. When we study combinatorics, we
will be describing the elements of a set by characterizing what properties they
have, rather than simply stating them all or listing them. This will actually
make it quite interesting (and sometimes very difficult!) to determine just how
many elements we have described. Combinatorics is the study of techniques to
determine the number of elements of a set with certain properties. We will state
and prove some fundamental principles of counting (appealing to results from
this chapter, in fact) and use them to build more advanced techniques and solve
some interesting problems.
Chapter 8

Combinatorics: Counting Stuff

8.1 Introduction
The field of combinatorics is one of the most active and exciting areas of
interest in modern mathematics. It is also sometimes known as “discrete math”
to distinguish it from analysis, which studies more “continuous” notions like
the real number line and functions defined on that set. In this chapter we
will explore some of the fundamental ideas in combinatorics and apply them
to solve interesting problems. Essentially, we will be learning interesting and
useful principles about how to count the number of elements in finite sets where
those elements are described in some way but not enumerated for us.

8.1.1 Objectives
The following short sections in this introduction will show you how this chapter
fits into the scheme of the book. They will describe how our previous work
will be helpful, they will motivate why we would care to investigate the topics
that appear in this chapter, and they will tell you our goals and what you
should keep in mind while reading along to achieve those goals. Right now,
we will summarize the main objectives of this chapter for you via a series of
statements. These describe the skills and knowledge you should have gained by
the conclusion of this chapter. The following sections will reiterate these ideas
in more detail, but this will provide you with a brief list for future reference.
When you finish working through this chapter, return to this list and see if you
understand all of these objectives. Do you see why we outlined them here as
being important? Can you define all the terminology we use? Can you apply
the techniques we describe?


By the end of this chapter, you should be able to . . .


• State the Rules of Sum and Product, and use and combine them to construct
simple counting arguments.

• Categorize several standard counting objects, as well as state corresponding
counting formulas and understand how to prove them.

• State the meaning of binomial coefficients and evaluate their numerical
formulae, know how to use them in counting arguments, and understand
how to derive those numerical formulae.

• Critique a proposed counting argument by properly demonstrating if it is
an undercount or overcount.

• Prove combinatorial identities by constructing “counting in two ways”
proofs.

• Understand various formulations of selection with repetition, and use them
to solve problems.

• State the Pigeonhole Principle and use it in counting arguments.

• State the Principle of Inclusion/Exclusion and use it in counting arguments.

8.1.2 Segue from previous chapter


In Chapter 7, we left off talking about the cardinality of sets, both finite and
infinite. While many of the results about infinite sets are interesting and math-
ematically rich, that particular area can lead to some mind-bending and confus-
ing areas that are, alas, beyond the scope of our current studies. For now, we
will focus on finite sets. In particular, we will explore how some results about
the cardinality of finite sets can be used to solve problems about “counting”
mathematical objects. That is, we will explore how we can answer questions
of the form “How many objects are there with property X?” This branch of
mathematics is known as combinatorics. You can think of it as “the study
of combinations of objects”. While investigating this branch of mathematics,
we will develop some new notation and definitions, prove and use some results
about finite sets, and describe and study some particular objects that live in
the field of combinatorics and computer science. Importantly, we will learn an
entirely new proof technique based on counting objects!

8.1.3 Motivation
Think about playing poker. If you’re unfamiliar with the game, just think of
it as a simple system where two players receive a hand of 5 random cards each
and then they compare to see who wins. Hands are ranked according to the
following list, from best to worst:

• Straight Flush (five in a row of one suit), e.g. T♣ J♣ Q♣ K♣ A♣

• Four of a Kind, e.g. 3♣ 3♠ 3♥ 3♦ 7♥

• Full House (three of a kind and a pair), e.g. 4♣ 4♠ 4♦ 6♣ 6♥

• Flush (five of one suit), e.g. 2♥ 5♥ 8♥ Q♥ K♥

• Straight (five in a row, not all the same suit), e.g. 8♦ 9♣ T♦ J♥ Q♥

• Three of a Kind, e.g. K♠ K♥ K♦ Q♥ 9♣

• Two Pair, e.g. A♠ A♥ J♠ J♦ 2♣

• One Pair, e.g. 8♥ 8♦ 2♠ 5♣ K♥

• High Card, e.g. Q♠ J♣ 9♦ 7♦ 2♦

Is this a fair game? If you’ve played poker before, and especially if you’ve
played a lot, you’ve not only learned to accept this ranking system but you’ve
also learned how to exploit it and make decisions. In Five Card Draw, if you’re
dealt 22345, should you keep the pair or go for the straight? Which is more
likely to happen? Which will pay off more handsomely?
By our question, “Is this a fair game?”, what we’re really wondering is why
the ranking is the way it is! Is drawing a flush actually rarer than a straight?
Does it make sense that a full house loses to four of a kind? Why? How can we
prove these results? To answer these questions, we will rephrase the questions
in terms of counting instead of probability. We will ask how many distinct five
card hands are flushes, how many are straights, and so on. This will allow us to
compare them directly. Do you see how this relates to our work in the previous
chapter, too? We will really be identifying the cardinality of the set of all poker
hands that are flushes, for instance, and comparing it to the cardinalities of
other sets of hands.

8.1.4 Goals and Warnings for the Reader


We will need to develop some notation and definitions to begin formulating a
method to count the elements of particular finite sets, but we want to emphasize
that this is really what is going on, overall: combinatorics is about counting the
number of elements in finite sets using particular methods (which we will develop
in this chapter). More specifically, we want to study these counting techniques
in an abstract sense so that we can apply them in an efficient manner. Perhaps
we could answer those poker questions we posed above by looking at all possible
five card hands and making a tally mark every time we see a flush, say, but surely
this will take way too long! There must be a better way! Well, of course, there
is, and we will develop it soon enough in this chapter’s first section.
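To make the "tally mark" idea concrete, here is what that brute-force approach might look like as a program. This is a minimal Python sketch under our own assumptions: a card is modeled as a (rank, suit) pair, the deck size is adjustable so that small experiments finish instantly, and the function name count_single_suit_hands is hypothetical. It visits every possible hand and tallies those whose cards all share one suit (which, for a standard deck, would include the straight flushes as well as the flushes).

```python
from itertools import combinations, product


def count_single_suit_hands(num_ranks, num_suits, hand_size):
    """Tally hands in which every card has the same suit.

    Returns (single_suit_count, total_hands) for a deck with
    num_ranks * num_suits cards, each modeled as a (rank, suit) pair.
    """
    deck = list(product(range(num_ranks), range(num_suits)))
    single_suit = 0
    total = 0
    for hand in combinations(deck, hand_size):
        total += 1
        # One "tally mark" whenever only one suit appears in the hand.
        if len({suit for _, suit in hand}) == 1:
            single_suit += 1
    return single_suit, total


if __name__ == "__main__":
    # A toy deck: 6 ranks, 2 suits, hands of 3 cards.
    print(count_single_suit_hands(6, 2, 3))
```

Calling the function with 13 ranks, 4 suits, and 5 cards per hand would examine every five-card poker hand one at a time. A computer can get through that, but the approach teaches us nothing about why the answer is what it is, and it stops being feasible as the numbers grow. The counting principles in this chapter replace the loop with a short argument.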
We want to emphasize that we will be developing a new style of proof in this
chapter, as well. More so than previous problems and techniques we’ve studied,
proofs in combinatorics depend greatly on clarity and specificity of language.
Some of your proofs to the exercises in these sections may consist entirely of
English sentences, with almost no mathematical symbols! This will seem strange
at first, and might even seem to contradict the ideas we have emphasized so far
about precision, clarity, and mathematical rigor. This is definitely not the case,
though; combinatorics has a rigorous foundation in finite set theory, and we will
work hard to point out this relationship whenever it is relevant. This property of
combinatorics will also require you to be extra careful about your proof-writing
style, ensuring that your words are chosen appropriately to be unambiguous
and clear. More so than ever, be sure to reread your proofs after writing them,
pretending that you are someone else, to make sure that the points you want to
make actually come across in your proofs.
One final introductory point can be made by the following quote that a friend
of mine stated once when we were talking about how to teach combinatorics. I
found that it nicely summarized the sometimes strange transition from the
proofs we have been doing so far (that might feel rather formal) to combinatorics
proofs (that might feel rather informal, in comparison).

Finite cardinality is boring. That’s not inconsistent with the fact
that combinatorics is hard.

You might not know what that means now, but if you look back at this quote
after working through this chapter, you’ll understand what he was getting at.
What this means is that, in an abstract and theoretical sense, finite cardinality is
boring: all the results are what you’d expect them to be (like |A ∪ B| = |A| + |B|
when A ∩ B = ∅) and the techniques are all the same (find a bijection to an
appropriate set). Infinite cardinality is far stranger and more surprising: |A ∪ B| =
|A| + |B| can be False, even if A ∩ B = ∅, and even further, the addition |A| + |B|
is hard to make sense of, mathematically!
How does combinatorics differ, then? Well, in all of our work with combi-
natorics, we are given a finite set; the difference is that its elements are only
described to us in some way. We are not presented with the elements of a set
directly and asked to count them. (That would be easy: “One, two, three, . . . ”)
We have to come up with relevant and helpful strategies to identify how many
objects have a certain prescribed list of properties. That is where the difficulty
of combinatorics comes in. When we say, “Consider the set of all 5-card hands,
as drawn from a standard deck of cards”, you can immediately grasp the idea
of that set, but you certainly can’t picture all its elements laid out before you,
let alone begin to count them one-by-one. In this sense, combinatorics is hard;
this is also why it is incredibly interesting and popular!

8.2 Basic Counting Principles


8.2.1 The Rule of Sum
Look back at Theorem 7.6.7 that we proved in the previous chapter. It says
that when we take two finite sets that are disjoint (i.e. they share no elements),
the size of their union is the sum of their individual sizes. This makes intuitive
sense for finite sets, and we proved the result mathematically using a bijection.
This result forms the basis for the first, fundamentally useful principle of
combinatorics. Notice that this grounds us firmly in the principles of set theory.

Partitions
We start by recalling Definition 3.6.9, which was introduced in our discussion
of sets.

Definition 8.2.1. Let A be a set. A partition of A is a collection of sets that
are pairwise disjoint and whose union is A.
That is, a partition is formed by an index set I and non-empty sets $S_i$ (defined
for every i ∈ I) that satisfy the following conditions:

(1) For every i ∈ I, $S_i \subseteq A$.

(2) For every i, j ∈ I with $i \neq j$, we have $S_i \cap S_j = \emptyset$.

(3) $\bigcup_{i \in I} S_i = A$
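For finite sets, the conditions in this definition can be checked mechanically. The following Python sketch is one way to do so; the function name is_partition is our own, and we assume the collection is given as a finite list of finite sets. It tests, in order, that every part is non-empty, that every part is a subset of A, that distinct parts are disjoint, and that the union of the parts is all of A.

```python
def is_partition(parts, A):
    """Decide whether the finite list `parts` is a partition of the finite set A."""
    parts = [set(p) for p in parts]
    A = set(A)
    # The definition asks for non-empty sets.
    if any(len(p) == 0 for p in parts):
        return False
    # Condition (1): each part is a subset of A.
    if any(not p <= A for p in parts):
        return False
    # Condition (2): distinct parts share no elements.
    for i in range(len(parts)):
        for j in range(i + 1, len(parts)):
            if parts[i] & parts[j]:
                return False
    # Condition (3): the union of all the parts is A.
    return set().union(*parts) == A


if __name__ == "__main__":
    print(is_partition([{1, 2}, {3}], {1, 2, 3}))
    print(is_partition([{1, 2}, {2, 3}], {1, 2, 3}))
```

A program like this only works when everything in sight is finite. For infinite index sets, we need an actual proof.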

Essentially, a partition is a way of breaking a set into smaller sets that do


not overlap. Let’s look at a couple of examples before moving on.
Example 8.2.2. Let A be the set of people in the room currently. Let I = {1, 2},
and let S1 be the set of left-handed people and let S2 be the set of right-handed
people. Then S = {S1 , S2 } is a partition of A. Notice the distinction between
writing “{S1 , S2 } partition A”, which is correct, and “S1 , S2 partition A”, which
is not correct. What does it mean to say S1 , S2 in this context? We really mean
that those two sets, taken together as a collection, form a partition of A. This
is why we must remember to write the elements of S between braces.
To be rigorous, we should prove why S is a partition of A. To do this, we
point out that S1 ∩ S2 = ∅ because everyone here is either left- or right-handed
but not both. (Let’s presume there are no “outlying cases” here, like truly
ambidextrous people or people with no hands. If any such people are present,
include them in a set S3 and include that in our partition set S.) We also point
out that S1 ∪ S2 = A because everyone in the room must be either left- or
right-handed, so there cannot exist an element x ∈ A that satisfies x ∉ S1 and
x ∉ S2. This shows why S is a partition.
What if we wanted to partition the set of people in this room by separating
them based on the first letter of their first name? Try to define this partition
using mathematical notation like the previous example.
Example 8.2.3. Now, let’s see a non-finite partition. Consider the set A = N
and the index set I = N. For every i ∈ N, define the set

Si = {2i − 1, 2i}

Is the set S = {Si | i ∈ N} a partition of N? We think so; let’s investigate why.


We could start by writing out what the first few sets look like (indeed, this is
usually a good first strategy: just write out the first few cases and see what
happens):
S1 = {1, 2}
S2 = {3, 4}
S3 = {5, 6}
⋮
and so on. This looks like a partition of N so far, doesn’t it? Let’s prove that it
truly is!
First, let’s show that the sets Si are pairwise-disjoint (i.e., any two of the
sets share no elements). We prove this by contradiction. AFSOC that ∃i, j ∈ N
with i ≠ j such that Si ∩ Sj ≠ ∅. This means that (at least) one element of Si
is also an element of Sj ; we find there are four possible cases for this situation:
1. 2i − 1 = 2j − 1
2. 2i − 1 = 2j
3. 2i = 2j − 1
4. 2i = 2j
The first and fourth cases immediately imply that i = j, by some simple algebra,
which contradicts our given condition that i ≠ j. The second and third cases
are contradictions themselves because they involve an odd natural number and
an even natural number being equal. In any case, we find a contradiction.
Therefore, ∀i, j ∈ N with i ≠ j, it’s the case that Si ∩ Sj = ∅.
Second, let’s show that the union of all of the $S_i$ sets is N. That is, let’s
prove

$$\bigcup_{i \in \mathbb{N}} S_i = \mathbb{N}$$

Remember that the set on the left-hand side consists of all of the elements x
such that ∃i ∈ I that satisfies $x \in S_i$. (Think about why this makes sense, even
though I is infinite. This just means the union contains all of the elements that
belong to at least one of the sets $S_i$.) Notice that for every i ∈ N, the elements
$2i - 1, 2i \in S_i$ are both natural numbers. Thus,

$$\mathbb{N} \supseteq \bigcup_{i \in I} S_i$$

Next, we prove the reverse set containment. Let n ∈ N. We have two cases to
consider. (1) If n is even, then ∃k ∈ N such that n = 2k. Thus, $n \in S_k$. (2) If
n is odd, then ∃ℓ ∈ N such that n = 2ℓ − 1. Thus, $n \in S_\ell$. In either case, we
have shown that $n \in \bigcup_{i \in I} S_i$.
Therefore, S is a partition of N. In particular, it is an infinite partition.
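A program cannot check an infinite partition, but it can check a finite piece of this one. The sketch below, in Python, uses names of our own choosing (block and blocks_partition_initial_segment) and verifies that the first n of the sets {2i − 1, 2i} together make up exactly {1, 2, ..., 2n}, with no element counted twice.

```python
def block(i):
    """The set S_i = {2i - 1, 2i} from Example 8.2.3."""
    return {2 * i - 1, 2 * i}


def blocks_partition_initial_segment(n):
    """Check that S_1, ..., S_n form a partition of {1, ..., 2n}, for n >= 1."""
    blocks = [block(i) for i in range(1, n + 1)]
    target = set(range(1, 2 * n + 1))
    union_ok = set().union(*blocks) == target
    # If the sizes add up to the size of the union, no element was repeated,
    # so the blocks must be pairwise disjoint.
    sizes_ok = sum(len(b) for b in blocks) == len(target)
    return union_ok and sizes_ok


if __name__ == "__main__":
    print(all(blocks_partition_initial_segment(n) for n in range(1, 51)))
```

This is the same "write out the first few cases" strategy we used above, just carried a little further. The proof by cases is what covers every natural number at once.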
Now we have seen examples of both a finite partition and an infinite partition.


(Challenge question: Can you identify an infinite partition of N such that all
of the component sets of the partition are also infinite?)

Statement
For the remainder of the chapter, we will only consider finite partitions of finite
sets. In particular, the Rule of Sum only applies in this specific case.

Proposition 8.2.4. Let A be a finite set, let n ∈ N, and let $S = \{S_i \mid i \in [n]\}$
be a finite partition of A. The Rule of Sum states that

$$|A| = \sum_{i \in [n]} |S_i|$$

The Rule of Sum tells us that the size of a set can be found by partitioning
it into a finite number of smaller sets and summing their sizes. Notice that this
is precisely Corollary 7.6.10 that we saw last chapter in our discussion of finite
sets! There, we asked you to prove this claim by induction, in Exercise 2 in
Section 7.6.5. With this result in hand, we’ll move on to see some examples.
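The Rule of Sum is easy to watch in action. In the Python sketch below, the helper partition_by is our own illustrative name: it sorts the elements of a finite set into groups according to some label, so that each element lands in exactly one group and the groups therefore form a partition. We then compare the size of the whole set with the sum of the sizes of the groups.

```python
from collections import defaultdict


def partition_by(A, key):
    """Group the elements of the finite set A by the value of key(x).

    Every element receives exactly one label, so the resulting groups
    are non-empty, pairwise disjoint, and have union A.
    """
    groups = defaultdict(set)
    for x in A:
        groups[key(x)].add(x)
    return list(groups.values())


if __name__ == "__main__":
    A = set(range(1, 31))
    parts = partition_by(A, lambda x: x % 3)  # split by remainder mod 3
    print([len(p) for p in parts], len(A))
    # Rule of Sum: the part sizes add up to |A|.
    assert sum(len(p) for p in parts) == len(A)
```

The choice of label here (remainder mod 3) is arbitrary. Any rule that assigns exactly one label to each element would give a partition, and the sizes would still add up.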

Examples
Example 8.2.5. At Unique Activity University, every student is required to par-
ticipate in exactly one varsity sport each year. Playing more than one would be
too much of a time commitment, and not playing at all would make them lazy,
so everyone plays exactly one of the following non-traditional-but-still-sports
sports: golf, cricket, badminton, and chess. The athletic department released
the following statistics about the rosters for each sport this year:

• Golf: 12 players

• Cricket: 18 players

• Badminton: 23 players

• Chess: 33 players

How many students attend UAU?


Okay, this is an easy example because we made sure to stipulate that the
sports offered by the university form a partition of the set of students. (Compare
that to the sentence, “The set of sports offered by the university is a partition
of the set of students.” Both are correct.) Thus, we can find the cardinality of
S, the set of all students, by adding:

|S| = 12 + 18 + 23 + 33 = 86

A small university, indeed, as well as a bizarre one. Don’t go there.



More interesting examples of applying the Rule of Sum will appear when
we combine it with other counting principles. For now, it’s a simple idea that
governs how to count sets that can be broken into disjoint parts. In general, the
hardest part about using the Rule of Sum is deciding which partition to apply
it to, and being creative about that.
The next counting principle is just as, if not more, helpful but a little more
intricate to define and prove.

8.2.2 The Rule of Product


Motivation
We’ll motivate this principle via an example.
Example 8.2.6. Let’s say we have three people in the room. We also have three
stickers bearing the numbers 1, 2, and 3 on them (with one distinct number on
each sticker). How many ways can we place these stickers on the three people?
For the sake of argument, let’s say the people are named Andy, Brendan, and
Carl, conveniently abbreviated as A, B, and C. To answer this question, we
can simply write out all of the sticker assignments in an organized manner to
make sure we don’t miss any. Specifically, we’ll rank them in increasing order
by Andy’s assignment, then Brendan’s, then Carl’s: we have (A, B, C) =
1. (1, 2, 3)
2. (1, 3, 2)
3. (2, 1, 3)
4. (2, 3, 1)
5. (3, 1, 2)
6. (3, 2, 1)
Thus, there are 6 total ways to assign the stickers.
What if we have four people–Andy, Brendan, Carl, and Dave–and four stick-
ers? Can we list all of those assignments? Sure, why not?

(1, 2, 3, 4) (1, 2, 4, 3) (1, 3, 2, 4) (1, 3, 4, 2)
(1, 4, 2, 3) (1, 4, 3, 2) (2, 1, 3, 4) (2, 1, 4, 3)
(2, 3, 1, 4) (2, 3, 4, 1) (2, 4, 1, 3) (2, 4, 3, 1)
(3, 1, 2, 4) (3, 1, 4, 2) (3, 2, 1, 4) (3, 2, 4, 1)
(3, 4, 1, 2) (3, 4, 2, 1) (4, 1, 2, 3) (4, 1, 3, 2)
(4, 2, 1, 3) (4, 2, 3, 1) (4, 3, 1, 2) (4, 3, 2, 1)

Okay, so there are 24 total ways to assign the stickers. What about with five
people? I don’t know about you, but my arm is getting tired writing out all of
these assignments. There must be a better way to do this! Yes! This is where
8.2. BASIC COUNTING PRINCIPLES 575

the Rule of Product comes in to save the day. (Side note: You might notice a
pattern to our list above; can you infer how we made sure we actually listed all
possibilities? Could you write a little computer program that would generate
all the possibilities, for any number of elements? Try it!)
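If you would like to compare notes after trying this yourself, here is one possible sketch in Python. The function name `assignments` is our own choice, not anything standard. It mirrors the way we built the lists above: pick the first person's sticker in increasing order, then recursively list the ways to hand out the remaining stickers.

```python
def assignments(stickers):
    """Yield every ordering of the tuple `stickers`.

    If `stickers` is given in increasing order, the orderings come out
    in the same increasing (lexicographic) order used in the lists above.
    """
    if not stickers:
        yield ()
        return
    for i, first in enumerate(stickers):
        # Everything except the sticker just given to the first person.
        rest = stickers[:i] + stickers[i + 1:]
        for tail in assignments(rest):
            yield (first,) + tail


three = list(assignments((1, 2, 3)))
four = list(assignments((1, 2, 3, 4)))
print(three)
print(len(three), len(four))
```

The structure of the recursion already hints at the counting argument: each choice of first sticker leads to the same number of ways to finish the assignment.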

Statement
We will actually make two separate statements of the Rule of Product. The first
is an intuitive statement of when and how it applies and what it claims. The
second is a more rigorous, mathematical statement that is rooted in the kind
of set-theoretic language that we have been using all along. We emphasize that
both definitions should, ideally, be understood; however, truly understanding
the first one is more important, and the second is presented mostly because it
is the one that can and will be rigorously proven.

Proposition 8.2.7. Consider a process that is completed in n distinct steps.
Assume that the i-th step, for every i ∈ [n], has exactly w_i different ways to be
completed; moreover, assume that this number w_i ∈ N does not depend on the
choices made in the previous steps. Also, assume that no two distinct choices
at any step yield the same outcome. Then the Rule of Product states that the
total number of outcomes, N, of this n-step process is
\[ N = \prod_{i \in [n]} w_i \]

Let’s relate this statement back to the previous example with the people and
stickers before moving on and stating the Rule of Product more rigorously.
Example 8.2.8. We can think of assigning the stickers to Andy, Brendan, and
Carl as a three-step process. Let’s line up the three gentlemen in alphabetical
order, left to right, then move along the row. At each step, we will place a sticker
on the gentleman in front of us by choosing one that hasn’t been assigned yet.
In the first step, we approach Andy and have 3 possible stickers to place on
him. In the second step, we approach Brendan and have 2 possible stickers to
place on him. Notice that this is true no matter what sticker was chosen for
Andy. We don’t actually care which sticker was chosen for Andy—be it 1, or
2, or 3—merely that the number of choices we have when we face Brendan is
always 2. In the third step, we approach Carl and find that we have only 1
sticker option, regardless of the previous two choices.
The Rule of Product tells us that the number of ways to complete this
process is the product of those numbers of options at each step: 3 · 2 · 1 = 6.
This agrees with our “exhaustive list” procedure. Hooray!
What about with 4 people? Using the same kind of logic, we can see that
there are 4·3·2·1 = 24 possible ways to complete the sticker-assignment process.
Again, this agrees with our previous procedure. Double hooray!
What about with 5 people? Well, 5 · 4 · 3 · 2 · 1 = 120. We figured out
something we didn’t know yet. Triple hooray! With 6 people? With 7 people?

With n people, where n ∈ N? We can answer all of these questions very easily
and precisely now, thanks to the Rule of Product. Infinite hooray!
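To see the two counting methods side by side, here is a small Python sketch. The helper name `count_by_rule_of_product` is ours. It multiplies the number of choices available at each step and compares the result with an exhaustive count produced by the standard library's `itertools.permutations`.

```python
from itertools import permutations
from math import prod


def count_by_rule_of_product(n):
    """Multiply the number of sticker choices at each step: n, n-1, ..., 1."""
    choices_per_step = [n - step for step in range(n)]
    return prod(choices_per_step)


for n in range(1, 8):
    # Exhaustive count: actually generate every assignment and count them.
    brute_force = sum(1 for _ in permutations(range(n)))
    print(n, count_by_rule_of_product(n), brute_force)
```

The exhaustive count becomes slow very quickly as n grows, while the product takes only n multiplications. That gap is the practical payoff of the Rule of Product.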

Tree Diagrams

An interesting and helpful interpretation of the Rule of Product is evidenced


by a tree diagram. The concept of a tree arises in the branch of mathematics
known as graph theory, which studies mathematical objects consisting of vertices
(dots) connected by edges (lines between the dots, where we only care about
whether or not a line is present, and not on what it “looks like” when drawn on
a piece of paper). A tree is a particular type of graph, and it arises commonly
in computer science, as well, when studying branching processes. Within our
context, we can use a tree to represent the decision points of a procedure whose
end products will be counted by the Rule of Product. Furthermore, this method
will provide some insight into the mathematically rigorous statement and proof
of the Rule of Product. (We will leave these ultimate goals to the exercises, but
for those of you who are interested and motivated to attempt them, we strongly
encourage reading this section, as well; it will give you some intuition and guide
you through those exercises.)
Example 8.2.9. Let’s illustrate tree diagrams and how they relate to the Rule
of Product via an example. Let’s say we are planning our schedule for next
semester. Based on our major and time constraints (and personal interests, of
course), we must take exactly one class from each of three departments: math-
ematics, computer science, and philosophy. The number of courses available for
us to take in each department does not depend on the selection we make in any
other department; specifically, we have 4 mathematics courses to choose from,
3 computer science courses, and 2 philosophy courses, and any combination of
courses will fit our schedule (provided each department is represented exactly
once).
How might we apply the Rule of Product to our situation? We would need
to define a process and the steps of that process, and then identify how many
choices are available at each step. Naturally, the overall process here is identi-
fying our course schedule for next semester. Since we are constrained to select
(exactly) one course from each department, let us identify three steps: (1) select
a mathematics course; (2) select a computer science course; (3) select a philos-
ophy course. (Note: Does the order of these steps matter? What if we select a
philosophy course first, instead? Will our process be fundamentally different?
We think not, but make sure you see why before reading on.)
Next, let’s represent the choices we can make at each step. Let’s say the set
of 4 mathematics courses available to take is M = {M1 , M2 , M3 , M4 }, the set
of 3 computer science courses is C = {C1 , C2 , C3 }, and the set of 2 philosophy
courses is P = {P1 , P2 }. This immediately identifies for us the number of
choices available at each step: (1) there are |M| = 4 choices; (2) there are
|C| = 3 choices; (3) there are |P| = 2 choices. Thus, the Rule of Product tells
us there are 24 total course schedules we could create for next semester. But,

really, why is this true? What are those schedules? Let’s represent them by a
tree diagram!
[Figure: tree diagram for the course-selection process. The starting vertex at the far left has four edges labeled M1, M2, M3, M4. Each of the resulting vertices (M1 through M4) has three edges labeled C1, C2, C3, leading to the vertices (M1, C1) through (M4, C3). Each of those has two edges labeled P1, P2, leading to the 24 vertices in the far right column, from (M1, C1, P1) at the top to (M4, C3, P2) at the bottom.]

Reading left to right in the diagram, we are following this three step pro-
cedure we established. The single vertex (or node) at the far left represents
the start of our process—no decisions have been made—and the four edges (or
branches) emerging from that vertex represent the four mathematics courses
from which we can choose. We have labeled each edge with one of the elements
of M. No matter which one of those edges we follow (i.e. no matter which
mathematics course we select), there are three edges emerging from the next
vertex (i.e. we still have three computer science courses from which we can
choose). We have labeled all of those edges with corresponding elements from
C. Following the same idea, every vertex in that column has two emerging edges
which are labeled by corresponding elements from P.
The benefit of this diagram is that we can see exactly what the 24 outcomes
of this process are by following the labels on the edges. For instance, look at the
vertex on the top of the far right column. This corresponds to selecting M1 and

C1 and P1 ; alternatively, we can represent this as the ordered triple (M1 , C1 , P1 ).


Further down that column, we see a vertex corresponding to the ordered triple
(M2 , C3 , P1 ), for example. Every vertex has an ordered triple representation!
What we are really doing when we apply the Rule of Product is identifying
the cardinality of some set that is a Cartesian product of several constituent
sets. The process corresponds to identifying elements of the constituent sets
and arranging them in an ordered tuple. The Rule of Product tells us how
many ways we can do this by identifying the cardinality of the product set
consisting of all such tuples. In this specific example, we have

|M × C × P| = |M| · |C| · |P| = 4 · 3 · 2 = 24

Does this make more sense, now? Does this provide you any insight into
how the Rule of Product actually works?
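The Cartesian product viewpoint is easy to check by machine. In the following Python sketch, the course labels are just strings standing in for the elements of M, C, and P, and the standard library's `itertools.product` generates every ordered triple, which corresponds to reading off every vertex in the far right column of the tree.

```python
from itertools import product

# Labels standing in for the courses in each department.
M = ["M1", "M2", "M3", "M4"]
C = ["C1", "C2", "C3"]
P = ["P1", "P2"]

# Every element of M x C x P, in the same top-to-bottom order as the tree.
schedules = list(product(M, C, P))

print(len(schedules), len(M) * len(C) * len(P))
print(schedules[0], schedules[-1])
```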

More Formal Statement


See Exercise 8.9.1, which asks for a proof of the following theorem. This is a
more formal statement of what the Rule of Product is, mathematically. After
the statement, we’ll describe how it relates to the previous version.
Theorem 8.2.10. Rule of Product (Set-Theoretic Version).
Let n ∈ N. Suppose that ∀i ∈ [n], T_i is a finite set. Then,
\[ \left| \prod_{i \in [n]} T_i \right| = |T_1 \times T_2 \times \cdots \times T_n| = |T_1| \cdot |T_2| \cdots |T_n| = \prod_{i \in [n]} |T_i| \]

The relationship with the previously stated Rule of Product is as follows. The
elements of the set T1 are the choices that can be made in Step 1 of the process.
For every element of T1 , we define the set T2 to be the set of choices that can be
made in Step 2 of the process after that choice made in Step 1. By assumption,
there is an equal number of such choices, regardless of the choice made in Step
1. Thus, it makes sense that the conclusion of the Theorem only incorporates
|T2 |, since this value is well-defined. Likewise, T3 is the set of choices for Step
3 that can follow the choices made in Steps 1 and 2, and by assumption, |T3 | is
well-defined.
In the end, we can describe an outcome of this process by an ordered n-tuple,
where coordinate i is an element of the set Ti . Indeed, what that element
could be does depend on what the previous coordinates are, but the number of
choices for this element is independent of those prior choices. Since, in the end,
we really only care about the number of possible outcomes, the result makes
sense. Actually listing all of the outcomes would require a careful analysis of
each step, seeing how a particular choice affects the choices in the next step
(and the steps thereafter), but that’s not the point of the result. This is why,
essentially, the result amounts to proving that the size of a product of finite sets
is equal to the product of their sizes.

Example: Applying the Rules of Sum and Product (Together)


Let’s practice using these two combinatorics Rules. You’ll also notice that we’ll
start abbreviating these rules as ROS and ROP, so that we can cite them easily.
And yes, we do need to cite them when we use them!
Example 8.2.11. License Plates:
Suppose a license plate string consists of 6 or 7 positions, each of which is filled
with a letter (from A to Z) or a digit (from 0 to 9).

(1) How many license plates are there?


We must partition based on the length of the string, whether it is 6 or 7.
Within each part, we have a 6 or 7 step process. At step i, we fill Position i
in the string with one of the 36 options (there are 26 letters and 10 digits).
By ROP, then, there are $36^6$ strings of length 6 and $36^7$ strings of length 7.
By ROS, then, there are $36^6 + 36^7$ total license plate strings.

(2) How many license plates have at most 1 digit?


We must partition based on whether there are 0 digits or 1 digit.
With 0 digits, each step in our process places a letter in the corresponding
position. We either have 6 letters, yielding $26^6$ possibilities, or 7 letters,
yielding $26^7$ possibilities, by ROP.
By ROS, there are $26^6 + 26^7$ such outcomes.
With 1 digit, step 1a chooses which of the positions is filled with a digit,
step 1b chooses the digit for that position, and the rest of the steps fill the
remaining positions with letters only.
There are 6 (or 7) choices for which position is a digit, then 10 choices for how to
fill that position (wherever it is), and 26 choices each for the other positions.
Applying ROP within each length and then ROS, we find there are
$6 \cdot 10 \cdot 26^5 + 7 \cdot 10 \cdot 26^6$ such outcomes.
In total, by ROS, there are

\[ (26^6 + 6 \cdot 10 \cdot 26^5) + (26^7 + 7 \cdot 10 \cdot 26^6) \]

total outcomes.

(3) How many license plates have at least 2 digits?


We could follow the same method we used with the previous question, and
partition this set of license plates into those with 2 digits, 3 digits, 4 digits,
5 digits, 6 digits, and 7 digits. We would then need to count each such set and
add their sizes. But how many license plates have, say, 4 digits? With 6
positions to be filled, how many ways are there to choose 4 positions to
be digits? This is where binomial coefficients will be helpful, soon enough

(after we have defined them and derived a formula).


Instead, let’s take advantage of the work we just did! Let’s partition the set
of all license plates (call this set Y ) into those with at most 1 digit (call this
set X1 ) and those with at least 2 digits (call this set X2 ). Notice that this
is a partition, so ROS tells us |Y | = |X1 | + |X2 |. Subtracting algebraically,
this tells us the expression we want is

\[ |X_2| = |Y| - |X_1| = (36^6 + 36^7) - \left[ (26^6 + 6 \cdot 10 \cdot 26^5) + (26^7 + 7 \cdot 10 \cdot 26^6) \right] \]

by just substituting in the expressions we've already derived. How convenient!
In general, this is a good strategy: to count a set, we can count its comple-
ment (i.e. all of the “other” elements outside the set) and remove that count
from the “total”. However, remember that we only have a Rule of Sum at
our disposal, not a rule of Subtraction, so we should always be careful (for
now, at least) to phrase such a step in terms of a partition and a sum. After
that, we can subtract numbers or algebraic variables. Eventually, once we
are more mathematically mature, we can easily skip this formality and just
talk about “subtracting out” a count; for now, though, we want to empha-
size the underpinnings of these counting arguments, so we will require this
careful phrasing and application of the Rule of Sum.
(4) How many license plates have no vowels and no even digits?
This condition just limits the number of choices at each step. There are
only 21 letters and 5 digits to choose from, for 21 + 5 = 26 options per position, so we get

\[ 26^6 + 26^7 \]

total outcomes, by ROP and ROS.
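Counting arguments like these are easy to get subtly wrong, so it can help to test them on a scaled-down model that a computer can enumerate completely. The Python sketch below is our own illustration, with function names we chose ourselves. It encodes the ROS/ROP reasoning as formulas: a plate with no digits uses only letters in every position, and a plate with exactly one digit has a position for the digit, a value for the digit, and letters everywhere else. It then compares those formulas against a brute-force enumeration for a tiny alphabet (3 letters, 2 digits, lengths 2 or 3) before evaluating them at full size.

```python
from itertools import product


def plate_counts(num_letters, num_digits, lengths):
    """Return (total, at most one digit, at least two digits) via ROS and ROP."""
    symbols = num_letters + num_digits
    total = sum(symbols ** n for n in lengths)
    no_digit = sum(num_letters ** n for n in lengths)
    # Position of the digit, which digit, then letters in the other n - 1 spots.
    one_digit = sum(n * num_digits * num_letters ** (n - 1) for n in lengths)
    at_most_one = no_digit + one_digit
    return total, at_most_one, total - at_most_one


def brute_force(letters, digits, lengths):
    """Enumerate every plate and classify it by how many digits it contains."""
    total = at_most_one = 0
    for n in lengths:
        for plate in product(letters + digits, repeat=n):
            total += 1
            if sum(ch in digits for ch in plate) <= 1:
                at_most_one += 1
    return total, at_most_one, total - at_most_one


# Scaled-down model: small enough to list every plate.
small_formula = plate_counts(3, 2, (2, 3))
small_enumerated = brute_force("ABC", "01", (2, 3))
print(small_formula, small_enumerated)

# Full-size counts: 26 letters, 10 digits, lengths 6 or 7.
print(plate_counts(26, 10, (6, 7)))
```

Agreement on the small model does not prove the formulas, of course, but a mismatch would tell us immediately that a step of the argument had gone wrong.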

8.2.3 Fundamental Counting Objects and Formulas


Let’s return to our motivating example of counting poker hands. Remember
that we want to know how many of each type of hand there are, how many
ways we could be dealt a flush, say, from a freshly-shuffled deck of 52 cards.
Let’s start by answering a related, but simpler, question: how many total poker
hands are there? Another way of phrasing the question–one that will actually
hint at our method of answering it–is as follows: how many ways are there
to shuffle the entire deck of 52 cards, and how many of those yield the same
poker hand among the top 5 cards? That is, let’s identify how many distinct
(i.e. totally different) ways there are to shuffle the deck; let’s call these ways
shufflings. Then, let’s think of a specific hand, say T ♣ J♣ Q♣ K♣ A♣, and
count how many shufflings have the property that the top 5 cards of the deck
comprise that specific hand in any order (because we don’t care how we receive
the 5 cards we’re dealt, we just care what we’re holding!).

What do we have at our disposal? That’s right, the Rules of Sum and
Product. That’s pretty much it, other than our mathematical wit and intuition,
so let’s dive right in. How does shuffling a deck of cards correspond to a partition,
or a multi-step process? Well, the interesting thing is that we don’t actually
care how the deck is shuffled, we only care about the number of outcomes of the
process. What actually matters about a deck of cards? Right, the order of the
cards from top to bottom. With that in mind, let’s think about constructing
an arbitrary shuffling by assigning the order of the cards.
Let’s create a shuffling by taking a deck of cards in our hands and, one by
one, placing a card face down on a stack in front of us. At the first step, we have
52 cards in our hands and no stack, so we have 52 choices. At the second step,
we have 51 cards remaining in our hands to choose from, no matter what that
first card was. (Remember: this is the important part of the Rule of Product,
that the number of choices is independent of the actual choices made.) In the
third step, we have 50 cards remaining, and so on. Eventually, in the 52nd step,
we have only 1 card in our hands to place on the stack of 51 cards on the table.
After that step is completed, we have a shuffling of the deck sitting in front of us,
with the cards stacked face-down. The card from the 1st step is on the bottom,
and the card from the last step is on the top. Moreover, we see that for any
arbitrary shuffling, there is exactly one sequence of choices that produces that
shuffling. (This satisfies that other part of the Rule of Product about having
distinct outcomes. Think about this carefully and why it’s required.)
These observations allow us to directly cite the Rule of Product to answer
the question: how many shufflings of a standard deck of cards are there? The
number is . . .
\[ 52 \cdot 51 \cdot 50 \cdots 3 \cdot 2 \cdot 1 = \prod_{k \in [52]} k \approx 8.06581752 \times 10^{67} \]

Yowza! That’s a big number. For the sake of comparison, Avogadro’s Constant
(the number of particles in a mole) is on the order of $10^{23}$. There is a much better
notation for this kind of product that says “multiply all the natural numbers
from 52 down to 1”, and you’ve probably seen it before, but we’ll define it now.
Definition 8.2.12. Let n ∈ N. The natural number n!, read as n factorial, is
given by
\[ n! = \prod_{k \in [n]} k = n \cdot (n - 1) \cdot (n - 2) \cdots 3 \cdot 2 \cdot 1 \]

By definition, 0! = 1.
(Recall that we used computing factorials as an example of applying the
principle of induction to recursive programming, way back in Section 2.5.1.
Read that section again!)
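For a concrete look at how large 52! is, here is a short Python sketch. The recursive function `fact` is our own direct transcription of the definition (including 0! = 1), and we compare it against the standard library's `math.factorial`. Python integers have arbitrary precision, so the value is computed exactly.

```python
from math import factorial


def fact(n):
    """Recursive factorial following the definition, with 0! = 1."""
    return 1 if n == 0 else n * fact(n - 1)


shufflings = fact(52)
print(shufflings == factorial(52))   # our recursion agrees with the library
print(f"{shufflings:.8e}")           # scientific notation, 8 decimal places
print(len(str(shufflings)))          # number of decimal digits
```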
Let’s think about what we’ve accomplished, in fact. What was special about
the number 52 in this case? Besides it being the number of cards in our deck,
nothing! What if we had posed the question: how many ways are there to put
the elements of [n] into an ordered list? If we replace n with 52, this is actually

the same question as before! (We could just come up with a natural bijection
between the set of cards and the set [52]. Can you do this? Do you see why this
shows the questions are equivalent?)

Permutations
This type of question—how many ways are there to arrange n objects into an
ordered list—is so common that we have a specific term for these ordered lists.
We define them rigorously in terms of functions, but note their relationship to
other mathematical objects (ordered lists, for instance).

Definition 8.2.13. Let n ∈ N. A permutation of [n] is a function f : [n] → [n]
that is a bijection.
Equivalently, a permutation of [n] is an ordered n-tuple of elements from [n] such
that every element appears exactly once.

Proposition 8.2.14. Let n ∈ N. Let S be the set of all permutations of [n].
Then |S| = n!.

Proof. We construct an arbitrary permutation of [n] by selecting which element
appears first in the ordered list. There are n options. Then, from all the elements
except that one already chosen, select one to appear second in the list. There
are n − 1 options. In general, at step k, we choose from the n − (k − 1) = n − k + 1
elements not already chosen and pick one to appear next. This goes until step
n, where we only have 1 option. By ROP, there are n(n − 1)(n − 2) · · · 2 · 1 = n!
total outcomes.

(Note: this motivates the convention of choosing to define 0! as 1. Since n!


represents the number of ways to permute n objects, and there is exactly 1 way
to permute all of the elements of the empty set—there, we just did it!—it makes
sense that 0! = 1. This idea will return when we define binomial coefficients
shortly; it will be very helpful to have 0! = 1 for the corresponding formula.)

Selections
This mathematically proves a general version of our observation about shuffling
cards, and it brings us closer to answering our original question about ranking
poker hands. Remember that we hope to identify how many distinct shufflings
of the deck yield a certain type of five card hand among the top five cards, so
let’s attack a slightly more general problem, first. Think of a specific five card
hand, five particular cards. We’re thinking of T ♣ J♣ Q♣ K♣ A♣, so let’s use
that. Now, let’s count how many deck shufflings place this specific hand among
the top five cards.
How could we have such a situation? We don’t care about the order in which
we receive the cards in our hand, and we don’t care about the order of the other
47 cards in the deck. All that matters is whether those specific cards are on
the top. So let’s follow the same idea we used before and construct a shuffling

with this property. We want to use the Rule of Product, so we need to identify
a particular process that constructs a shuffling with the desired property. How
can we do this?
There are really only two properties we need to satisfy, so let’s identify a
two step process that ensures those properties hold. The first step should place
the 47 cards not from our hand on the bottom of the deck in some order. The
second step should place the five cards from our hand on top of that pile in
some order. The Rule of Product applies because no matter how we shuffle the
bottom 47 cards, this doesn’t affect the number of ways we can shuffle the top
five cards. (In general, be careful to note why the Rule of Product applies in a
given situation before applying it; this is often subtle and not obvious!) Now,
we just need to count the number of ways to perform each step.
The first step involves creating a permutation of 47 cards. Proposition 8.2.14
tells us there are 47! ways to do this. The second step involves creating a
permutation of five cards. Proposition 8.2.14 tells us there are 5! ways to do
this. Then, the Rule of Product tells us the number of ways to complete these
steps in succession is 47! · 5!. That’s it!
What was special about our choice of T ♣ J♣ Q♣ K♣ A♣ in this case?
That’s right, nothing! By applying the Rule of Product again, this fact will tell
us something more about the number of shufflings of the deck. Specifically, let’s
say X is the number of ways to select a set of five cards as a poker hand. Now,
consider the three step process of taking five particular cards from the deck,
arranging them in some order, and then arranging the other 47 cards below it.
The Rule of Product applies here because the number of ways to perform each
step doesn’t depend on the choices made in the previous steps. Furthermore,
every shuffling of the deck arises from exactly one particular instance of this
procedure. (Think about why this is true. Consider an arbitrary shuffling of
the deck. The top five cards determine which hand we chose in the first step,
the order of them determines how the second step was performed, and the order
of the others determines how the third step was performed.) Thus, we have
found two particular formulas for counting the same set of objects–that is, the
shufflings of a deck of cards–and so it must be true that

X · 5! · 47! = 52!

and therefore
\[ X = \frac{52!}{5! \cdot 47!} \]
Think about what this formula tells us. We let X designate the number of ways
to choose a set of five cards from a set of 52 cards. What was special about five
or 52? Again, that’s right, nothing! We have essentially derived a formula for
the number of ways to select any number of objects from a larger set of objects.
It might not seem like it, but we are now very close to solving the poker hands
problem. Before we finish that project, let’s make one comment.
The type of argument we just made is a common and extremely useful
proof technique in combinatorics. It is known as counting in two ways. What

we did was identify a particular set of objects–in this case, the set of shufflings
of a deck of cards–and then describe two different procedures that allowed us to
count the size of that set. Each procedure led to a different formula, and because
we were counting the same set of objects, we know those formulas are equal.
We will explore this type of argument more explicitly and see many examples
in Section 8.4. For now, we hope that you can see why it is a valid argument
type, especially because we will expect you to use it to prove Proposition 8.2.16
below! In doing so, you will be generalizing the argument we presented here.
For illustration’s sake, let’s summarize what we did:
Argument Summary: We seek an expression for the number of ways to draw 5
cards from a deck of 52 cards. Let N be this number we are looking for. We will
identify two different formulas for expressions that involve N . This will allow
us to solve these algebraic expressions for a formula for N .
(1) Select an arbitrary and fixed five card hand. We will identify the number
of ways to shuffle a deck of cards such that the top five cards are that fixed
five card hand, in any order.
Note that there are N ways to do this step. We seek a formula for N .
(2) Count the number of permutations of the entire deck of 52 cards.
(3) Count the number of permutations of the deck that yield those fixed five
cards on the top. This is split into three steps:
(i) Count the number of ways to permute those five cards.
(ii) Count the number of ways to permute the other 47 cards.
(iii) Count the number of ways to put those 5 permuted cards on top of
those 47 permuted cards. (Note: There is only one way to do this, but
it’s important to point out as a separate step.)
(4) Overall, notice that we have counted the number of permutations (i.e. shuf-
flings) of the deck in two separate ways, so they must be the same number.
(5) Simplify the expression (which involves N ) to find a formula for N .
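The steps summarized above can be tried out on a deck small enough to enumerate. In the Python sketch below, the 6-card deck and the 2-card hand are hypothetical stand-ins we chose so that all 720 shufflings can be listed. We count the shufflings that put a fixed hand on top, group all shufflings by the hand they put on top, and check that the two ways of counting the shufflings agree. The last line applies the resulting formula to the real deck.

```python
from itertools import permutations
from math import comb, factorial

deck = range(6)   # hypothetical small deck, cards labeled 0 through 5
hand = {0, 1}     # one fixed 2-card hand
k, n = len(hand), len(deck)

# Shufflings whose top k cards are exactly the fixed hand, in any order.
with_hand_on_top = sum(1 for s in permutations(deck) if set(s[:k]) == hand)
print(with_hand_on_top, factorial(k) * factorial(n - k))

# Group every shuffling by the unordered hand on top; X is the number of groups.
hands_on_top = {frozenset(s[:k]) for s in permutations(deck)}
X = len(hands_on_top)
print(X, X * factorial(k) * factorial(n - k) == factorial(n))

# The same formula for 5 cards from 52, checked against math.comb.
print(factorial(52) // (factorial(5) * factorial(47)), comb(52, 5))
```

Note that the count of shufflings with a given hand on top does not depend on which hand we fixed, which is what lets us multiply by X.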
Now, let’s generalize the formula we just derived. First, we make a definition
and introduce some notation, and then we state a formula.
Definition 8.2.15. Let k, n ∈ N with n ≥ k. A k-selection from [n] is an
unordered set of k elements from [n].
The number of k-selections from [n] is represented by $\binom{n}{k}$. This is known as a
binomial coefficient, and is read as “n choose k”.


Proposition 8.2.16. Let k, n ∈ N with n ≥ k. The number of k-selections
from [n] is given by
\[ \binom{n}{k} = \frac{n!}{k! \cdot (n - k)!} \]
Proof. Left for the reader as Exercise 2 in Section 8.2.4.

Binomial Coefficients
One thing you might find surprising about the above formula is that the fraction
is actually a natural number, no matter what k and n are! This is proven by the
fact that it represents a number of ways to complete a procedure, as described
in the proof, and this must be a natural number.
We want to point out one special case of this formula which may not occur
to you. What if k = 0, say? What number should $\binom{n}{0}$ be? You might be
surprised to find out that $\binom{n}{0} = 1$. Why does this make sense? Intuitively, we
think of $\binom{n}{k}$ as the number of ways to select k objects from a set of n objects;
so, how many ways can we select 0 objects from, say, 3 objects? Put 3 pens on
your desk. Now, select none of them. There! You just did it! That was one
way—and the only way—to select none of the objects. This argument works
just as well when n = 0, even! Put no pens on your desk. Now, select none of
them. There! You just did it in one way again. Thus,
\[ \forall n \in \mathbb{N} \cup \{0\}. \quad \binom{n}{0} = 1 \]

There are “better”, more mathematical reasons for this result, and we will point
these out in the next section when we prove Pascal’s Identity. For now, we hope
that this heuristic explanation with selections makes sense and can convince you
of this result.
Another fact is that $\binom{n}{k} = 0$ whenever k > n. This is because there are no
ways to choose, for instance, 5 objects from a set of only 3 objects. This fact
is borne out by our derivation above, because in one of the steps, we would be
trying to (impossibly) draw more cards for a hand than there are cards in that
deck, and there are 0 ways to do this. Then, when we apply ROP, the product
would evaluate to 0.
If you play around with some values of k and n, you’ll notice that the values
of $\binom{n}{k}$ obey a so-called unimodal distribution. That is, if we fix n and let
k increase from 0 to n, we find the numbers going up, reaching a peak at
$k = \lfloor n/2 \rfloor$ and $k = \lceil n/2 \rceil$ (notice these are the same if n is even),
and then decreasing again.
Furthermore, the distribution is symmetric around that middle! Can you prove
that these properties hold? Try it!
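Before attempting a proof, it can be reassuring to gather some numerical evidence. The Python sketch below uses the standard library's `math.comb`; the helper names are our own. It prints the first few rows of binomial coefficients and checks symmetry and unimodality for every n below 50. It also evaluates the special cases discussed above, since `math.comb` returns 0 when k exceeds n.

```python
from math import comb


def binomial_row(n):
    """The list of n choose k for k = 0, 1, ..., n."""
    return [comb(n, k) for k in range(n + 1)]


def is_symmetric(row):
    return row == row[::-1]


def is_unimodal(row):
    """True if the row rises (weakly) to its first maximum, then falls (weakly)."""
    peak = row.index(max(row))
    rising = all(row[i] <= row[i + 1] for i in range(peak))
    falling = all(row[i] >= row[i + 1] for i in range(peak, len(row) - 1))
    return rising and falling


for n in range(7):
    print(n, binomial_row(n))

print(all(is_symmetric(binomial_row(n)) and is_unimodal(binomial_row(n))
          for n in range(50)))
print(comb(3, 0), comb(0, 0), comb(3, 5))
```

Evidence for fifty values of n is not a proof for all n, so the exercise still stands.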

Arrangements
We now have all of the tools necessary to count poker hands (and plenty of
other objects, for that matter). We know how many ways there are to permute
the elements of a set, and we know how many ways there are to choose a subset
of a certain size from a larger set. Between these two tools, we know how to
count any combinations of cards. For instance, to count an ordered subset of
cards, we can count the number of ways to choose the subset and then permute
its elements, applying the Rule of Product to this two-step process. In fact, this
idea is common enough that we will give it a defined name.

Definition 8.2.17. Let k, n ∈ N with n ≥ k. A k-arrangement from [n] is
an ordered k-tuple of elements from [n] with no repeated elements.
Equivalently, a k-arrangement from [n] is a function f : [k] → [n] that is an
injection.
Proposition 8.2.18. Let k, n ∈ N with n ≥ k. The number of k-arrangements
from [n] is given by $\binom{n}{k} \cdot k! = \frac{n!}{(n-k)!}$.

Proof. Left for the reader as Exercise 3 in Section 8.2.4
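A numerical check is no substitute for the proof, but it can confirm that we are reading the formula correctly. In this Python sketch, `count_arrangements` is our own helper that lists every ordered k-tuple of distinct elements of [n] using `itertools.permutations`. We compare that count with "choose a subset, then order it", with the quotient of factorials, and with the standard library's `math.perm`.

```python
from itertools import permutations
from math import comb, factorial, perm


def count_arrangements(n, k):
    """Count ordered k-tuples from [n] with no repeated element, by listing them."""
    elements = range(1, n + 1)
    return sum(1 for _ in permutations(elements, k))


for n, k in [(4, 2), (5, 3), (6, 6), (7, 1)]:
    select_then_order = comb(n, k) * factorial(k)
    by_factorials = factorial(n) // factorial(n - k)
    print(n, k, count_arrangements(n, k), select_then_order, by_factorials, perm(n, k))
```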

Repetition
Before we go on and count those poker hands, actually, we should point out
that all of the standard counting formulas we have seen in this section only
consider procedures where objects are not allowed to be repeated. That is, when
we choose a five card hand from a deck, we can’t have two A♣s, for instance.
There are situations where we will want to allow objects to be selected multiple
times. Look back at the License Plates example in the previous section. We
were allowed to repeat any digit/letter; for instance, 111AAA is a valid license
plate. Let’s see one more example here:
Example 8.2.19. Consider a standard, fair, two-sided coin. Flip the coin 4 times
in a row and write down the outcomes, either H or T for each flip.
Question: How many possible sequences of outcomes are there?
To answer this question, we note that there are 2 possible outcomes on each
flip, regardless of the outcomes on the previous flips. Thus, the Rule of Product
applies, and we can say there are $2 \cdot 2 \cdot 2 \cdot 2 = 2^4 = 16$ possible sequences of flips.
The reason this idea is related to selections and arrangements (besides using
the Rule of Product, of course) is that we can also represent these sequences
as arrangements of 4 objects from the set {H, T} where objects are allowed to
appear more than once. (There is a natural correspondence between {H, T}
and [2], so it is like we are arranging 4 objects from [2], where the objects can
occur more than once.)
This general idea is conveyed by this definition:
Definition 8.2.20. Let k, n ∈ N. A k-arrangement with repetition from
[n] is a k-tuple of elements from [n] where elements are allowed to appear more
than once.
Notice that there is no restriction on k because we are allowing elements
to appear multiple times. Before, with k-arrangements without repetition, it
wouldn’t make sense to choose 10 objects from 8 objects if we couldn’t repeat
any! Here, though, this is allowed, so k and n can be any natural numbers.
Proposition 8.2.21. Let k, n ∈ N. The number of k-arrangements with
repetition from [n] is given by $n^k$.
Proof. Left for the reader as Exercise 4 in Section 8.2.4.
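To see arrangements with repetition concretely, we can list every sequence of four coin flips, matching the product 2 · 2 · 2 · 2 computed in the example. This is only a sketch; `arrangements_with_repetition` is a name we made up, and it relies on `itertools.product` from the Python standard library.

```python
from itertools import product

def arrangements_with_repetition(symbols, k):
    """List every k-tuple over `symbols`, allowing repeated entries."""
    return list(product(symbols, repeat=k))

flips = arrangements_with_repetition("HT", 4)
print(len(flips), 2**4)
print("".join(flips[0]), "".join(flips[-1]))

# k may exceed n here: 10-tuples over a 2-element set are perfectly fine.
print(len(arrangements_with_repetition("HT", 10)) == 2**10)
```

The last line illustrates the remark that k is unrestricted once repeats are allowed.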

You might anticipate a definition and proposition for k-selections with rep-
etition that are similar to the ones for arrangements with repetition. We will
discuss these in Section 8.5, but the techniques used to count them are more
advanced than the ones we have now, so we will address this later.

Summarizing Counting Formulas


Let’s summarize the standard counting objects and formulas we have defined
and derived thus far: Say we have n objects and we want to select k of them.
How many ways can we do this? The answer depends on two questions:

• Are repeats allowed?

• Does order distinguish the outcome?

Each of these questions can be answered with Yes or No, and each of the four ways
to answer them yields a different formulation of the original question.

                            Repeats allowed?
                            Yes          No
  Order matters?    Yes     $n^k$        $\frac{n!}{(n-k)!}$
                    No      ???          $\binom{n}{k}$
(Note: Sometimes, the roles of n and k are reversed in a problem. Be careful
about this! We’ll try to stick to these conventions but, in general, the letters
aren’t important; it’s what they represent.)

Combinatorics Definitions in terms of Functions


Remember there are also equivalent formulations of these counting ideas in
terms of functions, and it’s helpful to have this in mind. Perhaps representing
a problem in terms of functions will help us solve it. At the very least, it’s
a good mental exercise to work through and make sure you understand the
relationship between, for instance, permutations and bijections. We will just
state each of these formulations (and some corresponding formulas) and ask
you to think about them on your own. Try to see exactly why and how the
notions are related; try to explain them to a friend who only knows one of the
interpretations; work with your classmates to perhaps come up with a different
formulation!

• A permutation of n elements is a bijection f : [n] → [n].
  There are n! possible bijections from the set [n] to itself.

• An arrangement of k elements from n elements is an injection f : [k] → [n].
  There are $\frac{n!}{(n-k)!}$ injections from [k] to [n].

• An arrangement with repetition of k elements from n elements is a
  function f : [k] → [n].
  There are $n^k$ possible functions from [k] to [n].
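One way to explore these correspondences is to list functions explicitly. In the sketch below we store a function f : [k] → [n] as the tuple (f(1), ..., f(k)); this representation and the helper names `all_functions`, `is_injection` and `is_bijection` are our own choices for illustration.

```python
from itertools import product
from math import factorial

def all_functions(k, n):
    """Every function f: [k] -> [n], stored as the tuple (f(1), ..., f(k))."""
    return list(product(range(1, n + 1), repeat=k))

def is_injection(f):
    """No output value is used twice."""
    return len(set(f)) == len(f)

def is_bijection(f, n):
    """Injective, and every element of [n] is an output."""
    return is_injection(f) and set(f) == set(range(1, n + 1))

k, n = 3, 4
functions = all_functions(k, n)
injections = [f for f in functions if is_injection(f)]
print(len(functions), n**k)
print(len(injections), factorial(n) // factorial(n - k))

bijections = [f for f in all_functions(n, n) if is_bijection(f, n)]
print(len(bijections), factorial(n))
```

Each printed line pairs a brute-force count with the corresponding formula from the list above.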

8.2.4 Questions & Exercises


Remind Yourself
Answer the following questions briefly, either out loud or in writing. These
are all based on the section you just read, so if you can’t recall a specific defi-
nition or concept or example, go back and reread that part. Making sure you
can confidently answer these before moving on will help your understanding and
memory!
(1) What is the difference between a selection and an arrangement?
(2) How might a permutation be defined in terms of selections and arrange-
ments?
(3) What is $\binom{10}{15}$?

(4) How is a permutation related to the concept of a bijection?

Try It
Try answering the following short-answer questions. They require you to actu-
ally write something down, or describe something out loud (to a friend/class-
mate, perhaps). The goal is to get you to practice working with new concepts,
definitions, and notation. They are meant to be easy, though; making sure you
can work through them will help you!
(1) Verify algebraically that $\binom{n}{k} = \binom{n}{n-k}$.
(2) Prove Proposition 8.2.16, i.e. prove that
$$\binom{n}{k} = \frac{n!}{k!(n-k)!}$$
Do this by adapting the argument we used for counting the number of 5
card hands from a standard deck.
(3) Prove Proposition 8.2.18. That is, prove there are
$$\frac{n!}{(n-k)!}$$
possible k-arrangements from [n].

(4) Prove Proposition 8.2.21. That is, prove there are
$$n^k$$
possible k-arrangements with repetition from [n].

8.3 Counting Arguments


Now we are fully ready to address the motivating problem of this chapter! We
will employ the counting techniques we have developed–the Rules of Product and
Sum–as well as the formulas for selections and arrangements. Importantly, we
will show you some standard counting arguments and proof strategies. We will
point out some general guidelines and proof techniques as we go, motivating and
implementing these with several examples. These are the types of techniques
we will expect you to use in the future.

8.3.1 Poker Hands


Example 8.3.1. One Pair
Let’s start near the bottom of the ranks and count the number of poker hands
that correspond to one pair. We emphasize that we only want to count hands
with exactly one pair, and exclude two pairs, three of a kinds, full houses and
four of a kinds. This idea will surface soon enough in our counting argument.
(It also hints at why counting “high card” hands is actually quite difficult, far
more intricate than just selecting five random cards! How can we guarantee
that a hand has no matching cards, isn’t a straight, and isn’t a flush? We will
address this question later in this section.)
In this example–and in every other example we will explicate here, and every
other exercise you will complete (do you get the sense this is important?)–we seek
a process wherein we construct an object (in this case, a poker hand) with the
desired properties (in this case, having exactly one pair and no other matching
cards). By counting the number of options at each step in the process, and
ensuring that every desired object can only be obtained via one set of options
in the process, we can apply the Rule of Product and identify the number of
objects with the desired properties!
Here’s a useful strategy for coming up with these processes: pretend your
friend is holding one of the objects you’re counting in his/her hands, but you
can’t see it. What questions would you ask to identify the particular properties
of the object he/she is holding? These can be yes/no questions or, more often
than not, queries about the particular properties the object has. In our specific
case, counting one-pair hands, we would likely ask the following questions: (1)
“What are the two cards in the pair?” and (2) “What are the three cards not
in the pair?” With the answers to those questions, we could fully specify the
hand our friend is holding. Unfortunately, it’s too hard to count the number of
answers to those questions as they are posed. We should be more specific and

break our questions into smaller parts. That way, we can count the number of
answers to each question and use those numbers in the Rule of Product.
How can we be more specific? How can we break question one into parts?
Imagine the types of answers our friend might give us for question one. We
might hear something like, “The Ace of Hearts and Ace of Spades” or “The
Sevens of Diamonds and Clubs”. This signals the important properties of an
answer to question one: we need to know the rank of the pair cards (are they
both Aces? Kings? Queens? etc.) and the two suits represented. We know
there are 13 ranks and 4 suits in the deck. With this information, we can identify
how to construct a pair and count the options.
1. Choose a rank for the two cards in the pair: 13 options
2. Choose the two suits for those cards: $\binom{4}{2} = 6$ options

Notice that we have used the binomial coefficient $\binom{4}{2}$ to signify that we are
selecting 2 suits from a set of 4 suits, so there are $\binom{4}{2}$ ways to do this.
Note: $\binom{4}{2}$ is a NUMBER. It represents the number of ways to do something,
and does not actually correspond to doing that action. That is, we don’t say
something silly like “$\binom{4}{2}$ selects 2 suits from the set of 4 suits.” How can a
number choose cards from a deck?
Also note: We wrote $\binom{4}{2} = 6$ in this case for illustration’s sake but, in
general, we do not expect (or even necessarily want) you to evaluate binomial
coefficients. The arithmetic often involves very large numbers and, quite frankly,
the number $\binom{4}{2}$ is far more illustrative than 6. It indicates to a reader that this
step in your process involves selecting 2 elements from a set of 4, whereas 6
could represent $\binom{6}{1}$ or $2 \cdot \binom{3}{2}$ and so on. With that observation made, we might
as well write the number in the first step as $\binom{13}{1}$, right?
Now, we observe that any selections made in these steps produce a unique
pair. That is, we can’t possibly have a pair that could arise from two different
versions of this process. Thus, the Rule of Product applies, and we can conclude
that there are $\binom{13}{1} \cdot \binom{4}{2}$ ways to select a pair of cards.
What if we had performed these two steps in the opposite order? We could
just as well identify a pair of cards by asking which two suits are represented and
then asking what their common rank is. (Of course, this only works if we know,
a priori, that the cards have a common rank.) In that case, the Rule of Product
would tell us there are $\binom{4}{2} \cdot \binom{13}{1}$ such pairs. Hey, that’s the same number! The
commutativity of multiplication of real numbers (that is, x · y = y · x for any
x, y ∈ R) confirms our intuition that these steps are reversible.
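The claim that each pair arises from exactly one set of choices can be checked directly. The sketch below is an illustration under our own conventions: cards are two-character strings such as "AH", with suit letters C, D, H, S standing for the four suits.

```python
from itertools import combinations
from math import comb

RANKS = "A23456789TJQK"
SUITS = "CDHS"   # clubs, diamonds, hearts, spades
DECK = [rank + suit for rank in RANKS for suit in SUITS]

# Brute force: every 2-card subset of the deck whose cards share a rank.
pairs = [cards for cards in combinations(DECK, 2) if cards[0][0] == cards[1][0]]

# Process from the text: choose the rank, then choose 2 of the 4 suits.
choices = [(rank, s1, s2) for rank in RANKS for s1, s2 in combinations(SUITS, 2)]
built = {frozenset({rank + s1, rank + s2}) for rank, s1, s2 in choices}

# If two different choice sequences gave the same pair, `built` would be
# smaller than `choices`.
print(len(pairs), len(choices), len(built), comb(13, 1) * comb(4, 2))
```

All four numbers agreeing is what the Rule of Product requires: every pair is produced, and none is produced twice.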
We aren’t quite done constructing a poker hand with one pair. We need to
choose three more cards. What property should they have? What more specific
questions could we ask our friend, besides “What are they?”. We need to know
the three cards’ ranks and their suits. Is there any restriction on their suits?
No! (Because we have a pair already, there is no chance for a flush.) Is there
any restriction on their ranks? Yes! We know the three cards all have different
ranks, and none of them match the rank of the pair already chosen. With these
observations, we can reverse the process and construct the rest of the hand.

1. Choose 3 ranks from the 12 remaining (i.e. not the same rank as the pair
cards): $\binom{12}{3}$ options

2. Arrange those 3 ranks in increasing order: 1 option

3. Choose a suit for the lowest-ranked card: $\binom{4}{1}$ options

4. Choose a suit for the middle-ranked card: $\binom{4}{1}$ options

5. Choose a suit for the highest-ranked card: $\binom{4}{1}$ options




Why did we need step 2? Look back at the definition of selection; it is an
unordered list, or a set. Thus, it wouldn’t make sense to jump into step 2 by
saying “Choose a suit for the 1st of those chosen cards” because, well, there is
no 1st card! We need to impose some kind of ordering on the cards to refer
to them individually. You might be tempted to order them as we remove them
from the deck. This would break step 1 into 3 sub-steps: (a) choose the 1st
card: $\binom{12}{1}$ options; (b) choose the 2nd card: $\binom{11}{1}$ options; (c) choose the 3rd
card: $\binom{10}{1}$ options. Applying the Rule of Product to this step yields a different
number than step 1:

$$\binom{12}{1} \cdot \binom{11}{1} \cdot \binom{10}{1} = 12 \cdot 11 \cdot 10 \neq \binom{12}{3} = \frac{12!}{3! \cdot 9!} = \frac{12 \cdot 11 \cdot 10}{6}$$

This is because the (a)-(b)-(c) step imposes an order on those three cards that
doesn’t actually matter within our poker hand. When playing cards, you don’t
care how you receive your cards, only what they are! (However, notice that if
we “divide out” by the number of ways to order 3 cards, namely 3!, we get the
same number. This hints at an interesting concept, a kind of “inverse” of the
Rule of Product. We will discuss this at the end of this section.) This is why
we couldn’t refer to “the 1st card” in step 2. Instead, we found an inherent
ordering of the cards, a particular property they possess that allows us to refer
to specific cards among them without applying an external ordering.
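The gap between the two counts is easy to see by listing both kinds of object. In this sketch we assume, purely for illustration, that the pair is a pair of Aces, so the 12 remaining ranks are 2 through K.

```python
from itertools import combinations, permutations
from math import factorial

OTHER_RANKS = "23456789TJQK"   # 12 ranks left, assuming the pair is Aces

ordered = list(permutations(OTHER_RANKS, 3))     # sub-steps (a)-(b)-(c)
unordered = list(combinations(OTHER_RANKS, 3))   # step 1 as written

print(len(ordered), len(unordered))

# Every unordered selection shows up once per ordering of its 3 elements.
orderings_per_set = {}
for triple in ordered:
    key = frozenset(triple)
    orderings_per_set[key] = orderings_per_set.get(key, 0) + 1
print(set(orderings_per_set.values()) == {factorial(3)})
```

The second line printing `True` is the "divide out by 3!" observation: each set of three ranks is reached by exactly 3! ordered choices.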
Again, the Rule of Product applies because any selection of 3 cards of differ-
ent ranks could only come from one set of choices made in these steps. Further-
more, we can think of selecting a pair as Step 1 and selecting three other cards
of different ranks as Step 2 and apply the Rule of Product to this entire process.
This finally gives us an answer for the number of “one pair” poker hands:
$$\binom{13}{1} \cdot \binom{4}{2} \cdot \binom{12}{3} \cdot \binom{4}{1}^3$$

Notice that we have combined the three numbers from the last steps above into
one coefficient raised to the third power. Now, this type of numerical answer
is totally acceptable and is far better than just writing down 1,098,240. If
you make a “typo” on your homework or make a calculator error, how can we
identify the error and offer a comment?

We did previously note the commutativity of multiplication and the idea of
doing steps in different orders. However, we hope you’ll agree that explaining a
product like
$$\binom{4}{1}^3 \cdot \binom{13}{1} \cdot \binom{12}{3} \cdot \binom{4}{2}$$
even though it represents the same process, is far more intricate, and
unnecessarily so, at that.
We chose to be particularly wordy with our explanation in the last subsec-
tion. We won’t expect you to write nearly as much. We were just officially
introducing a formal method that applies the counting rules and formulas we
developed in the last section, while also mentioning some heuristic rules and
strategies to approach problems. So with that said, let’s present a typical solu-
tion to this problem, in more condensed form. This is the type of solution we
will expect you to write:
Question: How many 5-card poker hands are “one pair” hands?
Answer: We claim there are
$$\binom{13}{1} \cdot \binom{4}{2} \cdot \binom{12}{3} \cdot \binom{4}{1}^3$$
such hands. To show this, we will identify a four-step process and apply ROP.
An outcome of this process is a “one pair” poker hand:
(1) Select a rank to constitute the pair.
There are $\binom{13}{1}$ ways to do this.

(2) Select which two suits of that card in (1) appear in the hand.
There are $\binom{4}{2}$ ways to do this.

(3) Select three other ranks to appear.
There are $\binom{12}{3}$ ways to do this.

(4) For each rank chosen in (3), select a suit of that card to appear in the hand.
There are $\binom{4}{1}$ ways to do this, each of three times; thus, there are $\binom{4}{1}^3$ ways
in total.
Applying ROP, we find the answer given above.
Does this make sense? Notice how much shorter it is than our explanation
above. This is fine! We will continue to sometimes write out some details
in our written examples here (to help you understand how to approach these
problems, before writing them up), but your written solutions can be a little
more condensed, as long as they identify all the key elements of the problem’s
solution. Notice that we pointed out a use of the ROP, cited it, and identified
all the steps in the process; for each step, we noted how many ways there are
to do that step. It just so happens each of these steps is pretty simple, and

the number of ways to perform them is clear in each case. In general, we might
expect a more thorough description. For instance, we would consider writing
that the number of ways to do step (3) is $\binom{12}{3}$ because we aren’t allowed to
re-select the rank chosen in step (1). However, we felt this was clear from the
descriptions so we left it out. This is a judgment call, though, and we recommend
(as always) setting aside your proofs and rereading them as if you didn’t write
them. If you can’t remember, or aren’t entirely sure, why something is true,
consider adding a little extra description there.
Before doing another example, let’s point out a different solution to this
same problem!
Question: How many 5-card poker hands are “one pair” hands?
Answer: We claim there are
$$\binom{13}{4} \binom{4}{1} \binom{4}{2} \binom{4}{1}^3$$
“one pair” poker hands. We will identify a six-step process and apply ROP. The
main idea is that a one pair hand can be identified by choosing all four ranks
that appear and identifying which one is repeated twice (leaving the others to
appear just once).
(1) Select 4 ranks that will appear in our hand.
There are $\binom{13}{4}$ ways to do this.

(2) Of the 4 ranks selected in Step (1), select one of them. Two cards of that
rank will appear in our hand.
There are $\binom{4}{1}$ ways to do this.

(3) For that rank chosen in Step (2), select 2 suits. These will appear in our
hand.
There are $\binom{4}{2}$ ways to do this.

(4) For the lowest of those 3 ranks not chosen in Step (2), select a suit.
There are $\binom{4}{1}$ ways to do this.

(5) For the middle of those 3 ranks not chosen in Step (2), select a suit.
There are $\binom{4}{1}$ ways to do this.

(6) For the highest of those 3 ranks not chosen in Step (2), select a suit.
There are $\binom{4}{1}$ ways to do this.

By ROP, and simplifying $\binom{4}{1} \binom{4}{1} \binom{4}{1} = \binom{4}{1}^3$, we have shown the expression above
is correct.
Isn’t that neat? We’ll leave it to you to verify that
$$\binom{13}{4} \binom{4}{1} \binom{4}{2} \binom{4}{1}^3 = 1098240 = \binom{13}{1} \binom{4}{2} \binom{12}{3} \binom{4}{1}^3$$

is true, numerically speaking. Without calculating that number in the middle,


though, we could be sure that the two expressions, on the left and right, are
absolutely equal representations of the same number because they count the
same thing: the number of one pair poker hands. This is another instance of
that idea of “counting in two ways” that we are building towards.
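If you would like a sanity check alongside the combinatorial argument, the two expressions can be evaluated and compared with a direct enumeration. The sketch below is ours, not part of the formal solution: it generalizes both formulas to a deck with a variable number of ranks (the parameter names `ranks` and `suits` are assumptions for illustration) so that the brute-force count can run quickly on a reduced 24-card deck.

```python
from collections import Counter
from itertools import combinations
from math import comb

def one_pair_first(ranks, suits):
    """Pair rank, pair suits, 3 other ranks, one suit for each of those."""
    return comb(ranks, 1) * comb(suits, 2) * comb(ranks - 1, 3) * comb(suits, 1) ** 3

def one_pair_second(ranks, suits):
    """4 ranks, which one is paired, pair suits, one suit for each other rank."""
    return comb(ranks, 4) * comb(4, 1) * comb(suits, 2) * comb(suits, 1) ** 3

def one_pair_brute_force(ranks, suits):
    """Inspect every 5-card hand; keep those whose rank pattern is 2-1-1-1."""
    deck = [(r, s) for r in range(ranks) for s in range(suits)]
    total = 0
    for hand in combinations(deck, 5):
        pattern = sorted(Counter(r for r, _ in hand).values())
        if pattern == [1, 1, 1, 2]:
            total += 1
    return total

print(one_pair_first(13, 4), one_pair_second(13, 4))   # the standard deck
print(one_pair_first(6, 4), one_pair_second(6, 4), one_pair_brute_force(6, 4))
```

The numbers on each line should agree. The enumeration is a check on the arithmetic; the reason the expressions agree is still the counting argument.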
Example 8.3.2. Flush
Let’s jump right into another problem and solve it. Let’s count the number of
poker hands that are flushes. A flush hand is defined by two properties: the suit
all 5 of its cards share, and the 5 ranks of those cards. Thus, a flush can be
generated by a two-step process:

(1) Select a suit for all five cards of the hand.
There are $\binom{4}{1}$ ways to do this.

(2) Select five of the cards from that suit to appear in the hand.
There are $\binom{13}{5}$ ways to do this.

Since each flush hand is uniquely defined by these two steps, we can apply ROP
and conclude that there are
$$\binom{4}{1} \cdot \binom{13}{5} = 5148$$

poker hands that are flushes.


The proof given in this example (except for the final number 5148, which we
only included here for the sake of comparison to the “one pair” answer, which was
much larger) is completely correct and rigorous, and would receive full credit.
Use this as a model for simple counting problems with the Rule of Product.
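The two-step process can be mirrored in a few lines of code. As a hedged sketch, we evaluate the formula for the standard deck and compare the same formula against a full enumeration on a reduced deck of 7 ranks (our own choice, to keep the enumeration small). Like the count in the example, this treats any five cards of one suit as a flush and does not set aside straight flushes.

```python
from itertools import combinations
from math import comb

def flush_formula(ranks, suits):
    """Choose the shared suit, then 5 of that suit's cards."""
    return comb(suits, 1) * comb(ranks, 5)

def flush_brute_force(ranks, suits):
    """Inspect every 5-card hand; keep those whose cards all share one suit."""
    deck = [(r, s) for r in range(ranks) for s in range(suits)]
    return sum(1 for hand in combinations(deck, 5) if len({s for _, s in hand}) == 1)

print(flush_formula(13, 4))                          # the standard deck
print(flush_formula(7, 4), flush_brute_force(7, 4))  # reduced deck, both methods
```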
Example 8.3.3. Straight
The ranks of the cards in a straight are uniquely determined by the “starting
rank”, the lowest card of the hand. If I told you I had a 5-card straight starting
with 7, you’d know immediately I have a 789TJ straight. Since we can have
a straight like A2345, or one like 23456, . . . all the way up to TJQKA (Note:
There is no “going around the corner” in a straight, like QKA23), this means
we have 10 possible lowest ranks in a straight. Thus, there are ten types of
straight, and after picking which type we have, we just need to assign the suits
so that they aren’t all the same (in which case we’d have a straight flush).
We claim there are
$$\binom{10}{1} \left[ \binom{4}{1}^5 - \binom{4}{1} \right] = 10 \cdot (4^5 - 4)$$

5 card hands that are straights.

Proof. We will describe 5 card hands that are straights by a two-step process:

1. Select one of 10 ranks to be the lowest rank in the straight. These options
are A,2,3,4,5,6,7,8,9,T, so there are 10 options in this step.
Note: This determines the other 4 ranks in the hand, since the 5 ranks
must be consecutive and we know what the lowest one is.
2. Assign suits to the 5 cards so that they are not all the same suit.
Let’s say X is the set of all possible ways to assign suits in this manner,
so there are |X| options in this step.
We will now find |X| by establishing a partition. Let Y be the set of all
assignments of 5 suits so that they are all the same. Notice that the sets X and Y
form a partition of U, the set of all possible assignments of 5 suits. (That is,
any assignment of 5 suits either selects all the same suit, or it does not.) Thus,
by ROS, we have |U| = |X| + |Y|.
We can find |U| by a 5 step process, where in Step i, we select one of the 4 suits
for the i-th highest rank in the hand. With 4 options at each of 5 steps, we have
$|U| = 4^5$.
We can find |Y| by noticing that any such selection amounts to picking one of
the 4 suits and assigning that suit to all 5 cards in the hand. Thus, |Y| = 4.
Accordingly, we can rearrange the above equality and write
$$|X| = |U| - |Y| = 4^5 - 4$$
Since |X| is the number of options in Step (2) above, by ROP, we have proven
the claim.
Note: In this proof, we came up with all of the relevant steps to show that
there are 10 · 4 = 40 possible straight flushes (straights of the same suit), only
1 · 4 = 4 of which are royal straight flushes (TJQKA of the same suit). Try to
write out those arguments for yourself!
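The structure of this proof (pick a lowest rank, then take all suit assignments U and discard the one-suit assignments Y) translates directly into code. In this sketch the string `RANKS` lists the Ace at both ends so that A2345 and TJQKA both appear as runs of five consecutive characters; that encoding is our own device, not notation from the text.

```python
from itertools import product

RANKS = "A23456789TJQKA"   # Ace listed at both ends: low in A2345, high in TJQKA
SUITS = "CDHS"

def straights():
    """Build every straight: a lowest rank, then suits that are not all equal."""
    hands = set()
    for start in range(10):                      # lowest rank: A, 2, ..., T
        run = RANKS[start:start + 5]
        for suits in product(SUITS, repeat=5):   # U: all 4**5 suit assignments
            if len(set(suits)) > 1:              # drop Y: the 4 one-suit assignments
                hands.add(frozenset(zip(run, suits)))
    return hands

print(len(straights()), 10 * (4**5 - 4))
```

Collecting the hands in a set means that any hand built twice would be stored once, so matching numbers also confirm that no straight arises from two different sets of choices.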

8.3.2 Other Card-Counting Examples


Let’s look at some related examples to broaden the class of techniques we’re
applying.
Example 8.3.4. At least 3 Aces
For this example, let’s count the number of poker hands that have at least three
aces. Again, let’s apply the technique we used above and think of the essential
properties of such a hand. Try to think of a few questions yourself, with the
goal being that the answers determine a unique hand and, given any answer, we
can count exactly how many ways to construct a hand that yields that answer.
Did you notice the difficulty? One of the answers to the questions directly
affects the nature of the other questions! This indicates some deeper mathemat-
ical issues at play. Perhaps it makes sense to have that determining question
come first, and then consider what decisions must be made from there.

First, IF there are exactly 3 Aces in the hand, then we need to determine
the characteristics of the other two cards. Those two cards are either (a) the
same rank or (b) two different ranks. Thus, there are two sub-cases for this
particular case. This yields the following procedure:

1. Choose 3 suits for the 3 Aces: $\binom{4}{3}$ options

(a) The remaining two cards are of different ranks:
i. Choose 2 ranks from the remaining 12 for the other 2 cards: $\binom{12}{2}$
options
ii. Choose a suit for the lowest-rank card chosen in Step i: $\binom{4}{1}$
options
iii. Choose a suit for the highest-rank card chosen in Step i: $\binom{4}{1}$
options
(b) The remaining two cards are of the same rank:
i. Choose 1 rank from the 12 non-Ace ranks: $\binom{12}{1}$ options
ii. Choose 2 suits for this rank: $\binom{4}{2}$ options


Then, by the Rules of Product and Sum (since we have separate cases in a
process), we find there are
$$\binom{4}{3} \left[ \binom{12}{2} \binom{4}{1}^2 + \binom{12}{1} \binom{4}{2} \right]$$

hands with exactly 3 Aces.


Second, IF there are exactly 4 Aces in the hand, we need to determine the
characteristic of the fifth card in the hand. This yields the following procedure:

1. Choose 4 suits for the 4 Aces: $\binom{4}{4}$ options

2. Choose 1 rank from the remaining 12 for the other card: $\binom{12}{1}$ options

3. Choose a suit for the card chosen in Step 2: $\binom{4}{1}$ options

We can apply the Rule of Product and conclude that there are then $\binom{4}{4} \binom{12}{1} \binom{4}{1}$
hands with exactly 4 Aces. Now, we must apply the Rule of Sum! What we
have here is a partition of the set of desired hands–those with at least three
Aces–into two subsets–those with exactly three Aces and those with exactly
four Aces. Since those subsets partition the larger set (i.e. every hand with at
least three Aces has either three Aces or four Aces, not both and not neither),
we may apply the Rule of Sum and conclude that there are
$$\binom{4}{3} \left[ \binom{12}{2} \binom{4}{1}^2 + \binom{12}{1} \binom{4}{2} \right] + \binom{4}{4} \binom{12}{1} \binom{4}{1}$$

poker hands with at least three Aces.



Recall that the rigorous statement of the Rule of Sum concerned cardinalities
of finite sets, and yet we didn’t technically get into those details in the previous
example. There is a certain amount of discretion and finesse required with these
types of combinatorial arguments. Is it obvious to you that every poker hand
with at least three Aces has either exactly three or exactly four Aces, not both and
not neither? We are not saying it should be totally obvious and you’re a dummy
for not seeing it right away! Far from it! What we are saying is that this type
of statement should probably suffice as an explanation in a proof. Yes, we could
dive into further detail, reformulate poker hands in terms of sets, and completely
rigorize the game of poker into set notation. What good would that really do,
though? It seems far easier to explain it via the italicized statement above. If we
were pressed for details by a confused reader, we could offer further explanation,
but for a general audience, this argument would suffice. Hopefully, this rule of
thumb–convincing a general audience, but being able to explain further when
pressed further–should guide you into making decisions about how much detail
to include in a counting argument. The essential observation here is that we
indicated why our choices pertain to a partition of the set of hands in question.
No, we didn’t rigorously prove the two sets were disjoint, but we offered an
explanation as to why.
Another approach to this problem does not involve considering the suits of
the non-Ace cards. Instead, we can approach the process of constructing a poker
hand with at least 3 Aces as follows:

1. If there are exactly 3 Aces:

(a) Choose 3 suits for the 3 Aces: $\binom{4}{3}$ options
(b) From the remaining 48 non-Ace cards, select 2 to “fill out” the 5 card
hand: $\binom{48}{2}$ options

2. If there are exactly 4 Aces:

(a) Choose 4 suits for the 4 Aces: $\binom{4}{4} = 1$ option
(b) From the remaining 48 non-Ace cards, select 1 to “fill out” the 5 card
hand: $\binom{48}{1}$ options

Thus, by the Rule of Sum (since we have partitioned the hands based on how
many Aces they have) and by the Rule of Product in each of the two cases, we
have
$$\binom{4}{3} \binom{48}{2} + \binom{4}{4} \binom{48}{1}$$
total poker hands with at least 3 Aces. You will see (and use) this approach
more often. The previous argument was more similar to the previous example
involving flushes, so that’s what we presented first. This argument is a bit
shorter and “slicker”, and is thus more commonly used. But wait a minute,
these answers look different! We were counting the same set of poker hands, so

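shouldn’t we expect the same final number? Well, yes, and we recommend that
you perform the requisite algebraic manipulations to convince yourself that
$$\binom{4}{3} \binom{48}{2} + \binom{48}{1} = \binom{4}{3} \left[ \binom{12}{2} \binom{4}{1}^2 + \binom{12}{1} \binom{4}{2} \right] + \binom{4}{4} \binom{12}{1} \binom{4}{1}$$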

It will only take a minute, and it is worthwhile.
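A quick numerical comparison is no substitute for the algebra, but it is a useful check on it. In the sketch below both arguments are written as functions of the number of non-Ace ranks (the parameter `other_ranks` is our own generalization, with 4 suits fixed), so that they can also be compared against a direct enumeration on a reduced deck.

```python
from itertools import combinations
from math import comb

def first_argument(other_ranks):
    """Exactly 3 Aces (other two cards split by rank pattern) plus exactly 4 Aces."""
    three = comb(4, 3) * (comb(other_ranks, 2) * comb(4, 1) ** 2
                          + comb(other_ranks, 1) * comb(4, 2))
    four = comb(4, 4) * comb(other_ranks, 1) * comb(4, 1)
    return three + four

def second_argument(other_ranks):
    """Fill out the hand directly from the non-Ace cards."""
    non_aces = 4 * other_ranks
    return comb(4, 3) * comb(non_aces, 2) + comb(4, 4) * comb(non_aces, 1)

def brute_force(other_ranks):
    """Inspect every 5-card hand; rank 0 plays the role of the Ace."""
    deck = [(r, s) for r in range(other_ranks + 1) for s in range(4)]
    return sum(1 for hand in combinations(deck, 5)
               if sum(r == 0 for r, _ in hand) >= 3)

print(first_argument(12), second_argument(12))                  # the standard deck
print(first_argument(5), second_argument(5), brute_force(5))    # reduced 24-card deck
```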


Before moving on to another problem, let’s look at a false argument about
this one. It may seem strange to look at wrong answers, but we know from
experience that it can be extremely helpful and instructive to try to find the
flaw in a faulty argument. Sure, we could just compare two large integers and
just say, “Hey, look, they’re different!” but this is not enlightening. Rather,
we want to follow a combinatorial argument and pinpoint the step that makes
a logical flaw or alters the set of objects we are counting in a flawed way. We
highly recommend this technique for several reasons. First, it gives you good
practice with reading proofs and understanding others’ arguments. This will
help you as you learn more mathematics and read other books that might not
explain things in exactly the same way. Second, it helps you become a better
editor of your own proofs. After writing up a homework problem, set it aside
for half an hour and come back to it with a fresh mind. Read it as if you didn’t
write it (as best you can, we understand you just can’t pretend you didn’t do
it!). Does it make sense? Are there certain steps that seemed obvious when
you wrote them but whose details escape you now? Is the answer even correct
and are you convinced by it? Third, recognizing when a bad step is made in a
proof solidifies your understanding of the principles underlying the argument.
Going through combinatorics arguments and identifying flaws will really help
your intuition and understanding of the Rules of Sum and Product. Trust us.
What do you make of this argument? Remember, this answer is incorrect,
and we want to know why!
Example 8.3.5. Find the Flaw! How many 5-card poker hands have at least
three Aces?
1. For hands that have three Aces:
(a) Choose 3 of the 4 Aces: $\binom{4}{3}$ options
(b) From the remaining 49 cards, choose 2 more: $\binom{49}{2}$ options
2. For hands that have four Aces:
(a) Choose 4 of the 4 Aces: $\binom{4}{4} = 1$ option
(b) From the remaining 48 cards, choose 1 more: $\binom{48}{1}$ options
Thus, there are
$$\binom{4}{3} \binom{49}{2} + \binom{4}{4} \binom{48}{1}$$
poker hands with at least 3 Aces.

What’s the problem here? Do you see any errors? Was the Rule of Product
applied inappropriately? Was the Rule of Sum applied to something that isn’t
actually a partition? Did we overcount? Undercount? Did we count some hands
that do not have the desired properties? Think about this before reading on.
Here’s what we noticed: this answer is too large. We overcounted by includ-
ing certain hands multiple times in our count. That is, every hand we sought
to count is included at least once by the steps above, but some hands can be
constructed in multiple ways via those steps. These observations guarantee that
our number is too large.
How did we know this? We recommend actively trying to identify a hand
that can be constructed in different ways by following the above steps. If you’re
reading through a proof and can do this, you know that the entire proof is now
flawed. In this case, let’s examine a hand that has exactly 4 Aces; specifically,
let’s look at the hand A♣A♠A♦A♥2♣. We can construct this hand by the
following paths through the steps:
1. Choose 3 of the 4 Aces: A♣A♠A♦
2. From the remaining 49 cards, choose 2 more: A♥2♣
Or, we could take this path:
1. Choose 4 of the 4 Aces: A♣A♠A♦A♥
2. From the remaining 48 cards, choose 1 more: 2♣
Do you see the problem now? This exact same hand is produced in (at least) two
distinct ways via the process outlined above. Thus, the answer is an overcount.
Are there any other ways we could construct this same hand? How many? Try
to identify another hand that is overcounted. Can we possibly identify how
many times every hand is overcounted and amend our answer that way?
This is an interesting (and very challenging, actually!) idea that we’ll return to
later.
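One way to hunt for overcounted hands is to carry out the flawed procedure literally and tally how often each hand appears. The sketch below does this with a `Counter`; the card encoding (two-character strings such as "AC") is our own convention. Run it after you have tried the questions above by hand.

```python
from collections import Counter
from itertools import combinations

RANKS = "A23456789TJQK"
SUITS = "CDHS"
DECK = [rank + suit for rank in RANKS for suit in SUITS]
ACES = [card for card in DECK if card[0] == "A"]

# Follow the flawed procedure literally, recording each hand it produces.
produced = Counter()
for n_aces, n_more in ((3, 2), (4, 1)):
    for aces in combinations(ACES, n_aces):
        remaining = [card for card in DECK if card not in aces]
        for more in combinations(remaining, n_more):
            produced[frozenset(aces + more)] += 1

print(sum(produced.values()))       # what the flawed argument counts
print(len(produced))                # how many distinct hands were produced
print(Counter(produced.values()))   # multiplicity -> number of hands with it

example = frozenset({"AC", "AS", "AD", "AH", "2C"})
print(produced[example])            # number of paths leading to the example hand
```

The first two numbers differ exactly because some hands are reached by more than one path, and the third line shows which multiplicities occur.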

Potential Flaws in Arguments


For now, we want to emphasize the technique of reading combinatorial proofs
and looking for some standard flaws:
• Misuse of Rule of Product:
The proof incorrectly applies the Rule of Product to a situation that
doesn’t warrant it. Perhaps the number of options at each step of the
procedure changes somehow, depending on how the previous steps are
completed. Or, perhaps different sequences of steps produce the same
outcome.
• Misuse of Rule of Sum:
The proof incorrectly applies the Rule of Sum to a situation that doesn’t
warrant it. Perhaps the sets of the “partition” are not actually disjoint.

Or, perhaps the union of the sets of the “partition” does not actually cover the
entire set in question.

• Overcount:
Every desired object is counted at least once, but some are counted more
than once. That is, some elements of the set in question can be counted
in multiple ways via the steps of the proof.

• Undercount:
Some desired objects are not counted at all. That is, some elements of the
set in question are not counted by the steps of the proof.

• Extraneous Count:
Some undesired objects are counted. That is, the steps of the proof count
some objects that are not elements of the set in question.

We recommend reading over your written proofs and trying to identify these
flaws, even if they aren’t there. Perhaps by struggling to find an overcounting
argument, say—by attempting to construct certain objects in multiple ways via
your steps—you actually identify a flaw you didn’t know was there! If you don’t
find any flaws, you can be more assured that your proof is fully correct.
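For small cases, these flaws can be hunted mechanically: list every outcome the procedure produces, one entry per way of carrying out the steps, and compare that list with the set you meant to count. The sketch below does this for a toy problem of our own choosing (the 2-element subsets of {1, 2, 3}, mistakenly counted by ordered picks). The function name `diagnose` is hypothetical.

```python
from collections import Counter
from itertools import combinations, permutations

def diagnose(produced, target):
    """Compare the outcomes a counting procedure produces (with repeats)
    against the set of objects we actually want to count."""
    counts = Counter(produced)
    target = set(target)
    return {
        "overcounted": {x for x in target if counts[x] > 1},
        "undercounted": {x for x in target if counts[x] == 0},
        "extraneous": set(counts) - target,
    }

# Desired objects: the 2-element subsets of {1, 2, 3}.
target = {frozenset(c) for c in combinations(range(1, 4), 2)}

# Flawed procedure: pick a first element, then a different second element.
# Each run is an ordered pair, but the order should not matter.
produced = [frozenset(p) for p in permutations(range(1, 4), 2)]

report = diagnose(produced, target)
print(len(produced), len(target))
print(report)
```

Here the procedure has more runs than there are desired subsets, and every subset lands in the "overcounted" group, since each is produced once per ordering of its two elements.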
Example 8.3.6. Here is a standard example of a naive overcount. We will show
how it is an overcount and then fix it by counting in a different way! Here is
the question:
How many 5-card hands have at least one card of each suit?
Here is an incorrect argument:
There are $\binom{13}{1}^4 \cdot \binom{48}{1} = 1370928$ such hands.
We can use a five-step process. In step 1, we select one of the 13
Hearts. In step 2, we select one of the 13 Diamonds. In step 3, we
select one of the 13 Spades. In step 4, we select one of the 13 Clubs.
There are $\binom{13}{1}$ ways to do each of these steps.
Next, from the remaining 48 cards, we select one of them to complete
our 5-card hand. By ROP, the claim above follows.

What’s wrong with this? Think about it carefully before reading on. Look at
the list of potential mistakes above; does one of them apply here? How would
you show this?

We think that this is an overcount. To show this, we will exhibit a particular
5-card hand that should be counted only once but is, in fact, counted at least
twice by the procedure outlined in the argument above.
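A check like this can also be done by brute force. The sketch below counts how many runs of the five-step procedure produce a given hand. The two-character card encoding (rank then suit, e.g. "KH" for the King of Hearts) and the function name `ways_to_build` are assumptions made for illustration.

```python
from itertools import product

def ways_to_build(hand):
    """Count the runs of the five-step procedure that yield `hand`:
    steps 1-4 pick one Heart, Diamond, Spade and Club; step 5 picks
    one more card from the 48 cards left in the deck."""
    hand = set(hand)
    # Cards of the hand grouped by suit, in the order the steps use them.
    by_suit = [[c for c in hand if c[1] == s] for s in "HDSC"]
    count = 0
    for picks in product(*by_suit):
        leftover = hand - set(picks)
        # The run produces this hand exactly when step 5 selects
        # the single card of the hand not already picked.
        if len(leftover) == 1:
            count += 1
    return count

# Four Aces of different suits plus the King of Hearts.
hand = ["AH", "AD", "AS", "AC", "KH"]
print(ways_to_build(hand))
```

Any result above 1 means the procedure counts that hand more than once.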
Consider the hand A♥, A♦, A♠, A♣, K♥. Notice that this hand can be achieved
by the above procedure in two ways:
