Probability and Functions in Card Games
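[Schematic diagram of g : A → B. The domain A = {a, b, c, d, e} and the codomain B = {1, 2, 3, 4, 5, 6} are drawn as ovals, with an arrow from each element of A to its output in B.]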
However, this does somehow represent the idea of a function. In this picture,
we have represented the domain A by an oval, and the same with the codomain
B. The elements of A and B are represented by dots inside those ovals (and
they are labeled), and we have drawn arrows between those dots based on what
the function g : A → B does.
Mostly, this method is used to explore a certain property of a function and
perhaps construct a counterexample to a claim. By drawing some dots and
arrows and playing around with how they connect, we can perhaps develop the
underlying structure of an example; then, we can go back and assign some names
and formulas to the parts of the diagram and make the picture more rigorous.
We will use some schematic diagrams to illustrate some properties and con-
cepts as we proceed, but these will always be accompanied by a more rigorous
statement or description. We encourage you to employ a similar method.
(1) Write down the definition of a function, without looking it up. Then,
compare to our definition. Does yours convey the same information? If not,
what did you miss?
(2) What is the difference between the domain and the codomain of a func-
tion?
Try It
Try answering the following short-answer questions. They require you to actu-
ally write something down, or describe something out loud (to a friend/class-
mate, perhaps). The goal is to get you to practice working with new concepts,
definitions, and notation. They are meant to be easy, though; making sure you
can work through them will help you!
(1) Use proper notation to define a function that inputs an integer and outputs
the square root of its absolute value.
What is the domain of this function? What is its codomain?
(2) Use proper notation to define a function that inputs a pair of natural num-
bers and outputs their average (arithmetic mean).
What is the domain of this function? What is its codomain?
(3) Let A = {−2, −1, 0, 1, 2}. Let g : A → A be defined by ∀x ∈ A. g(x) = x^2 − 3.
Draw a schematic diagram to determine whether g is well-defined
or not. Is it?
(4) Let X be any set. Use proper notation to define a function that inputs a
subset of X and outputs that set’s complement (in the context of X).
What is the domain of this function? What is its codomain?
(5) Let B = {−1, 0, 1}. Let h : B → B be defined by ∀b ∈ B. h(b) = b^3. What
special function is this equal to?
(6) Let f : Z × Z → N be defined by ∀(x, y) ∈ Z × Z. f(x, y) = 21 |x + 1| · |y|. Is
this a well-defined function? Why or why not?
Definition
Definition 7.3.1. Let A, B be sets and let f : A → B be a function. Let
X ⊆ A.
The image of X under the function f is written and defined as
Imf (X) = {b ∈ B | ∃a ∈ X. f(a) = b}
That is, the image of X under f is the set of all “outputs” that come from
“inputs” in the set X.
An equivalent way of writing this is
Imf (X) = {f (a) | a ∈ X}
(We will sometimes abbreviate the notation as just Im(X), when the function
is clearly identified and unambiguous, and consequently refer to the set as just
“the image of X”, instead of “the image of X under f ”.)
When we say the image of f , we mean the image of the entire domain, i.e.
Imf (A).
Notice that this is defined for any subset of the domain, X ⊆ A, so we can
talk about the image of any “piece” of the domain, or all of it. We will see some
examples now—and exercises later—that consider strict subsets X ⊂ A, as well
as A itself.
One Observation
Notice that
Imf (A) ⊆ B
no matter what f and A and B are. This follows by definition, since we used
set-builder notation to define the image via elements of B. In the next section,
we will explore what happens when Imf (A) = B.
For now, let’s practice identifying the images of certain functions. In some
cases, we will be provided with a function and its image and asked to verify this
claim, but in other cases, we will need to develop some techniques to figure out
what the image is in the first place!
Examples
Example 7.3.2. Define a function g : A → B by setting A = {a, b, c, d, e} and
B = {1, 2, 3, 4, 5, 6} and
g = {(a, 2), (b, 3), (c, 3), (d, 1), (e, 6)}
Define X1 = {a, b, c} and X2 = {a, c, e} and X3 = {c, d, e}.
You might notice that this is the same function we defined in the schematic
diagram in the last section! Let’s see that diagram again, because it can help
us identify the images in the following list.
484 CHAPTER 7. FUNCTIONS AND CARDINALITY
[Schematic diagram of g : A → B, with arrows a → 2, b → 3, c → 3, d → 1, e → 6.]
(1) Img ({a}) = {2}
This is because g(a) = 2.
Notice the use of set brackets. We always find the image of a set, so writing
Img (a) would be incorrect.
(2) Img ({b, c}) = {3}
This is because g(b) = g(c) = 3.
(3) Img (X1 ) = {2, 3}
This is because g(b) = g(c) = 3 and g(a) = 2.
(4) Img (X2 ) = {2, 3, 6}
This is because g(a) = 2 and g(c) = 3 and g(e) = 6.
(5) Img (X3 ) = {1, 3, 6}
This is because g(c) = 3 and g(d) = 1 and g(e) = 6.
(6) Img (A) = {1, 2, 3, 6}
Looking at the set B in the schematic diagram, we see that these are the only
values that are “hit” by the function. Notice 4, 5 ∈ B but 4, 5 ∉ Img (A), so
Img (A) ⊂ B (a proper subset).
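For a finite function like g, the images in this list can be checked mechanically. The following is a minimal sketch, assuming we store g as a Python dictionary; the helper name `image` is ours and does not come from the text.

```python
# g from Example 7.3.2, stored as a dictionary: input -> output.
g = {"a": 2, "b": 3, "c": 3, "d": 1, "e": 6}
B = {1, 2, 3, 4, 5, 6}  # the codomain
A = set(g)              # the domain {a, b, c, d, e}

def image(f, X):
    """Return {f(a) | a in X}; assumes X is a subset of the domain of f."""
    return {f[a] for a in X}

X1, X2, X3 = {"a", "b", "c"}, {"a", "c", "e"}, {"c", "d", "e"}

print(sorted(image(g, X1)))     # [2, 3]
print(sorted(image(g, X2)))     # [2, 3, 6]
print(sorted(image(g, X3)))     # [1, 3, 6]
print(sorted(image(g, A)))      # [1, 2, 3, 6]
print(sorted(B - image(g, A)))  # [4, 5], the codomain elements that are never hit
```

Notice that the set comprehension is a direct transcription of the second form of the definition, {f(a) | a ∈ X}.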
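Example 7.3.3. Consider the temperatures (in degrees Celsius) where water is
in its liquid state. Specifically, define the set
C = {c ∈ R | 0 < c < 100}
and define the function F : C → R by
∀c ∈ C. F(c) = (9/5)c + 32
What does the image ImF (C) represent? Let’s identify a set and prove that it is equal
to ImF (C), in the sense of sets. This means we will use a double-containment
argument!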
Solution: Define S = {y ∈ R | 32 < y < 212}. (Notice that this represents the
set of temperatures (in degrees Fahrenheit) where water is in its liquid state.)
We claim S = ImF (C).
It is hard to give advice about how to come up with claims like this, in
general. Most often, this relies on some playing around with the function and
testing some values, and perhaps some insight about some other properties of the
function. In this specific case, we notice that this function is increasing; that is,
if we have two input values with c1 < c2 , then we know that F (c1 ) < F (c2 ). We
can glean this information from the graph of the function (see above) and/or
recognizing it is a linear polynomial. Accordingly, to identify the image, we
just have to consider the smallest and largest inputs and identify their outputs.
(Again, we can glean this information from a graph.) We find that
F(0) = 0 + 32 = 32 and F(100) = 900/5 + 32 = 212
From this, we defined the set S. (Also, notice that we had to use “<” in the
inequality because, in fact, 0 ∉ C, the domain!) We also give this set a name so
that we can refer to it later without implicitly claiming, already, that it is the
image. This is a somewhat subtle distinction, but an important one! Now, let’s
prove our claim.
Proof. First, we’ll prove ImF (C) ⊆ S. (In other words, we’ll prove that every
output of the function F actually satisfies the inequality in the definition of S.)
(To do this we will start with an arbitrary element of ImF (C), and appeal to
the definition of image to bring an element of the domain into play.)
Let y ∈ ImF (C) be arbitrary and fixed. By the definition of image, this means
∃x ∈ C such that F (x) = y. Let such an x be given.
By the definition of C, we know 0 < x < 100. By the definition of F , we know
Together, this shows that s ∈ ImF (C), as well. Thus, S ⊆ ImF (C).
Overall, by a double-containment argument, we conclude that S = ImF (C).
The second half of our proof is certainly the harder part, and this is generally
true of proofs like this. In coming up with a candidate c, we essentially have to
“undo” the process that the function F does and find an input c for our given
output s. In a case like this, where the function is a numerical/arithmetical
operation on real numbers, the best route is to set up the desired equality, like
(9/5)c + 32 = s
7.3. IMAGES AND PRE-IMAGES 487
and solve the equality for c. This function is linear, so this process only produces
one such c but, in general, we might expect multiple values of c to work.
Ultimately, we only need one working value to complete this part of the proof,
so we can just select any one that works and use that as our claim. Sometimes,
though, this makes it harder to find such a value. It all depends on the example
at hand. Other times, we might be working with functions that aren’t defined
on sets of numbers, and we have to use some more abstract insight to come up
with a candidate element. Again, this all depends on the given situation, and
with practice, you’ll become much better at it!
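To make the “undo” step concrete, here is a small sketch for the temperature function. It assumes F(c) = (9/5)c + 32 with inputs satisfying 0 < c < 100, as used in the proof; solving (9/5)c + 32 = s for c gives the candidate c = (5/9)(s − 32). The helper name `candidate` is ours. We use exact fractions so that the equality checks are not affected by rounding.

```python
from fractions import Fraction

def F(c):
    """Celsius to Fahrenheit: F(c) = (9/5)c + 32."""
    return Fraction(9, 5) * c + 32

def candidate(s):
    """Solve (9/5)c + 32 = s for c, which gives c = (5/9)(s - 32)."""
    return Fraction(5, 9) * (s - 32)

# A few sample outputs s with 32 < s < 212.
for s in [33, 50, Fraction(197, 2), 211]:
    c = candidate(s)
    assert 0 < c < 100  # the candidate lies in the domain
    assert F(c) == s    # and F sends it to s
```

Checking a handful of values is only scratch work, of course. The proof still has to argue that the candidate works for an arbitrary s in the claimed image.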
Oh right, we asked what this image represents! Since the domain represented
the temperatures, in degrees Celsius, at which water is a liquid, the image
represents the temperatures, in degrees Fahrenheit, at which water is a liquid.
Let’s look at another example of proving the image of a function is a partic-
ular set.
Example 7.3.4. Define f : R → R by
∀x ∈ R. f(x) = x^2 / (1 + x^2)
Let’s determine the image, Imf (R), and prove our claim!
Here, again, we must use some outside strategies and intuition to identify the
image first. Using some techniques from calculus or algebra, we could plot the
graph of this function and try to guess the image. Try that if you’d like. You’ll
end up with this graph:
We can also recognize that the denominator is greater than the numerator and
so, as x gets larger and larger, those two quantities get closer and closer together,
relatively speaking. (That is, their ratio approaches 1.) Also, both terms are
nonnegative, since they involve squares, so their ratio is at least 0. In any event,
we can piece our observations together and make the following claim:
Next, we know that 0 ≤ x^2 < x^2 + 1, so x^2/(1 + x^2) < 1, as well. (Note: why was it
important to point out that x^2 ≥ 0? What can go wrong there?)
This shows that 0 ≤ x^2/(1 + x^2) < 1. Since y = f(x) = x^2/(1 + x^2), this is equivalent to
saying 0 ≤ y < 1.
Thus, y ∈ T , and so Imf (R) ⊆ T .
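Here is a quick numerical sanity check of this containment, together with some scratch work for the reverse direction. We are assuming that T is the set of real numbers y with 0 ≤ y < 1, which is how the argument above uses it. The candidate input x = sqrt(y / (1 − y)) comes from solving x^2/(1 + x^2) = y for x; the helper name `candidate` is ours.

```python
import math
from fractions import Fraction

def f(x):
    """f(x) = x^2 / (1 + x^2)."""
    return x * x / (1 + x * x)

# Every sampled output satisfies 0 <= f(x) < 1 (exact rational arithmetic).
samples = [Fraction(n, 4) for n in range(-40, 41)]
assert all(0 <= f(x) < 1 for x in samples)

def candidate(y):
    """For 0 <= y < 1, solve x^2 / (1 + x^2) = y for a nonnegative x."""
    return math.sqrt(y / (1 - y))

# Each sampled y in [0, 1) is (approximately) the output of its candidate.
for y in [0.0, 0.25, 0.5, 0.9, 0.999]:
    assert math.isclose(f(candidate(y)), y, abs_tol=1e-12)
```

The second loop uses floating-point numbers, so it compares with a tolerance instead of testing exact equality.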
∀(a, b) ∈ N × N. p(a, b) = ab + a
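(Rows are indexed by a, columns by b, and the entries are p(a, b) = ab + a.)
         b = 1   b = 2   b = 3   b = 4   b = 5
a = 1      2       3       4       5       6
a = 2      4       6       8      10      12
a = 3      6       9      12      15      18
a = 4      8      12      16      20      24
a = 5     10      15      20      25      30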
It looks like every natural number is “achieved” by the function p, except for 1.
Specifically, look at the top row of the array of values: there are all the natural
numbers except 1. Let’s use this insight in the following proof.
Proof. Let V = N − {1}. We claim V = Imp (N × N).
First, we prove Imp (N × N) ⊆ V . Let n ∈ Imp (N × N) be arbitrary and fixed.
This means n ∈ N and ∃(a, b) ∈ N × N such that p(a, b) = n. Let such (a, b) be
given.
This means n = ab + a. Since a, b ≥ 1, then ab ≥ 1 and so n = ab + a ≥ 2. By
the definition of V , this shows that n ∈ V .
Thus, Imp (N × N) ⊆ V .
(Try to write the next half of the proof before reading on and seeing ours! )
p(a, b) = p(1, v − 1) = 1 · (v − 1) + 1 = v − 1 + 1 = v
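As scratch work for this example, we can tabulate p on a small grid and test a witness for each v ≥ 2. This is only a sketch: the grid size 30 is an arbitrary choice of ours, and the witness (1, v − 1) is read off the a = 1 row of the table, where p(1, b) = b + 1.

```python
def p(a, b):
    """p(a, b) = ab + a, for natural numbers a, b >= 1."""
    return a * b + a

# All outputs on a small grid of inputs; the smallest one is p(1, 1) = 2.
N = 30
values = {p(a, b) for a in range(1, N + 1) for b in range(1, N + 1)}
print(min(values))   # 2
print(1 in values)   # False

def witness(v):
    """For v >= 2, return a pair (a, b) of naturals that we expect to satisfy p(a, b) = v."""
    return (1, v - 1)

for v in range(2, 200):
    a, b = witness(v)
    assert a >= 1 and b >= 1 and p(a, b) == v
```

The loop checks finitely many values only. The proof handles an arbitrary v ∈ V with the same algebra.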
Why didn’t we claim an equality here? It turns out that equality need not
hold, in fact! That is, there exists at least one function such that the reverse
containment—namely, Imf (S)∩Imf (T ) ⊆ Imf (S ∩T )—is False. We will provide
an example of such a function below.
(You should try to come up with an example of a function where this reverse
containment does hold. Together, we will have shown that one cannot make a
conclusion that necessarily holds about this containment!)
We will use a schematic diagram to come up with an example with the de-
sired properties. We will then use this to formally define a function and state its
properties, pointing out how they match what will be established in our claim.
We want to point out that employing this technique is perfectly valid, as long
as you go back and write down a formal definition afterwards. Turning in just
a schematic diagram as a “proof” is not rigorous enough, but this can certainly
help guide your intuition into producing fruitful ideas for a proof!
Furthermore, keep in mind that there is no need to construct the most com-
plicated or interesting counterexample in situations like this. If you’re trying
to disprove a universally-quantified statement, you just need one example that
works! In particular, don’t feel like you need to define a function that works
with numbers, using some formula. Sometimes, this will actually make your job
much harder! It’s typically the case that a counterexample can be made using
sets with just a few (i.e. two or three) elements each.
[Schematic diagram of f : A → B, showing the elements 1 and 2 in the domain A.]
Now, just to have a definition in hand, let’s choose S = {1, 2}. It seems like
it will be more reasonable to work with 2 elements in S, so we’ll make that
choice. Also, it seems like we should make f(1) ≠ f(2). Otherwise, Imf (S)
would contain only one element, and there would have been no point in making
S have two elements. So let’s define f(2), as well:
[Schematic diagram of f : A → B, with S = {1, 2} marked in the domain A and Imf (S) marked in the codomain B.]
Now, we need to choose T . It will be interesting to have S ∩ T ≠ ∅, but it would
be hard to handle (perhaps) if T ⊇ S. So, let’s say T = {2, 3}. Then, we just
need to choose f (3). In considering each of these cases, look at the schematic
diagram above, and imagine drawing an arrow to represent f (3).
• What if f (3) = f (2) = ?
In this case, Im( T ) = { }, so Imf (S)∩Imf (T ) = { }. But Imf (S∩T ) =
{ }, as well! This doesn’t work.
• What if f (3) is something else, like f (3) = , ?
This doesn’t work either! We will have Imf (S)∩Imf (T ) = { } = Imf (S∩
T ).
• What if f (3) = f (1) = ? ?
It looks like this works!
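[Schematic diagram of f : A → B, with S and T marked in the domain A, and with Imf (S) = Imf (T ) and the smaller set Imf (S ∩ T ) marked in the codomain B.]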
We have made it so that Imf (S) ∩ Imf (T ) is a strict superset of Imf (S ∩ T ).
Look back over our construction, and see if you understand our thought process.
What were the restrictions we had to conform to? Where did we have freedom
of choice? What did we decide to do?
We want to point out that this is absolutely not the only such example, though!
Try to come up with others!
Right now, all we have left to do is take the final diagram we constructed and
use it to define an example and then prove it works. Here we go!
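Before writing anything formal, it can help to check the construction mechanically. In this sketch, the domain elements 1, 2, 3 and the sets S = {1, 2} and T = {2, 3} follow the discussion above, with f(3) = f(1) and f(1) ≠ f(2). The labels "star" and "dot" for the elements of B are placeholders of ours, not names taken from the diagram.

```python
def image(f, X):
    """Return {f(a) | a in X}; assumes X is a subset of the domain of f."""
    return {f[a] for a in X}

# f(3) = f(1), and f(1) != f(2). The output labels are hypothetical.
f = {1: "star", 2: "dot", 3: "star"}
S, T = {1, 2}, {2, 3}

lhs = image(f, S & T)            # image of the intersection
rhs = image(f, S) & image(f, T)  # intersection of the images

print(sorted(lhs))   # ['dot']
print(sorted(rhs))   # ['dot', 'star']
print(lhs < rhs)     # True: lhs is a proper subset of rhs
```

A check like this does not replace the written proof, but it does confirm that the example has the properties we want before we commit to it.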
Definition
Definition 7.3.8. Let A, B be sets and let f : A → B be a function. Let Y ⊆ B.
The pre-image of Y under the function f is written and defined as
PreImf (Y ) = {a ∈ A | f (a) ∈ Y }
That is, the pre-image of Y under f is the set of all “inputs” that produce an
“output” in Y .
(We will sometimes abbreviate the notation as just PreIm(Y ), when the function
is clearly identified and unambiguous, and consequently refer to the set as just
“the pre-image of Y ”, instead of “the pre-image of Y under f ”.)
Think about this first: What is PreImf (B), where B is the entire codomain?
Look back at the definition: this is the set of all inputs (in A) whose outputs
“land” in B. That’s all of A, of course, since f is a well-defined function!
Accordingly, we will really only be working with sets Y ⊂ B, since those cases
are more interesting.
Examples
Example 7.3.9. This first example uses the same function we defined in the last
section when we discussed images. We’ll show you the schematic diagram again,
but spare you from re-defining all the details of the function. (See Example 7.3.2
for the details.)
[Schematic diagram of g : A → B, with arrows a → 2, b → 3, c → 3, d → 1, e → 6.]
Define Z1 = {1, 2, 3} and Z2 = {2, 3, 4} and Z3 = {4, 5, 6}.
Let’s identify the following pre-images and explain them.
(1) PreImg ({1}) = {d}
This is because g(d) = 1 and no other x ∈ A satisfies g(x) = 1.
(Note: We need to use set brackets here. “PreImg (1)” would make no sense.)
(2) PreImg ({4}) = ∅
This is because no x ∈ A satisfies g(x) = 4.
(3) PreImg (Z1 ) = {a, b, c, d}
This is because g(a) = 2, g(b) = g(c) = 3, and g(d) = 1, but no other x ∈ A
satisfies g(x) ∈ Z1 .
(4) PreImg (Z2 ) = {a, b, c}
This is because g(a) = 2 and g(b) = g(c) = 3, but no other x ∈ A satisfies
g(x) ∈ Z2 .
(5) PreImg (Z3 ) = {e}
This is because g(e) = 6, but no other x ∈ A satisfies g(x) ∈ Z3 .
(6) PreImg ({5}) = ∅
This is because ∀x ∈ A. g(x) ≠ 5.
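As with images, the pre-images of a finite function can be computed directly from the definition. Here is a minimal sketch, again storing g from Example 7.3.2 as a dictionary; the helper name `preimage` is ours.

```python
# g from Example 7.3.2, stored as a dictionary: input -> output.
g = {"a": 2, "b": 3, "c": 3, "d": 1, "e": 6}

def preimage(f, Y):
    """Return {a in the domain of f | f(a) in Y}."""
    return {a for a in f if f[a] in Y}

Z1, Z2, Z3 = {1, 2, 3}, {2, 3, 4}, {4, 5, 6}

print(sorted(preimage(g, {1})))   # ['d']
print(sorted(preimage(g, {4})))   # []
print(sorted(preimage(g, Z1)))    # ['a', 'b', 'c', 'd']
print(sorted(preimage(g, Z2)))    # ['a', 'b', 'c']
print(sorted(preimage(g, Z3)))    # ['e']
```

Notice that `preimage` always returns a set, even when that set is empty or has a single element. This matches the remark about set brackets in item (1).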
Example 7.3.10. Let f : R → R be the function defined by ∀x ∈ R. f(x) = x^2.
Let’s identify a few pre-images with this function. We will let you figure out
why our claims are valid, as well as how to explain and prove them, this time!
Notice how the proof below appeals directly to the formal definition of pre-
images. We will jump right in and prove both parts. The exercises will ask you
to investigate this claim with “∪” instead of “∩”.
You might read through this and think, “How does one come up with a proof
like this?” Well, there isn’t a whole lot of ingenuity behind a result like this. All
we did was appeal directly to definitions. Everything fell into place from there.
If you find yourself stuck while working on a problem, or you’re just unsure of
where to start . . . just write down the relevant definitions. Try to apply them
to the statement you’re trying to prove. See what happens!
Let’s work on one result that involves both of the concepts we have introduced in
this section. We will prove one containment and ask you to disprove the other
one in the exercises.
Answer the following questions briefly, either out loud or in writing. These
are all based on the section you just read, so if you can’t recall a specific defi-
nition or concept or example, go back and reread that part. Making sure you
can confidently answer these before moving on will help your understanding and
memory!
(3) Suppose g : R → R is a function. Why is the expression Img (0) not a proper
statement? What do you think the writer of such an expression meant?
Try It
Try answering the following short-answer questions. They require you to actu-
ally write something down, or describe something out loud (to a friend/class-
mate, perhaps). The goal is to get you to practice working with new concepts,
definitions, and notation. They are meant to be easy, though; making sure you
can work through them will help you!
(1) Let h : R − {−1} → R be defined by ∀x ∈ R − {−1}. h(x) = x/(1 + x).
Disprove the claim that this holds for any function f : A → B and any
Y ⊆ B by constructing a specific counterexample and proving that it works.
Definition
Definition 7.4.1. Let A, B be sets and let f : A → B be a function. We say f
is a surjective function if and only if Imf (A) = B.
Equivalently, we just say “f is surjective” (adjectival form), or that “f is a
surjection” (nounal form).
(The word “onto” is a fairly commonly used synonym for this term, so we will
mention it here but won’t use it again. This is just in case you’ve seen this word
somewhere else.)
Referring back to the definition of image, we can state this property equivalently
in the form of a quantified statement:
f is surjective ⇐⇒ ∀b ∈ B. ∃a ∈ A. f(a) = b
That is, f is surjective if and only if every output has at least one corresponding
input.
Think for a minute about why the second form of this definition is really the
same as the first one. The property that Imf (A) = B is a statement about sets.
We already know that, by definition, Imf (A) ⊆ B (nothing in the image can fall
“outside” of the codomain), so this further property means that B ⊆ Imf (A),
as well. This is precisely what the second form of the definition says: every
element of the codomain satisfies the defining property of being an element of
the image.
Also, notice that nothing about the definition says the a we find to corre-
spond to a b must be unique! All this property requires is that, for every b ∈ B,
we can identify at least one a ∈ A that satisfies f (a) = b. There might be more
than one, there might be exactly one. It doesn’t matter, as long as there aren’t
none.
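For functions between small finite sets, both forms of the definition can be tested directly. The following sketch uses g from Example 7.3.2 and the function h(b) = b^3 on {−1, 0, 1} from the earlier “Try It” questions; the helper names are ours.

```python
def is_surjective(f, codomain):
    """First form of the definition: the image of the whole domain equals the codomain."""
    return set(f.values()) == set(codomain)

def has_preimage_for_all(f, codomain):
    """Second form: every b in the codomain has at least one a with f(a) = b."""
    return all(any(f[a] == b for a in f) for b in codomain)

def missed(f, codomain):
    """Codomain elements with no corresponding input."""
    return set(codomain) - set(f.values())

g = {"a": 2, "b": 3, "c": 3, "d": 1, "e": 6}
B = {1, 2, 3, 4, 5, 6}
h = {b: b ** 3 for b in (-1, 0, 1)}

print(is_surjective(g, B))            # False
print(sorted(missed(g, B)))           # [4, 5]
print(is_surjective(h, {-1, 0, 1}))   # True
```

Notice that g sends both b and c to 3. That is not a problem for surjectivity: the definition only asks for at least one input per output.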
What does the property of being a surjection mean in terms of a schematic
diagram? Since every element of the codomain is “hit” by the function, this
means that every dot on the right-hand side of the schematic has an incoming
arrow. (Remember: this type of heuristic language is fine to keep in mind—
we are using it to help describe these concepts, after all—but this does not
constitute a proof. Any sentence of this sort that you use in a proof should
be accompanied by a more rigorous statement, using mathematical language
and/or logical symbols.) Why would we care about such a property? In general,
it can be difficult to declare exactly what the image of a function is, and we might
(at first) be able to only declare what the codomain is. Proving that, in fact, all
of the codomain elements are outputs of the function can be additional, helpful
information!
codomain and image are the same set. If we believe it is not a surjection, we
should prove that by finding a counterexample. Let’s look at the logical negation
of the statement that defines a surjective function:
¬(∀b ∈ B. ∃a ∈ A. f(a) = b) ⇐⇒ ∃b ∈ B. ∀a ∈ A. f(a) ≠ b
That is, to prove a function f is not a surjection, we must find an element of
the codomain that is not an element of the image. This involves some scratch
work and intuition to identify such a b. From there, we must somehow show
that no possible a satisfies f(a) = b. We might argue this directly by taking an
arbitrary a ∈ A and explaining why f(a) ≠ b. Alternatively, we might argue
this by contradiction: assuming that there is an a ∈ A such that f(a) = b, we
seek a contradiction. Either of these approaches is reasonable, and they are
logically equivalent.
Examples
Let’s see these techniques in action with a few examples. For some of them, we
might be able to use some graphical intuition or try a few test cases to figure out
a guess, but ultimately we need to settle in and prove some logical statements
to validate our claims.
Example 7.4.2. Consider p : N × N → N defined by p(a, b) = ab. Is p surjective?
Yes, it is! It looks like we can just allow a to be 1, so that the function outputs
whatever b is. Let’s make this observation more formal with a proof:
Example 7.4.3. Let C be the set of all cars in the United States. Let S be the
set of all strings of letters and digits that are of length at most 7 (i.e. these are
the potential strings you might see on a car’s license plate).
Let f : C → S be defined by inputting a car and outputting its license plate
string. Is the function f a surjection?
No, definitely not! In case you weren’t aware, curse words are disallowed on
license plates! So certainly, there exist many strings of letters that you will
never see on a license plate in the United States. (We’ll let you provide some
examples on your own . . . )
Because we have exhibited an element of S that is not an element of Imf (C)—or,
at least, you thought of an example—we have shown that f is not a surjection.
Example 7.4.4. Let d : N × N → Z be the function defined by
∀(a, b) ∈ N × N. d(a, b) = a − b
Let’s determine whether d is a surjection and prove our claim. We might start
by trying some “small values” for the input variables a and b. In the table below,
the left column is a and the top row is b, and the entries are d(a, b) = a − b:
         b = 1   b = 2   b = 3   b = 4   b = 5
a = 1      0      -1      -2      -3      -4
a = 2      1       0      -1      -2      -3
a = 3      2       1       0      -1      -2
a = 4      3       2       1       0      -1
a = 5      4       3       2       1       0
It looks like all of the integers z ∈ Z will appear in this table. However, they
don’t all appear in one particular row or column. Rather, it looks like all
the non-negative integers appear in the first column, while all the non-positive
integers appear in the first row. Let’s use these observations to write a proof.
We’ll take an arbitrary integer z ∈ Z and consider two cases; if z ≥ 0, we will
do one thing, and if z < 0, we will do something else. As long as we succeed in
both cases, we will have proven that d is a surjection.
Proof. We claim d is a surjection. Let z ∈ Z be arbitrary and fixed. WWTS
∃(a, b) ∈ N × N. d(a, b) = z. To do this, we consider two cases:
(1) Suppose z ≥ 0. Then define (a, b) = (z + 1, 1).
Since z ≥ 0, we know z + 1 ≥ 1 and so z + 1 ∈ N. This guarantees
(z + 1, 1) ∈ N × N.
Also, notice that d(z + 1, 1) = (z + 1) − 1 = z.
(2) Suppose z < 0. Then define (a, b) = (1, −z + 1).
Since z < 0, we know −z > 0 and so −z + 1 ≥ 2, meaning −z + 1 ∈ N. This
guarantees (1, −z + 1) ∈ N × N.
Also, notice that d(1, −z + 1) = 1 − (−z + 1) = z.
In either case, we are able to define (a, b) ∈ N × N such that d(a, b) = z. Since z ∈ Z was
arbitrary, this proves that d is surjective.
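The two cases of this proof translate directly into a function that returns a witness pair for any integer z. This is a sketch for experimenting with the construction; the name `witness` and the tested range of z are our own choices.

```python
def d(a, b):
    """d(a, b) = a - b, for natural numbers a, b >= 1."""
    return a - b

def witness(z):
    """Return the pair (a, b) chosen in the proof, so that d(a, b) = z."""
    if z >= 0:
        return (z + 1, 1)   # case (1): first column of the table
    return (1, -z + 1)      # case (2): first row of the table

for z in range(-50, 51):
    a, b = witness(z)
    assert a >= 1 and b >= 1   # (a, b) really is a pair of naturals
    assert d(a, b) == z        # and d sends it to z
```

Notice that the code has the same shape as the proof: one branch per case, and in each branch we check both that the pair is a legitimate input and that it produces the desired output.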
Example 7.4.5. Let g : R − {−1} → R be the function defined by
∀x ∈ R − {−1}. g(x) = x/(1 + x)
(Notice why we have removed −1 from the domain. This ensures g is a well-
defined function!)
Let’s determine whether g is a surjection and prove our claim. As mentioned
before, we can do some scratch work to figure out our claim: we could try
plugging in some values of x, testing “extreme cases” by letting x get very close
to −1 or letting x grow larger and larger . . . All of this can help us plot a graph
of the function, or we can just use some graphing software:
7.4. PROPERTIES OF FUNCTIONS 501
We will actually present two proofs here, for you to compare and contrast.
They both accomplish the same goal—showing g is not surjective—but one does
so by a contradiction method and the other by a direct method (using cases).
Which do you think is better? Did you come up with one of these? Which is
easier to read? We have no definitive opinion on these questions; they are both
equally valid proofs!
Proof 1 (Direct). Let x ∈ R − {−1} be arbitrary and fixed. WWTS that g(x) ≠ 1.
We consider two cases:
• Suppose x > −1. This means x + 1 > 0, and so 1/(x + 1) > 0. We also know
x + 1 > x (which is true for every x ∈ R).
By multiplying this inequality by the positive term 1/(x + 1), we deduce that
1 > x/(x + 1). Certainly, then, g(x) = x/(x + 1) ≠ 1.
• Suppose x < −1. This means x + 1 < 0, and so 1/(x + 1) < 0. We also know
x + 1 > x.
By multiplying this inequality by the negative term 1/(x + 1) and switching
the direction of the inequality, we deduce that 1 < x/(x + 1). Certainly, then,
g(x) = x/(x + 1) ≠ 1.
In either case g(x) ≠ 1. These cases cover all possibilities because x ∈ R − {−1}
was arbitrary (and we need not consider x = −1). This shows
1 ∉ Img (R − {−1})
so g is not a surjection.
Notice that this first proof proves an interesting qualitative observation about
the graph: that the function lies above the horizontal asymptote y = 1 to the left of
x = −1 and below the asymptote to the right of x = −1.
∀y ∈ R. y ∈ Img (R − {−1})
In particular, then, we know 1 ∈ Img (R − {−1}), so ∃x ∈ R − {−1}. g(x) = 1.
Let such an x be given.
This means g(x) = x/(x + 1) = 1. Multiplying both sides by x + 1, we find x = x + 1.
Subtracting x from both sides, we find 0 = 1, clearly a contradiction.
Therefore, 1 ∉ Img (R − {−1}), so g is not a surjection.
Notice that this second proof does prove that g is not a surjection, but it doesn’t
add any other information about how the function behaves (like the previous
proof did).
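The extra information in the direct proof can be seen numerically. The sketch below evaluates g(x) = x/(1 + x) at rational sample points on each side of x = −1, using exact fractions, and compares the outputs to 1. The sample points are arbitrary choices of ours.

```python
from fractions import Fraction

def g(x):
    """g(x) = x / (1 + x), defined for x != -1."""
    return x / (1 + x)

# Sample points with x > -1 and with x < -1 (multiples of 1/7).
right = [Fraction(n, 7) for n in range(-6, 100)]
left = [Fraction(n, 7) for n in range(-100, -7)]

assert all(g(x) < 1 for x in right)   # to the right of x = -1, g stays below 1
assert all(g(x) > 1 for x in left)    # to the left of x = -1, g stays above 1
assert all(g(x) != 1 for x in left + right)
```

This matches the two cases of the direct proof. Of course, finitely many sample points prove nothing on their own; the inequalities in the proof are what cover every x ∈ R − {−1}.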
Let’s move on from surjections and talk about a closely related property of
functions.
Definition
Definition 7.4.6. Let A, B be sets and let f : A → B be a function. We say f
is an injective function if and only if it has the property that
∀a1, a2 ∈ A. a1 ≠ a2 =⇒ f(a1) ≠ f(a2)
Taking the contrapositive of the conditional gives an equivalent form:
∀a1, a2 ∈ A. f(a1) = f(a2) =⇒ a1 = a2
This expresses the equivalent notion that “if two outputs are equal, they must
come from the same input”.
Think about how this definition conveys the notion we described above. Say
we have an injective function f : A → B, and let’s say we are given an element
b ∈ B. Does this definition say that there is at most one element x ∈ A such
that f (x) = b? What possibilities does the definition allow?
Motivation
¬(∀a1, a2 ∈ A. a1 ≠ a2 =⇒ f(a1) ≠ f(a2))
⇐⇒ ∃a1, a2 ∈ A. a1 ≠ a2 ∧ f(a1) = f(a2)
[Two schematic diagrams of functions f : A → B, side by side: one labeled “injective” and one labeled “NOT injective”.]
The non-injective function has two distinct domain elements that output the
same codomain element, whereas the injective function avoids this situation. It
might feel a little odd to phrase a property in this kind of negative sense—a
function is only injective if it doesn’t have . . . —but this is actually somewhat
common in mathematics. (We will even see this idea later on when we talk
about infinite sets, which are just . . . sets that are not finite!) This negative
formulation is easy enough to remember, and we can always relate it to another,
positive formulation: an injective function has only 0 or 1 inputs corresponding
to any given output.
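For a function on a small finite set, the negated form of the definition suggests a direct test: search for two distinct inputs with the same output. Here is a minimal sketch; the helper names are ours, and the sample functions are g from Example 7.3.2 and h(b) = b^3 on {−1, 0, 1}.

```python
from itertools import combinations

def collisions(f):
    """Return all pairs of distinct inputs (x, y) with f(x) = f(y)."""
    return [(x, y) for x, y in combinations(sorted(f), 2) if f[x] == f[y]]

def is_injective(f):
    """f is injective exactly when no such pair exists."""
    return not collisions(f)

g = {"a": 2, "b": 3, "c": 3, "d": 1, "e": 6}
h = {b: b ** 3 for b in (-1, 0, 1)}

print(collisions(g))     # [('b', 'c')]
print(is_injective(g))   # False
print(is_injective(h))   # True
```

Any pair returned by `collisions` is exactly the kind of counterexample the negation asks for: two distinct domain elements that output the same codomain element.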
Examples
Let’s think about how to prove/disprove the injectivity of functions. As you
might guess, the first two versions of the definition given above are useful when
trying to show a function is injective: take two distinct elements of the domain
and show their outputs are different, or take two equal outputs and show they
came from equal inputs. The negation can also be used to show a function is
injective via a proof by contradiction. Also, the third version is useful when
proving a function is not injective: a counterexample amounts to finding two
distinct inputs with the same output.
Let’s see these techniques in action with a few examples. In fact, we will
use some of the same examples we looked at in the previous section about
surjections!
Example 7.4.7. Consider p : N × N → N defined by p(a, b) = ab. Is p injective?
By trying some particular values of (a, b), we can see that p is definitely not
an injection. Pick any number that has two different factorizations, like 12 =
3 · 4 = 2 · 6. By letting (a, b) = (3, 4) and (c, d) = (2, 6), we can easily prove
this claim. But we can do this even more easily, by noting that the order of the
coordinates of an element like (a, b) matters!
Proof. This function is not injective. Let (a, b) = (1, 2) and (c, d) = (2, 1).
Notice that (a, b) ≠ (c, d) because 1 ≠ 2. Also, notice that p(a, b) = 1 · 2 = 2
and p(c, d) = 2 · 1 = 2. Thus, p(a, b) = p(c, d). This shows that p is not
injective.
Example 7.4.8. Let C be the set of all cars in the United States. Let S be the
set of all strings of letters and digits that are of length at most 7 (i.e. these are
the potential strings you might see on a car’s license plate).
Let f : C → S be defined by inputting a car and outputting its license plate
string. Is the function f an injection?
No, we don’t think so! The same license plate string could appear on different
cars that are registered in different states. Now, we don’t have any examples of
this on hand, so this isn’t a totally formal proof, but hopefully you see the idea.
Could we amend the function definition to make it an injection? Sure, we
could try! Consider also defining T to be the set of U.S. states. Let the function
g : C → S × T be defined by inputting a car and outputting the ordered pair of
that car’s license plate string and home state. This will be an injection, because
no two cars in the same state can have the same plate. (Again, this is not really
a formal proof; we are just trying to illustrate the concept of injectivity with a
non-numerical example.)
Example 7.4.9. Let d : N × N → Z be the function defined by d(a, b) = a − b.
Determine whether d is an injection and prove your claim.
It turns out d is not an injection! Notice that a − b = (a + 1) − (b + 1). We
can use this to find a counterexample:
Consider the pairs (2, 1) ∈ N × N and (3, 2) ∈ N × N. Notice that d(2, 1) = 1
and d(3, 2) = 1. Since (2, 1) ≠ (3, 2) and yet d(2, 1) = d(3, 2), we conclude that
d is not an injection.
Example 7.4.10. Let F : P(N) → P(Z) be defined by
∀X ∈ P(N). F(X) = ⋃_{a∈X} {a, −a}
Do you see what this function does? (Can you explain why it’s even a well-
defined function?)
Let’s show you a few examples to give you an idea:
F({1}) = ⋃_{a∈{1}} {a, −a} = {−1, 1}
F({1, 3, 5}) = ⋃_{a∈{1,3,5}} {a, −a} = {−1, 1} ∪ {−3, 3} ∪ {−5, 5}
F(N) = Z − {0}
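To experiment with F on finite inputs, we can implement it with Python sets. This is only a sketch for finite subsets of N (we cannot feed it all of N), and the helper `positive_part` is our own addition: it illustrates one reason to expect injectivity, namely that X can be recovered from F(X).

```python
def F(X):
    """Return the union of {a, -a} over all a in X, for a finite set X of naturals."""
    return {s * a for a in X for s in (1, -1)}

def positive_part(Y):
    """Return the positive elements of Y."""
    return {y for y in Y if y > 0}

print(sorted(F({1})))         # [-1, 1]
print(sorted(F({1, 3, 5})))   # [-5, -3, -1, 1, 3, 5]
print(sorted(F(set())))       # []

# On these samples, X is recovered from F(X) by keeping the positive elements.
for X in [set(), {1}, {1, 3, 5}, {2, 4, 6, 8}]:
    assert positive_part(F(X)) == X
```

If two inputs X and Y satisfy F(X) = F(Y), then applying `positive_part` to both sides suggests X = Y. Try turning that observation into a proof.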
We claim that F is an injection. Think about how to prove this before reading
our proof. In particular, think about the different strategies we might employ
here, based on the formal definition of injectivity. Might one strategy be more
fruitful than another?
506 CHAPTER 7. FUNCTIONS AND CARDINALITY
Think about how this proof might go if we used a different technique. Say
we started by assuming X, Y ∈ P(N) and that F (X) = F (Y ). Can we deduce
that X = Y ?
2. Define a = .
3. Show that a ∈ A.
2. Show that b ∈ B.
3. Deduce that x = y.
Alternatively:
2. Suppose that x ≠ y.
3. Show that f(x) ≠ f(y).
7.4.4 Bijections
You might have guessed what we have been building towards here. Think about
the two main properties of functions we just studied: surjectivity and injectivity.
What happens when a function has both of these properties? What if a function
has the property that, for every element of the codomain, there is at least one
corresponding element in the domain (surjectivity) and there is also at most one
such element (injectivity)? That’s right: for every output, there is exactly one
input! This is an incredibly nice property to have, and will be the foundation
for our forthcoming discussion of cardinality (i.e. the size of a set). Let’s make
a definition and then discuss some examples.
Definition
Definition 7.4.11. Let A, B be sets and let f : A → B be a function. We say
f is a bijective function if and only if f is both injective and surjective.
Equivalently, we just say “f is bijective” (adjectival form), or that “f is a bi-
jection” (nounal form).
We will sometimes say that f is a bijection between the sets A and B, instead
of saying “from A to B”. (The reason for this will become clear in the next
section!)
Notice that this definition is, logically speaking, an AND statement. For the
moment, anyway, the only technique we have to prove a function is bijective
is to just prove it is surjective and prove it is injective. Similarly, to prove
a function is not bijective, we need to prove it is either not surjective or not
injective. (It might be that both properties fail, but one such proof is sufficient
to show a function is not bijective.) Rather than go over these same techniques
(which are nicely summarized right before this section), we will just point out
whether some of the examples we have seen thus far are bijections or not.
Example 7.4.12.
(a) Let p : N × N → N be the function defined by p(a, b) = ab.
We proved that p is surjective but not injective, so it is not a bijection.
(b) Let d : N × N → Z be the function defined by d(a, b) = a − b.
We proved that d is surjective but not injective, so it is not a bijection.
(c) Let g : R − {−1} → R be the function defined by
$$\forall x \in \mathbb{R} - \{-1\} \;.\; g(x) = \frac{x}{1+x}$$
We proved that g is not surjective. (Specifically, we showed 1 ∉ Im_g(R − {−1}).)
We will ask you in this section’s exercises to prove that g is an
injection, though. Together this means g is not a bijection.
However, consider defining h : R − {−1} → R − {1} by the same “rule” as
g, i.e. ∀x ∈ R − {−1}. h(x) = x/(1 + x).
We asked you to prove in the exercises of Section 7.3.5 that this function
satisfies Im_h(R − {−1}) = R − {1}. This shows that h is a surjection.
Furthermore, we will ask you to prove in this section’s exercises that a
function defined in this way—by taking an injection, using the same “rule”,
and redefining the codomain to be the image—produces a bijection.
Together, all of this proves that h is a bijection from R − {−1} to R − {1}.
Example 7.4.13. Let’s look at one new example, specifically chosen to preview
some of the main ideas coming up ahead. Define E ⊆ N to be the set of all even
natural numbers:
$$E = \{e \in \mathbb{N} \mid \exists k \in \mathbb{N} \;.\; e = 2k\}$$
Motivation
The main idea behind a bijection f : A → B is that we can pair up the
elements of A and B and identify them with each other, one by one. This idea
follows from the definitions of both surjectivity and injectivity: every output
has exactly one corresponding input. Furthermore, think more carefully about
what we show in the proofs of such properties. In proving f is surjective, we
show we can “move” from the codomain back to the domain in at least one way,
and then in proving f is injective, we show that this is the only way to do it.
In a sense, we are showing how to “undo” the function f and reverse its action.
In fact, we are implicitly defining a new function from B back to A. Have you
previously talked about the inverse of a function? That is precisely what we
are rediscovering now! To make this notion of “moving back from the codomain
to the domain” rigorous enough, we need to have a brief discussion about how
to “combine” functions appropriately. Right after that, we will be able to give
a precise definition of what we mean by the inverse of a function, and relate
this to bijections. All of this happens in the next section.
can confidently answer these before moving on will help your understanding and
memory!
(1) Write down a definition of surjective in terms of an image. Then, write
down a definition of surjective in terms of quantifiers.
(2) Describe two different ways of proving that a function is injective.
(3) Can a function be both injective and surjective? If so, give an example.
(4) Can a function be neither injective nor surjective? If so, give an example.
(5) Consider the following schematic diagrams. For each one, declare whether
or not it is a function; and, if it is, declare whether or not it is (a) an
injection and (b) a surjection.
Try It
Try answering the following short-answer questions. They require you to actu-
ally write something down, or describe something out loud (to a friend/class-
mate, perhaps). The goal is to get you to practice working with new concepts,
definitions, and notation. They are meant to be easy, though; making sure you
can work through them will help you!
(1) Suppose f : R → R is an increasing function; that is, suppose
$$\forall x, y \in \mathbb{R} \;.\; x < y \implies f(x) < f(y)$$
Prove that f must be injective.
Then, prove that f need not be surjective by defining an increasing function
that is not surjective.
(2) Let g : R − {−1} → R be the function defined by
$$\forall x \in \mathbb{R} - \{-1\} \;.\; g(x) = \frac{x}{1+x}$$
Is g injective or not? Prove your claim.
(3) Give an example of a function f : P(N) → N that is surjective. Prove that
it is.
(Hint: Be careful about the fact that ∅ ∈ P(N). Also, consider looking at
Section 5.5.2 for some inspiration . . . )
(4) Give an example of a function F : N → P(N) that is injective. Prove that
it is.
Then, prove that your function F is not surjective.
(Note: Yes, we are asking you to prove your function is not surjective without
knowing what function you defined. We know we are right! You will learn
about our trick later in this chapter . . . )
7.5. COMPOSITIONS AND INVERSES 511
Let’s think about the schematic interpretation of functions for a moment. Imag-
ine that we have a function f : A → B and we also have a function g : B → C,
defined like this:
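[Figure: schematic diagrams of f : A → B and g : B → C, drawn side by side and then joined at the shared set B]
and then simply travel from A all the way to C, cutting out the middle man:
[Figure: schematic diagram of g ◦ f : A → C]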
This seems like a reasonable thing to do, right? Yes, of course it is! Whenever
we have mathematical objects at our disposal, we’re always curious about how
we can reasonably combine them and manipulate them and generalize them. In
the case of functions, we call this combination a composition of functions. You
might notice that such a composition really only makes sense if the codomain
of the “first” function and the domain of the “second” function are the same.
This is incorporated in the following definition.
Definition
Definition 7.5.1. Let A, B, C be sets, and let f : A → B and g : B → C be
functions. Consider the function h : A → C defined by
$$\forall a \in A \;.\; h(a) = g(f(a))$$
We call h the composition of f and g, and we write h = g ◦ f.
This incorporates all the ideas we mentioned above. It requires the
codomain of f (the “first” function applied) to be the domain of g (the “second”
function applied).
Another intuitive idea is to think of a function as a machine or a black box.
Elements of the domain go in and elements of the codomain come out. We don’t
necessarily know what the machine does; we only see what comes out. Now,
think of hooking up two machines, one for f and one for g; take the output of
f ’s machine and plug it into g’s machine. What comes out is an element of C.
We can take the work of these two machines and think of it as the work of one
bigger machine. This is what the composition g ◦ f does; it’s one larger machine
that takes the operations of two machines and does them in a specified order.
Notation
Notice the ordering of the notation g ◦ f and how it compares to the order in
which we apply the functions: f comes first, and then g, i.e. g(f (a)). In words,
we would read “g(f (a))” as “g of f of a”. In fact, if you find yourself having
trouble remembering this order, here’s a recommendation: read the “◦” out
loud as “after”. Thus, h = g ◦ f would mean “g after f ”, because we take an
element a ∈ A, apply f first, and then apply g.
It is also important to remember the notation of composed functions and to
distinguish the function g ◦ f itself from an application of the function g ◦ f to
some element x ∈ A. For instance, to write “g of f of x” using the “◦” notation,
we would write
(g ◦ f )(x)
because we are “hitting” the element x with the function g ◦ f . However, the
following notation makes no sense because it mixes up the ideas of functions
and elements:
g ◦ f (x)
Do you see the difference? The object f (x) is an element of B, the codomain of f .
But g is a function. What does it mean to compose a function with an element of
a set? This doesn’t work. Be careful with this, in general! This distinction will
be especially important when we have to compose several functions together,
like (h ◦ (g ◦ k) ◦ f )(z), where z is an element of f ’s domain, and f, g, h, k are
functions.
Examples
Example 7.5.2. Let C : R → R be defined by
$$\forall x \in \mathbb{R} \;.\; C(x) = x - 273.15$$
Let F : R → R be defined by
$$\forall x \in \mathbb{R} \;.\; F(x) = \frac{9}{5}x + 32$$
directly. We can compose the “rules” for the functions and find a formula for
this direct conversion:
$$\forall x \in \mathbb{R} \;.\; (F \circ C)(x) = F(C(x)) = F(x - 273.15) = \frac{9}{5}(x - 273.15) + 32 = \frac{9}{5}x - 459.67$$
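The composition above can also be checked numerically. The following is a minimal Python sketch; we read C as a Kelvin-to-Celsius conversion and F as a Celsius-to-Fahrenheit conversion, which is what the two rules suggest. The helper names `compose`, `F_after_C`, and `direct`, along with the sample temperatures, are our own hypothetical choices.

```python
def C(x):
    """Rule from the example: C(x) = x - 273.15."""
    return x - 273.15

def F(x):
    """Rule from the example: F(x) = (9/5)x + 32."""
    return 9 / 5 * x + 32

def compose(g, f):
    """Return the function g after f, i.e. x -> g(f(x))."""
    return lambda x: g(f(x))

F_after_C = compose(F, C)

def direct(x):
    """The simplified formula for (F after C)(x)."""
    return 9 / 5 * x - 459.67

# The two rules agree (up to floating-point rounding) on sample inputs.
for value in (0.0, 273.15, 300.0, 373.15):
    assert abs(F_after_C(value) - direct(value)) < 1e-9
```

Notice that `compose(F, C)` applies `C` first and `F` second, matching the "F after C" reading of the notation.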
Example 7.5.3. Let f : R → Z be the function defined by
$$\forall x \in \mathbb{R} \;.\; f(x) = \lfloor x \rfloor$$
(Recall that ⌊x⌋ is the floor of x: it is the largest integer z ∈ Z that satisfies
z ≤ x.) Let g : Z → N be the function defined by
$$\forall z \in \mathbb{Z} \;.\; g(z) = \begin{cases} -z & \text{if } z < 0 \\ z + 1 & \text{if } z \geq 0 \end{cases}$$
∀n ∈ N. h(n) = 2n − 1
Look at that, they’re the same rule! That is, we just proved that
(h ◦ g) ◦ f = h ◦ (g ◦ f )
in the sense of functions by showing that they yield the same output on every
allowable input.
Composition is Associative
There was nothing particularly special about the functions f, g, h used in the
previous example. The result we obtained is actually true in general. The
following theorem and its proof will show this. We are proving that function
composition is associative. This means that whenever we have a string of
compositions, we can move the parentheses around at will; we know that the
order in which we apply the parentheses doesn’t matter.
Theorem 7.5.5. Let A, B, C, D be any sets. Let f : A → B and g : B → C
and h : C → D be functions. Then,
h ◦ (g ◦ f ) = (h ◦ g) ◦ f
and
$$[(h \circ g) \circ f](x) = (h \circ g)(f(x)) = h(g(f(x)))$$
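To see associativity in action, here is a small Python sketch. The sample functions reuse rules that appear in the examples above (a floor function, a piecewise rule, and n ↦ 2n − 1), but any three functions with matching domains and codomains would do. The helper `compose` and the list `inputs` are our own hypothetical names, and a spot check like this illustrates the theorem rather than proving it.

```python
import math

def compose(g, f):
    """Return the function g after f, i.e. x -> g(f(x))."""
    return lambda x: g(f(x))

def f(x):
    return math.floor(x)               # f : R -> Z

def g(z):
    return -z if z < 0 else z + 1      # g : Z -> N

def h(n):
    return 2 * n - 1                   # h : N -> N

left = compose(compose(h, g), f)       # (h after g) after f
right = compose(h, compose(g, f))      # h after (g after f)

inputs = [-3.7, -1.0, -0.2, 0.0, 0.5, 2.9, 10.0]
print(all(left(x) == right(x) for x in inputs))  # True
```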
(Notice that this doesn’t assume any properties of g; it doesn’t even have to
be injective, necessarily! As an exercise, try to find an example of functions
f : A → B and g : B → C such that g ◦ f is injective and g is injective, and also
an example where g ◦ f is injective but g is not injective.)
Proof. Let x, y ∈ A be given. Suppose f (x) = f (y). WWTS x = y.
Since g is a well-defined function, g(f (x)) = g(f (y)).
This means (g ◦ f )(x) = (g ◦ f )(y).
Since g ◦ f is injective, x = y. This was our goal, so the claim is proven.
It turns out that the converse of the claim we just proved is False. Since
that claim is one about all functions, disproving it requires us to produce a
counterexample.
Proposition 7.5.7. Let A, B, C be sets and let f : A → B and g : B → C be
functions. Suppose f is injective. Then it is not necessarily the case that g ◦ f
is injective.
Try doing some scratch work on your own to come up with a counterexample
before reading about ours. Remember that you don’t need to find the most
interesting or complicated one, nor do you necessarily need one defined by a
rule; you just need to be able to define one!
Proof. We will exhibit a counterexample.
Define A = {1, 2} and B = {♥, ♦} and C = {⋆}.
Define f by setting f(1) = ♥ and f(2) = ♦.
Notice f is injective because f(1) ≠ f(2).
Define g by setting g(♥) = g(♦) = ⋆.
Notice g ◦ f is defined by (g ◦ f)(1) = ⋆ and (g ◦ f)(2) = ⋆.
This shows g ◦ f is not injective, because (g ◦ f)(1) = (g ◦ f)(2) but 1 ≠ 2.
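A counterexample this small can be checked mechanically. The Python sketch below stores each function as a dictionary; the strings "heart", "diamond", and "star" stand in for the symbols used in the proof, and the helper `is_injective` is a hypothetical name of our own.

```python
A = [1, 2]
f = {1: "heart", 2: "diamond"}             # f : A -> B
g = {"heart": "star", "diamond": "star"}   # g : B -> C

def is_injective(mapping):
    """A finite function is injective when no output value repeats."""
    values = list(mapping.values())
    return len(values) == len(set(values))

# Build g after f by feeding each output of f into g.
g_after_f = {a: g[f[a]] for a in A}

print(is_injective(f))          # True
print(is_injective(g_after_f))  # False
```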
7.5.2 Inverses
Motivation
As we said before, a bijection f : A → B has a very nice property, in that f
“pairs off” the elements of the two sets, A and B. Given an element a ∈ A,
there is exactly one element b ∈ B that satisfies f (a) = b. This is because f is
a well-defined function. But we also know that a is the only domain element
associated with b in this way. This is because f is a bijection. Because of this
unique association in both directions, we can think of “reversing” the action of
f . Given an element b ∈ B, identify the a that would produce that b. This
is what an inverse function does. Here, we will define it in terms of function
composition and identity functions. This is also the reason we say a bijection
is between two sets as opposed to just from one set to the other; as soon as we
have it one way, we know we can have it the other way, too!
Before we see the definition, let’s quickly recall the definition of the identity
function that we saw before. It plays an important role in the forthcoming
definition of inverse.
Definition: Given a set X, the identity function Id_X : X → X
is defined by ∀z ∈ X. Id_X(z) = z.
Definition
Notice that this definition doesn’t say anything about the functions being bi-
jections. This is purely a formal definition of what an inverse function means.
Afterwards, we will have to prove any claims about how inverses and bijections
are related.
Definition 7.5.8. Let f : A → B be a function. Suppose there is a function
g : B → A such that g ◦ f : A → A satisfies g ◦ f = Id_A and f ◦ g : B → B
satisfies f ◦ g = Id_B.
Then we say g is the inverse of f and write g = f⁻¹.
(Notice that some conditions are implicitly stated by the assumptions and
conclusions in the definition above. Specifically, it must be that B = Im_f(A), to
make sure g is a function. Likewise, A = Im_g(B).)
Example
Let’s look back at a function we saw before when we discussed bijections. With
your help in the exercises, we learned that this function is a bijection. Here, we
will find its inverse.
Example 7.5.9. Let h : R − {−1} → R − {1} be defined by
$$\forall x \in \mathbb{R} - \{-1\} \;.\; h(x) = \frac{x}{1+x}$$
To find a candidate function that will be the inverse of h, it usually helps to set
the “rule” for h equal to some new variable, and then solve for x.
Here, let’s say h(x) = y. How can we “reverse” this process and identify what
x is, in terms of y? Observe that we can make some algebraic steps, as follows:
$$\begin{aligned} h(x) = y &\iff \frac{x}{1+x} = y \\ &\iff (1+x)y = x \\ &\iff xy + y = x \\ &\iff y = x(1-y) \\ &\iff x = \frac{y}{1-y} \end{aligned}$$
This scratch work has given us a candidate for the inverse of h. We haven’t
proven anything with these observations! What we have to do now is make a
claim and then demonstrate, for the reader, all of the essential facts. Notice that
we took care to define a new function H, and used it to prove that H = h⁻¹,
in fact. It would be presumptuous and erroneous to define h⁻¹ and then work
with it. We are trying to show h has an inverse, so we can’t just declare it has
one at the beginning of our proof!
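Before writing the formal proof, it can be reassuring to spot-check the candidate. The Python sketch below assumes the candidate is H(y) = y/(1 − y), as the scratch work suggests, and uses exact rational arithmetic on a handful of sample points. The lists `xs` and `ys` are hypothetical samples of our own, and a check like this is evidence, not a proof.

```python
from fractions import Fraction

def h(x):
    """h : R - {-1} -> R - {1}, with h(x) = x / (1 + x)."""
    return x / (1 + x)

def H(y):
    """Candidate inverse from the scratch work: H(y) = y / (1 - y)."""
    return y / (1 - y)

# Rational sample points, skipping the excluded values -1 and 1.
xs = [Fraction(n, 7) for n in range(-20, 21) if Fraction(n, 7) != -1]
ys = [Fraction(n, 7) for n in range(-20, 21) if Fraction(n, 7) != 1]

print(all(H(h(x)) == x for x in xs))  # True
print(all(h(H(y)) == y for y in ys))  # True
```

Notice that both compositions are checked, in keeping with the definition of an inverse.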
2 in Section 7.5.4 below. It asks you to find an example where “one way”
yields the identity function but the “other way” does not, so that the proposed
function is actually not an inverse. Try to find several examples, if you can.
The more striking you make this point, the better!
∀x ∈ A. x ≠ a ⟹ f(x) ≠ f(a) = b
That is, we know this a is the unique element of A that satisfies f(a) = b. Let’s
define g(b) = a. This is a well-defined function because of these observations.
Next, observe that (f ◦ g)(b) = f(g(b)) = f(a) = b, so f ◦ g = Id_B.
Also, observe that (g ◦ f)(a) = g(f(a)) = g(b) = a, so g ◦ f = Id_A.
Therefore, g = f⁻¹, so f has an inverse.
Thus, f is injective.
Second, let’s show f is surjective. Let b ∈ B be given. Since f⁻¹ is a function, we
know ∃a ∈ A. f⁻¹(b) = a. Let such an a be given. Then observe that f⁻¹(b) = a,
so f(a) = f(f⁻¹(b)) = Id_B(b) = b.
Inverse of an Inverse
The following corollary follows immediately from the theorem above. We call
it a corollary and not its own theorem because it doesn’t really assert anything
amazingly new; rather, its conclusion comes from applying the theorem above,
as you’ll see in the proof.
Inverse of a Composition
Before we move on to some exercises and the next section, let’s get your help in
putting together the main ideas of this chapter so far. Specifically, we are going
to state two results here. The proofs are left for you in the chapter exercises. By
working through those proofs, you will (a) solidify your understanding of many
of the concepts introduced so far—functions, jections, compositions, inverses—
and (b) obtain a helpful result about how to define the inverse of a composition
of functions!
(1) Is the composition of functions associative? (That is, does the order of
parentheses matter?) Why or why not?
Try It
Try answering the following short-answer questions. They require you to actu-
ally write something down, or describe something out loud (to a friend/class-
mate, perhaps). The goal is to get you to practice working with new concepts,
definitions, and notation. They are meant to be easy, though; making sure you
can work through them will help you!
(1) Let O be the set of odd natural numbers and let E be the set of even natural
numbers. Define a function f : O → E that is a bijection and prove that
it is so by finding its inverse.
(2) In this problem, we want you to construct an example that shows the im-
portance of verifying both compositions yield the identity function when
we’re trying to find the inverse of a function.
Define sets A, B and functions f : A → B and g : B → A such that
$$\forall x \in A \;.\; g(f(x)) = x$$
but
$$\exists y \in B \;.\; f(g(y)) \neq y$$
(Suggestion: You might find an example where A and B both only have
one or two elements . . . Or, you might find an example where A = B = N.)
7.6 Cardinality
7.6.1 Motivation and Definition
One important reason for caring about bijections is that they allow us to com-
pare the sizes of sets! This is a notion for which you have some intuition. For
example, it’s pretty clear that the set
{1, 2, 3, 4, 5}
N = {1, 2, 3, 4, 5, . . . }
Comparing Cardinalities
As we mentioned, when S is infinite, we use |S| to compare the cardinality of
S to that of other sets. We won’t write something like |S| = ∞. Rather, we will
write something like |S| = |T | to indicate that S and T have the same cardinality,
whatever that may be. We might also write something like |S| < |P | to indicate
P has a strictly larger cardinality than S. The following definition tells us how
the comparison of cardinalities is based on functions and, specifically, different
kinds of jections.
Definition 7.6.4. Let S, T be any sets.
• We write |S| = |T | if and only if there exists a bijection f : S → T .
In this case, we say S has the same cardinality as T .
• We write |S| ≤ |T | if and only if there exists an injection f : S → T .
In this case, we say S has cardinality at most |T |.
• We write |S| < |T | if and only if |S| ≤ |T | and |S| ≠ |T |.
In this case, we say S has a strictly smaller cardinality than T .
• We write |S| ≥ |T | if and only if there exists a surjection f : S → T .
In this case, we say S has cardinality at least |T |.
• We write |S| > |T | if and only if |S| ≥ |T | and |S| ≠ |T |.
In this case, we say S has a strictly larger cardinality than T .
Let’s explain the motivation behind these definitions in two different ways:
In general, f : A → B being an injection tells us |A| ≤ |B| and g : A → B
being a surjection tells us |A| ≥ |B|. Think about schematic diagrams for the
functions f and g to see why this definition makes sense. Having an injection
from A → B means we can definitely “pair” the elements of A to elements of B
without overlapping, but perhaps there are “many more” elements of B left over.
Likewise, having a surjection from A → B means we can definitely “cover” all of
B with elements of A, but maybe we had to overlap sometimes to do this, so A
could have “more” elements than B. Having both of these situations together
(i.e. a bijection from A to B) means that A and B actually have the same
cardinality: we can pair off all their elements. This is an intuitive explanation
to motivate these definitions, mind you. These types of explanations are not
rigorous proofs. But now that we have made these definitions, we can use them
to prove and disprove statements! To compare cardinalities of sets—even infinite
ones—we just need to find a function with an appropriate property. All of our
work in the rest of this chapter will be quite helpful in our journey through the
Kingdom of Cardinality.
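For small finite sets, we can test these definitions by brute force: list every function from S to T and look for one with the desired property. The Python sketch below does exactly that. All of the helper names, and the sample sets S and T, are our own hypothetical choices, and the approach is only feasible for tiny sets since there are |T|^|S| functions to inspect.

```python
from itertools import product

def functions(S, T):
    """Yield every function S -> T as a dict (S and T are finite lists)."""
    for outputs in product(T, repeat=len(S)):
        yield dict(zip(S, outputs))

def is_injective(f):
    return len(set(f.values())) == len(f)

def is_surjective(f, T):
    return set(f.values()) == set(T)

def exists_injection(S, T):
    return any(is_injective(f) for f in functions(S, T))

def exists_surjection(S, T):
    return any(is_surjective(f, T) for f in functions(S, T))

S = ["a", "b"]
T = [1, 2, 3]
print(exists_injection(S, T))   # True
print(exists_surjection(S, T))  # False
print(exists_surjection(T, S))  # True
```

Here an injection from S to T exists but a surjection does not, which matches the intuition that a two-element set is strictly smaller than a three-element set.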
Another way to think about these definitions is that “has the same cardi-
nality as” is an “equivalence relation” on the “set of all sets”. We have to put
quotes around these phrases because, as we explained in detail in Section 3.3.5
about Russell’s Paradox, there is no such thing as the “set of all sets”. Thus,
it doesn’t make mathematical sense in our context to talk about an equivalence
relation on that “set”. In some fuzzy sense, though, this is what’s going on:
• Given any set S, there is certainly a bijection with itself: the identity
function, IdS : S → S. This shows |S| = |S|, i.e. the “has the same
cardinality as” relation is “reflexive”.
Again, this is not exactly what’s going on, but it can really help you sort through
these difficult, abstract ideas. We are establishing a way to take any two sets and
compare their cardinalities using functions. All of the sets in the universe will be
“partitioned” into different “classes” based on their cardinalities. What’s truly
amazing is what we are about to prove for you: that there are infinitely-many
cardinalities.
Cantor’s Theorem
The following result and proof are due to the German mathematician Georg
Cantor from the mid- to late-1800s. By now, mathematicians have fully em-
braced the result and its consequences. However, at the time, this idea was so
controversial that some mathematicians refused to believe him. In time, though,
his work and ideas helped lead to the development of formal set theory.
The proof of this particular result is known as Cantor’s Diagonalization
Argument. We will use an argument like this later on, where we will point out
why it is like a “diagonal”. For now, we are more interested in the conclusion
of this theorem.
This says that the power set of a set always has strictly larger cardinality
than the set itself. This makes sense for finite sets. You discovered already
that the power set of [n] has 2^n elements, i.e. |P([n])| = 2^n. (You will prove
this by induction, using results about cardinality, in Problem 7.8.30.) We see,
indeed, that n < 2^n for every n ∈ N. However, this theorem also asserts that
this relationship holds for infinite sets. Wow! Immediately, this tells us that
there is a whole chain of infinite sets, each one bigger than the previous one.
We can just keep taking the power set of what we had before:
|N| < |P(N)| < |P(P(N))| < |P(P(P(N)))| < ···
Let’s prove this theorem. The proof is very short and clever, so don’t worry
about how to come up with such an argument. Focus on understanding the
logical flow.
• If Y ∉ T, then the definition of T says Y ∈ g(Y). However, g(Y) = T, so
this means Y ∈ T. This is a contradiction.
Look back at Exercise 4 in Section 7.4.5. Notice that we asked you to define a
function from N to P(N), and then we asked you to prove it was not surjective.
We didn’t have to know what your function was! Since we were aware of this
theorem, we knew you couldn’t possibly have defined a surjection!
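The heart of the argument can be watched at work on a small finite set. The Python sketch below takes S = {1, 2, 3}, runs through every one of the 512 functions g : S → P(S), and builds the set T of elements x with x ∉ g(x) each time. We assume that this is the set T used in the proof. The helper names `power_set` and `diagonal_set` are hypothetical. This only illustrates the finite case; the theorem itself needs the proof.

```python
from itertools import combinations, product

def power_set(S):
    """Return all subsets of the finite set S as frozensets."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(S, g):
    """T = {x in S : x is not an element of g(x)}."""
    return frozenset(x for x in S if x not in g[x])

S = {1, 2, 3}
subsets = power_set(S)
elements = sorted(S)

missed_every_time = True
for choice in product(subsets, repeat=len(elements)):
    g = dict(zip(elements, choice))   # one function g : S -> P(S)
    T = diagonal_set(S, g)
    if T in g.values():               # would mean g hits T
        missed_every_time = False

print(missed_every_time)  # True
```

Every g misses its own set T, so none of them is surjective.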
Theorems
For each of these results, we will state a theorem/proposition/lemma, and either
prove it or have you help us with the proof via some exercises.
Theorem 7.6.7. Suppose A, B are disjoint finite sets. Then |A∪B| = |A|+|B|.
Play around with some examples to see why this claim is True. Do you see
why we need the sets to be disjoint for this to work? Can you prove this claim?
Remember that we want to find a bijection between the two sets . . .
(That is, we suppose |A| = a and |B| = b). Let such a, b, f, g be given.
WWTS |A ∪ B| = |A| + |B| = a + b; that is, WWTS there is a bijection
h : A ∪ B → [a + b].
Define the function h : A ∪ B → [a + b] by
$$\forall x \in A \cup B \;.\; h(x) = \begin{cases} f(x) & \text{if } x \in A \\ g(x) + a & \text{if } x \in B \end{cases}$$
= Id_[b](y − a) + a = (y − a) + a = y
where we used the fact that g⁻¹(y − a) ∈ B.
In either case, we find (h ◦ H)(y) = y, and both cases are disjoint and cover all
possibilities.
Next, let’s show that H ◦ h = IdA∪B . Let x ∈ A ∪ B be given. We have two
cases.
(1) Suppose x ∈ A. Then,
(H ◦ h)(x) = H(h(x)) = H(f(x)) = f⁻¹(f(x)) = Id_A(x) = x
In either case, we find (H ◦ h)(x) = x, and both cases are disjoint and cover all
possibilities.
Thus, H = h⁻¹, so h has an inverse. Therefore, h is a bijection.
Therefore, |A ∪ B| = |[a + b]| = a + b = |A| + |B|.
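The construction of h in this proof is easy to carry out on a concrete pair of sets. In the Python sketch below, the bijections f : A → [a] and g : B → [b] are simply "position in the list", which is one convenient choice among many. The function name `union_bijection` and the sample sets are hypothetical.

```python
def union_bijection(A, B):
    """Build h : A ∪ B -> [a + b] as in the proof.

    A and B are lists with no repeats and no common elements.
    """
    a = len(A)
    f = {x: i + 1 for i, x in enumerate(A)}   # a bijection A -> [a]
    g = {x: i + 1 for i, x in enumerate(B)}   # a bijection B -> [b]
    h = {}
    for x in A:
        h[x] = f[x]          # elements of A keep their label
    for x in B:
        h[x] = g[x] + a      # elements of B are shifted past a
    return h

A = ["x", "y", "z"]
B = ["p", "q"]
h = union_bijection(A, B)
print(sorted(h.values()))  # [1, 2, 3, 4, 5]
```

Each number from 1 to 5 appears exactly once as an output, so this h is a bijection onto [5]. If A and B shared an element, the second loop would overwrite its label and the count would come up short, which is why disjointness matters.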
You can use the two results above to prove the following generalization:
You should also look at Problem 7.8.32 in this chapter’s exercises. There,
we guide you through a proof (by induction on two variables!) about the size
of the Cartesian product of two finite sets.
• Suppose all the rooms are full. It’s a very busy weekend. One guy walks
into the lobby looking for a room. Can we squeeze him in? If not, why?
If so, how?
It turns out that we can! We can just shift all the guests down one room
and place this new guy into Room 1.
The catch, though, is to take advantage of our loudspeaker system. If
we had to go and knock on everyone’s door telling them to move down
one room, we would never actually finish; we would spend all of eternity
knocking on doors and delivering messages.
Instead, we make the following announcement:
After five minutes, the guests have all moved, and Room 1 is vacant for
our new guest.
Morally speaking, we have just verified that the set N and the set N ∪ {⋆}
have the same cardinality, for any particular object ⋆. In particular, say,
|N| = |N ∪ {0}|. Our hotel has only countably many rooms, and we have
accommodated one person associated with each natural number, as well as
one more person.
• It’s the next day. Our rooms are still full. Suppose a Scrabble convention
with countably infinitely many people shows up. The people are all wear-
ing nametags with natural numbers on them, so there is Person 1, Person
2, Person 3, . . . .
Can we accommodate these folks? How can we assign them rooms? How
do we move around the guests currently in the hotel?
It turns out that we can! The idea is to free up an infinite set of rooms.
After five minutes, every hotel guest has moved, and after another five
minutes, every convention-goer has found their room. Voilà!
Morally speaking, we have just verified that the union of two disjoint
countably infinite sets is countably infinite, as well. That is, we took the
set A of current hotel guests (notice A is countably infinite) and the set
B of convention guests waiting for rooms (notice B is countably infinite,
and notice that A ∩ B = ∅) and found a bijection between A ∪ B and N,
where N represents the set of Rooms.
The catch here is that we cannot apply the same method as the previous
two cases over and over. Yes, we can squeeze in Convention 1 using that
method. After that’s done, we would squeeze in Convention 2. And so
on. But never would we get to all of the conventions. It’s the same
problem we had before where knocking on every individual door would
take forever to accomplish; we needed to send a message to everyone at
once. Likewise, here, we need to send a message to all of the hotel guests,
and then a message to all of the convention-goers waiting outside the door.
It needs to be a general “formula” about which room to go to.
If it helps, think of this from the other side of the situation. Pretend you
are in Convention x and you are Person y in that convention. You are
eagerly awaiting a comfortable bed to sleep in for the night. You want to
know exactly what room to go to. ASAP. You don’t want to wait around
and see all of the conventions ahead of you given rooms, one by one. You
want to all go in at once and find your corresponding rooms.
Here’s one way to do it. Let’s take advantage of the structure of the
prime numbers. We know there are countably infinitely many primes,
and that for any two different primes p and q (i.e. p ≠ q), it is true that
p^k ≠ q^k for any natural number k. With this in mind, we see that, by assigning
individual conventions to the rooms that are powers of a corresponding
prime number, we can ensure that no two (potential) guests get assigned
to the same room. We make the following announcement to our current
hotel guests:
Attention guests: If you find yourself in Room n, please move
to Room 2^n. Thank you!
We then make the following announcement to the conventions waiting
outside the door:
Attention convention-goers:
If you are Person number k from Convention number 1, please
go to Room 3^k.
If you are Person number k from Convention number 2, please
go to Room 5^k.
If you are Person number k from Convention number 3, please
go to Room 7^k.
In general, if you are Person number k from Convention number
n, please go to the Room numbered by the (n + 1)-th prime
number raised to the k-th power.
Thank you!
(Note: We are assuming that all of our guests and potential guests are
math genii, and they can quickly figure out what the (n + 1)-th prime
How could we have been more “efficient” about this? Is there a certain
announcement we could make so that all the rooms are filled?
Morally speaking, we just verified that N and N × N have the same car-
dinality. We had countably infinitely many conventions with countably
infinitely many people in each, so every person we wanted to accommodate
corresponds to an ordered pair of natural numbers, where the first coordi-
nate is their Person number and the second coordinate is their Convention
number. Since we were able to match this set of people with the set of
rooms (which corresponds to N), then we showed N × N is countable.
(Note: We actually “overdid” it and found a way to embed the set N × N
in a strict subset of N!)
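The prime-power announcements can be simulated for a finite slice of the hotel. The Python sketch below assigns rooms to the first 10 current guests and to the first 10 people from each of the first 10 conventions, then checks that no room is assigned twice. The function names and the cutoff of 10 are our own hypothetical choices, and the prime finder uses plain trial division for simplicity.

```python
def nth_prime(n):
    """Return the n-th prime (1-indexed) by trial division."""
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return candidate

def room_for_guest(n):
    """Current guest in Room n moves to Room 2^n."""
    return 2 ** n

def room_for_convention_goer(convention, person):
    """Person k of Convention n goes to ((n + 1)-th prime)^k."""
    return nth_prime(convention + 1) ** person

rooms = [room_for_guest(n) for n in range(1, 11)]
rooms += [room_for_convention_goer(c, k)
          for c in range(1, 11) for k in range(1, 11)]

print(len(rooms) == len(set(rooms)))  # True
print(room_for_convention_goer(1, 2), room_for_convention_goer(3, 1))  # 9 7
```

Notice that plenty of rooms (Room 1, Room 6, Room 10, and so on) are never used, which is the sense in which this assignment "overdoes" it.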
This hopefully gives you a flavor for how to think about countable infinities.
One important point to keep in mind is that infinity is a cardinality, not a
number, in our context here. It’s not as if the natural numbers “keep going”
and there’s some magical number ∞ lying out there past them all. Here, we
refer to countably infinite as a cardinality; it represents how “big” something
is. It’s more like a magnitude than a position.
Examples
Let’s take some of the ideas conveyed by the Hilbert Hotel examples and
express them more formally. We’ll make use of injections and surjections and
bijections. (Oh my!) The following result will be helpful as we go along, so let’s
prove it now.
Proof. Define the “identity function” f : S → T , given by ∀x ∈ S. f(x) = x.
Since S ⊆ T , this is a well-defined function.
(Note: We couldn’t technically define this as the usual identity function IdS ,
because the domain and codomain might not be equal sets; in essence, f does
the same action as IdA but has a different codomain).
Notice that f is injective!
(Note: It’s not necessarily bijective, because it might be that S 6= T .)
Since f is injective, this tells us that |A| ≤ |B|.
7.6. CARDINALITY 535
You might be wondering why we can’t conclude |S| < |T | here. Why is it “≤”
instead? Certainly, {1, 2} ⊆ {1, 2, 3} and |{1, 2}| = 2 < 3 = |{1, 2, 3}|. This is
true for finite sets, but as we shall see in this section, there are infinite sets
that have strict subsets of equal cardinality!
Example 7.6.12. Z is countably infinite:
We know N is countably infinite by definition. The identity function IdN : N → N
is obviously a bijection, so N is countable.
In this example, we will prove that Z is countably infinite! To accomplish this,
we need to find a bijection f : Z → N. We will state one here and then prove
it is a bijection by finding its inverse. Before reading on, try to find a bijection
on your own! Maybe you’ll come up with a different function than ours! If you
need a hint for coming up with one, think about this: to prove an infinite set
is countably infinite, we want to find a way to start listing the elements one by
one. Try to find a pattern that identifies the “1st” integer, and then the “2nd”,
and then the “3rd”, . . .
We chose this function because it “pairs off” the integers with the natural
numbers like this:
. . . , −3, −2, −1,  0,  1,  2,  3, . . .
         ↕   ↕   ↕   ↕   ↕   ↕   ↕
. . . ,  8,  6,  4,  2,  1,  3,  5, . . .
(That is, we are pairing off the even naturals with the non-positive integers, and
the odd naturals with the positive integers. Looking at this correspondence, we
can see how to “reverse” it. This is how we will find f ’s inverse.)
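Next, define F : N → Z by
\[ F(n) = \begin{cases} -\frac{n}{2} + 1 & \text{if } n \text{ is even} \\ \frac{n+1}{2} & \text{if } n \text{ is odd} \end{cases} \]
• Suppose n is odd. Then F (n) = (n + 1)/2. Notice that n + 1 ≥ 2 and so
(n + 1)/2 ≥ 1. This means
\[ (f \circ F)(n) = f(F(n)) = f\left(\frac{n+1}{2}\right) = 2 \cdot \frac{n+1}{2} - 1 = \frac{2n+2}{2} - 1 = (n+1) - 1 = n \]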
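A quick numerical check can make the pairing concrete. The sketch below assumes the formula for f that the pairing table suggests, namely f(z) = 2z − 1 for positive z and f(z) = 2 − 2z for non-positive z; that formula is our reading of the table, not a quotation. It then tests, on a finite range, that F undoes f and f undoes F.

```python
def f(z):
    """Send an integer to a natural number, following the pairing table.

    Assumed formula: positive integers go to the odd naturals,
    non-positive integers go to the even naturals.
    """
    return 2 * z - 1 if z > 0 else 2 - 2 * z

def F(n):
    """Candidate inverse of f, matching the piecewise definition of F."""
    return -(n // 2) + 1 if n % 2 == 0 else (n + 1) // 2

# The row of naturals underneath -3, -2, -1, 0, 1, 2, 3 in the table.
print([f(z) for z in range(-3, 4)])

# Both compositions act as the identity on the sampled values.
print(all(F(f(z)) == z for z in range(-50, 51)))
print(all(f(F(n)) == n for n in range(1, 102)))
```

Passing these checks on a finite range is evidence, not a proof; the proof is the case analysis on whether n is even or odd (and whether z is positive or not).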
Rather than finding its inverse, though, we will prove it is surjective, and ask
for your help in showing that it is injective.
Explicit bijection: Define f : N × N → N by setting
∀(x, y) ∈ N × N . f (x, y) = 2^{x−1} (2y − 1)
In proving that f is a bijection, we will be proving this fact:
Every natural number can be written uniquely as a power of 2 times
an odd natural number.
Look at the function we defined. It takes a pair of natural numbers and outputs
a power of 2 times an odd natural number. Proving this is a bijection shows that
it never outputs the same natural twice (injectivity) and every natural number
is an output of some pair (surjectivity). You might try playing around with the
function, plugging in some values and seeing what happens. Also, you might
try working “backwards”, trying to figure out what f^{−1} might possibly do. For
instance, take your favorite n ∈ N. Can you express it as a power of 2 times an
odd? If n is odd, this is quite easy, since 2^0 = 1. For instance,
11 = 1 · 11 = 2^0 · (2 · 6 − 1) = f (1, 6)
(Notice that we had to use x − 1 and 2y − 1 in the definition of f because we
are working with N, and 0 ∉ N.)
If n is even, we can just divide by 2 iteratively until we can’t anymore; what’s
left must be an odd number. For instance:
40 = 2 · 20 = 4 · 10 = 8 · 5 = 2^3 · (2 · 3 − 1) = f (4, 3)
and
32 = 2 · 16 = 2^2 · 8 = 2^3 · 4 = 2^4 · 2 = 2^5 = 2^5 · (2 · 1 − 1) = f (6, 1)
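The "divide by 2 until you can't" procedure is easy to mechanize. The following sketch implements f together with a hypothetical helper f_inverse (our own name) that recovers the pair (x, y) from n by repeated halving, and it reproduces the three worked examples above.

```python
def f(x, y):
    """f(x, y) = 2^(x-1) * (2y - 1) for natural numbers x, y >= 1."""
    return 2 ** (x - 1) * (2 * y - 1)

def f_inverse(n):
    """Recover (x, y) with f(x, y) == n by dividing out factors of 2."""
    x = 1
    while n % 2 == 0:      # strip one factor of 2 per pass
        n //= 2
        x += 1
    # What is left is odd, so it equals 2y - 1 for y = (n + 1) / 2.
    return x, (n + 1) // 2

for n in (11, 40, 32):
    print(n, f_inverse(n))

# Round trip on a finite sample of naturals.
print(all(f(*f_inverse(n)) == n for n in range(1, 1000)))
```

The loop always stops because each pass halves n, which is exactly the finiteness that the "minimal criminal" argument relies on.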
This observation is crucial in proving that f is surjective:
f is surjective: We claim ∀n ∈ N . n ∈ Imf (N × N). We prove this by a
“minimal criminal” argument.
BC: Notice that f (1, 1) = 2^0 · 1 = 1. Thus, 1 ∈ Imf (N × N).
IH: Suppose we have n ∈ N − {1} that has no such representation as a power
of 2 times an odd, i.e. suppose n ∉ Imf (N × N).
IS: We have two cases:
• If n is odd, then . . . well, n · 2^0 = n · 1 = n is such a representation. That
is, we know (n + 1)/2 ∈ N and we see that
\[ f\left(1, \frac{n+1}{2}\right) = 2^0 \cdot \left(2 \cdot \frac{n+1}{2} - 1\right) = 1 \cdot (n + 1 - 1) = n \]
so n ∈ Imf (N × N). This contradicts our assumption that n ∉ Imf (N × N)
so this case is not valid.
\[ f(x+1, y) = 2^{(x+1)-1} \cdot (2y-1) = 2 \cdot 2^{x-1} \cdot (2y-1) = 2 \cdot f(x, y) = 2 \cdot \frac{n}{2} = n \]
[Figure: the lattice of points in N × N, with both axes labeled 1 through 6.]
To show that this infinite grid of points is countably infinite, we can describe
a path that traverses all of the points (surjectivity!) exactly once (injectivity!)
and is indexed by the natural numbers (countably infinite!). That is, we can
just describe a way to traverse the whole grid in a series of steps; there will be
a “1st point” and a “2nd point” and so on.
The key observation to make is that the “northwestern” diagonals of this grid
are all finite. Start from the point (5, 1), for instance, and move upwards and
leftwards, diagonally. You will traverse over (4, 2) and (3, 3) and (2, 4) and (1, 5),
and then reach the boundary of the grid. This is true no matter where you start
along the bottom row of lattice points.
Let’s use this fact to label each lattice point with a natural number based on
(a) which diagonal it lies on, and (b) where it lies along that diagonal. We’ll
treat the diagonal starting at (1, 1) as the 1st diagonal, the one starting at (2, 1)
as the 2nd, and so on. This gives us the following labels:
6
5   15
4   10  14
3    6   9  13
2    3   5   8  12
1    1   2   4   7  11
     1   2   3   4   5   6
We can see that every point in the lattice will lie on exactly one such diago-
nal. Furthermore, there are countably-infinitely-many such diagonals (they are
indexed by N) and there are only finitely-many points on each diagonal. This
means (as we will prove below) that the collection of all the points on the diag-
onals is countably infinite.
You ought to try formalizing this argument by writing down a function that
achieves the labeling we’ve demonstrated. Or, you could at least work with
a similar one that also works, i.e. you could move southeastwards instead, or
reverse the direction of alternate diagonals . . .
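If you want to check your own formula against the labels shown in the grid, here is one possible formalization, offered as a sketch rather than as the only answer. The names label and point are ours. It uses the fact that the point (x, y) lies on diagonal number d = x + y − 1, that the earlier diagonals contain 1 + 2 + · · · + (d − 1) points in total, and that (x, y) is the y-th point met when walking up its diagonal.

```python
def label(x, y):
    """Natural number assigned to lattice point (x, y), with x, y >= 1."""
    d = x + y - 1                  # which diagonal the point lies on
    return d * (d - 1) // 2 + y    # points on earlier diagonals, plus position

def point(n):
    """Lattice point that receives label n (the inverse of `label`)."""
    d = 1
    while d * (d + 1) // 2 < n:    # find the diagonal that contains label n
        d += 1
    y = n - d * (d - 1) // 2
    return d + 1 - y, y

# Reprint the labeled grid, top row (y = 5) first.
for y in range(5, 0, -1):
    print(" ".join(f"{label(x, y):2d}" for x in range(1, 7 - y)))

# Every label from 1 to 499 is hit exactly once on this sample.
print(all(label(*point(n)) == n for n in range(1, 500)))
```

Because label has an inverse, it is a bijection from N × N to N, which is the formal content of the "path through the lattice" picture.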
Example 7.6.15. Q is countably infinite:
This result is one of the more striking examples of our intuition failing with
infinite sets and their cardinalities. Think about the elements of Q as laid out
on the real number line. They’re everywhere! In fact, look at Exercise 4.11.26;
there, you proved that the rationals are dense, and it is also true that they are
dense in R (i.e. between any two distinct real numbers lies a rational number).
Furthermore, the set of rationals seems so much larger than Z: between 0 and 1
alone, there lie infinitely many rational numbers! For these reasons, you might
believe that Q is uncountably infinite, but this is false.
In this example, we will present several arguments for this fact, especially be-
cause we realize it is so strange and striking.
(1) Intuitive argument:
Consider the following “representation” of Q as a union of sets:
Q “=” “N × N” ∪ “−”(N × N) ∪ {0}
In some sense, N × N corresponds to all the positive rationals. To see why,
just consider the function f : N × N → Q+ defined by f (x, y) = x/y. We
definitely output all positive rationals (so f is a surjection), but 4/2 = 2/1, so
this is not an injection. At least, this shows |N × N| ≥ |Q+| because f is a
surjection. Since N × N is countably infinite, and we certainly expect Q+ to
be infinite, this shows the positive rationals are countably infinite.
The set of negative rationals—let’s call it Q− —must have the same cardinal-
ity as the set of positive rationals—let’s call it Q+ . There is a clear bijection
between them: define g : Q+ → Q− by setting ∀q ∈ Q+ . g(q) = −q.
All this leaves out is 0 ∈ Q. The union of two countably infinite sets is also
countably infinite (as we will prove below), and adding on one more element
won’t change that. Thus, Q is countably infinite.
Mind you, this is quite “hand-wavey”. All of the “scare quotes” in the
“equation” above mean you should take this as just a heuristic argument,
and not a proof. However, there are ways to make all of these arguments
formal. Try working on this on your own!
(2) Listing Q:
Consider writing a computer program to print out all the positive rational
numbers in a list. What algorithm would you use? As long as you can
guarantee that your program will “eventually” succeed and print them all,
then you have shown Q can be enumerated one-by-one, so it must be count-
ably infinite. (Remember, this is why we use N as the canonical countably
infinite set: we can enumerate its elements one-by-one, we can count them.)
Here’s one way that we might write such a program: Follow the same “path
through the lattice” argument that we used with N × N in the previous
example. This time, though, just “skip over” any rational you have already
printed.
That is, we would print the pair (1, 1) ↔ 1 and then (2, 1) ↔ 2 and then
(1, 2) ↔ 1/2 and then (3, 1) ↔ 3 and then . . .
Aha! We have to omit writing (2, 2) ↔ 1. How did we know that? We see
that we already printed 1. How did we know that? We just looked over the
list of rationals we had already printed and checked to see if what we were
about to print has already appeared. If so, we move on; if not, we print it
and then move on.
In terms of the enumeration process, this just means that for every point in
the lattice we pass through, we have to check finitely-many things; namely,
we have to look over the finitely-large set of rationals we have already
printed. This means the printing process at any individual step will take “a
little longer” but not infinitely-longer. Thus, our program will eventually
print out every rational number; no matter which one you have in mind, we
will get to it in finite time.
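Here is a minimal sketch of such a program in Python. It walks the diagonals of the lattice in the order described above, reads the pair (x, y) as the fraction x/y, and skips any value it has already produced. The generator name positive_rationals is our own; using a set for the "already printed" check is a convenience, and scanning a list would work just as well.

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Yield every positive rational exactly once, diagonal by diagonal."""
    seen = set()                       # the rationals printed so far
    for d in count(1):                 # d-th diagonal, starting at (d, 1)
        for y in range(1, d + 1):
            x = d + 1 - y              # move up and to the left
            q = Fraction(x, y)         # Fraction reduces to lowest terms
            if q not in seen:          # e.g. (2, 2) is skipped: 1 was printed
                seen.add(q)
                yield q

print([str(q) for q in islice(positive_rationals(), 10)])
```

The generator never finishes, of course, but any particular positive rational x/y appears by the time diagonal x + y − 1 has been processed, which is the "finite time" guarantee from the argument.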
(3) Q is at most countably infinite:
Here’s another argument about Q being countable. (If this feels like overkill,
that’s fine, just move on. We just know that this is a surprising result and
having a few ways of thinking about it might help!)
Consider this: We can definitely agree a priori that |Q| ≥ |N|. This follows
from the fact that Q ⊇ N. Now, the only question is whether or not these
cardinalities are equal. To reach that conclusion, we would need to find
either (a) an injection from Q to a countable set, or (b) a surjection from a
countable set to Q.
We will prove below that Z × N is countable. (That is, we will prove
generally that the Cartesian product of any two countably infinite sets is
also countably infinite.) We can then define the function f : Z × N → Q by
∀(z, n) ∈ Z × N . f (z, n) = z/n
This is a surjection onto Q. It is definitely not injective (why not?) but
we don’t care. It shows that |Z × N| ≥ |Q|. Once we have proven that
|Z × N| = |N|, this will give |N| ≥ |Q| which, together with |Q| ≥ |N|,
will have shown that |N| = |Q|.
To actually construct the tree, we find mediants. Given two rationals a/b and
c/d, the mediant of those two is defined as (a + c)/(b + d). (Notice that this is a
special object, the mediant; it is not the correct way to add two fractions!)
Each level of the tree consists of all the mediants made from consecutive
pairs of rationals in the level above; we don’t “count” the directly vertical
elements; they are just carried over for ease of reading and construction.
Also, notice that the fractions 0/1 and 1/0 (which is undefined, even!) are
included in the outside columns to help generate the elements on the outside
of each level.
(Play around with the properties of this tree, and read more about it. It is an
interesting mathematical object!)
We won’t prove here that this tree contains all the rational numbers, but
we think you can see why this is believable. Also, we think you can see why
the set of all nodes in this tree is countably infinite. Each level has only
finitely many nodes, and there are countably-infinitely-many levels.
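To experiment with the construction, the sketch below builds successive rows by inserting the mediant between each consecutive pair. This tree is commonly known as the Stern-Brocot tree. Fractions are stored as (numerator, denominator) pairs, a choice we make here so that the boundary symbol 1/0 can be carried along without being evaluated; the function name next_level is ours.

```python
def next_level(level):
    """Insert the mediant between every pair of consecutive entries."""
    result = [level[0]]
    for (a, b), (c, d) in zip(level, level[1:]):
        result.append((a + c, b + d))   # mediant of a/b and c/d
        result.append((c, d))           # carry the old entry over
    return result

level = [(0, 1), (1, 0)]                # the two boundary "fractions"
for _ in range(3):
    level = next_level(level)
    print("  ".join(f"{p}/{q}" for p, q in level))
```

Each row is a finite list, and there is one row for each natural number, which is the counting argument made above.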
Theorems
Now we know that three of our standard sets of numbers—N and Z and Q—are
all countably infinite, as well as the set N × N. With the following theorems, we
will show you some ways to generate more countably infinite sets from existing
ones.
Let’s get you warmed up with one helpful result. It says that we can take a
countably infinite set and “tack on” finitely-many extra elements, and this keeps
the result countably infinite, as well.
satisfies H = h^{−1} .
To see why, notice that
and
By applying induction to the two previous results, we can prove the following:
We will prove this in the case that the sets are pairwise-disjoint, and leave
the rest of the details to you.
In the case where the An sets are not necessarily pairwise-disjoint . . . we leave
this as Exercise 7.8.37.
A_n = {p_n^k | k ∈ N}
which is the set of all powers of the n-th prime. The theorem above says that
\[ \bigcup_{n \in \mathbb{N}} A_n = \{\text{all powers of primes}\} \]
is countably infinite, as well. Indeed, we should have expected that because that
union is just a subset of the natural numbers, which is countably infinite itself!
Example 7.6.25. The set of all finite binary strings:
A binary string is defined to be an ordered list of 0s and 1s. A finite binary
string is one that is of finite length.
For example, the following are all finite binary strings:
0, 1, 101010, 10000000000000000001
For every n ∈ N, let’s define Fn to be the set of all binary strings of length n.
For instance,
F1 = { 0 , 1 }
F2 = { 00 , 01 , 10 , 11 }
F3 = { 000 , 001 , 010 , 100 , 011 , 101 , 110 , 111 }
and so on. (Notice that |Fn | = 2^n . Try to prove that!) Then, define the set of
all finite binary strings by
\[ F = \bigcup_{n \in \mathbb{N}} F_n \]
An element of F must have come from some set in the big union; this means
that an arbitrary element x ∈ F is some binary string with some finite length.
That length could be a huuuuuuuge number, but it is finite. (This points out
the distinction between allowing something to be “arbitrarily large (but finite)”
and allowing something to be “infinite”.)
The point of this example is that F is countably infinite, according to the
theorem above! (Well, it follows from the corollary stated right after, actually.)
Contrast this with the set S of all infinite binary strings, which is—as we will
prove shortly—uncountably infinite. We will use these sets of binary strings
fairly often as examples!
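One concrete way to see that F is countably infinite is to list its elements: first everything in F1, then everything in F2, and so on. The sketch below does this with a generator, and it also gives a closed-form position for each string in that list. The names finite_binary_strings and position are ours, and the ordering within each Fn (by binary value) is an arbitrary choice we make for illustration.

```python
from itertools import count, islice, product

def finite_binary_strings():
    """List F by length: all of F1, then all of F2, then F3, ..."""
    for n in count(1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def position(s):
    """1-indexed place of the string s in the listing above."""
    n = len(s)
    # The shorter strings number 2 + 4 + ... + 2^(n-1) = 2^n - 2 in total.
    return (2 ** n - 2) + int(s, 2) + 1

print(list(islice(finite_binary_strings(), 14)))
print(position("101010"))
```

Since every finite string has a definite position, the listing is a bijection between N and F. No such listing can exist for the infinite binary strings.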
and
\[ A_1 \times A_2 \times A_3 \times \cdots = \prod_{k \in \mathbb{N}} A_k \]
That is, what happens when we try to “jump to the limit” from having a finite
union/product (of arbitrarily large size, but still finite) to having an infinite
union/product? Can we make necessary conclusions? Can we find counterex-
amples?
The main idea is that “passing to a limit” does create some mathematical
object, but we can’t necessarily pre-suppose that this object has the exact same
properties as all of the objects in the sequence that defines that object.
Think about the finite sets [n], for every n. Each of them is finite, but “in
the limit” we get N which is not finite. So yes, we do get some object (another
set), but it doesn’t have to have the same properties.
The important theorem above shows that passing to the limit in the union
definitely preserves countability. As we will see below in the next section, the
product definitely does not preserve countability. (In fact, even an infinite
product of finite sets is uncountable. Yikes!)
A similar notion appears in calculus. (We promised we would not use calculus,
but there is such a natural relationship between these ideas, so we feel compelled
to mention an easy example. If you don’t get anything out of this, no worries; if
you do, though, try to remember this connection and think about how it might
fundamentally change your view of everything you learned in calculus.)
Consider a limit, something like
\[ \lim_{x \to \infty} \frac{1}{x} = 0 \]
In what sense is this limit equal to 0? Why would we, as mathematicians over
the years, choose to define limits in this way? Formally, this limit makes sense
because of the quantified definition of a limit. Let P be the set of positive real
numbers. Then the definition of limit (applied to this example) says
\[ \forall \varepsilon \in P .\ \exists M \in \mathbb{N} .\ \forall x \in P .\ x > M \implies \frac{1}{x} < \varepsilon \]
That is, for any small positive threshold (ε > 0), we can find a specific cutoff
point (a large natural number M that depends on ε somehow) such that, for
every point after M , the function 1/x falls within that ε-threshold of the limit
point, zero.
Notice that this is very different than saying some nonsense like “1/∞ = 0”. That’s
not what’s going on. We never actually get to “plug in” the end of the limit
and evaluate it. The limit is defined in terms of quantifications, some things
that are happening for arbitrarily large values, but not for an infinite value.
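To make the quantifiers tangible, here is a small sketch that, for a given threshold ε, produces one valid cutoff M and then spot-checks the implication for a few values past it. The function name cutoff and the choice M = ⌈1/ε⌉ are ours; any larger M would work equally well. Exact fractions are used so that rounding plays no role.

```python
from fractions import Fraction
from math import ceil

def cutoff(eps):
    """Return a natural number M such that x > M implies 1/x < eps.

    `eps` is assumed to be a positive Fraction. We take M = ceil(1/eps),
    bumped up to 1 if necessary so that M is a natural number.
    """
    return max(1, ceil(1 / eps))

for eps in (Fraction(1, 10), Fraction(3, 7), Fraction(1, 1000)):
    M = cutoff(eps)
    # Spot-check the implication on the next few hundred naturals after M.
    ok = all(Fraction(1, x) < eps for x in range(M + 1, M + 300))
    print(eps, M, ok)
```

Notice that the code never "plugs in infinity". It only ever answers the question "given ε, how far out is far enough?", which is all the definition asks for.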
That is, we are constructing x by going down the main diagonal of the grid of
elements (so we see all of the elements ai,i ) and switching the value from a 1 to
a 0, or vice-versa.
The following diagram is a specific example of how to do this, and is not part of
this more general proof. However, we are including it for the sake of illustration:
1 ↔ 1 , 1 , 0 , 0 , 1 , . . . = y1
2 ↔ 1 , 0 , 0 , 0 , 1 , . . . = y2
3 ↔ 0 , 0 , 1 , 1 , 0 , . . . = y3
4 ↔ 1 , 1 , 0 , 1 , 1 , . . . = y4
5 ↔ 0 , 1 , 1 , 1 , 0 , . . . = y5
⋮
x = 0 , 1 , 0 , 0 , 1 , . . .
Why would we choose to do this? Well, think about whether or not the
object x could possibly belong to the list of elements above.
• Is x = y1 ? No, because x is different from y1 in their first coordinates. (In
our example x1 = 0 because y1,1 = 1.)
• Is x = y2 ? No, because x is different from y2 in their second coordinates.
(In our example, x2 = 1 because y2,2 = 0.)
• Is x = y3 ? No, because x is different from y3 in their third coordinates.
(In our example, x3 = 0 because y3,3 = 1.)
In general, for an arbitrary i ∈ N, we can guarantee that x and yi differ in the
i-th coordinate. Accordingly, none of the yi objects can be equal to this new
object x. That is,
(∀i ∈ N . xi ≠ yi,i ) =⇒ (∀i ∈ N . x ≠ yi ).
But the way we defined x, it is just an ordered, infinite list of 0s and 1s, so it is
definitely an element of {0, 1}^N , itself.
This is a contradiction. We assumed we could list all the elements of our set,
but we then used this ordering to construct an element of our set that definitely
does not appear in the list.
Therefore, {0, 1}^N is uncountably infinite.
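The "go down the main diagonal and switch the values" step can be tried out on the finite corner of the illustration. The sketch below works with only the first five entries of the first five listed strings, so it demonstrates the construction and not the full proof. The function name diagonal_flip is ours.

```python
def diagonal_flip(rows):
    """Build the start of x: flip the i-th entry of the i-th listed string."""
    return [1 - rows[i][i] for i in range(len(rows))]

# First five coordinates of y1, ..., y5 from the illustration.
rows = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 1, 0],
]

x = diagonal_flip(rows)
print(x)

# x disagrees with the i-th row in the i-th coordinate, for every i.
print(all(x[i] != rows[i][i] for i in range(len(rows))))
```

Whatever five rows you type in, the second line printed is True. The same reasoning applied to an infinite list is what makes x differ from every yi.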
Note: This is a very slick argument. It’s one of my favorite proofs in all of
mathematics. Cantor was a genius for coming up with it and, what’s even more
interesting, it’s actually fairly simple and memorable, as well. We believe that
you won’t forget this “go down the main diagonal and switch the values”
argument. The fact that we could even summarize the whole proof in nine words
Corollary: A countably infinite product of any sets with at least two elements
each is uncountably infinite.
(Note: We really only need to say that none of the sets in the product are empty
and that only finitely many of them are allowed to have exactly one element.)
Examples
You might be wondering now: what types of sets are uncountably infinite? Do
we know any? Sure we do! Here are some examples.
Example 7.6.27. The set of all infinite binary strings:
You may have noticed that the set we used in the proof above, namely {0, 1}^N ,
is “essentially” the set S of infinite binary strings! An element of {0, 1}^N is an
infinitely-long ordered list of coordinates, each of which is 0 or 1. An element of
S is an infinitely-long ordered list of 0s and 1s, but just without the parentheses
and commas. As such, there is a very natural bijection between the two (just
drop the parentheses and commas, or throw them back in), so we will identify
these two sets as the same.
We saw above in Example 7.6.25 that the set of all finite binary strings is
countably infinite. This latest result shows that the set of all infinite binary
strings is uncountably infinite. An alternate proof of this fact involves finding a
bijection between S and P(N), and then applying Cantor’s Theorem that says
|N| < |P(N)|. (See Exercise 7.8.33 for these details.)
Example 7.6.28. R is uncountably infinite:
This is our first example of a standard set of numbers that is uncountably
infinite. We can use the above result to prove this fact.
This claim makes some intuitive sense, since it “looks like” the real number line
is “so much bigger” than just N or Z. But we also saw that Q is countably
infinite, and there are tons of rational numbers scattered across the real number
line; in fact, between any two real numbers there lie infinitely many rationals!
What we will see now is that, yes, it is true that R is uncountably infinite.
Furthermore, we will even show that R and P(N) are of the same “size” of
infinity; that is, we will show |R| = |P(N)|. (Remember that this is way more
informative than just saying both sets are uncountable; there are many levels of
uncountably infinite sets, we are just choosing not to talk about them too much
so we don’t hurt our brains.)
Morally speaking, the idea behind showing R is uncountably infinite, first of
all, is to relate R to the set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}^N . Every real number can
be expressed in decimal notation, which is just some ordered list of countably
infinitely many digits. There’s a decimal point in there somewhere, and there are
issues like 0.999999 · · · = 1, but those aren’t huge deals. Since we already saw
that even a “small” set like {0, 1} yields an uncountable set when we take its
product infinitely many times, then certainly a “bigger” set, like {0, 1, . . . , 9}
will also give an uncountable set, even factoring in those issues. This is the
intuitive argument you can carry around in your head and use to explain the
result to your friends. (In fact, this is the argument you will find in most
textbooks, as well.)
More formally, we can just prove that |R| = |P(N)|. This stronger result implies
that R is uncountably infinite (because Cantor’s Theorem tells us |N| < |P(N)|).
To do this, we will consider the set
I = {y ∈ R | 0 ≤ y ≤ 1}
which is the interval [0, 1] ⊆ R. We will show that
|{0, 1}^N | = |P(N)| = |I|
and then apply some results about bijections between intervals and R.
Consider the function f1 : {0, 1}^N → I that takes in an infinite binary string,
puts a decimal point in front of all the 0s and 1s, and says, “Evaluate this
number as a decimal expansion”.
As an example, consider the element that is (1, 1, 0, 0, 1, 0, . . . ) where the rest
are 0s. Then
\[ f_1(1, 1, 0, 0, 1, 0, \dots) = 0.110010\ldots_{\text{DEC}} = \frac{1}{10^1} + \frac{1}{10^2} + \frac{1}{10^5} = \frac{11001}{100000} \]
Notice that this is a function because any output is definitely a real number
(since it has a decimal expansion; we just provided it) and it is somewhere
between 0 and 1, since we put the decimal point in front. Furthermore, notice
that f1 is an injection; two different infinite binary strings must be different
in some coordinate, so they yield two decimal expansions that differ somewhere
and, thus, cannot be the same real number. This shows that |{0, 1}^N | ≤ |I|.
Consider the function f2 : {0, 1}^N → I that takes in an infinite binary string,
puts a decimal point in front of all the 0s and 1s, and says, “Evaluate this
number as a binary expansion”.
As an example, consider the same element as above. Then
\[ f_2(1, 1, 0, 0, 1, 0, \dots) = 0.110010\ldots_{\text{BIN}} = \frac{1}{2^1} + \frac{1}{2^2} + \frac{1}{2^5} = \frac{25}{32} \]
Notice that this is a function because any output is definitely a real number;
just evaluate the resulting sum of fractions and it yields a real number between
0 and 1 (and even if the series is infinite, it is guaranteed to converge). For
example, the input of all 0s yields 0 as an output, and the input of all 1s yields
1 as an output since
\[ \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = \sum_{k \in \mathbb{N}} \frac{1}{2^k} = 1 \]
Both arguments involved some knowledge about decimal expansions (and bi-
nary expansions). It seems there is no easy way around this, so we hope that
the results above are still convincing. In particular, you might want to play
around with the idea that f2 in the discussion above is a surjection but not
an injection. Can you convince yourself of these claims? Can you convince
someone else?
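If you would like to experiment with these claims, the following sketch evaluates a finite prefix of a binary string as a decimal expansion (the idea behind f1) or as a binary expansion (the idea behind f2), using exact fractions. The function name evaluate is ours. A program can only ever see finitely many digits, so the outputs for strings ending in repeated 1s are partial sums that approach, but do not equal, the value of the infinite string.

```python
from fractions import Fraction

def evaluate(bits, base):
    """Value of 0.b1 b2 b3 ... in the given base, for a finite list of bits."""
    return sum(Fraction(b, base ** k) for k, b in enumerate(bits, start=1))

prefix = [1, 1, 0, 0, 1, 0]
print(evaluate(prefix, 10))   # read as a decimal expansion
print(evaluate(prefix, 2))    # read as a binary expansion

# Compare 1000... with 0111... using longer and longer prefixes.
for n in (5, 10, 20):
    a = evaluate([1] + [0] * n, 2)
    b = evaluate([0] + [1] * n, 2)
    print(n, a - b)           # the gap shrinks toward 0 in base 2
    print(n, evaluate([1] + [0] * n, 10) - evaluate([0] + [1] * n, 10))
```

In base 2 the gap is 1/2^{n+1}, so the two different infinite strings 1000 . . . and 0111 . . . are sent to the same real number 1/2 by f2. In base 10 the gap stays bounded away from 0, which is consistent with f1 being an injection.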
Theorems
Let’s see one result about uncountable sets. Then, we will state a final theorem
about infinite sets, in general, before moving onwards!
(Note: We don’t need to assume that B ⊆ A here. If this were not the case,
we would just consider A and B ∩ A as the sets, instead.)
(Note: The idea is that there exists some bijection g : N → C, so we can let
y1 = g(1) and y2 = g(2) and so on.)
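Define f : A → B by
\[ \forall y \in A .\ f(y) = \begin{cases} y & \text{if } y \neq y_i \text{ for all } i \in \mathbb{N} \text{ and } y \neq x \\ y_1 & \text{if } y = x \\ y_{i+1} & \text{if } y = y_i \text{ for some } i \in \mathbb{N} \end{cases} \]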
Try It
Try answering the following short-answer questions. They require you to actu-
ally write something down, or describe something out loud (to a friend/class-
mate, perhaps). The goal is to get you to practice working with new concepts,
definitions, and notation. They are meant to be easy, though; making sure you
can work through them will help you!
(1) Prove Proposition 7.6.9. That is, prove: If A and B are finite sets, then
|A ∪ B| = |A| + |B| − |A ∩ B|
(3) Find the flaw in the following “spoof” that R is countably infinite:
Let S ⊆ R be the set defined by S = {y ∈ R | 0 ≤ y < 1}.
For every x ∈ S, define the set Ax = {x + z | z ∈ Z}.
(For example, A1/2 = {. . . , −3/2, −1/2, 1/2, 3/2, . . . }.)
Since Z is countably infinite, each set Ax is countably infinite.
Also, notice that
\[ \mathbb{R} = \bigcup_{x \in S} A_x \]
(4) For each of the following desired situations, provide an example or state
that it is impossible.
For example, if the situation were “Finite sets A and B such that A ∪ B
has size 4”, an answer might be “Consider A = {1, 2} and B = {3, 4}.” If
the situation were, “For every x ∈ N, an infinite set Sx , such that ⋃_{x∈N} Sx is
finite”, the answer would be “Impossible”.
There is no need to prove your answers here; a good example should suffice.
(a) An uncountably infinite set A and a countably infinite set B such that
A ∩ B is finite.
(b) Uncountably infinite sets C and D such that C −D is countably infinite.
(c) Uncountably infinite sets E and F such that E − F is uncountably
infinite.
(d) For every x ∈ N, a countably infinite set Sx , such that ⋃_{x∈N} Sx is
uncountably infinite.
(e) For every y ∈ R, a countably infinite set Ty , such that ⋃_{y∈R} Ty is
countably infinite.
(5) Prove Lemma 7.6.29. That is, suppose A is uncountably infinite and B ⊆ A
is countably infinite; prove that A − B is uncountably infinite.
Use this result to explain why the set of irrational real numbers is uncount-
ably infinite.
7.7 Summary
Now, we have fully explored functions and their related properties! We saw
that a function is just a relation with a particular property. This desired prop-
erty corresponds to how we usually think of a function as having an “output” for
every possible “input”. We formalized these notions mathematically by defining
terminology like domain, codomain, and image. Further properties of functions
include injectivity and surjectivity. We saw many examples and non-examples
of functions with these properties, and discussed how to prove/disprove these
properties, relating back to our logical proof techniques.
The notion of a bijection has been particularly helpful and powerful. We
related this to the notion of an inverse function. Specifically, we saw and proved
that a function is bijective if and only if it has an inverse! This made for an
important result later on when we discussed cardinality, where “the bijection
is king”. The notion of “pairing off elements” helped us make sense of some of
the more wild and counter-intuitive results about the “sizes of sets”.
We characterized infinite sets as either countably infinite or uncountably in-
finite. However, we also proved the historically significant result that is Cantor’s
Theorem, which shows that there are, in fact, infinitely-many cardinalities! For
our purposes here, it was sufficient to distinguish these two types of infinite
sets. We saw several examples of each, and proved some theorems about how to
create sets of specific cardinalities from others. Ultimately, we find these results
intriguing and mathematically instructive. From now on, though, we will be
focusing on finite sets only.
Problem 7.8.1. For each of the following “rules” and proposed domains and
codomains, determine whether the “rule” defines a well-defined function.
Explain your answer using examples, if necessary.
(a) Let a : Z − {1} → R be defined by a(x) = x^2/(x − 1).
(b) Let b : Q → Q be defined by b(x) = √|x|.
∀(x, y) ∈ Z × Z . f (x, y) = (y + 1, 3 − x)
Find a function F that is the inverse of f , and prove that it is. What does this
tell you about the function f ?
Problem 7.8.8. Define the set S = {x ∈ R | 0 < x < 1}. Define the function
g : S → R by
g(x) = (2x − 1)/(2x(1 − x))
x ≈ y ⇐⇒ f (x) = f (y)
7.8. CHAPTER EXERCISES 561
Is ≈ an equivalence relation? If so, prove it, and describe the equivalence classes.
If not, provide a counterexample.
Now, suppose that f is an injection. Is ≈ an equivalence relation? If so, prove
it, and describe the equivalence classes. If not, provide a counterexample.
Problem 7.8.16. Let f : A → B be a function, and let X, Y ⊆ A. Consider the
claim that Imf (X) ∩ Imf (Y ) ⊆ Imf (X ∩ Y ). What is wrong with the following
“spoof” of that claim?
A1 ∪ A2 ∪ · · · ∪ An
and
A1 × A2 × · · · × An
are both also countably infinite.
A = {(a, b) ∈ N × N | a ≤ b}
f is injective ⇐⇒ f is surjective
Prove that f is a bijection by finding its inverse and proving that inverse is
correct.
Problem 7.8.29. Let A and B be finite sets and suppose |A| = |B|.
Suppose f : A → B is a function that is injective.
Prove that f must also necessarily be surjective by showing Imf (A) = B.
Problem 7.8.30. Let k ∈ N − {1} be given. Define
S1 = {X ∈ P( [k] ) | k ∉ X}
and
S2 = {X ∈ P( [k] ) | k ∈ X}
(a) Prove that the sets S1 and S2 form a partition of P( [k] ).
(b) Define a function f1 : S1 → P( [k − 1] ) that is a bijection and prove that it
is.
(c) Define a function f2 : S2 → P( [k − 1] ) that is a bijection and prove that it
is.
(d) Use what you proved in (a) and (b) and (c) to write an induction proof
that P( [n] ) has 2^n elements, for every n ∈ N.
Note: Because of the restriction k ≥ 2 above, make n = 1 your base case, use
n = k ≥ 1 in your Induction Hypothesis, and prove the claim for n = k + 1
in the Induction Step.
Problem 7.8.31. Let A, B, C, D be sets, and suppose A ∩ B = C ∩ D = ∅.
Suppose f : A → C and g : B → D are bijections.
Define the piece-wise function h : A ∪ B → C ∪ D by setting
\[ \forall x \in A \cup B .\ h(x) = \begin{cases} f(x) & \text{if } x \in A \\ g(x) & \text{if } x \in B \end{cases} \]
(c) Explain why (a) and (b) have shown that ∀n ∈ N . |[1] × [n]| = n.
(e) Explain why (c) and (d) have shown that ∀k, n ∈ N . |[k] × [n]| = kn.
(f) Explain why (e) proves the result stated in the problem description above.
Problem 7.8.33. Let S be the set of all infinite binary strings. (That is, elements
of S are infinitely-long strings of 0s and 1s.)
Find a bijection between S and P(N). Use this to prove that S is uncountably
infinite.
Problem 7.8.34. For each of the following sets, you are given its cardinality.
Prove that the given cardinality is correct by finding a bijection to a relevant
set and/or citing a result.
(Hint: If you don’t use some kind of inductive argument, your proof might not
be rigorous enough . . . )
(a) A is the set of all functions from N to N. Show that A is uncountably
infinite.
(Hint: Compare A with the set S of all functions from N to {1, 2}. Can
you explain why S is uncountably infinite? What does this say about A?
...)
(b) B is the set of all functions from N to N with the additional property that
∀x ∈ N . f (x + 1) = f (x) + 1
∀x ∈ N . f (x + 1) = f (x) + 1
f (1) = 42
∀f ∈ T . f^{−1} exists ∧ f^{−1} ∈ T
∀f ∈ T . ∀g ∈ U − T . f ◦ g ∉ T ∧ g ◦ f ∉ T
(h) What are the cardinalities of S, T, U ? If your answer is “finite”, also state
the size. If your answer is “infinite”, also state whether it is countable or
uncountable and prove your claim by finding a bijection to an appropriate
set or citing a relevant result.
7.9 Lookahead
In the next chapter, we will study combinatorics, the mathematical branch
of “counting things”. We saw in the section on cardinality that many results
about finite sets seemed rather intuitive. When we study combinatorics, we
will be describing the elements of a set by characterizing what properties they
have, rather than simply stating them all or listing them. This will actually
make it quite interesting (and sometimes very difficult!) to determine just how
many elements we have described. Combinatorics is the study of techniques to
determine the number of elements of a set with certain properties. We will state
and prove some fundamental principles of counting (appealing to results from
this chapter, in fact) and use them to build more advanced techniques and solve
some interesting problems.
Chapter 8
Combinatorics: Counting Stuff
8.1 Introduction
The field of combinatorics is one of the most active and exciting areas of
interest in modern mathematics. It is also sometimes known as “discrete math”
to distinguish it from analysis, which studies more “continuous” notions like
the real number line and functions defined on that set. In this chapter we
will explore some of the fundamental ideas in combinatorics and apply them
to solve interesting problems. Essentially, we will be learning interesting and
useful principles about how to count the number of elements in finite sets where
those elements are described in some way but not enumerated for us.
8.1.1 Objectives
The following short sections in this introduction will show you how this chapter
fits into the scheme of the book. They will describe how our previous work
will be helpful, they will motivate why we would care to investigate the topics
that appear in this chapter, and they will tell you our goals and what you
should keep in mind while reading along to achieve those goals. Right now,
we will summarize the main objectives of this chapter for you via a series of
statements. These describe the skills and knowledge you should have gained by
the conclusion of this chapter. The following sections will reiterate these ideas
in more detail, but this will provide you with a brief list for future reference.
When you finish working through this chapter, return to this list and see if you
understand all of these objectives. Do you see why we outlined them here as
being important? Can you define all the terminology we use? Can you apply
the techniques we describe?
568 CHAPTER 8. COMBINATORICS
8.1.3 Motivation
Think about playing poker. If you’re unfamiliar with the game, just think of
it as a simple system where two players receive a hand of 5 random cards each
and then they compare to see who wins. Hands are ranked according to the
following list, from best to worst:
8.1. INTRODUCTION 569
Is this a fair game? If you’ve played poker before, and especially if you’ve
played a lot, you’ve not only learned to accept this ranking system but you’ve
also learned how to exploit it and make decisions. In Five Card Draw, if you’re
dealt 22345, should you keep the pair or go for the straight? Which is more
likely to happen? Which will pay off more handsomely?
By our question, “Is this a fair game?”, what we’re really wondering is why
the ranking is the way it is! Is drawing a flush actually rarer than a straight?
Does it make sense that a full house loses to four of a kind? Why? How can we
prove these results? To answer these questions, we will rephrase the questions
in terms of counting instead of probability. We will ask how many distinct five
card hands are flushes, how many are straights, and so on. This will allow us to
compare them directly. Do you see how this relates to our work in the previous
chapter, too? We will really be identifying the cardinality of the set of all poker
hands that are flushes, for instance, and comparing it to the cardinalities of
other sets of hands.
Some of your proofs to the exercises in these sections may consist entirely of
English sentences, with almost no mathematical symbols! This will seem strange
at first, and might even seem to contradict the ideas we have emphasized so far
about precision, clarity, and mathematical rigor. This is definitely not the case,
though; combinatorics has a rigorous foundation in finite set theory, and we will
work hard to point out this relationship whenever it is relevant. This property of
combinatorics will also require you to be extra careful about your proof-writing
style, ensuring that your words are chosen appropriately to be unambiguous
and clear. More so than ever, be sure to reread your proofs after writing them,
pretending that you are someone else, to make sure that the points you want to
make actually come across in your proofs.
One final introductory point can be made by the following quote that a friend
of mine stated once when we were talking about how to teach combinatorics. I
found that it nicely summarized the sometimes strange transition from the
proofs we have been doing so far (that might feel rather formal) to combinatorics
proofs (that might feel rather informal, in comparison).
You might not know what that means now, but if you look back at this quote
after working through this chapter, you’ll understand what he was getting at.
What this means is that, in an abstract and theoretical sense, finite cardinality is
boring; all the results are what you’d expect them to be—like |A∪B| = |A|+|B|
when A ∩ B = ∅—and the techniques are all the same—find a bijection to an
appropriate set. Infinite cardinality is far stranger and more surprising: |A ∪ B| =
|A| + |B| can be False, even if A ∩ B = ∅, and even further, the addition |A| + |B|
is hard to make sense of, mathematically!
How does combinatorics differ, then? Well, in all of our work with combi-
natorics, we are given a finite set; the difference is that its elements are only
described to us in some way. We are not presented with the elements of a set
directly and asked to count them. (That would be easy: “One, two, three, . . . ”)
We have to come up with relevant and helpful strategies to identify how many
objects have a certain prescribed list of properties. That is where the difficulty
of combinatorics comes in. When we say, “Consider the set of all 5-card hands,
as drawn from a standard deck of cards”, you can immediately grasp the idea
of that set, but you certainly can’t picture all its elements laid out before you,
let alone begin to count them one-by-one. In this sense, combinatorics is hard;
this is also why it is incredibly interesting and popular!
the size of their union is the sum of their individual sizes. This makes intuitive
sense for finite sets, and we proved the result mathematically using a bijection.
This result forms the basis for the first, fundamentally useful principle of
combinatorics. Notice that this grounds us firmly in the principles of set theory.
Partitions
We start by recalling Definition 3.6.9, which was introduced in our discussion
of sets.
S_i = {2i − 1, 2i}
Next, we prove the reverse set containment. Let n ∈ N. We have two cases to
consider. (1) If n is even, then ∃k ∈ N such that n = 2k. Thus, n ∈ S_k. (2) If
n is odd, then ∃ℓ ∈ N such that n = 2ℓ − 1. Thus, n ∈ S_ℓ. In either case, we
have shown that n ∈ ⋃_{i∈I} S_i.
Therefore, S is a partition of N. In particular, it is an infinite partition.
8.2. BASIC COUNTING PRINCIPLES 573
Statement
For the remainder of the chapter, we will only consider finite partitions of finite
sets. In particular, the Rule of Sum only applies in this specific case.
Proposition 8.2.4. Let A be a finite set, let n ∈ N, and let S = {S_i | i ∈ [n]}
be a finite partition of A. The Rule of Sum states that
$$|A| = \sum_{i \in [n]} |S_i|$$
The Rule of Sum tells us that the size of a set can be found by partitioning
it into a finite number of smaller sets and summing their sizes. Notice that this
is precisely Corollary 7.6.10 that we saw last chapter in our discussion of finite
sets! There, we asked you to prove this claim by induction, in Exercise 2 in
Section 7.6.5. With this result in hand, we’ll move on to see some examples.
Examples
Example 8.2.5. At Unique Activity University, every student is required to par-
ticipate in exactly one varsity sport each year. Playing more than one would be
too much of a time commitment, and not playing at all would make them lazy,
so everyone plays exactly one of the following non-traditional-but-still-sports
sports: golf, cricket, badminton, and chess. The athletic department released
the following statistics about the rosters for each sport this year:
• Golf: 12 players
• Cricket: 18 players
• Badminton: 23 players
• Chess: 33 players
|S| = 12 + 18 + 23 + 33 = 86
More interesting examples of applying the Rule of Sum will appear when
we combine it with other counting principles. For now, it’s a simple idea that
governs how to count sets that can be broken into disjoint parts. In general, the
hardest part about using the Rule of Sum is deciding which partition to apply
it to, and being creative about that.
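To see the Rule of Sum at work on Example 8.2.5, here is a minimal Python sketch. Only the roster sizes come from the example; the student labels (like "golf-1") and the helper name `is_partition` are our own hypothetical choices for illustration.

```python
from itertools import combinations

def is_partition(parts, whole):
    """Check that the parts are nonempty, pairwise disjoint, and cover `whole`."""
    if any(len(p) == 0 for p in parts):
        return False
    if any(p & q for p, q in combinations(parts, 2)):
        return False
    return set().union(*parts) == whole

# Hypothetical rosters: only the sizes are taken from the example.
sizes = {"golf": 12, "cricket": 18, "badminton": 23, "chess": 33}
rosters = {sport: {f"{sport}-{i}" for i in range(1, n + 1)}
           for sport, n in sizes.items()}

parts = list(rosters.values())
students = set().union(*parts)

# Rule of Sum: the size of the whole set is the sum of the sizes of the parts.
rule_of_sum_total = sum(len(p) for p in parts)
```

Here `rule_of_sum_total` agrees with `len(students)` precisely because the rosters form a partition; if a student appeared on two rosters, the sum would overcount.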
The next counting principle is just as, if not more, helpful but a little more
intricate to define and prove.
Okay, so there are 24 total ways to assign the stickers. What about with five
people? I don’t know about you, but my arm is getting tired writing out all of
these assignments. There must be a better way to do this! Yes! This is where
the Rule of Product comes in to save the day. (Side note: You might notice a
pattern to our list above; can you infer how we made sure we actually listed all
possibilities? Could you write a little computer program that would generate
all the possibilities, for any number of elements? Try it!)
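One possible answer to the side note, as a rough Python sketch. The function name `assignments` and the convention of labeling stickers 1, 2, 3, ... are our own assumptions. The idea mirrors how one might organize a list by hand: fix the first person's sticker, then list every way to hand out the rest.

```python
def assignments(stickers):
    """Yield every way to hand out `stickers`, one per person, in line order."""
    if not stickers:
        yield ()
        return
    for i, first in enumerate(stickers):
        # Remove the sticker given to the first person, then recurse on the rest.
        rest = stickers[:i] + stickers[i + 1:]
        for tail in assignments(rest):
            yield (first,) + tail

four_people = list(assignments((1, 2, 3, 4)))
five_people = list(assignments((1, 2, 3, 4, 5)))
```

Counting the entries of `four_people` and `five_people` gives a check on any list written out by hand.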
Statement
We will actually make two separate statements of the Rule of Product. The first
is an intuitive statement of when and how it applies and what it claims. The
second is a more rigorous, mathematical statement that is rooted in the kind
of set-theoretic language that we have been using all along. We emphasize that
both definitions should, ideally, be understood; however, truly understanding
the first one is more important, and the second is presented mostly because it
is the one that can and will be rigorously proven.
Let’s relate this statement back to the previous example with the people and
stickers before moving on and stating the Rule of Product more rigorously.
Example 8.2.8. We can think of assigning the stickers to Andy, Brendan, and
Carl as a three-step process. Let’s line up the three gentlemen in alphabetical
order, left to right, then move along the row. At each step, we will place a sticker
on the gentleman in front of us by choosing one that hasn’t been assigned yet.
In the first step, we approach Andy and have 3 possible stickers to place on
him. In the second step, we approach Brendan and have 2 possible stickers to
place on him. Notice that this is true no matter what sticker was chosen for
Andy. We don’t actually care which sticker was chosen for Andy—be it 1, or
2, or 3—merely that the number of choices we have when we face Brendan is
always 2. In the third step, we approach Carl and find that we have only 1
sticker option, regardless of the previous two choices.
The Rule of Product tells us that the number of ways to complete this
process is the product of those numbers of options at each step: 3 · 2 · 1 = 6.
This agrees with our “exhaustive list” procedure. Hooray!
What about with 4 people? Using the same kind of logic, we can see that
there are 4·3·2·1 = 24 possible ways to complete the sticker-assignment process.
Again, this agrees with our previous procedure. Double hooray!
What about with 5 people? Well, 5 · 4 · 3 · 2 · 1 = 120. We figured out
something we didn’t know yet. Triple hooray! With 6 people? With 7 people?
With n people, where n ∈ N? We can answer all of these questions very easily
and precisely now, thanks to the Rule of Product. Infinite hooray!
Tree Diagrams
really, why is this true? What are those schedules? Let’s represent them by a
tree diagram!
[Tree diagram: a single start vertex on the left has four edges labeled M1, M2, M3, M4. Each of those leads to a vertex with three edges labeled C1, C2, C3, and each of those leads to a vertex with two edges labeled P1, P2. The 24 vertices in the far right column correspond to the triples (Mi, Cj, Pk), from (M1, C1, P1) through (M4, C3, P2).]
Reading left to right in the diagram, we are following this three step pro-
cedure we established. The single vertex (or node) at the far left represents
the start of our process—no decisions have been made—and the four edges (or
branches) emerging from that vertex represent the four mathematics courses
from which we can choose. We have labeled each edge with one of the elements
of M. No matter which one of those edges we follow (i.e. no matter which
mathematics course we select), there are three edges emerging from the next
vertex (i.e. we still have three computer science courses from which we can
choose). We have labeled all of those edges with corresponding elements from
C. Following the same idea, every vertex in that column has two emerging edges
which are labeled by corresponding elements from P.
The benefit of this diagram is that we can see exactly what the 24 outcomes
of this process are by following the labels on the edges. For instance, look at the
vertex on the top of the far right column. This corresponds to selecting M1 and
Does this make more sense, now? Does this provide you any insight into
how the Rule of Product actually works?
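The leaves of the tree can also be generated mechanically. The sketch below assumes the labels M1 through M4, C1 through C3, and P1, P2 used in the diagram; the list names `M`, `C`, `P` stand in for the sets of courses. Each schedule is one path from the start vertex to a leaf.

```python
from itertools import product

M = ["M1", "M2", "M3", "M4"]   # four choices at the first step
C = ["C1", "C2", "C3"]         # three choices at the second step
P = ["P1", "P2"]               # two choices at the third step

# Every path through the tree is one triple (m, c, p).
schedules = list(product(M, C, P))

# Rule of Product prediction: multiply the number of options at each step.
predicted = len(M) * len(C) * len(P)
```

The number of triples in `schedules` matches `predicted` because the number of branches at each vertex does not depend on which earlier branch we followed.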
$$\left|\prod_{i \in [n]} T_i\right| = |T_1 \times T_2 \times \cdots \times T_n| = |T_1| \cdot |T_2| \cdots |T_n| = \prod_{i \in [n]} |T_i|$$
total outcomes.
|X_2| = |Y| − |X_1|
= (36^6 + 36^7) − (36^6 + 6 · 10 · 36^5) + (36^7 + 7 · 10 · 36^6)
26^6 + 26^7
What do we have at our disposal? That’s right, the Rules of Sum and
Product. That’s pretty much it, other than our mathematical wit and intuition,
so let’s dive right in. How does shuffling a deck of cards correspond to a partition,
or a multi-step process? Well, the interesting thing is that we don’t actually
care how the deck is shuffled, we only care about the number of outcomes of the
process. What actually matters about a deck of cards? Right, the order of the
cards from top to bottom. With that in mind, let’s think about constructing
an arbitrary shuffling by assigning the order of the cards.
Let’s create a shuffling by taking a deck of cards in our hands and, one by
one, placing a card face down on a stack in front of us. At the first step, we have
52 cards in our hands and no stack, so we have 52 choices. At the second step,
we have 51 cards remaining in our hands to choose from, no matter what that
first card was. (Remember: this is the important part of the Rule of Product,
that the number of choices is independent of the actual choices made.) In the
third step, we have 50 cards remaining, and so on. Eventually, in the 52nd step,
we have only 1 card in our hands to place on the stack of 51 cards on the table.
After that step is completed, we have a shuffling of the deck sitting in front of us,
with the cards stacked face-down. The card from the 1st step is on the bottom,
and the card from the last step is on the top. Moreover, we see that for any
arbitrary shuffling, there is exactly one sequence of choices that produces that
shuffling. (This satisfies that other part of the Rule of Product about having
distinct outcomes. Think about this carefully and why it’s required.)
These observations allow us to directly cite the Rule of Product to answer
the question: how many shufflings of a standard deck of cards are there? The
number is . . .
$$52 \cdot 51 \cdot 50 \cdots 3 \cdot 2 \cdot 1 = \prod_{k \in [52]} k \approx 8.06581752 \times 10^{67}$$
Yowza! That’s a big number. For the sake of comparison, Avogadro’s Constant
(the number of atoms in a mole) is on the order of 10^23. There is a much better
notation for this kind of product that says “multiply all the natural numbers
from 52 down to 1”, and you’ve probably seen it before, but we’ll define it now.
Definition 8.2.12. Let n ∈ N. The natural number n!, read as n factorial, is
given by
$$n! = \prod_{k \in [n]} k = n \cdot (n-1) \cdot (n-2) \cdots 3 \cdot 2 \cdot 1$$
By definition, 0! = 1.
(Recall that we used computing factorials as an example of applying the
principle of induction to recursive programming, way back in Section 2.5.1.
Read that section again!)
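As a reminder of how the recursive idea looks in practice, here is a short Python sketch. The function name `factorial` and the variable `shufflings` are our own; the standard library's `math.factorial` is used only as a cross-check.

```python
import math

def factorial(n):
    """Compute n! recursively, using 0! = 1 as the base case."""
    if n == 0:
        return 1
    return n * factorial(n - 1)

# The number of shufflings of a standard deck.
shufflings = factorial(52)

# Cross-check against the standard library.
agrees_with_library = (shufflings == math.factorial(52))
```

Python integers have arbitrary precision, so `shufflings` is the exact 68-digit value rather than a rounded approximation.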
Let’s think about what we’ve accomplished, in fact. What was special about
the number 52 in this case? Besides it being the number of cards in our deck,
nothing! What if we had posed the question: how many ways are there to put
the elements of [n] into an ordered list? If we replace n with 52, this is actually
the same question as before! (We could just come up with a natural bijection
between the set of cards and the set [52]. Can you do this? Do you see why this
shows the questions are equivalent?)
Permutations
This type of question—how many ways are there to arrange n objects into an
ordered list—is so common that we have a specific term for these ordered lists.
We define them rigorously in terms of functions, but note their relationship to
other mathematical objects (ordered lists, for instance).
Selections
This mathematically proves a general version of our observation about shuffling
cards, and it brings us closer to answering our original question about ranking
poker hands. Remember that we hope to identify how many distinct shufflings
of the deck yield a certain type of five card hand among the top five cards, so
let’s attack a slightly more general problem, first. Think of a specific five card
hand, five particular cards. We’re thinking of T ♣ J♣ Q♣ K♣ A♣, so let’s use
that. Now, let’s count how many deck shufflings place this specific hand among
the top five cards.
How could we have such a situation? We don’t care about the order in which
we receive the cards in our hand, and we don’t care about the order of the other
47 cards in the deck. All that matters is whether those specific cards are on
the top. So let’s follow the same idea we used before and construct a shuffling
with this property. We want to use the Rule of Product, so we need to identify
a particular process that constructs a shuffling with the desired property. How
can we do this?
There are really only two properties we need to satisfy, so let’s identify a
two step process that ensures those properties hold. The first step should place
the 47 cards not from our hand on the bottom of the deck in some order. The
second step should place the five cards from our hand on top of that pile in
some order. The Rule of Product applies because no matter how we shuffle the
bottom 47 cards, this doesn’t affect the number of ways we can shuffle the top
five cards. (In general, be careful to note why the Rule of Product applies in a
given situation before applying it; this is often subtle and not obvious!) Now,
we just need to count the number of ways to perform each step.
The first step involves creating a permutation of 47 cards. Proposition 8.2.14
tells us there are 47! ways to do this. The second step involves creating a
permutation of five cards. Proposition 8.2.14 tells us there are 5! ways to do
this. Then, the Rule of Product tells us the number of ways to complete these
steps in succession is 47! · 5!. That’s it!
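We cannot list all 52! shufflings, but the same two-step argument can be checked by brute force on a smaller deck. The sketch below assumes a hypothetical deck of 7 cards (labeled 0 through 6) and a fixed 3-card hand; the function name is our own.

```python
from itertools import permutations
from math import factorial

def shufflings_with_hand_on_top(deck, hand):
    """Count orderings of `deck` whose first len(hand) cards are exactly `hand`."""
    k = len(hand)
    return sum(1 for order in permutations(deck) if set(order[:k]) == set(hand))

deck = range(7)      # hypothetical small deck
hand = {0, 1, 2}     # a fixed hand, playing the role of T♣ J♣ Q♣ K♣ A♣

brute_force = shufflings_with_hand_on_top(deck, hand)

# Rule of Product: order the other cards, then order the hand on top.
by_rule_of_product = factorial(len(deck) - len(hand)) * factorial(len(hand))
```

Both counts come out to 4! · 3!, the small-deck analogue of 47! · 5!.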
What was special about our choice of T ♣ J♣ Q♣ K♣ A♣ in this case?
That’s right, nothing! By applying the Rule of Product again, this fact will tell
us something more about the number of shufflings of the deck. Specifically, let’s
say X is the number of ways to select a set of five cards as a poker hand. Now,
consider the three step process of taking five particular cards from the deck,
arranging them in some order, and then arranging the other 47 cards below it.
The Rule of Product applies here because the number of ways to perform each
step doesn’t depend on the choices made in the previous steps. Furthermore,
every shuffling of the deck arises from exactly one particular instance of this
procedure. (Think about why this is true. Consider an arbitrary shuffling of
the deck. The top five cards determine which hand we chose in the first step,
the order of them determines how the second step was performed, and the order
of the others determines how the third step was performed.) Thus, we have
found two particular formulas for counting the same set of objects–that is, the
shufflings of a deck of cards–and so it must be true that
X · 5! · 47! = 52!
and therefore
$$X = \frac{52!}{5! \cdot 47!}$$
Think about what this formula tells us. We let X designate the number of ways
to choose a set of five cards from a set of 52 cards. What was special about five
or 52? Again, that’s right, nothing! We have essentially derived a formula for
the number of ways to select any number of objects from a larger set of objects.
It might not seem like it, but we are now very close to solving the poker hands
problem. Before we finish that project, let’s make one comment.
The type of argument we just made is a common and extremely useful
proof technique in combinatorics. It is known as counting in two ways. What
we did was identify a particular set of objects–in this case, the set of shufflings
of a deck of cards–and then describe two different procedures that allowed us to
count the size of that set. Each procedure led to a different formula, and because
we were counting the same set of objects, we know those formulas are equal.
We will explore this type of argument more explicitly and see many examples
in Section 8.4. For now, we hope that you can see why it is a valid argument
type, especially because we will expect you to use it to prove Proposition 8.2.16
below! In doing so, you will be generalizing the argument we presented here.
For illustration’s sake, let’s summarize what we did:
Argument Summary: We seek an expression for the number of ways to draw 5
cards from a deck of 52 cards. Let N be this number we are looking for. We will
identify two different formulas for expressions that involve N . This will allow
us to solve these algebraic expressions for a formula for N .
(1) Select an arbitrary and fixed five card hand. We will identify the number
of ways to shuffle a deck of cards such that the top five cards are that fixed
five card hand, in any order.
Note that there are N ways to do this step. We seek a formula for N .
(2) Count the number of permutations of the entire deck of 52 cards.
(3) Count the number of permutations of the deck that yield those fixed five
cards on the top. This is split into three steps:
(i) Count the number of ways to permute those five cards.
(ii) Count the number of ways to permute the other 47 cards.
(iii) Count the number of ways to put those 5 permuted cards on top of
those 47 permuted cards. (Note: There is only one way to do this, but
it’s important to point out as a separate step.)
(4) Overall, notice that we have counted the number of permutations (i.e. shuf-
flings) of the deck in two separate ways, so they must be the same number.
(5) Simplify the expression (which involves N ) to find a formula for N .
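The summary above can be checked numerically. The following Python sketch is our own illustration; the names `small_deck` and `hand_size` are assumptions, not notation from the text. It solves for N in the 52-card case, and it verifies the two-way count by brute force on a hypothetical 7-card deck with 3-card hands, where listing every shuffling is feasible.

```python
from itertools import combinations, permutations
from math import factorial

# Solve N * 5! * 47! = 52! for N (the division is exact).
N = factorial(52) // (factorial(5) * factorial(47))

# Brute-force check of "counting in two ways" on a small hypothetical deck.
small_deck = range(7)
hand_size = 3

# First count: list every shuffling directly.
num_shufflings = sum(1 for _ in permutations(small_deck))

# Second count: choose a hand, order it, then order the remaining cards.
num_hands = sum(1 for _ in combinations(small_deck, hand_size))
second_count = (num_hands * factorial(hand_size)
                * factorial(len(small_deck) - hand_size))
```

Since `num_shufflings` and `second_count` count the same set, they must agree; dividing out the two factorials recovers `num_hands`.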
Now, let’s generalize the formula we just derived. First, we make a definition
and introduce some notation, and then we state a formula.
Definition 8.2.15. Let k, n ∈ N with n ≥ k. A k-selection from [n] is an
unordered set of k elements from [n].
The number of k-selections from [n] is represented by $\binom{n}{k}$. This is known as a
Binomial Coefficients
One thing you might find surprising about the above formula is that the fraction
is actually a natural number, no matter what k and n are! This is proven by the
fact that it represents a number of ways to complete a procedure, as described
in the proof, and this must be a natural number.
We want to point out one special case of this formula which may not occur
to you. What if k = 0, say? What number should $\binom{n}{0}$ be? You might be
surprised to find out that $\binom{n}{0} = 1$. Why does this make sense? Intuitively, we
think of $\binom{n}{k}$ as the number of ways to select k objects from a set of n objects;
so, how many ways can we select 0 objects from, say, 3 objects? Put 3 pens on
your desk. Now, select none of them. There! You just did it! That was one
way (and the only way) to select none of the objects. This argument works
just as well when n = 0, even! Put no pens on your desk. Now, select none of
them. There! You just did it in one way again. Thus,
$$\forall n \in \mathbb{N} \cup \{0\} \,.\, \binom{n}{0} = 1$$
There are “better”, more mathematical reasons for this result, and we will point
these out in the next section when we prove Pascal’s Identity. For now, we hope
that this heuristic explanation with selections makes sense and can convince you
of this result.
Another fact is that $\binom{n}{K} = 0$ whenever K > n. This is because there are no
ways to choose, for instance, 5 objects from a set of only 3 objects. This fact
is borne out by our derivation above, because in one of the steps, we would be
trying to (impossibly) draw more cards for a hand than there are cards in that
deck, and there are 0 ways to do this. Then, when we apply ROP, the product
would evaluate to 0.
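These special cases are easy to test by literally listing the selections. In the sketch below, the function names `count_selections` and `binomial_formula` are our own; `itertools.combinations` produces every k-element selection from [n], and we simply count them.

```python
from itertools import combinations
from math import factorial

def count_selections(n, k):
    """Count the k-selections from [n] by listing them one at a time."""
    return sum(1 for _ in combinations(range(1, n + 1), k))

def binomial_formula(n, k):
    """The factorial formula n! / (k! (n - k)!), valid for 0 <= k <= n."""
    return factorial(n) // (factorial(k) * factorial(n - k))

pens_select_none = count_selections(3, 0)    # select none of 3 pens
no_pens_select_none = count_selections(0, 0) # select none of 0 pens
too_many = count_selections(3, 5)            # try to select 5 of 3 objects
```

Selecting nothing can be done in exactly one way (the empty selection), while asking for more objects than exist produces an empty list of selections.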
If you play around with some values of k and n, you'll notice that the values
of $\binom{n}{k}$ obey a so-called unimodal distribution. That is, if we fix n and let
k increase from 0 to n, we find the numbers going up, reaching a peak at
$k = \lfloor n/2 \rfloor$ and $k = \lceil n/2 \rceil$ (notice these are the same if n is even), and then
decreasing again.
Furthermore, the distribution is symmetric around that middle! Can you prove
that these properties hold? Try it!
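Before attempting a proof, it can help to gather evidence. The sketch below checks both properties for small n; the helper names `row`, `is_symmetric`, and `is_unimodal` are our own. Keep in mind that checking finitely many cases is not a proof.

```python
from math import comb

def row(n):
    """The values of the binomial coefficient for k = 0, 1, ..., n."""
    return [comb(n, k) for k in range(n + 1)]

def is_symmetric(values):
    """True if the list reads the same forwards and backwards."""
    return values == values[::-1]

def is_unimodal(values):
    """True if the values never increase again after their first decrease."""
    falling = False
    for a, b in zip(values, values[1:]):
        if b < a:
            falling = True
        elif b > a and falling:
            return False
    return True

evidence = all(is_symmetric(row(n)) and is_unimodal(row(n)) for n in range(30))
```

For instance, `row(6)` peaks once in the middle, while `row(7)` has two equal middle entries, matching the floor and ceiling description.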
Arrangements
We now have all of the tools necessary to count poker hands (and plenty of
other objects, for that matter). We know how many ways there are to permute
the elements of a set, and we know how many ways there are to choose a subset
of a certain size from a larger set. Between these two tools, we know how to
count any combinations of cards. For instance, to count an ordered subset of
cards, we can count the number of ways to choose the subset and then permute
its elements, applying the Rule of Product to this two-step process. In fact, this
idea is common enough that we will give it a defined name.
Repetition
Before we go on and count those poker hands, actually, we should point out
that all of the standard counting formulas we have seen in this section only
consider procedures where objects are not allowed to be repeated. That is, when
we choose a five card hand from a deck, we can’t have two A♣s, for instance.
There are situations where we will want to allow objects to be selected multiple
times. Look back at the License Plates example in the previous section. We
were allowed to repeat any digit/letter; for instance, 111AAA is a valid license
plate. Let’s see one more example here:
Example 8.2.19. Consider a standard, fair, two-sided coin. Flip the coin 4 times
in a row and write down the outcomes, either H or T for each flip.
Question: How many possible sequences of outcomes are there?
To answer this question, we note that there are 2 possible outcomes on each
flip, regardless of the outcomes on the previous flips. Thus, the Rule of Product
applies, and we can say there are 2 · 2 · 2 · 2 = 2^4 = 16 possible sequences of flips.
The reason this idea is related to selections and arrangements (besides using
the Rule of Product, of course) is that we can also represent these sequences
as arrangements of 4 objects from the set {H, T } where objects are allowed to
appear more than once. (There is a natural correspondence between {H, T }
and [2], so it is like we are arranging 4 objects from [2], where the objects can
occur more than once.)
This general idea is conveyed by this definition:
Definition 8.2.20. Let k, n ∈ N. A k-arrangement with repetition from
[n] is a k-tuple of elements from [n] where elements are allowed to appear more
than once.
Notice that there is no restriction on k because we are allowing elements
to appear multiple times. Before, with k-arrangements without repetition, it
wouldn’t make sense to choose 10 objects from 8 objects if we couldn’t repeat
any! Here, though, this is allowed, so k and n can be any natural numbers.
Proposition 8.2.21. Let k, n ∈ N. The number of k-arrangements with repetition
from [n] is given by n^k.
Proof. Left for the reader as Exercise 4 in Section 8.2.4.
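While the proof is left to you, the count itself is easy to test. This sketch lists k-arrangements with repetition using `itertools.product` with its `repeat` argument; the function name `arrangements_with_repetition` is our own. We use four coin flips, matching the computation 2 · 2 · 2 · 2 above.

```python
from itertools import product

def arrangements_with_repetition(n, k):
    """List every k-tuple of elements of [n], with repeats allowed."""
    return list(product(range(1, n + 1), repeat=k))

# Sequences of four coin flips: 4-arrangements with repetition from {H, T}.
flips = list(product("HT", repeat=4))

# Here k > n is allowed: 5-arrangements with repetition from [2].
long_tuples = arrangements_with_repetition(2, 5)
```

The second example shows why no restriction like n ≥ k is needed once repeats are permitted.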
You might anticipate a definition and proposition for k-selections with rep-
etition that are similar to the ones for arrangements with repetition. We will
discuss these in Section 8.5, but the techniques used to count them are more
advanced than the ones we have now, so we will address this later.
Each of these questions can be answered with Yes or No, and each of the four ways
to answer them yields a different formulation of the original question.
                        Repeats?
                   Yes       No
Order       Yes    n^k       n!/(n − k)!
Matters?    No     ???       $\binom{n}{k}$
(Note: Sometimes, the roles of n and k are reversed in a problem. Be careful
about this! We’ll try to stick to these conventions but, in general, the letters
aren’t important; it’s what they represent.)
Try It
Try answering the following short-answer questions. They require you to actu-
ally write something down, or describe something out loud (to a friend/class-
mate, perhaps). The goal is to get you to practice working with new concepts,
definitions, and notation. They are meant to be easy, though; making sure you
can work through them will help you!
(1) Verify algebraically that $\binom{n}{k} = \binom{n}{n-k}$.
(2) Prove Proposition 8.2.16, i.e. prove that
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
Do this by adapting the argument we used for counting the number of 5
card hands from a standard deck.
(3) Prove Proposition 8.2.18. That is, prove there are
$$\frac{n!}{(n-k)!}$$
possible k-arrangements from [n].
8.3. COUNTING ARGUMENTS 589
nk
break our questions into smaller parts. That way, we can count the number of
answers to each question and use those numbers in the Rule of Product.
How can we be more specific? How can we break question one into parts?
Imagine the types of answers our friend might give us for question one. We
might hear something like, “The Ace of Hearts and Ace of Spades” or “The
Sevens of Diamonds and Clubs”. This signals the important properties of an
answer to question one: we need to know the rank of the pair cards (are they
both Aces? Kings? Queens? etc.) and the two suits represented. We know
there are 13 ranks and 4 suits in the deck. With this information, we can identify
how to construct a pair and count the options.
1. Choose a rank for the two cards in the pair: 13 options
2. Choose the two suits for those cards: $\binom{4}{2} = 6$ options
Notice that we have used the binomial coefficient $\binom{4}{2}$ to signify that we are
and does not actually correspond to doing that action. That is, we don’t say
something silly like “$\binom{4}{2}$ selects 2 suits from the set of 4 suits.” How can a
1. Choose 3 ranks from the 12 remaining (i.e. not the same rank as the pair
cards): $\binom{12}{3}$ options
$$\binom{12}{1} \cdot \binom{11}{1} \cdot \binom{10}{1} = 12 \cdot 11 \cdot 10 \neq \binom{12}{3} = \frac{12!}{3! \cdot 9!} = \frac{12 \cdot 11 \cdot 10}{6}$$
This is because the (a)-(b)-(c) step imposes an order on those three cards that
doesn’t actually matter within our poker hand. When playing cards, you don’t
care how you receive your cards, only what they are! (However, notice that if
we “divide out” by the number of ways to order 3 cards, namely 3!, we get the
same number. This hints at an interesting concept, a kind of “inverse” of the
Rule of Product. We will discuss this at the end of this section.) This is why
we couldn’t refer to “the 1st card” in step 2. Instead, we found an inherent
ordering of the cards, a particular property they possess that allows us to refer
to specific cards among them without applying an external ordering.
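The "divide out" observation can be seen directly by listing both kinds of choices. In this sketch, `ranks` is a stand-in for the 12 remaining ranks (labeled 0 through 11, our own convention); `permutations` lists ordered choices of 3 ranks, and `combinations` lists unordered ones.

```python
from itertools import combinations, permutations
from math import factorial

ranks = range(12)  # the 12 ranks other than the pair's rank

# Ordered choices: a first rank, then a second, then a third.
ordered = sum(1 for _ in permutations(ranks, 3))

# Unordered choices: just a set of three ranks.
unordered = sum(1 for _ in combinations(ranks, 3))

# Each unordered set of 3 ranks is listed 3! times among the ordered choices.
overcount_factor = ordered // unordered
```

The ordered count is 12 · 11 · 10, and every set of three ranks appears once for each of its 3! orderings.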
Again, the Rule of Product applies because any selection of 3 cards of differ-
ent ranks could only come from one set of choices made in these steps. Further-
more, we can think of selecting a pair as Step 1 and selecting three other cards
of different ranks as Step 2 and apply the Rule of Product to this entire process.
This finally gives us an answer for the number of “one pair” poker hands:
\[
\binom{13}{1} \cdot \binom{4}{2} \cdot \binom{12}{3} \cdot \binom{4}{1}^3
\]
Notice that we have combined the three numbers from the last steps above into
one coefficient raised to the third power. Now, this type of numerical answer
is totally acceptable and is far better than just writing down 1,098,240. If
you make a “typo” on your homework or make a calculator error, how can we
identify the error and offer a comment?
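As a quick check of the arithmetic, the following sketch evaluates the product above one factor at a time. The labels in the dictionary are our own shorthand for the steps, not notation from the text. Keeping the factors separate mirrors the point about leaving answers in binomial form: a wrong factor is easy to spot.

```python
from math import comb

# Each entry is one step of the process and the number of ways to do it.
factors = {
    "rank of the pair": comb(13, 1),
    "two suits for the pair": comb(4, 2),
    "three other ranks": comb(12, 3),
    "one suit for each other rank": comb(4, 1) ** 3,
}

one_pair = 1
for label, ways in factors.items():
    print(f"{label}: {ways}")
    one_pair *= ways  # Rule of Product: multiply the options at each step

print(one_pair)
```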
(2) Select which two suits of that card in (1) appear in the hand.
There are $\binom{4}{2}$ ways to do this.
(4) For each rank chosen in (3), select a suit of that card to appear in the hand.
There are $\binom{4}{1}$ ways to do this, each of three times; thus, there are $\binom{4}{1}^3$ ways
in total.
Applying ROP, we find the answer given above.
Does this make sense? Notice how much shorter it is than our explanation
above. This is fine! We will continue to sometimes write out some details
in our written examples here (to help you understand how to approach these
problems, before writing them up), but your written solutions can be a little
more condensed, as long as they identify all the key elements of the problem’s
solution. Notice that we pointed out a use of the ROP, cited it, and identified
all the steps in the process; for each step, we noted how many ways there are
to do that step. It just so happens each of these steps is pretty simple, and
the number of ways to perform them is clear in each case. In general, we might
expect a more thorough description. For instance, we would consider writing
that the number of ways to do step (3) is $\binom{12}{3}$ because we aren’t allowed to
re-select the rank chosen in step (1). However, we felt this was clear from the
descriptions so we left it out. This is a judgment call, though, and we recommend
(as always) setting aside your proofs and rereading them as if you didn’t write
them. If you can’t remember, or aren’t entirely sure, why something is true,
consider adding a little extra description there.
Before doing another example, let’s point out a different solution to this
same problem!
Question: How many 5-card poker hands are “one pair” hands?
Answer: We claim there are
\[
\binom{13}{4} \binom{4}{1} \binom{4}{2} \binom{4}{1}^3
\]
“one pair” poker hands. We will identify a six-step process and apply ROP. The
main idea is that a one pair hand can be identified by choosing all four ranks
that appear and identifying which one is repeated twice (leaving the others to
appear just once).
(1) Select 4 ranks that will appear in our hand.
There are $\binom{13}{4}$ ways to do this.
(2) Of the 4 ranks selected in Step (1), select one of them. Two cards of that
rank will appear in our hand.
There are $\binom{4}{1}$ ways to do this.
(3) For the rank selected in Step (2), select two suits to appear in our hand.
There are $\binom{4}{2}$ ways to do this.
(4) For the lowest of those 3 ranks not chosen in Step (2), select a suit.
There are $\binom{4}{1}$ ways to do this.
(5) For the middle of those 3 ranks not chosen in Step (2), select a suit.
There are $\binom{4}{1}$ ways to do this.
(6) For the highest of those 3 ranks not chosen in Step (2), select a suit.
There are $\binom{4}{1}$ ways to do this.
By ROP, and simplifying $\binom{4}{1} \binom{4}{1} \binom{4}{1} = \binom{4}{1}^3$, we have shown the expression above
is correct.
Isn’t that neat? We’ll leave it to you to verify that
\[
\binom{13}{4} \binom{4}{1} \binom{4}{2} \binom{4}{1}^3 = 1098240 = \binom{13}{1} \binom{4}{2} \binom{12}{3} \binom{4}{1}^3
\]
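One way to build confidence in both expressions is to compare them against a brute-force count. The sketch below is our own illustration: it generalizes the two formulas to a hypothetical deck with `num_ranks` ranks and `num_suits` suits (an assumption, not something from the text) so that a small deck of 6 ranks and 3 suits can be checked exhaustively.

```python
from collections import Counter
from itertools import combinations
from math import comb

def one_pair_brute_force(num_ranks, num_suits):
    """Count 5-card hands whose rank multiplicities are exactly 2, 1, 1, 1."""
    deck = [(r, s) for r in range(num_ranks) for s in range(num_suits)]
    count = 0
    for hand in combinations(deck, 5):
        rank_counts = sorted(Counter(r for r, _ in hand).values())
        if rank_counts == [1, 1, 1, 2]:
            count += 1
    return count

def pair_first(num_ranks, num_suits):
    """First solution: pick the pair's rank, then the three other ranks."""
    return (comb(num_ranks, 1) * comb(num_suits, 2)
            * comb(num_ranks - 1, 3) * comb(num_suits, 1) ** 3)

def ranks_first(num_ranks, num_suits):
    """Second solution: pick all four ranks, then which one is repeated."""
    return (comb(num_ranks, 4) * comb(4, 1)
            * comb(num_suits, 2) * comb(num_suits, 1) ** 3)

# Small deck: exhaustive enumeration is cheap (8568 hands).
print(one_pair_brute_force(6, 3), pair_first(6, 3), ranks_first(6, 3))
# Standard deck: compare the two formulas directly.
print(pair_first(13, 4), ranks_first(13, 4))
```

Agreement on a small deck is not a proof, of course, but a mismatch would be a strong hint that one of the counting arguments has a flaw.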
(2) Select five of the cards from that suit to appear in the hand.
There are $\binom{13}{5}$ ways to do this.
Since each flush hand is uniquely defined by these two steps, we can apply ROP
and conclude that there are
\[
\binom{4}{1} \cdot \binom{13}{5} = 5148
\]
Proof. We will describe 5 card hands that are straights by a two-step process:
1. Select one of 10 ranks to be the lowest rank in the straight. These options
are A,2,3,4,5,6,7,8,9,T, so there are 10 options in this step.
Note: This determines the other 4 ranks in the hand, since the 5 ranks
must be consecutive and we know what the lowest one is.
2. Assign suits to the 5 cards so that they are not all the same suit.
Let’s say X is the set of all possible ways to assign suits in this manner,
so there are |X| options in this step.
We will now find $|X|$ by establishing a partition. Let $Y$ be the set of all
assignments of 5 suits so that they are all the same. Notice that the sets $X$ and $Y$
form a partition of $U$, the set of all possible assignments of 5 suits. (That is,
any assignment of 5 suits either selects all the same suit, or it does not.) Thus,
by ROS, we have $|U| = |X| + |Y|$.
We can find $|U|$ by a 5 step process, where in Step i, we select one of the 4 suits
for the i-th highest rank in the hand. With 4 options at each of 5 steps, we have
$|U| = 4^5$.
We can find $|Y|$ by noticing that any such selection amounts to picking one of
the 4 suits and assigning that suit to all 5 cards in the hand. Thus, $|Y| = 4$.
Accordingly, we can rearrange the above equality and write
\[
|X| = |U| - |Y| = 4^5 - 4
\]
Since |X| is the number of options in Step (2) above, by ROP, we have proven
the claim.
Note: In this proof, we came up with all of the relevant steps to show that
there are 10 · 4 = 40 possible straight flushes (straights of the same suit), only
1 · 4 = 4 of which are royal straight flushes (TJQKA of the same suit). Try to
write out those arguments for yourself!
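The sets in this proof are small enough to list explicitly. The following sketch is our own illustration: the suit letters C, D, H, S and the variable names are assumptions. It builds the sets playing the roles of U, Y, and X, then applies the Rule of Product with the 10 choices of lowest rank.

```python
from itertools import product

SUITS = "CDHS"              # assumed one-letter labels for the four suits
LOWEST_RANKS = "A23456789T"  # the 10 options from Step 1

# U: every way to assign a suit to each of the 5 ranks in the straight.
all_assignments = list(product(SUITS, repeat=5))
# Y: assignments that use a single suit for all 5 cards.
same_suit = [a for a in all_assignments if len(set(a)) == 1]
# X: assignments that are not all the same suit.
mixed = [a for a in all_assignments if len(set(a)) > 1]

# Rule of Sum: X and Y partition U.
print(len(all_assignments), len(mixed) + len(same_suit))

# Rule of Product: lowest rank, then suit assignment.
straights = len(LOWEST_RANKS) * len(mixed)
straight_flushes = len(LOWEST_RANKS) * len(same_suit)
royal_flushes = 1 * len(same_suit)  # the lowest rank is forced to be T
print(straights, straight_flushes, royal_flushes)
```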
First, IF there are exactly 3 Aces in the hand, then we need to determine
the characteristics of the other two cards. Those two cards are either (a) the
same rank or (b) two different ranks. Thus, there are two sub-cases for this
particular case. This yields the following procedure:
Then, by the Rules of Product and Sum (since we have separate cases in a
process), we find there are
\[
\binom{4}{3} \left[ \binom{12}{2} \binom{4}{1}^2 + \binom{12}{1} \binom{4}{2} \right]
\]
We can apply the Rule of Product and conclude that there are then $\binom{4}{4} \binom{12}{1} \binom{4}{1}$
hands with exactly 4 Aces. Now, we must apply the Rule of Sum! What we
have here is a partition of the set of desired hands–those with at least three
Aces–into two subsets–those with exactly three Aces and those with exactly
four Aces. Since those subsets partition the larger set (i.e. every hand with at
least three Aces has either three Aces or four Aces, not both and not neither),
we may apply the Rule of Sum and conclude that there are
\[
\binom{4}{3} \left[ \binom{12}{2} \binom{4}{1}^2 + \binom{12}{1} \binom{4}{2} \right] + \binom{4}{4} \binom{12}{1} \binom{4}{1}
\]
Recall that the rigorous statement of the Rule of Sum concerned cardinalities
of finite sets, and yet we didn’t technically get into those details in the previous
example. There is a certain amount of discretion and finesse required with these
types of combinatorial arguments. Is it obvious to you that every poker hand
with at least three Aces has either exactly three or exactly four Aces, not both and
not neither? We are not saying it should be totally obvious and you’re a dummy
for not seeing it right away! Far from it! What we are saying is that this type
of statement should probably suffice as an explanation in a proof. Yes, we could
dive into further detail, reformulate poker hands in terms of sets, and completely
rigorize the game of poker into set notation. What good would that really do,
though? It seems far easier to explain it via the italicized statement above. If we
were pressed for details by a confused reader, we could offer further explanation,
but for a general audience, this argument would suffice. Hopefully, this rule of
thumb–convincing a general audience, but being able to explain further when
pressed further–should guide you into making decisions about how much detail
to include in a counting argument. The essential observation here is that we
indicated why our choices pertain to a partition of the set of hands in question.
No, we didn’t rigorously prove the two sets were disjoint, but we offered an
explanation as to why.
Another approach to this problem does not involve considering the suits of
the non-Ace cards. Instead, we can approach the process of constructing a poker
hand with at least 3 Aces as follows:
Thus, by the Rule of Sum (since we have partitioned the hands based on how
many Aces they have) and by the Rule of Product in each of the two cases, we
have
\[
\binom{4}{3} \binom{48}{2} + \binom{4}{4} \binom{48}{1}
\]
total poker hands with at least 3 Aces. You will see (and use) this approach
more often. The previous argument was more similar to the previous example
involving flushes, so that’s what we presented first. This argument is a bit
shorter and “slicker”, and is thus more commonly used. But wait a minute,
these answers look different! We were counting the same set of poker hands, so
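shouldn’t we expect the same final number? Well, yes, and we recommend that
you perform the requisite algebraic manipulations to convince yourself that
\[
\binom{4}{3} \binom{48}{2} + \binom{48}{1} = \binom{4}{3} \left[ \binom{12}{2} \binom{4}{1}^2 + \binom{12}{1} \binom{4}{2} \right] + \binom{4}{4} \binom{12}{1} \binom{4}{1}
\]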
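A numerical check is no substitute for the algebra, but it is a handy sanity test. This short sketch (variable names are ours) evaluates both sides of the identity, along with the intermediate fact that choosing 2 of the 48 non-Ace cards splits into "two different ranks" and "same rank" cases.

```python
from math import comb

# Left side: count by cards (choose the Aces, then any non-Ace cards).
by_cards = comb(4, 3) * comb(48, 2) + comb(4, 4) * comb(48, 1)

# Right side: count by ranks and suits of the non-Ace cards.
two_different_ranks = comb(12, 2) * comb(4, 1) ** 2
same_rank = comb(12, 1) * comb(4, 2)
by_suits = (comb(4, 3) * (two_different_ranks + same_rank)
            + comb(4, 4) * comb(12, 1) * comb(4, 1))

print(by_cards, by_suits)
# The two sub-cases together account for every pair of non-Ace cards.
print(comb(48, 2) == two_different_ranks + same_rank)
```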
(b) From the remaining 49 cards, choose 2 more: $\binom{49}{2}$ options
2. For hands that have four Aces:
(a) Choose 4 of the 4 Aces: $\binom{4}{4} = 1$ option
(b) From the remaining 48 cards, choose 1 more: $\binom{48}{1}$ options
Thus, there are
\[
\binom{4}{3} \binom{49}{2} + \binom{4}{4} \binom{48}{1}
\]
poker hands with at least 3 Aces.
What’s the problem here? Do you see any errors? Was the Rule of Product
applied inappropriately? Was the Rule of Sum applied to something that isn’t
actually a partition? Did we overcount? Undercount? Did we count some hands
that do not have the desired properties? Think about this before reading on.
Here’s what we noticed: this answer is too large. We overcounted by includ-
ing certain hands multiple times in our count. That is, every hand we sought
to count is included at least once by the steps above, but some hands can be
constructed in multiple ways via those steps. These observations guarantee that
our number is too large.
How did we know this? We recommend actively trying to identify a hand
that can be constructed in different ways by following the above steps. If you’re
reading through a proof and can do this, you know that the entire proof is now
flawed. In this case, let’s examine a hand that has exactly 4 Aces; specifically,
let’s look at the hand A♣A♠A♦A♥2♣. We can construct this hand by the
following paths through the steps:
1. Choose 3 of the 4 Aces: A♣A♠A♦
2. From the remaining 49 cards, choose 2 more: A♥2♣
Or, we could take this path:
1. Choose 4 of the 4 Aces: A♣A♠A♦A♥
2. From the remaining 48 cards, choose 1 more: 2♣
Do you see the problem now? This exact same hand is produced in (at least) two
distinct ways via the process outlined above. Thus, the answer is an overcount.
Are there any other ways we could construct this same hand? How many? Try
to identify another hand that is overcounted. Can we possibly identify how
many times every hand is overcounted and amend our answer that way?
This is an interesting (and very challenging, actually!) idea that we’ll return to
later.
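If you would like to experiment with these questions, the flawed process is small enough to simulate. The sketch below is our own illustration: cards are written as two-character strings such as "AC" for the Ace of Clubs, which is an assumed encoding. It follows both paths of the flawed argument, records every hand it builds, and compares the number of constructions with the number of distinct hands.

```python
from collections import Counter
from itertools import combinations

RANKS = "A23456789TJQK"
SUITS = "CSDH"  # assumed letters for clubs, spades, diamonds, hearts
deck = [r + s for r in RANKS for s in SUITS]
aces = [c for c in deck if c[0] == "A"]

built = Counter()  # hand -> number of ways the flawed steps construct it

# Path 1 (flawed): choose 3 Aces, then 2 more from the remaining 49 cards.
for chosen in combinations(aces, 3):
    remaining = [c for c in deck if c not in chosen]
    for extra in combinations(remaining, 2):
        built[frozenset(chosen + extra)] += 1

# Path 2: choose all 4 Aces, then 1 more from the remaining 48 cards.
for chosen in combinations(aces, 4):
    remaining = [c for c in deck if c not in chosen]
    for extra in combinations(remaining, 1):
        built[frozenset(chosen + extra)] += 1

constructions = sum(built.values())  # what the flawed formula counts
distinct_hands = len(built)          # what we actually wanted to count
example = frozenset({"AC", "AS", "AD", "AH", "2C"})
print(constructions, distinct_hands, built[example])
```

The gap between the two totals is exactly the overcount, and `built[example]` reports how many paths lead to the particular hand examined above.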
Or, perhaps the union of the sets of the “partition” does not actually cover the
entire set in question.
• Overcount:
Every desired object is counted at least once, but some are counted more
than once. That is, some elements of the set in question can be counted
in multiple ways via the steps of the proof.
• Undercount:
Some desired objects are not counted at all. That is, some elements of the
set in question are not counted by the steps of the proof.
• Extraneous Count:
Some undesired objects are counted. That is, the steps of the proof count
some objects that are not elements of the set in question.
We recommend reading over your written proofs and trying to identify these
flaws, even if they aren’t there. Perhaps by struggling to find an overcounting
argument, say—by attempting to construct certain objects in multiple ways via
your steps—you actually identify a flaw you didn’t know was there! If you don’t
find any flaws, you can be more assured that your proof is fully correct.
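When the objects are few enough to list, this kind of self-check can be automated. The helper below, `diagnose`, is a hypothetical function of our own design, not a standard tool. It takes every object a counting process constructs (with repeats) and the set of desired objects, and reports which of the three flaws occur. The toy example counts 2-element subsets of {1, 2, 3} by picking a "first" and a "second" element, which imposes an order that doesn't matter.

```python
from collections import Counter
from itertools import combinations

def diagnose(constructions, target):
    """Compare what a process builds (with repeats) against the desired set."""
    built = Counter(constructions)
    target = set(target)
    return {
        # desired objects that the process constructs more than once
        "overcounted": {x for x, n in built.items() if x in target and n > 1},
        # desired objects that the process never constructs
        "undercounted": target - set(built),
        # constructed objects that were never wanted
        "extraneous": set(built) - target,
    }

# Desired objects: the 2-element subsets of {1, 2, 3}.
target = {frozenset(pair) for pair in combinations([1, 2, 3], 2)}

# Flawed process: pick a first element, then a different second element.
flawed = [frozenset((a, b)) for a in [1, 2, 3] for b in [1, 2, 3] if a != b]

report = diagnose(flawed, target)
print(len(flawed), len(target))
print(report)
```

Here the process makes 6 constructions for only 3 desired objects, and every subset lands in the "overcounted" entry.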
Example 8.3.6. Here is a standard example of a naive overcount. We will show
how it is an overcount and then fix it by counting in a different way! Here is
the question:
How many 5-card hands have at least one card of each suit?
Here is an incorrect argument:
There are $\binom{13}{1}^4 \cdot \binom{48}{1} = 1370928$ such hands.
We can use a five-step process. In step 1, we select one of the 13
Hearts. In step 2, we select one of the 13 Diamonds. In step 3, we
select one of the 13 Spades. In step 4, we select one of the 13 Clubs.
There are $\binom{13}{1}$ ways to do each of these steps.
What’s wrong with this? Think about it carefully before reading on. Look at
the list of potential mistakes above; does one of them apply here? How would
you show this?