Set Theory, Logic, and Their Limitations - Machover, Moshé - 1996 - Cambridge New York - Cambridge University Press - 0521479983 - Anna's Archive
FLORIDA STATE
UNIVERSITY LIBRARIES
TALLAHASSEE, FLORIDA
Digitized by the Internet Archive
in 2023 with funding from
Kahle/Austin Foundation
https://archive.org/details/settheorylogicthOO0Omach
Set Theory, Logic
and their Limitations
Moshé Machover
King’s College London
CAMBRIDGE UNIVERSITY PRESS
Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge CB2 1RP
40 West 20th Street, New York, NY 10011-4211, USA
10 Stamford Road, Oakleigh, Melbourne 3166, Australia
A catalogue record for this book is available from the British Library
Contents
Preface
Mathematical induction
Cardinals
Ordinals
Moshé Machover
Note
• Throughout ‘B&M’ refers to
Warning
In the last three chapters of this book there is a systematic interplay
between parallel sets of symbols; one set consisting of symbols in
ordinary (feint) typeface:
eed ‘_? ENT IS ‘>? ey" Vie ee oa?
[Figure: a row of dominoes labelled 0, 1, 2, ..., n, n + 1.]
the initial domino (here labelled ‘0’) is given a gentle push — and the
whole row comes cascading down.
If you want to perform this trick, how can you make sure that all the
dominoes standing in a row will fall? Clearly, the following two
conditions are jointly sufficient.
The reasoning that allows us to infer from Conditions 1 and 2 that all
the dominoes will fall is based on the Principle of Mathematical (or
Complete) Induction. This is a fundamental — arguably the most
fundamental — fact about the so-called natural numbers (0, 1, 2, etc.).
It has several equivalent forms, three of which will be presented here.
0. Mathematical induction
WARNING
1.1. Fact
The relation < between numbers is transitive: whenever k < m and
m < n, then also k < n.
1.2. Fact
The relation < obeys the trichotomy: for any numbers m and n, exactly
one of the following three holds:
m < n or m = n or n < m.
1.3. Fact
Every number n has an immediate successor n + 1, such that, for any
m, n < m iff n + 1 ≤ m.
1.4. Fact
Zero is the least number: 0 ≤ n for all n.
§2. Weak induction
basis (that is, you have not proved that P0) then you are not entitled
to conclude that Pn holds for all numbers n; indeed you are not even
entitled to conclude that there exist any numbers n for which Pn
holds. For example, let P be the property of being a number that is
greater than itself; so Pn means that n > n. Now, from the hypothesis
n > n it is easy to deduce n + 1 > n + 1 (for example, by adding 1 to
both sides of the hypothesis); so we have shown that ∀n[Pn ⇒
P(n + 1)]. But it doesn’t follow that there is any number greater than
itself.
2.2. Remark
The Weak Principle of Induction was first invoked in 1653 by Pascal in
the proof of one of the results (Corollary 12) in his Traité du triangle
arithmétique (published in 1665). Pascal does not give an explicit
formulation of the principle in general, for arbitrary P; but from his
presentation of the method of proof it is clear that the general principle
is being invoked. We shall not reproduce Pascal’s proof here. Instead,
we shall illustrate the use of weak induction in proving a simpler result.
2.3. Example
We shall prove that, for all n,
(*) 0 + 1 + 2 + ··· + n = n(n + 1)/2.
PROOF
Define the property P by stipulating that Pn iff (*) holds for n. We
show by weak induction that ∀nPn.
Basis. For n = 0 the sum on the left-hand side reduces to 0, and the
value of the right-hand side is 0. Thus P0.
Induction step. Let n be any number such that Pn; thus our induction
hypothesis is that (*) holds for this n. Then
0 + 1 + 2 + ··· + n + (n + 1) = n(n + 1)/2 + (n + 1)    by ind. hyp.
                              = (n + 1)(n/2 + 1)
                              = (n + 1)(n + 2)/2.
(The last two steps consist of simple algebraic manipulation.) Thus
from the induction hypothesis we have deduced that
§3. Strong induction
3.2. Theorem
The Strong Principle of Induction follows from the Weak Principle of
Induction.
PROOF
Assume that P is a property of numbers such that ∀n[∀m < nPm ⇒
Pn] holds. We shall show, using weak induction, that ∀nPn holds as
well. To this end, we define a new property Q by stipulating that, for
any number n,
(*) Qn ⇔df ∀m < nPm.
(The subscript ‘df’ is short for ‘definition’.) Note that our assumption
regarding P can now be rewritten as
(**) ∀n[Qn ⇒ Pn].
4.1. Theorem
The LNP follows from the Strong Principle of Induction.
PROOF
Let M ⊆ N and suppose that M does not have a least member. We
must show M is empty. To this end, let P be the property of not
belonging to M. Thus, for any n,
Pn ⇔df n ∉ M.
4.2. Theorem
The Weak Principle of Induction follows from the LNP.
PROOF
Let P be a property of numbers such that P0 and ∀n[Pn ⇒ P(n + 1)]
hold. We must prove that ∀nPn holds. This amounts to showing that
the class
We have thus shown that the Weak Principle of Induction, the Strong
Principle of Induction and the LNP are equivalent to one another.
4.3. Remark
While there is no evidence that the ancient Greek mathematicians
knew the Principles of Weak and Strong Induction, they did use
mathematical induction in the form of the LNP. We shall quote here
from a proof of Proposition 31 in Euclid’s Elements, Book VII.
First we need a few definitions. By arithmos (plural: arithmoi) the
Greeks meant what we call natural number greater than 1. An arithmos
b is said to measure an arithmos a if b < a and b goes into a (in
modern terminology: b is a proper divisor of a). An arithmos a is said
to be composite if there is an arithmos that measures it; otherwise, a is
said to be prime.
In Proposition 31 of Book VII, Euclid claims that every composite
arithmos is measured by some prime arithmos. He writes:
‘Let a be a composite arithmos. I say that it is measured by some prime
arithmos. For since a is composite, it will be measured by an arithmos,
and let b be the least of the arithmoi measuring it.’
Here the LNP is clearly invoked. The proof is now easily concluded: b
must be prime; otherwise, it would be measured by some smaller
arithmos c, which must then also measure a — contrary to the choice of
b as the least of the arithmoi measuring a.
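The argument can be illustrated on a finite range with a short Python sketch. It picks the least arithmos measuring a composite a and checks that it is prime. The names `measures`, `least_measure` and `is_prime_arithmos` are ours, introduced only for illustration, and a check over finitely many cases is of course no substitute for the proof.

```python
def measures(b, a):
    """b measures a: b is an arithmos (b > 1), b < a, and b goes into a."""
    return 1 < b < a and a % b == 0


def least_measure(a):
    """Least arithmos measuring a, or None if nothing measures a."""
    for b in range(2, a):  # scan upwards, so the first hit is the least
        if measures(b, a):
            return b
    return None


def is_prime_arithmos(a):
    """An arithmos is prime if no arithmos measures it."""
    return a > 1 and least_measure(a) is None


# For every composite a in a small range, the least measure is prime.
for a in range(2, 200):
    b = least_measure(a)
    if b is not None:  # a is composite
        assert is_prime_arithmos(b)
```

Scanning upwards from 2 is the computational counterpart of choosing "the least of the arithmoi measuring a".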
Euclid also gives another proof of the same proposition, in which he
uses yet another form of the Principle of Induction: There does not
exist an infinite decreasing sequence of natural numbers.¹
¹ On these matters see David Fowler, ‘Could the Greeks have used Mathematical
Induction? Did they use it?’, Physis, vol. 31 (1994), pp. 252-265.
1
Sets and classes
§1. Introduction
1.1. Preview
¹ Cf. Eric Partridge, Usage and abusage: ‘COLLECTIVE NOUNS; ... Such collective nouns
as can be used either in the singular or in the plural (family, clergy, committee,
Parliament), are singular when unity (a unit) is intended; plural, when the idea of
plurality is predominant.’
1.4. Definition
We write ‘a ∈ A’ as short for ‘[the object] a belongs to [the class] A’.
The same proposition is also expressed by saying that a is a member of
A, or an element of A, or that A contains a. We write ‘a ∉ A’ to
negate the proposition that a ∈ A.
1.5. Definition
If P is any definite property, such that the condition Px is meaningful
for an arbitrary object x, then the extension of P, denoted by
{x : Px},
is the class of all objects x such that Px. Thus a ∈ {x : Px} iff Pa.
Classes having exactly the same members are regarded as identical. Let
us state this more formally:
then A = B.
are equal: although the two defining conditions differ in meaning, they
are satisfied by the same objects — the integers 0 and 1.
1.7. Remark
Set theory (along with other parts of present-day mathematics) is
dominated by a structuralist ideology, which entails an extensionalist
view of properties. This means that properties having equal extensions
are considered to be equal; thus a property and its extension uniquely
determine each other.
least two years earlier. The antinomy results directly from the assump-
tion that the class W of all ordinals is a set. (The theory of ordinals is
an important but quite technical part of set theory. In Ch. 4, when we
study the ordinals, we shall prove that W cannot be a set.) Similar
antinomies were later discovered by Cantor himself and by others.
Cantor was not too disturbed by these discoveries. He noticed that
the antinomies arose from applying the Comprehension Principle to
classes that were not just infinite but extremely vast. (An early result
of his set theory was that not all infinite classes have the same ‘size’.)
He concluded that some classes are not merely infinite but absolutely
infinite, hence simply too large to be comprehended as a single object.
Set theory would be on safe ground if the Comprehension Principle
were restricted to classes of moderate size.¹ However, he did not
specify precisely how to draw the line between moderately large
infinite classes, which can be regarded as sets with impunity, and vast
ones, which cannot be so regarded.
Matters came to a head in 1903, when Bertrand Russell published a
new antinomy, Russell’s Paradox, which he had discovered two years
earlier. Whereas previous antinomies arose in rather technical reaches
of set theory and therefore required lengthy expositions, Russell’s
Paradox checkmated the Comprehension Principle in two simple
moves, as follows. Let
¹ See Michael Hallett, Cantorian set theory and limitation of size.
² Russell’s paper, ‘Mathematical logic as based on the theory of types’, is reprinted in
van Heijenoort, From Frege to Gödel.
¹ Russell too had briefly toyed with the same idea in 1905.
² A translation of Zermelo’s paper, ‘Investigations in the foundations of set theory I’, is
printed in van Heijenoort, From Frege to Gödel.
³ This postulate, as well as Zermelo’s Axiom of Separation and Axiom of Union Set, had
in fact been foreshadowed in 1899 by Cantor, in a letter to Dedekind, a translation of
which is printed in van Heijenoort, From Frege to Gödel.
§3. Zermelo’s axioms
¹ The first to formulate such precise conditions was Hermann Weyl in Das Kontinuum
(1918). A similar (and somewhat more formal) characterization was given independently
by Skolem in a 1922 paper whose translation, ‘Some remarks on axiomatized set
theory’, is printed in van Heijenoort, From Frege to Gödel.
3.1. Definition
If n is any natural number and a₁, a₂, ..., aₙ are any objects, not
necessarily distinct, we put
{a₁, a₂, ..., aₙ} =df {x : x = a₁ or x = a₂ or ... or x = aₙ}.
In particular, for n = 0 we get the empty class { } = {x : x ≠ x}, which
we denote by ‘∅’. (No object can differ from itself!)
3.3. Remarks
(i) This set is called the pair of a and b. By PX we have
{a, b} = {b, a}.
(ii) For any object a we clearly have {a} = {a, a}, which is a set by
A2. This set is called the singleton of a.
(iii) From our assumption that there exists at least one object a, it
now follows that there exists at least one set, namely {a}. Note
however that we cannot prove the existence of an individual: our
postulates are neutral on this matter.
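The behaviour of pairs and singletons described in these remarks can be mimicked for finite collections in Python. In the sketch below, classes are modelled by `frozenset` values; that modelling, and the helper names `pair` and `singleton`, are assumptions made for illustration only.

```python
# Finite classes modelled as Python frozensets (an assumption of this sketch).
def pair(a, b):
    """The class {a, b}: its only members are a and b."""
    return frozenset([a, b])


def singleton(a):
    """The class {a}, obtained as the pair {a, a}."""
    return pair(a, a)


a, b = "a", "b"
assert pair(a, b) == pair(b, a)        # the order of listing is irrelevant
assert singleton(a) == frozenset([a])  # {a, a} has a as its only member
assert len(singleton(a)) == 1

empty = frozenset()                    # the case n = 0: the empty class
assert a not in empty
```

Equality of frozensets depends only on membership, which is the extensional view of classes adopted in the text.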
3.4. Definition
Let A and B be classes. If every member of B is also a member of A,
we say that B is a subclass of A (also, B is included in A, or A
includes B), briefly: B ⊆ A.
If B ⊆ A but A ≠ B, we say that B is a proper subclass of A (also,
B is properly included in A, or A properly includes B), briefly:
B ⊂ A.
3.5. Warnings
(i) Beware of confusing ‘contains’ and ‘includes’; the former refers
to the relation of membership ∈ while the latter refers to the
relation ⊆ just defined.
(ii) However, this terminological distinction is not observed by all
authors, so watch out for other usages.
(iii) Also, the notation introduced in Def. 3.4 is not universally
accepted. Some authors use ‘⊂’ instead of ‘⊆’ for not-necessarily-proper
inclusion; and ‘⊊’ instead of ‘⊂’ for proper inclusion.
3.7. Definition
If A is a class and P is a definite property such that the condition Px is
meaningful for any object x, we put
3.8. Remarks
(i) Zermelo’s formulation of AS, clearly equivalent to the one used
here, said (in effect) that if A is a set then the class {x ∈ A : Px}
is always a set. Since this class separates or singles out those
members of A that have the property P, he called AS the Axiom
of Separation (Aussonderung). This name is still in current use.
(ii) The intuitive idea behind AS is clear: if B ⊆ A and A is not too
vast, then B cannot be too vast either.
3.9. Theorem
∅ is a set.
PROOF
3.10. Theorem
The class of all objects (the universe of discourse) and the class of all
sets are proper classes.
PROOF
We saw in § 2 that Russell’s class,
cannot be a set. Since Russell’s class is included in the class of all sets,
the latter cannot be a set by AS. The same applies to the universe of
discourse. ∎
3.11. Definition
If A is any class, we put
3.13. Remarks
(i) The members of ⋃A are the members of the members of A.
(ii) Intuitively, the idea behind AU is that if A is a set then it does
not have ‘too many’ members; and each of these, being an object
(an individual or a set), in turn does not have ‘too many’
members. Therefore ⋃A, obtained by pooling together not-too-many
collections, none of which is too vast, cannot itself be too
vast.
3.14. Definition
For any classes A and B, we put
A ∪ B =df {x : x ∈ A or x ∈ B}.
3.15. Theorem
A ∪ B is a set iff both A and B are sets.
PROOF
3.16. Theorem
If n is any natural number and a₁, a₂, ..., aₙ are any objects, the class
{a₁, a₂, ..., aₙ} is a set.
PROOF
By (weak) induction on n.
which is a set by the induction hypothesis, Rem. 3.3(ii) and Thm. 3.15.
∎
3.17. Definition
If A is any class, we put
3.19. Remark
Intuitively, the idea behind AP is that although PA can be very large —
in fact, much larger than A — its size is nevertheless bounded provided
A itself is not too vast.
3.20. Problem
Prove that if A is a class of sets (that is, a class all of whose members
are sets) such that ⋃A is a set, then A is a set as well.
3.22. Remarks
(i) Without AI it is impossible to prove that there are infinite sets.
On the other hand, it is easy to see intuitively that any set Z
satisfying the conditions imposed by AI must be infinite. We shall
be able to prove this rigorously when we have a rigorous defini-
tion of infiniteness.
(ii) A2, AS, AU and AP are clearly particular cases of the Principle
of Comprehension: they say that certain classes are sets. Al-
though AI as it stands is not of this form, we shall see later that it
is equivalent to the proposition that a certain class, ω, is a set.
4.1. Definition
If A is any class,
⋂A =df {x : x ∈ y for every y ∈ A}.
⋂A is called the intersection class of A.
4.2. Definition
If A and B are classes,
A ∩ B =df {x : x ∈ A and x ∈ B}.
4.3. Definition
If A is any class,
Aᶜ =df {x : x ∉ A}.
4.4. Definition
If A and B are any classes,
A − B =df A ∩ Bᶜ.
4.5. Problem
(i) Prove that if A is a non-empty class then ⋂A is a set. What is
⋂∅?
(ii) Prove that if A or B is a set then so is A ∩ B.
(iii) Prove that A and Aᶜ cannot both be sets.
2
Relations and functions
For any two objects a and b, not necessarily distinct, we need a unique
object (a, b) called the ordered pair of a and b [in this order]. It is not
really important how the ordered pair is defined, so long as the
following condition is satisfied:
(1.2) (a, b) = (c, d) ⇒ a = c and b = d.
1.3. Warning
The ordered pair (a,b) must not be confused with the set {a, b},
sometimes known as an unordered pair, whose members are just a and
b. For example, the sets {a, b} and {b, a} are always equal (see Rem.
1.3.3(i)), but by (1.2) the ordered pairs (a, b) and (b, a) are equal
only if a = b. However, when there is no risk of confusion we shall
often omit the adjective ‘ordered’ and say ‘pair’ when we mean ordered
pair.
1.4. Definition
For any objects a and b,
1.5. Problem
Prove that (1.2) follows from Def. 1.4.
More generally, for any number n and any n objects a₁, a₂, ..., aₙ
(not necessarily distinct) we need a unique object (a₁, a₂, ..., aₙ)
called the ordered n-tuple of a₁, a₂, ..., aₙ [in this order]. Again, it is
not really important how ordered n-tuples are defined, so long as the
following condition, of which (1.2) is a special case, is satisfied:
1.7. Definition
For any n ≥ 2 and objects a₁, a₂, ..., aₙ, aₙ₊₁,
1.8. Problem
Prove (1.6) for all n ≥ 2. (Use weak induction on n, taking n = 2 as
basis.)
§1. Ordered n-tuples
(a) = (b) ⇒ a = b.
The simplest way to satisfy this is to adopt the following.
1.9. Definition
(a) =df a.
1.10. Definition
( ) =df ∅.
1.11. Remark
The equality which was decreed by Def. 1.7 for n ≥ 2, now holds also
for n = 1 by virtue of Def. 1.9. However, it does not hold for n = 0,
because by Def. 1.9 (a) = a, whereas by Def. 1.10 (( ), a) = (∅, a).
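The point of this remark can be checked mechanically. The Python sketch below takes ordered pairs as primitive (Python 2-tuples) and builds longer tuples by nesting on the left; this reading of Def. 1.7 is our assumption, inferred from the remark itself, and the names `pair`, `ntuple` and `EMPTY` are hypothetical.

```python
EMPTY = frozenset()  # stands in for the empty class


def pair(a, b):
    """Ordered pair, taken as primitive here (a Python 2-tuple)."""
    return (a, b)


def ntuple(*xs):
    """Ordered n-tuple following the pattern of Defs. 1.7, 1.9 and 1.10."""
    if len(xs) == 0:
        return EMPTY               # Def. 1.10: the 0-tuple is the empty class
    if len(xs) == 1:
        return xs[0]               # Def. 1.9: (a) is a itself
    if len(xs) == 2:
        return pair(xs[0], xs[1])  # the ordered pair
    # Assumed form of Def. 1.7: (a1, ..., an, an+1) = ((a1, ..., an), an+1)
    return pair(ntuple(*xs[:-1]), xs[-1])


a, b, c = "a", "b", "c"
assert ntuple(a, b, c) == pair(pair(a, b), c)
# The equality extends to n = 1: ((a), b) is the pair (a, b).
assert pair(ntuple(a), b) == ntuple(a, b)
# It fails for n = 0: (( ), a) is the pair (EMPTY, a), whereas (a) is a.
assert pair(ntuple(), a) == (EMPTY, a)
assert pair(ntuple(), a) != ntuple(a)
```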
1.12. Definition
(i) For any classes A₁, A₂, ..., Aₙ, not necessarily distinct, their
cartesian product [in this order] is the class
A₁ × A₂ × ··· × Aₙ =df {(x₁, x₂, ..., xₙ) : x₁ ∈ A₁, x₂ ∈ A₂, ..., xₙ ∈ Aₙ},
that is, the class of all n-tuples whose i-th component belongs to
Aᵢ for i = 1, 2, ..., n.
(ii) The n-th cartesian power of a class A is the cartesian product of
A with itself n times:
Aⁿ =df A × A × ··· × A (n times).
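For finite classes the cartesian product and cartesian power can be computed directly. The following Python sketch uses `itertools.product` from the standard library; the function names `cartesian_product` and `cartesian_power` are ours, and Python's flat tuples are used in place of whatever set-theoretic coding of n-tuples is adopted.

```python
from itertools import product


def cartesian_product(*classes):
    """Class of all n-tuples whose i-th component belongs to the i-th class."""
    return frozenset(product(*classes))


def cartesian_power(A, n):
    """n-th cartesian power: the product of A with itself n times."""
    return frozenset(product(A, repeat=n))


A1, A2 = frozenset({0, 1}), frozenset({"x", "y", "z"})
P = cartesian_product(A1, A2)
assert (0, "x") in P and ("x", 0) not in P  # the order of the factors matters
assert len(P) == len(A1) * len(A2)
assert cartesian_power(A1, 3) == cartesian_product(A1, A1, A1)
```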
1.13. Remarks
(i) In Def. 1.12(i) we have used a convenient generalization of the
class notation introduced in Def. 1.1.5. Although it is almost
self-explanatory, let us spell it out.
Suppose F(x₁, x₂, ..., xₙ) is an object whenever x₁, x₂, ...,
xₙ are objects; and suppose P(x₁, x₂, ..., xₙ) is a condition
involving x₁, x₂, ..., xₙ. Then
WA Grwee peas A BS Bap 2 tho 2 1
1.14. Definition
(i) For any n ≥ 1 and any class A, an n-ary relation on A is a class of
n-tuples of members of A, that is, a subclass of Aⁿ.
(ii) In particular, a property on A is a unary relation on A, that is, a
subclass of A.
1.15. Remarks
(i) If R is an n-ary relation we shall often write ‘R(a₁, a₂, ..., aₙ)’
as short for ‘(a₁, a₂, ..., aₙ) ∈ R’. In the special case where R is
a binary relation we shall often write ‘aRb’ for ‘(a, b) ∈ R’.
(ii) We could extend Def. 1.14(i) to the case n = 0, but the resulting
notion of 0-ary relation is found to be of little use.
§2. Functions; the axiom of replacement
2.1. Definition
A function (a.k.a. map or mapping) is a class f of ordered pairs
satisfying the functionality condition: whenever both (x, y) ∈ f and
(x, z) ∈ f then y = z.
2.2. Definition
Let f be a function.
2.3. Problem
Verify that from Defs. 2.1 and 2.2 it follows that a function f is equal
to its own graph; that is,
f = {(x, fx) : x ∈ dom f}.
Hence prove that functions f and g are equal iff dom f = dom g and
fx = gx for every x in their common domain.
2.4. Definition
Let f be a function.
(i) We say that f is a map from A to B (or that f maps A into B) if
dom f = A and ran f ⊆ B.
(ii) We say that f is a surjection from A to B (or that f maps A onto
B) if dom f = A and ran f = B.
(iii) We say that f is an injection (or a one-to-one map) if whenever x
and y are distinct members of dom f then fx and fy are also
distinct.
(iv) We say that f is a bijection from A to B if it is an injection as
well as a surjection from A to B (that is, a one-to-one map from
A onto B).
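For finite classes of pairs, the notions of Defs. 2.1 and 2.4 can be tested directly. The Python sketch below models a function as a `frozenset` of 2-tuples; the helper names are ours, and the size comparison used in `is_injection` is valid only because everything here is finite.

```python
def is_function(f):
    """Functionality condition: (x, y) in f and (x, z) in f imply y = z."""
    seen = {}
    for x, y in f:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True


def dom(f):
    return frozenset(x for x, _ in f)


def ran(f):
    return frozenset(y for _, y in f)


def is_injection(f):
    """Distinct members of dom f have distinct values (finite case)."""
    return is_function(f) and len(ran(f)) == len(dom(f))


def is_surjection(f, A, B):
    return is_function(f) and dom(f) == A and ran(f) == B


def is_bijection(f, A, B):
    return is_injection(f) and is_surjection(f, A, B)


A, B = frozenset({0, 1, 2}), frozenset({"p", "q", "r"})
f = frozenset({(0, "p"), (1, "q"), (2, "r")})
g = frozenset({(0, "p"), (1, "p"), (2, "r")})
not_f = frozenset({(0, "p"), (0, "q")})

assert is_bijection(f, A, B)
assert is_function(g) and not is_injection(g)
assert ran(g) <= B and ran(g) != B  # g maps A into B but not onto B
assert not is_function(not_f)
```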
2.5. Lemma
Let A and B be non-empty classes. Then A × B is a set iff both A and B
are sets.
PROOF
2.6. Theorem
Let n ≥ 1, and let A₁, A₂, ..., Aₙ be non-empty classes. Then
A₁ × A₂ × ··· × Aₙ is a set iff Aᵢ is a set for each i = 1, 2, ..., n.
PROOF
By weak induction on n.
(use Defs. 1.12(i) and 1.7 and Rem. 1.11). Hence, by Lemma 2.5 and
the induction hypothesis, A₁ × A₂ × ··· × Aₙ × Aₙ₊₁ is a set iff Aᵢ is
a set for each i = 1, 2, ..., n, n + 1. ∎
2.7. Corollary
If A is a set and R is an n-ary relation on A (for some n ≥ 1) then R is a
set as well.
PROOF
2.8. Theorem
Let f be a function. Then f is a set iff both dom f and ran f are sets.
PROOF
From this the required result follows, using the same argument as in
the proof of Lemma 2.5. ∎
2.10. Remarks
(i) AR is clearly a particular case of the Comprehension Principle.
(ii) In view of Thm. 2.8, AR is equivalent to the proposition that if f
is a function such that dom f is a set then f itself is a set. The
intuitive idea behind AR is that f has exactly ‘as many’ members
as does dom f: for each a ∈ dom f, f contains the corresponding
pair (a, fa). Therefore if dom f is not too vast, neither is f itself.
(iii) In mathematical applications, a function f is almost always
defined as a mapping from A to B, where both A and B are
known in advance to be sets. It then follows from AS and Thm.
2.8 that ran f and f itself are sets. AR is not needed for this. But
as we shall see AR plays an important role within set theory
itself.
§3. Equivalence and order relations
3.2. Definition
R is an equivalence relation on A if R is a binary relation on A such
that, for any members x, y and z of A, the following three conditions
are satisfied:
xRx (reflexivity),
if xRy then also yRx (symmetry),
if xRy and yRz then also xRz (transitivity).
3.3. Example
The paradigmatic example of an equivalence relation on A is the
binary relation {(x, x) : x ∈ A}, called the identity (or diagonal)
relation on A, and denoted by ‘id_A’. By the way, id_A is clearly a
function; indeed, it is a bijection from A to itself.
3.4. Definition
Let R be an equivalence relation on A. For each a ∈ A we put
[a]_R =df {x : xRa}.
3.5. Theorem
Let R be an equivalence relation on A and let a and b be any members
of A. Then [a] = [b] iff aRb.
PROOF
3.6. Corollary
Let R be an equivalence relation on A and let a be any member of A.
Then a belongs to exactly one R-class, namely [a].
PROOF
We have seen that a ∈ [a]. If also a ∈ [b] then by Def. 3.4 aRb, so by
Thm. 3.5 it follows that [a] = [b]. ∎
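Thm. 3.5 and Cor. 3.6 can be checked on a small example. The Python sketch below uses congruence modulo 3 on the numbers 0 to 8 as the equivalence relation; this choice of example, and the helper name `eq_class`, are ours.

```python
def eq_class(R, A, a):
    """[a]_R: the class of all x in A such that xRa."""
    return frozenset(x for x in A if (x, a) in R)


A = frozenset(range(9))
# Congruence modulo 3, restricted to A, as a class of ordered pairs.
R = frozenset((x, y) for x in A for y in A if (x - y) % 3 == 0)

# Thm. 3.5: [a] = [b] iff aRb.
for a in A:
    for b in A:
        assert (eq_class(R, A, a) == eq_class(R, A, b)) == ((a, b) in R)

# Cor. 3.6: each a belongs to exactly one R-class.
classes = {eq_class(R, A, a) for a in A}
for a in A:
    assert sum(1 for c in classes if a in c) == 1
```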
3.7. Definition
(i) S is a sharp partial order on A if S is a binary relation on A such
that, for any members x, y and z of A, the following two
xBx (reflexivity),
if xBy and yBx thenx = y (weak anti-symmetry),
if xBy and yBz then also xBz (transitivity).
3.8. Example
Let A be a class of sets (that is, all the members of A are sets rather
than individuals). Let S and B be the restrictions to A of ⊂ and ⊆
respectively; that is,
3.9. Problem
Let S and B be a sharp and a blunt partial order, respectively, on A.
Put
(i) Prove that S♭ and B♯ are a blunt and a sharp order on A,
respectively.
(ii) Verify that S♭♯ = S and B♯♭ = B.
3.10. Remarks
(i) The qualifications ‘sharp’ and ‘blunt’ are often omitted and a
partial order of either kind is referred to simply as a ‘partial
order’. There is no real harm in this, for two reasons. First,
because it is usually clear from the context which kind of partial
order is meant. Second, as shown in Prob. 3.9, there is a natural
3.11. Definition
(i) S is a sharp total order on A if S is a binary relation on A such
that, for any members x, y and z of A, the following two
conditions are satisfied:
3.12. Problem
Let S and B be a sharp and a blunt total order, respectively, on A.
Prove that
(i) S is a sharp partial order, (ii) S♭ is a blunt total order,
(iii) B is a blunt partial order, (iv) B♯ is a sharp total order,
on A.
§4. Operations on functions
4.1. Definition
If f and g are functions such that ran f ⊆ dom g, we put
Pot Se x weV Re ey ea
4.2. Problem
Show: g∘f is a function, dom(g∘f) = dom f and ran(g∘f) ⊆ ran g.
Moreover, for any x in dom(g∘f), which is also dom f, check that
(g∘f)x = g(fx).
4.3. Definition
If f is an injective (that is, one-to-one) function we put
f⁻¹ =df {(y, x) : (x, y) ∈ f}.
f⁻¹ is called the inverse of f.
4.4. Problem
Verify that f⁻¹ itself is an injective function and, moreover,
dom(f⁻¹) = ran f, ran(f⁻¹) = dom f,
f⁻¹∘f = id_{dom f}, f∘f⁻¹ = id_{ran f}.
4.5. Problem
Prove that if f is a function from a proper class to a set, then f is not
injective.
4.6. Definition
If f is a function and C ⊆ dom f, we put
(i) f↾C =df {(x, fx) : x ∈ C},
(ii) f[C] =df {fx : x ∈ C}.
f↾C is called the restriction of f to C and f[C] is called the image of C
under f.
4.7. Problem
Verify that f↾C is a function, dom(f↾C) = C and ran(f↾C) =
f[C]. Moreover, (f↾C)x = fx for every x ∈ C.
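The operations of this section are easy to try out on finite functions. In the Python sketch below a function is modelled as a `dict` from arguments to values, which is an assumption of the sketch rather than the text's class-of-pairs representation; the names `compose`, `inverse`, `restrict`, `image` and `identity` are ours. The assertions at the end are finite spot checks of the claims in Probs. 4.2, 4.4 and 4.7, not proofs.

```python
def compose(g, f):
    """g o f, defined when ran f is included in dom g: x goes to g(fx)."""
    assert set(f.values()) <= set(g.keys())
    return {x: g[f[x]] for x in f}


def inverse(f):
    """Inverse of an injective function: swap the components of each pair."""
    inv = {y: x for x, y in f.items()}
    assert len(inv) == len(f), "f must be injective"
    return inv


def restrict(f, C):
    """Restriction of f to a subclass C of dom f."""
    return {x: f[x] for x in C}


def image(f, C):
    """Image f[C] = {fx : x in C}."""
    return {f[x] for x in C}


def identity(D):
    """Identity map on D."""
    return {x: x for x in D}


f = {0: "a", 1: "b", 2: "c"}
g = {"a": 10, "b": 20, "c": 30, "d": 40}

assert compose(g, f) == {0: 10, 1: 20, 2: 30}
assert compose(inverse(f), f) == identity(f.keys())    # cf. Prob. 4.4
assert compose(f, inverse(f)) == identity(f.values())
assert set(restrict(f, {0, 2}).values()) == image(f, {0, 2})  # cf. Prob. 4.7
```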
4.8. Problem
Let F be a class whose members are functions. Show that ⋃F is a
function iff the following coherence condition is fulfilled: fx = gx for
all f and g in F and all x ∈ dom f ∩ dom g. Assuming this condition
holds, what are dom ⋃F and ran ⋃F?
3
Cardinals
§1. Equipollence and cardinality
1.1. Definition
Let A and B be sets. We say that A and B are equipollent, briefly:
A ≈ B, if there exists a bijection from A to B (that is, a one-to-one
map from A onto B).
1.2. Theorem
Equipollence is an equivalence relation on the class of sets.
PROOF
1.4. Remarks
(i) Def. 1.3 is incomplete, because we have not specified what the
object |A| is or how it is to be chosen.
Cantor regarded cardinals as special abstract entities of a new
kind. In effect, this amounted to introducing the notion of
cardinal as a separate primitive notion.
However, it would obviously be more convenient (and conform
to the reductionist programme) if cardinals were among the
hitherto posited objects of set theory. In this spirit, Frege proposed
in 1884 the elegant idea of defining |A| as [A]≈, the
equivalence class of A modulo ≈ (see Def. 2.3.4). The condition
required by Def. 1.3, namely |A| = |B| ⇔ A ≈ B, would then follow at
once by Thm. 2.3.5.
This procedure, novel at the time, was to become standard
practice, used with respect to various equivalence relations that
arise in numerous mathematical situations.
Ironically, Frege’s procedure does not work at all well in the
present case, where the equivalence relation is ≈. Unaware that
the Comprehension Principle had to be restricted, he assumed as
a matter of course that [A]≈ is always a set, hence an object.
Unfortunately, this is in general false. For example, if A is a
singleton, then [A]≈ is the class of all singletons, and hence
⋃[A]≈ is the class of all objects, the entire universe of discourse,
which is a proper class by Thm. 1.3.10. Hence by AU [A]≈ must
be a proper class as well. This is very inconvenient, because we
would like to be able to form classes of cardinals, which is
impossible if cardinals are proper classes.
Fortunately there are other ways of defining cardinals, satisfying
the requirement of Def. 1.3, while ensuring that the cardinals
are sets. Later on, in Ch. 6, we shall follow one such procedure.
In each ≈-class we shall be able to select a unique ‘distinguished’
member. Then, for any set A, we can take |A| to be the
distinguished member of [A]≈ rather than that class itself. Then
Thm. 2.3.5 ensures that the requirement of Def. 1.3 is satisfied.
(ii) For the time being, let us take it on trust that Def. 1.3 can be
completed in a satisfactory way. This is not asking too much,
since our reference to cardinals may be regarded as a mere
convenience: everything that we shall say in this chapter in terms
of cardinals can easily be rephrased (at the cost of some circum-
locution) in terms of sets and mapping between sets.
2.1. Definition
Let λ and μ be cardinals. Let A and B be sets such that |A| = λ and
|B| = μ. We say that λ is smaller-than-or-equal-to μ (briefly: λ ≤ μ) if
there is an injection from A to B.
2.2. Remark
This definition is in need of legitimation: we must make sure that the
criterion it provides for asserting that λ ≤ μ depends only on these
cardinals themselves rather than on the choice of particular sets A and
B such that |A| = λ and |B| = μ. This is done as follows. Let A, A′,
B, B′ be sets such that |A| = |A′| and |B| = |B′|. Given an injection
from A to B, it is easy to show (DIY!) that there is also an injection
from A′ to B′.
2.3. Theorem
Let λ and μ be cardinals and let B be a set such that |B| = μ. Then
λ ≤ μ iff B has a subset whose cardinality is λ.
PROOF
PROOF
DIY. ∎
2.5. Definition
A map g from a class of sets to a class of sets is monotone if whenever
X and Y are sets in dom g such that X ⊆ Y then gX ⊆ gY.
2.6. Lemma
Let A be a set and let g be a monotone map from PA to itself. Then A
has a subset G such that gG = G.
PROOF
G = ⋂{X ∈ PA : gX ⊆ X}.
(See Def. 1.4.1.) We claim that G itself is good. To show this, let X be
any good set. Then G ⊆ X because G is the intersection of all good
sets. Therefore by the monotonicity of g we have gG ⊆ gX. Also,
since X is good, we have gX ⊆ X; hence gG ⊆ X. Thus we see gG is
included in every good set. Hence gG must also be included in the
intersection of all good sets. But this intersection is G itself; this means
that gG ⊆ G, so G is good, as claimed.
It now follows, by applying the monotone map g to both sides of
gG ⊆ G, that gG is good as well. But G, the intersection of all
good sets, is included in each of them and in particular in the good set
gG. So we have shown both gG ⊆ G and G ⊆ gG. Thus gG = G. ∎
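For a finite set A the construction in this proof can be carried out by brute force. The Python sketch below forms G as the intersection of all good subsets of A and confirms that gG = G. The particular A, B and f are hypothetical choices of ours, picked so that g has the shape gX = (A − B) ∪ f[X] used later in the proof of Thm. 2.7; f here is not injective, which Lemma 2.6 does not require.

```python
from itertools import combinations


def powerset(A):
    """All subsets of the finite set A, as frozensets."""
    items = list(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def least_good_set(A, g):
    """Intersection G of all good subsets X of A (those with gX included in X)."""
    G = frozenset(A)  # A itself is good, so the family is non-empty
    for X in powerset(A):
        if g(X) <= X:  # X is good
            G &= X
    return G


# Hypothetical example: A = {0, ..., 7}, B = A - {0}, f maps A into B.
A = frozenset(range(8))
B = A - {0}


def f(x):
    return min(x + 2, 7)


def g(X):
    return (A - B) | frozenset(f(x) for x in X)


# g is monotone: X included in Y implies gX included in gY.
subsets = powerset(A)
assert all(g(X) <= g(Y) for X in subsets for Y in subsets if X <= Y)

G = least_good_set(A, g)
assert g(G) == G  # G is a fixed point of g
```

Enumerating all 256 subsets is feasible only because A is tiny; the lemma itself places no finiteness restriction on A.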
PROOF
Let A be a set such that |A| = μ. Since λ ≤ μ, according to Thm. 2.3 A
has a subset, say B, such that |B| = λ. Since also μ ≤ λ, according to
Def. 2.1 there is an injection, say f, from A to B.
The claim that λ = μ will be proved if we show that there is a
bijection from A to B.
Define a map g from PA into itself by putting, for any X ⊆ A,
gX = (A − B) ∪ f[X].
(For the definitions of A − B and f[X], cf. Def. 1.4.4 and Def. 2.4.6.)
It is easy to see that g is monotone. By Lemma 2.6, there exists some
G ⊆ A such that G = gG. Thus
G = (A − B) ∪ f[G].
Note that f[G] ⊆ B because f maps the whole of A into B. (See Fig.
1. The large rectangle represents A; like Gaul, it is divided into three
parts.)
Now, f↾G is an injection from G to B and a bijection from G to
f[G] (see Prob. 2.4.7). Let us put
h = (f↾G) ∪ id_{A−G}.
Fig. 1
Thus h is a map whose domain is the whole of A, such that
2.8. Remarks
(i) In view of Thms. 2.4 and 2.7, ≤ is a [blunt] partial order on the
class of cardinals.
(ii) As usual in such cases, we denote by ‘<’ the sharp partial order
associated with ≤. (Thus < is ≤♯, see Prob. 2.3.9.) If λ and μ are
cardinals such that λ < μ we say that λ is smaller than μ.
(iii) Later on we shall prove (using the Axiom of Choice) that ≤ is a
total order on the class of cardinals.
§3. Cardinals for natural numbers
3.2. Remarks
(i) To legitimize Def. 3.1 we must verify that if a₁, a₂, ..., aₙ are
distinct objects and b₁, b₂, ..., bₙ are likewise distinct objects
then
3.3. Problem
Define cₙ by induction on n as follows:
c₀ = ∅ and cₙ₊₁ = {cₙ} for each n.
Prove that, for each n, the objects c₀, c₁, ..., cₙ are distinct. (Use
induction on n.)
Thus for any natural number n there exist n distinct objects, and
hence the corresponding cardinal n exists.
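The objects cₙ of Prob. 3.3 can be built explicitly for small n. The Python sketch below models the empty class as `frozenset()` and the singleton {x} as `frozenset([x])`; this modelling, the function name `c` and the bound N = 10 are our choices, and the finite check does not replace the induction asked for in the problem.

```python
def c(n):
    """c_0 is the empty class; c_{n+1} is the singleton {c_n}."""
    result = frozenset()
    for _ in range(n):
        result = frozenset([result])
    return result


N = 10
objects = [c(k) for k in range(N + 1)]
assert len(set(objects)) == N + 1  # c_0, c_1, ..., c_N are pairwise distinct
assert c(1) == frozenset([frozenset()])
assert c(2) in c(3) and c(1) not in c(3)
```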
3.4. Theorem
Let a₁, a₂, ..., aₙ be any objects. Then there does not exist an injection
from the set {a₁, a₂, ..., aₙ} to any proper subset of itself.
PROOF
3.5. Theorem
For any natural numbers n and m:
(WARNING. The two ‘≤’ here mean different things: the first denotes
the usual order among natural numbers, while the second denotes the
partial order on the cardinals.)
PROOF
(i) Assume m ≤ n. Take n distinct objects a₁, a₂, ..., aₙ (which
exist by Prob. 3.3). Since {a₁, a₂, ..., aₘ} is clearly a subset of
{a₁, a₂, ..., aₙ}, we have m ≤ n by Thm. 2.3.
(ii) Let m ≠ n. Without loss of generality we may assume m < n.
Take n distinct objects a₁, a₂, ..., aₙ. By Thm. 3.4 there is no
bijection from {a₁, a₂, ..., aₙ} to its proper subset {a₁, a₂, ...,
aₘ}. Therefore m ≠ n. ∎
3.6. Remark
A subtle matter: we have not shown that being a natural number is a
notion of set theory. Rather, we have taken this notion to be under-
stood in advance, prior to the development of set theory. Therefore
Def. 3.1 cannot be regarded as a single definition within this theory.
Rather, it is a definition scheme, a sequence of definitions whereby
each of the cardinals 0, 1, 2, 3, etc., in turn may be defined separately.
Similar caveats apply to the whole of this section as well as to
definitions like 1.3.1 and 2.1.7 and theorems like 1.3.16.
§4. Addition
In this section we shall see how cardinals may be added. But first we
introduce a useful bit of terminology.
4.1. Definition
If A ∩ B = ∅, we say that A and B are disjoint.
4.2. Lemma
For any sets A and B, there are disjoint sets A′ and B′, such that
|A| = |A′| and |B| = |B′|.
PROOF
Take any two distinct objects a and b (for example, ∅ and {∅}; see
Prob. 3.3). Then let
4.3. Lemma
Let A, B, A′, B′ be sets such that A ∩ B = A′ ∩ B′ = ∅, |A| = |A′|
and |B| = |B′|. Then |A ∪ B| = |A′ ∪ B′|.
PROOF
4.4. Definition
For any cardinals λ and μ, we define the sum of λ and μ:
λ + μ =df |A ∪ B|,
where A and B are disjoint sets such that |A| = λ and |B| = μ.
4.5. Remarks
(i) Def. 4.4 is legitimized by Lemma 4.3.
(ii) In the proof of Thm. 2.7 we made use of a special case of Lemma
4.3. We had there A = G ∪ (A − G) and B = f[G] ∪ (A − G),
where the unions in both cases are between disjoint sets. Also,
|G| = |f[G]| because f is injective. Hence we concluded that
|A| = |B|.
4.6. Theorem
If k, m and n are natural numbers and k + m = n, then the corresponding
cardinals also satisfy k + m = n.
PROOF
DIY. (WARNING. The two ‘+’ here mean different things. The first
denotes the operation of addition of numbers. The second denotes
addition of cardinals.) ∎
4.7. Problem
Verify, for all cardinals κ, λ and μ:
(i) κ + (λ + μ) = (κ + λ) + μ (associativity of addition),
(ii) λ + μ = μ + λ (commutativity of addition),
(iii) λ + 0 = λ (neutrality of 0 w.r.t. addition),
(iv) λ ≤ μ ⇒ κ + λ ≤ κ + μ (weak monotonicity of addition).
4.8. Warning
Although cardinal addition behaves in many ways like ordinary addition
of natural numbers, not all rules of ordinary arithmetic apply
here. For example, as we shall see later, from κ + λ = κ it does not
always follow that λ = 0. Hence the cancellation law does not apply in
general (from κ + λ = κ + μ it does not always follow that λ = μ); nor
is addition of cardinals strongly monotone (from λ < μ it does not
always follow that κ + λ < κ + μ).
4.9. Definition
If B is a function whose domain is a set X, we sometimes denote the
value of B at x ∈ X by ‘B_x’ rather than by ‘Bx’ and denote B itself by
‘{B_x | x ∈ X}’.
In this connection we refer to X as the index set and to B as the family
of the B_x, indexed by X.
4.10. Remark
Many authors use the vertical stroke ‘|’ instead of the colon for class
abstraction (as in Def. 1.1.5) and so use some other notation for
indexed families.
4.11. Definition
Let {B_x | x ∈ X} be an indexed family of sets (that is, all the B_x are
sets). Let μ_x = |B_x| for each x ∈ X. We put:
Σ{μ_x | x ∈ X} =df |∪{{x} × B_x : x ∈ X}|.
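For finite families the sum of Def. 4.11 can be computed directly. The sketch below assumes the family is given as a Python dict from index to set; the names `tagged_union` and `cardinal_sum` are hypothetical, chosen here for illustration.

```python
def tagged_union(family):
    """Union of the products {x} × B_x, for a family given as a dict x -> B_x."""
    return {(x, b) for x, members in family.items() for b in members}


def cardinal_sum(family):
    """Number of members of the tagged union, i.e. the sum of the |B_x|."""
    return len(tagged_union(family))


# B_u and B_v are equal, and B_w overlaps both of them.
family = {"u": {1, 2, 3}, "v": {1, 2, 3}, "w": {3, 4}}
print(cardinal_sum(family))              # counts every B_x separately
print(len(set().union(*family.values())))  # the plain union loses the overlap
```

Tagging each member with its index is what keeps overlapping or even equal sets B_x apart before the union is taken.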
4.12. Remarks
(i) Thus, to add up all the μ_x simultaneously, we form the cartesian
product {x} × B_x for each x ∈ X. (Note that these products are
pairwise disjoint: if x ≠ y then {x} × B_x and {y} × B_y are
disjoint, although B_x and B_y need not be disjoint and may even be
equal.) Then we take the union of all these products. Using AR
and AU it is easy to verify that this union is a set. The cardinality
of this set is the required sum.
(ii) To legitimize this definition one must show that if A is another
indexed family of sets with the same index set X such that
|A_x| = |B_x| for all x ∈ X, then
§5. Multiplication
5.1. Definition
For any cardinals λ and μ, we define the product of λ and μ:
λ · μ =df |A × B|,
where A and B are any sets such that |A| = λ and |B| = μ. We often
abbreviate ‘λ · μ’ as ‘λμ’.
5.2. Remarks
5.3. Theorem
Let λ and κ be any cardinals and let {μ_a | a ∈ A} be an indexed family
of cardinals such that μ_a = κ for every a ∈ A and such that |A| = λ.
Then
Σ{μ_a | a ∈ A} = λκ.
PROOF
Let D be a set such that |D| = κ. Applying Def. 4.11 to the indexed
family of sets {B_a | a ∈ A} such that B_a = D for every a ∈ A, we
obtain
5.4. Theorem
If k, m and n are natural numbers and km = n, then the corresponding
cardinals also satisfy km = n.
PROOF
DIY. ∎
5.5. Problem
Verify, for all cardinals κ, λ and μ:
5.6. Problem
Prove the following generalization of Prob. 5.5(v): if {λ_x | x ∈ X} is
any indexed family of cardinals and μ is any cardinal then
5.7. Warning
The same as 4.8, mutatis mutandis.
5.8. Lemma
Let C and D be any sets and let u and v be distinct objects. Let P be
the class
PROOF
5.9. Definition
If {B_x | x ∈ X} is an indexed family of sets, the class
{f : f is a function such that dom f = X and fx ∈ B_x for all x ∈ X}
is denoted by
‘⨉{B_x | x ∈ X}’
and called the direct product of the family {B_x | x ∈ X}.
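For a finite family of finite sets the direct product of Def. 5.9 can be listed explicitly. This is a minimal sketch, assuming the family is a dict from index to set and representing each function f as a dict; the name `direct_product` is introduced here for illustration only.

```python
from itertools import product


def direct_product(family):
    """All functions f with dom f = X and f(x) in B_x, each returned as a dict."""
    index = list(family)                      # the index set X, in a fixed order
    return [dict(zip(index, values))
            for values in product(*(family[x] for x in index))]


family = {"u": {0, 1}, "v": {"a", "b", "c"}}
functions = direct_product(family)
print(len(functions))        # number of members of the direct product
print(direct_product({"u": set(), "v": {1}}))   # an empty B_x leaves no functions
```

Note that a single empty B_x makes the whole product empty, while the empty family has exactly one member, the empty function.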
5.10. Lemma
If {B_x | x ∈ X} is any indexed family of sets, then ⨉{B_x | x ∈ X} is a
set.
PROOF
Recall (Def. 4.9) that {B_x | x ∈ X} is the function having the index set
X as its domain, whose value at each x ∈ X is B_x. Therefore the range
of this function is
{B_x : x ∈ X}
and this range is a set by AR. Now let us put
U = ∪{B_x : x ∈ X}.
U is a set by AU. Next, observe that by Def. 5.9, if f is any member of
⨉{B_x | x ∈ X} then f is a map from X to U. Hence f ⊆ X × U,
which means that f ∈ P(X × U). Thus we have shown that
⨉{B_x | x ∈ X} ⊆ P(X × U).
5.11. Definition
Let {B_x | x ∈ X} be a family of sets and let μ_x = |B_x| for each x ∈ X.
We put
5.12. Remarks
(i) Using AC it is easy to legitimize this definition by showing that if
A is another indexed family of sets with the same index set X
such that |A_x| = |B_x| for all x ∈ X, then
|⨉{A_x | x ∈ X}| = |⨉{B_x | x ∈ X}|.
(ii) Def. 5.1 can be regarded as a special case of Def. 5.11. Indeed, if
C and D are any sets, whose cardinalities are κ and λ respectively,
take X = {u, v}, where u and v are distinct objects, and let
{B_x | x ∈ X} be the family such that B_u = C and B_v = D. Then
Lemma 5.8, rewritten in the notation of Def. 5.9, says that
|⨉{B_x | x ∈ X}| = |C × D|.
§6. Exponentiation; Cantor’s Theorem
6.2. Remarks
(i) If f is any member of map(A, B) then f ⊆ A × B, hence f is
a member of P(A × B). Thus map(A, B) ⊆ P(A × B), and
map(A, B) is a set.
(ii) Perhaps more instructively, the same result can be derived
from Lemma 5.10, as follows. Consider the indexed family
dom f = A and fa ∈ B for all a ∈ A}.
By Def. 6.1 this is exactly map(A, B).
6.3. Definition
For any cardinals λ and μ, we define μ to the [power of] λ:
μ^λ =df |map(A, B)|,
where A and B are sets such that |A| = λ and |B| = μ.
6.4. Remarks
(i) This definition is legitimized by the easily verified fact that if
A ≈ A′ and B ≈ B′ then map(A, B) ≈ map(A′, B′).
(ii) From Rem. 6.2(ii) it follows that exponentiation (raising to a
power) can be achieved by repeated multiplication, in the following
sense: if {κ_a | a ∈ A} is an indexed family of cardinals such
that κ_a = μ for all a ∈ A, and if |A| = λ, then
Π{κ_a | a ∈ A} = μ^λ.
6.5. Problem
Let k, m be natural numbers, and let n = m^k. Verify that the corresponding cardinals also satisfy n = m^k.
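Before attempting Prob. 6.5 it may help to count maps between small finite sets by brute force. The sketch below is only an experiment under that finiteness assumption; `maps` is a name introduced here, and each map is encoded as a set of pairs, so that it is literally a subset of A × B as in Rem. 6.2(i).

```python
from itertools import product


def maps(A, B):
    """All maps from A to B, each encoded as a frozenset of pairs (a, f(a))."""
    A, B = sorted(A), sorted(B)
    return {frozenset(zip(A, values)) for values in product(B, repeat=len(A))}


# Compare the number of maps from a k-element set to an m-element set with m**k.
for k in range(4):
    for m in range(4):
        assert len(maps(range(k), range(m))) == m ** k
```

The case k = 0 deserves a glance: there is exactly one map with empty domain, the empty map, whatever B is.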
6.6. Problem
Verify that for any cardinals x, A and u:
6.7. Theorem
For any set A, |PA| = 2^|A|.
PROOF
By Def. 6.3, what we have to show is that PA is equipollent to
map(A, B), where B is a set having exactly two members. Let us take
B = {∅, {∅}}. Define a map F from map(A, B) to PA, by putting,
for every f ∈ map(A, B),
Ff = {a ∈ A : fa = ∅}.
PROOF
D = {x ∈ A : x ∉ gx}.
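The set D above has the shape of the diagonal set used in proofs of Cantor's Theorem, whose statement is not reproduced here. As a hedged illustration, the sketch below assumes g is a map from A to PA and checks exhaustively, for a small finite A, that D is never a value of g. The names `power_set`, `diagonal` and `diagonal_always_missed` are introduced for this sketch only.

```python
from itertools import combinations, product


def power_set(A):
    """All subsets of the finite set A, as frozensets."""
    items = sorted(A)
    return [frozenset(combo) for r in range(len(items) + 1)
            for combo in combinations(items, r)]


def diagonal(A, g):
    """D = {x in A : x not in g(x)}, for a map g from A to PA given as a dict."""
    return frozenset(x for x in A if x not in g[x])


def diagonal_always_missed(A):
    """Check, for every map g from A to PA, that D is not in the range of g."""
    items = sorted(A)
    subsets = power_set(A)
    for values in product(subsets, repeat=len(items)):
        g = dict(zip(items, values))
        if diagonal(A, g) in g.values():
            return False
    return True


print(diagonal_always_missed({0, 1, 2}))   # runs through all 8**3 maps
```

The exhaustive search is feasible only for very small A; the general argument does not depend on A being finite.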
6.9. Remark
Ordinals
The new order type just formed is described by the next ordinal, which
Cantor denoted by ‘ω + 1’. We can continue in this way, getting not
only ω + n for every natural n but also ω + ω, then ω + ω + 1 and so
on and on and on.
Examining the ‘queues’ formed in this way, Cantor saw that they are
not merely totally ordered, but have a special property not shared by
all totally ordered sets: every non-empty subset of the queue has a
least (first) member. Cantor called such queues well-ordered.
An example of a total order that is not a well-ordering is provided by
the integers, ordered according to magnitude:
... < −2 < −1 < 0 < 1 < 2 < ...
Note that the fact that the pattern (*), described by the ordinal ω, is
well-ordered is just the Least Number Principle, a form of the Principle
of Mathematical Induction (see § 4 of Ch. 0).
Cantor introduced the ordinals as a new and separate sort of abstract
entity, just as he did with cardinals. However, in 1923 John von
Neumann pointed out that among all well-ordered sets having a given
Cantorian ordinal as their order-type there is a particular one with
some very special properties. In the spirit of reductionism, this particu-
lar set can then be taken to be the ordinal of that order type.
We shall present von Neumann’s theory of ordinals as streamlined
by Raphael M. Robinson and others.
2.2. Remarks
(i) Instead of demanding that b < x for every other x ∈ B, we may
equivalently demand that b ≤ x for every x ∈ B. Here ≤ is of
course <^♭, the blunt partial order associated with < (see Prob.
2.3.9 and Rem. 2.3.10).
(ii) When there is no risk of confusion, we omit the phrase ‘with
respect to <’.
(iii) Since < is anti-symmetric, if B does have a least member it is
unique and we may therefore refer to it as the least member of
B.
2.3. Definition
A well-ordering on a class A is a partial order on A such that every
non-empty set included in A has a least member.
2.4. Lemma
If < is a well-ordering on a class A then < is a [sharp] total order on
A.
PROOF
According to Def. 2.3.11, we must show that < fulfils the trichotomy
and transitivity conditions. The latter condition is fulfilled because by
Def. 2.3 < is a partial order; so it only remains to verify the
trichotomy.
Let x and y be any members of A. We must show that exactly one
of the three disjuncts
x < y, x = y, y < x
2.5. Definition
If A is any class, we define the binary relation ∈_A on A, called the
restriction of ∈ to A, by putting
2.6. Remark
The relation ∈_A can also be characterized by the fact that, for all x
and y,
x ∈_A y ⇔ x ∈ A and y ∈ A and x ∈ y.
2.7. Definition
We say that a class A is ∈-well-ordered if the relation ∈_A is a
well-ordering on A.
2.8. Problem
(i) Let A be a class such that ∈_A is a sharp total order on A; let
B ⊆ A and b ∈ B. Prove that b is least in B w.r.t. ∈_A iff b is
either an individual or a set such that b ∩ B = ∅.
(ii) Hence verify that a class A is ∈-well-ordered iff the following two
conditions are satisfied:
(1) ∈_A is a sharp total order on A.
(2) Every non-empty set u included in A has a member v such
that v is either an individual or a set such that v ∩ u = ∅.
(iii) Prove that in (ii) we may replace (1) by the weaker condition:
(1′) For any members x and y of A, at least one of the following
three disjuncts holds:
x ∈ y or x = y or y ∈ x.
2.9. Theorem
PROOF
2.10. Definition
A class A is transitive if, for all y,
y ∈ A ⇒ y ⊆ A.
2.11. Remarks
(i) Note that every member of a transitive class must be a set rather
than an individual, because by Def. 1.3.4 y ⊆ A holds only if y is
a class. So a class A is transitive iff:
(1) all its members are sets and
(2) ∪A ⊆ A; that is, for all x and y, x ∈ y ∈ A ⇒ x ∈ A.
(ii) Unfortunately, ‘transitivity’ is used with two meanings: the pre-
sent one and that applicable to binary relations (as, for example,
in Def. 2.3.2). In practice no confusion shall arise, as the context
will indicate which meaning is intended.
2.12. Definition
An ordinal is a transitive and ∈-well-ordered set. The class of all
ordinals is denoted by ‘W’.
2.13. Examples
The empty set ∅ is, vacuously, an ordinal. It is also easy to verify that
{∅} and {∅, {∅}} are ordinals.
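The examples can be checked mechanically for small hereditarily finite sets. The sketch below is an illustration under explicit assumptions: sets are modelled as nested frozensets with no individuals, and ∈-well-ordering is tested through conditions (1′) and (2) of Prob. 2.8. The function names are introduced here and are not the book's.

```python
from itertools import combinations


def is_transitive(s):
    """Every member of s is a set (frozenset) that is included in s."""
    return all(isinstance(y, frozenset) and y <= s for y in s)


def nonempty_subsets(s):
    """Yield every non-empty subset of the finite set s."""
    items = list(s)
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_epsilon_well_ordered(s):
    """Conditions (1') and (2) of Prob. 2.8, assuming all members of s are sets."""
    comparable = all(x in y or x == y or y in x for x in s for y in s)
    has_least = all(any(not (v & u) for v in u) for u in nonempty_subsets(s))
    return comparable and has_least


def is_ordinal(s):
    """Def. 2.12 for finite pure sets: transitive and ∈-well-ordered."""
    return is_transitive(s) and is_epsilon_well_ordered(s)


empty = frozenset()
one = frozenset([empty])             # {∅}
two = frozenset([empty, one])        # {∅, {∅}}
print(is_ordinal(empty), is_ordinal(one), is_ordinal(two))
print(is_ordinal(frozenset([one])))  # {{∅}} is not transitive
```

The set {∅, {∅}, {{∅}}} is a useful contrast: it is transitive, but ∅ and {{∅}} are not comparable under ∈, so it is not an ordinal.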
2.14. Convention
We shall use lower-case Greek letters (mainly ‘α’, ‘β’, ‘γ’, ‘λ’, ‘ξ’ and
‘η’) as variables ranging over the ordinals.
2.15. Theorem
All members of an ordinal are ordinals; thus, if α is an ordinal,
α = {ξ : ξ ∈ α}.
PROOF
2.16. Lemma
If γ is any transitive subset of an ordinal α then γ itself is an ordinal;
moreover, γ = α or γ ∈ α.
PROOF
2.17. Theorem
The class W of all ordinals is transitive and ∈-well-ordered.
PROOF
The transitivity of W follows at once from Thm. 2.15. To prove that W
is ∈-well-ordered, we shall make use of Prob. 2.8(iii).
To verify that condition (1′) of Prob. 2.8(iii) holds for W, let α and
β be any ordinals. Since both α and β are transitive, it is easy to see
that α ∩ β is also transitive. Thus by Lemma 2.16 α ∩ β is an ordinal,
say γ; moreover, γ = α or γ ∈ α. Likewise, γ = β or γ ∈ β.
But we cannot have both γ ∈ α and γ ∈ β because then γ ∈ α ∩ β,
that is, γ ∈ γ; and this would violate the anti-symmetry of the
well-ordering relation ∈_γ on γ. Therefore γ = α or γ = β. Hence α = β or
α ∈ β or β ∈ α, which proves condition (1′) for W.
Now let u be any non-empty set of ordinals. We must prove that
there exists an ordinal ξ ∈ u such that ξ ∩ u = ∅. Take any α ∈ u. If
α ∩ u = ∅, we are through.
On the other hand, suppose α ∩ u ≠ ∅. Since α is ∈-well-ordered,
there must exist some member ξ of α ∩ u such that ξ ∩ α ∩ u = ∅.
But ξ ∈ α and α is transitive; so ξ ⊆ α. Hence ξ ∩ u = ξ ∩ α ∩ u = ∅.
∎
2.18. Corollary
W is a proper class (that is, not a set).
PROOF
If W were a set, then by Def. 2.12 and Thm. 2.17 it would be an
ordinal, hence W ∈ W, in violation of the anti-symmetry of the
well-ordering relation ∈_W. ∎
2.19. Remarks
(i) The (naive) assumption that W is a set led to a contradiction.
This was the Burali-Forti Paradox (see § 2 of Ch. 1). Cor. 2.18 is
a ‘tame’ version, within ZF, of the paradox. Similarly, Thm.
1.3.10 is a ‘tame’ ZF version of Russell’s Paradox.
(ii) In the proofs of Thm. 2.17 and Cor. 2.18 we used the argument
that an ordinal γ cannot be a member of itself because this would
violate the anti-symmetry of the well-ordering relation ∈_γ on γ.
In mathematical practice it is often convenient to posit a further
postulate — the Axiom of Foundation (or Regularity), first pro-
posed by Dimitry Mirimanoff in 1917 — one of whose effects is to
exclude any set that belongs to itself. On the other hand, in some
special applications of set theory — notably in so-called situation
semantics, developed by Jon Barwise and others, and in abstract
computation theory — it is convenient to use an extension of ZF
proposed by Peter Aczel, which negates the Axiom of Founda-
tion and admits some sets that belong to themselves. In the
present course we do not commit ourselves either way.
2.20. Corollary
Any class of ordinals is ∈-well-ordered.
PROOF
Immediate from Thm. 2.17 and Prob. 2.8(iv). ∎
2.21. Definition
The ∈-well-ordering on W shall be denoted by ‘<’. Thus for any
ordinals α and β,
α < β ⇔ α ∈ β.
2.22. Remarks
(i) As usual, we denote by ‘≤’ the blunt version of <. Thus
α ≤ β ⇔ α ∈ β or α = β.
(ii) Thm. 2.15 can now be read as saying that if α is any ordinal then
α = {ξ : ξ < α}.
(iii) From now on, whenever we use order-related terminology in
connection with ordinals, we shall take it for granted that the
order relation referred to is the ∈-well-ordering, unless otherwise
stated.
2.23. Definition
Let < be a partial order on a class A and let B ⊆ A.
2.24. Remarks
(i) The phrase ‘with respect to <’ is omitted when there is no danger
of confusion.
(ii) A subclass B of A need not in general have any upper bound, let
alone a lub; but if it has a lub, it is unique.
2.25. Theorem
If A is a set of ordinals then its union-set ∪A is an ordinal. Moreover,
∪A is the lub of A.
PROOF
2.26. Definition
For any ordinal α we put α′ =df α ∪ {α}. We call α′ the immediate
successor of α. (This terminology is justified by the following
theorem.)
2.27. Theorem
For any α, α′ is an ordinal. Moreover, for any β, β ≤ α iff β < α′
(equivalently: α < β iff α′ ≤ β). Hence α < β iff α′ < β′.
PROOF
2.28. Definition
(i) An ordinal of the form α′ is called a successor ordinal.
(ii) An ordinal that is neither ∅ nor a successor ordinal is called a
limit ordinal.
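The successor operation and the union of a set of ordinals can be experimented with on the ordinals obtained from ∅ by finitely many successor steps. The sketch below models them as nested frozensets; `successor`, `ordinal_from` and `union` are names introduced here for illustration.

```python
def successor(alpha):
    """α′ = α ∪ {α}."""
    return alpha | frozenset([alpha])


def ordinal_from(n):
    """The ordinal reached from ∅ by n successor steps."""
    alpha = frozenset()
    for _ in range(n):
        alpha = successor(alpha)
    return alpha


def union(A):
    """The union-set ∪A of a set A of sets."""
    result = frozenset()
    for member in A:
        result |= member
    return result


ordinals = [ordinal_from(n) for n in range(6)]

# β ≤ α iff β < α′, where < is ∈ and ≤ is '∈ or ='.
assert all((b in a or b == a) == (b in successor(a))
           for a in ordinals for b in ordinals)

# For a finite set of these ordinals the union is its greatest member.
assert union({ordinals[2], ordinals[5], ordinals[3]}) == ordinals[5]
```

All ordinals produced here are ∅ or successors, so no limit ordinal can appear in such a finite experiment.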
3.2. Theorem
ω is transitive.
PROOF
3.3. Theorem
(i) ∅ is a finite ordinal.
(ii) If α is a finite ordinal then so is α′.
PROOF
3.4. Theorem
ω is a set.
PROOF
Using the Axiom of Infinity (Ax. 1.3.21), take a set Z such that ∅ ∈ Z
and such that whenever x ∈ Z, then also x ∪ {x} ∈ Z. Thus if an
ordinal α belongs to Z then (by Def. 2.26) so does α′.
Consider the class ω − Z, the class of all finite ordinals not belonging
to Z. If this class is non-empty, then by Thm. 2.9 it must have a
least member, say β. Now, β cannot be ∅, because ∅ does belong to
Z. Also, β, being a finite ordinal, cannot be a limit ordinal. So it must
be a successor ordinal, say β = α′ = α ∪ {α}. But in this case α itself
is a finite ordinal (by Thm. 3.2), such that α < β. Since β was
supposed to be the least finite ordinal not belonging to Z, it follows
that α ∈ Z. Therefore by the assumption on Z also α′ ∈ Z. But this is
impossible, because α′ = β, which is the least finite ordinal not belonging
to Z.
So ω − Z must be empty. Thus ω ⊆ Z; hence ω is a set by AS.
3.5. Corollary
ω is the unique set X having the following three properties:
(i) ∅ ∈ X;
(ii) whenever α ∈ X then also α′ ∈ X;
(iii) X ⊆ Z for any set Z such that ∅ ∈ Z and such that whenever
α ∈ Z then also α′ ∈ Z.
PROOF
Thm. 3.3 says that ω has properties (i) and (ii). The proof of Thm. 3.4
shows that ω has also property (iii). The uniqueness of ω follows by
PX, because if X is any set having the three properties then both
ω ⊆ X and X ⊆ ω. ∎
3.6. Remarks
(i) Our first use of AI was to prove that ω is a set. Conversely, if we
postulate that ω is a set, then by Thm. 3.3 ω is a set satisfying the
conditions that AI lays down for Z. This shows that (in the
3.8. Remarks
(i) We see that the set ω of finite ordinals, with its ∈-well-ordering,
simulates, within the confines of ZF set theory, the behaviour
that characterizes the system of natural numbers. We can take ∅
as the counterpart of the number 0 and the ∈-well-ordering on ω
as the counterpart of the usual ordering of the natural numbers.
Just as each natural number n has an immediate successor, n + 1,
so every finite ordinal α has an immediate successor, α′.
Moreover, the basic facts about the ordering of the natural
numbers (Facts 0.1.1-0.1.5) are mimicked by theorems about the
finite ordinals and their ∈-well-ordering. And, most importantly,
the Principle of Mathematical Induction is mimicked by the
Principle of Induction on Finite Ordinals. Certainly, within ZF ω
impersonates, plays the role of, ‘the set of natural numbers’. In
fact, Cor. 3.5 reproduces within ZF Richard Dedekind’s famous
characterization of the natural numbers.¹
(ii) The obvious reductionist step at this point is to identify the
ZF-set ω of finite ordinals as the ‘true’ (hitherto intuitive) set N
of natural numbers. This would be a grand reduction indeed,
because work done during the 19th century by several mathematicians
(including Hamilton, Bolzano, Weierstrass, Dedekind and
Cantor) showed that all the concepts of mathematical analysis
could be reduced to those of natural number, set and membership
(plus concepts such as relation and function that we have by
now reduced to set-theoretic concepts). Thus a huge part, if not
the whole, of mathematics would be reduced to set theory.
Many (perhaps most) mathematicians, under the influence of
the dominant structuralist ideology, do proceed in this way, and
frame (or think of) their mathematical discourse as taking place
within set theory.
¹ Was sind und was sollen die Zahlen?, 1888. (English translation in Essays on the theory
of numbers edited by W. W. Beman, 1901.)
3.9. Warning
This reduction, although extremely successful in a formal sense, is by
no means unproblematic, as Skolem pointed out in 1922, when he
published his famous paradox. (We shall discuss Skolem’s Paradox in
the Appendix.)
3.10. Theorem
ω is the least infinite ordinal and the least limit ordinal.
PROOF
That ω is an ordinal follows at once from Cor. 2.20 and Thms. 3.2 and
3.4. Also, ω cannot be a finite ordinal, because that would mean that
ω ∈ ω, which is impossible for an ordinal. Thus ω must be an infinite
ordinal. On the other hand, if ξ < ω (that is, ξ ∈ ω) then by Def. 3.1
ξ is a finite ordinal; hence ω must be the least infinite ordinal.
If ξ ∈ ω then, as we have just seen, ξ is a finite ordinal, hence, a
fortiori, not a limit ordinal. If ω itself were not a limit ordinal then by
Def. 3.1 it would follow that ω is a finite ordinal, contrary to what we
have proved. Thus ω must be a limit ordinal. As we have just
observed, no ordinal smaller than ω can be a limit ordinal. Hence ω is
the least limit ordinal. ∎
3.11. Preview
We have yet to justify the adjectives finite and infinite introduced in
Def. 3.1 in connection with ordinals. Dedekind defined a set as infinite
if there exists an injection from it to a proper subset of itself, and as
finite if there is no such injection. We will not adopt Dedekind’s
definition, but we shall show that finite and infinite ordinals in the
sense of Def. 3.1 are finite and infinite respectively in Dedekind’s
sense.
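Dedekind's criterion can be tested by brute force on small sets. The sketch below is an experiment rather than a proof: it uses Python's `range(n)` as a stand-in for an n-element set, and the names `injections` and `has_injection_into_proper_subset` are hypothetical.

```python
from itertools import permutations


def injections(domain, codomain):
    """Yield every injection from domain to codomain, each as a dict."""
    domain = list(domain)
    for image in permutations(list(codomain), len(domain)):
        yield dict(zip(domain, image))


def has_injection_into_proper_subset(A):
    """Search for an injection from the finite set A to a proper subset of A."""
    A = set(A)
    for a in A:
        # Every proper subset of A is included in A - {a} for some a in A.
        if any(True for _ in injections(A, A - {a})):
            return True
    return False


# None of the small sets tried is infinite in Dedekind's sense.
print(any(has_injection_into_proper_subset(range(n)) for n in range(6)))
```

By contrast, on the natural numbers the map sending n to n + 1 is an injection whose values omit 0, which is why no finite search of this kind can represent the infinite case.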
3.12. Theorem
There does not exist an injection from a finite ordinal to a proper subset
of itself.
PROOF
3.13. Theorem
PROOF
3.14. Theorem
PROOF
3.15. Definition
A set is finite if it is equipollent to a finite ordinal (in the sense of Def.
3.1). Otherwise, it is infinite.
3.16. Remarks
(i) By virtue of Thm. 3.14, an ordinal is finite (or infinite) in the
sense of Def. 3.1 iff it is finite (or infinite, respectively) in the
sense of Def. 3.15; so there is no conflict between the two
definitions.
(ii) By Thm. 3.14, a finite set is equipollent to a unique finite ordinal.
3.17. Problem
(i) Prove that there does not exist an injection from a finite set to a
proper subset of itself. (Use Thm. 3.12.)
(ii) Prove that if A is a non-empty finite set of ordinals, then A has a
greatest member, that is, an ordinal α ∈ A such that ξ ≤ α for
each ξ ∈ A. (Otherwise, define a map f on A by taking, for each
α ∈ A, fα as the least ξ ∈ A such that α < ξ. Show that f would
be an injection from A to a proper subset of itself.)
3.18. Problem
Let n be a natural number. Show that for any objects a_1, a_2, ..., a_n,
the set {a_1, a_2, ..., a_n} is finite. (Use weak mathematical induction
on the number n.)
§ 4. Transfinite induction
Various forms of the Principle of Mathematical Induction have an-
alogues that apply to ordinals. These analogues collectively are known
as the Principle of Transfinite Induction. First, by virtue of the fact
that W is well-ordered, we have immediately by Thm. 2.9:
(*) η ∈ X for every η < ξ ⇒ ξ ∈ X,
then X = W.
PROOF
4.3. Remark
then X = W.
PROOF
4.5. Remarks
(i) These principles have restricted forms, in which X is assumed to
be a subset of some (arbitrary) given ordinal α rather than a
subclass of W. Thus, the form of Thm. 4.1 restricted to an
arbitrary ordinal α says that a non-empty subset of α has a least
member. The restricted form of Thm. 4.2 says that if X is a
subset of α such that for all ξ < α we have ξ ⊆ X ⇒ ξ ∈ X, then
X = α.
(ii) The Principle of Transfinite Induction restricted to the particular
ordinal ω is precisely the Principle of Induction on Finite
Ordinals.
4.6. Problem
Prove the restricted form of Thm. 4.2. Formulate and prove a form of
Thm. 4.4 restricted to an arbitrary ordinal.
5.2. Definition
A partially ordered set (briefly, poset) is a pair (A, <), where A is a
set and < is a [sharp] partial order on A. A totally ordered set is a
poset (A, <), in which < is a total order on A. A well-ordered set is a
poset (A, <), in which < is a well-ordering on A.
5.3. Remarks
(i) This is just a convenient way of packaging a set A together with a
particular partial order on A into a single object. It saves us
having to keep saying ‘such-and-such a set with such-and-such a
partial order on it’.
(ii) However, we shall often refer, somewhat inaccurately, to A itself
as the poset (or ordered set, or well-ordered set) when, strictly
speaking, we have in mind the pair (A,<). We shall only
commit this peccadillo when it is clear from the context which
relation < is involved. Thus, we refer to an ordinal α as a
well-ordered set, when strictly speaking we mean the pair
(α, <), where < is ∈_α, the ∈-well-ordering on α.
5.4. Definition
A similarity map (a.k.a. isomorphism) from a poset (A, <) to a poset
(A', <') is a bijection f from A to A’ such that, for all x and y in A,
x < y ⇔ fx <′ fy.
5.5. Remark
It is easy to see that the identity map id_A is a similarity map from
(A, <) to itself. Also if f is a similarity map from (A, <) to (A′, <′)
then its inverse f⁻¹ is a similarity map from (A′, <′) to (A, <).
Finally, if f is a similarity map from (A, <) to (A′, <′) and g is a
similarity map from (A′, <′) to (A″, <″) then the composition g ∘ f
is a similarity map from (A, <) to (A″, <″).
It follows that similarity is an equivalence relation on the class of
posets.
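For finite posets the three closure properties just listed can be checked directly. The sketch below is a hedged illustration: a map is a dict, an order is a two-argument predicate, and the name `is_similarity` together with the sample posets is made up for this example.

```python
def is_similarity(f, A, less, A2, less2):
    """Is the dict f a bijection from A to A2 with x < y iff f(x) <' f(y)?"""
    bijective = (set(f) == set(A)
                 and set(f.values()) == set(A2)
                 and len(set(f.values())) == len(f))
    return bijective and all(less(x, y) == less2(f[x], f[y])
                             for x in A for y in A)


def lt(x, y):
    return x < y


A, A2, A3 = {1, 2, 3}, {"a", "b", "c"}, {10, 20, 30}
f = {1: "a", 2: "b", 3: "c"}          # from (A, <) to (A2, <)
g = {"a": 10, "b": 20, "c": 30}       # from (A2, <) to (A3, <)

identity = {x: x for x in A}
inverse = {value: key for key, value in f.items()}
composition = {x: g[f[x]] for x in A}  # g ∘ f

print(is_similarity(identity, A, lt, A, lt),
      is_similarity(inverse, A2, lt, A, lt),
      is_similarity(composition, A, lt, A3, lt))
```

An order-reversing bijection such as {1: "c", 2: "b", 3: "a"} fails the test, which shows that being a bijection alone is not enough.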
5.6. Theorem
If f is a similarity map from an ordinal α to an ordinal β then f is the
identity map id_α, hence α = β.
PROOF
We first show, by transfinite induction, that ξ ≤ fξ for every ξ ∈ α.
Let ξ ∈ α. By the induction hypothesis, if η < ξ then η ≤ fη. But if
η < ξ then also fη < fξ, since f is a similarity map. Thus for every
η < ξ we have η < fξ. In particular, η ≠ fξ for every η < ξ; in other
words, fξ < ξ is impossible. This proves that ξ ≤ fξ and completes the
induction.
Now, f⁻¹ is a similarity map from β to α; therefore by the same
token we have also ζ ≤ f⁻¹ζ for all ζ ∈ β. Taking ζ to be fξ, where
ξ ∈ α, we obtain fξ ≤ f⁻¹fξ = ξ. Thus fξ ≤ ξ as well as ξ ≤ fξ, which
shows that f must be the identity id_α. ∎
5.7. Corollary
For any poset (A, <), there exists at most one similarity map from
(A, <) to an ordinal.
PROOF
5.8. Preliminaries
(i) For the rest of this section, we consider a fixed but otherwise
arbitrary well-ordered set (A, <).
(ii) If B ⊆ A, then B is clearly well-ordered by the relation <_B,
that is:
Ansan e AN aa}.
5.9. Lemma
Let Fa = α. Then for any ordinal β < α there exists some b < a such
that Fb = β. Conversely, if b < a then b belongs to dom F and Fb is
some ordinal β < α.
PROOF
Let f be the similarity map from A_a to α. Suppose β < α. This means
that β ∈ α. Therefore fb = β for some b ∈ A_a, that is, b < a. Note
that by the transitivity of α we have β ⊆ α. It is easy to verify that
f↾A_b, the restriction of f to A_b, is a similarity map from A_b to β.
Hence Fb = β.
Conversely, suppose that b < a. This means that b ∈ A_a. Therefore
fb = β for some β ∈ α, that is, β < α. As before, it follows that
Fb = β. ∎
5.10. Lemma
F is injective.
PROOF
5.11. Lemma
The set ran F is an ordinal.
PROOF
PROOF
5.13. Definition
A set is denumerable if it is equipollent to ω. A set is countable if it is
finite or denumerable.
5.14. Problem
(i) Let D be a subset of an ordinal α. By Cor. 2.20, D is ∈-well-ordered;
and by Thm. 5.12, D is similar to an ordinal β. Prove
that β ≤ α. (Let f be a similarity map from β to D. Show that
ξ ≤ fξ for every ξ ∈ β.)
(ii) Prove that a set is countable iff it is equipollent to a subset of ω.
(Use (i) to show that every subset of ω is countable.)
6.2. Convention
Throughout this section we let C be a fixed but arbitrary function such
that dom C is the class of all sets.
6.3. Definition
We shall write ‘Rc(F, α)’ as short for the statement:
6.4. Remarks
(i) Recall that α′ = {ξ : ξ ≤ α}.
(ii) Note that F↾ξ = {(η, Fη) : η ∈ ξ}. Therefore the recursion equation
determines Fξ in terms of the ‘previous behaviour’ of F, namely
the restriction of F to the set of all ordinals η < ξ. Note also that
even if F is a proper class, F↾ξ is always a set by AR and Thm.
2a
(iii) Rc(F, α) means that F is defined and satisfies the recursion
equation for all ordinals up to α inclusive. Hence
6.5. Lemma
If both Rc(F, α) and Rc(G, α) then Fξ = Gξ for all ξ ≤ α.
PROOF
6.6. Lemma
For any ordinal α there exists a unique function f_α such that
dom f_α = α′ = {ξ : ξ ≤ α} and such that Rc(f_α, α).
PROOF
PROOF
To define F, note that the f_α of Lemma 6.6 satisfy the recursion
equation wherever they are defined, and any two of them agree with
each other wherever both are defined. Therefore all we have to do is
glue them together:
F =df ∪{f_α : α ∈ W}.
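The gluing step can be imitated for finite ordinals, with natural numbers standing in for the ordinals. The sketch below assumes the recursion equation has the form Fξ = C(F↾ξ), which is how Rem. 6.4(ii) describes it. The particular C and the names `f_alpha` and `glue` are hypothetical.

```python
def f_alpha(C, alpha):
    """The function on {0, ..., alpha} with f(xi) = C(f restricted to xi)."""
    f = {}
    for xi in range(alpha + 1):
        f[xi] = C(dict(f))   # dict(f) is the restriction of f to {eta : eta < xi}
    return f


def glue(functions):
    """Union of functions given as dicts; fails if two of them disagree."""
    F = {}
    for f in functions:
        for key, value in f.items():
            if key in F and F[key] != value:
                raise ValueError("the functions disagree at %r" % (key,))
            F[key] = value
    return F


def C(restriction):
    # Hypothetical choice of C: one more than the sum of all previous values.
    return 1 + sum(restriction.values())


F = glue(f_alpha(C, alpha) for alpha in range(6))
print(F)
```

That `glue` never raises here reflects the uniqueness property: each f_α is an initial part of the next one, so their union is again a function.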
6.8. Remarks
(i) Note the phrasing of Thm. 6.7: it does not claim that such-and-
such an F exists but that we can define it. To say, in set theory,
1.3. Remarks
(i) AC was the first postulate of set theory (apart from PX) to be
stated as such. Its first known explicit formulation is due to
Giuseppe Peano (1890), who however rejected it as untenable. It
was first proposed as a new valid mathematical principle by
Beppo Levi in 1902, although it had been used inadvertently by
Cantor and others long before that. Zermelo, who was told about
AC by Erhard Schmidt, used it almost at once in his first (1904)
proof of the Well-Ordering Theorem (WOT, Cor. 1.6 below), a
result that had been conjectured by Cantor. Our formulation of
AC is essentially that used by Zermelo in his 1904 paper.
(ii) In his 1908 paper on the foundations of set theory, in which the
theory is given its first fully fledged axiomatic presentation,
Zermelo does not state AC in this form but in a more restricted
version. He assumes that 𝒮 is a set of non-empty sets that are
pairwise disjoint, that is, X ∩ Y = ∅ for any two distinct members
of 𝒮 (see Def. 3.4.1). He then postulates the existence of a
set A such that, for any X ∈ 𝒮, the intersection A ∩ X has
exactly one member.
𝒯 = {{X} × X : X ∈ 𝒮}.
It is easy to verify that 𝒯 is a set of non-empty and pairwise
disjoint sets. According to the restricted version, there exists a set
A whose intersection with each member of 𝒯 is a singleton. We
now define a function g on 𝒮 as follows. For any X ∈ 𝒮, the set
{X} × X belongs to 𝒯 and hence its intersection with A has
exactly one member. This member must be of the form (X, x_0),
where x_0 is some member of X. We put gX = x_0. Then g is a
choice function on 𝒮.
(iii) Using AC, Def. 3.4.11 is easily legitimized. If |A_x| = |B_x| for
each x ∈ X, then by AC there exists a family f = {f_x | x ∈ X}
such that, for each x, f_x is a bijection from {x} × A_x to {x} × B_x.
Then it is easy to see that ∪ran f is a bijection from ∪{{x} × A_x :
x ∈ X} to ∪{{x} × B_x : x ∈ X}. A similar argument applies to
Def. 3.5.11.
(iv) AC has been regarded with suspicion because it is a purely
existential postulate. It asserts the existence of a set — a choice
function — without characterizing it as the extension of some
previously specified property. In other words, AC is not a special
case of the Principle of Comprehension. In this respect AC is
markedly different from all other existential postulates of set
theory. For example, the Power-set Axiom asserts that, for each
set A, there exists the power-set PA, which is characterized as
the extension of the property being a subset of A.
(v) In 1938 Gödel proved that AC is consistent relative to the other,
commonly accepted, postulates of set theory, in the sense that if
they are consistent, then the addition of AC does not result in
inconsistency. In 1963 P. J. Cohen proved that the same holds
also for the negation of AC.
(vi) AC has some weird (counter-intuitive) consequences. However,
its negation has even weirder ones: for example, the direct
product of a family of non-empty sets may well be empty. Note
1.4. Preview
Starting from AC, we shall prove a chain of other major principles, all
of which turn out to be equivalent to each other and to AC. The first
of these principles, which is also the most important, is a corollary of
the following theorem.
1.5. Theorem
Every set is equipollent to an ordinal.
PROOF
Let A be a set, and let 𝒮 be the set PA − {∅} of all non-empty subsets
of A. By AC there exists a choice function g on 𝒮. Since A is a set, it
cannot be the universal class (Thm. 1.3.10); so there exists an object b
that does not belong to A.
We now define a function C whose domain is the class of all sets, as
follows: for any set x we put
(*) Cx = g(A − ran x) if x is a map such that ran x ⊂ A,
    Cx = b otherwise.
Let ξ be any ordinal such that Fξ ≠ b. This means that F↾ξ must be a
map from ξ to A, and
Fξ = g(A − ran(F↾ξ)) ∈ A − ran(F↾ξ).
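For a finite set A the recursion behind this proof terminates after finitely many steps and can be run directly. The sketch below is an illustration only: it takes the choice function g as a Python callable on non-empty sets (for instance `min`), uses a fresh Python object for b, and the name `enumerate_with_choice` is made up.

```python
def enumerate_with_choice(A, g):
    """Follow F(xi) = g(A - ran(F restricted to xi)) until A is used up."""
    A = frozenset(A)
    b = object()                       # an object that does not belong to A
    F = []                             # F[xi] is the value at the finite ordinal xi
    while True:
        remaining = A - frozenset(F)   # A − ran(F↾ξ), where ξ = len(F)
        value = g(remaining) if remaining else b
        if value is b:
            return F
        F.append(value)


# Two different choice functions give two different enumerations of A.
print(enumerate_with_choice({3, 1, 2}, min))
print(enumerate_with_choice({3, 1, 2}, max))
```

Each value is chosen from what has not yet been used, so the list never repeats a member; for infinite A the same idea needs transfinite recursion instead of a loop.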
PROOF
1.7. Remarks
1.8. Corollary
For any sets A and B, |A| ≤ |B| or |B| ≤ |A|.
PROOF
2.1. Lemma
PROOF
2.2. Lemma
If f is a map such that dom f is finite then ran f is finite as well.
PROOF
By Def. 4.3.15, dom f is equipollent to a finite ordinal α. Without loss
of generality we may therefore assume that dom f is α itself. (Otherwise,
replace f by f∘h, where h is a bijection from α to dom f.)
Define a map g from ran f to α by putting, for each x ∈ ran f,
2.3. Definition
Let < be a partial order on a class A. A member a of A is said to be
maximal in A with respect to < if there is no x ∈ A such that a < x.
2.4. Remarks
(i) When there is no risk of confusion, we shall omit the phrase ‘in A
with respect to <’.
(ii) In general, A may not have a maximal member; or it may have
more than one.
(iii) Do not confuse maximal with greatest. However, if < is a total
order on A and a is maximal in A then a is also the greatest
member of A, in the sense that x < a for any other x ∈ A. In this
case it is clear that A cannot have more than one maximal member.
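Remarks (ii) and (iii) can be tried out on a small finite poset. The following is only a sketch: the choice of proper divisibility on {2, ..., 12} and the helper names `maximal_members` and `greatest_member` are illustrative assumptions, not part of the text.

```python
def maximal_members(A, less):
    """Members a of A for which there is no x in A with a < x (Def. 2.3)."""
    return [a for a in A if not any(less(a, x) for x in A)]

def greatest_member(A, less):
    """The member a with x < a for every other x in A, or None if there is none."""
    for a in A:
        if all(less(x, a) for x in A if x != a):
            return a
    return None

def properly_divides(x, y):
    """A sharp partial order (illustrative): x divides y and x differs from y."""
    return x != y and y % x == 0

A = list(range(2, 13))

# Partial order: several maximal members, no greatest member.
maxima = maximal_members(A, properly_divides)
top = greatest_member(A, properly_divides)

# Total order on the same set: the unique maximal member is also the greatest.
total_maxima = maximal_members(A, lambda x, y: x < y)
total_top = greatest_member(A, lambda x, y: x < y)
```

Under divisibility the list `maxima` has more than one entry and `top` is `None`; under the usual total order the single maximal member coincides with the greatest one.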
2.5. Definition
If 𝒜 is any class of sets, we put
$$\subset_{\mathcal{A}} \;=_{\mathrm{df}}\; \{(X, Y) : X \in \mathcal{A},\ Y \in \mathcal{A},\ X \subset Y\}.$$
2.6. Remarks
(i) We can also characterize the relation ⊂_𝒜 by saying that, for any
X and Y,
$$X \subset_{\mathcal{A}} Y \iff X \in \mathcal{A} \text{ and } Y \in \mathcal{A} \text{ and } X \subset Y.$$
2.7. Definition
A class 𝒜 of sets is of finite character if, for any set X,
PROOF
Take any A ∈ 𝒜; we shall hold A fixed for the rest of the proof.
Without loss of generality, we may assume that A = G∅; otherwise,
we could compose G with the bijection from 𝒜 to itself that interchanges
A with G∅ and leaves all other members of 𝒜 alone.
Using transfinite recursion restricted to α (see Rem. 4.6.8(ii)), we
define a map F on α such that, for every ξ < α,
$$\bigcup\{F\eta : \eta < \xi\} \subset G\xi.$$
But in this case the definition of F says that Fξ = Gξ. It would then
follow that ⋃{Fη : η < α} ⊂ Fξ, which is impossible. ∎
2.9. Definition
Let (A, <) be a poset. A chain in (A, <) is any subset C of A such
that,
for all x and y in C, x < y or x = y or y < x.
2.10. Remark
In other words, a chain in (A, <) is a subset of A that is totally
ordered by the restriction of < to it.
PROOF
The condition for C being a chain in (A, <) (see Def. 2.9) involves
only two members of C at a time. Hence it is easy to see that the set 𝒞
of all chains is of finite character. Therefore the TT Lemma applies to
𝒞. ∎
The most famous and frequently used of all the maximality principles
that are equivalent to AC is generally known as ‘Zorn’s Lemma’
although it is arguably due to Kuratowski, who published a version of
it in 1922, thirteen years before Zorn. We shall now deduce it from the
Hausdorff Maximality Principle (HMP). (For the meaning of upper
bound, see Def. 4.2.23.)
PROOF
As before, let 𝒞 be the set of all chains in (A, <), and consider the
poset consisting of 𝒞 with the partial order ⊂_𝒞 on it.
The singleton {a} is, trivially, a chain in (A, <). Hence by the
HMP {a} is included in a chain C that is maximal in 𝒞 w.r.t. ⊂_𝒞. By
hypothesis, C has an upper bound u in A. Since a ∈ C, it follows that
a ≤ u.
It remains to show that u is maximal in A. Suppose it were not
maximal. Then there would exist some v such that u < v. Since u is an
upper bound for C, it would follow that x < v for all x ∈ C. But then
C ∪ {v} would be a chain that properly includes C, contradicting the
maximality of C in 𝒞. ∎
2.13. Theorem
AC follows from Zorn’s Lemma.
PROOF
Let 𝒮 be a set of non-empty sets. We must show that there exists a
choice function on 𝒮.
If 𝒮 is empty then ∅ is the required choice function. So from now on
we may assume that 𝒮 is non-empty.
Let us say that f is a partial choice function (pcf), if f is a choice
function on a subset of 𝒮. Such creatures do exist: for example, if A is
any member of 𝒮 and a is any member of A then {(A, a)} is a choice
function on {A} and hence a pcf. Let ℱ be the set of all pcfs. (It is
easy to verify that ℱ is indeed a set; DIY.) As we have just seen, ℱ is
non-empty.
We now consider the poset (ℱ, ⊂_ℱ). Note that if f and g are pcfs,
then f ⊆ g means that dom f ⊆ dom g and fX = gX for each X ∈
dom f.
We shall show that (ℱ, ⊂_ℱ) satisfies the condition of Zorn’s
Lemma. To this end, let us consider any chain 𝒞 in this poset. We
claim that its union, ⋃𝒞, is an upper bound for 𝒞 in ℱ.
For any f ∈ 𝒞 we obviously have f ⊆ ⋃𝒞. So it only remains to show
that ⋃𝒞 belongs to ℱ; in other words, that ⋃𝒞 is a pcf.
Since every member of 𝒞, being a pcf, is a set of ordered pairs
(X, x) such that x ∈ X ∈ 𝒮, it is clear that ⋃𝒞 likewise is a set of
ordered pairs of this kind. It only remains to show that ⋃𝒞 is a
function.
Now, if both f and g are members of 𝒞 then, since 𝒞 is a chain,
we must have f ⊆ g or g ⊆ f. Therefore if X ∈ dom f ∩ dom g then
fX = gX. Thus the coherence condition is fulfilled, showing that ⋃𝒞 is
indeed a function (see Prob. 2.4.8).
We can now apply Zorn’s Lemma to the poset (ℱ, ⊂_ℱ). Since ℱ is
non-empty, it follows from the Lemma that there exists some g ∈ ℱ
that is maximal w.r.t. ⊂_ℱ. Such g is a pcf, that is, a choice function on a
subset of 𝒮. However, if dom g were not the whole of 𝒮, we could take
any A ∈ 𝒮 − dom g and any a ∈ A, and put
f = g ∪ {(A, a)}.
Then f would be a pcf such that g ⊂ f, contradicting the maximality of
g. Therefore g must be a choice function on the whole of 𝒮. ∎
§2. From WOT to AC 87
2.14. Remarks
AC ⇒ WOT ⇒ TT Lemma ⇒ HMP ⇒ Zorn’s Lemma ⇒ AC,
1.2. Definition
For any finite set A, the cardinality |A| of A is the (necessarily unique
and finite) ordinal α such that A ~ α. A finite cardinal is an ordinal α
such that |A| = α for some finite set A.
1.3. Remarks
(i) Clearly, if A and B are finite sets then |A| =|B| iff A ~ B, as
required by the incomplete Def. 3.1.3.
(ii) By Def. 1.2, a finite cardinal is a finite ordinal. Conversely, if a is
a finite ordinal, then obviously |a| = a. Thus the finite cardinals
are just the finite ordinals by another name.
(iii) Let n be any natural number. By Def. 3.3.1 and Prob. 4.3.18, the
corresponding cardinal, n, is finite. This result also follows from
the next theorem, in which we calculate these cardinals.
§1. Finite cardinals 89
1.4. Theorem
PROOF
$$\alpha + 1 = |A \cup B|,$$
where A and B are any disjoint sets such that |A| = α and
|B| = 1.
As A we take α itself. As B we may then take any set
equipollent to 1 (that is, any singleton) provided it is disjoint
from α. We put B = {α}, which is disjoint from α because an
ordinal cannot belong to itself (see Rem. 4.2.19(ii)). Hence
$$\alpha + 1 = |\alpha \cup \{\alpha\}|.$$
But by Def. 4.2.26 this is |α′|. Moreover, by Thm. 4.3.3(ii), since
α is a finite ordinal so is α′. Hence α + 1 = α′, which (as we
have just noted) is a finite ordinal.
(iii) We proceed by weak mathematical induction on m. For m = 0, n
is 1 and the required result, 1 = {0}, follows at once from (i).
Now assume, as induction hypothesis, that m is a number for
which (iii) holds. Let p = (m+1)+1=n+1. Then
1.5. Theorem
For any finite cardinals α and β, α + β is a finite cardinal. Moreover,
α + 0 = α and α + β′ = (α + β)′.
90 6. Finite cardinals and alephs
PROOF
By Prob. 3.4.7(iii), the equality α + 0 = α holds for all cardinals α,
not just for finite ones.
To prove that α + β is a finite cardinal, we apply to β induction on
finite ordinals.
For β = ∅, the sum α + β is α + 0 by Thm. 1.4(i), and we have just
seen that this is the finite cardinal α.
Now assume, as induction hypothesis, that β is a finite cardinal such
that α + β is also a finite cardinal. Then
1.6. Theorem
For any finite cardinals α and β, α·β is a finite cardinal. Moreover,
α·0 = 0 and α·β′ = α·β + α.
PROOF
1.7. Problem
Prove that if < is a [sharp] total order on a finite set A, then < is a
well-ordering on A. (Apply induction on finite ordinals to |A|. For any
non-empty subset B of A you must show that B has a least member. If
B ⊂ A, use Lemma 5.2.1. If B is A itself, let a be any member of A
and apply the induction hypothesis to A — {a}.)
1.8. Remark
¹ A translation of his paper, ‘The principles of arithmetic, presented by a new method’,
is in van Heijenoort, From Frege to Gödel.
satisfied by any system whatever) he proposed five postulates which we
now state, with some inessential modifications.
(6) m + 0 = m.
(7) m + s(n) = s(m + n).
(8) m·0 = 0.
(9) m·s(n) = m·n + m.
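Postulates (6) to (9) can be read as recursion equations that compute sums and products from the successor operation alone. The sketch below makes that reading concrete; it uses Python integers as stand-ins for the numbers, and the function names `s`, `add` and `mul` are illustrative assumptions.

```python
def s(n):
    """Successor operation (Python's built-in integers stand in for the numbers)."""
    return n + 1

def add(m, n):
    """Addition by recursion on the second argument, following (6) and (7)."""
    if n == 0:
        return m                      # (6) m + 0 = m
    return s(add(m, n - 1))           # (7) m + s(n) = s(m + n)

def mul(m, n):
    """Multiplication by recursion on the second argument, following (8) and (9)."""
    if n == 0:
        return 0                      # (8) m . 0 = 0
    return add(mul(m, n - 1), m)      # (9) m . s(n) = m . n + m
```

For instance, `add(3, 4)` unwinds to four applications of `s` to 3, and `mul(3, 4)` to four additions of 3 starting from 0.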
1.9. Warning
All this does not quite answer the question whether the ZF system of
finite cardinals is a faithful and correct representation of the (informal)
system of natural numbers, which mathematicians had studied long
before the invention of set theory.
Note that for any natural number n, we can prove that the corresponding
cardinal n is a finite cardinal (Thm. 1.4, or Def. 3.3.1 and
Prob. 4.3.18). But we have not proved that
(*) Every finite cardinal has the form n for some natural number n.
But in order to be able to do so, we must first prove that such a set
exists as an object of set theory. This, in turn, requires the property
being a natural number, in terms of which this would-be set is defined,
to be a set-theoretic concept (see discussion at the end of § 2 and
beginning of § 3 of Ch. 1). But we have taken the notion of natural
number as given in advance, prior to the development of set theory (cf.
Rem. 3.3.6); and without begging the question we cannot presuppose
that it is also a set-theoretic notion.
We have no assurance that the ZF system of finite cardinals is a
faithful and correct representation of the pre-ZF informal system of
natural numbers, so long as the status of (*) is in question. We shall
see in the Appendix that this question has a rather surprising answer.
2.1. Definition
For any set A, the cardinality |A| of A is the least ordinal α such that
A ~ α. A cardinal is an ordinal α such that |A| = α for some set A.
2.2. Remarks
(i) This definition obviously agrees with Def. 1.2 when A is a finite
set.
(ii) Def. 2.1 clearly satisfies the condition imposed in Def. 3.1.3: for
any sets A and B, |A| =|B| iffA ~ B.
(iii) From Def. 2.1 it follows at once that a cardinal is an ordinal that
is not equipollent to any smaller ordinal. Conversely, if an
ordinal α is not equipollent to any smaller ordinal, then clearly
|α| = α, so that α is a cardinal.
(iv) If λ and μ are cardinals, then the statement ‘λ ≤ μ’ is apparently
ambiguous, because we can interpret ‘≤’ according to Def. 4.2.21
(that is, as denoting the order on the class of ordinals) or
according to Def. 3.2.1. In the next lemma we shall prove that
these two interpretations are in fact equivalent. In the formulation
and proof of this lemma we shall use the symbol ‘≤’ in the
sense of Def. 3.2.1 only, so as not to prejudge the issue. Thereafter,
we shall revert to using ‘≤’ in either sense, as it will make
no difference.
2.3. Lemma
For any cardinals λ and μ, λ ≤ μ (in the sense of Def. 3.2.1) iff λ ∈ μ or
λ = μ.
PROOF
Suppose λ ∈ μ or λ = μ. Since ordinals are transitive sets, it follows that
λ ⊆ μ. Hence by Thm. 3.2.3 |λ| ≤ |μ|. But λ and μ are cardinals, so
|λ| = λ and |μ| = μ. Thus λ ≤ μ.
Conversely, suppose that λ ∉ μ and λ ≠ μ. Then, since the class of
ordinals is ∈-well-ordered, we must have μ ∈ λ. In the same way as
before, it now follows that μ ≤ λ. Hence we cannot have λ ≤ μ, as by
the Schröder–Bernstein Theorem 3.2.7 it would then follow that λ = μ,
contrary to hypothesis. ∎
2.4. Problem
Prove that if α is an infinite ordinal then |α| = |α′|; hence α′ cannot be
a cardinal. (Let f be the map such that dom f = α′, fξ = ξ′ for all
finite ξ, fξ = ξ for all infinite ξ < α, and fα = 0. Show that f is a
bijection from α′ to α.)
2.5. Theorem
ω is the least infinite cardinal. ∎
2.6. Theorem
If A is a set of cardinals, then ⋃A is the lub of A in the class of all
cardinals, that is, the least cardinal λ such that ξ ≤ λ for all ξ ∈ A.
PROOF
2.7. Theorem
PROOF
2.8. Corollary
The class of all cardinals is a proper class. ∎
2.9. Lemma
We can define a (necessarily unique) function F such that dom F = W
and for every ordinal α,
PROOF
2.10. Definition
For any ordinal α,
$$\aleph_\alpha =_{\mathrm{df}} F\alpha.$$
2.11. Remarks
(i) ‘ℵ’ is aleph, the first letter in the Hebrew alphabet. It is also the
first letter of the Hebrew word ‘אין סוף’ (einsoph, meaning
infinity), which is a cabbalistic appellation of the deity. The
notation is due to Cantor, who was deeply interested in mysticism.
(ii) Combining Def. 2.10 with the characterization of F in Lemma
2.9, we obtain:
OS eu Sx) 2
2.12. Theorem
(i) For any α, ℵ_α is an infinite cardinal.
(ii) For any ordinals α and β, α < β ⇒ ℵ_α < ℵ_β.
(iii) ℵ₀ = ω.
PROOF
All three statements follow easily from Rem. 2.11(ii). ∎
2.13. Theorem
Every infinite cardinal is ℵ_α for some ordinal α.
PROOF
From Thm. 2.12(ii) it follows that α ≠ β ⇒ ℵ_α ≠ ℵ_β. This means that
the function F of Lemma 2.9 is a bijection from the class W of all
ordinals to the class {ℵ_α : α ∈ W} of all alephs. Since W is a proper
class (Cor. 4.2.18), it follows from Prob. 2.4.5 that the class of all
alephs must likewise be a proper class.
Now let λ be any infinite cardinal. Then λ, being an ordinal, is a set.
Hence there must be some α such that ℵ_α ∉ λ; otherwise the set λ
would include the class of all alephs, and by AS the latter would be a
set, contrary to what we have just shown.
Since both λ and ℵ_α are ordinals, the fact that ℵ_α ∉ λ implies that
λ ≤ ℵ_α. If λ = ℵ_α, then there is nothing further to prove. On the other
hand, if λ < ℵ_α then by Rem. 2.11(ii) it follows that λ belongs to the
set {ℵ_ξ : ξ < α}. Hence λ = ℵ_ξ for some ξ < α. ∎
2.14. Remarks
(i) By Thms. 2.12 and 2.13, the alephs are just the infinite cardinals
by another name. Moreover, each infinite cardinal is an ℵ_α for
some unique ordinal α.
(ii) The theory of real numbers, as other branches of mathematics,
can be developed within set theory. In doing so, one identifies
the finite cardinals with the natural numbers (see Rem. 1.8). It is
then not difficult to show that Pℵ₀ (= Pω by Thm. 2.12(iii)) is
equipollent to the continuum, the set of all real numbers. (It is
§3. Arithmetic of the alephs 97
also equipollent to the set of all real numbers lying in any given
interval, for example, between 0 and 1.) The cardinal |Pℵ₀| is
therefore the cardinality of the continuum.
Cantor conjectured (but was unable to prove) that |Pℵ₀| = ℵ₁.
This conjecture is known as the Continuum Hypothesis (CH).
More generally, the Generalized Continuum Hypothesis
(GCH) is the conjecture that |Pℵ_α| = ℵ_{α′} for every α.
(iii) In 1938 Gödel proved that GCH is consistent relative to the
commonly accepted postulates of set theory, in the sense that if
they are consistent, then the addition of GCH does not result in
inconsistency. In 1963 P. J. Cohen proved that the same holds
also for the negation of CH (and hence GCH).
3.2. Theorem
ℵ₀·ℵ₀ = ℵ₀.
PROOF (OUTLINE)
According to Def. 3.5.1, ℵ₀·ℵ₀ is the cardinality of the set A × B,
where A and B are any sets whose cardinality is ℵ₀. We shall take both
A and B to be ℵ₀ itself.
Recall that by Thm. 2.12(iii) ℵ₀ = ω, which is the set of finite
ordinals (as well as the set of finite cardinals). Thus we must show that
the set ω × ω of all ordered pairs of finite ordinals is equipollent to ω
itself.
For any ordinals ξ and η, we let max(ξ, η) be the greater of ξ and η.
(If ξ = η then max(ξ, η) is equal to both of them.)
We define an order < on the set ω × ω as follows. For any finite
ordinals ξ, η, φ and ψ we stipulate that (ξ, η) < (φ, ψ) iff one of the
To make this clearer, here are the first few members of ω × ω, listed
according to the order <:
(0,0),
(0,1), (1,0), (1,1),
(0,2), (1,2), (2,0), (2,1), (2,2),
(0,3), (1,3), (2,3), (3,0), (3,1), (3,2), (3,3), ...
It is not difficult to see that ω × ω with this order on it is similar to ω
itself with its ∈-well-ordering. In particular, ω × ω is equipollent to ω.
∎
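The listing above suggests a direct way to compute the finite ordinal matched with each pair. The following sketch assumes that pairs are compared first by their larger coordinate, then by the first coordinate, then by the second; this tie-breaking is inferred from the listed members, since the defining clauses are not fully legible here. The names `key`, `initial_segment` and `position` are illustrative.

```python
from itertools import product

def key(pair):
    """Sort key: larger coordinate, then first coordinate, then second (assumed order)."""
    xi, eta = pair
    return (max(xi, eta), xi, eta)

def initial_segment(n):
    """All pairs with both coordinates below n, listed according to the order."""
    return sorted(product(range(n), repeat=2), key=key)

def position(pair):
    """Number of pairs that precede `pair`, i.e. the finite ordinal matched with it."""
    xi, eta = pair
    m = max(xi, eta)
    # m*m pairs have a smaller maximum. Within the block of maximum m the
    # pairs (0,m), ..., (m-1,m) come first, followed by (m,0), ..., (m,m).
    within = xi if xi < m else m + eta
    return m * m + within

segment = initial_segment(4)
positions = [position(p) for p in segment]
```

Each block with larger coordinate m contains 2m + 1 pairs and is preceded by m·m pairs, so `position` runs through 0, 1, 2, ... without gaps or repetitions along the listing.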
3.3. Theorem
ℵ_α·ℵ_α = ℵ_α for any ordinal α.
PROOF
of <, for each (φ, ψ) ∈ A we must have φ ≤ ζ and ψ ≤ ζ. Therefore
A is a subset of ζ′ × ζ′, hence |A| ≤ |ζ′|·|ζ′|.
If ζ is finite then ζ′ is finite as well and hence, by Thm. 1.6, so is
|A|.
If |ζ| = ℵ_β for some β < α, then by Prob. 2.4 |ζ′| = ℵ_β as well, so by
the induction hypothesis |A| ≤ ℵ_β. Thus in any case |A| is smaller than
ℵ_α.
However, since f(ξ, η) = ℵ_α, it follows that f↾A is a bijection from
A to ℵ_α and hence |A| = ℵ_α, contrary to what we have just shown.
This contradiction shows that δ must be equal to ℵ_α. ∎
3.4. Remark
In view of Thm. 2.13, Thm. 3.3 means simply that λ·λ = λ for any
infinite cardinal λ.
3.5. Theorem
If μ is an infinite cardinal and λ is any cardinal such that 1 ≤ λ ≤ μ,
then λ·μ = μ.
PROOF
3.6. Theorem
If μ is an infinite cardinal and λ is any cardinal such that λ ≤ μ, then
λ + μ = μ.
PROOF
3.7. Theorem
If λ is an infinite cardinal and κ is any finite cardinal other than 0, then
λ^κ = λ.
PROOF
3.8. Definition
Let A be a class. A map from an ordinal α to A is called an A-string of
length α. A map from a finite ordinal to A is called a finite A-string.
3.9. Theorem
Let A be an infinite set and let S be the class of all finite A-strings. Then
S is a set and |S| = |A|.
PROOF
$$S = \bigcup\{S_\alpha : \alpha < \omega\}.$$
Hence it is easy to see that
1.1. Specification
The primitive symbols of ℒ fall into two mutually exclusive categories:
1.2. Warning
The statement just made does not mean that, for example, the
implication symbol of ℒ is a boldface arrow-shaped figure. (In fact, for
all we care ℒ may not have a written form at all!) Rather, the boldface
arrow is a syntactic constant, a symbol in our metalanguage, used as a
name for the implication symbol of ℒ.
1.3. Definition
If l is a natural number and s₁, s₂, ..., s_l are primitive symbols of ℒ,
not necessarily distinct, then the concatenation s₁s₂...s_l is called an
ℒ-string and the number l is called its length. (More formally, an
ℒ-string of length l can be defined as a map from the set {1, 2, ..., l} to
102 7. Propositional logic
1.4. Definition
ℒ-formulas are strings constructed according to the following three
rules.
(1) A string consisting of a single occurrence of a propositional
symbol is an ℒ-formula.
(2) If β is an ℒ-formula then ¬β (the string obtained by concatenating
a single occurrence of ¬ and the string β, in this order) is an
ℒ-formula.
(3) If β and γ are ℒ-formulas then →βγ (the string obtained by
concatenating a single occurrence of →, the string β and the
string γ, in this order) is an ℒ-formula.
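The three rules translate directly into a recursive test for whether a given string is a formula. The sketch below is one possible rendering under stated assumptions: a string is represented as a Python list of tokens, the connectives are spelled "¬" and "→", and every other token is treated as a propositional symbol. The names `read_formula` and `is_formula` are hypothetical.

```python
NEG, IMP = "¬", "→"   # assumed token spellings for the two connectives

def read_formula(tokens, i=0):
    """Try to read one formula starting at tokens[i].

    Returns the index just past that formula, or None if no formula starts there.
    """
    if i >= len(tokens):
        return None
    head = tokens[i]
    if head == NEG:                      # rule (2): negation symbol, then a formula
        return read_formula(tokens, i + 1)
    if head == IMP:                      # rule (3): implication symbol, then two formulas
        j = read_formula(tokens, i + 1)
        return None if j is None else read_formula(tokens, j)
    return i + 1                         # rule (1): a single propositional symbol

def is_formula(tokens):
    """A string is a formula iff reading one formula consumes the whole string."""
    return read_formula(tokens) == len(tokens)
```

For example, `is_formula(["→", "p", "¬", "q"])` holds, whereas `["→", "p"]` and `["p", "q"]` are strings that are not formulas.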
1.5. Warnings
(i) In some books, particularly older ones, what we call ‘strings’ are
referred to as ‘formulas’, whereas what we call ‘formulas’ are
referred to as ‘well-formed formulas’ (‘wffs’).
(ii) Def. 1.4 does not mean that boldface lower-case Greek letters
are ℒ-formulas. Rather, they are syntactic variables, symbols in
our metalanguage used to range over ℒ-formulas.
1.6. Definition
A propositional symbol occurring in a formula α is called a prime
component of α.
$1. Basic syntax 103
1.7. Definition
The degree of complexity of a formula α (briefly, deg α) is the total
number of occurrences of connectives (¬ and →) in α.
1.8. Remark
We shall often wish to prove that all formulas α have some property P
(briefly, ∀αPα). This may be done by [strong] induction on deg α, as
follows. Define a property Q of natural numbers by stipulating that Q
holds for a given number n iff P holds for all formulas α such that
deg α = n. Then clearly ∀αPα is equivalent to ∀nQn. As we know
(see § 3 of Ch. 0), to prove ∀nQn by strong induction we deduce Qn
(for arbitrary n) from the induction hypothesis ∀m < n Qm.
Stated in terms of P rather than Q, this is tantamount to saying: if
we deduce Pα (for arbitrary α) from the induction hypothesis that Pβ
holds for all formulas β such that deg β < deg α, then it follows that
∀αPα.
1.9. Problem
Assign to each primitive symbol s of ℒ a weight w(s) by stipulating: if
s is a propositional symbol then w(s) = −1, while w(¬) = 0 and
w(→) = 1. If s₁, s₂, ..., s_n are primitive symbols, we assign to the
string s₁s₂...s_n weight
$$w(s_1 s_2 \ldots s_n) = w(s_1) + w(s_2) + \cdots + w(s_n).$$
Thus, the weight of a string is the sum obtained by adding −1 for each
occurrence of a propositional symbol and +1 for each occurrence of →
in the string (occurrences of ¬ make no contribution to the weight).
Since a formula is also a string, every formula α has now been assigned
a weight w(α). Show that, for any formula α,
In other words, (ii) states that any string which is a proper initial
segment of α (an initial part of α short of the whole of α) has
non-negative weight. (Prove (i) and (ii) by strong induction on deg α.)
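The weight assignment is easy to experiment with before attempting the inductive proof. The sketch below computes the weight of a string and of each of its proper initial segments; as an assumption, strings are lists of tokens with the connectives spelled "¬" and "→", and the names `weight` and `proper_prefix_weights` are illustrative. Checking one sample formula is of course no substitute for the proof.

```python
NEG, IMP = "¬", "→"   # assumed token spellings for the two connectives

def weight(tokens):
    """Sum of symbol weights: +1 for →, 0 for ¬, -1 for a propositional symbol."""
    total = 0
    for t in tokens:
        if t == IMP:
            total += 1
        elif t != NEG:
            total -= 1
    return total

def proper_prefix_weights(tokens):
    """Weights of all proper initial segments, from the empty string upwards."""
    return [weight(tokens[:k]) for k in range(len(tokens))]

# Sample formula →¬p→qr, written as a list of tokens.
alpha = [IMP, NEG, "p", IMP, "q", "r"]
whole = weight(alpha)
prefixes = proper_prefix_weights(alpha)
```

For this sample, every entry of `prefixes` is non-negative, while `whole` is negative.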
2.1. Definition
$$(\alpha \to \beta) =_{\mathrm{df}} {\to}\alpha\beta.$$
2.5. Definition
(i) (α∧β) =df ¬(α→¬β),
(ii) (α∨β) =df ¬α→β,
(iii) (α↔β) =df (α→β)∧(β→α).
(α∧β) is called a conjunction formula and α and β its first conjunct
and second conjunct respectively; (α∨β) is called a disjunction formula
and α and β its first disjunct and second disjunct respectively; (α↔β) is
called a bi-implication formula and α and β its left-hand side and
right-hand side respectively.
2.6. Warning
The metalinguistic symbol ‘∧’ does not denote anything; strictly speaking
it has no meaning on its own: only the package ‘(α∧β)’ as a whole
has been defined as an abbreviation for ‘¬(α→¬β)’. This is an
example of a contextual definition. Similar remarks apply to the other
two clauses of Def. 2.5.
In view of Def. 2.5 we need to modify our procedure for omitting and
restoring brackets in metalinguistic expressions. We leave Rules 2.2
and 2.4 as they are, but we replace Rule 2.3 by the following more
comprehensive rule for restoring brackets, which takes into account
not only ‘→’ but also the newly introduced metalinguistic symbols ‘∧’,
‘∨’ and ‘↔’.
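$$\begin{aligned}
\alpha\wedge\beta\to\gamma\leftrightarrow\neg\alpha\to\beta\vee\gamma
&= (\alpha\wedge\beta)\to\gamma\leftrightarrow\neg\alpha\to\beta\vee\gamma\\
&= (\alpha\wedge\beta)\to\gamma\leftrightarrow\neg\alpha\to(\beta\vee\gamma)\\
&= (\alpha\wedge\beta)\to\gamma\leftrightarrow[\neg\alpha\to(\beta\vee\gamma)]\\
&= [(\alpha\wedge\beta)\to\gamma]\leftrightarrow[\neg\alpha\to(\beta\vee\gamma)]\\
&= \{[(\alpha\wedge\beta)\to\gamma]\leftrightarrow[\neg\alpha\to(\beta\vee\gamma)]\}.
\end{aligned}$$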
The idea behind Rule 2.7 is that — in the absence of brackets that
indicate otherwise — a symbol-occurrence of higher rank separates more
strongly than one of lower rank, in much the same way as in English
punctuation a full stop separates more strongly than a semicolon, and
the latter separates more strongly than a comma.
3.1. Definition
Let β₁, β₂, ..., β_k be any formulas. A propositional combination of
β₁, β₂, ..., β_k is any formula constructed according to the following
three rules.
3.2. Warnings
(i) In forming a combination of β₁, β₂, ..., β_k, not all the β_i need
actually be used. For example, according to Def. 3.1, both β₁ and
β₁→β₂ are combinations of β₁, β₂, β₃.
3.3. Problem
Let β₁, β₂, ..., β_k be distinct prime formulas, among which are all the
prime components of a formula α. Prove that α can be obtained as a
combination of β₁, β₂, ..., β_k in exactly one way. (Use induction on
deg α, distinguishing three cases corresponding to the three clauses of
Def. 1.4.)
4.1. Remark
From a purely technical point of view, it does not matter what the
truth values ⊤ and ⊥ are, so long as they are two distinct objects. But
intuitively it is best to think of them as abstract entities standing
outside the language ℒ.
4.2. Definition
(i) A truth valuation on ℒ is a mapping σ from the set of all prime
ℒ-formulas to the set {⊤, ⊥} of truth values. For any truth
valuation σ and any prime formula α we denote by ‘α^σ’ the truth
value assigned by σ to α.
§4. Basic semantics 109
4.3. Remarks
4.4. Definition
(i) If α is a formula and σ is a truth valuation such that α^σ = ⊤, we
say that σ satisfies α and write ‘σ ⊨ α’.
(ii) If σ is a truth valuation that satisfies every member of a set Φ of
formulas, we say that σ satisfies Φ and write ‘σ ⊨ Φ’.
(iii) If a formula α is satisfied by every truth valuation, we say that α
is a tautology and write ‘⊨₀ α’.
(iv) If Φ is a set of formulas and α is a formula such that every truth
valuation satisfying Φ also satisfies α, we say that α is a tautological
consequence of Φ and write ‘Φ ⊨₀ α’.
(v) If a set Φ of formulas is not satisfied by any truth valuation, we
say that Φ is [propositionally] unsatisfiable and write ‘Φ ⊨₀’.
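Clauses (i) and (ii) can be made concrete with a small evaluator. This is a sketch under stated assumptions: prime formulas are Python strings, compound formulas are tuples `("not", β)` and `("imp", β, γ)`, a truth valuation is a dictionary from prime formulas to `True` (for ⊤) or `False` (for ⊥), and the extension of a valuation to compound formulas follows the usual classical clauses for negation and implication, which are not legible in this part of the text. The names `value` and `satisfies` are hypothetical.

```python
def value(formula, sigma):
    """Truth value of `formula` under the valuation `sigma` (True for ⊤, False for ⊥)."""
    if isinstance(formula, str):                 # prime formula
        return sigma[formula]
    if formula[0] == "not":                      # negation: flip the truth value
        return not value(formula[1], sigma)
    if formula[0] == "imp":                      # implication: false only for ⊤ → ⊥
        return (not value(formula[1], sigma)) or value(formula[2], sigma)
    raise ValueError(f"not a formula: {formula!r}")

def satisfies(sigma, formulas):
    """sigma satisfies a collection of formulas iff it satisfies every member."""
    return all(value(f, sigma) for f in formulas)

sigma = {"p": True, "q": False}
v1 = value(("imp", "p", "q"), sigma)              # ⊤ → ⊥
v2 = satisfies(sigma, ["p", ("not", "q")])
v3 = satisfies(sigma, [])                         # nothing to check for the empty collection
```

Note that `satisfies(sigma, [])` is `True` for every valuation, because there is no member that could fail to be satisfied.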
4.5. Remarks
(i) According to Def. 4.4(ii), a truth valuation σ fails to satisfy a set
Φ of formulas, iff Φ has a member that fails to be satisfied by σ.
Therefore if σ is any truth valuation, then σ ⊨ ∅. Indeed, ∅ does
not have a member that fails to be satisfied by σ, because it has
no members at all.
(ii) By Def. 4.4(iv), ∅ ⊨₀ α means that every truth valuation satisfies
α (because, as we have just seen, every truth valuation satisfies
the empty set ∅); by Def. 4.4(iii) this means that α is a tautology.
Thus, a formula is a tautology iff it is a tautological consequence
of the empty set.
(iii) In connection with ‘⊨₀’ we employ certain notational simplifications
that ought to be self-explanatory. Thus, for example, we
write ‘Φ, α ⊨₀ β’ instead of ‘Φ ∪ {α} ⊨₀ β’.
4.6. Problem
(i) For any set Φ of formulas and any two formulas α and β, prove
that Φ, α ⊨₀ β iff Φ ⊨₀ α→β.
(ii) Prove that {α₁, α₂, ..., α_k} ⊨₀ β iff ⊨₀ α₁→α₂→ ··· →α_k→β.
4.7. Warning
Never, never get → and ⊨₀ confused with each other. (I was not
referring just now to the symbols ‘→’ and ‘⊨₀’. You are not likely to
get them confused, because you can see they are different: the former
is a boldface arrow-shaped figure, while the latter is shaped like a
double-barred turnstile with a little ring on its lower right-hand side.
Rather, I was referring to what these symbols denote.) Much can be
written about this, but the following should help you to avoid the most
common errors.
Suppose α and β are ℒ-formulas. Then α→β is another such
formula. ‘α→β’ is a nominal phrase: if you write it on its own, you
would not be making any statement, but only referring to that formula,
just as when I say ‘my income-tax statement’ and no more I am not
making a statement but merely referring to my income-tax statement.¹
¹ We must exclude here cases of ellipsis, such as when, in reply to the question ‘What
were you doing last night?’, I say ‘My income-tax statement.’ as an ellipsis for the
sentence ‘I was doing my income-tax statement.’
$5. Truth tables 111
On the other hand, if you write ‘α ⊨₀ β’ on its own, you would be
stating that β is a tautological consequence of α (or, more precisely, of
the singleton {α}); and if you write ‘⊨₀ α→β’ on its own, you would be
stating that the implication formula α→β is a tautology. By Prob. 4.6,
these two statements are equivalent.
β | ¬β
⊤ | ⊥
⊥ | ⊤

β | γ | β→γ
⊤ | ⊤ | ⊤
⊤ | ⊥ | ⊥
⊥ | ⊤ | ⊤
⊥ | ⊥ | ⊤
The idea here is that any truth valuation that assigns to β (or to β and
γ) the truth value(s) shown in the first column (or the first two
columns) at a given row must assign to ¬β (or to β→γ) the truth value
shown in the last column at the same row.
This idea can be applied more generally. In the following definition
the formula α is any combination of formulas β₁, β₂, ..., β_k. The
definition prescribes how to construct a truth table for α in terms of β₁,
β₂, ..., β_k. It proceeds by induction on deg α: the induction hypothesis
is that if γ is any combination of β₁, β₂, ..., β_k and deg γ < deg α
then we can construct a truth table for γ in terms of β₁, β₂, ..., β_k;
and using this hypothesis the definition tells us how to construct a truth
table for α in terms of β₁, β₂, ..., β_k.
5.1. Definition
Let the formula α be a combination of formulas β₁, β₂, ..., β_k. A
truth table for α in terms of β₁, β₂, ..., β_k is constructed as follows.
First, set up a rectangular table with k columns, headed ‘β₁’, ‘β₂’,
..., ‘β_k’ respectively, and 2^k rows. In each of the k·2^k spaces enter
‘⊤’ or ‘⊥’, so that no two rows are filled out in the same way. Thus
each of the 2^k different strings of length k made up of ‘⊤’s and ‘⊥’s
should appear in exactly one row. (For the sake of definiteness, regard
these strings as ‘words’ in an alphabet consisting of the two letters ‘⊤’
and ‘⊥’ in this order, and enter the 2^k different strings in lexicographic
order.)
Next, add a new last column, headed ‘α’, and, proceeding by
induction on deg α, fill it out with ‘⊤’s and ‘⊥’s according to the
following three rules corresponding to the three clauses of Def. 3.1.
5.2. Warning
Since in general the same α may be obtained as a combination of
formulas β₁, β₂, ..., β_k in more than one way (see Warning 3.2(ii)),
Def. 5.1 may not yield a unique result: α may have more than one
truth table in terms of β₁, β₂, ..., β_k.
5.3. Problem
Construct truth tables in terms of a, B for:
(i) α∧β,
(ii) α∨β,
(iii) α↔β.
(See Def. 2.5.)
5.4. Problem
5.5. Lemma
Let α be a combination of β₁, β₂, ..., β_k. Consider a given row in a
truth table for α in terms of β₁, β₂, ..., β_k. Let σ be any truth
valuation such that for every i (where i = 1, 2, ..., k) β_i^σ is the truth
value indicated in the given row at the i-th column (the one headed ‘β_i’).
Then α^σ is the truth value indicated in the given row at the last column
(headed ‘α’).
PROOF
PROOF
Let σ be any truth valuation. Clearly, the truth values β₁^σ, β₂^σ, ...,
β_k^σ must be respectively the same as those indicated in one particular
row of the given truth table. Hence by Lemma 5.5 α^σ is the truth value
indicated in the same row in the last column. But by assumption this
truth value is ⊤. Thus α^σ = ⊤ for all σ. ∎
5.7. Problem
Verify that for any a, B and y:
5.8. Warning
The converse of Thm. 5.6 is not generally true. To see this, let
α = β→γ; then a truth table for α in terms of β, γ is shown above (p.
111) and has a ‘⊥’ in its last column. Does it follow that α cannot be a
tautology? No; this truth table only shows that α^σ = ⊥ provided σ is a
PROOF
Consider an arbitrary row in this truth table. Since β₁, β₂, ..., β_k are
prime and distinct, there exists a truth valuation σ such that the truth
values β₁^σ, β₂^σ, ..., β_k^σ are respectively the same as those indicated
in this particular row of the truth table. By Lemma 5.5, α^σ is the truth
value indicated in the same row at the last column. But α^σ = ⊤ since α
is a tautology. Thus the entry at the last column in this row is ‘⊤’. ∎
5.10. Remark
Thms. 5.6 and 5.9 together provide us with an algorithm (a mechanically
performable procedure) whereby we can test any formula α and
decide whether or not it is a tautology: construct the truth table for α
in terms of its prime components (or in terms of any distinct prime
formulas among which are all the prime components of α; see Prob.
3.3);
Using Prob. 4.6, this algorithm also enables us to decide, for any
finite set Φ of formulas and any formula α, whether or not Φ ⊨₀ α.
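The decision procedure just described can be sketched in a few lines. The sketch rests on several assumptions: prime formulas are Python strings, compound formulas are tuples `("not", β)` and `("imp", β, γ)`, truth values are `True` and `False`, the classical truth conditions for negation and implication are used, and the chain of implications in Prob. 4.6(ii) is read as nested to the right. The names `value`, `primes`, `is_tautology` and `is_consequence` are hypothetical.

```python
from itertools import product

def value(formula, sigma):
    """Truth value of `formula` under the valuation `sigma` (a dict on prime formulas)."""
    if isinstance(formula, str):
        return sigma[formula]
    if formula[0] == "not":
        return not value(formula[1], sigma)
    return (not value(formula[1], sigma)) or value(formula[2], sigma)   # "imp"

def primes(formula):
    """Set of prime components of a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(primes(sub) for sub in formula[1:]))

def is_tautology(formula):
    """Run through all rows of the truth table in terms of the prime components."""
    ps = sorted(primes(formula))
    # product([True, False], ...) lists the rows in lexicographic order, ⊤ before ⊥.
    return all(value(formula, dict(zip(ps, row)))
               for row in product([True, False], repeat=len(ps)))

def is_consequence(hypotheses, formula):
    """Decide consequence from a finite list of hypotheses via a nested implication."""
    chain = formula
    for h in reversed(hypotheses):
        chain = ("imp", h, chain)
    return is_tautology(chain)
```

For example, `is_tautology(("imp", "p", "p"))` holds, and `is_consequence(["p", ("imp", "p", "q")], "q")` holds as well. The table has 2^k rows for k prime components, so the running time grows exponentially with k.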
5.11. Definition
If α and β are formulas satisfied by exactly the same truth valuations
(that is, both α ⊨₀ β and β ⊨₀ α) we say that α and β are tautologically
equivalent and write ‘α ≡₀ β’.
5.12. Remarks
(i) From Prob. 5.3(iii) it is easy to see that α ≡₀ β iff ⊨₀ α↔β.
(ii) An argument similar to the one used in the proof of Thm. 5.6
5.13. Problem
Verify that for any α, β, γ, α₁, α₂, ..., α_k:
5.14. Problem
Let α and β be any formulas. Let Φ be the set of all formulas
obtainable from α and β using negation and conjunction. More precisely,
5.15. Problem
The same as Prob. 5.14, but with ‘conjunction’ and ‘∧’ replaced by
‘disjunction’ and ‘∨’ respectively.
5.16. Problem
For any formulas α and β, put α|β =df ¬(α∧β). The ‘|’ here is known
as Sheffer’s stroke. The formula α|β is called the non-conjunction of α
and β. Let Φ be the set of all formulas obtainable from α and β using
non-conjunction. Thus,
(1) α and β are in Φ;
(2) if γ and δ are in Φ then so is γ|δ.
Find formulas in Φ that are tautologically equivalent to ¬α and α→β
respectively.
5.17. Problem
Let α and β be distinct prime formulas. Let Φ be defined as in Prob.
5.16, but with ‘non-conjunction’ and ‘|’ replaced by ‘implication’ and
‘→’ respectively. Prove that no formula in Φ is tautologically equivalent
to α∧β.
5.18. Problem
Let α and β be distinct prime formulas. Let Φ be defined as in Prob.
5.14, but with ‘conjunction’ and ‘∧’ replaced by ‘bi-implication’ and
‘↔’ respectively.
5.19. Remark
Prob. 5.4 means that all binary truth functions are reducible to
negation and implication. Prob. 5.14 (Prob. 5.15) means that implica-
tion — and hence all binary truth functions — can be reduced to negation
and conjunction (negation and disjunction). Prob. 5.16 means that
negation and implication — and hence all binary truth functions — can
be reduced to non-conjunction. Prob. 5.17 means that conjunction
cannot be reduced to implication (although by Prob. 5.13(i) disjunction
can be so reduced). Prob. 5.18(ii) means that implication cannot be
reduced to negation and bi-implication.
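The irreducibility claim of Prob. 5.17 can be illustrated by brute force: there are only sixteen binary truth functions, so one can compute every truth table reachable from α and β by repeated implication and observe that the table of conjunction is not among them. The following is a sketch of that computation; the function names are illustrative assumptions.

```python
from itertools import product

ROWS = list(product([True, False], repeat=2))  # the four truth valuations of (α, β)

def table(f):
    """Truth table of a binary truth function, as a 4-tuple of truth values."""
    return tuple(f(a, b) for a, b in ROWS)

def implies(s, t):
    """Row-by-row implication of two truth tables."""
    return tuple((not x) or y for x, y in zip(s, t))

def closure_under_implication(start):
    """All truth tables obtainable from `start` by repeatedly forming s→t."""
    found = set(start)
    while True:
        new = {implies(s, t) for s in found for t in found} - found
        if not new:
            return found
        found |= new

ALPHA = table(lambda a, b: a)
BETA = table(lambda a, b: b)
reachable = closure_under_implication({ALPHA, BETA})
```

Every reachable table has the value true in the row where both α and β are true, and conjunction is absent from the closure while disjunction is present, in line with the remark above.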
6.1. Definition
Modus ponens is the [formal] operation that may be applied to any two
formulas of the form α and α→β, to yield the formula β; schematically,
    α, α→β
    ------
      β
In this connection, α and α→β are called the minor premiss and major
premiss respectively, and β is called the conclusion.
6.2. Remark
Note that these are not five single axioms but axiom schemes, each
representing infinitely many axioms obtained by all possible choices of
formulas α, β, and γ. We shall refer to them briefly as ‘Ax. i’, ‘Ax. ii’,
etc.
6.8. Definition
(i) A propositional deduction from a set Φ of formulas is a non-empty
finite sequence of formulas φ₁, φ₂, ..., φₙ such that for
each k (k = 1, 2, ..., n) at least one of the following conditions
holds:
(1) φₖ is a propositional axiom,
(2) φₖ ∈ Φ,
(3) φₖ is obtained by modus ponens from two earlier formulas in
the sequence; that is, there are i and j, both smaller than k, such
that φⱼ = φᵢ→φₖ.
In this connection Φ is called a set of hypotheses.
(ii) A propositional proof is a propositional deduction from the
empty set of hypotheses.
Where there is no risk of ambiguity, we shall usually omit the qualifica-
tion ‘propositional’ and say simply ‘deduction’ and ‘proof’. Similar
ellipses will be used in connection with other bits of terminology
introduced below.
6.9. Definition
(i) A deduction (or proof) whose last formula is α is said to be a
deduction (or proof, respectively) of α.
(ii) If there exists a propositional deduction of a formula α from a set
Φ of formulas, we say that α is [propositionally] deducible from
Φ and write, briefly, ‘Φ ⊢₀ α’.
(iii) If there exists a propositional proof of a formula α (that is, a
deduction of α from the empty set) we say that α is [propositionally]
provable and write, briefly, ‘⊢₀ α’. In this case α is also
called a [propositional] theorem.
6.10. Remarks
(i) The calculus we have specified here is a linear calculus, as
distinct from calculi whose deductions have a more complex
tree-like branching form rather than being ordinary (linear) se-
quences as in Def. 6.8. A linear calculus is characterized uniquely
by specifying its axioms (by means of axiom-schemes or in some
other way) and rules of inference. In the present case the axioms
are all instances of Ax. i-Ax. v, and the sole rule of inference is
modus ponens.
1 That is, ignoring irrelevant differences between the formal languages in which these
various calculi are formulated.
(viii) Propcal is pitifully inadequate for formalizing any but the most
trivial mathematical deductions. It is, however, of interest as a
sort of pilot project for more powerful and useful systems.
6.11. Example
We show that ⊢₀ α→α for every α. (In other words, we are going to
prove a [meta]theorem about Propcal, which asserts that, for every
formula α, α→α is a propositional theorem, a theorem of Propcal.)
The following sequence of five formulas is a [propositional] proof of
α→α:
[α→(α→α)→α]→(α→α→α)→α→α, (Ax. ii)
α→(α→α)→α, (Ax. i)
(α→α→α)→α→α, (m.p.)
α→α→α, (Ax. i)
α→α. (m.p.)
The marginal comments on the right have been added for convenience.
Thus the first formula is an instance of Ax. ii, obtained from (6.4) by
taking β = α→α and γ = α; the second formula is an instance of Ax. i,
with β = α→α; the third formula is obtained by modus ponens from
the preceding two; the fourth formula is an instance of Ax. i, with
β = α; and the fifth formula is again obtained by modus ponens from
the preceding two. In principle these explanations are redundant,
because you can always check whether or not a given formula is an
instance of an axiom scheme, or obtainable by modus ponens from two
earlier formulas.
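The mechanical check alluded to here can be sketched as follows. The code is an illustration under stated assumptions: formulas are encoded as tagged tuples, only Ax. i (α→β→α) and Ax. ii ((α→β→γ)→(α→β)→α→γ) are recognized as axioms, and all function names are hypothetical. It verifies the five-formula proof of α→α with α taken to be a prime formula "a".

```python
def imp(a, b):
    """Build the formula a→b as a tagged tuple."""
    return ("imp", a, b)

def is_imp(f):
    return isinstance(f, tuple) and len(f) == 3 and f[0] == "imp"

def is_ax_i(f):
    """Instance of Ax. i: α→β→α."""
    return is_imp(f) and is_imp(f[2]) and f[1] == f[2][2]

def is_ax_ii(f):
    """Instance of Ax. ii: (α→β→γ)→(α→β)→α→γ."""
    if not (is_imp(f) and is_imp(f[1]) and is_imp(f[1][2])):
        return False
    a, b, c = f[1][1], f[1][2][1], f[1][2][2]
    return f[2] == imp(imp(a, b), imp(a, c))

def is_deduction(sequence, hypotheses=()):
    """Check each formula: axiom (Ax. i or Ax. ii only), hypothesis, or modus ponens."""
    for k, f in enumerate(sequence):
        earlier = sequence[:k]
        by_mp = any(imp(minor, f) in earlier for minor in earlier)
        if not (is_ax_i(f) or is_ax_ii(f) or f in hypotheses or by_mp):
            return False
    return len(sequence) > 0  # a deduction must be non-empty

a = "a"
proof = [
    imp(imp(a, imp(imp(a, a), a)), imp(imp(a, imp(a, a)), imp(a, a))),  # Ax. ii
    imp(a, imp(imp(a, a), a)),                                          # Ax. i
    imp(imp(a, imp(a, a)), imp(a, a)),                                  # m.p.
    imp(a, imp(a, a)),                                                  # Ax. i
    imp(a, a),                                                          # m.p.
]
```

Calling is_deduction(proof) accepts the sequence, while dropping the first two formulas makes the third one unjustified and the check fails.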
PROOF
PROOF
Take a deduction of α from Ψ ∪ {δ₁, δ₂, ..., δₖ} and whenever δᵢ is
used there as an hypothesis replace it by a deduction of δᵢ from Φ. The
result is clearly a deduction of α from Φ ∪ Ψ. ∎
6.14. Remark
The Cut Rule clearly holds for any linear calculus, irrespective of its
axioms and rules of inference. The strange name of this rule is due to
the fact that it allows us to ‘cut out the middlemen’ δᵢ.
We shall often refer to this rule briefly as ‘Cut’.
Φ ⊢₀ α→β ⇒ Φ, α ⊢₀ β.
PROOF
φₖ, (ax.)
φₖ→α→φₖ, (Ax. i)
α→φₖ. (m.p.)
α→φᵢ, (hyp.)
α→φᵢ→φₖ, (hyp.)
(α→φᵢ→φₖ)→(α→φᵢ)→α→φₖ, (Ax. ii)
(α→φᵢ)→α→φₖ, (m.p.)
α→φₖ. (m.p.)
7.3. Remarks
(i) We shall refer to the Deduction Theorem briefly as ‘DT’.
(ii) In proving DT (and in Ex. 6.11, which is used in the proof) we
invoked only Ax. i and Ax. ii. In fact, it is not even necessary for
formulas of the forms (6.3) and (6.4) to be axioms: it would have
been enough if they were just theorems. More precisely: if ⊢* is
the relation of deducibility in a linear calculus whose sole rule of
inference is modus ponens and if ⊢* α→β→α as well as
⊢* (α→β→γ)→(α→β)→α→γ for all α, β and γ, then DT holds
for ⊢*, that is: Φ, α ⊢* β ⇒ Φ ⊢* α→β.
(iii) Now that we have DT, we shall not need to invoke Ax. i and Ax.
ii again. Indeed, the sole purpose of adopting these axiom
schemes was to enable us to establish DT.
7.4. Problem
Let ⊢* be the deducibility relation in a calculus that has modus ponens
as a (not necessarily sole) rule of inference.
Show that if Cut and DT hold for ⊢*, then ⊢* α→β→α and
⊢* (α→β→γ)→(α→β)→α→γ for all α, β and γ.
8.2. Warning
Some authors use ‘contradictory’, ‘consistent’ and ‘inconsistent’ as
semantic terms; so that, for example, a set Φ of formulas would be
said to be inconsistent if Φ ⊨₀, that is, if it is not satisfied by any truth
valuation. We shall strictly avoid that semantic usage. Although it will
transpire that a set Φ of formulas is satisfied by some truth valuation
iff it is consistent (in the proof-theoretic sense of Def. 8.1), this fact is
a far from trivial theorem rather than a mere matter of definition.
8.3. Problem
(i) Prove that if Ψ ⊆ Φ and Ψ is inconsistent then Φ is inconsistent.
(ii) Prove that if Φ is inconsistent then it has an inconsistent finite
subset.
8.4. Theorem
An inconsistent set of formulas is not satisfied by any truth valuation: if
Φ ⊢₀ then Φ ⊨₀.
PROOF
PROOF
The claim is equivalent to saying that the empty set is consistent; but
the empty set is satisfied by every truth valuation (cf. Rem. 4.5(i)). ∎
PROOF
8.7. Remarks
(i) For brevity, we shall refer to the Inconsistency Effect as ‘IE’.
(ii) The converse of Thm. 8.6 is trivial: if all formulas are deducible
from Φ, then in particular both members of any contradictory
pair are deducible from it.
(iii) Our sole purpose in adopting Ax. iv was to enable us to establish
IE. From now on this axiom scheme will not have to be invoked.
8.8. Problem
Let ⊢* be the deducibility relation in a calculus for which both DT and
IE hold. Prove that ⊢* ¬α→α→β for all α and β.
PROOF
Assume Φ, α ⊢₀. Then by IE we have Φ, α ⊢₀ ¬α and hence, by DT,
Φ ⊢₀ α→¬α.
Now, (α→¬α)→¬α is an instance of Ax. v; hence α→¬α ⊢₀ ¬α.
Using Cut, we get Φ ⊢₀ ¬α, as claimed. ∎
8.10. Remarks
(i) The converse of reductio is immediate: if Φ ⊢₀ ¬α then a fortiori
Φ, α ⊢₀ ¬α. But clearly also Φ, α ⊢₀ α; hence Φ, α ⊢₀.
(ii) The sole purpose of adopting Ax. v was to enable us to prove
reductio. Henceforth there will be no need to invoke that axiom
scheme.
8.11. Problem
Let ⊢* be the deducibility relation in a calculus that has modus ponens
as a rule of inference and for which DT and reductio hold. Prove that
⊢* (α→¬α)→¬α for all α.
8.12. Problem
Prove that α ⊢₀ ¬¬α for all α.
8.13. Remark
All the proof-theoretic results we have obtained so far - Cut, DT, IE
and reductio — hold also for the intuitionistic propositional calculus
(the most important non-classical propositional calculus). But the
following result — the inverse of Prob. 8.12 — does not hold for that
calculus, so in order to prove it we shall have to invoke Ax. iii, which
is not valid in intuitionistic logic.
8.14. Lemma
¬¬α ⊢₀ α for all α.
PROOF
PROOF
8.16. Remarks
(i) For brevity, we shall refer to the Principle of Indirect Proof as
‘PIP’.
(ii) Lemma 8.14 is a special case of PIP, for clearly {¬¬α, ¬α} ⊢₀.
(iii) The converse of PIP is immediate.
(iv) The sole purpose of adopting Ax. iii was to enable us to prove
PIP. Henceforth it will no longer be necessary to invoke this
axiom scheme.
(v) Indeed, from now on we shall not invoke any propositional
axiom, because the four proof-theoretic principles — DT, IE,
reductio and PIP — jointly contain all the information that the
choice of axioms was designed to provide (cf. Probs. 7.4, 8.8,
8.11 and 8.18). We use these four principles even where, as in the
proof of Lemma 8.14, it would have been quicker to invoke an
axiom. The reason for this apparent perversity is that the axioms
are forgettable, mere scaffolding, whereas the four principles
(together with modus ponens and Cut) encapsulate the most
1 We could have got (1) more directly, as in the proof of Thm. 8.9; but see Rem.
8.16(v).
8.17. Warning
Do not commit the solecism of confusing PIP with reductio. The two
principles, though formally similar to each other, are quite distinct.
(Intuitionistic logic rejects the former and upholds the latter.)
8.18. Problem
Let ⊢* be the deducibility relation in a calculus that has modus ponens
as a rule of inference and for which Cut, DT, IE and PIP hold. Prove
that ⊢* [(α→β)→α]→α for all α and β.
8.19. Problem
Prove:
(i) ¬α ⊢₀ α→β,
(ii) β ⊢₀ α→β,
(iii) {α, ¬β} ⊢₀ ¬(α→β),
(iv) ¬(α→β) ⊢₀ α,
(v) ¬(α→β) ⊢₀ ¬β.
8.20. Problem
Using Def. 2.5, prove:
(i) α∧¬α ⊢₀,
(ii) ⊢₀ α∨¬α,
(iii) α∧β ⊢₀ β∧α,
(iv) α∨β ⊢₀ β∨α.
8.21. Remark
In Prob. 8.20, (ii) does not depend on the intuitionistically invalid PIP
(or Ax. iii), whereas (iv) does. On the other hand, it is well known that
in intuitionistic logic the law of excluded middle is invalid, whereas
disjunction has a symmetric meaning. This apparent incongruity is due
to the fact that in intuitionistic logic Def. 2.5(ii) itself is not acceptable,
because disjunction (and, for that matter, conjunction) cannot be
reduced to negation and implication.
8.22. Problem
Prove that ⊢₀ (¬α→β)→(¬α→¬β)→α.
8.23. Remark
9.2. Lemma
Let α be a combination of formulas β₁, β₂, ..., βₖ. Select a given row
in a truth table for α in terms of β₁, β₂, ..., βₖ. For each i = 1, 2, ...,
k let β′ᵢ be βᵢ or ¬βᵢ, according as the entry in the given row at the i-th
column is ‘T’ or ‘⊥’. Similarly, let α′ be α or ¬α, according as the
entry in the given row at the last column is ‘T’ or ‘⊥’.
Then {β′₁, β′₂, ..., β′ₖ} ⊢₀ α′.
PROOF
For brevity, we put Φ = {β′₁, β′₂, ..., β′ₖ}, so we must prove Φ ⊢₀ α′.
We proceed by induction on deg α and distinguish three cases, according
to which of the three rules in Def. 5.1 was used to construct the last
column in the truth table in question.
Case 1: α = βᵢ (where 1 ≤ i ≤ k) and Rule (1) of Def. 5.1 was used. In
this case the entry in the given row and last column is a copy of the one
in the i-th column. Then α′ = β′ᵢ ∈ Φ and obviously Φ ⊢₀ α′.
Subcase 2b: γ′ = ¬γ. Then according to Rule (2) we get α′ = α =
¬γ; and γ′ ⊢₀ α′ is the same as ¬γ ⊢₀ ¬γ, which is obvious.
Subcase 3a: γ′ = ¬γ. Then according to Rule (3) we get α′ = α =
γ→δ. So γ′ ⊢₀ α′ is the same as ¬γ ⊢₀ γ→δ, which holds by Prob.
8.19(i). Therefore a fortiori {γ′, δ′} ⊢₀ α′.
9.3. Lemma
Let α be a combination of β₁, β₂, ..., βₖ, and suppose that in some
truth table for α in terms of β₁, β₂, ..., βₖ all the entries in the last
column are ‘T’. For each i = 1, 2, ..., k let β′ᵢ be chosen arbitrarily as
βᵢ or ¬βᵢ, the choice being made independently for different i. Then
{β′₁, ..., β′ₖ₋ₚ} ⊢₀ α for every p = 0, 1, ..., k. In particular, for
p = k: ⊢₀ α.
PROOF
By induction on p. For p = 0 the claim is that {β′₁, β′₂, ..., β′ₖ} ⊢₀ α.
This holds by Lemma 9.2, because according to our present assumption
the formula α′ (defined there) is always α itself.
For the induction step, let p < k. We must show that Φ ⊢₀ α, where
Φ = {β′₁, β′₂, ..., β′ₖ₋₍ₚ₊₁₎}. (If p = k − 1 then Φ = ∅.)
The induction hypothesis is that Φ, β′ₖ₋ₚ ⊢₀ α. But we are free to
choose β′ₖ₋ₚ in two ways: as βₖ₋ₚ or as ¬βₖ₋ₚ. So we have
Φ, βₖ₋ₚ ⊢₀ α and Φ, ¬βₖ₋ₚ ⊢₀ α.
Hence
Φ, ¬α, βₖ₋ₚ ⊢₀ and Φ, ¬α, ¬βₖ₋ₚ ⊢₀.
PROOF
Suppose ⊨₀ α. Then by Thm. 5.9 the truth table of α in terms of all its
prime components satisfies the assumption of Lemma 9.3; hence by
that lemma ⊢₀ α.
To prove the second part of the theorem, assume that Φ ⊨₀ α, where
Φ is a finite set of formulas. Let φ₁, φ₂, ..., φₖ be all the members of
Φ; then Φ = {φ₁, φ₂, ..., φₖ} and we have {φ₁, φ₂, ..., φₖ} ⊨₀ α.
By Prob. 4.6(ii) we get ⊨₀ φ₁→φ₂→···→φₖ→α. Therefore, by the
first part of the present theorem, ⊢₀ φ₁→φ₂→···→φₖ→α. Hence, by
k applications of modus ponens, we obtain {φ₁, φ₂, ..., φₖ} ⊢₀ α,
that is, Φ ⊢₀ α. ∎
9.5. Theorem
A finite unsatisfiable set of formulas is inconsistent: if Φ is finite and
Φ ⊨₀, then Φ ⊢₀.
PROOF
9.6. Remarks
Our final task in propositional logic will be to prove the full converse
of Thm. 6.12 — the strong semantic completeness of the propositional
calculus. From Rem. 9.6 it should be clear that this task can be
accomplished by proving first the full converse of Thm. 8.4: A consist-
ent — finite or infinite — set of formulas is satisfiable. We shall do so in
three easy stages.
First, we shall show that certain special sets of formulas, called
Hintikka sets, are satisfiable. This will be quite easy, because the
definition of these sets is rigged for this very purpose.
Second, we shall introduce the even more special maximal consistent
sets of formulas and show that each such set is a Hintikka set, and
hence satisfiable. In fact, it will transpire that there is a one-to-one
correspondence between maximal consistent sets and truth valuations.
Finally, using a simple but powerful result from set theory, we shall
show that every consistent set of formulas is a subset of a maximal
consistent set, and is therefore automatically satisfied by the truth
valuation that satisfies the latter.
10.2. Definition
A [propositional] Hintikka set [in ℒ] is a set Φ of formulas satisfying
the following four conditions for all formulas α and β:
10.3. Theorem
If Φ is a Hintikka set, it is satisfied by some truth valuation.
PROOF
Case 1: φ is prime.
(1a) φ ∈ Φ ⇒ φ^σ = T by the definition of σ.
(1b) ¬φ ∈ Φ ⇒ φ ∉ Φ by clause (1) of Def. 10.2,
     ⇒ φ^σ = ⊥ by the definition of σ.
(2a) φ ∈ Φ ⇒ ¬α ∈ Φ
     ⇒ α^σ = ⊥ by part (b) of ind. hyp.,
     ⇒ φ^σ = T by clause (2) of Def. 4.2(ii).
(2b) ¬φ ∈ Φ ⇒ ¬¬α ∈ Φ
     ⇒ α ∈ Φ by clause (2) of Def. 10.2,
     ⇒ α^σ = T by part (a) of ind. hyp.,
     ⇒ φ^σ = ⊥ by clause (2) of Def. 4.2(ii).
(3a) φ ∈ Φ ⇒ α→β ∈ Φ
     ⇒ ¬α ∈ Φ or β ∈ Φ by clause (3) of Def. 10.2,
     ⇒ α^σ = ⊥ or β^σ = T by ind. hyp.,
     ⇒ φ^σ = T by clause (3) of Def. 4.2(ii).
(3b) ¬φ ∈ Φ ⇒ α ∈ Φ and ¬β ∈ Φ by clause (4) of Def. 10.2,
     ⇒ α^σ = T and β^σ = ⊥ by ind. hyp.,
     ⇒ φ^σ = ⊥ by clause (3) of Def. 4.2(ii). ∎
¹ Note that (a) by itself is sufficient for proving our theorem; and once (a) is established
for all φ then (b) would follow automatically. But if you try to prove (a) on its own,
you will find out that the inductive argument runs into snags.
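The construction used in this proof, a truth valuation σ that makes a prime formula true exactly when it belongs to Φ, can be tried out on a small finite set. The sketch below is illustrative only: the four closure conditions in is_hintikka are read off from the way clauses (1) to (4) of Def. 10.2 are used in the proof above, which is an assumption about their exact wording, and the tuple encoding and function names are likewise hypothetical.

```python
def neg(a):
    return ("not", a)

def imp(a, b):
    return ("imp", a, b)

def is_hintikka(phi):
    """Check, on a finite set, the four conditions as they are used in the proof."""
    for f in phi:
        if isinstance(f, tuple) and f[0] == "not":
            g = f[1]
            if isinstance(g, str):
                if g in phi:                                    # (1) ¬φ ∈ Φ, φ prime ⇒ φ ∉ Φ
                    return False
            elif g[0] == "not":
                if g[1] not in phi:                             # (2) ¬¬α ∈ Φ ⇒ α ∈ Φ
                    return False
            elif g[0] == "imp":
                if not (g[1] in phi and neg(g[2]) in phi):      # (4) ¬(α→β) ∈ Φ ⇒ α, ¬β ∈ Φ
                    return False
        elif isinstance(f, tuple) and f[0] == "imp":
            if not (neg(f[1]) in phi or f[2] in phi):           # (3) α→β ∈ Φ ⇒ ¬α ∈ Φ or β ∈ Φ
                return False
    return True

def value(f, phi):
    """Truth value under σ, where a prime formula is true iff it belongs to phi."""
    if isinstance(f, str):
        return f in phi
    if f[0] == "not":
        return not value(f[1], phi)
    return (not value(f[1], phi)) or value(f[2], phi)

phi = {imp("p", "q"), neg("p"), neg(neg("r")), "r", neg(imp("r", "s")), neg("s")}
```

For this phi the check succeeds and every member receives the value true under σ, as the theorem predicts; the set {"p", neg("p")} fails condition (1).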
12.2. Remarks
12.3. Theorem
If Φ is maximal consistent and Φ ⊢₀ α, then α ∈ Φ.
PROOF
12.4. Theorem
A consistent set Φ is maximal consistent iff for every formula α either
α ∈ Φ or ¬α ∈ Φ.
PROOF
First, assume Φ is maximal consistent. If α ∉ Φ then by the maximality
of Φ it follows that Φ, α ⊢₀. Hence by reductio Φ ⊢₀ ¬α, and by
Thm. 12.3 ¬α ∈ Φ.
Conversely, assume Φ is consistent and satisfies the condition in
question. Let α be any formula that does not belong to Φ; so by the
assumed condition ¬α ∈ Φ. It follows that Φ, α ⊢₀. Thus we see that
by adding to Φ even a single new formula we get an inconsistent set.
Thus (cf. Rem. 12.2(i)) Φ is maximal consistent. ∎
12.5. Theorem
Every maximal consistent set is a Hintikka set.
PROOF
12.6. Theorem
(i) For any truth valuation σ, the set {φ : φ^σ = T} is maximal
consistent.
(ii) Conversely, if Φ is maximal consistent then Φ = {φ : φ^σ = T}
for some truth valuation σ. Moreover, this σ is the unique truth
valuation satisfying Φ.
PROOF
(i) Put Ψ = {φ : φ^σ = T}. Ψ is evidently satisfiable: it is satisfied by σ.
Hence by Thm. 8.4 it must be consistent.
If α is a formula such that α ∉ Ψ then, by the definition of Ψ, it
follows that α^σ = ⊥. Hence (¬α)^σ = T and so ¬α ∈ Ψ. Thus by Thm.
12.4 Ψ is maximal consistent.
12.7. Remark
It is now clear that showing a set of formulas to be satisfiable is
tantamount to showing that it is included in a maximal consistent set.
13.1. Theorem
Every consistent set of formulas is satisfied by a truth valuation.
PROOF
PROOF
13.3. Remarks
(i) If the primitive symbols of ℒ are given by an explicit enumeration:
{pₙ : n ∈ N},
then the proof of Thm. 13.1 can be made more elementary and
constructive. First, it is easy to define explicitly an enumeration
of all ℒ-formulas:
{φₙ : n ∈ N}.
PROOF
1.1. Specification
The primitive symbols of a first-order language ℒ fall into five
mutually exclusive categories:
that is not an individual constant (that is, at least one n-ary function
symbol with positive n), then it must be a language with equality.
The variables, the connectives, the universal quantifier and the
equality symbol (if present) are the logical symbols of ℒ. All other
primitive symbols (namely, the function symbols and the predicate
symbols other than =) are extralogical.
1.2. Warnings
(i) Specification 1.1 must not be read as exhibiting any symbol of the
object language ℒ, which indeed may not have a written form.
Thus, for example, it must not be supposed that ‘v₁’ is a variable
of ℒ. Rather, it is a syntactic constant, belonging to our metalanguage
and denoting the first variable (in alphabetic order) of ℒ.
Also, boldface ‘=’ should not be taken to be the equality symbol of ℒ.
Rather, it is a syntactic constant used to denote the equality
symbol of ℒ, if it has one. (Cf. Warning 7.1.2.)
(ii) Note carefully the distinction between boldface ‘=’ and ordinary
‘=’. Both are symbols in our metalanguage. The former is a name (in the
metalanguage) of the equality symbol of the object language (if it
has one); the latter is the equality symbol of the metalanguage,
an abbreviation of the phrase ‘is the same as’.
The similarity of shape between the two symbols, which may be
confusing at first, is an intended pun and a mnemonic device;
see Rem. 4.3(iii) below.
1.3. Remark
The difference in the logical symbols between two different first-order
languages is clearly inessential, and there would be no real loss of
generality if we were to assume that all first-order languages share the
same logical symbols. (In the case of the equality symbol this would
mean that all first-order languages with equality have the same equality
symbol.) Two first-order languages are essentially different if only one
of them is with equality, or if they have different stocks of extralogical
symbols.
1.4. Definition
An ℒ-string is defined in the same way as in propositional logic (see
Def. 7.1.3), namely as a finite sequence of primitive symbols of ℒ.
1.5. Definition
ℒ-terms are strings constructed according to the following two rules.
In a term ft₁t₂ ... tₙ constructed according to clause (2), the terms t₁,
t₂, ..., tₙ are the first argument, second argument, ..., nth argument,
respectively.
For n = 0, (2) says that a single occurrence of a constant is an
ℒ-term (see Specification 1.1(ii)).
1.6. Definition
The degree of complexity of a term t (briefly, deg t) is the total
number of occurrences of function symbols in t.
We shall often use induction on deg t in order to prove general
statements about all terms t.
1.7. Definition
ℒ-formulas are strings constructed according to the following four
rules.
(1) If P is an n-ary predicate symbol and t₁, t₂, ..., tₙ are ℒ-terms
then the string Pt₁t₂ ... tₙ (obtained by concatenating a single
occurrence of P and t₁, t₂, ..., tₙ, in this order) is an ℒ-formula.
(2) If β is an ℒ-formula then ¬β (the string obtained by concatenating
a single occurrence of ¬ and the string β, in this order) is an
ℒ-formula.
(3) If β and γ are ℒ-formulas then →βγ (the string obtained by
concatenating a single occurrence of →, the string β and the
string γ, in this order) is an ℒ-formula.
(4) If x is a variable and β is an ℒ-formula then ∀xβ (the string
obtained by concatenating a single occurrence of ∀, a single
occurrence of x and the string β, in this order) is an ℒ-formula.
A formula Pt₁t₂ ... tₙ constructed according to (1) is called an atomic
formula; the terms t₁, t₂, ..., tₙ are its first argument, second
argument, ..., nth argument, respectively. In the particular case
where P is the equality symbol = (in which case n must be 2) the
atomic formula is also called an equation and its first and second
arguments are called its left-hand side and right-hand side respectively.
In connection with formulas constructed according to (2) and (3) we
use the same terminology as before (see Def. 7.1.4).
A formula ∀xβ constructed according to (4) is called a universal
formula; here x is the variable of quantification and the string xβ is the
scope of the initial occurrence of the universal quantifier.
1.8. Definition
The degree of complexity of a formula α (briefly, deg α) is the total
number of occurrences of connectives (¬ and →) and the universal
quantifier ∀ in α.
1.9. Definition
An ℒ-expression is an ℒ-term or an ℒ-formula.
1.10. Remark
We use ‘r’, ‘s’ and ‘t’ (sometimes with subscripts) as syntactic variables
ranging over ℒ-terms. Boldface lower-case Greek letters (sometimes
with subscripts) are used as syntactic variables ranging over ℒ-formulas.
These and other notational conventions of this kind should be
self-evident.
2.1. Problem
Assign to each primitive symbol p of ℒ a weight w(p) by stipulating
that if x is a variable then w(x) = −1; if f is an n-ary function symbol
then w(f) = n − 1; if P is an n-ary predicate symbol then w(P) =
n − 1; while w(¬) = 0, w(→) = 1 and w(∀) = 1. If p₁, p₂, ..., pₗ are
primitive symbols, assign to the string p₁p₂ ... pₗ weight
w(p₁) + w(p₂) + ··· + w(pₗ).
Show that for every term t:
(i) w(t) = −1;
(ii) if t is the string p₁p₂ ... pₗ and k < l, then w(p₁p₂ ... pₖ) ≥ 0.
(iii) Show that if t is a term ft₁t₂ ... tₙ formed according to Def.
1.5(2), then for each k = 0, 1, ..., n, ft₁t₂ ... tₖ is the shortest
non-empty initial segment of t whose weight is n − k − 1.
(iv) Show that if α is a formula Pt₁t₂ ... tₙ formed according to Def.
1.7(1), then for each k = 0, 1, ..., n, Pt₁t₂ ... tₖ is the shortest
non-empty initial segment of α whose weight is n − k − 1.
(v) Also show that the results of Prob. 7.1.9 concerning formulas
hold for the present language ℒ. (For (i) and (ii) of Prob. 7.1.9,
four cases now need to be considered, corresponding to the four
clauses of Def. 1.7. In the case where α is atomic, the previous
results of the present problem are invoked.)
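The weight criterion can be tried out on a concrete term. The following sketch assumes a hypothetical stock of function symbols (a binary f, a unary g and a constant c, with every other symbol treated as a variable); the names ARITY, weight, first_term_length and arguments are illustrative and not part of the text. It uses the criterion that a term ends at the shortest non-empty initial segment of weight −1 in order to split a term into its arguments without brackets.

```python
# Hypothetical function symbols with their arities; any other symbol is a variable.
ARITY = {"f": 2, "g": 1, "c": 0}

def weight(symbol):
    """n − 1 for an n-ary function symbol, −1 for a variable."""
    return ARITY[symbol] - 1 if symbol in ARITY else -1

def string_weight(symbols):
    """Weight of a string: the sum of the weights of its symbols."""
    return sum(weight(s) for s in symbols)

def first_term_length(symbols):
    """Length of the shortest non-empty initial segment whose weight is −1."""
    total = 0
    for length, symbol in enumerate(symbols, start=1):
        total += weight(symbol)
        if total == -1:
            return length
    raise ValueError("no initial segment has weight -1")

def arguments(term):
    """Split a term f t1 ... tn, given as a list of symbols, into t1, ..., tn."""
    rest, parts = term[1:], []
    while rest:
        n = first_term_length(rest)
        parts.append(rest[:n])
        rest = rest[n:]
    return parts

term = list("fgxfcy")  # the term f(g(x), f(c, y)) in Polish notation
```

Here string_weight(term) is −1, every proper non-empty initial segment has non-negative weight, and arguments(term) recovers the two arguments gx and fcy.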
Prob. 2.1 shows that the Polish notation decreed for ℒ makes brackets
and other punctuation marks unnecessary in that language.¹ However,
for reasons explained in §2 of Ch. 7, we decree:
2.2. Definition
(i) The same as Def. 7.2.
(ii) (r=s) =df =rs,
¹ The ambiguities that might otherwise arise are illustrated by a piece that appeared in
The Guardian on 10 October 1985, reporting ‘grisly new details of the murder by Lord
Lucan in 1974 of one of his children’s two nannies’. Did the writer intend to say ‘. . . of
[one of (his children’s two nannies)]’ or ‘. . . of [(one of his children)’s two nannies]’?
Did Lord Lucan murder one of the two nannies of his children, or did he commit the
double murder of two nannies of one of his children?
2.3. Definition
(i)—(iii) The same as Def. 7.2.5(i)-(iii).
(iv) ∃xα =df ¬∀x¬α.
2.5. Definition
A prime formula is a formula that is atomic or universal.
2.6. Definition
The set of prime components of a formula α is the smallest set of prime
formulas from which α can be obtained as a propositional combination.
In detail, by induction on deg α:
3.2. Definition
For n ≥ 1, an n-ary operation on a class A is a map from Aⁿ to A.
If f is an n-ary operation on A, and a₁, a₂, ..., aₙ ∈ A, then the
value of f at the n-tuple ⟨a₁, a₂, ..., aₙ⟩ is usually denoted by
‘f(a₁, a₂, ..., aₙ)’ with parentheses instead of corner brackets.
3.3. Remark
From Def. 3.2 and the definitions made in Ch. 2 it is not difficult to see
that f is an n-ary operation on A iff f is an (n + 1)-ary relation on A
such that for any a₁, a₂, ..., aₙ ∈ A there is a unique a ∈ A for which
⟨a₁, a₂, ..., aₙ, a⟩ ∈ f.
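For a finite domain, the correspondence between n-ary operations and (n + 1)-ary relations with the uniqueness property can be checked directly. The sketch below is an illustration with hypothetical names (graph, is_operation), using addition modulo 3 as the example operation.

```python
from itertools import product

def graph(operation, domain, n):
    """The (n+1)-ary relation of an n-ary operation: all tuples (a1, ..., an, a)."""
    return {args + (operation(*args),) for args in product(domain, repeat=n)}

def is_operation(relation, domain, n):
    """True iff `relation` is an (n+1)-ary relation on `domain` in which every
    n-tuple of arguments is continued by exactly one value."""
    domain = set(domain)
    if any(len(t) != n + 1 or not set(t) <= domain for t in relation):
        return False
    return all(
        sum(1 for t in relation if t[:n] == args) == 1
        for args in product(domain, repeat=n)
    )

Z3 = range(3)
add_mod_3 = graph(lambda a, b: (a + b) % 3, Z3, 2)  # a ternary relation on {0, 1, 2}
```

The relation add_mod_3 passes the test; removing one of its triples (so that some pair has no value) or adding a second value for the same pair makes it fail.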
3.4. Definition
A 0-ary operation on a class A is a member of A.
We are now ready to lay down the main definition of this section.
3.5. Definition
A mathematical structure is a composite entity U consisting of the
following ingredients.
3.6. Example
The elementary (or first-order) structure of natural numbers may be
defined as the structure 𝔑 having the following ingredients.
(i) Its domain is the set N = {0, 1, 2, ...} of all natural numbers.
(ii) Its four basic operations are the designated individual 0; the
3.7. Example
A more general notion of structure than that prescribed by Def. 3.5 is
obtained by allowing the domain to be a proper class rather than a set,
and also admitting a basic relation which is a proper class. The most
important example of this liberalized notion is the structure of sets M,
having the following ingredients.
(i) Its domain is the class M of all objects, that is sets and individuals
(if any) of set theory, a.k.a. the universal class.
(ii) No basic operations.
(iii) Its basic relations are the identity relation on M and the relation
∈ of membership between objects and sets.
3.8. Remark
A great many mathematical statements are, or can be construed as,
statements about mathematical structures. The structuralist view of
mathematics holds that mathematics is essentially the study of such
structures.
4.2. Definition
An ℒ-interpretation (or ℒ-structure) is a package U, that is, a composite
entity (or, to be pedantic, an ordered triple), consisting of the
following three components.
4.3. Remarks
4.4. Definition
4.5. Definition
If σ is a valuation and u is an individual in its universe, then σ(x/u) is
the valuation that is based on the same structure as σ and assigns the
same values as σ to all variables other than x, while x^σ(x/u) = u. We say
that σ(x/u) is obtained from σ by revaluing x as u.
Thus f^σ(x/u) = f^σ for every function symbol f; and P^σ(x/u) = P^σ for
every predicate symbol P; and y^σ(x/u) = y^σ for every variable y ≠ x;
while x^σ(x/u) = u.
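Revaluation is easy to model when the universe is finite, and it is exactly what is needed to evaluate a universal formula: ∀xβ is true under σ when β is true under σ(x/u) for every u in the universe. The sketch below is illustrative only; the tuple encoding of formulas, the restriction of atomic formulas to variable arguments, and the names revalue and holds are assumptions made for the example.

```python
def revalue(sigma, x, u):
    """σ(x/u): the same as σ except that the variable x is given the value u."""
    tau = dict(sigma)
    tau[x] = u
    return tau

def holds(formula, universe, relations, sigma):
    """Truth value of a formula under σ over a finite universe.

    Illustrative encoding: ("atom", P, x1, ..., xn) with variables as arguments,
    ("not", b), ("imp", b, c) and ("all", x, b).
    """
    tag = formula[0]
    if tag == "atom":
        return tuple(sigma[x] for x in formula[2:]) in relations[formula[1]]
    if tag == "not":
        return not holds(formula[1], universe, relations, sigma)
    if tag == "imp":
        return (not holds(formula[1], universe, relations, sigma)
                or holds(formula[2], universe, relations, sigma))
    if tag == "all":
        x, body = formula[1], formula[2]
        return all(holds(body, universe, relations, revalue(sigma, x, u))
                   for u in universe)
    raise ValueError(f"unknown formula tag: {tag!r}")

universe = {0, 1, 2}
relations = {"<": {(0, 1), (0, 2), (1, 2)}}
sigma = {"x": 0, "y": 2}
less = ("atom", "<", "x", "y")
```

Under sigma the atomic formula less is true, but ("all", "x", less) is false, because revaluing x as 2 makes less false. Note that revalue returns a new valuation and leaves sigma unchanged.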
(F1) If P is an n-ary predicate symbol and t₁, t₂, ..., tₙ are terms,
then
(Pt₁t₂ ... tₙ)^σ = T if ⟨t₁^σ, t₂^σ, ..., tₙ^σ⟩ ∈ P^σ,
                   ⊥ otherwise.
In particular,
(s=t)^σ = T if s^σ = t^σ,
          ⊥ otherwise.
(F3) (β→γ)^σ = ⊥ if β^σ = T and γ^σ = ⊥,
               T otherwise.
4.7. Remarks
(i) Strictly speaking, what the BSD defines is a pair of new mappings
induced by the given valuation σ and extending it to larger
domains. This is somewhat obscured by the fact that both of
these two new mappings are also denoted by ‘σ’. The first of
these, defined in (T1) and (T2), is a map from the set of all terms
to the universe U of σ. The second map induced by σ, defined in
(F1)-(F4), maps the set of all formulas to the set {T, ⊥} of truth
values.
(ii) Clauses (F2) and (F3) are identical with clauses (2) and (3) of
Def. 7.4.2(ii), and so ensure that [the second mapping induced
by] a valuation assigns truth values to formulas in just the way a
truth valuation is required to do in propositional semantics. Thus,
as far as its effect on formulas is concerned, a valuation may be
regarded as a special case of a truth valuation. Note however that
not every truth valuation can be obtained in this way from a
valuation. For example, if α is any formula and x is any variable,
then by Def. 2.5 the formula ∀x(α→α) is prime; hence there are
truth valuations under which ∀x(α→α) has the truth value ⊥.
But it is easy to see that [∀x(α→α)]^σ = T for any valuation σ. (If
ℒ is a language with equality, then a simpler counter-example is
provided by the equation x=x, where x is any variable: this
formula is prime, but its truth value under any valuation is T.)
(iii) Due to (F4), the BSD has a strongly non-effective character: it
does not, in general, provide us with a method whereby the truth
value α^σ (for given α and σ) might be found in a finite number of
steps. For, if the universe U is infinite and α is a universal
formula, ∀xβ, then by (F4) the truth value α^σ depends on the
infinitely many truth values β^σ(x/u), for all the infinitely many u in
U. Of course, for some particular x, β and σ with infinite
universe U it may be possible to determine, using some theoretical
argument, whether or not β^σ(x/u) = T for all the infinitely
4.8. Problem
Using Def. 2.3(iv), show that
4.9. Definition
(i) If φ is a formula and σ is a valuation such that φ^σ = T, we say
that σ satisfies φ and write ‘σ ⊨ φ’.
(ii) If Φ is a set of formulas and σ is a valuation that satisfies every
member of Φ, we say that σ satisfies Φ and write ‘σ ⊨ Φ’.
4.10. Definition
(i) If the formula α is satisfied by every valuation, we say that α is
logically true (or logically valid) and write ‘⊨ α’.
(ii) If Φ is a set of formulas and α is a formula such that every
valuation that satisfies Φ also satisfies α, we say that α is a logical
consequence of Φ and write ‘Φ ⊨ α’. In this connection we
employ simplified notation similar to that used in connection with
‘⊨₀’. For example, we write ‘Φ, φ ⊨ α’ as short for ‘Φ ∪ {φ} ⊨ α’.
(iii) If Φ is a set of formulas that is satisfied by some valuation, we
say that Φ is satisfiable. If Φ is not satisfied by any valuation we
say that it is unsatisfiable and write ‘Φ ⊨’.
(iv) If α ⊨ β and also β ⊨ α (that is, α^σ = β^σ for every valuation σ) then
we say that α and β are logically equivalent and write ‘α ≡ β’.
4.11. Theorem
If Φ ⊨₀ α then also Φ ⊨ α. In particular, if ⊨₀ α then also ⊨ α; and if
α ≡₀ β then also α ≡ β.
PROOF
Immediate from Rem. 4.7(ii). ∎
4.12. Problem
(i) For any set Φ of formulas and any two formulas α and β, prove
that Φ, α ⊨ β iff Φ ⊨ α→β.
(ii) Prove that {α₁, α₂, ..., αₙ} ⊨ β iff ⊨ α₁→α₂→···→αₙ→β.
(iii) Prove that α ≡ β iff ⊨ α↔β.
4.13. Remark
We say that β is a subformula of α if the formula β, regarded as an
ℒ-string, occurs as a consecutive part of the formula α, where the
latter is also regarded as an ℒ-string. (Note that β can occur in α more
than once; but using Prob. 2.1(v) it is easy to show that two distinct
occurrences of β in α cannot overlap.)
An obvious feature of the BSD is that if α is a non-atomic formula,
then α^σ is determined in terms of the truth values of certain subformulas
of α under σ itself and (if α is a universal formula) under certain
other valuations. Note that it is the truth values of these subformulas
that matter, not the subformulas themselves.
This has the following consequence. Suppose that β′ is a formula
such that β′ ≡ β and let α′ result from α when an occurrence of a
subformula β in α is replaced by β′. Then α′ ≡ α. This rather obvious
result can be proved rigorously by a simple but tedious induction on
deg α.
4.14. Remark
Let us pause to consider the issue raised in §11 of Ch. 7: that of the
ambient metatheory. While the mathematical presuppositions required
for the first three sections of this chapter are rather modest, the
Tarskian semantics presented in this section is quite another matter.
This is mainly due to Def. 4.10, which refers (albeit implicitly) to the
5.2. Definition
We say that valuations σ and τ agree on a variable x (or function
symbol f, or extralogical predicate symbol P) if σ and τ have the same
universe and x^σ = x^τ (or f^σ = f^τ, or P^σ = P^τ, respectively).
158 8. First-order logic
5.3. Remark
We can characterize o(x/u). as the valuation that agrees with o on all
extralogical symbols and all variables other than x, whereas xO) 1,
5.4. Theorem
Let t be a term and let σ and τ be valuations that agree on all function
symbols and all variables occurring in t. Then t^σ = t^τ.
PROOF
Easy, by induction on deg t. DIY or see B&M, p. 54. ∎
5.5. Remark
In particular, if t contains no variables (and is therefore made up
entirely of constants and other function symbols) and the valuations σ
and τ are based on the same ℒ-structure then t^σ = t^τ.
5.6. Definition
A term t is closed if it contains no variables. If t is such a term and 𝔘 is
an ℒ-structure, we put t^𝔘 =_df t^σ, where σ is some valuation based on
𝔘. (By Rem. 5.5 it makes no difference which valuation based on 𝔘 is
chosen.)
5.7. Definition
The occurrences of a variable x in a formula α are classified into two
mutually exclusive kinds, free occurrences of x in α and bound
occurrences of x in α, as follows:
5.8. Theorem
Let α be a formula and let σ and τ be valuations that have the same
universe and agree on all the extralogical symbols and free variables of
α. Then α^σ = α^τ.
PROOF
5.9. Remark
In particular, if α has no free variables (so that all occurrences of
variables in it, if any, are bound) and the valuations σ and τ are based
on the same structure then α^σ = α^τ.
5.10. Definition
A sentence is a formula without free variables. If α is a sentence and 𝔘
is an ℒ-structure, and α is satisfied by some (and hence, cf. Rem. 5.9,
by every) valuation based on 𝔘, then we say that α holds (or is
satisfied) in 𝔘, and that 𝔘 is a model for α, and write ‘𝔘 ⊨ α’.
If 𝔘 ⊨ φ for every member φ of a set Σ of sentences, we say that 𝔘 is
a model for Σ.
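The notion of a sentence can be made concrete by a small computational sketch. The tuple encoding used here is an assumption made purely for illustration and is not notation from the text: a variable is a Python string, a compound term is a tuple (f, t1, ..., tn), an atomic formula is ('atom', P, t1, ..., tn), and the remaining formulas are ('not', a), ('imp', a, b) and ('all', x, a). The names term_vars, free_vars and is_sentence are likewise hypothetical.

```python
def term_vars(t):
    """Set of variables occurring in a term (a variable is a str)."""
    if isinstance(t, str):
        return {t}
    vs = set()
    for arg in t[1:]:                 # t = (f, t1, ..., tn)
        vs |= term_vars(arg)
    return vs

def free_vars(a):
    """Set of variables that have at least one free occurrence in formula a."""
    tag = a[0]
    if tag == 'atom':                 # ('atom', P, t1, ..., tn)
        vs = set()
        for t in a[2:]:
            vs |= term_vars(t)
        return vs
    if tag == 'not':
        return free_vars(a[1])
    if tag == 'imp':
        return free_vars(a[1]) | free_vars(a[2])
    if tag == 'all':                  # ('all', x, body): x is bound in body
        return free_vars(a[2]) - {a[1]}
    raise ValueError(f"unknown formula tag: {tag!r}")

def is_sentence(a):
    """A sentence is a formula without free variables."""
    return not free_vars(a)

eq_xy = ('atom', '=', 'x', 'y')
print(free_vars(('all', 'x', eq_xy)))                    # {'y'}
print(is_sentence(('all', 'x', ('all', 'y', eq_xy))))    # True
```

Note that a variable may have both free and bound occurrences in the same formula; free_vars only records whether some occurrence is free.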
5.11. Problem
Prove: ⊨ ∀x(α→β)→∀xα→∀xβ. (Use Prob. 4.12.)
5.12. Problem
Show that if x is not free in α then ∀xα ≡ α ≡ ∃xα.
5.13. Problem
Assuming that x is not free in β, show that
5.14. Problem
Construct a sentence α containing only logical symbols (that is, no
function symbols and no predicate symbols other than =) such that α
holds in a structure 𝔘 iff the domain U of 𝔘 has
(i) at least three members,
(ii) at most three members,
(iii) exactly three members.
5.15. Problem
§6. Substitution
Substitution is a purely syntactic operation: occurrences of a variable in
a given expression are replaced by [occurrences of] a term. Thus, three
ℒ-entities are involved: first, the expression in which the substitution is
made; second, the variable for which a term is substituted; and third,
the term which is substituted for occurrences of this variable. We start
with the straightforward case where the first mentioned entity, the
expression in which the substitution is made, is itself a term.
We denote by ‘s(x/t)’ (read: ‘s, with x replaced by t’) the result
obtained from the term s when all occurrences of the variable x in s are
simultaneously replaced by occurrences of the term t. In detail, s(x/t)
is defined by induction on deg s.
6.1. Definition
For any variable x and any term t,
(1a) x(x/t) = t;
(1b) y(x/t) = y for any variable y other than x;
(2) if s = fs_1s_2...s_n, where f is an n-ary function symbol and s_1, s_2,
..., s_n are terms, then s(x/t) = fs_1(x/t)s_2(x/t)...s_n(x/t).
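The three clauses of Def. 6.1 translate directly into a recursive function. The following is only a sketch under an assumed encoding (not taken from the text): a variable is a Python string and a term fs_1...s_n is the tuple (f, s_1, ..., s_n); the name subst_term is hypothetical.

```python
def subst_term(s, x, t):
    """Return s(x/t): all occurrences of the variable x in the term s
    are simultaneously replaced by the term t."""
    if isinstance(s, str):                  # s is a variable
        return t if s == x else s           # clauses (1a) and (1b)
    f, args = s[0], s[1:]                   # s = f s_1 ... s_n
    return (f,) + tuple(subst_term(si, x, t) for si in args)   # clause (2)

s = ('+', 'x', ('*', 'x', 'y'))             # the term +x*xy
t = ('S', 'y')                              # the term Sy
print(subst_term(s, 'x', t))
# ('+', ('S', 'y'), ('*', ('S', 'y'), 'y'))
```

Because the recursion descends only into the original term s, the occurrences of variables inside the inserted copies of t are never themselves replaced; this is what ‘simultaneously’ amounts to.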
The most important fact about s(x/t) is its semantic behaviour: the way
its value under a valuation σ depends on s, x, t and σ.
We must not expect the value s(x/t)^σ to be the same as s^σ, because
s(x/t) and s are, in general, two different terms. However, note that in
the former term t occupies the same positions that x occupies in the
latter. Thus we ought to expect the value s(x/t)^σ to be the same as the
value of s not under σ itself, but under the valuation obtained from σ
by revaluing x and assigning to it the value that t has under σ (see Def.
4.5). Thus we ought to have:
(6.2) s(x/t)^σ = s^{σ(x/𝑡)}, where 𝑡 = t^σ.
6.3. Remark
For purely typographical reasons, the printed form of (6.2) is a bit
more complicated than it need be. When writing this formula by hand,
there is no need to use ‘𝑡’ at all, because the ‘𝑡’ in the main part of the
formula can be replaced by ‘t^σ’. The form then taken by (6.2) is shown
here:
s(x/t)^σ = s^{σ(x/t^σ)}.
6.4. Theorem
(6.2) holds for all s, x, t and σ.
PROOF
Case 1a: s is x. Then s(x/t)^σ = x(x/t)^σ = t^σ, by Def. 6.1. On the other
hand, s^{σ(x/𝑡)} = x^{σ(x/𝑡)} = 𝑡, by Def. 4.5. So (6.2) holds in this case.
Case 1b: s is a variable y ≠ x. Then s(x/t)^σ = y(x/t)^σ = y^σ, by Def.
6.1. On the other hand, s^{σ(x/𝑡)} = y^{σ(x/𝑡)} = y^σ, by Def. 4.5. So (6.2)
holds also in this case.
6.5. Remark
Thm. 6.4 does not tell us anything unexpected about the semantic
effect of substitution; on the contrary, the result is what we anticipated.
The point of the theorem is that it confirms that Def. 6.1 was
correct, in the sense of ensuring the desired effect.
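The identity asserted by Thm. 6.4 can be checked numerically on a single instance. The sketch below is an illustration and not a proof; it assumes terms encoded as nested tuples (a variable is a string) and represents a valuation as a dictionary with entries 'vars' and 'funcs'. The names value and revalue, the sample function symbols and the sample valuation are all hypothetical.

```python
def subst_term(s, x, t):
    """s(x/t) for terms encoded as nested tuples; variables are strings."""
    if isinstance(s, str):
        return t if s == x else s
    return (s[0],) + tuple(subst_term(si, x, t) for si in s[1:])

def value(s, sigma):
    """Value of the term s under the valuation sigma."""
    if isinstance(s, str):
        return sigma['vars'][s]
    f, args = s[0], s[1:]
    return sigma['funcs'][f](*(value(a, sigma) for a in args))

def revalue(sigma, x, u):
    """The valuation sigma(x/u): like sigma, except that x gets the value u."""
    return {'vars': {**sigma['vars'], x: u}, 'funcs': sigma['funcs']}

sigma = {
    'vars': {'x': 2, 'y': 5},
    'funcs': {'+': lambda a, b: a + b,
              '*': lambda a, b: a * b,
              'S': lambda a: a + 1},
}
s = ('+', 'x', ('*', 'x', 'y'))
t = ('S', 'y')

lhs = value(subst_term(s, 'x', t), sigma)               # s(x/t) evaluated under sigma
rhs = value(s, revalue(sigma, 'x', value(t, sigma)))    # s under sigma(x / value of t)
print(lhs, rhs)   # 36 36
```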
Let us turn to the case where a term t is to be substituted for a
variable x in a formula α. For reasons that should now be clear, we
must define the substitution in such a way that
(6.6) α(x/t)^σ = α^{σ(x/𝑡)}, where 𝑡 = t^σ.
6.7. Definition
If no free occurrence of x in α is within the scope of a y-quantifier,
where y is a variable that occurs in t, then we say that t is free [to be
substituted] for x in α; and in this case we define α(x/t) as the result
obtained from α when all free occurrences of x in α are simultaneously
replaced by t. [For a more detailed version of this definition, proceeding
by induction on deg α, see B&M, p. 59f.]
It is now fairly easy to show that (6.6) holds in the special case where
α(x/t) has so far been defined.
6.8. Theorem
(6.6) holds whenever the term t is free for the variable x in the formula α.
PROOF
6.9. Remark
There are two special cases where t is free for x in α. First, where t
contains no variable other than x. Def. 6.7 therefore applies in this
case. In particular, in the case where t is x itself, it is easy to see that
α(x/x) is just α, as it ought to be. The second special case is where t
contains no variable that occurs bound in α: in this case α does not
contain any y-quantifier where y occurs in t.
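The condition of Def. 6.7 is easy to test mechanically. The sketch below assumes the following encoding, which is not from the text: a variable is a string, a compound term is a tuple (f, t1, ..., tn), and formulas are ('atom', P, t1, ..., tn), ('not', a), ('imp', a, b) or ('all', x, a). The function name is_free_for is hypothetical.

```python
def term_vars(t):
    """Set of variables occurring in a term (a variable is a str)."""
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(arg) for arg in t[1:]))

def is_free_for(t, x, a):
    """True iff no free occurrence of x in a lies within the scope of a
    y-quantifier, where y is a variable occurring in t."""
    tvars = term_vars(t)

    def ok(b, in_bad_scope):
        tag = b[0]
        if tag == 'atom':
            x_occurs = any(x in term_vars(s) for s in b[2:])
            return not (x_occurs and in_bad_scope)
        if tag == 'not':
            return ok(b[1], in_bad_scope)
        if tag == 'imp':
            return ok(b[1], in_bad_scope) and ok(b[2], in_bad_scope)
        if tag == 'all':
            y, body = b[1], b[2]
            if y == x:
                return True       # every occurrence of x below here is bound
            return ok(body, in_bad_scope or y in tvars)
        raise ValueError(f"unknown formula tag: {tag!r}")

    return ok(a, False)

alpha = ('all', 'y', ('atom', '=', 'x', 'y'))     # the formula ∀y(x=y)
print(is_free_for('y', 'x', alpha))                # False
print(is_free_for('z', 'x', alpha))                # True
print(is_free_for(('S', 'x'), 'x', alpha))         # True
```

The last call illustrates the first special case of Rem. 6.9: a term containing no variable other than x is always free for x.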
In order to define α(x/t) in the remaining case (where t is not free for
x in α) we must first modify the offending parts of α and make them
harmless. The trouble is caused by free occurrences of x in α that fall
within subformulas of α having the form ∀yβ, where y is a variable
that occurs in t. In order to make the substitution work, so that (6.6) is
ensured, such subformulas of α must first be replaced by logically
equivalent ones that use a harmless variable, say z, instead of y. This
motivates the following
6.10. Definition
If z is a variable that does not occur free in β but is free for y in β, we
say that the formula ∀z[β(y/z)] arises from the formula ∀yβ by
[correct] alphabetic change [of variable of quantification].
6.11. Remarks
(i) The reason for requiring that z be free for y in β is that
otherwise the substitution β(y/z) is not defined as yet. The reason
for requiring that z has no free occurrences in β is that otherwise
the formulas ∀z[β(y/z)] and ∀yβ may not be logically equivalent.
For example, let β be y=z, where y and z are distinct variables.
It is easy to see that ∀z(z=z) and ∀y(y=z) are not logically
equivalent: the former is logically true, whereas the latter is
satisfied by a valuation σ iff the universe of σ is a singleton.
(ii) If z does not occur at all in β, then z clearly fulfils the conditions
in Def. 6.10.
(iii) It is not difficult to show that the operation of alphabetic change
is reversible; in other words, if ∀z[β(y/z)] arises from ∀yβ by an
alphabetic change, then the latter formula can be retrieved from
the former by an alphabetic change (see B&M, p. 61).
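The counterexample in Rem. 6.11(i) can be confirmed by brute force over small finite universes. The sketch below is a minimal evaluator for a language whose only atomic formulas are equations between variables; the tuple encoding and the name holds are assumptions made for illustration, and the quantifier clause simply tries every individual of the universe as the new value of the bound variable.

```python
def holds(a, universe, val):
    """Truth value of formula a under the valuation with the given finite
    universe and variable assignment val (a dict from variables to individuals)."""
    tag = a[0]
    if tag == 'atom':                 # ('atom', '=', x, y) with x, y variables
        _, _, x, y = a
        return val[x] == val[y]
    if tag == 'not':
        return not holds(a[1], universe, val)
    if tag == 'imp':
        return (not holds(a[1], universe, val)) or holds(a[2], universe, val)
    if tag == 'all':                  # ('all', x, body): revalue x in every way
        return all(holds(a[2], universe, {**val, a[1]: u}) for u in universe)
    raise ValueError(f"unknown formula tag: {tag!r}")

wrong_change = ('all', 'z', ('atom', '=', 'z', 'z'))   # ∀z(z=z)
original = ('all', 'y', ('atom', '=', 'y', 'z'))       # ∀y(y=z)

for universe in ([0], [0, 1], [0, 1, 2]):
    val = {'y': universe[0], 'z': universe[0]}
    print(len(universe),
          holds(wrong_change, universe, val),
          holds(original, universe, val))
# 1 True True
# 2 True False
# 3 True False
```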
6.12. Theorem
If ∀z[β(y/z)] arises from ∀yβ by alphabetic change then these two
formulas are logically equivalent.
PROOF
DIY or see B&M, p. 61. ∎
6.13. Definition
(i) We say that a formula γ is obtained from a formula α by an
alphabetic step if α has a subformula of the form ∀yβ and γ
results from α when one occurrence of ∀yβ is replaced by a
formula ∀z[β(y/z)] that arises from it by alphabetic change.
(ii) We say that α′ is a variant of α, and write ‘α ~ α′’, if α′ can be
obtained from α by a finite number of alphabetic steps.
6.14. Remarks
(i) The relation ~ is easily seen to be an equivalence relation. It is
reflexive: α ~ α always holds because α is obtained from itself by
0 alphabetic steps. It is symmetric: if α ~ α′ then also α′ ~ α
because alphabetic changes, and hence also alphabetic steps, are
reversible. Finally, it is clearly transitive: if α ~ α′ and α′ ~ α″
then also α ~ α″.
(ii) By Thm. 6.12 and Rem. 4.13, if α ~ α′ then α ≡ α′.
6.15. Definition
Let a variable x and a term t be given. For any formula α, we select a
formula α′ such that if t is free for x in α, then α′ is α itself; but if t is
not free for x in α, then α′ is a variant of α in which t is free for x.
Thus α′(x/t) is already defined in Def. 6.7. We now define α(x/t) to be
the same as α′(x/t). [For details see B&M, p. 63. If t is not free for x in
α, then it does not really matter which variant of α is selected to be α′,
so long as t is free for x in α′. But a definition must be unambiguous,
so a particular variant α′ must be selected. This is done by induction
on deg α. The gist of the choice is that each offending subformula ∀yβ
of α is replaced by ∀z[β(y/z)], where z is the first variable in the
alphabetic list of ℒ-variables – that is, the v_i with the least i – such that
this is a correct alphabetic change and such that z does not occur in t.]
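The idea behind Def. 6.15 can be sketched as a recursive substitution that performs an alphabetic change whenever it meets an offending subformula. This is a simplified illustration and not the selection rule of B&M: the new variable z is taken to be the first of v0, v1, ... that does not occur at all in β (cf. Rem. 6.11(ii)) and does not occur in t. The tuple encoding of terms and formulas and all function names are assumptions.

```python
from itertools import count

def term_vars(t):
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(arg) for arg in t[1:]))

def subst_term(s, x, t):
    if isinstance(s, str):
        return t if s == x else s
    return (s[0],) + tuple(subst_term(si, x, t) for si in s[1:])

def formula_vars(a, free_only):
    """Free variables of a (free_only=True) or all variables occurring in a."""
    tag = a[0]
    if tag == 'atom':
        return set().union(*(term_vars(s) for s in a[2:]))
    if tag == 'not':
        return formula_vars(a[1], free_only)
    if tag == 'imp':
        return formula_vars(a[1], free_only) | formula_vars(a[2], free_only)
    if tag == 'all':
        inner = formula_vars(a[2], free_only)
        return inner - {a[1]} if free_only else inner | {a[1]}
    raise ValueError(f"unknown formula tag: {tag!r}")

def fresh(avoid):
    """First variable in the list v0, v1, ... that is not in `avoid`."""
    for i in count():
        if f'v{i}' not in avoid:
            return f'v{i}'

def subst(a, x, t):
    """a(x/t), renaming bound variables where t is not free for x."""
    tag = a[0]
    if tag == 'atom':
        return a[:2] + tuple(subst_term(s, x, t) for s in a[2:])
    if tag == 'not':
        return ('not', subst(a[1], x, t))
    if tag == 'imp':
        return ('imp', subst(a[1], x, t), subst(a[2], x, t))
    if tag == 'all':
        y, body = a[1], a[2]
        if y == x or x not in formula_vars(body, True):
            return a                      # x has no free occurrence here
        if y in term_vars(t):             # offending subformula: change y to z
            z = fresh(formula_vars(body, False) | term_vars(t))
            body, y = subst(body, y, z), z
        return ('all', y, subst(body, x, t))
    raise ValueError(f"unknown formula tag: {tag!r}")

alpha = ('all', 'y', ('not', ('atom', '=', 'x', 'y')))   # ∀y¬(x=y)
print(subst(alpha, 'x', 'y'))
# ('all', 'v0', ('not', ('atom', '=', 'y', 'v0')))
```

Replacing x by y naively would have produced ∀y¬(y=y), whose meaning no longer depends on the value of y; the alphabetic change prevents this capture.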
6.16. Problem
Show that (6.6) holds for all α, x, t and σ.
§7. Hintikka sets
7.1. Definition
A [first-order] Hintikka set [in ℒ] is a set Φ of ℒ-formulas satisfying
the following nine conditions:
7.2. Remarks
(i) Conditions (8) and (9) of the definition are vacuous if ℒ is a
language without equality. The reason for excluding the case
n = 0 in (8) is that for n = 0 this condition would have reduced to
requiring that if c is any individual constant of ℒ then c=c ∈ Φ,
which is already covered by condition (7).
(ii) Condition (9) applies in particular to the case where n = 2 and P
is = itself. In this special case the condition says that if s_1, s_2, t_1
and t_2 are any four terms such that the three equations s_1=t_1,
s_2=t_2 and s_1=s_2 are in Φ, then the equation t_1=t_2 is also in Φ.
Fig. 2 can be used as a mnemonic for this statement. The four
terms are represented by the four corners of the square; the three
equations assumed to belong to Φ are represented by the three
solid sides, reading from top to bottom and from left to right; and
the fourth equation, which is then required to belong to Φ, is
represented by the dotted side, again reading from left to right.
For the rest of this section, we let Φ be a fixed but arbitrary Hintikka
set. We shall refer to the nine conditions of Def. 7.1 simply as ‘(1)’,
‘(2)’ and so on.
Our aim is to prove that Φ is satisfiable. We shall define a particular
valuation σ and show that σ ⊨ Φ. In order to define σ, we must specify
its various ingredients: first, we must specify its universe U; next, for
each variable x we shall have to specify its value x^σ, which must of
course be a member of U; then, for each function symbol f we must
specify the corresponding operation f^σ on U; finally, for each
extralogical predicate symbol P we have to specify the corresponding
relation P^σ on U. (As for the logical predicate symbol =, if it is
present in ℒ, we have no choice: =^σ has to be the diagonal relation on
U.)
[Fig. 2]
Of all the ingredients of σ, the first – the universe U – turns out to
require most work. Once U has been properly set up, the rest will
follow quite smoothly. The nature of the members of U (that is, what
‘stuff’ they are made of) is clearly of no importance; what is vital is that
for each term t there should be a member of U to serve as the value t^σ.
In general, the universe of a valuation may have members that do not
serve as the value of any term under that valuation; but in the present
case Occam’s razor turns out to be useful. So we shall define an object
[t] for each term t and, even before deciding what [t] is to be, we put
7.3. Definition
U =_df {[t] : t is an ℒ-term}.
Our plan is to define σ in such a way that t^σ = [t] for every term t. As
we have said, the nature of [t] is unimportant; but we must decide
whether distinct terms are to have distinct values; in other words, if s
and t are distinct, should [s] and [t] also be distinct? The simplest
choice is to answer this question in the affirmative. The good news is
that if ℒ is without equality then this simplest choice actually works.
The bad news is that it does not work if ℒ has equality. The snag is
that in this case Φ may contain equations s=t, where s and t are
distinct terms. If σ is to satisfy Φ, it must in particular satisfy these
equations, which (by the BSD F1) means that s^σ and t^σ must be the
same. As we intend these values to be [s] and [t] respectively, we are
forced to allow [s] and [t] to be equal whenever s=t ∈ Φ, even though
s and t may be distinct. This motivates the following definition of the
relation E between terms:
7.4. Definition
The relation E holds between two terms s and t (briefly, sEt) if
either ℒ is without equality and s is the same as t, or ℒ has equality
and the equation s=t is in Φ.
7.5. Lemma
E is an equivalence relation: it is reflexive, symmetric and transitive.
PROOF
The case where ℒ is without equality is trivial. Now suppose that ℒ
does have equality.
7.6. Definition
For each term t, we define [t] as the E-class of t (see Def. 2.3.4).
Thus,
7.7. Remarks
(i) If ℒ is without equality, then [t] is simply {t}, so that if s and t
are distinct terms then [s] and [t] are also distinct. If ℒ does have
equality, then [t] is a class of terms that may have several
(indeed even infinitely many) members.
(ii) Recall that by Thm. 2.3.5, [s] = [t] iff sEt. Also, by Cor. 2.3.6,
each term belongs to a unique E-class.
(iii) The class of all ℒ-strings is a set by Thm. 6.3.9. Hence by AS the
class T of all terms is also a set. For each t, [t] is a subset of T
and so, by Def. 7.3, U ⊆ PT. Thus U is a set by AP and AS.
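For a finite collection of terms the passage from terms to E-classes can be imitated computationally. The sketch below is a toy illustration under stated assumptions: terms are represented by strings, a finite list of pairs stands in for the equations s=t belonging to Φ, and the classes are those of the least equivalence relation containing these pairs. (In a genuine Hintikka set the relation E is already an equivalence relation by Lemma 7.5, so no closure step is needed; it is taken here only so that a short list of equations suffices as input.) The name e_classes is hypothetical.

```python
def e_classes(terms, equations):
    """Map each term to its class under the least equivalence relation
    on `terms` that contains every pair in `equations`."""
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]    # path halving
            t = parent[t]
        return t

    for s, t in equations:
        parent[find(s)] = find(t)            # merge the two classes

    members = {}
    for t in terms:
        members.setdefault(find(t), set()).add(t)
    return {t: frozenset(members[find(t)]) for t in terms}

terms = ['x', 'y', 'z', 'c']
cls = e_classes(terms, [('x', 'y'), ('y', 'c')])
universe = set(cls.values())                 # the analogue of U = {[t] : t a term}
print(cls['x'] == cls['c'], cls['z'])
# True frozenset({'z'})
```

Each class is a set of terms, so the universe obtained is a set of sets of terms, in line with the observation that U ⊆ PT.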
Our intention was to have t^σ = [t] for every term t. For the particular
case where t is a variable we are free to decree this as part of the
specification of σ.
7.8. Definition
We put x^σ = [x] for each variable x.
Next, for each n-ary function symbol f we must define the n-ary
operation on U that is to serve as f^σ. To define f^σ, we must specify, for
each n-tuple of members of U, the member of U produced by
applying f^σ to that n-tuple. Take n arbitrary members of U; by Def.
7.3 they are of the form [t_1], [t_2], ..., [t_n], where t_1, t_2, ..., t_n are
terms. We must specify a member of U as f^σ([t_1], [t_2], ..., [t_n]). This
individual (again by Def. 7.3) must have the form [t] where t is some
term. How shall we choose this t? Clearly, t must involve f and t_1, t_2,
..., t_n. So an obvious choice is
7.9. Definition
If f is any n-ary function symbol and t_1, t_2, ..., t_n are any terms,
f^σ([t_1], [t_2], ..., [t_n]) =_df [ft_1t_2...t_n].
7.10. Legitimation
If n > 0 – in which case, as stipulated in Sp. 1.1, ℒ must have equality
– then this definition needs to be legitimized. The point is that one and
the same member of U may be represented in more than one way: r
and s may be distinct terms such that the object [r] is the same as [s].
However, the definiendum f^σ([t_1], [t_2], ..., [t_n]) must depend only on
the objects [t_1], [t_2], ..., [t_n] and not on the particular terms t_1, t_2,
..., t_n that happen to represent them. So we have to prove that the
definiens [ft_1t_2...t_n] depends only on the objects [t_1], [t_2], ..., [t_n]
rather than on the particular terms t_1, t_2, ..., t_n used to represent
them. We must therefore show that if [s_i] = [t_i] for i = 1, 2, ..., n,
then also
[fs_1s_2...s_n] = [ft_1t_2...t_n].
This is easily done. Indeed, if [s_i] = [t_i] for i = 1, 2, ..., n, then by
Rem. 7.7(ii) for each i the equation s_i=t_i is in Φ. So by (8) the equation
fs_1s_2...s_n=ft_1t_2...t_n is also in Φ and [fs_1s_2...s_n] = [ft_1t_2...t_n].
7.11. Lemma
t^σ = [t] for every term t.
PROOF
We proceed by induction on deg t. The case where t is a variable is
covered by Def. 7.8. Now let t be ft_1t_2...t_n. Then
7.12. Definition
If P is any n-ary extralogical predicate symbol, then P^σ is defined to be
the subset of U^n such that for any n terms t_1, t_2, ..., t_n,
([t_1], [t_2], ..., [t_n]) ∈ P^σ ⇔ Pt_1t_2...t_n ∈ Φ.
7.13. Legitimation
This definition too needs legitimation. We must make sure that
whether or not ([t_1], [t_2], ..., [t_n]) ∈ P^σ holds depends on the objects
[t_1], [t_2], ..., [t_n] rather than on the terms that happen to represent
them. In other words, it must be proved that if [s_i] = [t_i] for i = 1, 2,
..., n, then
Ps_1s_2...s_n ∈ Φ ⇔ Pt_1t_2...t_n ∈ Φ.
7.14. Remark
As mentioned before, if ℒ has equality we have no choice as to the
relation =^σ; we must put, for all terms s and t,
7.15. Theorem
For any formula φ,
φ ∈ Φ ⇒ φ^σ = ⊤,
¬φ ∈ Φ ⇒ φ^σ = ⊥.
PROOF
We shall prove this double claim simultaneously by induction on deg φ.
We distinguish four cases, corresponding to the clauses of Def. 1.7.
(1a) φ ∈ Φ ⇒ Pt_1t_2...t_n ∈ Φ
⇒ ([t_1], [t_2], ..., [t_n]) ∈ P^σ   by Def. 7.12 and Rem. 7.14,
⇒ (t_1^σ, t_2^σ, ..., t_n^σ) ∈ P^σ   by Lemma 7.11,
⇒ (Pt_1t_2...t_n)^σ = ⊤   by BSD F1,
⇒ φ^σ = ⊤.
(1b) ¬φ ∈ Φ ⇒ φ ∉ Φ   by (1),
⇒ Pt_1t_2...t_n ∉ Φ
(4a) φ ∈ Φ ⇒ ∀xα ∈ Φ
⇒ α(x/t) ∈ Φ for every term t   by (5),
⇒ α(x/t)^σ = ⊤ for every term t   by ind. hyp.,
⇒ α^{σ(x/𝑡)} = ⊤ (where 𝑡 = t^σ) for every term t   by Prob. 6.16,
⇒ α^{σ(x/[t])} = ⊤ for every term t   by Lemma 7.11,
⇒ α^{σ(x/u)} = ⊤ for every u ∈ U   by Def. 7.3,
⇒ (∀xα)^σ = ⊤   by BSD F4,
⇒ φ^σ = ⊤.
(4b) ¬φ ∈ Φ ⇒ ¬∀xα ∈ Φ
⇒ ¬α(x/t) ∈ Φ for some term t   by (6),
⇒ α(x/t)^σ = ⊥ for some term t   by ind. hyp.,
⇒ α^{σ(x/𝑡)} = ⊥ (where 𝑡 = t^σ) for some term t   by Prob. 6.16,
⇒ α^{σ(x/[t])} = ⊥ for some term t   by Lemma 7.11,
⇒ α^{σ(x/u)} = ⊥ for some u ∈ U   by Def. 7.3,
⇒ (∀xα)^σ = ⊥   by BSD F4,
⇒ φ^σ = ⊥.
∎
We have thus shown that the valuation σ (specified by Defs. 7.3, 7.6,
7.8, 7.9 and 7.12) satisfies the Hintikka set Φ. We shall now obtain
an upper bound for the cardinality of the universe of σ.
7.16. Definition
The cardinality of the set of all primitive symbols of ℒ is called the
cardinality of ℒ and denoted by ‘‖ℒ‖’.
7.17. Theorem
Given a Hintikka set Φ in ℒ, we can define an ℒ-valuation σ such that
the cardinality of the universe of σ is at most ‖ℒ‖ and such that σ ⊨ Φ.
PROOF
Take σ as the valuation specified above. By AC, there exists a choice
function on the universe U of σ: a function that selects a single term in
each E-class of terms. Since by Rem. 7.7(ii) distinct E-classes are
disjoint, the choice function is an injection from U to the set of all
ℒ-strings, whose cardinality, by Thm. 6.3.9, is exactly ‖ℒ‖. ∎
§8. Prenex formulas; parity
8.2. Problem
(i) Let φ be a formula containing n + 1 occurrences of ∀. Show how
to find a formula of the form Qxψ (where Q is ∀ or ∃ and ψ
contains only n occurrences of ∀) which is logically equivalent to
φ. (Proceed by [strong] induction on deg φ. In the case where φ is
α→β, we may assume, by the induction hypothesis, that φ is
logically equivalent to a formula of the form Qxγ→β or α→Qyδ,
and by alphabetic change we can arrange that x is not free in β
and y is not free in α. Then use Prob. 5.13(v)–(viii).)
(ii) Hence show how to obtain a prenex normal form for any given
formula.
8.3. Definition
By induction on deg α, we assign to each formula α a parity pr α,
which is either 0 or 1, as follows:
8.4. Problem
(i) Show that the set of all even formulas is a Hintikka set, and
hence is satisfiable.
(ii) Without using (i), define directly a valuation σ such that σ ⊨ α iff
α is even. (Take the universe of σ to be a singleton.)
§9. First-order predicate calculus
for any n ≥ 1, any 2n terms s_1, s_2, ..., s_n, t_1, t_2, ..., t_n and any
n-ary function symbol f.
for any n ≥ 1, any 2n terms s_1, s_2, ..., s_n, t_1, t_2, ..., t_n and any
n-ary predicate symbol P.
9.9. Remarks
(i) Six of the eight groups of axioms are given by means of schemes;
but the first and last groups are miscellanies. We shall refer to
these eight groups of axioms briefly as ‘Ax. 1’, ‘Ax. 2’ and so on.
9.10. Definition
(i) The [classical] first-order predicate calculus [in ℒ] (briefly,
Fopcal) is the linear calculus based on the first-order axioms
listed above, and on modus ponens as sole rule of inference.
(ii) First-order deduction is defined in the same way as propositional
deduction (Def. 7.6.8), except that ‘propositional axiom’ is
replaced by ‘first-order axiom’.
(iii) We use ‘⊢’ to denote first-order deducibility (that is, deducibility
in Fopcal) in the same way as ‘⊢_0’ was used to denote
propositional deducibility.
(iv) All terminological and notational definitions and conventions laid
down in §§6-8 and §12 of Ch. 7 in connection with ⊢_0 and
Propcal are hereby adopted, mutatis mutandis, in connection with
⊢ and Fopcal.
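Since a deduction in a linear calculus is just a finite sequence of formulas, checking one is a mechanical matter. The following sketch shows the shape of such a check. It is an illustration only: formulas are encoded as tuples with implications written ('imp', a, b), and the predicate is_axiom is a hypothetical parameter supplied by the caller, because the first-order axioms themselves are not reproduced here.

```python
def is_deduction(lines, hypotheses, is_axiom):
    """Check that `lines` is a deduction from `hypotheses`: each formula is
    an axiom, a hypothesis, or follows from two earlier formulas of the
    sequence by modus ponens."""
    for k, phi in enumerate(lines):
        earlier = lines[:k]
        # modus ponens: some earlier psi together with an earlier psi -> phi
        by_mp = any(('imp', psi, phi) in earlier for psi in earlier)
        if not (is_axiom(phi) or phi in hypotheses or by_mp):
            return False
    return True

P, Q = ('atom', 'P'), ('atom', 'Q')
hyps = {P, ('imp', P, Q)}
no_axioms = lambda phi: False        # placeholder: treat nothing as an axiom

print(is_deduction([P, ('imp', P, Q), Q], hyps, no_axioms))   # True
print(is_deduction([Q], hyps, no_axioms))                      # False
```

The check is purely syntactic and inspects only finitely many finite objects, which is one way of seeing why proof-theoretic notions are elementary compared with semantic ones.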
9.11. Theorem
The Cut Rule, the Deduction Theorem, the Inconsistency Effect, reductio
ad absurdum and the Principle of Indirect Proof hold for Fopcal. ∎
9.12. Remark
In B&M a similar system of axioms is used, but Ax. 4 is subject to the
proviso that t be free for x in α. The two versions of Fopcal are
equivalent; the B&M version is more economical, whereas the present
one is a bit more user-friendly.
9.13. Warning
Versions of the classical Fopcal found in the literature fall into two
groups. One group consists of strong versions that are equivalent to
ours. The other group consists of weak versions that are equivalent to
each other, but not to ours. To describe the relationship between the
two groups, let us denote by ‘+”” the relation of deducibility in a weak
version of Fopcal. The following four facts must be noted.
PROOF
9.15. Theorem
If Φ ⊢_0 α then also Φ ⊢ α. In particular, if ⊢_0 α then also ⊢ α. ∎
9.16. Problem
Prove that ⊢ α(x/t)→∃xα.
9.17. Problem
Prove that ⊢ ∃x(t=x), provided x does not occur in t. Point out where
you use the assumption about x and t.
10.2. Remarks
(i) For brevity we shall refer to this rule as ‘UI’.
(ii) Clearly, UI holds for any linear calculus with modus ponens as a
rule of inference and all formulas of the form ∀xα→α(x/t) as
theorems.
(iii) The only purpose of adopting Ax. 4 was to enable us to establish
UI. Now that we have done so, Ax. 4 need not be invoked again.
Indeed, it is easy to see that any calculus for which UI and DT
hold has all formulas of the form ∀xα→α(x/t) as theorems.
(iv) Closely related to UI is the Rule of Existential Generalization
(briefly, EG): If Φ ⊢ α(x/t) for some term t, then Φ ⊢ ∃xα. This
rule follows at once from Prob. 9.16.
10.3. Definition
A variable is said to be free in a set (or a sequence) of formulas, if that
variable is free in some formula belonging to the set (or the sequence).
10.4. Theorem
Given a deduction D of a formula α from a set Φ of hypotheses, if x is
a variable that is not free in Φ then we can construct a deduction D′ of
∀xα from Φ such that x is not free in D′ and every variable free in D′
is free in D as well.
PROOF
take D; to be
φ_k, (hyp.)
φ_k→∀xφ_k, (Ax. 3)
∀xφ_k. (m.p.)
∀xφ_i, (hyp.)
∀x(φ_i→φ_k), (hyp.)
∀x(φ_i→φ_k)→∀xφ_i→∀xφ_k, (Ax. 2)
∀xφ_i→∀xφ_k, (m.p.)
∀xφ_k. (m.p.)
10.6. Remarks
(i) We shall refer to this rule briefly as ‘UGV’.
(ii) The only purpose of adopting Ax. 2, Ax. 3 and Ax. 8 was to
enable us to prove Thm. 10.4. Now that this has been done these
axioms need not be invoked again.
(iii) It is obvious that if ⊢* is the relation of deducibility in any
calculus for which UGV holds, then from ⊢* α it follows that also
⊢* ∀xα for any variable x (cf. Ax. 8). If in addition DT also holds
for ⊢*, then ⊢* α→∀xα for any formula α and any variable x that
is not free in α (cf. Ax. 3). See also Prob. 10.7 below.
(iv) Thm. 10.4 can be strengthened: it is enough to require that x is
not free in any formula of Φ used as a hypothesis in the given
deduction (although it may be free in formulas of Φ that are not
so used). To see this, let Φ_D be the set of those members of Φ
that are used in the given deduction D, and apply the theorem to
Φ_D. Similarly, in Cor. 10.5 it is enough to require that x is not
10.7. Problem
Let ⊢* be the relation of deducibility in a calculus with modus ponens
as a rule of inference and for which Cut, DT, UI and UGV hold. Show
that ⊢* ∀x(α→β)→∀xα→∀xβ for any formulas α and β and any
variable x.
10.8. Definition
For any formula α and variable x, we put
∃!xα =_df ∃y∀x(α↔x=y),
where y is the first variable in alphabetic order that differs from x and
is not free in α.
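The intended meaning of ∃!xα, namely that exactly one individual satisfies α, can be compared with the defining formula by brute force on a small finite universe. In the sketch below α is represented semantically, as a Python predicate on the individual assigned to x; this representation and both function names are assumptions made for illustration. The fact that the predicate does not depend on y reflects the requirement that y is not free in α. Such a check is a sanity test on examples, not a proof.

```python
def exists_unique_by_definition(universe, alpha):
    """Truth value of ∃y∀x(α ↔ x=y), with α given as a predicate on the
    value of x."""
    return any(all(alpha(x) == (x == y) for x in universe) for y in universe)

def exactly_one(universe, alpha):
    """True iff exactly one individual of the universe satisfies α."""
    return sum(1 for u in universe if alpha(u)) == 1

U = range(5)
for alpha in (lambda u: u == 3, lambda u: u > 2, lambda u: False):
    print(exists_unique_by_definition(U, alpha), exactly_one(U, alpha))
# True True
# False False
# False False
```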
10.9. Problem
(i) Verify that σ ⊨ ∃!xα iff σ(x/u) ⊨ α for exactly one individual u in
the universe U of σ.
(ii) Prove that ⊢ ∃!x(t=x), provided x does not occur in t.
PROOF
and does not occur at all (either free or bound) in the deduction D. Let
D′ be the sequence φ_1′, φ_2′, ..., φ_n′ of formulas obtained from D
upon replacing c everywhere by y. We claim that D′ is a deduction of
α(x/y) from Φ.
Indeed, for any k (where 1 ≤ k ≤ n) three cases are possible. First,
φ_k may be an axiom. In this case it is easy to verify that φ_k′ is also an
axiom. Second, φ_k may be a hypothesis, a member of Φ. In this case
φ_k′ is φ_k itself, because c does not occur in Φ. Finally, φ_k may have
been obtained by modus ponens from two earlier formulas in D, φ_i
and φ_j. In this case it is obvious that φ_k′ is obtained by modus ponens
from φ_i′ and φ_j′. Thus D′ is a deduction of α(x/c)′ from Φ.
We still have to show that α(x/c)′ is in fact α(x/y). To see this, recall
that c does not occur in α. Thus the occurrences of c in α(x/c) are just
those that replace the free occurrences of x in α; there are no other
occurrences of c in α(x/c). Now, α(x/c)′ was obtained from α(x/c)
upon replacing these occurrences of c by the new variable y. Thus
α(x/c)′ can be obtained directly from α upon replacing all free
occurrences of x in α by y. But α(x/y) is obtained from α in precisely the
same way, because y is a new variable, not occurring in α, so that the
substitution of y for x in α does not involve any alphabetic changes.
We have now established that D′ is indeed a deduction of α(x/y)
from Φ. Moreover, note that y does not occur in those members of Φ
that are used as hypotheses in D′: the only occurrences of y in D′ are
those that have replaced occurrences of c, but c does not occur in Φ.
Therefore by UGV we have Φ ⊢ ∀y[α(x/y)].
By UI we have ∀y[α(x/y)] ⊢ α(x/y)(y/x). But it is easy to see that
α(x/y)(y/x) is in fact α itself; hence we have got ∀y[α(x/y)] ⊢ α. Now,
x is clearly not free in ∀y[α(x/y)], so we can use UGV again and
obtain ∀y[α(x/y)] ⊢ ∀xα.
By Cut we finally have Φ ⊢ ∀xα, as required. ∎
10.11. Remark
We shall refer to this rule briefly as ‘UGC’.
§11. Consistency
We have already noted (Thm. 9.11) that IE, reductio and PIP hold
for Fopcal. The other results of §8 of Ch. 7 also have counterparts in
Fopcal. In particular, the following two results are proved similarly to
Thm. 7.8.4 and Cor. 7.8.5.
11.1. Theorem
If Φ ⊢ α then Φ ⊨ α. ∎
11.3. Remark
This proof of the consistency of Fopcal uses semantic notions which,
generally speaking, require a relatively powerful set-theoretic ambient
theory (see Rem. 4.14). On the other hand, since deductions are finite
objects, proof-theoretic notions such as deducibility and consistency
are quite elementary. It is therefore natural to ask whether the
consistency of Fopcal can be proved in an elementary way, without
appealing to semantics. Such a proof is outlined in the following
problem.
11.4. Problem
(i) Show that if Φ ⊢ α and Φ is a set of even formulas (see Def. 8.3)
then α is even as well. (Verify that all the axioms of Fopcal are
even formulas and that modus ponens yields an even conclusion
from even premisses.)
(ii) Hence prove the consistency of Fopcal.
11.5. Lemma
If Φ is consistent and ¬∀xα ∈ Φ, and c is a constant that does not
occur in Φ, then Φ ∪ {¬α(x/c)} is also consistent.
PROOF
11.6. Problem
Prove the Rule of Existential Instantiation with a Constant (EIC): If
Φ is consistent and ∃xα ∈ Φ, and c is a constant that does not occur in
Φ, then Φ ∪ {α(x/c)} is also consistent.
11.7. Lemma
Let Φ be consistent; for each i = 1, 2, ..., k, let ¬∀x_iα_i ∈ Φ, and
let c_i be distinct constants that do not occur in Φ.
Then Φ ∪ {¬α_i(x_i/c_i) : i = 1, 2, ..., k} is also consistent.
PROOF
11.8. Lemma
Let Φ be consistent; let Φ′ be obtained from Φ by adding, for every
formula of the form ¬∀xα in Φ, a ‘witnessing’ formula ¬α(x/c),
where c does not occur in Φ and where distinct constants c are used for
distinct formulas of the form ¬∀xα. Then Φ′ is consistent as well.
PROOF
It is enough to prove that every finite subset of Φ′ is consistent. (Cf.
Prob. 7.8.3(i): a similar result clearly holds for Fopcal.) However, a
finite subset of Φ′ contains only a finite number of the new witnessing
formulas, and is therefore included in a set of the form Φ ∪
{¬α_i(x_i/c_i) : i = 1, 2, ..., k}, which is consistent by Lemma 11.7. ∎
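For a finite set Φ the passage from Φ to Φ′ in Lemma 11.8 can be sketched as follows. The tuple encoding of formulas, the function names and the constant names c0, c1, ... are assumptions made for illustration; in particular the caller is assumed to guarantee that these constant names do not already occur in Φ. A constant c is encoded as the one-element tuple (c,). Since a constant contains no variables, it is free for x in any formula, so plain replacement of the free occurrences of x suffices.

```python
from itertools import count

def subst_term(s, x, t):
    if isinstance(s, str):                    # a variable
        return t if s == x else s
    return (s[0],) + tuple(subst_term(si, x, t) for si in s[1:])

def subst_free(a, x, t):
    """Replace the free occurrences of x in a by t. Adequate here because t
    is a constant, so no variable of t can be captured by a quantifier."""
    tag = a[0]
    if tag == 'atom':
        return a[:2] + tuple(subst_term(s, x, t) for s in a[2:])
    if tag == 'not':
        return ('not', subst_free(a[1], x, t))
    if tag == 'imp':
        return ('imp', subst_free(a[1], x, t), subst_free(a[2], x, t))
    if tag == 'all':
        return a if a[1] == x else ('all', a[1], subst_free(a[2], x, t))
    raise ValueError(f"unknown formula tag: {tag!r}")

def add_witnesses(phi_set, prefix='c'):
    """Return phi_set together with a witnessing formula ¬α(x/c) for each
    member of the form ¬∀xα, using a distinct new constant each time."""
    new_names = (f'{prefix}{i}' for i in count())
    result = set(phi_set)
    for phi in sorted(phi_set, key=repr):     # fixed order, for reproducibility
        if phi[0] == 'not' and phi[1][0] == 'all':
            x, alpha = phi[1][1], phi[1][2]
            c = (next(new_names),)            # a constant is a 0-ary term
            result.add(('not', subst_free(alpha, x, c)))
    return result

phi = {('not', ('all', 'x', ('atom', 'P', 'x')))}     # the set {¬∀xPx}
print(('not', ('atom', 'P', ('c0',))) in add_witnesses(phi))   # True
```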
11.9. Theorem
Let Φ be a set of ℒ-formulas that is consistent within ℒ. Let ℒ* be
obtained from ℒ by adding a set C of new individual constants. Then Φ
is consistent within ℒ* as well.
PROOF
12.1. Theorem
If Φ is a maximal consistent set and Φ ⊢ α, then α ∈ Φ. ∎
12.2. Theorem
A consistent set Φ is maximal consistent iff for every formula α either
α ∈ Φ or ¬α ∈ Φ. ∎
12.3. Remark
From Thm. 12.2 it follows that if ℒ is extended to a richer language
ℒ*, by adding new extralogical symbols (for example, new constants)
then a set Φ of ℒ-formulas that is maximal consistent in ℒ will no
longer be so in ℒ*. Indeed, if α is an ℒ*-formula containing a new
symbol (one that does not belong to ℒ) then α is not an ℒ-formula, so
neither α nor ¬α can belong to Φ. Of course, by Thm. 11.9 Φ is still
consistent in ℒ*.
12.4. Theorem
For any valuation σ, the set {φ : φ^σ = ⊤} is maximal consistent. ∎
12.5. Theorem
If Φ is maximal consistent, it fulfils conditions (1)–(5) and (7)–(9) of
Def. 7.1.
PROOF
Conditions (1)–(4) are verified as in the proof of Thm. 7.12.5.
Conditions (5) and (7)–(9) are verified by invoking UI and Ax. 5–Ax. 7
respectively and using Thm. 12.1. ∎
12.6. Problem
Let ℒ be a first-order language with equality but without any
extralogical symbols. Let σ be the ℒ-valuation whose universe is
U = {u, v}, where u and v are distinct, and such that x^σ = u for every
variable x. Let Φ = {φ : φ^σ = ⊤}, so that by Thm. 12.4 Φ is maximal
consistent. Let α be the formula x=y, where x and y are distinct
variables.
Show that ¬∀xα ∈ Φ but there is no ℒ-term t such that
¬α(x/t) ∈ Φ. (Note that the only terms of ℒ are the variables.)
§ 13. Completeness
13.1. Preview
As in propositional logic, the [strong] completeness of Fopcal will
follow immediately once we show that any given consistent set Φ of
ℒ-formulas is satisfiable. Also, exactly as in propositional logic, it is
easy to see that the set of all consistent sets of ℒ-formulas is of finite
character (cf. proof of Thm. 7.13.1); hence, by the Tukey–Teichmüller
Lemma (Thm. 5.2.8), any consistent Φ is included in some Ψ that is
maximal consistent within ℒ. However, since a maximal consistent set
may not be a Hintikka set, we have no direct way of showing that Ψ is
satisfiable.
It is clear from Thm. 12.5 and Prob. 12.6 that the only reason that
may prevent Ψ from being a Hintikka set is the absence in it of
witnessing formulas. To overcome this obstacle, we use Lemma 11.8,
and add to Ψ enough witnessing formulas, using constants as witnesses.
However, in order to make sure that these witness constants do
not occur in Ψ (as Lemma 11.8 requires) we extend ℒ to a richer
language ℒ_1 by adding an adequate supply of new constants. By Thm.
11.9 Ψ is still consistent in ℒ_1, so we may apply Lemma 11.8 there. Let
Φ_1 be the set so obtained. Unfortunately, in ℒ_1 Ψ is no longer
maximal consistent (see Rem. 12.3), nor does the addition of new
witnessing formulas produce a maximal consistent set: all we can say
about Φ_1 is that it is consistent. It seems as though we are back where
we started.
Not despairing, we extend Φ_1 to a maximal consistent set Ψ_1 within
ℒ_1. Then we extend ℒ_1 to a richer language ℒ_2 by adding yet more
new constants, and get Φ_2 from Ψ_1 in the same way as we got Φ_1
from Ψ.
The good news is that by iterating this procedure ad infinitum we
obtain in the limit a set that is not only maximal consistent but also a
Hintikka set, and includes our original set Φ.
Throughout this section we shall be working within set theory (that
is, assume it as an ambient metatheory). In particular, as explained in
Rem. 6.1.8, we shall identify the natural numbers with the finite
ordinals (a.k.a. finite cardinals).
13.2. Definition
A set Φ of ℒ-formulas is a Henkin set in ℒ if Φ is maximal consistent
in ℒ and, for any formula α and variable x, if ¬∀xα ∈ Φ then
¬α(x/t) ∈ Φ for some term t.
13.3. Remark
From Thm. 12.5 and Def. 7.1 it follows at once that a Henkin set in ℒ
is also a Hintikka set in ℒ. Hence by Thm. 7.17 such a set is satisfied
by some valuation whose universe has cardinality not greater than ‖ℒ‖.
From now until the end of the proof of Thm. 13.8 we let Φ be a
fixed but arbitrary consistent set of ℒ-formulas.
By [weak] induction on n we define for each natural number n a
first-order language ℒ_n, a set Φ_n of ℒ_n-formulas, and a set Ψ_n of
ℒ_n-formulas that is maximal consistent in ℒ_n.
13.4. Definition
Basis. We put ℒ_0 = ℒ and Φ_0 = Φ. As Ψ_0 we choose some set of
formulas that is maximal consistent in ℒ_0 and includes Φ_0. (The
existence of such Ψ_0 is ensured by the Tukey–Teichmüller Lemma.)
13.5. Remark
From Def. 13.4 it is evident that the Φ_n and Ψ_n form a chain of sets:
13.6. Definition
We define ℒ_ω as the union of all the languages ℒ_n; and Ψ_ω as the
union of all the sets Ψ_n, for n = 0, 1, 2, ....
Thus ℒ_ω is obtained from ℒ by adding to the latter the union of all
the sets C_n, for n = 0, 1, 2, ...; and an ℒ_ω-formula α belongs to Ψ_ω
iff it belongs to Ψ_n for some n.
13.7. Remark
From Rem. 13.5 it follows that an ℒ_ω-formula α belongs to Ψ_ω iff
there is some n such that α ∈ Ψ_k for all k ≥ n.
13.8. Theorem
Ψ_ω is a Henkin set in ℒ_ω.
PROOF
13.9. Theorem
If Φ is a consistent set of ℒ-formulas then Φ is satisfied by some
ℒ-valuation whose universe has cardinality not greater than ‖ℒ‖.
PROOF
We have specified in Defs. 13.4 and 13.6 how to extend the language
ℒ to a language ℒ_ω by adding new constants, and how to define a set
Ψ_ω of ℒ_ω-formulas such that Φ ⊆ Ψ_ω; and we have shown in Thm.
13.8 that Ψ_ω is a Henkin set in ℒ_ω.
By Rem. 13.3, Ψ_ω (and hence also its subset Φ) is satisfied by
some ℒ_ω-valuation, say σ_ω, as obtained in §7, whose universe has
cardinality not greater than ‖ℒ_ω‖.
Let σ be the ℒ-valuation that agrees with σ_ω on all the variables, as
well as on all the extralogical symbols of ℒ. (The only difference
between σ_ω and σ is that the former assigns interpretations to the new
constants, which are not in ℒ, while σ ignores them.) Then clearly σ is
an ℒ-valuation that satisfies Φ.
The universe of σ is the same as that of σ_ω; so we shall complete the
proof by showing that ‖ℒ_ω‖ = ‖ℒ‖. For brevity, we put λ = ‖ℒ‖. Of
course, λ is an infinite cardinal, because the set of variables is infinite;
in fact, its cardinality is ℵ_0.
The set of all ℒ-formulas is included in the set of all ℒ-strings, hence
by Thm. 6.3.9 the cardinality of the former set is ≤ λ. (In fact, it is
quite easy to show that its cardinality is exactly λ, but we shall not need
this.) Recall that ℒ_0 is ℒ itself; so by Def. 13.4 C_0 is equipollent to the
set of ℒ-formulas, hence |C_0| ≤ λ. By Def. 13.4 and Thm. 6.3.6 we
have ‖ℒ_1‖ = λ. The same argument shows, by induction on n, that
‖ℒ_n‖ = λ and |C_n| ≤ λ for all n.
It now follows that |⋃{C_n : n < ω}| ≤ ℵ_0·λ, which by Thm. 6.3.5 is
exactly λ. Using Thm. 6.3.6 as before, we see that ‖ℒ_ω‖ = λ. ∎
PROOF
13.11. Remarks
PFac@Pla.
PFs@®@}.
PROOF
Similar to that of Thm. 7.13.4. |
PROOF
§1. Preliminaries
1.1. Preview
In this chapter we put formal languages on one side and present some
concepts and results from recursion theory that will be needed in the
sequel.
Recursion theory was created in the 1930s by logicians (Alonzo
Church, Kurt Gödel, Stephen Kleene, Emil Post, Alan Turing and
others) mainly for the sake of its applications to logic. But the theory
itself belongs to the abstract part of computing science. It is concerned
with computability — roughly speaking, the property of being mechan-
ically computable in principle (ignoring practical limitations of time
and memory storage space).
Our exposition will be neither rigorous nor self-contained. For some
of the key concepts, we shall provide intuitive explanations rather than
precise definitions. Instead of proving all theorems rigorously, we shall
in most cases present intuitive arguments. One major result, the
MRDP Theorem, will be stated without proof.
For a rigorous coverage of all this material, see Ch. 6 of B&M.
Alternative presentations of recursion theory can be found in books
wholly devoted to this subject, as well as in books that combine it with
logic. A classic of the first kind is
1.2. Conventions
1.3. Definition
We shall write, more briefly, P𝔵 ⇔ ∃yQ(𝔵, y), and say that P is
obtained from Q by existential quantification.
1.4. Warning
Take care not to confuse ‘∃’, ‘∀’, etc. with their bold-face counterparts,
‘∃’, ‘∀’, etc. The former denote operations on relations; the
latter denote symbols in a formal language (which we are not studying
in this chapter). The typographical similarity between the two sets of
symbols is an intended pun and a mnemonic device, as will become
clearer in the next chapter.
§2. Computers
We shall define the central concepts of recursion theory in terms of the
notion of computer. The computers we have in mind are like real-life
programmable digital computers, but idealized in one crucial respect
(see Assumption 2.6 below). To help clarify this notion, we state in
informal intuitive terms the most essential assumptions we will make
about computers and the way they operate.
2.1. Assumption
A computer is a digital calculating machine: its states differ from each
other in a discrete manner. (This rules out analogue calculating devices
such as the slide-rule, whose states [are supposed to] vary continu-
ously.)
2.2. Assumption
A computer is a deterministic mechanism: it operates by rigidly and
deterministically following instructions stored in it in advance. (This
rules out resort to chance or random devices.)
2.3. Assumption
A computer operates in a serial discrete step-wise manner.
2.4. Assumption
A computer has a memory capable of storing finitely many [represen-
tations of] natural numbers — which may be part of the input or the
output or an intermediate stage of a computation — and instructions.
(Without loss of generality, we may assume that instructions are coded
by natural numbers, as is in fact the case in present-day programmable
computers; so the content of the memory is always a finite sequence of
numbers.)
2.5. Assumption
A computer operates according to a program, a finite list of instruc-
tions, stored in it in advance (see Assumptions 2.2 and 2.4). Each
instruction requires the computer to execute a simple step such as to
erase a number stored in a specified location in the memory, or
increase by 1 the number stored in a specified location, or print out as
output the number stored in a specified location, or simply to stop.
After each step, the next instruction to be obeyed is determined by the
content of the memory (including the program itself).
2.6. Assumption
The computer’s memory has an unlimited storage capacity: it is able to
store an arbitrarily long finite sequence of natural numbers, each of
which can be arbitrarily large. (Thus, although the amount of informa-
tion stored in the memory is always finite, we assume that this amount
has no upper bound.)
2.7. Remarks
(i) Assumptions 2.1-2.5 are perfectly realistic: they are in fact
satisfied by many existing machines, from giant super-computers
down to modest programmable pocket calculators. Assumption
2.6, in contrast, is a far-reaching idealization: a real-life machine
can only store a limited amount of information. While the storage
capacity of many real machines can be enhanced by adding on
peripheral devices such as magnetic tapes or disks, this cannot be
done without limit.
(ii) In connection with Assumption 2.5 it is interesting to note that
the repertory of commands that a computer is able to obey (that
§3. Recursiveness
3.1. Definition
Let P be an n-ary relation. By a decide-P machine we mean a
computer with an input port and an output port, which is programmed
so that if any n-tuple 𝔵 ∈ N^n is fed into the input port then after a
finite number of steps the computer prints out an output (say 1 for yes
and 0 for no) indicating whether P𝔵 holds or not.
A relation P is recursive (or computable) if a decide-P machine can
be constructed (that is, if a computer can be programmed to act as a
decide-P machine).
3.2. Remarks
(iii) Any relation you are likely to think of, off-hand, is certain to be
recursive — unless you are already familiar with some of the tricks
of recursion theory or are exceptionally ingenious. (We shall
meet examples of non-recursive relations in the next chapter.)
(iv) Nevertheless, set-theoretically speaking, the overwhelming ma-
jority of relations are non-recursive. (Here is an outline of a
proof. Working within ZF set theory, we identify N with the set
of finite cardinals. Using Thm. 6.3.7 and Cantor’s Thm. 3.6.8, it
is easy to show that for each n ≥ 1 the set of all n-ary relations
has cardinality > ℵ_0. On the other hand, a computer program is a
finite string of instructions, each of which is a finite string of
symbols in some programming language with a countable set of
primitive symbols. Hence by Thm. 6.3.9 the set of all programs is
countable. It follows that the set of all recursive relations must
also be countable.)
3.3. Definition
Let P be an n-ary relation. By an enumerate-P machine we mean a
computer with an output port and programmed so that it prints out,
one by one, all the n-tuples 𝔵 ∈ N^n for which P𝔵 holds, and no others.
A relation P is said to be recursively enumerable — briefly, r.e. — if
an enumerate-P machine can be constructed (that is, if a computer can be
programmed to act as an enumerate-P machine).
3.4. Remarks
(i) If P is infinite (that is, holds for infinitely many n-tuples) then an
enumerate-P machine, once switched on, will never stop unless it
is switched off. We impose no bound on the number of computa-
tion steps the machine may make between printing out two
successive n-tuples; we only require it to be finite.
(ii) An r.e. relation is sometimes said to be semi-recursive. The
reason for this will soon become clear.
3.5. Lemma
The n-ary relation N” (the set of all n-tuples of natural numbers) is r.e.
PROOF
(0,0),
(0,1), (1,0), (1,1),
(0,2), (1,2), (2,0), (2,1), (2,2),
(0,3), (1,3), (2,3), (3,0), (3,1), (3,2), (3,3),
. . .
Clearly, this procedure can be mechanized: a computer can be
programmed to spew out all n-tuples of natural numbers in this order. ∎
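The order of listing displayed in this proof (for n = 2) can be sketched as a short Python generator. This is an informal illustration only, not part of the book's development: the name enumerate_tuples is ours, we assume n ≥ 1, and we assume the intended order is "by largest component first, then alphabetically", which is what the displayed rows suggest.

```python
from itertools import count, islice, product

def enumerate_tuples(n):
    """Yield every n-tuple of natural numbers exactly once, never stopping (n >= 1)."""
    for m in count(0):                       # stage m: tuples whose largest component is m
        for t in product(range(m + 1), repeat=n):
            if max(t) == m:                  # skip tuples already listed at an earlier stage
                yield t

# The generator never terminates on its own, so we only look at a finite prefix.
first_nine = list(islice(enumerate_tuples(2), 9))
```

Like the machine of the lemma, the generator runs for ever unless it is cut off; each tuple nevertheless turns up after finitely many steps.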
3.6. Theorem
Let P be an n-ary relation. Then P is recursive iff both P and ¬P
are r.e.
PROOF
3.7. Remarks
(i) Note that in the second half of this proof we needed both
enumerating machines. If we only had an enumerate-P machine,
and we tried to use it for testing whether P𝔞 holds, then if the
answer happened to be negative we would never find that out.
(ii) By Thm. 3.6, every recursive relation is r.e. We shall see in the
next chapter that the converse of this is false.
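The way two enumerating machines combine into a deciding machine can be sketched in Python. This is only an illustration under our own assumptions: enumerators are modelled as functions returning generators, and the names decide, evens and odds are hypothetical. The toy relation "is even" stands in for P.

```python
from itertools import count

_DONE = object()   # sentinel returned once a (finite) enumeration is exhausted

def decide(a, enum_p, enum_not_p):
    """Decide whether P holds of a, given enumerators for P and for its negation."""
    gen_p, gen_not_p = enum_p(), enum_not_p()
    while True:
        # Take one item from each list in turn; a must turn up in exactly one of them.
        if next(gen_p, _DONE) == a:
            return True
        if next(gen_not_p, _DONE) == a:
            return False

def evens():
    return (2 * k for k in count())          # enumerates P

def odds():
    return (2 * k + 1 for k in count())      # enumerates the negation of P

answer_for_10 = decide(10, evens, odds)
answer_for_7 = decide(7, evens, odds)
```

If the second enumerator were left out, the loop would run for ever on any input for which P fails, which is exactly the point of Rem. 3.7(i).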
3.8. Theorem
If P is obtained from Q by existential quantification and Q is r.e., then
P is r.e. as well.
PROOF
3.9. Definition
Let f be an n-ary function. By a compute-f machine we mean a
computer with an input port and an output port, and programmed so
that if any 𝔵 ∈ N^n is fed into the input port, then after a finite number
of steps the computer prints out as output the value f𝔵.
We say that f is a recursive (or computable) function if a compute-f
machine can be constructed.
3.10. Theorem
For any function f, the following three conditions are equivalent:
PROOF
3.11. Remarks
§ 4. Closure results
4.1. Theorem
The class of recursive relations is closed under all propositional opera-
tions.
PROOF
Let P and Q be n-ary recursive relations. Thus we can construct a
decide-P machine D_P and a decide-Q machine D_Q. Then D_P can be
turned into a decide-¬P machine, simply by reversing its outputs.
Therefore ¬P is recursive.
To construct a decide-(P ∨ Q) machine, let D_P and D_Q operate
alongside each other. Given any n-tuple 𝔞 ∈ N^n, a copy of it is fed into
each of these two machines. Their two outputs are channelled into a
collating unit. This unit checks the two outputs, and if at least one of
them is ‘yes’ it gives out a final output ‘yes’; but if both D_P and D_Q say
‘no’, then the collating unit gives out a final output ‘no’. We have now
got a decide-(P ∨ Q) machine, showing that P ∨ Q is recursive. The
other Boolean operations can be reduced to negation and disjunction.
∎
4.2. Remark
According to Assumption 2.3, a computer is supposed to operate in
a serial manner. This seems to be violated by the decide-(P ∨ Q)
machine just described, which has D_P and D_Q as two components
working in parallel. The apparent difficulty can be resolved by assuming
that the two components operate alternately, as in bipedal walking:
each one pausing while the other performs a step.
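The alternating mode of operation just described can be sketched in Python. We stress that this is our own illustration, resting on assumptions not made in the text: a deciding machine is modelled as a generator that yields once per computation step and returns its verdict at the end, and all names (decide_even, decide_small, negate, disjoin, run) are hypothetical.

```python
def decide_even(a):
    """Toy decide-P machine: subtract 2 repeatedly, one computation step per yield."""
    while a >= 2:
        a -= 2
        yield
    return a == 0

def decide_small(a):
    """Toy decide-Q machine: one step, then report whether a < 5."""
    yield
    return a < 5

def negate(machine):
    """Turn a decide-P machine into a decide-(not P) machine by reversing its output."""
    def negated(a):
        verdict = yield from machine(a)
        return not verdict
    return negated

def disjoin(machine_p, machine_q):
    """Build a decide-(P or Q) machine whose two components step alternately."""
    def combined(a):
        running = [machine_p(a), machine_q(a)]
        verdicts = [None, None]
        while None in verdicts:
            for i, component in enumerate(running):
                if verdicts[i] is None:
                    try:
                        next(component)              # one step of this component only
                    except StopIteration as stop:
                        verdicts[i] = stop.value     # the component has halted
                    yield                            # still a single serial step overall
        return verdicts[0] or verdicts[1]            # the collating unit
    return combined

def run(machine, a):
    """Drive a machine to completion and return its final output."""
    steps = machine(a)
    while True:
        try:
            next(steps)
        except StopIteration as stop:
            return stop.value

even_or_small = disjoin(decide_even, decide_small)
```

At no moment do two components move at once, so the combined machine is still serial in the sense of Assumption 2.3.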
4.3. Theorem
The class of r.e. relations is closed under disjunction, conjunction and
existential quantification.
PROOF
Next, we show that the class of r.e. relations is closed under the
operation of adding a redundant variable.
4.4. Theorem
Let P be an n-ary relation. Let Q be the (n + 1)-ary relation such that,
for all 𝔵 ∈ N^n and all y ∈ N,
Q(𝔵, y) ⇔ P𝔵.
PROOF
present in the P buffer; if it is, (𝔞, b) goes on the final list; if not, it
goes to the N^(n+1) buffer. ∎
4.5. Remarks
(i) Results similar to Thm. 4.4 hold also for the class of recursive
relations and the class of recursive functions; but they are too
obvious to be stated as theorems.
(ii) Using these facts, we can deal with disjunctions and conjunctions
of r.e. or recursive relations that are not of the same n-arity. For
example, if P and Q are binary, we can form a quaternary
relation R by stipulating that for all w, x, y and z,
For the final theorem of this section, we let f_1, f_2, ..., f_k be n-ary
functions. Let g be a k-ary function and let the function h be obtained
by composing g with f_1, f_2, ..., f_k; in other words, for all 𝔵 ∈ N^n,
h𝔵 = g(f_1𝔵, f_2𝔵, ..., f_k𝔵).
4.6. Theorem
Let f_1, f_2, ..., f_k be recursive functions.
PROOF
(i) By hypothesis, we can construct machines F_1, F_2, ..., F_k that
compute f_1, f_2, ..., f_k respectively; also, we can construct a compute-g
machine, G. To compute h, we proceed as follows.
Given any n-tuple 𝔞 ∈ N^n, copies of it are fed into the input ports of
F_1, F_2, ..., F_k. When these k machines have produced their outputs,
b_1, b_2, ..., b_k, they are put together as a k-tuple (b_1, b_2, ..., b_k),
which is fed into the input port of G. The output produced by the latter
is the required value h𝔞.
This procedure can be mechanized, yielding a compute-h machine.
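The mechanization of part (i) is simple enough to sketch in Python. This is an informal illustration: ordinary Python functions stand in for the computing machines, and the names compose, add, mul and power are our own.

```python
def compose(g, *fs):
    """Return h with h(x) = g(f_1(x), ..., f_k(x)), where x is an n-tuple of arguments."""
    def h(*x):
        outputs = tuple(f(*x) for f in fs)   # a copy of x is fed to each f_i
        return g(*outputs)                   # the k outputs are fed together into g
    return h

def add(x, y):
    return x + y

def mul(x, y):
    return x * y

def power(b1, b2):
    return b1 ** b2

# h(x, y) = (x + y) ** (x * y): here n = 2 and k = 2.
h = compose(power, add, mul)
value = h(1, 2)
```

Note that the sketch only covers total functions, in keeping with Def. 3.9, where the machine is required to halt on every input.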
The proof of (ii) is similar. To prove (iii), we note that
5.2. Definition
(i) An n-ary function f is a monomial if for some natural number a
(called the coefficient) and natural numbers k_1, k_2, ..., k_n
(called the exponents) the equality
f𝔵 = a · x_1^(k_1) · x_2^(k_2) · ... · x_n^(k_n)
5.3. Definition
(i) An n-ary relation P is elementary if there are n-ary polynomials
f and g such that, for all 𝔵 ∈ N^n,
5.5. Remarks
(i) The ⇐ part of the theorem is simple to prove. First, let P be an
n-ary elementary relation, and let f and g be polynomials
satisfying the condition of Def. 5.3(i). For any given 𝔵 ∈ N^n we
can calculate the values fx and gx — this involves a finite number
of additions and multiplications of natural numbers. Then the two
values can be compared to see whether Px holds or not. This
procedure can clearly be mechanized, yielding a decide-P
machine. Thus every elementary relation is recursive, and hence
r.e. by Thm. 3.6. Now, by Def. 5.3(ii), any diophantine relation
is obtainable from an elementary relation by a finite number of
existential quantifications; so it is r.e. by Thm. 3.8.
(ii) The ⇒ part of the MRDP Thm. is far harder to prove. The
original proof (including Robinson’s early results and her joint
work with Davis and Putnam) is reproduced in B&M, pp.
284-311. A shorter and more direct version of the proof is
presented in pp. 111-123 of Cohen’s book cited in § 1.
(iii) The proof of the MRDP Thm. is effective: it provides us with a
method whereby from a given description (program) of an
§1. Preliminaries
1.1. Preview
The main results in this chapter reveal the inherent limitations of
formalism and the formalist approach to mathematics. For the sake of
simplicity we confine ourselves to a very basic part of mathematics:
elementary arithmetic (a.k.a. elementary number theory), whose sub-
ject-matter is the elementary structure of natural numbers (see Ex.
8.3.6). However, these results can be generalized without much dif-
ficulty to richer and more elaborate mathematical contexts.
1.2. Convention
We shall often write ‘number’ as short for ‘natural number’. Unless
stated otherwise, we shall follow the notation and terminology of Ch. 9
(see Conv. 9.1.2). Also, we use ‘k’, ‘m’, ‘n’ and ‘p’ as informal
variables ranging over numbers.
1.3. Specification
From now on, unless stated otherwise, our formal object language ℒ
will be the first-order language of arithmetic; namely, the first-order
language with equality =, whose extralogical symbols are:
1.4. Remarks
(ii) Since ‘s’ is now used as a syntactic constant denoting the unary
function-symbol of ℒ, we cannot use it any longer as a syntactic
variable ranging over ℒ-terms. For this purpose we shall use ‘q’,
‘r’ and ‘t’, with or without subscripts.
(iii) The terms of ℒ evidently fall into the following five mutually
exclusive categories:
(1) Terms of the form x, consisting of a single occurrence of a
variable;
(2) The single term 0;
(3) Terms of the form st, where t is any term;
(4) Terms of the form +rt, where r and t are any terms;
(5) Terms of the form ×rt, where r and t are any terms.
Terms of the last three categories will be referred to as ‘s-terms’,
‘+-terms’ and ‘×-terms’ respectively.
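The five categories can be recognized mechanically from the first symbol of a term, as the following Python sketch suggests. It is an illustration under our own assumptions: a term is given as a list of symbols in prefix (Polish) order, the ASCII character '*' stands in for the multiplication symbol, any other string counts as a variable, and the names parse and category are hypothetical.

```python
def parse(tokens, pos=0):
    """Read one term starting at tokens[pos]; return (syntax tree, position after it)."""
    head = tokens[pos]
    if head == "s":                                  # s followed by one term
        arg, pos = parse(tokens, pos + 1)
        return ("s", arg), pos
    if head in ("+", "*"):                           # binary symbol followed by two terms
        left, pos = parse(tokens, pos + 1)
        right, pos = parse(tokens, pos)
        return (head, left, right), pos
    return head, pos + 1                             # '0' or a variable

def category(tokens):
    """Name the category (1)-(5) to which the term belongs."""
    tree, end = parse(tokens)
    if end != len(tokens):
        raise ValueError("symbols left over after a complete term")
    if isinstance(tree, tuple):
        return tree[0] + "-term"
    return "zero" if tree == "0" else "variable"

example = category(["+", "s", "0", "x"])             # the term +s0x, that is (s0 + x)
```

Since the leading symbol settles the category, the five categories are indeed mutually exclusive.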
1.5. Definition
In addition to Def. 8.2.2, which remains in force here — and for similar
reasons — we put, for any terms r and t:
(i) (r+t) =df +rt,
(ii) (r×t) =df ×rt.
In using this metalinguistic notation, brackets are required. To prevent
proliferation of brackets, which would impair legibility, we omit brackets
subject to three simple conventions. First, the Greek cross ‘+’ is
deemed to separate more strongly than the St Andrew cross ‘×’.
Second, of any two occurrences of ‘+’ (or of ‘×’) enclosed within the
same pairs of brackets, the one further to the left is deemed to
separate more strongly. Third, we do not omit any pair of brackets
whose left member comes immediately after an occurrence of ‘s’;
hence, when restoring brackets, no new left bracket should be placed
immediately after an ‘s’. For example,
s0+ss0×s0×sss0+0 = s0+ss0×(s0×sss0)+0
= s0+[ss0×(s0×sss0)]+0
= s0+{[ss0×(s0×sss0)]+0} = {s0+{[ss0×(s0×sss0)]+0}}.
1.6. Definition
Proceeding by induction, we define, for each natural number k, an
ℒ-term s_k, called the k-th ℒ-numeral:
s_0 = 0, s_(k+1) = ss_k.
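The inductive definition of the numerals translates directly into a recursive Python function. This is a small illustration only; the name numeral is ours, and terms are represented simply as strings of the symbols 's' and '0'.

```python
def numeral(k):
    """Return the k-th numeral as a string: k occurrences of 's' followed by '0'."""
    if k == 0:
        return "0"                   # basis: the 0-th numeral is the term 0
    return "s" + numeral(k - 1)      # step: prefix one more 's' to the previous numeral

third = numeral(3)
```

The recursion mirrors the induction on k: each call strips one ‘s’ until the basis is reached.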
1.7. Recapitulation
Applying Def. 8.4.2 to our present language ℒ, we see that an
ℒ-interpretation (a.k.a. ℒ-structure) 𝔘 is completely determined by
the following ingredients.
1.8. Definition
The intended or standard ℒ-interpretation 𝔑 is characterized as follows:
1.9. Definition
1.10. Remarks
(i) We have chosen the syntactic constants ‘0’, ‘s’, ‘+’ and ‘x’
advisedly, so as to serve a mnemonic purpose: each of these
symbols graphically suggests the standard interpretation of the
ℒ-symbol that it denotes. This punning mnemonic role of the
four syntactic constants is made manifest in clauses (ii), (iii) and
(iv) of Def. 1.8. For example, ‘0’ has been chosen as the name (in
our metalanguage) for the individual constant of ℒ. The shape (if
any!) of the latter constant is left unspecified, but under the
standard interpretation of ℒ it is treated as a name of the number
zero, that number which is conventionally denoted by the numeral
‘0’. Since ‘0’ was chosen for its present role precisely because
it looks like ‘0’, we have a mnemonically useful pun: 0^𝔑 = 0.
A similar mnemonic purpose is served by the choice of ‘=’ as
the syntactic constant denoting the equality symbol of ℒ, except
that in this case the pun is not confined to the standard interpretation.
Indeed, by Def. 8.4.2(iii), under any ℒ-interpretation 𝔘
the equality symbol of ℒ is interpreted as denoting the identity
relation on the domain U of 𝔘. As a result, we have (as part of
clause F1 of the BSD) the mnemonically useful pun:
1.11. Warning
Beware, however, of being deceived by this suggestive notation: Rem.
1.10(i) works for the standard interpretation, but not necessarily for
other interpretations. Thus, for example, you must not assume that 0
always denotes the number 0. Rather, under an arbitrary ℒ-interpretation
𝔘, the object 0^𝔘 denoted by 0 need not be a number at all, let
alone the number 0; in fact, it can be any object whatsoever.
Or, to take another simple example, you must not assume that the
sentence 0+0=0 is true under an arbitrary ℒ-interpretation. Of
course, this sentence is easily seen to be true in the sense of Def.
1.9(ii). It is clearly satisfied in the standard structure 𝔑. But it is not
logically true: if σ is a valuation based on an arbitrary interpretation
𝔘, then we find (using the BSD) that (0+0=0)^σ = ⊤ iff f(a, a) = a,
where f = +^𝔘 and a = 0^𝔘 (that is, f and a are the binary operation and
individual named by + and 0 respectively under 𝔘). It is quite possible
that f(a, a) ≠ a, in which case 𝔘 ⊭ 0+0=0.
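The warning can be made concrete by evaluating the same closed terms under two different interpretations. The Python sketch below is our own illustration and makes several assumptions: terms are lists of symbols in prefix order, '*' stands in for the multiplication symbol, an interpretation is a dictionary giving a meaning to each of the four extralogical symbols, and the names evaluate, standard and fruity are hypothetical. The second interpretation is deliberately perverse.

```python
def evaluate(tokens, interp, valuation=None):
    """Value of a prefix-notation term under an interpretation and a valuation."""
    valuation = valuation or {}

    def go(pos):
        head = tokens[pos]
        if head == "0":
            return interp["0"], pos + 1
        if head == "s":
            v, pos = go(pos + 1)
            return interp["s"](v), pos
        if head in ("+", "*"):
            a, pos = go(pos + 1)
            b, pos = go(pos)
            return interp[head](a, b), pos
        return valuation[head], pos + 1              # a variable

    value, end = go(0)
    if end != len(tokens):
        raise ValueError("symbols left over after a complete term")
    return value

# The standard interpretation: numbers, successor, addition, multiplication.
standard = {"0": 0, "s": lambda a: a + 1,
            "+": lambda a, b: a + b, "*": lambda a, b: a * b}

# An arbitrary interpretation whose individuals are not numbers at all.
fruity = {"0": "apple", "s": lambda a: a,
          "+": lambda a, b: "pear", "*": lambda a, b: a}

zero_plus_zero = ["+", "0", "0"]
holds_in_standard = evaluate(zero_plus_zero, standard) == evaluate(["0"], standard)
holds_in_fruity = evaluate(zero_plus_zero, fruity) == evaluate(["0"], fruity)
```

Under the second interpretation the individual named by 0 is not a number, and the operation named by + sends the pair (a, a) to something other than a, so the equation fails there although it holds in the standard structure.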
1.12. Problem
Show that s_k^𝔑 = k (see Def. 1.6).
1.13. Problem
Let x, y and z be distinct variables. Let σ be a valuation based on 𝔑
and let x and y be the numbers x^σ and y^σ respectively. For each of the
(i) ∃z(x+z=y),
(ii) ∃z(x+sz=y),
(iii) ∀y(x≠sy),
(iv) ∃y(x=s_2×y),
(v) ∃z(x=y×z).
§2. Theories
2.1. Definition
For any number n, we let Φ_n be the set of all ℒ-formulas whose free
variables are among v_1, v_2, ..., v_n, the first n variables of ℒ in
alphabetic order (cf. Spec. 8.1.1(i)). In particular, Φ_0 is the set of all
ℒ-sentences.
2.2. Remark
If φ ∈ Φ_n, it does not follow that all the variables v_1, v_2, ..., v_n must
be free in φ; but only that no other variables are free in φ. Hence
Φ_n ⊆ Φ_(n+1) for all n.
2.3. Definition
2.4. Remarks
By definition, DcΓ is the set of all sentences that can be deduced from
Γ in Fopcal. However, by the soundness and completeness of Fopcal
(Thms. 8.9.14 and 8.13.10), DcΓ is also the set of all sentences that are
logical consequences of Γ; in particular Λ is the set of all logically true
sentences (cf. Def. 8.4.10). ‘Λ’ is mnemonic for ‘logic’.
2.5. Definition
An ℒ-theory is a set Σ ⊆ Φ_0 such that Σ = DcΣ; in other words, it is a
set of ℒ-sentences closed (or saturated) under deducibility of
ℒ-sentences.
2.6. Problem
If Γ is any set of sentences, show that DcΓ is a theory that includes
Γ itself. Moreover, DcΓ is the smallest such theory: if Σ is any theory
that includes Γ, then DcΓ ⊆ Σ.
2.7. Definition
If Σ is a theory, then a postulate set for Σ is any set Γ of sentences such
that Σ = DcΓ.
2.8. Remark
The ideas we have just introduced may be applied in two mutually
converse ways. In some cases we start with a given set Γ of sentences
as postulates, and wish to investigate the resulting theory DcΓ. In
other cases we start with a given theory Σ and wish to find a set of
postulates for it that has some desirable property. (Of course, by Defs.
2.5 and 2.7 every theory is a postulate set for itself; but the point is to
find a simpler set.)
2.9. Examples
2.10. Definition
For any ℒ-structure 𝔘 we put
Th𝔘 =df {φ ∈ Φ_0 : 𝔘 ⊨ φ}.
Th𝔘 is called the theory of 𝔘; it is the set of all sentences that hold in
𝔘.
2.11. Remark
It is easy to see that Th𝔘 is indeed a theory in the sense of Def. 2.5: if
ψ is a sentence such that Th𝔘 ⊢ ψ then, by the soundness of Fopcal,
𝔘 ⊨ ψ; therefore ψ ∈ Th𝔘.
2.12. Definition
A theory Σ is complete if it is consistent, and for any sentence φ either
φ ∈ Σ or ¬φ ∈ Σ.
2.13. Problem
2.14. Definition
(i) We put
Ω =df Th𝔑.
2.15. Remarks
(i) By Prob. 2.13(ii), Ω is indeed a complete theory. By Def. 2.14,
Ω is a sound theory. In fact, Ω is the only complete sound
3.2. Convention
We shall often wish to consider the standard structure 𝔑 alongside
some ℒ-structure, which may or may not be the standard one. In such
cases it will be convenient to denote the latter structure by ‘*𝔑’.
Whenever we use this notation, we shall take it for granted that
(i) *N is the domain of *𝔑,
(ii) *0 is 0^(*𝔑) (the designated individual of *𝔑),
(iii) *s is s^(*𝔑) (the basic unary operation of *𝔑),
(iv) *+ and *× are +^(*𝔑) and ×^(*𝔑) respectively (the basic binary
operations of *𝔑).
The prefix ‘*’ is pronounced as ‘pseudo’.
3.3. Remark
3.4. Definition
(i) An embedding of the structure 𝔑 in the structure *𝔑 is an
injection from N to *N (that is, a 1-1 mapping from N into *N)
such that
3.5. Remarks
(i) If f is an isomorphism between 𝔑 and *𝔑, then *𝔑 is an exact
replica of 𝔑: each number n has a unique counterpart fn and
each individual of *𝔑 is the counterpart of a unique number; and,
moreover, by (*) the basic operations on numbers are exactly
mimicked by the corresponding basic operations on their counterparts.
The two structures are structurally indistinguishable.
For this reason we shall from now on refer not just to 𝔑 itself
but also to any ℒ-structure isomorphic to it as the standard
structure.
(ii) If f is merely an embedding of 𝔑 in *𝔑, then this means that *𝔑
has a substructure isomorphic to 𝔑.
3.6. Problem
Let f be an embedding of 𝔑 in *𝔑. For any valuation σ based on
𝔑, we define fσ as the valuation based on *𝔑 such that, for each
variable y,
y^(fσ) = f(y^σ).
(i) Show that t^(fσ) = f(t^σ) for any term t. Hence, in particular, if t is a
closed term it follows that t^(*𝔑) = f(t^𝔑). (Use induction on deg t,
distinguishing the five cases mentioned in Rem. 1.4(iii). Note that
the fact that f is injective need not be used in the proof.)
(ii) Show that f[σ(x/n)] = (fσ)(x/fn), where x is any variable and n
is any number.
(iii) Show that if f is an isomorphism between 𝔑 and *𝔑 then
φ^(fσ) = φ^σ for any formula φ. In particular, *𝔑 ⊨ φ iff 𝔑 ⊨ φ for any
sentence φ.
3.7. Remark
By Def. 2.14, 𝔑 ⊨ φ iff φ ∈ Ω; thus 𝔑 is a model for Ω (see Def.
8.5.10). From Prob. 3.6(iii) it follows that any structure *𝔑 isomorphic
to 𝔑 is likewise a model for Ω. This is hardly surprising, since such *𝔑
is a carbon copy of 𝔑. The surprising fact, which will be proved next, is
that not all models for Ω are standard.
PROOF
Choose any variable x, and for each number n let φ_n be the formula
x≠s_n. Now consider the following set of formulas:
Φ = Ω ∪ {φ_n : n ∈ N}.
prove that f cannot be surjective (that is, cannot map N onto *N).
Indeed, for each number n our valuation τ satisfies the formula φ_n,
that is, x≠s_n. Hence (by the BSD) we must have
3.9. Problem
Let *𝔑 be any model for Ω. Let f be the mapping from N to *N
defined by:
(i) Show that f is injective. (If m≠n then s_m≠s_n is in Ω and so
must hold in *𝔑.) Prove:
(ii) f is an embedding of 𝔑 in *𝔑.
(iii) f is the only embedding of 𝔑 in *𝔑. (Use Prob. 3.6(i).)
(iv) Hence *𝔑 is a standard model of Ω iff *N = {s_n^(*𝔑) : n ∈ N}.
3.10. Remark
Skolem’s Theorem means that the whole truth about 𝔑 cannot be
expressed in ℒ. As we have noted, Ω is all that can be said in ℒ about
𝔑; but Ω fails to pin 𝔑 down uniquely (even up to isomorphism). At
first sight it may seem that this is perhaps due to some accidental defect of
ℒ. Can ℒ perhaps be enriched (and 𝔑 correspondingly elaborated) so
that in the richer formal language the correspondingly more elaborate
structure of natural numbers may be characterized uniquely up to
isomorphism? For a discussion of this question, and a pessimistic
answer, see B&M, pp. 320-324. We shall return to this issue in the
Appendix.
§4. Representability
4.1. Preview
This section is devoted to defining new concepts rather than to proving
major results. We shall introduce two ways in which a relation on N
may be formally expressed or represented in a theory Σ.
4.2. Reminder
We recall some of the conventions introduced in Ch. 9. Lower-case
German letters ‘𝔞’, ‘𝔟’, ‘𝔵’ and ‘𝔶’ are used as informal variables ranging
over the set N^n of all n-tuples of numbers. Where a German letter is
used for an n-tuple, the corresponding italic letter is used for the
components of that n-tuple. Thus, for example, 𝔞 = (a_1, a_2, ..., a_n)
and 𝔵 = (x_1, x_2, ..., x_n).
Note that the number of components of a tuple denoted by a
German letter is always assumed to be n (rather than k or m etc.).
Recall that by relation we mean relation on N. If P is an n-ary
relation, we usually write, for example, ‘P𝔞’ as short for ‘𝔞 ∈ P’.
4.3. Remark
The symbols ‘𝔞’ and ‘𝔵’ do not refer to, or have anything to do with,
the formal language ℒ; they are ordinary mathematical symbols used
as variables in our own language.
4.5. Remarks
(i) Here the terms t_1, t_2, ..., t_n are substituted simultaneously for
all free occurrences of v_1, v_2, ..., v_n respectively (the first n
variables in alphabetic order; cf. Spec. 8.1.1(i)). So, for example,
‘α(t)’ is short for ‘α(v_1/t)’. If t is to be substituted for a variable x
other than v_1, we cannot use the abbreviated notation but have to
write ‘α(x/t)’ in full.
(ii) When substituting several terms in a formula, as in Def. 4.4(ii),
alphabetic changes of bound variables may be necessary in order
to prevent capture. Also, it is important that the terms are
substituted simultaneously rather than successively. (For a detailed
precise treatment of the technicalities involved in simultaneous
substitution, see B&M, pp. 65-67.) However, in many
cases when the abbreviated notation is used below, the terms that
are substituted will be closed terms; so no changes of bound
variables will be required. In such cases it is also unimportant
whether the substitution is made simultaneously or successively.
Next, for the case where the terms to be substituted for the variables
v_1, v_2, ..., v_n are numerals, we introduce a further useful abbreviation
which slightly stretches the use of lower-case German letters.
4.6. Definition
For any term r, any formula φ and any 𝔞 ∈ N^n, we put
Thus, φ(s_𝔞) is obtained from φ by substituting the a_i-th numeral for all
free occurrences of v_i, where i = 1, 2, ..., n.
4.7. Definition
Let P be any n-ary relation and let Σ be a theory.
(i) A formula α ∈ Φ_n represents P weakly in Σ if, for all 𝔵 ∈ N^n,
P𝔵 ⇔ α(s_𝔵) ∈ Σ.
4.8. Remarks
4.9. Problem
Let α ∈ Φ_n, where n > 0. Determine the n-ary relations that α represents
weakly/strongly in the inconsistent theory.
§5. Arithmeticity
5.1. Preview
In this section we investigate an important class of relations: those
representable in complete first-order arithmetic, Ω. In view of Rem.
4.8(iv), in the present context we need not distinguish between weak
and strong representation, so we say simply that a given formula
represents a relation in Ω.
5.2. Definition
A relation is arithmetical if it is representable in Ω.
5.3. Remark
(*) P𝔵 ⇔ α(s_𝔵) ∈ Ω
5.4. Definition
Let α ∈ Φ_n and 𝔞 ∈ N^n. If α is satisfied by some valuation σ based on
𝔑 such that v_i^σ = a_i for i = 1, 2, ..., n, we write:
‘𝔑 ⊨ α[𝔞]’.
5.5. Remarks
(i) If 𝔑 ⊨ α[𝔞], then by Thm. 8.5.8 α is satisfied by every valuation σ
based on 𝔑 such that v_i^σ = a_i for i = 1, 2, ..., n.
(ii) Def. 5.4 is a contextual definition: it defines the whole expression
‘𝔑 ⊨ α[𝔞]’ as a package. The part ‘α[𝔞]’ of this package has no
meaning on its own: it does not denote anything whatsoever. In
particular, ‘α[𝔞]’ must not be confused with ‘α(s_𝔞)’, which does
have meaning on its own: it denotes the 𝓛-sentence obtained
from α by substituting the n-tuple of numerals s_𝔞 for the first n
variables of 𝓛. However:
5.6. Lemma
Let α ∈ Φ_n and 𝔞 ∈ N^n. Then 𝔑 ⊨ α(s_𝔞) iff 𝔑 ⊨ α[𝔞].
PROOF
5.7. Remark
From this lemma it now follows that conditions (*) and (**) of Rem.
5.3 are equivalent to
5.8. Examples
Because condition (***) refers to the standard interpretation, it is
always straightforward to work out the n-ary relation represented in Ω
by a given α ∈ Φ_n. All that we need to do is to ‘deformalize’ α by
‘translating’ it from 𝓛 into the metalanguage (see Rem. 1.10).
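The ‘deformalization’ of an equation can be sketched in code. This is only an illustration under assumptions of our own: terms are encoded as nested Python tuples (an encoding that does not appear in the text), and `eval_term` and `relation_of_equation` are hypothetical helper names. The sketch evaluates a term under the standard interpretation and reads off the relation that an equation r=t describes.

```python
# Hypothetical term encoding (an assumption made for this sketch only):
#   ("0",)          the constant 0
#   ("v", i)        the variable v_i  (i = 1, 2, ...)
#   ("s", t)        the successor of t
#   ("+", t1, t2)   the sum of t1 and t2
#   ("x", t1, t2)   the product of t1 and t2


def eval_term(term, a):
    """Value of `term` in the standard structure when v_i is assigned a[i-1]."""
    tag = term[0]
    if tag == "0":
        return 0
    if tag == "v":
        return a[term[1] - 1]
    if tag == "s":
        return eval_term(term[1], a) + 1
    if tag == "+":
        return eval_term(term[1], a) + eval_term(term[2], a)
    if tag == "x":
        return eval_term(term[1], a) * eval_term(term[2], a)
    raise ValueError(f"unknown term constructor: {tag!r}")


def relation_of_equation(r, t):
    """The relation P such that P(a) holds iff r and t take the same value at a."""
    return lambda *a: eval_term(r, a) == eval_term(t, a)


# Example: the equation v1 + v1 = v2 deformalizes to 'x + x = y'.
v1, v2 = ("v", 1), ("v", 2)
double = relation_of_equation(("+", v1, v1), v2)
```

Here `double(3, 6)` holds while `double(3, 7)` does not, mirroring the informal reading of the equation.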
5.9. Lemma
If the equation r=t belongs to Φ_n, then it represents in Ω an elementary
n-ary relation. Conversely, every elementary relation is represented in Ω
by an equation.
PROOF
First, suppose that r=t belongs to Φ_n. This simply means that every
variable occurring in r or t is among v_1, v_2, ..., v_n. In addition to
variables, r and t may contain occurrences of 0, s, + and ×.
Let P be the n-ary relation represented by this equation in Ω. To
determine P we use the process of ‘deformalization’ illustrated in Ex.
5.8. We get, for all 𝔵 ∈ N^n:
(*) P𝔵 ⇔ f𝔵 = g𝔵,
5.10. Warning
Not every formula that represents in Ω an elementary relation is an
equation. What we have shown is that among the (infinitely many)
formulas representing in Ω a given elementary relation there must be
an equation.
5.11. Theorem
The following two conditions are equivalent:
PROOF
5.12. Remarks
(i) Thm. 5.11 means that the class of arithmetical relations is the
smallest class that contains all elementary relations and is closed
under the logical operations.
(ii) That the proof of Thm. 5.11 was so easy is due in part to the
notation we are using (cf. Warning 9.1.4).
5.13. Corollary
If P is an n-ary r.e. relation, then it is arithmetical. Moreover, it is
represented in Ω by a formula of the form
∃v_{n+1}∃v_{n+2} … ∃v_{n+m}(r=t),
where m ≥ 0.
PROOF
By the MRDP Thm. 9.5.4, P is diophantine. This means that P is
obtained from an elementary relation by a finite number of (informal)
existential quantifications. The second half of the proof of Thm. 5.11
shows that P is represented in Ω by a formula having the required
form.
5.14. Remark
Since the formula in Cor. 5.13 must be in Φ_n, all the variables
occurring in r or t must be among v_1, v_2, ..., v_{n+m}.
5.15. Corollary
Every recursive relation is arithmetical.
PROOF
5.16. Remark
Since every elementary relation is recursive (see Rem. 9.5.5(i)), it
follows from Rem. 5.12(i) and Cor. 5.15 that the class of arithmetical
relations is the smallest class that contains all recursive relations and is
closed under the logical operations.
5.17. Reminder
In what follows we use the terms function and graph in the same sense
as in Ch. 9: an n-ary function f is an n-ary operation on N; and its graph
is the (n + 1)-ary relation P such that, for all 𝔵 ∈ N^n and all y ∈ N,
P(𝔵, y) ⇔ f𝔵 = y.
5.18. Definition
An arithmetical function is a function whose graph is an arithmetical
relation.
5.19. Theorem
Every recursive function is arithmetical.
PROOF
5.20. Problem
Let P be a k-ary arithmetical relation and let f_1, f_2, ..., f_k be n-ary
arithmetical functions. Let the n-ary relation Q be defined, for all
𝔵 ∈ N^n, by the equivalence
Q𝔵 ⇔ P(f_1𝔵, f_2𝔵, ..., f_k𝔵).
§6. Coding
6.1. Preview
In a natural language we can talk of many things: of shoes and ships
and sealing wax, of cabbages and kings, and of that very language
itself. Can the same thing be done in 𝓛, under its standard interpretation?
Can 𝓛 be used to ‘talk’ of its own expressions, of their properties,
of relations among them and of operations upon them? At first
glance this seems absurd: under its standard interpretation 𝓛 ‘talks’ of
Wey 10. Limitative results
6.2. Definition
(i) To distinguish between the ordinary decimal and the binary
notation we shall use italic (slanted) digits ‘0’ and ‘1’ for the
latter. Thus, with the binary numeral on the left of each equation,
0 = 0, 1 = 1, 10 = 2, 11 = 3, 100 = 4, etc.
(ii) If k ≥ 1 and a_1, a_2, ..., a_k are any numbers, with a_1 > 0, we
define their binary concatenation
a_1⌢a_2⌢…⌢a_k as
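Binary concatenation is easy to experiment with. The following is a minimal sketch, assuming that the concatenation of several numbers is the number whose binary representation is obtained by writing their binary representations one after another, and taking the side condition to be that the first number is positive. The function name `bconcat` is our own.

```python
def bconcat(*nums):
    """Join the binary representations of nums and read the result as a number."""
    if not nums or nums[0] <= 0:
        raise ValueError("need at least one number, and the first must be positive")
    bits = "".join(format(n, "b") for n in nums)  # e.g. 2, 3 -> "10" + "11"
    return int(bits, 2)
```

For instance, `bconcat(2, 3)` joins the binary strings 10 and 11 into 1011, which is eleven.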
6.3. Definition
#0 = 2 = 10,
#s = 4 = 2^2 = 100,
#+ = 8 = 2^3 = 1000,
#× = 16 = 2^4 = 10000,
#= = 32 = 2^5 = 100000,
#¬ = 64 = 2^6 = 1000000,
#→ = 128 = 2^7 = 10000000,
#∀ = 256 = 2^8 = 100000000,
en 2 tor ht
(ii) If k ≥ 1 and p_1, p_2, ..., p_k are primitive symbols of 𝓛 then we
assign to the 𝓛-string p_1p_2…p_k the code-number
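The coding of strings can be tried out on small cases. The sketch below assumes that the code-number of a string is the binary concatenation of the code-numbers of its symbols, which is what the later use of 64⌢x for negation suggests. It covers only the eight non-variable symbols of the table; the ASCII names "x", "~", "->" and "A" stand in for ×, ¬, → and ∀, and all function names are our own.

```python
# Code-numbers of the non-variable primitive symbols, as read from the table.
SYMBOL_CODE = {"0": 2, "s": 4, "+": 8, "x": 16, "=": 32, "~": 64, "->": 128, "A": 256}
CODE_SYMBOL = {code: sym for sym, code in SYMBOL_CODE.items()}


def bconcat(*nums):
    """Binary concatenation: join the binary representations and read back."""
    return int("".join(format(n, "b") for n in nums), 2)


def code_of_string(symbols):
    """Code-number of a non-empty list of symbol names."""
    return bconcat(*(SYMBOL_CODE[p] for p in symbols))


def decode(n):
    """Recover the symbols: every symbol code is a '1' followed by zeros."""
    bits = format(n, "b")
    chunks = ["1" + zeros for zeros in bits.split("1")[1:]]
    return [CODE_SYMBOL[int(chunk, 2)] for chunk in chunks]
```

Because each symbol code is a power of two, the digit 1 marks the start of every symbol, so decoding is unambiguous. For example, the string s0 gets the binary code 10010, that is, eighteen.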
6.4. Remarks
6.5. Convention
When a noun or nominal phrase referring to 𝓛-expressions appears in
small capitals, it should be read with the words ‘code-number of’ or
‘code-number of a’ prefixed to it. Thus, for example, ‘TERM’ is short
for ‘code-number of a term’.
Many relations and functions connected with the syntax of 𝓛 can easily
be seen to be recursive.
6.6. Examples
(i) Consider the property Tm defined by
Tm(x) ⇔_df x is a TERM.
6.7. Example
The diagonal function is the unary function d defined as follows:
d(x) = #[α(s_x)] if x is a FORMULA α,
d(x) = … if x is not a FORMULA.
6.8. Theorem
The function d is recursive. For any formula α,
d(#α) = #[α(s_{#α})].
PROOF
For the recursiveness claim, see above. The equality follows directly
from the definition of d.
In this sense the formula α expresses the property of being a TERM and
the sentence α(s_x) ‘says’ that x is a TERM. Thus 𝓛, under its standard
interpretation 𝔑, is able to discourse of various aspects of its own
syntax, albeit obliquely, by referring to its own expressions via their
code-numbers.
Can the standard semantics of 𝓛 likewise be discussed in 𝓛? We
shall show that it cannot.
7.2. Definition
For any set Σ of sentences, the property T_Σ is defined by
7.3. Remarks
(i) Equivalently, T_Σ is the set #[Σ] of all SENTENCES of Σ.
(ii) In particular, T_Ω is the property of being a SENTENCE of Ω. In
other words, T_Ω(x) holds iff x is a TRUE SENTENCE (see Def.
1.9(ii)).
PROOF
(*) Px ⇔_df ¬T_Ω(d(x)).
If T_Ω were arithmetical, then by Prob. 5.20 and Thm. 5.11 it would
follow that P is arithmetical as well. This would mean that there is
some formula α ∈ Φ_1 such that, for any number x,
(**) Px ⇔ α(s_x) ∈ Ω.
7.5. Remarks
(i) Let us paraphrase the proof just given. If the property P were
arithmetical then it would be expressed (that is, represented in
§7. Tarski’s Theorem 237
7.6. Definition
Let f be an n-ary function and let α ∈ Φ_{n+1}. We say that α represents f
numeralwise in a theory Σ if, for any 𝔞 ∈ N^n, the sentence
belongs to Σ.
7.7. Problem
Let α represent the n-ary function f numeralwise in the theory Σ. For
any formula β in Φ_1, define β′ as the formula
∃v_{n+1}[β(v_{n+1}) ∧ α].
Prove that, for any 𝔞 ∈ N^n, the sentence β(s_{f𝔞})↔β′(s_𝔞) belongs to Σ.
(It is enough to show that this sentence is deducible from (***) in
Fopcal.)
7.8. Definition
A formula γ ∈ Φ_1 is called a truth definition inside a theory Σ if, for
each sentence φ, Σ contains the sentence
γ(s_{#φ})↔φ.
7.9. Problem
(i) Prove that if the diagonal function d is representable numeralwise
in a consistent theory Σ, then there cannot exist a truth
definition inside Σ. (Given any γ ∈ Φ_1, use Prob. 7.7 to find a
formula δ ∈ Φ_1 such that for every number a the sentence
¬γ(s_{d(a)})↔δ(s_a) is in Σ; then take φ as δ(s_{#δ}).)
(ii) Prove that d is representable numeralwise in Ω; hence deduce
that there is no truth definition inside Ω. (Since d is arithmetical,
there is a formula α ∈ Φ_2 that represents the graph of d in Ω.
Show that the same α also represents d numeralwise in Ω.)
(iii) Using (ii), give a new proof of Thm. 7.4. (Show that if T_Ω were
represented in Ω by a formula γ, then γ would be a truth
definition inside Ω.)
(iv) Prove that if Σ is a sound theory (see Def. 2.14) there is no truth
definition inside it.
§8. Axiomatizability
Recall (Def. 2.7) that a set of postulates (a.k.a. extralogical axioms) for
a theory Σ is a set of sentences Γ such that Σ = Dc Γ. Having a set of
postulates is no big deal: every theory Σ has one, because (by Def. 2.5)
Σ = Dc Σ. In order to qualify as an axiomatic theory, Σ must be
presented by means of a postulate set Γ specified by a finite recipe.
This does not mean that Γ itself must be finite. (Of course, if Γ is finite
then so much the better, for then its sentences can be specified directly
by means of a finite laundry list.) Rather, it means that we are
provided with an algorithm (a finite set of instructions) whereby the
sentences of Γ can be generated mechanically, one after the other. By
Church’s Thesis, this is equivalent to saying that T_Γ must be given as
an r.e. property.
8.1. Conventions
8.2. Definition
(i) A theory Σ is axiomatic if it is presented by means of a set of
postulates Γ, which is given as an r.e. set.
(ii) A theory Σ is axiomatizable if there exists an r.e. set Γ of
postulates for Σ.
8.3. Remark
Note that being axiomatic is an intensional attribute: it is not a
property of a theory as such, in a Platonic sense, but describes the way
in which a theory is presented. On the other hand, axiomatizability is
an extensional attribute of a theory as such, irrespective of how it is
presented.
8.4. Theorem
If Σ is an axiomatizable theory then there exists a recursive set of
postulates for Σ.
PROOF
#Y0, PY pe ee as
8.5. Remark
The proof of Thm. 8.4 shows that if Σ is not merely axiomatizable but
an axiomatic theory, then we can actually find a recursive set of
postulates for it.
8.6. Definition
For any formulas φ_1, φ_2, ..., φ_n, where n ≥ 1, we put
8.7. Remark
Thus, the binary representation of #(φ_1, φ_2, ..., φ_n) is obtained by
stringing together the binary representations of the code-numbers #φ_1,
#φ_2, ..., #φ_n, in this order, but inserting a digit ‘1’ between each one
and the next. These additional ‘1’s serve as separators (like commas)
showing where the binary representation of the code-number of one
formula ends, and the next one begins. These separators are easily
detected: they are always the first of two successive occurrences of ‘1’.
(The second ‘1’ belongs to the binary representation of the next
formula.)
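The separator device described in this remark can be checked mechanically. The sketch below assumes that the code-number of a sequence is formed exactly as the remark says (a single digit 1 between consecutive codes) and that every code in the sequence is the code of a string of primitive symbols, so that its binary representation never contains two successive 1s and always ends in 0. The function names are our own.

```python
def code_of_sequence(codes):
    """Join the binary representations of the codes, with a '1' between neighbours."""
    return int("1".join(format(c, "b") for c in codes), 2)


def split_sequence(n):
    """Recover the individual codes by locating the doubled 1s."""
    parts = format(n, "b").split("11")
    # The first 1 of each pair is the separator; the second belongs to the
    # next code and has to be put back.
    return [int(parts[0], 2)] + [int("1" + p, 2) for p in parts[1:]]
```

For example, the codes 18 (binary 10010) and 2 (binary 10) combine to binary 10010110, that is, 150, and splitting 150 at its only pair of successive 1s returns the two codes.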
8.8. Definition
For any set of sentences Γ we define a binary relation Ded_Γ by:
8.9. Lemma
If Γ is a recursive set of sentences then the relation Ded_Γ is recursive.
PROOF
8.10. Theorem
A theory is axiomatizable iff it is an r.e. set of sentences.
PROOF
If Σ is axiomatizable then by Thm. 8.4 there is a recursive set of
sentences Γ such that Σ = Dc Γ; that is, Σ is the set of sentences
deducible in Fopcal from Γ. Thus, for all x,
8.11. Remarks
(i) The proof of Thm. 8.10 (including the proofs of Thm. 8.4 and
Lemma 8.9) shows that if Σ is not merely axiomatizable but an
axiomatic theory, then a program can actually be produced for
making a computer operate as an enumerate-T_Σ machine. Hence
Σ can be given as an r.e. set in the sense of Conv. 8.1(i).
(ii) The theorem means that a theory is axiomatizable iff there exists
a finite presentation of it, by means of a program for generating
one by one all the SENTENCES of the theory.
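How a recursive ‘is a deduction of’ relation yields an enumerating machine can be sketched as follows. This is an illustration only: `ded(x, y)` stands for any decidable two-place relation read as ‘y codes a deduction of x’, and the toy relation used here (y is a proof of x iff y·y = x) is an assumption chosen purely to keep the example small. It is not the relation defined in Def. 8.8.

```python
from itertools import count, islice


def enumerate_re(ded):
    """Yield every x for which some y satisfies ded(x, y).

    All pairs (x, y) are visited in order of increasing x + y (dovetailing),
    so each such x is produced after finitely many steps.
    """
    seen = set()
    for total in count(0):
        for x in range(total + 1):
            y = total - x
            if x not in seen and ded(x, y):
                seen.add(x)
                yield x


def toy_ded(x, y):
    """Stand-in for a recursive deduction relation (assumption for illustration)."""
    return y * y == x


first_four = list(islice(enumerate_re(toy_ded), 4))
```

With the toy relation, the generator produces exactly the perfect squares. The point of the design is that the enumeration never has to decide membership: it only checks individual pairs, each of which is a finite task.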
8.12. Theorem
Ω is not axiomatizable.
PROOF
8.13. Theorem
If P is weakly representable in an axiomatizable theory then P is an r.e.
relation.
$9. Baby arithmetic 243
PROOF
P𝔵 ⇔ α(s_𝔵) ∈ Σ.
P𝔵 ⇔ T_Σ(#[α(s_𝔵)]).
The n-ary function f defined by the identity f𝔵 = #[α(s_𝔵)] is clearly
recursive. (To compute f𝔵 the n numerals s_𝔵 must be substituted for
the variables v_1, v_2, ..., v_n in α; the code-number of the resulting
sentence is f𝔵. This computation can evidently be performed by a
suitably programmed computer.)
By Thm. 8.10 T_Σ is r.e.; therefore by Thm. 9.4.6(iii) P is r.e. as
well.
8.14. Problem
Prove that if P is strongly representable in a consistent axiomatizable
theory, then P is a recursive relation. (First show that if α represents P
strongly in a theory, then ¬α represents ¬P strongly in that theory.)
s_m+s_0=s_m.
s_m+s_{n+1}=s(s_m+s_n).
s_m×s_0=s_0.
s_m×s_{n+1}=s_m×s_n+s_m.
9.6. Remark
Evidently, all these postulates are true; hence Π_0 is sound. Also, this
theory is axiomatic, as the set of postulates 9.2-9.5 is evidently
recursive.
From the postulates of Π_0 we can deduce in Fopcal formal versions
of the addition and multiplication tables.
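The way the four postulate schemes generate the tables can be mimicked by a recursion that uses nothing but those schemes. The sketch below is an informal illustration, not a formal deduction: natural numbers stand in for numerals, `+ 1` stands in for the successor symbol, and the function names are our own.

```python
def add(m, n):
    """Compute the k with s_m + s_n = s_k, using only Post. 1 and Post. 2."""
    if n == 0:
        return m              # Post. 1:  s_m + s_0 = s_m
    return add(m, n - 1) + 1  # Post. 2:  s_m + s_{n+1} = s(s_m + s_n)


def mul(m, n):
    """Compute the k with s_m x s_n = s_k, using only Post. 3 and Post. 4."""
    if n == 0:
        return 0                  # Post. 3:  s_m x s_0 = s_0
    return add(mul(m, n - 1), m)  # Post. 4:  s_m x s_{n+1} = s_m x s_n + s_m
```

Each recursive call corresponds to one instance of a postulate, so the depth of the recursion gives a rough idea of how many instances a formal deduction of a given table entry needs.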
9.7. Example
Let us show that s_1+s_1=s_2 ∈ Π_0. First, note that the equation
(1) s_1+s_1=s(s_1+s_0)
is an instance of Post. 2, and so belongs to Π_0. Also, the equation
(2) s_1+s_0=s_1
is an instance of Post. 1, and hence belongs to Π_0. Using Ax. 6 of
Fopcal, we deduce from (2) the equation s(s_1+s_0)=ss_1, which (in view
of the definition of the numerals) is
(3) s(s_1+s_0)=s_2.
Finally, using Ax. 5 and Ax. 7 of Fopcal, we deduce from (1) and (3)
the equation
s_1+s_1=s_2,
9.8. Problem
Prove that Π_0 contains the sentence:
9.9. Lemma
PROOF
t=s,+s,, 7
StS m=Sp
also belongs to Ip. From these two equations we deduce (using Ax. 5
and Ax. 7 of Fopcal) the equation t=s,,.
9.10. Definition
A formula (or sentence) of the form
9.11. Lemma
Π_0 contains all true simple existential sentences.
PROOF
9.12. Theorem
For any given n-ary r.e. relation P, we can find a formula of the form
∃v_{n+1}∃v_{n+2} … ∃v_{n+m}(r=t)
that belongs to Φ_n, and represents P weakly in every sound theory that
includes Π_0.
PROOF
P𝔵 ⇔ α(s_𝔵) ∈ Ω.
α(s_𝔵) ∈ Ω ⇔ α(s_𝔵) ∈ Σ.
P𝔵 ⇔ α(s_𝔵) ∈ Σ.
9.13. Remarks
(i) By Thm. 8.13, only r.e. relations can be weakly represented in an
axiomatizable theory. We have just shown that every r.e. relation
is in fact weakly representable in Π_0. Thus Π_0 achieves as much
as is possible for any axiomatizable theory as regards weak
representation.
(ii) As we shall see (Thm. 11.13), there are even weaker axiomatic
theories in which every r.e. relation is weakly representable.
However, the postulates of Π_0 have been devised so as to make
this theory just strong enough for Lemma 9.11 to hold; hence r.e.
relations are weakly represented in Π_0 by formulas of a particularly
simple form.
9.14. Problem
Let 𝔘 be an 𝓛-structure whose domain U is a singleton {u}.
∃v_{n+1}∃v_{n+2} … ∃v_{n+m}φ,
Therefore
(9.15) P𝔞 ⇔ there are numbers b_1, b_2, ..., b_m
such that 𝔑 ⊨ φ[𝔞, b_1, b_2, ..., b_m].
9.16. Definition
Let φ be a formula belonging to Φ_{n+m}, and let α be the formula
9.17. Remarks
(i) Thus (9.15) means that, under the assumptions made in Def.
9.16, P𝔞 holds iff there exists an α-witness that it does.
Moreover, the sentence α(s_𝔞) may be regarded as ‘saying’ that
there exists an α-witness that P𝔞. Indeed, it is clear that α(s_𝔞) is
true (that is, 𝔑 ⊨ α(s_𝔞)) iff such a witness exists.
(ii) In the special situation covered by Thm. 9.12, P is an r.e.
relation, α is a simple existential formula of a particularly neat
form and φ is an equation r=t. In this case an α-witness that P𝔞
is an m-tuple (b_1, b_2, ..., b_m) such that
(*) 𝔑 ⊨ (r=t)[𝔞, b_1, b_2, ..., b_m].
10.2. Definition
For any terms r and t, we put
r≤t =_df ∃z(r+z=t),
10.3. Remark
This is yet another mnemonic pun: by Ex. 5.8(ii), the formula v_1≤v_2
represents in Ω the relation ≤.
As postulates for Π_1 we take Post. 1-4 (9.2-9.5), as well as the following:
s_m≠s_n,
∀v_1(v_1≤s_n→v_1=s_0∨v_1=s_1∨…∨v_1=s_n),
∀v_1(s_n≤v_1∨v_1≤s_n),
where n is any number, and (in Post. 5) m is any number such that
m ≠ n.
10.7. Remarks
(i) Evidently, Π_1 is a sound axiomatic theory.
(ii) Π_1 is a proper extension of Π_0 because, for example, no instance
of Post. 5 belongs to Π_0 (cf. Prob. 9.14(ii)).
10.8. Problem
Show that the results of Prob. 3.9(i), (ii) and (iii) hold with ‘Ω’
replaced by ‘Π_1’.
10.9. Problem
(i) Let *2t be the 2-structure such that:
1. *N=NU {}, where © is an object that is not a natural
number;
2. *0=0;
3. *s is the extension of the ordinary successor function such that
*s() = 0;
4. *+ is the extension of ordinary addition such that if a = © or
b= thena*+ b=;
5. *x is the extension of ordinary multiplication such that if
a=~orb=thena*x b=0.
Show that *9 is a model for II,.
(ii) Prove that the sentence Vv,(sv;#s9) is not in II.
§10. Junior arithmetic 251
10.10. Definition
For any variable x, term r and formula @ we put
10.11. Preliminaries
α = ∃v_{n+1}∃v_{n+2} … ∃v_{n+m}(r=t),
γ = ∃y(β∧¬β′).
Note that the free variables of β and β′ are among v_1, v_2, ..., v_n and
y, and therefore γ is in Φ_n.
PROOF
For the simple but somewhat lengthy proof, see B&M, pp. 337-340.
(The Main Lemma appears there as Lemma 7.9, but its proof requires
two earlier results, Lemmas 7.7 and 7.8.)
10.13. Analysis
Let φ and φ′ be the equations r=t and r′=t′ that occur in the formulas
α and α′ respectively.
We take up the discussion begun in Rem. 9.17. Recall that P𝔞 holds
iff there exists an α-witness that it does. By Def. 9.16, such a witness is
an m-tuple (b_1, b_2, ..., b_m) for which
𝔑 ⊨ φ[𝔞, b_1, b_2, ..., b_m].
Thus β(s_𝔞, y/s_b) ‘says’: There is an α-witness that P𝔞, and this witness is
bounded by the number b. In other words: Among the numbers ≤ b
there can be found an α-witness that P𝔞. Exactly the same analysis
applies to P′, α′ and β′.
What does the sentence γ(s_𝔞) ‘say’? Recalling that γ = ∃y(β∧¬β′),
we see that
γ(s_𝔞) ∈ Ω ⇔ 𝔑 ⊨ γ[𝔞]
Let us consider the truth value of γ(s_𝔞) in each of these regions (that is,
for 𝔞 belonging to each region).
For 𝔞 in Region I, P𝔞 holds, and hence is α-witnessed by some
m-tuple (b_1, b_2, ..., b_m). If we choose b large enough (say as the
largest among these b_i) then P𝔞 has an α-witness bounded by b. But in
this region P′𝔞 does not hold, hence has no α′-witness, let alone a
witness bounded by our b. Thus P𝔞 is α-witnessed before P′𝔞 is
α′-witnessed, simply because the former witness exists and the latter
does not. So γ(s_𝔞) is true throughout Region I.
In Region II, the position is reversed. Here P′𝔞 holds, and is
therefore α′-witnessed; but P𝔞 is not α-witnessed at all, let alone
before P′𝔞 is α′-witnessed. Hence γ(s_𝔞) is false throughout Region II.
In Region III, both P𝔞 and P′𝔞 hold, and are therefore witnessed,
but for some 𝔞 in this region P𝔞 may be α-witnessed before P′𝔞 is
α′-witnessed, while for other 𝔞 in the same region this may not be the
case. So there is no general uniform answer for this region: γ(s_𝔞) may
be true for some 𝔞 and false for others.
In Region IV, neither P𝔞 nor P′𝔞 holds, and hence neither is
witnessed. So, P𝔞 is not α-witnessed at all, let alone before P′𝔞 is
α′-witnessed. Hence the sentence γ(s_𝔞) is false in this region.
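The behaviour in the four regions can be explored with a small search. The sketch below is an illustration under several assumptions of our own: ‘bounded by b’ is read as ‘every component is at most b’; the two relations are toy examples with one-component witnesses (a is a perfect square, a is even), not the relations of the Main Lemma; and all names are hypothetical. The search is only a semi-test: the answer True is conclusive, whereas False means merely that no suitable bound was found up to `search_limit`.

```python
from itertools import product


def witnessed_within(is_witness, a, bound, m=1):
    """True iff some m-tuple with all components <= bound is a witness for a."""
    return any(is_witness(a, bs) for bs in product(range(bound + 1), repeat=m))


def gamma_found(a, is_w, is_w_prime, search_limit, m=1):
    """Look for a bound b under which 'P a' is witnessed but 'P' a' is not."""
    for b in range(search_limit + 1):
        if witnessed_within(is_w, a, b, m) and not witnessed_within(is_w_prime, a, b, m):
            return True
    return False


# Toy relations (assumptions for illustration only):
#   P a   iff some b has b*b = a   (a is a perfect square)
#   P' a  iff some b has b+b = a   (a is even)
def square_witness(a, bs):
    return bs[0] * bs[0] == a


def even_witness(a, bs):
    return bs[0] + bs[0] == a
```

With these toy relations, 9 lies in Region I and the search succeeds; 6 lies in Region II and 7 in Region IV, and the search fails for both. Region III shows the non-uniformity: for 16 the square witness 4 comes before the evenness witness 8, so the search succeeds, whereas for 4 both witnesses equal 2 and it fails.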
Our Lemma says that for 𝔞 in Region I the sentence γ(s_𝔞) is not only
true, but even deducible from the postulates of Π_1; and that for 𝔞 in
Region II the sentence is not only false, but even refutable (that is, its
negation is deducible) from these postulates.
The Lemma says nothing about the provability or refutability of
γ(s_𝔞) in the other two regions. As far as Region III is concerned, the
reason is obvious: as we have seen, the sentence may not have a
uniform truth value in this region, so we cannot expect any uniform
result concerning its provability or refutability. But in Region IV the
position is quite different, because our sentence is false throughout this
region, just as in Region II. Why does the Lemma tell us nothing about
this fourth region?
To understand the reason for this discrepancy, we must examine
what kind of evidence is available for the truth or falsehood of γ(s_𝔞)
when 𝔞 is in Regions I, II, and IV.
In order to decide whether a given m-tuple (b_1, b_2, ..., b_m) of
numbers is an α-witness that P𝔞, we must be able to tell whether
𝔑 ⊨ φ[𝔞, b_1, b_2, ..., b_m], where φ is the equation r=t.
As we saw in Rem. 9.17, if (b_1, b_2, ..., b_m) is indeed an α-witness
that P𝔞, then the operations required to recognize this fact can be
performed formally within Π_0, and a fortiori within Π_1.
Now, if (b_1, b_2, ..., b_m) is not an α-witness that P𝔞, then the
operations required to recognize this fact involve not only adding and
multiplying to compute the relevant values of r and t, but also the
ability to tell that these two values are unequal. Thanks to Post. 5, all
this can be performed formally within Π_1.
Thus, in Π_1 it is possible to carry out formally all the operations
required to tell whether or not any given m-tuple (b_1, b_2, ..., b_m) is
an α-witness that P𝔞.
In order to decide whether a given α-witness that P𝔞 is bounded by a
number b, we need to check whether each of the m components of the
witness is ≤ b.
Now, if 𝔞 is in Region I, then in order to verify that γ(s_𝔞) is true we
need only to check that a given m-tuple of numbers is an α-witness
that P𝔞, and is bounded by some given number b; and then to verify
that each of the m-tuples bounded by b fails to be an α′-witness that
P′𝔞. Since there are only finitely many such m-tuples, all this requires
a finite number of simple steps.
In order to obtain a formal deduction of γ(s_𝔞), we need to formalize
the process just described; and for this we need to have at our disposal
10.14. Theorem
Given a recursive relation R, we can find a formula γ, of the form
specified in Prel. 10.11, that represents R strongly in any theory that
includes Π_1.
PROOF
In the Main Lemma, take P and P′ as R and ¬R, which are r.e. by
Thm. 9.3.6. Then the lemma shows that γ represents R strongly in Π_1,
and hence also in any theory that includes Π_1.
10.15. Problem
Let Σ be a theory that includes Π_1. Show that every recursive function
is representable numeralwise in Σ. (If α strongly represents the graph
of the n-ary function f in Π_1, prove that the formula
VySV n+ 1[0(Vn41/Y)<¥=Vn+i],
10.16. Remark
The results of this section, particularly the Main Lemma, in a somewhat
weaker form, are essentially due to Barkley Rosser.¹ The present
stronger version is made possible by the MRDP Thm., which allows us
to take α and α′ as simple existential formulas.
11.1. Postulate I
11.2. Postulate II
∀v_1∀v_2(sv_1=sv_2→v_1=v_2).
' His 1936 paper, ‘Extensions of some theorems of Gédel and Church’, is reprinted in
M. Davis, The Undecidable.
$11. A finitely axiomatized theory pexsy)
∀v_1(v_1+s_0=v_1).
11.4. Postulate IV
∀v_1∀v_2[v_1+sv_2=s(v_1+v_2)].
11.5. Postulate V
∀v_1(v_1×s_0=s_0).
11.6. Postulate VI
∀v_1(v_1≤s_0→v_1=s_0).
118: Postulate IX
11.10. Remarks
11.11. Theorem
Π_1 ⊆ Π_2.
PROOF
It is quite easy to show that all the postulates of Π_1 (Post. 1-7) can be
deduced from Post. I-IX. (DIY, or see the details in B&M, pp.
341-342.)
11.12. Problem
(i) Let *M be the 2-structure such that:
1. *N=NU{o}, where © is an object that is not a natural
number;
26 = 0s
3. *s is the extension of the ordinary successor function such that
NO
4. *+ is the extension of ordinary addition such that if a = ~ or
b=othena*+b=0;
5. *xX is the extension of ordinary multiplication such that if
b=+(0 then @ "b= 6: © *x0—0° and
@ x @ =. for
all @-
Show that *𝔑 is a model for Π_2.
(ii) Prove that the sentence ∀v_1(sv_1≠v_1) is not in Π_2.
11.13. Theorem
(i) Given an r.e. relation P, we can find a formula that represents P
weakly in any sound theory.
(ii) Given a recursive relation R, we can find a formula that represents
R weakly in any theory Σ such that Σ ∪ Π_2 is consistent.
PROOF
Pa+1—>a(s,) € 2.
Ran y(s,) € X. i®
§ 12. Undecidability
Let Σ be a set of sentences. The decision problem for Σ is the problem
of finding an algorithm (a deterministic mechanical procedure)
whereby, for any sentence φ, it can be determined whether or not
φ ∈ Σ. This is clearly equivalent to the problem of finding an algorithm
whereby, for any number x, it can be determined whether or not
T_Σ(x) holds (that is, whether or not x is a SENTENCE of Σ). If such an
algorithm is found, then this constitutes a positive solution to the
decision problem for Σ, and Σ is said to be decidable. If it is proved
that such an algorithm cannot exist, this constitutes a negative solution
to that decision problem, and Σ is said to be undecidable.
Note that if Σ is undecidable, it does not follow that there is some
sentence for which it is impossible to decide whether or not it belongs
to Σ. Each such individual problem may well be solvable by some
means or other. The undecidability of Σ only means that no algorithm
will work for all sentences.
In order to make rigorous reasoning about decidability possible, this
12.1. Definition
If Σ is a set of sentences such that the property T_Σ is not recursive, we
say that Σ is recursively undecidable and that the decision problem for
Σ is recursively unsolvable.
From Tarski’s Theorem 7.4 and Cor. 5.15 it follows at once that Ω is
recursively undecidable. This, as well as many other undecidability
results, also follows from
12.2. Theorem
If Σ is a theory in which every recursive property is weakly representable,
then Σ is recursively undecidable.
PROOF
(**) Px ⇔ α(s_x) ∈ Σ.
Taking x to be the number #α, we get, exactly as in the proof of Thm.
7.4:
12.3. Corollary
Any sound theory is recursively undecidable.
PROOF
Immediate, by Thms. 9.3.6 and 11.13(i).
12.4. Corollary
Any consistent theory in which every recursive property is strongly
representable is recursively undecidable.
PROOF
12.5. Corollary
Any consistent theory that includes Π_1 is recursively undecidable.
PROOF
Immediate, by Cor. 12.4 and Thm. 10.14.
12.6. Corollary
If Σ is a theory such that Σ ∪ Π_2 is consistent, then Σ is recursively
undecidable.
PROOF
PROOF
Immediate from Cor. 12.6, since Λ ∪ Π_2 = Π_2 is clearly consistent.
12.8. Remarks
(i) The consistency of Π_2 follows of course from its soundness; but it
can also be proved by more elementary arguments, without
invoking semantic notions.
12.9. Problem
Using Rem. 12.8(ii) and Prob. 8.14, obtain an alternative proof of
Thm. 8.12, not using Tarski’s Theorem.
12.10. Problem
Deduce Cor. 12.3 from Cor. 12.6.
12.11. Remarks
(i) Cor. 12.6 can be deduced from Cor. 12.5, as follows. Assume
that Σ is a theory such that Σ ∪ Π_2 is consistent.
In general, Σ ∪ Π_2 is not a theory; but Δ = Dc(Σ ∪ Π_2) is
clearly a consistent theory that includes Π_2, and hence also Π_1.
Therefore by Cor. 12.5 Δ is recursively undecidable.
Let χ be the conjunction of the nine postulates of Π_2. Then, it
is easy to show (DIY!) that, for any sentence φ,
φ ∈ Δ ⇔ χ→φ ∈ Σ.
∀v_2∀v_3 … ∀v_n[α(s_0) ∧ ∀v_1{α→α(sv_1)} → ∀v_1α],
13.2. Remark
13.3. Definition
(i) Let α ∈ Φ_n, let *𝔑 be an 𝓛-structure and let 𝔞 = (a_1, a_2, ..., a_n)
be an n-tuple of individuals in the domain *N. If α is satisfied by
some (and hence every) valuation σ based on *𝔑 such that
v_i^σ = a_i for i = 1, 2, ..., n, we write:
*𝔑 ⊨ α[𝔞].
(ii) For any 𝓛-structure *𝔑, any formula α ∈ Φ_n (with n ≥ 1) and
any a_2, a_3, ..., a_n ∈ *N, we put
13.4. Definition
If *𝔑 is an 𝓛-structure and X is any subset of *N, we say that X is
inductive in *𝔑 if it satisfies the condition:
13.5. Remarks
(i) A straightforward application of the BSD shows that
is equivalent to the condition that for all a_2, a_3, ..., a_n ∈ *N the
set M(*𝔑, α; a_2, a_3, ..., a_n) is inductive in *𝔑.
Thus, all instances of the induction postulates 13.1 hold in *𝔑
iff all sets that are parametrically definable in *𝔑 are inductive in
*𝔑.
(ii) The Principle of Induction says that every subset of N is inductive
in 𝔑. It follows that all instances of 13.1 are true (that is,
they hold in 𝔑) and hence Π is sound.
(iii) However, the present first-order induction scheme 13.1 falls far,
far short of expressing (under the standard interpretation) the full
power of the Principle of Induction. The latter states that all
subsets of N are inductive (in 𝔑). It is a second-order principle,
and was stated as such in Peano’s 1889 axiomatization of arithmetic
(cf. Rem. 6.1.8). Note that by Cantor’s Thm. 3.6.8 there
are uncountably many subsets of N.
On the other hand, our first-order induction postulates only
manage to state (under the standard interpretation) the inductiveness
of subsets of N that are parametrically definable in 𝔑,
that is, sets of the form M(𝔑, α; a_2, a_3, ..., a_n). However, it is
easy to see (by an argument similar to that used in proving Thm.
6.3.9) that there are only denumerably many such subsets of N.
FOPA is in this sense merely a pale first-order shadow of the
theory outlined by Peano.
(iv) Nevertheless, Π is an extremely strong theory. Although by
Thm. 8.12 we know that Π must be a proper subtheory of Ω, and
there must therefore exist true sentences that are not in Π, it
requires very great ingenuity to discover such sentences.
The first examples of true sentences that do not belong to Π
were given by Gödel in 1931. (We shall present his results in the
§13. First-order Peano arithmetic 265
13.6. Theorem
Π_2 ⊆ Π.
PROOF
It is enough to show that the last three postulates of Π_2 (Post. VII-IX)
belong to Π. This is not difficult. (DIY or see B&M, p. 343f.)
13.7. Problem
Prove that ∀v_1(sv_1≠v_1) ∈ Π. Hence by Prob. 11.12(ii) Π is a proper
extension of Π_2.
13.8. Remarks
(i) Let *𝔑 be a model of Π. Then *𝔑 is, in particular, also a model
of Π_1; hence by Prob. 10.8 there is a unique embedding f of 𝔑 in
*𝔑. Without loss of generality, we can assume that *𝔑 is actually
an extension of 𝔑. This amounts to assuming that N ⊆ *N and
that fn = n for every number n. Thus by Def. 3.4 we have:
*0 = 0, *s(n) = n + 1, m *+ n = m + n, m *× n = mn,
13.9. Problem
Let *𝔑 be a nonstandard model of Π. Without loss of generality,
assume that *𝔑 is an extension of 𝔑.
¹ A translation of his paper, ‘On formally undecidable propositions of Principia
mathematica and related systems I’, is printed in van Heijenoort, From Frege to Gödel.
§ 14. First Incompleteness Theorem 267
PROOF
Px ⇔_df ¬T_Σ(d(x)).
β(s_x) ∈ Ω ⇔ ¬T_Σ(d(x)).
φ ∈ Ω ⇔ φ ∉ Σ.
(*) φ ∈ Ω and φ ∉ Σ,
or
(**) φ ∉ Ω and φ ∈ Σ.
14.3. Remarks
(i) If Σ, instead of being axiomatic, is assumed to be merely
axiomatizable, then the proof shows that there exists a sentence φ
with the properties stated in the theorem, without telling us how
to obtain it.
(ii) In the proof of Thm. 14.2 we established not only that φ ∉ Σ but
also that φ ∈ Ω; hence ¬φ ∉ Ω. Since Σ is assumed to be sound,
it follows that ¬φ ∉ Σ as well. Thus neither φ nor its negation is
in Σ, showing Σ to be incomplete. For this reason Thm. 14.2 is an
incompleteness theorem.
(iii) Gödel says of φ that it is [formally] undecidable in Σ. We prefer
to say that φ is undecided by Σ, so as to avoid confusion with the
term undecidable explained in §12.
14.4. Analysis
We know that T_Σ is the property of being a SENTENCE of Σ. Moreover,
tracing through the proof of Thm. 8.10, we see that, for an axiomatic
theory Σ, T_Σ was obtained as an r.e. property by noting that, for
any x,
T_Σ(x) ⇔ x is a SENTENCE deducible from the postulates of Σ.
14.5. Theorem
Every axiomatizable complete theory is recursively decidable.
PROOF
Let Σ be an axiomatizable complete theory. Then by Thm. 8.10 T_Σ is
an r.e. property.
Also, if x is any number then, by the completeness of Σ, ¬T_Σ(x) holds iff
x is not a SENTENCE, or x is a SENTENCE whose negation belongs to Σ.
Thus
¬T_Σ(x) ⇔ ¬Frm(x, 0) ∨ T_Σ(64⌢x).
Here Frm is the recursive relation defined in Ex. 6.6(iii). Note that
Frm(x, 0) holds iff x is a SENTENCE. Note also that by Def. 6.3
T_Σ(64⌢x) holds iff x is a SENTENCE whose negation belongs to Σ.
Clearly, 64⌢x is a recursive function of x. Also, by Thm. 9.3.6
¬Frm is r.e. since Frm is recursive. Hence by Thms. 9.4.3 and
9.4.6(iii) it follows that ¬T_Σ is an r.e. property.
Therefore by Thm. 9.3.6 T_Σ is recursive.
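The idea behind this proof can be sketched as a search procedure: run through an enumeration of the theory and stop as soon as either the given sentence or its negation turns up. The sketch below is an illustration under assumptions of our own. It works on sentences directly rather than on code-numbers, it assumes that the input is a sentence and that the theory is consistent as well as complete, and the toy theory and all names (`decide`, `toy_theory`, `toy_negate`) are hypothetical stand-ins rather than anything defined in the text.

```python
from itertools import count


def decide(sentence, enumerate_theory, negate):
    """Decide membership in a complete, consistent theory given by an enumeration.

    Completeness guarantees that the sentence or its negation appears
    eventually, so the loop terminates.
    """
    refutation = negate(sentence)
    for theorem in enumerate_theory():
        if theorem == sentence:
            return True
        if theorem == refutation:
            return False


# Toy stand-in for an enumerated complete theory (assumption for illustration):
# it contains 'p0', '~p1', 'p2', '~p3', ... and nothing else.
def toy_theory():
    for n in count(0):
        yield f"p{n}" if n % 2 == 0 else f"~p{n}"


def toy_negate(s):
    """Negation on the toy literals; it strips a leading '~' instead of doubling it."""
    return s[1:] if s.startswith("~") else "~" + s
```

For example, `decide("p4", toy_theory, toy_negate)` finds p4 in the enumeration and answers True, while `decide("p5", toy_theory, toy_negate)` meets ~p5 first and answers False. Without completeness the search could run forever on a sentence that the theory leaves undecided.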
PROOF
14.7. Remark
14.8. Analysis
Consider the properties P and P’ defined in the proof of Thm. 14.6.
By definition, P’x holds iff d(x) is a SENTENCE belonging to Σ, and Px
holds iff d(x) is a SENTENCE whose negation is in Σ.
Thus, if Σ is consistent, Px and P’x are incompatible. Referring back
to the definition of the four regions in Analysis 10.13, this means that,
for a consistent Σ, Region III is empty. (The two discs in Fig. 5 do not
overlap.)
On the other hand, if Σ is the inconsistent theory, then Px and P’x
hold for exactly the same numbers x, namely for any x such that
d(x) is a SENTENCE. Thus in this case Regions I and II are empty. (The
two discs in Fig. 5 coincide.)
Also, from Analysis 10.13 we find that (under the standard interpretation)
the Gödel-Rosser sentence γ(s_{#γ}) ‘says’:
The proof of Thm. 14.6 shows that #γ cannot belong to either of the
Regions I and II. Let us see why this is so.
Suppose #γ were in Region I. Then, as we saw in Analysis 10.13,
γ(s_{#γ}) must be true. Therefore (*) is a true statement. This implies
that ¬γ(s_{#γ}) is in Σ. On the other hand, the Main Lemma tells us that
if #γ were in Region I then γ(s_{#γ}) would be in Π₁, and hence in Σ,
making Σ inconsistent, in which case Region I is empty! So #γ cannot
be in Region I.
Now suppose #γ were in Region II. Then the Main Lemma tells us
that γ(s_{#γ}) is refutable from the postulates of Π₁, hence also from
those of Σ. Therefore there is an α-witness that γ(s_{#γ}) is refutable
from the latter postulates. But since #γ is in Region II, we know from
Analysis 10.13 that γ(s_{#γ}) is false, so (*) is a false statement. This
implies that although an α-witness for the refutability of γ(s_{#γ}) in Σ
can indeed be found, this does not happen before an α’-witness for the
provability of γ(s_{#γ}) in Σ is also found. This means that γ(s_{#γ}) is both
refutable and provable from the postulates of Σ, again making Σ
inconsistent, in which case Region II is empty. So #γ cannot be there
either.
So #γ must be in Region III or in Region IV. The former happens if
Σ is the inconsistent theory. In this case γ(s_{#γ}) may be true or false,
depending on the precise form of α and α’, and in particular on the
(inconsistent) set of postulates by means of which Σ is given.
If Σ is a consistent theory, then Region III is empty, so #γ belongs
to Region IV. From Analysis 10.13 we know that in this case γ(s_{#γ}) is
a false sentence. This can also be seen from the proof of Thm. 14.6,
which shows that if Σ is consistent then γ(s_{#γ}) is neither provable nor
refutable from the postulates of Σ. Therefore (*) is an untrue statement,
and γ(s_{#γ}) is a false sentence.
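The case analysis above turns on comparing when witnesses are found. As a rough illustration only, the Python sketch below evaluates one assumed reading of (*): some witness for refutability exists, and no witness for provability is found at or before the earliest such witness. The exact wording of (*) belongs to Analysis 10.13 and is not reproduced here, so this reading, the modelling of witnesses as natural numbers in a finite list, and the name star_holds are all assumptions of the sketch.

```python
from typing import Iterable


def star_holds(refutation_witnesses: Iterable[int],
               proof_witnesses: Iterable[int]) -> bool:
    """Assumed reading of (*): a refutation witness exists and no proof
    witness is found at or before the earliest refutation witness."""
    refs = set(refutation_witnesses)
    proofs = set(proof_witnesses)
    if not refs:                    # nothing refutes the sentence
        return False
    first = min(refs)               # earliest refutation witness
    return all(z > first for z in proofs)


# Consistent case: the sentence is neither provable nor refutable.
print(star_holds([], []))          # False
# A refutation witness is found and there is no proof witness at all.
print(star_holds([5], []))         # True
# A proof witness turns up no later than the refutation witness.
print(star_holds([5], [3]))        # False
```

The first call mirrors the consistent case discussed above: with no witnesses of either kind, (*) is untrue, so the sentence that ‘says’ (*) is false.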
Referring to Prel. 10.11 (with n=1), it is easy to see that for any
number k we have both y(s;,) + dyB(s;,) and JyB(s,) | a(s;,). Hence
+ a(s,)>(s;). Using this fact for k = #y, it follows from (3) that
PROOF
15.2. Remarks
(i) The Second Incompleteness Theorem can be extended to all
sufficiently strong formal theories, in ℒ and other languages. All
that is required is that the theory in question is axiomatic, and
includes an appropriate ‘translation’ of Π. For example, this
result applies to all the usual formalizations of set theory, such as
ZF.
(ii) The result means that the consistency of any sufficiently strong
consistent axiomatic theory cannot be proved by means of argu-
ments that are wholly formalizable within that theory.
(iii) This poses a grave difficulty for the formalist view of mathema-
tics. For a brief discussion of this, see B&M, p. 358f.
(iv) In particular, if ZF is consistent, a proof of this fact cannot be
carried out within ZF itself. For this reason, it is extremely
unlikely that an intuitively convincing consistency proof for ZF
can ever be found.
This comes close to saying—but does not quite say—that set theory is
the sole foundation of the whole of mathematics. But soon such radical
claims were voiced. In 1910 Hermann Weyl² put forward the view that
the whole of mathematics ought to be reduced to axiomatic set theory.
Each notion in the other branches of mathematics must be defined
explicitly in terms of previously defined notions. This regress stops
with set theory; ultimately all mathematical notions are to be defined
in set-theoretic terms.
1 Cited in §2 of Ch. 1.
2 The paper, ‘Über die Definitionen der mathematischen Grundbegriffe’, is reprinted in
his Gesammelte Abhandlungen (1968). In this paper Weyl outlines a characterization of
the notion definite property, which he was to make more precise eight years later in
Das Kontinuum (cited in §2 of Ch. 1). The lines quoted here were translated by
Michael Hallett.
276 Appendix: Skolem’s Paradox
of an axiom system. Thus axiomatic set theory (more or less along the
lines proposed by Zermelo) becomes the ultimate framework for the
whole of mathematics.
Although Weyl was to change his mind, the reductionist view he had
expressed in 1910 was rapidly becoming very widespread among
mathematicians.
It was this reductionism that Skolem set out to criticize in 1922. His
short paper¹ (the text of an address delivered at a congress of
Scandinavian mathematicians) contains a lucid presentation of an
astonishing wealth of logical and set-theoretic ideas and insights.² But
in Skolem’s own view the most important result in his paper is what came
to be known as Skolem’s Paradox. It is the first of the fundamental
limitative results in logic. In a Concluding Remark he comments on it:
1 In 1922 Fopcal had not been finalized (this was done in 1928 by David Hilbert and
Wilhelm Ackermann). When Skolem assumes ZF to be ‘consistent’, he means that it is
satisfiable. He then invokes the Löwenheim-Skolem Theorem (which he proves
directly, using relatively elementary means) to obtain a denumerable model for ZF.
logically deduced from its postulates. The postulates, and they alone,
must determine whether or not a given interpretation of the extra-
logical symbols of the theory is legitimate: an interpretation is legiti-
mate iff it satisfies the postulates.
(1) â = {x : x E a}.
We call â the E-extension of a. Clearly, â is a genuine set, in fact a
subset of U; and we have, for all x,
(2) x ∈ â ⇔ x E a.
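To make the definition concrete, here is a small Python sketch of the E-extension in a finite toy structure. It is only an illustration under assumptions: the universe, the relation E (given as a set of ordered pairs (x, a) meaning x E a) and the name extension are all hypothetical and chosen for this example.

```python
from typing import Hashable, Set, Tuple


def extension(a: Hashable,
              universe: Set[Hashable],
              E: Set[Tuple[Hashable, Hashable]]) -> Set[Hashable]:
    """Return the genuine set of all x in the universe with x E a."""
    return {x for x in universe if (x, a) in E}


# Hypothetical toy structure with four objects.
universe = {0, 1, 2, 3}
E = {(0, 1), (0, 2), (1, 2)}        # 0 E 1, 0 E 2, 1 E 2

ext2 = extension(2, universe, E)
print(ext2)                         # the objects that are E-members of 2
print(ext2 <= universe)             # the extension is a subset of U
```

By construction, an object x belongs to the computed set exactly when x E a holds, which is the content of (2).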
1 Note the ironic double role played by Cantor’s Theorem. On the one hand, the fact
that Cantor’s Theorem holds inside U (that is, under the interpretation U) gave rise to
the paradox in the first place, because it was used to give us an uncountable set (in the
sense of U). Now we are using the fact that Cantor’s Theorem holds ‘in the real world’
in order to resolve the paradox.
§3. The paradox and its resolution
Let us see how these observations help to resolve the paradox. In his
universe, Hugh finds an object ω^U that is ‘the set of finite ordinals’ in
his sense (ω^U satisfies, in the interpretation U, the formal set-theoretic
definition of the set of finite ordinals). Of course, ω^U may not ‘really’
be the set of finite ordinals; but it is quite easy to see that its
E-extension is in fact denumerable. Now, Hugh has found another
object (U-set) c, which serves as the U-power-set of ω^U, and he can
prove that c is uncountable. We, on the other hand, can prove that c
has only countably many U-members. Who is right?
In fact, both he and we are right. He is right because there does not
exist any U-set that constitutes an injection from c to ω^U in the sense
of the interpretation U. We, on the other hand, are right because the set
ĉ (the E-extension of c) is countable in the sense of our external
world. In fact, we can prove that there exists an injection f from ĉ to
the E-extension of ω^U. However, this f is purely external; it exists in
the outside world, but it cannot be the E-extension of any U-set.
Indeed, if f were not purely external then it would be quite easy to
show that c is countable in the sense of U.
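The two claims in this resolution can be set side by side in symbols. The LaTeX below is only a schematic restatement of the paragraph above, not a formula from the text; the hat notation for E-extensions and the letter g for an arbitrary U-set are assumptions of this sketch, and \text requires the amsmath package.

```latex
% Internal claim (Hugh's): no U-set is, in the sense of U,
% an injection from c to \omega^{U}.
U \models \neg \exists g \,\bigl( g \text{ is an injection from } c \text{ to } \omega^{U} \bigr)

% External claim (ours): a genuine injection exists between the E-extensions.
\exists f \,\bigl( f \colon \hat{c} \to \widehat{\omega^{U}} \text{ is an injection} \bigr)

% Why the two are compatible: f is purely external.
\forall g \in U \;\; f \neq \hat{g}
```

The first line is a statement evaluated inside the interpretation, while the second and third are statements about genuine sets in the external world, so there is no contradiction between them.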
So the paradox is resolved — but not very happily. It is disappointing
to find that axiomatic set theory, if consistent, has such perverse
models, in which an object that is really quite modest in size can seem
huge.
As Skolem himself pointed out, countability is by no means the only
important set-theoretic notion that is relative in this sense. For exam-
ple, the notion of finiteness is also relative: we can have a model U
(even a denumerable one) in which a U-set a may be finite in the
internal sense of U, while in fact a has infinitely many U-members.
Indeed, by an argument like that used in the proof of Skolem’s Thm.
10.3.8 we can show that ZF has a model U (with denumerable universe)
such that the object ω^U, the U-set-of-finite-ordinals, is nonstandard.
This means that, in addition to U-members of the form n^U for each
natural number n (that is, U-cardinals corresponding to the natural
numbers), ω^U also has U-members that do not correspond to any
natural number. If a is such a nonstandard U-member of ω^U then a is
a U-finite-ordinal: it satisfies in U the formal definition of the notion
finite ordinal (the formalization of the first part of Def. 4.3.1). In
particular, a is U-finite. But, as seen from outside U, a actually has
infinitely many U-members, and so a is really (really?) an infinite set!
(Cf. Warning 6.1.9.)
This has an important bearing on the issue raised in Rem. 10.3.10 in
connection with Skolem’s Theorem. The theorem says that the structure
𝔑 of natural numbers cannot be characterized uniquely (up to
isomorphism) in the first-order language of arithmetic.
Now, Dedekind showed that the system of natural numbers can be
characterized uniquely in set-theoretic terms (cf. Rem. 4.3.8(i)). Fol-
lowing him, Peano also formulated his axiomatization of that system
using variables ranging over all sets of natural numbers (cf. Rem.
10.13.5(iii)). These, then, are characterizations of the system of natural
numbers within an ambient set theory. And they seem to work, in the
sense that in a sufficiently strong set theory it can be shown that
Peano’s axioms have (up to isomorphism) a unique model (cf. Rem.
Osis)
However, these set-theoretic characterizations are all relative: they
merely pass the buck to set theory. And now we see that set theory
itself has strange (nonstandard) models. Hugh may be very pleased to
find that in his world there is (essentially) just one ‘system of natural
numbers’ satisfying Peano’s second-order postulates. But we, from our
external vantage point, can see that this U-system-of-natural-numbers
is in fact (in fact?) nonstandard, containing infinite unnatural numbers,
which merely seem finite to Hugh.
This is an introduction to set theory and logic that starts completely
from scratch. The text is accompanied by many methodological
remarks and explanations. A rigorous axiomatic presentation of
Zermelo-Fraenkel set theory is given, demonstrating how the basic
concepts of mathematics have apparently been reduced to set
theory. This is followed by a presentation of propositional and first-
order logic. Concepts and results of recursion theory are explained in
intuitive terms, and the author proves and explains the limitative
results of Skolem, Tarski, Church and Gödel (the celebrated
incompleteness theorems).
For students of mathematics or philosophy this book provides an
excellent introduction to logic and set theory.
ISBN 0-521-47493-0
CAMBRIDGE
UNIVERSITY PRESS