A Course in Mathematical Logic - Bell, J - L - (John Lane) Machover, Moshé - 1977 - Amsterdam - North-Holland Pub - Co - New York - Sole

NUNC COGNOSCO EX PARTE

THOMAS J. BATA LIBRARY


TRENT UNIVERSITY
Digitized by the Internet Archive
in 2019 with funding from
Kahle/Austin Foundation

https://archive.org/details/courseinmathemat0000bell
A COURSE IN
MATHEMATICAL LOGIC
by

J. L. BELL
London School of Economics
and Political Science

and

M. MACHOVER
Chelsea College of Science
and Technology

1977

NORTH-HOLLAND PUBLISHING COMPANY, AMSTERDAM • NEW YORK • OXFORD


© North-Holland Publishing Company — 1977

All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photocopying,
recording or otherwise, without the prior permission of the copyright owner.

Library of Congress Catalog Card Number: 75-33890


North-Holland ISBN: 0 7204 2844 0

Published by:
North-Holland Publishing Company
Amsterdam • New York • Oxford

Sole distributors for the U.S.A. and Canada:


Elsevier North-Holland, Inc.
52 Vanderbilt Avenue
New York, N. Y. 10017

Library of Congress Cataloging in Publication Data

Bell, John Lane.
A course in mathematical logic.

Bibliography: p. 576.
Includes index.
1. Logic, Symbolic and mathematical. I. Machover, Moshe, joint author. II. Title.
QA9.B3953 511'.3 75-33890

Printed in Hungary
On the contrary, I find nothing in logistic for the discoverer but shackles. It does not help
us at all in the direction of conciseness, far from it; and if it requires 27 equations to
establish that 1 is a number, how many will it require to demonstrate a real theorem?

H. Poincaré

Although it is a distinctly minor issue, we must mention Fibonacci’s famous recurring
series.... There is an extensive literature, some of it bordering on the eccentric, concerning
these numbers.... Some professional and dilettant esthetes have applied Fibonacci’s
numbers to the mathematical dissection of masterpieces in painting and sculpture with results
not always agreeable, although sometimes ludicrous, to creative artists. Others have
discovered these protean numbers in religion, phyllotaxis, and the convolutions of sea shells.

E. T. Bell

Infinitesimals as explaining continuity must be regarded as unnecessary, erroneous, and
self-contradictory.
B. Russell

Dedicated to the memory of
Andrzej Mostowski
and
Abraham Robinson
ACKNOWLEDGEMENTS

In the course of producing this book we have become indebted to many
people. To begin with, we would like to put on record our intellectual
debt to those logicians and mathematicians whose work we have expounded:
in a work of this kind it would not be feasible to attribute each
result to its creator, but we hope that the historical references at the end
of each chapter will furnish a general (if sketchy) guide to who proved
what.
Our students have been very helpful in furnishing comments and criticism;
in this connection we would particularly like to thank Michael
Bate, Narciso Garcia and Ali al-Nowaihi.
We are also grateful to our colleagues Daniel Leivant, Dirk de Jongh
and Brian Rotman for reading sections of the manuscript and making
useful suggestions.
The job of typing the manuscript was, to say the least, an arduous
and tricky one and for skilfully carrying out this operation we are indebted
to Buffy Fennelly, Diane Roberts, Barbara Silver and Marie-Louise
Varichon.
We would also like to thank the printers, and the staff of North-Holland
Publishing Company, in particular Einar Fredriksson and Thomas
van den Heuvel, for the competent and friendly way they have handled
all stages in the production of this book.
Finally, we offer our warmest thanks to our wives Mimi and Ilana for,
among many other things, their patience and encouragement throughout
the long writing period and for their help in preparing the bibliography.

John Bell
Moshe Machover
CONTENTS

Acknowledgements
Interdependence scheme for the chapters
Introduction
Recommended reading

Chapter 0. Prerequisites

Chapter 1. Beginning mathematical logic
§1. General considerations
§2. Structures and formal languages
§3. Higher-order languages
§4. Basic syntax
§5. Notational conventions
§6. Propositional semantics
§7. Propositional tableaux
§8. The Elimination Theorem for propositional tableaux
§9. Completeness of propositional tableaux
§10. The propositional calculus
§11. The propositional calculus and tableaux
§12. Weak completeness of the propositional calculus
§13. Strong completeness of the propositional calculus
§14. Propositional logic based on $\neg$ and $\wedge$
§15. Propositional logic based on $\neg$, $\wedge$ and $\vee$
§16. Historical and bibliographical remarks

Chapter 2. First-order logic
§1. First-order semantics
§2. Freedom and bondage
§3. Substitution
§4. First-order tableaux
§5. Some “book-keeping” lemmas
§6. The Elimination Theorem for first-order tableaux
§7. Hintikka sets
§8. Completeness of first-order tableaux
§9. Prenex and Skolem forms
§10. Elimination of function symbols
§11. Elimination of equality
§12. Relativization
§13. Virtual terms
§14. Historical and bibliographical remarks

Chapter 3. First-order logic (Continued)
§1. The first-order predicate calculus
§2. The first-order predicate calculus and tableaux
§3. Completeness of the first-order predicate calculus
§4. First-order logic based on $\exists$
§5. What have we achieved?
§6. Historical and bibliographical remarks

Chapter 4. Boolean Algebras
§1. Lattices
§2. Boolean algebras
§3. Filters and homomorphisms
§4. The Stone Representation Theorem
§5. Atoms
§6. Duality for homomorphisms and continuous mappings
§7. The Rasiowa-Sikorski Theorem
§8. Historical and bibliographical remarks

Chapter 5. Model theory
§1. Basic ideas of model theory
§2. The Löwenheim-Skolem Theorems
§3. Ultraproducts
§4. Completeness and categoricity
§5. Lindenbaum algebras
§6. Element types and $\aleph_0$-categoricity
§7. Indiscernibles and models with automorphisms
§8. Historical and bibliographical remarks

Chapter 6. Recursion theory
§1. Basic notation and terminology
§2. Algorithmic functions and functionals
§3. The computer URIM
§4. Computable functionals and functions
§5. Recursive functionals and functions
§6. A stockpile of examples
§7. Church’s Thesis
§8. Recursiveness of computable functionals
§9. Functionals with several sequence arguments
§10. Fundamental theorems
§11. Recursively enumerable sets
§12. Diophantine relations
§13. The Fibonacci sequence
§14. The power function
§15. Bounded universal quantification
§16. The MRDP Theorem and Hilbert’s Tenth Problem
§17. Historical and bibliographical remarks

Chapter 7. Logic: Limitative results
§1. General notation and terminology
§2. Nonstandard models of $\Omega$
§3. Arithmeticity
§4. Tarski’s Theorem
§5. Axiomatic theories
§6. Baby arithmetic
§7. Junior arithmetic
§8. A finitely axiomatized theory
§9. First-order Peano arithmetic
§10. Undecidability
§11. Incompleteness
§12. Historical and bibliographical remarks

Chapter 8. Recursion theory (Continued)
§1. The arithmetical hierarchy
§2. A result concerning Tq
§3. Encoded theories
§4. Inseparable pairs of sets
§5. Productive and creative sets; reducibility
§6. One-one reducibility; recursive isomorphism
§7. Turing degrees
§8. Post’s problem and its solution
§9. Historical and bibliographical remarks

Chapter 9. Intuitionistic first-order logic
§1. Preliminary discussion
§2. Philosophical remark
§3. Constructive meaning of sentences
§4. Constructive interpretations
§5. Intuitionistic tableaux
§6. Kripke’s semantics
§7. The Elimination Theorem for intuitionistic tableaux
§8. Intuitionistic propositional calculus
§9. Intuitionistic predicate calculus
§10. Completeness
§11. Translations from classical to intuitionistic logic
§12. The Interpolation Theorem
§13. Some results in classical logic
§14. Historical and bibliographical remarks

Chapter 10. Axiomatic set theory
§1. Basic developments
§2. Ordinals
§3. The Axiom of Regularity
§4. Cardinality and the Axiom of Choice
§5. Reflection Principles
§6. The formalization of satisfaction
§7. Absoluteness
§8. Constructible sets
§9. The consistency of AC and GCH
§10. Problems
§11. Historical and bibliographical remarks

Chapter 11. Nonstandard analysis
§1. Enlargements
§2. Zermelo structures and their enlargements
§3. Filters and monads
§4. Topology
§5. Topological groups
§6. The real numbers
§7. A methodological discussion
§8. Historical and bibliographical remarks

Bibliography
General index
Index of symbols
INTERDEPENDENCE SCHEME FOR THE CHAPTERS
INTRODUCTION

For the past seven years, the authors have conducted a one-year M.Sc.
programme in mathematical logic and foundations of mathematics at
London University. The present book developed from our lecture notes
for this programme, and the student should therefore be able to work
through the text in (roughly) one academic year. The main problem that
we faced in constructing the programme was the following. First, we
wanted it to be an integrated and balanced account of the most important
aspects of logic and foundations. But secondly, since parts of our programme
are taken by mathematics and philosophy of science students who for
one reason or another do not want to cover all the topics we discuss, we
were led to arrange it in such a way that parts could be taken as separate
smaller courses. Accordingly, the book itself falls naturally into several
units:

1. Chapters 1-3. These together constitute an elementary introduction
to mathematical logic up to the Gödel-Henkin completeness theorem.
We teach this part in a fairly leisurely way (four hours per week for ten
weeks, including problem classes), and accordingly the pace of the text
here is rather gentle. There is one feature which deserves special mention
and that is the use of Smullyan’s tableau method. This method serves
a dual purpose. First, it is a proof-theoretic instrument that allows us
to obtain constructive proofs of various results. In this respect it is equivalent
to Gentzen’s calculus and to various systems of natural deduction.
Secondly, our teaching experience shows that Smullyan’s method has
the great advantage of being a practical tool: after a little practice,
it furnishes a quick, efficient and almost computational method of actually
detecting the truth or falsehood of formulas. (This efficiency stems in
part from the fact that, unlike Gentzen’s calculus, it does not require
the same formula to be copied again and again.) However, the material
on tableaux has in fact been isolated in separate (starred) sections so
that the reader who does not want to use this material can simply ignore it;
what remains is a self-contained standard account of first-order logic.
A middle course is also possible: a reader wishing to enjoy the practical
advantages of tableaux but who lacks the time or patience for the somewhat
complex constructive proofs of elimination theorems (Ch. 1, §8, and Ch. 2,
§§5, 6) can skip the latter because the same results are also obtained in
an easier but non-constructive way elsewhere (Ch. 1, §9 and Ch. 2, §8).
We should like to point out that the somewhat rebarbative complexities
of Ch. 2, §5 could have been avoided by using different symbols for free
and bound variables (as is often done in texts devoted mainly to proof
theory). This, however, would detract slightly from the practical utility
of the method and in any case would be contrary to accepted usage in
most other branches of logic.

2. Chapters 4 and 5. The contents of Chapter 4 are taught for 1 hour
a week over 10 weeks, concurrently with the material in Chapters 1-3
(of which it is totally independent). It in fact constitutes a separate short
course on Boolean algebras. The material in Chapter 5 — model theory —
is taught over the following 10 weeks for 2 hours a week. It depends heavily
on Chapter 4 but only slightly on Chapters 1-3 inasmuch as it can be read
by anyone having modest acquaintance with the notation and main results
of first-order logic.

3. Chapters 6 and 8. These two chapters constitute a self-contained
course on recursion theory. The material in Chapter 6 is taught for 2 hours
a week over 10 weeks, concurrently with Chs. 1-4, of which it is totally
independent. There are two points here which call for comment. First,
we employ register machines instead of Turing machines, because the
former are much closer in spirit to actual digital computers, and are also
smoother theoretically. Secondly, this chapter includes a full proof of
the Matiyasevich-Robinson-Davis-Putnam (MRDP) theorem that every
recursively enumerable relation is diophantine. We believe that — despite
the length and tedium of the proof — this result is of such importance
that no modern account of recursion theory can afford to omit it. In
teaching this part of the chapter, we have found that some of the material
in §§13, 14, and the first half of §15 can be omitted in class and given to the
student to study at home. As for Chapter 8, it is taught for the following
eight weeks at a rate of 2 hours per week. The material here of course
depends entirely on Chapter 6, but in this book it appears after Chapter 7,
because it is motivated by and illuminates the results contained there.
However, no detailed knowledge of Chapter 7 is required to understand
Chapter 8. In Chapters 6 and 8 we have adopted a somewhat formal
approach: in proving that such-and-such a function is recursive, we employ
the precise apparatus furnished by the recursion theorem, rather than
the intuitive “proof by Church’s thesis”. We have chosen this course
because we believe that the beginning student has not yet developed
sufficient experience in the subject to be totally convinced by intuitive
proofs which employ Church’s thesis.

4. Chapter 7. This chapter contains an account of the limitative results
about formal mathematical systems. Reliance on the MRDP theorem
has enabled us to simplify some of the proofs and obtain somewhat sharper
results than usual. The chapter presupposes a good knowledge of first-
order logic and some knowledge of recursion theory. However, it can be
and is taken by students who have no detailed acquaintance with the
latter. We have found it feasible to develop all the requisite results from
recursion theory — except the MRDP theorem — using Church’s thesis,
the MRDP theorem itself being stated without proof. This approach
enables us to teach the material in this chapter intelligibly to students
who do not want to take a full-fledged course in recursion theory. (The
material here is in fact taught concurrently with Chapter 5 for 10 weeks
at 2 hours per week.)

5. Chapter 9. Here we have an elementary introduction to first-order
intuitionistic logic. While neither of the authors claims to be an expert
on intuitionism, we nevertheless believe that the constructivist approach
to mathematics is of such great importance that some discussion of it is
essential. (The material in this chapter is taught concurrently with Chapters
5, 7 and 8 for 10 weeks at 1 hour per week.)

6. Chapter 10. This is devoted to an axiomatic investigation of Zermelo-Fraenkel
set theory, up to the relative consistency of the axiom of choice
and the generalized continuum hypothesis. It requires modest familiarity
with first-order logic and with the Löwenheim-Skolem theorem in Chapter 5.
(This material is taught over roughly 10 weeks at 2 hours per week at the
end of the second term and the beginning of the final term.)

7. Chapter 11. This chapter contains an introduction to nonstandard
analysis, which is an important method of applying model theory to
mathematics. The material here is taught over 10 weeks at 2 hours per week,
during the latter part of the year. Although this chapter presupposes a few
results of model theory, these results can be stated concisely without proof
for the benefit of those students who wish to study the subject without doing
a special course on model theory. In fact it is possible to teach nonstandard
analysis to students who have only a slender acquaintance with logic.

As can be seen from the foregoing synopsis, the material in the book
can be regarded as forming several relatively independent units. However,
the book has been conceived as an organic whole, and provides what is
in our view a “balanced diet”. We have striven to reveal the interplay
between “structural” (i.e. set-theoretical) ideas and “constructive” methods.
The latter play a particularly prominent role in mathematical logic, and
we have therefore stressed the constructive approach where appropriate
but without, we hope, undue fanaticism.
The problems constitute an essential part of the book. They are not
mere brainteasers, nor should they be too difficult for the student to solve,
given the hints that are provided. Many of them contain results which
are later employed in proofs of theorems. Accordingly, no unstarred
problem should be skipped!
Certain sections and problems are starred. This does not necessarily
indicate that they are more difficult, but rather that they may be omitted
at a first reading. Some problems have been starred because they require
more knowledge or skill than is needed for understanding the text in
the same section.
Each chapter is divided into sections. When we want to refer to a theorem,
problem, definition, etc., within the same chapter, we give the number
of the section in which it occurs, followed by its number in that section.
Thus, e.g., Def. 2.10.1 is the first numbered statement in §10 of Ch. 2 and
within Ch. 2 it is referred to as “Def. 10.1.” (or simply “10.1”).
We use the convenient abbreviation “iff” for “if and only if”. The
mark ■ is used either to signify the end of a proof or, when it appears
immediately after the statement of a result, to indicate that the proof is
immediate and is accordingly omitted.
References to the bibliography are given thus: Kelley [1955]. The
overwhelming majority of references to the bibliography are given in
a separate section at the end of each chapter.
RECOMMENDED READING

In addition to the special works referred to in the text, we would like
to recommend the following books as general reading in logic and the
foundations of mathematics.
Benacerraf and Putnam [1964]. (An excellent anthology of seminal
works in the philosophy of mathematics.)
Davis [1965]. (An anthology of important papers on limitative theorems
and recursion theory.)
Fraenkel, Bar-Hillel and Levy [1973]. (A comprehensive survey
of the foundations of mathematics, and set theory in particular.)
Van Heijenoort [1967]. (Another excellent anthology containing many
of the classic papers in logic and foundations.)
Kneebone [1963]. (An elementary but scholarly general introduction
to and survey of the foundations and philosophy of mathematics.)
Kreisel and Krivine [1967]. (The appendices to this work contain
a penetrating discussion of the philosophical foundations of mathematics.)
Lakatos [1967]. (A collection of papers on the foundations and philosophy
of mathematics delivered at the 1965 London colloquium.)
Mostowski [1966]. (A wide-ranging survey of the development of mathematical
logic from 1930 to the 1960’s.)
CHAPTER 0

PREREQUISITES

In this book we assume that the reader is familiar with the basic facts of
naive set theory (including the fundamentals of cardinal and ordinal
arithmetic) as presented, e.g., in Fraenkel [1961], Halmos [1960] or
Kuratowski-Mostowski [1968]. Facts about cardinals and ordinals
are used at the end of Ch. 3, occasionally in Chs. 4 and 9, and throughout
Chs. 5 and 10. In some places (especially in Chs. 4 and 11), we assume
a slender acquaintance with the basic notions of general topology as
presented, e.g., in Ch. 1 of Bourbaki [1961] or the first few chapters of
Kelley [1955].
We distinguish between classes and sets. Except in Chs. 10 and 11 (where
the terms “class” and “set” are assigned a more precise technical meaning),
a class is understood to be an arbitrary collection of objects, while a set
is a class which can be a member of another class. (Another distinguishing
feature of sets is that only they have cardinalities.)
Given an object $a$ and a class $X$, we write as usual $a \in X$ for “$a$ is a member
(element, point) of $X$” and say “$X$ contains $a$” or “$a$ is in $X$”. If $X$ contains
every member of a class $Y$, we say “$X$ includes $Y$” and write $Y \subseteq X$. Two
classes are regarded as identical if they have the same members.
The set of natural numbers (which contains 0) is denoted by $N$ or $\omega$.
Except in Ch. 10, the empty set is denoted by $\emptyset$. If $A$ is a set, the power set
$PA$ of $A$ is the set of all subsets of $A$.
Given $n \geq 1$ objects $a_1,\dots,a_n$, we write $\langle a_1,\dots,a_n \rangle$ for the ordered $n$-tuple
of $a_1,\dots,a_n$. Thus $\langle x,y \rangle$ is the ordered pair of $x$ and $y$. By convention, we
put $\langle a \rangle = a$ (the ordered singleton of $a$).
The Cartesian product of a finite sequence of classes $A_1,\dots,A_n$ (with
$n \geq 1$), denoted by $A_1 \times \dots \times A_n$, is the collection of all $n$-tuples $\langle a_1,\dots,a_n \rangle$
with $a_1 \in A_1,\dots,a_n \in A_n$. If each $A_i$ is identical with a fixed class $A$, we
write $A^n$ for $A_1 \times \dots \times A_n$. By convention, we set $A^0 = \{\emptyset\}$; thus $A^0$ has
exactly one member, namely $\emptyset$.
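
For finite classes, the Cartesian product and the power $A^n$ can be enumerated directly. The sketch below is an illustration only: the function names are hypothetical, and Python tuples stand in for ordered $n$-tuples, so the empty tuple `()` plays the role that $\emptyset$ plays in the convention $A^0 = \{\emptyset\}$. Note also that a Python 1-tuple `(a,)` is distinct from `a`, unlike the convention $\langle a \rangle = a$ adopted above.

```python
from itertools import product


def cartesian_product(*classes):
    """All tuples (a_1, ..., a_n) with a_i drawn from the i-th class."""
    return set(product(*classes))


def cartesian_power(A, n):
    """A^n: the n-fold Cartesian product of A with itself."""
    return set(product(A, repeat=n))


A = {0, 1}
B = {"x", "y", "z"}

print(len(cartesian_product(A, B)))  # 2 * 3 = 6 ordered pairs
print(len(cartesian_power(A, 3)))    # 2 ** 3 = 8 ordered triples
print(cartesian_power(A, 0))         # {()}: exactly one member
```

The last line mirrors the remark that $A^0$ has exactly one member, whatever $A$ is.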
For $n \geq 1$, an $n$-ary relation on a class $A$ is a collection of $n$-tuples of
members of $A$, i.e. a subclass of $A^n$. A unary relation on $A$ is called a
property; it is just a subclass of $A$. The identity (or diagonal) relation
on $A$ is the binary relation

$\{\langle x,x \rangle : x \in A\}$.

The membership relation on $A$ is the binary relation

$\{\langle x,y \rangle : x \in A \text{ and } y \in A \text{ and } x \in y\}$.

If $R$ is an $n$-ary relation on $A$ and $B \subseteq A$, the restriction of $R$ to $B$ is defined
to be the $n$-ary relation $R \cap B^n$ on $B$. If $R$ is a binary relation, we often
write $xRy$ for $\langle x,y \rangle \in R$.
A function (map, mapping) is a class $f$ of ordered pairs such that, whenever
$\langle x,y \rangle \in f$ and $\langle x,z \rangle \in f$, we have $y = z$. The domain $\operatorname{dom}(f)$ of $f$ is the class

$\{x : \text{for some } y,\ \langle x,y \rangle \in f\}$

and the range $\operatorname{ran}(f)$ of $f$ is the class

$\{y : \text{for some } x,\ \langle x,y \rangle \in f\}$.

If $f$ is a function, and $x \in \operatorname{dom}(f)$, then the unique $y$ for which $\langle x,y \rangle \in f$
is denoted (except in Ch. 10) by $f(x)$, or sometimes $fx$, etc., and is called
the value of $f$ at $x$. Sometimes we specify a function $f$ in terms of its values;
under these conditions we write $x \mapsto f(x)$. (Thus, for example, $x \mapsto x+1$
describes the successor function on $N$.) If $f$ is a function such that $\operatorname{dom}(f) = A$
and $\operatorname{ran}(f) \subseteq B$, we say that $f$ is a function from $A$ to (into) $B$ and write
$f : A \to B$. If $f : A \to B$ and $X \subseteq A$, we define the restriction $f|X : X \to B$ by

$(f|X)(x) = f(x)$ for $x \in X$.

If $X \subseteq A$ and $Y \subseteq B$, we put

$f[X] = \{f(x) : x \in X\}$, $f^{-1}[Y] = \{x : f(x) \in Y\}$,

and, for $y \in Y$, we put

$f^{-1}(y) = f^{-1}[\{y\}]$.
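
The notions of restriction, image and inverse image can be tried out on a finite function. In the sketch below a Python `dict` stands in for a function regarded as a class of ordered pairs (each key occurs once, which is exactly the functionality condition); the helper names are hypothetical and chosen only for this illustration.

```python
def restrict(f, X):
    """The restriction f|X; assumes X is a subset of dom(f)."""
    return {x: f[x] for x in X}


def image(f, X):
    """f[X] = {f(x) : x in X}; assumes X is a subset of dom(f)."""
    return {f[x] for x in X}


def preimage(f, Y):
    """The inverse image of Y: {x in dom(f) : f(x) in Y}."""
    return {x for x in f if f[x] in Y}


# A function from {0, 1, 2} into {"a", "b", "c"}.
f = {0: "a", 1: "b", 2: "a"}

print(set(f))                # dom(f)
print(set(f.values()))       # ran(f), here a proper subset of {"a", "b", "c"}
print(restrict(f, {0, 1}))   # f|{0, 1}
print(image(f, {0, 2}))      # f[{0, 2}]
print(preimage(f, {"a"}))    # the inverse image of {"a"}
```

Here the inverse image of `{"c"}` is empty, since `"c"` is not in the range of `f`.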

A function $f : A \to B$ is one-one (an injection) if $f(x) = f(y)$ implies $x = y$
for all $x,y \in A$; onto (a surjection) if $f[A] = B$; and a one-one correspondence
(a bijection) between $A$ and $B$ if both of these conditions hold. $A$ and $B$
are said to be equipollent if there is a bijection between $A$ and $B$. If $f : A \to B$
and $g : B \to C$, the composition $g \circ f : A \to C$ is defined by $(g \circ f)(x) = g(f(x))$
for $x \in A$. We sometimes omit the $\circ$ and write simply $gf$ instead of $g \circ f$.
Observe that, for each class $A$, the identity relation on $A$ is a bijection
between $A$ and $A$; for this reason it is also called the identity map on $A$.
If $A \subseteq B$, the natural injection of $A$ into $B$ is the map $i : A \to B$ defined by
$i(x) = x$ for $x \in A$.
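
For finite functions these properties can be tested mechanically. The following is a minimal sketch, again representing a function by a Python `dict`; the names `is_injective`, `is_surjective`, `is_bijective` and `compose` are assumptions made for the example, not notation from the text.

```python
def is_injective(f):
    """One-one: distinct arguments have distinct values."""
    values = list(f.values())
    return len(values) == len(set(values))


def is_surjective(f, B):
    """Onto B: the image of the whole domain equals B."""
    return set(f.values()) == set(B)


def is_bijective(f, B):
    """A one-one correspondence between dom(f) and B."""
    return is_injective(f) and is_surjective(f, B)


def compose(g, f):
    """The composition g o f, with (g o f)(x) = g(f(x)) for x in dom(f)."""
    return {x: g[f[x]] for x in f}


f = {0: "a", 1: "b", 2: "c"}       # a bijection between {0,1,2} and {"a","b","c"}
g = {"a": 10, "b": 20, "c": 10}    # onto {10, 20} but not one-one

print(is_bijective(f, {"a", "b", "c"}))
print(is_injective(g), is_surjective(g, {10, 20}))
print(compose(g, f))

# The identity map on A is a bijection between A and A.
A = {0, 1, 2}
identity = {x: x for x in A}
print(is_bijective(identity, A))
```

Equipollence of two finite sets then amounts to the existence of some `dict` for which `is_bijective` holds, which for finite sets is the case exactly when they have the same number of members.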
If $A$ is a class and $I$ is a set, we write $A^I$ for the collection of all functions
from $I$ into $A$. (Notice that this definition implies $A^\emptyset = \{\emptyset\} = A^0$.) If $\{A_i : i \in I\}$
is an indexed family of sets, we write $\prod_{i \in I} A_i$ for the collection of all
functions $f$ with domain $I$ such that $f(i) \in A_i$ for all $i \in I$. The axiom of
choice asserts that, if each $A_i \neq \emptyset$, then $\prod_{i \in I} A_i \neq \emptyset$.
For any $n \in N$, an $n$-ary operation on a class $A$ is a function from $A^n$ to $A$.
In particular, a 0-ary operation on $A$ is a function from $\{\emptyset\}$ to $A$, and
therefore has a unique value which we identify with the given 0-ary operation.
Thus a 0-ary operation on $A$ is just a member of $A$. If $f$ is an $n$-ary operation
on $A$, we write $f(a_1,\dots,a_n)$ for $f(\langle a_1,\dots,a_n \rangle)$. A subclass $B$ of $A$ is said to be
closed or stable under $f$ if $f(b_1,\dots,b_n) \in B$ whenever $b_1,\dots,b_n \in B$. If $B$ is
closed under $f$ we define the restriction $f|B$ of $f$ to $B$ by

$(f|B)(b_1,\dots,b_n) = f(b_1,\dots,b_n)$ for $b_1,\dots,b_n \in B$.

A binary relation $R$ on a class $A$ is called an equivalence relation if it
satisfies:
(a) $xRx$ for all $x \in A$,
(b) $xRy$ implies $yRx$ for all $x,y \in A$,
(c) $xRy$ and $yRz$ implies $xRz$ for all $x,y,z \in A$.
If $R$ is an equivalence relation on $A$, for each $x \in A$ the set
$xR = \{y \in A : xRy\}$ is called the $R$-class of $x$. Calling a family $\mathcal{P}$ of subsets
of $A$ a partition of $A$ if $\bigcup \mathcal{P} = A$ and $X \cap Y = \emptyset$ for any distinct members
$X,Y$ of $\mathcal{P}$, we see immediately that, if $R$ is an equivalence relation on $A$,
the family of all $R$-classes of members of $A$ constitutes a partition of $A$.
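
The claim that the $R$-classes form a partition can be checked on a small example. The sketch below uses congruence modulo 3 on $\{0,\dots,5\}$ as the equivalence relation; the relation is passed as a two-argument Python predicate, and all function names are hypothetical.

```python
def r_class(A, R, x):
    """The R-class of x: all y in A with xRy."""
    return frozenset(y for y in A if R(x, y))


def r_classes(A, R):
    """The family of all R-classes of members of A."""
    return {r_class(A, R, x) for x in A}


def is_partition(A, family):
    """Check that the union is A and distinct members are disjoint."""
    family = list(family)
    covers = set().union(*family) == set(A)
    disjoint = all(
        X.isdisjoint(Y)
        for i, X in enumerate(family)
        for Y in family[i + 1:]
    )
    return covers and disjoint


A = range(6)


def congruent_mod_3(x, y):
    return (x - y) % 3 == 0


classes = r_classes(A, congruent_mod_3)
print(classes)                   # the three classes {0,3}, {1,4}, {2,5}
print(is_partition(A, classes))  # True
```

If the predicate is not an equivalence relation (for instance `x <= y`, which is not symmetric), the resulting family generally fails the disjointness test.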
A partially ordered set is an ordered pair $\langle A, \leq \rangle$ in which $A$ is a set
and $\leq$ is a binary relation on $A$ satisfying:
(a) $x \leq x$ for all $x \in A$,
(b) $x \leq y$ and $y \leq x$ implies $x = y$ for all $x,y \in A$,
(c) $x \leq y$ and $y \leq z$ implies $x \leq z$ for all $x,y,z \in A$.
If $x \leq y$, we say that $x$ is less than or equal to $y$ or $y$ is greater than or
equal to $x$. We also write “$x < y$” for “$x \leq y$ and $x \neq y$”. If $\langle A, \leq \rangle$ is a
partially ordered set, $\leq$ is called a partial ordering on $A$. A partially ordered
set is said to be totally (or linearly) ordered if in addition
(d) $x \leq y$ or $y \leq x$ for all $x,y \in A$.
If $A$ is any family of sets, the relation $\subseteq$ of set inclusion is a partial
ordering on $A$. We frequently identify a partially ordered set $\langle A, \leq \rangle$ with
its underlying set $A$.
If $\langle A, \leq \rangle$ is a partially ordered set and $X \subseteq A$, a member $a \in A$ is an
upper (lower) bound for $X$ if $x \leq a$ ($a \leq x$) for every $x \in X$. An upper (lower)
bound $a$ for $X$ in $A$ is called the supremum (infimum) of $X$ if $a$ is less than
(greater than) every other upper (lower) bound for $X$ in $A$. If $X$ has a supremum
(infimum) in $A$, we denote it by $\sup(X)$ ($\inf(X)$). Notice that if
$\emptyset$ has a supremum (infimum) in $A$, then $\sup(\emptyset)$ ($\inf(\emptyset)$) is an element
$a \in A$ such that $a \leq x$ ($x \leq a$) for every $x \in A$. That is, if $\sup(\emptyset)$ ($\inf(\emptyset)$)
exists in $A$, then it must be the least (greatest) element of $A$.
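
A small worked example may help with the remark about $\sup(\emptyset)$. The sketch below takes the divisors of 12 ordered by divisibility as an assumed example of a partially ordered set; the function names are hypothetical, and `None` is returned when no supremum or infimum exists.

```python
def upper_bounds(A, leq, X):
    """All a in A with x <= a for every x in X."""
    return [a for a in A if all(leq(x, a) for x in X)]


def supremum(A, leq, X):
    """The least upper bound of X in A, or None if there is none."""
    bounds = upper_bounds(A, leq, X)
    least = [a for a in bounds if all(leq(a, b) for b in bounds)]
    return least[0] if least else None


def infimum(A, leq, X):
    """The greatest lower bound: the supremum for the reversed ordering."""
    return supremum(A, lambda x, y: leq(y, x), X)


def divides(x, y):
    return y % x == 0


A = [1, 2, 3, 4, 6, 12]  # the divisors of 12, ordered by divisibility

print(supremum(A, divides, {2, 3}))   # 6
print(infimum(A, divides, {4, 6}))    # 2
print(supremum(A, divides, set()))    # 1, the least element of A
print(infimum(A, divides, set()))     # 12, the greatest element of A
print(supremum([1, 2, 3], divides, {2, 3}))  # None: no upper bound exists
```

Every element is vacuously an upper bound of the empty set, which is why `supremum(A, divides, set())` picks out the least element of `A`.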
A chain in a partially ordered set $\langle A, \leq \rangle$ is a subset $X$ of $A$ such that
$\leq$ totally orders $X$, i.e. such that $x \leq y$ or $y \leq x$ for all $x,y \in X$. $\langle A, \leq \rangle$ is
said to be inductive if each chain in $A$ has an upper bound in $A$. An element
$a \in A$ is maximal if whenever $x \in A$ and $a \leq x$ we have $x = a$. Zorn's lemma
(which is equivalent to the axiom of choice) asserts that for each element
$x$ of an inductive set $\langle A, \leq \rangle$ there is a maximal element $a \in A$ such that $x \leq a$.
A partially ordered set $\langle A, \leq \rangle$ is well-ordered if each non-empty subset
$X$ of $A$ contains an element $x$ such that $x \leq y$ for every $y \in X$. Assuming
the axiom of choice, a totally ordered set $\langle A, \leq \rangle$ is well-ordered iff $A$
contains no sequence $a_0,a_1,a_2,\dots$ such that $a_{n+1} < a_n$ for all $n$.
We conceive of ordinals in such a way that each ordinal is the set of
all smaller ordinals, and the finite ordinals as being identical with the
natural numbers. Each well-ordered set is order-isomorphic to a unique
ordinal. A cardinal is an ordinal which is not equipollent with a smaller
ordinal. The cardinality of a set $X$, denoted by $|X|$, is the unique cardinal
equipollent with $X$. (This needs the axiom of choice.) Notice that $|X| = |Y|$
iff $X$ and $Y$ are equipollent. If $\alpha$ and $\beta$ are cardinals, then $\alpha^\beta$ denotes the
result of cardinal exponentiation, i.e. the product of $\alpha$ with itself $\beta$ times.
Thus e.g. the cardinality of $PA$ is $2^{|A|}$, for any set $A$.
A set $A$ is said to be finite if it is equipollent with $n$ for some $n \in N$,
denumerable if it is equipollent with N, and countable if it is finite or
denumerable.
CHAPTER 1

BEGINNING MATHEMATICAL LOGIC

In the first three sections of this chapter we prepare and motivate the
systematic development, which begins in §4. In §§4 and 5 we deal with
the basic syntax of first-order languages. The rest of the chapter (§§6-15)
is devoted to propositional logic.

§ 1. General considerations

We shall not attempt to define mathematical logic. Rather, we shall make
a few general observations concerning this discipline. In particular, we
shall explain why mathematical logic involves the study of formal languages;
we shall then point out some distinctions that must be made in such a study.
Let us start with an example. Consider the inference

(1) Every tove is slithy

(2) Alice is not slithy

(3) Alice is not a tove

Here statement (3) is correctly inferred from statements (1) and (2). To
recognize the fact that (3) is indeed a consequence of (1) and (2) we do not
need to know whether or not any one of these three statements is true.
Nor do we have to know what the words “tove” and “slithy” mean, and
who (or what) is the person (or thing) named “Alice”.1

1 However, in order to be sure that (1), (2) and (3) are phrased according to the rules
of English syntax, we do (presumably) have to know that “tove” is a common noun,
“slithy” an adjective and “Alice” a proper noun.

What makes (3) a logical consequence of (1) and (2) is the fact that
if (1) and (2) are true then (3) must be true as well. Any interpretation
of “tove”, “slithy” and “Alice” under which (1) and (2) come out true
will make (3) true.1 Also, if “tove”, “slithy” and “Alice” are replaced
by other words2, this will not affect the validity of our inference.
We may say that (3) is a consequence of (1) and (2) by virtue of the
form — as distinct from the matter — of these statements. In this connection
the words “every” and “not” must be regarded as part of the form: if
they are re-interpreted or replaced by other words, the inference may
well become invalid. (Such words are said to be logical, while words used
as common nouns, adjectives, etc. and whose meanings may be changed
without affecting the validity of inferences are extralogical.)
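
The role of interpretations can be made concrete by brute force. The sketch below is only an illustration under stated assumptions: it fixes a three-element domain, reads “tove” and “slithy” as arbitrary subsets of it and “Alice” as an arbitrary member, and searches for an interpretation in which both premises hold but the conclusion fails. All names in the code are hypothetical. Finding no counterexample over one small domain is of course not a proof of validity over all interpretations; it merely illustrates the idea. Replacing the logical word “every” by “some” in premise (1), on the other hand, does produce counterexamples.

```python
from itertools import chain, combinations


def subsets(D):
    """All subsets of the finite domain D."""
    items = list(D)
    all_combos = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [set(c) for c in all_combos]


def counterexamples(premise_1, D):
    """Interpretations where premises (1) and (2) hold but (3) fails."""
    found = []
    for tove in subsets(D):
        for slithy in subsets(D):
            for alice in D:
                p1 = premise_1(tove, slithy)   # premise (1), as supplied
                p2 = alice not in slithy       # (2) Alice is not slithy
                conclusion = alice not in tove  # (3) Alice is not a tove
                if p1 and p2 and not conclusion:
                    found.append((tove, slithy, alice))
    return found


def every_tove_is_slithy(tove, slithy):
    return tove <= slithy


def some_tove_is_slithy(tove, slithy):
    return bool(tove & slithy)


D = {0, 1, 2}
print(counterexamples(every_tove_is_slithy, D))      # [] : none found
print(len(counterexamples(some_tove_is_slithy, D)))  # positive: inference breaks
```

Changing the extralogical words only changes which subsets and which member are chosen, and the loops already range over all of those; changing the logical word changes the test itself.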
Mathematical logic studies inferences of the kind used primarily in
mathematics. In the sense suggested above, it is a formal discipline.
Ordinarily, mathematical statements are formulated in a semi-formal
language consisting of some natural language (like English, Japanese,
etc.) supplemented by special mathematical symbols. But it is an extremely
difficult task to apply precise logical analysis to arguments formulated
in a natural language. This is so for two main reasons.
First, all natural languages abound in irregularities, ambiguities and
structural inconsistencies which tend to obscure and confuse the logical
form of statements made in them.
Secondly, as the discussion of the example above suggests, logical
analysis requires a separation between form and matter; we may need,
e.g., to assign to the same term different (and rather arbitrary) meanings.
But in a natural language words have already got particular (if sometimes
rather vague) meanings attached to them.
For these reasons we need to construct artificial formal languages whose
structure will be perfectly regular. In such a language some symbols (viz.
those that correspond to logical words, like “every” and “not”) will have
fixed meanings. But other symbols (e.g. those corresponding to proper
nouns) will not be given a fixed meaning, and will receive different meanings
according to need.
In dealing with a formal language ℒ we must make a distinction between
syntax and semantics. Syntactic questions are purely formal; they are
concerned with expressions of ℒ as strings of symbols, irrespective of
any interpretation. Semantics, on the other hand, is concerned with the

1 We refer, of course, only to interpretations that respect the parts of speech to which
these words belong; so that “tove” must be interpreted as a common noun, etc.
2 These new words must, of course, belong to the right parts of speech; “tove” can
only be replaced by a common noun, etc.

meaning that expressions of ℒ receive when the symbols occurring in
them have been interpreted in some way.
Language serves as a medium of discourse and communication. Discussion
of any topic takes place in some language. Now, when the object under
discussion is itself a language, there are two languages involved, on two
different levels. First, there is the language that is being discussed; this is
called the object language. Then there is the language in which the discussion
takes place; this is called the metalanguage. The distinction between the
two is extremely important and must be constantly borne in mind.1
Mathematical logic is mathematical in two different senses, corresponding
to the two levels just mentioned. The object languages which we shall
discuss will be certain formal languages in which portions of mathematics
can be expressed. When the symbols of such a language are suitably
interpreted, expressions of that language may express propositions of,
say, arithmetic or plane geometry. Syntactic and semantic study of such
expressions throws light on the logic used in mathematics and on the
logical relations between propositions of a given branch of mathematics.
This is one sense in which mathematical logic is mathematical.
The second sense is that the study of formal languages, their syntax
and semantics, is itself conducted in a mathematical fashion. Facts about
formal languages and expressions in them are proved in much the same
way as facts about groups are proved in algebra and facts about natural
numbers in number theory.
In this book, the metalanguage used for discussing formal languages
is English supplemented by additional symbols that will be introduced
according to need. These metalinguistic symbols must not be confused
with symbols of the object languages. This distinction is especially important
in connection with two particular kinds of metalinguistic symbol, whose
use will now be explained.
A mathematician writing about, say, real numbers needs symbols which
serve as proper names of some particular numbers (or of particular sets
of numbers, etc.). Such symbols are called constants. For example “1”
is the constant used to denote (i.e., as a proper name for) the smallest
positive integer, and “e” is normally used to denote a certain transcendental
number. Here there can be no confusion between constants and their
denotations (i.e., what they denote): the number e is an abstract entity

1 Even when the two languages happen to coincide (e.g., when English is studied in
English) the distinction between the two levels is important.

which is not a symbol, while the constant “e” is a symbol denoting that
number. Numbers and their names are different kinds of things — the
former are not part of a language, while the latter are part of the language
used to discuss the former.
When the objects to be studied are themselves symbols and combinations
of symbols (of some object language) the situation is a bit more tricky.
A standard device for naming a symbol or a combination of symbols
is to enclose it in quotation marks. (We have used this device in the
preceding paragraph.) We could follow this procedure in this book, so
that when a symbol of an object language is enclosed in quotation marks,
it would become a name in the metalanguage for that symbol. In practice,
however, this tends to become rather cumbersome. Another practice
we could follow is to drop the quotation marks and use the same symbol
(or combination of symbols) both in the object language and — as a name
of itself — in the metalanguage, and to rely on the context to tell the
reader in what capacity a symbol is being used. This, however, can be
a bit confusing.
Fortunately, there is another way out. In an English book about the
grammar of some foreign language it may be undesirable or even impossible
to actually display the symbols of that language. (E.g., the script may be
technically difficult to print; or the language may not have a written form
at all, its symbols being not graphic signs but certain sounds and gestures.)
To get round this difficulty, the author of such a book may use transcription,
i.e., he will include in his metalanguage symbols denoting symbols
of the object language.
We shall employ a similar device in this book. For our purposes the
actual graphic shape (if any) of the symbols of a formal language is
immaterial. We shall therefore not exhibit such symbols. On the other
hand we shall use certain metalinguistic symbols — called syntactic
constants — to denote particular symbols or combinations of symbols
of the object languages. Thus, when we say, e.g. “ℒ has the implication
symbol →”, the arrow-shaped sign used here is to be regarded as a syntactic
constant denoting a certain symbol (also called “implication symbol”)
of the object language ℒ. What the latter symbol actually looks like is
of no importance; and the reader may give free rein to his imagination.
Besides constants, a mathematician writing about numbers also needs
variables. Variables are like constants, except that whereas a constant
denotes one object, a variable is allowed to range over a given collection
of objects; each object belonging to that collection may serve as a value

of the variable. Variables are used, e.g., in stating conditions which numbers
may or may not satisfy (such as “x>2”) and in making general statements
about numbers (such as “for all x and y, x+y=y+x”).
Similarly, we shall use metalinguistic symbols called syntactic variables,
each of which will range over a prescribed collection of symbols or
combinations of symbols of some formal language.1

§ 2. Structures and formal languages

In this section we discuss the various kinds of symbols and expressions
that a formal language may be expected to have, if it is to be used as
a medium for expressing mathematical statements.
A great many — some would say all — mathematical statements are
about structures. A structure consists of the following ingredients:
(1) A non-empty class, called the universe or domain of the structure.
The members of this universe are called the individuals of the structure.
(2) Various operations on the universe. These are called the basic
operations of the structure.
(3) Various relations on the universe. These are called the basic relations
of the structure.
Here (2) is optional: we admit structures having no basic operations.
On the other hand, a structure must have at least one basic relation. The
identity relation on the universe is normally taken as one of the basic
relations.
Among the basic operations of a structure there may be 0-ary operations.
According to our convention (see Ch. 0) such an operation is simply an
individual. The 0-ary basic operations of a structure are called its designated
individuals.
Let us give a few examples of structures associated with various branches
of mathematics.
Elementary arithmetic may be defined as the study of one particular
structure — the elementary structure of natural numbers. It has the set of
natural numbers as universe, two designated individuals (viz. 0 and 1)
and two basic operations (viz. addition and multiplication). Here the
only basic relation is the identity relation.
Set theory is concerned with a structure whose individuals are all sets

1 In this book syntactic constants and syntactic variables are printed in bold type, unless
otherwise stated.

(i.e., its universe is the class of all sets). In addition to the identity relation,
this structure has the basic relation of membership: the relation that holds
for all pairs (x,y) such that x and y are sets and x is a member of y.
Elementary Euclidean plane geometry may be regarded as the study
of a structure — the elementary Euclidean plane — whose individuals
are points and straight lines. Among the basic relations of this structure
are the property1 of being a point, the property of being a straight line,
and the ternary relation of “betweenness” which holds for all triples
(x,y,z) such that x,y and z are collinear points and y lies between x and z.
A topological space may be thought of as a structure whose individuals
are all points of the space, as well as all sets of points, all sets of sets of
points, etc. There is one designated individual — the set of all open sets
of points (this designated individual is the topology of the space); and the
basic relations are the property of being a point, the identity relation
and the membership relation between individuals.
Similarly, a structure can be associated with each group, ring and with
other entities studied in algebra.
Suppose we are given a structure 𝔘 and we want to set up a formal
language ℒ in which statements about 𝔘 are to be expressed. What symbols
should ℒ have?
First, we would like ℒ to have symbols that may be used as variables
ranging over the universe of 𝔘. The need for variables is obvious to anyone
acquainted with mathematics. Variables ranging over the universe of
𝔘 are used, e.g., in expressing conditions which individuals of 𝔘 may
or may not satisfy, and in making general statements about 𝔘.
Next, we expect ℒ to have symbols that may be used to denote the
various basic operations of 𝔘. Such symbols are called function symbols.
More specifically, a symbol designed to denote an n-ary operation is
called an n-ary function symbol. In particular, if 𝔘 has designated individuals
then ℒ should have symbols for denoting them. Such symbols are called
individual constants or, more briefly, just constants. Since designated
individuals are regarded as 0-ary operations, constants are to be regarded
as 0-ary function symbols.
Using variables and function symbols, we can construct expressions
called terms. Roughly speaking, terms are the nounlike expressions of ℒ.
For example, in a formal language suitable for elementary arithmetic
we should have variables, say x,y, etc., intended to range over the set

1 Recall that we have agreed (Ch. 0) that a property is a unary relation.



N of natural numbers; and function symbols, say 0, 1, +, ×, intended to
denote the numbers zero and one and the operations of addition and
multiplication, respectively. Then x, 1, 1+x, 0×y, ((1+x)×(y+1))+0
are some of the terms we can form.
Of course, different interpretations can be applied to one and the same
language. For example, the language described in the preceding paragraph
can be re-interpreted by letting its variables range over some arbitrary
non-empty class and letting 0, 1, +, × denote two arbitrarily chosen members
of that class and two arbitrarily chosen binary operations on it.
But suppose we have fixed one particular interpretation for a formal
language ℒ, by means of a structure 𝔘. Then those terms of ℒ that do
not contain variables will denote individuals of 𝔘. A term containing
variables will not denote any particular individual, but will assume various
individuals as values, depending on which individuals are assigned as
values to the variables.
For example, in the particular language described above (taken with
its originally intended interpretation) the term 1+1 denotes the natural
number two, while the term (1+1)×x+y has as value the number obtained
by adding whatever number is assigned as value to y, to twice the number
that happens to be assigned as value to x.
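
The way a term acquires a value can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the encoding of terms as nested tuples in prefix form, the dictionary NATURALS and the function name value are all hypothetical and are not part of the formal development.

```python
# Terms are nested tuples in prefix form: (function_symbol, arg1, ..., argn).
# A bare string stands for a variable. This encoding is an assumption made
# for illustration only.
NATURALS = {
    "0": lambda: 0,               # designated individual zero (0-ary operation)
    "1": lambda: 1,               # designated individual one (0-ary operation)
    "+": lambda a, b: a + b,      # addition
    "×": lambda a, b: a * b,      # multiplication
}

def value(term, interpretation, assignment):
    """Return the individual that `term` has as value under `assignment`."""
    if isinstance(term, str):     # a variable: look up its assigned value
        return assignment[term]
    symbol, *args = term
    operation = interpretation[symbol]
    return operation(*(value(a, interpretation, assignment) for a in args))

one = ("1",)
two = ("+", one, one)                      # the term 1+1
t = ("+", ("×", two, "x"), "y")            # the term (1+1)×x+y

print(value(two, NATURALS, {}))                  # 2
print(value(t, NATURALS, {"x": 3, "y": 4}))      # 10
```

A term without variables needs no assignment at all, which is why the empty dictionary suffices for 1+1.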
Variables and function symbols alone do not suffice for formulating
in ℒ statements about a structure 𝔘. For this, ℒ must have symbols
that can be used to denote the basic relations of 𝔘. A symbol designed
to denote an n-ary relation is called an n-ary predicate symbol.
If, as is usually the case, the identity relation is one of the basic
relations of 𝔘, then ℒ needs to have a predicate symbol to serve as a
name for it. It is convenient to earmark one particular symbol, =, for
this role.1
To illustrate how predicate symbols are used, let us return to our example
of the language with function symbols 0, 1, +, ×. Assuming this language
to have the predicate symbol =, we can write formulas like

(1) 1=1×1,

(2) 1+0=1×0,

(3) x+x=x×x,

(4) x+x=y.

1 This amounts to treating = as a logical symbol: it is always interpreted as denoting
the identity relation on the universe of discourse.

Formulas (1) and (2) are sentences: they express propositions. Under the
originally intended interpretation, (1) expresses the true proposition that
the number one is the same as its own square; and (2) expresses the false
proposition that the sum of one and zero is the same as their product.
Thus, under the interpretation which we are assuming, (1) is a true sentence
and (2) is a false sentence. We also say that (1) has the truth value ‘truth’
and (2) has the truth value ‘falsehood’.
Formulas (3) and (4) are not sentences; they do not express propositions,
but conditions regarding the values which may be assigned to the variables
involved. Thus, (3) expresses the condition that when the value of x is
added to itself the result is the same as when it is multiplied by itself.
It makes no sense to say that (3) is true (or false) outright. Rather, (3)
will assume a truth value, either truth or falsehood, depending on which
natural number is assigned to x as value. If that value is zero or two, (3) will
have the truth value truth or, in other words, (3) will be satisfied. For all
other values of x, (3) has the truth value falsehood. Similarly, (4) is satisfied
(has the truth-value truth) iff the value assigned to y is double that
assigned to x.
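
As a small illustration, the conditions expressed by (3) and (4) can be written as Python predicates. The function names are hypothetical, and the search below inspects only the values 0 to 49, so it illustrates the claim about (3) rather than proving it.

```python
def satisfies_3(x):
    """Condition expressed by (3): x+x = x×x."""
    return x + x == x * x

def satisfies_4(x, y):
    """Condition expressed by (4): x+x = y."""
    return x + x == y

# Values of x below 50 that satisfy (3).
print([x for x in range(50) if satisfies_3(x)])   # [0, 2]
```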
This example illustrates a general fact. Any n-ary predicate symbol of
a formal language ℒ can be combined with n terms of ℒ to form a formula.
Once an interpretation of ℒ is fixed, some formulas of ℒ (viz. those that
are sentences) become true or false outright; all other formulas acquire
truth values when the variables have been assigned values (belonging
to the universe of the structure used for interpreting ℒ).
Using variables and function symbols we can construct terms; using
these and predicate symbols we can construct formulas — but only rather
simple ones. In order to combine simpler formulas into more complex
ones, we shall require ℒ to have two new kinds of symbol, called connectives
and quantifiers.
We would like negation to be expressible in ℒ. Thus, for any formula
α of ℒ we want ℒ to have a formula ¬α (read: “not α” or “it is not the
case that α”) which will be true whenever α is false, and false whenever
α is true. (Thus α and ¬α always have opposite truth values.)
Next, we want the conjunction “and” to be expressible in ℒ. Thus, for
any two formulas α, β we need a formula α∧β (read: “α and β”) which
will be true iff both α and β are true.

Similarly, we want ℒ to have, for any formulas α and β, a formula
α∨β (read: “α or β”) which is false iff both α and β are false.1
Further, we want ℒ to be capable of expressing conditional statements.
Therefore, for any formulas α, β of ℒ there should be in ℒ a formula
α→β (read: “α implies β” or “if α, then β”). This formula will be false
iff α is true but β is false.2
Finally, ℒ should have, for any formulas α and β, a formula α↔β
(read: “α iff β”) which is true whenever α and β have the same truth values
and false whenever their truth values are different.
We could satisfy all these demands by requiring ℒ to have five logical
symbols, called connectives, viz. ¬, ∧, ∨, →, ↔. But as a matter of
fact we can make do with less. To start with, (α→β)∧(β→α) behaves
in just the way we want α↔β to behave; so we do not require ℒ to have
the symbol ↔. Next, ¬(α→¬β) is easily seen to behave exactly as α∧β
should; so ∧ too can be eliminated. Finally, (¬α)→β behaves just like
α∨β; so we can eliminate ∨ as well. Thus ℒ need only have two connectives
¬ and →. (For reasons that will become apparent later on, it is convenient
to economize in the stock of logical symbols. No sacrifice is involved,
since we can define α∧β to be ¬(α→¬β) etc.)
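
The three eliminations just described can be checked mechanically by running through all four combinations of truth values. The following Python sketch does this; the function names implies, conj, disj and iff are hypothetical, and Python's own not, and, or are used only as the yardstick for the intended behaviour.

```python
from itertools import product

def implies(a, b):
    """Truth function of implication: false iff a is true and b is false."""
    return (not a) or b

def conj(a, b):
    """Conjunction defined from negation and implication."""
    return not implies(a, not b)

def disj(a, b):
    """Disjunction defined from negation and implication."""
    return implies(not a, b)

def iff(a, b):
    """Bi-implication defined as the conjunction of the two implications."""
    return conj(implies(a, b), implies(b, a))

for a, b in product([True, False], repeat=2):
    assert conj(a, b) == (a and b)
    assert disj(a, b) == (a or b)
    assert iff(a, b) == (a == b)
print("the defined connectives have the intended truth tables")
```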
The last demand we shall make on ℒ is that for any formula α and
any variable x, ℒ should have a formula ∀xα (read: “for every [value of]
x, α”) and a formula ∃xα (read: “for some [value of] x, α”).
To explain the meaning of such formulas, suppose first that α expresses
some condition regarding the value of the variable x. Then ∀xα expresses
the proposition that every possible value of x (i.e., every individual of the
structure) satisfies that condition. And ∃xα says that the condition expressed
by α is satisfied by some (i.e., at least one) individual.
Thus, e.g., in the particular language discussed above, taken with its
intended interpretation, the formula x=x×x expresses the condition that
the value of x be equal to its own square; ∀x(x=x×x) then expresses
the false proposition that every natural number is equal to its own square,
and ∃x(x=x×x) expresses the true proposition that some natural number
is equal to its own square.3

1 In mathematical usage, “or” has the meaning which jurists express by “and/or”.
We are following this usage here.
2 Here too we are following normal mathematical usage, which dictates that whenever
α is false α→β should be regarded as (vacuously) true, irrespective of the truth value of β.
3 In fact, there are of course two such numbers.


More generally, suppose that α expresses a condition regarding the
values of the variables x, y₁,...,yₙ. Then ∀xα and ∃xα express conditions
regarding the values of y₁,...,yₙ. The values b₁,...,bₙ (assigned to y₁,...,yₙ
respectively) satisfy [the condition expressed by] ∀xα iff for every possible
value a of x the values a, b₁,...,bₙ satisfy α. And the values b₁,...,bₙ satisfy
∃xα iff there is at least one value a of x such that the values a, b₁,...,bₙ
satisfy α.
Thus, to return to the particular language we have been using as an
example, the formula x×y=z expresses the condition that the product
of the values of x and y equals that of z. Then ∀x(x×y=z) expresses
the condition that if any number be multiplied by the value of y, the result
is always equal to the value of z. (This is satisfied iff both y and z are
assigned the value zero.) And ∃x(x×y=z) expresses the condition that
the value of z is a multiple of that of y.
We could require ℒ to have two logical symbols ∀ and ∃, called
universal quantifier and existential quantifier respectively. But again we
can make do with less. As a matter of fact, it is enough if ℒ only has the
symbol ∀ (universal quantifier); for the formula ¬∀x¬α behaves just
as we want ∃xα to behave, so that we can define the latter to be the former.
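
Quantification over all natural numbers cannot be carried out by exhaustive search, so the following Python sketch replaces the intended interpretation by an assumed finite one: the universe is {0,...,5} and × is read as multiplication modulo 6. The names UNIVERSE, for_all, exists and product_is are hypothetical. The point illustrated is only that the existential quantifier can be obtained from the universal one and negation.

```python
UNIVERSE = range(6)    # assumed finite universe; arithmetic is taken modulo 6

def for_all(condition):
    """True iff every individual of the universe satisfies `condition`."""
    return all(condition(a) for a in UNIVERSE)

def exists(condition):
    """Existential quantifier defined as 'not for all not'."""
    return not for_all(lambda a: not condition(a))

def product_is(y, z):
    """The condition on x expressed by x×y=z, for fixed values of y and z."""
    return lambda x: (x * y) % 6 == z

# Pairs (y, z) satisfying the universally quantified condition.
universal = [(y, z) for y in UNIVERSE for z in UNIVERSE
             if for_all(product_is(y, z))]
print(universal)                      # [(0, 0)]
print(exists(product_is(2, 4)))       # True, since 2×2 = 4
print(exists(product_is(2, 3)))       # False, since 3 is not a multiple of 2 modulo 6
```

In this finite structure, as in the natural numbers, the universal condition is satisfied only when both y and z are assigned zero.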
Formal languages possessing just the equipment sketched above (with
the possible omission of function symbols) are called first-order languages.

*§ 3. Higher-order languages

Logicians sometimes deal with (apparently) richer formal languages. For
example, a second-order language has, in addition to the equipment of
a first-order language, another type of variable, ranging not over the
individuals of a structure 𝔘 but over all sets of individuals (i.e., all subsets
of the universe). Among the symbols of such a language there is a special
one used to denote the relation of membership between an individual
and a set of individuals. Quantifiers are allowed to apply not only to the
individual variables (those ranging over individuals) but also to the new
set variables. Thus, if α is a formula and X is a set variable, ∀Xα is also
a formula.
Similarly, in a third-order language there are variables ranging over
sets of sets of individuals, and so on for languages of still higher orders.
However, most logicians agree that such languages are (at least in
principle) dispensable. Indeed, let 𝔘 be any structure and let 𝔅 be a
structure obtained from 𝔘 in the following way. The universe of 𝔅 consists

of all individuals of 𝔘 plus all sets of individuals of 𝔘. The basic operations
of 𝔅 are defined in such a way that when they are restricted to the universe
of 𝔘 (i.e., when they are applied to individuals of 𝔘) they behave exactly
as the corresponding basic operations of 𝔘. Finally, the basic relations
of 𝔅 are all the basic relations of 𝔘 plus two additional relations: the
property of being an individual of 𝔘, and the relation of membership
between an individual of 𝔘 and a set of individuals of 𝔘. Then any statement
about 𝔘 expressed in a second-order language with set variables can
easily be “translated” into a statement about 𝔅 in a first-order language.
In this sense, second-order languages are dispensable. A similar argument
applies to other higher-order languages.
We therefore do not lose much by confining our attention to first-order
languages only.

§ 4. Basic syntax

We shall now begin to put into practice some of the ideas discussed in
§1 and §2. We proceed to describe an arbitrary first-order language ℒ.
Throughout this book, unless the contrary is stated, ℒ will be kept fixed.
We shall introduce various notions relating to ℒ. These will be labelled
by the prefix “ℒ-” or by phrases “of ℒ”, “in ℒ”, etc. However, once
such a notion has been introduced, we shall omit these labels except
where they are needed for emphasis or clarity.
The symbols of ℒ are:
(a) An infinite sequence of (individual) variables, namely

v₁, v₂, v₃, ...

(b) For each natural number n, a set of n-ary function symbols.
(c) For each positive natural number n, a set of n-ary predicate symbols.
For at least one n this set must be non-empty.
(d) The connectives ¬ (negation) and → (implication).
(e) The universal quantifier ∀.
The 0-ary function symbols (if any) are called (individual) constants.
If ℒ has the special binary predicate =, we say that ℒ is a language
with equality. We stipulate that if ℒ has at least one function symbol that
is not an individual constant, then ℒ must be a language with equality.1

1 Notice the difference between the boldface “=” and the ordinary “=”. We use the former as a syntactic constant
denoting the equality symbol of ℒ, while the latter is used in this book (as in most
other mathematical texts) as short for “is the same as”.


The variables, the connectives, the universal quantifier and = are called
logical symbols. They are assumed to be the same in all first-order languages
(or, in the case of =, in all first-order languages with equality). The
function symbols and the predicate symbols other than = are called
extralogical symbols and may differ from one language to another.
Notice that we have fixed a particular ordering of the variables: vₙ is
the nth variable1 of ℒ. This ordering is called the alphabetic ordering
of the variables.
A finite (possibly empty) sequence of ℒ-symbols is called an ℒ-string.
(A given symbol may, of course, occur several times in the same string.)
The length of a string is the total number of (occurrences of) symbols
in it. In particular, the empty string has length 0; and any single symbol
is a string of length 1.
If s and t are strings, we define st as the string obtained by concatenating
s and t, in this order. Similarly for three or more strings.
If r=st, where r,s,t are strings, then s is an initial segment of r. If
t is non-empty, then s is a proper initial segment of r. Similarly, t is a
terminal segment of r, and it is a proper one if s is non-empty. Obviously,
a string of length n has n + 1 different initial segments (including the empty
string and the entire string itself).
We shall only be interested in two kinds of strings, called terms and
formulas.
ℒ-terms are ℒ-strings formed according to the following two rules:
(1) Any ℒ-string consisting of (a single occurrence of) a variable is
an ℒ-term.
(2) If f is an n-ary function symbol of ℒ and t₁,...,tₙ are ℒ-terms,
then ft₁...tₙ is an ℒ-term.
Notice that, for n=0, (2) says that any constant is a term. In a term
formed according to (2), t₁,...,tₙ are called arguments of f.
By the degree of complexity of a term t (briefly deg t) we mean the total
number of occurrences of function symbols in t.
The stipulation that in a term formed according to (2) the arguments
always follow the function symbol makes it unnecessary for ℒ to have
punctuation marks to indicate the grouping of arguments. To show this
let us define the weight of a string s as the sum obtained by adding up −1

1 In accordance with a convention introduced in §1, the bold arrow “→”, e.g., is a
syntactic constant belonging to our metalanguage and denoting a certain connective
of ℒ. Similarly, “v₁”, e.g., is a syntactic constant denoting the first variable of ℒ.

for each occurrence of a variable in s and n−1 for each occurrence of
an n-ary function symbol in s. We then have:

4.1. Lemma. Each term t has weight −1; but the weight of any proper
initial segment of t is non-negative.
Proof. We use induction on deg t. If t is just a single occurrence of
a variable then our claim is obviously true (the only proper initial segment
of t is the empty string, which has weight 0).
If t is ft₁...tₙ where f is an n-ary function symbol and t₁,...,tₙ are terms,
then deg t₁,...,deg tₙ<deg t and we see by the induction hypothesis that
the weight of t must be (n−1)+n·(−1)=−1. Also, it is clear that the
weight of any proper initial segment of t is non-negative. ∎

Using this result we see that in a string of the form t₁...tₙ (where t₁,...,tₙ
are terms) t₁ is uniquely determined as the shortest initial segment with
weight −1. Similarly, t₂,...,tₙ are uniquely determined. Thus in a term
ft₁...tₙ all the arguments are uniquely determined.
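
The use of weights to recover the arguments of a term can be sketched in Python as follows. The sketch assumes a particular stock of function symbols (the table ARITY) and represents strings as lists of symbols, treating every symbol not listed in ARITY as a variable; these choices, and the names weight and split_terms, are illustrative assumptions. The function split_terms is meant to be applied to a string that really is a concatenation of terms.

```python
# Assumed arities of the function symbols; any other symbol is a variable.
ARITY = {"0": 0, "1": 0, "+": 2, "×": 2}

def weight(string):
    """Sum of -1 per variable and n-1 per n-ary function symbol."""
    return sum(ARITY.get(symbol, 0) - 1 for symbol in string)

def split_terms(string):
    """Split a concatenation t1...tn of terms into the list [t1, ..., tn].

    Each term is found as the shortest initial segment of the remaining
    string whose weight is -1.
    """
    terms, start, running = [], 0, 0
    for i, symbol in enumerate(string):
        running += ARITY.get(symbol, 0) - 1
        if running == -1:                  # a complete term ends here
            terms.append(string[start:i + 1])
            start, running = i + 1, 0
    if start != len(string):
        raise ValueError("string is not a concatenation of terms")
    return terms

t = ["×", "+", "1", "x", "y"]              # the term (1+x)×y in prefix form
print(weight(t))                                   # -1
print([weight(t[:k]) for k in range(len(t))])      # [0, 1, 2, 1, 0]
print(split_terms(["+", "1", "x", "y"]))           # [['+', '1', 'x'], ['y']]
```

The last call recovers the two arguments of × in t, namely 1+x and y, without any punctuation marks.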
ℒ-formulas are ℒ-strings formed according to the following four rules:
(1) If P is an n-ary predicate symbol of ℒ and t₁,...,tₙ are ℒ-terms,
then Pt₁...tₙ is an ℒ-formula.
(2) If α is an ℒ-formula, then so is ¬α.
(3) If α and β are ℒ-formulas, then so is →αβ.
(4) If α is an ℒ-formula and x is a variable, then ∀xα is an ℒ-formula.
A formula Pt₁...tₙ formed according to (1) is called an atomic formula;
the terms t₁,...,tₙ here are the arguments of P. An atomic formula whose
predicate symbol is = is called an equation and its first and second
arguments are called its left-hand side and right-hand side, respectively.
A formula ¬α formed according to (2) is called a negation formula;
¬α is the negation of α.
A formula →αβ formed according to (3) is called an implication formula.
Here α is the antecedent (or implicans) and β is the consequent (or implicate).
A formula ∀xα formed according to (4) is called a universal formula.
Here x is the variable of quantification, and the string xα is the scope of
the initial symbol ∀.
The degree of complexity of a formula α (briefly deg α) is the sum obtained
by adding up 2 for each occurrence of → and 1 for each occurrence of
¬ and ∀ in α.
In the case of formulas also, punctuation marks are unnecessary. To
see this let us give weight n−1 to each occurrence of an n-ary predicate
symbol and weights 0, 1, 1 to each occurrence of ¬, →, ∀ respectively.

(The variables and function symbols retain their old weights.) Then we
have:

4.2. Lemma. Each formula α has weight −1; but the weight of any proper
initial segment of α is non-negative.
The proof of this is similar to that of Lemma 4.1 and is left to the
reader. ∎

It follows that in an implication formula the antecedent and consequent
are uniquely determined.
By ℒ-expression we mean ℒ-term or ℒ-formula.
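
The unique determination of antecedent and consequent can be illustrated by a short Python sketch. The table WEIGHT below fixes an assumed small vocabulary (the binary predicate symbol =, the function symbols 0, 1, +, ×, and the logical symbols), every other symbol being treated as a variable of weight −1; the function names are hypothetical, and the input is assumed to be a genuine implication formula given as a list of symbols.

```python
# Assumed weights: n-1 for n-ary predicate and function symbols,
# 0 for negation, 1 for implication and for the universal quantifier.
WEIGHT = {"¬": 0, "→": 1, "∀": 1, "=": 1,
          "+": 1, "×": 1, "0": -1, "1": -1}

def weight_of(symbol):
    """Weight of a single symbol; unlisted symbols count as variables."""
    return WEIGHT.get(symbol, -1)

def split_implication(formula):
    """Return (antecedent, consequent) of a formula that begins with →."""
    if not formula or formula[0] != "→":
        raise ValueError("not an implication formula")
    running = 0
    for i in range(1, len(formula)):
        running += weight_of(formula[i])
        if running == -1:                  # the antecedent ends here
            return formula[1:i + 1], formula[i + 1:]
    raise ValueError("no complete antecedent found")

# The formula written metalinguistically as (∀x x=x) → (y=y).
f = ["→", "∀", "x", "=", "x", "x", "=", "y", "y"]
print(split_implication(f))
# (['∀', 'x', '=', 'x', 'x'], ['=', 'y', 'y'])
```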

4.3. Problem. Show that if r,s,t are strings such that rs and st are
expressions then s and t cannot both be non-empty. Thus, if two expressions
overlap, one of them must be a terminal segment of the other.

If s is a term (or formula) and r,t are strings such that rst is again a term
(or formula, respectively), then s is said to be a subterm (or subformula,
respectively) of rst. If moreover r is non-empty1 then s is a proper subterm
(or subformula, respectively) of rst.

4.4. Problem. Show that the proper subterms of a term ft₁...tₙ are all
the subterms of the arguments t₁,...,tₙ. Also, show that the proper
subformulas of
(a) ¬α and ∀xα are all the subformulas of α,
(b) →αβ are all the subformulas of α and of β.

§ 5. Notational conventions

In this book, unless otherwise stated:


(a) Boldface lower-case Roman letters from the end of the alphabet
(especially “x”, “y”, “z”) are used as syntactic variables ranging over
the variables of a first-order language.
(b) As syntactic variables ranging over function symbols we use
“f”, “g” and “h”.
(c) As syntactic variables ranging over constants we use “a”, “b” and “c”.
(d) As syntactic variables ranging over predicate symbols we use
“P”, “Q” and “R”.
(e) As syntactic variables ranging over terms we use “r”, “s” and “t”.

1 Note that, by Lemmas 4.1 and 4.2, if r is empty then so is t.



(f) Boldface lower-case and upper-case Greek letters are used as syntactic
variables ranging over formulas and sets of formulas, respectively.
In §4 we saw that no punctuation marks are needed in ℒ. In various
theoretical contexts this is an advantage, because it simplifies the syntax.
However, this advantage has been achieved at the cost of some artificiality,
by insisting that function symbols and predicate symbols (including =)
always precede their arguments and that → precede the antecedent.
In practice it will now be convenient to make certain concessions to
common usage. The reader must note that these concessions do not in
any way constitute a modification of the formation rules of ℒ laid down
in §4; they are just conventions in our metalanguage, which are used
when we talk about ℒ.
Also, for reasons of economy we have allowed ℒ to have only two
connectives and only the universal quantifier. We shall now introduce
other connectives and the existential quantifier metalinguistically by
definition.

5.1. Definition.
(a) (r=s) =df =rs.
(b) (r≠s) =df ¬(r=s).
(c) (α→β) =df →αβ.
(d) (α∧β) =df ¬(α→¬β).
(e) (α∨β) =df (¬α→β).
(f) (α↔β) =df ((α→β)∧(β→α)).
(g) ∃xα =df ¬∀x¬α.

Thus, e.g., “(α∧β)” is, by definition, a synonym for “¬(α→¬β)”, which
is in turn another name for the formula ¬→α¬β.
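
The unwinding of these abbreviations can be sketched in Python. The encoding is an assumption made for illustration: a formula in metalinguistic notation is a nested tuple whose first entry is a connective or quantifier, and a bare string stands for an unanalysed formula or a variable. The names expand and polish are hypothetical; clauses (a) and (b), which concern terms, are left out.

```python
def expand(formula):
    """Eliminate ∧, ∨, ↔ and ∃ following clauses (d) to (g) of the definition."""
    if isinstance(formula, str):          # an unanalysed formula or a variable
        return formula
    op, *args = formula
    args = [expand(a) for a in args]
    if op in ("¬", "→", "∀"):             # official symbols are kept as they are
        return (op, *args)
    if op == "∧":
        return ("¬", ("→", args[0], ("¬", args[1])))
    if op == "∨":
        return ("→", ("¬", args[0]), args[1])
    if op == "↔":
        return expand(("∧", ("→", args[0], args[1]), ("→", args[1], args[0])))
    if op == "∃":
        return ("¬", ("∀", args[0], ("¬", args[1])))
    raise ValueError(f"unknown symbol {op!r}")

def polish(formula):
    """Write an expanded formula as a parenthesis-free prefix string."""
    if isinstance(formula, str):
        return formula
    return formula[0] + "".join(polish(part) for part in formula[1:])

print(polish(expand(("∧", "α", "β"))))    # ¬→α¬β
print(polish(expand(("∃", "x", "α"))))    # ¬∀x¬α
```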
The formula (α∧β) is called a conjunction formula, α and β being the
first and second conjuncts, respectively. Similarly, (α∨β) is a disjunction
formula with α and β as first and second disjuncts. Also, (α↔β) is a
bi-implication formula with α and β as left-hand side and right-hand side.
Finally, ∃xα is an existential formula.
In our new metalinguistic notation parentheses are needed to prevent
ambiguity. To facilitate reading we shall replace some parentheses by
brackets of various shapes. Also, we often omit parentheses, subject
to the convention that “↔”, “→”, “∨”, “∧” should be taken in this
order of priority (just as according to the rules of English punctuation
a full stop has priority over a semi-colon, and the latter has priority over
a comma). The ranges of “¬”, “∀” and “∃” are to be as short as possible.
Thus, e.g., α∧β→γ↔¬α→β∨γ is {[(α∧β)→γ]↔[¬α→(β∨γ)]} and
∃xα∧β∨γ is [(∃xα∧β)∨γ]. But in “α∧(β∨γ)” and in “∃x(α∧β)”
the parentheses cannot be omitted.
We also agree that where there are several occurrences of “→” (or
“∧”, etc.) the one farthest to the left has highest priority. (This is the
convention of association to the right.) Thus α→β∧γ→β→γ is
α→{(β∧γ)→(β→γ)} but in “(α∧β)∧γ” the parentheses must not be
omitted.
We stress once more that all the conventions introduced in the present
section are merely metalinguistic devices used in referring to ℒ-formulas.
In the language ℒ itself nothing has been changed. It should also be
noticed that strictly speaking “∧”, “∨”, “↔” and “∃” do not, by
themselves, denote anything at all. For example, it is only the whole
combination “(α∧β)” that has been given meaning (as denoting the
formula ¬→α¬β) while “∧” in isolation has not been given any meaning.
Nevertheless, we shall occasionally express ourselves a bit loosely, as
if ℒ had connectives ∧ (conjunction), ∨ (disjunction), ↔ (bi-implication)
as well as an existential quantifier ∃.

§ 6. Propositional semantics

The rest of this chapter is devoted to propositional logic which is (roughly
speaking) that part of logic concerned with the meaning of the connectives
and with rules for manipulating them.
In the present section we deal with the semantic aspects of propositional
logic. In the next chapter we shall specify in detail the way in which truth
values may be assigned to ℒ-formulas. For the present, however, we take
such valuations as given arbitrarily, subject only to the condition that
the intended meaning of ¬ and → is respected.
Let us denote by “⊤” and “⊥” respectively the truth values truth
and falsehood. Then the above explanation motivates the following:

6.1. Definition. A truth valuation on ℒ is a mapping σ assigning to
each ℒ-formula α a truth value (i.e., a member of the set {⊤, ⊥}) α^σ such
that for all ℒ-formulas β and γ

(1) (¬β)^σ=⊤ iff β^σ=⊥,

(2) (β→γ)^σ=⊤ iff β^σ=⊥ or¹ γ^σ=⊤.

1 Here, as usual in mathematical texts, “or” means what jurists mean by “and/or”.

(α^σ should be read as “the value of α under σ” or, briefly, “α
under σ”.)
We call α a prime formula if it is atomic or universal. Note that Def. 6.1
does not impose any condition on α^σ for prime α. In fact, it is not difficult
to see that any mapping of the set of all prime formulas into {⊤,⊥} can be
extended in a unique way to a truth valuation. (If α^σ is fixed arbitrarily
for all prime formulas α, then conditions (1) and (2) of 6.1 define α^σ for
all formulas α by recursion on deg α.)
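
The recursion just described can be sketched in Python. The encoding is an assumption made for illustration: a prime formula is a string, ¬β is the tuple ("not", β), β→γ is the tuple ("imp", β, γ), and the truth values ⊤ and ⊥ are rendered as True and False. The dictionary sigma plays the role of the arbitrary assignment on prime formulas.

```python
def value(f, sigma):
    """Value of the formula f under the truth valuation that extends sigma.

    sigma is a dict sending each prime formula (a string) to True or False.
    """
    if isinstance(f, str):
        # Prime formula: the value is given, no condition is imposed.
        return sigma[f]
    if f[0] == "not":
        # Condition (1): ¬β is true iff β is false.
        return not value(f[1], sigma)
    # Condition (2): β→γ is true iff β is false or γ is true.
    return (not value(f[1], sigma)) or value(f[2], sigma)
```

Since each recursive call is made on a formula of smaller degree, the computation terminates, and its result is determined by sigma alone.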
Conditions (1) and (2) of 6.1 can be encapsulated in truth tables:

β   ¬β          β   γ   β→γ
⊤   ⊥           ⊤   ⊤    ⊤
⊥   ⊤           ⊤   ⊥    ⊥
                ⊥   ⊤    ⊤
                ⊥   ⊥    ⊤

More generally, given a finite number of formulas β₁,...,βₖ and a formula
α compounded from them with ¬ and →, we shall show how to construct
a truth table for α in terms of β₁,...,βₖ.
To be precise, let us say that a formula is a (propositional) combination
of β₁,...,βₖ if it can be obtained by the following rules:
(a) Each βᵢ (i = 1,...,k) is a combination of β₁,...,βₖ.
(b) If γ is a combination of β₁,...,βₖ then so is ¬γ.
(c) If γ and δ are combinations of β₁,...,βₖ then so is γ→δ.
To construct a truth table for a formula α in terms of β₁,...,βₖ we start
by setting up a rectangular table with k columns (headed “β₁”,...,“βₖ”
respectively) and 2ᵏ rows. In each of the k·2ᵏ spaces we enter “⊤”
or “⊥” so that no two rows are filled out in the same way. (Thus, each
of the 2ᵏ different sequences of length k made up of “⊤”s and “⊥”s will
appear in exactly one row.) Now, if α is a combination of β₁,...,βₖ, we
add a new column, headed “α”, and fill it out with “⊤”s and “⊥”s
according to the following rules, proceeding by induction on deg α:
(a) If α is βᵢ (1≤i≤k), we copy the entries of the ith column (headed
“βᵢ”) into the new “α” column.
(b) If α is ¬γ, where γ is a combination of β₁,...,βₖ, then by the
induction hypothesis (since deg γ<deg α) we already know how to construct
a “γ” column. Then the “α” column will have “⊤” in places where the
“γ” column has “⊥”, and “⊥” in all other places.
(c) If α is γ→δ, where γ and δ are combinations of β₁,...,βₖ, then by
the induction hypothesis we know how to construct “γ” and “δ” columns.
Then the “α” column will have “⊥” in every row where the “γ” column
has “⊤” but the “δ” column has “⊥”; and in all other places the “α”
column will have “⊤”.
It should be noted that these rules do not necessarily yield a unique
result: we do not exclude the possibility that some of the βᵢ are themselves
combinations of the remaining βⱼ. In some cases, therefore, we are allowed
to choose between applying rule (a) and one of the other two rules. But
any table with columns headed “β₁”,...,“βₖ” and “α” and filled out
according to the above rules is a truth table for α in terms of β₁,...,βₖ.
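
The construction above can be sketched as follows. This is only an illustration under assumed conventions: prime formulas are strings, ¬γ is ("not", γ), γ→δ is ("imp", γ, δ), and True/False stand for ⊤/⊥. Where both rule (a) and another rule are applicable, this sketch always chooses rule (a); the text allows either choice.

```python
from itertools import product

def column_entry(f, betas, row):
    """Entry of the column headed f in the given row of the table."""
    if f in betas:
        # Rule (a): copy the entry of the column headed by this beta.
        return row[betas.index(f)]
    if f[0] == "not":
        # Rule (b): reverse the entry of the column for the negated formula.
        return not column_entry(f[1], betas, row)
    if f[0] == "imp":
        # Rule (c): false exactly when the antecedent is true and the consequent false.
        return (not column_entry(f[1], betas, row)) or column_entry(f[2], betas, row)
    raise ValueError("formula is not a combination of the given formulas")

def truth_table(f, betas):
    """All 2**k rows over betas, each paired with the entry of the new column."""
    return [(row, column_entry(f, betas, row))
            for row in product([True, False], repeat=len(betas))]
```

For example, truth_table(("imp", "p", "q"), ["p", "q"]) lists the four rows in the order (⊤,⊤), (⊤,⊥), (⊥,⊤), (⊥,⊥), with new-column entries ⊤, ⊥, ⊤, ⊤.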

6.2. Problem. Using Def. 5.1 (clauses (d)-(f)) set up truth tables for
α∧β, α∨β and α↔β in terms of α and β.
6.3. Problem. In a truth table in terms of two formulas α and β the third
column can be filled out in 16 (=2⁴) different ways. Find 16 combinations
of α and β which yield all these different third columns.
6.4. Problem. We have chosen to regard ¬ and → as primitive (i.e.,
as symbols of ℒ) and to reduce ∧, ∨ and ↔ to them (Def. 5.1). Truth
tables for negation and implication were incorporated into Def. 6.1, while
the truth tables for conjunction, disjunction and bi-implication were
obtained as a result (Prob. 6.2). Show that, similarly, → could be reduced to
(a) ¬ and ∧,
(b) ¬ and ∨.
*6.5. Problem. Show that → cannot be reduced to ¬ and ↔.
6.6. Problem. Let α|β be defined as ¬(α∧β). Show that we could have
taken | (Sheffer's stroke) as a sole primitive connective and reduced both
¬ and → to it.

6.7. Definition. A truth valuation σ on ℒ satisfies a set Φ of ℒ-formulas
(briefly, σ⊨Φ) if φ^σ=⊤ for every φ in Φ.
If Φ consists of just one formula φ we write “σ⊨φ” (read: “σ satisfies
φ”) instead of “σ⊨{φ}”.
If σ⊨α for every truth valuation σ, then α is called a tautology.

6.8. Theorem. Consider a truth table for a formula α in terms of β₁,...,βₖ.
If the “α” column contains only “⊤”s, then α is a tautology. The converse
is also true, provided β₁,...,βₖ are prime and distinct.
Proof. Take any row in the given truth table. Let vᵢ be ⊤ or ⊥ according
as the “βᵢ” entry in our row (i.e., that coming under the heading “βᵢ”)
is “⊤” or “⊥”. Similarly, let v be ⊤ or ⊥ according as the “α” entry
in our row is “⊤” or “⊥”. It follows easily from Def. 6.1 and the rules
for constructing truth tables that if σ is a truth valuation such that

(1) βᵢ^σ = vᵢ for i=1,...,k,

then also α^σ = v.
In particular, if the “α” column contains only “⊤”s, then α must be
a tautology, because any truth valuation σ will fulfil (1) for some choice
of vᵢ, corresponding to some row in the truth table.
If the βᵢ are all prime and distinct, then for each of the 2ᵏ different
choices of vᵢ there is a truth valuation σ for which (1) holds (see remark
following Def. 6.1). Thus if α is a tautology, the “α” column must contain
only “⊤”s. ■
For each α we can find a finite set of prime formulas of which α is a
combination. The smallest such set is the set of prime components of α,
defined by induction on deg α as follows: if α is prime, then it is its own
sole prime component; if α = ¬β, then α has the same prime components
as β; if α = β→γ, then the prime components of α are the prime components
of β plus those of γ. By setting up a truth table for α in terms of its prime
components, we can find out in a finite number of steps whether or not
α is a tautology.
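
The finite test for tautologies just described can be sketched in Python. The encoding is an assumption for illustration (prime formulas as strings, ("not", β) and ("imp", β, γ) for compound formulas, True/False for ⊤/⊥), as are the function names.

```python
from itertools import product

def prime_components(f):
    """Set of prime components of f, by induction on the degree of f."""
    if isinstance(f, str):
        return {f}
    if f[0] == "not":
        return prime_components(f[1])
    return prime_components(f[1]) | prime_components(f[2])

def value(f, sigma):
    """Value of f when the prime components take the values given by sigma."""
    if isinstance(f, str):
        return sigma[f]
    if f[0] == "not":
        return not value(f[1], sigma)
    return (not value(f[1], sigma)) or value(f[2], sigma)

def is_tautology(f):
    """Check every row of the truth table of f in terms of its prime components."""
    primes = sorted(prime_components(f))
    return all(value(f, dict(zip(primes, row)))
               for row in product([True, False], repeat=len(primes)))
```

By Thm. 6.8 (both directions apply, since the prime components are prime and distinct) the function returns True exactly for tautologies. The number of rows examined is 2ᵏ for k prime components, so the test is finite but can be slow for large k.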

6.9. Remark. From the above it is clear that the property of being a
tautology is invariant with respect to language: if ℒ and ℒ' are two first-order
languages such that α is both an ℒ-formula and an ℒ'-formula,
then α is a tautology in ℒ' (i.e., satisfied by every truth valuation on ℒ')
iff it is a tautology in ℒ. This follows from the fact that the procedure
for checking whether or not α is a tautology is exactly the same, and must
yield the same result, in both ℒ and ℒ'. More directly, it is enough to
observe that α^σ depends only on the values assigned by σ to the prime
components of α.

6.10. Problem. Using the first part of Thm. 6.8 verify that for any α, β
and γ the following are tautologies:
(a) α→β→α,
(b) (α→β→γ)→(α→β)→α→γ,
(c) (¬α→β)→(¬α→¬β)→α,
(d) (α→β)∧(β→γ)→α→γ,
(e) (α→β)∧(α→γ)→α→β∧γ,
(f) (α→γ)∧(β→γ)→α∨β→γ,
(g) [α→(β∨γ)]→(α→β)∨(α→γ),
(h) (α∧β→γ)→(α→γ)∨(β→γ),
(i) (α→¬α)→¬α.
6.11. Definition. A formula α is a tautological consequence of a set Φ
of formulas if σ⊨α for every truth valuation σ such that σ⊨Φ. In this case
we write “Φ⊨₀α”; and if Φ is empty we write simply “⊨₀α”.

(It is clear that α is a tautology iff it is a tautological consequence of
the empty set, i.e., ⊨₀α.)
If Φ consists of just one formula φ, we say “tautological consequence
of φ” instead of “tautological consequence of {φ}”.
Formulas α and β are tautologically equivalent if they are tautological
consequences of each other, i.e., if α^σ=β^σ for every truth valuation σ.
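
For a finite set of hypotheses, tautological consequence can be checked by brute force over the prime components involved, as in the following Python sketch. The encoding (strings for prime formulas, ("not", β), ("imp", β, γ), True/False for ⊤/⊥) and the function names are assumptions for illustration. The sketch relies on the fact noted in Remark 6.9 that the value of a formula depends only on the values of its prime components.

```python
from itertools import product

def primes(f):
    """Prime components of f."""
    if isinstance(f, str):
        return {f}
    if f[0] == "not":
        return primes(f[1])
    return primes(f[1]) | primes(f[2])

def value(f, sigma):
    """Value of f under an assignment sigma to its prime components."""
    if isinstance(f, str):
        return sigma[f]
    if f[0] == "not":
        return not value(f[1], sigma)
    return (not value(f[1], sigma)) or value(f[2], sigma)

def is_consequence(hyps, a):
    """True iff a is a tautological consequence of the finite list hyps."""
    ps = sorted(set().union(primes(a), *(primes(h) for h in hyps)))
    for row in product([True, False], repeat=len(ps)):
        sigma = dict(zip(ps, row))
        # A valuation satisfying all hypotheses but not a refutes the claim.
        if all(value(h, sigma) for h in hyps) and not value(a, sigma):
            return False
    return True

def are_equivalent(a, b):
    """True iff a and b are tautological consequences of each other."""
    return is_consequence([a], b) and is_consequence([b], a)
```

With an empty list of hypotheses, is_consequence([], a) is simply a test of whether a is a tautology.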

6.12. Problem. Show that α is a tautological consequence of {φ₁,...,φₖ}
iff the formula

φ₁→φ₂→...→φₖ→α

is a tautology.
6.13. Problem. Show that α and β are tautologically equivalent iff α↔β
is a tautology. Verify that for all α, β, γ, φ₁,...,φₖ the following pairs are
tautologically equivalent:
(a) α→β, ¬β→¬α;
(b) ¬(α→β), α∧¬β;
(c) ¬(φ₁∧φ₂∧...∧φₖ), ¬φ₁∨¬φ₂∨...∨¬φₖ;
(d) ¬(φ₁∨φ₂∨...∨φₖ), ¬φ₁∧¬φ₂∧...∧¬φₖ;
(e) α∧β∧γ, (α∧β)∧γ;
(f) α∨β∨γ, (α∨β)∨γ;
(g) φ₁∧φ₂∧...∧φₖ→α, φ₁→φ₂→...→φₖ→α.
6.14. Problem. Let φ₁,...,φₖ be all the different prime components of α.
Assuming that α is satisfied by at least one truth valuation, show how to
construct a disjunctive normal form for α, i.e., a formula β which is
tautologically equivalent to α and which has the form β₁∨...∨βₙ, where
n≥1 and each βⱼ is of the form φ'₁∧...∧φ'ₖ, where, for each i, φ'ᵢ is
φᵢ or ¬φᵢ.
6.15. Problem. Show that the relations of tautological consequence and
tautological equivalence are invariant with respect to language (cf.
Remark 6.9).

In the following three sections (§§7-9) we study the method of propositional
tableaux, which can be used to find out whether there exists a truth valuation
satisfying a given finite set of formulas, whether a given formula is a
tautological consequence of a given finite set of formulas, and whether
a given formula is a tautology. This method is far more efficient and
elegant than that of truth tables, and is also of considerable theoretical
interest. However, a reader who wishes to take a short cut may skip these
three sections as well as §11, which depends on them.
From now on we shall often write, e.g. “Φ,α” for “Φ∪{α}”.

*§ 7. Propositional tableaux

A tableau is a set of elements, called nodes, partially ordered and classified
into levels as explained below. With each node is associated a finite set
of formulas. We shall usually identify a given node with its associated
set of formulas; this is somewhat imprecise (since in fact the same set of
formulas can be associated with different nodes) but will not cause confusion.
Each node belongs to a unique level, which is labelled by some natural
number. There is just one node of level 0, called the initial node of the
tableau. Each node of level n+1 is a successor of a unique node, which
must be of level n. (In representing a tableau diagrammatically, we put
the successors of a given node below it, and join them to it by edges.)
A node is terminal if it has no successors.
If a tableau has at least one node of level d but no nodes of greater
level, we say that d is the depth of the tableau.
A branch of a tableau is a finite sequence of nodes Φ₀,...,Φₖ such that
Φ₀ is the initial node of the tableau, and, for i=1,...,k, Φᵢ is a successor
of Φᵢ₋₁, and Φₖ is terminal. This branch is said to terminate at Φₖ.
Clearly, for each terminal node there is a unique branch terminating at it.
A formula belonging to a node of a branch is said to be a formula of
that branch. The formulas belonging to the initial node are the initial
formulas of the tableau.
The nodes of a tableau are partially ordered by the relation of following:
each node is followed by its own successors, by their successors, etc.
A tableau with initial node Φ₀ is said to be a tableau for Φ₀.
We shall now prescribe how to construct the kind of tableau studied
in the present chapter, namely propositional tableaux.
Firstly, if Φ₀ is any finite set of formulas, the tableau consisting of
Φ₀ as sole node is a propositional tableau for Φ₀. Here Φ₀ is both initial
and terminal, and there is just one branch.
Next, having obtained a propositional tableau T for Φ₀, we are allowed
to extend it into a new one T' by any one of the following three rules.
In each case T' will have all the nodes of T, plus one or two new nodes.
Succession among the old nodes is the same in T' as in T. The new node(s)
will be assigned as successor(s) to a node that was terminal in T.

Rule ¬¬: If among the formulas of a branch of T terminating at node
Φ there is a formula ¬¬α, add a new node {α} as successor
to Φ.
Rule →: If among the formulas of a branch of T terminating at node
Φ there is a formula α→β, add two new nodes {¬α} and
{β} as successors to Φ.
Rule ¬→: If among the formulas of a branch of T terminating at node
Φ there is a formula ¬(α→β), add a new node {α, ¬β} as
successor to Φ.

In applying any one of these rules the respective formula ¬¬α, α→β
or ¬(α→β) is said to be used for that application.
Schematically the rules are represented as follows:

¬¬-rule:        →-rule:         ¬→-rule:

  ¬¬α             α→β            ¬(α→β)
   |             /   \              |
   α           ¬α     β             α
                                   ¬β

In going over from T to T', one particular branch of T is extended into
a longer branch, or (in the case of Rule →) extended and split into two
longer branches. All the other branches of T remain unchanged.
In this chapter we shall refer to propositional tableaux briefly as just
tableaux, since no other kind will be discussed here.
A branch of a tableau is closed if there is a prime formula α such that
both α and ¬α are formulas of that branch. The formulas α and ¬α
in question are said to be used for closing the branch. A (propositional)
tableau for Φ is called a (propositional) confutation of Φ if all its branches
are closed. To confute Φ is to construct a confutation of Φ.
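
The three rules and the notion of a closed branch can be sketched in Python. The representation is an assumption for illustration: a branch is identified with the set of all its formulas, prime formulas are strings, and compound formulas are the tuples ("not", α) and ("imp", α, β). The function names is_closed and use are hypothetical.

```python
def is_closed(branch):
    """A branch is closed if it has a prime formula together with its negation."""
    return any(isinstance(f, str) and ("not", f) in branch for f in branch)

def use(branch, f):
    """Branches obtained by using the formula f to extend the given branch.

    Returns one branch for the ¬¬ and ¬→ rules, two for the → rule,
    and an empty list if no rule applies to f.
    """
    if isinstance(f, str):
        return []                                     # prime: no rule applies
    if f[0] == "imp":
        # Rule →: split into a branch with ¬α and a branch with β.
        return [branch | {("not", f[1])}, branch | {f[2]}]
    g = f[1]                                          # f is ¬g
    if isinstance(g, str):
        return []                                     # negated prime: no rule applies
    if g[0] == "not":
        return [branch | {g[1]}]                      # Rule ¬¬: add α
    return [branch | {g[1], ("not", g[2])}]           # Rule ¬→: add α and ¬β
```

Here each returned set stands for the whole extended branch (old formulas plus the new node), which is all that matters for deciding closure.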
The idea behind the method of tableaux is made explicit in the following:

7.1. Problem. Show that the three rules are (semantically) sound in the
following sense: if a truth valuation σ satisfies [all the formulas of] a given
branch in a tableau, and if that branch is extended into a new branch
(or extended and split into two new branches) by one of the rules, then
σ also satisfies the new branch (or at least one of the two new branches,
respectively). Hence show that the tableau method is (semantically)
sound, in the sense that if we can confute a finite set Φ of formulas, there
can be no truth valuation satisfying Φ; in particular, if Φ = {¬φ} then
φ is a tautology. Also show that if Φ, ¬α can be confuted then Φ ⊨₀ α.

In constructing a tableau for Φ the reader may think intuitively of Φ
as a story which he is trying to criticize. At each stage of the construction,
the various branches represent alternative (but more specific and detailed)
versions of the same story. Moreover, if Φ is to be believed then at least
one of these versions must be believed as well. A closed branch represents
an unbelievable version.
Suppose that we have a branch (in some tableau) such that for the set
of formulas of that branch we already possess a confutation. Then for all
practical purposes that branch is as good as closed, because we know that
by successively extending it we can eventually get branches all of which
are actually closed. In practice, a branch which is closed (or as good as
closed) need not be extended any further even if the rules allow us to do so.

7.2. Problem. Show that for every α a confutation of {α, ¬α} can be
constructed. (Hint: use induction on deg α.) It follows that if a branch
has α as well as ¬α then that branch is as good as closed, even if α is
not prime.

Our rules do not prevent us from using the same formula over and
over again indefinitely. But to do this is obviously pointless, since we get
nothing essentially new. This motivates the following:

7.3. Definition. A formula φ of a branch is used up in that branch, unless
one of the following three conditions holds:
(a) φ is ¬¬α and α is not a formula of the branch.
(b) φ is α→β and neither ¬α nor β is a formula of the branch.
(c) φ is ¬(α→β) and α, ¬β are not both formulas of the branch.

In practice, a formula used up in a branch need not be used to extend it.
Moreover, when a formula is being used, it is good practice to use it at
once to extend all branches to which it belongs (except those which are
as good as closed) and then to tick it off as a sign that it should not be
used again.
When dealing with formulas rendered in our metalinguistic notation
using “∧”, “∨” and “↔” it is in practice unnecessary each time to
unpack these via Def. 5.1. Instead, we can use the following six rules,
which can be seen to be sound directly from Prob. 6.2, or derived from
rules ¬¬, →, ¬→ via Def. 5.1.

∧-rule:         ¬∧-rule:

  α∧β            ¬(α∧β)
   |             /    \
   α           ¬α      ¬β
   β

∨-rule:         ¬∨-rule:

  α∨β            ¬(α∨β)
  /  \              |
 α    β            ¬α
                   ¬β

↔-rule:         ¬↔-rule:

  α↔β            ¬(α↔β)
  /  \            /  \
 α    ¬α         α    ¬α
 β    ¬β        ¬β     β

Of course, in all theoretical discussions we work in terms of ¬ and →
only, and recognize only the three old rules.
One last practical remark. In general when two or more rules are
applicable at the same time, it is advantageous to apply those that do not
lead to a split before those that do.

Example. Take the formula of Prob. 6.10(h). To show that it is a tautology
we confute its negation.¹ For greater clarity we shall re-copy the tableaux
obtained at each step. (In normal practice this is not necessary: each step
may be grafted onto the tableau obtained at the previous step.) We put
X at the bottom of a branch if it is as good as closed. Here are the
successive steps:

¹ See Prob. 7.1.

(1)  ¬[(α∧β→γ)→(α→γ)∨(β→γ)] ✓
              |
            α∧β→γ
        ¬[(α→γ)∨(β→γ)]

(2)  ¬[(α∧β→γ)→(α→γ)∨(β→γ)] ✓
              |
            α∧β→γ
        ¬[(α→γ)∨(β→γ)] ✓
              |
           ¬(α→γ)
           ¬(β→γ)

(3)  ¬[(α∧β→γ)→(α→γ)∨(β→γ)] ✓
              |
            α∧β→γ
        ¬[(α→γ)∨(β→γ)] ✓
              |
           ¬(α→γ) ✓
           ¬(β→γ)
              |
              α
              ¬γ

(4)  ¬[(α∧β→γ)→(α→γ)∨(β→γ)] ✓
              |
            α∧β→γ
        ¬[(α→γ)∨(β→γ)] ✓
              |
           ¬(α→γ) ✓
           ¬(β→γ) ✓
              |
              α
              ¬γ
              |
              β
              ¬γ

(5)  ¬[(α∧β→γ)→(α→γ)∨(β→γ)] ✓
              |
            α∧β→γ ✓
        ¬[(α→γ)∨(β→γ)] ✓
              |
           ¬(α→γ) ✓
           ¬(β→γ) ✓
              |
              α
              ¬γ
              |
              β
              ¬γ
            /    \
      ¬(α∧β)      γ
                  X

(6)  ¬[(α∧β→γ)→(α→γ)∨(β→γ)] ✓
              |
            α∧β→γ ✓
        ¬[(α→γ)∨(β→γ)] ✓
              |
           ¬(α→γ) ✓
           ¬(β→γ) ✓
              |
              α
              ¬γ
              |
              β
              ¬γ
            /    \
      ¬(α∧β) ✓    γ
       /   \      X
     ¬α     ¬β
     X      X

7.4. Problem. Show by tableaux that the other formulas of 6.10 are
tautologies.

*§ 8. The Elimination Theorem for propositional tableaux

We shall need the following five simple lemmas.

8.1. Lemma. Given a confutation of Φ, we can confute Φ∪Ψ, where Ψ is
any finite set of formulas. ■

8.2. Lemma. Given a confutation of Φ, ¬¬α, we can confute Φ, α.
Proof. We imitate the given confutation, except that nodes {α} which
were obtained by using ¬¬α must now be cut out; but since we have
now got α in the initial node, we can still use α in every branch. ■

8.3. Lemma. Given a confutation of Φ, α→β, we can confute Φ, ¬α
and Φ, β.
Proof. To confute Φ, ¬α, we imitate the given confutation except that
where α→β was used to get a configuration of the form

   /  \
 ¬α    β

we now cut out the node {¬α}, as well as the node {β} and all the nodes
following it. In a similar way we get a confutation of Φ, β. ■

8.4. Lemma. Given a confutation of Φ, ¬(α→β), we can confute Φ, α, ¬β.
Proof. Similar to that of 8.2. ■

8.5. Lemma. Given confutations of Φ, δ and Φ, ¬δ, where δ is prime,
we can confute Φ.
Proof. We imitate the given confutation of Φ, δ, except where δ was
used. The only use that could have been made of δ was to close branches
in which ¬δ turned up. But since we are given a confutation of Φ, ¬δ
and we have Φ as initial node, such a branch is as good as closed even
without the help of δ. ■

We can now generalize Lemma 8.5.

8.6. Elimination Lemma. Given confutations of Φ, δ and Φ, ¬δ, where
δ is any formula, we can confute Φ.
Proof. By induction on deg δ. If δ is prime, we appeal to Lemma 8.5.
If δ is ¬α then we are given confutations of Φ, ¬α and Φ, ¬¬α. From
the latter we construct by Lemma 8.2 a confutation of Φ, α. By the induction
hypothesis we can now confute Φ, since deg α = deg δ − 1.
Finally, if δ is α→β then we have got confutations of Φ, α→β and Φ,
¬(α→β). From these we construct (using Lemmas 8.3 and 8.4)

(1) a confutation of Φ, ¬α,

(2) a confutation of Φ, β,

(3) a confutation of Φ, α, ¬β.

By Lemma 8.1 we can get from (2)

(4) a confutation of Φ, α, β.

From (3) and (4) we get by the induction hypothesis (since deg β<deg δ)

(5) a confutation of Φ, α,

and using the induction hypothesis once more we get from (1) and (5)
a confutation of Φ. ■

Let us try to strengthen the method of propositional tableaux by adding
to the three old rules a fourth one:

Rule EM: If Φ is any terminal node of a tableau T,
and δ is any formula, add two nodes {δ}
and {¬δ} as successors to Φ.

Schematically, this is represented thus:

EM-rule:

   /  \
  δ    ¬δ

We say that the formula δ is introduced via this application of the rule.
“EM” stands for excluded middle. The EM-rule can be justified semantically
(as the old rules were justified in Prob. 7.1) by observing that any truth
valuation must satisfy δ or ¬δ.
Though the new rule may help us to get confutations more quickly,
it is (in principle) redundant.

8.7. Elimination Theorem. Given a confutation of Φ₀ in which the EM-rule
is used, we can construct an EM-free confutation of Φ₀ (i.e., a confutation
of Φ₀ not using the EM-rule).
Proof. Let k be the deepest (i.e., maximal) level in the given confutation
at which there is an application of the EM-rule. Consider an application
at this level:

     Φₖ
    /  \
   δ    ¬δ

Before this application was made, Φₖ was the terminal node of a branch
Φ₀,...,Φₖ. Let Φ be the set of all formulas of this branch, i.e.,
Φ=Φ₀∪...∪Φₖ. By the choice of k, all nodes which follow {δ} in the
confutation must be obtained by the three old rules. Thus, by imitating
that part of the given confutation we can get an EM-free confutation
of Φ, δ. Similarly we get an EM-free confutation of Φ, ¬δ. By the
Elimination Lemma we can now construct an EM-free confutation of Φ.
Thus, before our application of the EM-rule was made, the branch
Φ₀,...,Φₖ was already as good as closed, without any need for the EM-rule.
We have therefore eliminated one application of the EM-rule from the
given confutation. This process can be repeated until we finally get an
EM-free confutation of Φ₀. ■

We shall not regard the EM-rule as one of the rules of the tableau
method. However, by virtue of the Elimination Theorem we can still
use that rule in practice.

*§ 9. Completeness of propositional tableaux

In this section we shall see (among other things) that the tableau method
is complete in the sense that if Φ is a finite set of formulas which is not
satisfied by any truth valuation, then Φ can be confuted. (The converse
of this, included in Prob. 7.1, is that the method is sound.)
If Φ is a finite set of formulas, we define deg Φ to be the sum of deg φ
for all φ in Φ.
By the degree of a branch of a tableau we mean deg Φ, where Φ is the
set of all formulas of that branch which are not used up in it. A tableau
is exhausted if all its branches have degree 0.

9.1. Problem. Let Φ₀ be a finite set of formulas. Let T be a tableau for
Φ₀ constructed in such a way that at each stage of the construction no
formula is used to extend a branch if that formula is already used up in
the branch. Let Φ₀,...,Φₖ be any branch of T. Show that its degree is
at most deg Φ₀ − k. Hence show that an exhausted tableau for Φ₀ can be
constructed.
9.2. Problem. Let T be an exhausted tableau for Φ. Suppose that in
T there is a branch which is not closed, and let Ψ be the set of all formulas
of that branch. Let σ be the (unique) truth valuation such that for every
prime formula α, α^σ=⊤ iff α is in Ψ. Show that for every formula φ,
φ^σ=⊤ if φ is in Ψ and φ^σ=⊥ if ¬φ is in Ψ. Hence σ⊨Φ. In particular,
if Φ = {¬φ}, then φ is not a tautology.

9.3. Theorem. Let Φ be a finite set of formulas. In a finite number of
steps we can construct for Φ an exhausted tableau T. T is a confutation
iff there is no truth valuation satisfying Φ. In particular, if Φ = {¬φ} then
T is a confutation iff φ is a tautology.
Proof. Immediate from Probs. 7.1, 9.1 and 9.2. ■

The method of tableaux can thus be used to find out whether any given
formula is a tautology or not. In practice, this method is considerably
more efficient than that of truth tables.
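
The decision method of Thm. 9.3 can be sketched in Python. The encoding is an assumption for illustration: a branch is identified with the set of all its formulas, prime formulas are strings, and compound formulas are ("not", α) and ("imp", α, β). The functions open_branch and countermodel are hypothetical names; a formula is used only when it is not used up in the sense of Def. 7.3, and an open exhausted branch is turned into a valuation in the manner of Prob. 9.2.

```python
def open_branch(branch):
    """Formulas of an exhausted open branch extending `branch`, or None if
    every way of extending the branch leads to closed branches only."""
    if any(isinstance(f, str) and ("not", f) in branch for f in branch):
        return None                                    # the branch is closed
    for f in branch:
        if isinstance(f, str):
            continue
        if f[0] == "imp":
            a, b = f[1], f[2]
            if ("not", a) not in branch and b not in branch:      # not used up
                left = open_branch(branch | {("not", a)})
                return left if left is not None else open_branch(branch | {b})
        elif not isinstance(f[1], str):
            g = f[1]                                   # f is ¬g with g compound
            if g[0] == "not" and g[1] not in branch:
                return open_branch(branch | {g[1]})
            if g[0] == "imp" and not ({g[1], ("not", g[2])} <= branch):
                return open_branch(branch | {g[1], ("not", g[2])})
    return branch                                      # exhausted and not closed

def countermodel(f):
    """None if f is a tautology; otherwise the set of prime formulas that are
    true under some truth valuation falsifying f."""
    branch = open_branch(frozenset({("not", f)}))
    if branch is None:
        return None
    return {p for p in branch if isinstance(p, str)}
```

For example, countermodel(("imp", "p", "q")) returns {"p"}: the valuation making p true and q false does not satisfy p→q. Termination follows because every formula added is a subformula, or the negation of a subformula, of an initial formula.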

9.4. Problem. Using Prob. 7.1 and Thm. 9.3, find an alternative proof
for Lemma 8.6.
9.5. Problem. Let Φ be a finite set of formulas. Show that if Φ ⊨₀ α then
Φ, ¬α can be confuted. (This is the converse of the last part of Prob. 7.1.)

§ 10. The propositional calculus

The propositional calculus is a formal method for generating tautological
consequences (indeed, all tautological consequences) of any given set
Φ of formulas.
We start with the following simple fact:

10.1. Lemma. For any formulas α and β,

{α, α→β} ⊨₀ β. ■

The operation of passing from two formulas α and α→β to the formula
β is called modus ponens. In this connection, α and α→β are called the
minor premiss and major premiss respectively, and β is called the conclusion.
Note that the major premiss is an implication formula, whose antecedent
and consequent are the minor premiss and conclusion respectively.
Lemma 10.1 shows that modus ponens is (semantically) sound as a rule
of inference, in the sense that it “preserves truth”: if α^σ = (α→β)^σ = ⊤,
then also β^σ = ⊤.
We shall now designate certain formulas as propositional axioms. We
shall then show that if Φ is any set of formulas, then every formula obtained
from Φ and these axioms by repeated application of modus ponens is a
tautological consequence of Φ. (Later on we shall show that every
tautological consequence of Φ can be obtained in this way.) This machinery
for obtaining tautological consequences is the propositional calculus.¹
As propositional axioms of ℒ we take all ℒ-formulas of the following
forms:

(Ax. I) α→β→α,
(Ax. II) (α→β→γ)→(α→β)→α→γ,
(Ax. III) (¬α→β)→(¬α→¬β)→α,

where α, β, γ are any ℒ-formulas.
Notice that we have got here not three single axioms but three axiom
schemes, each representing infinitely many axioms obtained by all possible
choices of formulas α, β, γ.
Let Φ be a set of ℒ-formulas. By a propositional deduction from Φ in
ℒ we mean a finite non-empty sequence of ℒ-formulas φ₁,...,φₙ such
that, for each k (1≤k≤n), φₖ is a propositional axiom of ℒ, or φₖ∈Φ
or φₖ is obtained by modus ponens from earlier formulas in the same
sequence (i.e., there are i,j<k such that φⱼ = φᵢ→φₖ).
In this connection Φ is called a set of hypotheses.
A propositional deduction in ℒ from the empty set of hypotheses is
called a propositional proof in ℒ.
In this chapter we shall often say simply deduction and proof omitting
the adjective propositional, since no other deductions and proofs will be
dealt with here. Also, as before, we shall usually omit qualifications
like “of ℒ” and “in ℒ”.
A deduction (or proof) whose last formula is α is said to be a deduction
(or proof, respectively) of α.
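
The definition of a deduction is mechanical enough to be checked by a program. The following Python sketch assumes a hypothetical encoding (prime formulas as strings, ("not", α) and ("imp", α, β) for compound formulas) and reads the axiom schemes with association to the right, so that Ax. I is α→(β→α), Ax. II is (α→(β→γ))→((α→β)→(α→γ)) and Ax. III is (¬α→β)→((¬α→¬β)→α).

```python
def imp(a, b):
    return ("imp", a, b)

def neg(a):
    return ("not", a)

def is_imp(f):
    return isinstance(f, tuple) and f[0] == "imp"

def is_axiom(f):
    """True if f is an instance of one of the three axiom schemes."""
    if not is_imp(f):
        return False
    left, right = f[1], f[2]
    # Ax. I: the consequent is an implication ending in the antecedent.
    if is_imp(right) and right[2] == left:
        return True
    # Ax. II: read off α, β, γ from the antecedent and compare the consequent.
    if is_imp(left) and is_imp(left[2]):
        a, b, c = left[1], left[2][1], left[2][2]
        if right == imp(imp(a, b), imp(a, c)):
            return True
    # Ax. III: the antecedent has the form ¬α→β.
    if is_imp(left) and isinstance(left[1], tuple) and left[1][0] == "not":
        a, b = left[1][1], left[2]
        if right == imp(imp(neg(a), neg(b)), a):
            return True
    return False

def is_deduction(seq, hyps):
    """True if the list seq is a propositional deduction from the set hyps."""
    if not seq:
        return False                                   # must be non-empty
    for k, f in enumerate(seq):
        by_mp = any(seq[j] == imp(seq[i], f)
                    for i in range(k) for j in range(k))
        if not (is_axiom(f) or f in hyps or by_mp):
            return False
    return True
```

For instance, the sequence p, p→q, q is accepted as a deduction from the hypotheses {p, p→q}, while the one-term sequence q is not a deduction from {p}.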
We write “Φ ⊢₀ α” to assert that α is deducible from Φ (i.e., that there is
a deduction of α from Φ). If Φ is empty, so that α is provable (i.e., there
is a proof of α), we write simply “⊢₀ α”. Also, we write, e.g., “φ ⊢₀ α”
instead of “{φ} ⊢₀ α”.

¹ To be quite precise, this is just one version of the propositional calculus. Other versions
are based on different choices of axioms and rule(s) of inference.

10.2. Theorem. If Φ ⊢₀ α, then Φ ⊨₀ α. In particular, if ⊢₀ α then ⊨₀ α.
Proof. Let φ₁,...,φₙ be a deduction of α from Φ. Thus φₙ = α. By induction
on k = 1,...,n we show that Φ ⊨₀ φₖ. (Thus, for k=n, we have Φ ⊨₀ α.)
If φₖ is a propositional axiom, we easily verify that φₖ is a tautology
(cf. Prob. 6.10). Thus φₖ is satisfied by every truth valuation and is therefore
a tautological consequence of any set of formulas.
If φₖ∈Φ then clearly Φ ⊨₀ φₖ.
Finally, if for some i,j<k we have φⱼ = φᵢ→φₖ, then {φᵢ, φⱼ} ⊨₀ φₖ
by Lemma 10.1. But by the induction hypothesis Φ ⊨₀ φᵢ and Φ ⊨₀ φⱼ.
Hence clearly Φ ⊨₀ φₖ. ■

Thm. 10.2 means that the propositional calculus is (semantically) sound.

10.3. Lemma. For any α, ⊢₀ α→α.
Proof. Here is a propositional proof of α→α:¹
(α→(α→α)→α)→(α→α→α)→α→α, (Ax. II)
α→(α→α)→α, (Ax. I)
(α→α→α)→α→α, (m.p.)
α→α→α, (Ax. I)
α→α. (m.p.) ■

In the sequel we shall make implicit use of the following simple facts
about deductions:
If Φ⊆Ψ then any deduction from Φ is also a deduction from Ψ.
If φ₁,...,φₙ is a deduction from Φ and 1≤k≤n then φ₁,...,φₖ is also
a deduction from Φ.
If φ₁,...,φₙ is a deduction from Φ and ψ₁,...,ψₘ is a deduction from Ψ,
then the concatenation of the two sequences (i.e., φ₁,...,φₙ,ψ₁,...,ψₘ)
is a deduction from Φ∪Ψ.

10.4. Deduction Theorem. Given a deduction of p from ct>, a, we can


construct a deduction of a-»p from <D. (Thus, //<D,ah-0p, then <pj-oa-*PJ
Proof. Let <pi,...,<p„(=p) be the given deduction of p from O, a. We show

1 For convenience we have added justifications in the margin. Thus the first formula
is an instance of Ax. II (where we take a->-a and a for p and y respectively). In principle
such justifications are redundant, since the reader could always check for himself
whether a given formula is an instance of one of the axiom schemes, or whether it is
obtainable by modus ponens from two earlier formulas.
CH. 1, §10],
THE PROPOSITIONAL CALCULUS
37

by induction on k = 1,...,n that a deduction of α→φₖ from Φ can be
constructed. The following cases are possible: φₖ is an axiom, or φₖ ∈ Φ,
or φₖ = α, or φₖ is obtained by modus ponens from two earlier formulas
φᵢ and φⱼ.
If φₖ is an axiom then the following is a proof of α→φₖ, and a fortiori
a deduction of α→φₖ from Φ:
φₖ, (Ax.)
φₖ→α→φₖ, (Ax. I)
α→φₖ. (m.p.)
If φₖ ∈ Φ the same sequence of three formulas is a deduction of α→φₖ
from Φ (except that in this case the justification in the first line should
read hyp., short for hypothesis).
If φₖ = α, then in the proof of Lemma 10.3 we had a propositional proof¹
of α→α (= α→φₖ). This is a fortiori a deduction of α→φₖ from Φ.
Finally, suppose that for some i,j<k we have φⱼ = φᵢ→φₖ. Then by
the induction hypothesis we have got deductions of α→φᵢ and α→φᵢ→φₖ
from Φ. We concatenate these two deductions and adjoin three new
formulas:

(α→φᵢ→φₖ)→(α→φᵢ)→α→φₖ, (Ax. II)
(α→φᵢ)→α→φₖ, (m.p.)
α→φₖ. (m.p.)

We thus have a deduction of α→φₖ from Φ. ∎
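
Since the proof of the Deduction Theorem is constructive, it can be read as a procedure. The following sketch, which is not part of the original text, follows the four cases of the proof. It assumes the tuple encoding of formulas shown in the comments and the three axiom schemes as used in this section; the names discharge, self_imp and is_deduction are hypothetical.

```python
def imp(a, b, *rest):
    """Right-associated implication: imp(a, b, c) encodes a -> (b -> c)."""
    return ("imp", a, imp(b, *rest)) if rest else ("imp", a, b)

def match(pattern, formula, env):
    """Match an axiom scheme against a formula; strings in the scheme are metavariables."""
    if isinstance(pattern, str):
        return env.setdefault(pattern, formula) == formula
    return (isinstance(formula, tuple) and len(formula) == len(pattern)
            and formula[0] == pattern[0]
            and all(match(p, f, env) for p, f in zip(pattern[1:], formula[1:])))

SCHEMES = [
    imp("A", "B", "A"),                                                  # Ax. I
    imp(imp("A", "B", "C"), imp("A", "B"), "A", "C"),                    # Ax. II
    imp(imp(("not", "A"), "B"), imp(("not", "A"), ("not", "B")), "A"),   # Ax. III
]

def is_axiom(phi):
    return any(match(scheme, phi, {}) for scheme in SCHEMES)

def is_deduction(lines, hyps):
    """Every line must be an axiom, a hypothesis, or follow by modus ponens."""
    return all(
        is_axiom(phi) or phi in hyps
        or any(("imp", psi, phi) in lines[:k] for psi in lines[:k])
        for k, phi in enumerate(lines)
    )

def self_imp(a):
    """The five-line proof of a -> a from Lemma 10.3."""
    return [imp(imp(a, imp(a, a), a), imp(a, a, a), a, a), imp(a, imp(a, a), a),
            imp(imp(a, a, a), a, a), imp(a, a, a), imp(a, a)]

def discharge(deduction, alpha, hyps):
    """From a deduction of beta from hyps + [alpha], build one of alpha -> beta from hyps."""
    out = []
    for k, phi in enumerate(deduction):
        if phi == alpha:
            out += self_imp(alpha)
        elif is_axiom(phi) or phi in hyps:
            out += [phi, imp(phi, alpha, phi), imp(alpha, phi)]   # Ax. or hyp., Ax. I, m.p.
        else:
            # phi was obtained by modus ponens from some earlier phi_i and phi_i -> phi.
            earlier = deduction[:k]
            phi_i = next(p for p in earlier if ("imp", p, phi) in earlier)
            out += [imp(imp(alpha, phi_i, phi), imp(alpha, phi_i), alpha, phi),  # Ax. II
                    imp(imp(alpha, phi_i), alpha, phi),                          # m.p.
                    imp(alpha, phi)]                                             # m.p.
    return out

# A deduction of c from {a -> b, b -> c} together with the extra hypothesis a.
hyps = [imp("a", "b"), imp("b", "c")]
given = ["a", hyps[0], "b", hyps[1], "c"]
result = discharge(given, "a", hyps)
print(len(result), result[-1])
```

The sketch makes no attempt to shorten the output: each line of the given deduction contributes three or five lines to the new one, exactly as in the proof.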

Let us note that the converse of the Deduction Theorem is clearly true.
For, if we are given a deduction of α→β from Φ, and α is added as an
extra hypothesis, we can then employ α with α→β to get β by modus ponens.
In this way we get a deduction of β from Φ, α.
The Deduction Theorem is of great practical use in constructing deductions.
Suppose we want to construct a deduction of α→β from Φ. Then
by the Deduction Theorem it is enough to construct a deduction of β
from Φ, α. In this way we have both simplified the formula to be deduced

1 Note the difference in meaning between the two uses of the word proof here. The first
refers to a (hopefully convincing) argument used to establish a metamathematical result
(i.e., result about ℒ). The second refers to a sequence of formulas in ℒ.

(β instead of α→β) and gained an extra hypothesis, α. This is a double
simplification of the problem: in general it is easier to deal with shorter
formulas; and it is also clear that the more hypotheses we have, the easier
it is to make deductions from them.
So far we have not made use of Ax. III, which encapsulates the deductive
properties of negation. We shall do so now.

10.5. Theorem. For any α and β,
(a) ¬¬α ⊢₀ α,
(b) α ⊢₀ ¬¬α,
(c) β, ¬β ⊢₀ α.
Proof. (a). We take a proof of ¬α→¬α (see Lemma 10.3) and adjoin
to it six further formulas, obtaining the following deduction of α from ¬¬α:

¬α→¬α,
(¬α→¬α)→(¬α→¬¬α)→α, (Ax. III)
(¬α→¬¬α)→α, (m.p.)
¬¬α, (hyp.)
¬¬α→¬α→¬¬α, (Ax. I)
¬α→¬¬α, (m.p.)
α. (m.p.)
(b). By (a) we have a deduction of ¬α from ¬¬¬α. Using the Deduction
Theorem (with Φ empty) we get a propositional proof of ¬¬¬α→¬α.
We take such a proof and by adjoining six further formulas, we obtain
a deduction of ¬¬α from α, as follows:

¬¬¬α→¬α,
(¬¬¬α→α)→(¬¬¬α→¬α)→¬¬α, (Ax. III)
α, (hyp.)
α→¬¬¬α→α, (Ax. I)
¬¬¬α→α, (m.p.)
(¬¬¬α→¬α)→¬¬α, (m.p.)
¬¬α. (m.p.)
(c). Here is a deduction of α from β, ¬β:
(¬α→β)→(¬α→¬β)→α, (Ax. III)
β, (hyp.)
β→¬α→β, (Ax. I)
¬α→β, (m.p.)
(¬α→¬β)→α, (m.p.)
¬β, (hyp.)
¬β→¬α→¬β, (Ax. I)
¬α→¬β, (m.p.)
α. (m.p.) ∎
In the sequel we shall make implicit use of the following fact. Suppose
that Φᵢ ⊢₀ βᵢ for i = 1,...,n and {β₁,...,βₙ} ⊢₀ α. Then Φ₁∪...∪Φₙ ⊢₀ α.
This is so because we can take a sequence of formulas obtained by
concatenating deductions of βᵢ from Φᵢ (for i = 1,...,n) and, using
any deduction of α from {β₁,...,βₙ}, extend this sequence to a deduction
of α from Φ₁∪...∪Φₙ.

A set Φ of formulas is propositionally inconsistent if for some β both
Φ ⊢₀ β and Φ ⊢₀ ¬β. Otherwise Φ is propositionally consistent. In this
chapter we shall often omit the adverb “propositionally”.
By virtue of the soundness of the propositional calculus we have:

10.6. Theorem. No truth valuation satisfies an inconsistent set of formulas.


Proof. Let Φ ⊢₀ β and Φ ⊢₀ ¬β. If σ ⊨ Φ then by Thm. 10.2 we would
have both σ ⊨ β and σ ⊨ ¬β, which is impossible. ∎

It follows that the empty set of formulas is consistent, since this set is
satisfied (vacuously) by every truth valuation. This fact (the propositional
consistency of ∅) is usually expressed by saying that the propositional
calculus is consistent.
In §11 the consistency of the propositional calculus will be proved by
another method, using tableaux rather than semantics.

10.7. Theorem. A set Φ of formulas is propositionally inconsistent iff
Φ ⊢₀ α for every formula α.
Proof. If Φ ⊢₀ α for every α, then for some (in fact, for all) β we have
Φ ⊢₀ β as well as Φ ⊢₀ ¬β.
Conversely, if for some β both Φ ⊢₀ β and Φ ⊢₀ ¬β, then for any formula
α we get Φ ⊢₀ α by Thm. 10.5(c). ∎

10.8. Theorem. For any Φ and α:
(a) Φ, ¬α is propositionally inconsistent iff Φ ⊢₀ α,
(b) Φ, α is propositionally inconsistent iff Φ ⊢₀ ¬α.
Proof. (a) If Φ, ¬α is inconsistent, then for some β we have Φ, ¬α ⊢₀ β
and Φ, ¬α ⊢₀ ¬β. By the Deduction Theorem, Φ ⊢₀ ¬α→β and
Φ ⊢₀ ¬α→¬β. On the other hand, using

(¬α→β)→(¬α→¬β)→α (Ax. III)

we get (with two applications of modus ponens) a deduction of α from
{¬α→β, ¬α→¬β}. Thus Φ ⊢₀ α.
Conversely, if Φ ⊢₀ α then Φ, ¬α is inconsistent because then Φ, ¬α ⊢₀ α
and Φ, ¬α ⊢₀ ¬α.
(b) Using Thm. 10.5(a) it is easy to see that any formula deducible
from Φ, α is also deducible from Φ, ¬¬α. In particular, if Φ, α is
inconsistent then Φ, ¬¬α is inconsistent as well; so by the first part of the
present theorem we have Φ ⊢₀ ¬α.
Conversely, if Φ ⊢₀ ¬α then Φ, α is inconsistent because both Φ, α ⊢₀ ¬α
and Φ, α ⊢₀ α. ∎

10.9. Problem. Show that if (α→β) ∈ Φ, and both Φ, ¬α and Φ, β are
inconsistent then Φ is inconsistent. Also show that if ¬(α→β) ∈ Φ, and
Φ, α, ¬β is inconsistent then Φ is inconsistent.
10.10. Problem. Show that a set Φ of formulas is consistent iff every
finite subset of Φ is consistent.

*§11. The propositional calculus and tableaux

Let Φ be a finite set of formulas and let α be a formula. Consider the
following three conditions:
(i) Φ ⊨₀ α,
(ii) Φ ⊢₀ α,
(iii) Φ, ¬α can be (propositionally) confuted.
We know (see Prob. 9.5 and Thm. 10.2) that (i)⇒(iii) and (ii)⇒(i) (in fact,
the latter holds for infinite Φ as well). It follows that (ii)⇒(iii). In the
present section we shall give an alternative proof that (ii)⇒(iii). This
proof will be direct; it will not make an excursion through semantics
(but will use the results of §8).
Also, we shall give a direct proof that (iii)⇒(ii). It will then follow
that (i), (ii) and (iii) are equivalent.
The fact that (iii)⇒(i) has already been proved directly (Prob. 7.1).
But (i)⇒(ii) is new. (In §12 we shall prove (i)⇒(ii) directly, and in §13
we shall extend this to infinite as well as finite Φ.)

11.1. Theorem. Let Φ be a finite set of formulas. Given a deduction of
α from Φ, we can confute Φ, ¬α.
Proof. Let φ₁,...,φₙ (=α) be the given deduction of α from Φ. We show
by induction on k = 1,...,n how to confute Φ, ¬φₖ. (For k=n we get
the required confutation of Φ, ¬α.)
If φₖ is a propositional axiom it can easily be verified that ¬φₖ can be
confuted (see Prob. 7.4). Hence by Lemma 8.1 we can confute Φ, ¬φₖ.
If φₖ ∈ Φ then by Prob. 7.2 and Lemma 8.1 we can confute Φ, ¬φₖ.
Finally, if for some i,j<k we have φⱼ = φᵢ→φₖ, then by the induction
hypothesis we can construct

(1) a confutation of Φ, ¬φᵢ,

(2) a confutation of Φ, ¬(φᵢ→φₖ).

We confute Φ, ¬φₖ as follows. First, we apply the EM-rule, introducing
the formula φᵢ→φₖ. Next, we apply the →-rule to φᵢ→φₖ and get

            Φ, ¬φₖ
           /       \
      φᵢ→φₖ      ¬(φᵢ→φₖ)
      /    \
    ¬φᵢ    φₖ

Here the leftmost and rightmost branches are as good as closed, using
(1) and (2) respectively. The middle branch is as good as closed because it
has both ¬φₖ and φₖ (see Prob. 7.2).
By the Elimination Theorem 8.7, we can now eliminate our use of the
EM-rule. ∎

11.2. Problem. Using Thm. 11.1 and Prob. 9.5, find an alternative proof
for Thm. 10.2. (Warning: note that in Thm. 10.2 Φ may be infinite.)

11.3. Theorem. Every finite inconsistent set of formulas can be confuted.


Proof. If we are given deductions of formulas β and ¬β from a finite
set Φ, then by Thm. 11.1 we can confute Φ, ¬β as well as Φ, ¬¬β.
Using the Elimination Lemma 8.6, we can confute Φ. ∎

Using Thm. 11.3 we get a new proof for the consistency of the propositional
calculus. For, if the empty set were inconsistent, then it could be
confuted; but this is impossible since none of the tableau rules can be
applied to the empty set.
We now prove the converse of Thm. 11.3.

11.4. Theorem. Every finite set of formulas that has a confutation is
inconsistent.
Proof. Given a confutation of Φ, we show by induction on its depth d
that Φ is inconsistent.
If d=0, this means that the given confutation has just one node, Φ.
Thus there must be some (prime) formula α such that α ∈ Φ and ¬α ∈ Φ.
Then Φ is clearly inconsistent.
Now suppose that d>0 and examine level 1 of the given confutation.
We consider three cases, corresponding to the three different ways in
which the node(s) of this level could have arisen.
Case ¬¬: There is just one node of level 1, and it arises by applying
the ¬¬-rule to a formula ¬¬β ∈ Φ. Then the given confutation
starts thus:

    Φ
    |
    β

Now, if in this tableau we fuse the initial node Φ with its successor {β},
we get a confutation of Φ, β and the depth of this confutation is now d−1.
By the induction hypothesis Φ, β must therefore be inconsistent. By
Thm. 10.8(b) we have Φ ⊢₀ ¬β, but since ¬¬β ∈ Φ it follows that Φ is
inconsistent.
Case →: There are two nodes of level 1, arising by an application
of the →-rule to a formula α→β ∈ Φ. Then the given confutation starts
thus:

     Φ
    / \
  ¬α   β

Now, if in this confutation we cut out the node {β} as well as all nodes
following it, and we then fuse the node Φ with the node {¬α}, we get
a confutation of Φ, ¬α and this has depth <d. By the induction hypothesis
it follows that Φ, ¬α is inconsistent. Similarly, we show that Φ, β is
also inconsistent. Since α→β ∈ Φ it follows easily that Φ must be inconsistent
(see Prob. 10.9).
Case ¬→: There is just one node of level 1, arising by application
of the ¬→-rule to a formula ¬(α→β) ∈ Φ. Then the given confutation
starts thus:

    Φ
    |
    α
   ¬β

and as in Case ¬¬ we see that Φ, α, ¬β is inconsistent. Since ¬(α→β) ∈ Φ,
it is easy to show (see Prob. 10.9) that Φ is inconsistent. ∎
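
The three cases in the proof above correspond to the three tableau rules, and the search for a confutation of a finite set can be sketched as a short recursive procedure. The following is an illustration only, not the text's own formulation. It assumes the tuple encoding shown in the comments, closes a branch only on a prime formula together with its negation, and (as a simplification) drops a formula from the branch once its rule has been applied. The name can_confute is hypothetical.

```python
# A prime formula is a string; ("not", A) is a negation; ("imp", A, B) an implication.
def neg(a):
    return ("not", a)

def imp(a, b):
    return ("imp", a, b)

def is_literal(f):
    """A prime formula or the negation of a prime formula."""
    return isinstance(f, str) or (f[0] == "not" and isinstance(f[1], str))

def can_confute(formulas):
    """Return True if the finite set of formulas has a closed tableau."""
    branch = set(formulas)
    # The branch closes as soon as it contains a prime formula and its negation.
    if any(isinstance(f, str) and neg(f) in branch for f in branch):
        return True
    for f in branch:
        if is_literal(f):
            continue
        rest = branch - {f}
        if f[0] == "imp":
            # ->-rule: split into a branch with not-alpha and a branch with beta.
            return can_confute(rest | {neg(f[1])}) and can_confute(rest | {f[2]})
        g = f[1]                      # f is the negation of a non-prime formula g
        if g[0] == "not":
            # not-not-rule: from not-not-beta pass to beta.
            return can_confute(rest | {g[1]})
        # not-->-rule: from not(alpha -> beta) pass to alpha and not-beta.
        return can_confute(rest | {g[1], neg(g[2])})
    return False   # only literals remain and none clash: this branch stays open

# Phi = {a -> b, a} and alpha = b: the set Phi, not-b can be confuted.
print(can_confute({imp("a", "b"), "a", neg("b")}))
```

Each rule application replaces a formula by strictly shorter ones, so the recursion terminates for every finite set.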

11.5. Problem. Show that if Φ is a finite set of formulas such that no
truth valuation satisfies Φ, then Φ is propositionally inconsistent. (This
is a partial converse to Thm. 10.6.)
11.6. Problem. Show that if Φ is a finite set of formulas such that Φ, ¬α
has a confutation, then Φ ⊢₀ α. (This is the converse of Thm. 11.1.)

§ 12. Weak completeness of the propositional calculus

The main result of this section is the following partial converse to Thm. 10.2:

12.1. Weak Completeness Theorem. If Φ is finite and Φ ⊨₀ α, then Φ ⊢₀ α.
In particular, if ⊨₀ α then ⊢₀ α.
Proof. The result follows at once from Probs. 9.5 and 11.6. An alternative
proof, which is more direct since it does not make an excursion via tableaux,
is sketched in the following three problems (12.2-12.4). |
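
Together with Thm. 10.2, the Weak Completeness Theorem means that for finite Φ the question whether Φ ⊢₀ α can be settled by inspecting a truth table. The following sketch, which is not part of the original text, checks Φ ⊨₀ α by brute force over all rows. It assumes that prime formulas are encoded as strings and compound formulas as tuples; the names value, primes and taut_consequence are illustrative.

```python
from itertools import product

# A prime formula is a string; ("not", A) is a negation; ("imp", A, B) an implication.
def value(f, sigma):
    """Truth value of f under the truth valuation sigma (a dict on prime formulas)."""
    if isinstance(f, str):
        return sigma[f]
    if f[0] == "not":
        return not value(f[1], sigma)
    return (not value(f[1], sigma)) or value(f[2], sigma)

def primes(f):
    """The set of prime components of f."""
    return {f} if isinstance(f, str) else set().union(*(primes(g) for g in f[1:]))

def taut_consequence(phi, alpha):
    """Check Phi |=0 alpha by running through every row of the truth table."""
    atoms = sorted(set().union(primes(alpha), *(primes(f) for f in phi)))
    for row in product([True, False], repeat=len(atoms)):
        sigma = dict(zip(atoms, row))
        if all(value(f, sigma) for f in phi) and not value(alpha, sigma):
            return False   # this row satisfies Phi but not alpha
    return True

print(taut_consequence([("imp", "a", "b"), "a"], "b"))
```

The number of rows doubles with each additional prime component, so this is a decision method in principle rather than an efficient one.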
12.2. Problem. Let α be a propositional combination of β₁,...,βₖ. Consider
a row in a truth table for α in terms of β₁,...,βₖ. For each i = 1,...,k
let β′ᵢ be βᵢ or ¬βᵢ according as the “βᵢ” entry in this row is “⊤” or “⊥”.
Similarly, let α′ be α or ¬α according as the “α” entry in this row is “⊤”
or “⊥”. Show that {β′₁,...,β′ₖ} ⊢₀ α′. (Follow the prescription for constructing
truth tables given in §6 and use induction on deg α.)
12.3. Problem. Let α be a tautology and let β₁,...,βₖ be all the different
prime components of α. By induction on p = 0,1,...,k show that
{β′₁,...,β′ₖ₋ₚ} ⊢₀ α, where, for each i, β′ᵢ may be chosen as βᵢ or ¬βᵢ (and
the choice is made independently for different i). Hence (for p = k) ⊢₀ α.
12.4. Problem. Let Φ be finite and Φ ⊨₀ α. Using Probs. 6.12 and 12.3,
show that a deduction of α from Φ can be constructed.
12.5. Problem. Using Thm. 12.1, find an alternative solution to Prob. 11.5.

12.6. Remark. It follows from Thms. 10.2 and 12.1 that the relation ⊢₀
is invariant with respect to language. Let ℒ and ℒ′ be two different
first-order languages such that α as well as all formulas of Φ are both
ℒ-formulas and ℒ′-formulas. (ℒ and ℒ′ must both have all the symbols
occurring in Φ and α, but each may have different additional symbols.)
Suppose we have a deduction of α from Φ in ℒ. This deduction is not
necessarily an ℒ′-deduction, because it may contain ℒ-formulas that are
not ℒ′-formulas. Nevertheless, α is deducible from Φ in ℒ′ (possibly
via a different deduction). To see this we notice that the given deduction
of α from Φ in ℒ can only use a finite number of hypotheses. Thus α is
deducible in ℒ from a finite subset Φ₀ of Φ. By Thm. 10.2 it follows that
Φ₀ ⊨₀ α. But the relation of tautological consequence is invariant with
respect to language (see Prob. 6.15). Hence Φ₀ ⊨₀ α holds also in ℒ′ and
by the Weak Completeness Theorem 12.1 (applied to ℒ′) we see that α
is deducible from Φ₀ in ℒ′. The same result can also be obtained by
tableaux.
Similarly, the property of being a propositionally consistent set of formulas
is invariant with respect to language.
12.7. Remark. All proofs given so far in this chapter have been constructive:
whenever we have asserted the existence of a deduction (or a tableau,
etc.) we actually prescribed how to construct it in a finite number of steps.
Also, whenever we have proved a set Φ to be inconsistent, the proof actually
tells us how to find a formula β and deductions of β and ¬β from Φ.
In particular, both proofs of Thm. 12.1 were constructive.

*§ 13. Strong completeness of the propositional calculus

In this section we give a non-constructive proof of the full converse of


Thm. 10.2.
A set Φ of ℒ-formulas is maximal propositionally consistent in ℒ if
it is propositionally consistent but is not a proper subset of any propositionally
consistent set of ℒ-formulas.
As usual, we allow ourselves to omit the qualifications “propositionally”
and “in ℒ”. But it must be stressed that the property of being maximal
consistent is not invariant with respect to language.

13.1. Theorem. The set Φ is maximal consistent iff the following two
conditions both hold:
(a) Φ is consistent.
(b) For every formula α, α ∈ Φ or ¬α ∈ Φ.
Proof. Suppose Φ is maximal consistent. Then (a) holds by definition.
Also, if α ∉ Φ as well as ¬α ∉ Φ, then by definition Φ, α and Φ, ¬α must
both be inconsistent. Thus by Thm. 10.8 both Φ ⊢₀ ¬α and Φ ⊢₀ α,
contrary to (a).
Conversely, suppose that (a) and (b) hold. We show that if Φ is a proper
subset of Ψ then Ψ is inconsistent. Indeed, let α ∈ Ψ but α ∉ Φ. Then,
by (b), ¬α ∈ Φ; so α ∈ Ψ as well as ¬α ∈ Ψ, making Ψ inconsistent. ∎

13.2. Theorem. Let Φ be maximal consistent. Then α→β ∈ Φ iff ¬α ∈ Φ
or β ∈ Φ.
Proof. Suppose α→β ∈ Φ. If ¬α ∉ Φ and β ∉ Φ then Φ, ¬α and Φ, β
are inconsistent. It is then easy to verify (see Prob. 10.9) that Φ itself
is inconsistent, contrary to our assumption.
Conversely, let ¬α ∈ Φ or β ∈ Φ. Then Φ, α or Φ, ¬β must be
inconsistent. In either case, Φ, α, ¬β is inconsistent. If ¬(α→β) ∈ Φ,
then it is easy to verify (see Prob. 10.9) that Φ would be inconsistent.
Therefore ¬(α→β) ∉ Φ; so by Thm. 13.1 we must have α→β ∈ Φ. ∎

In the proof of the following theorem we apply Zorn’s Lemma, which


makes the proof non-constructive.

13.3. Theorem. Let Φ be consistent. Then there is a maximal consistent
Ψ such that Φ ⊆ Ψ.
Proof. Consider the family of all consistent sets of ℒ-formulas, partially
ordered by inclusion ⊆.
If {Φᵢ : i ∈ I} is an arbitrary totally ordered subfamily of that family,
then the union ⋃{Φᵢ : i ∈ I} is also consistent. To show this we make
use of Prob. 10.10. Every finite subset of that union is already included
in some Φᵢ, and is therefore consistent. So the whole union is consistent.
We apply Zorn’s Lemma to obtain the required result. ∎

We can now prove the (full) converse of Thm. 10.6:


13.4. Theorem. If Φ is a propositionally consistent set of formulas, there
is a truth valuation σ satisfying Φ.
Proof. By Thm. 13.3, we can assume that Φ ⊆ Ψ, where Ψ is maximal
consistent.
We determine a truth valuation σ by requiring that

(1) φ^σ = ⊤ iff φ ∈ Ψ,

for every prime formula φ. We shall now show by induction on deg φ
that (1) in fact holds for every formula φ.
If φ is not prime, then it is either a negation formula or an implication
formula.
Suppose φ = ¬α. Then

φ^σ = ⊤ iff α^σ = ⊥ (by Def. 6.1)
        iff α ∉ Ψ (by ind. hyp.)
        iff ¬α ∈ Ψ, (by Thm. 13.1)

i.e., φ ∈ Ψ.
Now suppose φ = α→β. Then

φ^σ = ⊤ iff α^σ = ⊥ or β^σ = ⊤ (by Def. 6.1)
        iff α ∉ Ψ or β ∈ Ψ (by ind. hyp.)
        iff ¬α ∈ Ψ or β ∈ Ψ (by Thm. 13.1)
        iff α→β ∈ Ψ, (by Thm. 13.2)

i.e., φ ∈ Ψ.

We have thus verified (1) for all formulas. Since Φ ⊆ Ψ it follows that
σ ⊨ Φ. ∎
We can now prove the (full) converse of Thm. 10.2:

13.5. Strong Completeness Theorem. If Φ ⊨₀ α then Φ ⊢₀ α.
Proof. If Φ ⊨₀ α then no truth valuation can satisfy Φ, ¬α (for, if σ ⊨ Φ
then α^σ = ⊤, so that (¬α)^σ = ⊥). Hence by Thm. 13.4 Φ, ¬α is inconsistent.
Therefore, by Thm. 10.8(a), Φ ⊢₀ α. ∎

*§ 14. Propositional logic based on ¬ and ∧

In some of the later chapters of this book (viz., those devoted to model
theory and axiomatic set theory) it will be convenient to assume that the
primitive connectives of ℒ are ¬ and ∧ rather than ¬ and →. In this
section we sketch how the treatment of propositional logic given above
can be modified and adapted to that setting.
First, α→β would have to be defined, e.g. as ¬(α∧¬β) (cf. Prob. 6.4).
Then, clause (2) of Def. 6.1 should be modified to read:
(β∧γ)^σ = ⊤ iff both β^σ = ⊤ and γ^σ = ⊤.
The rules for constructing truth tables should also be modified accordingly.
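
That the proposed definition of α→β as ¬(α∧¬β) agrees with the original truth condition for → can be confirmed by running through the four rows of the truth table. A minimal sketch (the function names are illustrative, not from the text):

```python
from itertools import product

def primitive_imp(a, b):
    """Truth function of -> when -> is a primitive connective."""
    return (not a) or b

def defined_imp(a, b):
    """Truth function of not(a and not b), with 'and' as the primitive connective."""
    return not (a and (not b))

rows = list(product([True, False], repeat=2))
agree = all(primitive_imp(a, b) == defined_imp(a, b) for a, b in rows)
print(agree)
```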

The treatment of propositional tableaux should be modified by adopting
the ∧-rule and ¬∧-rule (which in the treatment above were recognized
only in practice, not in theory) instead of the →-rule and ¬→-rule.
All the material presented in §§6-9 carries over, mutatis mutandis¹,
to the new setting.

In the propositional calculus, we retain the old axiom schemes and
modus ponens, but add also some new axiom schemes. A possible choice
might be the following three axiom schemes:

(Ax. IV) α∧β→α,

(Ax. V) α∧β→β,

(Ax. VI) α→β→α∧β.

These axioms may be used to solve the modified version of Prob. 10.9.
(“Show that if α∧β ∈ Φ and Φ, α, β is inconsistent, then Φ is inconsistent.
Also show that if ¬(α∧β) ∈ Φ and both Φ, ¬α and Φ, ¬β are inconsistent,
then Φ is inconsistent.”)
All the material of §§10-13 then goes through without any difficulty.

*§ 15. Propositional logic based on ¬, →, ∧ and ∨

For some purposes, e.g., for comparing classical logic with intuitionistic
logic (see Ch. 9), it is desirable to regard all four connectives ¬, →, ∧
and ∨ as primitive symbols of the language rather than define ∧ and
∨ in terms of ¬ and →. In this section we sketch the modifications that
have to be made to adapt our treatment to that setting. The details are
left to the reader.
First, the prescription for constructing formulas (given just after
Lemma 4.1) has to include two additional rules: if α and β are formulas,
then so are ∧αβ and ∨αβ.
Clauses (d) and (e) in Def. 5.1 have to be amended to define (α∧β)
and (α∨β) as ∧αβ and ∨αβ respectively.
In Def. 6.1 we need two additional clauses dealing with formulas of
the forms β∧γ and β∨γ in the obvious way. Also, two corresponding
rules have to be added to the prescription for constructing truth tables.

1 For example, Lemma 8.3 should be modified to assert that if we are given a confutation
of Φ, α∧β, then we can confute Φ, α, β.


Propositional tableaux require four additional rules (∧, ¬∧, ∨
and ¬∨).
The propositional calculus may be based on nine axiom schemes: the
three schemes of §10, plus the three schemes of §14, plus the following
three schemes:

(Ax. VII) α→α∨β,

(Ax. VIII) β→α∨β,

(Ax. IX) (α→γ)→(β→γ)→α∨β→γ.

With these modifications, all the material of §§6-13, mutatis mutandis,
goes through without difficulty.

§16. Historical and bibliographical remarks

For a detailed treatment of the topics touched upon in §§1 and 2, the
reader is referred to the Introduction of Church [1956]. Also, §29 of
Ch. II in Church’s book contains a survey of the earlier literature on
propositional logic. (In our present Remarks we only deal with topics
not covered by Church’s survey.)
The tableau method (for propositional and first-order logic) was devised
independently by Beth [1955], Hintikka [1955] and Schütte [1956].
It was developed and streamlined by Smullyan [1968], to whom our
treatment is heavily indebted. Essentially, the tableau method is dual to
the method of Gentzen [1934]. Gentzen’s method is essentially a systematic
search for proofs in tree form, while the tableau method is a systematic
search for refutations in upside-down tree form. A node in a Gentzen proof
tree roughly corresponds to a branch in a tableau. The Elimination
Theorem for tableaux corresponds (and is in fact equivalent) to Gentzen’s
celebrated Hauptsatz.
The method used in §13 for proving strong completeness is due to
Henkin [1949]. Thm. 13.3 is due to Adolf Lindenbaum (see Tarski [1930]).
CHAPTER 2

FIRST-ORDER LOGIC

Whereas in propositional logic the only symbols of the object language
to play an essential role were the connectives, we shall now bring into
play the full apparatus of our first-order language ℒ. Much of the present
chapter will be devoted to extending the ideas developed in §§6-9 of Ch. 1
to this more elaborate setting. In addition, we shall develop a new order
of ideas, the need for which arises from the specific role played by the
variables; this will be done in §§2-3. The final sections of this chapter
(§§9-13) deal with some special topics. These will not be required immediately
and the reader who is in a hurry to get on to Ch. 3 may put off
reading them.

§ 1. First-order semantics

In order to interpret ℒ-terms and ℒ-formulas it is necessary to fix an
ℒ-structure (or ℒ-interpretation) 𝔘 consisting of the following ingredients:
(1) A non-empty class U, called the universe of discourse (or, briefly,
universe, or domain) of 𝔘. The members of U are called individuals (of 𝔘).
(2) A mapping which assigns to each function symbol f of ℒ an
operation f^𝔘 on U such that if f is an n-ary function symbol, f^𝔘 is an n-ary
operation on U. In particular, if a is a constant, then a^𝔘 is an individual.
An individual of this kind (i.e., a^𝔘 for some constant a of ℒ) is said
to be designated.
(3) A mapping which assigns to each extralogical predicate symbol
P of ℒ a relation P^𝔘 on U such that if P is an n-ary predicate symbol,
P^𝔘 is an n-ary relation on U. In particular, if P is unary, then P^𝔘 is a
subclass of U.
We make the following notational conventions. If an upper-case German
letter denotes a given structure, then the corresponding italic letter denotes
the universe of that structure. Individuals will be denoted by lower-case
italic letters (especially “a”, “h”, “c”, “v”). In contexts where only
one particular structure 𝔘 is considered, the operation (or relation)
corresponding to a given function symbol (or predicate symbol, respectively)
will sometimes be denoted by the corresponding italic letter. Thus we
may write “f” and “P” instead of “f^𝔘” and “P^𝔘”.
An ℒ-interpretation 𝔘 only suffices to assign a value (which is an
individual of 𝔘) to an ℒ-term in which variables do not occur. Also, it only
suffices to assign truth values to some ℒ-formulas (viz. those that are
to be called ℒ-sentences). But, as explained in §2 of Ch. 1, the value of
a term and the truth value of a formula will in general depend not only
on the interpretation but also on values assigned to some ℒ-variables.
Thus, to evaluate a given ℒ-expression we need an interpretation 𝔘 plus
a particular assignment of values in U to some ℒ-variables. Here there
is a slight technical problem: different ℒ-expressions may involve different
variables, so that we would have to consider, say, an assignment of values
to one set of variables in connection with one formula α, an assignment
to a different set of variables in connection with another formula β, and
yet another assignment in connection with α→β. This is feasible, but
technically rather awkward. Instead, we prefer to work with assignments
that assign a value in U to all variables at once. And we shall arrange
matters so that in evaluating any given expression the values assigned
to variables which the expression does not involve will not in fact make
any difference.
We therefore define an ℒ-valuation σ to be an ℒ-structure 𝔘 together
with an assignment of a value x^σ ∈ U to each variable x.
The reader must note that an ℒ-valuation is not the same thing as
a truth valuation on ℒ, although (as will transpire shortly) each valuation
gives rise in a natural way to a truth valuation.
By definition, each valuation σ involves a particular structure 𝔘. We
refer to 𝔘 as the structure underlying σ. If 𝔘 is the structure underlying σ,
we define f^σ and P^σ to be the operation f^𝔘 and the relation P^𝔘, respectively,
where f is any function symbol and P is any extralogical predicate symbol.
The universe U of 𝔘 will also be called the universe of σ.
We shall say that two valuations σ and τ agree on a given variable x
(or function symbol f, or extralogical predicate symbol P) if σ and τ have
the same universe and x^σ = x^τ (or f^σ = f^τ, or P^σ = P^τ, respectively).
Let σ be a valuation with universe U and let u ∈ U. We define σ(x/u)
to be the valuation which agrees with σ on every variable other than
x as well as on every extralogical symbol, while x^{σ(x/u)} = u. Thus, in
particular, the structure underlying σ(x/u) is the same as that underlying σ.
Given an ℒ-valuation σ with universe U, we now define, for each
ℒ-term t, the value of t under σ (briefly, t^σ) in such a way that t^σ ∈ U.
Also for each ℒ-formula α we define the value of α under σ (briefly, α^σ)
so that α^σ is either ⊤ or ⊥. This is done by recursion on deg t and deg α
respectively, as follows:
1.1. Basic Semantic Definition.
(T1). If x is a variable, then x^σ is already defined.
(T2). If f is an n-ary function symbol of ℒ and t₁,...,tₙ are ℒ-terms, then

(ft₁...tₙ)^σ = f^σ(t₁^σ,...,tₙ^σ).

(F1). If P is an n-ary extralogical predicate symbol of ℒ and t₁,...,tₙ
are ℒ-terms, then

(Pt₁...tₙ)^σ = ⊤ if ⟨t₁^σ,...,tₙ^σ⟩ ∈ P^σ,
             = ⊥ otherwise.

(F1=). If s and t are ℒ-terms and ℒ is a language with equality, then

(s=t)^σ = ⊤ if s^σ = t^σ,
        = ⊥ otherwise.

(F2). For every ℒ-formula β,

(¬β)^σ = ⊤ if β^σ = ⊥,
       = ⊥ otherwise.

(F3). For every ℒ-formula β and ℒ-formula γ,

(β→γ)^σ = ⊤ if β^σ = ⊥ or γ^σ = ⊤,
        = ⊥ otherwise.

(F4). For every ℒ-formula β and variable x,

(∀xβ)^σ = ⊤ if β^{σ(x/u)} = ⊤ for every u ∈ U,
        = ⊥ otherwise,

where U is the universe of σ.

The above definition will be referred to as “BSD”. It must be stressed
that what the BSD defines is not the valuation σ itself (which must be
given in advance, by specifying a structure 𝔘 and an assignment of value
x^σ ∈ U to each variable x) but two mappings induced by σ.
The first of these, defined in (T1) and (T2), maps the set of all terms
into U. It is well defined because in (T2) its value for a term ft₁...tₙ is
reduced to its values for the terms t₁,...,tₙ whose degrees of complexity
are smaller.
The second mapping induced by σ is defined in (F1)-(F4). It maps
the set of all formulas into {⊤, ⊥}. In (F1) and (F1=) its values for
atomic formulas are given explicitly in terms of the first induced mapping.
In (F2)-(F4) its values for non-atomic formulas are reduced to the values
assigned by it (or, in the case of universal formulas, covered in (F4),
by similar mappings induced by other valuations) to formulas with smaller
degrees of complexity. Because of clauses (F2) and (F3), this second
induced mapping is clearly a truth valuation (see Def. 1.6.1).
There is no need to be overawed by the BSD. It merely spells out with
precision what is pretty obvious once the intended meaning of the logical
symbols has been understood, and the extralogical symbols have been
provided with references by a valuation (which is merely a kind of dictionary).
The BSD just puts on record the way in which ℒ-expressions are
supposed to be understood.
In particular, clause (F1) is hardly more startling or profound than,
e.g., the fact that the Latin sentence “Catullus uxorem Celeris amavit”
is true iff Catullus did really love Celer’s wife. This fact can easily be
verified by consulting any standard Latin dictionary and grammar.
It obviously has little epistemological value. The question whether that
Latin sentence is actually true is of course much more exciting — but
belongs to historical gossip-mongering, not to logic or philosophy.
The first explicit formulation of (a version of) the BSD was given in
1933 by Tarski.1 For this reason it is often referred to as Tarski's truth
definition or Tarski's definition of satisfaction.
We conclude this discussion of the BSD with:

1.2. Remark. Because of clause (F4), the BSD is strongly non-constructive:
if U is infinite, (F4) does not provide us with a method for computing
the value (∀xβ)^σ in a finite number of steps, for it presupposes the values
β^{σ(x/u)} for infinitely many u. This non-constructive character is inherited
by all the semantic definitions given below, which are based on the BSD.
Indeed, one of our main tasks will be to obtain a more constructive
characterization of the concepts thus defined.
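
When the universe is finite, by contrast, the clauses of the BSD can be followed literally to compute values. The following sketch, which is not part of the original text, does this under several assumptions: a structure is a Python dict with keys "universe", "functions" and "relations"; terms and formulas are nested tuples as described in the comments; an assignment is a dict giving values only to the variables that matter; and the strings "not", "imp", "all" and "=" are reserved and may not be used as predicate symbols. All names are illustrative.

```python
# Terms: a variable is a string; ("f", t1, ..., tn) applies the function symbol f.
# Formulas: ("P", t1, ..., tn), ("=", s, t), ("not", A), ("imp", A, B), ("all", x, A).

def term_value(t, structure, assignment):
    if isinstance(t, str):                                   # (T1)
        return assignment[t]
    f, *args = t                                             # (T2)
    return structure["functions"][f](
        *(term_value(s, structure, assignment) for s in args))

def holds(phi, structure, assignment):
    op = phi[0]
    if op == "not":                                          # (F2)
        return not holds(phi[1], structure, assignment)
    if op == "imp":                                          # (F3)
        return (not holds(phi[1], structure, assignment)
                or holds(phi[2], structure, assignment))
    if op == "all":                                          # (F4): try every u in U
        _, x, body = phi
        return all(holds(body, structure, {**assignment, x: u})
                   for u in structure["universe"])
    values = tuple(term_value(t, structure, assignment) for t in phi[1:])
    if op == "=":                                            # (F1=)
        return values[0] == values[1]
    return values in structure["relations"][op]              # (F1)

# A three-element structure with a cyclic successor "s" and the usual order "lt".
Z3 = {
    "universe": {0, 1, 2},
    "functions": {"s": lambda n: (n + 1) % 3},
    "relations": {"lt": {(0, 1), (0, 2), (1, 2)}},
}
irreflexive = ("all", "x", ("not", ("lt", "x", "x")))
print(holds(irreflexive, Z3, {}))
```

The dict `{**assignment, x: u}` plays the role of σ(x/u): it agrees with the given assignment everywhere except possibly at x.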

1 Cf. §14 below.



1.3. Problem. Using Def. 1.5.1(g) show that (∃xα)^σ = ⊤ iff α^{σ(x/u)} = ⊤
for some u ∈ U, where U is the universe of the valuation σ.

1.4. Problem. Show that ∃ could have been taken as primitive (i.e., as
a symbol of ℒ) instead of ∀. (Replace BSD (F4) by the statement of
Prob. 1.3 and replace Def. 1.5.1(g) by a definition of ∀ from which the
original (F4) can be derived.)

We have noted above that, for every valuation σ, the induced mapping
defined in clauses (F1)-(F4) of the BSD is a truth valuation. We shall
say that σ satisfies a formula φ (or a set Φ of formulas), briefly σ ⊨ φ
(or σ ⊨ Φ, respectively), if the truth valuation induced by σ satisfies
φ (or Φ, respectively). Thus σ ⊨ φ iff φ^σ = ⊤; and σ ⊨ Φ iff φ^σ = ⊤ for
every φ ∈ Φ.
It must be noted, however, that not every truth valuation is induced
by some valuation. For example, if α is any formula then the prime
formula ∀x(α→α) is satisfied by every valuation but not by every truth
valuation. Thus a truth valuation not satisfying this prime formula cannot
be induced by any valuation.

1.5. Definition. If every valuation satisfying a set Φ of formulas also
satisfies a formula α, we say that α is a logical consequence of Φ (or α
follows logically from Φ, or Φ logically entails α). We write this briefly
as “Φ ⊨ α”. As usual, we shall write “φ ⊨ α” instead of “{φ} ⊨ α” and
say that α is a logical consequence of φ. If α is satisfied by every valuation
(i.e., if α follows logically from the empty set of formulas), then we say
that α is logically true (or logically valid) and we write “⊨ α”. If α ⊨ β
as well as β ⊨ α (i.e., α^σ = β^σ for every valuation σ), we say that α and β
are logically equivalent. We say that a formula φ (or a set Φ of formulas)
is satisfiable if σ ⊨ φ (or σ ⊨ Φ, respectively) for some valuation σ.

1.6. Theorem. If Φ ⊨₀ α then Φ ⊨ α. In particular, if ⊨₀ α then ⊨ α; and
if α and β are tautologically equivalent then they are logically equivalent.

The converse of Thm. 1.6 is false. For example, ∀x(α→α) is logically
true but not tautologically true; and it is logically (but not tautologically)
equivalent to ∀x(α↔α).

1.7. Problem. Show that Φ, α ⊨ β iff Φ ⊨ α→β. Hence show that
{φ_1,...,φ_n} ⊨ β iff ⊨ φ_1→...→φ_n→β. Also show that α and β are logically
equivalent iff ⊨ α↔β.

1.8. Problem. Let α be a subformula of β, and let α and α' be logically
equivalent. Let β' be obtained from β by replacing zero or more occurrences
of α by occurrences of α' (i.e., β = T_0 α T_1 α T_2 ... α T_k, where T_0,...,T_k are
strings, and β' = T_0 α_1 T_1 α_2 T_2 ... α_k T_k, where, for i=1,...,k, α_i is α or α').
Show by induction on deg β that β and β' are logically equivalent.

§ 2. Freedom and bondage

The value of a given expression (i.e., term or formula) under a valuation
σ does not in fact depend on the whole of σ. For terms the situation is
very simple, as we now go on to show.

2.1. Theorem. Let t be a term, and let σ and τ be valuations which agree
on all variables and function symbols occurring in t. Then t^σ = t^τ.
Proof. By straightforward induction on deg t. If t = x then, since x
is a variable occurring in t, we must have x^σ = x^τ, i.e., t^σ = t^τ.
If t = ft_1...t_n, where f is an n-ary function symbol and t_1,...,t_n are terms,
then by assumption f^σ and f^τ are the same. Also, since every variable
or function symbol occurring in one of the arguments t_1,...,t_n occurs
also in t, we have by the induction hypothesis

    t_1^σ = t_1^τ, ..., t_n^σ = t_n^τ.

Thus

    t^σ = (ft_1...t_n)^σ
        = f^σ(t_1^σ,...,t_n^σ)    (by BSD (T2))
        = f^τ(t_1^τ,...,t_n^τ)    (by ind. hyp.)
        = (ft_1...t_n)^τ          (by BSD (T2))
        = t^τ. ∎
We say that a term t is closed if it does not contain any variable. By
Thm. 2.1 it follows that in this case t^σ depends only on the structure 𝔘
underlying σ (and not on the values assigned by σ to the variables). If
t is a closed term we define t^𝔘 to be the value of t under some (and hence
every) valuation σ whose underlying structure is 𝔘.
For formulas we can prove a result analogous to Thm. 2.1, but in fact
we shall need a stronger result. For example, though the variable x occurs
in ∀xβ, we should not expect (∀xβ)^σ to depend on x^σ. This may be
understood from the informal explanation given in §2 of Ch. 1. More
precisely, it can be seen from clause (F4) of the BSD. For this clause
defines (∀xβ)^σ in terms of the values β^{σ(x/u)} for all u in the universe U
of σ; so that here x^σ makes no difference.

This leads us to a distinction between two kinds of occurrence of a
variable x in a formula α: a given occurrence of x in α is bound if this
occurrence is within a subformula of α having the form ∀xβ (i.e., a universal
subformula of α which has x as variable of quantification); all other
occurrences of x in α are free. To be quite precise, we define these concepts
by recursion on deg α:

2.2. Definition. A given occurrence of a variable x in a formula α is
free in α iff it is not bound in α. Moreover:
(1) If α is atomic, then every occurrence of x in α is free in α.
(2) If α = ¬β, then a given occurrence of x in α is free in α iff the same
occurrence is free in β.
(3) If α = β→γ, then a given occurrence of x in α is free in α iff that
occurrence is a free occurrence of x in β or in γ.
(4) If α = ∀xβ, then every occurrence of x in α is bound in α, but if
α = ∀yβ, where y is a variable other than x, then a given occurrence of
x in α is free in α iff that occurrence is free in β.

Note that the same variable may have both free and bound occurrences
in the same formula. For example, in the formula

    ∀x[x=y ∧ ∃y(y=x)] → ∃z(x≠z),

the first three occurrences of x are bound and the last is free; the first
occurrence of y is free and the other two are bound; and both occurrences
of z are bound. (Here we are assuming that x, y and z are distinct variables.)
We say that x is free in α if x has at least one free occurrence in α.
Note that x may also have bound occurrences in α; thus both x and y
(but not z) are free in the formula of the above example.
The free variables of α are the variables which are free in α.
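
The recursion of Def. 2.2 can be sketched in Python as a computation of the set of free variables. The tuple encoding and the helper names (term_vars, free_vars, ex, conj, eq) are assumptions of this sketch; in particular, ∃ and ∧ are unfolded into ¬, → and ∀ in the usual way, which is assumed here to match the abbreviations of Def. 1.5.1.

```python
# Hypothetical encoding (an assumption of this sketch):
#   terms:    ("var", name) | ("fn", symbol, (args...))
#   formulas: ("atom", pred, (terms...)) | ("not", a) | ("imp", a, b) | ("all", x, a)

def term_vars(t):
    """Set of variables occurring in the term t."""
    if t[0] == "var":
        return {t[1]}
    return set().union(*(term_vars(a) for a in t[2]))

def free_vars(a):
    """Set of variables with at least one free occurrence in a (cf. Def. 2.2)."""
    kind = a[0]
    if kind == "atom":
        return set().union(*(term_vars(s) for s in a[2]))
    if kind == "not":
        return free_vars(a[1])
    if kind == "imp":
        return free_vars(a[1]) | free_vars(a[2])
    if kind == "all":
        return free_vars(a[2]) - {a[1]}   # the quantifier binds its variable
    raise ValueError(f"unknown formula: {a!r}")

# Defined symbols, unfolded into the primitives (assumed abbreviations).
def ex(x, a):
    return ("not", ("all", x, ("not", a)))

def conj(a, b):
    return ("not", ("imp", a, ("not", b)))

def eq(s, t):
    return ("atom", "=", (s, t))

x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
# The example formula: all x [x=y and exists y (y=x)] -> exists z (x != z)
example = ("imp",
           ("all", "x", conj(eq(x, y), ex("y", eq(y, x)))),
           ex("z", ("not", eq(x, z))))
free = free_vars(example)
```

On the example formula the result contains x and y but not z, in agreement with the discussion above.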

2.3. Theorem. Let σ and τ be valuations which have the same universe
U and which agree on every free variable of α as well as on every extra-logical
symbol occurring in α. Then α^σ = α^τ.
Proof. By induction on deg α. We deal here only with the case where
α is universal, leaving the other (and easier) cases to the reader.
Let α = ∀xβ. Then, by the BSD, α^σ = ⊤ iff β^{σ(x/u)} = ⊤ for every u in the
universe U of σ. Now, the extralogical symbols of β are exactly those
of α. Also, the free variables of β are either exactly those of α, or they
are those plus x. But (for every u∈U) σ(x/u) and τ(x/u) clearly agree
not only on the free variables and extralogical symbols of α, but also on x.
Since deg β < deg α, it follows from the induction hypothesis that
β^{σ(x/u)} = β^{τ(x/u)}. Thus α^σ = ⊤ iff β^{τ(x/u)} = ⊤ for all u∈U, i.e., iff α^τ = ⊤. ∎

2.4. Problem. Show that if x is not free in α, then α, ∀xα and ∃xα are
logically equivalent.

A formula which has no free variables (so that all occurrences of variables
in it, if any, are bound) is called a sentence. It follows from Thm. 2.3
that if α is a sentence then the value α^σ depends only on the structure 𝔘
underlying σ. In this case we define α^𝔘 to be that value (i.e., α^𝔘 = α^σ for
any valuation σ which 𝔘 underlies).
If α^𝔘 = ⊤, we say that the structure 𝔘 satisfies the sentence α (or α holds
in 𝔘, or 𝔘 is a model for α), briefly, 𝔘 ⊨ α. If 𝔘 ⊨ φ for every φ in a set
Φ of sentences, we say that 𝔘 is a model for Φ.

More generally, let α be a formula such that all the free variables of
α are among the first k variables of ℒ, namely v_1,...,v_k. Then, by Thm. 2.3,
α^σ depends only on the structure 𝔘 underlying σ and on v_i^σ for i=1,...,k.
We write

    𝔘 ⊨ α [u_1,...,u_k]

when we wish to assert that σ ⊨ α for some (hence for every) valuation
σ such that 𝔘 underlies σ and such that v_i^σ = u_i for i=1,...,k.

2.5. Problem. Construct a sentence α containing only logical symbols
(i.e., no function symbol and no predicate symbol other than =) such
that α holds in a structure 𝔘 iff 𝔘 has
(a) at least three members,
(b) at most three members,
(c) exactly three members.

2.6. Problem. Using just one binary predicate symbol (but no other
predicate symbols and no function symbols) construct a sentence α such
that α has no finite model (i.e., no model with finite universe); but if U
is any infinite set then α has a model whose universe is U.

2.7. Remark. From Thm. 2.3 it follows that the various semantic concepts
defined in Def. 1.5 are invariant with respect to language. For, if ℒ and
ℒ' are two first-order languages and σ is an ℒ-valuation then there is
an ℒ'-valuation σ' which agrees with σ on the symbols which ℒ and ℒ'
have in common. Any formula α belonging to both ℒ and ℒ' will then
get the same value under σ and σ'. Thus, e.g., if α is satisfiable as an
ℒ-formula (i.e., satisfied by some ℒ-valuation) it is also satisfiable as
an ℒ'-formula.

§ 3. Substitution

Let s and t be terms. We define s(x/t) as the term obtained from s when
an occurrence of t is substituted for each occurrence of x in s. In detail,
s(x/t) is defined by recursion on deg s as follows:

3.1. Definition. If s = x then s(x/t) = t; but if s = y, where y is a variable
other than x, then s(x/t) = y. If s = fs_1...s_n, where f is an n-ary function
symbol and s_1,...,s_n are terms, then s(x/t) = fs_1(x/t)...s_n(x/t).

We would now like to investigate the semantic behaviour of s(x/t).
To this end, let us fix a valuation σ. Then the value of s is s^σ. Now let
us assign to x various values u in the universe U of σ, while the rest of
σ is held fixed. We get various valuations σ(x/u) and the corresponding
values of s will be s^{σ(x/u)}. We have thus got a function 𝑓 (mapping U into
itself) defined by

(1) 𝑓(u) = s^{σ(x/u)},  u∈U.

In particular, since σ(x/x^σ) is σ itself, we have

    s^σ = 𝑓(x^σ).

Now, in s(x/t), t has taken the place that x had in s. We would therefore
expect that s(x/t)^σ will depend on t^σ in the same way as s^σ depends on x^σ;
that is, we conjecture

(2) s(x/t)^σ = 𝑓(t^σ).

Combining (2) with the identity (1) which defines 𝑓 we can put our conjecture
in the form

    s(x/t)^σ = s^{σ(x/𝑡)},

where 𝑡 = t^σ. In fact, this conjecture is correct.

3.2. Theorem. If s and t are terms, x a variable and σ a valuation, then

    s(x/t)^σ = s^{σ(x/𝑡)},

where 𝑡 = t^σ.
Proof. By induction on deg s. Each of the three cases in Def. 3.1 has
to be treated separately. The details are routine and are left to the reader. ∎

We now want to do the same kind of thing with formulas. Thus, for
given formula α and valuation σ, we define a mapping 𝑓 of U into {⊤,⊥} by

(1') 𝑓(u) = α^{σ(x/u)},  u∈U.

And we should like to define α(x/t) in such a way that

(2') α(x/t)^σ = 𝑓(t^σ).

But here there are two snags. First, it follows from Thm. 2.3 (and, indeed,
directly from clause (F4) of the BSD) that 𝑓(u) defined in (1') depends
on u only through the free occurrences of x in α. Thus, in order to have
(2') we must define α(x/t) in such a way that t is only substituted for the
free occurrences of x in α, not for the bound occurrences.¹
But even if we take this precaution we come across another snag. For
example, let α be ∀y(x=y), where x and y are different variables. Using
clauses (F4) and (F1^=) of the BSD it is easy to verify that in this case

    α^{σ(x/u)} = ⊤ iff u = v for all v∈U,

where U is the universe of σ. (More intuitively, α says: “the value of
x is equal to every individual”.) Thus α^{σ(x/u)} = ⊤ iff U has exactly one
member, and (1') becomes

    𝑓(u) = ⊤ if U has exactly one member,
           ⊥ otherwise.

On the other hand, if we take t to be y and substitute it for the occurrence
of x in α (which is free!) we get ∀y(y=y). This sentence is logically true:
we have

    [∀y(y=y)]^σ = ⊤

for every σ. If U has more than one member, we therefore find

    [∀y(y=y)]^σ ≠ 𝑓(y^σ).

So we cannot take ∀y(y=y) to be α(x/y), if we want (2') to hold.
More intuitively, we want α(x/t) to say about the value of t what α says
about the value of x. But in our example ∀y(x=y) says that the value
of x is equal to every individual, whereas ∀y(y=y) says nothing about
the value of y but merely states that every individual is equal to itself.
What went wrong? Clearly the snag was that when we substituted
an occurrence of y for the free occurrence of x in ∀y(x=y), this new
occurrence of y was captured: it fell within the scope of a y-quantifier.

¹ Besides, if t is substituted for all occurrences of x in α, then the result is not always
a formula. Thus, e.g., from ∀x(x=y) we would get ∀t(t=y), which is not a formula
unless t happens to be a variable.

What are we to do? Let us take a clue from manipulations used in the
integral calculus. The value of, say, the integral
    ∫_0^1 f(x,y) dy

depends on the value of x but not on the value of y. (A variable of integration
behaves analogously to a variable of quantification!) If we want
to substitute for x in this integral an expression containing y, then we first
have to change the variable of integration getting, say,
    ∫_0^1 f(x,z) dz

where z does not occur in the expression we want to substitute. This
integral has exactly the same value as the one we had before, but now we
can perform our substitution quite safely.

Similarly, instead of substituting y for x in ∀y(x=y) we must, it seems,
first change ∀y(x=y) into ∀z(x=z) (it is easy to see that these two
formulas are logically equivalent) and only then substitute y for x.
We now get ∀z(y=z), a formula which says that the value of y is equal
to every individual: just what we wanted.
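
The capture phenomenon can be observed concretely in a two-element universe. The Python sketch below is an illustration under stated assumptions: formulas are encoded as tuples, the only atomic formulas are equations between variables, and the helper names holds and naive_subst are hypothetical. The function naive_subst deliberately ignores capture.

```python
# Hypothetical encoding: ("eq", v, w) with v, w variable names,
# ("not", a), ("imp", a, b), ("all", v, a).
def holds(a, universe, assign):
    """Truth value of formula a under the given assignment of variables."""
    kind = a[0]
    if kind == "eq":
        return assign[a[1]] == assign[a[2]]
    if kind == "not":
        return not holds(a[1], universe, assign)
    if kind == "imp":
        return (not holds(a[1], universe, assign)) or holds(a[2], universe, assign)
    if kind == "all":
        return all(holds(a[2], universe, {**assign, a[1]: u}) for u in universe)
    raise ValueError(f"unknown formula: {a!r}")

def naive_subst(a, x, y):
    """Put variable y for the free occurrences of x, with no check for capture."""
    kind = a[0]
    if kind == "eq":
        return ("eq",) + tuple(y if v == x else v for v in a[1:])
    if kind == "not":
        return ("not", naive_subst(a[1], x, y))
    if kind == "imp":
        return ("imp", naive_subst(a[1], x, y), naive_subst(a[2], x, y))
    if kind == "all":
        if a[1] == x:            # x is bound here: nothing to replace
            return a
        return ("all", a[1], naive_subst(a[2], x, y))
    raise ValueError(f"unknown formula: {a!r}")

U = [0, 1]
sigma = {"x": 0, "y": 1}
alpha = ("all", "y", ("eq", "x", "y"))               # all y (x = y)

# The value we want alpha(x/y) to have: alpha evaluated with x set to y's value.
target = holds(alpha, U, {**sigma, "x": sigma["y"]})

captured = naive_subst(alpha, "x", "y")               # all y (y = y)
renamed = naive_subst(("all", "z", ("eq", "x", "z")), "x", "y")   # all z (y = z)

captured_value = holds(captured, U, sigma)
renamed_value = holds(renamed, U, sigma)
```

With two individuals the target value is false; the captured formula is true, so it has the wrong value, while the formula obtained after first renaming the bound variable agrees with the target.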
We shall first define α(x/t) only in those cases where the substitution
of t for x in α does not lead to “capture” and thus does not require any
change of the variable of quantification. Later we shall also define α(x/t)
in the remaining cases, by prescribing the changes that must be made in
α before the substitution may take place.
We shall say that t is free to be substituted for x in α (briefly, free for
x in α) if no free occurrence of x in α is within a subformula of α having
the form ∀yβ, where y occurs in t.
If t is free for x in α, we shall define α(x/t) as the result of substituting
an occurrence of t for each free occurrence of x in α. (Note that because
t is assumed to be free for x in α, all occurrences of variables that have
been introduced via the substitution are free in α(x/t).)
We now spell out precisely, by recursion on deg α, the conditions under
which t is free for x in α; and simultaneously we define α(x/t) under these
conditions.

3.3. Definition. If α is an atomic formula Ps_1...s_n, then t is free for x
in α. And α(x/t) is defined as Ps_1(x/t)...s_n(x/t). (Here, for n = 2, P may
also be the logical predicate symbol =.)
If α = ¬β, then t is free for x in α iff t is free for x in β; if this is the case,
α(x/t) is defined to be ¬[β(x/t)].
If α = β→γ, then t is free for x in α iff t is free for x in both β and γ;
if this is the case we define α(x/t) as β(x/t)→γ(x/t).
If α = ∀yβ, then t is free for x in α iff one of the following conditions
holds:
(a) x is not free in α,
(b) x is free in α (hence, in particular, x≠y), and t is free for x in β,
and y does not occur in t.
In case (a) we define α(x/t) to be α. In case (b) we define α(x/t) to be
∀y[β(x/t)].

It is easy to verify that if no variable occurring in t has a bound occurrence
in α, then t is free for x in α. Also, x is always free for itself in α, and
α(x/x) = α.

3.4. Theorem. If t is free for x in α then, for every valuation σ,

    α(x/t)^σ = α^{σ(x/𝑡)},

where 𝑡 = t^σ.
Proof. By induction on deg α. We distinguish various cases, corresponding
to the cases in Def. 3.3. Here we only deal with the case α = ∀yβ,
leaving the other (easier) cases to the reader.
First suppose that x is not free in α. Then α(x/t) = α. Also, by Thm. 2.3,
α^σ = α^{σ(x/𝑡)}. Thus

    α(x/t)^σ = α^σ = α^{σ(x/𝑡)}.

Now suppose that x is free in α and t is free for x in β and y does not
occur in t. Then we have

(1) α(x/t)^σ = (∀y[β(x/t)])^σ.

By the BSD,

(2) (∀y[β(x/t)])^σ = ⊤ iff β(x/t)^{σ(y/u)} = ⊤ for all u∈U,

where U is the universe of σ. Since deg β < deg α, the induction hypothesis
yields

(3) β(x/t)^{σ(y/u)} = β^{σ(y/u)(x/𝑡')},

where 𝑡' = t^{σ(y/u)}. But y does not occur in t. Hence by Thm. 2.1

    𝑡' = t^{σ(y/u)} = t^σ = 𝑡.

Also, x and y are different (otherwise x could not be free in α); hence

    σ(y/u)(x/𝑡) = σ(x/𝑡)(y/u).

For, it makes no difference whether we first change the value of x from
x^σ to 𝑡 and then change the value of y to u, or vice versa. (It would make
a difference if x were the same as y!) Hence we can rewrite (3) as

(4) β(x/t)^{σ(y/u)} = β^{σ(x/𝑡)(y/u)}.

Now, by the BSD,

    β^{σ(x/𝑡)(y/u)} = ⊤ for all u∈U iff [∀yβ]^{σ(x/𝑡)} = ⊤.

Combining this with (1), (2) and (4) we get the required result. ∎

3.5. Definition. If z is a variable which is not free in β but is free for
x in β, we say that ∀z[β(x/z)] arises from ∀xβ by (correct) alphabetic
change. (Note that if z does not occur at all in β, then z certainly satisfies
both of the above conditions.)

An alphabetic change is analogous to a change in the variable of integration.
We note that this operation is reversible. For, if z is not free
in β but is free for x in β, then the free occurrences of z in β(x/z) are
precisely those that arose from the free occurrences of x in β in the course
of the substitution. Thus no free occurrence of z in β(x/z) can fall within
the scope of an x-quantifier; i.e., x is free for z in β(x/z). Also, x cannot
occur free in β(x/z), because if it did this could only mean that x and z
are the same variable, hence β(x/z) = β; but z was assumed not to be
free in β. It follows that ∀xβ can be retrieved from ∀z[β(x/z)] by alphabetic
change.

3.6. Theorem. If ∀z[β(x/z)] arises from ∀xβ by alphabetic change, then
these two formulas are logically equivalent.
Proof. Take any valuation σ and let U be its universe. Then by the BSD

(1) (∀z[β(x/z)])^σ = ⊤ iff β(x/z)^{σ(z/u)} = ⊤ for all u∈U.

By Thm. 3.4 we have

(2) β(x/z)^{σ(z/u)} = β^{σ(z/u)(x/𝑧')},

where 𝑧' = z^{σ(z/u)}. But z^{σ(z/u)} = u by definition of σ(z/u). Using this, as
well as the fact that z is not free in β, we obtain

(3) β^{σ(z/u)(x/𝑧')} = β^{σ(x/u)}.

By the BSD,

    β^{σ(x/u)} = ⊤ for all u∈U iff (∀xβ)^σ = ⊤.

Combining this with (1), (2) and (3) we get the required result. ∎

Consider a given formula α. Suppose α has a universal subformula,
say ∀yβ. Let us replace one occurrence of ∀yβ in α by an occurrence
of a formula ∀z[β(y/z)] arising from ∀yβ by alphabetic change (i.e.,
z is not free in β, but is free for y in β). We shall say that α' is a variant
of α (briefly, α~α') if α can be transformed into α' by a finite number of
applications of steps like the one just described. (We include the case
where the number of such steps is 0, so that α~α.)
More precisely, we can define being a variant of α by recursion on deg α,
as follows:

3.7. Definition. If α is atomic, then α is its own sole variant.
If α = ¬β, then the variants of α are all formulas of the form ¬(β'),
where β' is a variant of β.
If α = β→γ, then the variants of α are all formulas of the form β'→γ',
where β' and γ' are variants of β and γ respectively.
If α = ∀yβ, then the variants of α are all formulas ∀yβ', where β' is
a variant of β, as well as all formulas ∀z[β'(y/z)] obtained from such
∀yβ' by alphabetic change.

It is easy to check that ~ is an equivalence relation (e.g., the symmetry
of ~ follows at once from the fact that alphabetic changes are reversible).
It can also be readily verified that if α~α' then the only difference
between α and α' is that bound occurrences of some variables in α may
be replaced by bound occurrences of other variables in α'. But if the
ith symbol¹ of α is a free occurrence of a variable, then the ith symbol
of α' is a free occurrence of the same variable. Similarly, if a symbol other
than a variable occurs in α at the ith place, then the same symbol also
occurs in α' at the same place.
Moreover, we have:

3.8. Theorem. If α~α' then α and α' are logically equivalent.
Proof. Immediate from Thm. 3.6 and Prob. 1.8. ∎

¹ Recall that α is a string, i.e., a finite sequence of symbols. The ith symbol of α is simply
the ith member of this sequence.

For any formula α and a finite number of variables y_1,...,y_n, we can always
find a variant α' of α such that y_1,...,y_n do not occur bound in α'. Suppose,
e.g., that α has a subformula ∀y_1β. Then this can be replaced by ∀z[β(y_1/z)],
where z does not occur at all in β (and hence is free for y_1 in β) and is
different from y_1,...,y_n. After a finite number of such replacements,
α is transformed into an α' with the desired property. In particular, if all
the variables occurring in a given term t are among the y_i, then t is free
for x in α'.
We are now in a position to define α(x/t) even when t is not free for
x in α. We simply choose some variant α' of α such that t is free for x in
α' and define α(x/t) to be α'(x/t). To make α(x/t) uniquely defined, we
have to choose α' by some definite rule. We do this now, by recursion
on deg α.

3.9. Definition. For given x and t, define α' for each α as follows:
If α is atomic then α' = α.
If α = ¬β, then α' = ¬(β').
If α = β→γ, then α' = β'→γ'.
If α = ∀yβ, we distinguish three cases:
(a) If x is not free in α, then α' = α.
(b) If x is free in α and y does not occur in t, then α' = ∀yβ'.
(c) If x is free in α but y does occur in t, then α' = ∀z[β'(y/z)], where
z is the first variable (i.e., the v_i with least index) such that z does
not occur in t and such that α' arises from ∀yβ' by alphabetic
change.
We then define α(x/t) to be α'(x/t), where the latter is already defined
in Def. 3.3.

Comparing this definition of α' with Defs. 3.3 and 3.7, it is easy to check
that α' is indeed a variant of α, and t is free for x in α'. Also, if t is already
free for x in α itself, then α' = α. So Def. 3.9 is consistent with Def. 3.3,
and α(x/t) is now well defined for all α, x and t.
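
A capture-avoiding substitution in the spirit of Defs. 3.3 and 3.9 can be sketched in Python as follows. This is a simplified sketch, not the definition itself: the tuple encoding and all names are assumptions, and in case (c) the fresh variable is taken to be the first of the hypothetical names v1, v2, ... occurring neither in t nor anywhere in β, which may differ from the variable selected by the rule of Def. 3.9 (the result is then a different variant of the same formula).

```python
from itertools import count

# Hypothetical encoding:
#   terms:    ("var", name) | ("fn", symbol, (args...))
#   formulas: ("atom", pred, (terms...)) | ("not", a) | ("imp", a, b) | ("all", y, a)

def term_vars(t):
    """Set of variables occurring in the term t."""
    if t[0] == "var":
        return {t[1]}
    return set().union(*(term_vars(a) for a in t[2]))

def subst_term(s, x, t):
    """s(x/t) for terms, as in Def. 3.1."""
    if s[0] == "var":
        return t if s[1] == x else s
    return ("fn", s[1], tuple(subst_term(a, x, t) for a in s[2]))

def formula_vars(a, free_only):
    """Free variables of a (free_only=True) or all variables occurring in a."""
    kind = a[0]
    if kind == "atom":
        return set().union(*(term_vars(s) for s in a[2]))
    if kind == "not":
        return formula_vars(a[1], free_only)
    if kind == "imp":
        return formula_vars(a[1], free_only) | formula_vars(a[2], free_only)
    inner = formula_vars(a[2], free_only)            # kind == "all"
    return inner - {a[1]} if free_only else inner | {a[1]}

def subst(a, x, t):
    """a(x/t), renaming a bound variable first whenever t would be captured."""
    kind = a[0]
    if kind == "atom":
        return ("atom", a[1], tuple(subst_term(s, x, t) for s in a[2]))
    if kind == "not":
        return ("not", subst(a[1], x, t))
    if kind == "imp":
        return ("imp", subst(a[1], x, t), subst(a[2], x, t))
    y, body = a[1], a[2]                             # kind == "all"
    if x not in formula_vars(a, free_only=True):     # case (a): nothing to do
        return a
    if y in term_vars(t):                            # case (c): rename y first
        avoid = term_vars(t) | formula_vars(body, free_only=False) | {x}
        z = next(f"v{i}" for i in count(1) if f"v{i}" not in avoid)
        body = subst(body, y, ("var", z))
        y = z
    return ("all", y, subst(body, x, t))             # case (b), or (c) after renaming

x, y, w = ("var", "x"), ("var", "y"), ("var", "w")
alpha = ("all", "y", ("atom", "=", (x, y)))          # all y (x = y)
with_y = subst(alpha, "x", y)    # bound y is renamed before y is put for x
with_w = subst(alpha, "x", w)    # no renaming needed
```

Substituting y for x in ∀y(x=y) now yields a formula of the form ∀z(y=z) rather than the captured ∀y(y=y).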
The following theorem is a stronger version of Thm. 3.4 and shows
that α(x/t) has the required semantic property.

3.10. Theorem. For all α, x, t and σ,

    α(x/t)^σ = α^{σ(x/𝑡)},

where 𝑡 = t^σ.
Proof. By Def. 3.9, α(x/t) = β(x/t), where β (= α') is some particular variant
of α such that t is free for x in β. Hence α(x/t)^σ = β(x/t)^σ. By Thm. 3.4,
β(x/t)^σ = β^{σ(x/𝑡)}, but since α~β we have β^{σ(x/𝑡)} = α^{σ(x/𝑡)} by Thm. 3.8. ∎

As an application, we prove:

3.11. Theorem. Let Φ be a set of formulas, and let ¬∀xα ∈ Φ. Let y be
a variable that is not free in Φ (i.e., not free in any φ∈Φ). If Φ is satisfiable,
then so is¹ Φ, ¬α(x/y).
Proof. Let σ ⊨ Φ, and let U be the universe of σ. Since ¬∀xα ∈ Φ we
have (¬∀xα)^σ = ⊤. Therefore, by the BSD, α^{σ(x/u)} = ⊥ for some u∈U.
Consider σ(y/u). Since y is not free in Φ, it follows from Thm. 2.3 that
σ(y/u) ⊨ Φ. Also, by Thm. 3.10,

    α(x/y)^{σ(y/u)} = α^{σ(y/u)(x/𝑦')},

where 𝑦' = y^{σ(y/u)} = u. Therefore

(1) α(x/y)^{σ(y/u)} = α^{σ(y/u)(x/u)}.

If y is the same variable as x, then σ(y/u)(x/u) = σ(x/u), so that

(2) α^{σ(y/u)(x/u)} = α^{σ(x/u)}.

On the other hand, if y≠x then y cannot be free in α (because y is not
free in ¬∀xα) and then (2) holds by Thm. 2.3. Thus in any case we can
conclude from (1) and (2) that α(x/y)^{σ(y/u)} = ⊥, i.e., σ(y/u) ⊨ ¬α(x/y). ∎

3.12. Problem. Let Φ be a set of formulas and let ¬∀xα ∈ Φ. Let a be
a constant that does not occur in Φ. Show that if Φ is satisfiable, then
so is Φ, ¬α(x/a).

In the sequel we shall sometimes need to substitute a term t for a constant
c (rather than for a variable).

3.13. Definition. α(c/t) is the formula obtained by substituting an
occurrence of t for each occurrence of c in α.

In practice we shall use α(c/t) only when no occurrence of c in α is
within the scope of a quantifier whose variable of quantification occurs
in t (e.g., when t is a closed term).
We conclude this section with a treatment of simultaneous substitution.

¹ Notice that, by Defs. 3.3 and 3.9, [¬α](x/t) = ¬[α(x/t)]. Hence we can write simply
“¬α(x/t)”.

Suppose that we wish to substitute terms t_1,...,t_n simultaneously for
variables x_1,...,x_n in α. Of course, we wish to do so correctly, so that
an appropriate generalization of Thm. 3.10 should hold. In general,
it will not be good enough to substitute the t_i one by one. For example,
t_1 may contain x_2; so, if we first substitute t_1 for x_1, the formula α(x_1/t_1)
will have new free occurrences of x_2 which have been introduced by this
substitution. And if we now substitute t_2 for x_2, then t_2 will replace these
additional free occurrences of x_2, which is contrary to our intention.
We therefore have to be a bit more cautious.
We shall now define α(x_1/t_1,...,x_n/t_n), the result of substituting t_1,...,t_n
in α simultaneously for x_1,...,x_n respectively, by recursion on n. The
case n = 1 has already been treated above (Def. 3.9). For n>1, we assume
as our inductive hypothesis that α(x_1/s_1,...,x_{n-1}/s_{n-1}) has already been
defined for all terms s_1,...,s_{n-1}. We now define:

3.14. Definition. If x_n is not free in α, then

    α(x_1/t_1,...,x_n/t_n) = α(x_1/t_1,...,x_{n-1}/t_{n-1}).

If x_n is free in α, then

    α(x_1/t_1,x_2/t_2,...,x_n/t_n) = α(x_1/s_1,...,x_{n-1}/s_{n-1})(x_n/t_n)(z/x_n),

where z is the first variable which occurs neither in α nor in t_1,...,t_n and
s_i = t_i(x_n/z) for i = 1,...,n-1.

Thus we first substitute z for x_n in t_1,...,t_{n-1}. Then we substitute the
resulting terms s_1,...,s_{n-1} simultaneously for x_1,...,x_{n-1} in α (the induction
hypothesis being that we know how to do this). Next, we substitute
t_n for x_n. Finally, we replace z again by x_n. The role of z is to replace
x_n at points at which we do not want t_n to enter. After t_n is safely substituted
in the right places for x_n, we put x_n back in place of z.
We now show rigorously that this definition has the required semantic
property.

3.15. Theorem. α(x_1/t_1,...,x_n/t_n)^σ = α^{σ(x_1/𝑡_1)...(x_n/𝑡_n)}, where 𝑡_1 = t_1^σ,...,𝑡_n = t_n^σ.
Proof. By induction on n. For n = 1 the result has already been proved
(Thm. 3.10). Now let n>1. The case in which x_n is not free in α is left
to the reader.

¹ Here, and in the rest of this section, x_1,...,x_n are always assumed to be n distinct
variables.

If x_n is free in α, we have, by Def. 3.14 and Thm. 3.10,

    α(x_1/t_1,...,x_n/t_n)^σ = [α(x_1/s_1,...,x_{n-1}/s_{n-1})(x_n/t_n)]^{σ(z/𝑥_n)},

where 𝑥_n = x_n^σ. Applying Thm. 3.10 again, we have

    α(x_1/t_1,...,x_n/t_n)^σ = α(x_1/s_1,...,x_{n-1}/s_{n-1})^{σ(z/𝑥_n)(x_n/𝑡')},

where 𝑡' = t_n^{σ(z/𝑥_n)}. But since z was chosen so as not to occur in t_n, we have
in fact 𝑡' = t_n^σ = 𝑡_n. We now apply our induction hypothesis and get

(1) α(x_1/t_1,...,x_n/t_n)^σ = α^{σ(z/𝑥_n)(x_n/𝑡_n)(x_1/𝑠_1)...(x_{n-1}/𝑠_{n-1})},

where

    𝑠_i = s_i^{σ(z/𝑥_n)(x_n/𝑡_n)}  for i = 1,...,n-1.

But s_i = t_i(x_n/z). Hence by Thm. 3.2 we get

(2) 𝑠_i = t_i^{σ(z/𝑥_n)(x_n/𝑡_n)(x_n/𝑧')},  i = 1,...,n-1,

where

(3) 𝑧' = z^{σ(z/𝑥_n)(x_n/𝑡_n)}.

Clearly, (2) can be simplified:

(4) 𝑠_i = t_i^{σ(z/𝑥_n)(x_n/𝑧')},  i = 1,...,n-1.

Since z is different from x_n, from (3) we clearly have 𝑧' = 𝑥_n, and hence
(4) yields

    𝑠_i = t_i^{σ(z/𝑥_n)(x_n/𝑥_n)}.

But since z does not occur in t_i we have

    𝑠_i = t_i^{σ(x_n/𝑥_n)} = t_i^σ = 𝑡_i.

We can now rewrite (1) as

    α(x_1/t_1,...,x_n/t_n)^σ = α^{σ(z/𝑥_n)(x_1/𝑡_1)...(x_n/𝑡_n)}

(the x_i being distinct, the order in which their values are changed is
immaterial), and since z is not free in α we have, in fact,

    α(x_1/t_1,...,x_n/t_n)^σ = α^{σ(x_1/𝑡_1)...(x_n/𝑡_n)}

as claimed. ∎

In some chapters of this book we shall adopt the convention of writing
“α(t_1,...,t_n)” instead of “α(x_1/t_1,...,x_n/t_n)”, provided the choice of x_1,...,x_n
is determined unambiguously by the context or by an explicit agreement
(e.g., we may agree that x_1,...,x_n are the first n variables of ℒ, i.e., v_1,...,v_n
respectively).

The following three sections (§§4-6) as well as §8 are devoted to the
study of first-order tableaux. The material covered in them bears to first-order
logic much the same relation as the material in §§7-9 of Ch. 1 bore
to propositional logic. (More specifically, §4 parallels §7 of Ch. 1; §5
contains technical lemmas needed in §6; and §6 parallels §8 of Ch. 1.
The material of §7 does not deal with tableaux; it is inserted here because
it will be used in §8 in connection with tableaux, but it will also be used
for another purpose in Ch. 3. Finally, §8 parallels §9 of Ch. 1.)
A reader who does not wish to study first-order tableaux at all should
proceed as follows: First, do Probs. 4.3-4.6, which can be solved (albeit
somewhat less efficiently) without tableaux. Then read §7 and from there
proceed to §9 or (if one wants to skip §§9-13 as well) directly to Ch. 3.
In Ch. 3, §2 must also be omitted.
A reader who would like to benefit from the practical advantages of
the tableau method, but who does not have the time or the patience for
the painstaking work of §§5-6, may skip these two sections only. In §8
(see Prob. 8.6) the tools are provided for an alternative proof of the main
result of §6; that proof is easier, but non-constructive and indirect.

*§ 4. First-order tableaux

Propositional tableaux were designed as a strategy for showing that a given
finite set of formulas is not satisfied by any truth valuation. Now we shall
introduce the method of first-order tableaux as a strategy for showing
that a given finite set of formulas is not satisfiable (i.e., not satisfied by
any valuation).
The construction of first-order tableaux is similar to that of propositional
tableaux, except that in addition to the three old propositional rules
(¬¬, → and ¬→) we now admit two quantifier rules represented
schematically as follows:

    ∀-rule:    ⋮              ¬∀-rule:    ⋮
              ∀xα                        ¬∀xα
               |                           |
             α(x/t)                    ¬α(x/y)   (y restricted)

In the ∀-rule, t may be any ℒ-term. In the ¬∀-rule, y may be any
variable which is not free in any formula of the branch which is being extended.
The particular y used in a given application of the ¬∀-rule is said to be
the critical variable of that application.
If ℒ is a language with equality, we also admit three equality rules,
represented schematically as follows:

    SI-rule:   ⋮
               |
              t=t

where t is any ℒ-term;

    SF-rule:   ⋮
               |
              t_1=t_{n+1} → t_2=t_{n+2} → ... → t_n=t_{2n} → ft_1...t_n = ft_{n+1}...t_{2n}

where f is any n-ary function symbol of ℒ and t_1,...,t_{2n} are any ℒ-terms;

    SP-rule:   ⋮
               |
              t_1=t_{n+1} → t_2=t_{n+2} → ... → t_n=t_{2n} → Pt_1...t_n → Pt_{n+1}...t_{2n}

where P is any n-ary predicate symbol of ℒ and t_1,...,t_{2n} are any ℒ-terms.
(For n = 2, P may be =.)
The names of these rules are abbreviations of “self-identity”, “substitutivity
of equals in functions” and “substitutivity of equals in predicates”.
Note that the equality rules can always be applied to extend any branch,
irrespective of what formulas belong to that branch.
The last difference between first-order tableaux and propositional
tableaux is that a branch of a first-order tableau is closed if there is an
atomic¹ formula α such that both α and ¬α belong to that branch. A first-order
tableau for Φ whose branches are all closed is called a first-order
confutation of Φ.
In this chapter we shall often say briefly “tableau” instead of “first-
order tableau”, etc.

4.1. Theorem. The tableau method is semantically sound: if a finite set
Φ of formulas can be confuted, then Φ is not satisfiable. In particular, if
Φ = {¬φ} then φ is logically true.
Proof. We show that if a branch of a tableau is satisfiable (i.e., if the
set of formulas belonging to the branch is satisfiable) and if that branch
is extended (or extended and split) by any one of the eight rules, then the
new branch (or, in the case of the →-rule, at least one of the two new
branches) is also satisfiable.
In the case of the three propositional rules, the proof is easy and is left
to the reader (cf. Prob. 1.7.1).
In the case of the equality rules, it is enough to observe that the formulas
introduced by these rules are all logically true, as the reader can easily check.
Now consider the ∀-rule. Let Ψ be the set of formulas of the branch
which is about to be extended. Let ∀xα ∈ Ψ and let t be any term. Suppose
σ ⊨ Ψ. Then (∀xα)^σ = ⊤, and by the BSD α^{σ(x/u)} = ⊤ for every u in the
universe U of σ. On the other hand, by Thm. 3.10, α(x/t)^σ = α^{σ(x/𝑡)}, where
𝑡 = t^σ. Since t^σ ∈ U, it follows that α(x/t)^σ = ⊤. Thus σ ⊨ Ψ, α(x/t).
The case of the ¬∀-rule is treated by Thm. 3.11.
Since any tableau for Φ is constructed by successive extensions from
the tableau which has Φ as its sole node, it follows that if Φ is satisfiable,
any tableau for Φ must have at least one satisfiable branch. But a closed
branch is clearly not satisfiable. Thus if Φ can be confuted, it is not
satisfiable. In particular, if Φ = {¬φ} then ¬φ is not satisfiable, i.e.,
φ is logically true. ∎

4.2. Theorem. If Φ is a finite set of formulas and if for some formula α
both α∈Φ and ¬α∈Φ, then Φ can be confuted. Hence, in any tableau,
a branch that has both α and ¬α is as good as closed even if α is not atomic.
Proof. By induction on deg α. If α is atomic, then the tableau having
Φ as its sole node is already a confutation.

¹ N.B.: Not just prime.

The cases α = ¬β and α = β→γ are left to the reader.
Finally, let α = ∀xβ. We choose a variable y which is not free in Φ,
and apply the ¬∀-rule to ¬α with y as critical variable. Then we apply
the ∀-rule to α with y as the term t. We get

        Φ
        |
     ¬β(x/y)
        |
      β(x/y)

By the induction hypothesis this is as good as closed. ∎

Here are some practical hints for constructing confutations efficiently.
As in the propositional case, a branch which is seen to be as good as
closed should not be extended any further.
Also, when a formula is being used, it is usually best to use it at once
to extend all branches to which it belongs (except those that are as good
as closed).
In addition to the eight rules, we may in practice use the six “unofficial”
propositional rules (see §7 of Ch. 1) as well as the following “unofficial”
quantifier rules:

    ∃-rule:    ⋮                      ¬∃-rule:    ⋮
              ∃xα                                ¬∃xα
               |                                   |
             α(x/y)   (y restricted)            ¬α(x/t)

Here t may be any term, while y is subject to the same restriction as in the
¬∀-rule. These two rules can easily be justified via Def. 1.5.1.
In general, it is best to apply the restricted quantifier rules (¬∀ and ∃)
as soon as possible, and to delay applying the unrestricted quantifier rules
(∀ and ¬∃) as long as possible. (For example, if in the proof of Thm. 4.2
we had applied the ∀-rule first to obtain β(x/y), we could then no longer
use y as critical variable for the ¬∀-rule.)

Example. We show that if x is not free in β, then (∀xα→β)→∃x(α→β)
is logically true. By Thm. 4.1 it is enough to confute the negation of that
formula:

          ¬[(∀xα→β)→∃x(α→β)]
                   |
                ∀xα→β
              ¬∃x(α→β)
               /     \
          ¬∀xα         β
            |          |
           ¬α       ¬(α→β)
            |          |
         ¬(α→β)        α
            |         ¬β
            α          ×
           ¬β
            ×

Here the ¬∀-rule has been applied to ¬∀xα, to yield ¬α(x/x) = ¬α.
(This is legitimate, since x is not free in β.) The ¬∃-rule has been applied
to ¬∃x(α→β) to yield ¬(α→β)(x/x) = ¬(α→β). The rest is self-
explanatory.
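
As an informal cross-check of the example (not a substitute for the confutation), one can test the formula in small finite interpretations. The sketch below assumes the special case where α is P(x) for a unary predicate P and β is a sentence letter Q, so that x is not free in β. The names `holds` and `valid_up_to` are made up for this illustration.

```python
from itertools import product

def holds(domain, P, Q):
    """Truth value of (∀x P(x) → Q) → ∃x (P(x) → Q) in one interpretation.

    domain: nonempty list of individuals; P: set of individuals
    satisfying P; Q: truth value of the sentence letter Q.
    """
    all_P = all(u in P for u in domain)                  # ∀x P(x)
    antecedent = (not all_P) or Q                        # ∀x P(x) → Q
    exists_imp = any((u not in P) or Q for u in domain)  # ∃x (P(x) → Q)
    return (not antecedent) or exists_imp

def valid_up_to(max_size):
    """Check every interpretation whose universe has 1..max_size elements."""
    for n in range(1, max_size + 1):
        domain = list(range(n))
        for bits in product([False, True], repeat=n):
            P = {u for u, b in zip(domain, bits) if b}
            for Q in (False, True):
                if not holds(domain, P, Q):
                    return False
    return True
```

A call such as `valid_up_to(4)` only inspects universes with at most four elements, so a positive answer is evidence rather than proof; logical truth in all universes is what the tableau establishes.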

4.3. Problem. Show by tableaux or otherwise that the following formulas
are logically true:

(a) ∀x(α→β)→∀xα→∀xβ;
(b) α→∀xα, where x is not free in α;
(c) ∀xα→α(x/t);
(d) α(x/t)→∃xα,

where, in (c) and (d), t is any term;

(e) ∀x(α∧β) ↔ ∀xα∧β,
(f) ∃x(α∧β) ↔ ∃xα∧β,
(g) ∀x(α∨β) ↔ ∀xα∨β,
(h) ∃x(α∨β) ↔ ∃xα∨β,
(i) ∀x(α→β) ↔ ∃xα→β,
(j) ∃x(α→β) ↔ ∀xα→β,
(k) ∀x(β→α) ↔ β→∀xα,
(l) ∃x(β→α) ↔ β→∃xα,

where in (e)-(l) x is not free in β.



4.4. Problem. Let P be an n-ary predicate symbol, and let t₁,...,tₙ be
terms. Let x₁,...,xₙ be n distinct variables which do not occur in any of
the tᵢ. Using Prob. 1.7, show by tableaux or otherwise that the three
formulas

    Pt₁...tₙ,
    ∀x₁...∀xₙ(t₁=x₁→...→tₙ=xₙ→Px₁...xₙ),
    ∃x₁...∃xₙ(t₁=x₁∧...∧tₙ=xₙ∧Px₁...xₙ)

are logically equivalent.

4.5. Problem. Let s and t be terms in which the variable y does not occur.
Show that the three formulas

    s=t,
    ∀y(s=y→t=y),
    ∃y(s=y∧t=y)

are logically equivalent.

4.6. Problem. Let s,t₁,...,tₙ be terms, and let f be an n-ary function
symbol. Let x₁,...,xₙ be distinct variables which do not occur in any of
the terms s,t₁,...,tₙ. Show that the three formulas

    ft₁...tₙ=s,
    ∀x₁...∀xₙ(t₁=x₁→...→tₙ=xₙ→fx₁...xₙ=s),
    ∃x₁...∃xₙ(t₁=x₁∧...∧tₙ=xₙ∧fx₁...xₙ=s)

are logically equivalent.

4.7. Problem. Prove by tableaux that the formulas ∀x∀y(x=y) and
∃x∀y(x=y) are logically equivalent.

*§ 5. Some “book-keeping” lemmas

In practice, the method of first-order tableaux is very easy to use. However,
the theoretical study of this method is technically a bit tricky because
in applying the ∀-rule to a formula ∀xα, the term t that we need to
substitute may not be free for x in α; thus, the substitution may involve
alphabetic changes inside α. (A similar situation arises in connection
with the ¬∀-rule.) If several such substitutions are made in the same
tableau, it is rather difficult to keep track of all these alphabetic changes.
Let T be a tableau for Φ. We shall say that T is pure if the following
two conditions hold:

(1) The terms t used in the applications of the ∀-rule in T do not contain
any variable that occurs bound in Φ.
(2) The critical variables of the applications of the ¬∀-rule in T do
not occur bound in Φ.
It is easy to see that in a pure tableau no alphabetic changes are ever made.
Instead of trying directly to keep track of the alphabetic changes that
are made in an impure tableau, we propose to reduce the general case
to that of pure tableaux.
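
The purity conditions can be illustrated with a small sketch. It assumes an encoding of formulas as nested tuples, with atomic formulas whose arguments are variables only; the encoding and the names `free_vars`, `bound_vars` and `free_for` are inventions of this illustration, not notation from the text. A term is represented just by the set of variables it contains.

```python
# Formulas are nested tuples (an encoding chosen for this sketch only):
#   ("atom", "P", ("x", "y"))   atomic formula whose arguments are variables
#   ("not", A)   ("imp", A, B)   ("all", "x", A)

def free_vars(f):
    """Set of variables occurring free in f."""
    tag = f[0]
    if tag == "atom":
        return set(f[2])
    if tag == "not":
        return free_vars(f[1])
    if tag == "imp":
        return free_vars(f[1]) | free_vars(f[2])
    if tag == "all":
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"unknown tag {tag!r}")

def bound_vars(f):
    """Set of variables that occur bound (are quantified) in f."""
    tag = f[0]
    if tag == "atom":
        return set()
    if tag == "not":
        return bound_vars(f[1])
    if tag == "imp":
        return bound_vars(f[1]) | bound_vars(f[2])
    if tag == "all":
        return {f[1]} | bound_vars(f[2])
    raise ValueError(f"unknown tag {tag!r}")

def free_for(term_vars, x, f):
    """True if a term whose variables are term_vars is free for x in f."""
    tag = f[0]
    if tag == "atom":
        return True
    if tag == "not":
        return free_for(term_vars, x, f[1])
    if tag == "imp":
        return free_for(term_vars, x, f[1]) and free_for(term_vars, x, f[2])
    if tag == "all":
        y, body = f[1], f[2]
        if y == x or x not in free_vars(body):
            return True  # no free occurrence of x below this quantifier
        return y not in term_vars and free_for(term_vars, x, body)
    raise ValueError(f"unknown tag {tag!r}")
```

For α = ∀y Pxy, encoded as `("all", "y", ("atom", "P", ("x", "y")))`, the variable y is not free for x, while z is. Whenever the variables of a term are disjoint from `bound_vars` of the formula, as condition (1) demands, `free_for` succeeds, so no alphabetic change is needed.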
Let T and T' be tableaux. We shall say that T' is a variant of T (briefly,
T∼T') if T can be transformed into T' by replacing each formula α in
T by a variant α', in such a way that each application of the ∀-rule in
T is transformed into an application of the ∀-rule with the same term t,
and each application of the ¬∀-rule in T is transformed into an application
of the ¬∀-rule with the same critical variable.¹
Similarly, we say that Φ∼Φ' if Φ can be transformed into Φ' by replacing
each φ∈Φ by a variant φ'.

Our first aim is to show that, given a tableau T for Φ, we can construct
a variant T' of T for any variant Φ' of Φ. For this we shall need

5.1. Lemma. If α∼α', then α(x/t)∼α'(x/t).

Proof. By induction on deg α. We leave the easy cases to the reader
and deal here only with the case where α is universal.
Let α = ∀yβ. Then (by Def. 3.7) α' = ∀yβ' or α' = ∀z[β'(y/z)], where
β∼β' and z is not free in β' but is free for y in β'. We may assume that
x is free in α (hence also in α') because otherwise we have

    α(x/t) = α,  α'(x/t) = α',

and the assertion of our lemma is trivial. Also, for the time being, we shall
assume that t is free for x in both α and α'.

By Def. 3.3, y cannot occur in t, and

    α(x/t) = ∀y[β(x/t)].

If α' = ∀yβ', then Def. 3.3 similarly implies that

    α'(x/t) = ∀y[β'(x/t)],

and our assertion follows at once from the induction hypothesis.

1 As a matter of fact the last two conditions (beginning with “in such a way...”) are
redundant, but it is convenient to include them nevertheless.

Now suppose α' = ∀z[β'(y/z)]. Then Def. 3.3 implies that z cannot
occur in t, t is free for x in β'(y/z), and

    α'(x/t) = ∀z[β'(y/z)(x/t)].

Since z is free for y in β' and t is free for x in β'(y/z) it follows that in going
from β' to β'(y/z)(x/t) no alphabetic changes are made. Thus β' and
β'(y/z)(x/t) have exactly the same bound occurrences of variables. Next,
we observe that

    β'(y/z)(x/t) = β'(x/t)(y/z).

For, x is different from both y and z (x is supposed to be free in α and α')
and y does not occur in t; so it makes no difference whether we first
substitute z for y and then t for x, or vice versa. Thus

    α'(x/t) = ∀z[β'(x/t)(y/z)].

Now, z is not free in β'(x/t), because z was not free in β' and t does not
contain z. Also, z is free for y in β'(x/t), because z was free for y in β'
and the substitution of t for x in β' cannot change matters in this respect.
(Recall that t does not contain y and the substitution does not involve
any alphabetic change within β'!) Therefore our α'(x/t) arises by
alphabetic change from ∀y[β'(x/t)] and we have

    α'(x/t) ∼ ∀y[β'(x/t)]
            ∼ ∀y[β(x/t)],

by the induction hypothesis.

We have therefore proved the assertion of our lemma under the assumption
that t is free for x in α and α'. However, even if this assumption does
not hold, then by Def. 3.9 we have at any rate

    α(x/t) = γ(x/t),  α'(x/t) = γ'(x/t),

where γ and γ' are certain variants of α and α' respectively, and t is free
for x in γ and γ'. Thus, by what we have already proved,

    α(x/t) = γ(x/t) ∼ γ'(x/t) = α'(x/t). ∎

5.2. Lemma. Let Φ∼Φ'. Given a tableau T for Φ, we can construct a
tableau T' for Φ' such that T∼T'.
Proof. By induction on the number of nodes in T. If T has Φ as its only
node, we take T' to be the tableau having Φ' as its only node.

If T has more than one node, then T is obtained by extending a tableau
T₀ by means of one of the eight rules. By the induction hypothesis, we
have got a tableau T₀' for Φ' such that T₀∼T₀'. We shall show that the
required T' can be obtained by extending T₀' using the same rule which
yielded T from T₀. Eight cases have to be treated, corresponding to the
eight rules.
The cases of the propositional and equality rules are straightforward
and are left to the reader. We shall deal here with the quantifier rules.
Rule ∀. Suppose some branch of T₀ has a formula ∀xα and T was
obtained by adjoining {α(x/t)} as successor to the terminal node of that
branch.
The corresponding branch of T₀' must have some variant (∀xα)' of
∀xα. By Def. 3.7, we have

    (∀xα)' = ∀xα'  or  (∀xα)' = ∀z[α'(x/z)],

where α∼α' and z is not free in α' but is free for x in α'.
If (∀xα)' = ∀xα', then we can apply the ∀-rule to ∀xα' and extend
T₀' to T' by adjoining {α'(x/t)} as successor to the terminal node of the
branch in question. By Lemma 5.1, α(x/t) ∼ α'(x/t), so that T∼T' as
required.
Now suppose (∀xα)' = ∀z[α'(x/z)]. Again, we apply to this formula
the ∀-rule with the same term t. It remains to show that

    (1) α(x/t) ∼ α'(x/z)(z/t).

As a matter of fact, it can be proved that

    (2) α'(x/z)(z/t) = α'(x/t),

hence, again by Lemma 5.1, we have (1). However, to prove (2) rigorously
is a bit tricky. (The substitution of z for x in α' involves no alphabetic
changes, since z is free for x in α'; but one has to verify that the
substitution of t for z in α'(x/z) involves exactly the same alphabetic
changes as the substitution of t for x in α'.)
Instead of appealing to (2), we can proceed as follows. We take any
formula α'' such that α'∼α'' (hence α∼α'') and such that z and the variables
occurring in t do not occur bound in α''. Then the substitution of z in
α'' and of t in both α'' and α''(x/z) do not involve any alphabetic changes.
Thus it is easy to see that

    (3) α''(x/z)(z/t) = α''(x/t).



Moreover, by Lemma 5.1,

    (4) α'(x/z)(z/t) ∼ α''(x/z)(z/t),
    (5) α(x/t) ∼ α''(x/t).

Using (3), (4) and (5) we obtain (1).
Rule ¬∀. This is treated in the same way as the ∀-rule. Here one
also needs to observe that if y is not free in any formula of a given branch
of T₀, then y cannot be free in any formula of the corresponding branch
of T₀'. ∎

It is clear that if T is a confutation and T∼T' then T' is a confutation
as well. (Recall that an atomic formula has no variants other than itself.)
Also, if T is a tableau for Φ, we can always choose Φ' such that Φ∼Φ'
and such that the tableau T' of Lemma 5.2 is pure. (We obtain Φ' from
Φ by suitable alphabetic changes inside each φ∈Φ in such a way that the
terms t used in the applications of the ∀-rule in T do not contain any
variable bound in Φ', and the critical variables of T also do not occur
bound in Φ'.)
We can now prove:

5.3. Lemma. Let y₁,...,yₙ be any n variables. Given a confutation of Φ,
we can construct for Φ a confutation in which none of the variables y₁,...,yₙ
is used as a critical variable.
Proof. We may assume that y₁,...,yₙ are not free in Φ, because a variable
which is free in Φ cannot in any case be used as a critical variable in a
tableau for Φ.
Let T be the given confutation. We choose Φ' such that Φ∼Φ' and
such that the tableau T' constructed in Lemma 5.2 is pure. Moreover,
we can choose Φ' so that y₁,...,yₙ do not occur bound (hence do not
occur at all) in Φ'. Because T' is pure, y₁,...,yₙ do not occur bound in
any formula in T'. Since T is a confutation, T' is a confutation as well.
We choose n variables z₁,...,zₙ which are different from all the yᵢ and
do not occur at all in T'. Clearly, zᵢ could have been used just as well
as yᵢ. More precisely, let T'' be obtained from T' by putting z₁,...,zₙ
everywhere for y₁,...,yₙ respectively. (In particular, wherever yᵢ is used
in T' as a critical variable, it will now be replaced by zᵢ.) Clearly, T'' is
also a confutation of Φ' and y₁,...,yₙ are not used in T'' as critical variables
(in fact, they do not occur at all in T'').
Finally, using Lemma 5.2 again, we can obtain for Φ a confutation
T''' such that T'' ∼ T'''. Clearly, T''' has the required properties. ∎

From now on we put

    Φ(x/t) = {φ(x/t) : φ∈Φ}.

We now come to the final result of this section.

5.4. Lemma. Given a confutation of Φ, we can confute Φ(z/s).
Proof. Let T be the given confutation. By Lemma 5.3 we may assume
that no variable occurring in the term s serves in T as a critical variable.
Also, for the time being we shall assume that
(1) T is pure,
(2) no variable occurring in s occurs bound in Φ.
From (1) and (2) it is easy to see that s is free for z (indeed, for any variable)
in every formula of T. In fact, no variable occurring in s can be bound
in any formula of T.
Let T* be obtained from T by replacing each formula α in T by α(z/s).
We shall show that T* is a confutation of Φ(z/s).
When we go from T to T*, the initial node Φ of T becomes Φ(z/s). We
shall verify that each application of a rule in T is transformed into an
application of the same rule in T*. For the propositional and equality
rules this is quite obvious (e.g., α→β is transformed into α(z/s)→β(z/s)
and t=t is transformed into t(z/s)=t(z/s)). We shall deal in detail with
the quantifier rules.
Rule ∀. Suppose ∀xα is used in T to yield α(x/t). When T is
transformed into T*, ∀xα becomes [∀xα](z/s) and α(x/t) becomes α(x/t)(z/s).
If z happens to be x, then clearly

    [∀xα](z/s) = ∀xα,
    α(x/t)(z/s) = α(x/t)(x/s) = α(x/t(x/s)).

Thus α(x/t)(z/s) can be obtained from [∀xα](z/s) by the ∀-rule.
If z≠x, then we have

    [∀xα](z/s) = ∀x[α(z/s)],

and since by (1) and (2) x does not occur in s we also have

    α(x/t)(z/s) = α(z/s)(x/t(z/s)).

Here again α(x/t)(z/s) can be obtained by the ∀-rule from [∀xα](z/s).
Rule ¬∀. Suppose ¬∀xα is used in T to yield ¬α(x/y). Upon
going from T to T*, ¬∀xα becomes [¬∀xα](z/s) and ¬α(x/y) becomes
¬α(x/y)(z/s).

If z happens to be x, then

    [¬∀xα](z/s) = ¬∀xα.

Also, since T is assumed to be pure, the critical variable y is not bound
in Φ and hence cannot be bound in any formula of T (for, in a pure tableau
no alphabetic changes are ever made). Hence y must be different from
x and we have

    ¬α(x/y)(z/s) = ¬α(x/y)(x/s) = ¬α(x/y).

Thus, if z=x, the transformation from T to T* does not affect ¬∀xα
and ¬α(x/y). We still have to check that y is not free in any formula
preceding ¬α(x/y) in T*. But this is actually the case. For y was not free in
any formula preceding ¬α(x/y) in T; it is true that these formulas have been
transformed by substituting s for z, but we have assumed in the beginning
of this proof that y does not occur in s either.
Finally, suppose that z≠x. Then

    [¬∀xα](z/s) = ¬∀x[α(z/s)].

Also, we may assume that z is free in Φ, otherwise the assertion of our
lemma is trivial. Hence y, being a critical variable in T, must be different
from z, so that y(z/s)=y. And since, by (1) and (2), x does not occur
in s, we have

    ¬α(x/y)(z/s) = ¬α(z/s)(x/y(z/s)) = ¬α(z/s)(x/y).

As before, we know that y cannot be free in any formula of T* preceding
¬α(x/y)(z/s). Thus in this case too ¬α(x/y)(z/s) can be obtained from
[¬∀xα](z/s) using the ¬∀-rule.
It follows that T* is a tableau for Φ(z/s); and it is clear that every branch
of T* is closed, because an atomic formula in T is transformed into an
atomic formula in T*.
So far we have assumed that (1) and (2) hold. If this is not the case,
we can find a variant Φ' of Φ such that no variable of s occurs bound in
Φ' and such that the tableau T' constructed from T in Lemma 5.2 is pure.
By what we have proved so far, we can construct a confutation of Φ'(z/s).
But by Lemma 5.1 we have Φ(z/s)∼Φ'(z/s); so, using Lemma 5.2 once
more, we get a confutation of Φ(z/s).

*§ 6. The Elimination Theorem for first-order tableaux

We start by proving four simple lemmas analogous to Lemmas 1.8.1-1.8.4.
(The wording is actually identical, except that now “confutation” means
first-order confutation, not propositional confutation.)

6.1. Lemma. Given a confutation of Φ, we can confute Φ∪Ψ, where Ψ
is any finite set of formulas.
Proof. This is not quite as trivial as the analogous result for propositional
tableaux (Lemma 1.8.1). In general, we cannot simply adjoin Ψ to the
initial node of the given tableau, because in that tableau variables that
are free in Ψ might have been used as critical variables. However, by
Lemma 5.3 we can construct for Φ a confutation in which no free variable
of Ψ is used as a critical variable. In this tableau we can adjoin Ψ to the
initial node, getting a confutation of Φ∪Ψ. ∎

It follows from Lemma 6.1 that if we possess a confutation for a subset
of the set of formulas belonging to a given branch, then this branch is as
good as closed. Below we shall make implicit use of this fact.

6.2. Lemma. Given a confutation of Φ, ¬¬α, we can confute Φ, α. ∎

6.3. Lemma. Given a confutation of Φ, α→β, we can confute Φ, ¬α
and Φ, β. ∎

6.4. Lemma. Given a confutation of Φ, ¬(α→β), we can confute
Φ, α, ¬β. ∎

We shall also need the following:

6.5. Lemma. Given a confutation of Φ, ¬∀xα, we can confute Φ, ¬α(x/t),
where t is any term.
Proof. Let T be the given confutation. We proceed by induction on the
depth d of T. By Lemma 5.3 we may assume that no variable occurring
in t is used as a critical variable in T.
If d=0, then for some atomic formula β both β∈Φ and ¬β∈Φ. Then
clearly Φ, ¬α(x/t) is also a confutation (with a single node).
If d>0, we examine how the node(s) of level 1 in T could have been
obtained.
First, suppose that T begins thus:

    Φ, ¬∀xα
       |
       β

where β is obtained by an equality rule, or by applying a quantifier or
propositional rule (except the →-rule) to a formula φ∈Φ. If in T we
fuse the initial node with the node {β} of level 1, we get a confutation of
depth d−1 for Φ, β, ¬∀xα. By the induction hypothesis, we can then
confute Φ, β, ¬α(x/t). We now start a tableau for Φ, ¬α(x/t) thus:

    (1)   Φ, ¬α(x/t)
              |
              β

where β is obtained in the same way as in T. But, since we already possess
a confutation of Φ, β, ¬α(x/t), the single branch of our tableau (1) is
as good as closed.
Next, suppose that T begins thus:

    Φ, ¬∀xα
      /  \
     β    γ

where β and γ are obtained by applying the →-rule to some φ∈Φ. If we
fuse the initial node with the node {β}, we get a confutation of depth
less than d for Φ, β, ¬∀xα. Hence by the induction hypothesis we can
confute Φ, β, ¬α(x/t). Similarly, we confute Φ, γ, ¬α(x/t). We can now
start a tableau for Φ, ¬α(x/t) thus:

    Φ, ¬α(x/t)
       /  \
      β    γ

and both branches of this tableau are as good as closed.

Finally, we have to consider the case where level 1 of T is obtained by
applying the ¬∀-rule to ¬∀xα. Then T begins thus:

    Φ, ¬∀xα
       |
    ¬α(x/y)

and by Lemma 5.3 we may assume that y does not occur at all in ¬∀xα.
By fusing the initial node with that of level 1, we get a confutation of
depth d−1 for Φ, ¬α(x/y), ¬∀xα. By the induction hypothesis (with
y used instead of t) we can confute Φ, ¬α(x/y), ¬α(x/y), i.e., Φ, ¬α(x/y).

By Lemma 5.4, we can now confute Φ(y/t), ¬α(x/y)(y/t). But y, being
a critical variable in T, cannot be free in Φ. Hence Φ(y/t)=Φ. Also, it
can be shown that

    (2) ¬α(x/y)(y/t) = ¬α(x/t)

because y does not occur at all in α. However, instead of appealing to (2),
which is a bit tricky to prove rigorously, we may argue as follows.¹ We
choose α' such that α∼α' and such that y and the variables of t do not
occur bound in α' (hence y cannot occur at all in α'). Then it is easy to
see that

    (3) ¬α'(x/y)(y/t) = ¬α'(x/t),

since these substitutions involve no alphabetic changes. On the other
hand, by Lemma 5.1 we have

    (4) ¬α(x/y)(y/t) ∼ ¬α'(x/y)(y/t).

Thus, by (3) and (4), and another appeal to Lemma 5.1,

    ¬α(x/y)(y/t) ∼ ¬α'(x/t) ∼ ¬α(x/t).

Since we already possess a confutation of Φ, ¬α(x/y)(y/t), we can use
Lemma 5.2 to confute Φ, ¬α(x/t). ∎

We remark that Lemma 6.5 is rather powerful. It implies that if in
a branch of a tableau we have both ¬∀xα and ¬α(x/t), where t is any
term, then ¬∀xα can be considered used up in that branch and need
not be used to extend it. Intuitively speaking, the reason for this is that
¬α(x/t) conveys at least as much information as ¬∀xα. (On the other
hand, for any finite number of terms t₁,...,tₙ, the formulas α(x/t₁),...,α(x/tₙ)
jointly do not in general convey as much information as ∀xα. Therefore
∀xα may never be used up in a branch.)
As a last preparatory lemma we get in the same way as 1.8.5:

6.6. Lemma. Given confutations of Φ, δ and Φ, ¬δ, where δ is atomic,
we can confute Φ. ∎

We can now prove:

6.7. Elimination Lemma. Given confutations of Φ, δ and Φ, ¬δ, where
δ is any formula, we can confute Φ.

1 Cf. the proof of Lemma 5.2.



Proof. By induction on deg δ. If δ is atomic, we appeal to Lemma 6.6.
The cases where δ = ¬α and δ = α→β are treated exactly as in the proof
of the Elimination Lemma for propositional tableaux (Lemma 1.8.6),
using Lemmas 6.1-6.4.
The remaining case is δ = ∀xα. Let T be the given confutation of Φ, ∀xα.
We shall now transform T by eliminating from it all nodes obtained by
applying the ∀-rule to ∀xα.
Suppose T has a node {α(x/t)} obtained by applying the ∀-rule to ∀xα.
We choose a node of this form at the deepest possible level in T, i.e., such
that no node of a deeper level is obtained in a similar way. Let this level
be k+1, and let Φ₀,...,Φₖ (in this order) be the nodes of T preceding our
{α(x/t)}. In particular, Φ₀ = Φ, ∀xα. Let

    Ψ = Φ ∪ Φ₁ ∪ ... ∪ Φₖ.

(Note that ∀xα has been excluded from Ψ.)
Since ∀xα is not used in T below level k+1, it is clear that by adjoining
Ψ to our node {α(x/t)} and by considering the nodes of T that follow this
node, we get a confutation of Ψ, α(x/t).
On the other hand, we were given a confutation of Φ, ¬∀xα; so by
Lemma 6.5 we can confute Φ, ¬α(x/t). Since Φ⊆Ψ, we can use Lemma 6.1
to confute Ψ, ¬α(x/t).
Now we have got confutations of both Ψ, α(x/t) and Ψ, ¬α(x/t). Since
deg α(x/t) = deg α = deg δ − 1, we can (by the induction hypothesis)
confute Ψ.
We transform T as follows. First, we cut out our node {α(x/t)} as well
as all nodes following it. We are now left with the “truncated” branch
Φ₀,Φ₁,...,Φₖ. But Ψ is a subset of the set of formulas of this branch,
and we possess a confutation of Ψ. Therefore this branch is as good as
closed, without making any further use of ∀xα.
Thus we can transform T into a confutation of Φ, ∀xα in which ∀xα
is used fewer times than in T. Repeating this process a finite number of
times, we obtain a confutation of Φ, ∀xα in which ∀xα is never used.
We can now delete ∀xα from the initial node, getting the required
confutation of Φ. ∎

As in §8 of Ch. 1, we now (temporarily) admit the EM-rule in addition
to our eight rules. However, we have again:

6.8. Elimination Theorem. Given a confutation of Φ in which the EM-rule
is used, we can construct for Φ an EM-free confutation. ∎

Therefore we shall not regard the EM-rule as one of the tableau rules,
but we can still use it in practice.

§ 7. Hintikka sets

In proving the completeness of the method of first-order tableaux (see
next section), as well as in proving the completeness of the predicate
calculus (in Ch. 3), we shall make use of Hintikka sets.

7.1. Definition. A set Ψ of ℒ-formulas is a Hintikka set in ℒ if the
following conditions hold:
(1) If φ is atomic, then φ and ¬φ do not both belong to Ψ (i.e., if
φ∈Ψ, then ¬φ∉Ψ).
(2) If ¬¬α∈Ψ, then α∈Ψ.
(3) If α→β∈Ψ, then ¬α∈Ψ or β∈Ψ.
(4) If ¬(α→β)∈Ψ, then both α∈Ψ and ¬β∈Ψ.
(5) If ∀xα∈Ψ, then α(x/t)∈Ψ for every ℒ-term t.
(6) If ¬∀xα∈Ψ, then ¬α(x/t)∈Ψ for some ℒ-term t.
If ℒ is a language with equality, we also require:
(7) For every ℒ-term t, (t=t)∈Ψ.
(8) If f is an n-ary function symbol of ℒ and t₁,...,t₂ₙ are ℒ-terms,
then the formula

    t₁=tₙ₊₁ → t₂=tₙ₊₂ → ... → tₙ=t₂ₙ → ft₁...tₙ=ftₙ₊₁...t₂ₙ

belongs to Ψ.
(9) If P is an n-ary predicate symbol of ℒ and t₁,...,t₂ₙ are terms then
the formula

    t₁=tₙ₊₁ → t₂=tₙ₊₂ → ... → tₙ=t₂ₙ → Pt₁...tₙ → Ptₙ₊₁...t₂ₙ

belongs to Ψ. (For n = 2, P can be =.)
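
The propositional clauses (1)-(4) can be checked mechanically on a finite set of formulas. The following is a minimal sketch under an assumed tuple encoding of formulas; the helpers `neg`, `imp` and `violated_clauses` are hypothetical names for this illustration. Clauses (5)-(9) quantify over all ℒ-terms and are deliberately left out.

```python
# Assumed encoding: ("atom", name), ("not", A), ("imp", A, B).

def neg(a):
    return ("not", a)

def imp(a, b):
    return ("imp", a, b)

def violated_clauses(psi):
    """Return the sorted clause numbers among (1)-(4) that the set psi violates."""
    bad = set()
    for f in psi:
        # (1): an atomic formula and its negation must not both belong to psi.
        if f[0] == "atom" and neg(f) in psi:
            bad.add(1)
        if f[0] == "not":
            g = f[1]
            # (2): a double negation ¬¬α requires α.
            if g[0] == "not" and g[1] not in psi:
                bad.add(2)
            # (4): ¬(α→β) requires both α and ¬β.
            if g[0] == "imp" and not (g[1] in psi and neg(g[2]) in psi):
                bad.add(4)
        # (3): α→β requires ¬α or β.
        if f[0] == "imp" and not (neg(f[1]) in psi or f[2] in psi):
            bad.add(3)
    return sorted(bad)
```

For instance, with atoms p and q, the set {p→q, ¬p} satisfies all four clauses, whereas {p→q} alone violates (3).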

Throughout the present section we let Ψ be a fixed (but arbitrary)
Hintikka set. Also, throughout this section we shall refer to the nine
conditions of Def. 7.1 simply as (1), (2) etc. instead of 7.1.(1), 7.1.(2) etc.
Our aim is to show that Ψ is satisfiable; but this will require some
preliminary work.
We begin by defining a binary relation E between ℒ-terms.

If ℒ is a language without equality, we simply take E to be the identity
relation. In other words, sEt iff s and t are the same term.¹
If ℒ has equality, we define sEt to mean that the equation s=t belongs
to Ψ.

7.2. Lemma. E is an equivalence relation (i.e., it is reflexive, symmetric
and transitive).
Proof. If ℒ does not have equality, the assertion of our lemma is trivial.
Therefore we assume that ℒ does have equality.
The reflexivity of E follows at once from (7).
To show that E is symmetric, let us assume sEt; we shall show that tEs.
By definition, sEt means that (s=t)∈Ψ. Now we make use of (9), where
we take n = 2 and choose P,t₁,t₂,t₃,t₄ to be =,s,s,t,s respectively. Then
(9) says the formula

    s=t→s=s→s=s→t=s

belongs to Ψ. It now follows from (3) that (at least) one of the formulas

    s≠t,  s=s→s=s→t=s

must belong to Ψ. But, since we already have (s=t)∈Ψ, it follows from (1)
that (s≠t)∉Ψ. Thus the formula

    s=s→s=s→t=s

belongs to Ψ. Again, (3) implies that one of the formulas

    s≠s,  s=s→t=s

must belong to Ψ. But, by (7), (s=s)∈Ψ, hence, by (1), (s≠s)∉Ψ. Thus
the formula

    s=s→t=s

belongs to Ψ. Using (3), (7) and (1) once more, we finally see that
(t=s)∈Ψ, i.e., tEs.
To show that E is transitive, we assume that rEs and sEt; we shall show
that rEt. Since E is symmetric, rEs implies sEr, i.e., (s=r)∈Ψ. Also,

1 By the way, recall that in §4 of Ch. 1 it was stipulated that a first-order language without
equality cannot have function symbols other than constants. Thus the only terms of
such a language are the variables and the constants.

sEt means that (s=t)∈Ψ. We use (9) with n = 2, choosing P,t₁,t₂,t₃,t₄
to be =,s,s,r,t respectively. Then (9) says that the formula

    s=r→s=t→s=s→r=t

belongs to Ψ. From (3), (1) and the fact that (s=r)∈Ψ we infer that
the formula

    s=t→s=s→r=t

belongs to Ψ. It follows from (3), (1) and the fact that (s=t)∈Ψ that the
formula

    s=s→r=t

belongs to Ψ. Using (3), (7) and (1) we finally get (r=t)∈Ψ, i.e., rEt. ∎

By Lemma 7.2, the set of all ℒ-terms is partitioned into mutually
exclusive E-classes. Terms s and t belong to the same E-class iff sEt.
For each term t we put

    [t] = {s: sEt},

i.e., [t] is the E-class to which t belongs. Clearly,

    [s] = [t] iff sEt.
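
The partition into E-classes can be pictured on a finite fragment. The sketch below assumes a finite stock of terms (plain strings) and a finite list of pairs (s, t) standing for equations s=t taken from Ψ; the function name `e_classes` is made up for this illustration. It uses a union-find structure, which closes the given pairs under reflexivity, symmetry and transitivity. For a genuine Hintikka set, Lemma 7.2 says the relation is already closed in this way, so the closure step only matters for an incomplete list of equations.

```python
def e_classes(terms, equations):
    """Partition `terms` into the classes generated by the pairs in `equations`.

    Every term mentioned in `equations` must also be listed in `terms`.
    """
    parent = {t: t for t in terms}

    def find(t):
        # Follow parent links to the representative, halving the path on the way.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for s, t in equations:
        parent[find(s)] = find(t)      # merge the classes of s and t

    classes = {}
    for t in terms:
        classes.setdefault(find(t), set()).add(t)
    return {frozenset(c) for c in classes.values()}
```

With terms a, b, c, d and the equations a=b and b=c, the result consists of the two classes {a, b, c} and {d}.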

7.3. Lemma. Let t₁,...,t₂ₙ be ℒ-terms such that [tᵢ] = [tₙ₊ᵢ] for i=1,...,n.
Then:

(a) For any n-ary function symbol f of ℒ,

    [ft₁...tₙ] = [ftₙ₊₁...t₂ₙ].

(b) For every n-ary predicate symbol P of ℒ, if Pt₁...tₙ∈Ψ then
Ptₙ₊₁...t₂ₙ∈Ψ.
Proof. If ℒ is without equality then the assertion of our lemma is trivial.
(In this case the assumption of the lemma is that tᵢ=tₙ₊ᵢ for i=1,...,n.
Incidentally, here (a) is vacuous for n>0, because the only function symbols
in such a language are the constants.)
We therefore assume that ℒ is a language with equality. Then our
assumption is that

    (tᵢ=tₙ₊ᵢ)∈Ψ,  i=1,...,n.

We use (8) and apply (3) and (1) n times; we then get

    (ft₁...tₙ=ftₙ₊₁...t₂ₙ)∈Ψ,

which proves (a). Similarly, (b) follows from (9) by applying (3) and
(1) n+1 times. ∎
We define an ℒ-valuation σ as follows.
As universe we take the set U of all E-classes of ℒ-terms, i.e.,

    U = {[t]: t is an ℒ-term}.

If f is an n-ary function symbol of ℒ, we define the n-ary operation
f^σ on U by putting

    f^σ([t₁],...,[tₙ]) = [ft₁...tₙ].

Note that this definition is legitimate by Lemma 7.3.(a), which guarantees
that [ft₁...tₙ] is independent of the particular choice of “representatives”
t₁,...,tₙ for the respective E-classes.
If P is an n-ary extralogical predicate symbol of ℒ, we define the n-ary
relation P^σ on U by putting

    ⟨[t₁],...,[tₙ]⟩∈P^σ iff Pt₁...tₙ∈Ψ.

Again, this definition is legitimate by Lemma 7.3.(b).
Finally, for each variable x we put

    x^σ = [x].

This completes the definition of σ. We note that the cardinality of the
universe U of σ is not greater than the cardinality of the set of all ℒ-terms.

7.4. Lemma. For each ℒ-term t,

    t^σ = [t].

Proof. By induction on deg t. If t is a variable x, then x^σ = [x] by the
definition of σ. If t=ft₁...tₙ, then deg tᵢ<deg t for i=1,...,n and we have

    t^σ = f^σ(t₁^σ,...,tₙ^σ)      (by the BSD)
        = f^σ([t₁],...,[tₙ])      (by ind. hyp.)
        = [ft₁...tₙ] = [t].       (by def. of f^σ)
∎
We can now prove the main result of this section:

7.5. Theorem. σ ⊨ Ψ.



Proof. We show by induction on deg φ that

(a) if φ∈Ψ then φ^σ = ⊤,
(b) if ¬φ∈Ψ then φ^σ = ⊥.

Case 1: φ = Pt₁...tₙ, where P is an extralogical predicate symbol.

    If   Pt₁...tₙ∈Ψ,
    then ⟨[t₁],...,[tₙ]⟩∈P^σ,     (by def. of P^σ)
    then ⟨t₁^σ,...,tₙ^σ⟩∈P^σ,     (by Lemma 7.4)
    then (Pt₁...tₙ)^σ = ⊤.        (by BSD)

On the other hand,

    if   ¬Pt₁...tₙ∈Ψ,
    then Pt₁...tₙ∉Ψ,              (by (1))
    then ⟨[t₁],...,[tₙ]⟩∉P^σ,     (by def. of P^σ)
    then ⟨t₁^σ,...,tₙ^σ⟩∉P^σ,     (by Lemma 7.4)
    then (Pt₁...tₙ)^σ = ⊥.        (by BSD)

Case 1': φ = (s=t).

    If   (s=t)∈Ψ,
    then [s] = [t],
    then s^σ = t^σ,               (by Lemma 7.4)
    then (s=t)^σ = ⊤.             (by BSD)

On the other hand,

    if   (s≠t)∈Ψ,
    then (s=t)∉Ψ,                 (by (1))
    then [s] ≠ [t],
    then s^σ ≠ t^σ,               (by Lemma 7.4)
    then (s=t)^σ = ⊥.             (by BSD)

Case 2: φ = ¬α. Here deg α<deg φ.

    If   ¬α∈Ψ,
    then α^σ = ⊥,                 (by ind. hyp.)
    then (¬α)^σ = ⊤.              (by BSD)

On the other hand,

    if   ¬¬α∈Ψ,
    then α∈Ψ,                     (by (2))
    then α^σ = ⊤,                 (by ind. hyp.)
    then (¬α)^σ = ⊥.              (by BSD)

Case 3: φ = α→β. Here deg α and deg β are <deg φ.

    If   α→β∈Ψ,
    then ¬α∈Ψ or β∈Ψ,             (by (3))
    then α^σ = ⊥ or β^σ = ⊤,      (by ind. hyp.)
    then (α→β)^σ = ⊤.             (by BSD)

On the other hand,

    if   ¬(α→β)∈Ψ,
    then α∈Ψ and ¬β∈Ψ,            (by (4))
    then α^σ = ⊤ and β^σ = ⊥,     (by ind. hyp.)
    then (α→β)^σ = ⊥.             (by BSD)

Case 4: φ = ∀xα. Here deg α<deg φ.

    If   ∀xα∈Ψ,
    then α(x/t)∈Ψ for every term t,          (by (5))
    then α(x/t)^σ = ⊤ for every t,           (by ind. hyp.)
    then α^{σ(x/t^σ)} = ⊤ for every t,       (by 3.10)
    then α^{σ(x/[t])} = ⊤ for every t,       (by 7.4)
    then α^{σ(x/u)} = ⊤ for every u∈U,       (by def. of U)
    then (∀xα)^σ = ⊤.                        (by BSD)

On the other hand,

    if   ¬∀xα∈Ψ,
    then ¬α(x/t)∈Ψ for some t,               (by (6))
    then α(x/t)^σ = ⊥ for some t,            (by ind. hyp.)
    then α^{σ(x/t^σ)} = ⊥ for some t,        (by 3.10)
    then α^{σ(x/[t])} = ⊥ for some t,        (by 7.4)
    then α^{σ(x/u)} = ⊥ for some u∈U,        (by def. of U)
    then (∀xα)^σ = ⊥.                        (by BSD) ∎

Thus every Hintikka set in ℒ is satisfied by a valuation whose universe
has cardinality not greater than the cardinality of the set of ℒ-terms.

*§ 8. Completeness of first-order tableaux

In this section we shall describe a procedure by which one can systematically
search for a confutation of any given finite set Φ of formulas. Moreover,
we shall show that if this systematic search does not yield a confutation
of Φ, then Φ is satisfiable, so that by Thm. 4.1 Φ cannot be confuted.

For any set Φ of formulas we define ℒ_Φ as the “poorest” first-order
language such that Φ is still in ℒ_Φ. More precisely, the extralogical
symbols of ℒ_Φ are exactly those extralogical symbols that actually occur
in Φ. If = occurs in Φ, or if some function symbol (other than a constant)
occurs in Φ, then we let ℒ_Φ be a language with equality. Clearly, ℒ_Φ
is a sublanguage of our original language ℒ, in the (obvious) sense that
every ℒ_Φ-expression is also an ℒ-expression.
For the rest of this section we let Φ be a fixed (but arbitrary) finite set
of formulas.
The set of symbols of ℒ_Φ is denumerable: it consists of finitely many
function symbols, finitely many predicate symbols (possibly including =),
two connectives, the universal quantifier, and the variables v₁,v₂,... .
Since every ℒ_Φ-formula is a string (i.e., a finite sequence) of ℒ_Φ-symbols,
we can enumerate these formulas by some effective rule. (For example,
suppose that the symbols of ℒ_Φ, excluding the variables, are k in number.
We assign to them “lexical numbers” 1,...,k respectively. And to each
variable vᵢ we assign the “lexical number” k+i. Let Aₙ be the set of all
formulas whose length is at most n and in which there occur only symbols
whose lexical number is at most n. Then Aₙ is clearly finite and we can
arrange it in lexicographic order. Finally, we can arrange all the Aₙ
among themselves in increasing order of n. The resulting enumeration
of ℒ_Φ-formulas is effective, in the sense that for any m we can find in
a finite number of steps the m-th formula in that enumeration.)
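
The enumeration just described can be sketched as a generator. Symbols are represented here by their lexical numbers, and a string by a tuple of such numbers. Deciding which strings are well-formed depends on the language, so the sketch takes that test as a parameter `is_formula`, a hypothetical predicate supplied by the caller. All names in the block are illustrative assumptions.

```python
from itertools import count, islice, product

def block(n):
    """All nonempty strings of length <= n over lexical numbers 1..n,
    arranged in lexicographic order (the strings from which A_n is drawn)."""
    strings = [s
               for length in range(1, n + 1)
               for s in product(range(1, n + 1), repeat=length)]
    return sorted(strings)

def enumerate_strings():
    """List block(1), block(2), ... one after another, without end."""
    for n in count(1):
        yield from block(n)

def enumerate_formulas(is_formula):
    """Keep only the strings accepted by the caller's well-formedness test."""
    return (s for s in enumerate_strings() if is_formula(s))
```

Since every string of `block(n)` reappears in `block(n + 1)`, each string, and hence each formula, turns up infinitely often. The m-th item is obtained after finitely many steps, e.g. with `next(islice(enumerate_strings(), m, None))`.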
For the rest of this section, we fix some particular effective enumeration

    {φₙ: n=0,1,...}

of all ℒ_Φ-formulas. Without loss of generality we shall assume that each
ℒ_Φ-formula α is repeated infinitely often, i.e., α = φₙ for infinitely many n.
(For the particular enumeration suggested above, this is actually the case.
But even if this were not so, we could always replace the enumeration
φ₀,φ₁,φ₂ etc. by φ₀,φ₁,φ₀,φ₁,φ₂,φ₀,φ₁,φ₂,φ₃ etc.)
Similarly, we fix some effective enumeration

{t_n: n = 0,1,...}

of all ℒ_Φ-terms. (Here it does not matter how many times each term is
repeated.)
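The device for forcing infinitely many repetitions in the enumeration of formulas (passing from item 0, item 1, item 2 etc. to item 0, item 1, item 0, item 1, item 2 etc.) can be sketched as follows. The function name and the callback `item` are assumptions of this sketch; `item(i)` stands for the i-th member of the given enumeration.

```python
from itertools import count, islice

def with_repetitions(item):
    """Yield item(0), item(1), item(0), item(1), item(2), item(0), ..., item(3), ..."""
    for n in count(1):
        for i in range(n + 1):
            yield item(i)

indices = list(islice(with_repetitions(lambda i: i), 9))
```

Every index i is revisited in each block with n ≥ i, so each item occurs infinitely often in the new enumeration.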
Finally, if ℒ_Φ has =, we fix some effective enumeration

{ψ_n: n = 0,1,...}

of all ℒ_Φ-formulas which can be introduced into a tableau by the equality
rules (i.e., all formulas of the forms mentioned in clauses (7), (8), (9) of
Def. 7.1, except that now we confine ourselves to ℒ_Φ).

For every natural number n, we now prescribe how to construct a tableau
T_n for Φ. We proceed by recursion on n.
T_0 is the tableau having Φ as its sole node.
Suppose T_n has been constructed. We first construct a tableau T_n* as
follows:
(a) If φ_n occurs in T_n and if φ_n can be used for any one of the four rules
¬¬, →, ¬→ and ¬∀, we extend every branch of T_n having φ_n by
applying to it the appropriate rule. (In the case of the ¬∀-rule we use
as critical variable the first variable which can be so used.) The resulting
tableau is T_n*.
(b) If φ_n occurs in T_n and has the form ∀xα, we extend n + 1 times each
branch of T_n having φ_n by adjoining successively the nodes {α(x/t_0)},...
...,{α(x/t_n)}. The resulting tableau is T_n*.
(c) In the remaining case (where φ_n is not in T_n, or where φ_n is an
atomic formula or a negation of an atomic formula) we let T_n* be T_n.
Having got T_n*, we now construct T_{n+1}. If ℒ_Φ does not have =, we let
T_{n+1} be T_n*. If ℒ_Φ has =, we obtain T_{n+1} by adjoining the node {ψ_n} to
each branch of T_n*.
If for some n all branches of T_n are closed, then, by Thm. 4.1, Φ is
unsatisfiable.¹
On the other hand, suppose that each T_n has at least one open branch.
By construction, if n > m then every (open) branch of T_n is an extension
of (i.e., obtained by zero or more successive extensions from) one
and only one (open) branch of T_m. We now prove:

8.1. Lemma. If each T_n has an open branch, then there exists a sequence
{B_n: n = 0,1,...} such that for all n B_n is an open branch of T_n and B_{n+1}
is an extension of B_n.
Proof. Let B be a branch of T_n. We say that B is good if for infinitely
many m > n T_m has an open branch which is an extension of B. (It follows
of course that B must be open.)
If B is a branch of T_n, there are finitely many branches of T_{n+1} extending
B, say B^1,...,B^k. If B is good, it follows that at least one of these B^i
must be good as well, because each open branch of T_m (where m > n) which
extends B must also be an extension of some B^i.
Now, T_0 has just one branch; call it B_0. Every branch of every T_n is
an extension of B_0. Since each T_n has an open branch, B_0 is good.
Therefore at least one branch of T_1 which extends B_0 must be good as well.
Choose such a branch and call it B_1. Similarly, T_2 has a good branch B_2
extending B_1, etc. ∎

¹ The fact that here we are working with ℒ_Φ rather than ℒ makes no difference; see 2.7.

Lemma 8.1 is (a version of) the result commonly known as Koenig's
Tree Lemma. The reader should note that the proof was not constructive:
having chosen a good B_n we can only show that one of the extensions
of B_n in T_{n+1} must be good. But we have no way of telling which of these
extensions is good, because for this we would have to examine all T_m
with m > n. However, the proof of Lemma 8.1 does not require the axiom
of choice: the branches of each T_n can be ordered “from left to right”,
and for each n we can take B_{n+1} as the leftmost branch of T_{n+1} which
is a good extension of B_n.
The main result of this section is contained in:

8.2. Theorem. If each T_n has an open branch, then Φ is satisfied by a
valuation with countable (i.e., finite or denumerable) universe.
Proof. Let {B_n: n = 0,1,...} be a sequence as described in Lemma 8.1.
Let Φ_n be the set of all formulas of B_n. In particular, Φ_0 = Φ. Now let
Ψ be the union of all the Φ_n. From the construction of the T_n and the
properties of the B_n it can readily be verified that Ψ is a Hintikka set
in ℒ_Φ. Thus, by Thm. 7.5, Ψ is satisfied by a valuation with countable
universe. (Recall that the set of ℒ_Φ-terms is denumerable.) But
Φ = Φ_0 ⊆ Ψ. ∎

8.3. Corollary. A finite set of formulas can be confuted iff that set is
unsatisfiable. Every satisfiable finite set of formulas is satisfied by some
valuation whose universe is a countable set.
Proof. Thms. 4.1 and 8.2. ∎

8.4. Remark. The above results do not provide us with an effective method
for deciding whether or not a given finite set Φ is satisfiable. The definition
of T_n is constructive, and we may proceed to construct T_0, T_1, T_2 etc.,
knowing that if Φ is unsatisfiable then sooner or later we shall find a
confutation T_n. But there is in general no way of telling in advance how large
is the first n for which T_n will be a confutation. Thus, if we have constructed
T_m for some huge m and find that it has an open branch, there are still
two possibilities: either Φ is in fact satisfiable, or Φ is unsatisfiable but
the first n for which T_n is a confutation is (perhaps) much larger than our m.
Contrast this with the results of §9 of Ch. 1. There we could tell in
advance how far we have to go to obtain an exhausted propositional
tableau for Φ and thus to decide whether or not Φ is satisfied by some
truth valuation.
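The situation described in this remark can be pictured by the following sketch of a bounded search. It is not an implementation of tableaux: `next_tableau` and `all_branches_closed` are hypothetical callbacks standing for the step from T_n to T_{n+1} and for the test that every branch is closed, and `budget` is an arbitrary cut-off introduced here for illustration.

```python
def search_confutation(next_tableau, all_branches_closed, t0, budget):
    """Return the first n <= budget with T_n closed, or None if none was found."""
    t = t0
    for n in range(budget + 1):
        if all_branches_closed(t):
            return n            # T_n is a confutation
        t = next_tableau(t, n)  # pass from T_n to T_{n+1}
    return None                 # inconclusive: no verdict on satisfiability
```

A result of `None` only says that T_0,...,T_budget all have open branches; it does not tell us whether a larger budget would succeed.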

8.5. Problem. Show that if Φ is finite then Φ ⊨ α iff Φ, ¬α can be confuted.
8.6. Problem. Using Cor. 8.3 find an alternative proof¹ for the Elimination
Lemma 6.7.
8.7. Problem. Let Φ be a finite set of formulas in which ¬ does not
occur. Show that in any tableau for Φ there must be a branch in which
¬ does not occur. Hence show that Φ is satisfiable.

An ∀-formula is a formula of the form

∀x_1...∀x_k β,

where k ≥ 0 and β is quantifier-free (i.e., ∀ does not occur in β).

8.8. Problem. Let ℒ be a language without equality, and let Φ be a finite
set of ∀-formulas in ℒ. Let {t_1,...,t_m} be a finite non-empty set of ℒ-terms
(i.e., variables and constants) which contains every variable free in Φ
and every constant occurring in Φ. Let Φ* be the smallest set of formulas
such that Φ ⊆ Φ* and such that whenever ∀xα ∈ Φ* then α(x/t_i) ∈ Φ*
for i = 1,...,m. (Φ* can clearly be constructed from Φ in a finite number
of steps.) Let Φ_0 be the set of all quantifier-free formulas belonging to Φ*.
Show that Φ is unsatisfiable iff a closed propositional tableau can be
constructed for Φ_0. (Let T be a propositional tableau for Φ_0 which is
exhausted in the sense of §9 of Ch. 1. If T is a propositional confutation,
then Φ is easily seen to be unsatisfiable. Otherwise, choose an open branch
of T and let Ψ be the set of all formulas of that branch. Define a valuation
σ as follows. Let the universe U of σ be any set of m objects, U = {u_1,...,u_m}.
Put t_i^σ = u_i for i = 1,...,m. If P is an n-ary predicate symbol of ℒ, let

(u_{i_1},...,u_{i_n}) ∈ P^σ iff Pt_{i_1}...t_{i_n} ∈ Ψ.

Show that σ ⊨ Φ.)
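The decision method of Prob. 8.8 can be sketched in Python as follows. The tuple encoding of formulas is an assumption of this sketch, and so is the requirement that no bound variable of Φ is among the chosen terms (this avoids clashes when substituting). A brute-force truth table is used here in place of the exhausted propositional tableau of the text; both decide whether Φ_0 has a satisfying truth valuation.

```python
from itertools import product

# Assumed encoding: ('atom', 'P', ('x', 'a')), ('not', A), ('imp', A, B), ('all', 'x', A)

def subst(f, x, t):
    """Replace free occurrences of variable x in f by the term t."""
    if f[0] == 'atom':
        return ('atom', f[1], tuple(t if s == x else s for s in f[2]))
    if f[0] == 'all':
        return f if f[1] == x else ('all', f[1], subst(f[2], x, t))
    return (f[0],) + tuple(subst(g, x, t) for g in f[1:])

def quantifier_free_instances(phi, terms):
    """Build the closure Phi* and return its quantifier-free members Phi_0."""
    closure, todo = set(phi), list(phi)
    while todo:
        f = todo.pop()
        if f[0] == 'all':
            for t in terms:
                g = subst(f[2], f[1], t)
                if g not in closure:
                    closure.add(g)
                    todo.append(g)
    return {f for f in closure if f[0] != 'all'}

def atoms(f):
    return {f} if f[0] == 'atom' else set().union(*(atoms(g) for g in f[1:]))

def holds(f, v):
    """Truth value of a quantifier-free formula under the truth valuation v."""
    if f[0] == 'atom':
        return v[f]
    if f[0] == 'not':
        return not holds(f[1], v)
    return (not holds(f[1], v)) or holds(f[2], v)

def satisfiable(phi, terms):
    ground = quantifier_free_instances(phi, terms)
    props = sorted(set().union(*(atoms(f) for f in ground)))
    return any(all(holds(f, dict(zip(props, bits))) for f in ground)
               for bits in product([False, True], repeat=len(props)))
```

For instance, with the single term a, the set {∀xPx, ¬Pa} yields Φ_0 = {Pa, ¬Pa}, which no truth valuation satisfies.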

¹ However, note that the direct constructive proof in §6 is more informative: given
confutations of Φ, δ and Φ, ¬δ that proof gave an explicit construction of a confutation
of Φ. Using Cor. 8.3 (or Thm. 8.2) we can only say that, for some n, T_n will be a
confutation, without being able to estimate how large such an n might be.
CH. 2, §9], PRENEX AND SKOLEM FORMS 93

An ∃-formula is a formula of the form

∃x_1...∃x_k β,

where k ≥ 0 and β is quantifier-free.

8.9. Problem. Let Φ be a finite set of ∃-formulas in a language without
equality. Show how to find out, in a finite number of steps, whether or
not Φ is satisfiable. (Consider a variant Φ' of Φ such that no bound
variable of any formula occurs in any other formula of Φ'. Let Φ* be obtained
from Φ' by removing all existential quantifiers, leaving only the
quantifier-free parts. Show that Φ is unsatisfiable iff a propositional confutation
can be constructed for Φ*.)

*§ 9. Prenex and Skolem forms

A formula is said to be prenex if it has the form

Q_1x_1Q_2x_2...Q_kx_k β,

where k ≥ 0, and for each i, Q_i is ∀ or ∃, and β is quantifier-free. In this
connection, the string Q_1x_1...Q_kx_k is called the prefix and β is called the
matrix. A normal prenex formula is a prenex formula such that the
variables x_1,...,x_k in its prefix are distinct, and all of them occur in the
matrix β.
A prenex form for α is a prenex formula logically equivalent to α.
The following problem yields a procedure whereby we can find a prenex
form for any given formula.

9.1. Problem. Let φ be a formula which contains n + 1 quantifiers.
Show how to find a formula of the form Qxψ (where Q is ∀ or ∃,
and ψ contains only n quantifiers) which is logically equivalent to φ.
(Proceed by induction on deg φ. Clearly, φ cannot be atomic. The cases
where φ is a negation formula or a universal formula are easy. If φ is
α→β, then by the induction hypothesis we may assume that α has the
form Qxγ or β has the form Qyδ, and by alphabetic change we can arrange
that x is not free in β and y is not free in α. Then use Prob. 4.3.)

Using recursion on n, we can now find a prenex form for φ.
Note that by Prob. 2.4 we can always delete quantifiers in a prenex
form for φ, until a normal prenex form for φ is obtained.
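The procedure suggested by Prob. 9.1 can be sketched as follows. The tuple encoding is an assumption of this sketch, as is the convention that the given formula uses no variables named z0, z1, ...; these names are reserved for the alphabetic changes, so that the side conditions on free variables hold automatically.

```python
from itertools import count

# Assumed encoding: ('atom', 'P', ('x',)), ('not', A), ('imp', A, B),
# ('all', 'x', A), ('ex', 'x', A)
DUAL = {'all': 'ex', 'ex': 'all'}

def rename(f, old, new):
    """Replace free occurrences of variable old in f by variable new."""
    if f[0] == 'atom':
        return ('atom', f[1], tuple(new if s == old else s for s in f[2]))
    if f[0] in DUAL:
        return f if f[1] == old else (f[0], f[1], rename(f[2], old, new))
    return (f[0],) + tuple(rename(g, old, new) for g in f[1:])

def prenex(f, fresh=None):
    """Return (prefix, matrix); prefix is a list of (quantifier, variable)."""
    fresh = count() if fresh is None else fresh
    if f[0] == 'atom':
        return [], f
    if f[0] == 'not':                      # a negation flips each quantifier
        pre, m = prenex(f[1], fresh)
        return [(DUAL[q], x) for q, x in pre], ('not', m)
    if f[0] == 'imp':                      # antecedent quantifiers flip as well
        pa, ma = prenex(f[1], fresh)
        pb, mb = prenex(f[2], fresh)
        return [(DUAL[q], x) for q, x in pa] + pb, ('imp', ma, mb)
    z = f"z{next(fresh)}"                  # alphabetic change to a fresh variable
    pre, m = prenex(rename(f[2], f[1], z), fresh)
    return [(f[0], z)] + pre, m
```

For example, ∀xPx→∀yQy is turned into the prefix ∃z0∀z1 with matrix Pz0→Qz1.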
Let ℒ' be a first-order language which is an extension of ℒ (i.e., every
ℒ-symbol is also an ℒ'-symbol). Then for each ℒ'-valuation σ' there
is a unique ℒ-valuation σ which agrees with σ' on all ℒ-symbols.¹ We say
that σ is the ℒ-reduction of σ'. Also, σ' is said to be an ℒ'-expansion
of σ. (Note that in general an ℒ-valuation has more than one
ℒ'-expansion.)
Let α be an ℒ-formula. By an S-form² for α we mean a formula α'
in some extension ℒ' of ℒ, such that any ℒ-valuation σ satisfies α iff σ
has an ℒ'-expansion σ' satisfying α'.
It is clear that if α' is an S-form for α, then α' is satisfiable iff α is. Also,
if α'' is an S-form for α' (in some extension ℒ'' of ℒ') then α'' is an S-form
for α.
By a V-form³ for α we mean a formula α' in some extension ℒ' of ℒ,
such that any ℒ-valuation σ satisfies α iff every ℒ'-expansion σ' of σ
satisfies α'. (It follows, in particular, that α' is logically valid iff α is.)

9.2. Problem. Show that α' is a V-form for α iff ¬α' is an S-form for
¬α. Similarly, α' is an S-form for α iff ¬α' is a V-form for ¬α.

The rest of this section will be concerned with finding various kinds
of S-forms and V-forms for a given formula.

9.3. Lemma. Let α be an ℒ-formula and let x_1,...,x_k,y be k + 1 different
variables, where k ≥ 0. Let P be a (k+1)-ary extralogical predicate symbol
which is not in ℒ. Then

(1) ∀x_1...∀x_k[∃yPx_1...x_ky ∧ ∀y(Px_1...x_ky → α)]

is an S-form for

(2) ∀x_1...∀x_k∃yα.

Proof. We take ℒ' to be any extension of ℒ such that P is an ℒ'-symbol.
Let σ be an ℒ-valuation with universe U. First, suppose that σ satisfies
formula (2). Then it follows from the BSD that for all u_1,...,u_k in U there
exists some v in U such that

σ(x_1/u_1)...(x_k/u_k)(y/v) ⊨ α.

¹ Thus σ and σ' have the same universe; and x^σ = x^σ' for every variable x, f^σ = f^σ' for
every function symbol f of ℒ, and P^σ = P^σ' for every extralogical predicate symbol
P of ℒ.
² This is short for satisfiability form.
³ Short for validity form.

Let σ' be an expansion of σ such that

(u_1,...,u_k,v) ∈ P^σ' iff σ(x_1/u_1)...(x_k/u_k)(y/v) ⊨ α.

Then for all u_1,...,u_k in U there is some v in U such that

(3) σ'(x_1/u_1)...(x_k/u_k)(y/v) ⊨ Px_1...x_ky

and for each such v we also have

(4) σ'(x_1/u_1)...(x_k/u_k)(y/v) ⊨ α.

It follows from the BSD that σ' satisfies formula (1).
Conversely, if σ' is any expansion of σ and σ' satisfies formula (1), then
for all u_1,...,u_k in U there is some v in U such that (3) holds, and, for every
such v, (4) holds as well. It follows that σ' satisfies formula (2). But since σ
agrees with σ' on all ℒ-symbols, σ too satisfies formula (2). ∎

An ∀∃-formula is a prenex formula in whose prefix all universal
quantifiers precede all existential quantifiers.
By a PS-Skolem form¹ for an ℒ-formula φ we mean a formula ψ
such that
(a) ψ is in a language ℒ' obtained from ℒ by adding new extralogical
predicate symbols only,
(b) ψ is an ∀∃-formula,
(c) ψ is an S-form for φ.
∃∀-formula is defined like ∀∃-formula, but with the roles of ∀ and ∃
reversed. Also, PV-Skolem form is defined like PS-Skolem form, but
with “∀∃” replaced by “∃∀” and “S-form” replaced by “V-form”.

9.4. Theorem. Given any ℒ-formula φ, we can find a PS-Skolem form
for φ.
Proof. By Prob. 9.1, we can assume without loss of generality that φ
is a normal prenex formula. If φ is an ∀∃-formula, then φ is a PS-Skolem
form for itself, and we are through.
Otherwise φ has the form

∀x_1...∀x_k∃yQ_1z_1...Q_mz_mβ,

where k ≥ 0, m > 0, each of the Q_i is ∀ or ∃ and at least one Q_j is ∀,
β is quantifier-free and x_1,...,x_k,y,z_1,...,z_m are all different.

¹ Here “P” is short for “predicate” and “S” (as before) is short for “satisfiability”.


For brevity we put

α = Q_1z_1...Q_mz_mβ.

Using Lemma 9.3 we obtain for φ the S-form

(1) ∀x_1...∀x_k[∃yPx_1...x_ky ∧ ∀y(Px_1...x_ky → α)],

where P is an extralogical predicate symbol not belonging to ℒ. We
now choose a variable z which does not occur in φ and replace ∃yPx_1...x_ky
in (1) by the logically equivalent formula ∃zPx_1...x_kz. Next, to the
formula Px_1...x_ky → α in (1) we apply the last two clauses of Prob. 4.3.
We repeat this m times, until ∀y(Px_1...x_ky → α) is replaced by the logically
equivalent

∀yQ_1z_1...Q_mz_m(Px_1...x_ky → β).

We now apply clauses (e) and (f) of Prob. 4.3 with the quantifiers

∀y, Q_1z_1, ..., Q_mz_m, ∃z,

in this order. At the end of all this, (1) is transformed into the logically
equivalent formula

(2) ∀x_1...∀x_k∀yQ_1z_1...Q_mz_m∃z[Px_1...x_kz ∧ (Px_1...x_ky → β)].

Since (2) is logically equivalent to (1), (2) is an S-form for φ. If (2) is an
∀∃-formula, we are through. If not, we can apply to (2) the same process
again. Each time, a new predicate symbol is introduced, an existential
quantifier in the middle of the prefix is replaced by a universal quantifier,
and a fresh existential quantifier crops up at the end of the prefix.
After a finite number of iterations, we clearly get a PS-Skolem form
for φ. ∎

9.5. Problem. Show that we can find a PV-Skolem form for any
ℒ-formula φ. (Use Prob. 9.2.)
9.6. Problem. Let α be an ℒ-formula. For k ≥ 0, let x_1,...,x_k,y be
distinct variables, and let f be a k-ary function symbol not belonging
to ℒ. Show (using the axiom of choice) that

∀x_1...∀x_k[α(y/fx_1...x_k)]

is an S-form for ∀x_1...∀x_k∃yα.


CH. 2, §10], ELIMINATION OF FUNCTION SYMBOLS 97

We say that ψ is an FS-Skolem form¹ for an ℒ-formula φ if
(a) ψ is in a language ℒ' obtained from ℒ by adding new function
symbols only,
(b) ψ is an ∀-formula,
(c) ψ is an S-form for φ.
FV-Skolem form is defined similarly, but with “∃-formula” and “V-form”
instead of “∀-formula” and “S-form”.
9.7. Problem. Show that, given any ℒ-formula φ, we can find an
FS-Skolem form and an FV-Skolem form for φ.
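One way of obtaining an FS-Skolem form, by applying the idea of Prob. 9.6 to each existential quantifier of a normal prenex formula in turn, can be sketched as follows. The encoding is an assumption of this sketch: a variable or constant is a string, a compound term is `('fn', name, args)`, and the new function symbols are given the hypothetical names sk0, sk1, ..., which are assumed not to occur in the given language.

```python
def subst_term(t, env):
    """Apply the substitution env to a term."""
    if isinstance(t, str):
        return env.get(t, t)
    return ('fn', t[1], tuple(subst_term(s, env) for s in t[2]))

def subst(f, env):
    """Apply the substitution env to a quantifier-free formula."""
    if f[0] == 'atom':
        return ('atom', f[1], tuple(subst_term(t, env) for t in f[2]))
    return (f[0],) + tuple(subst(g, env) for g in f[1:])

def fs_skolem(prefix, matrix):
    """Replace each existential variable by a new function symbol applied
    to the universal variables that precede it in the prefix."""
    universals, env = [], {}
    for i, (q, x) in enumerate(prefix):
        if q == 'all':
            universals.append(x)
        else:
            env[x] = ('fn', f'sk{i}', tuple(universals))
    return universals, subst(matrix, env)
```

For example, ∀x∃y∀z∃wRxyzw becomes ∀x∀zR x sk1(x) z sk3(x,z).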

*§ 10. Elimination of function symbols

In this section we shall show that function symbols are not indispensable
in a first-order language with equality. Roughly speaking, whatever
can be expressed using an n-ary function symbol can also be expressed
using an (n + 1)-ary predicate symbol.
It will be convenient to use the following abbreviation.
10.1. Definition. If α is a formula and x is a variable, then

∃!xα

is (an abbreviated notation for) the formula²

∃y∀x(α ↔ x=y),

where y is the first variable which is different from x and is not free in α.

10.2. Problem. Verify that σ ⊨ ∃!xα iff in the universe U of σ there is
exactly one individual u such that σ(x/u) ⊨ α.
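For a finite universe, the claim of Prob. 10.2 can be checked mechanically. In the sketch below (an illustration only) the formula α is represented by a Python predicate `alpha` on the universe, which is an assumption of this sketch rather than anything in the text.

```python
def exists_unique_by_definition(U, alpha):
    """Evaluate ∃y∀x(α ↔ x=y) directly over the finite universe U."""
    return any(all(alpha(x) == (x == y) for x in U) for y in U)

def exactly_one(U, alpha):
    """Count the individuals u of U satisfying α."""
    return sum(1 for u in U if alpha(u)) == 1
```

The two functions agree for every predicate on U, which is what the problem asserts.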

Consider two first-order languages ℒ and ℒ'. Let σ be an ℒ-valuation,
and let σ' be an ℒ'-valuation. We shall say that σ and σ' are compatible
if they agree on all symbols which are common to ℒ and ℒ'. (In particular,
σ and σ' have the same universe, and agree on all the variables.) Note
that if ℒ' is an extension of ℒ then σ' is compatible with σ iff σ' is an
expansion of σ.
Let α and α' be formulas of ℒ and ℒ' respectively. We shall say that
α and α' are co-satisfiable if for each ℒ-valuation σ satisfying α there

¹ Here “F” is short for “function”.
² Here we tacitly assume ℒ to be a language with equality. Def. 10.1 will only be used
for such languages.

exists an ℒ'-valuation σ' compatible with σ that satisfies α', and also
conversely: for each ℒ'-valuation σ' satisfying α' there exists a compatible
ℒ-valuation σ satisfying α. Note that if ℒ' is an extension of ℒ then
α and α' are co-satisfiable iff α' is an S-form for α.
We shall say that α and α' are co-valid if whenever σ and σ' are a
compatible ℒ-valuation and ℒ'-valuation respectively, then σ ⊨ α iff σ' ⊨ α'.

10.3. Problem. Verify that α and α' are co-valid (co-satisfiable) iff ¬α
and ¬α' are co-satisfiable (co-valid).

For the rest of this section we assume that ℒ is a language with equality.

10.4. Lemma. For each ℒ-formula α we can find an ℒ-formula α* logically
equivalent to α, such that each atomic subformula of α* which contains
a function symbol has the form

fx_1...x_n=y,

where x_1,...,x_n,y are distinct variables.


Proof. Suppose α has an atomic subformula of the form

(1) Pt_1...t_n,

where P is extralogical. If not all the arguments t_1,...,t_n are variables,
we choose n different variables x_1,...,x_n which do not occur in any of
these arguments, and we recall (see Prob. 4.4) that (1) is logically
equivalent to

(2) ∀x_1...∀x_n(t_1=x_1 → ... → t_n=x_n → Px_1...x_n)

as well as to

(3) ∃x_1...∃x_n(t_1=x_1 ∧ ... ∧ t_n=x_n ∧ Px_1...x_n).

Thus (1) can be replaced by (2) or (3). In this way we transform α into
a logically equivalent formula α_1 such that each atomic subformula of
α_1 is an equation or has the form Px_1...x_n.
Next, suppose α_1 has an atomic subformula which is an equation s=t
in which the right-hand side t contains a function symbol. We choose
a variable y which does not occur in s=t and recall (see Prob. 4.5) that
s=t is logically equivalent to

(4) ∀y(s=y → t=y)

as well as to

(5) ∃y(s=y ∧ t=y).

Thus s=t can be replaced by (4) or (5), and we can transform α_1 into
a logically equivalent formula α_2 in which function symbols only occur
on the left-hand side of equations.
Finally, suppose α_2 has an atomic subformula of the form ft_1...t_n=y,
where the arguments t_1,...,t_n are not distinct variables different from y.
Then we choose n distinct variables x_1,...,x_n which are different from y
and do not occur in any of the arguments t_1,...,t_n. By Prob. 4.6, ft_1...t_n=y
is logically equivalent to (and hence can be replaced by) each of the two
formulas

∀x_1...∀x_n(t_1=x_1 → ... → t_n=x_n → fx_1...x_n=y),
∃x_1...∃x_n(t_1=x_1 ∧ ... ∧ t_n=x_n ∧ fx_1...x_n=y).

Repeating this process, we can transform α_2 to a formula α* with the
desired property. ∎

As before, we assume ℒ to be a language with equality. Select an n-ary
function symbol f of ℒ, and let ℒ' be obtained from ℒ by excluding
f and introducing a new (n + 1)-ary predicate symbol P. We prove:

10.5. Theorem. For any ℒ-formula α we can find an ℒ'-formula which
is co-satisfiable with α and an ℒ'-formula which is co-valid with α.
Proof. By Lemma 10.4, we may assume without loss of generality that
f only occurs in α in equations of the form ft_1...t_n=s, where t_1,...,t_n, s
do not contain f. Let α' be obtained from α by replacing each such equation
by Pt_1...t_ns. We show that the formula

(1) ∀v_1...∀v_n∃!v_{n+1}Pv_1...v_nv_{n+1} ∧ α'

is co-satisfiable with α. Indeed, if σ is an ℒ-valuation satisfying α, we
let σ' be the ℒ'-valuation which is compatible with σ and such that for
all u_1,...,u_n,v in the universe U of σ,

(2) (u_1,...,u_n,v) ∈ P^σ' iff f^σ(u_1,...,u_n) = v.

Then it is easy to see that σ' satisfies (1).
Conversely, suppose that σ' is an ℒ'-valuation satisfying (1). In
particular, σ' must satisfy the first conjunct of (1). It follows that for all u_1,...,u_n
in U there is a unique v in U such that (u_1,...,u_n,v) ∈ P^σ'. We let σ be the
ℒ-valuation which is compatible with σ' and such that (2) holds. It is
easy to see that σ ⊨ α. Thus α and (1) are co-satisfiable.

Next, consider the formula

(3) ∀v_1...∀v_n∃!v_{n+1}Pv_1...v_nv_{n+1} → α'.

The negation of (3) is tautologically equivalent to

∀v_1...∀v_n∃!v_{n+1}Pv_1...v_nv_{n+1} ∧ ¬α',

which, by what we already proved, is co-satisfiable with ¬α. Thus,
by Prob. 10.3, α and (3) are co-valid. ∎
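The passage between an operation and its graph, on which the proof of Thm. 10.5 rests, can be illustrated for a finite universe as follows. The function names are assumptions of this sketch; `graph` corresponds to condition (2) of the proof, and `is_functional` to the first conjunct of formula (1).

```python
from itertools import product

def graph(f, U, n):
    """The relation {(u_1,...,u_n,v) : f(u_1,...,u_n) = v}."""
    return {args + (f(*args),) for args in product(U, repeat=n)}

def is_functional(P, U, n):
    """Check ∀v_1...∀v_n∃!v_{n+1}Pv_1...v_nv_{n+1} over the finite universe U."""
    return all(sum(1 for v in U if args + (v,) in P) == 1
               for args in product(U, repeat=n))

def function_of(P, U, n):
    """Recover the operation from a functional relation P (converse direction)."""
    return {args: next(v for v in U if args + (v,) in P)
            for args in product(U, repeat=n)}
```

A relation that fails `is_functional` does not arise from any operation, which is why the first conjunct of (1) cannot be dropped.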

By Thm. 10.5, a first-order language with equality loses virtually nothing
in expressiveness if we eliminate from it a function symbol and introduce
a suitable extralogical predicate instead. Indeed, using the same method
we can eliminate all function symbols in favour of predicate symbols.
Therefore in certain theoretical contexts we lose little by confining our
attention to languages without function symbols, or without function
symbols other than constants.¹ Such languages possess the advantage
of not having complex terms (the only terms being the variables, or the
variables and constants).
However, it can readily be seen that the formula (1) obtained in Thm. 10.5
is in general much longer than the given formula a. (Note that a must
first be transformed by the method of Lemma 10.4!) Also, (1) will usually
have many more quantifiers than a. Therefore languages with function
symbols express statements much more concisely.
Besides, in many branches of mathematics it is more natural to consider
structures with basic operations.² For example, in arithmetic it is more
natural to think of addition and multiplication as binary operations rather
than ternary relations. Similarly, in group theory the multiplication in
a group is normally thought of as a binary operation.
For these reasons it is in general desirable to deal with languages having
function symbols.

¹ In this book we shall do so, e.g., in Ch. 5.
² It may be argued that part of the reason for this is precisely the fact that mathematical
statements can often be expressed much more concisely in terms of operations rather
than relations.
CH. 2, §11], ELIMINATION OF EQUALITY 101

*§ 11. Elimination of equality

In this section we shall investigate the expressive power of a first-order
language without equality, compared to that of a language with equality.

11.1. Problem. Let ℒ be a language without equality. Let σ be an
ℒ-valuation with universe U. Let V be an arbitrary class such that
U ⊆ V. Show that there is an ℒ-valuation τ with universe V such that,
for every ℒ-formula α, σ ⊨ α iff τ ⊨ α. (Take any mapping f of V onto
U such that f(u) = u for every u ∈ U. For any n-ary predicate symbol P, put

P^τ = {(v_1,...,v_n): (f(v_1),...,f(v_n)) ∈ P^σ}.

For every variable x, let x^τ = x^σ, and for every constant a, let a^τ = a^σ.)

Thus, a first-order formula without equality (and even an arbitrary set
of such formulas) cannot impose an upper bound on the size of the universe.
Contrast this with Prob. 2.5.
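The definition of P^τ in Prob. 11.1 can be tried out on finite sets. In this sketch (an illustration only) relations are Python sets of tuples and the mapping f is a Python function; these representations and the name `pull_back` are assumptions made here.

```python
from itertools import product

def pull_back(P_sigma, V, f, n):
    """P^τ = {(v_1,...,v_n) : (f(v_1),...,f(v_n)) ∈ P^σ}."""
    return {vs for vs in product(V, repeat=n)
            if tuple(f(v) for v in vs) in P_sigma}
```

Each new individual v behaves in τ exactly like f(v), so no formula without = can tell the two apart.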
For the rest of this section, let ℒ be a first-order language with equality
but without function symbols other than constants. Let ℒ' be obtained
from ℒ by excluding = and introducing a new extralogical binary
predicate symbol E.
Let α be a given ℒ-formula. We define α* as the ℒ'-formula obtained
from α when each occurrence of = is replaced by an occurrence of E.
Let P_1,...,P_k be all the predicate symbols occurring in α*. If P_i is n_i-ary,
let δ_i be the sentence

∀v_1...∀v_{2n_i}(Ev_1v_{n_i+1} → Ev_2v_{n_i+2} → ... → Ev_{n_i}v_{2n_i}
→ P_iv_1...v_{n_i} → P_iv_{n_i+1}...v_{2n_i}).

Let α' be the formula

∀v_1Ev_1v_1 ∧ δ_1 ∧ ... ∧ δ_k ∧ α*.

Then we have:

11.2. Theorem. If σ is an ℒ-valuation satisfying α, then there is an
ℒ'-valuation σ' which is compatible with σ and which satisfies α'. If α' is
satisfiable, then so is α.
Proof. The first part is trivial: if σ' is the ℒ'-valuation which is compatible
with σ and for which E^σ' is the identity relation on the universe, then
clearly σ' ⊨ α'.
To prove the second part, we may assume that = occurs in α;
otherwise α* is α and the assertion is trivial. Thus E is one of the P_i.
Let σ' be any ℒ'-valuation satisfying α'. From the fact that
∀v_1Ev_1v_1, δ_1,...,δ_k are satisfied by σ' we can easily draw two conclusions.
First, if V is the universe of σ' then E^σ' is an equivalence relation on V.
Second, if we denote by [v] the E^σ'-class of v ∈ V, then we have:

If [v_j] = [v_{n_i+j}] for j = 1,...,n_i and if (v_1,...,v_{n_i}) ∈ P_i^σ' then (v_{n_i+1},...,v_{2n_i}) ∈ P_i^σ'.

We define an ℒ-valuation σ as follows. The universe of σ will be¹

U = {[v] : v ∈ V}.

If P_i is an extralogical predicate symbol occurring in α we put

P_i^σ = {([v_1],...,[v_{n_i}]) : (v_1,...,v_{n_i}) ∈ P_i^σ'}

(and for any extralogical predicate P that does not occur in α, P^σ may be
defined arbitrarily, e.g. as the empty set).
Finally, for any variable x and any constant a we put x^σ = [x^σ'] and
a^σ = [a^σ'].
Then it is easy to see that σ ⊨ α. ∎
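The quotient construction used in the second part of the proof can be sketched for a finite universe. Here the equivalence is given by a Python function `E`, classes are frozensets and relations are sets of tuples; all of these representations, and the name `quotient`, are assumptions of this sketch.

```python
def quotient(V, E, relations):
    """Collapse V by the equivalence E and carry each relation over to classes."""
    cls = {v: frozenset(w for w in V if E(v, w)) for v in V}   # v -> [v]
    U = set(cls.values())
    induced = {name: {tuple(cls[v] for v in tup) for tup in rel}
               for name, rel in relations.items()}
    return U, induced
```

The sketch does not check that E is an equivalence compatible with the given relations; in the proof this is guaranteed by the sentences ∀v_1Ev_1v_1 and δ_1,...,δ_k.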

*§ 12. Relativization

For the purpose of this section we select for a special role some formula
φ and some variable z. Any variable which is free in φ but is different
from z will be called a parameter.
We define, for any formula α, a formula α* called the relativization of
α to φ (with z as chosen variable). We proceed by recursion on deg α.

12.1. Definition.
(a) For atomic α, α* = α.
(b) (¬β)* = ¬(β*).
(c) (β→γ)* = β*→γ*.
(d) If x is not a parameter, then

(∀xβ)* = ∀x[φ(z/x)→β*].
(e) If x is a parameter, then

(∀xβ)* = (∀y[β(x/y)])*,
where y is the first variable which is not a parameter, and is not free in
β but is free for x in β.
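Clauses (a)-(d) of Def. 12.1 can be sketched as a recursive function. The sketch assumes that φ has no parameters (so clause (e) never applies) and that φ is supplied as a hypothetical Python function `phi` which, given a variable x, returns the formula φ(z/x); the tuple encoding of formulas is likewise an assumption.

```python
# Assumed encoding: ('atom', 'P', ('x',)), ('not', A), ('imp', A, B), ('all', 'x', A)

def relativize(f, phi):
    """Return f* for a formula f, where phi(x) builds the formula φ(z/x)."""
    if f[0] == 'atom':                               # clause (a)
        return f
    if f[0] == 'not':                                # clause (b)
        return ('not', relativize(f[1], phi))
    if f[0] == 'imp':                                # clause (c)
        return ('imp', relativize(f[1], phi), relativize(f[2], phi))
    x, body = f[1], f[2]                             # clause (d)
    return ('all', x, ('imp', phi(x), relativize(body, phi)))
```

With φ taken as Dz, the formula ∀x¬Px is relativized to ∀x[Dx→¬Px].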
¹ This definition of U is legitimate only if all the equivalence classes [v] are sets. This
may not be the case if V is a proper class (i.e., a class that is not a set). But by Cor. 8.3
we can always assume V to be a countable set.

To investigate the semantic behaviour of α*, we shall need a few auxiliary
definitions.
If f is an n-ary function symbol, we define φ{f} to be the formula

∀x_1...∀x_n[φ(z/x_1) → ... → φ(z/x_n) → φ(z/fx_1...x_n)],

where x_1,...,x_n are the first n variables which are not parameters. (In
particular, if a is a constant then φ{a} is φ(z/a).)
Let f_1,...,f_m be any function symbols, and let u_1,...,u_k be any variables.
We shall say that a valuation σ has the closure property for {f_1,...,f_m; u_1,...,u_k}
if the following three conditions hold:

(1) σ ⊨ ∃zφ;
(2) σ ⊨ φ{f_i} for i = 1,...,m;
(3) σ ⊨ φ(z/u_j) for j = 1,...,k.

12.2. Problem. Let U be the universe of σ, and let

U* = {u : u ∈ U and σ(z/u) ⊨ φ}.

Verify that conditions (1), (2) and (3) are respectively equivalent to
(1') U* ≠ ∅;
(2') U* is closed under the operation f_i^σ for i = 1,...,m;
(3') u_j^σ ∈ U* for j = 1,...,k.

Suppose that σ has the closure property for {f_1,...,f_m; u_1,...,u_k}, and
let U* be the class defined in Prob. 12.2. We shall say that a valuation
σ* is a restriction of σ for {f_1,...,f_m; u_1,...,u_k} if the following four conditions
hold:
(a) The universe of σ* is U*.
(b) For every extralogical predicate symbol P, P^σ* is the restriction
of P^σ to U*.
(c) For i = 1,...,m, f_i^σ* is the restriction of f_i^σ to U*.
(d) For j = 1,...,k, u_j^σ* = u_j^σ.
We can now prove the main result about relativization:

12.3. Theorem. Let α be a formula such that all the function symbols
occurring in α are among f_1,...,f_m and all the free variables of α are among
u_1,...,u_k. Let σ have the closure property for {f_1,...,f_m; u_1,...,u_k} and let
σ* be a restriction of σ for {f_1,...,f_m; u_1,...,u_k}. Then σ ⊨ α* iff σ* ⊨ α.
Proof. By induction on deg α. We distinguish five cases (a)-(e),
corresponding to the five clauses of Def. 12.1. We outline the proof, leaving the
details to the reader.
Cases (a), (b) and (c) are routine. Case (e) is easily reduced to (d)
via Thm. 3.6.
Case (d): α = ∀xβ, where x is not a parameter. Then all the function
symbols occurring in β are among f_1,...,f_m and all the free variables of
β are among u_1,...,u_k,x. Let u* be any member of U*. It can readily
be verified that σ(x/u*) has the closure property for {f_1,...,f_m; u_1,...,u_k,x}
and that σ*(x/u*) is a restriction of σ(x/u*) for {f_1,...,f_m; u_1,...,u_k,x}.
By the induction hypothesis it follows that σ(x/u*) ⊨ β* iff σ*(x/u*) ⊨ β.
Using this and the definition of α*, it is easy to check that σ ⊨ α*
iff σ* ⊨ α. ∎

12.4. Corollary. Let α be as in Thm. 12.3. If the formula

∃zφ ∧ φ{f_1} ∧ ... ∧ φ{f_m} ∧ φ(z/u_1) ∧ ... ∧ φ(z/u_k) ∧ α*

is satisfiable, then α is satisfiable as well. Also, if α is logically valid, then
so is the formula

∃zφ → φ{f_1} → ... → φ{f_m} → φ(z/u_1) → ... → φ(z/u_k) → α*. ∎

*§ 13. Virtual terms

We have noted in §10 that languages without function symbols (other
than constants) possess the advantage of having simple terms only. On
the other hand, in practice function symbols are needed for making
statements in a more concise way. In this section we describe a way of getting
the best of both these worlds.
Throughout this section we assume that ℒ is a language with equality.
We choose a particular ℒ-formula α having exactly one free variable z.
We shall keep α fixed throughout this section.
We shall be interested only in those ℒ-valuations σ for which σ ⊨ ∃!zα.
Accordingly, we say that such valuations are admissible. (As a matter
of fact, since ∃!zα is a sentence, the admissibility of an ℒ-valuation
depends only on its underlying ℒ-structure.)
If σ is an admissible valuation, then we define a_σ to be the unique
individual in the universe U of σ for which σ(z/a_σ) ⊨ α. (See Prob. 10.2.)
Now let φ be any ℒ-formula and let y be a variable. For the time being,
we hold φ and y fixed as well. Let x_1,...,x_n be all the different free variables
of φ other than y (i.e., all the free variables of ∀yφ). To be quite definite,
we suppose that the x_j are taken in alphabetic order: x_j = v_{i_j}, where
i_1 < i_2 < ... < i_n.
Next, let ℒ' be the extension of ℒ obtained by adding a new n-ary
function symbol f. We define φ{α,f,y} to be the ℒ'-sentence

∀x_1...∀x_n[∃!yφ ∧ φ(y/fx_1...x_n) ∨ ¬∃!yφ ∧ α(z/fx_1...x_n)].

If σ is an admissible ℒ-valuation with universe U, we define an n-ary
operation f_σ on U as follows. For any u_1,...,u_n ∈ U, if there is a unique
v ∈ U such that σ(x_1/u_1)...(x_n/u_n)(y/v) ⊨ φ then we put f_σ(u_1,...,u_n) = v;
otherwise we put f_σ(u_1,...,u_n) = a_σ.
It is easy to see that for any admissible ℒ-valuation σ there is a unique
ℒ'-expansion σ' of σ such that σ' ⊨ φ{α,f,y}. Namely, it is the one for
which f^σ' is our f_σ.
If f does not occur in vj/ (i.e., if v|/ is itself an if-formula), we put v|/0=\|/.
Otherwise, we first use the method of Lemma 10.4 (more precisely, one
of the alternative methods provided in the proof of 10.4) to transform
\J/ into an if'-formula vj/* which is logically equivalent to vj/ and in which
f occurs only in equations of the form
(1) fux...un=v,

where uls...,u„,v are distinct variables. We let vj/„ be the if-formula


resulting from vj/* when each equation of the form (1) is replaced by

(2) 3 !y9(x1/u1 ,... ,x„/u„, y/v) a (pCxi/ifi,... ,x„/u„, y/v) V


V“13 IvvpCxi/Uj,... ,x„/u„, y/v) a«(z/v).
Note that in going from vj/ to vj/0 the only changes that occur are that some
of the atomic parts of vj/ are replaced by other formulas. In fact we have

(“l'l')o=“lOI')o, (^X)o = ('l')o->(x)o. (Vx'j/)o = Vx(vj/)0.


We now prove:
13.1. Lemma. For each SF'-formula vj/,

3!zot, <p{a,f,y}Nvj/-*+v}/0.
Proof. Let o' be a if'-valuation such that o' 3 <p{a,f,y}. Then
o' must be an expansion of an admissible if'-valuation o, and by what
we have seen above fa must be fa. But from this it follows without difficulty
that the formulas (1) and (2) above have the same truth value under o'. |

13.2. Theorem. Let Φ be a set of ℒ-formulas such that Φ ⊨ ∃!zα. Then
for any ℒ'-formula ψ we have Φ ⊨ ψ_0 iff Φ, φ{α,f,y} ⊨ ψ.
Proof. Since Φ ⊨ ∃!zα, it follows from Lemma 13.1 that
(*) Φ, φ{α,f,y} ⊨ ψ↔ψ_0.
Thus if Φ ⊨ ψ_0 we must have Φ, φ{α,f,y} ⊨ ψ.
Conversely, suppose Φ, φ{α,f,y} ⊨ ψ. Then by (*) we get

(**) Φ, φ{α,f,y} ⊨ ψ_0.

Now, if σ is an ℒ-valuation such that σ ⊨ Φ, then σ is admissible because
Φ ⊨ ∃!zα. Hence there is a unique ℒ'-expansion σ' of σ such that
σ' ⊨ φ{α,f,y} and by (**) we have also σ' ⊨ ψ_0. But since ψ_0 is an
ℒ-formula, it follows that σ ⊨ ψ_0. Thus Φ ⊨ ψ_0. ∎
Note, in particular, that if in Thm. 13.2 we take $\psi$ to be an $\mathcal{L}$-formula, then $\psi_0=\psi$, and we have $\Phi \models \psi$ iff $\Phi, \varphi\{\alpha,f,y\} \models \psi$. We say that the addition of $\varphi\{\alpha,f,y\}$ to $\Phi$ is conservative, since the logical consequences of $\Phi, \varphi\{\alpha,f,y\}$ in $\mathcal{L}$ are exactly the same as those of $\Phi$.
Thm. 13.2 is applied in the following way. Suppose that $\Phi$ is as in the theorem and that we want to derive logical consequences of $\Phi$ in $\mathcal{L}$. In practice it may be more convenient to work in $\mathcal{L}'$ (since we can then use an additional function symbol $f$) and add $\varphi\{\alpha,f,y\}$ to $\Phi$. If we discover an $\mathcal{L}'$-formula $\psi$ such that $\Phi, \varphi\{\alpha,f,y\} \models \psi$, then by our theorem it follows that $\Phi \models \psi_0$. (In particular, if $\psi$ happens to be in $\mathcal{L}$ then $\Phi \models \psi$.)
If we like, we may regard each $\mathcal{L}'$-formula $\psi$ merely as an abbreviation of the corresponding $\mathcal{L}$-formula $\psi_0$. This has the advantage that while in practice we work in $\mathcal{L}'$, which has a greater facility of expression, we can pretend that all our formulas belong to the simpler language $\mathcal{L}$.
If we adopt this point of view, we say that terms containing the new function symbol $f$ are virtual terms: they can be used in practice but ignored in theory, since any formula that contains them is regarded as an abbreviation for a formula that does not. In particular, the virtual term $fx_1\ldots x_n$ is denoted by "$\iota_\alpha y\varphi$". Since we have fixed $\alpha$ once and for all, we shall omit the subscript "$\alpha$" and write, more briefly, "$\iota y\varphi$". (Thus, if $w$ is a variable other than $y$, or $\chi$ is an $\mathcal{L}$-formula other than $\varphi$, then the virtual term $\iota w\chi$ is not in $\mathcal{L}'$ but in a language obtained in a similar way, with $\alpha$ playing the same role and with $w$ and $\chi$ playing the roles of $y$ and $\varphi$. Of course, every formula of this other language is also regarded as an abbreviation for a suitable $\mathcal{L}$-formula.)
With the new notation, the sentence $\varphi\{\alpha,f,y\}$ can be written in the form

$$\forall x_1 \ldots \forall x_n\,[\exists!y\,\varphi \wedge \varphi(y/\iota y\varphi) \;\vee\; \neg\exists!y\,\varphi \wedge \alpha(z/\iota y\varphi)].$$

Note also that since $\varphi\{\alpha,f,y\}$ itself is now regarded as an abbreviation for (and can therefore be identified with) the corresponding $\mathcal{L}$-sentence $(\varphi\{\alpha,f,y\})_0$, it follows from Thm. 13.2 that

$$\Phi \models \varphi\{\alpha,f,y\}.$$
This procedure of adding virtual terms can be iterated several times. If $\varphi'$ is any $\mathcal{L}'$-formula and $y'$ is any variable, we may extend $\mathcal{L}'$ to $\mathcal{L}''$ by introducing a suitable new function symbol $f'$. Any $\mathcal{L}''$-formula may be regarded as an abbreviation for an appropriate $\mathcal{L}'$-formula, which is in turn itself regarded as an abbreviation for an $\mathcal{L}$-formula. We then have $\Phi \models \varphi'\{\alpha,f',y'\}$. And so on.
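
13.3. Problem. Let $\Phi$, $\varphi$ and $y$ be as in Thm. 13.2. Regarding $\mathcal{L}'$-formulas as abbreviations for the corresponding $\mathcal{L}$-formulas, verify that

$$\Phi \models \forall x_1 \ldots \forall x_n\,[\exists!y\,\varphi \to \varphi(y/\iota y\varphi)],$$

$$\Phi \models \forall x_1 \ldots \forall x_n\,[\neg\exists!y\,\varphi \to \alpha(z/\iota y\varphi)].$$

Also, taking the special case where $\varphi$ and $y$ happen to be the same as $\alpha$ and $z$ respectively, show that

$$\Phi \models \alpha(z/\iota z\alpha).$$

Hence for any $\varphi$ and $y$ show that

$$\Phi \models \forall x_1 \ldots \forall x_n\,[\neg\exists!y\,\varphi \to \iota y\varphi=\iota z\alpha].$$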

13.4. Problem. Let $\Phi$, $\varphi$ and $y$ be as before. The free variables of $\iota y\varphi$ are defined to be the same as the free variables of $\forall y\,\varphi$, i.e., $x_1,\ldots,x_n$.
(i) If $w$ is not a free variable of $\iota y\varphi$, show that

$$\Phi \models \iota y\varphi=\iota w\,[\varphi(y/w)].$$

(ii) Let $t$ be any term (virtual or otherwise) that does not contain the variable $y$, and let $x$ be any variable other than $y$. Show that

$$\Phi \models \iota y\,[\varphi(x/t)]=[\iota y\varphi](x/t).$$

§ 14. Historical and bibliographical remarks

Historical notes to some of the topics covered in this chapter can be found
in §49 of Ch. IV in Church [1956].
The first explicit formulation of the BSD was contained in a paper
published by Tarski in Polish in 1933, of which Tarski [1935] is a German
translation. (English translation in Tarski [1956].)
For remarks on the method of tableaux see §16 of Ch. 1.
CHAPTER 3

FIRST-ORDER LOGIC (CONTINUED)

This chapter is devoted to the first-order predicate calculus. We shall do for first-order logic what we did in §§10-13 of Chapter 1 for propositional logic.

§ 1. The first-order predicate calculus

By a generalization of $\alpha$ we mean any formula of the form $\forall x_1 \ldots \forall x_k\,\alpha$, where $k \geq 1$ and $x_1,\ldots,x_k$ are any variables, not necessarily distinct.
As first-order axioms of $\mathcal{L}$ we take all $\mathcal{L}$-formulas of the following eight groups:

(Ax.1) All propositional axioms of $\mathcal{L}$.

(Ax.2) $\forall x(\alpha\to\beta)\to\forall x\alpha\to\forall x\beta$,
where $\alpha$, $\beta$ are any $\mathcal{L}$-formulas and $x$ is any variable.

(Ax.3) $\alpha\to\forall x\alpha$,
where $\alpha$ is any $\mathcal{L}$-formula and the variable $x$ is not free in $\alpha$.

(Ax.4) $\forall x\alpha\to\alpha(x/t)$,
where $\alpha$ is any $\mathcal{L}$-formula and $t$ is any $\mathcal{L}$-term free for $x$ in $\alpha$.

(Ax.5) $t=t$,
where $t$ is any $\mathcal{L}$-term.

(Ax.6) $t_1=t_{n+1}\to\ldots\to t_n=t_{2n}\to ft_1\ldots t_n=ft_{n+1}\ldots t_{2n}$,
where $f$ is any $n$-ary function symbol of $\mathcal{L}$ and $t_1,\ldots,t_{2n}$ are any $\mathcal{L}$-terms.

(Ax.7) $t_1=t_{n+1}\to\ldots\to t_n=t_{2n}\to Pt_1\ldots t_n\to Pt_{n+1}\ldots t_{2n}$,
where $P$ is any $n$-ary predicate symbol of $\mathcal{L}$ and $t_1,\ldots,t_{2n}$ are any $\mathcal{L}$-terms.

(Ax.8) All generalizations of axioms of the preceding groups.


If $\mathcal{L}$ is without equality then (Ax.5), (Ax.6) and (Ax.7) are omitted.
As rule of inference we again take modus ponens.
First-order deduction and first-order proof are defined like the corresponding propositional notions (§10 of Ch. 1) except that now the axioms are the first-order axioms instead of just the propositional axioms. We use "$\vdash$" for first-order deducibility as "$\vdash_0$" was used for propositional deducibility.
From now on we shall often drop the label "first-order" and say briefly, e.g., "deduction" instead of "first-order deduction".

1.1. Theorem. If $\Phi \vdash_0 \alpha$ then $\Phi \vdash \alpha$. In particular, if $\vdash_0 \alpha$ then $\vdash \alpha$. ∎

The following theorem states that the predicate calculus is semantically sound:

1.2. Theorem. If $\Phi \vdash \alpha$ then $\Phi \models \alpha$. In particular, if $\vdash \alpha$ then $\models \alpha$.
Proof. Similar to Thm. 1.10.2. We first verify that our first-order axioms are logically true. For (Ax.1) this follows from the fact that the propositional axioms are tautologies. For (Ax.2)-(Ax.7) we can use tableaux (see Thm. 2.4.2 and Prob. 2.4.3) or verify directly that they are logically true. For (Ax.8) we merely need to observe that if $\alpha$ is logically true then so is $\forall x\alpha$, and hence so is any generalization of $\alpha$.
The rest is exactly as in the proof of Thm. 1.10.2. ∎

1.3. Deduction Theorem. Given a deduction of $\beta$ from $\Phi, \alpha$, we can construct a deduction of $\alpha\to\beta$ from $\Phi$. (Hence, if $\Phi, \alpha \vdash \beta$ then $\Phi \vdash \alpha\to\beta$.)
Proof. Exactly the same as for the propositional calculus (Thm. 1.10.4). ∎

Here, as in the propositional calculus, the converse of the Deduction Theorem is easily seen to hold.

As in Ch. 2, we say that a variable $x$ is free in a set $\Phi$ of formulas if $x$ is free in some formula of $\Phi$. Similarly, we say that $x$ is free in a deduction $D$ if $x$ is free in some formula of $D$.

1.4. Theorem. Let $x$ be a variable which is not free in $\Phi$. Given a deduction $D$ of $\alpha$ from $\Phi$, we can construct a deduction $D'$ of $\forall x\alpha$ from $\Phi$ such that
(a) $x$ is not free in $D'$,
(b) every variable free in $D'$ is free in $D$ as well.
Proof. Let $\varphi_1,\ldots,\varphi_n$ be the given deduction $D$. So $\varphi_n=\alpha$. By recursion on $k$ ($k=1,\ldots,n$) we construct a deduction $D_k$ of $\forall x\varphi_k$ from $\Phi$ such that $x$ is not free in $D_k$ and such that each variable free in $D_k$ is free also in $D$. Then $D_n$ is the required $D'$.
If $\varphi_k$ is an axiom, then $\forall x\varphi_k$ is an axiom as well (see (Ax.8)), so $\forall x\varphi_k$ by itself is the required $D_k$.
If $\varphi_k$ is in $\Phi$, then $x$ is not free in $\varphi_k$. We take $D_k$ to be

    $\varphi_k$,                          (hyp.)
    $\varphi_k\to\forall x\varphi_k$,     (Ax.3)
    $\forall x\varphi_k$.                 (m.p.)

Finally, if for some $i,j<k$ we have $\varphi_j=\varphi_i\to\varphi_k$, then by the induction hypothesis we already possess $D_i$ and $D_j$. We let $D_k$ be the deduction obtained by concatenating $D_i$ and $D_j$ and adding three more formulas, as follows:

    $D_i$: ..., $\forall x\varphi_i$,
    $D_j$: ..., $\forall x(\varphi_i\to\varphi_k)$,
    $\forall x(\varphi_i\to\varphi_k)\to\forall x\varphi_i\to\forall x\varphi_k$,   (Ax.2)
    $\forall x\varphi_i\to\forall x\varphi_k$,                                      (m.p.)
    $\forall x\varphi_k$.                                                           (m.p.)

This completes the proof. ∎
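The construction in the proof of Thm. 1.4 is effective, and it may help to see it carried out mechanically. The following Python sketch is an illustration under our own assumptions, not part of the text: formulas are nested tuples, each line of a deduction carries a hypothetical justification tag, and the function name `generalize` is ours. It is a linear variant of the recursion in the proof (lines already produced are reused instead of concatenating copies of $D_i$ and $D_j$), and it does not check that $x$ is not free in the hypotheses.

```python
def imp(a, b):
    return ("imp", a, b)


def forall(x, a):
    return ("all", x, a)


def generalize(deduction, x):
    """Turn a deduction of phi into a deduction of (forall x) phi.

    deduction: list of (formula, reason), where reason is ("ax",), ("hyp",)
    or ("mp", i, j), meaning that line j is the formula (line i -> this line).
    Assumes, without checking, that x is not free in any hypothesis.
    """
    out = []
    pos = {}  # index k in the input -> index in out of the line (forall x) phi_k
    for k, (phi, reason) in enumerate(deduction):
        if reason[0] == "ax":
            out.append((forall(x, phi), ("ax",)))            # by (Ax.8)
        elif reason[0] == "hyp":
            out.append((phi, ("hyp",)))
            out.append((imp(phi, forall(x, phi)), ("ax",)))  # by (Ax.3)
            out.append((forall(x, phi), ("mp", len(out) - 2, len(out) - 1)))
        else:
            _, i, j = reason
            phi_i = deduction[i][0]
            ax2 = imp(forall(x, imp(phi_i, phi)),
                      imp(forall(x, phi_i), forall(x, phi)))
            out.append((ax2, ("ax",)))                       # by (Ax.2)
            a = len(out) - 1
            out.append((imp(forall(x, phi_i), forall(x, phi)), ("mp", pos[j], a)))
            out.append((forall(x, phi), ("mp", pos[i], len(out) - 1)))
        pos[k] = len(out) - 1
    return out


def check_mp(deduction):
    """Check that every modus ponens step has the shape it claims."""
    for phi, reason in deduction:
        if reason[0] == "mp":
            _, i, j = reason
            if deduction[j][0] != imp(deduction[i][0], phi):
                return False
    return True


# Hypothetical example: from hypotheses A and A -> B deduce B, then generalize.
D = [("A", ("hyp",)), (imp("A", "B"), ("hyp",)), ("B", ("mp", 0, 1))]
D2 = generalize(D, "x")
```

In this example `D2` has nine lines and ends with the formula `("all", "x", "B")`; `check_mp(D2)` confirms that each modus ponens step is well formed.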
1.5. Remark. By Thm. 1.4 we have the following law of generalization on variables: if $\Phi \vdash \alpha$ and $x$ is not free in $\Phi$, then $\Phi \vdash \forall x\alpha$. Two modified forms of this law are stated in the following problem.

1.6. Problem. Assuming that $x$ is not free in $\Phi, \beta$, show that
(a) if $\Phi \vdash \beta\to\alpha$, then $\Phi \vdash \beta\to\forall x\alpha$,
(b) if $\Phi \vdash \alpha\to\beta$, then $\Phi \vdash \exists x\alpha\to\beta$.
(To prove (b), begin by observing that¹ $\alpha\to\beta \vdash_0 \neg\beta\to\neg\alpha$.)

¹ Here and in the sequel, whenever we make an assertion of the form $\Phi \vdash_0 \alpha$, where $\Phi$ is finite, that assertion may be verified e.g. by constructing a propositional confutation of $\Phi, \neg\alpha$, or by truth tables (using Prob. 1.6.12 and Thm. 1.12.1).

1.7. Lemma. Given a deduction $D$ of $\alpha$ from $\Phi$, we can construct a deduction $D^*$ of $\alpha$ from $\Phi$ such that every variable free in $D^*$ is free in $\Phi$ or in $\alpha$.
Proof. Let $x_1,\ldots,x_n$ be all the variables which are free in $D$ but are free in neither $\Phi$ nor $\alpha$. By Thm. 1.4, we can construct a deduction $D'$ of $\forall x_1\alpha$ from $\Phi$ such that $x_1$ is not free in $D'$ but any variable free in $D'$ is also free in $D$. We extend $D'$ by adding the two formulas $\forall x_1\alpha\to\alpha$ (this is (Ax.4), with $x_1$ as the term $t$) and $\alpha$ (by modus ponens). Thus we have obtained for $\alpha$ a deduction $D_1$ from $\Phi$ such that $x_1$ is not free in $D_1$ but each variable free in $D_1$ is also free in $D$.
Continuing in this way, we successively eliminate $x_2,\ldots,x_n$ and get $D^*$ as required. ∎

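1.8. Theorem. If $\Phi \vdash \alpha$ and $\alpha$ as well as all members of $\Phi$ are sentences, then there is a deduction $D$ of $\alpha$ from $\Phi$ such that $D$ is made up of sentences only.

We say that formulas $\alpha$ and $\beta$ are provably equivalent if both $\alpha \vdash \beta$ and $\beta \vdash \alpha$. By the Deduction Theorem, this is tantamount to saying that both $\vdash \alpha\to\beta$ and $\vdash \beta\to\alpha$. But it can easily be verified that $\{\alpha\to\beta, \beta\to\alpha\} \vdash_0 \alpha\leftrightarrow\beta$, $\alpha\leftrightarrow\beta \vdash_0 \alpha\to\beta$ and $\alpha\leftrightarrow\beta \vdash_0 \beta\to\alpha$. Therefore $\alpha$ and $\beta$ are provably equivalent iff $\vdash \alpha\leftrightarrow\beta$.
It is easy to verify that provable equivalence is indeed an equivalence relation between formulas.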

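1.9. Lemma. Let $\alpha$ be a subformula of $\beta$, and let $\alpha'$ be provably equivalent to $\alpha$. Let $\beta'$ be obtained from $\beta$ by replacing (zero or more) occurrences of $\alpha$ by occurrences of $\alpha'$. Then $\beta'$ is provably equivalent to $\beta$.
Proof. By induction on $\deg\beta$. If $\alpha=\beta$, then $\beta'$ is $\alpha$ or $\alpha'$ and there is nothing to prove. From now on we shall assume that $\alpha$ is a proper subformula of $\beta$, and in particular that $\beta$ is not atomic. We shall make use of Prob. 1.4.4.
If $\beta=\neg\gamma$, then $\beta'=\neg(\gamma')$, where $\gamma'$ is obtained from $\gamma$ by replacing occurrences of $\alpha$ by occurrences of $\alpha'$. By the induction hypothesis we have $\vdash \gamma\leftrightarrow\gamma'$. But it is easy to verify that

$$\gamma\leftrightarrow\gamma' \vdash_0 \neg\gamma\leftrightarrow\neg(\gamma');$$

hence

$$\vdash \neg\gamma\leftrightarrow\neg(\gamma').$$

If $\beta=\gamma\to\delta$, then $\beta'=\gamma'\to\delta'$, where $\gamma'$ is obtained from $\gamma$ as above, and $\delta'$ is obtained from $\delta$ in a similar way. By the induction hypothesis we have $\vdash \gamma\leftrightarrow\gamma'$ and $\vdash \delta\leftrightarrow\delta'$. But it is easy to verify that

$$\{\gamma\leftrightarrow\gamma', \delta\leftrightarrow\delta'\} \vdash_0 (\gamma\to\delta)\leftrightarrow(\gamma'\to\delta').$$

Hence $\vdash (\gamma\to\delta)\leftrightarrow(\gamma'\to\delta')$.
If $\beta=\forall x\gamma$, then $\beta'=\forall x\gamma'$, where $\gamma'$ is obtained from $\gamma$ as before. By the induction hypothesis we have $\vdash \gamma\to\gamma'$ and $\vdash \gamma'\to\gamma$. Generalizing on $x$ (see 1.5) we get $\vdash \forall x(\gamma\to\gamma')$, and using

$$\forall x(\gamma\to\gamma')\to\forall x\gamma\to\forall x\gamma' \qquad \text{(Ax.2)}$$

we obtain

$$\vdash \forall x\gamma\to\forall x\gamma'.$$

Similarly,

$$\vdash \forall x\gamma'\to\forall x\gamma.$$ ∎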
1.10. Theorem. If $\beta\sim\beta'$ then $\beta$ and $\beta'$ are provably equivalent.
Proof. By Lemma 1.9 it is enough to show that if $\forall z\,[\alpha(x/z)]$ is obtained from $\forall x\alpha$ by alphabetic change then these two formulas are provably equivalent.
Since $z$ is free for $x$ in $\alpha$, the formula $\forall x\alpha\to\alpha(x/z)$ is an instance of (Ax.4). Thus $\forall x\alpha \vdash \alpha(x/z)$. Since $z$ is not free in $\alpha$, it cannot be free in $\forall x\alpha$ either; hence we may generalize on $z$ (see 1.5) and obtain

$$\forall x\alpha \vdash \forall z\,[\alpha(x/z)].$$

Since an alphabetic change is reversible, $\forall x\alpha$ is obtainable from $\forall z\,[\alpha(x/z)]$ by alphabetic change, and we can prove similarly that $\forall z\,[\alpha(x/z)] \vdash \forall x\alpha$. ∎

Note that Lemma 1.9 and Thm. 1.10 provide new proofs for Thms. 2.3.6 and 2.3.8. (By Thm. 1.2, formulas that are provably equivalent are also logically equivalent.)

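1.11. Theorem. For every formula $\alpha$, variable $x$ and term $t$,

$$\vdash \forall x\alpha\to\alpha(x/t), \qquad \vdash \alpha(x/t)\to\exists x\alpha.$$

Proof. By Def. 2.3.9, $\alpha(x/t)=\alpha'(x/t)$, where $\alpha'$ is a certain variant of $\alpha$ such that $t$ is free for $x$ in $\alpha'$. By (Ax.4) we have $\forall x\alpha' \vdash \alpha(x/t)$ and by Thm. 1.10 we also have $\forall x\alpha \vdash \forall x\alpha'$. Hence $\forall x\alpha \vdash \alpha(x/t)$ and by the Deduction Theorem

$$\vdash \forall x\alpha\to\alpha(x/t).$$

By what we have just proved, we have $\vdash \forall x\neg\alpha\to\neg\alpha(x/t)$. But (as can easily be verified)

$$\forall x\neg\alpha\to\neg\alpha(x/t) \vdash_0 \alpha(x/t)\to\neg\forall x\neg\alpha,$$

hence $\vdash \alpha(x/t)\to\neg\forall x\neg\alpha$, i.e.,

$$\vdash \alpha(x/t)\to\exists x\alpha.$$ ∎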

1.12. Theorem. Let $c$ be a constant which occurs neither in $\Phi$ nor in $\alpha$. Given a deduction $D$ of $\alpha(x/c)$ from $\Phi$ we can construct a deduction $D'$ of $\forall x\alpha$ from $\Phi$.
Proof. We choose a variable $y$ which does not occur in $D$. Let $D_1$ be obtained from $D$ by replacing every formula $\beta$ by $\beta(c/y)$.
It is not difficult to verify that if $\beta$ is an axiom then so is $\beta(c/y)$. The hypotheses (formulas of $\Phi$) used in $D$ are left unchanged because $c$ does not occur in $\Phi$. Also, every application of modus ponens is transformed into an application of modus ponens. Since $c$ does not occur in $\alpha$,

$$\alpha(x/c)(c/y)=\alpha(x/y).$$

Thus $D_1$ is a deduction of $\alpha(x/y)$ from $\Phi$. Moreover, if $\Phi_0$ is the subset of $\Phi$ consisting of those hypotheses that are actually used in $D_1$, then $y$ does not occur in $\Phi_0$ and $D_1$ is a deduction of $\alpha(x/y)$ from $\Phi_0$. By Thm. 1.4 we get from $D_1$ a deduction $D_2$ of $\forall y\,[\alpha(x/y)]$ from $\Phi_0$ and hence from $\Phi$.
Clearly, $\forall y\,[\alpha(x/y)]$ is $\forall x\alpha$ or is obtained from $\forall x\alpha$ by alphabetic change. Therefore (either trivially, or as in Thm. 1.10) we obtain a deduction $D_3$ of $\forall x\alpha$ from $\forall y\,[\alpha(x/y)]$. From $D_2$ and $D_3$ we get the required $D'$. ∎

1.13. Remark. By Thm. 1.12 we have the following law of generalization on constants:
If $\Phi \vdash \alpha(x/c)$ and $c$ occurs neither in $\Phi$ nor in $\alpha$, then $\Phi \vdash \forall x\alpha$.

1.14. Problem. Using Thm. 1.11 show that if $x$ does not occur in the term $t$ then $\vdash \exists x(t=x)$.

1.15. Problem. Show that if $x$ does not occur in $t$, then $\vdash \exists!x(t=x)$. (For the definition of $\exists!$ see 2.10.1.)

A set $\Phi$ of formulas is first-order inconsistent if for some $\beta$ both $\Phi \vdash \beta$ and $\Phi \vdash \neg\beta$. Otherwise $\Phi$ is first-order consistent.
We shall usually omit the qualification "first-order" and say just "inconsistent" and "consistent".
If $\Phi$ is an inconsistent set, then some finite subset of $\Phi$ must be inconsistent, because in deducing $\beta$ and $\neg\beta$ from $\Phi$ we can only make use of a finite number of hypotheses (i.e., formulas of $\Phi$). Conversely, if some subset of $\Phi$ is inconsistent, then $\Phi$ is clearly inconsistent. Thus a set $\Phi$ is consistent iff every finite subset of $\Phi$ is consistent. (Cf. Prob. 1.10.10.)
We can now show that the empty set is consistent. This fact is expressed by saying that the first-order predicate calculus is consistent.

1.16. Problem. Let $\alpha^*$ be defined by recursion on $\deg\alpha$ as follows:

$$(Pt_1t_2\ldots t_n)^*=Pv_1v_1\ldots v_1,$$
$$(\neg\beta)^*=\neg(\beta^*),$$
$$(\beta\to\gamma)^*=\beta^*\to\gamma^*,$$
$$(\forall x\beta)^*=\beta^*.$$

Show that if $\alpha$ is a first-order axiom then $\alpha^*$ is either a tautology or the formula $v_1=v_1$. Hence prove the consistency of the first-order predicate calculus by showing that if both $\vdash \beta$ and $\vdash \neg\beta$ then the empty set or $\{v_1=v_1\}$ would be propositionally inconsistent.
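The translation of Prob. 1.16 is easy to experiment with. The Python sketch below is only an illustration under our own assumptions (it is not a solution of the problem): formulas are nested tuples, equality is treated as a binary predicate named "=", and the function names `star`, `atoms`, `value` and `is_tautology` are ours. It computes the translation and tests the result for being a tautology by truth tables, treating each atomic formula as a propositional variable.

```python
from itertools import product


def star(a):
    """Compute the translation of Prob. 1.16 on a tuple-encoded formula."""
    tag = a[0]
    if tag == "pred":   # (P t1 ... tn)* = P v1 ... v1
        return ("pred", a[1], ("v1",) * len(a[2]))
    if tag == "not":
        return ("not", star(a[1]))
    if tag == "imp":
        return ("imp", star(a[1]), star(a[2]))
    if tag == "all":    # (forall x beta)* = beta*
        return star(a[2])
    raise ValueError(tag)


def atoms(a):
    """Atomic parts of a quantifier-free formula."""
    if a[0] == "pred":
        return {a}
    return set().union(*(atoms(b) for b in a[1:]))


def value(a, env):
    """Truth value of a quantifier-free formula under a truth assignment."""
    if a[0] == "pred":
        return env[a]
    if a[0] == "not":
        return not value(a[1], env)
    return (not value(a[1], env)) or value(a[2], env)   # "imp"


def is_tautology(a):
    props = sorted(atoms(a))
    return all(value(a, dict(zip(props, bits)))
               for bits in product([False, True], repeat=len(props)))


# An instance of (Ax.4): (forall x) P x -> P c.
ax4 = ("imp", ("all", "x", ("pred", "P", ("x",))), ("pred", "P", ("c",)))
# An instance of (Ax.5): c = c.
ax5 = ("pred", "=", ("c", "c"))
```

For the (Ax.4) instance the translation is the formula $Pv_1\to Pv_1$, which `is_tautology` accepts; for the (Ax.5) instance the translation is $v_1=v_1$, which is atomic and hence not a tautology.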

1.17. Theorem. An inconsistent set of formulas is unsatisfiable.
Proof. Immediate from Thm. 1.2. ∎

From Thm. 1.17 we get another proof for the consistency of the first-order predicate calculus. For, the empty set of formulas is satisfied by every valuation. However, note that this proof, unlike that of Prob. 1.16, makes use of the highly non-constructive notion of satisfiability (see Remark 2.1.2).

1.18. Theorem. A set $\Phi$ of formulas is inconsistent iff $\Phi \vdash \alpha$ for every formula $\alpha$.
Proof. Similar to that of Thm. 1.10.7. If $\Phi$ is inconsistent, then for some $\beta$ both $\Phi \vdash \beta$ and $\Phi \vdash \neg\beta$. But for every $\alpha$ we have $\{\beta, \neg\beta\} \vdash_0 \alpha$; hence $\Phi \vdash \alpha$. The converse is obvious. ∎

1.19. Theorem. For any $\Phi$ and $\alpha$,
(a) $\Phi, \neg\alpha$ is inconsistent iff $\Phi \vdash \alpha$,
(b) $\Phi, \alpha$ is inconsistent iff $\Phi \vdash \neg\alpha$.
Proof. (a) If $\Phi, \neg\alpha$ is inconsistent, then, by Thm. 1.18, $\Phi, \neg\alpha \vdash \alpha$. Therefore, by the Deduction Theorem, $\Phi \vdash \neg\alpha\to\alpha$. But one can easily verify (e.g., by a propositional tableau) that $\neg\alpha\to\alpha \vdash_0 \alpha$. Hence $\Phi \vdash \alpha$. The converse is obvious.
(b) Similarly, if $\Phi, \alpha$ is inconsistent, one shows that $\Phi \vdash \alpha\to\neg\alpha$. Also, it is readily verified that $\alpha\to\neg\alpha \vdash_0 \neg\alpha$. Hence $\Phi \vdash \neg\alpha$. The converse is again obvious. ∎
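The propositional facts used in the last few proofs can be checked by truth tables. The following Python sketch is an illustration under our own assumptions: formulas are strings (atoms) or nested tuples built with "not" and "imp", and the function name `entails` is ours. It tests semantic consequence by truth tables; by the completeness of the propositional calculus this agrees with propositional deducibility for finite sets of premises.

```python
from itertools import product


def atoms(a):
    """Set of propositional atoms occurring in a formula."""
    if isinstance(a, str):
        return {a}
    return set().union(*(atoms(b) for b in a[1:]))


def value(a, env):
    """Truth value of a formula under the truth assignment env."""
    if isinstance(a, str):
        return env[a]
    if a[0] == "not":
        return not value(a[1], env)
    return (not value(a[1], env)) or value(a[2], env)   # "imp"


def entails(premises, conclusion):
    """True iff every assignment satisfying all premises satisfies conclusion."""
    props = sorted(set().union(atoms(conclusion), *(atoms(p) for p in premises)))
    for bits in product([False, True], repeat=len(props)):
        env = dict(zip(props, bits))
        if all(value(p, env) for p in premises) and not value(conclusion, env):
            return False
    return True


a, b = "a", "b"
checks = [
    entails([("imp", ("not", a), a)], a),                 # used in 1.19(a)
    entails([("imp", a, ("not", a))], ("not", a)),        # used in 1.19(b)
    entails([b, ("not", b)], a),                          # used in 1.18
    entails([("imp", a, b)], ("imp", ("not", b), ("not", a))),  # hint in 1.6
]
```

All four entries of `checks` come out true, whereas a non-consequence such as `entails([("imp", a, b)], ("imp", b, a))` is rejected.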

* § 2. The first-order predicate calculus and tableaux

In this section we shall do for first-order logic what we did for propositional logic in §11 of Ch. 1. As in Ch. 2, we say tableau when we mean first-order tableau.

2.1. Theorem. Let $\Phi$ be a finite set of formulas. Given a deduction of $\alpha$ from $\Phi$, we can confute $\Phi, \neg\alpha$.
Proof. The existence of such a confutation follows at once from Prob. 2.8.5 and Thm. 1.2. However, as in the proof of Thm. 1.11.1, we can get a direct construction of a confutation of $\Phi, \neg\alpha$ from the given deduction of $\alpha$ from $\Phi$. We proceed as in the proof of Thm. 1.11.1, but now using the Elimination Theorem 2.6.8.
We only have to verify that for each axiom $\alpha$ we can confute $\{\neg\alpha\}$. This is easily done. (For (Ax.2)-(Ax.4) see Prob. 2.4.3. For (Ax.8) observe that if we can confute $\{\neg\alpha\}$ then we can also confute $\{\neg\forall x\alpha\}$, because the $\neg\forall$-rule may be applied to $\neg\forall x\alpha$, yielding $\neg\alpha$.) ∎

Observe that from Thm. 2.1 and Prob. 2.8.5 we get a new (and indirect) proof for Thm. 1.2. (Cf. Prob. 1.11.2.)

2.2. Theorem. Every finite inconsistent set of formulas can be confuted.
Proof. Similar to Thm. 1.11.3. ∎

Using Thm. 2.2, we obtain yet another constructive proof for the consistency of the first-order predicate calculus. We only need to verify that the empty set of formulas cannot be confuted. If $\mathcal{L}$ is without equality, this is obvious since no tableau rule can be applied to the empty set. If $\mathcal{L}$ does have equality, then the equality rules can always be applied. But it is easy to see that if $\Phi$ is a finite set of formulas in which $\neg$ does not occur (in particular, if $\Phi$ is empty) then in every tableau for $\Phi$ there must be at least one branch in which $\neg$ does not occur. Hence $\Phi$ cannot be confuted. (See Prob. 2.8.7.)

In order to prove the converse of Thm. 2.2 we shall need:

2.3. Lemma. If $\Phi$ is any set of formulas, each of the following six conditions is sufficient for $\Phi$ to be inconsistent:
(a) For some $\alpha$, $\neg\neg\alpha \in \Phi$ and $\Phi, \alpha$ is inconsistent.
(b) For some $\alpha$ and $\beta$, $\alpha\to\beta \in \Phi$ and both $\Phi, \neg\alpha$ and $\Phi, \beta$ are inconsistent.
(c) For some $\alpha$ and $\beta$, $\neg(\alpha\to\beta) \in \Phi$ and $\Phi, \alpha, \neg\beta$ is inconsistent.
(d) For some $\alpha$, $x$ and $t$, $\forall x\alpha \in \Phi$ and $\Phi, \alpha(x/t)$ is inconsistent.
(e) For some $\alpha$, $x$ and $y$, $\neg\forall x\alpha \in \Phi$, and $y$ is not free in $\Phi$, and $\Phi, \neg\alpha(x/y)$ is inconsistent.
(f) For some axiom $\alpha$, $\Phi, \alpha$ is inconsistent.
Proof. (a) Since $\Phi, \alpha$ is inconsistent, we have $\Phi \vdash \neg\alpha$ by Thm. 1.19. But $\neg\neg\alpha \in \Phi$, hence $\Phi \vdash \neg\neg\alpha$, and $\Phi$ is inconsistent.
(b) Since $\Phi, \neg\alpha$ is inconsistent, we have $\Phi \vdash \alpha$ by Thm. 1.19. But $\alpha\to\beta \in \Phi$, hence $\Phi \vdash \beta$ (using modus ponens). On the other hand, $\Phi \vdash \neg\beta$ because $\Phi, \beta$ is inconsistent. Thus $\Phi$ is inconsistent.
(c) Since $\Phi, \alpha, \neg\beta$ is inconsistent, it follows that $\Phi, \alpha \vdash \beta$. Hence, by the Deduction Thm., $\Phi \vdash \alpha\to\beta$. But $\neg(\alpha\to\beta) \in \Phi$.
(d) Since $\Phi, \alpha(x/t)$ is inconsistent, $\Phi \vdash \neg\alpha(x/t)$. But $\forall x\alpha \in \Phi$, hence, by Thm. 1.11, $\Phi \vdash \alpha(x/t)$.
(e) Since $\Phi, \neg\alpha(x/y)$ is inconsistent, we have $\Phi \vdash \alpha(x/y)$. Because $y$ is not free in $\Phi$, we may generalize on $y$ (see Remark 1.5) and obtain $\Phi \vdash \forall y\,[\alpha(x/y)]$. If $y$ is $x$, then this means $\Phi \vdash \forall x\alpha$, and since $\neg\forall x\alpha \in \Phi$, clearly $\Phi$ is inconsistent. Now suppose $y$ is not $x$. Then by Def. 2.3.9, $\alpha(x/y)=\alpha'(x/y)$, where $\alpha'$ is a certain variant of $\alpha$ such that $y$ is free for $x$ in $\alpha'$. Moreover, since $y$ is not free in $\Phi$, and $\neg\forall x\alpha \in \Phi$, it follows that $y$ is not free in $\alpha$ (because $y \neq x$) and hence $y$ cannot be free in $\alpha'$. Thus $\forall y\,[\alpha'(x/y)]$ is obtained from $\forall x\alpha'$ by alphabetic change and we have

$$\forall x\alpha \sim \forall x\alpha' \sim \forall y\,[\alpha'(x/y)] = \forall y\,[\alpha(x/y)],$$

so, by Thm. 1.10,

$$\forall y\,[\alpha(x/y)] \vdash \forall x\alpha.$$

Thus again $\Phi \vdash \forall x\alpha$, and $\Phi$ is inconsistent.
(f) This is obvious, because if $\alpha$ is an axiom then whatever can be deduced from $\Phi, \alpha$ can also be deduced from $\Phi$. ∎

We can now prove the converse of Thm. 2.2:

2.4. Theorem. If $\Phi$ is a finite set of formulas having a confutation, then $\Phi$ is inconsistent.
Proof. Similar to the proof of Thm. 1.11.4. We proceed by induction on the depth $d$ of a given confutation of $\Phi$.
If $d=0$, then $\Phi$ must contain both $\alpha$ and $\neg\alpha$ for some (atomic) $\alpha$ and hence $\Phi$ is inconsistent.
If $d>0$, we distinguish cases according to the rule that gave rise to level 1 of the given confutation. These are dealt with as in the proof of Thm. 1.11.4. In the cases corresponding to the three propositional rules we use parts (a), (b) and (c) of Lemma 2.3. For the quantifier rules we use parts (d) and (e) of that lemma. Finally for the equality rules we use part (f) of the lemma. ∎

2.5. Problem. Prove the converse of Thm. 2.1.

§ 3. Completeness of the first-order predicate calculus

We begin this section by proving the weak completeness of the first-order predicate calculus. Then we go on to prove the strong completeness of that calculus. The proof of the former result (but not that of the latter) will depend on §2. The reader who has skipped §2 should now proceed directly to Def. 3.4. Remark 3.2 should in this case be read after Thm. 3.13.

3.1. Theorem. Every finite consistent set of formulas is satisfied by some valuation with a countable universe.
Proof. If $\Phi$ is finite and consistent, then by Thm. 2.4 $\Phi$ cannot be confuted. Hence by Cor. 2.8.3 $\Phi$ is satisfied by a valuation with a countable universe. ∎

3.2. Remark. We can now see that consistency is invariant with respect to language. For, a set $\Phi$ is consistent iff every finite subset of $\Phi$ is consistent; and by Thms. 1.17 and 3.1 this is the same as every finite subset of $\Phi$ being satisfiable. But we know that satisfiability is invariant with respect to language (see Remark 2.2.7). It follows that deducibility too is invariant with respect to language, for $\Phi \vdash \alpha$ iff $\Phi, \neg\alpha$ is inconsistent (see Thm. 1.19).

3.3. Weak Completeness Theorem. If $\Phi$ is finite and $\Phi \models \alpha$, then $\Phi \vdash \alpha$. In particular, if $\models \alpha$ then $\vdash \alpha$.
Proof. If $\Phi \models \alpha$, then $\Phi, \neg\alpha$ is unsatisfiable. Hence, by Thm. 3.1, $\Phi, \neg\alpha$ is inconsistent, so, by Thm. 1.19, $\Phi \vdash \alpha$. ∎

Thm. 3.3 is due to Gödel (see §6). Both this theorem and Thm. 3.1 are known as Gödel's Completeness Theorem.
The rest of this section is mainly devoted to strengthening Thms. 3.1 and 3.3. Whereas our proof of Thm. 3.1 makes an excursion via tableaux, the stronger version will be proved directly.

3.4. Definition. A set $\Phi$ of $\mathcal{L}$-formulas is maximal first-order consistent in $\mathcal{L}$ if $\Phi$ is first-order consistent but is not a proper subset of any first-order consistent set of $\mathcal{L}$-formulas.
As usual, we shall omit the qualification "first-order". When only the language $\mathcal{L}$ is being considered, we may also omit the qualification "in $\mathcal{L}$", but it must be stressed that maximal consistency is not invariant with respect to language.

3.5. Theorem. A set $\Phi$ is maximal consistent iff both of the following conditions hold:
(a) $\Phi$ is consistent.
(b) For every formula $\alpha$, $\alpha \in \Phi$ or $\neg\alpha \in \Phi$.
Proof. Similar to Thm. 1.13.1. ∎

3.6. Theorem. If $\Phi$ is maximal consistent and $\Phi \vdash \alpha$, then $\alpha \in \Phi$.
Proof. Otherwise we would have $\neg\alpha \in \Phi$, by Thm. 3.5. Hence $\Phi \vdash \neg\alpha$, making $\Phi$ inconsistent. ∎

3.7. Theorem. Let $\Psi$ be maximal consistent in $\mathcal{L}$. Then conditions (1)-(5) of Def. 2.7.1 hold. If $\mathcal{L}$ is a language with equality, then conditions (7)-(9) of Def. 2.7.1 hold as well.
Proof. (1) We cannot have both $\varphi \in \Psi$ and $\neg\varphi \in \Psi$ for atomic (or indeed for any) formula $\varphi$, because $\Psi$ is consistent.
(2) If $\neg\neg\alpha \in \Psi$, then $\alpha \in \Psi$ by Thm. 3.6, because $\neg\neg\alpha \vdash_0 \alpha$.
(3) Suppose $\alpha\to\beta \in \Psi$. Then if $\neg\alpha \notin \Psi$ we have $\alpha \in \Psi$ by Thm. 3.5. Therefore (using modus ponens) $\Psi \vdash \beta$ and hence $\beta \in \Psi$ by Thm. 3.6. Thus $\neg\alpha \in \Psi$ or $\beta \in \Psi$.
(4) If $\neg(\alpha\to\beta) \in \Psi$, then $\Psi \vdash \alpha$ and $\Psi \vdash \neg\beta$ because $\neg(\alpha\to\beta) \vdash_0 \alpha$ and $\neg(\alpha\to\beta) \vdash_0 \neg\beta$. Therefore $\alpha \in \Psi$ and $\neg\beta \in \Psi$ by Thm. 3.6.
(5) If $\forall x\alpha \in \Psi$, then for every $\mathcal{L}$-term $t$ we have $\Psi \vdash \alpha(x/t)$ by Thm. 1.11. Hence $\alpha(x/t) \in \Psi$ by Thm. 3.6.
(7)-(9) Every axiom of $\mathcal{L}$, and in particular every instance of (Ax.5), (Ax.6) and (Ax.7), is provable, and hence deducible from $\Psi$. Thus by Thm. 3.6 every axiom belongs to $\Psi$. ∎

The reader will have noticed that condition (6) of Def. 2.7.1 has been omitted in Thm. 3.7. Indeed, this condition does not necessarily hold for a maximal consistent set.
Example. Let $\mathcal{L}$ be the language with equality but without any extralogical symbols. Let $U$ be a set having two members $u$ and $v$. Let $\sigma$ be the valuation with $U$ as universe, such that $x^\sigma=u$ for every variable $x$. Define $\Psi$ as the set of all $\mathcal{L}$-formulas $\alpha$ such that $\sigma \models \alpha$. Then $\Psi$ is consistent by Thm. 1.17 because $\sigma \models \Psi$. Also, for each $\alpha$, either $\alpha \in \Psi$ or $\neg\alpha \in \Psi$, so, by Thm. 3.5, $\Psi$ is maximal consistent.
Now consider the formula $\neg\forall x(x=y)$, where $x$ and $y$ are distinct variables. It is easy to see that this formula is satisfied by $\sigma$. (Indeed, $\sigma(x/v)$ does not satisfy $x=y$, hence $\sigma$ does not satisfy $\forall x(x=y)$.) On the other hand, the only terms in $\mathcal{L}$ are the variables and clearly $\sigma$ does not satisfy $z \neq y$ for any variable $z$.
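The example can be checked by direct computation on the two-element universe. The following Python sketch is an illustration under our own assumptions: the names `U`, `sigma`, `neg_forall` and `witnesses` are ours, and only a hypothetical finite sample of variables is tested, since every variable receives the same value under the valuation.

```python
# Universe with two members, written here as the strings "u" and "v".
U = {"u", "v"}


def sigma(variable):
    """The valuation of the example: every variable gets the value u."""
    return "u"


def forall_x_eq_y(value_of_y):
    """Does every member a of U, taken as the value of x, satisfy x = y?"""
    return all(a == value_of_y for a in U)


# sigma satisfies the negation of (forall x)(x = y) ...
neg_forall = not forall_x_eq_y(sigma("y"))

# ... but no variable z serves as a witness, i.e. satisfies "z is not y".
sample_variables = ["x", "y", "z", "w"]   # hypothetical finite sample
witnesses = [z for z in sample_variables if sigma(z) != sigma("y")]
```

Here `neg_forall` is true while `witnesses` is empty, which is the failure of condition (6) of Def. 2.7.1 described in the example.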

3.8. Definition. $\Psi$ is a Henkin set in $\mathcal{L}$ if $\Psi$ is maximal consistent in $\mathcal{L}$ and whenever $\neg\forall x\alpha \in \Psi$ then for some $\mathcal{L}$-term $t$ also $\neg\alpha(x/t) \in \Psi$.
Thus a Henkin set is a maximal consistent set for which condition (6) of Def. 2.7.1 does hold. It follows at once from Thm. 3.7 that every Henkin set in $\mathcal{L}$ is a Hintikka set in $\mathcal{L}$.

3.9. Lemma. If $\Psi$ is a Henkin set in $\mathcal{L}$, then $\Psi$ is satisfied by some valuation whose universe has cardinality not greater than the cardinality of the set of all $\mathcal{L}$-terms.
Proof. Immediate from Thm. 2.7.5. ∎

3.10. Definition. By the cardinality of $\mathcal{L}$ (briefly, $\|\mathcal{L}\|$) we mean the cardinality of the set of all $\mathcal{L}$-symbols.

For the rest of this section we put $\|\mathcal{L}\|=\lambda$.

3.11. Theorem. The set of all $\mathcal{L}$-terms has cardinality $\leq\lambda$ and the set of all $\mathcal{L}$-formulas has cardinality $\lambda$.
Proof. Each term or formula is an $\mathcal{L}$-string, i.e., a finite sequence of $\mathcal{L}$-symbols. Since $\lambda$ is clearly infinite (there are denumerably many variables!) it follows that the set of $\mathcal{L}$-strings has cardinality $\lambda$. Thus both the set of $\mathcal{L}$-terms and the set of $\mathcal{L}$-formulas have cardinality $\leq\lambda$.
It remains to show that the set of $\mathcal{L}$-formulas has cardinality $\geq\lambda$. Since $\lambda$ is infinite, at least one of the following sets must have cardinality $\lambda$:
(a) The set of all variables and constants.
(b) The set of all function symbols other than constants.
(c) The set of all predicate symbols.
In each of these cases it is easy to see that the set of all atomic formulas of the form $Pt_1\ldots t_n$ has cardinality $\lambda$. ∎

We now extend $\mathcal{L}$ to $\mathcal{L}'$ by adding new individual constants:

$$\{c_\xi : \xi<\lambda\}.$$

(Below we shall continue to refer to these as the new constants.) It is clear that $\|\mathcal{L}'\|=\|\mathcal{L}\|=\lambda$. Also, the set of $\mathcal{L}'$-terms has cardinality $\lambda$.

3.12. Theorem. Let $\Phi$ be a consistent set of $\mathcal{L}$-formulas. Then there exists a set $\Psi$ such that $\Phi \subseteq \Psi$ and $\Psi$ is a Henkin set in $\mathcal{L}'$.
Proof. By Thm. 3.11, the set of all $\mathcal{L}'$-formulas has cardinality $\lambda$. We fix a well-ordering

$$\{\varphi_\xi : \xi<\lambda\}$$

of this set. By transfinite recursion, we define for every $\xi\leq\lambda$ a set $\Phi_\xi$ of $\mathcal{L}'$-formulas such that
(1) $\Phi_\eta \subseteq \Phi_\xi$ for all $\eta<\xi$,
(2) $\Phi_\xi$ is consistent,
(3) Only finitely many, or at most $|\xi|$, new constants occur in $\Phi_\xi$.
We put $\Phi_0=\Phi$. Then for $\xi=0$ condition (1) holds vacuously, (3) holds because $\Phi$ is in $\mathcal{L}$ and has no new constants, and (2) holds by assumption.¹
Now suppose that $0<\xi\leq\lambda$ and that for all $\zeta<\xi$ the $\Phi_\zeta$ have been defined in accordance with (1), (2) and (3). If $\xi$ is a limit ordinal (in particular if $\xi=\lambda$), we put $\Phi_\xi=\bigcup\{\Phi_\zeta : \zeta<\xi\}$. Then by definition $\Phi_\zeta \subseteq \Phi_\xi$ for all $\zeta<\xi$. Also, $\Phi_\xi$ is consistent because every finite subset of $\Phi_\xi$ is included in some $\Phi_\zeta$, where $\zeta<\xi$, and $\Phi_\zeta$ is consistent by (2). Finally, since (3) holds for all $\zeta<\xi$, it is easy to see that (3) holds for $\xi$ as well.
If $\xi$ is a successor ordinal, say $\xi=\zeta+1$, we distinguish three cases.
Case 1. If $\Phi_\zeta \cup \{\varphi_\zeta\}$ is inconsistent, we put $\Phi_{\zeta+1}=\Phi_\zeta$. Then (1), (2) and (3) clearly hold for $\zeta+1$.
Case 2. If $\Phi_\zeta \cup \{\varphi_\zeta\}$ is consistent and $\varphi_\zeta$ is not of the form $\neg\forall x\alpha$, we put $\Phi_{\zeta+1}=\Phi_\zeta \cup \{\varphi_\zeta\}$. Again, (1), (2) and (3) clearly hold for $\zeta+1$.
Case 3. If $\Phi_\zeta \cup \{\varphi_\zeta\}$ is consistent and $\varphi_\zeta$ has the form $\neg\forall x\alpha$, then by (3) there exists a new constant $c$ which does not occur in $\Phi_\zeta$ nor in $\varphi_\zeta$. We take such $c$ and put $\Phi_{\zeta+1}=\Phi_\zeta \cup \{\varphi_\zeta, \neg\alpha(x/c)\}$. Then (1) and (3) clearly hold for $\zeta+1$. To see that (2) holds as well, suppose $\Phi_{\zeta+1}$ were inconsistent. Then by Thm. 1.19 we have $\Phi_\zeta, \varphi_\zeta \vdash \alpha(x/c)$ and hence $\Phi_\zeta, \varphi_\zeta \vdash \forall x\alpha$ by Rem. 1.13. Since $\varphi_\zeta=\neg\forall x\alpha$, this contradicts our assumption that $\Phi_\zeta \cup \{\varphi_\zeta\}$ is consistent.
We put $\Psi=\Phi_\lambda$. Then by (1) we have $\Phi=\Phi_0 \subseteq \Psi$, and, by (2), $\Psi$ is consistent.
To see that $\Psi$ is maximal consistent in $\mathcal{L}'$, let $\alpha$ be any $\mathcal{L}'$-formula not belonging to $\Psi$. Then $\alpha=\varphi_\xi$ for some $\xi<\lambda$. Since $\alpha \notin \Psi$, it follows that $\Phi_{\xi+1}$ must have been defined by Case 1, for in the other two cases we have $\alpha=\varphi_\xi \in \Phi_{\xi+1} \subseteq \Psi$. Thus $\Phi_\xi \cup \{\alpha\}$ is inconsistent, and $\Psi \cup \{\alpha\}$ must also be inconsistent.
Finally, if $\neg\forall x\alpha \in \Psi$ then $\neg\forall x\alpha=\varphi_\xi$ for some $\xi<\lambda$. $\Phi_\xi \cup \{\neg\forall x\alpha\}$ is a subset of $\Psi$, and is therefore consistent. It follows that $\Phi_{\xi+1}$ must have been defined by Case 3 and hence for some $c$ we have $\neg\alpha(x/c) \in \Phi_{\xi+1} \subseteq \Psi$. Thus $\Psi$ has all the required properties. ∎

¹ $\Phi$ was assumed consistent in $\mathcal{L}$. To see that it is also consistent in $\mathcal{L}'$ we can use Remark 3.2 or (in order to make the present proof independent of the method of tableaux) we can argue directly as follows. If $\Phi$ were inconsistent in $\mathcal{L}'$ then for every $\mathcal{L}$-formula $\gamma$ we would have a deduction of $\gamma$ from $\Phi$ in $\mathcal{L}'$. This deduction may contain new constants, but these may be replaced (as in the proof of Thm. 1.12) by variables which do not already occur in the deduction. Thus we would obtain a deduction of $\gamma$ from $\Phi$ in $\mathcal{L}$, so $\Phi$ would be inconsistent in $\mathcal{L}$.

3.13. Theorem. If $\Phi$ is a consistent set of $\mathcal{L}$-formulas, then $\Phi$ is satisfied by some valuation whose universe has cardinality $\leq\|\mathcal{L}\|$.
Proof. Immediate from Lemma 3.9, Thm. 3.11, and Thm. 3.12. ∎

3.14. Strong Completeness Theorem. If $\Phi \models \alpha$, then $\Phi \vdash \alpha$.
Proof. If $\Phi \models \alpha$ then $\Phi, \neg\alpha$ is unsatisfiable. Hence, by Thm. 3.13, $\Phi, \neg\alpha$ is inconsistent. Then, by Thm. 1.19, $\Phi \vdash \alpha$. ∎

Both Thm. 3.14 and Thm. 3.13 are known as Henkin's Completeness Theorem.
We end this section with two important semantic consequences of Thm. 3.13.

3.15. Theorem. If $\Phi$ is a satisfiable set of $\mathcal{L}$-formulas, then $\Phi$ is satisfied by some valuation whose universe has cardinality $\leq\|\mathcal{L}\|$.
Proof. Immediate from Thm. 1.17 and Thm. 3.13. ∎

3.16. Compactness Theorem. If every finite subset of $\Phi$ is satisfiable, then $\Phi$ is satisfiable.
Proof. By Thm. 1.17, every finite subset of $\Phi$ is consistent. Hence $\Phi$ is consistent. It follows from Thm. 3.13 that $\Phi$ is satisfiable. ∎

Note that both Theorem 3.15 and the Compactness Theorem are purely semantic and do not directly involve any notion about the predicate calculus (such as consistency, deducibility, etc.). It is therefore natural to ask whether they can be proved without making such a long excursion through the predicate calculus. This is indeed the case, and in Ch. 5 we shall prove these theorems by purely semantic means.

*§ 4. First-order logic based on ∃

In some parts of this book it will be convenient to regard ∃ as primitive,
and define ∀x as an abbreviation for ¬∃x¬ (cf. Prob. 2.1.4). All the
material covered in Ch. 2 can easily be adapted to this setting. Only slight
(and obvious) modifications are needed. For example, in the tableau
method we have to take the ¬∃ and ∃ rules instead of the ∀ and ¬∀
rules.
In Def. 2.7.1, conditions (5) and (6) should be replaced by:
(5′) If ¬∃xα ∈ Ψ, then ¬α(x/t) ∈ Ψ for every ℒ-term t.
(6′) If ∃xα ∈ Ψ, then α(x/t) ∈ Ψ for some ℒ-term t.
In the predicate calculus we may add a new axiom scheme, e.g.,

∀x(α→β)→∃xα→∃xβ,

and modify (Ax.8) to cover generalizations of these new axioms as well.


The material of the present chapter up to Lemma 1.9 requires no further
change. In the last part of the proof of Lemma 1.9 we can use the new axiom
scheme to show that ∃xγ and ∃xγ′ are provably equivalent. The proof of
Thm. 1.10 can be modified in a similar way. The rest of the material of
the present chapter requires only slight and obvious changes.

§ 5. What have we achieved?

One of the main tasks of logic is to clarify and characterize the notions
of logical consequence and logical validity. Let us now look back and
see to what extent we have accomplished this task. Of course, since we
have confined ourselves to first-order languages, we can consider logical
consequence and logical validity only for such languages.
In §1 of Ch. 2 we explicated these two notions by reducing them to the
notion of satisfaction. This, in turn, was characterized in the BSD. Now,
the BSD is direct and natural because it merely spells out the intended
meaning of the logical symbols. But it is highly non-constructive.
In the course of Ch. 2 and the present chapter, we have been able to
discover more constructive (and more informative) characterizations of
satisfiability, logical consequence and logical validity.
The method of tableaux can be used for detecting unsatisfiability. Given
a finite set Φ of formulas, we can systematically search for a confutation
of Φ (e.g., as described in §8 of Ch. 2). While this does not constitute
an effective procedure for deciding whether or not Φ is satisfiable, we know
at least that if Φ is unsatisfiable this fact will eventually be discovered.
Since α is logically valid iff ¬α is unsatisfiable, we also have a method
for detecting logical validity.
Another method of detecting logical validity is provided by the first-order
predicate calculus. By the Completeness Theorem, ⊨α iff ⊢α.
So, if we want to know whether α is logically valid, we can search
systematically for a first-order proof of α. If such a proof exists, we shall sooner
or later discover it. (On the other hand, if no such proof exists, this fact
may never be revealed by our procedure.)
More generally, suppose Φ is a countable set of hypotheses and we are
given an enumeration of Φ,

Φ = {φ_i : i = 0,1,2,...}.

Let α be any formula. Then Φ ⊨ α iff there is a deduction of α from Φ
in the language ℒ_{Φ,α} (which is the “poorest” first-order language in which
α and all members of Φ are formulas).
The language ℒ = ℒ_{Φ,α} is denumerable (i.e., ‖ℒ‖ = ℵ₀), so we can
effectively enumerate one by one all deductions from Φ in ℒ. (For example,
we can first fix some effective enumeration of all axioms in ℒ. Now, let
𝒟_n be the set of all deductions whose length is at most n and in which not
more than the first n axioms and the first n hypotheses are used. Then
𝒟_n is finite and we can fix some ordering in 𝒟_n. The different 𝒟_n can be
arranged in order of increasing n.) If Φ ⊨ α, then sooner or later we shall
discover a deduction of α from Φ. (On the other hand, if α is not a logical
consequence of Φ, we may never discover this.)
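
The staging just described (finite stages, arranged in order of increasing n) can be imitated in a few lines of code. What follows is only a toy sketch and not the predicate calculus itself: the names stage, search and naturals, and the acceptance test, are hypothetical and chosen for illustration, and arbitrary finite sequences of natural numbers stand in for deductions. It shows the bookkeeping only: each stage is finite, every finite sequence turns up at some stage, and an unsuccessful search need never halt.

```python
from itertools import count, islice, product

def naturals():
    """An effective enumeration 0, 1, 2, ... standing in for axioms and hypotheses."""
    return count(0)

def stage(n, enumerate_items):
    """Stage n: all sequences of length 1..n built from the first n items (a finite list)."""
    first_n = list(islice(enumerate_items(), n))
    return [seq for k in range(1, n + 1) for seq in product(first_n, repeat=k)]

def search(enumerate_items, accept, max_stage=None):
    """Go through stages 1, 2, 3, ... and return (n, seq) for the first accepted sequence.

    With max_stage=None the loop is unbounded: if nothing is ever accepted,
    the search simply does not terminate.
    """
    stages = count(1) if max_stage is None else range(1, max_stage + 1)
    for n in stages:
        for seq in stage(n, enumerate_items):
            if accept(seq):
                return n, seq
    return None

# Toy "target" (an assumption made for the demo): a pair of naturals with sum 5.
found = search(naturals, lambda seq: len(seq) == 2 and sum(seq) == 5)
print(found)
```

The optional max_stage bound is there only so that the sketch can be run safely; the procedure in the text corresponds to leaving it unset.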
The procedures described above may not be efficient. However, the
important fact is that they show in principle that, e.g., every logical truth
can eventually be discovered.
Let us point out another consequence of the Completeness Theorem
(or, more precisely, of Thm. 3.15). Initially, the notion of satisfiability
was defined with reference to valuations whose universes may be arbitrary
classes. From Thm. 3.15 it follows, however, that it is enough to consider
universes which are sets. Moreover, only sets whose cardinality is ≤ ‖ℒ‖
need be considered. In particular, if ℒ is denumerable, then a set Φ of
ℒ-formulas is satisfiable iff it is satisfied by some valuation with countable
universe.
The results mentioned in this section constitute considerable achievements
of the formal method (and the formal approach to mathematics). However,
in Ch. 7 we shall see that — just because of these positive results — the
formal method has some very crucial inherent limitations.

§ 6. Historical and bibliographical remarks

The predicate calculus is essentially due to Frege [1879]. (The version used
in this book differs from Frege’s in notation as well as in the choice of
axioms and rules of inference.) Frege allowed quantification not only over
individual variables but also over predicate symbols, in a way which made
his system inconsistent. The flaw is corrected in Whitehead and
Russell [1910]. Gödel [1930] proved the Weak Completeness Theorem
for the first-order predicate calculus (in the version of Whitehead and
Russell). In the same paper he also proved the Strong Completeness
Theorem and the Compactness Theorem for the case in which the set
Φ is denumerable. The full Strong Completeness Theorem is due to
Henkin [1949] and the full Compactness Theorem is due to Mal’cev [1936].
Theorem 3.15 was first proved by Löwenheim [1915] for single formulas,
but his proof had several gaps. Skolem [1920] gave the first complete
proof and generalized the theorem to denumerable sets of formulas.
CHAPTER 4

BOOLEAN ALGEBRAS

For any first-order language ℒ, we may regard the logical symbols
∧, ∨, ¬ as operations on the set of all formulas of ℒ. If we identify
provably equivalent formulas, then these operations define an algebraic
structure B whose properties are determined by ℒ. Accordingly, a knowledge
of the properties of B will yield information about ℒ. B is an
example of a special kind of algebraic structure called a Boolean algebra:
the general theory of such structures forms the theme of this chapter.
Later on (Ch. 5, §§3, 5, 6) we shall employ some of the results we obtain
in proving important theorems in model theory. Some of the topics treated
in this chapter (those to be found in sections marked with a *) will not
be used in the sequel; they have been included to make the chapter a
reasonably broad introduction to the theory.
The content of this chapter is entirely independent of that of earlier
chapters.

§ 1. Lattices

A lattice is a (non-empty) partially ordered set¹ ⟨L,≤⟩ in which each pair
of elements x,y has a supremum, denoted by x∨y, and an infimum,
denoted by x∧y. x∨y and x∧y are often called, respectively, the join
and meet of x and y. Clearly, we have, for any elements x,y of a lattice,

x≤y ⇔ x∧y=x ⇔ x∨y=y.

1.1. Problem. Show that the following identities hold in any lattice:

x∨y = y∨x, x∧y = y∧x;

x∨(y∨z) = (x∨y)∨z, x∧(y∧z) = (x∧y)∧z;

(x∧y)∨y = y, (x∨y)∧y = y.

¹ When discussing a partially ordered set ⟨X,≤⟩ we shall frequently commit the pardonable
sin of identifying it with its underlying set X.

1.2. Examples. (i) Any totally ordered set ⟨X,≤⟩ is a lattice; clearly
in this case we have

x∧y = min(x,y), x∨y = max(x,y).

(ii) For any set Y, the power set PY of Y is a lattice under the partial
ordering of set inclusion. In this case we have, for any A,B ∈ PY,

A∧B = A∩B, A∨B = A∪B.

(iii) The families of open sets and closed sets, respectively, of a topological
space each form a lattice under the partial ordering of set inclusion. In
these lattices joins and meets are the same as in example (ii).
(iv) Define a partial ordering | on the set ω of natural numbers by
stipulating that m|n iff m is a divisor of n. Then ⟨ω,|⟩ is a lattice in which

m∧n = g.c.d.(m,n), m∨n = l.c.m.(m,n).

(v) Let C be the set of complex numbers; define a partial ordering ≤ on
C by setting a+ib ≤ c+id (for real a, b, c, d) iff a≤c and b≤d. Then
⟨C,≤⟩ is a lattice. What are the join and meet of a pair of elements of C?
(vi) Let V be a vector space over a field F. Then the set 𝒮V of all vector
subspaces of V is a lattice under the partial ordering of inclusion. In this
lattice the join of any two elements A and B is the subspace of V generated
by A∪B, while the meet of A and B is just A∩B.
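
The divisibility example can be checked mechanically on an initial segment of the natural numbers. The sketch below is an illustration under stated assumptions rather than a proof: the helper names divides, lcm, is_meet and is_join are our own, 0 is counted as a natural number (so that every m divides 0 and 0 is the greatest element), and only the numbers below 40 are inspected.

```python
from math import gcd

def divides(m, n):
    """m | n: n is a multiple of m. Every m divides 0, and 0 divides only 0."""
    return n == 0 if m == 0 else n % m == 0

def lcm(m, n):
    """Least common multiple, with lcm(m, 0) = 0."""
    return 0 if m == 0 or n == 0 else m * n // gcd(m, n)

def is_meet(g, m, n, universe):
    """g is the greatest lower bound of m and n with respect to divisibility."""
    lower = [d for d in universe if divides(d, m) and divides(d, n)]
    return g in lower and all(divides(d, g) for d in lower)

def is_join(l, m, n, universe):
    """l is an upper bound of m and n that divides every upper bound in universe."""
    upper = [c for c in universe if divides(m, c) and divides(n, c)]
    return divides(m, l) and divides(n, l) and all(divides(l, c) for c in upper)

SAMPLE = range(0, 40)
ok = all(is_meet(gcd(m, n), m, n, SAMPLE) and is_join(lcm(m, n), m, n, SAMPLE)
         for m in SAMPLE for n in SAMPLE)
print(ok)
```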
A mapping h from a lattice L to a lattice L′ is called a (lattice) homomorphism
if for all x,y∈L we have

h(x∧y) = h(x)∧h(y), h(x∨y) = h(x)∨h(y).

If h is one-one and onto, then h is called an isomorphism of L onto L′, and L′
is said to be isomorphic to L. A sublattice of a lattice L is a subset L′ of
L which is closed under the meet and join operations of L, i.e., if x,y∈L′,
then x∧y∈L′ and x∨y∈L′.

1.3. Problem. Show that, if h is a homomorphism of a lattice L into
a lattice L′, then h[L] is a sublattice of L′ (called the image of L under h).

A lattice L is said to be distributive if the following conditions are
satisfied by all x,y,z∈L:

x∧(y∨z) = (x∧y)∨(x∧z),

x∨(y∧z) = (x∨y)∧(x∨z).

It is a curious fact that each of the above conditions implies the other.
For example, suppose that the first condition holds. Then we have,
using Prob. 1.1,

(x∨y)∧(x∨z) = [x∧(x∨z)] ∨ [y∧(x∨z)]

= x ∨ [(y∧x)∨(y∧z)]

= [x∨(y∧x)] ∨ (y∧z)

= x∨(y∧z).

The converse is proved similarly.


It is easy to show by induction that any (non-empty) finite subset
X = {x_1,...,x_n} of a lattice has both a supremum and an infimum. We
denote the supremum (or join) of X by x_1∨...∨x_n or ⋁_{i=1}^{n} x_i and its infimum
(or meet) by x_1∧...∧x_n or ⋀_{i=1}^{n} x_i.
Notice that an infinite subset of a lattice need not have an infimum
or a supremum. For instance, in the (totally ordered) lattice Z of integers,
the set of even integers has neither. If, however, every subset¹ of a lattice
L does have an infimum and a supremum, then L is said to be complete.
Another curious fact about lattices is that, for a lattice to be complete,
it suffices for each subset to have a supremum (or for each subset to have
an infimum). We accord this fact the status of a theorem.

1.4. Theorem. Let L be a partially ordered set in which each subset has
a supremum (or in which each subset has an infimum). Then L is a complete
lattice.
Proof. Suppose that each subset of L has a supremum. Then to show
that L is a complete lattice we have to prove that each subset of L has
an infimum. So let X be a subset of L, and let Y be the set of lower bounds
of X in L, i.e.

Y = {z∈L: ∀x∈X [z≤x]}.

Then Y has a supremum y, and it is not hard to see that y is the infimum
of X. A similar argument goes through in the case in which each subset
of L has an infimum. ∎
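
The construction in this proof is easy to try out on a finite example. In the following sketch (our own illustration, with hypothetical helper names supremum and infimum_via_suprema) the partially ordered set is the set of divisors of 30 under divisibility, an assumption made purely for the demonstration. The infimum of each subset is obtained as the supremum of its lower bounds and then compared with the greatest common divisor.

```python
from functools import reduce
from itertools import combinations
from math import gcd

def supremum(elems, subset, leq):
    """Least upper bound of subset in the finite poset (elems, leq), or None."""
    uppers = [u for u in elems if all(leq(x, u) for x in subset)]
    least = [u for u in uppers if all(leq(u, v) for v in uppers)]
    return least[0] if least else None

def infimum_via_suprema(elems, subset, leq):
    """Infimum of subset, computed as the supremum of its set of lower bounds."""
    lower_bounds = [z for z in elems if all(leq(z, x) for x in subset)]
    return supremum(elems, lower_bounds, leq)

DIVISORS = [d for d in range(1, 31) if 30 % d == 0]

def leq(m, n):
    """m is below n when m divides n."""
    return n % m == 0

# Compare with the g.c.d. on every non-empty subset of the divisors of 30.
all_agree = all(
    infimum_via_suprema(DIVISORS, s, leq) == reduce(gcd, s)
    for r in range(1, len(DIVISORS) + 1)
    for s in combinations(DIVISORS, r)
)
print(all_agree, infimum_via_suprema(DIVISORS, [], leq))
```

For the empty subset every element is a lower bound, so the computed infimum is the greatest element, here 30.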

If L is a lattice (not necessarily complete) and X⊆L, we shall denote
the supremum or join of X (if it has one) by ⋁X and the infimum or
meet of X (if it has one) by ⋀X. If we suppose that X is indexed as
{x_i: i∈I}, then we sometimes write the join and meet of X as ⋁_{i∈I} x_i and
⋀_{i∈I} x_i, respectively.

¹ This includes the empty subset ∅ of L. Recall (Ch. 0) that the supremum of ∅ (if it
has one) must be the least element of L, and that the infimum of ∅ (if it has one) must
be the greatest element of L.

1.5. Examples. (i) The power set lattice PY of any set Y is a complete
lattice, in which joins and meets coincide with set-theoretic unions and
intersections respectively.
(ii) The lattices 𝒪 and 𝒞 of open sets and closed sets of a topological
space X are both complete. If {A_i: i∈I} ⊆ 𝒪, then in the lattice 𝒪,

⋁_{i∈I} A_i = ⋃_{i∈I} A_i, ⋀_{i∈I} A_i = (⋂_{i∈I} A_i)°.

If {B_i: i∈I} ⊆ 𝒞, then, in the lattice 𝒞,

⋀_{i∈I} B_i = ⋂_{i∈I} B_i, ⋁_{i∈I} B_i = (⋃_{i∈I} B_i)⁻.

(Here A° and A⁻ denote the interior and closure, respectively, of a subset
A of a topological space.)

Let L be a lattice. An element x of L is said to be the least element of
L if x≤y for all y∈L, and the greatest element of L if y≤x for all y∈L.
Clearly the least and greatest elements of a lattice, if they exist, must
be unique. Accordingly we agree always to write 0 for the least and 1
for the greatest element of a given lattice, assuming they exist.
A lattice L is said to be complemented if it has a least and a greatest element
and for each x∈L there is y∈L such that

x∨y = 1, x∧y = 0.

Such an element y is called a complement of x. In general (cf. Prob. 1.7),
an element of a lattice may have more than one complement, or none
whatsoever. But the situation in a distributive lattice is much more pleasant
as we now show.

1.6. Theorem. An element of a distributive lattice has at most one
complement.
Proof. Let L be a distributive lattice, let x∈L and suppose that y, y′
are both complements of x. Then

x∨y = 1, x∨y′ = 1,

x∧y = 0, x∧y′ = 0,

and we have

y = y∨0 = y∨(x∧y′) = (y∨x)∧(y∨y′)
= 1∧(y∨y′)
= y∨y′.

Similarly, y′ = y∨y′ so that y = y′. ∎
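
Distributivity cannot be dropped from this theorem. As a hedged illustration, the sketch below builds the five-element "diamond" lattice, with least element 0, greatest element 1 and three pairwise incomparable elements a, b, c. This lattice is not taken from the text; it is an assumption introduced for the example, and the function names are our own. In it the element a has two complements, and the first distributive law fails.

```python
ELEMS = ["0", "a", "b", "c", "1"]

def leq(x, y):
    """The diamond ordering: 0 is below everything, 1 is above everything."""
    return x == y or x == "0" or y == "1"

def join(x, y):
    """Least upper bound, found by direct search."""
    uppers = [u for u in ELEMS if leq(x, u) and leq(y, u)]
    return next(u for u in uppers if all(leq(u, v) for v in uppers))

def meet(x, y):
    """Greatest lower bound, found by direct search."""
    lowers = [l for l in ELEMS if leq(l, x) and leq(l, y)]
    return next(l for l in lowers if all(leq(v, l) for v in lowers))

def complements(x):
    """All y with x v y = 1 and x ^ y = 0."""
    return [y for y in ELEMS if join(x, y) == "1" and meet(x, y) == "0"]

print(complements("a"))
print(meet("a", join("b", "c")), join(meet("a", "b"), meet("a", "c")))
```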

It follows from this result that in a complemented distributive lattice
L we can define a mapping which sends each x∈L onto its unique
complement x* in L. Clearly we have x** = x for each x∈L.

1.7. Problem. (i) Consider the lattice 𝒮 of vector subspaces of Euclidean
2-space E². Consider the following subspaces of E²:

V_1 = {⟨x,y⟩∈E²: x=0},

V_2 = {⟨x,y⟩∈E²: y=0},

V_3 = {⟨x,y⟩∈E²: x=y}.

Show that V_2 and V_3 are both complements of V_1 in 𝒮, and deduce that
𝒮 is not distributive.
(ii) Let L be a totally ordered set with a 0 and a 1. Show that, if x∈L
and 0≠x≠1, then x has no complement in L.

§ 2. Boolean algebras
The stage is now set for us to introduce the central notion of the chapter.
A Boolean algebra is a complemented distributive lattice.
By the results and definitions of §1, the following identities hold in any
Boolean algebra:

(i) x∨y = y∨x, x∧y = y∧x;
(ii) x∨(y∨z) = (x∨y)∨z, x∧(y∧z) = (x∧y)∧z;
(iii) (x∨y)∧y = y, (x∧y)∨y = y;
(iv) (x∨y)∧z = (x∧z)∨(y∧z), (x∧y)∨z = (x∨z)∧(y∨z);
(v) x∨x* = 1, x∧x* = 0.

The join, meet and complementation mappings in a Boolean algebra
are called Boolean operations.

2.1. Problem. Let ⟨B, ∧, ∨, *, 0, 1⟩ be a structure consisting of a set B,
two binary operations ∧, ∨ and one unary operation * on B, and two
designated elements 0, 1 of B. Suppose that the identities (i)-(v) hold in
this structure. Define a relation ≤ on B by

x≤y ⇔ x∧y=x.

Show that ⟨B,≤⟩ is a Boolean algebra in which ∧, ∨ and * are the meet,
join and complementation operations, and 0, 1 are the least and greatest
elements.

Prob. 2.1 shows that a Boolean algebra may alternatively be defined
as a structure ⟨B, ∧, ∨, *, 0, 1⟩ satisfying identities (i)-(v). We shall
use this characterization occasionally in the sequel.
Let P be a statement about Boolean algebras which involves just the
Boolean operations ∧, ∨, * and the two elements 0 and 1. The dual
of P is obtained from P by interchanging ∧ with ∨, and 0 with 1. The
principle of duality for Boolean algebras is the observation that, if P holds
in all Boolean algebras, then so does its dual. This can be proved as
follows. Suppose that P holds in all Boolean algebras, and let ⟨B,≤⟩
be any Boolean algebra. Define a new relation ≤′ on B by putting

x ≤′ y ⇔ y ≤ x.

It is then easy to verify that ⟨B,≤′⟩ is a Boolean algebra. Moreover, the
meet (join) operation in ⟨B,≤′⟩ coincides with the join (meet) operation
in ⟨B,≤⟩, the least (greatest) element of ⟨B,≤′⟩ with the greatest (least)
element of ⟨B,≤⟩, and complementation in ⟨B,≤′⟩ with complementation
in ⟨B,≤⟩. Now, since P was assumed to hold in all Boolean algebras,
it holds in ⟨B,≤′⟩, and by the preceding remark, this means that the dual
of P holds in ⟨B,≤⟩. This establishes the principle of duality; we shall
frequently employ it without further comment.
For example, it is easy to show that in any Boolean algebra we have
(x∧y)* = x*∨y*. The principle of duality allows us to infer that
(x∨y)* = x*∧y*.

2.2. Examples of Boolean algebras. (i) The power set lattice PX of
a set X is a Boolean algebra, called the power set algebra of X. In this
algebra we have

A∧B = A∩B, A∨B = A∪B,

A* = X−A,

0 = ∅, 1 = X.

(ii) Let ℒ be a first-order language, and let Φ be the set of all formulas
of ℒ. For α,β∈Φ write α≈β for ⊢₀ α↔β (Ch. 1, §10). Then ≈ is an
equivalence relation on Φ. For each α∈Φ let

|α| = {β∈Φ : α≈β}

be the ≈-class of α. Let B be the set of all ≈-classes. Define a relation
≤ on B by putting

|α| ≤ |β| iff ⊢₀ α→β.

Then ⟨B,≤⟩ is a Boolean algebra called the propositional Lindenbaum
algebra of ℒ. The Boolean operations on B are defined by

|α|∧|β| = |α∧β|, |α|∨|β| = |α∨β|,

|α|* = |¬α|.

The greatest and least elements of B are given by

1 = |α| for any α such that ⊢₀ α,

0 = |β| for any β such that ⊢₀ ¬β.

(These matters are taken up in further detail in Ch. 5.)
(iii) Let X be a topological space, and let CX be the family of all
simultaneously closed and open (clopen) subsets of X. Then, with the partial
ordering of inclusion, CX is a Boolean algebra called the clopen algebra of X.
(iv) Let FX consist of all finite subsets and complements of finite subsets
of a set X. Then with the partial ordering of inclusion, FX is a Boolean
algebra called the finite-cofinite algebra of X.
(v) Define a total ordering ≤ on the two element set 2 = {0,1} by
putting 0≤0, 0≤1, 1≤1. Then ⟨2,≤⟩ is a Boolean algebra called the
minimal algebra.
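
For a finite set X the power set algebra of example (i) is easy to model on a computer. The following sketch is an illustration only: it assumes X has three elements, represents each subset by a bitmask (a representation choice of ours, not of the text), and checks identities (i)-(v) of this section together with both De Morgan laws by exhaustive search. All function names are hypothetical.

```python
from itertools import product

N = 3                      # size of the underlying set X (an assumption for the demo)
TOP = (1 << N) - 1         # the bitmask of X itself, i.e. the element 1
ALL = range(TOP + 1)       # every subset of X, as a bitmask

def meet(a, b):
    return a & b           # intersection

def join(a, b):
    return a | b           # union

def comp(a):
    return TOP & ~a        # complement relative to X

def satisfies_identities():
    """Check identities (i)-(v) for all triples of subsets."""
    for x, y, z in product(ALL, repeat=3):
        checks = [
            join(x, y) == join(y, x), meet(x, y) == meet(y, x),
            join(x, join(y, z)) == join(join(x, y), z),
            meet(x, meet(y, z)) == meet(meet(x, y), z),
            meet(join(x, y), y) == y, join(meet(x, y), y) == y,
            meet(join(x, y), z) == join(meet(x, z), meet(y, z)),
            join(meet(x, y), z) == meet(join(x, z), join(y, z)),
            join(x, comp(x)) == TOP, meet(x, comp(x)) == 0,
        ]
        if not all(checks):
            return False
    return True

def satisfies_de_morgan():
    return all(comp(meet(x, y)) == join(comp(x), comp(y)) and
               comp(join(x, y)) == meet(comp(x), comp(y))
               for x, y in product(ALL, repeat=2))

print(satisfies_identities(), satisfies_de_morgan())
```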

A subalgebra of a Boolean algebra B is a non-empty subset B′ which
is closed under the Boolean operations in B. It is clear that each subalgebra
of B contains 0 and 1.
At this point we decree that a Boolean algebra must have at least two
elements, which is equivalent to asserting that 0≠1 in any Boolean algebra
we consider. With this proviso, it is clear that each Boolean algebra
includes as a subalgebra a copy of the minimal algebra, namely, the two
element subset {0, 1}.

2.3. Problem. Show that for any elements x,y of a Boolean algebra,

x≤y ⇔ y*≤x* ⇔ x∧y* = 0 ⇔ x*∨y = 1;

(x∧y)* = x*∨y*, (x∨y)* = x*∧y*.

2.4. Problem. (i) Show that the intersection of a family of subalgebras of
a Boolean algebra is a subalgebra, and deduce that each subset X of a
Boolean algebra B is included in a smallest subalgebra A (under inclusion).
A is called the subalgebra generated by X. If A = B, then X is said to
generate B.
(ii) Show that the subalgebra generated by a subset X of a Boolean
algebra B consists of all elements of B of the form ⋁_{j=1}^{n} ⋀_{k=1}^{m} x_{jk} (or of
all elements of the form ⋀_{j=1}^{n} ⋁_{k=1}^{m} x_{jk}), where for all j,k, either x_{jk}∈X
or x_{jk}*∈X. Deduce that if B′ is a subalgebra of B, and x∈B, then the
subalgebra of B generated by B′∪{x} consists of all elements of B of the
form (a∧x)∨(b∧x*) with a,b∈B′.
(iii) Let X be a subset of a Boolean algebra B, and let A be the subalgebra
generated by X. Show that, if X is finite, then |A| ≤ 2^{2^{|X|}}, and that, if
X is infinite, then |A| = |X|. (Use (ii).)
2.5. Problem. A Boolean ring is a ring R with identity such that x²=x
for all x∈R.
(i) Show that for any elements x,y of a Boolean ring R we have xy=yx
and x+x=0.
(ii) Let R be a Boolean ring. Show that the relation ≤ on R defined
by putting x≤y ⇔ xy=x is a partial ordering on R and that ⟨R,≤⟩ is
a Boolean algebra. (In this Boolean algebra, x∨y = x+y+xy and x∧y = xy.)
(iii) Conversely, let B be a Boolean algebra, and define

x+y = (x∧y*)∨(x*∧y), x·y = x∧y for all x,y∈B.

Show that ⟨B, +, ·⟩ is a Boolean ring.
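
Part (iii) can be tried out on a small power set algebra before it is proved in general. The sketch below is an illustration under assumptions of our own: subsets of a four-element set are represented as bitmasks, and the function names are hypothetical. The ring addition defined by (x∧y*)∨(x*∧y) is then the symmetric difference, the multiplication is intersection, and the join can be recovered from the ring operations as x+y+xy.

```python
from itertools import product

N = 4                       # size of the underlying set (an assumption for the demo)
ONE = (1 << N) - 1          # the greatest element, as a bitmask
ELEMENTS = range(ONE + 1)

def comp(x):
    return ONE & ~x

def add(x, y):
    """x + y = (x ^ y*) v (x* ^ y), i.e. the symmetric difference."""
    return (x & comp(y)) | (comp(x) & y)

def mul(x, y):
    """x . y = x ^ y, i.e. the intersection."""
    return x & y

def is_boolean_ring():
    """Check the ring axioms used here, plus x.x = x and x + x = 0."""
    for x, y, z in product(ELEMENTS, repeat=3):
        if not (add(add(x, y), z) == add(x, add(y, z))
                and add(x, y) == add(y, x)
                and add(x, 0) == x and add(x, x) == 0
                and mul(mul(x, y), z) == mul(x, mul(y, z))
                and mul(x, ONE) == x and mul(x, x) == x
                and mul(x, add(y, z)) == add(mul(x, y), mul(x, z))):
            return False
    return True

def join_from_ring(x, y):
    """Recover the join from the ring operations: x + y + xy."""
    return add(add(x, y), mul(x, y))

print(is_boolean_ring())
```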

2.6. Problem. Let B be a Boolean algebra and let {x_i: i∈I} be a subset
of B for which ⋁_{i∈I} x_i exists. Show that ⋀_{i∈I} x_i* exists and is equal to
(⋁_{i∈I} x_i)*. Dualize. Show also that, for each y∈B, ⋁_{i∈I} (y∧x_i) exists
and is equal to y∧⋁_{i∈I} x_i. Dualize.

§ 3. Filters and homomorphisms

A filter in a Boolean algebra B is a non-empty subset F of B satisfying the
following conditions:
(i) x,y∈F ⇒ x∧y∈F,
(ii) x∈F & x≤y ⇒ y∈F,
(iii) 0∉F.
For reasons which will become apparent later on, filters will play an
important role in our investigation of the structure of Boolean algebras.
We now give a necessary and sufficient condition for a subset of a Boolean
algebra to be included in a filter.
A subset X of a Boolean algebra B is said to have the finite meet property¹
(f.m.p. for short) if whenever x_1,...,x_n∈X we have x_1∧...∧x_n≠0.

3.1. Theorem. A subset X of a Boolean algebra B is included in some
filter in B iff X has the finite meet property.
Proof. Suppose that X is included in a filter F; then

x_1,...,x_n∈X ⇒ x_1∧...∧x_n∈F ⇒ x_1∧...∧x_n≠0

since 0∉F.
Conversely, suppose that X has the f.m.p. If X≠∅, define

X⁺ = {y∈B: ∃x_1...∃x_n∈X [x_1∧...∧x_n ≤ y]}.

Clearly X⊆X⁺; we claim that X⁺ is a filter. Obviously X⁺ satisfies
clause (ii) of the filter condition. To verify clause (i), suppose that y, y′∈X⁺.
Then there exist x_1,...,x_n,x′_1,...,x′_m∈X such that

x_1∧...∧x_n ≤ y, x′_1∧...∧x′_m ≤ y′.

Hence

x_1∧...∧x_n∧x′_1∧...∧x′_m ≤ y∧y′.

Therefore y∧y′∈X⁺ and clause (i) is satisfied. Finally, since X has the
f.m.p., 0∉X⁺. Accordingly X⁺ is a filter including X.
If, on the other hand, X=∅, then we put X⁺ = {1}. ∎
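
The construction of X⁺ can be carried out explicitly in a small power set algebra. The sketch below is an illustration under our own assumptions: the algebra is the power set of a three-element set, subsets are written as bitmasks, and the function names are hypothetical. It computes the finite meets of X, tests the f.m.p., and forms X⁺ as the set of all elements lying above some finite meet.

```python
from functools import reduce
from itertools import combinations

TOP = 0b111  # the three-element set, as a bitmask (an assumption for the demo)

def finite_meets(X):
    """All meets of non-empty finite subfamilies of X."""
    xs = sorted(X)
    return {reduce(lambda a, b: a & b, c)
            for r in range(1, len(xs) + 1) for c in combinations(xs, r)}

def has_fmp(X):
    """Finite meet property: no finite meet of members of X is 0."""
    return 0 not in finite_meets(X)

def generated_filter(X):
    """X+ as in the proof: everything above some finite meet; {1} when X is empty."""
    if not X:
        return {TOP}
    meets = finite_meets(X)
    return {y for y in range(TOP + 1) if any(m & y == m for m in meets)}

def is_filter(F):
    """Clauses (i)-(iii) of the definition of a filter, for a non-empty F."""
    return (len(F) > 0 and 0 not in F
            and all(x & y in F for x in F for y in F)
            and all(y in F for x in F for y in range(TOP + 1) if x & y == x))

X = {0b011, 0b110}
print(has_fmp(X), sorted(generated_filter(X)))
```

When X fails the f.m.p. the same recipe still produces a set closed under meets and upward closed, but it contains 0 and so is not a filter.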

The filter X⁺ constructed in the proof of the preceding theorem is
easily seen to be the smallest filter including X. It is called the filter
generated by X. In this connection, a filter is said to be principal if it is
generated by (the singleton of) a single (non-zero) element. Evidently
the principal filter generated by an element x is {y: x≤y}; x is called
the generating element of this filter.

¹ If B is a subalgebra of a power set algebra, then meets in B are set intersections and
accordingly we use the term finite intersection property.
We turn now to the important notion of a homomorphism of Boolean
algebras. A homomorphism of a Boolean algebra B into a Boolean algebra
B′ is a map h: B→B′ such that for all x,y∈B we have

h(x∧y) = h(x)∧h(y),

h(x∨y) = h(x)∨h(y),

h(x*) = h(x)*.

If h is one-one and onto, it is said to be an isomorphism of B onto B′;
in this situation B and B′ are said to be isomorphic, and we write B≅B′.

3.2. Problem. Show that, if h is a homomorphism of a Boolean algebra
B into a Boolean algebra B′, the image h[B] of B is a subalgebra of B′.
Show also that h is one-one iff h⁻¹(0) = {0} or h⁻¹(1) = {1}.
3.3. Problem. Let h be a map of a Boolean algebra B into a Boolean
algebra B′.
(i) Show that the following conditions are equivalent:
(a) h is a homomorphism;
(b) for all x,y∈B,

h(x∧y) = h(x)∧h(y), h(x*) = h(x)*;

(c) for all x,y∈B,

h(x∨y) = h(x)∨h(y), h(x*) = h(x)*.

(ii) Suppose that h is one-one and onto. Show that h is an isomorphism
iff for all x,y∈B,

x≤y ⇔ h(x)≤h(y).

Filters and homomorphisms are closely related. First, observe that if
h: B→B′ is a homomorphism of Boolean algebras, then

h⁻¹(1) = {x∈B: h(x)=1}

is a filter in B. (Problem: Prove this.) This filter is called the hull of h.
Thus each homomorphism of a Boolean algebra into another can be
associated with a filter. We now show how to establish the converse.
For each pair of elements x,y of a Boolean algebra B we put

x⇔y = (x*∨y)∧(x∨y*).

Given a filter F in B, we define the binary relation ~_F on B by putting

x ~_F y iff (x⇔y) ∈ F.

Then ~_F is an equivalence relation on B. (Problem: Prove this.) Moreover,
~_F is a congruence relation with respect to the Boolean operations
in B. This means that if x ~_F y and x′ ~_F y′, then

x∨x′ ~_F y∨y′, x∧x′ ~_F y∧y′,

x* ~_F y*.

(Problem: Prove this.) It follows that the set B/F of ~_F-classes of members
of B can be turned into a Boolean algebra by means of the prescription

x̃∨ỹ = (x∨y)~, x̃∧ỹ = (x∧y)~,

x̃* = (x*)~,

where, for each x∈B,

x̃ = {y: x ~_F y}

is the ~_F-class containing x. The resulting Boolean algebra B/F is called
the quotient of B by the filter F. Evidently, if we define h: B→B/F by
h(x)=x̃, then h is a homomorphism of B onto B/F (called the canonical
homomorphism). Moreover, F is the hull of h. (Problem: Prove this.)
Thus with each filter F we can associate a Boolean algebra B′ = B/F and
a homomorphism h of B onto B′ such that F is the hull of h.

3.4. Problem. Let h: B→B′ be a homomorphism of Boolean algebras
and let F be the hull of h. Show that the image h[B] of B is isomorphic
to B/F.

We have seen that filters arise as the hulls of homomorphisms. Especially
important are those which are associated with homomorphisms onto the
minimal algebra 2. Such homomorphisms are called 2-valued. It is possible
(and useful) to give a particularly simple description of these filters;
this we proceed to do.
A filter F in a Boolean algebra B is called an ultrafilter if it is not properly
included in any filter in B. That is, F is an ultrafilter if whenever F′ is a
filter including F, then F=F′. Later on we shall show that ultrafilters
exist; for the present we prove:

3.5. Theorem. Let F be a filter in a Boolean algebra B. Then the following
conditions are equivalent:
(i) B/F ≅ 2;
(ii) F is the hull of a 2-valued homomorphism h on B;
(iii) F is an ultrafilter;
(iv) for all x,y∈B, if x∨y∈F, then x∈F or y∈F;
(v) for each x∈B, either x∈F or x*∈F.
Proof. (i)⇒(ii). If (i) holds, then the canonical homomorphism of B
onto B/F ≅ 2 meets the requirements imposed in (ii).
(ii)⇒(iii). Suppose (ii) holds, and let F′ be a filter including F. Then,
if x∈F′, we must have x*∉F since F′ is a filter and F⊆F′. Therefore
h(x*)≠1, whence h(x*)=0. But then h(x)=1 so that, since F is the hull
of h, we have x∈F. Hence F′=F and F is an ultrafilter.
(iii)⇒(iv). Suppose (iii) holds and that x∨y∈F. If x∉F, it is easy
to see that G = {z: x∨z∈F} is a filter which includes F, and so, since
F is an ultrafilter, F=G. But since x∨y∈F, it follows that y∈G and
hence that y∈F.
(iv)⇒(v). This is immediate since for any x∈B we have x∨x* = 1∈F.
(v)⇒(i). Assume (v), and for each x∈B let x̃ be the image of x under
the canonical homomorphism of B onto B/F. Suppose that x̃≠1 in B/F.
Then x∉F, so that x*∈F. But then x̃* = (x*)~ = 1. It follows that x̃=0.
Therefore B/F contains just two elements; in other words, it is isomorphic
to 2. ∎
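
In a finite Boolean algebra the equivalence of (iii) and (v) can be confirmed by brute force. The sketch below is an illustration only, under assumptions of our own: the algebra is the power set of a three-element set, subsets are bitmasks, and the function names are hypothetical. It lists every filter, picks out the maximal ones, and compares maximality with condition (v).

```python
from itertools import combinations

TOP = 0b111                 # the three-element set, as a bitmask (demo assumption)
ALGEBRA = range(TOP + 1)    # all eight subsets

def is_filter(F):
    """Non-empty, closed under meets, upward closed, and not containing 0."""
    return (len(F) > 0 and 0 not in F
            and all(x & y in F for x in F for y in F)
            and all(y in F for x in F for y in ALGEBRA if x & y == x))

def all_filters():
    """Every filter in the algebra, found by testing all non-empty subsets."""
    subsets = (frozenset(c) for r in range(1, len(ALGEBRA) + 1)
               for c in combinations(ALGEBRA, r))
    return [F for F in subsets if is_filter(F)]

def is_ultrafilter(F, filters):
    """F is not properly included in any other filter."""
    return not any(F < G for G in filters)

def decides_every_element(F):
    """Condition (v): for each x, either x or its complement belongs to F."""
    return all(x in F or (TOP & ~x) in F for x in ALGEBRA)

FILTERS = all_filters()
ULTRAFILTERS = [F for F in FILTERS if is_ultrafilter(F, FILTERS)]
print(len(FILTERS), [sorted(U) for U in ULTRAFILTERS])
```

In this small algebra every ultrafilter found is the principal filter generated by a one-element subset; nothing of the kind is claimed for infinite algebras.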

Thus ultrafilters are precisely the hulls of homomorphisms onto the
minimal algebra. The question now arises as to whether ultrafilters actually
exist. Assuming Zorn’s lemma, the answer is in the affirmative. In fact
we can establish the existence of an extremely rich class of ultrafilters in
any Boolean algebra.

3.6. Ultrafilter Theorem. Each filter in a Boolean algebra is included
in an ultrafilter.
Proof. Let ℱ be the set of all filters in a Boolean algebra B; ℱ can be
partially ordered by inclusion. We will show that, with respect to this
ordering, chains in ℱ have upper bounds in ℱ.
Let 𝒞 be a chain in ℱ, and let C = ⋃𝒞. If x,y∈C, then for some D,E∈𝒞,
x∈D and y∈E. Since 𝒞 is a chain, either E⊆D or D⊆E; suppose
the former case obtains. Then x,y∈D and because D is a filter we have
x∧y∈D⊆C. If z∈B and x≤z, then ipso facto z∈D⊆C. Since 0∉D
for all D∈𝒞 it follows that 0∉C. Therefore C is a filter and is the required
upper bound for 𝒞 in ℱ.
We may accordingly invoke Zorn’s Lemma to conclude that, for every
filter F in B, ℱ contains a maximal member, i.e. an ultrafilter, which
includes F. ∎

3.7. Corollary. A subset of a Boolean algebra is included in an ultrafilter
iff it has the f.m.p.
Proof. This follows immediately from Thms. 3.1 and 3.6. ∎

This gives instantly:

3.8. Corollary. Each non-zero element of a Boolean algebra is contained
in an ultrafilter. ∎

3.9. Corollary. For any pair of distinct elements of a Boolean algebra
there is an ultrafilter containing one but not the other.
Proof. Let x and y be distinct elements of a Boolean algebra B. Then
either x≰y or y≰x; suppose the former case obtains. Then, by Prob. 2.3,
x∧y*≠0. Hence, by Cor. 3.8, there is an ultrafilter containing x∧y*;
this ultrafilter clearly contains x but not y. ∎

The correspondence between ultrafilters and 2-valued homomorphisms
set up in Thm. 3.5 enables one to reformulate Thm. 3.6 and its corollaries
in terms of such homomorphisms. Thus, e.g. Cor. 3.9 becomes: if x,y
are distinct elements of a Boolean algebra B, then there is a 2-valued
homomorphism h on B such that h(x)≠h(y). In other words, there are
enough 2-valued homomorphisms to distinguish points of a Boolean
algebra.
Let F be a filter in a Boolean algebra B. A subset X of F is called a
base for F if for each y∈F there is x∈X such that x≤y. If for each subset X of
B we put X^∧ for the set of all finite non-empty meets of members of X,
it is clear that, if X is non-empty and generates a filter F, then X^∧ is a base
for F. Moreover, it is obvious that if X is a base for a filter F, then F
is the (unique) filter generated by X.
A base X for a filter F is said to be strong if X is closed under finite
meets, i.e. if X^∧=X. Evidently each filter F has at least one strong base,
namely F itself.

3.10. Problem. Let B be a Boolean algebra.
(i) Show that a non-empty subset X of a filter F in B is a base for F iff X
generates F and for all x,y∈X there exists z∈X such that z≤x∧y. Deduce
that a non-empty subset X of B is a base for a (unique) filter in B iff 0∉X
and for all x,y∈X there is z∈X such that z≤x∧y.
(ii) Show that a subset X of B is a base for a (unique) ultrafilter in B
iff X has the f.m.p. and for all y∈B there is x∈X such that x≤y or x≤y*.
(iii) Show that, if X is a strong base for a filter F, then, for any x∈X,
the set {y∈X: y≤x} is also a strong base for F.
(iv) Show that a filter in B is principal iff it has a finite base, or
equivalently, iff it has a finite strong base.

In general a filter may have many strong bases. For example, let B
be a Boolean algebra with more than two elements, and let a be an element
of B with 0≠a≠1. Then both {a} and {a,1} are strong bases for the
principal filter generated by a. On the other hand, it is easy to see that all
strong bases for a given principal filter F must contain the generating
element of F, so that a principal filter cannot have disjoint strong bases.
However, the situation in this regard is quite different for non-principal
filters. For example, consider the family 𝒢 of all cofinite sets (i.e.,
complements of finite sets) of natural numbers. This is obviously a non-principal
filter in the power set algebra Pω of the set ω of natural numbers. Moreover,
if for each n∈ω we put S_n = {m∈ω: n≤m}, then it is easy to see that
{S_{2n}: n∈ω} and {S_{2n+1}: n∈ω} are both strong bases for 𝒢, and that they are
disjoint. Our next result, which will be useful in our discussion of
non-standard analysis, shows that this conclusion is satisfied by all non-principal
filters.

*3.11. Lemma. Any non-principal filter F in a Boolean algebra B has at
least two disjoint strong bases.
Proof. Among the strong bases for F choose one, say X, which has
minimal cardinality, say κ. Then κ must be infinite by Prob. 3.10(iv).
Let X be well-ordered in the form X = {x_α: α<κ}. By transfinite
recursion we define y_α and z_α for each α<κ in such a way that
(i) y_α, z_α∈X, y_α, z_α ≤ x_α;
(ii) {y_β: β≤α}^∧ ∩ {z_β: β≤α}^∧ = ∅.
Suppose that y_α and z_α have been defined in conformity with (i) and (ii)
for all α<γ, where γ<κ. We show how y_γ and z_γ may be defined.
Clearly, the set {y_α: α<γ}^∧ is not a strong base for F, since its
cardinality is less than κ. Thus there must be some x∈X such that
y∉{y_α: α<γ}^∧ for all y≤x. Choose the first such x in the well-ordering
of X, and let z_γ = x∧x_γ. Then z_γ∈X, z_γ≤x_γ and y∉{y_α: α<γ}^∧ for all
y≤z_γ.
Similarly, we can define y_γ in such a way that y_γ≤x_γ, y_γ∈X and
y∉{z_α: α≤γ}^∧ for all y≤y_γ.
It is now not difficult to verify that

{y_α: α≤γ}^∧ ∩ {z_α: α≤γ}^∧ = ∅.

The sets {y_α: α<κ}^∧ and {z_α: α<κ}^∧ are disjoint strong bases for F. ∎

3.12. Problem. (Throughout this problem we adopt the notation and
terminology of Ex. 2.2(ii).) Let ℒ be a first-order language, and let B
be the propositional Lindenbaum algebra of ℒ.
(i) Let Σ be a set of formulas of ℒ. Show that Σ is propositionally
consistent iff {|α|: α ∈ Σ} has the f.m.p. in B. Show that Σ is maximal
propositionally consistent (see §13 of Ch. 1) iff {|α|: α ∈ Σ} is an
ultrafilter in B.
(ii) Let Σ be a propositionally consistent set of formulas of ℒ. Let
U be an ultrafilter in B including {|α|: α ∈ Σ}, and let h be the canonical
homomorphism of B onto B/U ≅ 2. Define the mapping σ of the set of
formulas of ℒ into {⊤, ⊥} by

σ(α) = ⊤ ⟺ h(|α|) = 1,   σ(α) = ⊥ ⟺ h(|α|) = 0.

Show that σ is a truth valuation satisfying Σ. (This gives an alternative
proof of Thm. 1.13.4. For an extension of this method to the full predicate
calculus, see Ch. 5.)

3.13. Problem. An ideal in a Boolean algebra B is a subset I of B such that
(a) x, y ∈ I ⇒ x ∨ y ∈ I,
(b) x ∈ I and y ≤ x ⇒ y ∈ I,
(c) 1 ∉ I.
(i) For each subset X ⊆ B let X* = {x*: x ∈ X}. Show that X is a filter
(ideal) iff X* is an ideal (filter).
(ii) Reformulate the results of this section in terms of ideals instead
of filters.
3.14. Problem. Show that each filter in a Boolean algebra is the
intersection of the collection of all ultrafilters which include it.
*3.15. Problem (Sikorski). (i) Let B be a Boolean algebra, let A be
a subalgebra of B, and let C be a complete Boolean algebra. Show that
any homomorphism of A into C can be extended to a homomorphism
of B into C. (Let h: A → C be the given homomorphism. By Zorn's
lemma, there is a maximal pair ⟨f, D⟩, where D is a subalgebra of B
including A and f is a homomorphism of D into C extending h. Then
D = B, for if a ∈ B − D then f can be extended to the subalgebra

{(x ∧ a) ∨ (y ∧ a*): x, y ∈ D}

generated by D ∪ {a} by setting

f((x ∧ a) ∨ (y ∧ a*)) = (f(x) ∧ b) ∨ (f(y) ∧ b*),

where b is any element of C satisfying

⋁{f(x): x ≤ a & x ∈ D} ≤ b ≤ ⋀{f(y): a ≤ y & y ∈ D}.)

(ii) A subalgebra A of a Boolean algebra B is said to be dense in B if
for every x ∈ B, x ≠ 0, there is y ∈ A such that y ≠ 0 and y ≤ x. Show that,
if A is a dense subalgebra of B, then every one-one homomorphism of
A into a complete Boolean algebra C can be extended to a one-one
homomorphism of B into C. (Use (i).)
3.16. Problem. If X is a set, a filter in the power set algebra PX is called
a filter over X.
(i) Show that the following conditions on an ultrafilter 𝒰 over X are
equivalent:
(a) 𝒰 is principal.
(b) 𝒰 is generated by a singleton.
(c) 𝒰 contains a finite subset of X.
(ii) Show that, if X is infinite, there is a non-principal ultrafilter over X.
(Show that the family of cofinite subsets of X has the f.m.p., and
use Cor. 3.7.)
(iii) Show that, if 𝒰 is an ultrafilter over X and {Y_1, ..., Y_n} is a partition
of a member of 𝒰, then exactly one of the Y_i is a member of 𝒰. Extend
this result to ultrafilters in arbitrary Boolean algebras.
(iv) An ultrafilter 𝒰 over X is said to be uniform if |Y| = |X| for all
Y ∈ 𝒰. Show that, if X is infinite, there is a uniform ultrafilter over X.
Show also that an ultrafilter over a denumerable set is uniform iff it is
non-principal.
(v) An ultrafilter 𝒰 over X is said to be countably incomplete if there
is a denumerable subset {X_n: n ∈ ω} of 𝒰 such that ⋂_{n∈ω} X_n ∉ 𝒰. Show
that, if 𝒰 is a countably incomplete ultrafilter over X, there is a descending
chain Y_0 ⊇ Y_1 ⊇ Y_2 ⊇ ... such that each Y_n ∈ 𝒰 and ⋂_{n∈ω} Y_n = ∅. Show
also that every non-principal ultrafilter over ω is countably incomplete.
3.17. Problem. Let X be a topological space and for each x ∈ X let 𝒩_x
be the family of neighbourhoods of x.
(i) Show that 𝒩_x is a filter over X.
(ii) A filter ℱ over X is said to converge to a point x ∈ X if 𝒩_x ⊆ ℱ.
Show that X is Hausdorff iff each ultrafilter over X converges to at most
one point, and that X is compact iff each ultrafilter over X converges
to at least one point.
3.18. Problem. A subset X of a Boolean algebra B is said to be an antichain
if whenever x, y ∈ X and x ≠ y, we have x ∧ y = 0. B is said to satisfy the
countable chain condition if any antichain in B is countable. Show that
the following conditions are equivalent:
(i) B satisfies the countable chain condition;
(ii) for each subset X of B, there is a countable subset Y of X such that
X and Y have the same set of upper bounds.
(For (i)⇒(ii), consider a maximal antichain in the ideal I generated
by X, i.e. I = {y ∈ B: y ≤ x_1 ∨ ... ∨ x_n for some x_1, ..., x_n ∈ X}.)
3.19. Problem. A positive measure on a Boolean algebra B is a function
μ on B to the non-negative real numbers such that
(i) μ(x ∨ y) = μ(x) + μ(y) for all x, y ∈ B such that x ∧ y = 0,
(ii) μ(1) = 1,
(iii) μ(x) = 0 iff x = 0.
Show that, if B admits a positive measure, then B satisfies the countable
chain condition.

§ 4. The Stone Representation Theorem

The very first example of a Boolean algebra that occurred to us was the
power set algebra of a set. Now it is easy to construct Boolean algebras
which are not isomorphic to any power set algebra. In fact, the
finite-cofinite algebra Fω of the set ω of natural numbers is such an algebra,
since Fω has cardinality ℵ₀, while evidently no power set algebra can
have this cardinality. Nonetheless, we are going to show that each Boolean
algebra is isomorphic to a subalgebra of a power set algebra, or, in other
words, each Boolean algebra can be represented as a subalgebra of a power
set algebra.
Let us define a field of sets to be a subalgebra of a power set algebra.
In particular, a field of subsets of a set A is a subalgebra of PA.
If B is a Boolean algebra, we denote by SB the set of all ultrafilters
in B. Then we have:

4.1. Stone Representation Theorem. Each Boolean algebra B is isomorphic
to a field of subsets of SB.
Proof. Let B be a Boolean algebra. Define a mapping u: B → PSB by
putting

u(x) = {F ∈ SB: x ∈ F}

for each x ∈ B. Thus u(x) is the set of all ultrafilters containing x.
We claim that u is a homomorphism of B into PSB. For suppose
x, y ∈ B; then, if F ∈ SB, we have

F ∈ u(x ∧ y) ⟺ x ∧ y ∈ F ⟺ x ∈ F & y ∈ F ⟺ F ∈ u(x) ∩ u(y).

Hence u(x ∧ y) = u(x) ∩ u(y). Also, we have

F ∈ u(x*) ⟺ x* ∈ F ⟺ x ∉ F (by Thm. 3.5(iv)) ⟺ F ∈ SB − u(x).

Accordingly u(x*) = SB − u(x), so that, by Prob. 3.3, u is a homomorphism.
We also note that u is one-one, for if x ≠ y then by Cor. 3.9 there is
an ultrafilter F containing x, say, but not y. Then F ∈ u(x) and F ∉ u(y),
so that u(x) ≠ u(y).
We have therefore shown that u is an isomorphism of B onto the
subalgebra u[B] of PSB, which proves the theorem. ∎
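
The construction in the proof can be tried out on a small finite case. The following Python sketch is only an illustration under stated assumptions: the example algebra (the divisors of 30, with meet = gcd and x* = 30/x) and every function name are choices made here, not part of the text. It finds the ultrafilters by brute force, builds the map u, and checks that u preserves meets and complements and is one-one.

```python
from itertools import combinations
from math import gcd

# Assumed example algebra: the divisors of 30 ordered by divisibility.
# Its zero is 1 and its unit is 30.
N = 30
B = [d for d in range(1, N + 1) if N % d == 0]
ZERO, ONE = 1, N

def meet(x, y):
    return gcd(x, y)

def comp(x):
    return N // x

def leq(x, y):
    return y % x == 0          # x <= y means x divides y

def is_ultrafilter(F):
    """Filter axioms, plus: exactly one of x, x* lies in F."""
    if ONE not in F or ZERO in F:
        return False
    if any(meet(x, y) not in F for x in F for y in F):
        return False
    if any(leq(x, y) and y not in F for x in F for y in B):
        return False
    return all((x in F) != (comp(x) in F) for x in B)

# SB: all ultrafilters in B, found by testing every subset of B.
SB = [frozenset(c)
      for r in range(len(B) + 1)
      for c in combinations(B, r)
      if is_ultrafilter(frozenset(c))]

def u(x):
    """Stone map: the set of ultrafilters containing x."""
    return frozenset(F for F in SB if x in F)

preserves_meet = all(u(meet(x, y)) == u(x) & u(y) for x in B for y in B)
preserves_comp = all(u(comp(x)) == frozenset(SB) - u(x) for x in B)
one_one = len({u(x) for x in B}) == len(B)
print(len(SB), preserves_meet, preserves_comp, one_one)  # 3 True True True
```

The brute-force search over all subsets is feasible only because this algebra has eight elements; it is meant to mirror the definitions, not to be efficient.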

The Stone Representation Theorem can be phrased much more suggestively
within the framework of general topology.
Recall that a simultaneously closed and open subset of a topological
space is called a clopen set. Let us define a Boolean space to be a compact
Hausdorff space with a base of clopen sets. (Spaces satisfying this latter
condition are sometimes called totally disconnected for reasons which
are revealed in Prob. 4.14.)
If X is a topological space, then the clopen subsets of X form a field
of sets (cf. Ex. 2.2(iii)). This field is called the clopen algebra of X and is
written CX.
We shall show that we can assign a natural topology to the set SB of
all ultrafilters in B in such a way that it becomes a Boolean space and its
clopen algebra is isomorphic to B.

4.2. Lemma. Let X be a compact space, and let 𝒜 be a field of subsets of
X which is also a base for the topology on X. Then 𝒜 = CX.
Proof. The elements of 𝒜 are open subsets of X since they form a base
for the topology. Since 𝒜 is a field of sets it is stable under the formation
of complements and so its members are also closed. Thus 𝒜 ⊆ CX.
Now suppose that Y ∈ CX. Y is open and 𝒜 is a base, so there is some
family {A_i: i ∈ I} of members of 𝒜 whose union is Y. But Y is also a
closed subset of the compact space X, so Y is itself compact, and therefore
the open cover {A_i: i ∈ I} of Y has a finite subcover {A_{i_1}, ..., A_{i_n}}, i.e.
Y = A_{i_1} ∪ ... ∪ A_{i_n}. However, 𝒜, being a field of sets, is stable under
finite unions. Hence Y ∈ 𝒜, and our proof is complete. ∎

Consider the family u[B] = {u(x): x ∈ B}, where u is defined as in 4.1 by

u(x) = {F ∈ SB: x ∈ F} for x ∈ B;

u[B] is a field of subsets of SB and is therefore stable under finite
intersections. Accordingly u[B] forms a base for a unique topology on SB.
The resulting topological space is called the Stone space of B.
Our next result is of great importance.

4.3. Theorem. The Stone space SB of a Boolean algebra B is a Boolean
space and B is isomorphic to the clopen algebra CSB of SB.
Proof. SB is a Hausdorff space. For suppose F, G ∈ SB and F ≠ G. Then
for some x ∈ B, we have x ∈ F but x ∉ G, so that x* ∈ G. Then F ∈ u(x),
G ∈ u(x*), and u(x), u(x*) are disjoint open subsets of SB.
SB is a compact space. To see this it is sufficient to show that each cover
of SB by basic open sets has a finite subcover. Suppose that {u(x_i): i ∈ I}
(x_i ∈ B) is such a cover which has no finite subcover. Then, for each finite
subset I_0 of I, we have ⋃_{i∈I_0} u(x_i) ≠ SB, so that (recalling that u is a
homomorphism!)

u(⋀_{i∈I_0} x_i*) = ⋂_{i∈I_0} u(x_i*) = ⋂_{i∈I_0} [SB − u(x_i)]
                 = SB − ⋃_{i∈I_0} u(x_i) ≠ ∅ = u(0).

Therefore, since u is one-one, we have ⋀_{i∈I_0} x_i* ≠ 0. Thus {x_i*: i ∈ I} has
the f.m.p. and can therefore (Cor. 3.7) be extended to an ultrafilter F.
Since x_i* ∈ F for all i ∈ I, it follows that F ∉ ⋃_{i∈I} u(x_i), which contradicts
the assumption that {u(x_i): i ∈ I} covers SB. Hence SB is compact.
Since u(x) is the complement of u(x*), each member of the base
{u(x): x ∈ B} is closed, and so clopen. By 4.1, u is an isomorphism of B
onto u[B], while by 4.2, u[B] = CSB. Thus u is an isomorphism of B
onto CSB. ∎

The upshot of Thm. 4.3 is that each Boolean algebra B may be identified
with the clopen algebra of its Stone space SB. Accordingly, the structure
of B is completely reflected in the structure of SB; in other words, each
algebraic property of B corresponds to a topological property of SB, and
conversely. What, for example, is the topological property of SB which
corresponds to completeness of B? The answer is somewhat weird, as we
shall see.
Let us call a topological space extremally disconnected if the closure
of every open set is open (and hence clopen!). Then we have:

4.4. Theorem. A Boolean algebra is complete iff its Stone space is extremally
disconnected.
Proof. Identify the given Boolean algebra B with the clopen algebra of SB.
Suppose that B is complete, and let U be an open set in SB. Let 𝒜 be the
family of all members of B included in U; then, since B is a base for SB,
we have U = ⋃𝒜. Since B is complete, 𝒜 has a supremum V in B which
must by definition be a clopen subset of SB. We claim that U⁻ = V, where
U⁻ denotes the closure of U. Since V is an upper bound for 𝒜, certainly
U = ⋃𝒜 ⊆ V, so that U⁻ ⊆ V since V is closed. If V − U⁻ ≠ ∅, then V − U⁻
is a non-empty open set which must, since SB is a Boolean space, include
a non-empty clopen set W. But then V − W is a clopen set which includes
⋃𝒜 and is properly included in V. This contradicts the choice of V as the
supremum of 𝒜. Therefore U⁻ = V as claimed, so a fortiori U⁻ is open.
Conversely, suppose that SB is extremally disconnected, and let 𝒜
be a subfamily of B. Then U = ⋃𝒜 is an open subset of SB (recall that
we are identifying B with CSB!) and so, since SB is extremally disconnected,
U⁻ is clopen and hence in B. We claim that U⁻ = ⋁𝒜 in B. Certainly U⁻
is an upper bound for 𝒜 in B; on the other hand, if V is a member of
B which includes each member of 𝒜, then U = ⋃𝒜 ⊆ V so that U⁻ ⊆ V
since V is clopen. Therefore U⁻ = ⋁𝒜 as claimed, and B is complete. ∎

As well as providing curious and amusing characterizations of algebraic
properties, such as in Thm. 4.4, appeal to the theory of Boolean spaces
can often yield simple proofs of algebraic results whose algebraic proofs
would turn out to be somewhat hairy. Witness, for example:

4.5. Theorem. (i) Let B be a finite Boolean algebra. Then B is isomorphic
to PSB, and hence |B| = 2^|SB|.
(ii) Any two finite Boolean algebras of the same cardinality are isomorphic.
Proof. (i) Let B be a finite Boolean algebra of cardinality m. Then SB
is a finite Hausdorff space, hence a discrete space, i.e. every subset of
SB is clopen. Therefore CSB = PSB, and since B ≅ CSB, we have (i).
(ii) Let B and B' be two finite Boolean algebras of the same cardinality m.
Let SB have p and SB' q elements. Then, by (i), we have 2^p = m = 2^q, so
that p = q. Therefore B and B' are both isomorphic to the power set algebra
of a set of p elements, and so B and B' are themselves isomorphic. ∎
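
As a numerical check of Thm. 4.5, the sketch below counts ultrafilters in two eight-element Boolean algebras of quite different appearance. Both examples (a field of subsets of {0, 1, 2, 3} generated by two sets, and the divisors of 30) and the helper names `ultrafilters` and `generated_field` are assumptions made for illustration. In each case |B| = 2^|SB| with |SB| = 3, which is the cardinality statement behind part (ii).

```python
from itertools import combinations
from math import gcd

def ultrafilters(elems, meet, comp, zero, one):
    """Brute-force list of the ultrafilters of a small finite Boolean algebra."""
    elems = list(elems)
    found = []
    for r in range(len(elems) + 1):
        for cand in combinations(elems, r):
            F = set(cand)
            if one not in F or zero in F:
                continue
            if any(meet(x, y) not in F for x in F for y in F):
                continue
            # x <= y is expressed as meet(x, y) == x
            if any(meet(x, y) == x and y not in F for x in F for y in elems):
                continue
            if all((x in F) != (comp(x) in F) for x in elems):
                found.append(frozenset(F))
    return found

def generated_field(gens, X):
    """Smallest field of subsets of X containing the sets in gens."""
    field = {frozenset(), X} | set(gens)
    changed = True
    while changed:
        changed = False
        for a in list(field):
            for b in list(field):
                for c in (a & b, X - a):
                    if c not in field:
                        field.add(c)
                        changed = True
    return field

# First algebra: a field of subsets of {0, 1, 2, 3}.
X = frozenset(range(4))
field = generated_field([frozenset({0, 1}), frozenset({1, 2, 3})], X)
S_field = ultrafilters(field, lambda a, b: a & b, lambda a: X - a,
                       frozenset(), X)

# Second algebra: divisors of 30 with meet = gcd and x* = 30 / x.
divisors = [d for d in range(1, 31) if 30 % d == 0]
S_div = ultrafilters(divisors, gcd, lambda d: 30 // d, 1, 30)

print(len(field), len(S_field), len(divisors), len(S_div))  # 8 3 8 3
```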

*4.6. Examples. (i) Let X be an infinite discrete space. It is not hard
to show that the one-point compactification X ∪ {∞} obtained by adding
a new "point at infinity" ∞ to X is a Boolean space which turns out to
be the Stone space of the finite-cofinite algebra FX of X. For each x ∈ X
let 𝒰_x be the ultrafilter in FX generated by {x}, and let 𝒰 be the ultrafilter
in FX consisting of all cofinite subsets of X. (Notice that 𝒰 is not an
ultrafilter in PX but it is one in FX!) Then {𝒰_x: x ∈ X} ∪ {𝒰} consists
of all ultrafilters in FX, that is,

SFX = {𝒰_x: x ∈ X} ∪ {𝒰}.

It is now quite easy to see that SFX is (homeomorphic to) the one-point
compactification of X. The subset {𝒰_x: x ∈ X} is a homeomorphic copy
of X, and 𝒰 functions as the "point at infinity".
(ii) Again let X be an infinite discrete space; this time consider the Stone
space of the power set algebra of X. This turns out to be (homeomorphic to)
the Stone–Čech compactification of X.
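
Although the finite-cofinite algebra of Example (i) is infinite, each of its elements has a finite description, so it can be modelled directly. The Python sketch below takes X to be the natural numbers; the class `FinCofin` and the two membership functions are hypothetical names introduced here for illustration. An element is stored as a finite set together with a flag saying whether the element is that set or its complement, and the two kinds of ultrafilter from the example become two simple membership tests.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FinCofin:
    """An element of the finite-cofinite algebra of the natural numbers.

    `core` is a finite set; the element is `core` itself when
    `cofinite` is False and the complement of `core` when it is True.
    """
    core: frozenset
    cofinite: bool = False

    def __contains__(self, n):
        return (n in self.core) != self.cofinite

    def comp(self):
        return FinCofin(self.core, not self.cofinite)

    def meet(self, other):
        if not self.cofinite and not other.cofinite:
            return FinCofin(self.core & other.core)
        if not self.cofinite:
            return FinCofin(self.core - other.core)
        if not other.cofinite:
            return FinCofin(other.core - self.core)
        return FinCofin(self.core | other.core, True)

    def join(self, other):
        # De Morgan: x v y = (x* ^ y*)*
        return self.comp().meet(other.comp()).comp()

def in_principal(x, elem):
    """Membership in the ultrafilter generated by {x}."""
    return x in elem

def in_infinity(elem):
    """Membership in the ultrafilter of all cofinite sets."""
    return elem.cofinite

a = FinCofin(frozenset({0, 2, 4}))          # the finite set {0, 2, 4}
b = FinCofin(frozenset({0, 1}), True)       # everything except 0 and 1
print(a.meet(b), a.join(b))
print(in_principal(2, a), in_infinity(a), in_infinity(b))
```

Every element is either finite or cofinite, so exactly one of an element and its complement passes `in_infinity`; this is the ultrafilter property of the "point at infinity".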

A natural question that arises is the following: given a Boolean algebra,
what is the cardinality of its Stone space? It follows immediately from
Thm. 4.5 that, if B is finite, then |SB| < |B|. But the situation is entirely
different when B is infinite, as we shall see. Let X be a set and let 𝒜 be
a field of subsets of X. 𝒜 is said to be separating if for each pair of distinct
points x, y of X there is A ∈ 𝒜 such that x ∈ A and y ∉ A.

4.7. Lemma. If X is a compact Hausdorff space and 𝒜 is a separating
field of clopen subsets of X, then 𝒜 is the field of all clopen subsets of X.
Proof. By Lemma 4.2 we need only show that 𝒜 is a base for X. Let
Y be a closed subset of X and let a ∉ Y. For each y ∈ Y we can find A_y ∈ 𝒜
such that y ∈ A_y and a ∉ A_y. Now the family {A_y: y ∈ Y} constitutes an
open cover of Y; and Y, as a closed subset of a compact space, is itself
compact. Accordingly {A_y: y ∈ Y} has a finite subcover, the union of
which is a member of 𝒜 including Y but not containing a. Since a was
an arbitrary member of X − Y, it follows that Y is the intersection of all
members of 𝒜 which include it. Since Y was an arbitrary closed set, we
infer that any closed set is the intersection of a subfamily of 𝒜. Dually,
each open set is the union of a subfamily of 𝒜, i.e. 𝒜 is a base for X. ∎

We can now prove

4.8. Theorem. For each infinite Boolean algebra B, we have |B| ≤ |SB|.
Proof. Let B be an infinite Boolean algebra, and let X be the Stone space
of B. Then by Thm. 4.3 we may identify B with the clopen algebra of X.
We have to show that |B| ≤ |X|. Suppose, on the contrary, that |X| < |B|. Let

Y = {(x, y) ∈ X × X: x ≠ y}.

Then, since B is infinite and |X| < |B|, it follows immediately that |Y| < |B|.
Since X is a Boolean space, for each z = (x, y) ∈ Y there is V_z ∈ B such that
x ∈ V_z and y ∉ V_z. Let A be the subalgebra of B generated by {V_z: z ∈ Y}.
It follows from Prob. 2.4(iii) that |A| < |B|. But, by construction, A is
a separating field of clopen subsets of X, so that, by Lemma 4.7, A = B.
This contradiction shows that the assumption |X| < |B| was wrong, so
that |B| ≤ |X|. ∎

Thus, if B is an infinite Boolean algebra, |SB| is bounded below by |B|.
Also, it is clear that, for any Boolean algebra B, |SB| is bounded above
by 2^|B|. These bounds may, in fact, be attained. For example, it is clear
from Ex. 4.6(i) that if B is the finite-cofinite algebra of an infinite set,
then |B| = |SB|, while a well-known result of Tarski [1939] asserts that
if B is an infinite power set algebra, then |SB| = 2^|B|.

4.9. Problem. Show that a Boolean algebra is finite iff its Stone space
is discrete, and hence iff its Stone space is finite.
4.10. Problem. Let B be a Boolean algebra, and let A be a subalgebra
of B. Show that A is a proper subalgebra of B iff there are distinct
ultrafilters U, U' in B such that U ∩ A = U' ∩ A. (Let A be a proper subalgebra
of B; let u be the isomorphism of B onto CSB. Then u[A] is a proper
subalgebra of CSB; by Lemma 4.7, u[A] is not separating; the conclusion
now follows easily.)
4.11. Problem. (i) Show that, for each infinite Boolean space X, there
is a (countably) infinite collection of mutually disjoint clopen subsets of X.
(ii) Deduce from (i) that each infinite Boolean algebra has a (countably)
infinite antichain. (Prob. 3.18.)
4.12. Problem. Prove the so-called dual form of the Stone representation
theorem: each Boolean space is homeomorphic to the Stone space of
its clopen algebra. (Let X be a Boolean space. Show that the map
v: X → SCX defined by v(x) = {A ∈ CX: x ∈ A} for each x ∈ X is a
homeomorphism.)
4.13. Problem. A Boolean algebra is said to be σ-complete if each
countable subset has a join and a meet. A subset of a Boolean space is
called σ-open if it is the union of countably many clopen sets. A Boolean
space is said to be σ-disconnected if the closure of each σ-open set is open.
Show that a Boolean algebra is σ-complete iff its Stone space is
σ-disconnected. (Argue as in the proof of Thm. 4.4.)
4.14. Problem. A topological space is said to be totally disconnected
if no connected subset has more than one point. Show that any Boolean
space is totally disconnected.¹
*4.15. Problem. A subset U of a topological space X is said to be regular
open if U = (U⁻)°, i.e. if U coincides with the interior of its closure. Show
that the family RX of all regular open subsets of X forms a complete Boolean
algebra under the partial ordering of inclusion. (If {U_i: i ∈ I} ⊆ RX,
show that, in RX, ⋁_{i∈I} U_i = ((⋃_{i∈I} U_i)⁻)° and ⋀_{i∈I} U_i = (⋂_{i∈I} U_i)°, while
U* = X − U⁻ for U ∈ RX.) RX is called the regular open algebra of X.
(ii) Show that, if X is a Boolean space, then CX is a dense subalgebra
(Prob. 3.15(ii)) of RX. Deduce that, if {U_i: i ∈ I} is a subfamily of CX
which has a supremum (infimum) in CX, then this supremum (infimum)
is the same as the supremum (infimum) of {U_i: i ∈ I} in RX.
(iii) Let X be a topological space in which every family of disjoint open
sets is countable, and let ℬ be a base for X. Show that |RX| ≤ |ℬ|^ℵ₀.
(Using Zorn's lemma, for each U ∈ RX let ℬ_U be a maximal family of
disjoint members of PU ∩ ℬ. Now show that U = ((⋃ℬ_U)⁻)°, and observe
that there are at most |ℬ|^ℵ₀ families of the form ℬ_U.)
(iv) If B is a Boolean algebra satisfying the countable chain condition
(Prob. 3.18) show that |RSB| ≤ |B|^ℵ₀. (Use (iii).)
*4.16. Problem. Show that a complete Boolean algebra is isomorphic
to the regular open algebra of an extremally disconnected space. (Use
Thm. 4.4 and Prob. 4.15.)
¹ The converse holds for any compact Hausdorff space; see Gillman and Jerison
[1960], Thm. 16.17.

*4.17. Problem. Let B be a Boolean algebra, and let A be a subset of B.
Then B is said to be freely generated by A, and A is said to be a free set
of generators for B, if A generates B and any mapping f of A into any
Boolean algebra B' can be extended to a homomorphism f' of B into B'.
A Boolean algebra is said to be free if it is freely generated by some subset.
(i) Show that, if B is freely generated by A and f is a mapping from A
into a Boolean algebra B', then the homomorphism f' extending f is
unique. (Let f'' be another extension; consider {x ∈ B: f'(x) = f''(x)}.)
(ii) Let B and B' be Boolean algebras freely generated by X and X'
respectively. Show that B is isomorphic to B' iff |X| = |X'|. (Use (i),
and Prob. 2.4(iii).)
*4.18. Problem (Cantor spaces). Let X be a set. The Cantor space
over X is the product space 2^X with the product topology (where 2 = {0, 1}
is assigned the discrete topology).
(i) Show that any Cantor space is a Boolean space. (Use Tychonoff's
theorem that the product of compact spaces is compact.)
(ii) Show that the Cantor ternary set (i.e. the set of real numbers in the
closed unit interval, which have a triadic expansion without the digit 1, with
the topology inherited from the closed unit interval) is homeomorphic
to the Cantor space 2^ω, where ω is the set of natural numbers. (Map
each f ∈ 2^ω onto the real number Σ_{i=1}^∞ 2f(i)·3^{-i}.)
(iii) Show that, for each X, the clopen algebra of the Cantor space
2^X is freely generated by a set of the same cardinality as X. In particular,
the minimal algebra 2 is freely generated by ∅. (Let B be the clopen algebra
of 2^X, and for each x ∈ X define

j(x) = {f ∈ 2^X: f(x) = 1}.

Let Y = {j(x): x ∈ X}; then |Y| = |X|, and Y generates B. To show that
Y freely generates B, suppose that h is a mapping of Y into a Boolean
algebra B'. Identify B' with the clopen algebra of its Stone space SB'.
Define f: SB' → 2^X by

f(y)(x) = 1 if y ∈ h(j(x)),
f(y)(x) = 0 if y ∉ h(j(x)).

Show that f is continuous, and that, if one defines h': B → B' by
h'(F) = f⁻¹[F], then h' is a homomorphism extending h to B.)
(iv) Show that each Boolean algebra is the image of a free algebra
under a homomorphism. (Use (iii) to show that B is a homomorphic
image of the clopen algebra of 2^B.)
(v) Show that, if n is finite, the free Boolean algebra generated by a set
of n elements has 2^(2^n) elements. (Use (iii).) Deduce that a finite Boolean
algebra is free iff it has 2^(2^n) elements for some n. (For sufficiency, use
Thm. 4.5(ii).)
(vi) Let X be a subset of a Boolean algebra B which generates B. Show
that the following conditions are equivalent:
(a) B is freely generated by X;
(b) each mapping of X into 2 can be extended to a homomorphism
of B into 2;
(c) for any distinct elements x_1, ..., x_n of X, we have x_1' ∧ ... ∧ x_n' ≠ 0,
where x_i' is either x_i or x_i*.
(For (c)⇒(a), show that, using Prob. 2.4(ii), B is isomorphic to the
free Boolean algebra generated by a set of the same cardinality as X whose
existence is proved in (iii).)
(vii) Let ℒ be a first-order language. Show that the propositional
Lindenbaum algebra of ℒ is freely generated by the set

{|α|: α is a prime formula of ℒ}.

(Use (vi).)
*4.19. Problem. Let B be a Boolean algebra freely generated by a subset X.
(i) Let Y ⊆ X, x ∈ X − Y, let B' be the subalgebra of B generated by Y,
and let B'' be the subalgebra of B generated by B' ∪ {x}. Show that each
element y of B'' can be uniquely expressed in the form

y = (a ∧ x) ∨ (b ∧ x*)

with a, b ∈ B'. (Use Probs. 2.4(ii) and 4.18(vi)(c).)
(ii) With the assumptions and notation of (i), show that each
positive measure μ on B' can be extended to a positive measure μ' on B''.
(Let r be any real number such that 0 < r < 1, and for y ∈ B'' put

μ'(y) = r·μ(a) + (1 − r)·μ(b),

where y = (a ∧ x) ∨ (b ∧ x*) with a, b ∈ B'.)
(iii) Show that B admits a positive measure. (Apply Zorn's lemma
and (ii) to the set {μ: μ is a positive measure on the subalgebra of B generated
by a subset of X}, partially ordered by inclusion.)
(iv) Show that any free Boolean algebra (in particular, the clopen
algebra of any Cantor space) satisfies the countable chain condition.
(Use (iii) and Prob. 3.19.)
(v) Let I be an infinite set, and let X be the Cantor space 2^I. Show
that |RX| ≤ |I|^ℵ₀. (For the definition of RX, see Prob. 4.15. Use (iv) and
Prob. 4.15(iv).)

§ 5. Atoms

An atom in a Boolean algebra B is a minimal non-zero element. In other
words, an element x of B is an atom iff x ≠ 0 and for all y ∈ B, if y ≤ x,
then y = x or y = 0. It is easy to see that x is an atom in B iff for each
y ∈ B exactly one of x ≤ y or x ≤ y* holds.
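
The equivalence of the two descriptions of an atom can be checked mechanically in a small case. The sketch below is an illustration under assumptions chosen here: it uses the divisors of 30 (ordered by divisibility, with zero element 1 and x* = 30/x), and the function names are hypothetical.

```python
N = 30
B = [d for d in range(1, N + 1) if N % d == 0]

def leq(x, y):
    return y % x == 0          # x <= y means x divides y

def comp(x):
    return N // x

def is_atom(x):
    """Minimal non-zero element; the zero of this algebra is 1."""
    return x != 1 and all(y in (1, x) for y in B if leq(y, x))

def splits(x):
    """For every y, exactly one of x <= y, x <= y* holds."""
    return all(leq(x, y) != leq(x, comp(y)) for y in B)

atoms = [x for x in B if is_atom(x)]
print(atoms)                                     # [2, 3, 5]
print(all(is_atom(x) == splits(x) for x in B))   # True
```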

5.1. Lemma. A non-zero element x of a Boolean algebra B is an atom
iff the filter {y ∈ B: x ≤ y} generated by x is an ultrafilter. ∎

5.2. Theorem. Let U be an ultrafilter in a Boolean algebra B. Then the
following conditions are equivalent:
(i) U is generated by an atom;
(ii) U is an isolated point in SB (i.e. {U} is open in SB).
Proof. (i)⇒(ii). Suppose U = {x ∈ B: a ≤ x} for some atom a. Let u be
the isomorphism of B onto the clopen algebra of SB defined in Thm. 4.1.
Then, using Lemma 5.1, we see immediately that u(a) = {U}, so that {U}
is open and U is isolated.
(ii)⇒(i). Suppose that U is an isolated point of SB. Then {U} is a clopen
subset of SB so that {U} = u(a) for some a ∈ B. Let F = {x ∈ B: a ≤ x}
be the filter generated by a. We claim that F = U. Clearly F ⊆ U. On the
other hand, if F ≠ U, then there is x ∈ U such that a ≰ x, so that a ∧ x* ≠ 0.
Accordingly there is an ultrafilter U' in B containing a ∧ x*, hence both
a and x*. But then U' ∈ u(a) and U' ≠ U, contradicting the fact that
u(a) = {U}. Hence F = U. By Lemma 5.1, a is then an atom and it generates U. ∎

5.3. Corollary. A Boolean algebra B is finite iff every ultrafilter in B
is principal.
Proof. B is finite ⟺ (by Prob. 4.9) SB is discrete ⟺ every point of SB
is isolated ⟺ (by Thm. 5.2) every ultrafilter in B is principal.

A Boolean algebra B is said to be atomic if for each x ∈ B, x ≠ 0, there
is an atom a ∈ B such that a ≤ x. At the other extreme, B is said to be
atomless if it contains no atoms at all.
Notice that any power set algebra is atomic. (What are the atoms in
such an algebra?) Also, it is clear that every finite Boolean algebra is
atomic. On the other hand, the field of subsets of the real line generated
by intervals of the form [x, +∞) is easily seen to be an atomless Boolean
algebra.
We pointed out at the beginning of §4 that not every Boolean algebra
B is isomorphic to a power set algebra. But notice that if B is isomorphic
to such an algebra, it must be complete and atomic, since power set
algebras have both these properties. We now prove the converse.

5.4. Theorem. Any complete atomic Boolean algebra is isomorphic to
a power set algebra.
Proof. Let B be a complete atomic algebra, and let A be its set of atoms.
We show that B ≅ PA. To this end, define h: B → PA by

h(x) = {a ∈ A: a ≤ x}

for each x ∈ B. We claim that h is an isomorphism of B onto PA.
First, h is a homomorphism. For if x, y ∈ B, then for all a ∈ A we have

a ∈ h(x ∧ y) ⟺ a ≤ x ∧ y ⟺ a ≤ x & a ≤ y ⟺ a ∈ h(x) ∩ h(y).

Hence h(x ∧ y) = h(x) ∩ h(y). Also,

a ∈ h(x*) ⟺ a ≤ x* ⟺ (since a is an atom) a ≰ x ⟺ a ∈ A − h(x).

Hence h(x*) = A − h(x). It follows that h is a homomorphism.
Next, h is one-one. For if x ≠ y in B, then either x ≰ y or y ≰ x; assume
the latter. Then x* ∧ y ≠ 0, so, since B is atomic, there is a ∈ A such that
a ≤ x* ∧ y. Hence a ≤ x* and a ≤ y, so a ≰ x and a ≤ y. It follows that
a ∈ h(y) − h(x), so that h(x) ≠ h(y).
Finally, h is onto. For if X ∈ PA, let x = ⋁X. (Recall that we have
assumed that B is complete!) We claim that X = h(x). For if a ∈ X, then
by definition a ≤ x so that a ∈ h(x), and accordingly X ⊆ h(x). Conversely,
if a ∈ A − X, then, by the definition of an atom we have a ∧ a' = 0 for all
a' ∈ X, so that a' ≤ a* for all a' ∈ X. Therefore x = ⋁X ≤ a*, so, since a ≠ 0,
we have a ≰ x, whence a ∉ h(x). Therefore h(x) ⊆ X and our claim is
established, completing the proof. ∎
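
The three steps of the proof (homomorphism, one-one, onto) can be traced in a finite example, where completeness is automatic. The sketch below assumes the algebra of divisors of 210 with meet = gcd and x* = 210/x; this example and the names `h` and `join_all` are choices made here for illustration. The atoms are the primes 2, 3, 5, 7, and h(x) is the set of primes dividing x.

```python
from itertools import combinations
from math import gcd, prod

N = 210                                   # 2 * 3 * 5 * 7, squarefree
B = [d for d in range(1, N + 1) if N % d == 0]

# Atoms: non-zero elements (zero is 1) with nothing strictly between.
A = [p for p in B if p != 1 and all(d in (1, p) for d in B if p % d == 0)]

def h(x):
    """h(x) = set of atoms below x (here: the prime divisors of x)."""
    return frozenset(a for a in A if x % a == 0)

def join_all(X):
    """Supremum of a set of atoms; the lcm of distinct primes is their product."""
    return prod(X)                        # the empty product is 1, the zero of B

PA = [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]

is_hom = all(h(gcd(x, y)) == h(x) & h(y) and h(N // x) == frozenset(A) - h(x)
             for x in B for y in B)
is_one_one = len({h(x) for x in B}) == len(B)
is_onto = all(h(join_all(X)) == X for X in PA)
print(A, is_hom, is_one_one, is_onto)     # [2, 3, 5, 7] True True True
```

The last check mirrors the "onto" step of the proof: each set X of atoms is recovered as h(⋁X).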

5.5. Problem. Let B be a Boolean algebra. Show that a non-zero element
a of B is an atom iff whenever a = x ∨ y with x ∧ y = 0, then x = 0 or y = 0.
(The term "atom" derives from this property.)
5.6. Problem. Let B be a Boolean algebra, and let A be the set of all atoms
in B. Show that B is atomic iff 1 is the only upper bound for A in B.
*5.7. Problem. Show that the regular open algebra (Prob. 4.15) of the
closed unit interval [0, 1] is atomless. (This furnishes an example of an
atomless complete Boolean algebra.)
*5.8. Problem. A complete Boolean algebra B is said to be completely
distributive if for every doubly indexed subset {x_{ij}: (i, j) ∈ I × J} of B
we have

⋀_{i∈I} ⋁_{j∈J} x_{ij} = ⋁_{f∈J^I} ⋀_{i∈I} x_{i f(i)},

where J^I is the set of all mappings of I into J. Show that
the following conditions on a Boolean algebra B are equivalent:
(i) B is complete and completely distributive;
(ii) B is complete and atomic;
(iii) B is isomorphic to a power set algebra.
(To establish (i)⇒(ii), let B = {x_i: i ∈ I} and for each i ∈ I define x_{i0} = x_i
and x_{i1} = x_i*. Observe that 1 = ⋀_{i∈I} (x_{i0} ∨ x_{i1}) = ⋁_{f∈2^I} ⋀_{i∈I} x_{i f(i)} and that
⋀_{i∈I} x_{i f(i)} is either 0 or an atom for each f ∈ 2^I, and apply the result of
Prob. 5.6.)
5.9. Problem. Show that a Boolean algebra is atomic iff its Stone space
has a dense subset of isolated points.
5.10. Problem. Let {B_i: i ∈ I} be a family of Boolean algebras. The
product of the family is the set ∏_{i∈I} B_i with the Boolean operations defined
pointwise, i.e. for all f, g ∈ ∏_{i∈I} B_i, f ∧ g, f ∨ g, f* are defined by
(f ∧ g)(i) = f(i) ∧ g(i),
(f ∨ g)(i) = f(i) ∨ g(i),
f*(i) = f(i)*,
for all i ∈ I.
(i) Show that, if each B_i is complete, or σ-complete, then so is the
product ∏_{i∈I} B_i.
(ii) Show that the product of atomic Boolean algebras is atomic. Is
the converse true? That is, if a product is atomic, must each factor be
atomic?
(iii) Show that each Boolean algebra is isomorphic to a subalgebra of
a product of 2-element algebras. (This is the Stone representation theorem
in disguise. Show that any Boolean algebra B can be embedded in 2^SB.)
5.11. Problem. Give a topological proof that each complete atomic
Boolean algebra B is isomorphic to a power set algebra. (Use the fact
that the Stone space X of B is extremally disconnected (4.4) and has a
dense subset I of isolated points to show that the map U ↦ U ∩ I, for
U ∈ CX, is an isomorphism of CX with PI.)
5.12. Problem. A one-one homomorphism between Boolean algebras
is called a monomorphism. If κ is a cardinal, a Boolean algebra B is said
to be κ-universal if for each Boolean algebra A of cardinality ≤ κ there
is a monomorphism of A into B.
(i) Let B be a complete Boolean algebra, and let κ be an infinite cardinal.
Show that the following conditions are equivalent:
(a) B is κ-universal;
(b) for each set X of cardinality ≤ κ, there is a monomorphism of
the finite-cofinite algebra (2.2(iv)) of X into B;
(c) B has an antichain (Prob. 3.18) of cardinality κ.
(To prove (c)⇒(a), first show that, assuming (c), B has an antichain
{b_ξ: ξ < κ} such that b_ξ ≠ 0 for all ξ < κ and ⋁_{ξ<κ} b_ξ = 1. Let A be a Boolean
algebra of cardinality ≤ κ; let {a_ξ: ξ < κ} be an enumeration (possibly
with repetitions) of the non-zero elements of A, and for each ξ < κ let
U_ξ be an ultrafilter in A containing a_ξ. Show the map h: A → B defined
by h(x) = ⋁{b_ξ: x ∈ U_ξ} for x ∈ A is a monomorphism.)
(ii) Deduce that every infinite complete Boolean algebra is ℵ₀-universal.
(Use Prob. 4.11 and (i).)
5.13. Problem. Show that, if the Boolean algebra B is not atomic, then
|SB| ≥ 2^ℵ₀. Thus, if SB is countable, B is atomic. (If B is not atomic,
then there is an element b ≠ 0 of B such that no element x ≤ b of B is an
atom. Using this fact and Prob. 5.5, construct for each f ∈ 2^ω a subset
X_f ⊆ {x: x ≤ b} such that X_f has the f.m.p. and for f ≠ f' there exist x ∈ X_f
and y ∈ X_{f'} such that x ∧ y = 0.)

*§ 6. Duality for homomorphisms and continuous mappings

In this section we show that there is a natural correspondence between
homomorphisms of Boolean algebras and continuous mappings of Boolean
spaces.
Consider two Boolean algebras B, B', and a homomorphism h of B
into B'. For each ultrafilter U in B', it is easy to see that h⁻¹[U] is an
ultrafilter in B, so we can define a mapping h^#: SB' → SB by putting

h^#(U) = h⁻¹[U]

for each U ∈ SB'. The mapping h^# is called the dual of the homomorphism h.
Let u and u' be the isomorphisms of B, B' onto the clopen algebras
of SB, SB', respectively.

6.1. Theorem. If h is a homomorphism of B into B', h^# is a continuous
mapping of SB' into SB.

Proof. It suffices to show that the inverse image under h^# of a clopen
set in SB is clopen in SB'. Now each clopen set in SB is of the form u(x)
for x ∈ B, and we have

(h^#)⁻¹[u(x)] = {U ∈ SB': h^#(U) ∈ u(x)}
              = {U ∈ SB': h⁻¹[U] ∈ u(x)}
              = {U ∈ SB': x ∈ h⁻¹[U]}
              = {U ∈ SB': h(x) ∈ U}
              = u'(h(x)),

which is, by definition, clopen in SB'. ∎
Now consider two Boolean spaces X, X', and a continuous mapping
φ of X into X'. If V is a clopen subset of X', then φ⁻¹[V] is a clopen
subset of X, so we can define a mapping φ♮ : CX' → CX by putting

φ♮(V) = φ⁻¹[V]

for each V ∈ CX'. The mapping φ♮ is called the dual of φ; it is easy to
see that φ♮ is a homomorphism of CX' into CX.
If h is a homomorphism of a Boolean algebra B into a Boolean algebra B',
then, by Thm. 6.1, its dual h# is a continuous mapping of SB' into SB,
and so the dual h#♮ of h# is a homomorphism of CSB into CSB'. We
thus obtain a diagram

                 h
            B ------> B'
            |         |
(6.2)     u |         | u'
            v         v
           CSB -----> CSB'
                h#♮

By the proof of Thm. 6.1 and the definition of h#♮ we have, for each x ∈ B,

h#♮(u(x)) = (h#)⁻¹[u(x)] = u'(h(x)),

so that (6.2) commutes. Therefore, if we identify B with CSB (via u) and
B' with CSB' (via u'), h is to be identified with its "second dual" h#♮.
The situation for Boolean spaces is similar. Thus let X and X' be Boolean
spaces, let φ be a continuous mapping of X into X' and let v and v' be
the homeomorphisms of X and X' onto SCX and SCX' given in Prob. 4.11.
(These homeomorphisms are defined by

v(x) = {V ∈ CX: x ∈ V},

v'(x') = {V' ∈ CX': x' ∈ V'}

for x ∈ X and x' ∈ X'.) The dual φ♮ of φ is a homomorphism of CX' into
CX and so, by Thm. 6.1, the dual φ♮# of φ♮ is a continuous mapping
of SCX into SCX'. We therefore obtain a diagram

                 φ
            X ------> X'
            |         |
(6.3)     v |         | v'
            v         v
           SCX -----> SCX'
                φ♮#

By the definitions of φ♮#, v and v', we have, for each x ∈ X,

φ♮#(v(x)) = (φ♮)⁻¹[v(x)]
          = (φ♮)⁻¹[{V ∈ CX: x ∈ V}]
          = {V' ∈ CX': x ∈ φ♮(V')}
          = {V' ∈ CX': x ∈ φ⁻¹[V']}
          = {V' ∈ CX': φ(x) ∈ V'}
          = v'(φ(x)).

Accordingly (6.3) commutes. Therefore, if we identify X with SCX (via v)
and X' with SCX' (via v'), then φ is to be identified with its "second dual" φ♮#.

We now prove:

6.4. Theorem. (i) Let h be a homomorphism of a Boolean algebra B into a
Boolean algebra B'. Then h is one-one iff h# is onto, and h is onto iff h# is
one-one.
(ii) Let φ be a continuous mapping of a Boolean space X into a Boolean
space X'. Then φ is one-one iff φ♮ is onto, and φ is onto iff φ♮ is one-one.
Proof. (i) For simplicity put SB = X, SB' = Y. Then each of the following
statements is equivalent to its neighbours:
(a) h# is onto;
(b) X − h#[Y] = ∅;
(c) X − h#[Y] includes no non-empty clopen set;
(d) if A ⊆ X is clopen and (h#)⁻¹[A] = ∅, then A = ∅;
(e) if A ∈ CX and h#♮(A) = ∅, then A = ∅;
(f) h#♮ is one-one;
(g) h is one-one.
The only equivalences here which require justification are (b) ⇔ (c) and
(e) ⇔ (f). For the first of these we merely observe that Y is compact,
so h#[Y] is compact, hence closed, so that X − h#[Y] is open. For the
second we appeal to the fact that h#♮ is a homomorphism and to Prob. 3.2.
This proves the first part of (i). As for the second, again notice that
each of the following statements is equivalent to its neighbours:
(a') h# is one-one;
(b') {(h#)⁻¹[A]: A ∈ CX} is a separating field of (clopen) subsets of Y;
(c') CY = {(h#)⁻¹[A]: A ∈ CX};
(d') CY = {h#♮(A): A ∈ CX};
(e') h#♮ is onto;
(f') h is onto.
To see that (a') ⇔ (b'), first assume (a'). Then if x≠y in Y, we have
h#(x) ≠ h#(y), so that, since X is a Boolean space, there is A ∈ CX such
that h#(x) ∈ A and h#(y) ∉ A. Hence x ∈ (h#)⁻¹[A], y ∉ (h#)⁻¹[A] and (b') holds.
Conversely, assume (b'). Then if x≠y in Y, there is A ∈ CX such that
x ∈ (h#)⁻¹[A] and y ∉ (h#)⁻¹[A], so that, a fortiori, h#(x) ≠ h#(y), and (a') holds.
(b') ⇔ (c') is an immediate consequence of Lemma 4.7, and the other
equivalences are clear. This proves (i).
(ii) is a simple consequence of (i). For φ is one-one (resp. onto) iff
φ♮# is one-one (resp. onto), and by (i) this latter condition holds iff φ♮ is
onto (resp. one-one). ∎
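
The duality can be checked by hand in the simplest case, that of finite
power-set algebras, where every ultrafilter is principal and so SB may be
identified with the underlying set of points. The following Python sketch is
an illustration only and is not part of the original text: the helper names
(powerset, ultrafilters, induced, dual, one_one) and the two sample maps g1,
g2 are assumptions made for the example. It computes h#(U) = h⁻¹[U] directly
from the definition and exhibits the exchange of "one-one" and "onto"
asserted in Thm. 6.4(i).

```python
from itertools import combinations

def powerset(points):
    """All subsets of `points` as frozensets: the elements of the power-set algebra."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def ultrafilters(points):
    """Stone space of a finite power-set algebra: one principal ultrafilter per point."""
    algebra = powerset(points)
    return {p: frozenset(S for S in algebra if p in S) for p in points}

def induced(g):
    """Homomorphism h(S) = g^{-1}[S] induced by a map g given as a dict (hypothetical helper)."""
    return lambda S: frozenset(q for q in g if g[q] in S)

def dual(h, source_points, target_points):
    """h#(U) = h^{-1}[U]; ultrafilters are named by the points generating them."""
    B = powerset(source_points)
    SB = ultrafilters(source_points)
    result = {}
    for q, U in ultrafilters(target_points).items():
        preimage = frozenset(x for x in B if h(x) in U)
        # The preimage of an ultrafilter is again an ultrafilter, so exactly one p matches.
        (p,) = [p for p, V in SB.items() if V == preimage]
        result[q] = p
    return result

def one_one(values):
    values = list(values)
    return len(set(values)) == len(values)

# Case 1: g1 maps {0,1,2} onto {0,1}, so h1 : P({0,1}) -> P({0,1,2}) is one-one.
g1 = {0: 0, 1: 0, 2: 1}
h1 = induced(g1)
d1 = dual(h1, [0, 1], [0, 1, 2])
h1_one_one = one_one(h1(x) for x in powerset([0, 1]))
d1_onto = set(d1.values()) == {0, 1}

# Case 2: g2 maps {0} one-one into {0,1}, so h2 : P({0,1}) -> P({0}) is onto.
g2 = {0: 1}
h2 = induced(g2)
d2 = dual(h2, [0, 1], [0])
h2_onto = {h2(x) for x in powerset([0, 1])} == set(powerset([0]))
d2_one_one = one_one(d2.values())
```

In both cases the computed dual coincides with the point map that induced
the homomorphism, which is the finite shadow of the identification of φ
with φ♮#.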

We may sum up the results of this section rather concisely using the
language of category theory. Let ℬ be the category whose objects consist
of all Boolean algebras and whose morphisms consist of all homomorphisms
between these algebras, and let 𝒮 be the category whose objects consist
of all Boolean spaces and whose morphisms consist of all continuous
mappings between these spaces. Let D (for duality!) be the mapping
of ℬ into 𝒮 which assigns SB to each object B of ℬ and h# to each
morphism h of ℬ, and let D' be the mapping of 𝒮 into ℬ which assigns
CX to each object X of 𝒮 and φ♮ to each morphism φ of 𝒮. It is easy
to verify that, if h, h' are two composable morphisms of ℬ, and φ, φ'
are two composable morphisms of 𝒮, then

(h' ∘ h)# = h# ∘ h'#,

(φ' ∘ φ)♮ = φ♮ ∘ φ'♮.

Thus D and D' are contravariant functors of ℬ into 𝒮 and 𝒮 into ℬ
respectively. Thm. 6.4 implies that D and D' transform monomorphisms
into epimorphisms, and conversely. Moreover, the commutativity of
diagrams (6.2) and (6.3) shows that D and D' are mutually quasi-inverse
functors, so that ℬ and 𝒮 are anti-equivalent categories.

6.5. Problem. Let B and B' be Boolean algebras.
(i) Show that B is isomorphic to a subalgebra of B' iff SB is a continuous
image of SB'.
(ii) Show that B is isomorphic to a quotient of B' iff SB is
homeomorphic to a (necessarily closed) subspace of SB'. (Apply Thm. 6.4.)
6.6. Problem. Let F be a filter in a Boolean algebra B. The dual F̂ of
F is the closed subset ⋂{u(x): x ∈ F} of SB, where u is the natural
isomorphism between B and CSB.
(i) Show that F ↦ F̂ is a one-one mapping of the set of all filters in
B onto the family of all non-empty closed sets in SB.
(ii) Show that F̂, regarded as a subspace of SB, is homeomorphic to
S(B/F). (Apply Thm. 6.4 to the canonical homomorphism h: B → B/F.)

§ 7. The Rasiowa-Sikorski Theorem

We conclude our discussion of Boolean algebras with a result which has
important applications in model theory (Ch. 5, §§5, 6).
Let B be a Boolean algebra, and let T be a subset of B which has a
join ⋁T. An ultrafilter U in B is said to respect T (or the join ⋁T) if
we have

⋁T ∈ U ⇒ T ∩ U ≠ ∅.

Clearly, for any ultrafilter (in fact any filter) U we have

T ∩ U ≠ ∅ ⇒ ⋁T ∈ U.

Thus U respects T iff

⋁T ∈ U ⇔ T ∩ U ≠ ∅.

If 𝒯 is a family of subsets of B, each member of which has a join, we say
that U respects 𝒯 (or the family of joins {⋁T: T ∈ 𝒯}) if U respects each
member of 𝒯.
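
The definition of "respects" can be tried out in a finite power-set algebra,
where the join of T is simply the union of its members. The Python sketch
below is an illustration under assumptions of our own (the three-point set X
and the helper names powerset, join and respects are not from the text). It
is a finite sanity check rather than a proof: it confirms that each of the
three ultrafilters of P(X) respects every subset of the algebra, and that a
filter which is not an ultrafilter may fail to do so.

```python
from itertools import combinations

def powerset(items):
    """All subsets of `items` as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def join(T):
    """In a power-set algebra the join of T is the union of its members."""
    return frozenset().union(*T)

def respects(U, T):
    """U respects T: if the join of T lies in U, then some member of T lies in U."""
    return join(T) not in U or any(t in U for t in T)

X = [0, 1, 2]
algebra = powerset(X)        # the 8 elements of P(X)
families = powerset(algebra) # all 256 subsets T of the algebra
principal = [frozenset(S for S in algebra if p in S) for p in X]

all_ultrafilters_respect = all(respects(U, T) for U in principal for T in families)

# A filter that is not an ultrafilter can fail: F = {X}, T = the set of singletons.
F = frozenset([frozenset(X)])
singletons = frozenset(frozenset([p]) for p in X)
trivial_filter_respects = respects(F, singletons)
```

Here the join of the singletons is X itself, which belongs to F, while no
singleton belongs to F; this is why the implication ⋁T ∈ U ⇒ T ∩ U ≠ ∅ is a
genuine condition and not a property of filters in general.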

7.1. Problem. Show that, if each member of 𝒯 is finite, then every
ultrafilter respects 𝒯.

Let h be the canonical homomorphism of B onto B/U ≅ 2. It is easy
to see that U respects 𝒯 iff h(⋁T) = ⋁h[T] for all T ∈ 𝒯. A two-valued
homomorphism h satisfying this condition is called a 𝒯-complete
homomorphism.
We now ask: given a Boolean algebra B and a family 𝒯 of subsets
of B each member of which has a join, is there always an ultrafilter in
B which respects 𝒯? In general the answer is no, in view of:

7.2. Theorem. Let B be a complete Boolean algebra. Then there is a
natural one-one correspondence between atoms of B and ultrafilters in B
which respect PB, and hence also between atoms of B and PB-complete
2-valued homomorphisms on B.
Proof. We give the merest sketch. For each atom a ∈ B, let U_a be the
ultrafilter generated by a; then it is easy to verify that U_a respects PB.
Conversely, let U be an ultrafilter which respects PB and let a = ⋀U.
Setting U* = {x*: x ∈ U}, we see that, by Problem 2.6, a* = ⋁U*. Since
U ∩ U* = ∅ and U respects PB, it follows that a* ∉ U, whence a ∈ U.
Therefore a generates U, so that, by Lemma 5.1, a is an atom. This shows
that the mapping a ↦ U_a is a bijection of the set of atoms of B onto the set
of ultrafilters in B which respect PB. ∎

It follows from Thm. 7.2 that if B is an atomless complete Boolean
algebra (e.g. the regular open algebra of [0, 1]; cf. Probs. 4.11 and 5.7),
then there is no ultrafilter in B which respects PB. However, since a complete
atomless Boolean algebra B must be infinite (every finite Boolean algebra
being atomic), PB is uncountable. Thus, in general, there may be no
ultrafilter respecting a given uncountable family 𝒯 of subsets of a Boolean
algebra. But the situation is quite different when 𝒯 is countable, as our
final result shows.

7.3. The Rasiowa-Sikorski Theorem. Let B be a Boolean algebra and let
𝒯 be a countable family of subsets of B, each member of which has a join
in B. Then there is an ultrafilter in B which respects 𝒯.
Proof. Enumerate 𝒯 as {T_n: n ∈ ω}, and let t_n = ⋁T_n for each n ∈ ω.
We define by induction a sequence {b_n: n ∈ ω} of elements of B such that,
for each n ∈ ω, b_n ∈ T_n and the set {t_0* ∨ b_0, ..., t_n* ∨ b_n} has a non-zero meet.
Suppose that n ∈ ω and that for each m < n we have found b_m to satisfy
these conditions. If n = 0, let

y = 1

and if n > 0 let

y = (t_0* ∨ b_0) ∧ ... ∧ (t_{n−1}* ∨ b_{n−1}).

Then if n = 0 we have y ≠ 0 by assumption, and if n > 0 we have y ≠ 0 by
induction hypothesis. Suppose now, for contradiction's sake, that
y ∧ (t_n* ∨ b) = 0 for all b ∈ T_n. Then

(y ∧ t_n*) ∨ (y ∧ b) = 0 for all b ∈ T_n,

so that y ∧ t_n* = 0 and y ∧ b = 0 for all b ∈ T_n. It follows that y ≤ b* for all
b ∈ T_n so that by Prob. 2.6

y ≤ ⋀{b*: b ∈ T_n} = (⋁T_n)* = t_n*.

Thus y = y ∧ t_n* = 0, contradicting the induction hypothesis.
Accordingly, we can find b_n to satisfy the required conditions, and
therefore such a b_n can be found for each n ∈ ω. Then the set
{t_0* ∨ b_0, ..., t_n* ∨ b_n, ...} has the f.m.p. and is therefore included in an
ultrafilter U in B. We show that U respects 𝒯. If t_n ∈ U then, since t_n* ∨ b_n ∈ U
by construction, it follows that

b_n = t_n ∧ b_n = t_n ∧ (t_n* ∨ b_n) ∈ U,

so that T_n ∩ U ≠ ∅. Thus U respects 𝒯 and we are finished. ∎

*7.4. Problem. The Baire category theorem asserts that, if 𝒱 is a countable
family of dense open subsets of a compact Hausdorff space, then ⋂𝒱
is dense. Derive the following strong form of the Rasiowa-Sikorski
Theorem from the Baire category theorem: for any Boolean algebra B,
any countable family 𝒯 of subsets of B, each member of which has a
join in B, and any non-zero x in B, there is an ultrafilter in B which
contains x and respects 𝒯. (For each n ∈ ω let Q_n = {U ∈ SB: U respects T_n},
where 𝒯 = {T_n: n ∈ ω}; show that Q_n is dense and open in SB and apply
the Baire category theorem.)

§ 8. Historical and bibliographical remarks

Boolean algebras are named after the English mathematician George
Boole (1815-1864). In 1847 he made the first successful attempt to apply
mathematical techniques to logic. The equivalence between Boolean
algebras and complemented distributive lattices was first formulated
by Huntington in 1904. The Ultrafilter Theorem is due to Tarski [1930];
it is often stated in terms of ideals and is then known as the Boolean
prime ideal theorem. Lemma 3.11 is due to Hirschfeld and appears in
Machover and Hirschfeld [1969]. The Stone Representation Theorem
is to be found in Stone [1936] and the first application of topological
methods to the theory of Boolean algebras in Stone [1937]. Thm. 4.8
is due to Makinson [1969]; the proof we give was suggested by Brian
Rotman. The Rasiowa-Sikorski Theorem was first proved in Rasiowa
and Sikorski [1951], using topological methods (see Prob. 7.4). The
simple proof we give is due to Tarski (see Feferman [1952]).
Good introductions to the theory of Boolean algebras are to be found
in Dwinger [1961] and Halmos [1963]. More advanced treatises include
Birkhoff [1967] and Sikorski [1964].

CHAPTER 5

MODEL THEORY

The theme of this chapter, model theory, is the relationship between
sets of first-order sentences and the structures in which they are satisfied,
i.e. their models. In particular we shall be concerned with the various
methods by which models with prescribed properties can be constructed.
In §1 we set up a natural framework for discussing models; and in §2 we
prove the basic results on the existence of models of prescribed cardinalities.
In §3 we introduce an important method of constructing models — the
ultraproduct construction — which is a modification of the familiar algebraic
procedure of forming direct products. In §4 we discuss the extent to which
a set of sentences determines the properties of its models, and apply our
results to specific formal mathematical theories. In §5 we show how the
theory of Boolean algebras is connected with the theory of models, and
in §6 we discuss the role played in the theory by formulas with free variables.
Finally, in §7, we introduce models generated by Skolem functions and
prove the existence of models with many automorphisms. (§§6 and 7
are of a more specialized character than the other sections and may be
omitted at a first reading.) Acquaintance with the contents of Chs. 1-3
is required for an understanding of this chapter. However, only §§3, 5 and
6 assume familiarity with the results and methods of Ch. 4.

§ 1. Basic ideas of model theory

Throughout this chapter, with the exception of §7, we shall use the
symbol ℒ to denote a first-order language with equality but with no
function symbols apart from individual constants.¹ We shall take
conjunction (∧), negation (¬) and the existential quantifier (∃) as the
primitive logical symbols² of ℒ and regard the other logical symbols
(∨, →, ↔, ∀) as being defined in terms of these. We shall assume that
the individual variables of ℒ are enumerated in a fixed alphabetic
sequence³ v_0, v_1, .... Moreover, we assume that the predicate
symbols and constant symbols of ℒ are given in the form of indexed
sets {R_i: i ∈ I}, {c_j: j ∈ J} respectively. For each i ∈ I we let λ(i) be the
number of argument places in the predicate R_i. Thus λ is a mapping
of the set I into the set of positive integers; it is called the signature of ℒ.
The notion of an ℒ-structure has already been introduced in §1 of
Ch. 2. There we allowed the domain of a structure to be an arbitrary
(non-empty) class. In model theory, however, we confine our attention
to structures whose domains are sets. (By the remark at the end of §5
of Ch. 3, there is no loss of generality in making this restriction, at least
as far as satisfiability of formulas is concerned.)
It is clear that any ℒ-structure whose domain is a set may be regarded
as an ordered triple

𝔄 = ⟨A, ℛ, c⟩,

where
(1) A is a non-empty set called the domain or universe of 𝔄;
(2) ℛ is a mapping of I into the set of all relations on A such that for
each i ∈ I, ℛ(i) is a λ(i)-ary relation;
(3) c is a mapping of J into A.
For each i ∈ I and each j ∈ J we often write R_i for ℛ(i) and c_j for c(j), and
we also write
(4) 𝔄 = ⟨A, (R_i)_{i∈I}, (c_j)_{j∈J}⟩.
The R_i and the c_j are called the relations and designated individuals of 𝔄,
respectively. We shall sometimes write R_i^𝔄 for R_i and c_j^𝔄 for c_j, in order
to emphasize the fact that R_i^𝔄 is the interpretation of the symbol R_i, and
c_j^𝔄 that of the symbol c_j, in 𝔄.
We shall always use upper-case German letters to denote structures.
If we are given a structure denoted by a German letter, we agree to use
the corresponding upper-case italic letter to denote its domain. Thus
A is the domain of 𝔄, B that of 𝔅, etc.
If 𝔄 is an ℒ-structure, we often call ℒ the language for 𝔄.

¹ This restriction is not essential but is made for the sake of simplicity. The results of
this chapter can be extended in a straightforward way to languages with function symbols.
² Thus the degree of complexity deg φ of an ℒ-formula φ will be the number obtained
by adding up 1 for each occurrence of ∃, ¬, and ∧ in φ.
³ This is a departure from the convention adopted in previous chapters in which we
assumed the enumeration of the individual variables to start with v_1.


Given an ℒ-structure of the form (4), we obtain an ℒ-valuation
(Chapter 2, §1) by further specifying a sequence

a = ⟨a_0, a_1, ...⟩

of members of A as an assignment of values to the variables v_0, v_1, ... of ℒ.
We shall call such a sequence an assignment in 𝔄.
If a is an assignment in an ℒ-structure 𝔄 and b ∈ A, we define a(n|b)
to be the assignment which assigns the same values to the variables as
does a, except that it assigns the value b to the variable v_n. Thus

a(n|b) = ⟨a_0, a_1, ..., a_{n−1}, b, a_{n+1}, ...⟩.

Notice that the value assigned to v_n by a(n|b) is independent of the value
assigned to v_n by a.
For convenience we now restate the Basic Semantic Definition (2.1.1)
in the form best suited to our present purpose.
Let ℒ be a language with predicate symbols {R_i: i ∈ I}, constant symbols
{c_j: j ∈ J} and signature λ. Let

𝔄 = ⟨A, (R_i)_{i∈I}, (c_j)_{j∈J}⟩

be an ℒ-structure, and let a = ⟨a_0, a_1, ...⟩ be an assignment in 𝔄. For all
ℒ-formulas φ we define the relation a satisfies φ in 𝔄, which we write
𝔄 ⊨_a φ, by induction on deg φ:
(1) For terms t_1, t_2 of ℒ,

𝔄 ⊨_a t_1 = t_2 ⇔ b_1 = b_2,

where if t_n (n = 1, 2) is the variable v_k then b_n is a_k, while if t_n is the
constant symbol c_j then b_n is the designated individual c_j of 𝔄.
(2) For i ∈ I and terms t_1, ..., t_λ(i) of ℒ,

𝔄 ⊨_a R_i t_1 ... t_λ(i) ⇔ ⟨b_1, ..., b_λ(i)⟩ ∈ R_i,

where if t_n (n = 1, ..., λ(i)) is the variable v_k then b_n is a_k, while if t_n is
the constant symbol c_j then b_n is the designated individual c_j of 𝔄.
(3) 𝔄 ⊨_a ¬φ ⇔ not 𝔄 ⊨_a φ.
(4) 𝔄 ⊨_a φ ∧ ψ ⇔ 𝔄 ⊨_a φ and 𝔄 ⊨_a ψ.
(5) 𝔄 ⊨_a ∃v_n φ ⇔ 𝔄 ⊨_{a(n|b)} φ for some b ∈ A.
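
For a finite structure the five clauses above can be executed directly. The
following Python sketch is an illustration only: the encoding of formulas as
nested tuples, the dictionary layout of a structure and the names value,
satisfies and ORDER are assumptions of the example, not notation from the
text. An assignment is represented by a dictionary giving values only to the
finitely many variables that matter, and a(n|b) becomes {**a, n: b}.

```python
def value(term, structure, a):
    """Value of a term: ("var", k) denotes v_k, ("const", j) denotes the constant c_j."""
    kind, index = term
    if kind == "var":
        return a[index]
    return structure["constants"][index]

def satisfies(structure, a, phi):
    """Decide whether the assignment `a` satisfies `phi`, following clauses (1)-(5)."""
    op = phi[0]
    if op == "eq":                                   # clause (1)
        return value(phi[1], structure, a) == value(phi[2], structure, a)
    if op == "rel":                                  # clause (2)
        args = tuple(value(t, structure, a) for t in phi[2:])
        return args in structure["relations"][phi[1]]
    if op == "not":                                  # clause (3)
        return not satisfies(structure, a, phi[1])
    if op == "and":                                  # clause (4)
        return satisfies(structure, a, phi[1]) and satisfies(structure, a, phi[2])
    if op == "exists":                               # clause (5): try a(n|b) for each b
        n, body = phi[1], phi[2]
        return any(satisfies(structure, {**a, n: b}, body) for b in structure["domain"])
    raise ValueError(f"unknown connective: {op!r}")

# Sample structure: the ordering of {0, 1, 2} with one designated individual c = 0.
ORDER = {
    "domain": {0, 1, 2},
    "relations": {"<": {(0, 1), (0, 2), (1, 2)}},
    "constants": {"c": 0},
}

# "v_0 has a smaller element", "there is a least element", "c is least".
has_smaller = ("exists", 1, ("rel", "<", ("var", 1), ("var", 0)))
has_least = ("exists", 0, ("not", has_smaller))
c_is_least = ("not", ("exists", 1, ("rel", "<", ("var", 1), ("const", "c"))))

smaller_than_0 = satisfies(ORDER, {0: 0}, has_smaller)
smaller_than_2 = satisfies(ORDER, {0: 2}, has_smaller)
least_exists = satisfies(ORDER, {}, has_least)
c_least = satisfies(ORDER, {}, c_is_least)
```

The last two calls pass an empty assignment, which is harmless because the
formulas concerned are sentences and every variable is given a value by
clause (5) before it is looked up.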
It should be clear that the above definition does not differ essentially from
that given in 2.1.1. In fact, it is easy to verify that if 𝔄 is an ℒ-structure
and a is an assignment in 𝔄, then, if σ is the ℒ-valuation determined by
𝔄, a, we have, for each ℒ-formula φ,

𝔄 ⊨_a φ ⇔ φ^σ = ⊤.

The following facts are clear:
(a) 𝔄 ⊨_a ∀v_n φ ⇔ 𝔄 ⊨_{a(n|b)} φ for all b ∈ A;
(b) if φ is a formula and a, a' are assignments in 𝔄 such that a_n = a'_n
whenever v_n occurs free in φ, then 𝔄 ⊨_a φ ⇔ 𝔄 ⊨_a' φ. (See Thm. 2.2.3.)
In view of fact (b), the truth of 𝔄 ⊨_a φ, insofar as it depends on a,
depends only on the values a assigns to the free variables of φ. Accordingly
we make the following definition: if φ is a formula all of whose free variables
are among v_0, ..., v_n and a_0, ..., a_n ∈ A, we say that the finite sequence a_0, ..., a_n
satisfies φ in 𝔄 and write

𝔄 ⊨ φ [a_0, ..., a_n]

if 𝔄 ⊨_a' φ for some assignment a' in 𝔄 such that a'_0 = a_0, ..., a'_n = a_n. It
follows immediately from (b) that 𝔄 ⊨ φ [a_0, ..., a_n] iff 𝔄 ⊨_a' φ for all
assignments a' in 𝔄 such that a'_0 = a_0, ..., a'_n = a_n.
If σ is a sentence, i.e. a formula without free variables, we say that σ
is valid or holds in 𝔄, or that 𝔄 is a model of σ, and write

𝔄 ⊨ σ,

if 𝔄 ⊨_a σ for some assignment a in 𝔄 (and hence, in view of (b), for all
assignments a in 𝔄). If Σ is a set of sentences, we say that 𝔄 is a model of
Σ and write

𝔄 ⊨ Σ

if 𝔄 is a model of each sentence in Σ.


Let ℒ' be a language which is an extension of ℒ, so that, in addition
to the predicate symbols and constant symbols of ℒ, ℒ' contains a set
{R_i: i ∈ I'} of predicate symbols and a set {c_j: j ∈ J'} of constants. Given
an ℒ'-structure

𝔄' = ⟨A, (R_i)_{i∈I∪I'}, (c_j)_{j∈J∪J'}⟩,

the ℒ-structure

𝔄 = ⟨A, (R_i)_{i∈I}, (c_j)_{j∈J}⟩

is called the ℒ-reduction of 𝔄', and 𝔄' is called an ℒ'-expansion of 𝔄
(cf. Ch. 2, §9). (Notice that in general an ℒ-structure has more than one
ℒ'-expansion.)

Let

𝔄 = ⟨A, (R_i)_{i∈I}, (c_j)_{j∈J}⟩,

𝔄' = ⟨A', (R'_i)_{i∈I}, (c'_j)_{j∈J}⟩

be ℒ-structures. We say that 𝔄 is a substructure of 𝔄' and write 𝔄 ⊆ 𝔄'
if A ⊆ A', for each j ∈ J, c_j = c'_j, and, for each i ∈ I, R_i is the restriction of
R'_i to A, i.e. R_i = R'_i ∩ A^λ(i). If B is a non-empty subset of A which contains
all the designated individuals c_j of 𝔄, we define the restriction 𝔄|B of
𝔄 to B by

𝔄|B = ⟨B, (R_i ∩ B^λ(i))_{i∈I}, (c_j)_{j∈J}⟩.

It is clear that for any subset B of A which contains all the designated
individuals of 𝔄 we have 𝔄|B ⊆ 𝔄.
An embedding of 𝔄 into 𝔄' is a one-one mapping f of A into A' such that
(i) f(c_j) = c'_j for all j ∈ J;
(ii) ⟨a_1, ..., a_λ(i)⟩ ∈ R_i ⇔ ⟨f(a_1), ..., f(a_λ(i))⟩ ∈ R'_i
for all i ∈ I and all a_1, ..., a_λ(i) ∈ A.
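
For finite structures the two conditions on an embedding can be tested
mechanically. The Python sketch below is illustrative only; the dictionary
layout of a structure, the `arity` argument (playing the role of the
signature λ) and the sample maps are assumptions of the example. It checks
that f is one-one, that designated individuals go to designated individuals
(condition (i)), and that each relation is preserved in both directions
(condition (ii)).

```python
from itertools import product

def is_embedding(f, A, A_prime, arity):
    """Check whether the dict f is an embedding of the finite structure A into A_prime."""
    dom = A["domain"]
    # f must be a one-one mapping of A into A'.
    if any(f[x] not in A_prime["domain"] for x in dom):
        return False
    if len({f[x] for x in dom}) != len(dom):
        return False
    # Condition (i): f(c_j) = c'_j for every constant.
    if any(f[A["constants"][j]] != A_prime["constants"][j] for j in A["constants"]):
        return False
    # Condition (ii): a tuple is in R_i iff its image is in R'_i.
    for i, k in arity.items():
        for tup in product(dom, repeat=k):
            image = tuple(f[x] for x in tup)
            if (tup in A["relations"][i]) != (image in A_prime["relations"][i]):
                return False
    return True

A = {"domain": {1, 2}, "relations": {"<": {(1, 2)}}, "constants": {"c": 1}}
A_prime = {
    "domain": {0, 1, 2},
    "relations": {"<": {(0, 1), (0, 2), (1, 2)}},
    "constants": {"c": 1},
}
arity = {"<": 2}

inclusion = {1: 1, 2: 2}   # the natural injection: an embedding
shift = {1: 0, 2: 1}       # preserves < but moves the designated individual
collapse = {1: 1, 2: 1}    # not one-one

inclusion_ok = is_embedding(inclusion, A, A_prime, arity)
shift_ok = is_embedding(shift, A, A_prime, arity)
collapse_ok = is_embedding(collapse, A, A_prime, arity)
```

The map `shift` shows that order preservation alone is not enough: it fails
only because of condition (i).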

If f is an embedding of 𝔄 into 𝔄', it follows from (i) that f[A] contains
all the designated individuals of 𝔄', so that we can form the restriction
𝔄'|f[A]. This is written f[𝔄] and is called the image of 𝔄 under f.
An isomorphism of 𝔄 onto 𝔄' is an embedding of 𝔄 onto 𝔄'. If there
is an isomorphism of 𝔄 onto 𝔄', we say that 𝔄 and 𝔄' are isomorphic
and write 𝔄 ≅ 𝔄'. Clearly, if f is an embedding of 𝔄 into 𝔄', we have
𝔄 ≅ f[𝔄].
𝔄 and 𝔄' are said to be (ℒ-)elementarily equivalent, and we write
𝔄 ≡ 𝔄', if for any ℒ-sentence σ we have 𝔄 ⊨ σ ⇔ 𝔄' ⊨ σ. Thus two
ℒ-structures are elementarily equivalent if they cannot be distinguished
by an ℒ-sentence.

1.1. Problem. Let f be an isomorphism of 𝔄 onto 𝔄'. Show by induction
on deg φ that for any formula φ all of whose free variables are among
v_0, ..., v_n and all a_0, ..., a_n ∈ A we have

𝔄 ⊨ φ [a_0, ..., a_n] ⇔ 𝔄' ⊨ φ [f(a_0), ..., f(a_n)].

Infer that, if 𝔄 ≅ 𝔄', then 𝔄 ≡ 𝔄'. (We shall see later on that the converse
is false.)
𝔄 is said to be an (ℒ-)elementary substructure of 𝔄', and 𝔄' an
(ℒ-)elementary extension of 𝔄, if 𝔄 ⊆ 𝔄' and for any ℒ-formula φ all of whose
free variables are among v_0, ..., v_n we have

𝔄 ⊨ φ [a_0, ..., a_n] ⇔ 𝔄' ⊨ φ [a_0, ..., a_n]

for all a_0, ..., a_n ∈ A. In this situation we write 𝔄 ≺ 𝔄'.
It is clear that 𝔄 ≺ 𝔄' ⇒ 𝔄 ≡ 𝔄'. Our next problem shows that the
converse is false.

1.2. Problem. Let 𝔄 = ⟨ω − {0}, <⟩, 𝔄' = ⟨ω, <⟩ where < is the usual
ordering of the natural numbers. Show that 𝔄 ≡ 𝔄', 𝔄 ⊆ 𝔄', but not
𝔄 ≺ 𝔄'.
1.3. Problem. Show that if 𝔄 ≺ 𝔄', 𝔄'' ≺ 𝔄' and 𝔄 ⊆ 𝔄'', then 𝔄 ≺ 𝔄''.
An embedding f of 𝔄 into 𝔄' is called an (ℒ-)elementary embedding
if for any ℒ-formula φ all of whose free variables are among v_0, ..., v_n
we have

𝔄 ⊨ φ [a_0, ..., a_n] ⇔ 𝔄' ⊨ φ [f(a_0), ..., f(a_n)]

for all a_0, ..., a_n ∈ A.

1.4. Problem. (i) Let f be an embedding of 𝔄 into 𝔄'. Show that f is
an elementary embedding iff f[𝔄] ≺ 𝔄'.
(ii) Let 𝔄 ⊆ 𝔄'. Show that 𝔄 ≺ 𝔄' iff the natural injection of 𝔄 into
𝔄' is an elementary embedding of 𝔄 into 𝔄'.
(iii) Let f be any mapping of A into A' such that, for all a_0, ..., a_n ∈ A
and all formulas φ with free variables among v_0, ..., v_n, 𝔄 ⊨ φ [a_0, ..., a_n]
⇔ 𝔄' ⊨ φ [f(a_0), ..., f(a_n)]. Show that f is one-one and hence an elementary
embedding of 𝔄 into 𝔄'.

𝔄 is said to be elementarily embeddable in 𝔄' if there is an elementary
embedding of 𝔄 into 𝔄'. Clearly 𝔄 is elementarily embeddable in 𝔄'
iff 𝔄 is isomorphic to an elementary substructure of 𝔄'. Evidently, also,
if 𝔄 is elementarily embeddable in 𝔄', then 𝔄 is elementarily equivalent
to 𝔄'.
We now prove some lemmas which will be very useful later.

1.5. Lemma. Suppose that 𝔄 ⊆ 𝔄'. Then the following two assertions
are equivalent:
(i) 𝔄 ≺ 𝔄';
(ii) for any n, any ℒ-formula φ whose free variables are all
among v_0, ..., v_n, and any a_0, ..., a_{n−1} ∈ A, if there is a' ∈ A' for which
𝔄' ⊨ φ [a_0, ..., a_{n−1}, a'], then there is a ∈ A for which 𝔄' ⊨ φ [a_0, ..., a_{n−1}, a].
Proof. (i)⇒(ii). Suppose 𝔄 ≺ 𝔄' and 𝔄' ⊨ φ [a_0, ..., a_{n−1}, a'] with a' ∈ A'.
Then 𝔄' ⊨ ∃v_n φ [a_0, ..., a_{n−1}], so that, since 𝔄 ≺ 𝔄', 𝔄 ⊨ ∃v_n φ [a_0, ..., a_{n−1}].
Hence 𝔄 ⊨ φ [a_0, ..., a_{n−1}, a] for some a ∈ A, so that, since 𝔄 ≺ 𝔄',
𝔄' ⊨ φ [a_0, ..., a_{n−1}, a].
(ii)⇒(i). Assume (ii). We have to show that

(1) 𝔄 ⊨ φ [a_0, ..., a_n] ⇔ 𝔄' ⊨ φ [a_0, ..., a_n]

for any ℒ-formula φ with free variables among v_0, ..., v_n and any a_0, ..., a_n ∈ A.
We prove (1) by induction on deg φ. That (1) holds for atomic φ follows
immediately from the assumption that 𝔄 ⊆ 𝔄'. The induction steps for
∧ and ¬ are trivial, so it remains to establish the induction step for ∃.
Let φ be ∃v_k ψ, and suppose that the free variables of φ are all among
v_0, ..., v_n. We want to prove (1); clearly it will be enough to prove

𝔄 ⊨ φ [a_0, ..., a_m] ⇔ 𝔄' ⊨ φ [a_0, ..., a_m]

for some m ≥ n. Therefore we may assume without loss of generality that n
is greater than the indices of all the variables occurring, free or bound,
in φ: in particular, n > k and the free variables of ψ are all among v_0, ..., v_n. If

𝔄 ⊨ (∃v_k ψ) [a_0, ..., a_n],

then we have 𝔄 ⊨ ψ [a_0, ..., a_{k−1}, a, a_{k+1}, ..., a_n] for some a ∈ A, so that, by
inductive hypothesis,

𝔄' ⊨ ψ [a_0, ..., a_{k−1}, a, a_{k+1}, ..., a_n],

whence 𝔄' ⊨ (∃v_k ψ) [a_0, ..., a_n].
Conversely, suppose that 𝔄' ⊨ (∃v_k ψ) [a_0, ..., a_n]. Put χ = ψ(v_k/v_{n+1});
then ∃v_k ψ and ∃v_{n+1} χ are variants, hence logically equivalent, so that

𝔄' ⊨ (∃v_{n+1} χ) [a_0, ..., a_n].

Thus there is some a' ∈ A' for which 𝔄' ⊨ χ [a_0, ..., a_n, a']. Hence, by (ii),
there is a ∈ A such that 𝔄' ⊨ χ [a_0, ..., a_n, a]. Since deg χ < deg φ and the
free variables of χ are all among v_0, ..., v_{n+1}, it follows from the inductive
hypothesis that

𝔄 ⊨ χ [a_0, ..., a_n, a]

and so 𝔄 ⊨ (∃v_{n+1} χ) [a_0, ..., a_n]. Therefore

𝔄 ⊨ (∃v_k ψ) [a_0, ..., a_n].

This completes the induction step and the proof. ∎

Before stating the next lemma we need to set up some more notation.
Let K be a set such that J ∩ K = ∅ (recalling that J is the set indexing
the constants of ℒ). We define ℒ_K to be the language obtained from
ℒ by adding a set of entirely new distinct constants {c_k: k ∈ K}.
Now let 𝔄 = ⟨A, ℛ, c⟩ be an ℒ-structure, where, as usual, c is a mapping
of J onto the set of designated individuals of 𝔄. Given a mapping a
of K into A, we define

(𝔄, a) = ⟨A, ℛ, c ∪ a⟩.

Clearly (𝔄, a) is an ℒ_K-expansion of 𝔄 in which c_k^(𝔄,a) = a_k for each k ∈ K.
If a maps K onto A, we say that a is an indexing of A by K.
We can now state:

1.6. Lemma. Let 𝔄 and 𝔄' be ℒ-structures, and let a be an indexing of
A by a set K. Then 𝔄 is elementarily embeddable in 𝔄' iff there is a mapping
a' of K into A' such that (𝔄, a) and (𝔄', a') are (ℒ_K-)elementarily equivalent.
Proof. We merely sketch the proof, leaving the details to be filled in
by the reader.
If f is an elementary embedding of 𝔄 into 𝔄', define a': K → A' by
a'(k) = f(a(k)) for k ∈ K. It is then easy to verify that (𝔄, a) ≡ (𝔄', a').
Conversely, if a': K → A' is such that (𝔄, a) ≡ (𝔄', a'), define f: A → A'
by f(a(k)) = a'(k) for k ∈ K. Then f is an elementary embedding of 𝔄
into 𝔄'. ∎

1.7. Problem. Show that, if 𝔄 and 𝔄' are ℒ-structures, then 𝔄 ≅ 𝔄'
iff there is a set K and indexings a and a' of A and A', respectively, by
K such that (𝔄, a) ≡ (𝔄', a').
1.8. Problem. (i) Let 𝔄 and 𝔄' be ℒ-structures such that 𝔄 ⊆ 𝔄'.
Suppose that for any finite subset {a_1, ..., a_n} of A and any a' ∈ A' there is
an isomorphism of 𝔄' onto itself which leaves each a_i fixed and carries
a' into A. Show that 𝔄 ≺ 𝔄'. (Use Prob. 1.1 and Lemma 1.5.)
(ii) Let 𝔔 be the ordered set of rationals and ℜ the ordered set of reals.
Deduce from (i) that 𝔔 ≺ ℜ.

§ 2. The Löwenheim-Skolem Theorems

In this section we show that any infinite structure has elementary
substructures and extensions in a wide range of cardinalities. In particular,
in Thm. 2.2 we derive a strengthened version of Thm. 3.3.15 for sets
of sentences. The proof of Thm. 2.2 is entirely semantic, i.e. makes no
use of the machinery of the predicate calculus.
If 𝔄 is an ℒ-structure, we put ||𝔄|| = |A|. ||𝔄|| is called the cardinality
of 𝔄. Notice that ||𝔄|| depends only on the domain of 𝔄; it has nothing
to do with the relations and designated individuals of 𝔄.
Recall that the cardinality ||ℒ|| of the language ℒ is the cardinality
of the set of all ℒ-symbols. Clearly ||ℒ|| = max{|I|, |J|, ℵ_0}, where I
and J are the sets indexing the predicate symbols and constants,
respectively, of ℒ.

Our first result asserts the existence of "small" elementary substructures
of a given infinite structure.

2.1. Theorem. Let 𝔄 be an infinite ℒ-structure, and let X ⊆ A. Then for
any cardinal α satisfying max{|X|, ||ℒ||} ≤ α ≤ ||𝔄|| there is an elementary
substructure 𝔅 of 𝔄 such that ||𝔅|| = α and X ⊆ B.
Proof. Let h be a choice function for the non-empty subsets of A, i.e.
such that h(Y) ∈ Y whenever Y ⊆ A and Y ≠ ∅; the existence of such a
function is ensured by the axiom of choice. Define the sequence B_0, B_1, ...
of subsets of A as follows: B_0 is any subset of A such that X ⊆ B_0 and
|B_0| = α (such subsets of A exist in view of the conditions on α) and for
each n ∈ ω,

B_{n+1} = {h(Y): for some m, some ℒ-formula φ all of whose free variables
are among v_0, ..., v_m and some a_0, ..., a_{m−1} ∈ B_n,
∅ ≠ Y = {x ∈ A: 𝔄 ⊨ φ [a_0, ..., a_{m−1}, x]}}.

We claim first that B_1 contains all the designated individuals of 𝔄.
For let c_j be a designated individual of 𝔄, and let φ be the formula
v_0 = c_j. Then

{c_j} = {x ∈ A: 𝔄 ⊨ φ [x]},

so that c_j = h({c_j}) ∈ B_1 by the definition of B_1.
Secondly, B_n ⊆ B_{n+1} for each n ∈ ω. For if a_0 ∈ B_n, then, putting φ for
the formula v_0 = v_1, we have

{a_0} = {x ∈ A: 𝔄 ⊨ φ [a_0, x]}

so that a_0 = h({a_0}) ∈ B_{n+1} by the definition of B_{n+1}.


Thirdly, |B_n| = α for each n ∈ ω. This is proved by induction on n. We
have |B_0| = α by assumption, so assume that n > 0 and |B_{n−1}| = α. We
already know that B_{n−1} ⊆ B_n, so certainly |B_n| ≥ α. To prove the reverse
inequality we observe that each member of B_n is determined by an
ℒ-formula and an m-tuple of members of B_{n−1}, for some m ∈ ω. It follows
that, if we write Φ for the set of ℒ-formulas (so that by Thm. 3.3.11
|Φ| = ||ℒ||),

|B_n| ≤ |⋃_{m∈ω} Φ × (B_{n−1})^m| ≤ Σ_{m∈ω} |Φ| · |B_{n−1}|^m = ||ℒ|| · Σ_{m∈ω} α^m.

Since α ≥ ||ℒ|| ≥ ℵ_0, it follows that

||ℒ|| · Σ_{m∈ω} α^m = ||ℒ|| · α = α.

Therefore |B_n| ≤ α and so |B_n| = α.

Now put B = ⋃_{n∈ω} B_n. Then

|B| ≤ Σ_{n∈ω} |B_n| = ℵ_0 · α = α.

Accordingly |B| = α (since B_0 ⊆ B). Moreover, since B_1 contains the
designated individuals of 𝔄, so does B, and we may therefore put 𝔅 = 𝔄|B.
We claim that 𝔅 satisfies the required conditions. Certainly we have
X ⊆ B, and ||𝔅|| = α has been proved above. It remains to show that
𝔅 ≺ 𝔄. To do this, we apply Lemma 1.5. Suppose then that a_0, ..., a_{n−1} ∈ B,
φ is an ℒ-formula whose free variables are all among v_0, ..., v_n, and there
is a' ∈ A for which 𝔄 ⊨ φ [a_0, ..., a_{n−1}, a']. Now we have already shown that
the B_n form an increasing chain, so there is m ∈ ω such that a_0, ..., a_{n−1} ∈ B_m.
The set

Y = {x ∈ A: 𝔄 ⊨ φ [a_0, ..., a_{n−1}, x]}

is non-empty by assumption, so, if we put a = h(Y), we have a ∈ B_{m+1} ⊆ B
and 𝔄 ⊨ φ [a_0, ..., a_{n−1}, a]. By Lemma 1.5, it follows that 𝔅 ≺ 𝔄, and
the proof is finished. ∎
From this we infer the following important result (cf. Thm. 3.3.15).

2.2. Downward Löwenheim-Skolem Theorem. Let Σ be a set of
sentences of ℒ with an infinite model of cardinality α ≥ |Σ|. Then Σ has a
model of any cardinality β such that max(|Σ|, ℵ_0) ≤ β ≤ α.
Proof. Let γ = max(|Σ|, ℵ_0). Then at most γ extralogical symbols occur
in the sentences of Σ so that ||ℒ_Σ|| = γ, where ℒ_Σ is the "poorest" first-order
language (with equality) in which Σ is still formulable (Ch. 2, §8).
Let 𝔄 be an infinite model of Σ of cardinality α ≥ |Σ|. Then the
ℒ_Σ-reduction 𝔄' of 𝔄 is also a model of Σ of cardinality α. Since γ ≤ β ≤ α, by
Thm. 2.1 there is an (ℒ_Σ-)elementary substructure 𝔅' of 𝔄' of cardinality β.
𝔅' is then a model of Σ, and if 𝔅 is any ℒ-expansion of 𝔅', then 𝔅 is
a model of Σ which is an ℒ-structure of cardinality β. ∎

As a special case of Thm. 2.2, we have:

2.3. Corollary. Any countable set of sentences with an infinite model
has a countable model.
We now turn to the problem of showing that each infinite structure
has elementary extensions of prescribed cardinalities. The next lemma,
although quite simple, is very useful in this connection.

2.4. Lemma. Let 𝔄 be an ℒ-structure. Then, for any cardinal α ≥ ||𝔄||,
𝔄 has an elementary extension of cardinality α iff 𝔄 is elementarily
embeddable in a structure of cardinality α.
Proof. Necessity follows immediately from Prob. 1.4(ii). Conversely,
let 𝔄' be an ℒ-structure of cardinality α ≥ ||𝔄||, and let f be an elementary
embedding of 𝔄 into 𝔄'. If B is a set of cardinality α including A, let
g be a bijection of B onto A' which extends f and then use g⁻¹ to "transfer
the structure" of 𝔄' onto B. The resulting structure 𝔅 is an elementary
extension of 𝔄 of cardinality α. ∎
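
The "transfer of structure" step in this proof is purely mechanical and can
be sketched for finite structures. The Python fragment below is an
illustration under assumptions of our own (the dictionary layout and the
name transfer_structure are not from the text). It shows only the transfer
along g⁻¹; it does not illustrate elementarity, since a finite structure has
no proper elementary extension.

```python
def transfer_structure(A_prime, g):
    """Pull the structure of A_prime back along a bijection g of a set B onto A'."""
    g_inv = {image: point for point, image in g.items()}
    if len(g_inv) != len(g) or set(g_inv) != set(A_prime["domain"]):
        raise ValueError("g must be a bijection onto the domain of A_prime")
    return {
        "domain": set(g),
        "relations": {
            i: {tuple(g_inv[x] for x in tup) for tup in R}
            for i, R in A_prime["relations"].items()
        },
        "constants": {j: g_inv[c] for j, c in A_prime["constants"].items()},
    }

# A' is the ordering of {10, 20, 30}; A has domain {"a", "b"} and is embedded by f.
A_prime = {
    "domain": {10, 20, 30},
    "relations": {"<": {(10, 20), (10, 30), (20, 30)}},
    "constants": {"c": 10},
}
f = {"a": 10, "b": 20}
g = {**f, "z": 30}            # a bijection of B = {"a", "b", "z"} onto A' extending f

B_structure = transfer_structure(A_prime, g)

# Restricting the transferred relation to A = {"a", "b"} recovers the relation of A.
restricted = {t for t in B_structure["relations"]["<"] if set(t) <= {"a", "b"}}
```

By construction g is an isomorphism of the transferred structure onto 𝔄',
and because g extends f the original structure on A sits inside it as a
substructure.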
For convenience we restate the compactness theorem (Thm. 3.3.16)
in the form most suitable for our present purpose.¹
2.5. Compactness Theorem. If each finite subset of a set Σ of sentences
of ℒ has a model, then Σ has a model. ∎
We now employ the compactness theorem in the proof of:

2.6. Theorem. Let 𝔄 be an infinite ℒ-structure. Then 𝔄 has an elementary
extension of any cardinality α ≥ max(||𝔄||, ||ℒ||).
Proof. Let a: K → A be an indexing of A; let K' be a set of cardinality
α disjoint from both K and J. Let Σ be the set of all sentences of ℒ_K which
hold in (𝔄, a), and let

Σ' = Σ ∪ {c_k' ≠ c_k'': k', k'' ∈ K' and k' ≠ k''}.

Then clearly Σ' is a set of sentences of ℒ_{K∪K'} of cardinality α. Moreover,
each finite subset Σ_0 of Σ' has a model. For let {k'_1, ..., k'_n} be the finite
set of members k' of K' such that c_k' occurs in a sentence of Σ_0. Since
A is infinite, we can choose n distinct members a'_1, ..., a'_n of A. Define the
mapping a': K' → A by putting

a'(k'_i) = a'_i for 1 ≤ i ≤ n,

a'(k') = a'_1 for k' ∈ K' − {k'_1, ..., k'_n}.

¹ In §3 we shall give a direct model-theoretic proof of the compactness theorem which
does not depend on the results of Chs. 1-3.

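Then it is clear that $(\mathfrak{A}, a \cup a')$ is a model of $\Sigma_0$. By the compactness
theorem, $\Sigma'$ has a model which must be of cardinality $\ge \alpha$ since $\Sigma'$ contains
the sentences $\mathbf{c}_{k'} \ne \mathbf{c}_{k''}$ for all $k', k'' \in K'$ such that $k' \ne k''$. Since $|\Sigma'| = \alpha$,
by Thm. 2.2 $\Sigma'$ has a model $\mathfrak{B}'$ of cardinality $\alpha$. Now $\mathfrak{B}'$, as an $\mathcal{L}_{K \cup K'}$-structure,
must be of the form $(\mathfrak{B}, b \cup b')$, where $\mathfrak{B}$ is an $\mathcal{L}$-structure, and $b, b'$
are mappings of $K, K'$, respectively, into $B$. Since $\mathfrak{B}'$ is a model of $\Sigma'$,
$(\mathfrak{B}, b)$ is a model of $\Sigma$ and therefore $(\mathfrak{A}, a) \equiv (\mathfrak{B}, b)$. Since $a$ was taken
to be an indexing of $A$, it follows from Lemma 1.6 that $\mathfrak{A}$ is elementarily
embeddable in $\mathfrak{B}$. Since $\|\mathfrak{B}'\| = \|\mathfrak{B}\| = \alpha$, Lemma 2.4 implies that $\mathfrak{A}$ has
an elementary extension of cardinality $\alpha$. ∎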

We can now prove the following important theorems.

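2.7. Upward Löwenheim-Skolem Theorem. Let $\Sigma$ be a set of $\mathcal{L}$-sentences
with a model of cardinality $\alpha \ge \aleph_0$. Then $\Sigma$ has a model of any cardinality
$\ge \max(\alpha, |\Sigma|)$.
Proof. Let $\mathfrak{A}$ be a model of $\Sigma$ of cardinality $\alpha$. Then the $\mathcal{L}_\Sigma$-reduction
$\mathfrak{A}'$ of $\mathfrak{A}$ is also a model of $\Sigma$ of cardinality $\alpha$. Let $\beta \ge \max(\alpha, |\Sigma|)$; then
since $\|\mathcal{L}_\Sigma\| = \max(\aleph_0, |\Sigma|)$, we have $\beta \ge \max(\alpha, \|\mathcal{L}_\Sigma\|)$, so that, by Thm. 2.6,
$\mathfrak{A}'$ has an elementary extension $\mathfrak{B}'$ of cardinality $\beta$. $\mathfrak{B}'$ is then a model
of $\Sigma$, and any $\mathcal{L}$-expansion of $\mathfrak{B}'$ is an $\mathcal{L}$-structure of cardinality $\beta$ which
is a model of $\Sigma$. ∎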

2.8. Löwenheim-Skolem Theorem. Let $\Sigma$ be a set of $\mathcal{L}$-sentences with
an infinite model. Then $\Sigma$ has a model of each cardinality $\ge \max(|\Sigma|, \aleph_0)$.
Proof. Suppose $\Sigma$ has a model of cardinality $\alpha \ge \aleph_0$. Let $\beta \ge \max(|\Sigma|, \aleph_0)$.
If $\beta \le \alpha$ then $\Sigma$ has a model of cardinality $\beta$ by Thm. 2.2, while if $\alpha \le \beta$
then $\beta \ge \max(\alpha, |\Sigma|)$, so that $\Sigma$ has a model of cardinality $\beta$ by Thm. 2.7. ∎

2.9. Theorem. If $\Sigma$ is a consistent set of $\mathcal{L}$-sentences, then either $\Sigma$ has
a finite model or $\Sigma$ has a model of any cardinality $\ge \max(|\Sigma|, \aleph_0)$.
Proof. Let $\Sigma$ be a consistent set of sentences of $\mathcal{L}$. Then, by Thm. 3.3.13,
$\Sigma$ has a model. If this model is infinite, then, by Thm. 2.8, $\Sigma$ has a model
of any cardinality $\ge \max(|\Sigma|, \aleph_0)$. ∎

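2.10. Problem. Let $\mathscr{A}$ be a family of $\mathcal{L}$-structures, where $\mathcal{L}$ has no
constant symbols. The union of the family $\mathscr{A}$ is the structure

$$\bigcup \mathscr{A} = \Big\langle \bigcup_{\mathfrak{A} \in \mathscr{A}} A,\ \Big(\bigcup_{\mathfrak{A} \in \mathscr{A}} R_i^{\mathfrak{A}}\Big)_{i \in I} \Big\rangle.$$

Thus the domain of $\bigcup \mathscr{A}$ is the union of the domains of the members
of $\mathscr{A}$, and for each $i \in I$ the interpretation of $\mathbf{R}_i$ in $\bigcup \mathscr{A}$ is $\bigcup_{\mathfrak{A} \in \mathscr{A}} R_i^{\mathfrak{A}}$.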

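$\mathscr{A}$ is said to be a chain (resp. elementary chain) if for all $\mathfrak{A}, \mathfrak{A}' \in \mathscr{A}$
we have $\mathfrak{A} \subseteq \mathfrak{A}'$ or $\mathfrak{A}' \subseteq \mathfrak{A}$ (resp. $\mathfrak{A} \prec \mathfrak{A}'$ or $\mathfrak{A}' \prec \mathfrak{A}$). Show that the union
of a chain of structures is an extension of each member of the chain, and
that the union of an elementary chain is an elementary extension of each
member of the chain.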

2.11. Problem. (i) Use the compactness theorem to show that if a set
of sentences $\Sigma$ has arbitrarily large finite models, then it has an infinite
model. (Show that each finite subset of the set $\Sigma \cup \{\sigma_n : n \in \omega\}$ has a model,
where $\sigma_n$ is a sentence which asserts that there are at least $n$ distinct
individuals.)
(ii) Let $\mathcal{L}$ be a language containing a binary predicate $\mathbf{R}$. Show that
there is no set $\Sigma$ of $\mathcal{L}$-sentences with at least one infinite model such that
$R^{\mathfrak{A}}$ is a well-ordering of $A$ for each infinite model $\mathfrak{A}$ of $\Sigma$. (Let $\{\mathbf{c}_n : n \in \omega\}$
be a set of new constants, and put $\Sigma' = \Sigma \cup \{\mathbf{R}\mathbf{c}_{n+1}\mathbf{c}_n \wedge \mathbf{c}_{n+1} \ne \mathbf{c}_n : n \in \omega\}$.
Show that each finite subset of $\Sigma'$ has a model, and apply compactness
to obtain an infinite model of $\Sigma$ in which the interpretation of $\mathbf{R}$ is not
a well-ordering.)
2.12. Problem. Let $\mathfrak{A}$ be an $\mathcal{L}$-structure. A set $\Gamma$ of $\mathcal{L}$-formulas each member
of which has precisely one free variable, $\mathbf{v}_0$ say, is said to be finitely
satisfiable in $\mathfrak{A}$ if for each finite subset $\{\varphi_1,\dots,\varphi_n\}$ of $\Gamma$ there is an $a \in A$
such that $\mathfrak{A} \models (\varphi_1 \wedge \dots \wedge \varphi_n)[a]$. $\Gamma$ is said to be satisfiable in $\mathfrak{A}$ if there
is $a \in A$ such that $\mathfrak{A} \models \varphi[a]$ for all $\varphi \in \Gamma$.
(i) Show that for each structure $\mathfrak{A}$ there is an elementary extension
$\mathfrak{A}'$ of $\mathfrak{A}$ such that every set of $\mathcal{L}$-formulas finitely satisfiable in $\mathfrak{A}$ is
satisfiable in $\mathfrak{A}'$.
(ii) Let $\mathfrak{A}$ be a finite structure, and let $\Gamma$ be a set of $\mathcal{L}$-formulas with
one free variable. Show that, if $\Gamma$ is finitely satisfiable in $\mathfrak{A}$, then $\Gamma$ is
satisfiable in $\mathfrak{A}$.
(iii) Let $\mathfrak{A}$ be a finite structure, and let $\mathfrak{B}$ be a structure such that
$\mathfrak{A} \equiv \mathfrak{B}$. Show that $\mathfrak{A} \cong \mathfrak{B}$. (First show that $A$ and $B$ have the same finite
number $n$ of elements. Let $A = \{a_1,\dots,a_n\}$. Use (ii) to construct by induction
a sequence of $n$ distinct elements $\{b_1,\dots,b_n\}$ of $B$ such that
$(\mathfrak{A}, (a_1,\dots,a_i)) \equiv (\mathfrak{B}, (b_1,\dots,b_i))$ for each $i$, $1 \le i \le n$.)


§ 3. Ultraproducts

To facilitate the exposition, in this section we confine our attention to
structures $\langle A, R \rangle$ consisting of a non-empty set $A$ and a single binary relation
$R$ on $A$. The language appropriate for these structures will be denoted
by $\mathcal{L}$; thus $\mathcal{L}$ is a first-order language with equality and another binary
predicate symbol $\mathbf{R}$ but with no constant symbols. It will be clear that
everything we do can be extended to structures with arbitrarily many
relations (and operations) and hence to first-order languages with arbitrarily
many predicate (and function) symbols, merely by complicating the
notation. We leave it to the reader to make these extensions.
Let $I$ be a fixed but arbitrary non-empty index set, and for each $i \in I$
let $\mathfrak{A}_i = \langle A_i, R_i \rangle$ be an $\mathcal{L}$-structure. Also let $A = \prod_{i \in I} A_i$ be the Cartesian
product¹ of the sets $A_i$. We shall use $f, g, h, f', g', h'$ to denote elements
of $A$.
The direct product $\prod_{i \in I} \mathfrak{A}_i$ of the family $\{\mathfrak{A}_i : i \in I\}$ is the structure
$\langle A, S \rangle$, where $S$ is the set of all pairs $(f, g)$ such that $(f(i), g(i)) \in R_i$
for all $i \in I$. The direct product is a natural construction which finds
application in many branches of mathematics. But from the logician’s
point of view it suffers from the drawback that $\prod_{i \in I} \mathfrak{A}_i$ may not share
the first-order properties of the $\mathfrak{A}_i$. In other words, there may be a sentence
$\sigma$ such that each $\mathfrak{A}_i$ is a model of $\sigma$ but $\prod \mathfrak{A}_i$ is not. For example, suppose
that each $\mathfrak{A}_i$ is a totally ordered set with at least two elements, and let
$\sigma$ be the sentence $\forall \mathbf{v}_0 \forall \mathbf{v}_1 [\mathbf{R}\mathbf{v}_0\mathbf{v}_1 \vee \mathbf{R}\mathbf{v}_1\mathbf{v}_0]$. Then $\mathfrak{A}_i \models \sigma$ for each $i \in I$, but,
if $I$ has at least two elements, $\prod \mathfrak{A}_i \models \neg\sigma$. (In less technical terms, the
product of totally ordered sets is not in general totally ordered.)
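
The failure of totality can be checked by hand on the smallest instance, and also mechanically. The Python sketch below is only an illustration under stated assumptions: the index set is taken to be {0, 1}, each factor is the two-element chain {0, 1} with the usual ordering, and the names `chain`, `factors` and `product_leq` are ours, not the text's. It builds the relation S of the direct product and lists the pairs that S leaves incomparable.

```python
from itertools import product

# Assumption: each factor is the two-element chain {0, 1} under the usual <=.
chain = [0, 1]
factors = [chain, chain]            # index set I = {0, 1}

def product_leq(f, g):
    """(f, g) belongs to S iff (f(i), g(i)) belongs to R_i for every index i."""
    return all(f[i] <= g[i] for i in range(len(factors)))

elements = list(product(*factors))  # the Cartesian product A
incomparable = [(f, g) for f in elements for g in elements
                if not product_leq(f, g) and not product_leq(g, f)]
print(incomparable)
```

Any pair listed in `incomparable` is a counterexample to the totality sentence in the product, although every factor satisfies it.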
We are going to introduce a modification of the direct product construc¬
tion which does not suffer from the drawback mentioned above and which
is accordingly of great usefulness in model theory.

First we define two mappings $E$ and $R$ of $A \times A$ into $PI$ by putting,
for $f, g \in A$,

$$E(f, g) = \{i \in I : f(i) = g(i)\},$$

$$R(f, g) = \{i \in I : (f(i), g(i)) \in R_i\}.$$
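
As a small illustration of these two mappings, the following Python sketch computes E(f, g) and R(f, g) as subsets of a finite index set. The setting is an assumption made for the example only: I = {0, 1}, every factor is the chain {0, 1}, and every R_i is the usual ordering; the function names simply mirror the text.

```python
I = [0, 1]                          # assumed finite index set

def E(f, g):
    """The set of indices at which f and g agree."""
    return frozenset(i for i in I if f[i] == g[i])

def R(f, g):
    """The set of indices i at which (f(i), g(i)) lies in R_i (here: <=)."""
    return frozenset(i for i in I if f[i] <= g[i])

f, g = (0, 1), (1, 0)
print(sorted(E(f, g)), sorted(R(f, g)), sorted(R(g, f)))
```

For this pair neither R(f, g) nor R(g, f) is the whole of I, yet their union is, which is the Boolean-valued counterpart of the fact that each factor is totally ordered while the direct product is not.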

¹ Where the index set is clear from the context, we shall in future write $\prod_i A_i$ or $\prod A_i$,
and similarly for other expressions of the same kind.

Now let $\mathcal{L}(A)$ be the language obtained from $\mathcal{L}$ by adding a new constant
symbol $\mathbf{f}$ for each $f \in A$. We define the mapping $\sigma \mapsto \|\sigma\|$ of the sentences
of $\mathcal{L}(A)$ into $PI$ by induction on $\deg \sigma$ as follows:

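$$\|\mathbf{f} = \mathbf{g}\| =_{df} E(f, g),$$

$$\|\mathbf{R}\mathbf{f}\mathbf{g}\| =_{df} R(f, g);$$

for $\mathcal{L}(A)$-sentences $\sigma, \sigma'$,

$$\|\sigma \wedge \sigma'\| =_{df} \|\sigma\| \cap \|\sigma'\|,$$

$$\|\neg\sigma\| =_{df} I - \|\sigma\|,$$

and for each variable $\mathbf{x}$ and each $\mathcal{L}(A)$-formula $\varphi$ with at most $\mathbf{x}$ free,

$$\|\exists \mathbf{x}\varphi\| =_{df} \bigcup_{f \in A} \|\varphi(\mathbf{x}/\mathbf{f})\|.$$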

We may think of the mapping $\|\cdot\|$ as assigning “Boolean truth values”
in the Boolean algebra $PI$ to the sentences of $\mathcal{L}(A)$, and the triple $\langle A, E, R \rangle$
as a “Boolean-valued” $\mathcal{L}$-structure, in which the “truth value” of an
$\mathcal{L}(A)$-sentence $\sigma$ is now $\|\sigma\|$, and no longer simply one of the two truth values
$\top$ or $\bot$ (cf. Prob. 5.14).
In order to simplify the notation employed in the sequel, we make the
following conventions. If $\varphi$ is any $\mathcal{L}$-formula all of whose free variables
are among $\mathbf{v}_0,\dots,\mathbf{v}_n$ and $f_0,\dots,f_n \in A$, we agree to write $\varphi(\mathbf{f}_0,\dots,\mathbf{f}_n)$ for
$\varphi(\mathbf{v}_0/\mathbf{f}_0,\dots,\mathbf{v}_n/\mathbf{f}_n)$. Also, if $\varphi$ has at most the variable $\mathbf{x}$ free and $f \in A$, we
write $\varphi(\mathbf{f})$ for $\varphi(\mathbf{x}/\mathbf{f})$.

3.1. Theorem. Let $\varphi$ be any $\mathcal{L}$-formula whose free variables are all among
$\mathbf{v}_0,\dots,\mathbf{v}_n$. Then for any $f_0,\dots,f_n \in A$ we have¹

$$\|\varphi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| = \{i \in I : \mathfrak{A}_i \models \varphi[f_0(i),\dots,f_n(i)]\}.$$

Proof. We argue by induction on $\deg \varphi$. For atomic $\varphi$ the result is true
by definition.
Suppose now that the result holds for all $\psi$ with $\deg \psi < \deg \varphi$. We show
that it holds for $\varphi$. There are three cases to consider.

¹ Notice that, under these conditions, $\varphi(\mathbf{f}_0,\dots,\mathbf{f}_n)$ is an $\mathcal{L}(A)$-sentence, so
$\|\varphi(\mathbf{f}_0,\dots,\mathbf{f}_n)\|$ is defined.

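(a) $\varphi$ is $\psi \wedge \chi$. We have

$$\|\varphi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| = \|\psi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| \cap \|\chi(\mathbf{f}_0,\dots,\mathbf{f}_n)\|$$

$$= \{i \in I : \mathfrak{A}_i \models \psi[f_0(i),\dots,f_n(i)]\} \cap \{i \in I : \mathfrak{A}_i \models \chi[f_0(i),\dots,f_n(i)]\}$$

$$= \{i \in I : \mathfrak{A}_i \models (\psi \wedge \chi)[f_0(i),\dots,f_n(i)]\}$$

$$= \{i \in I : \mathfrak{A}_i \models \varphi[f_0(i),\dots,f_n(i)]\},$$

and so in this case the result holds for $\varphi$.
(b) $\varphi$ is $\neg\chi$. We have

$$\|\varphi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| = I - \|\chi(\mathbf{f}_0,\dots,\mathbf{f}_n)\|$$

$$= I - \{i \in I : \mathfrak{A}_i \models \chi[f_0(i),\dots,f_n(i)]\}$$

$$= \{i \in I : \mathfrak{A}_i \models \neg\chi[f_0(i),\dots,f_n(i)]\}$$

$$= \{i \in I : \mathfrak{A}_i \models \varphi[f_0(i),\dots,f_n(i)]\},$$

so the result holds for $\varphi$ in this case as well.
(c) $\varphi$ is $\exists \mathbf{v}_k \psi$. Without loss of generality we may assume that $k \le n$,
and we have

$$\|\varphi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| = \bigcup_{f \in A} \|\psi(\mathbf{f}_0,\dots,\mathbf{f}_{k-1},\mathbf{f},\mathbf{f}_{k+1},\dots,\mathbf{f}_n)\|$$

$$= \bigcup_{f \in A} \{i \in I : \mathfrak{A}_i \models \psi[f_0(i),\dots,f_{k-1}(i),f(i),f_{k+1}(i),\dots,f_n(i)]\}$$

$$= \{i \in I : \text{for some } f \in A,\ \mathfrak{A}_i \models \psi[f_0(i),\dots,f_{k-1}(i),f(i),f_{k+1}(i),\dots,f_n(i)]\}$$

$$= X, \text{ say.}$$

Clearly

$$X \subseteq \{i \in I : \mathfrak{A}_i \models (\exists \mathbf{v}_k \psi)[f_0(i),\dots,f_n(i)]\}.$$

On the other hand, if $i \in I$ is such that

$$\mathfrak{A}_i \models (\exists \mathbf{v}_k \psi)[f_0(i),\dots,f_n(i)]$$

then there is some $a \in A_i$ such that

$$\mathfrak{A}_i \models \psi[f_0(i),\dots,f_{k-1}(i),a,f_{k+1}(i),\dots,f_n(i)].$$

By taking $f \in A$ so that $f(i) = a$, we see immediately that $i \in X$. Hence

$$X = \{i \in I : \mathfrak{A}_i \models (\exists \mathbf{v}_k \psi)[f_0(i),\dots,f_n(i)]\}$$

$$= \{i \in I : \mathfrak{A}_i \models \varphi[f_0(i),\dots,f_n(i)]\},$$

and so the result holds for $\varphi$ in this case too. ∎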

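3.2. Problem. (i) Show that, if $\sigma$ is any $\mathcal{L}(A)$-sentence such that $\vdash \sigma$,
then $\|\sigma\| = I$. (Use Thm. 3.1.)
(ii) Suppose that each $\mathfrak{A}_i$ is identical with a fixed structure $\mathfrak{B}$. Then
$\prod A_i = B^I$. For each $b \in B$ let $\hat{b}$ be the function on $I$ with constant value $b$.
Show that, for each formula $\varphi$ with free variables among $\mathbf{v}_0,\dots,\mathbf{v}_n$, and all
$b_0,\dots,b_n \in B$,

$$\mathfrak{B} \models \varphi[b_0,\dots,b_n] \Leftrightarrow \|\varphi(\hat{\mathbf{b}}_0,\dots,\hat{\mathbf{b}}_n)\| = I.$$

(Use Thm. 3.1.)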


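3.3. Problem. Let $\varphi$ be any $\mathcal{L}(A)$-formula with at most the variable
$\mathbf{x}$ free, and let $f, g \in A$. Show that

$$\|\mathbf{f} = \mathbf{g}\| \cap \|\varphi(\mathbf{f})\| \subseteq \|\varphi(\mathbf{g})\|.$$

(Use Thm. 3.1; alternatively, argue by induction on $\deg \varphi$.)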

Recall that when we defined the mapping $\|\cdot\|$ we insisted that

$$\|\exists \mathbf{x}\varphi\| = \bigcup_{f \in A} \|\varphi(\mathbf{f})\|,$$

i.e. $\|\exists \mathbf{x}\varphi\|$ is the supremum, in $PI$, of the $\|\varphi(\mathbf{f})\|$ for $f \in A$. We now show
that this supremum is actually attained by at least one of the $\|\varphi(\mathbf{f})\|$.

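3.4. Lemma. Let $\varphi$ be an $\mathcal{L}(A)$-formula with at most the variable $\mathbf{x}$ free.
Then there is $f \in A$ such that

$$\|\exists \mathbf{x}\varphi\| = \|\varphi(\mathbf{f})\|.$$

Proof. Well-order $A$ in the form $\{f_\xi : \xi < \alpha\}$ for some ordinal $\alpha$. For
each $\xi < \alpha$, put

$$X_\xi = \|\varphi(\mathbf{f}_\xi)\| - \bigcup_{\eta < \xi} \|\varphi(\mathbf{f}_\eta)\|.$$

Then we have

$$(1) \quad \|\exists \mathbf{x}\varphi\| = \bigcup_{f \in A} \|\varphi(\mathbf{f})\| = \bigcup_{\xi < \alpha} \|\varphi(\mathbf{f}_\xi)\| = \bigcup_{\xi < \alpha} X_\xi.$$

Also, if $\xi \ne \eta$, $X_\xi \cap X_\eta = \emptyset$, so we can choose $f \in A$ to satisfy $f|X_\xi = f_\xi|X_\xi$
for all $\xi < \alpha$. Then $X_\xi \subseteq \|\mathbf{f} = \mathbf{f}_\xi\|$ and hence, by 3.3,

$$\|\varphi(\mathbf{f})\| \supseteq \|\mathbf{f} = \mathbf{f}_\xi\| \cap \|\varphi(\mathbf{f}_\xi)\| \supseteq X_\xi$$

for all $\xi < \alpha$. Hence

$$\bigcup_{\xi < \alpha} X_\xi \subseteq \|\varphi(\mathbf{f})\|$$

and so $\|\exists \mathbf{x}\varphi\| \subseteq \|\varphi(\mathbf{f})\|$ by (1). The reverse inclusion is an immediate
consequence of the definition of $\|\exists \mathbf{x}\varphi\|$, and the result follows. ∎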

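We next show how to reduce the “Boolean-valued structure” $\langle A, E, R \rangle$
to an ordinary $\mathcal{L}$-structure.
Let $\mathcal{F}$ be a subset of $PI$, and define the relations $\sim_{\mathcal{F}}$, $R_{\mathcal{F}}$ on $A$ by putting

$$f \sim_{\mathcal{F}} g \Leftrightarrow \|\mathbf{f} = \mathbf{g}\| \in \mathcal{F},$$

$$(f, g) \in R_{\mathcal{F}} \Leftrightarrow \|\mathbf{R}\mathbf{f}\mathbf{g}\| \in \mathcal{F}.$$

We now have: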

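3.5. Lemma. Suppose that $\mathcal{F}$ is a filter over $I$ (Prob. 4.3.16). Then:
(i) $\sim_{\mathcal{F}}$ is an equivalence relation on $A$;
(ii) if $f \sim_{\mathcal{F}} f'$ and $g \sim_{\mathcal{F}} g'$, then $(f, g) \in R_{\mathcal{F}}$ implies $(f', g') \in R_{\mathcal{F}}$.
Proof. (i) By definition,

$$\|\mathbf{f} = \mathbf{g}\| = \{i \in I : f(i) = g(i)\}.$$

Since $I \in \mathcal{F}$, $\sim_{\mathcal{F}}$ is reflexive. Clearly $\sim_{\mathcal{F}}$ is symmetric, so it remains to
establish its transitivity. Thus suppose that $f \sim_{\mathcal{F}} g$ and $g \sim_{\mathcal{F}} h$; then
$\|\mathbf{f} = \mathbf{g}\| \in \mathcal{F}$ and $\|\mathbf{g} = \mathbf{h}\| \in \mathcal{F}$. Accordingly, since $\mathcal{F}$ is a filter,

$$\|\mathbf{f} = \mathbf{g}\| \cap \|\mathbf{g} = \mathbf{h}\| \in \mathcal{F}.$$

But clearly

$$\|\mathbf{f} = \mathbf{g}\| \cap \|\mathbf{g} = \mathbf{h}\| \subseteq \|\mathbf{f} = \mathbf{h}\|.$$

Hence $\|\mathbf{f} = \mathbf{h}\| \in \mathcal{F}$, so $f \sim_{\mathcal{F}} h$. Thus $\sim_{\mathcal{F}}$ is transitive, and (i) is proved.
The proof of (ii) is similar to that of (i), and is entrusted to the reader. ∎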

From now on we will assume that $\mathcal{F}$ is a fixed but arbitrary ultrafilter
over $I$. Intuitively an ultrafilter over $I$ consists of “large” subsets¹ of $I$
and we may think of the statement $f \sim_{\mathcal{F}} g$ as asserting that $f$ and $g$ agree
on a “large” subset of $I$ (or that the Boolean truth value $\|\mathbf{f} = \mathbf{g}\|$ of the
assertion that $f = g$ is close to the largest element $I$ of $PI$).

¹ In the sense that, if $\mathcal{F}$ is an ultrafilter over $I$, and $h$ is the canonical homomorphism
(Ch. 4, §3) of $PI$ onto 2, then $h(X) = 1$ if and only if $X \in \mathcal{F}$.

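For each $f \in A$ we let $f/\mathcal{F}$ be the $\sim_{\mathcal{F}}$-class of $f$ and we put

$$A/\mathcal{F} = \prod_{i \in I} A_i/\mathcal{F} = \{f/\mathcal{F} : f \in A\}.$$

The relation $R_{\mathcal{F}}$ on $A$ naturally induces the relation $R/\mathcal{F}$ on $A/\mathcal{F}$ defined by

$$(f/\mathcal{F}, g/\mathcal{F}) \in R/\mathcal{F} \Leftrightarrow (f, g) \in R_{\mathcal{F}}.$$

By 3.5(ii), this is a sound definition.
Now define $\prod_{i \in I} \mathfrak{A}_i/\mathcal{F}$ to be the structure¹ $\langle A/\mathcal{F}, R/\mathcal{F} \rangle$. Then $\prod_{i \in I} \mathfrak{A}_i/\mathcal{F}$
is an $\mathcal{L}$-structure called the ultraproduct of the family $\{\mathfrak{A}_i : i \in I\}$ with
respect to the ultrafilter $\mathcal{F}$. If each $\mathfrak{A}_i$ is identical with some fixed structure
$\mathfrak{B}$, the ultraproduct is written $\mathfrak{B}^I/\mathcal{F}$ and is called the ultrapower of $\mathfrak{B}$
with respect to $\mathcal{F}$.
Our next result is basic.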

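3.6. Theorem. Let $\varphi$ be an $\mathcal{L}$-formula all of whose free variables are
among $\mathbf{v}_0,\dots,\mathbf{v}_n$. Then for any $f_0,\dots,f_n \in A$ we have

$$\prod \mathfrak{A}_i/\mathcal{F} \models \varphi[f_0/\mathcal{F},\dots,f_n/\mathcal{F}] \Leftrightarrow \|\varphi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| \in \mathcal{F}.$$

Proof. The proof is by induction on $\deg \varphi$. For atomic $\varphi$ the result
holds by definition.
Suppose now that the result holds for all $\psi$ with $\deg \psi < \deg \varphi$. We
show that it holds for $\varphi$. As usual, there are three cases to consider:
(a) $\varphi$ is $\psi \wedge \chi$. We have, using the inductive hypothesis and the fact
that $\mathcal{F}$ is a filter,

$\|\varphi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| \in \mathcal{F}$
  $\Leftrightarrow \|(\psi \wedge \chi)(\mathbf{f}_0,\dots,\mathbf{f}_n)\| \in \mathcal{F}$
  $\Leftrightarrow \|\psi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| \in \mathcal{F}$ & $\|\chi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| \in \mathcal{F}$
  $\Leftrightarrow \prod \mathfrak{A}_i/\mathcal{F} \models \psi[f_0/\mathcal{F},\dots,f_n/\mathcal{F}]$ & $\prod \mathfrak{A}_i/\mathcal{F} \models \chi[f_0/\mathcal{F},\dots,f_n/\mathcal{F}]$
  $\Leftrightarrow \prod \mathfrak{A}_i/\mathcal{F} \models (\psi \wedge \chi)[f_0/\mathcal{F},\dots,f_n/\mathcal{F}]$
  $\Leftrightarrow \prod \mathfrak{A}_i/\mathcal{F} \models \varphi[f_0/\mathcal{F},\dots,f_n/\mathcal{F}]$,

so in this case the result holds for $\varphi$.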

¹ $\langle A/\mathcal{F}, R/\mathcal{F} \rangle$ is the quotient of the Boolean-valued structure $\langle A, E, R \rangle$ by the ultrafilter $\mathcal{F}$.



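(b) $\varphi$ is $\neg\chi$. We have, using the inductive hypothesis and the fact
that $\mathcal{F}$ is an ultrafilter,

$\|\varphi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| \in \mathcal{F}$
  $\Leftrightarrow \|\neg\chi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| \in \mathcal{F}$
  $\Leftrightarrow \|\chi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| \notin \mathcal{F}$
  $\Leftrightarrow$ not $[\prod \mathfrak{A}_i/\mathcal{F} \models \chi[f_0/\mathcal{F},\dots,f_n/\mathcal{F}]]$
  $\Leftrightarrow \prod \mathfrak{A}_i/\mathcal{F} \models \neg\chi[f_0/\mathcal{F},\dots,f_n/\mathcal{F}]$
  $\Leftrightarrow \prod \mathfrak{A}_i/\mathcal{F} \models \varphi[f_0/\mathcal{F},\dots,f_n/\mathcal{F}]$,

so the result holds for $\varphi$ in this case as well.
(c) $\varphi$ is $\exists \mathbf{v}_k \psi$. As usual we may assume without loss of generality
that $k \le n$. Then we have

$\|\varphi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| \in \mathcal{F}$
  $\Leftrightarrow \|\exists \mathbf{v}_k \psi(\mathbf{f}_0,\dots,\mathbf{f}_n)\| \in \mathcal{F}$
  $\Leftrightarrow$ for some $f \in A$, $\|\psi(\mathbf{f}_0,\dots,\mathbf{f}_{k-1},\mathbf{f},\mathbf{f}_{k+1},\dots,\mathbf{f}_n)\| \in \mathcal{F}$ (by Lemma 3.4)
  $\Leftrightarrow$ for some $f \in A$, $\prod \mathfrak{A}_i/\mathcal{F} \models \psi[f_0/\mathcal{F},\dots,f_{k-1}/\mathcal{F},f/\mathcal{F},f_{k+1}/\mathcal{F},\dots,f_n/\mathcal{F}]$
  $\Leftrightarrow \prod \mathfrak{A}_i/\mathcal{F} \models (\exists \mathbf{v}_k \psi)[f_0/\mathcal{F},\dots,f_n/\mathcal{F}]$
  $\Leftrightarrow \prod \mathfrak{A}_i/\mathcal{F} \models \varphi[f_0/\mathcal{F},\dots,f_n/\mathcal{F}]$.

Thus in this case, too, the result holds for $\varphi$, and the proof is complete. ∎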

Putting Thms. 3.1 and 3.6 together we obtain the following fundamental
result:

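3.7. Łoś' Theorem. For any $\mathcal{L}$-formula $\varphi$ whose free variables are all
among $\mathbf{v}_0,\dots,\mathbf{v}_n$ and any $f_0,\dots,f_n \in \prod A_i$ we have

$$\prod \mathfrak{A}_i/\mathcal{F} \models \varphi[f_0/\mathcal{F},\dots,f_n/\mathcal{F}] \Leftrightarrow \{i \in I : \mathfrak{A}_i \models \varphi[f_0(i),\dots,f_n(i)]\} \in \mathcal{F}.$$

In particular, for any $\mathcal{L}$-sentence $\sigma$ we have

$$\prod \mathfrak{A}_i/\mathcal{F} \models \sigma \Leftrightarrow \{i \in I : \mathfrak{A}_i \models \sigma\} \in \mathcal{F}.$$ ∎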

Now let $\mathfrak{A}$ be an $\mathcal{L}$-structure and let $\mathcal{F}$ be an ultrafilter over $I$. For
each $a \in A$ let $\hat{a} \in A^I$ be defined by $\hat{a}(i) = a$ for each $i \in I$; $\hat{a}$ is called the constant
function (on $I$) with value $a$. Define the mapping $d: A \to A^I/\mathcal{F}$ by putting

$$d(a) = \hat{a}/\mathcal{F} \quad \text{for each } a \in A.$$

It is easy to verify that $d$ is an embedding of $\mathfrak{A}$ into $\mathfrak{A}^I/\mathcal{F}$; it is called the
canonical embedding of $\mathfrak{A}$ into $\mathfrak{A}^I/\mathcal{F}$.

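3.8. Theorem. The canonical embedding $d$ is an elementary embedding
of $\mathfrak{A}$ into $\mathfrak{A}^I/\mathcal{F}$. Hence $\mathfrak{A}$ and $\mathfrak{A}^I/\mathcal{F}$ are elementarily equivalent.
Proof. Let $\varphi$ be a formula whose free variables are all among $\mathbf{v}_0,\dots,\mathbf{v}_n$,
and let $a_0,\dots,a_n$ be any elements of $A$, with $\hat{a}_j$ the constant function on $I$ with value $a_j$.
Then by Łoś' Theorem we have

$\mathfrak{A}^I/\mathcal{F} \models \varphi[d(a_0),\dots,d(a_n)]$
  $\Leftrightarrow \mathfrak{A}^I/\mathcal{F} \models \varphi[\hat{a}_0/\mathcal{F},\dots,\hat{a}_n/\mathcal{F}]$
  $\Leftrightarrow \{i \in I : \mathfrak{A} \models \varphi[\hat{a}_0(i),\dots,\hat{a}_n(i)]\} \in \mathcal{F}$
  $\Leftrightarrow \mathfrak{A} \models \varphi[a_0,\dots,a_n]$. ∎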

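3.9. Problem. (i) Show that if $\mathcal{F}$ is principal and generated by $i_0 \in I$,
then $\prod \mathfrak{A}_i/\mathcal{F} \cong \mathfrak{A}_{i_0}$. (Define $h: \prod A_i/\mathcal{F} \to A_{i_0}$ by $h(f/\mathcal{F}) = f(i_0)$; $h$ is the
required isomorphism.)
(ii) Deduce from (i) that if $I$ is finite, then $\mathfrak{A}^I/\mathcal{F} \cong \mathfrak{A}$.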
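
The map h of Prob. 3.9(i) can be watched at work on a toy case. The Python sketch below is an illustration only, not a solution of the problem: it assumes three small finite domains and the principal ultrafilter generated by the index `i0 = 1`, groups the elements of the Cartesian product into classes by testing whether their agreement set belongs to the ultrafilter, and records the value f(i0) attached to each class. All names (`domains`, `agreement`, `in_ultrafilter`) are ours.

```python
from itertools import product

domains = [[0, 1], [0, 1, 2], [0, 1, 2, 3]]   # assumed domains A_0, A_1, A_2
i0 = 1                                         # generator of the principal ultrafilter

def agreement(f, g):
    """The Boolean value ||f = g||: the set of indices where f and g agree."""
    return frozenset(i for i in range(len(domains)) if f[i] == g[i])

def in_ultrafilter(X):
    """Membership in the principal ultrafilter generated by i0."""
    return i0 in X

classes = []                                   # each entry is one equivalence class
for f in product(*domains):
    for cls in classes:
        if in_ultrafilter(agreement(f, cls[0])):
            cls.append(f)
            break
    else:
        classes.append([f])

# h sends the class of f to f(i0); here it is stored as value -> class.
h = {cls[0][i0]: cls for cls in classes}
print(sorted(h))
```

In this instance there are exactly as many classes as elements of the chosen factor, and the value at `i0` is constant on each class, which is what makes h well defined and bijective.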
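3.10. Problem. Show that, if $A$ is finite, then $\mathfrak{A}^I/\mathcal{F} \cong \mathfrak{A}$. (In this case
$d$ maps $A$ onto $A^I/\mathcal{F}$.)
3.11. Problem. (i) For each $n \in \omega$ let $\mathfrak{A}_n = \langle A_n, <_n \rangle$ be an infinite
well-ordered set, and let $\mathcal{F}$ be a non-principal ultrafilter over $\omega$. Show
that $\prod \mathfrak{A}_n/\mathcal{F}$ is not well-ordered. (We may assume that $\langle \omega, < \rangle$ is an initial
segment of each $\mathfrak{A}_n$. Show that the elements $\langle 0,1,2,3,\dots \rangle/\mathcal{F}$, $\langle 0,0,1,2,\dots \rangle/\mathcal{F}$,
$\langle 0,0,0,1,\dots \rangle/\mathcal{F}$, etc., constitute an infinite descending chain in $\prod \mathfrak{A}_n/\mathcal{F}$.)
(ii) Let $\mathfrak{N} = \langle \omega, +, \times, 0, 1 \rangle$, and let $\mathcal{F}$ be a non-principal ultrafilter
over $\omega$. Show that $\mathfrak{N}^\omega/\mathcal{F}$ is not isomorphic to $\mathfrak{N}$. (Use (i).)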

We now apply Łoś' Theorem to obtain a proof of the compactness
theorem which is purely semantic and independent of the results of
Chapters 1-3.

3.12. Compactness Theorem.¹ Let $\Sigma$ be a set of $\mathcal{L}$-sentences. If each
finite subset of $\Sigma$ has a model, then $\Sigma$ has a model.
Proof. Suppose that each finite subset of $\Sigma$ has a model; let $I$ be the
family of all finite subsets of $\Sigma$. For each $\Delta \in I$ let $\mathfrak{A}_\Delta$ be a model of $\Delta$,

¹ Once we have given a purely semantic proof of the compactness theorem, it will be
clear that the proofs of Thms. 2.6, 2.7 and 2.8 also become purely semantic. A different
semantic proof of Thm. 2.6 is given in Prob. 3.16.

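and define

$$\widehat{\Delta} = \{\Delta' \in I : \Delta \subseteq \Delta'\}.$$

If $\{\Delta_1,\dots,\Delta_n\} \subseteq I$, then clearly

$$\Delta_1 \cup \dots \cup \Delta_n \in \widehat{\Delta}_1 \cap \dots \cap \widehat{\Delta}_n,$$

so that $\{\widehat{\Delta} : \Delta \in I\}$ is a subset of $PI$ with the finite intersection property.
Let $\mathcal{F}$ be an ultrafilter over $I$ containing each $\widehat{\Delta}$. We claim that the
ultraproduct $\prod_{\Delta \in I} \mathfrak{A}_\Delta/\mathcal{F}$ is a model of $\Sigma$. For suppose that $\sigma \in \Sigma$. Then $\{\sigma\} \in I$,
and if $\{\sigma\} \subseteq \Delta \in I$ then $\mathfrak{A}_\Delta \models \sigma$, so that

$$\widehat{\{\sigma\}} = \{\Delta \in I : \{\sigma\} \subseteq \Delta\} \subseteq \{\Delta \in I : \mathfrak{A}_\Delta \models \sigma\} \in \mathcal{F},$$

since $\widehat{\{\sigma\}} \in \mathcal{F}$ by construction. It follows from Łoś' Theorem that
$\prod \mathfrak{A}_\Delta/\mathcal{F} \models \sigma$. This completes the proof. ∎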
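
The combinatorial step of this proof, that the union of finitely many finite sets lies in the intersection of their "hats", can be tested on a toy instance. The Python sketch below assumes a three-element stand-in for the set of sentences (the strings `"s0"`, `"s1"`, `"s2"` are placeholders, not sentences of any particular language) and checks the finite intersection property for all families of at most three index sets. It does not construct the ultrafilter, whose existence in general is not a finite computation.

```python
from itertools import combinations

sigma = ["s0", "s1", "s2"]          # assumption: placeholder "sentences"

def finite_subsets(items):
    """All finite subsets of a finite list, as frozensets."""
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

I = finite_subsets(sigma)           # the index set used in the proof

def hat(delta):
    """All members of I that include delta."""
    return frozenset(d for d in I if delta <= d)

def union_in_intersection(deltas):
    """The union of the deltas belongs to the hat of each of them."""
    union = frozenset().union(*deltas)
    return all(union in hat(d) for d in deltas)

ok = all(union_in_intersection(ds)
         for r in (1, 2, 3) for ds in combinations(I, r))
print(ok)
```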

Remark. How the Compactness Theorem got its name. The term “compact”
is topological, but there is no mention of topology either in the Compactness
Theorem or its proof. Why, then, do we call it this?
The answer is really quite simple. Let $\mathscr{S}$ be the class of all $\mathcal{L}$-structures,
and for each $\mathcal{L}$-sentence $\sigma$ and each set $\Sigma$ of $\mathcal{L}$-sentences let

$$\mathrm{Mod}(\sigma) = \{\mathfrak{A} \in \mathscr{S} : \mathfrak{A} \models \sigma\}, \qquad \mathrm{Mod}(\Sigma) = \{\mathfrak{A} \in \mathscr{S} : \mathfrak{A} \models \Sigma\}.$$

Classes of structures of the form $\mathrm{Mod}(\sigma)$ or $\mathrm{Mod}(\Sigma)$ are called elementary
and generalized elementary classes, respectively. It is easy to see that
the elementary classes form a field of subclasses of $\mathscr{S}$; accordingly they
form a base for a topology on $\mathscr{S}$, called the elementary topology. Since
the generalized elementary classes are precisely the intersections of
elementary classes (as is easily verified) and the elementary classes are
clopen subsets of $\mathscr{S}$, the generalized elementary classes constitute the
family of all closed sets in the elementary topology. Using this remark,
one easily shows that the Compactness Theorem means that the elementary
topology is compact!

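3.13. Problem. Let $\mathscr{S}$ be the class of all $\mathcal{L}$-structures, and let $\mathscr{K} \subseteq \mathscr{S}$.
(i) Show that $\mathscr{K}$ is an elementary class iff both $\mathscr{K}$ and $\mathscr{S} - \mathscr{K}$ are
generalized elementary classes.
(ii) Show that $\mathscr{K}$ is a generalized elementary class iff every ultraproduct
of members of $\mathscr{K}$ is in $\mathscr{K}$ and every structure elementarily equivalent to
a member of $\mathscr{K}$ is in $\mathscr{K}$. (Let $\sigma$ be a variable ranging over sentences. For
the “if” part, let

$$\Sigma = \{\sigma : \sigma \text{ holds in all } \mathfrak{A} \in \mathscr{K}\}.$$

Then $\mathscr{K} \subseteq \mathrm{Mod}(\Sigma)$. If $\mathfrak{A} \in \mathrm{Mod}(\Sigma)$, let

$$\Delta = \{\sigma : \mathfrak{A} \models \sigma\},$$

and for each $\sigma \in \Delta$ show that one can choose $\mathfrak{B}_\sigma \in \mathscr{K}$ so that $\mathfrak{B}_\sigma \models \sigma$. For
each $\sigma \in \Delta$ let $\Gamma_\sigma = \{\tau \in \Delta : \mathfrak{B}_\tau \models \sigma\}$; show that $\{\Gamma_\sigma : \sigma \in \Delta\}$ can be extended
to an ultrafilter $\mathcal{F}$ over $\Delta$ and that $\prod_{\sigma \in \Delta} \mathfrak{B}_\sigma/\mathcal{F} \equiv \mathfrak{A}$.)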
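*3.14. Problem. (i) Let $\mathfrak{A}$ and $\mathfrak{B}$ be $\mathcal{L}$-structures. Show that $\mathfrak{A} \equiv \mathfrak{B}$
iff $\mathfrak{A}$ is elementarily embeddable in an ultrapower of $\mathfrak{B}$. (For the “only
if” part, suppose that $\mathfrak{A} \equiv \mathfrak{B}$, and let $a = \langle a_k : k \in K \rangle$ be an indexing of $A$.
Let $\Sigma$ be the set of all $\mathcal{L}_K$-sentences which hold in $(\mathfrak{A}, a)$. For each $\sigma \in \Sigma$
show that there is a sequence $b_\sigma \in B^K$ such that $(\mathfrak{B}, b_\sigma) \models \sigma$; then put

$$\Delta_\sigma = \{\tau \in \Sigma : (\mathfrak{B}, b_\tau) \models \sigma\}.$$

Show that $\{\Delta_\sigma : \sigma \in \Sigma\}$ can be extended to an ultrafilter $\mathcal{F}$ over $\Sigma$ and
that $\mathfrak{A}$ is elementarily embeddable in $\mathfrak{B}^\Sigma/\mathcal{F}$.)
(ii) Deduce from (i) that if $A$ is finite and $\mathfrak{A} \equiv \mathfrak{B}$ then $\mathfrak{A} \cong \mathfrak{B}$.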
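*3.15. Problem. Let $\{\mathfrak{A}_i : i \in I\}$ be a family of fields, and let $\mathcal{F}$ be an
ultrafilter over $I$. For each $f \in \prod A_i$ let

$$Z(f) = \{i \in I : f(i) = 0\}$$

and let

$$M = \{f \in \prod A_i : Z(f) \in \mathcal{F}\}.$$

Show that $\prod \mathfrak{A}_i$ is a ring in which $M$ is a maximal ideal, and that the
ultraproduct $\prod \mathfrak{A}_i/\mathcal{F}$ is isomorphic to the quotient ring $\prod \mathfrak{A}_i/M$.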
*3.16. Problem. For each set $I$ let $P_\omega(I)$ be the family of all finite subsets
of $I$. An ultrafilter $\mathcal{F}$ over $I$ is said to be regular if there is a one-one
map $f$ of $I$ onto $P_\omega(I)$ such that for each $i \in I$,

$$\{j \in I : i \in f(j)\} \in \mathcal{F}.$$

(i) Show that, if $I$ is infinite, there is a regular ultrafilter over $I$. (Let
$f$ be a one-one map of $I$ onto $P_\omega(I)$. For each $i \in I$ let

$$E_i = \{j \in I : i \in f(j)\}.$$

Show that $\{E_i : i \in I\}$ has the finite intersection property and that any
ultrafilter over $I$ which contains each $E_i$ is regular.)
(ii) Let $A, I$ be infinite sets, and let $\mathcal{F}$ be a regular ultrafilter over $I$.
Show that $|A^I/\mathcal{F}| = |A|^{|I|}$. (It suffices to construct a one-one map $h$ of
$A^I$ into $S^I/\mathcal{F}$, where $S$ is the set of all finite sequences of members of $A$.
Let $g$ be a one-one map of $I$ onto $P_\omega(I)$ such that, for each $j \in I$,

$$\{i \in I : j \in g(i)\} \in \mathcal{F}.$$

Let $<$ be a total ordering of $I$. For each $f \in A^I$ define $f^* \in S^I$ as follows.
For $i \in I$,

$$g(i) = \{i_0,\dots,i_k\} \in P_\omega(I)$$

with $i_0 < \dots < i_k$. Put

$$f^*(i) = \langle f(i_0),\dots,f(i_k) \rangle.$$

Now define $h: A^I \to S^I/\mathcal{F}$ by setting $h(f) = f^*/\mathcal{F}$ for each $f \in A^I$. Show
that $h$ is one-one.)
(iii) Deduce Thm. 2.6 from (i) and (ii).
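*3.17. Problem. (i) Let $\mathcal{L}$ be a countable language, let $\mathcal{F}$ be a countably
incomplete ultrafilter (Prob. 2.3.16) over a set $I$, let $\{\mathfrak{A}_i : i \in I\}$ be a family
of $\mathcal{L}$-structures, and let $\Gamma$ be a set of $\mathcal{L}$-formulas each of which has
exactly one free variable $\mathbf{v}_0$. Show that, if $\Gamma$ is finitely satisfiable (Prob. 2.12)
in $\prod \mathfrak{A}_i/\mathcal{F}$, then $\Gamma$ is satisfiable in $\prod \mathfrak{A}_i/\mathcal{F}$. (Let $\Gamma = \{\varphi_n : n \in \omega\}$, let
$I_0 \supseteq I_1 \supseteq I_2 \supseteq \dots$ be a descending chain of members of $\mathcal{F}$ such that $\bigcap_n I_n = \emptyset$
(Prob. 2.3.16), for each $n \in \omega$ let

$$Y_n = I_n \cap \{i \in I : \mathfrak{A}_i \models \exists \mathbf{v}_0(\varphi_0 \wedge \dots \wedge \varphi_n)\},$$

and for each $i \in I$ let $\mu(i)$ be the greatest $n$ such that $i \in Y_n$. Define $f \in \prod A_i$
by setting $f(i)$ to be an arbitrary element of $A_i$ if $i \notin Y_0$, while if $i \in Y_0$ choose
$f(i) \in A_i$ so that

$$\mathfrak{A}_i \models (\varphi_0 \wedge \dots \wedge \varphi_{\mu(i)})[f(i)].$$

Show, using Łoś' Theorem, that $f/\mathcal{F}$ satisfies $\Gamma$ in $\prod \mathfrak{A}_i/\mathcal{F}$.)
(ii) Use (i) to give another solution for Prob. 2.12(i).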

§ 4. Completeness and categoricity

Throughout this section we let $\mathcal{L}$ be a countable first-order language.
By a theory in $\mathcal{L}$ we shall mean a set $\Sigma$ of $\mathcal{L}$-sentences which is closed
under deducibility, i.e. such that for each $\mathcal{L}$-sentence $\sigma$, if $\Sigma \vdash \sigma$, then
$\sigma \in \Sigma$. A subset $\Gamma$ of a theory $\Sigma$ is called a set of postulates for $\Sigma$ if $\Gamma \vdash \sigma$
for every $\sigma \in \Sigma$. It is clear that each set of sentences $\Gamma$ is a set of postulates
for a unique theory $\Sigma$, namely,

$$\Sigma = \{\sigma : \sigma \text{ is an } \mathcal{L}\text{-sentence and } \Gamma \vdash \sigma\}.$$

A property $P$ of $\mathcal{L}$-structures is called a first-order property if there is
an $\mathcal{L}$-sentence $\sigma$ such that, for any $\mathcal{L}$-structure $\mathfrak{A}$,

$$\mathfrak{A} \text{ has property } P \Leftrightarrow \mathfrak{A} \models \sigma.$$

In this case we say that $P$ is induced by $\sigma$. To what extent does a given theory
$\Sigma$ determine the first-order properties of its models? The completeness
theorem tells us that the first-order properties shared by all models of $\Sigma$
are precisely those induced by the sentences in $\Sigma$. But this still gives
us very little information about the extent to which the first-order
properties of a specific model of $\Sigma$ are determined by $\Sigma$. And in general
the models of a given theory can be very different.
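Consider, for example, the (first-order) theory of partially ordered sets,
PO. This theory is formulated in a language $\mathcal{L}$ having one binary predicate
symbol $\mathbf{R}$. Its postulates are

$$\forall \mathbf{x}\, \mathbf{R}\mathbf{x}\mathbf{x},$$

$$\forall \mathbf{x} \forall \mathbf{y} [\mathbf{R}\mathbf{x}\mathbf{y} \wedge \mathbf{R}\mathbf{y}\mathbf{x} \to \mathbf{x} = \mathbf{y}],$$

$$\forall \mathbf{x} \forall \mathbf{y} \forall \mathbf{z} [\mathbf{R}\mathbf{x}\mathbf{y} \wedge \mathbf{R}\mathbf{y}\mathbf{z} \to \mathbf{R}\mathbf{x}\mathbf{z}].$$

An $\mathcal{L}$-structure $\langle A, R \rangle$ is then a model of PO iff it is a partially ordered set.
Since a partially ordered set can have many different first-order properties,
e.g. it can be a lattice, or a Boolean algebra, or a totally ordered set, etc.,
it is clear that PO does not precisely determine the first-order properties
of its models.
Let us call a theory $\Sigma$ complete if it is consistent and the first-order
properties of any model of $\Sigma$ are just those determined by the sentences
in $\Sigma$. More precisely, if for each $\mathcal{L}$-structure $\mathfrak{A}$ we define $\mathrm{Th}(\mathfrak{A})$, the
theory of $\mathfrak{A}$, to be the set of all $\mathcal{L}$-sentences $\sigma$ such that $\mathfrak{A} \models \sigma$, then $\Sigma$
is complete iff $\Sigma$ is consistent and $\mathrm{Th}(\mathfrak{A}) = \Sigma$ for each model $\mathfrak{A}$ of $\Sigma$.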

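4.1. Lemma. The following conditions on a consistent theory $\Sigma$ are equivalent:
(i) $\Sigma$ is complete.
(ii) Any pair of models of $\Sigma$ are elementarily equivalent.
(iii) For any $\mathcal{L}$-sentence $\sigma$, either $\sigma \in \Sigma$ or $\neg\sigma \in \Sigma$.
Proof. (i)⇒(ii). Suppose that $\Sigma$ is complete and that $\mathfrak{A}$ and $\mathfrak{B}$ are models
of $\Sigma$. Then $\mathrm{Th}(\mathfrak{A}) = \Sigma = \mathrm{Th}(\mathfrak{B})$, so that $\mathfrak{A} \equiv \mathfrak{B}$.
(ii)⇒(iii). Assume (ii) and let $\sigma$ be an $\mathcal{L}$-sentence. If $\sigma \notin \Sigma$, then by
Thm. 3.1.14, $\Sigma \cup \{\neg\sigma\}$ is consistent and therefore, by Thm. 2.9, has
a model $\mathfrak{A}$. Since $\neg\sigma$ holds in $\mathfrak{A}$, it must hold in every model of $\Sigma$ and
so, by the Strong Completeness Theorem 3.3.14, $\Sigma \vdash \neg\sigma$. Since $\Sigma$ is
a theory, $\neg\sigma \in \Sigma$.
(iii)⇒(i). Assume (iii), and let $\mathfrak{A}$ be any model of $\Sigma$. Then $\Sigma \subseteq \mathrm{Th}(\mathfrak{A})$.
If $\sigma \notin \Sigma$, then $\neg\sigma \in \Sigma$. Therefore $\mathfrak{A} \models \neg\sigma$, whence $\neg\sigma \in \mathrm{Th}(\mathfrak{A})$, so that
$\sigma \notin \mathrm{Th}(\mathfrak{A})$. Accordingly $\mathrm{Th}(\mathfrak{A}) \subseteq \Sigma$, so that $\mathrm{Th}(\mathfrak{A}) = \Sigma$ and $\Sigma$ is complete. ∎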

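4.2. Problem. Show that $\mathrm{Th}(\mathfrak{A})$ is a complete theory for any $\mathcal{L}$-structure
$\mathfrak{A}$, and that a consistent set $\Sigma$ of $\mathcal{L}$-sentences is a complete theory iff
$\Sigma = \mathrm{Th}(\mathfrak{A})$ for some $\mathcal{L}$-structure $\mathfrak{A}$.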

A natural strengthening of the condition for completeness given in
4.1(ii) is to insist that each pair of models of $\Sigma$ be isomorphic. Under
these conditions $\Sigma$ is said to be categorical. Certainly any categorical
theory is complete. On the other hand, no theory $\Sigma$ with an infinite model
can be categorical, for by the Löwenheim-Skolem theorem $\Sigma$ must then
have models of many cardinalities, and no pair of models of different
cardinalities can be isomorphic. However, it is still possible that any
two models of $\Sigma$ of the same (infinite) cardinality are isomorphic. This
suggests the following weakening of the notion of categoricity.
Let $\alpha$ be an infinite cardinal. A theory $\Sigma$ is said to be $\alpha$-categorical
if any pair of models of $\Sigma$ of cardinality $\alpha$ are isomorphic. We now give
some examples.

4.3. Examples. (i) Let $\mathcal{L}$ be the language with no extralogical symbols,
and let $\Sigma$ be the set of all $\mathcal{L}$-sentences which hold in every $\mathcal{L}$-structure.
Then $\Sigma$ is $\alpha$-categorical for every $\alpha$.
(ii) Let $\mathcal{L}$ be a language with no extralogical symbols except for one
unary predicate symbol, $\mathbf{P}$, and let $\Sigma$ be the set of $\mathcal{L}$-sentences which
hold in every $\mathcal{L}$-structure. Then $\Sigma$ is not $\alpha$-categorical for any $\alpha$.
(iii) Let $\mathcal{L}$ be as in (ii) and for each natural number $m$ let $\sigma_m$ be the
sentence which asserts that there are at least $m$ individuals having the
property $P$ and at least $m$ individuals not having the property $P$. Let $\Sigma$ be
the theory with $\{\sigma_m : m \in \omega\}$ as postulates. It is then easy to see that $\Sigma$
is $\aleph_0$-categorical but not $\alpha$-categorical for any uncountable $\alpha$. (We give
a more interesting example later on.)
(iv) Let $\mathcal{L}$ be a language whose only extralogical symbols are countably
many constants $\mathbf{c}_0, \mathbf{c}_1, \dots$, and let $\Sigma$ be the theory whose set of postulates
is $\{\mathbf{c}_m \ne \mathbf{c}_n : m \ne n\}$. It is then easily verified that $\Sigma$ is $\alpha$-categorical for
every uncountable $\alpha$ but not $\aleph_0$-categorical. (We give a more interesting
example later on.)

Notice that we have not given an example of a (countable) theory which
is $\alpha$-categorical for some but not all uncountable $\alpha$. The reason for this
omission is that no such theory exists, for Morley [1965] has shown that
if a countable theory is $\alpha$-categorical for some uncountable $\alpha$, then it is
$\alpha$-categorical for all uncountable $\alpha$. The proof of this deep result is too
lengthy to be included here.
The relevance of $\alpha$-categoricity to completeness is contained in the
following theorem.

4.4. Theorem. Let $\Sigma$ be a consistent theory with no finite models, and
which is $\alpha$-categorical for some infinite $\alpha$. Then $\Sigma$ is complete.
Proof. Suppose that $\Sigma$ is not complete. Then there is a sentence $\sigma$ such
that $\sigma \notin \Sigma$ and $\neg\sigma \notin \Sigma$. It follows that $\Sigma \cup \{\sigma\}$ and $\Sigma \cup \{\neg\sigma\}$ are both
consistent and hence have models, which must both be infinite since $\Sigma$ has
no finite models. Therefore, by the Löwenheim-Skolem Theorem, $\Sigma \cup \{\sigma\}$
and $\Sigma \cup \{\neg\sigma\}$ have models of cardinality $\alpha$. Since $\sigma$ holds in one of these
models but not in the other, $\Sigma$ is not $\alpha$-categorical. This contradiction
proves the theorem. ∎

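We now apply Thm. 4.4 to establish the completeness of various specific
theories.
First consider the theory UDO of unbounded dense (total) orderings.
Let $\mathcal{L}$ be a language with one binary predicate symbol $\mathbf{R}$. UDO is the
theory having the following postulates:

(UDO$_1$) $\forall \mathbf{x}\, \mathbf{R}\mathbf{x}\mathbf{x} \wedge \forall \mathbf{x} \forall \mathbf{y} [\mathbf{R}\mathbf{x}\mathbf{y} \wedge \mathbf{R}\mathbf{y}\mathbf{x} \to \mathbf{x} = \mathbf{y}] \wedge \forall \mathbf{x} \forall \mathbf{y} \forall \mathbf{z} [\mathbf{R}\mathbf{x}\mathbf{y} \wedge \mathbf{R}\mathbf{y}\mathbf{z} \to \mathbf{R}\mathbf{x}\mathbf{z}] \wedge \forall \mathbf{x} \forall \mathbf{y} [\mathbf{R}\mathbf{x}\mathbf{y} \vee \mathbf{R}\mathbf{y}\mathbf{x}]$
(UDO$_2$) $\forall \mathbf{x} \forall \mathbf{y} [\mathbf{R}\mathbf{x}\mathbf{y} \wedge \mathbf{x} \ne \mathbf{y} \to \exists \mathbf{z} [\mathbf{x} \ne \mathbf{z} \wedge \mathbf{y} \ne \mathbf{z} \wedge \mathbf{R}\mathbf{x}\mathbf{z} \wedge \mathbf{R}\mathbf{z}\mathbf{y}]]$
(UDO$_3$) $\forall \mathbf{x} \exists \mathbf{y} \exists \mathbf{z} [\mathbf{x} \ne \mathbf{y} \wedge \mathbf{x} \ne \mathbf{z} \wedge \mathbf{R}\mathbf{y}\mathbf{x} \wedge \mathbf{R}\mathbf{x}\mathbf{z}]$.

UDO$_1$ asserts that $R$ is a total ordering, UDO$_2$ asserts that $R$ is dense,
and UDO$_3$ that it is unbounded both below and above. Natural examples
of models of UDO are $\mathfrak{Q}$ and $\mathfrak{R}$, the sets of rational numbers and real numbers,
with their natural orderings.
Our next result is a classical theorem of Cantor. The proof provides
a first example of a so-called “back-and-forth” construction.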

4.5. Theorem. UDO is $\aleph_0$-categorical.
Proof. Let $\mathfrak{A} = \langle A, \le \rangle$ and $\mathfrak{B} = \langle B, \le \rangle$ be two denumerable models of UDO.
Thus each is an unbounded densely ordered set. Let $A = \{a_n : n \in \omega\}$ and
$B = \{b_n : n \in \omega\}$. We define two sequences $c_0, c_1, \dots$, and $d_0, d_1, \dots$, by
recursion as follows. First, put $c_0 = a_0$ and $d_0 = b_0$. Now suppose $k > 0$;
then we consider two cases.
(i) $k = 2m$ is even. In this case we put $c_k = a_m$. If, for some $j < k$, $c_k = c_j$,
then we put $d_k = d_j$. Otherwise we let $d_k$ be some element of $B$ which bears
the same order relations to the elements of $\{d_0,\dots,d_{k-1}\}$ as does $c_k$ to the
elements of $\{c_0,\dots,c_{k-1}\}$; that is, for each $j < k$, if¹ $c_k < c_j$ then $d_k < d_j$,
and if $c_j < c_k$ then $d_j < d_k$. Since $B$ is dense and unbounded it is clear that
we can always find such a $d_k$.
(ii) $k = 2m + 1$ is odd. In this case we put $d_k = b_m$. If, for some $j < k$, $d_k = d_j$,
then we put $c_k = c_j$. Otherwise we let $c_k$ be some element of $A$ which bears
the same order relations to the elements of $\{c_0,\dots,c_{k-1}\}$ as does $d_k$ to the
elements of $\{d_0,\dots,d_{k-1}\}$. Again such a $c_k$ can always be found.
This completes our recursive definition. We now define $h: A \to B$ by
putting $h(c_n) = d_n$ for each $n \in \omega$. It is clear from our construction that
$h$ is an isomorphism of $\mathfrak{A}$ onto $\mathfrak{B}$. ∎
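
The recursion in this proof can be run for finitely many steps. The Python sketch below is an illustration under assumptions of our own choosing: A is taken to be the rationals and B the dyadic rationals, each with one particular enumeration (`rationals`, `dyadics`), and "some element bearing the same order relations" is found by searching the enumeration for the first element that fits. None of these choices comes from the text; any two denumerable models of UDO with fixed enumerations would serve.

```python
from fractions import Fraction
from itertools import count, islice

def rationals():
    """One enumeration a_0, a_1, ... of the rationals, without repetition."""
    seen = set()
    for s in count(1):
        for q in range(1, s + 1):
            for x in (Fraction(s - q, q), Fraction(q - s, q)):
                if x not in seen:
                    seen.add(x)
                    yield x

def dyadics():
    """One enumeration b_0, b_1, ... of the dyadic rationals n / 2**k."""
    seen = set()
    for s in count(0):
        for k in range(s + 1):
            for x in (Fraction(s - k, 2 ** k), Fraction(k - s, 2 ** k)):
                if x not in seen:
                    seen.add(x)
                    yield x

def same_position(x, xs, y, ys):
    """True if y stands to each ys[j] exactly as x stands to xs[j]."""
    return all((x < xj) == (y < yj) and (x > xj) == (y > yj)
               for xj, yj in zip(xs, ys))

def back_and_forth(enum_a, enum_b, steps):
    """Return the first `steps` terms c_0, c_1, ... and d_0, d_1, ... of the proof."""
    a = list(islice(enum_a(), steps))
    b = list(islice(enum_b(), steps))
    c, d = [], []
    for k in range(steps):
        if k % 2 == 0:                       # "forth": take care of a_m, k = 2m
            x = a[k // 2]
            y = d[c.index(x)] if x in c else next(
                t for t in enum_b() if same_position(x, c, t, d))
        else:                                # "back": take care of b_m, k = 2m + 1
            y = b[k // 2]
            x = c[d.index(y)] if y in d else next(
                t for t in enum_a() if same_position(y, d, t, c))
        c.append(x)
        d.append(y)
    return c, d

c, d = back_and_forth(rationals, dyadics, 20)
```

After any finite number of steps the pairs (c_k, d_k) form a finite order-preserving partial map; the isomorphism of the theorem is the union of all these stages, which a finite run can only approximate.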

From 4.4 and 4.5 we immediately obtain

4.6. Corollary. UDO is a complete theory. ∎

In particular, since $\mathfrak{Q}$ and $\mathfrak{R}$ are models of UDO it follows from 4.6
that $\mathfrak{Q} \equiv \mathfrak{R}$ (cf. Prob. 1.8(ii)), and that UDO is a set of postulates for
both $\mathrm{Th}(\mathfrak{Q})$ and $\mathrm{Th}(\mathfrak{R})$.

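4.7. Problem. Show that UDO is not $\alpha$-categorical for any uncountable $\alpha$.
(Let $\mathfrak{A}$ be the ordered sum of $\alpha$ copies of $\mathfrak{Q}$, and let $\mathfrak{B}$ be the ordered sum
of $\alpha + 1$ copies of $\mathfrak{Q}$. Then $\mathfrak{A}$ and $\mathfrak{B}$ are models of UDO but $\mathfrak{A}$ is not
isomorphic to $\mathfrak{B}$ because every initial segment of $\mathfrak{A}$ has cardinality $< \alpha$
while this is not true of $\mathfrak{B}$.)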

¹ We write $a < b$ for “$a \le b$ and $a \ne b$” as usual.

The theory that we are going to consider next is most naturally formulated
in a language with function symbols.
Let $\mathcal{L}_F$ be the language for fields, that is, $\mathcal{L}_F$ is a first-order language
with constants $\mathbf{0}, \mathbf{1}$ and binary function symbols $+$, $\cdot$. The theory FT
of fields has the following postulates (where we write $\mathbf{x} + \mathbf{y}$ and $\mathbf{x} \cdot \mathbf{y}$ for
$+\mathbf{x}\mathbf{y}$ and $\cdot\mathbf{x}\mathbf{y}$):

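$$\forall \mathbf{x} \forall \mathbf{y} \forall \mathbf{z} [(\mathbf{x} + \mathbf{y}) + \mathbf{z} = \mathbf{x} + (\mathbf{y} + \mathbf{z})],$$

$$\forall \mathbf{x} (\mathbf{x} + \mathbf{0} = \mathbf{x}),$$

$$\forall \mathbf{x} \forall \mathbf{y} (\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}),$$

$$\forall \mathbf{x} \exists \mathbf{y} (\mathbf{x} + \mathbf{y} = \mathbf{0}),$$

$$\forall \mathbf{x} \forall \mathbf{y} \forall \mathbf{z} [(\mathbf{x} \cdot \mathbf{y}) \cdot \mathbf{z} = \mathbf{x} \cdot (\mathbf{y} \cdot \mathbf{z})],$$

$$\forall \mathbf{x} (\mathbf{x} \cdot \mathbf{1} = \mathbf{x}),$$

$$\forall \mathbf{x} \forall \mathbf{y} (\mathbf{x} \cdot \mathbf{y} = \mathbf{y} \cdot \mathbf{x}),$$

$$\forall \mathbf{x} [\mathbf{x} \ne \mathbf{0} \to \exists \mathbf{y} (\mathbf{x} \cdot \mathbf{y} = \mathbf{1})],$$

$$\forall \mathbf{x} \forall \mathbf{y} \forall \mathbf{z} [\mathbf{x} \cdot (\mathbf{y} + \mathbf{z}) = (\mathbf{x} \cdot \mathbf{y}) + (\mathbf{x} \cdot \mathbf{z})],$$

$$\mathbf{0} \ne \mathbf{1}.$$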

If we add the postulate

    p1 = 0

where p is a prime number and p1 = 1+...+1 with p summands, we get
the theory FT(p) of fields of characteristic p. If on the other hand we
add the infinite set of postulates

    {p1 ≠ 0: p a prime}

we get the theory FT(0) of fields of characteristic 0. We now put x^n for
the expression x·(x·(···x)···), with n factors. The infinite list of postulates,
one for each n≥1,

    ∀x_0...∀x_n [x_n≠0 → ∃y (x_n·y^n + x_{n-1}·y^{n-1} + ... + x_1·y + x_0 = 0)]

when added to the theory of fields gives us the theory ACF of algebraically
closed fields. If we add the postulate p1=0 or the set of postulates {p1≠0:
p a prime} to ACF we get the theories ACF(p) and ACF(0) of algebraically
closed fields of characteristic p and characteristic 0, respectively.
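
For a finite structure the postulates of FT, the postulate p1=0 and any single ACF postulate can be checked by brute force. The sketch below is an illustration only: it interprets + and · as addition and multiplication modulo n on {0,...,n-1}, and the function names satisfies_FT, p_times_one and acf_postulate are hypothetical. It is not a decision procedure for the theories themselves.

```python
from itertools import product


def satisfies_FT(n):
    """Brute-force check of the ten postulates of FT in Z/nZ
    (+ and . taken modulo n, constants interpreted as 0 and 1 % n)."""
    E = range(n)
    zero, one = 0, 1 % n
    add = lambda x, y: (x + y) % n
    mul = lambda x, y: (x * y) % n
    triples = list(product(E, repeat=3))
    return all([
        all(add(add(x, y), z) == add(x, add(y, z)) for x, y, z in triples),
        all(add(x, zero) == x for x in E),
        all(add(x, y) == add(y, x) for x in E for y in E),
        all(any(add(x, y) == zero for y in E) for x in E),
        all(mul(mul(x, y), z) == mul(x, mul(y, z)) for x, y, z in triples),
        all(mul(x, one) == x for x in E),
        all(mul(x, y) == mul(y, x) for x in E for y in E),
        all(x == zero or any(mul(x, y) == one for y in E) for x in E),
        all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z)) for x, y, z in triples),
        zero != one,
    ])


def p_times_one(n, p):
    """The value of p1 = 1 + ... + 1 (p summands) in Z/nZ."""
    total = 0
    for _ in range(p):
        total = (total + 1) % n
    return total


def acf_postulate(n, degree):
    """Check the ACF postulate for one degree in Z/nZ: every polynomial
    x_degree*y^degree + ... + x_0 with x_degree != 0 has a zero."""
    def value(coeffs, y):
        return sum(c * pow(y, i, n) for i, c in enumerate(coeffs)) % n
    return all(
        any(value(coeffs, y) == 0 for y in range(n))
        for coeffs in product(range(n), repeat=degree + 1)
        if coeffs[-1] != 0
    )
```

For instance, Z/5Z satisfies FT and the postulate 5·1=0, so it is a model of FT(5); it satisfies the ACF postulate for n=1 but not for n=2, since y·y+3 has no zero modulo 5.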
MODEL THEORY [CH. 5, §4
190

Notice that ACF(0) is not ℵ_0-categorical. For the field of algebraic
numbers and the algebraic closure of the field ℚ[π] obtained by adjoining
the transcendental π to ℚ are countable non-isomorphic models of ACF(0).
On the other hand, a classical theorem of Steinitz asserts that any pair
of algebraically closed fields of the same characteristic and the same
uncountable cardinality are isomorphic, so it follows from Thm. 4.4 that
ACF(0) and ACF(p) are complete theories¹. Since the field ℂ of complex
numbers is a model of ACF(0), it follows that ACF(0) is a set of postulates
for Th(ℂ). We also see that any first-order property possessed by ℂ is shared
by all algebraically closed fields of characteristic 0. (This is known as
Lefschetz’ principle.)
This method of establishing completeness also applies to the following
theories, each of which is α-categorical for some infinite α, and hence
complete:
(1) The theory of atomless Boolean algebras: ℵ_0-categorical.
(2) The theory of infinite Abelian groups in which all non-zero elements
are of the same prime order: α-categorical for all infinite α.
(3) The theory of infinite, divisible, torsion-free Abelian groups:
α-categorical for all uncountable α.

4.8. Problem. Show that if σ is a sentence in the language for fields which
has models of arbitrarily high finite characteristic, then σ has a model of
characteristic 0. (Use the Compactness Theorem.)

4.9. Problem. A theory is said to be finitely axiomatizable if it has a finite
set of postulates.
(i) Show that the theory FT(0) is not finitely axiomatizable. (Suppose
the contrary; then there is a sentence σ such that, for all fields 𝔉, 𝔉 ⊨ σ
iff 𝔉 is of characteristic 0. Now apply Prob. 4.8 to ¬σ.) Deduce that
the property of being a field of finite non-zero characteristic is not a
first-order property.
(ii) Assume the following result: for each n∈ω, there is a field which
is not algebraically closed but in which all polynomials of degree ≤n
have zeros. Show that ACF is not finitely axiomatizable. (Like 4.8.)

*4.10. Problem. An ordered field is a structure of the form
𝔉 = ⟨F, +, ·, 0, 1, ≤⟩, where ⟨F, +, ·, 0, 1⟩ is a field and ≤ is a total ordering
on F such that for all x,y,z∈F we have

    x≤y ⇒ x+z ≤ y+z,
    x≤y, z≥0 ⇒ x·z ≤ y·z.

An ordered field is said to be archimedean if for any x,y>0 there is a natural
number n such that y≤n·x (where n·x = x+...+x with n summands).

1 It is well-known that these theories have no finite models.


CH. 5, §5]. LINDENBAUM ALGEBRAS 191

Show that each ordered field (in particular, the real field) is elementarily
equivalent to a non-archimedean ordered field. (Use the Compactness
Theorem.) Deduce that the property of being an archimedean ordered
field is not a first-order property.

4.11. Problem. Show that every group which has elements of arbitrarily
high finite order is elementarily equivalent to a group which has an element
of infinite order. (Use the Compactness Theorem.)

§ 5. Lindenbaum algebras

In this section we show that each consistent set of first-order sentences
Σ gives rise to a Boolean algebra which can be used to investigate the
model-theoretic properties of Σ.
Let ℒ be a fixed language (not necessarily countable), let Φ be the set
of all ℒ-formulas, and let Σ be a consistent set of ℒ-sentences. We
define the relation ≈ on Φ by putting

    φ ≈ ψ ⇔ Σ ⊢ φ↔ψ

for all φ,ψ∈Φ. The relation ≈ is evidently an equivalence relation on Φ.
For each φ∈Φ, let |φ| be the ≈-class¹ of φ. Thus

    |φ| = {ψ∈Φ: φ≈ψ}.

Let

    B = {|φ|: φ∈Φ},

and let ≤ be the relation on B defined by putting

(5.1)    |φ| ≤ |ψ| ⇔ Σ ⊢ φ→ψ.

5.2. Problem. Show that (5.1) is a sound definition, i.e. that if |φ_1| = |φ_2|
and |ψ_1| = |ψ_2| then

    Σ ⊢ φ_1→ψ_1 ⇔ Σ ⊢ φ_2→ψ_2.

Our next result is crucial.

5.3. Theorem. ⟨B,≤⟩ is a Boolean algebra. In ⟨B,≤⟩ we have

    |φ| = 1 ⇔ Σ ⊢ φ,
    |φ| = 0 ⇔ Σ ⊢ ¬φ.

1 It is important to keep in mind that ≈ and |φ| depend on Σ as well as ℒ.


Proof. We have to show that ⟨B,≤⟩ is a complemented distributive lattice
with at least 2 elements.
Since φ→φ is a propositional tautology, we have Σ ⊢ φ→φ; hence
|φ|≤|φ|, and ≤ is reflexive. By definition, ≤ is antisymmetric. To show
that it is transitive it is enough to notice that, since φ→χ is a tautological
consequence of the set of formulas {φ→ψ, ψ→χ}, we have

    Σ ⊢ φ→ψ & Σ ⊢ ψ→χ ⇒ Σ ⊢ φ→χ.

Thus ≤ is a partial ordering of B.


We now show that ⟨B,≤⟩ is a lattice, in other words, that any two
elements |φ|, |ψ| have an infimum and a supremum in B. In fact we claim
that

    inf{|φ|, |ψ|} = |φ∧ψ|,    sup{|φ|, |ψ|} = |φ∨ψ|.

To prove the first equality, observe that we have Σ ⊢ φ∧ψ→φ and
Σ ⊢ φ∧ψ→ψ since these formulas are propositional tautologies; hence

    |φ∧ψ| ≤ |φ|,    |φ∧ψ| ≤ |ψ|.

Now suppose that |χ| is any lower bound for {|φ|, |ψ|}. Then Σ ⊢ χ→φ
and Σ ⊢ χ→ψ. Since χ→(φ∧ψ) is a tautological consequence of the set
of formulas {χ→φ, χ→ψ}, it follows that Σ ⊢ χ→(φ∧ψ). Thus

    |χ| ≤ |φ∧ψ|.

Hence |φ∧ψ| = inf{|φ|, |ψ|} as claimed. The other equality is proved
similarly.
In accordance with the lattice-theoretic notation introduced in Ch. 4, §1,
we write |φ| ∧ |ψ| for inf{|φ|, |ψ|} and |φ| ∨ |ψ| for sup{|φ|, |ψ|}. Thus

    |φ| ∧ |ψ| = |φ∧ψ|,    |φ| ∨ |ψ| = |φ∨ψ|.

We have, for any formulas φ, ψ, χ,

    Σ ⊢ ((φ∨ψ)∧χ) ↔ ((φ∧χ)∨(ψ∧χ)),

since the formula on the right of the turnstile is a propositional tautology.
Therefore

    (|φ| ∨ |ψ|) ∧ |χ| = (|φ| ∧ |χ|) ∨ (|ψ| ∧ |χ|),

so that one of the distributive laws holds in B. Therefore, by a result in
Ch. 4, §1, the other distributive law holds as well, and so ⟨B,≤⟩ is a
distributive lattice.

We now show that ⟨B,≤⟩ has a largest element 1 and a smallest element 0.
Let χ be any tautology; then Σ ⊢ χ and for any formula ψ we have Σ ⊢ ψ→χ;
hence |ψ|≤|χ|. It follows that |χ| is the largest element of ⟨B,≤⟩. Similarly,
|¬χ| is the smallest element of ⟨B,≤⟩. Since Σ is consistent, it is easy
to see that 0≠1.
If Σ ⊢ φ, then for any formula ψ, Σ ⊢ ψ→φ and so |ψ|≤|φ|. Hence
|φ| = 1. Conversely, if |φ| = 1, then, for any formula ψ, |ψ|≤|φ|, so that
Σ ⊢ ψ→φ. Choosing ψ so that Σ ⊢ ψ (e.g. take ψ to be a tautology), we
obtain Σ ⊢ φ by modus ponens. Therefore

    |φ| = 1 iff Σ ⊢ φ.

Similarly

    |φ| = 0 iff Σ ⊢ ¬φ.

For any formula φ we have

    Σ ⊢ φ∨¬φ,    Σ ⊢ ¬(φ∧¬φ),

since φ∨¬φ and ¬(φ∧¬φ) are propositional tautologies. Hence

    |φ| ∨ |¬φ| = |φ∨¬φ| = 1,    |φ| ∧ |¬φ| = |φ∧¬φ| = 0.

Accordingly the complement of |φ| in ⟨B,≤⟩ is |¬φ|. ∎

The Boolean algebra¹ B is often written B(Σ) to point up its dependence
on Σ. B(Σ) is called the Lindenbaum algebra of Σ. The algebra² B(∅)
is called the Lindenbaum algebra of the language ℒ, and is usually denoted
by B(ℒ).
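
A propositional analogue of this construction can be computed outright, and may help to fix ideas. The sketch below is an illustration under stated assumptions and not the first-order algebra of the text: formulas are Python predicates on truth valuations of three hypothetical variables p, q, r, and, by the completeness theorem for propositional logic, "Σ ⊢ φ↔ψ" is replaced by "φ and ψ agree on every valuation satisfying Σ". The class |φ| is then represented by the set of such valuations that satisfy φ. The names cls, leq, sigma, ONE and ZERO are chosen here for illustration.

```python
from itertools import product

VARS = ("p", "q", "r")
VALUATIONS = [dict(zip(VARS, bits))
              for bits in product((False, True), repeat=len(VARS))]


def cls(phi, sigma):
    """Represent |phi| by the set of (indices of) valuations that satisfy
    every member of sigma and also phi."""
    return frozenset(i for i, v in enumerate(VALUATIONS)
                     if all(s(v) for s in sigma) and phi(v))


def leq(phi, psi, sigma):
    """|phi| <= |psi|, i.e. phi -> psi holds in every model of sigma."""
    implication = lambda v: (not phi(v)) or psi(v)
    return cls(implication, sigma) == cls(lambda v: True, sigma)


# Hypothetical example: sigma consists of the single formula p -> q.
sigma = [lambda v: (not v["p"]) or v["q"]]
p = lambda v: v["p"]
q = lambda v: v["q"]
ONE = cls(lambda v: True, sigma)     # largest element: class of a tautology
ZERO = cls(lambda v: False, sigma)   # smallest element
```

In this representation ≤ is set inclusion, ∧ and ∨ are intersection and union, and the complement of |φ| is ONE minus cls(φ); for the example Σ one finds |p→q| = 1 and |p| = |p∧q|, in agreement with Thm. 5.3.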
In the proof of Thm. 5.3 we saw that the mapping |·| behaves like
a truth valuation on ℒ except that it assigns “truth values” in the Boolean
algebra B(Σ), rather than in the 2-element algebra {⊤,⊥}. (For a more
general discussion of such “truth valuations”, see the problems at the
end of this section.) We now investigate the effect that this “truth valuation”
has on the quantifiers.

5.4. Theorem. Let Σ be a consistent set of ℒ-sentences, and let T′ be
a set of ℒ-terms which contains infinitely many variables. Then for each
ℒ-formula φ and each variable v_k we have, in B(Σ),

    |∃v_k φ| = ⋁_{t∈T′} |φ(v_k/t)|.

1 As usual we write B for ⟨B,≤⟩.
2 We know already (Prob. 3.1.16) that ∅ is a consistent set of sentences of ℒ.

Proof. By Thm. 3.1.11 we have, for any t∈T′,

    Σ ⊢ φ(v_k/t) → ∃v_k φ,

so that, for any t∈T′,

    |φ(v_k/t)| ≤ |∃v_k φ|.

Accordingly |∃v_k φ| is an upper bound in B(Σ) for {|φ(v_k/t)|: t∈T′}.
Now suppose that |ψ| is any upper bound in B(Σ) for {|φ(v_k/t)|: t∈T′}.
Then, in particular, choosing q so that v_q∈T′ and v_q occurs in neither
φ nor ψ, we have

    Σ ⊢ φ(v_k/v_q) → ψ.

Hence, by Prob. 3.1.6,

(1)    Σ ⊢ ∃v_q φ(v_k/v_q) → ψ.

But ∃v_q φ(v_k/v_q) is a variant of ∃v_k φ so that, by Thm. 3.1.10, we have

(2)    Σ ⊢ ∃v_q φ(v_k/v_q) ↔ ∃v_k φ.

Now (1) and (2) give

    Σ ⊢ ∃v_k φ → ψ.

Hence |∃v_k φ| ≤ |ψ| in B(Σ). This shows that |∃v_k φ| is the supremum of
{|φ(v_k/t)|: t∈T′}, as required. ∎

5.5. Problem. With the assumptions and notation of Thm. 5.4, show
that, in B(Σ),

    |∀v_k φ| = ⋀_{t∈T′} |φ(v_k/t)|.

5.6. Problem. Let U be an ultrafilter in B(Σ). Show that

    {φ: |φ|∈U}

is a maximal consistent set of formulas (cf. Def. 3.3.4).

We now show how certain types of ultrafilter in B(Σ) give rise to models
of Σ. Our construction is similar to that employed in Ch. 2, §7.
Let Σ be a consistent set of ℒ-sentences and let U be an ultrafilter
in B(Σ). Define a relation E on the set T of all ℒ-terms by

    t E t′ ⇔ |t=t′| ∈ U.

It is left to the reader as an exercise to show that E is an equivalence
relation on T. (Using Prob. 5.6 and Thm. 3.3.7 proceed as in the proof
of Lemma 2.7.2.) For each t∈T let

    |t| = {t′∈T: t E t′}

be the E-class of t, and let

    A(U) = {|t|: t∈T}

be the set of all E-classes.

Now suppose that ℒ is of signature λ and that {𝐑_i: i∈I} and {𝐜_j: j∈J}
are lists of the predicate symbols and constant symbols, respectively, of ℒ.
For each i∈I we define the λ(i)-ary relation R_i on A(U) by putting

(5.7)    ⟨|t_1|,...,|t_{λ(i)}|⟩ ∈ R_i ⇔ |𝐑_i t_1...t_{λ(i)}| ∈ U

for t_1,...,t_{λ(i)}∈T.

As in the proof of Lemma 2.7.3, one verifies that (5.7) is a sound
definition, i.e. the left-hand side is independent of the choice of
“representatives” for |t_1|,...,|t_{λ(i)}|.
We now define the ℒ-structure 𝔄(U) by

(5.8)    𝔄(U) = ⟨A(U), ⟨R_i⟩_{i∈I}, ⟨|𝐜_j|⟩_{j∈J}⟩.

𝔄(U) is called the canonical ℒ-structure determined by U.
Let Γ(U) be the set of all ℒ-sentences σ such that |σ|∈U. Then Σ⊆Γ(U),
for if σ∈Σ then |σ| = 1 ∈ U in B(Σ), so that σ∈Γ(U).
It is natural to ask whether 𝔄(U) is a model of Γ(U), or, at least, of Σ.
In general the answer is no, as the following example (an adaptation
of the example given in Ch. 3, §3) shows.

Let ℒ be a language with equality but without any extralogical symbols.
Let A be a set with two members a_0 and a_1, and let 𝔄 be the unique
ℒ-structure with domain A. Let Σ consist of the single sentence
∃v_0∃v_1(v_0≠v_1). Let a be the assignment in 𝔄 with constant value a_0,
and put

    U = {|ψ|∈B(Σ): 𝔄 ⊨_a ψ}.

Using the obvious fact that 𝔄 is a model of Σ, it is easy to verify that
U is an ultrafilter in B(Σ). Also, it is clear that 𝔄 ⊨_a v_m=v_n for all m,n∈ω,
so that |v_m=v_n|∈U, whence |v_m| = |v_n|, and so 𝔄(U) has exactly one element.
It follows immediately that 𝔄(U) is not a model of Σ.

It is clear that the unhappy state of affairs exemplified above would
not have arisen if U had satisfied the following condition: for any ℒ-formula
φ and any variable v_k,

(5.9)    |∃v_k φ|∈U ⇔ |φ(v_k/t)|∈U for some t∈T.

(Recall that T is the set of all ℒ-terms.) By Thm. 5.4, (5.9) is equivalent
to the condition that U respects (Ch. 4, §7) the family of joins

    { ⋁_{t∈T} |φ(v_k/t)| : φ∈Φ, k∈ω },

where Φ is the set of all ℒ-formulas. Moreover, using Prob. 5.6 and
Def. 3.3.8, it is easy to see that, if U satisfies (5.9), then {φ: |φ|∈U} is
a Henkin set in ℒ. We are now going to show that condition (5.9) is
sufficient for 𝔄(U) to be a model of Γ(U) and hence, since Σ⊆Γ(U),
also of Σ.
Let us call an ultrafilter U in B(Σ) perfect if it respects the family of
joins

    { ⋁_{t∈T} |φ(v_k/t)| : φ∈Φ, k∈ω }.

Then we have:

5.10. Theorem. Let U be a perfect ultrafilter in B(Σ). Then:
(i) For each ℒ-formula φ whose free variables are all among v_0,...,v_n,
and each (n+1)-tuple t_0,...,t_n of members of T, we have

    𝔄(U) ⊨ φ [|t_0|,...,|t_n|] ⇔ |φ(v_0/t_0,...,v_n/t_n)| ∈ U;

(ii) 𝔄(U) is a model of Γ(U) (and hence also of Σ).

Proof. Notice first that (ii) is an immediate consequence of (i). For
if σ∈Γ(U), then |σ|∈U, so that, by (i), 𝔄(U) ⊨ σ.
It thus remains to prove (i). Put

    Ψ = {φ: |φ|∈U}.

Then, as we remarked a little earlier on, Ψ is a Henkin set (hence a
Hintikka set) in ℒ. We now observe that the ℒ-valuation σ defined
in Ch. 2, §7 has the following properties, all of which are verifiable by
inspection:
(1) its universe is A(U);
(2) 𝐑_i^σ = R_i for i∈I;
(3) t^σ = |t| for t∈T.
Since Ψ is a Henkin set, it follows immediately from Thm. 2.7.5 (a) and (b)
that, for any formula φ,

    φ^σ = ⊤ ⇔ φ∈Ψ.

Hence, if t_0,...,t_n∈T, we obtain, using Thm. 2.3.15,

    φ^{σ(v_0/|t_0|)...(v_n/|t_n|)} = ⊤ ⇔ φ(v_0/t_0,...,v_n/t_n) ∈ Ψ,

i.e.

(4)    φ^{σ(v_0/|t_0|)...(v_n/|t_n|)} = ⊤ ⇔ |φ(v_0/t_0,...,v_n/t_n)| ∈ U.

Now suppose that the free variables of φ are all among v_0,...,v_n. Then
(1)-(3) show that the assertion

    φ^{σ(v_0/|t_0|)...(v_n/|t_n|)} = ⊤

is merely an alternative way of expressing the assertion

    𝔄(U) ⊨ φ [|t_0|,...,|t_n|].

This fact, together with the equivalence (4), yields the required
conclusion. ∎

Theorem 5.10 shows that perfect ultrafilters in B(Σ) give rise to canonical
models of Σ. We are now going to establish a converse.
Let Θ be an equivalence relation on T; for each t∈T let t/Θ be the
Θ-class of t, and let T/Θ be the set of all Θ-classes of members of T. An
ℒ-structure of the form

    𝔄 = ⟨T/Θ, ⟨R_i⟩_{i∈I}, ⟨𝐜_j/Θ⟩_{j∈J}⟩

is called a basic ℒ-structure.
Clearly 𝔄(U) is a basic ℒ-structure for each ultrafilter U in B(Σ).

5.11. Theorem. There is a natural one-one correspondence between the
class of perfect ultrafilters in B(Σ) and the class of basic ℒ-structures which
are models of Σ.
Proof. Let 𝒰 be the class of perfect ultrafilters in B(Σ) and let ℳ be
the class of basic ℒ-structures which are models of Σ. We define two
mappings f and g with domains 𝒰 and ℳ, respectively, as follows. For
each U∈𝒰, f(U) is 𝔄(U), the canonical ℒ-structure determined by U.
For each 𝔄∈ℳ, there is an equivalence relation Θ on T such that the
domain of 𝔄 is T/Θ. Let v = ⟨v_0,v_1,...⟩ be the sequence of all variables
of ℒ, and let v/Θ = ⟨v_0/Θ, v_1/Θ,...⟩. Then we put

    g(𝔄) = {|φ|: 𝔄 ⊨_{v/Θ} φ}.

By construction, f(U) is a basic ℒ-structure for each U∈𝒰 and by
Thm. 5.10(ii) it is a model of Σ. Hence f(U)∈ℳ, so that f maps 𝒰 into ℳ.
For each 𝔄∈ℳ, it is easy to see that g(𝔄) is an ultrafilter in B(Σ). We
claim that g(𝔄)∈𝒰, i.e. that g(𝔄) is perfect. To prove this, suppose that
|∃v_k φ|∈g(𝔄). Then, by the definition of g(𝔄), we have 𝔄 ⊨_{v/Θ} ∃v_k φ,
so that, for some t∈T,

    𝔄 ⊨_{(v/Θ)(v_k/(t/Θ))} φ.

But this means (cf. Thm. 2.3.10) that 𝔄 ⊨_{v/Θ} φ(v_k/t), i.e. that |φ(v_k/t)|∈g(𝔄).
Accordingly g(𝔄) is perfect.
We now claim that f∘g is the identity on ℳ and that g∘f is the identity
on 𝒰, from which it immediately follows that f is a bijection of 𝒰 onto ℳ.
The first of these assertions is left as a simple exercise to the reader. To
show that g∘f is the identity on 𝒰, observe that, if U∈𝒰, then, by the
definitions of f and g we have for any φ∈Φ, putting |v| = ⟨|v_0|, |v_1|,...⟩,

    |φ|∈g(f(U)) ⇔ 𝔄(U) ⊨_{|v|} φ.

If the free variables of φ are all among v_0,...,v_n, then, since U is perfect,
we have, by Thm. 5.10(i),

    𝔄(U) ⊨_{|v|} φ ⇔ |φ(v_0/v_0,...,v_n/v_n)| ∈ U.

But clearly φ(v_0/v_0,...,v_n/v_n) = φ, so that

    𝔄(U) ⊨_{|v|} φ ⇔ |φ|∈U.

Putting these equivalences together, we obtain

    |φ|∈g(f(U)) ⇔ |φ|∈U,

so that g(f(U)) = U. Thus g∘f is the identity on 𝒰, and the proof is
finished. ∎

5.12. Problem. Recall that we write SB(Σ) for the Stone space of B(Σ).
Define h: SB(Σ)→SB(Σ) by putting h(U) = g(𝔄(U)) for each U∈SB(Σ),
where g is defined as in Thm. 5.11. Show that U is a perfect ultrafilter
in B(Σ) iff h(U) = U; in other words, the perfect ultrafilters in B(Σ) are
precisely the fixed points of the mapping h.

5.13. Examples. (i) Let ℒ be a language with only countably many
constant symbols but with an uncountable family of unary predicate symbols
{𝐏_i: i∈I}. Let Σ be the set of ℒ-sentences

    {∃v_0 𝐏_i v_0: i∈I} ∪ {∀v_0 ¬(𝐏_i v_0 ∧ 𝐏_j v_0): i,j∈I, i≠j};

then clearly any model of Σ must be uncountable. On the other hand, each
basic ℒ-structure is countable since ℒ has only countably many terms.
Accordingly no basic ℒ-structure is a model of Σ and hence, by 5.11,
there are no perfect ultrafilters in B(Σ).
(ii) Suppose that ℒ is countable and let Σ be a consistent set of
ℒ-sentences. Then the family of joins

    { ⋁_{t∈T} |φ(v_k/t)| : φ∈Φ, k∈ω }

is countable and so, by the Rasiowa-Sikorski theorem 4.7.3, there is a
perfect ultrafilter U in B(Σ). 𝔄(U) is then a model of Σ. This gives a new
proof of the assertion that every countable consistent set of sentences has
a model, and hence of the Gödel Completeness Theorem for sentences.
(This proof is due to Rasiowa and Sikorski [1951].)

Throughout Probs. 5.14-5.18 we make the following assumptions.
ℒ is a first-order language whose only extralogical symbol is a binary
predicate 𝐑. (This assumption is not essential; we make it solely for the
sake of simplicity.) If ℒ′ is any extension of ℒ obtained by adding
constants, φ is an ℒ′-formula all of whose free variables are among
v_0,...,v_n, and 𝐚_0,...,𝐚_n are constants of ℒ′, we write φ(𝐚_0,...,𝐚_n) for
φ(v_0/𝐚_0,...,v_n/𝐚_n). Also, if φ is an ℒ′-formula with at most the variable
x free, and 𝐚 is a constant of ℒ′, we write φ(𝐚) for φ(x/𝐚).

*5.14. Problem. Let B be a complete Boolean algebra. A B-valued
ℒ-prestructure is a triple 𝔄 = ⟨A,E,R⟩ where A is a non-empty set and E,R are
mappings of A×A into B. Let ℒ(A) be the extension of ℒ obtained by
adding a new constant symbol 𝐚 for each a∈A. Define the mapping ‖·‖_𝔄
from the set of all ℒ(A)-sentences into B by induction as follows:

    ‖𝐚=𝐛‖_𝔄 = E(a,b),    ‖𝐑𝐚𝐛‖_𝔄 = R(a,b),
    ‖σ∧σ′‖_𝔄 = ‖σ‖_𝔄 ∧ ‖σ′‖_𝔄,    ‖¬σ‖_𝔄 = (‖σ‖_𝔄)*,
    ‖∃xφ‖_𝔄 = ⋁_{a∈A} ‖φ(x/𝐚)‖_𝔄.

𝔄 is called a B-valued ℒ-structure if for all a,b,c,d∈A we have

    ‖𝐚=𝐚‖_𝔄 = 1;
    ‖𝐚=𝐛 → 𝐜=𝐝 → 𝐑𝐚𝐜 → 𝐑𝐛𝐝‖_𝔄 = 1;
    ‖𝐚=𝐛 → 𝐜=𝐝 → 𝐚=𝐜 → 𝐛=𝐝‖_𝔄 = 1.

Let φ be an ℒ-formula whose free variables are all among v_0,...,v_n.
φ is said to be B-valid if for any B-valued ℒ-structure 𝔄 and any
a_0,...,a_n∈A we have ‖φ(𝐚_0,...,𝐚_n)‖_𝔄 = 1. Show that the following are
equivalent:
(i) ⊢φ;
(ii) φ is 2-valid;
(iii) φ is B-valid for some complete Boolean algebra B;
(iv) φ is B-valid for every complete Boolean algebra B.
(For (i)⇔(ii), observe that a 2-valued ℒ-structure is essentially just
an ℒ-structure and apply the Completeness Theorem. For (i)⇒(iv),
argue as in the proof of Thm. 3.3.12. For (iii)⇒(ii), notice that each
2-valued structure is a B-valued structure.)
*5.15. Problem. Let 𝔄 = ⟨A,R⟩ be an ℒ-structure, and let B be a complete
Boolean algebra. Put

    A^(B) = {f∈B^A: for all a,b∈A, a≠b ⇒ f(a)∧f(b) = 0, and ⋁_{a∈A} f(a) = 1},

and define the mappings E, R′ from A^(B)×A^(B) into B by setting, for
f,g∈A^(B),

    E(f,g) = ⋁_{a∈A} f(a)∧g(a),
    R′(f,g) = ⋁ {f(a)∧g(b): ⟨a,b⟩∈R}.

Then, as is easily verified, 𝔄^(B) = ⟨A^(B),E,R′⟩ is a B-valued ℒ-structure;
it is called the B-extension of 𝔄. For each sentence σ of ℒ′ = ℒ(A^(B))
write ‖σ‖ for ‖σ‖_{𝔄^(B)} (Prob. 5.14).
(i) If f,g∈A^(B), show that ‖𝐟=𝐠‖ = 1 iff f=g.
(ii) Let φ be an ℒ′-formula with at most the variable x free. Show
that, for any f,g∈A^(B),

    ‖𝐟=𝐠‖ ∧ ‖φ(𝐟)‖ ≤ ‖φ(𝐠)‖.

(iii) Let φ be an ℒ-formula whose free variables are all among v_0,...,v_n,
and let f_0,...,f_n∈A^(B). Show that¹

    ‖φ(𝐟_0,...,𝐟_n)‖ = ⋁ { ⋀_{i=0}^{n} f_i(a_i) : 𝔄 ⊨ φ [a_0,...,a_n], a_0,...,a_n∈A }.

(Argue by induction on deg φ, using Prob. 4.2.6.)
(iv) For each a∈A define â∈A^(B) by putting

    â(x) = 1 if a=x,
    â(x) = 0 if a≠x.

Show that², for any ℒ-formula φ whose free variables are all among
v_0,...,v_n, and any a_0,...,a_n∈A,

    𝔄 ⊨ φ [a_0,...,a_n] ⇔ ‖φ(â_0,...,â_n)‖ = 1.

(Use (iii).) Thus, in an obvious sense, the mapping a↦â is a B-valued
elementary embedding of 𝔄 into 𝔄^(B).
*5.16. Problem. We continue to employ the assumptions and notation
of Prob. 5.15.
(i) Let {b_i: i∈I}⊆B satisfy b_i∧b_j = 0 for i≠j, and let {f_i: i∈I}⊆A^(B).
Show that there exists an f∈A^(B) such that ‖𝐟=𝐟_i‖ ≥ b_i for all i∈I. (First
show that without loss of generality one can assume ⋁_{i∈I} b_i = 1. Then
define f∈B^A by putting

    f(a) = ⋁_{i∈I} b_i∧f_i(a)

for a∈A. Show that f meets the requirements.)
(ii) Let φ be an ℒ′-formula with at most the variable x free. Show that
there exists an f∈A^(B) such that³

    ‖∃xφ‖ = ‖φ(𝐟)‖.

(Let A^(B) be well-ordered in the form {f_ξ: ξ<α}; put

    b_ξ = ‖φ(𝐟_ξ)‖ ∧ [ ⋁_{η<ξ} ‖φ(𝐟_η)‖ ]*.

Observe that

    b_ξ∧b_η = 0 for ξ≠η,
    ⋁_{ξ<α} b_ξ = ‖∃xφ‖.

By (i), pick f∈A^(B) with ‖𝐟=𝐟_ξ‖ ≥ b_ξ for all ξ<α. Using 5.15(ii), show that
‖φ(𝐟)‖ ≥ b_ξ for all ξ<α.)

1 Compare Thm. 3.1.
2 Compare Prob. 3.2.
3 Compare Lemma 3.4.
*5.17. Problem. We continue to employ the assumptions and notation
of Prob. 5.15. Let F be an ultrafilter in B. Define the relation ~_F on A^(B) by

    f ~_F g ⇔ ‖𝐟=𝐠‖∈F;

then ~_F is an equivalence relation and for each f∈A^(B) we let f^F be the
~_F-class of f. Put

    A^(B)/F = {f^F: f∈A^(B)}.

Define the relation R_F on A^(B)/F by

    ⟨f^F,g^F⟩∈R_F ⇔ ‖𝐑𝐟𝐠‖∈F.

The ℒ-structure 𝔄^(B)/F = ⟨A^(B)/F, R_F⟩ is called the (B,F)-(Boolean)
ultrapower of 𝔄.
(i) Let φ be an ℒ-formula whose free variables are all among v_0,...,v_n,
and let f_0,...,f_n∈A^(B). Show that¹

    𝔄^(B)/F ⊨ φ [f_0^F,...,f_n^F] ⇔ ‖φ(𝐟_0,...,𝐟_n)‖∈F.

(Argue by induction on deg φ, using 5.16(ii) to handle the existential
case.)
(ii) Show that the mapping a↦â^F (with â defined as in 5.15(iv)) is an
elementary embedding of 𝔄 into 𝔄^(B)/F. (Use (i) and 5.15(iv).)
*5.18. Problem. (The assumptions and notation of Probs. 5.15-5.17
are still in force.) Let I be a non-empty set, let B be the power set algebra
PI, and let ℱ be an ultrafilter in PI. Define the mapping j: A^I→(PI)^A
by putting, for each f∈A^I and each a∈A, j(f)(a) = f^{-1}(a).
(i) Show that j is a bijection of A^I onto A^(PI).
(ii) Let φ be an ℒ-formula whose free variables are all among v_0,...,v_n.
Show that,² for any f_0,...,f_n∈A^I,

    {i∈I: 𝔄 ⊨ φ [f_0(i),...,f_n(i)]} = ‖φ(j(f_0),...,j(f_n))‖.

(Use 5.15(iii).)
(iii) Show that the mapping f/ℱ ↦ (j(f))^ℱ is an isomorphism of the
ultrapower 𝔄^I/ℱ onto the Boolean ultrapower 𝔄^(PI)/ℱ. (Use (ii).) Thus,
the ultrapower is a special case of the Boolean ultrapower.
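
The mapping j of Prob. 5.18 is easy to experiment with when I and A are finite. The sketch below is illustrative only and does not solve the problem: functions in A^I are represented as Python dicts, elements of the power set algebra PI as frozensets, and the helper names is_partition_of_unity and E are chosen here (E follows the definition of E(f,g) in Prob. 5.15 with B = PI).

```python
def j(f, I, A):
    """j(f)(a) = f^{-1}(a) = {i in I : f(i) = a}, for f in A^I (a dict)."""
    return {a: frozenset(i for i in I if f[i] == a) for a in A}


def is_partition_of_unity(g, I, A):
    """Membership test for A^(PI): the values g(a) are pairwise disjoint
    and their join (union) is the unit I of the algebra PI."""
    disjoint = all(not (g[a] & g[b]) for a in A for b in A if a != b)
    covers = frozenset().union(*(g[a] for a in A)) == frozenset(I)
    return disjoint and covers


def E(g, h, A):
    """E(g, h) = join over a in A of g(a) meet h(a), computed in PI."""
    return frozenset().union(*(g[a] & h[a] for a in A))
```

For f,g in A^I one finds that E(j(f), j(g)) is exactly the set of indices i with f(i)=g(i), which is the simplest instance of the equality in part (ii).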

1 Compare Thm. 3.6.


2 Compare Thm. 3.1.
CH. 5, §6]. ELEMENT TYPES AND N0-CATEGORICITY 203'

§ 6. Element types and ℵ_0-categoricity

So far in this chapter our main concern has been with the properties of
structures determined by first-order sentences. In this section we turn
our attention to the role played by formulas with free variables.
Throughout this section we shall assume that ℒ is a fixed countable
first-order language. We write Φ for the set of all ℒ-formulas and for
each n∈ω we write Φ_n for the set of all ℒ-formulas whose free variables
are all among v_0,...,v_{n-1}. Thus Φ_0 is the set of all ℒ-sentences. If Σ is
a consistent set of ℒ-sentences, we put in B(Σ)

    B_n(Σ) = {|φ|: φ∈Φ_n},

where |φ| is the ≈-class of φ (see §5). We also write B_n for B_n(∅). It is
clear that B_n(Σ) is a subalgebra of the Lindenbaum algebra B(Σ). B_0 is
called the Lindenbaum algebra of sentences of ℒ.

6.1. Problem. Let Σ be a consistent theory in ℒ, and put in B(∅)

    U(Σ) = {|σ|: σ∈Σ}.

Show that:
(i) U(Σ) is a filter in B_0;
(ii) Σ is finitely axiomatizable (Prob. 4.9) iff U(Σ) is a principal
filter in B_0;
(iii) Σ is complete iff U(Σ) is an ultrafilter in B_0.

Let Δ⊆Φ_n. An ℒ-structure 𝔄 is said to realize Δ if there is a finite
sequence ⟨a_0,...,a_{n-1}⟩ of members of A such that 𝔄 ⊨ φ [a_0,...,a_{n-1}] for
all φ∈Δ. 𝔄 is said to omit Δ if it does not realize Δ.

6.2. Examples. (i) Let 𝔄 be the structure ⟨ω,+,×,0,1⟩, and let Δ be the
set of formulas

    {0≠v_0, 0+1≠v_0, 0+1+1≠v_0, ...}

in the language for 𝔄. Then up to isomorphism 𝔄 is the unique model
of Th(𝔄) which omits Δ.
(ii) Let Δ be the set of formulas

    {(v_0≠0 ∧ pv_0≠0): p a prime}

in the language for fields (§4). Then the fields which omit Δ are precisely
those of finite non-zero characteristic. (We know from Prob. 4.9(h) that
this is not a first-order property.)
(iii) Let 𝔉 be an ordered field (Prob. 4.10), and let Δ be the set of formulas

    {1<v_0, 1+1<v_0, 1+1+1<v_0, ...}

in the language for 𝔉. Then 𝔉 omits Δ iff 𝔉 is archimedean. (We know
from Prob. 4.10 that this is not a first-order property.)

Let Σ be a consistent set of ℒ-sentences, and let 𝔄 be a model of Σ.
For each n-tuple a = ⟨a_0,...,a_{n-1}⟩ of elements of A, the type of a in 𝔄
is the set

    Φ(𝔄,a) = {φ∈Φ_n: 𝔄 ⊨ φ [a_0,...,a_{n-1}]}.

It is clear that Φ(𝔄,a) is a maximal consistent subset of Φ_n.
In B_n(Σ) we define

    U(𝔄,a) = {|φ|: φ∈Φ(𝔄,a)}.

It is easy to verify that U(𝔄,a) is an ultrafilter in B_n(Σ). Conversely,
we have:

6.3. Theorem. For each ultrafilter U in B_n(Σ) there is a countable model
𝔄 of Σ and an n-tuple a = ⟨a_0,...,a_{n-1}⟩ of elements of A such that U = U(𝔄,a).
Proof. Let 𝐜_0,...,𝐜_{n-1} be new constant symbols (i.e. not already in ℒ).
We claim that each finite subset of the set of sentences

    Σ′ = Σ ∪ {φ(v_0/𝐜_0,...,v_{n-1}/𝐜_{n-1}): φ∈Φ_n, |φ|∈U}

has a model. For let {|φ_1|,...,|φ_m|} be a finite subset of U, where
φ_1,...,φ_m∈Φ_n. Then since U has the finite meet property, we have
|φ_1| ∧ ... ∧ |φ_m| ≠ 0. Putting

    φ = φ_1∧...∧φ_m,

it follows that |φ|≠0, so that Σ ⊬ ¬φ. Hence Σ ⊬ ∀v_0...∀v_{n-1}¬φ,
so that Σ∪{¬∀v_0...∀v_{n-1}¬φ} is consistent, and accordingly has a model 𝔅.
Let ⟨b_0,...,b_{n-1}⟩ = b be an n-tuple of members of B such that

    𝔅 ⊨ φ [b_0,...,b_{n-1}].

Then clearly (𝔅,b) is a model of

    Σ ∪ {φ_i(v_0/𝐜_0,...,v_{n-1}/𝐜_{n-1}): i = 1,...,m}.

This shows that each finite subset of Σ′ has a model, and so, by the
Compactness Theorem, Σ′ has a model (𝔄,a), where 𝔄 is an ℒ-structure
and a = ⟨a_0,...,a_{n-1}⟩ is an n-tuple of elements of A. By the Downward
Löwenheim-Skolem Theorem we may take 𝔄 to be countable. By
construction we have U⊆U(𝔄,a), and since U is an ultrafilter we have
U = U(𝔄,a). ∎

Let us call a subset of B_n(Σ) of the form U(𝔄,a) a Σ-reduced n-(element)
type of 𝔄. Then, by Thm. 6.3, the Σ-reduced n-types of (countable)
models of Σ are precisely the ultrafilters in B_n(Σ), so that (Ch. 4, §4) we have:

6.4. Corollary. The set of Σ-reduced n-types of (countable) models
of Σ is the (set of points of the) Stone space of B_n(Σ). ∎

6.5. Problem. Show that, if 𝔄 and 𝔅 are models of Σ and a,b are n-tuples
of elements of A, B, respectively, then

    U(𝔄,a) = U(𝔅,b) ⇔ (𝔄,a) ≡ (𝔅,b).

6.6. Problem. Let 𝒯 be the class of all structures of the form (𝔄,a),
where 𝔄 is an ℒ-structure which is a model of Σ and a is an n-tuple of
members of A. For each (𝔄,a)∈𝒯 let

    (𝔄,a)* = {(𝔅,b): (𝔅,b)∈𝒯 and (𝔄,a) ≡ (𝔅,b)}.

Let

    𝒯* = {(𝔄,a)*: (𝔄,a)∈𝒯}.

For each φ∈Φ_n let

    M(φ) = {(𝔄,a)*: 𝔄 ⊨ φ [a_0,...,a_{n-1}], where a = ⟨a_0,...,a_{n-1}⟩}.

Show that the M(φ) form a base for a topology on 𝒯*, and that with this
topology 𝒯* is (homeomorphic to) the Stone space of B_n(Σ). (Use 6.3,
6.4 and 6.5.)

We now establish a criterion for the existence of a countable model
of Σ realizing a given subset of Φ_n.

6.7. Theorem. Let Δ⊆Φ_n. Then the following conditions are equivalent:
(i) Σ has a countable model realizing Δ;
(ii) {|φ|: φ∈Δ} has the finite meet property.
Proof. (i)⇒(ii). Let 𝔄 be a model of Σ realizing Δ. Then there is
a = ⟨a_0,...,a_{n-1}⟩∈A^n such that 𝔄 ⊨ φ [a_0,...,a_{n-1}] for every φ∈Δ. It follows
that

    {|φ|: φ∈Δ} ⊆ U(𝔄,a).

Since U(𝔄,a) has the finite meet property, so does {|φ|: φ∈Δ}.
(ii)⇒(i). If {|φ|: φ∈Δ} has the finite meet property, then it is included
in an ultrafilter U in B_n(Σ). By Thm. 6.3, there is a countable model 𝔄
of Σ and an n-tuple a of members of A such that U = U(𝔄,a). Clearly
𝔄 realizes Δ. ∎

Our next result deals with the omission of subsets of Φ_n. For each set
Σ of ℒ-sentences we put

    Σ^c = {σ∈Φ_0: Σ ⊢ σ}.

Notice that Σ^c is a theory.

6.8. Theorem. Let Σ be a consistent set of ℒ-sentences and let Δ⊆Φ_n.
Consider the conditions
(i) Σ has a countable model which omits Δ;
(ii) [Δ] = {|φ|: φ∈Δ} is not included in any principal filter in B_n(Σ).
Then (ii)⇒(i). Moreover, if Σ^c is complete, then the conditions are equivalent.
Proof. We first prove that (i)⇒(ii) when Σ^c is complete. Suppose (ii) fails.
Then there is a formula ψ in Φ_n such that |ψ|≠0 and

    [Δ] ⊆ {|φ|: |ψ|≤|φ|}.

Since |ψ|≠0, we have Σ ⊬ ¬ψ, so that Σ ⊬ ∀v_0...∀v_{n-1}¬ψ; hence

    Σ ⊬ ¬∃v_0...∃v_{n-1}ψ.

Since Σ^c is complete, we must have

(1)    Σ ⊢ ∃v_0...∃v_{n-1}ψ.

But we also have |ψ|≤|φ| for each φ∈Δ, i.e. Σ ⊢ ψ→φ, whence

    Σ ⊢ ∀v_0...∀v_{n-1}(ψ→φ).

But this, together with (1), immediately implies that every model of Σ
realizes Δ, so that (i) fails.
(ii)⇒(i). Suppose (ii) holds. We claim that, in B(Σ), we have, for each
sequence of distinct variables x_0,...,x_{n-1},

(2)    ⋀_{φ∈Δ} |φ(v_0/x_0,...,v_{n-1}/x_{n-1})| = 0.

Suppose not; then there is an ℒ-formula ψ such that |ψ|≠0 and

    |ψ| ≤ |φ(v_0/x_0,...,v_{n-1}/x_{n-1})| for all φ∈Δ,

i.e.

(3)    Σ ⊢ ψ → φ(v_0/x_0,...,v_{n-1}/x_{n-1}),

for all φ∈Δ.


Let φ∈Δ, and let y_0,...,y_k be those free variables of ψ which are not
among x_0,...,x_{n-1}. Then, since y_0,...,y_k are not free in
φ(v_0/x_0,...,v_{n-1}/x_{n-1}), it follows from (3), using Prob. 3.1.6, that

(4)    Σ ⊢ ψ′ → φ(v_0/x_0,...,v_{n-1}/x_{n-1}),

where ψ′ is ∃y_0...∃y_k ψ. We also have ⊢ ψ→ψ′, hence |ψ|≤|ψ′|, and
since |ψ|≠0 it follows that |ψ′|≠0. Observe that the free variables of
ψ′ are among x_0,...,x_{n-1}. It follows from (4) and Remark 3.1.5 that

(5)    Σ ⊢ ∀x_0...∀x_{n-1}(ψ′ → φ(v_0/x_0,...,v_{n-1}/x_{n-1})),

so that

(6)    Σ ⊢ ∀v_0...∀v_{n-1}(ψ′(x_0/v_0,...,x_{n-1}/v_{n-1}) → φ),

since the sentences on the right-hand side of the turnstiles in (5) and (6)
are easily seen to be logically equivalent.
Let ψ″ be ψ′(x_0/v_0,...,x_{n-1}/v_{n-1}); then ψ″∈Φ_n, |ψ″|≠0 and, by (6),
|ψ″|≤|φ| for all φ∈Δ. Therefore [Δ] is contained in the principal filter
in B_n(Σ) generated by |ψ″|, contradicting assumption. This proves (2).
It follows immediately from (2) that

(7)    ⋁_{φ∈Δ} |¬φ(v_0/x_0,...,v_{n-1}/x_{n-1})| = 1

for distinct x_0,...,x_{n-1}.


For each m∈ω let I_m = ω − {0,...,m}. Let T be the (countable) set of
terms (i.e. variables and constants) of ℒ. Then for each t∈T, each m∈ω and
each variable v_k distinct from t, we have, by Thm. 5.4 and Prob. 3.1.14,

(8)    1 = |∃v_k(t=v_k)| = ⋁_{p∈I_m} |t=v_p|.

Since ℒ is countable, so is the family of joins

    { ⋁_{t∈T} |φ(v_k/t)| : φ∈Φ, k∈ω } ∪ { ⋁_{p∈I_m} |t=v_p| : m∈ω, t∈T }
    ∪ { ⋁_{φ∈Δ} |¬φ(v_0/x_0,...,v_{n-1}/x_{n-1})| : x_0,...,x_{n-1} distinct variables }.

Hence, by the Rasiowa-Sikorski Theorem 4.7.3, there is an ultrafilter
U in B(Σ) which respects this family of joins.
Since U respects

    { ⋁_{t∈T} |φ(v_k/t)| : φ∈Φ, k∈ω },

U is perfect (§5) and hence, by Thm. 5.10, the canonical structure 𝔄(U)
is a model of Σ. Since T is countable, so is 𝔄(U).
We claim that for each term t there are infinitely many variables v_p
such that |t| = |v_p|. For U respects the family of joins

    { ⋁_{p∈I_m} |t=v_p| : m∈ω, t∈T };

taking m=0, it follows from this and (8) that there must be p_0∈I_0 such
that |t=v_{p_0}|∈U, i.e. |t| = |v_{p_0}|. Similarly, there is p_1∈I_{p_0}, hence p_1>p_0,
such that |t| = |v_{p_1}|. Continuing in this way we obtain a sequence of natural
numbers p_0<p_1<p_2<... such that |t| = |v_{p_i}| for all i. This proves the claim.
We can now show that 𝔄(U) omits Δ. For let ⟨|t_0|,...,|t_{n-1}|⟩ be an
n-tuple of elements of A(U). Then, by our claim above, we may choose
distinct variables x_0,...,x_{n-1} such that |t_i| = |x_i| for all i, 0≤i≤n−1. Now,
since x_0,...,x_{n-1} are distinct, U respects the join

    ⋁_{φ∈Δ} |¬φ(v_0/x_0,...,v_{n-1}/x_{n-1})|,

so it follows from (7) that there must be a formula φ∈Δ such that

    |¬φ(v_0/x_0,...,v_{n-1}/x_{n-1})| ∈ U.

Hence, by Thm. 5.10,

    𝔄(U) ⊨ ¬φ [|x_0|,...,|x_{n-1}|]

and so

    𝔄(U) ⊭ φ [|t_0|,...,|t_{n-1}|].

This shows that no n-tuple of elements of A(U) satisfies Δ, so that 𝔄(U)
omits Δ. ∎

As a corollary to this result we obtain the following important theorem,
due to Ryll-Nardzewski, which gives necessary and sufficient conditions
for a complete theory to be $\aleph_0$-categorical.

6.9. Theorem. Let $\Sigma$ be a (countable) complete theory having only infinite
models. Then the following conditions are equivalent:
(i) $\Sigma$ is $\aleph_0$-categorical;
(ii) the Boolean algebra $B_n(\Sigma)$ is finite for each $n \in \omega$;
(iii) for each $n \in \omega$, there are only finitely many $\Sigma$-reduced $n$-types;
(iv) for each $n \in \omega$, every ultrafilter in $B_n(\Sigma)$ is principal.
Proof. (ii) $\Leftrightarrow$ (iii). We have already remarked (6.4) that the family of
$\Sigma$-reduced $n$-types is the Stone space of $B_n(\Sigma)$. The equivalence of (ii)
and (iii) now follows immediately from Prob. 4.4.7.
(ii) $\Leftrightarrow$ (iv). This is an immediate consequence of Cor. 4.5.3.
(i) $\Rightarrow$ (iv). Suppose that (iv) fails. Let $U$ be a non-principal ultrafilter
in $B_n(\Sigma)$, and let

$\Delta = \{\varphi \in \Phi_n : |\varphi| \in U\}$.

Then $[\Delta] = U$, so that, in particular, $[\Delta]$ is not included in any principal
filter in $B_n(\Sigma)$. Hence, by 6.8, there is a countable model $\mathfrak{A}$ of $\Sigma$ which
omits $\Delta$. Also, since $U = [\Delta]$ is a filter, $[\Delta]$ has the finite meet property,
so that, by 6.7, there is a countable model $\mathfrak{B}$ of $\Sigma$ which realizes $\Delta$. Since
$\mathfrak{A}$ omits $\Delta$ and $\mathfrak{B}$ realizes it, clearly $\mathfrak{A}$ cannot be isomorphic to $\mathfrak{B}$, so that
$\Sigma$ is not $\aleph_0$-categorical. Thus (i) fails.
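(iv) $\Rightarrow$ (i). The argument here is a modification of the “back-and-forth”
construction used in the proof of Cantor’s theorem 4.5. Assume (iv),
and let $\mathfrak{A}$ and $\mathfrak{B}$ be models of $\Sigma$ of cardinality $\aleph_0$. Let $A$ and $B$ be enumerated
as $a_0, a_1, \ldots$ and $b_0, b_1, \ldots$ respectively. We show by induction that we
can enumerate $A$ and $B$ into new sequences $c_0, c_1, \ldots$ and $d_0, d_1, \ldots$, respectively,
in such a way that, for each $k \in \omega$,

$U(\mathfrak{A}, \mathfrak{c}^k) = U(\mathfrak{B}, \mathfrak{d}^k)$

where $\mathfrak{c}^k = (c_0,\ldots,c_k)$ and $\mathfrak{d}^k = (d_0,\ldots,d_k)$.

First, we put $d_0 = b_0$. By assumption $U(\mathfrak{B}, \mathfrak{d}^0)$ is principal; let $|\varphi|$ be
a generating element with $\varphi \in \Phi_1$. Then $\mathfrak{B} \models \varphi[d_0]$, so that $\mathfrak{B} \models \exists v_0 \varphi$.
Since $\mathfrak{A}$ and $\mathfrak{B}$ are both models of the complete theory $\Sigma$, they are elementarily
equivalent, so that $\mathfrak{A} \models \exists v_0 \varphi$ also. Let $c_0$ be an element of $A$ for
which $\mathfrak{A} \models \varphi[c_0]$. Then $U(\mathfrak{A}, \mathfrak{c}^0)$ contains the generating element $|\varphi|$ of
$U(\mathfrak{B}, \mathfrak{d}^0)$, so it follows immediately that $U(\mathfrak{A}, \mathfrak{c}^0) = U(\mathfrak{B}, \mathfrak{d}^0)$.
Now suppose that $k > 0$ and $c_0,\ldots,c_{k-1}$, $d_0,\ldots,d_{k-1}$ have been chosen
in such a way as to satisfy the requirements. We now show how to construct
$c_k$ and $d_k$. There are two cases to consider:
(1) $k = 2m+1$ is odd. In this case we put

$c_k = a_m$.

By assumption, the ultrafilter $U(\mathfrak{A}, \mathfrak{c}^k)$ is principal; let $|\varphi|$ be a generating
element with $\varphi \in \Phi_{k+1}$. In particular, $|\varphi| \in U(\mathfrak{A}, \mathfrak{c}^k)$; thus $\mathfrak{A} \models \varphi[c_0,\ldots,c_k]$
so that

$\mathfrak{A} \models \exists v_k \varphi\,[c_0,\ldots,c_{k-1}]$,

and hence

$|\exists v_k \varphi| \in U(\mathfrak{A}, \mathfrak{c}^{k-1})$.

By the inductive hypothesis, $U(\mathfrak{A}, \mathfrak{c}^{k-1}) = U(\mathfrak{B}, \mathfrak{d}^{k-1})$, so that

$|\exists v_k \varphi| \in U(\mathfrak{B}, \mathfrak{d}^{k-1})$.

Hence

$\mathfrak{B} \models \exists v_k \varphi\,[d_0,\ldots,d_{k-1}]$

so that we can choose $d_k \in B$ in such a way that $\mathfrak{B} \models \varphi[d_0,\ldots,d_k]$. Then
$U(\mathfrak{B}, \mathfrak{d}^k)$ contains the generating element $|\varphi|$ of $U(\mathfrak{A}, \mathfrak{c}^k)$, so that
$U(\mathfrak{B}, \mathfrak{d}^k) = U(\mathfrak{A}, \mathfrak{c}^k)$. This completes the induction step for the case in which $k$
is odd.
(2) $k = 2m$ is even. In this case we put

$d_k = b_m$,

and choose $c_k$ in the same way as we chose $d_k$ in case (1). We obtain
$U(\mathfrak{A}, \mathfrak{c}^k) = U(\mathfrak{B}, \mathfrak{d}^k)$ just as before.
This completes our recursive definition.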
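
The alternation between the two cases can be sketched in code for one concrete special case. The sketch below is an illustration only, not the general construction: it assumes both structures are dense linear orders without endpoints (the rationals and the rationals strictly between 0 and 1), where matching the order pattern of the tuples chosen so far plays the role of matching the ultrafilters. The function names and the two enumerations are hypothetical choices made for this example.

```python
from fractions import Fraction
from math import gcd


def rationals():
    """Enumerate Q without repetition: 0, then +-p/q in lowest terms by p + q."""
    yield Fraction(0)
    total = 2
    while True:
        for p in range(1, total):
            q = total - p
            if gcd(p, q) == 1:
                yield Fraction(p, q)
                yield Fraction(-p, q)
        total += 1


def unit_rationals():
    """Enumerate the rationals strictly between 0 and 1 without repetition."""
    q = 2
    while True:
        for p in range(1, q):
            if gcd(p, q) == 1:
                yield Fraction(p, q)
        q += 1


def find_match(x, src, dst, candidates):
    """Return the first candidate that stands to dst exactly as x stands to src."""
    for y in candidates:
        if all((x < s) == (y < d) and (x == s) == (y == d)
               for s, d in zip(src, dst)):
            return y


def back_and_forth(enum_a, enum_b, steps):
    """Return lists c, d such that c[k] -> d[k] is an order-preserving partial map."""
    a_stream, b_stream = enum_a(), enum_b()
    c, d = [], []
    for k in range(steps):
        if k % 2 == 1:          # k = 2m + 1: put c_k = a_m, then search for d_k
            ck = next(a_stream)
            dk = find_match(ck, c, d, enum_b())
        else:                   # k = 2m: put d_k = b_m, then search for c_k
            dk = next(b_stream)
            ck = find_match(dk, d, c, enum_a())
        c.append(ck)
        d.append(dk)
    return c, d
```

Calling `back_and_forth(rationals, unit_rationals, 12)` uses up the first six elements of each enumeration, so continuing indefinitely exhausts both sets, as in the proof. The search in `find_match` terminates here only because both orders are dense and have no endpoints.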
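It is clear from the construction that $A = \{c_0, c_1, \ldots\}$ and $B = \{d_0, d_1, \ldots\}$.
Moreover, since $U(\mathfrak{A}, \mathfrak{c}^k) = U(\mathfrak{B}, \mathfrak{d}^k)$ for each $k \in \omega$, it follows from Prob. 6.5
that $(\mathfrak{A}, \mathfrak{c}^k) \equiv (\mathfrak{B}, \mathfrak{d}^k)$ for each $k \in \omega$, and this immediately implies that

$(\mathfrak{A}, \langle c_0, c_1, \ldots\rangle) \equiv (\mathfrak{B}, \langle d_0, d_1, \ldots\rangle)$.

Thus $\mathfrak{A} \cong \mathfrak{B}$ by Prob. 1.7. Therefore $\Sigma$ is $\aleph_0$-categorical and the proof
is complete.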
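
Condition (iii) can be made concrete in a standard example that is not taken from the text: the theory of dense linear orders without endpoints, which is $\aleph_0$-categorical by Cantor's theorem. Assuming the usual quantifier-elimination result for that theory, a complete $n$-type is fixed by how $v_0,\ldots,v_{n-1}$ are arranged by $<$ and $=$, so the number of $n$-types is the number of weak orderings of $n$ items. The sketch below counts these in two independent ways; the function names are hypothetical.

```python
from itertools import product
from math import comb


def ordered_bell(n):
    """Number of weak orderings of n labelled items (ties allowed)."""
    a = [1]
    for m in range(1, n + 1):
        # choose the j items forming the lowest block, then order the remaining m - j
        a.append(sum(comb(m, j) * a[m - j] for j in range(1, m + 1)))
    return a[n]


def count_order_patterns(n):
    """Brute force: rank vectors in {0..n-1}^n whose set of ranks is {0..k-1}."""
    return sum(1 for ranks in product(range(n), repeat=n)
               if set(ranks) == set(range(len(set(ranks)))))
```

For every $n$ the count is finite (1, 3, 13, 75 for $n = 1, 2, 3, 4$), in agreement with condition (iii) for this particular theory.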

6.10. Problem. Let $\Sigma$ be a consistent set of $\mathcal{L}$-sentences, and let $\Delta \subseteq \Phi_n$.
Consider the conditions:
(a) $\Sigma$ has a countable model which omits $\Delta$;
(b) for each formula $\psi \in \Phi_n$ such that $\Sigma \cup \{\psi\}$ is consistent there is
a formula $\varphi \in \Delta$ such that

$\Sigma \nvdash \psi \to \varphi$.

Show that (b) $\Rightarrow$ (a) and that if $\Sigma^c$ is complete the two conditions are
equivalent. (Show that (b) is equivalent to condition (ii) of Thm. 6.8.)

6.11. Problem. Let $\mathcal{L}$ be a countable first-order language which has
a distinguished unary predicate $\mathbf{P}$ and for each $n \in \omega$ a constant symbol $\mathbf{n}$.
An $\mathcal{L}$-structure $\mathfrak{A}$ is called an $\omega$-model if

$\mathbf{P}^{\mathfrak{A}} = \{\mathbf{n}^{\mathfrak{A}} : n \in \omega\}$.

A theory $\Sigma$ in $\mathcal{L}$ is said to be $\omega$-complete if for each $\mathcal{L}$-formula $\varphi$ and each
variable $x$ we have

$\Sigma \vdash \varphi(x/\mathbf{0}),\ \Sigma \vdash \varphi(x/\mathbf{1}),\ \ldots$ implies $\Sigma \vdash \forall x(\mathbf{P}x \to \varphi)$.

Let $\Delta$ be the set of formulas

$\{\mathbf{P}v_0,\ v_0 \neq \mathbf{0},\ v_0 \neq \mathbf{1},\ \ldots\}$.

(i) Let $\Sigma$ be a set of $\mathcal{L}$-sentences. Show that the following conditions
on an $\mathcal{L}$-structure $\mathfrak{A}$ are equivalent:
(a) $\mathfrak{A}$ is an $\omega$-model of $\Sigma$;
(b) $\mathfrak{A}$ is a model of $\Sigma \cup \{\mathbf{P}\mathbf{n} : n \in \omega\}$ which omits $\Delta$.
(ii) Show that, if $[\Sigma \cup \{\mathbf{P}\mathbf{n} : n \in \omega\}]^c$ is consistent and $\omega$-complete then
$\Sigma$ has an $\omega$-model. (Show that, using Prob. 6.10 (b), $\Sigma \cup \{\mathbf{P}\mathbf{n} : n \in \omega\}$
has a model omitting $\Delta$, and then apply (i).)
6.12. Problem. We adopt the notation of Prob. 6.11. The $\omega$-rule is the
following infinite rule of proof: from $\varphi(x/\mathbf{0}), \varphi(x/\mathbf{1}), \ldots$ infer $\forall x(\mathbf{P}x \to \varphi)$,
where $\varphi$ is any $\mathcal{L}$-formula. $\omega$-logic is obtained by adding the $\omega$-rule to
the axioms and rules of inference of the first-order predicate calculus
in $\mathcal{L}$ and allowing infinitely long deductions.
(i) Let $\Sigma$ be a set of $\mathcal{L}$-sentences. Prove the $\omega$-completeness theorem:
$\Sigma$ has an $\omega$-model iff $\Sigma \cup \{\mathbf{P}\mathbf{n} : n \in \omega\}$ is consistent in $\omega$-logic. (Consider
the set of all $\mathcal{L}$-sentences provable from $\Sigma \cup \{\mathbf{P}\mathbf{n} : n \in \omega\}$ in $\omega$-logic, and
apply Prob. 6.11 (ii).)
(ii) Show that the compactness theorem fails for $\omega$-models of $\omega$-logic.
6.13. Problem. Let $\mathcal{L}$ be a countable language and for each $n \in \omega$ let
$\Delta_n \subseteq \Phi_n$. Let $\Sigma$ be a consistent set of $\mathcal{L}$-sentences, and consider the
conditions:
(a) $\Sigma$ has a countable model which omits each $\Delta_n$;
(b) for each $n \in \omega$, $[\Delta_n] = \{|\varphi| : \varphi \in \Delta_n\}$ is not included in a principal
filter in $B_n(\Sigma)$.
Show that (b) $\Rightarrow$ (a) and that, if $\Sigma^c$ is complete, the two conditions are
equivalent. (Argue as in the proof of Thm. 6.8.)

Throughout Probs. 6.14-6.17 we assume that $\Sigma$ is a (consistent) complete
theory without finite models, formulated in a countable first-order
language $\mathcal{L}$.
6.14. Problem. A countable model $\mathfrak{A}$ of $\Sigma$ is said to be prime if $\mathfrak{A}$ is
elementarily embeddable in every model of $\Sigma$.
(i) Show that $\mathfrak{A}$ is a prime model of $\Sigma$ iff for each $n$ and each $\mathfrak{a} \in A^n$,
$U(\mathfrak{A}, \mathfrak{a})$ is a principal ultrafilter in $B_n(\Sigma)$. (For necessity, show that

$\{\varphi \in \Phi_n : |\varphi| \in U(\mathfrak{A}, \mathfrak{a})\}$

is realized in every model of $\Sigma$, and apply Thm. 6.8. For sufficiency,
argue as in the proof that (iv) $\Rightarrow$ (i) in Thm. 6.9.)
(ii) Show that $\Sigma$ has a prime model iff $B_n(\Sigma)$ is atomic (Ch. 4, §5) for
each $n \in \omega$. (For necessity, show that, if $\mathfrak{A}$ is a prime model of $\Sigma$, then,
for each $\varphi \in \Phi_n$ such that $|\varphi| \neq 0$ in $B_n(\Sigma)$, there is $\mathfrak{a} \in A^n$ such that $|\varphi| \in U(\mathfrak{A}, \mathfrak{a})$;
then apply (i). For sufficiency, argue as follows. For each $n \in \omega$ let

$\Delta_n = \{\varphi \in \Phi_n : |\neg\varphi| \text{ is an atom of } B_n(\Sigma)\}$;

show that $[\Delta_n]$ satisfies (b) of Prob. 6.13; conclude that $\Sigma$ has a countable
model $\mathfrak{A}$ which omits each $\Delta_n$ and use (i) to show that $\mathfrak{A}$ is prime.)
(iii) Show that any two prime models of $\Sigma$ are isomorphic. (Use (i),
and argue as in the proof that (iv) $\Rightarrow$ (i) in Thm. 6.9.)
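*6.15. Problem. Let $\mathfrak{A}$ be a countable model of $\Sigma$, let $\mathfrak{a} \in A^n$, let $\mathcal{L}(\mathfrak{a})$
be the language for $(\mathfrak{A}, \mathfrak{a})$, and let $\Phi_m(\mathfrak{a})$ be the set of all formulas of
$\mathcal{L}(\mathfrak{a})$ whose free variables are among $v_0,\ldots,v_{m-1}$. A subset $\Gamma \subseteq \Phi_m(\mathfrak{a})$
is said to be compatible with $(\mathfrak{A}, \mathfrak{a})$ if each finite subset of $\Gamma$ is realized in
$(\mathfrak{A}, \mathfrak{a})$. $\mathfrak{A}$ is said to be finitely saturated if, for each $\mathfrak{a} \in A^n$ and each subset
$\Gamma \subseteq \Phi_1(\mathfrak{a})$ which is compatible with $(\mathfrak{A}, \mathfrak{a})$, $\Gamma$ is realized in $(\mathfrak{A}, \mathfrak{a})$. In the
sequel we shall say “saturated” instead of “finitely saturated”.
(i) Show that $\mathfrak{A}$ is saturated iff for each $\mathfrak{a} = (a_0,\ldots,a_{n-1}) \in A^n$ and each
ultrafilter $U$ in $B_{n+1}(\Sigma)$ such that $U(\mathfrak{A}, \mathfrak{a}) \subseteq U$ there is $a_n \in A$ such that
$U = U(\mathfrak{A}, \mathfrak{a}')$, where $\mathfrak{a}' = (a_0,\ldots,a_n)$.
(ii) Let $\mathfrak{A}$ be a saturated model of $\Sigma$ and let $U$ be an ultrafilter in $B_n(\Sigma)$.
Show that there is an $\mathfrak{a} \in A^n$ such that $U = U(\mathfrak{A}, \mathfrak{a})$.
(iii) Show that, if $\Sigma$ has a saturated model, then each $B_n(\Sigma)$ includes
only countably many ultrafilters.
(iv) Let $\mathfrak{A}$ be any countable model of $\Sigma$, let $\mathfrak{a} = (a_0,\ldots,a_{n-1}) \in A^n$, let
$U$ be an ultrafilter in $B_{n+1}(\Sigma)$ such that $U(\mathfrak{A}, \mathfrak{a}) \subseteq U$, and let $\mathfrak{B}$ be a countable
elementary extension of $\mathfrak{A}$. Show that there is a countable elementary
extension $\mathfrak{B}(U, \mathfrak{a})$ of $\mathfrak{B}$ and an element $a_n$ of $\mathfrak{B}(U, \mathfrak{a})$ such that
$U = U(\mathfrak{B}(U, \mathfrak{a}), \mathfrak{a}')$, where $\mathfrak{a}' = (a_0,\ldots,a_n)$. (Use the Compactness Theorem.)
(v) Suppose that, for each $n$, $B_n(\Sigma)$ includes only countably many
ultrafilters. Show that each countable model $\mathfrak{A}$ of $\Sigma$ has a countable
elementary extension $\mathfrak{A}^*$ such that for each $\mathfrak{a} = (a_0,\ldots,a_{n-1}) \in A^n$ and each
ultrafilter $U$ in $B_{n+1}(\Sigma)$ such that $U(\mathfrak{A}, \mathfrak{a}) \subseteq U$ there is $a_n \in A^*$ such that
$U = U(\mathfrak{A}^*, \mathfrak{a}')$, where $\mathfrak{a}' = (a_0,\ldots,a_n)$. (Let $\langle (U_m, \mathfrak{a}_m) \rangle_{m \in \omega}$ be an enumeration of
all pairs $(U, \mathfrak{a})$, where $\mathfrak{a} \in A^n$ for some $n$ and $U$ is an ultrafilter in $B_{n+1}(\Sigma)$
such that $U(\mathfrak{A}, \mathfrak{a}) \subseteq U$. Using (iv), define an elementary chain (Prob. 2.10)
by putting $\mathfrak{A}_0 = \mathfrak{A}$ and, for each $m$, $\mathfrak{A}_{m+1} = \mathfrak{A}_m(U_m, \mathfrak{a}_m)$ (in
the notation of (iv)). Show that $\mathfrak{A}^* = \bigcup_{m \in \omega} \mathfrak{A}_m$ meets the requirements.)
(vi) Suppose that for each $n$, $B_n(\Sigma)$ includes only countably many
ultrafilters. Show that each countable model $\mathfrak{A}$ of $\Sigma$ has a (countable) saturated
elementary extension. (Define an elementary chain $\mathfrak{A}_0, \mathfrak{A}_1, \ldots$ by putting
$\mathfrak{A}_0 = \mathfrak{A}$ and, for each $m$, $\mathfrak{A}_{m+1} = \mathfrak{A}_m^*$ (in the notation of (v)). Show that
$\bigcup_{m \in \omega} \mathfrak{A}_m$ meets the requirements.)
(vii) Show that $\Sigma$ has a saturated model if and only if the Stone space
of each $B_n(\Sigma)$ is countable. (Use (ii) and (vi).)
(viii) Show that, if the isomorphism relation $\cong$ partitions the class
of countable models of $\Sigma$ into countably many equivalence classes, then
$\Sigma$ has a saturated model. (Use (vi).)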
*6.16. Problem. (i) Show that any two saturated models of $\Sigma$ are
isomorphic. (If $\mathfrak{A}$ and $\mathfrak{B}$ are two saturated models of $\Sigma$ show, using a
“back-and-forth” argument like that in the proof of Thm. 6.9, that $A$
and $B$ can be enumerated as $\{c_0, c_1, \ldots\}$ and $\{d_0, d_1, \ldots\}$, respectively, in
such a way that $(\mathfrak{A}, \mathfrak{c}^k) \equiv (\mathfrak{B}, \mathfrak{d}^k)$ for each $k \in \omega$.)
(ii) A countable model $\mathfrak{A}$ of $\Sigma$ is said to be (countably) universal if
each countable model $\mathfrak{B}$ of $\Sigma$ can be elementarily embedded in $\mathfrak{A}$. Show
that each saturated model of $\Sigma$ is universal. (Like (i).)
(iii) Show that if $\Sigma$ has a saturated model it has a prime model. (Use
Probs. 6.15(vii), 4.5.13 and 6.14(ii).)
(iv) Show that $\Sigma$ has a saturated model iff it has a universal model.
(For one direction, use (ii). For the other, show that, if $\Sigma$ has a universal
model, then $B_n(\Sigma)$ has only countably many ultrafilters for each $n$, and
apply Prob. 6.15(vii).)
*6.17. Problem. Let $\mathcal{M}(\Sigma)$ be the class of all countable models of $\Sigma$.
Show that the isomorphism relation $\cong$ on $\mathcal{M}(\Sigma)$ does not have exactly
2 equivalence classes.¹ Less precisely but more suggestively, $\Sigma$ cannot
have exactly 2 non-isomorphic countable models. (Argue by contradiction.
Assume that $\Sigma$ has exactly 2 non-isomorphic models. Then, by the results
of Probs. 6.15 and 6.16, $\Sigma$ has a saturated model $\mathfrak{B}$ and a prime model $\mathfrak{A}$.
Show that $\mathfrak{A}$ and $\mathfrak{B}$ cannot be isomorphic, so that $\mathfrak{B}$ cannot be prime.
Then, by Prob. 6.14(i), there is $\mathfrak{b} \in B^n$ such that $U(\mathfrak{B}, \mathfrak{b})$ is non-principal.
Observe that $(\mathfrak{B}, \mathfrak{b})$ is saturated, so that the complete theory $\Sigma' = \mathrm{Th}((\mathfrak{B}, \mathfrak{b}))$
has a saturated model, and hence a prime model $(\mathfrak{C}, \mathfrak{c})$. Now show that
$\mathfrak{C}$ is a model of $\Sigma$ which is not prime, so that $\mathfrak{C} \not\cong \mathfrak{A}$. Next, using the fact
that $\Sigma$ is not $\aleph_0$-categorical and Thm. 6.9, show that no model of $\Sigma'$ can
be both prime and saturated. Conclude that $(\mathfrak{C}, \mathfrak{c})$ is not saturated, so nor
is $\mathfrak{C}$, and therefore $\mathfrak{C} \not\cong \mathfrak{B}$.)

¹ On the other hand, for any finite $n > 2$ it is possible to find a complete theory $\Sigma$
such that $\cong$ divides $\mathcal{M}(\Sigma)$ into exactly $n$ equivalence classes! This result is due to
Ehrenfeucht, cf. Vaught [1961].

*§ 7. Indiscernibles and models with automorphisms

So far we have discussed two major methods of constructing models for
sets of first-order sentences: Henkin’s method in Chapter 3 (which is
closely related to the Boolean method discussed in §5) and ultraproducts
in §3. (A third method, that of unions of elementary chains, was
outlined in Prob. 6.15.) In this section we introduce a new method: the
generation of models by Skolem functions. This method has a quite
different flavour from the others, and many of its applications involve
combinatorial set theory in an essential way. We shall discuss one such
application: the construction of models with many automorphisms.
Throughout this section we assume that $\mathcal{L}$ is a countable first-order
language which may possess a set $\{\mathbf{f}_n : n \in \omega\}$ of function symbols¹ as well
as a set $\{\mathbf{R}_n : n \in \omega\}$ of predicate symbols. (Recall that we regard constants
as 0-ary function symbols.)
For each $n \in \omega$ let $\lambda(n)$ be the number of argument places in $\mathbf{R}_n$ and
$\lambda'(n)$ the number of argument places in $\mathbf{f}_n$. Then an $\mathcal{L}$-structure is a system
of the form

$\mathfrak{A} = \langle A, \langle R_n \rangle_{n \in \omega}, \langle f_n \rangle_{n \in \omega} \rangle$,

where $A$ is a non-empty set and, for each $n \in \omega$, $R_n$ is a $\lambda(n)$-ary relation
and $f_n$ a $\lambda'(n)$-ary operation on $A$. (Note that some of the $f_n$ may be 0-ary
functions, i.e. designated elements of $A$.) The $f_n$ are called the (basic)
operations of $\mathfrak{A}$.
If $t$ is an $\mathcal{L}$-term whose variables are all among $v_0,\ldots,v_n$ and $a_0,\ldots,a_n \in A$
we define the value $t^{\mathfrak{A}}[a_0,\ldots,a_n]$ of $t$ at $(a_0,\ldots,a_n)$ inductively as follows.
If $t = v_i$ then

$t^{\mathfrak{A}}[a_0,\ldots,a_n] = a_i$;

and if $t = \mathbf{f}_m t_1 \ldots t_{\lambda'(m)}$ then

$t^{\mathfrak{A}}[a_0,\ldots,a_n] = f_m(t_1^{\mathfrak{A}}[a_0,\ldots,a_n],\ldots,t_{\lambda'(m)}^{\mathfrak{A}}[a_0,\ldots,a_n])$.

¹ The results of §§1-6, proved originally for languages with no function symbols other
than constants, extend to $\mathcal{L}$ in a straightforward way. We leave it to the reader to make
these extensions.
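
The inductive definition of the value of a term translates directly into a recursive function. The following is a minimal sketch under an assumed encoding, which is not the book's: a variable $v_i$ is the tuple `("v", i)`, a compound term $\mathbf{f}_m t_1 \ldots t_k$ is `("f", m, [t_1, ..., t_k])`, and the operations of the structure are a dictionary from indices to Python callables. The sample structure on $\{0,\ldots,4\}$ is hypothetical.

```python
def value(term, ops, a):
    """Value of a term at the assignment a, in a structure with operations ops."""
    if term[0] == "v":                  # t = v_i: return a_i
        return a[term[1]]
    _, m, subterms = term               # t = f_m t_1 ... t_k
    return ops[m](*(value(s, ops, a) for s in subterms))


# Hypothetical structure on {0,...,4}: f_0 is the constant 0, f_1 is addition mod 5.
ops = {0: lambda: 0, 1: lambda x, y: (x + y) % 5}

# The term f_1 v_0 f_1 v_1 f_0, i.e. v_0 + (v_1 + 0).
t = ("f", 1, [("v", 0), ("f", 1, [("v", 1), ("f", 0, [])])])
```

A 0-ary operation is called with no arguments, which matches the convention that constants are 0-ary function symbols.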

We assume that the basic semantic definition for $\mathcal{L}$-formulas given
in 2.1.1 has been reformulated as a satisfaction definition as was done
for a language without function symbols in §1.
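The notion of one $\mathcal{L}$-structure being a substructure or elementary
substructure of another is the obvious extension of the corresponding notion
introduced in §1. Thus, e.g., if

$\mathfrak{A} = \langle A, \langle R_n \rangle_{n \in \omega}, \langle f_n \rangle_{n \in \omega} \rangle$,

$\mathfrak{A}' = \langle A', \langle R'_n \rangle_{n \in \omega}, \langle f'_n \rangle_{n \in \omega} \rangle$

are $\mathcal{L}$-structures, then $\mathfrak{A} \subseteq \mathfrak{A}'$ iff

$A \subseteq A'$;

for each $n \in \omega$,

$R_n = R'_n \cap A^{\lambda(n)}$;

and for each $n \in \omega$ and any elements $a_1,\ldots,a_{\lambda'(n)}$ of $A$,

$f_n(a_1,\ldots,a_{\lambda'(n)}) = f'_n(a_1,\ldots,a_{\lambda'(n)})$.

In particular, if $\mathfrak{A}' \subseteq \mathfrak{A}$, then $A'$ is closed under the operations of $\mathfrak{A}$.

If $B$ is a non-empty subset of $A$ which is closed under the operations
of $\mathfrak{A}$, we define the restriction of $\mathfrak{A}$ to $B$ by

$\mathfrak{A}|B = \langle B, \langle R_n \cap B^{\lambda(n)} \rangle_{n \in \omega}, \langle f_n|B \rangle_{n \in \omega} \rangle$.

Clearly we have $\mathfrak{A}|B \subseteq \mathfrak{A}$.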


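Let $\mathcal{L}'$ be an extension of $\mathcal{L}$ obtained by adding a countable set
$\{\mathbf{f}'_i : i \in I\}$ of new function symbols, where, for each $i \in I$, $\mathbf{f}'_i$ is $\delta(i)$-ary.
Let $\mathfrak{A}$ be an $\mathcal{L}$-structure, and for each $i \in I$ let $f'_i$ be a $\delta(i)$-ary operation
on $A$. We write $(\mathfrak{A}, \langle f'_i \rangle_{i \in I})$ for the structure obtained by adjoining the
operations $\{f'_i : i \in I\}$ to $\mathfrak{A}$. Obviously $(\mathfrak{A}, \langle f'_i \rangle_{i \in I}) = \mathfrak{A}'$ is an $\mathcal{L}'$-structure;
it is called an $\mathcal{L}'$-expansion of $\mathfrak{A}$, and $\mathfrak{A}$ is the $\mathcal{L}$-reduction of $\mathfrak{A}'$.
We denote by $\mathcal{L}^+$ the extension of $\mathcal{L}$ obtained by introducing a new
$n$-ary function symbol $\mathbf{f}_{\varphi,\mathfrak{x}}$ for each formula $\varphi$ and each sequence
$\mathfrak{x} = (x_0,\ldots,x_n)$ of distinct variables which include all the free variables
of $\varphi$. We put $\varphi^+_{\mathfrak{x}}$ for the $\mathcal{L}^+$-sentence

$\forall x_1 \ldots \forall x_n(\exists x_0 \varphi \to \varphi(x_0/\mathbf{f}_{\varphi,\mathfrak{x}} x_1 \ldots x_n))$;

$\varphi^+_{\mathfrak{x}}$ is called the defining axiom for $\mathbf{f}_{\varphi,\mathfrak{x}}$. Given a set $\Sigma$ of $\mathcal{L}$-sentences,
we put $\Sigma^+$ for the union of $\Sigma$ with the defining axioms of all the $\mathbf{f}_{\varphi,\mathfrak{x}}$.
Clearly $|\Sigma^+| = \aleph_0$.
We iterate the process of obtaining $\mathcal{L}^+$ from $\mathcal{L}$ and $\Sigma^+$ from $\Sigma$ by
setting

$\mathcal{L}_0 = \mathcal{L}$, $\Sigma_0 = \Sigma$,

$\mathcal{L}_{n+1} = \mathcal{L}_n^+$, $\Sigma_{n+1} = \Sigma_n^+$.

We put

$\mathcal{L}^* = \bigcup_{n \in \omega} \mathcal{L}_n$, $\Sigma^* = \bigcup_{n \in \omega} \Sigma_n$.

Clearly $\mathcal{L}^*$ is countable and $|\Sigma^*| = \aleph_0$.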

We now introduce one of the central notions of this section: $\Sigma$ is called
a Skolem set in $\mathcal{L}$ if $\Sigma$ is a consistent set of $\mathcal{L}$-sentences and

(7.1) for each formula $\varphi$ and each sequence $x_0,\ldots,x_n$ of distinct
variables which include all the free variables of $\varphi$ there is a term $t$
whose variables are all among $x_1,\ldots,x_n$ such that

$\Sigma \vdash \forall x_1 \ldots \forall x_n(\exists x_0 \varphi \to \varphi(x_0/t))$.

($t$ is called a Skolem term for $\varphi$.)

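7.2. Lemma. Let $\Sigma$ be a consistent set of $\mathcal{L}$-sentences. Then $\Sigma^*$ is a Skolem
set in $\mathcal{L}^*$. In fact, each model of $\Sigma$ has an $\mathcal{L}^*$-expansion which is a
model of $\Sigma^*$.
Proof. It is clear from the definitions of $\Sigma^*$ and $\mathcal{L}^*$ that $\Sigma^*$ satisfies (7.1).
Moreover, since $\Sigma$ is consistent, it has a model. Thus, once we have shown
that each model of $\Sigma$ can be expanded to a model of $\Sigma^*$ it will follow
immediately that $\Sigma^*$ is consistent, and hence that it is a Skolem set.
Let $\mathfrak{A}$ be any model of $\Sigma$. We show how to expand $\mathfrak{A}$ to an $\mathcal{L}^*$-structure
$\mathfrak{A}^*$ which is a model of $\Sigma^*$.
Let $g$ be a choice function for $PA$. For each $\mathcal{L}^*$-formula $\varphi$ and each
sequence $\mathfrak{x} = (x_0,\ldots,x_n)$ of distinct variables which includes all the free
variables of $\varphi$ we shall define a function $f_{\varphi,\mathfrak{x}}$ from the appropriate Cartesian
power of $A$ into $A$ which will serve as the interpretation of $\mathbf{f}_{\varphi,\mathfrak{x}}$. Let
$\Phi_{-1} = \emptyset$ and for each $n \in \omega$ let $\Phi_n$ be the set of all $\mathcal{L}_n$-formulas. For each
$\mathcal{L}^*$-formula $\varphi$ let $\mathcal{X}(\varphi)$ be the set of all finite sequences of distinct variables
which include all the free variables of $\varphi$. Now suppose that $f_{\varphi,\mathfrak{x}}$ has been
defined for all $\varphi \in \Phi_{n-1}$ and all $\mathfrak{x} \in \mathcal{X}(\varphi)$. Let

$\mathfrak{A}_n = (\mathfrak{A}, \langle f_{\varphi,\mathfrak{x}} \rangle_{\varphi \in \Phi_{n-1},\ \mathfrak{x} \in \mathcal{X}(\varphi)})$.

For each formula $\varphi \in \Phi_n - \Phi_{n-1}$ and each $\mathfrak{x} = (x_0,\ldots,x_m) \in \mathcal{X}(\varphi)$ we put

$\psi = \varphi(x_0/v_0,\ldots,x_m/v_m)$

and define $f_{\varphi,\mathfrak{x}} : A^m \to A$ by setting, for each $(a_1,\ldots,a_m) \in A^m$,

$f_{\varphi,\mathfrak{x}}(a_1,\ldots,a_m) = g(Y)$ if $Y \neq \emptyset$, and a fixed element of $A$, say $g(A)$, if $Y = \emptyset$,

where

$Y = \{a \in A : \mathfrak{A}_n \models \psi\,[a,a_1,\ldots,a_m]\}$.

In this way, by induction, we define $f_{\varphi,\mathfrak{x}}$ for all $\mathcal{L}^*$-formulas $\varphi$ and
all $\mathfrak{x} \in \mathcal{X}(\varphi)$. If we now put

$\mathfrak{A}^* = (\mathfrak{A}, \langle f_{\varphi,\mathfrak{x}} \rangle_{\varphi \in \bigcup_{n} \Phi_n,\ \mathfrak{x} \in \mathcal{X}(\varphi)})$,

it is easy to see that $\mathfrak{A}^*$ meets the required conditions. ■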
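
For a finite structure the way a choice function produces a Skolem function can be sketched directly. In the sketch below the formula is represented by a Python predicate `phi(x0, x1, ..., xm)`, the choice function defaults to `min`, and the value taken when no witness exists is `choice(universe)`; all three are assumptions made for this illustration, and the names are hypothetical. The example formula says that $x_0$ is a square root of $x_1$ modulo 7.

```python
def skolem_function(universe, phi, choice=min):
    """Return f with f(a1..am) = choice(Y), where Y = {a : phi(a, a1..am)}."""
    def f(*params):
        Y = [a for a in universe if phi(a, *params)]
        # assumed default: fall back to choice(universe) when there is no witness
        return choice(Y) if Y else choice(universe)
    return f


A = range(7)


def phi(x0, x1):
    """x0 * x0 = x1 (mod 7)."""
    return (x0 * x0) % 7 == x1


f = skolem_function(A, phi)
```

Whenever some element satisfies `phi` together with the given parameters, `f` returns such an element, which is exactly what the defining axiom for the new function symbol requires.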


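Let us call an $\mathcal{L}$-structure $\mathfrak{A}$ a Skolem ($\mathcal{L}$-)structure if $\mathrm{Th}(\mathfrak{A})$ (§4) is
a Skolem set in $\mathcal{L}$. It follows immediately from the proof of Lemma 7.2
that each $\mathcal{L}$-structure can be expanded to a Skolem $\mathcal{L}^*$-structure.
Let $\mathfrak{A}$ be an $\mathcal{L}$-structure, and let $X \subseteq A$. We put

$H_{\mathfrak{A}}(X) = \{t^{\mathfrak{A}}[x_0,\ldots,x_n] : t$ is an $\mathcal{L}$-term whose variables are
among $v_0,\ldots,v_n$ and $x_0,\ldots,x_n \in X\}$.

It is easy to verify that $H_{\mathfrak{A}}(X)$ is the smallest subset of $A$ which includes
$X$ and is closed under the operations of $\mathfrak{A}$. In particular, we may put

$\mathfrak{H}_{\mathfrak{A}}(X) = \mathfrak{A}|H_{\mathfrak{A}}(X)$.

$\mathfrak{H}_{\mathfrak{A}}(X)$ is called the Skolem hull of $X$ in $\mathfrak{A}$. $X$ is said to generate $\mathfrak{A}$ if
$\mathfrak{H}_{\mathfrak{A}}(X) = \mathfrak{A}$.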
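
When the universe is finite, the set $H_{\mathfrak{A}}(X)$ can be computed by closing $X$ under the operations until nothing new appears. The sketch below assumes the operations are given as `(arity, function)` pairs, which is an encoding chosen for this example only; the sample operations on the integers modulo 12 are likewise hypothetical.

```python
from itertools import product


def hull(X, ops):
    """Least superset of X closed under ops, given as (arity, function) pairs."""
    closed = set(X)
    while True:
        # apply every operation to every tuple of elements found so far
        new = {f(*args)
               for arity, f in ops
               for args in product(closed, repeat=arity)} - closed
        if not new:
            return closed
        closed |= new


def add_mod_12(x, y):
    return (x + y) % 12


def six():
    return 6
```

With addition modulo 12 alone, `hull({4}, [(2, add_mod_12)])` is `{0, 4, 8}`; adding the 0-ary operation `six` enlarges the result to the even residues. The loop terminates only because the universe is finite.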
7.3. Lemma. Let $\mathfrak{A}$ be a Skolem structure. Then:
(i) each substructure of $\mathfrak{A}$ is an elementary substructure of $\mathfrak{A}$;
(ii) if $\emptyset \neq X \subseteq A$, then $\mathfrak{H}_{\mathfrak{A}}(X)$ is the smallest elementary substructure
of $\mathfrak{A}$ which includes $X$. Moreover, if $X$ is infinite, then $|H_{\mathfrak{A}}(X)| = |X|$.
Proof. (i) follows immediately from (7.1) and Lemma 1.5. As for (ii),
since $\mathfrak{H}_{\mathfrak{A}}(X) \subseteq \mathfrak{A}$ by definition, we have $\mathfrak{H}_{\mathfrak{A}}(X) \prec \mathfrak{A}$ by (i). Also, if $\mathfrak{B} \prec \mathfrak{A}$
and $X \subseteq B$, then $B$ must be closed under the operations of $\mathfrak{A}$, so, since
$H_{\mathfrak{A}}(X)$ is the least subset of $A$ including $X$ and closed under the operations
of $\mathfrak{A}$, it follows that $H_{\mathfrak{A}}(X) \subseteq B$. The final assertion follows easily from
the countability of $\mathcal{L}$.

We now introduce the important notion of a set of indiscernible elements
in a structure.
Let $\mathfrak{A}$ be an $\mathcal{L}$-structure, and let $X$ be a subset of $A$ which carries
a (strict) total ordering $<$. ($<$ is not necessarily a basic relation of $\mathfrak{A}$.)
We say that $X$ is a set of indiscernible elements¹ in $\mathfrak{A}$ (with respect to $<$)
if for each $\mathcal{L}$-formula $\varphi$ whose free variables are all among $v_0,\ldots,v_n$ and
each pair of sequences $x_0 < \ldots < x_n$ and $y_0 < \ldots < y_n$ from $X$, we have

$\mathfrak{A} \models \varphi[x_0,\ldots,x_n] \Leftrightarrow \mathfrak{A} \models \varphi[y_0,\ldots,y_n]$.

Clearly this condition is equivalent to

$(\mathfrak{A}, \langle x_0,\ldots,x_n \rangle) \equiv (\mathfrak{A}, \langle y_0,\ldots,y_n \rangle)$.

¹ For some examples of sets of indiscernible elements see Prob. 7.9.

Our next task is to establish the existence of models of a given consistent
set of sentences with sets of indiscernibles of any prescribed order type.
For this we shall need a combinatorial result.
For each set $X$ and each $r \in \omega$ let $[X]^r$ be the family of all subsets of
$X$ with exactly $r$ elements. Then we have:

7.4. Ramsey’s Theorem. Let $X$ be an infinite set, let $r, k \geq 1$, and let $\{C_1,\ldots,C_k\}$
be a partition of $[X]^r$ into $k$ pieces. Then there is an infinite subset $Y$ of
$X$ and some $j$, $1 \leq j \leq k$, such that $[Y]^r \subseteq C_j$.
Proof. By induction on $r$. For $r = 1$ the result is trivial. Assume then that
the theorem holds for $r$; we show that it holds for $r+1$.
Let $\{C_1,\ldots,C_k\}$ be a partition of $[X]^{r+1}$. For distinct $x_1,\ldots,x_r$ in $X$, put

$M_i(\{x_1,\ldots,x_r\}) = \{x \in X : \{x_1,\ldots,x_r,x\} \in C_i\}$.

Let $\mathcal{F}$ be a non-principal ultrafilter over $X$; then by Prob.
4.3.16(i) each member of $\mathcal{F}$ is infinite. Moreover, it is clear that, for
each $r$-tuple $x_1,\ldots,x_r$ of distinct elements of $X$, the $M_i(\{x_1,\ldots,x_r\})$
form a finite partition of $Z = X - \{x_1,\ldots,x_r\}$. Since $\mathcal{F}$ is non-principal,
$Z \in \mathcal{F}$, so that by Prob. 4.3.16(iii) exactly one of the $M_i(\{x_1,\ldots,x_r\})$ is
in $\mathcal{F}$. Let $M(\{x_1,\ldots,x_r\})$ be the unique $M_i(\{x_1,\ldots,x_r\})$ which is in $\mathcal{F}$.
We now define a sequence $y_0, y_1, \ldots$ of distinct elements of $X$ as follows.
First, choose $y_0,\ldots,y_{r-1}$ to be any $r$ distinct members of $X$. Then, if $n \geq r$
and $y_m$ has been chosen for all $m < n$, pick

$y_n \in \bigcap \{M(\{y_{m_1},\ldots,y_{m_r}\}) : m_1 < \ldots < m_r < n\}$.

(Notice that this latter set is a finite intersection of members of $\mathcal{F}$, hence
is a member of $\mathcal{F}$ and accordingly non-empty.) From the definition it
is clear that $y_n$ is different from all of its predecessors.
Now put $Y' = \{y_0, y_1, \ldots\}$; then $Y'$ is infinite and so by inductive hypothesis
the theorem holds for $Y'$ and $r$. Thus, if for each $i$, $1 \leq i \leq k$, we define

$C'_i = \{\{y_{m_1},\ldots,y_{m_r}\} \in [Y']^r : M_i(\{y_{m_1},\ldots,y_{m_r}\}) \in \mathcal{F}\}$,

then $\{C'_1,\ldots,C'_k\}$ is a partition of $[Y']^r$, so there must be an infinite subset
$Y$ of $Y'$ and a $j$, $1 \leq j \leq k$, for which $[Y]^r \subseteq C'_j$. We claim that $[Y]^{r+1} \subseteq C_j$.
For let $y_{m_1},\ldots,y_{m_{r+1}} \in Y$, with $m_1 < \ldots < m_{r+1}$. Then $\{y_{m_1},\ldots,y_{m_r}\} \in C'_j$ so
that $M_j(\{y_{m_1},\ldots,y_{m_r}\}) \in \mathcal{F}$. Hence, by definition,

$M(\{y_{m_1},\ldots,y_{m_r}\}) = M_j(\{y_{m_1},\ldots,y_{m_r}\})$.

But then, by the definition of $y_{m_{r+1}}$, we have

$y_{m_{r+1}} \in M(\{y_{m_1},\ldots,y_{m_r}\})$,

so that $y_{m_{r+1}} \in M_j(\{y_{m_1},\ldots,y_{m_r}\})$, i.e.

$\{y_{m_1},\ldots,y_{m_r},y_{m_{r+1}}\} \in C_j$.

This proves the claim, the induction step, and the theorem. ■
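
The theorem above concerns infinite sets and cannot be checked by machine, but its best-known finite analogue can. The sketch below is an illustration only and is not part of the proof: for $r = 2$ and $k = 2$ it verifies exhaustively that every partition of the pairs from a 6-element set into two classes has a 3-element subset all of whose pairs lie in one class, while a 5-element set admits a partition with no such subset. The function names are hypothetical.

```python
from itertools import combinations, product


def has_monochromatic_triangle(n, colour):
    """True if some 3-subset of range(n) has all three of its pairs in one class."""
    return any(colour[a, b] == colour[b, c] == colour[a, c]
               for a, b, c in combinations(range(n), 3))


def every_colouring_has_triangle(n):
    """Check all 2-colourings of the pairs from range(n) by brute force."""
    edges = list(combinations(range(n), 2))
    return all(has_monochromatic_triangle(n, dict(zip(edges, bits)))
               for bits in product((0, 1), repeat=len(edges)))
```

For $n = 6$ the check runs over $2^{15}$ colourings; the homogeneous set it finds is finite, whereas Theorem 7.4 yields an infinite one.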

We now use this result in the proof of:

7.5. Theorem. Let $\Sigma$ be a set of sentences of $\mathcal{L}$ with an infinite model,
and let $\langle X, < \rangle$ be any totally ordered set. Then there is a model of $\Sigma$ of
cardinality $\max(\aleph_0, |X|)$ which includes $X$ as a set of indiscernibles.
Proof. Firstly, we observe that without loss of generality we may assume
that $X$ is infinite. For otherwise we may extend it to a countably infinite
totally ordered set $X'$, and if the theorem holds for $X'$, it holds also for $X$.
Let $\mathcal{L}'$ be the extension of $\mathcal{L}$ obtained by adding the new set of constants
$\{\mathbf{c}_x : x \in X\}$ and let $\Sigma'$ consist of the following $\mathcal{L}'$-sentences:
(i) all members of $\Sigma$;

(ii) $\varphi(v_0/\mathbf{c}_{x_0},\ldots,v_n/\mathbf{c}_{x_n}) \leftrightarrow \varphi(v_0/\mathbf{c}_{y_0},\ldots,v_n/\mathbf{c}_{y_n})$

for each $\mathcal{L}$-formula $\varphi$ with free variables among $v_0,\ldots,v_n$ and each pair of
sequences $x_0,\ldots,x_n$, $y_0,\ldots,y_n$ from $X$ such that $x_0 < \ldots < x_n$ and $y_0 < \ldots < y_n$;

(iii) $\mathbf{c}_{x_1} \neq \mathbf{c}_{x_2}$

for $x_1 \neq x_2$ in $X$.
We show that each finite subset $\Sigma_0$ of $\Sigma'$ has a model.
Let $\mathfrak{A}$ be an infinite model of $\Sigma$, and let $<^*$ be a fixed total ordering
of $A$. Let $\varphi_1,\ldots,\varphi_s$ be a list of all formulas playing the role of $\varphi$ in sentences
of type (ii) in $\Sigma_0$; suppose that $\varphi_i$ has $n_i$ free variables. Then $\varphi_1$ determines
a partition of $[A]^{n_1}$ into two classes: those $n_1$-element subsets of $A$ that,
when put in increasing order under $<^*$, satisfy $\varphi_1$ in $\mathfrak{A}$, and those that
do not. Since $A$ is infinite, by Ramsey’s Theorem there is an infinite subset
$A_1$ of $A$ such that $[A_1]^{n_1}$ is entirely included in one of these classes. Now
repeat the procedure, dividing $[A_1]^{n_2}$ into two classes according to satisfaction
of $\varphi_2$ or $\neg\varphi_2$ in $\mathfrak{A}$; we obtain an infinite subset $A_2$ of $A_1$ such that $[A_2]^{n_2}$
is included in one of these classes. Iterate this procedure $s$ times to obtain
a sequence $A_1,\ldots,A_s$ such that
(a) $A_s \subseteq A_{s-1} \subseteq \ldots \subseteq A_1 \subseteq A$;
(b) each $A_i$ is infinite;
(c) any two members of $[A_i]^{n_i}$, when their elements are arranged in
increasing order under $<^*$, either both satisfy $\varphi_i$ in $\mathfrak{A}$ or both satisfy
$\neg\varphi_i$ in $\mathfrak{A}$.
Putting $Y = A_s$, it follows from (a) and (c) that for all $i$, $1 \leq i \leq s$, and
each pair of increasing sequences $a_1,\ldots,a_{n_i}$, $b_1,\ldots,b_{n_i}$ from $Y$,
(d) $\mathfrak{A} \models \varphi_i[a_1,\ldots,a_{n_i}] \Leftrightarrow \mathfrak{A} \models \varphi_i[b_1,\ldots,b_{n_i}]$.
Let $Z$ be the set of elements $x$ of $X$ such that $\mathbf{c}_x$ occurs in a sentence of $\Sigma_0$.
Since $Y$ is infinite and $Z$ is finite, there is a one-one order-preserving
map $f$ of $\langle Z, < \rangle$ into $\langle Y, <^* \rangle$. We claim that
$\mathfrak{A}' = (\mathfrak{A}, \langle f(z) \rangle_{z \in Z})$
is a model of $\Sigma_0$. Since $\mathfrak{A}$ is a model of $\Sigma$, all sentences of type (i) hold
in $\mathfrak{A}'$, while those of type (iii) in $\Sigma_0$ hold since $f$ is one-one. Finally, (d)
and the fact that $f$ is order preserving immediately imply that $\mathfrak{A}'$ is a model
of those sentences in $\Sigma_0$ of type (ii). Thus $\mathfrak{A}'$ is a model of $\Sigma_0$.
By the compactness theorem, then, $\Sigma'$ has a model $\mathfrak{B}''$. Since $\mathfrak{B}''$ is
a model of all sentences of the form (iii), we may, without loss of generality,
identify the interpretations of $\mathbf{c}_x$, for $x \in X$, in $\mathfrak{B}''$, with the elements $x$
themselves. Thus $B''$ must be infinite. It is now clear that the $\mathcal{L}$-reduction
$\mathfrak{B}'$ of $\mathfrak{B}''$ is a model of $\Sigma$ including $X$ as a set of indiscernibles. By Thm. 2.1
we can find an elementary substructure $\mathfrak{B}$ of $\mathfrak{B}'$ of cardinality $\max(\aleph_0, |X|)$
which includes $X$. It is easy to see that $X$ is also a set of indiscernibles
in $\mathfrak{B}$, so that $\mathfrak{B}$ satisfies the requirements of the theorem.

An automorphism of a structure is an isomorphism of the structure
onto itself. If $\langle X, < \rangle$ is a totally ordered set, an order automorphism of $X$
is an automorphism of the structure $\langle X, < \rangle$. We now show how order
automorphisms of sets of indiscernibles give rise to automorphisms of
structures.

7.6. Theorem. Let $\mathfrak{A}$ be a Skolem structure, and let $\langle X, < \rangle$ be a set of
indiscernibles in $\mathfrak{A}$ such that $X$ generates $\mathfrak{A}$. Then each one-one order
preserving map $f : X \to X$ can be extended uniquely to an elementary embedding
$g$ of $\mathfrak{A}$ into itself. Moreover, if $f$ is an order automorphism of $X$, then $g$
is an automorphism of $\mathfrak{A}$.
Proof. Since $X$ generates $\mathfrak{A}$, each element $a$ of $A$ is of the form $t^{\mathfrak{A}}[x_0,\ldots,x_n]$,
where $t$ is a term whose variables are all among $v_0,\ldots,v_n$ and $x_0,\ldots,x_n \in X$.
By making suitable changes of variable in $t$, if necessary, we may assume
that $x_0 < \ldots < x_n$. We shall call the equation $a = t^{\mathfrak{A}}[x_0,\ldots,x_n]$ a
presentation of $a$.
Let $a = t^{\mathfrak{A}}[x_0,\ldots,x_n]$ be a presentation of $a$. We put

$g(a) = t^{\mathfrak{A}}[f(x_0),\ldots,f(x_n)]$.

We first show that $g(a)$ is well-defined. Let $a = s^{\mathfrak{A}}[y_0,\ldots,y_m]$ be any other
presentation of $a$. Then we have

(1) $t^{\mathfrak{A}}[x_0,\ldots,x_n] = s^{\mathfrak{A}}[y_0,\ldots,y_m]$.

Let $\{z_0,\ldots,z_q\}$ be an enumeration of the set $\{x_0,\ldots,x_n,y_0,\ldots,y_m\}$ in increasing
order. Then the equation (1) may be expressed in the form

$\mathfrak{A} \models \varphi[z_0,\ldots,z_q]$

for some formula $\varphi$. Since $X$ is a set of indiscernibles for $\mathfrak{A}$ and $f$ preserves
order, it follows that

$\mathfrak{A} \models \varphi[f(z_0),\ldots,f(z_q)]$.

This immediately gives

$t^{\mathfrak{A}}[f(x_0),\ldots,f(x_n)] = s^{\mathfrak{A}}[f(y_0),\ldots,f(y_m)]$.

Thus $g$ is well-defined. Also, it is easy to see that it extends $f$.
We now show that $g$ is an elementary embedding of $\mathfrak{A}$ into itself. Let
$\varphi$ be a formula whose free variables are all among $v_0,\ldots,v_k$ and let
$a_0,\ldots,a_k \in A$ be such that $\mathfrak{A} \models \varphi[a_0,\ldots,a_k]$. For each $i$, $0 \leq i \leq k$, let
$a_i = t_i^{\mathfrak{A}}[x_{i0},\ldots,x_{in_i}]$ be a presentation of $a_i$. Let $y_0 < \ldots < y_q$ be an enumeration
of the set $\bigcup\{\{x_{i0},\ldots,x_{in_i}\} : i = 0,\ldots,k\}$ in increasing order. By making
suitable changes of variable in the $t_i$ we may assume that our presentations
are actually

$a_i = t_i^{\mathfrak{A}}[y_0,\ldots,y_q]$, $0 \leq i \leq k$.

Now let

$\psi = \varphi(v_0/t_0,\ldots,v_k/t_k)$.

Then we clearly have

$\mathfrak{A} \models \varphi[a_0,\ldots,a_k] \Leftrightarrow \mathfrak{A} \models \psi[y_0,\ldots,y_q]$,
$\mathfrak{A} \models \varphi[g(a_0),\ldots,g(a_k)] \Leftrightarrow \mathfrak{A} \models \psi[f(y_0),\ldots,f(y_q)]$.

But, since $f$ is order preserving and $X$ is a set of indiscernibles in $\mathfrak{A}$, we have

$\mathfrak{A} \models \psi[y_0,\ldots,y_q] \Leftrightarrow \mathfrak{A} \models \psi[f(y_0),\ldots,f(y_q)]$.

Hence

$\mathfrak{A} \models \varphi[a_0,\ldots,a_k] \Leftrightarrow \mathfrak{A} \models \varphi[g(a_0),\ldots,g(a_k)]$,

so that, by Prob. 1.4(iii), $g$ is an elementary embedding of $\mathfrak{A}$ into itself.
The proof of the fact that $g$ is unique and that it is an automorphism
if $f$ is an order automorphism is left to the reader. ■

Our final task in this chapter is to establish the existence of models
with many automorphisms. To achieve this we shall need:

7.7. Lemma. For each infinite cardinal $\kappa$ there is an ordered set of cardinality
$\kappa$ with exactly $2^\kappa$ order automorphisms.
Proof. We first observe that the ordered set $Q$ of rational numbers has an
order automorphism $f$ which differs from the identity, e.g. $f(r) = r+1$.
(In fact $Q$ has $2^{\aleph_0}$ automorphisms, but we shall not need this many.) Now
let $\kappa$ be an infinite cardinal, and let $\{Q_\xi : \xi < \kappa\}$ be a disjoint family of
copies of $Q$. The set $X = \bigcup_{\xi < \kappa} Q_\xi$ may then be ordered in the obvious
way: for $x, y \in X$, put $x < y$ if $x \in Q_\xi$ and $y \in Q_\eta$ with $\xi < \eta$, or $x, y \in Q_\xi$ and
$x < y$ in $Q_\xi$. $X$ is then of cardinality $\kappa$, and so has at most $\kappa^\kappa = 2^\kappa$ order
automorphisms. We now construct a set of $2^\kappa$ distinct automorphisms
of $X$ as follows. Since $f$ is an automorphism of $Q$, it may be regarded as
an automorphism of each $Q_\xi$. For each $Y \subseteq \kappa$ define $g_Y : X \to X$ by the
condition

$g_Y|Q_\xi = f$ if $\xi \in Y$, and $g_Y|Q_\xi =$ identity on $Q_\xi$ if $\xi \notin Y$.

It is now easy to see that $\{g_Y : Y \subseteq \kappa\}$ is a family of $2^\kappa$ distinct automorphisms
of $X$. ■
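
The maps $g_Y$ can be tried out on a small finite stand-in for the index set. The sketch below is an illustration under stated assumptions and proves nothing about infinite $\kappa$: it uses three copies of $Q$ instead of $\kappa$ many, represents a point of the copy $Q_\xi$ as the pair `(xi, r)` so that Python's tuple comparison gives the ordering described in the proof, and checks order preservation only on a finite sample. All names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def g(Y):
    """g_Y: shift by 1 inside the copies Q_xi with xi in Y, identity elsewhere."""
    def g_Y(point):
        xi, r = point
        return (xi, r + 1) if xi in Y else (xi, r)
    return g_Y


copies = range(3)
# all 8 subsets Y of the index set {0, 1, 2}
subsets = [frozenset(s) for k in range(4) for s in combinations(copies, k)]
# a finite sample of points (xi, r), compared lexicographically
sample = [(xi, Fraction(p, 2)) for xi in copies for p in range(-4, 5)]
```

Distinct subsets give distinct maps because $g_Y$ and $g_{Y'}$ differ on the point `(xi, 0)` for any index `xi` that lies in exactly one of the two subsets.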

We can now prove

7.8. Theorem. Let $\Sigma$ be a set of $\mathcal{L}$-sentences with an infinite model. Then
for each infinite cardinal $\kappa$, $\Sigma$ has a model of cardinality $\kappa$ with exactly
$2^\kappa$ automorphisms.
Proof. $\Sigma$ is consistent and has an infinite model, so by Lemma 7.2 the
Skolem set $\Sigma^*$ also has an infinite model. By Lemma 7.7 we can find an
ordered set $X$ of cardinality $\kappa$ with $2^\kappa$ order automorphisms. By Thm. 7.5
there is a model $\mathfrak{A}$ of $\Sigma^*$ of cardinality $\kappa$, including $X$ as a set of
indiscernibles. Then $\mathfrak{A}$ is evidently a Skolem structure; let $\mathfrak{A}'$ be the Skolem
hull of $X$ in $\mathfrak{A}$. By Lemma 7.3, $\mathfrak{A}'$ is an elementary substructure of $\mathfrak{A}$ of
cardinality $\kappa$. Moreover, $X$ generates $\mathfrak{A}'$. Hence, by Thm. 7.6, each order
automorphism of $X$ can be extended to an automorphism of $\mathfrak{A}'$, so that
$\mathfrak{A}'$ has at least $2^\kappa$ automorphisms. Let $\mathfrak{B}$ be the $\mathcal{L}$-reduction of $\mathfrak{A}'$; then
every automorphism of $\mathfrak{A}'$ is obviously an automorphism of $\mathfrak{B}$ so that $\mathfrak{B}$
has at least $2^\kappa$, and hence, since $\|\mathfrak{B}\| = \kappa$, exactly $2^\kappa$ automorphisms. ■

Thm. 7.8 has the following interesting consequences. First, let $\mathfrak{R}$ be
the field of real numbers. Then there is a field $\mathfrak{A}$ of cardinality $2^{\aleph_0}$ such
that $\mathfrak{A} \equiv \mathfrak{R}$ and $\mathfrak{A}$ has $2^{2^{\aleph_0}}$ automorphisms. This should be contrasted
with the well-known fact that $\mathfrak{R}$ itself has only one automorphism, namely
the identity.
Secondly, Thm. 7.8 implies that the complex field $\mathfrak{C}$ has $2^{2^{\aleph_0}}$
automorphisms, for since (as is well-known) the theory of algebraically
closed fields (of characteristic 0) is $2^{\aleph_0}$-categorical, the algebraically closed
field of cardinality $2^{\aleph_0}$ with $2^{2^{\aleph_0}}$ automorphisms whose existence is implied
by Thm. 7.8 must be isomorphic to $\mathfrak{C}$.

7.9. Problem. (i) Let $\mathfrak{A}$ be an $\mathcal{L}$-structure, and let $\langle X, < \rangle$ be a totally
ordered subset of $A$. Suppose that for any pair of increasing $n$-tuples
$x_1 < \ldots < x_n$ and $y_1 < \ldots < y_n$ from $X$ there is an automorphism $f$ of $\mathfrak{A}$ such
that $f(x_1) = y_1,\ldots,f(x_n) = y_n$. Show that $X$ is a set of indiscernibles in $\mathfrak{A}$.
Use this result to show that in each of the following examples $X$ is a set
of indiscernibles in the structure $\mathfrak{A}$:
*(i) $\mathfrak{A}$ is the complex field and $X$ is a set of algebraically independent
elements in $A$;
(ii) $\mathfrak{A}$ is the ordered set of reals and $\langle X, < \rangle = \mathfrak{A}$;
(iii) $\mathfrak{A}$ is the free Boolean algebra generated by a set $X$;
*(iv) $\mathfrak{A}$ is a Boolean algebra and $X$ is the set of all atoms of $\mathfrak{A}$.
7.10. Problem. Let $\mathfrak{C}$ be the complex field. Show that the set of real
numbers is not first-order definable in $\mathfrak{C}$; that is, show that there is no
formula $\varphi$ with one free variable in the language for fields such that, for
all $a \in C$, $\mathfrak{C} \models \varphi[a]$ iff $a$ is real. (Show that, if there were such a formula $\varphi$,
then any automorphism of $\mathfrak{C}$ must take reals to reals, and hence would
have to be either the identity or conjugation. This contradicts the fact,
derived earlier, that $\mathfrak{C}$ has more than 2 automorphisms.)
7.11. Problem. Let $X$ be a set of indiscernibles in a Skolem $\mathscr{L}$-structure $\mathfrak{A}$.
Show that:
(i) If $Y \subseteq X$, then $Y$ is a set of indiscernibles in $\mathfrak{H}_{\mathfrak{A}}(Y)$ (with respect
to the ordering induced by the ordering of $X$) and $\mathfrak{H}_{\mathfrak{A}}(Y) \prec \mathfrak{H}_{\mathfrak{A}}(X)$.
(ii) If $X$ is infinite and $Y$ is an arbitrary infinite totally ordered set,
then there is a structure $\mathfrak{B}$ in which $Y$ is a set of indiscernibles, and the
sets of formulas satisfied by increasing sequences from $X$ in $\mathfrak{A}$ and from
$Y$ in $\mathfrak{B}$ are the same. (Let $\Sigma$ be the set of all $\mathscr{L}$-formulas satisfied by
increasing sequences from $X$ in $\mathfrak{A}$. Let $\mathscr{L}'$ be obtained from $\mathscr{L}$ by adding
constants $c_y$, for $y \in Y$, and let $\Sigma'$ be the set of all $\mathscr{L}'$-sentences $\varphi(c_{y_1}, \dots, c_{y_n})$,
where $\varphi \in \Sigma$ and $y_1 < \dots < y_n$ in $Y$. Show that $\Sigma'$ is consistent and that
the $\mathscr{L}$-reduction $\mathfrak{B}$ of any model $\mathfrak{B}'$ of $\Sigma'$ meets the requirements.)
(iii) If $Y$ is a set of indiscernibles in $\mathfrak{B}$ such that the sets of formulas
satisfied by increasing sequences from $X$ in $\mathfrak{A}$ and $Y$ in $\mathfrak{B}$ are the same,
then $\mathfrak{B}$ is a Skolem structure and each one-one order-preserving map
$f$ of $X$ into $Y$ can be extended uniquely to an elementary embedding of
$\mathfrak{H}_{\mathfrak{A}}(X)$ into $\mathfrak{H}_{\mathfrak{B}}(Y)$. (Like the proof of Thm. 7.6.)
(iv) If 7, 93 are as in (iii), and X and 7 are infinite, then, for any set of
formulas A whose free variables are included among §>3,(7)
realizes A iff §>^(7) realizes A.
7.12. Problem. (i) Let $\mathfrak{A}$ be a Skolem structure, and let $X$ be an infinite
set of indiscernibles in $\mathfrak{A}$ such that $X$ generates $\mathfrak{A}$. Show that, if $f$ is a
one-one order-preserving map of $X$ into itself which is not onto, then the
unique elementary embedding of $\mathfrak{A}$ into itself given by Thm. 7.6 is not
onto. (Let $g$ be the extension of $f$ to an elementary embedding of $\mathfrak{A}$ into
itself. Show first that $g[A] = H_{\mathfrak{A}}(f[X])$. Then show that, if $y \in X - f[X]$,
then $y \notin H_{\mathfrak{A}}(f[X])$.)
(ii) Let $\Sigma$ be a set of sentences with an infinite model. Show that, for
each infinite cardinal $\kappa$, $\Sigma$ has a model of cardinality $\kappa$ which has an
elementary embedding onto a proper substructure of itself. (Take $X$ to be
the ordered set $\omega$, and then argue as in the proof of Thm. 7.8, using (i).)

§ 8. Historical and bibliographical remarks

The discovery that a mathematical theory may have more than one model
was made in the nineteenth century when Riemann and Klein
established the independence of the parallel postulate by constructing
a model of the other axioms of geometry in which the parallel postulate
fails. Following the formalization of predicate logic by Frege in the late
nineteenth century, Löwenheim proved in 1915 that a finitely axiomatizable
theory with a model has a countable model. This result was extended
to arbitrary countable theories by Skolem in 1920. The general forms of
the Löwenheim-Skolem Theorem are due to Tarski, as are most of the
basic model-theoretic notions introduced in §§1 and 2. (See Tarski and
Vaught [1957].)
The ultraproduct construction is foreshadowed by some early work of
Gödel and Skolem in the 1930’s. The construction was, in essence, used
by Hewitt [1948] in connection with real closed fields. The general reduced
product construction was introduced by Łoś [1955], and Thm. 3.7 was
proved there. The proof we give was inspired by the theory of
Boolean-valued models; cf. Mansfield [1971]. The proof of the Compactness
Theorem by ultraproducts is due to Frayne, Morel and Scott [1962].
Theorem 4.4 is due to Łoś [1954] and Vaught [1954]. The completeness
of the theory of unbounded dense orderings was first established by
Langford in 1927, that of the theory of algebraically closed fields of
characteristic 0 by Tarski, and that of algebraically closed fields of arbitrary
characteristic by Robinson [1951]. Theorem 4.5 is a famous result of
Cantor.
The idea of the Lindenbaum algebra of a language was formulated
independently by Tarski and the Polish mathematician A. Lindenbaum
in 1935, although Lindenbaum’s results were never published. Theorem
5.10 and its application to the proof of Gödel’s Completeness Theorem
are due to Rasiowa and Sikorski [1951]. The results in Probs. 5.14-18
are due to Mansfield [1971]. Theorem 6.8 is due to Ehrenfeucht (cf.
Mostowski [1958]), and Thm. 6.9 to Ryll-Nardzewski [1959]. The results
in Probs. 6.14-6.17 are due to Vaught [1961].
The notion of a set of indiscernibles and Thms. 7.5 and 7.8 are due to
Ehrenfeucht and Mostowski [1956]. Ramsey’s Theorem is due to
Ramsey [1930].
The reader wishing to acquaint himself further with the fascinating
and extensive subject of model theory may consult Bell and Slomson [1969]
and Chang and Keisler [1973]. For applications of model theory to
algebra see Robinson [1963]. For an illuminating discussion of the early
history of model theory, see Vaught [1974].
CHAPTER 6

RECURSION THEORY

The primary task of recursion theory is to characterize and study the
class of all algorithmic functions of natural numbers. Roughly speaking,
a function $f$ is algorithmic if there is an algorithm (i.e., a deterministic
mechanical procedure) for calculating the value $f(a)$ for any $a$ belonging
to the domain of $f$.
An important application of this theory is in the study of decision
problems. A decision problem has the following general form: a set A
and a property P are specified and the problem is then to find — or to
prove the impossibility of finding — an algorithm by means of which
one could tell, for any $a \in A$, whether or not $a$ has the property $P$. In this
chapter we shall use recursion theory to solve (negatively) an important
decision problem in number theory, Hilbert’s Tenth Problem. In Ch. 7
we shall make several crucial applications of recursion theory to logic.
This chapter can be read independently of all earlier chapters.

§ 1. Basic notation and terminology

In this chapter, unless the contrary is stated, the word number shall mean
natural number (i.e., non-negative integer). The set of all numbers is
denoted by N.
By $n$-ary function we mean any mapping

$$f : A \to N,$$

where $A \subseteq N^n$, i.e., $A$ is a set of ordered $n$-tuples of numbers. We call
$A$ the domain of $f$ and put $A = \mathrm{dom}(f)$. By function we mean $n$-ary function
for some number $n$.
Recall that by convention (see Ch. 0) $N^0$ is a set having just one member.
Thus if $f$ is a 0-ary function, then either $\mathrm{dom}(f)$ has just one member and
$f$ has just one number as value, or $\mathrm{dom}(f) = \emptyset$ and $f$ is nowhere defined.
In the former case we identify $f$ with its unique value. The nowhere defined
0-ary function will be denoted by $\infty$. Thus the set of all 0-ary functions
is $N \cup \{\infty\}$.
Unless otherwise stated, lower-case italic letters from the end of the
alphabet (especially “$x$”, “$y$” and “$z$”) will be used as variables ranging
over the set $N$ and lower-case italic letters from the beginning of the
alphabet will be used as constants denoting members of that set. (We
shall refer to these as numerical variables and constants.)
We use lower-case German letters as abbreviations for $n$-tuples. Thus
$\mathfrak{x} = (x_1, x_2, \dots, x_n)$, $\mathfrak{a} = (a_1, a_2, \dots, a_n)$, etc. Note that by this convention
the number of components of an entity denoted by a lower-case German
letter is always taken to be $n$, rather than $k$ or $m$ etc.
The fact that different $n$-ary functions do not in general have the same
domain could lead to some rather awkward formulations. These will be
avoided by the following device: we convert each $n$-ary function into
an $n$-ary operation on $N \cup \{\infty\}$. Allowing $\mathfrak{w}$ to range over $(N \cup \{\infty\})^n$ we put

(1) $f(\mathfrak{w}) = \infty$ whenever $\mathfrak{w} \notin \mathrm{dom}(f)$.

Thus for any $n$-ary function $f$ we have

(2) $\mathrm{dom}(f) = \{\mathfrak{w} : f(\mathfrak{w}) \ne \infty\}$,

and since $\mathrm{dom}(f) \subseteq N^n$, (2) implies that

(3) $f(\mathfrak{w}) = \infty$ whenever $\mathfrak{w} \notin N^n$.

Conversely, any $n$-ary operation $f$ on $N \cup \{\infty\}$ that fulfils (3) is now regarded
as an $n$-ary function, and (2) is regarded as the definition of its domain.¹
Note that to define an $n$-ary function one needs to specify the values
$f(\mathfrak{x})$ for $\mathfrak{x} \in N^n$ only, since the rest is automatically taken care of by (3).
If $f$ is an $n$-ary function for which we have not only (3) but also its converse

(4) $f(\mathfrak{w}) \ne \infty$ whenever $\mathfrak{w} \in N^n$,

i.e., if $\mathrm{dom}(f) = N^n$, then $f$ is said to be total².

1 Our device amounts simply to regarding “$= \infty$” as an abbreviation of “is undefined”
and “$\ne \infty$” as an abbreviation of “is defined (and equal to some number)”.
2 In many books dealing with recursion theory the term function is reserved for total
functions only, while what we have called “functions” are termed partial functions.
But since the class of mappings satisfying (3), rather than both (3) and (4), is the
more natural one in recursion theory, we prefer to apply the shorter term to it.

Total unary functions can clearly be identified in a natural way with
infinite sequences of numbers. Therefore we call such functions sequences.
We use lower-case Greek letters (except $\xi$, $\eta$, $\zeta$, $\lambda$ and $\mu$) as constants denoting
particular sequences.
The letter $\xi$ (and occasionally also $\eta$ and $\zeta$) will be used as a variable
ranging over the set $N^N$ of all sequences.

By an $n$-ary functional we mean a mapping

$$F : A \to N,$$

where $A = \mathrm{dom}(F) \subseteq N^N \times N^n$. Thus $F$ has one sequence argument and
$n$ numerical arguments. We shall normally separate the two kinds of
argument by a semi-colon (e.g., “$F(\xi; \mathfrak{x})$”). Using the same device as
for $n$-ary functions, we can regard an $n$-ary functional as a mapping

$$F : N^N \times (N \cup \{\infty\})^n \to N \cup \{\infty\}$$

satisfying the condition

(5) $F(\xi; \mathfrak{w}) = \infty$ whenever $\mathfrak{w} \notin N^n$.

$F$ is total if the converse condition

(6) $F(\xi; \mathfrak{w}) \ne \infty$ whenever $\mathfrak{w} \in N^n$

is satisfied as well. (Here again $\mathfrak{w}$ ranges over $(N \cup \{\infty\})^n$.)
Unless otherwise stated, functional will mean $n$-ary functional for some $n$.
When we assert an identity, say “$r = s$”, where the values of $r$ and $s$
are in $N \cup \{\infty\}$ and depend on values taken by variables, say $\xi$ and $\mathfrak{x}$,
then (unless otherwise stated) we mean that $r$ and $s$ have the same values
for all $\xi \in N^N$ and all¹ $\mathfrak{x} \in N^n$. (N.B. Not necessarily for all $\mathfrak{x} \in (N \cup \{\infty\})^n$.)
With each $n$-ary function $f$ we can associate in a natural way an $n$-ary
functional, simply by adding a fictitious sequence variable. That is, we
define a functional $F$ by the identity

$$F(\xi; \mathfrak{x}) = f(\mathfrak{x}).$$

We shall often identify a function with its associated functional, and so
regard functions as a special kind of functional.

1 Here too we depart from the convention of books dealing with recursion theory, which
introduce a special symbol (e.g., $\simeq$) to denote identity in this strong sense. But it
seems to us that our use of $=$ is more in line with common usage in most branches
of mathematics. Also, since the strong sense of identity will be required more often
than any other, it seems sensible to denote it by the simpler symbol.
In this chapter we shall sometimes use Church's lambda notation as
follows. Suppose that $r$ is an expression such that the identity

$$f(\mathfrak{x}) = r$$

defines an $n$-ary function $f$. Then this function $f$ will also be denoted by

$$\lambda x_1 \dots x_n\, r$$

or briefly, $\lambda\mathfrak{x}\, r$. (Note that we have, for any $n$-ary function $f$, $\lambda\mathfrak{x}\, f(\mathfrak{x}) = f$.)
Thus, e.g., $\lambda x(x+1)$ is the sequence $\varphi$ satisfying the identity $\varphi(x) = x+1$,
and $\lambda xy(xy)$ is the binary function $f$ satisfying the identity $f(x,y) = xy$.
More generally, suppose $r$ is an expression such that the identity

$$F(\xi; \mathfrak{x}, y_1, \dots, y_m) = r$$

defines an $(n+m)$-ary functional $F$; then the meaning of $\lambda\mathfrak{x}\, r$ is determined
by the identity

$$(\lambda\mathfrak{x}\, r)(\mathfrak{x}) = F(\xi; \mathfrak{x}, y_1, \dots, y_m).$$

In other words, $\lambda\mathfrak{x}\, r$ is $F(\xi; \mathfrak{x}, y_1, \dots, y_m)$ “as a function of $\mathfrak{x}$”, with $\xi, y_1, \dots, y_m$
as parameters. For example, $\lambda x\, \xi(x)$ is simply $\xi$, and if $F$ is a binary
functional then $\lambda x F(\xi; x, y)$ is the unary function $f_{\xi,y}$ (depending on $\xi$
and $y$ as parameters) such that $f_{\xi,y}(x) = F(\xi; x, y)$.

To conclude this section we introduce the least number symbol “$\mu$”.
Suppose that $r$ is an expression such that the identity

$$G(\xi; \mathfrak{x}, y) = r$$

defines an $(n+1)$-ary functional $G$. Then the expression $\mu y\, r$ has the
following meaning. The identity

$$F(\xi; \mathfrak{x}) = \mu y\, r$$

defines an $n$-ary functional $F$ which behaves as follows. Let $\varphi$ be any
sequence, and let $\mathfrak{a}$ be any $n$-tuple of numbers. Consider the values
$G(\varphi; \mathfrak{a}, y)$ for $y = 0, 1, 2, \dots$. There can be at most one number $b$ such
that for all $y < b$ the values $G(\varphi; \mathfrak{a}, y)$ are defined and positive (i.e., different
from $\infty$ and 0) and $G(\varphi; \mathfrak{a}, b) = 0$. If such $b$ exists, we put $F(\varphi; \mathfrak{a}) = b$. But
if no such $b$ exists we leave $F(\varphi; \mathfrak{a})$ undefined, i.e., $F(\varphi; \mathfrak{a}) = \infty$.
The same notation is also used in connection with functions rather
than functionals.

For example, if we define $f$ by

$$f(x, y) = \mu z\,(|y - xz|),$$

then for any numbers $a$ and $b$, $f(a,b)$ is the least number $c$ such that $b = ac$
if such $c$ exists; otherwise $f(a,b) = \infty$.
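
The behaviour of the least number symbol can be sketched in Python. This is only an illustration under stated assumptions: `None` stands for $\infty$ (undefined), the names `mu` and `least_quotient` are hypothetical, and the search is cut off after `max_steps` candidates, a practical bound that the definition itself does not have. The expression in the example is read as $|y - xz|$ so that its values stay in $N$.

```python
from typing import Callable, Optional


def mu(g: Callable[[int], Optional[int]], max_steps: int = 10_000) -> Optional[int]:
    """Return the least b with g(b) == 0 such that g(y) is defined and
    positive for every y < b. None models the value "undefined".

    max_steps is an artificial cut-off (an assumption of this sketch):
    a genuinely unbounded search could run for ever.
    """
    for y in range(max_steps):
        value = g(y)
        if value is None:
            # g is undefined before any zero was reached: no such b exists.
            return None
        if value == 0:
            return y
    return None  # no zero found within the cut-off


def least_quotient(a: int, b: int) -> Optional[int]:
    """The example function: least c with b == a * c, if there is one."""
    return mu(lambda z: abs(b - a * z))
```

With these definitions `least_quotient(3, 12)` is 4, while `least_quotient(3, 7)` gives `None`, matching the case in which no suitable number exists.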

§ 2. Algorithmic functions and functionals

This section is devoted to an informal discussion of the intuitive notions
that underlie and motivate recursion theory. The most basic among these
notions is that of algorithm.
By algorithm mathematicians generally mean a computation procedure
whose application leaves nothing to chance and ingenuity, but requires
a rigid stepwise mechanical execution of explicitly stated rules. Here
are a few examples of algorithms:
(a) The procedure of “long multiplication” for multiplying two numbers
which are represented in decimal notation.
(b) The Euclidean algorithm for finding the highest common factor
of two positive numbers.
(c) The well-known procedure for differentiating any combination of
rational, trigonometric, exponential and logarithmic functions.
The following examples, taken from earlier chapters of this book, are
(or can easily be converted into) algorithms:
(d) The method of truth tables (see §6 of Ch. 1) for finding out whether
a given formula is a tautology.
(e) The method described in §8 of Ch. 2 for systematically searching
for a first-order confutation of a given finite set of formulas.
(f) The method suggested in §5 of Ch. 3 for systematically searching
for a first-order proof of a given formula.

An algorithm is presented as a prescription, consisting of a finite number
of instructions. It can be applied to any one of a set of possible inputs,
each input being a finite sequence of symbolic expressions. Once any
particular input has been specified, the instructions dictate a succession
of discrete simple operations, requiring no recourse to ingenuity or chance.
The first operation is applied to the input and transforms it into a new
finite sequence of symbolic expressions. This outcome is in turn subjected
to a second operation (dictated by the instructions of the algorithm)
and so on.
It may happen that after a finite number of steps the instructions dictate
that the process must be discontinued and an output be read off (in some
prescribed way) from the outcome of the last step. For some inputs,
however, the process may never terminate and there is no last step and
hence no output1.
With any algorithm $\mathfrak{P}$ we may associate a mapping in the following
natural way. Let $A$ be the class of all possible inputs of $\mathfrak{P}$. If $a \in A$ and
the application of $\mathfrak{P}$ to $a$ eventually (i.e., after a finite number of steps)
yields an output, we let $\mathfrak{P}(a)$ be that output; but if $\mathfrak{P}$ applied to $a$ yields
no output we leave $\mathfrak{P}(a)$ undefined (or, using the convention introduced
in §1, we may put $\mathfrak{P}(a) = \infty$ in this case).
A particularly important class of algorithms (to which many other
cases can easily be reduced) is that in which, for some $n$, the set of possible
inputs of the algorithm $\mathfrak{P}$ includes the set $N^n$ of all $n$-tuples of numbers
(represented in some particular system of notation) and the corresponding
outputs of $\mathfrak{P}$, when they exist, are numbers (also represented symbolically).
In this case, the mapping associated with $\mathfrak{P}$, when restricted to $N^n$, yields
an $n$-ary function, say $f$. We say that this $f$ is algorithmic and $\mathfrak{P}$ is an
algorithm for (calculating) $f$.

The foregoing ideas can be extended to deal with functionals as well.
The snag here, however, is that if $F$ is an $n$-ary functional, then it would
seem that an algorithm $\mathfrak{P}$ for calculating $F$ should have inputs representing
pairs of the form $(\varphi; \mathfrak{a})$ where $\varphi$ is a sequence and $\mathfrak{a}$ an $n$-tuple of numbers.
But this cannot be so, since in general we would need an infinite amount
of information in order to determine $\varphi$ completely, and this cannot be
encapsulated in a finite symbolic representation.
We shall therefore suppose that in this case the initial input for $\mathfrak{P}$ will
represent only the numerical component $\mathfrak{a}$, while representations of values
$\varphi(r)$ of $\varphi$ will be fed in subsequently, “on demand”, during the course of
the calculation. Thus, at certain steps of the calculation process the
instructions of $\mathfrak{P}$ may dictate that a representation of a value $\varphi(r)$ be added
(in some prescribed way) to the outcome of the preceding step. Here
$r$ itself is to be determined in some prescribed way from the outcome of
the preceding step.
It should be stressed that the sequence argument $\varphi$ is not itself assumed
to be algorithmic. The values $\varphi(r)$ are not necessarily outputs of some
prior calculation process. We simply require that they be made available
if and when they are called for.

1 This can actually happen in examples (e) and (f) above. Thus, if we start with a formula
that is not logically true, the systematic search for a proof will never terminate.

Now, if $\mathfrak{P}$ is an algorithm that can be applied to all pairs $(\xi; \mathfrak{x})$, where
the $n$-tuple $\mathfrak{x}$ of numbers (in some prescribed system of notation) serves
as initial input, and values of $\xi$ are fed in “on demand” as outlined above,
and if the corresponding outputs of $\mathfrak{P}$, whenever they exist, are numbers
(represented in some specified way) then the mapping associated with
$\mathfrak{P}$ yields an $n$-ary functional $F$. We say that $F$ is algorithmic and that
$\mathfrak{P}$ is an algorithm for (calculating) $F$.

§ 3. The computer URIM

It may have occurred to the reader that the informal discussion of §2
has quite a lot to do with computers¹. In fact, any program that can be
run on a given computer may be regarded as an algorithm.
In this section we shall make use of this idea to define a wide class of
algorithmic functions and functionals. We describe an imaginary
“computer” called Unlimited Register Ideal Machine (briefly, URIM) and the
programs under which it may be made to “operate”. These programs
can then be regarded as algorithms², with which we may associate
algorithmic functions and functionals as outlined in §2.
We assume that our “computer” URIM has an infinite sequence of
registers $R_i$ ($i = 1, 2, \dots$). We call the positive number $i$ the address of the
$i$th register $R_i$.
The registers are designed to be storage places for the inputs, output
and intermediate stages of a computation. We assume that at each moment
of time every register stores some natural number. (We may imagine the
register to contain a symbolic representation of the number, but for our
purposes it will not matter what particular method of representing numbers
is used.)
In addition, URIM has a program counter K, which also contains, at
each moment, some natural number. (This number will not be part of
a computation, but merely a “position marker” for book-keeping purposes.)

1 By computer we mean any calculating machine or automaton which operates in a
deterministic and discrete step-by-step way. We exclude probabilistic, continuous and
analog devices.
2 This use of a “machine” and its programs to define algorithms is just a heuristic
metaphor. As a matter of fact, the algorithms in question could be described directly
in purely abstract mathematical terms, without any reference to machines and programs.

We say that a register or the counter K is empty when the number stored
in it is 0. Thus to erase a register is to put 0 in it.
We shall assume that any number, however large, can be stored in each
register and in the program counter. From this it would seem that we
require URIM to be able to store an infinite amount of information.
However, in fact we shall always assume that at any moment almost all
(i.e., all but a finite number) of the registers are empty. Moreover, each
program will only make use of a finite number of registers, whose addresses
can be read off directly from the program itself. Thus we really need only
an unlimited — but in each case finite — number of registers; hence the
name of our machine.
Even so, the amount of information that can be stored in URIM, albeit
finite, is unbounded. It is precisely this that makes URIM a purely ideal
machine, which cannot be realized physically1. Every real computer has
a finite storage capacity and can therefore only store a bounded amount
of information. True, the storage capacity of some real computers can be
expanded in case of need by bringing in new auxiliary memory units
(e.g., on magnetic tape); but even this has its practical limitations and
cannot go on indefinitely.
However, in our theoretical treatment of algorithmic functions and
functionals we ignore such limitations of a purely practical, contingent
nature. Thus we put no bound on the number of registers available in
URIM or on the size of the numbers stored in the program counter and
registers.

In order to deal with functionals (rather than just with functions) we
introduce hypothetical entities called oracles. If $\varphi$ is any sequence (i.e.,
a total unary function) then a $\varphi$-oracle is an agency able to supply the value
$\varphi(r)$ for each $r$. It must be stressed that an oracle is not assumed to be
a computer; in fact, it cannot be a computer unless $\varphi$ happens to be
algorithmic. We do not consider oracles to be part of URIM but external
to it.
Our only assumption concerning oracles is that, for any sequence $\varphi$,
URIM can be linked up to a $\varphi$-oracle and that values $\varphi(r)$ can be
transmitted from the oracle and fed into a register of URIM. The circumstances
under which any particular value $\varphi(r)$ is called for by URIM, and the
determination of the register into which it must be fed, will be explained
below.

1 In all other respects the assumptions we shall make about URIM will be quite realistic.

We assume that URIM can obey four kinds of commands as follows.
Z commands. For each positive $i$, the command $Z_i$ is obeyed by erasing
the $i$th register (i.e., causing $R_i$ to become empty) and adding 1 to the
program counter K (i.e., causing the number $k$ stored in K to be replaced
by $k+1$). The other registers remain unchanged.
S commands. For each positive $i$, the command $S_i$ is obeyed by adding
1 to both $R_i$ and K. The other registers remain unchanged.
A commands. We suppose that URIM has been linked up to a $\varphi$-oracle.
Let $i$ be a positive number and suppose that at a given moment the number
stored in $R_i$ is $r$. Then the command $A_i$ is obeyed by putting $\varphi(r)$ in place
of $r$ in $R_i$ and adding 1 to K. The other registers remain unchanged.
(The number $\varphi(r)$ is of course supposed to be obtained by URIM from
the $\varphi$-oracle.)
J commands. Let $i$ and $j$ be positive and let $k$ be any number. Suppose
that at a given moment the numbers stored in $R_i$ and $R_j$ are $r_i$ and $r_j$.
Then if $r_i = r_j$ the command $J_{i,j,k}$ is obeyed by putting $k$ in K instead of
the number that was stored there previously. If $r_i \ne r_j$, then $J_{i,j,k}$ is obeyed
by adding 1 to K. In either case all registers remain unchanged.
All together we have commands $Z_i$, $S_i$, $A_i$ and $J_{i,j,k}$ for all positive
$i$ and $j$ and all $k$.
By a program we mean any finite sequence of commands

$$\mathfrak{P} = \langle C_0, C_1, \dots, C_{h-1} \rangle.$$

Here $h$ is the length of $\mathfrak{P}$, i.e., the number of its commands. For any
$k < h$, we call $C_k$ the $k$th command of $\mathfrak{P}$. (Thus, $\mathfrak{P}$ begins with its 0th
command, not with its 1st.)
The addresses occurring in a program $\mathfrak{P}$ are precisely those positive
numbers $i$ and $j$ for which $Z_i$, $S_i$, $A_i$ or (for some $k$) $J_{i,j,k}$ are in $\mathfrak{P}$.

We go on to describe how URIM operates under a given program $\mathfrak{P}$.
We suppose that URIM has been linked up to some oracle. When “switched
on”, URIM will go through a (finite or infinite) succession of steps. Each
step consists of obeying some command of $\mathfrak{P}$. Suppose that at a given
moment the number stored in K is $k$. If $k <$ length of $\mathfrak{P}$, the next step
will be to obey the $k$th command of $\mathfrak{P}$. However, if $k \ge$ length of $\mathfrak{P}$ (so
that $\mathfrak{P}$ has no $k$th command) then URIM halts: it “switches itself off”
and does not perform any more steps.
We shall always assume that initially (i.e., before URIM begins to
operate) the program counter K is empty. Thus, the commands of $\mathfrak{P}$
are obeyed one by one in order; except that when a command $J_{i,j,k}$ is
obeyed and the numbers stored in $R_i$ and $R_j$ happen to be equal, then the
next command to be obeyed will be the $k$th command of $\mathfrak{P}$ (or, if $k \ge$ length
of $\mathfrak{P}$, URIM will halt). In other words, if the numbers in $R_i$ and $R_j$ are
equal, a command $J_{i,j,k}$ makes URIM jump to the $k$th command rather
than proceed in order.
We shall also assume that initially almost all the registers are empty.
It is easy to see that if URIM is operating under the program $\mathfrak{P}$, the only
registers that may affect the process or be affected by it are those whose
addresses occur in $\mathfrak{P}$. Thus almost all the registers remain empty throughout
and play no role at all.
3.1. Example. Let $i$ and $j$ be positive. We construct a program $\mathfrak{R}_{i,j}$ whose
effect is to copy the number initially stored in $R_i$ into $R_j$. All registers
other than $R_j$ are to be left unchanged. (In particular $R_i$ will retain its
initial contents.)

[Fig. 1: flow chart]

If $i = j$, then the empty program will do the job. Now let $i \ne j$. We begin
by setting up a flow chart which shows how we want the required program
to work (see Fig. 1).
A diamond-shaped box represents a question (in the present case: “are
the numbers stored in $R_i$ and $R_j$ equal?”). Two arrows lead away from the
diamond, corresponding to the answers “Yes” and “No”; they are
labelled accordingly “Y” and “N”. The rest is self-explanatory.
By following the chart the reader can see how our program is supposed
to work. First, $R_j$ is erased. Then URIM enters a loop and goes round
and round as long as the numbers in $R_i$ and $R_j$ remain different. Each
time round, it adds 1 to $R_j$. When the number in $R_j$ becomes equal to
that in $R_i$, URIM gets out of the loop and halts.
We now convert our chart into an actual program. We start at the
entrance of the chart (marked “START”) and work our way along, following
the arrows. Our 0th command is $Z_j$. Next, our 1st command, which
corresponds to the diamond, must of course be of the form $J_{i,j,\_}$. We leave
the third index blank, to be filled in later on; this third index will determine
the jump that URIM will have to make when the numbers in $R_i$ and $R_j$
are found to be equal.
In the meantime we follow the arrow labelled “N”, which corresponds
to the case where the numbers in $R_i$ and $R_j$ are different, and does not
involve a jump. Our next command will thus be $S_j$. After this, we have
to make URIM jump back to the diamond command $J_{i,j,\_}$ which was
our 1st command. We therefore take our next command to be $J_{1,1,1}$: the
number in the register $R_1$ will of course be found to be equal to itself,
and URIM will therefore jump back to the 1st command, as required.
(Instead of $J_{1,1,1}$ we can use $J_{p,p,1}$ with any positive $p$.)
Now that we have come back to the diamond, we have all the commands
we need. Since the length of our program is 4 (we have got four commands)
we can fill the blank in $J_{i,j,\_}$ by 4 (or by any number $> 4$). Thus we have
constructed the program

$$\mathfrak{R}_{i,j} = \langle Z_j, J_{i,j,4}, S_j, J_{1,1,1} \rangle.$$

The reader should check that this program in fact does what is required.
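
The operating rules of this section, together with the copying program of Ex. 3.1, can be checked mechanically with a small simulator. The following Python sketch is an illustration only, not part of the formal development: the tuple encoding of commands, the names `run` and `copy_program`, and the `max_steps` cut-off (which replaces a computation that may never halt) are all assumptions introduced here.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A command is ("Z", i), ("S", i), ("A", i) or ("J", i, j, k).
Command = Tuple


def run(program: List[Command],
        inputs: List[int],
        oracle: Optional[Callable[[int], int]] = None,
        max_steps: int = 100_000) -> Optional[Dict[int, int]]:
    """Operate URIM under `program` with `inputs` stored in R_1, R_2, ...

    Returns the registers at the moment of halting, or None if no halt
    occurs within max_steps (an artificial bound, not a feature of URIM).
    Registers missing from the dictionary are empty (contain 0).
    """
    regs: Dict[int, int] = {i + 1: a for i, a in enumerate(inputs)}
    k = 0  # program counter K, initially empty
    for _ in range(max_steps):
        if k >= len(program):  # there is no k-th command: URIM halts
            return regs
        cmd = program[k]
        if cmd[0] == "Z":      # erase R_i
            regs[cmd[1]] = 0
            k += 1
        elif cmd[0] == "S":    # add 1 to R_i
            regs[cmd[1]] = regs.get(cmd[1], 0) + 1
            k += 1
        elif cmd[0] == "A":    # replace r in R_i by the oracle value for r
            if oracle is None:
                raise ValueError("A command used without an oracle")
            regs[cmd[1]] = oracle(regs.get(cmd[1], 0))
            k += 1
        elif cmd[0] == "J":    # jump to k if R_i and R_j are equal
            _, i, j, target = cmd
            k = target if regs.get(i, 0) == regs.get(j, 0) else k + 1
        else:
            raise ValueError(f"unknown command {cmd!r}")
    return None


def copy_program(i: int, j: int) -> List[Command]:
    """The four-command program of Ex. 3.1 (empty when i == j)."""
    if i == j:
        return []
    return [("Z", j), ("J", i, j, 4), ("S", j), ("J", 1, 1, 1)]
```

For instance, `run(copy_program(1, 3), [7, 5])` halts with 7 in both $R_1$ and $R_3$ and with $R_2$ still containing 5.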

Let $\mathfrak{P}$ be a program of length $h$. By the normalization of $\mathfrak{P}$ we mean
the program $\mathfrak{P}'$ obtained from $\mathfrak{P}$ when every command of the form
$J_{i,j,k}$ with $k > h$ is replaced by $J_{i,j,h}$. Clearly, $\mathfrak{P}'$ and $\mathfrak{P}$ work in exactly
the same way, except that with $\mathfrak{P}'$ URIM can halt only when the number
in K is $h$, while with $\mathfrak{P}$ that number may be $> h$. If $\mathfrak{P}$ is the same as its
normalization, we say that $\mathfrak{P}$ is normal.
Let $\mathfrak{P}$ be a program of length $h$ and let $\mathfrak{Q}$ be any program. Then their
concatenation $\mathfrak{P}\mathfrak{Q}$ is defined to be the program obtained as follows. First,
we write the commands of the normalization of $\mathfrak{P}$; then we follow them
by the commands of $\mathfrak{Q}$ except that we replace every command $J_{i,j,k}$ of
$\mathfrak{Q}$ by $J_{i,j,h+k}$. Clearly, under $\mathfrak{P}\mathfrak{Q}$ URIM begins to operate as it would
under $\mathfrak{P}$; but when it would come to a halt under $\mathfrak{P}$, it will now go on
to operate as under $\mathfrak{Q}$.
Clearly, concatenation is associative: $(\mathfrak{P}\mathfrak{Q})\mathfrak{R} = \mathfrak{P}(\mathfrak{Q}\mathfrak{R})$. So brackets
can be omitted.
When drawing up a flow chart, we shall sometimes represent a
concatenation, say $\mathfrak{P}\mathfrak{Q}$, by putting $\mathfrak{P}$ and $\mathfrak{Q}$ in two rectangular boxes, with an arrow
from one box to the other to show the order of concatenation.

We conclude this section with yet another definition. For any program
$\mathfrak{P}$ and number $m$, we define $\mathfrak{P}^{(m)}$ to be the program obtained from $\mathfrak{P}$ by
adding $m$ to every address occurring in it. (Thus, e.g., a command $J_{i,j,k}$
is to be replaced by $J_{i+m,j+m,k}$.) It is clear that $\mathfrak{P}^{(m)}$ ignores all registers
whose addresses are $\le m$, while it treats $R_{i+m}$ (for all positive $i$) exactly
as $\mathfrak{P}$ treats $R_i$.
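
Normalization, concatenation and the shifting of addresses are purely syntactic operations on programs, so they can be sketched as list transformations. In the Python sketch below the tuple encoding of commands (`("Z", i)`, `("S", i)`, `("A", i)`, `("J", i, j, k)`) and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

Command = Tuple  # ("Z", i), ("S", i), ("A", i) or ("J", i, j, k)


def normalize(p: List[Command]) -> List[Command]:
    """Replace every jump target k > h by h, where h is the length of p."""
    h = len(p)
    return [("J", c[1], c[2], h) if c[0] == "J" and c[3] > h else c
            for c in p]


def concatenate(p: List[Command], q: List[Command]) -> List[Command]:
    """Normalization of p, followed by q with every jump target raised by h."""
    h = len(p)
    moved_q = [("J", c[1], c[2], h + c[3]) if c[0] == "J" else c for c in q]
    return normalize(p) + moved_q


def shift(p: List[Command], m: int) -> List[Command]:
    """Add m to every address in p; jump targets are left unchanged."""
    shifted = []
    for c in p:
        if c[0] == "J":
            shifted.append(("J", c[1] + m, c[2] + m, c[3]))
        else:
            shifted.append((c[0], c[1] + m))
    return shifted
```

Normalizing the first program before appending the second is what guarantees that a halt of the first program always lands exactly on the first command of the second.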

§ 4. Computable functionals and functions

Let $\mathfrak{P}$ be a program and let $n$ be any number. We define an $n$-ary functional
$\mathfrak{P}_n$ as follows.
Let $\varphi$ be any given sequence, and let $\mathfrak{a}$ be any $n$-tuple of numbers.
Suppose that URIM is linked up to a $\varphi$-oracle. Let $a_1, \dots, a_n$ be stored in
$R_1, \dots, R_n$ respectively, while all the other registers (as well as K) are empty.
Starting from this initial position, let URIM operate under $\mathfrak{P}$.
If after a finite number of steps URIM comes to a halt, and at that
moment the number stored in the first register $R_1$ is $b$, we define $\mathfrak{P}_n(\varphi; \mathfrak{a})$
to be $b$.
If, on the other hand, URIM never comes to a halt, then $\mathfrak{P}_n(\varphi; \mathfrak{a})$ is $\infty$,
i.e., it is left undefined.
We say that $\mathfrak{P}$ computes the functional $\mathfrak{P}_n$. A functional is computable
if it is computed by some program.
If $\mathfrak{P}$ is a program having no A commands, then evidently the value of
$\mathfrak{P}_n(\xi; \mathfrak{x})$ is independent of the value assigned to the sequence variable $\xi$.
In this case we obtain an $n$-ary function (which we also denote by “$\mathfrak{P}_n$”)
defined by the identity

$$\mathfrak{P}_n(\mathfrak{x}) = \mathfrak{P}_n(\xi; \mathfrak{x}).$$

We say that $\mathfrak{P}$ computes the function $\mathfrak{P}_n$. A function is computable if
it is computed by some program having no A commands.

The reader should note that the term “computable” is used here in a
precise technical sense. In the literature the same term is often used in a
wider and looser sense, as synonymous with “algorithmic”; but we shall
avoid this usage here.
It is clear that each URIM program can in fact be regarded as an
algorithm, and every computable function(al) is algorithmic in the sense of §2.
We shall argue later on (see §7) that it is reasonable to assume that,
conversely, every algorithmic function(al) is computable; but this is a matter
of reasonable belief, not a theorem.

4.1. Example. We show that addition (i.e., the function $f$ defined by
$f(x_1, x_2) = x_1 + x_2$) is computable. We first set up a flow chart (see Fig. 2).
We assume that at the start numbers $a_1$ and $a_2$ are stored in $R_1$ and $R_2$
respectively, and the other registers (in particular $R_3$) are empty. $R_3$ is used
as a loop counter: it shows how many times URIM has been round the
loop. Each time round, 1 is added to both $R_1$ and $R_3$. This goes on so
long as the number in $R_3$ is less than $a_2$. When the contents of
$R_3$ reaches $a_2$, this means that 1 has already been added $a_2$ times to $R_1$.
At that moment the number in $R_1$ must be $a_1 + a_2$ and it is time to halt.
Using the same method as in Ex. 3.1, we can convert our chart into the
program $\langle J_{2,3,4}, S_1, S_3, J_{1,1,0} \rangle$.
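
The addition program can be tried out with a minimal simulator for programs without A commands. This Python sketch is illustrative only: the tuple encoding of commands, the name `compute`, and the `max_steps` cut-off standing in for a non-halting computation are assumptions of the sketch.

```python
from typing import List, Optional, Tuple

Command = Tuple  # ("Z", i), ("S", i) or ("J", i, j, k)


def compute(program: List[Command], inputs: List[int],
            max_steps: int = 100_000) -> Optional[int]:
    """Value computed by `program` on `inputs`: the contents of R_1 at the
    moment of halting, or None if no halt is seen within max_steps."""
    regs = {i + 1: a for i, a in enumerate(inputs)}
    k = 0  # program counter K
    for _ in range(max_steps):
        if k >= len(program):          # no k-th command: halt
            return regs.get(1, 0)
        op, *args = program[k]
        if op == "Z":
            regs[args[0]] = 0
        elif op == "S":
            regs[args[0]] = regs.get(args[0], 0) + 1
        elif op == "J":
            i, j, target = args
            if regs.get(i, 0) == regs.get(j, 0):
                k = target             # jump instead of proceeding in order
                continue
        else:
            raise ValueError(f"unknown command {op!r}")
        k += 1
    return None


# The program of Ex. 4.1: R_3 counts the rounds until it equals R_2.
ADD = [("J", 2, 3, 4), ("S", 1), ("S", 3), ("J", 1, 1, 0)]
```

Calling `compute(ADD, [3, 4])` goes round the loop four times and halts with 7 in $R_1$.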

Let $\varphi$ be any given sequence. If $\mathfrak{P}$ is any program, we can get an $n$-ary
function $f$ from the functional $\mathfrak{P}_n$ by giving its sequence variable the fixed
value $\varphi$. Thus

$$f(\mathfrak{x}) = \mathfrak{P}_n(\varphi; \mathfrak{x}),$$

i.e., $f$ is $\lambda\mathfrak{x}\, \mathfrak{P}_n(\varphi; \mathfrak{x})$. We say that $\mathfrak{P}$ computes $f$ relative to $\varphi$ (or, briefly,
$\varphi$-computes $f$). A function is computable relative to $\varphi$ (briefly, $\varphi$-computable)
if it is $\varphi$-computed by some program.

From now on we shall often omit the subscript $n$ from expressions
like “$\mathfrak{P}_n(\varphi; \mathfrak{x})$”, “$\mathfrak{P}_n(\mathfrak{x})$” etc., since the $n$ can in any case be determined
from the number of numerical arguments shown.

4.2. Theorem. A computable function is computable relative to every
sequence. A function computable relative to the sequence $\lambda x(x+1)$ is
computable.
Proof. Let $f$ be an $n$-ary computable function. Then, by definition, the
identity $f(\mathfrak{x}) = \mathfrak{P}(\xi;\mathfrak{x})$ holds for some program $\mathfrak{P}$ having no A commands.
But then $\mathfrak{P}(\xi;\mathfrak{x})$ is independent of the value assigned to $\xi$, so for every
sequence $\varphi$ we have $f(\mathfrak{x}) = \mathfrak{P}(\varphi;\mathfrak{x})$, and hence $f$ is $\varphi$-computable.
Now suppose that $f$ is an $n$-ary function computable relative to the
particular sequence $\varphi$ for which $\varphi(x) = x+1$. Then for some program $\mathfrak{P}$
the identity $f(\mathfrak{x}) = \mathfrak{P}(\varphi;\mathfrak{x})$ must hold. Let $\mathfrak{Q}$ be the program obtained
from $\mathfrak{P}$ upon replacing every command $A_i$ by $S_i$. Then clearly $\mathfrak{Q}$ has no
A commands. Also, when URIM is linked up to a $\varphi$-oracle, $A_i$ has exactly
the same effect as $S_i$. It follows that $f(\mathfrak{x}) = \mathfrak{Q}(\mathfrak{x})$ and $f$ is a computable
function.

Every sequence $\varphi$ is computable relative to itself (it is $\varphi$-computed by
the program $\langle A_1 \rangle$). On the other hand, since there are only denumerably many
programs and non-denumerably many sequences, most sequences are not
computable. Thus $\varphi$-computability does not in general imply computability.

§ 5. Recursive functionals and functions

In this section we shall define the class of recursive functionals. Although
this definition will not refer to machines and programs, it will be seen later
on that a functional is recursive iff it is computable in the sense of §4.
We begin by introducing an infinite list of formal symbols:

C, P, M, Z, S, A, $\mathrm{I}_{n,i}$,

where $n$ and $i$ are any positive numbers such that $1 \le i \le n$.
Certain expressions made up of these symbols (using also parentheses
and commas) will be called descriptions. More precisely, for each number
$n$ we shall have $n$-ary descriptions.


5.1. Definition. Descriptions are expressions formed according to the
following seven rules:
(1) The symbol Z is a 0-ary description.
(2) The symbol S is a unary description.
(3) The symbol A is a unary description.
(4) For any $n$ and $i$ such that $1 \le i \le n$, the symbol $\mathrm{I}_{n,i}$ is an $n$-ary description.
(5) If $\mathbf{G}$ is a $k$-ary description, with $k \ge 0$, and if $\mathbf{H}_1,\dots,\mathbf{H}_k$ are $n$-ary
descriptions, with $n \ge 0$, then C($\mathbf{G}$, $\mathbf{H}_1,\dots,\mathbf{H}_k$) is an $n$-ary description. (Thus
if $k = 0$ then C($\mathbf{G}$) is an $n$-ary description for every $n$.)
(6) If $\mathbf{G}$ is an $n$-ary description, with $n \ge 0$, and $\mathbf{H}$ is an $(n+2)$-ary
description, then P($\mathbf{G}$, $\mathbf{H}$) is an $(n+1)$-ary description.
(7) If $\mathbf{G}$ is an $(n+1)$-ary description, with $n \ge 0$, then M($\mathbf{G}$) is an $n$-ary
description.
By the degree of a description $\mathbf{F}$ (briefly, deg $\mathbf{F}$) we mean the total number
of occurrences of the symbols C, P and M in $\mathbf{F}$.

To each $n$-ary description $\mathbf{F}$ we shall now assign an $n$-ary functional $F$;
and we shall say that $\mathbf{F}$ describes, or is a description of, $F$. (Here and in
the sequel we adopt the convention of denoting a description by a bold
Roman letter, and the functional described by it is then denoted by the
corresponding italic letter.) We proceed by recursion on deg $\mathbf{F}$, following
the seven cases of Def. 5.1.

5.2. Definition.
(1) The symbol Z describes the 0-ary functional $Z$ defined by the identity

$$Z(\xi) = 0.$$

(2) The symbol S describes the unary functional $S$ defined by the identity

$$S(\xi;x) = x+1.$$

(3) The symbol A describes the unary functional $A$ defined by the identity

$$A(\xi;x) = \xi(x).$$

(4) For any $n$ and $i$, $1 \le i \le n$, $\mathrm{I}_{n,i}$ describes the $n$-ary functional $I_{n,i}$ defined
by the identity

$$I_{n,i}(\xi;x_1,\dots,x_n) = x_i.$$


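(5) If $\mathbf{F}$ is C($\mathbf{G}$, $\mathbf{H}_1,\dots,\mathbf{H}_k$), where $\mathbf{G}$ describes the $k$-ary functional $G$
and $\mathbf{H}_1,\dots,\mathbf{H}_k$ describe the $n$-ary functionals $H_1,\dots,H_k$ respectively, then
$\mathbf{F}$ describes the $n$-ary functional $F$ defined by the identity

$$F(\xi;\mathfrak{x}) = G(\xi;H_1(\xi;\mathfrak{x}),\dots,H_k(\xi;\mathfrak{x})).$$

We say that $F$ is obtained from $G$ and $H_1,\dots,H_k$ by composition. Note
that $F(\varphi;\mathfrak{a}) \ne \infty$ iff both $H_i(\varphi;\mathfrak{a}) \ne \infty$ for $i = 1,\dots,k$ and

$$G(\varphi;H_1(\varphi;\mathfrak{a}),\dots,H_k(\varphi;\mathfrak{a})) \ne \infty.$$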

(6) If $\mathbf{F}$ is P($\mathbf{G}$, $\mathbf{H}$), where $\mathbf{G}$ describes the $n$-ary functional $G$ and $\mathbf{H}$
describes the $(n+2)$-ary functional $H$, then $\mathbf{F}$ describes the $(n+1)$-ary
functional $F$ defined by the two identities

$$F(\xi;\mathfrak{x},0) = G(\xi;\mathfrak{x}),$$

$$F(\xi;\mathfrak{x},y+1) = H(\xi;\mathfrak{x},y,F(\xi;\mathfrak{x},y)).$$

We say that $F$ is obtained from $G$ and $H$ by primitive recursion. (It is
easy to verify, by induction on the value assigned to $y$, that these two
identities together determine $F$ uniquely.) Note that if, for given $\varphi$, $\mathfrak{a}$ and $b$,
we have $F(\varphi;\mathfrak{a},b) = \infty$ then also $F(\varphi;\mathfrak{a},b') = \infty$ for every $b' > b$.
(7) If $\mathbf{F}$ is M($\mathbf{G}$), where $\mathbf{G}$ describes the $(n+1)$-ary functional $G$, then
$\mathbf{F}$ describes the $n$-ary functional $F$ defined by the identity

$$F(\xi;\mathfrak{x}) = \mu y\,G(\xi;\mathfrak{x},y).$$

We say that $F$ is obtained from $G$ by minimization. Note that even if,
for given $\varphi$ and $\mathfrak{a}$, $G(\varphi;\mathfrak{a},y) \ne \infty$ for every value of $y$, we may still have
$F(\varphi;\mathfrak{a}) = \infty$.
A functional is recursive if it has a description.
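The seven clauses of Def. 5.2 amount to a recursive evaluation procedure, which the following Python sketch makes explicit. It is an illustration only, under stated assumptions: descriptions are represented as nested tuples such as `("C", G, H1, ..., Hk)` (a hypothetical encoding); the value $\infty$ is represented by `None`; $\mu y\,G(\xi;\mathfrak{x},y)$ is read as the least $y$ with $G(\xi;\mathfrak{x},y) = 0$, all earlier values being defined; and the unbounded search of clause (7) is cut off at a hypothetical `bound`, so a result of `None` only suggests, and does not prove, that the value is $\infty$.

```python
def evaluate(desc, xi, args, bound=1000):
    """Value at (xi; args) of the functional described by `desc`.

    `xi` is the sequence argument (a Python function, or None if the
    description contains no A); None as a result stands for "undefined".
    """
    tag = desc[0]
    if tag == "Z":                       # clause (1): Z(xi) = 0
        return 0
    if tag == "S":                       # clause (2): S(xi; x) = x + 1
        return args[0] + 1
    if tag == "A":                       # clause (3): A(xi; x) = xi(x)
        return xi(args[0])
    if tag == "I":                       # clause (4): projection I_{n,i}
        _, _n, i = desc
        return args[i - 1]
    if tag == "C":                       # clause (5): composition
        g, hs = desc[1], desc[2:]
        inner = [evaluate(h, xi, args, bound) for h in hs]
        if any(v is None for v in inner):
            return None                  # some H_i is undefined
        return evaluate(g, xi, inner, bound)
    if tag == "P":                       # clause (6): primitive recursion
        g, h = desc[1], desc[2]
        *xs, y = args
        acc = evaluate(g, xi, xs, bound)
        for t in range(y):
            if acc is None:
                return None              # undefined values propagate upward
            acc = evaluate(h, xi, xs + [t, acc], bound)
        return acc
    if tag == "M":                       # clause (7): minimization
        g = desc[1]
        for y in range(bound):
            v = evaluate(g, xi, list(args) + [y], bound)
            if v is None:
                return None
            if v == 0:
                return y
        return None                      # search budget exhausted
    raise ValueError(f"not a description: {desc!r}")


# Addition by primitive recursion: x + 0 = x, x + (y+1) = S(x + y).
ADD = ("P", ("I", 1, 1), ("C", ("S",), ("I", 3, 3)))
# The number 2 as a 0-ary function.
TWO = ("C", ("S",), ("C", ("S",), ("Z",)))
```

Here `evaluate(ADD, None, [3, 4])` unwinds the recursion four times, and `evaluate(("M", ("S",)), None, [])` never finds a zero of $y+1$, so it returns `None`.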

Note that while each $n$-ary description describes a unique $n$-ary functional,
the converse is false. For example, for any description $\mathbf{F}$, it is
clear that $\mathbf{F}$ and C($\mathrm{I}_{1,1}$, $\mathbf{F}$) describe the same functional. It follows that
each recursive functional has infinitely many descriptions.
If $\mathbf{F}$ is an $n$-ary description in which the symbol A does not occur, then
it is easy to verify by induction on deg $\mathbf{F}$ that the corresponding functional
$F$ does not depend on its sequence argument. Thus in this case we can
define an $n$-ary function (which we identify with $F$ and denote by “$F$”
as well) by putting

$$F(\mathfrak{x}) = F(\xi;\mathfrak{x}).$$

We then say that $\mathbf{F}$ describes this function $F$, and that $F$ is a recursive
function. If $\mathbf{F}$ is a description in which the symbol M does not occur,
the functional described by $\mathbf{F}$ is said to be primitive recursive (or, briefly,
p.r.). If neither A nor M occurs in $\mathbf{F}$, then the function described by $\mathbf{F}$
is primitive recursive (or p.r.).
It is easy to see that every p.r. function(al) is total.

Let $\varphi$ be any given sequence. Then, if $F$ is an $n$-ary recursive functional,
the function $f = \lambda\mathfrak{x}\,F(\varphi;\mathfrak{x})$ is said to be recursive relative to $\varphi$ (briefly,
$\varphi$-recursive). A description of $F$ will also be said to describe $f$ relative to $\varphi$.

5.3. Theorem. A recursive function is $\varphi$-recursive for every sequence $\varphi$.
A function recursive relative to $\lambda x(x+1)$ is recursive.
Proof. The first part is obvious. To prove the second part, suppose that
for $\varphi = \lambda x(x+1)$ we have $f(\mathfrak{x}) = F(\varphi;\mathfrak{x})$, where $F$ has a description $\mathbf{F}$. Let
$\mathbf{G}$ be obtained from $\mathbf{F}$ upon replacing the symbol A by S. It is easy to
see (by induction on deg $\mathbf{F}$) that $\mathbf{G}$ describes $f$. ∎
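The substitution used in the proof of Thm. 5.3 is purely syntactic, as the following Python sketch illustrates. It is not part of the original text; descriptions are encoded as nested tuples such as `("C", G, H1, ..., Hk)`, and both this encoding and the name `replace_a_by_s` are hypothetical.

```python
def replace_a_by_s(desc):
    """Return the description obtained from `desc` by writing S for every A."""
    tag = desc[0]
    if tag == "A":
        return ("S",)
    if tag in ("C", "P", "M"):
        # Rebuild the compound description from its rewritten parts.
        return (tag,) + tuple(replace_a_by_s(part) for part in desc[1:])
    return desc                          # Z, S and I_{n,i} are left unchanged


# A description of lambda x. xi(xi(x)) and its A-free counterpart.
TWICE_A = ("C", ("A",), ("A",))
TWICE_S = replace_a_by_s(TWICE_A)
```

When the sequence argument is $\lambda x(x+1)$, the functional described by `TWICE_A` and the function described by `TWICE_S` both send $x$ to $x+2$, which is the content of the induction in the proof.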

5.4. Problem. Let $G$ and $H$ be an $n$-ary and an $(m+1)$-ary recursive
functional, and let $H$ be total¹. Let $F$ be an $(n+m)$-ary functional satisfying
the identity

$$F(\xi;\mathfrak{x},z_1,\dots,z_m) = G(\lambda y\,H(\xi;y,z_1,\dots,z_m);\mathfrak{x}).$$

Show that $F$ is recursive. (Use induction on the degree of a given description
of $G$. Verify that the identity

$$A(\lambda y\,H(\xi;y,z_1,\dots,z_m);x) = H(\xi;x,z_1,\dots,z_m)$$

holds.)

¹ We require $H$ to be total to make sure that the unary function $\lambda y\,H(\xi;y,z_1,\dots,z_m)$
(which depends on $z_1,\dots,z_m$ as parameters) is always total and hence eligible to serve
as the sequence argument of $G$.

We shall now show that every recursive function(al) is computable.

5.5. Theorem. Given any description $\mathbf{F}$ of a function(al) we can construct
a program which computes that function(al).
Proof. We shall deal with functionals rather than functions. However,
the reader can easily check that the program we shall construct will contain
A instructions only if the symbol A occurs in $\mathbf{F}$; so the result for functions
will follow at once.
Suppose then that $\mathbf{F}$ describes the $n$-ary functional $F$. By induction
on deg $\mathbf{F}$ we show how to write a program $\mathfrak{P}$ that not only computes $F$,
but has the following stronger property:

$$\mathfrak{P}_{n+m}(\xi;\mathfrak{x},y_1,\dots,y_m) = F(\xi;\mathfrak{x}) \quad \text{for all } m.$$

That is, when URIM operates under $\mathfrak{P}$, the existence and value of the
output do not depend on the initial contents of registers other than the
first $n$. Thus, when using $\mathfrak{P}$ to compute $F$ we need not insist that initially
all registers other than the first $n$ be empty. Of such $\mathfrak{P}$ we say that it
computes $F$ strongly.
It will be convenient to use the following notation: for all $m > n$ we
let $\mathfrak{C}_{n,m}$ be the program¹

It is easy to see that the effect of $\mathfrak{C}_{n,m}$ is to copy the contents of $R_1,\dots,R_n$
into $R_m,\dots,R_{m+n-1}$ respectively.
We proceed by cases, corresponding to those in Defs. 5.1 and 5.2.
(l)-(4). The functionals Z,S,A, and I„ti are strongly computed by the
programs (Z,), <Si),<Aj> and 9lu respectively.
(5) Suppose $\mathbf{F}$ is C($\mathbf{G}$, $\mathbf{H}_1,\dots,\mathbf{H}_k$) and

$$F(\xi;\mathfrak{x}) = G(\xi;H_1(\xi;\mathfrak{x}),\dots,H_k(\xi;\mathfrak{x})).$$

By the induction hypothesis, we possess programs that strongly
compute $G, H_1,\dots,H_k$ respectively. We set up the required program $\mathfrak{P}$,
which strongly computes $F$, in the form of a flow chart (see Fig. 3).
For the reader’s convenience, we show alongside the chart the state of the
registers at various key stages in the computation of $F(\varphi;\mathfrak{a})$. Asterisks
stand for registers whose contents are irrelevant.

(6) Suppose $\mathbf{F}$ is P($\mathbf{G}$, $\mathbf{H}$) and

$$F(\xi;\mathfrak{x},0) = G(\xi;\mathfrak{x}),$$

$$F(\xi;\mathfrak{x},y+1) = H(\xi;\mathfrak{x},y,F(\xi;\mathfrak{x},y)).$$

(Here $F$ is $(n+1)$-ary rather than $n$-ary.) By the induction hypothesis,
we possess programs $\mathfrak{Q}$ and $\mathfrak{R}$ which strongly compute $G$ and $H$ respectively.
We set up a flow chart (see Fig. 4) which (as in Ex. 3.1) can easily be
converted into a program $\mathfrak{P}$ that strongly computes $F$. Again, the state
of the registers at various key stages is shown alongside. The reader should
note that $R_{n+2}$ is used as a loop counter: at the start it is erased, and from
then on the number stored in it is the number of times the computation
has gone round the loop.

¹ For the meaning of %tJ see Ex. 3.1. Concatenation of programs was also defined in §3.

[Fig. 3. Flow chart for case (5), with the state of the registers shown alongside.]

[Fig. 4. Flow chart for case (6), with the state of the registers shown alongside.]

[Fig. 5. Flow chart for case (7), with the state of the registers shown alongside. After $t$ times round the loop, $G(\varphi;\mathfrak{a},t)$ in $R_{n+2}$ is compared with 0 in $R_{n+3}$; if unequal, go again round the loop; if equal, $t$ is $F(\varphi;\mathfrak{a})$, so take exit “Yes”.]

(7) Finally, suppose $\mathbf{F}$ is M($\mathbf{G}$) and

$$F(\xi;\mathfrak{x}) = \mu y\,G(\xi;\mathfrak{x},y).$$

By the induction hypothesis, we possess a program $\mathfrak{Q}$ that strongly computes
$G$. Again, we set up a flow chart (see Fig. 5) which can easily be converted
into a program that strongly computes $F$. We use $R_{n+1}$ as a loop counter
to show how many times the computation has gone through the loop.
This completes our proof. ∎

§ 6. A stockpile of examples

In this section we collect examples of recursive functionals, not merely
for illustration but for future use. But first we introduce some new
terminology.
Consider an operation that can be applied to given functionals, say
$G_1,\dots,G_k$ (which may either be arbitrary or subject to some special
conditions) to yield another functional, say $F$. We shall say that the operation
is recursive if for any given descriptions $\mathbf{G}_1,\dots,\mathbf{G}_k$ of $G_1,\dots,G_k$, respectively,
we can construct a description $\mathbf{F}$ of the corresponding $F$ such that:

(1) If the descriptions $\mathbf{G}_1,\dots,\mathbf{G}_k$ do not contain the symbol A, then $\mathbf{F}$ too
does not contain A.

Clearly, in this case, if $G_1,\dots,G_k$ are recursive functionals (or functions),
then $F$ too is a recursive functional (or function, respectively).
We shall say that the operation in question is primitive recursive (briefly,
p.r.) if in addition to (1) we also have:

(2) If the descriptions $\mathbf{G}_1,\dots,\mathbf{G}_k$ do not contain the symbol M, then $\mathbf{F}$ too
does not contain M.

In this case, if $G_1,\dots,G_k$ are p.r. functionals (or functions) then so is $F$.

Our treatment can be made to cover relations on $N$ as well as functions.
In fact, we shall find it convenient to identify an $n$-ary relation $R$ on $N$
with the $n$-ary function $f$ defined by

$$f(\mathfrak{x}) = \begin{cases} 0 & \text{if } R(\mathfrak{x}) \text{ holds}, \\ 1 & \text{otherwise}. \end{cases}$$

This amounts merely to identifying the truth values truth and falsehood
with 0 and 1 respectively.
Thus, for the purposes of recursion theory, we agree that by first-order
$n$-ary relation we mean a total $n$-ary function $R$ such that $R(\mathfrak{x}) \le 1$ for all
$\mathfrak{x} \in N^n$.
Similarly, by second-order $n$-ary relation we mean a total $n$-ary functional
$R$ such that $R(\xi;\mathfrak{x}) \le 1$ for all $\xi \in N^N$ and all $\mathfrak{x} \in N^n$.
By relation we mean (unless otherwise indicated) a first-order or second-order
$n$-ary relation for some $n$. Note that, since by our convention functions
are a particular kind of functional, first-order relations are a particular
kind of second-order relation, and treatment of the former is subsumed
under that of the latter.
If $R$ is a relation, we shall often say, e.g., “$R(\xi;\mathfrak{x})$ holds” or “$R(\xi;\mathfrak{x})$
is true” instead of “$R(\xi;\mathfrak{x}) = 0$”.

We are now ready to proceed to our examples.

6.1. Example. Let $G$ be any given $k$-ary functional, and let the $n$-ary
functional $F$ be obtained from $G$ by the identity

$$F(\xi;\mathfrak{x}) = G(\xi;x_{i_1},x_{i_2},\dots,x_{i_k}),$$

where $1 \le i_j \le n$ for $j = 1, 2, \dots, k$ and the $i_j$ need not be distinct. Then this
is a p.r. operation; for, if $\mathbf{G}$ describes $G$ then $F$ is described by

$$\mathrm{C}(\mathbf{G}, \mathrm{I}_{n,i_1},\dots,\mathrm{I}_{n,i_k}).$$

There are three important special cases of this operation:
(1) Permutation of variables; e.g., $k = n = 3$ and $F$ is obtained from $G$
through the identity

$$F(\xi;x,y,z) = G(\xi;y,z,x).$$

(2) Identification of variables; e.g., $k = 3$, $n = 2$ and the identity

$$F(\xi;x,y) = G(\xi;x,y,x).$$

(3) Introduction of a redundant (or fictitious) variable; e.g., $k = 1$, $n = 2$
and the identity¹

$$F(\xi;x,y) = G(\xi;x).$$

¹ Note that by the conventions adopted in §1, this identity means that for every sequence
$\varphi$ and numbers $a$ and $b$ we must have $F(\varphi;a,b) = G(\varphi;a)$. However, if instead of $b$
we take $\infty$ then we have $F(\varphi;a,\infty) = \infty$; hence we do not have $F(\varphi;a,\infty) = G(\varphi;a)$,
unless $G(\varphi;a)$ happens to be $\infty$.

We shall often apply these operations without special mention. Suppose,
e.g., that $F$ is obtained from $G$, $H_1$ and $H_2$ through the identity

$$F(\xi;x,y) = G(\xi;H_1(\xi;x),H_2(\xi;y,x)).$$

Then we conclude at once that this is a p.r. operation and say (stretching
the terminology introduced in Def. 5.2) that $F$ is obtained from $G$, $H_1$
and $H_2$ by composition. (In fact, we should first obtain two other functionals,
say $H_1'$ and $H_2'$, by

$$H_1'(\xi;x,y) = H_1(\xi;x),$$

$$H_2'(\xi;x,y) = H_2(\xi;y,x),$$

i.e., by introduction of a fictitious variable and permutation of variables;
and then get $F$ from $G$, $H_1'$ and $H_2'$ by composition in the strict sense of
Def. 5.2.)
Similarly, suppose e.g. that $F$ is obtained from $G$ and $H$ through the
identities

$$F(\xi;0,y,z) = G(\xi;y),$$

$$F(\xi;x+1,y,z) = H(\xi;F(\xi;x,y,z),z).$$

Then we conclude at once that this is a p.r. operation and (again stretching
the terminology of Def. 5.2) say that $F$ is obtained from $G$ and $H$ by
primitive recursion.
Again, suppose e.g. that $F$ is obtained from $G$ by

$$F(\xi;y) = \mu x\,G(\xi;x,x,y).$$

Then we conclude at once that this operation is recursive and say that $F$
is obtained from $G$ by minimization.

6.2. Example. In §1 we agreed to identify any number $n$ with the 0-ary
function having $n$ as its (unique) value. (This function is in turn identified
with the corresponding 0-ary functional.)
The numbers 0, 1, 2 etc. are then p.r. functions, described by Z, C(S,Z),
C(S,C(S,Z)) etc.
As for the 0-ary function $\infty$, it is recursive, described e.g. by M(S).

From now on we shall leave it to the reader to construct appropriate
descriptions of the functionals we introduce, or at any rate to check that
they could be constructed.

6.3. Example. Let $b$ be any number. If $F$ is obtained from $G$ by putting

$$F(\xi;\mathfrak{x}) = G(\xi;\mathfrak{x},b),$$

then by Ex. 6.2 this is a p.r. operation. We shall use this result frequently,
without special mention.
(Warning: if $\varphi$ is a non-recursive sequence¹, then the operation yielding
$f$ from $G$ through the identity $f(\mathfrak{x}) = G(\varphi;\mathfrak{x})$ is not recursive. For example,
we have $\varphi(x) = A(\varphi;x)$ although $A$ is recursive and $\varphi$ is not.)
6.4. Example. The sum function $\lambda xy(x+y)$ is p.r., since it may be obtained
by primitive recursion from $I_{1,1}$ and $S$, using the identities

$$x+0 = x,$$

$$x+(y+1) = (x+y)+1.$$

Product and power, i.e., $\lambda xy(x \cdot y)$ and $\lambda xy(x^y)$, are now easily seen to be p.r.
functions as well.

For typographical reasons we shall often write, e.g., “$\exp(x,y)$” instead
of “$x^y$”.

6.5. Example. The cut-off difference function $\lambda xy(x \dot{-} y)$ is defined by

$$x \dot{-} y = \max(x-y, 0).$$

To show that it is p.r., we first consider the unary function $f$, where
$f(x) = x \dot{-} 1$. This $f$ may be obtained by primitive recursion from 0 and $I_{2,1}$:

$$f(0) = 0,$$

$$f(x+1) = x.$$

Now cut-off difference, in turn, is obtained by primitive recursion from
$I_{1,1}$ and $f$:

$$x \dot{-} 0 = x,$$

$$x \dot{-} (y+1) = f(x \dot{-} y).$$

(The reader should check the correctness of these identities.)
The function max can be defined by

$$\max(x,y) = (x \dot{-} y) + y,$$

i.e., it is obtained by composition from $+$ and $\dot{-}$, and is thus a p.r. function.

¹ Such sequences exist, since there are uncountably many sequences and only countably
many of them are recursive.
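The primitive recursions of Examples 6.4 and 6.5 can be checked numerically. The Python sketch below is an illustration only, not part of the original text. It simplifies functionals to ordinary functions (no sequence argument), and the combinator name `prim_rec` as well as the function names are hypothetical.

```python
def prim_rec(g, h):
    """Return f with f(xs, 0) = g(xs) and f(xs, y+1) = h(xs, y, f(xs, y))."""
    def f(*args):
        *xs, y = args
        acc = g(*xs)
        for t in range(y):               # unwind the recursion from 0 up to y
            acc = h(*xs, t, acc)
        return acc
    return f


# x + 0 = x,  x + (y+1) = (x + y) + 1
add = prim_rec(lambda x: x, lambda x, y, w: w + 1)

# f(0) = 0,  f(x+1) = x   (so f(x) is x cut-off-minus 1)
pred = prim_rec(lambda: 0, lambda x, w: x)

# cut-off difference: x - 0 = x,  x - (y+1) = f(x - y)
monus = prim_rec(lambda x: x, lambda x, y, w: pred(w))


def maximum(x, y):
    """max(x, y) = (x cut-off-minus y) + y, a composition of the above."""
    return add(monus(x, y), y)
```

For instance, `monus(3, 5)` stays at 0 once the running value reaches 0, which is exactly the cut-off behaviour required.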


6.6. Example. We introduce the propositional operations as p.r. operations.
We define $\neg$ as the unary function $\lambda x(1 \dot{-} x)$, which is clearly a p.r.
function. The operation of negation is that which when applied to an $n$-ary
functional $G$ yields the functional $\neg G$ obtained by composition of $\neg$
and $G$, i.e.,

$$\neg G(\xi;\mathfrak{x}) = \neg(G(\xi;\mathfrak{x})).$$

This is evidently a p.r. operation. If $G$ is a relation, then it is easy to verify
that $\neg G$ is the relation that takes the value truth (i.e., 0) precisely where
$G$ takes the value falsehood (i.e., 1).
Next, we define $\vee$ to be the product function $\lambda xy(x \cdot y)$, which we know
from Ex. 6.4 to be a p.r. function. We write “$x \vee y$” instead of “$\vee(x,y)$”.
The disjunction operation is that which when applied to $n$-ary functionals
$G$ and $H$ yields the $n$-ary functional $G \vee H$ such that

$$(G \vee H)(\xi;\mathfrak{x}) = G(\xi;\mathfrak{x}) \vee H(\xi;\mathfrak{x}).$$

This operation is clearly p.r., and if $G$ and $H$ are relations then $G \vee H$
is the relation which is true precisely where $G$ and/or $H$ are true.
Finally, we define $\wedge$ and $\to$ to be the p.r. functions $\lambda xy(\neg(\neg x \vee \neg y))$
and $\lambda xy(\neg x \vee y)$ respectively. The operations of conjunction and implication
are then defined in the obvious way (and are seen to be p.r.).
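Since truth is identified with 0 and falsehood with 1, the arithmetic definitions of Ex. 6.6 may look inverted at first sight; a small truth-table check helps. The Python sketch below is an illustration only, and the function names `neg`, `disj`, `conj` and `impl` are hypothetical.

```python
def monus(x, y):
    """Cut-off difference of Ex. 6.5."""
    return max(x - y, 0)


def neg(x):
    """Negation: 1 cut-off-minus x (truth is 0, falsehood is 1)."""
    return monus(1, x)


def disj(x, y):
    """Disjunction is the product: it is 0 as soon as one factor is 0."""
    return x * y


def conj(x, y):
    """Conjunction, defined from negation and disjunction."""
    return neg(disj(neg(x), neg(y)))


def impl(x, y):
    """Implication, defined as (not x) or y."""
    return disj(neg(x), y)


TRUE, FALSE = 0, 1
```

Note that `neg` maps every positive number to 0, so the definitions behave sensibly even on arguments that are not truth values.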
6.7. Example. The order relation $<$ can be defined as the p.r. function
$\lambda xy(1 \dot{-} (y \dot{-} x))$. We write “$(x<y)$” instead of “$<(x,y)$”. The relation $>$
is p.r. since it is obtained from $<$ by permutation of variables:

$$(x>y) = (y<x).$$

The p.r. relations $\le$ and $\ge$ are introduced by

$$(x \le y) = \neg(y<x),$$

$$(x \ge y) = (y \le x).$$

The equality relation $=$ is defined by

$$(x=y) = (x \le y) \wedge (y \le x),$$

and is thus primitive recursive.


The reader should note that when $=$ (or, for that matter, any other
relation) is composed with functionals, the resulting functional may not
be a relation. Suppose, e.g., that $G$ and $H$ are $n$-ary functionals and $F$
satisfies the identity

(1) $F(\xi;\mathfrak{x}) = (G(\xi;\mathfrak{x}) = H(\xi;\mathfrak{x}))$.

Then clearly $F$ is obtained from $G$ and $H$ by a p.r. operation and is recursive
(or p.r.) if $G$ and $H$ are. But, according to our conventions, if $\varphi \in N^N$ and
$\mathfrak{a} \in N^n$ are such that $G(\varphi;\mathfrak{a}) = \infty$ or $H(\varphi;\mathfrak{a}) = \infty$ (or both) then also $F(\varphi;\mathfrak{a}) = \infty$,
whereas a relation must be total. In fact, $F$ is a relation iff $G$ and $H$ are total.
This state of affairs should be contrasted with the following definition.
Let $r$ and $s$ be expressions whose values are in the set $N \cup \{\infty\}$ and depend
on values assigned to the sequence variable $\xi$ and numerical variables $\mathfrak{x}$.
Then the expression

(2) $(r \simeq s)$

will, for any given values of $\xi$ in $N^N$ and $\mathfrak{x}$ in $N^n$, have the value 0 or 1,
according as the corresponding values of $r$ and $s$ are equal (both being
the same number or both $\infty$) or different. It must be stressed that expression
(2) must be taken as an indivisible whole, not as a result of substituting
$r$ and $s$ for $x$ and $y$ respectively in “$(x=y)$”.
According to this definition, if $G$ and $H$ are as before and

(3) $P(\xi;\mathfrak{x}) = (G(\xi;\mathfrak{x}) \simeq H(\xi;\mathfrak{x}))$,

then $P$ is certainly a relation. On the other hand, we shall see later that
$P$ is not necessarily recursive even if $G$ and $H$ are.
Finally we note that if $r$ and $s$ never take the value $\infty$ for any values
of $\xi$ in $N^N$ and $\mathfrak{x}$ in $N^n$, then we have

$$(r \simeq s) = (r = s),$$

where “$(r = s)$” is the result of substituting $r$ and $s$ for $x$ and $y$, respectively,
in “$(x=y)$”. Thus in this case we may and shall write “$(r = s)$” instead
of “$(r \simeq s)$”. In particular, if $G$ and $H$ are total, then $F$ and $P$ of (1) and (3)
are the same.
We shall write “$(r \ne s)$” instead of “$\neg(r = s)$”.
6.8. Example. The operation of summation is applied to an $(n+1)$-ary
functional $G$, yielding another $(n+1)$-ary functional $F$:

$$F(\xi;\mathfrak{x},y) = \sum_{z<y} G(\xi;\mathfrak{x},z).$$

That this operation is p.r. can be seen from the identities

$$F(\xi;\mathfrak{x},0) = 0,$$

$$F(\xi;\mathfrak{x},y+1) = F(\xi;\mathfrak{x},y) + G(\xi;\mathfrak{x},y).$$

Similarly, the product operation, where

$$F(\xi;\mathfrak{x},y) = \prod_{z<y} G(\xi;\mathfrak{x},z),$$

is p.r. because

$$F(\xi;\mathfrak{x},0) = 1,$$

$$F(\xi;\mathfrak{x},y+1) = F(\xi;\mathfrak{x},y) \cdot G(\xi;\mathfrak{x},y).$$

(We sometimes write, e.g., “$x \le y$” under the summation or product
sign, as short for “$x < y+1$”.)
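6.9. Example. Bounded existential and universal quantifications are two
operations that when applied to an $(n+1)$-ary relation $P$ yield $(n+1)$-ary
relations $Q$ and $R$, where

$$Q(\xi;\mathfrak{x},y) = \exists z<y\,P(\xi;\mathfrak{x},z),$$

$$R(\xi;\mathfrak{x},y) = \forall z<y\,P(\xi;\mathfrak{x},z).$$

Here $Q(\varphi;\mathfrak{a},b)$ holds iff there is some $c<b$ for which $P(\varphi;\mathfrak{a},c)$ holds; and
$R(\varphi;\mathfrak{a},b)$ holds iff $P(\varphi;\mathfrak{a},c)$ holds for every $c<b$. That these operations
are p.r. can be seen from the identities

$$Q(\xi;\mathfrak{x},y) = \prod_{z<y} P(\xi;\mathfrak{x},z),$$

$$R(\xi;\mathfrak{x},y) = \neg \prod_{z<y} \neg P(\xi;\mathfrak{x},z).$$

We shall write, e.g., “$\exists z \le y$” as short for “$\exists z<y+1$”.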


6.10. Example. The bounded minimum operation is applied to an $(n+1)$-ary
relation $P$ to yield an $(n+1)$-ary total functional $F$:

(1) $F(\xi;\mathfrak{x},y) = \mu z\,[(z<y) \wedge P(\xi;\mathfrak{x},z) \vee (z=y)]$.

The right-hand side of (1) will be written more briefly thus:

$$\min z<y\ P(\xi;\mathfrak{x},z).$$

(The reader should check that if there is some $c<b$ for which $P(\varphi;\mathfrak{a},c)$
holds, then $\min z<b\ P(\varphi;\mathfrak{a},z)$ is the smallest such $c$; if no such $c$ exists,
then $\min z<b\ P(\varphi;\mathfrak{a},z) = b$.)
From (1) it follows at once that this operation is recursive. But in fact
it is even p.r., in view of the identity

$$\min z<y\ P(\xi;\mathfrak{x},z) = \sum_{z<y} (\exists v \le z\ P(\xi;\mathfrak{x},v)),$$

which we ask the reader to verify.
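The identities of Examples 6.9 and 6.10 can be tried out directly. In the Python sketch below, which is an illustration only, a relation is a function returning 0 (truth) or 1 (falsehood), the sequence argument and the parameters $\mathfrak{x}$ are suppressed, and the names `exists_below`, `forall_below` and `bounded_min` are hypothetical. The bounded minimum is computed as the sum of bounded existential quantifications with bound $v \le z$, i.e. $v < z+1$.

```python
from math import prod


def neg(x):
    """Negation on truth values coded as 0 (true) and 1 (false)."""
    return max(1 - x, 0)


def exists_below(p, y):
    """Bounded existential quantifier: the product of p(z) over z < y."""
    return prod(p(z) for z in range(y))


def forall_below(p, y):
    """Bounded universal quantifier: not (product of not p(z) over z < y)."""
    return neg(prod(neg(p(z)) for z in range(y)))


def bounded_min(p, y):
    """min z < y p(z), as the sum over z < y of (exists v <= z with p(v))."""
    # Each z below the least witness contributes 1 (falsehood); every later
    # z contributes 0, so the sum counts exactly the z below the witness.
    return sum(exists_below(p, z + 1) for z in range(y))


def square_exceeds_ten(z):
    """A sample relation: true (0) iff z * z > 10."""
    return 0 if z * z > 10 else 1
```

The least $z$ with $z^2 > 10$ is 4, so the sum collects a 1 for each of $z = 0, 1, 2, 3$ and nothing afterwards; when the bound is too small to reach a witness, the sum equals the bound.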


6.11. Example. Division with remainder provides us with two p.r. functions
q and rm defined by

$$\mathrm{q}(x,y) = \min z<x\ (y(z+1) > x),$$

$$\mathrm{rm}(x,y) = x \dot{-} y \cdot \mathrm{q}(x,y).$$

(If $b \ne 0$, then q$(a,b)$ and rm$(a,b)$ are respectively the quotient and remainder
obtained when $a$ is divided by $b$. Also, q$(a,0)$ = rm$(a,0)$ = $a$.)
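The boundary cases of Ex. 6.11 (division by 0 and by 1) are easy to get wrong, so a direct transcription is worth running. The Python sketch below is an illustration only; here conditions are ordinary Python booleans rather than the 0/1 truth values of the text, and the helper name `bounded_min` is hypothetical.

```python
def bounded_min(condition, bound):
    """Least z < bound satisfying `condition`, or `bound` if there is none."""
    for z in range(bound):
        if condition(z):
            return z
    return bound


def q(x, y):
    """q(x, y) = min z < x (y * (z + 1) > x)."""
    return bounded_min(lambda z: y * (z + 1) > x, x)


def rm(x, y):
    """rm(x, y) = x cut-off-minus y * q(x, y)."""
    return max(x - y * q(x, y), 0)
```

For $y = 1$ no $z < x$ satisfies the condition, so the bounded minimum falls back to the bound $x$, which is the correct quotient; for $y = 0$ the same fallback gives q$(x,0) = x$.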
6.12. Example. The sequence of primes $\lambda x\,p_x$, where $p_0 = 2$, $p_1 = 3$, $p_2 = 5$
and so on through the prime numbers. Euclid’s famous proof that there
are infinitely many primes shows that $p_{x+1} \le p_x! + 1$. Hence we have¹
$p_{x+1} \le \exp(p_x,p_x)$ and we can obtain our sequence by primitive recursion
from functions already known to be p.r.:

$$p_0 = 2,$$

$$p_{x+1} = \min y \le \exp(p_x,p_x)\ \{(y > p_x) \wedge \forall z<y\,[(z \le 1) \vee (\mathrm{rm}(y,z) > 0)]\}.$$

Thus this sequence is primitive recursive.
We now define the binary p.r. function $\lambda zx\,[(z)_x]$ by the identity

$$(z)_x = \min y<z\ (\mathrm{rm}(z, \exp(p_x, y+1)) > 0).$$

Clearly, if $z>0$ then $(z)_x$ is the greatest $y$ for which $p_x^y$ still divides $z$.
(Also, $(0)_x = 0$; but we shall not use this.) We call $(z)_x$ the $x$th exponent in $z$.
We shall write, e.g., “$(z)_{x,y}$” for “$((z)_x)_y$” etc.
Next, we define the p.r. function $\lambda z\,\mathrm{lh}(z)$ by

$$\mathrm{lh}(z) = \min y<z\ \Big(\prod_{x<y} \exp(p_x,(z)_x) = z\Big).$$

Clearly, for positive $z$, $\prod_{x<\mathrm{lh}(z)} \exp(p_x,(z)_x)$ is the canonical representation
of $z$ as a product of powers of primes. In other words, if $e$ is a positive
number, then the equality $e = \prod_{x<h} \exp(p_x,e_x)$ (with $e_{h-1} \ne 0$ if $h \ne 0$)
holds iff $h = \mathrm{lh}(e)$ and $e_x = (e)_x$ for each $x<h$. (From the definition it also
follows that lh(0) = 0, but we shall not use this.) We call lh$(z)$ the length of $z$.

¹ Recall that $\exp(x,y) = x^y$.
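The three functions of Ex. 6.12 can be transcribed almost literally. The Python sketch below is an illustration only: it follows the bounded searches of the text (including the bound $\exp(p_x,p_x)$ for the next prime) rather than aiming at efficiency, and the names `is_prime`, `prime`, `exponent` and `lh` are hypothetical.

```python
def is_prime(y):
    """True iff y > 1 and no z with 1 < z < y divides y."""
    return y > 1 and all(y % z for z in range(2, y))


def prime(x):
    """p_x, obtained by the recursion p_0 = 2, p_{x+1} = least prime > p_x."""
    p = 2
    for _ in range(x):
        # Search y with p < y <= exp(p, p); Euclid's bound guarantees a hit.
        p = next(y for y in range(p + 1, p ** p + 1) if is_prime(y))
    return p


def exponent(z, x):
    """(z)_x = min y < z (rm(z, p_x^(y+1)) > 0); equals z if no such y."""
    p = prime(x)
    return next((y for y in range(z) if z % p ** (y + 1) > 0), z)


def lh(z):
    """lh(z) = min y < z (product over x < y of p_x^((z)_x) equals z)."""
    def rebuilt(y):
        out = 1
        for x in range(y):
            out *= prime(x) ** exponent(z, x)
        return out
    return next((y for y in range(z) if rebuilt(y) == z), z)
```

Thus $12 = 2^2 \cdot 3^1$ has exponents $(12)_0 = 2$, $(12)_1 = 1$ and length 2, while the number 1, with the empty factorization, has length 0.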

6.13. Example. The pairing function $J$ is defined as follows. We enumerate
all ordered pairs of numbers $(x,y)$ in order of their increasing sum, $x+y$,
and, among different pairs with the same sum, in order of increasing
second component, $y$. Then $J(x,y)$ is the index of $(x,y)$ in this enumeration,
beginning with $J(0,0) = 0$. Or, in other words, $J(x,y)$ is the number of
pairs preceding $(x,y)$ in the enumeration.
Now, for each $w$ there are clearly $w+1$ different pairs $(u,v)$ with $u+v = w$.
Thus for each pair $(x,y)$ there are altogether $1+2+\dots+(x+y)$ different
pairs whose sums are $< x+y$. Also, there are $y$ pairs whose sums equal
$x+y$ but whose second component is $< y$ (namely, the pairs $(x+y-z, z)$
with $z<y$). It follows that

$$J(x,y) = \mathrm{q}((x+y)(x+y+1), 2) + y.$$

Hence $J$ is a p.r. function. (From this identity it also follows that
$x+y \le J(x,y)$ for all $x$ and $y$.)
$J$ is a bijection of $N^2$ onto $N$. Hence it has a pair of inverses $K$ and $L$
such that

$$K(J(x,y)) = x, \quad L(J(x,y)) = y, \quad J(K(z),L(z)) = z.$$

That these inverses are p.r. can be seen from the identities

$$K(z) = \min x \le z\ \exists y \le z\ (J(x,y) = z),$$

$$L(z) = \min y \le z\ (J(K(z),y) = z).$$
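The enumeration behind the pairing function of Ex. 6.13 and the bounded searches for its inverses can be checked on an initial segment of $N$. The Python sketch below is an illustration only; the functions are named `J`, `K` and `L` after the text, and integer division by 2 plays the part of q$(\cdot, 2)$.

```python
def J(x, y):
    """Index of (x, y): pairs of smaller sum come first, then smaller y."""
    return (x + y) * (x + y + 1) // 2 + y


def K(z):
    """First component: least x <= z such that J(x, y) = z for some y <= z."""
    return next(x for x in range(z + 1)
                if any(J(x, y) == z for y in range(z + 1)))


def L(z):
    """Second component: least y <= z with J(K(z), y) = z."""
    return next(y for y in range(z + 1) if J(K(z), y) == z)
```

The searches may be bounded by $z$ because $x+y \le J(x,y)$, so neither component of the pair with index $z$ can exceed $z$.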

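6.14. Example. Definition by cases is an operation that yields an $n$-ary
functional $F$ from $n$-ary functionals $G_1,\dots,G_k$ and $n$-ary relations $R_1,\dots,R_k$
such that

$$F(\xi;\mathfrak{x}) = \begin{cases} G_1(\xi;\mathfrak{x}) & \text{if } R_1(\xi;\mathfrak{x}), \\ \quad\vdots \\ G_k(\xi;\mathfrak{x}) & \text{if } R_k(\xi;\mathfrak{x}). \end{cases}$$

(Here “$R_i(\xi;\mathfrak{x})$” is short for “$R_i(\xi;\mathfrak{x})$ holds”.) The relations $R_i$ are subject
to the condition that for each $\varphi \in N^N$ and $\mathfrak{a} \in N^n$ there is exactly one $i$,
$1 \le i \le k$, such that $R_i(\varphi;\mathfrak{a})$ holds. Also, for the present we require all the
$G_i$ to be total. Under these conditions, the operation is p.r. since

$$F(\xi;\mathfrak{x}) = \sum_{1 \le i \le k} G_i(\xi;\mathfrak{x}) \cdot \neg R_i(\xi;\mathfrak{x}).$$

We shall see later that if the $G_i$ are not required to be total, definition by
cases is a recursive operation.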
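One natural way to see that definition by cases is p.r. when the $G_i$ are total is to write $F$ as the sum of the products $G_i \cdot \neg R_i$: since exactly one $R_i$ holds (has the value 0), exactly one factor $\neg R_i$ equals 1. The Python sketch below is an illustration of this idea only; it assumes that identity, suppresses the sequence argument, and uses the hypothetical names `by_cases`, `lt`, `ge` and `abs_diff`.

```python
def monus(x, y):
    """Cut-off difference."""
    return max(x - y, 0)


def neg(x):
    """Negation on truth values coded as 0 (true) and 1 (false)."""
    return monus(1, x)


def lt(x, y):
    """The relation x < y as a 0/1-valued function (0 means true)."""
    return monus(1, monus(y, x))


def ge(x, y):
    """The relation x >= y, i.e. the negation of x < y."""
    return neg(lt(x, y))


def by_cases(branches):
    """Combine (G_i, R_i) pairs into F = sum of G_i * (not R_i)."""
    def f(*args):
        return sum(g(*args) * neg(r(*args)) for g, r in branches)
    return f


# |x - y|, defined by the two cases x >= y and x < y.
abs_diff = by_cases([
    (lambda x, y: monus(x, y), ge),
    (lambda x, y: monus(y, x), lt),
])
```

The requirement that the $G_i$ be total matters here: every $G_i$ is evaluated, including those whose case does not apply.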

6.15. Counter-Example. We know from Ex. 6.3 that if $f$ is a binary
recursive function then for each number $a$ the unary functions $\lambda x\,f(x,a)$
and $\lambda y\,f(a,y)$ are also recursive. The converse, however, need not be true.
For example, let $\varphi$ be a non-recursive permutation of $N$, i.e., a sequence
that takes each value exactly once. (Such a sequence exists, since there
are uncountably many permutations of $N$.) Let $R$ be the graph of $\varphi$, i.e.,
the binary relation such that

$$R(x,y) = (\varphi(x) = y).$$

For each $a$ we have

$$R(x,a) = (x = \varphi^{-1}(a)),$$

$$R(a,y) = (\varphi(a) = y),$$

where $\varphi^{-1}$ is the inverse of $\varphi$. These two identities show that $\lambda x\,R(x,a)$ and
$\lambda y\,R(a,y)$ are recursive, and even primitive recursive. But $\varphi$ can be obtained
from $R$ by minimization:

$$\varphi(x) = \mu y\,R(x,y),$$

so $R$ cannot be recursive.

6.16. Problem. The Fibonacci sequence $\varphi$ is defined by the identities

$$\varphi(0) = 0, \quad \varphi(1) = 1, \quad \varphi(x+2) = \varphi(x+1) + \varphi(x).$$

Show that it is primitive recursive. (A method which works here and in
similar cases is to show first that the sequence $\bar{\varphi}$, defined by
$\bar{\varphi}(y) = \prod_{x \le y} \exp(p_x,\varphi(x))$, is p.r., and then use the identity $\varphi(y) = (\bar{\varphi}(y))_y$.)
6.17. Problem. For fixed $m$, let $G_i$ and $H_i$ be $n$-ary and $(n+m+2)$-ary
functionals respectively, $i = 0,\dots,m$. Let $F_i$ be obtained from the $G_i$ and
$H_i$ by “simultaneous primitive recursion”, i.e.,

$$F_i(\xi;\mathfrak{x},0) = G_i(\xi;\mathfrak{x}),$$

$$F_i(\xi;\mathfrak{x},y+1) = H_i(\xi;\mathfrak{x},y,F_0(\xi;\mathfrak{x},y),\dots,F_m(\xi;\mathfrak{x},y)),$$

for $i = 0,\dots,m$. Show that each $F_i$ can be obtained by a p.r. operation
from $G_0,\dots,G_m$, $H_0,\dots,H_m$. (Consider functionals $G$ and $H$ defined by

$$G(\xi;\mathfrak{x}) = \prod_{i \le m} \exp(p_i,G_i(\xi;\mathfrak{x})),$$

$$H(\xi;\mathfrak{x},y,z) = \prod_{i \le m} \exp(p_i,H_i(\xi;\mathfrak{x},y,(z)_0,\dots,(z)_m)).$$

Obtain a functional $F$ from $G$ and $H$ by primitive recursion; then obtain
each $F_i$ from $F$.)

§ 7. Church’s Thesis

Using the methods developed in §6, as well as further methods that can be
developed along the same lines, a wide variety of algorithmic functions
can be shown to be recursive. As a matter of fact, no one has ever come
up with an example of a function $f$ such that there is an algorithm (answering
the general description of §2) for calculating $f$, and such that $f$ could
not be shown to be recursive. No one has even outlined a plausible way
of producing such a counter-example.
Moreover, during the 1930s as well as since then, many different definitions
have been offered, each of which characterizes a class of algorithmic
functions. In every case, the class so defined has proved to be a subclass
of the class of recursive functions. (In many important cases, the intention
was to make the definition as wide as possible. In such cases the class so
defined turned out to contain all recursive functions, but no others.)
These definitions are of three main kinds.
First, there are definitions using imaginary machines. Our own definition
of computable functions is of this kind, but there are many others, using
a wide variety of machines. (Some of these machines, like URIM, have
an unlimited storage capacity, for otherwise the class of functions that they
can calculate would not contain all recursive functions; but in all other
respects they are like computers that have been, or could plausibly be,
actually built.) In all these cases, the functions calculable by the proposed
machine can be shown to be recursive, using the same kind of method
we shall employ in §8 to show that all computable functions are recursive.
(We shall then point out why that method is likely to work for every machine
that might plausibly ever be proposed.)
A second kind of approach is to specify explicitly some basic functions
which are obviously algorithmic, and some basic operations which, when
applied to algorithmic functions, always yield functions which are in
turn algorithmic. The class of functions defined in this way is the smallest
class containing the basic functions and closed under the basic operations.
(Our definition of the class of recursive functions is of this kind: the basic
functions are those described by the symbols Z, S, $\mathrm{I}_{n,i}$, and the basic
operations are composition, primitive recursion and minimization.) In all
cases where such a definition has been proposed, it is possible to show
directly (and for the most part quite easily) that the basic functions are
recursive, and that the basic operations are recursive operations in the
sense of §6.

A third kind of method proceeds by employing a formal calculus. One
takes a suitably chosen formal language (which may or may not be a first-order
language of the kind studied in this book) and specifies a procedure
for making formal deductions in that language. (This is often done by
specifying axioms and rules of inference in much the same way as we have
done for the propositional and predicate calculi; other methods resemble
the method of tableaux.) Certain deductions in such a calculus may be
interpreted (in some natural way) as formal proofs of statements of the
form “$f(\mathfrak{a}) = b$”. A function $f$ is said to be representable in the calculus
if the statements of the form “$f(\mathfrak{a}) = b$” (with this particular $f$) which
do have such formal proofs are precisely the correct ones. If the calculus
is set up in a suitable way, all functions representable in it will be
algorithmic¹. Many calculi of this kind have been proposed and examined.
Some were purpose-built specifically for the task of getting as many
algorithmic functions as possible to be representable. Other calculi had
arisen in connection with attempts to formalize and axiomatize various
portions of mathematics. It turns out that there are rather weak calculi
in which all recursive functions are representable. (We shall see examples
of this in Ch. 7.) But in every case where the functions representable
in a calculus are known to be algorithmic they can be shown to be recursive
as well.
To sum up: all the various definitions, using different kinds of
approaches, which aimed at characterizing classes of algorithmic functions,
have in fact turned out to define subclasses of the class of recursive functions.
This and other evidence has led virtually all logicians to accept the
following conclusion, known as Church's Thesis:
All algorithmic functions are recursive.

Church's Thesis is a matter of reasonable belief, not a theorem. It
cannot be proved mathematically, because the notion of algorithm, and
hence that of algorithmic function, is not precisely defined, but merely
described in intuitive terms. Our grasp of this notion enables us to recognize
an algorithm when we see one²; but it does not enable us to prove the
non-existence of an algorithm for calculating a function which we suspect
of being non-algorithmic.

¹ For this it is sufficient that there should be some effective procedure for enumerating,
one by one, all the deductions of the calculus (as we had, e.g., in §5 of Ch. 3). Then,
if $f$ is a representable $n$-ary function and $\mathfrak{a}$ is any $n$-tuple, we can calculate $f(\mathfrak{a})$ by
systematically searching for a formal proof of a statement of the form “$f(\mathfrak{a}) = b$”. This yields
an algorithm for $f$.
² Thus, we can see that all computable functions are algorithmic, because we easily
recognize URIM programs to be algorithms. Since (by Thm. 5.5) all recursive functions
are computable it follows that the converse of Church's Thesis is clearly true.
When doing recursion theory, we shall not use Church’s Thesis in our
proofs. These will be based on the definition of recursiveness given in §5.
However, Church's Thesis motivates recursion theory and gives significance
to its results. In particular, if a function is shown to be non-recursive,
then by Church's Thesis we conclude that it is non-algorithmic, and
this is the only known way to reach that conclusion¹.
We remark that Church's Thesis for functionals — “All algorithmic
functionals are recursive” — is also held to be true, on virtually the same
grounds as for functions. However, we do not wish to make much of
this, since in this book we shall be concerned with functionals as a means
of getting results about functions and not for their own sake.

§ 8. Recursiveness of computable functionals

In this section we shall show that every computable function(al) is
recursive. In preparation for this, we assign a code number #C to each
command C as follows:

Command:       Z_i    S_i    A_i    J_{i,j,m}
Code number:   2^i    3^i    5^i    7^i · 11^j · 13^m

Next, we assign to any program 𝔓 = ⟨C_0, ..., C_{h-1}⟩ the code number

$$ \#\mathfrak{P} = \prod_{k<h} \exp(p_k, \#C_k). $$

In particular, # of the empty program is 1.
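
As a hedged illustration of this coding, the following Python sketch computes the code numbers of commands and of programs. The representation of commands as tuples such as ('Z', 1) or ('J', 1, 2, 0), and the names `primes`, `command_code` and `program_code`, are assumptions made for this example only; they are not part of the text.

```python
from itertools import count


def primes():
    """Yield p_0 = 2, p_1 = 3, p_2 = 5, ... by trial division."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def command_code(cmd):
    """Code number #C of a command given as a tuple (hypothetical format)."""
    kind = cmd[0]
    if kind == 'Z':
        return 2 ** cmd[1]
    if kind == 'S':
        return 3 ** cmd[1]
    if kind == 'A':
        return 5 ** cmd[1]
    if kind == 'J':
        _, i, j, m = cmd
        return 7 ** i * 11 ** j * 13 ** m
    raise ValueError(f"unknown command {cmd!r}")


def program_code(program):
    """Product over k < h of p_k ** #C_k; the empty program gets 1."""
    code = 1
    for p_k, cmd in zip(primes(), program):
        code *= p_k ** command_code(cmd)
    return code
```

For instance, the one-command program consisting of S_1 receives the code 2^3, since #S_1 = 3 and p_0 = 2. Code numbers grow very quickly, so the sketch is meant for small programs only.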


We define a property (i.e., unary relation) Prog by the identity

$$ \mathrm{Prog}(z) \equiv (z>0) \wedge \forall x<\mathrm{lh}(z)\, \exists u<z\, \exists v<z\, \exists w<z\, \bigl[((z)_x = 2^{u+1}) \vee ((z)_x = 3^{u+1}) \vee ((z)_x = 5^{u+1}) \vee ((z)_x = 7^{u+1} \cdot 11^{v+1} \cdot 13^{w})\bigr]. $$

From this identity it is easy to see that Prog is a p.r. property, and Prog(z)
holds iff z = #𝔓 for some program 𝔓.
Next, we define a function λz(ẑ) as follows:

$$ \hat{z} = \begin{cases} z & \text{if } \mathrm{Prog}(z), \\ 1 & \text{if } \neg\mathrm{Prog}(z). \end{cases} $$

¹ On the other hand, to show that a function is algorithmic, it is enough to produce
an algorithm and prove that it serves for calculating the function.
260 RECURSION THEORY [CH. 6, §8

This function is p.r. (see Ex. 6.14), and ẑ is always the # of some program:
namely, of the program whose # is z or, if such a program does not
exist, of the empty program.

8.1. Definition. {z} is the program 𝔓 such that #𝔓 = ẑ.

According to this definition and the conventions of §4, {z}_n is the n-ary
functional computed by {z}; and if {z} has no A commands then {z}_n
is also the function computed by {z}. As agreed in §4, when “{z}_n” is
written next to the appropriate arguments, the subscript may be dropped,
since n can be determined from the number of numerical arguments shown.
It is easy to see that the length of the program {z} (i.e., the number of
commands it has) is lh(z).
For any u>0, we shall say that at a given moment URIM is in state u if
at that moment the number in the program counter K is (u)_0 and, for
each positive i, the number in the ith register is (u)_i. Note that (u)_i = 0
for almost all i, in agreement with our decree that at any moment almost
all the registers are empty. Note also that at any moment (assuming
almost all the registers to be empty) URIM is in some uniquely determined
state u>0.
From now on until we come to Def. 8.2 below, we assume that URIM
is linked up to a ξ-oracle and is operating under the program {z}.
We say that u is a halting state (for the program {z}) if u>0 and
(u)_0 ≥ lh(z). (This corresponds to the fact that URIM halts precisely
when the number in K is ≥ the length of the program.)
For each state u there is a unique next state, Nex(ξ;u,z), depending
on ξ and z as well as on u. If u is a halting state, then the next state will
be u itself (URIM has halted!); otherwise, the next state is that in which
URIM will be immediately after obeying the (u)_0th command of {z}. To
make Nex defined also for u=0, which is not a state, we put Nex(ξ;0,z) = 0.
We wish to show that Nex is a p.r. functional. To make things a bit
easier, we first define auxiliary functions k, i, f and k' as follows:

$$ k(u) = (u)_0, $$
$$ i(u,z,x) = (z)_{k(u),x}, $$
$$ f(u,z,x) = q\bigl(u, \exp(p_{i(u,z,x)}, (u)_{i(u,z,x)})\bigr), $$
$$ k'(u,z) = \begin{cases} i(u,z,5) & \text{if } (u)_{i(u,z,3)} = (u)_{i(u,z,4)}, \\ k(u)+1 & \text{otherwise.} \end{cases} $$

By the results of §6, these functions are easily seen to be primitive recursive.

To explain what they mean, let us assume that u>0 and URIM is in
state u. Also, let us assume that u is not a halting state, i.e., (u)_0 < lh(z).
Then k(u) is the number stored in K. Also, URIM is about to obey
the k(u)th command of {z}.
Now, i(u,z,x) is the xth exponent in the code number (z)_{k(u)} of that
command. Therefore, recalling that p_0 = 2, ..., p_5 = 13, we observe the
following facts.
If i(u,z,0)>0, then URIM is about to obey the command Z_{i(u,z,0)}; if
i(u,z,1)>0, then the command about to be obeyed is S_{i(u,z,1)}; if i(u,z,2)>0,
then the command about to be obeyed is A_{i(u,z,2)}; and if i(u,z,3)>0, then
the command about to be obeyed is J_{i(u,z,3),i(u,z,4),i(u,z,5)}, and in this case
the number stored in K after that command is obeyed will be k'(u,z).
Finally, f(u,z,x) is a state exactly like u, except that in f(u,z,x) the
i(u,z,x)th register is empty.
On the basis of these explanations, the reader should have no difficulty
in verifying the following identity (in which we write, e.g., “i(0)” as short
for “i(u,z,0)” etc.):

$$ \mathrm{Nex}(\xi;u,z) = \begin{cases}
2 \cdot f(u,z,0) & \text{if } (u>0) \wedge (k(u)<\mathrm{lh}(z)) \wedge (i(0)>0), \\
2 \cdot p_{i(1)} \cdot u & \text{if } (u>0) \wedge (k(u)<\mathrm{lh}(z)) \wedge (i(1)>0), \\
2 \cdot \exp(p_{i(2)}, \xi((u)_{i(2)})) \cdot f(u,z,2) & \text{if } (u>0) \wedge (k(u)<\mathrm{lh}(z)) \wedge (i(2)>0), \\
2^{k'(u,z)} \cdot q(u, 2^{k(u)}) & \text{if } (u>0) \wedge (k(u)<\mathrm{lh}(z)) \wedge (i(3)>0), \\
u & \text{otherwise.}
\end{cases} $$

It now follows at once from this definition by cases that Nex is a p.r.
functional.
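
The case analysis for Nex can be sketched in Python as follows. This is only an illustration under stated assumptions: the program is passed as a list of command tuples such as ('S', 1) or ('J', 1, 2, 0) instead of its code number z, the oracle ξ is an ordinary Python function `xi`, and the helper names `nth_prime`, `exponent` and `nex` are invented for the example. States are coded exactly as in the text: the exponent of 2 is the content of K and the exponent of p_i is the content of the ith register.

```python
def nth_prime(i):
    """Return p_i, where p_0 = 2, p_1 = 3, p_2 = 5, ..."""
    found, n = [], 1
    while len(found) <= i:
        n += 1
        if all(n % p for p in found):
            found.append(n)
    return found[i]


def exponent(u, i):
    """(u)_i: the exponent of p_i in u, taken to be 0 when u == 0."""
    if u == 0:
        return 0
    p, e = nth_prime(i), 0
    while u % p == 0:
        u //= p
        e += 1
    return e


def nex(xi, u, program):
    """Next state after state u; `program` is a list of command tuples."""
    if u == 0:                      # 0 is not a state; Nex(xi; 0, z) = 0
        return 0
    k = exponent(u, 0)              # k(u): content of the program counter K
    if k >= len(program):           # halting state: it is its own successor
        return u
    cmd = program[k]
    if cmd[0] == 'J':
        _, i, j, m = cmd
        k_next = m if exponent(u, i) == exponent(u, j) else k + 1
        return 2 ** k_next * (u // 2 ** k)
    i = cmd[1]
    p = nth_prime(i)
    emptied = u // p ** exponent(u, i)      # like u, but register i is empty
    if cmd[0] == 'Z':
        return 2 * emptied
    if cmd[0] == 'S':
        return 2 * p * u
    if cmd[0] == 'A':
        return 2 * p ** xi(exponent(u, i)) * emptied
    raise ValueError(f"unknown command {cmd!r}")
```

Each branch mirrors one line of the identity: multiplying by 2 advances K by one, dividing out the full power of p_i empties the ith register, and the J branch replaces the power of 2 by 2 raised to the new counter value.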

We pause here to make two comments.

First, in showing Nex to be p.r., we have made use of the fact that ξ
is not allowed to vary over all unary functions but only over sequences.
For, Ex. 6.14 (definition by cases) which we have used here was shown
to work only when all the functionals on the right-hand side are total.
This is just a symptom of the fact that if one allows ξ to vary over all unary
functions, the treatment (and even the very notion) of algorithmic functionals
becomes a bit problematic. It is precisely for this reason that we have
chosen to allow ξ to vary over N^N only.
Second, as we shall soon see, the whole proof that every computable
function(al) is recursive hinges on the result that Nex is recursive. Now,
it is reasonable to believe that a similar result will hold for any other
“computing machine” that may plausibly be proposed for calculating
algorithmic functions or functionals. The method of encoding programs,
states, etc., by numbers is of general applicability, and so we can be quite
sure that for every proposed machine we should be able at least to define
an appropriate next-state function(al). Can this function(al) fail to be
recursive? This seems highly unlikely. For, what makes a proposed
theoretical device plausible as a “machine” is that the successive steps
it is supposed to perform are rather rudimentary and depend in a simple
way on the program and on the state of the device at a given moment.
The next-state function(al) of any plausible machine must certainly be
algorithmic, and rather simple at that. However, all the evidence suggests
that if there is a non-recursive algorithmic function(al) — which is extremely
unlikely anyway — it must be very complicated. Our argument is, in other
words, that from the virtual certainty that no simple algorithmic function
can fail to be recursive, it is reasonable to conclude that no non-recursive
function can be calculated by anything that we might plausibly be ready
to call a “computer”. Further, since the essential thing about algorithms
(as generally understood by mathematicians) is their “mechanical” or
“robot-like” character, it is reasonable to believe that every algorithmic
function can be calculated by some device that one could plausibly regard
as a “machine”.
Thus, from the virtual certainty that simple algorithmic functions are
recursive, we arrive by a chain of highly plausible intuitive arguments at
Church’s thesis that all algorithmic functions must be recursive.

Let us now return to our functional Nex. For certain purposes it will
be useful to analyse how Nex depends on its sequence argument. Let
Nex* be the ternary function obtained by a definition by cases exactly
like Nex, except that on the right-hand side “ξ((u)_{i(2)})” (which is short
for “ξ((u)_{i(u,z,2)})”) is replaced by “(v)_{(u)_{i(2)}}”, and on the left-hand side
“Nex(ξ;u,z)” is replaced by “Nex*(v,u,z)”.
It is then clear that Nex* is a p.r. function and

$$ \mathrm{Nex}(\xi;u,z) = \mathrm{Nex}^*(v,u,z), \quad \text{provided } \xi((u)_{i(2)}) = (v)_{(u)_{i(2)}}. \tag{1} $$

Next, let us put

$$ \bar{\xi}(y) = \prod_{x \le y} p_x^{\xi(x)}. \tag{2} $$

The functional F defined by F(ξ;y) = ξ̄(y) is easily seen to be p.r.: it is
obtained by first composing λxw(p_x^w) with the functional A, and then
applying the product operation. Also, it follows from (2) that

$$ \xi(x) = (\bar{\xi}(y))_x, \quad \text{provided } x \le y. \tag{3} $$
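
A small Python sketch may make (2) and (3) concrete. It assumes that the product in (2) ranges over x ≤ y, and the names `first_primes`, `bar` and `component` are hypothetical helpers introduced only for this illustration.

```python
def first_primes(count):
    """Return the list [p_0, ..., p_{count-1}] = [2, 3, 5, ...]."""
    found, n = [], 1
    while len(found) < count:
        n += 1
        if all(n % p for p in found):
            found.append(n)
    return found


def bar(xi, y):
    """Code of xi(0), ..., xi(y): the product over x <= y of p_x ** xi(x)."""
    code = 1
    for x, p in enumerate(first_primes(y + 1)):
        code *= p ** xi(x)
    return code


def component(code, x):
    """(code)_x: the exponent of p_x in code, for code > 0."""
    p, e = first_primes(x + 1)[x], 0
    while code % p == 0:
        code //= p
        e += 1
    return e
```

With ξ(t) = t + 1 and y = 2 the code is 2^1 · 3^2 · 5^3, and `component` recovers ξ(0), ξ(1), ξ(2) from it. This is what allows the sequence argument of Nex to be replaced by a single number.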

We then have:

8.2. Lemma. Nex* is a p.r. function, and

$$ \mathrm{Nex}(\xi;(y)_t,z) = \mathrm{Nex}^*(\bar{\xi}(y),(y)_t,z). $$

Proof. Let u = (y)_t. Then (u)_{i(2)} < u ≤ y. Therefore, by (3),

$$ \xi((u)_{i(2)}) = (\bar{\xi}(y))_{(u)_{i(2)}}. $$

Putting ξ̄(y) for v in (1) we therefore have Nex(ξ;u,z) = Nex*(ξ̄(y),u,z)
as required. ∎

Let us now take a fixed but arbitrary number n. Let URIM be put in
a state u_0 in which numbers x_1,...,x_n are stored in the first n registers
R_1,...,R_n respectively, while all the other registers as well as K are empty.
Thus,

$$ u_0 = \exp(p_1,x_1) \cdots \exp(p_n,x_n). $$

(If n=0 then this means that u_0=1.) Assuming, as before, that URIM
has been linked up to a ξ-oracle and is operating under the program {z},
it will now go through successive states: beginning with the initial state
u_0, then u_1 = Nex(ξ;u_0,z), then u_2 = Nex(ξ;u_1,z), etc. We now distinguish
two cases regarding this sequence of states:
Case 1. If {z}(ξ;x) ≠ ∞, then for some t the state u_t is a halting state
(i.e., (u_t)_0 ≥ lh(z)). We let h be the first such t and call the number
∏_{t≤h} exp(p_t,u_t) the computation code for (ξ;z,x).
Case 2. If {z}(ξ;x) = ∞, then there is no t for which u_t is a halting state;
in this case the computation code for (ξ;z,x) is undefined (i.e., ∞).

We now come to one of the most important definitions of this chapter.
Its meaning will be explained in the theorem which immediately follows it.

8.3. Definition. For each n we define an (n+3)-ary relation of the 1st
order, T*_n, and an (n+2)-ary relation of the 2nd order, T_n, by the identities

$$ \begin{aligned}
T^*_n(v,z,x,y) \equiv{} & ((y)_0 = \exp(p_1,x_1) \cdots \exp(p_n,x_n)) \\
& \wedge \forall t<\mathrm{lh}(y) \mathbin{\dot{-}} 1\, ((y)_{t+1} = \mathrm{Nex}^*(v,(y)_t,z)) \\
& \wedge \forall t<\mathrm{lh}(y) \mathbin{\dot{-}} 1\, (((y)_t)_0 < \mathrm{lh}(z)) \\
& \wedge ((y)_{\mathrm{lh}(y) \dot{-} 1,\,0} \ge \mathrm{lh}(z)), \\
T_n(\xi;z,x,y) \equiv{} & T^*_n(\bar{\xi}(y),z,x,y).
\end{aligned} $$

When writing “T*_n” and “T_n” next to their arguments, the subscript “n”
may be omitted, since n is uniquely determined by the context.

8.4. Theorem. T*_n and T_n are primitive recursive. For each sequence
ξ and n+2 numbers z, x and y, T(ξ;z,x,y) holds iff y is the computation
code for (ξ;z,x).
Proof. That T*_n and T_n are p.r. can be seen at once from their definition.
To see when T(ξ;z,x,y) holds, substitute ξ̄(y) for v in the identity defining
T*, and use Lemma 8.2. We then see without difficulty that T(ξ;z,x,y)
holds iff the following conditions are satisfied:
y is of the form ∏_{t≤h} exp(p_t,u_t), where u_0 = exp(p_1,x_1)...exp(p_n,x_n), and
for each t<h, u_{t+1} = Nex(ξ;u_t,z) and u_t is not a halting state for the
program {z}; but u_h is a halting state for {z}.
But these are precisely the conditions under which y is the computation
code for (ξ;z,x). ∎

8.5. Definition. U is the unary function defined by U(y) = (y)_{lh(y)∸1,1}.
Clearly, U is a p.r. function. We now prove:

8.6. Theorem. {z}(ξ;x) = U(μyT(ξ;z,x,y)).

Proof. For any given sequence ξ and n+1 numbers z and x, suppose
that URIM is linked up to a ξ-oracle and is operating under the program
{z}, from an initial state where x_1,...,x_n are in R_1,...,R_n respectively and
the other registers as well as K are empty.
By Thm. 8.4, μyT(ξ;z,x,y) is precisely the computation code for (ξ;z,x).
This is defined (i.e., ≠ ∞) iff {z}(ξ;x) ≠ ∞. In case it is defined, Def. 8.5
shows that U(μyT(ξ;z,x,y)) is the 1st exponent in the last positive exponent
in this computation code. But this is precisely the number found in R_1
when URIM comes to a halt, i.e., {z}(ξ;x). ∎
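
The roles of the computation code, of T and of U can be illustrated by the following Python sketch. It is a simplification under several assumptions that are not in the text: programs are lists of command tuples and not code numbers, at most the registers R_1,...,R_5 are used, and the computation code is represented by the list of its exponents u_0,...,u_h instead of the product ∏ exp(p_t,u_t), which would be far too large to handle directly. The bound `max_steps` and all function names are likewise invented for the example.

```python
PRIMES = [2, 3, 5, 7, 11, 13]   # p_0, ..., p_5: counter K and registers R1..R5


def decode(u):
    """Exponents [(u)_0, ..., (u)_5] of a state u > 0 using at most R1..R5."""
    if u <= 0:
        raise ValueError("states are positive numbers")
    exps = []
    for p in PRIMES:
        e = 0
        while u % p == 0:
            u //= p
            e += 1
        exps.append(e)
    return exps


def encode(exps):
    """Inverse of decode: the product of p_i ** exps[i]."""
    u = 1
    for p, e in zip(PRIMES, exps):
        u *= p ** e
    return u


def nex(xi, u, program):
    """Next state; a halting state is its own successor."""
    s = decode(u)
    k = s[0]
    if k >= len(program):
        return u
    cmd = program[k]
    if cmd[0] == 'J':
        _, i, j, m = cmd
        s[0] = m if s[i] == s[j] else k + 1
    else:
        i = cmd[1]
        if cmd[0] == 'Z':
            s[i] = 0
        elif cmd[0] == 'S':
            s[i] += 1
        else:                       # 'A': consult the oracle
            s[i] = xi(s[i])
        s[0] = k + 1
    return encode(s)


def computation(xi, program, inputs, max_steps=1000):
    """States u_0, ..., u_h up to the first halting state, or None if none is seen."""
    u = encode([0] + list(inputs))
    states = [u]
    for _ in range(max_steps):
        if decode(u)[0] >= len(program):
            return states
        u = nex(xi, u, program)
        states.append(u)
    return None


def T(xi, program, inputs, states):
    """Check the four clauses of Def. 8.3 on a candidate list of exponents."""
    if not states or any(u <= 0 for u in states):
        return False
    h = len(states) - 1
    return (states[0] == encode([0] + list(inputs))
            and all(states[t + 1] == nex(xi, states[t], program) for t in range(h))
            and all(decode(states[t])[0] < len(program) for t in range(h))
            and decode(states[h])[0] >= len(program))


def U(states):
    """Content of R1 in the last state, i.e. (y)_{lh(y)-1, 1}."""
    return decode(states[-1])[1]
```

For a program that adds R_2 to R_1, the list returned by `computation` satisfies `T`, and `U` applied to it yields the sum. A proper initial segment of that list fails the last clause of `T`, which reflects the fact that at most one y can satisfy T for given arguments.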

8.7. Corollary. For any given program 𝔓 the functional 𝔓_n is recursive
and we can actually find a description for it. If 𝔓 has no A commands, then
the function 𝔓_n is recursive, and we can find a description for it.
Proof. We encode 𝔓, obtaining the number e = #𝔓. Thus 𝔓 is {e} and
the functional 𝔓_n is {e}_n. By Thm. 8.6, this functional is recursive:

$$ \mathfrak{P}_n(\xi;x) = U(\mu y\, T_n(\xi;e,x,y)). $$

Moreover, tracing back the steps leading to the definitions of U and T_n,
we can get the required description.
If 𝔓 has no A commands, then in the description obtained for the
functional 𝔓_n we may replace the symbol A by S, thus getting a description
for the function 𝔓_n. ∎

From now on we may use the term “recursive” in all contexts (including
the context “ξ-recursive”) as a synonym for “computable”. We have
the right to do so, for our present results combined with those of §5 show
not only that these terms are coextensive, but also provide us with effective
methods for converting a description of a function(al) into a program that
computes it, and vice versa.
From now on, when we say that a recursive function(al) is given, we
mean that a description of it (or, equivalently, a program that computes it)
is specified. Similarly, when we say that such-and-such a recursive
function(al) can be got, or constructed, or found, we mean that a description
of it (or a program computing it) can actually be constructed, given enough
time and patience.
Note, by the way, that since U and Tn are p.r., we can find, for any given
recursive function(al), a description in which the symbol M occurs at
most once. (See proof of Cor. 8.7.)

8.8. Problem. Using Thm. 8.5, find a shorter solution to Prob. 5.4. (Note
that if {e} is a program computing G, then

G(c;x) = U(pyT*(l\exp(ptMt)),e,x,y))-)

*§ 9. Functionals with several sequence arguments

According to the terminology introduced in §1, we have used the term
“functional” for a mapping having only one sequence argument. In this
section we consider a more general case. By an (m,n)-ary functional we shall
mean here a mapping of a set A into N, where A ⊆ (N^N)^m × N^n. Thus,
an n-ary function is a (0,n)-ary functional, and an n-ary functional (in the
terminology of §1) is a (1,n)-ary functional.
We shall indicate how the treatment of this chapter can be extended
to cover (m,n)-ary functionals for any m. For simplicity let us consider
the case m = 2; the cases m = 3,4,... are dealt with by exactly the same
methods.
The treatment of computability (§§3 and 4) can be modified by assuming
URIM to be linked up to two oracles instead of one. We would then need
two kinds of A commands, say A^1_i and A^2_i, for consulting the first and
second oracle respectively.
The treatment of recursiveness (§5) can be modified by introducing,
instead of the symbol A, two symbols A^1 and A^2 as descriptions of (2,1)-ary
functionals A^1 and A^2, where

$$ A^1(\xi,\eta;x) = \xi(x), $$
$$ A^2(\xi,\eta;x) = \eta(x). $$

All the rest of our treatment can then be easily adapted to (2,n)-ary
functionals.
Instead of following these modifications and adaptations step by step,
we can cut through them by reducing (2,n)-ary recursive functionals to
(1,n)-ary ones as follows.
Let [ξ,η] be the sequence λx(2^{ξ(x)} · 3^{η(x)}), which depends on the parameters
ξ and η. Then one can show quite easily that the (2,n)-ary functional F
is recursive (or p.r.) iff for some recursive (or p.r., respectively) (1,n)-ary
functional G we have

$$ F(\xi,\eta;x) = G([\xi,\eta];x). $$

Using this, results about (1,n)-ary recursive (or p.r.) functionals can readily
be generalized to (2,n)-ary functionals.
For this reason, we shall confine ourselves, as before, to dealing with
(1,n)-ary functionals only.
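
The reduction can be tried out in a few lines of Python. In this sketch sequences are modelled as Python functions on the natural numbers, and the names `pair`, `first`, `second` and `lift` are assumptions chosen for the illustration.

```python
def pair(xi, eta):
    """[xi, eta]: the sequence x -> 2 ** xi(x) * 3 ** eta(x)."""
    return lambda x: 2 ** xi(x) * 3 ** eta(x)


def exponent_of(p, n):
    """Exponent of the prime p in the positive number n."""
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e


def first(zeta):
    """Recover xi from zeta = [xi, eta]."""
    return lambda x: exponent_of(2, zeta(x))


def second(zeta):
    """Recover eta from zeta = [xi, eta]."""
    return lambda x: exponent_of(3, zeta(x))


def lift(G):
    """Turn a (1,n)-ary G into the (2,n)-ary F with F(xi, eta; x) = G([xi, eta]; x)."""
    return lambda xi, eta, *x: G(pair(xi, eta), *x)
```

For example, the (2,1)-ary functional F(ξ,η;x) = ξ(x) + η(x) arises as `lift(G)` from the (1,1)-ary functional G(ζ;x) = (ζ(x))_0 + (ζ(x))_1, because both components can be read off the single sequence [ξ,η].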

§10. Fundamental theorems

In this section we shall prove three main results, from which virtually the
whole of recursion theory can be developed.
We start with a recapitulation of the results of §8.

10.1. Normal Form Theorem. We have got a p.r. function U and, for
each n, an (n+3)-ary first-order p.r. relation T*_n and an (n+2)-ary second-order
p.r. relation T_n such that

$$ T(\xi;z,x,y) \equiv T^*(\bar{\xi}(y),z,x,y), \tag{1} $$
$$ \{z\}(\xi;x) = U(\mu y\, T(\xi;z,x,y)). \tag{2} $$

Thus for any given n-ary recursive functional F we can find a number e such that

$$ F(\xi;x) = U(\mu y\, T(\xi;e,x,y)). $$

Moreover, for any φ ∈ N^N and numbers e, a, there is at most one number b
for which T(φ;e,a,b) holds.
Proof. See §5 and §8. ∎

The great power of this result lies mainly in the fact that by (2) (i.e., by
Thm. 8.6) the (n+1)-ary functional F_n defined by

$$ F_n(\xi;x,z) = \{z\}(\xi;x) \tag{10.2} $$

is recursive. This is much stronger than the result we had initially set
out to prove; namely, that for each number e the computable n-ary functional
{e}_n is recursive¹. This recursive functional F_n is universal for all
n-ary recursive functionals, in the following sense: an n-ary functional
F is recursive iff for some number e we have

$$ F(\xi;x) = F_n(\xi;x,e). $$

Also, if 𝔓 is a program computing F_n, then 𝔓 is a universal program for all
n-ary computable functionals in the sense that the program {e}, applied
to the numerical inputs a, yields exactly the same output (if any) as the
program 𝔓 applied to the numerical inputs (a,e). (Intuitively speaking,
𝔓 works as follows: when applied to n+1 numerical inputs (a,e), it decodes
the (n+1)th input e, and applies the resulting program {e} to the
first n inputs a.)
For each n, λzxy[T_n(λt(t+1);z,x,y)] is evidently an (n+2)-ary first-order
p.r. relation. We shall denote it by “T_n” (and, when writing it together
with its arguments, we shall often omit the subscript “n”). There will
be no ambiguity in this, since the reader will always be able to tell from
the context which of the two T_n is being referred to: the first-order or
the second-order relation.
Also, we shall write, e.g., “{z}(x)” instead of “{z}(λt(t+1);x)”. With
this notation we have:

10.3. Normal Form Theorem (for functions). We have got a p.r. function
U and, for each n, an (n+2)-ary first-order p.r. relation T_n, such that

$$ \{z\}(x) = U(\mu y\, T(z,x,y)). $$

For any given n-ary recursive function f, we can find a number e such that

$$ f(x) = U(\mu y\, T(e,x,y)). $$

Moreover, for any numbers e and a there is at most one number b such that
T(e,a,b) holds.
Proof. Immediate from Thms. 10.1, 4.2 and 5.3. ∎

¹ The weaker result is all that we needed for Cor. 8.7.

Here too, the (n+1)-ary recursive function f_n defined by

$$ f_n(x,z) = \{z\}(x) \tag{10.4} $$

is universal for all n-ary recursive functions, in the obvious sense.

In future we shall often refer to Thms. 10.1 and 10.3 briefly as “NFT”.

We shall say that a number e is an index of the n-ary functional F (or
of the n-ary function f) if we have the identity {e}(ξ;x) = F(ξ;x) (or the
identity {e}(x) = f(x)). The NFT says, among other things, that an index
can be found for each recursive function(al). Conversely, a function(al)
is recursive only if it has an index.
We shall now present a few applications of the NFT.

10.5. Example. In Ex. 6.14 we saw that definition by cases is a p.r. operation,
provided the given functionals G_1,...,G_k are total. We shall now show
that, without this proviso, the operation is recursive. Keeping the notation
and assumptions of Ex. 6.14 (except that we no longer assume the G_i to
be total) we let e_i be an index of G_i for i=1,...,k. (If the G_i are given by
means of descriptions or programs, we can actually find such e_i.) Then
the identity

$$ F(\xi;x) = \{\mu z\,[(z=e_1) \wedge R_1(\xi;x) \vee \dots \vee (z=e_k) \wedge R_k(\xi;x)]\}(\xi;x) $$

can easily be verified. It shows that F is recursive if the G_i and the R_i are.
Moreover, it can be used to get a description of F from those of the G_i
and R_i. (Note that we have made essential use of the fact that the universal
functional F_n, defined in (10.2), is recursive.)
10.6. Example. The diagonal function λx[{x}(x)] is recursive by the NFT.
Consider any unary function g which extends the diagonal function, i.e.,
g(x) = {x}(x) for all x such that {x}(x) ≠ ∞. Using our conventions, this
can be expressed more succinctly by the identity

$$ \{x\}(x) = g(x) + 0 \cdot \{x\}(x). $$

We show that g cannot be both total and recursive. If g is recursive, then
so is λx[g(x)+1]. Thus by the NFT there is a number e such that

$$ g(x) + 1 = \{e\}(x). $$

Putting e for x in both our identities¹ we obtain

$$ g(e) + 1 = g(e) + 0 \cdot \{e\}(e). $$

This is possible only if both sides are ∞ (otherwise, {e}(e) is some number b,
and we would have b+1 = b+0·b = b). Thus g(e) = ∞ and g cannot be total.

With each first-order n-ary relation P we associate a decision problem:

Find an algorithm whereby, for each x ∈ N^n, one could find out
whether P(x) holds or not.

If such an algorithm exists, the decision problem for P is said to be solvable
and P itself is said to be decidable.
Clearly, Church's thesis implies that P is decidable iff it is recursive.
Without committing our technical terminology to Church's thesis, we shall
say that the decision problem for P is recursively solvable, and P itself
is recursively decidable if P is a recursive relation; otherwise, we say that
the decision problem is recursively unsolvable and P is recursively undecidable.

10.7. Example. The self-application problem is the decision problem for
the property (i.e., unary relation) λx({x}(x) = ∞). We show that it is
recursively unsolvable. Otherwise, this property would be recursive.
It would then follow (Ex. 10.5) that the function g defined by

$$ g(x) = \begin{cases} \{x\}(x) & \text{if } \{x\}(x) \neq \infty, \\ 0 & \text{if } \{x\}(x) = \infty, \end{cases} $$

is recursive. But then g would be a total recursive extension of the diagonal
function λx[{x}(x)], contrary to Ex. 10.6.
By the way, this justifies our precaution of distinguishing between =
and = (see Ex. 6.7). For we see that λx({x}(x) = ∞) is not recursive
although the functions λx[{x}(x)] and ∞ are recursive.
It follows at once that the decision problem for λzx({z}(x) = ∞), known
as the halting problem², is recursively unsolvable. For, if λzx({z}(x) = ∞)
were recursive, λx({x}(x) = ∞) (which is obtained from it by identification
of variables) would also be recursive.

¹ This is an application of Cantor's diagonal method. We shall use this method many
times in the sequel.
² Because it is concerned with finding, for any given z and x, whether or not URIM
will ever halt when it is linked up to a λt(t+1)-oracle and is operating under the program
{z} with x as numerical input.

10.8. Problem. The vanishing problem is the decision problem for the
relation λzx({z}(x) = 0). Show that it is recursively unsolvable. (Consider
λx({x}(x) ≠ 0). If it were recursive, it would have an index e. Get a contradiction
by examining {e}(e).)
10.9. Problem. Show that there cannot exist a recursive function universal
for all recursive sequences, i.e., a binary total recursive function f such
that for every recursive sequence φ there is a number e for which φ is
λx[f(x,e)]. (If f were such a function, consider the sequence λx[f(x,x)+1].)
10.10. Problem. The totality problem is the decision problem for the
property P such that P(z) holds iff the function λx[{z}(x)] is total. Prove
that it is recursively unsolvable. (Show that if P were recursive this would
contradict Prob. 10.9.)

The last two problems illustrate the fact that in recursion theory it is
more natural to deal with all functions rather than just with total functions.
We cannot even have an algorithm whereby we could tell, for any given
program without A commands, whether the function computed by it is
total or not.
For the second basic result of this section, we shall need the following:

10.11. Lemma. For any given program 𝔓 we can find a unary p.r. function
f_𝔓 such that if y'>y and lh(y') ≥ lh(y) then f_𝔓(y') > f_𝔓(y); and for every
normal program¹ 𝔔, f_𝔓(#𝔔) = #(𝔔𝔓).
Proof. Let e = #𝔓 and put

$$ h(x,y) = \begin{cases} (e)_x \cdot 13^{\mathrm{lh}(y)} & \text{if } (e)_{x,3}>0, \\ (e)_x & \text{otherwise.} \end{cases} $$

Now define f_𝔓 by

$$ f_{\mathfrak{P}}(y) = y \cdot \prod_{x<\mathrm{lh}(e)} \exp(p_{x+\mathrm{lh}(y)}, h(x,y)). $$

Recalling that (e)_{x,3}>0 iff the xth command in 𝔓 is a J command, one can
easily verify that f_𝔓 has the required properties.

¹ For the definitions of normal and of the concatenation see the latter part of §3.

To continue, it will be convenient to lay down:

10.12. Definition. An n-ary function f will be called nice if the following
three conditions hold:

(1) f is primitive recursive;
(2) f(0,0,...,0) > 0;
(3) f is monotone increasing in each argument, i.e., if a_i ≥ b_i for i=1,...,n
and a_i > b_i for some i, then f(a) > f(b).

We now come to the second main result of this section.

10.13. Theorem. Let n and m be any numbers. Then for any given (n+m)-ary
recursive functional F we can find a nice m-ary function g such that

$$ \{g(y_1,\dots,y_m)\}(\xi;x) = F(\xi;x,y_1,\dots,y_m). $$

Proof. Let 𝔓 be a program that computes F and let f_𝔓 be as in Lemma
10.11. For each m-tuple ⟨y_1,...,y_m⟩, let h(y_1,...,y_m) be the code number
of the program having y_1+...+y_m commands: y_1 commands S_{n+1}, followed
by y_2 commands S_{n+2}, ..., followed finally by y_m commands S_{n+m}. Now put

$$ g(y_1,\dots,y_m) = f_{\mathfrak{P}}(h(y_1,\dots,y_m)). $$

It is easy to verify that h is p.r.; and both h(y_1,...,y_m) and lh(h(y_1,...,y_m))
increase when any one of the y_i is increased. Hence g is nice.
If URIM is made to operate under the program {g(y_1,...,y_m)} with
x_1,...,x_n initially stored in the first n registers and the other registers empty,
it will begin by putting y_1,...,y_m in the registers R_{n+1},...,R_{n+m} respectively,
and then proceed as under 𝔓. ∎
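
The construction in this proof can be imitated directly on programs, without passing through code numbers. The Python sketch below is an illustration under assumptions of its own: programs are lists of command tuples, only oracle-free programs (Z, S and J commands) are run, and the names `run`, `shift_jumps` and `specialize` are hypothetical. The function `specialize` plays the part of g: it prepends the S commands that load the parameters and shifts every jump target of the original program by the length of that prefix, as f_𝔓 does on code numbers.

```python
def run(program, inputs, max_steps=10_000):
    """Run an oracle-free URIM program; return the content of R1, or None if no halt is seen."""
    regs = {i + 1: v for i, v in enumerate(inputs)}
    k = 0
    for _ in range(max_steps):
        if k >= len(program):
            return regs.get(1, 0)
        cmd = program[k]
        if cmd[0] == 'J':
            _, i, j, m = cmd
            k = m if regs.get(i, 0) == regs.get(j, 0) else k + 1
            continue
        i = cmd[1]
        if cmd[0] == 'Z':
            regs[i] = 0
        elif cmd[0] == 'S':
            regs[i] = regs.get(i, 0) + 1
        else:
            raise ValueError(f"unsupported command {cmd!r}")
        k += 1
    return None


def shift_jumps(program, offset):
    """Add `offset` to the target of every J command."""
    return [(c[0], c[1], c[2], c[3] + offset) if c[0] == 'J' else c
            for c in program]


def specialize(program, n, params):
    """Program that, on n inputs x, behaves like `program` on (x, y_1, ..., y_m)."""
    prefix = [('S', n + r + 1) for r, y in enumerate(params) for _ in range(y)]
    return prefix + shift_jumps(program, len(prefix))
```

Taking for `program` a four-command routine that adds R_2 to R_1, `specialize(program, 1, (3,))` is a seven-command program that adds 3 to its single input. The sketch relies, as the proof does, on the registers R_{n+1},...,R_{n+m} being empty at the start.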

This theorem can be applied to functions in the following way. If f is
an (n+m)-ary recursive function, we consider the functional F with which
f is identified, i.e.,

$$ F(\xi;x,y_1,\dots,y_m) = f(x,y_1,\dots,y_m). $$

We now apply the theorem to F and substitute the sequence λt(t+1) for ξ.
Thus we get a nice g such that

$$ \{g(y_1,\dots,y_m)\}(x) = f(x,y_1,\dots,y_m). $$

Using Thm. 10.13 we now prove:

10.14. Corollary. For any numbers m and n we can find a nice (m+1)-ary
function S^m_n such that

$$ \{S^m_n(z,y_1,\dots,y_m)\}(\xi;x) = \{z\}(\xi;x,y_1,\dots,y_m). $$

Proof. Put

$$ F(\xi;x,z,y_1,\dots,y_m) = \{z\}(\xi;x,y_1,\dots,y_m). $$

Then F is clearly an (n+m+1)-ary recursive functional and by Thm. 10.13
we can find a nice (m+1)-ary function, which we now call S^m_n, such
that the required identity holds. ∎

We shall refer to Thm. 10.13 as the S^m_n theorem¹. In this connection the
variables y_1,...,y_m will be called parameters.
We now present a few applications of the S^m_n theorem.

10.15. Example. For m=0 and for any n we have by Cor. 10.14,

$$ \{S^0_n(z)\}(\xi;x) = \{z\}(\xi;x). $$

Thus, for any number e, the n-ary functionals {e}_n and {S^0_n(e)}_n are the
same, i.e., e and S^0_n(e) are indexes of the same n-ary functional. Note that
since S^0_n is nice it follows (by induction on e) that S^0_n(e) > e. Applying
S^0_n again and iterating, we get an increasing infinite sequence of indexes
of the same n-ary functional {e}_n.
10.16. Example. Let d be any number, and let P be the property
λz({z}(d) = ∞). We show that P is recursively undecidable (i.e., not
recursive).
Let f be the recursive function defined by the identity
f(y,x) = {x}(x + y ∸ d). Applying the S^m_n thm. to f, with x as parameter, we get
a nice sequence² g satisfying the identity {g(x)}(y) = {x}(x + y ∸ d). In
particular, giving y the value d, we have the identity

$$ \{g(x)\}(d) = \{x\}(x). $$

From this and the definition of P we have

$$ P(g(x)) \equiv (\{x\}(x) = \infty). $$

Now, if P were recursive then the composition of P and g would also be
recursive, contradicting the fact (see Ex. 10.7) that λx({x}(x) = ∞) is
not recursive.

10.17. Problem. Show that, for any number d, the property λz({z}(d) = 0)
is recursively undecidable.
10.18. Problem. Let d be any number and let Q be the property such that
Q(z) holds iff d belongs to the range of the unary recursive function {z}_1
(i.e., {z}(x) = d for some x). Show that Q is not recursive. (Get a nice
sequence h such that

$$ \{h(z)\}(x) = 0 \cdot \{z\}(x) + x. $$

Show that if λzQ(h(z)) were recursive, this would contradict Ex. 10.16.)

¹ The name is justified because Thm. 10.13 and Cor. 10.14 are merely two forms of the
same result. (Problem: Deduce Thm. 10.13 from Cor. 10.14.)
² That is, a nice unary function. Such a function, being p.r., is necessarily total and
hence, by our convention, may be called a sequence.

We now come to the third and last major result of this section.

10.19. Recursion Theorem. Given any (n+2)-ary recursive functional F,
we can find a nice sequence φ such that

$$ \{\varphi(z)\}(\xi;x) = F(\xi;x,\varphi(z),z). $$

Proof. By the S^m_n thm. we get a nice sequence α such that

$$ \{\alpha(y)\}(\xi;x) = \{y\}(\xi;x,y). \tag{*} $$

(Apply the theorem to the functional G defined by G(ξ;x,y) = {y}(ξ;x,y)
taking y as parameter.) Next, the S^m_n thm. yields a nice sequence ψ such that

$$ \{\psi(z)\}(\xi;x,y) = F(\xi;x,\alpha(y),z). \tag{**} $$

Now let φ be the composition of α and ψ, i.e., λz[α(ψ(z))]. Substituting
ψ(z) for y in both (*) and (**), we obtain the required identity

$$ \{\varphi(z)\}(\xi;x) = F(\xi;x,\varphi(z),z). $$

Since φ was obtained by composing two nice sequences, it is clearly nice
as well.
As a corollary we have:

10.20. Corollary. Given any (n+1)-ary recursive functional F, we can
find a number e such that

$$ \{e\}(\xi;x) = F(\xi;x,e). $$

Proof. Obtain F' from F by introducing a redundant variable:

$$ F'(\xi;x,y,z) = F(\xi;x,y). $$

Now apply Thm. 10.19 to F' and take e as φ(d) for arbitrary d. ∎

We shall often refer to Thm. 10.19 and Cor. 10.20 briefly as “the RT”.
Like the S^m_n thm., these results too can be applied to functions. Thus,
if f is a given (n+2)-ary recursive function we get a nice sequence φ such that

$$ \{\varphi(z)\}(x) = f(x,\varphi(z),z). $$

And if f is a given (n+1)-ary recursive function we can find a number
e such that

$$ \{e\}(x) = f(x,e). $$
To explain the meaning (and great power) of the RT, let us first suppose
that we are asked to write a URIM program 𝔓 whose outputs are to
depend in a prescribed way on the sequence ξ (given by an oracle) and
n numerical inputs x, i.e.,

$$ \mathfrak{P}(\xi;x) = F(\xi;x), $$

where F is prescribed in advance. We know that this problem has a solution
iff F is recursive. Moreover, if we can find a description of F (in the sense
of Def. 5.2) then by Thm. 5.5 we can actually construct a program 𝔓
which does the required job. This problem does not therefore pose any
theoretical difficulty; in fact, it is precisely the kind of problem which
a URIM programmer would be expected to solve.
Now suppose that we are asked to write a program 𝔓 whose outputs
are to depend in a prescribed way not only on ξ and x as before, but also
on 𝔓 itself. (To take a simple example, suppose the outputs represent
the costs of running our “computer”, including our own fees for writing
the program, which depend on the length of the program we write.) This
problem is not only much harder, but at first sight we might suspect that it is
insoluble. We may object that we cannot reasonably be asked to write
a program without knowing in advance what this program is supposed
to do; and in the present case what the program is supposed to do depends
on the — yet unwritten — program itself. A programmer confronted
with such a task might well feel like the wise men of Babylon when they
were commanded by Nebuchadnezzar not only to interpret his dream
but to guess what the dream was¹. Like them he might object that this
was not the kind of problem he could fairly be expected to solve.
However, let us see whether something can be done about it after all.
Since every program has the form {e}, and all the information about
{e} is contained in the number e, we may state our problem as follows:
Find a value of the unknown y for which the identity

{y}(ξ; x) = F(ξ; x, y)
will hold, where F is prescribed in advance. We can see at once that a

1 See Daniel, Ch. 2.


CH. 6, §10], FUNDAMENTAL THEOREMS 275

solution may not exist if F is not a recursive functional. On the other


hand, Cor. 10.20 tells us that if F is recursive then a solution e does exist.
(And we can actually find such a solution if we are given a description
of F.)
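
The idea of a program whose output depends on its own text can be tried out in an ordinary programming language. The following Python sketch is not the URIM construction of the text; it uses the familiar self-reproducing template trick, and the names `make_self_aware`, `run`, `template` and `cost` are illustrative assumptions. The "fee" is simply taken to be the length of the program text.

```python
def make_self_aware(F):
    """Return (prog, run), where prog is source text and run(x) == F(x, prog)."""
    # The template refers to itself only through the placeholder {t!r}.
    template = (
        "def run(x):\n"
        "    t = {t!r}\n"
        "    me = t.format(t=t)\n"
        "    return F(x, me)\n"
    )
    prog = template.format(t=template)   # the finished program text
    namespace = {"F": F}
    exec(prog, namespace)                # define run from that text
    return prog, namespace["run"]


def cost(x, source):
    """Prescribed dependence: the input plus a fee per character of program."""
    return x + len(source)


prog, run = make_self_aware(cost)
```

Inside `run`, the variable `me` is rebuilt from the template and equals `prog` character for character, so the program really does consult its own text.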
Furthermore, suppose we are asked to supply an infinite sequence of
programs {φ(z)} for z = 0,1,2,… such that the outputs of the zth program
depend not only on the data ξ and x but also on the program itself, and
the very dependence itself varies with z. Here we must solve an identity
of the form

{φ(z)}(ξ; x) = F(ξ; x, φ(z), z)

for the unknown sequence φ, with prescribed F. Thm. 10.19 shows that
for recursive F a nice solution φ exists. Moreover, given a description
of F we can actually write a “master program” which computes φ.

The RT, in the form of Cor. 10.20, is very useful in situations where
we seek an n-ary recursive functional, say G, such that G satisfies a
condition of the form

G(ξ; x) = Γ,

where Γ is an expression involving ξ, x and G itself. To find a solution,
we replace “G” by “{y}” on both sides. Our condition then takes the form

{y}(ξ; x) = F(ξ; x, y).

If we can show that F here is recursive, then by the RT we can find a value
e for y such that the condition holds. Then {e}_n can be taken as the
required G.

Let us now present a few applications of the RT.

10.21. Example. Let P be the property such that P(z) holds iff z is an
index of the nowhere defined unary function, i.e., {z}(x) = ∞ for all x.
We show that P is not recursive.
Put f(x,y,z) = {z}(y). Applying the RT to f, we get a nice sequence
φ such that {φ(z)}(x) = {z}(φ(z)). Thus

(*) P(φ(z)) = ({z}(φ(z)) = ∞).


Now let g be defined by cases:

if P(z),
otherwise.

If P were recursive, then g would be recursive as well by Ex. 6.2 and Ex. 10.5.
If e is an index of g, the definition of g yields the identity

(**) P(z) = ({e}(z) ≠ ∞).

Putting e for z in (*) and φ(e) for z in (**) we get a contradiction.

10.22. Problem. Let Q and R be the properties such that Q(z) holds iff
{z}(x) = ∞ for infinitely many values of x, and R(z) holds iff the function
λx [{z}(x)] has an infinite range (i.e., takes infinitely many different values).
Show that Q and R are not recursive.

The results of Ex. 10.21, Prob. 10.22 and many similar results (including
some obtained earlier in this section) can also be derived from the following
rather general theorem.

10.23. Theorem. Let P be a property such that:


(1) P(z) holds for some, but not for every, numerical value of z.
(2) If a and b are any two indexes of the same unary function (i.e., if
the identity {a}(x) = {b}(x) holds), then P(a) = P(b).
Then P is not recursive.
Proof. Take numbers a and b such that P(a) holds but P(b) does not.
Define g by

g(y) = b if P(y),
g(y) = a otherwise.

From this definition the identity

(3) P(g(y)) = ¬P(y)

follows at once. On the other hand, if P were recursive, then g and hence
also the binary function λxy({g(y)}(x)), would be recursive as well.
Applying the RT to this binary function, we get for some number e the identity
{e}(x) = {g(e)}(x). By condition (2) we would therefore have P(g(e)) = P(e),
contradicting (3). ∎

We conclude this section with

10.24. Simultaneous Recursion Theorem. Let F and G be given (n+2)-ary
recursive functionals. We can find numbers c and d such that

{c}(ξ; x) = F(ξ; x, c, d),

{d}(ξ; x) = G(ξ; x, c, d).

Proof. By the RT we can find a nice sequence φ such that

{φ(z)}(ξ; x) = F(ξ; x, φ(z), z).

Next, applying the RT to the functional H such that
H(ξ; x, y) = G(ξ; x, φ(y), y), we find a number d such that

{d}(ξ; x) = G(ξ; x, φ(d), d).

Putting c = φ(d) and substituting d for z, we get the required result. ∎

10.25. Problem. Prove the following generalization of the RT. For any
given (n+m+1)-ary recursive functional F we can find a nice m-ary function
g such that

{g(z_1,…,z_m)}(ξ; x) = F(ξ; x, g(z_1,…,z_m), z_1,…,z_m).

(Let G(ξ; x, y, z) = F(ξ; x, y, (z)_1,…,(z)_m). Apply the RT to G and put
z = exp(p_1,z_1)…exp(p_m,z_m).)

10.26. Problem. Given m (n+m)-ary recursive functionals F_i for i = 1,…,m,
show how to find m numbers e_i for which the following identities hold:

{e_i}(ξ; x) = F_i(ξ; x, e_1,…,e_m),   i = 1,…,m.

§ 11. Recursively enumerable sets


From now up to the end of this chapter we confine our attention to
functions rather than functionals. By relation we shall therefore mean
a first-order relation. (Much of what we shall do can be generalized to
second-order relations as well, but we shall not make use of this.)
By n-dimensional set we shall mean any subset of N^n. There is a natural
one-to-one correspondence between n-ary relations and n-dimensional sets.
To each n-ary relation P there corresponds the n-dimensional set
{x : P(x) = 0}, which is called the extension of P. Conversely, given any
A ⊆ N^n we let λx(x∈A) be the n-ary relation whose extension is A, i.e.,
we put

(x∈A) = 0 if x ∈ A,
(x∈A) = 1 if x ∈ N^n − A.

Through this one-to-one correspondence, any definition or result dealing


with n-ary relations can be understood as dealing with n-dimensional
sets, and vice versa. In what follows we shall often make use of this without
special mention. For example, we shall say that an n-dimensional set
A is recursive, meaning that the corresponding relation is recursive.

Let us note that if P is an n-ary relation having A as its extension, then
the extension of ¬P is the complement A^c of A relative to N^n (i.e.,
A^c = N^n − A). If Q is another n-ary relation having B as its extension,
then clearly the extensions of P ∨ Q and P ∧ Q are A∪B and A∩B respectively.
(See Ex. 6.6.)
We introduce the operation of (unbounded) existential quantification
which can be applied to any (n+1)-ary relation P, to yield an n-ary
relation R:

R(x) = ∃y P(x,y).

Here R(a) holds iff there is some number b such that P(a,b) holds.
Similarly, (unbounded) universal quantification can be applied to P to
yield an n-ary relation Q:

Q(x) = ∀y P(x,y).

Here Q(a) holds iff P(a,b) holds for every number b.
We shall soon see that these operations are not recursive.

11.1. Definition. A relation is recursively enumerable (briefly, r.e.) if it


is obtained by existential quantification from a recursive relation. (For
a justification of this terminology, see Thm. 11.5 below.)

It is easy to see that a recursive relation is r.e., for

P(x) = ∃y [P(x) ∧ (y=y)],

and λxy [P(x) ∧ (y=y)] is a recursive relation if P is. But we shall soon
show that not every r.e. relation is recursive.
We shall make frequent implicit use of:

11.2. Lemma. The following operations applied to r.e. relations yield r.e.
relations:
(1) composition with total recursive functions;
(2) existential quantification;
(3) conjunction;
(4) disjunction;
(5) bounded existential quantification;
(6) bounded universal quantification.
Proof. (1) If

R(z_1,…,z_k) = ∃y P(z_1,…,z_k,y),

then

R(f_1(x),…,f_k(x)) = ∃y P(f_1(x),…,f_k(x),y).

If P is a recursive relation and f_1,…,f_k are total recursive functions, then
λxy P(f_1(x),…,f_k(x),y) is a recursive relation.
(2) Using Ex. 6.13 we have

∃y ∃z P(x,y,z) = ∃u P(x,K(u),L(u)),

and, if P is recursive, so is λxu P(x,K(u),L(u)).
(3) We have

∃y P(x,y) ∧ ∃y Q(x,y) = ∃y ∃z [P(x,y) ∧ Q(x,z)]

and the two consecutive existential quantifiers ∃y ∃z can be contracted
into one as in (2).
(4) Similar to (3).
(5) ∃w<u ∃y P(x,w,y) = ∃w ∃y [(w<u) ∧ P(x,w,y)].
(6) ∀w<u ∃y P(x,w,y) = ∃z ∀w<u P(x,w,(z)_w). ∎
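
Part (2) of the proof, contracting two existential quantifiers into one, can be illustrated by a single unbounded search over codes of pairs. The sketch below assumes the Cantor pairing as a stand-in for the functions K and L of Ex. 6.13 (the book's own pairing may differ); `unpair`, `exists_pair` and `sum_of_two_squares` are hypothetical names. The search halts only when a witness exists, which is exactly the behaviour of an r.e. relation.

```python
from itertools import count

def unpair(u):
    """Inverse of the Cantor pairing: return (K(u), L(u))."""
    w = 0
    while (w + 1) * (w + 2) // 2 <= u:   # find the diagonal containing u
        w += 1
    l = u - w * (w + 1) // 2
    return w - l, l

def exists_pair(P, x):
    """One search over u replaces the two quantifiers 'exists y, exists z'."""
    for u in count():
        y, z = unpair(u)
        if P(x, y, z):
            return y, z

def sum_of_two_squares(x, y, z):
    return y * y + z * z == x

y, z = exists_pair(sum_of_two_squares, 25)
```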
By the graph of the n-ary function f we mean the (n+1)-dimensional set

{⟨x, f(x)⟩ : x ∈ dom(f)}.

(Recall that, by the conventions of §1, dom(f) = {x : f(x) ≠ ∞}.) Note
that the graph of f is the extension of the relation λxu(f(x)=u).
The recursion-theoretic interest in r.e. sets1 stems in part from the
following:

11.3. Theorem. Let A be the graph of an n-ary function f. Then A is r.e.
iff f is recursive.
Proof. If f is recursive, we can find an index e of f by the NFT and we have

(⟨x,u⟩ ∈ A) = ∃y (T(e,x,y) ∧ (U(y) = u)),

which shows that A is an r.e. set.
Conversely, if A is r.e., then for some (n+2)-ary recursive relation R
we have

(⟨x,u⟩ ∈ A) = ∃y R(x,u,y).

It is then easy (using Ex. 6.13) to verify the identity

f(x) = K(μz R(x,K(z),L(z))),

from which it follows that f is recursive.

1 That is, extensions of r.e. relations.



We note that if A is the graph of an n-ary total recursive function f then
A is recursive:

(⟨x,u⟩ ∈ A) = (f(x) = u)

(see Ex. 6.7).

11.4. Theorem. A relation is recursive iff both it and its negation are
recursively enumerable.
Proof. If R is a recursive relation, then so is ¬R (see Ex. 6.6). Hence
R and ¬R are certainly recursively enumerable.
Conversely, suppose

R(x) = ∃y P(x,y),

¬R(x) = ∃y Q(x,y),

where P and Q are recursive relations. Then we can easily verify the identity

R(x) = P(x, μy [P(x,y) ∨ Q(x,y)]),

which shows R to be recursive. ∎
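
The identity used in this proof describes a search that always halts, because for every x a witness for P or for Q must turn up. A minimal Python sketch, with Python booleans standing in for the book's 0/1 truth values and with an illustrative choice of R ("x is a perfect square"); the names `decide`, `P` and `Q` below are assumptions of the example:

```python
from itertools import count

def decide(P, Q, x):
    """Decide R(x), given R(x) <=> exists y P(x,y) and not R(x) <=> exists y Q(x,y)."""
    for y in count():          # the least y with P(x,y) or Q(x,y)
        if P(x, y):
            return True
        if Q(x, y):
            return False

def P(x, y):                   # witness that x is a perfect square
    return y * y == x

def Q(x, y):                   # witness that x lies strictly between two squares
    return y * y < x < (y + 1) * (y + 1)

squares_below_20 = [x for x in range(20) if decide(P, Q, x)]
```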

If f_1,…,f_n are unary functions, we say that the n-tuple ⟨f_1,…,f_n⟩ enumerates
the n-dimensional set

{⟨f_1(t),…,f_n(t)⟩ : t ∈ dom(f_1) ∩ … ∩ dom(f_n)}.

The following theorem provides a justification for the term “r.e. set” and
gives useful characterizations of r.e. sets.

11.5. Theorem. For any n-dimensional set A, the following four conditions
are equivalent:
(1) A is an r.e. set;
(2) A is the domain of some n-ary recursive function;
(3) A = ∅ or A is enumerated by an n-tuple of p.r. sequences;
(4) A is enumerated by an n-tuple of recursive unary functions.
Proof. (1)⇒(2). Suppose

(x∈A) = ∃y R(x,y),

where R is a recursive relation. Then if f is the n-ary function λx(μy R(x,y)),
it is clear that f is recursive. Also, it is not difficult to see that dom(f) = A.
(2)⇒(3). Suppose A = dom(f), where f is an n-ary recursive function.
By the NFT, f possesses an index e and we have f(x) = U(μy T(e,x,y)),
hence it follows that

(x∈A) = ∃y T(e,x,y).

If A ≠ ∅, choose any a∈A and define sequences φ_i for i = 1,…,n as follows:

φ_i(t) = (t)_i if T(e,(t)_1,…,(t)_n,(t)_0),
φ_i(t) = a_i otherwise.

Then the φ_i are p.r. (see Ex. 6.14). We show that ⟨φ_1,…,φ_n⟩ enumerates A.
First, take t such that T(e,(t)_1,…,(t)_n,(t)_0) holds. Then ∃y T(e,(t)_1,…,(t)_n,y)
holds as well, so that ⟨(t)_1,…,(t)_n⟩ ∈ A. But in this case

⟨φ_1(t),…,φ_n(t)⟩ = ⟨(t)_1,…,(t)_n⟩

by the definition of the φ_i. On the other hand, if T(e,(t)_1,…,(t)_n,(t)_0) fails
to hold then

⟨φ_1(t),…,φ_n(t)⟩ = a.

Thus for every numerical value of t we have ⟨φ_1(t),…,φ_n(t)⟩ ∈ A. It follows
that the n-tuple ⟨φ_1,…,φ_n⟩ enumerates a subset of A. Now suppose c∈A.
Then ∃y T(e,c,y) holds, so that T(e,c,b) holds for some (unique) number b.
For

t = 2^b exp(p_1,c_1)…exp(p_n,c_n)

we see without difficulty that ⟨φ_1(t),…,φ_n(t)⟩ = c. Thus ⟨φ_1,…,φ_n⟩
enumerates the whole of A.
(3)⇒(4). If A = ∅, let f be the unary function which is never defined.
Then f is recursive (e.g., f(t) = μy(t+y = t+y+1)) and the n-tuple ⟨f,…,f⟩
enumerates A. If A is enumerated by an n-tuple of p.r. sequences then
(4) certainly holds.
(4)⇒(1). Let A be enumerated by the n-tuple ⟨f_1,…,f_n⟩, where the f_i are
recursive unary functions. For each i, let B_i be the graph of f_i. Then

(x∈A) = ∃t [(⟨t,x_1⟩ ∈ B_1) ∧ … ∧ (⟨t,x_n⟩ ∈ B_n)],

so that A is r.e. by Thm. 11.3 (and Lemma 11.2). ∎
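
The step (2)⇒(3) packs a candidate element and a candidate halting time into one number t and emits the element only when the check succeeds. The following Python sketch shows this for n = 1. The predicate `T` is a toy stand-in for Kleene's T-predicate (an assumption of this example: the "computation" on x halts after exactly x/3 steps when x is a multiple of 3, and never otherwise), and `exponent`, `phi` and `A_DEFAULT` are hypothetical names.

```python
def exponent(t, p):
    """Exponent of the prime p in t (t >= 1); plays the role of (t)_i."""
    e = 0
    while t % p == 0:
        t //= p
        e += 1
    return e

def T(x, y):
    """Toy halting check: the computation on x halts after exactly y steps."""
    return x % 3 == 0 and y == x // 3

A_DEFAULT = 0   # a fixed member a of A, emitted when the check fails

def phi(t):
    c = exponent(t, 3)   # candidate element (t)_1
    b = exponent(t, 2)   # candidate halting time (t)_0
    return c if T(c, b) else A_DEFAULT

enumerated = {phi(t) for t in range(1, 3000)}
```

Here A is the set of multiples of 3, and the element c = 3 first appears at t = 2^1 · 3^3 = 54, as the formula for t in the proof predicts.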

We shall denote by “dom_n{e}” the domain of the n-ary recursive function
having e as index. With this notation, Thm. 11.5 yields:

11.6. Corollary. The n-dimensional r.e. sets are precisely the sets dom_n{e}
(where e = 0,1,…). The (n+1)-ary relation λxz(x ∈ dom_n{z}) is recursively
enumerable; in fact (x ∈ dom_n{z}) = ∃y T(z,x,y).

Proof. The first assertion is merely a re-statement of the equivalence
between (1) and (2) of Thm. 11.5. The second assertion follows from the
proof that (2)⇒(3) in the same theorem. ∎

We shall say that e is an index of the n-dimensional r.e. set dom_n{e}.


When we say that an r.e. set is given (or can be found or constructed etc.)
we mean that an index of it is given (or can be found or constructed etc.).
Note that if an (n+1)-ary recursive relation R is given and the r.e. set
A is defined by the identity

(x∈A) = ∃y R(x,y),

then (using the proof of Thm. 11.5) we can find an index of A.

11.7. Corollary. Each n-ary r.e. relation is obtained by existential
quantification from an (n+1)-ary p.r. relation.
Proof. Obvious from the identity

(x ∈ dom_n{e}) = ∃y T(e,x,y)

and the fact that T_n is a p.r. relation. ∎

11.8. Counter-Example. The property λz(z ∈ dom_1{z}) is r.e. by Cor. 11.6.
But its negation cannot be r.e., for in that case we would have for some e

(z ∉ dom_1{z}) = (z ∈ dom_1{e})

and putting z = e we would get a contradiction. Thus (by Thm. 11.4) the
property λz(z ∈ dom_1{z}) cannot be recursive. (This result is merely a
reformulation of Ex. 10.7.)

The following result throws additional light on the difference between


recursive and r.e. sets.

11.9. Theorem. A set A ⊆ N is recursive iff A is finite or A is enumerated
by a monotone increasing recursive sequence.
Proof. Suppose that A is recursive. Then by the RT there is a unary
recursive function f such that

f(t) = μx [(x∈A) ∧ ∀z<t (f(z)<x)].

(To see this, replace “f” on both sides by “{e}”.) It is clear that the
successive values of f are the members of A in increasing order, until A
is exhausted, at which point the values of f become ∞. Thus f enumerates
A; and if A is infinite f is total, i.e., a sequence.

It is clear that a finite set is recursive. (If A = {a_1,…,a_k}, then

(x∈A) = (x≠x) ∨ (x=a_1) ∨ … ∨ (x=a_k).)

Finally, suppose that A is enumerated by the monotone recursive
sequence φ. Then we have

(x∈A) = ∃t≤x (φ(t)=x),

so that A is recursive. ∎
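
The last identity is a bounded search: an increasing sequence satisfies φ(t) ≥ t, so a value x can only occur among the first x+1 terms. A short Python sketch, in which `member` and the sample enumeration `squares` are illustrative names rather than anything from the text:

```python
def member(phi, x):
    """Decide x in A when phi enumerates A in strictly increasing order."""
    return any(phi(t) == x for t in range(x + 1))   # only t <= x can matter

def squares(t):
    return t * t

found = [x for x in range(30) if member(squares, x)]
```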

11.10. Problem. Show that every infinite n-dimensional r.e. set B has an
infinite recursive subset. (First deal with the case n = 1. Then, for n > 1,
consider the one-dimensional set

B' = {exp(p_1,x_1)…exp(p_n,x_n) : x ∈ B}.)

Using Thm. 11.5 and Cor. 11.6, we can adapt the S^m_n-thm. and the RT
to r.e. sets. We devote the rest of the present section to this task.

11.11. Theorem. For any given (n+m)-ary r.e. relation R we can find a nice
m-ary function g such that

(x ∈ dom_n{g(y_1,…,y_m)}) = R(x,y_1,…,y_m).

Also,

(x ∈ dom_n{S^m_n(z,y_1,…,y_m)}) = (⟨x,y_1,…,y_m⟩ ∈ dom_{n+m}{z}).

Proof. By Thm. 11.5, we can find an (n+m)-ary recursive function f such
that dom(f) is the extension of R, i.e.,

(⟨x,y_1,…,y_m⟩ ∈ dom(f)) = R(x,y_1,…,y_m).

By the S^m_n thm. we find a nice m-ary function g such that

{g(y_1,…,y_m)}(x) = f(x,y_1,…,y_m).

Thus

(x ∈ dom_n{g(y_1,…,y_m)}) = (⟨x,y_1,…,y_m⟩ ∈ dom(f)).

This proves the first part of our theorem. The second part follows at once
from Cor. 10.14. ∎

We shall refer to Thm. 11.11 as the S^m_n theorem for r.e. sets.

11.12. Problem. Given a unary recursive function f, find a nice sequence
φ such that

dom_1{φ(z)} = dom_1{z} if z ∈ dom(f),
dom_1{φ(z)} = ∅ otherwise.

(Consider the binary relation R such that

R(x,z) = (x ∈ dom_1{z}) ∧ (z ∈ dom(f)).)

11.13. Recursion Theorem for R.E. Sets. Given an (n+2)-ary r.e. relation
R, we can find a nice sequence φ such that

(x ∈ dom_n{φ(z)}) = R(x,φ(z),z).

Proof. Left as an easy exercise for the reader. ∎

§12. Diophantine relations

In what follows polynomial will mean polynomial with integer (possibly
negative) coefficients. But the variables of a polynomial are still only
allowed to range over N, unless otherwise stated.
We extend to polynomials the notation introduced in Ex. 6.7. Thus, e.g.,
if p is a polynomial, we regard the expression “(p(x)=0)” as having the
value 0 for x∈N^n such that p(x) vanishes, and 1 for all other x∈N^n. (The
use of negative coefficients is merely a convenience, not a necessity, since
“(p(x)=0)” can always be re-written as “(f(x)=g(x))”, where f and g are
polynomials with natural number coefficients.)
We shall say that an n-ary relation E is elementary if, for some polynomial
p in n variables,

E(x) = (p(x) = 0).

Note that if E' is obtained from E by introducing a redundant variable,
i.e., E'(x,y) = E(x), then E' is elementary as well, because
E'(x,y) = (p(x) + 0·y = 0). We shall often make use of this fact without special
mention.

12.1. Lemma. The disjunction and conjunction of n-ary elementary relations
are themselves elementary.
Proof. Let p and q be two polynomials in n variables. Then

(p(x) = 0) ∨ (q(x) = 0) = (p(x)·q(x) = 0),

(p(x) = 0) ∧ (q(x) = 0) = ([p(x)]² + [q(x)]² = 0). ∎
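
Both identities are easy to check on a grid of natural numbers. In the Python sketch below the polynomials `p` and `q` are arbitrary illustrative choices, and Python booleans replace the book's 0/1 values:

```python
from itertools import product

def p(x, y):
    return x - 2 * y

def q(x, y):
    return x + y - 6

def disj(x, y):                       # one equation for "p = 0 or q = 0"
    return p(x, y) * q(x, y) == 0

def conj(x, y):                       # one equation for "p = 0 and q = 0"
    return p(x, y) ** 2 + q(x, y) ** 2 == 0

grid = list(product(range(10), repeat=2))
common_zeros = [(x, y) for x, y in grid if conj(x, y)]
```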

An n-ary relation P is diophantine if for some m ≥ 0 and some (n+m)-ary
elementary relation E the identity

P(x) = ∃y_1 … ∃y_m E(x,y_1,…,y_m)

holds. An n-dimensional set A is diophantine if it is the extension of a
diophantine relation, i.e., the relation λx(x∈A) is diophantine. Note that
an n-ary function f has a diophantine graph (or, as we shall say briefly,
d.g.) iff λxu(f(x)=u) is a diophantine relation.

It is easy to see that every diophantine relation is recursively enumerable.


The rest of this chapter is devoted mainly to proving the converse — every
r.e. relation is diophantine. This result yields a (negative) solution to
Hilbert’s Tenth Problem1, and is of great importance in itself.

12.2. Lemma. Let P be an n-ary relation obtained in one of the following ways:
(1) by existential quantification from an (n+1)-ary diophantine relation;
(2) by conjunction from n-ary diophantine relations;
(3) by disjunction from n-ary diophantine relations.
Then P itself is diophantine.
Proof. Case (1) is obvious. To deal with case (2), let

P(x) = ∃y_1…∃y_k D(x,y_1,…,y_k) ∧ ∃z_1…∃z_m E(x,z_1,…,z_m),

where the relations D and E are elementary. Here we may assume that the
variables y_1,…,y_k are all different from the variables z_1,…,z_m. Then

P(x) = ∃y_1…∃y_k ∃z_1…∃z_m (D(x,y_1,…,y_k) ∧ E(x,z_1,…,z_m)),

so that by Lemma 12.1 P is diophantine. (By introducing redundant
variables we can replace D and E by (n+k+m)-ary elementary relations
D' and E'.)
Case (3) is treated similarly. ∎

12.3. Theorem. The p.r. functions described by the symbols Z, S and I_{n,i}
(where 1≤i≤n) have diophantine graphs. A function obtained by composition
from functions having d.g.s has a d.g. as well.
Proof. The first part of the theorem is left as an easy problem to the reader.
Now, suppose g is a k-ary function and h_1,…,h_k are n-ary functions, and let

f(x) = g(h_1(x),…,h_k(x)).

Then

(f(x)=y) = ∃y_1…∃y_k [(h_1(x)=y_1) ∧ … ∧ (h_k(x)=y_k) ∧ (g(y_1,…,y_k)=y)].

Hence, by Lemma 12.2, if g,h_1,…,h_k have d.g.s, so does f. ∎

1 See §16.


12.4. Theorem. If P is a diophantine relation and g_1,…,g_k are total functions
with d.g.s, then the relation λx [P(g_1(x),…,g_k(x))] is diophantine. A relation
with d.g. is diophantine.
Proof. The first part of our theorem is proved in the same way as the last
part of Thm. 12.3.
Now suppose that P is an n-ary relation with a d.g., i.e., the (n+1)-ary
relation λxy(P(x)=y) is diophantine. But P(x) = (P(x) = 0), so that P is
diophantine as well. ∎

If we can show that a function obtained by primitive recursion from two
total functions with d.g.s has itself got a d.g., then by Thm. 12.3 it will
follow that every p.r. function has a d.g. (see Def. 5.2 and the text following
it). In particular, by Thm. 12.4 it will follow that every p.r. relation
is diophantine. Since by Cor. 11.7 every r.e. relation is obtained by existential
quantification from a p.r. relation, it will then follow from Lemma 12.2
that every r.e. relation is diophantine. Unfortunately, to prove that the
class of total functions with d.g.s is closed under the operation of primitive
recursion is no easy task, and we shall have to work pretty hard for it.
We start with:

12.5. Definition. Gödel's function β is the ternary function¹

λxyz [rm(x, y(z+1)+1)].

To study the behaviour of the function β we shall need the following
elementary number-theoretic result.

12.6. Chinese Remainder Theorem. For n≥0, let d_0,d_1,…,d_n be any
positive numbers that are mutually prime (i.e., no two of them have a
non-trivial common factor). Let d be their product d_0…d_n. Then for any given
n+1 numbers r_i<d_i (where i = 0,1,…,n) we can find a number c<d such
that r_i = rm(c,d_i) for all i≤n.
Proof. Let us say that an (n+1)-tuple of numbers ⟨r_0,…,r_n⟩ is good if
r_i<d_i for all i≤n. Then there are exactly d different good (n+1)-tuples.
Each number c gives us a good (n+1)-tuple: ⟨rm(c,d_0),…,rm(c,d_n)⟩.
Allowing c to take all values <d, we get d such (n+1)-tuples. We shall
show that they are all different; hence it follows at once that every good
(n+1)-tuple must be so obtained, as the theorem claims.
Indeed, suppose that 0≤c<c'<d and c yields the same (n+1)-tuple as c'.
Then rm(c,d_i) = rm(c',d_i) for all i≤n. Therefore c'−c must be divisible

1 For the definition of rm see Ex. 6.11.



by each d_i, and, since the d_i are mutually prime, it follows that c'−c
must be divisible by their product d. But this is impossible, because
0<c'−c<d. ∎

12.7. Theorem. For n≥0, let r_0,r_1,…,r_n be any given numbers. Then we can
find numbers a and c such that β(c,a,i) = r_i for all i≤n.
Proof. Let s be any number greater than n and each of the r_i (e.g., take
s = n+r_0+…+r_n+1). Put a = s! and for each i≤n let d_i = a(i+1)+1.
Clearly, r_i<d_i. We shall show that the d_i are mutually prime. Suppose
that i<j≤n and d_i had a prime factor p in common with d_j. Then
p would divide d_j−d_i, that is, a(j−i). Since p is prime, it would therefore
divide a or j−i. But p cannot divide a, because we have assumed that
p divides d_i = a(i+1)+1. Also, p cannot divide j−i, because 0<j−i≤n<s,
so that j−i must divide s! = a, and we have just seen that p does not divide a.
By the Chinese Remainder Theorem, we find a number c such that
rm(c,d_i) = r_i for all i≤n. This is the required result, in view of our definition
of the d_i and Def. 12.5. ∎
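
The proof is constructive, and the construction can be followed step by step in code. The Python sketch below chooses s and a = s! as in the proof and then solves the congruences one modulus at a time; the incremental solving with a modular inverse is a substitute for the counting argument of Thm. 12.6, not the book's method. The names `beta` and `encode` are illustrative, and `pow(d, -1, m)` assumes Python 3.8 or later.

```python
from math import factorial

def beta(c, a, i):
    """Goedel's function: the remainder of c on division by a*(i+1)+1."""
    return c % (a * (i + 1) + 1)

def encode(rs):
    """Return (c, a) with beta(c, a, i) == rs[i] for every index i."""
    n = len(rs) - 1
    s = n + sum(rs) + 1              # greater than n and than every r_i
    a = factorial(s)
    c, d = 0, 1                      # invariant: c is correct modulo d
    for i, r in enumerate(rs):
        di = a * (i + 1) + 1         # coprime to d by the argument in the proof
        k = ((r - c) * pow(d, -1, di)) % di
        c += d * k                   # still correct mod d, now also mod di
        d *= di
    return c, a

rs = [2, 0, 1]
c, a = encode(rs)
decoded = [beta(c, a, i) for i in range(len(rs))]
```

The pair (c, a) thus stores the whole finite sequence, and β reads off any term with a single remainder operation.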

12.8. Lemma. The relations ≤ and < are diophantine. The functions rm and
β have diophantine graphs.
Proof. Use the identities

(x≤y) = ∃z(x+z=y),

(x<y) = (x+1≤y),

(rm(x,y)=z) = (y=0) ∧ (z=x) ∨ (z<y) ∧ ∃u(x=yu+z),

β(x,y,z) = rm(x, y(z+1)+1),

and apply 12.2, 12.3 and 12.4. ∎

Now let the function f be obtained from total functions g and h by
primitive recursion:

f(x,0) = g(x),

f(x,y+1) = h(x,y,f(x,y)).

Then, for any a∈N^n and numbers b and c, the equality f(a,b) = c is clearly
equivalent to the following condition:
There are numbers r_0,r_1,…,r_b such that r_0 = g(a); and h(a,t,r_t) = r_{t+1}
for every t<b; and r_b = c.


In view of Thm. 12.7 we therefore have the identity

(f(x,y)=z) = ∃u ∃v [(β(u,v,0)=g(x))
    ∧ ∀t<y (h(x,t,β(u,v,t)) = β(u,v,t+1))
    ∧ (β(u,v,y)=z)].
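
To see the identity at work, one can take a small primitive recursion and look for the two numbers u and v by brute force. In the Python sketch below the functions `g` and `h` are illustrative choices (they give f(x,y) = x·(y+1)), `matrix` is the bracketed condition of the identity, and `witness` is a bounded search whose bound of 100 is an assumption that happens to suffice for this tiny example.

```python
def beta(c, a, i):
    return c % (a * (i + 1) + 1)

def g(x):
    return x                       # f(x, 0) = x

def h(x, t, r):
    return r + x                   # f(x, t+1) = f(x, t) + x

def matrix(u, v, x, y, z):
    """The bracketed condition, for given candidate numbers u and v."""
    return (beta(u, v, 0) == g(x)
            and all(h(x, t, beta(u, v, t)) == beta(u, v, t + 1) for t in range(y))
            and beta(u, v, y) == z)

def witness(x, y, z, bound=100):
    """Search for u, v below the bound; None if no pair is found."""
    for u in range(bound):
        for v in range(bound):
            if matrix(u, v, x, y, z):
                return u, v
    return None

good = witness(1, 2, 3)            # f(1, 2) = 3, so a pair should exist
bad = witness(1, 2, 4)             # f(1, 2) != 4, so no pair can exist
```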

If we can prove that the class of diophantine relations is closed under


bounded universal quantification, then this identity (together with earlier
results of the present section) shows that the class of total functions with
d.g.s is closed under primitive recursion. As pointed out after Thm. 12.4,
this implies that every r.e. relation is diophantine.1
The following three sections (§§13-15) are devoted to proving that the
class of diophantine relations is in fact closed under bounded universal
quantification. On a first reading of this book, the reader who so wishes
may accept this result without proof and go directly to §16.

*§ 13. The Fibonacci sequence

In the present section and in §14, we let φ be the Fibonacci sequence defined
in Prob. 6.15:

(1) φ(0) = 0, φ(1) = 1, φ(x+2) = φ(x+1)+φ(x).

We shall be especially interested in the sequence λx [φ(2x)]. From (1) we
see without difficulty

(2) φ(0) = 0, φ(2) = 1, φ(2x+4) = 3φ(2x+2) − φ(2x).

13.1. Theorem. φ(2x+2) ≥ 2^x and φ(2x) < 3^x for all x.
Proof. The first inequality follows from (1) by induction on x. (Observe
that φ(x+1) ≥ φ(x) and hence φ(x+2) ≥ 2φ(x) for all x.)
The second inequality follows by induction on x using (2). ∎
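
The recurrence (2) and the two bounds of Thm. 13.1 can be spot-checked numerically. A small Python sketch, in which `fib` plays the role of φ and the range of 40 values is an arbitrary choice:

```python
def fib(n):
    """The Fibonacci sequence with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

xs = range(40)
rec_ok = all(fib(2 * x + 4) == 3 * fib(2 * x + 2) - fib(2 * x) for x in xs)
lower_ok = all(fib(2 * x + 2) >= 2 ** x for x in xs)
upper_ok = all(fib(2 * x) < 3 ** x for x in xs)
```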

The rest of this section is devoted to proving that the sequence λx [φ(2x)]
has a diophantine graph.²

1 As will be seen in §16, this in turn yields a negative solution to Hilbert's Tenth Problem.
In fact, it was precisely the conjecture that Hilbert’s problem has a negative solution
which led mathematicians to try to show that every r.e. relation is diophantine.
2 The motivation for proving this result is explained below in §17.

13.2. Lemma.

(u² − uv − v² = 1) = ∃x [(u = φ(2x+1)) ∧ (v = φ(2x))].

Proof. Using (1) it is easy to verify by induction on x that

(3) (φ(x+1))² − φ(x)·φ(x+1) − (φ(x))² = (−1)^x.

Next, we show that if u and v are numbers such that

(4) (u² − uv − v²)² = 1, with u>0,

then

(5) for some x, u = φ(x+1) and v = φ(x).

We do this by induction on u+v. For v = 0, (4) reduces to u = 1, so that
by (1) we have (5), with x = 0.
Now assume (4) with v>0. Then u≥v; otherwise we would get
u²−uv−v² ≤ −2. We put u' = v and v' = u−v; thus u = u'+v' and v = u'.
Substituting this in (4), we find that

(u'² − u'v' − v'²)² = 1.

Also, u' = v > 0 by assumption. Since u'+v' = u < u+v, we can use the induction
hypothesis to conclude that u' = φ(y+1) and v' = φ(y) for some y.
Hence we find

u = φ(y+1)+φ(y) = φ(y+2), v = φ(y+1).

Thus (5) holds with x = y+1.
From (3), and the fact that (4)⇒(5), the assertion of the lemma follows
at once.
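
The lemma says that the solutions of u² − uv − v² = 1 in natural numbers are exactly the pairs (φ(2x+1), φ(2x)). A brute-force Python check over a small box; the bound `N = 200` and the helper names are illustrative:

```python
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

N = 200
solutions = [(u, v) for u in range(N) for v in range(N)
             if u * u - u * v - v * v == 1]

expected = []
x = 0
while fib(2 * x + 1) < N:            # pairs (fib(2x+1), fib(2x)) inside the box
    expected.append((fib(2 * x + 1), fib(2 * x)))
    x += 1
```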

We shall make use of the identity

(6) φ(x+y) = φ(x+1)·φ(y) + φ(x)·φ(y−1),

which holds for all values of x and all positive values of y. This identity
is easily verified by induction on y, using (1).
We shall also make use of the fact that any two consecutive Fibonacci
numbers, say φ(x) and φ(x+1), are always mutually prime. This can be
seen at once from (1) by induction on x.
We put

x|y = ∃z(xz=y).

Thus the relation | is diophantine. (We shall only use the symbol | for
divisibility of natural numbers, not negative integers.)

13.3. Lemma. φ(x)|φ(xy) holds for all x and y.
Proof. For x = 0 the result is trivial, so we may assume x>0. We use
induction on y. The cases y = 0 and y = 1 are again trivial.
Now let y = z+1, where z is positive. Then xz is positive as well and
by (6) we have

φ(xy) = φ(x+xz)
      = φ(x+1)·φ(xz) + φ(x)·φ(xz−1).

Using the induction hypothesis we get the required result. ∎

13.4. Lemma. If φ(x)|φ(y) holds and x≠2, then x|y holds.
Proof. We may assume that y>x>2, because all other cases are trivial.
We can thus put y = xz+u, where u<x and z>0.
By (6) we have

φ(y) = φ(u+1)·φ(xz) + φ(u)·φ(xz−1).

By Lemma 13.3, φ(x)|φ(xz) holds; also, the consecutive Fibonacci numbers
φ(xz−1) and φ(xz) are mutually prime, hence φ(xz−1) and φ(x) are mutually
prime as well. Thus φ(x)|φ(u) must hold.
But, since u<x and x>2, it is clear that φ(u)<φ(x). It follows that
φ(u) = 0, i.e., u = 0. ∎
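
Lemmas 13.3 and 13.4 together say that, apart from the exceptional index 2, divisibility among Fibonacci numbers mirrors divisibility among their indices. A numerical spot-check in Python, where `divides` follows the definition x|y = ∃z(xz = y) on natural numbers (so 0 divides only 0) and the ranges are arbitrary:

```python
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def divides(x, y):
    """x | y on natural numbers: there is z with x*z == y."""
    return y == 0 if x == 0 else y % x == 0

R = range(25)
lemma_13_3 = all(divides(fib(x), fib(x * y)) for x in R for y in R)
lemma_13_4 = all(divides(x, y) for x in R for y in R
                 if x != 2 and divides(fib(x), fib(y)))
# The case x = 2 really is exceptional: fib(2) = 1 divides fib(3), but 2 does not divide 3.
exception_at_2 = divides(fib(2), fib(3)) and not divides(2, 3)
```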

Our next lemma looks a bit ugly, but it contains most of the spade-work
needed for the two neat lemmas following it.

13.5. Lemma. For x>0 and 0<u≤z, put

f(u,x,z) = (z−u)·φ(xu−1)·φ(x) + φ(xu)·φ(x−1).

Then φ(x)|z holds iff (φ(x))²|f(u,x,z) holds.
Proof. By induction on u. First, take u = 1. We have

f(1,x,z) = zφ(x−1)·φ(x),

and the result follows from the fact that φ(x) and φ(x−1) are mutually
prime.
Next, let u = v+1, where v>0. We wish to express f(u,x,z) in terms of
f(v,x,z). To this end, we observe that

φ(xu−1) = φ(xv−1+x)
        = φ(xv)·φ(x) + φ(xv−1)·φ(x−1), by (6);

φ(xu) = φ(xv+x)
      = φ(xv+1)·φ(x) + φ(xv)·φ(x−1), by (6);

φ(xv+1) = φ(xv)+φ(xv−1), by (1).

Using these identities we find without difficulty

f(u,x,z) = f(v,x,z)·φ(x−1) + r·φ(x)·φ(xv),

where r = (z−u)·φ(x)+φ(x−1). From this, Lemma 13.3, and the fact that
φ(x) and φ(x−1) are mutually prime, we get the result

(φ(x))²|f(u,x,z) holds iff (φ(x))²|f(v,x,z) holds.

This completes the induction.

13.6. Lemma. For all x and y, if (φ(x))²|φ(y) then φ(x)|y holds as well.
Proof. We may assume y>x>2, because all other cases are trivial.
Suppose (φ(x))²|φ(y) holds. By Lemma 13.4 we have y = xz, where
z>0. Using Lemma 13.5 (with u = z), we see that φ(x)|z, and hence also
φ(x)|y, must hold.

13.7. Lemma. For all x and y, if x·φ(x)|y holds then (φ(x))²|φ(y) holds
as well.
Proof. The case y = 0 is trivial, so we may assume y>0.
If x·φ(x)|y holds, then x>0 and y = xz for some z>0. Thus φ(x)|z
holds and from Lemma 13.5 (with u = z) we see that (φ(x))²|φ(y)·φ(x−1)
must hold. The required result follows at once, since φ(x) and φ(x−1)
are mutually prime.
13.8. Lemma. Put d = φ(2b)+φ(2b+2) and consider the sequence
ψ = λx[rm(φ(2x),d)]. We have:

(7) rm(φ(2x),d) = φ(2x) if b>0 and x≤b+1,

(8) rm(φ(2x),d) = d − φ(4b−2x+2) if b≤x≤2b.

Also, the sequence ψ is periodic with period 2b+1, i.e.,

(9) rm(φ(2(x+2b+1)),d) = rm(φ(2x),d).



Proof. (7) is obvious, since under the conditions stated φ(2x) ≤ φ(2b+2) < d.
To prove (8), observe that, under the condition stated,

0 ≤ φ(2b) ≤ d − φ(4b−2x+2) ≤ d − φ(2) < d.

Hence it will be enough to prove the congruence

(10) φ(2x) ≡ −φ(4b−2x+2) (mod d)

for b≤x≤2b. We do this by induction on x. For x = b and x = b+1,
(10) is immediate from the definition of d. For the induction step, put
x = z+2, then use (2) in the form

φ(2z+4) = 3φ(2z+2) − φ(2z);

now apply the induction hypothesis to z and z+1, and use (2) again in
the form

φ(4b−2z+2) = 3φ(4b−2z) − φ(4b−2z−2).

Actually, this induction proves (10) for b≤x≤2b+1. Moreover, (10)
holds even for x<b. For, in this case we put u = 2b−x+1; then b+1<u≤2b+1,
so by what we have already proved (10) holds with x replaced
by u. This yields (10) for x as well.
It remains to prove (9). Again, we work with congruences mod d. We
can re-state (9) thus:

(9') φ(2x+4b+2) ≡ φ(2x).

We prove this by induction on x. For x = 0, we have to show φ(4b+2) ≡ 0;
but this is the same as (10) for x = 0.
For x = 1, we must show that φ(4b+4) ≡ 1. Putting x = 2b in (10), we
obtain φ(4b) ≡ −1. But by (2) and by what we have already shown,

φ(4b+4) = 3φ(4b+2) − φ(4b) ≡ 0−(−1) = 1.

For the induction step we put x = z+2 and use (2) again. ∎
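
The three claims of Lemma 13.8 can be checked numerically for small b. In the Python sketch below, `check` tests (7) for x ≤ b+1 when b > 0, (8) for b ≤ x ≤ 2b, and the periodicity (9) over an arbitrary span of 60 values; these ranges are this example's reading of the lemma, and the names are illustrative.

```python
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def check(b, span=60):
    d = fib(2 * b) + fib(2 * b + 2)

    def psi(x):
        return fib(2 * x) % d

    c7 = b == 0 or all(psi(x) == fib(2 * x) for x in range(b + 2))
    c8 = all(psi(x) == d - fib(4 * b - 2 * x + 2) for x in range(b, 2 * b + 1))
    c9 = all(psi(x + 2 * b + 1) == psi(x) for x in range(span))
    return c7 and c8 and c9

all_ok = all(check(b) for b in range(8))
```

For b = 2, for instance, d = 11 and the remainders of φ(2x) run 0, 1, 3, 8, 10 and then repeat with period 5.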

We now define an auxiliary binary function g by the identities

(11) g(w,0) = 0, g(w,1) = 1, g(w,x+2) = w·g(w,x+1) ∸ g(w,x).

We shall be interested in g(w,x) for w≥2 only. For such w it is easy to
see (by induction on x) that g(w,x+1)>g(w,x); hence the “∸” in (11)
may be replaced by “−”.
Using (11), it is easy to see by induction on x that

(12) g(w,x) ≡ x (mod w−2), for w≥2.

Also, using (2) and (11) it can easily be seen by induction on x that

(13) φ(2x) ≡ g(w,x) (mod w−3), for w>2.
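
Congruences (12) and (13) say that g(w,x) reduces to x modulo w−2 and to φ(2x) modulo w−3. A Python spot-check, using ordinary subtraction in the recurrence (legitimate for w ≥ 2, as just noted); the boundary cases w = 2 and w = 3, where the modulus is 0, are tested as plain equalities. The ranges are arbitrary.

```python
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def g(w, x):
    """g(w,0) = 0, g(w,1) = 1, g(w,x+2) = w*g(w,x+1) - g(w,x), for w >= 2."""
    a, b = 0, 1
    for _ in range(x):
        a, b = b, w * b - a
    return a

cong_12 = all((g(w, x) - x) % (w - 2) == 0
              for w in range(3, 12) for x in range(20))
cong_13 = all((fib(2 * x) - g(w, x)) % (w - 3) == 0
              for w in range(4, 12) for x in range(20))
```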

The following lemma is similar to Lemma 13.2.



13.9. Lemma. Let w>2. Then the conditions

(14) u² − wuv + v² = 1, with u≤v,

(15) for some x, u = g(w,x) and v = g(w,x+1),

are equivalent.
Proof. (15)⇒(14) is easily verified by induction on x, using (11).
To prove (14)⇒(15), we proceed by induction on wv−u. (This cannot
be negative, since (14) says u≤v.) For u = 0, (14) reduces to v = 1; so (15)
holds with x = 0.
Now let (14) hold with u>0. Re-arranging (14), we have

u² = 1 + v(wu−v), 0<u≤v.

From this we get 0≤wu−v<u. We put u' = wu−v and v' = u. Thus u = v'
and v = wv'−u'. Substituting this in (14) we find that u'²−wu'v'+v'² = 1.
Also, as we have seen, u' = wu−v<u = v'. Since

wv'−u' = v < (w−1)v ≤ wv−u,

we may use our induction hypothesis to conclude that u' = g(w,y) and
v' = g(w,y+1) for some y. Therefore u = g(w,y+1) and (using (11))
v = g(w,y+2), so (15) holds with x = y+1. ∎
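
As with Lemma 13.2, the equivalence can be confirmed by exhaustive search in a small box. In the Python sketch below, `solutions` lists the pairs satisfying (14) with u ≤ v below a bound, and `expected` lists the consecutive values of g below the same bound; the bound 200 and the range of w are arbitrary choices.

```python
def g(w, x):
    a, b = 0, 1
    for _ in range(x):
        a, b = b, w * b - a
    return a

def solutions(w, bound):
    """Pairs (u, v) with u <= v < bound and u^2 - w*u*v + v^2 = 1."""
    return [(u, v) for u in range(bound) for v in range(u, bound)
            if u * u - w * u * v + v * v == 1]

def expected(w, bound):
    """Pairs (g(w,x), g(w,x+1)) whose larger member is below the bound."""
    out, x = [], 0
    while g(w, x + 1) < bound:
        out.append((g(w, x), g(w, x + 1)))
        x += 1
    return out

agree = all(solutions(w, 200) == expected(w, 200) for w in range(3, 8))
```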

We can now prove the main result of this section:

13.10. Theorem. The function $\lambda x[\varphi(2x)]$ has a diophantine graph.

Proof. We show that, for any numbers $x$ and $y$, $(\varphi(2x)=y)$ holds iff there
are numbers $r,s,t,u,v,z$ and $w$ such that
(i) $x \le y<t$,
(ii) $t^2-tz-z^2=1$,
(iii) $r^2-2rs-4s^2=1$,
(iv) $t^2 \mid r$,
(v) $w=3+(4s+r)s$,
(vi) $u^2-wuv+v^2=1$,
(vii) $x=\mathrm{rm}(u,t)$,
(viii) $y=\mathrm{rm}(u,4s+r)$.
By the results of §12, the theorem will then be proved.
RECURSION THEORY [CH. 6, §13
294

Suppose first that, for given x and y, there are numbers r,s,t,u,v,z and w
satisfying the eight conditions (i)-(viii).
By Lemmas 13.2 and 13.9 we obtain, from (ii), (iii) and (vi),

$t=\varphi(a)$, $r=\varphi(2b+1)$, $2s=\varphi(2b)$, $u=g(w,c)$,

for some $a$, $b$ and $c$. Let us put

$d=\varphi(2b)+\varphi(2b+2)$;

then by (1) we have also

$d=2\varphi(2b)+\varphi(2b+1)=4s+r$.

By (viii) we now have

$y=\mathrm{rm}(g(w,c),d)$.

But from (v) and (13) we get

$\varphi(2c) \equiv g(w,c) \pmod{ds}$,

and therefore we can also write

$y=\mathrm{rm}(\varphi(2c),d)$.

Dividing $c$ by $2b+1$ we have $c=q\cdot(2b+1)+e$, where $q$ is the quotient, and
the remainder is $e \le 2b$. By the last part of Lemma 13.8, we now have

$y=\mathrm{rm}(\varphi(2e),d)$,

because $c \equiv e \pmod{2b+1}$.

We want to know whether $e \le b$ or $b<e\le 2b$. If the latter were the case,
then by (8)

$y=d-\varphi(4b-2e+1) \ge d-\varphi(2b)=\varphi(2b+2)$.

But by (i) and (iv)

$y<t\le r=\varphi(2b+1)<\varphi(2b+2)$.

Thus we conclude that $e \le b$. If $b=0$, then $d=\varphi(0)+\varphi(2)=1$, hence we
have $y=\mathrm{rm}(\varphi(2e),1)=0$ and by (i) also $x=0$; thus $y=\varphi(2x)$ as required.
We may therefore assume $b>0$ and use (7), getting

$y=\varphi(2e)$.

We must show that $x=e$. If $e=0$, then $y=\varphi(0)=0$ and again we see from
(i) that $x=0$. If $e>0$, then by Thm. 13.1 and (i)

$e \le 2^{e-1} \le \varphi(2e)=y<t$.

From (vii) we have $x=\mathrm{rm}(u,t)=\mathrm{rm}(g(w,c),t)$. But from (iii)-(v) it is easy
to verify that $t \mid w-2$, hence by (12) we get

$x=\mathrm{rm}(c,t)$.

Also, since $t=\varphi(a)$ and $r=\varphi(2b+1)$, we conclude from (iv) and Lemma 13.6
that $t \mid 2b+1$. Thus, since we know that $e<t$, we have

$x=\mathrm{rm}(c,t)=\mathrm{rm}(q\cdot(2b+1)+e,t)=e$,

as required.
Now for the converse. Let x and y be numbers such that <p(2x) =y. We
have to find r,s,t,u,v,z and w to satisfy (i)-(viii).
We choose
$t=\varphi(24x+1)$, $z=\varphi(24x)$.
Clearly, (i) is satisfied; and, by Lemma 13.2, (ii) is satisfied as well.
From (1) it is easy to see that $\varphi(a)$ is even iff $a \equiv 0 \pmod 3$. Hence $t$ is
odd. Also, it is easy to see that if $a \equiv 1 \pmod 8$ then $\varphi(a) \equiv 1 \pmod 3$.
Hence $t \equiv 1 \pmod 3$ and $t(24x+1)-1 \equiv 0 \pmod 3$. It follows that
$\varphi(t(24x+1)-1)$ is even. Put
$r=\varphi(t(24x+1))$, $s=\tfrac12\varphi(t(24x+1)-1)$.

Then (iii) and (iv) are satisfied because of Lemmas 13.2 and 13.7.
Next, choose w as dictated by (v) and take

u=g(w,x), v=g(w,x+1),

so that (vi) holds by Lemma 13.9.


As in the first part of this proof, we see from (iii)-(v) that $t \mid w-2$.
Therefore by (12),

$\mathrm{rm}(u,t)=\mathrm{rm}(g(w,x),t)=\mathrm{rm}(x,t)$.

But by (i) we have $x<t$. Hence $\mathrm{rm}(x,t)=x$ and (vii) is satisfied.

Finally, $4s+r \mid w-3$ by (v), so (13) yields

$\mathrm{rm}(u,4s+r)=\mathrm{rm}(g(w,x),4s+r)=\mathrm{rm}(\varphi(2x),4s+r)$.

But $\varphi(2x)=y$, and clearly $y<4s+r$ by (i) and (iv). Thus $\mathrm{rm}(u,4s+r)=y$
and (viii) is satisfied.

*§ 14. The power function

The main result of the present section is that the power function $\lambda yz(y^z)$
has a diophantine graph.
We start by making another excursion into elementary number theory.
The equation

(1) $x^2-Ay^2=1$,

where $A$ is a given natural number that is not a square, is known (mistakenly)
as Pell’s equation. To solve it is to find all natural solutions, i.e., all pairs
$(x,y)\in N^2$ satisfying (1). We propose to solve (1) in the important special
case where

(2) $A=a^2-1$, $a>1$.

The reader should check that, because of (2), $A^{1/2}$ must be irrational.
If $(x,y)$ and $(x',y')$ are two solutions, then $x<x'$ iff $y<y'$. Thus the
solutions can be arranged in increasing order of both components. We let
$(X_z(a),Y_z(a))$ be the $z$th solution (in the above ordering), beginning with
$z=0$. (We shall soon see that a $z$th solution exists for all $z$.) The 0th and
1st solutions are evident: $(1,0)$ and $(a,1)$. They are known as the trivial
and fundamental solution respectively. The complete solution is given by:

14.1. Theorem. Equation (1), subject to condition (2), has infinitely many
solutions. The zth solution is uniquely determined by

(3) $X_z(a)+Y_z(a)\cdot A^{1/2}=(a+A^{1/2})^z$.

Proof. Let $z$ be any fixed number. By multiplying out $(a+A^{1/2})^z$, and
separating even and odd powers of $A^{1/2}$, we obtain

(4) $x+yA^{1/2}=(a+A^{1/2})^z$,

where $x$ and $y$ are natural numbers. Moreover, $x$ and $y$ are uniquely
determined by (4): if $x+yA^{1/2}=x'+y'A^{1/2}$, then $x=x'$ and $y=y'$ because
$A^{1/2}$ is irrational.

If in the right-hand side of (4) we replace $A^{1/2}$ by $-A^{1/2}$, then the terms
containing even powers of $A^{1/2}$ are unchanged, while those containing odd
powers of $A^{1/2}$ change signs. Thus

(5) $x-yA^{1/2}=(a-A^{1/2})^z$.

Multiplying (4) by (5), and using (2), we find that $x^2-Ay^2=1$. Thus the
pair $(x,y)$ determined by (4) is a solution of (1).

Also, it is clear that in (4) $x$ and $y$ increase monotonically with $z$. So (4)
gives us infinitely many solutions, in the correct order.
It only remains to show that every solution satisfies (4) for some $z$. Let,
therefore, $(x,y)$ be any solution of (1). Then $x+yA^{1/2} \ge x \ge 1$. On the
other hand, by (2) we have $a+A^{1/2}>1$. It follows that $x+yA^{1/2}$ must lie
between two consecutive natural powers of $a+A^{1/2}$, i.e., for some (unique)
$z$ we have

(6) $(a+A^{1/2})^z \le x+yA^{1/2} < (a+A^{1/2})^{z+1}$.

By (2) we have

(7) $(a+A^{1/2})(a-A^{1/2})=1$.

Therefore, multiplying (6) by $(a-A^{1/2})^z$, i.e. by $(a+A^{1/2})^{-z}$, we get

(8) $1 \le p+qA^{1/2} < a+A^{1/2}$,

where $p$ and $q$ are (uniquely determined) integers such that

(9) $(x+yA^{1/2})(a-A^{1/2})^z=p+qA^{1/2}$.

Here $p$ and $q$ can be obtained by multiplying out the left-hand side, and
separating even and odd powers of $A^{1/2}$. If in the left-hand side of (9)
we replace $A^{1/2}$ by $-A^{1/2}$, then the terms with even powers of $A^{1/2}$ are
unchanged, while those with odd powers change signs:

$(x-yA^{1/2})(a+A^{1/2})^z=p-qA^{1/2}$.

Multiplying this equality by (9) and using (7) as well as our assumption
that $(x,y)$ is a solution of (1), we get

$1=(p+qA^{1/2})(p-qA^{1/2})$.

Using this and (7), we can take reciprocals in (8):

(10) $a-A^{1/2} < p-qA^{1/2} \le 1$.

By (7), $a-A^{1/2}=(a+A^{1/2})^{-1}>0$. Hence by (10) we have

$0<p-qA^{1/2}$.

Adding this last inequality to the left half of (8) we find that $1<2p$. Since
$p$ is an integer we must therefore have $p \ge 1$. From this and the right half
of (10) it follows that $q \ge 0$.
On the other hand, adding the left half of (10) to the right half of (8)
and re-arranging, we get $qA^{1/2}<A^{1/2}$; hence $q<1$. Thus $q$ must be 0.

From the right half of (10) it now follows that $p \le 1$. But we have seen
above that also $p \ge 1$. Hence $p=1$.
Putting $p=1$ and $q=0$ in (9) and using (7), we obtain (4), as claimed. ∎
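Theorem 14.1 can be illustrated with exact integer arithmetic. The sketch below (function names are ours) computes the pair $(x,y)$ with $x+yA^{1/2}=(a+A^{1/2})^z$ by repeated multiplication by $a+A^{1/2}$, and compares the result with a naive search for all solutions of (1) below a bound.

```python
from math import isqrt


def pell_power(a, z):
    """Return (x, y) with x + y*sqrt(A) == (a + sqrt(A))**z, where A = a*a - 1."""
    A = a * a - 1
    x, y = 1, 0
    for _ in range(z):
        # (x + y*sqrt(A)) * (a + sqrt(A)) = (a*x + A*y) + (x + a*y)*sqrt(A)
        x, y = a * x + A * y, x + a * y
    return x, y


def pell_bruteforce(a, y_bound):
    """All natural solutions (x, y) of x^2 - (a^2 - 1) y^2 = 1 with y < y_bound."""
    A = a * a - 1
    out = []
    for y in range(y_bound):
        x2 = 1 + A * y * y
        x = isqrt(x2)
        if x * x == x2:
            out.append((x, y))
    return out


if __name__ == "__main__":
    print(pell_bruteforce(2, 60))
    print([pell_power(2, z) for z in range(5)])
```

Both lists agree for the small cases tried, in line with the claim that the powers of $a+A^{1/2}$ enumerate all solutions in increasing order.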

The following seven lemmas are concerned with various aspects of the
behaviour of $X_z(a)$ and $Y_z(a)$ as $z$ and $a$ vary. In the first four of these
lemmas $a$ will be kept fixed and only $z$ will vary, so we put, for the sake
of brevity,

$x_z=X_z(a)$, $y_z=Y_z(a)$.

14.2. Lemma.

$x_0=1$, $y_0=0$;
$x_1=a$, $y_1=1$;
$x_{z+2}=2ax_{z+1}-x_z$, $y_{z+2}=2ay_{z+1}-y_z$.

Proof. The 0th and 1st solutions have already been obtained above. The
remaining pair of equalities are obtained as follows:

$x_{z+2}+y_{z+2}A^{1/2}=(a+A^{1/2})^{z+2}$ by 14.1,

$=(a+A^{1/2})^z(2a(a+A^{1/2})+A-a^2)$

$=(a+A^{1/2})^z(2a(a+A^{1/2})-1)$ by (2),

$=2a(a+A^{1/2})^{z+1}-(a+A^{1/2})^z$

$=(2ax_{z+1}-x_z)+(2ay_{z+1}-y_z)A^{1/2}$ by 14.1.

The required result follows from the irrationality of $A^{1/2}$. ∎

14.3. Lemma. $x_{z+1}=ax_z+Ay_z$.

Proof.
$x_{z+1}+y_{z+1}A^{1/2}=(a+A^{1/2})^{z+1}$

$=(a+A^{1/2})^z(a+A^{1/2})$

$=(x_z+y_zA^{1/2})(a+A^{1/2})$

$=(ax_z+Ay_z)+(x_z+ay_z)A^{1/2}$. ∎

14.4. Lemma. $a^z \le x_z \le (2a)^z$.

Proof. In multiplying out $(a+A^{1/2})^z$, the terms with even power of $A^{1/2}$
include $a^z$. Since all these terms are non-negative, we have $a^z \le x_z$ by
Thm. 14.1.

It follows at once from (2) that $A^{1/2}<a$. Hence, by Thm. 14.1,

$x_z \le x_z+y_zA^{1/2}=(a+A^{1/2})^z \le (2a)^z$. ∎

14.5. Lemma. $y_z \equiv z \pmod{a-1}$.

Proof. By induction on $z$, using Lemma 14.2. ∎
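Lemmas 14.2 to 14.5 are all statements about integer sequences, so they can be checked mechanically for small parameters. In the sketch below (names are illustrative) the sequences are generated by the recurrence of Lemma 14.2 and then tested against equation (1) and Lemmas 14.3, 14.4 and 14.5.

```python
def pell_seq(a, n):
    """First n terms of x_z and y_z, generated by the recurrence of Lemma 14.2."""
    xs, ys = [1, a], [0, 1]
    while len(xs) < n:
        xs.append(2 * a * xs[-1] - xs[-2])
        ys.append(2 * a * ys[-1] - ys[-2])
    return xs[:n], ys[:n]


def lemmas_hold(a, n=12):
    """Spot-check equation (1) and Lemmas 14.3-14.5 for z < n (requires a > 1)."""
    A = a * a - 1
    xs, ys = pell_seq(a, n)
    pell = all(x * x - A * y * y == 1 for x, y in zip(xs, ys))
    l143 = all(xs[z + 1] == a * xs[z] + A * ys[z] for z in range(n - 1))
    l144 = all(a ** z <= xs[z] <= (2 * a) ** z for z in range(n))
    l145 = all((ys[z] - z) % (a - 1) == 0 for z in range(n))
    return pell and l143 and l144 and l145


if __name__ == "__main__":
    print(pell_seq(3, 4))
    print(all(lemmas_hold(a) for a in range(2, 8)))
```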

14.6. Lemma. Let $S$ be the diophantine relation defined by

$S(u,v) \equiv \exists x\,\exists y\,[(x^2-(u^2-1)(u-1)^2y^2=1) \wedge (x>1) \wedge (u>1) \wedge (v=ux)]$.

Then if $S(b,c)$ holds, we have $c \ge b^b$. Also, for every $b>1$ we can find some
$c$ such that $S(b,c)$ holds.
Proof. Suppose that $S(b,c)$ holds. Then for some $x$ and $y$ we have

$x^2-(b^2-1)((b-1)y)^2=1$,

where
$x>1$, $b>1$, $c=bx$.

It follows that for some $z$

$x=X_z(b)$, $(b-1)y=Y_z(b)$.

Moreover, $z>0$ because $x>1$. From Lemma 14.5 we get

$0 \equiv (b-1)y \equiv z \pmod{b-1}$.

Therefore $z$ is a positive multiple of $b-1$. Hence, using Lemma 14.4,

$x=X_z(b) \ge b^z \ge b^{b-1}$,

so that $c=bx \ge b^b$ as claimed.

Now suppose $b>1$. Put

$x=X_{b-1}(b)$, $c=bx$.

Then certainly $x>X_0(b)=1$. Also, by Lemma 14.5 we see that

$Y_{b-1}(b) \equiv 0 \pmod{b-1}$,

so that $Y_{b-1}(b)$ is a multiple of $b-1$ and for some $y$ we have
$Y_{b-1}(b)=(b-1)y$. It now follows that

$x^2-(b^2-1)(b-1)^2y^2=1$,

and we see that $S(b,c)$ holds. ∎



14.7. Lemma. Let $y>1$ and $a>y^z$. Then

$X_z(a)\cdot y^z \le X_z(ay) < X_z(a)\cdot(y^z+1)$.

Proof. We start by using Thm. 14.1. Expanding the right-hand side of (3),
ignoring terms with odd powers of $A^{1/2}$ and recalling that $A=a^2-1$, we
obtain
$X_z(a)=\sum\binom{z}{2k}a^{z-2k}(a^2-1)^k$,
where the summation is over all $k \le \tfrac12 z$. Hence

$X_z(a)\cdot y^z=\sum\binom{z}{2k}(ay)^{z-2k}(a^2y^2-y^2)^k$,

$X_z(ay)=\sum\binom{z}{2k}(ay)^{z-2k}(a^2y^2-1)^k$,

where the summations are again over $k \le \tfrac12 z$. The left half of our lemma
now follows at once, since our assumptions imply that

$0<a^2y^2-y^2<a^2y^2-1$.

By comparing the expressions for $X_z(a)$ and $X_z(ay)$ we see that in order
to prove the remaining part of the lemma it is enough to show, for all $k \le \tfrac12 z$,

$y^{z-2k}(a^2y^2-1)^k<(a^2-1)^k(y^z+1)$.

But for $k \le \tfrac12 z$ we clearly have $y^{z-2k}(a^2y^2-1)^k \le a^{2k}y^z$; hence it is
sufficient to show

$a^{-2k}(a^2-1)^k>y^z(y^z+1)^{-1}$.

Now, by the well-known Bernoulli inequality,

$a^{-2k}(a^2-1)^k=(1-a^{-2})^k \ge 1-ka^{-2}$.

Next, since $k \le z<y^z<a$, we have

$1-ka^{-2}>1-a^{-1}$.

Finally, since $y^z+1 \le a$, we have

$1-a^{-1} \ge 1-(y^z+1)^{-1}=y^z(y^z+1)^{-1}$,

as required. ∎

14.8. Lemma. Let $y>1$ and $a>y^z$. Then the double inequality

$X_z(a) \le X_w(ay) \le X_z(a)\cdot a$

holds iff $w=z$.

Proof. If $w=z$ then the double inequality follows directly from Lemma 14.7,
using our assumptions $y>1$ and $a \ge y^z+1$.
Since $X_w(ay)$ increases with $w$, it remains to show that

$X_{z-1}(ay)<X_z(a)$, $X_{z+1}(ay)>X_z(a)\cdot a$

(the former only for positive $z$ and the latter for any $z$). This is easily done,
using Lemmas 14.7 and 14.3. We leave the details to the reader. ∎
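Lemma 14.7 says that $y^z$ is the integer quotient of $X_z(ay)$ by $X_z(a)$ once $a>y^z$, and Lemma 14.8 says that the index $w=z$ is singled out by a double inequality. The sketch below illustrates both for small values. Note the circularity, which is harmless in a demonstration only: the code picks $a=y^z+1$ using ordinary exponentiation, whereas in the diophantine definition $a$ is simply quantified existentially. All names are ours.

```python
def X(z, a):
    """X_z(a): x-component of the z-th solution, via the recurrence of Lemma 14.2."""
    if z == 0:
        return 1
    prev, cur = 1, a
    for _ in range(z - 1):
        prev, cur = cur, 2 * a * cur - prev
    return cur


def power_by_quotient(y, z):
    """Recover y**z as the quotient of X_z(a*y) by X_z(a), with a = y**z + 1 (requires y > 1)."""
    a = y ** z + 1
    return X(z, a * y) // X(z, a)


def indices_in_window(y, z, w_max=12):
    """All w < w_max with X_z(a) <= X_w(a*y) <= X_z(a)*a, for a = y**z + 1."""
    a = y ** z + 1
    lo = X(z, a)
    hi = lo * a
    return [w for w in range(w_max) if lo <= X(w, a * y) <= hi]


if __name__ == "__main__":
    print(power_by_quotient(3, 2))
    print(indices_in_window(3, 2))
```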
We now come to the main result of this section:

14.9. Theorem. The function $\lambda yz(y^z)$ has a diophantine graph.


Proof. We claim: for any numbers $x$, $y$ and $z$, the equality $x=y^z$ holds iff:
$z=0$ and $x=1$; or
$z>0$ and $y \le 1$ and $x=y$; or
$z>0$ and $y>1$ and $x>0$ and there are numbers $a,b,c,u,v,r$ and $s$ such that
(i) $y+z<b$,
(ii) $S(b,c)$ holds,
(iii) $x+c+z+3<a$,
(iv) $1<r<\varphi(2a)$,
(v) $r^2-(a^2-1)(z+s(a-1))^2=1$,
(vi) $rx \le u<r(x+1)$,
(vii) $u^2-(a^2y^2-1)v^2=1$.
(In (ii), $S$ is the diophantine relation of Lemma 14.6; and in (iv), $\varphi$ is the
Fibonacci sequence.)
It is clear that once we establish this claim, the theorem will be proved.
(Here we are making use of Thm. 13.10.)
To establish our claim, we may assume

(*) z>0, y> 1, x>0,

because the remaining cases are trivial.


Suppose, first, that there are a,b,c,u,v,r and s satisfying conditions (i)-(vii).
We have to show that x=yz.
By (v) we have for some $n$

$r=X_n(a)$, $z+s(a-1)=Y_n(a)$.

We would like to show that $n=z$. Using (iv), Thm. 13.1, (iii) and Lemma
14.4 we have

$1<r<\varphi(2a)<3^a<a^a \le X_a(a)$.


Thus $1<r<X_a(a)$; so by the ordering of the solutions of our Pellian equation
it follows that

$0<n<a$.

On the other hand, by our assumption (*) and by (iii) we also have

$0<z<a$.

By Lemma 14.5,

$z \equiv z+s(a-1)=Y_n(a) \equiv n \pmod{a-1}$.

Therefore $n$ must in fact be $z$.

Next, from (vii) we see that $u=X_w(ay)$ for some $w$. Thus (vi) yields

$X_z(a)\cdot x \le X_w(ay)<X_z(a)\cdot(x+1)$;

hence using (*) and (iii) we have

$X_z(a) \le X_w(ay) \le X_z(a)\cdot a$.

At this point we wish to apply Lemma 14.8, so we need to check that
$y>1$ and $y^z<a$. Now, $y>1$ by (*); also, by (i), (ii), Lemma 14.6 and (iii)
we have
$y^z<b^b \le c<a$.

Thus we may apply Lemma 14.8 and conclude that $w=z$. We can now
re-write (vi) as follows:

$X_z(a)\cdot x \le X_z(ay)<X_z(a)\cdot(x+1)$.

On the other hand, by Lemma 14.7,

$X_z(a)\cdot y^z \le X_z(ay)<X_z(a)\cdot(y^z+1)$.

The last two double inequalities mean that $x=q(X_z(ay),X_z(a))$ and
$y^z=q(X_z(ay),X_z(a))$ respectively (see Ex. 6.11); hence $x=y^z$ as required.
Now for the converse. Retaining our assumption (*), we let $x=y^z$.
First, we choose $b$ so that (i) is satisfied. In particular, by (i) and (*) we have
$b>1$, so by Lemma 14.6 we can find $c$ such that (ii) is satisfied as well.
Next, we choose $a$ to satisfy (iii). Moreover, since $2^{a-1}$ increases with
$a$ faster than $(2a)^z$, we can take $a$ so large that $2^{a-1}>(2a)^z$.
Now we put

$r=X_z(a)$.

Then by Thm. 13.1, by our choice of $a$, by Lemma 14.4 and by (*) we have

$\varphi(2a) \ge 2^{a-1}>(2a)^z \ge X_z(a)>1$,

so (iv) is satisfied.
By Lemma 14.5,

$Y_z(a)=z+s(a-1)$,

where $s$ is an integer. But if $s$ were negative, then by (iii) we would have
$Y_z(a)<0$, which is impossible. Thus $s$ is a natural number. From the
definition of $X_z(a)$ and $Y_z(a)$ it follows that (v) is satisfied.
Next, we choose $u=X_z(ay)$. By (*) we have $y>1$, and by (iii) we have
$a>x=y^z$. Thus we may use Lemma 14.7, which yields (vi) by the choice
of $r$ and $u$ and our assumption that $x=y^z$.
Finally, we take $v=Y_z(ay)$, so that (vii) is satisfied. ∎

Using Thm. 14.9, we shall now prove that two other very useful functions
have diophantine graphs.

14.10. Theorem. The binomial coefficient, i.e., the function $\lambda xy\binom{x}{y}$, has a
diophantine graph.
Proof. We use the following definition of the binomial coefficient:

$\binom{x}{y}=\Bigl[\prod_{z<y}(x-z)\Bigr]/y!$.

First let us assume that $0<y\le x$. We wish to find the quotient

$q(2^{xy}(1+2^x)^x,\,2^{x^2})$.
We have
$2^{xy}(1+2^x)^x=\sum_{z\le x}\binom{x}{z}2^{x(y+x-z)}$.

In the sum on the right-hand side, each of the first $y+1$ terms (i.e., those
with $z \le y$) is a multiple of $2^{x^2}$. All the remaining terms together (those
with $z>y$) contribute at most

$\sum_{0<z\le x}\binom{x}{z}2^{x(x-1)}=2^{x(x-1)}((1+1)^x-1)=2^{x(x-1)}(2^x-1)$,

which is $<2^{x^2}$. Thus we have found

$q(2^{xy}(1+2^x)^x,\,2^{x^2})=\sum_{z\le y}\binom{x}{z}2^{x(y-z)}$.


Similarly we find

$q(2^{x(y-1)}(1+2^x)^x,\,2^{x^2})=\sum_{z<y}\binom{x}{z}2^{x(y-z-1)}$.

Multiplying this equality by $2^x$ and subtracting from the preceding one
we get
$q(2^{xy}(1+2^x)^x,2^{x^2})-2^x\cdot q(2^{x(y-1)}(1+2^x)^x,2^{x^2})=\binom{x}{y}$.

Thus, assuming that $0<y\le x$, we have shown that $z=\binom{x}{y}$ iff there are
numbers $u$ and $v$ such that
(i) $u=z+v\cdot 2^x$,
(ii) $u\cdot 2^{x^2} \le 2^{xy}(1+2^x)^x<(u+1)\cdot 2^{x^2}$,
(iii) $v\cdot 2^{x^2} \le 2^{x(y-1)}(1+2^x)^x<(v+1)\cdot 2^{x^2}$.
Bringing in the other cases (i.e., $y=0$ and $x<y$) we now have: $z=\binom{x}{y}$ iff:
$y=0$ and $z=1$; or
$x<y$ and $z=0$; or
$0<y\le x$ and there are $u$ and $v$ satisfying (i), (ii) and (iii).
In view of Thm. 14.9, this yields the required result. ∎
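The quotient construction in this proof is directly computable. The following sketch (the function name is hypothetical) forms the two quotients $u$ and $v$ of conditions (ii) and (iii) with integer division and returns $u-v\cdot 2^x$, which by (i) should be the binomial coefficient; it can be compared with `math.comb`.

```python
from math import comb


def binom_via_quotients(x, y):
    """Binomial coefficient obtained from the two quotients u and v of the proof."""
    if y == 0:
        return 1
    if x < y:
        return 0
    big = (1 + 2 ** x) ** x          # (1 + 2^x)^x
    modulus = 2 ** (x * x)           # 2^(x^2)
    u = (2 ** (x * y) * big) // modulus          # condition (ii)
    v = (2 ** (x * (y - 1)) * big) // modulus    # condition (iii)
    return u - v * 2 ** x                        # condition (i)


if __name__ == "__main__":
    print([binom_via_quotients(6, y) for y in range(8)])
    print([comb(6, y) for y in range(8)])
```

The design point is that only powers, products and integer quotients appear, which is what allows the proof to appeal to Thm. 14.9.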

14.11. Theorem. The function $\lambda x(x!)$ has a diophantine graph.

Proof. We show that

$x!=q(r^x,\binom{r}{x})$, i.e., $x! \le r^x/\binom{r}{x}<x!+1$,

where $r$ is any number $>(2x)^{x+1}$; e.g., $r=(2x)^{x+1}+1$.

The cases $x=0$ and $x=1$ are trivial, so we may assume $x>1$. We have

$r^x/\binom{r}{x}=x!/\bigl((1-r^{-1})(1-2r^{-1})\cdots(1-(x-1)r^{-1})\bigr)$.

From this identity we get at once

$x! \le r^x/\binom{r}{x}$.

But from the same identity we also get

$r^x/\binom{r}{x}<x!/(1-xr^{-1})^x$.

Now, it is easy to show that for every real number $\alpha$ such that $0<\alpha\le\tfrac12$

$(1-\alpha)^{-1} \le 1+2\alpha$.

Also, if $\beta$ is real and $0<\beta\le 1$, then

$(1+\beta)^x=\sum_{j\le x}\binom{x}{j}\beta^j \le 1+\beta\sum_{0<j\le x}\binom{x}{j}<1+\beta(1+1)^x=1+\beta 2^x$.


Using these two inequalities with $\alpha=xr^{-1}$ and $\beta=2xr^{-1}$, we get

$r^x/\binom{r}{x}<x!(1+2xr^{-1})^x<x!(1+2^{x+1}xr^{-1})$.

By the choice of $r$ this last quantity is clearly $<x!+1$. Thus we have
shown that

$x! \le r^x/\binom{r}{x}<x!+1$,

as claimed. Thus $y=x!$ iff

$y\binom{r}{x} \le r^x<(y+1)\binom{r}{x}$,
where, e.g., $r=(2x)^{x+1}+1$.
In view of Thms. 14.9 and 14.10, this proves the present theorem. ∎
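The characterization of $x!$ as a quotient can be tried out directly. The sketch below uses the sample value $r=(2x)^{x+1}+1$ suggested in the proof and Python's `math.comb` for the binomial coefficient; the function name is ours.

```python
from math import comb, factorial


def factorial_via_binomial(x):
    """x! recovered as the integer quotient of r^x by C(r, x), with r = (2x)^(x+1) + 1."""
    r = (2 * x) ** (x + 1) + 1
    return r ** x // comb(r, x)


if __name__ == "__main__":
    print([factorial_via_binomial(x) for x in range(8)])
    print([factorial(x) for x in range(8)])
```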

*§ 15. Bounded universal quantification

We shall need one more result stating that a particular function has a d.g.:

15.1. Lemma. The binary function $f$ defined by

$f(y,t)=\prod_{j<y}(1+(j+1)t)$
has a d.g.
Proof. For the time being, let $t>0$. Then it is easy to see that

(1) $f(y,t)=\binom{\alpha}{y}t^yy!$, where $\alpha=y+t^{-1}$.

Now let $a$ be any positive number. Using Taylor’s theorem we develop
$(1+a^{-2})^\alpha$ in powers of $a^{-2}$ up to the $y$th power:

(2) $(1+a^{-2})^\alpha=\sum_{j\le y}\binom{\alpha}{j}a^{-2j}+\text{remainder}$.

The remainder (à la Lagrange) is

$\binom{\alpha}{y+1}a^{-2(y+1)}(1+\theta a^{-2})^{\alpha-y-1}$

with $0<\theta<1$. Since $y<\alpha\le y+1$ by (1), it is easy to see that

$0<\binom{\alpha}{y+1}\le 1$.

Also,
$(1+\theta a^{-2})^{\alpha-y-1}\le 1$.
Therefore the remainder can be re-written in the form $\eta a^{-2(y+1)}$ with
$0<\eta\le 1$. Multiplying (2) by $a^{2y+1}$ we obtain

(3) $a^{2y+1}(1+a^{-2})^\alpha=\sum_{j\le y}\binom{\alpha}{j}a^{2(y-j)+1}+\eta a^{-1}$.


Now let us assume $y>0$ (as well as $t>0$). We develop $(1+a^{-2})^\alpha$ up to
the $(y-1)$th power of $a^{-2}$. This time the remainder is

$\binom{\alpha}{y}a^{-2y}(1+\theta a^{-2})^{\alpha-y}$.

Using the fact that $y<\alpha\le y+1$ it is easily seen that $0<\binom{\alpha}{y}\le y+1$ and
$(1+\theta a^{-2})^{\alpha-y}<2$, so this remainder can be re-written as

$2\zeta a^{-2y}(y+1)$

with $0<\zeta<1$. Multiplying by $a^{2y-1}$ we get

(4) $a^{2y-1}(1+a^{-2})^\alpha=\sum_{j<y}\binom{\alpha}{j}a^{2(y-j)-1}+2\zeta a^{-1}(y+1)$.

We now put

(5) $a=2(y+1)!\,t^y$.

Then all the terms under the summation signs in (3) and (4) are natural
numbers (for, when the coefficients $\binom{\alpha}{j}$ are written out as fractions, their
denominators divide $a$). Also the remainder terms in (3) and (4) are $<1$.
Thus, if we put

$u=\sum_{j\le y}\binom{\alpha}{j}a^{2(y-j)+1}$ and $v=\sum_{j<y}\binom{\alpha}{j}a^{2(y-j)-1}$

then $u$ and $v$ are the unique natural numbers such that

$u \le a^{2y+1}(1+a^{-2})^\alpha<u+1$,

$v \le a^{2y-1}(1+a^{-2})^\alpha<v+1$.

These inequalities can be stated equivalently:

(6) $u^ta^2 \le a^t(a^2+1)^{yt+1}<(u+1)^ta^2$,

(7) $v^ta^{t+2} \le (a^2+1)^{yt+1}<(v+1)^ta^{t+2}$.

On the other hand, from the definition of $u$ and $v$ we have at once

$u-a^2v=\binom{\alpha}{y}a$,

so that by (1) and (5) we get

(8) $u=a^2v+2(y+1)\cdot z$,

where $z=f(y,t)$.

Bringing in also the trivial cases $t=0$ and $y=0$ which we have so far
left aside, we now have: $f(y,t)=z$ iff:
$yt=0$ and $z=1$; or
$yt>0$ and there are $a$, $u$ and $v$ satisfying (5)–(8).
By virtue of the results of §14 it follows that $f$ has a d.g. ∎
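Conditions (5) to (8) can be verified with exact rational arithmetic for small $y$ and $t$. The sketch below is an illustration under our reading of the formulas: $f(y,t)$ is taken as the product of $1+(j+1)t$ over $j<y$, the generalised binomial coefficient is computed with `fractions.Fraction`, and all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def f_product(y, t):
    """f(y, t) as the product of 1 + (j + 1)t over j < y."""
    out = 1
    for j in range(y):
        out *= 1 + (j + 1) * t
    return out


def gen_binom(alpha, j):
    """Generalised binomial coefficient alpha(alpha-1)...(alpha-j+1)/j! for rational alpha."""
    out = Fraction(1)
    for i in range(j):
        out *= alpha - i
    return out / factorial(j)


def witnesses(y, t):
    """a as in (5), and the sums u and v defined after (5); requires y > 0 and t > 0."""
    alpha = y + Fraction(1, t)
    a = 2 * factorial(y + 1) * t ** y
    u = sum(gen_binom(alpha, j) * a ** (2 * (y - j) + 1) for j in range(y + 1))
    v = sum(gen_binom(alpha, j) * a ** (2 * (y - j) - 1) for j in range(y))
    return a, u, v


def conditions_hold(y, t):
    """Check that u, v are integers and that (6), (7), (8) hold with z = f(y, t)."""
    a, u, v = witnesses(y, t)
    if u.denominator != 1 or v.denominator != 1:
        return False
    u, v = int(u), int(v)
    mid = (a * a + 1) ** (y * t + 1)
    c6 = u ** t * a * a <= a ** t * mid < (u + 1) ** t * a * a
    c7 = v ** t * a ** (t + 2) <= mid < (v + 1) ** t * a ** (t + 2)
    c8 = u == a * a * v + 2 * (y + 1) * f_product(y, t)
    return c6 and c7 and c8


if __name__ == "__main__":
    print(all(conditions_hold(y, t) for y in range(1, 4) for t in range(1, 4)))
```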

We can now begin our direct attack on bounded universal quantification.


We start with the following special case.

15.2. Lemma. Let

$P(\mathfrak{x},y) \equiv \forall w<y\ \exists z_1<y \ldots \exists z_m<y\,[f(\mathfrak{x},w,z_1,\ldots,z_m)=0]$,

where $m \ge 0$ and $f$ is a polynomial. Then $P$ is diophantine.

Proof. We find a polynomial $g$ with $n+1$ variables and positive coefficients
such that

(1) $g(\mathfrak{x},y)>y$ for all $\mathfrak{x},y$;

(2) $|f(\mathfrak{x},w,z_1,\ldots,z_m)| \le g(\mathfrak{x},y)$ whenever $w,z_1,\ldots,z_m<y$.

(Such $g$ may be obtained from $f$ as follows: change the sign of all negative
coefficients in $f$, then substitute $y$ for each of the variables $w,z_1,\ldots,z_m$ and
finally add $y+1$.)
We claim that $P(\mathfrak{x},y)$ holds iff:
$y=0$; or
$y>0$ and there are numbers $s,t,v_1,\ldots,v_m$ such that
(i) $t=g(\mathfrak{x},y)!$,
(ii) $1+(s+1)t=\prod_{w<y}(1+(w+1)t)$,
(iii) $1+(s+1)t$ divides $f(\mathfrak{x},s,v_1,\ldots,v_m)$,
(iv) $1+(s+1)t \mid \prod_{j<y}(v_i-j)$ for $i=1,\ldots,m$.
Once this claim is established, the lemma will follow at once. (Note that
(i) and (ii) are of the right form by 14.11 and 15.1. Also, (iii) can be
written as

$\exists u\,[u^2(1+(s+1)t)^2=(f(\mathfrak{x},s,v_1,\ldots,v_m))^2]$.

Finally,

$\prod_{j<y}(v_i-j)=\binom{v_i}{y}\,y!$,

so that (iv) is of the right form as well, by the results of §14.)

Since our claim is trivially true for $y=0$, we shall from now on assume
$y>0$. Also we fix $\mathfrak{x}\in N^n$.

First, suppose that $s,t,v_1,\ldots,v_m$ are numbers satisfying (i)-(iv). We have to
show that $P(\mathfrak{x},y)$ holds. Fix any $w<y$; we must show that
$f(\mathfrak{x},w,z_1,\ldots,z_m)=0$ for some $z_1,\ldots,z_m<y$.
By (i) we have $t>0$, hence $1+(w+1)t>1$. Let $p$ be a prime divisor of
$1+(w+1)t$, and put

(3) $z_i=\mathrm{rm}(v_i,p)$ for $i=1,\ldots,m$.

From (ii) and (iv) it follows that $p \mid \prod_{j<y}(v_i-j)$ for each $i$. Since $p$ is prime
it follows that for each $i$ there is some $j(i)<y$ such that $p$ divides $v_i-j(i)$.
Hence, by (3), $z_i=\mathrm{rm}(j(i),p)$, so that $z_i \le j(i)<y$, as required.
Next, since $p \mid 1+(w+1)t$, it follows that $p$ cannot divide $t$. Hence by (i)
we must have $p>g(\mathfrak{x},y)$. Therefore by (2) we get

(4) $|f(\mathfrak{x},w,z_1,\ldots,z_m)|<p$.

On the other hand, since $p$ divides both $1+(w+1)t$ and, by (ii), also
$1+(s+1)t$, it must divide their difference $(s-w)t$. We have seen, however,
that $p$ cannot divide $t$, so it must divide $s-w$, i.e., $w \equiv s \pmod p$. Also,
by (3), $z_i \equiv v_i \pmod p$ for each $i$. Thus we have

(5) $f(\mathfrak{x},w,z_1,\ldots,z_m) \equiv f(\mathfrak{x},s,v_1,\ldots,v_m) \pmod p$.

By (ii), (iii) and (5) it now follows that $p$ divides $f(\mathfrak{x},w,z_1,\ldots,z_m)$; but by (4)
this means that $f(\mathfrak{x},w,z_1,\ldots,z_m)=0$ as required.

Now for the converse. Suppose that $P(\mathfrak{x},y)$ holds. We are assuming
$y>0$, so we must show that there are $s,t,v_1,\ldots,v_m$ satisfying (i)-(iv).
We choose $t$ as dictated by (i). Next, when the product $\prod_{w<y}(1+(w+1)t)$
is multiplied out, we obtain 1 plus terms containing powers of $t$ (there is at
least one such term since $y>0$). Hence we can find $s$ such that (ii) is satisfied.
Since $P(\mathfrak{x},y)$ holds, we have for each $w<y$ numbers $z_1(w),\ldots,z_m(w)<y$
such that

(6) $f(\mathfrak{x},w,z_1(w),\ldots,z_m(w))=0$.

Let us fix any $i$, $1 \le i \le m$. We show that there is a number $v_i$ such that

(7) $z_i(w)=\mathrm{rm}(v_i,1+(w+1)t)$ for all $w<y$.

By the Chinese Remainder Theorem 12.6, it is enough to show that
$z_i(w)<1+(w+1)t$ for all $w<y$, and that the divisors $1+(w+1)t$, for
different $w<y$, are mutually prime.

Since $z_i(w)<y$ by assumption, we have in fact

$z_i(w)<y<g(\mathfrak{x},y) \le t<1+(w+1)t$

by (1) and (i).

To show that the $1+(w+1)t$ are mutually prime, suppose that $p$ were
a prime divisor of both $1+(u+1)t$ and $1+(w+1)t$, where $u<w<y$. Then
$p$ would also divide their difference $(w-u)t$. Thus $p$ divides $w-u$ or $t$.
However,

$w-u<y<g(\mathfrak{x},y)$

by (1), so (i) implies that $w-u$ divides $t$. Hence $p$ must divide $t$ in any case.
But this contradicts our assumption that $p$ divides $1+(w+1)t$.
Thus for each $i=1,\ldots,m$ we have a number $v_i$ satisfying (7). It remains
to show that our $s,t,v_1,\ldots,v_m$ satisfy (iii) and (iv).
Fix any $w<y$. Then, by (ii), $1+(w+1)t$ must divide $1+(s+1)t$ and hence
also the difference $(w-s)t$. But since clearly $1+(w+1)t$ and $t$ are mutually
prime, it follows that $1+(w+1)t$ divides $w-s$; i.e.,

$s \equiv w \pmod{1+(w+1)t}$.

Also, (7) means that

$v_i \equiv z_i(w) \pmod{1+(w+1)t}$.

These facts, together with (6), yield

$f(\mathfrak{x},s,v_1,\ldots,v_m) \equiv 0 \pmod{1+(w+1)t}$.

In other words, $1+(w+1)t$ divides $f(\mathfrak{x},s,v_1,\ldots,v_m)$ for all $w<y$. But since,
as we have seen, these divisors are mutually prime, their product must also
divide $f(\mathfrak{x},s,v_1,\ldots,v_m)$. By (ii) this product is $1+(s+1)t$, so that (iii) is satisfied.
Finally, we recall that by assumption $z_i(w)<y$. Hence $v_i-z_i(w)$ is one of
the factors in $\prod_{j<y}(v_i-j)$. By (7) it therefore follows that $1+(w+1)t$
divides $\prod_{j<y}(v_i-j)$. Again, since these divisors are mutually prime, their
product $1+(s+1)t$ must also divide $\prod_{j<y}(v_i-j)$ and (iv) is satisfied. ∎
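The converse direction of this proof is constructive, and a toy instance can be worked through by machine. The sketch below assumes the hypothetical polynomial $f(w,z)=z-w$ (so $m=1$ and there are no parameters $\mathfrak{x}$), for which $P(y)$ holds with $z_1(w)=w$. The recipe in the proof then gives $g(y)=3y+1$. The Chinese Remainder step is implemented with Python's built-in modular inverse `pow(m, -1, n)`, available from Python 3.8; all function names are ours.

```python
from math import factorial, gcd, prod


def crt(residues, moduli):
    """Smallest x >= 0 with x = r_i (mod n_i) for pairwise coprime moduli n_i."""
    x, m = 0, 1
    for r, n in zip(residues, moduli):
        k = ((r - x) * pow(m, -1, n)) % n   # solve x + m*k = r (mod n)
        x += m * k
        m *= n
    return x


def witnesses(y):
    """s, t, v for the toy relation 'for all w < y there is z < y with z - w = 0'."""
    g = 3 * y + 1                       # |z - w| -> z + w -> y + y, then add y + 1
    t = factorial(g)                    # condition (i)
    moduli = [1 + (w + 1) * t for w in range(y)]
    s = (prod(moduli) - 1) // t - 1     # condition (ii)
    v = crt(list(range(y)), moduli)     # (7) with z(w) = w
    return s, t, v, moduli


def conditions_hold(y):
    """Check pairwise coprimality of the moduli and conditions (ii), (iii), (iv); requires y > 0."""
    s, t, v, moduli = witnesses(y)
    big = 1 + (s + 1) * t
    coprime = all(gcd(moduli[i], moduli[j]) == 1
                  for i in range(y) for j in range(i + 1, y))
    c2 = big == prod(moduli)
    c3 = (v - s) % big == 0                            # f(s, v) = v - s
    c4 = prod(v - j for j in range(y)) % big == 0
    return coprime and c2 and c3 and c4


if __name__ == "__main__":
    print(all(conditions_hold(y) for y in range(1, 5)))
```

The numbers involved grow very quickly (already $t=10!$ for $y=3$), which gives some feeling for why the resulting diophantine definitions are not practical.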

Lemma 15.2 can be generalized to the case where the bound on the initial
universal quantifier is not necessarily the same as that of the m existential
quantifiers:

15.3. Lemma. Let

$P(\mathfrak{x},u,v) \equiv \forall w<u\ \exists z_1<v \ldots \exists z_m<v\,[f(\mathfrak{x},w,z_1,\ldots,z_m)=0]$,

where $m \ge 0$ and $f$ is a polynomial. Then $P$ is diophantine.



Proof. Introducing new variables $w'$, $z'_1,\ldots,z'_m$ and $y$, we put

$Q(\mathfrak{x},u,v) \equiv \exists y\,[(y=u+v)$

$\wedge\ \forall w<y\ \exists w'<y\ \exists z_1<y\ \exists z'_1<y \ldots \exists z_m<y\ \exists z'_m<y$

$E(\mathfrak{x},u,v,w,w',z_1,z'_1,\ldots,z_m,z'_m)]$,

where
$E(\mathfrak{x},u,\ldots) \equiv (u+w'=w) \vee [(f(\mathfrak{x},w,z_1,\ldots,z_m)=0)$

$\wedge\,(z_1+z'_1+1=v) \wedge \ldots \wedge (z_m+z'_m+1=v)]$.

By Lemma 12.1, $E$ is an elementary relation. Hence by Lemmas 15.2 and
12.2 $Q$ is diophantine. We now proceed to prove the identity

$P(\mathfrak{x},u,v) \equiv Q(\mathfrak{x},u,v)$,

from which the assertion of our lemma follows at once. Let us fix numbers
$\mathfrak{x}$, $u$ and $v$. First, assume that $P(\mathfrak{x},u,v)$ holds. We choose $y=u+v$, and
fix any $w<y$. We must show that there are numbers $w',z_1,z'_1,\ldots,z_m,z'_m<y$
such that $E(\mathfrak{x},u,\ldots)$ holds. There are two cases to consider.
If $u \le w$ we put $w'=w-u$. Then $w'<y$ as required. We take the
$z_i$ and the $z'_i$ to be arbitrary numbers $<y$ (e.g., all of them $=w$). By our
choice of $w'$ we have $u+w'=w$, so that $E(\mathfrak{x},u,\ldots)$ holds.
If $w<u$ we take $w'$ to be any number $<y$ (e.g., $w'=w$). By assumption
$P(\mathfrak{x},u,v)$ holds, so there are $z_1,\ldots,z_m<v$ such that

(1) $f(\mathfrak{x},w,z_1,\ldots,z_m)=0$.

Note that $z_i<y$, since $v \le u+v=y$. We put

$z'_i=v-z_i-1$ for $i=1,\ldots,m$.

Then clearly $z'_i<y$, as required. By the choice of the $z'_i$ we have

(2) $z_i+z'_i+1=v$ for $i=1,\ldots,m$.

From (1) and (2) it follows that $E(\mathfrak{x},u,\ldots)$ holds, as claimed.

Conversely, suppose that $Q(\mathfrak{x},u,v)$ holds. Given any $w<u$, we must find
$z_1,\ldots,z_m<v$ satisfying (1). Now, because $Q(\mathfrak{x},u,v)$ holds, there are numbers
$w',z_1,z'_1,\ldots,z_m,z'_m$ such that $E(\mathfrak{x},u,\ldots)$ holds. On the other hand, $u+w' \ne w$
since we were given $w<u$; so the fact that $E(\mathfrak{x},u,\ldots)$ holds means that (1)
and (2) must be satisfied. Thus we have (1), where $z_1,\ldots,z_m<v$ because
of (2). ∎

We are now ready to prove:

15.4. Theorem. The class of diophantine relations is closed under bounded


universal quantification.
Proof. Let
$P(\mathfrak{x},u) \equiv \forall w<u\ \exists z_1 \ldots \exists z_m\,[f(\mathfrak{x},w,z_1,\ldots,z_m)=0]$,

where $m \ge 0$ and $f$ is a polynomial. We must show that $P$ is diophantine.

To this end we put
$Q(\mathfrak{x},u) \equiv \exists v\ \forall w<u\ \exists z_1<v \ldots \exists z_m<v\,[f(\mathfrak{x},w,z_1,\ldots,z_m)=0]$.

For given numbers $\mathfrak{x}$ and $u$, if $Q(\mathfrak{x},u)$ holds then clearly $P(\mathfrak{x},u)$ holds as well,
since the condition imposed by $Q$ is apparently more stringent than that
imposed by $P$.
Conversely, suppose that $P(\mathfrak{x},u)$ holds. Thus for every $w<u$ we may
choose $m$ numbers $z_1(w),\ldots,z_m(w)$ such that

$f(\mathfrak{x},w,z_1(w),\ldots,z_m(w))=0$.

Since every finite set of numbers is bounded, we have (for sufficiently big $v$)
$v>z_i(w)$ for all $w<u$ and $i=1,\ldots,m$. The existence of such $v$ means that
$Q(\mathfrak{x},u)$ holds.
Thus we have proved that $P$ and $Q$ are the same relation. On the other
hand, from the definition of $Q$ and Lemma 15.3 we see that $Q$ is obtained
by existential quantification from a diophantine relation. Hence $Q$ (i.e., $P$)
is itself diophantine by Lemma 12.2. ∎

§ 16. The MRDP Theorem and Hilbert’s Tenth Problem

The main result of this section is:

16.1. Theorem. A relation is r.e. iff it is diophantine.


Proof. As pointed out at the end of §12, it follows from Thm. 15.4 that
every r.e. relation is diophantine. The converse is obvious.

We shall refer to this result as “the MRDP Theorem”. This is an


acronym for the names of the four mathematicians — Matijasevic,
Robinson, Davis and Putnam — to whose joint efforts it is due. (For further
details see §17.)

16.2. Remark. The proof of the MRDP Theorem is completely constructive.


Suppose an $n$-ary r.e. relation $P$ is given to us by means of an index $e$. Thus

$P(\mathfrak{x}) \equiv (\mathfrak{x}\in\mathrm{dom}_n\{e\}) \equiv \exists y\,T(e,\mathfrak{x},y)$.



(See Cor. 11.6.) Then the proof of the MRDP Theorem (including of course
the proofs of all the results leading up to it) provides us with a method for
finding a polynomial p and a number m such that

$P(\mathfrak{x}) \equiv \exists y_1 \ldots \exists y_m\,[p(\mathfrak{x},y_1,\ldots,y_m)=0]$.

(Of course, this method is not really practical.)

A diophantine equation is any equation of the form

(1) $f(\mathfrak{x})=0$,
where $f$ is a polynomial (with integer coefficients).
Hilbert's Tenth Problem is the problem of devising an algorithm whereby,
given any diophantine equation, one could decide whether or not that
equation has a solution.
By a solution of a diophantine equation (1) one normally means a solution
in integers, i.e., an $n$-tuple of integers satisfying (1). However, we shall
now show that the nature of the Tenth Problem is not changed if we take
solution to mean solution in natural numbers.
Given a polynomial $f$ in $n$ variables, let us define polynomials $g$ and $h$,
in $4n$ and $2n$ variables, respectively, as follows:

$g(\mathfrak{x},\mathfrak{y},\mathfrak{z},\mathfrak{u})=f(x_1^2+y_1^2+z_1^2+u_1^2,\ldots,x_n^2+y_n^2+z_n^2+u_n^2)$,

$h(\mathfrak{x},\mathfrak{y})=f(x_1-y_1,\ldots,x_n-y_n)$.

By a well-known theorem of Lagrange, every natural number is the sum
of four squares. Hence (1) has a solution in natural numbers iff the diophantine
equation $g(\mathfrak{x},\mathfrak{y},\mathfrak{z},\mathfrak{u})=0$ has a solution in integers. Conversely, every
integer is the difference of two natural numbers, so (1) has a solution in
integers iff the diophantine equation $h(\mathfrak{x},\mathfrak{y})=0$ has a solution in natural
numbers.
We shall therefore take solution to mean solution in natural numbers.
Under this interpretation, the Tenth Problem is equivalent (as we have
just seen) to the problem as originally posed.
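The two reductions can be illustrated on one-variable examples. In the sketch below the polynomials `f1` and `f2` are hypothetical, and the search routines only look inside a finite box, so they can confirm that a solution exists but can never establish that none does (which is the whole point of the Tenth Problem).

```python
from itertools import product


def natural_solution(f, arity, bound):
    """First tuple in {0..bound-1}^arity with f(...) == 0, or None (bounded search only)."""
    for p in product(range(bound), repeat=arity):
        if f(*p) == 0:
            return p
    return None


def integer_solution(f, arity, bound):
    """First tuple in {-(bound-1)..bound-1}^arity with f(...) == 0, or None (bounded search only)."""
    for p in product(range(-bound + 1, bound), repeat=arity):
        if f(*p) == 0:
            return p
    return None


def f1(x):
    """x + 1 = 0: solvable in integers, not in natural numbers."""
    return x + 1


def h(x, y):
    """h(x, y) = f1(x - y): the integer unknown written as a difference of naturals."""
    return f1(x - y)


def f2(x):
    """x - 7 = 0: solvable in natural numbers."""
    return x - 7


def g(x, y, z, u):
    """g = f2(x^2 + y^2 + z^2 + u^2): the natural unknown written as a sum of four squares."""
    return f2(x * x + y * y + z * z + u * u)


if __name__ == "__main__":
    print(natural_solution(f1, 1, 10), integer_solution(f1, 1, 10), natural_solution(h, 2, 10))
    print(natural_solution(f2, 1, 10), integer_solution(g, 4, 4))
```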
It is not difficult to encode polynomials by numbers; to each polynomial
$f$ a code number $\#f$ is assigned such that coding and decoding are
algorithmic (i.e., we have an algorithm for calculating the code number of
any given polynomial, as well as an algorithm for finding out whether or
not any given number is the code of a polynomial, and if so of which one).
This may be done, e.g., by a method like that used in §8 for encoding
URIM programs.
Let P be the property (i.e., unary relation) such that, for any number z,

$P(z)$ holds iff $z=\#f$ for some polynomial $f$ such that equation (1) has
a solution.
Then clearly Hilbert’s Tenth Problem is equivalent to the decision
problem¹ for $P$. However, we shall now show that $P$ is recursively
undecidable (i.e., non-recursive). Hence, if we accept Church’s Thesis,
we must conclude that the decision problem for $P$ is not solvable. This
constitutes a negative solution to Hilbert’s Tenth Problem.
Let $R$ be an r.e. but non-recursive property. For example, using 11.8
we may take
$R(z) \equiv (z\in\mathrm{dom}_1\{z\}) \equiv \exists y\,T(z,z,y)$.

By the MRDP Theorem we can find a number $n$ and a polynomial $f$ of
$n+1$ variables such that

(2) $R(z) \equiv \exists x_1 \ldots \exists x_n\,[f(\mathfrak{x},z)=0]$.

For each value of $z$ we get from $f$ a polynomial $f_z$ (of $n$ variables) defined by

(3) $f_z(\mathfrak{x})=f(\mathfrak{x},z)$.
Let the sequence $\varphi$ be defined by the identity $\varphi(z)=\#f_z$. Clearly, $\varphi$ is
algorithmic. Moreover, for any of the standard methods for encoding
polynomials, it is easy to show that $\varphi$ is recursive.
But from (2), (3) and the definitions of $P$ and $\varphi$ we get the identity

$R(z) \equiv P(\varphi(z))$.
Since R is not recursive, P cannot be recursive.

We conclude this section with two easy consequences of the MRDP


Theorem.

16.3. Theorem. Given any r.e. set $A \subseteq N$, we can find a polynomial $g$ such
that $A$ is the set of all non-negative values assumed by $g$ as its variables
range through $N$.
Proof. By the MRDP Theorem we can find $n$ and a polynomial $f$ of $n+1$
variables such that
$(z\in A) \equiv \exists x_1 \ldots \exists x_n\,[f(\mathfrak{x},z)=0]$.

Now put
$g(\mathfrak{x},z)=z-(z+1)(f(\mathfrak{x},z))^2$.

It is easy to see that $g$ has the required property. ∎

1 See discussion following Ex. 10.6.



16.4. Theorem. Given any r.e. set $A \subseteq N$, we can find polynomials $g$ and $h$
such that $A$ is the set of all integer values assumed by the rational function
$g/h$ as its variables range through $N$.
Proof. With $f$ as in the proof of Thm. 16.3, we put

$g(\mathfrak{x},z)=z+(f(\mathfrak{x},z))^2$,

$h(\mathfrak{x},z)=(z+1)(f(\mathfrak{x},z))^2+1$. ∎
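Both constructions are easy to see at work on a toy example. The sketch below assumes the hypothetical defining polynomial $f(x,z)=z-x^2$, so that $A$ is the set of perfect squares, and collects the non-negative values of the polynomial of Thm. 16.3 and the integer values of the rational function of Thm. 16.4 over a finite grid. The function names are ours.

```python
from fractions import Fraction


def f(x, z):
    """Hypothetical example: z is in A iff some x has z - x^2 = 0, so A is the set of squares."""
    return z - x * x


def g_poly(x, z):
    """Polynomial of Thm. 16.3: equals z when f = 0 and is negative otherwise."""
    return z - (z + 1) * f(x, z) ** 2


def g_over_h(x, z):
    """Rational function of Thm. 16.4 as an exact fraction."""
    return Fraction(z + f(x, z) ** 2, (z + 1) * f(x, z) ** 2 + 1)


def nonnegative_values(n):
    """Non-negative values of g_poly on the grid 0 <= x, z < n."""
    return {g_poly(x, z) for x in range(n) for z in range(n) if g_poly(x, z) >= 0}


def integer_values(n):
    """Integer values of g_over_h on the grid 0 <= x, z < n."""
    out = set()
    for x in range(n):
        for z in range(n):
            q = g_over_h(x, z)
            if q.denominator == 1:
                out.add(int(q))
    return out


if __name__ == "__main__":
    print(sorted(nonnegative_values(10)))
    print(sorted(integer_values(10)))
```

In the second construction the value lies strictly between 0 and 1 whenever $f \ne 0$, which is why only members of $A$ show up as integer values.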

Using the last two results we can (at least in principle) obtain a “general
algebraic formula” for each r.e. set of numbers. (In particular, we can
get such a formula for the prime numbers — a result which had eluded
number theorists for several centuries.) However, such formulas would
be rather too complicated and difficult to obtain to be of much practical use.

§17. Historical and bibliographical remarks

An excellent work on recursion theory is Rogers [1967]. For some


historical remarks on the origins of recursion theory see §2.4. of that work.
Turing [1936] was the first to use imaginary computers (which came
to be known as Turing machines) for characterizing the class of algorithmic
functions.
Imaginary computers like URIM (which are known as register machines)
were invented for the same purpose by several people independently
(Shepherdson and Sturgis, Lambek, Minsky) in the late 1950’s. The first
mention in print is probably Lambek [1961] and Minsky [1961]. These
machines gained wider currency following Shepherdson and Sturgis [1963].
Our definition of the class of primitive recursive functions (in §5) is
essentially the same as the original definition given by Gödel [1931], except
that at the time he still called these functions simply “recursive”.
The first definition of the class of recursive functions was given by
Gödel [1934] following a suggestion by J. Herbrand. This definition was
proved by Kleene [1936] to be equivalent to another definition, which is
virtually the same as the one used by us (§5).
Church’s Thesis was proposed by Church [1936], and an equivalent
thesis is implicit in Turing [1936]. For a detailed discussion of the evidence
for the thesis see Kleene [1952].
The Normal Form Theorem (10.1 and 10.3) and the Recursion Theorem
(10.19 and 10.20) are due to Kleene [1936] and [1938]. Thm. 10.23 is due
to Rice [1953] and the Simultaneous Recursion Theorem (10.24) is due
to Smullyan [1961].
The problem of finding an algorithm for deciding the solvability of
diophantine equations was included by Hilbert [1900] in his famous list
of problems for twentieth century mathematics. (Hilbert explicitly mentioned
the possibility that some problems on his list might not have a positive
solution. A proof of non-existence of a positive solution would constitute
a negative solution.) The first major advance towards a (negative) solution
was made by J. Robinson [1952]. Using the properties of the solutions
of Pell’s equation¹, she proved the results reproduced here as Thms.
14.9-14.11. However, her proof depended on the (then still unproved)
assumption that there exists a binary diophantine relation which possesses
a certain property (“having roughly exponential growth”). This advance
was continued by M. Davis, H. Putnam and J. Robinson [1961], who
proved the results reproduced here as Lemmas 15.1 and 15.2. But their
proof depended on J. Robinson’s earlier result, and hence on her unproved
assumption. The road from this point to Thm. 16.1 had already been
bridged by M. Davis [1958], who showed that every r.e. relation $P$ can be
represented as

$$P(\vec{x}) = \exists y\, \forall w{<}y\, \exists z_1{<}y \ldots \exists z_m{<}y\, [f(\vec{x},y,w,z_1,\ldots,z_m) = 0],$$

where $f$ is a polynomial. Another proof of the same result was given also
by R. M. Robinson [1956]. (These results of Davis and R. M. Robinson
are not really essential to the proof of Thm. 16.1, and we have circumvented
them via Lemma 15.3 and Thm. 15.4.)
The final break-through was made by Matijasevič [1970]. He proved
that the sequence $\lambda x[\varphi(2x)]$ has a diophantine graph (Thm. 13.10). But the
well-known growth behaviour of that sequence (Thm. 13.1) is easily seen
to imply that the relation $\lambda xy[\varphi(2x) = y]$ has the “roughly exponential
growth” required by J. Robinson’s assumption. Thus his result bridged
the major gap in the proof of Thm. 16.1. Our presentation in §13 is based
on Matijasevič [1971].
Theorems 16.3 and 16.4 were proved (for diophantine sets) by Putnam
[1960].

¹ The equation was posed by Fermat in 1657 as a challenge to British mathematicians.
A partial solution was given by Lord Brouncker. The first complete solution was given
by Lagrange in 1767 and later streamlined by Dirichlet. Our treatment (proof of Thm. 14.1)
follows Dirichlet.
CHAPTER 7

LOGIC—LIMITATIVE RESULTS

The main results proved in this chapter have one feature in common:
they display the inherent limitations of formalism and the formal method
in mathematics. As such, they are fundamental to any serious discussion
of the philosophy of mathematics.
It is significant that these results apply even to a basic, elementary and
(it would seem) philosophically unproblematic branch of mathematics:
elementary arithmetic¹, which is concerned with the natural numbers
and the operations of addition and multiplication of natural numbers.
Because of this we shall be content to confine ourselves in this chapter to
elementary arithmetic, although the results in question can for the most
part be readily generalized to other, more elaborate, settings. (Something
will be said about this in Ch. 8.)
We shall assume knowledge of Chapters 1, 2, 3 and 6. Knowledge of Ch. 5
will only be assumed in a few remarks and starred problems.

§ 1. General notation and terminology

Unless otherwise stated, the definitions and conventions made in Chapters
1-3 and 6 remain in force in this chapter.
Throughout this chapter, we take $\mathcal{L}$ to be the first-order language with
equality having one individual constant $\mathbf{0}$, one unary function symbol
$\mathbf{s}$ and two binary function symbols $\boldsymbol{+}$ and $\boldsymbol{\times}$. $\mathcal{L}$ has no other extralogical
symbols. We call $\mathcal{L}$ the first-order language of arithmetic.
As a concession to common usage, we lay down the following two
definitions:

$$(\mathbf{r}+\mathbf{t}) =_{df} \boldsymbol{+}\mathbf{r}\mathbf{t},$$

$$(\mathbf{r}\times\mathbf{t}) =_{df} \boldsymbol{\times}\mathbf{r}\mathbf{t},$$

¹ Often also called elementary number theory.

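where $\mathbf{r}$ and $\mathbf{t}$ are any terms. (These two conventions are additional to, and
in the same spirit as, those made in Def. 1.5.1.) We use the normal conventions
for omitting parentheses.
We let $\Phi_n$ be the set of all $\mathcal{L}$-formulas whose free variables are among
$\mathbf{v}_1,\ldots,\mathbf{v}_n$ (i.e., the first $n$ variables of $\mathcal{L}$). In particular, $\Phi_0$ is the set of all
$\mathcal{L}$-sentences.
For any formula $\alpha$ and terms $\mathbf{t}_1,\ldots,\mathbf{t}_n$ we write “$\alpha(\mathbf{t}_1,\ldots,\mathbf{t}_n)$” instead of
“$\alpha(\mathbf{v}_1/\mathbf{t}_1,\ldots,\mathbf{v}_n/\mathbf{t}_n)$”. Also, if $\mathbf{r}$ is a term, we write “$\mathbf{r}(\mathbf{t}_1,\ldots,\mathbf{t}_n)$” for the result of
simultaneously substituting $\mathbf{t}_1,\ldots,\mathbf{t}_n$ for $\mathbf{v}_1,\ldots,\mathbf{v}_n$ respectively, in $\mathbf{r}$.
By recursion on $k$ we define the $k$th numeral $\mathbf{s}_k$:

$$\mathbf{s}_0 = \mathbf{0}, \qquad \mathbf{s}_{k+1} = \mathbf{s}\mathbf{s}_k.$$

We shall frequently write, e.g., “$\alpha(\mathbf{s}_{\vec{a}})$” as short for “$\alpha(\mathbf{s}_{a_1},\ldots,\mathbf{s}_{a_n})$”.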


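We let $\mathfrak{N}$ be the $\mathcal{L}$-structure with universe $N$ (the set of natural numbers)
and with $\mathbf{0}^{\mathfrak{N}} = 0$ (the number zero), $\mathbf{s}^{\mathfrak{N}} = \lambda x(x+1)$ (the successor function),
$\boldsymbol{+}^{\mathfrak{N}} = +$ (addition of natural numbers) and $\boldsymbol{\times}^{\mathfrak{N}} = \times$ (multiplication of
natural numbers). We call $\mathfrak{N}$ the standard $\mathcal{L}$-interpretation or $\mathcal{L}$-structure,
or the (first-order) structure of natural numbers. We say that an $\mathcal{L}$-sentence
$\alpha$ is true if $\mathfrak{N} \models \alpha$; otherwise we say that $\alpha$ is false.
It is easy to verify that

(1.1) $\mathbf{s}_k^{\mathfrak{N}} = k$ for all $k \in N$.

(See remark following Prob. 2.2.1.)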
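
The recursion defining the numerals and the identity (1.1) can be spot-checked mechanically. The following is a minimal sketch, assuming a representation of terms as nested tuples; this representation and the names `numeral` and `value` are hypothetical conveniences for the example, not notation from the text.

```python
def numeral(k):
    """The numeral s_k as a nested tuple: s_0 = 0, s_{k+1} = s s_k."""
    term = ("0",)
    for _ in range(k):
        term = ("s", term)
    return term

def value(term, valuation=None):
    """Value of a term in the standard structure.

    `valuation` maps a variable index i to the number assigned to v_i;
    it is only consulted when the term actually contains variables.
    """
    tag = term[0]
    if tag == "0":
        return 0
    if tag == "v":
        return valuation[term[1]]
    if tag == "s":
        return value(term[1], valuation) + 1
    if tag == "+":
        return value(term[1], valuation) + value(term[2], valuation)
    if tag == "*":
        return value(term[1], valuation) * value(term[2], valuation)
    raise ValueError(f"not a term: {term!r}")
```

For instance, `value(numeral(k))` returns `k` for each `k` tried, in agreement with (1.1).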


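We recall (see remarks following Prob. 2.2.4) that if $\alpha \in \Phi_n$ then $\mathfrak{N} \models \alpha[\vec{a}]$
means that $\sigma \models \alpha$ for some (hence for every) valuation $\sigma$ having $\mathfrak{N}$ as its
underlying structure and such that $\mathbf{v}_i^{\sigma} = a_i$ for $i = 1,\ldots,n$. Using (1.1) and
Thm. 2.3.15 it is easy to see that

(1.2) $\mathfrak{N} \models \alpha[\vec{a}]$ iff $\mathfrak{N} \models \alpha(\mathbf{s}_{\vec{a}})$

for any $\alpha \in \Phi_n$. We shall use this frequently without special mention.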
In this chapter we mean by theory any set $\Sigma$ of $\mathcal{L}$-sentences closed under
first-order deducibility; i.e., $\Sigma \subseteq \Phi_0$, and $\varphi \in \Sigma$ whenever $\varphi \in \Phi_0$ and $\Sigma \vdash \varphi$.
(Cf. §4 of Ch. 5.)
The set $\Phi_0$ of all sentences constitutes an inconsistent theory. Moreover,
by Thm. 3.1.18, $\Phi_0$ is the only inconsistent theory; thus a theory $\Sigma$ is
consistent iff $\Sigma$ is properly included in $\Phi_0$.
Throughout this chapter, we let $\Omega$ be the set of all true sentences, i.e.,

$$\Omega = \{\varphi : \varphi \in \Phi_0,\ \mathfrak{N} \models \varphi\}.$$


By the soundness of the predicate calculus (Thm. 3.1.2) $\Omega$ is a theory.
Notice that for any $\varphi \in \Phi_0$ we have either $\varphi \in \Omega$ or $\neg\varphi \in \Omega$, but not both.
We call $\Omega$ complete first-order arithmetic¹.
We shall say that a set of sentences $\Sigma$ is sound if $\Sigma \subseteq \Omega$.
Let $\Sigma_0$ be an arbitrary subset of $\Phi_0$. If

$$\Sigma = \{\varphi : \varphi \in \Phi_0,\ \Sigma_0 \vdash \varphi\},$$

then $\Sigma$ is clearly a theory; in fact, it is the smallest theory which includes $\Sigma_0$.
We say that $\Sigma_0$ is a set of postulates for $\Sigma$ and that $\Sigma$ is the theory based
on $\Sigma_0$ (as a set of postulates).

§ 2. Nonstandard models of $\Omega$

Let ${}^*\mathfrak{N}$ be an arbitrary $\mathcal{L}$-structure with universe ${}^*N$, designated individual
${}^*0$, unary operation ${}^*s$ and binary operations ${}^*{+}$ and ${}^*{\times}$ (corresponding
to the function symbols $\boldsymbol{+}$ and $\boldsymbol{\times}$ respectively).
By an embedding of the standard structure $\mathfrak{N}$ into ${}^*\mathfrak{N}$ we mean a one-one
mapping $f$ of $N$ into ${}^*N$ such that

$$f(0) = {}^*0, \qquad f(m+1) = {}^*s(f(m)),$$

$$f(m+n) = f(m)\ {}^*{+}\ f(n), \qquad f(mn) = f(m)\ {}^*{\times}\ f(n),$$

for all $m$ and $n$.
If $f$ is as above and maps $N$ onto ${}^*N$, then $f$ is called an isomorphism of
$\mathfrak{N}$ onto ${}^*\mathfrak{N}$. If such an isomorphism exists, $\mathfrak{N}$ and ${}^*\mathfrak{N}$ are said to be
isomorphic².

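2.1. Problem. Let $f$ be an embedding of $\mathfrak{N}$ into ${}^*\mathfrak{N}$. Let $\sigma$ be any valuation
whose underlying structure is $\mathfrak{N}$, and let $f\sigma$ be the valuation such that
the underlying structure of $f\sigma$ is ${}^*\mathfrak{N}$ and $\mathbf{x}^{f\sigma} = f(\mathbf{x}^{\sigma})$ for every variable $\mathbf{x}$.
Show that $\mathbf{t}^{f\sigma} = f(\mathbf{t}^{\sigma})$ for every term $\mathbf{t}$. In particular, $\mathbf{t}^{{}^*\mathfrak{N}} = f(\mathbf{t}^{\mathfrak{N}})$ for every
closed term $\mathbf{t}$.
2.2. Problem³. Let $f$ be an isomorphism of $\mathfrak{N}$ onto ${}^*\mathfrak{N}$, and let $\sigma$ and $f\sigma$
be as in Prob. 2.1. Show that $\alpha^{f\sigma} = \alpha^{\sigma}$ for every formula $\alpha$. In particular,
${}^*\mathfrak{N} \models \varphi$ iff $\mathfrak{N} \models \varphi$, for every sentence $\varphi$.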

¹ This agrees with the terminology of Ch. 5; see Lemma 5.4.1.
² This terminology is the same as that introduced in §1 (and extended in §7) of Ch. 5.
³ Cf. Prob. 5.1.1.

The last part of Prob. 2.2 means that, if $\mathfrak{N}$ and ${}^*\mathfrak{N}$ are isomorphic, then
${}^*\mathfrak{N}$ is also a model of $\Omega$ (i.e., ${}^*\mathfrak{N} \models \varphi$ for all $\varphi \in \Omega$). Such a ${}^*\mathfrak{N}$ is called
a standard model of $\Omega$, while a nonstandard model of $\Omega$ is one which is
not isomorphic with $\mathfrak{N}$. We shall now show that nonstandard models
of $\Omega$ actually exist¹.

2.3. Theorem. There exists a nonstandard model of $\Omega$ with denumerable
universe.
Proof. Let

$$\Sigma_n = \{\mathbf{v}_1 \neq \mathbf{s}_i : i < n\}, \qquad \Sigma = \bigcup\{\Sigma_n : n \in N\}.$$

We want to show that $\Omega \cup \Sigma$ is satisfiable.
By the Compactness Theorem 3.3.16 it is enough to show that every
finite subset of $\Omega \cup \Sigma$ is satisfiable. But any such finite subset is included
in $\Omega \cup \Sigma_n$ for some $n$; and $\Omega \cup \Sigma_n$ is clearly satisfied by any valuation $\sigma$
whose underlying structure is $\mathfrak{N}$ and such that, say, $\mathbf{v}_1^{\sigma} = n$.
By Thm. 3.3.15 it follows that $\Omega \cup \Sigma$ is satisfied by some valuation $\tau$
whose underlying structure ${}^*\mathfrak{N}$ has a finite or denumerable universe.
However, $\Omega$ clearly cannot be satisfied in a finite structure (because $\Omega$
contains sentences $\mathbf{s}_m \neq \mathbf{s}_n$ for all pairs of different numbers $m$ and $n$) so
that ${}^*N$ must be denumerable.
It remains to prove that ${}^*\mathfrak{N}$ is nonstandard. To this end, suppose that
$f$ is an embedding of $\mathfrak{N}$ into ${}^*\mathfrak{N}$; we show that $f$ is not onto. For the valuation
$\tau$ (mentioned in the preceding paragraph) and for any $n$ we have

$$\tau \models \mathbf{v}_1 \neq \mathbf{s}_n.$$

But

$$\mathbf{s}_n^{\tau} = \mathbf{s}_n^{{}^*\mathfrak{N}} = f(\mathbf{s}_n^{\mathfrak{N}}) = f(n)$$

(see Prob. 2.1) so we must have $\mathbf{v}_1^{\tau} \neq f(n)$ for all $n$. Thus $f$ cannot map $N$
onto ${}^*N$. ∎

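2.4. Problem. Let ${}^*\mathfrak{N}$ be any model of $\Omega$. Let $f$ be the mapping of $N$
into ${}^*N$ defined by: $f(n) = \mathbf{s}_n^{{}^*\mathfrak{N}}$ for all $n$.
(i) Show that $f$ is one-one. (If $m \neq n$, then the sentence $\mathbf{s}_m \neq \mathbf{s}_n$ belongs
to $\Omega$ and must hold in ${}^*\mathfrak{N}$.)
(ii) Show that $f$ is an embedding of $\mathfrak{N}$ into ${}^*\mathfrak{N}$.
(iii) Show that $f$ is the only embedding of $\mathfrak{N}$ into ${}^*\mathfrak{N}$. (Use Prob. 2.1.)
(iv) Prove that ${}^*\mathfrak{N}$ is a standard model of $\Omega$ iff ${}^*N = \{\mathbf{s}_n^{{}^*\mathfrak{N}} : n \in N\}$.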

¹ By the Upward Löwenheim-Skolem Theorem 5.2.7, $\Omega$ has models of all nondenumerable
infinite cardinalities; such models are clearly nonstandard. Here we show that
$\Omega$ has denumerable nonstandard models.

Later in this chapter we shall discover some of the properties of nonstandard
models of $\Omega$. For the time being we confine ourselves to discussing
the implications of the fact that they exist. This fact means that
$\mathfrak{N}$, the first-order structure of natural numbers, cannot be characterized
uniquely (even up to isomorphism) in the corresponding formal language
$\mathcal{L}$. For, any sentence (or set of sentences) of $\mathcal{L}$ that holds in $\mathfrak{N}$ also holds
in other structures, not isomorphic with $\mathfrak{N}$.
We would like to suggest that this is not accidental and that there is no
way of characterizing the natural numbers formally. By this we mean,
more precisely, that there is no set of (finite) expressions in any formal
language, which are (jointly) true in a structure that can reasonably be
identified as the structure of natural numbers, but not in any other structure
not isomorphic with it.
At first glance this claim may seem a bit too far-reaching. It might be
thought that the structure $\mathfrak{N}$ cannot be characterized in the corresponding
language $\mathcal{L}$ simply because $\mathfrak{N}$ is too rudimentary and (correspondingly)
$\mathcal{L}$ is too poor. It might thus be hoped that by considering additional
notions connected with the natural numbers and by introducing corresponding
symbols for the notions, one could obtain a more elaborate “structure
of natural numbers” and a correspondingly richer language in which
this structure could be characterized (up to isomorphism). This objection
seems all the more plausible because there is a well-known informal way
of characterizing the natural numbers; and it might be hoped that this
could somehow be formalized. It will therefore be instructive to outline
this informal characterization, and to see what happens when one tries
to formalize it.
The informal (or, perhaps more precisely, semi-formal) characterization
we have in mind involves the following notions: (natural) number, set of
(natural) numbers, the particular number 0 (zero), the unary operation
$s$ (successor), the binary operations $+$ and $\times$ (addition and multiplication
of numbers) and the binary relation $\in$ (membership of a number in a set
of numbers).
Seven postulates (the Peano postulates) are laid down:
(i) $s(n) \neq 0$ for every number $n$.
(ii) If $n$ and $m$ are numbers and $n \neq m$, then $s(n) \neq s(m)$.
(iii) $n + 0 = n$ for every number $n$.
(iv) $n + s(m) = s(n+m)$ for all numbers $n$ and $m$.
(v) $n \times 0 = 0$ for every number $n$.
(vi) $n \times s(m) = (n \times m) + n$ for all numbers $n$ and $m$.
(vii) If $M$ is a set of numbers, and if $0 \in M$, and if $s(n) \in M$ for every number
$n$ such that $n \in M$, then $n \in M$ for every number $n$.

Postulate (vii) is of course the principle of mathematical induction.
The arithmetic of natural numbers can be developed on the basis of
these postulates¹. Moreover, it can be proved that these postulates uniquely
characterize the natural numbers (up to isomorphism). We shall not do
so in detail, but merely hint at the proof.
In Prob. 2.4(iv) we saw that a model ${}^*\mathfrak{N}$ of $\Omega$ is nonstandard iff in the
universe of ${}^*\mathfrak{N}$ there are objects different from the values of all the numerals
$\mathbf{s}_n$. This suggests (correctly) that in order to show that the seven postulates
(i)-(vii) do not have “nonstandard models” (i.e., pathological interpretations)
we must deduce from these postulates that the set $\{0, s(0), s(s(0)), \ldots\}$
contains all natural numbers. But this is in fact an immediate consequence
of the induction principle (vii).
We observe that the induction principle (vii), which was crucial to the
above argument, involves two notions (set of numbers and $\in$) which have
no counterpart in the structure $\mathfrak{N}$ and do not correspond to any symbol
of $\mathcal{L}$. It might therefore be thought that if we augment $\mathfrak{N}$ by adding these
notions (and extend $\mathcal{L}$ accordingly) we might be able to characterize the
augmented structure uniquely in the extended formal language. Let us
see whether this is so.
We start by specifying a structure $\mathfrak{U}$ (in the sense of §2 of Ch. 1) of which
postulates (i)-(vii) may reasonably be regarded as an informal (or semi-formal)
description. The universe $U$ of $\mathfrak{U}$ must contain all natural numbers
and sets of natural numbers. Thus $U = N \cup S$, where $S$ is the set of all
subsets of $N$. We take 0 as designated individual of $\mathfrak{U}$. The basic operations
of $\mathfrak{U}$ are $s$, $+$ and $\times$, i.e., successor, addition and multiplication². The
basic relations of $\mathfrak{U}$ are the properties (i.e., subsets of $U$) $N$ and $S$ and
the membership relation

$$E = \{\langle a,b\rangle : a \in b \in S\}.$$

Next, we specify the language $\mathcal{L}'$ appropriate for $\mathfrak{U}$. Clearly, $\mathcal{L}'$ must
be the language obtained from $\mathcal{L}$ by adding two unary predicate symbols
$\mathbf{N}$ and $\mathbf{S}$ and a binary predicate symbol $\mathbf{E}$.

¹ For a semi-formal development based on a similar set of postulates for the positive
integers, see Landau [1930].
² In order to conform to the format of §2 of Ch. 1, $s(a)$, $a + b$ and $a \times b$ must be defined
(and belong to $U$) even when $a$ or $b$ (or both) are sets of numbers rather than numbers.
We therefore define them arbitrarily (e.g. as equal to 0) in these cases. This will not
affect matters in any significant way.
We call $\mathfrak{U}$ the standard $\mathcal{L}'$-structure, since it is highly natural to regard
$\mathfrak{U}$ as the structure of natural numbers corresponding to $\mathcal{L}'$. (Actually,
$\mathfrak{U}$ is essentially what is usually called the second-order structure of natural
numbers and $\mathcal{L}'$ is essentially the corresponding second-order language,
except that here we have reduced them to first-order entities by the method
outlined in §3 of Ch. 1.)
The Peano postulates can now be formalized (i.e., “translated”
into $\mathcal{L}'$) in a straightforward way. For example, (i) is formalized as
$\forall\mathbf{x}(\mathbf{Nx} \to \mathbf{sx} \neq \mathbf{0})$, and (vii) as

(1) $\forall\mathbf{x}[\mathbf{Sx} \land \mathbf{E0x} \land \forall\mathbf{y}\{\mathbf{Eyx} \to \mathbf{Esyx}\} \to \forall\mathbf{y}\{\mathbf{Ny} \to \mathbf{Eyx}\}]$.

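It is immediately clear that if there is any hope of characterizing $\mathfrak{U}$ uniquely
in $\mathcal{L}'$, the formalized versions of (i)-(vii) must be supplemented by other
sentences, formalizing facts that are implicit in the semi-formal version.
These include sentences like $\mathbf{N0}$, $\forall\mathbf{x}(\mathbf{Nx} \to \mathbf{Nsx})$, $\forall\mathbf{x}\forall\mathbf{y}[\mathbf{Nx} \land \mathbf{Ny} \to \mathbf{N}(\mathbf{x}+\mathbf{y})]$
and, less trivially, sentences formalizing facts about the property $S$ and
relation $E$, for example:

(2) $\forall\mathbf{x}\forall\mathbf{y}(\mathbf{Sx} \land \mathbf{Eyx} \to \mathbf{Ny})$,

(3) $\forall\mathbf{x}\forall\mathbf{y}[\mathbf{Sx} \land \mathbf{Sy} \land \forall\mathbf{z}(\mathbf{Ezx} \leftrightarrow \mathbf{Ezy}) \to \mathbf{x} = \mathbf{y}]$.

(Sentence (3) is a formalized version of the extensionality principle for
sets of numbers.)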
Instead of chasing after such sentences one by one, let us go the whole
hog and consider the set $\Omega'$ of all sentences that hold in $\mathfrak{U}$, i.e.,

$$\Omega' = \{\varphi : \varphi \text{ an } \mathcal{L}'\text{-sentence},\ \mathfrak{U} \models \varphi\}.$$

Clearly, $\Omega'$ contains the formalized versions of the Peano postulates (e.g.,
the induction postulate (1)) as well as the additional sentences mentioned
above such as (2) and (3).

It is easy to define in a natural way the notion of isomorphism between
two $\mathcal{L}'$-structures (a one-one mapping between their universes, which
“respects” the basic operations and relations). We leave the (simple)
details to the reader.
Let ${}^*\mathfrak{U}$ be an arbitrary $\mathcal{L}'$-structure. We use a notation similar to the
one used above in connection with $\mathcal{L}$-structures: we let ${}^*U$ be the universe
of ${}^*\mathfrak{U}$, ${}^*0$ the designated individual of ${}^*\mathfrak{U}$, ${}^*N$ the property (i.e., subset
of ${}^*U$) corresponding to the predicate symbol $\mathbf{N}$, etc.
We say that the $\mathcal{L}'$-structure ${}^*\mathfrak{U}$ is regular if every member of ${}^*S$ is
a subset of ${}^*N$, and ${}^*E$ is the membership relation between members of
${}^*N$ and those of ${}^*S$, i.e.,

$${}^*E = \{\langle a,b\rangle : a \in b \in {}^*S\}.$$

The standard structure $\mathfrak{U}$ is clearly regular.

2.5. Problem. Let ${}^*\mathfrak{U}$ be a model of $\Omega'$. For any $a$ in ${}^*S$ put

$$f(a) = \{b : \langle b,a\rangle \in {}^*E\}.$$

Show that $f$ is a one-one mapping of ${}^*S$ into the set of all subsets of ${}^*N$.
(Use the fact that sentences (2) and (3) belong to $\Omega'$.) Hence show that if
we replace each $a$ by $f(a)$ we obtain a regular structure isomorphic with ${}^*\mathfrak{U}$.

By Prob. 2.5, there is no loss of generality in dealing only with those
models of $\Omega'$ which are regular.
An $\mathcal{L}'$-structure ${}^*\mathfrak{U}$ is full if it is regular and ${}^*S$ is the set of all subsets
of ${}^*N$.
The informal argument outlined above (which shows that (i)-(vii) characterize
the natural numbers uniquely) can now be converted into a proof
that every full model ${}^*\mathfrak{U}$ of $\Omega'$ is standard (i.e., isomorphic with $\mathfrak{U}$). For,
using the fact that the (formalised) induction postulate (1) belongs to $\Omega'$,
we can easily show that

$${}^*N = \{\mathbf{s}_n^{{}^*\mathfrak{U}} : n \in N\}.$$

Hence, putting

$$f(n) = \mathbf{s}_n^{{}^*\mathfrak{U}} \quad \text{for all } n \in N,$$

$$f(a) = \{f(n) : n \in a\} \quad \text{for all } a \in S,$$

it is easy to verify that $f$ is an isomorphism of $\mathfrak{U}$ onto ${}^*\mathfrak{U}$. (The details
are left as a problem to the reader.)
Does this mean that $\Omega'$ characterizes the standard structure $\mathfrak{U}$ uniquely?
Not at all. For it is easy to show that $\Omega'$ has nonstandard models (i.e., models
that are not isomorphic with $\mathfrak{U}$). To show that $\Omega'$ has a denumerable
nonstandard model we argue as in the proof of Thm. 2.3, except that instead
of the formulas $\mathbf{v}_1 \neq \mathbf{s}_n$ we use the conjunction formulas $\mathbf{Nv}_1 \land (\mathbf{v}_1 \neq \mathbf{s}_n)$.
Of course, we already know that a nonstandard model of $\Omega'$ cannot be
full, so if we simply decree that non-full models of $\Omega'$ are to be disregarded,
the only remaining model (up to isomorphism) is $\mathfrak{U}$. However, the point
is that there is no formal way of excluding the non-full models.
The informal characterization of the natural numbers works (or seems
to work) only because it tacitly assumes that the notion set of natural
numbers is interpreted correctly, as referring to all subsets of N. Thus it
is not an absolute characterization but only relative to the notion of power
set (set of all subsets) of a (possibly infinite) set. This latter notion cannot
be characterized in a purely formal way; besides, it is considerably more
problematic than the notion of natural number. (In this connection cf.
Thm. 10.7.9 and Prob. 10.7.10.)
If we believe that part of the task of mathematics is to characterize
structures such as $\mathfrak{N}$ (or $\mathfrak{U}$) uniquely up to isomorphism¹, then the above
results suggest that mathematics cannot be completely formalized.
However, we could take the view that the real task is to study not some
structure as such, but rather what can correctly be asserted of it in the
appropriate formal language; i.e., in our case, $\Omega$ rather than $\mathfrak{N}$. From
this point of view, the fact that $\Omega$ has “pathological” (nonstandard) models
as well as the intended standard one is not very damaging.
The trouble is that $\Omega$ was defined (see §1) in terms of $\mathfrak{N}$. Moreover,
the definition was purely semantic and thus highly non-constructive. It is
therefore natural to enquire whether $\Omega$ can be characterized in an alternative,
more constructive way².
Later in the present chapter we shall prove results that throw much light
on this problem.

§ 3. Arithmeticity

In this chapter, when we say function we mean (unless otherwise indicated)
a function in the sense of §1 of Ch. 6.
By ($n$-ary) relation we mean (unless otherwise indicated) a first-order
($n$-ary) relation in the sense of §6 of Ch. 6.

¹ Implicit in this is presumably the Platonistic belief in the objective existence of such
structures.
² What we have in mind is something like what we achieved for the notion of logical
validity; see §5 of Ch. 3.

3.1. Definition. Let $\Sigma$ be a theory and let $P$ be an $n$-ary relation. A formula
$\alpha \in \Phi_n$ weakly represents $P$ in $\Sigma$ if, for every $\vec{x} \in N^n$,

$$\alpha(\mathbf{s}_{\vec{x}}) \in \Sigma \quad \text{iff} \quad P(\vec{x}) \text{ holds}.$$

A formula $\alpha \in \Phi_n$ strongly represents $P$ in $\Sigma$ if, for every $\vec{x} \in N^n$,

$$\alpha(\mathbf{s}_{\vec{x}}) \in \Sigma \quad \text{if } P(\vec{x}) \text{ holds},$$

$$\neg\alpha(\mathbf{s}_{\vec{x}}) \in \Sigma \quad \text{otherwise}.$$

$P$ is weakly/strongly representable in $\Sigma$ if it is weakly/strongly represented
in $\Sigma$ by some $\alpha \in \Phi_n$.

The same terminology will also be used in connection with $n$-dimensional
sets (see beginning of §11, Ch. 6).
If $\alpha$ strongly represents $P$ in $\Sigma$, then $\alpha$ also weakly represents $P$ in $\Sigma$,
provided $\Sigma$ is consistent. (In the inconsistent theory $\Phi_0$, every relation is
strongly representable; but only a trivial relation $P$, such that $P(\vec{x})$ holds
for all $\vec{x} \in N^n$, is weakly representable.)
In the case $\Sigma = \Omega$, $\alpha$ strongly represents $P$ iff $\alpha$ weakly represents $P$.
Therefore in connection with $\Omega$ we shall drop the adverbs “weakly” and
“strongly”.

3.2. Definition. A relation is arithmetical if it is representable in $\Omega$.

The rest of this section is devoted to characterizing the class of arithmetical
relations.
By logical operations (on relations) we mean the propositional operations
defined in Ex. 6.6.6 as well as universal and existential quantification defined
in the beginning of §11, Ch. 6.

3.3. Lemma. The class of arithmetical relations is closed under the logical
operations.
Proof. Among the propositional operations it is enough to consider
negation and implication, for the others can be reduced to these. But if
$\alpha$ and $\beta$ represent in $\Omega$ the $n$-ary relations $P$ and $Q$, then clearly $\neg\alpha$ and
$\alpha \to \beta$ represent $\neg P$ and $P \to Q$.
If $P(\vec{x}) = \exists y\, Q(\vec{x},y)$ and $\alpha$ represents $Q$ in $\Omega$, then clearly $\exists\mathbf{v}_{n+1}\alpha$ represents
$P$. Universal quantification can be reduced to this and negation. ∎

3.4. Lemma. Every $n$-ary diophantine relation is represented in $\Omega$ by a formula
of the form $\exists\mathbf{v}_{n+1} \ldots \exists\mathbf{v}_{n+m}(\mathbf{r} = \mathbf{t})$, where $m \ge 0$.
Proof. Let $P$ be an $n$-ary elementary relation. Then (see §12 of Ch. 6)

$$P(\vec{x}) = (f(\vec{x}) = g(\vec{x})),$$

where $f(\vec{x})$ and $g(\vec{x})$ are polynomials in the $n$ variables $\vec{x}$, with coefficients
in $N$. Each such polynomial can be formalized (i.e., “translated” into $\mathcal{L}$)
by replacing each $x_i$ by $\mathbf{v}_i$, each coefficient by the corresponding numeral,
and $+$ and $\times$ by $\boldsymbol{+}$ and $\boldsymbol{\times}$. (For example, $3x_1^2x_2 + x_3$ can be formalized
as $\mathbf{s}_3 \times \mathbf{v}_1 \times \mathbf{v}_1 \times \mathbf{v}_2 + \mathbf{v}_3$.) Let $\mathbf{r}$ and $\mathbf{t}$ be terms which formalize $f(\vec{x})$ and
$g(\vec{x})$ respectively. Clearly the formula $\mathbf{r} = \mathbf{t}$ represents $P$ in $\Omega$.
Every diophantine relation is obtainable by existential quantifications
from an elementary relation. Hence by (the proof of) Lemma 3.3, the
present lemma is proved. ∎
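
The translation of polynomials into terms described in the proof of Lemma 3.4 can be sketched as follows. This is an illustration under assumptions: a polynomial is given as a list of (coefficient, exponents) pairs, terms are nested tuples, `("num", k)` abbreviates the numeral $\mathbf{s}_k$, and the names `formalize`, `render` and `evaluate` are hypothetical. A coefficient 1 in front of variables is omitted, as in the example in the text.

```python
def formalize(poly):
    """Translate a polynomial with coefficients in N into a term.

    poly is a list of (coefficient, exponents) pairs; exponents[i-1] is the
    exponent of x_i. ("num", k) stands for the numeral s_k.
    """
    summands = []
    for coeff, exps in poly:
        # Omit a coefficient 1 when the monomial contains a variable.
        factors = [] if coeff == 1 and any(exps) else [("num", coeff)]
        for i, e in enumerate(exps, start=1):
            factors += [("v", i)] * e
        monomial = factors[0]
        for fac in factors[1:]:
            monomial = ("*", monomial, fac)
        summands.append(monomial)
    if not summands:
        return ("num", 0)
    total = summands[0]
    for s in summands[1:]:
        total = ("+", total, s)
    return total

def render(term):
    """Write a term produced by formalize, omitting parentheses as usual."""
    tag = term[0]
    if tag == "num":
        return f"s{term[1]}"
    if tag == "v":
        return f"v{term[1]}"
    op = "×" if tag == "*" else "+"
    return render(term[1]) + op + render(term[2])

def evaluate(term, xs):
    """Value of the term in the standard structure when v_i is assigned xs[i-1]."""
    tag = term[0]
    if tag == "num":
        return term[1]
    if tag == "v":
        return xs[term[1] - 1]
    left, right = evaluate(term[1], xs), evaluate(term[2], xs)
    return left * right if tag == "*" else left + right
```

Applied to `[(3, (2, 1, 0)), (1, (0, 0, 1))]`, that is to $3x_1^2x_2 + x_3$, `render(formalize(...))` reproduces the term given in the proof, and `evaluate` agrees with the value of the polynomial at any chosen arguments.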

3.5. Corollary. Every r.e. relation, and in particular every recursive relation,
is arithmetical.
Proof. Immediate from Lemma 3.4 and the MRDP Thm. 6.16.1. ∎

In the following problem we sketch an alternative proof of Cor. 3.5,
which does not use the MRDP Thm.

3.6. Problem. (i) Verify that the results of §12, Ch. 6 (6.12.2-6.12.4 and
6.12.8) continue to hold if the word “diophantine” is replaced by “arithmetical”.
(ii) Show that the class of arithmetical relations is closed under bounded
universal quantification. (Observe that $\forall y{<}z\, P(\vec{x},y) = \forall y[(y<z) \to P(\vec{x},y)]$.)
(iii) Hence show (without the MRDP Thm.) that every r.e. relation is
arithmetical. (Argue as in the discussion following Lemma 6.12.8.)

3.7. Theorem. The class of arithmetical relations is the smallest class
containing all recursive relations and closed under the logical operations.
Proof. By Lemma 3.3 and Cor. 3.5, it is enough to show that if $P$ is an
$n$-ary relation represented in $\Omega$ by a formula $\alpha \in \Phi_n$, then $P$ can be obtained
from recursive relations by logical operations. We show this by induction
on deg $\alpha$.
If $\alpha$ is atomic then $P$ is clearly an elementary (hence p.r. and a fortiori
recursive) relation.
The cases $\alpha = \neg\beta$ and $\alpha = \beta \to \gamma$ are left to the reader.
If $\alpha = \forall\mathbf{x}\beta$ we choose a variant $\alpha'$ of $\alpha$ such that $\alpha' = \forall\mathbf{v}_{n+1}\beta'$. Clearly,
$\alpha'$ also represents $P$. Then $\beta'$ belongs to $\Phi_{n+1}$ and represents in $\Omega$ an
$(n+1)$-ary relation which, by the induction hypothesis, is of the required
kind, and from which $P$ is obtained by universal quantification.

3.8. Problem. Verify that Thm. 3.7 continues to hold if the word “recursive”
is replaced by “p.r.” or “elementary”.
3.9. Problem. Show that a relation is arithmetical iff its graph is arithmetical.

3.10. Definition. A function is arithmetical if its graph is arithmetical.

3.11. Remark. A relation is, in particular, a function (whose values are
all in the set $\{0,1\}$). By Prob. 3.9, a relation is arithmetical as a function
(i.e., in the sense of Def. 3.10) iff it is arithmetical as a relation (i.e., in the
sense of Def. 3.2). Thus Defs. 3.2 and 3.10 are compatible.
Notice that, by Thm. 6.12.4 and Prob. 3.6, a relation obtained by composing
an arithmetical relation with arithmetical functions is itself arithmetical.

We conclude this section with:

3.12. Theorem. Every recursive function is arithmetical.
Proof. Immediate from Thm. 6.11.3, Def. 3.10 and Cor. 3.5. ∎

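*3.13. Problem. Let $D$ be the collection of all arithmetical total unary
functions. Let $\mathcal{F}$ be an ultrafilter over $N$ and put

$$D/\mathcal{F} = \{f/\mathcal{F} : f \in D\}.$$

(i) Let $\varphi \in \Phi_{n+1}$ and $f_1,\ldots,f_n \in D$. For any $g \in N^N$ put

$$X(g) = \{k \in N : \mathfrak{N} \models \varphi[f_1(k),\ldots,f_n(k),g(k)]\}.$$

Show that if $X(g) \in \mathcal{F}$ then there exists an $h \in D$ such that $X(h) \in \mathcal{F}$.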
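(ii) Show that $D/\mathcal{F}$ is the universe of an elementary substructure of the
ultrapower $\mathfrak{N}^N/\mathcal{F}$. (Use Lemma 5.1.5, Łoś’ theorem 5.3.7, and (i).)
(iii) Show that if $\mathcal{F}$ is non-principal then this substructure is a denumerable
nonstandard model of $\Omega$. (For the nonstandardness, consider the individual $\lambda x(x)/\mathcal{F}$.)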

§ 4. Tarski’s Theorem

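We assign a code number $\#\mathbf{t}$ to each term $\mathbf{t}$ as follows:

$$\#\mathbf{0} = 1, \qquad \#\mathbf{v}_i = 3^i \quad \text{for } i = 1,2,3,\ldots,$$

$$\#\mathbf{s}\mathbf{r} = 2 \cdot 3^{\#\mathbf{r}},$$

$$\#(\mathbf{r}+\mathbf{t}) = 4 \cdot 3^{\#\mathbf{r}} \cdot 5^{\#\mathbf{t}},$$

$$\#(\mathbf{r}\times\mathbf{t}) = 8 \cdot 3^{\#\mathbf{r}} \cdot 5^{\#\mathbf{t}}.$$

Also, we assign a code number $\#\alpha$ to each formula $\alpha$:

$$\#(\mathbf{r}=\mathbf{t}) = 16 \cdot 3^{\#\mathbf{r}} \cdot 5^{\#\mathbf{t}},$$

$$\#(\neg\alpha) = 32 \cdot 3^{\#\alpha},$$

$$\#(\alpha \to \beta) = 64 \cdot 3^{\#\alpha} \cdot 5^{\#\beta},$$

and, finally,

$$\#\forall\mathbf{v}_i\alpha = 2^{6+i} \cdot 3^{\#\alpha} \quad \text{for } i = 1,2,3,\ldots.$$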
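
The coding just defined is easy to compute for small expressions. The sketch below assumes that terms and formulas are given as nested tuples with the tags shown in the comments; this representation and the name `code` are conveniences of the example, not part of the text.

```python
def code(e):
    """Code number #e of a term or formula given as a nested tuple.

    Assumed tags: ("0",), ("v", i), ("s", r), ("+", r, t), ("*", r, t),
    ("=", r, t), ("not", a), ("->", a, b), ("all", i, a).
    """
    tag = e[0]
    if tag == "0":
        return 1
    if tag == "v":                      # variable v_i, i >= 1
        return 3 ** e[1]
    if tag == "s":
        return 2 * 3 ** code(e[1])
    if tag == "+":
        return 4 * 3 ** code(e[1]) * 5 ** code(e[2])
    if tag == "*":
        return 8 * 3 ** code(e[1]) * 5 ** code(e[2])
    if tag == "=":
        return 16 * 3 ** code(e[1]) * 5 ** code(e[2])
    if tag == "not":
        return 32 * 3 ** code(e[1])
    if tag == "->":
        return 64 * 3 ** code(e[1]) * 5 ** code(e[2])
    if tag == "all":                    # universal quantifier on v_i
        return 2 ** (6 + e[1]) * 3 ** code(e[2])
    raise ValueError(f"not an expression: {e!r}")
```

For example, the formula $\mathbf{v}_1 = \mathbf{0}$ receives the code $16 \cdot 3^3 \cdot 5^1 = 2160$. Code numbers grow very quickly with nesting depth, so only small expressions can be handled in practice.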

For the sake of brevity we write, e.g., “TERM” as short for “code number
of (a) term” and similarly in other cases; so that, when a word (or phrase)
is printed in small capitals, it should be read with the words “code number
of” prefixed to it.
Various functions and relations connected with the syntax of $\mathcal{L}$ are
easily seen to be recursive.

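4.1. Example. Consider the property Tm such that $\mathrm{Tm}(x)$ holds iff $x$ is
a TERM. This property is uniquely determined by the identity

$$\begin{aligned} \mathrm{Tm}(x) = {} & \exists u{<}x\,(x = 3^u) \\ & \lor\ \exists u{<}x\, \exists v{<}x\, \{\mathrm{Tm}(u) \land \mathrm{Tm}(v) \\ & \qquad \land\ [(x = 2 \cdot 3^u) \lor (x = 4 \cdot 3^u \cdot 5^v) \lor (x = 8 \cdot 3^u \cdot 5^v)]\}. \end{aligned}$$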

The quickest way to verify that Tm is recursive is via the Recursion Theorem.
If in the above equality we replace “Tm” everywhere by “$\{y\}_1$”, we obtain
a condition of the form

$$\{y\}_1(x) = f(x,y),$$

where $f$ is a recursive function¹. By Cor. 6.10.20 we can find a value $e$ of
$y$ for which this condition holds identically in $x$. For this $e$ we clearly
have the identity $\mathrm{Tm}(x) = \{e\}(x)$, which shows Tm to be recursive². The
same method can be used in other cases considered below.
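
The identity of Ex. 4.1 can also be read as a recipe for deciding Tm directly. The following sketch gives two versions: `tm_literal` follows the identity word for word by bounded search, while `is_term_code` uses the observation (ours, not the text's) that $u$ and $v$ are determined by $x$ as the exponents of 3 and 5. All names are hypothetical, and neither version is meant to replace the argument via the Recursion Theorem.

```python
from functools import lru_cache

def exponent(p, x):
    """Exponent of the prime p in x, for x >= 1."""
    e = 0
    while x % p == 0:
        x //= p
        e += 1
    return e

@lru_cache(maxsize=None)
def is_term_code(x):
    """Tm(x), using that u and v in the identity are determined by x."""
    if x < 1:
        return False
    u, v = exponent(3, x), exponent(5, x)
    if x == 3 ** u:                       # code of 0 (u = 0) or of v_u
        return True
    if x == 2 * 3 ** u:                   # code of s r
        return is_term_code(u)
    if x in (4 * 3 ** u * 5 ** v, 8 * 3 ** u * 5 ** v):
        return is_term_code(u) and is_term_code(v)
    return False

@lru_cache(maxsize=None)
def tm_literal(x):
    """Tm(x) read off the identity literally, by search over u, v < x."""
    if any(x == 3 ** u for u in range(x)):
        return True
    for u in range(x):
        for v in range(x):
            if tm_literal(u) and tm_literal(v) and x in (
                2 * 3 ** u, 4 * 3 ** u * 5 ** v, 8 * 3 ** u * 5 ** v
            ):
                return True
    return False
```

The two functions can be compared on an initial segment of $N$; for instance 6, 54 and 60 are TERMS (codes of $\mathbf{s0}$, $\mathbf{sv}_1$ and $\mathbf{0}+\mathbf{0}$), whereas 2 and 180 are not.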

¹ Bounded existential quantification was introduced (Ex. 6.6.9) only in connection with
relations, but (as observed there) “$\exists u{<}x$”, e.g., can be replaced by an operation which is
applicable to any function(al).
² Actually, Tm as well as other functions and relations introduced below can easily
be shown to be primitive recursive, but we shall not make use of this.

4.2. Problem. Show that the property Fla, such that $\mathrm{Fla}(x)$ holds iff $x$
is a FORMULA, is recursive.
4.3. Problem. Let Vbl be the binary relation such that $\mathrm{Vbl}(x,y)$ holds
iff $y > 0$ and $x$ is a TERM containing the $y$th variable $\mathbf{v}_y$ or $x$ is a FORMULA
containing $\mathbf{v}_y$ free. Show that Vbl is recursive.
4.4. Problem. Let Frm be the binary relation such that $\mathrm{Frm}(x,y)$ holds
iff $x$ is a FORMULA belonging to $\Phi_y$ (i.e., having all its free variables among
$\mathbf{v}_1,\ldots,\mathbf{v}_y$). Verify that

$$\mathrm{Frm}(x,y) = \mathrm{Fla}(x) \land \forall z{<}x\, [\mathrm{Vbl}(x,z) \to (z \le y)],$$

hence Frm is recursive. (The point here is to verify that the bounded
quantifier $\forall z{<}x$ can be used instead of the unbounded $\forall z$.)
The following example is of very great importance.

4.5. Example. Consider the total ternary function sb such that if $x$ is an
EXPRESSION (i.e., TERM or FORMULA) and $y > 0$ then $\mathrm{sb}(x,y,z)$ is the EXPRESSION
obtained from it by substituting the numeral $\mathbf{s}_z$ for the variable $\mathbf{v}_y$; otherwise
(i.e. if $x$ is not an EXPRESSION or $y = 0$) we let $\mathrm{sb}(x,y,z) = x$.
Let $\mathrm{num}(z) = \#\mathbf{s}_z$. Then

$$\mathrm{num}(0) = 1,$$

$$\mathrm{num}(z+1) = 2 \cdot 3^{\mathrm{num}(z)};$$

so num is recursive. It is now easy to verify the identity

$$\mathrm{sb}(x,y,z) = \begin{cases} \mathrm{num}(z) & \text{if } (x = 3^y) \land (y > 0), \\ f(x,y,z) & \text{if } \mathrm{Vbl}(x,y) \land (x \neq 3^y), \\ x & \text{otherwise,} \end{cases}$$

where

$$f(x,y,z) = \exp(2,(x)_0) \cdot \exp(3,\mathrm{sb}((x)_1,y,z)) \cdot \exp(5,\mathrm{sb}((x)_2,y,z))$$

and Vbl is the relation defined in Prob. 4.3.
From this identity it follows at once (e.g., by the method of Ex. 4.1) that
sb is recursive.
We now define the diagonal sequence $d$ by the identity

$$d(x) = \mathrm{sb}(x,1,x).$$

Clearly, $d$ is a recursive sequence. Moreover, for any formula $\alpha$ we have

(4.6) $d(\#\alpha) = \#[\alpha(\mathbf{s}_{\#\alpha})]$.
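
The case identity for sb can be run on small inputs. The sketch below is an illustration under assumptions: `analyse` is a hypothetical helper that decides whether a number is a TERM or FORMULA and collects its free variables (one possible solution to Probs. 4.2 and 4.3, not the one intended by the text), and `exponent(p, x)` plays the role of $(x)_i$. Since $\mathrm{num}(z)$ grows like an iterated exponential, $d(\#\alpha)$ cannot be computed in practice for any actual formula $\alpha$; only sb with very small $z$ is feasible.

```python
from functools import lru_cache

def exponent(p, x):
    """Exponent of the prime p in x; taken to be 0 for x < 1."""
    e = 0
    while x >= 1 and x % p == 0:
        x //= p
        e += 1
    return e

@lru_cache(maxsize=None)
def analyse(x):
    """Return (kind, free variables) if x codes a term or formula, else None."""
    if x < 1:
        return None
    e, a, b = exponent(2, x), exponent(3, x), exponent(5, x)
    if x != 2 ** e * 3 ** a * 5 ** b:
        return None
    if e == 0 and b == 0:                                   # 0 or variable v_a
        return ("term", frozenset({a}) if a > 0 else frozenset())
    A, B = analyse(a), analyse(b)
    if e == 1 and b == 0 and A and A[0] == "term":          # s r
        return ("term", A[1])
    if e in (2, 3) and A and B and A[0] == B[0] == "term":  # r+t, r×t
        return ("term", A[1] | B[1])
    if e == 4 and A and B and A[0] == B[0] == "term":       # r = t
        return ("formula", A[1] | B[1])
    if e == 5 and b == 0 and A and A[0] == "formula":       # negation
        return ("formula", A[1])
    if e == 6 and A and B and A[0] == B[0] == "formula":    # implication
        return ("formula", A[1] | B[1])
    if e >= 7 and b == 0 and A and A[0] == "formula":       # quantifier on v_(e-6)
        return ("formula", A[1] - {e - 6})
    return None

def vbl(x, y):
    """Vbl(x, y): y > 0 and v_y occurs (free) in the expression coded by x."""
    info = analyse(x)
    return y > 0 and info is not None and y in info[1]

def num(z):
    """num(z) = code of the numeral s_z."""
    n = 1
    for _ in range(z):
        n = 2 * 3 ** n
    return n

def sb(x, y, z):
    """The case identity of Ex. 4.5, evaluated by direct recursion."""
    if y > 0 and x == 3 ** y:
        return num(z)
    if vbl(x, y) and x != 3 ** y:
        return (2 ** exponent(2, x)
                * 3 ** sb(exponent(3, x), y, z)
                * 5 ** sb(exponent(5, x), y, z))
    return x

def d(x):
    """The diagonal sequence d(x) = sb(x, 1, x)."""
    return sb(x, 1, x)
```

For instance, with $x = 2160 = \#(\mathbf{v}_1 = \mathbf{0})$, `sb(2160, 1, 2)` equals $16 \cdot 3^{1458} \cdot 5$, the code of $\mathbf{s}_2 = \mathbf{0}$, while a bound occurrence of $\mathbf{v}_1$, as in $\forall\mathbf{v}_1(\mathbf{v}_1 = \mathbf{0})$, is left untouched.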

The main result of this section is the following theorem, due to Tarski.

4.7. Theorem. Let $T$ be the property such that $T(x)$ holds iff $x = \#\varphi$ for
some $\varphi \in \Omega$. Then $T$ is not arithmetical.
Proof. The diagonal function $d$ is recursive, so by Thm. 3.12 it is arithmetical.
If $T$ were arithmetical, then by Lemma 3.3 its negation $\neg T$ would
also be arithmetical. By Remark 3.11 the property $\lambda x[\neg T(d(x))]$
would be arithmetical as well. Let $\alpha$ be a formula (in $\Phi_1$) representing this
property in $\Omega$. Thus for every $x$ we have

$$\alpha(\mathbf{s}_x) \in \Omega \quad \text{iff} \quad \neg T(d(x)) \text{ holds}.$$

In particular, giving $x$ the value $\#\alpha$, we have

$$\alpha(\mathbf{s}_{\#\alpha}) \in \Omega \quad \text{iff} \quad \neg T(d(\#\alpha)) \text{ holds}.$$

But by (4.6) and the definition of $T$ we have, on the contrary,

$$\neg T(d(\#\alpha)) \text{ holds} \quad \text{iff} \quad \alpha(\mathbf{s}_{\#\alpha}) \notin \Omega.$$

This contradiction shows that $T$ cannot be arithmetical. ∎


The $T$ of Thm. 4.7 is the property of being a TRUTH (i.e., a true SENTENCE).
Somewhat less precisely, $T$ may be called the property of truth in arithmetic.
Then Thm. 4.7 may be expressed by saying that the property of truth in
arithmetic is not arithmetical.
Let us analyse the proof of Thm. 4.7. If $P$ is an $n$-ary relation and $\alpha$
represents $P$ in $\Omega$, then for any $\vec{a} \in N^n$ the sentence $\alpha(\mathbf{s}_{\vec{a}})$ may be construed
as (formally) asserting that $P(\vec{a})$ holds. For, $\alpha(\mathbf{s}_{\vec{a}})$ is true iff $P(\vec{a})$ holds.
In particular, if $\alpha$ were a formula representing in $\Omega$ the property
$\lambda x[\neg T(d(x))]$, then $\alpha(\mathbf{s}_{\#\alpha})$ would assert that $\neg T(d(\#\alpha))$ holds. But, by
(4.6) and the definition of $T$, we see that “$\neg T(d(\#\alpha))$ holds” is equivalent
to “$\alpha(\mathbf{s}_{\#\alpha})$ is false”. Thus $\alpha(\mathbf{s}_{\#\alpha})$ would assert something equivalent to
“I am false”. Such a sentence cannot exist in $\mathcal{L}$, for it could neither be
true nor false; so we conclude that the property in question is not arithmetical.
This has an obvious connection with the well-known Liar paradox.
(A person says “I am lying now”; is he lying or telling the truth?...)
The difference is that while paradoxical sentences asserting their own
falsity are (apparently) possible in the imprecise natural languages with
their hazy interpretations, no such sentence can exist in a precisely constructed
formal language with a rigorously defined interpretation. Therefore
the assumption that $T$ is arithmetical is absurd because it implies the
existence of such a sentence in $\mathcal{L}$ under the interpretation $\mathfrak{N}$.
An argument like the one used in the proof of Thm. 4.7 can be applied
in a wide variety of cases. Consider any formal language with a particular
“standard” interpretation. (In our case these were the language ℒ with
its standard interpretation 𝔑.) It may be possible, using some method of
coding, to express in this language certain notions of its own syntax. Suppose
that this can be done to the extent that an analogue of the diagonal function
can be so expressed. (In our case, the diagonal function was arithmetical,
i.e., could be expressed in ℒ under the standard interpretation 𝔑.) Then
the property of being a true sentence of the formal language (under its
standard interpretation) cannot be expressed in the language itself. For,
if it could, the Liar paradox would be reproduced in that language as we
saw in the proof of Thm. 4.7.
This suggests that there cannot exist a formal language which — under
some “standard” interpretation — could adequately serve as its own
metalanguage; for the syntax and semantics of a formal language cannot
both be adequately expressed within the language itself. In particular,
the dream of certain philosophers, that some day a precise formal language
will be constructed in which all scientific notions and theories would be
expressible, is most probably unrealizable. For, the syntax and semantics
of such a language would surely be part of science, and hence would have
to be expressible within that language, leading (as in the proof of Thm 4.7)
to the Liar paradox.

In the rest of this section we sketch the proof of a somewhat stronger
form of Tarski’s Theorem.

4.8. Definition. Let f be an n-ary function, and let α ∈ Φ_{n+1}. We say that
α numeralwise represents f in a theory Σ if for any n+1 numbers a, b such
that f(a) = b we have

∀v_{n+1}[α(s_a) ↔ v_{n+1} = s_b] ∈ Σ.

4.9. Problem. Let α numeralwise represent the n-ary function f in the
theory Σ. Let β ∈ Φ_1 and let β' be the formula

∃v_{n+1}[β(v_{n+1}) ∧ α].

Verify that if f(a) = b then the sentence β(s_b) ↔ β'(s_a) belongs to Σ.
4.10. Problem. A formula γ ∈ Φ_1 is called a truth definition inside a theory
Σ if for every sentence φ we have

γ(s_{#φ}) ↔ φ ∈ Σ.

(i) Show that if the diagonal function d is numeralwise representable
in Σ and Σ is consistent, there cannot exist a truth definition inside Σ.
(Use Prob. 4.9 to find a formula δ such that for all n the sentence
¬γ(s_{d(n)}) ↔ δ(s_n) belongs to Σ and take φ as δ(s_{#δ}).)
(ii) Use (i) to show that there is no truth definition inside Ω; hence
get a new proof for Thm. 4.7.
(iii) Show that if Σ is a sound theory, there is no truth definition inside Σ.

§ 5. Axiomatic theories

In order to continue our discussion of Tarski’s Thm. 4.7, we lay down:

5.1. Definition. For any set Σ of sentences we let T_Σ be the property
such that T_Σ(x) holds iff x is the code number of a sentence belonging to Σ.

In this notation, the T of Thm. 4.7 is T_Ω.


We now ask: under what conditions is it reasonable to regard a theory
Σ as axiomatic? Surely, it is not enough to require that Σ be based on
some set of “extralogical axioms”, i.e., postulates; because the whole
of Σ can always be taken as a set of postulates for itself, so that every
theory would be axiomatic. We must lay down some condition on the
set Γ of postulates. Ordinarily, one requires (at least) that the sentences
of Γ be generated one by one by some mechanical effective procedure.
This is tantamount to saying that the set of sentences of Γ is enumerated
by some algorithmic function. By Church’s thesis and Thm. 6.11.5, this
means that T_Γ is recursively enumerable. These considerations lead us to:

5.2. Definition. A theory Σ is axiomatizable if there exists a set Γ of
postulates for Σ such that T_Γ is recursively enumerable. If such a set
Γ of postulates is actually given to us so that we can find an r.e. index for
T_Γ (in the sense of §11 of Ch. 6), then we say that Σ is axiomatic.

5.3. Theorem. For any axiomatizable theory Σ there exists a set Δ of
postulates for Σ such that T_Δ is recursive.
Proof. Let Γ, with r.e. T_Γ, be a set of postulates for Σ. If Γ = ∅, then
T_Γ is recursive and there is nothing to prove. If Γ ≠ ∅, then by Thm. 6.11.5
there is a total recursive function f that enumerates the sentences of Γ,
i.e., we have

Γ = {γ_n : n ∈ N}

and

f(n) = #γ_n for all n.

Let

δ_0 = γ_0, δ_{n+1} = γ_{n+1} ∧ δ_n for all n.

Then Δ = {δ_n : n ∈ N} is clearly a set of postulates for Σ. Also, if g(n) = #δ_n,
then it is easy to see that g is a monotone increasing recursive function.
Therefore, by Thm. 6.11.9, T_Δ is recursive. ∎
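The idea of the proof of Thm. 5.3 can be sketched on strings. This is a toy under stated assumptions: `gamma` is a hypothetical enumeration of placeholder postulates, and string length stands in for the monotone increasing code numbers g(n) of the text.

```python
from itertools import count
from typing import Iterator


def gamma() -> Iterator[str]:
    """Hypothetical effective enumeration of postulates gamma_0, gamma_1, ..."""
    for n in count():
        yield f"A{n}"  # placeholder sentences, not real arithmetic


def delta() -> Iterator[str]:
    """delta_0 = gamma_0, delta_{n+1} = (gamma_{n+1} & delta_n)."""
    gen = gamma()
    current = next(gen)
    yield current
    for g in gen:
        current = f"({g} & {current})"
        yield current


def in_delta(sentence: str) -> bool:
    """Decide membership in {delta_n}.

    The lengths of delta_0, delta_1, ... strictly increase, so the search
    can stop as soon as a longer delta_n has been generated.
    """
    for dn in delta():
        if dn == sentence:
            return True
        if len(dn) > len(sentence):
            return False
    return False  # not reached: the enumeration is infinite
```

The point of the design is that an enumeration in increasing order turns a mere listing procedure into a decision procedure, which is what Thm. 6.11.9 is used for in the proof.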

5.4. Theorem. A theory Σ is axiomatizable iff T_Σ is recursively enumerable.
Proof. If T_Σ is r.e., then Σ is axiomatizable because we can take the whole
of Σ as a set of postulates.
Conversely, let Γ be a set of postulates for Σ and let T_Γ be recursively
enumerable.
Let Ax be the property such that Ax(y) holds iff y is an axiom of the
predicate calculus (see §1 of Ch. 3). We leave to the reader the simple (if
somewhat tedious) task of verifying that Ax is recursive.
We encode finite sequences of formulas: to the sequence α_0,...,α_n we
assign the code number exp(p_0,#α_0) · ... · exp(p_n,#α_n).
Let Ded_Γ be the property such that Ded_Γ(y) holds iff y is a first-order
deduction from Γ. It is easily verified that

Ded_Γ(y) ≡ (lh(y) > 0)
  ∧ ∀z<lh(y) {Ax((y)_z) ∨ T_Γ((y)_z)
  ∨ ∃w<z ∃u<z [(y)_u = 64 · exp(3,(y)_w) · exp(5,(y)_z)]}.

From Lemma 6.11.2 it follows that Ded_Γ is an r.e. property. (We observe, by
the way, that if T_Γ is recursive then Ded_Γ is recursive as well.)
Now, we clearly have

T_Σ(x) ≡ Frm(x,0) ∧ ∃y[Ded_Γ(y) ∧ (x = (y)_{lh(y)∸1})].

It follows from Prob. 4.4 and Lemma 6.11.2 that T_Σ is an r.e. property. ∎
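The coding of finite sequences by prime powers used in this proof can be sketched as follows. The helper names are illustrative, and the convention chosen for `lh` (the number of consecutive primes p_0, p_1, ... dividing y) is an assumption, since the official definition belongs to Ch. 6.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for small examples)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def encode(codes):
    """Code of a sequence: p_0^c_0 * ... * p_n^c_n, each c_i assumed >= 1."""
    y = 1
    for p, c in zip(primes(), codes):
        y *= p ** c
    return y


def component(y, z):
    """(y)_z: the exponent of the z-th prime p_z in y (y assumed >= 1)."""
    for i, p in enumerate(primes()):
        if i == z:
            e = 0
            while y % p == 0:
                y //= p
                e += 1
            return e


def lh(y):
    """Assumed convention: number of consecutive primes p_0, p_1, ... dividing y >= 1."""
    n = 0
    for p in primes():
        if y % p != 0:
            return n
        n += 1
```

With these helpers, the last entry of a coded deduction y is `component(y, lh(y) - 1)`, matching the term (y)_{lh(y)∸1} in the formula for T_Σ.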

5.5. Corollary. Ω is not axiomatizable.
Proof. Immediate from Cor. 3.5 and Thms. 4.7 and 5.4. ∎

By Church’s Thesis, this means that there can be no effective procedure
for generating one by one all the sentences of a set of postulates for Ω.
This constitutes a serious difficulty for the formalist view of mathematics.

We conclude this section with a result on representability in axiomatizable
theories.
5.6. Theorem. Only r.e. relations are weakly representable in an
axiomatizable theory.

Proof. Let α ∈ Φ_n, and for x = (x_1,...,x_n) put

f(x) = #[α(s_x)].

If α weakly represents the n-ary relation P in the theory Σ, then the
identity

P(x) ≡ T_Σ(f(x))

must hold. But if a = #α then

f(x) = sb(...sb(sb(a,1,x_1),2,x_2)...,n,x_n).

Since sb is recursive (see Ex. 4.5), f is recursive as well. Therefore, if T_Σ is
r.e., P must be r.e. by Lemma 6.11.2. ∎
5.7. Problem. Employ Thm. 5.6 to obtain an alternative proof of Cor. 5.5,
not using Tarski’s Theorem.
5.8. Problem. Prove that only recursive relations are strongly
representable in a consistent axiomatizable theory.

§ 6. Baby arithmetic
In this and the following three sections we shall present four theories, each
of which has some special important feature. These theories will all be used
later on.
In discussing these theories we shall often leave it to the reader to
verify claims of the form Σ ⊢ α. In all these cases it is of course possible to
construct a deduction of α from Σ. However, the reader will find it less
tedious to verify (by tableau or directly) that Σ ⊨ α and then use the
Completeness Theorem 3.3.14.
In the present section we shall study the theory Π_0, which we shall also
call baby arithmetic, and which is based on the following postulates:

(6.1) s_n + s_0 = s_n,

(6.2) s_n + s_{m+1} = s(s_n + s_m),

(6.3) s_n × s_0 = s_0,

(6.4) s_n × s_{m+1} = (s_n × s_m) + s_n,

for all numbers n and m.

All these postulates are clearly true (i.e., hold in 𝔑), hence Π_0 ⊆ Ω, i.e.,
Π_0 is sound.
Π_0 is a very weak theory: it is evidently satisfied in a structure whose
universe consists of a single individual. Thus, e.g., even the sentence
s_0 ≠ s_1 is not in Π_0.

6.5. Problem. Show that, for each n, the only n-ary relations strongly
representable in Π_0 are the trivial ones: N^n and the empty n-ary relation.

Nevertheless, we shall soon see that every r.e. relation is weakly
representable in Π_0.

6.6. Lemma. Let t be a closed term, and let k = t^𝔑 be its value in 𝔑.
Then (t = s_k) ∈ Π_0.

Proof. First, let t be s_n + s_m. We must show that (s_n + s_m = s_{n+m}) ∈ Π_0 for
all n and m. But it is easy to verify by induction on m that s_n + s_m = s_{n+m}
is deducible (in the predicate calculus) from postulates (6.1) and (6.2).
Next, let t be s_n × s_m. By induction on m we verify that (s_n × s_m = s_{nm}) ∈ Π_0.
Finally, if t is an arbitrary closed term, it is now easy to verify the assertion
of our lemma by induction on deg t. ∎
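The induction in Lemma 6.6 is in effect a procedure for computing with numerals by means of postulates (6.1)-(6.4) alone. A minimal sketch, assuming closed terms are represented as nested tuples (a representation chosen here for illustration, not taken from the text):

```python
def numeral(n):
    """The numeral s_n: the symbol for 0 with n successor symbols applied."""
    t = "0"
    for _ in range(n):
        t = ("s", t)
    return t


def add(sn, sm):
    """Rewrite s_n + s_m to a numeral using only (6.1) and (6.2)."""
    if sm == "0":
        return sn                       # (6.1)
    return ("s", add(sn, sm[1]))        # (6.2)


def mul(sn, sm):
    """Rewrite s_n x s_m to a numeral using (6.3), (6.4) and add."""
    if sm == "0":
        return "0"                      # (6.3)
    return add(mul(sn, sm[1]), sn)      # (6.4)


def normalize(t):
    """Reduce an arbitrary closed term to a numeral, innermost subterms first."""
    if t == "0":
        return t
    if t[0] == "s":
        return ("s", normalize(t[1]))
    r, u = normalize(t[1]), normalize(t[2])
    return add(r, u) if t[0] == "+" else mul(r, u)


def value(num):
    """Read off n from the numeral s_n."""
    n = 0
    while num != "0":
        num = num[1]
        n += 1
    return n
```

Each rewriting step corresponds to one use of a postulate, so a run of `normalize` mirrors a deduction of t = s_k from Π_0.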

6.7. Lemma. Let φ be a sentence of the form ∃x_1...∃x_m(r=t) where m ≥ 0.
If φ ∈ Ω, then φ ∈ Π_0.
Proof. By induction on m. For m = 0, φ is r=t, where r and t are closed
terms. If φ ∈ Ω, then r^𝔑 = t^𝔑. Let k be this common value. By Lemma 6.6,
r=s_k and t=s_k are in Π_0. But r=t is deducible from these two sentences,
and hence must be in Π_0 as well.
Now let φ be ∃yα, where α is a formula of the form ∃x_1...∃x_m(r=t)
having no free variable other than y. If φ ∈ Ω, then clearly for some n we
must have α(y/s_n) ∈ Ω. Therefore by the induction hypothesis α(y/s_n) ∈ Π_0.
But, by Thm. 3.1.11, α(y/s_n) ⊢ ∃yα. ∎

Lemma 6.7 means that if a diophantine equation is solvable (in natural
numbers) then this fact is deducible from the postulates of Π_0.
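Informally, the remark rests on the fact that solvability can be confirmed by exhibiting a solution. A minimal sketch of the corresponding search, assuming the two sides of the equation are given as Python functions (`find_witness` is an illustrative name):

```python
from itertools import count, product


def find_witness(lhs, rhs, m):
    """Search N^m for a tuple x with lhs(*x) == rhs(*x).

    Tuples are tried in order of their largest entry, so every tuple is
    eventually reached. The search halts iff the equation is solvable.
    """
    for bound in count():
        for x in product(range(bound + 1), repeat=m):
            if max(x, default=0) == bound and lhs(*x) == rhs(*x):
                return x


# Example: x^2 + y^2 = 25 is solvable, so the search terminates.
w = find_witness(lambda x, y: x * x + y * y, lambda x, y: 25, 2)
```

For an unsolvable equation the loop runs forever, which is why this gives only a semi-decision procedure.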

6.8. Theorem. For any given n-ary r.e. relation P, we can find a formula
of the form ∃v_{n+1}...∃v_{n+m}(r=t) which weakly represents P in every sound
theory that includes Π_0.
Proof. By the MRDP Thm. and Lemma 3.4 we can find a formula α of
the required form which represents P in Ω. Thus, for any x ∈ N^n, P(x)
holds iff α(s_x) ∈ Ω. But α(s_x) is a sentence of the form covered by Lemma 6.7.
Therefore, if Π_0 ⊆ Σ ⊆ Ω, we have α(s_x) ∈ Ω iff α(s_x) ∈ Σ. ∎

In §4 we remarked that if α represents in Ω the n-ary relation P, then for
each a ∈ N^n the sentence α(s_a) can be construed as formally asserting that
P(a) holds, since α(s_a) is true iff P(a) in fact holds. Thm. 6.8 tells us that
if P is r.e. then (for a properly chosen α) all true formal assertions of this
form, but no false ones, can already be deduced (in the predicate
calculus) from the postulates of baby arithmetic.
In particular, if f is any n-ary recursive function, then its graph is r.e. by
Thm. 6.11.3, hence represented in Ω by some formula α of the kind described
in Thm. 6.8. The sentence α(s_a,s_b) can then be regarded as formally asserting
that f(a) equals b. All true assertions of this kind, and no false ones,
are deducible in Π_0. This result was alluded to in our discussion of Church’s
thesis (§7 of Ch. 6).

6.9. Problem. Verify that the set of postulates of Π_0 (i.e., sentences
of the forms (6.1)-(6.4)) is recursive. Thus Π_0 is axiomatic.

6.10. Theorem. A relation is weakly representable in Π_0 iff it is recursively
enumerable.
Proof. Immediate from Prob. 6.9, Thm. 5.6 and Thm. 6.8. ∎

§ 7. Junior arithmetic

In this section we shall study a theory Π_1 in which all recursive relations
(and only they) are strongly representable.

7.1. Definition. For any terms r and t, r≤t is the formula ∃z(r+z=t),
where z is the first¹ variable that occurs neither in r nor in t.
The theory Π_1, which we shall call junior arithmetic, is based on
postulates (6.1)-(6.4) plus the following:

(7.2) s_n ≠ s_m,

(7.3) ∀v_1(v_1≤s_n ↔ v_1=s_0 ∨ ... ∨ v_1=s_n),

(7.4) ∀v_1(s_n≤v_1 ∨ v_1≤s_n),

for all n and all m ≠ n.

Obviously, Π_1 is an extension of Π_0; it is a proper extension because,
e.g., s_0 ≠ s_1 belongs to Π_1 but not to Π_0.
Also, since all the postulates of Π_1 are evidently true, Π_1 is sound, i.e.,
included in Ω.

7.5. Problem. Let *𝔑 be a structure with domain *N = N ∪ {∞}, where ∞
is some arbitrary object ∉ N; let *0 = 0 and let the operations *s, *+ and
*× of *𝔑 be the extensions of the ordinary successor, addition and
multiplication such that *s(∞) = 0, a *+ b = ∞ and a *× b = 0 whenever a = ∞ or
b = ∞ (or both). Show that *𝔑 is a model of Π_1. Hence show that the
sentence ∀v_1(sv_1 ≠ s_0) is not in Π_1, so that Π_1 is a proper sub-theory of Ω.
7.6. Problem. Exactly the same as Prob. 2.4, but with “Ω” replaced
by “Π_1”.

¹ If z is any variable not occurring in r or t, then ∃z(r+z=t) is a variant of r≤t. Below
we shall sometimes write “r≤t”, where, in fact, there should be some variant of r≤t.
This slight inaccuracy is harmless, since by 3.1.10 mutual variants are provably equivalent.

Below we shall write, e.g., “∃x≤r α” as short for ∃x(x≤r ∧ α).

7.7. Lemma. Let φ be a sentence of the form ∃x_1≤s_{n_1} ... ∃x_k≤s_{n_k} (r=t),
where k ≥ 0. Then φ ∈ Π_1 if φ ∈ Ω, and ¬φ ∈ Π_1 if ¬φ ∈ Ω.
Proof. By induction on k. For k = 0, φ is r=t, where r and t are closed
terms. If φ ∈ Ω then by Lemma 6.7 we have φ ∈ Π_0 ⊆ Π_1. If ¬φ ∈ Ω then
p ≠ q, where p = r^𝔑 and q = t^𝔑. By Lemma 6.6, the sentences r=s_p and t=s_q
are in Π_0, hence in Π_1. By (7.2) the sentence s_p ≠ s_q is in Π_1. But from
these three sentences the sentence r≠t (i.e., ¬φ) is deducible.
Now let φ = ∃y≤s_n α, where α is a formula of the form

∃x_1≤s_{n_1} ... ∃x_k≤s_{n_k} (r=t)

having no free variable other than y. First, suppose φ ∈ Ω. Then for some
m ≤ n we must have α(y/s_m) ∈ Ω. Hence by the induction hypothesis,

(1) α(y/s_m) ∈ Π_1.

Also, since m ≤ n, the sentence s_m≤s_n is deducible from (7.3), and hence
belongs to Π_1. Using this and (1) we have

s_m≤s_n ∧ α(y/s_m) ∈ Π_1.

From this last sentence we can deduce (see Thm. 3.1.11) ∃y(y≤s_n ∧ α),
i.e., φ.
Suppose, on the other hand, that ¬φ ∈ Ω. Then for every m ≤ n we must
have ¬α(y/s_m) ∈ Ω. Hence, by the induction hypothesis, ¬α(y/s_m) ∈ Π_1.
But from the sentences ¬α(y/s_m), where m = 0,...,n, we can deduce

¬∃y[(y=s_0 ∨ ... ∨ y=s_n) ∧ α]

and from this and (7.3) we can deduce ¬∃y(y≤s_n ∧ α), i.e., ¬φ. ∎
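Lemma 7.7 reflects the fact that a sentence whose quantifiers are all bounded by numerals can be settled by inspecting finitely many cases. A minimal sketch, assuming the terms r and t are supplied as Python functions of the bound variables (`decide_bounded` is an illustrative name):

```python
from itertools import product


def decide_bounded(bounds, lhs, rhs):
    """Truth value of  Ex_1<=n_1 ... Ex_k<=n_k (r = t)  by exhaustive search.

    bounds is the list [n_1, ..., n_k]; lhs and rhs compute r and t.
    The search always terminates because only finitely many tuples exist.
    """
    ranges = [range(n + 1) for n in bounds]
    return any(lhs(*x) == rhs(*x) for x in product(*ranges))
```

Unlike the unbounded search suggested by Lemma 6.7, this procedure also returns a definite answer when the sentence is false, matching the second half of the lemma.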

7.8. Lemma. Let Π_1 ⊢ α(x_1/s_{a_1},...,x_k/s_{a_k}). If m = max(a_1,...,a_k) and y is a
variable different from x_1,...,x_k, then

Π_1, ¬(y≤s_m) ⊢ ∃x_1≤y ... ∃x_k≤y α.

Proof. Using postulate (7.3) (with m instead of n) we easily verify

Π_1, ¬(y≤s_m) ⊢ y≠s_0 ∧ ... ∧ y≠s_m.

For each j = 1,...,k we have a_j ≤ m, so

Π_1, ¬(y≤s_m) ⊢ y≠s_0 ∧ ... ∧ y≠s_{a_j}.

Using postulate (7.3) for n = a_j we now get

Π_1, ¬(y≤s_m) ⊢ ¬(y≤s_{a_j}).

Hence, using postulate (7.4) with n = a_j, we have

Π_1, ¬(y≤s_m) ⊢ s_{a_j}≤y for j = 1,...,k.

But from α(x_1/s_{a_1},...,x_k/s_{a_k}) and the formulas s_{a_j}≤y (for j = 1,...,k) we can
deduce ∃x_1≤y ... ∃x_k≤y α. ∎

The following result is extremely important.

7.9. Lemma. Given two n-ary r.e. relations P and P', we can find formulas

(i) β = ∃v_{n+1}≤y ... ∃v_{n+k}≤y (r=t),

(ii) β' = ∃v_{n+1}≤y ... ∃v_{n+k}≤y (r'=t'),

(where y is v_{n+k+1}) such that the formula

(iii) γ = ∃y(β ∧ ¬β')

belongs to Φ_n, and for every a ∈ N^n we have

γ(s_a) ∈ Π_1 if P(a) ∧ ¬P'(a) holds,

¬γ(s_a) ∈ Π_1 if ¬P(a) ∧ P'(a) holds.

Proof. By the MRDP Thm. and Lemma 3.4 we can find formulas

α = ∃v_{n+1}...∃v_{n+k}(r=t),

α' = ∃v_{n+1}...∃v_{n+k}(r'=t'),

representing in Ω the relations P and P' respectively. (Without loss of
generality we have assumed that the number of bound variables in both
cases is the same. This can be arranged by introducing redundant quantifiers
as in Prob. 2.2.4.)
Let β, β' and γ be given by (i), (ii) and (iii). Since α and α' are in Φ_n,
we have γ ∈ Φ_n. We shall show that γ behaves in the required manner.
First, suppose that P(a) holds and P'(a) does not. Since α represents
P in Ω, we have α(s_a) ∈ Ω. Therefore there exist numbers a_{n+1},...,a_{n+k}
such that the sentence

r(s_a,s_{a_{n+1}},...,s_{a_{n+k}}) = t(s_a,s_{a_{n+1}},...,s_{a_{n+k}})

also belongs to Ω. From this it follows that if m = max(a_{n+1},...,a_{n+k}) then
β(s_a)(y/s_m) ∈ Ω. But this sentence is of the kind considered in Lemma 7.7,
so we have

(1) β(s_a)(y/s_m) ∈ Π_1.

On the other hand, if p is any number, then β'(s_a)(y/s_p) cannot be in Ω;
for if it were, the sentence α'(s_a), which is logically entailed by it, would
also be in Ω, but α' represents P' in Ω and P'(a) does not hold. Thus for
all p we have ¬β'(s_a)(y/s_p) ∈ Ω and hence, by Lemma 7.7,

(2) ¬β'(s_a)(y/s_p) ∈ Π_1 for all p.

Using (2) for p = m and (1) we see that the sentence

β(s_a)(y/s_m) ∧ ¬β'(s_a)(y/s_m)

belongs to Π_1. But from this sentence we can deduce

∃y[β(s_a) ∧ ¬β'(s_a)],

which is γ(s_a).
Now suppose that P(a) does not hold but P'(a) does. Since α' represents
P' in Ω, we have α'(s_a) ∈ Ω. It follows that there are numbers a_{n+1},...,a_{n+k}
such that the sentence

r'(s_a,s_{a_{n+1}},...,s_{a_{n+k}}) = t'(s_a,s_{a_{n+1}},...,s_{a_{n+k}})

belongs to Ω. By Lemma 6.7, this sentence must belong to Π_0 and hence
to Π_1. Taking m = max(a_{n+1},...,a_{n+k}) and using Lemma 7.8 we therefore
obtain

(3) Π_1, ¬(y≤s_m) ⊢ β'(s_a).

Also, using the method by which we got (2), we now get

¬β(s_a)(y/s_p) ∈ Π_1 for all p.

Using this fact for p = 0,...,m together with postulate (7.3) for n = m, we have

Π_1, y≤s_m ⊢ ¬β(s_a).

From this and (3) we easily get

Π_1 ⊢ ¬β(s_a) ∨ β'(s_a).

Generalizing on y (see 3.1.5) we obtain

Π_1 ⊢ ∀y[¬β(s_a) ∨ β'(s_a)].

But this last sentence is provably equivalent to ¬γ(s_a). ∎

7.10. Theorem. Given a recursive relation R, we can find a formula γ
(of the form described in Lemma 7.9) such that γ strongly represents R
in any theory that includes Π_1.
Proof. In Lemma 7.9, take P and P' to be R and ¬R respectively. Then
the lemma shows that γ strongly represents R in Π_1, and hence in any
theory that includes Π_1. ∎
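The formula γ of Lemma 7.9 says that below some bound y there is a witness for P but no witness for P'. Read computationally, with P = R and P' = ¬R, this suggests the following search. The sketch assumes that candidate witnesses are k-tuples of numbers and that the two checker functions are supplied by the caller; all names are illustrative.

```python
from itertools import count, product


def decide(a, witness_R, witness_notR, k):
    """Decide R(a) from checkers for bounded witnesses of R and of not-R.

    witness_R(a, w) and witness_notR(a, w) test a candidate k-tuple w.
    Assumption: for each a exactly one of R(a), not-R(a) has a witness,
    so the loop terminates.
    """
    for y in count():
        cands = list(product(range(y + 1), repeat=k))
        beta = any(witness_R(a, w) for w in cands)           # witness for R below y
        beta_prime = any(witness_notR(a, w) for w in cands)  # witness for not-R below y
        if beta and not beta_prime:
            return True
        if beta_prime:
            return False


# Example: R(a) is "a is even"; a witness for R is w with 2w = a,
# a witness for not-R is w with 2w + 1 = a.
is_even = lambda a: decide(a,
                           lambda a, w: 2 * w[0] == a,
                           lambda a, w: 2 * w[0] + 1 == a,
                           1)
```

The use of a common bound y for both searches is what allows the positive and the negative case to be handled by one formula.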

7.11. Remark. Our proof that every recursive relation is strongly
representable in Π_1 employed the characterization of r.e. relations (hence also of
recursive relations) provided by the MRDP Theorem. But the same fact
can also be proved using, e.g., an older characterization of r.e. relations
due to M. Davis and R. Robinson (see §17 of Ch. 6) or even a more direct
characterization of recursive relations.

7.12. Problem. Verify that the set of postulates of Π_1 is recursive, so
that Π_1 is axiomatic. Hence show that a relation is weakly representable
in Π_1 iff it is r.e., and strongly representable in Π_1 iff it is recursive.
7.13. Problem. Let Σ be a theory such that Π_1 ⊆ Σ.
(i) Show that every total recursive function is numeralwise representable
in Σ. (If α strongly represents the graph of an n-ary function f, prove
that the formula

∀y≤v_{n+1}[α(v_{n+1}/y) ↔ y=v_{n+1}],

where y is, e.g., v_{n+2}, numeralwise represents f.)
(ii) Show that, if Σ is consistent, there cannot exist a truth definition
inside Σ. (Use Prob. 4.10.)

§ 8. A finitely axiomatized theory

Whereas Π_0 and Π_1 were based on infinitely many postulates, the theory
Π_2, which we study in this section, is based on the following nine postulates:

(8.1) ∀v_1(sv_1 ≠ s_0),

(8.2) ∀v_1∀v_2(sv_1=sv_2 → v_1=v_2),

(8.3) ∀v_1(v_1+s_0=v_1),

(8.4) ∀v_1∀v_2[v_1+sv_2=s(v_1+v_2)],

(8.5) ∀v_1(v_1×s_0=s_0),

(8.6) ∀v_1∀v_2(v_1×sv_2=v_1×v_2+v_1),

(8.7) ∀v_1(v_1≤s_0 → v_1=s_0),

(8.8) ∀v_1∀v_2(v_1≤sv_2 → v_1≤v_2 ∨ v_1=sv_2),

(8.9) ∀v_1∀v_2(v_1≤v_2 ∨ v_2≤v_1).

8.10. Remark. The set of postulates of Π_2, being finite, is certainly
recursive (cf. Thm. 6.11.9) so that Π_2 is axiomatic. We say that it is finitely
axiomatized because the set of its postulates is finite. Note also that Π_2 is
clearly sound.

8.11. Theorem. Π_1 ⊆ Π_2.

Proof. We show that the postulates of Π_1 belong to Π_2.
Postulates (6.1)-(6.4) are clearly deducible from (8.3)-(8.6) respectively.
Postulates (7.2) are s_n ≠ s_m, where n ≠ m. Suppose first that n > m. We
proceed by induction on m. Since n > m, the numeral s_n is ss_k, where k ≥ m.
For m = 0, the sentence ss_k ≠ s_m is deducible from (8.1). For the induction
step, let m = p+1. Then k > p, so by the induction hypothesis the sentence
s_k ≠ s_p belongs to Π_2. But from this sentence and (8.2) we deduce ss_k ≠ ss_p,
i.e., s_n ≠ s_m, as required.
If n < m then by what we have just proved the sentence s_m ≠ s_n belongs
to Π_2. But from this we deduce s_n ≠ s_m.
Next, consider (7.3). We proceed by induction on n. For n = 0, we start
from (8.3) and deduce s_0+s_0=s_0, hence ∃v_1(s_0+v_1=s_0), which by Def. 7.1
is s_0≤s_0, hence

∀v_1(v_1=s_0 → v_1≤s_0).

From this last sentence and (8.7) we deduce

∀v_1(v_1≤s_0 ↔ v_1=s_0),

which is (7.3) for n = 0.

Now suppose that for some n the sentence (7.3) belongs to Π_2. From
this sentence we deduce

(1) ∀v_1(v_1≤s_n ∨ v_1=s_{n+1} ↔ v_1=s_0 ∨ ... ∨ v_1=s_n ∨ v_1=s_{n+1}).

From (8.8) we deduce

(2) ∀v_1(v_1≤s_{n+1} → v_1≤s_n ∨ v_1=s_{n+1}).

Next, from (8.4) we deduce

∀v_1∀v_2(v_1+v_2=s_n → v_1+sv_2=s_{n+1})

and hence

∀v_1[∃v_2(v_1+v_2=s_n) → ∃v_2(v_1+v_2=s_{n+1})].

But by Def. 7.1 this sentence is

(3) ∀v_1(v_1≤s_n → v_1≤s_{n+1}).

Also, from (8.3) we deduce s_{n+1}+s_0=s_{n+1}, hence

∃v_1(s_{n+1}+v_1=s_{n+1}),

which is the same as s_{n+1}≤s_{n+1}. But from this we deduce

∀v_1(v_1=s_{n+1} → v_1≤s_{n+1}).

Using this sentence together with (2) and (3) we deduce

∀v_1(v_1≤s_{n+1} ↔ v_1≤s_n ∨ v_1=s_{n+1}).

From this and (1) we deduce

∀v_1(v_1≤s_{n+1} ↔ v_1=s_0 ∨ ... ∨ v_1=s_n ∨ v_1=s_{n+1}),

which is (7.3) for n+1.

Finally, (7.4) is evidently deducible from (8.9). ∎

8.12. Remark. Π_1 is a proper sub-theory of Π_2 because, e.g., postulate
(8.1) does not belong to Π_1 (see Prob. 7.5).

8.13. Problem. Let *𝔑 be a structure like that of Prob. 7.5, except that
*s(∞) = ∞, a *× b = ∞ whenever a = ∞ but b ≠ 0, or b = ∞ and a is arbitrary.
(As in Prob. 7.5, ∞ *× 0 = 0, and a *+ b = ∞ whenever a = ∞ or b = ∞.)
Verify that *𝔑 is a model for Π_2. Hence show that the sentence ∀v_1(sv_1 ≠ v_1)
is not in Π_2.
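The structure of Prob. 8.13 is simple enough to experiment with. The sketch below is only a finite spot check on a sample of individuals, not a proof: it encodes the extra object as the string `"inf"` (an arbitrary choice) and evaluates r≤t through Def. 7.1, i.e., by looking for z with r+z=t.

```python
INF = "inf"  # stands for the extra object added to N


def s(a):
    return INF if a == INF else a + 1


def add(a, b):
    return INF if INF in (a, b) else a + b


def mul(a, b):
    if b == INF:
        return INF                      # b is the extra object, a arbitrary
    if a == INF:
        return 0 if b == 0 else INF     # inf * 0 = 0, otherwise inf
    return a * b


def leq(a, b):
    """a <= b in the sense of Def. 7.1: some z with a + z = b.

    If b is finite, only z in 0..b or z = inf can possibly work.
    """
    zs = [INF] if b == INF else list(range(b + 1)) + [INF]
    return any(add(a, z) == b for z in zs)


def check_pi2(sample):
    """Check postulates (8.1)-(8.9) for all individuals in the sample."""
    for a in sample:
        assert s(a) != 0                                          # (8.1)
        assert add(a, 0) == a                                     # (8.3)
        assert mul(a, 0) == 0                                     # (8.5)
        assert not leq(a, 0) or a == 0                            # (8.7)
        for b in sample:
            assert s(a) != s(b) or a == b                         # (8.2)
            assert add(a, s(b)) == s(add(a, b))                   # (8.4)
            assert mul(a, s(b)) == add(mul(a, b), a)              # (8.6)
            assert not leq(a, s(b)) or leq(a, b) or a == s(b)     # (8.8)
            assert leq(a, b) or leq(b, a)                         # (8.9)
    return True
```

On this structure `s(INF) == INF`, so the sentence ∀v_1(sv_1 ≠ v_1) fails there even though the sampled instances of the nine postulates hold.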

§ 9. First-order Peano arithmetic

The theory Π which we shall now begin to study has as its postulates the
six sentences (8.1)-(8.6) and all sentences of the form

(9.1) ∀v_2...∀v_k[α(s_0) → ∀v_1{α → α(sv_1)} → ∀v_1α],

where k ≥ 1 and α ∈ Φ_k. Postulates (9.1) are called induction postulates.

The postulates of Π were obviously obtained by trying to formalize
in ℒ the seven Peano postulates listed in §2. For this reason Π is usually
called first-order Peano arithmetic. (It must be stressed however that what
Peano was aiming at was not Π but, essentially, Q' of §2.)

Clearly, Π is sound. Also, the induction postulates make it a very
powerful theory: no one has yet found a true sentence which expresses
a fact of practical number-theoretic interest (i.e., of interest to working
number-theorists) and which does not belong to Π. (As we shall see later,
there are methods for constructing true sentences that do not belong to Π.
But these methods yield sentences expressing facts that are of purely logical
interest.)

9.2. Problem. Verify that the set of postulates of Π is recursive; hence
Π is axiomatic.

We shall now present a few examples of sentences belonging to Π.
First, we show that

(9.3) ∀v_1[∃v_2(v_1=sv_2) ∨ v_1=s_0] ∈ Π.

We let α be the formula inside the square brackets in (9.3). Then it is easy
to see that ⊢ α(sv_1) and hence

⊢ ∀v_1[α → α(sv_1)].

Also, we clearly have ⊢ α(s_0). Hence, using postulate (9.1), we get (9.3).
From (9.3) and Def. 7.1 we have

∀v_1∀v_2[v_1≤v_2 ↔ ∃v_3(v_1+sv_3=v_2) ∨ v_1+s_0=v_2] ∈ Π.

Hence, using postulates (8.3) and (8.4),

(9.4) ∀v_1∀v_2[v_1≤v_2 ↔ ∃v_3{s(v_1+v_3)=v_2} ∨ v_1=v_2] ∈ Π.

Using postulate (8.1) we now get

(9.5) ∀v_1(v_1≤s_0 ↔ v_1=s_0) ∈ Π.

Next, we shall show that

(9.6) ∀v_1(s_0≤v_1) ∈ Π.

We observe that (s_0≤s_0) ∈ Π by (9.5). Also, using postulate (8.4) we have

∀v_1∀v_2(s_0+v_2=v_1 → s_0+sv_2=sv_1) ∈ Π,

hence

∀v_1(s_0≤v_1 → s_0≤sv_1) ∈ Π.

Using postulate (9.1) with the formula s_0≤v_1 as α, we obtain (9.6).

9.7. Problem. Show that

∀v_1∀v_2[sv_2+v_1=s(v_2+v_1)] ∈ Π.

9.8. Theorem. Π_2 ⊆ Π.

Proof. We show that the three sentences (8.7)-(8.9) belong to Π.
For (8.7), this follows at once from (9.5). To deal with (8.8), we observe
that by (9.4)

∀v_1∀v_2[v_1≤sv_2 ↔ ∃v_3{s(v_1+v_3)=sv_2} ∨ v_1=sv_2] ∈ Π,

hence, using postulate (8.2) and Def. 7.1,

(9.9) ∀v_1∀v_2[v_1≤sv_2 ↔ v_1≤v_2 ∨ v_1=sv_2] ∈ Π.

It follows at once that (8.8) belongs to Π.

To deal with (8.9), we start by observing that

(9.10) ∀v_2(s_0≤v_2 ∨ v_2≤s_0) ∈ Π,

because ∀v_2(s_0≤v_2) ∈ Π by (9.6). Next, from postulates (8.3) and (8.4) we
deduce ∀v_1(v_1+ss_0=sv_1) and hence

∀v_1(v_1≤sv_1).

Using this and Prob. 9.7, we get, from (9.4),

(9.11) ∀v_1∀v_2[v_1≤v_2 → sv_1≤v_2 ∨ v_2≤sv_1] ∈ Π.

Also, from (9.9) we have

∀v_1∀v_2(v_2≤v_1 → v_2≤sv_1) ∈ Π.

Combining this with (9.11) we get

(9.12) ∀v_1∀v_2(v_1≤v_2 ∨ v_2≤v_1 → sv_1≤v_2 ∨ v_2≤sv_1) ∈ Π.

Using postulate (9.1) we get, from (9.10) and (9.12),

∀v_1∀v_2(v_1≤v_2 ∨ v_2≤v_1) ∈ Π,

as required. ∎

9.13. Problem. Show that ∀v_1(sv_1 ≠ v_1) ∈ Π; hence (by Prob. 8.13) Π is
a proper extension of Π_2.

9.14. Problem. Let *𝔑 be any model of Π. Since Π_1 ⊆ Π, Prob. 7.6
shows that there is a unique embedding f of 𝔑 into *𝔑. Thus without
loss of generality we may identify f(n) with n for all n ∈ N and regard *𝔑
as an extension of 𝔑 (i.e., we may assume N ⊆ *N and *s(n) = n+1,
n *+ m = n+m, and n *× m = nm for all n,m ∈ N.) For a,b ∈ *N we define:
a *≤ b if, for some c ∈ *N, a *+ c = b; also, a ≡ b if, for some n ∈ N,
a *+ n = b or b *+ n = a.
(i) Show that *≤ is a total ordering on *N.
(ii) Show that ≡ is an equivalence relation on *N.
(iii) Call any of the ≡-classes into which *N is partitioned by ≡ a block.
Show that N constitutes a block. Also show that any other block (if there
are any) has, under the ordering *≤, the same order type as the integers.
(iv) For any blocks A and B, define A *≤ B if a *≤ b for all a ∈ A and
b ∈ B. Show that *≤ is a total ordering on the set of all blocks.
(v) Show that if *𝔑 is nonstandard then the ordering of the blocks is
dense (i.e., between any two different blocks there is a third). Show also
that N is the first block, but there is no last block.
(Each part of this problem is solved by verifying that Π contains certain
sentences. Thus, e.g., in (i) one shows that *≤ is transitive by verifying
that the associative law of addition

∀v_1∀v_2∀v_3[(v_1+v_2)+v_3=v_1+(v_2+v_3)]

is in Π.)
Note that all these results hold a fortiori if *𝔑 is a model of Ω.
9.15. Problem. Let *𝔑 be a nonstandard model of Π (i.e., a model of
Π that is not isomorphic with 𝔑). As in Prob. 9.14, we regard *𝔑 as an
extension of 𝔑.
(i) Show that for every φ ∈ Φ_{n+1} and all a_1,...,a_n ∈ *N we have

*N − N ≠ {a ∈ *N : *𝔑 ⊨ φ[a_1,...,a_n,a]}.

(ii) Prove the so-called “overspill lemma”: if φ ∈ Φ_{n+1} and a_1,...,a_n ∈ *N
and there are infinitely many a ∈ N such that *𝔑 ⊨ φ[a_1,...,a_n,a], then there
is some b ∈ *N − N such that *𝔑 ⊨ φ[a_1,...,a_n,b].

9.16. Problem. Write r<t for sr≤t. Let γ be the formula

∃v_2∃v_3(v_2<v_1 ∧ v_3<v_1 ∧ v_1=v_2×v_3).

(γ(x) stands for “x is composite”.) Let *𝔑 be any model of Π. Prove
that *𝔑 is nonstandard iff there is an infinite sequence a_0, a_1, a_2,... of
members of *N such that for all n we have a_{n+1} = *s(a_n) and *𝔑 ⊨ γ[a_n].
(If *𝔑 is nonstandard and contains no such sequence, let φ be the formula

∀v_2∃v_3[v_2<v_3 ∧ v_3<v_1+v_2 ∧ ¬γ(v_3)]

and show that

*N − N = {a ∈ *N : *𝔑 ⊨ φ[a]},

contradicting (i) of Prob. 9.15.)
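As background intuition for Prob. 9.16 (not a solution), recall that the standard numbers contain arbitrarily long finite runs of consecutive composites, although no infinite one. A small sketch, using the classical factorial construction; `is_composite` follows the formula γ, and both function names are illustrative.

```python
from math import factorial


def is_composite(x):
    """gamma(x): x = y * z for some y, z < x (so 0, 1 and the primes fail)."""
    return any(x % y == 0 for y in range(2, x))


def composite_run(k):
    """Return k consecutive composites: (k+1)! + 2, ..., (k+1)! + (k+1).

    (k+1)! + j is divisible by j for 2 <= j <= k+1 and is larger than j.
    """
    base = factorial(k + 1)
    return [base + j for j in range(2, k + 2)]
```

In a nonstandard model, runs of every finite length are available, which is the kind of situation where an overspill argument in the spirit of Prob. 9.15 applies.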


*9.17. Problem. Let 𝔄 be any ℒ-structure. A subset X ⊆ A is said to be
definable if there is a formula φ ∈ Φ_1 such that

X = {a ∈ A : 𝔄 ⊨ φ[a]}.

An individual a ∈ A is definable if {a} is a definable subset of A. Let Df(𝔄)
be the set of all definable members of A, and put

𝔇𝔣(𝔄) = 𝔄|Df(𝔄).

If 𝔄 = 𝔇𝔣(𝔄) then 𝔄 is said to be pointwise definable. (Observe that the
standard structure 𝔑 is pointwise definable.)
(i) Prove that if each non-empty definable subset of A has a definable
member then 𝔇𝔣(𝔄) ≺ 𝔄.
(ii) Show that if 𝔄 is a model of Π then each non-empty definable
subset of A has a definable member.
(iii) Show that if 𝔇𝔣(𝔄) ≺ 𝔄 then 𝔇𝔣(𝔄) is pointwise definable.
(iv) Let 𝔄 and 𝔄' be elementarily equivalent pointwise definable
ℒ-structures. Prove that they are isomorphic. (For each a ∈ A there is
φ ∈ Φ_1 such that a is the unique individual satisfying φ in 𝔄. Since 𝔄 ≡ 𝔄',
there is a unique individual a' satisfying φ in 𝔄'. Show that the mapping f
defined by f(a) = a' is an isomorphism of 𝔄 onto 𝔄'.)
(v) Use (iv) to show that, up to isomorphism, 𝔑 is the only pointwise
definable model of Ω.
*9.18. Problem. Let Σ be a complete consistent theory such that Π ⊆ Σ.
(i) Let 𝔄 be a model of Σ. Show that 𝔇𝔣(𝔄) is a prime (Prob. 5.6.14)
model of Σ. (Like Prob. 9.17(iv).) In particular, 𝔑 is a prime model of Ω.
(ii) Show that Σ has no (finitely) saturated (Prob. 5.6.15) model. (Let
x|y be a formula expressing “x divides y”. Let P be the set of all prime
numbers and for each X ⊆ P let

Δ_X = {s_p|v_1 : p ∈ X} ∪ {¬(s_p|v_1) : p ∈ P − X}.

In B_1(Σ) (Ch. 5, §6) show that {|φ| : φ ∈ Δ_X} can be extended to an
ultrafilter U_X such that, if X ≠ X', then U_X ≠ U_{X'}. Now apply Prob. 5.6.15(vii).)
(iii) Show that Σ has no universal (Prob. 5.6.16) model. (Use (ii) and
Prob. 5.6.16.)

§10. Undecidability

Let Σ be a set of sentences. The decision problem for Σ is the problem of
finding an algorithm whereby, for each sentence φ, one could determine
whether φ ∈ Σ or not. If such an algorithm can be found, the decision
problem for Σ is solvable and Σ itself is decidable.
Since the processes of coding and decoding (i.e., the transition from an
expression to its code number and vice versa) are obviously effective, the
decision problem for Σ is equivalent to the decision problem for the
property T_Σ, as described in Ch. 6 (see discussion preceding Ex. 6.10.7). Thus,
by Church’s Thesis, Σ is decidable iff T_Σ is recursive.
Without committing our technical terminology to Church’s Thesis, we
shall say that the decision problem for Σ is recursively solvable and Σ is
recursively decidable if T_Σ is recursive. If T_Σ is not recursive, we say that
the decision problem for Σ is recursively unsolvable and Σ is recursively
undecidable.

From Tarski’s Theorem 4.7 and Cor. 3.5 it follows at once that Ω is
recursively undecidable. The recursive undecidability of Ω as well as the
four theories Π_0, Π_1, Π_2 and Π also follows from:

10.1. Theorem. If Σ is a theory in which every recursive property is weakly
representable, then Σ is recursively undecidable.
Proof. Similar to that of Thm. 4.7. Suppose T_Σ were recursive. Then the
property

λx[¬T_Σ(d(x))]

would also be recursive, because the diagonal function d is recursive (see
Ex. 4.5). Therefore this property would be weakly represented in Σ by
some α ∈ Φ_1. Thus for each x ∈ N we must have

α(s_x) ∈ Σ iff ¬T_Σ(d(x)) holds.

In particular, giving x the value #α,

α(s_{#α}) ∈ Σ iff ¬T_Σ(d(#α)) holds.

But by (4.6) and Def. 5.1 we actually have

¬T_Σ(d(#α)) holds iff α(s_{#α}) ∉ Σ.

This contradiction shows that T_Σ cannot be recursive. ∎

10.2. Corollary. If Σ is a sound theory that includes Π_0 (i.e., Π_0 ⊆ Σ ⊆ Ω),
then Σ is recursively undecidable.
Proof. By Thm. 6.8, every r.e. relation (hence certainly every recursive
property) is weakly representable in Σ. Thus, by Thm. 10.1, Σ is
recursively undecidable. ∎

10.3. Corollary. If Σ is a consistent theory in which every recursive
property is strongly representable, then Σ is recursively undecidable.
Proof. We have already noted (following Def. 3.1) that if α strongly
represents P in a consistent theory Σ, then α also weakly represents P in Σ. ∎

10.4. Corollary. If Σ is a consistent theory and Π_1 ⊆ Σ, then Σ is recursively
undecidable.
Proof. Immediate from Thm. 7.10 and Cor. 10.3. ∎

The recursive undecidability of Π_0, Π_1, Π_2, Π and Ω follows at once
from Cor. 10.2. The same results (except for Π_0) follow also from Cor. 10.4.
We can use Cor. 10.4 (but not Cor. 10.2) to prove the recursive
undecidability of consistent but unsound theories.

10.5. Theorem. Let Σ be a theory such that Σ ∪ Π₂ is consistent. Then
Σ is recursively undecidable.
Proof. Put

Δ = {φ : φ ∈ Φ₀, Σ ∪ Π₂ ⊢ φ}.

Let π be the conjunction of the nine postulates (8.1)-(8.9) of Π₂. It is
clear that for any sentence φ we have φ ∈ Δ iff (π→φ) ∈ Σ. Thus

T_Δ(x) = T_Σ(64 · 3^(#π) · 5^x).

From this we see that if T_Σ were recursive then T_Δ would also be recursive.
But Δ is a consistent theory and an extension of Π₂, hence also of Π₁.
So, by Cor. 10.4, T_Δ is not recursive. ∎
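
The passage from T_Σ to the characteristic property of the auxiliary theory in the proof above is plain arithmetic on code numbers, and can be sketched as follows. The sketch assumes (as the constant 64 = 2^6 in the displayed identity suggests) that an implication α→β receives the code 2^6 · 3^(#α) · 5^(#β). The names implication_code, make_t_delta and in_sigma are hypothetical, introduced only for illustration, and the "theory" used at the end is a made-up finite set of numbers.

```python
# Hypothetical illustration: reduce membership in the auxiliary theory
# to membership in Sigma.
# Assumption: the code of (alpha -> beta) is 2**6 * 3**code(alpha) * 5**code(beta),
# which matches the constant 64 in the displayed identity.

def implication_code(code_alpha: int, code_beta: int) -> int:
    """Code number assumed for the sentence (alpha -> beta)."""
    return 2**6 * 3**code_alpha * 5**code_beta


def make_t_delta(in_sigma, code_pi: int):
    """Build the membership test x |-> T_Sigma(64 * 3**code_pi * 5**x)."""
    def t_delta(x: int) -> bool:
        return in_sigma(implication_code(code_pi, x))
    return t_delta


# Toy check with a made-up "theory": Sigma contains exactly the codes listed.
toy_code_pi = 2
toy_sigma = {implication_code(toy_code_pi, 1), implication_code(toy_code_pi, 4)}
t_delta = make_t_delta(lambda n: n in toy_sigma, toy_code_pi)
```

If in_sigma were computed by an algorithm, t_delta would be too; this is the only point of the sketch.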

We can now strengthen Cor. 10.2:

10.6. Corollary. Every sound theory is recursively undecidable.
Proof. If Σ is sound, then Σ ∪ Π₂ is also sound, hence consistent. So, by
Thm. 10.5, Σ is recursively undecidable. ∎

As an application of Cor. 10.6 we prove:

10.7. Church’s Theorem. Let

Λ = {φ : φ ∈ Φ₀, ⊢ φ}.

Then Λ is recursively undecidable.
Proof. Immediate from Cor. 10.6, since Λ is clearly a sound theory. ∎

10.8. Remark. Clearly, Λ is an axiomatic theory (it is based on the empty
set of postulates). Thus, by Thm. 5.4, T_Λ is recursively enumerable. Hence
there is an effective procedure for generating, one by one, all sentences
of Λ, i.e., all logically valid sentences of ℒ. Such a procedure was actually
described in §5 of Ch. 3. On the other hand, the fact that we have not
been able to devise an algorithm for deciding whether any given sentence
is logically valid or not is not accidental. For, by Church’s Theorem and
Church’s Thesis, such an algorithm does not exist (at any rate not for the
particular language ℒ of this chapter).
The result of Thm. 10.7 is often expressed by saying that the predicate
calculus is undecidable.
In the remaining part of this section we shall consider not only ℒ (the
first-order language of arithmetic) but other first-order languages as well.
If ℒ′ is any first-order language, then by an ℒ′-theory we mean a set of
ℒ′-sentences closed under first-order deducibility. The decision problem
for a set of ℒ′-sentences is formulated in the same way as in the case of ℒ.
Thus, it makes sense to say that such a set is decidable or undecidable.
For i = 1, 2 let ℒᵢ be a first-order language and let Σᵢ be a set of
ℒᵢ-sentences. By a reduction of Σ₁ to Σ₂ we mean an effective (i.e.,
algorithmic) mapping f of the set of all ℒ₁-sentences into the set of all
ℒ₂-sentences such that for each ℒ₁-sentence φ we have φ ∈ Σ₁ iff f(φ) ∈ Σ₂.
If such a reduction exists, then Σ₁ is reducible to Σ₂.
It is clear that if Σ₁ is undecidable and reducible to Σ₂, then Σ₂ is
undecidable as well.
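
The last remark is the contrapositive of a simple composition: a decision procedure for the second set, composed with a reduction, decides the first set. A minimal sketch follows. The function names are hypothetical, and the toy "sentences" are decimal numerals rather than sentences of any first-order language.

```python
from typing import Callable


def decide_via_reduction(
    reduction: Callable[[str], str],
    decide_sigma2: Callable[[str], bool],
) -> Callable[[str], bool]:
    """Turn a decision procedure for Sigma_2 into one for Sigma_1."""
    def decide_sigma1(sentence: str) -> bool:
        # phi is in Sigma_1 iff reduction(phi) is in Sigma_2.
        return decide_sigma2(reduction(sentence))
    return decide_sigma1


# Toy stand-ins (not real theories): Sigma_2 = multiples of 6,
# Sigma_1 = multiples of 3, and the reduction doubles its argument.
def toy_reduction(s: str) -> str:
    return str(2 * int(s))


def toy_decide_sigma2(s: str) -> bool:
    return int(s) % 6 == 0


toy_decide_sigma1 = decide_via_reduction(toy_reduction, toy_decide_sigma2)
```

Read in the other direction: if no algorithm decides the first set, none can decide the second.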
For the rest of this section, we shall assume the correctness of Church’s
Thesis. Thus the ℒ-theories which we have proved to be recursively
undecidable will now be taken to be undecidable. With this as our point of
departure, we shall use various reductions to show that certain theories
(in other languages) are also undecidable.

10.9. Example. Let ℒ₁ be the language obtained from ℒ by omitting the
unary function symbol s and adding a new individual constant 1. If φ is
an ℒ-sentence and a term of the form sr occurs in φ, replace that term by
r+1, and continue this process until all occurrences of s have been
eliminated. Call the resulting ℒ₁-sentence φ*. We claim that if ⊨ φ then also
⊨ φ*. Indeed, let 𝔘₁ be an ℒ₁-structure in which φ* does not hold. If
+ and 1 are the binary operation and individual of 𝔘₁ corresponding to
+ and 1, we let s(a) = a+1 for every a in the domain of 𝔘₁. By taking
s as the operation corresponding to s, we obtain an ℒ-structure in which
φ fails to hold.
Let Σ₁ be any ℒ₁-theory. Put

Σ = {φ : φ ∈ Φ₀, φ* ∈ Σ₁}.

Clearly, * is a reduction of Σ to Σ₁. We claim that Σ is an ℒ-theory. Indeed,
if φ ∈ Φ₀ and Σ ⊢ φ, then for some φ₁,...,φₖ ∈ Σ we must have

⊢ φ₁→...→φₖ→φ.

Hence, by what we have shown, also

⊢ φ₁*→...→φₖ*→φ*.

But φ₁*,...,φₖ* ∈ Σ₁, and Σ₁ is an ℒ₁-theory. Hence φ* ∈ Σ₁, so that φ ∈ Σ.

Next, let 𝔑₁ be the ℒ₁-structure with domain N and with 0, 1, + and ×
interpreted in the obvious way, as 0, 1, addition and multiplication,
respectively.
For any ℒ-sentence φ we clearly have 𝔑 ⊨ φ iff 𝔑₁ ⊨ φ*.

If the ℒ₁-theory Σ₁ has 𝔑₁ as a model, it now follows that the ℒ-theory
Σ defined above is sound (i.e., has 𝔑 as a model). By Cor. 10.6 and Church’s
Thesis, Σ is undecidable. Since * is a reduction of Σ to Σ₁, it follows that
Σ₁ is undecidable.
Thus we have shown that every ℒ₁-theory that has 𝔑₁ as a model is
undecidable. In particular if Λ₁ is the set of all logically valid ℒ₁-sentences,
then Λ₁ is undecidable.

10.10. Problem. Let Σ₁ and Σ₂ be ℒ′-theories. We say that Σ₂ is a finite
extension of Σ₁ if there is a finite set of ℒ′-sentences {φ₁,...,φₖ} such that
Σ₁ ∪ {φ₁,...,φₖ} is a set of postulates for Σ₂.
Show that in this case the mapping *, where ψ* = φ₁→...→φₖ→ψ, is
a reduction of Σ₂ to Σ₁. Hence if Σ₂ is undecidable then so is Σ₁.

10.11. Problem. Let ℒ₁ and 𝔑₁ be as in Ex. 10.9. Let ℨ be the structure
of integers, i.e., the ℒ₁-structure whose domain is the set of all integers
and with the symbols 0, 1, + and × interpreted in the obvious way. Let
φ be the formula

∃u∃v∃x∃y(z = u×u + v×v + x×x + y×y),

where u,v,x,y,z are distinct variables.
For any sentence α, let α* be the relativization of α to φ, with z as chosen
variable (cf. §12 of Ch. 2).
(i) Verify that for any ℒ₁-sentence α we have 𝔑₁ ⊨ α iff ℨ ⊨ α*. (Use
Lagrange’s theorem that every natural number is the sum of four squares.)
(ii) Let Σ be an ℒ₁-theory such that ℨ is a model for Σ and the sentences
∃zφ, φ{0}, φ{1}, φ{+} and φ{×} belong to Σ. (For the definition of the
latter four sentences, see §12 of Ch. 2.) Let Σ′ be the set of all ℒ₁-sentences
α such that α* ∈ Σ. Use Cor. 2.12.4 to show that Σ′ is an ℒ₁-theory. Hence
use (i) and Ex. 10.9 to prove that Σ is undecidable.
(iii) Let Σ be any ℒ₁-theory that has ℨ as a model¹. Show that Σ is
undecidable. (Use (ii) and Prob. 10.10.)
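
Part (i) of the problem above rests on the fact that the four-squares formula picks out exactly the natural numbers among the integers. A brute-force sketch of this fact follows. The function names are hypothetical, and the search is for illustration only, not an efficient algorithm.

```python
from itertools import combinations_with_replacement
from math import isqrt


def four_squares(n: int):
    """Return (u, v, x, y) with u*u + v*v + x*x + y*y == n, or None."""
    if n < 0:
        return None  # a sum of four squares is never negative
    bound = isqrt(n)
    # Squares ignore sign and order, so unordered non-negative tuples suffice.
    for u, v, x, y in combinations_with_replacement(range(bound + 1), 4):
        if u * u + v * v + x * x + y * y == n:
            return (u, v, x, y)
    return None  # not reached for n >= 0, by Lagrange's theorem


def satisfies_phi(z: int) -> bool:
    """True iff the integer z satisfies the four-squares formula."""
    return four_squares(z) is not None
```

Thus satisfies_phi holds of every non-negative integer and of no negative one, which is what the relativization needs.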
¹ In particular, Σ can be taken, e.g., as the set of all ℒ₁-sentences which
hold in all rings with identity.

10.12. Problem. Let ℒ₁ and ℨ be as above. Let ℒ₂ be the language obtained
from ℒ₁ by omitting the constants 0 and 1. Let ℨ₂ be the ℒ₂-structure
which is the same as ℨ except that it has no designated individuals. Let
α and β be the formulas

∀v₂(v₂+v₁=v₂), ∀v₂(v₂×v₁=v₂),

respectively. For any ℒ₁-sentence φ, let φ* be

∀x∀y[α(x)→β(y)→φ′],

where x and y are the first two variables that do not occur in φ, and φ′
is obtained from φ by replacing all occurrences of 0 and 1 by occurrences
of x and y respectively.
(i) Verify that for any ℒ₁-sentence φ we have ℨ ⊨ φ iff ℨ₂ ⊨ φ*.
(ii) Let Σ₂ be an ℒ₂-theory that has ℨ₂ as a model and contains the
sentences ∃!v₁α and ∃!v₁β. (For the definition of ∃! see Def. 2.10.1.)
Let Σ₁ be the ℒ₁-theory based on the set of postulates Σ₂ ∪ {α(0), β(1)}.
Show that * is a reduction of Σ₁ to Σ₂ and Σ₁ has ℨ as a model. Hence,
by part (iii) of Prob. 10.11, Σ₂ is undecidable.
(iii) Let Σ₂ be any ℒ₂-theory that has ℨ₂ as a model. Show that Σ₂ is
undecidable. (Use (ii) and Prob. 10.10.)
10.13. Problem. Let ℒ₂ and ℨ₂ be as in Prob. 10.12. Let ℒ₃ be the
first-order language with equality whose only extralogical symbols are two
ternary predicate symbols S and P. Let ℨ₃ be the ℒ₃-structure whose
domain is the set of all integers and in which S and P are interpreted as
the set of all triples of the forms (i,j,i+j) and (i,j,ij) respectively.
Let φ be any ℒ₂-sentence. By Lemma 2.10.4 we can find an ℒ₂-sentence
φ′ logically equivalent to φ such that any atomic subformula of φ′ which
contains + or × has the form x+y=z or x×y=z. Replace all such
subformulas of φ′ by Sxyz and Pxyz respectively, and let φ* be the resulting
sentence.
(i) Verify that for each ℒ₂-sentence φ we have ℨ₂ ⊨ φ iff ℨ₃ ⊨ φ*.
(ii) Let α and β be the ℒ₃-sentences ∀v₁∀v₂∃!v₃Sv₁v₂v₃ and
∀v₁∀v₂∃!v₃Pv₁v₂v₃. Prove that for any ℒ₂-sentence φ we have ⊨ φ iff
⊨ α→β→φ*. (Argue as in the proof of Thm. 2.10.5.)
(iii) Prove that if Σ is an ℒ₃-theory that has ℨ₃ as a model, then Σ is
undecidable. (Consider the finite extension Σ′ of Σ based on the postulates
Σ ∪ {α,β}, with α and β as in (ii). Show that * is a reduction to Σ′ of some
ℒ₂-theory which has ℨ₂ as a model.)
10.14. Problem. Let ℒ₃ and ℨ₃ be as in Prob. 10.13. Let ℒ₄ be the
first-order language with equality whose only extralogical symbol is a
quaternary predicate symbol R. Let ℨ₄ be the ℒ₄-structure whose domain is
the set of integers and in which R is interpreted as the set of all quadruples
of the forms (i,j,i+j,i+j) and (i,j,ij,k), where k≠ij.
For any ℒ₃-sentence φ, let φ* be the ℒ₄-sentence obtained by replacing
each atomic subformula Sxyz by Rxyzz and each atomic subformula Pxyz
by ∃u(u≠z∧Rxyzu), where u is the first variable different from x, y and z.
(i) Verify that for each ℒ₃-sentence φ we have ℨ₃ ⊨ φ iff ℨ₄ ⊨ φ*. Also,
if ⊨ φ then ⊨ φ*.
(ii) Prove that if Σ is an ℒ₄-theory that has ℨ₄ as a model, then Σ is
undecidable. (Show that * is a reduction to Σ of some ℒ₃-theory that has
ℨ₃ as a model.)
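
The way a single quaternary relation carries both addition and multiplication can be checked mechanically on a finite window of integers. The sketch below does this; the function names are hypothetical. Restricting the witness u to the window is an assumption of the sketch, harmless as long as the window has at least two elements.

```python
from itertools import product


def R(i: int, j: int, k: int, l: int) -> bool:
    """Quadruples (i, j, i+j, i+j), or (i, j, i*j, l) with l != i*j."""
    return (k == i + j and l == i + j) or (k == i * j and l != i * j)


def S_via_R(x: int, y: int, z: int) -> bool:
    """Translation of Sxyz, namely Rxyzz."""
    return R(x, y, z, z)


def P_via_R(x: int, y: int, z: int, witnesses) -> bool:
    """Translation of Pxyz: some u != z with Rxyzu, u taken from `witnesses`."""
    return any(u != z and R(x, y, z, u) for u in witnesses)


def encoding_is_faithful(window) -> bool:
    """Check both translations against + and * for all triples in `window`."""
    values = list(window)
    return all(
        S_via_R(x, y, z) == (z == x + y)
        and P_via_R(x, y, z, values) == (z == x * y)
        for x, y, z in product(values, repeat=3)
    )
```

A call such as encoding_is_faithful(range(-4, 5)) runs the check on the integers from -4 to 4.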
10.15. Problem. Let ℒ₄ be as in Prob. 10.14, and let ℒ₅ be the first-order
language with equality whose only extralogical symbols are an individual
constant c and a binary predicate symbol Q.
For each ℒ₄-sentence φ, let φ* be the ℒ₅-sentence obtained when each
atomic subformula Ruvwz is replaced by

∃x∃y(Qxy∧Qcx∧Qcy∧Quc∧Qvc∧Qwc∧Qzc∧Qux∧Qxv∧Qwy∧Qyz),

where x and y are the first two variables different from u,v,w and z.
(i) Let 𝔘 be an ℒ₄-structure. Let 𝔅 be an ℒ₅-structure obtained as
follows. The individuals of 𝔅 are those of 𝔘, plus a new object (u,v) for
every ordered pair (u,v) of individuals of 𝔘, plus one additional object c.
We take c as the designated individual of 𝔅. As Q^𝔅 we take the set of all
pairs ((u,v),(w,z)) where R holds of (u,v,w,z) in 𝔘, as well as all pairs
(c,(u,v)), (u,c), (u,(u,v)) and ((u,v),v), where u and v are any individuals
of 𝔘. Verify that 𝔘 ⊨ φ iff 𝔅 ⊨ φ*.
(ii) For i = 4, 5 let Λᵢ be the set of all logically valid ℒᵢ-sentences. Show
that * is a reduction of Λ₄ to Λ₅. Hence, by part (ii) of Prob. 10.14, Λ₅
is undecidable.
(iii) Let ℒ₆ be like ℒ₅ but without the constant c, and let Λ₆ be the set
of all logically valid ℒ₆-sentences. Show that Λ₅ is reducible to Λ₆, hence
Λ₆ is undecidable. (Use 3.1.13.)
(iv) Let ℒ₇ be a first-order language without equality, whose only
extralogical symbols are two binary predicate symbols. Let Λ₇ be the set of
all logically valid ℒ₇-sentences. Prove that Λ₇ is undecidable. (Use Thm.
2.11.2 to find a reduction of Λ₆ to a finite extension of Λ₇.)

§11. Incompleteness

In this section we again confine our attention to the first-order language
of arithmetic, ℒ.
A theory Σ is complete if it is consistent and for every sentence α we
have α ∈ Σ or ¬α ∈ Σ. (Cf. Lemma 5.4.1.)
Among the consistent theories, complete theories are maximal with respect
to inclusion: clearly, no complete theory can be properly included in a
consistent theory. In particular, Ω is the only sound and complete theory;
for Ω is evidently complete, and any sound theory is, by definition,
included in Ω.
If Σ is a consistent theory, then by Thm. 3.3.13 some ℒ-structure 𝔘 is
a model for Σ. If Σ′ is the set of all sentences holding in 𝔘, then Σ′ is a
complete theory, and Σ ⊆ Σ′. Therefore Σ itself is complete iff Σ = Σ′.
Thus, a consistent theory Σ is complete iff it is the set of all sentences
holding in some ℒ-structure 𝔘. (Cf. Prob. 5.4.2.)

From now on we shall confine our attention to axiomatizable theories.

11.1. Example. Let Σ be the set of all sentences that hold in the ℒ-structure
whose universe consists of a single individual. (There is only one such
structure, up to isomorphism.) By the above, Σ is complete. Also, Σ can
be characterized as the set of all sentences deducible from the single postulate
∀v₁(v₁=0), so Σ is axiomatizable.

11.2. Problem. Let 𝔘 be any finite ℒ-structure. Show that the set of all
sentences that hold in 𝔘 is an axiomatizable complete theory. (Cf. Prob.
5.2.12 (iii).)

The following theorem and (in a somewhat different version) the one
after it are due to Gödel.
11.3. Theorem. Every sound axiomatizable theory is incomplete.
Proof. By Cor. 5.5, Ω is not axiomatizable. Hence every sound
axiomatizable theory is properly included in Ω and must therefore be
incomplete. ∎

This result merely highlights the message of Cor. 5.5: no complete
axiomatization of Ω is possible.
If Σ is a sound incomplete theory, there must exist a true sentence φ such
that φ ∉ Σ. The proof of the following theorem yields a method for
constructing such a sentence φ, for any sound axiomatic theory.

11.4. Theorem. Given any sound axiomatic theory Σ, we can find a true
sentence φ of the form ∀x₁...∀xₖ(r≠t) such that φ ∉ Σ.
Proof. Since Σ is given to us as an axiomatic theory, we can assume by
Def. 5.2 that we are given the r.e. property T_Γ, where Γ is a set of postulates
for Σ. From T_Γ we can construct the r.e. property T_Σ as in the proof of
Thm. 5.4. Hence we can also construct the r.e. property λxT_Σ(d(x)), where
d is the diagonal function of Ex. 4.5.
By the MRDP Theorem and Lemma 3.4, we can find a formula α ∈ Φ₁,
of the form

∃v₂...∃vₘ(r=t),

where m ≥ 1, such that α represents λxT_Σ(d(x)) in Ω. It follows that the
formula

β = ∀v₂...∀vₘ(r≠t),

which is logically equivalent to ¬α, represents λx[¬T_Σ(d(x))] in Ω. Thus

β(s_#β) ∈ Ω iff ¬T_Σ(d(#β)) holds.

But by (4.6) and the definition of T_Σ we see that

¬T_Σ(d(#β)) holds iff β(s_#β) ∉ Σ.

Let φ be the sentence β(s_#β). Then by what we have just shown,

φ ∈ Ω iff φ ∉ Σ.

We cannot have φ ∈ Σ, because then φ ∉ Ω, contrary to the soundness of Σ.
Hence we have φ ∉ Σ and φ ∈ Ω, i.e., φ is true but is not in Σ. ∎
Note that the sentence φ constructed in the above proof asserts the
unsolvability (in natural numbers) of a certain diophantine equation.
In the proof of Thm. 4.7 we showed that if T_Ω were arithmetical then
we could find a sentence asserting its own falsity (or, at any rate, something
equivalent to its own falsity). Since such a sentence cannot exist, we
concluded that T_Ω is not arithmetical. Here, on the other hand, we have
used the same method to construct a sentence φ which asserts something
equivalent to “I do not belong to Σ”. There is nothing contradictory
about the existence of φ (we actually have shown how to construct it!).
Moreover, φ cannot be false because this would contradict the soundness
of Σ. Thus φ is true.
While Thm. 4.7 was proved by showing that if T_Ω were arithmetical
then the Liar paradox could be reproduced in ℒ, the proof of 11.4 merely
skirts the paradox.
Having disposed of the problem of completeness for sound axiomatizable
theories, we now turn to axiomatizable theories that are consistent but
not necessarily sound.
11.5. Theorem. Every axiomatizable complete theory is recursively decidable.
Proof. Let Σ be an axiomatizable complete theory. Since Σ is
axiomatizable, T_Σ is r.e. by Thm. 5.4. On the other hand, ¬T_Σ(x) holds iff
x is not a SENTENCE or x is a sentence whose negation belongs to Σ. Thus we
have the identity

¬T_Σ(x) = ¬Frm(x,0) ∨ T_Σ(32 · 3^x).

Hence, by Prob. 4.4 and Lemma 6.11.2, ¬T_Σ is recursively enumerable.
Since both T_Σ and ¬T_Σ are r.e., it follows from Thm. 6.11.4 that T_Σ is
recursive. ∎
Note that from Thm. 11.5 and Cor. 10.6 we get another proof of Thm.
11.3. Note also that the argument of Thm. 11.5 can be used to prove the
recursive decidability of the various axiomatic complete theories in §4
of Ch. 5.
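
Informally, the proof of Thm. 11.5 describes a search: generate the members of the theory one by one and wait until either the given sentence or its negation turns up. A minimal sketch of this search follows. All names are hypothetical, sentences are represented as strings, and the "theory" at the end is a made-up toy over atoms p0, p1, ..., not a theory of arithmetic.

```python
from itertools import count
from typing import Callable, Iterable


def decide_complete_theory(
    sentence: str,
    enumerate_theorems: Callable[[], Iterable[str]],
    negate: Callable[[str], str],
) -> bool:
    """Decide membership in a complete theory, given an enumeration of it.

    The search terminates because completeness puts either `sentence` or
    its negation in the theory. The input is assumed to be a sentence.
    """
    negation = negate(sentence)
    for theorem in enumerate_theorems():
        if theorem == sentence:
            return True
        if theorem == negation:
            return False
    raise ValueError("enumeration ended: the theory was not complete")


# Toy stand-in: the "theory" contains p_n for even n and ~p_n for odd n.
def toy_negate(s: str) -> str:
    return s[1:] if s.startswith("~") else "~" + s


def toy_theorems():
    for n in count():
        yield f"p{n}" if n % 2 == 0 else f"~p{n}"
```

Consistency guarantees that the two exits never compete; completeness guarantees that one of them is reached.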

Is every axiomatizable and recursively decidable theory necessarily
complete? A negative answer is provided by:

11.6. Problem. Let Σ be the set of all sentences deducible from the single
postulate

∀v₁(v₁=s₀ ∨ v₁=s₁).

Show that Σ is the intersection of finitely many (actually: 1025) different
theories, each of which is axiomatizable and complete by Prob. 11.2,
and hence recursively decidable. Hence show that Σ is recursively decidable
and incomplete.

The following two theorems were proved, in a weaker form, by Gödel.
A version which is intermediate between Gödel’s original one and the one
given below is due to Rosser.

11.7. Theorem. Every consistent axiomatizable theory that includes Π₁ is
incomplete.
Proof. Immediate from Cor. 10.4 and Thm. 11.5. ∎
In the proof of the following theorem (a strengthened version of
Gödel’s celebrated First Incompleteness Theorem) we show how, for
any given consistent axiomatic theory Σ such that Π₁ ⊆ Σ, one can actually
construct a sentence φ such that φ ∉ Σ and ¬φ ∉ Σ.

11.8. First Incompleteness Theorem. Given any consistent axiomatic
theory Σ that includes Π₁, we can find a formula γ ∈ Φ₁ (of the form described
in Lemma 7.9) such that the sentences γ(s_#γ) and ¬γ(s_#γ) do not belong to Σ.
Proof. As in the proof of Thm. 11.4, we can construct the r.e. property
T_Σ. Put

P(x) = T_Σ(32 · 3^d(x)), P′(x) = T_Σ(d(x)),

and construct γ as in Lemma 7.9 (for n = 1).
Suppose we had γ(s_#γ) ∈ Σ. By the consistency of Σ we get ¬γ(s_#γ) ∉ Σ. But

#[γ(s_#γ)] = d(#γ), #[¬γ(s_#γ)] = 32 · 3^d(#γ).

Hence ¬P(#γ) ∧ P′(#γ) holds. By Lemma 7.9 it follows that

¬γ(s_#γ) ∈ Π₁ ⊆ Σ,

contrary to the assumed consistency of Σ.
Now suppose ¬γ(s_#γ) ∈ Σ. By the consistency of Σ we get γ(s_#γ) ∉ Σ.
Hence in this case P(#γ) ∧ ¬P′(#γ) holds, so that by Lemma 7.9
γ(s_#γ) ∈ Σ. This again contradicts the consistency of Σ. ∎

At first sight it might seem that Thms. 11.7 and 11.8 add little of interest
to Thms. 11.3 and 11.4. After all, we are really interested in theories that
are not merely consistent but sound. And, for a sound theory Σ, Thms.
11.7 and 11.8 yield roughly the same results as Thms. 11.3 and 11.4, but
only subject to the additional condition that Π₁ ⊆ Σ.
However, notice that the condition of soundness (which is imposed in
Thms. 11.3 and 11.4) is purely semantic. A theory Σ is sound iff all its
sentences are true, i.e., Σ ⊆ Ω; and we know that Ω cannot be characterized
in a reasonably constructive way. On the other hand, the conditions
imposed in Thms. 11.7 and 11.8 are much less problematic. The condition
Π₁ ⊆ Σ means that certain sentences of a rather simple form ((6.1)-(6.4)
and (7.2)-(7.4)) belong to Σ. To show that this condition holds, it is enough
to describe a method for constructing deductions of these sentences from
a given set of postulates for Σ. (Actually, since Π₁ ⊆ Π₂, it is enough to
exhibit deductions of the nine postulates of Π₂.) Also, the condition that
Σ is consistent simply means that the sentence 0≠0 does not belong to Σ,
i.e., that no deduction from a given set of postulates for Σ terminates
with this sentence.
Thus the advantage of Thms. 11.7 and 11.8 is that the notions mentioned
in them are far more elementary, constructive and “tangible” than the
notion of soundness.
This fact is utilized in the proof of Gödel’s Second Incompleteness
Theorem, which we now proceed to outline.

We start by noting that the proof of Thm. 11.8 works also if Σ is assumed
to be axiomatizable rather than axiomatic, except that in this case we can
only say that the formula γ exists, not that we can actually find it.
Now, let Σ be any axiomatizable theory that includes first-order Peano
arithmetic Π. Define P and P′ as in the proof of Thm. 11.8, and let α, α′,
β, β′ and γ be as in Lemma 7.9 (for n = 1). Since Π₁ ⊆ Π ⊆ Σ, the second
part of the proof of Thm. 11.8 establishes that

(1) if Σ is consistent, then ¬γ(s_#γ) ∉ Σ.

Let us look for an ℒ-sentence that formalizes assertion (1). Let k = #(0≠0).
(In fact, k = 32 · 3^240.) To say that Σ is consistent is clearly tantamount to
saying that the sentence 0≠0 is not in Σ, i.e., that T_Σ(k) does not hold.
Since 0≠0 is a sentence, it is easy to see that d(k) = k. Thus the statement
that Σ is consistent is equivalent to the statement that T_Σ(d(k)) does not
hold, i.e., that P′(k) does not hold. But the formula α′ represents P′ in Ω.
(See proof of Lemma 7.9.) Therefore, if we put φ = ¬α′(s_k) then

(2) Σ is consistent iff φ ∈ Ω.

It is natural to regard φ as the ℒ-sentence asserting the consistency of Σ.
Next, we observe that the statement that ¬γ(s_#γ) ∉ Σ means that P(#γ)
does not hold. Since α represents P in Ω, we have

(3) ¬γ(s_#γ) ∉ Σ iff ¬α(s_#γ) ∈ Ω.

From (2) and (3) it follows that (1) is equivalent to the statement that the
sentence

(4) φ→¬α(s_#γ)

is true, i.e., belongs to Ω. Since we have actually established (1), we conclude
that the sentence (4) in fact belongs to Ω.
It turns out that (4) belongs not only to Ω but to Π. This can be established
by analysing in detail the whole chain of steps in the informal proof of
(1) and showing that this proof can be formalized as a deduction of (4)
from the postulates of Π. (This is made possible by the great strength of
these postulates.) This detailed analysis is beyond the scope of our book,
and we ask the reader to accept on trust the fact that (4) belongs to Π.
Next, let m be any number. Inspecting the formulas α, β and γ (see
proof of Lemma 7.9) we easily see that ¬α(s_m) ⊢ ¬∃yβ(s_m) and hence
¬α(s_m) ⊢ ¬γ(s_m). Using this for m = #γ, and the fact that (4) belongs to Π,
we see that

(5) (φ→¬γ(s_#γ)) ∈ Π.

Now, we have assumed that Π ⊆ Σ. If Σ is consistent, then φ cannot belong
to Σ, because then by (5) we would get ¬γ(s_#γ) ∈ Σ, contradicting (1). Thus
we have:

11.9. Second Incompleteness Theorem. Let Σ be an axiomatizable theory
that includes first-order Peano arithmetic. If Σ is consistent, then the
sentence asserting this is not in Σ. ∎

This result can be extended to ℒ′-theories Σ′ in languages ℒ′ other than ℒ,
provided that appropriate “translations” or counterparts of the postulates
of Π are in Σ′. Speaking somewhat loosely, we may therefore say that
if Γ is an axiomatic theory (in any formal language) which is at least as
strong as Π and if Γ is consistent, then this fact cannot be deduced in Γ.
This constitutes a grave difficulty for the formalist view of mathematics.

According to the formalist doctrine (as embodied, e.g., in the Hilbert
Programme), classical mathematics (more precisely, those parts of it
that refer to infinite collections) cannot be justified by virtue of its alleged
meaning but must be taken to be an axiomatic formal theory (or perhaps
a collection of such formal theories).
In order to be sure that a formal theory will not lead us to contradiction,
we need to prove its consistency. And one will only be convinced by a
proof that can be justified by virtue of the meaning of the concepts used in it.
But by the Second Incompleteness Theorem it turns out that if classical
mathematics is taken to include, e.g., first-order Peano arithmetic (and
there is very little left if it is not included), then any consistency proof for
such an axiomatic formal theory is, at least in some sense, less elementary
than the methods which the formal theory formalizes. It would seem that
the consistency proof itself is at least as much in need of justification as
the formal theory whose consistency is being proved.
A conceivable way out of this dilemma would be to demarcate the
boundary — between methods of proof that are supposed to be convincing
by virtue of their meaning, and methods that are regarded merely as formal
manipulations unjustified by meaning — in such a way that the methods
needed for a consistency proof for classical mathematics would fall just
on the “right” side of the boundary. The Second Incompleteness Theorem
implies that the task of inventing a satisfactory demarcation of this kind is
tremendously difficult. But it is not hopeless.
*11.10. Problem. (i) Show that there are 2^ℵ₀ countable models of Π,
no two of which are elementarily equivalent. (Using the First Incompleteness
Theorem, get for each f ∈ 2^N, i.e., for each infinite sequence of 0’s and 1’s,
a consistent theory Σ_f such that Π ⊆ Σ_f and such that if f ≠ f′ then there
is a sentence φ for which φ ∈ Σ_f and ¬φ ∈ Σ_f′.)
(ii) Show that there are 2^ℵ₀ pointwise definable models of Π no two
of which are elementarily equivalent. (Use (i) and Prob. 9.17.) This result
is in sharp contrast with the last result of Prob. 9.17.

§ 12. Historical and bibliographical remarks

The existence of non-standard models of Π was proved by Skolem [1934].
He constructed such a model using a method very much like that of
Prob. 3.13.
A rigorous semi-formal characterization of the natural numbers was
first given by Dedekind [1888].
Peano [1889] proposed his famous postulates, formalized in a language
very much like the language ℒ′ of §2.
Results which include Thm. 4.7 and Prob. 4.10 were contained in a paper
published by Tarski in Polish in 1933, of which Tarski [1935] is a German
translation. (English translation in Tarski [1956].)
The undecidability of the predicate calculus (Thm. 10.7) is due to Church
[1936a]. For other undecidability results see Tarski, Mostowski and
Robinson [1953] and Davis [1965]. Additional results on undecidability,
as well as positive solutions of various decision problems, abound in the
literature and are too numerous to be listed here.
The results of §11 are due to Gödel [1931]. For his First Incompleteness
Theorem, Gödel required the theory in question to be ω-consistent, i.e.,
if a sentence ¬∀xα belongs to the theory, then for some number n the
sentence α(x/s_n) does not belong to the theory. This requirement is strictly
stronger than consistency. Rosser [1936] gave a modified proof, in which
only consistency is assumed.
A proof of the Second Incompleteness Theorem can be found in
Feferman [1960].
CHAPTER 8

RECURSION THEORY (CONTINUED)

In this chapter we shall discuss a selection of topics in recursion theory,
most of which arise or acquire their significance from ideas studied in Ch. 7.
Thus, while the bulk of this chapter can be understood by a reader who
has only read Ch. 6, a proper appreciation of the material requires some
knowledge of Ch. 7.

§ 1. The arithmetical hierarchy

In this section relation will mean a second-order relation in the sense of §6
of Ch. 6. Recall that first-order relations are also included, as a special case.
We define the class of arithmetical relations as the smallest class that
contains all recursive relations and is closed under the logical operations,
i.e., the propositional operations (defined in Ex. 6.6.6) as well as (unbounded)
existential and universal quantification (defined at the beginning of §11
of Ch. 6).
For first-order relations this definition is equivalent to Def. 7.3.2, in
view of Thm. 7.3.7.
There is a neat and useful classification of arithmetical relations, based
on the following:

1.1. Definition. By induction on m we define classes Σ⁰ₘ, Π⁰ₘ and Δ⁰ₘ as
follows:
Σ⁰₀ is the class of all recursive relations;
for all m, Π⁰ₘ consists of all relations which are negations of relations
belonging to Σ⁰ₘ;
for all m, Δ⁰ₘ = Σ⁰ₘ ∩ Π⁰ₘ;
for all m, Σ⁰ₘ₊₁ consists of all relations obtained by an existential
quantification from relations belonging to Π⁰ₘ.
The collection of all classes Σ⁰ₘ, Π⁰ₘ and Δ⁰ₘ is the arithmetical hierarchy.

The superscript “0” in the above notation is designed to distinguish
between the arithmetical hierarchy and other hierarchies (whose classes
are denoted similarly, but with different superscripts) which are studied
in recursion theory and other branches of mathematics.
Evidently, a relation belonging to any class in the arithmetical hierarchy
is arithmetical. We shall soon see that the converse is also true.
From Def. 1.1 it is clear that Π⁰₀ = Σ⁰₀, since a relation is recursive iff
its negation is recursive. Also, the first-order relations belonging to Σ⁰₁
are precisely the r.e. relations.
Further, since

¬∃yQ(ξ;𝔵,y) = ∀y¬Q(ξ;𝔵,y),

it follows that, for all m, Π⁰ₘ₊₁ consists of all relations obtainable by a
universal quantification from relations in Σ⁰ₘ.
Thus, if in each clause of Def. 1.1 we interchange the symbols “Σ”
and “Π” and the words “existential” and “universal”, the resulting
statement is true. The same holds for any theorem deduced from Def. 1.1.
This fact is the principle of duality for the arithmetical hierarchy. (We
do not pursue this matter in rigorous detail because we shall only use
duality in an informal way.)
By induction on m it is now easy to show that an n-ary relation P is
in Σ⁰ₘ iff the identity

P(ξ;𝔵) = ∃y₁∀y₂∃y₃...R(ξ;𝔵,y₁,...,yₘ)

holds for some recursive relation R. Here the quantifiers alternate between
existential and universal, beginning (on the left) with the former.
The dual of the above statement yields a general form of Π⁰ₘ relations.
We shall now show that the arithmetical classes enjoy certain closure
properties.

1.2. Lemma. Let H₁,...,Hₖ be total recursive n-ary functionals. If

Q(ξ;𝔵) = P(ξ;H₁(ξ;𝔵),...,Hₖ(ξ;𝔵))

and P belongs to some class in the arithmetical hierarchy, then Q belongs
to the same class.
Proof. For the class Σ⁰₀ (= Π⁰₀) the lemma is obvious. For Σ⁰ₘ and Π⁰ₘ
with m > 0, the result follows by induction on m. The result for Δ⁰ₘ is then
immediate from the definition of this class. ∎

1.3. Problem. Show that if

Q(ξ;𝔵,z₁,...,zₘ) = P(λyH(ξ;y,z₁,...,zₘ);𝔵),

where H is a total recursive functional, and if P belongs to some class in
the arithmetical hierarchy, then Q belongs to the same class. (For the
classes Σ⁰ₘ and Π⁰ₘ proceed by induction on m. Use Prob. 6.5.4 or Prob.
6.8.7.)

1.4. Lemma. For all m, the class Σ⁰ₘ₊₁ is closed under existential
quantification. Dually, Π⁰ₘ₊₁ is closed under universal quantification.
Proof. Similar to the proof of part (2) of Lemma 6.11.2. ∎

1.5. Lemma. Every class in the arithmetical hierarchy is closed under
disjunction, conjunction and bounded quantification (both existential and
universal). For every m, the class Δ⁰ₘ is closed under negation, and hence
under all propositional operations.
Proof. For Σ⁰ₘ and Π⁰ₘ proceed by induction on m. The case m=0 is
covered by Ex. 6.6.6 and Ex. 6.6.9. In the induction step, for Σ⁰ₘ₊₁ argue
as in the proof of Lemma 6.11.2, and for Π⁰ₘ₊₁ by duality. For Δ⁰ₘ the
results then follow by the definition of this class. ∎

1.6. Lemma. Σ⁰ₘ ∪ Π⁰ₘ ⊆ Δ⁰ₘ₊₁ for all m.
Proof. By induction on m. For m = 0 we observe that

P(ξ;𝔵) = ∃y[P(ξ;𝔵) ∧ (y=y)]
       = ∀y[P(ξ;𝔵) ∧ (y=y)],

so that, if P is recursive, it belongs to both Σ⁰₁ and Π⁰₁, hence to Δ⁰₁.
For the induction step, let P ∈ Σ⁰ₘ₊₁. Then P(ξ;𝔵) = ∃yQ(ξ;𝔵,y),
where Q ∈ Π⁰ₘ. By the induction hypothesis we have Q ∈ Δ⁰ₘ₊₁ ⊆ Π⁰ₘ₊₁,
hence P ∈ Σ⁰ₘ₊₂. Also, λy(y=y) is recursive, i.e., belongs to Σ⁰₀; but by the
induction hypothesis

Σ⁰₀ ⊆ Σ⁰ₘ₊₁.

Since

P(ξ;𝔵) = ∀y[P(ξ;𝔵) ∧ (y=y)],

it follows from Lemma 1.5 that P ∈ Π⁰ₘ₊₂. Thus P ∈ Σ⁰ₘ₊₂ ∩ Π⁰ₘ₊₂ = Δ⁰ₘ₊₂
and we have shown that Σ⁰ₘ₊₁ ⊆ Δ⁰ₘ₊₂. Dually, also Π⁰ₘ₊₁ ⊆ Δ⁰ₘ₊₂. ∎

In the sequel we shall often make implicit use of the results obtained
so far in this section.

We can now see that any arithmetical relation belongs to Δ⁰ₘ for sufficiently
large m. The recursive relations constitute the class Σ⁰₀ = Σ⁰₀ ∩ Π⁰₀ = Δ⁰₀.
If P is in Δ⁰ₘ, then ¬P too is in Δ⁰ₘ; and if P′ is in Δ⁰ₘ′, then both P and P′
are in Δ⁰_max(m,m′), so that the disjunction P∨P′ also belongs to this class.
The other propositional operations can be reduced to negation and
disjunction. Finally, both universal and existential quantification map Δ⁰ₘ
into Σ⁰ₘ₊₁ ∪ Π⁰ₘ₊₁, and hence, by Lemma 1.6, into Δ⁰ₘ₊₂.
Thus the arithmetical hierarchy is a cumulative classification of all
arithmetical relations.
We shall now obtain a convenient canonical form for relations in Σ⁰ₘ₊₁
and in Π⁰ₘ₊₁.

1.7. Enumeration Theorem. Let P be an n-ary relation. Then P belongs


to Z°m+1 iff there is a number e such that

P(ff,x)=3y1\/y2...fym3ym+1Tn+m(ff, e,x,y^...tym+^
•or
P(Z;z) = 3y1Vy2..3ymVym+1^ Tn+Jfi\ e,x,y1,...,ym+1),

according as m is even or odd. Similarly for n°m+1, interchanging “3” with


“V” and “rn+m” with “-1 rn+m”.
Proof. It is enough to prove the result for $\Sigma^0_1$, since the other cases then
follow easily by induction on $m$, using Def. 1.1.
In Cor. 6.11.6 we have already got the required result for first-order
relations. (For, the r.e. relations are clearly the same as the first-order
relations belonging to $\Sigma^0_1$.) We shall use the same method in the present
proof.
First, observe that if

(1)   $P(\xi;\mathfrak{x})=\exists y\,T_n(\xi;e,\mathfrak{x},y)$

then $P\in\Sigma^0_1$ because $T_n$ is a p.r. relation, and hence belongs to $\Pi^0_0$.
Conversely, suppose that $P\in\Sigma^0_1$. Then

(2)   $P(\xi;\mathfrak{x})=\exists y\,R(\xi;\mathfrak{x},y)$,

where $R$ is a recursive relation. Put

(3)   $F(\xi;\mathfrak{x})=\mu y\,R(\xi;\mathfrak{x},y)$.

Then $F$ is clearly a recursive functional and hence by the NFT (Thm. 6.10.1)
$F$ has an index $e$, for which

(4)   $F(\xi;\mathfrak{x})=U(\mu y\,T_n(\xi;e,\mathfrak{x},y))$.

From (2), (3) and (4) we easily get (1). ■


Put $Q(\xi;\mathfrak{x},z)=\exists y\,T_n(\xi;z,\mathfrak{x},y)$. Then $Q$ is clearly in $\Sigma^0_1$. By Thm. 1.7,
$Q$ enumerates all $n$-ary relations in $\Sigma^0_1$, in the sense that as $e$ runs through $N$,
the relation $P$ defined by (1) in the above proof runs through all $n$-ary
relations in $\Sigma^0_1$.
Similarly, for each $n$ and each of the classes $\Sigma^0_{m+1}$ and $\Pi^0_{m+1}$, Thm. 1.7
gives us an $(n+1)$-ary relation belonging to the class in question, which
enumerates all $n$-ary relations in this class. Hence the name of the theorem.

1.8. Problem. Show that, for all $n$ and $m$, there does not exist an $(n+1)$-ary
$\Delta^0_m$ relation which enumerates all $n$-ary $\Delta^0_m$ relations. In particular, since
$\Sigma^0_0=\Pi^0_0=\Delta^0_0$, Thm. 1.7 cannot be extended to $\Sigma^0_0$ and $\Pi^0_0$.

1.9. Hierarchy Theorem. For every $m$ we can find a relation belonging
to $\Sigma^0_{m+1}$ but not to $\Pi^0_{m+1}$ (hence its negation belongs to $\Pi^0_{m+1}$ but not to
$\Sigma^0_{m+1}$).

Proof. Thm. 1.7 gives us a binary $\Sigma^0_{m+1}$ relation $Q$ which enumerates all
unary $\Sigma^0_{m+1}$ relations. Put

$P(\xi;x)=Q(\xi;x,x)$.

Then $P$ is in $\Sigma^0_{m+1}$ by Lemma 1.2. On the other hand, if $P$ were in $\Pi^0_{m+1}$
then $\neg P$ would be in $\Sigma^0_{m+1}$, so for some $e$ we would have the identity

$\neg Q(\xi;x,x)=Q(\xi;x,e)$.

For $x=e$ this yields a contradiction. ■
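
The diagonal step of this proof can be mimicked on a finite table. The following is only a toy sketch: the table `Q`, its size and the divisibility rule that fills it are assumptions made for illustration, and a finite table is of course not an enumerating relation in the sense of Thm. 1.7.

```python
# Toy illustration of the diagonal step in the proof of Thm. 1.9.
# ASSUMPTION: a finite table stands in for the enumerating relation Q;
# Q[e][x] plays the role of Q(x, e) restricted to x, e < SIZE.

def diagonal_complement(Q):
    """Return the row D with D[x] = not Q[x][x] (the relation 'not Q(x, x)')."""
    return [not Q[x][x] for x in range(len(Q))]


def rows_equal_to(Q, row):
    """Return the indices e whose row Q[e] coincides with the given row."""
    return [e for e in range(len(Q)) if Q[e] == row]


SIZE = 5
# Hypothetical table: Q[e][x] is True when x is divisible by e + 1.
Q = [[x % (e + 1) == 0 for x in range(SIZE)] for e in range(SIZE)]
D = diagonal_complement(Q)
```

Whatever table is chosen, `rows_equal_to(Q, D)` is empty, because row `e` and `D` disagree at the argument `x = e`; this is the finite shadow of the contradiction obtained above by putting $x=e$.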


The Hierarchy Theorem is so called because it shows that the cumulative
arithmetical hierarchy is non-degenerate, in the sense that we get something
new each time we go up the hierarchy. For, if $P$ is in $\Sigma^0_{m+1}$ but not in
$\Pi^0_{m+1}$ then $P\notin\Delta^0_{m+1}$ and hence, by Lemma 1.6, $P\notin\Sigma^0_m\cup\Pi^0_m$.
The rest of this section is devoted to an important characterization of
$\Delta^0_{m+1}$ in terms of $\Sigma^0_m$ and $\Pi^0_m$. We begin with:

1.10. Definition. Let $\Phi$ be any class of functionals. The recursive closure
of $\Phi$ (briefly, $\mathcal{R}\Phi$) is the class of all functionals obtainable from the
functionals of $\Phi$ and the basic functionals $Z$, $S$, $A$ and $I_{n,i}$ (for all $n$ and $i$ such
that $1\le i\le n$) by a finite number of applications of the basic operations:
composition, primitive recursion and minimization¹. If $\mathcal{R}\Phi=\Phi$ we say
that $\Phi$ is recursively closed.

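1.11. Remark. The following facts are easily established and will be used
without special mention:
(i) $\mathcal{R}\emptyset$ is the class of recursive functionals.
(ii) $\mathcal{R}\Phi$ is the smallest recursively closed class that includes $\Phi$.
(iii) If $\Phi\subseteq\Psi$ then $\mathcal{R}\Phi\subseteq\mathcal{R}\Psi$.
(iv) $F\in\mathcal{R}\Phi$ iff $F\in\mathcal{R}\Phi_0$ for some finite $\Phi_0\subseteq\Phi$.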

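1.12. Lemma. $\Delta^0_{m+1}\subseteq\mathcal{R}\Pi^0_m$ for all $m$.

Proof. Let $P$ be an $n$-ary $\Delta^0_{m+1}$ relation. Then $P$ is both $\Sigma^0_{m+1}$ and $\Pi^0_{m+1}$.
In other words, both $P$ and $\neg P$ are $\Sigma^0_{m+1}$. Thus

$P(\xi;\mathfrak{x})=\exists y\,Q(\xi;\mathfrak{x},y)$,

$\neg P(\xi;\mathfrak{x})=\exists y\,R(\xi;\mathfrak{x},y)$,

where $Q$ and $R$ are $\Pi^0_m$. We easily verify (as in the proof of Thm. 6.11.4) that

$P(\xi;\mathfrak{x})=Q(\xi;\mathfrak{x},\mu y\,[Q(\xi;\mathfrak{x},y)\vee R(\xi;\mathfrak{x},y)])$.

Thus $P$ is obtained by recursive operations from $Q$ and $R$, and hence
belongs to $\mathcal{R}\Pi^0_m$. ■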
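
The identity used in this proof is an unbounded search that is guaranteed to stop because either $P$ or $\neg P$ has a witness. A minimal sketch follows, under the assumption that the witness tests are ordinary total Python predicates; the names `decide`, `q`, `r` and the perfect-square example are illustrative and do not come from the text.

```python
from itertools import count


def decide(x, q, r):
    """Decide P(x), assuming P(x) iff q(x, y) for some y,
    and not P(x) iff r(x, y) for some y."""
    for y in count():              # unbounded search: mu y [q(x, y) or r(x, y)]
        if q(x, y) or r(x, y):
            return q(x, y)         # P(x) = Q(x, least such y)


# Hypothetical example: P(x) = "x is a perfect square".
def q(x, y):
    return y * y == x              # y witnesses that x is a square


def r(x, y):
    return y * y < x < (y + 1) * (y + 1)   # y witnesses that x is not a square
```

For instance, `decide(9, q, r)` stops at `y = 3` with a positive answer, while `decide(2, q, r)` stops at `y = 1` with a negative one. If neither test ever succeeded for some `x`, the search would not terminate, which is why both halves of the $\Delta^0_{m+1}$ assumption are needed.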

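1.13. Theorem. $\Delta^0_1=\Pi^0_0$ $(=\Sigma^0_0=\Delta^0_0)$.

Proof. Since $\Pi^0_0$ $(=\Sigma^0_0)$ is the set of recursive relations, $\Pi^0_0\subseteq\mathcal{R}\emptyset$; therefore
$\mathcal{R}\Pi^0_0\subseteq\mathcal{R}\emptyset$, so by Lemma 1.12 all $\Delta^0_1$ relations are recursive. Thus $\Delta^0_1\subseteq\Pi^0_0$.
By Lemma 1.6, $\Delta^0_1=\Pi^0_0$. ■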

1.14. Lemma. Let $H$ be a total unary functional. Let $G$ be any $n$-ary functional
obtained from $H$ and the basic functionals $Z$, $S$ and the $I_{m,i}$ (but not $A$) by
a finite number of applications of the basic operations: composition, primitive
recursion and minimization. Then for some $n$-ary recursive functional $F$,

$G(\xi;\mathfrak{x})=F(\lambda y\,H(\xi;y);\mathfrak{x})$.

Proof. By induction on the total number $k$ of applications of the basic
operations made in order to obtain $G$ from $H$, $Z$, $S$ and the $I_{m,i}$.
For $k=0$, we must show that if $G$ is $H$, $Z$, $S$ or $I_{n,i}$ then the conclusion
of our lemma is correct. For $H$, this follows from the identity

$H(\xi;x)=A(\lambda y\,H(\xi;y);x)$.

The cases of $Z$, $S$ and the $I_{n,i}$ are trivial. (E.g., for $S$ we have
$S(\xi;x)=S(\lambda y\,H(\xi;y);x)$.)
In the induction step we must show that if $G$ is obtained by one application
of composition, primitive recursion or minimization from functionals
satisfying the conclusion of our lemma, then $G$ has this property as well.
This is left as a simple exercise to the reader. ■

¹ For the definition of these basic functionals and operations, see Def. 6.5.2.

1.15. Lemma. Let $P$ be an $n$-ary relation. If $P\in\mathcal{R}\Pi^0_m$, then $P\in\Delta^0_{m+1}$.

Proof. Since $P\in\mathcal{R}\Pi^0_m$, it follows that $P$ is in the recursive closure of some
finite subset of $\Pi^0_m$. Thus $P\in\mathcal{R}\{Q_0,\ldots,Q_k\}$, where the $Q_i$ are $\Pi^0_m$ relations.
Now, let $Q'$ be defined by the identity

$Q'(\xi;x,y)=(A(\xi;x)=y)$.

Then $Q'$ is clearly a recursive relation, and so certainly belongs to $\Pi^0_m$.
Therefore we assume without loss of generality that $Q'$ is one of the $Q_i$.
It follows that $P$ is obtainable via the three basic operations from the
$Q_i$ and the basic functionals other than $A$. For, $A$ itself can be obtained
from $Q'$ by minimization:

$A(\xi;x)=\mu y\,Q'(\xi;x,y)$,

and $Q'$ is one of the $Q_i$.

Next, the $Q_i$ may be replaced by unary $\Pi^0_m$ relations. Suppose, e.g., that
$Q_i$ is binary. Put

$M_i(\xi;z)=Q_i(\xi;(z)_0,(z)_1)$.

Then $M_i$ is a unary $\Pi^0_m$ relation. Also, $Q_i$ may be recovered from $M_i$
by composing the latter with a p.r. function:

$Q_i(\xi;x,y)=M_i(\xi;2^x\cdot 3^y)$.

A similar device can be used if $Q_i$ is ternary, etc.
Thus $P$ can be obtained via the three basic operations from unary $\Pi^0_m$
relations $M_0,\ldots,M_k$ and the basic functionals (excluding $A$). Let us put

$M(\xi;x)=\exp(p_0,M_0(\xi;x))\cdots\exp(p_k,M_k(\xi;x))$.

Then each $M_i$ can be obtained by composing a p.r. function with the
functional $M$:

$M_i(\xi;x)=(M(\xi;x))_i$.

Therefore $P$ can be obtained via the three basic operations from $M$ and
the basic functionals (excluding $A$). By Lemma 1.14 it follows that

$P(\xi;\mathfrak{x})=F(\lambda y\,M(\xi;y);\mathfrak{x})$,

where $F$ is some recursive functional. Let $e$ be an index of $F$. By the
NFT we have

$P(\xi;\mathfrak{x})=U\bigl(\mu y\,T^*\bigl(\textstyle\prod_{t\le y}\exp(p_t,M(\xi;t)),e,\mathfrak{x},y\bigr)\bigr)$.

Hence it is easy to verify the identities

(1)   $P(\xi;\mathfrak{x})=\exists y\,\exists z\,[K(\xi;y,z)\wedge T^*(z,e,\mathfrak{x},y)\wedge(U(y)=0)]$,

(2)   $P(\xi;\mathfrak{x})=\forall y\,\forall z\,[K(\xi;y,z)\wedge T^*(z,e,\mathfrak{x},y)\rightarrow(U(y)=0)]$,

where $K(\xi;y,z)=\bigl(\prod_{t\le y}\exp(p_t,M(\xi;t))=z\bigr)$.

If we can prove that the relation $K$ is $\Delta^0_{m+1}$, then (1) shows that $P$ is
obtained by existential quantifications from a $\Sigma^0_{m+1}$ relation, hence $P$
is $\Sigma^0_{m+1}$. Similarly, (2) shows that $P$ is $\Pi^0_{m+1}$. Hence $P$ is $\Delta^0_{m+1}$.
To prove that $K$ is $\Delta^0_{m+1}$ we observe that

(3)   $K(\xi;y,z)=(\mathrm{lh}(z)\le y+1)\wedge\forall t\le y\,[(z)_t=M(\xi;t)]$.

Let us put

$R(\xi;t,u)=(u=M(\xi;t))$.

Clearly, if we can prove that the relation $R$ is $\Delta^0_{m+1}$ then (3) shows that $K$
is $\Delta^0_{m+1}$ as well.
To show that $R$ is $\Delta^0_{m+1}$ we observe that by the definition of $M$

$R(\xi;t,u)=(\mathrm{lh}(u)\le k+1)\wedge((u)_0=M_0(\xi;t))\wedge\ldots\wedge((u)_k=M_k(\xi;t))$.

Thus to prove that $R$ is $\Delta^0_{m+1}$ it is enough to show that the relations $R_0,\ldots,R_k$
defined by the identities

$R_i(\xi;t,v)=(v=M_i(\xi;t))$ for $i=0,\ldots,k$

are $\Delta^0_{m+1}$. However, from this definition of the $R_i$ it follows that

$R_i(\xi;t,v)=(v=0)\wedge M_i(\xi;t)\vee(v=1)\wedge\neg M_i(\xi;t)$.

Since the $M_i$ are $\Pi^0_m$, hence $\Delta^0_{m+1}$, this shows that the $R_i$ are $\Delta^0_{m+1}$
as well. ■
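
The proof above packs the values $M_0,\ldots,M_k$ into one number by prime-power coding and recovers them with the component function $(z)_i$. A small sketch of that coding is given below; the helper names `primes`, `encode` and `component` are illustrative assumptions, and the sketch works with plain numbers rather than functionals.

```python
def primes(k):
    """Return the first k primes p_0, ..., p_{k-1}, found by trial division."""
    found = []
    n = 2
    while len(found) < k:
        if all(n % p for p in found):
            found.append(n)
        n += 1
    return found


def encode(values):
    """Return exp(p_0, v_0) * ... * exp(p_k, v_k) for the given list of values."""
    code = 1
    for p, v in zip(primes(len(values)), values):
        code *= p ** v
    return code


def component(code, i):
    """Return (code)_i, the exponent of p_i in code; assumes code >= 1."""
    p = primes(i + 1)[i]
    exponent = 0
    while code % p == 0:
        code //= p
        exponent += 1
    return exponent
```

With truth represented by 0 and falsehood by 1, as in the identity for $R_i$, the values `[0, 1, 1]` are coded as `encode([0, 1, 1])`, that is $2^0\cdot 3^1\cdot 5^1=15$, and `component(15, i)` returns each value again.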

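1.16. Theorem. A relation belongs to $\Delta^0_{m+1}$ iff it belongs to $\mathcal{R}(\Sigma^0_m\cup\Pi^0_m)$.

Proof. Observe that $\Sigma^0_m\subseteq\mathcal{R}\Pi^0_m$, since any $\Sigma^0_m$ relation can be got by
negation from a $\Pi^0_m$ relation. Hence $\mathcal{R}(\Sigma^0_m\cup\Pi^0_m)=\mathcal{R}\Pi^0_m$, and our theorem
follows at once from Lemmas 1.12 and 1.15. ■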

*§ 2. A result concerning $T_\Omega$

The ideas of §1 can easily be extended to cover relations with more than
one sequence argument (cf. §9 of Ch. 6). Thus, it makes sense to say that
an $(m,n)$-ary relation (a relation having $m$ sequence arguments and $n$
numerical arguments) is arithmetical.
Now, we introduce two new operations: existential and universal
quantification over a sequence variable (rather than over a numerical variable).
These operations are defined in the obvious way.
Using these new operations, one defines classes $\Sigma^1_k$, $\Pi^1_k$ and $\Delta^1_k$ for
$k>0$, as follows: $\Sigma^1_1$ is the class of all $(m,n)$-ary relations obtained from
arithmetical relations by one existential quantification over a sequence
variable. For all $k>0$, $\Pi^1_k$ is the class of all negations of relations in $\Sigma^1_k$,
and $\Sigma^1_{k+1}$ is the class of relations obtained from $\Pi^1_k$ relations by one
existential quantification over a sequence variable. Finally, $\Delta^1_k=\Sigma^1_k\cap\Pi^1_k$.
The classes $\Sigma^1_k$, $\Pi^1_k$ and $\Delta^1_k$ constitute the analytical hierarchy.
The class $\Delta^1_1$ is of special importance; the role it plays in the analytical
hierarchy is analogous to that played by $\Delta^0_1$ (which, by Thm. 1.13, is the
class of recursive relations) in the arithmetical hierarchy. A relation
belonging to $\Delta^1_1$ is said to be hyperarithmetical.
We shall not study the analytical hierarchy in this book. But we shall
now sketch a proof that the property $T_\Omega$ defined in Ch. 7 is hyperarithmetical.
We shall use the same terminology and notation as in Ch. 7. In particular,
recall that $\mathcal{L}$ is the first-order language of arithmetic, $\Omega$ is the set of all
$\mathcal{L}$-sentences which are true (i.e., hold in the standard structure $\mathfrak{N}$) and
$T_\Omega$ is the property of being a sentence of $\Omega$.
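Let $\Phi=\{\varphi_i: i\in N\}$ be a sequence of $\mathcal{L}$-sentences. We say that $\Phi$ is
a truth sequence if the following six conditions hold:
(i) If $\neg\neg\varphi\in\Phi$, then $\varphi\in\Phi$.
(ii) If $(\varphi\rightarrow\psi)\in\Phi$, then $\neg\varphi\in\Phi$ or $\psi\in\Phi$.
(iii) If $\neg(\varphi\rightarrow\psi)\in\Phi$, then $\varphi\in\Phi$ and $\neg\psi\in\Phi$.
(iv) If $\forall x\alpha\in\Phi$, then $\alpha(x/s_n)\in\Phi$ for all $n$.
(v) If $\neg\forall x\alpha\in\Phi$, then $\neg\alpha(x/s_n)\in\Phi$ for some $n$.
(vi) If $\varphi$ belongs to $\Phi$ and has the form $r=t$ or $r\neq t$ (where $r$ and $t$
are closed terms), then $\varphi$ is true (i.e., $\varphi\in\Omega$).

We now define a $(1,0)$-ary relation $\mathrm{Ts}$ as follows: $\mathrm{Ts}(\xi)$ holds iff there
is a truth sequence $\{\varphi_i: i\in N\}$ such that $\xi(i)=\#\varphi_i$ for all $i$.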
It is then an easy matter to verify that $\mathrm{Ts}$ is arithmetical. (In connection
with clause (vi) of the definition of truth sequence, note that while the
property $T_\Omega$ of being a true sentence is not arithmetical, the property
of being a true sentence of the form $r=t$ or $r\neq t$ is recursive.)
Moreover, it is easy to show by induction on $\deg\varphi$ that $\varphi\in\Omega$ iff $\varphi$
belongs to some truth sequence. Thus

(1)   $T_\Omega(x)=\exists\xi\,\exists y\,[\mathrm{Ts}(\xi)\wedge(\xi(y)=x)]$.

Also, a sentence $\varphi$ belongs to $\Omega$ iff $\neg\varphi\notin\Omega$. Hence

(2)   $T_\Omega(x)=\forall\xi\,\forall y\,[\mathrm{Frm}(x,0)\wedge\{\neg\mathrm{Ts}(\xi)\vee(\xi(y)\neq 32\cdot 3^x)\}]$.

From (1) and (2) it follows at once that $T_\Omega$ is hyperarithmetical.

§ 3. Encoded theories

Many of the results proved in Ch. 7 involve both logic and recursion
theory. We shall now set up a framework which generalizes and abstracts
from the particular logical setting of Ch. 7, and in which we can formulate
and prove generalized and rather abstract versions of some of those results.

3.1. Definition. An encoded theory is a triple $\langle A,\varphi,f\rangle$ where $A\subseteq N$, $\varphi$ is
a recursive sequence and $f$ is a binary total recursive function.

If $\Sigma$ is a theory in the sense of Ch. 7 (a deductively closed set of sentences
in the first-order language of arithmetic), then the corresponding encoded
theory is $\langle A,\varphi,f\rangle$, where

$A=\{x: T_\Sigma(x)\text{ holds}\}=\#[\Sigma]$,

$\varphi(x)=32\cdot 3^x$,

$f(x,y)=\mathrm{sb}(x,1,y)$.

Thus $A$ is the set of all code numbers of sentences in $\Sigma$ (see Def. 7.5.1);
and if $x$ is a code number of a formula $\alpha$ then $\varphi(x)$ is the code number
of the negation of $\alpha$ (see §4 of Ch. 7) and $f(x,y)$ is the code number of the
formula obtained from $\alpha$ by substituting the $y$th numeral for the first variable
(see Ex. 7.4.5).
More generally, we may consider an arbitrary formal language $\mathcal{L}$ (which
may or may not be a first-order language). We suppose that $\mathcal{L}$ has certain
objects called “expressions” which are coded by natural numbers. Let
certain expressions be designated as “provable” (or “true”); let $A$ be the
set of code numbers of all such expressions. Suppose also that we have
a recursive sequence $\varphi$ and a binary total recursive function $f$ such that
whenever $x$ is the code number of an expression $\alpha$ then $\varphi(x)$ and $f(x,y)$
are code numbers of expressions which in some appropriate sense can be
regarded as the “negation of $\alpha$” and the “$y$th instance of $\alpha$” respectively.
Then $\langle A,\varphi,f\rangle$ is the encoded theory corresponding to this situation.
(In many results concerning encoded theories $\langle A,\varphi,f\rangle$ the sequence
$\varphi$ plays no role whatsoever. Such results can be applied to languages which
have no negation.)
We shall need the following definitions.
We shall need the following definitions.

3.2. Definition. An encoded theory $\langle A,\varphi,f\rangle$ is consistent if $A\cap\varphi[A]=\emptyset$,
recursively decidable if $A$ is recursive, axiomatizable if $A$ is an r.e. set¹.
An extension of an encoded theory $\langle A,\varphi,f\rangle$ is any encoded theory $\langle A',\varphi,f\rangle$
with $A\subseteq A'$.

3.3. Definition. Let $\langle A,\varphi,f\rangle$ be an encoded theory and let $M\subseteq N$. Then
$M$ is weakly representable in $\langle A,\varphi,f\rangle$ if there is a number $n$ such that the
identity

$(f(n,x)\in A)=(x\in M)$

holds. $M$ is strongly representable in $\langle A,\varphi,f\rangle$ if there is a number $n$ such
that for every $x$

$f(n,x)\in A$ if $x\in M$,

$\varphi f(n,x)\in A$ if $x\notin M$.

(Cf. Def. 7.3.1.)

In what follows, if $M\subseteq N$ then we put $M^c=N-M$.


Generalized versions of several results of Ch. 7 can be obtained as
corollaries to the following:

3.4. Theorem. Let $\langle A,\varphi,f\rangle$ be an encoded theory in which every $\Sigma^0_m$ (or every
$\Pi^0_m$) set of numbers is weakly representable. Then $A$ is not in $\Pi^0_m$ (not in
$\Sigma^0_m$, respectively).

Proof. Suppose $A$ is in $\Pi^0_m$. Then clearly $A^c$ is in $\Sigma^0_m$ and by 1.2 the set

$M=\{x: f(x,x)\in A^c\}$

must also be in $\Sigma^0_m$. If every $\Sigma^0_m$ set is weakly representable in the encoded
theory, then for some number $n$ we have

$(f(n,x)\in A)=(x\in M)$.

Using the definition of $M$ and taking $n$ as the value of $x$ we have $f(n,n)\in A$
iff $f(n,n)\in A^c$, which is absurd.
The same argument works also when “$\Sigma^0_m$” and “$\Pi^0_m$” are interchanged. ■

¹ Cf. Thm. 7.5.4.

In particular, if every arithmetical set of numbers is weakly representable
in $\langle A,\varphi,f\rangle$, then by Thm. 3.4 we see that $A\notin\Pi^0_m$ for all $m$. Thus by what we
have shown in §1, $A$ is not arithmetical. This is a generalized abstract
version of Thm. 7.4.7.
For $m=0$ Thm. 3.4 says that if every recursive set of numbers is weakly
representable in an encoded theory, then the theory is recursively undecidable.
This is a generalized form of Thm. 7.10.1.

3.5. Problem. Let $\langle A,\varphi,f\rangle$ be a consistent encoded theory in which every
$\Sigma^0_m$ (or every $\Pi^0_m$) set of numbers is strongly representable. Show that
$\varphi^{-1}[A]$ is not in $\Sigma^0_m$ (not in $\Pi^0_m$, respectively).

§ 4. Inseparable pairs of sets

Two sets $A$ and $B$ of natural numbers are recursively separable if there are
disjoint recursive sets $C$ and $D$ such that $A\subseteq C$ and $B\subseteq D$. If such $C$ and $D$
exist, then without loss of generality we may assume that $D=C^c$. Thus
$A$ and $B$ are recursively separable iff there is a recursive set $C$ that includes
$A$ and is disjoint from $B$. If $A$ and $B$ are not recursively separable, we say
that they are recursively inseparable.
The notion of recursive separability arises in a natural way in connection
with encoded theories:

4.1. Theorem. An encoded theory $\langle A,\varphi,f\rangle$ has a recursively decidable and
consistent extension iff $A$ and $\varphi[A]$ are recursively separable.

Proof. If $\langle A',\varphi,f\rangle$ is a recursively decidable and consistent extension of
$\langle A,\varphi,f\rangle$ then from Def. 3.2 we see at once that $A'$ is a recursive set that
includes $A$ and is disjoint from $\varphi[A']$, hence also from $\varphi[A]$.
Conversely, if $C$ is a recursive set that includes $A$ and is disjoint from
$\varphi[A]$, put

$A'=\{x: x\in C,\ \varphi(x)\notin C\}$.

It is easy to see that $\langle A',\varphi,f\rangle$ is a recursively decidable consistent extension
of $\langle A,\varphi,f\rangle$. ■

It is easy to see that if $A$ and $\varphi[A]$ are recursively separable, then so are
$A$ and $\varphi^{-1}[A]$. In view of this and Thm. 4.1, the following theorem can be
regarded as a generalized version of Cor. 7.10.3.

4.2. Theorem. Let $\langle A,\varphi,f\rangle$ be an encoded theory in which every recursive
set of numbers is strongly representable. Then $A$ and $\varphi^{-1}[A]$ are recursively
inseparable.

Proof. Let $C$ be a set of numbers including $A$ and disjoint from $\varphi^{-1}[A]$.
It follows from Def. 3.3 that every set which is strongly representable
in $\langle A,\varphi,f\rangle$ is weakly representable in $\langle C,\varphi,f\rangle$. But then by Thm. 3.4
(with $m=0$) we see that $C$ cannot be recursive. ■

4.3. Remark. Thm. 4.2 is of interest only if the encoded theory $\langle A,\varphi,f\rangle$
is consistent; otherwise, the sets $A$ and $\varphi^{-1}[A]$ intersect and are trivially
recursively inseparable. Note also that if $\langle A,\varphi,f\rangle$ is axiomatizable then
$A$ is r.e. and it is easy to see that $\varphi^{-1}[A]$ is r.e. as well.
If $\Sigma$ is a consistent (and axiomatizable) theory in the sense of Ch. 7,
and if $\Sigma$ includes “junior arithmetic” $\Pi_1$, then the corresponding encoded
theory $\langle A,\varphi,f\rangle$ satisfies the condition of Thm. 4.2 (see Thm. 7.7.10). Hence
$A$ and $\varphi^{-1}[A]$ are a pair of disjoint (and r.e.) recursively inseparable sets.
We shall soon strengthen this result.

4.4. Definition. Two sets of numbers $A$ and $B$ are effectively inseparable
if there is a recursive binary function $g$ with the property that whenever
$a$ and $b$ are numbers such that

$A\subseteq\mathrm{dom}_1\{a\}$, $B\subseteq\mathrm{dom}_1\{b\}$,

$\mathrm{dom}_1\{a\}\cap\mathrm{dom}_1\{b\}=\emptyset$,

then $g(a,b)$ is defined and $g(a,b)\notin\mathrm{dom}_1\{a\}\cup\mathrm{dom}_1\{b\}$.

Clearly, if $A$ and $B$ are effectively inseparable, they are also recursively
inseparable.
For our first example of a pair of effectively inseparable and disjoint
r.e. sets we shall need two lemmas.

4.5. Separation Lemma. Let $A$ and $B$ be given $n$-dimensional r.e. sets.
Then we can find disjoint $n$-dimensional r.e. sets $A'$ and $B'$ such that
$A-B\subseteq A'$, $B-A\subseteq B'$, and $A\cup B=A'\cup B'$.

Proof. By 6.11.6 we can find indexes $i$ and $j$ for $A$ and $B$ respectively. Thus

$(\mathfrak{x}\in A)=\exists y\,T(i,\mathfrak{x},y)$,

$(\mathfrak{x}\in B)=\exists y\,T(j,\mathfrak{x},y)$.

Now define $A'$ and $B'$ by the identities

$(\mathfrak{x}\in A')=\exists y\,[T(i,\mathfrak{x},y)\wedge\neg\exists z\le y\,T(j,\mathfrak{x},z)]$,

$(\mathfrak{x}\in B')=\exists y\,[T(j,\mathfrak{x},y)\wedge\neg\exists z<y\,T(i,\mathfrak{x},z)]$.

The idea behind this definition is this. An $n$-tuple $\mathfrak{x}$ belongs to $A$ iff some
(unique) computation code $y$ “puts $\mathfrak{x}$ in $A$”, i.e., $y$ is such that $T(i,\mathfrak{x},y)$
holds. Similarly with “$A$” and “$i$” replaced by “$B$” and “$j$”. Now $A'$
is defined as the set of all $\mathfrak{x}\in A$ such that the computation code putting
$\mathfrak{x}$ in $A$ is smaller than the computation code (if any) putting $\mathfrak{x}$ in $B$. And
$B'$ is defined as the set of all $\mathfrak{x}\in B$ such that the computation code (if any)
putting $\mathfrak{x}$ in $A$ is not smaller than the computation code putting $\mathfrak{x}$ in $B$.
It is easy to see that $A'$ and $B'$ are as required. ■
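
The comparison of computation codes in this proof can be tried out on finite data. In the sketch below each set is given, as an illustrative assumption, by a dictionary sending an element to the code (here just a stage number) that puts it in the set; genuine r.e. sets are infinite and the comparison is then carried out by search rather than by table lookup.

```python
def separate(stage_a, stage_b):
    """Split the union of A and B into disjoint A' and B'.

    stage_a[x] is the (hypothetical) computation code putting x in A,
    stage_b[x] the one putting x in B; absent keys mean "never enters".
    """
    a_prime = {x for x, y in stage_a.items()
               if x not in stage_b or y < stage_b[x]}    # no z <= y puts x in B
    b_prime = {x for x, y in stage_b.items()
               if x not in stage_a or y <= stage_a[x]}   # no z < y puts x in A
    return a_prime, b_prime


# Hypothetical data: 2 enters both sets with the same code, 4 enters A first.
stage_a = {1: 3, 2: 5, 4: 0}
stage_b = {2: 5, 3: 1, 4: 7}
a_prime, b_prime = separate(stage_a, stage_b)
```

Ties go to $B'$, matching the bounds $z\le y$ and $z<y$ in the two defining identities; this asymmetry is what makes $A'$ and $B'$ disjoint while their union is still $A\cup B$.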

4.6. Symmetric Lemma. Let

$C=\{z: z\in\mathrm{dom}_1\{K(z)\}\}$,

$D=\{z: z\in\mathrm{dom}_1\{L(z)\}\}$,

where $K$ and $L$ are the functions defined in Ex. 6.6.13. Then $C-D$ and $D-C$
are effectively inseparable.

Proof. Suppose that

(1)   $C-D\subseteq\mathrm{dom}_1\{a\}$, $D-C\subseteq\mathrm{dom}_1\{b\}$.

We show that $J(b,a)\in\mathrm{dom}_1\{a\}$ iff $J(b,a)\in\mathrm{dom}_1\{b\}$.
Indeed, suppose that $J(b,a)$ belongs to $\mathrm{dom}_1\{a\}$ but not to $\mathrm{dom}_1\{b\}$.
Since $b=K(J(b,a))$ and $a=L(J(b,a))$, it follows from the definition of $C$ and
$D$ that $J(b,a)\in D-C$. Hence by (1) we have $J(b,a)\in\mathrm{dom}_1\{b\}$, contrary
to our assumption. Similarly, the assumption that $J(b,a)$ belongs to
$\mathrm{dom}_1\{b\}$ but not to $\mathrm{dom}_1\{a\}$ leads to contradiction.
If $\mathrm{dom}_1\{a\}$ and $\mathrm{dom}_1\{b\}$ are disjoint, then $J(b,a)$ cannot belong to both,
so it belongs to neither. ■
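
The proof uses only the two identities $b=K(J(b,a))$ and $a=L(J(b,a))$. The sketch below checks them for one concrete choice of pairing; the Cantor pairing function is an assumption made here for illustration and need not coincide with the functions $J$, $K$, $L$ of Ex. 6.6.13.

```python
from math import isqrt


def J(x, y):
    """Cantor pairing (an assumed stand-in for the book's J)."""
    return (x + y) * (x + y + 1) // 2 + y


def unpair(z):
    """Invert J: return the unique (x, y) with J(x, y) == z."""
    w = (isqrt(8 * z + 1) - 1) // 2      # largest w with w(w+1)/2 <= z
    y = z - w * (w + 1) // 2
    return w - y, y


def K(z):
    """First component of the pair coded by z."""
    return unpair(z)[0]


def L(z):
    """Second component of the pair coded by z."""
    return unpair(z)[1]
```

With any such pairing, the function $g(a,b)=J(b,a)$ is the one that the proof shows to satisfy the condition of Def. 4.4.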

4.7. Example. Let $C$ and $D$ be as in Lemma 4.6. Clearly, these sets are
r.e., so by Lemma 4.5 we find disjoint r.e. sets $A$ and $B$ such that $C-D\subseteq A$
and $D-C\subseteq B$. Since $C-D$ and $D-C$ are effectively inseparable, $A$ and
$B$ must also be effectively inseparable.

*4.8. Problem. Let

$A=\{z: \{z\}(z)=0\}$, $B=\{z: \{z\}(z)=1\}$.

It is easy to see that $A$ and $B$ are r.e. and disjoint. Prove that they are
effectively inseparable. (Find a nice function $g$ such that $\{g(x,y)\}(z)=1$
whenever $z\in\mathrm{dom}_1\{x\}-\mathrm{dom}_1\{y\}$, and $\{g(x,y)\}(z)=0$ whenever
$z\in\mathrm{dom}_1\{y\}-\mathrm{dom}_1\{x\}$. Show that $g$ has the property required in Def. 4.4.)

From a given pair of effectively inseparable sets, new pairs may be
obtained using the following:

4.9. Lemma. Let $A$ and $B$ be effectively inseparable, and let $\varrho$ be any recursive
sequence. Then $\varrho[A]$ and $\varrho[B]$ are effectively inseparable.

Proof. Let $g$ be as in Def. 4.4. Using Thm. 6.11.11 we get a nice sequence
$\varphi$ such that

$(z\in\mathrm{dom}_1\{\varphi(x)\})=(\varrho(z)\in\mathrm{dom}_1\{x\})$.

Now put $h(x,y)=\varrho g(\varphi(x),\varphi(y))$. It is easy to verify that $h$ has the property
required in Def. 4.4 (with respect to $\varrho[A]$ and $\varrho[B]$). ■

Note that if $A$ and $B$ are r.e. then so are $\varrho[A]$ and $\varrho[B]$.
We shall now apply the above results to encoded theories. Let $C$ and
$D$ be sets of numbers, and let $\langle A,\varphi,f\rangle$ be an encoded theory. We shall
say that the pair $\langle C,D\rangle$ is representable in $\langle A,\varphi,f\rangle$ if for some $n$

$f(n,x)\in A$ for all $x\in C$,

$\varphi f(n,x)\in A$ for all $x\in D$.

We now have:

4.10. Theorem. If the pair $\langle C,D\rangle$ is representable in the encoded theory
$\langle A,\varphi,f\rangle$, and $C$, $D$ are effectively inseparable, then $A$, $\varphi^{-1}[A]$ are effectively
inseparable.

Proof. Let $n$ be as above, and put $\varrho(x)=f(n,x)$. Then $\varrho[C]\subseteq A$ and
$\varrho[D]\subseteq\varphi^{-1}[A]$. By Lemma 4.9, $\varrho[C]$ and $\varrho[D]$ are effectively inseparable,
hence so are $A$ and $\varphi^{-1}[A]$. ■

4.11. Examples. Let $\Sigma$ be a theory (in the sense of Ch. 7) that includes
“junior arithmetic” $\Pi_1$ and let $\langle A,\varphi,f\rangle$ be the corresponding encoded
theory. From Lemma 7.7.9 it follows at once that if $C$ and $D$ are r.e. sets
of numbers then $\langle C-D,D-C\rangle$ is representable in $\langle A,\varphi,f\rangle$. In particular,
we can take $C$ and $D$ to be those of Lemma 4.6. Thus by Thm. 4.10 the sets
$A$, $\varphi^{-1}[A]$ are effectively inseparable.¹

¹ This is in fact the recursion-theoretic content of the First Incompleteness Theorem
(Thm. 7.11.8).

The same result holds for a wide variety of encoded theories. For example,
in any of the usual formalizations of axiomatic set theory (such as the one
studied in Ch. 10 of this book) it is possible to deduce appropriate translations
of all the sentences of “junior arithmetic”. (For the formalized
set theory of Ch. 10 the method is indicated in §2 of that chapter.) Therefore,
if $\langle A,\varphi,f\rangle$ is an encoded version of such a theory, $A$ and $\varphi^{-1}[A]$ are effectively
inseparable.

§ 5. Productive and creative sets; reducibility

The notion of r.e. set can be regarded as a recursion-theoretic analogue of
the set-theoretic notion of countable set. Thus a set is countable iff it is the
range of some mapping whose domain is included in $N$; and by Thm. 6.11.5 a
set of numbers is r.e. iff it is the range of some recursive unary function.
Other useful recursion-theoretic notions may be defined by pursuing
the same kind of analogy.
What should be the recursion-theoretic analogue of the notion of
uncountable set? At first sight it may seem that if r.e. sets are analogous
to countable sets then non-r.e. sets are similarly analogous to uncountable
sets. But this is of course wrong. What we are looking for is the notion
of set which not only fails to be r.e., but whose failure to be r.e. is actually
“demonstrated in a recursive way”.
To find the right analogue, let us analyse how a set $A$ may be proved to be
uncountable. Perhaps the most direct way of proving this is to show that
for every countable set $B$ such that $B\subseteq A$ there is a member of $A$ not belonging
to $B$. The required analogue is obtained from the above statement by
replacing “every countable set” by “every given r.e. set” (where “given”
means, as usual, given via an index) and by insisting that the required
member of $A-B$ be obtained by a recursive function from the given index
of the subset $B$. These considerations lead to the following definition of
productive set, which is the correct recursion-theoretic analogue of
uncountable set.

5.1. Definition. A set $A\subseteq N$ is productive if there is a unary recursive
function $f$ such that $f(e)\in A-\mathrm{dom}_1\{e\}$ for every number $e$ such that
$\mathrm{dom}_1\{e\}\subseteq A$. Such a function $f$ is called a productive function for $A$.
Note that by this definition $f(e)$ has to be defined when $\mathrm{dom}_1\{e\}\subseteq A$,
but not necessarily otherwise.
A productive set is a fortiori not r.e., so the lowest class in the arithmetical
hierarchy in which we may expect to find a productive set is the class $\Pi^0_1$.

5.2. Example. Let

$A=\{z: z\notin\mathrm{dom}_1\{z\}\}$.

Now, since $(z\in A^c)=(z\in\mathrm{dom}_1\{z\})$, it follows from 6.11.6 that $A^c$ is r.e.,
hence $A$ is $\Pi^0_1$. We show that $A$ is productive by proving that the identity
function $\lambda z(z)$ is productive for $A$. Indeed, we have

$(z\in A)=(z\notin\mathrm{dom}_1\{z\})$.

Thus the number $e$ belongs to exactly one of the two sets $A$ and $\mathrm{dom}_1\{e\}$.
In particular, if $\mathrm{dom}_1\{e\}\subseteq A$ then $e$ must belong to $A$ but not to $\mathrm{dom}_1\{e\}$.
(Cf. Ex. 6.11.8.)

5.3. Problem. Let $A$, $B$ be sets of numbers such that $B$ is r.e. and $A\cap B$
is productive. Prove that $A$ is productive. (Obtain a nice sequence $\varphi$ for
which the identity $\mathrm{dom}_1\{\varphi(z)\}=\mathrm{dom}_1\{z\}\cap B$ holds.)

5.4. Problem. Let $\mathrm{ran}\{e\}$ be the range of the unary function $\lambda x[\{e\}(x)]$.
(i) Find nice sequences $\varphi$ and $\psi$ such that for every $z$

$\mathrm{dom}_1\{z\}=\mathrm{ran}\{\varphi(z)\}$, $\mathrm{ran}\{z\}=\mathrm{dom}_1\{\psi(z)\}$.

(Use the $S^m_n$ thm. to get $\varphi$ for which the identity $\{\varphi(z)\}(x)=0\cdot\{z\}(x)+x$
holds. To obtain $\psi$ use 6.11.11.)
(ii) Show that a set $A$ of numbers is productive iff there is a unary recursive
function $g$ such that $g(e)\in A-\mathrm{ran}\{e\}$ whenever $\mathrm{ran}\{e\}\subseteq A$. (Use $\varphi$ and
$\psi$ of (i). Show that if $g$ satisfies the condition then $g\varphi$ is a productive
function for $A$; and if $f$ is a productive function for $A$ then $f\psi$ satisfies the
condition.)
(iii) Show that the set $\{z: z\notin\mathrm{ran}\{z\}\}$ is $\Pi^0_1$ and productive.

5.5. Theorem. Every productive set has a productive sequence (i.e., a total
productive function).

Proof. Let $f$ be a productive function for the set $A$. Let $i$ be an index
of $f$. Put

$R(x,z)=\exists y\,T(i,z,y)\wedge(x\in\mathrm{dom}_1\{z\})$.

Then $R$ is clearly r.e., and by applying to it Thm. 6.11.11 we obtain a nice
sequence $\varphi$ such that

$(x\in\mathrm{dom}_1\{\varphi(z)\})=\exists y\,T(i,z,y)\wedge(x\in\mathrm{dom}_1\{z\})$.

This can be re-written as follows:

(1)   $\mathrm{dom}_1\{\varphi(z)\}=\begin{cases}\mathrm{dom}_1\{z\} & \text{if } f(z)\neq\infty,\\ \emptyset & \text{if } f(z)=\infty.\end{cases}$

(This is a solution of Prob. 6.11.12.) Now let $j$ be an index of the function
$f\varphi$ (i.e., $\lambda z[f(\varphi(z))]$). Put

(2)   $g(z)=U(\mu y\,[T(i,z,y)\vee T(j,z,y)])$.

Clearly, $g$ is recursive. Also, by the choice of $i$ and $j$ it is clear that, if at
least one of the values $f(z)$ and $f\varphi(z)$ is defined, then $g(z)$ is defined and
equals one of these two values.
As a matter of fact, for each $z$ at least one of the values $f(z)$, $f\varphi(z)$ must
be defined. For, if $f(z)=\infty$ then by (1) we have $\mathrm{dom}_1\{\varphi(z)\}=\emptyset\subseteq A$; and,
since $f$ is a productive function for $A$, $f\varphi(z)$ must be defined (and, incidentally,
must belong to $A$).
Thus $g$ is a total recursive function. We shall show that it is productive
for $A$.
Suppose $\mathrm{dom}_1\{z\}\subseteq A$. Then, since $f$ is productive for $A$, we have

$f(z)\in A-\mathrm{dom}_1\{z\}$.

In particular $f(z)$ must be defined, so that by (1) we have
$\mathrm{dom}_1\{\varphi(z)\}=\mathrm{dom}_1\{z\}\subseteq A$. Hence, using again the fact that $f$ is productive for $A$,

$f\varphi(z)\in A-\mathrm{dom}_1\{\varphi(z)\}=A-\mathrm{dom}_1\{z\}$.

But we have seen that $g(z)$ must equal $f(z)$ or $f\varphi(z)$, so in either case we
have $g(z)\in A-\mathrm{dom}_1\{z\}$, as required. ■
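
The function $g$ of (2) takes the output of whichever of two computations is found to terminate, which is what makes it total when at least one of them does. The sketch below imitates this with Python generators run in alternation; it is only an analogue under stated assumptions, since (2) searches for the least computation code rather than interleaving steps, and the helpers `halts_after` and `diverges` are hypothetical stand-ins for the computations of $f(z)$ and $f\varphi(z)$.

```python
def first_to_halt(comp1, comp2):
    """Run two step-wise computations alternately and return the output of
    whichever halts first. Loops forever if neither halts."""
    running = [comp1, comp2]
    while True:
        for comp in running:
            try:
                next(comp)                 # perform one more step
            except StopIteration as stop:
                return stop.value          # the computation's output


def halts_after(steps, value):
    """Hypothetical computation that halts after `steps` steps with `value`."""
    for _ in range(steps):
        yield
    return value


def diverges():
    """Hypothetical computation that never halts."""
    while True:
        yield
```

For example, `first_to_halt(halts_after(3, "f"), diverges())` returns `"f"` even though the second computation never stops, just as $g(z)$ is defined whenever one of $f(z)$, $f\varphi(z)$ is.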

The above result can be strengthened still further:

5.6. Theorem. Every productive set has a monotone increasing productive
sequence.

Proof. Let $A$ be a productive set. By Thm. 5.5 we may assume that $A$
has a productive sequence, say $\varphi$.
By Thm. 6.11.11 we get a nice sequence $\tau$ such that

$(x\in\mathrm{dom}_1\{\tau(z)\})=(x\in\mathrm{dom}_1\{z\})\vee(x=\varphi(z))$.

This can be re-written thus:

(1)   $\mathrm{dom}_1\{\tau(z)\}=\mathrm{dom}_1\{z\}\cup\{\varphi(z)\}$.

Now put

(2)   $\tau^0(z)=z$, $\tau^{u+1}(z)=\tau\tau^u(z)$.

Clearly, the binary function $\lambda zu\,[\tau^u(z)]$ defined by (2) is p.r. and certainly
recursive. Putting $\tau^u(z)$ for $z$ in (1) we get

(3)   $\mathrm{dom}_1\{\tau^{u+1}(z)\}=\mathrm{dom}_1\{\tau^u(z)\}\cup\{\varphi\tau^u(z)\}$.

Now suppose that $z$ is such that $\mathrm{dom}_1\{z\}\subseteq A$. Then for all $u$ we have

(4)   $\mathrm{dom}_1\{\tau^u(z)\}\subseteq A$,

(5)   $\varphi\tau^u(z)\in A-\mathrm{dom}_1\{\tau^u(z)\}$.

These two facts are proved simultaneously by induction on $u$. For $u=0$
we have (4) by our assumption that $\mathrm{dom}_1\{z\}\subseteq A$; and (5) holds because
$\varphi$ is productive for $A$. Next, if we assume (4) and (5) for a given $u$, then
by (3) we get (4) for $u+1$ as well; and since $\varphi$ is productive for $A$ we also
have (5) for $u+1$.
Let us outline how we are going to obtain a monotone productive
sequence $\psi$ for $A$. The value $\psi(z)$ will be defined by induction on $z$. If $z$
is such that $\mathrm{dom}_1\{z\}\not\subseteq A$, then all we need to ensure is that $\psi(z)$ is greater
than the previous values of $\psi$. This is easily done. However, if $\mathrm{dom}_1\{z\}\subseteq A$
we must also ensure that $\psi(z)\in A-\mathrm{dom}_1\{z\}$. Now, if $\mathrm{dom}_1\{z\}\subseteq A$ we know
that (5) holds. From this together with (3) it follows at once that the values
$\varphi\tau^u(z)$ for $u=0,1,2,\ldots$ are all different and all belong to $A-\mathrm{dom}_1\{z\}$. Among
these infinitely many different values there must be some that are greater
than $\psi(z-1)$ and we shall choose such a value as $\psi(z)$.
Now let us fill in the details. Put

$f(z,w)=\mu y\,[\exists u<y\,[\varphi\tau^u(z)=\varphi\tau^y(z)]\vee(\varphi\tau^y(z)>w)]$.

Clearly, $f$ is recursive. We show that it is total. Let us fix $z$ and $w$ and
consider two cases.
Case 1: The values $\varphi\tau^u(z)$ for $u=0,1,2,\ldots$ are all different. Then for
every $y$ the first disjunct in the definition of $f$ is false. But for some $y$ the
second disjunct must hold; so $f(z,w)$ is the least such $y$. In particular,
in this case we have $\varphi\tau^{f(z,w)}(z)>w$.
Case 2: The values $\varphi\tau^u(z)$ for $u=0,1,2,\ldots$ are not all different. Then
the first disjunct in the definition of $f$ must hold for some $y$ (while the
second disjunct may or may not hold for some $y$). Thus $f(z,w)$ is defined
in this case as well.
We now define

$\psi(0)=\varphi(0)$,

$\psi(z+1)=\max(\psi(z)+1,\ \varphi\tau^{f(z+1,\psi(z))}(z+1))$.

(For the definition of max see Ex. 6.6.5.) From this definition it is immediate
that $\psi$ is recursive and monotone. It remains to show that $\psi$ is productive
for $A$. Since $\psi(0)=\varphi(0)$, we only need to consider the behaviour of $\psi$
for positive values of its argument.
Suppose therefore that $\mathrm{dom}_1\{z+1\}\subseteq A$. Then, as we have shown above,
the values $\varphi\tau^u(z+1)$ for $u=0,1,2,\ldots$ are all different and they all belong
to $A-\mathrm{dom}_1\{z+1\}$. Therefore in the definition of $f(z+1,\psi(z))$ we have
Case 1, and hence

$\varphi\tau^{f(z+1,\psi(z))}(z+1)>\psi(z)$.

It now follows from the definition of $\psi(z+1)$ that

$\psi(z+1)=\varphi\tau^{f(z+1,\psi(z))}(z+1)$.

Thus $\psi(z+1)$ has the form $\varphi\tau^u(z+1)$ and must therefore be in
$A-\mathrm{dom}_1\{z+1\}$, as required. ■
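
The definitions of $f$ and $\psi$ can be run directly once $\varphi$ and $\tau$ are replaced by ordinary total functions. The sketch below does this; it is a toy under loud assumptions, since `phi` and `tau` here are arbitrary Python functions with no connection to indexes or to $\mathrm{dom}_1$, so it illustrates only that $f$ is total and that $\psi$ is strictly increasing, not productivity.

```python
def make_psi(phi, tau):
    """Build psi from total functions phi and tau, following the proof's
    definitions of f(z, w) and psi(z + 1)."""

    def tau_iter(u, z):
        # tau^u(z): apply tau to z exactly u times
        for _ in range(u):
            z = tau(z)
        return z

    def f(z, w):
        # least y such that phi(tau^y(z)) repeats an earlier value or exceeds w
        seen = []
        y = 0
        while True:
            v = phi(tau_iter(y, z))
            if v in seen or v > w:
                return y
            seen.append(v)
            y += 1

    def psi(z):
        value = phi(0)                                  # psi(0) = phi(0)
        for t in range(1, z + 1):
            candidate = phi(tau_iter(f(t, value), t))
            value = max(value + 1, candidate)           # psi(t) from psi(t - 1)
        return value

    return psi


# Hypothetical inputs: one where the values never repeat (Case 1) and one
# where they are periodic (Case 2).
psi_case1 = make_psi(lambda z: 2 * z, lambda z: z + 1)
psi_case2 = make_psi(lambda z: z % 7, lambda z: z + 1)
```

In the second example the values $\varphi\tau^u(z)$ repeat, so the search in `f` stops on the first disjunct and `psi` falls back on `value + 1`; monotonicity is preserved either way.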

5.7. Problem. Show that every productive set has an infinite recursive
subset. (Using 6.10.15, get a monotone recursive sequence $\varphi$ such that
$\mathrm{dom}_1\{\varphi(z)\}=\emptyset$ for all $z$. Let $\psi$ be a monotone productive sequence for
the set $A$. Show that $\psi\varphi$ enumerates an infinite recursive subset of $A$.)

5.8. Definition. A set of numbers is creative if it is r.e. and its complement
is productive.
A productive set is a set whose failure to be r.e. is “proved in a recursive
way”, namely by means of a productive function. Similarly, we may say that
a creative set is an r.e. set whose failure to be recursive is “proved in a
recursive way”, namely by means of a productive function for its complement.

5.9. Example. The set $C=\{z: z\in\mathrm{dom}_1\{z\}\}$ is creative by Ex. 5.2.

5.10. Problem. Prove that if $\langle A,\varphi,f\rangle$ is an encoded theory of the kind
discussed in Thm. 4.10 and Ex. 4.11, and $\langle A,\varphi,f\rangle$ is consistent and
axiomatizable, then $A$ is creative. (Show that if $A$ and $B$ are disjoint and
effectively inseparable r.e. sets, then each of them is creative.)

5.11. Definition. Let $A$ and $B$ be sets of numbers. A many-one reduction
of $A$ to $B$ is a recursive sequence $\varrho$ for which the identity

$(x\in A)=(\varrho(x)\in B)$

holds, i.e., $A=\varrho^{-1}[B]$. If such a $\varrho$ exists then $A$ is many-one reducible to $B$.
We shall usually omit the words “many-one” and say simply “reduction”
and “reducible”.
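
A concrete instance may help fix the definition. In the sketch below the sets, the map `rho` and the checking helper are all illustrative assumptions; the check runs over a finite range only, so it can refute a proposed reduction but cannot by itself establish the identity for all $x$.

```python
def in_A(x):
    """Hypothetical set A: the even numbers."""
    return x % 2 == 0


def in_B(x):
    """Hypothetical set B: the multiples of 4."""
    return x % 4 == 0


def rho(x):
    """Candidate reduction of A to B: x is even iff 2x is a multiple of 4."""
    return 2 * x


def is_reduction(rho, in_A, in_B, bound):
    """Check the identity (x in A) = (rho(x) in B) for all x below bound."""
    return all(in_A(x) == in_B(rho(x)) for x in range(bound))
```

Here `is_reduction(rho, in_A, in_B, 100)` holds, whereas the identity map fails the same check at `x = 2`, which is even but not a multiple of 4.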

We shall apply the notion of reducibility to the study of productive and
creative sets.

5.12. Theorem. If $A$ is productive and reducible to $B$, then $B$ is productive.

Proof. Let $\varrho$ be a reduction of $A$ to $B$. Using the $S^m_n$ thm. for r.e. sets
(6.11.11) we find a nice sequence $\varphi$ such that

$(x\in\mathrm{dom}_1\{\varphi(z)\})=(\varrho(x)\in\mathrm{dom}_1\{z\})$.

Thus for every $e$ we have

(1)   $\varrho^{-1}[\mathrm{dom}_1\{e\}]=\mathrm{dom}_1\{\varphi(e)\}$.

Also, since $\varrho$ is a reduction of $A$ to $B$ we have

(2)   $\varrho^{-1}[B]=A$.

Let $f$ be a productive function for $A$. We show that $\varrho f\varphi$ is a productive
function for $B$.
Let $\mathrm{dom}_1\{e\}\subseteq B$. Then from (1) and (2) it follows that $\mathrm{dom}_1\{\varphi(e)\}\subseteq A$.
Since $f$ is productive for $A$, the value $f\varphi(e)$ is defined and belongs to
$A-\mathrm{dom}_1\{\varphi(e)\}$. By (1) and (2) it now follows that $\varrho f\varphi(e)$ is in $B-\mathrm{dom}_1\{e\}$,
as required. ■

5.13. Corollary. If A is creative and reducible to an r.e. set B, then B is
creative.
Proof. A reduction of A to B is clearly also a reduction of Aᶜ to Bᶜ. ∎

5.14. Example. Let B = {z : dom₁{z} = N}. Thus B is the set of all indexes
of recursive sequences. (Cf. Prob. 6.10.10). We shall show that B is productive
by reducing to it the set A of Ex. 5.2. Let f be defined by

(1)    f(z,y) = 0 if ¬T(z,z,y),
       f(z,y) = ∞ otherwise.

RECURSION THEORY (CONTINUED) [CH. 8, §5
382

By Ex. 6.10.5 f is recursive, so we can use the S^m_n thm. to get a nice sequence
ϱ such that

(2)    {ϱ(z)}(y) = f(z,y).

We claim that ϱ is a reduction of A to B, i.e., that for all z

(3)    z ∉ dom₁{z} iff dom₁{ϱ(z)} = N.

Recall that (x ∈ dom₁{z}) = ∃yT(z,x,y). Thus for given z we have z ∉ dom₁{z}
iff ¬T(z,z,y) holds for all y; so that (3) follows at once from (1) and (2).
Note that (z ∈ B) = ∀x(x ∈ dom₁{z}), so that B belongs to class Π⁰₂ in
the arithmetical hierarchy. Now, level 2 in this hierarchy is the lowest
in which we can hope to find a productive set whose complement is also
productive. (Sets of level 1 are r.e. sets and their complements.) We now
show that our B is in fact such a set: Bᶜ is productive as well.
By Thm. 6.11.13 (the RT for r.e. sets) we can find a nice sequence φ
such that

    (x ∈ dom₁{φ(z)}) = (φ(z) ∈ dom₁{z}).

Thus

    dom₁{φ(z)} = N if φ(z) ∈ dom₁{z},
    dom₁{φ(z)} = ∅ otherwise.

We therefore have the identity

    (φ(z) ∈ B) = (φ(z) ∈ dom₁{z}),

from which it follows at once that φ is a productive sequence for Bᶜ.

5.15. Problem. Prove that the set {z : dom₁{z} ≠ ∅} is creative.

5.16. Problem. Prove that the set {z : ran{z} = N} is a Π⁰₂ productive set
whose complement is also productive. (Show that the φ of Prob. 5.4 is a
reduction to these sets of the sets considered in 5.14.)

The notion of reducibility arises naturally in connection with encoded
theories. If ⟨A,φ,f⟩ and ⟨B,ψ,g⟩ are encoded theories, then a reduction
of A to B is a uniform recursive method of reducing the decision problem
for the first theory to that for the second. (This is obviously closely connected
with the notion of reducibility defined in the discussion preceding Ex. 7.10.9.
In fact, the results of 7.10.9-7.10.15 can easily be rephrased in terms of
reducibility, in our present sense, of the corresponding encoded
theories.)

5.17. Theorem. Let ⟨A,φ,f⟩ be an encoded theory in which every r.e. set
of numbers is weakly representable. Then Aᶜ is productive. In particular,
if ⟨A,φ,f⟩ is also axiomatizable, then A is creative.
Proof. By Ex. 5.9 the set C = {z : z ∈ dom₁{z}} is creative. In particular,
since C is r.e., it is weakly representable in ⟨A,φ,f⟩. Therefore for some
n the identity

    (f(n,x) ∈ A) = (x ∈ C)

holds. Thus λx[f(n,x)] is a reduction of C to A, hence also of Cᶜ to Aᶜ.
By Thm. 5.12, Aᶜ must be productive. ∎

5.18. Problem. Let ⟨A,φ,f⟩ be an encoded theory in which every Π⁰₁ set
of numbers is weakly representable. Show that A is productive.

The following two problems assume familiarity with Ch. 7.

5.19. Problem. Show that #[Ω] and (#[Ω])ᶜ are productive.


5.20. Problem. Let Σ be a theory in the sense of Ch. 7.
(i) Prove that if Σ is sound and includes “baby arithmetic” Π₀, then
(#[Σ])ᶜ is productive. (Use Thm. 7.6.8 and Thm. 5.17.)
(ii) Prove that if Σ is sound then (#[Σ])ᶜ is productive; hence if Σ is
sound and axiomatizable then #[Σ] is creative. (Let A be the set of all
sentences deducible from Σ ∪ Π₂. As in the proof of Thm. 7.10.5, obtain
a reduction of #[A] to #[Σ].)

5.21. Remark. Let ℒ be any first-order language and let Σ be an ℒ-theory
(i.e., a deductively closed set of ℒ-sentences). Call Σ creative if the set of
all code numbers of sentences of Σ (under any of the standard methods
of encoding ℒ-expressions by numbers) is creative. The results of this
section yield a whole spectrum of creative theories, widely different in
their (apparent) mathematical intricacy. At one end of the scale, we see
from Prob. 5.20 that the set of all logically true sentences in the first-order
language of arithmetic is creative. Using the reductions of 7.10.9-7.10.15
we see that the same holds for the set of logically true sentences in various
other first-order languages. At the other end of the scale, we see from
Prob. 5.10 that if Σ is any of the usual formalizations of axiomatic set
theory then Σ is creative provided it is consistent. In between these two
extremes we have, e.g., first-order Peano arithmetic which is creative by
Prob. 5.10 as well as by Prob. 5.20.


§ 6. One-one reducibility; recursive isomorphisms

We say that A is one-one reducible to B if there is a one-one reduction
of A to B, i.e., a recursive sequence ϱ which does not take any value more
than once and such that A = ϱ⁻¹[B].

6.1. Theorem. Any r.e. set of numbers is one-one reducible to the complement
of any productive set, hence to any creative set.
Proof. Let A be an r.e. set of numbers and let Bᶜ be productive. By Thm. 5.6,
Bᶜ has a monotone productive sequence ψ. Thus for any z we have

(1)    ψ(z) ∈ Bᶜ − dom₁{z} if dom₁{z} ⊆ Bᶜ.

By 6.11.13 (the RT for r.e. sets) we get a nice (and hence recursive and
monotone) sequence φ such that

    (x ∈ dom₁{φ(z)}) = (x = ψφ(z)) ∧ (z ∈ A),

which can be re-stated in the form

(2)    dom₁{φ(z)} = {ψφ(z)} if z ∈ A,
       dom₁{φ(z)} = ∅ otherwise.

Now, ψφ is clearly recursive and monotone, and hence one-one. We shall
show that ψφ reduces A to B.
First, let e ∈ A. Then, by (2), dom₁{φ(e)} is the singleton {ψφ(e)}. Hence
ψφ(e) ∉ Bᶜ − dom₁{φ(e)}, and it follows from (1) that dom₁{φ(e)} ⊄ Bᶜ. This
means that ψφ(e) ∈ B.
Conversely, if e ∉ A, then, by (2), dom₁{φ(e)} = ∅ ⊆ Bᶜ and we see from (1)
that ψφ(e) ∈ Bᶜ, i.e. ψφ(e) ∉ B. ∎

It follows from Thm. 6.1 that any axiomatic formal theory is reducible
(in the sense of §10 of Ch. 7) to any creative theory: e.g., to any of the
theories mentioned in Remark 5.21.

Two sets of numbers are many-one equivalent if each of them is many-one
reducible to the other. The notion of one-one equivalence is defined in a
similar way.
It is easy to verify that both many-one equivalence and one-one equivalence
are indeed equivalence relations between sets of numbers. The
following theorem shows that the creative sets constitute an equivalence
class under both relations.

6.2. Theorem. Let A be creative, and let B be any set of numbers. Then the
following three conditions are equivalent:
(1) B is creative;
(2) A and B are one-one equivalent;
(3) A and B are many-one equivalent.
Proof. (1) ⇒ (2) by Thm. 6.1, and (2) ⇒ (3) trivially.
Now assume (3). Then B is reducible to the creative (and hence r.e.)
set A. From this it follows at once that B is r.e. as well. Since A is reducible
to B, Cor. 5.13 shows that B is creative. ∎

A recursive permutation is a recursive sequence which takes every value
exactly once¹.
Sets of numbers A and B are said to be recursively isomorphic if there is
a recursive permutation φ which maps A onto B. (It is easy to see that in
this case, if ψ is defined by ψ(x) = μy(φ(y) = x), then ψ is a recursive
permutation mapping B onto A.)
The following remarkable result, due to Myhill, is the recursion-theoretic
analogue of the Schröder-Bernstein theorem.
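The remark that the inverse of a recursive permutation is obtained by unbounded search can be sketched in Python as follows. The particular permutation (swapping each even number with its successor) and the names `phi`, `inverse` and `psi` are illustrative assumptions, not part of the text; the search terminates only because the given sequence takes every value.

```python
from itertools import count

def phi(x):
    """Illustrative recursive permutation: swap each even number with its successor."""
    return x + 1 if x % 2 == 0 else x - 1

def inverse(perm):
    """Return psi with psi(x) = least y such that perm(y) == x (unbounded search)."""
    def psi(x):
        return next(y for y in count() if perm(y) == x)
    return psi

psi = inverse(phi)
```

If `perm` failed to take some value x, the call `psi(x)` would never return, which mirrors the fact that the search μy(φ(y) = x) is total exactly when φ is onto.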

6.3. Theorem. Two sets of numbers are recursively isomorphic iff they are
one-one equivalent.
Proof. If A and B are recursively isomorphic, then they are obviously
one-one equivalent.
To prove the converse, suppose A and B are one-one equivalent, and
let φ and ψ be one-one reductions of A to B and B to A respectively.
We seek unary recursive functions f, g and binary recursive functions
h, k satisfying the following identities:

(1)    f(x) = μu[¬∃y<x{f(y) = u}] if x is even,
       f(x) = h(x, μu[¬∃y<x{f(y) = h(x,u)}]) if x is odd;

(2)    g(x) = μu[¬∃y<x{g(y) = u}] if x is odd,
       g(x) = k(x, μu[¬∃y<x{g(y) = k(x,u)}]) if x is even;

(3)    h(x,0) = ψg(x), h(x,u+1) = ψg(μy[f(y) = h(x,u)]);

(4)    k(x,0) = φf(x), k(x,u+1) = φf(μy[g(y) = k(x,u)]).

Such functions can be obtained, using Prob. 6.10.26, as follows. In the
above identities, replace “f”, “g”, “h” and “k” by {e₁}, {e₂}, {e₃}
and {e₄} respectively. Then Prob. 6.10.26 yields numbers e₁, e₂, e₃ and

1 It is easy to verify that the recursive permutations constitute a group under the operation
of composition. It can be argued that the recursion-theoretically significant notions
are precisely those that are invariant under this group.
e₄ for which the resulting identities all hold. We now only have to define

    f(x) = {e₁}(x), g(x) = {e₂}(x),
    h(x,u) = {e₃}(x,u), k(x,u) = {e₄}(x,u).

We shall now prove that, for all y and y′,

(5)    f(y) ≠ ∞, and if y ≠ y′ then f(y) ≠ f(y′);

(6)    g(y) ≠ ∞, and if y ≠ y′ then g(y) ≠ g(y′);

(7)    f(y) ∈ A iff g(y) ∈ B.

We proceed by induction. Assume that (5), (6) and (7) hold for all y and y′
which are smaller than some number x; we shall prove that (5), (6) and (7)
hold for all y, y′ ≤ x.
Suppose first that x is even. Then by (1), f(x) is defined as the least number
not belonging to the set {f(y) : y<x}. Therefore (5) holds for all y, y′ ≤ x.
Next, we prove that for all w the following statement holds:

(8)    If for all v<w the values k(x,v) are distinct and belong to the
       set {g(y) : y<x}, then k(x,w) is defined and is different from
       all these k(x,v) with v<w.

For w=0, (8) merely claims that k(x,0) is defined. But by (4) we have
k(x,0) = φf(x) and we have already shown that f(x) is defined.
Now let w = u+1. Suppose that all the values k(x,v) for v≤u are distinct
and belong to the set {g(y) : y<x}. Since (6) is assumed to hold for all
y, y′<x, it follows that for each v≤u there is a unique yᵥ<x such that
g(yᵥ) = k(x,v). Also, these yᵥ are all distinct, since the k(x,v) are distinct.
By (4) we now have

(9)    k(x,v+1) = φf(yᵥ) for all v≤u.

Since (5) holds for all y, y′<x (and even, as we have shown, for all y, y′≤x)
it follows that f(yᵤ) is defined. Hence k(x,u+1) is defined by (9). It remains
to show that k(x,u+1) ≠ k(x,v) for all v≤u.
Suppose that k(x,u+1) = k(x,0). Then by (9) and (4) this would mean
that φf(yᵤ) = φf(x). Since φ is one-one, it would follow that f(yᵤ) = f(x).
But, since yᵤ<x, this contradicts the fact that (5) holds for all y, y′≤x.
Suppose next that k(x,u+1) = k(x,v+1) for some v<u. Then by (9)
this would mean φf(yᵤ) = φf(yᵥ). Since φ is one-one and yᵤ, yᵥ are distinct
numbers <x, we would again get a contradiction.
Thus we have shown that k(x,u+1) is defined and different from k(x,v)
for all v≤u, and the proof of (8) is complete.
Let us put

(10)    u(x) = μu[¬∃y<x{g(y) = k(x,u)}].

Then, for our particular x, it follows from (8) that u(x) ≠ ∞, since otherwise
the successive values k(x,u) for u = 0, 1, 2, ... would all have to be distinct
members of the finite set {g(y) : y<x}, which is clearly impossible.
Turning now to (2) and recalling that x was assumed to be even, we see
that g(x) = k(x,u(x)). Thus, by what we have just shown, g(x) is defined.
Moreover, from (10) it follows that g(x) ≠ g(y) for all y<x. Thus (6) holds
for all y, y′≤x.
Next, we want to show that f(x) ∈ A iff g(x) ∈ B. Since we know that
g(x) = k(x,u(x)), it will be enough to show that

(11)    f(x) ∈ A iff k(x,u) ∈ B

for all u≤u(x). We proceed by induction on u.
For u=0 we have (11) because k(x,0) = φf(x) by (4), and φ is a reduction
of A to B.
Now suppose that (11) has been established for some u<u(x). Since
u<u(x), it follows from (10) that

(12)    k(x,u) = g(yᵤ)

for some unique yᵤ<x. But (7) was assumed for all y<x. Hence by (11)
and (12) we have

(13)    f(x) ∈ A iff f(yᵤ) ∈ A.

Also, by (4) and (12) we have k(x,u+1) = φf(yᵤ); hence, since φ is a
reduction of A to B,

    f(yᵤ) ∈ A iff k(x,u+1) ∈ B.

From this and (13) we see that (11) holds for u+1 as well.
Thus we have established that f(x) ∈ A iff g(x) ∈ B and hence (7) holds
for all y≤x.
So far we have assumed x to be even. If x is odd, the treatment is exactly
the same except that f, k, φ and A exchange roles with g, h, ψ and B
respectively.
Thus (5), (6) and (7) have been established for all y and y′.
Thus (5), (6) and (7) have been established for all y and y'.
From (5) we see that f is a one-one sequence (it does not take a value
more than once). Also, from the first part of (1), which deals with the
case that x is even, we see that f takes each value at least once. Thus f is
a recursive permutation. Similarly, g is a recursive permutation.
We put

    ϱ(x) = g(μy(f(y) = x)),

i.e., ϱ = gf⁻¹. Then ϱ is clearly a recursive permutation, and it follows
at once from (7) that ϱ maps A onto B. ∎
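The back-and-forth construction in the proof of Thm. 6.3 can be tried out on a concrete pair of sets. The following Python sketch is an illustration under stated assumptions only: the sets (A the even numbers, B the multiples of 3), the two one-one reductions, and all function names are chosen here and do not come from the text. It builds the first values of f and g stage by stage, following the chains k(x,0), k(x,1), ... and h(x,0), h(x,1), ... until a value not yet used is found.

```python
from itertools import count

def least_missing(values):
    """Least natural number not occurring in the list `values`."""
    return next(n for n in count() if n not in values)

def myhill(phi, psi, stages):
    """First `stages` values of f and g, given one-one reductions phi (A to B) and psi (B to A)."""
    f, g = [], []
    for x in range(stages):
        if x % 2 == 0:
            a = least_missing(f)           # even stage: f(x) is the least unused value
            b = phi(a)                     # k(x, 0)
            while b in g:                  # follow the chain until a value unused by g appears
                b = phi(f[g.index(b)])     # k(x, u+1) = phi f(y_u), where g(y_u) = k(x, u)
        else:
            b = least_missing(g)           # odd stage: g(x) is the least unused value
            a = psi(b)                     # h(x, 0)
            while a in f:
                a = psi(g[f.index(a)])     # h(x, u+1) = psi g(y), where f(y) = h(x, u)
        f.append(a)
        g.append(b)
    return f, g

# Illustrative data: A = even numbers, B = multiples of 3.
def phi(x):
    """One-one, and x is even iff phi(x) is a multiple of 3."""
    return 3 * x if x % 2 == 0 else 3 * x + 1

def psi(x):
    """One-one, and x is a multiple of 3 iff psi(x) is even."""
    return 2 * x if x % 3 == 0 else 2 * x + 1

f, g = myhill(phi, psi, 40)

def rho(x):
    """rho = g f^{-1}, available here only for x already enumerated by f."""
    return g[f.index(x)]
```

After 40 stages every number below 20 has been enumerated by both f and g, so `rho` is defined on 0, ..., 19 and sends even numbers to multiples of 3 and odd numbers to non-multiples. The list lookups make the sketch quadratic; it is meant to mirror the proof, not to be efficient.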
From Thms. 6.2 and 6.3 it follows that any two creative sets are recursively
isomorphic. Thus, if Σ₁ and Σ₂ are two creative theories, then
their respective sets of code numbers can be obtained from each other
by a recursive permutation. From a purely recursion-theoretic viewpoint
it would therefore seem that the differences in mathematical intricacy
between such theories (see Remark 5.21) are illusory. However, it is also
correct to say that there is much more to a formal theory than meets the
recursion-theorist’s eye.

§ 7. Turing degrees

Let φ be any sequence, not necessarily recursive. We recall (see §5 of Ch. 6)
that an n-ary function f is φ-recursive if the identity f(x) = F(φ;x) holds
for some recursive functional F.
Using Prob. 6.5.4 (see also Prob. 6.8.7) it is easy to see that if ψ is
φ-recursive and χ is ψ-recursive then χ is φ-recursive.
Let us say that sequences φ and ψ are recursively equivalent if φ is
ψ-recursive and ψ is φ-recursive. It is easily seen that recursive equivalence
is indeed an equivalence relation. The classes into which the collection
Nᴺ of all sequences is partitioned by this relation are called degrees of
unsolvability, or Turing degrees, or, briefly, degrees. We define deg φ to be
the degree to which the sequence φ belongs.
A partial ordering ≤ of degrees is defined as follows: deg ψ ≤ deg φ
if ψ is φ-recursive. (This definition is legitimate, since, if ψ′ and φ′ are
recursively equivalent to ψ and φ respectively, and ψ is φ-recursive then
ψ′ is φ′-recursive.)
If neither deg ψ ≤ deg φ nor deg φ ≤ deg ψ then deg φ and deg ψ are
said to be incomparable.
The recursive sequences constitute a single degree, denoted by 0. For
every φ we clearly have 0 ≤ deg φ.

7.1. Problem. Show that a function f is φ-recursive iff f ∈ ℛ{φ}. (See
Def. 1.10.) Hence show that deg ψ ≤ deg φ iff ℛ{ψ} ⊆ ℛ{φ}.
7.2. Problem. Show that for each φ there are at most denumerably many
degrees ≤ deg φ, but the collection of all degrees has the cardinality of the
continuum.
7.3. Problem. Prove that any two degrees have a supremum.

The study of degrees constitutes an important branch of recursion theory.
Unfortunately, it is beyond the scope of this book; the reader interested
in this topic should consult the references in §9 below. Here and in the
next section we shall barely skim the surface: we shall prove two results
in order to illustrate some of the methods used in the study of degrees.

For any finite sequence of numbers ⟨a₀,...,aₖ₋₁⟩ we define B(a₀,...,aₖ₋₁)
to be the set

    {φ ∈ Nᴺ : φ(i) = aᵢ for all i<k}.

In particular, if k=0 (so that the finite sequence is in fact empty) this set
is the whole of Nᴺ.
With this notation we prove:

7.4. Lemma. For any non-recursive sequence ψ, finite sequence of numbers
⟨a₀,...,aₖ₋₁⟩ and number e, there exist a number m ≥ k and numbers aₖ,...,aₘ
such that for every φ ∈ B(a₀,...,aₖ₋₁,aₖ,...,aₘ) we have

    φ ≠ λx[{e}(ψ;x)], ψ ≠ λx[{e}(φ;x)].

Proof. We let aₖ = 0, unless {e}(ψ;k) = 0 in which case we put aₖ = 1. This
choice of aₖ ensures that, no matter how aₖ₊₁,...,aₘ will be chosen, we
shall have, for every φ ∈ B(a₀,...,aₘ), the inequality φ(k) ≠ {e}(ψ;k), and
hence φ ≠ λx[{e}(ψ;x)]. It therefore remains to choose m and aₖ₊₁,...,aₘ
such that ψ ≠ λx[{e}(φ;x)] for every φ ∈ B(a₀,...,aₘ).
We define a set A ⊆ N² as follows. For any given x and z, (x,z) ∈ A holds
iff for some φ

(1)    φ ∈ B(a₀,...,aₖ), {e}(φ;x) = z.

Now, suppose that x and z are such that (x,z) ∈ A, and let φ be a sequence
satisfying (1). Recall that by the NFT (6.10.1) we have

    {e}(φ;x) = U(μyT*(φ̄(y), e, x, y)),

where

    φ̄(y) = ∏ exp(pₜ, φ(t)), the product being taken for all t≤y.

By (1) there is therefore a (unique) y such that

    T*(φ̄(y), e, x, y) ∧ (U(y) = z).

Now let

    w = ∏ exp(pₜ, φ(t)),

where the product is taken for all t ≤ max(k,y). We clearly have

(2)    ((w)₀ = a₀) ∧ ... ∧ ((w)ₖ = aₖ),

(3)    T*(∏ exp(pₜ, (w)ₜ), e, x, y) ∧ (U(y) = z),

where the product in (3) is taken for all t≤y.
Conversely, if y and w are such that (2) and (3) hold, put φ = λt[(w)ₜ] and
it is easy to see that φ satisfies (1). Thus we have shown that (x,z) ∈ A
iff there are y and w such that (2) and (3) hold. It follows that A is an
r.e. set.
Since ψ is non-recursive, the graph of ψ must be different from A by
Thm. 6.11.3. Therefore at least one of the following must be the case:
(i) For some x, (x,ψ(x)) ∉ A.
(ii) For some x and z, (x,z) ∈ A but ψ(x) ≠ z.
Suppose first that (i) holds, and let x be a number such that (x,ψ(x)) ∉ A.
Then by the definition of A we have {e}(φ;x) ≠ ψ(x) for every φ ∈ B(a₀,...,aₖ).
Hence in this case we can put m=k, and the proof is complete.
Now assume (ii) and let x and z be numbers such that (x,z) ∈ A but
ψ(x) ≠ z. Then, again by the definition of A, there is some φ ∈ B(a₀,...,aₖ)
such that for some y we have

    T*(φ̄(y), e, x, y) ∧ (U(y) = z).

We put m = max(k,y) and aᵢ = φ(i) for i = k+1,...,m. Then if φ′ is any
sequence in B(a₀,...,aₘ), we have φ̄′(y) = φ̄(y) and hence
{e}(φ′;x) = z ≠ ψ(x). ∎

7.5. Theorem. If ψ is a non-recursive sequence, then there exists an
uncountable set Ψ of sequences such that ψ ∈ Ψ and

(*)    if ψ₁, ψ₂ are distinct members of Ψ, then deg ψ₁ and deg ψ₂ are
       incomparable.

Proof. Consider the family of all Ψ ⊆ Nᴺ satisfying condition (*). This
family is partially ordered by inclusion, and it is easy to verify that the
hypothesis of Zorn’s Lemma is satisfied. Also, the singleton {ψ} belongs
to the family, since in this case (*) holds trivially. Therefore by Zorn’s
Lemma there exists a maximal set Ψ satisfying (*) and such that {ψ} ⊆ Ψ,
i.e., ψ ∈ Ψ. We shall show that Ψ is uncountable.
Suppose we had Ψ = {ψⱼ : j ∈ N}. We shall define an infinite sequence
{aᵢ : i ∈ N} by stages. Suppose that up to the nth stage a₀,...,aₖ₋₁ have
already been defined. Then at the nth stage we define aₖ,...,aₘ for some
m ≥ k, as follows. Put K(n) = e and L(n) = j. Thus n = J(e,j). Now, it is
clear that ψⱼ is non-recursive. (If ψⱼ = ψ then ψⱼ is non-recursive by
assumption. Otherwise deg ψⱼ and deg ψ are incomparable by (*) and
again ψⱼ cannot be recursive.) Using Lemma 7.4 we define aₖ,...,aₘ such
that for every φ ∈ B(a₀,...,aₘ) we have

    φ ≠ λx[{e}(ψⱼ;x)], ψⱼ ≠ λx[{e}(φ;x)].

In this way aᵢ is eventually defined for every i. We let φ(i) = aᵢ for
all i. Then deg φ is incomparable with deg ψⱼ for all j. For, if for some
j deg φ and deg ψⱼ were comparable, then for some e we should have
φ = λx[{e}(ψⱼ;x)] or ψⱼ = λx[{e}(φ;x)], but this was prevented at the J(e,j)th
stage in the definition of the aᵢ.
Since Ψ is a proper subset of Ψ ∪ {φ}, this contradicts the maximality
of Ψ. ∎

Our proof that there exists an uncountable set of pairwise incomparable
degrees used the axiom of choice (in the form of Zorn’s Lemma). However,
the same result, and even the existence of a set of power 2^ℵ₀ of pairwise
incomparable degrees, can be proved without using the axiom of choice.
(For references see §9.)

In the following problem we outline a topological version of the proof
of Thm. 7.5.

7.6. Problem. Define the distance d(φ,ψ) between two different sequences
φ and ψ to be (μx[φ(x) ≠ ψ(x)] + 1)⁻¹, and d(φ,φ) = 0.
(i) Show that with this distance function, Nᴺ becomes a complete
metric space.
(ii) Show that, if ψ is a non-recursive sequence, the set of all sequences
φ such that deg ψ is comparable with deg φ is of the first category, i.e.,
a countable union of nowhere dense sets. (Use Lemma 7.4.)
(iii) Use the Baire category theorem to prove Thm. 7.5. (Obtain a
maximal Ψ by Zorn’s Lemma, as in the original proof. Use (ii) to show
that, if Ψ were countable, then the whole space Nᴺ would be of the first
category, contrary to Baire’s theorem.)
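The distance function of Prob. 7.6 can be explored numerically. The sketch below is only an approximation under an explicit assumption: sequences are represented as Python functions and the search for the first point of difference is cut off at a hypothetical parameter `bound`, so two sequences that first differ at or beyond `bound` are reported as having distance 0. The names `distance` and `bound` are introduced here for illustration.

```python
from fractions import Fraction

def distance(phi, psi, bound):
    """d(phi, psi) = 1 / (mu x [phi(x) != psi(x)] + 1), searching only x < bound.

    Returns 0 if no difference is found below `bound`."""
    for x in range(bound):
        if phi(x) != psi(x):
            return Fraction(1, x + 1)
    return Fraction(0)

zero = lambda x: 0                       # the constant sequence 0
late = lambda x: 1 if x == 5 else 0      # differs from zero first at x = 5
early = lambda x: 1 if x == 2 else 0     # differs from zero first at x = 2
```

With these sample sequences one can check the strong triangle inequality d(φ,χ) ≤ max(d(φ,ψ), d(ψ,χ)), which is the key to part (i); two sequences are within distance 1/(m+2) of each other precisely when they agree on all arguments up to m, which connects the metric to the sets B(a₀,...,aₘ).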

*§ 8. Post’s problem and its solution

If A is a set of numbers, the corresponding property λx(x ∈ A) is, by our
conventions, a sequence of 0’s and 1’s. We define the degree of A (briefly,
deg A) to be the degree of λx(x ∈ A).
If A is an r.e. set of numbers, then deg A is called an r.e. degree. Note
that if deg A is r.e. it does not necessarily follow that A is r.e.; for example,
the complement of an r.e. set (which may not itself be r.e.) always
has an r.e. degree, because deg A = deg Aᶜ for every A.
It is easy to see that, if A is many-one reducible to B, then deg A ≤ deg B.
Therefore by Thm. 6.1 all creative sets have the same degree, which is ≥
any r.e. degree. The degree of creative sets is denoted by 0′.
All the r.e. sets which we have met up to this point were either recursive
or creative. Thus from what we have learnt so far we do not yet know
whether there exist r.e. degrees other than 0 (the degree of all recursive sets)
and 0′. This problem was posed by Post in 1944 and remained open for
twelve years.
Post’s problem is of importance in connection with encoded theories.
It is reasonable to take deg A as a measure of the difficulty of the decision
problem for A. Thus Post’s problem is equivalent to the problem of whether
there exists an axiomatizable encoded theory whose decision problem,
while being recursively unsolvable, is less difficult than that of a creative
theory.
Post’s problem was solved independently, and almost simultaneously,
by the American Friedberg and the Russian Mučnik, both of whom were
young students at the time. They showed, in fact, how to construct two
r.e. sets whose degrees are incomparable. It is clear that these two degrees
must be different from both 0 and 0′. (The same method can be used to
construct an infinite set of pairwise incomparable r.e. degrees.)
We shall now present Friedberg’s construction.
We wish to construct two r.e. properties (i.e., sequences of 0’s and 1’s)
α and β such that deg α and deg β are incomparable.
For deg α and deg β to be incomparable, we must ensure that for every
e we have

    α ≠ λx[{e}(β;x)], β ≠ λx[{e}(α;x)].

We shall achieve this by defining, together with α and β, a sequence φ such
that for every number e we have

    α(φ(2e)) = 0 iff {e}(β;φ(2e)) = 1,

    β(φ(2e+1)) = 0 iff {e}(α;φ(2e+1)) = 1.

We shall define α, β and φ “by stages”. At the nth stage we define, by
induction on n, properties αₙ and βₙ and a sequence φₙ; these will serve
as our “nth approximations” to α, β and φ respectively.
For n=0 we put

    α₀(x) = 1, β₀(x) = 1, φ₀(z) = 2ᶻ.

Now suppose that αₙ, βₙ and φₙ have been defined. To define αₙ₊₁, βₙ₊₁
and φₙ₊₁, we distinguish two cases, each divided into two subcases.
Case 1: (n)₀ is even; say (n)₀ = 2e. We let βₙ₊₁ be the same as βₙ. To
define αₙ₊₁ and φₙ₊₁, we search for a number b<n satisfying the condition

(*)    T(βₙ; e, φₙ(2e), b) ∧ (U(b) = 1) ∧ ¬αₙ(φₙ(2e)).

Recall that by the NFT (6.10.1) there is at most one such b (it is the
computation code for (βₙ; e, φₙ(2e))).
Subcase 1′. If there is some (unique) b<n satisfying (*), we put

    αₙ₊₁(x) = αₙ(x) ∨ (x = φₙ(2e)),

    φₙ₊₁(z) = 3ᵇ · φₙ(z) if z is odd and > 2e,
    φₙ₊₁(z) = φₙ(z) otherwise.

Subcase 1″. If there is no b<n satisfying (*), we let αₙ₊₁ and φₙ₊₁
be the same as αₙ and φₙ respectively.
Case 2: (n)₀ is odd; say (n)₀ = 2e+1. We let αₙ₊₁ be the same as αₙ.
Also, we consider the condition

(**)    T(αₙ; e, φₙ(2e+1), b) ∧ (U(b) = 1) ∧ ¬βₙ(φₙ(2e+1)).

Subcase 2′. If there is some b<n satisfying (**), in which case this
b is unique, we put

    βₙ₊₁(x) = βₙ(x) ∨ (x = φₙ(2e+1)),

    φₙ₊₁(z) = 3ᵇ · φₙ(z) if z is even and > 2e+1,
    φₙ₊₁(z) = φₙ(z) otherwise.

Subcase 2″. If there is no such b, we put βₙ₊₁ = βₙ, φₙ₊₁ = φₙ.


It is easy to see (e.g., using Prob. 6.10.26) that λnx[αₙ(x)] and λnx[βₙ(x)]
are recursive binary relations and λnz[φₙ(z)] is a recursive binary function.
(In fact, they are p.r.: but we shall not need this.)
We put

    α(x) = ∃nαₙ(x), β(x) = ∃nβₙ(x).

Clearly, α and β are r.e. properties.


From the definition of the αₙ it is clear that if a number x has the property
αₙ (i.e., if αₙ(x) vanishes) then x has the property αₙ₊₁ as well. Among the
numbers having the property αₙ₊₁ there is at most one “new” number,
which does not have the property αₙ. In fact, such a new number exists
iff n falls under subcase 1′, in which case the new number is φₙ(2e). Since
there is no number with the property α₀, it follows by induction that for
each n there are only finitely many numbers with the property αₙ.
Nevertheless, from what we have just said and from the definition of α it follows
at once that if a number has the property α then that number also has the
property αₙ for all sufficiently large n. Similar facts hold also for the
βₙ and β.
Let us now examine the definitions of αₙ, βₙ and φₙ more closely. We
shall first explain these definitions informally; later on we shall show formally
that they actually work: the degrees of α and β are incomparable.
Suppose that (n)₀ is even, say (n)₀ = 2e, so that we are in case 1. What
we try to do in this case is to ensure that α ≠ λx[{e}(β;x)] by allowing
α(φ(2e)) to hold (i.e., to be 0) iff {e}(β;φ(2e)) = 1. However, at this stage
we do not yet have β and φ(2e) but only their nth approximations βₙ and
φₙ(2e); so we use these instead. Thus we try to compute {e}(βₙ;φₙ(2e)),
and if we find that its value is 1, we allow αₙ₊₁(φₙ(2e)) to hold. Then
α(φₙ(2e)) will certainly hold. (If αₙ(φₙ(2e)) already holds, then αₙ₊₁(φₙ(2e))
will hold automatically, and we do not have to do anything. We have
to add φₙ(2e) as a new number only if αₙ(φₙ(2e)) does not hold. This is
why the clause ¬αₙ(φₙ(2e)) is included in (*).)
Of course, we must not make the definition of αₙ₊₁ dependent on an
indefinite search for a computation of the value {e}(βₙ;φₙ(2e)), because
this value may not be defined, in which case αₙ₊₁ would not be recursive.
Therefore we confine our search for a computation code b to numbers <n.
Since there are infinitely many numbers m>n with (m)₀ = (n)₀ = 2e, we
hope to resume our search at those later mth stages, so that if the computation
code b exists at all, it will eventually be discovered even if it is not
smaller than our present n.
However, here we have a snag: if m>n and (m)₀ = (n)₀ = 2e, then at the
future mth stage we shall be searching for a computation of {e}(βₘ;φₘ(2e))
rather than of {e}(βₙ;φₙ(2e)). How can we overcome this snag? Suppose
that {e}(βₙ;φₙ(2e)) is defined and that the corresponding computation
code is b. Then the computation of {e}(βₙ;φₙ(2e)) depends not on the
whole of βₙ but only on β̄ₙ(b), i.e., on the first b+1 values of βₙ. Now,
if n is sufficiently large, then any number ≤b that has the property β will
already have the property βₙ, so that β̄ₙ(b) = β̄(b). And if m>n then
β̄ₘ(b) = β̄(b) = β̄ₙ(b). Therefore the computation of {e}(βₙ;φₙ(2e)) will be
the same as that of {e}(βₘ;φₙ(2e)). If we can arrange matters so that for
sufficiently large n we have also φₘ(2e) = φₙ(2e) for all m>n, then we will
have overcome the snag; because for sufficiently large n the computation
of {e}(βₙ;φₙ(2e)) will be the same as the computation of {e}(βₘ;φₘ(2e))
for any m>n.
For any given n and z, let us say that φₙ(z) is pegged if φₘ(z) = φₙ(z)
for all m>n. The above analysis of what happens in case 1 and a similar
analysis of case 2 suggest that the φₙ should be defined in such a way that
for every z there is some n for which φₙ(z) is pegged. But why not take
φₙ(z) to be independent of n in the first place? To answer this question let
us take another look at the definitions of αₙ, βₙ and φₙ.
Suppose again that (n)₀ = 2e, so that we are in case 1. Suppose also that
condition (*) holds for some b<n. This means that we are in subcase 1′;
we then allow φₙ(2e) to have the property αₙ₊₁ and hence also the
property α. Thus, we would be sure that α ≠ λx[{e}(β;x)] if we can arrange that
{e}(β;φₙ(2e)) = 1. Now, what we do know from (*) is that {e}(βₙ;φₙ(2e)) = 1.
Also, since b is the corresponding computation code, we know that the
computation of {e}(βₙ;φₙ(2e)) depends only on the values βₙ(x) for x≤b.
Therefore all will be well if we can ensure that β(x) = βₙ(x) for all x≤b.
Now, by looking at case 2 we see that every number with the property β
but not βₙ must be of the form φₘ(z), where z is odd and m>n. Thus,
if z is odd, we would like to make sure that φₘ(z)>b for all m>n. Actually,
we take φₘ(z) to be non-decreasing in m, so we only need to ensure that
φₙ₊₁(z)>b for odd z. Of course, if for some odd z it happens that φₙ(z)≤b,
then φₙ(z) must not be pegged if we want to have φₙ₊₁(z)>b.
Thus we are faced with two conflicting requirements. On the one hand, for
each z we want φₙ(z) to be pegged for some n. On the other hand, whenever
n falls under subcase 1′ we would like to ensure that φₙ₊₁(z) is sufficiently
large for odd z. (And, symmetrically, whenever n falls under subcase 2′
we want to ensure that φₙ₊₁(z) is sufficiently large for even z.)
The conflict between these requirements is resolved as follows. In
subcase 1′ we do not insist that φₙ₊₁(z)>b for all odd z but only for those
odd z which are >2e. For the other values of z we let φₙ₊₁(z) be the same
as φₙ(z). We do a similar thing in subcase 2′. A simple argument (which
we shall present below) shows that this procedure ensures that for each
z there is some n such that φₙ(z) is pegged. Also, if n falls under subcase 1′
and is so large that φₙ(2e) is already pegged (which is the only case that
really matters) it turns out that βₙ(x) = β(x) for all x≤b, as required.
(A similar fact also holds in subcase 2′.)
There is another point concerning the definition of the φₙ which requires
an explanation. If (n)₀ = 2e and n is so large that φₙ(2e) is pegged, we want
φₙ(2e) to have the property α only if {e}(β;φₙ(2e)) = 1. Now, if we had
φₙ(2e) = φₘ(2d) for some m and some d≠e, then φₙ(2e) could get the
property α “by mistake”, i.e., even if {e}(β;φₙ(2e)) ≠ 1; because it may
happen that {d}(β;φₘ(2d)) = 1. We therefore want to ensure that
φₙ(z) ≠ φₘ(y) whenever z≠y. This is done by defining φₙ(z) in such a way that
we have (φₙ(z))₀ = z for all n and z. (In fact, φₙ(z) always has the form
2ᶻ · 3ᵘ for some u.)
Having explained the definitions of the αₙ and βₙ, and especially the
intricacies in the definition of the φₙ (on which the whole construction
hinges), let us now show formally that the construction actually works.
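The bookkeeping for the sequences φₙ can be isolated from the rest of the construction and tried out directly. The following Python sketch is an illustration under explicit assumptions: it does not search for computation codes at all, but takes the number b as a given parameter, and the names `exponent_of_2`, `phi0` and `bump` are introduced here and are not from the text. It shows only how subcases 1′ and 2′ move values of φₙ above b while keeping the exponent of 2 equal to the argument z.

```python
def exponent_of_2(m):
    """(m)_0: the exponent of 2 in the prime factorization of m (m > 0)."""
    e = 0
    while m % 2 == 0:
        m //= 2
        e += 1
    return e

def phi0(z):
    """Initial approximation: phi_0(z) = 2^z."""
    return 2 ** z

def bump(phi_n, e, b, subcase):
    """Return phi_{n+1} from phi_n in subcase 1' (subcase=1) or 2' (subcase=2)."""
    if subcase == 1:
        moved = lambda z: z % 2 == 1 and z > 2 * e        # odd z above 2e
    else:
        moved = lambda z: z % 2 == 0 and z > 2 * e + 1    # even z above 2e+1
    def phi_next(z):
        return 3 ** b * phi_n(z) if moved(z) else phi_n(z)
    return phi_next

phi1 = bump(phi0, e=1, b=4, subcase=1)   # odd z > 2 are moved above b = 4
phi2 = bump(phi1, e=0, b=7, subcase=2)   # even z > 1 are moved above b = 7
```

In this sketch a value φₙ(z) can only be changed by a stage whose index 2e or 2e+1 is smaller than z, which is the feature exploited in the pegging argument; the values at arguments not exceeding the index are left untouched.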

8.1. Lemma. For every z there is an n such that φₙ(z) is pegged.
Proof. We use induction on z. Suppose that z is odd. We know that any
number having the property α must have the property αₙ for all sufficiently
large n. Now, by the induction hypothesis there are only finitely many
different numbers of the form φₘ(y) with y<z. Therefore, if n is sufficiently
large, then any among these numbers that have the property α must also
have the property αₙ.
For all such large n we must have φₙ₊₁(z) = φₙ(z). Otherwise, n would
fall under subcase 1′, and (n)₀ = 2e < z, because this is the only case in which
φₙ₊₁(z) ≠ φₙ(z) for odd z. But, in subcase 1′, φₙ(2e) is allowed to have the
property αₙ₊₁, hence also α, while it does not have the property αₙ. Since
2e<z, this contradicts the “sufficient largeness” of n.
A similar argument applies to even z. ∎

For any given z we put φ(z) = φₙ(z), where n is taken so large that φₙ(z)
is pegged. (Of course, the least n for which φₙ(z) is pegged depends on z.)
By Lemma 8.1, φ(z) is uniquely defined for each z.
We now prove:

8.2. Theorem. The r.e. properties α and β have incomparable degrees.

Proof. To prove that α is not β-recursive, we show that for every e

    α(φ(2e)) holds iff {e}(β;φ(2e)) = 1.

First suppose that {e}(β;φ(2e)) = 1. Then for some (unique) b we have

(1)    T(β; e, φ(2e), b) ∧ (U(b) = 1).

We take n=2²ᵉ · m, where m is odd. We choose m so large that n>b, and
φₙ(2e) is already pegged, and β(b) = βₙ(b) (i.e., every number <b having
the property β already has the property βₙ). Then (1) can be written in
the form

    T(βₙ; e, φₙ(2e), b) ∧ (U(b) = 1).

Thus if φₙ(2e) does not have the property αₙ then (*) holds; and, since
b<n and (n)₀=2e, it follows that n falls under subcase 1'. Then φₙ(2e) is
made to have the property αₙ₊₁, and hence also α.
If φₙ(2e) does have the property αₙ, then again it has the property α.
Thus in any event α(φₙ(2e)) holds. But φₙ(2e) = φ(2e) by the choice of n.
Conversely, suppose that α(φ(2e)) holds. Then there is some (unique)
n such that αₙ(φ(2e)) does not hold but αₙ₊₁(φ(2e)) does. This n must
therefore fall under subcase 1' and

(2)    φ(2e) = φₙ((n)₀).

But the φₙ are defined in such a way that φₘ(y) and φₙ(z) are the same only
if y=z (in fact, (φₙ(z))₀=z). Also, φ(2e) = φₘ(2e) for sufficiently large m.
Therefore by (2) we have (n)₀=2e. It now also follows from (2) that φₙ(2e)
must be pegged.
Since n falls under subcase 1', there exists a unique b<n such that (*)
holds. In particular, since φₙ(2e) is pegged, we can replace φₙ(2e) in (*) by
φ(2e) and obtain

(3)    T(βₙ; e, φ(2e), b) ∧ (U(b) = 1).

(We have omitted the third conjunct because we do not need it.) If we
can show that βₙ(b) = β(b) then it will follow at once from (3) that
{e}(β;φ(2e)) = 1, as required.
Suppose that βₙ(b)≠β(b). Then there exists some t<b such that t has
the property β but not βₙ. Therefore there is some m≥n such that t has
the property βₘ₊₁ but not βₘ. Thus m falls under subcase 2'. Hence we
have not only m≥n but actually m>n, since n falls under subcase 1'. Also,
we must have t = φₘ(2d+1), where (m)₀ = 2d+1.
Since m>n, we have

(4)    t = φₘ(2d+1) ≥ φₙ₊₁(2d+1).

Now, either 2d+1>2e or 2e>2d+1. If 2d+1>2e, then (since n falls
under subcase 1') we have φₙ₊₁(2d+1) = 3ᵇ · φₙ(2d+1)>b and by (4) we get
t>b, contradicting the assumption that t<b.
On the other hand, if 2e>2d+1 then (since m falls under subcase 2')
we have φₘ₊₁(2e) = 3ᶜ · φₘ(2e), where c is some computation code and
hence is positive. Therefore φₘ₊₁(2e)>φₘ(2e), and since m>n we have
also φₘ₊₁(2e)>φₙ(2e). But this contradicts the fact that φₙ(2e) is pegged.
This completes the proof that α is not β-recursive. Symmetrically, we can
show that β is not α-recursive. ■

8.3. Problem. Show that the sequence φ defined above is not recursive.
(Find a nice sequence ψ such that the identity {ψ(z)}(β;x) = ¬β(z) holds,
and show that β(z) = α(φ(2ψ(z))).)

§ 9. Historical and bibliographical remarks

For a wealth of information on all the topics touched upon in this chapter
(as well as on other topics in recursion theory) we refer the reader to the
treatise Rogers [1967]. The topics of §§3-6 are studied in detail by Smullyan
[1961], to whom our own treatment owes much. Two books devoted
entirely to degrees of unsolvability (touched upon in §§7 and 8) are Sacks
[1963] and Shoenfield [1971].
The arithmetical hierarchy was first studied by Kleene [1943], who proved
the Enumeration Theorem (Thm. 1.7) and the Hierarchy Theorem (Thm. 1.9)
as well as various other results, some of which are included in §1. Broadly
similar results were obtained independently, but published only after the
second world war, by Mostowski [1947]. Thm. 1.16 is due to Post [1948].
The result of §2 is due to Hilbert and Bernays [1939].

The study of creative sets and various notions of reducibility was started
by Post [1944]. Important results (including the remarkable fact that
one-one equivalence implies recursive isomorphism, reproduced here as
Thm. 6.3) were proved by Myhill [1955]. Productive sets were studied
by Dekker [1955].
The notion of degree of unsolvability was introduced by Post [1944],
following the ideas of Turing [1939]. The result reproduced here as Thm. 7.5
is due to Shoenfield [1960].
Post [1944] posed the question that came to be known as “Post’s problem”.
It was solved by Friedberg [1957] and Mučnik [1956].

CHAPTER 9

INTUITIONISTIC FIRST-ORDER LOGIC

This chapter is a brief (and rather sketchy) introduction to the logic
which governs mathematical statements when these are interpreted in
a constructive (as opposed to structural) way. The resulting system of
logic will be compared with the classical system developed in the first
three chapters of this book. We shall therefore assume familiarity with
the material covered there, including the method of first-order tableaux.

§ 1. Preliminary discussion

In §2 of Ch. 1 we remarked that a great many mathematical statements
are about structures. With this in mind, we have set up first-order formal
languages, in which such statements can be formalized. The structural
interpretation of sentences of such a language was then precisely formulated
in the BSD (2.1.1).
Underlying this approach is the idea that structures are given to us as
(or as if they were) entities existing “out there”, finished and complete
before we ever come to use them in our semantic analysis of formulas.
For each ℒ-structure, any ℒ-sentence has a definite truth value: it is
either true or false in that structure, independently of whether we actually
know (or shall ever know) which of the two is the case.
Note that a structure 𝔘 is a certain system of classes: the domain U
of 𝔘 is an arbitrary non-empty class; each basic n-ary relation is a sub-class
of Uⁿ; and each basic n-ary operation is completely determined by its
graph, which is a sub-class of Uⁿ⁺¹. Except in relatively trivial cases,
some or all of these classes are infinite. Thus, a structural interpretation
of mathematical statements commits us to regarding systems of (infinite)
classes as (or as if they were) finished pre-existing entities.
Now, it is philosophically doubtful as to what extent such a view is
justified.¹ Moreover, certain important mathematical statements neither require
a structural interpretation nor lend themselves to it without losing some of
their content.
Consider the following well-known mathematical statement:

(1.1) For any given natural number n we can find a prime number greater
than n.

Of course, this statement can be formalized in a suitable first-order language.
For example, in the language of Ch. 7, and using the notation of Prob.
7.9.16, we can write
(1.2) Vv2 3vi[~iyAv2<v1];

and we may wish to say that statement (1.1) means that the sentence (1.2)
expresses a truth about the structure 𝔑 of natural numbers, or that (1.2)
holds in 𝔑. But in this we would be doing an injustice to (1.1); because
what it says (and means) is that we are able to do something, and this
message gets lost in the claim that (1.2) holds in 𝔑.
In fact (1.1) claims that a certain construction can be made. We prove
this claim by prescribing how to make the construction and showing that
it yields the required result. (E.g., thus: Calculate n!+1 and, using a
previously established construction which yields for any number >1 the least
prime factor of that number, find the least prime factor p of n!+1. This
p is the result of the new construction. We already know from the previous
construction that p is prime and divides n!+1. Therefore, by another previous
result, p cannot divide n! and hence must be >n.)
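The construction just described is explicit enough to be carried out mechanically. The following Python sketch is an illustration only; the function names are ours, and trial division is merely one possible choice for the "previously established construction" that yields the least prime factor.

```python
from math import factorial


def least_prime_factor(m: int) -> int:
    """Least prime factor of m, found by trial division (requires m > 1)."""
    if m < 2:
        raise ValueError("m must be greater than 1")
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d  # the first divisor found is necessarily prime
        d += 1
    return m  # no divisor up to sqrt(m): m itself is prime


def prime_greater_than(n: int) -> int:
    """Apply the construction of (1.1): the least prime factor of n! + 1."""
    return least_prime_factor(factorial(n) + 1)


p = prime_greater_than(4)  # 4! + 1 = 25, whose least prime factor is 5
```

Note that the result need not be the least prime above n: for n = 5 the construction yields 11 (since 5! + 1 = 121), although 7 is also a prime greater than 5. What (1.1) asserts is only that some prime greater than n can be found.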
Moreover, to understand (1.1) and its proof we do not need to
assume the pre-existence of the structure 𝔑, or of any infinite class, as a
finished entity. What we do need to assume is that we can construct natural
numbers 0, 1, 2 and so on without ever having to stop,² but without ever
having constructed “all” natural numbers; and that we can perform certain
operations (which are also constructions) on constructions made previously.
Under this constructive point of view (which in the case of (1.1) seems
most natural) a statement such as (1.1) conveys more information (and
also information of a different kind) than does the claim that a corresponding
formal sentence such as (1.2) holds in some infinite structure such as 𝔑.

1 The philosophical position which claims that mathematical entities, including infinite
classes, actually exist as ideal objects independent of the human mind is known as
Platonism.
2 In this there is, of course, an idealization; because life is too short.
Most mathematicians recognize this and attribute special importance to
constructive statements and proofs.¹
Since the constructive interpretation of a formal statement conveys more
information than the structural one, it should come as no surprise that it
may require more effort to produce a constructive proof than a classical
(non-constructive) proof of the same statement. In other words, certain
arguments which occur frequently in classical proofs are not acceptable
from a constructive viewpoint, and cannot be used in constructive proofs.
As an example of this, consider the statement:

(1.3) There are irrational real numbers a and b such that aᵇ is rational.

From a classical point of view, we can prove (1.3) as follows. Let b = √2.
It is well known that b is irrational. If bᵇ is rational, put a=b and we are
through. If bᵇ is irrational, put a=bᵇ; then aᵇ = (√2)² = 2, which is rational.
However, if we want to interpret (1.3) constructively, then it must be
re-phrased (or re-interpreted) as a claim that we can construct two particular
real numbers a and b, and prove that they are irrational and that aᵇ is rational.
The above proof does not actually construct the required a and b. After
reading and understanding it, we still do not know any particular pair of
numbers with the required property. To convert this proof into a constructive
proof of (the constructive version of) statement (1.3), one would need
to make a much deeper study of the numbers b = √2 and bᵇ. (In fact
it can be shown that bᵇ is irrational.)
In this chapter we propose to pursue some of the consequences of the
constructive point of view. We shall not discard first-order languages
as a means for formalizing mathematical statements, but we shall try to
see what happens when a structural interpretation of formal sentences is
replaced by a purely constructive one.
Just as the structural approach suggested the so-called classical first-order
logic developed earlier in this book, so the constructive approach will
naturally suggest a somewhat different logic governing first-order sentences.
This is known as intuitionistic first-order logic.

1 This was also the attitude we adopted earlier in this book. Although the interpretation
chosen for the object language ℒ was structural, results about ℒ were, in most cases
where this was possible, formulated and proved constructively.

§ 2. Philosophical remark

In taking the constructive viewpoint seriously and studying intuitionistic
logic, we are not committing ourselves to the intuitionistic philosophy of
mathematics of Brouwer and his more orthodox disciples. This philosophy
claims, among other things: that the constructive interpretation of
mathematical statements is the only legitimate one; that any mathematical statement
or proof which cannot be understood constructively is meaningless and must
be rejected; that mathematical activity is essentially subjective and consists
in mental constructions in the mind of the individual mathematician, rather
than a primarily social activity concerned with objective facts independent
of any individual mind; that it is in principle a pre-linguistic or
extra-linguistic activity and therefore ideally no language is required for doing
mathematics but only for recording and for transmitting it from one
individual mind to another; and that when mathematical ideas are expressed in
linguistic form (even in a formal language) they are necessarily distorted and
lose some of their rigour and sharpness.
Most mathematicians and philosophers of mathematics do not accept
these views. While admitting that a constructive approach is always valuable,
and sometimes even more natural than a structural one, they do not reject
the latter either, and regard the fruitful and intricate interplay between the
two as the very soul of mathematics. For Platonists, the structural approach
is justified in itself. Others, who are neither intuitionists nor Platonists,
believe (or hope) that, while infinite structures may not really exist as
Platonic objects, it is possible to justify using them as if they did.

§ 3. Constructive meaning of sentences

Let us sketch some of the features of a constructive interpretation of formal
sentences.
The variables occurring in a sentence are taken to range over constructions
of some prescribed kind. Let us call these the basic constructions. For
example, in the case considered in §1, if we want (1.2) to be a formal version
of (1.1), we take the natural numbers as basic constructions. As we have
already pointed out, if there are infinitely many basic constructions, we
do not think of them as constituting a finished collection, but as being
generated one by one, as and when required. The same applies to other
constructions discussed below.
Besides the basic constructions we also consider constructions of more
complicated types. For example, there are constructions that can be applied
to one or several given basic constructions, to yield yet another basic
construction. More generally, a construction can be applied to zero or
more constructions of specified kinds, to yield other constructions.¹
Under a constructive interpretation, each formal sentence asserts that a
construction satisfying certain conditions can be made. Proving a sentence
consists in performing such a construction. Thus, a proof of a given sentence
is nothing but a construction whose performability the given sentence
asserts. For example, under a suitable constructive interpretation, the
sentence (1.2) asserts the possibility of making (and is proved by actually
making) a construction which can be applied to any number (i.e., any
basic construction) n, to yield a number p together with a proof (which
is also a construction) that p is prime and a proof that p>n.
Thus, under a constructive interpretation, instead of saying “the sentence
α asserts that such-and-such a construction can be made” we can say
equivalently “such-and-such a construction is a proof of α” or “α is proved
by such-and-such a construction”.
Note the difference: under a structural interpretation (an ℒ-structure)
the assertion made by an ℒ-sentence is quite a separate matter from the
possibility of proving the assertion; but under a constructive interpretation
these two matters are inseparable.

§ 4. Constructive interpretations

In order to see what constructive meanings should be attributed to the
connectives and quantifiers, we shall have to proceed more formally.
We let ℒ be a first-order language. For the sake of simplicity we shall
assume throughout this chapter except in §13 that ℒ is a language without
equality (and hence without function symbols other than constants).
The variables of ℒ are enumerated in a fixed alphabetic ordering: v₁, v₂,
v₃, etc.
Since we cannot presuppose that the constructive meanings of ∨ and ∧
can be reduced to those of ¬ and → as in classical logic, we take all these
four connectives as primitive symbols (cf. §15 of Ch. 1). Similarly, we take
both quantifiers ∃ and ∀ as primitive.

1 Complex constructions occur frequently in mathematics, and have already cropped
up many times in this book. For example, in proving Lemma 1.8.6 we set up a construction
that can be applied to two given propositional tableaux of specified kinds, to yield
a third propositional tableau. Tableaux themselves are constructions, of course. In
proving Thm. 6.5.5 we set up a construction that can be applied to any description of
a functional to yield a program for computing that functional. Descriptions and programs
are themselves constructions.
Formulas are formed in the usual way. We assume that the definitions
of the various syntactical notions (e.g., free occurrence of a variable in
a formula, substitution, alphabetic change, etc.) have been adapted to the
present setting. In this chapter the degree of a formula α (briefly, deg α)
is taken to be the total number of occurrences of the four connectives and
two quantifiers in α. A constructive interpretation ℭ of ℒ consists of the
following ingredients.
(1) A prescription for generating certain constructions, called the basic
constructions or individuals¹ of ℭ.
(2) A construction that can be applied to any constant a of ℒ, to yield
an individual a^ℭ of ℭ.
(3) A decision procedure that can be applied to any construction f and
any k-ary predicate symbol P of ℒ and any k-tuple (b₁,...,bₖ) of individuals
of ℭ, whereby we can decide whether or not f is a proof of the atomic
statement P^ℭ(b₁,...,bₖ).

Let ℭ be a constructive interpretation. Let x₁,...,xₙ be distinct variables,
and let α be any formula all of whose free variables are among x₁,...,xₙ.
With any given n-tuple (a₁,...,aₙ) of individuals of ℭ associate a statement
α^ℭ[x₁/a₁,...,xₙ/aₙ]. Intuitively, this is the statement made by the formula
α under the interpretation ℭ, when a₁,...,aₙ are taken as the values of
x₁,...,xₙ respectively.
When we deal with a fixed interpretation ℭ, we shall omit the superscript
and write “α[x₁/a₁,...,xₙ/aₙ]”. Also, when there is no risk of confusion,
we shall abbreviate this further and write simply “α[a₁,...,aₙ]”.
To explain the meaning of a statement, we should specify what constructions
the statement asserts to be performable. In view of what we saw in
§3, this is equivalent to specifying what constructions constitute proofs of
the statement.
Proceeding by induction on deg α, we shall now specify what a proof of
α[a₁,...,aₙ] is. (What follows cannot be regarded as a rigorous definition
but as an explanation which, we hope, will serve as a heuristic guide to the
constructive meaning of formulas.)
First, let α = Pr₁...rₖ, where P is a k-ary predicate symbol and r₁,...,rₖ
are terms, i.e., constants or variables. For each i=1,...,k we let bᵢ be an
individual defined as follows. If rᵢ is a constant, we put bᵢ=rᵢ^ℭ. If rᵢ is
a variable, it must be among x₁,...,xₙ; if rᵢ is xⱼ we put bᵢ=aⱼ. We now
let α[a₁,...,aₙ] be the atomic statement P^ℭ(b₁,...,bₖ). The proofs of this
statement are already specified as part of the definition of ℭ, in clause (3)
above.

1 We shall use lower case italic letters from the beginning of the alphabet (esp. “a”, “b”
and “c”) to denote these individuals. We reserve “f”, “g” and “h” for arbitrary
constructions.
Now let a = ~lp. The statement v.[ar,...,an] is thus the negation of
P[a1,...,aJ. What should a proof of a[alv..,a„] be like? Surely, it must
show that p[a1,...,a^ is unprovable. Constructively, this means that a proof
of «[#!,...,«„] is a construction g which, when applied to any construction f
yields a proof that f is not a proof of [\[a],...,an] (plus a proof that g has this
property).
Since this constructive meaning of negation is rather tricky, let us illustrate
it by analysing a proof of a negation statement presented in Ch. 1. Consider
the proof of the consistency of the propositional calculus immediately
following Thm. 1.11.3. It consists in showing that the empty set of formulas
does not have a (propositional) confutation. Suppose therefore that, for
any finite set of formulas Φ, the statement β[Φ] says that Φ has a confutation.
A (constructive) proof of β[Φ] is simply a confutation of Φ, which is of
course a special kind of construction. From the definition of the notion
confutation of Φ we know how to tell whether or not any construction
f presented to us is a confutation of Φ. In particular, two of the necessary
conditions are:
(i) f should be a tableau having Φ as its initial node;
(ii) if Φ does not contain a prime formula and its negation, f must have
more nodes than one, and hence Φ has to contain some formula to which
one of the three tableau rules is applicable.
Thus, to show that f is not a confutation of Φ, it is enough to show that
f fails to satisfy (i) or (ii). Now, the statement that we want to prove is
¬β[∅]. It is proved by showing that, for any construction f and for Φ = ∅,
at least one of the conditions (i) and (ii) must break down: if f is not a tableau
having ∅ as its initial node then (i) fails, and if f has ∅ as its initial node then
(ii) fails. Also, every construction f either is or is not a tableau having
∅ as its initial node (and we can always tell which of these two is the case!),
so that for each f we can actually locate one of the two conditions (i) and (ii)
which fails.

Let us now return to our inductive description of what a proof of
α[a₁,...,aₙ] is.
Let α = β∧γ. Then a proof of α[a₁,...,aₙ] is a construction consisting
of two parts (or, an ordered pair of constructions) the first of which is a proof
of β[a₁,...,aₙ] and the second a proof of γ[a₁,...,aₙ]. This explanation is
both clear and highly natural, and calls for no further comment.
Next, let α = β∨γ. Then a proof of α[a₁,...,aₙ] is any construction which
is a proof of β[a₁,...,aₙ] or a proof of γ[a₁,...,aₙ]. This is also very natural.
But note that it implies that (β∨¬β)[a₁,...,aₙ] may not be provable. For
we must admit the possibility that, while we cannot find a proof of β[a₁,...,aₙ],
we may not be able to find a proof of (¬β)[a₁,...,aₙ] either. After all, a proof
of (¬β)[a₁,...,aₙ] gives us not merely the fact that β[a₁,...,aₙ] is unprovable,
but a constructive proof of this fact. Thus the law of the excluded middle
(tertium non datur) does not generally hold under a constructive
interpretation.
Now let α = β→γ. The natural thing to do here is to take a proof of
α[a₁,...,aₙ] to be a construction g which, whenever it is applied to a proof
f of β[a₁,...,aₙ], yields a proof of γ[a₁,...,aₙ] (plus a proof that g has this
property).
To deal with the quantifiers, we introduce the following notation. If x is
the same as xᵢ, then
(*)    α[x₁/a₁,...,xₙ/aₙ][x/a]
is the same as α[x₁/b₁,...,xₙ/bₙ], where bⱼ=aⱼ for all j≠i, and bᵢ=a. If x
is different from all the xᵢ, then (*) is the same as α[x₁/a₁,...,xₙ/aₙ,x/a].
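Read as an operation on assignments of individuals to variables, the notation (*) simply overrides the assignment at x, or extends it if x is new. A minimal Python sketch, assuming that an assignment is represented as a dictionary from variable names to individuals (a representation chosen here purely for illustration):

```python
def update(assignment: dict, x: str, a) -> dict:
    """The assignment that agrees with `assignment` except that x is sent to a."""
    new = dict(assignment)  # copy, so the given assignment is left unchanged
    new[x] = a
    return new


old = {"x1": 10, "x2": 20}
overridden = update(old, "x2", 7)  # x is one of the x_i: its value is replaced
extended = update(old, "x", 7)     # x is a new variable: the assignment grows
```

The two calls correspond to the two cases distinguished above.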
Now let α = ∀xβ. Here it is natural to take as a proof of α[a₁,...,aₙ]
a construction g which, when applied to any individual a, yields a proof of
β[x₁/a₁,...,xₙ/aₙ][x/a] (plus a proof that g has this property).
Finally, let α = ∃xβ. Here too there is only one natural way to proceed,
provided we remember that the constructive meaning of ∃ is not merely
“there exists...” (exists where?) but “we can construct...”. A proof of
α[a₁,...,aₙ] is a construction consisting of two parts (or, an ordered pair
of constructions) the first of which is a basic construction (i.e., an individual)
a and the second a proof of β[x₁/a₁,...,xₙ/aₙ][x/a].
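The clauses for the connectives and quantifiers can be mimicked, very roughly, by treating proofs as data and functions. The Python sketch below is a heuristic illustration under several assumptions of ours: proofs of atomic statements are opaque tokens, a proof of a disjunction carries a tag saying which disjunct it proves, and the clause "plus a proof that g has this property" is ignored altogether. All names are hypothetical.

```python
def and_swap(proof_of_conjunction):
    """From a proof of β∧γ (an ordered pair) obtain a proof of γ∧β."""
    proof_of_beta, proof_of_gamma = proof_of_conjunction
    return (proof_of_gamma, proof_of_beta)


def or_swap(proof_of_disjunction):
    """From a tagged proof of β∨γ obtain a tagged proof of γ∨β."""
    tag, proof = proof_of_disjunction
    return ("right", proof) if tag == "left" else ("left", proof)


def compose(f, g):
    """From proofs f of β→γ and g of γ→δ obtain a proof of β→δ."""
    return lambda proof_of_beta: g(f(proof_of_beta))


def exists_greater(n: int):
    """A proof of ∀x∃y(x<y) over the natural numbers: applied to an
    individual n it yields a pair (witness, evidence that n < witness)."""
    witness = n + 1
    evidence = ("less-than", n, witness)  # stand-in for an atomic proof
    return (witness, evidence)
```

For instance, `and_swap` itself plays the part of a proof of (β∧γ)→(γ∧β): it is a construction which, applied to any proof of β∧γ, yields a proof of γ∧β. No comparable function can be written down for β∨¬β in general, which reflects the remark on the excluded middle.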

The above explanation of the term proof of α[a₁,...,aₙ] cannot be regarded
as a precise definition, for several reasons.
First, it uses notions (especially the notion of construction) which
are left too vague. To turn our explanation into a definition, one would
need to have as a background theory a general theory about constructions.
Second, the explanation is (to use a technical term) very impredicative.
In the part dealing with negation, for instance, a proof of (¬β)[a₁,...,aₙ]
is characterized as a construction that yields such-and-such results when
applied to arbitrary constructions. Now, the arbitrary constructions referred
to must, presumably, include also the very construction (proof) which is
being characterized. This is somewhat suspect. To mitigate this, one could
e.g. distinguish constructions of various levels and characterize a construction
only in terms of its effect on constructions of levels lower than itself.
A formula of higher degree should have proofs of higher level.
These and other difficulties can be overcome, at least to some extent,
and our explanation can be converted into something much more like a real
definition; but this is beyond the scope of the present book.¹ Nevertheless,
even as they stand our explanations should be helpful as a heuristic guide
and a partial justification for the development in the following sections.

4.1. Problem. Let all the free variables of α be among the distinct variables
x₁,...,xₙ, and let x be another variable. Check that, for any individuals
a₁,...,aₙ, a, a construction is a proof of α[x₁/a₁,...,xₙ/aₙ,x/a] iff it is a proof
of α[x₁/a₁,...,xₙ/aₙ].
4.2. Problem. Let α be as in Prob. 4.1, and let α' be a variant of α. Check
that a construction is a proof of α[a₁,...,aₙ] iff it is a proof of α'[a₁,...,aₙ].
4.3. Problem. Let t be any term (i.e., a variable or a constant), and let
the distinct variables x₁,...,xₙ include all the free variables of α(x/t). Show
how to obtain from any proof of (α(x/t))[a₁,...,aₙ] a proof of (∃xα)[a₁,...,aₙ].

§ 5. Intuitionistic tableaux

We introduce a new kind of constructive statement, called “► statements”.
These are statements about finite sets of formulas.
Let Φ = {φ₁,...,φₖ} be a finite set of formulas and let ψ be a formula.
Let all the free variables of Φ and ψ be among the distinct variables x₁,...,xₙ.
Then the statement

    Φ ► ψ

means that we can perform a construction f such that when applied to
any given constructive interpretation ℭ, any n given individuals a₁,...,aₙ
of ℭ and any k given proofs f₁,...,fₖ of φ₁^ℭ[x₁/a₁,...,xₙ/aₙ],...,φₖ^ℭ[x₁/a₁,...,xₙ/aₙ]
respectively, f yields a proof of ψ^ℭ[x₁/a₁,...,xₙ/aₙ]. (In particular, in the
case k = 0, for any given ℭ and a₁,...,aₙ, the construction f must yield
a proof of ψ^ℭ[x₁/a₁,...,xₙ/aₙ].)

1 The interested reader should consult the works referred to at the end of this chapter.
Also, the statement

    Φ ►

means that Φ ► ψ for every formula ψ.
If Φ ► ψ, we say that ψ follows intuitionistically from Φ. In particular, if
Φ is empty we write this as “► ψ” and say that ψ is intuitionistically valid.
If Φ ►, we say that Φ is intuitionistically contradictory.
It is clear that the statement
(5.1)    Φ, ψ ► ψ
always holds.
The following thirteen ► rules can be verified easily, using the explanation
of §4. In each case, a proof of the ► statement below the horizontal line
can be obtained by a rather simple construction from any given proof(s)
of the ► statement(s) above the line. In all cases, Φ is any finite set of
formulas, and α and β are any formulas. The name of each rule appears
on the left. In the ►∃ rule and the ∀► rule, t is any term. In the ∃►
rule and the ►∀ rule, y is a variable which is not free in any of the formulas
below the line. Here as well as in the rest of this chapter we let the letter
ξ stand for any formula or for the empty string of symbols. (In the latter
case, Φ ► ξ is simply Φ ►.)

∨►:    Φ, α∨β, α ► ξ;  Φ, α∨β, β ► ξ
       ------------------------------
       Φ, α∨β ► ξ

►∨₁:   Φ ► α
       ---------
       Φ ► α∨β

►∨₂:   Φ ► β
       ---------
       Φ ► α∨β

∧►:    Φ, α∧β, α, β ► ξ
       ----------------
       Φ, α∧β ► ξ

►∧:    Φ ► α;  Φ ► β
       -------------
       Φ ► α∧β

→►:    Φ, α→β ► α;  Φ, α→β, β ► ξ
       --------------------------
       Φ, α→β ► ξ

►→:    Φ, α ► β
       ---------
       Φ ► α→β

¬►:    Φ, ¬α ► α
       ----------
       Φ, ¬α ► ξ

►¬:    Φ, α ►
       -------
       Φ ► ¬α

∃►:    Φ, ∃xα, α(x/y) ► ξ
       ------------------
       Φ, ∃xα ► ξ

►∃:    Φ ► α(x/t)
       ----------
       Φ ► ∃xα

∀►:    Φ, ∀xα, α(x/t) ► ξ
       ------------------
       Φ, ∀xα ► ξ

►∀:    Φ ► α(x/y)
       ----------
       Φ ► ∀xα
The first nine rules are the propositional rules; the last four are the
quantifier rules.

We leave the verification of these rules to the reader.


We have justified (5.1) and the ► rules in terms of the heuristic explanations
of §4. We now adopt an axiomatic approach: we take (5.1) as an
axiom scheme and the thirteen ► rules as rules of inference. From now on
we shall only assert a ► statement if we can derive it from instances of
(5.1) by a finite number of applications of the ► rules.
While the explanations of §4 are certainly not acceptable as a rigorous
definition, scheme (5.1) and the thirteen ► rules are generally accepted
as correct principles of intuitionistic first-order logic. (Thus, any correct
rigorous version of the ideas of §4 should vindicate (5.1) and the thirteen
rules.) Also, all principles of intuitionistic first-order logic at present
generally accepted can be derived from (5.1) by means of the thirteen rules.
We shall now set up a very efficient tableau method for deriving correct
► statements.
We introduce a new symbol “−” (minus). By the negative of the formula
α we mean the expression −α. (This should not be confused with the
negation ¬α of α.) The minus sign is allowed to occur at most once in
each expression, and only in front of a whole formula. (Therefore no
brackets are needed for writing negatives. For example, in −α∧β the
minus must apply to the whole formula α∧β and not just to α.) By
“±formula” we mean formula or negative of a formula.
An intuitionistic propositional tableau has at its initial node a finite set
of formulas and at most one negative. There are nine propositional rules
for extending a given tableau. They are presented schematically below.
In each case the name of the rule appears on the left. The meaning of the
stars that accompany seven of the rules will be explained later.

∨:        α∨β
         /   \
        α     β

−∨₁:     −α∨β
          | *
          −α

−∨₂:     −α∨β
          | *
          −β

∧:        α∧β
           |
          α, β

−∧:      −α∧β
        * / \ *
        −α   −β

→:        α→β
        * / \
        −α   β

−→:      −α→β
          | *
         α, −β

¬:        ¬α
          | *
          −α

−¬:      −¬α
          | *
           α

The function of the star is to “kill” any negative occurring earlier in the
branch. For example, in the tableau

        −α∧β, γ→δ
         * /  \ *
         −α    −β
             * / \
             −γ   δ

the ±formulas of the leftmost branch are γ→δ and −α, but not −α∧β,
which has been “killed” by the star accompanying the edge that leads to
−α; the ±formulas of the middle branch are γ→δ and −γ, but not −α∧β
nor −β because they have been “killed” by the first and second stars,
respectively, of this branch; the ±formulas of the rightmost branch are
γ→δ, −β, δ because −α∧β has been “killed” by the only star of this
branch; but −β does belong to this branch because there is no star below
−β in this branch. Notice that a star kills a negative only in the branch in
which the star occurs; the same negative may continue to live in other branches.
(Thus, in our example, −β was killed only in the middle branch but continues
to live in the rightmost branch.)
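The bookkeeping performed by the stars can be spelled out as a short procedure. The Python sketch below is an illustration under assumptions of ours: a branch is given as a list of steps from the initial node downwards, each step being a pair consisting of a flag (whether the edge leading to the node is starred) and the list of ±formulas of the node, and a ±formula is a pair of a sign ("+" or "-") and a formula written as a string.

```python
def live_formulas(branch):
    """±formulas belonging to a branch, after the stars have done their killing.

    branch: list of (starred, node) pairs from the initial node downwards,
    where node is a list of (sign, formula) pairs and sign is "+" or "-".
    """
    live = []
    for starred, node in branch:
        if starred:
            # a star kills every negative occurring earlier in the branch
            live = [(sign, formula) for (sign, formula) in live if sign == "+"]
        live.extend(node)
    return live


# The example tableau above, branch by branch.
root = (False, [("-", "α∧β"), ("+", "γ→δ")])
leftmost = [root, (True, [("-", "α")])]
middle = [root, (True, [("-", "β")]), (True, [("-", "γ")])]
rightmost = [root, (True, [("-", "β")]), (False, [("+", "δ")])]
```

Applying `live_formulas` to the three branches gives γ→δ and −α for the leftmost, γ→δ and −γ for the middle, and γ→δ, −β, δ for the rightmost, in agreement with the description above. Since each branch is processed separately, a negative killed in one branch remains alive in the others.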
A branch is closed if it contains a prime formula (i.e., an atomic, existential
or universal formula) as well as its negative. A tableau is closed if all its
branches are closed. A tableau which has a given set of ±formulas as
its initial node is said to be a tableau for that set of ±formulas.
Notice that in any branch there is at most one negative. For, the initial
node is allowed to have at most one negative, and the rules are such that
whenever a new negative is introduced, the old one (if any) is killed. (In
the −¬ rule the old negative is killed but a new one is not introduced.)
Let Φ be a finite set of formulas. We write “Φ ▷₀ ψ” as short for “we
can construct a closed intuitionistic propositional tableau for Φ, −ψ”.
In particular, if Φ is empty we write “▷₀ ψ” instead of “∅ ▷₀ ψ”. We write
“Φ ▷₀” as short for “we can construct a closed intuitionistic propositional
tableau for Φ”. A proof of a ▷₀ statement is simply a tableau of the kind
that the statement asserts to be constructible.

5.2. Theorem. If Φ ▷₀ ξ then also Φ ► ξ; and we can derive Φ ► ξ from
instances of (5.1) by a finite number of applications of the propositional
► rules.
Proof. By induction on the depth d of a given proof of Φ ▷₀ ξ.
If d = 0, then (ξ must be a prime formula and) Φ must contain ξ. Thus
Φ ► ξ is an instance of (5.1).
If d > 0, we distinguish nine cases according to the tableau rule that
gave rise to level 1 in the given proof. For example, if the rule in question
is the → rule then the given proof begins thus:

    Φ, −ξ
    */  \
   −α    β

where α→β ∈ Φ. (If ξ is the empty string of symbols, then the initial node
consists of Φ only.) From this proof we obtain, by amalgamating the initial
node with the nodes {−α} and {β} respectively, proofs of Φ ▷₀ α and
Φ, β ▷₀ ξ, and the depths of these proofs are < d. By the induction hypothesis,
we can derive Φ ► α and Φ, β ► ξ from (5.1) by finitely many applications
of the propositional ► rules. But since α→β ∈ Φ, we get from these
two ► statements the statement Φ ► ξ by one application of the → rule.
The other eight cases are similar. ∎

5.3. Problem. Show that Φ, ψ ▷₀ ψ for any formula ψ. Thus a branch that
contains any formula (not necessarily prime) as well as its negative is “as
good as closed”. (Use induction on deg ψ.)
5.4. Problem. Prove the converse of Thm. 5.2: if we can derive Φ ► ξ
from (5.1) using the nine propositional ► rules, then Φ ▷₀ ξ.

In the following problem, and in the sequel whenever we make a
comparison with classical logic, it should be remembered that we now
have four primitive connectives and two primitive quantifiers. In particular,
whenever we refer to classical propositional logic, we assume that it is
treated as sketched in §15 of Ch. 1.

5.5. Problem. Show that if Φ ▷₀ then we can construct a propositional
confutation (in the sense of Ch. 1) of Φ; and that if Φ ▷₀ ψ then we can
construct a propositional confutation of Φ, ¬ψ, hence Φ ⊢₀ ψ. (Examine
what happens to our present nine tableau rules if the minus sign is replaced
by negation.)
5.6. Problem. Show that:
(i) ▷₀ α∨β iff ▷₀ α or ▷₀ β.
(ii) α ▷₀ β iff ▷₀ α→β.
(iii) ▷₀ α∨¬α and ▷₀ ¬¬α→α are impossible if α is a prime formula.
5.7. Problem. Show that for all α and β
(i) ▷₀ α→¬¬α,
(ii) ▷₀ ¬¬¬α→¬α,
(iii) ▷₀ ¬¬(¬¬α→α),
(iv) ▷₀ ¬¬(α∨¬α),
(v) ¬¬(α→β) ▷₀ ¬¬α→¬¬β,
(vi) ¬¬α→¬¬β ▷₀ ¬¬(α→β),
(vii) α→β, ¬β ▷₀ ¬α,
(viii) ¬α∨β ▷₀ α→β,
(ix) ▷₀ ¬¬[(¬α→β)→(¬α→¬β)→α].
(Note that even if a formula has been used once to extend a branch, it may
have to be used again in some extension of that branch.)

We shall now sketch a method whereby we can effectively decide, for
each formula α, whether or not ▷₀ α.
Let a formula α be given. Let Φ be the set of all subformulas of α. Then
Φ is finite and we can actually construct it.
We consider statements of the form Ψ ► ξ, where Ψ ⊆ Φ and ξ is the
empty string or ξ ∈ Φ. There are finitely many such statements.


Using Thm. 5.2 and Prob. 5.4 it is easy to see that ▷₀ α iff there is a
sequence {Ψᵢ ► ξᵢ : 1 ≤ i ≤ n} of statements of the above form, such that
(1) for each i, the statement Ψᵢ ► ξᵢ is an instance of (5.1) or follows
from one or two earlier statements in the sequence by a single application
of a propositional ► rule.
(2) Ψₙ = ∅ and ξₙ = α.
(3) All the statements in the sequence are distinct.
There are only finitely many sequences satisfying condition (3), and we
can check whether or not there is among them a sequence satisfying the
other two conditions as well.
(By the same method we can also decide, for each finite set Φ of formulas
and for each ξ, whether or not Φ ▷₀ ξ.)
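The finiteness claim behind this decision method can be made concrete. The following Python sketch only enumerates the candidate statements Ψ ► ξ for a given α; it does not implement the ► rules themselves. The tuple encoding of formulas and the names `subformulas` and `statements` are assumptions made for illustration. If α has n distinct subformulas, there are 2ⁿ·(n+1) candidate statements.

```python
from itertools import chain, combinations

# Assumed encoding (not the book's): ("atom", "p"), ("not", a),
# ("and", a, b), ("or", a, b), ("imp", a, b).

def subformulas(f):
    """Set of all subformulas of f, including f itself."""
    result = {f}
    if f[0] != "atom":
        for part in f[1:]:
            result |= subformulas(part)
    return result

def statements(f):
    """All pairs (Psi, xi): Psi a set of subformulas of f, and xi either
    None (standing for the empty string) or a subformula of f."""
    phi = sorted(subformulas(f), key=repr)
    subsets = chain.from_iterable(
        combinations(phi, k) for k in range(len(phi) + 1))
    return [(frozenset(psi), xi) for psi in subsets for xi in [None] + phi]

p, q = ("atom", "p"), ("atom", "q")
alpha = ("imp", p, ("or", p, q))          # p → (p ∨ q)
n = len(subformulas(alpha))
print(n, len(statements(alpha)), 2 ** n * (n + 1))
```

Since a sequence satisfying condition (3) never repeats a statement, its length is bounded by this count, which is why only finitely many sequences need to be examined.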

First-order intuitionistic tableaux are constructed like propositional ones,
except that there are four additional quantifier rules for extending branches.
Also, the rule for closing a branch is different. The four new rules are
represented schematically as follows:

∃:                     −∃:

    ∃xα                    −∃xα
     |                       |*
   α(x/y)                 −α(x/t)

∀:                     −∀:

    ∀xα                    −∀xα
     |                       |*
   α(x/t)                 −α(x/y)

Here t is any term (i.e., variable or constant) and y is any variable which
is not free in any of the ±formulas of the branch which is being extended.
(The free variables of a negative −β are the same as those of β.) The
particular y used in an application of the ∃ rule or the −∀ rule is called
the critical variable of that application.
A branch is closed if it contains an atomic formula as well as its negative.
If Φ is a finite set of formulas, we write “Φ ▷ ψ” as short for “we can
construct a closed intuitionistic first-order tableau for Φ, −ψ”; and we
write “Φ ▷” as short for “we can construct a closed intuitionistic
first-order tableau for Φ”. A proof of a ▷ statement is a tableau of the kind
whose constructibility that statement asserts.

5.8. Theorem. If Φ ▷ ξ, then Φ ► ξ; and we can derive Φ ► ξ from (5.1) by
a finite number of applications of the thirteen ► rules.
Proof. Similar to Thm. 5.2. ∎

5.9. Problem. Show that Φ, ψ ▷ ψ for any formula ψ. Thus a branch
containing any formula and its negative is “as good as closed”. Hence
show that if Φ ▷₀ ξ then also Φ ▷ ξ.
5.10. Problem. Prove the converse of Thm. 5.8.
5.11. Problem. Show that if Φ ▷ then we can construct a first-order
confutation (in the sense of Ch. 2) of Φ; and if Φ ▷ ψ then we can construct
a first-order confutation for Φ, ¬ψ, hence Φ ⊢ ψ.
5.12. Problem. Show that, if α is atomic and contains x, it is impossible
that ▷ ¬¬∀x(α∨¬α); also it is impossible that ▷ ¬¬∀x(¬¬α→α).
5.13. Problem. Show that ▷ α∨β iff ▷ α or ▷ β. Also, α ▷ β iff ▷ α→β.
5.14. Problem. Show that ▷ ∃xα iff we can find a term t for which
▷ α(x/t).
5.15. Problem. Show that for all α:
(i) ▷ ¬¬∀xα→∀x¬¬α, ▷ ∃x¬¬α→¬¬∃xα;
(ii) ▷ ∀x¬α→¬∃xα, ▷ ¬∃xα→∀x¬α;
(iii) ▷ ∃x¬α→¬∀xα.
5.16. Problem. Let α be an atomic formula containing x. Show that it is
impossible that ▷ ¬∀xα→∃x¬α. What about ▷ ∀x¬¬α→¬¬∀xα
and ▷ ¬¬∃xα→∃x¬¬α?

The following three results will be needed in §7.

5.17. Problem. Let y₁,...,yₙ be any variables. Show that if Φ ▷ ξ then
we can find a proof of Φ ▷ ξ in which none of these variables is used as
a critical variable. (Prove a result similar to Lemma 2.5.2, then proceed
as in the proof of Lemma 2.5.3.)
5.18. Problem. Let Ψ be a finite set of formulas, and let ψ be a formula.
Show that if Φ ▷ ξ then we can find a proof T of Φ ▷ ξ such that Ψ can
be added to the initial node of T, yielding a proof of Φ∪Ψ ▷ ξ. Also
show that if Φ ▷ then we can find a proof T′ of Φ ▷ such that by adding
−ψ to the initial node of T′ we get a proof of Φ ▷ ψ. (Use 5.17.)
5.19. Problem. Show that, for any variable z and any term s, we can
transform any proof of Φ ▷ ξ into a proof of Φ(z/s) ▷ ξ(z/s), where ξ(z/s)
is the empty string if ξ is empty. (Proceed as in 2.5.4.)


§ 6. Kripke’s semantics

There are various methods of interpreting ℒ-formulas in such a way that
the rules of intuitionistic logic, but not all the rules of classical logic,
turn out to be sound. One of these methods, which is rather simple and
very appealing intuitively, is due to Kripke. We shall present Kripke’s
semantics in a purely structural manner, without attempting to bring it
into line with the constructive viewpoint¹. Rather, we shall give Kripke’s
semantics an independent, but structural, heuristic justification. Also, we
shall not deal separately with propositional logic, but start at once with
first-order logic.
In what follows, when we say that t is a term of α we mean that t is a
constant occurring in α or a variable free in α. We use a similar terminology
for sets of formulas.

A Kripke system 𝔎 for ℒ consists of the following ingredients:
(1) A non-empty collection Σ_𝔎, whose members are called the states of 𝔎.
(2) A partial ordering ≤_𝔎 of Σ_𝔎.
(3) A mapping T_𝔎 which assigns to every state A ∈ Σ_𝔎 a non-empty set
T_𝔎(A) of ℒ-terms, in such a way that if A ≤_𝔎 B then T_𝔎(A) ⊆ T_𝔎(B).
(4) A mapping F_𝔎 which assigns to each state A ∈ Σ_𝔎 a (possibly empty)
set F_𝔎(A) of atomic ℒ-formulas such that if α ∈ F_𝔎(A) then all the terms
of α belong to T_𝔎(A), and if A ≤_𝔎 B then F_𝔎(A) ⊆ F_𝔎(B).
When we deal with one particular Kripke system 𝔎, we shall omit the
subscript, and write simply “Σ”, “T” and “F”.

Intuitively, we interpret the states of 𝔎 as “states in the progress of
knowledge”, and the partial order as “possible succession in time”:
A ≤ B means that state B may possibly follow state A. At each state A,
we “know” certain terms (those belonging to T(A)) and we “know”
that certain atomic formulas (those belonging to F(A)) are true. We
assume that we never “forget”, so that what we know at state A we shall
still know at any state that may follow A. Also, we cannot know that an
atomic formula is true without knowing at the same time the terms of that
formula. In view of this intuitive explanation, conditions (1)-(4) above
are quite natural. (The stipulations that Σ ≠ ∅ and T(A) ≠ ∅ for all A ∈ Σ
are made to exclude trivial cases.)
Let 𝔎 be any Kripke system for ℒ. We shall now define a relation ⊩_𝔎
between states of 𝔎 and ±formulas of ℒ. (When there is no risk of
confusion, we write simply “⊩”, omitting the subscript.) If A ⊩ α holds,
we say that α is forced (to be true) at A or that A forces α. Intuitively,
A ⊩ α means that at the state A the formula α is definitely known to be
true; and A ⊩ −α means that at the state A the formula α is understood
but not known to be true.

¹ For a discussion of this question see references quoted at the end of this chapter.

6.1. Kripke’s Semantic Definition.

(1) If α is atomic, then

    A ⊩ α iff α ∈ F(A).

(2) If α = β∨γ, then

    A ⊩ α iff all the terms of α are in T(A), and A ⊩ β or A ⊩ γ.

(3) If α = β∧γ, then

    A ⊩ α iff A ⊩ β and A ⊩ γ.

(4) If α = β→γ, then

    A ⊩ α iff all the terms of α are in T(A), and B ⊩ γ whenever
    B ≥ A and B ⊩ β.

(5) If α = ¬β, then

    A ⊩ α iff all the terms of α are in T(A), and B ⊮ β whenever
    B ≥ A.

(6) If α = ∃xβ, then

    A ⊩ α iff A ⊩ β(x/t) for some term t.

(7) If α = ∀xβ, then

    A ⊩ α iff B ⊩ β(x/t) whenever B ≥ A and t ∈ T(B).

(8) If α is any formula, then

    A ⊩ −α iff all the terms of α are in T(A), but A ⊮ α.

We shall refer to this definition briefly as “KSD”.


If Φ is a set of ±formulas, we write A ⊩ Φ provided A forces every
±formula in Φ.

6.2. Lemma. Suppose that A ⊩ α. Then all the terms of α are in T(A).
Also, if B ≥ A then B ⊩ α.
Proof. Easy, by induction on deg α. ∎
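On a finite Kripke system the clauses of KSD can be evaluated mechanically. The Python sketch below is one possible rendering under stated assumptions: terms are strings, formulas are nested tuples, a system is a dictionary with keys "states", "leq", "T" and "F" (all of these names are illustrative, not the book's), and substitution is assumed never to capture a variable. In clause (6) the search is restricted to t ∈ T(A); by Lemma 6.2 this loses nothing, because T(A) is non-empty.

```python
# Assumed encoding: ("atom", "P", (t1, ..., tk)), ("not", a), ("and", a, b),
# ("or", a, b), ("imp", a, b), ("ex", x, a), ("all", x, a).
# K["leq"] is a set of pairs (A, B) meaning A <= B (reflexive, transitive).

def terms_of(f, bound=frozenset()):
    """Constants occurring in f together with the variables free in f."""
    tag = f[0]
    if tag == "atom":
        return {t for t in f[2] if t not in bound}
    if tag in ("ex", "all"):
        return terms_of(f[2], bound | {f[1]})
    return set().union(*(terms_of(g, bound) for g in f[1:]))

def subst(f, x, t):
    """f(x/t); assumes t is not captured by any quantifier of f."""
    tag = f[0]
    if tag == "atom":
        return ("atom", f[1], tuple(t if u == x else u for u in f[2]))
    if tag in ("ex", "all"):
        return f if f[1] == x else (tag, f[1], subst(f[2], x, t))
    return (tag,) + tuple(subst(g, x, t) for g in f[1:])

def later(K, A):
    """All states B with B >= A."""
    return [B for B in K["states"] if (A, B) in K["leq"]]

def forces(K, A, f):
    """A forces the formula f, following clauses (1)-(7) of KSD."""
    tag = f[0]
    understood = terms_of(f) <= K["T"][A]
    if tag == "atom":
        return f in K["F"][A]
    if tag == "or":
        return understood and (forces(K, A, f[1]) or forces(K, A, f[2]))
    if tag == "and":
        return forces(K, A, f[1]) and forces(K, A, f[2])
    if tag == "imp":
        return understood and all(
            forces(K, B, f[2]) for B in later(K, A) if forces(K, B, f[1]))
    if tag == "not":
        return understood and not any(forces(K, B, f[1]) for B in later(K, A))
    if tag == "ex":
        return any(forces(K, A, subst(f[2], f[1], t)) for t in K["T"][A])
    if tag == "all":
        return all(forces(K, B, subst(f[2], f[1], t))
                   for B in later(K, A) for t in K["T"][B])
    raise ValueError(tag)

def forces_negative(K, A, f):
    """Clause (8): A forces the negative of f."""
    return terms_of(f) <= K["T"][A] and not forces(K, A, f)

# A two-state system A <= B in which Px becomes known only at B.
Px = ("atom", "P", ("x",))
K = {"states": ["A", "B"],
     "leq": {("A", "A"), ("A", "B"), ("B", "B")},
     "T": {"A": {"x"}, "B": {"x"}},
     "F": {"A": set(), "B": {Px}}}
lem = ("or", Px, ("not", Px))
print(forces(K, "A", lem), forces(K, "A", ("not", ("not", lem))))
```

In this small system the state A forces the negative of Px∨¬Px, yet it forces ¬¬(Px∨¬Px).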

Some clauses in KSD call for comment. Suppose we are at state A.
Intuitively, we say that we “understand” α if all the terms of α are in T(A).
To be sure that β→γ is true, we must understand this formula, and we
must also be certain that if at any future time (i.e., at any state B that may
follow A) we come to accept β as true, then at that time we also accept γ
as true; hence clause (4). To be sure that ¬β is true, we must understand
β and be certain that β will never be accepted as true; hence clause (5).
Finally, to be sure that ∀xβ is true, we must be certain that whenever any
term t comes to be known, then β(x/t) shall be accepted as true; hence
clause (7). The other clauses in KSD call for no special comment.
Note that if A ⊩ ¬α then A ⊩ −α; but the converse is not necessarily true.
From now on, when we say “Kripke system”, without any further
qualification, we mean Kripke system for ℒ or for any language obtained
from ℒ by adding new constants.
We say that a set Φ of ±formulas is enforceable if there exists a Kripke
system 𝔎 and a state A of 𝔎 such that A ⊩ Φ.

6.3. Theorem. If a set Φ of formulas is satisfiable (in the sense of Def. 2.1.5),
then Φ is enforceable.
Proof. Let σ be a valuation such that σ ⊨ Φ, and let U be the universe of σ.
By adding new constants, if necessary, we get a language ℒ′ and an
ℒ′-expansion¹ σ′ of σ such that for every a ∈ U there is a constant 𝐚 in
ℒ′ for which 𝐚^σ′ = a.
We define a Kripke system for ℒ′. The system will have just one state, A.
As T(A) we take the set of all ℒ′-terms. As F(A) we take the set of all
atomic ℒ′-formulas satisfied by σ′. If α is any ℒ′-formula, then by
induction on deg α it is easy to see that A ⊩ α iff σ′ ⊨ α. In particular, A ⊩ Φ. ∎

6.4. Problem. Show that if Φ is finite, then in the proof of Thm. 6.3 there
is no need to extend ℒ. (Use Thm. 3.3.1 to show that without loss of
generality we can assume that for each a ∈ U there is a variable x such that
x^σ = a.)

6.5. Counter-Example. The converse of Thm. 6.3 does not always hold.
Let P be a unary predicate symbol of ℒ, and consider the following Kripke
system:

    Σ = {A₁, A₂, A₃, ...};
    Aₙ ≤ Aₘ iff n ≤ m;
    for all n the set T(Aₙ) consists of all the variables;
    F(Aₙ) = {Pvᵢ : 1 ≤ i < n}.

Then for all n we have Aₙ ⊩ −Pvₙ; also Aₙ₊₁ ⊩ Pvₙ, hence Aₙ ⊩ −¬Pvₙ.
Therefore Aₙ ⊩ −Pvₙ∨¬Pvₙ, so

    Aₙ ⊩ −∀x[Px∨¬Px].

Since this is the case for every n, we have also

    Aₙ ⊩ ¬∀x[Px∨¬Px].

¹ For the meaning of this, see §9 of Ch. 2.

6.6. Problem. Let P be a unary predicate symbol of ℒ. Show that the
sentence

    ¬[∀x¬¬Px→¬¬∀xPx]

is enforceable.

We write “Φ ⊩ α” when we want to assert that Φ, −α is not enforceable.
This means that for every Kripke system 𝔎 and state A of 𝔎, if A ⊩ Φ and
all the terms of α are in T(A), then A ⊩ α. In particular, if Φ = ∅ we write
“⊩ α” and say that α is K-valid. This means that for any Kripke system
𝔎 and any state A of 𝔎 we have A ⊩ α provided all the terms of α are in T(A).

6.7. Theorem. Let Φ be any set of formulas. If Φ ⊩ α then Φ ⊨ α. In
particular, if ⊩ α then ⊨ α.
Proof. If Φ ⊭ α, then Φ, ¬α is satisfiable. By Thm. 6.3 there is a state
A of some Kripke system such that A ⊩ Φ, ¬α, and hence A ⊩ Φ, −α. Thus
Φ ⊮ α. ∎
6.8. Counter-Examples. The converse of Thm. 6.7 does not always hold.
From 6.5 we see at once that the following logically valid formulas are
not K-valid:

    Px∨¬Px,  ∀x[Px∨¬Px],  ¬¬∀x[Px∨¬Px].

From 6.6 we see that the logically valid sentences

    ∀x¬¬Px→¬¬∀xPx,  ¬¬[∀x¬¬Px→¬¬∀xPx]

are not K-valid.

6.9. Problem. Let P and Q be unary predicate symbols, and let x and y
be distinct variables. Show that the logically valid formula

    ∀x[Py∨Qx] → Py∨∀xQx

is not K-valid. (Consider two states A, B where A ≤ B. Put T(A) = {y},
T(B) = {x,y}; F(A) = {Qy}, F(B) = {Py, Qy}. Show that the negative of
the above formula is forced by A.)
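The two-state system suggested in the hint is small enough to check by direct computation. The Python sketch below is only an illustration and not a substitute for the argument the problem asks for: each formula is modelled as a function from states to truth values, and only clauses (1), (2), (4) and (7) of KSD are implemented, since no others are needed here. All names are assumptions made for this sketch.

```python
# Two states A <= B, as in the hint to Prob. 6.9.
STATES = ["A", "B"]
LEQ = {("A", "A"), ("A", "B"), ("B", "B")}
T = {"A": {"y"}, "B": {"x", "y"}}
F = {"A": {("Q", "y")}, "B": {("P", "y"), ("Q", "y")}}

def later(s):
    """All states u with u >= s."""
    return [u for u in STATES if (s, u) in LEQ]

# A formula is modelled as a function: state -> is it forced there?
def P(t):                       # clause (1)
    return lambda s: ("P", t) in F[s]

def Q(t):                       # clause (1)
    return lambda s: ("Q", t) in F[s]

def Or(a, b, terms):            # clause (2); `terms` = terms of the disjunction
    return lambda s: terms <= T[s] and (a(s) or b(s))

def Imp(a, b, terms):           # clause (4)
    return lambda s: terms <= T[s] and all(b(u) for u in later(s) if a(u))

def All(body):                  # clause (7); body(t) plays the role of beta(x/t)
    return lambda s: all(body(t)(u) for u in later(s) for t in T[u])

antecedent = All(lambda t: Or(P("y"), Q(t), {"y", t}))   # ∀x[Py ∨ Qx]
consequent = Or(P("y"), All(lambda t: Q(t)), {"y"})      # Py ∨ ∀xQx
formula = Imp(antecedent, consequent, {"y"})

# Clause (8): the formula is understood at A (its only term is y) but not forced.
negative_forced_at_A = {"y"} <= T["A"] and not formula("A")
print(antecedent("A"), consequent("A"), negative_forced_at_A)
```

The antecedent is forced at A because Py holds at B, where x first becomes known, while the consequent fails at A because Qx is never accepted.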

In what follows we shall need to consider two simple operations on Kripke
systems.
First, let 𝔎 be any Kripke system, let t be a term that does not belong
to T(A) for any state A of 𝔎, and let s be an arbitrary term. We let 𝔎(s/t)
be the Kripke system obtained from 𝔎 when s is replaced by t. (Thus the
states of 𝔎(s/t) and their partial ordering are the same as in 𝔎; but whenever
s belongs to T_𝔎(A), we replace s by t in T_𝔎(A) and in all the atomic formulas
of F_𝔎(A).) Now let α be any formula such that neither s nor t is among
the terms of α. It is easy to see that

    A ⊩ α(x/s) in 𝔎 iff A ⊩ α(x/t) in 𝔎(s/t).

The same applies also to −α(x/s) and −α(x/t).

Next, let 𝔎, s and t be as above. We let 𝔎(s,t) be the Kripke system
obtained from 𝔎 by “introducing t as a synonym for s”. More explicitly,
we leave the states of 𝔎 and their partial ordering unchanged; but whenever
s belongs to T_𝔎(A) we add t as well, and we add to F_𝔎(A) all formulas
obtained from a formula of F_𝔎(A) by replacing one or more occurrences
of s by occurrences of t. Now let α be any formula such that t is not a term
of α(x/s). It is easy to see that

    A ⊩ α(x/s) in 𝔎 iff A ⊩ α(x/t) in 𝔎(s,t).

The same applies also to −α(x/s) and −α(x/t).
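The two operations act only on the mappings T and F, so they can be sketched without mentioning the states' ordering. In the Python sketch below, T and F are dictionaries keyed by state, atomic formulas are (predicate, arguments) pairs, and the function names `replace_term` and `add_synonym` are hypothetical labels for 𝔎(s/t) and 𝔎(s,t) respectively.

```python
from itertools import product

# Assumed encoding: T maps each state to a set of terms (strings); F maps
# each state to a set of atomic formulas written as (predicate, args).

def replace_term(T, F, s, t):
    """K(s/t): replace s by t everywhere. Requires that t is in no T(A)."""
    assert all(t not in terms for terms in T.values())
    new_T = {A: {t if u == s else u for u in terms} for A, terms in T.items()}
    new_F = {A: {(p, tuple(t if u == s else u for u in args))
                 for p, args in atoms}
             for A, atoms in F.items()}
    return new_T, new_F

def add_synonym(T, F, s, t):
    """K(s,t): introduce t as a synonym for s. Requires that t is in no T(A)."""
    assert all(t not in terms for terms in T.values())
    new_T = {A: terms | ({t} if s in terms else set())
             for A, terms in T.items()}
    new_F = {}
    for A, atoms in F.items():
        new_F[A] = set()
        for p, args in atoms:
            # Each occurrence of s may independently stay s or become t,
            # which covers "one or more occurrences" plus the original atom.
            choices = [(u, t) if u == s else (u,) for u in args]
            new_F[A] |= {(p, combo) for combo in product(*choices)}
    return new_T, new_F

T = {"A": {"s", "u"}}
F = {"A": {("R", ("s", "s")), ("P", ("u",))}}
print(replace_term(T, F, "s", "t"))
print(add_synonym(T, F, "s", "t"))
```

With the sample data, the synonym operation turns the single atom R(s,s) into the four atoms R(s,s), R(s,t), R(t,s) and R(t,t), while leaving P(u) untouched.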


One more bit of notation: if Φ is a set of ±formulas, we put Φ⁺ for the
set of all formulas in Φ.

Finally, let us agree to say that a branch in an intuitionistic tableau is
enforceable if the set of ±formulas of the branch is enforceable.
We are now ready to prove:

6.10. Lemma. Let an enforceable branch in a first-order intuitionistic tableau
be extended to one or two new branches by one of the thirteen rules. Then the
new branch (or, if there are two, at least one of them) is enforceable.
Proof. Let Φ be the set of ±formulas of the given branch. We consider
the cases of the quantifier rules, leaving the propositional rules to the reader.
First, take the ∃ rule. Here the ±formulas of the new branch are
Φ, α(x/y), where ∃xα ∈ Φ and y is not free in any ±formula of Φ. Let
A be a state in some Kripke system 𝔎 such that A ⊩ Φ. Since ∃xα ∈ Φ,
it follows from KSD that for some s ∈ T(A) we have A ⊩ α(x/s). If s happens
to be y, we are through. If s ≠ y, we may assume without loss of generality
that y ∉ T(B) for all states B of 𝔎. (If this is not so, we consider 𝔎(y/a)
instead of 𝔎, where a is a new constant that does not belong to T(B) for
any state B of 𝔎. By what we have seen above, A will force Φ in 𝔎(y/a)
as well.) Now consider 𝔎(s,y). Since A forces Φ, α(x/s) in 𝔎, it follows
that A must force Φ, α(x/y) in 𝔎(s,y).
Next, consider the −∃ rule. Here the ±formulas of the new branch are
Φ⁺, −α(x/t), where −∃xα ∈ Φ. Again, let A be a state in some Kripke
system 𝔎 such that A ⊩ Φ. Since −∃xα ∈ Φ, it follows from KSD that
A ⊩ −α(x/s) for every s ∈ T(A). If t ∈ T(A), we are through. Otherwise,
we may assume that t ∉ T(B) for all states B of 𝔎. (If this is not so, we can
choose a new constant a and consider 𝔎(t/a) instead of 𝔎.) Now take any
s ∈ T(A) and consider 𝔎(s,t). It is clear that in 𝔎(s,t) the state A forces
Φ (hence also Φ⁺) as well as −α(x/t).
The case of the ∀ rule is similar to that of the −∃ rule.
Finally, take the −∀ rule. Here the ±formulas of the new branch are
Φ⁺, −α(x/y), where −∀xα ∈ Φ and y is not free in Φ. Now let B be a state
in some Kripke system 𝔎 such that B ⊩ Φ. Since −∀xα ∈ Φ, it follows
from KSD that for some A, where B ≤ A, and some s ∈ T(A), we have
A ⊩ −α(x/s). Also, since B ≤ A, it follows by Lemma 6.2 that A ⊩ Φ⁺.
From this point on we can argue as in the case of the ∃ rule. ∎

We can now show that the method of first-order intuitionistic tableaux is
sound relative to Kripke’s semantics:

6.11. Theorem. Let Φ be a finite set of formulas. If Φ ▷ ψ, then Φ ⊩ ψ.
Also, if Φ ▷ then Φ is not enforceable.
Proof. Suppose Φ ⊮ ψ. Then, by definition, Φ, −ψ is enforceable. By
Lemma 6.10 it follows that in any first-order intuitionistic tableau for
Φ, −ψ at least one branch must be enforceable. But then this branch cannot
contain a formula together with its negative, and hence cannot be closed.
It is therefore impossible that Φ ▷ ψ.
The second part of the theorem is proved similarly. ∎

6.12. Problem. Consider the following method of Beth tableaux. We
modify the method studied above in three ways. First, we allow the initial
node to contain more than one negative. Second, instead of the two rules
−∨₁ and −∨₂ we have one −∨ rule:

    −α∨β
      |
     −α
     −β

Note that this rule is not starred. Third, we remove the stars from the rules
−∧, →, ¬ and −∃ (so that only the three rules −→, −¬ and −∀
remain starred).
If Φ is a set of ±formulas, we write “Φ ▷_B” as short for “we can construct
a closed Beth tableau for Φ”.
Prove that if Φ ▷_B then Φ is not enforceable. (Note that the star in
the three starred rules can be interpreted as “transition to a possible future
state”. The proof in detail is like that of 6.11.)

§ 7. The Elimination Theorem for intuitionistic tableaux

In §§8-9 we shall introduce intuitionistic versions of the propositional and
predicate calculi, and we shall want to show in a constructive way that
they are equivalent to the methods of intuitionistic propositional and
first-order tableaux. For this purpose, just as in the corresponding
classical case, we shall need an elimination theorem. The main work
is done in the following:

7.1. Elimination Lemma. Let Φ be a finite set of formulas. If Φ, δ ▷ ξ and
Φ ▷ δ, then Φ ▷ ξ.
Proof. Let T₁ and T₂ be given proofs of Φ, δ ▷ ξ and Φ ▷ δ respectively.
Thus T₁ and T₂ are closed first-order intuitionistic tableaux for Φ, δ, −ξ
and Φ, −δ. (Here and in what follows, if ξ happens to be the empty string
of symbols then −ξ too is considered to be the empty string.)
Let r be the least number such that δ is not used in T₁ below the rth level
(neither to extend a branch nor to close one). If r = 0, then either δ is not
used at all in T₁, in which case T₁ proves Φ ▷ ξ and we are through,
or δ is used at level 0 only, to close T₁ at once. But in this latter case δ
must be (atomic and) the same as ξ, so that T₂ is a proof of Φ ▷ ξ and
again we are through. Thus we may assume r > 0.
Similarly, let s be the least number such that −δ is not used in T₂ below
the sth level. If −δ is not used at all in T₂, then T₂ proves Φ ▷, hence we have
Φ ▷ ξ by Prob. 5.18. If −δ is only used at the 0th level, to close T₂ at once,
then δ must be (atomic and) in Φ, so that T₁ is actually a proof of Φ ▷ ξ.
Thus we may assume s > 0. So r+s ≥ 2.
Our lemma will be proved by induction on deg δ; this is the primary
induction of our proof. But within the induction step of our primary
induction we shall use a secondary induction on r+s.
To help the reader find his way through this double induction, we present
here a schematic plan of our proof:

THE LEMMA

    Case 0: Basis of primary induction, deg δ = 0.

    Case 1: Induction step of primary induction, deg δ > 0.

        Case 1.0: Basis of secondary induction, r+s = 2.
                  Sub-cases according to the form of δ.

        Case 1.1: r ≥ 2, s = 1.
                  Sub-cases according to level 1 of T₁.

        Case 1.2: s ≥ 2.
                  Sub-cases according to level 1 of T₂.
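The plan amounts to a case split on the triple (deg δ, r, s), together with a lexicographic measure that decreases whenever an induction hypothesis is invoked. The following Python sketch merely restates this bookkeeping; the function names are illustrative and the sketch assumes r > 0 and s > 0, as arranged before the induction starts.

```python
def classify(deg_delta, r, s):
    """Name the case of the plan that handles the triple (deg δ, r, s)."""
    if r < 1 or s < 1:
        raise ValueError("r = 0 or s = 0 is settled before the induction starts")
    if deg_delta == 0:
        return "Case 0"      # basis of the primary induction
    if r + s == 2:
        return "Case 1.0"    # basis of the secondary induction, r = s = 1
    if s == 1:
        return "Case 1.1"    # here r >= 2
    return "Case 1.2"        # here s >= 2 and r >= 1

def measure(deg_delta, r, s):
    """Primary induction on deg δ, secondary induction on r + s."""
    return (deg_delta, r + s)

# Python compares tuples lexicographically, which matches the double induction:
# a smaller degree wins outright; for equal degree, a smaller r + s wins.
print(classify(2, 3, 1), measure(1, 9, 9) < measure(2, 1, 1))
```

An appeal to the primary induction hypothesis lowers the first component of the measure, and an appeal to the secondary one keeps the first component fixed while lowering the second.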

Now for the actual proof.

Case 0: deg δ = 0. Then δ is atomic and the only way in which it can
be used in T₁ is for closing branches in which −δ crops up. But since we
have Φ at the initial node of T₁, and we are given a proof T₂ of Φ ▷ δ, we can
replace the use of δ in T₁ by an appeal to T₂; for by T₂ any branch of T₁
in which −δ crops up is as good as closed without using δ. (Here we are
employing Prob. 5.18 again. Below we shall often use that result, without
special mention.) Thus we can get rid of δ in T₁ and obtain a proof of
Φ ▷ ξ, as required.
Case 1: δ is not atomic. Then δ can only be used for one of the six
rules for formulas (the rules without minus) and −δ can only be used for
the corresponding minus rule (or, in the case of disjunction, one of the
two corresponding minus rules). We now start our secondary induction.
Case 1.0: r+s = 2, so that r = s = 1. Thus δ and −δ are used to yield level
1 of T₁ and T₂ respectively, but are never used after that. (In fact, −δ
cannot be used any more because it gets killed.) We consider the various
possible forms of δ.
Case 1.0.1: δ = α∨β. Then T₁ and T₂ start thus:

    Φ, α∨β, −ξ             Φ, −α∨β
       /  \                    |*
      α    β                −α or −β

Since α∨β is not used any further in T₁, we get, from the left-hand part of T₁,

(1) a proof of Φ, α ▷ ξ,

and from the right-hand part of T₁ we get

(2) a proof of Φ, β ▷ ξ.

Also, from T₂ we get a proof of Φ ▷ α or a proof of Φ ▷ β. Using this
with (1) or (2), as the case may be, we obtain, by the primary induction
hypothesis, a proof of Φ ▷ ξ, as required.
Case 1.0.2: δ = α∧β. Then T₁ and T₂ start thus:

    Φ, α∧β, −ξ             Φ, −α∧β
        |                   */  \*
        α                  −α    −β
        β

From T₁ we get

(3) a proof of Φ, α, β ▷ ξ.

From the right-hand part of T₂ we get a proof of Φ ▷ β and hence a proof
of Φ, α ▷ β. From this and (3) we get, by our primary induction hypothesis,

(4) a proof of Φ, α ▷ ξ.

Also, from the left-hand part of T₂ we get a proof of Φ ▷ α; and using this
and (4) we get, again by the primary induction hypothesis, a proof of
Φ ▷ ξ.
Case 1.0.3: δ = α→β. T₁ and T₂ start thus:

    Φ, α→β, −ξ             Φ, −α→β
      */  \                    |*
     −α    β                   α
                              −β

From T₁ we get

(5) a proof of Φ ▷ α

and

(6) a proof of Φ, β ▷ ξ.

Also, from T₂ we get a proof of Φ, α ▷ β. Using this and (5) we get, by
the primary induction hypothesis, a proof of Φ ▷ β; and from this and (6)
we get a proof of Φ ▷ ξ.
Case 1.0.4: δ = ¬α. T₁ and T₂ start thus:

    Φ, ¬α, −ξ              Φ, −¬α
        |*                    |*
       −α                     α

We get proofs of Φ ▷ α and Φ, α ▷ and hence, by the primary induction
hypothesis, a proof of Φ ▷ and hence a proof of Φ ▷ ξ.
Case 1.0.5: δ = ∃xα. T₁ and T₂ start thus:

    Φ, ∃xα, −ξ             Φ, −∃xα
        |                      |*
      α(x/y)                −α(x/t)

Here y is not free in Φ, ∃xα, ξ. By Prob. 5.17 we may assume that y is
not bound in α, so that in performing the substitution α(x/y) no alphabetic
changes are made. From T₁ we obtain a proof of Φ, α(x/y) ▷ ξ. Hence
by Prob. 5.19, substituting t for y, we get a proof of Φ, α(x/t) ▷ ξ. Using
this and the proof of Φ ▷ α(x/t), which we get from T₂, we obtain, by the
induction hypothesis, a proof of Φ ▷ ξ.
Case 1.0.6: δ = ∀xα. This is treated by the same method as case 1.0.5.

We have now established the basis of our secondary induction and we
must start the secondary induction step. Here we assume r+s > 2. We
distinguish two cases: either s = 1 and r ≥ 2 (case 1.1) or s ≥ 2 and r ≥ 1
(case 1.2).
Case 1.1: s = 1 and r ≥ 2. We consider the various ways in which level 1
of T₁ could have been obtained. There are two possibilities. First, level 1
of T₁ was obtained by using some ±formula other than δ (i.e., a formula of
Φ or −ξ); this possibility is covered by cases 1.1.1-1.1.2 below. Second,
level 1 of T₁ was obtained by using δ; this possibility is covered by cases
1.1.3-1.1.8.
Case 1.1.1: Level 1 of T₁ was obtained by applying a non-splitting rule
(i.e., any rule except ∨, −∧, →) to a formula of Φ or to −ξ. Here there
are ten sub-cases to consider, according to which rule was used, but they
are all treated by the same method. Consider, e.g., the sub-case where
the rule in question is −→. Then T₁ starts thus:

    Φ, δ, −ξ
       |*
       α
      −β

where ξ = α→β. From this we get

(7) a proof of Φ, α, δ ▷ β.

Moreover, if r′ is the least number such that δ is not used in the proof (7)
below the (r′)th level, then r′ = r−1. (This is because, when (7) is obtained
from T₁, level r of T₁ becomes level r−1 of (7).)
Also, the given proof T₂ of Φ ▷ δ can be converted, without changing
s, into a proof of Φ, α ▷ δ. From this and (7) we get, using our secondary
induction hypothesis,

(8) a proof of Φ, α ▷ β.

We now construct a proof of Φ ▷ ξ. We start, as in T₁, by applying the
−→ rule to −ξ (i.e., to −α→β), thus:

    Φ, −ξ
      |*
      α
     −β

and this is as good as closed because we have (8).
The other nine sub-cases are left to the reader.


Case 1.1.2: Level 1 of T₁ was obtained by applying one of the rules
∨, −∧, → to a formula of Φ or to −ξ. Consider, e.g., the sub-case
where the rule in question is →. Then T₁ starts thus:

    Φ, δ, −ξ
     */  \
    −α    β

where α→β ∈ Φ. From the left-hand part of T₁ we get

(9) a proof of Φ, δ ▷ α,

and the least number r′ such that δ is not used in (9) below the (r′)th level
is smaller than r.
From (9) and T₂, by the secondary induction hypothesis we get

(10) a proof of Φ ▷ α.

Similarly, from the right-hand part of T₁ together with T₂ we get, by the
secondary induction hypothesis,

(11) a proof of Φ, β ▷ ξ.

We construct a proof of Φ ▷ ξ as follows. We start by using the formula
α→β ∈ Φ as in T₁. By (10) and (11) this is as good as closed.
The remaining two sub-cases (those of ∨ and −∧) are similar and are
left to the reader.
Case 1.1.3: Level 1 of T₁ was obtained by applying the ∨ rule to δ.
Then T₁ starts thus:

    Φ, δ, −ξ
      /  \
     α    β

where δ = α∨β. But in this case δ need not be used ever again in T₁. For
example, if further down in the left-hand part of T₁ we have again

      /  \
     α    β

then we can cut out this new node {β} as well as all the nodes below it,
and also the new node {α} (but not the nodes below it) because we already
have α above this part of T₁. Thus by cutting out redundant parts of T₁
we can reduce r to 1. Now we are back to case 1.0, which we have covered
before.
Case 1.1.4: Level 1 of T₁ was obtained by applying the ∧ rule to δ.
This is similar to (but slightly simpler than) case 1.1.3 and we leave the details
to the reader.
Case 1.1.5: Level 1 of T₁ was obtained by applying the → rule to δ.
Then T₁ begins thus:

    Φ, δ, −ξ
     */  \
    −α    β

where δ = α→β. Here we may assume that δ is not used again in the
right-hand part of T₁, because any such use would be redundant, as explained
in case 1.1.3. But in the left-hand part of T₁ it may be necessary to use
δ again, because −α may get killed. Thus from the right-hand part of
T₁ we get

(12) a proof of Φ, β ▷ ξ

and from the left-hand part of T₁ we only get

(13) a proof of Φ, δ ▷ α.

But if r′ is the least number such that δ is not used in (13) after the (r′)th
level, then r′ < r. Therefore from T₂ and (13) we get, by the secondary
induction hypothesis,

(14) a proof of Φ ▷ α.

We now construct a proof of Φ, δ ▷ ξ as follows. We start by applying
the → rule to δ, getting

    Φ, δ, −ξ
     */  \
    −α    β

and, since we have (12) and (14), we can continue this proof of Φ, δ ▷ ξ
without ever using δ again. Thus we have reduced r to 1, and we are back
to case 1.0, which we have already covered.
Case 1.1.6: Level 1 of T₁ was obtained by applying the ¬ rule to δ.
This is similar to (but slightly simpler than) case 1.1.5, and we leave it to
the reader.
Case 1.1.7: Level 1 of T₁ was obtained by applying the ∃ rule to δ.
Then T₁ begins thus:

    Φ, δ, −ξ
       |
     α(x/y)

where δ = ∃xα and y is not free in Φ, δ, ξ. By Prob. 5.17 we may assume
that y is not bound in α, so that when we substitute y for x in α no alphabetic
changes are made. From T₁ we get

(15) a proof of Φ, α(x/y), δ ▷ ξ,

and if r′ is the least number such that δ is not used in (15) after the (r′)th
level, then r′ = r−1.
Also, the proof T₂ of Φ ▷ δ can be converted, without changing s, into
a proof of Φ, α(x/y) ▷ δ. From this and (15) we get, by the secondary
induction hypothesis,

(16) a proof of Φ, α(x/y) ▷ ξ.

Now let us examine T₂. Since we have assumed s = 1, T₂ must start thus:

    Φ, −δ
      |*
    −α(x/t)

where t is some term. This yields

(17) a proof of Φ ▷ α(x/t).

Also, by Prob. 5.19 we can substitute t for y in (16) and get a proof of
Φ, α(x/t) ▷ ξ. From this and (17) we get, by the primary induction
hypothesis, a proof of Φ ▷ ξ.
Case 1.1.8: Level 1 of T₁ was obtained by applying the ∀ rule to δ.
Then T₁ starts thus:

    Φ, δ, −ξ
       |
     α(x/t)

where t is a term and δ = ∀xα. From this we get

(18) a proof of Φ, α(x/t), δ ▷ ξ,

and if r′ is the least number such that δ is not used in (18) below the
(r′)th level, then r′ = r−1.
Also, the proof T₂ of Φ ▷ δ can be converted, without changing s, into
a proof of Φ, α(x/t) ▷ δ. From this and (18) we get, by the secondary
induction hypothesis,

(19) a proof of Φ, α(x/t) ▷ ξ.

Now consider T₂. Since s = 1, T₂ must start thus:

    Φ, −δ
      |*
    −α(x/y)

where y is not free in Φ nor in δ. Again, we may assume that the
substitution of y for x in α does not require any alphabetic change. Thus T₂ yields
a proof of Φ ▷ α(x/y); and by Prob. 5.19 we may substitute t for y and
get a proof of Φ ▷ α(x/t). From this and (19) we get, by the primary
induction hypothesis, a proof of Φ ▷ ξ.
We have now covered all possible subcases of case 1.1.
Case 1.2: s ≥ 2. We consider level 1 of T₂. The only way in which this
level could have been obtained is by applying one of the rules ∨, ∧, →,
∃, ∀ to a formula of Φ, because any other move would kill −δ at once,
contradicting our assumption that s ≥ 2. (The → rule kills −δ only on
one branch, so it must be considered as a possibility.) The rules ∧, ∃, ∀
will be considered together (case 1.2.1) and each of the rules ∨, → will be
considered separately (cases 1.2.2-1.2.3).
Case 1.2.1: Level 1 of T₂ was obtained by applying one of the rules
∧, ∃, ∀ to some formula α ∈ Φ. Then T₂ starts thus:

    Φ, −δ
      |
      Γ

where Γ is a set of one or two formulas (one in the case of ∃ and ∀, two
in the case of ∧). From T₂ we get

(20) a proof of Φ ∪ Γ ▷ δ;

and if s′ is the least number such that −δ is not used in (20) below the
(s′)th level, then s′ = s − 1.
Also, from T₁ we get, without increasing r, a proof of Φ ∪ Γ, δ ▷ ξ. From
this and (20) we get, by the secondary induction hypothesis,

(21) a proof of Φ ∪ Γ ▷ ξ.

We construct a proof of Φ ▷ ξ as follows. Starting as in T₂, we use α ∈ Φ
to obtain

    Φ, −ξ
      |
      Γ

and this is as good as closed by (21).


Case 1.2.2: Level 1 of T₂ was obtained by applying the rule ∨ to some
formula of Φ. Then T₂ starts thus:

    Φ, −δ
     /  \
    α    β

where α∨β ∈ Φ. We consider the left-hand and right-hand parts of T₂ in
turn, and using the same method as in case 1.2.1 we get

(22) proofs of Φ, α ▷ ξ and Φ, β ▷ ξ.

We construct a proof of Φ ▷ ξ as follows. Starting as in T₂, we apply the
rule ∨ to α∨β. By (22) this is as good as closed.
Case 1.2.3: Level 1 of T₂ was obtained by applying the rule → to some
formula of Φ. Then T₂ starts thus:

     Φ, −δ
     */  \
    −α    β

where α→β ∈ Φ. From the left-hand part of T₂ we get

(23) a proof of Φ ▷ α.

And from the right-hand part of T₂ we get, as in case 1.2.1,

(24) a proof of Φ, β ▷ ξ.

Now start a proof of Φ ▷ ξ by applying the rule → to α→β:

     Φ, −ξ
     */  \
    −α    β

By (23) and (24) this is as good as closed.

7.2. Corollary. If Φ, δ ▷₀ ξ and Φ ▷₀ δ, then Φ ▷₀ ξ.


Proof. Same as proof of 7.1, except that the basis of the primary induction
is the case where δ is a prime (rather than atomic) formula, and all
consideration of the quantifier rules is left out.

We now introduce (temporarily) an additional fourteenth rule for extending
intuitionistic tableaux. This rule, called “the ± rule”, is analogous
to the EM-rule of classical tableaux and is represented schematically as:

     /\
    δ  −δ

where δ is any formula. However:

7.3. Elimination Theorem. Given a proof of Φ ▷ ξ (or of Φ ▷₀ ξ) which
uses the ± rule, we can construct a proof of the same statement without
using that rule.
Proof. Similar to 1.8.7. ∎

7.4. Problem. Show that if Φ ▷ α and Φ ▷ α→β, then Φ ▷ β. Similarly
with “▷₀” instead of “▷”. (Apply ± rule with δ = α→β, then use 7.3.)
7.5. Problem. Show that if Φ ▷ α and Φ ▷ ¬α then Φ ▷. Similarly with
“▷₀” instead of “▷”. (Apply ± rule with δ = ¬α, then use 7.3.)

§ 8. Intuitionistic propositional calculus

As axioms of the intuitionistic propositional calculus we take all formulas
of the following forms:

(8.1) α→β→α,

(8.2) (α→β→γ)→(α→β)→α→γ,

(8.3) α→β→α∧β,

(8.4) α∧β→α,

(8.5) α∧β→β,

(8.6) α→α∨β,

(8.7) β→α∨β,

(8.8) (α→γ)→(β→γ)→α∨β→γ,

(8.9) (α→β)→(α→¬β)→¬α,

(8.10) ¬α→α→β.

As rule of inference we take modus ponens.
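
Since modus ponens is the only rule of inference, checking a purported deduction in this calculus is a purely mechanical matter. The following Python sketch illustrates this: it matches formulas against the schemes (8.1)-(8.10) and verifies a five-line deduction of p→p. The tuple encoding of formulas and all function names are assumptions made for this illustration; they are not notation from the text.

```python
# Formulas are nested tuples: ("var", "p"), ("imp", A, B), ("and", A, B),
# ("or", A, B), ("not", A). In the schemes, plain strings are metavariables.

def imp(a, b): return ("imp", a, b)
def conj(a, b): return ("and", a, b)
def disj(a, b): return ("or", a, b)
def neg(a): return ("not", a)

A, B, C = "A", "B", "C"  # metavariables standing for alpha, beta, gamma

SCHEMES = {
    "8.1": imp(A, imp(B, A)),
    "8.2": imp(imp(A, imp(B, C)), imp(imp(A, B), imp(A, C))),
    "8.3": imp(A, imp(B, conj(A, B))),
    "8.4": imp(conj(A, B), A),
    "8.5": imp(conj(A, B), B),
    "8.6": imp(A, disj(A, B)),
    "8.7": imp(B, disj(A, B)),
    "8.8": imp(imp(A, C), imp(imp(B, C), imp(disj(A, B), C))),
    "8.9": imp(imp(A, B), imp(imp(A, neg(B)), neg(A))),
    "8.10": imp(neg(A), imp(A, B)),
}

def match(scheme, formula, env):
    """Extend env so that the scheme, instantiated by env, equals the formula."""
    if isinstance(scheme, str):  # a metavariable
        if scheme in env:
            return env[scheme] == formula
        env[scheme] = formula
        return True
    if scheme[0] != formula[0] or len(scheme) != len(formula):
        return False
    return all(match(s, f, env) for s, f in zip(scheme[1:], formula[1:]))

def is_axiom(formula):
    """True if the formula is an instance of one of the ten schemes."""
    return any(match(s, formula, {}) for s in SCHEMES.values())

def check_deduction(lines, hypotheses=()):
    """True if every line is a hypothesis, an axiom, or follows by modus ponens."""
    for i, f in enumerate(lines):
        earlier = lines[:i]
        by_mp = any(imp(g, f) in earlier for g in earlier)
        if not (f in hypotheses or is_axiom(f) or by_mp):
            return False
    return True

p = ("var", "p")
deduction_of_p_imp_p = [
    imp(imp(p, imp(imp(p, p), p)), imp(imp(p, imp(p, p)), imp(p, p))),  # (8.2)
    imp(p, imp(imp(p, p), p)),                                          # (8.1)
    imp(imp(p, imp(p, p)), imp(p, p)),                                  # modus ponens
    imp(p, imp(p, p)),                                                  # (8.1)
    imp(p, p),                                                          # modus ponens
]
print(check_deduction(deduction_of_p_imp_p))
```

The checker only recognises deductions that are handed to it; it does not search for them, so it gives no decision procedure for deducibility.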

Deduction in this calculus is defined in the usual way. We write “Φ ⊢_I0 ψ”
to indicate that we can construct a deduction of ψ from Φ in this calculus.
Also, we write “Φ ⊢_I0” as short for “for some formula α, both Φ ⊢_I0 α
and Φ ⊢_I0 ¬α”. Using axiom (8.10) it is easy to show that if Φ ⊢_I0 then
Φ ⊢_I0 α for every formula α. If Φ ⊢_I0 we say that Φ is inconsistent.
The Deduction Theorem (if Φ, α ⊢_I0 β then Φ ⊢_I0 α→β) is proved
in the usual way. (Note that Ax I and Ax II of the classical propositional
calculus presented in §10 of Ch. 1, which are the only ones needed to
prove the Deduction Theorem, are present here as (8.1) and (8.2).)

8.11. Problem. Show that if φ is an axiom of the intuitionistic propositional
calculus then ▷₀ φ.

8.12. Theorem. For any finite set Φ of formulas, Φ ▷₀ ξ iff Φ ⊢_I0 ξ.
Proof. Suppose Φ ▷₀ ξ. We proceed by induction on the depth d of a given
proof T of Φ ▷₀ ξ.
If d = 0, then ξ (is atomic and) belongs to Φ. So Φ ⊢_I0 ξ trivially.
If d > 0, we consider nine cases, according to which propositional rule
accounts for level 1 of T. All nine cases are very easy, and the following
hints should suffice.

For the ∨ rule, use the Deduction Theorem and (8.8). For −∨₁ and
−∨₂ use (8.6) and (8.7) respectively. For ∧ use the Deduction Theorem,
(8.4) and (8.5). For −∧ use (8.3). The case of the → rule is trivial. For
−→ use the Deduction Thm. For ¬ use (8.10). For −¬ use the Deduction
Theorem (twice) and (8.9).
In all nine cases we find that Φ ⊢_I0 ξ.
Now assume that Φ ⊢_I0 ξ, and first take the case that ξ is a formula.
Then we have a deduction α₁,...,α_n of ξ from Φ. Using Probs. 5.3, 7.4
and 8.11 it is easy to show by induction on k = 1,...,n that Φ ▷₀ α_k. For
k = n we get Φ ▷₀ ξ, as required.
Finally, suppose that Φ ⊢_I0. Then for some (indeed, for all) α we have
Φ ⊢_I0 α and Φ ⊢_I0 ¬α. Thus, by what we have already shown, Φ ▷₀ α
and Φ ▷₀ ¬α. Hence Φ ▷₀ by Prob. 7.5. ∎
In the sequel we shall often use Thm. 8.12 without special mention.

8.13. Problem. Prove the consistency of the intuitionistic propositional
calculus, i.e., the impossibility of ∅ ⊢_I0.

§ 9. Intuitionistic predicate calculus

As axioms of the intuitionistic predicate calculus we take all formulas of the
following eight groups:

(9.1) All intuitionistic propositional axioms, i.e., all instances of
(8.1)-(8.10);

(9.2) ∀x(α→β)→∀xα→∀xβ;

(9.3) ∀x(α→β)→∃xα→∃xβ;

(9.4) α→∀xα, provided x is not free in α;

(9.5) ∃xα→α, provided x is not free in α;

(9.6) ∀xα→α(x/t), provided t is free for x in α;

(9.7) α(x/t)→∃xα, provided t is free for x in α;

(9.8) All generalizations of axioms of the preceding seven groups.

As rule of inference we adopt modus ponens.

Deduction is defined as usual. We use “⊢_I” for this calculus in the same
way as we use “⊢_I0” for the calculus of §8.

The Deduction Theorem is proved in the usual way.

9.9. Problem. Show that ▷ φ for every axiom φ of the intuitionistic
predicate calculus.
Using Prob. 9.9 it is quite easy to show that if Φ ⊢_I ξ, where Φ is a finite
set of formulas, then also Φ ▷ ξ. But in order to prove the converse of
this (as well as for other purposes) we shall need a few simple technical
results. Since the work here is very similar to the work done in §1 of Ch. 3,
we can proceed rather quickly.
9.10. Theorem. Let x be a variable which is not free in Φ or ξ; then
(i) if Φ ⊢_I α, then Φ ⊢_I ∀xα;
(ii) if Φ, α ⊢_I ξ, then Φ, ∃xα ⊢_I ξ.
Proof. A counterpart of Thm. 3.1.4 can be proved, in exactly the same
way, for our present calculus. Hence (i).
To prove (ii), let us first take the case where ξ is a formula. If Φ, α ⊢_I ξ
then by the Deduction Theorem and (i) we have Φ ⊢_I ∀x(α→ξ). Hence,
using ax. (9.3) we get Φ, ∃xα ⊢_I ∃xξ; and by ax. (9.5) we have Φ, ∃xα ⊢_I ξ
as required.
Now let ξ be empty. If Φ, α ⊢_I, then in particular Φ, α ⊢_I ¬∃xα. Thus,
by what we have already proved, Φ, ∃xα ⊢_I ¬∃xα and therefore Φ, ∃xα ⊢_I
as required. ∎
9.11. Theorem. If β ~ β′, then β and β′ are provably equivalent, i.e., β ⊢_I β′
and β′ ⊢_I β.
Proof. A counterpart of Lemma 3.1.9 can be proved, in a very similar
way, for our present calculus. Hence it is enough to show that if z is
not free in α but is free for x in α, then:
(1) ∀xα and ∀z[α(x/z)] are provably equivalent;
(2) ∃xα and ∃z[α(x/z)] are provably equivalent.
Now, (1) is proved exactly as in Thm. 3.1.10. To prove (2), we notice that,
using ax. (9.7) we have α(x/z) ⊢_I ∃xα; hence by Thm. 9.10(ii) we have

∃z[α(x/z)] ⊢_I ∃xα.

Since alphabetic changes are reversible, we can show similarly that

∃xα ⊢_I ∃z[α(x/z)]. ∎
9.12. Problem. Show that for every α, x and t:

(i) ⊢_I ∀xα→α(x/t);
(ii) ⊢_I α(x/t)→∃xα.
(For (i) proceed exactly as in 3.1.11. But (ii) cannot be proved exactly as
in 3.1.11, because now ∃ is not defined in terms of ∀.)

9.13. Theorem. Let c be a constant that does not occur in Φ, α or ξ; then:
(i) if Φ ⊢_I α(x/c), then Φ ⊢_I ∀xα;
(ii) if Φ, α(x/c) ⊢_I ξ, then Φ, ∃xα ⊢_I ξ.
Proof. (i) is proved exactly as in Thm. 3.1.12.
To prove (ii), suppose first that ξ is a formula. Let y be a variable that
does not occur in α or ξ. Then

α(x/c) = α(x/y)(y/c).

Therefore, if Φ, α(x/c) ⊢_I ξ we have by the Deduction Theorem and (i),

Φ ⊢_I ∀y[α(x/y)→ξ].

Using (9.3) and (9.5) we obtain

Φ, ∃y[α(x/y)] ⊢_I ξ.

But by Thm. 9.11 we know that ∃y[α(x/y)] and ∃xα are provably equivalent,
hence Φ, ∃xα ⊢_I ξ as required.
If ξ is empty, we proceed as in 9.10. ∎

We can now show that the intuitionistic predicate calculus is equivalent
to the method of intuitionistic first-order tableaux:

9.14. Theorem. Let Φ be finite. Then Φ ▷ ξ iff Φ ⊢_I ξ.

Proof. Suppose Φ ▷ ξ. We proceed exactly as in Thm. 8.12, except that
we now have to consider four additional cases, corresponding to the
quantifier rules:
In the case of rule ∃, the induction hypothesis is

Φ, α(x/y) ⊢_I ξ,

where ∃xα ∈ Φ and y is not free in Φ or ξ. By Thm. 9.10(ii) we get

Φ, ∃y[α(x/y)] ⊢_I ξ.

But ∃y[α(x/y)] is a variant of ∃xα, hence provably equivalent to it by
Thm. 9.11. Since we already have ∃xα in Φ, it follows that Φ ⊢_I ξ as
required.
In the case of the −∃ rule, the induction hypothesis is

Φ ⊢_I α(x/t),

where ξ = ∃xα. By Prob. 9.12(ii) we get Φ ⊢_I ξ.

In the case of rule ∀, the induction hypothesis is

Φ, α(x/t) ⊢_I ξ,

where ∀xα ∈ Φ. But then by Prob. 9.12(i) we have Φ ⊢_I α(x/t), hence Φ ⊢_I ξ.
Finally, in the case of the −∀ rule, the induction hypothesis is

Φ ⊢_I α(x/y),

where ξ = ∀xα and y is not free in Φ or ξ. By Thm. 9.10(i) we get
Φ ⊢_I ∀y[α(x/y)] and hence, by Thm. 9.11, Φ ⊢_I ∀xα, i.e., Φ ⊢_I ξ.
Now suppose, conversely, that Φ ⊢_I ξ. To show that Φ ▷ ξ, proceed
as in Thm. 8.12, but this time using Prob. 9.9 instead of Prob. 8.11. ∎

In the sequel we shall often use Thm. 9.14 without special mention.
As an application, we prove the following result, due to Rasiowa and
Sikorski.

9.15. Theorem. Suppose that ⊢_I ∃xα. If ∃xα has terms (i.e., free variables
or constants), then ⊢_I α(x/t) for some such term t. If ∃xα has no terms,
then ⊢_I ∀xα.
Proof. By Prob. 5.14, ⊢_I α(x/t) for some term t. If t is a term of ∃xα
we are through.
If t is not a term of ∃xα (and, in particular, if ∃xα has no terms)
then either t is a variable y not free in ∃xα, or t is a constant c not occurring
in ∃xα. In the first case we have ⊢_I ∀y[α(x/y)] by Thm. 9.10(i) and hence
⊢_I ∀xα by Thm. 9.11. In the second case we have ⊢_I ∀xα by Thm. 9.13(i).
In both cases we have ⊢_I α(x/s) for every term s, by Prob. 9.12(i). ∎

9.16. Remark. Another consequence of Thm. 9.14 is that deducibility
is invariant with respect to language. Suppose that Φ is a set of ℒ-formulas
and ξ is an ℒ-formula or the empty string. Let ℒ′ be an extension of ℒ,
and suppose that Φ ⊢_I ξ in ℒ′. Then Φ₀ ⊢_I ξ in ℒ′, where Φ₀ is some
finite subset of Φ. Hence by Thm. 9.14 we have Φ₀ ▷ ξ in ℒ′. If T is a proof
of Φ₀ ▷ ξ in ℒ′, then all the symbols occurring in T, except perhaps some
constants, must be in ℒ. Now, any constant which occurs in T and does not
belong to ℒ can be replaced by a variable, provided that variable does not
already occur in T. In this way we obtain a proof of Φ₀ ▷ ξ which is entirely
in ℒ. Therefore by Thm. 9.14 we have Φ₀ ⊢_I ξ in ℒ and hence Φ ⊢_I ξ
in ℒ.
9.17. Problem. Prove the consistency of the intuitionistic predicate calculus,
i.e., the impossibility of ∅ ⊢_I.

§10. Completeness

In this section our aim is to prove the completeness of the intuitionistic
predicate calculus relative to Kripke’s semantics.
We shall adopt here a non-constructive attitude. Thus in this section
when we say, e.g., Φ ⊢_I α, we mean that there exists a deduction of α
from Φ, not that we necessarily know how to find such a deduction.
However, note that under this non-constructive interpretation the results
of §9 continue to hold.
If C is a set of new individual constants not belonging to ℒ, we let
ℒ(C) be the language obtained from ℒ by adding the constants in C.
Throughout this section we let λ be the cardinality of ℒ (see Def. 3.3.10).
Then, by Thm. 3.3.11, λ is also the cardinality of the set of all ℒ-formulas.
A set Φ of ℒ-formulas will be called strongly consistent in ℒ if the following
four conditions hold:
(a) Φ is consistent, i.e., Φ ⊬_I.
(b) Whenever α is an ℒ-formula such that Φ ⊢_I α, then α ∈ Φ.
(c) Whenever α∨β ∈ Φ, then α ∈ Φ or β ∈ Φ.
(d) Whenever ∃xα ∈ Φ, then also α(x/t) ∈ Φ for some term t.

10.1. Lemma. Let C be a set, of cardinality λ, of constants not belonging
to ℒ. Let Φ be a set of ℒ-formulas and let γ be an ℒ-formula such that
Φ ⊬_I γ. Then there exists a set Ψ of ℒ(C)-formulas such that Ψ is strongly
consistent in ℒ(C), and Φ ⊆ Ψ, and Ψ ⊬_I γ.
Proof. The cardinality of ℒ(C) is clearly λ. We fix a well-ordering
{φ_ζ: ζ < λ} of all ℒ(C)-formulas. By transfinite recursion we define for
each ζ ≤ λ a set Φ_ζ of ℒ(C)-formulas such that:
(1) Φ_η ⊆ Φ_ζ for all η ≤ ζ,
(2) Φ_ζ ⊬_I γ,
(3) only finitely many, or at most |ζ|, new constants (i.e., constants of C)
occur in Φ_ζ.
For ζ = 0 we put Φ₀ = Φ. Then (1)-(3) clearly hold.
Now, let 0 < ρ ≤ λ, and suppose that for all ζ < ρ the Φ_ζ have been defined
in accordance with (1)-(3).
If ρ is a limit ordinal (in particular if ρ = λ) we put

Φ_ρ = ⋃{Φ_ζ: ζ < ρ}.

It is easy to see that (1)-(3) hold for ρ.
If ρ is a successor, say ρ = ζ+1, we distinguish four cases:

Case 1. If Φ_ζ, φ_ζ ⊢_I γ, we put Φ_{ζ+1} = Φ_ζ. Then (1)-(3) hold automatically
for ζ+1.
Case 2. If Φ_ζ, φ_ζ ⊬_I γ and φ_ζ is neither a disjunction formula nor an
existential formula, we put

Φ_{ζ+1} = Φ_ζ ∪ {φ_ζ}.

Here too (1)-(3) hold for ζ+1.
Case 3. If Φ_ζ, φ_ζ ⊬_I γ and φ_ζ = α∨β, then we must have Φ_ζ, φ_ζ, α ⊬_I γ
or Φ_ζ, φ_ζ, β ⊬_I γ, otherwise by ax. (8.8) we would have Φ_ζ, φ_ζ ⊢_I γ, contrary
to assumption. If Φ_ζ, φ_ζ, α ⊬_I γ, we put

Φ_{ζ+1} = Φ_ζ ∪ {φ_ζ, α};

otherwise, we put

Φ_{ζ+1} = Φ_ζ ∪ {φ_ζ, β}.

Again, (1)-(3) must hold for ζ+1.
Case 4. If Φ_ζ, φ_ζ ⊬_I γ and φ_ζ = ∃xα, then by (3) we can find a new
constant c that does not occur in Φ_ζ or φ_ζ and for some such c we put

Φ_{ζ+1} = Φ_ζ ∪ {φ_ζ, α(x/c)}.

Then (1) and (3) clearly hold for ζ+1; also, by Thm. 9.13(ii) we see that
(2) must hold for ζ+1.
We put

Ψ = Φ_λ

and it is easy to verify that Ψ has all the required properties. First, it is
clear that Φ ⊆ Ψ and Ψ ⊬_I γ, hence also Ψ ⊬_I. Next, if α is an ℒ(C)-formula
such that Ψ ⊢_I α, then Ψ, α ⊬_I γ and α = φ_ζ for some ζ < λ.
Thus α was put in Φ_{ζ+1}. Similarly, if α∨β ∈ Ψ, then α∨β = φ_ζ for some
ζ and α or β was put into Φ_{ζ+1}. Finally, if ∃xα ∈ Ψ then ∃xα = φ_ζ for
some ζ and for some new constant c we have put α(x/c) in Φ_{ζ+1}. ∎

We define a sequence of languages {ℒ_n: n ∈ N} as follows. ℒ₀ is our
original language ℒ; and for each n we put

ℒ_{n+1} = ℒ_n(C_n),

where C_n is a set, of cardinality λ, of constants not belonging to ℒ_n. We let
ℒ′ be the union of all these languages, i.e.,

ℒ′ = ℒ(C),

where C = ⋃{C_n: n ∈ N}. The cardinality of C is clearly λ.

Note that, if a set Ψ of ℒ_n-formulas is strongly consistent in ℒ_n, then
for every formula α of ℒ_n we have α→α ∈ Ψ, hence all the constants of
ℒ_n must actually occur in Ψ, and Ψ cannot at the same time be strongly
consistent in ℒ_m with m ≠ n.
We now define a Kripke system 𝔎 as follows. The states of 𝔎 will be
sets of formulas: if for some n (which, by what we have just seen, must
necessarily be unique) Ψ is a set of ℒ_n-formulas and is strongly consistent
in ℒ_n, we take Ψ as a state of 𝔎. For that unique n we let T(Ψ) be the
set of all terms of ℒ_n; and we let F(Ψ) be the set of all atomic formulas
belonging to Ψ. The states are partially ordered by inclusion: Ψ ≤ Ω iff
Ψ ⊆ Ω.
𝔎 is clearly a Kripke system for ℒ′.

10.2. Lemma. For each state Ψ of 𝔎 and for each ℒ′-formula α we have
Ψ ⊩_𝔎 α iff α ∈ Ψ.
Proof. Let n be the unique number such that Ψ is strongly consistent
in ℒ_n. We proceed by induction on deg α and distinguish seven cases,
corresponding to the seven “positive” clauses of KSD. Cases 1, 2, 3 and 6
(α atomic, α = β∨γ, α = β∧γ, α = ∃xβ) are routine and are left to the
reader. We deal in detail with the other three cases.
Case 4: α = β→γ. First, suppose α ∈ Ψ. Then α is in ℒ_n and hence
all the terms of α are in T(Ψ). Now let Ω ≥ Ψ; then there is some m ≥ n
such that Ω is strongly consistent in ℒ_m. Also, α ∈ Ω because Ψ ⊆ Ω.
Let Ω ⊩_𝔎 β; we have to show that Ω ⊩_𝔎 γ. But by the induction hypothesis
we have β ∈ Ω, and hence Ω ⊢_I γ by modus ponens. Since Ω is strongly
consistent in ℒ_m, we must have γ ∈ Ω; hence Ω ⊩_𝔎 γ by the induction
hypothesis. Thus Ψ ⊩_𝔎 α.
Conversely, let α ∉ Ψ. If α is not in ℒ_n, then not all the terms of α belong
to T(Ψ); hence Ψ ⊮_𝔎 α. If α is in ℒ_n, then Ψ ⊬_I α because Ψ is strongly
consistent in ℒ_n. Hence, by the Deduction Theorem, Ψ, β ⊬_I γ. By Lemma
10.1 we extend Ψ ∪ {β} to a set Ω of ℒ_{n+1}-formulas which is strongly
consistent in ℒ_{n+1} and such that Ω ⊬_I γ, hence γ ∉ Ω. By the induction
hypothesis we see that Ω ⊩_𝔎 β and Ω ⊮_𝔎 γ. Since Ψ ≤ Ω, we have Ψ ⊮_𝔎 α.
Case 5: α = ¬β. Suppose α ∈ Ψ. Then clearly all the terms of α are in
T(Ψ). Let Ω ≥ Ψ; we have to show that Ω ⊮_𝔎 β. But, since α ∈ Ψ ⊆ Ω
and Ω is consistent, it follows that β ∉ Ω; hence Ω ⊮_𝔎 β by the induction
hypothesis. Thus Ψ ⊩_𝔎 α.
Conversely, suppose α ∉ Ψ. If α is not in ℒ_n, then (as in case 4) we see
that Ψ ⊮_𝔎 α. If α is in ℒ_n, then Ψ ⊬_I α because of the strong consistency
of Ψ in ℒ_n. Now, it is easy to see that β→¬β ⊢_I0 ¬β (see ax. (8.9)),
and therefore, recalling that α = ¬β, we see that Ψ ⊬_I β→¬β; hence
Ψ, β ⊬_I ¬β. By Lemma 10.1 we extend Ψ ∪ {β} to a set Ω strongly consistent
in ℒ_{n+1}. Then by the induction hypothesis Ω ⊩_𝔎 β, and since
Ψ ≤ Ω we have Ψ ⊮_𝔎 α.
Case 7: α = ∀xβ. Let α ∈ Ψ. If Ω ≥ Ψ, then also α ∈ Ω and hence by
Prob. 9.12(i) we have Ω ⊢_I β(x/t) for every term t ∈ T(Ω). By the strong
consistency of Ω (in some ℒ_m with m ≥ n) we get β(x/t) ∈ Ω; hence Ω ⊩_𝔎 β(x/t)
by the induction hypothesis. Thus Ψ ⊩_𝔎 α.
Conversely, let α ∉ Ψ. If α is not in ℒ_n, we see (as in the previous cases)
that Ψ ⊮_𝔎 α. If α is in ℒ_n, then again Ψ ⊬_I α. Take some c ∈ C_n. Then
c cannot occur in Ψ or α, because the constants of C_n were taken not to be
in ℒ_n. Thus by Thm. 9.13(i) we have Ψ ⊬_I β(x/c). By Lemma 10.1, we
extend Ψ to a set Ω, strongly consistent in ℒ_{n+2}, such that Ω ⊬_I β(x/c),
hence β(x/c) ∉ Ω. Thus Ω ⊮_𝔎 β(x/c) by the induction hypothesis; and
since Ψ ≤ Ω we have Ψ ⊮_𝔎 α. ∎

10.3. Strong Completeness Theorem. There exists a Kripke system 𝔎
for the language ℒ(C), where C is a set, of cardinality λ, of constants not
belonging to ℒ, such that for any set Φ of ℒ-formulas and any ℒ-formula
γ such that Φ ⊬_I γ, some state of 𝔎 forces Φ ∪ {−γ}. In particular, if Φ ⊬_I
then some state of 𝔎 forces Φ.
Proof. Let 𝔎 be the Kripke system defined above. If Φ and γ are in ℒ
and Φ ⊬_I γ, then by Lemma 10.1 we extend Φ to a set Ψ of ℒ₁-formulas
which is strongly consistent in ℒ₁ and such that Ψ ⊬_I γ, hence γ ∉ Ψ. Then
Ψ is a state of 𝔎, and by Lemma 10.2 we clearly have Ψ ⊩_𝔎 Φ ∪ {−γ}.
In particular, if Φ ⊬_I then Φ ⊬_I γ for some ℒ-formula γ. ∎

10.4. Theorem. For any set Φ of ℒ-formulas and any ℒ-formula γ, we have
Φ ⊢_I γ iff Φ ⊩ γ. Also, Φ ⊢_I iff Φ is not enforceable.
Proof. If Φ ⊢_I γ, then for some finite Φ₀ ⊆ Φ we have Φ₀ ⊢_I γ, hence
Φ₀ ⊩ γ by Thm. 6.11. Therefore Φ ⊩ γ. Conversely, if Φ ⊬_I γ then
Φ ∪ {−γ} is enforceable by Thm. 10.3; thus Φ ⊮ γ.
The proof that Φ ⊢_I iff Φ is not enforceable is similar. ∎

Note that in Thm. 10.4 we have not used the full power of Thm. 10.3.
For by Thm. 10.3 we have a certain universal Kripke system 𝔎 which
works for every Φ and γ.
It is doubtful whether Kripke’s semantics is (or can be turned into)
a satisfactory explication of the constructivist heuristic outlined in §4.
Nevertheless, because of the results of the present section, it is very useful
both heuristically and technically. Many mathematicians are so used to
(or perhaps brainwashed into) the purely structural mode of thinking that
they find it difficult to test directly the constructive validity of a given
formula. The heuristic of Kripke’s semantics provides them with a convenient
structural Ersatz. On the technical level, Kripke systems are useful
for obtaining independence (= unprovability) results for intuitionistic (and
even classical) formal theories. As a very simple example, observe that
by 6.8 we have
⊬_I ¬¬[∀x¬¬Px → ¬¬∀xPx]
(cf. Prob. 5.16), and by 6.9 we have

⊬_I ∀x[Py ∨ Qx] → Py ∨ ∀xQx.
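
Independence arguments of this kind are easiest to see in the propositional case. The following Python sketch evaluates the forcing clauses, as they are usually stated for propositional connectives, over a finite partially ordered set of states. The tuple encoding, the function name `forces` and the two-state model are assumptions made for this illustration; they are not the systems of 6.8 and 6.9, and terms and quantifiers are left out entirely. In this model p∨¬p and ¬¬p→p are not forced at the bottom state, while ¬¬(p∨¬p) is.

```python
def forces(state, formula, above, val):
    """Decide whether `state` forces `formula` in a finite propositional Kripke model.

    above[s] is the set of states t with s <= t (including s itself);
    val[s] is the set of atoms forced at s, assumed monotone along <=.
    """
    tag = formula[0]
    if tag == "atom":
        return formula[1] in val[state]
    if tag == "and":
        return forces(state, formula[1], above, val) and forces(state, formula[2], above, val)
    if tag == "or":
        return forces(state, formula[1], above, val) or forces(state, formula[2], above, val)
    if tag == "imp":
        # An implication must hold at every later state, not just the current one.
        return all(
            not forces(t, formula[1], above, val) or forces(t, formula[2], above, val)
            for t in above[state]
        )
    if tag == "not":
        # A negation is forced when no later state forces the negated formula.
        return all(not forces(t, formula[1], above, val) for t in above[state])
    raise ValueError(f"unknown connective: {tag!r}")

# Two states 0 <= 1; the atom p becomes forced only at the later state 1.
above = {0: {0, 1}, 1: {1}}
val = {0: set(), 1: {"p"}}

p = ("atom", "p")
excluded_middle = ("or", p, ("not", p))
double_negation_law = ("imp", ("not", ("not", p)), p)

print(forces(0, excluded_middle, above, val))                    # False
print(forces(0, double_negation_law, above, val))                # False
print(forces(0, ("not", ("not", excluded_middle)), above, val))  # True
```

A formula that fails to be forced at some state of some model of this kind is, by soundness, unprovable; the converse direction is what a completeness theorem supplies.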

§11. Translations from classical to intuitionistic logic

It is easy to see (directly or via Prob. 5.5) that if Φ ⊢_I0 α then also Φ ⊢₀ α.
By 5.6(iii) the converse of this is false. Similarly, we easily see (directly
or via Prob. 5.11) that if Φ ⊢_I α then Φ ⊢ α. The converse of this is false
by 5.16.
It therefore seems that the intuitionistic propositional and predicate
calculi are strictly weaker than their classical counterparts. However, this
is the case only if we insist on comparing each formula in the classical
calculus with the same formula in the corresponding intuitionistic calculus.
But this comparison is unfair, because the structural interpretation of
some of the logical symbols is very different from their constructive
interpretation.
In this section we shall show (in a constructive way) that it is possible
to translate classical formulas in such a way that a formula is provable
in the classical propositional or predicate calculus iff its translation is
provable in the corresponding intuitionistic calculus. Thus the classical
calculi can be “interpreted” within their intuitionistic counterparts. We
begin with the propositional case.
We employ the version of the classical propositional calculus formulated
in §15 of Ch. 1. Notice that the axioms of this calculus are the same as
those of the intuitionistic propositional calculus, except that the two schemes
(8.9) and (8.10) are replaced by the single scheme

(11.1) (¬α→β)→(¬α→¬β)→α.



For any set Φ of formulas we put ¬¬Φ = {¬¬φ: φ ∈ Φ}. We then have the
following result, due to Glivenko:

11.2. Theorem. For any set Φ of formulas and any formula φ we have
Φ ⊢₀ φ iff ¬¬Φ ⊢_I0 ¬¬φ.
Proof. Suppose Φ ⊢₀ φ. We proceed by induction on the length of
a given (classical) deduction of φ from Φ.
If φ is one of the axioms (8.1)-(8.8), then trivially ⊢_I0 φ, hence ⊢_I0 ¬¬φ
by Prob. 5.7(i); therefore ¬¬Φ ⊢_I0 ¬¬φ.
If φ is an axiom of the form (11.1), then ⊢_I0 ¬¬φ by Prob. 5.7(ix), and
again we have ¬¬Φ ⊢_I0 ¬¬φ.
If φ ∈ Φ, then trivially ¬¬Φ ⊢_I0 ¬¬φ.
Finally, if φ is obtained from two earlier formulas by modus ponens,
we get ¬¬Φ ⊢_I0 ¬¬φ by the induction hypothesis and Prob. 5.7(v).
Conversely, if ¬¬Φ ⊢_I0 ¬¬φ then clearly ¬¬Φ ⊢₀ ¬¬φ; hence
Φ ⊢₀ φ by Thm. 1.10.5. ∎

11.3. Problem. Show that ¬Φ ⊢₀ ¬φ iff ¬Φ ⊢_I0 ¬φ. (Use parts (i) and
(ii) of Prob. 5.7.) Also, show that if Φ is consistent in the intuitionistic
propositional calculus, it is consistent in the classical calculus as well.

By Thm. 11.2, the mapping f defined by f(φ) = ¬¬φ is a faithful translation
from the classical propositional calculus into its intuitionistic counterpart¹.
Another faithful translation is suggested in the following:

¹ The term faithful here refers to the purely formal business of deducibility, not to meaning.

11.4. Problem. For any formula φ, let φ′ be defined by induction on deg φ
as follows:
φ′ = ¬¬φ for every prime formula φ,
(φ∨ψ)′ = φ′∨ψ′,
(φ∧ψ)′ = φ′∧ψ′,
(φ→ψ)′ = φ′→ψ′,
(¬φ)′ = ¬(φ′).
Also put Φ′ = {φ′: φ ∈ Φ}.
(i) Let φ be any formula obtained from its prime components without
using disjunction. Show that ¬¬φ ⊢_I0 φ′ and φ′ ⊢_I0 ¬¬φ. (Proceed by
induction on deg φ, using Thm. 11.2; for the case of implication use also
Prob. 5.7(vi).)
(ii) Let φ be as in (i) and let Φ be a set of formulas of the same kind
(i.e., obtained from their prime components without using disjunction).
Show that Φ ⊢₀ φ iff Φ′ ⊢_I0 φ′.
(iii) Show that if α is prime then ⊬_I0 ¬¬α ∨ ¬¬¬α. Thus (ii) does not
work in the presence of disjunction.
Since every formula can be transformed into a classically equivalent
formula not containing disjunction, Prob. 11.4(ii) yields another faithful
translation from the classical propositional calculus into its intuitionistic
counterpart.
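
Both Glivenko's map f(φ) = ¬¬φ and the translation φ′ of Prob. 11.4 are simple recursions on the structure of a formula. The following Python sketch illustrates them; the tuple encoding and the function names are assumptions made for this illustration. The sketch also includes one possible way of removing disjunction classically, replacing α∨β by ¬(¬α∧¬β); the text does not say which transformation is intended, so this choice is an assumption as well.

```python
# Formulas as nested tuples (an encoding assumed for this sketch):
#   ("prime", name), ("not", A), ("and", A, B), ("or", A, B), ("imp", A, B)

def glivenko(phi):
    """The map f of Thm. 11.2: prefix a double negation."""
    return ("not", ("not", phi))

def prime_translation(phi):
    """The map phi -> phi' of Prob. 11.4: double-negate every prime component."""
    tag = phi[0]
    if tag == "prime":
        return ("not", ("not", phi))
    if tag == "not":
        return ("not", prime_translation(phi[1]))
    if tag in ("and", "or", "imp"):
        return (tag, prime_translation(phi[1]), prime_translation(phi[2]))
    raise ValueError(f"unknown connective: {tag!r}")

def eliminate_disjunction(phi):
    """Replace A or B by not(not A and not B), which is classically equivalent."""
    tag = phi[0]
    if tag == "prime":
        return phi
    if tag == "not":
        return ("not", eliminate_disjunction(phi[1]))
    a, b = eliminate_disjunction(phi[1]), eliminate_disjunction(phi[2])
    if tag == "or":
        return ("not", ("and", ("not", a), ("not", b)))
    return (tag, a, b)

def show(phi):
    """Render a formula as a fully parenthesized string."""
    tag = phi[0]
    if tag == "prime":
        return phi[1]
    if tag == "not":
        return "¬" + show(phi[1])
    symbol = {"and": "∧", "or": "∨", "imp": "→"}[tag]
    return "(" + show(phi[1]) + symbol + show(phi[2]) + ")"

p, q = ("prime", "p"), ("prime", "q")
excluded_middle = ("or", p, ("not", p))

print(show(prime_translation(("imp", p, q))))   # (¬¬p→¬¬q)
print(show(prime_translation(excluded_middle)))  # (¬¬p∨¬¬¬p)
print(show(prime_translation(eliminate_disjunction(excluded_middle))))  # ¬(¬¬¬p∧¬¬¬¬p)
```

The second printed formula is the one that part (iii) shows to be unprovable, which is why disjunction has to be removed before φ′ is applied.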

We turn now to the predicate calculus. Together with ℒ, we consider
the language ℒ* obtained from ℒ by excluding the universal quantifier.
In view of what was said in §4 of Ch. 3, the classical predicate calculus
in ℒ* can be based on the following seven groups of axioms:

(11.5) all ℒ*-formulas of the forms (8.1)-(8.8);

(11.6) all ℒ*-formulas of the form (11.1);

(11.7) ¬∃x¬(α→β)→¬∃x¬α→¬∃x¬β;

(11.8) ¬∃x¬(α→β)→∃xα→∃xβ;

(11.9) α→¬∃x¬α, provided x is not free in α;

(11.10) ¬∃x¬α→α(x/t), provided t is free for x in α;

(11.11) ¬∃x₁¬¬∃x₂¬ ... ¬∃x_k¬α, where k ≥ 1, and x₁,...,x_k are any
variables (not necessarily distinct) and α is any axiom of the
preceding six groups.

The only rule of inference is modus ponens.

11.12. Theorem. Let Φ be any set of ℒ*-formulas, and let φ be an ℒ*-formula.
Then Φ ⊢ φ iff ¬¬Φ ⊢_I ¬¬φ.
Proof. Like the proof of 11.2, except that now we also have to verify that
⊢_I ¬¬φ for each ℒ*-formula φ belonging to the groups (11.7)-(11.11).
This can easily be done by tableaux, and is left to the reader. ∎

11.13. Problem. For Φ and φ as in 11.12, show that ¬Φ ⊢ ¬φ iff
¬Φ ⊢_I ¬φ. Also, show that if such Φ is consistent in the intuitionistic
predicate calculus, it is classically consistent as well.

Since every ℒ-formula is classically equivalent to an ℒ*-formula,
Thm. 11.12 provides us with a faithful translation from the classical predicate
calculus into its intuitionistic counterpart. Another translation is suggested
in the following:

11.14. Problem. For any ℒ-formula φ, let φ′ be the formula obtained
from φ when each atomic subformula α of φ is replaced by ¬¬α, and
let φ* be the formula obtained from φ when each universal quantifier
∀x is replaced by ¬∃x¬. For any set Φ of ℒ-formulas let Φ′ = {φ′: φ ∈ Φ}.
(i) Show that if φ is an ℒ-formula not containing the symbols ∨ and ∃,
then ¬¬φ* ⊢_I φ′ and φ′ ⊢_I ¬¬φ*. (Proceed as in Prob. 11.4(i). In the
case where φ is a universal formula, use Prob. 5.15.)
(ii) Let φ be as in (i) and let Φ be a set of formulas of the same kind
(not containing ∨ and ∃). Show that Φ ⊢ φ iff Φ′ ⊢_I φ′. (Use (i) and
Thm. 11.12.)
Again, every ℒ-formula can be transformed into a classically equivalent
formula without the symbols ∨ and ∃. Thus we get from Prob. 11.14
a faithful translation from the classical into the intuitionistic predicate
calculus. This result is due to Gödel.
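
The two maps used in Prob. 11.14 can be sketched in the same way as their propositional counterparts. In the following Python illustration the tuple encoding of formulas and the names `double_negate_atoms`, `eliminate_forall` and `show` are assumptions made for this sketch, and terms are represented simply by their names.

```python
# Encoding assumed for this sketch:
#   ("atom", "P", ("x", "c")), ("not", A), ("and", A, B), ("or", A, B),
#   ("imp", A, B), ("forall", "x", A), ("exists", "x", A)

def double_negate_atoms(phi):
    """phi -> phi': replace each atomic subformula alpha by not not alpha."""
    tag = phi[0]
    if tag == "atom":
        return ("not", ("not", phi))
    if tag == "not":
        return ("not", double_negate_atoms(phi[1]))
    if tag in ("forall", "exists"):
        return (tag, phi[1], double_negate_atoms(phi[2]))
    return (tag, double_negate_atoms(phi[1]), double_negate_atoms(phi[2]))

def eliminate_forall(phi):
    """phi -> phi*: replace each universal quantifier (forall x) by (not exists x not)."""
    tag = phi[0]
    if tag == "atom":
        return phi
    if tag == "not":
        return ("not", eliminate_forall(phi[1]))
    if tag == "forall":
        return ("not", ("exists", phi[1], ("not", eliminate_forall(phi[2]))))
    if tag == "exists":
        return ("exists", phi[1], eliminate_forall(phi[2]))
    return (tag, eliminate_forall(phi[1]), eliminate_forall(phi[2]))

def show(phi):
    """Render a formula as a string, with binary connectives parenthesized."""
    tag = phi[0]
    if tag == "atom":
        return phi[1] + "(" + ",".join(phi[2]) + ")"
    if tag == "not":
        return "¬" + show(phi[1])
    if tag == "forall":
        return "∀" + phi[1] + show(phi[2])
    if tag == "exists":
        return "∃" + phi[1] + show(phi[2])
    symbol = {"and": "∧", "or": "∨", "imp": "→"}[tag]
    return "(" + show(phi[1]) + symbol + show(phi[2]) + ")"

Px, Qx = ("atom", "P", ("x",)), ("atom", "Q", ("x",))
phi = ("forall", "x", ("imp", Px, Qx))

print(show(double_negate_atoms(phi)))  # ∀x(¬¬P(x)→¬¬Q(x))
print(show(eliminate_forall(phi)))     # ¬∃x¬(P(x)→Q(x))
```

The first output is a formula of the kind considered in part (i), and the second is an ℒ*-formula to which Thm. 11.12 applies.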
The results of this section show that the intuitionistic propositional and
predicate calculi are equal in strength to their classical counterparts.
However, intuitionistic logic is certainly richer than classical logic, because
the former makes distinctions that the latter fails to make. Thus, e.g.,
¬¬α is distinguished from α (Prob. 5.6(iii)) and ¬∀xα is distinguished
from ∃x¬α (Prob. 5.16). There are many such distinctions; but, as the
following problem suggests, the most important one is that between ¬¬α
and α, for without it the intuitionistic calculi collapse into their classical
counterparts. It will also be seen that the absence of the law of the excluded
middle (Probs. 5.6(iii) and 5.12) is equally crucial.

11.15. Problem. (i) Show that if we add the axiom scheme

¬¬α→α

to the intuitionistic propositional and predicate calculi, we obtain (equivalent
versions of) their classical counterparts. (For the propositional case use
11.2; for the predicate case use 11.12 and 5.15.)
(ii) Prove the same result for the scheme α∨¬α instead of ¬¬α→α.

§ 12. The Interpolation Theorem

In this section we prove, for the intuitionistic predicate calculus, an important
result known as Craig's Interpolation Theorem (or Lemma). Originally,
Craig proved this result for the classical predicate calculus, and it was
only later extended by Schütte to the intuitionistic case. (In §13 we shall
derive the original classical version from the intuitionistic one.)


In what follows, Φ and Ψ are finite sets of ℒ-formulas and ξ, as usual,
is an ℒ-formula or the empty string.
We say that a formula δ is an interpolant for ⟨Φ; Ψ, ξ⟩ if the following
three conditions hold:
(a) Every free variable of δ is free in both Φ and Ψ ∪ {ξ}, and every
extralogical symbol (constant or predicate symbol) occurring in δ occurs
in both Φ and Ψ ∪ {ξ};
(b) Φ ▷ δ;
(c) Ψ, δ ▷ ξ.
Condition (a) is expressed more briefly by saying that δ is in the common
vocabulary of Φ and Ψ ∪ {ξ}.
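
Condition (a) is purely syntactic and can be checked mechanically. The following Python sketch computes the free variables and extralogical symbols of a formula and tests whether a candidate δ lies in the common vocabulary of two sets of formulas. The tuple encoding and the function names are assumptions made for this illustration; conditions (b) and (c), which concern provability, are not addressed.

```python
# Encoding assumed for this sketch: terms are ("var", name) or ("const", name);
# formulas are ("atom", pred, terms), ("not", A), ("and"/"or"/"imp", A, B),
# ("forall"/"exists", var_name, A).

def vocabulary(phi, bound=frozenset()):
    """Return (free variables, extralogical symbols) of a formula."""
    tag = phi[0]
    if tag == "atom":
        free = {t[1] for t in phi[2] if t[0] == "var" and t[1] not in bound}
        symbols = {("pred", phi[1])} | {("const", t[1]) for t in phi[2] if t[0] == "const"}
        return free, symbols
    if tag == "not":
        return vocabulary(phi[1], bound)
    if tag in ("forall", "exists"):
        return vocabulary(phi[2], bound | {phi[1]})
    f1, s1 = vocabulary(phi[1], bound)
    f2, s2 = vocabulary(phi[2], bound)
    return f1 | f2, s1 | s2

def vocabulary_of_set(formulas):
    """Union of the vocabularies of all formulas in a collection."""
    free, symbols = set(), set()
    for phi in formulas:
        f, s = vocabulary(phi)
        free |= f
        symbols |= s
    return free, symbols

def in_common_vocabulary(delta, left, right):
    """Condition (a): delta uses only free variables and symbols found on both sides."""
    f_d, s_d = vocabulary(delta)
    f_l, s_l = vocabulary_of_set(left)
    f_r, s_r = vocabulary_of_set(right)
    return f_d <= (f_l & f_r) and s_d <= (s_l & s_r)

c = ("const", "c")
Pc, Qc, Rc = ("atom", "P", (c,)), ("atom", "Q", (c,)), ("atom", "R", (c,))

left = [("and", Pc, Qc)]    # plays the role of the first set
right = [("imp", Qc, Rc)]   # plays the role of the second set together with the conclusion

print(in_common_vocabulary(Qc, left, right))  # True
print(in_common_vocabulary(Pc, left, right))  # False
```

When the conclusion is the empty string, the second argument would simply be the list of formulas of the second set.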
Note that, if Φ and Ψ ∪ {ξ} do not have any predicate symbol in common,
then no formula δ can fulfil condition (a), so in this case we cannot have
an interpolant for ⟨Φ; Ψ, ξ⟩.
We say that we can interpolate in ⟨Φ; Ψ, ξ⟩ if we can do at least one of the
following three things:
(1) find (a formula δ and prove that it is) an interpolant for ⟨Φ; Ψ, ξ⟩;
(2) prove Φ ▷;
(3) prove Ψ ▷ ξ.
Note that if there is a predicate symbol P occurring in both Φ and Ψ ∪ {ξ},
then if we can do (2) or (3) we can also do (1). Indeed, let γ be any sentence
whose only extralogical symbol is P. If we can do (2), then δ = γ∧¬γ
will serve for (1) because (as can easily be seen) γ∧¬γ ▷₀. If we can do (3),
then δ = γ→γ will serve for (1) because ▷₀ γ→γ. Thus we had to admit
(2) and (3) as separate possibilities only because Φ and Ψ ∪ {ξ} may not
have a predicate symbol in common.

The main work of this section is done in

12.1. Lemma. Given a proof T of Φ ∪ Ψ ▷ ξ, we can interpolate in ⟨Φ; Ψ, ξ⟩.

Proof. By induction on the depth d of T.
If d = 0, then ξ (is atomic and) belongs to Φ ∪ Ψ. If ξ ∈ Φ, then ξ is
easily seen to be an interpolant for ⟨Φ; Ψ, ξ⟩. If ξ ∈ Ψ, then we can trivially
prove Ψ ▷ ξ.
Now let d > 0. We distinguish several cases, according to the way in
which level 1 of T was obtained. Strictly speaking, we ought to consider
nineteen cases (because level 1 of T could have been obtained by applying
any one of the six “positive” rules to a formula of Φ, or any one of these
six rules to a formula of Ψ, or any one of the seven “negative” rules to −ξ).
But by grouping together cases whose treatment is identical or very similar,
we are left with eleven cases.
Case 1: Level 1 of T was obtained by applying rule ∧ or ∃ to some
formula of Φ. Then T starts thus:

    Φ, Ψ, −ξ
       |
       Σ

where Σ is a set of one or two formulas. From T we obtain a proof, with
depth d − 1, of (Φ ∪ Σ) ∪ Ψ ▷ ξ. By the induction hypothesis, we can
interpolate in ⟨Φ ∪ Σ; Ψ, ξ⟩. There are three subcases:
Subcase 1.1: We have got an interpolant δ for ⟨Φ ∪ Σ; Ψ, ξ⟩. Notice
that the extralogical symbols of Σ must occur in Φ. Also, if Σ was obtained
in T by the ∃ rule, the critical variable y involved does not occur free
in the initial node; in particular, y is not free in Ψ ∪ {ξ}. Hence y cannot
be free in δ, because δ is an interpolant for ⟨Φ ∪ Σ; Ψ, ξ⟩. It is now easy
to see that δ is also an interpolant for ⟨Φ; Ψ, ξ⟩. (To get a proof of Φ ▷ δ,
start by obtaining Σ from Φ as in T, and then use the proof of Φ ∪ Σ ▷ δ
which we possess by the assumption of the present subcase.)
Subcase 1.2: We can prove Φ ∪ Σ ▷. Then we can get a proof of Φ ▷.
(Start by obtaining Σ as in T.)
Subcase 1.3: We can prove Ψ ▷ ξ. Then we are through.
Case 2: Level 1 of T was obtained by applying rule ∧ or ∃ to a formula
of Ψ. This is similar to case 1, except that now the induction hypothesis
is that we can interpolate in ⟨Φ; Ψ ∪ Σ, ξ⟩, where Σ is the node of level 1
in T.
Case 3: Level 1 of $T$ was obtained by applying rule $\neg$ to a formula
of $\Phi$. Then $T$ starts thus:

        $\Phi,\ \Psi,\ -\xi$
               |*
           $-\alpha$

where $\neg\alpha \in \Phi$. We obtain a proof, with depth $d - 1$, of $\Psi \cup \Phi \rhd \alpha$. Hence,
by the induction hypothesis, we can interpolate in $\langle \Psi; \Phi, \alpha \rangle$. We have
three subcases:
Subcase 3.1: We have got an interpolant $\delta$ for $\langle \Psi; \Phi, \alpha \rangle$. Then $\neg\delta$
is easily seen to be an interpolant for $\langle \Phi; \Psi, \xi \rangle$. (To get a proof of $\Phi \rhd \neg\delta$,
start by using $-\neg\delta$, then use $\neg\alpha$ and employ the proof of $\Phi, \delta \rhd \alpha$ that
we possess by assumption. To prove $\Psi, \neg\delta \rhd \xi$, use $\neg\delta$ and employ the
proof of $\Psi \rhd \delta$ that we possess by assumption.)
Subcase 3.2: We can prove $\Psi \rhd$. Then we can also prove $\Psi \rhd \xi$.
Subcase 3.3: We can prove $\Phi \rhd \alpha$. Then we can prove $\Phi \rhd$. (Start by
using $\neg\alpha$.)
Case 4: Level 1 of $T$ was obtained by applying rule $\neg$ to a formula of $\Psi$,
or one of the rules $-\vee_1$, $-\vee_2$, $-{\to}$, $-\neg$, $-\forall$ to $-\xi$. These are all treated
in a very similar way. As an example, we shall do in detail the case of
rule $-\forall$, which is slightly more problematic than the other five. In this
case $T$ starts thus:

        $\Phi,\ \Psi,\ -\xi$
               |*
        $-\alpha(x/y)$

where $\xi = \forall x\alpha$ and $y$ is not free in $\Phi$, $\Psi$ or $\xi$. By the induction hypothesis,
we can interpolate in $\langle \Phi; \Psi, \alpha(x/y) \rangle$. Three subcases arise:
Subcase 4.1: We have got an interpolant $\delta$ for $\langle \Phi; \Psi, \alpha(x/y) \rangle$. Then
$y$ cannot be free in $\delta$ (because $y$ is not free in $\Phi$) and it is easy to see that
$\delta$ is also an interpolant for $\langle \Phi; \Psi, \xi \rangle$. (To prove $\Psi, \delta \rhd \xi$, start by using
$-\xi$ just as in $T$, then employ the proof of $\Psi, \delta \rhd \alpha(x/y)$ which we are
supposed to possess.)
Subcase 4.2: We can prove $\Phi \rhd$. Done.
Subcase 4.3: We can prove $\Psi \rhd \alpha(x/y)$. Then we can also prove $\Psi \rhd \xi$.
(Start by using $-\xi$, with $y$ as critical variable.)
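Case 5: Level 1 of $T$ was obtained by rule $\vee$ applied to a formula of $\Phi$.
Then $T$ starts thus:

        $\Phi,\ \Psi,\ -\xi$
            /      \
      $\alpha$    $\beta$

where $\alpha \vee \beta \in \Phi$. By the induction hypothesis, we can interpolate in both
$\langle \Phi \cup \{\alpha\}; \Psi, \xi \rangle$ and $\langle \Phi \cup \{\beta\}; \Psi, \xi \rangle$. Here there are five possible subcases:
Subcase 5.1: We possess interpolants $\gamma$ and $\delta$ for $\langle \Phi \cup \{\alpha\}; \Psi, \xi \rangle$ and
$\langle \Phi \cup \{\beta\}; \Psi, \xi \rangle$ respectively. Then $\gamma \vee \delta$ is an interpolant for $\langle \Phi; \Psi, \xi \rangle$.
(To prove $\Phi \rhd \gamma \vee \delta$, start by using $\alpha \vee \beta$, then apply to $-\gamma \vee \delta$ rule $-\vee_1$
in one branch and rule $-\vee_2$ in the other, as follows:

        $\Phi,\ -\gamma \vee \delta$
            /      \
      $\alpha$    $\beta$
        |*          |*
      $-\gamma$   $-\delta$

Now employ the proofs of $\Phi, \alpha \rhd \gamma$ and $\Phi, \beta \rhd \delta$ that we have got by
assumption. To prove $\Psi, \gamma \vee \delta \rhd \xi$, start by using $\gamma \vee \delta$.)
Subcase 5.2: We have got an interpolant $\gamma$ for $\langle \Phi \cup \{\alpha\}; \Psi, \xi \rangle$ and a proof
of $\Phi, \beta \rhd$. Then $\gamma$ is also an interpolant for $\langle \Phi; \Psi, \xi \rangle$. (To prove $\Phi \rhd \gamma$,
start by using $\alpha \vee \beta$, then utilize the proofs of $\Phi, \alpha \rhd \gamma$ and $\Phi, \beta \rhd$ that
we are supposed to possess.)
Subcase 5.3: We have got a proof of $\Phi, \alpha \rhd$ and an interpolant for
$\langle \Phi \cup \{\beta\}; \Psi, \xi \rangle$. This is similar to subcase 5.2.
Subcase 5.4: We can prove $\Phi, \alpha \rhd$ and $\Phi, \beta \rhd$. Then we can prove $\Phi \rhd$.
(Start by using $\alpha \vee \beta$.)
Subcase 5.5: We can prove $\Psi \rhd \xi$. Done.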
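Case 6: Level 1 of $T$ was obtained by applying rule $\vee$ to a formula
of $\Psi$. Then $T$ starts thus:

        $\Phi,\ \Psi,\ -\xi$
            /      \
      $\alpha$    $\beta$

where $\alpha \vee \beta \in \Psi$. By the induction hypothesis we can interpolate in both
$\langle \Phi; \Psi \cup \{\alpha\}, \xi \rangle$ and $\langle \Phi; \Psi \cup \{\beta\}, \xi \rangle$. Here there are five subcases, analogous
to those of case 5:
Subcase 6.1: We have got interpolants $\gamma$ and $\delta$ for $\langle \Phi; \Psi \cup \{\alpha\}, \xi \rangle$ and
$\langle \Phi; \Psi \cup \{\beta\}, \xi \rangle$ respectively. Then $\gamma \wedge \delta$ is an interpolant for $\langle \Phi; \Psi, \xi \rangle$.
(To prove $\Phi \rhd \gamma \wedge \delta$, start by using $-\gamma \wedge \delta$. To prove $\Psi, \gamma \wedge \delta \rhd \xi$, start
by using $\gamma \wedge \delta$, then use $\alpha \vee \beta$.)
The other four subcases of the present case 6 are treated like the analogous
subcases of case 5. We leave this to the reader.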
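Case 7: Level 1 of $T$ was obtained by applying rule $\to$ to a formula of $\Phi$.
Then $T$ starts thus:

        $\Phi,\ \Psi,\ -\xi$
           */      \
      $-\alpha$   $\beta$

where $\alpha \to \beta \in \Phi$. By the induction hypothesis we can interpolate in $\langle \Psi; \Phi, \alpha \rangle$
and $\langle \Phi \cup \{\beta\}; \Psi, \xi \rangle$. Again there are five possible subcases:
Subcase 7.1: We have got interpolants $\gamma$ and $\delta$ for $\langle \Psi; \Phi, \alpha \rangle$ and
$\langle \Phi \cup \{\beta\}; \Psi, \xi \rangle$ respectively. Then $\gamma \to \delta$ is an interpolant for $\langle \Phi; \Psi, \xi \rangle$.
(To prove $\Phi \rhd \gamma \to \delta$, start by using $-\gamma \to \delta$, then use $\alpha \to \beta$. To prove
$\Psi, \gamma \to \delta \rhd \xi$, start by using $\gamma \to \delta$.)
Subcase 7.2: We have got an interpolant $\gamma$ for $\langle \Psi; \Phi, \alpha \rangle$ and a proof of
$\Phi, \beta \rhd$. Then $\neg\gamma$ is an interpolant for $\langle \Phi; \Psi, \xi \rangle$. (To prove $\Phi \rhd \neg\gamma$,
start by using $-\neg\gamma$, then use $\alpha \to \beta$. To prove $\Psi, \neg\gamma \rhd \xi$, start by using $\neg\gamma$.)
Subcase 7.3: We have got a proof of $\Phi \rhd \alpha$ and an interpolant $\delta$ for
$\langle \Phi \cup \{\beta\}; \Psi, \xi \rangle$. Then $\delta$ is also an interpolant for $\langle \Phi; \Psi, \xi \rangle$. (To prove
$\Phi \rhd \delta$, start by using $\alpha \to \beta$.)
Subcase 7.4: We have got proofs of $\Phi \rhd \alpha$ and $\Phi, \beta \rhd$. Then we can
prove $\Phi \rhd$. (Start by using $\alpha \to \beta$.)
Subcase 7.5: We have got a proof of $\Psi \rhd$ or of $\Psi \rhd \xi$. In either case
we can prove $\Psi \rhd \xi$.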
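Case 8: Level 1 of $T$ was obtained by applying rule $\to$ to a formula
of $\Psi$ or rule $-\wedge$ to $-\xi$. These two are treated in almost the same way.
As an example, we do in some detail the case of rule $\to$. In this case
$T$ starts thus:

        $\Phi,\ \Psi,\ -\xi$
           */      \
      $-\alpha$   $\beta$

where $\alpha \to \beta \in \Psi$. By the induction hypothesis we can interpolate in $\langle \Phi; \Psi, \alpha \rangle$
and $\langle \Phi; \Psi \cup \{\beta\}, \xi \rangle$. Again five subcases arise:
Subcase 8.1: We have got interpolants $\gamma$ and $\delta$ for $\langle \Phi; \Psi, \alpha \rangle$ and
$\langle \Phi; \Psi \cup \{\beta\}, \xi \rangle$ respectively. Then $\gamma \wedge \delta$ is an interpolant for $\langle \Phi; \Psi, \xi \rangle$.
(To prove $\Phi \rhd \gamma \wedge \delta$, start by using $-\gamma \wedge \delta$. To prove $\Psi, \gamma \wedge \delta \rhd \xi$, start
by using $\gamma \wedge \delta$, then use $\alpha \to \beta$.)
Subcase 8.2: We possess an interpolant $\gamma$ for $\langle \Phi; \Psi, \alpha \rangle$ and a proof of
$\Psi, \beta \rhd \xi$. Then $\gamma$ is easily seen to be an interpolant for $\langle \Phi; \Psi, \xi \rangle$.
Subcase 8.3: We have got a proof of $\Psi \rhd \alpha$ and an interpolant $\delta$ for
$\langle \Phi; \Psi \cup \{\beta\}, \xi \rangle$. Then $\delta$ is easily seen to be an interpolant for $\langle \Phi; \Psi, \xi \rangle$.
Subcase 8.4: We can prove $\Psi \rhd \alpha$ and $\Psi, \beta \rhd \xi$. Then we can prove
$\Psi \rhd \xi$. (Start by using $\alpha \to \beta$.)
Subcase 8.5: We can prove $\Phi \rhd$. Done.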
The final three cases require a little more finesse.
Case 9: Level 1 of $T$ was obtained by applying rule $\forall$ to a formula
of $\Phi$. Then $T$ starts thus:

        $\Phi,\ \Psi,\ -\xi$
               |
        $\alpha(x/t)$

where $t$ is a term and $\forall x\alpha \in \Phi$. By the induction hypothesis we can
interpolate in $\langle \Phi \cup \{\alpha(x/t)\}; \Psi, \xi \rangle$. Three subcases are possible:
Subcase 9.1: We have got an interpolant $\delta$ for $\langle \Phi \cup \{\alpha(x/t)\}; \Psi, \xi \rangle$.
So we can prove both $\Psi, \delta \rhd \xi$ and $\Phi, \alpha(x/t) \rhd \delta$ and hence (since $\forall x\alpha \in \Phi$)
also $\Phi \rhd \delta$. Thus, if $\delta$ is in the common vocabulary of $\Phi$ and $\Psi \cup \{\xi\}$,
then $\delta$ is an interpolant for $\langle \Phi; \Psi, \xi \rangle$.
But $\delta$ may fail to be in the common vocabulary of $\Phi$ and $\Psi \cup \{\xi\}$. This
happens if $t$ is a term (i.e., a constant or a free variable) of $\delta$, and hence
necessarily of $\Psi \cup \{\xi\}$, but is not a term of $\Phi$. In this case we proceed
as follows. We choose a variable $y$ that does not occur in $\delta$ and put $\gamma = \delta(t/y)$
(if $t$ is a constant, this is defined in Def. 2.3.13). Then we have $\delta = \gamma(y/t)$.
Now, since we can prove $\Phi \rhd \delta$, i.e., $\Phi \rhd \gamma(y/t)$, and since $t$ is assumed
not to be a term of $\Phi$, we obtain a proof of $\Phi \rhd \forall y\gamma$, by Thms. 9.10(i) and
9.11 (if $t$ is a variable) or by Thm. 9.13(i) (in case $t$ is a constant).
Also, since we can prove $\Psi, \delta \rhd \xi$, i.e., $\Psi, \gamma(y/t) \rhd \xi$, we can obviously
prove $\Psi, \forall y\gamma \rhd \xi$. Thus $\forall y\gamma$, which does not have the offending term $t$,
is an interpolant for $\langle \Phi; \Psi, \xi \rangle$.
Subcase 9.2: We can prove $\Phi, \alpha(x/t) \rhd$. Then, since $\forall x\alpha \in \Phi$, we can
also prove $\Phi \rhd$.
Subcase 9.3: We can prove $\Psi \rhd \xi$. Done.
Case 10: Level 1 of $T$ was obtained by applying rule $\forall$ to a formula
of $\Psi$. Then $T$ starts thus:

        $\Phi,\ \Psi,\ -\xi$
               |
        $\alpha(x/t)$

where $\forall x\alpha \in \Psi$. By the induction hypothesis we can interpolate in
$\langle \Phi; \Psi \cup \{\alpha(x/t)\}, \xi \rangle$. Here again there are three subcases:
Subcase 10.1: We have an interpolant $\delta$ for $\langle \Phi; \Psi \cup \{\alpha(x/t)\}, \xi \rangle$. So we
can prove $\Phi \rhd \delta$ and $\Psi, \alpha(x/t), \delta \rhd \xi$, and (since $\forall x\alpha \in \Psi$) also $\Psi, \delta \rhd \xi$.
Thus, if $\delta$ is in the common vocabulary of $\Phi$ and $\Psi \cup \{\xi\}$, it is an interpolant
for $\langle \Phi; \Psi, \xi \rangle$.
But $\delta$ fails to be in the common vocabulary of $\Phi$ and $\Psi \cup \{\xi\}$ if $t$ is a
term of $\delta$, and hence necessarily of $\Phi$, but not of $\Psi \cup \{\xi\}$. In this case we
define $y$ and $\gamma$ as in subcase 9.1, and using the results of §9 we see without
difficulty that $\exists y\gamma$ is an interpolant for $\langle \Phi; \Psi, \xi \rangle$.
The other two subcases of the present case are similar to those of case 9.
Case 11: Level 1 of $T$ is obtained by applying rule $-\exists$ to $-\xi$. This
is very similar to case 10. In the problematic situation arising in subcase 11.1,
we define $y$ and $\gamma$ as before and then see without difficulty that $\exists y\gamma$ is an
interpolant for $\langle \Phi; \Psi, \xi \rangle$. The other details are left to the reader. ∎
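
As a reading aid (not part of the original proof), the interpolants built in the main subcases can be collected in one table. This is only a sketch in the notation of the proof: $\gamma$ and $\delta$ denote the interpolants supplied by the induction hypothesis, and only the subcases that the text works out explicitly are listed.

```latex
\[
\begin{array}{lll}
\text{Subcase} & \text{Rule applied at level 1} & \text{Interpolant for } \langle \Phi; \Psi, \xi \rangle \\ \hline
1.1 & \wedge \text{ or } \exists \text{ on a formula of } \Phi & \delta \\
3.1 & \neg \text{ on a formula of } \Phi & \neg\delta \\
4.1 & -\forall \text{ on } -\xi & \delta \\
5.1 & \vee \text{ on a formula of } \Phi & \gamma \vee \delta \\
6.1 & \vee \text{ on a formula of } \Psi & \gamma \wedge \delta \\
7.1 & \to \text{ on a formula of } \Phi & \gamma \to \delta \\
7.2 & \to \text{ on a formula of } \Phi,\ \text{with } \Phi, \beta \rhd & \neg\gamma \\
8.1 & \to \text{ on a formula of } \Psi & \gamma \wedge \delta \\
9.1 & \forall \text{ on a formula of } \Phi & \delta \ \text{ or } \ \forall y\,\gamma \\
10.1 & \forall \text{ on a formula of } \Psi & \delta \ \text{ or } \ \exists y\,\gamma \\
11.1 & -\exists \text{ on } -\xi & \delta \ \text{ or } \ \exists y\,\gamma
\end{array}
\]
```

The pattern is that rules acting on the $\Phi$ side tend to produce $\vee$, $\to$ and $\forall$, while rules acting on the $\Psi$ side tend to produce $\wedge$ and $\exists$; the quantified forms are needed only when the term $t$ would otherwise fall outside the common vocabulary.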

We shall say that we can interpolate in $\langle \Phi; \alpha \rangle$ if we can interpolate in
$\langle \Phi; \emptyset, \alpha \rangle$. We then have:

12.2. Interpolation Theorem. Given a proof of $\Phi \rhd \alpha$, we can interpolate
in $\langle \Phi; \alpha \rangle$. ∎

Stated more fully, and replacing “$\rhd$” by “$\vdash_I$”, the Interpolation Theorem
asserts that if we have $\Phi \vdash_I \alpha$ then we can do at least one of the following
three things:
(1) Find a formula $\delta$ in the common vocabulary of $\Phi$ and $\alpha$ and prove
that $\Phi \vdash_I \delta$ and $\delta \vdash_I \alpha$ (in which case $\delta$ is an interpolant for $\langle \Phi; \alpha \rangle$).
(2) Prove that $\Phi \vdash_I$.
(3) Prove that $\vdash_I \alpha$.

12.3. Corollary. If $\Phi \vdash_I \alpha$ but $\Phi$ and $\alpha$ have no predicate symbol in
common, then $\Phi \vdash_I$ or $\vdash_I \alpha$. ∎
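
A small illustration may help fix the three alternatives. The formulas below are hypothetical examples chosen for this note (they do not come from the text): $P$, $Q$, $R$ are assumed to be distinct unary predicate symbols and $x$ a variable.

```latex
% Alternative (1): a genuine interpolant exists.
\[
\Phi = \{\, Px \wedge Qx \,\}, \qquad \alpha = Qx \vee Rx, \qquad \delta = Qx .
\]
% delta uses only the predicate symbol Q and the free variable x,
% both of which occur in Phi and in alpha, and
\[
Px \wedge Qx \ \vdash_I\ Qx, \qquad Qx \ \vdash_I\ Qx \vee Rx .
\]
% Corollary 12.3: no predicate symbol in common.
\[
\Phi = \{\, Px \,\}, \qquad \alpha = Qx \to Qx .
\]
% Here Phi |-_I alpha holds, Phi and alpha share no predicate symbol,
% and indeed alternative (3) applies: |-_I Qx -> Qx.
```

In the second example no interpolant in the sense of (1) can exist, since an interpolant would need at least one predicate symbol common to both sides; the corollary says that one of the degenerate alternatives must then hold.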

§13. Some results in classical logic

In this section we shall use Thm. 12.2 to derive the classical version of the
Interpolation Theorem, and from the latter we shall derive certain other
important results for classical first-order logic.
From now on we let $\mathcal{L}$ be an arbitrary first-order language, possibly with
equality and function symbols. Since we are concerned here with classical
logic, we shall revert to regarding only $\neg$, $\to$ and $\forall$ as primitive. The
other connectives and $\exists$ are introduced as abbreviations (see Def. 1.5.1).
We also revert to the terminology of Ch. 3. Thus, e.g., by deduction
we mean deduction in the classical first-order predicate calculus.
Let $\Phi$ be a set of formulas and let $\beta$ be a formula. We say that a formula
$\delta$ is in the common vocabulary of $\Phi$ and $\beta$ if the following three conditions
hold:
(1) Every variable free in $\delta$ is free in both $\Phi$ and $\beta$.
(2) Every one of the extralogical symbols (i.e., function symbols, including
constants, and predicate symbols other than the equality symbol $=$) occurring
in $\delta$ occurs also in both $\Phi$ and $\beta$.
(3) If $=$ occurs in $\delta$ then $\Phi \cup \{\beta\}$ contains $=$ or some function symbol
other than a constant.
We say that $\delta$ is an interpolant for $\langle \Phi; \beta \rangle$ if $\delta$ is in the common vocabulary
of $\Phi$ and $\beta$, and both $\Phi \vdash \delta$ and $\delta \vdash \beta$.
We say that $\langle \Phi; \beta \rangle$ has the interpolation property if there is an interpolant
for $\langle \Phi; \beta \rangle$, or $\Phi$ is inconsistent, or $\vdash \beta$.

13.1. Interpolation Theorem. If $\Phi \vdash \beta$, then $\langle \Phi; \beta \rangle$ has the interpolation
property.
Proof. Since any deduction of $\beta$ from $\Phi$ uses only finitely many formulas
of $\Phi$, we can assume without loss of generality that $\Phi$ is finite. Then we
can replace $\Phi$ by a conjunction of all its formulas. So from now on we
shall assume that $\Phi$ consists of a single formula $\alpha$.
We distinguish three cases.
Case 1: Both $\alpha$ and $\beta$ contain neither $=$ nor function symbols other
than constants. Then our theorem follows easily from Thm. 12.2, using
Thm. 11.12 or Prob. 11.14(ii).
Case 2: $\alpha$ or $\beta$ may contain $=$ but neither of them contains function
symbols other than constants.
If $P$ is an $n$-ary predicate symbol, let $\gamma_P$ be the sentence

$$\forall v_1 \dots \forall v_{2n}\,(v_1 = v_{n+1} \wedge \dots \wedge v_n = v_{2n} \to P v_1 \dots v_n \to P v_{n+1} \dots v_{2n}).$$

We let $\alpha'$ be the formula

$$\forall v_1 (v_1 = v_1) \wedge \gamma_{P_1} \wedge \dots \wedge \gamma_{P_k} \wedge \alpha,$$

where $P_1, \dots, P_k$ are all the distinct predicate symbols (including $=$) occurring
in $\alpha$. We let $\beta'$ be the formula

$$\forall v_1 (v_1 = v_1) \wedge \gamma_{Q_1} \wedge \dots \wedge \gamma_{Q_m} \to \beta,$$

where $Q_1, \dots, Q_m$ are all the distinct predicate symbols occurring in $\beta$.

Since $\alpha \vdash \beta$, it is easy to see that $\alpha' \vdash' \beta'$, where $\vdash'$ denotes deducibility
without using the axioms of equality (i.e., treating $=$ as if it were an
extralogical predicate symbol). By case 1, $\langle \{\alpha'\}; \beta' \rangle$ has the interpolation
property (with $=$ treated as extralogical). Three subcases arise:
Subcase 2.1: There is an interpolant $\delta$ for $\langle \{\alpha'\}; \beta' \rangle$, for which $\alpha' \vdash' \delta$
and $\delta \vdash' \beta'$. Then it is easy to see that $\alpha \vdash \delta$, $\delta \vdash \beta$, and $\delta$ is an interpolant
for $\langle \{\alpha\}; \beta \rangle$.
Subcase 2.2: $\{\alpha'\}$ is inconsistent when $=$ is treated as extralogical.
Then $\{\alpha\}$ is clearly inconsistent in the ordinary sense. (Actually in this
case too we have an interpolant: e.g., $\forall x(x \neq x)$.)
Subcase 2.3: $\vdash' \beta'$. Then clearly $\vdash \beta$. (In this case as well we have an
interpolant: e.g., $\forall x(x = x)$.)
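
To make the auxiliary sentences of case 2 concrete, here is what $\gamma_P$ looks like for a binary predicate symbol. This is a sketch under an assumption: it reads $\gamma_P$ as the congruence sentence $\forall v_1 \dots \forall v_{2n}(v_1 = v_{n+1} \wedge \dots \wedge v_n = v_{2n} \to (P v_1 \dots v_n \to P v_{n+1} \dots v_{2n}))$; the symbol $R$ is a hypothetical binary predicate symbol introduced only for this illustration.

```latex
% n = 2, extralogical binary predicate symbol R:
\[
\gamma_R \;=\; \forall v_1 \forall v_2 \forall v_3 \forall v_4\,
   \bigl( v_1 = v_3 \wedge v_2 = v_4 \to ( R v_1 v_2 \to R v_3 v_4 ) \bigr).
\]
% n = 2, the equality symbol itself, treated as an ordinary predicate:
\[
\gamma_{=} \;=\; \forall v_1 \forall v_2 \forall v_3 \forall v_4\,
   \bigl( v_1 = v_3 \wedge v_2 = v_4 \to ( v_1 = v_2 \to v_3 = v_4 ) \bigr).
\]
% Together with reflexivity, gamma_= yields symmetry: instantiate
% v_1, v_2, v_4 := a and v_3 := b to get
\[
a = b \wedge a = a \to ( a = a \to b = a ).
\]
```

So the conjuncts $\forall v_1(v_1 = v_1)$, $\gamma_{=}$ and the $\gamma_{P_i}$ carry, inside the formula itself, the equality reasoning that $\vdash'$ is no longer allowed to use as logical axioms.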
Case 3: $\alpha$ or $\beta$ may contain function symbols other than constants.
We proceed by induction on the number of such function symbols occurring
in $\alpha$ or $\beta$. Let $f$ be an $n$-ary function symbol occurring in $\alpha$ or $\beta$, where $n > 0$.
By Lemma 2.10.4, we can transform $\alpha$ into a logically equivalent formula
$\alpha^*$ in which $f$ only occurs in the form

$$f x_1 \dots x_n = y,$$

where $x_1, \dots, x_n, y$ are distinct variables. The procedure described in the
proof of Lemma 2.10.4 is such that $\alpha^*$ has the same free variables and
extralogical symbols as $\alpha$. Also, it is not difficult to see, even without
the Completeness Theorem, that $\alpha$ and $\alpha^*$ are provably equivalent.
Similarly, we transform $\beta$ into $\beta^*$.
Next, we introduce an $(n+1)$-ary predicate symbol $P$ that occurs neither
in $\alpha$ nor in $\beta$. We define $\alpha'$ as follows. If $f$ does not occur in $\alpha$, then $\alpha'$ is $\alpha$.
If $f$ occurs in $\alpha$, then $\alpha'$ is

$$\forall v_1 \dots \forall v_n \exists! v_{n+1}\, P v_1 \dots v_n v_{n+1} \wedge \alpha^{**},$$

where $\alpha^{**}$ is obtained from $\alpha^*$ when each atomic part of the form $f x_1 \dots x_n = y$
is replaced by $P x_1 \dots x_n y$. Also, we define $\beta'$ as follows. If $\beta$ does not
contain $f$, then $\beta' = \beta$. If $\beta$ contains $f$, then $\beta'$ is

$$\forall v_1 \dots \forall v_n \exists! v_{n+1}\, P v_1 \dots v_n v_{n+1} \to \beta^{**},$$

where $\beta^{**}$ is obtained from $\beta^*$ as $\alpha^{**}$ was obtained from $\alpha^*$.

Now, from the fact that $\alpha \vdash \beta$ it follows that $\alpha' \vdash \beta'$. To see this, we
note that $\{\alpha, \neg\beta\}$ is unsatisfiable; hence, as in the proof of Thm. 2.10.5,
we can see that $\{\alpha', \neg\beta'\}$ is unsatisfiable, so $\alpha' \vdash \beta'$ by the Completeness
Theorem. (We can in fact avoid using the Completeness Theorem and
argue in a constructive way: it is not difficult to show that any confutation
of $\{\alpha, \neg\beta\}$ can be transformed into a confutation of $\{\alpha', \neg\beta'\}$.)
Since $f$ does not occur in $\alpha'$ or $\beta'$, the induction hypothesis implies that
$\langle \{\alpha'\}; \beta' \rangle$ has the interpolation property. Again there are three subcases:
Subcase 3.1: There is an interpolant $\gamma$ for $\langle \{\alpha'\}; \beta' \rangle$. Let $\delta$ be the formula
obtained from $\gamma$ when each subformula $P t_1 \dots t_n s$ is replaced by $f t_1 \dots t_n = s$.
Then $\delta$ is an interpolant for $\langle \{\alpha\}; \beta \rangle$. This can be seen by showing, as in
the proof of Thm. 2.10.5, that $\{\alpha, \neg\delta\}$ and $\{\delta, \neg\beta\}$ are unsatisfiable;
or constructively, by showing that confutations of $\{\alpha', \neg\gamma\}$ and $\{\gamma, \neg\beta'\}$
can be transformed into confutations of $\{\alpha, \neg\delta\}$ and $\{\delta, \neg\beta\}$.
Subcase 3.2: $\{\alpha'\}$ is inconsistent. Then it is not difficult to show (either
as in Thm. 2.10.5 and using the Completeness Thm., or constructively)
that $\{\alpha\}$ is inconsistent.
Subcase 3.3: $\vdash \beta'$. Then, by the same method as before, we can
show that $\vdash \beta$. ∎
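
The replacement of a function symbol by a predicate symbol in case 3 can be traced on a tiny example. The following is only an illustrative sketch: $Q$ (a unary predicate symbol) and $f$ (a unary function symbol) are hypothetical, and the formula chosen for $\alpha^*$ is merely one formula of the required shape, not necessarily the one produced by the procedure of Lemma 2.10.4.

```latex
% Start with a formula in which f occurs inside an atomic formula:
\[
\alpha \;=\; Q f x .
\]
% An equivalent formula in which f occurs only in the form  f x = y :
\[
\alpha^{*} \;=\; \exists y\,( f x = y \wedge Q y ).
\]
% Replace each atomic part  f x = y  by  P x y  (P a new binary predicate):
\[
\alpha^{**} \;=\; \exists y\,( P x y \wedge Q y ).
\]
% Add the statement that P is the graph of a total function:
\[
\alpha' \;=\; \forall v_1 \exists! v_2\, P v_1 v_2 \;\wedge\; \exists y\,( P x y \wedge Q y ).
\]
% Translating back (subcase 3.1): each subformula  P t s  of an
% interpolant gamma is replaced by  f t = s  to obtain delta.
```

Note that the functionality condition is placed as a conjunct in $\alpha'$ but as a hypothesis in $\beta'$; this asymmetry is what keeps $\alpha' \vdash \beta'$ valid once $f$ has been removed.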
In what follows, $\Sigma$ is any set of $\mathcal{L}$-sentences and $P$ is an extralogical
$n$-ary predicate symbol of $\mathcal{L}$, but not the only predicate symbol of $\mathcal{L}$.
We say that $P$ is explicitly $\Sigma$-definable, if we have an $\mathcal{L}$-formula $\beta$ whose
free variables are among $v_1, \dots, v_n$ such that $P$ does not occur in $\beta$ and

(13.2) $\Sigma \vdash P v_1 \dots v_n \leftrightarrow \beta$.

Now let $P'$ be an $n$-ary predicate symbol not belonging to $\mathcal{L}$. For each
$\mathcal{L}$-formula $\alpha$ we let $\alpha'$ be the formula obtained from $\alpha$ when every
occurrence of $P$ is replaced by $P'$. For a set $\Phi$ of $\mathcal{L}$-formulas we put
$\Phi' = \{\varphi' : \varphi \in \Phi\}$. We say that $P$ is implicitly $\Sigma$-definable if

(13.3) $\Sigma \cup \Sigma' \vdash P v_1 \dots v_n \leftrightarrow P' v_1 \dots v_n$.

Semantically speaking, this means that if $\mathfrak{U}$ and $\mathfrak{U}'$ are any $\mathcal{L}$-structures
which are models of $\Sigma$, and which have the same universe and agree on the
interpretation of all extralogical symbols other than $P$, then $\mathfrak{U}$ and $\mathfrak{U}'$
must also agree on the interpretation of $P$.
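
A pair of toy examples shows the difference the two definitions are meant to capture. Both are hypothetical illustrations added here, with $P$, $Q$, $R$ assumed to be distinct unary predicate symbols of $\mathcal{L}$.

```latex
% P is explicitly (hence implicitly) Sigma-definable:
\[
\Sigma = \{\, \forall v_1 ( P v_1 \leftrightarrow Q v_1 \wedge R v_1 ) \,\},
\qquad \beta = Q v_1 \wedge R v_1 ,
\qquad \Sigma \vdash P v_1 \leftrightarrow \beta .
\]
% Correspondingly, Sigma' = { forall v_1 (P' v_1 <-> Q v_1 /\ R v_1) }, so
\[
\Sigma \cup \Sigma' \vdash P v_1 \leftrightarrow P' v_1 .
\]
% P is not implicitly Sigma-definable:
\[
\Sigma = \{\, \forall v_1 ( P v_1 \to Q v_1 ) \,\}.
\]
% Take the universe {0} with Q interpreted as {0}; interpreting P as the
% empty set or as {0} gives two models of Sigma that agree on everything
% except P.
```

In the second example the two models witness, via the semantic reading above, that (13.3) fails, so no explicit definition of $P$ from $\Sigma$ can exist either.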
The following result is known as Beth's Definability Theorem (for predicate
symbols).
13.4. Theorem. $P$ is explicitly $\Sigma$-definable iff it is implicitly $\Sigma$-definable.
Proof. First, assume that $P$ is explicitly $\Sigma$-definable, so that for an
appropriate $\mathcal{L}$-formula $\beta$ we have (13.2). Then clearly $\Sigma' \vdash P' v_1 \dots v_n \leftrightarrow \beta$ and
(13.3) follows instantly.
Conversely, let $P$ be implicitly $\Sigma$-definable, so that (13.3) holds. Without
loss of generality we may assume $\Sigma$ to be finite. Then we can replace
$\Sigma$ by a conjunction of all its sentences. So we may assume that $\Sigma$ consists
of a single sentence $\sigma$. Thus we have

$$\sigma, \sigma' \vdash P v_1 \dots v_n \leftrightarrow P' v_1 \dots v_n .$$

Hence, in particular,

(1) $\sigma, P v_1 \dots v_n \vdash \sigma' \to P' v_1 \dots v_n$.

By the Interpolation Theorem 13.1, at least one of the following three cases
must arise:
Case 1: There is an interpolant $\beta$ for $\langle \{\sigma, P v_1 \dots v_n\}; \sigma' \to P' v_1 \dots v_n \rangle$. Then
$\beta$ contains neither $P$ nor $P'$, and has no free variables other than $v_1, \dots, v_n$.
Also

$$\sigma, P v_1 \dots v_n \vdash \beta, \qquad \beta \vdash \sigma' \to P' v_1 \dots v_n .$$

It follows that $\sigma \vdash P v_1 \dots v_n \to \beta$ and $\sigma' \vdash \beta \to P' v_1 \dots v_n$, hence clearly also
$\sigma \vdash \beta \to P v_1 \dots v_n$. Thus $\sigma \vdash P v_1 \dots v_n \leftrightarrow \beta$, so we have (13.2).
Case 2: $\{\sigma, P v_1 \dots v_n\}$ is inconsistent. Since $P$ is not the only predicate
symbol of $\mathcal{L}$, we can find an $\mathcal{L}$-sentence $\gamma$ which does not contain $P$. Let
$\beta = \gamma \wedge \neg\gamma$. Then it is easy to see that $\sigma \vdash P v_1 \dots v_n \leftrightarrow \beta$, so again we
have (13.2).
Case 3: $\vdash \sigma' \to P' v_1 \dots v_n$. Then clearly also $\vdash \sigma \to P v_1 \dots v_n$. Let $\gamma$ be any
$\mathcal{L}$-sentence not containing $P$, and let $\beta = \gamma \to \gamma$. It is easy to see that
$\sigma \vdash P v_1 \dots v_n \leftrightarrow \beta$, as required. ∎

13.5. Problem (Beth’s Definability Theorem for function symbols). Let
$f$ be an $n$-ary function symbol of $\mathcal{L}$. We say that $f$ is explicitly $\Sigma$-definable
if there is an $\mathcal{L}$-formula $\beta$ which does not contain $f$ and with no free
variables other than $v_1, \dots, v_{n+1}$ such that $\Sigma \vdash f v_1 \dots v_n = v_{n+1} \leftrightarrow \beta$.
Let $f'$ be an $n$-ary function symbol not belonging to $\mathcal{L}$. For any
$\mathcal{L}$-formula $\alpha$, let $\alpha'$ be the formula obtained when $f$ is replaced by $f'$. Let
$\Sigma' = \{\sigma' : \sigma \in \Sigma\}$. Then $f$ is implicitly $\Sigma$-definable if

$$\Sigma \cup \Sigma' \vdash f v_1 \dots v_n = f' v_1 \dots v_n .$$

Prove that $f$ is explicitly $\Sigma$-definable iff it is implicitly $\Sigma$-definable.


13.6. Problem (A. Robinson’s Consistency Theorem). Let $\Sigma_1$ and $\Sigma_2$ be
consistent sets of $\mathcal{L}$-sentences. Suppose that, for every $\mathcal{L}$-sentence $\varphi$ such
that all the extralogical symbols occurring in $\varphi$ also occur in both $\Sigma_1$
and $\Sigma_2$, we have: if $\Sigma_1 \vdash \varphi$ then $\Sigma_2 \nvdash \neg\varphi$. Prove that $\Sigma_1 \cup \Sigma_2$ is consistent.

§14. Historical and bibliographical remarks

For a brief outline of intuitionism and its history see Kneebone [1963].
A more detailed outline is in Fraenkel, Bar-Hillel and Levy [1973].
A very readable explanation and defence of intuitionistic views is in
Heyting [1972]. An introduction to the technical aspects of intuitionistic
mathematics is in Troelstra [1969].
Bishop [1967] is an impressively successful attempt to develop various
branches of analysis by constructive (though not specifically
intuitionistic) means.
An outline of a general theory of constructions, of the kind needed to
make the explanations of §4 more precise, is proposed by Kreisel [1965].
A detailed rigorous development is Goodman [1970].
The tableau method of §5 is taken from Fitting [1969]; it is Smullyan’s
adaptation of the Gentzen-type system G3 of Kleene [1952].
Kripke [1965] proposed and investigated the semantics presented here
in §6. Broadly similar ideas were proposed by Beth [1956] and Grzegorczyk
[1964]. The possibility of giving a more constructive character to arguments
which employ Kripke’s semantics is outlined by Smorynski in Troelstra
[1973]. However, Smorynski rejects the claim that Kripke’s semantics
is a heuristically plausible explication of intuitionistic reasoning. In papers
to be published, H. de Swart and W. Veldman give a constructive version
of Kripke systems and provide a constructive proof of the Completeness
Theorem.
Our proof of the Elimination Theorem in §7 is an adaptation of the proof
of the analogous result for G3 in Kleene [1952].
The first version of the intuitionistic propositional calculus is in Heyting
[1930] and the first version of the intuitionistic predicate calculus is in
Heyting [1930a]. However, Heyting was anticipated to some extent by
Kolmogorov [1925]. The calculus presented here in §8 is taken from
Kleene [1952]. The version of the intuitionistic predicate calculus in §9
was suggested to us by D. H. J. de Jongh. It is equivalent to Heyting’s.
The first proof of the completeness of the intuitionistic predicate calculus
relative to Kripke’s semantics is due to Kripke [1965]. The proof presented
in §10 is adapted from Fitting [1969]. Similar proofs were invented
independently by Aczel [1967] and Thomason [1968].
The earliest version of a translation of classical into intuitionistic logic
is due to Kolmogorov [1925].
A wealth of information on intuitionistic formal systems of logic,
arithmetic and analysis can be found in Troelstra [1973].
Fitting [1969] applies Kripke’s semantics to obtain independence proofs
for formalized set theory.
The Interpolation Theorem was proved for the classical case by Craig
[1957] and extended to the intuitionistic case by Schütte [1962].
The Definability Theorems 13.4 and 13.5 go back to Padoa [1900],
who stated them without proof, as obvious. The first proof for first-order
logic was given by Beth [1953]. The Consistency Theorem (Prob. 13.6)
was proved by A. Robinson [1956], who used it to give another proof of
the Definability Theorems.
CHAPTER 10

AXIOMATIC SET THEORY

In this chapter we set up and develop a system of axiomatic set theory
and eventually prove Gödel’s celebrated result that, if the axioms of set
theory are consistent, then the Axiom of Choice and the Generalized
Continuum Hypothesis may be adjoined without destroying that consistency.

§ 1. Basic developments

The language of set theory is a first-order language $\mathcal{L}$ with equality. For
the purposes of this chapter it will be convenient to modify somewhat the
conventions of Ch. 1, §5 governing the metalinguistic symbols we have
heretofore used to denote $\mathcal{L}$-symbols and their combinations. Thus: we
assume that the individual variables of $\mathcal{L}$ are enumerated in a fixed
alphabetic sequence $v_0, v_1, \dots$ (starting now with $v_0$ rather than $v_1$). We shall
use lightface lower-case italic letters (possibly with subscripts) as
metalinguistic symbols ranging over these variables. Unless otherwise stated,
in any given context, distinct letters of this type are understood to refer to
distinct variables. The equality symbol will be denoted by a lightface $=$,
the connectives by lightface symbols $\wedge$, $\neg$, etc., and the quantifiers by
lightface symbols $\forall$, $\exists$. With the exception of certain defined formulas
and terms, we shall as before employ bold lower-case Greek letters $\varphi$, $\psi$, $\chi$,
etc., to denote formulas and bold lower-case Roman letters s, t, etc., to
denote terms.
We agree to take negation ($\neg$), conjunction ($\wedge$) and the existential
quantifier ($\exists$) as the primitive symbols of $\mathcal{L}$, the others, i.e. $\vee$, $\to$, $\leftrightarrow$, $\forall$,
being defined in terms of these in the usual way (Ch. 1, §14 and Ch. 3, §4).
The only extralogical symbol of $\mathcal{L}$ is the binary predicate $\in$. If $t$ and $t'$
are $\mathcal{L}$-terms, we write $t \in t'$ for $\in t t'$. $t \in t'$ is to be read “$t$ belongs to $t'$”, or
“$t'$ contains $t$” or simply “$t$ is in $t'$”.
We adopt the convention that if a formula $\varphi$ is first introduced as¹
$\varphi(x_1, \dots, x_k)$ and $t_1, \dots, t_k$ are terms (which may be either variables or
virtual terms as defined in §13 of Ch. 2, and further explained below),
then $\varphi(t_1, \dots, t_k)$ stands for $\varphi(x_1/t_1, \dots, x_k/t_k)$ (cf. 2.3.14). Similarly, if a term $s$
is first introduced as $s(x_1, \dots, x_k)$, then $s(t_1, \dots, t_k)$ stands for the result of
simultaneously substituting $t_1, \dots, t_k$ for $x_1, \dots, x_k$ in $s$.
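Defined formulas are abbreviations introduced as in the following scheme:

verbal expression: defined formula $\leftrightarrow_{df}$ defining formula,

where the defining formula is the $\mathcal{L}$-formula for which the defined formula
is to serve as an abbreviation, and the verbal expression (which is sometimes
omitted) indicates how the defined formula is to be read. The scheme is
called the definition of its defined formula. We make the following
definitions without further delay:

x is not equal to y: $x \neq y \leftrightarrow_{df} \neg(x = y)$;

x is not in y: $x \notin y \leftrightarrow_{df} \neg(x \in y)$;

x is a subset of y (or y includes x): $x \subseteq y \leftrightarrow_{df} \forall z(z \in x \to z \in y)$;

For some x in y, $\varphi$: $\exists x \in y\, \varphi \leftrightarrow_{df} \exists x(x \in y \wedge \varphi)$;

For all x in y, $\varphi$: $\forall x \in y\, \varphi \leftrightarrow_{df} \forall x(x \in y \to \varphi)$;

For some subset u of x, $\varphi$: $\exists u \subseteq x\, \varphi \leftrightarrow_{df} \exists u(u \subseteq x \wedge \varphi)$;

For all subsets u of x, $\varphi$: $\forall u \subseteq x\, \varphi \leftrightarrow_{df} \forall u(u \subseteq x \to \varphi)$;

There is at most one x such that $\varphi(x)$:

$\exists^* x \varphi \leftrightarrow_{df} \forall x \forall y[\varphi(x) \wedge \varphi(y) \to x = y]$, where $y$ is not free in $\varphi$.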
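
Since defined formulas are only abbreviations, each one can be unwound in stages. As an illustration (added here, not taken from the text), the block below unfolds "for all subsets $u$ of $x$, $\varphi$". The last step assumes the usual readings $\alpha \to \beta$ as $\neg(\alpha \wedge \neg\beta)$ and $\forall x\,\alpha$ as $\neg\exists x\,\neg\alpha$, and is stated only up to logical equivalence (double negations have been cancelled); $z$ is assumed distinct from $u$ and $x$.

```latex
\begin{align*}
\forall u \subseteq x\, \varphi
  &\;\leftrightarrow_{df}\; \forall u\,( u \subseteq x \to \varphi ) \\
  &\;\leftrightarrow_{df}\; \forall u\,\bigl( \forall z ( z \in u \to z \in x ) \to \varphi \bigr) \\
  &\;\equiv\; \neg \exists u\,\bigl( \neg \exists z\,( z \in u \wedge \neg ( z \in x ) ) \wedge \neg \varphi \bigr).
\end{align*}
```

The final line uses only $\neg$, $\wedge$, $\exists$ and $\in$, which is the shape every formula of the language ultimately has once all abbreviations are removed.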

Extralogical axioms (also called postulates) will be introduced as we
go along. Unless otherwise specified, we write “$\vdash \varphi$” for “the formula $\varphi$
is a logical consequence of the postulates introduced so far” or, equivalently,
for “$\varphi$ is deducible in the first-order predicate calculus from the postulates
introduced so far”. Since all the postulates are going to be sentences
of $\mathcal{L}$, we have

$$\vdash \varphi \iff \vdash \forall x \varphi .$$

Formulas $\varphi$ for which $\vdash \varphi$ are called (formal) theorems of set theory.

¹ It is to be understood that this is merely a device for drawing attention to the variables
$x_1, \dots, x_k$; it does not signify that all or indeed any of the $x_i$ occur free in $\varphi$.

Now, if set theory is to be more than a mere game with symbols, one
must posit¹ a (non-empty) domain or collection $V$ of objects called sets,
and a binary relation, membership, defined on $V$. The domain and the
membership relation together constitute the universe of sets. The postulates
(hence also the theorems) of set theory are supposed to be true in the
universe of sets. In other words, the postulates represent true statements
when $\in$ is interpreted as the membership relation and the variables of
$\mathcal{L}$ are taken as ranging over $V$. We may think of the universe of sets as
a rough approximation to “Cantor’s paradise” of naive set theory.
In accordance with the position we have just advocated, we shall assume
that all the formulas of $\mathcal{L}$ actually refer to the universe of sets. This
enables us to adopt a rather informal method of presenting deductions
in $\mathcal{L}$, namely, we deal with the variables of $\mathcal{L}$ as denoting, or ranging
over, sets, i.e. individuals of $V$. Thus, e.g., suppose we wish to prove
$\vdash \varphi(x)$. We take $\varphi(x)$ as asserting something about “the set $x$” (whereas
strictly speaking we should regard $\varphi(x)$ as expressing a condition on the
value of $x$ in $V$). We then proceed to show, using the postulates more or
less informally, that “every set $x$” satisfies the assertion $\varphi(x)$, and hence
we conclude $\vdash \varphi(x)$. Of course, all these informal deductions can quite
easily (but tediously!) be translated into formal deductions of the first-order
predicate calculus in $\mathcal{L}$.
We must distinguish very carefully between sets (i.e. objects in V) and
collections (i.e. pluralities of objects in the intuitive sense). Collections
of sets will be called classes. To each set x there corresponds a class, called
the extension of x, which is the collection of all members of x (i.e. objects
bearing the membership relation to x). A set and its extension are quite
different, but a fundamental part of the intuitive notion of set, which we
insist on incorporating into our formal theory, is that a set is uniquely
determined by its extension, that is, no two different sets have the same
extension. This is expressed by our first postulate, namely the Axiom of
Extensionality:

Ext: $\forall x \forall y\,[\forall z(z \in x \leftrightarrow z \in y) \to x = y]$.


The proof of our first theorem is easy and is left to the reader.

¹ Evidently we are adopting what amounts to a Platonist view on the question of the
existence of sets. This view is in some ways a dangerous oversimplification and would
certainly need, at the least, major revisions if it were to become a component of a
serious philosophy of mathematics. Nonetheless we regard the Platonist position,
despite its shortcomings, as being closest in spirit to the intuitive picture behind set theory.

1.1. Theorem. If $x$ is not free in $\varphi$, then

$$\vdash \exists x \forall y\,[y \in x \leftrightarrow \varphi] \to \exists! x \forall y\,[y \in x \leftrightarrow \varphi]. \ ∎$$

The argument used in the informal proof of Cantor’s classic result that
any set is of strictly smaller cardinality than its power set shows that there
cannot exist a one-one correspondence between all classes and all sets.
Thus, not all classes can be extensions of sets.
It might be thought that the reason for this disparity is that the notion
of class used here (i.e., a completely arbitrary collection of sets) is too
vague and general, and that, perhaps, if we could single out those classes
that can be “precisely described”, then these classes would be exactly all
set extensions. This prompts the following definition.
Let $\varphi$ be a formula whose free variables are among $y, y_1, \dots, y_k$ (where
$k \geq 0$; the variables $y_1, \dots, y_k$ are to be regarded as parameters). Assign
sets $a_1, \dots, a_k$ as values to the parameters $y_1, \dots, y_k$, respectively. Then the
given formula $\varphi$ and the parameter values $a_1, \dots, a_k$ are said to define a class,
namely, the collection of all sets $a$ such that $\varphi$ holds when $y$ takes the
value $a$ (and the parameters take their assigned values). A class obtained
in this way is said to be definable.
Using informal set-theoretic reasoning it is easy to show that if one
assumes the universe of sets to be infinite (as we shall want to do in any
case!), then the collection of all classes which are definable in the above
sense is not more numerous than the universe of sets and hence not more
numerous than the collection of extensions of all sets. Moreover, the
extension of any given set is definable by means of a formula with one
parameter: namely, take the formula $y \in y_1$ and assign the given set as
value to $y_1$. It is therefore tempting to postulate that every definable
class is the extension of a set. This assertion may be formalized as the
following scheme:

(1.2) $\forall y_1 \dots \forall y_k \exists x \forall y\,[y \in x \leftrightarrow \varphi]$,

where $\varphi$ is any formula with free variables among $y, y_1, \dots, y_k$. This scheme
is the (unrestricted) Comprehension Axiom.
Unfortunately, however, (1.2) is untenable even when $k = 0$, because it
leads to the well-known Russell paradox as follows. Take $\varphi$ to be $y \notin y$.
Then by (1.2) we get $\exists x \forall y\,(y \in x \leftrightarrow y \notin y)$. From this we immediately obtain
$\exists x\,(x \in x \leftrightarrow x \notin x)$, which is logically false.
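
The step from the comprehension instance to the contradiction is a single universal instantiation. Written out as a short derivation (a sketch added for illustration; $r$ is a name introduced here for a witness of the existential quantifier):

```latex
\begin{align*}
&\exists x\, \forall y\,( y \in x \leftrightarrow y \notin y )
   && \text{instance of (1.2) with } k = 0,\ \varphi := y \notin y \\
&\forall y\,( y \in r \leftrightarrow y \notin y )
   && \text{choose a witness } r \text{ for } x \\
&r \in r \leftrightarrow r \notin r
   && \text{instantiate } y := r \\
&\exists x\,( x \in x \leftrightarrow x \notin x )
   && \text{existential generalization}
\end{align*}
```

A biconditional between a formula and its own negation is false under every interpretation, so no consistent system can contain this instance of (1.2).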
Accordingly, we cannot adopt (1.2) in general. But because of its
naturalness we shall adopt as postulates certain special cases of (1.2) for particular
formulas $\varphi$.
The Axiom of Replacement (briefly, Rep) which we shall now adopt is
not a single axiom but an axiom scheme. For each formula $\varphi$ with free
variables among $x, y, y_1, \dots, y_k$ we take as a postulate the corresponding
instance of Rep, namely

Rep: $\forall y_1 \dots \forall y_k \forall u\,[\forall x \in u\, \exists^* y \varphi \to \exists z \forall y\,[y \in z \leftrightarrow \exists x \in u\, \varphi]]$,

where¹ $z$ does not occur free in $\varphi$.


It is easy to see in what sense Rep is a particular case of the Comprehension
Axiom. If we ignore the initial $k + 1$ quantifiers, Rep is an implication.
The antecedent says that $\varphi$ defines a partial single-valued mapping on the
extension of $u$, while the consequent says that the class of all images of
members of $u$ is the extension of some set $z$. Thus Rep asserts that any
class which is the class of all images of the members of some given set
under a definable partial single-valued mapping is itself the extension
of a set.

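1.3. Theorem. If $z$ is not free in $\psi$, then:

(i) $\vdash \forall u \exists z \forall y\,[y \in z \leftrightarrow y \in u \wedge \psi]$.
(ii) $\vdash \exists z \forall y\,[\psi \to y \in z] \to \exists z \forall y\,[y \in z \leftrightarrow \psi]$.
Proof. For (i), put $\varphi$ for $x = y \wedge \psi$. It is then clear that we have $\vdash \forall x \in u\, \exists^* y \varphi$.
Hence, by Rep, we conclude that

$$\vdash \exists z \forall y\,[y \in z \leftrightarrow \exists x \in u\, \varphi].$$

But obviously we have

$$\vdash \exists x \in u\, \varphi \leftrightarrow y \in u \wedge \psi,$$

so we obtain $\vdash \exists z \forall y\,[y \in z \leftrightarrow y \in u \wedge \psi]$ as required.

(ii) follows easily from (i). ∎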

The scheme of sentences of the form

Sep: $\forall y_1 \dots \forall y_k \forall u \exists z \forall y\,[y \in z \leftrightarrow y \in u \wedge \psi]$,

where $z$ is not free in $\psi$, is known as the Axiom of Separation. In the present
axiomatic system it does not have to be introduced as a separate postulate
because, as we have shown in Thm. 1.3, it is deducible from Rep.
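
A concrete instance of the Separation scheme may make its content clearer. The choice of $\psi$ below is an illustration added here, with a single parameter written $w$ in place of $y_1$; the name "intersection" is used informally, since no such term has been introduced at this point in the text.

```latex
% Take psi to be the formula  y \in w  (one parameter, w):
\[
\forall w\, \forall u\, \exists z\, \forall y\,[\, y \in z \leftrightarrow y \in u \wedge y \in w \,].
\]
% Read informally: for any sets u and w there is a set z whose members
% are exactly the common members of u and w.
```

Unlike the unrestricted scheme (1.2), every instance of Sep carves its new set out of a set $u$ that is already given, which is why the Russell argument no longer goes through.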

1 Strictly speaking, we should specify 2 uniquely, e.g. as the^t variable not free in tp.
However, here and in similar contexts below, we omit such niceties.

464 AXIOMATIC SET THEORY [CH. 10, §1

1.4. Problem. Show that V is not the extension of any set.

We now establish the existence of a unique set with empty extension.

1.5. Theorem. ⊢ ∃!z ∀y(y∈z ↔ y≠y).

Proof. In 1.3(i) take ψ to be y≠y. Then we have

⊢ ∃z ∀y(y∈z ↔ y≠y ∧ y∈u).
But
⊢ (y≠y ∧ y∈u) ↔ y≠y,

since both sides are logically false. Hence

⊢ ∃z ∀y(y∈z ↔ y≠y),

and the theorem follows by 1.1. ∎

From now on we shall often use the method explained in §13 of Ch. 2
to introduce virtual terms. In each case, we have to select (in the notation
of §13 of Ch. 2) a formula α with one free variable, and a second formula φ.
We shall always take α to be the formula ∀y(y∈z ↔ y≠y). Note that by
1.5 we have Φ ⊨ ∃!zα, where Φ is the collection of all the postulates introduced
so far.¹ As φ we shall select various formulas from time to time, beginning,
in fact, with α itself. Recall that while in practice we can manipulate
virtual terms as ordinary terms in some extension of ℒ, we nevertheless
regard a formula containing such terms as an abbreviation for a suitable
ℒ-formula. Thus, in particular, a formula containing virtual terms introduced
previously can perfectly well be used as the φ for introducing new
virtual terms.
We define the free variables of a term t inductively as follows: if t is
a variable x, the only free variable of t is x, and if t is ιyφ, then the free
variables of t consist of all free variables of φ with the exception of y.
Abbreviated notations for terms are introduced by means of a method
similar to that used for introducing defined formulas, viz., according to
the scheme:
verbal expression: defined term =df defining term,

where the defining term is the (virtual) ℒ-term for which the defined term
is to serve as an abbreviation, and the verbal expression (which is sometimes
omitted) indicates how the defined term is to be read. The scheme is called
a definition of its defined term.

¹ As a matter of fact it is easy to see from the proofs of 1.3 and 1.5 that it is enough
to take Φ to be Ext plus a single instance of Rep, namely the instance where the formula
φ is x=y ∧ y≠y.
We now define:

the empty set or zero: 0 =df ιz[∀y(y∈z ↔ y≠y)].

The term 0 is called the zeroth numeral.

1.6. Lemma.
(i) ⊢ y∉0.
(ii) ⊢ z=0 ↔ ∀y(y∉z).
(iii) ⊢ y=ιxφ(x) ↔ [∃!xφ(x) ∧ φ(y)] ∨ [¬∃!xφ(x) ∧ y=0].
(iv) ⊢ ∃!xφ(x) → φ(ιxφ).
Proof. This lemma follows immediately from the above definitions and
Prob. 2.13.3. ∎
We see from (iii) of this lemma that ιxφ(x) is the unique x such that
φ(x) if such an x exists, or 0 if not.
We now define

the set of x such that φ: {x : φ} =df ιz[∀x(x∈z ↔ φ)],

where z is not free in φ. (Note that x is not free in the term {x : φ}.)
Terms of the form {x : φ} are called abstraction terms. An abstraction
term {x : φ(x)} is said to be legitimate if we have

(1.7) ⊢ y∈{x : φ(x)} ↔ φ(y).

Thus the term {x : φ(x)} is legitimate iff the extension of the set it denotes
is precisely the class defined by φ.
We observe that {x : x∈y} is legitimate and ⊢ {x : x∈y}=y.

1.8. Lemma. The abstraction term {x : φ(x)} is legitimate iff

⊢ ∃z ∀x[φ(x) → x∈z].

Proof. Necessity follows immediately from (1.7). Conversely, suppose that

⊢ ∃z ∀x[φ(x) → x∈z].

Then, by 1.3(ii) and 1.1 we have

⊢ ∃!z ∀x[x∈z ↔ φ(x)],

so {x : φ(x)} is legitimate by 1.6(iv). ∎



Lemma 1.8 tells us that the abstraction term {x : φ(x)} is legitimate
iff the class defined by φ(x) is included in the extension of some set.
We shall be introducing many abstraction terms in the future, and they
will all be legitimate. In some cases when a new abstraction term is introduced,
we shall leave the reader to verify its legitimacy. In practice this will
amount to nothing more than an application of Lemma 1.8 and the
postulates that will have been introduced.
Our next postulate is the Axiom of Union

Union: ∀z ∃x ∀y[∃u∈z(y∈u) ↔ y∈x].

This postulate asserts that for each set z there is a (unique) set whose
extension is precisely the collection of all members of members of z.
We define
union of z: ⋃z =df {y : ∃u∈z(y∈u)}.
Thus ⋃z is the unique set whose extension is the collection of all members of
members of z.
We now introduce our next postulate, the Power Set Axiom
Pow: ∀z ∃x ∀y[y⊆z ↔ y∈x].
This postulate asserts that for each set z there is a (unique) set whose
extension is the collection of all subsets of z.

Problem. Show that it would have been enough to postulate Union and
Pow with → in place of ↔.

We define
power set of z: Pz =df {y : y⊆z}.
Thus Pz is the unique set whose extension is the collection of all subsets of z.
We also put

one: 1 =df P0,

two: 2 =df P1.

The terms 1 and 2 are called the first and second numerals, respectively.

1.9. Theorem.

(i) ⊢ 0≠1.
(ii) ⊢ x∈1 ↔ x=0.
(iii) ⊢ x∈2 ↔ x=0 ∨ x=1.
Proof. Left to the reader. ∎
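
The numerals 0, 1 = P0 and 2 = P1 can be computed explicitly for hereditarily finite sets. The sketch below is an illustration only: it assumes sets are modelled by Python frozensets, and the names powerset, ZERO, ONE and TWO are ours, not the text's.

```python
from itertools import combinations


def powerset(z):
    """Return the set of all subsets of the finite set z."""
    elems = list(z)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


ZERO = frozenset()      # the empty set
ONE = powerset(ZERO)    # 1 = P0
TWO = powerset(ONE)     # 2 = P1
```

With these definitions ONE has the single member ZERO, and TWO has exactly the members ZERO and ONE, in agreement with the finite content of Thm. 1.9.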

1.10. Theorem. ⊢ ∃x ∀y[y∈x ↔ y=u ∨ y=v].

Proof. Let φ(z,y) be the formula

[(z=0 ∧ y=u) ∨ (z=1 ∧ y=v)].

By 1.9 we have 0≠1, so it follows that ⊢ ∀z∈2 ∃*yφ(z,y). Hence,
by Rep,
⊢ ∃x ∀y[y∈x ↔ ∃z∈2 φ(z,y)],

which immediately gives the required result. ∎

We now define:
unordered pair of u and v: {u,v} =df {y : y=u ∨ y=v};

singleton of u: {u} =df {u,u};

union of u and v: u∪v =df ⋃{u,v};

intersection of u and v: u∩v =df {x : x∈u ∧ x∈v};

complement of v in u: u−v =df {x : x∈u ∧ x∉v};

ordered singleton of u: ⟨u⟩ =df u;

ordered pair of u and v: ⟨u,v⟩ =df {{u},{u,v}};

Cartesian product of u and v: u×v =df {x : ∃y∈u ∃z∈v(x=⟨y,z⟩)}.

To verify the legitimacy of the defining term in the definition of the Cartesian
product, notice that
⊢ y∈u ∧ z∈v → ⟨y,z⟩∈PP(u∪v)

and apply 1.8.
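
For finite sets these definitions can be carried out literally. The following sketch assumes, as an illustration only, that sets are Python frozensets; all function names are ours. It builds the ordered pair ⟨u,v⟩ = {{u},{u,v}} and the Cartesian product, and checks the inclusion in PP(u∪v) used above to justify legitimacy.

```python
def pair(u, v):
    """Unordered pair {u, v}."""
    return frozenset({u, v})


def singleton(u):
    """{u} defined as {u, u}."""
    return pair(u, u)


def big_union(z):
    """Set of all members of members of z."""
    return frozenset(y for u in z for y in u)


def union(u, v):
    """u ∪ v defined as the union of the pair {u, v}."""
    return big_union(pair(u, v))


def opair(u, v):
    """Ordered pair <u, v> = {{u}, {u, v}}."""
    return pair(singleton(u), pair(u, v))


def product(u, v):
    """Cartesian product of two finite sets."""
    return frozenset(opair(y, z) for y in u for z in v)


def in_double_power(p, w):
    """p is a member of PP(w) iff every member of p is a subset of w."""
    return all(member <= w for member in p)


ZERO = frozenset()
ONE = singleton(ZERO)
TWO = pair(ZERO, ONE)
u = TWO               # {0, 1}
v = pair(ONE, TWO)    # {1, 2}
```

Every member of product(u, v) satisfies in_double_power with respect to union(u, v), which is the finite shadow of the remark that the product is included in the extension of PP(u∪v).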


Proceeding inductively, we define, for n≥3,

{u₁,...,uₙ} =df {u₁,...,uₙ₋₁} ∪ {uₙ},

u₁∪...∪uₙ =df (u₁∪...∪uₙ₋₁)∪uₙ,

u₁∩...∩uₙ =df (u₁∩...∩uₙ₋₁)∩uₙ,

⟨u₁,...,uₙ⟩ =df ⟨⟨u₁,...,uₙ₋₁⟩,uₙ⟩,

u₁×...×uₙ =df (u₁×...×uₙ₋₁)×uₙ.

Notice that by our previous definitions we already have

{u₁,u₂}={u₁}∪{u₂},

⟨u₁,u₂⟩=⟨⟨u₁⟩,u₂⟩.

We also put

uⁿ =df u×...×u (with n factors).

1.11. Theorem.
(i) ⊢ ⟨u₁,...,uₙ⟩ = ⟨v₁,...,vₙ⟩ ↔ u₁=v₁ ∧ ... ∧ uₙ=vₙ.
(ii) ⊢ u₁×...×uₙ = {y : ∃x₁∈u₁...∃xₙ∈uₙ(y=⟨x₁,...,xₙ⟩)}.
Proof. Left to the reader. ∎

We also put

(1.12) the set of x in u such that φ: {x∈u : φ} =df {x : x∈u ∧ φ}.

(The defining term here is legitimate by 1.3(i).)

Let t be a term which is not a variable, and let φ be any formula. We put

(1.13) {t : x₁∈u₁ ∧ ... ∧ xₙ∈uₙ ∧ φ} =df {z : ∃x₁∈u₁...∃xₙ∈uₙ[z=t ∧ φ]},

where z is not free in t nor in φ.

To verify the legitimacy of the defining term in this definition, put
ψ =df ∃x₁∈u₁...∃xₙ∈uₙ[z=t ∧ φ ∧ y=⟨x₁,...,xₙ⟩].
Then clearly ⊢ ∀y∈u₁×...×uₙ ∃*zψ. Hence, by Rep,
(1) ⊢ ∃w ∀z[∃y∈u₁×...×uₙ ψ → z∈w].

Also, from the definition of ψ,

(2) ⊢ ∃x₁∈u₁...∃xₙ∈uₙ[z=t ∧ φ] → ∃y∈u₁×...×uₙ ψ.

From (1), (2) and 1.8 the desired result follows easily.

§ 2. Ordinals

We define
epsilon well-orders x:

Ew(x) ↔df ∀y∈x ∀z∈x[y=z ∨ y∈z ∨ z∈y]

∧ ∀u⊆x[u≠0 → ∃y∈u ∀z∈u[z∉y]].

The second clause in this definition requires that every non-empty subset of
x has a member which is “minimal” with respect to the membership relation.
We also put
x is transitive: Trans(x) ↔df ∀y∈x[y⊆x],
x is an ordinal: Ord(x) ↔df Ew(x) ∧ Trans(x).

2.1. Problem. Show that ⊢ Ord(0) ∧ Ord(1) ∧ Ord(2).
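
For hereditarily finite sets the predicates Ew, Trans and Ord are decidable by brute force, which gives a quick sanity check of the definitions (it is, of course, no substitute for the derivation asked for in Prob. 2.1). The sketch assumes sets are modelled by Python frozensets; the function names are ours.

```python
from itertools import combinations


def nonempty_subsets(x):
    """Yield every non-empty subset of the finite set x."""
    elems = list(x)
    for r in range(1, len(elems) + 1):
        for c in combinations(elems, r):
            yield frozenset(c)


def is_eps_well_ordered(x):
    """Ew(x): membership is connected on x, and every non-empty
    subset of x has a member that is minimal under membership."""
    connected = all(y == z or y in z or z in y for y in x for z in x)
    has_minimal = all(
        any(all(z not in y for z in u) for y in u)
        for u in nonempty_subsets(x)
    )
    return connected and has_minimal


def is_transitive(x):
    """Trans(x): every member of x is a subset of x."""
    return all(y <= x for y in x)


def is_ordinal(x):
    """Ord(x): Ew(x) and Trans(x)."""
    return is_eps_well_ordered(x) and is_transitive(x)


def successor(x):
    """x + 1 = x ∪ {x}."""
    return x | frozenset({x})


ZERO = frozenset()
ONE = successor(ZERO)
TWO = successor(ONE)
```

On this model ZERO, ONE and TWO all pass is_ordinal, whereas the set {1} does not, because it fails to be transitive.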

We shall use lower case Greek letters, chiefly α, β, γ, λ, ξ, as
a new kind of variable ranging over the ordinals. This means that ∀αφ(α)
and ∃αφ(α) stand for ∀x[Ord(x) → φ(x)] and ∃x[Ord(x) ∧ φ(x)], respectively,
where x is a variable not occurring in φ(α). Also, if α, β, γ,... are
free in φ(α,β,γ,...), then the assertion ⊢ φ(α,β,γ,...) stands for

⊢ Ord(x) ∧ Ord(y) ∧ Ord(z) ∧ ... → φ(x,y,z,...),

where x,y,z,... are variables not occurring in φ(α,β,γ,...). Expressions like
∃!αφ(α), ιαφ(α) and {α : φ(α)} are defined similarly.
We define

x is smaller than y: x<y ↔df x∈y,

successor of x: x+1 =df x∪{x},

but we shall use these notations for ordinals only. We write ∃α<β φ for
∃α[α<β ∧ φ], etc.
Our next theorem tabulates some of the basic facts about ordinals.

2.2. Theorem.
(i) ⊢ α∉α.
(ii) ⊢ Ord(α+1) ∧ α<α+1.
(iii) ⊢ y∈α → Ord(y).
(iv)¹ ⊢ γ<β<α → γ<α.
(v) ⊢ y⊆α ∧ Trans(y) → y=α ∨ y∈α.
(vi)² ⊢ α≤β ∨ β≤α.
(vii) ⊢ ¬(α<β<α) ∧ ¬(α<α).
(viii) ⊢ α<β+1 ↔ α≤β.
(ix) ⊢ ∀y∈x Ord(y) → Ew(x) ∧ Ord(⋃x).
(x) ⊢ ∀y∈x Ord(y) → [⋃x≤β ↔ ∀α∈x(α≤β)].
Proof. Before we begin, let us define

y is a minimal element of x: Min(y,x) ↔df y∈x ∧ ∀z∈x[z∉y].

(i) If y∈α, then y∉y. For if not, then {y} would be a subset of α with
no minimal element, contradicting the definition of an ordinal. Thus
⊢ α∈α → α∉α, whence ⊢ α∉α.
(ii) is straightforward, and we leave its proof as an exercise to the reader.

¹ We write x∈y∈z for x∈y ∧ y∈z, and γ<β<α for γ<β ∧ β<α.

² We write α≤β for α<β ∨ α=β, and ∃α≤β φ for ∃α[α≤β ∧ φ], etc.

(iii) Let y∈α. Since Trans(α), we have y⊆α, and since also Ew(α), it
follows immediately that Ew(y). It remains to show that Trans(y), i.e.,
given that u∈z∈y we must show that u∈y. Since Trans(α), we have z∈α
and u∈α, and since Ew(α), we have either y∈u or y=u or u∈y. Either of
the first two alternatives implies that {u,z,y} has no minimal member,
which is impossible. Hence u∈y and (iii) follows.
(iv) follows at once from Trans(α).
(v) Suppose y⊆α and Trans(y). Put u=α−y. If u=0, then y=α.
If u≠0, then u has a minimal member v. Since u⊆α, we have v∈α. To
show that y∈α it will suffice to show that y=v. If z∈v, then z∈α, since
we have Trans(α); but by the minimality of v in u we cannot have z∈u.
Thus by the definition of u we must have z∈y. Conversely, let z∈y. Then
z∈α, and, since Ew(α), we have z=v or v∈z or z∈v. Now z=v would
imply v∈y, which is impossible, while v∈z would, in the presence of the
assumption Trans(y), also imply v∈y. Thus the only remaining possibility
is z∈v.
(vi) Consider the set α∩β. It is clearly transitive and so, by (iii) and (v),
an ordinal, γ say. Applying (v) again, we see that γ≤α and γ≤β. Then
γ=α or γ=β, for γ≠α and γ≠β implies, by (v), γ∈α∩β=γ which, according
to (i), is impossible.
(vii) follows immediately from (i) and (iv).
(viii) is a straightforward consequence of (ii), (vi) and (vii), which we
leave as an exercise to the reader.
(ix) Suppose ∀y∈x Ord(y), and let 0≠u⊆x. We claim that u has
a minimal element. Since u≠0, we can choose α∈u. If α∩u=0, then
clearly α is a minimal element of u, while if α∩u≠0, then a minimal element
of α∩u is easily seen to be a minimal element of u. This proves the claim,
which, together with (iv) and (vi), gives Ew(x).
Now let z=⋃x. Clearly we have Trans(z). Moreover, it is equally clear
that ∀y∈z Ord(y), so that Ew(z) by the first part of the proof. Hence Ord(z).
(x) Suppose ∀y∈x Ord(y), and let z=⋃x. Then Ord(z) by (ix). If α∈x,
then α⊆z, so that α≤z by (v). Thus if ⋃x≤β then ∀α∈x(α≤β). Conversely
if ∀α∈x(α≤β), then ∀α∈x(α⊆β) so that z⊆β, whence by (v) z≤β. ∎

It follows immediately from (ix) and (vii) of this theorem that each
non-empty set x of ordinals has a unique minimal member. We call this
the least member of x.
We now show that there is no set whose extension includes the class of
all ordinals (cf. Prob. 1.4.).

2.3. Theorem. ⊢ ¬∃x ∀y[Ord(y) → y∈x].

Proof. Suppose ∃x ∀y[Ord(y) → y∈x]. Then, by 1.3(ii), there is a set
x such that ∀y[Ord(y) ↔ y∈x]. We have Ord(x) by 2.2(ix) and (iii), and
it follows that x∈x. But this contradicts 2.2(i). ∎

2.4. Theorem. If β does not occur free in φ(α), then:

(i) ⊢ ∃αφ(α) → ∃!α[φ(α) ∧ ∀β<α ¬φ(β)].
(ii) ⊢ ∀α[∀β<α φ(β) → φ(α)] → ∀αφ(α).
Proof. (ii) is an easy consequence of (i), so we merely prove (i).
Assume ∃αφ(α), and choose γ so that φ(γ). Let

u={β : β<γ ∧ φ(β)}.

Then u⊆γ. If u=0, then γ is easily seen to satisfy φ(γ) ∧ ∀β<γ ¬φ(β).
On the other hand, if u≠0, then the least member of u satisfies the above
condition. Uniqueness follows from 2.2(vi). ∎

2.4(ii) is called the principle of transfinite induction (on the ordinals).
It is a generalization of the familiar principle of induction on the natural
numbers.
We now define

least ordinal α such that φ(α): μαφ(α) =df ια[φ(α) ∧ ∀β<α ¬φ(β)],

where β is not free in φ(α), and

x is a limit ordinal: Lim(x) ↔df Ord(x) ∧ x≠0 ∧ ∀α(x≠α+1).

2.5. Problem. Show that:

(i) ⊢ Lim(β) ∧ α<β → α+1<β.
(ii) ⊢ ∀y∈x Ord(y) → ⋃x=μα[∀β∈x(β≤α)].
(iii) If β, λ do not occur free in φ(α), then

⊢ φ(0) ∧ ∀α[φ(α) → φ(α+1)]

∧ ∀λ[Lim(λ) ∧ ∀β<λ φ(β) → φ(λ)] → ∀αφ(α).

We now introduce the natural numbers. We define

x is a natural number: N(x) ↔df Ord(x) ∧ ∀α≤x ¬Lim(α).

We shall use lower case italic letters from the middle of the alphabet,
chiefly i,j,k,m,n, as variables ranging over the natural numbers in
exactly the same way as we have been using Greek letters as variables
ranging over the ordinals. We shall also continue to use these letters
metamathematically (e.g. as indices) but in each case it should be clear
from the context what usage is intended.
We have already defined the numerals 0, 1 and 2. For n≥3 we define
the nth numeral recursively by putting n =df k∪{k}, where k=n−1.

2.6. Theorem.
(i) ⊢ N(n) for each natural number n.
(ii) ⊢ N(n+1) ∧ [α<n → N(α)].
(iii) ⊢ ∃nφ(n) → ∃!n[φ(n) ∧ ∀m<n ¬φ(m)], where m is not free in φ(n).
(iv) ⊢ φ(0) ∧ ∀n[φ(n) → φ(n+1)] → ∀nφ(n).
(v) ⊢ 0∈x ∧ ∀y∈x[y∪{y}∈x] → ∀n[n∈x].
Proof. (i) and (ii) are easily proved; we leave this task to the reader.
To prove (iii), take a Greek variable, say α, which is not free in φ(n),
and put N(α) ∧ φ(α) for φ(α) in 2.4(i).
To prove (iv), assume ¬∀nφ(n). Then by (iii) there must be a natural
number n such that

¬φ(n) ∧ ∀m<n φ(m).

If n=0, then ¬φ(0). On the other hand, if n≠0 then, since ¬Lim(n),
we must have n=α+1 for some α. Then α<n and so N(α) by (ii). Thus
N(α) ∧ φ(α) ∧ ¬φ(α+1), which implies

¬∀n[φ(n) → φ(n+1)].

Finally, (v) is obtained by taking φ(n) to be the formula n∈x in (iv). ∎

2.6(v) is the (second-order) induction axiom for natural numbers. The
other Peano axioms, viz. ∀n[n+1≠0], ∀n∀m[n+1=m+1 → n=m] are easily
derivable as theorems of set theory. Also, the axioms for addition and
multiplication of natural numbers become provable when these operations
are suitably defined.

2.7. Theorem. ⊢ ∃x[0∈x ∧ ∀y∈x[y∪{y}∈x]] ↔ ∃x ∀y[y∈x ↔ N(y)].

Proof. Assuming the left-hand side we deduce ∃x ∀n[n∈x] from 2.6(v).
1.3(ii) now gives the right-hand side. The converse follows immediately
from 2.6(ii). ∎

The Axiom of Infinity (briefly: Inf) is the left-hand side of the formula
proved in 2.7, namely

Inf: ∃x[0∈x ∧ ∀y∈x[y∪{y}∈x]].


By 2.7 this is tantamount to postulating the existence of a set whose extension
is the class of all natural numbers, so we can define

the set of natural numbers: ω =df {y : N(y)}.

2.8. Theorem. ⊢ Lim(ω).

Proof. Since ⊢ ∀x∈ω Ord(x), we have ⊢ Ew(ω) by 2.2(ix). By 2.6(ii)
we have ⊢ Trans(ω); therefore ⊢ Ord(ω).
If α<ω, i.e. α∈ω, then N(α) and so ¬Lim(α). If we had ¬Lim(ω) as
well, then ¬∃α≤ω Lim(α), so that we would have N(ω) and hence ω∈ω,
which contradicts 2.2(i). ∎

We now make the following definitions:

f of x: f‘x =df ιy[⟨x,y⟩∈f];

f is a function: Fun(f) ↔df ∀x∈f ∃u ∃v[⟨u,v⟩=x] ∧

∀u ∀v ∀w[⟨u,v⟩∈f ∧ ⟨u,w⟩∈f → v=w];

domain of f: dom(f) =df {u : ∃v[⟨u,v⟩∈f]};

range of f: ran(f) =df {v : ∃u[⟨u,v⟩∈f]};

restriction of f to x: f↾x =df f∩[x×ran(f)].

To legitimize the defining terms in the definitions of dom(f) and ran(f),
we observe that

⊢ ⟨u,v⟩∈f → u∈⋃⋃f ∧ v∈⋃⋃f.

For each term t(y) we also define¹

t|x =df {⟨y,t(y)⟩ : y∈x}.

Clearly we have, for any term t(y)

⊢ Fun(t|x) ∧ dom(t|x)=x.

The reader will doubtless be familiar with the process of constructing
functions by (course of values) recursion on the natural numbers. This
process yields a function defined on the natural numbers whose value
at a natural number n is determined by n and the behaviour of the function
at all m<n. We are now going to show that a similar procedure may be
employed in set theory to construct terms which behave in a prescribed
way on the class of all ordinals.

¹ y need not be the only variable free in t, but it will always be clear from the context
which variable is intended to play the role of y.

2.9. Theorem. For each term s(y,z) we can construct a term t(x) such that¹

⊢ t(α) = s(t|α,α).

¹ Of course, y and z are not assumed to be the only free variables of s, nor is x assumed
to be the only free variable of t. In fact, the t constructed in the following proof has as
free variables x and all the free variables of s other than y and z.

Proof. Let us define

f meets the recursive conditions up to x (with respect to s):

Rec(f,x) ↔df Fun(f) ∧ x∪{x}⊆dom(f)

∧ ∀α∈dom(f)[f‘α = s(f↾α,α)].

Then clearly we have

(1) ⊢ Rec(f,γ) ∧ α<γ → f‘α = s(f↾α,α) ∧ Rec(f,α).

Moreover, we have

(2) ⊢ Rec(f,γ) ∧ Rec(g,γ) → f‘γ = g‘γ.

For suppose (2) fails for some γ. Then by 2.4(i) there is a least ordinal
γ₀ for which it fails. Then we have

Rec(f,γ₀) ∧ Rec(g,γ₀) ∧ f‘γ₀≠g‘γ₀,

and therefore

s(f↾γ₀,γ₀) = f‘γ₀ ≠ g‘γ₀ = s(g↾γ₀,γ₀).

Accordingly f↾γ₀≠g↾γ₀ and so there must be some β<γ₀ for which
f‘β≠g‘β. But by (1) we have Rec(f,β) and Rec(g,β). This contradicts the assumption
that γ₀ is the least ordinal satisfying these conditions, and (2) follows.

Now put t(x) for the term

ιy[∀f[Rec(f,x) → f‘x=y]].

Then we have

(3) ⊢ ∃f Rec(f,α) → t(α)=s(t|α,α).

For, if Rec(f,α), then from (2) and the definition of t we see that t(α)=f‘α.

Also, if β<α, then Rec(f,β) so that t(β)=f‘β. But then we get

t(α) = f‘α = s(f↾α,α) = s(t|α,α)

as claimed.
Now, by (3), we will obtain ⊢ t(α) = s(t|α,α) if we can show that

⊢ ∃f Rec(f,α).

To prove this we use the principle of transfinite induction. Assume that
∀β<α ∃f Rec(f,β). Now put g for

t|α ∪ {⟨α,s(t|α,α)⟩}.

Clearly we have

Fun(g) ∧ dom(g) = α∪{α}.

If β<α, then g‘β=t(β) and g‘γ=t(γ) for all γ<β. Since ∃f Rec(f,β) for all
β<α, by (3) we have t(β)=s(t|β,β) for all β<α, which immediately gives
g‘β=s(g↾β,β) for all β<α. Finally, we have

g‘α = s(t|α,α) = s(g↾α,α).

Hence Rec(g,α), which completes the proof. ∎


Thm. 2.9 is called the principle of ordinal (or transfinite) recursion, and
any term t satisfying the condition specified in the theorem is said to be
constructed (from s) by recursion on α.
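
Restricted to the natural numbers, the recursion scheme t(α) = s(t|α,α) is ordinary course-of-values recursion, and it can be sketched in Python. The modelling choices here are assumptions made for illustration: natural numbers are Python ints, the restriction t|n is a dict mapping each m<n to t(m), and the name recursion is ours.

```python
def recursion(s):
    """Given s(restriction, n), return t with t(n) = s(t|n, n),
    where t|n is the dict {m: t(m) for m < n}."""
    def t(n):
        table = {}
        for m in range(n):
            # At this point table is exactly t|m; pass a copy so that
            # s cannot disturb the values already computed.
            table[m] = s(dict(table), m)
        return s(table, n)
    return t


# Example: t(n) = 1 + (sum of all earlier values), which gives powers of 2.
powers = recursion(lambda earlier, n: 1 + sum(earlier.values()))
values = [powers(n) for n in range(6)]
```

The loop builds the unique function meeting the recursive conditions up to n, in the same order in which the proof of Thm. 2.9 extends t|α by the single pair ⟨α,s(t|α,α)⟩.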

2.10. Problem. (i) Show that, if t₁ and t₂ are both constructed from s by
recursion on α, then

⊢ ∀α[t₁(α)=t₂(α)].

(ii) Let s(y,z) be any term. Show that

⊢ ∀α ∃f[Fun(f) ∧ dom(f)=α ∧ ∀β<α[f‘β=s(f↾β,β)]].

Many different forms of ordinal recursion may be reduced to the form
given in 2.9. For example,

2.11. Theorem. Given three terms s₀, s₁(y,z), s₂(y,z), we can construct a term t(x)
such that
⊢ t(0)=s₀ ∧ [α=β+1 → t(α)=s₁(t(β),β)]

∧ [Lim(α) → t(α)=s₂(t|α,α)].

Proof. Let s(y) be the term

ιv[[y=0 ∧ v=s₀]

∨ ∃β∈dom(y)[dom(y)=β+1 ∧ v=s₁(y‘β,β)]

∨ [Lim(dom(y)) ∧ v=s₂(y,dom(y))]

∨ [¬Ord(dom(y)) ∧ v=0]].

Then by Thm. 2.9 we can construct a term t such that ⊢ t(α)=s(t|α), and
it is an easy task to verify that t satisfies the required conditions. ∎

Observe that if, in applying Thm. 2.11, one is interested only in t(n), i.e.
when the argument is a natural number, then s₂ is irrelevant and may be
taken to be any term whatsoever.
The form of ordinal recursion given in Thm. 2.11 is frequently used to
define addition and multiplication of ordinals. For example, to define γ+α,
we take γ as a parameter and use recursion on α, taking s₀ to be γ, s₁(y,z) to
be y+1 and s₂(y,z) to be ⋃ran(y). We leave it to the reader to define γ·α.
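
For finite ordinals the limit clause never applies, so the recursion for γ+α reduces to the two clauses t(0)=γ and t(β+1)=t(β)+1. The following sketch works directly with von Neumann numerals modelled as Python frozensets; the helper names numeral, predecessor and add are ours, and predecessor relies on the assumption that its argument is a finite non-zero ordinal.

```python
def successor(x):
    """x + 1 = x ∪ {x}."""
    return x | frozenset({x})


def numeral(n):
    """The n-th von Neumann numeral, built by n applications of successor."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


def predecessor(alpha):
    """For a finite non-zero ordinal alpha = beta + 1, return beta.
    beta is the member of alpha with the most elements."""
    return max(alpha, key=len)


def add(gamma, alpha):
    """gamma + alpha by recursion on the finite ordinal alpha."""
    if not alpha:
        return gamma                                   # t(0) = s0 = gamma
    return successor(add(gamma, predecessor(alpha)))   # t(beta+1) = t(beta) + 1


five = add(numeral(2), numeral(3))
```

Multiplication on finite ordinals can be sketched the same way, with 0 in the base clause and "add gamma" in the successor clause.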
We now consider a more general form of ordinal recursion. Let us
call a term r(y) a bounded ordinal term if

⊢ ∀y Ord(r(y)) ∧ ∀α ∃z ∀y[r(y)≤α ↔ y∈z].

This condition means that r(y) is always an ordinal and for each α the
class of sets y with r(y)≤α is the extension of a set.

2.12. Theorem. Given a bounded ordinal term r(y), and a term s(z₁,z₂),
we can construct a term t(x) such that

⊢ t(x) = s(t|{y : r(y)<r(x)},x).

Proof. By ordinal recursion we can obtain a term u(x) such that

(1) ⊢ u(α) = {⟨x,s(⋃{u(β) : β<α},x)⟩ : r(x)=α}.

(We leave it to the reader to verify that this recursion can be reduced to that
given in Thm. 2.9.) It is then clear that, for each α, u(α) is a function with
domain {x : r(x)=α}. If we put

v(α) = ⋃{u(β) : β<α},

then v(α) is a function with domain {x : r(x)<α}. Now define t(x) by

(2) t(x) = u(r(x))‘x.

If r(y)<α, then we have, by the definition of t and v,

t(y) = u(r(y))‘y = v(α)‘y.

This implies, again using the definition of v,

(3) t|{y : r(y)<α} = ⋃{u(β) : β<α}.

(1), (2) and (3) now give

⊢ t(x) = u(r(x))‘x

= s(⋃{u(β) : β<r(x)},x)

= s(t|{y : r(y)<r(x)},x)
as required. ∎

The term t(x) obtained in Thm. 2.12 is said to be constructed by recursion
on r(x).

§ 3. The Axiom of Regularity

If we apply Thm. 2.9 to the term

s(y) = ⋃{Px : x∈ran(y)},

we obtain a term t such that

⊢ t(α) = ⋃{Pt(ξ) : ξ<α}.
Writing R_α for t(α), we have

(3.1) ⊢ R_α = ⋃{PR_ξ : ξ<α}.

It is then clear that

(3.2) ⊢ R_α = {x : ∃ξ<α(x⊆R_ξ)}.

The intuitive idea behind the construction of the sets R_α is as follows.
We think of the ordinal α as the αth “stage” in the process of “collecting”
sets. R_α is then the set of all sets “collected” at “stage” α. We see from
(3.2) that R_α is in fact the set of all sets x all of whose members have been
“collected” at some “stage” before α. The family of all R_α is called the
cumulative hierarchy.
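
The first few stages of the cumulative hierarchy are small enough to compute. The sketch below follows (3.1) literally for finite stages; it assumes sets are Python frozensets and stages are indexed by Python ints, and the names powerset, R and rank are ours. The rank function searches for the least stage n with x⊆R_n and is only meant for sets that appear by stage 4.

```python
from functools import lru_cache
from itertools import combinations


def powerset(z):
    """Return the set of all subsets of the finite set z."""
    elems = list(z)
    return frozenset(
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    )


@lru_cache(maxsize=None)
def R(n):
    """Stage n of the hierarchy: the union of P(R(m)) over all m < n."""
    result = frozenset()
    for m in range(n):
        result |= powerset(R(m))
    return result


def rank(x, bound=4):
    """Least n <= bound such that x is a subset of R(n)."""
    for n in range(bound + 1):
        if x <= R(n):
            return n
    raise ValueError("x is not a subset of any stage up to the bound")


sizes = [len(R(n)) for n in range(5)]
```

The sizes grow as 0, 1, 2, 4, 16; the next stage already has 65536 members, which is why the sketch stops at stage 4.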
We now define

x is regular: Reg(x) ↔df ∃α(x∈R_α).

It is easy to verify that ⊢ Reg(x) ↔ ∃α(x⊆R_α). We also define

rank of x: ϱ(x) =df μα[x⊆R_α].

3.3. Theorem.
(i) ⊢ [α≤β → R_α⊆R_β] ∧ [α<β → R_α∈R_β].
(ii) ⊢ Trans(R_α).
(iii) ⊢ R_0=0 ∧ R_{α+1}=PR_α.
(iv) ⊢ Lim(λ) → R_λ=⋃{R_ξ : ξ<λ}.
(v) ⊢ Reg(α) ∧ ϱ(α)=α.
(vi) ⊢ Reg(x) ∧ y∈x → Reg(y) ∧ ϱ(y)<ϱ(x).
(vii) ⊢ ∀y∈x Reg(y) → Reg(x).
Proof. (i) follows immediately from (3.1).
(ii) If x∈R_α, then by (3.2), x⊆R_ξ for some ξ<α, whence x⊆R_α by (i).
(iii) The first conjunct follows immediately from (3.1). By (3.1) we have
⊢ R_{α+1}=PR_α∪R_α and, using (ii), we see that ⊢ R_α⊆PR_α. (iii) follows.
(iv) Assume that Lim(λ). By (i), we have ⋃{R_ξ : ξ<λ}⊆R_λ. But

R_λ = ⋃{PR_ξ : ξ<λ} = ⋃{R_{ξ+1} : ξ<λ}

by (iii), and since Lim(λ), we have ξ+1<λ by 2.5(i). Hence

⋃{R_{ξ+1} : ξ<λ} ⊆ ⋃{R_ξ : ξ<λ}

and (iv) follows.

(v) By (3.1), it suffices to show that

⊢ α⊆R_α ∧ α∉R_α.

We establish the first conjunct by transfinite induction. Assume

∀ξ<α(ξ⊆R_ξ);

then ∀ξ<α(ξ∈R_{ξ+1}), so that

α ⊆ ⋃{R_{ξ+1} : ξ<α} = ⋃{PR_ξ : ξ<α} = R_α,

as required. To show that α∉R_α, assume the contrary, and let β be the least
ordinal such that β∈R_β. Then by (3.2), β⊆R_ξ for some ξ<β, and it follows
that ξ∈R_ξ, contradicting the choice of β.
(vi) Assume

Reg(x) ∧ y∈x.

Reg(y) follows immediately from (ii). We have x⊆R_ϱ(x) by definition of
ϱ(x) and therefore y⊆R_ξ for some ξ<ϱ(x). Accordingly, ϱ(y)≤ξ<ϱ(x).
(vii) Assume

∀y∈x Reg(y);
let
α = ⋃{ϱ(y) : y∈x}.
By 2.5(ii), we have ∀y∈x[ϱ(y)≤α] so that ∀y∈x[y⊆R_α]. Hence
∀y∈x[y∈R_{α+1}] and it follows that x⊆R_{α+1}. Thus Reg(x). ∎

Let φ(x) be a formula in which the variable z is not free. We write

Trans_φ(x) ↔df ∀x ∀z[φ(x) ∧ z∈x → φ(z)].

φ(x) is said to be transitive in x if ⊢ Trans_φ(x). (When the identity of the
variable x in question is clear from the context, we shall simply write
Trans_φ and say that φ is transitive.) If φ is the formula x∈y then it is
clear that
⊢ Trans_φ(x) ↔ Trans(y).

It follows from 3.3(vi) that Reg(x) is a transitive formula: this fact will
be of great use to us later on.
In Thm. 2.11, put x for s₀ and ⋃y for s₁. We then obtain a term t for which

⊢ t(0)=x ∧ t(n+1)=⋃t(n).

We define

transitive closure of x: TC(x) =df ⋃{t(n) : n∈ω}.

We now show that TC(x) is the least transitive set including x.

3.4. Theorem.

⊢ x⊆TC(x) ∧ Trans(TC(x)) ∧ ∀y[Trans(y) ∧ x⊆y → TC(x)⊆y].

Proof. Clearly x⊆TC(x). If z∈TC(x), then z∈t(n) for some n∈ω. Hence

z ⊆ ⋃t(n) = t(n+1) ⊆ TC(x).

Trans(TC(x)) follows.
Now suppose that Trans(y) ∧ x⊆y. We show by induction on n that

∀n∈ω[t(n)⊆y],

from which it follows that TC(x)⊆y. We have t(0)=x⊆y. If t(n)⊆y then,
since Trans(y), we have t(n+1)=⋃t(n)⊆y, which completes the proof. ∎
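
For a hereditarily finite set the layers t(0)=x, t(n+1)=⋃t(n) become empty after finitely many steps, so TC(x) can be computed by a simple loop. The sketch assumes sets are Python frozensets; the function names are ours.

```python
def big_union(z):
    """Set of all members of members of z."""
    return frozenset(y for u in z for y in u)


def transitive_closure(x):
    """Union of the layers t(0) = x, t(n+1) = union of t(n).
    For hereditarily finite x the layers are eventually empty."""
    closure = frozenset()
    layer = x
    while layer:
        closure |= layer
        layer = big_union(layer)
    return closure


def is_transitive(x):
    """Trans(x): every member of x is a subset of x."""
    return all(y <= x for y in x)


ZERO = frozenset()
ONE = frozenset({ZERO})
x = frozenset({frozenset({ONE})})   # the set {{1}}
tc = transitive_closure(x)
```

Here the layers are {{1}}, {1}, {0} and then the empty set, so tc has the three members {1}, 1 and 0.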

We are now in a position to prove the important

3.5. Theorem. ⊢ ∀x Reg(x) ↔ ∀x[x≠0 → ∃y∈x(x∩y=0)].

Proof. Assume ∀x Reg(x), and let x≠0. Let α be the least of the ranks
of members of x and let y be a member of x satisfying ϱ(y)=α. Then
x∩y=0, for, if z∈x∩y, then by 3.3(vi), ϱ(z)<ϱ(y)=α, contradicting the
definition of α.
Conversely, suppose that ∃x ¬Reg(x); choose x to satisfy ¬Reg(x). Put

z={y∈TC(x) : ¬Reg(y)}.
Then z≠0, for since ¬Reg(x) there must, by 3.3(vii), exist y∈x such that
¬Reg(y), and it is then clear that y∈z. Moreover, for each y∈z we have
z∩y≠0. For, if y∈z, then ¬Reg(y), so by 3.3(vii) there is u∈y such that
¬Reg(u). But then u∈y∈z, so that u∈TC(x) and hence u∈z. It follows
that z∩y≠0, completing the proof. ∎

The sentence

Reg: ∀x[x≠0 → ∃y∈x(x∩y=0)]

is called the Axiom of Regularity. Thm. 3.5 asserts that Reg is equivalent
to the assertion that all sets are regular, hence the name.
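
The first half of the proof of Thm. 3.5 is constructive for hereditarily finite sets: a member of least rank is disjoint from x. The sketch below assumes sets are Python frozensets and computes rank by the recursion ϱ(x) = sup{ϱ(y)+1 : y∈x}; this characterization is not stated in the text, but it follows from (3.2) and the definition of rank. The function names are ours.

```python
def rank(x):
    """Rank of a hereditarily finite set: 0 for the empty set,
    otherwise the largest value of rank(y) + 1 over members y."""
    return max((rank(y) + 1 for y in x), default=0)


def eps_minimal(x):
    """Return a member y of the non-empty set x with x ∩ y empty,
    chosen as a member of least rank."""
    if not x:
        raise ValueError("x must be non-empty")
    return min(x, key=rank)


ZERO = frozenset()
ONE = frozenset({ZERO})
TWO = frozenset({ZERO, ONE})
x = frozenset({TWO, frozenset({ONE}), ONE})
y = eps_minimal(x)
```

In this example the members of x have ranks 2, 2 and 1, so y is the numeral 1, and x∩y is empty because 0 is not a member of x.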
The Axiom of Regularity has considerable simplifying power, as we
shall see, but it is by no means intuitively obvious. Indeed, it differs from
the preceding postulates, with the exception of the Axiom of Extensionality,
in not being an instance of the Axiom of Comprehension.
Accordingly, we shall not use it in our proofs until we have established its
consistency with the other postulates, a task which we now turn to.
Let φ(x) be any formula. Recall that in §12 of Ch. 2 we defined the
relativization α* of a formula α to φ(x) (with x as chosen variable). We
agree to write α^(φ) for α* from now on.
It is easy to see that if α is a formula with free variables among x₁,...,xₖ,
and a₁,...,aₖ are sets such that φ(aᵢ) holds for all i, 1≤i≤k, then
α^(φ)(a₁,...,aₖ) says that α holds in the class defined by φ(x) when ∈ is
interpreted as the membership relation and x₁,...,xₖ are assigned the values
a₁,...,aₖ.
Now suppose that we are given a set of sentences Σ, and a sentence τ.
τ is said to be consistent relative to Σ if the consistency of Σ implies that
of Σ∪{τ}. Our next theorem gives a sufficient condition for this to be
the case.

3.6. Theorem. Suppose that there is a formula φ(x) with exactly one free
variable x such that:
(i) Σ ⊢ ∃xφ(x);
(ii) Σ ⊢ σ^(φ) for all σ in Σ;
(iii) Σ ⊢ τ^(φ).
Then τ is consistent relative to Σ.
Proof. By Cor. 2.12.4 we have

(1) σ is logically valid ⇒ [∃xφ(x) → σ^(φ)] is logically valid,

for any sentence σ. Now suppose that Σ∪{τ} is inconsistent. Then there
is a finite subset {σ₁,...,σₙ} of Σ such that the sentence σ₁ ∧ ... ∧ σₙ → ¬τ
is logically valid. Hence, by (1) and the properties of relativization, the
sentence

(2) ∃xφ(x) → [σ₁^(φ) ∧ ... ∧ σₙ^(φ) → ¬τ^(φ)]

is logically valid. But (ii) implies that Σ ⊢ σ₁^(φ) ∧ ... ∧ σₙ^(φ), and this,
together with (i) and the logical validity of (2) gives Σ ⊢ ¬τ^(φ). Hence,
in view of (iii), Σ is inconsistent. ∎

Observe that the proof of Thm. 3.6 provides an explicit method of
converting a proof of an inconsistency from Σ∪{τ} into a proof of an
inconsistency from Σ.
We can now prove:

3.7. Theorem. Reg is consistent relative to the previous postulates.


Proof. We apply Thm. 3.6, with Reg(x) as the formula φ(x). Naturally,
Reg will not be used as a postulate in any deduction we make in the
present proof.
First, since ⊢ Reg(0) by 3.3(v), we have ⊢ ∃x Reg(x).
Next, we deal with each postulate in order:
(i) Extensionality. We have to show that

⊢ Reg(x) ∧ Reg(y) ∧ ∀z[Reg(z) → [z∈x ↔ z∈y]] → x=y.

Assume the antecedent. If z∈x, then Reg(z) by the transitivity of Reg;
so z∈y. Similarly, z∈y implies z∈x; hence x=y.
(ii) Replacement. Let φ be a formula with free variables among
x, y, y₁,...,yₖ. We have to show that

(1) ⊢ Reg(y₁) ∧ ... ∧ Reg(yₖ) ∧ Reg(u)

∧ ∀x∈u[Reg(x) → ∃*y[Reg(y) ∧ ψ]]

→ ∃z[Reg(z) ∧ ∀y[Reg(y) → [y∈z ↔ ∃x∈u[Reg(x) ∧ ψ]]]],

where ψ is φ^(Reg).

Assume the antecedent of (1) (i.e. the part of the formula preceding
the second →). Since we are assuming Reg(u), 3.3(vi) implies that
∀x∈u Reg(x), so we have

∀x∈u ∃*y[Reg(y) ∧ ψ].

Hence, by Rep we deduce that there is a z such that

∀y[y∈z ↔ ∃x∈u[Reg(y) ∧ ψ]].

Using 3.3(vii) we infer from this that

(2) Reg(z).

Also, we see that

(3) ∀y[Reg(y) → [y∈z ↔ ∃x∈u ψ]].

But since we have Reg(u), the transitivity of Reg gives

∃x∈u ψ ↔ ∃x∈u[Reg(x) ∧ ψ].

Thus (3) is equivalent to

(4) ∀y[Reg(y) → [y∈z ↔ ∃x∈u[Reg(x) ∧ ψ]]].

(2) and (4) imply the consequent of (1).


(iii) Union. We have to show that

(5) ⊢ Reg(z) → ∃x[Reg(x) ∧ ∀y[Reg(y) → [y∈x ↔ ∃u∈z[Reg(u) ∧ y∈u]]]].

Assume Reg(z), and put x=⋃z. Then we have Reg(x) by the transitivity
of Reg and 3.3(vii). Moreover, we have, by definition of x,

y∈x ↔ ∃u∈z[y∈u],

and hence, since Reg is transitive,

y∈x ↔ ∃u∈z[Reg(u) ∧ y∈u].

(5) follows.
(iv) Power set. We have to show that

(6) ⊢ Reg(z) → ∃x[Reg(x) ∧ ∀y[Reg(y) → [y∈x ↔ ∀u[Reg(u) ∧ u∈y → u∈z]]]].

Assume Reg(z), and put x=Pz. Then, applying 3.3(vii) twice, we see that
Reg(x).
Also, by definition of x, we have

y∈x ↔ ∀u[u∈y → u∈z],

and hence, since Reg is transitive,

Reg(y) → [y∈x ↔ ∀u[Reg(u) ∧ u∈y → u∈z]].

(6) follows.

(v) Infinity. Put φ(x) for the formula

∃y[∀z[z∉y] ∧ y∈x]

∧ ∀y[y∈x → ∃u[∀v[v∈u ↔ v∈y ∨ v=y] ∧ u∈x]].

Then the Axiom of Infinity is equivalent to the sentence ∃xφ(x). Thus we
have to prove the existence of a regular set x with the following two
properties:

(7) ∃y[Reg(y) ∧ ∀z[Reg(z) → z∉y] ∧ y∈x],

(8) ∀y[Reg(y) ∧ y∈x
→ ∃u[Reg(u) ∧ ∀v[Reg(v) → [v∈u ↔ v∈y ∨ v=y]] ∧ u∈x]].

We claim that ω is such an x. First observe that ⊢ Reg(ω) by 3.3(v).
Moreover, taking y to be 0 we see that this y is regular, has no members,
let alone regular members, and belongs to ω. Thus ω has property (7).
Also, by the definition of ω we have

∀y∈ω ∃u∈ω ∀v[v∈u ↔ v∈y ∨ v=y];

using the transitivity of Reg, it follows that ω satisfies (8).

To complete the proof of the theorem we have to show that ⊢ Reg^(Reg), i.e.

(9) ⊢ Reg(x) ∧ x≠0 → ∃y∈x[Reg(y) ∧ ∀z[Reg(z) ∧ z∈y → z∉x]].

Suppose that Reg(x) and x≠0; let α be the least of the ranks of all members
of x, and let y be a member of x with rank α. Then Reg(y) by the transitivity
of Reg. Moreover, if z∈y then ϱ(z)<α by 3.3(vi), so that z∉x by the
definition of α. This gives (9), and the proof is complete. ∎
Now that we have established the relative consistency of Reg, we hereby
adopt it as a postulate¹ and use it in our proofs. The next three problems
should be approached with this fact in mind.

3.8. Problem. Show that:

(i) ⊢ x∉x.
(ii) ⊢ ¬∃x₁...∃xₙ[x₁∈x₂ ∧ x₂∈x₃ ∧ ... ∧ xₙ₋₁∈xₙ ∧ xₙ∈x₁].
(iii) ⊢ ¬∃f[Fun(f) ∧ dom(f)=ω ∧ ∀n[f‘(n+1)∈f‘n]].
(iv) ⊢ Ord(x) ↔ ∀y∈x ∀z∈x[y=z ∨ y∈z ∨ z∈y] ∧ Trans(x).
(v) ⊢ ϱ(x)=α ↔ x⊆R_α ∧ x∉R_α.
3.9. Problem. Let ψ(x) be any formula. Show that:
(i) ⊢ Reg^(ψ).
(ii) ⊢ Trans_ψ → ∀x[ψ(x) → [Ord(x) ↔ Ord^(ψ)(x)]].
(iii) ⊢ Trans_ψ → [[∃αφ(α)]^(ψ) ↔ ∃α[ψ(α) ∧ φ^(ψ)(α)]]
∧ [[∀αφ(α)]^(ψ) ↔ ∀α[ψ(α) → φ^(ψ)(α)]],
where φ(y) is any formula.
(For (ii), show that the equivalence stated in Prob. 3.8(iv) follows from
Reg alone; then use (i).)

3.10. Problem. Prove the principle of induction on rank: if y does not
occur free in φ(x), then

⊢ ∀x[∀y[ϱ(y)<ϱ(x) → φ(y)] → φ(x)] → ∀xφ(x).

From Thm. 3.5 and the Axiom of Regularity it follows immediately that
⊢∀x Reg(x), i.e. every set is regular. It is useful to envisage the universe
of (regular) sets as a striated cone (see Fig. 6). The root of the cone
represents the empty set, the subcones (OAB for example) bounded by
the horizontal lines represent the R_α's, and the vertical “spine” represents
the class of ordinals.
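
For hereditarily finite sets the rank can be computed directly, which may help in picturing the levels of the cone. The following Python sketch is an illustration only and not part of the formal development: it models sets as frozensets and assumes the usual convention that ρ(x) is the least ordinal exceeding ρ(y) for every y∈x (in agreement with 3.3(vi)). The names rank and least_rank_member are ours.

```python
def rank(x):
    """Rank of a hereditarily finite set modelled as a frozenset of frozensets."""
    return max((rank(y) + 1 for y in x), default=0)


def least_rank_member(x):
    """A member of the non-empty set x of least rank, as chosen in the proof of (9)."""
    return min(x, key=rank)


zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

x = frozenset({one, two})
y = least_rank_member(x)
# No member of y belongs to x: every member of y has smaller rank than y.
y_is_disjoint_from_x = not (y & x)
```

Picking a member of least rank is exactly the step used above to verify Reg for regular sets.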
The rest of this section is devoted to proving some technical results
which will be of importance later on.
For any formula φ and any variable z, we define φ^(z) to be the
relativization of φ to the formula x∈z, with x as chosen variable (Ch. 2, §12).
Note that if the free variables of φ are among x_1,...,x_n, then the free
variables of φ^(z) are among x_1,...,x_n,z. If t is a term, we write φ^(t) for the
result of substituting t for z in φ^(z).

1 Our previous remarks imply that in adopting Reg we are merely confining our attention
to regular sets. Experience shows that — so far at least — nothing of mathematical
interest is lost by this restriction.

We define:
f is an injection:

Inj(f) ↔df Fun(f) ∧ ∀x∈dom(f) ∀y∈dom(f)[f‘x=f‘y → x=y].

f is an ∈-isomorphism of u onto v:

Isom(f,u,v) ↔df Inj(f) ∧ dom(f)=u ∧ ran(f)=v
∧ ∀x∈u ∀y∈u[x∈y ↔ f‘x∈f‘y].

u is extensional:

Ex(u) ↔df ∀x∈u ∀y∈u[x∩u=y∩u → x=y].

It is easy to verify that

⊢Ex(u) ↔ Ext^(u),

⊢Trans(u) → Ex(u).

[Fig. 6: the universe of sets drawn as a striated cone; the vertical spine is labelled “Ordinals”.]

Our next result asserts that there is an ∈-isomorphism of any given
extensional set onto a transitive set.

3.11. Theorem (Mostowski’s Collapsing Lemma).

⊢Ex(u) → ∃v ∃f[Trans(v) ∧ Isom(f,u,v)].



Proof. By 3.8(v), ρ(y) is a bounded ordinal term and so we may apply
Thm. 2.12, taking s to be the term ran(z↾(x∩u)). We then obtain a term
t(x,u) such that

(1) ⊢t(x,u)={t(y,u): y∈x∩u}.

We put

f = {⟨x,t(x,u)⟩ : x∈u}, v = {t(x,u): x∈u}.

Clearly we have

⊢Fun(f) ∧ dom(f)=u ∧ ran(f)=v.

Moreover, it follows immediately from (1) that

(2) ⊢∀x∈u[f‘x={f‘y : y∈x∩u}].

Now suppose that Ex(u). We claim that under these conditions we have
Trans(v) and Isom(f,u,v). To prove the first assertion, suppose that z∈v.
Then z=f‘x for some x∈u, so that, by (2),

z=f‘x={f‘y : y∈x∩u}⊆ran(f)=v.

Hence Trans(v).
We next show that Inj(f). Let φ(y) be the formula

(3) ∀x∈u[f‘x=f‘y → x=y].

We have to show that ∀y∈uφ(y), and to do this we argue by induction
on rank (Prob. 3.10). Thus, assuming that y∈u, ρ(y)=α and φ(z) holds
for all z∈u such that ρ(z)<α, i.e.

(4) ∀z∈u[ρ(z)<α → ∀x∈u[f‘x=f‘z → x=z]],

we have to show that φ(y), i.e. (3). Since we are assuming Ex(u), in order
to prove (3) it suffices to prove

(5) ∀x∈u[f‘x=f‘y → x∩u=y∩u].

Let x∈u and suppose that f‘x=f‘y. If z∈x∩u, then since f‘x=f‘y
it follows from (2) that f‘z=f‘w for some w∈y∩u. But then ρ(w)<α
by 3.3(vi) and so z=w by (4), whence z∈y∩u. On the other hand, if
w∈y∩u, then, as before, (2) implies that f‘w=f‘z for some z∈x∩u. Again
we have ρ(w)<α by 3.3(vi) and w=z by (4), so that w∈x∩u. It follows
that x∩u=y∩u, which proves (5).

It remains to show that

∀x∈u ∀y∈u[x∈y ↔ f‘x∈f‘y].

Suppose that x∈u and y∈u. If x∈y, then f‘x∈f‘y by (2). Conversely, if
f‘x∈f‘y, then, by (2), f‘x=f‘z for some z∈y. Since Inj(f), we have x=z,
so that x∈y. ∎

Notice that in the proof of Thm. 3.11 we have actually shown that the
∈-isomorphism f can be uniformly defined from u. More precisely, we have
shown that we can construct a term t(x,u) such that

⊢Ex(u) → Trans(t[u]) ∧ Isom(t↾u, u, t[u]),

where
t[u] = {t(x,u) : x∈u}, t↾u = {⟨x,t(x,u)⟩ : x∈u}.
Remark. Let u be an extensional set which is not transitive. For
each x∈u, x∩u is in general a proper subset of x, because if x=x∩u for
all x∈u, then obviously u would be transitive. The set x−x∩u is just so
much “empty space” as far as u is concerned. Now, if f is an ∈-isomorphism
of u onto a transitive set v, then f‘x=f‘x∩v for each x∈u, and thus we
may say that f‘x is “densely packed” with respect to v. Thus the effect
of f is to “collapse” each x∈u onto the “densely packed” set f‘x. For this
reason f is often called a collapsing isomorphism, and v the transitive
collapse of u.
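
The recursion f‘x={f‘y : y∈x∩u} can be carried out mechanically on hereditarily finite sets. The Python sketch below is offered only as an illustration: sets are modelled as frozensets, and the helper names collapse, is_transitive and is_extensional are ours, not notation from the text.

```python
def collapse(u):
    """Return (f, v) with f[x] = {f[y] : y in x ∩ u} for x in u, and v = ran(f)."""
    u = frozenset(u)
    memo = {}

    def f(x):
        # Recursion on membership; it terminates because the sets are well-founded.
        if x not in memo:
            memo[x] = frozenset(f(y) for y in x if y in u)
        return memo[x]

    mapping = {x: f(x) for x in u}
    return mapping, frozenset(mapping.values())


def is_transitive(v):
    """Every member of a member of v is a member of v."""
    return all(z in v for y in v for z in y)


def is_extensional(u):
    """Distinct members of u differ in their intersections with u."""
    return all(x == y or (x & u) != (y & u) for x in u for y in u)


zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

u = frozenset({zero, two})      # extensional, but not transitive: one is missing
f, v = collapse(u)              # f sends two to one, so v is the set two
```

Here the “empty space” in two, namely its member one, is squeezed out: the collapsing map sends two to one, and the image of u is the transitive set two.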

3.12. Problem. Show that

⊢Isom(f,u,v) ∧ Trans(v) → ∀x∈u[f‘x={f‘y :

3.13. Problem. Show that the ∈-isomorphism f and the transitive set
v whose existence is established in Thm. 3.11 are unique. (Argue by induction
on rank, using 3.12.)
3.14. Problem. Show that
⊢Ex(u) ∧ Trans(x) ∧ x⊆u ∧ Isom(f,u,v) ∧ Trans(v) → ∀y∈x[f‘y=y].

§ 4. Cardinality and the Axiom of Choice

In this section we presuppose a slender acquaintance with cardinal arithmetic
(see, e.g. Halmos [1960] or Rotman and Kneebone [1961]).
We define
x is equipollent with y: x≈y ↔df ∃f[Inj(f) ∧ dom(f)=x ∧ ran(f)=y].

4.1. Theorem. ⊢x≈x ∧ [x≈y → y≈x] ∧ [x≈y ∧ y≈z → x≈z].



4.2. Theorem.
(i) ⊢∃y⊆Px(x≈y) ∧ ¬∃z⊆x(z≈Px).
(ii) ⊢x≈y′ ∧ y′⊆y ∧ y≈x′ ∧ x′⊆x → x≈y. ∎

4.2(i) and (ii) are the well-known theorems of Cantor and Schröder-
Bernstein respectively.
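
The diagonal construction behind Cantor's theorem can be tried out on small finite sets. The following Python sketch is an illustration only, under our own naming (powerset, diagonal and g are hypothetical helpers, not notation from the text): for every map g from x into Px it forms the set {y∈x : y∉g‘y} and checks that g misses it.

```python
from itertools import combinations, product


def powerset(x):
    """All subsets of the finite set x, as frozensets."""
    xs = list(x)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def diagonal(x, g):
    """Cantor's diagonal set {y in x : y not in g[y]} for a map g from x into Px."""
    return frozenset(y for y in x if y not in g[y])


x = frozenset({0, 1})
subsets = powerset(x)

# Run through every map g : x -> Px; each one misses its own diagonal set,
# so no such map is onto Px.
all_missed = all(
    diagonal(x, dict(zip(sorted(x), values))) not in values
    for values in product(subsets, repeat=len(x))
)
```

The same argument applies to maps defined on a subset z of x, which is the form needed for the second conjunct of 4.2(i).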
We now introduce our last postulate, the Axiom of Choice:

AC: ∀x ∃f[Fun(f) ∧ dom(f)=x ∧ ∀y∈x[y≠0 → f‘y∈y]].

A function f whose domain is x and such that, whenever y is a non-empty
member of x, f “chooses” a member of y (i.e. f‘y∈y) is called a choice
function for x. The axiom of choice postulates that each set has a choice
function. Most mathematicians accept the axiom of choice as intuitively
true, which is just as well, since it plays a well-nigh indispensable role
in modern mathematics. Some, however, distrust it because of its highly
non-constructive character.¹ It asserts the existence of an object (a choice
function) without indicating how this object is to be constructed or,
indeed, characterizing it in any other way. It is therefore a pure existence
statement. For this reason it is customary not to use it without giving
notice; in any case we want to discuss its consistency relative to the other
postulates. Accordingly, we shall depart from our usual convention and
write “AC” below the deducibility symbol \- whenever our proof of the
formula in question depends on the Axiom of Choice. Similarly, when
introducing a defined term or formula whose meaning depends on the
Axiom of Choice, we shall write (AC) in the right-hand margin.
One of the most important applications of the Axiom of Choice is in
the proof of the following theorem, which asserts that each set is equipollent
with an ordinal (i.e., can be well-ordered).

4.3. Theorem. ⊢_AC ∀x ∃α[α≈x].

Proof. We apply Thm. 2.9 with f‘(x − ran(y)) as s(y,z). We then get a term,
t say, such that, if f is a choice function for Px, then

(1) ∀α[t(α)=f‘(x−{t(β): β<α})].

Assuming AC, such a choice function f exists, and for it we have (1).
Put
u_α = x−{t(β): β<α};
1 It is obviously not an instance of the Axiom of Comprehension!



then t(α)=f‘u_α. Clearly, we have u_α⊆x. If u_α≠0, then t(α)=f‘u_α∈u_α, so
t(α)∈x; also, it follows from the definition of u_α that if β<α then t(β)∉u_α,
so that t(α)≠t(β).
Suppose now that u_α≠0 for all ordinals α. Then by what we have just
seen, if α is any ordinal and y=t(α), then y∈x and

α = μβ[y=t(β)].

Thus the set {μβ[y=t(β)] : y∈x} includes all the ordinals, in contradiction
with Thm. 2.3. We conclude that u_α=0 for some ordinal α. Then, if α
is the least such ordinal and β<α, we must have u_β≠0, so that t(β)≠t(γ)
for all γ<β. Also, since u_α=0, we have

x={t(β): β<α},

so t↾α is a one-one map of α onto x. ∎
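
For a finite set the recursion t(α)=f‘(x−{t(β): β<α}) is just a loop that stops at the least α with u_α empty. The Python sketch below is an illustration under our own assumptions: x is finite, the function well_order is a hypothetical helper, and the built-in min stands in for a choice function on the non-empty subsets of x.

```python
def well_order(x, choice):
    """Enumerate the finite set x as t(0), t(1), ... using
    t(a) = choice(x - {t(b) : b < a}), stopping when nothing remains."""
    remaining = set(x)              # plays the role of u_a
    t = []
    while remaining:                # stop at the least a with u_a empty
        picked = choice(frozenset(remaining))
        t.append(picked)            # t(a) lies in u_a, hence differs from all earlier t(b)
        remaining.remove(picked)
    return t


x = {3, 1, 2}
enumeration = well_order(x, min)    # min is one possible choice function here
```

A different choice function (max, say) yields a different well-ordering of the same set, which reflects the non-constructive character of the general theorem.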

We now define

cardinality of x: |x| =df μα[α≈x].

Thus |x| is the least ordinal which is equipollent with x, provided such an
ordinal exists. Notice that (in the absence of AC), a set which is not
equipollent with an ordinal has cardinality 0.
The following results are immediate consequences of 4.1, 4.2 and 4.3:

(4.4) Nac
(4.5)' |— |a]«a,

(4.6) HacI*M
We define
x is a cardinal: Card(x) ↔df Ord(x) ∧ x=|x|.

It follows from (4.5) that a cardinal is an ordinal which is not equipollent
with any smaller ordinal. It is also clear that we have

(4.7) ⊢Card(n) ∧ Card(ω).

Next, we prove:

4.8. Theorem. ⊢_AC ∀y∈x Card(y) → ∃α[Card(α) ∧ ∀β∈x[β<α]].

Proof. Assume ∀y∈x Card(y). Then in particular ∀y∈x Ord(y) and so,
by 2.2(x),

∀β∈x[β≤∪x].

Putting

α = |P∪x|,

we have Card(α) and it follows from 4.6 that ∀β∈x[β<α]. ∎

Thm. 4.8 asserts that, assuming AC, for any set x of cardinals there is
a cardinal α which exceeds all the members of x. Moreover, we may assume
that ω≤α, for if α<ω, then we may replace α by ω. We now apply 2.9,
taking for s(y,z) the term

μβ[Card(β) ∧ ω≤β ∧ β∉ran(y)].

We obtain a term t(x) such that

⊢_AC t(α)=μβ[Card(β) ∧ ω≤β ∧ β∉{t(γ): γ<α}].

We define
Aleph α: ℵ_α =df t(α). (AC)

4.9. Theorem.
(i) ⊢_AC Card(ℵ_α) ∧ ω≤ℵ_α ∧ [α<β → ℵ_α<ℵ_β] ∧ ℵ_0=ω.
(ii) ⊢_AC Card(β) ∧ ω≤β → ∃α[β=ℵ_α].
Proof. (i) is a simple consequence of the definition of ℵ_α, and we leave
its proof to the reader.
(ii) It follows from (i) that ℵ_α≠ℵ_α′ whenever α≠α′, and a straightforward
application of Rep and Thm. 2.3 shows that there is no set which contains
all the ℵ_α. In particular, if β is a cardinal such that ω≤β, then there must
be an ordinal α such that ℵ_α∉β. Thus β≤ℵ_α. If β=ℵ_α, we are through.
If on the other hand β<ℵ_α, then since ℵ_α is the least cardinal ≥ω not in
{ℵ_γ: γ<α}, we must have β∈{ℵ_γ: γ<α}, which gives the required result. ∎

Calling a cardinal β infinite if ω≤β, Thm. 4.9 implies that ℵ_0, ℵ_1,...,ℵ_α,...
enumerates the class of all infinite cardinals. Thus, given an infinite cardinal
it makes sense to ask precisely where it appears in the sequence of ℵ_α’s.
Consider, for example, the cardinal |Pℵ_0|. We have ℵ_0<|Pℵ_0|, so that
ℵ_1≤|Pℵ_0| by the definition of ℵ_1. Cantor was firmly convinced that |Pℵ_0|
is actually equal to ℵ_1, but never succeeded in proving it. The statement

|Pℵ_0| = ℵ_1

is called the Continuum Hypothesis (briefly, CH) because, as is well-known,
Pℵ_0 is equipollent with the continuum, i.e. the set of all real numbers.
Its truth or falsity is still an open problem. However, in 1938 Gödel showed
that both it and the axiom of choice are consistent relative to the other
postulates of set theory. In fact, he demonstrated the relative consistency
of a much stronger assertion than the Continuum Hypothesis, namely
the so-called Generalized Continuum Hypothesis (briefly, GCH). This is
the statement

∀α[|Pℵ_α| = ℵ_{α+1}].
We are eventually going to prove Gödel’s result.
Now Gödel’s pioneering work still left open the possibility that CH or
even GCH is actually a consequence of the other postulates. However,
in 1963 P. J. Cohen showed that CH (and hence GCH) as well as AC are
not provable from the other postulates provided these are themselves
consistent. Thus CH and AC are completely independent of the other
postulates. The proof of Cohen’s result is, unfortunately, beyond the
scope of this book. Readers interested in finding out more about Cohen’s
work are advised to consult Cohen [1966], Bell [1977] or Jech [1971].

§ 5. Reflection Principles

Let φ_1,...,φ_n be a sequence of formulas, and let x_1,...,x_m be a list, in
alphabetical order, of all variables that occur free in any φ_i, 1≤i≤n.
We define
u reflects φ_1,...,φ_n:

Refl_{φ_1,...,φ_n}(u) ↔df ∀x_1∈u...∀x_m∈u[[φ_1 ↔ φ_1^(u)] ∧ ... ∧ [φ_n ↔ φ_n^(u)]].

In this section we prove several results which assert that there are
arbitrarily large sets which reflect all the members of a given finite sequence
of formulas. An assertion of this kind is called a reflection principle. We
now formulate two such principles.
The First Reflection Principle is the scheme

RP_1: ∀y_1...∀y_k ∃u[y_1∈u ∧ ... ∧ y_k∈u ∧ Trans(u) ∧ Refl_{φ_1,...,φ_n}(u)],

where φ_1,...,φ_n are any formulas.

The Second Reflection Principle is the scheme

RP_2: ∀α ∃β[α<β ∧ Refl_{φ_1,...,φ_n}(R_β)],

where φ_1,...,φ_n are any formulas.

Remark. Taking a single sentence σ in RP_2, we get

∀α ∃β[α<β ∧ [σ ↔ σ^(R_β)]].

We may think of σ as expressing a (first-order) property of the universe
of sets and σ^(R_β) as expressing the same property of the extension of R_β.
Accordingly, RP_2 implies that each first-order property possessed by the
universe is also possessed by the extensions of arbitrarily large sets. In
particular, there is no property expressible in the language of set theory which
distinguishes the universe from the extensions of all of its members.

It is easy to see that, for each choice of φ_1,...,φ_n,

(5.1) ⊢RP_2 → RP_1.

We shall prove a theorem which yields RP_2 as a particular case. We
shall need:

5.2. Lemma. Let t(x) be a term such that:

(i) ⊢α<β → t(α)⊆t(β);
(ii) ⊢Lim(λ) → t(λ) = ∪{t(ξ): ξ<λ}.
Then
⊢∀y∈x Ord(y) ∧ x≠0 → t(∪x) = ∪{t(α): α∈x}.

Proof. Assume (i), (ii) and ∀y∈x Ord(y) ∧ x≠0. That

∪{t(α): α∈x}⊆t(∪x)

follows immediately from (i). If x has a greatest element, β say, then
∪x=β, and we have

t(∪x)=t(β)=∪{t(α): α∈x}.

On the other hand, if x has no greatest element, then it is not difficult to
see that ∪x is a limit ordinal. Therefore, by (ii),

t(∪x) = ∪{t(α): α∈∪x}

⊆∪{t(α): α∈x} by (i). ∎

5.3. Theorem. Let t(x) be a term such that x_1,...,x_m are not free in t(x) and
(i) ⊢α<β → t(α)⊆t(β);
(ii) ⊢Lim(λ) → t(λ) = ∪{t(ξ): ξ<λ}.
Put T(x) for the formula ∃α[x∈t(α)]. Then, for any formulas φ_1,...,φ_n with
free variables among x_1,...,x_m we have

⊢∀α ∃β[α<β ∧ ∀x_1∈t(β)...∀x_m∈t(β)[[φ_1^(T) ↔ φ_1^(t(β))] ∧ ... ∧ [φ_n^(T) ↔ φ_n^(t(β))]]].

Proof. We may clearly assume without loss of generality that each
subformula of each formula in the list φ_1,...,φ_n also occurs in the list. We also
assume that the list is enumerated in such a way that the existential formulas
occupy the first p places.
Let 1≤j≤p. Then φ_j is of the form

∃yψ_j(z_1,...,z_k,y),

where z_1,...,z_k are exactly all the free variables of φ_j, and are therefore
among x_1,...,x_m. We put s_j(z_1,...,z_k) for the term

μγ[∃y∈t(γ)ψ_j^(T)(z_1,...,z_k,y)].

Thus if for given z_1,...,z_k there is a y such that

T(y) ∧ ψ_j^(T)(z_1,...,z_k,y),

then s_j(z_1,...,z_k) is the least γ such that a y of this kind can be found in t(γ).
We now put s_j*(u) for

∪{s_j(z_1,...,z_k): z_1∈u ∧ ... ∧ z_k∈u}.

It follows immediately from (i) that

⊢∪{t(s_j(z_1,...,z_k)): z_1∈u ∧ ... ∧ z_k∈u}⊆t(s_j*(u)).

Thus, if z_1,...,z_k are in u and there is some y such that T(y) ∧ ψ_j^(T)(z_1,...,z_k,y),
then such a y can already be found in t(s_j*(u)). Hence

⊢∀z_1∈u...∀z_k∈u[∃y[T(y) ∧ ψ_j^(T)] → ∃y∈t(s_j*(u))ψ_j^(T)],

and therefore, since z_1,...,z_k are among x_1,...,x_m, we have, a fortiori,

⊢∀x_1∈u...∀x_m∈u[∃y[T(y) ∧ ψ_j^(T)] → ∃y∈t(s_j*(u))ψ_j^(T)].

Next, putting s*(u) for

s_1*(u)∪...∪s_p*(u),

we have, for the same reasons as before,

(1) ⊢∀x_1∈u...∀x_m∈u[∃y[T(y) ∧ ψ_j^(T)] → ∃y∈t(s*(u))ψ_j^(T)]

for j=1,2,...,p.
We apply 2.11 with α+1 as s_0 and s*(t(y)) as s_1(y) to obtain a term r(x)
for which
⊢r(0)=α+1 ∧ r(n+1)=s*(t(r(n))).
Put β for
∪{r(n): n∈ω}.

Then clearly α<β; we claim that

(2) ∀x_1∈t(β)...∀x_m∈t(β)[φ_j^(T) ↔ φ_j^(t(β))]

for all j=1,...,n. Once this claim has been proved the theorem follows
immediately.
We prove (2) by induction on the complexity of φ_j.
(a) If φ_j is atomic, then φ_j^(T) and φ_j^(t(β)) are both identical with φ_j, and
(2) is clear.
(b) If φ_j is a conjunction, then the two conjuncts are included in the
list φ_1,...,φ_n. So φ_j is φ_q ∧ φ_r. By inductive hypothesis we already have (2)
with q and r in place of j, and from this we easily obtain (2) for φ_j.

(c) If φ_j is a negation, the proof of (2) is like that in case (b).

(d) If φ_j is existential, then j≤p, and φ_j has the form ∃yψ_j. Since ψ_j
is a subformula of φ_j, it must occur in the list φ_1,...,φ_n, so by inductive
hypothesis we have

∀x_1∈t(β)...∀x_m∈t(β)[ψ_j^(T) ↔ ψ_j^(t(β))],

and hence

∀x_1∈t(β)...∀x_m∈t(β)[∃y∈t(β)ψ_j^(T) ↔ ∃y∈t(β)ψ_j^(t(β))].

Now ∃y∈t(β)ψ_j^(t(β)) is φ_j^(t(β)), so to get (2) for φ_j we have to show that

∀x_1∈t(β)...∀x_m∈t(β)[φ_j^(T) ↔ ∃y∈t(β)ψ_j^(T)],

i.e. that

(3) ∀x_1∈t(β)...∀x_m∈t(β)[∃y[T(y) ∧ ψ_j^(T)] → ∃y∈t(β)ψ_j^(T)].

Suppose, therefore, that x_1,...,x_m are all in t(β). By the definition of
β and Lemma 5.2, we have

t(β) = ∪{t(r(n)): n∈ω}.

Hence for each i=1,...,m there must be n_i∈ω such that x_i∈t(r(n_i)). Let
k be an n_i for which the ordinal r(n_i) is greatest (i=1,...,m). Then x_i∈t(r(k))
for i=1,...,m.
By (1), we have

∃y[T(y) ∧ ψ_j^(T)] → ∃y∈t(s*(t(r(k))))ψ_j^(T),

and, since ⊢s*(t(r(k)))=r(k+1), we get

(4) ∃y[T(y) ∧ ψ_j^(T)] → ∃y∈t(r(k+1))ψ_j^(T).

But r(k+1)≤β so that t(r(k+1))⊆t(β). Hence (4) gives, a fortiori,

∃y[T(y) ∧ ψ_j^(T)] → ∃y∈t(β)ψ_j^(T).

Since x_1,...,x_m were arbitrary members of t(β), we immediately obtain (3).
This completes the proof. ∎

Taking R_α for t(α) in 5.3 immediately gives, using Reg:

5.4. Corollary. If σ is any instance of RP_2, then ⊢σ. ∎

Hence, by (5.1) we get

5.5. Corollary. If σ is any instance of RP_1, then ⊢σ. ∎

We are now going to show that the First Reflection Principle can be
employed as an alternative postulate for set theory. More precisely, we
show that if from our original system of postulates we drop the Axioms
of Infinity, Union, and Replacement, and substitute instead the Axiom
of Separation (1.4) and the First Reflection Principle, we obtain a system
of postulates equivalent to the original one.
Our original theory, which is based on the postulates Ext, Rep, Union,
Pow, Inf and Reg, is called Zermelo-Fraenkel set theory and is denoted
by ZF. The theory obtained by adding AC to ZF is denoted by ZFC.
The theory whose postulates consist of Ext, Pow, Reg, Sep, and RP_1
is called Lévy-Montague set theory and is denoted by LM.
Let us write ⊢_LM φ for “φ is deducible from the postulates of LM”.
Then we have:

5.6. Theorem. For any formula φ,

⊢φ ⇔ ⊢_LM φ.

In other words, ZF and LM are equivalent theories.
Proof. It clearly suffices to show
(1) ⊢σ for every postulate σ of LM;
(2) ⊢_LM τ for every postulate τ of ZF.
To prove (1), we run through the postulates of LM. If σ is Ext, Pow,
or Reg, then σ is a postulate of ZF, and so, a fortiori, ⊢σ. If σ is an instance
of Sep, then ⊢σ by 1.3(i). Finally, if σ is an instance of RP_1 then ⊢σ
by 5.5. This proves (1).
To prove (2) we consider the postulates of ZF in turn.


(a) Ext, Pow, and Reg. Trivially we have ⊢_LM Ext ∧ Pow ∧ Reg since
all three are postulates of LM.
(b) Rep. We must show that, if φ(x,y) is a formula with free variables
among x,y,y_1,...,y_k, then, if z does not occur free in φ,

(3) ⊢_LM ∀x∈u ∃!yφ(x,y) → ∃z ∀y[y∈z ↔ ∃x∈uφ(x,y)].

Suppose that, for a particular choice of y_1,...,y_k, which we assume to be
fixed throughout the argument, we have

(4) ∀x∈u ∃!yφ(x,y).

Then, by RP_1, which is a postulate of LM, there is a set v such that
Trans(v); u,y_1,...,y_k∈v and

(5) ∀x∈v ∀y∈v[φ(x,y) ↔ φ^(v)(x,y)],

(6) ∀x∈v[∃yφ(x,y) ↔ ∃y∈vφ^(v)(x,y)].

From (5) and (6) it follows that

(7) ∀x∈v[∃yφ(x,y) ↔ ∃y∈vφ(x,y)].

Using Sep, which is a postulate of LM, we see that there exists a set
z such that

∀y[y∈z ↔ y∈v ∧ ∃x∈uφ(x,y)].

We claim that

(8) y∈z ↔ ∃x∈uφ(x,y).

Clearly we have

y∈z → ∃x∈uφ(x,y).

Conversely, if ∃x∈uφ(x,y), then

(9) φ(x′,y)

for some x′∈u, whence ∃wφ(x′,w). Since Trans(v) and u∈v, it follows
that x′∈v and (7) then implies that ∃w∈vφ(x′,w). Thus we have φ(x′,y′)
for some y′∈v. It follows from (4) and (9) that y=y′, so that y∈v, whence
y∈z. This proves (8), and (3) follows.
(c) Union. We have to show that

(10) ⊢_LM ∃x ∀y[y∈x ↔ ∃u∈z(y∈u)].

By RP_1, for each z there is v such that Trans(v) ∧ z∈v. Using Sep we see
that there is a set x such that

∀y[y∈x ↔ y∈v ∧ ∃u∈z(y∈u)].

It is now easy to show, using Trans(v), that

∀y[y∈x ↔ ∃u∈z(y∈u)],

and (10) follows.

(d) Inf. Since Ext, Rep and Pow are theorems of LM, we can prove
within LM the existence of the unordered pair {u,v} of any sets u,v (see
Thm. 1.10 and the definition immediately following it). Hence, since
Union is a theorem of LM, we can prove within LM the existence of the
set u∪{u} for any set u. In other words, we have

(11) ⊢_LM ∀u ∃v ∀x[x∈v ↔ x∈u ∨ x=u].

Using Sep and Ext we can prove within LM the existence of the unique
empty set 0 (see 1.5). Accordingly, by RP_1 and (11) we can prove within
LM the existence of a set w satisfying
Trans(w) ∧ 0∈w ∧ ∀u∈w ∃v∈w ∀x∈w[x∈v ↔ x∈u ∨ x=u].
But this implies (using Trans(w)),
0∈w ∧ ∀u∈w ∃v∈w ∀x[x∈v ↔ x∈u ∨ x=u]
i.e.
0∈w ∧ ∀u∈w[u∪{u}∈w].

Hence
⊢_LM ∃w[0∈w ∧ ∀u∈w[u∪{u}∈w]],

i.e. ⊢_LM Inf. ∎

§ 6. The formalization of satisfaction

Let φ be a formula of ℒ, u a set and x a sequence of elements of u. In
Ch. 5 we defined the (metamathematical) notion of x satisfying φ in any
ℒ-structure, in particular in the structure ⟨u, ∈↾u⟩, where ∈↾u is the ∈-relation
restricted to u, i.e.

∈↾u = {⟨y,z⟩∈u×u: y∈z}.

Since φ has only finitely many free variables, we may take x to be an
eventually constant sequence on u; i.e. a sequence which assumes a constant
value after some point.

We are going to show that the notion of satisfaction of a formula of
ℒ in a structure of the form ⟨u,∈↾u⟩ can be expressed in ℒ. Now any
assertion expressible in ℒ is a statement about sets, but formulas are
not sets. So if we are to succeed in our attempt we must first find some
way of replacing formulas by sets. We do this by associating each formula
of ℒ with a uniquely defined (legitimate) closed term of ℒ.
We recall that the individual variables of ℒ are assumed to be enumerated
in a sequence v_0,v_1,v_2,... . For each formula φ of ℒ we define the term
⌜φ⌝ by induction on the complexity of φ as follows. We put

⌜v_i=v_j⌝ =df ⟨0,i,j⟩

⌜v_i∈v_j⌝ =df ⟨1,i,j⟩

⌜φ∧ψ⌝ =df ⟨2,⌜φ⌝,⌜ψ⌝⟩
⌜¬φ⌝ =df ⟨3,⌜φ⌝⟩
⌜∃v_iφ⌝ =df ⟨4,i,⌜φ⌝⟩.
It is clear that this prescription assigns a unique closed term ⌜φ⌝ to each
formula φ.
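
The coding can be mimicked with nested tuples. The Python sketch below is an illustration only: Python tuples stand in for the set-theoretic tuples ⟨...⟩, and the helper names eq, mem, conj, neg, exists and level are ours. The function level assumes the stratification of codes by nesting depth, with atomic codes at level 0 and each connective or quantifier adding one level.

```python
def eq(i, j):
    """Code of the atomic formula v_i = v_j."""
    return (0, i, j)


def mem(i, j):
    """Code of the atomic formula v_i ∈ v_j."""
    return (1, i, j)


def conj(p, q):
    """Code of a conjunction, given the codes p and q of the conjuncts."""
    return (2, p, q)


def neg(p):
    """Code of a negation."""
    return (3, p)


def exists(i, p):
    """Code of the formula obtained by prefixing ∃v_i."""
    return (4, i, p)


def level(c):
    """Nesting depth of the code c; raises ValueError if c codes no formula."""
    def is_index(k):
        return isinstance(k, int) and k >= 0

    if isinstance(c, tuple) and c:
        tag = c[0]
        if tag in (0, 1) and len(c) == 3 and is_index(c[1]) and is_index(c[2]):
            return 0
        if tag == 2 and len(c) == 3:
            return 1 + max(level(c[1]), level(c[2]))
        if tag == 3 and len(c) == 2:
            return 1 + level(c[1])
        if tag == 4 and len(c) == 3 and is_index(c[1]):
            return 1 + level(c[2])
    raise ValueError("not the code of a formula")


# Code of the formula ¬∃v_1 (v_1 ∈ v_0), which says that v_0 has no members.
empty_v0 = neg(exists(1, mem(1, 0)))
```

A tuple that is not built by these five constructors is rejected by level, which corresponds to its not being the code of any formula.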
We now apply 2.11 to yield a term t(n) for which

⊢t(0)=2×ω×ω
∧ t(n+1)=t(n) ∪ [{2}×t(n)×t(n)] ∪ [{3}×t(n)] ∪ [{4}×ω×t(n)].

We put F_n for t(n), and define

the set of formulas: F =df ∪{F_n: n∈ω}.

6.1. Theorem. For each formula φ of ℒ,

⊢⌜φ⌝∈F.

Proof. This theorem is proved by a straightforward induction on the
complexity of φ; details are left to the reader. ∎

We now turn to the problem of formalizing the satisfaction definition
in ℒ. We define

the set of eventually constant sequences on u:

ec(u) =df {x: Fun(x) ∧ dom(x)=ω ∧ ran(x)⊆u
∧ ∃m ∃y∈u ∀n≥m[x‘n=y]}.

x(y/z) =df [x − {⟨y,x‘y⟩}] ∪ {⟨y,z⟩}.



Thus, for each sequence x and each natural number i, x(i/z) is the sequence
obtained from x by replacing x‘i by z.

6.2. Lemma.

⊢∃!f[Fun(f) ∧ dom(f)=F

∧ ∀i ∀j[f‘⟨0,i,j⟩={x∈ec(u): x‘i=x‘j}

∧ f‘⟨1,i,j⟩={x∈ec(u): x‘i∈x‘j}

∧ ∀v∈F ∀w∈F[f‘⟨2,v,w⟩=f‘v∩f‘w

∧ f‘⟨3,v⟩=ec(u)−f‘v

∧ ∀i[f‘⟨4,i,v⟩={x∈ec(u): ∃z∈u[x(i/z)∈f‘v]}]]]].

Proof. The formula we are trying to prove is of the form

∃!fφ(u,f).

Let ψ(n,u,f) be the formula obtained from φ by replacing the first
occurrence of F by F_n and the second and third occurrences by F_{n−1}
(where we define F_{−1} to be 0). We prove by induction on n that, given u,
for all n∈ω there is a unique function f satisfying ψ(n,u,f).
If n=0, we simply define f on F_0 such that

f‘⟨0,i,j⟩={x∈ec(u): x‘i=x‘j},

f‘⟨1,i,j⟩={x∈ec(u): x‘i∈x‘j}.

Clearly f defined in this way is the unique function satisfying ψ(0,u,f).
Now suppose that the condition holds for n; we show that it holds for
n+1. Let f be the unique function satisfying ψ(n,u,f). Then f is defined
on F_n, and we can extend f to F_{n+1} by putting

f‘⟨2,v,w⟩=f‘v∩f‘w,

f‘⟨3,v⟩=ec(u)−f‘v,

f‘⟨4,i,v⟩={x∈ec(u): ∃z∈u[x(i/z)∈f‘v]}.

Clearly the function f extended to F_{n+1} in this way satisfies ψ(n+1,u,f).
Since any such function is uniquely determined by its restriction to F_n,
it follows that the function f is unique. This completes the induction
step.
We now define g to be the union of all the functions f defined above.
Clearly g is the unique function satisfying φ(u,g).

Putting

(6.3) S(u,v) =df ιy∀f[φ(u,f) → y=f‘v],

where φ is the formula defined in the proof of Lemma 6.2, we see that,
informally speaking, if v∈F, then S(u,v) is the set of all x∈ec(u) which
satisfy v in ⟨u,∈↾u⟩. We now define:

x satisfies the formula v in u:

Sat(x,v,u) ↔df v∈F ∧ x∈S(u,v).

Our next theorem asserts that Sat(x,⌜φ⌝,u) is a formalization of the
statement: “x satisfies φ in the structure ⟨u,∈↾u⟩”.

6.4. Theorem. For any indexes i,j and any formulas φ,ψ we have

⊢x∈ec(u) → [(Sat(x,⌜v_i=v_j⌝,u) ↔ x‘i=x‘j)

∧ (Sat(x,⌜v_i∈v_j⌝,u) ↔ x‘i∈x‘j)

∧ (Sat(x,⌜φ∧ψ⌝,u) ↔ Sat(x,⌜φ⌝,u) ∧ Sat(x,⌜ψ⌝,u))

∧ (Sat(x,⌜¬φ⌝,u) ↔ ¬Sat(x,⌜φ⌝,u))

∧ (Sat(x,⌜∃v_iφ⌝,u) ↔ ∃y∈u Sat(x(i/y),⌜φ⌝,u))].

Proof. This is a straightforward consequence of Lemma 6.2 and the
definition of the formula Sat. We leave the details to the reader. ∎
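
The five clauses of Thm. 6.4 read as a recursive evaluation procedure when u is finite. The Python sketch below is an illustration under our own assumptions: u is a finite frozenset of hereditarily finite sets, formula codes are nested Python tuples with the tags 0 to 4 used above, and an assignment is a dictionary from variable indexes to members of u (a stand-in for an eventually constant sequence, listing only the indexes that matter). The name sat is ours.

```python
def sat(x, c, u):
    """Decide whether the assignment x satisfies the formula coded by c in (u, ∈ restricted to u)."""
    tag = c[0]
    if tag == 0:                                   # v_i = v_j
        return x[c[1]] == x[c[2]]
    if tag == 1:                                   # v_i ∈ v_j
        return x[c[1]] in x[c[2]]
    if tag == 2:                                   # conjunction
        return sat(x, c[1], u) and sat(x, c[2], u)
    if tag == 3:                                   # negation
        return not sat(x, c[1], u)
    if tag == 4:                                   # ∃v_i: try x(i/z) for every z in u
        return any(sat({**x, c[1]: z}, c[2], u) for z in u)
    raise ValueError("not the code of a formula")


zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})

# Code of ¬∃v_1 (v_1 ∈ v_0): "v_0 has no members".
empty_v0 = (3, (4, 1, (1, 1, 0)))

holds_for_zero = sat({0: zero}, empty_v0, two)               # True: zero is empty
holds_for_one = sat({0: one}, empty_v0, two)                 # False: zero ∈ one and zero ∈ two
holds_in_small_u = sat({0: two}, empty_v0, frozenset({two})) # True: no member of two lies in {two}
```

The last line shows a set that looks empty from inside a non-transitive structure although it is not empty in the universe, since the quantifier only ranges over members of u.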

6.5. Theorem. Let φ be a formula with free variables among v_0,...,v_n. Then

⊢∀x∈ec(u)[Sat(x,⌜φ⌝,u) ↔ φ^(u)(x‘0,...,x‘n)].

Proof. A simple induction on the complexity of φ, using 6.4. We treat
the existential case, leaving the other cases to the reader. In fact, if x∈ec(u),
we have (assuming without loss of generality that i≤n),

Sat(x,⌜∃v_iφ⌝,u) ↔ ∃y∈u Sat(x(i/y),⌜φ⌝,u)

↔ ∃y∈uφ^(u)(x‘0,...,x‘(i−1),y,x‘(i+1),...,x‘n)
↔ [∃yφ(x‘0,...,x‘(i−1),y,x‘(i+1),...,x‘n)]^(u)
↔ (∃v_iφ)^(u)(x‘0,...,x‘n). ∎

Remark. It follows immediately from Thm. 6.5 that if σ is a sentence, then

⊢u≠0 → [Val(⌜σ⌝,u) ↔ σ^(u)], where Val(v,u) ↔df ∀x∈ec(u)Sat(x,v,u).

That is, for u≠0, the formula Val(⌜σ⌝,u) holds iff σ is true in the structure
⟨u,∈↾u⟩. Accordingly we may say that the formula Val(⌜σ⌝,u) expresses
the truth of σ in ⟨u,∈↾u⟩. It is natural to ask whether there is an ℒ-formula
which expresses the truth of sentences in the whole universe of sets V.
More precisely, is there an ℒ-formula T(x) such that, for each sentence σ,
we have ⊢σ ↔ T(⌜σ⌝)? In fact, assuming that ZF is consistent, it is easy
to modify the argument of Prob. 7.7.13(h) to show that such a formula
T cannot exist. Less precisely (but more suggestively!), truth in V is not
definable in ℒ.
We define
z is an elementary substructure of u:
ES(z,u) ↔df z⊆u ∧ ∀v∈F ∀x∈ec(z)[Sat(x,v,z) ↔ Sat(x,v,u)].

Note that by 6.5 we have, for any formula φ whose free variables are
all among v_0,...,v_n,
⊢ES(z,u) → ∀v_0∈z...∀v_n∈z[φ^(z) ↔ φ^(u)].

We now prove what amounts to a formalized version of Thm. 5.2.1.


6.6. Theorem. kac Vje £ w[k0< |«| a b| - 3z[y c z a |z| = |y\ a ES(z,m)].
Proof. The proof is nothing more than a formalized version of the proof
of Thm. 5.2.1, so we give the merest sketch, leaving the reader to fill in
the details.
Let h be a choice function for Pu. For each v∈F, x∈ec(u) and i∈ω we put
s(v,x,i) for the set
{w∈u: Sat(x(i/w),v,u)}.
Define the sequence y_0,y_1,... of subsets of u inductively as follows: y_0 is
y and y_{n+1} is
{h‘s(v,x,i): v∈F ∧ x∈ec(y_n) ∧ i∈ω}.
Then, just as in the proof of Thm. 5.2.1, one verifies that the set ∪{y_n: n∈ω}
satisfies the conditions imposed on z in the theorem.
The last result in this section is proved in just the same way as Prob. 5.1.1;
again the details are left to the reader.
6.7. Theorem. For any formula φ whose free variables are all among v_0,...,v_n
we have
⊢Isom(f,x,y) → ∀v_0∈x...∀v_n∈x[φ^(x)(v_0,...,v_n) ↔ φ^(y)(f‘v_0,...,f‘v_n)].

In particular, for any sentence σ,

⊢Isom(f,x,y) → [σ^(x) ↔ σ^(y)]. ∎



§ 7. Absoluteness

A formula φ whose free variables are all among x_1,...,x_m is said to be
absolute if there is a finite sequence σ_1,...,σ_n of postulates of ZF, called
an absoluteness sequence for φ, such that, for each formula ψ(x) which
does not have free variables among x_1,...,x_m,

(7.1) ⊢Trans_ψ(x) ∧ ∃xψ(x) ∧ σ_1^(ψ) ∧ ... ∧ σ_n^(ψ)

→ ∀x_1...∀x_m[ψ(x_1) ∧ ... ∧ ψ(x_m) → [φ^(ψ) ↔ φ]],

in which the relativizations of formulas to ψ are understood to be taken
with x as chosen variable, the other free variables of ψ being regarded
as parameters. (For the definition of Trans_ψ(x), see the material following
Thm. 3.3.)
A term t is said to be absolute if for any variable x not free in t the
formula x=t is absolute.
Informally speaking, the formula φ(x_1,...,x_m) (or the term t(x_1,...,x_m))
is absolute if there is a finite sequence σ_1,...,σ_n of postulates of ZF such
that whenever A is a non-empty transitive definable class in which σ_1,...,σ_n
hold (when ∈ is interpreted as the membership relation on A), then, for
any members u_1,...,u_m of A, φ(u_1,...,u_m) holds in A iff it holds in the universe
of all sets (or the value of t(u_1,...,u_m) in A is the same as its value in the
universe of all sets).

7.2. Lemma. Suppose φ is an absolute formula whose free variables are
all among x_1,...,x_m, and let σ_1,...,σ_n be an absoluteness sequence for φ.
Then for any variable y, we have

⊢Trans(y) ∧ y≠0 ∧ σ_1^(y) ∧ ... ∧ σ_n^(y)

→ ∀x_1...∀x_m[x_1∈y ∧ ... ∧ x_m∈y → [φ^(y) ↔ φ]].

Proof. Put x∈y for ψ(x) in (7.1). ∎

We are going to show that a large number of the defined formulas and
terms we have introduced in the course of our discussion are absolute.
Let us call a formula of ℒ restricted if all its quantifiers are of the form
∀x∈y or ∃x∈y. (N.B.: This refers to formulas in the primitive notation
of the language ℒ, i.e., to formulas which do not contain virtual terms.)

7.3. Lemma.
(i) Any atomic formula is absolute. Also, if φ,φ′ are absolute, so are
¬φ, φ∧φ′, φ∨φ′, φ→φ′, φ↔φ′, ∀x∈yφ and ∃x∈yφ. Hence all restricted
formulas are absolute.
(ii) If φ is absolute and φ′ is a formula with the same free variables as
φ such that ⊢φ↔φ′, then φ′ is absolute.
(iii) Let φ_1 and φ_2 be absolute formulas such that ∀yφ_1 and ∃zφ_2 have
the same free variables. If φ is a formula with the same free variables as
∀yφ_1 and ∃zφ_2 such that ⊢φ↔∀yφ_1 and ⊢φ↔∃zφ_2, then φ is absolute.
(iv) If φ(x) is absolute and ⊢∃!xφ(x), then the term ιxφ(x) is absolute.
(v) If φ(y) and t are absolute, then so is φ(t).
(vi) If x is not free in t, and φ and t are absolute, then so are ∀x∈tφ and
∃x∈tφ.
(vii) If s(y) and t(y) are absolute, then so is s(t(y)).
(viii) If t is absolute, then so are x∈t and t∈x.
(ix) If φ(y) is an absolute formula, and s and t(x) are absolute terms,
where t(x) is not a variable, then the terms {y∈s: φ(y)} and {t(x): x∈s}
are absolute.
Proof. (i) is straightforward and is entrusted to the reader.
(ii) Suppose that φ is absolute and that ⊢φ↔φ′. Let σ_1,...,σ_n be an
absoluteness sequence for φ, and let τ_1,...,τ_m be the finite sequence of
postulates of ZF used in some proof of φ↔φ′. It is now a simple exercise
to show that φ′ is absolute with absoluteness sequence σ_1,...,σ_n,τ_1,...,τ_m.
(iii) Assume the hypotheses of (iii). Let σ_1,...,σ_n be an absoluteness
sequence for φ_1 and let τ_1,...,τ_m be the finite sequence of postulates of
ZF used in some proof of φ↔∀yφ_1. Suppose that the free variables of
φ are among x_1,...,x_k, and that ψ(x) is a formula which does not have free
variables among x_1,...,x_k. We may assume that y does not occur free
in ψ either, for otherwise we can replace ∀yφ_1 by a suitable variant
∀wφ_1(w), where w does not occur in φ_1 nor among x_1,...,x_k. Now suppose
that
Trans_ψ(x) ∧ ∃xψ(x) ∧ σ_1^(ψ) ∧ ... ∧ σ_n^(ψ) ∧ τ_1^(ψ) ∧ ... ∧ τ_m^(ψ) ∧ ψ(x_1) ∧ ... ∧ ψ(x_k).

Then, under these assumptions, we have

φ → ∀yφ_1
→ ∀y[ψ(y) → φ_1]

→ ∀y[ψ(y) → φ_1^(ψ)]

→ (∀yφ_1)^(ψ)

→ φ^(ψ).

A similar argument, using ⊢φ↔∃zφ_2, shows that, with the requisite
assumptions, φ^(ψ)→φ. Hence φ is absolute as claimed.

(iv) Suppose that φ(x) is absolute and ⊢∃!xφ(x). Then we have

⊢x=ιxφ(x) ↔ φ(x),

so the required conclusion follows from (ii).

(v) Suppose that φ(y) and t are absolute. We have

⊢φ(t) ↔ ∀y[y=t → φ(y)],

⊢φ(t) ↔ ∃y[y=t ∧ φ(y)],

so the absoluteness of φ(t) follows from (iii) and (i).

(vi), (vii) and (viii) are immediate consequences of (v) and (i).
(ix) Assume the hypotheses of (ix). We have, by (1.7) and (1.12),

⊢z={y∈s: φ(y)} ↔ ∀y∈z[y∈s ∧ φ(y)] ∧ ∀y∈s[φ(y) → y∈z],

so that the term {y∈s: φ(y)} is absolute by (i), (ii), and (vi). Also, by (1.13),

⊢z={t(x): x∈s} ↔ ∀y∈z ∃x∈s[y=t(x)] ∧ ∀x∈s ∃y∈z[y=t(x)],

so that the term {t(x): x∈s} is absolute for the same reasons as before. ∎

Using this lemma, we can now verify that each member of the following
sequence of defined formulas is absolute. In each case we first write down
the formula whose absoluteness is to be verified and then another formula
which is either the defining formula of the first or which can be proved
equivalent to it. The defining formula will be seen to be absolute by
applying Lemma 7.3 and using the fact that earlier formulas in the sequence
have been proved absolute. The absoluteness of the given formula then
follows from (ii) of 7.3.
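(1) $y \subseteq x$: $\forall z \in y[z \in x]$.
(2) $z = \{x,y\}$: $x \in z \wedge y \in z \wedge \forall u \in z[u = x \vee u = y]$.
(3) $z = \{x\}$: $z = \{x,x\}$.
(4) $z = \langle x,y \rangle$: $z = \{\{x\},\{x,y\}\}$.
(5) $z = \bigcup x$: $\forall y \in x\,\forall u \in y[u \in z] \wedge \forall y \in z\,\exists u \in x[y \in u]$.
(6) $z = x \cup y$: $z = \bigcup\{x,y\}$.
(7) $z = x \cap y$: $z \subseteq x \wedge z \subseteq y \wedge \forall u \in x[u \in y \to u \in z]$.
(8) $z = x - y$: $z \subseteq x \wedge \forall u \in x[u \notin y \leftrightarrow u \in z]$.
(9) $z = x \times y$: $\forall w \in z\,\exists u \in x\,\exists v \in y[w = \langle u,v \rangle] \wedge \forall u \in x\,\forall v \in y\,\exists w \in z[w = \langle u,v \rangle]$.
(10) $\mathrm{Fun}(f)$: $\forall z \in f\,\exists x \in \bigcup\bigcup f\,\exists y \in \bigcup\bigcup f[z = \langle x,y \rangle]$
$\wedge\ \forall x \in \bigcup\bigcup f\,\forall y \in \bigcup\bigcup f\,\forall z \in \bigcup\bigcup f[\langle x,y \rangle \in f \wedge \langle x,z \rangle \in f \to y = z]$.
(11) $y = \mathrm{dom}(f)$: $\forall x \in y\,\exists z \in \bigcup\bigcup f[\langle x,z \rangle \in f]$
$\wedge\ \forall x \in \bigcup\bigcup f\,\forall z \in \bigcup\bigcup f[\langle x,z \rangle \in f \to x \in y]$.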
CH. 10, §7], ABSOLUTENESS 505

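(12) $y = \mathrm{ran}(f)$: $\forall x \in y\,\exists z \in \bigcup\bigcup f[\langle z,x \rangle \in f]$
$\wedge\ \forall x \in \bigcup\bigcup f\,\forall z \in \bigcup\bigcup f[\langle z,x \rangle \in f \to x \in y]$.
(13) $y = 0$:
(14) $y = x+1$: $y = x \cup \{x\}$.
(15) $y = 1,2,3,4$: $y = 0+1$, $y = 1+1$, $y = 2+1$, $y = 3+1$.
(16) $y = f‘x$: $\forall w \in \bigcup\bigcup f[\langle x,w \rangle \in f \to w = y] \wedge \langle x,y \rangle \in f$
$\vee\ \neg\exists z \in \bigcup\bigcup f\,\forall w \in \bigcup\bigcup f[\langle x,w \rangle \in f \leftrightarrow w = z] \wedge y = 0$.
(17) $u = x(z/y)$: $u = [x - \{\langle z, x‘z \rangle\}] \cup \{\langle z,y \rangle\}$.
(18) $y = f \restriction x$: $y = f \cap [x \times \mathrm{ran}(f)]$.
(19)¹ $\mathrm{Ord}(x)$: $\forall y \in x\,\forall z \in x[y = z \vee y \in z \vee z \in y] \wedge \forall y \in x[y \subseteq x]$.
(20) $\mathrm{Lim}(x)$: $\mathrm{Ord}(x) \wedge x \neq 0 \wedge \forall y \in x[x \neq y+1]$.
(21) $x = \omega$: $\mathrm{Lim}(x) \wedge \forall y \in x\,\neg\mathrm{Lim}(y)$.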
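The bounded character of these defining formulas can be made concrete on hereditarily finite sets. The following is only a sketch: it models sets as nested Python `frozenset`s (an assumption made for illustration, not part of the text), and the function names are hypothetical. Each function quantifies only over members of its arguments, mirroring items (1), (5), (14), (19) and (20).

```python
# Hereditarily finite sets modelled as nested frozensets (illustrative assumption).

def is_subset(y, x):
    # (1) y ⊆ x: every member of y is a member of x
    return all(z in x for z in y)

def is_union(z, x):
    # (5) z = ⋃x: members of members of x lie in z, and each member of z
    # lies in some member of x
    return (all(u in z for y in x for u in y)
            and all(any(y in u for u in x) for y in z))

def successor(x):
    # (14) x + 1 = x ∪ {x}
    return x | frozenset([x])

def is_ordinal(x):
    # (19) members are pairwise comparable under ∈, and each member is a subset
    return (all(y == z or y in z or z in y for y in x for z in x)
            and all(is_subset(y, x) for y in x))

def is_limit(x):
    # (20) Ord(x), x ≠ 0, and x is not the successor of any of its members
    return (is_ordinal(x) and x != frozenset()
            and all(x != successor(y) for y in x))

zero = frozenset()
one = successor(zero)
two = successor(one)
three = successor(two)
```

No finite ordinal passes `is_limit`, in line with (21), which characterizes $\omega$ as the limit ordinal none of whose members is a limit ordinal.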
We must now extend the notion of absoluteness to formulas and terms
with ordinal or number variables. Given $\varphi(x)$, we say that the formula
$\varphi(\alpha)$ (or $\varphi(n)$) is absolute if $\varphi(x) \wedge \mathrm{Ord}(x)$ (or $\varphi(x) \wedge x \in \omega$) is absolute, and
similarly for terms. It follows immediately from (19) and (21) above that
if $\varphi(x)$ is absolute, so are $\varphi(\alpha)$, $\varphi(n)$, $\exists n\,\varphi(n)$ and $\forall n\,\varphi(n)$.
We next establish the absoluteness of terms defined by ordinal recursion
from absolute terms.

7.4. Lemma. Let $s(y,z)$ be an absolute term and let $t$ be a term such that
$\vdash t(\alpha) = s(t \restriction \alpha, \alpha)$. Then the formula $y = t(\alpha)$ is absolute.
Proof. Using the absoluteness of the formulas (1)-(21), the absoluteness
of $s(y,z)$ and 7.3, it is not difficult to see that the formula $\mathrm{Rec}(f,x)$ introduced
in the proof of Thm. 2.9 is absolute. But by the same proof we have

\[ \vdash y = t(\alpha) \leftrightarrow \forall f[\mathrm{Rec}(f,\alpha) \to f‘\alpha = y], \]
\[ \vdash y = t(\alpha) \leftrightarrow \exists f[\mathrm{Rec}(f,\alpha) \wedge f‘\alpha = y]. \]

The absoluteness of $y = t(\alpha)$ now follows from 7.3(iii). ∎

7.5. Lemma. Let $s_0$ and $s_1(y)$ be absolute terms, and let $t(x)$ be a term such that

\[ \vdash t(0) = s_0 \wedge t(n+1) = s_1(t(n)). \]

(See Thm. 2.11.) Then the formula $y = t(n)$ is absolute.

Proof. Let $\varphi(f,x)$ be the formula

\[ \mathrm{Fun}(f) \wedge x \cup \{x\} \cup \{0\} \subseteq \mathrm{dom}(f) \wedge f‘0 = s_0 \wedge \forall n \in \mathrm{dom}(f)[f‘(n+1) = s_1(f‘n)]. \]

1 See Prob. 3.8 (iv).


Then, as in the proof of 7.4, it is not difficult to see that

\[ \vdash y = t(n) \leftrightarrow \forall f[\varphi(f,n) \to f‘n = y], \]
\[ \vdash y = t(n) \leftrightarrow \exists f[\varphi(f,n) \wedge f‘n = y]. \]

Since $\varphi(f,x)$ is absolute by previous results of this section, it follows from
7.3(iii) that $y = t(n)$ is absolute. ∎

7.6. Corollary. The term ec(u) is absolute.


Proof. By recursion we define a term $t(y,u)$ such that

\[ \vdash t(0,u) = \{\omega \times \{x\} : x \in u\} \wedge t(n+1,u) = \{y(n/x) : x \in u \wedge y \in t(n,u)\}. \]

This is a recursion of the type considered in 7.5, and, using the appropriate
instances of (1)-(21), we see that the terms playing the roles of $s_0$ and $s_1$
are absolute. Thus $y = t(n,u)$ is absolute by 7.5. Also, it is easy to verify
that $t(n,u)$ is the set of all sequences from $u$ which assume a constant value
from the $n$th place on, so that

\[ \vdash \mathrm{ec}(u) = \bigcup\{t(n,u) : n \in \omega\}. \]

The absoluteness of $\mathrm{ec}(u)$ follows. ∎

7.7. Corollary. The term F is absolute.


Proof. In §6 we introduced a term $t(x)$ for which

\[ \vdash t(0) = 2 \times \omega \times \omega \wedge t(n+1) = t(n) \cup [\{2\} \times t(n) \times t(n)] \cup [\{3\} \times t(n)] \cup [\{4\} \times \omega \times t(n)]. \]

This is a recursive definition of the type considered in 7.5 and, using the
appropriate instances of (1)-(21), we see that the terms playing the roles
of $s_0$ and $s_1$ are absolute. Therefore $t(n)$ itself is absolute by 7.5. But by
definition we have $F = \bigcup\{t(n) : n \in \omega\}$ and the result follows. ∎

7.8. Lemma. The formula Sat(x,v,u) is absolute.


Proof. Recall that in Lemma 6.2 we proved that

\[ (1)\quad \vdash \exists! f\,\varphi(u,f) \]

for a certain formula $\varphi(u,f)$. Using the results already proved in this section,
it is easy to see that $\varphi(u,f)$ is absolute.
We also recall that in (6.3) we put $S(u,v)$ for

\[ \iota z[\forall f[\varphi(u,f) \to z = f‘v]]. \]

It follows easily from (1) and 7.3(iv) that $S(u,v)$ is absolute. Since $\mathrm{Sat}(x,v,u)$
was defined to be the formula $v \in F \wedge x \in S(u,v)$, the required result follows
from 7.7. ∎

It is important to observe that there are formulas and terms of $\mathcal{L}$ which
are not absolute. For example, let us define

z is countable: $C(z) =_{df} \exists \alpha \le \omega\,(z \approx \alpha)$.

Then we have:

7.9. Theorem. If ZFC is consistent¹, then the formula $C(z)$ is not absolute.
Proof. Suppose that $C(z)$ is absolute; let $\sigma_1,\dots,\sigma_m$ be an absoluteness
sequence for $C(z)$, and let $\tau_1,\dots,\tau_n$ be a list consisting of all postulates
of ZF used in a proof of Cantor's theorem that $\exists z\,\neg C(z)$. Put $\sigma$ for
$\sigma_1 \wedge \dots \wedge \sigma_m$, $\tau$ for $\tau_1 \wedge \dots \wedge \tau_n$, and $\varphi(y)$ for the formula

\[ \mathrm{Trans}(y) \wedge |y| = \aleph_0 \wedge \sigma^{(y)} \wedge \tau^{(y)}. \]

Now $\tau \to \exists z\,\neg C(z)$ is a logically valid sentence, so that, by Cor. 2.12.4,
the formula

\[ \exists x(x \in y) \wedge \tau^{(y)} \to [\exists z\,\neg C(z)]^{(y)} \]

is logically valid. Hence

\[ (1)\quad \vdash \varphi(y) \to [\exists z\,\neg C(z)]^{(y)}. \]

A straightforward application of $\mathrm{RP}_1$ (see §5) and Thm. 6.6 shows that,
assuming AC, there is a set $u$ for which

\[ (2)\quad \mathrm{Ex}(u) \wedge |u| = \aleph_0 \wedge \sigma^{(u)} \wedge \tau^{(u)}. \]

By the Mostowski collapsing lemma (3.11), there is a collapsing isomorphism
$f$ of $u$ onto a transitive set $y$. It follows from this, (2) and 6.7 that

\[ \mathrm{Trans}(y) \wedge |y| = \aleph_0 \wedge \sigma^{(y)} \wedge \tau^{(y)}. \]

We have thus shown that

\[ \vdash_{AC} \exists y\,\varphi(y). \]

This, together with (1), gives

\[ \vdash_{AC} \exists y[\varphi(y) \wedge [\exists z\,\neg C(z)]^{(y)}], \]

1 Later we shall show that if ZF is consistent, so is ZFC. Hence we only really need the
consistency of ZF for Thm. 7.9 to hold.
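i.e.,

\[ (3)\quad \vdash_{AC} \exists y[\varphi(y) \wedge \exists z \in y\,\neg C(z)^{(y)}]. \]

However, by the definition of $\varphi(y)$, we have

\[ (4)\quad \vdash \varphi(y) \to \mathrm{Trans}(y) \wedge y \neq 0 \wedge \sigma^{(y)}, \]

so, since $\sigma_1,\dots,\sigma_m$ is an absoluteness sequence for $C(z)$, it follows from (3), (4)
and 7.2 that

\[ \vdash_{AC} \exists y[\varphi(y) \wedge \exists z \in y\,\neg C(z)]. \]

Hence, using

\[ \vdash \varphi(y) \to \mathrm{Trans}(y) \wedge |y| = \aleph_0, \]

we have

\[ \vdash_{AC} \exists y[|y| = \aleph_0 \wedge \exists z \subseteq y\,\neg C(z)]. \]

From this it would follow immediately that ZFC is inconsistent. ∎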

Important Remark. The argument we have just given is essentially
a formal version of an informal argument, due to Skolem, which
runs as follows. Suppose ZF is consistent. Then, by the Löwenheim-
Skolem Theorem it has a countable model $\mathfrak{A} = \langle A,E \rangle$. Now Cantor's
Theorem must hold in $\mathfrak{A}$, so there is a member $a$ of $A$ such that $\mathfrak{A} \models \neg C[a]$.
For each $x \in A$, let

\[ \hat{x} = \{y \in A : yEx\}, \]

let

\[ A' = \{\hat{x} : x \in A\}, \]

let $E'$ be defined on $A'$ by $\hat{x}E'\hat{y} \Leftrightarrow xEy$, and let $\mathfrak{A}' = \langle A',E' \rangle$. Using the
fact that the Axiom of Extensionality holds in $\mathfrak{A}$, we see that the map $x \mapsto \hat{x}$
establishes an isomorphism of $\mathfrak{A}$ onto $\mathfrak{A}'$. It follows that $\mathfrak{A}' \models \neg C[\hat{a}]$.
But clearly we have $\hat{a} \subseteq A$, so $C(\hat{a})$ holds in the universe of sets. Thus
$\hat{a}$ is uncountable from the point of view of $\mathfrak{A}'$ but is countable from the
point of view of the universe of sets. This is Skolem's paradox, in which
a set may be uncountable inside a model of ZF but countable “from the
outside”. (Of course, this simply means that the given model does not
contain a function counting the set in question, although such a function
can be found somewhere in V.)
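The passage from $\langle A,E \rangle$ to a structure whose elements are subsets of $A$ can be tried out on a toy finite structure. The sketch below is purely illustrative: the three-element structure and the names `hat_map`, `A`, `E` are hypothetical, and a finite structure is of course not a model of ZF. It only shows that sending each element to its set of $E$-predecessors is one-to-one when the structure is extensional, and that the relation transported along the map has the same pairs.

```python
def hat_map(A, E):
    """Send each x in A to the set of its E-predecessors {y in A : y E x}."""
    return {x: frozenset(y for y in A if (y, x) in E) for x in A}

# A toy extensional structure; a pair (y, x) in E is read "y E x".
A = {"a", "b", "c"}
E = {("a", "b"), ("a", "c"), ("b", "c")}

hat = hat_map(A, E)

# Extensionality of (A, E) makes the map one-to-one.
injective = len(set(hat.values())) == len(A)

# The induced relation E' on the image: hat(y) E' hat(x) iff y E x.
E_prime = {(hat[y], hat[x]) for (y, x) in E}
```

Every value of `hat` is a subset of `A`, which is the finite counterpart of the observation that the image of $a$ is a subset of the countable set $A$.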

7.10. Problem. Show that the term $\mathrm{P}\omega$ is not absolute, so that the term $\mathrm{P}x$
is not absolute.
CH. 10, §8], CONSTRUCTIBLE SETS 509

§ 8. Constructible sets

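The sets $R_\alpha$ are determined by the condition:

\[ R_0 = 0, \qquad R_{\beta+1} = \mathrm{P}R_\beta, \qquad R_\lambda = \bigcup\{R_\beta : \beta < \lambda\} \text{ for limit } \lambda. \]

As $\alpha$ runs through the ordinals, the $R_\alpha$'s exhaust the universe of (regular)
sets. This universe is thus built up by: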
(a) starting from scratch (i.e. from 0);
(b) at each successor stage, collecting together all sets whose members
have already been collected;
(c) at each limit stage, collecting together all sets which have already
been collected;
(d) iterating this process indefinitely (i.e., throughout the ordinals).
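For finite stages the process (a)-(d) can be carried out literally. The following sketch, with sets modelled as nested `frozenset`s and with the hypothetical helper names `powerset` and `R`, builds $R_0,\dots,R_4$ by iterating the power set operation and records how quickly the stages grow. It covers only finite stages; limit stages are outside its scope.

```python
from itertools import combinations

def powerset(u):
    """All subsets of the finite set u, each as a frozenset."""
    items = list(u)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def R(n):
    """The finite stage R_n: start from the empty set and apply P n times."""
    stage = frozenset()
    for _ in range(n):
        stage = powerset(stage)
    return stage

# Sizes of R_0, ..., R_4; the next stage already has 2**16 members.
sizes = [len(R(n)) for n in range(5)]

# Each stage is transitive, and the stages increase.
transitive = all(x <= R(4) for x in R(4))
increasing = all(R(n) <= R(n + 1) for n in range(4))
```

The list `sizes` doubles exponentially from stage to stage, which is the finite shadow of the rapid growth attributed to the operation P in the discussion that follows.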
We now enquire whether this process can be modified so that instead of
obtaining the whole universe we obtain a smaller collection in which,
however, the axioms of ZF still hold.
Now (a) and (c) are extremely natural, and in any case they do not seem
to be responsible for the largeness of the resulting collection. We do not,
accordingly, propose to modify (a) and (c).
As for (d), it is certainly partly responsible for the large size of the resulting
collection, because it tells us to continue the process of collecting new
sets as long as possible. The obvious way to modify this is to stop the
process at some (large) ordinal a. We shall not, however, pursue this
possibility here.
We focus our attention on (b): $R_{\beta+1} = \mathrm{P}R_\beta$. This is certainly responsible
for a very rapid increase in the size of the $R_\beta$'s, because, by Cantor's theorem,
we have $|R_\beta| < |R_{\beta+1}|$. It is therefore reasonable to suppose that the largeness
of the collection obtained by applying the process (a)-(d) is to a considerable
extent due to the great “strength” of the operation P in (b), and we shall
therefore try to obtain a smaller collection by replacing P by a weaker
operation.
Now, the size of the power set Pu of a given set u is proportional not
only to the size of u but also to the “richness” of the entire universe (see
Prob. 7.10). We propose to modify the process (a)-(d) by replacing P
by another operation D. For a given set u, Du is to contain only those
subsets of u whose existence can be ascertained by, so to speak, examining
only $u$ itself, without scanning the whole universe. (These are the so-called
predicatively defined subsets of $u$.) Our hope is that the resulting collection,
the universe L of constructible sets, will be much smaller than the
entire universe. On the other hand, since we are going to include in Du
every particular subset of u which is definable in ZF, it will turn out that
L is a model of ZF. Moreover, because of the orderly way in which the
members of L are constructed, AC will hold in L. Finally, since at each
stage in the construction of L we are adding as few new sets as possible,
it will follow that the power set operation in L is as weak as it possibly
can be, and so GCH will hold in L.
These considerations lead to the construction described in the remainder
of the present section.
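We first define

the set of all definable subsets of u:

\[ \mathrm{D}u =_{df} \{y : y \subseteq u \wedge [y = 0 \vee \exists v\,\exists x[v \in F \wedge x \in \mathrm{ec}(u) \wedge y = \{z \in u : \mathrm{Sat}(x(0/z), v, u)\}]]\}. \]

Thus, if $u \neq 0$, $\mathrm{D}u$ consists of all those subsets of $u$ which are definable in
$\langle u, \in|u \rangle$ by a $\ulcorner$formula$\urcorner$ of $\mathcal{L}$ involving parameters from $u$, while $\mathrm{D}0 = \{0\}$.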
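To see what "definable in $\langle u, \in|u \rangle$ with parameters" amounts to, here is a small evaluator for finite $u$. It is a sketch under stated assumptions: the tuple coding of formulas below (tags 0 to 4) is hypothetical and chosen for illustration, since the coding of §6 is not reproduced here; assignments are finite dictionaries rather than eventually constant sequences; and the names `sat` and `definable_subset` are invented.

```python
# Hypothetical formula codes (an assumption, not the coding of the text):
#   (0, i, j): v_i = v_j        (1, i, j): v_i ∈ v_j
#   (2, a, b): a ∧ b            (3, a): ¬a            (4, i, a): ∃v_i a
# All quantifiers range over u only.

def sat(x, v, u):
    """Does the assignment x (dict: index -> member of u) satisfy code v in (u, ∈)?"""
    tag = v[0]
    if tag == 0:
        return x[v[1]] == x[v[2]]
    if tag == 1:
        return x[v[1]] in x[v[2]]
    if tag == 2:
        return sat(x, v[1], u) and sat(x, v[2], u)
    if tag == 3:
        return not sat(x, v[1], u)
    if tag == 4:
        return any(sat({**x, v[1]: w}, v[2], u) for w in u)
    raise ValueError("unknown formula code")

def definable_subset(u, v, params):
    """{z in u : Sat(x(0/z), v, u)}, where x assigns params to v_1, v_2, ..."""
    x = {k + 1: p for k, p in enumerate(params)}
    return frozenset(z for z in u if sat({**x, 0: z}, v, u))

zero = frozenset()
one = frozenset([zero])
two = frozenset([zero, one])
u = frozenset([zero, one, two])

members_of_two = definable_subset(u, (1, 0, 1), [two])      # v_0 ∈ v_1, v_1 := two
empty = definable_subset(u, (3, (0, 0, 0)), [])             # ¬(v_0 = v_0)
nonempty_in_u = definable_subset(u, (4, 1, (1, 1, 0)), [])  # ∃v_1 (v_1 ∈ v_0)
```

The evaluator only ever inspects members of `u`, which is the sense in which the subsets collected by D are found "by examining only $u$ itself".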

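8.1. Lemma. For any formula $\varphi$ of $\mathcal{L}$ whose free variables are all among
$v_0,\dots,v_n$,

\[ \vdash u \neq 0 \wedge \exists v_1 \in u \dots \exists v_n \in u[y = \{z \in u : \varphi^{(u)}(z,v_1,\dots,v_n)\}] \to y \in \mathrm{D}u. \]

Proof. Assume the antecedent and fix $v_1,\dots,v_n \in u$ so that

\[ y = \{z \in u : \varphi^{(u)}(z,v_1,\dots,v_n)\}. \]

Define $x \in \mathrm{ec}(u)$ by letting $x‘0$ be any element of $u$, $x‘k = v_k$ for $0 < k \le n$
and $x‘k = x‘0$ for $k \in \omega - \{0,\dots,n\}$. Then, by 6.5,

\[ z \in u \to [\mathrm{Sat}(x(0/z), \ulcorner\varphi\urcorner, u) \leftrightarrow \varphi^{(u)}(z,v_1,\dots,v_n)]. \]

Hence $y = \{z \in u : \mathrm{Sat}(x(0/z), \ulcorner\varphi\urcorner, u)\}$; since $\ulcorner\varphi\urcorner \in F$ by 6.1, it follows
that $y \in \mathrm{D}u$. ∎

We now apply Thm. 2.9, taking for $s(y)$ the term $\bigcup\{\mathrm{D}z : z \in \mathrm{ran}(y)\}$.
We obtain a term $L_x$ such that

\[ (8.2)\quad \vdash L_\alpha = \bigcup\{\mathrm{D}L_\beta : \beta < \alpha\}. \]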

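8.3. Theorem.
(i) $\vdash \beta < \alpha \to L_\beta \in L_\alpha \wedge L_\beta \subseteq L_\alpha$.
(ii) $\vdash \mathrm{Trans}(L_\alpha)$.
(iii) $\vdash L_0 = 0 \wedge L_{\alpha+1} = \mathrm{D}L_\alpha \wedge [\mathrm{Lim}(\lambda) \to L_\lambda = \bigcup\{L_\beta : \beta < \lambda\}]$.
(iv) $\vdash L_\alpha \subseteq R_\alpha$.
Proof. (i) If $\beta < \alpha$, then $L_\beta \subseteq L_\alpha$ follows immediately from (8.2). To show
that $L_\beta \in L_\alpha$, it is clearly enough to prove that $L_\beta \in \mathrm{D}L_\beta$. If $L_\beta \neq 0$, this
follows from 8.1 and the obvious fact that

\[ L_\beta = \{x \in L_\beta : (x = x)^{(L_\beta)}\}. \]

If $L_\beta = 0$, then $L_\beta \in \{0\} = \mathrm{D}L_\beta$.

(ii) If $u \in v \in L_\alpha$, then $u \in v \in \mathrm{D}L_\beta$ for some $\beta < \alpha$ by (8.2). Hence $u \in v \subseteq L_\beta$,
so that $u \in L_\beta$, whence $u \in L_\alpha$ by (i).
(iii) is proved like 3.3(iii), and (iv) is easily proved by transfinite induction,
using the obvious fact that $\vdash \mathrm{D}u \subseteq \mathrm{P}u$. ∎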

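We now define

x is constructible: $L(x) =_{df} \exists \alpha(x \in L_\alpha)$,

the order of x: $\lambda(x) =_{df} \mu\alpha[x \in \mathrm{D}L_\alpha]$.

Notice that it follows immediately from 8.3(ii) that $L(x)$ is a transitive
formula.
The class of all sets $x$ satisfying $L(x)$ (together with the membership relation
restricted to this class) is called the constructible universe.

8.4. Theorem. $\vdash L(\alpha) \wedge \lambda(\alpha) = \alpha$.

Proof. If $\alpha \in L_\alpha$ then, by 8.3(iv), $\alpha \in R_\alpha$, which contradicts 3.3(v). Thus
we need only show that $\vdash \alpha \in \mathrm{D}L_\alpha$; we argue by transfinite induction. First,
we have $0 \in \{0\} = \mathrm{D}L_0$. Suppose that $\alpha > 0$ and $\beta < \alpha$. By inductive hypothesis,
$\beta \in \mathrm{D}L_\beta$, and so $\beta \in L_\alpha$. On the other hand, if $\beta \ge \alpha$ we cannot have $\beta \in L_\alpha$,
for this would, in conjunction with $\mathrm{Trans}(L_\alpha)$, imply that $\alpha \in L_\alpha$, which
we have already shown to be false. Hence

\[ (1)\quad \alpha = \{x \in L_\alpha : \mathrm{Ord}(x)\}. \]

But since $\alpha > 0$, it follows from 8.3(i) that $L_\alpha \neq 0$; and $\vdash \mathrm{Trans}(L_\alpha)$ by 8.3(ii).
Hence, by 3.9(ii),

\[ (2)\quad \forall x \in L_\alpha[\mathrm{Ord}(x) \leftrightarrow \mathrm{Ord}^{(L_\alpha)}(x)]. \]

(1) and (2) now imply that

\[ \alpha = \{x \in L_\alpha : \mathrm{Ord}^{(L_\alpha)}(x)\}, \]

so that $\alpha \in \mathrm{D}L_\alpha$ by 8.1. ∎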

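8.5. Lemma. The term $\mathrm{D}u$ and the formulas $y = L_\alpha$ and $y \in L_\alpha$ are absolute.
Proof. The results of §7 and the equivalence

\[ \vdash w = \mathrm{D}u \leftrightarrow 0 \in w \]
\[ \wedge\ \forall y \in w[y \subseteq u \wedge [y = 0 \vee \exists v \in F\,\exists x \in \mathrm{ec}(u)[y = \{z \in u : \mathrm{Sat}(x(0/z), v, u)\}]]] \]
\[ \wedge\ \forall v \in F\,\forall x \in \mathrm{ec}(u)[\{z \in u : \mathrm{Sat}(x(0/z), v, u)\} \in w] \]

immediately imply that the term $\mathrm{D}u$ is absolute. Thus the term
$\bigcup\{\mathrm{D}z : z \in \mathrm{ran}(y)\}$ is absolute; since $L_\alpha$ was constructed from this term by
ordinal recursion, the formula $y = L_\alpha$ is absolute by 7.4. This and 7.3(viii)
imply that the formula $y \in L_\alpha$ is absolute too. ∎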

Our next result is the important reflection principle for the constructible
universe.

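8.6. Theorem. For any formulas $\varphi_1,\dots,\varphi_n$ whose free variables are among
$x_1,\dots,x_m$,

\[ \vdash \exists \beta[\alpha < \beta \wedge \forall x_1 \in L_\beta \dots \forall x_m \in L_\beta[[\varphi_1^{(L)} \leftrightarrow \varphi_1^{(L_\beta)}] \wedge \dots \wedge [\varphi_n^{(L)} \leftrightarrow \varphi_n^{(L_\beta)}]]]. \]

Proof. By Thm. 8.3 we have

\[ \vdash \alpha \le \beta \to L_\alpha \subseteq L_\beta, \]
\[ \vdash \mathrm{Lim}(\lambda) \to L_\lambda = \bigcup\{L_\beta : \beta < \lambda\}. \]

The required result now follows immediately from Thm. 5.3. ∎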

We now show that the constructible universe is a model of ZF; more


precisely, we have:

8.7. Theorem. For each postulate $\sigma$ of ZF, $\vdash \sigma^{(L)}$.

Proof. Recall that in §5 we introduced the set of postulates for LM and
we proved in Thm. 5.6 that ZF and LM are equivalent. Accordingly, in
order to prove the present theorem it suffices to show that

\[ (1)\quad \vdash \tau^{(L)} \text{ for every postulate } \tau \text{ of LM.} \]

For suppose that (1) holds, and let $\sigma$ be any postulate of ZF. Then by
5.6 we have¹ $\vdash_{LM} \sigma$, so that, for some finite sequence $\tau_1,\dots,\tau_n$ of postulates
of LM, the formula $\tau_1 \wedge \dots \wedge \tau_n \to \sigma$ is logically true. Therefore, by 2.12.4,
the formula

\[ \exists x\,L(x) \wedge \tau_1^{(L)} \wedge \dots \wedge \tau_n^{(L)} \to \sigma^{(L)} \]

is also logically true. But it follows instantly from 8.4 that $\vdash \exists x\,L(x)$.
Hence,

\[ \vdash \tau_1^{(L)} \wedge \dots \wedge \tau_n^{(L)} \to \sigma^{(L)}, \]

and it now follows from (1) that $\vdash \sigma^{(L)}$.

1 We recall from §5 that $\vdash_{LM} \varphi$ stands for “$\varphi$ is deducible from the postulates of LM”.

We prove (1) by considering all the postulates of LM in turn. We recall
that these are Ext, Pow, Reg, Sep and $\mathrm{RP}_1$.
(i) Ext, Reg. That (1) holds for these postulates is a straightforward
consequence of the transitivity of L(x); we leave the verification of this
to the reader.
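(ii) Pow. We have to show that

\[ \vdash L(z) \to \exists x[L(x) \wedge \forall y[L(y) \to [y \in x \leftrightarrow \forall u[L(u) \wedge u \in y \to u \in z]]]]. \]

The fact that $L(x)$ is transitive allows us to drop $L(u)$ in the above formula,
and we therefore have to prove

\[ (2)\quad \vdash L(z) \to \exists x[L(x) \wedge \forall y[L(y) \to [y \in x \leftrightarrow y \subseteq z]]]. \]

Assume $L(z)$, and put $\alpha$ for

\[ \bigcup\{\lambda(y) : L(y) \wedge y \subseteq z\} + 1. \]

It follows instantly that

\[ (3)\quad L(y) \wedge y \subseteq z \to y \in L_\alpha, \]

so that

\[ \{y : L(y) \wedge y \subseteq z\} = \{y : y \in L_\alpha \wedge y \subseteq z\}. \]

It follows from (3) that $z \in L_\alpha$, and, since $L_\alpha$ is transitive, it is easy to see that

\[ y \in L_\alpha \to [(y \subseteq z)^{(L_\alpha)} \leftrightarrow y \subseteq z]. \]

Hence

\[ \{y : L(y) \wedge y \subseteq z\} = \{y : y \in L_\alpha \wedge (y \subseteq z)^{(L_\alpha)}\}. \]

Thus, if we put $x$ for

\[ \{y : L(y) \wedge y \subseteq z\}, \]

we deduce from 8.1 that $x \in \mathrm{D}L_\alpha$, so that $L(x)$. Also, it is clear from the
definition of $x$ that

\[ \forall y[L(y) \to [y \in x \leftrightarrow y \subseteq z]], \]

which gives (2).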


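(iii) Sep. Let $\varphi$ be any formula with free variables among $x,y_1,\dots,y_k$,
and suppose that $z$ is not free in $\varphi$. We have to show that

\[ (4)\quad \vdash L(y_1) \wedge \dots \wedge L(y_k) \wedge L(u) \to \exists z[L(z) \wedge \forall x[L(x) \to [x \in z \leftrightarrow \varphi^{(L)} \wedge x \in u]]]. \]

Assume that $L(y_1) \wedge \dots \wedge L(y_k) \wedge L(u)$, and put $z$ for

\[ \{x \in u : \varphi^{(L)}\}. \]

There are ordinals $\alpha, \alpha_1,\dots,\alpha_k$ such that $u \in L_\alpha \wedge y_1 \in L_{\alpha_1} \wedge \dots \wedge y_k \in L_{\alpha_k}$. Let
$\gamma$ be the largest of the ordinals $\alpha, \alpha_1,\dots,\alpha_k$; then clearly $\{u,y_1,\dots,y_k\} \subseteq L_\gamma$.
By 8.6, there exists an ordinal $\beta > \gamma$ such that

\[ x \in L_\beta \wedge y_1 \in L_\beta \wedge \dots \wedge y_k \in L_\beta \to [\varphi^{(L)} \leftrightarrow \varphi^{(L_\beta)}]. \]

Since $\beta > \gamma$ we have $L_\gamma \subseteq L_\beta$, so that $\{u,y_1,\dots,y_k\} \subseteq L_\beta$. Hence

\[ \forall x \in L_\beta[\varphi^{(L)} \leftrightarrow \varphi^{(L_\beta)}]. \]

Using the transitivity of $L_\beta$ and the fact that $u \in L_\beta$, we see that this implies
that

\[ z = \{x \in L_\beta : [x \in u \wedge \varphi]^{(L_\beta)}\}. \]

Therefore, by 8.1, $z \in \mathrm{D}L_\beta = L_{\beta+1}$, so that $L(z)$. Also, it follows immediately
from the definition of $z$ that

\[ L(x) \to [x \in z \leftrightarrow \varphi^{(L)} \wedge x \in u]. \]

(4) follows.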
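(iv) $\mathrm{RP}_1$. We have to show that

\[ (5)\quad \vdash L(y_1) \wedge \dots \wedge L(y_k) \to \exists u[L(u) \wedge y_1 \in u \wedge \dots \wedge y_k \in u \wedge \mathrm{Trans}^{(L)}(u) \]
\[ \wedge\ \forall x_1 \in u \dots \forall x_m \in u[L(x_1) \wedge \dots \wedge L(x_m) \to [(\varphi_1 \leftrightarrow \varphi_1^{(u)})^{(L)} \wedge \dots \wedge (\varphi_n \leftrightarrow \varphi_n^{(u)})^{(L)}]]], \]

where $\varphi_1,\dots,\varphi_n$ are any formulas all of whose free variables are among
$x_1,\dots,x_m$. But, in view of the presence of $L(u)$ and the transitivity of the
formula $L(x)$, it is clear that, in (5), $(\varphi_j \leftrightarrow \varphi_j^{(u)})^{(L)}$ may be replaced by
$\varphi_j^{(L)} \leftrightarrow \varphi_j^{(u)}$, $\mathrm{Trans}^{(L)}(u)$ may be replaced by $\mathrm{Trans}(u)$, and $L(x_1) \wedge \dots \wedge L(x_m)$
may be suppressed. Thus we have to prove

\[ (6)\quad \vdash L(y_1) \wedge \dots \wedge L(y_k) \to \exists u[L(u) \wedge y_1 \in u \wedge \dots \wedge y_k \in u \wedge \mathrm{Trans}(u) \]
\[ \wedge\ \forall x_1 \in u \dots \forall x_m \in u[[\varphi_1^{(L)} \leftrightarrow \varphi_1^{(u)}] \wedge \dots \wedge [\varphi_n^{(L)} \leftrightarrow \varphi_n^{(u)}]]]. \]

So let us assume that $L(y_1) \wedge \dots \wedge L(y_k)$. Then, as in the proof of (iii),
there is an ordinal $\gamma$ such that $\{y_1,\dots,y_k\} \subseteq L_\gamma$. By 8.6, there is an ordinal
$\beta > \gamma$ such that

\[ (7)\quad \forall x_1 \in L_\beta \dots \forall x_m \in L_\beta[[\varphi_1^{(L)} \leftrightarrow \varphi_1^{(L_\beta)}] \wedge \dots \wedge [\varphi_n^{(L)} \leftrightarrow \varphi_n^{(L_\beta)}]]. \]

Taking $u = L_\beta$, we see that $y_1 \in u \wedge \dots \wedge y_k \in u \wedge \mathrm{Trans}(u)$, and this, together
with (7), gives (6). ∎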

The reader may find it instructive to prove Thm. 8.7 directly without
going through LM.

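Let $\psi(x)$ be a formula with one free variable $x$. We shall say that $\psi(x)$
defines a transitive model of ZF if

\[ \vdash \exists x\,\psi \wedge \mathrm{Trans}_\psi(x) \quad\text{and}\quad \vdash \sigma^{(\psi)} \]

for every postulate $\sigma$ of ZF. Speaking somewhat imprecisely, $\psi(x)$ defines
a transitive model of ZF iff it is provable in ZF that the class defined by
$\psi$ is a transitive model of ZF (with $\in$ interpreted as the membership
relation).
Thm. 8.7 implies that $L(x)$ defines a transitive model of ZF. Moreover,
we have:

8.8. Theorem. Let $\psi(x)$ be a formula which defines a transitive model
of ZF. Then:
(i) $\vdash \forall\alpha\,\psi(\alpha) \to \forall x[\psi(x) \to [L^{(\psi)}(x) \leftrightarrow L(x)]]$;
(ii) $\vdash \psi(\alpha) \to \psi(L_\alpha)$;
(iii) $\vdash \forall\alpha\,\psi(\alpha) \to \forall x[L(x) \to \psi(x)]$.
Proof. (i) We have to show, assuming $\forall\alpha\,\psi(\alpha)$,

\[ (1)\quad \psi(x) \to [L^{(\psi)}(x) \leftrightarrow L(x)]. \]

Now $L^{(\psi)}(x)$ is $(\exists\alpha[x \in L_\alpha])^{(\psi)}$, and by Prob. 3.9(iii), since by assumption
$\vdash \exists x\,\psi \wedge \mathrm{Trans}_\psi(x)$, we have

\[ \vdash (\exists\alpha[x \in L_\alpha])^{(\psi)} \leftrightarrow \exists\alpha[\psi(\alpha) \wedge [x \in L_\alpha]^{(\psi)}]. \]

Hence

\[ \vdash L^{(\psi)}(x) \leftrightarrow \exists\alpha[\psi(\alpha) \wedge [x \in L_\alpha]^{(\psi)}]. \]

But we know that the formula $x \in L_\alpha$ is absolute, so, since $\psi(x)$ defines
a transitive model of ZF, we obtain

\[ \vdash \psi(x) \to [L^{(\psi)}(x) \leftrightarrow \exists\alpha[\psi(\alpha) \wedge x \in L_\alpha]]. \]

But, since we are assuming $\forall\alpha\,\psi(\alpha)$, we may suppress the $\psi(\alpha)$ and so get

\[ \psi(x) \to [L^{(\psi)}(x) \leftrightarrow \exists\alpha[x \in L_\alpha]], \]

i.e. (1).
(ii) Since $\psi$ defines a transitive model of ZF, and $\vdash \exists y[y = L_\alpha]$, it follows
that $\vdash [\forall\alpha\,\exists y[y = L_\alpha]]^{(\psi)}$, hence, using Prob. 3.9(iii),

\[ \vdash \psi(\alpha) \to \exists y[\psi(y) \wedge [y = L_\alpha]^{(\psi)}]. \]

But the formula $y = L_\alpha$ is absolute, so that we obtain

\[ \vdash \psi(\alpha) \to \exists y[\psi(y) \wedge y = L_\alpha], \]

which implies $\vdash \psi(\alpha) \to \psi(L_\alpha)$, as required.

(iii) Assume $\forall\alpha\,\psi(\alpha)$ and $L(x)$. Then $x \in L_\alpha$ for some ordinal $\alpha$, and
$\psi(L_\alpha)$ by (ii). Hence $\psi(x)$ follows from the transitivity of $\psi$. ∎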

Thm. 8.8.(iii) says that any transitive definable class model of ZF that
contains all the ordinals includes the constructible universe. On the other
hand Thms. 8.4 and 8.7 imply that the constructible universe is a model
of this kind. Therefore the constructible universe is the smallest definable
transitive model of ZF containing all the ordinals. This is an invariant
characterization of the constructible universe, independent of the method by
which it is defined.

§ 9. The consistency of AC and GCH

The Axiom of Constructibility is the sentence

Constr: $\forall x\,L(x)$.

We do not adopt Constr as a postulate. We shall see, however, that it


plays an important role in establishing the consistency of AC and GCH.

9.1. Theorem. $\vdash \mathrm{Constr}^{(L)}$.


Proof. By Thm. 8.7, L(x) defines a transitive model of ZF and by Thm. 8.4
CH. 10, §9], THE CONSISTENCY OF AC AND GCH 517

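we have $\forall\alpha\,L(\alpha)$. Hence, by Thm. 8.8(i),

\[ \vdash \forall x[L(x) \to [L^{(L)}(x) \leftrightarrow L(x)]], \]

so that

\[ \vdash \forall x[L(x) \to L^{(L)}(x)], \]

i.e.

\[ \vdash \mathrm{Constr}^{(L)}. \] ∎

9.2. Corollary. Constr is consistent relative to ZF.

Proof. Since $\vdash L(0)$, we have $\vdash \exists x\,L(x)$. By 8.7, we have $\vdash \sigma^{(L)}$ for every
postulate $\sigma$ of ZF, and by 9.1 we have $\vdash \mathrm{Constr}^{(L)}$. The conclusion now
follows from 3.6. ∎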

Despite the fact that Constr is consistent relative to ZF, most set
theorists do not accept it as an axiom for set theory. For to accept Constr
would amount to identifying the universe of sets with the constructible
universe, and the latter is far too neat and tidy — like a police state. Each
constructible set is generated by means of a formula and a finite number
of previously constructed sets and can therefore be identified by these
“birthmarks”. Since there is no reason to suppose that all sets can be
obtained and identified in this way, Constr is to be rejected as a postulate
for set theory.1
Nonetheless, Constr is important because, as we shall see, it implies
both AC and GCH, from which it follows that these latter are both consistent
relative to ZF.
So, just for the present, we adjoin Constr to the postulates of ZF. The
resulting theory will be called ZFL. By 9.2, ZFL is consistent if ZF is,
so clearly any theorem of ZFL will be consistent relative to ZF. We now
address ourselves to the task of proving AC and GCH in ZFL. We assume
that the reader is familiar with the elementary theory of well-ordered sets;
in particular, with the concepts of initial segment of a well-ordered set and
lexicographic well-orderings. (A convenient reference here is Kuratowski-
Mostowski [1968].)

9.3. Theorem. ZFL $\vdash$ AC.

Proof. The idea of the proof is to show that the constructible universe
can be well-ordered. The proof will be split into 3 parts.
We first claim that, given a well-ordering $\prec$ of a set $u$, we can construct
a well-ordering $\prec^*$ of $\mathrm{ec}(u)$. To see this, for each $x \in \mathrm{ec}(u)$ let $n(x)$ be the
least $n \in \omega$ such that $x$ assumes a constant value at all $m \ge n$. Define the
relation $\prec^*$ on $\mathrm{ec}(u)$ by

\[ x \prec^* y \Leftrightarrow n(x) < n(y) \vee [n(x) = n(y) \wedge x‘m \prec y‘m \text{ at the least } m \in \omega \text{ for which } x‘m \neq y‘m], \]

for $x,y \in \mathrm{ec}(u)$ and $x \neq y$. It is easy to verify that $\prec^*$ is a well-ordering
of $\mathrm{ec}(u)$.
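The comparison just defined is easy to experiment with. In the sketch below an eventually constant sequence is represented by a finite tuple whose last entry is understood to repeat forever; this representation and the names `stem`, `n_of`, `value` and `precedes` are assumptions made for illustration only. The argument `less` stands for the given strict well-ordering of $u$.

```python
def stem(seq):
    """Drop redundant trailing repeats, leaving the shortest tuple for the same sequence."""
    seq = list(seq)
    while len(seq) >= 2 and seq[-1] == seq[-2]:
        seq.pop()
    return tuple(seq)

def n_of(seq):
    """Least n such that the sequence is constant at all places m >= n."""
    return len(stem(seq)) - 1

def value(seq, m):
    """The m-th value of the infinite sequence represented by seq."""
    s = stem(seq)
    return s[m] if m < len(s) else s[-1]

def precedes(x, y, less):
    """x ≺* y: compare n(x) with n(y), then the first place where x and y differ."""
    if stem(x) == stem(y):
        return False            # the same sequence is not below itself
    nx, ny = n_of(x), n_of(y)
    if nx != ny:
        return nx < ny
    m = 0
    while value(x, m) == value(y, m):
        m += 1                  # stops: equal-length stems that differ must differ somewhere
    return less(value(x, m), value(y, m))

def int_less(a, b):
    return a < b
```

For instance, with the usual order on natural numbers, the constant sequence `(0,)` precedes `(5, 1)` because it becomes constant earlier, while `(1, 7)` precedes `(2, 0)` by comparison at place 0.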
Our second claim is that the set $F$ of $\ulcorner$formulas$\urcorner$ can be well-ordered.
First the set $F_0 = 2 \times \omega \times \omega$ can be lexicographically well-ordered, using
the natural well-ordering of $\omega$. We now argue inductively. If $F_n$ has been
well-ordered by a relation $\prec_n$, one obtains a well-ordering $\prec_{n+1}$ of

\[ F_{n+1} = F_n \cup [\{2\} \times F_n \times F_n] \cup [\{3\} \times F_n] \cup [\{4\} \times \omega \times F_n] \]

by well-ordering the last three summands lexicographically (using the
given well-ordering $\prec_n$ of $F_n$), and then putting all the elements of $F_n$ before
those of $[\{2\} \times F_n \times F_n] - F_n$, the elements of this latter set before those of
$[\{3\} \times F_n] - F_n$, and the elements of this latter before those of
$[\{4\} \times \omega \times F_n] - F_n$. In this way one obtains for each $n \in \omega$ a well-ordering $\prec_n$ of $F_n$,
such that, for $m < n$, $\prec_m$ is the restriction of $\prec_n$ to $F_m$, and $F_m$ is an initial
segment of $F_n$ with respect to $\prec_n$. One can then define a well-ordering
$\prec$ on $F$ by setting, for $u,v \in F$, $u \neq v$,

\[ u \prec v \Leftrightarrow_{df} [\nu(u) < \nu(v)] \vee [\nu(u) = \nu(v) \wedge u \prec_{\nu(u)} v], \]

where, for each $x \in F$, $\nu(x) = \mu n[x \in F_n]$.


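Finally, we prove the theorem. We construct by ordinal recursion
a term $t(x)$ with the following properties:
(i) $\vdash t(\alpha)$ is a well-ordering of $L_\alpha$;
(ii) $\vdash$ if $\beta < \alpha$ then $L_\beta$ is an initial segment of $L_\alpha$ with respect to $t(\alpha)$;
(iii) $\vdash$ if $\beta < \alpha$ then $t(\beta)$ is the restriction of $t(\alpha)$ to $L_\beta$.
We proceed somewhat informally, but the construction of $t$ can easily
(but tediously) be recast in the form specified in 2.11.
First, we put

\[ t(0) = t(1) = 0. \]

If $t(\beta)$ has been defined to meet conditions (i)-(iii) for all $\beta < \lambda$, with
$\lambda$ a limit ordinal, then

\[ L_\lambda = \bigcup\{L_\beta : \beta < \lambda\} \]

and we put

\[ t(\lambda) = \bigcup\{t(\beta) : \beta < \lambda\}. \]

Suppose now that $\alpha = \beta+1$ with $\beta > 0$ and $t(\beta)$ has been defined. Then
$L_\alpha = L_\beta \cup \mathrm{D}L_\beta$. Now $t(\beta)$ is a well-ordering of $L_\beta$ and so by the first part
of the proof we can well-order $\mathrm{ec}(L_\beta)$. Also, by the second part of the proof,
we can well-order $F$. Let $\prec$ be the lexicographic well-ordering of $\mathrm{ec}(L_\beta) \times F$
obtained from these two well-orderings.

For each $y \in \mathrm{D}L_\beta$ there is $v \in F$ and $x \in \mathrm{ec}(L_\beta)$ such that

\[ (1)\quad y = \{z \in L_\beta : \mathrm{Sat}(x(0/z), v, L_\beta)\}; \]

let $g(y)$ be the $\prec$-least member $\langle x,v \rangle$ of $\mathrm{ec}(L_\beta) \times F$ such that $x$ and $v$ satisfy (1).
We now define $t(\alpha)$ by

\[ \langle y,z \rangle \in t(\alpha) \Leftrightarrow [[y \in L_\beta \wedge z \in L_\beta \wedge \langle y,z \rangle \in t(\beta)] \]
\[ \vee\ [y \in L_\beta \wedge z \in L_\alpha - L_\beta] \]
\[ \vee\ [y \in L_\alpha - L_\beta \wedge z \in L_\alpha - L_\beta \wedge g(y) \prec g(z)]]. \]

It is easy to verify that $t(\alpha)$ satisfies the required conditions.

We now put $\varphi(x,y)$ for

\[ \exists\alpha[\langle x,y \rangle \in t(\alpha)]. \]

Assuming Constr, for each set $u$ there is an ordinal $\alpha$ such that $u \subseteq L_\alpha$,
and therefore

\[ \{\langle x,y \rangle : x \in u \wedge y \in u \wedge \varphi(x,y)\} \]

is a well-ordering of $u$. If for each $u \neq 0$ we put $s(u)$ for the least member
of $u$ under this well-ordering, then

\[ \{\langle x, s(x) \rangle : x \neq 0 \wedge x \in u\} \]

is a choice function for $u$. AC follows. ∎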

We now turn to the problem of deriving GCH in ZFL. We assume that


the reader is familiar with the basic facts about cardinal arithmetic — in
particular, the properties of cardinal sums and products — and how to
translate these facts into our present formal framework.

9.4. Lemma.
(i) $\vdash F \approx \omega$.
(ii) $\vdash_{AC} \aleph_0 \le |u| \to |\mathrm{ec}(u)| = |u|$.

Proof. (i) is easy and its proof is left as an exercise to the reader (see the
second part of the proof of 9.3).
(ii) Assume $\aleph_0 \le |u|$. For each $x \in u$ let $S_x$ be the set of members of
$\mathrm{ec}(u)$ which eventually assume the constant value $x$. Then each member
of $S_x$ is uniquely determined by the finite sequence of values it assumes
before it assumes the constant value $x$. It follows that, for each $x \in u$,

\[ |S_x| \le |\bigcup\{u^n : n \in \omega\}| \le \sum\{|u|^n : n \in \omega\} = |u| \cdot \aleph_0 = |u|. \]

Therefore

\[ |\mathrm{ec}(u)| = |\bigcup\{S_x : x \in u\}| \le |u|^2 = |u|. \]

Since we clearly have $|u| \le |\mathrm{ec}(u)|$, (ii) follows. ∎

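9.5. Lemma.
(i) $\vdash \mathrm{Fin}(u) \to \mathrm{Fin}(\mathrm{D}u)$.
(ii) $\vdash_{AC} \aleph_0 \le |u| \to |\mathrm{D}u| = |u|$.
(iii) $\vdash \alpha < \omega \to \mathrm{Fin}(L_\alpha)$.
(iv) $\vdash_{AC} \omega \le \alpha \to |L_\alpha| = |\alpha|$,
where $\mathrm{Fin}(u)$ is $\exists n(u \approx n)$.
Proof. (i) is an immediate consequence of $\vdash \mathrm{D}u \subseteq \mathrm{P}u$, and (iii) follows easily
from (i).
To prove (ii), we first observe that each member of $\mathrm{D}u$ is determined by
a member of $F$ and a member of $\mathrm{ec}(u)$. Hence, using 9.4, if $\aleph_0 \le |u|$ then

\[ |\mathrm{D}u| \le |F \times \mathrm{ec}(u)| = \aleph_0 \cdot |u| = |u|. \]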

Also, it is easy to verify that

Dm,

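so that $|u| \le |\mathrm{D}u|$. (ii) follows.

Finally, (iv) is proved by transfinite induction, using (i)-(iii). It follows
from 8.4 that $\alpha \subseteq L_\alpha$, so that $|\alpha| \le |L_\alpha|$. Moreover, if $\omega \le \alpha$, then, using
the inductive hypothesis and (i)-(iii), we have

\[ |L_\alpha| = |\bigcup\{\mathrm{D}L_\beta : \beta < \alpha\}| \le \sum\{|\mathrm{D}L_\beta| : \beta < \alpha\} \le |\alpha|^2 = |\alpha|. \]

This proves (iv). ∎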

We are now in a position to prove

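9.6. Theorem. ZFL $\vdash$ GCH.

Proof. Since we know that ZFL $\vdash$ AC we shall use AC freely. Recall that
GCH is the assertion $\forall\alpha[|\mathrm{P}\aleph_\alpha| = \aleph_{\alpha+1}]$. Since $\vdash_{AC} \aleph_{\alpha+1} \le |\mathrm{P}\aleph_\alpha|$, in order to
prove the theorem it suffices to derive $|\mathrm{P}\aleph_\alpha| \le \aleph_{\alpha+1}$ in ZFL; and in view
of 9.5(iv), to achieve this it is enough to show that

\[ (*)\quad \mathrm{ZFL} \vdash \mathrm{P}L_{\aleph_\alpha} \subseteq L_{\aleph_{\alpha+1}}. \]

Assume Constr, and let $z \subseteq L_{\aleph_\alpha}$. We know from 8.5 that the formula
$x \in L_\beta$ is absolute; let $\sigma_1,\dots,\sigma_n$ be an absoluteness sequence for it. Then,
by $\mathrm{RP}_1$, there is a transitive set $u$ such that $L_{\aleph_\alpha} \cup \{z\} \subseteq u$ and

\[ (1)\quad \mathrm{Constr}^{(u)} \wedge \sigma_1^{(u)} \wedge \dots \wedge \sigma_n^{(u)}. \]

Now, by 9.5(iv), we have

\[ |L_{\aleph_\alpha} \cup \{z\}| = \aleph_\alpha, \]

and so by 6.6 there is an elementary substructure $v$ of $u$ such that $|v| = \aleph_\alpha$
and $L_{\aleph_\alpha} \cup \{z\} \subseteq v$. Because of (1) and the fact that $\mathrm{ES}(v,u)$, we have

\[ (2)\quad \mathrm{Constr}^{(v)} \wedge \sigma_1^{(v)} \wedge \dots \wedge \sigma_n^{(v)}. \]

Also, since $u$ is transitive, it is extensional, so that $\mathrm{Ext}^{(u)}$, whence $\mathrm{Ext}^{(v)}$,
and therefore $v$ is extensional. Accordingly, by 3.11, there is an $\in$-isomorphism
$f$ of $v$ onto a transitive set $y$. Since $|v| = \aleph_\alpha$, we have $|y| = \aleph_\alpha$.
Moreover, 6.7 and (2) give

\[ (3)\quad \mathrm{Constr}^{(y)} \wedge \sigma_1^{(y)} \wedge \dots \wedge \sigma_n^{(y)}. \]

Also, since $L_{\aleph_\alpha}$ is a transitive subset of $v$, 3.14 implies that $f‘x = x$ for
every $x \in L_{\aleph_\alpha}$. Hence, by 3.12 we have for our $z \subseteq L_{\aleph_\alpha}$:

\[ f‘z = \{f‘x : x \in z\} = \{x : x \in z\} = z, \]

so that $z \in y$.
By (3) we have $\mathrm{Constr}^{(y)}$, i.e.

\[ [\forall x\,\exists\beta[x \in L_\beta]]^{(y)}. \]

Since $y$ is transitive, this gives, using 3.9(iii),

\[ \forall x \in y\,\exists\beta \in y[x \in L_\beta]^{(y)}. \]

Now $x \in L_\beta$ is absolute, and from (3) we see that the members of its absoluteness
sequence hold when relativized to the transitive set $y$, so we get

\[ \forall x \in y\,\exists\beta \in y[x \in L_\beta]. \]

Since $z \in y$, we obtain in particular $\exists\beta \in y[z \in L_\beta]$. Let $\beta$ be an ordinal in
$y$ such that $z \in L_\beta$. Then, since $y$ is transitive, we have $\beta \subseteq y$, whence
$|\beta| \le |y| = \aleph_\alpha$, and so $\beta < \aleph_{\alpha+1}$. Hence $z \in L_\beta \subseteq L_{\aleph_{\alpha+1}}$, and (*) follows. ∎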
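The collapsing step (3.11) used in this proof can be watched in miniature on finite sets. The sketch below assumes a finite domain with a well-founded, extensional relation given as a set of pairs; the name `mostowski_collapse` and the example are illustrative, not taken from the text. It computes $f(x) = \{f(y) : y \mathrel{E} x\}$ recursively.

```python
def mostowski_collapse(domain, E):
    """Collapse (domain, E); a pair (y, x) in E is read "y E x".

    E is assumed well-founded (so the recursion terminates) and extensional
    on the domain (so the resulting map is one-to-one).
    """
    cache = {}

    def f(x):
        if x not in cache:
            cache[x] = frozenset(f(y) for y in domain if (y, x) in E)
        return cache[x]

    return {x: f(x) for x in domain}

zero = frozenset()
one = frozenset([zero])
two = frozenset([zero, one])

# v = {0, 2} is extensional under ∈ but not transitive: 1 ∈ 2 while 1 ∉ v.
v = frozenset([zero, two])
E = {(y, x) for x in v for y in v if y in x}

collapse = mostowski_collapse(v, E)
image = frozenset(collapse.values())
```

Here `zero`, which belongs to a transitive part of `v`, is left fixed, while `two` is sent to `one`; the image `{0, 1}` is transitive. This mirrors the use of 3.14 in the proof, where the collapsing map is the identity on the transitive subset $L_{\aleph_\alpha}$ of $v$.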

Notice the similarity between this proof and that of Thm. 7.9.
We conclude from 9.2, 9.3 and 9.6 that

if ZF is consistent, so is ZF + AC + GCH.

§10. Problems

(Throughout this section we revert to our original convention and write
$\Sigma \vdash \varphi$ for “the formula $\varphi$ is deducible from the set of sentences $\Sigma$ in the
first-order predicate calculus.”)

10.1. Let ZF⁻ be ZF with Inf omitted. Show that

\[ \mathrm{ZF} \vdash \sigma^{(R_\omega)} \]

for every postulate $\sigma$ of ZF⁻ and ZF $\vdash \neg\mathrm{Inf}^{(R_\omega)} \wedge \mathrm{AC}^{(R_\omega)}$. Deduce that,
if ZF⁻ is consistent, then Inf is not a theorem of ZF⁻.

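*10.2. Let Z (resp. ZC) be obtained from ZF (resp. ZFC) by omitting
Rep and substituting instead Sep and the postulate (“pairing axiom”)

\[ \forall u\,\forall v\,\exists x\,\forall y[y \in x \leftrightarrow y = u \vee y = v]. \]

(Z is called Zermelo set theory.)

(i) Show that

\[ \mathrm{ZF} \vdash \mathrm{Lim}(\alpha) \wedge \omega < \alpha \to \sigma^{(R_\alpha)} \quad (\text{resp. } \mathrm{ZFC} \vdash \mathrm{Lim}(\alpha) \wedge \omega < \alpha \to \sigma^{(R_\alpha)}) \]

for every postulate $\sigma$ of Z (resp. ZC).

(ii) Let $\tau$ be the sentence $\forall x\,\exists\alpha[x \approx \alpha]$. Show that

\[ \mathrm{ZFC} \vdash \mathrm{AC}^{(R_{\omega+\omega})} \wedge \neg\tau^{(R_{\omega+\omega})}. \]

Using (i), deduce that, if ZF is consistent, $\tau$ is not a theorem of ZC.

(iii) Show that, if ZF is consistent, then for each theorem $\sigma$ of ZF
(resp. ZFC), there is a theorem $\tau$ of ZF (resp. ZFC) such that Z + $\sigma \nvdash \tau$
(resp. ZC + $\sigma \nvdash \tau$). (If ZF $\vdash \sigma$, put $\tau$ for

\[ \exists\alpha[\mathrm{Lim}(\alpha) \wedge \omega < \alpha \wedge \sigma^{(R_\alpha)}]. \]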

CH. 10, §10], PROBLEMS 523

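Use $\mathrm{RP}_2$ to show that ZF $\vdash \tau$. If

\[ \beta = \mu\alpha[\mathrm{Lim}(\alpha) \wedge \omega < \alpha \wedge \sigma^{(R_\alpha)}], \]

show that ZF $\vdash \neg\tau^{(R_\beta)}$, and use (i).)

(iv) Deduce from (iii) that, if ZF is consistent, neither ZF nor ZFC
is finitely axiomatizable.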

*10.3. A sentence of the form $\varphi^{(R_\omega)}$ is called an arithmetical sentence.

Show that, if $\sigma$ is an arithmetical sentence and ZFL $\vdash \sigma$, then ZF $\vdash \sigma$, and
a fortiori, if ZF + AC + GCH $\vdash \sigma$, then ZF $\vdash \sigma$. (Show that ZF $\vdash$
and hence that $\sigma$ is absolute. Let $\varphi_1,\dots,\varphi_n$ be a proof of $\sigma$ from ZFL.
Show that ZF $\vdash \varphi_k^{(L)}$ for $k = 1,\dots,n$ and then use the absoluteness of $\sigma$ to
obtain the result.)

10.4. Let $\mathcal{L}^+$ be a language obtained by adding new individual constants
to $\mathcal{L}$, and let $\mathcal{L}^+(a)$ be the language obtained by adding a new individual
constant $a$ to $\mathcal{L}^+$.
(i) Let $t_1,\dots,t_n$ be terms of $\mathcal{L}^+$ without free variables, let ZF⁺ be a theory
in $\mathcal{L}^+$ whose postulates include those of ZF, and let $\Sigma$ be the theory in
$\mathcal{L}^+(a)$ whose postulates are:
(1) all postulates of ZF⁺;
(2) $\mathrm{Trans}(a) \wedge t_1 \in a \wedge \dots \wedge t_n \in a$;
(3) $\mathrm{Refl}_\varphi(a)$, for each formula $\varphi$ of $\mathcal{L}$.
Show that $\Sigma$ is a conservative extension of ZF⁺, i.e., for each formula
$\varphi$ of $\mathcal{L}^+$, $\Sigma \vdash \varphi$ iff ZF⁺ $\vdash \varphi$. (Use $\mathrm{RP}_1$.)
(ii) Let ZFC⁺ be a theory in $\mathcal{L}^+$ whose postulates include those of
ZFC, let $t$ be a term of $\mathcal{L}^+$ without free variables such that

\[ \mathrm{ZFC}^+ \vdash \mathrm{Trans}(t) \wedge \aleph_0 \le |t|, \]

and let $\Sigma$ be the theory in $\mathcal{L}^+(a)$ whose postulates are:

(1) all postulates of ZFC⁺;
(2) $\mathrm{Trans}(a) \wedge t \subseteq a \wedge |t| = |a|$;
(3) $\mathrm{Refl}_\varphi(a)$, for each formula $\varphi$ of $\mathcal{L}$.
Show that $\Sigma$ is a conservative extension of ZFC⁺. (Use $\mathrm{RP}_1$, Thm. 6.6
and the Mostowski collapsing lemma.)

*10.5. (i) Show how to construct an absolute $\mathcal{L}$-formula $\mathrm{Post}(v)$ which
expresses the statement “$v$ is a $\ulcorner$postulate of ZF$\urcorner$”.
(ii) Let $\mathrm{Mod}(u)$ be the formula

\[ \forall v[\mathrm{Post}(v) \to \forall x \in \mathrm{ec}(u)\,\mathrm{Sat}(x,v,u)] \wedge u \neq 0. \]

($\mathrm{Mod}(u)$ then expresses “$\langle u, \in|u \rangle \models$ ZF”.) Show that $\mathrm{Mod}(u)$ is absolute.

(iii) Show that, if ZF is consistent, then¹

\[ \mathrm{ZF} \nvdash \exists u[\mathrm{Trans}(u) \wedge \mathrm{Mod}(u)]. \]

(Suppose ZF $\vdash \exists u[\mathrm{Trans}(u) \wedge \mathrm{Mod}(u)]$; let $u$ be a set of least rank such that
$\mathrm{Trans}(u) \wedge \mathrm{Mod}(u)$. Now observe that $\exists x[\mathrm{Trans}(x) \wedge \mathrm{Mod}(x)]$ holds in
$\langle u, \in|u \rangle$, and use (ii) to conclude that ZF would be inconsistent.)

*10.6. Let $\Sigma$ be a set of $\mathcal{L}$-sentences. A set $u$ is called a model of $\Sigma$ if
$\langle u, \in|u \rangle \models \Sigma$. The index of $u$, $\mathrm{ind}(u)$, is the least ordinal not in $u$.
(i) Show that, if $u$ is a transitive model of ZF, then $\mathrm{ind}(u)$ is a limit
ordinal and $\mathrm{ind}(u) = \{\alpha : \alpha \in u\}$.
(ii) Let $u$ be a transitive model of ZF and let $\alpha = \mathrm{ind}(u)$. Show that

\[ L_\alpha = \{x \in u : L^{(u)}(x)\}. \]

(Use absoluteness of the formula $x \in L_\beta$.) Deduce that $L_\alpha$ is a model of ZFL.

(iii) Show that each transitive model of ZFL is of the form $L_\alpha$. (Use (ii).)

*10.7. Assume that there exists a transitive model of ZF.
(i) Show that there is an ordinal ξ for which L_ξ is a model of ZF.
(ii) Let ξ_0 be the least ordinal ξ for which L_ξ is a model of ZF. L_{ξ_0} is
called the minimal model of ZF. Show that, if u is any transitive model
of ZF, then L_{ξ_0} ⊆ u. (Show that ξ_0 ≤ ind(u) and use 10.6(ii).)
(iii) Show that ξ_0 is countable. (Use the Downward Löwenheim-Skolem
theorem and the Mostowski Collapsing Lemma.)
(iv) Let φ(x) be a transitive ℒ-formula. Show that, if ZF ⊢ σ^(φ) for all
postulates σ of ZF, then it is never the case that² ZF ⊢ ¬Constr^(φ). (Assuming
the hypothesis, consider the transitive collapse of {x∈L_{ξ_0}: φ^(L_{ξ_0})(x)}.)

*10.8 (i). Show how to construct an absolute term FV(x) such that, for
each ⌜formula⌝ v, FV(v) is the set of subscripts of variables free in v. In
particular, if φ is a formula whose free variables are exactly v_{n_1},...,v_{n_k},
show that
ZF ⊢ FV(⌜φ⌝) = {n_1,...,n_k}.

1 Notice that Gödel’s Second Incompleteness Theorem (7.11.9) gives the stronger result
that, if ZF is consistent, then ZF ⊬ ∃u Mod(u).
2 This result shows that the method we employed (that of constructing an “inner model”)
to prove the relative consistency of Constr (and hence of AC and GCH) cannot be used to
prove the relative consistency of ¬Constr (nor, a fortiori, of ¬AC or ¬GCH).

(ii) Put
Df(u) =df {z∈u: ∃v∈F ∃x∈ec(u)[FV(v) = {0}
∧ ∀y∈u[Sat(x(0/y),v,u) ↔ y=z]]}.

Df(u) is the set of all definable elements of u (i.e. definable in the structure
⟨u, ∈|u⟩, cf. Prob. 7.9.17). Let φ be a formula whose free variables are
among v_0,...,v_n. Show that

ZF ⊢ x_1∈Df(u) ∧ ... ∧ x_n∈Df(u) ∧ x∈u
∧ ∀y∈u[φ^(u)(y,x_1,...,x_n) ↔ y=x] → x∈Df(u).

(iii) Prove the extended reflection principle of Myhill and Scott: for any
formulas φ_1,...,φ_n,

ZF ⊢ ∀α_1...∀α_m ∃β[α_1∈Df(R_β) ∧ ... ∧ α_m∈Df(R_β) ∧ Refl_{φ_1,...,φ_n}(R_β)].

(Put φ(α_1,...,α_m) for the formula

∃β[α_1∈Df(R_β) ∧ ... ∧ α_m∈Df(R_β) ∧ Refl_{φ_1,...,φ_n}(R_β)],

and suppose if possible that ∃α_1...∃α_m ¬φ. Define the ordinals γ_1,...,γ_m
inductively by

γ_i = μα_i[∃α_{i+1}...∃α_m ¬φ(γ_1,...,γ_{i−1},α_i,...,α_m)]

for i=1,...,m. Then ¬φ(γ_1,...,γ_m). Derive a contradiction by using RP_2
and (ii) to prove the existence of an ordinal β such that

γ_1∈Df(R_β) ∧ ... ∧ γ_m∈Df(R_β) ∧ Refl_{φ_1,...,φ_n}(R_β).)

(iv) Let L_{ξ_0} be the minimal model of ZF (10.7). Show that Df(L_{ξ_0}) = L_{ξ_0},
i.e. L_{ξ_0} is pointwise definable (Prob. 7.9.17). (Since L_{ξ_0} is a model of ZFL,
it has a definable well-ordering. Using this fact, show that Df(L_{ξ_0}) is an
elementary substructure of L_{ξ_0}. If f is the collapsing isomorphism of
Df(L_{ξ_0}) onto a transitive set M, show that M = L_{ξ_0} and f is the identity.)

*10.9. Define the term L(α,u) recursively on α by putting

L(0,u) = u,

L(α+1,u) = DL(α,u),

L(λ,u) = ⋃{L(β,u): β<λ} for limit λ.

Write L_α(u) for L(α,u), and let Constr(x,u) be the formula

∃α(x∈L_α(u)).

A set x satisfying Constr(x,u) is said to be constructible from u. Notice
that ZF ⊢ L_α(0) = L_α.
(i) Show that

ZF ⊢ Trans(u) → Trans(L_α(u)) ∧ α∈L_{α+1}(u).

(ii) Show that the formula y = L_α(u) is absolute. (Like the proof of the
same assertion for y = L_α.)
(iii) Show that

ZF ⊢ Trans(u) → σ^(Constr(x,u))

for each postulate σ of ZF, where the relativization is taken with x as
chosen variable. (Like the proof of ZF ⊢ σ^(L).)
(iv) Show that

ZF ⊢ Trans(u) ∧ ∃α[u ≈ α]^(Constr(x,u)) → AC^(Constr(x,u)).

(Like the proof of ZFL ⊢ AC.)
(v) Let Co(u) be the formula ∀x Constr(x,u). Show that

ZF ⊢ Trans(u) → Co(u)^(Constr(x,u)).

(Like the proof of ZF ⊢ Constr^(L).)
(vi) Show that

ZFC ⊢ Trans(u) ∧ Co(u) → ∀α[|u| ≤ ℵ_α → |Pℵ_α| = ℵ_{α+1}].

(Like the proof of ZFL ⊢ GCH.)
(vii) Show that, if

ZF + ∃x[¬L(x) ∧ x ⊆ ω]

is consistent, so is

ZF + ∃x[¬L(x) ∧ x ⊆ ω] ∧ AC ∧ GCH.

(Take x ⊆ ω such that ¬L(x); let u = ω∪{x} and show that

ZF + ∃x[¬L(x) ∧ x ⊆ ω] ∧ AC ∧ GCH

hold relativized to Constr(x,u).)



(viii) Show that, if u is a transitive set, then the class of all sets x satisfying
Constr(x,u) is the least transitive definable class which includes the class
of all ordinals, contains u and is a model of ZF.

10.10. Put OD(x) for the formula ∃α[x∈Df(R_α)]. A set x satisfying OD(x)
is called an ordinal-definable set.
(i) Let φ be a formula all of whose free variables are among v_0,...,v_n.
Show that

ZF ⊢ ∃α_1...∃α_n ∀y[φ(y,α_1,...,α_n) ↔ x=y] → OD(x).

(Use 10.8(iii) and (ii).)
(ii) Show how to construct a term t(x) for which

⊢ Inj(t|ω) ∧ ran(t|ω) = F.

(iii) Put Def(v,z,u) for the formula

z∈u ∧ v∈F ∧ FV(v) = {0} ∧ ∃x∈ec(u) ∀y∈u[y=z ↔ Sat(x(0/y),v,u)].

(Def(v,z,u) expresses “v defines z in the set u”.) Also put

s_1(x) =df μα[x∈Df(R_α)], s_2(x) =df μn Def(t(n),x,R_{s_1(x)}),

where t is the term introduced in (ii). Finally, put x≺y for the formula

OD(x) ∧ OD(y) ∧ [s_1(x)<s_1(y) ∨ [s_1(x)=s_1(y) ∧ s_2(x)<s_2(y)]].

Show that ≺ is a well-ordering of the class of all ordinal-definable sets.

10.11. Let Σ be a set of ℒ-sentences which includes all the postulates of ZF.
(Under these conditions we say that Σ is an extension of ZF.) Σ is said
to have the selection property if for each formula φ containing exactly the
variable x free there is a formula ψ containing exactly x free such that

Σ ⊢ ∃!xψ ∧ [∃xφ → ∃x(φ ∧ ψ)].

In other words, Σ has the selection property if each non-empty definable
(without parameters) class can be proved in Σ to have a definable element.
(i) Show that ZF + ∀xOD(x) is the weakest extension of ZF with the
selection property. (Use 10.10 to show that ZF + ∀xOD(x) has the selection
property. If Σ is any extension of ZF with the selection property, consider
the formula ¬OD(x).)
(ii) Let 𝔄 be a model of ZF and let 𝔅 = 𝔄|Df(𝔄). Show that the following
conditions are equivalent:
(a) 𝔄 ⊨ ∀xOD(x);
(b) Th(𝔄) has the selection property;
(c) 𝔅 ≺ 𝔄.
(Use (i).)

*10.12. Put HOD(x) for the formula ∀y∈TC({x}) OD(y). HOD(x) is
read “x is hereditarily ordinal definable”.
(i) Show that

ZF ⊢ HOD(x) ↔ OD(x) ∧ ∀y∈x HOD(y),

ZF ⊢ ∀α HOD(α).

(ii) Show that

ZF ⊢ σ^(HOD)

for any postulate σ of ZFC. Deduce that, if ZF is consistent, so is ZFC.
(Show that, for each postulate of ZF, the set whose existence is asserted
by the postulate has the property HOD, using (i) and 10.10(i).)
(iii) Show that

ZF ⊢ ∀x[L(x) → HOD(x)].

(Use (i), (ii) and Thm. 8.8(iii).)

*10.13. A cardinal κ is said to be (strongly) inaccessible (written In(κ)) if
(a) κ>ω;
(b) for any cardinal α<κ, |Pα|<κ;
(c) if x ⊆ κ and |x|<κ, then |⋃x|<κ.
A set x is said to be accessible (written Acc(x)) if ρ(x) is less than every
inaccessible cardinal. Let τ be the sentence ∃x In(x).
(i) Show that ZFC ⊢ σ^(Acc) for every postulate σ of ZFC, and that
ZFC ⊢ ¬τ^(Acc). Deduce that, if ZFC is consistent, then the existence of
an inaccessible cardinal cannot be proved in ZFC.
(ii) Show that ZFC ⊢ In(x) → In^(L)(x), and hence that ZFC ⊢ τ → τ^(L).
Deduce that, if ZFC + τ is consistent, so is ZFC + τ + GCH.
(iii) Show that, if κ is inaccessible, then R_κ is a model of ZFC.

*10.14. (i) Let κ be an inaccessible cardinal, and let α<κ. Show that there
is an ordinal β such that α<β<κ and ⟨R_β, ∈|R_β⟩ ≺ ⟨R_κ, ∈|R_κ⟩. (Define
a sequence α_0,α_1,... <κ by

α_0 = α+1,

α_{n+1} = μγ[for all formulas φ whose free variables are all among
v_0,...,v_m and all ⟨a_0,...,a_{m−1}⟩∈(R_{α_n})^m, if
⟨R_κ, ∈|R_κ⟩ ⊨ ∃v_m φ[a_0,...,a_{m−1}], then for some x∈R_γ,
⟨R_κ, ∈|R_κ⟩ ⊨ φ[a_0,...,a_{m−1},x]].

Put β = ⋃{α_n: n∈ω} and argue as in the proof of Thm. 5.2.1.)

(ii) Deduce from (i) that, if κ is an inaccessible cardinal, there is x ⊆ κ
such that |x| = κ, ⋃{R_α: α∈x} = R_κ and ⟨R_α, ∈|R_α⟩ ≺ ⟨R_β, ∈|R_β⟩ whenever
α,β∈x and α<β.

*10.15. Suppose that α<β and ⟨R_α, ∈|R_α⟩ ≺ ⟨R_β, ∈|R_β⟩. Show that R_α and R_β
are both models of ZF. (First show that Lim(α) ∧ ω<α, and use 10.2(i)
to show that R_α is a model of Z. To see that Rep holds in R_α, suppose
that φ(x,y,a_1,...,a_n) defines a function, say f, in R_α, using parameters
a_1,...,a_n∈R_α. Then φ must define a function, say g, in R_β which extends f.
Show that, for any a∈R_α, ran(g|a)∈R_β and deduce that ran(f|a)∈R_α.)

§11. Historical and bibliographical remarks

The notion of (infinite) set was first systematically developed by Cantor in
the 1880’s, although it had been discussed both by Bolzano in the 1840’s
and Dedekind in the 1870’s. The postulates of Z — except for the Axiom
of Regularity — were formulated by Zermelo in 1908. The Axiom of
Regularity was introduced by Mirimanoff in 1917. The Axiom Scheme
of Replacement was roughly indicated by Fraenkel in 1922, but the
first-order form of both it and the Axiom Scheme of Separation is due to
Skolem [1922]. For this reason calling our axiomatic system Zermelo-
Fraenkel set theory does an injustice to Skolem, and indeed some authors
write “ZFS” for our “ZF”.
The Axiom of Choice was first identified by Zermelo in 1904 (although
not christened by him until 1908), and used to give a rigorous proof that
each set can be well-ordered.
Our definition of an ordinal in §2 is due to von Neumann (1923), although
he was apparently anticipated by Zermelo in an unpublished paper of 1915.
The cumulative hierarchy, the notion of rank, and their use in proving
the relative consistency of the Axiom of Regularity were established by
von Neumann in 1929. (Skolem [1922] had already sketched a different
proof of the relative consistency of the axiom.)
The Reflection Principles were first introduced by Lévy [1960] and
Montague [1961].
The formalization of the satisfaction relation given in §6 owes much
to unpublished lectures of Dana Scott delivered at the 1965 Leicester Logic
Colloquium.
The concept of absoluteness is due to Gödel [1940]. Skolem’s paradox
was formulated and analysed in Skolem [1922].
The notion of constructible set and the results of §§8 and 9 are due to
Gödel [1938] and [1939]. The memoir Gödel [1940] contains a different,
but essentially equivalent, definition of constructible set.
For further developments in set theory, the reader is advised to consult
Cohen [1966], Jech [1971], Drake [1974], and Bell [1977]. The book
Fraenkel, Bar-Hillel and Lévy [1973] contains an excellent account of
the foundations of the subject, and a comprehensive bibliography.
CHAPTER 11

NONSTANDARD ANALYSIS

Any structure studied in mathematics may be regarded as an ℒ-structure
for a suitably chosen first-order language ℒ. If 𝔄 is an ℒ-structure, then
the theory of 𝔄 is defined as the set Th(𝔄) of all ℒ-sentences σ such that
𝔄 ⊨ σ. In studying 𝔄, mathematicians explore the theory Th(𝔄), try to
discover new sentences belonging to it, find interconnections between
such sentences, etc.
If 𝔄 is infinite, then Th(𝔄) has nonstandard models, which are not
isomorphic to 𝔄. In particular, 𝔄 has elementary extensions which are
nonstandard models of Th(𝔄). (See, e.g., Thm. 5.2.6.) Since we are
interested primarily in 𝔄, we may be tempted to regard these nonstandard
models as pathological monsters. However, in 1960 A. Robinson invented
a general method for exploiting nonstandard models of Th(𝔄) to facilitate
the discovery and proof of facts about 𝔄 itself. This method he called
nonstandard analysis.
Generalizing the proof of Thm. 7.2.3, Robinson obtained, for any
ℒ-structure 𝔄, a special kind of elementary extension, called an enlargement
of 𝔄. An enlargement of 𝔄 has certain saturation properties which are
not expressible in ℒ but which can be used to prove assertions of the form
σ∈Th(𝔄). This is analogous to methods used in many branches of
mathematics: when we want to study a given structure (e.g., the real numbers,
or the Euclidean plane) we may find it convenient to embed it in some
larger structure (the complex numbers, or the projective plane) having
certain pleasant properties (being algebraically closed, or obeying the
principle of duality). The latter structure may aid us in gaining knowledge about
the former. The main difference between nonstandard analysis and those
other well-known methods is that the relationship between the given
structure and its enlargements is characterized in purely logical terms¹.

1 For a more detailed discussion of this, see §7 below.


In particular, using nonstandard analysis Robinson was able to provide
a solid rigorous foundation for the method of “infinitely small” and
“infinitely large” quantities in classical analysis. That method, which is
intuitively very suggestive, had been widely used during the early stages
in the development of the calculus. But, after the failure of repeated
attempts to give it consistent justification, it was abandoned in favour of
the less intuitive “ε-δ” method¹.
In this chapter we shall set up the general machinery of nonstandard
analysis and then apply it to several mathematical situations.
We shall assume a slender acquaintance with Chapters 4 and 5 and the
first four sections of Ch. 10.

§ 1. Enlargements

In this section ℒ is taken to be an arbitrary first-order language with
equality. The connectives ¬, ∧ and the existential quantifier ∃ are taken
to be primitive. The other connectives and ∀ are introduced by definition
in the usual way.
We denote ℒ-structures by upper-case German letters; the domain
(universe) of an ℒ-structure will be assumed to be a set, and will be denoted
by the italic counterpart of the German letter denoting the structure.
If 𝔄 is an ℒ-structure and R is a predicate symbol of ℒ, we take R^𝔄 to
be the corresponding basic relation of 𝔄. Similarly, f^𝔄 is the basic operation
of 𝔄 corresponding to the function symbol f of ℒ.
We revert to the convention of using bold type for all symbols of ℒ.
We let bold-face Roman letters from the end of the alphabet (usually
lower-case and often with subscripts) range over the variables of ℒ.
Throughout this chapter we adopt the convention that distinct letters of
this kind occurring in the same context refer to distinct variables of ℒ,
unless otherwise stated.
If φ is an ℒ-formula whose free variables are among x_1,...,x_n, and if
a_1,...,a_n are any individuals of the ℒ-structure 𝔄, we write

(1) 𝔄 ⊨ φ[x_1/a_1,...,x_n/a_n]

to assert that φ is satisfied in 𝔄 when x_1,...,x_n are assigned the values
a_1,...,a_n respectively. Note that if for each i=1,...,n we have in ℒ a constant

1 However, the older and officially discredited method survived in “lowbrow”
mathematics and, as a private heuristic aid, even in “highbrow” mathematics.

𝐚_i such that 𝐚_i^𝔄 = a_i, then (1) is equivalent to

𝔄 ⊨ φ(x_1/𝐚_1,...,x_n/𝐚_n).

We recall that 𝔄 is a substructure of 𝔅 if A ⊆ B, and, for all R and f
of ℒ, R^𝔄 and f^𝔄 are the restrictions to A of R^𝔅 and f^𝔅 respectively. In
particular, A must be closed under all the operations f^𝔅.
If C is an arbitrary set, we let ℒ_C be the language obtained from ℒ by
adding a distinct new constant 𝐜 for each c∈C. If 𝔄 is an ℒ-structure and
C ⊆ A, we let 𝔄_C be the ℒ_C-expansion of 𝔄 obtained by taking each c∈C
as the interpretation of the corresponding 𝐜. Of particular importance is
the case where C = A.

1.1. Lemma. Let 𝔄 and 𝔅 be ℒ-structures such that 𝔄 is a substructure
of 𝔅. Then in order for 𝔅_A to be an ℒ_A-elementary extension of 𝔄_A it is
enough that for each ℒ_A-sentence σ such that 𝔄_A ⊨ σ we have also 𝔅_A ⊨ σ.
Proof.¹ Let φ be an ℒ_A-formula, and let the free variables of φ be among
x_1,...,x_n. Let a_1,...,a_n∈A. If 𝔄_A ⊨ φ[x_1/a_1,...,x_n/a_n], then
𝔄_A ⊨ φ(x_1/𝐚_1,...,x_n/𝐚_n). Since φ(x_1/𝐚_1,...,x_n/𝐚_n) is an ℒ_A-sentence,
it follows that 𝔅_A ⊨ φ(x_1/𝐚_1,...,x_n/𝐚_n), hence
𝔅_A ⊨ φ[x_1/a_1,...,x_n/a_n]. On the other hand, if 𝔄_A ⊭ φ[x_1/a_1,...,x_n/a_n],
we consider ¬φ and show that 𝔅_A ⊭ φ[x_1/a_1,...,x_n/a_n]. ∎

1.2. Definition. Let 𝔄 be an ℒ-structure. An 𝔄-concurrent collection is
any collection Φ of ℒ_A-formulas all of which have exactly one free variable,
say x, such that Φ is finitely satisfiable in 𝔄_A. In other words, for each
finite Φ_0 ⊆ Φ there exists an a∈A such that 𝔄_A ⊨ φ[x/a] (or, equivalently,
𝔄_A ⊨ φ(x/𝐚)) for all φ∈Φ_0.

1.3. Definition. Let 𝔄 and 𝔅 be ℒ-structures. We say that 𝔅 is an
enlargement of 𝔄 (in symbols 𝔄 < 𝔅) iff 𝔄 is a substructure of 𝔅, and
whenever Φ is an 𝔄-concurrent collection then Φ is satisfiable in 𝔅_A.
(Thus, if x is the free variable of Φ, then there is some b∈B such that
𝔅_A ⊨ φ[x/b] for all φ∈Φ.)
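
As an illustration of Defs. 1.2 and 1.3, the following LaTeX sketch works through one concurrent collection. The structure ⟨ℕ, <⟩ and the collection Φ are assumptions chosen for this example and are not taken from the text; the fragment assumes the amssymb package for \mathfrak and \mathbb.

```latex
% Illustration only: the structure (N, <) and the collection \Phi are
% assumptions made for this example.
\[
  \mathfrak{A} = \langle \mathbb{N}, < \rangle, \qquad
  \Phi = \{\, \mathbf{n} < \mathbf{x} : n \in \mathbb{N} \,\}.
\]
% Finite satisfiability: a non-empty finite \Phi_0 mentions only n_1, ..., n_k.
\[
  \Phi_0 = \{\mathbf{n}_1 < \mathbf{x}, \dots, \mathbf{n}_k < \mathbf{x}\}
  \;\Longrightarrow\;
  \mathfrak{A}_A \models \varphi[\mathbf{x}/a]
  \ \text{for all } \varphi \in \Phi_0,
  \quad\text{where } a = \max(n_1,\dots,n_k) + 1 .
\]
% So \Phi is \mathfrak{A}-concurrent. If \mathfrak{A} < \mathfrak{B}, then by
% Def. 1.3 a single b in B satisfies every member of \Phi:
\[
  \exists b \in B \ \ \forall n \in \mathbb{N} : \quad
  \mathfrak{B}_A \models (\mathbf{n} < \mathbf{x})[\mathbf{x}/b].
\]
```

Such a b lies above every natural number in the ordering of 𝔅, so it cannot belong to A; under these assumptions any enlargement of ⟨ℕ, <⟩ is a proper extension.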

1.4. Theorem. If 𝔄 < 𝔅 then 𝔄_A ≺ 𝔅_A; i.e., 𝔅_A is an ℒ_A-elementary
extension of 𝔄_A.
Proof. Let σ be any ℒ_A-sentence. If 𝔄_A ⊨ σ, then the singleton {σ ∧ x=x}
is trivially 𝔄-concurrent. Thus by Def. 1.3 we must have 𝔅_A ⊨ σ. Therefore
𝔄_A ≺ 𝔅_A by Lemma 1.1. ∎

1 Cf. Lemma 5.1.6.



1.5. Existence Theorem. For any ℒ-structure 𝔄 there exists 𝔅 such that
𝔄 < 𝔅.
Proof¹. We let ℒ' be the language obtained from ℒ_A by adding, for each
𝔄-concurrent Φ, a distinct new constant 𝐜_Φ. If Φ is 𝔄-concurrent and
x is the free variable of Φ, we put

Φ' = {φ(x/𝐜_Φ): φ∈Φ}.

Let
Σ = ⋃{Φ': Φ is 𝔄-concurrent}.

Then Σ is a set of ℒ'-sentences. From the fact that each Φ here is
𝔄-concurrent it follows at once that every finite subset of Σ is satisfiable. Thus
by the Compactness Theorem (3.3.16 or 5.3.12) Σ has some ℒ'-structure
𝔅' as a model.
Let 𝔅 be the ℒ-reduction of 𝔅'. We define a mapping a ↦ a' of A into
B as follows:

a' = 𝐚^𝔅' for all a∈A.

We claim that this mapping is an embedding of 𝔄 into 𝔅. Indeed, if
a_1≠a_2 then the singleton {𝐚_1≠𝐚_2 ∧ x=x} is trivially 𝔄-concurrent and hence
Σ ⊢ 𝐚_1≠𝐚_2. Since 𝔅' is a model of Σ, it follows that a_1'≠a_2', as required.
In a similar way one shows that, for all a_1,...,a_n∈A, for every n-ary function
symbol f of ℒ and every n-ary predicate symbol R of ℒ,

(f^𝔄(a_1,...,a_n))' = f^𝔅(a_1',...,a_n'),

⟨a_1,...,a_n⟩∈R^𝔄 iff ⟨a_1',...,a_n'⟩∈R^𝔅.

Since we have now shown that 𝔄 can be embedded in 𝔅, we may assume
without loss of generality that 𝔄 is actually a substructure of 𝔅. Then
from the fact that 𝔅' is a model of Σ it follows immediately that 𝔄 < 𝔅. ∎
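
The step in the proof of 1.5 asserting that every finite subset of Σ is satisfiable can be spelled out roughly as follows. This is only a sketch: the names Σ_0, Φ_i, Ψ_i and a_i are introduced here for illustration and do not occur in the text, and the fragment assumes the amssymb package.

```latex
% Sketch; \Sigma_0, \Phi_i, \Psi_i, a_i are notation assumed for this example.
% A finite \Sigma_0 \subseteq \Sigma mentions only finitely many of the new
% constants, say c_{\Phi_1}, ..., c_{\Phi_k} (the \Phi_i distinct), and uses
% only a finite part \Psi_i of each \Phi_i:
\[
  \Sigma_0 \subseteq \bigcup_{i=1}^{k}
  \{\, \varphi(\mathbf{x}_i/\mathbf{c}_{\Phi_i}) : \varphi \in \Psi_i \,\},
  \qquad \Psi_i \subseteq \Phi_i \ \text{finite},
\]
% where x_i is the free variable of \Phi_i. Each \Phi_i is
% \mathfrak{A}-concurrent, so there is a_i \in A with
\[
  \mathfrak{A}_A \models \varphi[\mathbf{x}_i/a_i]
  \quad\text{for all } \varphi \in \Psi_i \qquad (i = 1,\dots,k).
\]
% Expand \mathfrak{A}_A to an \mathcal{L}'-structure by interpreting
% c_{\Phi_i} as a_i and the remaining new constants arbitrarily.
% The resulting structure is a model of \Sigma_0.
```

Since each sentence φ(x_i/𝐜_{Φ_i}) contains no new constant other than 𝐜_{Φ_i}, the choices of the a_i for different i do not interfere with one another.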

For most of the work in this chapter it will be enough to operate with
an arbitrary enlargement of a given ℒ-structure. But occasionally it is
useful to have enlargements of a special kind.

1.6. Definition. Let 𝔄 be an ℒ-structure, and let α be an ordinal >0.
For each β≤α let 𝔄^(β) be an ℒ-structure such that
(1) 𝔄^(0) = 𝔄,
(2) 𝔄^(β) < 𝔄^(β+1) for all β<α,
(3) 𝔄^(β) = ⋃{𝔄^(γ): γ<β} for all limit β≤α.

1 Cf. Prob. 5.2.12(i).



(More fully, (3) means that, for limit β≤α,

A^(β) = ⋃{A^(γ): γ<β},

and for every n-ary predicate symbol R of ℒ we have

R^𝔄(β) = ⋃{R^𝔄(γ): γ<β},

and for every n-ary function symbol f of ℒ and every a_1,...,a_n∈A^(β) we have

f^𝔄(β)(a_1,...,a_n) = f^𝔄(γ)(a_1,...,a_n),

where γ is the least ordinal <β such that a_1,...,a_n∈A^(γ).) Then we say that
𝔄^(α) is an α-enlargement of 𝔄. If α is a limit ordinal we say that 𝔄^(α) is
a limit enlargement of 𝔄.

By the Existence Theorem 1.5 it follows that for each positive α there
exists an α-enlargement of 𝔄. Note also that a 1-enlargement of 𝔄 is
simply an enlargement of 𝔄.
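
For orientation, the simplest limit case of Def. 1.6 (α = ω) can be pictured as a chain. The display below is a sketch of that special case only; it assumes the amssymb package and uses < for the enlargement relation as in Def. 1.3.

```latex
% Sketch of Def. 1.6 for \alpha = \omega: successor steps use the
% Existence Theorem 1.5, the limit step is the union of the chain.
\[
  \mathfrak{A} = \mathfrak{A}^{(0)} < \mathfrak{A}^{(1)} < \mathfrak{A}^{(2)}
  < \cdots ,
  \qquad
  \mathfrak{A}^{(\omega)} = \bigcup_{n<\omega} \mathfrak{A}^{(n)} .
\]
```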

1.7. Theorem. Let α be an ordinal >0. If 𝔅 is an α-enlargement of 𝔄
then 𝔄 < 𝔅.
Proof.¹ We proceed by induction on α.
First, let α = β+1. The case β=0 is trivial. If β>0, then, by Def. 1.6,
ℭ < 𝔅, where ℭ is a β-enlargement of 𝔄. By the induction hypothesis,
𝔄 < ℭ. Let Φ be an 𝔄-concurrent collection of formulas. Then Φ is
satisfied in ℭ_A. But since ℭ < 𝔅, we have, by Thm. 1.4, ℭ_C ≺ 𝔅_C, and
hence certainly ℭ_A ≺ 𝔅_A. Therefore an individual that satisfies Φ in ℭ_A
must also satisfy it in 𝔅_A. Thus 𝔄 < 𝔅.
Now let α be a limit ordinal. Then for all β≤α we have ℒ-structures
𝔄^(β) as in Def. 1.6, with 𝔅 = 𝔄^(α). We claim: if ℭ = 𝔄^(β), where β<α, and
if σ is an ℒ_C-sentence, then ℭ_C ⊨ σ iff 𝔅_C ⊨ σ.
This claim is proved by induction on deg σ. The cases where σ is atomic
or a negation sentence or a conjunction sentence are simple and we leave
them to the reader.
Let σ = ∃xφ, where φ is an ℒ_C-formula with no free variable other
than x. If ℭ_C ⊨ σ, then for some c∈C we have ℭ_C ⊨ φ(x/𝐜); hence, by the
induction hypothesis on deg σ, also 𝔅_C ⊨ φ(x/𝐜) and therefore 𝔅_C ⊨ σ.
Conversely, suppose that 𝔅_C ⊨ σ. Then for some b∈B we have 𝔅_C ⊨ φ[x/b].
For some γ<α we must have b∈A^(γ) and without loss of generality we
may assume β≤γ (otherwise take β itself instead of γ). Put 𝔇 = 𝔄^(γ).
Then 𝔅_D ⊨ φ(x/𝐛) and by the induction hypothesis on deg σ we have

1 Cf. Prob. 5.2.10.



𝔇_D ⊨ φ(x/𝐛), hence 𝔇_C ⊨ σ. Now, ℭ = 𝔄^(β) and 𝔇 = 𝔄^(γ) with β≤γ<α.
Hence ℭ = 𝔇, or 𝔇 is a δ-enlargement of ℭ for some δ≤γ<α. In the former
case we have at once ℭ_C ⊨ σ. In the latter case the induction hypothesis
on α implies that ℭ < 𝔇; hence ℭ_C ≺ 𝔇_C by Thm. 1.4. Since 𝔇_C ⊨ σ, we
must have ℭ_C ⊨ σ as well. This completes the induction on deg σ and
proves our claim.
By Lemma 1.1 it follows that ℭ_C ≺ 𝔅_C for every ℭ of the above form,
i.e., ℭ = 𝔄^(β) with β<α.
Finally, let Φ be an 𝔄-concurrent collection. Then if ℭ = 𝔄^(β) with
0<β<α, our induction hypothesis implies that 𝔄 < ℭ and hence Φ is
satisfiable in ℭ_A. But we have shown that ℭ_C ≺ 𝔅_C; therefore certainly
ℭ_A ≺ 𝔅_A, and Φ is satisfiable in 𝔅_A. Thus 𝔄 < 𝔅. ∎

1.8. Corollary. If, for all β≤α, the ℒ-structures 𝔄^(β) are as in Def. 1.6
and if γ<β≤α, then 𝔄^(γ) < 𝔄^(β). ∎

§ 2. Zermelo structures and their enlargements

From now on we take ℒ to be the same as in Ch. 10; i.e., the first-order
language with equality and one binary extralogical predicate symbol ∈.
Note, however, that in the present chapter we have reverted to the
convention of using bold type for all symbols of ℒ.
By a Zermelo structure we mean an ℒ-structure 𝔘 such that the basic
relation ∈^𝔘 of 𝔘 is the membership relation (restricted to the domain U
of 𝔘); and U is a non-empty set having the following four properties:

(2.1) transitivity: a∈b∈U ⇒ a∈U;
(2.2) closure under unordered pairs: a,b∈U ⇒ {a,b}∈U;
(2.3) closure under union: a∈U ⇒ ⋃a∈U;
(2.4) closure under power-set: a∈U ⇒ Pa∈U.

The name “Zermelo structure” is given to 𝔘 because it is a model of
Zermelo’s set theory, possibly without the axiom of infinity. (See Prob.
10.10.2.) In particular, the axioms of regularity and choice hold in 𝔘
provided we assume them to hold in the universe of sets; and we do assume
this. (The axiom of regularity is assumed purely for convenience: it
simplifies matters but one can do without it.)
For any set A there exists a Zermelo structure 𝔘 such that A∈U. For
example, we can take U = R_α (see 10.3.1), where α is any limit ordinal
greater than the rank of A. It is then easy to verify (cf. Prob. 10.10.2)
that U has the required properties.

2.5. Problem. Verify that the domain U of a Zermelo structure has the
following properties:
(i) a ⊆ b∈U ⇒ a∈U;
(ii) 0∈U;
(iii) a,b∈U ⇒ a∪b∈U;
(iv) a_1,...,a_n∈U ⇒ {a_1,...,a_n}∈U;
(v) a_1,...,a_n∈U ⇒ ⟨a_1,...,a_n⟩∈U;
(vi) A,B∈U ⇒ A×B∈U;
(vii) A,B∈U ⇒ Fun(A,B)∈U, where Fun(A,B) = B^A = the collection of all
mappings of A into B.

From now on we let 𝔘 be a fixed but arbitrary Zermelo structure. In
applying nonstandard analysis to a given mathematical situation, we shall
usually assume, as we are entitled to do, that some particular set
belongs to U.
The members of U are called standard objects. It will be convenient from
now on to reserve the term set for standard objects only. If A is a set in
the usual sense, but we do not wish to imply that A necessarily belongs
to U, we refer to A as a collection. Similarly, when A ⊆ B, we say that A
is a subset of B only if we wish to imply that A and B are in U; otherwise,
we say that A is a subcollection of B, or simply that A is included in B.
We also fix a particular enlargement *𝔘 = ⟨*U, *∈⟩ of 𝔘. It is convenient
to read the prefix “*” as “pseudo”. Unless otherwise stated, *𝔘 can be
an arbitrary enlargement of 𝔘. Occasionally we shall want to assume that
*𝔘 is an α-enlargement for some suitable ordinal α.
It must be stressed that in general *∈ is not the membership relation,
though by Thm. 1.4 its formal properties that can be expressed in ℒ_U are
the same as those of membership.
We adopt the convention that if a symbol or an italic letter denotes
a member of *U then the same symbol in bold type (or, in the case of
a letter, in bold Roman type) denotes the corresponding constant of ℒ_*U.
If σ is a sentence of ℒ_U we write “⊨σ” as short for “𝔘_U ⊨ σ”. Similarly,
if σ is a sentence of ℒ_*U, we write “*⊨σ” as short for “*𝔘_*U ⊨ σ”. Note
that if σ is a sentence of ℒ_U then *⊨σ iff *𝔘_U ⊨ σ.
For any ℒ_U-sentence σ we have, by Thm. 1.4, ⊨σ iff *⊨σ. We shall
often need to make inferences of the form
(1) ⊨σ, hence also *⊨σ,
or
(2) *⊨σ, hence also ⊨σ,
where σ is an ℒ_U-sentence. To save space we write

(1′) (*)⊨σ

and

(2′) t*⊨σ

as short for (1) and (2) respectively. We call (1) (and its condensed form (1′))
transference from 𝔘 to *𝔘. Similarly, (2) (and its condensed form (2′))
will be called transference from *𝔘 to 𝔘.
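
As a small worked instance of transference from 𝔘 to *𝔘, the LaTeX sketch below applies it to a pairing sentence. The choice of the sentence σ is an assumption made for this illustration; it holds in 𝔘 by (2.1) and (2.2). The fragment assumes the amssymb package.

```latex
% Illustration: \sigma is a pairing sentence chosen for this example.
\[
  \sigma \;=\; \forall \mathbf{x}\,\forall \mathbf{y}\,\exists \mathbf{z}\,
  \forall \mathbf{w}\,
  (\mathbf{w} \in \mathbf{z} \leftrightarrow
   \mathbf{w} = \mathbf{x} \vee \mathbf{w} = \mathbf{y}).
\]
% By (2.2) every a, b in U have {a, b} in U, so \models \sigma.
% Thm. 1.4 then gives the transfer equivalence for L_U-sentences:
\[
  \models \sigma \iff {}^{*}\!\models \sigma .
\]
% Reading {}^{*}\models\sigma in {}^{*}\mathfrak{U}: for any *sets a, b
% there is a *set c whose *members are exactly a and b.
\[
  \forall a, b \in {}^{*}U \ \ \exists c \in {}^{*}U \ \
  \forall d \in {}^{*}U : \quad
  d \mathrel{{}^{*}\!\in} c \iff (d = a \ \text{or}\ d = b).
\]
```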

The members of *U are called *sets (read: pseudo-sets). Every set (i.e.,
member of U) is also a *set because U ⊆ *U. But the converse is not true.
To see this, observe that, by (2.4), U must be infinite. It follows that
{x≠𝐚 : a∈U} is a 𝔘-concurrent collection of ℒ_U-formulas, and since 𝔘 < *𝔘
there must exist some *set b different from all sets. A *set which is not
a set (i.e., any member of *U−U) is said to be nonstandard.
For any relation R among sets, defined by means of an ℒ_U-formula φ,
we have automatically a corresponding relation *R among *sets, which
is defined by means of the same φ. For example, consider the relation ⊆.
This relation is defined by means of the formula φ, where

φ = x⊆y = ∀z(z∈x → z∈y);

in other words, for any sets a, b we have a ⊆ b iff ⊨ 𝐚⊆𝐛 (i.e., ⊨ φ(x/𝐚, y/𝐛)).
Then the relation *⊆ is defined as follows: for any *sets a,b we have
a *⊆ b iff *⊨ 𝐚⊆𝐛 (i.e., *⊨ φ(x/𝐚, y/𝐛)). Thus a *⊆ b iff every *member of a
is a *member of b.
Note that if a and b are sets (i.e., standard objects) then, by transference,
a ⊆ b iff a *⊆ b. More generally, if R and *R are relations defined on U
and *U respectively by the same ℒ_U-formula, then R must be the restriction
of *R to U.
Note also that ⊆ (in bold type) is a (defined) predicate symbol of ℒ, while
⊆ and *⊆ are the corresponding relations on U and *U respectively. We use
the same typographical convention for other defined symbols as well.
Now let 𝐭 be a virtual term introduced into ℒ_U by the method explained
in §13 of Ch. 2. (See also the discussion following Thm. 10.1.5. Note that
we can use here the virtual terms introduced in Ch. 10, because by (2.1)
and Prob. 2.5(ii) we have

⊨ ∃!z∀y(y∈z ↔ y≠y),

i.e., ⊨ ∃!zα, where α is the formula used for introducing the virtual terms
of Ch. 10.) Suppose x_1,...,x_n are the free variables of 𝐭. Then we have

⊨ ∀x_1...∀x_n∃!y(𝐭=y);

and we can define an operation t on U as follows: for any a_1,...,a_n∈U take
t(a_1,...,a_n) to be the unique b∈U such that

⊨ 𝐭(x_1/𝐚_1,...,x_n/𝐚_n)=𝐛.

But by transference we have also

*⊨ ∀x_1...∀x_n∃!y(𝐭=y).

Hence we can define an operation *t on *U as follows: for any a_1,...,a_n∈*U
we let *t(a_1,...,a_n) be the unique b∈*U such that

*⊨ 𝐭(x_1/𝐚_1,...,x_n/𝐚_n)=𝐛.

It is easy to see that t is the restriction of *t to U.
For example, let 𝐭 be the term x∪y. Then the corresponding operation
on U is ∪ and we have

(*)⊨ ∀x∀y∀z(z∈x∪y ↔ z∈x ∨ z∈y).

Hence the operation *∪ on *U is such that for a,b∈*U the *members of
a *∪ b are the *members of a plus the *members of b. And if both a and b
happen to be standard then a *∪ b = a∪b.
In the sequel we shall often use, without special comment, relations of
the form *R, where R is a familiar relation among sets, and operations of
the form *t, where t is a familiar operation on sets. The definitions of
such *R and *t are obtained automatically from those of R and t by the
method just explained.
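
The same recipe can be run for intersection. The following LaTeX sketch treats the term x∩y as a second illustration; it is not an example given in the text, and it assumes the amssymb package.

```latex
% Illustration: the virtual term x \cap y, handled exactly like x \cup y.
\[
  (*)\models \forall \mathbf{x}\,\forall \mathbf{y}\,\forall \mathbf{z}\,
  (\mathbf{z} \in \mathbf{x} \cap \mathbf{y} \leftrightarrow
   \mathbf{z} \in \mathbf{x} \wedge \mathbf{z} \in \mathbf{y}).
\]
% Reading the transferred sentence in {}^{*}\mathfrak{U}:
\[
  c \mathrel{{}^{*}\!\in} (a \mathbin{{}^{*}\!\cap} b)
  \iff
  c \mathrel{{}^{*}\!\in} a \ \text{and}\ c \mathrel{{}^{*}\!\in} b
  \qquad (a, b, c \in {}^{*}U).
\]
```

So the *members of a *∩ b are exactly the common *members of a and b, and for standard a and b the operation *∩ agrees with ∩.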
For any *set a we define â (read: “a hat” and called the scope of a) as
the collection {b∈*U : b *∈ a}. Thus â is a subcollection of *U; in general
it is neither a set nor even a *set. The members of â are precisely the
*members of a. When we want to apply the “hat” (the scope symbol)
to an expression consisting of more than one letter, we put the hat to the
right of the expression. For example: (a *∪ b)^.

2.6. Problem. Let a,b be *sets. Prove:
(i) (a *∪ b)^ = â ∪ b̂;
(ii) (a *∩ b)^ = â ∩ b̂;
(iii) â = b̂ ⇒ a = b. (Apply transference to the axiom of extensionality.)
(iv) â ⊆ b̂ iff a *⊆ b.

If a is standard, then a ⊆ â because every member of a is also a *member
of a. When do we have â = a?

2.7. Theorem. Let a be standard. Then â = a iff a is finite.
Proof. First let a = 0. We have (*)⊨ ∀x(x∉0); hence 0 has no *members,
so 0̂ = 0.
Next, suppose a = {b_1,...,b_n}, where n≥1. From (2.1) it follows that the
b_i are standard; hence we can make the following transference

(*)⊨ ∀x(x∈𝐚 ↔ x=𝐛_1 ∨ ... ∨ x=𝐛_n),

so the *members of a are precisely b_1,...,b_n and hence â = a.
Finally, let a be infinite. Then

{x∈𝐚 ∧ x≠𝐛 : b∈a}

is clearly a 𝔘-concurrent collection of formulas. Therefore there exists
some c∈*U such that *⊨ 𝐜∈𝐚 and *⊨ 𝐜≠𝐛 for every b∈a. Thus c *belongs
to a but differs from all the members of a. In other words, c∈â−a, so that
a is a proper subcollection of â.
Let V be a subcollection of *U. If V = â for some *set a (which, by
Prob. 2.6(iii), must be unique) we say that V is internal. Otherwise, V is
external.

2.8. Example. Assume that the collection N of natural numbers is a set
(i.e., belongs to U). Then N ⊆ N̂ ⊆ *U. We shall show that N is external.
Let
M = {⟨n, n+1⟩ : n∈N}.

By parts (i) and (vi) of Prob. 2.5 we have M∈U. Let xSy be the ℒ_U-formula
⟨x,y⟩∈𝐌, and let S be the corresponding relation on U. Then we clearly
have

(1) (*)⊨ ∀z[0∈z ∧ ∀x∃y(x∈z → y∈z ∧ xSy) → 𝐍⊆z].

Now suppose that N is internal, i.e., N = Â for some *set A. Since 0∈N,
we have

(2) 0 *∈ A.

Also, for each n∈N there is m∈N such that nSm and hence also n *S m. Thus

(3) for each n *∈ A there is m *∈ A such that n *S m.

From (1), (2) and (3) we get N *⊆ A, hence N̂ ⊆ Â = N. Thus N̂ = N, contrary
to Thm. 2.7. Therefore N must be external.

2.9. Problem. Let s be standard and infinite. Show that s is external.
(Let b be any denumerable subset of s. Show that b is external, then use
the fact that b = b̂∩s.)

2.10. Criterion. Let V ⊆ *U. Then V is internal iff there is an ℒ_*U-formula
φ with one free variable x, and a *set s, such that
(1) V = {a∈*U : *⊨ 𝐚∈𝐬 ∧ φ(x/𝐚)}.

Proof. If V is internal, then V = ŝ for some *set s. Take φ to be the formula
x=x. Then clearly (1) holds.
Conversely, suppose that (1) holds for some ℒ_*U-formula φ and some *set s.
Let 𝐛_1,...,𝐛_n be all the constants occurring in φ. Then φ = ψ(y_1/𝐛_1,...,y_n/𝐛_n)
for some ℒ-formula ψ whose free variables are y_1,...,y_n,x. Then by the
axiom of separation (cf. 10.1.3) we have

(*)⊨ ∀y_1...∀y_n∀z∃u∀x(x∈u ↔ x∈z ∧ ψ).

In particular, giving y_1,...,y_n,z the values b_1,...,b_n,s respectively, we see
that for some *set c

*⊨ ∀x(x∈𝐜 ↔ x∈𝐬 ∧ φ).

Thus the *members of c are precisely those *sets a such that *⊨ 𝐚∈𝐬 ∧ φ(x/𝐚).
It follows from (1) that V = ĉ, so that V is internal. ∎

The following easy result is extremely useful:

2.11. Robinson’s Overspill Lemma. Let s be an infinite set, and let z be
an internal subcollection of ŝ. If s ⊆ z then (ŝ−s)∩z ≠ 0. If ŝ−s ⊆ z then
s∩z ≠ 0.
Proof. The first part follows from the fact that s is external (see Prob. 2.9).
To prove the second part, we show that ŝ−s is external. Suppose not;
then ŝ−s = t̂ for some *set t. Hence

s = {a∈*U : *⊨ 𝐚∈𝐬 ∧ 𝐚∉𝐭}.

By Crit. 2.10 it follows that s is internal, contrary to Prob. 2.9. ∎
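
To see what the lemma says in the most familiar case, the LaTeX sketch below specializes it to s = N. This assumes, as in Example 2.8, that N belongs to U; the specialization itself is an illustration and is not stated in the text.

```latex
% Illustration: the Overspill Lemma with s = N (assuming N \in U).
% Here z ranges over internal subcollections of \hat{N}.
\[
  N \subseteq z \subseteq \hat{N},\ z \text{ internal}
  \;\Longrightarrow\; (\hat{N} - N) \cap z \neq 0 ,
\]
\[
  \hat{N} - N \subseteq z \subseteq \hat{N},\ z \text{ internal}
  \;\Longrightarrow\; N \cap z \neq 0 .
\]
```

In words: an internal collection containing every standard natural number must also contain a nonstandard *member of N, and an internal collection containing every nonstandard *member of N must also contain a standard one.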

Let $s \in U$, and let $F$ be the set of all finite subsets of $s$. ($F \in U$ by (2.4) and
Prob. 2.5(i).) If $b \mathrel{{}^*\!\in} F$ then $b$ is a *finite *subset of $s$. Clearly, such $b$ enjoy
the same formal properties as finite sets. More precisely, if $\varphi$ is an $\mathcal{L}_U$-formula
with one free variable $x$ and $\models \varphi(x/a)$ for every finite set $a$, then also
${}^*\!\models \varphi(x/b)$. Nevertheless, $b$ can have infinitely many *members; i.e., the
collection $\hat{b}$ can be infinite.

2.12. Theorem. Let $s \in U$, and let $F$ be the set of all finite subsets of $s$. Then
there is some $b \mathrel{{}^*\!\in} F$ such that $s \subseteq \hat{b}$. In particular, if $s$ is infinite, so is $\hat{b}$.
Proof. Clearly $\{x \in F \wedge a \in x : a \in s\}$ is a $\mathfrak{U}$-concurrent collection. The
existence of $b$ with the required properties follows at once, since ${}^*\mathfrak{U}$ is an enlargement of $\mathfrak{U}$. ∎
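
The concurrency claim in this proof can be checked directly. The following is a minimal sketch, assuming the usual meaning of concurrency (every finite subcollection of the formulas is satisfied in $\mathfrak{U}$ by a single element); the abbreviation $\theta_a(x)$ for the formula $x \in F \wedge a \in x$ is introduced here and is not the book's notation.

```latex
% Given finitely many a_1,...,a_k in s, one witness satisfies theta_{a_1},...,theta_{a_k} together:
\[
c = \{a_1,\dots,a_k\} \in F, \qquad a_i \in c \quad (i = 1,\dots,k),
\]
\[
\text{so}\quad \mathfrak{U} \models \theta_{a_1}(x/c) \wedge \dots \wedge \theta_{a_k}(x/c).
\]
% In the enlargement a single b therefore satisfies every theta_a at once:
\[
{}^*\!\models b \in F \wedge a \in b \quad\text{for all } a \in s,
\qquad\text{i.e.}\quad b \mathrel{{}^*\!\in} F \ \text{ and } \ s \subseteq \hat{b}.
\]
```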

We now turn to a discussion of mappings. Henceforth we shall reserve
the term function for mappings which are sets (i.e., belong to $U$).
Let $X$ and $Y$ be sets. Then $\mathrm{Fun}(X,Y)$ is the set of all functions from
$X$ into $Y$ (see Prob. 2.5(vii)). If $f \in \mathrm{Fun}(X,Y)$ and $a \in X$, then $f$‘$a$ (read
$f$ of $a$, or $f$ applied to $a$) is the value of $f$ at $a$ and is a uniquely determined
member of $Y$. (The corresponding virtual term of $\mathcal{L}_U$ is $f$‘$a$.) We shall
usually omit the application sign ‘ and write simply $fa$ or $f(a)$ instead
of $f$‘$a$. (Also, in formal expressions we often write $fa$ or $f(a)$ for $f$‘$a$.)
To the operation Fun on sets and to the operation ‘ of function application
there correspond (as explained earlier in this section) the operation *Fun
on *sets and the operation *‘ of *function *application.
Let $X$ and $Y$ be *sets, and let $f \mathrel{{}^*\!\in} {}^*\mathrm{Fun}(X,Y)$. If $a \mathrel{{}^*\!\in} X$ then $f\,{}^*$‘$a$ (read:
$f$ *applied to $a$) is a uniquely determined *member of $Y$. Thus we have
a mapping $a \mapsto f\,{}^*$‘$a$ of $\hat{X}$ into $\hat{Y}$. We denote this mapping by $f^*$. (So, if
$a \mathrel{{}^*\!\in} X$, then $f\,{}^*$‘$a$ is both $f$ *applied to $a$ and $f^*$ applied to $a$.) Notice that
$f$ and $f^*$ are in general entities of different kinds. The former, $f$, is a *set
of *ordered pairs; in general $f$ is not a mapping. On the other hand, $f^*$ is
a mapping and is thus a collection of ordered pairs; in fact

$f^* = \{\langle a, f\,{}^*$‘$a\rangle : a \in \hat{X}\}$.

(Both $f$ and $f^*$ should be distinguished from $\hat{f}$, which is a collection of
*ordered pairs.)
We call $f^*$ the mapping induced by $f$. When there is no risk of confusion
we shall write $fa$ or $f(a)$ instead of $f^*$‘$a$.
If $f$ happens to be a function, then the mapping $f^*$ is an extension of $f$.
In this case we call $f^*$ the natural extension of $f$.

2.13. Problem. Let $\alpha$ be an infinite regular cardinal and let ${}^*\mathfrak{U}$ be an
$\alpha$-enlargement of $\mathfrak{U}$. Let $A$ and $B$ be *sets. Let $C$ be a subcollection of
$\hat{A}$ such that the cardinality of $C$ is less than $\alpha$, and let $\varphi$ be an arbitrary
mapping of $C$ into $\hat{B}$. Prove that there exists some $f \mathrel{{}^*\!\in} {}^*\mathrm{Fun}(A,B)$ such
that $fc = \varphi(c)$ for all $c \in C$. (Let $\{\mathfrak{U}(\beta) : \beta \le \alpha\}$ be an enlargement chain as
described in Def. 1.6, with $\mathfrak{U}(0) = \mathfrak{U}$ and $\mathfrak{U}(\alpha) = {}^*\mathfrak{U}$. Choose some $\beta < \alpha$
such that $U(\beta)$ contains $A$ and $B$ and includes $C$ and $\varphi[C]$. Observe that

$\{x \in \mathrm{Fun}(A,B) \wedge x$‘$c = b : c \in C \text{ and } b = \varphi(c)\}$

is a $\mathfrak{U}(\beta)$-concurrent collection.)

We have now finished setting up the general machinery of nonstandard
analysis. In the following sections we apply this machinery to various
mathematical situations.

§ 3. Filters and monads

Throughout this section we let $X \in U$ be a fixed but arbitrary non-empty set.
Since $X$ is standard, it follows that the members of $X$ are standard as well;
we call them points. (Accordingly, the *members of $X$, i.e., the members
of $\hat{X}$, will be called *points.)
By (2.4) and (2.1), $PX$, $PPX$ as well as all their members are standard.

The reader may visualize $X$ as a conglomeration of corpuscules or little
dots: the points of $X$. When $X$ is extended to $\hat{X}$, new corpuscules are
added; they may be visualized as differing in colour from the old ones.
The new corpuscules (the nonstandard *members of $X$) are interspersed
in gaps between the old ones as well as outside the old boundary of $X$.
Similarly, every subset $A$ of $X$ is extended to $\hat{A}$ by adding some of the new
corpuscules.

We say that $\mathcal{F}$ is a filter over $X$ if $\mathcal{F}$ is a non-empty set of subsets of $X$
(i.e., $\emptyset \neq \mathcal{F} \in PPX$) and has the following two properties:
(1) $A, B \in \mathcal{F} \Rightarrow A \cap B \in \mathcal{F}$,
(2) $X \supseteq B \supseteq A \in \mathcal{F} \Rightarrow B \in \mathcal{F}$.
Note that this definition differs from the one used in Ch. 4 (see beginning
of §3 of Ch. 4 and Prob. 4.3.16) in one respect: we now allow $\emptyset \in \mathcal{F}$. But,
in the presence of (2), we have $\emptyset \in \mathcal{F}$ iff $\mathcal{F} = PX$. Thus the only difference
between the present definition and that of Ch. 4 is that we now admit
$PX$ itself as a filter over $X$.
Throughout this section, when we say “filter” without any further
qualification, we mean filter over $X$. A filter $\mathcal{F}$ is proper if $\mathcal{F} \neq PX$.
If a filter $\mathcal{F}_1$ is properly included in another filter $\mathcal{F}_2$, we say that $\mathcal{F}_1$
is coarser than $\mathcal{F}_2$ and $\mathcal{F}_2$ is finer than $\mathcal{F}_1$. Among all filters, $\{X\}$ is the
coarsest and $PX$ the finest.
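Let $\mathcal{G}$ be any set of subsets of $X$ (i.e., $\mathcal{G} \in PPX$). We define $\overline{\mathcal{G}}$ to be the
filter generated by $\mathcal{G}$. Thus $\overline{\mathcal{G}}$ is the coarsest filter that includes $\mathcal{G}$. If $\mathcal{G} = \emptyset$,
then $\overline{\mathcal{G}} = \{X\}$. If $\mathcal{G} \neq \emptyset$, then $\overline{\mathcal{G}}$ is the set of all $B$ such that
$A_1 \cap \dots \cap A_n \subseteq B \subseteq X$ for some $n \ge 1$ and some $A_1,\dots,A_n \in \mathcal{G}$.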
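
As a small illustration of the generated filter (a toy case chosen here, assuming only that some three-element set $X$ belongs to $U$), one can compute $\overline{\mathcal{G}}$ by hand:

```latex
\[
X = \{1,2,3\}, \qquad \mathcal{G} = \bigl\{\{1,2\},\ \{2,3\}\bigr\}.
\]
% Finite intersections of members of G:
\[
\{1,2\}, \qquad \{2,3\}, \qquad \{1,2\}\cap\{2,3\} = \{2\}.
\]
% The generated filter consists of all supersets (within X) of these intersections:
\[
\overline{\mathcal{G}} = \bigl\{\{2\},\ \{1,2\},\ \{2,3\},\ \{1,2,3\}\bigr\}
 = \{\, B \subseteq X : 2 \in B \,\}.
\]
```

In this example $\overline{\mathcal{G}}$ is a proper filter, since $\emptyset \notin \overline{\mathcal{G}}$.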

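3.1. Definition. For any $\mathcal{G} \in PPX$ we put

$\mu\mathcal{G} = \{p \in \hat{X} : p \in \hat{A} \text{ for all } A \in \mathcal{G}\}$.

We call $\mu\mathcal{G}$ the monad of $\mathcal{G}$.

If $\mathcal{G} = \emptyset$, then $\mu\mathcal{G} = \hat{X}$. If $\mathcal{G} \neq \emptyset$, we can write

$\mu\mathcal{G} = \bigcap\{\hat{A} : A \in \mathcal{G}\}$.

Clearly, $\mu\mathcal{G}$ is always a subcollection of $\hat{X}$, so $\mu$ is a mapping of $PPX$
into $P\hat{X}$. A subcollection of $\hat{X}$ is said to be monadic (or a monad) if it is
$\mu\mathcal{G}$ for some $\mathcal{G} \in PPX$.
It is easy to see that if $\mathcal{G}_1 \subseteq \mathcal{G}_2 \in PPX$ then $\mu\mathcal{G}_1 \supseteq \mu\mathcal{G}_2$.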

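3.2. Theorem. $\mu\mathcal{G} = \mu\overline{\mathcal{G}}$ for every $\mathcal{G} \in PPX$.
Proof. If $\mathcal{G} = \emptyset$, then $\mu\mathcal{G} = \hat{X} = \mu\{X\} = \mu\overline{\mathcal{G}}$, by Def. 3.1. So now we may
assume $\mathcal{G} \neq \emptyset$.
$\mu\mathcal{G} \supseteq \mu\overline{\mathcal{G}}$ because $\mathcal{G} \subseteq \overline{\mathcal{G}}$. Conversely, if $B \in \overline{\mathcal{G}}$ then $A_1 \cap \dots \cap A_n \subseteq B$ for
some $n \ge 1$ and some $A_1,\dots,A_n \in \mathcal{G}$. Since $A_1,\dots,A_n$ and $B$ are all standard,
we have also $A_1 \mathbin{{}^*\!\cap} \dots \mathbin{{}^*\!\cap} A_n \mathrel{{}^*\!\subseteq} B$. Hence, by Prob. 2.6, $\hat{A}_1 \cap \dots \cap \hat{A}_n \subseteq \hat{B}$.
If $p \in \mu\mathcal{G}$, then $p \in \hat{A}$ for all $A \in \mathcal{G}$, and hence, by what we have just shown,
also $p \in \hat{B}$ for all $B \in \overline{\mathcal{G}}$; hence $p \in \mu\overline{\mathcal{G}}$. Thus $\mu\mathcal{G} \subseteq \mu\overline{\mathcal{G}}$. ∎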

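If $D \mathrel{{}^*\!\in} \mathcal{G}$ and $\hat{D} \subseteq \mu\mathcal{G}$, we say that $D$ is a tiny *member of $\mathcal{G}$. Note that
the condition $\hat{D} \subseteq \mu\mathcal{G}$ is equivalent to each of the following two conditions:

$\hat{D} \subseteq \hat{A}$ for all $A \in \mathcal{G}$,

$D \mathrel{{}^*\!\subseteq} A$ for all $A \in \mathcal{G}$.

This can be seen at once from Def. 3.1 and Prob. 2.6(iv).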

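Recall that $\mathcal{G}$ is a base for a filter $\mathcal{F}$ if $\overline{\mathcal{G}} = \mathcal{F}$ and for every $A \in \mathcal{F}$ there
exists some $B \in \mathcal{G}$ such that $B \subseteq A$. If $\mathcal{G}$ is a base for $\overline{\mathcal{G}}$, we say that $\mathcal{G}$ is
a filter base. It is easy to see (Prob. 4.3.10) that a set $\mathcal{G} \in PPX$ is a filter
base iff $\mathcal{G} \neq \emptyset$ and for every $A, B \in \mathcal{G}$ there exists some $C \in \mathcal{G}$ such that
$C \subseteq A \cap B$.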

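3.3. Theorem. Let $\mathcal{G} \in PPX$. Then $\mathcal{G}$ is a filter base iff it has a tiny *member.
Proof. Suppose $\mathcal{G}$ is a filter base. Then

$\{x \in \mathcal{G} \wedge x \subseteq A : A \in \mathcal{G}\}$

is easily seen to be a $\mathfrak{U}$-concurrent collection of formulas. Hence there
exists some $D \mathrel{{}^*\!\in} \mathcal{G}$ such that $D \mathrel{{}^*\!\subseteq} A$ for all $A \in \mathcal{G}$. $D$ is a tiny *member of $\mathcal{G}$.
Conversely, let $\mathcal{G}$ have a tiny *member $D$. Then $\mathcal{G} \neq \emptyset$ because $\emptyset$ has
no *members. Also, if $A, B \in \mathcal{G}$ then $D \mathrel{{}^*\!\subseteq} A$ and $D \mathrel{{}^*\!\subseteq} B$; hence

$(*)\models \exists x(x \in \mathcal{G} \wedge x \subseteq A \cap B)$.

Thus there exists some $C \in \mathcal{G}$ such that $C \subseteq A \cap B$. It follows that $\mathcal{G}$ is
a filter base. ∎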

Thm. 3.3 is our second example of a characterization of a standard notion
(in this case: being a filter base) by nonstandard means; i.e., by means of
notions pertaining to ${}^*\mathfrak{U}$ rather than $\mathfrak{U}$. (The first example was Thm. 2.7.)

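3.4. Definition. Let $\tau$ be an arbitrary subcollection of $\hat{X}$. We put

$F\tau = \{A \subseteq X : \tau \subseteq \hat{A}\}$.

Thus $F$ is a mapping of $P\hat{X}$ into $PPX$. Moreover, it is easy to verify that
$F\tau$ is a filter; we call it the filter of $\tau$. Also, if $\tau \subseteq \sigma \subseteq \hat{X}$ then clearly $F\tau \supseteq F\sigma$.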

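3.5. Theorem. $F\mu\mathcal{G} = \overline{\mathcal{G}}$ for every $\mathcal{G} \in PPX$.
Proof. By Thm. 3.2 and Def. 3.1 we have

$\mu\mathcal{G} = \mu\overline{\mathcal{G}} \subseteq \hat{A}$ for all $A \in \overline{\mathcal{G}}$.

Therefore $\overline{\mathcal{G}} \subseteq F\mu\mathcal{G}$ by Def. 3.4.
It remains to prove that $F\mu\mathcal{G} \subseteq \overline{\mathcal{G}}$. Let $A \in F\mu\mathcal{G}$. Then $\mu\mathcal{G} \subseteq \hat{A}$ by Def. 3.4.
Now, $\overline{\mathcal{G}}$ is a filter, and a fortiori a filter base; therefore, by Thm. 3.3, $\overline{\mathcal{G}}$ has
a tiny *member, say $D$. Thus

$\hat{D} \subseteq \mu\overline{\mathcal{G}} = \mu\mathcal{G} \subseteq \hat{A}$.

But then $D \mathrel{{}^*\!\subseteq} A$, and we have

$(*)\models \exists x(x \in \overline{\mathcal{G}} \wedge x \subseteq A)$.

Thus $A$ has a subset belonging to $\overline{\mathcal{G}}$. But $\overline{\mathcal{G}}$ is a filter, hence $A \in \overline{\mathcal{G}}$ as
required.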

3.6. Corollary. $F\mu\mathcal{F} = \mathcal{F}$ for every filter $\mathcal{F}$. Distinct filters have distinct
monads.

Cor. 3.6 means that every filter can be recovered from its monad. Thus
all the information concerning a filter is encapsulated in its monad. Therefore
any statement about filters can, in principle, be replaced by an equivalent
statement about their monads. This is often a considerable heuristic
simplification, since a filter is a set of sets of points, while its monad is
merely a collection of *points.

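3.7. Corollary. For any filters $\mathcal{F}_1$, $\mathcal{F}_2$ we have $\mathcal{F}_1 \subseteq \mathcal{F}_2$ iff $\mu\mathcal{F}_1 \supseteq \mu\mathcal{F}_2$.
Proof. If $\mu\mathcal{F}_1 \supseteq \mu\mathcal{F}_2$, then, applying $F$ to both sides and using Cor. 3.6,
we have $\mathcal{F}_1 \subseteq \mathcal{F}_2$. The converse is obvious. ∎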

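3.8. Definition. For any $\tau \subseteq \hat{X}$ we put

$\overline{\tau} = \mu F\tau$.

We call $\overline{\tau}$ the monad generated by $\tau$ or the monadic closure of $\tau$.

3.9. Theorem. If $\tau \subseteq \hat{X}$, then $\overline{\tau}$ is the smallest monad that includes $\tau$.
Proof. By Def. 3.8, $\overline{\tau}$ is a monad. Also, by Def. 3.4, $\tau \subseteq \hat{A}$ for all $A \in F\tau$;
therefore, by Def. 3.1, $\tau \subseteq \mu F\tau = \overline{\tau}$.
Now suppose $\tau \subseteq \mu\mathcal{G}$ for some $\mathcal{G} \in PPX$. We shall show that $\overline{\tau} \subseteq \mu\mathcal{G}$. Indeed,
applying $F$ to both sides of the inclusion $\tau \subseteq \mu\mathcal{G}$, and using 3.5, we get

$F\tau \supseteq F\mu\mathcal{G} = \overline{\mathcal{G}}$.

It now follows that $\mu F\tau \subseteq \mu\overline{\mathcal{G}} = \mu\mathcal{G}$. This means that $\overline{\tau} \subseteq \mu\mathcal{G}$. ∎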

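3.10. Corollary. If $\tau$ is a monad, then $\mu F\tau = \overline{\tau} = \tau$.

3.11. Theorem. (a) $F\tau = F\overline{\tau}$ for any $\tau \subseteq \hat{X}$.
(b) For any monads $\sigma$, $\tau$ we have $\sigma \subseteq \tau$ iff $F\sigma \supseteq F\tau$.
Proof. (a) $F\overline{\tau} = F\mu F\tau = F\tau$ by Def. 3.8 and Cor. 3.6.
(b) If $\sigma$ and $\tau$ are monads such that $F\sigma \supseteq F\tau$, then, applying $\mu$ to both
sides and using Cor. 3.10, we have $\sigma \subseteq \tau$. The converse is obvious.

To sum up: both $\mu$ and $F$ are inclusion-reversing mappings. We have
$\overline{\mathcal{G}} = F\mu\mathcal{G}$ and $\overline{\tau} = \mu F\tau$ for every $\mathcal{G} \in PPX$ and $\tau \in P\hat{X}$. Finally, $F\mu\mathcal{F} = \mathcal{F}$
for every filter $\mathcal{F}$; and $\mu F\tau = \tau$ for every monad $\tau$.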
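
The interplay between $\mu$ and $F$ summarized above can be condensed into a single equivalence. The following is a sketch read off directly from Defs. 3.1 and 3.4 for arbitrary $\mathcal{G} \in PPX$ and $\tau \subseteq \hat{X}$; calling it a Galois connection is terminology added here, not the book's.

```latex
\[
\tau \subseteq \mu\mathcal{G}
\iff \tau \subseteq \hat{A} \ \text{ for all } A \in \mathcal{G}
\iff A \in F\tau \ \text{ for all } A \in \mathcal{G}
\iff \mathcal{G} \subseteq F\tau .
\]
% Special cases:
%   taking \mathcal{G} := F\tau   gives  \tau \subseteq \mu F\tau          (cf. Thm. 3.9),
%   taking \tau := \mu\mathcal{G} gives  \mathcal{G} \subseteq F\mu\mathcal{G}  (cf. Thm. 3.5).
```

The first step uses Def. 3.1 together with $\tau \subseteq \hat{X}$; the second uses Def. 3.4 together with the fact that every $A \in \mathcal{G}$ is a subset of $X$.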

We shall now consider filters of certain particular kinds and find their
monads.

3.12. Example. A principal filter is a filter of the form $\overline{\{A\}}$, where $A \subseteq X$.
(It is easy to see that $\overline{\{A\}} = \{B : A \subseteq B \subseteq X\}$.)
We clearly have $\mu(\overline{\{A\}}) = \mu\{A\} = \hat{A}$. Thus, the monads of principal
filters are scopes (“hats”) of subsets of $X$.
In particular, for the improper filter $PX$ we have $PX = \overline{\{\emptyset\}}$. Hence
$\mu PX = \hat{\emptyset} = \emptyset$ by Thm. 2.7. At the other extreme we have the filter $\{X\}$,
whose monad is $\hat{X}$.

3.13. Example. An ultrafilter is a proper filter which is not coarser than
any other proper filter. The monad of an ultrafilter is called an ultramonad.
Evidently, ultramonads are characterized by being minimal non-empty
monads: a monad $\tau$ is an ultramonad iff the only monad properly included
in $\tau$ is $\emptyset$.
Let $\mathcal{F}$ be an ultrafilter. Take any $p \in \mu\mathcal{F}$. Then $\{p\} \subseteq \mu\mathcal{F}$ and hence
$\overline{\{p\}} \subseteq \mu\mathcal{F}$ by Thm. 3.9. Since $\mu\mathcal{F}$ is an ultramonad, we must have
$\overline{\{p\}} = \mu\mathcal{F}$. Thus each ultramonad is generated by the singleton of a *point.
Conversely, if $p$ is any *point, we shall show that $\overline{\{p\}}$ is an ultramonad.
Since $\overline{\{p\}} = \mu F\{p\}$ by Def. 3.8, it is enough to show that $F\{p\}$ is an ultrafilter.
Let $A \subseteq X$. Then

$(*)\models \forall x[x \in X \to (x \in A \leftrightarrow x \notin X - A)]$.

In particular $p \mathrel{{}^*\!\in} A$ iff $p \mathrel{{}^*\!\notin} X - A$. This means that $\{p\} \subseteq \hat{A}$ iff $\{p\} \not\subseteq \widehat{X - A}$.
Thus, for any $A \subseteq X$, exactly one of the sets $A$, $X - A$ belongs to $F\{p\}$.
It follows that $F\{p\}$ is an ultrafilter. (Cf. Thm. 4.3.5.)
We have thus proved that $\tau$ is an ultramonad iff $\tau = \overline{\{p\}}$ for some $p \in \hat{X}$.
Equivalently, $\mathcal{F}$ is an ultrafilter iff $\mathcal{F} = F(\overline{\{p\}})$ for some $p \in \hat{X}$. (By Thm.
3.11 we can write $F\{p\}$ instead of $F(\overline{\{p\}})$.)
By the way, for every proper filter $\mathcal{F}$ we can find an ultrafilter extending
$\mathcal{F}$, as follows. Since $\mathcal{F}$ is proper, $\mu\mathcal{F} \neq \emptyset$. Take any $p \in \mu\mathcal{F}$. Then
$\overline{\{p\}} \subseteq \mu\mathcal{F}$ and $F(\overline{\{p\}})$ is clearly an ultrafilter extending $\mathcal{F}$.
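
For a *point $p$, the fact that $F\{p\}$ is a proper filter (part of what Def. 3.4 leaves to the reader) can be spelled out as follows. This is a sketch that uses only transfer of simple $\mathcal{L}_U$-sentences, writing $A \cap B$ for the standard set of that name.

```latex
\[
F\{p\} = \{\, A \subseteq X : p \in \hat{A} \,\} = \{\, A \subseteq X : p \mathrel{{}^*\!\in} A \,\}.
\]
% Non-empty: p is a *point, so p \in \hat{X}, i.e. X \in F\{p\}.
% Closed under intersection: for A, B \in F\{p\},
\[
(*)\models \forall x\,(x \in A \wedge x \in B \to x \in A \cap B)
\quad\Longrightarrow\quad p \mathrel{{}^*\!\in} A \cap B .
\]
% Closed under supersets: for A \in F\{p\} and A \subseteq B \subseteq X,
\[
(*)\models \forall x\,(x \in A \to x \in B)
\quad\Longrightarrow\quad p \mathrel{{}^*\!\in} B .
\]
% Proper: \hat{\emptyset} = \emptyset, so \emptyset \notin F\{p\}.
```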

3.14. Example. Let $p \in X$. Then $\widehat{\{p\}} = \{p\}$ by Thm. 2.7. Therefore by
Ex. 3.12 we have $\{p\} = \mu\mathcal{F}$, where $\mathcal{F}$ is the principal filter generated by
$\{\{p\}\}$. It follows that $\overline{\{p\}} = \{p\}$, because $\{p\}$ itself is already a monad.
Thus we see that, for standard $p$, $\overline{\{p\}} = \{p\}$ = the monad of the principal
ultrafilter $\overline{\{\{p\}\}}$.
Conversely, let $q$ be any *point such that the ultrafilter $F(\overline{\{q\}})$ is principal.
We shall show that $q$ is standard, i.e., $q \in X$. Indeed, by Ex. 3.12 we must
have $\overline{\{q\}} = \hat{A}$ for some non-empty $A \subseteq X$. Take any $p \in A$. Then
$\{p\} \subseteq A \subseteq \hat{A} = \overline{\{q\}}$. But, since $p$ is standard, it follows from what we have
seen above that $\{p\}$ is a monad. Thus $\{p\} = \overline{\{q\}}$, hence $q \in \{p\}$; so $q = p \in X$,
as claimed.
We see that, if $q \in \hat{X} - X$, then $\overline{\{q\}}$ must be the monad of a non-principal
ultrafilter.

3.15. Problem. Prove that if $q$ is nonstandard then $\overline{\{q\}}$ is an infinite
collection.
3.16. Problem. Let $\mathcal{F}$ be the filter of cofinite subsets of $X$, i.e.,

$\mathcal{F} = \{A \subseteq X : X - A \text{ is finite}\}$.

Prove that $\mu\mathcal{F} = \hat{X} - X$. Hence deduce that an ultrafilter is non-principal
iff it is an extension of $\mathcal{F}$.

In Ex. 3.13 we saw that the ultramonads are precisely all collections
of the form $\overline{\{p\}}$, with $p \in \hat{X}$. Suppose that the ultramonads $\overline{\{p\}}$ and
$\overline{\{q\}}$ have a common member $r$. Then the ultramonad $\overline{\{r\}}$ is included
in both $\overline{\{p\}}$ and $\overline{\{q\}}$. Hence $\overline{\{r\}} = \overline{\{p\}} = \overline{\{q\}}$. Thus distinct ultramonads
are disjoint. It follows that the ultramonads partition $\hat{X}$. By
Ex. 3.14 and Prob. 3.15, an ultramonad consists of a single point or of
infinitely many nonstandard *points.

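3.17. Theorem. Let $p$ and $q$ be *points. Then $\overline{\{p\}} = \overline{\{q\}}$ iff $q \mathrel{{}^*\!\in} A$ whenever
$p \mathrel{{}^*\!\in} A \subseteq X$.
Proof. Since the ultramonads partition $\hat{X}$, we have $\overline{\{p\}} = \overline{\{q\}}$ iff $q \in \overline{\{p\}}$.
But $\overline{\{p\}} = \mu F\{p\}$ by Def. 3.8. Hence $q \in \overline{\{p\}}$ iff $q \in \hat{A}$ for every $A \in F\{p\}$,
i.e., $q \mathrel{{}^*\!\in} A$ whenever $p \mathrel{{}^*\!\in} A \subseteq X$. ∎

The following theorem means that $\overline{\{p\}} = \overline{\{q\}}$ iff $p$ and $q$ are “indistinguishable”
in $\mathcal{L}_U$.

3.18. Theorem. Let $p$ and $q$ be *points. Then $\overline{\{p\}} = \overline{\{q\}}$ iff whenever
$\varphi$ is an $\mathcal{L}_U$-formula with one free variable $x$ such that ${}^*\!\models \varphi(x/p)$, we also
have ${}^*\!\models \varphi(x/q)$.
Proof. Suppose $\overline{\{p\}} = \overline{\{q\}}$; then by Thm. 3.17 we have $q \mathrel{{}^*\!\in} A$ whenever
$p \mathrel{{}^*\!\in} A \subseteq X$. Let $\varphi$ be an $\mathcal{L}_U$-formula with one free variable $x$, such that
${}^*\!\models \varphi(x/p)$. Put

$A = \{a : \models a \in X \wedge \varphi(x/a)\}$.

Clearly, $p \mathrel{{}^*\!\in} A$ and $A \subseteq X$. Hence also $q \mathrel{{}^*\!\in} A$, so we must have ${}^*\!\models \varphi(x/q)$.
Conversely, suppose $\overline{\{p\}} \neq \overline{\{q\}}$. Then by Thm. 3.17 there is some $A \subseteq X$
such that $p \mathrel{{}^*\!\in} A$ but $q \mathrel{{}^*\!\notin} A$. Since $A$ is standard, $x \in A$ is an $\mathcal{L}_U$-formula;
and we have ${}^*\!\models p \in A$ but not ${}^*\!\models q \in A$.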

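3.19. Problem. Let $I$ be a non-empty collection, and for each $i \in I$ let
$\mathcal{G}_i \in PPX$. Prove:
(i) $\mu\bigcup\{\mathcal{G}_i : i \in I\} = \bigcap\{\mu\mathcal{G}_i : i \in I\}$.
(ii) $\mu\bigcap\{\overline{\mathcal{G}_i} : i \in I\} = \overline{\bigcup\{\mu\mathcal{G}_i : i \in I\}}$.
3.20. Problem. Let $\mathcal{F}_1$ and $\mathcal{F}_2$ be filters. Prove that $\mu(\mathcal{F}_1 \cap \mathcal{F}_2) = \mu\mathcal{F}_1 \cup \mu\mathcal{F}_2$.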
3.21. Problem. (i) Show that under the operation of monadic closure
(Def. 3.8) $\hat{X}$ is a topological space. (Use 3.19(i) and 3.20.)
In the remaining parts of the present problem, $\hat{X}$ is taken with this
monadic topology.
(ii) Show that $\hat{X}$ is compact. (Use 3.19(i) to prove that if a collection
of monads has the finite intersection property, it has a non-empty
intersection.)
(iii) Show that $X$ is dense in $\hat{X}$.
(iv) Show that $\tau \subseteq \hat{X}$ is clopen iff $\tau = \hat{A}$ for some $A \subseteq X$. (If $\tau$ is clopen
and $A \in F\tau$, then $\hat{A} = \tau$ or $A$ intersects every member of $F(\hat{X} - \tau)$; hence,
if $F\tau$ is non-principal, every member of $F\tau$ intersects every member of
$F(\hat{X} - \tau)$. Consider tiny *members of $F\tau$ and $F(\hat{X} - \tau)$ respectively.
Alternatively, use Lemma 4.4.2.)
(v) Let $M\hat{X}$ be the collection of all ultramonads: $M\hat{X} = \{\overline{\{p\}} : p \in \hat{X}\}$.
Give $M\hat{X}$ the quotient topology, so that the closed subcollections of $M\hat{X}$
are precisely those of the form $\{\overline{\{p\}} : p \in \overline{\tau}\}$ with $\tau \subseteq \hat{X}$. Show that $M\hat{X}$
is naturally homeomorphic to the Stone space $SPX$ of the power-set Boolean
algebra $PX$.

The following result, the deepest in this section, is due to Luxemburg.
It was also discovered (independently, but somewhat later) by Hirschfeld.
We shall use Hirschfeld’s proof.

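3.22. Theorem. If $\mathcal{F}$ is a non-principal filter, then $\mu\mathcal{F}$ is external.
Proof. Suppose that $\mu\mathcal{F}$ is internal; then $\mu\mathcal{F} = \hat{A}$ for some *subset $A$ of $X$.
Let $\mathcal{B}$ be any strong base for $\mathcal{F}$; thus $\overline{\mathcal{B}} = \mathcal{F}$, and $B_1 \cap \dots \cap B_n \in \mathcal{B}$
whenever $n \ge 1$ and $B_1,\dots,B_n \in \mathcal{B}$. Since $\mu\mathcal{B} = \mu\overline{\mathcal{B}} = \mu\mathcal{F} = \hat{A}$, we have

(1) $\hat{A} = \bigcap\{\hat{B} : B \in \mathcal{B}\}$.

We claim that $A \mathrel{{}^*\!\in} \mathcal{B}$. To prove this, we start by choosing a *finite *subset
$\mathcal{C}$ of $\mathcal{B}$ such that

(2) $\mathcal{B} \subseteq \hat{\mathcal{C}}$.

The existence of such $\mathcal{C}$ follows from Thm. 2.12. Next, by Crit. 2.10 the
collection $\{C : {}^*\!\models C \in \mathcal{C} \wedge A \subseteq C\}$ is internal and hence equals $\hat{\mathcal{D}}$ for
some *set $\mathcal{D}$. Thus

(3) $\hat{\mathcal{D}} = \{C : C \mathrel{{}^*\!\in} \mathcal{C} \text{ and } A \mathrel{{}^*\!\subseteq} C\}$.

We put $D = {}^*\!\bigcap\mathcal{D}$. So a *point $p$ *belongs to $D$ iff $p$ *belongs to every
*member of $\mathcal{D}$. By (3) this can be written as

(4) $\hat{D} = \bigcap\{\hat{C} : C \mathrel{{}^*\!\in} \mathcal{C} \text{ and } A \mathrel{{}^*\!\subseteq} C\}$.

We see at once that $\hat{A} \subseteq \hat{D}$. On the other hand from (1), (2) and (4) it is
clear that we have $\hat{D} \subseteq \bigcap\{\hat{B} : B \in \mathcal{B}\}$ and therefore $\hat{D} \subseteq \hat{A}$ by (1). Thus we
have $\hat{A} = \hat{D}$, so that $A = D$.
Now, from (3) it is clear that $\mathcal{D}$ is a *subset of $\mathcal{C}$. Since $\mathcal{C}$ was chosen
as a *finite *subset of $\mathcal{B}$, it follows that $\mathcal{D}$ too is a *finite *subset of $\mathcal{B}$.
(Note: $(*)\models$ every subset of a finite set is finite!) But $\mathcal{B}$ is closed under
finite intersections hence, by transference, *closed under *finite
intersections. It follows that $A = D \mathrel{{}^*\!\in} \mathcal{B}$, as claimed.
We now use the fact that $\mathcal{F}$ is non-principal. By Lemma 4.3.11, we have
disjoint strong bases $\mathcal{B}_1$ and $\mathcal{B}_2$ for $\mathcal{F}$. But, by what we have just shown,

$A \in \hat{\mathcal{B}}_1 \cap \hat{\mathcal{B}}_2 = \widehat{\mathcal{B}_1 \cap \mathcal{B}_2} = \hat{\emptyset} = \emptyset$,

which is absurd. So $\mu\mathcal{F}$ must be external. ∎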

3.23. Problem. Using Thm. 3.22 and Prob. 3.16, find a new solution for
Prob. 2.9.
3.24. Problem. Consider $\hat{X}$ with the monadic topology (Prob. 3.21(i)).
Show that for any $A \mathrel{{}^*\!\subseteq} X$ the following three conditions are equivalent:
(i) $\hat{A}$ is closed,
(ii) $\hat{A}$ is open,
(iii) $A$ is standard.
3.25. Problem. Let $\mathcal{B}$ be a strong filter base, and let $p \in \mu\mathcal{B}$. Show that
$\mathcal{B}$ has a tiny *member $B$ such that $p \in \hat{B}$. (Find a *finite *subset $\mathcal{C}$ of $\mathcal{B}$
such that $\mathcal{B} \subseteq \hat{\mathcal{C}}$ and $p \in \hat{C}$ for every $C \in \hat{\mathcal{C}}$.)

For the rest of this section we assume, in addition to $X \in U$, also $Y \in U$
and $f \in \mathrm{Fun}(X,Y)$.
If $\mathcal{H} \in PPY$ and $\tau \subseteq \hat{Y}$ then $\mu\mathcal{H}$ and $F\tau$ are defined in the obvious way,
analogous to Defs. 3.1 and 3.4.
If $\mathcal{G} \in PPX$ we put $f\mathcal{G} = \{f[A] : A \in \mathcal{G}\}$. Similarly, if $\mathcal{H} \in PPY$ we put

$f^{-1}\mathcal{H} = \{f^{-1}[B] : B \in \mathcal{H}\}$.

As explained at the end of §2, the function $f$ has a natural extension $f^*$,
which is a mapping of $\hat{X}$ into $\hat{Y}$. When applying $f^*$ to an argument, we
omit the superscript “*”. In particular, if $\sigma \subseteq \hat{X}$ then $f[\sigma]$ is the collection
of all images of members of $\sigma$ under $f^*$. Similarly, if $\tau \subseteq \hat{Y}$ then $f^{-1}[\tau]$
is the collection of all members of $\hat{X}$ that are mapped by $f^*$ into $\tau$.
On the other hand, if $A \mathrel{{}^*\!\subseteq} X$ then $f[A]$ is that *subset of $Y$ which is
characterized by

$q \mathrel{{}^*\!\in} f[A]$ iff $q = fp$ for some $p \mathrel{{}^*\!\in} A$.

Thus $\widehat{f[A]} = f[\hat{A}]$. Similarly, if $B \mathrel{{}^*\!\subseteq} Y$ then $f^{-1}[B]$ is that *subset of $X$
which is characterized by

$p \mathrel{{}^*\!\in} f^{-1}[B]$ iff $fp \mathrel{{}^*\!\in} B$.

Thus $\widehat{f^{-1}[B]} = f^{-1}[\hat{B}]$.
If $\sigma$ is a monad $\subseteq \hat{X}$, what is $f[\sigma]$? If $\tau$ is a monad $\subseteq \hat{Y}$, what is $f^{-1}[\tau]$?
The second question is easier, and we deal with it first.

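3.26. Theorem. $f^{-1}[\mu\mathcal{H}] = \mu f^{-1}\mathcal{H}$ for every $\mathcal{H} \in PPY$.
Proof.
$p \in f^{-1}[\mu\mathcal{H}] \Leftrightarrow fp \in \mu\mathcal{H}$
$\Leftrightarrow fp \in \hat{B}$ for all $B \in \mathcal{H}$
$\Leftrightarrow p \in f^{-1}[\hat{B}] = \widehat{f^{-1}[B]}$ for all $B \in \mathcal{H}$
$\Leftrightarrow p \in \hat{A}$ for all $A \in f^{-1}\mathcal{H}$
$\Leftrightarrow p \in \mu f^{-1}\mathcal{H}$. ∎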
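3.27. Theorem. $\overline{f[\mu\mathcal{G}]} = \mu f\mathcal{G}$ for every filter base $\mathcal{G} \in PPX$.
Proof. We have $\mu\mathcal{G} \subseteq \hat{A}$ for all $A \in \mathcal{G}$. Hence

$f[\mu\mathcal{G}] \subseteq f[\hat{A}] = \widehat{f[A]}$

for all $A \in \mathcal{G}$; i.e., $f[\mu\mathcal{G}] \subseteq \hat{B}$ for all $B \in f\mathcal{G}$. Therefore $f[\mu\mathcal{G}] \subseteq \mu f\mathcal{G}$ and by
Thm. 3.9 it follows that $\overline{f[\mu\mathcal{G}]} \subseteq \mu f\mathcal{G}$.
It remains to prove the reverse inclusion. Let $B \in Ff[\mu\mathcal{G}]$; then by
Def. 3.4 we have $f[\mu\mathcal{G}] \subseteq \hat{B}$. By Thm. 3.3, $\mathcal{G}$ has a tiny *member $D$; thus
$\hat{D} \subseteq \mu\mathcal{G}$ and hence

$\widehat{f[D]} = f[\hat{D}] \subseteq f[\mu\mathcal{G}] \subseteq \hat{B}$.

Therefore $f[D] \mathrel{{}^*\!\subseteq} B$; and since also $D \mathrel{{}^*\!\in} \mathcal{G}$, we have

$(*)\models \exists x(x \in \mathcal{G} \wedge f[x] \subseteq B)$.

So $B$ has a subset of the form $f[A]$, with $A \in \mathcal{G}$. Therefore $B \in \overline{f\mathcal{G}}$. We
have thus shown

$Ff[\mu\mathcal{G}] \subseteq \overline{f\mathcal{G}} = F\mu f\mathcal{G}$.

Applying $\mu$ to both sides we get $\overline{f[\mu\mathcal{G}]} \supseteq \mu f\mathcal{G}$, as required. ∎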

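If ${}^*\mathfrak{U}$ is a limit enlargement, we can prove a stronger result:

3.28. Theorem. If ${}^*\mathfrak{U}$ is a limit enlargement of $\mathfrak{U}$, then $f[\mu\mathcal{G}] = \mu f\mathcal{G}$ for
every filter base $\mathcal{G} \in PPX$.
Proof. In view of Thm. 3.27, we only have to show that $\mu f\mathcal{G} \subseteq f[\mu\mathcal{G}]$.
Let $\{\mathfrak{U}(\beta) : \beta \le \alpha\}$ be an enlargement chain as described in Def. 1.6, with
$\mathfrak{U}(0) = \mathfrak{U}$ and $\mathfrak{U}(\alpha) = {}^*\mathfrak{U}$, where $\alpha$ is a limit ordinal. Recall that by Cor. 1.8
we have $\mathfrak{U}(\gamma) \prec \mathfrak{U}(\beta)$ whenever $\gamma \le \beta \le \alpha$.
Now let $q \in \mu f\mathcal{G}$. Thus, for every $A \in \mathcal{G}$ we have $q \in \widehat{f[A]}$; i.e.,

${}^*\!\models q \in f[A]$.

Since $\alpha$ is a limit ordinal, there exists some $\beta < \alpha$ such that $q \in U(\beta)$. Because
$\mathfrak{U}(\beta) \prec \mathfrak{U}(\alpha) = {}^*\mathfrak{U}$, we must have

$\mathfrak{U}(\beta) \models q \in f[A]$.

From this and the fact that $\mathcal{G}$ is a filter base it follows that the collection

$\{fx = q \wedge x \in A : A \in \mathcal{G}\}$

is $\mathfrak{U}(\beta)$-concurrent. Since $\mathfrak{U}(\beta+1)$ is an enlargement of $\mathfrak{U}(\beta)$, there exists some $p \in U(\beta+1)$ such that

$\mathfrak{U}(\beta+1) \models fp = q \wedge p \in A$

for all $A \in \mathcal{G}$. Since $\mathfrak{U}(\beta+1) \prec \mathfrak{U}(\alpha) = {}^*\mathfrak{U}$, we also have ${}^*\!\models fp = q \wedge p \in A$
for all $A \in \mathcal{G}$. Thus $fp = q$, where $p \in \hat{A}$ for all $A \in \mathcal{G}$; hence $p \in \mu\mathcal{G}$ and
$q \in f[\mu\mathcal{G}]$, as required. ∎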

3.29. Problem. Show that Thm. 3.28 continues to hold if the assumption
that ${}^*\mathfrak{U}$ is a limit enlargement is replaced by the assumption that $\mathcal{G}$ is
countable. (Let $\mathcal{G} = \{A_n : n \in N\}$. Without loss of generality we may assume
$A_{n+1} \subseteq A_n$ for all $n \in N$. If $q \in \mu f\mathcal{G}$, consider the collection $\{A : q \mathrel{{}^*\!\in} f[A]\}$
and use the Overspill Lemma 2.11.)
3.30. Problem. Consider $\hat{X}$ and $\hat{Y}$ with the monadic topology (Prob. 3.21).
Show that the natural extension $f^*$ of $f$ is clopen (maps clopen sets onto
clopen sets) and continuous. Assuming that ${}^*\mathfrak{U}$ is a limit enlargement show
that $f^*$ is closed (maps closed sets onto closed sets).

3.31. Problem. Let $\mathcal{L}'$ be a first-order language with equality. For the
sake of simplicity we assume that the only extralogical symbol of $\mathcal{L}'$ is
a binary predicate symbol $\mathbf{R}$. (Do not confuse $\mathcal{L}'$ with $\mathcal{L}$. In particular,
our $\mathfrak{U}$ and ${}^*\mathfrak{U}$ are $\mathcal{L}$-structures but in general they are not $\mathcal{L}'$-structures.)
Fix a non-empty set $A \in U$ and a binary relation $R \subseteq A^2$. Let $\mathfrak{A}$ be the
$\mathcal{L}'$-structure $\langle A, R\rangle$ and let $\mathfrak{A}^*$ be the $\mathcal{L}'$-structure whose domain is $\hat{A}$ and
whose binary relation is $\{\langle a,b\rangle : {}^*\!\models \langle a,b\rangle \in R\}$.
(i) Prove that $\mathfrak{A}^*$ is an enlargement and in particular an $\mathcal{L}'$-elementary
extension of $\mathfrak{A}$.
(ii) Let $I \in U$ be a non-empty set and let $f, g \in \mathrm{Fun}(I,A)$. Show that if
$\mathcal{F}$ is a filter over $I$ then $\{p \in I : fp = gp\} \in \mathcal{F}$ iff $fp = gp$ for all $p \in \mu\mathcal{F}$.
(iii) Let $I \in U$, and let $\mathcal{F}$ be an ultrafilter over $I$. Consider the ultrapower
$\mathfrak{A}^I/\mathcal{F}$. For any $p \in \mu\mathcal{F}$ we define the Hirschfeld mapping $\psi_p$ by putting
$\psi_p(f/\mathcal{F}) = fp$ for each $f \in \mathrm{Fun}(I,A)$. Note that by (ii) this definition is
legitimate. Prove that $\psi_p$ is an elementary embedding of $\mathfrak{A}^I/\mathcal{F}$ into $\mathfrak{A}^*$.
Thus $\mathfrak{A}^*$ is a “universal envelope” for all ultrapowers of the form $\mathfrak{A}^I/\mathcal{F}$,
where $I \in U$.

§ 4. Topology

In this section, as in §3, we fix a non-empty $X \in U$. We consider a topology
$\mathcal{T}$ on $X$; so $\mathcal{T} \in PPX$ and the members of $\mathcal{T}$ are the open subsets of $X$.
Unless otherwise indicated, $\mathcal{T}$ is held fixed; and we shall often commit
the peccadillo of saying e.g. “the space $X$” when we actually mean the
topological space $\langle X, \mathcal{T}\rangle$.
To prevent confusion, we shall only use the bar “¯” to denote the closure
operation in the space $X$, not for monadic closure in $\hat{X}$, nor for the generated
filter.
For each point $p \in X$ we let $\mathcal{F}_p$ be the filter of neighbourhoods of $p$. Thus

$\mathcal{F}_p = \{A : p \in T \subseteq A \text{ for some } T \in \mathcal{T}\}$.

We put

$\mu(p) = \mu\mathcal{F}_p$,

and call $\mu(p)$ the monad of $p$. The mapping $\mu(\cdot)$ thus defined, which maps
$X$ into $P\hat{X}$, is called the monadology of the space $X$. (When we want to
stress the dependence of the monadology on $\mathcal{T}$ we write “$\mu_{\mathcal{T}}$”; but most
of the time this is not needed, since $\mathcal{T}$ is held fixed or determined by the
context.) If $q \in \mu(p)$ we write

$q \simeq p$

and say that $q$ is near (or infinitely close to) $p$. Note that in general $\simeq$ is
not an equivalence relation on $\hat{X}$. In fact, since $\mu(p)$ has been defined only
for a point $p$, the statement “$q \simeq p$” is meaningful only if $p$ is a point (and
hence standard); but $q$ can be any *point, standard or not.
If $q \simeq p$ for some $p \in X$, we say that the *point $q$ is near-standard. For
any $p \in X$ we have $p \in \mu(p)$. (Problem: Prove this.) Thus every point of
$X$ is near-standard. But, as we shall see, a near-standard *point is not
necessarily standard.
A *point which is not near-standard is said to be remote.
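
Two extreme cases may help fix these notions. The following is a sketch computed from Exs. 3.12 and 3.14; the choice of the discrete and indiscrete topologies as illustrations is made here and is not part of the text.

```latex
% Discrete topology (every subset of X is open):
% F_p is the principal filter generated by {{p}}, so by Exs. 3.12 and 3.14
\[
\mathcal{F}_p = \{\, A \subseteq X : p \in A \,\}, \qquad \mu(p) = \widehat{\{p\}} = \{p\}.
\]
% Hence the near-standard *points are exactly the points of X,
% and every nonstandard *point (every member of \hat{X} - X) is remote.

% Indiscrete topology (only \emptyset and X are open):
\[
\mathcal{F}_p = \{X\}, \qquad \mu(p) = \hat{X}.
\]
% Hence every *point is near every point, and there are no remote *points.
```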

It is helpful to visualize $\mu(p)$ as a very small cluster of dots or corpuscules
(*points) around the point $p$. Since in general $\mu(p)$ is external (Thm. 3.22) it
should be visualized as having no sharp boundary. The remote *points
do not belong to any $\mu(p)$.
We shall give nonstandard characterizations of various standard
topological notions. Many of these characterizations accord so well with
intuition that they may give the reader a feeling of déjà connu.

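4.1. Theorem. Let $B \subseteq X$. Then $B$ is open iff $p \in B \Rightarrow \mu(p) \subseteq \hat{B}$ (i.e., $q \simeq p \in B$
$\Rightarrow q \mathrel{{}^*\!\in} B$). $B$ is closed iff $\mu(p) \cap \hat{B} \neq \emptyset \Rightarrow p \in B$ (i.e., $p \in X$, $q \simeq p$, $q \mathrel{{}^*\!\in} B \Rightarrow p \in B$).
Proof. $B$ is open iff $B \in \mathcal{F}_p$ for every $p \in B$. This means that $\mu\mathcal{F}_p \subseteq \hat{B}$ for
every $p \in B$. But $\mu\mathcal{F}_p = \mu(p)$ by definition.
The second half of the theorem follows at once, since $B$ is closed iff
$X - B$ is open.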

4.2. Problem. Let $A \subseteq X$ and $p \in X$. Show that $p \in \overline{A}$ iff $\mu(p) \cap \hat{A} \neq \emptyset$.

We started by defining the monadology $\mu(\cdot)$ from the topology $\mathcal{T}$.
Now we see from Thm. 4.1 that the topology can be recovered from the
monadology. Therefore every statement about the topology can, in principle,
be replaced by an equivalent statement about the monadology.
Let $\lambda$ be a mapping of $X$ into $P\hat{X}$. By what we have just seen, there can be
at most one topology on $X$ such that $\lambda$ is the corresponding monadology.
Under what conditions does such a topology exist?

4.3. Theorem. The following three conditions are necessary and jointly
sufficient in order that $\lambda$ be the monadology of some topology on $X$:
(i) $\lambda(p)$ is monadic for every $p \in X$.
(ii) $p \in \lambda(p)$ for every $p \in X$.
(iii) For every $r \in X$, if $A \in F\lambda(r)$ then there is some $B \in F\lambda(r)$ such that
$A \in F\lambda(q)$ for each $q \in B$.
Proof. The necessity of (i) and (ii) is obvious. The necessity of (iii) follows
easily from the fact that if $A$ is a neighbourhood of $r$ then some open
neighbourhood $B$ of $r$ is included in $A$.
To prove the sufficiency of (i), (ii) and (iii) we put

$\mathcal{T} = \{G \subseteq X : G \in F\lambda(r) \text{ for all } r \in G\}$.

We leave to the reader the simple task of verifying (without using (i), (ii)
and (iii)) that $\mathcal{T}$ is a topology on $X$ and $\lambda(p) \subseteq \mu_{\mathcal{T}}(p)$ for each $p \in X$. We shall
show that $\mu_{\mathcal{T}}(p) \subseteq \lambda(p)$.
Let $A \in F\lambda(p)$. We put

$G = \{q \in X : A \in F\lambda(q)\}$.

We claim that $p \in G \subseteq A$ and that $G$ is open (i.e., belongs to $\mathcal{T}$).
Evidently, $p \in G$ since we have assumed $A \in F\lambda(p)$. Also, if $q \in G$ then
$A \in F\lambda(q)$, hence $\lambda(q) \subseteq \hat{A}$. By (ii) we have $q \in \hat{A}$, and since both $q$ and $A$
are standard this means $q \in A$. Thus $G \subseteq A$. To show that $G$ is open, suppose
that $r \in G$. Then $A \in F\lambda(r)$ and by (iii) there is some $B \in F\lambda(r)$ such that
$A \in F\lambda(q)$ for all $q \in B$. It follows at once that $B \subseteq G$; and since $B \in F\lambda(r)$,
we must also have $G \in F\lambda(r)$. This shows that $G \in \mathcal{T}$.
Since $p \in G \subseteq A$ and $G$ is open, it follows from Thm. 4.1 that $\mu_{\mathcal{T}}(p) \subseteq \hat{G} \subseteq \hat{A}$.
Thus we have established that $\mu_{\mathcal{T}}(p) \subseteq \hat{A}$ for all $A \in F\lambda(p)$. Hence
$\mu_{\mathcal{T}}(p) \subseteq \mu F\lambda(p)$; but from (i) we see that $\mu F\lambda(p) = \lambda(p)$. ∎

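If $q \in \hat{X}$ and there is a unique $p \in X$ such that $q \simeq p$, we put ${}^\circ q = p$ and
call it the standard approximation of $q$. If no such unique $p$ exists, then ${}^\circ q$
is undefined.
More generally, if $A \mathrel{{}^*\!\subseteq} X$, we put

${}^\circ\! A = \{p \in X : \mu(p) \cap \hat{A} \neq \emptyset\}$.

${}^\circ\! A$ is called the standard approximation of $A$.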

4.4. Problem. Let $A \mathrel{{}^*\!\subseteq} X$. Show that each of the following two conditions
is sufficient for ${}^\circ\! A$ to be closed:
(i) ${}^*\mathfrak{U}$ is a limit enlargement.
(ii) For each $p \in X$ the filter $\mathcal{F}_p$ has a countable base.
(Use the standard characterization of closed sets. For (i), let $\{\mathfrak{U}(\beta) : \beta \le \alpha\}$
be an enlargement chain as described in Def. 1.6, with $\alpha$ a limit ordinal,
$\mathfrak{U}(0) = \mathfrak{U}$, $\mathfrak{U}(\alpha) = {}^*\mathfrak{U}$. Take $\beta < \alpha$ such that $A \in U(\beta)$. For (ii), let $p \in \overline{{}^\circ\! A}$ and
let $\mathcal{B} = \{B_n : n \in N\}$ be a base for $\mathcal{F}_p$. Without loss of generality assume
$B_{n+1} \subseteq B_n$ for all $n$. Consider the collection $\{B \mathrel{{}^*\!\in} \mathcal{B} : A \mathbin{{}^*\!\cap} B \neq \emptyset\}$ and use
the Overspill Lemma 2.11.)

We return to the more pleasant pastime of finding nonstandard
characterizations of topological concepts.
4.5. Problem. Prove:
(i) A point $p$ is isolated (i.e., $\{p\}$ is open) iff $\mu(p) = \{p\}$. (Use 4.1.)
(ii) $X$ is a $T_0$ space (i.e., if $p, q$ are distinct points then $p \notin \overline{\{q\}}$ or
$q \notin \overline{\{p\}}$) iff for any distinct points $p, q$ we have $q \notin \mu(p)$ or $p \notin \mu(q)$. (Use 4.2.)
(iii) $X$ is a $T_1$ space (i.e., $\{p\}$ is closed for every $p \in X$) iff for any
distinct points $p, q$ we have $p \notin \mu(q)$.

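4.6. Theorem. $X$ is a $T_2$ (Hausdorff) space iff the monads of distinct points
are disjoint.
Proof. If $X$ is $T_2$ and $p, q$ are distinct points, then there are $A \in \mathcal{F}_p$, $B \in \mathcal{F}_q$
such that $A \cap B = \emptyset$. Hence $\hat{A} \cap \hat{B} = \widehat{A \cap B} = \emptyset$; and since $\mu(p) \subseteq \hat{A}$ and
$\mu(q) \subseteq \hat{B}$, also $\mu(p) \cap \mu(q) = \emptyset$.
Conversely, suppose $\mu(p) \cap \mu(q) = \emptyset$. By Thm. 3.3, $\mathcal{F}_p$ and $\mathcal{F}_q$ have tiny
*members $D$ and $E$ respectively. Since $\hat{D} \subseteq \mu(p)$ and $\hat{E} \subseteq \mu(q)$, we have
$\emptyset = \hat{D} \cap \hat{E} = \widehat{D \mathbin{{}^*\!\cap} E}$. Therefore

$(*)\models \exists x \exists y(x \in \mathcal{F}_p \wedge y \in \mathcal{F}_q \wedge x \cap y = \emptyset)$.

Thus $p$ and $q$ have disjoint neighbourhoods. ∎

Note that if $q$ is a near-standard *point of a Hausdorff space then from
Thm. 4.6 it follows that ${}^\circ q$ is defined, because the condition $q \in \mu(p)$
determines $p$ uniquely.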

4.7. Problem. The space $X$ is regular if for each $p \in X$ the filter $\mathcal{F}_p$ is generated
by its closed members. Prove that $X$ is regular iff for each point $p$ and
each *point $q \notin \mu(p)$ there are disjoint open sets $A$, $B$ such that $p \in A$ and
$q \in \hat{B}$.
4.8. Problem. $X$ is normal if whenever $A$, $B$ are disjoint closed sets there
are disjoint open sets $C$, $D$ such that $A \subseteq C$ and $B \subseteq D$. Prove that $X$ is
normal iff for any *points $p$, $q$ such that $p \in \hat{A}$, $q \in \hat{B}$ for some disjoint closed
sets $A$, $B$, there are disjoint open sets $C$, $D$ such that $p \in \hat{C}$, $q \in \hat{D}$.
4.9. Problem. Let $\mathcal{F}$ be a proper filter over $X$, and let $p \in X$. Then $p$ is
said to be adherent to $\mathcal{F}$ if every member of $\mathcal{F}_p$ intersects every member
of $\mathcal{F}$. $\mathcal{F}$ is said to converge to $p$ if $\mathcal{F}_p \subseteq \mathcal{F}$. Prove:
(i) $p$ is adherent to $\mathcal{F}$ iff $\mu(p) \cap \mu\mathcal{F} \neq \emptyset$. (Use 3.19(i).)
(ii) $\mathcal{F}$ converges to $p$ iff $\mu\mathcal{F} \subseteq \mu(p)$.
Hence prove:
(iii) $p$ is adherent to $\mathcal{F}$ iff $\mathcal{F}$ can be extended to an ultrafilter converging
to $p$.
(iv) If $X$ is a Hausdorff space, $\mathcal{F}$ cannot converge to more than one point.

We now come to the nonstandard characterization of compact sets.
This is probably the single most useful tool of nonstandard analysis.

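4.10. Theorem. Let $C \subseteq X$. Then $C$ is compact iff $\hat{C} \subseteq \bigcup\{\mu(p) : p \in C\}$; i.e.,
iff for each $q \in \hat{C}$ we have $q \simeq p$ for some $p \in C$.
Proof. Suppose first that $C$ is compact and let $q \notin \bigcup\{\mu(p) : p \in C\}$. We must
show that $q \notin \hat{C}$. Put

$\mathcal{B} = \{B : B \in \mathcal{T},\ q \notin \hat{B}\}$.

If $p \in C$ then, by assumption, $q \notin \mu(p)$. Since $\mathcal{F}_p$ is generated by its open
members, there must exist some open neighbourhood $B$ of $p$ such that
$q \notin \hat{B}$; hence $p \in B \in \mathcal{B}$. Thus $\mathcal{B}$ is an open cover of $C$. Since $C$ is compact,
there are $B_1,\dots,B_k \in \mathcal{B}$ such that

$(*)\models C \subseteq B_1 \cup \dots \cup B_k$.

But $q \notin \hat{B}_1 \cup \dots \cup \hat{B}_k = \widehat{B_1 \cup \dots \cup B_k}$. Hence $q \notin \hat{C}$, as required.
Conversely, suppose that $C$ is not compact. Then there is an open cover
$\mathcal{B}$ of $C$ such that no finite subset of $\mathcal{B}$ covers $C$. Consequently,

$\{x \in C \wedge x \notin B : B \in \mathcal{B}\}$

is a $\mathfrak{U}$-concurrent collection of formulas. Thus for some $q \in \hat{C}$ we must
have $q \notin \hat{B}$ for all $B \in \mathcal{B}$. If $p \in C$, then $p \in B$ for some $B \in \mathcal{B}$. Then $\mu(p) \subseteq \hat{B}$
by Thm. 4.1; therefore $q \notin \mu(p)$. Thus $\hat{C}$ is not included in $\bigcup\{\mu(p) : p \in C\}$. ∎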

The standard definition of compactness involves quantification over
sets of sets of points: For every $\mathcal{B} \in PPX$, if $\mathcal{B}$ is an open cover of $C$ then
some finite part of $\mathcal{B}$ covers $C$. The nonstandard characterization, on the
other hand, refers only to *points: Every *point of $C$ is near some point
of $C$. Thus Thm. 4.10 constitutes a far-reaching simplification (technically
as well as heuristically) of the notion of compactness; hence the great
usefulness of this result. For this reason nonstandard analysis is particularly
helpful in problems that involve compactness in an essential way.
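
To see the criterion at work in a familiar case, here is a sketch under assumptions not made in the text: $X = \mathbb{R} \in U$ with the usual topology, $C = (0,1)$, and the order $<$ on $\mathbb{R}$ available in $\mathfrak{U}$. The names $\varepsilon$ and $B_p$ are introduced only for this illustration. The argument follows the second half of the proof of Thm. 4.10 and shows that $(0,1)$ is not compact.

```latex
% The collection below is \mathfrak{U}-concurrent: for finitely many a_1,...,a_k in C,
% the point x = (1/2) min(a_1,...,a_k) lies in C and is below each a_i.
\[
\{\, x \in C \wedge x < a \;:\; a \in C \,\}
\]
% Hence some \varepsilon \in \hat{C} satisfies
\[
{}^*\!\models \varepsilon < a \quad\text{for all } a \in C .
\]
% For p in C put B_p = (p/2, 1), an open set containing p. Since p/2 \in C we have
% *|= \varepsilon < p/2, so \varepsilon \notin \hat{B}_p, while \mu(p) \subseteq \hat{B}_p by Thm. 4.1:
\[
\varepsilon \notin \mu(p) \quad\text{for every } p \in C,
\qquad\text{so}\qquad
\hat{C} \not\subseteq \bigcup\{\mu(p) : p \in C\}.
\]
```

Intuitively, $\varepsilon$ is a *point of $(0,1)$ lying below every standard member of the interval, so it cannot be near any of them.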


4.11. Example. Let $X$ be a Hausdorff space, and let $C \subseteq X$ be compact.
We show that $C$ is closed. Let $q \simeq p$ and $q \in \hat{C}$. By Thm. 4.10 we have
$q \simeq r$ for some $r \in C$; hence, by Thm. 4.6, $p = r \in C$. By Thm. 4.1, $C$ is closed.

4.12. Problem. Use Thms. 4.1 and 4.10 to show that each closed subset
of a compact set is compact.
4.13. Problem. Use Prob. 4.9 and Thm. 4.10 to show that for any $C \subseteq X$
the following three conditions are equivalent:
(i) $C$ is compact.
(ii) Every proper filter $\mathcal{F}$ such that $C \in \mathcal{F}$ has an adherent point $p \in C$.
(iii) Every ultrafilter $\mathcal{F}$ such that $C \in \mathcal{F}$ converges to some $p \in C$.
4.14. Problem. For any $C \subseteq X$ put

$\mathcal{F}_C = \{B : C \subseteq T \subseteq B \text{ for some } T \in \mathcal{T}\}$.

Prove that $\bigcup\{\mu(p) : p \in C\} = \mu\mathcal{F}_C$ iff $C$ is compact. (Observe that for any $C$,
$\bigcup\{\mu(p) : p \in C\} \subseteq \mu\mathcal{F}_C$. If $C$ is compact, argue as in the proof of 4.10 to
show that the reverse inclusion holds. For the converse, observe that
$\hat{C} \subseteq \mu\mathcal{F}_C$.)
4.15. Problem. Let $X \subseteq X' \in U$, and let $X'$ be a Hausdorff compactification
of $X$. Thus $X'$ has a Hausdorff topology $\mathcal{T}'$ under which $X'$ is compact,
$X$ is dense in $X'$ and the topology induced on $X$ by $\mathcal{T}'$ is the original
topology $\mathcal{T}$ of $X$. For each $p \in \hat{X}$, let $\varphi(p)$ be the point of $X'$ such that
$p \simeq \varphi(p)$ in the monadology of $X'$. (Since $X'$ is a compact Hausdorff
space, $\varphi(p)$ is uniquely defined.)
(i) Show that $\varphi$ maps $\hat{X}$ onto $X'$.
(ii) Consider $\hat{X}$ in the topology $\varphi^{-1}\mathcal{T}'$, so that $X'$ is a quotient space
of $\hat{X}$, with $\varphi$ as the quotient mapping. Show
that $\varphi^{-1}\mathcal{T}'$ is equal to or coarser than the monadic topology (Prob. 3.21)
on $\hat{X}$. (Use 4.12 and 4.14 to show that any subcollection of $\hat{X}$ closed in
the topology $\varphi^{-1}\mathcal{T}'$ is the monad of a filter.)
(iii) Show that the topology induced by $\varphi^{-1}\mathcal{T}'$ on $X$ (as a subcollection
of $\hat{X}$) is the original topology $\mathcal{T}$. (Use the fact that $\varphi$ restricted to $X$ is
the identity.)
(iv) Show that (in the topology $\varphi^{-1}\mathcal{T}'$) $\hat{X}$ is a compactification of $X$.
(Use (ii), (iii) and parts (ii) and (iii) of Prob. 3.21.)

4.16. Theorem. Let $A \subseteq X$. In order that $\bar{A}$ be compact it is necessary that
each $q \in \hat{A}$ be near-standard. This condition is also sufficient if X is a regular
space.

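Proof. If $\bar{A}$ is compact, then every *point of $\bar{A}$, and in particular every
*point of A, is near a point of $\bar{A}$, and hence near-standard.
Now suppose that X is regular and every $q \in \hat{A}$ is near-standard. Take
an arbitrary $r \in \hat{\bar{A}}$; to show that $\bar{A}$ is compact, we must prove that $r \approx p$
for some $p \in \bar{A}$. Put

$$\mathcal{G} = \{B : r \;{}^*\!\!\in B \in \mathcal{T}\}.$$

It is easy to see that $\mathcal{G}$ is closed under finite intersections. Also, if $B \in \mathcal{G}$
then $r \in (\bar{A} \cap B)^{\wedge}$, hence $\bar{A} \cap B \neq \emptyset$; and since B is open, also $A \cap B \neq \emptyset$.
It follows that the collection

$$\{x \in A \wedge x \in B : B \in \mathcal{G}\}$$

is U-concurrent. Hence for some $q \in \hat{A}$ we have $q \in \hat{B}$ for all $B \in \mathcal{G}$.
By assumption, $q \approx p$ for some $p \in X$. In fact, $p \in \bar{A}$ by Prob. 4.2. We
claim that also $r \approx p$. Indeed, if this were not so, then by the regularity
of X there would be some closed $D \in \mathcal{F}_p$ such that $r \notin \hat{D}$, hence $X - D \in \mathcal{G}$
and therefore $q \notin \hat{D}$. But we must have $q \in \hat{D}$, since $q \approx p$ and $D \in \mathcal{F}_p$. ∎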

4.17. Counter-Example. The condition of Thm. 4.16 may not be sufficient
if X is not regular. For instance, let X be the unit square

$$\{\langle a,b\rangle : 0 \le a \le 1,\ 0 \le b \le 1\}.$$

As open subsets of X we take all sets of the form

$$B - \{\langle 0,b\rangle : 0 < b \le r\},$$

where B is open in the usual (metric) topology of X and $0 \le r$. (It is easy
to check that this is indeed a topology on X.) The remote *points of X
are precisely those $p \;{}^*\!\!\in \{\langle 0,b\rangle : 0 < b \le 1\}$ such that $p \approx \langle 0,0\rangle$ in the
monadology of the metric topology of X. Thus X is not compact. But if

$$A = X - \{\langle 0,b\rangle : 0 \le b \le 1\},$$

then every *point of A is near-standard while $\bar{A} = X$.

4.18. Problem. Show that the following four conditions are equivalent:
(i) X is locally compact.
(ii) Every convergent ultrafilter has a compact member.
(iii) Every near-standard *point of X *belongs to some compact subset
of X.
(iv) The collection of all remote *points of X is the monad of a filter
which is generated by the set of all complements of compact sets.


Show also that if X is assumed to be regular then (iv) implies the other
three conditions even if the assumption on the way the filter is generated is
omitted.

We now consider, in addition to X, a non-empty set $Y \in U$ and some fixed
topology on Y. If $q \in Y$, we denote the filter of neighbourhoods of q by
“$\mathcal{F}_q$” and the monad of q by “$\mu(q)$”. No confusion will arise, because it
will always be clear from the context which space is being referred to.

4.19. Theorem. Let $f \in \mathrm{Fun}(X, Y)$, and let $p \in X$. Then f is continuous at p
iff $f[\mu(p)] \subseteq \mu(fp)$.
Proof. By definition, f is continuous at p iff for every $B \in \mathcal{F}_{fp}$ we have
$f^{-1}[B] \in \mathcal{F}_p$. This is equivalent to $f^{-1}\mathcal{F}_{fp} \subseteq \mathcal{F}_p$. This in turn is equivalent
to $\mu(f^{-1}\mathcal{F}_{fp}) \supseteq \mu(p)$. By Thm. 3.26, this is the same thing as

$$f^{-1}[\mu(fp)] \supseteq \mu(p), \quad\text{i.e.,}\quad f[\mu(p)] \subseteq \mu(fp).\ ∎$$

Thm. 4.19 can be rephrased equivalently as follows: The function
$f \in \mathrm{Fun}(X, Y)$ is continuous at $p \in X$ iff $q \approx p \Rightarrow fq \approx fp$, i.e., whenever q is
infinitely close to p then fq is infinitely close to fp. This is what mathematicians
always wanted to say about continuity but didn’t quite know how,
without sacrificing rigour. In nonstandard analysis, this highly intuitive
characterization of continuity is at the same time completely rigorous.
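
As a concrete illustration of this criterion, here is a small worked check for the squaring function on the reals. It is only a sketch: the choice of $f(x) = x^2$ is ours, not the text's, and it borrows from §6 the facts that the infinitesimal *reals are the members of $\mu(0)$ and that a finite *real times an infinitesimal is infinitesimal.

```latex
% Assumption (illustrative): f(x) = x^2 on R, p = a a standard real,
% and q = a + \delta with \delta \approx 0.
\begin{align*}
f(q) - f(a) &= (a+\delta)^2 - a^2 \\
            &= \delta\,(2a + \delta).
\end{align*}
% 2a + \delta is finite and \delta is infinitesimal, so the product is
% infinitesimal. Hence
\[
  q \approx a \;\Longrightarrow\; f(q) \approx f(a),
\]
% which, by Thm. 4.19, is continuity of f at a.
```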

4.20. Example. Let $f \in \mathrm{Fun}(X, Y)$ be continuous, and let $C \subseteq X$ be compact.
To show that f[C] is compact, take an arbitrary *point of f[C]. It must be
of the form fq where $q \;{}^*\!\!\in C$. Since C is compact, $q \approx p$ for some $p \in C$.
But f is continuous, so $fq \approx fp \in f[C]$.
4.21. Example. Let f, g be continuous $\in \mathrm{Fun}(X, Y)$. Assuming that Y is
a Hausdorff space, we show that the set $A = \{p \in X : fp \neq gp\}$ is open. Indeed,
let $q \approx p \in A$. Then by the continuity of f and g we have $fq \approx fp$ and $gq \approx gp$.
But $fp \neq gp$ because $p \in A$. Since Y is a Hausdorff space, it follows that
$fq \neq gq$, i.e., $q \in \hat{A}$.
4.22. Problem. Let $f \in \mathrm{Fun}(X, Y)$ and $p \in X$. Then f is said to be open at
p if $f\mathcal{F}_p \subseteq \mathcal{F}_{fp}$.
(i) Show that if $f[\mu(p)] \supseteq \mu(fp)$ then f is open at p.
(ii) Assuming that *U is a limit enlargement of U or that $\mathcal{F}_p$ has a
countable base, prove the converse of (i).
4.23. Problem. Let $N \in U$, and let $f \in \mathrm{Fun}(N, X)$. (Thus f is a sequence of
points.)
(i) Show that f converges to $p \in X$ iff $fn \approx p$ for all $n \in \hat{N} - N$.
(ii) Show that $p \in X$ is a cluster point of f iff $fn \approx p$ for some $n \in \hat{N} - N$.

Now let X be the topological product of a family $\{X_i : i \in I\}$ of topological
spaces. Thus each point of X is a function f with I as its domain and $fi \in X_i$
for all $i \in I$.
For each $i \in I$ we have $i \in \{i\} \in \langle i, fi\rangle \in f \in X$, hence $I \in \mathrm{P}\bigcup\bigcup\bigcup X$. Similarly,
$X_i \in \mathrm{P}\bigcup\bigcup\bigcup X$ for each $i \in I$. Since we are assuming that $X \in U$, it follows
from (2.3) and (2.4) that I and each of the $X_i$ belong to U.
If $g \in \hat{X}$ and $i \in \hat{I}$, then gi (the *value of g at i) is uniquely defined and belongs to $\hat{X}_i$.
Let $f \in X$ and for each $i \in I$ let $\pi_i$ be the canonical projection of X onto
$X_i$ (so that $\pi_i f = fi$). Then the filter $\mathcal{F}_f$ of all neighbourhoods of f in X is
generated by

$$\bigcup\{\pi_i^{-1}\mathcal{F}_{fi} : i \in I\},$$

where $\mathcal{F}_{fi}$ is the filter of neighbourhoods of fi in $X_i$. Using Prob. 3.19(i)
and Thm. 3.26 we have

$$\mu(f) = \bigcap\{\pi_i^{-1}[\mu(fi)] : i \in I\},$$

where $\mu(fi)$ is the monad of fi in the monadology of $X_i$. We have thus
proved:
4.24. Theorem. Let $f \in X$ and $g \in \hat{X}$. Then $g \in \mu(f)$ iff $gi \in \mu(fi)$ for all $i \in I$. ∎

Note that Thm. 4.24 says nothing about gi for $i \in \hat{I} - I$. Indeed, for
such i we have not defined $\mu(fi)$, because $X_i$ is not a topological space
but only a *topological *space.
4.25. Example. Assume that, for each $i \in I$, $X_i$ is a Hausdorff space. We
show that X too is a Hausdorff space. Indeed, if $f, g \in X$ and $h \in \mu(f) \cap \mu(g)$,
then, by Thm. 4.24, $hi \in \mu(fi) \cap \mu(gi)$ for all $i \in I$. Hence $fi = gi$ for all $i \in I$,
so $f = g$.
4.26. Example. Assume that, for each $i \in I$, $X_i$ is compact. We prove
(Tychonoff’s Theorem) that X is compact. Indeed, let $g \in \hat{X}$. For each
$i \in I$, $gi \in \hat{X}_i$; and since $X_i$ is compact, we can choose $a_i \in X_i$ such that $gi \approx a_i$.
(This requires the axiom of choice. But if the $X_i$ are Hausdorff spaces then
the $a_i$ are unique and the axiom of choice is not needed.) Put $fi = a_i$ for all
$i \in I$; then $f \in X$ and by Thm. 4.24 we have $g \approx f$ as required.

§ 5. Topological groups
Where there is an interplay between algebraic and topological notions,
nonstandard analysis often provides interesting insights. As an illustration
of this, we shall deal with the subject of topological groups. No previous
knowledge of this subject is needed for understanding our treatment; but
some such knowledge is obviously required for comparing our treatment
with the conventional (standard) approach.

5.1. Definition. A topological group is a triple $\langle G, \cdot, \mathcal{T}\rangle$ such that:
(i) $\langle G, \cdot\rangle$ is a group, i.e., $\cdot$ is a group operation on the set G;
(ii) $\langle G, \mathcal{T}\rangle$ is a topological space; i.e., $\mathcal{T}$ is a topology on G;
(iii) The mapping $\langle g,h\rangle \mapsto g \cdot h$ of the product space $G \times G$ into G, and
the mapping $g \mapsto g^{-1}$ of G into itself are continuous.
Throughout this section we assume (unless indicated otherwise) that
$\langle G, \cdot, \mathcal{T}\rangle$ is a fixed but arbitrary topological group such that $G \in U$. With
the customary (and harmless) abuse of terminology, we speak of G itself
as a “topological group”, when we really want to refer to $\langle G, \cdot, \mathcal{T}\rangle$. We let
e be the identity element of G.
The operation $\cdot$ on G can be extended in a natural way to an operation
on $\hat{G}$: if $g_1, g_2 \in \hat{G}$, we let $g_1 \cdot g_2$ be the unique $h \in \hat{G}$ such that ${}^*\mathfrak{U} \models g_1 \cdot g_2 = h$.
If $\sigma$ and $\tau$ are subcollections of $\hat{G}$, we put

$$\sigma \cdot \tau = \{g \cdot h : g \in \sigma,\ h \in \tau\},$$

$$\sigma^{-1} = \{g^{-1} : g \in \sigma\}.$$

If A and B are *subsets of G then $A \cdot B$ is that *subset of G whose *members
are precisely all products $a \cdot b$ with $a \;{}^*\!\!\in A$ and $b \;{}^*\!\!\in B$. Thus $(A \cdot B)^{\wedge} = \hat{A} \cdot \hat{B}$.
Similarly, $(A^{-1})^{\wedge} = \hat{A}^{-1}$.
From the fact that (under the operation $\cdot$) G is a group, with e as identity
element, it follows by transference that (under the operation $\cdot$ as extended
above) $\hat{G}$ is also a group, with e as identity element. Thus G is a subgroup
of $\hat{G}$. This result evidently holds also for ordinary groups, without any
particular topology.

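Using Thms. 4.19 and 4.24, we can re-state condition (iii) of Def. 5.1
as follows:

(1) $\mu(g) \cdot \mu(h) \subseteq \mu(g \cdot h)$ for all $g, h \in G$;

(2) $\mu(g)^{-1} \subseteq \mu(g^{-1})$ for all $g \in G$.

Hence we easily get

(3) $\mu(g) \cdot \mu(h)^{-1} \subseteq \mu(g \cdot h^{-1})$ for all $g, h \in G$.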
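
The passage from (1) and (2) to (3) is a two-step inclusion. Spelled out (a sketch of the step the text calls easy, with $\mu$ denoting the monad in G as above):

```latex
% For g, h in G: apply (2) to h, then (1) to the pair g, h^{-1}.
\begin{align*}
\mu(g)\cdot\mu(h)^{-1}
  &\subseteq \mu(g)\cdot\mu(h^{-1}) && \text{by (2)} \\
  &\subseteq \mu(g\cdot h^{-1})     && \text{by (1).}
\end{align*}
```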

Let J be the collection of all near-standard members of $\hat{G}$. Thus

(4) $J = \bigcup\{\mu(g) : g \in G\}$.



From (3) and (4) it follows at once that $J \cdot J^{-1} \subseteq J$, hence J is a subgroup
of $\hat{G}$.
Also, by (3), $\mu(e) \cdot \mu(e)^{-1} \subseteq \mu(e)$. Hence $\mu(e)$ is a subgroup of J. The
members of $\mu(e)$ are called the infinitesimals of the topological group G,
and $\mu(e)$ is called the infinitesimal group of G.
Using (1), (2) and the fact that $g \in \mu(g)$, we get for each $g \in G$,

$$g \cdot \mu(e) \subseteq \mu(g) \cdot \mu(e) \subseteq \mu(g) \subseteq g \cdot \mu(g^{-1}) \cdot \mu(g) \subseteq g \cdot \mu(e).$$

Thus $g \cdot \mu(e) = \mu(g)$. Similarly, $\mu(e) \cdot g = \mu(g)$. It follows that, for each $g \in G$,
the right coset and the left coset of g modulo $\mu(e)$ coincide with each other
and with $\mu(g)$. In view of (4) all cosets modulo $\mu(e)$ in J are of the form
$\mu(g)$, and hence $\mu(e)$ is a normal subgroup of J. To sum up:

5.2. Theorem. The collection J of near-standard members of $\hat{G}$ is a subgroup
of $\hat{G}$. The collection of infinitesimals $\mu(e)$ is a normal subgroup of J. The
cosets in J modulo $\mu(e)$ are precisely the monads of points $g \in G$. ∎
5.3. Problem. Prove that G is a Hausdorff space iff $\mu(e) \cap G = \{e\}$. Hence
prove that if G is a $T_0$ space (Prob. 4.5(iii)) then it is a Hausdorff space
and in this case the quotient group $J/\mu(e)$ is isomorphic to the group G.

5.4. Theorem. Suppose that G is given as a group (without any particular
topology). Let $\tau$ be a subgroup of $\hat{G}$. Then the two conditions
(a) $\tau$ is monadic, i.e., $\tau = \mu\mathcal{F}$ for some filter $\mathcal{F}$ over G,
(b) $\tau \cdot g = g \cdot \tau$ for every $g \in G$;
are necessary and jointly sufficient for the existence of a topology on G under
which G becomes a topological group with $\tau$ as its infinitesimal group.
Proof. The conditions are clearly necessary. To prove that they are
sufficient we put $\lambda(g) = \tau \cdot g$ for every $g \in G$ and show that the three conditions
of Thm. 4.3 are fulfilled.
(i) For any $g \in G$, let

$$\mathcal{F} \cdot g = \{A \cdot g : A \in \mathcal{F}\},$$

where $\mathcal{F}$ is the filter mentioned in (a), for which $\tau = \mu\mathcal{F}$. It is easy to
verify that $\mu(\mathcal{F} \cdot g) = \tau \cdot g = \lambda(g)$; so $\lambda(g)$ is monadic.
(ii) Since $\tau$ is a subgroup of $\hat{G}$, it contains e. Hence for every $g \in G$
we have $g \in \tau \cdot g = \lambda(g)$.
(iii) For every $g \in G$ and $A \in \mathcal{F}_{\lambda(g)}$ we have to find some $B \in \mathcal{F}_{\lambda(g)}$ such
that $A \in \mathcal{F}_{\lambda(h)}$ for each $h \in B$. It is enough to consider the case $g = e$, since
the general case can be obtained by translation (i.e., multiplication by g).

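We assume therefore that $A \in \mathcal{F}_{\lambda(e)} = \mathcal{F}$. If D is a tiny *member of $\mathcal{F}$ then
$\hat{D} \subseteq \tau$, and since $\tau$ is a group,

$$(D \cdot D)^{\wedge} = \hat{D} \cdot \hat{D} \subseteq \tau \subseteq \hat{A}.$$

Thus

$${}^*\mathfrak{U} \models \exists x\,(x \in \mathcal{F} \wedge x \cdot x \subseteq A).$$

It follows that $B \cdot B \subseteq A$ for some $B \in \mathcal{F}$. Therefore $\hat{B} \cdot \hat{B} \subseteq \hat{A}$ and $\tau \subseteq \hat{B}$.
Hence, if $h \in B$ we have

$$\lambda(h) = \tau \cdot h \subseteq \hat{B} \cdot \hat{B} \subseteq \hat{A};$$

so that $A \in \mathcal{F}_{\lambda(h)}$, as required.

It now follows from Thm. 4.3 that there is a topology $\mathcal{T}$ on G such that
$\mu_{\mathcal{T}}(g) = \tau \cdot g$ for every $g \in G$.
Using (b) and the fact that $\tau$ is a group, it is easy to verify that

$$\mu_{\mathcal{T}}(g) \cdot \mu_{\mathcal{T}}(h) = \mu_{\mathcal{T}}(g \cdot h), \qquad \mu_{\mathcal{T}}(g)^{-1} = \mu_{\mathcal{T}}(g^{-1}).$$

Hence G is a topological group under the topology $\mathcal{T}$.
Finally,

$$\mu_{\mathcal{T}}(e) = \tau \cdot e = \tau,$$

so that $\tau$ is the infinitesimal group of G. ∎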

For arbitrary near-standard h and g, we now re-define “$h \approx g$” to mean
that h and g are congruent modulo $\mu(e)$. This is consistent with the old
definition whenever the latter applies, i.e., when g is standard. According
to the present definition $\approx$ is clearly an equivalence relation on the collection
of near-standard points.

Conventionally, the structure of the topological group G is determined
by specifying:
(i) the purely algebraic group structure (the “multiplication table”);
and
(ii) the topology $\mathcal{T}$.
In the nonstandard approach, the topology is completely determined by
the monadology; and the latter is completely determined by the infinitesimal
group $\mu(e)$, because $\mu(g) = g \cdot \mu(e)$ for every $g \in G$. Thus the structure of the
topological group G is now determined by specifying the purely algebraic
structure together with the infinitesimal group, which is a suitable subgroup
of $\hat{G}$.

Consequently, the nonstandard treatment of topological groups is rather
more algebraic in flavour than the conventional treatment. Topological
considerations are often replaced by considerations involving the infinitesimal
group $\mu(e)$: e.g., calculations with congruences modulo $\mu(e)$.
This seems to be a natural explication of the intuition underlying the notion
of topological group.

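5.5. Example. Let $A, C \subseteq G$, where A is closed and C is compact. We show
that $A \cdot C$ is closed. By Thm. 4.1 we have to show that if $g \in G$ and $g \approx p \cdot q$
where $p \in \hat{A}$ and $q \in \hat{C}$, then $g \in A \cdot C$. But, since $q \in \hat{C}$ and C is compact,
we have $q \approx c$ for some $c \in C$. Hence $g \approx p \cdot c$, so that $g \cdot c^{-1} \approx p$. Since
$g \cdot c^{-1} \in G$, A is closed and $p \in \hat{A}$, it follows that $g \cdot c^{-1} \in A$. Hence $g \in A \cdot C$
as required.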
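
The step from $g \approx p \cdot q$ and $q \approx c$ to $g \cdot c^{-1} \approx p$ can be spelled out with inclusion (3) above. The following is one way to fill it in, and not necessarily the route the authors had in mind.

```latex
% Hypotheses: p\cdot q \in \mu(g) and q \in \mu(c), with g, c \in G.
\begin{align*}
p = (p\cdot q)\cdot q^{-1}
  &\in \mu(g)\cdot\mu(c)^{-1} \\
  &\subseteq \mu(g\cdot c^{-1}) && \text{by (3).}
\end{align*}
% Hence p \approx g\cdot c^{-1}, where g\cdot c^{-1} is a standard point of G.
```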
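5.6. Example. We show that every topological group is a regular space.
Given any closed $A \subseteq G$ and $g \in G - A$, we have to show that there are
disjoint open sets B and C such that $g \in B$ and $A \subseteq C$. We use the easily
established fact that if $D \in \mathcal{F}_e \cap \mathcal{T}$ (i.e., D is an open neighbourhood of e)
and E is any subset of G, then $E \cdot D$ is open and $E \subseteq E \cdot D$. Now, $\mathcal{F}_e \cap \mathcal{T}$
is a base for the filter $\mathcal{F}_e$; so we can take a tiny *member D of $\mathcal{F}_e \cap \mathcal{T}$.
Then $g \cdot D$ and $A \cdot D$ are *open, $g \;{}^*\!\!\in g \cdot D$ and $A \;{}^*\!\!\subseteq A \cdot D$. Also, $g \cdot D$ and $A \cdot D$
are *disjoint. (Otherwise, using the fact that $\hat{D} \subseteq \mu(e)$, we would conclude
that $g \approx q$ for some $q \in \hat{A}$; and since A is closed it would follow that $g \in A$.)
Hence

$${}^*\mathfrak{U} \models \exists x\,\exists y\,(x \in \mathcal{T} \wedge y \in \mathcal{T} \wedge g \in x \wedge A \subseteq y \wedge x \cap y = \emptyset).$$

Thus there are open disjoint sets B and C such that $g \in B$ and $A \subseteq C$, as
required.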

5.7. Problem. For each $A \subseteq G$, define subsets $A_d$ and $A_s$ of $G \times G$ by

$$A_d = \{\langle g,h\rangle : h \cdot g^{-1} \in A\},$$
$$A_s = \{\langle g,h\rangle : g^{-1} \cdot h \in A\}.$$

The right uniformity and left uniformity of G are the filters over $G \times G$ which
are generated by $\{A_d : A \in \mathcal{F}_e\}$ and $\{A_s : A \in \mathcal{F}_e\}$ respectively. Prove that
the two uniformities coincide iff $\mu(e)$ is a normal subgroup of $\hat{G}$.
5.8. Problem. Let H be a subgroup of G.
(i) Show that $\bar{H}$ is a subgroup of G. (Use Prob. 4.2.)
(ii) Show that H is closed iff $\mu(e) \cap \hat{H} = \mu(e) \cap \hat{\bar{H}}$.
(iii) Show that H is open iff $\mu(e) \subseteq \hat{H}$; hence show that every open
subgroup of G is closed.

5.9. Problem. Assume that *U is a limit enlargement. Let H be a normal
subgroup of G, and let f be the quotient map of G onto G/H (so that
$f(g) = g \cdot H$ for each $g \in G$).
(i) Show that, if G/H is given the quotient topology, it becomes a
topological group, with $f[\mu(e)]$ as infinitesimal group.
(ii) Show that (under the quotient topology) G/H is a Hausdorff space
iff H is closed in G.
5.10. Problem. In addition to G, consider another topological group
$G' \in U$ with e' as identity element. G is said to be locally isomorphic to G'
if there is some $V \in \mathcal{F}_e$ and some homeomorphism f of V onto a neighbourhood
of e' such that $f(g \cdot h) = f(g) \cdot f(h)$ whenever $g, h, g \cdot h \in V$.
Prove that G is locally isomorphic to G' iff there is some function /€ G
such that the natural extension of f maps $\mu(e)$ isomorphically onto the
infinitesimal group of G'.
5.11. Problem. Assume that $N \in U$. Let S be the set of all $f \in \mathrm{Fun}(N, G)$
such that $f(n) = e$ for almost all (i.e., all but finitely many) n. Let $\Pi \in \mathrm{Fun}(S, G)$
be defined by $\Pi f = f(0) \cdot f(1) \cdot \ldots \cdot f(n)$, where $f(m) = e$ for all $m > n$.
(i) Let $V \in \mathcal{F}_e$ and let

$$H = \{\Pi f : f \in S,\ f(n) \in V \cup V^{-1} \text{ for all } n\}.$$

Prove that H is an open, and hence closed, subgroup of G. (Use 5.8.)
(ii) Prove that if G is connected then each $g \in G$ is a *finite product of
infinitesimals: $g = \Pi f$ for some $f \in \hat{S}$ such that $f(n) \in \mu(e)$ for all $n \in \hat{N}$.

§ 6. The real numbers

From now on we assume that $N \in U$. Hence also $R \in U$, where R is the set
of real numbers, since each real number is (identified with) a subset of N.
In the usual topology, R is a commutative topological group with respect
to addition. The operation of addition extends in a natural way to $\hat{R}$, as
explained in §5. Since $\hat{R}$ is clearly a commutative group, the infinitesimal
group $\mu(0)$ is a normal subgroup of $\hat{R}$. The members of $\mu(0)$ are the
infinitesimal *reals. Clearly, 0 is infinitesimal; but by 4.5(i) there are
infinitesimals $\neq 0$, and by 4.5(iii) these are all non-standard.
For arbitrary $a, b \in \hat{R}$ we now re-define “$a \approx b$” to mean that a and b
are congruent modulo $\mu(0)$. This is consistent with the definition made in §5
whenever the latter applies, i.e., when a and b are near-standard.
The operation of multiplication, the absolute-value function $|\ |$, and
the ordering relation $<$ extend in a natural way from R to $\hat{R}$. We shall
denote these extensions by the same symbols that are used for R. Since
R is an ordered field, it follows easily by transference that $\hat{R}$ is an ordered
field. ($\hat{R}$ is clearly non-archimedean, but every $a \in \hat{R}$ has the *archimedean
property: $a < n$ for some $n \in \hat{N}$.)
We employ the usual notation for intervals. Thus for $r, s \in R$ we put

$$(r,s) = \{x \in R : r < x < s\}, \qquad [r,s] = \{x \in R : r \le x \le s\}.$$

Also, we let P be the set of all positive reals.

The set of symmetric intervals $\{(-r,r) : r \in P\}$ is a base for the filter of
neighbourhoods of 0 in R. Thus $\delta \approx 0$ iff $\delta \in (-r,r)^{\wedge}$ for all $r \in P$. This
means that $|\delta| < r$ for all $r \in P$.
If $a \in \hat{R}$ and $a^{-1} \approx 0$, then a is called an infinite *real. It is easy to see
that a is infinite iff $|a| > r$ for all $r \in P$.
A *real a is called finite if it is not infinite. This means that $|a| < r$ for
some $r \in P$. We let $\Phi$ be the collection of all finite *reals.
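
A short calculation, sketched here from the definitions just given (it is not part of the original text), shows why infinitesimals absorb finite factors and are closed under addition. This is the fact behind the later claim that $\mu(0)$ is an ideal in the ring of finite *reals.

```latex
% Let a be finite, say |a| \le r with r \in P, and let \delta \approx 0.
% For an arbitrary s \in P we have s/r \in P, hence |\delta| < s/r, and so
\[
  |a\,\delta| \;=\; |a|\,|\delta| \;\le\; r\,|\delta| \;<\; r\cdot\frac{s}{r} \;=\; s .
\]
% Since s \in P was arbitrary, a\delta \approx 0.
% Similarly, if \delta \approx 0 and \varepsilon \approx 0, then
\[
  |\delta + \varepsilon| \;\le\; |\delta| + |\varepsilon|
  \;<\; \tfrac{s}{2} + \tfrac{s}{2} \;=\; s
  \qquad\text{for every } s \in P ,
\]
% so \delta + \varepsilon \approx 0.
```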

6.1. Theorem. A *real is finite iff it is near-standard.
Proof. Let a be near-standard: $a \approx r$ for some $r \in R$. Then $a = r + \delta$, where
$\delta \approx 0$, so

$$|a| = |r + \delta| \le |r| + |\delta| \le |r| + 1;$$

hence a is finite. (The triangle inequality holds in $\hat{R}$ by transference.)
Conversely, if a is finite then $|a| < r$ for some $r \in P$. Hence $a \in [-r,r]^{\wedge}$;
but $[-r,r]$ is a compact set, hence, by Thm. 4.10, a is near-standard. ∎

6.2. Problem. Let $X \in U$ be a metric space, with distance function (metric) $\varrho$.
We extend $\varrho$ to a mapping of $\hat{X} \times \hat{X}$ into $\hat{R}$ in the usual way. A *point
$q \in \hat{X}$ is called finite if $\varrho(p,q)$ is finite for some (hence for every) point $p \in X$.
(i) Show that, for each $p \in X$, $\mu(p) = \{q \in \hat{X} : \varrho(p,q) \approx 0\}$.
(ii) Show that every near-standard $q \in \hat{X}$ is finite.
(iii) Show that if the converse of (ii) holds then X is locally compact.
6.3. Problem. Let X be as in Prob. 6.2 and let $f \;{}^*\!\!\in \mathrm{Fun}(N, X)$.
(i) Show that if fn is finite for every infinite $n \in \hat{N}$ then there is some
$m \in N$ such that fn is finite for all $n > m$. (Take any $p \in X$; consider the
collection

$$\{m : {}^*\mathfrak{U} \models m \in N \wedge \forall x\,[x \in N \wedge x > m \rightarrow \varrho(fx, p) < m]\}$$

and use Robinson’s Overspill Lemma 2.11.)
(ii) Let $fn \approx p$ for some $p \in X$ and all $n \in N$. Show that there is some
infinite $m \in \hat{N}$ such that $fn \approx p$ for all $n < m$.

Since R is a Hausdorff space, it follows from Thms. 4.6 and 6.1 that
${}^\circ a$ is defined for each $a \in \Phi$. For reasons that will soon be apparent, we
call ${}^\circ a$ the (standard) place of a. The mapping $a \mapsto {}^\circ a$ is called the (standard)
place mapping.
6.4. Theorem. $\Phi$ is a subring (with identity) of $\hat{R}$. The place mapping
is a homomorphism of $\Phi$ onto the field R of reals. The kernel of this
homomorphism is $\mu(0)$, which is therefore a maximal ideal in $\Phi$.
Proof. Let $a, b \in \Phi$. Then $|a| \le r$ and $|b| \le s$ for some $r, s \in P$. It follows that
$|a - b| \le r + s$ and $|ab| \le rs$, hence $a - b$ and $ab$ are in $\Phi$. Clearly, $1 \in \Phi$, so
$\Phi$ is a subring of $\hat{R}$, with identity.
For each $a, b \in \Phi$ we have $a \approx {}^\circ a$ and $b \approx {}^\circ b$. Since addition is a continuous
function from $R \times R$ into R, it follows that $a + b \approx {}^\circ a + {}^\circ b$, hence
${}^\circ(a+b) = {}^\circ a + {}^\circ b$. Similarly ${}^\circ(ab) = {}^\circ a \cdot {}^\circ b$. Also, ${}^\circ r = r$ for every $r \in R$. Hence the
place mapping is a homomorphism of $\Phi$ onto R.
Finally, it is clear that ${}^\circ a = 0$ iff $a \in \mu(0)$. ∎
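
The multiplicative case, which the proof leaves to the reader with the word “Similarly”, can be checked along the following lines. This is a sketch: it writes a and b as their places plus infinitesimal remainders (the symbols $\delta$ and $\varepsilon$ are introduced here for illustration) and uses the fact that a finite *real times an infinitesimal is infinitesimal.

```latex
% a = {}^\circ a + \delta,  b = {}^\circ b + \varepsilon,
% with \delta, \varepsilon \in \mu(0).
\begin{align*}
ab &= ({}^\circ a + \delta)({}^\circ b + \varepsilon) \\
   &= {}^\circ a\,{}^\circ b
      + \bigl({}^\circ a\,\varepsilon + {}^\circ b\,\delta + \delta\varepsilon\bigr).
\end{align*}
% Each term in the bracket is (finite) x (infinitesimal), hence infinitesimal,
% and a sum of infinitesimals is infinitesimal. So
\[
  ab \approx {}^\circ a\,{}^\circ b ,
  \quad\text{and since } {}^\circ a\,{}^\circ b \in R ,\quad
  {}^\circ(ab) = {}^\circ a \cdot {}^\circ b .
\]
```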
By definition, the inverse of every infinite *real is infinitesimal, and hence
finite. Also, the infinitesimals are precisely the non-units of $\Phi$ (i.e., the
members of $\Phi$ which have no inverse in $\Phi$). Thus in the terminology of
valuation theory (see e.g. Jacobson [1964]) Thm. 6.4 means that $\Phi$ is a
valuation ring in $\hat{R}$, the place mapping is the canonical place of $\Phi$, and R is
the residue field of $\Phi$.
The canonical valuation of the valuation ring $\Phi$ provides an interesting
explication of the vague classical notion of order of magnitude of a given
(infinitesimal, finite or infinite) quantity. For any $a, b \in \hat{R}$, let us say that
$a \asymp b$ if $a = bc$ for some $c \in \Phi - \mu(0)$. It is easy to see that $\asymp$ is an equivalence
relation. We define $O(a)$ (read: the order of magnitude of a) to be the
$\asymp$-class of a. Clearly, $O(0) = \{0\}$.
6.5. Problem. For any $a, b \in \hat{R}$ define $O(a) \cdot O(b) = O(ab)$; and $O(a) \le O(b)$
if $a = bc$ for some $c \in \Phi$.
(i) Show that these definitions are legitimate (i.e., independent of the
choice of “representatives” a and b).
(ii) Show that (under the multiplication defined above) the collection
$G = \{O(a) : a \in \hat{R} - \{0\}\}$ is a commutative group, with $O(1)$ as identity
element.
(iii) Show that the relation $\le$ is a total ordering of the collection of all
orders of magnitude $G \cup \{O(0)\}$.
(iv) Let $H = \{O(a) : a \in \mu(0) - \{0\}\}$. Show that H is closed under multiplication
and $G = H \cup \{O(1)\} \cup H^{-1}$.

(v) Show that for any $a \in \hat{R}$, $O(a) \in H$ iff $O(0) < O(a) < O(1)$.
(vi) Show that $O(a+b) \le \max(O(a), O(b))$ for all $a, b \in \hat{R}$.

In the terminology of valuation theory, Prob. 6.5 means that O is a
valuation on $\hat{R}$. In fact, it is the canonical valuation of the valuation ring $\Phi$.

6.6. Problem. Show that the ordering of $\{O(a) : a \in \hat{R}\}$ is dense, with
first but without last member. (Use Prob. 7.9.14 (v).)

Starting from ideas explained so far in this chapter, various parts of
mathematics can be developed in an elegant and intuitively appealing way.
For such developments the reader is referred to the literature quoted in
§8 below.
We shall conclude our introductory treatment with a few examples.

6.7. Example. We prove the intermediate-value theorem for continuous
functions. Let f be a real function, defined and continuous in [0,1]. Let
$f(0) < 0 < f(1)$. Then it is easy to see that if n is any positive natural number,
there is a natural number $m < n$ such that

(1) $f\!\left(\dfrac{m}{n}\right) \le 0 < f\!\left(\dfrac{m+1}{n}\right)$.

Now take an infinite $n \in \hat{N}$. By transference, there is $m \in \hat{N}$, $m < n$, such that (1)
holds. Since [0,1] is compact, there exists $r \in [0,1]$ such that

$$\frac{m}{n} \approx r \approx \frac{m+1}{n}.$$

Because f is continuous at r, we clearly must have $f(r) = 0$.
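
The final step can be unpacked as follows. This is a sketch of the omitted reasoning rather than a quotation from the text; f is applied to *reals through its natural extension, and (1) is used only in the weak form $f(m/n) \le 0 \le f((m+1)/n)$.

```latex
% Continuity of f at r, with m/n \approx r \approx (m+1)/n, gives (Thm. 4.19)
\[
  f\!\left(\tfrac{m}{n}\right) \approx f(r) \approx f\!\left(\tfrac{m+1}{n}\right).
\]
% f(r) is a standard real.
% If f(r) > 0: f(m/n) = f(r) + \delta with \delta \approx 0, so
%   f(m/n) > f(r)/2 > 0, contradicting f(m/n) \le 0.
% If f(r) < 0: f((m+1)/n) = f(r) + \varepsilon with \varepsilon \approx 0, so
%   f((m+1)/n) < f(r)/2 < 0, contradicting 0 \le f((m+1)/n).
\[
  \text{Hence } f(r) = 0 .
\]
```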

6.8. Problem. Let f be a real function defined in an interval (r,s) and let
$a \in (r,s)$. Show that $f'(a)$ is defined and equal to b iff

$$\frac{f(a+\delta) - f(a)}{\delta} \approx b$$

for every non-zero infinitesimal $\delta$.
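
For instance, with the illustrative choice $f(x) = x^3$ (our example, not the book's), the criterion yields the familiar derivative by plain algebra:

```latex
% f(x) = x^3, a a standard real, \delta a non-zero infinitesimal.
\begin{align*}
\frac{f(a+\delta) - f(a)}{\delta}
  &= \frac{(a+\delta)^3 - a^3}{\delta} \\
  &= 3a^2 + 3a\,\delta + \delta^2 \\
  &\approx 3a^2 ,
\end{align*}
% because 3a\delta + \delta^2 = \delta(3a + \delta) is infinitesimal.
% So f'(a) is defined and equal to 3a^2.
```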
6.9. Problem. Let $G \in U$ be a Hausdorff, locally compact topological group.
Let $\mathcal{C}$ be the set of all compact subsets of G. Define a function $\lambda : \mathcal{F}_e \times \mathcal{C} \to N$
as follows. Let $B \in \mathcal{F}_e$ and $C \in \mathcal{C}$. Since C is compact and B includes an
open set, there exists an $n \in N$ such that for some $g_1, \ldots, g_n \in G$ we have
$C \subseteq \bigcup\{B \cdot g_i : i = 1, \ldots, n\}$. Let $\lambda(B,C)$ be the least such n. Now fix some
$K \in \mathcal{F}_e \cap \mathcal{C}$ (such K exists because G is locally compact) and fix also a tiny
*member D of $\mathcal{F}_e$.

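(i) Show that for each $C \in \mathcal{C}$ we have

$$\lambda(D,C) \le \lambda(D,K) \cdot \lambda(K,C),$$

hence we can define a mapping $\eta : \mathcal{C} \to R$ by putting, for each $C \in \mathcal{C}$,

$$\eta(C) = {}^\circ\!\left(\frac{\lambda(D,C)}{\lambda(D,K)}\right).$$

(ii) Show that $\eta$ is a (right) Haar measure on G; i.e., $\eta(C \cdot g) = \eta(C)$ for
all $C \in \mathcal{C}$ and $g \in G$, and $\eta(C_1 \cup C_2) = \eta(C_1) + \eta(C_2)$ for all disjoint $C_1, C_2 \in \mathcal{C}$.
(For the second assertion, show first that if $g \in \hat{G}$ then $C_i \cap D \cdot g$ cannot be
non-empty for both $i = 1$ and $i = 2$.)
(iii) Assuming that $g \cdot D \cdot g^{-1} = D$ for all $g \in G$, show that $\eta$ is also left
invariant; i.e., $\eta(g \cdot C) = \eta(C)$ for all $C \in \mathcal{C}$ and $g \in G$. (Show first that
$\lambda(D,C) = \lambda(D, g \cdot C \cdot g^{-1})$.)
(iv) Assuming that $\mu(e)$ is a normal subgroup of $\hat{G}$, prove that there
exists some tiny *member D of $\mathcal{F}_e$ such that $g \cdot D \cdot g^{-1} = D$ for all $g \in \hat{G}$.
(Take any tiny *member B of $\mathcal{F}_e$ and show that the collection
$\{g \cdot b \cdot g^{-1} : b \in \hat{B},\ g \in \hat{G}\}$ is internal and equal to $\hat{D}$ where D is as required.)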
6.10. Problem. Let G be a commutative, Hausdorff, locally compact
topological group “without small subgroups” (i.e., there is a neighbourhood
of e which does not include any non-trivial subgroup of G).
A one-parameter subgroup of G is a continuous homomorphism of the
additive group of reals R into G. Let M be the set of all one-parameter
subgroups of G. For all $f, g \in M$ and all $r, s \in R$ we put $(f+g)(r) = f(r) \cdot g(r)$
and $(sf)(r) = f(sr)$. We endow M with the “compact-open” topology, by
taking as sub-base all sets of the form $\{f \in M : f[C] \subseteq V\}$, where C is
compact $\subseteq R$ and V is open $\subseteq G$.
(i) Verify that under the above definitions M is a topological vector
space over the reals. Prove also that

$$\mu_M(0) = \{f \in \hat{M} : f(r) \approx e \text{ for all } r \in [-1,1]^{\wedge}\},$$

where 0 is the trivial one-parameter subgroup and $\mu_M(0)$ is its monad in the
monadology of M.
(ii) Since G is locally compact and has no “small subgroups”, we can
fix a compact $V \in \mathcal{F}_e$ such that V does not include any non-trivial subgroup
of G. We may also assume $V^{-1} = V$ (otherwise, we replace V by $V \cap V^{-1}$).
Having fixed such a V, we let X be the set of all $f \in M$ that map $[-1,1]$
into V. Prove that X is a compact neighbourhood of 0 in M, hence M is
finite dimensional. (Use (i) to show that $\mu_M(0) \subseteq \hat{X}$ and that every member
of $\hat{X}$ is near-standard. Then observe that a Hausdorff topological vector
space over the reals is finite dimensional iff it is locally compact.)
(iii) Let $\psi$ be the map of M into G defined by $\psi(f) = f(1)$. Prove that $\psi$
is a continuous homomorphism and show that the natural extension of $\psi$
maps $\mu_M(0)$ one-one into $\mu(e)$. (Note that $\mu_M(0)$ is the infinitesimal group
of M, regarded as an additive topological group. Using the assumption
that G has no “small subgroups”, show that if $\gamma$ and $\delta$ are infinitesimals
of G and $\gamma^2 = \delta^2$ then $\gamma = \delta$. Hence show that if $f \in \mu_M(0)$ then $f(1)$ uniquely
determines the values $f(\pm k/2^n)$ for all $k, n \in \hat{N}$; so f is uniquely determined
by $f(1)$.)
(iv) Let V be as in (ii), and let $\gamma \neq e$ be an infinitesimal of G. Prove that
there exists a smallest $n \in \hat{N}$ such that $\gamma^{n+1} \notin \hat{V}$.
(v) For $\gamma \approx e$, $\gamma \neq e$, let $n(\gamma)$ be the least n of (iv). Prove that if k and m are
*integers such that $|k|, |m| \le n(\gamma)$ and $k/n(\gamma) \approx m/n(\gamma)$ then $\gamma^k$ and $\gamma^m$ are
near the same point of V. (Use the fact that V is compact and does not
include a non-trivial subgroup of G.)
(vi) Let $\gamma$ and $n(\gamma)$ be as above. For each $r \in R$ put

$$f_\gamma(r) = {}^\circ(\gamma^k),$$

where k is any *integer such that $k/n(\gamma) \approx r$. Prove that $f_\gamma(r)$ is uniquely
defined for each $r \in R$ and that $f_\gamma$ is a one-parameter subgroup of G. (To
show that $f_\gamma(r)$ is uniquely defined, use (v). To show that $f_\gamma$ is continuous
at 0 and hence everywhere, let $W \in \mathcal{F}_e$ be closed and symmetric; consider
the collection

$$\{n : n \in N \text{ and } f_\gamma(x) \in W \text{ whenever } |x| < 2^{-n}\}$$

and use the Overspill Lemma 2.11.)
(vii) Prove that if $f_\gamma(1/n(\gamma)) = \delta$ then $n(\gamma \cdot \delta^{-1}) > n(\gamma)$.
(viii) Prove that if K is compact $\subseteq G$ and $e \notin K$ then there exists some
$n_0 \in N$ such that for every $a \in K$ there is a natural $n \le n_0$ such that $a^n \notin V$,
where V is as above.
(ix) Let X be as in (ii), and let $\psi$ be as in (iii). Put $S = \psi[X]$. Show that
S is compact and $\hat{S} \cap \mu(e) = \psi[\mu_M(0)]$, hence in particular $\hat{S} \cap \mu(e)$ is a group.
(x) Let S be as above. Show that $\mu(e) \subseteq \hat{S}$, hence $\mu(e) = \psi[\mu_M(0)]$.
(Suppose $\alpha \approx e$ but $\alpha \notin \hat{S}$. Show that $\alpha \cdot S$ *satisfies the conditions of (viii);
hence choose $\gamma \in (\alpha \cdot S)^{\wedge}$ such that $n(\gamma)$ is maximal. Using (vii) and (ix)
determine $\delta$ and show that $\gamma \cdot \delta^{-1} \in (\alpha \cdot S)^{\wedge}$ contrary to the choice of $\gamma$.)
(xi) Show that G is locally isomorphic to a finite-dimensional real
vector space. (Use (ii), (iii), (x) and apply Prob. 5.10.)

§ 7. A methodological discussion

Because of limitations of space, we have confined ourselves to an exposition
of the rudiments of nonstandard analysis, without exhibiting many deep
applications to various branches of mathematics. (Examples of such
applications may be found in the literature quoted in the next section.)
However, we hope that even the material presented here has convinced
the reader that nonstandard analysis provides elegant, intuitively appealing
and formally simple characterizations of many (standard) mathematical
concepts, and can thereby facilitate the discovery and proof of (standard)
mathematical results. It is true that any standard result (i.e., a result
which refers to standard concepts only) provable by means of nonstandard
analysis can, in principle, also be proved without it¹. However, the nonstandard
proof is often considerably simpler and requires fewer ad hoc tricks.
In the preface to the second (1974) edition of A. Robinson [1966],
K. Gödel is quoted as making the following enthusiastic appraisal:
“I would like to point out ... that nonstandard analysis frequently simplifies
substantially the proofs, not only of elementary theorems, but also of deep
results. This is true, e.g., also for the proof of the existence of invariant
subspaces for compact operators², disregarding the improvement of the
result; and it is true in an even higher degree in other cases. This state
of affairs should prevent a rather common misinterpretation of nonstandard
analysis, namely the idea that it is some kind of extravagance or fad of
mathematical logicians. Nothing could be farther from the truth.”
With this we are in complete agreement. However, he goes on to say:
“Rather there are good reasons to believe that nonstandard analysis,
in some version or other, will be the analysis of the future.
“One reason is the just mentioned simplification of proofs, since simplification
facilitates discovery. Another, even more convincing reason, is the
following: Arithmetic starts with the integers and proceeds by successively
enlarging the number system by rational and negative numbers, irrational
numbers etc. But the next quite natural step after the reals, namely the
introduction of infinitesimals, has simply been omitted. I think, in coming
centuries it will be considered a great oddity in the history of mathematics
that the first exact theory of infinitesimals was developed 300 years after
the invention of the differential calculus.”
1 This is so because nonstandard analysis does not use any new principle which is
irreducible to principles accepted in standard mathematical practice.
2 Cf. references in the next section.

With this we beg to differ. In our view there is a fundamental difference
between the enlarged system of *reals and the classical number systems
(the integers, the rationals, the reals, etc.) namely the fact that the latter
are canonical but the former is not. The classical number systems can be
characterized (informally or within set theory) uniquely up to isomorphism
by virtue of their mathematical properties (e.g., the field of rationals is
the smallest field containing the integers, and the field of reals is the completion
of the field of rationals). On the other hand, there is no known way
of singling out a particular enlargement which can plausibly be regarded
as canonical; and there is no good reason to believe that a method for
obtaining a canonical enlargement will necessarily be invented.
It is therefore not so surprising that ordinary (informal) mathematical
practice has discovered the classical number systems, which actually almost
force themselves on it, but searched in vain for the infinitesimals. There is
no such thing as the enlarged system of *reals; it depends on the choice
of enlargement. Where there is no canonical structure of a given kind,
the way mathematicians proceed is by specifying all structures of this kind.
And the only way to specify all enlargements is by their formal logical
properties.
For this reason also it seems to us wrong to try totally to replace standard
by nonstandard analysis. For example suppose one tries to define continuity
of a real function f at $r \in R$ by the condition: $f(x) \approx f(r)$ for all $x \approx r$. This
would be a bad definition so long as we are not sure that the condition is
invariant (i.e., independent of the choice of enlargement). But in general
the only simple way of proving the invariance of a nonstandard condition
is to show that it is equivalent to a standard one. Thus we still need the
standard definition of continuity.
We therefore regard nonstandard analysis as an important tool of
clarification, exposition and research — often beautiful, sometimes very
powerful, but never exclusive.

§ 8. Historical and bibliographical remarks

Nonstandard models of arithmetic were discovered by Skolem [1934], but
for a long time thereafter they seem to have been locked up as skeletons
in the logical cupboard. No serious attempt to study such models — let
alone use them as more than pathological counter-examples — was made
before Henkin [1949]. During the 1950s, the study of nonstandard models
of arithmetic gradually gathered momentum, and by the end of the decade
became quite fashionable. Thus, in the symposium on the foundations
of mathematics held at Warsaw in September 1959 (Infinitistic methods
[1961]) at least three papers on the subject were read (one of them by
A. Robinson) and the collection Bar-Hillel et al. [1961] contains two
essays on the same subject.
However, these investigations were confined to nonstandard models
of (natural number) arithmetic. Also, they were concerned with studying
these models as such, or with using them to prove metamathematical results
about formal systems, rather than mathematical results about numbers.
Thus they cannot be regarded as belonging to nonstandard analysis,
although they undoubtedly prepared the ground for it.
Nonstandard analysis was invented by A. Robinson in the autumn of
1960, when it occurred to him that “the concepts and methods of contemporary
Mathematical Logic are capable of providing a suitable framework
for the development of the Differential and Integral Calculus by means
of infinitely small and infinitely large numbers”.1
The first published account of the subject is in A. Robinson [1961].
Here he still works with the first-order structure of real numbers (whose
individuals are the real numbers only) and its proper elementary extensions.
In A. Robinson [1962] the treatment is extended to structures having
individuals of arbitrary finite types (e.g., sets of reals, sets of sets of reals,
etc.). This paper also contains nonstandard proofs of new standard results
in complex analysis (strengthened versions of classical results on the distribution
of zeros of polynomials and on Julia directions).
A major break-through came in 1964, when Bernstein and Robinson
used nonstandard analysis to solve an important open problem in the
theory of linear spaces. (See Bernstein and Robinson [1966].) A well-known
theorem, due to von Neumann and Aronszajn, states that if T
is a compact operator in a Hilbert space then T has a (non-trivial) invariant
subspace. It was conjectured that the result holds also in the case where
T is continuous and non-compact, provided T² is compact. Bernstein and
Robinson proved that this is in fact the case even if instead of the compactness
of T² one assumes that of p(T) for some polynomial p.
The characterization of compactness (Thm. 4.10) was discovered by
A. Robinson in 1963 (see A. Robinson [1965]).
A detailed systematic account of nonstandard analysis, including the
above-mentioned applications and many others in a wide range of branches
of mathematics, is in A. Robinson [1966].

1 See A. Robinson [1966].


The spread of Robinson’s ideas among analysts was facilitated by
Luxemburg [1962], which presents an elementary treatment (using an
ultrapower of the first-order structure of real or complex numbers) without
explicit heavy use of logic. (Instead, each particular instance of transference
is carried out separately, as and when required.)
In dealing with entities of higher types, most writers on nonstandard
analysis have followed A. Robinson [1962, 1966] in using type-theoretic
structures, which are technically rather cumbersome. The first treatment
using a considerably simpler set-theoretical framework is in Machover
[1967]. This is followed also in Machover and Hirschfeld [1969] and in
the present book. (See also the paper by Robinson and Zakon in
Luxemburg [1969].)
Important collections of papers on nonstandard analysis are Luxemburg
[1969], Luxemburg and Robinson [1972], and Hurd and Loeb [1974].
The bulk of the results presented in this chapter are due to A. Robinson
[1966]. The results of §3 are due to Luxemburg (see his comprehensive
paper A general theory of monads in Luxemburg [1969]; this paper contains
also a wealth of results not presented here). Most of the results of §3 were
also discovered independently by Hirschfeld and presented in Machover
and Hirschfeld [1969]. The nonstandard construction of the Haar measure
(Prob. 6.9) is due to R. Parikh (in a paper included in Luxemburg [1969]).
The nonstandard version of the solution of Hilbert’s Fifth Problem in the
commutative case (Prob. 6.10) is due to Hirschfeld.

Remark. As noted above, most writers on nonstandard analysis use a type-theoretic
framework rather than the set-theoretic framework employed by us.
This enables them to assume that the basic relation of the enlargement
is the true membership relation. This is not a very big gain, because a
similar assumption cannot be made in general for defined relations. Moreover,
if one insists that *∈ is true membership, then the structure 𝔘 cannot
be a substructure of *𝔘, although there is a canonical elementary embedding
* of 𝔘 into *𝔘. Therefore those authors must always distinguish an
object u ∈ U from the corresponding object *u ∈ *U (except when u is of
lowest type in U). This distinction, although necessary in order to avoid contradictions,
is rather cumbersome and is sometimes overlooked.
These points must be borne in mind by the reader who wishes to consult
the literature quoted above.

BIBLIOGRAPHY

Aczel, P.H.G.
[1967] Some results on intuitionistic predicate logic (abstract). J. Symb. Logic 32, 556.
Bar-Hillel, Y., et al. (eds.)
[1961] Essays on the Foundations of Mathematics, dedicated to A. A. Fraenkel on his
seventieth anniversary. Magnes Press, Hebrew Univ., Jerusalem.
Bell, J. L.
[1977] Boolean-Valued Models and Independence Proofs in Set Theory. Clarendon
Press, Oxford.
Bell, J. L., and A. B. Slomson
[1969] Models and Ultraproducts: an Introduction. North-Holland, Amsterdam.
Benacerraf, P., and H. Putnam (eds.)
[1964] Philosophy of Mathematics: Selected Readings. Prentice-Hall, Englewood
Cliffs, N. J.
Bernstein, A. R., and A. Robinson
[1966] Solution of an invariant subspace problem of K. T. Smith and P. R. Halmos.
Pacific J. Math. 16, 421-431. (Abstract in Notices Am. Math. Soc. 11 (1964) 586.)
Beth, E. W.
[1953] On Padoa’s method in the theory of definition. Indag. Math. 15, 330-339.
[1955] Semantic entailment and formal derivability. Mededel. Kon. Ned. Akad.
Wetensch. Afd. Letterkunde N. S. 19, 309-342.
[1956] Semantic construction of intuitionistic logic. Mededel. Kon. Ned. Akad.
Wetensch. Afd. Letterkunde N. S. 19, (13). (Originally in French in: Colloq.
Logique Math. CNRS (1955).)
Birkhoff, G.
[1967] Lattice Theory (3rd ed.), Am. Math. Soc. Coll. Pub., Vol. 25. Am. Math. Soc.,
Providence, R. I.
Bishop, E.
[1967] Foundations of Constructive Analysis. McGraw-Hill, New York.
Bourbaki, N.
[1961] Topologie Générale (3rd ed.). Hermann, Paris.
Chang, C. C., and H. J. Keisler
[1973] Model Theory. North-Holland, Amsterdam.
Church, A.
[1936] An unsolvable problem of elementary number theory. Am. J. Math., 58, 345-363.
[1936a] A note on the Entscheidungsproblem, J. Symb. Logic 1, 40-41. (Reprinted
with corrections in Davis [1965], pp. 110-115.)
[1956] Introduction to Mathematical Logic. Princeton Univ. Press, Princeton, N. J.
Cohen, P. J.
[1966] Set Theory and the Continuum Hypothesis. Benjamin, New York.
Craig, W.
[1957] Linear reasoning. A new form of the Herbrand-Gentzen theorem. J. Symb.
Logic 22, 250-268. See also: ibid., 269-285.
Davis, M.
[1958] Computability and Unsolvability. McGraw-Hill, New York.
Davis, M. (ed.)
[1965] The Undecidable. Basic papers on undecidable propositions, unsolvable problems,
and computable functions. Raven Press, New York.
Davis, M., H. Putnam and J. Robinson
[1961] The decision problem for exponential diophantine equations. Ann. Math.
74, 425-436.
Dedekind, R.
[1888] Was sind und was sollen die Zahlen? Brunswick. (English transl. by W. W. Beman
in Essays on the Theory of Numbers. Open Court, La Salle, Ill. (1901).)
Dekker, J. C. E.
[1955] Productive sets. Trans. Am. Math. Soc. 78, 129-149.
Drake, F. R.
[1974] Set Theory: an Introduction to Large Cardinals. North-Holland, Amsterdam.
Dwinger, P.
[1961] Introduction to Boolean Algebras. Physica-Verlag, Würzburg.
Ehrenfeucht, A., and A. Mostowski
[1956] Models of axiomatic theories admitting automorphisms. Fund. Math. 43, 50-68.
Feferman, S.
[1952] Review of Rasiowa and Sikorski [1951]. J. Symb. Logic 17, 72.
[1960] Arithmetization of metamathematics in a general setting. Fund. Math. 49, 35-92.
Fitting, M. C.
[1969] Intuitionistic Logic, Model Theory and Forcing. North-Holland, Amsterdam.
Fraenkel, A. A.
[1961] Abstract Set Theory. North-Holland, Amsterdam.
Fraenkel, A. A., Y. Bar-Hillel and A. Levy
[1973] Foundations of Set Theory (2nd ed.). North-Holland, Amsterdam.
Frayne, T., A. Morel and D. S. Scott
[1962] Reduced direct products. Fund. Math. 51, 195-228.
Frege, G.
[1879] Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen
Denkens. (Complete English transl. in Van Heijenoort [1967], pp. 1-82.)
Friedberg, R. M.
[1957] Two recursively enumerable sets of incomparable degrees of unsolvability.
Proc. Nat. Acad. Sci. 43, 236-238.
Gentzen, G.
[1934] Untersuchungen über das logische Schliessen. Math. Z. 39, 176-210, 405-431.
(English transl. in M. E. Szabo (ed.), The Collected Papers of Gerhard Gentzen,
North-Holland, Amsterdam (1969).)
Gillman, L., and M. Jerison
[1960] Rings of Continuous Functions. Van Nostrand, New York.
Gödel, K.
[1930] Die Vollständigkeit der Axiome des logischen Funktionenkalküls. Monatsh.
Math. Phys. 37, 349-360. (English transl. in Van Heijenoort [1967], pp.
582-591.)
[1931] Über formal unentscheidbare Sätze der Principia Mathematica und verwandter
Systeme I. Monatsh. Math. Phys. 38, 173-198. (English transl. in Van Heijenoort
[1967], pp. 596-616.)
[1934] On undecidable propositions of formal mathematical systems. Lecture notes
by S. C. Kleene and J. B. Rosser, Inst, for Advanced Study, Princeton, N. J.
(Reprinted with corrections in Davis [1965].)
[1938] The consistency of the axiom of choice and of the generalized continuum
hypothesis. Proc. Nat. Acad. Sci. U.S.A. 24, 556-557.
[1939] Consistency-proof for the generalized continuum hypothesis. Proc. Nat. Acad.
Sci. U.S.A. 25, 220-224.
[1940] The Consistency of the Continuum Hypothesis, Ann. Math. Studies 3. Princeton
Univ. Press, Princeton, N. J.
Goodman, N. D.
[1970] A theory of constructions equivalent to arithmetic. In: A. Kino, J. Myhill
and R. E. Vesley (eds.), Intuitionism and Proof Theory, 101-120. North-Holland,
Amsterdam.
Glivenko, V.
[1929] Sur quelques points de la logique de M. Brouwer. Bull. Acad. Roy. Belg. Sci.
(5) 15, 183-188.
Grzegorczyk, A.
[1964] A philosophically plausible formal interpretation of intuitionistic logic. Indag.
Math. 26, 596-601.
Halmos, P.
[1960] Naive Set Theory. Van Nostrand, New York.
[1963] Lectures on Boolean Algebras. Van Nostrand, New York.
Henkin, L.
[1949] The completeness of the first-order functional calculus. J. Symb. Logic 14,
159-166.
Hewitt, E.
[1948] Rings of real-valued continuous functions. Trans. Am. Math. Soc. 64, 45-99.
Heyting, A.
[1930] Die formalen Regeln der intuitionistischen Logik. Sitzungsber. Preuss. Akad.
Wiss. Phys.-Math. Kl. 42-56.
[1930a] Die formalen Regeln der intuitionistischen Mathematik. Ibid., 57-71, 158-169.
[1972] Intuitionism: an Introduction (3rd rev. ed.). North-Holland, Amsterdam.
Hilbert, D.
[1900] Mathematische Probleme. Vortrag, gehalten auf dem internationalen
Mathematiker-Kongress zu Paris 1900. Nachr. K. Ges. Wiss. Göttingen Math.-Phys. Kl.,
253-297. (English transl.: Bull. Am. Math. Soc. 8 (1901-1902) 437-479.)

Hintikka, J.
[1955] Form and content in quantification theory. Acta Phil. Fenn. 8, 7-55.
Hurd, A., and P. Loeb
[1974] Victoria symposium on nonstandard analysis, Lecture Notes in Mathematics
369. Springer, Berlin.
Infinitistic methods
[1961] Proc. Symp. on the Foundations of Mathematics, Warsaw, 2-9 Sept. 1959.
Pergamon, New York, and Państw. Wyd. Nauk.
Jacobson, N.
[1964] Lectures in Abstract Algebra, Vol. 3. Van Nostrand, New York.
Jech, T. J.
[1971] Lectures in Set Theory, Lecture Notes in Math. 217. Springer, Berlin.
Kelley, J. L.
[1955] General Topology. Van Nostrand, New York.
Kleene, S. C.
[1936] General recursive functions of natural numbers. Math. Ann. 112, 727-742.
[1938] On notation for ordinal numbers. J. Symb. Logic 3, 150-155.
[1943] Recursive predicates and quantifiers. Trans. Am. Math. Soc. 53, 41-73.
[1952] Introduction to Metamathematics. North-Holland, Amsterdam, and Van
Nostrand, New York.
Kneebone, G. T.
[1963] Mathematical Logic and the Foundations of Mathematics. Van Nostrand,
New York.
Kolmogorov, A. N.
[1925] On the principle of the excluded middle. (English transl. in Van Heijenoort
[1967], pp. 414-437.)
Kreisel, G.
[1965] Mathematical logic. In: T. L. Saaty (ed.), Lectures on Modern Mathematics,
Vol. 3, 95-195. Wiley, New York.
Kreisel, G., and J. L. Krivine
[1967] Elements of Mathematical Logic. North-Holland, Amsterdam.
Kripke, S. A.
[1965] Semantical analysis of intuitionistic logic I. In: J. N. Crossley and M. A. E.
Dummett (eds.), Formal Systems and Recursive Functions, 92-130. North-
Holland, Amsterdam.
Kuratowski, K., and A. Mostowski
[1968] Set Theory. North-Holland, Amsterdam.

Lakatos, I. (ed.)
[1967] Problems in the Philosophy of Mathematics, Proc. Int. Colloq. in the Philosophy
of Science, London, 1965. North-Holland, Amsterdam.

Lambek, J.
[1961] How to program an infinite abacus. Can. Math. Bull. 4, 295-302.

Landau, E.
[1930] Grundlagen der Analysis. Akad. Verlagsgesellschaft. (English transl.: Foundations
of Analysis. Chelsea, New York (1951).)

Levy, A.
[1960] Axiom schemata of strong infinity in axiomatic set theory. Pacific J. Math.
10, 223-238.
Łoś, J.
[1954] On the categoricity in power of elementary deductive systems and some related
problems. Colloq. Math. 3, 58-62.
[1955] Quelques remarques, théorèmes et problèmes sur les classes définissables
d’algèbres. In: Th. Skolem et al. (eds.), Mathematical Interpretations of Formal
Systems. North-Holland, Amsterdam.
Löwenheim, L.
[1915] Über Möglichkeiten im Relativkalkül. Math. Ann. 76, 447-470. (English transl.
in Van Heijenoort [1967], pp. 228-251.)
Luxemburg, W. A. J.
[1962] Non-standard Analysis. Lecture Notes (duplicated), Calif. Inst. of Technology.
[1969] (Ed.) Applications of Model Theory to Algebra, Analysis and Probability. Holt,
Rinehart and Winston, New York.
Luxemburg, W. A. J. and A. Robinson (eds.)
[1972] Contributions to Non-standard Analysis. North-Holland, Amsterdam.
Machover, M.
[1967] Non-standard analysis without tears. (Duplicated) Technical Report No. 27,
Hebrew Univ., Jerusalem.
Machover, M., and J. Hirschfeld
[1969] Lectures on Non-standard Analysis, Lecture Notes in Math. 94. Springer, Berlin.
Makinson, D.
[1969] On the number of ultrafilters of an infinite Boolean algebra. Z. Math. Logik.
15, 121-122.
Mal’cev, A. I.
[1936] Untersuchungen aus dem Gebiete der mathematischen Logik. Mat. Sb. 1,
323-336.
Mansfield, R.
[1971] The theory of Boolean ultrapowers. Ann. Math. Logic 2, 279-325.
Matijasevič, Ju. V.
[1970] Diofantovost' perečislimyh množestv. Dokl. Akad. Nauk SSSR 191 (2), 279-282.
(English transl.: Soviet Math. Dokl. 11 (2), 354-357.)
[1971] Diophantine representation of r.e. predicates. In: J. E. Fenstad (ed.), Proc.
Second Scand. Logic Symp. North-Holland, Amsterdam.
Minsky, M. L.
[1961] Recursive unsolvability of Post’s problem of “tag” and other topics in the
theory of Turing machines. Ann. Math. 74, 437-455.
Montague, R. M.
[1961] Fraenkel’s addition to the axioms of Zermelo. In: Y. Bar-Hillel et al. [1961].
Mostowski, A.
[1947] On definable sets of positive integers. Fund. Math. 34, 81-112.
[1958] Quelques observations sur l’usage des méthodes non finitistes dans la
métamathématique. In: Colloq. Intern. du C. N. R. S., 1955, Vol. 70. Paris.
[1966] Thirty years of foundational studies. Blackwell, Oxford.

Mučnik, A. A.
[1956] On the unsolvability of the problem of reducibility in the theory of algorithms.
Dokl. Akad. Nauk SSSR N. S. 108, 194-197 (in Russian).
Myhill, J.
[1955] Creative sets. Z. Math. Logik Grundl. Math. 1, 97-108.
Padoa, A.
[1900] Introduction logique à une théorie déductive quelconque. (English transl. in
Van Heijenoort [1967], pp. 118-123.)

Peano, G.
[1889] Arithmetices Principia, Nova Methodo Exposita, Turin. (English transl. in
Van Heijenoort [1967], pp. 83-97.)
Post, E. L.
[1944] Recursively enumerable sets of positive integers and their decision problem.
Bull. Am. Math. Soc. 50, 284-316.
[1948] Degrees of recursive unsolvability. Prelim. Rept. Bull. Am. Math. Soc. 54,
641-642.
Putnam, H.
[1960] An unsolvable problem in number theory. J. Symb. Logic 25, 220-232.
Ramsey, F. P.
[1930] On a problem in formal logic. Proc. London Math. Soc. (2) 30, 264-286.
Rasiowa, H., and R. Sikorski
[1951] A proof of the completeness theorem of Gödel. Fund. Math. 37, 193-200.
Rice, H. G.
[1953] Classes of recursively enumerable sets and their decision problems. Trans.
Am. Math. Soc. 74, 358-366.

Robinson, A.
[1951] On the Metamathematics of Algebra. North-Holland, Amsterdam.
[1956] A result on consistency and its application to the theory of definition. Indag.
Math. 18, 47-58.
[1961] Non-standard analysis. Nederl. Akad. Wetensch. Proc. (A) 64, 432-440.
[1962] Complex function theory over non-archimedean fields. Tech. Sci. Note No. 30,
USAF Contract 61 (052)-187.
[1963] Introduction to Model Theory and to the Metamathematics of Algebra. North-
Holland, Amsterdam.
[1965] Topics in non-archimedean mathematics. Symp. on Model Theory, Berkeley,
Calif., 1963. North-Holland, Amsterdam.
[1966] Non-standard Analysis. North-Holland, Amsterdam.

Robinson, J.
[1952] Existential definability in arithmetic. Trans. Am. Math. Soc. 72, 437-449.

Robinson, R. M.
[1956] Arithmetical representation of recursively enumerable sets. J. Symb. Logic 21,
162-186.
Rogers, H., Jr.
[1967] Theory of Recursive Functions and Effective Computability. McGraw-Hill,
New York.

Rosser, J. B.
[1936] Extensions of some theorems of Gödel and Church. J. Symb. Logic 1, 87-91.
(Reprinted in Davis [1965], pp. 231-235.)
Rotman, B., and G. T. Kneebone
[1966] The Theory of Sets and Transfinite Numbers. Van Nostrand, New York.
Ryll-Nardzewski, C.
[1959] On theories categorical in power, Bull. Acad. Polon. Sci. Ser. Sci. Math. Astron.
Phys. 7, 545-548.
Sacks, G. E.
[1963] Degrees of unsolvability, Ann. Math. Studies 55. Princeton Univ. Press,
Princeton, N. J.
Schütte, K.
[1956] Ein System des verknüpfenden Schliessens. Arch. Math. Logik Grundlagenforsch.
2, 56-67.
[1962] Der Interpolationssatz der intuitionistischen Prädikatenlogik. Math. Ann.
148, 192-200.
Shepherdson, J. C., and H. E. Sturgis
[1963] Computability of recursive functions. J. Assoc. Comput. Mach. 10, 217-255.
Shoenfield, J. R.
[1960] An uncountable set of incomparable degrees. Proc. Am. Math. Soc. 11, 61-62.
[1967] Mathematical Logic. Addison-Wesley, Reading, Mass.
[1971] Degrees of Unsolvability. North-Holland, Amsterdam.
Sikorski, R.
[1964] Boolean Algebras (2nd ed.). Springer, Berlin.
Skolem, T.
[1920] Logisch-kombinatorische Untersuchungen über die Erfüllbarkeit oder
Beweisbarkeit mathematischer Sätze nebst einem Theoreme über dichte Mengen I.
Skr. Norske Vid.-Akad. Kristiania Mat.-Naturv. Kl. (4). (English transl. of §1
in Van Heijenoort [1967] 252-263.)
[1922] Einige Bemerkungen zur axiomatischen Begründung der Mengenlehre. Mat.
Kongr. Helsingfors, 4-7 Juli 1922, Den femte Skand. Mat. Kongr., Redogörelse,
217-232. (English transl. in Van Heijenoort [1967] 290-301.)
[1934] Über die Nicht-charakterisierbarkeit der Zahlenreihe mittels endlich oder
abzählbar unendlich vieler Aussagen mit ausschliesslich Zahlenvariablen.
Fund. Math. 23, 150-161.
Smullyan, R. M.
[1961] Theory of formal systems, Ann. Math. Studies 47. Princeton Univ. Press,
Princeton, N. J.
[1968] First-order Logic. Springer, Berlin.
Stone, M. H.
[1936] The representation theorem for Boolean Algebra. Trans. Am. Math. Soc.
40, 37-111.
[1937] Applications of the theory of Boolean rings to general topology. Trans. Am.
Math. Soc. 41, 375-481.
Tarski, A.
[1930] Über einige fundamentale Begriffe der Metamathematik. C. R. Soc. Sci. Lettres
Varsovie (III) 23, 22-29. (English transl. in Tarski [1956], pp. 30-37.)
[1930] Une contribution à la théorie de la mesure. Fund. Math. 15, 42-50.
[1935] Der Wahrheitsbegriff in den formalisierten Sprachen. Stud. Phil. (Warsaw) 1,
261-405. (English transl. in Tarski [1956], pp. 152-278.)
[1939] Ideale in vollständigen Mengenkörpern. Fund. Math. 32, 45-63.
[1956] Logic, Semantics, Metamathematics, papers from 1923 to 1938. Clarendon
Press, Oxford.
Tarski, A., A. Mostowski and R. M. Robinson
[1953] Undecidable Theories. North-Holland, Amsterdam.
Tarski, A., and R. L. Vaught
[1957] Arithmetical extensions of relational systems, Comp. Math. 13, 81-102.
Thomason, R. H.
[1968] On the strong semantical completeness of the intuitionistic predicate calculus.
J. Symb. Logic 33, 1-7.
Troelstra, A. S.
[1969] Principles of Intuitionism, Lecture Notes in Math. 95. Springer, Berlin.
[1973] (ed.) Metamathematical Investigations of Intuitionistic Arithmetic and Analysis,
Lecture Notes in Math. 344. Springer, Berlin.
Turing, A. M.
[1936] On computable numbers, with an application to the Entscheidungsproblem.
Proc. London Math. Soc. (2) 42, 230-265; 43, 544-546.
[1939] Systems of logic based on ordinals. Proc. London Math. Soc. (2) 45, 161-228.
Van Heijenoort, J.
[1967] (ed.) From Frege to Gödel, a Source Book in Mathematical Logic 1879-1931.
Harvard Univ. Press, Cambridge, Mass.
Vaught, R. L.
[1954] Applications of the Löwenheim-Skolem-Tarski theorem to problems of
completeness and decidability. Indag. Math. 16, 467-472.
[1961] Denumerable models of complete theories. Infinitistic Methods. Pergamon
Press, Elmsford, N. Y.
[1974] Model theory before 1945. Proc. Tarski symposium, AMS Proc. symp. Pure
Math. 25, 153-172.
Whitehead, A. N., and B. Russell
[1910] Principia Mathematica, Vol. 1. Cambridge Univ. Press, London (Vol. 2
appeared in 1912 and Vol. 3 in 1913. Second edition of Vol. 1 in 1925
and Vols. 2 and 3 in 1927.)
GENERAL INDEX*

absolute 502, 505 Axiom of Constructibility (Constr) 516


absoluteness sequence 502 -Extensionality (Ext) 461
abstraction term 465 -Infinity (Inf) 472
AC see Axiom of Choice -Power Set (Pow) 466
Aczel, P. 458 -Regularity (Reg) 480
address 232 -Replacement (Rep) 463
agreement (of two valuations) 50 -Separation (Sep) 463

aleph 490 -Union (Union) 466


algorithm 230 axiom scheme 35
alphabetic change of variable 61
(2.3.5) baby arithmetic 334
alphabetic ordering of variables 16, back-and-forth construction 187, 209

162, 404 Bar-Hillel, Y. xxi, 457, 530, 574

antecedent 17 base (for filter) 137, 544


antichain 141 (4.3.18) -, strong (for filter) 137
argument (of function symbol) 16 Basic Semantic Definition (BSD) 51

- (of predicate symbol) 17 (2.1.1)


Aronszajn, N. 574 Bell, J. L. 225, 491, 530
assignment (in structure) 163 Benacerraf, P. xxi

atom 150 Bernays, P. 398

atomic 150 Bernstein, A. 574

atomless 150 Bernstein, F. 488


automorphism (of structure) 220 Beth, E. W. 48, 421 (9.6.12) 457f

axiom, propositional 35, 47f Beth’s Definability Theorem 455

-, first-order 108, 122 (9.13.4), 456 (9.13.5)


-, intuitionistic first-order 434 bi-implication 19

-, - propositional 433 bijection 2


axiomatizable (encoded theory) 371 Birkhoff, G. 160

(8.3.2) Bishop, E. 457

-, finitely 190 (5.4.9), 341 Bolzano, B. 529

Axiom of Choice (AC) 3, 488 Boole, G. 159


-Comprehension see Boolean algebra 129

Comprehension Axiom - operation 129

* Entries for terms and abbreviations refer to the place where they are first explained,
defined or re-defined.

Boolean space 142 complete, 3~- 158


bound (occurrence of variable) 55 - first-order arithmetic 318
(2.2.2) completeness (of first-order tableaux)
bound, lower/upper 4 88ff
Bourbaki, N. 1 - (of propositional tableaux) 33
branch (in tableau) 25 - (of theory) 185, 353
branch, closed 26, 68, 412 Completeness Theorem, Strong
Brouncker, Lord 315n (for intuitionistic predicate calculus)
Brouwer, E. 403 441 (9.10.3)
BSD see Basic Semantic Definition -, - (for predicate calculus) 121
(3.3.14)
Cantor, G. 187, 225, 461, 488, 529 -, - (for propositional calculus)
- space 148 (4.4.18) 46 (1.13.5)
-s paradise 461 -, Weak (for predicate calculus)
cardinal 489 117 (3.3.3)
cardinality 489 -, - (for propositional calculus)
- (of language) 119 (3.3.10) 43 (1.12.1)
Cartesian product 1 component, prime 23
categorical 186 composition 2, 241 (6.5.2), 249 (6.6.1)
-, a- 186 Comprehension Axiom 462 (10.1.2)
CH see Continuum Hypothesis computable see function(al), computable
chain 4 computation code 263
Chang, C. C. 225 compute (function(al)) 237
Chinese Remainder Theorem 286 - (-, relative to sequence) 239
(6.12.6) computer 232n
Church, A. 48, 107, 124, 314, 360 concatenation (of programs) 236
-’s Theorem 349 (7.10.7) concurrent 533 (11.1.2)
s thesis 258f confutation, first-order 69
class 1, 461 -, propositional 26
-, definable 462 conjunction 19
R- 3 connective 12, 15
clopen algebra/set 131 (4.2.2) consequence, logical 53 (2.1.5)
closed (under operation) 3 -, tautological 24 (1.6.11)
- branch see branch, closed consequent 17
closure, monadic 546 (11.3.8) conservative 106
-, recursive 365 (8.1.10) consistency (of encoded theory) 371
code number (of command/program) (8.3.2)
259 -, first-order 113
-(of term/formula) 327f -, propositional 39
Cohen, P. 491, 530 relative 480f
collection 461, 537 -, strong 438
combination, propositional 21 consistent, maximal first-order 118
command 234 (3.3.4)
compact 557 -, - propositionally 44
Compactness Theorem 121 constant, individual 10, 15
(3.3.16), 171 (5.2.5), 181 (5.3.2) -, syntactic 8
complement (in lattice) 128 Constr see Axiom of Constructibility

constructible set 510 effectively inseparable 373 (8.4.4)


construction 403ff Ehrenfeucht, A. 213n, 225
contain 1, 459 element 1
Continuum Hypothesis (CH) 490 Elimination Lemma 31 (1.8.6), 81
-, Generalized (GCH) 491 (2.6.7), 422 (9.7.1)
countable 4, 507 - Theorem 32 (1.8.7), 83 (2.6.8), 432
countable chain condition 141 (4.3.18) (9.7.3)
Craig, W. 445, 458 EM see rule, EM-
creative (set/theory) 380 (8.5.8), 383 embedding 165, 318
(8.5.21) canonical (in ultrapower) 181
-, elementary 166
Daniel, Book of 274n enforceable 418
Davis, M. xxi, 311, 315, 340, 360 enlargement 533 (11.1.3)
decidable 269, 347, see also enlargement, a-/limit 534f (11.1.6)
recursively decidable enumerate 280
decision problem 269, 347, 349 Enumeration Theorem 364 (8.1.7)
Dedekind, R. 359, 529 equation 17
deduction 35, 109, 4331 equipollent 2, 487
Deduction Theorem 36 (1.10.4.), 109 equivalence, elementary 165
(3.1.3), 433, 435 -, logical 53 (2.1.5)
definable (member/subset) 346 (7.9.17) -, many-one/one-one 384
-, pointwise 346 (7.9.17) -, provable 111
degree (of description) 240 -, recursive see recursively equivalent
- (of formula) 17, 162, 405 tautological 24
- (of term) 16 - relation 3
- (Turing, of unsolvability) 388 eventually constant 498
de Jongh, D. H. J. 457 excluded middle, law of 407, 445
Dekker, J. 399 see also rule, EM-
denumerable 4 expansion 94, 164
depth (of tableau) 25 exponent 254 (6.6.12)
description (of function(al)) 240 (6.5.1) expression 18
de Swart, H. 457 Ext see Axiom of Extensionality
d. g. see diophantine extension (of encoded theory) 371
diophantine 285, 312 (8.3.2)
Dirichlet, P. G. L. 315n - (of set) 461
disjunction 19 -, elementary 165
disjunctive normal form 24 (1.6.14) -, natural (of function) 542
domain (of function) 2, 226 extensional (set) 485
- (of structure) see universe external 540
Drake, F. 530 extralogical axiom see postulate
dual (of homomorphism/continuous - symbol see symbol, extralogical
map) 153f extremally disconnected 144
duality, principle of (for Boolean
algebras) 130 false 317 see also truth value
-,-(for arithmetical hierarchy) Feferman, S. 160, 360
362 Fibonacci sequence 256 (6.6.16), 288
Dwinger, P. 160 field of sets 141

filter 133, 140 (4.3.16), 543 generate (filter) 133


-, principal 133, 546 (11.3.2) - (subalgebra) 132 (4.2.4)
-, proper 543 Gentzen, G. 48
finite-cofinite algebra 131 (4.2.2) Glivenko, V. 443
finite meet property 133 Godel, K. 117, 124, 225, 314f, 356,
Fitting, M. 457f 357, 360, 445, 459, 491, 530, 572
flow chart 235 Goodman, N. D. 457
f. m. p. see finite meet property graph (of function) 279
force 417 group, infinitesimal 563
formula 11, 17 -, topological, 562 (11.5.1)
atomic 17 Grzegorczyk, A. 457
-, conjunction 19
disjunction 19 Haar measure 569f (11.6.9)
-, existential 19 Halmos, P. 1, 160, 487
-, implication 17 halting state 260
-, negation 17 Henkin, L. 48, 121, 124, 573
-, prime 21 - set 119 (3.3.8)
-, restricted 502 Herbrand, J. 314
universal 17 Hewitt, E. 225
-, ± 410 Heyting, A. 457
Fraenkel, A. A. xxi, 1, 457, 529f hierarchy, analytical 369
Frayne, T. 225 -, arithmetical 361 (8.1.1)
free (occurrence of variable) 55 (2.2.2) -, cumulative 477
see also variable, free Hierarchy Theorem 365 (8.1.9)
- (set of generators) 147 (4.4.17) Hilbert, D. 315. 398
- (for substitution) 59 (2.3.3) -’s Fifth Problem 575
Frege, G. 124, 225 -’s Tenth Problem 312
Friedberg, R. 392, 399 Hintikka, K. J. J. 48
function 2, 226, 542 see also - set 83 (2.7.1)
function(al) Hirschfeld, J. 159f, 549, 575
-, arithmetical 327 (7.3.10) - mapping 553 (11.3.31)
-, diagonal 268 (6.10.6) homomorphism (of Boolean algebras)
functional 228, 265 134
function(al), algorithmic 23If - (of lattices) 126
computable 237 -, 2-valued 135
-, - relative to sequence 239 hull 134
primitive recursive 242 Huntington, E. V. 159
-, recursive 241 Hurd, A. 575
-, - relative to sequence 242 hyperarithmetical 369
total 227f hypothesis 35
function symbol 10, 15
include 1, 460
GCH see Continuum Hypothesis, ideal (in Boolean algebra) 139 (4.3.13)
Generalized identity map 3
generalization (of formula) 108 - relation 2
- on constants/variables, law of 110 implication 15
(3.1.5), 113 (3.1.13) inaccessible 528 (10.10.13)

incomparability (of degrees) 388 Keisler, H. J. 225


Incompleteness Theorem, First 356 Kelley, J. L. 1
(7.11.8) Kleene, S. C. 314, 398, 457
-, Second 358 (7.11.9) Kneebone, G. T. xxi, 457, 487
index (of function(al)) 268 Kolmogorov, A. N. 457f
- (of r. e. set) 282 Konig’s Tree Lemma 91
indexing 168 Kreisel, G. xxi, 457
indiscernibles 218 Kripke, S. 416, 457
individual (of structure) 9, 49 -’s Semantic Definition (KSD) 417 (9.6.1)
-, designated (of structure) 9, 49, 162 - system 416
induced (mapping) 542 Krivine, J. L. xxi
induction, principle of mathematical KSD see Kripke’s Semantic Definition
321, 472 Kuratowski, K. 1.
-,-transfinite 471 (10.2.4) K-valid 419
- on rank, principle of 484 (10.3.10)
- postulate 342 (7.9.1) Lagrange, J. L. 305, 312, 315n
inductive (partially ordered set) 4 Lakatos, I. xxi
Inf see Axiom of Infinity Lambek, J. 314
infimum 4 Landau, E. 321n
infinitely close 554 Langford, C. 225
injection 2 language (for structure) 162
-, natural 3 -, first-order 14f
internal 540 -, formal 6
interpolant 446, 452 -, higher order 14
interpolate 446, 452 - of arithmetic, first-order 316
interpolation property 453 - with equality 15
Interpolation Theorem 452 (9.12.2), lattice 125
453 (9.13.1) -, complemented 128
interpretation, constructive 401, 403ff -, complete 127
-, structural 400f -, distributive 126
intuitionistic philosophy of mathematics least member 470
403 legitimate (abstraction term) 465
invariance (with respect to language) 23 length (of number) 254 (6.6.12)
(1.6.9), 56 (2.2.7), 117 (3.3.2) - (of program) 234
isomorphic, locally 566 (11.5.10) level (in tableau) 25
-, recursively 385 Levy, A. xxi, 457, 530
isomorphism (of Boolean algebras) 134 - Montague set theory (LM) 495
- (of lattices) 126 Liar paradox 330, 355
- (of structures) 165, 318 Lindenbaum, A. 48, 225
-, collapsing 487 - algebra 193, 203
-, ∈- 485 - -, propositional 131 (4.2.2)
LM see Levy-Montague set theory
Loeb, P. 575
Jacobson, N. 568 Löwenheim, L. 124, 225
Jech, T. 491, 530 - Skolem Theorem 172 (5.2.8)
join (in lattice) 125 -, Downward 170 (5.2.2)
junior arithmetic 336 -, Upward 172 (5.2.7)


logic, first-order 47 negative (of formula) 410


-, intuitionistic 402 next state 260
-, propositional 20 NFT see Normal Form Theorem
Łoś, J. 225 nice 270f (6.10.12)
Theorem 180 (5.3.7) node (in tableau) 25
Luxemburg, W. 549, 575 -, initial (of tableau) 25
nonstandard object 538
Machover, M. 160, 575 Normal Form Theorem 266 (6.10.1)
Makinson, D. 160 -(for functions) 267 (6.10.3)
Mal’cev, A. I. 124 normal (program) 236
Mansfield, R. 225 normalization 236
Matijasevic, Ju. V. 311, 315 number 226
maximal 4 numeral 317
meet (in lattice) 125
member 1 object language 6
membership relation 2, 461
omit 203
metalanguage 7 one-one (function/correspondence) 2
minimal (algebra) 131 (4.2.2)
onto 2
- (element) 469
operation 3
- (model of ZF) 524 (10.10.7)
-, basic (of structure) 9
minimization 241 (6.5.2), 249 (6.6.1)
-, primitive recursive 247
minimum, bounded 253 (6.6.10)
-, recursive 247
Minsky, M. 314
oracle 233
Mirimanoff, D. 529 order (of constructible set) 511
model 56, 164
- of magnitude 568
-, nonstandard 319
ordered n-tuple 1
-, prime 211 (5.6.14)
ordinal 468
-, standard 319
-, limit 471
modus ponens 34
- -definable 527 (10.10.10)
monad (of set of subsets) 544
-, hereditarily 528 (10.10.12)
- (of point) 553
monadic 544 see also topology, monadic
monadology 553 Padoa, A. 458
Montague, R. 530 parameter 102
Morel, A. C. 225 Parikh, R. 575
Morley, M. 187 partially ordered set 3
Mostowski, A. xxi, 1, 225, 360, 398 partially ordered set 3
-’s Collapsing Lemma 485 (10.3.11) Peano, G. 360
MRDP Theorem 311 (6.16.1) - arithmetic, first-order 343
Mucnik, A. A. 392, 399 - postulates 320f, 472
Myhill, J. 385, 399, 525 Pell’s equation 296
permutation, recursive 385
natural number 471 place, standard 568
near 554 Platonism 401, 403, 461n
near-standard 554 point 1, 543
Nebuchadnezzar 274 *point 543
negation 12, 15 polynomial 284

Post, E. 392, 398 recursively decidable 269, 347, 371


-’s problem 392 (8.3.2)
postulate 184, 318 - enumerable (r. e.) 278 (6.11.1)
Pow see Axiom of Power set - equivalent 388
power set 1, 466 - separable/inseparable 372
- - algebra 130 (4.2.2) reducible 349
p. r. see function(al), primitive recursive -, (many-one) 381 (8.5.11)
predicate calculus 108ff
-, intuitionistic 434 reduction (of structure/valuation) 94,
- symbol 11, 15 164
prenex (form/formula) 93 - (of theory) 349
primitive recursion 241 (6.5.2), 249 -, (many-one) 381 (8.5.11)
(6.6.2) reflect 491
productive (set/function) 376 (8.5.1) Reflection Principle, Extended 525
program 234 (10.10.8)
- counter 232 - -, First (RP1) 491
proof, constructive 404ff, 412, 415 - -, Second (RP2) 491
-, first-order 109 Reg see Axiom of Regularity
-, propositional 35 register 232
property 2 regular (set) 477
- of structures, first-order 185 - open (algebra/set) 147 (4.4.15)
propositional calculus 35ff relation 2, 248
-, intuitionistic 433 -, arithmetical 325 (7.3.2), 361
pseudo- 537 -, basic (of structure) 9, 162
Putnam, H. xxi, 311, 315 -, elementary 284
-, first-order/second-order 248
relativization 102, 480, 484
quantification, bounded 251 (6.6.9)
remote 554
-, unbounded 278 Rep see Axiom of Replacement
quantifier, existential/universal 13f, represent, strongly/weakly 324f (7.3.1)
15, 19 representable, strongly/weakly (in
quotient (of Boolean algebra) 135 encoded theory) 371 (8.3.3)
- pair of sets (in encoded theory) 375
Ramsey, F. P. 225 respect 157
-’s Theorem 218 (5.7.4) restriction (of function/operation/relation)
range 2, 473 2f, 473
rank (of set) 478 - (of structure) 165
Rasiowa, H. 160, 199, 225, 437 Rice, H. 315
- Sikorski Theorem 158 (4.7.3) Riemann, B. 224
r. e. see recursively enumerable Robinson, A. 225, 458, 531f, 572, 574f
realize 203 -’s Consistency Theorem 457 (9.13.6)
recursion, ordinal (transfinite) 475 -’s Overspill Lemma 541 (11.2.11)
Recursion Theorem 273 (6.10.19) Robinson, J. 311, 315
- - for R. E. Sets 284 (6.11.13) Robinson, R. M. 315, 340, 360
recursive see function(al), recursive Rogers, H. 314, 398
-, φ- see function(al), recursive relative Rosser, J. B. 356, 360
to sequence Rotman, B. 160, 487


RP1 see Reflection Principle, First S^m_n Theorem 271 (6.10.13-14)


RP2 see Reflection Principle, Second -for r. e. sets 283 (6.11.11)
RT see Recursion Theorem Smorynski, C. 457
rule (of first-order tableaux) 68ff Smullyan, R. M. 48, 315, 398, 457
- (of intuitionistic first-order tableaux) sound (set of sentences) 318
414 soundness, semantic (of first-order
- (of intuitionistic propositional tableaux) 69 (2.4.1)
tableaux) 410f - (of intuitionistic tableaux) 421
- (of propositional tableaux) 28ff (9.6.11)
-, EM- 32, 82 (of modus ponens) 34f (1.10.1)
- (for ►-) 409f - (of predicate calculus) 109 (3.1.2)
-, ± 432 -, - (of propositional calculus) 36
Russell, B. 124 (1.10.3)
-’s paradox 462 -, - (of propositional tableaux) 26
Ryll-Nardzewski, C. 208, 225 (1.7.1)
stable 3
Sacks, G. 398 standard approximation 555
satisfiable 53 (2.1.5) - object 537
satisfy 22 (1.6.7), 53, 56, 163, 500 - place see place, standard
saturated, finitely 212 (5.6.15) state (of Kripke system) 416
Schröder, E. 488 Stone, M. H. 160
Schütte, K. 48, 445, 458 - Representation Theorem 141 (4.4.1)
scope (of quantifier) 17 - space 143
- (of *set) 539 string 16
Scott, D. S. 225, 525 structure 9, 49, 162, 214
semantics 6 -, basic 197
sentence 12, 56 -, R-valued 200 (5.5.14)
Sep see Axiom of Separation -, canonical 195
separating (field of sets) 145 -, standard (- of natural numbers)
Separation Lemma 373 (8.4.5) 317
sequence 228 Sturgis, H. 314
set 1, 461, 537 subalgebra 131
*set 538 subformula 18
Shepherdson, J. 314 sublattice 126
Shoenfield, J. 398f substitution 57ff
Sikorski, R. 139 (4.3.15), 160, 199, substructure 165, 533
225, 437 -, elementary 165, 501
Simultaneous Recursion Theorem 276 subterm 18
(6.10.24) supremum 4
Skolem, T. 124, 225, 359, 529f, 573 surjection 2
- form 95ff symbol, extralogical/logical 16
- hull 217 Symmetric Lemma 374 (8.4.6)
- set 216 syntax 6
-’s paradox 508
- structure 217 tableau, first-order 67ff
- term 216 -, intuitionistic first-order 414
Slomson, A. 225 propositional 410f

-, propositional 25ff ultraproduct 179


-, pure 72 Union see Axiom of Union
Tarski, A. 48, 52, 107, 146, 159, 225, universal (recursive function(al)/program)
329, 360 267f
tautology 22 (1.6.7) -, countably 213 (5.6.16)
term 10, 16 universe (of structure) 9, 49, 162
- (of formula) 416 -, constructible 511
-, bounded ordinal 476 - of sets 461
-, closed 54 Unlimited Register Ideal Machine
-, virtual 104ff, 464
theory 184, 317 used up (formula in tableau branch)
- (of structure) 185 27 (1.7.3)
-, axiomatic/axiomatizable 332 (7.5.2)
-, creative see creative valuation 50
-, encoded 370 (8.3.1) van Heijenoort, J. xxi
Thomason, R. 458 variable 10, 15
tiny 544 -, chosen (for relativization) 102
topology, monadic 549 (11.3.21) -, critical 58, 414
totally ordered set 3 -, free 55, 464 see also free occurrence
transference 538 -, syntactic 9
transitive (formula) 479 - of quantification 17
- (set) 468 variant (of formula/tableau) 62
- closure 479 (2.3.7), 73
Troelstra, A. 457f Vaught, R. L. 213n, 225
true 317 see also truth value Veldman, W. 457
truth, logical 53 (2.1.5) von Neumann, J. 529f, 574
- definition 52, 331 (7.4.10)
- table 21f
well-ordered 4
- valuation 20 (1.6.1)
Whitehead, A. N. 124
- value 12, 20
Turing, A. 314, 399
type 204 Z see Zermelo set theory
-, reduced 204 Zakon, E. 575
Zermelo, E. 529
ultrafilter 135, 547 (11.3.13) - Fraenkel set theory (ZF, ZFC) 495
-, perfect 196 - set theory (Z) 522 (10.10.2)
Ultrafilter Theorem 136 (4.3.6) - structure 536
ultramonad 547 (11.3.13) ZF, ZFC, see Zermelo-Fraenkel set theory
ultrapower 179 ZFL 517
-, Boolean 202 (5.5.17) Zorn’s lemma 4
INDEX OF SYMBOLS*

N 1, 226, 471 deg 16, 17, 162, 2


ω 1, 473 405
P 1, 466 19
<*i. — > *„) 1 A 19
A1X...XA„ 1 V 19
An 1 <-► 19
dom 2, 473 3 19
ran 2, 473 T 20
i— 2 JL 20
f\x 2 1= 22 (1.6.7), 53.
/m 2 l=o 24 (1.6.11)
f-1 m 2 Ho 35
f~Hx) 2 v 50
fog 2 50
3 fU 50
A1
pU 50
EM*
UI
3
a(x/u) 50
inf 4
s(x/t) 57 (2.3.1)
sup 4
a(x/t) 59 (2.3.3),
Ijcl
i i 4, 489
63 (2.3.9)
of 4
t= 53 (2.1.5)
6, 316, 404, 459,
532, 536 62, 73
— 11, 15, 162 *(c/t) 64 (2.3.13)
“1 15 «(Xi/ti,
-*■ 15 65 (2.3.14)
V 15 89

* The symbols are listed in order of first occurrence. A symbol used in more than one
sense may be listed more than once.

97 (2.10.1) 191, 487


3!
1 106, 464 |<P| 191
h 109, 460 B{Z) 193
119 (3.3.10) 21 (U) 195
m
a (in lattice) 125 T(U) 195
v (in lattice) 125 a> 203
n 0> n 203, 317
V 127
4>(2I ,a) 204
i=l
n
U(%a) 204
A 127
Lc 206
/=1
217
V 127 $«(*)
A_ 128 X 226
0 (in Boolean a 226
algebra) 128 oo 227
1 (in Boolean = 228
algebra) 128 X 229
X* 128 F 229
CX 131 (4.2.2) R; 232
FA 131 (4.2.2) K 232
2 131 (4.2.2) Zi 234
134, 165 Si 234
B/F 135 Ai 234
SB 141 Ji.j.fc 234
153 237

<P> 154 c 240


a(n/b) 163 p 240

K 164 M 240
Cl 165, 460 Z 240
311B 165 s 240
EE 165 A 240
-< 166 Ki 240
(21,o) 168 z 240 (6.5.2)
168 s 240 (6.5.2)
I VII 168 A 240 (6.5.2)
|a| 175 Am 240 (6.5.2)
2(A) 174f “1 251 (6.6.6)
fl& 179 A 251 (6.6.6)
179 V 251 (6.6.6)
Th(3l) 185 — 251 (6.6.6)

= 251 (6.6.7) Tm 328 (7.4.1)


r=s 251 (6.6.7) Fla 328 (7.4.2)
3z<y 253 (6.6.9) Vbl 328 (7.4.3)
\/z<y 253 (6.6.9) Frm 328 (7.4.4)
min z<y 253 (6.6.10) sb 329 (7.4.5)
q 254 (6.6.11) d 329 (7.4.5)
rm 254 (6.6.11) T* 332 (7.5.1)
Px 254 (6.6.12) n0 334
exp 254 n n* 336
(z)x 254 (6.6.12) 336
lh 254 (6.6.12) n2 340 f
J 255 (6.6.13) n 343
K 255 (6.6.13) i°m 361 (8.1.1)
L 255 (6.6.13) n°m 361 (8.1.1)
# 259, 327 f 361 (8.1.1)
A
z 259 01 365 (8.1.10)
{;} 260 (6.8.1) ran{e} 377 (8.5.4)
1 262 O 388
Tn 263 f (6.8.3) O' 392
t; 263 f (6.8.3) ► 408
T 263 f (6.8.3) — 410
T* 263 f (6.8.3) * 410f
U 264 (6.8.5) £>0 412
s: 271 (6.10.14) o 414
Ac 278 ft 416
By 278 T* 416
vy 278 T (in Kripke
dom„{e} 281 system) 416
P 286 (6.12.5) F* 416
«(ti» •••> *„) 317 F 416
r(ti,.... tn) 317 IF 416f
+ 317 416 f
X 317 433
s 317 434
0 317 € 459
sk 317 460
317 460
Sa

m 317 3 460
£1 317 460

3*x 460 a(<p) 480


V 461 484
0 465 <pF) 484
{x:<p} 465 Inj 485

u 466 Isom 485


1 466 Ex 485
2 466 F-ac 488
un 467 Card 489
K, 467 490
467 491
<«1» Reflp,.*>)
467 (p 1 495
467 F„ 498
467 F 498
M
(u,v) 467 ec 498
(u) 467 x(y/z) 498
n 467 Sat 500
u 467 ES 501
— 467 Dm 510
{xGw:q>} 468 . V 510
{t:.Yi€wi a... L(x) 511
• •• ax„€«„a<p} 468 A(x) 511
Trans 468 533
Ew 468 91 c 533
Ord 468 < 533 (11.1.3)
a, /?, y,... 469 G 536
< 469 ll 536
X+ 1 469 Fun 537
F 471 * (in nonstandard
Lim 471 analysis) 537 f
/, j, k, m, n 471 nt 537
Px 473 537
Fun (/) 473 t= (in nonstandard
f\x 473 analysis) 537
t\x 473 V 537
K 477 tv 538
Reg 477 O 538
e 478 a, A 539
Trans„(jc) 479 f‘a 542
TC 479 /* 542

y 544 °q 555
\l<§ 544 R 566
F 545 (11.3.4) P 567
x~ 546 (11.3.8) <P 567
553 568
Kp)
553 o 568
553, 564