
Algorithms & Complexity Theory

Matei Popovici
Computer Science Department
POLITEHNICA University of Bucharest

August 19, 2015



0.1 Introduction
0.1.1 The Apollo example
The story
On the 11th of April 1970, the Apollo 13 mission took off, with the intention of landing on the Moon. Two days into the mission, an explosion in the Service Module forced the crew to abort the mission and attempt a return to Earth. Facing extreme hardship, especially due to loss of power, the crew finally succeeded in returning to Earth. Both the crew and the support team from the Space Center were forced to find fast and simple solutions to apparently insurmountable problems.¹
Drawing inspiration from this scenario, we consider the following illustrative example.

The space mission S takes off having the Moon as destination. Some time after take-off, S notices a malfunction in the main engine, which fails to respond to command inputs. Trying to hack-control the engine, the following circuit is identified:

ϕ ≡ (¬A ∨ B ∨ D) ∧ (¬B ∨ C ∨ D) ∧ (A ∨ ¬C ∨ D) ∧ (A ∨ ¬B ∨ ¬D)
    ∧ (B ∨ ¬C ∨ ¬D) ∧ (¬A ∨ C ∨ ¬D) ∧ (A ∨ B ∨ C) ∧ (¬A ∨ ¬B ∨ ¬C)

If ϕ can be made to output 1, then the engine can be manually started. S however notices that no apparent input for A, B, C and D makes the circuit yield 1. S requires advice from Control as to how to proceed in this situation.

The solution which Control must provide depends on some key "background" information:

(i) The position of S allows only a 5-minute window for a decision to be made. After 5 minutes, a safe mission abort is no longer possible.²

(ii) Trying to find the solution "by hand" is infeasible and likely to produce errors.
¹ One example is the famous improvisation which made the Command Module's square CO2 filters operable in the Lunar Module, which required filters of a round shape.
² This is highly similar to the real situation of Apollo 13: their position and malfunctions made a direct abort and return to Earth impossible. Instead, their return trajectory involved entering Moon orbit.

(iii) The actual problem to be solved consists in finding a certain "input" I, which assigns to each of A . . . D a value from {0, 1}, such that the circuit ϕ yields 1 (which we denote as ϕ(I) = 1). We call this problem S(ϕ).

(iv) To solve S(ϕ), one needs to find an algorithm for computing ϕ(I), given an input I.

Computing ϕ(I). It is helpful to first examine the structure of ϕ, which shows a certain regularity. ϕ is of the form:

(L^1_1 ∨ . . . ∨ L^1_{n_1}) ∧ . . . ∧ (L^k_1 ∨ . . . ∨ L^k_{n_k})

where each (L^i_1 ∨ . . . ∨ L^i_{n_i}) is a clause consisting of n_i literals, and each literal L^i_j is of the form X or ¬X, where X is a variable (input).
Next, we identify a suitable representation from which our algorithm can benefit. Assume I is represented as a vector which, for convenience, is indexed by variable names instead of 0 . . . n−1 (n is the number of variables). Assume ϕ is represented as a matrix, where each column ϕ[i] represents a vector of literals. The value ϕ[i, X] = 1 indicates that X appears as itself in clause i, ϕ[i, X] = 0 indicates that X appears as the literal ¬X, while ϕ[i, X] = ⊥ indicates that X is not present in clause ϕ[i].
The algorithm to solve our problem, entitled CHECK(ϕ, I), proceeds as follows:

a) set v = 0

b) for i = 1, k:
     re-set v = 0
     for each variable X from I:
       set v = v + ϕ[i, X] ⊕ I[X]
     if v = 0, stop and return 0; otherwise, continue

c) return 1
The operation ⊕ is a customized XNOR, which behaves as follows: a ⊕ b is always 0 if either of a or b is ⊥. Hence, if a variable X does not appear in a clause i, then ϕ[i, X] = ⊥ will not influence the value of v. Otherwise, a ⊕ b is 1 iff both operands are equal. Hence ϕ[i, X] ⊕ I[X] = 1 if X occurs as itself in clause i and is given value 1 in I, or if X occurs as ¬X in clause i and is given value 0 in I.

At the end of the inner loop, CHECK will have computed the value of some clause (L^i_1 ∨ . . . ∨ L^i_{n_i}), which is 1 if at least one literal is 1. If some clause has the computed value 0, then ϕ is 0 for the input I.
Finally, note that CHECK performs at most n ∗ k computations, where n is the number of variables, and k is the number of clauses in ϕ.

Computing S(ϕ). To solve S(ϕ), it is sufficient to take all possible inputs I over the variables which occur in ϕ, and for each one, perform CHECK(ϕ, I).
This can be achieved by viewing the vector I as a binary counter. Hence, the operation I++ is implemented as follows, where the variables X are iterated in some arbitrary order, which is fixed with respect to the algorithm:
a) for each variable X in I:
     if I[X] = 1, make I[X] = 0 and continue;
     otherwise, make I[X] = 1 and stop;

b) overflow;

The operation I++ is said to overflow if instruction b) is reached, namely if all variables are set to 0 upon a traversal of all variables.
Now, we implement FIND(ϕ):

a) set I to be the input where all variables are set to 0;

b) while overflow was not signalled:
     if CHECK(ϕ, I) = 1 then return 1;
     otherwise I++;

c) return 0
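Continuing the sketch, a possible Python rendering of I++ and FIND, reusing the check function above (again, the representation and helper names are our own choices):

def increment(assignment, order):
    """The I++ operation: view the assignment as a binary counter over
    the variables, taken in a fixed order; return True on overflow."""
    for x in order:
        if assignment[x] == 1:
            assignment[x] = 0      # flip 1 -> 0 and carry on
        else:
            assignment[x] = 1      # flip the first 0 -> 1 and stop
            return False
    return True                    # wrapped around: all variables are 0 again

def find(phi, variables):
    assignment = {x: 0 for x in variables}
    while True:
        if check(phi, assignment) == 1:
            return 1
        if increment(assignment, variables):
            return 0               # all 2^n assignments were tried

Calling find(phi, ["A", "B", "C", "D"]) on the example formula examines at most 2^4 = 16 assignments.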
We now have a procedure for solving S(ϕ), one which Control may use in order to assist the Mission. However, several issues arise:

(V) is FIND(ϕ) correct, that is, bug-free and returning an appropriate result?

(M) how long does FIND(ϕ) take to run, with respect to the size of the input (i.e. the size of the matrix ϕ)?

(C) is FIND(ϕ) an optimal algorithm? Are there ones which perform better?

Next, we discuss each issue in more detail. (V) denotes the Verification problem. Verification of computer programs (that is, of algorithm implementations) is essential nowadays, as we rely more and more on software to perform critical tasks: assisting us in driving cars and flying planes, guiding missiles and performing surgeries. Verification is equally important for the mission.
(M) identifies a measurement problem, namely: in what way are critical resources consumed by an algorithm? Traditionally, these resources are time, i.e. the number of CPU operations, and space, i.e. computer memory. We note that, for our given formula, CHECK(ϕ, I) will perform at most 8 ∗ 4 = 32 CPU operations, where 8 is the number of clauses, and 4 is the number of variables of any input. FIND(ϕ) will build at most 2^n assignments I, where n is the number of variables. In our example, n = 4, thus we have 16 different ways of assigning A, B, C and D to values 0, 1. For each, we run CHECK(ϕ, I), yielding a total of 16 ∗ 32 = 512 CPU operations.
For the sake of our example, let us assume the Mission's computers ran one CPU instruction per second, which is not unlikely, given the hardware available in the '70s. Thus, in 5 minutes, we can perform 5 ∗ 60 = 300 CPU operations! Note that FIND(ϕ) does not finish in sufficient time: if Control used this algorithm, it would waste the Mission's time.
Finally, (C) designates a complexity problem, one specific to Complexity Theory: "Do efficient algorithms exist for a given problem?". There are also sub-questions which spawn from the former:

• is there a better encoding of the input and the variables from I, which makes the algorithm run faster?

• is there a certain machine which allows performing computations in a more efficient (faster) way?

Unfortunately, the problem which we denoted by S is traditionally called SAT, i.e. the Satisfiability problem for boolean formulae. It is known that efficient algorithms to solve SAT are unlikely to exist, no matter the machine of choice. Translated into simple terms: in the general case, we cannot get substantially better than an exponential number of steps, with respect to the size of the formula. If Control had this knowledge, it would take no trouble in looking for specific algorithms. It would recognize the problem as impossible to solve under the given time constraints, and would recommend the Mission to return home.
To recap, (V), (M) and (C) play an important role in decision making, both at the software level and beyond. Naturally hard problems (re)occur in every field of science. Having at least a minimal knowledge of their nature is a critical step in attempting to solve them.

Disclaimer
SAT is probably one of the most studied problems in Computer Science. SAT solvers are key components of many algorithms which solve even harder problems: program verification, code optimisation, cryptography, to name only a few. While reading this, it is highly probable that the reader is employing some technology which relies on SAT solvers, directly or indirectly.
Are these solvers efficient, in the sense that they run in less than exponential time? For the general case, the answer is no. However, these solvers will run fast enough for most formulae. This doesn't really help Control, unless they are lucky enough to stumble upon a "nice" formula.
The intuition behind SAT solvers is that they rely on an efficient graph-like structure to represent a formula. This structure is called an OBDD (Ordered Binary Decision Diagram). The efficiency of an OBDD is unfortunately dependent on finding a "suitable" ordering for the variables. The latter is, in itself, a hard problem. However, there are good heuristics which come close. The overall result is that SAT solvers perform well in "many" cases, and the "many" is measurable with controlled precision.

0.1.2 Which is harder?


Consider the following problems:

A telecommunications company T owns radios across the country, and needs to monitor all links to record performance. Monitoring each radio is expensive, thus T would like only a small number k of radios to be monitored. Is this possible?

An organism consists of a number of cells, and connections between cells. Some cells are good, while some are bad. The bad cells can be detected by examining their links: they form a complete mesh, i.e. each bad cell is connected to all the others. Are there k bad cells in the organism?

It is easy to notice that both problems can be cast as graph problems. Let G = (V, E) be an undirected graph. If we interpret each node as a radio, and each edge as a radio link, then solving the former problem boils down to finding a subset S ⊆ V such that, for all (a, b) ∈ E, at least one of the following holds: a ∈ S, b ∈ S. Hence, S "covers" all edges from E.

If we interpret nodes as cells, and edges as connections between cells, then the latter problem consists in finding a subset S ⊆ V (of size |S| = k) such that (a, b) ∈ E for each a, b ∈ S. Hence S is a clique (of size k).
One interesting question which may be raised is whether one problem is harder than the other. We note that a graph capturing the first problem can be transformed into one capturing the latter, and vice-versa.
If we start from the "telecommunications" graph, we can build a "cell" graph by: (i) creating one cell per radio; (ii) if two radios do not share a link, then the corresponding cells will share a connection. Thus, if some subset S of radios covers all links, then all the cells corresponding to radios outside S must share connections, hence they are bad. A sketch of this transformation is given below.
We note that the transformation can be easily adjusted to work in the other direction. Thus, if one has an algorithm to solve the telecommunications problem, then, via a transformation, we can solve the cell problem, and vice-versa.
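A minimal Python sketch of the transformation, under the assumption (ours) that a graph is given as a set of nodes plus a set of edges represented as two-element frozensets:

from itertools import combinations

def complement(nodes, edges):
    """Build the 'cell' graph: connect two cells iff their radios do not share a link."""
    all_pairs = {frozenset(p) for p in combinations(nodes, 2)}
    return all_pairs - edges

def covers(S, edges):
    """Does S touch every edge, i.e. is every link monitored?"""
    return all(e & S for e in edges)

def is_clique(S, edges):
    """Is every pair in S connected, i.e. a complete mesh of bad cells?"""
    return all(frozenset(p) in edges for p in combinations(S, 2))

nodes = {1, 2, 3, 4}
links = {frozenset(p) for p in [(1, 2), (2, 3), (3, 4)]}
S = {2, 3}                                   # radios to monitor
assert covers(S, links)                      # S covers all links...
assert is_clique(nodes - S, complement(nodes, links))  # ...so V \ S is a clique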
This observation highlights several issues:

• If such a transformation exists between two problems, what does it say about the complexity of solving the problems?

• Is it always the case that transformations are bijective?

• What are the properties of the transformation, such that it is effective (can be used for problem solving)? For instance, if some problem B can be transformed into A in exponential time³, and A can be solved in polynomial time, then solving B via A yields an exponential-time algorithm.

• Can (appropriate) transformations be used to characterize problems which are equally hard?

In the previous section, we illustrated a problem which can be solved in exponential time. We claimed, without arguing, that this problem cannot be solved in polynomial time. Given some problem P, if we can appropriately transform the former problem into P, we can also argue that P cannot be solved in polynomial time: if it could, an algorithm for P would also solve our former problem.
Developing a formal methodology from the above intuition is a critical tool for assessing problem hardness.
³ with respect to the size of the input

0.2 Computability Theory


0.2.1 Motivation
Goldbach conjecture [Matei: https://en.wikipedia.org/wiki/Wang_tile#Applications]

0.2.2 Problems and problem instances


In the previous section, we have illustrated the problem SAT, as well as pseudocode which describes a solution running in exponential time. We have seen that such a solution is infeasible in practice, and also that no (predictable) technological advance can help. The main question we asked (and also answered) is whether there exists a faster procedure to solve SAT. (We have conjectured that the answer is no, or, more precisely, not likely.) To generalize a little, we are interested in the following question:

Can a given problem Q be solved in “efficient” time?

For now, we sidestep the currently absent definition of "efficient", and note that such a question spawns another (which is actually more straightforward to ask):

Can a given problem Q be "solved" at all?

In this chapter, we shall focus on answering this question first; to do so, we need to settle the following issues:

• What exactly is a “problem”?

• What does it mean to “solve” a problem?

Definition 0.2.2.1 (Abstract problem, problem instance) A problem instance is a mathematical object of which we ask a question and expect an answer.
An (abstract) problem is a mapping P : I → O, where I is a set of problem instances of which we ask the same question, and O is a set of answers. P assigns to each problem instance i ∈ I the answer P(i) ∈ O.

It is often the case that the answers we seek are also mathematical objects. For instance, the vector sorting problem must be answered by a sorted vector. However, many other problems prompt yes/no answers. Whenever O = {0, 1}, we say that P is a decision problem. Many other problems can be cast as decision problems. The vector sorting problem may be seen as a decision problem if we simply ask whether the problem instance (i.e. the vector) is sorted. The original sorting problem and its decision counterpart may not seem equivalent in terms of hardness: sorting is solved in O(n log n) time (using "standard" algorithms), while deciding whether a vector is sorted can be done in linear time. We shall see that, from the point of view of Complexity Theory, both problems are equally hard.
Definition 0.2.2.1 may seem abstract and unusable, for the following reason: the set I is hard to characterize. One solution may be to assign types to problem instances. For example, graph may be a problem instance type. However, such a choice forces us to reason about problems separately, based on the type of their problem instances. Also, types themselves form an infinite set, which is also difficult to characterize.
Another approach is to level out problem instances, starting from the following key observations: (i) each i ∈ I must be, in some sense, finite. For instance, vectors have a finite length, hence a finite number of elements. Graphs (of which we ask our questions) also have a finite set of nodes, hence a finite set of edges, etc. (ii) I must be countable (but not necessarily finite). For instance, the problem P : R × R → {0, 1}, where P(x, y) returns 1 iff x and y are equal, makes no sense from the point of view of computability theory. Assume we would like to answer P(π, √2). Simply storing π and √2, which takes infinite space, is impossible on machines, and also takes us back to point (i).
The observations suggest that problem instances can be represented via a finite encoding, which may be assumed to be uniform over all possible mathematical objects we consider.

Definition 0.2.2.2 (Encoding problem instances) Let Σ be a finite set which we call an alphabet. A one-letter word is a member of Σ. A two-letter word is any member of Σ × Σ = Σ². For instance, if Σ = {a, b, . . .}, then (a, a) ∈ Σ² is a two-letter word. An i-letter word is a member of Σⁱ. We denote by:

Σ∗ = {ε} ∪ Σ ∪ Σ² ∪ . . . ∪ Σⁱ ∪ . . .

the set of finite words which can be formed over Σ. ε is a special word which we call the empty word. Instead of writing, e.g. (a, b, b, a, a) for a 5-letter word, we simply write abbaa. Concatenation of two words is defined as usual.

Remark 0.2.2.1 We shall henceforth consider that a problem P : I → O has the following property: if I is infinite, then I ≃ Σ∗ (I is isomorphic to Σ∗). Thus, each problem instance i can be represented as a finite word enc(i) ∈ Σ∗, for some alphabet Σ.

We shall postpone, for now, the question of choosing the appropriate Σ for our problem (the above remark simply states that such a Σ must exist).
Definition 0.2.2.2 and Remark 0.2.2.1 can be easily recognized in practice. A programmer always employs the same predefined mechanisms of his programming language (the available datatypes) to represent his program inputs. Moreover, these objects ultimately become streams of bits when they are actually processed by the machine.
Making one step further, we can observe the following property of alphabets, which conforms with (ii):

Proposition 0.2.2.1 For any finite Σ, Σ∗ is countably infinite.

Proof: We show Σ∗ ≃ N. We build a bijective function h which assigns a unique natural number to each word. We assign 0 to ε. Assume |Σ| = n. We assign to the one-letter words the numbers from 1 to n. Next, we assign to each k ≥ 2-letter word w = w′x the number n ∗ h(w′) + h(x). If n = 2, we easily recognise that each binary word is assigned to its natural equivalent. □
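A small Python sketch of h for Σ = {a, b}; the function and alphabet names are our own:

SIGMA = "ab"

def h(word):
    """The bijection from the proof: h(epsilon) = 0, h(w'x) = n*h(w') + h(x)."""
    n = len(SIGMA)
    value = 0
    for ch in word:
        value = n * value + (SIGMA.index(ch) + 1)  # letters are numbered 1..n
    return value

# h("") == 0, h("a") == 1, h("b") == 2, h("aa") == 3, h("ab") == 4, h("ba") == 5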
Hence, we have the following diagram:

i ∈ I ↔ enc(i) ∈ Σ∗ ↔ h(enc(i)) ∈ N

Hence, we can view a problem instance as a natural number, without losing the ability to uniquely identify the instance at hand. Thus:

Definition 0.2.2.3 (Problem) A problem is a function f : N → N. If some n encodes a problem input, then f(n) encodes its answer. A decision problem is a function f : N → {0, 1}.

To conclude: when trying to solve concrete problems, the encoding issue is fundamental, and it is dependent on the type of problem instances we tackle. From the perspective of Computability Theory, which deals with problems in general, the encoding is inessential, and can be abstracted away, without "loss of information", by a natural number.

0.2.3 Algorithms as Turing Machines


Algorithms are usually described as pseudo-code, and intended as abstractions over concrete programming language operations. The level of abstraction is usually not specified rigorously, and is decided in an ad-hoc manner by the writer. From the author's experience, pseudo-code is often dependent on (some future) implementation, and only abstracts from language syntax, possibly including data initialization and subsequent handling. Thus, some pseudo-code can be easily implemented in different languages, to the extent to which the languages are the same, or at least follow the same programming principles.
The above observation is not intended as criticism towards pseudo-code and pseudo-code writing. It is indeed difficult, for instance, to write pseudocode which does not seem vague, and which can be naturally implemented in an imperative language (using assignments and iterations) as well as in a purely functional language (where iterations are possible only through recursion).
As before, we require a means for leveling out different programming styles and programming languages, in order to come up with a uniform, straightforward and simple definition of an algorithm.
The key observation here is that programming languages, especially the newest and most popular ones, are quite restrictive w.r.t. what the programmer can do. This may seem counter-intuitive at first. Consider typed languages, for instance. Enforcing each variable to have a type is obviously a restriction, and has a definite purpose: it helps the programmer write cleaner code, which is less likely to crash at runtime. However, this issue is irrelevant from the point of view of Computability Theory. If we search for less restrictive languages, we find the assembly languages. The restrictions here are minimal (as is the "programming structure").
The formal definition for an algorithm which we propose can be seen as an abstract assembly language, where all technical aspects are put aside. We call such a language the Turing Machine.

Definition 0.2.3.1 (Deterministic Turing Machine) A Deterministic Turing Machine (abbreviated DTM) is a tuple M = (K, F, Σ, δ, s0) where:

• Σ = {a, b, c, . . .} is a finite set of symbols which we call the alphabet;

• K is a set of states, and F ⊆ K is a set of accepting/final states;

• δ : K × Σ → K × Σ × {L, H, R} is a transition function which assigns to each state s ∈ K and symbol c ∈ Σ the triple δ(s, c) = (s′, c′, pos);

• s0 ∈ K is the initial state.

The Turing Machine has a tape which contains infinitely many cells in both directions, and each tape cell holds a symbol from Σ. The Turing Machine has a tape head, which is able to read the symbol from the current cell. Also, the Turing Machine is always in a given state. Initially (before the machine has started), the state is s0. From a given state s, the Turing Machine reads the symbol c from the current cell and performs a transition. The transition is given by δ(s, c) = (s′, c′, pos). Performing the transition means that the TM moves from state s to s′, overwrites the symbol c with c′ on the tape cell and: (i) if pos = L, moves the tape head to the next cell to the left, (ii) if pos = R, moves the tape head to the next cell to the right, and (iii) if pos = H, leaves the tape head on the current cell.
The Turing Machine will perform transitions according to δ.
Whenever the TM reaches an accepting/final state, we say it halts. If a TM reaches a non-accepting state where no further transitions are possible, we say it clings/hangs.

• the input of a Turing Machine is a finite word which is contained on its otherwise empty tape;

• the output of a TM is the contents of the tape (not including empty cells) after the Machine has halted. We also write M(w) to refer to the output of M, given input w.

[Matei: Comments on the Turing Machine vs. programming languages. The TM is resource-unbound!]

Example 0.2.3.1 (Turing Machine) Consider the alphabet Σ = {#, >, 0, 1}, the set of states K = {s0, s1, s2}, the set of final states F = {s2} and the transition function:

δ(s0, 0) = (s0, 0, R)    δ(s0, 1) = (s0, 1, R)
δ(s0, #) = (s1, #, L)    δ(s1, 1) = (s1, 0, L)
δ(s1, 0) = (s2, 1, H)    δ(s1, >) = (s2, 1, H)

The Turing Machine M = (K, F, Σ, δ, s0) reads a number encoded in binary on the tape, and increments it by 1. The symbol # encodes the empty tape cell.⁴ Initially, the tape head is positioned at the most significant bit of the number. The Machine first goes over all bits, from left to right. When the first empty cell is detected, the machine goes into state s1, and starts flipping 1s to 0s, until the first 0 (or the initial position, marked by >) is detected. Finally, the machine places 1 on the current cell, and enters its final state.
The behaviour of the transition function can be represented more intuitively as in Figure 2.3.1. Each node represents a state, and each edge a transition. The label on each edge is of the form c/c′, pos, where c is the symbol read from the current tape cell, c′ is the symbol written on the current tape cell, and pos is a tape head position. The label should be read as: the machine replaces c with c′ on the current tape cell and moves in the direction indicated by pos.

⁴ We shall use # to refer to the empty cell throughout the text.

Figure 2.3.1: The binary increment Turing Machine, drawn as a state diagram: s0 (start) loops with c/c, R for c ∈ {0, 1}; an edge #/#, L leads from s0 to s1; s1 loops with 1/0, L; edges 0/1, H and >/1, H lead from s1 to s2.

Let us consider that, initially, the tape contains >0111, the representation of the number 7. The evolution of the tape is shown below. Each line shows the TM configuration at step i, that is, the tape and current state after transition i. For convenience, we show only two empty cells in each direction. Brackets indicate the position of the tape head.

Transition no    Tape           Current state
0                ##>[0]111##    s0
1                ##>0[1]11##    s0
2                ##>01[1]1##    s0
3                ##>011[1]##    s0
4                ##>0111[#]#    s0
5                ##>011[1]##    s1
6                ##>01[1]0##    s1
7                ##>0[1]00##    s1
8                ##>[0]000##    s1
9                ##>[1]000##    s2
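As an illustration, here is a minimal Python simulator for this machine; the dictionary representation of the tape and the function names are our own choices, not part of the formal definition:

def run_tm(delta, finals, state, tape, head):
    """Simulate a deterministic TM until it reaches a final state
    (raises KeyError if the machine hangs)."""
    while state not in finals:
        symbol = tape.get(head, '#')              # unset cells are empty
        state, written, move = delta[(state, symbol)]
        tape[head] = written
        head += {'L': -1, 'H': 0, 'R': 1}[move]
    return tape

delta = {
    ('s0', '0'): ('s0', '0', 'R'), ('s0', '1'): ('s0', '1', 'R'),
    ('s0', '#'): ('s1', '#', 'L'), ('s1', '1'): ('s1', '0', 'L'),
    ('s1', '0'): ('s2', '1', 'H'), ('s1', '>'): ('s2', '1', 'H'),
}
tape = dict(enumerate(">0111"))                   # the number 7
run_tm(delta, {'s2'}, 's0', tape, 1)              # head starts on the MSB
print("".join(tape[i] for i in sorted(tape)).strip('#'))  # >1000, i.e. 8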

In order to better understand the Turing Machine, it is useful to establish some similarities with, e.g., assembly languages. As specified in Definition 0.2.3.1, a Turing Machine M specifies a clearly defined behaviour, which is actually captured by δ. Thus, M is quite similar to a specific program, performing a definite task. If programs (algorithms) are abstracted by Turing Machines, then what is the abstraction for the programming language? The answer is, again, the Turing Machine. This implies that a Turing Machine acting as a programming language can be fed another Turing Machine acting as a program, and execute it.

In the following Proposition, we show how Turing Machines can be encoded as words:

Proposition 0.2.3.1 (TMs as words) Any Turing Machine M = (K, F, Σ, δ, s0) can be encoded as a word over Σ. We write enc(M) to refer to this word.

Proof: (sketch) Intuitively, we encode states and positions as integers n ∈ N, transitions as tuples of integers, etc., and subsequently "convert" each integer to its word counterpart in Σ∗, cf. Proposition 0.2.2.1.
Let NonFin = K \ (F ∪ {s0}) be the set of non-final states, excluding the initial one. We encode each state in NonFin as an integer in {1, 2, . . . , |NonFin|} and each final state as an integer in {|NonFin| + 1, . . . , |NonFin| + |F|}. We encode the initial state s0 as |NonFin| + |F| + 1, and L, H, R as |NonFin| + |F| + i with i ∈ {2, 3, 4}. Each such integer is represented as a word using ⌈log_|Σ|(|NonFin| + |F| + 4)⌉ symbols.
Each transition δ(s, c) = (s′, c′, pos) is encoded as:

enc(s)#c#enc(s′)#c′#enc(pos)

where enc(·) is the encoding described above. The entire δ is encoded as a sequence of encoded transitions, separated by #. The encoding of M is:

enc(M) = enc(|NonFin|)#enc(|F|)#enc(δ)


Thus, enc(M) is a word, which can be fed to another Turing Machine. The latter should have the ability to execute (or simulate) M. This is indeed possible:

Proposition 0.2.3.2 (The Universal Turing Machine) There exists a TM U which, for any TM M and every word w ∈ Σ∗, takes enc(M) and w as input and outputs 1 whenever M(w) = 1 and 0 whenever M(w) = 0. We call U the Universal Turing Machine, and say that U simulates M.

Proof: Let M be a TM and w = c1c2 . . . cn be a word built from the alphabet of M. We build the Universal Turing Machine U as follows:

• The input of U is enc(M)#enc(s0)#c1#c2 . . . cn. Note that enc(s0) encodes the initial state of M, while c1 is the first symbol of w. The portion of the tape enc(s0)#c1#c2 . . . cn will be used to mark the current configuration of M, namely the current state of M (initially s0), the contents of M's tape, and M's current head position. More generally, this portion of the tape is of the form enc(si)#u#v, with u, v ∈ Σ∗ and si being the current state of M. The last symbol of u marks the current symbol, while v is the word to the right of the head. Initially, the current symbol is the first one, namely c1.

• U will scan the current state of M, then move to the current symbol, and finally move to the portion of enc(M) where transitions are encoded. Once a valid transition is found, U executes it:

1. U changes the current state, according to the transition;
2. U overwrites the current symbol, according to the transition;
3. U moves the position of the current symbol according to pos, from the transition;

• U repeats this process until an accepting state of M is detected, or until no transition can be performed. □


Propositions 0.2.3.1 and 0.2.3.2 show that TMs have the capability to characterize both algorithms, as well as the computational framework to execute them. One question remains: what can TMs actually compute? Can they be used to sort vectors, solve SAT, etc.? The answer, which is positive, is given by the following hypothesis:

Conjecture 0.2.3.1 (Church-Turing) Any problem which can be solved


with the Turing Machine is “ universally solvable”.

The term "universally solvable" cannot be given a precise mathematical definition. We only know solvability w.r.t. concrete means, e.g. computers, programming languages, etc. It can be (and has been) shown that the Turing Machine can solve any problem which known programming languages can solve.⁵ The Turing Machine, in itself, describes a model of computation based on side-effects: each transition may modify the tape in some way. Computation can be described differently, for instance as function application, or as term rewriting. However, all other known computational models are equivalent to the Turing Machine, in the sense that they solve precisely the same problems.

⁵ To be fair to the TM, one would formulate this statement as: "all programming languages are Turing-complete, i.e. they can solve everything the TM can solve".
This observation prompted the aforementioned conjecture. It is strongly
believed to hold (as evidence suggests), but it cannot be formally proved.

0.2.4 Decidability
[Matei: The fact that the totality problem is undecidable means that we cannot write a program that can find any infinite loop in any program. The fact that the equivalence problem is undecidable means that the code optimization phase of a compiler may improve a program, but can never guarantee finding the optimally efficient version of the program. There may be potentially improved versions of the program that it cannot even be sure are equivalent.]
The existence of the Universal Turing Machine (U) inevitably leads to interesting questions. Assume M is a Turing Machine and w is a word. We use the following convention: enc(M) w represents the input of U. Thus, U expects the encoding of a TM, followed by a special separator symbol, and then M's input w. (?) Does U halt for all inputs? If the answer is positive, then U can be used to tell whether any machine halts, for a given input.
We already have some reasons to believe we cannot answer (?) positively, if we examine the proof of Proposition 0.2.3.2. Actually, (?) is a decision problem, one that is quite interesting and useful.
As before, we try to lift our setting to a more general one: can any problem be solved by some Turing Machine? The following propositions indicate that this is not the case:

Proposition 0.2.4.1 The set TM of Turing Machines is countably infinite.

Proof: The proof follows immediately from Proposition 0.2.3.1. Any Turing Machine can be uniquely encoded by a word, hence the set of Turing Machines is isomorphic to a subset of Σ∗, which in turn is countably infinite, for any Σ. □

Proposition 0.2.4.2 The set Hom(N, N) of functions f : N → N is uncountably infinite.

Proof: It is sufficient to show that Hom(N, {0, 1}) is uncountably infinite. We build a proof by contradiction. We assume Hom(N, {0, 1}) is countably infinite. Hence, each natural number n ∈ N corresponds to a function fn ∈ Hom(N, {0, 1}). We build a matrix as follows: columns describe the functions fn, n ∈ N; rows describe inputs k ∈ N. Each matrix entry mi,j is the value of fj(i) (hence, the expected output of function fj for input i).

        f0   f1   f2   . . .   fn   . . .
0       1    1    0    . . .   0    . . .
1       0    1    1    . . .   0    . . .
2       1    0    1    . . .   1    . . .
. . .
n       1    1    0    . . .   1    . . .
. . .

Figure 2.4.2: An example of the matrix from the proof of Proposition 0.2.4.2. The values mi,j have been filled in purely for illustration.
In Figure 2.4.2, we have illustrated our matrix. We now devise a problem f∗ as follows:

f∗(x) = 1 iff fx(x) = 0, and f∗(x) = 0 iff fx(x) = 1

Since f∗ ∈ Hom(N, {0, 1}), it must also have a number assigned to it: f∗ = fα for some α ∈ N. Then f∗(α) = 1 if fα(α) = 0. But fα(α) = f∗(α). Contradiction. On the other hand, f∗(α) = 0 if fα(α) = 1. As before, we obtain a contradiction. □
Propositions 0.2.4.1 and 0.2.4.2 tell us that there are infinitely more functions (decision problems) than means of computing them (Turing Machines).
Our next step is to look at solvable and unsolvable problems, and devise a method for separating the former from the latter. In other words, we are looking for a tool which allows us to identify those problems which are solvable, and those which are not.
We start by observing that Turing Machines may never halt. We write M(w) = ⊥ to designate that M loops infinitely on input w. Also, we write nw ∈ N to refer to the number which corresponds to w, according to Proposition 0.2.2.1. Next, we refine the notion of problem solving:

Definition 0.2.4.1 (Decision, acceptance) Let M be a Turing Machine and f ∈ Hom(N, {0, 1}). We say that:

• M decides f iff for all w ∈ Σ∗: M(w) = 1 whenever f(nw) = 1, and M(w) = 0 whenever f(nw) = 0;

• M accepts f iff for all w ∈ Σ∗: M(w) = 1 iff f(nw) = 1, and M(w) = ⊥ iff f(nw) = 0.

Note that, in contrast with acceptance, decision is, intuitively, a stronger means of computing a function (i.e. solving a problem). A deciding TM can provide both a yes and a no answer to any problem instance, while an accepting TM can only provide the answer yes. If the answer to the problem instance at hand is no, the accepting TM will not halt.
Based on the two types of problem solving, we can classify problems (functions) as follows:

Definition 0.2.4.2 Let f ∈ Hom(N, {0, 1}) be a decision problem.

• f is recursive (decidable) iff there exists a TM M which decides f. The set of recursive functions is

R = {f ∈ Hom(N, {0, 1}) | f is recursive}

• f is recursively enumerable (semi-decidable) iff there exists a TM M which accepts f. The set of recursively enumerable functions is

RE = {f ∈ Hom(N, {0, 1}) | f is recursively enumerable}

Now, let us turn our attention to question (?), which we shall formulate as a problem:

fh(n_{enc(M) w}) = 1 iff M(w) halts, and fh(n_{enc(M) w}) = 0 iff M(w) = ⊥

Hence, the input of fh is a natural number which encodes a Turing Machine M and an input word w. The first question we ask is whether fh ∈ R.

Proposition 0.2.4.3 fh ∉ R.

Proof: Assume fh ∈ R and denote by Mh the Turing Machine which decides fh. We build the Turing Machine D as follows:

D(enc(M)) = ⊥ iff Mh(enc(M) enc(M)) = 1, and D(enc(M)) = 1 iff Mh(enc(M) enc(M)) = 0

The existence of the Universal Turing Machine guarantees that D can indeed be built, since D simulates Mh. We note that Mh(enc(M) enc(M)) decides whether the TM M halts with "itself" as input (namely enc(M)).
Assume D(enc(D)) = 1. Hence Mh(enc(D) enc(D)) = 0, that is, machine D does not halt for input enc(D). Hence D(enc(D)) = ⊥. Contradiction.
Assume D(enc(D)) = ⊥. Hence Mh(enc(D) enc(D)) = 1, and thus D(enc(D)) halts. Contradiction. □
We note that the construction of D mimics the technique applied in the proof of Proposition 0.2.4.2, which is called diagonalization.
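To make the construction concrete, here is a small Python sketch in which Turing Machines are (loosely) modelled as Python callables; halts is a hypothetical decider which, as the proof shows, cannot exist:

def make_D(halts):
    """Build the diagonal machine D from a claimed halting decider."""
    def D(prog):
        if halts(prog, prog):    # Mh claims prog(prog) halts...
            while True:          # ...so D deliberately diverges
                pass
        return 1                 # otherwise, D halts immediately
    return D

candidate = lambda prog, inp: True   # any concrete guess is wrong somewhere
D = make_D(candidate)
# `candidate` claims D(D) halts, yet by construction D(D) then loops forever;
# answering False instead would make D(D) halt. Either way, a contradiction.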

Exercise 0.2.4.1 Apply the diagonalization technique from the proof of


Proposition 0.2.4.2, in order to prove Proposition 0.2.4.3.

Proposition 0.2.4.4 fh ∈ RE.

Proof: We build a Turing Machine Mh which accepts fh. Essentially, Mh is the Universal Turing Machine: Mh(enc(M) w) simulates M on w, and if M(w) halts, it outputs 1. If M(w) does not halt, Mh(enc(M) w) = ⊥. □
Propositions 0.2.4.3 and 0.2.4.4 produce a classification for fh. The question which we shall now answer is how to classify any problem f, by establishing membership in R and RE, respectively. We start with a simple proposition:

Proposition 0.2.4.5 R ⊊ RE.

Proof: R ⊆ RE is straightforward from Definition 0.2.4.2. Let f ∈ R, and let Mf be the TM which decides f. We build the TM M′ such that M′(w) = 1 iff Mf(w) = 1, and M′(w) = ⊥ iff Mf(w) = 0. M′ simulates Mf, but enters an infinite loop whenever Mf(w) = 0. M′ accepts f, hence f ∈ RE.
R ≠ RE has already been shown by Propositions 0.2.4.3 and 0.2.4.4: fh ∈ RE but fh ∉ R. □
Thus, R and RE should be interpreted as a "scale" for solvability: membership in R is complete solvability, membership in RE is partial solvability, while non-membership in RE is "complete" unsolvability.

Remark 0.2.4.1 We note that R and RE are not the only sets of functions which are used in Computability Theory. It has been shown that there are "degrees" of unsolvability, of "higher level" than R and RE. These degrees are intuitively obtained as follows. We assume we live in a world where fh is decidable (recursive). Now, as before, we ask which problems are recursive and which are recursively enumerable. It turns out that, also in this ideal case, there still exist recursive and recursively enumerable problems, as well as some which are neither. This could be imagined as "undecidability level 1". Now, we take some problem which is in RE on level 1, and repeat the same assumption: that it is decidable. Again, under this assumption, we find problems in R, RE and outside the two, which form "undecidability level 2". This process can be repeated ad infinitum.

Returning to our simpler classification, we must observe an interesting feature of recursively enumerable functions, which is also the reason they are called this way.

Proposition 0.2.4.6 A function f ∈ Hom(N, {0, 1}) is recursively enumerable iff there exists a Turing Machine which can enumerate/generate all elements of Af = {w ∈ Σ∗ | f(nw) = 1}. Intuitively, Af is the set of inputs of f for which the answer is yes.

Proof: (⇒) Suppose f is recursively enumerable and M accepts f. We write wi to refer to the i-th word of Σ∗. We specify the TM generating Af by the following pseudocode:

Algorithm 1: GEN()
  static Af = ∅
  k = 0
  while True do
    for 0 ≤ i ≤ k do
      run M(wi) for at most k steps
      if M(wi) halts within k steps and wi ∉ Af then
        Af = Af ∪ {wi}
        return wi
      end
    end
    k = k + 1
  end
The value of k has a two-fold usage. First, it is used to explore all inputs wi, 0 ≤ i ≤ k. Second, it is used as a time limit for M: for each wi, we run M(wi) for precisely k steps. If M(wi) = 1 in at most k steps, then wi is added to Af and then returned (written on the tape); wi is also stored for future executions of GEN. If M(wi) = 1 for some wi, then there must exist a k ≥ i such that M(wi) halts within k steps. Thus, such a k will eventually be reached. A Python sketch of this dovetailing procedure is given below.
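In the sketch, accepts(w, k) is a hypothetical helper which simulates M on w for at most k steps and reports whether it halted with output 1 within that budget, and words(i) returns the i-th word of Σ∗:

from itertools import count

def gen(words, accepts):
    produced = set()
    for k in count():              # k grows without bound...
        for i in range(k + 1):     # ...bounding both the words tried
            w = words(i)           # and the simulation budget
            if w not in produced and accepts(w, k):
                produced.add(w)
                yield w

The dovetailing is essential: running M(wi) to completion could loop forever on some wi, blocking the generation of later words.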

(⇐) Assume we have the Turing Machine GEN which generates Af. We construct a Turing Machine M which accepts f. M works as follows:

Algorithm 2: M(w)
  Af = ∅
  while w ∉ Af do
    x = GEN()
    Af = Af ∪ {x}
  end
  return 1

M simply uses GEN to generate elements of Af. If w ∈ Af, it will eventually be generated, and M will output 1. Otherwise, M will loop. Thus M accepts f. □
Proposition 0.2.4.6 is useful since, in many cases, it is easier to find a
generator for f , instead of a Turing Machine which accepts f .
Finally, in what follows, we shall take a few decision problems, and apply
a reduction technique, in order to prove they are not decidable.

Halting on all inputs Let:

fall(n_{enc(M)}) = 1 iff M halts for all inputs, and fall(n_{enc(M)}) = 0 otherwise

The technique we use to show fall ∉ R is called a reduction (from fh). It proceeds as follows. We assume fall ∈ R. Starting from the TM Mall which decides fall, we build a TM which decides fh. Thus, if fall is decidable then fh is decidable, which leads to a contradiction.
First, for each fixed TM M and fixed input w ∈ Σ∗, we build the TM ΠM,w(ω) = "replace ω by w and then simulate M(w)". It is easy to see that (∀ω ∈ Σ∗ : ΠM,w(ω) halts) iff M(w) halts. Now, we build the TM Mh which decides fh. The input of Mh is enc(M) w. We construct ΠM,w and run Mall(enc(ΠM,w)). By assumption, Mall must always halt. If its output is 1, then ΠM,w(ω) halts for all inputs, hence M(w) halts; we output 1. If its output is 0, then ΠM,w(ω) does not halt for all inputs, hence M(w) does not halt; we output 0.
We have thus built a reduction from fh to fall: using the TM which decides fall, we have constructed a machine which decides fh. Since fh is not recursive, we obtain a contradiction.
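The transformation at the heart of this reduction is easy to picture if TMs are (again, loosely) modelled as Python callables; the helper name make_pi is ours:

def make_pi(M, w):
    """Build Pi_{M,w}: a machine which ignores its own input
    and simulates M on the fixed word w."""
    def pi(omega):
        return M(w)    # halts (for every omega) iff M(w) halts
    return pi

Feeding enc(ΠM,w) to the assumed Mall would thus decide whether M(w) halts.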

Definition 0.2.4.3 (Turing-reducibility) Let fA, fB ∈ Hom(N, {0, 1}). We say fA is Turing reducible to fB, and write fA ≤T fB, iff there exists a computable transformation T ∈ Hom(N, N) such that fA(n) = 1 iff fB(T(n)) = 1.

Remark 0.2.4.2 (Reducibility) We note that the transformation T must be computable, in the sense that a Turing Machine should be able to compute T for any possible valid input. When proving fh ≤T fall, we have taken n_{enc(M) w} (an instance of fh) and shown that it can be transformed into n_{enc(ΠM,w)} (an instance of fall), such that fh(n_{enc(M) w}) = 1 iff fall(n_{enc(ΠM,w)}) = 1. A Turing Machine can easily perform the transformation of enc(M) w to enc(ΠM,w), since it involves adding some states and transitions which precede the start-state of M; hence T is computable.

Halting on 111 Let:

f111(n_{enc(M)}) = 1 iff M(111) halts, and f111(n_{enc(M)}) = 0 otherwise

We reduce fh to f111. Assume M111 decides f111. Given a Turing Machine M and a word w, we construct the machine:

ΠM,w(ω) = if ω = 111 then M(w) else loop

We observe that (i) the transformation from enc(M) w to enc(ΠM,w) is computable, since it involves adding finitely many states to M: these states check whether the input ω is 111, and if so, replace it with w and run M; (ii) M111(enc(ΠM,w)) = 1 iff M(w) halts. The reduction is complete: f111 ∉ R.

Halting on some input We define:

fany(n_{enc(M)}) = 1 iff M(w) halts for some w ∈ Σ∗, and fany(n_{enc(M)}) = 0 otherwise

We reduce f111 to fany. We assume fany is decided by Many. We construct:

ΠM(ω) = "replace ω by 111 and run M(111)"

Now, Many(enc(ΠM)) = 1 iff M(111) halts, hence we can use Many to build a machine which decides f111. Contradiction: fany ∉ R.

Machine halt equivalence We define:

feq(n_{enc(M1) enc(M2)}) = 1 iff for all w ∈ Σ∗: M1(w) halts iff M2(w) halts, and feq(n_{enc(M1) enc(M2)}) = 0 otherwise

We reduce fall to feq. Let Mtriv be a one-state Turing Machine which halts on every input, and Meq be the Turing Machine which decides feq. Then Meq(enc(M) enc(Mtriv)) = 1 iff M halts on all inputs. We have shown that we can use Meq in order to build a machine which decides fall. Contradiction: feq ∉ R.
So far, we have used reductions in order to establish problem non-
membership in R. There are other properties of R and RE which can be of
use for this task. First we define:

Definition 0.2.4.4 (Complement of a problem) Let f ∈ Hom(N, {0, 1}). We denote by f̄ the problem:

f̄(n) = 1 iff f(n) = 0, and f̄(n) = 0 iff f(n) = 1

We call f̄ the complement of f.

For instance, the complement of fh is the problem which asks whether a Turing Machine M does not halt on input w. We also note that the complement of f̄ is f itself.
Next, we define the class:

coRE = {f ∈ Hom(N, {0, 1}) | f̄ ∈ RE}

coRE contains the set of all problems whose complement is in RE. We establish that:
Proposition 0.2.4.7 RE ∩ coRE = R.

Proof: Assume f ∈ RE ∩ coRE. Hence, there exists a Turing Machine M which accepts f and a Turing Machine M̄ which accepts f̄. We build the Turing Machine:

M∗(w) = for i ∈ N: run M(w) for i steps; if M(w) = 1, return 1. Otherwise, run M̄(w) for i steps; if M̄(w) = 1, return 0.

Since M and M̄ will always halt when the expected result is 1, they can be used together to decide f. Hence f ∈ R. For the reverse inclusion, any f ∈ R is in RE (Proposition 0.2.4.5), and its complement is also decidable (Proposition 0.2.4.8 below), hence also in RE; thus f ∈ RE ∩ coRE. □

Proposition 0.2.4.8 f ∈ R iff f̄ ∈ R.

Proof: The proposition follows immediately, since the Turing Machine which decides f can be used to decide f̄ by simply switching its output from 0 to 1 and from 1 to 0. The same holds for the other direction. □
We conclude this chapter with a very powerful result, which states that an entire category/type of problems does not belong to R.

Theorem 0.2.4.1 (Rice) Let C ⊆ RE be non-empty, such that the trivial problem f(n) = 0, for all n ∈ N, is not in C. Given a Turing Machine M, we ask: "Is the problem accepted by M in C?". Answering this question is not in R.

Proof: Recall that, by hypothesis, the trivial problem f(n) = 0 is not in C. Since C is non-empty, suppose f∗ ∈ C, and since f∗ is recursively enumerable, let M∗ be the Turing Machine which accepts f∗.
We apply a reduction from a variant of f111, namely fx: fx asks whether a Turing Machine halts on input x. We assume we can decide membership f ∈ C by some Turing Machine. Based on the latter, we construct a Turing Machine which decides fx (i.e. solves the halting problem for a particular input). Let Mx be the Turing Machine which accepts fx.
Let:

Πw(ω) = if Mx(w) then M∗(ω) else loop

If fΠw is the problem accepted by Πw, we show that:

fΠw ∈ C iff Mx(w) halts

(⇒) Suppose fΠw ∈ C. Then Πw(ω) cannot loop for every input ω ∈ Σ∗: if it did, fΠw would be the trivial function always returning 0 for any input, which we have assumed is not in C. Thus, Mx(w) halts.
(⇐) Suppose Mx(w) halts. Then the behaviour of Πw(ω) is precisely that of M∗(ω): Πw(ω) returns 1 whenever M∗(ω) returns 1, and Πw(ω) = ⊥ whenever M∗(ω) = ⊥. Since f∗ ∈ C, then also fΠw ∈ C. □
In Theorem 0.2.4.1, the set C should be interpreted as a property of problems, and subsequently of the Turing Machines which accept them. Checking whether some Turing Machine satisfies the given property is undecidable. Consider the property informally described as: the set of Turing Machines (/computer programs) that behave as viruses. The ability to decide whether a Turing Machine behaves as a virus (i.e. belongs to the former set) is thus undecidable, via Rice's Theorem.

0.3 Complexity Theory

0.3.1 Measuring time and space

In Computability Theory, we have classified problems (e.g. in the classes R and RE) based on the Turing Machine's ability to decide/accept them.
In order to classify problems based on hardness, we need to account for the number of steps (time) and tape cells (space) employed by a Turing Machine.
The amount of resources (time/space) spent by a Turing Machine M may be expressed as functions:

TM, SM : Σ∗ → N

where TM(w) (resp. SM(w)) is the number of steps performed (resp. tape cells used) by M when running on input w.
This definition suffers from unnecessary overhead, which makes time and space analysis difficult. We formulate some examples to illustrate why this is the case:

Alg(n)
while n < 100
n=n+1
return 1

We note that Alg runs 100 steps for n = 0, but only one step for n ≥ 100. However, in practice, it is often considered that each input is as likely to occur as any other.⁶ For this reason, we shall adopt the following convention: (?) we always consider the most expensive/unfavourable case, given inputs of a certain type. In our previous example, we consider the running time of Alg to be 100, since this is the most expensive case.

⁶ This is often not the case. There are numerous algorithms which rely on some probability that the input is of some particular type. For instance, efficient SAT solvers rely on a particular ordering of variables, when interpretations are generated and verified. On certain orderings and certain formulae, the algorithm runs in polynomial time. The key to the efficiency of SAT solvers is that programmers estimate an efficient ordering, based on some expectation about the input formula. The algorithm may be exponentially costly for some formulae, but runs in close-to-polynomial time for most of the inputs.

Consider the following example:

Sum(v, n)
s = 0, i = 0
while i < n
s = s + v[i]
i=i+1
return s

Unlike Alg, Sum does not have a universal upper limit on its running time. The number of steps Sum executes depends on the number of elements of v, namely n, and is equal to 2n + 3, if we count each variable initialisation and the return statement as computing steps. Thus, we observe that (??) the running time (resp. consumed space) of an algorithm grows as the size of the input grows.
We can now merge (?) and (??) into a definition:

Definition 0.3.1.1 (Running time of a TM) The running time of a Turing Machine M is given by TM : N → N iff:

∀w ∈ Σ∗ : the number of transitions performed by M on input w is at most TM(|w|)

Remark 0.3.1.1 (Consumed space of a TM) A naive definition for the consumed space of a Turing Machine would state that SM(|w|) is the number of tape cells which M employs. This definition is imprecise. Consider a Turing Machine which receives a binary word as input and computes whether the word is a power of 2. Apart from reading its input, the machine consumes no space. Thus, we might refine our definition into: "SM(|w|) is the number of tape writes which M performs". This definition is also imprecise. Consider the binary counter Turing Machine from the former chapter. It performs a number of writes proportional to the number of consecutive 1s found at the end of the string. However, the counter does not use additional space; it only processes the input in place.
Thus, the consumed space SM(|w|) is the number of written cells, not counting the input. Consider a Turing Machine which receives n numbers encoded as binary words, each having at most 4 bits, and which computes the sum of the numbers, modulo 2^4. Apart from reading the 4 ∗ n bits and the n − 1 word separators, the machine employs another 4 cells to hold a temporary sum. Thus, the consumed space for this machine is 4.⁷

⁷ We can also build another machine which simply uses the first number to hold the temporary sum, and thus uses no additional space.

A formal definition of the consumed space of a TM is outside the scope of this course, since it involves multi-tape Turing Machines. The basic idea is to separate the input from the rest of the space used for computation.
Thus, when assessing the consumed space of an algorithm, we shall never account for the space consumed by the input.

Recall that, as in Computability Theory, our primary agenda is to produce a classification of problems. To this end, it makes sense to first introduce a classification of Turing Machine running times.

Asymptotic notations

Remark 0.3.1.2 (Running times vs. arbitrary functions) In the previous section, we have defined the running time of a Turing Machine as a function T : N → N, and we have seen that such functions are often monotonically increasing (n ≤ m =⇒ T(n) ≤ T(m)). While monotonicity is common among the running times of conventional algorithms, it is not hard to find examples (more or less realistic) where it does not hold. For instance, an algorithm may simply return if its input exceeds a given size. Thus, we shall not, in general, assume that running times are monotonic.
Furthermore, we shall extend our classification to arbitrary functions f : R → R, since there is no technical reason to consider only functions over naturals. In support of this, we add that asymptotic notations are useful in other fields outside complexity theory, where the assumption that functions are defined over natural numbers only is not justified.

Definition 0.3.1.2 (Θ (theta) notation) Let g : R → R. Then Θ(g(n)) is the class of functions:

Θ(g(n)) = {f : R → R | ∃c1, c2 ∈ R⁺, ∃n0 ∈ N such that ∀n ≥ n0 : c1 · g(n) ≤ f(n) ≤ c2 · g(n)}

Thus, Θ(f(n)) is the class of all functions with the same asymptotic growth as f(n). We can easily observe that, for all continuous f, g ∈ Hom(R, R) such that g ∈ Θ(f(n)), we have lim_{n→∞} g(n)/f(n) = c, where c ≠ 0.
There is an infinite number of classes Θ(f(n)), one for each function f. However, if g(n) ∈ Θ(f(n)), then Θ(g(n)) = Θ(f(n)).
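As a quick worked example of the definition: 2n² + n ∈ Θ(n²), since taking c1 = 2, c2 = 3 and n0 = 1 gives 2n² ≤ 2n² + n ≤ 3n² for all n ≥ 1.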
It makes sense to consider classes which describe functions with inferior/superior asymptotic growth:

Definition 0.3.1.3 (O, Ω notations) Let g : R → R. Then:

O(g(n)) = {f : R → R | ∃c ∈ R⁺, ∃n0 ∈ N such that ∀n ≥ n0 : 0 ≤ f(n) ≤ c · g(n)}

Ω(g(n)) = {f : R → R | ∃c ∈ R⁺, ∃n0 ∈ N such that ∀n ≥ n0 : 0 ≤ c · g(n) ≤ f(n)}

Note that g ∈ O(f(n)) =⇒ O(g(n)) ⊆ O(f(n)), while g ∈ Ω(f(n)) =⇒ Ω(g(n)) ⊆ Ω(f(n)). Finally, Ω(f(n)) ∩ O(f(n)) = Θ(f(n)). Each of the above propositions can be easily proved using the respective definitions of the notations.
O and Ω offer relaxed bounds for asymptotic function growth. Thus, g ∈ O(f(n)) should be read as: the function g grows asymptotically at most as much as f. It makes sense to also consider strict bounds:

Definition 0.3.1.4 (o, ω notations)

o(g(n)) = {f : R → R | ∀c ∈ R⁺, ∃n0 ∈ N such that ∀n ≥ n0 : 0 ≤ f(n) ≤ c · g(n)}

ω(g(n)) = {f : R → R | ∀c ∈ R⁺, ∃n0 ∈ N such that ∀n ≥ n0 : 0 ≤ c · g(n) ≤ f(n)}

Thus, g ∈ o(f(n)) should be read: g grows asymptotically strictly less than f. We have o(f(n)) ∩ ω(f(n)) = ∅, O(f(n)) ∩ Ω(f(n)) = Θ(f(n)) and ω(f(n)) ∪ Θ(f(n)) ⊆ Ω(f(n)).

Exercise 0.3.1.1

If f(n) ∈ Ω(n²) and g(n) ∈ O(n³) then f(n)/g(n) ∈ . . .
If f(n) ∈ o(n²) and g(n) ∈ Θ(n³) then f(n) · g(n) ∈ . . .
If f(n) ∈ Θ(n³) and g(n) ∈ o(n²) then f(n)/g(n) ∈ . . .

Exercise 0.3.1.2 Prove or disprove the following implications:

f(n) = O(log n) ⇒ 2^{f(n)} = O(n)
f(n) = O(n²) and g(n) = O(n) ⇒ f(g(n)) = O(n³)
f(n) = O(n) and g(n) = √(1 + f(n)) ⇒ g(n) = Ω(log n)

Syntactic sugars
This section follows closely Lecture 2 from [1]. Quite often, asymptotic notations are used to refer to arbitrary functions having certain properties related to their order of growth. For instance, in:

⌈f(x)⌉ = f(x) + O(1)

applying "rounding" to f(x) may be expressed as the original f(x) to which we add a function bounded by a constant. Similarly:

1/(1 − x) = 1 + x + x² + x³ + Ω(x⁴), for −1 < x < 1
The above notation allows us to “formally disregard” the terms from the
expansion, by replacing them with an asymptotic notation which charac-
terises their order of growth. One should make a distinction between the
usage of asymptotic notations in arithmetic expressions, such as the ones
previously illustrated, and equations. Consider the following example:

f (x) = O(1/x)

which should be read: there exists a function h ∈ O(1/x) such that f (x) =
h(x). Similarly:
f (x) = O(log x) + O(1/x)
should be read: there exist functions h ∈ O(1/x) and w ∈ O(log x) such that
f (x) = w(x) + h(x). In equations such as:

O(x) = O(log x) + O(1/x)

the equality is not symmetric, and should be read from left to right: for
any function f ∈ O(x), there exist functions h ∈ O(1/x) and w ∈ O(log x)
such that f (x) = w(x) + h(x). In order to avoid mistakes, the following
algorithmic rule should be applied. When reading an equation of the form:
left = right

• each occurrence of an asymptotic notation in left should be replaced by a universally quantified function belonging to the corresponding class.

• each occurrence of an asymptotic notation in right should be replaced by an existentially quantified function from the corresponding class.
0.3.2 Running time in complexity theory

Using asymptotic notations, we can distinguish between running times (of algorithms) with different asymptotic growths. Experience has shown that it is infeasible to develop a theory which uses asymptotic notations in order to classify problems based on their difficulty. Thus, in complexity theory, we make an even stronger assumption: the exponent of a polynomial function is unimportant. Recall that, with asymptotic notations, we do not differentiate between n² and 2n² + n + 1 (and denote either of the two by Θ(n²)). In Complexity Theory, we do not distinguish between, e.g., n² and n³, and thus we write n^O(1), referring to a polynomial of arbitrary degree.
Before introducing a classification of problems, there is a question which
must be addressed: How does the encoding of a problem instance affect the
running time of the subsequent algorithm?
To see why this issue is important, consider the encoding of numbers using a single symbol (e.g., IIIIIIII encodes the number 8). A Turing Machine M which starts with the tape:

> I I I I I I # #

and increments the represented number by shifting the head to the first empty cell, where it places I, will perform a number of steps which is linear with respect to the size of the input. Thus, the running time of M is O(n), where n = |w| and w is M's input.

The Turing Machine which uses the binary alphabet, and encodes numbers as binary words, will also run in linear time w.r.t. the size of the input, but in this case, there is an exponential gap between the two representations. The representation of a natural x consumes n = ⌈log x⌉ cells on the second machine, and x cells on the first. Note that x can be as large as 2ⁿ.
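The gap between the two encodings can be measured directly; the sketch below (ours, in Python, purely illustrative) prints the number of tape cells each representation consumes for the same number x:

import math

def unary_len(x):
    # unary encoding IIII...: one cell per unit, hence x cells
    return x

def binary_len(x):
    # binary encoding: roughly log2(x) cells
    return max(1, math.ceil(math.log2(x + 1)))

for x in [8, 1024, 10**6]:
    print(x, unary_len(x), binary_len(x))
    # e.g. 1000000 unary cells, but only 20 binary cells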
This is one of the rare cases [2] where a bad choice of representation may lead to an exponential increase in the number of steps. In what follows, we assume problem instances are encoded in some default way: e.g., graphs are represented as adjacency matrices or as adjacency lists, etc. When appropriate representations are chosen, the computational gap between them is at most polynomial. As an example, let us compare a matrix representation of a graph with that of adjacency lists. Assume the graph is directed, contains n nodes and only one edge (u, v). The matrix representation will consume n² positions (out of which n² − 1 are equal to 0), while the list representation will consume only one position (corresponding to the unique element from the adjacency list of u). However, the gap between the two representations is polynomial. Thus, from the point of view of Complexity Theory, it is irrelevant whether we choose matrices or adjacency lists to represent graphs.
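As a back-of-the-envelope check (our own sketch, with the edge (0, 1) standing in for (u, v)):

n = 1000                          # number of nodes
edges = [(0, 1)]                  # a single directed edge (u, v)

matrix_cells = n * n              # n^2 cells, out of which n^2 - 1 hold 0
list_cells = len(edges)           # one entry, in the adjacency list of u

print(matrix_cells, list_cells)   # 1000000 vs. 1: a polynomial gap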
This observation is highlighted by the following Proposition:
Proposition 0.3.2.1 (The encoding does not matter) Let f : N → {0, 1} be a problem which is decided by a Turing Machine M, defined over alphabet Σ, in time T. Then, there exists a Turing Machine M′, defined over alphabet Σ′ = {0, 1, #, >}, which decides f and runs in time O(log |Σ|) · T(n).
Proof: (sketch) We build M′ = (K′, F′, Σ′, δ′, s0′) from M as follows:

• Σ′ = {0, 1, #, >}. We encode each symbol different from # (the empty cell) and > (the marker of the beginning of the input) as a binary word w with |w| = ⌈log |Σ|⌉. We use k to refer to the length |w| of the word w. We write encΣ′(x), with x ∈ Σ, to refer to the encoding of symbol x ∈ Σ.
• For each state s ∈ K, we build 2ᵏ⁺¹ states q1, . . . , q2ᵏ⁺¹ ∈ K′, organized as a full binary tree of height k. The purpose of the tree is to recognize a word encΣ′(x) of length k from the tape. Thus, the unique state at the root of the tree, namely q1, is responsible for recognising the first bit. If it is 0, M′ will transition to q2, and if it is 1, to q3. q2 and q3 must each recognize the second bit. After their transitions, we shall be in one of the states q4 to q7, which give us information about the first two bits of the word. In general, the states from level i recognize the first i bits of the encoded symbol encΣ′(x). The states from the last level are 2ᵏ in number, and recognize the last bit of the encoded symbol encΣ′(x). Thus, each of the 2ᵏ leaf-states in the tree corresponds to one possible symbol x ∈ Σ which is encoded as encΣ′(x). We connect all these states by transitions, as described above.
• For each transition δ(s, x) = (s′, x′, pos) of M, the machine M′ must: (i) recognize x, (ii) overwrite x with x′, (iii) move according to pos and go to state s′. Thus:

– (i) is done by the procedure described at the above point.

– for (ii), we use k states to go back (k cells) to the beginning of encΣ′(x) and write encΣ′(x′), cell by cell. Finally, we connect the state corresponding to encΣ′(x) from the tree to the first of the above-described k states.

– for (iii), if pos = L/R, we use another k states to go either left or right. If pos = H, we need not use these states. Finally, we need to make a transition to the root of the state tree corresponding to s′.
For each transition δ(s, x) = (s′, x′, pos) of M, M′ performs k transitions for reading the encoded symbol x, k transitions for writing x′, and possibly k transitions for moving the tape head. Thus, for all w ∈ Σ′∗, the number of transitions performed by M′ is at most 3k · T(|w|). Hence, the running time of M′ is O(k) · T(n). □
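The encoding encΣ′ used in the proof can be sketched in a few lines; the alphabet below is a stand-in of ours, not one fixed by the text:

import math

sigma = ['a', 'b', 'c', 'd', 'e']          # an arbitrary alphabet Σ (stand-in)
k = math.ceil(math.log2(len(sigma)))       # codeword length k = ⌈log |Σ|⌉

# fixed-width codewords: every symbol of Σ costs exactly k binary cells
enc = {x: format(i, '0%db' % k) for i, x in enumerate(sigma)}
print(k, enc)                              # k = 3, 'a' -> '000', 'b' -> '001', ...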
The proof of Proposition 0.3.2.1 shows that any Turing Machine using an arbitrary alphabet Σ can be transformed into one using the binary alphabet. The overhead of this transformation is logarithmic: O(⌈log |Σ|⌉). Thus, if the original Turing Machine runs in some polynomial time T(n), then the transformed TM will run in O(⌈log |Σ|⌉) · T(n) time, which is bounded by a polynomial. Similarly, if the original TM runs in super-polynomial time, the transformed TM will also run in super-polynomial time.
In what follows, we shall assume all Turing Machines are using the binary
alphabet Σb = {0, 1, #, >}.
0.3.3 Complexity classes

In the previous section, we have stated that, in complexity theory, we shall make no distinction between polynomial running times with different asymptotic growths (e.g., between n² and n³). With this in mind, we construct a classification of problems. First, we say that f is decidable in time T(n) iff there exists a Turing Machine M which decides f, and whose running time is T. We interchangeably use the terms decidable and solvable since, in this chapter, there is no ambiguity on what “solvability” means, and it cannot be mistaken for acceptability. All problems considered in this section are members of R. The following definition characterizes problems with a specific running time:
DTIME(T(n)) = {f : N → {0, 1} | f is decidable in time O(T(n))}

Note that, unlike asymptotic notations, DTIME(T(n)) is a class of problems, not of running times. Also, note that our characterization does not provide a strict upper bound. Hence, e.g., DTIME(n) ⊆ DTIME(n²). In words: a problem which is decidable in linear time is also decidable in quadratic time. Next, we introduce the class:

PTIME = ⋃_{d∈N} DTIME(nᵈ)
PTIME is often abbreviated P. It is the class of all problems which are decidable in polynomial time. Also, note that if some problem f is decidable in log(n) time, then f ∈ DTIME(log(n)) ⊆ DTIME(n) ⊆ P. Hence, even problems which are solvable in sub-linear time belong to the class P.

Further on, we introduce the class:

EXPTIME = ⋃_{d∈N} DTIME(2^(nᵈ))

which contains all the problems which are decidable in exponential time. Naturally, we have:

P ⊆ EXPTIME ⊆ R
There are two interesting questions which can be raised, at this point:
1. Is the inclusion P ⊆ EXPTIME strict?

2. Are all problems in EXPTIME \ P⁸ “equal”, in terms of difficulty?

⁸ Later in this chapter, we shall see that EXPTIME \ P is a set whose members are currently unknown.

In the following section, we shall focus on the latter question.

Nondeterminism and Nondeterministic Turing Machines
We recall the problem SAT, which takes as input a boolean formula ψ in Conjunctive Normal Form (CNF). More precisely:

ψ = C1 ∧ C2 ∧ . . . ∧ Cn

where, for each i : 1 ≤ i ≤ n, we have

Ci = Li1 ∨ Li2 ∨ . . . ∨ Limi

and, for each j : 1 ≤ j ≤ mi, we have

Lij = x or Lij = ¬x

and finally, x is a variable.
Recall that SAT can be solved in exponential time, hence SAT ∈ EXPTIME. The major source of complexity consists in generating all possible interpretations, on which a verification is subsequently done. In order to answer question 2 from the previous section, suppose that SAT were solvable in polynomial time. Could we find problems (possibly related to SAT) which are still solvable in exponential time under our assumption? The answer is yes. Let γ be the formula:
γ = ∀x1 ∀x2 . . . ∀xk ψ

where ψ is a formula in CNF containing variables x1, . . . , xk, and k ∈ N is arbitrarily fixed. Checking if γ is satisfiable is the problem ∀SAT. An algorithm for ∀SAT must build all combinations of 0/1 values for each xi with i : 1 ≤ i ≤ k and, for each one, must solve an instance of the SAT problem. In total, we have 2ᵏ combinations, and since k is part of the input, the algorithm runs in exponential time, even provided that we have an algorithm for SAT which runs in polynomial time.
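Under the section's assumption of a polynomial-time procedure for SAT, the algorithm just described can be sketched as below; poly_sat (the assumed oracle) and substitute (a routine fixing the values of x1, . . . , xk in ψ) are hypothetical helpers, and reading the ∀ as “every resulting instance must be satisfiable” is our interpretation of the algorithm:

from itertools import product

def forall_sat(psi, xs, poly_sat, substitute):
    # Enumerate every 0/1 combination for x1, ..., xk: 2^k instances,
    # hence exponential time even if each poly_sat call is polynomial.
    for bits in product([0, 1], repeat=len(xs)):
        instance = substitute(psi, dict(zip(xs, bits)))
        if not poly_sat(instance):      # one failing combination refutes γ
            return False
    return True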
In order to study the difficulty of problems, in Complexity Theory, we generalise the above-presented approach, in order to determine degrees of hardness. To do so, we need a general version of our assumption: “SAT is solvable in polynomial time”. In other words, we need a theoretical tool to make exponential search (seem) polynomial. This tool will have no relation to reality, and no implications for real problem solving. It should not be understood as a technique for deciding hard problems. Its sole purpose is theoretical: it allows us to abstract one source of complexity (that of SAT in our example) in order to explore others (e.g., ∀SAT). The tool that we mentioned is the Nondeterministic Turing Machine:
Definition 0.3.3.1 (Non-deterministic TM) A non-deterministic Turing machine (NTM, for short) is a tuple M = (K, F, Σ, δ, s0) over alphabet Σ, with K, Σ and s0 defined as before, F = {syes, sno}, and δ ⊆ K × Σ × K × Σ × {L, H, R} a transition relation.

A NTM terminates iff it reaches a final state, hence, a state in F. A NTM M decides a function f : N → {0, 1} iff f(nw) = 0 =⇒ M(w) reaches state sno on all possible sequences of transitions, and f(nw) = 1 =⇒ M(w) reaches state syes on at least one sequence of transitions. We say the running time of a NTM M is T iff, for all w ∈ Σ∗, all sequences of transitions of M(w) contain at most T(|w|) steps.
We start with a few technical observations. First note that the NTM
is specifically tailored for decision problems. It has only two final states,
which correspond to yes/no answers. Also, the machine does not produce
an output, and the usage of the tape is merely for internal computations.
In essence, these “design choices” for the NTM are purely for convenience,
and alternatives are possible.
Whereas the conventional Turing Machine assigned, for each combination of state and symbol, a unique next-state, overriding symbol and head movement, a nondeterministic machine assigns a collection of such elements.
The current configuration of a conventional Turing Machine was charac-
terized by the current state, by the current contents of the tape, and by the
head position. A configuration of the nondeterministic machine corresponds
to a set of conventional TM configurations. The intuition is that the NTM
can simultaneously process a set of conventional configurations, in one sin-
gle step. While the execution of a Turing Machine can be represented as a
sequence, that of the NTM can be represented as a tree. A path in the tree
corresponds to one sequence of transitions which the NTM performs.
Now, notice the conditions under which a NTM decides a function: if at
least one sequence of transitions leads to syes , we can interpret the answer
of the NTM as yes. Conversely, if all sequences of transitions lead to sno ,
then the machine returns no.
Finally, when accounting for the running time of a NTM, we do not count all performed transitions (as it would seem reasonable), but only the length of the longest transition sequence performed by the machine. The intuition is that all members of the current configuration are processed in parallel, during a single step.

We illustrate the NTM in the following example:
Example 0.3.3.1 (Nondeterministic Turing Machine) We build the NTM MSAT which solves the SAT problem discussed previously. First, we assume the existence of Mchk(I, ψ), which takes an interpretation and a formula, both encoded as a unique string, and checks if I |= ψ. Mchk is a conventional TM; thus, upon termination, it leaves 0 or 1 on the tape.
Step 1: MSAT computes the number of variables from ψ (henceforth referred to as n), and prepends the encoding of ψ with the encoding of n in unary (as a sequence of 1's). This step takes O(n) transitions.
Step 2: During the former step, MSAT has created a context for generating interpretations. In this step, MSAT goes over each cell from the encoding of n, and non-deterministically places 0 or 1 in that cell. Thus, after 1 such transition, there are 2 possible conventional configurations. In the first, bit 0 is placed on the first cell of the encoding of n. In the second, bit 1 is placed in the same position. After i transitions, we have 2ⁱ possible configurations. At the end of this step, we have 2ⁿ possible configurations, and in each one, we have a binary word of length n at the beginning of the tape, which corresponds to one possible interpretation. All sequences of transitions have the same length; thus, the execution of this part of MSAT takes O(n).
Step 3: At the end of each sequence illustrated above, we run Mchk(I, ψ), where I and ψ are already conveniently on the tape. This step takes O(n ∗ m), where m is the maximal number of literals in ψ.

If we add up the running times of the three steps, we obtain O(n ∗ m).
An important issue is whether the NTM has more expressive power than
the conventional TM:
Proposition 0.3.3.1 Every function which is decidable by an NTM in poly-
nomial running time, is also decidable by a TM which runs in exponential
time.
The proof is left as an exercise. Intuitively, we can simulate a NTM by performing a backtracking procedure with a classic TM. The proposition shows that the NTM only offers a gain in speed, and not expressive power: it solves precisely the same problems which the conventional TM solves.
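The simulation idea can be sketched as follows: a minimal backtracking search (ours), where successors plays the role of the transition relation δ, mapping a configuration to the list of configurations reachable in one step:

def ntm_accepts(config, successors, is_yes, is_no, depth):
    # Depth-first exploration of the computation tree: accept iff at least
    # one sequence of transitions reaches s_yes, per the NTM semantics.
    if is_yes(config):
        return True
    if is_no(config) or depth == 0:
        return False
    return any(ntm_accepts(c, successors, is_yes, is_no, depth - 1)
               for c in successors(config))

If the NTM runs in time T(n) and each configuration has at most b successors, the search visits O(b^T(n)) configurations: a deterministic, exponential-time simulation, as the proposition requires.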
Convention for describing NTM in pseudocode. In the previous chapters, we often resorted to traditional pseudocode in order to describe algorithms, that is, Turing Machines. It is occasionally useful to be able to do the same thing for NTMs. With this in mind, we introduce some notational conventions. The instruction:
v = choice(A)

where v is a variable and A is a set of values, behaves as follows:

• the current (non-deterministic) configuration of the NTM shall contain |A| conventional configurations.
• each conventional configuration corresponds to a distinct value a ∈ A, and it should be interpreted that v = a in that particular configuration.
• the running time of the instruction is O(1).

We also note that it is not possible to achieve some form of “communication” between conventional configurations. Thus, it is intuitive to think that the processing (execution of a transition) of a conventional configuration is done independently of all other conventional configurations.
We add two additional instructions: success and fail. They correspond to transitions into the states syes and sno, respectively.
We illustrate NTM pseudocode, by re-writing the SAT algorithm de-
scribed above. We adopt the same representational conventions from the
first Chapter, and also re-use the procedure CHECK.
Example 0.3.3.2 (Pseudocode)

SolveSAT(ϕ):
    Let n be the number of variables in ϕ
    Let I be a vector with n components, initialised with 0
    for i = 0, n − 1:
        I[i] = choice({0, 1})
    if CHECK(ϕ, I) = 0:
        fail
    else:
        success
As illustrated in the previous example, the NTM has the power of taming the complexity which results from the search over an exponential number of candidates (in our example, interpretations). Note that, in the NTM, the main source of complexity is given by the verification procedure Mchk(I, ψ).
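For contrast, here is what a conventional (deterministic) machine must do in place of choice: a brute-force sketch of ours, where a CNF formula is a list of clauses and each clause is a list of (variable, polarity) pairs:

from itertools import product

def check(cnf, interp):
    # polynomial-time verification (the role of CHECK / Mchk):
    # every clause must contain at least one true literal
    return all(any(interp[var] == pol for (var, pol) in clause)
               for clause in cnf)

def solve_sat(cnf, variables):
    # the deterministic stand-in for choice: try all 2^n interpretations
    for bits in product([0, 1], repeat=len(variables)):
        interp = dict(zip(variables, bits))
        if check(cnf, interp):
            return interp              # success
    return None                        # fail, on every branch

# (A ∨ ¬B) ∧ (B ∨ C) is satisfied, e.g., by A = 0, B = 0, C = 1
print(solve_sat([[('A', 1), ('B', 0)], [('B', 1), ('C', 1)]], ['A', 'B', 'C']))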
By using the NTM, we have managed to isolate the source of complexity of SAT, which is the exponential search over possible interpretations. We have seen that there are other types of exponential explosion. Thus, we are in a position to refine our classification, by introducing new complexity classes:
NTIME(T(n)) = {f : N → {0, 1} | f is decidable by a NTM in time O(T(n))}

and

NPTIME = ⋃_{d∈N} NTIME(nᵈ)
We usually abbreviate the class NPTIME by NP. Note that SAT ∈ NP; however, it seems unlikely that ∀SAT ∈ NP. We shall discuss this issue in the next section. Let us relate NP with our other classes. First, note that NP ⊆ EXPTIME. This result is essentially given by Proposition 0.3.3.1: every problem solved by a NTM can be solved in exponential time by a TM. Also, we trivially have P ⊆ NP. The concise argument is that the NTM is a generalization of the TM, “minus” some technical details. Thus:

P ⊆ NP ⊆ EXPTIME ⊆ R
The fundamental property of problems from NP is that “solution candidates can be verified in polynomial time”. An analogy with solving crosswords is possible. Generating all possible solutions (by “lexicographically” generating all possible words) is obviously exponential. But since verifying whether a solution is correct can be done in polynomial time w.r.t. the size of the crossword, the problem is in NP. We shall soon see that all algorithms which solve hard problems from NP can be split up into an exponential “generating” procedure, and a polynomial “verification” procedure.
0.3.4 Hard and complete problems for a class

Recall that proving f ∉ R for a given problem f was done using contraposition. Essentially, the proof relied on finding a Turing reduction f∗ ≤T f from a problem f∗ for which f∗ ∉ R is known.
First, we note that R could easily be replaced with any complexity class X. Thus, a more general proof scheme could be described as:

f∗ ∉ X, f∗ ≤X f =⇒ f ∉ X

where f∗ and X are given in advance. We observe that we have replaced ≤T by ≤X, in order to highlight that the reduction ≤X depends on the choice of X. We shall later see that we cannot always use Turing Reductions, and there are X's for which more restrictions on the reduction must be in place.
Returning to the proof scheme, we make a second note: to prove f ∉ R, we must first find some f∗ ∉ R. But if this very fact is shown by the same technique, then we need another problem f∗∗ ∉ R, and so on. This situation is similar to asking: “Which came first, the hen or the egg?”

For the case of R, this issue was settled by the proof fh ∉ R using diagonalization. This introduced an initial undecidable problem, and the aforementioned technique could be employed.
Now, let us turn our attention to X = NP. First, let us examine the properties which the reduction should have, in order to make our technique work in this particular case. Let f∗ ∉ NP. We need to show that f ∉ NP.
The first part consists of assuming f ∈ NP. Next, we find a reduction T : If∗ → If, where If∗ and If are the inputs of the problems from the subscript.⁹ We recall that the reduction needs to satisfy:

∀i ∈ If∗ : f∗(i) = 1 ⇐⇒ f(T(i)) = 1
In words: “we can solve all instances i of If∗ by solving instances T(i) of If”. This is done as follows:

a. receive i ∈ If∗

b. compute T(i) ∈ If

c. run the NTM Mf(T(i)) (Mf must exist since f ∈ NP) and return its answer.
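Schematically (with T and M_f as the hypothetical transformation and decider from the steps above):

def solve_f_star(i, T, M_f):
    # a. receive i;  b. compute T(i);  c. run the decider for f
    return M_f(T(i))     # the overall machine is polynomial only if T is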
After a careful look, we observe that the conclusion f∗ ∈ NP (which is our objective, in order to complete the contrapositive argument) is not immediate. If, for instance, the computation of T(i) takes exponential time, then the proof does not work: the NTM which performs a., b., c. runs in (non-deterministic) exponential time.
Thus, the restriction that T is decidable is insufficient. We further need
that T is computable in polynomial time. We write f ∗ ≤p f iff there
exists a polynomial-time transformation which allows solving f ∗ via f , as
illustrated by the scheme above.
We now turn our attention to the “hen and egg” issue. We need an initial problem, which is known not to be in NP. Unfortunately, such a problem is not known, although there have been major (unsuccessful) efforts in trying to find one. The same holds for the class P. Hence, the issue:

P ⊊ NP

is currently open; it is generally believed to be true, but all proof attempts have failed. The good news is that our effort in classifying problems is not fruitless. Transformations:
f∗ ≤p f        f∗ ≤T f
establish a relation between problems, which is referred to as hardness. No matter the reduction type (polynomial or Turing), f is at least as hard as f∗. With the machine Mf (together with computing a transformation) we can solve all instances of f∗. It may be possible that T is bijective: hence each input of f∗ is uniquely mapped to an input of f and vice-versa. Then, f and f∗ are equally hard. However, this is not generally the case, hence the term “at least as hard”.

⁹ In order to make explicit the direction of the transformation, we ignore the fact that both If∗ and If are (subsets of) naturals.
We can naturally extend hardness to complexity classes:
Definition 0.3.4.1 A problem f is called NP-hard iff for all f′ ∈ NP, f′ ≤p f.
Thus, a problem is hard w.r.t. a class, iff it is at least as hard as any
problem in the class at hand. Note that hardness can be defined w.r.t. any
complexity class and not just NP, provided that the appropriate type of
transformation is employed.
Definition 0.3.4.2 A problem f is called NP-complete iff it is NP-hard and f ∈ NP.
Informally, NP-complete problems are the hardest problems in NP. In the more general case, complete problems w.r.t. a class are the hardest of that class.
It is likely that if f is NP-complete, then f ∉ P; however, this is not a proven fact. Thus, instead of trying to disprove membership of a class (f ∉ P), in complexity theory, we prove completeness for the immediately larger class (f is NP-complete).
The intuition is that class membership provides an upper bound, while hardness provides a lower bound, for the difficulty of a problem. We illustrate this by an abstract example. Recall:

P ⊆ NP ⊆ EXPTIME
Let f be a problem. Suppose we find an algorithm for f which runs in exponential time; however, we cannot find one which runs in polynomial time on a NTM. At this point, we have f ∈ EXPTIME. Suppose we show f is P-hard; thus, f can be used to solve any problem in P. We now know that f can be solved exponentially, and it is unlikely that f can be solved in sub-polynomial (e.g., logarithmic) time. Thus, the likely variants are: f may be solved in polynomial time (i) by a conventional TM (f ∈ P) or (ii) by a NTM (f ∈ NP), or (iii) in exponential time, again by a conventional TM (f ∈ EXPTIME). In the best case, f is polynomially solvable. In the worst, it is exponentially solvable.
Suppose now we also find that f is NP-hard. We cannot rule out f ∈ P by a proof, but Complexity Theory predicts that such a membership is not likely. Hence, the feasible variants remain (ii) and (iii).
Finally, if we manage to improve our exponential algorithm for f and
turn it into a non-deterministic polynomial algorithm, then f ∈ NP and,
hence, it is NP-complete. Case (iii) remains of course true, but it does not
carry useful information. At this point, we have an exact characterisation
of the difficulty of f .
Proving NP-completeness. For a problem f to be NP-complete, it must satisfy two conditions. The first, f ∈ NP, is shown by finding a NTM which decides f in polynomial time. For the second part (f is NP-hard), we can employ precisely the “reduction-finding” technique illustrated at the beginning of this section.
Proposition 0.3.4.1 A problem f is NP-hard iff there exists a problem g which is NP-hard, such that g ≤p f.
Proof: Suppose g ≤p f and g is NP-hard. Hence, for all h ∈ NP, h ≤p g. By transitivity, we also have that h ≤p f. It follows that f is NP-hard. □
In the former proof, we have made use of the transitivity of ≤p without showing it. We now state several properties of ≤p, including transitivity, and leave the proofs as exercises.
Proposition 0.3.4.2 ≤p is reflexive and transitive.

Proposition 0.3.4.3 The set of NP-hard problems is closed under ≤p.

Proposition 0.3.4.4 The set of NP-complete problems forms an equivalence class under ≤p: any two NP-complete problems are reducible to each other.
Proposition 0.3.4.5 Assume f is NP-complete. If f ∈ P, then P = NP.
The former proposition, whose proof follows immediately from the underlying definitions, makes the case for the common belief that P ≠ NP. If an efficient algorithm can be found for some NP-complete problem, then all problems in NP can be solved in polynomial time.
The P = NP issue can also be given another intuitive interpretation: “the verification of a solution candidate is as difficult as generating it” or, alternatively: “verifying a given proof P for a statement A is as difficult as finding a proof for A”.
Finally, to better understand the implications of P = NP, consider several facts which would arguably be true, in case the former equality held:
• We could provide a solution to the astronauts' problem (see the first chapter).

• Partial program correctness could be verified efficiently. Techniques such as model checking could be applied to a wide range of applications (including operating system kernels). Bugs would be all but eliminated. Windows bluescreens would no longer happen.

• The generation of exponentially many training sets would make tasks such as voice recognition, computer vision and natural language processing computationally easy.

• Mathematical proofs (of, say, 100 pages) could be generated efficiently. Computers could be used to find proofs for some open problems.

• The exponential search for passwords, or for encryption keys, could be performed in polynomial time. Internet privacy would no longer be possible using encryption (e.g., using SSH). Internet commerce and banking would no longer be possible. Safe communication would no longer be possible (at any level). Any computer-controlled facility (public, military, etc.) which is connected to the Internet would have considerable potential of being compromised.
Remark 0.3.4.1 (Practical applications of reductions) As illustrated before, reductions of the type ≤p are a theoretical tool which is useful for proving NP-hardness. Reductions also have practical applications. For instance, many NP-complete problems are solved by employing SAT solvers, which, as discussed in the former chapters, may be quite fast in practice. Thus, a specific problem instance is cast (via an appropriate transformation) into a formula ϕ, such that ϕ is satisfiable iff the answer to the instance is yes.
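As a toy instance of such a casting (our illustration, not an example from the text): deciding whether a graph is 2-colorable can be transformed into a CNF formula, with one boolean variable per node and, for each edge (u, v), the clauses (xu ∨ xv) and (¬xu ∨ ¬xv); the formula is satisfiable iff the answer to the instance is yes:

def two_coloring_to_cnf(edges):
    # a true literal means "color 1", a false one means "color 0";
    # (xu ∨ xv) forbids both endpoints being 0, (¬xu ∨ ¬xv) both being 1
    cnf = []
    for (u, v) in edges:
        cnf.append([(u, 1), (v, 1)])
        cnf.append([(u, 0), (v, 0)])
    return cnf

# a triangle is not 2-colorable, so the resulting formula is unsatisfiable
print(two_coloring_to_cnf([(1, 2), (2, 3), (1, 3)]))

The output can then be fed to any SAT procedure, such as the solve_sat sketch given earlier in this section.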
SAT. The first NP-complete problem. We observe that the “hen and egg” issue still holds in our scenario. To apply our technique, we need an initial NP-hard problem in the first place. This problem is provided by Cook's Theorem, which proves that SAT is NP-complete. The technique of the proof relies on building, for each NTM M and input w (and hence, for each problem in NP), a formula ϕM,w which is satisfiable iff there exists a sequence in the computation tree of M(w) leading to success.
Bibliography

[1] A. J. Hildebrand. Asymptotic methods in analysis. Math 595AMA, 2009. https://www.math.uiuc.edu/~hildebr/595ama/

[2] Christos H. Papadimitriou. Computational Complexity. Addison-Wesley, Reading, Massachusetts, 1994.