Complexity Classes II
Advanced Analysis of Algorithms – COMS3005A
Steve James
Additional reading: Michael Sipser, Introduction to the Theory of Computation (3rd ed.). Mainly chapter 7, but also chapters 3, 4, 5 and 8
Plan for this section
• Why are formal languages important?
• Decision problems
• Turing machines
• Decidability
• TM variants
• Complexity classes
• P vs NP vs NP complete
• Space complexity
Universal model
• Turing machine
• Mathematical model of computer
• Variants of TMs
• Multitape
• Nondeterministic
• All are equivalent
• If a decision problem can be solved, it can be solved
with a TM
• Solvability (yes/no)
• But what about speed?!
From computability to complexity
• What computational resources are required?
• How much time will it take?
• How much memory will it need?
• Complexity theory studies these questions
• Big $O$: upper bound (like ≤)
• Small $o$: strict upper bound (like <)
• Big $\Omega$: lower bound (like ≥)
• Small $\omega$: strict lower bound (like >)
• $\Theta$: tight bound, both upper and lower (combines $O$ and $\Omega$)
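For reference, the usual textbook definitions behind these analogies, for functions $f$ and $g$ from the naturals to the positive reals (a standard summary, not taken from the slides):

```latex
\begin{align*}
f(n) = O(g(n)) &\iff \exists c > 0,\ \exists n_0,\ \forall n \ge n_0:\ f(n) \le c\, g(n) \\
f(n) = \Omega(g(n)) &\iff \exists c > 0,\ \exists n_0,\ \forall n \ge n_0:\ f(n) \ge c\, g(n) \\
f(n) = \Theta(g(n)) &\iff f(n) = O(g(n)) \text{ and } f(n) = \Omega(g(n)) \\
f(n) = o(g(n)) &\iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = 0 \\
f(n) = \omega(g(n)) &\iff \lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty
\end{align*}
```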
Let’s analyse something!
Decide whether a string is in $L = \{0^k 1^k \mid k \ge 0\}$
1. Scan across the tape and reject if a 0 is found to the right of a 1
2. While both 0s and 1s remain on the tape:
3.    Scan across the tape and cross off a single 0 and a single 1
4. If neither 0s nor 1s remain, accept. Else, reject
• Step 1 takes $O(n)$ steps
• Step 2 loop runs at most $n/2 = O(n)$ times
• Step 3 takes $O(n)$ steps per iteration
• Step 4 takes $O(n)$ steps
• Total: $O(n) + \frac{n}{2} \times O(n) + O(n) = O(n^2)$
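The step counts above can be checked empirically. The following is a minimal sketch, not a real Turing machine: it models the tape as a Python list, and the name `decide_by_crossing_off` and the convention of charging one step per cell visited during a scan are assumptions made for illustration (the loop-condition checks are not charged).

```python
def decide_by_crossing_off(w):
    """Return (accepted, steps) for the crossing-off procedure on w over {0, 1}."""
    tape = list(w)
    steps = 0

    # Step 1: one scan; reject if a 0 appears to the right of a 1.
    seen_1 = False
    for c in tape:
        steps += 1
        if c == '1':
            seen_1 = True
        elif seen_1:
            return False, steps

    # Steps 2-3: each full scan crosses off one 0 and one 1.
    while '0' in tape and '1' in tape:
        crossed_0 = False
        crossed_1 = False
        for i, c in enumerate(tape):
            steps += 1
            if c == '0' and not crossed_0:
                tape[i] = 'x'
                crossed_0 = True
            elif c == '1' and not crossed_1:
                tape[i] = 'x'
                crossed_1 = True

    # Step 4: a final scan to see whether anything is left over.
    steps += len(tape)
    accepted = '0' not in tape and '1' not in tape
    return accepted, steps
```

For an input of the form `'0' * k + '1' * k` with n = 2k, this sketch counts n + k·n + n = n(k + 2) steps, which grows quadratically in n.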
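$L = \{0^k 1^k \mid k \ge 0\}$
• So the algorithm on a TM is $O(n^2)$
• Actually, there’s a faster algorithm: $O(n \log n)$
• But hang on! We can do it in $O(n)$:

    def decide(w):
        """Single pass over w (assumed to be a string over {'0', '1'})."""
        seen_0 = False
        seen_1 = False
        count_0 = 0
        count_1 = 0
        for c in w:
            if c == '0':
                # A 0 to the right of a 1 means w is not of the form 0...01...1
                if seen_1:
                    return False
                seen_0 = True
                count_0 += 1
            if c == '1':
                # A 1 with no 0 before it can never be balanced
                if not seen_0:
                    return False
                seen_1 = True
                count_1 += 1
        # Accept only if the two blocks have the same length
        return count_0 == count_1

• So on our regular computer, it’s $O(n)$
• In fact, can do it in $O(n)$ on a 2-tape TM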
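The slides mention an $O(n \log n)$ single-tape algorithm without describing it. One well-known strategy with that running time (the version presented in Sipser, chapter 7) halves the number of remaining symbols on every pass, so only $O(\log n)$ passes of $O(n)$ steps each are needed. It is an assumption that this is the algorithm the slide has in mind; the sketch below models the tape as a Python list, and the function names are hypothetical.

```python
def cross_off_every_other(tape, symbol):
    """Cross off every other occurrence of symbol, starting with the first."""
    cross = True
    for i, c in enumerate(tape):
        if c == symbol:
            if cross:
                tape[i] = 'x'
            cross = not cross


def decide_by_halving(w):
    """Decide membership in {0^k 1^k | k >= 0}, assuming w is over {0, 1}."""
    if '10' in w:
        return False  # some 0 appears to the right of a 1
    tape = list(w)
    while '0' in tape and '1' in tape:
        # If the total remaining is odd, the two counts differ in parity.
        remaining = tape.count('0') + tape.count('1')
        if remaining % 2 == 1:
            return False
        # Halve both counts; this compares them one binary digit at a time.
        cross_off_every_other(tape, '0')
        cross_off_every_other(tape, '1')
    return '0' not in tape and '1' not in tape
```

Each pass effectively compares one binary digit of the number of 0s against the same digit of the number of 1s, which is why logarithmically many passes suffice.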
Houston, we have a problem!
• On a 1-tape TM, optimal algorithm is $O(n \log n)$
• On a 2-tape TM, optimal algorithm is 𝑂(𝑛)
• TM and 2-tape TM are equivalent with respect to
computability
• But have different time complexities for same problem!
• How can we ever say anything if the complexity
depends on the computational model we pick?
Solution?
• A measure of complexity independent of the model
of computation
• E.g. same complexity whether it’s a TM with 1 tape or 2
• Measure cannot be fine-grained
• Characterise complexity for different models and
study relationship
• Try to come up with something that is not impacted by
differences between models
• Let’s consider: TM, multitape TM, non-
deterministic TM
Runtime definitions
$\mathrm{TIME}(t(n)) = \{ L \mid L = L(N) \text{ where } N \text{ is a TM with runtime } O(t(n)) \}$
• Set of all languages that can be decided on a TM in
$O(t(n))$ time
• $\mathrm{MTIME}(t(n))$: as above, but for a multitape TM
• Since a multitape TM can always behave as a single-tape
machine (by ignoring its extra tapes),
$\mathrm{TIME}(t(n)) \subseteq \mathrm{MTIME}(t(n))$
Runtime definitions
$\mathrm{MTIME}(t(n)) \subseteq \mathrm{TIME}(t(n)^2)$
• Every $O(t(n))$ multitape TM has an equivalent $O(t(n)^2)$
single-tape TM
• Proof sketch:
• We saw how to convert a multitape TM into a regular TM
that simulates it. We just need to analyse the time
complexity of that simulation!
• Simulating each step of the multitape TM uses at most
$O(t(n))$ steps, since the simulated tapes hold at most
$O(t(n))$ symbols in total
• So overall it will take $O(t(n)) \times O(t(n)) = O(t(n)^2)$
Runtime definitions
• $\mathrm{NTIME}(t(n))$: set of all languages that can be
decided on a non-deterministic TM in $O(t(n))$ time
• An NDTM is a decider if it halts on every input across every
execution branch
• Running time of an NDTM is the max number of steps across any
branch of computation!
• Since an NDTM can always operate as a deterministic
TM (it never has to make a choice),
$\mathrm{TIME}(t(n)) \subseteq \mathrm{NTIME}(t(n))$
Runtime definitions
$\mathrm{NTIME}(t(n)) \subseteq \mathrm{TIME}(2^{O(t(n))})$
• Proof sketch:
• Must simulate every possible branch of the NDTM
• Number of leaves is at most $b^d$ for branching factor $b$ and depth $d$
• Depth of the tree is the number of steps $t(n)$
• Total number of nodes in the tree is less than twice the number of leaves
• So nodes are bounded by $O(b^{t(n)})$
• $b$ is a constant $\ge 2$
• So overall it will take $O(t(n) \times b^{t(n)}) = 2^{O(t(n))}$
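The counting step in this proof sketch (fewer than twice as many nodes as leaves) can be checked numerically. This is a small illustration only; it assumes a full computation tree, and `computation_tree_size` is a hypothetical helper name.

```python
def computation_tree_size(b, d):
    """Return (leaves, nodes) for a full tree with branching factor b and depth d."""
    leaves = b ** d
    # One level per step of the NDTM: level i holds b**i nodes.
    nodes = sum(b ** level for level in range(d + 1))
    return leaves, nodes


leaves, nodes = computation_tree_size(2, 3)
print(leaves, nodes)  # 8 15
```

Because the node count stays below `2 * leaves` whenever `b >= 2`, a simulator that spends $O(t(n))$ steps per node does $O(t(n) \cdot b^{t(n)})$ work in total.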
Summarise
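• Relationship between TM and multitape TM:
$\mathrm{TIME}(t(n)) \subseteq \mathrm{MTIME}(t(n))$
$\mathrm{MTIME}(t(n)) \subseteq \mathrm{TIME}(t(n)^2)$
• Relationship between TM and NDTM:
$\mathrm{TIME}(t(n)) \subseteq \mathrm{NTIME}(t(n))$
$\mathrm{NTIME}(t(n)) \subseteq \mathrm{TIME}(2^{O(t(n))})$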
Polynomial time
• We will consider polynomial differences small, and
exponential ones large!
• Polynomial differences are still important for
algorithms
• E.g. $O(n \log n)$ sorting is much better than $O(n^2)$
• But with respect to time complexity, we will
consider only differences between polynomial and
exponential
Polynomial vs exponential
• Exponential time algorithms usually involve
exhaustively searching a space of possible solutions
using brute force search
• Polynomial time algorithms explore the space of
possible solutions more efficiently
• All reasonable computational models are
polynomial-time equivalent
• If we view all polynomial complexity algorithms as
equivalent, then the specific computational model does not
matter
The class P
P is the class of languages that are decidable in
polynomial time on a deterministic single-tape
Turing machine.
$\mathrm{P} = \bigcup_{k=1}^{\infty} \mathrm{TIME}(n^k)$
• P is invariant for all models of computation that are
polynomial equivalent to TM (e.g. multitape, RAM)
• Roughly: problems that are realistically solvable
The class NP
• If a problem is in P, then there is a “clever”
algorithm that does not require brute-force search
• But for some problems, polynomial time algorithms
are not known
• Do they exist, but we haven’t found them yet?
• Or does no such algorithm exist?
Upper bounds are easy, lower bounds are hard
• E.g. PRIMES was recently shown to be in P (AKS
primality test, 2002) but was believed not to be for
many years!
The class NP1
NP1 is the class of languages that are decidable in
polynomial time on a non-deterministic
Turing machine.
$\mathrm{NP1} = \bigcup_{k=1}^{\infty} \mathrm{NTIME}(n^k)$
• NP stands for non-deterministic polynomial!
• Recall $\mathrm{TIME}(t(n)) \subseteq \mathrm{NTIME}(t(n))$
• Hence $\mathrm{P} \subseteq \mathrm{NP1}$
Verification
• Imagine we were given a potential solution and asked
to check it
• This is called verifiability
• Obviously easier than coming up with solution in the first
place
• A verifier for language 𝐿 is an algorithm 𝑉 that uses
additional info 𝑐 to verify that a string 𝑤 ∈ 𝐿
• 𝑐 is called the certificate
• e.g. $\mathrm{COMPOSITES} = \{ x \mid x = pq \text{ for integers } p, q > 1 \}$
• Certificate is one of the two divisors
• Easy to check in polynomial time
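A verifier for COMPOSITES can be sketched in a couple of lines. The function name `verify_composite` is hypothetical; the certificate `c` is assumed to be one claimed divisor of `x`, as on the slide.

```python
def verify_composite(x, c):
    """Accept iff certificate c is a non-trivial divisor of x."""
    # One range check and one division: polynomial in the number of bits of x.
    return 1 < c < x and x % c == 0
```

For example, `verify_composite(15, 3)` accepts, while no certificate makes the verifier accept `x = 13`.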
The class NP2
NP2 is the class of languages that have a polynomial
time verifier (running on a deterministic TM).
$\mathrm{NP2} = \{ L \mid L \text{ has a polytime verifier} \}$
• We get a solution, and come up with an algorithm
to verify that it is correct in polynomial time on a
TM
• Some problems are so hard that we can’t even do
this!
The class NP
Turns out that classes NP1 and NP2 are identical
$\mathrm{NP} := \mathrm{NP1} = \mathrm{NP2}$
• NP is the class of languages with polytime verifiers
• Equivalently, languages for which there exists a
polytime NDTM decider
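One direction of this equivalence has a simple intuition: an NDTM can "guess" a certificate and then run the verifier on it. A deterministic machine can mimic the guess only by trying every certificate, which is where the exponential cost comes from. The sketch below illustrates that idea under stated assumptions: certificates are binary strings of bounded length, and both function names are hypothetical.

```python
from itertools import product


def decide_by_trying_all_certificates(w, verifier, max_len, alphabet="01"):
    """Deterministically mimic 'guess a certificate, then verify it'.

    Tries every certificate over `alphabet` of length <= max_len, so the
    running time is exponential in max_len.
    """
    for length in range(max_len + 1):
        for symbols in product(alphabet, repeat=length):
            if verifier(w, "".join(symbols)):
                return True
    return False


def verify_composite_bits(x, certificate):
    """Toy verifier: the certificate is a divisor of x written in binary."""
    if not certificate:
        return False
    d = int(certificate, 2)
    return 1 < d < x and x % d == 0
```

With these definitions, `decide_by_trying_all_certificates(15, verify_composite_bits, 4)` accepts (the certificate `"11"` encodes the divisor 3), while the same call with 13 rejects after exhausting all certificates.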
Hamiltonian path
• A Hamiltonian path in directed graph G is a directed
path that visits each node exactly once
• Decision problem: are two nodes in G connected by a
Hamiltonian path?
• $\mathrm{HAMPATH} = \{ \langle G, s, t \rangle \mid G \text{ is a directed graph with a Hamiltonian path from } s \text{ to } t \}$
• No polytime algorithms are known!
HAMPATH is in NP
• We can verify a solution by just presenting the path
$\langle s, n_1, \ldots, t \rangle$ itself
• Worst case is $O(n^3)$ since the path is at most $n$ long and the graph
has at most $n^2$ edges
• But is HAMPATH in P?
• Don’t know! No such algorithm just yet
• Consider co-HAMPATH (i.e. does no such path exist?)
• Even if we are told that no path exists, we do not know how
to verify this without checking every possible path
(exponential)
• Seems like co-HAMPATH is not in NP (but no proof just yet)
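The contrast between verifying and deciding HAMPATH can be made concrete. This is a minimal sketch, assuming the graph is given as a list of nodes plus a list of directed `(u, v)` edge pairs; both function names are hypothetical. The verifier runs in polynomial time, while the decider simply tries every ordering of the nodes.

```python
from itertools import permutations


def verify_hampath(nodes, edges, s, t, path):
    """Polytime verifier: is `path` a Hamiltonian path from s to t?"""
    if not path or path[0] != s or path[-1] != t:
        return False
    # Every node must appear exactly once.
    if len(path) != len(nodes) or set(path) != set(nodes):
        return False
    edge_set = set(edges)
    # Consecutive nodes must be joined by a directed edge.
    return all((u, v) in edge_set for u, v in zip(path, path[1:]))


def decide_hampath_brute_force(nodes, edges, s, t):
    """Exponential-time decider: try every ordering of the nodes."""
    return any(
        verify_hampath(nodes, edges, s, t, list(order))
        for order in permutations(nodes)
    )
```

Note that the brute-force decider also answers co-HAMPATH (by negating the result), but only after checking up to n! orderings; no short certificate for "no such path" is known.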
P vs NP Question
• Most important open question in CS!
[Figure: the two possibilities, P = NP vs. P inside NP, with HAMPATH marked]
• P = the class of languages for which membership
can be decided quickly (determine whether
certificate exists in polytime)
• NP = the class of languages for which membership
can be verified quickly (given certificate, verify
correctness in polytime)
Implications for P = NP
• We shall shortly see that certain problems are NP-
complete
• If there exists a polytime algorithm for these problems, then
all problems in NP have polytime solutions!
• But no solutions as yet! (Maybe we need more time?)
• Verifiability seems easier than decidability, so we would
expect some problems to be in NP, but not in P
• If 𝑃 = 𝑁𝑃, then that means coming up with a solution is the
same as recognising a solution! e.g. Creating art is as hard as
recognising art (or creating a proof is the same as verifying
it)!
• Most believe 𝑃 ≠ 𝑁𝑃
• But we have no idea where to even start proving this!
Polytime Reducibility
• If a problem A reduces to problem B, then a solution to B
can be used to solve A
• Note that this means B is at least as hard as A (B could be harder
but not easier)
• When problem A is efficiently reducible to problem B, an
efficient solution to B can be used to solve A efficiently
• Efficiently reducible = polytime
• If conversion is polytime, and solution to B is polytime, then
A can be solved in polytime
• We can chain together problems e.g. C reduces to A which
reduces to B, etc
• If one of the languages has polytime solution, then all do!
• If A is polynomially reducible to B, we write $A \le_p B$
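A concrete reduction may help here. The example below is not from the slides: it uses the classic reduction from INDEPENDENT-SET (problem A) to CLIQUE (problem B), where the polytime conversion is taking the complement graph. Graphs are assumed to be undirected, with nodes numbered `0..n-1`, and all function names are hypothetical. The CLIQUE decider is brute force, so this shows only the shape of "convert, then call the solver for B".

```python
from itertools import combinations


def has_clique(n, edges, k):
    """Brute-force decider for B: does the graph have a clique of size k?"""
    edge_set = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) in edge_set for pair in combinations(group, 2))
        for group in combinations(range(n), k)
    )


def complement(n, edges):
    """Polytime conversion: keep exactly the edges that are missing."""
    edge_set = {frozenset(e) for e in edges}
    return [
        (u, v)
        for u, v in combinations(range(n), 2)
        if frozenset((u, v)) not in edge_set
    ]


def has_independent_set(n, edges, k):
    """Decider for A, obtained purely by reducing to B."""
    return has_clique(n, complement(n, edges), k)
```

If `has_clique` were ever replaced by a polytime algorithm, `has_independent_set` would become polytime as well, since the conversion takes only $O(n^2)$ steps.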
NP hard
• Let 𝐴 be a decision problem
• $A$ is NP-hard if for any $L \in \mathrm{NP}$, $L \le_p A$
• 𝐴 is at least as hard as any NP problem!
• We could use a solution to A to solve any problem
in NP
• 𝐴 not necessarily in NP
• Examples
• Halting problem
• Matrix permanent
[Figure: the NP-hard region drawn relative to NP]
NP complete
• Certain problems in NP are related to entire class
• If a polynomial time algorithm exists for any of
these problems, then all problems in NP would be
polytime solvable (i.e. P = NP)
• A problem is NP-complete if it is in NP and it is NP-hard
• Examples:
• 3SAT
• Maximum clique (decision version): given a graph, is there a
clique (complete subgraph) of size k?
[Figure: NP-complete drawn as the overlap of NP and NP-hard]
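Membership of 3SAT in NP comes from a simple verifier: the certificate is a truth assignment, and checking it takes one pass over the formula. The sketch below assumes a DIMACS-like encoding (literal `i` means variable x_i, `-i` means its negation); that encoding and the name `verify_3sat` are illustrative choices, not from the slides.

```python
def verify_3sat(clauses, assignment):
    """Accept iff `assignment` satisfies every clause.

    clauses: list of 3-tuples of non-zero ints (assumed encoding, see above).
    assignment: dict mapping variable index to bool (the certificate);
                variables missing from the dict are treated as False.
    """
    def literal_is_true(lit):
        value = assignment.get(abs(lit), False)
        return value if lit > 0 else not value

    # Each clause needs at least one true literal.
    return all(
        any(literal_is_true(lit) for lit in clause)
        for clause in clauses
    )
```

The check is linear in the size of the formula; the hard part, finding a satisfying assignment among the $2^n$ candidates, is what makes 3SAT NP-complete rather than obviously in P.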
Summary of classes
Space complexity
• So far we’ve spoken about time only
• What about memory requirements?
• There’s a relationship between time vs space!
$\mathrm{SPACE}(f(n)) = \{ L \mid L = L(N) \text{ s.t. } N \text{ is a TM that uses } O(f(n)) \text{ space on } n\text{-size input} \}$
$\mathrm{NSPACE}(f(n))$: as above, but for an NDTM
Some relations
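• $\mathrm{TIME}(f(n)) \subseteq \mathrm{SPACE}(f(n))$
• An algorithm that runs for $f(n)$ steps can access at most one
new memory location per step, so it uses at most $f(n)$ cells
• $\mathrm{SPACE}(f(n)) \subseteq \mathrm{TIME}(2^{O(f(n))})$
• Consider a TM that uses $f(n)$ memory cells
• Cells can have at most $|\Sigma|^{f(n)}$ configs
• TM head can be in at most $f(n)$ positions
• TM can be in at most $q$ possible FSM states
• There are at most $|\Sigma|^{f(n)} \cdot q \cdot f(n) = 2^{O(f(n))}$ TM configs!
• And a TM is in one config at each step; a TM that halts never
repeats a config, so it runs for at most as many steps as there are configs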
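The last equality in the configuration count can be spelled out as follows. This is a routine calculation, assuming $f(n) \ge 1$ so that $\log_2 f(n) \le f(n)$, and treating $|\Sigma|$ and $q$ as constants of the machine:

```latex
\begin{align*}
|\Sigma|^{f(n)} \cdot q \cdot f(n)
  &= 2^{\,f(n)\log_2|\Sigma| \;+\; \log_2 q \;+\; \log_2 f(n)} \\
  &\le 2^{\,(\log_2|\Sigma| + 1)\, f(n) \;+\; \log_2 q} \\
  &= 2^{O(f(n))}
\end{align*}
```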
More relations/definitions
• Savitch’s theorem:
$\mathrm{NSPACE}(f(n)) \subseteq \mathrm{SPACE}(f(n)^2)$
$\mathrm{PSPACE} = \bigcup_{k=1}^{\infty} \mathrm{SPACE}(n^k)$
Set of languages which can be decided by a TM using
polynomial space
$\mathrm{NPSPACE} = \bigcup_{k=1}^{\infty} \mathrm{NSPACE}(n^k)$
Set of languages which can be decided by an NDTM using
polynomial space
Relation summary
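• $\mathrm{TIME}(f(n)) \subseteq \mathrm{SPACE}(f(n))$
• $\mathrm{SPACE}(f(n)) \subseteq \mathrm{TIME}(2^{O(f(n))})$
• $\mathrm{NSPACE}(f(n)) \subseteq \mathrm{SPACE}(f(n)^2)$ (Savitch)
• By Savitch’s Theorem, $\mathrm{PSPACE} = \mathrm{NPSPACE}$!
Non-determinism only reduces space by a small (at most quadratic)
amount, but (assuming $\mathrm{P} \ne \mathrm{NP}$) reduces time
by a much larger, possibly exponential, amount!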
What we know
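• $\mathrm{EXPTIME}$: set of problems solvable in exponential time on a
deterministic TM
• $\mathrm{P} \subset \mathrm{EXPTIME}$ (the containment follows directly from the definitions)
• $\mathrm{P} \subseteq \mathrm{NP}$
• Probably not equal
• $\mathrm{NP} \subseteq \mathrm{EXPTIME}$
• Probably not equal
[Figure: nested classes, P inside NP inside EXPTIME]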
What we know
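• $\mathrm{PSPACE} \subset \mathrm{EXPSPACE}$
• Since $\mathrm{TIME}(f(n)) \subseteq \mathrm{SPACE}(f(n))$ and
$\mathrm{PSPACE} = \mathrm{NPSPACE}$…
• $\mathrm{P} \subseteq \mathrm{PSPACE}$
• $\mathrm{NP} \subseteq \mathrm{NPSPACE} = \mathrm{PSPACE}$
[Figure: nested classes, P inside NP inside PSPACE inside EXPSPACE]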
All together
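• $\mathrm{P} \subseteq \mathrm{NP} \subseteq \mathrm{PSPACE}$
• But don’t know if equal
• $\mathrm{PSPACE} \subseteq \mathrm{EXPTIME} \subseteq \mathrm{EXPSPACE}$
• But don’t know if equal
• But we do know:
• $\mathrm{PSPACE} \subset \mathrm{EXPSPACE}$
• $\mathrm{P} \subset \mathrm{EXPTIME}$
[Figure: nested classes, P inside NP inside PSPACE inside EXPTIME inside EXPSPACE]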
Also, PSPACE-hard and PSPACE complete
Conclusion
• Complexity classes allowed us to group algorithms or
problems independent of the computational model
• Polynomial vs exponential
• Polynomial is “easy”
• 𝑃 = 𝑁𝑃? Probably not…
• Completeness and reduction: solve one efficiently,
solve them all!
• Space complexity
• Relationship between time and space
• So many unanswered questions!