CS 172: Computability and Complexity
Minimization of DFAs
Sanjit A. Seshia
EECS, UC Berkeley
Acknowledgments: L. von Ahn, L. Blum, M. Blum, A. Sinclair
What is Minimization?
Minimized DFA for language L = DFA with fewest states that recognizes L
Also called minimal DFA
Why is Minimization Important?
DFAs are how computers manipulate
regular languages (expressions)
DFA size determines space/time efficiency
IS THIS MINIMAL?
NO
[Figure: state diagram of a DFA over {0, 1}]
HOW ABOUT THIS?
YES
[Figure: state diagram of a DFA over {0, 1}]
Equivalent DFAs
[Figure: state diagrams of two equivalent DFAs over {0, 1}]
Main Result of this Lecture
For every regular language L,
there exists a unique, minimal DFA
that recognizes L
• uniqueness up to re-labeling of states
Words → States
• Let DFA M = (Q, Σ, δ, q0, F)
• Each word w in Σ* corresponds to a unique
state in Q
– The ending state of M on w
• Given x, y ∈ Σ*
– x ∼M y iff
M ends in the same state on both x and y
– ∼M is an equivalence relation (why?)
– How many equivalence classes are there?
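A minimal sketch of ∼M in Python, assuming a hypothetical 3-state DFA (not the one from the slides) that tracks the number of 1s modulo 3. The names DELTA, START, end_state, same_state and classes_up_to are illustrative. Grouping words by ending state suggests that ∼M has one class per reachable state.

```python
from itertools import product

# Hypothetical DFA over {0, 1}: the state records (number of 1s) mod 3.
DELTA = {
    ("a", "0"): "a", ("a", "1"): "b",
    ("b", "0"): "b", ("b", "1"): "c",
    ("c", "0"): "c", ("c", "1"): "a",
}
START = "a"

def end_state(word, delta=DELTA, start=START):
    """Return the state M ends in after reading `word`."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state

def same_state(x, y):
    """x ~_M y: M ends in the same state on both words."""
    return end_state(x) == end_state(y)

def classes_up_to(max_len, alphabet="01"):
    """Group all words of length <= max_len by their ending state."""
    groups = {}
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            groups.setdefault(end_state(word), []).append(word)
    return groups

groups = classes_up_to(3)
print(sorted(groups))           # ['a', 'b', 'c']
print(same_state("1", "1111"))  # True: one 1 and four 1s agree mod 3
```

Since same_state compares a single value per word, reflexivity, symmetry and transitivity follow from the corresponding properties of equality.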
Example:
Is 1 ∼M 11? 10 ∼M 00?
[Figure: state diagram of the example DFA M over {0, 1}]
Indistinguishable Words/Strings
• Let DFA M = (Q, Σ, δ, q0, F) recognize L
• Given x, y ∈ Σ*
– x ∼L y (x and y are indistinguishable) iff
∀ z ∈ Σ*, xz ∈ L iff yz ∈ L
Compare with
– x ∼M y iff
M ends in the same state on both x and y
Example:
What are indistinguishable words?
[Figure: state diagram of the example DFA over {0, 1}]
∼L and ∼M
• Let DFA M = (Q, Σ, δ, q0, F) recognize L
• Given x, y ∈ Σ*
– x ∼L y (x and y are indistinguishable) iff
∀ z ∈ Σ*, xz ∈ L iff yz ∈ L
– x ∼M y iff
M ends in the same state on both x and y
– True or False:
• If x ∼M y then x ∼L y TRUE
• If x ∼L y then x ∼M y FALSE
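A small counterexample sketch for the second claim, assuming a hypothetical DFA (not from the slides) for L = words over {0, 1} containing at least one 1, built with two deliberately redundant accepting states t1 and t2. The ∼L check below only tries suffixes up to a bounded length; for this 3-state machine a bound of 3 is assumed to be enough, since states that can be told apart at all are told apart by a suffix shorter than the number of states.

```python
from itertools import product

# Hypothetical DFA for L = {words containing at least one 1}.
# t1 and t2 behave identically, so M has more states than it needs.
DELTA = {
    ("s", "0"): "s",   ("s", "1"): "t1",
    ("t1", "0"): "t2", ("t1", "1"): "t1",
    ("t2", "0"): "t2", ("t2", "1"): "t1",
}
START, ACCEPT = "s", {"t1", "t2"}

def end_state(word):
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state

def accepts(word):
    return end_state(word) in ACCEPT

def sim_M(x, y):
    """x ~_M y: same ending state."""
    return end_state(x) == end_state(y)

def sim_L(x, y, max_len=3):
    """Bounded check of x ~_L y: compare xz and yz for every |z| <= max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            z = "".join(letters)
            if accepts(x + z) != accepts(y + z):
                return False
    return True

print(sim_M("1", "10"))  # False: the words end in t1 and t2
print(sim_L("1", "10"))  # True: no suffix tells them apart
```

The first claim holds because a DFA is deterministic: once x and y reach the same state, every suffix z leads both to the same final state.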
Indistinguishable Words
• Let DFA M = (Q, Σ, δ, q0, F) recognize L
• Given x, y ∈ Σ*
– x ∼L y (x and y are indistinguishable) iff
∀ z ∈ Σ*, xz ∈ L iff yz ∈ L
– x ∼M y iff
M ends in the same state on both x and y
Which has more equivalence classes --
∼M or ∼L ?
Myhill-Nerode Theorem
(a version)
The relation ∼L defines a DFA ML for
language L where the states of ML
correspond to the equivalence
classes of ∼L
ML is the unique, minimal DFA for L
(up to isomorphism)
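One standard way to write out the DFA defined by ∼L is sketched below. The component names Q_L, δ_L, q_L and F_L are notation introduced here, not taken from the slides, and [x] denotes the ∼L class of x.

```latex
\begin{align*}
M_L &= (Q_L, \Sigma, \delta_L, q_L, F_L), \text{ where} \\
Q_L &= \{\, [x] : x \in \Sigma^* \,\} \\
\delta_L([x], \sigma) &= [x\sigma] \quad \text{for } \sigma \in \Sigma \\
q_L &= [\varepsilon] \\
F_L &= \{\, [x] : x \in L \,\}
\end{align*}
```

The transition function is well defined because x ∼L y implies xσ ∼L yσ: any suffix z that separated xσ from yσ would give a suffix σz separating x from y. Q_L is finite exactly when L is regular.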
Proof of Myhill-Nerode Thm.
Next:
Algorithm for DFA Minimization
Indistinguishable States
• Idea: Merge “indistinguishable states”
• Recall:
– States of DFA M map 1-1 to equivalence
classes of ∼M
– Each equivalence class of ∼M is in some
equivalence class of ∼L
• States p and q are indistinguishable iff
their corresponding ∼M equivalence classes
are in the same class of ∼L
– We write p ∼ q
– p ≁ q means “p and q are distinguishable”
The Algorithm We Want
Input: DFA M
Output: DFA ML such that:
M ≡ ML
ML has no unreachable states
ML is irreducible, i.e., the states of ML are pairwise distinguishable
Theorem: ML is the unique minimum DFA equivalent to M
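The requirement that the output has no unreachable states can be met by a search from the start state. The sketch below assumes a DFA stored as a dictionary from (state, symbol) to state; reachable_states and trim are hypothetical helper names, and the example DFA is made up for illustration.

```python
from collections import deque

def reachable_states(delta, start, alphabet):
    """Breadth-first search over transitions, starting from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for symbol in alphabet:
            nxt = delta[(state, symbol)]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def trim(alphabet, delta, start, accepting):
    """Drop unreachable states; the recognized language is unchanged."""
    keep = reachable_states(delta, start, alphabet)
    new_delta = {(q, a): delta[(q, a)] for q in keep for a in alphabet}
    return keep, new_delta, start, accepting & keep

# Hypothetical DFA: state "u" can never be reached from "a".
delta = {
    ("a", "0"): "a", ("a", "1"): "b",
    ("b", "0"): "b", ("b", "1"): "a",
    ("u", "0"): "a", ("u", "1"): "u",
}
keep, new_delta, start, accepting = trim("01", delta, "a", {"b", "u"})
print(sorted(keep))  # ['a', 'b']
```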
DFA Minimization Algo.: Idea
• States of ML are equivalence classes of ∼L
• Equivalence classes of ∼L can be obtained
by merging states of M
• Our algorithm works in reverse:
– Start by assuming all states as being merged
together
– Identify pairs of distinguishable states
• Repeat until no new distinguishable state-pairs
identified
TABLE-FILLING ALGORITHM
Input: DFA M = (Q, Σ, δ, q0, F)
Output: Table: { (p, q) | p, q ∈ Q and p ≁ q }
States of ML = { [q] | q ∈ Q }
Base Case: p accepts and q rejects ⇒ p ≁ q
Recursion: if δ(p, σ) = p′ and δ(q, σ) = q′ for some σ ∈ Σ, and p′ ≁ q′, then p ≁ q
[Figure: triangular table with rows and columns q0, q1, ..., qi, ..., qn; distinguishable pairs are marked “d”]
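The base case and recursion can be sketched in Python as follows. This is a plain fixed-point loop that rescans the table until nothing changes; it is meant to illustrate the marking rule and makes no attempt to match any particular running-time bound. The function names and the 3-state example DFA are assumptions for illustration, and unreachable states are assumed to have been removed already.

```python
from itertools import combinations

def distinguishable_pairs(states, alphabet, delta, accepting):
    """Return the set of unordered pairs {p, q} that get marked "d"."""
    marked = set()
    # Base case: exactly one of p, q is an accepting state.
    for p, q in combinations(states, 2):
        if (p in accepting) != (q in accepting):
            marked.add(frozenset((p, q)))
    # Recursion: mark {p, q} if some symbol leads to an already marked pair.
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in alphabet:
                succ = frozenset((delta[(p, a)], delta[(q, a)]))
                if succ in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked

def merge_classes(states, marked):
    """Group states into classes [q] whose members are pairwise unmarked."""
    classes = []
    for q in states:
        for cls in classes:
            if frozenset((q, cls[0])) not in marked:
                cls.append(q)
                break
        else:
            classes.append([q])
    return classes

# Hypothetical DFA for "contains at least one 1"; t1 and t2 are redundant.
states = ["s", "t1", "t2"]
delta = {
    ("s", "0"): "s",   ("s", "1"): "t1",
    ("t1", "0"): "t2", ("t1", "1"): "t1",
    ("t2", "0"): "t2", ("t2", "1"): "t1",
}
marked = distinguishable_pairs(states, "01", delta, {"t1", "t2"})
print(merge_classes(states, marked))  # [['s'], ['t1', 't2']]
```

Comparing a new state against one representative per class is enough here because the unmarked pairs form an equivalence relation once the loop has reached its fixed point.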
Example
[Figure: state diagram of a DFA over {0, 1} with states q0, q1, q2, q3]
q0
q1  d
q2  d  d
q3  d  d  d
    q0 q1 q2 q3
[Figure: state diagram of a DFA over {0, 1} with states q0, q1, q2, q3]
q0
q1 d
q2 d
q3 d d
q0 q1 q2 q3
Do Try this at Home!
[Figure: state diagram of a DFA over {0, 1} with states q0, q1, q2, q3, q4, q5]
[Table to fill in: rows and columns indexed by the states q0 through q5]
Correctness of Algorithm
1. If the algorithm marks (p, q) as “d”, then p ≁ q
2. If p ≁ q, then the algorithm marks (p, q) as “d”
Proving (1) is easy. Use induction on the
step at which (p, q) was marked “d”.
Part (2):
If p ≁ q, then the algorithm marks (p, q) as “d”
Proof (by contradiction):
Suppose p ≁ q, but the algorithm does not
mark (p, q) as “d”
Since p ≁ q, there exists w such that:
δ̂(p, w) ∈ F and δ̂(q, w) ∉ F (or vice versa)
Of all such “bad pairs” (p, q), let p, q be a
pair with the smallest |w|
w = σw′, where σ ∈ Σ (w is not ε, why?)
Let p′ = δ(p, σ) and q′ = δ(q, σ)
Then (p′, q′) is also a bad pair
Contradiction! (why?)
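One way to fill in the two "why?" steps is sketched below, writing p′ = δ(p, σ) and q′ = δ(q, σ) for the successor states and w′ for the rest of w after its first symbol σ. First, w cannot be ε: otherwise p ∈ F and q ∉ F, and the base case would already have marked (p, q). The key identity is then the recursive definition of the extended transition function:

```latex
\begin{align*}
\hat{\delta}(p, w) &= \hat{\delta}(p, \sigma w')
  = \hat{\delta}(\delta(p, \sigma), w') = \hat{\delta}(p', w') \in F \\
\hat{\delta}(q, w) &= \hat{\delta}(q, \sigma w')
  = \hat{\delta}(\delta(q, \sigma), w') = \hat{\delta}(q', w') \notin F
\end{align*}
```

So w′ distinguishes p′ from q′, and |w′| < |w|. The pair (p′, q′) cannot be marked “d”, because the recursion step would then have marked (p, q) as well. Hence (p′, q′) is a bad pair with a shorter distinguishing word, which contradicts the choice of (p, q) as a bad pair with the smallest |w|.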
Complexity of Algorithm
• For DFA M, let
– Number of states of M be n
– Size of the input alphabet Σ be k
• Initialization of table: O(n²)
• Rest of the algorithm: O(k n²)
Minimal NFA is NOT Unique
[Figure: NFA state diagrams with transitions labeled 0]
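A concrete illustration, not necessarily the pair drawn on the slide: the two hypothetical 2-state NFAs below are both meant to recognize 0⁺ (one or more 0s) over the alphabet {0}. The code only checks agreement on words up to a fixed length, so it is a sanity check rather than a proof.

```python
def nfa_accepts(delta, start, accepting, word):
    """Simulate an NFA (no epsilon moves) by tracking the set of current states."""
    current = {start}
    for symbol in word:
        current = {t for s in current for t in delta.get((s, symbol), ())}
    return bool(current & accepting)

# NFA A: the self-loop sits on the accepting state a1.
NFA_A = ({("a0", "0"): {"a1"}, ("a1", "0"): {"a1"}}, "a0", {"a1"})
# NFA B: the self-loop sits on the start state b0; b1 has no outgoing moves.
NFA_B = ({("b0", "0"): {"b0", "b1"}}, "b0", {"b1"})

words = ["0" * n for n in range(8)]
agree = all(nfa_accepts(*NFA_A, w) == nfa_accepts(*NFA_B, w) for w in words)
print(agree)  # True
```

No 1-state NFA recognizes 0⁺: if its only state is accepting it accepts ε, and otherwise it accepts nothing. Both machines are therefore minimal, yet no relabeling of states maps one to the other, since the accepting state has a self-loop in A and no outgoing transition in B.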
Next Steps
• Read Sipser 2.1 in preparation for next
lecture