Automata Module 2
SEMESTER – V
Module-2
NOTES
Regular Expressions (RE): what is a RE? Kleene’s theorem, Applications of REs, Manipulating
and Simplifying REs. Regular Grammars: Definition, Regular Grammars and Regular languages.
Regular Languages (RL) and Non-regular Languages: How many RLs, To show that a language is
regular, Closure properties of RLs, to show some languages are not RLs.
Module 2
CONTENTS
Chapter No: 2
2.1 REGULAR EXPRESSION
2.2.5 To show some languages are not RLs (Pumping Theorem for RLs)
Automata Theory and Computability Regular Expressions
Regular Expressions
Introduction
Instead of focusing on the power of a computing device, let's look at the task that we need to
perform. Let's consider problems in which our goal is to match finite or repeating patterns.
For example, regular expressions are used as a pattern description language in:
• Lexical analysis in a compiler.
• Filtering email for spam.
• Sorting email into appropriate mailboxes based on sender and/or content words and
phrases.
• Searching a complex directory structure by specifying patterns that are known to occur in
the file we want.
A regular expression is a pattern description language, which is used to describe particular
patterns of interest. A regular expression provides a concise and flexible means for
"matching" strings of text, such as particular characters, words, or patterns of characters.
Example: [ ] : A character class which matches any character within the brackets
[^ \t\n] matches any character except space, tab and newline character.
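The character class above can be tried out directly. The following is a minimal sketch using Python's standard `re` module; the helper name `non_blank_chars` and the sample text are illustrative assumptions, not part of the notes.

```python
import re

# [^ \t\n] matches one character that is not a space, a tab, or a newline.
NON_BLANK = re.compile(r"[^ \t\n]")

def non_blank_chars(text):
    """Return every character of text that the negated class matches."""
    return NON_BLANK.findall(text)

print(non_blank_chars("a b\tc\nd"))   # ['a', 'b', 'c', 'd']
```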
Regular expression:
A language accepted by a finite state machine is called a regular language. A regular
language can be described using regular expressions, an algebraic notation
consisting of the symbols of the alphabet Σ and a set of special symbols to
which we attach particular meanings when they occur in a regular expression. These
symbols are Ø, U, ε, (, ), *, and +.
Obtain a regular expression to accept the language containing at least one a and one b over Σ = {a,
b, c}. OR
Obtain a regular expression to accept the language containing at least one 0 and one 1 over Σ = {0,
1, 2}.
The string should contain at least one a and one b, in either order, so the core of the regular
expression is ab + ba.
There is no restriction on c’s. Insert any number of a’s, b’s and c’s, i.e. (a+b+c)*, before, between
and after the symbols of the above regular expression.
So the regular expression = (a+b+c)* a (a+b+c)* b (a+b+c)* + (a+b+c)* b (a+b+c)* a (a+b+c)*
Obtain regular expression to accept the language containing at least 3 consecutive zeros.
Regular expression for string containing 3 consecutive 0’s = 000
The above regular expression can be preceded or followed by any number of 0’s and 1’s, ie:
(0+1)*
Regular expression = (0+1)*000(0+1)*
Obtain regular expression to accept the language containing strings of a’s and b’s ending with b
and has no substring aa.
The strings of a’s and b’s that end with b and have no substring aa are exactly the nonempty
concatenations of the blocks b and ab (ε is excluded).
Regular expression = ( b + ab) (b +ab)*
Obtain regular expression to accept the language containing strings of a’s and b’s such that
L = { a^{2n} b^{2m} | n, m ≥ 0 }
a^{2n} means even number of a’s, regular expression = (aa)*
b^{2m} means even number of b’s, regular expression = (bb)*
The regular expression for the given language = (aa)* (bb)*
Obtain regular expression to accept the language containing strings of a’s and b’s such that
L = { a^{2n+1} b^{2m} | n, m ≥ 0 }.
a^{2n+1} means odd number of a’s, regular expression = a(aa)*
b^{2m} means even number of b’s, regular expression = (bb)*
The regular expression for the given language = a(aa)* (bb)*
Obtain regular expression to accept the language containing strings of a’s and b’s such that
L = { a^{2n+1} b^{2m+1} | n, m ≥ 0 }.
a^{2n+1} means odd number of a’s, regular expression = a(aa)*
b^{2m+1} means odd number of b’s, regular expression = b(bb)*
The regular expression for the given language = a(aa)*b(bb)*
Obtain regular expression to accept the language containing strings of 0’s and 1’s with exactly
one 1 and an even number of 0’s.
Regular expression for exactly one 1 = 1
Even number of 0’s = (00)*
The total number of 0’s is even in two cases: the 1 is preceded and followed by an even number
of 0’s, or the 1 is preceded and followed by an odd number of 0’s.
The regular expression for the given language = (00)* 1 (00)* + 0(00)* 1 0(00)*
Obtain regular expression to accept the language containing strings of 0’s and 1’s having no two
consecutive 0’s. OR
Obtain regular expression to accept the language containing strings of 0’s and 1’s with no pair of
consecutive 0’s.
Whenever a 0 occurs inside the string it should be followed by 1. But there is no restriction on the
number of 1’s. So the string is any combination of 1’s and 01’s, i.e. regular expression = (1+01)*
To also allow strings that end with 0, append (0 + ε) at the end of the above regular
expression.
Regular expression for the given language = (1+01)* (0 + ε)
Obtain regular expression to accept the language containing strings of 0’s and 1’s having no two
consecutive 1’s. OR
Obtain regular expression to accept the language containing strings of 0’s and 1’s with no pair of
consecutive 1’s.
Whenever a 1 occurs inside the string it should be followed by 0. But there is no restriction on the
number of 0’s. So the string is any combination of 0’s and 10’s, i.e. regular expression = (0+10)*
To also allow strings that end with 1, append (1 + ε) at the end of the above regular
expression.
Regular expression for the given language = (0+10)* (1 + ε)
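Expressions such as (0+10)* (1 + ε) can be sanity-checked by brute force. The sketch below translates the notation used in these notes into Python's `re` syntax and compares the expression with a plain predicate on every string up to a chosen length. The helper names `to_python_regex` and `agrees` are illustrative assumptions, and the translation only works when + denotes union (not the Kleene plus). Agreement up to a bounded length is evidence, not a proof.

```python
import re
from itertools import product

def to_python_regex(textbook_re):
    """Translate the notes' notation: '+' is union, 'ε' is the empty string."""
    return textbook_re.replace(" ", "").replace("+", "|").replace("ε", "")

def agrees(textbook_re, predicate, alphabet, max_len):
    """True if the expression and the predicate accept the same strings up to max_len."""
    pattern = re.compile(to_python_regex(textbook_re))
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if (pattern.fullmatch(w) is not None) != predicate(w):
                return False
    return True

# No two consecutive 1's, and no two consecutive 0's.
print(agrees("(0+10)* (1 + ε)", lambda w: "11" not in w, "01", 8))
print(agrees("(1+01)* (0 + ε)", lambda w: "00" not in w, "01", 8))
```

Both calls are expected to print True; a False result would point to a string on which the expression and the informal description disagree.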
vii. Obtain the regular expression to accept the words with two or more letters but
beginning and ending with the same letter. Σ = { a, b}
Regular expression beginning and ending with same letter is = a a + b b. In between
include any number of a’s and b’s.
Therefore the regular expression = a (a+b)* a + b (a+b)* b
viii. Strings of a’s and b’s whose length is either even or a multiple of 3.
Length is a multiple of 3, regular expression = [(a+b) (a+b) (a+b)]*
Length is even, regular expression = [(a+b) (a+b)]*
So the regular expression for the given language = [(a+b) (a+b) (a+b)]* + [(a+b) (a+b)]*
ix. Obtain the regular expression to accept the language L = { a^n b^m | m+n is even }
Here n represents number of a’s and m represents number of b’s.
m+n is even results in two possible cases;
case i. when even number of a’s followed by even number of b’s.
regular expression : (aa)*(bb)*
case ii. Odd number of a’s followed by odd number of b’s.
regular expression = a(aa)* b(bb)*.
So the regular expression for the given language = (aa)*(bb)* + a(aa)* b(bb)*
xi. Obtain the regular expression to accept the language L = { a^n b^m c^p | n ≥ 4, m ≤ 3 and
p ≤ 2 }.
Here n ≥ 4 means at least 4 a’s, the regular expression for this = aaaa(a)*
m ≤ 3 means at most 3 b’s, regular expression for this = (ε+b) (ε+b) (ε+b).
p ≤ 2 means at most 2 c’s, regular expression for this = (ε+c) (ε+c)
So the regular expression for the given language = aaaa(a)* (ε+b) (ε+b) (ε+b) (ε+c) (ε+c).
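xii. All strings of a’s and b’s that do not end with ab.
Strings of length less than 2 (ε, a and b) cannot end with ab. Strings of length 2 that do not
end with ab are ba, aa and bb, and any longer string must end with one of these.
So the regular expression = ε + a + b + (a+b)*(aa + ba + bb)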
xiii. All strings of a’s, b’s and c’s with exactly one a.
The regular expression = (b+c)* a (b+c)*
xiv. All strings of a’s and b’s with at least one occurrence of each symbol in Σ = {a, b}.
At least one occurrence of a and of b means ab + ba; before, between and after these two
symbols we may have any number of a’s and b’s.
So the regular expression = (a+b)* a (a+b)* b(a+b)* +(a+b)* b(a+b)* a(a+b)*
Case ii. Since nm ≥ 3, if m = 1 then n should be ≥ 3. The equivalent regular expression is given
by: RE = aaa(a)* b
Case iii. Since nm ≥ 3, if m ≥ 2 and n ≥ 2 then the equivalent regular expression is given by:
RE = aa(a)* bb(b)*
So the final regular expression is obtained by taking the union of all the above regular expressions.
Regular expression = abbb(b)* + aaa(a)*b + aa(a)*bb(b)*
Automata Theory and Computability Kleene’s Theorem
The regular expression language provides three operators (precedence order from highest to
lowest)
1. Kleene star
2. Concatenation, and
3. Union
NOTE:
(α U ε): optional α; the expression can be satisfied either by matching α or by the empty string.
(a U b)*: describes the set of all strings composed of the characters a and b.
a* U b* ≠ (a U b)*: every string in the language on the left contains only a’s or only b’s, whereas
the language on the right contains every combination of a’s and b’s (for example ab).
(ab)* ≠ a*b*: the language on the left contains the string abab, while the
language on the right does not. The language on the right contains the string aaabbbb, while
the language on the left does not.
The regular expression a* is simply a string. It is different from the language L(a*) = {w : w is
composed of zero or more a’s}.
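The two inequalities in the note can be confirmed on the witness strings it mentions. This is a small sketch with Python's `re` module, where | plays the role of U; the helper name `accepts` is an illustrative assumption.

```python
import re

def accepts(pattern, w):
    """True if the whole string w matches the pattern."""
    return re.fullmatch(pattern, w) is not None

# a* U b* versus (a U b)*: the string ab separates them.
print(accepts(r"a*|b*", "ab"), accepts(r"(a|b)*", "ab"))            # False True

# (ab)* versus a*b*: abab and aaabbbb separate them in opposite directions.
print(accepts(r"(ab)*", "abab"), accepts(r"a*b*", "abab"))          # True False
print(accepts(r"(ab)*", "aaabbbb"), accepts(r"a*b*", "aaabbbb"))    # False True
```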
Kleene's Theorem
• The regular expression language is a useful way to define patterns.
• Any language that can be defined by a regular expression can be accepted by some finite
state machine.
• Any language that can be accepted by a finite state machine can be defined by some
regular expression.
Automata Theory and Computability Regular Expressions to FSM
Let β and γ be regular expressions that define languages over the alphabet ∑
If L (β) is regular, then it is accepted by some FSM M1 = (K1, ∑, δ1, s1, A1).
If L (γ) is regular, then it is accepted by some FSM M2 = (K2, ∑, δ2, s2, A2).
If regular expression α = β U γ and if both L(β) and L(γ) are regular, then we construct M3 = (K3,
∑, δ3, s3, A3), such that L(M3) = L(α) = L(β) U L(γ).
Construct a new machine M3 by creating a new start state s3 and connecting it to the start states of
M1 and M2 via ε-transitions. M3 accepts if either M1 or M2 accepts.
So M3 = ({s3} U K1 U K2, ∑, δ3, s3, A1 U A2) where δ3 = δ1 U δ2 U {((s3, ε), s1), ((s3, ε), s2)}
If regular expression α = βγ and if both L(β) and L(γ) are regular, then we construct M3 = (K3, ∑,
δ3, s3, A3), such that L(M3) = L(α) = L(β)L(γ).
Construct a new machine M3 by connecting every accepting state of M1 to the start state of M2
via an ε-transition. M3 will start in the start state of M1 and will accept if M2 does. So M3 = (K1
U K2, ∑, δ3, s1, A2) where δ3 = δ1 U δ2 U {((q, ε), s2) : q ∈ A1}
If regular expression α = β* and L(β) is regular, accepted by M1 = (K1, ∑, δ1, s1, A1), then we
construct M2 = (K2, ∑, δ2, s2, A2), such that L(M2) = L(α) = L(β)*
M2 is constructed by creating a new start state s2 and making it an accepting state, thus ensuring that
M2 accepts ε. We link the new s2 to s1 via an ε-transition. Finally, we create ε-transitions from
each of M1’s accepting states back to s1. So M2 = ({s2} U K1, ∑, δ2, s2, {s2} U A1) where δ2 = δ1
U {((s2, ε), s1)} U {((q, ε), s1) : q ∈ A1}
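The three constructions can be sketched in Python as follows. This is an illustrative sketch, not a definitive implementation: a machine is represented as a dictionary with keys `start`, `accept` and `delta` (an assumed representation), ε is written as the empty string, and every call to the hypothetical helper `symbol` creates fresh states, so each machine should be used in only one construction.

```python
from itertools import count

_fresh = count()   # supplies unused state names

def symbol(c):
    """FSM accepting exactly the one-character string c."""
    s, f = next(_fresh), next(_fresh)
    return {"start": s, "accept": {f}, "delta": {(s, c): {f}}}

def _merge(d1, d2):
    delta = {k: set(v) for k, v in d1.items()}
    for k, v in d2.items():
        delta.setdefault(k, set()).update(v)
    return delta

def union(m1, m2):
    s3 = next(_fresh)                                   # new start state s3
    delta = _merge(m1["delta"], m2["delta"])
    delta[(s3, "")] = {m1["start"], m2["start"]}        # ε-moves to s1 and s2
    return {"start": s3, "accept": m1["accept"] | m2["accept"], "delta": delta}

def concat(m1, m2):
    delta = _merge(m1["delta"], m2["delta"])
    for q in m1["accept"]:                              # ε-moves from A1 to s2
        delta.setdefault((q, ""), set()).add(m2["start"])
    return {"start": m1["start"], "accept": set(m2["accept"]), "delta": delta}

def star(m1):
    s2 = next(_fresh)                                   # new accepting start state
    delta = _merge(m1["delta"], {})
    delta[(s2, "")] = {m1["start"]}
    for q in m1["accept"]:                              # ε-moves from A1 back to s1
        delta.setdefault((q, ""), set()).add(m1["start"])
    return {"start": s2, "accept": {s2} | m1["accept"], "delta": delta}

def eps_closure(m, states):
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in m["delta"].get((q, ""), ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def accepts(m, w):
    current = eps_closure(m, {m["start"]})
    for c in w:
        moved = set()
        for q in current:
            moved |= m["delta"].get((q, c), set())
        current = eps_closure(m, moved)
    return bool(current & m["accept"])

# (b U ab)*
m = star(union(symbol("b"), concat(symbol("a"), symbol("b"))))
print([w for w in ["", "b", "ab", "bab", "abb", "a", "ba", "aab"] if accepts(m, w)])
```

The last line should list only the strings built from the blocks b and ab, namely the empty string, b, ab, bab and abb.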
NOTE:
Finite state machines constructed from regular expressions are typically highly nondeterministic
because of their use of ε-transitions, and they have a large number of unnecessary states. As
a practical matter this is not a problem: given an arbitrary NDFSM M, we have an algorithm
that constructs an equivalent DFSM M’, and we also have an algorithm that minimizes M’.
Construct a FSM for the regular expression (b U ab)*
OR
Convert the regular expression (b + ab)* to an ε- NFA
OR
Convert the regular expression (b, ab)* to a FSM.
FSM for b :
FSM for a :
FSM for ab :
Automata Theory and Computability FSM to Regular Expressions (State Elimination)
We can build an equivalent machine M' by eliminating state q2 and replacing it by a transition
from q1 to q3 labeled with the regular expression ab*a.
So M' is:
There is no incoming edge into the initial state and no outgoing edge from the final state, so
only two states remain: the initial state and the final state.
Obtain the regular expression for the following finite automata using state elimination method
There is no incoming edge into the initial state as well as no outgoing edge from final state.
After eliminating the state B:
Regular expression = ab
Obtain the regular expression for the following finite automata using state elimination method
There is no incoming edge into the initial state as well as no outgoing edge from final state.
After eliminating the state B:
Since the initial state has an incoming edge and the final state has an outgoing edge, we have to
create a new initial state and a new final state: connect the new initial state to the old initial state
through ε, and the old final state to the new final state through ε. Make the old final state a non-final state.
Obtain the regular expression for the following finite automata using state elimination method
Since there are multiple final states, we have to create a new final state.
Obtain the regular expression for the following finite automata using state elimination method
Obtain the regular expression for the following finite automata using state elimination method
Obtain the regular expression for the following finite automata using state elimination method
Since start state 1 has incoming transitions, we create a new start state and link it to state
1 through ε.
Since accepting states 1 and 2 have outgoing transitions, we create a new accepting state and link
states 1 and 2 to it through ε. Remove the old accepting states from the set of
accepting states (i.e. treat 1 and 2 as non-final states).
Finally we have only the start and final states, with one transition from start state 1 to final state 2.
The label on that transition is the regular expression.
Automata Theory and Computability FSM to Regular Expressions (Kleene’s Theorem)
Kleene’s Theorem
Theorem: Every regular language (ie: every language that can be accepted by some FSM) can
be defined with a regular expression.
The proof is by construction: given an FSM M, build an equivalent regular expression using the
following steps.
1. Remove any states from the given FSM M that are unreachable from the start state.
2. If the start state of M is part of a loop (i.e. it has any transitions coming into it), then
create a new start state s and connect it to M’s start state via an ε-transition. This new
start state s will have no transitions into it.
3. If there is more than one accepting state of M or if there is just one but there are any
transitions out of it, create a new accepting state and connect each of M’s accepting states
to it via an ε-transition. Remove the old accepting states from the set of accepting states.
Note that the new accepting state will have no transitions out from it.
4. If there is more than one transition between states p and q, collapse them into a single
transition.
5. If there is a pair of states p, q and there is no transition between them and p is not the
accepting state and q is not the start state, then create a transition from p to q labeled Ø.
6. At this point, if M has only one state, then that state is both the start state and the
accepting state and M has no transitions. So L(M) = {ε}. Halt and return the simple
regular expression ε.
7. If M has no accepting states then halt and return the simple regular expression Ø.
8. Until only the start state and the accepting state remain do:
i. Select some state rip of M. Any state except the start
state or the accepting state may be chosen.
ii. For every transition from some state p to some state q,
if neither p nor q is rip then, using the current
labels given by the expressions R, compute the new
label R’ for the transition from p to q using the formula:
R’(p, q) = R(p, q) U R(p, rip) R(rip, rip)* R(rip, q)
9. Remove rip and all transitions into and out of it, then repeat step 8 while intermediate states remain.
10. Return the regular expression that labels the one remaining transition from the start state
to the accepting state.
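The procedure above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: labels are strings in Python `re` syntax (| for U, (?:...) for grouping), None stands for Ø and the empty string for ε; the function always adds a fresh start and accepting state instead of testing whether they are needed, and it skips the removal of unreachable states (which does not change the language). The function and parameter names are illustrative.

```python
import re
from itertools import product

def union(r, s):
    if r is None:
        return s
    if s is None:
        return r
    return r if r == s else f"(?:{r}|{s})"

def concat(*parts):
    return None if any(p is None for p in parts) else "".join(parts)

def star(r):
    return "" if r in (None, "") else f"(?:{r})*"      # Ø* = ε* = ε

def fsm_to_regex(states, transitions, start, accepting):
    """transitions: iterable of (p, symbol, q). Returns a regex string, or None for Ø."""
    if not accepting:
        return None                                    # step 7
    new_start, new_accept = object(), object()         # steps 2 and 3
    R = {}

    def add(p, q, label):                              # step 4: merge parallel edges
        R[(p, q)] = union(R.get((p, q)), label)

    add(new_start, start, "")
    for q in accepting:
        add(q, new_accept, "")
    for p, a, q in transitions:
        add(p, q, re.escape(a))

    remaining = list(states)
    while remaining:                                   # step 8
        rip = remaining.pop()
        loop = star(R.get((rip, rip)))
        for p in [new_start] + remaining:
            for q in remaining + [new_accept]:
                # Missing transitions behave like Ø (step 5).
                via = concat(R.get((p, rip)), loop, R.get((rip, q)))
                new = union(R.get((p, q)), via)
                if new is not None:
                    R[(p, q)] = new
        R = {k: v for k, v in R.items() if rip not in k}   # step 9
    return R.get((new_start, new_accept))              # step 10

# Sample DFSM (an assumption for illustration): even number of a's over {a, b}.
regex = fsm_to_regex(
    ["E", "O"],
    [("E", "a", "O"), ("O", "a", "E"), ("E", "b", "E"), ("O", "b", "O")],
    "E", {"E"})
print(regex)
ok = all((re.fullmatch(regex, "".join(t)) is not None) == (t.count("a") % 2 == 0)
         for n in range(8) for t in product("ab", repeat=n))
print(ok)
```

The resulting expression depends on the order in which states are ripped out, so different orders can give different but equivalent expressions; the final check compares the result with the intended language on all strings up to length 7.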
Construct the regular expression for the following FSM using Kleene’s Theorem
Automata Theory and Computability Application of Regular expressions
Identity                              Example
ɛR = Rɛ = R                           1ɛ = ɛ1 = 1
ØR = RØ = Ø                           1Ø = Ø1 = Ø
ɛ* = ɛ
Ø* = ɛ
Ø + R = R + Ø = R                     Ø + 1 = 1
R + R = R                             1 U 1 = 1
(R*)* = R*                            (1*)* = 1*
R*R* = R*
ɛ + RR* = R*                          ɛ + 1⁺ = 1*
(P+Q)R = PR + QR
(P+Q)* = (P*Q*)* = (P* + Q*)*
R*(ɛ + R) = (ɛ + R)R* = R*
(ɛ + R)* = R*
ɛ + R* = R*
(PQ)*P = P(QP)*
R*R + R = R*R = R⁺
(Here R⁺ denotes RR*, one or more repetitions of R.)
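Identities of this kind can be spot-checked by comparing the two sides on all short strings. The sketch below does this with Python's `re` module for sample values of P, Q and R; these sample values and the helper name `language` are assumptions chosen for illustration. Agreement on short strings is only evidence that an identity holds, not a proof.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=6):
    """All strings over the alphabet, up to max_len, that match the pattern entirely."""
    compiled = re.compile(pattern)
    words = ("".join(t) for n in range(max_len + 1) for t in product(alphabet, repeat=n))
    return {w for w in words if compiled.fullmatch(w)}

P, Q, R = "a", "ab", "b"   # sample expressions; '|' is union in Python syntax
checks = {
    "(P+Q)R = PR + QR":      (f"(?:{P}|{Q}){R}", f"{P}{R}|{Q}{R}"),
    "(PQ)*P = P(QP)*":       (f"(?:{P}{Q})*{P}", f"{P}(?:{Q}{P})*"),
    "(P+Q)* = (P*Q*)*":      (f"(?:{P}|{Q})*", f"(?:(?:{P})*(?:{Q})*)*"),
    "(P+Q)* = (P* + Q*)*":   (f"(?:{P}|{Q})*", f"(?:(?:{P})*|(?:{Q})*)*"),
    "ε + RR* = R*":          (f"(?:|{R}{R}*)", f"{R}*"),
    "(ε + R)R* = R*":        (f"(?:|{R}){R}*", f"{R}*"),
}
results = {name: language(lhs) == language(rhs) for name, (lhs, rhs) in checks.items()}
for name, same in results.items():
    print(name, same)
```

Each line should report True. Replacing a right-hand side by something that is not equivalent, for example P*Q* without the outer star, makes the corresponding check fail.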
Automata Theory and Computability Simplification of Regular expressions
9. (Number of a’s in w is at most 3)
Regular expression = b* (a + ε) b* (a + ε) b* (a + ε) b*
10. {w ∈ {a, b}* : w contains exactly two occurrences of the substring aa}
Regular expression = (b + ab)*aaa (b + ba)* + (b + ab)*aab (b + ab)*aa(b + ba)*
Simplify each of the following regular expressions
a. (a U b)* (a U ε) b* = (a U b)*
b. (Ø* + b) b*
We know that Ø* = ε, so (Ø* + b) b* = (ε + b) b* = b* + bb* = b*
c. (a U b)*a* U b = (a U b)*
d. ((a U b)*)* = (a U b)*
e. ((a U b)⁺)* = (a U b)*
Let L = {a^n b^n : 0 ≤ n ≤ 4}.
Regular expression = (ε + ab + aabb + aaabbb + aaaabbbb)
Write the regular expression for the following FSM M using state elimination method.
Write the regular expression for the following FSM M using state elimination method
Automata Theory and Computability Regular grammars
Regular Grammars
So far we have considered two equivalent ways to describe exactly the class of regular
languages:
Finite state machines.
Regular expressions.
We now introduce a third:
• Regular grammars (sometimes also called right linear grammars).
Define regular Grammar
A regular grammar G is a quadruple (V, ∑, R,S)
where:
• V is the rule alphabet, which contains Non-terminal symbols and Terminal symbols.
• ∑ is the set of terminal symbols ( Subset of V)
• R is a finite set of rules of the form X → Y
• S is the start symbol, which is a non-terminal symbol.
Example: G = ({ A, C, a, b}, {a, b}, R, A) where rule R is: A → ε, A → b, A → aC and C →a
Non-Terminal and Terminal Symbols:
Non-terminal Symbols: symbols that are used in the grammar but that do not appear in strings
in the language. In above example A, C are non-terminal (Variable) symbols.
Terminal Symbols: symbols that can appear in strings generated by G. In above example a, b
are terminal symbols.
Rules R of any Regular Grammar:
Every rule in R is of the form X → Y and must satisfy the following 2 conditions:
1. Left-hand side contains only one symbol that must be a non- terminal.
2. RHS contains ε or a single character (terminal) or a single character (terminal) followed
by a single non-terminal
Example: A → ε or A → b or A → aA are legal Rules
BA → ε , A → aSa are not legal rules.
Language generated by a Grammar:
The language generated by a grammar G = (V, ∑, R, S), denoted L(G), is the set of all strings w
in ∑* such that it is possible to start with S, apply some finite sequence of rules in R, and derive w.
To generate a string using a regular grammar G, start with S and apply derivation steps, replacing
the non-terminal symbol in each step, until the required string is generated.
Regular Grammars and Regular Languages:
Theorem: The class of languages that can be defined with regular grammars is exactly the
regular languages.
To prove this theorem one must show that for a given regular grammar it is possible to construct
an equivalent FSM, and that from an FSM it is possible to construct an equivalent regular grammar.
By applying the above methods (the above FSM does not contain a final state labelled #):
The regular Grammar which defines the regular language L is:
Regular grammar G = ( { S, A, a, b}, { a, b}, R, S ) where S is the start Non-terminal symbol of
grammar G and R is the rules defined as:
S→ aA
S → bS
A → aA
A → bA
A→ε
Method for conversion of Regular Grammar G to FSM:
1. Create in FSM M a separate state for each non terminal in V.
2. Make the state corresponding to S the start state.
3. If there are any rules in R of the form X → w, for some w ∈ ∑, then create an additional
state labelled #.
4. For each rule of the form X → wY, add a transition from X to Y labelled w.
5. For each rule of the form X → w, add a transition from X to # labelled w.
6. For each rule of the form X → ɛ, mark state X as accepting.
7. Mark state # as accepting.
Consider the following regular grammar G:
S → aT
T → bT
T→a
T → aW
W→ε
W → aT
The equivalent FSM for the given regular grammar G:
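The conversion method can be sketched in Python and applied to the grammar G above. This is a minimal illustration that assumes single-character terminals and non-terminals; the representation of rules as (left side, right side) pairs and the names `grammar_to_fsm` and `accepts` are illustrative assumptions.

```python
def grammar_to_fsm(rules, start):
    """rules: list of (X, rhs) where rhs is "" (for ε), "a", or "aY"."""
    delta = {}          # (state, symbol) -> set of next states
    accepting = set()
    for lhs, rhs in rules:
        if rhs == "":
            accepting.add(lhs)                                   # step 6: X → ε
        elif len(rhs) == 1:
            delta.setdefault((lhs, rhs), set()).add("#")         # steps 3 and 5: X → w
            accepting.add("#")                                   # step 7
        else:
            delta.setdefault((lhs, rhs[0]), set()).add(rhs[1])   # step 4: X → wY
    return delta, start, accepting

def accepts(fsm, w):
    delta, start, accepting = fsm
    current = {start}
    for c in w:
        current = set().union(*(delta.get((q, c), set()) for q in current))
    return bool(current & accepting)

# S → aT, T → bT, T → a, T → aW, W → ε, W → aT
G = [("S", "aT"), ("T", "bT"), ("T", "a"), ("T", "aW"), ("W", ""), ("W", "aT")]
fsm = grammar_to_fsm(G, "S")
print([w for w in ["aa", "aba", "abba", "aaaa", "", "a", "ab", "aaa"] if accepts(fsm, w)])
```

The resulting machine is nondeterministic (from T, the symbol a leads both to # and to W), which is why `accepts` tracks a set of current states. The printed list should contain aa, aba, abba and aaaa.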
Write the regular grammar for the language L = {w ∈ {a, b}* : |w| is even}.
The following DFSM M accepts L:
Write the regular grammar for L = {w ∈ {a, b}* : w contains an odd number of a’s and w ends in
a}. Also generate the string baaba by using this regular grammar.
DFSM which accepts L is:
Regular grammar G = ({S, T, X, a, b}, {a, b}, R, S) where R is the rule defined as follows:
S → bS
S → aT
T → aS
T→ bX
T→ ε
X→ aS
X→ bX
To generate the string baaba, start with S and apply the derivation process, replacing the non-terminal
symbol in each step, until the required string is generated:
S ⇒ bS ⇒ baT ⇒ baaS ⇒ baabS ⇒ baabaT ⇒ baaba
Show a regular grammar for the language: L = {w ∈ {a, b}* : w contains an even number of a’s
and an odd number of b’s}
DFSM which accepts L is:
Regular grammar G = ({S, A, B, C, a, b}, {a, b}, R, S) where R is the rule defined as follows:
S → aA
S → bB
A → aS
A→ bC
B→ aC
B→ bS
B→ ε
C→ aB
C→ bA
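The pattern visible in these examples (one rule X → aY for each transition from X to Y on a, and one rule X → ε for each accepting state X) can be sketched in Python. The transition table below is an assumption reconstructed from the rules listed above, with S, A, B and C tracking the parities of the numbers of a's and b's; the function name `dfsm_to_grammar` is illustrative.

```python
def dfsm_to_grammar(delta, accepting):
    """delta: dict (state, symbol) -> state. Returns rules as (lhs, rhs) pairs."""
    rules = [(p, a + q) for (p, a), q in sorted(delta.items())]   # X → aY
    rules += [(q, "ε") for q in sorted(accepting)]                # X → ε
    return rules

delta = {
    ("S", "a"): "A", ("S", "b"): "B",
    ("A", "a"): "S", ("A", "b"): "C",
    ("B", "a"): "C", ("B", "b"): "S",
    ("C", "a"): "B", ("C", "b"): "A",
}
rules = dfsm_to_grammar(delta, {"B"})
for lhs, rhs in rules:
    print(f"{lhs} → {rhs}")
```

The printed rules are the same nine rules as in the grammar above, in a different order.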
Show a regular grammar for the language: L = {w ∈ {a, b}* : w does not end in aa}
DFSM which accepts L:
Regular grammar G = ({S, A, B, a, b}, {a, b}, R, S) where R is the rule defined as follows:
S → aA | bS | ε
A → aB | bS | ε
B→ aB | bS
Show a regular grammar for the language: L = {w ∈ {a, b}* : w does not contain the substring
aabb}
Regular grammar G = ({S, A, B, C, a, b}, {a, b}, R, S) where R is the rule defined as follows:
S → aA | bS | ε
A → aB | bS | ε
B→ aB | bC | ε
C→ aA | ε
Automata Theory and Computability Regular and Non regular Languages
Automata Theory and Computability Closure properties of RLs
So the language which is rejected by M1 is accepted by M2 and vice versa. Thus we have a
machine M2 which accepts all those strings w in ∑* that are rejected by machine M1. So the
complement of a regular language L is also regular.
3. Closure under intersection:
The intersection of two regular languages is regular.
If L and M are regular languages then show that L ∩ M is also regular.
Proof:
Let L and M be regular languages. We know that the complement of a regular language is regular.
So the complement of L, written ¬L, and the complement of M, written ¬M, are also regular languages.
Also, the union of two regular languages is a regular language.
So the union of ¬L and ¬M, i.e. ¬L U ¬M, is also regular. By De Morgan’s law, L ∩ M = ¬(¬L U ¬M),
which is the complement of a regular language, so L ∩ M is regular.
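The argument can be replayed on concrete machines. The sketch below is an illustration under stated assumptions: a DFSM is a tuple (states, alphabet, delta, start, accepting) with a total transition function; complement swaps accepting and non-accepting states; and union is implemented with a product construction, which is swapped in here for convenience and is not necessarily the union construction these notes use. Intersection is then obtained from complement and union by De Morgan's law.

```python
from itertools import product

def complement(dfsm):
    states, sigma, delta, start, accepting = dfsm
    return states, sigma, delta, start, states - accepting

def union(m1, m2):
    """Product construction: run both machines in parallel, accept if either accepts."""
    k1, sigma, d1, s1, a1 = m1
    k2, _, d2, s2, a2 = m2
    states = set(product(k1, k2))
    delta = {((p, q), c): (d1[(p, c)], d2[(q, c)]) for (p, q) in states for c in sigma}
    accepting = {(p, q) for (p, q) in states if p in a1 or q in a2}
    return states, sigma, delta, (s1, s2), accepting

def intersection(m1, m2):
    """De Morgan: the complement of (complement(L) U complement(M))."""
    return complement(union(complement(m1), complement(m2)))

def accepts(dfsm, w):
    _, _, delta, state, accepting = dfsm
    for c in w:
        state = delta[(state, c)]
    return state in accepting

# Sample machines (assumptions for illustration): even number of a's; ends in b.
even_a = ({"E", "O"}, "ab",
          {("E", "a"): "O", ("O", "a"): "E", ("E", "b"): "E", ("O", "b"): "O"},
          "E", {"E"})
ends_b = ({0, 1}, "ab",
          {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1},
          0, {1})
both = intersection(even_a, ends_b)
print([w for w in ["b", "aab", "abab", "ab", "aa", ""] if accepts(both, w)])
```

The printed list should be b, aab and abab: the strings that have an even number of a's and also end in b.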
Automata theory and Computability Showing that language is not RL
This FSM accepts only one string, aab. The only string that can drive the FSM through its loop is ɛ.
No matter how many times the FSM goes through the loop, it cannot accept any longer strings.
Therefore the length of the pumping string y must be greater than 0; y must not be empty.
This property of FSMs and the languages that they can accept is the basis for a powerful
tool for showing that a language is not regular.
If a language contains even one long string that cannot be pumped in the fashion that we
have just described, then it is not accepted by any FSM and so is not regular.
We formalize this idea, in Pumping Theorem.
The Pumping Theorem for Regular languages (Pumping Lemma for Regular Languages)
State and prove the pumping theorem for regular languages.
Theorem:
Let L be a regular language. Then there exists a constant k ≥ 1 (the number of states of an FSM
accepting L, so k depends on L) such that every string w in L with |w| ≥ k can be broken into three
strings, w = xyz, such that:
|xy| ≤ k,
y ≠ ε (i.e. |y| ≥ 1), and
for all q ≥ 0, the string xy^q z is also in L.
Proof: Suppose L is a regular language, so L = L(M) for some DFSM M. Suppose M has k
states. Consider any string w = a_1 a_2 a_3 ... a_m in L of length m, where m ≥ k and
each a_i is an input symbol. Since we have m input symbols, M passes through m+1
states in sequence, q_0, q_1, q_2, ..., q_m, where q_0 is the start state and q_m is a final state.
Since |w| ≥ k, by the pigeonhole principle the k+1 states q_0, q_1, ..., q_k cannot all be distinct, since
there are only k different states. So some state repeats, i.e. M goes around a loop. Thus we can find two
different integers i and j with 0 ≤ i < j ≤ k, such that q_i = q_j. Now we can break the string w = xyz
as follows:
x = a_1 a_2 a_3 ... a_i
y = a_{i+1} a_{i+2} ... a_j
z = a_{j+1} a_{j+2} ... a_m
The relationships among the strings and states are given in figure below:
x may be empty in the case that i = 0. Also z may be empty if j = k = m. However, y cannot be
empty, since i is strictly less than j. Also |xy| = j ≤ k.
Reading y takes M from q_i back to q_i, so this loop can be skipped or repeated any number of times.
Thus for any integer q ≥ 0, xy^q z is also accepted by DFSM M; that is, if L is
regular, xy^q z is also in L for all q ≥ 0.
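The proof is constructive, so the split w = xyz can be computed by following the path of w through a DFSM and stopping at the first repeated state. The sketch below assumes a DFSM given as a dictionary of transitions; the sample machine (even number of a's) and the names `pump_split` and `accepts` are illustrative assumptions.

```python
def pump_split(delta, start, w):
    """Split w = x y z at the first state that repeats on the path of w."""
    seen = {start: 0}            # state -> number of symbols read when first visited
    state = start
    for pos, c in enumerate(w, start=1):
        state = delta[(state, c)]
        if state in seen:        # q_i = q_j with i < j
            i, j = seen[state], pos
            return w[:i], w[i:j], w[j:]
        seen[state] = pos
    return None                  # no state repeats, so w is shorter than the number of states

def accepts(delta, start, accepting, w):
    state = start
    for c in w:
        state = delta[(state, c)]
    return state in accepting

# Sample DFSM with k = 2 states: accepts strings with an even number of a's.
delta = {("E", "a"): "O", ("O", "a"): "E", ("E", "b"): "E", ("O", "b"): "O"}
x, y, z = pump_split(delta, "E", "abba")
print(x, y, z)
print(all(accepts(delta, "E", {"E"}, x + y * q + z) for q in range(6)))
```

For the string abba the first repeated state is reached after two symbols, giving x = a, y = b, z = ba with |xy| = 2 ≤ k, and every pumped string a b^q ba is still accepted.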
Show that L = {w ∈ { ), ( }* : the parentheses are balanced} is not regular.
Assume that the given language L is regular. Then there exists a constant k (the number of states)
such that any string w in L with |w| ≥ k must satisfy the conditions of the theorem.
Let w = (^k )^k
Since |w| = k + k = 2k ≥ k, we can split w into xyz such that |xy| ≤ k and |y| ≥ 1:
w = xyz
Since |xy| ≤ k, y must occur within the first k characters and so y = (^p for some p. Since y ≠ ε,
p must be greater than 0.
x = (^{k−p}    y = (^p    z = )^k
According to the pumping lemma, for the language to be regular, xy^q z ∈ L for all q ≥ 0.
Let q = 0. The resulting string w = a^{k−p} (a^p)^0 a b^k = a^{k+1−p} b^k, where p ≥ 1, must be in L.
But it is not: since p > 0, k + 1 − p ≤ k, so the resulting string no longer has more a’s than b’s
and so is not in L.
Thus there exists at least one long string w in L that fails to satisfy the conditions of the Pumping
Theorem. So L = { a^n b^m | n > m } is not regular.
0^i (0^p)^q 0^{k−i−p} = 0^{k + p(q−1)} ∈ L
We know that p > 0. Let q = k + 1; then the length is k + pk = k(1 + p) (for example, 2k when p = 1).
Since k is prime, k ≥ 2, and 1 + p ≥ 2, so k(1 + p) is not prime, which is a
contradiction to our assumption. So the language L = {0^m | m is prime} is not regular.
If p = 1, then (k − 1)! < k! − 1 < k! (for k ≥ 3), so k! − 1 is not a factorial.
a^{k!−1} does not belong to L, which is a contradiction to our assumption. So the language
L = {a^{n!} | n ≥ 0} is not regular.
Let w = 0^{k^2}
Since |w| = k^2 ≥ k, we can split w into xyz such that |xy| ≤ k and |y| ≥ 1:
w = xyz
Since |xy| ≤ k, y must occur within the first k characters and so y = 0^p for some p. Since y ≠ ε,
p must be greater than 0, and since |xy| ≤ k, p ≤ k.
x = 0^i    y = 0^p    z = 0^{k^2 − i − p}
According to the pumping lemma, for the language to be regular, xy^q z ∈ L for all q ≥ 0.
Let q = 2. The resulting string w = 0^i (0^p)^2 0^{k^2 − i − p} = 0^{k^2 + p}, where 1 ≤ p ≤ k,
must be in L.
But it is not: since 1 ≤ p ≤ k, we have k^2 < k^2 + p ≤ k^2 + k < (k+1)^2 (for example, p = 1 gives
k^2 < k^2 + 1 < (k+1)^2), so the length k^2 + p is not a perfect square and the resulting string
is not in L.
Thus there exists at least one long string w in L that fails to satisfy the conditions of the Pumping
Theorem. So L = {0^n | n is a perfect square} is not regular.
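The arithmetic behind the choice q = 2 can be checked numerically for every admissible length p of y, not only for p = 1. The sketch below uses `math.isqrt` from the Python standard library (Python 3.8 or later); the function names are illustrative, and a check over a finite range of k is supporting evidence for the inequality rather than a replacement for it.

```python
from math import isqrt

def is_square(n):
    """True if n is a perfect square."""
    return isqrt(n) ** 2 == n

def pumped_lengths_never_square(k):
    """For w = 0^(k*k) and every |y| = p with 1 <= p <= k, check that k*k + p is not a square."""
    return all(not is_square(k * k + p) for p in range(1, k + 1))

print(all(pumped_lengths_never_square(k) for k in range(1, 200)))
```

The check should print True, since k^2 + p lies strictly between the consecutive squares k^2 and (k+1)^2 whenever 1 ≤ p ≤ k.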