Lesson 43
Lesson 43
1
Non-Context-Free language
There arises a question, whether all
languages are CFL ? The answer is no.
Non CFL: Languages which are not
Context-Free, are called Non-CFL.
To prove the claim that all languages are
not Context-Free, the study of machines of
word production from the grammar is
needed
2
Live production: A production of the form
nonterminal string of two nonterminals
is called a live production.
Dead production: A production of the form
nonterminal terminal
is called a dead production.
It may be noted that every CFG in CNF has
only these types of productions.
Following is a theorem
3
Theorem: If a CFG is in CNF and if there
is restriction to use the live production at
most once each, then only the finite many
words can be generated.
It may be noted that every time a live
production is applied during the derivation
of a word it increases the number of
nonterminals by one.
4
Similarly applying dead production decreases
the nonterminals by one. Which shows that to
generate a word, one more dead production are
applied than the live productions e.g.
S XY
aY
aa
Here one live and two dead productions are
used
5
In general, if a CFG in CNF has p live and q
dead productions then all words generated
without repeating any live production have at
most (p+1) letters
Theorem: If a CFG is in CNF with p live and q
dead productions and if w is word generated by
the CFG, having more than 2p letters then any
derivation tree for w has a nonterminal z which
is used twice, where the second z is the
descended from the first z.
6
It can be observed from the above theorem
that generation tree of word w has more
than p rows.
Self-embedded nonterminal: A
nonterminal is said to be self-embedded, if
in a given derivation of a word, it ever
occurs as a tree descendant of itself e.g.
7
S
A X
a S A
A X a
a S A
a
Here the nonterminal X is
b
self-embedded. 8
Note
Consider the following CFG in CNF
S AB
A BC
C AB
Aa
Bb
and the derivation tree of the word bbabbb
9
S
A B
B C b
b A B
B C b
b A B
a b 10
Note
The part of tree enclosed in upper triangle is
identical to that enclosed in lower triangle,
there is still another option of replacing A
by the same sequence of production shown
in lower triangle.
The above fact provides the following
pumping lemma for the CFLs.
11
Pumping lemma for CFLs
Theorem: If G is any CFG in CNF with p
live productions, then every word w of
length more than 2p can be partitioned into
five substrings as w=uvxyz, where x is not
null string and v and y are not both null
string.
n n
Then all the words of the form uv xy z,
n=1,2,3,… can also be generated by G.
12
Example
Example: Consider the following CFG
which is in CNF
S PQ
Q QS|b
Pa
and a word abab generated by the above
CFG with the following derivation tree
13
Example continued …
P Q
a Q S
u
b P Q
x
a b
y 14
Then w can be broken up as w=uvxyz
where u=a, v=, x=b, y=ab, z=
Repeating the triangle from the second Q
just as it descends from the first Q, the
corresponding tree may be expressed as
follows
15
S
P S
a Q S
u
Q S P Q
b P Q
x
a b a b
y y 16
Which shows that
uvvxyyz=ababab=ababab belongs to the
language generated by the given CFG.
So, it can be generalized that words of the
form
n n
uv xy z, n=1,2,3,… belong to the language
generated by the given CFG.
17
Note: It may be noted that the pumping
lemma is satisfied by all CFLs and the
languages which don’t hold this pumping
lemma, can’t be Context Free languages.
Such languages are non-CFLs. Following
are the examples of non-CFL.
18
Example
Consider the language
L={anbncn :n=1,2,3,…}, let the language L be
Context Free language and let the word
w=a200b200c200 of length more than 2p, where p is
the number of live productions of its CFG in CNF.
It can be observed that no matter what choices are
made for the substrings u,v,x,y and z. uv2xy2z
can’t belong to L, as all the words in anbncn have
19
1. Only one substring ab
2. Only one substring bc
3. No substring ac
4. No substring ba
5. No substring ca
6. No substring cb
For any n=1,2,3,…
20
The above observations shows that if v or y is not
single letter or , then uv2xy2z may contain either
two or more substrings ab or bc or one or more
substrings ac or ba or ca or cb i.e. these strings may
be in the number more than the number they are
supposed to be.
Moreover, if v and y are either single letter or , then
one or two of letters a,b,c will be increased, where
as the other letter will not be increased in uv2xy2z,
which shows uv2xy2z does not belong to L.
21
Thus pumping lemma is not satisfied.
Hence L is non CFL.
It may be noted that the pumping lemma
discussed for infinite regular language L,
the word w can be decomposed into three
parts w=xyz, such that all words of the
form xynz, n=1,2,3,…, belong to L.
22
Similarly, the pumping lemma discussed for
CFLs can also stated as
“If w is a large enough word in a CF:
then, w can be decomposed into w=uvxyz
such that all words of the form uvnxynz
belong to L”
23
It may be noted that proof of pumping
lemma for regular languages needed that
path of word w to be so large enough so
that it contains a circuit and circuit can be
looped as many times as one can. The proof
of the pumping lemma for CFLs needs the
derivation for w to be so large that it
contain a sequence of productions that can
be repeated as many times as one can.
24
Moreover, the pumping lemma for regular
languages does not hold for non regular language as
that language does not contain both xyz and xyyz.
Similarly pumping lemma for CFLs does not hold
for non-CFL as that language does not contain both
uvxyz and uvvxyyz.
There is another difference between the pumping
lemma for regular languages and that for CFLs that
first one acts on machines while other acts on
algebraic structures i.e. grammar.
25
To achieve full power the pumping lemma
for regular languages has modified by
pumping lemma version II. Similarly, full
power for pumping lemma for CFLs is
achieved by stating the following theorem
Theorem: If L is a CFL in CNF with p live
productions then any word W in L of length
more than 2p can be decomposed as
26
w=uvxyz s.t.
length(vxy) 2p, length(x) > 0, length(v)
+length(y) > 0
then the words of the form
uvnxynz : n=1,2,3,… belong to L.
Following is another example of non-CFL
27
Example
Consider the language
L = {anbmanbm :m,n=1,2,3,…}
={abab,aabaab, abbabb, aabbaabb,
aaabaaab,… }
The first version of pumping lemma for CFLs
may be satisfied by L, but to apply the
second version of pumping lemma to L, let L
be generated by CFG which is in CNF and
has p live productions.
28
p p p p
Consider the word w=a2 b2 a2 b2
decomposing w into uvxyz where length(vxy)
< 2p
which shows that v and y can’t be single letters
separated by clumps of other letter because the
separator letter is longer than the length of
whole substring vxy, which shows that
uvvxyyz is not contained in L. Thus pumping
lemma is not satisfied and L is non CFL.
29
Example
Consider the language EVENA i.e.
EVENA=(aa)n =a2n ={aa, aaaa, aaaaaa, …}
The grammar for this language must be
S SS|aa and its CNF will be
S SS|AA, A a, the PDA for this grammar
will be
30
PUSH S ST
S
S
PUSH A PUSH S
PUSH A PUSH S
A $
PUSH S
S POP POP POP
a a a
PUSH S
RD1
PUSH A A
POP
PUSH $
HERE POP
Lower
HERE POP $ part
PUSH $
POP POP
S S RD2
PUSH A PUSH S
AT
Example continued …
The summary table corresponding to the
previous PDA in conversion form can be
expressed as
34
FROM TO READ POP PUSH ROW
START HERE $ S$ 1
HERE HERE S SS 2
HERE HERE S AA 3
HERE READ1 A -- 4
READ1 HERE a S S 5
RAED1 HERE a $ $ 6
READ1 HERE a A A 7
HERE READ2 $ $ 8
READ2 AT $ -- 9
35
Example continued …
Following are the productions defined from the
summary table
S Net(START, ACCEPT, $)
Net(HERE, READ1, A) Row4
Net(READ2, ACCEPT, $) Row9
Net(START, X, $) Row1 Net(HERE, Y,
S)Net(Y, X, $)
Net(HERE, X, S) Row2 Net(HERE, Y, S)Net(Y,
X, S)
Net(START, X, S) Row3 Net(HERE, Y, 36
A)Net(Y, X, A)
Example continued …
Net(READ1, X, S) Row5 Net(HERE, X, S) gives
four productions
Net(READ1, X, $) Row6 Net(HERE, X, $) gives
four productions
Net(READ1, X, A) Row7 Net(HERE, X, A) gives
four productions
Net(HERE, ACCEPT, $) Row8 Net(READ2,
ACCEPT, $)
Where X and Y are the corresponding joints
37
Example continued …
In addition to 44 productions following 9 productions complete the
required CFG
Row1
Row2
Row3
Row4
Row5 a
Row6 a
Row7 a
Row8
Row9
38