Properties of Context-Free Languages
Topics
1) Simplifying CFGs, Normal forms
2) Pumping lemma for CFLs
3) Closure and decision properties of CFLs
How to “simplify” CFGs?
Three ways to simplify/clean a CFG
(clean)
1. Eliminate useless symbols
(simplify)
2. Eliminate ε-productions: A => ε
3. Eliminate unit productions: A => B
Eliminating useless symbols
Grammar cleanup
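Eliminating useless symbols
A symbol X is reachable if there exists a derivation:
S ⇒* αXβ
A symbol X is generating if there exists a derivation:
X ⇒* w, for some w ∈ T*
A symbol X is useful if it is both reachable and generating:
S ⇒* αXβ ⇒* w', for some w' ∈ T*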
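Algorithm to detect useless symbols
1. First, eliminate all symbols that are not generating
2. Next, eliminate all symbols that are not reachable
The order matters: removing a non-generating symbol can leave other symbols unreachable.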
Example: Useless symbols
S → AB | a
A → b
1. A, S are generating
2. B is not generating (and therefore B is useless)
3. ==> Eliminating B… (i.e., remove all productions that involve B)
   1. S → a
   2. A → b
4. Now, A is not reachable and therefore is useless
5. Simplified G: S → a
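The two cleanup passes can be sketched in Python as below. This is a minimal illustration, not a reference implementation: it assumes every symbol is a single character, that a grammar is a dict mapping each variable to a list of body strings, and the function names (`generating_symbols`, `reachable_symbols`, `remove_useless`) are hypothetical.

```python
def generating_symbols(productions, terminals):
    """Return every symbol that derives some string of terminals."""
    generating = set(terminals)          # basis: every terminal is generating
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in generating:
                continue
            # head is generating if some body uses only generating symbols
            if any(all(sym in generating for sym in body) for body in bodies):
                generating.add(head)
                changed = True
    return generating


def reachable_symbols(productions, start):
    """Return every symbol reachable from the start symbol."""
    reachable = {start}                  # basis: S is reachable
    frontier = [start]
    while frontier:
        head = frontier.pop()
        for body in productions.get(head, []):
            for sym in body:
                if sym not in reachable:
                    reachable.add(sym)
                    frontier.append(sym)
    return reachable


def remove_useless(productions, terminals, start):
    gen = generating_symbols(productions, terminals)
    # Pass 1: drop non-generating variables and every body that mentions one
    step1 = {h: [b for b in bodies if all(s in gen for s in b)]
             for h, bodies in productions.items() if h in gen}
    # Pass 2: drop variables that are no longer reachable
    reach = reachable_symbols(step1, start)
    return {h: bodies for h, bodies in step1.items() if h in reach}


# The example grammar: S -> AB | a, A -> b
grammar = {"S": ["AB", "a"], "A": ["b"]}
simplified = remove_useless(grammar, terminals={"a", "b"}, start="S")
print(simplified)  # {'S': ['a']}
```

Running the passes in the other order would keep A, because A is still reachable through S → AB before B is removed.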
Algorithm to find all reachable symbols
(X is reachable if S ⇒* αXβ)
Given: G=(V,T,P,S)
Basis:
S is obviously reachable (from itself)
Induction:
Suppose there is a production A → α1 α2 … αk,
where A is reachable.
Then all symbols on the right-hand side,
{α1, α2, …, αk}, are also reachable.
Eliminating ε-productions
A => ε
Eliminating ε-productions
What's the point of removing ε-productions (A → ε)?
It is not possible to eliminate ε-productions for
languages that include ε in their word set,
so we will target the grammar for the rest of the language.
Theorem: If G=(V,T,P,S) is a CFG for a language L, then
L \ {ε} has a CFG without ε-productions
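Example: Eliminating ε-productions
Let L be the language represented by the following CFG G:
i. S → AB
ii. A → aAA | ε
iii. B → bBB | ε
Goal: To construct G1, which is the grammar for L − {ε}
A and B are nullable, and therefore so is S. Each nullable symbol in a body is either kept or dropped, and empty bodies are discarded.
Simplified grammar G1:
i. S → AB | A | B
ii. A → aAA | aA | a
iii. B → bBB | bB | b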
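One way to compute G1 mechanically is sketched below, assuming single-character symbols and a grammar stored as a dict from variable to a list of body strings, with the empty string standing for ε. The helper names `nullable_symbols` and `eliminate_epsilon` are illustrative only.

```python
from itertools import product


def nullable_symbols(productions):
    """Return the variables A with A =>* ε."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            # An empty body (ε) trivially satisfies all(...)
            if head not in nullable and any(
                all(s in nullable for s in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    return nullable


def eliminate_epsilon(productions):
    """Build a grammar for L - {ε} that has no ε-productions."""
    nullable = nullable_symbols(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol may be kept or dropped; others must stay
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate:            # never emit an ε-body
                    new_bodies.add(candidate)
        result[head] = sorted(new_bodies)
    return result


# G: S -> AB, A -> aAA | ε, B -> bBB | ε
grammar = {"S": ["AB"], "A": ["aAA", ""], "B": ["bBB", ""]}
g1 = eliminate_epsilon(grammar)
for head, bodies in g1.items():
    print(head, "->", " | ".join(bodies))
```

The result still contains the unit productions S → A and S → B, which is why unit-production elimination is a separate step.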
Eliminating unit productions
A => B, where B has to be a variable
Example: eliminating unit productions
G:
1. E → T | E+T
2. T → F | T*F
3. F → I | (E)
4. I → a | b | Ia | Ib | I0 | I1

Unit pairs    Only non-unit productions to be added to P1
(E,E)         E → E+T
(E,T)         E → T*F
(E,F)         E → (E)
(E,I)         E → a | b | Ia | Ib | I0 | I1
(T,T)         T → T*F
(T,F)         T → (E)
(T,I)         T → a | b | Ia | Ib | I0 | I1
(F,F)         F → (E)
(F,I)         F → a | b | Ia | Ib | I0 | I1
(I,I)         I → a | b | Ia | Ib | I0 | I1

G1:
1. E → E+T | T*F | (E) | a | b | Ia | Ib | I0 | I1
2. T → T*F | (E) | a | b | Ia | Ib | I0 | I1
3. F → (E) | a | b | Ia | Ib | I0 | I1
4. I → a | b | Ia | Ib | I0 | I1
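The unit-pair construction can be sketched as follows. The sketch assumes single-character symbols, so a body is a unit production exactly when the whole body string is one variable; `unit_pairs` and `eliminate_unit` are hypothetical names.

```python
def unit_pairs(productions):
    """Return all pairs (A, B) with A =>* B using only unit productions."""
    variables = set(productions)
    pairs = {(v, v) for v in variables}            # basis: (A, A)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(pairs):
            for body in productions[b]:
                # induction: (A, B) is a unit pair and B -> C is a unit production
                if body in variables and (a, body) not in pairs:
                    pairs.add((a, body))
                    changed = True
    return pairs


def eliminate_unit(productions):
    """For every unit pair (A, B), give A all non-unit bodies of B."""
    variables = set(productions)
    result = {v: [] for v in productions}
    for (a, b) in sorted(unit_pairs(productions)):
        for body in productions[b]:
            if body not in variables and body not in result[a]:
                result[a].append(body)
    return result


grammar = {
    "E": ["T", "E+T"],
    "T": ["F", "T*F"],
    "F": ["I", "(E)"],
    "I": ["a", "b", "Ia", "Ib", "I0", "I1"],
}
pairs = unit_pairs(grammar)
g1 = eliminate_unit(grammar)
print(len(pairs))          # 10 unit pairs, as in the table
print(sorted(g1["F"]))
```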
Putting all this together…
Theorem: If G is a CFG for a language that
contains at least one string other than ε, then there
is another CFG G1, such that L(G1) = L(G) − {ε}, and
G1 has:
no ε-productions
no unit productions
no useless symbols
Algorithm:
Step 1) eliminate ε-productions
Step 2) eliminate unit productions
Step 3) eliminate useless symbols
Normal Forms
Why normal forms?
If all productions of the grammar could be
expressed in the same form(s), then:
Chomsky Normal Form (CNF)
Let G be a CFG for some L − {ε}
Definition:
G is said to be in Chomsky Normal Form if all
its productions are in one of the following
two forms:
i. A → BC where A, B, C are variables, or
ii. A → a where a is a terminal
G has no useless symbols
G has no unit productions
G has no ε-productions
CNF checklist
Is this grammar in CNF?
G1:
1. E → E+T | T*F | (E) | a | b | Ia | Ib | I0 | I1
2. T → T*F | (E) | a | b | Ia | Ib | I0 | I1
3. F → (E) | a | b | Ia | Ib | I0 | I1
4. I → a | b | Ia | Ib | I0 | I1
Checklist:
• G has no ε-productions
• G has no unit productions
• G has no useless symbols
• But…
• the normal form for productions is violated
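The form check in the last bullet can be made concrete with a small sketch. It assumes bodies are written as space-separated symbols and that the variables are exactly the dict keys; `cnf_violations` is a hypothetical helper, and G1 is written here as the unit-free expression grammar including the bodies a and b.

```python
def cnf_violations(productions):
    """List the (head, body) pairs that are neither A -> BC nor A -> a."""
    variables = set(productions)
    bad = []
    for head, bodies in productions.items():
        for body in bodies:
            symbols = body.split()
            two_variables = len(symbols) == 2 and all(s in variables for s in symbols)
            one_terminal = len(symbols) == 1 and symbols[0] not in variables
            if not (two_variables or one_terminal):
                bad.append((head, body))
    return bad


identifier = ["a", "b", "I a", "I b", "I 0", "I 1"]
g1 = {
    "E": ["E + T", "T * F", "( E )"] + identifier,
    "T": ["T * F", "( E )"] + identifier,
    "F": ["( E )"] + identifier,
    "I": list(identifier),
}
bad = cnf_violations(g1)
print(len(bad), "productions violate the CNF forms")
print(("I", "I a") in bad)   # True: the body mixes a variable with a terminal
```

Only the bodies a and b pass: every other body is either longer than two symbols or contains a terminal next to a variable.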
3) Replace each production of the form A → B1 B2 B3 … Bk by:
A → B1 C1, C1 → B2 C2, and so on…
Example #1
G:
S => AS | BABC
A => A1 | 0A1 | 01
B => 0B | 0
C => 1C | 1
G in CNF:
X0 => 0
X1 => 1
S => AS | BY1
Y1 => AY2
Y2 => BC
A => AX1 | X0Y3 | X0X1
Y3 => AX1
B => X0B | 0
C => X1C | 1
Languages with ε
For languages that include ε,
Write down the rest of grammar in CNF
Then add production “S => ε” at the end
E.g., consider:
G:
S => AS | BABC
A => A1 | 0A1 | 01 | ε
B => 0B | 0 | ε
C => 1C | 1 | ε
G in CNF:
X0 => 0
X1 => 1
S => AS | BY1 | ε
Y1 => AY2
Y2 => BC
Return of the Pumping Lemma !!
Why pumping lemma?
A result that will be useful in proving
that certain languages are not CFLs
(just like we did for regular languages)
The Pumping Lemma for CFLs
Let L be a CFL.
Then there exists a constant N, s.t.,
if z ∈ L s.t. |z| ≥ N, then we can write
z = uvwxy, such that:
1. |vwx| ≤ N
2. vx ≠ ε
3. For all k ≥ 0: uv^k w x^k y ∈ L
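[Figure: parse tree for z = uvwxy. A root-to-leaf path A1, A2, …, Ah-1, Ah = a has h ≥ m+1, i.e. more than m levels, while the grammar has only m variables, so some variable repeats on the path: Ai = Aj. The subtree rooted at Ai yields vwx and the subtree rooted at Aj yields w.]
• Therefore, vx ≠ ε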
Extending the parse tree…
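[Figure: two modified parse trees rooted at S = A0. Replacing the subtree rooted at Ai with the subtree rooted at Aj gives z = uwy; repeating the portion between Ai and Aj gives z = uv^k w x^k y.]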
Application of Pumping Lemma for CFLs
Example 1: L = {a^m b^m c^m | m > 0}
Claim: L is not a CFL
Proof:
Let N be the pumping lemma (P/L) constant
Pick z = a^N b^N c^N
Apply the pumping lemma to z and show that there
exists at least one other string constructed from z
(obtained by pumping up or down) that is ∉ L
Proof contd…
z = uvwxy
As z = a^N b^N c^N and |vwx| ≤ N and vx ≠ ε
==> v and x together cannot contain all three symbols
(a, b, c), so pumping changes the count of at most two of them
==> we can pump up or pump down to build
another string which is ∉ L
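The argument can be checked by brute force for one small value of N. The sketch below is an illustration rather than a proof: it tries a single N and only the pump counts k = 0 and k = 2, and the names `in_abc_language` and `pumpable_split_exists` are hypothetical.

```python
def in_abc_language(s):
    """Membership test for L = { a^m b^m c^m | m > 0 }."""
    m = len(s) // 3
    return m > 0 and s == "a" * m + "b" * m + "c" * m


def pumpable_split_exists(z, n, member):
    """Return True if some split z = uvwxy with |vwx| <= n and vx != ε
    keeps u v^k w x^k y in the language for both k = 0 and k = 2."""
    for start in range(len(z) + 1):                            # u = z[:start]
        for end in range(start, min(start + n, len(z)) + 1):   # vwx = z[start:end]
            for i in range(start, end + 1):                    # v = z[start:i]
                for j in range(i, end + 1):                    # w = z[i:j], x = z[j:end]
                    u, v, w, x, y = z[:start], z[start:i], z[i:j], z[j:end], z[end:]
                    if not (v or x):
                        continue                               # condition 2: vx != ε
                    if all(member(u + v * k + w + x * k + y) for k in (0, 2)):
                        return True
    return False


n = 4
z = "a" * n + "b" * n + "c" * n
print(pumpable_split_exists(z, n, in_abc_language))   # False: no split survives pumping
```

For comparison, the same search on a^3 b^3 with a membership test for {a^m b^m} does find a split (v = a, x = b), as expected for a context-free language.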
Example #2 for P/L application
L = { ww | w is in {0,1}*}
CFL Closure Properties
Closure Property Results
CFLs are closed under:
Union
Concatenation
Kleene closure operator
Substitution
Homomorphism, inverse homomorphism
Reversal
CFLs are not closed under:
Intersection
Difference
Complementation
Note: Regular languages are closed under these operators
Strategy for Closure Property Proofs
First prove “closure under substitution”
Using the above result, prove other closure properties
CFLs are closed under:
Union
Concatenation
Kleene closure operator
Substitution (prove this first)
Homomorphism, inverse homomorphism
Reversal
Note: s(L) can use
a different alphabet
CFLs are closed under Substitution
IF L is a CFL and a substitution s defined
on L is s.t. s(a) is a CFL for every
symbol a, THEN:
s(L) is also a CFL
What is s(L)?
[Figure: each string w1, w2, w3, w4 in L maps to a set s(w1), s(w2), s(w3), s(w4); s(L) is the union of these sets.]
Note: each s(w) is itself a set of strings
CFLs are closed under union
Let L1 and L2 be CFLs
To show: L1 ∪ L2 is also a CFL
Let us show by using the result of Substitution
Example:
L1 = {a^n b^n}:  S1 → aS1b | ε
L2 = {ww^R}:     S2 → aS2a | bS2b | ε
Union:
L = {a^n b^n} ∪ {ww^R}:  S → S1 | S2
CFLs are closed under concatenation
Let L1 and L2 be CFLs
Let us show by using the result of Substitution
Example:
L1 = {a^n b^n}:  S1 → aS1b | ε
L2 = {ww^R}:     S2 → aS2a | bS2b | ε
Concatenation:
L = {a^n b^n}{ww^R}:  S → S1S2
CFLs are closed under Kleene Closure
Let L be a CFL
Let Lnew = {a}* and s(a) = L
Then, L* = s(Lnew)
Example
Language L = {a^n b^n}, Grammar: S → aSb | ε
Star Operation
Language L* = {a^n b^n}*, Grammar: S1 → SS1 | ε
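The grammar constructions for union, concatenation and star can be tried out on these examples with the sketch below. It builds the new grammars directly rather than going through substitution, and it assumes the two grammars use disjoint variable names and that the new start symbol "S" is fresh. All function names are hypothetical; `generate` is a brute-force enumerator whose cap on the number of variables in a sentential form is a heuristic that is sufficient for these small grammars, not a general algorithm.

```python
def union(g1, s1, g2, s2, start="S"):
    """New start production: S -> S1 | S2."""
    return {**g1, **g2, start: [[s1], [s2]]}


def concatenation(g1, s1, g2, s2, start="S"):
    """New start production: S -> S1 S2."""
    return {**g1, **g2, start: [[s1, s2]]}


def star(g, s, start="S"):
    """New start productions: S -> S1 S | ε."""
    return {**g, start: [[s, start], []]}


def generate(grammar, start, max_len):
    """Enumerate terminal strings of length <= max_len via leftmost derivations."""
    seen, results, stack = set(), set(), [(start,)]
    while stack:
        form = stack.pop()
        if form in seen:
            continue
        seen.add(form)
        n_vars = sum(1 for s in form if s in grammar)
        # Terminals never disappear, so long forms are dead ends;
        # the variable cap only exists to guarantee termination.
        if len(form) - n_vars > max_len or n_vars > max_len + 2:
            continue
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            results.add("".join(form))
            continue
        for body in grammar[form[idx]]:
            stack.append(form[:idx] + tuple(body) + form[idx + 1:])
    return results


g1 = {"S1": [["a", "S1", "b"], []]}                        # {a^n b^n}
g2 = {"S2": [["a", "S2", "a"], ["b", "S2", "b"], []]}      # {w w^R}

print(sorted(generate(union(g1, "S1", g2, "S2"), "S", 4)))
print(sorted(generate(concatenation(g1, "S1", g2, "S2"), "S", 2)))
print(sorted(generate(star(g1, "S1"), "S", 4)))
```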
We won’t use substitution to prove this result
Some negative closure results
L1 = {a^n b^n c^m}     L2 = {a^n b^m c^m}
Context-free:          Context-free:
S → AC                 S → AB
A → aAb | ε            A → aA | ε
C → cC | ε             B → bBc | ε
Intersection:
L1 ∩ L2 = {a^n b^n c^n}   NOT context-free
Some negative closure results
L1 ∩ L2 = complement( complement(L1) ∪ complement(L2) )   (De Morgan's law)
Logic: if CFLs were to be closed under complementation
==> the whole right hand side becomes a CFL (because
CFLs are closed under union)
==> the left hand side (intersection) is also a CFL
==> but we just showed CFLs are
NOT closed under intersection!
==> CFLs cannot be closed under complementation.
Decision Properties
Emptiness test
Generating test
Reachability test
Membership test
PDA acceptance
The CYK Algorithm
J. Cocke, D. Younger, T. Kasami
{S, A}
{B} {A, C} {A, C} {B} {A, C}
b a a b a
Constructing The Triangular Table
Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a
X2,3 = (Xi,i , Xi+1,j) = (X2,2 , X3,3)
{A, C}{A, C} = {AA, AC, CA, CC} = Y
Steps:
Look for production rules to generate Y
There is one: B (B → CC)
X2,3 = {B}
Constructing The Triangular
Table
{S, A} {B}
{B} {A, C} {A, C} {B} {A, C}
b a a b a
Constructing The Triangular Table
Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a
X3,4 = (Xi,i , Xi+1,j) = (X3,3 , X4,4)
{A, C}{B} = {AB, CB} = Y
Steps:
Look for production rules to generate Y
There are two: S and C (S → AB, C → AB)
X3,4 = {S, C}
Constructing The Triangular
Table
Ø
{S, A} {B} {S, C} {S, A}
{B} {A, C} {A, C} {B} {A, C}
b a a b a
Constructing The Triangular Table
Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a
X2,4 = (Xi,i , Xi+1,j) ∪ (Xi,i+1 , Xi+2,j)
= (X2,2 , X3,4) ∪ (X2,3 , X4,4)
{A, C}{S, C} ∪ {B}{B} = {AS, AC, CS, CC, BB} = Y
Steps:
Look for production rules to generate Y
There is one: B (B → CC)
X2,4 = {B}
Constructing The Triangular
Table
Ø {B}
{S, A} {B} {S, C} {S, A}
{B} {A, C} {A, C} {B} {A, C}
b a a b a
Constructing The Triangular Table
Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a
X3,5 = (Xi,i , Xi+1,j) ∪ (Xi,i+1 , Xi+2,j)
= (X3,3 , X4,5) ∪ (X3,4 , X5,5)
{A, C}{S, A} ∪ {S, C}{A, C}
= {AS, AA, CS, CA, SA, SC, CA, CC} = Y
Steps:
Look for production rules to generate Y
There is one: B (B → CC)
X3,5 = {B}
Constructing The Triangular
Table
Ø {B} {B}
{S, A} {B} {S, C} {S, A}
{B} {A, C} {A, C} {B} {A, C}
b a a b a
Final Triangular Table
{S, A, C}   (X1,5)
Ø {S, A, C}
Ø {B} {B}
{S, A} {B} {S, C} {S, A}
{B} {A, C} {A, C} {B} {A, C}
b a a b a
S ∈ X1,5, so baaba is in L(G): Yes
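The table-filling procedure can be sketched in Python as follows. The sketch assumes the grammar is already in CNF with single-character variable names, so the concatenation of two variables can be looked up directly as a body; `cyk_table` is a hypothetical name and the table uses the slides' 1-indexed Xi,j convention.

```python
def cyk_table(word, productions):
    """Return table[(i, j)] = set of variables deriving word[i..j] (1-indexed, inclusive)."""
    n = len(word)
    table = {}
    # Bottom row: variables with a production A -> a for each input symbol
    for i in range(1, n + 1):
        table[(i, i)] = {h for h, bodies in productions.items() if word[i - 1] in bodies}
    # Longer substrings: combine every split point k of word[i..j]
    for span in range(2, n + 1):
        for i in range(1, n - span + 2):
            j = i + span - 1
            cell = set()
            for k in range(i, j):
                for left in table[(i, k)]:
                    for right in table[(k + 1, j)]:
                        pair = left + right
                        cell |= {h for h, bodies in productions.items() if pair in bodies}
            table[(i, j)] = cell
    return table


grammar = {"S": ["AB", "BC"], "A": ["BA", "a"], "B": ["CC", "b"], "C": ["AB", "a"]}
word = "baaba"
table = cyk_table(word, grammar)
accepted = "S" in table.get((1, len(word)), set())
print(sorted(table[(1, 5)]), accepted)   # ['A', 'C', 'S'] True
```

The triple loop over span, start position and split point gives the usual O(n^3) running time for a fixed grammar.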
Summary
Normal Forms
Chomsky Normal Form
Greibach Normal Form
Useful in proving P/L
Pumping Lemma for CFLs
Main difference: z = uv^i w x^i y
Closure properties
Closed under: union, concatenation, reversal, Kleene
closure, homomorphism, substitution
Not closed under: intersection, complementation,
difference