CHAPTER - 3

Context Free Language and Pushdown Automata

3.1 Context Free Grammar (CFG)


A context free grammar (CFG) is a language generator that operates on a set of rules called
production rules. Context-free languages are applied in parser design and are useful for
describing the block structure of programming languages. In a context sensitive grammar, a
non-terminal can be replaced only when it appears within a particular surrounding context of
symbols. In a context free grammar, each non-terminal (or variable) is replaced independently
of the symbols around it, so a production rule is defined for each non-terminal individually.

A. In Production Rule of Context Free Grammar (CFG)


Non-terminal → Terminals only, or Non-terminals only, or a combination of terminals and
non-terminals

For example: S → a/b/SS/MM, M → p/D/q, D → x/y/z etc., where "/" separates the alternative
right sides. Each non-terminal is defined separately, so its replacement is context free.

B. In Production Rule of Context Sensitive Grammar (CSG)


Non-terminals only, or a combination of non-terminals and terminals → Terminals only, or
Non-terminals only, or a combination of terminals and non-terminals

For example: aaSbc → aab/Sbb/aaS, S → aa/bb etc. The first production rule applies only when S
appears with aa before it and bc after it, so the context on both sides of S must be considered
when making the replacement.

Mathematically, a grammar G = (V, Σ, R, S) is a context-free grammar (CFG) if


V = Finite set of non-terminals or variables, represented by capital letters
Σ = Finite set of terminals, represented by small letters, signs, numbers etc.
S = Starting non-terminal symbol and S ∈ V
R = Set of rules called production rules of the form A → α, where A ∈ V and α ∈ (V ∪ Σ)* (i.e.
the LHS of a production rule in a CFG is a single non-terminal and the RHS may be the
empty string, terminals, non-terminals or a combination of terminals and non-terminals).

The production rules (also called productions or rewriting rules) are the kernel of any grammar
and language specification. The productions are used to derive one string over V ∪ Σ from
another string. A production rule may be applied only from left to right; the reverse substitution
is not permitted, i.e. if S → AA is a rule then AA may not be replaced by S.

Note: Is there any relationship between the regular expression and the context free
grammar? Define by suitable example.

Solution
From the Chomsky hierarchy of grammars, every language described by a regular expression can
also be generated by a Context Free Grammar (CFG), but the reverse is not true: some context-free
languages cannot be described by any regular expression. The first direction can be illustrated
with the following example. Let the regular expression be R = a (a*+b*) b, then the production
rules for R can be written as:
Theory of Computation - Compiled by Yagya Raj Pandeya, NAST, Dhangadhi ©[email protected] Page 1



S → aMb (i.e. S = start symbol = the string of M preceded by a and followed by b)
M → A/B [i.e. M = (a*+b*) = any number of a's or any number of b's]
A → aA/ε (i.e. String with any number of a)
B → bB/ε (i.e. String with any number of b)
Now,
Let G = (V, Σ, R, S) be a context-free grammar (CFG) that describes the strings generated by the
regular expression R. Where
V = Finite set of non-terminals or variables = {S, M, A, B}
Σ = Finite set of terminals = {a, b}
S = Starting non-terminal symbol and S ∈ V
R = The production rules, which can be described as: S → aMb, M → A/B, A → aA/ε,
& B → bB/ε.
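The correspondence between the grammar and the regular expression can be checked mechanically for short strings. The following is a minimal sketch, not part of the original notes: the function names generate and regex_words, the length bound 6 and the use of "" for ε are assumptions made for illustration. The search terminates for this grammar because every sentential form holds at most one variable; it is not a general-purpose CFG enumerator.

```python
import re
from collections import deque
from itertools import product

# Productions S -> aMb, M -> A/B, A -> aA/ε, B -> bB/ε; "" stands for ε.
RULES = {
    "S": ["aMb"],
    "M": ["A", "B"],
    "A": ["aA", ""],
    "B": ["bB", ""],
}

def generate(rules, start="S", max_len=6):
    """Return every terminal string of length <= max_len derivable from start."""
    seen, words = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, or None for a terminal string.
        pos = next((i for i, c in enumerate(form) if c in rules), None)
        if pos is None:
            words.add(form)
            continue
        for rhs in rules[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # Terminals never disappear, so forms with too many are pruned.
            terminals = sum(1 for c in new if c not in rules)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

def regex_words(pattern, alphabet="ab", max_len=6):
    """All strings over alphabet of length <= max_len that fully match pattern."""
    rx = re.compile(pattern)
    return {
        "".join(t)
        for n in range(max_len + 1)
        for t in product(alphabet, repeat=n)
        if rx.fullmatch("".join(t))
    }

# The "+" of the notes' regular expression is written "|" in Python's re syntax.
print(generate(RULES) == regex_words(r"a(a*|b*)b"))
```

Agreement up to a fixed length is only evidence, not a proof, that the two descriptions define the same language.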

Numerical – 1: Write a CFG to generate only the palindromes over the input symbols Σ = {0, 1}.
Solution
Let G = (V, Σ, R, S) be a context-free grammar (CFG) that describes palindrome strings
only.
Where V = Finite set of non-terminals or variables = {S}
Σ = Finite set of terminals = {0, 1}
S = Starting non-terminal symbol and S ∈ V
R = Production rules, which can be described as: S → 0S0/1S1/0/1/ε.

Note: The alternatives S → 0 and S → 1 are needed for palindromes of odd length such as 010.
The production rules S → 0S/1S/ε or S → SS/1/0/ε generate every string over {0, 1}, palindrome
or not, so they do not describe the palindromes only.
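A recursive check that mirrors the palindrome productions can make the grammar concrete. This is a minimal sketch under the assumption that the grammar is S → 0S0/1S1/0/1/ε, where the alternatives 0 and 1 are included here so that odd-length strings such as 010 are covered; the function name derives is hypothetical and the input is assumed to contain only the symbols 0 and 1.

```python
def derives(w):
    """Return True if S =>* w for S -> 0S0 / 1S1 / 0 / 1 / ε (w over {0, 1})."""
    if len(w) <= 1:
        # Base cases: S -> ε, S -> 0, S -> 1.
        return True
    # Recursive cases: S -> 0S0 or S -> 1S1 strip one matching outer pair.
    return w[0] == w[-1] and derives(w[1:-1])

print(derives("0110"), derives("010"), derives("0111"))
```

Each recursive call corresponds to one application of S → 0S0 or S → 1S1, so the depth of the recursion equals the number of such steps in the derivation.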

Derivations
The process of deriving a required string over (V ∪ Σ)* by applying the given production rules to
an existing string is called a derivation. The string produced by the most recent application of a
production is called the working string. The derivation of a string is complete when the working
string can no longer be modified, i.e. it contains no non-terminals. In a context free grammar, the
order in which the variables of a sentential form are expanded does not change the set of strings
that can be derived, which is not the case for a context sensitive grammar.

Suppose that α1, α2, …, αm are strings over (V ∪ Σ)* and

α1 ⇒ α2, α2 ⇒ α3, …, αm-1 ⇒ αm

Then, we say that α1 derives αm in grammar G, and this can be represented as

α1 ⇒* αm

Hence, we call the following sequence a derivation in G of αm from α1:

α1 ⇒ α2 ⇒ … ⇒ αm

Derivations are described in the following manner.

• If a string β is obtained from a string α by applying a production rule once (i.e. one time
application) then β is called directly derivable from α and this is denoted by α ⇒ β.

• If a string β is obtained from a string α by a sequence of applications of the given production
rules then β is called derivable from α and this is denoted by α ⇒* β.



Type of derivation
A. Leftmost derivation
A derivation is called a leftmost derivation if we apply a production only to the
leftmost variable at every step. For example, if the given production rules of a CFG are
S → 0B/1A, A → 0/0S/1AA, & B → 1/1S/0BB then the string w = 00110101
can be derived using the leftmost derivation in the following manner:
S ⇒ 0B (As, S → 0B)
⇒ 00BB (As, B → 0BB)
⇒ 001B (As, B → 1)
⇒ 0011S (As, B → 1S)
⇒ 00110B (As, S → 0B)
⇒ 001101S (As, B → 1S)
⇒ 0011010B (As, S → 0B)
⇒ 00110101 (As, B → 1)

The corresponding derivation tree has yield 00110101.

B. Rightmost derivation
A derivation is called a rightmost derivation if we apply a production only to the
rightmost variable at every step. For example, if the given production rules of a CFG are
S → 0B/1A, A → 0/0S/1AA, & B → 1/1S/0BB then the string w = 00110101 can be derived
using the rightmost derivation in the following manner:
S ⇒ 0B (As, S → 0B)
⇒ 00BB (As, B → 0BB)
⇒ 00B1 (As, B → 1)
⇒ 001S1 (As, B → 1S)
⇒ 0011A1 (As, S → 1A)
⇒ 00110S1 (As, A → 0S)
⇒ 001101A1 (As, S → 1A)
⇒ 00110101 (As, A → 0)
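The two derivations of w = 00110101 can be replayed step by step to confirm that every rule is applied to the leftmost (or rightmost) variable. The sketch below is illustrative only: the function name derive and the step lists LEFT and RIGHT are assumptions, variables are taken to be single upper-case letters, and the grammar is S → 0B/1A, A → 0/0S/1AA, B → 1/1S/0BB.

```python
RULES = {
    "S": {"0B", "1A"},
    "A": {"0", "0S", "1AA"},
    "B": {"1", "1S", "0BB"},
}

def derive(start, steps, leftmost=True):
    """Apply each (variable, right side) step to the leftmost or rightmost
    variable of the working string and return all sentential forms."""
    form, trace = start, [start]
    for var, rhs in steps:
        if rhs not in RULES[var]:
            raise ValueError(f"{var} -> {rhs} is not a production")
        positions = [i for i, c in enumerate(form) if c.isupper()]
        pos = positions[0] if leftmost else positions[-1]
        if form[pos] != var:
            raise ValueError(f"expected {var} at position {pos} of {form}")
        form = form[:pos] + rhs + form[pos + 1:]
        trace.append(form)
    return trace

LEFT = [("S", "0B"), ("B", "0BB"), ("B", "1"), ("B", "1S"),
        ("S", "0B"), ("B", "1S"), ("S", "0B"), ("B", "1")]
RIGHT = [("S", "0B"), ("B", "0BB"), ("B", "1"), ("B", "1S"),
         ("S", "1A"), ("A", "0S"), ("S", "1A"), ("A", "0")]

print(" => ".join(derive("S", LEFT)))
print(" => ".join(derive("S", RIGHT, leftmost=False)))
```

Both traces end in the same terminal string even though they pass through different sentential forms.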

3.2 Representation of CFG


The Context Free Grammar (CFG) can be represented in two ways:
• Derivation tree or Parse tree or Production tree
• Backus Naur Form (BNF)

A. Derivation Tree
It is easy to visualize derivations in context-free languages because they can be represented
using a tree structure. Trees representing derivations are called derivation trees or parse
trees. A parse tree is an ordered tree in which each internal node is labeled with the left side
of a production (i.e. a non-terminal) and the children of that node represent the
corresponding right side (i.e. terminals, non-terminals or both).

A derivation tree for a CFG, G = (V, Σ, R, S) is a tree satisfying the following conditions:
1. Every vertex has a label, which is a variable, a terminal or the empty string (ε).
2. The root has label ‘S’ (i.e. start symbol).
3. The label of an internal vertex is a variable.
4. If an internal vertex has label A and its children, from left to right, have labels
X1, X2, …, Xk, then A → X1X2…Xk is a production in R.



For example, if the given production rules of a CFG are
S → 0B/1A, A → 0/0S/1AA, & B → 1/1S/0BB then the string
w = 00110101 can be derived using the leftmost
derivation in the following manner: -
S ⇒ 0B (As, S → 0B)
⇒ 00BB (As, B → 0BB)
⇒ 001B (As, B → 1)
⇒ 0011S (As, B → 1S)
⇒ 00110B (As, S → 0B)
⇒ 001101S (As, B → 1S)
⇒ 0011010B (As, S → 0B)
⇒ 00110101 (As, B → 1)

The derivation tree built from these steps has yield 00110101. The yield of a derivation tree is
the concatenation of the labels of its leaves in left-to-right order.

Exercise
• Consider a CFG, S → XX, & X → XXX/bX/Xb/a, then find the parse tree for the given string
w = bbaaaab.
• Consider the grammar G, with productions S → aXY, X → bYb, & Y → X/c, then find the
parse tree for the string w = abbbb.

Note: The derivation tree does not specify the order in which the productions are applied to obtain
the required string, so the same derivation tree can represent several derivations. In general, the
leftmost derivation is used in preference to the rightmost derivation.

3.3 Ambiguity in Context-Free Grammar


Grammars are used to put structure on programs or documents. The assumption is that a
grammar uniquely determines a structure for each string in the language. However, not every
grammar provides a unique structure. When a grammar assigns more than one structure to the
same string in the language, it is called an ambiguous grammar.

A Context Free Grammar G is ambiguous if there exists some terminal string w ∈ L (G) that is
ambiguous. The terminal string w is ambiguous if there exist two or more leftmost derivations
for w. In other words, a terminal string w is ambiguous if it is the yield of two or more different
derivation trees.

For example: Consider G = ({S}, {a, b, +, *}, R, S), where R consists of S → S+S/S*S/a/b.
Now, we have two leftmost derivations, and hence two derivation trees, for the terminal string
w = a + a * b as given below: -

Leftmost derivation – I
S ⇒ S+S
⇒ a+S
⇒ a+S*S
⇒ a+a*S
⇒ a+a*b

Leftmost derivation – II
S ⇒ S*S
⇒ S+S*S
⇒ a+S*S
⇒ a+a*S
⇒ a+a*b

Since there exist two leftmost derivations for the same terminal string w = a + a * b, the string w
is ambiguous. Hence, we can conclude that the given grammar G is ambiguous.
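Ambiguity of a particular string can be confirmed by counting its derivation trees, since each tree corresponds to exactly one leftmost derivation. The following sketch is specific to the grammar S → S+S/S*S/a/b and is not a general ambiguity test (no algorithm decides ambiguity for arbitrary CFGs); the function name count_trees is an assumption, and the input is written without spaces.

```python
from functools import lru_cache

def count_trees(w):
    """Number of derivation trees of w under S -> S+S / S*S / a / b."""
    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of trees that derive the substring w[i:j] from S.
        if j - i == 1:
            return 1 if w[i] in "ab" else 0
        total = 0
        # Try every operator position k as the operator applied at the root.
        for k in range(i + 1, j - 1):
            if w[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total
    return count(0, len(w))

print(count_trees("a+a*b"))
```

A count greater than one for any string in the language is enough to show that the grammar is ambiguous.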

Exercise
• Consider a grammar G with the production rule S → SbS/a. Show that the given
grammar is ambiguous, using the terminal string w = abababa.

3.4 Normal Form


In a CFG, each production rule is of the form A → α, where A ∈ V and α ∈ (V ∪ Σ)* (i.e. the LHS
of a production rule in a CFG is a single non-terminal and the RHS may be the empty string,
terminals, non-terminals or a combination of terminals and non-terminals). When some
restriction is applied to the R.H.S. of every production of the Context Free Grammar G,
then G is said to be in a “normal form”. Among several normal forms, we deal with the following
two normal forms:
• Chomsky Normal Form (CNF)
• Greibach Normal Form (GNF)

A. Chomsky Normal Form (CNF)


In the Chomsky normal form, we have restrictions on the length of the R.H.S. and the nature of
the symbols in the R.H.S. of each production. In terms of the derivation tree, the restriction is
that every internal node has either two children that are internal vertices (i.e. exactly two
non-terminals) or a single leaf child (i.e. exactly one terminal).
When a grammar is in CNF, some proofs and constructions are simpler.

A context-free grammar G is in Chomsky normal form if every production is of the form
A → a or A → BC; the production S → ε is additionally allowed when ε belongs to the language.
Here, A, B and C are non-terminal symbols, ‘a’ is a terminal symbol, S is
the start symbol, and ε is the empty string. Also, neither B nor C may be the start symbol but
A may be. For example: consider G whose productions are S → AB, A → a, & B → c. Then
G is in CNF.

Note: Any Context Free Grammar (CFG) can be converted into an equivalent grammar in Chomsky
normal form (CNF). Every grammar in CNF is itself a CFG, but not every CFG is in CNF, because
CNF is a restricted form of CFG.

Reduction to Chomsky Normal Form


Any context-free language is generated by a context-free grammar in Chomsky normal form, i.e.
for every context-free grammar (CFG) G there is an equivalent grammar G′ in Chomsky
normal form (CNF). During the conversion, each terminal that appears in a longer right side is
replaced by a new non-terminal that derives exactly that terminal.

Proof idea:
1. Show that any CFG G can be converted into a CFG G′ in Chomsky normal form;
2. The conversion procedure has several stages, where the rules that violate the Chomsky normal
form conditions are replaced with equivalent rules that satisfy these conditions;
3. Order of transformations: (1) add a new start variable, (2) eliminate all null productions,
(3) eliminate unit productions, (4) replace terminals on the R.H.S. by new variables,
(5) restrict the number of variables on the R.H.S. to two by introducing new variables;
4. Check that the obtained CFG G′ defines the same language as the initial CFG G.



For example: If the production rules of a context-free grammar G are given as S → aAD,
A → aB/bAB, B → b, & D → d, then construct the equivalent CNF.

Solution
In CNF, the R.H.S. of every production consists of either exactly two non-terminals or exactly
one terminal, and there are no null productions or unit productions. Thus, the production rules
can be constructed as: -
• B → b & D → d are already in the required form, so they are in R’.
• S → aAD gives S → CaAD, where Ca → a, and S → CaAD gives S → CaC1, where
C1 → AD.
• A → aB gives A → CaB, where Ca → a.
• A → bAB gives A → BAB, since B → b is the only production of B, and A → BAB gives
A → BCb, where Cb → AB.

Hence, let G’ = (V’, Σ, R’, S’) be the newly constructed CNF grammar equivalent to the given
CFG, where
V’ = Set of non-terminals only = {S, A, B, D, Ca, Cb, C1}
Σ = Set of terminals only = {a, b, d}
S’ = Start symbol = S
R’ = Set of production rules that can be defined as: S → CaC1, A → CaB, A → BCb, Ca → a,
Cb → AB, C1 → AD, B → b & D → d.

For example: If the production rules of a context-free grammar G are given as
S → aBASA/aBA, A → aAA/a, & B → bBB/b then construct the equivalent CNF.

Solution
We can construct the CNF for the given CFG by defining the production rules as:
• B → b & A → a are already in the required form, so they are kept.
• S → aBASA gives S → CaBASA, where Ca → a
S → CaBASA gives S → CaC1SA, where C1 → BA.
S → CaC1SA gives S → CaC1C2, where C2 → SA.
S → CaC1C2 gives S → CaC3, where C3 → C1C2.
• S → aBA gives S → CaBA, where Ca → a
S → CaBA gives S → CaC1, where C1 → BA.
• A → aAA gives A → CaAA, where Ca → a.
A → CaAA gives A → CaC4, where C4 → AA.
• B → bBB gives B → CbBB, where Cb → b.
B → CbBB gives B → CbC5, where C5 → BB.

Hence, let G’ = (V’, Σ, R’, S’) be the newly constructed CNF grammar equivalent to the given
CFG, where
V’ = Set of non-terminals only = {S, A, B, Ca, Cb, C1, C2, C3, C4, C5}
Σ = Set of terminals only = {a, b}
S’ = Start symbol = S
R’ = Set of production rules that can be defined as: S → CaC3/CaC1, A → CaC4/a,
B → CbC5/b, Ca → a, Cb → b, C1 → BA, C2 → SA, C3 → C1C2, C4 → AA
& C5 → BB.



B. Greibach Normal Form (GNF)
GNF is another normal form that is quite useful in some proofs and constructions, in particular
when constructing a pushdown automaton from a context free grammar. A grammar in GNF is a
natural generalization of a regular grammar, whose productions are of the form A → aB or
A → a, where A, B ∈ V and a ∈ Σ.

A context-free grammar is in Greibach normal form if every production is of the form
A → aα, where α ∈ V* and ‘a’ is exactly one terminal, a ∈ Σ. Hence, a CFG is in GNF if every
production has the form “non-terminal → exactly one terminal followed by any number of
non-terminals (including none)”. For example, the grammar G with productions S → aAB,
A → bC, B → b, C → c is in GNF.

Conversion of CFG into GNF

Example: If the production rules of a context-free grammar G are given as
S → abaSa/aba, then construct the equivalent GNF.

Solution
We can construct the GNF for the given CFG by redefining the production rules.
Here, we have
S → abaSa/aba
Now, let us introduce new variables A and B with productions A → a & B → b, and substitute
them for every terminal after the first one on each right side:
S → aBASA/aBA
A → a &
B → b
Hence, let G’ = (V’, Σ, R’, S’) be the newly constructed GNF grammar equivalent to the given
CFG, where
V’ = Set of non-terminals only = {S, A, B}
Σ = Set of terminals only = {a, b}
S’ = Start symbol = S
R’ = Set of production rules that can be defined as: S → aBASA/aBA, A → a & B → b.

Example: If the production rules of a context-free grammar G are given as S → AB,
A → aA/bB/b & B → b then construct the equivalent GNF.

Solution
The production S → AB begins with a non-terminal, so we replace the leading A by each of its
right sides: -
• S → AB gives S → aAB (since A → aA)
• S → AB gives S → bBB (since A → bB)
• S → AB gives S → bB (since A → b)
• A → aA/bB/b is already in GNF
• B → b is already in GNF
Now, let G’ = (V’, Σ, R’, S’) be the newly constructed GNF grammar equivalent to the given CFG.
Where
V’ = Set of non-terminals only = {S, A, B}
Σ = Set of terminals only = {a, b}
S’ = Start symbol = S
R’ = Set of production rules that can be defined as: S → aAB/bBB/bB, A → aA/bB/b &
B → b.



3.5 Simplification of CFG
Unit productions, null productions and useless productions make a CFG larger than necessary
and increase the cost of every derivation, without adding any string to the language. Thus, for
simplification of a CFG, we attempt the following procedures:
• Removal of unit productions
• Removal of null productions
• Removal of useless productions

A. Elimination of Unit Productions


A Context Free Grammar G may have productions of the form A → B, where A & B are
variables in G (i.e. one non-terminal → one non-terminal); such a production is called a unit
production. To remove a unit production A → B, we substitute the non-unit right sides that B
can derive directly into the productions of A.
Unit productions increase the cost of derivations in the grammar.

Example: If the production rules of a context-free grammar G are given as S → AB,
A → a, B → C/b, C → D, D → E & E → a, then remove the unit productions.

Solution
The given grammar contains the following unit productions
B → C,
C → D, &
D → E

Also, the terminals given by the non-terminals are
A → a,
B → b,
E → a, which gives D → a & hence C → a and finally B → a. Together with B → b, this gives
B → a/b.

Now, the useless productions of the resulting grammar can be identified as those of the
non-terminals that are not reachable from the start symbol. Here, only the non-terminals A & B
are reachable from the start symbol, so C → a, D → a & E → a can never be used. Hence, we can
eliminate them.
Now, let G’ = (V’, Σ, R’, S’) be the newly constructed CFG, which is a completely reachable
grammar.
Where
V’ = Set of non-terminals only = {S, A, B}
Σ = Set of terminals only = {a, b}
S’ = Start symbol = S
R’ = Set of production rules that can be defined as: S → AB, A → a & B → a/b.

Example: If the production rules of a context-free grammar G are given as S → A/bb,
A → B/b, & B → S/a, then remove the unit productions.

Solution
The given grammar contains the following unit productions
S → A,
A → B, &
B → S
Also, the terminals given by the non-terminals are
B → a, which gives A → a & hence S → a
A → b, which gives S → b & hence B → b
S → bb, which gives B → bb & hence A → bb
Now, let G’ = (V’, Σ, R’, S’) be the newly constructed CFG, which holds no unit
productions.
Where
V’ = Set of non-terminals only = {S, A, B}
Σ = Set of terminals only = {a, b}
S’ = Start symbol = S
R’ = Set of production rules that can be defined as: S → a/b/bb, A → a/b/bb &
B → a/b/bb.
Here, the productions of the non-terminals A and B are useless, as A and B no longer appear on
the right side of any production reachable from the start symbol; every string of the language
can be generated using the start symbol alone, so we can eliminate the productions of A & B.
Hence the final production rule is only S → a/b/bb.
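The removal of unit productions can be sketched as a small procedure: first compute, for each variable, the set of variables it reaches through unit productions alone, then copy over their non-unit right sides. The representation below (a dict from single-letter variables to sets of right-side strings) and the name remove_unit_productions are assumptions made for illustration.

```python
def remove_unit_productions(rules):
    """rules maps each variable to a set of right sides (strings).
    A right side is a unit production when it is a single variable."""
    variables = set(rules)
    # closure[A] = all variables B with A =>* B using unit productions only
    closure = {a: {a} for a in variables}
    changed = True
    while changed:
        changed = False
        for a in variables:
            for b in list(closure[a]):
                for rhs in rules[b]:
                    if rhs in variables and rhs not in closure[a]:
                        closure[a].add(rhs)
                        changed = True
    # Each variable receives the non-unit right sides of its whole closure.
    return {
        a: {rhs for b in closure[a] for rhs in rules[b] if rhs not in variables}
        for a in variables
    }

G = {"S": {"A", "bb"}, "A": {"B", "b"}, "B": {"S", "a"}}
print(remove_unit_productions(G))
```

The result still contains the productions of A and B; discarding them is the separate step of removing unreachable variables.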

B. Elimination of Null Productions


A CFG may have a production of the form A → ε, where A is any variable; such a production is
called a null production. The production A → ε is just used to erase A.

To eliminate the null productions, we use the following procedure. If A → ε is a production to
be eliminated, then we look for all productions whose right side contains ‘A’ and, for each
occurrence of ‘A’, add a copy of the production with that occurrence deleted, keeping the
original production as well. Only the resulting non-null productions are added to the grammar,
and A → ε itself is then removed.

Example: If the production rules of a context-free grammar G are given as S → aA, &
A → b/ε, then remove the null productions.

Solution
The given grammar has the null production A → ε. So, put null (ε) in place of ‘A’ at the right
side of the productions and add the resultant productions to the grammar.
Thus,
S → aA gives the additional production S → a, & hence S → aA/a

Hence, let G’ = (V’, Σ, R’, S’) be the newly constructed null-free CFG.

Where
V’ = Set of non-terminals only = {S, A}
Σ = Set of terminals only = {a, b}
S’ = Start symbol = S
R’ = Set of production rules that can be defined as: S → aA/a, & A → b.

Here, the non-terminal A is still reachable from the start symbol through S → aA, so no
production is useless, and G’ generates the same strings a and ab as the given grammar.

Example: If the production rules of a context-free grammar G are given as S → ABAC,
A → aA/ε, B → bB/ε, & C → c then remove the null productions.

Solution
The given grammar contains the following null productions:
A → ε
B → ε

So, put null (ε) in place of ‘A’ at the right side of productions, we get
S → BAC, S → ABC, S → BC, and A → a
Thus, the simplified grammar becomes: S → ABAC/BAC/ABC/BC, and A → aA/a

Again, put null (ε) in place of ‘B’ at the right side of productions, we get
S → AAC, S → AC, S → C, and B → b
Thus, the final simplified grammar becomes: S → ABAC/BAC/ABC/BC/AAC/AC/C, A → aA/a,
B → bB/b and C → c
Hence, let G’ = (V’, Σ, R’, S’) be the newly constructed null-free CFG.
Where
V’ = Set of non-terminals only = {S, A, B, C}
Σ = Set of terminals only = {a, b, c}
S’ = Start symbol = S
R’ = Set of production rules as: S → ABAC/BAC/ABC/BC/AAC/AC/C, A → aA/a,
B → bB/b and C → c
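The procedure for removing null productions can be sketched as follows, using the grammar S → ABAC, A → aA/ε, B → bB/ε, C → c. The representation (single-letter variables, right sides as strings, "" for ε) and the name remove_null_productions are assumptions made for illustration. The sketch also assumes that the start symbol itself is not nullable; otherwise ε would be lost from the language.

```python
from itertools import product

def remove_null_productions(rules):
    """rules maps each variable to a set of right sides; "" denotes ε."""
    # Step 1: find the nullable variables, i.e. those that can derive ε.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, alts in rules.items():
            if var not in nullable and any(
                all(c in nullable for c in rhs) for rhs in alts
            ):
                nullable.add(var)
                changed = True
    # Step 2: every nullable occurrence may either be kept or dropped.
    new_rules = {}
    for var, alts in rules.items():
        out = set()
        for rhs in alts:
            choices = [(c, "") if c in nullable else (c,) for c in rhs]
            for pick in product(*choices):
                candidate = "".join(pick)
                if candidate:          # the null production itself is discarded
                    out.add(candidate)
        new_rules[var] = out
    return new_rules

G = {"S": {"ABAC"}, "A": {"aA", ""}, "B": {"bB", ""}, "C": {"c"}}
print(remove_null_productions(G))
```

Handling all nullable variables in one pass gives the same set of productions as substituting ε for A and then for B in turn.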

Exercise: Consider the following grammars and remove the null productions.
• S → aSa/bSb/ε
• S → a/Xb/aYa/ε, X → Y/ε, Y → b/ε.

C. Elimination of Useless Productions


For the identification of useful productions, the following two points should be noted: -
a. Can the non-terminal generate a string of terminal symbols? A non-terminal that cannot
generate any terminal string is useless, together with every production containing it.
b. Is the non-terminal symbol (or variable) reachable from the start symbol? A non-terminal
that is not reachable is useless, together with its productions.

Example: Eliminate the useless productions from the context-free grammar G where V = {S, A,
B, C} and Σ = {a, b} with the productions S → aS/A/C, A → a, B → aa, & C → aCb.

Solution
• First identify the set of variables that can lead to a terminal string.
i.e. A → a,
B → aa, and
S → aS/A
Since C cannot generate any terminal string, we remove C and every production containing it.
Now, we get a new context-free grammar G1 having V1 = {S, A, B} and Σ1 = {a} with the
production rules R1 defined as: S → aS/A, A → a, B → aa.

• The next step is the elimination of the variables that cannot be reached from the start variable
or symbol. For this, we draw a dependency graph (or transition diagram) whose
vertices are labelled with the non-terminals, with an edge from C to D if and
only if there is a production of the form C → xDy. In the dependency graph of G1 the only
edges are from S to S and from S to A, so B cannot be reached from S.

Here, the non-terminal B is useless. Hence the specified grammar is G’ = (V’, Σ’, R’, S’)
Where
V’ = Set of non-terminals only = {S, A}
Σ’ = Set of terminals only = {a}
S’ = Start symbol = S
R’ = Set of production rules as: S → aS/A, A → a.
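The two phases described above (keep the variables that generate a terminal string, then keep the variables reachable from the start symbol) can be sketched as below. The function name remove_useless and the dict-of-sets representation are assumptions; every variable is assumed to be a single character that has an entry in the dict.

```python
def remove_useless(rules, start="S"):
    """rules maps each variable to a set of right sides (strings)."""
    variables = set(rules)

    def usable(rhs, good):
        # A right side is usable if each symbol is a terminal or a good variable.
        return all(c not in variables or c in good for c in rhs)

    # Phase 1: variables that can generate some terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for var, alts in rules.items():
            if var not in generating and any(usable(r, generating) for r in alts):
                generating.add(var)
                changed = True
    rules1 = {
        v: {r for r in alts if usable(r, generating)}
        for v, alts in rules.items() if v in generating
    }
    # Phase 2: variables reachable from the start symbol (dependency graph walk).
    reachable, stack = set(), [start]
    while stack:
        v = stack.pop()
        if v in reachable or v not in rules1:
            continue
        reachable.add(v)
        for rhs in rules1[v]:
            stack.extend(c for c in rhs if c in rules1)
    return {v: rules1[v] for v in reachable}

G = {"S": {"aS", "A", "C"}, "A": {"a"}, "B": {"aa"}, "C": {"aCb"}}
print(remove_useless(G))
```

The order of the phases matters: removing non-generating variables first can make further variables unreachable, which the second phase then catches.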



3.6 Closure Properties of CFL
The closure properties of the CFLs are established here using context-free grammars, i.e. the
language generators, which distinguishes these proofs from those for the closure properties of
regular sets given earlier. The class of context-free languages is closed under: Union,
Concatenation, and Kleene closure.

A. Union
Let L1 and L2 be two context-free languages generated by the context-free grammars
G1 = (V1, Σ1, R1, S1) and G2 = (V2, Σ2, R2, S2) respectively, where V1 and V2 have no variable
in common (rename variables if necessary). Now, we construct a new language ‘L (G)’ using
the grammar G = (V, Σ, R, S), such that it generates L(G1) ∪ L(G2).

Where,
• V = V1 ∪ V2 ∪ {S}
• Σ = Set of terminals = Σ1 ∪ Σ2
• R = R1 ∪ R2 ∪ {S → S1/S2}
• S = New start symbol, not in V1 or V2.

Now, let us choose a string w ∈ L(G1) ∪ L(G2). Then S1 ⇒* w or S2 ⇒* w, and since our grammar
contains S → S1/S2, S derives the string w. Conversely, every derivation from S begins with
S → S1 or S → S2, so every string derived from S belongs to L(G1) or L(G2).

Hence, G is a context-free grammar that generates L (G) such that
L (G) = L (G1) ∪ L (G2).

B. Concatenation
Let L1 and L2 be two context-free languages generated by the context-free grammars
G1 = (V1, Σ1, R1, S1) and G2 = (V2, Σ2, R2, S2) respectively, where V1 and V2 have no variable
in common. Now, we construct a new language ‘L (G)’ using the grammar G = (V, Σ, R, S),
such that it generates the concatenation L(G1)L(G2).

Where,
• V = V1 ∪ V2 ∪ {S}
• Σ = Set of terminals = Σ1 ∪ Σ2
• R = R1 ∪ R2 ∪ {S → S1S2}
• S = New start symbol, not in V1 or V2.

Now, let us choose strings w1 ∈ L1 and w2 ∈ L2. We know that S1 ⇒* w1 and S2 ⇒* w2, and the
grammar G contains S → S1S2, so S derives the concatenation of the strings w1 & w2, i.e. w1w2,
and the language is L1L2. Since L1 & L2 are CFLs, L1L2 is also a CFL.

C. Kleene closure
Let L1 be a context-free language generated by the context-free grammar
G1 = (V1, Σ1, R1, S1). Now, we construct a new language ‘L (G)’ using the grammar
G = (V, Σ, R, S), such that it generates the Kleene star of the language L1 (or L1*).

Where,
• V = V1 ∪ {S}
• Σ = Set of terminals = Σ1
• R = R1 ∪ {S → ε, S → S1S}
• S = New start symbol, not in V1.

Here, R satisfies the requirements of a CFG, as R1 is the set of productions of the given CFG and
S → ε & S → S1S also have a single non-terminal on the left side. Applying S → S1S k times
followed by S → ε derives S1S1…S1 (k copies), i.e. the concatenation of any k ≥ 0 strings of L1,
so G is a CFG that generates the context-free language L1*.

Decision Algorithm for Context-Free Language

The following problems about context-free languages can be decided by an algorithm:
• Deciding whether a CFL is empty (emptiness).
• Deciding whether a CFL is finite (finiteness).
• Deciding whether a given string ‘w’ can be generated by a given CFG (membership).
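One standard way to decide membership is the CYK algorithm, which works on a grammar in Chomsky normal form. The notes do not name a specific algorithm, so the following is a sketch of CYK offered as one possible choice. It uses the CNF grammar S → CaC1, A → CaB, A → BCb, Ca → a, Cb → AB, C1 → AD, B → b, D → d; the names cyk, UNARY and BINARY are assumptions, and the sketch assumes the grammar has no S → ε production.

```python
def cyk(word, unary, binary, start="S"):
    """unary: terminal -> set of variables A with A -> terminal.
    binary: list of triples (A, B, C) meaning A -> B C."""
    n = len(word)
    if n == 0:
        return False  # assumption of this sketch: no S -> ε production
    # table[i][l] = set of variables deriving the substring word[i:i+l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = set(unary.get(ch, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for a, b, c in binary:
                    if b in left and c in right:
                        table[i][length].add(a)
    return start in table[0][n]

UNARY = {"a": {"Ca"}, "b": {"B"}, "d": {"D"}}
BINARY = [("S", "Ca", "C1"), ("A", "Ca", "B"), ("A", "B", "Cb"),
          ("Cb", "A", "B"), ("C1", "A", "D")]

print(cyk("aabd", UNARY, BINARY), cyk("abd", UNARY, BINARY))
```

The table has O(n^2) entries and each entry examines O(n) split points, so the running time is cubic in the length of the string for a fixed grammar.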

3.7 Pumping Lemma for Context-Free Languages


The pumping lemma for context-free languages (also called the Bar-Hillel lemma or just "the
pumping lemma") gives a method of generating an infinite number of strings from a given
sufficiently long string in a context-free language L. It is used to prove that certain languages
are not context-free.

While the pumping lemma is often a useful tool to prove that a given language is not context-
free, it does not give a complete characterization of the context-free languages. If a language
does not satisfy the condition given by the pumping lemma, we have established that it is not
context-free. On the other hand, there are languages that are not context-free, but still satisfy the
condition given by the pumping lemma. There are more powerful proof techniques available,
such as Ogden's lemma, but these techniques also do not give a complete characterization of the
context-free languages.

Statement
Let L be a context-free language. Then there is a number n (the pumping length) such that:
i. Every z ∈ L with |z| ≥ n can be written as z = uvwxy for some u, v, w, x & y (i.e. any
sufficiently long string z can be decomposed into five sub-strings u, v, w, x & y)
ii. |vx| ≥ 1 (i.e. one of v and x may be null but not both, because at least one non-empty
sub-string must be pumped to generate new strings)
iii. |vwx| ≤ n (i.e. the pumped parts, together with the sub-string between them, are not longer
than the pumping length)
iv. uv^k w x^k y ∈ L for all k ≥ 0 (i.e. an infinite number of strings is generated by choosing
different values of k)

Proof
To illustrate the idea of the proof, we consider a CFG whose productions are given by:
S → AB, A → aB/a, B → bA/b.

Now, let the string z = ababb such that z ∈ L. In the derivation tree T of z, the variable B occurs
twice on one path from the root, because B ⇒ bA ⇒ baB. Let T1 be the subtree rooted at the
upper occurrence of B on this path and T2 the subtree rooted at the lower occurrence.

Thus we can decompose z as u = a, v = ba, w = b, x = ε and y = b.

Since the string z is the yield of the tree T and z1 is the yield of its subtree T1, we can
write: z = uz1y

Again, the strings z1 & z2 = w are the yields of the trees T1 and T2, and T2 is a subtree of T1.
Thus, we can write: z1 = vwx. Also, | vwx | > |w| so |vx| ≥ 1
Hence, we have
z = uvwxy with |vwx| ≤ n & |vx| ≥ 1
As T is an S-tree and T1 & T2 are B-trees, we get
S ⇒* uBy, B ⇒* vBx and B ⇒* w
Now,
S ⇒* uBy
⇒* uvBxy (since, B ⇒* vBx)
⇒* uvwxy (since, B ⇒* w)
Thus, S ⇒* uv^1wx^1y, where k = 1

Similarly,
S ⇒* uBy
⇒* uvBxy (since, B ⇒* vBx)
⇒* uvvBxxy (since, B ⇒* vBx)
⇒* uvvwxxy (since, B ⇒* w)
Thus, S ⇒* uv^2wx^2y, where k = 2 and so on.

Hence, we can conclude that S ⇒* uBy gives S ⇒* uv^k w x^k y for all k ≥ 0. This proves the
theorem.

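Example: Prove that the language L = {a^n b^n c^n | n ≥ 0} is not a context-free language.

Solution
Let us assume that ‘L’ is a context-free language with pumping length p. Also, let
z = a^p b^p c^p, z ∈ L.
Now, according to the pumping lemma of the CFL, the string ‘z’ can be decomposed into u,
v, w, x, & y. Since |vwx| ≤ p, the sub-string vwx cannot contain all three of the symbols a, b
and c. Consider the case in which v lies within the a's and x within the b's:
u = a^r
v = a^s (s > 0)
w = a^(p-(r+s))
x = b^t (t > 0)
y = b^(p-t) c^p

Now, using the pumping lemma, uv^k w x^k y ∈ L for all k ≥ 0. For k = 2, we have

uv^2wx^2y = a^r (a^s)^2 a^(p-(r+s)) (b^t)^2 b^(p-t) c^p = a^r a^(2s) a^(p-(r+s)) b^(2t) b^(p-t) c^p
= a^(p+s) b^(p+t) c^p ∉ L
because the number of c's is smaller than the number of a's and b's. The other placements of v
and x are handled in the same way: pumping changes the count of at most two of the three
symbols, or places symbols out of order. This contradicts the pumping lemma.
Hence, we can conclude that the language L = {a^n b^n c^n | n ≥ 0} is not a context-free language.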
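For a small pumping length, the case analysis can be checked exhaustively: every decomposition z = uvwxy with |vwx| ≤ p and |vx| ≥ 1 is tried, and each must leave the language for some k. The sketch below is an illustration for one fixed p, not a proof (a proof must cover every p). The names in_L and surviving_decomposition are assumptions, and only k = 0 and k = 2 are tested.

```python
def in_L(s):
    """Membership test for L = {a^n b^n c^n | n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

def surviving_decomposition(z, p, member, exponents=(0, 2)):
    """Return some (u, v, w, x, y) with |vwx| <= p and |vx| >= 1 that stays
    in the language for every tested exponent, or None if there is none."""
    n = len(z)
    for i in range(n + 1):                 # u = z[:i]
        last = min(i + p, n)               # vwx must end by this index
        for j in range(i, last + 1):       # v = z[i:j]
            for m in range(j, last + 1):   # w = z[j:m]
                for e in range(m, last + 1):   # x = z[m:e]
                    u, v, w, x, y = z[:i], z[i:j], z[j:m], z[m:e], z[e:]
                    if not v and not x:
                        continue
                    if all(member(u + v * k + w + x * k + y) for k in exponents):
                        return (u, v, w, x, y)
    return None

p = 3
z = "a" * p + "b" * p + "c" * p
print(surviving_decomposition(z, p, in_L))
```

For comparison, the same search on aabb with the context-free language {a^n b^n} does find a decomposition that survives pumping, for example u = a, v = a, w = ε, x = b, y = b.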

Example: Prove that the language L = {a^(n^2) | n ≥ 0} is not a context-free language.

Solution
Let us assume that ‘L’ is a context-free language with pumping length p. Also, let z = a^(p^2),
z ∈ L.
Now, according to the pumping lemma of the CFL, the string ‘z’ can be decomposed into u,
v, w, x, & y as follows:
u = a^α, v = a^β, w = a^(q-(α+β)), x = a^γ, & y = a^(p^2-(q+γ)), with β + γ > 0 and
|vwx| = q - α + γ ≤ p
Now, using the pumping lemma, uv^k w x^k y ∈ L for all k ≥ 0. For k = 2, we have
uv^2wx^2y = a^α (a^β)^2 a^(q-(α+β)) (a^γ)^2 a^(p^2-(q+γ)) = a^α a^(2β) a^(q-(α+β)) a^(2γ) a^(p^2-(q+γ))
= a^(p^2+β+γ)
Since 1 ≤ β + γ ≤ p, we get p^2 < p^2 + β + γ ≤ p^2 + p < (p+1)^2, so p^2 + β + γ is not a perfect
square and uv^2wx^2y ∉ L. This contradicts the pumping lemma.
Hence, we can conclude that the language L = {a^(n^2) | n ≥ 0} is not a context-free language.



3.8 Pushdown Automata
A finite automaton has only a finite amount of memory, so it cannot recognize a language that
requires remembering an unbounded amount of information while processing a string. For example,
a finite automaton cannot accept the language L = {a^n b^n | n ≥ 1}, because it would have to
remember the number of a's read so far, which would require an infinite number of states. This
difficulty is avoided by adding memory to the finite automaton. Adding an unbounded stack to a
finite automaton gives what we call the pushdown automaton (PDA).

Thus, the pushdown automaton is essentially a finite automaton with control of both an input tape
and a stack to store what it has read. Hence, the pushdown automaton consists of the following
three things:
• An input tape
• A finite control
• A stack

The finite control reads a symbol from the input tape and, at the same time, the symbol on top of
the stack. The finite control then determines the next state and what happens to the stack. In one
transition, the pushdown automaton does the following:
• Consumes the input symbol used in the transition. If the input is ε, no input symbol is
consumed.
• Goes to a new state, which may or may not be the same as the previous state.
• Replaces the symbol at the top of the stack by any string. It could be the same symbol that
appeared at the top of the stack.

Mathematically, a pushdown automaton is a six-tuple P = (Q, Σ, Γ, δ, q0, F), where
– Q is the non-empty finite set of states.
– Σ is the non-empty finite set of input symbols.
– Γ is the non-empty finite set of pushdown (stack) symbols.
– q0 is the start state, q0 ∈ Q.
– F is the set of final states, F ⊆ Q.
– δ is the transition function, which maps Q × (Σ ∪ {ε}) × (Γ ∪ {ε}) to finite subsets of Q × Γ*.
Formally, δ takes as its argument a triple (q, a, x), where
• q is a state in Q.
• a is an input symbol or ε.
• x is a stack symbol that is a member of Γ, or ε.
The output of δ is a finite set of pairs (p, r), where ‘p’ is the new state and r is the string of
stack symbols that replaces x at the top of the stack.

Moves of a PDA
A pushdown automaton has the following kinds of moves:
• δ(q, a, x) → (p, y) means that whenever the PDA is in state q with x on top of the stack, it may
read ‘a’ from the input tape, replace ‘x’ by ‘y’ on the top of the stack and enter state p.
• δ(q, a, ε) → (p, y) pushes ‘y’ onto the top of the stack.
• δ(q, a, y) → (p, ε) pops the symbol ‘y’ from the top of the stack.

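The moves described above can be simulated directly. The following is a minimal sketch, assuming the transition function is stored as a Python dictionary, ε is written as the empty string, and the stack is a string whose first character is the top (the same way stacks are written in the examples of this chapter). The function name accepts, the bound max_configs and the two-state demo machine are hypothetical choices made for this illustration.

```python
from collections import deque


def accepts(delta, start, finals, word, max_configs=10_000):
    """Breadth-first search over configurations (state, unread input, stack).

    delta maps (state, input symbol or "", stack top or "") to a set of
    (next state, string that replaces the consulted stack top).
    Acceptance: input consumed, stack empty, state in finals.
    """
    seen = set()
    queue = deque([(start, word, "")])
    while queue and len(seen) < max_configs:
        config = queue.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, rest, stack = config
        if state in finals and not rest and not stack:
            return True
        for a in {rest[:1], ""}:          # read one symbol, or make an ε-move
            for x in {stack[:1], ""}:     # consult the top, or ignore the stack
                for nxt, push in delta.get((state, a, x), ()):
                    queue.append((nxt, rest[len(a):], push + stack[len(x):]))
    return False


# Hypothetical demo machine: push A when reading a, pop A when reading b.
demo = {
    ("q", "a", ""): {("q", "A")},    # push move:  δ(q, a, ε) → (q, A)
    ("q", "b", "A"): {("p", "")},    # pop move:   δ(q, b, A) → (p, ε)
}
print(accepts(demo, "q", {"p"}, "ab"))   # prints True
print(accepts(demo, "q", {"p"}, "aab"))  # prints False
```

The search is breadth-first because a PDA may be nondeterministic; the max_configs bound only guards against machines whose ε-moves grow the stack forever.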


Example: Design a PDA which accepts the language L = {a^n b^n : n ≥ 1}.

Solution
We first analyze the given language so that we can construct a PDA for it. Every string must have
an equal number of ‘a’ and ‘b’, with all the a's placed before all the b's. To design such a PDA,
we read the a's and then the same number of b's; for acceptance, both the input string and the
stack must be empty at the end. To do so, we push the a's one by one onto the initially empty
stack and, once the b's start, pop one ‘a’ from the stack for each ‘b’ read. Hence, when the input
string is consumed, nothing is left on the stack, and this is the accepting condition of the PDA.
Now, let the required PDA for the given language be P = (Q, Σ, Γ, δ, q0, F), where
– Q = finite set of states = {q0, q1, q2, qf}
– Σ = finite set of input symbols = {a, b}
– Γ = finite set of stack symbols = {a}
– q0 = start state
– F = set of final states = {qf}
– δ = transition function, which can be defined as
i. δ(q0, a, ε) → (q1, a)
ii. δ(q1, a, a) → (q1, aa)
iii. δ(q1, b, a) → (q2, ε)
iv. δ(q2, b, a) → (q2, ε)
v. δ(q2, ε, ε) → (qf, ε)
The separate popping state q2 is needed so that no ‘a’ can be read after the first ‘b’. If the pops
stayed in q1, a string such as aababb would also be accepted.
For example, let w = aaabbb; we can process it using the PDA defined above in tabular form:

Present State | Unread Input | Present Stack | Next State | Next Stack | Transition Used
q0            | aaabbb       | ε             | q1         | a          | I
q1            | aabbb        | a             | q1         | aa         | II
q1            | abbb         | aa            | q1         | aaa        | II
q1            | bbb          | aaa           | q2         | aa         | III
q2            | bb           | aa            | q2         | a          | IV
q2            | b            | a             | q2         | ε          | IV
q2            | ε            | ε             | qf         | ε          | V
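
The push-then-pop idea of this example can also be written as an ordinary stack program. This is a minimal sketch rather than a transcription of the transition table: the function name accepts_anbn is hypothetical, and the boolean flag popping is an assumption of the sketch that stands for "the first b has been read", so that no a is allowed afterwards.

```python
def accepts_anbn(word: str) -> bool:
    """Recognize a^n b^n (n >= 1) with an explicit stack."""
    stack = []
    popping = False              # becomes True once the first b is read
    for ch in word:
        if ch == "a" and not popping:
            stack.append("a")    # push one a for every a read
        elif ch == "b" and stack:
            stack.pop()          # pop one a for every b read
            popping = True
        else:
            return False         # a after b, b with empty stack, or other symbol
    return popping and not stack # at least one b read and stack empty


print(accepts_anbn("aaabbb"))  # prints True
print(accepts_anbn("aababb"))  # prints False
```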

Example: Design a PDA which accepts the language L = {w ∈ {a, b}* : w has an equal number of
a's and b's}.

Solution
The string must have an equal number of ‘a’ and ‘b’, in any order of placement.
Hence, let P = (Q, Σ, Γ, δ, q0, F) be the required PDA for the given language, where
– Q = finite set of states = {q0, qf}
– Σ = finite set of input symbols = {a, b}
– Γ = finite set of stack symbols = {a, b}
– q0 = start state
– F = set of final states = {qf}
– δ = transition function, which can be defined as
i. δ(q0, a, ε) → (q0, a)
ii. δ(q0, b, ε) → (q0, b)



iii. δ(q0, a, a) → (q0, aa)
iv. δ(q0, b, b) → (q0, bb)
v. δ(q0, a, b) → (q0, ε)
vi. δ(q0, b, a) → (q0, ε)
vii. δ(q0, ε, ε) → (qf, ε)

For example, let w = abbbaaba; we can process it using the PDA defined above in tabular form:

Present State | Unread Input | Stack | Next State | New Stack | Transition Used
q0            | abbbaaba     | ε     | q0         | a         | I
q0            | bbbaaba      | a     | q0         | ε         | VI
q0            | bbaaba       | ε     | q0         | b         | II
q0            | baaba        | b     | q0         | bb        | IV
q0            | aaba         | bb    | q0         | b         | V
q0            | aba          | b     | q0         | ε         | V
q0            | ba           | ε     | q0         | b         | II
q0            | a            | b     | q0         | ε         | V
q0            | ε            | ε     | qf         | ε         | VII

This can also be done in the following manner, writing each step as (state, unread input, stack) →
(next state, new stack):

• δ(q0, abbbaaba, ε) → (q0, a) by rule I.
• δ(q0, bbbaaba, a) → (q0, ε) by rule VI.
• δ(q0, bbaaba, ε) → (q0, b) by rule II.
• δ(q0, baaba, b) → (q0, bb) by rule IV.
• δ(q0, aaba, bb) → (q0, b) by rule V.
• δ(q0, aba, b) → (q0, ε) by rule V.
• δ(q0, ba, ε) → (q0, b) by rule II.
• δ(q0, a, b) → (q0, ε) by rule V.
• δ(q0, ε, ε) → (qf, ε) by rule VII, which is the accepting state.

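The cancelling behaviour traced above can be sketched as a short stack program. The PDA itself is nondeterministic (it may push even when a cancelling pop is possible); the sketch below assumes one fixed strategy, "pop whenever the top is the opposite symbol, otherwise push", which is enough to decide membership. The function name accepts_equal_ab is hypothetical.

```python
def accepts_equal_ab(word: str) -> bool:
    """Recognize strings over {a, b} with equally many a's and b's."""
    stack = []
    for ch in word:
        if ch not in ("a", "b"):
            return False          # symbol outside the input alphabet
        if stack and stack[-1] != ch:
            stack.pop()           # opposite symbol on top: cancel one pair
        else:
            stack.append(ch)      # empty stack or same symbol on top: push
    return not stack              # accept when nothing is left uncancelled


print(accepts_equal_ab("abbbaaba"))  # prints True
print(accepts_equal_ab("aab"))       # prints False
```

With this strategy the stack never holds both symbols at once, so its height is the current surplus of one symbol over the other.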
Example: Design a PDA which accepts the language L = {w c w^T : w ∈ {a, b}*}, i.e. w is any
string of a's and b's and w^T is w written in reverse.

Solution
The part of the string after ‘c’ must be the mirror image of the part before ‘c’.
Hence, let P = (Q, Σ, Γ, δ, q0, F) be the required PDA for the given language, where
– Q = finite set of states = {q0, q1, qf}
– Σ = finite set of input symbols = {a, b, c}
– Γ = finite set of stack symbols = {a, b}
– q0 = start state
– F = set of final states = {qf}
– δ = transition function, which can be defined as
i. δ(q0, a, ε) → (q0, a)
ii. δ(q0, a, a) → (q0, aa)
iii. δ(q0, a, b) → (q0, ab)
iv. δ(q0, b, b) → (q0, bb)
v. δ(q0, b, a) → (q0, ba)
vi. δ(q0, c, a) → (q1, a) i.e. only state change.
vii. δ(q0, c, b) → (q1, b) i.e. only state change.


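viii. δ(q1, a, a) → (q1, ε)
ix. δ(q1, b, b) → (q1, ε)
x. δ(q0, b, ε) → (q0, b)
xi. δ(q0, c, ε) → (q1, ε), which covers the string ‘c’ (w = ε).
xii. δ(q1, ε, ε) → (qf, ε), which is the accepting state.
In state q1 a symbol is popped only when it matches the input symbol. There is no move for a
mismatch (‘b’ read with ‘a’ on top, or ‘a’ read with ‘b’ on top), so such strings are rejected;
popping on a mismatch would wrongly accept strings such as acb.

For example, let w = aabcbaa; we can process it using the PDA defined above in the following
tabular form:

Present State | Unread Input | Stack | Next State | New Stack | Transition Used
q0            | aabcbaa      | ε     | q0         | a         | I
q0            | abcbaa       | a     | q0         | aa        | II
q0            | bcbaa        | aa    | q0         | baa       | V
q0            | cbaa         | baa   | q1         | baa       | VII
q1            | baa          | baa   | q1         | aa        | IX
q1            | aa           | aa    | q1         | a         | VIII
q1            | a            | a     | q1         | ε         | VIII
q1            | ε            | ε     | qf         | ε         | XII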
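
The two phases of this machine (push before ‘c’, match and pop after ‘c’) can be sketched as follows. The function name accepts_wcwr and the flag seen_c, which plays the role of the state change from q0 to q1, are assumptions of this sketch.

```python
def accepts_wcwr(word: str) -> bool:
    """Recognize w c w^T with w over {a, b}, using an explicit stack."""
    stack = []
    seen_c = False                    # False: pushing phase, True: matching phase
    for ch in word:
        if not seen_c:
            if ch == "c":
                seen_c = True         # only the phase changes, stack untouched
            elif ch in ("a", "b"):
                stack.append(ch)      # remember w on the stack
            else:
                return False
        else:
            # After c, every symbol must equal the symbol popped from the stack.
            if not stack or stack.pop() != ch:
                return False
    return seen_c and not stack


print(accepts_wcwr("aabcbaa"))  # prints True
print(accepts_wcwr("acb"))      # prints False
```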

Example: Design a PDA which accepts the language L = {a^n b^(2n) : n ≥ 0}.

Solution
The string must have exactly half as many ‘a’ as ‘b’, with all the a's placed before the b's.
Hence, let P = (Q, Σ, Γ, δ, q0, F) be the required PDA for the given language, where
– Q = finite set of states = {q0, q1, q2, qf}
– Σ = finite set of input symbols = {a, b}
– Γ = finite set of stack symbols = {a}
– q0 = start state
– F = set of final states = {qf}
– δ = transition function, which can be defined as
i. δ(q0, a, ε) → (q1, aa)
ii. δ(q1, a, a) → (q1, aaa)
iii. δ(q1, b, a) → (q2, ε)
iv. δ(q2, b, a) → (q2, ε)
v. δ(q2, ε, ε) → (qf, ε)
vi. δ(q0, ε, ε) → (qf, ε), which accepts the empty string (n = 0).
Each ‘a’ adds two a's to the stack and each ‘b’ removes one. The separate popping state q2 ensures
that no ‘a’ is read after the first ‘b’; otherwise a string such as ababbb would be accepted.
For example, let w = aabbbb; we can process it using the PDA defined above in the following
tabular form:

Present State | Unread Input | Present Stack | Next State | Next Stack | Transition Used
q0            | aabbbb       | ε             | q1         | aa         | I
q1            | abbbb        | aa            | q1         | aaaa       | II
q1            | bbbb         | aaaa          | q2         | aaa        | III
q2            | bbb          | aaa           | q2         | aa         | IV
q2            | bb           | aa            | q2         | a          | IV
q2            | b            | a             | q2         | ε          | IV
q2            | ε            | ε             | qf         | ε          | V
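
A trace like the table above can be produced mechanically. The sketch below assumes a deterministic rule table with a pushing state q1 and a separate popping state q2 (this split, the name trace, and the folding of the final ε-move to qf into the acceptance test are assumptions of the sketch). The stack is a string whose first character is the top, and "" stands for ε.

```python
# (state, input symbol, stack top) -> (next state, string replacing the top)
RULES = {
    ("q0", "a", ""): ("q1", "aa"),     # first a: push two a's
    ("q1", "a", "a"): ("q1", "aaa"),   # further a's: net push of two a's
    ("q1", "b", "a"): ("q2", ""),      # first b: pop and switch to popping
    ("q2", "b", "a"): ("q2", ""),      # further b's: pop one a each
}


def trace(word: str):
    """Return (list of configurations, accepted?) for a^n b^(2n)."""
    state, rest, stack = "q0", word, ""
    rows = [(state, rest, stack)]
    while rest:
        key = (state, rest[0], stack[:1])
        if key not in RULES:
            return rows, False         # no applicable move: reject
        state, push = RULES[key]
        rest, stack = rest[1:], push + stack[1:]
        rows.append((state, rest, stack))
    # Input consumed: accept if the stack is empty in q0 (n = 0) or q2.
    return rows, stack == "" and state in {"q0", "q2"}


rows, ok = trace("aabbbb")
for row in rows:
    print(row)
print(ok)  # prints True
```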



Exercise: Design a PDA which accepts the language L = {a^n b^(n+1) : n ≥ 0}.
Hint: The strings have the form a^n b b^n.

3.9 Context-Free Grammar (CFG) to Pushdown Automata (PDA) Conversion


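The language generated by a CFG is accepted by some PDA, so for a given context-free grammar we
can construct an equivalent PDA. The PDA first pushes the start symbol onto the stack. It then
replaces a non-terminal on top of the stack by the right-hand side of one of its productions, and
pops a terminal from the stack when it matches the next input symbol.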

Example: Construct a PDA equivalent to the following CFG: S → 0BB, B → 0S/1S/0. Also,
test whether 010000 is accepted by the PDA or not.

Solution
We can define the PDA as P = (Q, Σ, Γ, δ, q0, F), where
– Q = finite set of states = {q0, qf}
– Σ = finite set of input symbols = {0, 1}
– Γ = finite set of stack symbols = {S, B, 0, 1}
– q0 = start state
– F = set of final states = {qf}
– δ = transition function, which can be defined by the following rules:
i. δ(q0, ε, ε) → (qf, S)
ii. δ(qf, ε, S) → (qf, 0BB)
iii. δ(qf, ε, B) → {(qf, 0S), (qf, 1S), (qf, 0)}
iv. δ(qf, 0, 0) → (qf, ε)
v. δ(qf, 1, 1) → (qf, ε)
For the given string w = 010000, we can process it using the defined transition rules.

• On the basis of stack processing: [figure not reproduced]



 Tabular form
Present State | Unread Input | Stack | Next State | New Stack | Transition Used
q0            | 010000       | ε     | qf         | S         | I
qf            | 010000       | S     | qf         | 0BB       | II
qf            | 010000       | 0BB   | qf         | BB        | IV
qf            | 10000        | BB    | qf         | 1SB       | III
qf            | 10000        | 1SB   | qf         | SB        | V
qf            | 0000         | SB    | qf         | 0BBB      | II
qf            | 0000         | 0BBB  | qf         | BBB       | IV
qf            | 000          | BBB   | qf         | 0BB       | III
qf            | 000          | 0BB   | qf         | BB        | IV
qf            | 00           | BB    | qf         | 0B        | III
qf            | 00           | 0B    | qf         | B         | IV
qf            | 0            | B     | qf         | 0         | III
qf            | 0            | 0     | qf         | ε         | IV
qf            | ε            | ε     | (accept)   |           |

• This can also be done in the following manner, as a sequence of configurations (state, unread
input, stack):

(q0, 010000, ε) ⊢ (qf, 010000, S) ⊢ (qf, 010000, 0BB) ⊢ (qf, 10000, BB)
⊢ (qf, 10000, 1SB) ⊢ (qf, 0000, SB)
⊢ (qf, 0000, 0BBB) ⊢ (qf, 000, BBB)
⊢ (qf, 000, 0BB) ⊢ (qf, 00, BB)
⊢ (qf, 00, 0B) ⊢ (qf, 0, B)
⊢ (qf, 0, 0)
⊢ (qf, ε, ε), which is the accepting condition.
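
The construction used in this example can be sketched in code: one ε-move that pushes the start symbol, one expansion move per production, and one matching pop per terminal. The function names cfg_to_pda and accepts are hypothetical, ε is written as the empty string, and the stack is a string whose first character is the top. The search assumes the grammar has no ε-productions, so any configuration whose stack is longer than the unread input can be discarded.

```python
from collections import deque


def cfg_to_pda(productions, start):
    """Build the moves of the two-state PDA (states q0 and qf) for a CFG
    given as {non-terminal: [right-hand sides]} with one-character symbols."""
    delta = {("q0", "", ""): {("qf", start)}}            # push the start symbol
    terminals = set()
    for head, bodies in productions.items():
        delta[("qf", "", head)] = {("qf", body) for body in bodies}  # expand
        for body in bodies:
            terminals.update(ch for ch in body if ch not in productions)
    for t in terminals:
        delta[("qf", t, t)] = {("qf", "")}               # match and pop
    return delta


def accepts(delta, word):
    """Search configurations (state, unread input, stack) breadth-first."""
    queue, seen = deque([("q0", word, "")]), set()
    while queue:
        config = queue.popleft()
        state, rest, stack = config
        # Without ε-productions every stack symbol needs at least one input symbol.
        if config in seen or len(stack) > len(rest):
            continue
        seen.add(config)
        if state == "qf" and not rest and not stack:
            return True
        for a in {rest[:1], ""}:
            for x in {stack[:1], ""}:
                for nxt, push in delta.get((state, a, x), ()):
                    queue.append((nxt, rest[len(a):], push + stack[len(x):]))
    return False


grammar = {"S": ["0BB"], "B": ["0S", "1S", "0"]}
delta = cfg_to_pda(grammar, "S")
print(accepts(delta, "010000"))  # prints True
```

The pruning step is a design choice for this sketch only; it keeps the search finite even for left-recursive grammars, at the cost of not supporting ε-productions.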
