Context free grammar
A context free grammar is a formal grammar used to generate all possible strings of a given formal language.
A context free grammar G can be defined as a 4-tuple:
1. G= (V, T, P, S)
Where,
G describes the grammar
V describes a finite set of non-terminal symbols.
T describes a finite set of terminal symbols.
P describes a set of production rules.
S is the start symbol.
In CFG, the start symbol is used to derive the string. You can derive the string by
repeatedly replacing a non-terminal by the right hand side of the production, until all
non-terminals have been replaced by terminal symbols.
Example:
L = {wcw^R | w ∈ {a, b}*}
Production rules:
1. S → aSa
2. S → bSb
3. S → c
Now let us check whether the string abbcbba can be derived from the given CFG:
1. S ⇒ aSa
2. ⇒ abSba
3. ⇒ abbSbba
4. ⇒ abbcbba
By applying the productions S → aSa and S → bSb recursively, and finally applying the production S → c, we get the string abbcbba.
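To make the derivation mechanical, here is a minimal Python sketch (an illustration, not part of the original grammar definition) that derives wcw^R for a given w by applying S → aSa or S → bSb for each symbol of w and finishing with S → c:

def derive(w):
    # Start from the start symbol S and record every sentential form.
    sentential = "S"
    steps = [sentential]
    for symbol in w:
        # Apply S -> aSa or S -> bSb depending on the next symbol of w.
        sentential = sentential.replace("S", symbol + "S" + symbol)
        steps.append(sentential)
    # Finally apply S -> c.
    steps.append(sentential.replace("S", "c"))
    return steps

print(derive("abb"))  # ['S', 'aSa', 'abSba', 'abbSbba', 'abbcbba']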
Capabilities of CFG
The various capabilities of CFG are:
o Context free grammar is useful for describing most programming languages.
o If the grammar is properly designed then an efficient parser can be constructed
automatically.
o Using associativity and precedence information, suitable grammars for expressions can be constructed.
o Context free grammar is capable of describing nested structures like: balanced
parentheses, matching begin-end, corresponding if-then-else's & so on.
Derivation
Derivation is a sequence of production rule applications. It is used to obtain the input string through these production rules. During parsing, we have to take two decisions. These are as follows:
o We have to decide the non-terminal which is to be replaced.
o We have to decide the production rule by which the non-terminal will be replaced.
We have two options for deciding which non-terminal is to be replaced with a production rule:
Left-most Derivation
In the left-most derivation, the left-most non-terminal of the sentential form is replaced at each step. So in a left-most derivation the input string is derived from left to right.
Example:
Production rules:
1. S → S + S
2. S → S - S
3. S → a | b | c
Input:
a - b + c
The left-most derivation is:
1. S ⇒ S + S
2. ⇒ S - S + S
3. ⇒ a - S + S
4. ⇒ a - b + S
5. ⇒ a - b + c
Right-most Derivation
In the right-most derivation, the right-most non-terminal of the sentential form is replaced at each step. So in a right-most derivation the input string is derived from right to left.
Example:
1. S → S + S
2. S → S - S
3. S → a | b | c
Input:
a - b + c
The right-most derivation is:
1. S ⇒ S - S
2. ⇒ S - S + S
3. ⇒ S - S + c
4. ⇒ S - b + c
5. ⇒ a - b + c
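The difference between the two strategies is only which occurrence of S is rewritten first. A small Python sketch (illustrative; the rule choices are hand-picked to reproduce the derivations above) makes this concrete:

def rewrite(sentential, replacement, leftmost=True):
    # Replace the left-most or right-most occurrence of the non-terminal S.
    i = sentential.find("S") if leftmost else sentential.rfind("S")
    return sentential[:i] + replacement + sentential[i + 1:]

# Left-most derivation of a - b + c (same rule choices as above).
s = "S"
for r in ["S+S", "S-S", "a", "b", "c"]:
    s = rewrite(s, r, leftmost=True)
    print(s)   # S+S, S-S+S, a-S+S, a-b+S, a-b+c

# The right-most derivation uses leftmost=False with the rule
# sequence ["S-S", "S+S", "c", "b", "a"], reproducing the steps above.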
Parse tree
o A parse tree is the graphical representation of a derivation; its nodes are labeled with symbols, which can be terminals or non-terminals.
o In parsing, the string is derived using the start symbol. The root of the parse tree is that start symbol.
o A parse tree follows the precedence of operators. The deepest sub-tree is traversed first, so the operator in a parent node has lower precedence than the operator in its sub-tree.
The parse tree follows these points:
o All leaf nodes have to be terminals.
o All interior nodes have to be non-terminals.
o In-order traversal of the parse tree gives the original input string.
Example:
Production rules:
1. T → T + T | T * T
2. T → a | b | c
Input:
a * b + c
The tree is built in five steps (figures omitted). Following operator precedence, the final parse tree has + at the root, the sub-tree deriving a * b as its left child, and c as its right child, so the deeper * sub-tree is traversed first.
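As a quick check of these points, here is a small Python sketch (with a hypothetical Node class) that builds that final parse tree by hand, with + at the root and the deeper * sub-tree below it, and confirms that reading the leaves in order yields the original input:

class Node:
    def __init__(self, symbol, children=None):
        self.symbol = symbol              # terminal or non-terminal label
        self.children = children or []    # empty for leaf (terminal) nodes

def frontier(node):
    # In-order (left-to-right) reading of the leaves.
    if not node.children:
        return node.symbol
    return "".join(frontier(child) for child in node.children)

# T -> T + T, where the left T derives a * b and the right T derives c.
tree = Node("T", [
    Node("T", [Node("T", [Node("a")]), Node("*"), Node("T", [Node("b")])]),
    Node("+"),
    Node("T", [Node("c")]),
])
print(frontier(tree))  # a*b+c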
Ambiguity
A grammar is said to be ambiguous if there exists more than one leftmost derivation, more than one rightmost derivation, or more than one parse tree for a given input string.
If the grammar is not ambiguous then it is called unambiguous.
Example:
1. S → aSb | SS
2. S → ε
For the string aabb, the above grammar generates two parse trees (figures omitted).
If the grammar has ambiguity, then it is not good for compiler construction. No method can automatically detect and remove ambiguity, but ambiguity can often be removed by rewriting the whole grammar in an unambiguous form.
Parser
The parser is the phase of the compiler that breaks the data coming from the lexical analysis phase into smaller elements. A parser takes input in the form of a sequence of tokens and produces output in the form of a parse tree.
Parsing is of two types: top down parsing and bottom up parsing.
Top down parsing
o Top down parsing is also known as recursive parsing or predictive parsing.
o Top down parsing is used to construct a parse tree for an input string.
o In top down parsing, the parsing starts from the start symbol and transforms it into the input string.
Parse tree representation of the input string "acdb" is as follows (figure omitted):
Bottom up parsing
o Bottom up parsing is also known as shift-reduce parsing.
o Bottom up parsing is used to construct a parse tree for an input string.
o In bottom up parsing, the parsing starts with the input symbol and constructs the parse tree up to the start symbol by tracing out the rightmost derivation of the string in reverse.
Example
Production
1. E → T
2. T → T * F
3. T → F
4. F → id
Parse Tree representation of input string "id * id" is as follows:
Bottom up parsing is classified into various types of parsing. These are as follows:
1. Shift-Reduce Parsing
2. Operator Precedence Parsing
3. Table Driven LR Parsing
a. LR(1)
b. SLR(1)
c. CLR(1)
d. LALR(1)
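To illustrate the shift-reduce idea behind all of these methods, here is a simplified Python sketch (not a full table-driven LR parser: the action sequence is hand-scripted, whereas a real parser derives it from a parsing table) that parses id * id with the grammar E → T, T → T * F | F, F → id:

tokens = ["id", "*", "id"]
stack = []
# Hand-scripted actions for id * id; each reduce names (handle, non-terminal).
script = [
    ("shift", None, None), ("reduce", ("id",), "F"), ("reduce", ("F",), "T"),
    ("shift", None, None), ("shift", None, None), ("reduce", ("id",), "F"),
    ("reduce", ("T", "*", "F"), "T"), ("reduce", ("T",), "E"),
]
for op, rhs, lhs in script:
    if op == "shift":
        stack.append(tokens.pop(0))            # move the next token onto the stack
    else:
        assert tuple(stack[-len(rhs):]) == rhs  # the handle must be on top
        del stack[-len(rhs):]
        stack.append(lhs)                      # replace the handle with its non-terminal
    print(stack, tokens)
# Ends with stack == ['E'] and no tokens left: the input is accepted.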
Simplification of CFG
As we have seen, various languages can be efficiently represented by a context-free grammar. Not all grammars are optimized, however: a grammar may contain some extra symbols (non-terminals), and extra symbols unnecessarily increase the length of the grammar. Simplification of a grammar means reducing the grammar by removing these useless symbols. The properties of a reduced grammar are given below:
1. Each variable (i.e. non-terminal) and each terminal of G appears in the derivation of some
word in L.
2. There should not be any production of the form X → Y, where X and Y are non-terminals.
3. If ε is not in the language L, then there should be no production X → ε.
Let us study the reduction process in detail.
Removal of Useless Symbols
A symbol is useless if it cannot be reached from the start symbol and so does not take part in the derivation of any string; such a symbol is known as a useless symbol. Similarly, a variable is useless if it does not take part in the derivation of any terminal string; such a variable is known as a useless variable.
For Example:
1. T → aaB | abA | aaT
2. A → aA
3. B → ab | b
4. C → ad
In the above example, the variable 'C' will never occur in the derivation of any string, so the production C → ad is useless. So we will eliminate it, and the other productions are written in such a way that variable C can never be reached from the starting variable 'T'.
Production A → aA is also useless because there is no way to terminate it. If it never
terminates, then it can never produce a string. Hence this production can never take part
in any derivation.
To remove this useless production A → aA, we will first find all the variables which will never lead to a terminal string, such as variable 'A'. Then we will remove all the productions in which the variable 'A' occurs.
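The whole procedure can be sketched in Python (a simplified illustration for this example, with productions stored as strings and upper-case letters treated as variables): first keep only the generating variables, then keep only the symbols reachable from the start symbol through useful productions:

# Productions of the example grammar; T is the start symbol.
productions = {"T": ["aaB", "abA", "aaT"], "A": ["aA"], "B": ["ab", "b"], "C": ["ad"]}
variables = set(productions)

# Pass 1: variables that can derive a terminal string.
generating = set()
changed = True
while changed:
    changed = False
    for var, bodies in productions.items():
        if var not in generating and any(
            all(s in generating or s not in variables for s in body) for body in bodies
        ):
            generating.add(var)
            changed = True

# Pass 2: variables reachable from the start symbol via useful bodies.
reachable, todo = {"T"}, ["T"]
while todo:
    var = todo.pop()
    for body in productions.get(var, []):
        if all(s in generating or s not in variables for s in body):
            for s in body:
                if s in variables and s not in reachable:
                    reachable.add(s)
                    todo.append(s)

print(generating & reachable)  # {'B', 'T'}: A never terminates, C is unreachable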
Elimination of ε Production
The productions of type S → ε are called ε productions. These types of productions can only be removed from those grammars that do not generate ε.
Step 1: First find out all nullable non-terminal variables which derive ε.
Step 2: For each production A → α, construct all productions A → x, where x is obtained from α by removing one or more of the nullable non-terminals found in step 1.
Step 3: Now combine the result of step 2 with the original productions and remove the ε productions.
Example:
Remove the ε productions from the following CFG while preserving its meaning.
1. S → XYX
2. X → 0X | ε
3. Y → 1Y | ε
Solution:
Now, while removing the ε productions, we are deleting the rules X → ε and Y → ε. To preserve the meaning of the CFG, we consider ε on the right-hand side wherever X or Y has appeared.
Let us take
1. S → XYX
If the first X on the right-hand side is ε, then
1. S → YX
Similarly, if the last X on the R.H.S. is ε, then
1. S → XY
If Y = ε then
1. S → XX
If Y and one X are ε, then
1. S → X
If both X's are replaced by ε, then
1. S → Y
Now,
1. S → XY | YX | XX | X | Y
Now let us consider
1. X → 0X
If we place ε on the right-hand side for X, then
1. X → 0
2. X → 0X | 0
Similarly Y → 1Y | 1
Collectively we can rewrite the CFG, combined with the original production S → XYX, with the ε productions removed as
1. S → XYX | XY | YX | XX | X | Y
2. X → 0X | 0
3. Y → 1Y | 1
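The three steps can be carried out mechanically. Here is a minimal Python sketch (illustrative, specific to this kind of grammar with single-character symbols) that finds the nullable variables and generates every variant of each body with some of them deleted:

from itertools import combinations

productions = {"S": ["XYX"], "X": ["0X", ""], "Y": ["1Y", ""]}  # "" stands for epsilon
nullable = {v for v, bodies in productions.items() if "" in bodies}

new_productions = {}
for var, bodies in productions.items():
    result = set()
    for body in bodies:
        if body == "":
            continue                      # drop the epsilon production itself
        positions = [i for i, s in enumerate(body) if s in nullable]
        # Add every variant obtained by deleting a subset of nullable symbols.
        for r in range(len(positions) + 1):
            for subset in combinations(positions, r):
                variant = "".join(s for i, s in enumerate(body) if i not in subset)
                if variant:
                    result.add(variant)
    new_productions[var] = sorted(result)

print(new_productions)
# {'S': ['X', 'XX', 'XY', 'XYX', 'Y', 'YX'], 'X': ['0', '0X'], 'Y': ['1', '1Y']}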
Removing Unit Productions
The unit productions are the productions in which one non-terminal gives another non-terminal. Use the following steps to remove unit productions:
Step 1: To remove X → Y, add production X → a to the grammar rule whenever Y → a
occurs in the grammar.
Step 2: Now delete X → Y from the grammar.
Step 3: Repeat step 1 and step 2 until all unit productions are removed.
For example:
1. S → 0A | 1B | C
2. A → 0S | 00
3. B → 1 | A
4. C → 01
Solution:
S → C is a unit production. But while removing S → C we have to consider what C gives.
So, we can add a rule to S.
1. S → 0A | 1B | 01
Similarly, B → A is also a unit production so we can modify it as
1. B → 1 | 0S | 00
Thus finally we can write CFG without unit production as
1. S → 0A | 1B | 01
2. A → 0S | 00
3. B → 1 | 0S | 00
4. C → 01
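For this example, the removal can also be sketched in Python (an illustration; a body consisting of a single non-terminal counts as a unit production):

productions = {
    "S": ["0A", "1B", "C"],
    "A": ["0S", "00"],
    "B": ["1", "A"],
    "C": ["01"],
}
non_terminals = set(productions)

changed = True
while changed:                             # repeat until no unit production is left
    changed = False
    for var, bodies in productions.items():
        for body in list(bodies):
            if body in non_terminals:      # unit production var -> body
                bodies.remove(body)
                for inherited in productions[body]:
                    if inherited != var and inherited not in bodies:
                        bodies.append(inherited)   # inherit the target's bodies
                changed = True

print(productions)
# {'S': ['0A', '1B', '01'], 'A': ['0S', '00'], 'B': ['1', '0S', '00'], 'C': ['01']}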
Chomsky's Normal Form (CNF)
CNF stands for Chomsky normal form. A CFG(context free grammar) is in CNF(Chomsky
normal form) if all production rules satisfy one of the following conditions:
o The start symbol generating ε. For example, S → ε (where S is the start symbol).
o A non-terminal generating two non-terminals. For example, S → AB.
o A non-terminal generating a terminal. For example, S → a.
For example:
1. G1 = {S → AB, S → c, A → a, B → b}
2. G2 = {S → aA, A → a, B → c}
The production rules of Grammar G1 satisfy the rules specified for CNF, so the grammar
G1 is in CNF.
However, the production rules of Grammar G2 do not satisfy the rules specified for CNF, as S → aA contains a terminal followed by a non-terminal. So the grammar G2 is not in CNF.
Steps for converting CFG into CNF
Step 1: Eliminate the start symbol from the RHS. If the start symbol S appears on the right-hand side of any production, create a new production as:
1. S1 → S
Where S1 is the new start symbol.
Step 2: In the grammar, remove the null, unit and useless productions. You can refer to
the Simplification of CFG.
Step 3: Eliminate terminals from the RHS of a production if they exist together with other non-terminals or terminals. For example, the production S → aA can be decomposed as:
1. S → RA
2. R → a
Step 4: Eliminate RHS with more than two non-terminals. For example, S → ASB can be decomposed as:
1. S → RB
2. R → AS
Example:
Convert the given CFG to CNF. Consider the given grammar G1:
1. S → a | aA | B
2. A → aBB | ε
3. B → Aa | b
Solution:
Step 1: We will create a new production S0 → S, as the start symbol S appears on the RHS. The grammar will be:
1. S0 → S
2. S → a | aA | B
3. A → aBB | ε
4. B → Aa | b
Step 2: As grammar G1 contains A → ε null production, its removal from the grammar
yields:
1. S0 → S
2. S → a | aA | B
3. A → aBB
4. B → Aa | b | a
Now, as grammar G1 contains the unit production S → B, its removal yields:
1. S0 → S
2. S → a | aA | Aa | b
3. A → aBB
4. B → Aa | b | a
Also remove the unit production S0 → S; its removal from the grammar yields:
1. S0 → a | aA | Aa | b
2. S → a | aA | Aa | b
3. A → aBB
4. B → Aa | b | a
Step 3: In the production rules S0 → aA | Aa, S → aA | Aa, A → aBB and B → Aa, the terminal a exists on the RHS with non-terminals. So we will replace the terminal a with a new non-terminal X:
1. S0 → a | XA | AX | b
2. S → a | XA | AX | b
3. A → XBB
4. B → AX | b | a
5. X → a
Step 4: In the production rule A → XBB, the RHS has more than two symbols. Decomposing it yields:
1. S0 → a | XA | AX | b
2. S → a | XA | AX | b
3. A → RB
4. B → AX | b | a
5. X → a
6. R → XB
Hence, for the given grammar, this is the required CNF.
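Whether a grammar is in CNF can be checked mechanically. Here is a small Python sketch (a hypothetical helper; single upper-case letters are treated as non-terminals and S0 as the start symbol) applied to the result above:

def is_cnf(productions, start):
    for var, bodies in productions.items():
        for body in bodies:
            if body == "" and var == start:
                continue                   # only the start symbol may derive epsilon
            if len(body) == 1 and body.islower():
                continue                   # A -> a
            if len(body) == 2 and body.isupper():
                continue                   # A -> BC
            return False
    return True

cnf = {
    "S0": ["a", "XA", "AX", "b"],
    "S": ["a", "XA", "AX", "b"],
    "A": ["RB"],
    "B": ["AX", "b", "a"],
    "X": ["a"],
    "R": ["XB"],
}
print(is_cnf(cnf, "S0"))  # True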
Greibach Normal Form (GNF)
GNF stands for Greibach normal form. A CFG(context free grammar) is in GNF(Greibach
normal form) if all the production rules satisfy one of the following conditions:
o A start symbol generating ε. For example, S → ε.
o A non-terminal generating a terminal. For example, A → a.
o A non-terminal generating a terminal which is followed by any number of non-terminals.
For example, S → aASB.
For example:
1. G1 = {S → aAB | aB, A → aA| a, B → bB | b}
2. G2 = {S → aAB | aB, A → aA | ε, B → bB | ε}
The production rules of Grammar G1 satisfy the rules specified for GNF, so the grammar
G1 is in GNF. However, the production rules of Grammar G2 do not satisfy the rules specified for GNF, as A → ε and B → ε contain ε (only the start symbol can generate ε). So the grammar G2 is not in GNF.
Steps for converting CFG into GNF
Step 1: Convert the grammar into CNF.
If the given grammar is not in CNF, convert it into CNF. You can refer to the following topic to convert the CFG into CNF: Chomsky normal form.
Step 2: If the grammar contains left recursion, eliminate it.
If the context free grammar contains left recursion, eliminate it. You can refer to the following topic to eliminate left recursion: Left Recursion.
Step 3: In the grammar, convert the given production rule into GNF form.
If any production rule in the grammar is not in GNF form, convert it.
Example:
1. S → XB | AA
2. A → a | SA
3. B → b
4. X → a
Solution:
As the given grammar G is already in CNF and there is no left recursion, so we can skip
step 1 and step 2 and directly go to step 3.
The production rule A → SA is not in GNF, so we substitute S → XB | AA into the production rule A → SA as:
1. S → XB | AA
2. A → a | XBA | AAA
3. B → b
4. X → a
The production rules S → XB and A → XBA are not in GNF, so we substitute X → a into the production rules S → XB and A → XBA as:
1. S → aB | AA
2. A → a | aBA | AAA
3. B → b
4. X → a
Now we will remove left recursion (A → AAA), we get:
1. S → aB | AA
2. A → aC | aBAC
3. C → AAC | ε
4. B → b
5. X → a
Now we will remove null production C → ε, we get:
1. S → aB | AA
2. A → aC | aBAC | a | aBA
3. C → AAC | AA
4. B → b
5. X → a
The production rules S → AA and C → AA are not in GNF, so we substitute A → aC | aBAC | a | aBA into the production rules S → AA and C → AA as:
1. S → aB | aCA | aBACA | aA | aBAA
2. A → aC | aBAC | a | aBA
3. C → AAC
4. C → aCA | aBACA | aA | aBAA
5. B → b
6. X → a
The production rule C → AAC is not in GNF, so we substitute A → aC | aBAC | a | aBA in
production rule C → AAC as:
1. S → aB | aCA | aBACA | aA | aBAA
2. A → aC | aBAC | a | aBA
3. C → aCAC | aBACAC | aAC | aBAAC
4. C → aCA | aBACA | aA | aBAA
5. B → b
6. X → a
Hence, this is the GNF form for the grammar G.
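As with CNF, conformance to GNF can be checked mechanically. A minimal Python sketch (illustrative; upper-case letters are treated as non-terminals) applied to the final grammar:

def is_gnf(productions, start):
    for var, bodies in productions.items():
        for body in bodies:
            if body == "" and var == start:
                continue                   # only the start symbol may derive epsilon
            rest = body[1:]
            # A terminal followed by zero or more non-terminals.
            if body[:1].islower() and (rest == "" or rest.isupper()):
                continue
            return False
    return True

gnf = {
    "S": ["aB", "aCA", "aBACA", "aA", "aBAA"],
    "A": ["aC", "aBAC", "a", "aBA"],
    "C": ["aCAC", "aBACAC", "aAC", "aBAAC", "aCA", "aBACA", "aA", "aBAA"],
    "B": ["b"],
    "X": ["a"],
}
print(is_gnf(gnf, "S"))  # True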
Pumping Lemma for CFG
Lemma
If L is a context-free language, there is a pumping length p such that any string w ∈ L of length ≥ p can be written as w = uvxyz, where vy ≠ ε, |vxy| ≤ p, and for all i ≥ 0, uv^ixy^iz ∈ L.
Applications of Pumping Lemma
The pumping lemma is used to show that a language is not context free. Let us take an example and show how it is checked.
Problem
Find out whether the language L = {0^n1^n2^n | n ≥ 1} is context free or not.
Solution
Assume that L is context free. Then L must satisfy the pumping lemma.
First, let n be the pumping length of the lemma. Then, take w = 0^n1^n2^n.
Break w into uvxyz, where
|vxy| ≤ n and vy ≠ ε.
Hence vxy cannot involve both 0s and 2s, since the last 0 and the first 2 are at least (n+1) positions apart. There are two cases −
Case 1 − vxy has no 2s. Then v and y consist only of 0s and 1s. Then uxz, which would have to be in L, has n 2s, but fewer than n 0s or 1s.
Case 2 − vxy has no 0s. Then, symmetrically, uxz has n 0s, but fewer than n 1s or 2s.
In either case a contradiction occurs.
Hence, L is not a context-free language.
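For a small pumping length the contradiction can even be checked by brute force. The following Python sketch (illustrative) tries every legal split w = uvxyz of 0^3 1^3 2^3 with |vxy| ≤ 3 and vy ≠ ε, and confirms that pumping v and y twice never yields a string in L:

def in_lang(s):
    # Membership test for L = {0^n 1^n 2^n | n >= 1}.
    n = len(s) // 3
    return n >= 1 and s == "0" * n + "1" * n + "2" * n

n = 3
w = "0" * n + "1" * n + "2" * n
survives = False
for i in range(len(w) + 1):
    for j in range(i, len(w) + 1):
        for k in range(j, len(w) + 1):
            for l in range(k, len(w) + 1):
                u, v, x, y, z = w[:i], w[i:j], w[j:k], w[k:l], w[l:]
                if not v + y or len(v + x + y) > n:
                    continue               # lemma requires vy != eps, |vxy| <= n
                if in_lang(u + v * 2 + x + y * 2 + z):
                    survives = True        # this split would survive pumping
print(survives)  # False: no split survives, so L cannot be context-free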
CFL Closure Property
Context-free languages are closed under −
• Union
• Concatenation
• Kleene Star operation
Union
Let L1 and L2 be two context free languages. Then L1 ∪ L2 is also context free.
Example
Let L1 = {a^nb^n, n > 0}. The corresponding grammar G1 will have P: S1 → aS1b | ab
Let L2 = {c^md^m, m ≥ 0}. The corresponding grammar G2 will have P: S2 → cS2d | ε
Union of L1 and L2, L = L1 ∪ L2 = {a^nb^n} ∪ {c^md^m}
The corresponding grammar G will have the additional production S → S1 | S2
Concatenation
If L1 and L2 are context free languages, then L1L2 is also context free.
Example
Concatenation of the languages L1 and L2, L = L1L2 = {a^nb^nc^md^m}
The corresponding grammar G will have the additional production S → S1 S2
Kleene Star
If L is a context free language, then L* is also context free.
Example
Let L = {a^nb^n, n ≥ 0}. The corresponding grammar G will have P: S → aSb | ε
Kleene Star L1 = {a^nb^n}*
The corresponding grammar G1 will have the additional productions S1 → SS1 | ε
Context-free languages are not closed under −
• Intersection − If L1 and L2 are context free languages, then L1 ∩ L2 is not necessarily context free.
• Complement − If L1 is a context free language, then L1' may not be context free.
However, context-free languages are closed under intersection with a regular language: if L1 is a regular language and L2 is a context free language, then L1 ∩ L2 is a context free language.
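The three positive constructions are purely mechanical. A short Python sketch (illustrative; each grammar is represented as a dict of productions with single-character variable names, which must be disjoint between the two grammars):

def union(g1, s1, g2, s2, s="S"):
    g = {**g1, **g2}
    g[s] = [s1, s2]                 # S -> S1 | S2
    return g, s

def concat(g1, s1, g2, s2, s="S"):
    g = {**g1, **g2}
    g[s] = [s1 + s2]                # S -> S1 S2
    return g, s

def star(g1, s1, s="S"):
    g = dict(g1)
    g[s] = [s1 + s, ""]             # S -> S1 S | epsilon
    return g, s

g1 = {"P": ["aPb", "ab"]}           # L1 = {a^n b^n}, start symbol P
g2 = {"Q": ["cQd", ""]}             # L2 = {c^m d^m}, start symbol Q
print(union(g1, "P", g2, "Q"))      # ({'P': [...], 'Q': [...], 'S': ['P', 'Q']}, 'S')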
Pushdown Automata(PDA)
o Pushdown automata is a way to implement a CFG in the same way we design DFA for a
regular grammar. A DFA can remember a finite amount of information, but a PDA can
remember an infinite amount of information.
o Pushdown automata is simply an NFA augmented with an "external stack memory". The
addition of stack is used to provide a last-in-first-out memory management capability to
Pushdown automata. Pushdown automata can store an unbounded amount of
information on the stack. It can access a limited amount of information on the stack. A
PDA can push an element onto the top of the stack and pop off an element from the top
of the stack. To read an element into the stack, the top elements must be popped off and
are lost.
o A PDA is more powerful than an FA. Any language which can be accepted by an FA can also be accepted by a PDA, and a PDA also accepts a class of languages which cannot be accepted by any FA. Thus the PDA is strictly more powerful than the FA.
PDA Components:
Input tape: The input tape is divided into many cells or symbols. The input head is read-only and may only move from left to right, one symbol at a time.
Finite control: The finite control has a pointer which points to the current symbol which is to be read.
Stack: The stack is a structure in which we can push and remove the items from one end
only. It has an infinite size. In PDA, the stack is used to store the items temporarily.
Formal definition of PDA:
The PDA can be defined as a collection of 7 components:
Q: the finite set of states
∑: the finite set of input symbols
Γ: the finite set of stack symbols which can be pushed onto and popped from the stack
q0: the initial state
Z: the start stack symbol, which is in Γ
F: the set of final states
δ: the mapping function used for moving from the current state to the next state.
Instantaneous Description (ID)
An ID is an informal notation describing how a PDA computes an input string and decides whether the string is accepted or rejected.
An instantaneous description is a triple (q, w, α) where:
q describes the current state.
w describes the remaining input.
α describes the stack contents, top at the left.
Turnstile Notation:
⊢ sign describes the turnstile notation and represents one move.
⊢* sign describes a sequence of moves.
For example,
(p, bw, Tβ) ⊢ (q, w, αβ)
In the above example, while taking a transition from state p to q, the input symbol 'b' is consumed and the top of the stack 'T' is replaced by a new string α.
Example 1:
Design a PDA for accepting the language {a^nb^2n | n ≥ 1}.
Solution: In this language, n number of a's should be followed by 2n number of b's.
Hence, we will apply a very simple logic, and that is if we read single 'a', we will push two
a's onto the stack. As soon as we read 'b' then for every single 'b' only one 'a' should get
popped from the stack.
The ID can be constructed as follows:
1. δ(q0, a, Z) = (q0, aaZ)
2. δ(q0, a, a) = (q0, aaa)
Now when we read b, we will change the state from q0 to q1 and start popping
corresponding 'a'. Hence,
1. δ(q0, b, a) = (q1, ε)
This process of popping one 'a' for each 'b' is repeated until all the symbols are read. Note that the popping action occurs in state q1 only.
1. δ(q1, b, a) = (q1, ε)
After reading all b's, all the corresponding a's should get popped. Hence when we read ε
as input symbol then there should be nothing in the stack. Hence the move will be:
1. δ(q1, ε, Z) = (q2, ε)
Where
PDA = ({q0, q1, q2}, {a, b}, {a, Z}, δ, q0, Z, {q2})
We can summarize the ID as:
1. δ(q0, a, Z) = (q0, aaZ)
2. δ(q0, a, a) = (q0, aaa)
3. δ(q0, b, a) = (q1, ε)
4. δ(q1, b, a) = (q1, ε)
5. δ(q1, ε, Z) = (q2, ε)
Now we will simulate this PDA for the input string "aaabbbbbb".
1. δ(q0, aaabbbbbb, Z) ⊢ δ(q0, aabbbbbb, aaZ)
2. ⊢ δ(q0, abbbbbb, aaaaZ)
3. ⊢ δ(q0, bbbbbb, aaaaaaZ)
4. ⊢ δ(q1, bbbbb, aaaaaZ)
5. ⊢ δ(q1, bbbb, aaaaZ)
6. ⊢ δ(q1, bbb, aaaZ)
7. ⊢ δ(q1, bb, aaZ)
8. ⊢ δ(q1, b, aZ)
9. ⊢ δ(q1, ε, Z)
10. ⊢ δ(q2, ε, ε)
11. ACCEPT
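The five moves can be executed directly. Below is a small deterministic simulator in Python (an illustrative sketch; the stack is kept as a string with the top at the left, and "" marks the ε move):

delta = {
    ("q0", "a", "Z"): ("q0", "aaZ"),
    ("q0", "a", "a"): ("q0", "aaa"),
    ("q0", "b", "a"): ("q1", ""),
    ("q1", "b", "a"): ("q1", ""),
    ("q1", "", "Z"): ("q2", ""),    # epsilon move at the end of the input
}

def accepts(w, finals={"q2"}):
    state, stack = "q0", "Z"
    for ch in w:
        if not stack or (state, ch, stack[0]) not in delta:
            return False
        state, push = delta[(state, ch, stack[0])]
        stack = push + stack[1:]          # replace the top with the pushed string
    if stack and (state, "", stack[0]) in delta:
        state, push = delta[(state, "", stack[0])]
        stack = push + stack[1:]
    return state in finals and stack == ""

print(accepts("aaabbbbbb"))  # True
print(accepts("aabbb"))      # False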
Example 2:
Design a PDA for accepting the language {0^n1^m0^n | m, n ≥ 1}.
Solution: In this PDA, n number of 0's are followed by any number of 1's, followed by n number of 0's. Hence the logic for the design of such a PDA will be as follows:
Push all 0's onto the stack on encountering first 0's. Then if we read 1, just do nothing.
Then read 0, and on each read of 0, pop one 0 from the stack.
This scenario can be written in the ID form as:
1. δ(q0, 0, Z) = (q0, 0Z)
2. δ(q0, 0, 0) = (q0, 00)
3. δ(q0, 1, 0) = (q1, 0)
4. δ(q1, 1, 0) = (q1, 0)
5. δ(q1, 0, 0) = (q1, ε)
6. δ(q1, ε, Z) = (q2, Z) (ACCEPT state)
Now we will simulate this PDA for the input string "0011100".
1. δ(q0, 0011100, Z) ⊢ δ(q0, 011100, 0Z)
2. ⊢ δ(q0, 11100, 00Z)
3. ⊢ δ(q1, 1100, 00Z)
4. ⊢ δ(q1, 100, 00Z)
5. ⊢ δ(q1, 00, 00Z)
6. ⊢ δ(q1, 0, 0Z)
7. ⊢ δ(q1, ε, Z)
8. ⊢ δ(q2, ε, Z)
9. ACCEPT
PDA Acceptance
A language can be accepted by Pushdown automata using two approaches:
1. Acceptance by Final State: The PDA is said to accept its input by final state if it enters any final state in zero or more moves after reading the entire input.
Let P = (Q, ∑, Γ, δ, q0, Z, F) be a PDA. The language acceptable by final state can be defined as:
1. L(PDA) = {w | (q0, w, Z) ⊢* (p, ε, α), p ∈ F}
2. Acceptance by Empty Stack: On reading the input string from the initial configuration
for some PDA, the stack of PDA gets empty.
Let P =(Q, ∑, Γ, δ, q0, Z, F) be a PDA. The language acceptable by empty stack can be
defined as:
1. N(PDA) = {w | (q0, w, Z) ⊢* (p, ε, ε), p ∈ Q}
Equivalence of Acceptance by Final State and Empty Stack
o If L = N(P1) for some PDA P1, then there is a PDA P2 such that L = L(P2). That means the
language accepted by empty stack PDA will also be accepted by final state PDA.
o If there is a language L = L (P1) for some PDA P1 then there is a PDA P2 such that L =
N(P2). That means language accepted by final state PDA is also acceptable by empty stack
PDA.
Example:
Construct a PDA that accepts, by empty stack, the language L over {0, 1} consisting of all strings of 0's and 1's in which the number of 0's is twice the number of 1's.
Solution:
There are two parts for designing this PDA:
o If 1 comes before any 0's
o If 0 comes before any 1's.
We are going to design the first part, i.e. when 1 comes before the 0's. The logic is to read a single 1 and push two 1's onto the stack. Thereafter, on reading two 0's, pop two 1's from the stack. The δ can be:
1. δ(q0, 1, Z) = (q0, 11Z) (here Z indicates that the stack is empty)
2. δ(q0, 1, 1) = (q0, 111)
3. δ(q0, 0, 1) = (q0, ε)
Now, consider the second part i.e. if 0 comes before 1's. The logic is that read first 0, push
it onto the stack and change state from q0 to q1. [Note that state q1 indicates that first 0
is read and still second 0 has yet to read].
Being in q1, if 1 is encountered then POP 0. Being in q1, if 0 is read then simply read that
second 0 and move ahead. The δ will be:
1. δ(q0, 0, Z) = (q1, 0Z)
2. δ(q1, 0, 0) = (q1, 0)
3. δ(q1, 0, Z) = (q0, ε) (indicates that one 0 and one 1 have already been read, so simply read the second 0)
4. δ(q1, 1, 0) = (q1, ε)
Now, the complete PDA for the given L can be summarized as:
1. δ(q0, 1, Z) = (q0, 11Z)
2. δ(q0, 1, 1) = (q0, 111)
3. δ(q0, 0, 1) = (q0, ε)
4. δ(q0, 0, Z) = (q1, 0Z)
5. δ(q1, 0, 0) = (q1, 0)
6. δ(q1, 1, 0) = (q1, ε)
7. δ(q1, 0, Z) = (q0, ε)
8. δ(q0, ε, Z) = (q0, ε) (ACCEPT state)
Non-deterministic Pushdown Automata
The non-deterministic pushdown automaton (NPDA) is very much similar to the NFA. We will discuss some CFLs which are accepted only by an NPDA.
Every language which is accepted by a deterministic PDA is accepted by a non-deterministic PDA as well. However, there are some languages which can be accepted only by an NPDA and not by any DPDA. Thus the NPDA is more powerful than the DPDA.
Example:
Design a PDA for palindrome strings.
Solution:
Suppose the language consists of strings L = {aba, aa, bb, bab, bbabb, aabaa, ......}. A string can be an odd palindrome or an even palindrome. The logic for constructing the PDA is that we will push the symbols onto the stack up to half of the string; then we will read each remaining symbol and perform a pop operation, comparing whether the symbol which is popped is the same as the symbol which is read. When we reach the end of the input, we expect the stack to be empty.
This PDA is a non-deterministic PDA, because guessing the middle of the given string and matching the first half read from the left against the second half (its reverse) leads to non-deterministic moves. Here is the ID.
Simulation of abaaba
1. δ(q1, abaaba, Z) Apply rule 1
2. ⊢ δ(q1, baaba, aZ) Apply rule 5
3. ⊢ δ(q1, aaba, baZ) Apply rule 4
4. ⊢ δ(q1, aba, abaZ) Apply rule 7
5. ⊢ δ(q2, ba, baZ) Apply rule 8
6. ⊢ δ(q2, a, aZ) Apply rule 7
7. ⊢ δ(q2, ε, Z) Apply rule 11
8. ⊢ δ(q2, ε, ε) Accept
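Since the numbered transition rules are not reproduced above, the following Python sketch uses its own small rule set in the same spirit (an illustration, not the exact PDA): in q1 push each input symbol, non-deterministically guess the middle (skipping one symbol for an odd palindrome or none for an even one), and in q2 match and pop. All reachable configurations are explored breadth-first:

from collections import deque

def accepts(w):
    # A configuration is (state, position in w, stack tuple with top at the end).
    start = ("q1", 0, ("Z",))
    seen, queue = {start}, deque([start])
    while queue:
        state, pos, stack = queue.popleft()
        if state == "q2" and pos == len(w) and stack == ("Z",):
            return True
        moves = []
        if state == "q1" and pos < len(w):
            moves.append(("q1", pos + 1, stack + (w[pos],)))  # push the symbol
            moves.append(("q2", pos + 1, stack))              # guess odd middle
        if state == "q1":
            moves.append(("q2", pos, stack))                  # guess even middle
        if state == "q2" and pos < len(w) and stack[-1] == w[pos]:
            moves.append(("q2", pos + 1, stack[:-1]))         # match and pop
        for m in moves:
            if m not in seen:
                seen.add(m)
                queue.append(m)
    return False

print([s for s in ["aba", "aa", "bab", "ab", "abaaba"] if accepts(s)])
# ['aba', 'aa', 'bab', 'abaaba']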
CFG to PDA Conversion
For this conversion, the first symbol on the R.H.S. of each production must be a terminal symbol (that is, the grammar should be in GNF). The following steps are used to obtain a PDA from a CFG:
Step 1: Convert the given productions of CFG into GNF.
Step 2: The PDA will only have one state {q}.
Step 3: The initial symbol of CFG will be the initial symbol in the PDA.
Step 4: For each non-terminal symbol, add the following rule:
1. δ(q, ε, A) = (q, α)
Where the production rule is A → α
Step 5: For each terminal symbol, add the following rule:
1. δ(q, a, a) = (q, ε) for every terminal symbol a
Example 1:
Convert the following grammar to a PDA that accepts the same language.
1. S → 0S1 | A
2. A → 1A0 | S | ε
Solution:
The CFG can be first simplified by eliminating unit productions:
1. S → 0S1 | 1S0 | ε
Now we will convert this CFG to GNF:
1. S → 0SX | 1SY | ε
2. X → 1
3. Y → 0
The PDA can be:
R1: δ(q, ε, S) = {(q, 0SX) | (q, 1SY) | (q, ε)}
R2: δ(q, ε, X) = {(q, 1)}
R3: δ(q, ε, Y) = {(q, 0)}
R4: δ(q, 0, 0) = {(q, ε)}
R5: δ(q, 1, 1) = {(q, ε)}
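These rules can be exercised with a small non-deterministic simulator in Python (an illustrative sketch; acceptance here is by empty stack, and the stack is a string with the top at the left):

from collections import deque

def accepts(w):
    expand = {"S": ["0SX", "1SY", ""], "X": ["1"], "Y": ["0"]}  # R1-R3
    start = (0, "S")                     # (input position, stack)
    seen, queue = {start}, deque([start])
    while queue:
        pos, stack = queue.popleft()
        if pos == len(w) and stack == "":
            return True                  # all input read with an empty stack
        if stack == "":
            continue
        top, rest = stack[0], stack[1:]
        nexts = []
        if top in expand:                # replace a non-terminal (R1-R3)
            nexts = [(pos, body + rest) for body in expand[top]]
        elif pos < len(w) and top == w[pos]:
            nexts = [(pos + 1, rest)]    # match a terminal (R4-R5)
        for nxt in nexts:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

print(accepts("0101"))  # True: 0101 = 0(10)1, where 10 = 1(ε)0
print(accepts("00"))    # False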
Example 2:
Construct a PDA for the given CFG, and test whether 010^4 is accepted by this PDA.
1. S → 0BB
2. B → 0S | 1S | 0
Solution:
The PDA can be given as:
1. A = ({q}, {0, 1}, {S, B, 0, 1}, δ, q, S, ∅) (acceptance by empty stack)
The production rule δ can be:
R1: δ(q, ε, S) = {(q, 0BB)}
R2: δ(q, ε, B) = {(q, 0S) | (q, 1S) | (q, 0)}
R3: δ(q, 0, 0) = {(q, ε)}
R4: δ(q, 1, 1) = {(q, ε)}
Testing 010^4, i.e. 010000, against the PDA:
1. δ(q, 010000, S) ⊢ δ(q, 010000, 0BB) R1
2. ⊢ δ(q, 10000, BB) R3
3. ⊢ δ(q, 10000, 1SB) R2
4. ⊢ δ(q, 0000, SB) R4
5. ⊢ δ(q, 0000, 0BBB) R1
6. ⊢ δ(q, 000, BBB) R3
7. ⊢ δ(q, 000, 0BB) R2
8. ⊢ δ(q, 00, BB) R3
9. ⊢ δ(q, 00, 0B) R2
10. ⊢ δ(q, 0, B) R3
11. ⊢ δ(q, 0, 0) R2
12. ⊢ δ(q, ε, ε) R3
13. ACCEPT
Thus 010^4 is accepted by the PDA.
Example 3:
Draw a PDA for the CFG given below:
1. S → aSb
2. S → a | b | ε
Solution:
The PDA can be given as:
1. P = ({q}, {a, b}, {S, a, b, z0}, δ, q, z0, {q})
The mapping function δ will be:
R1: δ(q, ε, S) = {(q, aSb)}
R2: δ(q, ε, S) = {(q, a) | (q, b) | (q, ε)}
R3: δ(q, a, a) = {(q, ε)}
R4: δ(q, b, b) = {(q, ε)}
R5: δ(q, ε, z0) = {(q, ε)}
Simulation: Consider the string aaabb. The derivation starts with S on top of the start symbol z0:
1. δ(q, aaabb, Sz0) ⊢ δ(q, aaabb, aSbz0) R1
2. ⊢ δ(q, aabb, Sbz0) R3
3. ⊢ δ(q, aabb, aSbbz0) R1
4. ⊢ δ(q, abb, Sbbz0) R3
5. ⊢ δ(q, abb, abbz0) R2
6. ⊢ δ(q, bb, bbz0) R3
7. ⊢ δ(q, b, bz0) R4
8. ⊢ δ(q, ε, z0) R4
9. ⊢ δ(q, ε, ε) R5
10. ACCEPT
LEX
o Lex is a program that generates lexical analyzers. It is used with the YACC parser generator.
o The lexical analyzer is a program that transforms an input stream into a sequence of
tokens.
o It reads an input specification and produces the source code of the lexical analyzer as output, implementing the lexical analyzer as a C program.
The function of Lex is as follows:
o Firstly, the lexical analyzer specification is written as a program lex.l in the Lex language. Then the Lex compiler runs the lex.l program and produces a C program lex.yy.c.
o Finally, the C compiler compiles the lex.yy.c program and produces an object program a.out.
o a.out is the lexical analyzer that transforms an input stream into a sequence of tokens.
Lex file format
A Lex program is separated into three sections by %% delimiters. The format of the Lex source is as follows:
1. { definitions }
2. %%
3. { rules }
4. %%
5. { user subroutines }
Definitions include declarations of constants, variables and regular definitions.
Rules define statements of the form p1 {action1} p2 {action2} .... pn {actionn}.
Where pi describes a regular expression and actioni describes what action the lexical analyzer should take when pattern pi matches a lexeme.
User subroutines are auxiliary procedures needed by the actions. The subroutines can be loaded with the lexical analyzer and compiled separately.
YACC
o YACC stands for Yet Another Compiler Compiler.
o YACC provides a tool to produce a parser for a given grammar.
o YACC is a program designed to compile a LALR (1) grammar.
o It is used to produce the source code of the syntactic analyzer of the language
produced by LALR (1) grammar.
o The input of YACC is the rule or grammar and the output is a C program.
These are some points about YACC:
Input: a CFG file (file.y)
Output: a parser y.tab.c (yacc)
o The output file "file.output" contains the parsing tables.
o The file "file.tab.h" contains declarations.
o The parser is invoked as the function yyparse().
o The parser expects to use a function called yylex() to get tokens.
The basic operational sequence is as follows (flow diagram omitted):
1. gram.y: the file containing the desired grammar in YACC format.
2. yacc: the YACC program processes the grammar file.
3. y.tab.c: the C source program created by YACC.
4. cc: the C compiler compiles y.tab.c.
5. a.out: the executable file that will parse the grammar given in gram.y.