CS242_Module 5

The document discusses Context-Free Grammars (CFGs) and Context-Free Languages (CFLs), explaining their definitions, applications, and the concept of ambiguity in grammars. It outlines the structure of CFGs, provides examples, and highlights their significance in areas such as compilers and markup languages. Additionally, it covers parsing techniques, derivations, and the relationship between CFGs and regular languages.

Uploaded by

iHACK Project
Saudi Electronic University

26/12/2021
Theory of Computing

Module 5


Context-Free Grammars and Languages
Contents
1. Context-Free Grammars
2. Parse Trees
3. Applications of Context-Free Grammars
4. Ambiguity in Grammars and Languages
Weekly Learning Outcomes
1. Describe Context-Free grammars.
2. Explain parsing and ambiguity using derivation trees.
Required Reading
1. Context-Free Grammars
2. Parse Trees
3. Applications of Context-Free Grammars
4. Ambiguity in Grammars and Languages
(Introduction to Automata Theory, Languages, and Computation
(2013) Global Edition 3rd Edition)
Recommended Reading
https://www3.nd.edu/~cpennycu/2019/assets/fall/TOC/08%20Context%20Free%20Grammars.pdf
https://www3.nd.edu/~cpennycu/2019/assets/fall/TOC/09%20Chomsky%20Normal%20Form.pdf

This Presentation is mainly dependent on the textbook: Introduction to Automata Theory, Languages, and Computation: Global Edition, 3rd edition (2013) PHI
by John E. Hopcroft, Rajeev Motwani and Jeffrey D. Ullman
• Context-Free Grammars
Not all languages are regular
• So what happens to the languages which are
not regular?
• Can we still come up with a language
recognizer?
• i.e., something that will accept (or reject) strings that
belong (or do not belong) to the language?

Context-Free Languages
• A language class larger than the class of regular languages
• Supports natural, recursive notation called “context-free
grammar”
• Applications:
• Parse trees, compilers
• XML

[Figure: the class of regular languages (FA/RE) is contained in the class of context-free languages (PDA/CFG)]
An Example
• A palindrome is a word that reads the same from both
ends
• E.g., mom, nolemonnomelon, madam, 101010101
• Let L = { w | w is a binary palindrome}
• Is L regular?
• No.
• Proof:
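• Let w = 0^N 1 0^N, where N is the pumping-lemma constant (a positive integer)
• By the Pumping Lemma, w can be rewritten as xyz, such that x y^k z is also in L (for any k ≥ 0)
• But |xy| ≤ N and y ≠ ε
• ==> y consists only of 0s from the leading block (y = 0+)
• ==> x y^k z will NOT be in L for k = 0, because xz = 0^(N-|y|) 1 0^N has fewer leading 0s than trailing 0s
• ==> Contradiction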
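
The argument can be checked by brute force for small values of N. The sketch below is only an illustration, not part of the proof: the function names are made up for this example, and it tries every split w = xyz allowed by the Pumping Lemma and looks for one whose pumped-down string xz is still a palindrome.

```python
def is_palindrome(s):
    """True if s reads the same from both ends."""
    return s == s[::-1]


def surviving_splits(n):
    """Splits w = xyz of w = 0^n 1 0^n with |xy| <= n and y non-empty
    for which xz (the case k = 0) is still a palindrome."""
    w = "0" * n + "1" + "0" * n
    survivors = []
    for i in range(0, n + 1):          # x = w[:i]
        for j in range(i + 1, n + 1):  # y = w[i:j], so |xy| = j <= n
            x, y, z = w[:i], w[i:j], w[j:]
            if is_palindrome(x + z):
                survivors.append((x, y, z))
    return survivors
```

An empty result for a given n means every permitted split fails at k = 0, which is what the contradiction requires.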

CFL
• The language of palindromes is a CFL, because it
supports recursive substitution (in the form of a CFG)
• This is because we can construct a “grammar” like this:
Productions:
1. A ==> ε
2. A ==> 0
3. A ==> 1
4. A ==> 0A0
5. A ==> 1A1
• This can be also written as
A => 0A0 | 1A1 | 0 | 1 | ε
• Variable or non-terminal: A symbol that appears on the left side of a
production and can be substituted. The only variable in this grammar is A.
• Terminal: A symbol of the alphabet from which the strings of the language are built.
Terminals appear only on the right side of productions. Here, 0 and 1
(ε denotes the empty string and is not itself a symbol).
How does the CFG for palindromes work?
An input string belongs to the language (i.e., is
accepted) if and only if it can be generated by the
CFG.
Generating a string from a grammar:
1. Pick and choose a sequence of productions that would
allow us to generate the string.
2. At every step, substitute one variable with the body of one of its productions.

• Example: w = 01110
G: A => 0A0 | 1A1 | 0 | 1 | ε
• G can generate w as follows:
1. A => 0A0
2. => 01A10
3. => 01110
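
The "pick and choose" step can be mechanized as a small search. The following is a minimal sketch for the palindrome grammar only. The names BODIES and derive are invented for this example, and the empty Python string stands for ε.

```python
# Bodies of A => 0A0 | 1A1 | 0 | 1 | ε ("" stands for ε).
BODIES = ["0A0", "1A1", "0", "1", ""]


def derive(target):
    """Return the sentential forms of a derivation A => ... => target,
    or None if the grammar cannot generate target."""
    def search(form):
        if "A" not in form:
            return [form] if form == target else None
        if len(form) - 1 > len(target):       # already too many terminals
            return None
        i = form.index("A")
        prefix, suffix = form[:i], form[i + 1:]
        # Terminals already placed must agree with both ends of the target.
        if not target.startswith(prefix) or not target.endswith(suffix):
            return None
        for body in BODIES:                   # substitute A with one body
            rest = search(prefix + body + suffix)
            if rest is not None:
                return [form] + rest
        return None

    return search("A")
```

For w = 01110 the search returns the same three steps as the slide: A, 0A0, 01A10, 01110.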

Definition of Context-Free Grammar
• A context-free grammar is denoted as G = (V, T, P, S),
where
• V = Set of variables or non-terminals
• T = Set of terminal symbols (the alphabet of the language)
• P = Set of productions, each of which is of the form
A ==> α1 | α2 | …
• Where A is a variable in V and each αi is an arbitrary string of variables and terminal
symbols (possibly the empty string ε)
• S = The start variable
Example: CFG for the language of binary palindromes:
• G = ({A}, {0,1}, P, A)
• P = A ==> 0A0 | 1A1 | 0 | 1 | ε
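
One possible way to hold the 4-tuple in code is sketched below. The class name CFG, its field names, and the helper is_well_formed are assumptions made for this illustration. Symbols are single characters and the empty string stands for ε.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class CFG(NamedTuple):
    variables: FrozenSet[str]          # V
    terminals: FrozenSet[str]          # T
    productions: Dict[str, List[str]]  # P: head -> list of bodies
    start: str                         # S


def is_well_formed(g):
    """Check the conditions stated in the definition of G = (V, T, P, S)."""
    if g.start not in g.variables:
        return False
    if g.variables & g.terminals:      # V and T must be disjoint
        return False
    symbols = g.variables | g.terminals
    for head, bodies in g.productions.items():
        if head not in g.variables:    # every head is a variable
            return False
        for body in bodies:            # every body is a string over V and T
            if any(ch not in symbols for ch in body):
                return False
    return True


# The palindrome grammar G = ({A}, {0,1}, P, A) from the slide.
G_PAL = CFG(frozenset({"A"}), frozenset({"0", "1"}),
            {"A": ["0A0", "1A1", "0", "1", ""]}, "A")
```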

More examples (1)
• Parenthesis matching in code
E.g., ()(((())))((()))….
CFG is:
• S => (S) | SS | ε
• A grammar for L = {0^m 1^n | m ≥ n}
CFG is:
• S => 0S1 | A
• A => 0A | ε
• Syntax checking
• In scenarios where there is a general need for:
• Matching a symbol with another symbol, or
• Matching a count of one symbol with that of another symbol, or
• Recursively substituting one symbol with a string of other symbols
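
The two grammars above can be turned into recognizers almost production by production. This is a rough sketch with invented function names. It is meant to show how recursion in the grammar maps to recursion in code, and it is not an efficient parser.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def in_balanced(w):
    """Recognizer mirroring S => (S) | SS | ε."""
    if w == "":                                      # S => ε
        return True
    if len(w) >= 2 and w[0] == "(" and w[-1] == ")" and in_balanced(w[1:-1]):
        return True                                  # S => (S)
    # S => SS, tried only with two non-empty parts so the recursion terminates
    return any(in_balanced(w[:i]) and in_balanced(w[i:])
               for i in range(1, len(w)))


def in_a(w):
    """Recognizer mirroring A => 0A | ε."""
    return w == "" or (w[0] == "0" and in_a(w[1:]))


def in_zeros_ones(w):
    """Recognizer mirroring S => 0S1 | A, i.e. {0^m 1^n | m >= n}."""
    if len(w) >= 2 and w[0] == "0" and w[-1] == "1" and in_zeros_ones(w[1:-1]):
        return True                                  # S => 0S1
    return in_a(w)                                   # S => A
```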

More examples (2)
• L1 = {0^n | n ≥ 0}
• L2 = {0^n | n ≥ 1}
• L3 = {0^i 1^j 2^k | i = j or j = k, where i, j, k ≥ 0}
• L4 = {0^i 1^j 2^k | i = j or i = k, where i, j, k ≥ 1}
Applications of CFLs & CFGs
• Compilers use parsers for syntax checking
• Parsers are expressed as CFGs
1. Balancing parentheses:
• B ==> BB | (B) | Statement
• Statement ==> …
2. If-then-else:
• S ==> SS | if Condition then Statement else Statement | if
Condition then Statement | Statement
• Condition ==> …
• Statement ==> …
3. C parentheses matching { … }
4. Pascal begin-end matching
5. YACC (Yet Another Compiler-Compiler)

More applications
• Markup languages
• Nested Tag Matching
• HTML
• <html> …<p> … <a href=…> … </a> </p> … </html>
• XML
• <PC> … <MODEL> … </MODEL> .. <RAM> … </RAM> … </PC>

Structure of a production

head      derivation      body

A     =======>     α1 | α2 | … | αk

The above is the same as below:

1. A ==> α1
2. A ==> α2
3. A ==> α3
…
k. A ==> αk
CFG conventions
• Terminal symbols <== a, b, c…

• Non-terminal symbols <== A,B,C, …

• Terminal or non-terminal symbols <== X,Y,Z

• Terminal strings <== w, x, y, z

• Arbitrary strings of terminals and non-terminals <== α, β, γ, …
Syntactic Expressions in
Programming Languages
result = a*b + location + 10 * distance + c

[Figure labels on the statement above: terminals; variables; operators are also terminals]
Regular languages have only terminals
• Reg expression = [a-z][a-z0-1]*
• If we allow only letters a & b, and 0 & 1 for constants (for
simplification)
• Regular expression = (a + b)(a + b + 0 + 1)*

String membership
How can we tell whether a string belongs to the language defined by
a CFG?
1. Derivation
• Head to body
2. Recursive inference
• Body to head
Both are equivalent forms.
Example:
• w = 01110
• Is w a palindrome?
CFG: A => 0A0 | 1A1 | 0 | 1 | ε
A => 0A0
=> 01A10
=> 01110
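
The derivation above goes from head to body. Recursive inference goes the other way: it starts from the strings given directly by A => 0 | 1 | ε and wraps them using A => 0A0 | 1A1. A minimal sketch of that direction, with an invented function name and a length bound added so that the process stops:

```python
def inferred_palindromes(max_len):
    """All strings of length <= max_len that can be inferred to be in the
    language of A, working from bodies back to the head A."""
    # Basis: A => ε | 0 | 1
    known = {w for w in ("", "0", "1") if len(w) <= max_len}
    frontier = set(known)
    while frontier:
        new = set()
        for w in frontier:
            for c in "01":
                s = c + w + c          # if w is in L(A), then so is c w c
                if len(s) <= max_len and s not in known:
                    new.add(s)
        known |= new
        frontier = new
    return known
```

With max_len = 5 the inferred set contains 01110, which agrees with the derivation shown on the slide.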

Simple Expressions
• We can write a CFG for accepting simple expressions
• G = (V, T, P, S)
• V = {E, F}
• T = {0, 1 ,a, b, +, *, (, )}
• S = E
• P=
• E ==> E+E | E*E | (E) | F
• F ==> aF | bF | 0F | 1F | a | b | 0 | 1

Generalization of derivation
▪ Derivation is head ==> body
▪ A ==>G X (A derives X in a single step)
▪ A ==>*G X (A derives X in zero or more steps)

▪ Transitivity:
IF A ==>*G B, and B ==>*G C, THEN A ==>*G C
Context-Free Language
• The language of a CFG, G=(V, T, P, S), denoted by L(G), is
the set of terminal strings that have a derivation from
the start variable S.
• L(G) = { w in T* | S ==>*G w }
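
The definition can be explored by enumerating the terminal strings derivable from S up to a length bound. The sketch below is a hedged illustration and not a general algorithm: the function name and the slack parameter are assumptions, symbols are single characters, and the pruning is only guaranteed to be exact for grammars whose bodies contain at most one variable (such as the palindrome grammar).

```python
from collections import deque


def language_up_to(productions, start, max_len, slack=1):
    """Terminal strings w with |w| <= max_len such that start ==>* w.
    productions maps each variable to a list of bodies ("" stands for ε)."""
    variables = set(productions)
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, if any.
        pos = next((i for i, s in enumerate(form) if s in variables), None)
        if pos is None:
            words.add(form)            # only terminals left: a member of L(G)
            continue
        for body in productions[form[pos]]:
            new = form[:pos] + body + form[pos + 1:]
            n_terminals = sum(1 for s in new if s not in variables)
            if n_terminals > max_len or len(new) > max_len + slack:
                continue               # cannot lead to a short enough word
            if new not in seen:
                seen.add(new)
                queue.append(new)
    return words
```

For the palindrome grammar and max_len = 3 this yields the nine binary palindromes of length at most 3.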

Left-most & Right-most Derivations
For the CFG:
E => E+E | E*E | (E) | F
F => aF | bF | 0F | 1F | ε
Derive the string a*(ab+10) from G: E ==>*G a*(ab+10)

Left-most derivation (always substitute the leftmost variable):
E ==> E * E
==> F * E
==> aF * E
==> a * E
==> a * (E)
==> a * (E + E)
==> a * (F + E)
==> a * (aF + E)
==> a * (abF + E)
==> a * (ab + E)
==> a * (ab + F)
==> a * (ab + 1F)
==> a * (ab + 10F)
==> a * (ab + 10)

Right-most derivation (always substitute the rightmost variable):
E ==> E * E
==> E * (E)
==> E * (E + E)
==> E * (E + F)
==> E * (E + 1F)
==> E * (E + 10F)
==> E * (E + 10)
==> E * (F + 10)
==> E * (aF + 10)
==> E * (abF + 10)
==> E * (ab + 10)
==> F * (ab + 10)
==> aF * (ab + 10)
==> a * (ab + 10)
Leftmost vs. Rightmost derivations
• For every leftmost derivation, there is a rightmost
derivation, and vice versa.
Will use parse trees to prove this
• Does every word generated by a CFG have a leftmost
and a rightmost derivation?
Easy to prove (reverse direction)
• Could there be words which have more than one
leftmost (or rightmost) derivation?
Yes – depending on the grammar

CFG & CFL
• G_pal:
A => 0A0 | 1A1 | 0 | 1 | ε

• Theorem: A string w in (0+1)* is in L(G_pal), if and only if,
w is a palindrome.

• Proof:
• Use induction
• On string length for the IF part
• On length of derivation for the ONLY IF part
• Parse Trees
Parse Trees
• Each derivation in a CFG can be represented using a parse tree:
• Each internal node is labeled by a variable in V
• Each leaf is labeled by a terminal symbol (or by ε)
• For a production A ==> X1X2…Xk, an internal node
labeled A that uses it has k children, labeled X1, X2, …, Xk
from left to right
• Parse tree for production and all other subsequent productions:
• A ==> X1..Xi..Xk

        A
    /   |   \
  X1 …  Xi … Xk
Examples
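G (expressions):
E => E+E | E*E | (E) | F
F => aF | bF | 0F | 1F | 0 | 1 | a | b

Parse tree for a + 1:

      E
    / | \
   E  +  E
   |     |
   F     F
   |     |
   a     1

G (palindromes):
A => 0A0 | 1A1 | 0 | 1 | ε

Parse tree for 0110:

      A
    / | \
   0  A  0
    / | \
   1  A  1
      |
      ε

Reading a tree from the root down to the leaves corresponds to a derivation.
Reading it from the leaves up to the root corresponds to a recursive inference.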
Parse Trees, Derivations, and
Recursive Inferences

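[Figure: for a production A ==> X1..Xi..Xk, the tree with root A and children X1 … Xi … Xk is read top-down as a derivation and bottom-up as a recursive inference. A second diagram relates the equivalent representations: parse tree, left-most derivation, right-most derivation, derivation, and recursive inference.]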
Interchangeability of different
CFG representations
• Parse tree ==> left-most derivation
• DFS left to right
• Parse tree ==> right-most derivation
• DFS right to left
• ==> left-most derivation == right-most derivation
• Derivation ==> Recursive inference
• Reverse the order of productions
• Recursive inference ==> Parse trees
• bottom-up traversal of parse tree
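
The first two conversions above can be sketched in a few lines. In this illustration a tree node is a (label, children) pair, a leaf has an empty child list, and an ε-leaf has the empty string as its label. The representation and the names leaf, derivation, and EXAMPLE_TREE are assumptions made for this example. Expanding the leftmost (or rightmost) unexpanded variable at each step visits the internal nodes in the same order as a left-to-right (or right-to-left) depth-first traversal.

```python
def leaf(symbol):
    """A leaf node: a terminal, or "" for ε."""
    return (symbol, [])


def derivation(tree, leftmost=True):
    """Sentential forms of the left-most or right-most derivation
    that corresponds to the given parse tree."""
    frontier = [tree]
    steps = ["".join(label for label, _ in frontier)]
    while True:
        # Indices of nodes in the frontier that still have children to expand.
        open_nodes = [i for i, (_, kids) in enumerate(frontier) if kids]
        if not open_nodes:
            return steps
        i = open_nodes[0] if leftmost else open_nodes[-1]
        frontier[i:i + 1] = frontier[i][1]   # replace the node by its children
        steps.append("".join(label for label, _ in frontier))


# Parse tree for a + 1 using E => E+E | F and F => a | 1.
EXAMPLE_TREE = ("E", [("E", [("F", [leaf("a")])]),
                      leaf("+"),
                      ("E", [("F", [leaf("1")])])])
```

Both calls start at E and end at a+1. They differ only in the order in which the two inner E nodes are expanded.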

• Applications of Context-Free Grammars
Relationship between CFLs and RLs

CFLs & Regular Languages
• A CFG is said to be right-linear if all the productions are
one of the following two forms: A ==> wB (or) A ==> w
Where:
• A & B are variables,
• w is a string of terminals

• Theorem 1: Every right-linear CFG generates a regular
language
• Theorem 2: Every regular language has a right-linear
grammar
• Theorem 3: Left-linear CFGs also represent RLs
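
The idea behind Theorem 1 is that each variable can act as a state of a nondeterministic automaton: A ==> wB reads w and moves to B, and A ==> w reads w and accepts. The sketch below simulates that automaton directly. The function name, the (w, next) pair encoding, and the sample grammar are assumptions chosen for illustration.

```python
def accepts(productions, start, word):
    """Simulate a right-linear grammar as an NFA over (position, variable).
    productions maps a variable to a list of (w, B) pairs, where B is the
    next variable, or None for a production of the form A ==> w."""
    stack = [(0, start)]
    seen = set(stack)
    while stack:
        pos, var = stack.pop()
        for w, nxt in productions[var]:
            if not word.startswith(w, pos):
                continue                       # w does not match the input here
            end = pos + len(w)
            if nxt is None:
                if end == len(word):           # A ==> w consumed the rest
                    return True
            elif (end, nxt) not in seen:
                seen.add((end, nxt))
                stack.append((end, nxt))
    return False


# A sample right-linear grammar:
# A => 01B | C,  B => 11B | 0C | 1A,  C => 1A | 0 | 1
SAMPLE = {
    "A": [("01", "B"), ("", "C")],
    "B": [("11", "B"), ("0", "C"), ("1", "A")],
    "C": [("1", "A"), ("0", None), ("1", None)],
}
```

Because the set of (position, variable) pairs is finite, the simulation always terminates, even when the grammar contains productions such as A ==> C that read no input.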

Some Examples

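[Figure: two finite automata over {0, 1}, each with states A, B, C; the transition diagrams did not survive extraction. Question for each: what is the equivalent right-linear CFG?]

A => 01B | C
B => 11B | 0C | 1A
C => 1A | 0 | 1

Question for this right-linear CFG: what is the equivalent finite automaton?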
• Ambiguity in Grammars and Languages
Ambiguity in CFGs

• A CFG is said to be ambiguous if there exists a string
which has more than one left-most derivation. E.g., the
input string 00111 can be derived from the following CFG
in two ways.
Example:
• S ==> AS | ε
• A ==> A1 | 0A1 | 01

LM derivation #1:
S => AS
=> 0A1S
=> 0A11S
=> 00111S
=> 00111

LM derivation #2:
S => AS
=> A1S
=> 0A11S
=> 00111S
=> 00111
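
The two derivations can be found mechanically by counting all left-most derivations of a string. The sketch below is specific to this grammar. The MIN_YIELD table (the shortest terminal string each symbol can produce) is an assumption added so that the search terminates despite the left-recursive production A ==> A1.

```python
PRODUCTIONS = {"S": ["AS", ""], "A": ["A1", "0A1", "01"]}
# Shortest terminal string each symbol can produce (used only for pruning).
MIN_YIELD = {"S": 0, "A": 2, "0": 1, "1": 1}


def count_leftmost_derivations(form, word):
    """Number of distinct left-most derivations form ==>* word."""
    if sum(MIN_YIELD[s] for s in form) > len(word):
        return 0                               # form can only grow too long
    pos = next((i for i, s in enumerate(form) if s in PRODUCTIONS), None)
    if pos is None:
        return 1 if form == word else 0        # no variables left
    if not word.startswith(form[:pos]):
        return 0                               # terminal prefix already wrong
    return sum(
        count_leftmost_derivations(form[:pos] + body + form[pos + 1:], word)
        for body in PRODUCTIONS[form[pos]]     # expand the leftmost variable
    )
```

A count greater than 1 for some string shows that the grammar is ambiguous.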

Why does ambiguity matter?
• E ==> E + E | E * E | (E) | a | b | c | 0 | 1
• For the string a * b + c, the two values are different!
• LM derivation #1:
• E => E + E => E * E + E ==>* a * b + c
Parse tree, grouping as (a*b)+c:

        E
      / | \
     E  +  E
   / | \   |
  E  *  E  c
  |     |
  a     b

• LM derivation #2:
• E => E * E => a * E => a * E + E ==>* a * b + c
Parse tree, grouping as a*(b+c):

      E
    / | \
   E  *  E
   |   / | \
   a  E  +  E
      |     |
      b     c

The calculated value depends on which of the two parse trees is
actually used.
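
A small numeric check makes the difference concrete. The values a = 2, b = 3, c = 4 and the nested-tuple encoding of the two parse trees are assumptions chosen only for this illustration.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(tree, env):
    """Evaluate a tree given as a symbol name or a (left, op, right) triple."""
    if isinstance(tree, str):
        return env[tree]
    left, op, right = tree
    return OPS[op](evaluate(left, env), evaluate(right, env))


TREE_1 = (("a", "*", "b"), "+", "c")   # parse tree of LM derivation #1
TREE_2 = ("a", "*", ("b", "+", "c"))   # parse tree of LM derivation #2
ENV = {"a": 2, "b": 3, "c": 4}
```

With these values the first tree evaluates to 10 and the second to 14, although both trees yield the same string a * b + c.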

Removing Ambiguity in
Expression Evaluations
• It may be possible to remove ambiguity for some CFLs
• E.g., in a CFG for expression evaluation, by imposing rules &
restrictions such as a precedence rule
• This implies rewriting the grammar
Precedence: (), * , +
• Modified unambiguous version:
E => E + T | T
T => T * F | F
F => I | (E)
I => a | b | c | 0 | 1
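
The unambiguous grammar maps naturally onto a recursive-descent evaluator, with one function per variable. This is only a sketch under stated assumptions: the left-recursive productions E => E + T and T => T * F are implemented as loops (a standard substitution, since direct left recursion would not terminate in this style), and the numeric values bound to a, b, c are invented for the example.

```python
def evaluate_expression(text, env=None):
    """Parse and evaluate text according to
    E => E + T | T,  T => T * F | F,  F => I | (E),  I => a | b | c | 0 | 1."""
    env = env or {"a": 2, "b": 3, "c": 4, "0": 0, "1": 1}  # hypothetical values
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_e():                      # E => T (+ T)*
        nonlocal pos
        value = parse_t()
        while peek() == "+":
            pos += 1
            value += parse_t()
        return value

    def parse_t():                      # T => F (* F)*
        nonlocal pos
        value = parse_f()
        while peek() == "*":
            pos += 1
            value *= parse_f()
        return value

    def parse_f():                      # F => I | (E)
        nonlocal pos
        tok = peek()
        if tok == "(":
            pos += 1
            value = parse_e()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return value
        if tok in env:                  # I => a | b | c | 0 | 1
            pos += 1
            return env[tok]
        raise SyntaxError("unexpected token: %r" % (tok,))

    value = parse_e()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return value
```

Because * is handled one level below +, the string a * b + c now has a single reading, (a*b)+c, and parentheses are needed to obtain a*(b+c).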

Inherently Ambiguous CFLs
• However, for some languages, it may not be possible to
remove ambiguity

• A CFL is said to be inherently ambiguous if every CFG that
describes it is ambiguous
Example:
• L = { a^n b^n c^m d^m | n, m ≥ 1} U { a^n b^m c^m d^n | n, m ≥ 1}
• L is inherently ambiguous
• Why?
• Check for input string: a^n b^n c^n d^n

Thank You
