17 CFGremove Ambiguity Optional
Ambiguity
Review
• Why should we study CFGs?
Review
A sentential form is a string of terminals and nonterminals
derived from the start symbol
Inductively:
– The start symbol is a sentential form
– If αAδ is a sentential form for a grammar (where α, δ ∊ (V|Σ)*),
and A → γ is a production, then αγδ
is a sentential form for the grammar
• In this case, we say that αAδ derives αγδ in one step, which
is written as αAδ ⇒ αγδ
Leftmost and Rightmost Derivation
• Example: grammar S → a | SbS, string aba
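To make the difference concrete, here is a small Python sketch that derives aba twice, once always rewriting the leftmost nonterminal and once the rightmost. The dict encoding and the helper name `derive` are assumptions made for illustration, not part of the original slides.

```python
GRAMMAR = {"S": ["a", "SbS"]}  # hypothetical encoding of S → a | SbS

def derive(start, choices, grammar, leftmost=True):
    """Apply `choices` (indices into a nonterminal's production list) in
    order, always rewriting the leftmost (or rightmost) nonterminal."""
    form = start
    steps = [form]
    for choice in choices:
        positions = [i for i, sym in enumerate(form) if sym in grammar]
        i = positions[0] if leftmost else positions[-1]
        form = form[:i] + grammar[form[i]][choice] + form[i + 1:]
        steps.append(form)
    return steps

# Index 1 is S → SbS, index 0 is S → a.
print(" ⇒ ".join(derive("S", [1, 0, 0], GRAMMAR, leftmost=True)))
print(" ⇒ ".join(derive("S", [1, 0, 0], GRAMMAR, leftmost=False)))
```

Both runs use the same productions and describe the same parse tree; only the order in which nonterminals are rewritten differs.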
Multiple Leftmost Derivations
S → a | SbS
• Can we find more than one leftmost derivation?
A leftmost derivation:
S ⇒ SbS ⇒ abS ⇒ abSbS ⇒ ababS ⇒ ababa
Another leftmost derivation:
S ⇒ SbS ⇒ SbSbS ⇒ abSbS ⇒ ababS ⇒ ababa
Ambiguity
• A string is ambiguous for a grammar if it has
more than one parse tree
– Equivalent to more than one leftmost (or more than
one rightmost) derivation
• A grammar is ambiguous if it generates an
ambiguous string
– It can be hard to see this with manual inspection
• Exercise: can you create an unambiguous
grammar for S → a | SbS ?
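Because manual inspection is error-prone, it can help to count parse trees mechanically. The sketch below is specific to S → a | SbS and is an illustration rather than a general ambiguity test; the function name `count_trees` is hypothetical. A string is ambiguous for this grammar exactly when the count exceeds 1.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_trees(s):
    """Number of parse trees for `s` under S → a | SbS."""
    total = 1 if s == "a" else 0           # production S → a
    for i, ch in enumerate(s):
        if ch == "b":                      # production S → SbS, split at this b
            total += count_trees(s[:i]) * count_trees(s[i + 1:])
    return total

for s in ["a", "aba", "ababa", "abababa"]:
    print(s, count_trees(s))
```

The string aba has a single parse tree, while ababa has two, which matches the two leftmost derivations of ababa given earlier.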
Are these Grammars Ambiguous?
(1) S → aS | T
T → bT | U
U → cU | ε
(2) S → T | T
T → Tx | Tx | x | x
(3) S → SS | () | (S)
Ambiguity of Grammar (Example 3)
• Two different parse trees exist for the same string: ()()()
• Two distinct leftmost derivations:
S ⇒ SS ⇒ SSS ⇒ ()SS ⇒ ()()S ⇒ ()()()
S ⇒ SS ⇒ ()S ⇒ ()SS ⇒ ()()S ⇒ ()()()
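The two leftmost derivations can also be found by exhaustive search. The following Python sketch enumerates every leftmost derivation of a target string under grammar (3); the encoding and the name `leftmost_derivations` are assumptions for illustration. The search relies on the fact that every production of this grammar makes the sentential form longer, so it would not terminate in general for grammars with ε-productions or unit productions.

```python
GRAMMAR = {"S": ["SS", "()", "(S)"]}  # hypothetical encoding of grammar (3)

def leftmost_derivations(target, grammar, start="S"):
    """Return all leftmost derivations of `target` as lists of forms.

    Assumes every production lengthens the form, so the search can stop
    as soon as a form is longer than the target.
    """
    found = []

    def search(form, steps):
        if len(form) > len(target):
            return                          # forms never shrink, so prune
        i = next((k for k, sym in enumerate(form) if sym in grammar), None)
        if i is None:                       # no nonterminals left
            if form == target:
                found.append(steps)
            return
        if form[:i] != target[:i]:
            return                          # terminal prefix already differs
        for rhs in grammar[form[i]]:
            new_form = form[:i] + rhs + form[i + 1:]
            search(new_form, steps + [new_form])

    search(start, [start])
    return found

for d in leftmost_derivations("()()()", GRAMMAR):
    print(" ⇒ ".join(d))
```

Each leftmost derivation corresponds to exactly one parse tree, so finding more than one is enough to show that the grammar is ambiguous.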
Tips for Designing Grammars
• Concatenation: use separate non-terminals to
generate disjoint parts of a language, and then
combine in a production
G = S → AB
A → aA | ε
B → bB | ε
L(G) = a*b*
Tips for Designing Grammars (cont’d)
• Matching constructs: write productions which
generate strings from the middle
{a^n b^n | n ≥ 0} (not a regular language!)
S → aSb | ε
Example: S ⇒ aSb ⇒ aaSbb ⇒ aabb
{a^n b^(2n) | n ≥ 0}
S → aSbb | ε
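The "generate from the middle" idea maps directly onto a recursive recognizer that peels matching symbols off both ends. This is a minimal sketch with hypothetical function names; each branch mirrors one production.

```python
def matches_anbn(s):
    """True if s is in {a^n b^n | n >= 0}, following S → aSb | ε."""
    if s == "":                                   # S → ε
        return True
    if s.startswith("a") and s.endswith("b"):     # S → aSb
        return matches_anbn(s[1:-1])
    return False

def matches_anb2n(s):
    """True if s is in {a^n b^(2n) | n >= 0}, following S → aSbb | ε."""
    if s == "":                                   # S → ε
        return True
    if s.startswith("a") and s.endswith("bb"):    # S → aSbb
        return matches_anb2n(s[1:-2])
    return False

print(matches_anbn("aabb"), matches_anbn("aab"))
print(matches_anb2n("aabbbb"), matches_anb2n("aabb"))
```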
Tips for Designing Grammars (cont’d)
{a^n b^m | m ≥ 2n, n ≥ 0}
S → aSbb | B | ε
B → bB | b
Tips for Designing Grammars (cont’d)
{a^n b^m a^(n+m) | n ≥ 0, m ≥ 0}
Rewrite as a^n b^m a^m a^n, which now has matching
superscripts (two pairs)
S → aSa | B    (the outer a^n ... a^n are generated first)
B → bBa | ε    (then the inner b^m a^m)
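As a quick sanity check on the rewrite, the two productions can be mirrored by two small Python functions that build the string for given n and m. The names `gen_S` and `gen_B` are hypothetical, and the check covers only the sample values tried, not a proof.

```python
def gen_B(m):
    """Apply B → bBa m times, then B → ε."""
    return "" if m == 0 else "b" + gen_B(m - 1) + "a"

def gen_S(n, m):
    """Apply S → aSa n times, then S → B."""
    return gen_B(m) if n == 0 else "a" + gen_S(n - 1, m) + "a"

# Compare against the original description a^n b^m a^(n+m).
for n in range(4):
    for m in range(4):
        assert gen_S(n, m) == "a" * n + "b" * m + "a" * (n + m)

print(gen_S(2, 1))
```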
Tips for Designing Grammars (cont’d)
• Union: use separate nonterminals for each part
of the union and then combine
{a^n (b^m | c^m) | m > n ≥ 0}
Can be rewritten as
{a^n b^m | m > n ≥ 0} ∪
{a^n c^m | m > n ≥ 0}
Tips for Designing Grammars (cont’d)
{a^n b^m | m > n ≥ 0} ∪ {a^n c^m | m > n ≥ 0}
S → T | U
T → aTb | Tb | b    (T generates the first set)
U → aUc | Uc | c    (U generates the second set)
Tips for Designing Grammars (cont’d)
{a^n b^m | m > n ≥ 0} ∪ {a^n c^m | m > n ≥ 0}
Unambiguous version
S → T | V
T → aTb | U
U → Ub | b
V → aVc | W
W → Wc | c
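The difference between the two versions can be observed by counting parse trees. The sketch below is a simple memoized counter, not a full parser: it assumes single-character symbols, no ε-productions, and no cycles of unit productions, all of which hold for both grammars here. The names `AMBIGUOUS`, `UNAMBIGUOUS`, and `count_trees` are hypothetical.

```python
from functools import lru_cache

AMBIGUOUS = {"S": ["T", "U"],
             "T": ["aTb", "Tb", "b"],
             "U": ["aUc", "Uc", "c"]}
UNAMBIGUOUS = {"S": ["T", "V"],
               "T": ["aTb", "U"], "U": ["Ub", "b"],
               "V": ["aVc", "W"], "W": ["Wc", "c"]}

def count_trees(grammar, text, start="S"):
    """Count parse trees of `text` (no ε-productions, no unit cycles)."""

    @lru_cache(maxsize=None)
    def symbol(sym, i, j):
        # Ways for a single symbol to derive text[i:j].
        if sym not in grammar:                     # terminal
            return 1 if text[i:j] == sym else 0
        return sum(sequence(rhs, i, j) for rhs in grammar[sym])

    @lru_cache(maxsize=None)
    def sequence(rhs, i, j):
        # Ways for the symbol string `rhs` to derive text[i:j], with
        # every symbol covering at least one character.
        if len(rhs) == 1:
            return symbol(rhs, i, j)
        return sum(symbol(rhs[0], i, k) * sequence(rhs[1:], k, j)
                   for k in range(i + 1, j - len(rhs) + 2))

    return symbol(start, 0, len(text))

for s in ["b", "abb", "abbb", "accc"]:
    print(s, count_trees(AMBIGUOUS, s), count_trees(UNAMBIGUOUS, s))
```

In the first grammar, the extra b's can be added before or after the matched a...b pairs, which is where the second tree for abbb comes from. The second grammar forces all matched pairs to be generated first (T or V) and the surplus symbols afterwards (U or W).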
CFGs for Languages
• Recall that our goal is to describe programming
languages with CFGs
Example: a-b-c
One leftmost derivation:
E ⇒ E-E ⇒ a-E ⇒ a-E-E ⇒ a-b-E ⇒ a-b-c
Another leftmost derivation:
E ⇒ E-E ⇒ E-E-E ⇒ a-E-E ⇒ a-b-E ⇒ a-b-c
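The two derivations matter because their parse trees group the subtraction differently. The sketch below evaluates both trees; it assumes the grammar in use is something like E → E-E | a | b | c (the surviving slide text does not show it), and the numeric values assigned to a, b, and c are made up purely for illustration.

```python
# Hypothetical values for the terminals a, b, c.
env = {"a": 10, "b": 4, "c": 3}

# Parse trees as nested tuples: (left, "-", right).
right_nested = ("a", "-", ("b", "-", "c"))   # E ⇒ E-E ⇒ a-E ⇒ a-E-E ⇒ ...
left_nested = (("a", "-", "b"), "-", "c")    # E ⇒ E-E ⇒ E-E-E ⇒ ...

def evaluate(tree):
    """Evaluate a parse tree by subtracting the right subtree from the left."""
    if isinstance(tree, str):
        return env[tree]
    left, _, right = tree
    return evaluate(left) - evaluate(right)

print(evaluate(left_nested))    # (10 - 4) - 3 = 3
print(evaluate(right_nested))   # 10 - (4 - 3) = 9
```

Only the left-nested tree matches the usual left-associative meaning of subtraction, so an ambiguous grammar leaves the meaning of the program undetermined.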
Another Example: If-Then-Else
<stmt> ::= <assignment> | <if-stmt> | ...
<if-stmt> ::= if (<expr>) <stmt> |
if (<expr>) <stmt> else <stmt>
– (Here <>’s are used to denote nonterminals and ::=
for productions)
CMSC 330 24
Fixing the Expression Grammar
• Idea: Require that the right operand of all of the
operators is not a bare expression
E → E+T | E-T | E*T | T
T → a | b | c | (E)
Final Expression Grammar
E → E+T | E-T | T      (lowest precedence operators)
T → T*P | P            (higher precedence)
P → a | b | c | (E)    (highest precedence: parentheses)
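One way to see the precedence and associativity this grammar encodes is a small hand-written parser that prints the tree shape with explicit parentheses. This is a sketch, not the course's implementation: a recursive-descent parser cannot use the left-recursive productions directly, so the loops below stand in for E → E+T | E-T and T → T*P while still building left-nested results. All function names are hypothetical.

```python
def parse(text):
    """Parse `text` with the final expression grammar and return a fully
    parenthesized string that shows the shape of the parse tree."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def eat(expected=None):
        nonlocal pos
        ch = peek()
        if ch is None or (expected is not None and ch != expected):
            raise SyntaxError(f"unexpected {ch!r} at position {pos}")
        pos += 1
        return ch

    def parse_e():                      # E → E+T | E-T | T
        left = parse_t()
        while peek() in ("+", "-"):
            op = eat()
            right = parse_t()
            left = f"({left}{op}{right})"
        return left

    def parse_t():                      # T → T*P | P
        left = parse_p()
        while peek() == "*":
            eat("*")
            right = parse_p()
            left = f"({left}*{right})"
        return left

    def parse_p():                      # P → a | b | c | (E)
        if peek() == "(":
            eat("(")
            inner = parse_e()
            eat(")")
            return inner
        ch = eat()
        if ch not in "abc":
            raise SyntaxError(f"unexpected {ch!r}")
        return ch

    result = parse_e()
    if pos != len(text):
        raise SyntaxError(f"unexpected {text[pos]!r} at position {pos}")
    return result

for s in ["a+b*c", "a*(b+c)", "a*b+c", "a-b-c"]:
    print(s, "->", parse(s))
```

Because * is introduced one level below + and -, it always binds tighter, and because each operator's right operand is the next-lower nonterminal, chains such as a-b-c can only nest to the left.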
• Exercises:
– Construct the parse tree and the leftmost and rightmost derivations for
a+b*c, a*(b+c), a*b+c, a-b-c
– See what happens if you change the first set of
productions to E → E+T | E-T | T | P
– See what happens if you change the last set of
productions to P → a | b | c | E | (E)