17 CFGremove Ambiguity Optional

This is how to make automata more efficient

Uploaded by

survivor000111

Context-Free Grammars

Ambiguity
Review
• Why should we study CFGs?

• What are the four parts of a CFG?

• How do we tell if a string is accepted by a CFG?

• What’s a parse tree?

Review
A sentential form is a string of terminals and non-terminals
produced from the start symbol

Inductively:
– The start symbol is a sentential form
– If αAδ is a sentential form for a grammar, where α, δ ∊ (V|Σ)*
and A → γ is a production, then αγδ is also a sentential form
for the grammar
• In this case, we say that αAδ derives αγδ in one step, which
is written as αAδ ⇒ αγδ
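
The one-step relation ⇒ can be sketched in Python. This is only an illustration and not part of the original slides: the dictionary encoding of the grammar and the name `one_step` are assumptions, every symbol is a single character, and ε would be written as the empty string.

```python
def one_step(form, grammar):
    """Return every sentential form reachable from `form` in one derivation step."""
    results = set()
    for i, symbol in enumerate(form):
        # Only non-terminals (the keys of `grammar`) can be rewritten.
        for rhs in grammar.get(symbol, []):
            # alpha A delta  =>  alpha gamma delta
            results.add(form[:i] + rhs + form[i + 1:])
    return results


# Grammar S -> a | SbS (assumed encoding: non-terminal -> list of right-hand sides).
GRAMMAR = {"S": ["a", "SbS"]}
print(one_step("SbS", GRAMMAR))
```

Rewriting either S in SbS with S → SbS gives the same string SbSbS, so the set holds three forms rather than four.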

Leftmost and Rightmost Derivation
• Example: S → a | SbS    String: aba

Leftmost derivation: S ⇒ SbS ⇒ abS ⇒ aba
– At every step, apply a production to the leftmost non-terminal

Rightmost derivation: S ⇒ SbS ⇒ Sba ⇒ aba
– At every step, apply a production to the rightmost non-terminal

• Both derivations happen to have the same parse tree


• A parse tree has a unique leftmost and a unique
rightmost derivation
• Not every string has a unique parse tree
• Parse trees don’t show the order productions are
applied
More on Leftmost/Rightmost Derivations
• Is the following derivation leftmost or rightmost?
S ⇒ aS ⇒ aT ⇒ aU ⇒ acU ⇒ ac
– There’s at most one non-terminal in each sentential
form, so there's no choice between left or right
non-terminals to expand

• How about the following derivation?
– S ⇒ SbS ⇒ SbSbS ⇒ SbabS ⇒ ababS ⇒ ababa
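
Both questions on this slide can be checked mechanically. The sketch below is not from the original slides: `expand_at` and `classify_derivation` are hypothetical helper names, symbols are single characters, and ε is the empty string. A step counts as leftmost when rewriting the leftmost non-terminal can produce the next form, and likewise for rightmost. The code does not otherwise validate the derivation.

```python
def expand_at(form, index, grammar):
    """All forms obtained by rewriting the non-terminal at position `index`."""
    return {form[:index] + rhs + form[index + 1:] for rhs in grammar[form[index]]}


def classify_derivation(steps, grammar):
    """Return (is_leftmost, is_rightmost) for a derivation given as a list of forms."""
    leftmost = rightmost = True
    for current, following in zip(steps, steps[1:]):
        positions = [i for i, symbol in enumerate(current) if symbol in grammar]
        if not positions:
            raise ValueError(f"{current!r} has no non-terminal to expand")
        leftmost = leftmost and following in expand_at(current, positions[0], grammar)
        rightmost = rightmost and following in expand_at(current, positions[-1], grammar)
    return leftmost, rightmost


CHAIN = {"S": ["aS", "T"], "T": ["bT", "U"], "U": ["cU", ""]}
SBS = {"S": ["a", "SbS"]}

# One non-terminal per form: the derivation is both leftmost and rightmost.
print(classify_derivation(["S", "aS", "aT", "aU", "acU", "ac"], CHAIN))
# The step SbSbS => SbabS rewrites the middle S, so it is neither.
print(classify_derivation(["S", "SbS", "SbSbS", "SbabS", "ababS", "ababa"], SBS))
```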

Multiple Leftmost Derivations
S → a | SbS
• Can we find more than one leftmost derivation?
A leftmost derivation:
S ⇒ SbS ⇒ abS ⇒ abSbS ⇒ ababS ⇒ ababa
Another leftmost derivation:
S ⇒ SbS ⇒ SbSbS ⇒ abSbS ⇒ ababS ⇒ ababa

Ambiguity
• A string is ambiguous for a grammar if it has
more than one parse tree
– Equivalent to more than one leftmost (or more than
one rightmost) derivation
• A grammar is ambiguous if it generates an
ambiguous string
– It can be hard to see this with manual inspection
• Exercise: can you create an unambiguous
grammar for S → a | SbS ?
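
Since ambiguity is equivalent to having more than one leftmost derivation, a small search can list them for a given string. This is a rough sketch, not the slides' method: `leftmost_derivations` is a hypothetical name, and the pruning assumes single-character symbols and a grammar without ε-productions or unit productions (true for S → a | SbS), so a form longer than the target can be discarded.

```python
def leftmost_derivations(grammar, start, target):
    """Return all leftmost derivations of `target`, each as a list of sentential forms."""
    found = []

    def extend(path):
        form = path[-1]
        index = next((i for i, s in enumerate(form) if s in grammar), None)
        if index is None:
            # Only terminals are left: keep the path if it spells the target.
            if form == target:
                found.append(path)
            return
        # Terminals to the left of the leftmost non-terminal can no longer change.
        if len(form) > len(target) or form[:index] != target[:index]:
            return
        for rhs in grammar[form[index]]:
            extend(path + [form[:index] + rhs + form[index + 1:]])

    extend([start])
    return found


GRAMMAR = {"S": ["a", "SbS"]}
for derivation in leftmost_derivations(GRAMMAR, "S", "ababa"):
    print(" => ".join(derivation))
```

For ababa the search finds exactly two leftmost derivations, so the string (and therefore the grammar) is ambiguous. For aba it finds only one.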

7
Are these Grammars Ambiguous?
(1) S → aS | T
T → bT | U
U → cU | ε

(2) S → T | T
T → Tx | Tx | x | x

(3) S → SS | () | (S)

Ambiguity of Grammar (Example 3)
• Two different parse trees for the same string: ()()()
• Two distinct leftmost derivations:
S ⇒ SS ⇒ SSS ⇒ ()SS ⇒ ()()S ⇒ ()()()
S ⇒ SS ⇒ ()S ⇒ ()SS ⇒ ()()S ⇒ ()()()

• We need unambiguous grammars to manage
programming language semantics
Tips for Designing Grammars
• Closures: use recursive productions to generate
an arbitrary number of symbols
A → xA | ε Zero or more x’s
A → yA | y One or more y’s
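
Each recursive production can be read as a recursive function that recognises the same strings. The sketch below is an illustration only: the function names are made up, and the comparison with Python's `re.fullmatch` is there to show that the two closures correspond to x* and y+.

```python
import re


def zero_or_more_x(s):
    """Recogniser mirroring A -> xA | ε."""
    if s == "":
        return True                                   # A -> ε
    return s[0] == "x" and zero_or_more_x(s[1:])      # A -> xA


def one_or_more_y(s):
    """Recogniser mirroring A -> yA | y."""
    if s == "y":
        return True                                   # A -> y
    return s[:1] == "y" and one_or_more_y(s[1:])      # A -> yA


for sample in ["", "x", "xxx", "xy", "y", "yyy"]:
    assert zero_or_more_x(sample) == bool(re.fullmatch(r"x*", sample))
    assert one_or_more_y(sample) == bool(re.fullmatch(r"y+", sample))
```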

Tips for Designing Grammars
• Concatenation: use separate non-terminals to
generate disjoint parts of a language, and then
combine in a production
G = S → AB
A → aA | ε
B → bB | ε
L(G) = a*b*

Tips for Designing Grammars (cont’d)
• Matching constructs: write productions which
generate strings from the middle
{a^n b^n | n ≥ 0} (not a regular language!)
S → aSb | ε
Example: S ⇒ aSb ⇒ aaSbb ⇒ aabb

{a^n b^{2n} | n ≥ 0}
S → aSbb | ε
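
Generating "from the middle" can be made concrete by replaying the derivation. In this sketch, `derive_matching` is a hypothetical helper that applies S → left S right exactly n times and then S → ε. It assumes that neither `left` nor `right` contains the character S.

```python
def derive_matching(n, left="a", right="b"):
    """Return the derivation of left^n right^n as a list of sentential forms."""
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", left + "S" + right)   # S -> left S right
        steps.append(form)
    steps.append(form.replace("S", ""))                # S -> ε
    return steps


print(" => ".join(derive_matching(2)))               # S => aSb => aaSbb => aabb
print(" => ".join(derive_matching(2, right="bb")))   # S => aSbb => aaSbbbb => aabbbb
```

The only non-terminal always sits between the a's and the b's, which is why the two counts stay in step.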

Tips for Designing Grammars (cont’d)
{a^n b^m | m ≥ 2n, n ≥ 0}
S → aSbb | B | ε
B → bB | b

The following grammar also works:
S → aSbb | B
B → bB | ε

How about the following?
S → aSbb | bS | ε
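
One way to compare the three grammars with the intended language is to enumerate every string they generate up to a small length. This is a bounded check and not a proof, and none of it comes from the slides: `language_up_to`, `spec` and the `slack` parameter are hypothetical. The pruning relies on these particular grammars never carrying more than one non-terminal in a sentential form.

```python
from collections import deque


def language_up_to(grammar, start, max_len, slack=2):
    """Return every terminal string of length <= max_len found by leftmost expansion."""
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        index = next((i for i, s in enumerate(form) if s in grammar), None)
        if index is None:
            words.add(form)
            continue
        for rhs in grammar[form[index]]:
            new_form = form[:index] + rhs + form[index + 1:]
            terminals = sum(1 for s in new_form if s not in grammar)
            # Drop forms that already hold too many terminals or symbols.
            if (terminals <= max_len and len(new_form) <= max_len + slack
                    and new_form not in seen):
                seen.add(new_form)
                queue.append(new_form)
    return words


def spec(max_len):
    """All strings a^n b^m with m >= 2n and total length <= max_len."""
    return {"a" * n + "b" * m
            for n in range(max_len + 1)
            for m in range(2 * n, max_len - n + 1)}


FIRST = {"S": ["aSbb", "B", ""], "B": ["bB", "b"]}
SECOND = {"S": ["aSbb", "B"], "B": ["bB", ""]}
THIRD = {"S": ["aSbb", "bS", ""]}

print(language_up_to(FIRST, "S", 6) == spec(6))          # True
print(language_up_to(SECOND, "S", 6) == spec(6))         # True
print(sorted(language_up_to(THIRD, "S", 4) - spec(4)))   # ['babb']
```

Up to length 6 the first two grammars agree with the specification. The third also produces babb (S ⇒ bS ⇒ baSbb ⇒ babb), where a b comes before an a, so it generates strings outside the language.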

Tips for Designing Grammars (cont’d)
{a^n b^m a^{n+m} | n ≥ 0, m ≥ 0}
Rewrite as a^n b^m a^m a^n, which now has matching
superscripts (two pairs)

Would this grammar work?
S → aSa | B
B → bBa | ba
– Doesn’t allow m = 0

Corrected:
S → aSa | B
B → bBa | ε
– The outer a^n ... a^n are generated first, then the inner b^m a^m
Tips for Designing Grammars (cont’d)
• Union: use separate nonterminals for each part
of the union and then combine

{ a^n (b^m | c^m) | m > n ≥ 0}

Can be rewritten as
{ a^n b^m | m > n ≥ 0} ∪
{ a^n c^m | m > n ≥ 0}
Tips for Designing Grammars (cont’d)
{ a^n b^m | m > n ≥ 0} ∪ { a^n c^m | m > n ≥ 0}
S → T | U
T → aTb | Tb | b    T generates the first set
U → aUc | Uc | c    U generates the second set

• What’s the parse tree for string abbb?
• Ambiguous!
Tips for Designing Grammars (cont’d)
{ a^n b^m | m > n ≥ 0} ∪ { a^n c^m | m > n ≥ 0}

Will this fix the ambiguity?
S → T | U
T → aTb | bT | b
U → aUc | cU | c

• It's not ambiguous, but it can generate invalid
strings such as babb
Tips for Designing Grammars (cont’d)
{ a^n b^m | m > n ≥ 0} ∪ { a^n c^m | m > n ≥ 0}

Unambiguous version
S → T | V
T → aTb | U
U → Ub | b
V → aVc | W
W → Wc | c
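
The three attempts at this union grammar can be compared by counting parse trees for a string. The counter below is a sketch under stated assumptions and is not something the slides provide. `count_parse_trees` is a hypothetical name, symbols are single characters, and the grammar must have no ε-productions and no cycles of unit productions. All three grammars here meet that condition. With no ε-productions every symbol covers at least one character, so left-recursive rules such as T → Tb always recurse on a shorter span.

```python
from functools import lru_cache


def count_parse_trees(grammar, start, text):
    """Count the parse trees of `text` rooted at `start`."""

    @lru_cache(maxsize=None)
    def count_symbol(symbol, i, j):
        if symbol not in grammar:                      # terminal
            return 1 if text[i:j] == symbol else 0
        return sum(count_sequence(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def count_sequence(rhs, i, j):
        if len(rhs) == 1:
            return count_symbol(rhs, i, j)
        total = 0
        # The first symbol covers text[i:k]; leave one character per remaining symbol.
        for k in range(i + 1, j - len(rhs) + 2):
            first = count_symbol(rhs[0], i, k)
            if first:
                total += first * count_sequence(rhs[1:], k, j)
        return total

    return count_symbol(start, 0, len(text))


AMBIGUOUS = {"S": ["T", "U"], "T": ["aTb", "Tb", "b"], "U": ["aUc", "Uc", "c"]}
ATTEMPT = {"S": ["T", "U"], "T": ["aTb", "bT", "b"], "U": ["aUc", "cU", "c"]}
UNAMBIGUOUS = {"S": ["T", "V"], "T": ["aTb", "U"], "U": ["Ub", "b"],
               "V": ["aVc", "W"], "W": ["Wc", "c"]}

for name, grammar in [("ambiguous", AMBIGUOUS), ("attempt", ATTEMPT),
                      ("unambiguous", UNAMBIGUOUS)]:
    print(name, count_parse_trees(grammar, "S", "abbb"),
          count_parse_trees(grammar, "S", "babb"))
```

A count of 2 for abbb confirms the ambiguity of the first grammar. A non-zero count for babb shows that the second attempt accepts an invalid string. The final version gives 1 for abbb and 0 for babb.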

CFGs for Languages
• Recall that our goal is to describe programming
languages with CFGs

• We had the following example which describes
limited arithmetic expressions
E → a | b | c | E+E | E-E | E*E | (E)

• What’s wrong with using this grammar?
– It’s ambiguous!
Example: a-b-c
E ⇒ E-E ⇒ a-E ⇒ a-E-E ⇒ a-b-E ⇒ a-b-c
– Corresponds to a-(b-c)

E ⇒ E-E ⇒ E-E-E ⇒ a-E-E ⇒ a-b-E ⇒ a-b-c
– Corresponds to (a-b)-c
The Issue: Associativity
• Ambiguity is bad here because if the compiler
needs to generate code for this expression, it
doesn’t know what the programmer intended

• So what do we mean when we write a-b-c?
– In mathematics, this only has one possible meaning
– It’s (a-b)-c, since subtraction is left-associative
– a-(b-c) would be the meaning if subtraction were
right-associative
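
The two groupings really do compute different results. In this sketch the parse trees are written as nested tuples, and the values chosen for a, b and c are arbitrary illustrations. The `evaluate` helper is likewise hypothetical.

```python
def evaluate(tree, env):
    """Evaluate a tree that is either a variable name or (left, "-", right)."""
    if isinstance(tree, str):
        return env[tree]
    left, _, right = tree
    return evaluate(left, env) - evaluate(right, env)


env = {"a": 10, "b": 4, "c": 3}             # made-up values
left_assoc = (("a", "-", "b"), "-", "c")    # (a-b)-c
right_assoc = ("a", "-", ("b", "-", "c"))   # a-(b-c)

print(evaluate(left_assoc, env))    # 3
print(evaluate(right_assoc, env))   # 9
```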

Another Example: If-Then-Else
<stmt> ::= <assignment> | <if-stmt> | ...
<if-stmt> ::= if (<expr>) <stmt> |
if (<expr>) <stmt> else <stmt>
– (Here <>’s are used to denote nonterminals and ::=
for productions)

• Consider the following program fragment:
if (x > y)
if (x < z)
a = 1;
else a = 2;
– Note: Ignore newlines
Parse Tree #1

• Else belongs to inner if


Parse Tree #2

• Else belongs to outer if
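
The two parse trees correspond to two different programs. Python's indentation makes each reading explicit, so the sketch below writes both out and runs them on the same inputs. The function names and the sample values are invented for the illustration.

```python
def else_with_inner_if(x, y, z, a):
    """Parse tree #1: the else pairs with the inner if."""
    if x > y:
        if x < z:
            a = 1
        else:
            a = 2
    return a


def else_with_outer_if(x, y, z, a):
    """Parse tree #2: the else pairs with the outer if."""
    if x > y:
        if x < z:
            a = 1
    else:
        a = 2
    return a


# With x <= y only the second reading reaches the else branch.
print(else_with_inner_if(1, 5, 0, a=0))   # 0
print(else_with_outer_if(1, 5, 0, a=0))   # 2
```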

CMSC 330
Fixing the Expression Grammar
• Idea: Require that the right operand of all of the
operators is not a bare expression
E → E+T | E-T | E*T | T
T → a | b | c | (E)

• Now there's only one parse tree for a-b-c
– Exercise: Give a derivation for the string a-(b-c)
What if We Wanted Right-Associativity?
• Left-recursive productions are used for
left-associative operators
• Right-recursive productions are used for
right-associative operators
• Left:
E → E+T | E-T | E*T | T
T → a | b | c | (E)
• Right:
E → T+E | T-E | T*E | T
T → a | b | c | (E)
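
The effect of the two grammars on grouping can be sketched with two tiny tree builders. Both are simplifications made up for this note: they accept only well-formed strings of single-letter operands separated by one-character operators, and they leave out parentheses. A function that called itself first, the way E → E+T reads, would never terminate, so the left-associative builder uses a loop instead. The right-recursive grammar maps directly onto recursion.

```python
def parse_left(text):
    """Group as in E -> E op T | T: operators associate to the left."""
    tree = text[0]
    for i in range(1, len(text), 2):
        tree = (tree, text[i], text[i + 1])    # the tree so far becomes the left child
    return tree


def parse_right(text):
    """Group as in E -> T op E | T: operators associate to the right."""
    if len(text) == 1:
        return text
    return (text[0], text[1], parse_right(text[2:]))   # the rest becomes the right child


print(parse_left("a-b-c"))    # (('a', '-', 'b'), '-', 'c')
print(parse_right("a-b-c"))   # ('a', '-', ('b', '-', 'c'))
```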
Parse Tree Shape
• The kind of recursion/associativity determines
the shape of the parse tree
[Figure: parse trees for left recursion and right recursion]

– Exercise: draw a parse tree for a-b-c in the prior
grammar in which subtraction is right-associative
A Different Problem
• How about the string a+b*c ?
E → E+T | E-T | E*T | T
T → a | b | c | (E)

• Doesn’t have correct precedence for *
– When a nonterminal has productions for several
operators, they effectively have the same precedence
• How can we fix this?
Final Expression Grammar
E → E+T | E-T | T      lowest precedence operators
T → T*P | P            higher precedence
P → a | b | c | (E)    highest precedence (parentheses)

• Each non-terminal represents a level of precedence
• At each level, operations are left-associative
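
A hand-written parser with one function per non-terminal shows how the three levels produce precedence. This is a sketch, not the slides' own code. The left-recursive rules are implemented as loops, because a function that called itself first would never terminate. The result is a simplified tree of nested tuples that drops the parentheses and the unit steps, not a full parse tree. The name `parse` and its inner helpers are hypothetical.

```python
def parse(text):
    """Parse `text` with the E/T/P grammar and return a nested-tuple tree."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_e():                      # E -> E+T | E-T | T
        tree = parse_t()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            tree = (tree, op, parse_t())
        return tree

    def parse_t():                      # T -> T*P | P
        tree = parse_p()
        while peek() == "*":
            advance()
            tree = (tree, "*", parse_p())
        return tree

    def parse_p():                      # P -> a | b | c | (E)
        symbol = peek()
        if symbol == "(":
            advance()
            tree = parse_e()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return tree
        if symbol in ("a", "b", "c"):
            advance()
            return symbol
        raise SyntaxError(f"unexpected {symbol!r} at position {pos}")

    tree = parse_e()
    if pos != len(text):
        raise SyntaxError(f"unexpected {text[pos]!r} at position {pos}")
    return tree


print(parse("a+b*c"))     # ('a', '+', ('b', '*', 'c'))
print(parse("a*b+c"))     # (('a', '*', 'b'), '+', 'c')
print(parse("a-b-c"))     # (('a', '-', 'b'), '-', 'c')
```

Because `parse_e` only ever combines complete results of `parse_t`, a multiplication is always grouped before the addition or subtraction around it.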

Final Expression Grammar
E → E+T | E-T | T
T → T*P | P
P → a | b | c | (E)

• Exercises:
– Construct the parse tree and the leftmost and rightmost derivations for
a+b*c a*(b+c) a*b+c a-b-c
– See what happens if you change the first set of
productions to E → E+T | E-T | T | P
– See what happens if you change the last set of
productions to P → a | b | c | E | (E)
