Introduction To Parsing

The document discusses the fundamentals of parsing in the context of a computer science course (COMP 412) at Rice University. It covers the role of parsers, context-free grammars, derivations, and the importance of precedence in grammar design. The document also highlights the limitations of regular languages and the necessity of context-free grammars for certain language constructs.

Uploaded by

Dr. Biswapati Jana

COMP 412

FALL 2010

Introduction to Parsing

Comp 412

Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved.
Students enrolled in Comp 412 at Rice University have explicit permission to make
copies of these materials for their personal use.
Faculty from other educational institutions may use these materials for nonprofit
educational purposes, provided this copyright notice is preserved.
The Front End

[Figure: source code → Scanner → tokens → Parser → IR, with Errors as a side output]

Parser
• Checks the stream of words and their parts of speech (produced by the scanner) for grammatical correctness
• Determines if the input is syntactically well formed
• Guides checking at deeper levels than syntax
• Builds an IR representation of the code
Think of this chapter as the mathematics of diagramming sentences

Comp 412, Fall 2010 2

The Study of Parsing
The process of discovering a derivation for some sentence
• Need a mathematical model of syntax — a grammar G
• Need an algorithm for testing membership in L(G)
• Need to keep in mind that our goal is building parsers, not studying the mathematics of arbitrary languages

Roadmap for our study of parsing

1 Context-free grammars and derivations (Today)
2 Top-down parsing
— Generated LL(1) parsers & hand-coded recursive descent parsers
3 Bottom-up parsing (Lab 2)
— Generated LR(1) parsers

We will define “context free” today. I am just deferring the definition for a couple of slides.
Specifying Syntax with a Grammar
Context-free syntax is specified with a context-free grammar

SheepNoise → SheepNoise baa
           | baa

This CFG defines the set of noises sheep normally make

It is written in a variant of Backus–Naur form

Formally, a grammar is a four tuple, G = (S,N,T,P)

• S is the start symbol (set of strings in L(G))
• N is a set of nonterminal symbols (syntactic variables)
• T is a set of terminal symbols (words)
• P is a set of productions or rewrite rules (P : N → (N ∪ T)+ )
Example due to Dr. Scott K. Warren

From Lecture 4
Deriving Syntax
We can use the SheepNoise grammar to create sentences
— use the productions as rewriting rules

And so on ...

While this example is cute, it quickly runs out of intellectual steam ...
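The rewriting process can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the dictionary encoding of the grammar and the indexes 0 and 1 for the two SheepNoise productions are assumptions made for the example.

```python
# Hypothetical encoding of the SheepNoise grammar: each nonterminal maps to
# a list of right-hand sides, and each right-hand side is a list of symbols.
GRAMMAR = {"SheepNoise": [["SheepNoise", "baa"], ["baa"]]}

def derive(rule_choices, start="SheepNoise"):
    """Apply the chosen productions in order, always rewriting the leftmost
    nonterminal, and return every sentential form along the way."""
    form = [start]
    steps = [" ".join(form)]
    for choice in rule_choices:
        # Find the leftmost nonterminal in the current sentential form.
        pos = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form = form[:pos] + GRAMMAR[form[pos]][choice] + form[pos + 1:]
        steps.append(" ".join(form))
    return steps

# Production 0 twice, then production 1, derives "baa baa baa".
for step in derive([0, 0, 1]):
    print(step)
```

Each extra use of production 0 adds one more baa, which is why the grammar generates one or more occurrences of the word.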
Why Not Use Regular Languages & DFAs?
Not all languages are regular (RL’s ⊂ CFL’s ⊂ CSL’s)
You cannot construct DFA’s to recognize these languages
• L = { p^k q^k } (parenthesis languages)
• L = { w c w^r | w ∈ Σ* }
Neither of these is a regular language (so neither can be written as an RE)

To recognize these features requires an arbitrary amount of context (left or right …)
But, this issue is somewhat subtle. You can construct DFA’s for
• Strings with alternating 0’s and 1’s
( ε | 1 ) ( 01 )* ( ε | 0 )
• Strings with an even number of 0’s and 1’s
RE’s can count bounded sets and bounded differences
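A minimal sketch of the contrast, assuming Python's standard `re` module: the alternating-bits language is handled by an ordinary RE, while the p^k q^k recognizer below needs a counter that can grow without bound. The function names are hypothetical, and accepting the empty string (k = 0) is an assumption, since the slide does not say whether k may be zero.

```python
import re

# The RE ( ε | 1 ) ( 01 )* ( ε | 0 ) written in Python syntax.
ALTERNATING = re.compile(r"1?(01)*0?")

def alternates(s):
    """True if s is a string of strictly alternating 0's and 1's."""
    return ALTERNATING.fullmatch(s) is not None

def balanced(s, open_sym="p", close_sym="q"):
    """Recognize { p^k q^k } with an unbounded counter, the resource that a
    DFA with a fixed number of states does not have."""
    k = 0
    i = 0
    while i < len(s) and s[i] == open_sym:   # count the p's
        k += 1
        i += 1
    while i < len(s) and s[i] == close_sym:  # cancel one p per q
        k -= 1
        i += 1
    return i == len(s) and k == 0

print(alternates("10101"), alternates("0110"))
print(balanced("ppqq"), balanced("ppq"))
```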
Limits of Regular Languages
Advantages of Regular Expressions
• Simple & powerful notation for specifying patterns
• Automatic construction of fast recognizers
• Many kinds of syntax can be specified with REs
Example — a regular expression for arithmetic expressions
Term → [a-zA-Z] ([a-zA-Z] | [0-9])*
Op → + | - | * | /
Expr → ( Term Op )* Term
([a-zA-Z] ([a-zA-Z] | [0-9])* (+ | - | * | /))* [a-zA-Z] ([a-zA-Z] | [0-9])*
Of course, this would generate a DFA …
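The expression RE can be tried directly with Python's `re` module. This is a sketch under assumptions: the names `TERM`, `OP`, `EXPR` and `is_expr` are made up for the example, and whitespace between symbols is not handled.

```python
import re

TERM = r"[a-zA-Z][a-zA-Z0-9]*"   # Term: a letter followed by letters or digits
OP = r"[-+*/]"                   # Op:   + | - | * | /
EXPR = re.compile(rf"(?:{TERM}{OP})*{TERM}")   # Expr: ( Term Op )* Term

def is_expr(s):
    """True if the whole string matches the Expr pattern."""
    return EXPR.fullmatch(s) is not None

print(is_expr("a+b*c"))      # operators between terms are accepted
print(is_expr("(a+b)*c"))    # parentheses are outside what the RE describes
```

The second call returns False: the pattern has no way to require that each opening parenthesis is closed later.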

If REs are so useful … Why not use them for everything?

⇒ Cannot add parentheses, brackets, begin-end pairs, …


Context-free Grammars
What makes a grammar “context free”?

The SheepNoise grammar has a specific form:

SheepNoise → SheepNoise baa
           | baa

Productions have a single nonterminal on the left hand side, which makes it impossible to encode left or right context.
⇒ The grammar is context free.
A context-sensitive grammar can have ≥ 1 nonterminal on the left hand side.

Notice that L(SheepNoise) is actually a regular language: baa+ (one or more occurrences of the word baa)

Classic definition: any language that can be recognized by a push-down automaton is a context-free language.
A More Useful Grammar Than Sheep Noise
To explore the uses of CFGs, we need a more complex grammar

0  Expr → Expr Op Expr
1       | number
2       | id
3  Op   → +
4       | -
5       | *
6       | /

Rule  Sentential Form
—     Expr
0     Expr Op Expr
2     <id,x> Op Expr
4     <id,x> - Expr
0     <id,x> - Expr Op Expr
1     <id,x> - <num,2> Op Expr
5     <id,x> - <num,2> * Expr
2     <id,x> - <num,2> * <id,y>

• Such a sequence of rewrites is called a derivation

• Process of discovering a derivation is called parsing
We denote this derivation: Expr ⇒* id – num * id
Derivations
The point of parsing is to construct a derivation

• At each step, we choose a nonterminal to replace

• Different choices can lead to different derivations
Two derivations are of interest
• Leftmost derivation — replace leftmost NT at each step
• Rightmost derivation — replace rightmost NT at each step
These are the two systematic derivations
(We don’t care about randomly-ordered derivations!)

The example on the preceding slide was a leftmost derivation
• Of course, there is also a rightmost derivation
• Interestingly, it turns out to be different
Derivations
The point of parsing is to construct a derivation

A derivation consists of a series of rewrite steps

S ⇒ γ₀ ⇒ γ₁ ⇒ γ₂ ⇒ … ⇒ γₙ₋₁ ⇒ γₙ ⇒ sentence

• Each γᵢ is a sentential form

— If γ contains only terminal symbols, γ is a sentence in L(G)
— If γ contains 1 or more non-terminals, γ is a sentential form
• To get γᵢ from γᵢ₋₁, expand some NT A ∈ γᵢ₋₁ by using A → β
— Replace the occurrence of A ∈ γᵢ₋₁ with β to get γᵢ
— In a leftmost derivation, it would be the first NT A ∈ γᵢ₋₁

A left-sentential form occurs in a leftmost derivation

A right-sentential form occurs in a rightmost derivation
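A single rewrite step, and the choice between leftmost and rightmost expansion, can be sketched as follows. The production numbers follow the expression grammar used in this lecture; the `RULES` table, the function names, and the use of bare `id` and `num` in place of `<id,x>` style tokens are assumptions made for the example.

```python
# Productions of the simple expression grammar, keyed by rule number.
RULES = {
    0: ("Expr", ["Expr", "Op", "Expr"]),
    1: ("Expr", ["num"]),
    2: ("Expr", ["id"]),
    3: ("Op", ["+"]),
    4: ("Op", ["-"]),
    5: ("Op", ["*"]),
    6: ("Op", ["/"]),
}
NONTERMINALS = {"Expr", "Op"}

def step(form, rule, leftmost=True):
    """Rewrite the leftmost (or rightmost) nonterminal of `form` with `rule`."""
    lhs, rhs = RULES[rule]
    positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
    pos = positions[0] if leftmost else positions[-1]
    if form[pos] != lhs:
        raise ValueError(f"rule {rule} does not apply to {form[pos]}")
    return form[:pos] + rhs + form[pos + 1:]

def derive(rules, leftmost=True):
    """Replay a sequence of rule numbers starting from Expr."""
    form = ["Expr"]
    for rule in rules:
        form = step(form, rule, leftmost)
    return " ".join(form)

# Two different rule sequences, one leftmost and one rightmost.
print(derive([0, 2, 4, 0, 1, 5, 2], leftmost=True))
print(derive([0, 2, 5, 0, 1, 4, 2], leftmost=False))
```

Both calls end in the same sentence, id - num * id, even though the intermediate sentential forms differ.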


The Two Derivations for x – 2 * y

Leftmost derivation
Rule  Sentential Form
—     Expr
0     Expr Op Expr
2     <id,x> Op Expr
4     <id,x> - Expr
0     <id,x> - Expr Op Expr
1     <id,x> - <num,2> Op Expr
5     <id,x> - <num,2> * Expr
2     <id,x> - <num,2> * <id,y>

Rightmost derivation
Rule  Sentential Form
—     Expr
0     Expr Op Expr
2     Expr Op <id,y>
5     Expr * <id,y>
0     Expr Op Expr * <id,y>
1     Expr Op <num,2> * <id,y>
4     Expr - <num,2> * <id,y>
2     <id,x> - <num,2> * <id,y>

In both cases, Expr ⇒* id – num * id
• The two derivations produce different parse trees
• The parse trees imply different evaluation orders!
Derivations and Parse Trees
Leftmost derivation
Rule  Sentential Form
—     Expr
0     Expr Op Expr
2     <id,x> Op Expr
4     <id,x> - Expr
0     <id,x> - Expr Op Expr
1     <id,x> - <num,2> Op Expr
5     <id,x> - <num,2> * Expr
2     <id,x> - <num,2> * <id,y>

Parse tree
        G
        |
        E
     /  |  \
    E   Op  E
    |   |  /|\
    x   – E Op E
          |  | |
          2  * y

This evaluates as x – ( 2 * y )


Derivations and Parse Trees
Rightmost derivation
Rule  Sentential Form
—     Expr
0     Expr Op Expr
2     Expr Op <id,y>
5     Expr * <id,y>
0     Expr Op Expr * <id,y>
1     Expr Op <num,2> * <id,y>
4     Expr - <num,2> * <id,y>
2     <id,x> - <num,2> * <id,y>

Parse tree
           G
           |
           E
        /  |  \
       E   Op  E
      /|\   |  |
     E Op E *  y
     |  | |
     x  – 2

This evaluates as ( x – 2 ) * y

This ambiguity is NOT good
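To see why the two trees matter, here is a small sketch that evaluates each one. The nested-tuple encoding of the trees and the values x = 10, y = 3 are assumptions chosen for illustration.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(tree, env):
    """Evaluate a tree of the form (left, op, right); leaves are either
    identifier names looked up in env or numeric literals."""
    if isinstance(tree, tuple):
        left, op, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    return env.get(tree, tree)

env = {"x": 10, "y": 3}
leftmost_tree = ("x", "-", (2, "*", "y"))     # x - (2 * y)
rightmost_tree = (("x", "-", 2), "*", "y")    # (x - 2) * y

print(evaluate(leftmost_tree, env))    # 10 - 6 = 4
print(evaluate(rightmost_tree, env))   # 8 * 3 = 24
```

The same sentence yields 4 under one tree and 24 under the other, so the parser's choice of tree changes the meaning of the program.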

Derivations and Precedence

These two derivations point out a problem with the grammar:

It has no notion of precedence, or implied order of evaluation

To add precedence
• Create a nonterminal for each level of precedence
• Isolate the corresponding part of the grammar
• Force the parser to recognize high precedence subexpressions first

For algebraic expressions

• Parentheses first (level 1)
• Multiplication and division, next (level 2)
• Subtraction and addition, last (level 3)


Derivations and Precedence
Adding the standard algebraic precedence produces:
0  Goal   → Expr
1  Expr   → Expr + Term      (level 3)
2         | Expr - Term
3         | Term
4  Term   → Term * Factor    (level 2)
5         | Term / Factor
6         | Factor
7  Factor → ( Expr )         (level 1)
8         | number
9         | id
One form of the “classic expression grammar”

This grammar is slightly larger
• Takes more rewriting to reach some of the terminal symbols
• Encodes expected precedence
• Produces same parse tree under leftmost & rightmost derivations
• Correctness trumps the speed of the parser

Cannot handle precedence in an RE for expressions
Introduced parentheses, too (beyond power of an RE)

Let’s see how it parses x - 2 * y
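One way to watch the precedence levels at work is a small hand-coded recursive-descent parser. This is a sketch under assumptions, not the parser the course builds: the left-recursive productions are replaced by loops (a standard workaround for recursive descent), the tokenizer and all function names are hypothetical, and trees are returned as nested tuples.

```python
import re

TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z]\w*)|(.))")

def tokenize(text):
    """Split text into (kind, value) pairs; whitespace is skipped."""
    tokens = []
    for number, name, other in TOKEN.findall(text):
        if number:
            tokens.append(("number", int(number)))
        elif name:
            tokens.append(("id", name))
        elif other.strip():
            tokens.append((other, other))
    return tokens

def parse(text):
    tokens = tokenize(text) + [("eof", None)]
    pos = 0

    def peek():
        return tokens[pos][0]

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():      # Expr -> Expr + Term | Expr - Term | Term
        node = term()
        while peek() in ("+", "-"):
            op = advance()[0]
            node = (node, op, term())    # groups to the left
        return node

    def term():      # Term -> Term * Factor | Term / Factor | Factor
        node = factor()
        while peek() in ("*", "/"):
            op = advance()[0]
            node = (node, op, factor())
        return node

    def factor():    # Factor -> ( Expr ) | number | id
        kind, value = advance()
        if kind == "(":
            node = expr()
            if advance()[0] != ")":
                raise SyntaxError("expected ')'")
            return node
        if kind in ("number", "id"):
            return value
        raise SyntaxError(f"unexpected token {kind!r}")

    tree = expr()    # Goal -> Expr
    if peek() != "eof":
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse("x - 2 * y"))      # ('x', '-', (2, '*', 'y'))
print(parse("(x - 2) * y"))    # (('x', '-', 2), '*', 'y')
```

Because `term` is called from inside `expr`, a multiplication is always grouped before the surrounding subtraction unless parentheses say otherwise.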
Derivations and Precedence
The rightmost derivation
Rule  Sentential Form
—     Goal
0     Expr
2     Expr - Term
4     Expr - Term * Factor
9     Expr - Term * <id,y>
6     Expr - Factor * <id,y>
8     Expr - <num,2> * <id,y>
3     Term - <num,2> * <id,y>
6     Factor - <num,2> * <id,y>
9     <id,x> - <num,2> * <id,y>

Its parse tree
           G
           |
           E
        /  |  \
       E   –    T
       |      / | \
       T     T  *  F
       |     |     |
       F     F   <id,y>
       |     |
    <id,x> <num,2>

It derives x – ( 2 * y ), along with an appropriate parse tree.

Both the leftmost and rightmost derivations give the same expression, because the grammar directly and explicitly encodes the desired precedence.
Ambiguous Grammars
Let’s leap back to our original expression grammar.
It had other problems.
0  Expr → Expr Op Expr
1       | number
2       | id
3  Op   → +
4       | -
5       | *
6       | /

Rule  Sentential Form
—     Expr
0     Expr Op Expr
2     <id,x> Op Expr
4     <id,x> - Expr
0     <id,x> - Expr Op Expr
1     <id,x> - <num,2> Op Expr
5     <id,x> - <num,2> * Expr
2     <id,x> - <num,2> * <id,y>
Different choice than the first time

• This grammar allows multiple leftmost derivations for x - 2 * y
• Hard to automate derivation if > 1 choice
• The grammar is ambiguous
Two Leftmost Derivations for x – 2 * y
The Difference:
⇒ Different productions chosen on the second step

Original choice
Rule  Sentential Form
—     Expr
0     Expr Op Expr
2     <id,x> Op Expr
4     <id,x> - Expr
0     <id,x> - Expr Op Expr
1     <id,x> - <num,2> Op Expr
5     <id,x> - <num,2> * Expr
2     <id,x> - <num,2> * <id,y>

New choice
Rule  Sentential Form
—     Expr
0     Expr Op Expr
0     Expr Op Expr Op Expr
2     <id,x> Op Expr Op Expr
4     <id,x> - Expr Op Expr
1     <id,x> - <num,2> Op Expr
5     <id,x> - <num,2> * Expr
2     <id,x> - <num,2> * <id,y>

⇒ Both derivations succeed in producing x - 2 * y
Different choices in same situation, again
Remember nondeterminism?
Ambiguous Grammars
Definitions
• If a grammar has more than one leftmost derivation for
a single sentential form, the grammar is ambiguous
• If a grammar has more than one rightmost derivation
for a single sentential form, the grammar is ambiguous
• The leftmost and rightmost derivations for a sentential
form may differ, even in an unambiguous grammar
— However, they must have the same parse tree!
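Ambiguity of the simple expression grammar can be checked mechanically by counting parse trees, since each parse tree corresponds to exactly one leftmost derivation. The sketch below is an illustration under assumptions: `count_parse_trees` is a hypothetical helper, and every token that is not an operator is treated as a number or an id.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees for `tokens` under
    Expr -> Expr Op Expr | number | id,   Op -> + | - | * | /."""
    tokens = tuple(tokens)
    ops = {"+", "-", "*", "/"}

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways Expr can derive tokens[lo:hi].
        if hi - lo == 1:
            return 0 if tokens[lo] in ops else 1   # number or id
        total = 0
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in ops:                 # Expr Op Expr split here
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))

print(count_parse_trees(["x", "-", "2", "*", "y"]))   # 2
print(count_parse_trees(["x", "-", "2"]))             # 1
```

A count above 1 for any sentence is enough to show that the grammar is ambiguous.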

Classic example — the if-then-else problem

Stmt → if Expr then Stmt
     | if Expr then Stmt else Stmt
     | … other stmts …
This ambiguity is inherent in the grammar
Ambiguity
This sentential form has two derivations
if Expr1 then if Expr2 then Stmt1 else Stmt2

Part of the problem is that the structure built by the parser will determine the interpretation of the code, and these two forms have different meanings!

[Figure: the two parse trees, shown here in bracketed form]
production 2, then production 1 (else attached to the outer if):
    if E1 then [ if E2 then S1 ] else S2
production 1, then production 2 (else attached to the inner if):
    if E1 then [ if E2 then S1 else S2 ]

Ambiguity
Removing the ambiguity
• Must rewrite the grammar to avoid generating the problem
• Match each else to innermost unmatched if (common sense rule)

0  Stmt     → if Expr then Stmt
1           | if Expr then WithElse else Stmt
2           | Other Statements
3  WithElse → if Expr then WithElse else WithElse
4           | Other Statements

The grammar forces the structure to match the desired meaning.

Intuition: once into WithElse, we cannot generate an unmatched else
… a final if without an else can only come through rule 0 …
With this grammar, the example has only one rightmost derivation
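The "innermost unmatched if" rule can also be seen in a hand-written parser. The sketch below is a greedy recursive-descent routine, not a transcription of the WithElse grammar, though it produces the same structure for this example. The function name, the tuple encoding, and the simplification that expressions and non-if statements are single tokens are all assumptions.

```python
def parse_stmt(tokens, pos=0):
    """Parse one statement starting at tokens[pos]; return (tree, next_pos).
    Expressions and non-if statements are single tokens in this sketch."""
    if tokens[pos] != "if":
        return tokens[pos], pos + 1                # Other Statements
    cond = tokens[pos + 1]
    if tokens[pos + 2] != "then":
        raise SyntaxError("expected 'then'")
    then_part, pos = parse_stmt(tokens, pos + 3)
    if pos < len(tokens) and tokens[pos] == "else":
        # The innermost call reaches the else first, so it claims it.
        else_part, pos = parse_stmt(tokens, pos + 1)
        return ("if", cond, then_part, else_part), pos
    return ("if", cond, then_part), pos

tokens = "if E1 then if E2 then S1 else S2".split()
tree, _ = parse_stmt(tokens)
print(tree)   # ('if', 'E1', ('if', 'E2', 'S1', 'S2'))
```

The else ends up inside the inner if, which is the interpretation the rewritten grammar forces.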
Ambiguity
if Expr1 then if Expr2 then Stmt1 else Stmt2

Rule  Sentential Form
—     Stmt
0     if Expr then Stmt
1     if Expr then if Expr then WithElse else Stmt
2     if Expr then if Expr then WithElse else S2
4     if Expr then if Expr then S1 else S2
?     if Expr then if E2 then S1 else S2
?     if E1 then if E2 then S1 else S2
(? = other productions to derive Exprs)

This grammar has only one rightmost derivation for the example
Deeper Ambiguity
Ambiguity usually refers to confusion in the CFG
Overloading can create deeper ambiguity
a = f(17)
In many Algol-like languages, f could be either a function
or a subscripted variable

Disambiguating this one requires context

• Need values of declarations
• Really an issue of type, not context-free syntax
• Requires an extra-grammatical solution (not in CFG)
• Must handle these with a different mechanism
— Step outside grammar rather than use a more complex grammar
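A minimal sketch of such an extra-grammatical check, assuming the compiler has already collected declarations into a table: the dictionary standing in for a symbol table, the kind strings, and the function name are all hypothetical.

```python
def classify_reference(name, declarations):
    """Decide what `name(...)` means from declared kinds, not from syntax."""
    kind = declarations.get(name)
    if kind == "function":
        return "function call"
    if kind == "array":
        return "subscripted variable"
    raise LookupError(f"{name!r} is not declared as a function or an array")

# The same source text, a = f(17), under two different sets of declarations.
print(classify_reference("f", {"f": "function"}))
print(classify_reference("f", {"f": "array"}))
```

The parser accepts both readings with one syntactic form, and a later pass consults the declarations to pick the meaning.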


Ambiguity - the Final Word
Ambiguity arises from two distinct sources
• Confusion in the context-free syntax (if-then-else)
• Confusion that requires context to resolve (overloading)

Resolving ambiguity
• To remove context-free ambiguity, rewrite the grammar
• To handle context-sensitive ambiguity takes cooperation
— Knowledge of declarations, types, …
— Accept a superset of L(G) & check it by other means†
— This is a language design problem

Sometimes, the compiler writer accepts an ambiguous grammar
— Parsing techniques that “do the right thing”
— i.e., always select the same derivation

† See Chapter 4
