Chapter - Three: Syntax Analysis

Uploaded by

Fedasa Bote

Chapter – three

Syntax analysis

Outline
• Introduction
• Context free grammar (CFG)
• Derivation
• Parse tree
• Ambiguity
• Left recursion
• Left factoring
• Top-down parsing
  • Recursive Descent Parsing (RDP)
  • Non-recursive predictive parsing
    – FIRST and FOLLOW sets
    – Construction of a predictive parsing table
• LR(1) grammars
• Syntax error handling
• Error recovery in predictive parsing
• Panic mode error recovery strategy
• Bottom-up parsing (LR(k) parsing)
• Stack implementation of shift/reduce parsing
• Conflict during shift/reduce parsing
• LR parsers
• Constructing SLR parsing tables
• Canonical LR parsing
• LALR (Reading assignment)
• Yacc

Introduction
• Syntax: the way in which tokens are put together to form expressions, statements, or blocks of statements.
  – The rules governing the formation of statements in a programming language.
• Syntax analysis: checks whether the sequence of tokens generated by the lexical analyzer follows the grammatical rules of the programming language.
• Parsing: the process of analyzing the grammatical structure of a program's source code to determine its syntactic correctness and build a structured representation of it.
• The syntax of a programming language is usually given by the grammar rules of a context free grammar (CFG).

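Parser
[Figure: position of the parser in the compiler. The source program enters the lexical analyzer one character at a time ("get next char"); the lexical analyzer hands the syntax analyzer one token per "get next token" request; the syntax analyzer outputs the parse tree. Both phases use the symbol table, which contains a record for each identifier, and report lexical errors and syntax errors respectively.]
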
Introduction…
• The syntax analyzer (parser) checks whether a given source program satisfies the rules implied by a CFG.
  – If it does, the parser creates the parse tree of that program.
  – Otherwise, the parser reports error messages.
• A CFG:
  – gives a precise syntactic specification of a programming language;
  – can be converted directly into a parser by some tools (e.g. yacc).

Introduction…
• Parsers can be categorized into two groups:
  – Top-down parser: the parse tree is created top to bottom, starting from the root and working down to the leaves.
  – Bottom-up parser: the parse tree is created bottom to top, starting from the leaves and working up to the root.
• Both top-down and bottom-up parsers scan the input from left to right (one symbol at a time).
• Efficient top-down and bottom-up parsers can be implemented only for subclasses of context-free grammars:
  – LL for top-down parsing
  – LR for bottom-up parsing

Context free grammar (CFG)
• A context-free grammar (CFG) is a specification for the syntactic structure of a programming language.
• A context-free grammar is a 4-tuple G = (T, N, P, S) where
  – T is a finite set of terminals (a set of tokens)
  – N is a finite set of non-terminals (syntactic variables)
  – P is a finite set of productions of the form A → α, where A is a non-terminal and α is a string of terminals and non-terminals (possibly the empty string)
  – S ∈ N is a designated start symbol (one of the non-terminal symbols)

Example: grammar for simple arithmetic expressions

expression → expression + term
expression → expression - term
expression → term
term → term * factor
term → term / factor
term → factor
factor → ( expression )
factor → id

Terminal symbols: id + - * / ( )
Non-terminals: expression, term, factor
Start symbol: expression

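As a minimal sketch, the 4-tuple G = (T, N, P, S) for the arithmetic expression grammar above can be written down as plain Python data. The names `Grammar`, `is_well_formed`, and `EXPR_GRAMMAR` are illustrative assumptions and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # T
    nonterminals: frozenset   # N
    productions: tuple        # P: pairs (head, body); body is a tuple of symbols
    start: str                # S


def is_well_formed(g: Grammar) -> bool:
    """Check the side conditions of the 4-tuple definition."""
    symbols = g.terminals | g.nonterminals
    return (
        g.start in g.nonterminals                      # S is a non-terminal
        and not (g.terminals & g.nonterminals)         # T and N are disjoint
        and all(
            head in g.nonterminals and all(s in symbols for s in body)
            for head, body in g.productions
        )
    )


EXPR_GRAMMAR = Grammar(
    terminals=frozenset({"id", "+", "-", "*", "/", "(", ")"}),
    nonterminals=frozenset({"expression", "term", "factor"}),
    productions=(
        ("expression", ("expression", "+", "term")),
        ("expression", ("expression", "-", "term")),
        ("expression", ("term",)),
        ("term", ("term", "*", "factor")),
        ("term", ("term", "/", "factor")),
        ("term", ("factor",)),
        ("factor", ("(", "expression", ")")),
        ("factor", ("id",)),
    ),
    start="expression",
)

print(is_well_formed(EXPR_GRAMMAR))
```

An empty tuple `()` as a body would stand for an ε-production; this grammar has none.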
Notational Conventions Used
• Terminals:
  – Lowercase letters early in the alphabet, such as a, b, c.
  – Operator symbols such as +, *, and so on.
  – Punctuation symbols such as parentheses, comma, and so on.
  – The digits 0, 1, …, 9.
  – Boldface strings such as id or if, each of which represents a single terminal symbol.
• Non-terminals:
  – Uppercase letters early in the alphabet, such as A, B, C.
  – The letter S, which is usually the start symbol.
  – Lowercase, italic names such as expr or stmt.
  – Uppercase letters may be used to represent the non-terminals for the constructs; for example, expr, term, and factor are represented by E, T, and F.

Notational Conventions Used…
• Grammar symbols: uppercase letters late in the alphabet, such as X, Y, Z, that is, either non-terminals or terminals.
• Strings of terminals: lowercase letters late in the alphabet, mainly u, v, x, y ∈ T*.
• Strings of grammar symbols: lowercase Greek letters, such as α, β, γ ∈ (N ∪ T)*.
• A set of productions A → α1, A → α2, …, A → αk with a common head A (call them A-productions) may be written
  A → α1 | α2 | … | αk
  where α1, α2, …, αk are the alternatives for A.
• The head of the first production is the start symbol.

  E → E + T | E - T | T
  T → T * F | T / F | F
  F → ( E ) | id

Derivation
• A derivation is a sequence of replacements of structure names by choices on the right-hand sides of grammar rules.
• Example:
  E → E + E | E – E | E * E | E / E | -E
  E → ( E )
  E → id
• E => E + E means that E + E is derived from E:
  – we can replace E by E + E;
  – for this, we must have a production rule E → E + E in our grammar.
• E => E + E => id + E => id + id: such a sequence of replacements of non-terminal symbols is called a derivation of id + id from E.

Derivation…
• In general, the one-step derivation is defined by
  αAβ => αγβ if there is a production rule A → γ in our grammar,
  where α and β are arbitrary strings of terminal and non-terminal symbols.
• α1 => α2 => … => αn (αn is derived from α1, or α1 derives αn).
• At each derivation step, we can choose any of the non-terminals in the sentential form of G for the replacement.
• Reflexive-transitive closure =>* (zero or more steps)
• Positive closure =>+ (one or more steps)

Derivation…
• If we always choose the left-most non-terminal in each derivation step, the derivation is called a left-most derivation.
  Example: E => -E => -(E) => -(E+E) => -(id+E) => -(id+id)
• If we always choose the right-most non-terminal in each derivation step, the derivation is called a right-most derivation.
  Example: E => -E => -(E) => -(E+E) => -(E+id) => -(id+id)
• We will see that a top-down parser tries to find the left-most derivation of the given source program.
• We will see that a bottom-up parser tries to find the right-most derivation of the given source program in reverse order.

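The two derivation orders can be replayed mechanically. The sketch below applies a list of chosen production bodies either to the left-most or to the right-most non-terminal of the current sentential form; the names `PRODUCTIONS`, `derive`, and `show` are assumptions made for this illustration.

```python
# Productions of the example grammar E -> E+E | E-E | E*E | E/E | -E | (E) | id.
PRODUCTIONS = {
    "E": {("E", "+", "E"), ("E", "-", "E"), ("E", "*", "E"), ("E", "/", "E"),
          ("-", "E"), ("(", "E", ")"), ("id",)},
}


def derive(start, choices, leftmost=True):
    """Return the list of sentential forms obtained by applying each body in
    `choices` to the left-most (or right-most) non-terminal."""
    form = [start]
    steps = [form]
    for body in choices:
        positions = [i for i, s in enumerate(form) if s in PRODUCTIONS]
        if not positions:
            raise ValueError("no non-terminal left to expand")
        i = positions[0] if leftmost else positions[-1]
        if tuple(body) not in PRODUCTIONS[form[i]]:
            raise ValueError(f"{form[i]} -> {' '.join(body)} is not a production")
        form = form[:i] + list(body) + form[i + 1:]
        steps.append(form)
    return steps


def show(steps):
    return " => ".join("".join(form) for form in steps)


CHOICES = [("-", "E"), ("(", "E", ")"), ("E", "+", "E"), ("id",), ("id",)]
print(show(derive("E", CHOICES, leftmost=True)))
# E => -E => -(E) => -(E+E) => -(id+E) => -(id+id)
print(show(derive("E", CHOICES, leftmost=False)))
# E => -E => -(E) => -(E+E) => -(E+id) => -(id+id)
```

The same sequence of productions yields both derivations here; only the position at which each production is applied differs.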
Parse tree
• A parse tree is a graphical representation of a derivation.
  – It filters out the order in which productions are applied to replace non-terminals.
• A parse tree corresponding to a derivation is a labeled tree in which:
  – the interior nodes are labeled by non-terminals,
  – the leaf nodes are labeled by terminals, and
  – the children of each internal node represent the replacement of the associated non-terminal in one step of the derivation.

Parse tree and Derivation
Grammar: E → E + E | E * E | ( E ) | - E | id
Let's examine this derivation:
E => -E => -(E) => -(E + E) => -(id + id)
[Figure: five successive parse trees, one per sentential form above, growing from the single node E to the complete parse tree for -(id + id).]
This is a top-down derivation because we start building the parse tree at the top.

Exercise
a) Using the grammar below, draw a parse tree for the following string:
   ( ( id . id ) id ( id ) ( ( ) ) )
   S → E
   E → id
     | ( E . E )
     | ( L )
     | ( )
   L → L E
     | E
b) Give a rightmost derivation for the string given in (a).

Ambiguity
• A grammar that produces more than one parse tree for some sentence is called an ambiguous grammar. Equivalently, it
  – produces more than one leftmost derivation, or
  – more than one rightmost derivation for the same sentence.
• We should eliminate the ambiguity in the grammar during the design phase of the compiler.
  – An unambiguous grammar should be written to eliminate the ambiguity.
  – E.g. grammars that are ambiguous because of their operators can be disambiguated according to the precedence and associativity rules.

Ambiguity: Example
• Example: the arithmetic expression grammar
  E → E + E | E * E | ( E ) | id
  permits two distinct leftmost derivations for the sentence id + id * id:
  (a) E => E + E => id + E => id + E * E => id + id * E => id + id * id
  (b) E => E * E => E + E * E => id + E * E => id + id * E => id + id * id

Ambiguity: example
Grammar: E → E + E | E * E | ( E ) | - E | id
Construct a parse tree for the expression id + id * id.
[Figure: two parse trees built step by step. In the first, the root expands to E + E and the right operand expands to E * E, grouping the input as id + (id * id). In the second, the root expands to E * E and the left operand expands to E + E, grouping the input as (id + id) * id.]
Which parse tree is correct? According to the grammar, both are correct.
A grammar that produces more than one parse tree for some input sentence is said to be an ambiguous grammar.

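Ambiguity can be observed by counting parse trees. The following sketch counts the distinct parse trees of a token sequence under the grammar E → E + E | E * E | ( E ) | id by trying every operator as the root of the tree; the function name `count_parse_trees` is an assumption, and the unary minus alternative is left out to keep the example short.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Number of distinct parse trees for `tokens` under
    E -> E + E | E * E | ( E ) | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways E can derive tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0        # E -> id
        total = 0
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                # E -> ( E )
        for k in range(i + 1, j - 1):                   # k: root operator
            if tokens[k] in ("+", "*"):                 # E -> E + E | E * E
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("id + id * id".split()))       # 2
print(count_parse_trees("( id + id ) * id".split()))   # 1
```

A count greater than one for any sentence shows that the grammar is ambiguous; parentheses remove the ambiguity for that particular sentence but not from the grammar.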
Elimination of ambiguity
Precedence/Associativity
• These two derivations point out a problem with the grammar:
  – The grammar does not have a notion of precedence, or implied order of evaluation.
To add precedence
• Create a non-terminal for each level of precedence
• Isolate the corresponding part of the grammar
• Force the parser to recognize high-precedence subexpressions first
For algebraic expressions
• Multiplication and division first (level one)
• Subtraction and addition next (level two)
To add associativity
• Left-associative: the next-level (higher-precedence) non-terminal is placed last in the production.

Elimination of ambiguity
• To disambiguate the grammar
  E → E + E | E * E | ( E ) | id
  we can use the precedence of operators as follows:
  – * has higher precedence (left associative)
  – + has lower precedence (left associative)
• We get the following unambiguous grammar:
  E → E + T | T
  T → T * F | F
  F → ( E ) | id
  Example sentence: id + id * id

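Left Recursion
Consider the grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
A top-down parser might loop forever when parsing an expression using this grammar.
[Figure: successive parse trees in which the left-most E is repeatedly expanded by E → E + T, so the tree keeps growing down its left edge without consuming any input.]
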
Elimination of Left recursion
• A grammar is left recursive if it has a non-terminal A such that there is a derivation A =>+ Aα for some string α.
• Top-down parsing methods cannot handle left-recursive grammars, so a transformation that eliminates left recursion is needed.
• To eliminate immediate left recursion for a single pair of productions, A → Aα | β can be replaced by the non-left-recursive productions
  A → β A’
  A’ → α A’ | ε

Elimination of Left recursion…
This left-recursive grammar:
E → E + T | T
T → T * F | F
F → ( E ) | id
can be re-written to eliminate the immediate left recursion:
E → T E’
E’ → + T E’ | ε
T → F T’
T’ → * F T’ | ε
F → ( E ) | id

Elimination of Left recursion…
• In general, we can eliminate immediate left recursion from the A-productions by the following technique.
• First we group the A-productions as
  A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βn
  where no βi begins with A. Then we replace the A-productions by
  A → β1A’ | β2A’ | … | βnA’
  A’ → α1A’ | α2A’ | … | αmA’ | ε

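The grouping technique above translates almost directly into code. This is a minimal sketch, assuming productions are stored as tuples of symbols, the empty tuple `()` stands for ε, and the new non-terminal A’ is written with an ASCII apostrophe; the function name `eliminate_immediate` is hypothetical.

```python
def eliminate_immediate(head, bodies):
    """Rewrite A -> A a1 | ... | A am | b1 | ... | bn as
    A -> b1 A' | ... | bn A'   and   A' -> a1 A' | ... | am A' | epsilon.

    Returns a dict mapping each head to its list of bodies."""
    new_head = head + "'"   # assumes this name is not already in use
    alphas = [b[1:] for b in bodies if b and b[0] == head]
    betas = [b for b in bodies if not b or b[0] != head]
    if not alphas:                      # nothing to do
        return {head: list(bodies)}
    return {
        head: [beta + (new_head,) for beta in betas],
        new_head: [alpha + (new_head,) for alpha in alphas] + [()],
    }


print(eliminate_immediate("E", [("E", "+", "T"), ("T",)]))
# {'E': [('T', "E'")], "E'": [('+', 'T', "E'"), ()]}
print(eliminate_immediate("T", [("T", "*", "F"), ("F",)]))
# {'T': [('F', "T'")], "T'": [('*', 'F', "T'"), ()]}
```

Applied to the E- and T-productions of the expression grammar, this reproduces E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε.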
Eliminating left-recursion algorithm
• Arrange the non-terminals in some order A1, …, An

for i from 1 to n do {
    for j from 1 to i-1 do {
        replace each production of the form Ai → Ajγ
        by the productions Ai → α1γ | … | αkγ,
        where Aj → α1 | α2 | … | αk are all the current Aj-productions
    }
    eliminate the immediate left recursion among the Ai-productions
}

Example Left Recursion Elim.
Grammar:
A → BC | a
B → CA | Ab
C → AB | CC | a
Choose arrangement: A, B, C

i=1: nothing to do
     A → BC | a

i=2, j=1: B → CA | Ab
          B → CA | BCb | ab
   (imm)  B → CAB’ | abB’
          B’ → CbB’ | ε

i=3, j=1: C → AB | CC | a
          C → BCB | aB | CC | a

i=3, j=2: C → BCB | aB | CC | a
          C → CAB’CB | abB’CB | aB | CC | a
   (imm)  C → abB’CBC’ | aBC’ | aC’
          C’ → AB’CBC’ | CC’ | ε

Resulting grammar:
A → BC | a
B → CAB’ | abB’
B’ → CbB’ | ε
C → abB’CBC’ | aBC’ | aC’
C’ → AB’CBC’ | CC’ | ε

Eliminating left-recursion (more)
• Example: given
  S → Aa | b
  A → Ac | Sd | ε
• Substitute the S-productions in A → Sd to obtain the following productions:
  A → Ac | Aad | bd | ε
• Eliminating the immediate left recursion among the A-productions yields the following grammar:
  S → Aa | b
  A → bdA’ | A’
  A’ → cA’ | adA’ | ε

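The ordering algorithm can be sketched in Python and checked against the example S → Aa | b, A → Ac | Sd | ε. The data layout (a dict from heads to lists of symbol tuples, `()` for ε, ASCII apostrophes for primes) and both function names are assumptions of this sketch. In general the algorithm is only guaranteed to work for grammars without cycles or ε-productions, although it happens to succeed on this example.

```python
def eliminate_immediate(head, bodies):
    """A -> A a | b  becomes  A -> b A',  A' -> a A' | epsilon."""
    new_head = head + "'"
    alphas = [b[1:] for b in bodies if b and b[0] == head]
    betas = [b for b in bodies if not b or b[0] != head]
    if not alphas:
        return {head: list(bodies)}
    return {
        head: [beta + (new_head,) for beta in betas],
        new_head: [alpha + (new_head,) for alpha in alphas] + [()],
    }


def eliminate_left_recursion(grammar, order):
    """grammar: dict head -> list of bodies; order: the arrangement A1..An."""
    g = {head: list(bodies) for head, bodies in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for body in g[ai]:
                if body and body[0] == aj:
                    # Replace Ai -> Aj gamma by Ai -> alpha gamma
                    # for every current Aj-production Aj -> alpha.
                    expanded += [alpha + body[1:] for alpha in g[aj]]
                else:
                    expanded.append(body)
            g[ai] = expanded
        g.update(eliminate_immediate(ai, g[ai]))
    return g


GRAMMAR = {
    "S": [("A", "a"), ("b",)],
    "A": [("A", "c"), ("S", "d"), ()],
}
result = eliminate_left_recursion(GRAMMAR, ["S", "A"])
for head, bodies in result.items():
    print(head, "->", " | ".join(" ".join(b) or "ε" for b in bodies))
```

For the arrangement S, A this prints S → A a | b, A → b d A' | A', and A' → c A' | a d A' | ε, matching the grammar derived by hand.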
Left factoring
• When a non-terminal has two or more productions whose right-hand sides start with the same grammar symbols, the grammar is not LL(1) and cannot be used for predictive parsing.
• A predictive parser (a top-down parser without backtracking) insists that the grammar must be left-factored.
• In general: A → αβ1 | αβ2, where α is non-empty and the first symbols of β1 and β2 are different.

Left factoring…
• When processing α we do not know whether to expand A to αβ1 or to αβ2, but if we re-write the grammar as
  A → αA’
  A’ → β1 | β2
  we can immediately expand A to αA’.
• Example: given the following grammar:
  S → iEtS | iEtSeS | a
  E → b
• Left factored, this grammar becomes:
  S → iEtSS’ | a
  S’ → eS | ε
  E → b

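One round of left factoring can be sketched as follows: find the longest prefix shared by at least two alternatives and move the differing tails into a new non-terminal. The names `common_prefix` and `left_factor_once` are assumptions, and a full implementation would repeat the step until no two alternatives share a prefix.

```python
def common_prefix(a, b):
    """Longest common prefix of two symbol tuples."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor_once(head, bodies):
    """Factor out the longest prefix shared by at least two alternatives.

    Bodies are tuples of symbols; () stands for the empty string."""
    best = ()
    for i, a in enumerate(bodies):
        for b in bodies[i + 1:]:
            prefix = common_prefix(a, b)
            if len(prefix) > len(best):
                best = prefix
    if not best:                                  # already left-factored
        return {head: list(bodies)}
    new_head = head + "'"                         # assumed to be unused
    with_prefix = [b for b in bodies if b[:len(best)] == best]
    others = [b for b in bodies if b[:len(best)] != best]
    return {
        head: [best + (new_head,)] + others,
        new_head: [b[len(best):] for b in with_prefix],
    }


S_BODIES = [("i", "E", "t", "S"), ("i", "E", "t", "S", "e", "S"), ("a",)]
print(left_factor_once("S", S_BODIES))
# {'S': [('i', 'E', 't', 'S', "S'"), ('a',)], "S'": [(), ('e', 'S')]}
```

The result corresponds to S → iEtSS’ | a and S’ → ε | eS, the same grammar as above with the alternatives of S’ listed in a different order.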
Left factoring…
The following grammar:
  stmt → if expr then stmt else stmt
       | if expr then stmt
cannot be parsed by a predictive parser that looks one element ahead.
But the grammar can be re-written:
  stmt → if expr then stmt stmt’
  stmt’ → else stmt | ε
where ε is the empty string.
Rewriting a grammar to eliminate multiple productions starting with the same token is called left factoring.

Syntax analysis (Parsing)
• Every language has rules that prescribe the syntactic structure of well-formed programs.
• The syntax can be described using context-free grammar (CFG) notation.
• The use of CFGs has several advantages:
  – it helps in identifying ambiguities;
  – a grammar gives a precise yet easy-to-understand syntactic specification of a programming language;
  – a tool can automatically produce a parser from the grammar;
  – a properly designed grammar makes it easy to modify the parser when the language changes.

Top-down parsing
Recursive Descent Parsing (RDP)
• This method of top-down parsing can be considered as an attempt to find the left-most derivation for an input string.
• It may involve backtracking.
• To construct the parse tree using RDP:
  – we create a one-node tree consisting of S;
  – two pointers, one for the tree and one for the input, are used to indicate where the parsing process is;
  – initially, they are on S and the first input symbol, respectively;
  – then we use the first S-production to expand the tree. The tree pointer is positioned on the left-most symbol of the newly created sub-tree.

Recursive Descent Parsing (RDP)…
• If the symbol pointed to by the tree pointer matches the symbol pointed to by the input pointer, both pointers are moved to the right.
• Whenever the tree pointer points to a non-terminal, we expand it using the first production of that non-terminal.
• Whenever the pointers point to different terminals, the production that was used is not correct, so another production should be used. We go back to the step just before we replaced the non-terminal and use another production.
• If we reach the end of the input and the tree pointer has passed the last symbol of the tree, we have finished parsing.

RDP…
• Example: G:
  S → cAd
  A → ab | a
  Draw the parse tree for the input string cad using the above method.
• Exercise: consider the following grammar:
  S → A
  A → A + A | B++
  B → y
  Draw the parse tree for the input "y+++y++".

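Recursive descent with backtracking can be sketched for the example grammar S → cAd, A → ab | a. The sketch uses Python generators so that a failed alternative is abandoned and the next one is tried automatically; the names `parse`, `parse_body`, and `recognize` and the tuple-based tree format are assumptions. Like the method described above, it would recurse forever on a left-recursive grammar such as the one in the exercise.

```python
GRAMMAR = {
    "S": [("c", "A", "d")],
    "A": [("a", "b"), ("a",)],     # alternatives are tried in this order
}


def parse(symbol, tokens, pos):
    """Yield (tree, next_pos) for every way `symbol` can match tokens[pos:]."""
    if symbol not in GRAMMAR:                         # terminal
        if pos < len(tokens) and tokens[pos] == symbol:
            yield symbol, pos + 1
        return
    for body in GRAMMAR[symbol]:
        yield from parse_body(symbol, body, 0, tokens, pos, [])


def parse_body(head, body, k, tokens, pos, children):
    """Match body[k:] starting at `pos`, collecting the sub-trees."""
    if k == len(body):
        yield (head, children), pos
        return
    for child, nxt in parse(body[k], tokens, pos):
        yield from parse_body(head, body, k + 1, tokens, nxt, children + [child])


def recognize(tokens, start="S"):
    """Return the first parse tree that consumes all tokens, or None."""
    for tree, end in parse(start, tokens, 0):
        if end == len(tokens):
            return tree
    return None


print(recognize(list("cad")))
# ('S', ['c', ('A', ['a']), 'd'])
```

For the input cad the parser first tries A → ab, fails to match b against d, backtracks, and succeeds with A → a.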
Exercise
• Using the grammar below, draw a parse tree for the following string using the RDP algorithm:
  ( ( id . id ) id ( id ) ( ( ) ) )
  S → E
  E → id
    | ( E . E )
    | ( L )
    | ( )
  L → L E
    | E

Non-recursive predictive parsing
• It is possible to build a non-recursive parser by explicitly maintaining a stack.
• This method uses a parsing table that determines the next production to be applied.
[Figure: model of a table-driven predictive parser. The predictive parsing program reads the INPUT buffer (id + id * id $), manipulates the STACK (initially E on top of $), consults the PARSING TABLE, and writes the OUTPUT. Its three cases are x = a = $, x = a ≠ $, and X is a non-terminal. The parsing table for the expression grammar is given in full under "A Predictive Parser table".]

Non-recursive predictive parsing…
• The input buffer contains the string to be parsed followed by $ (the right end marker).
• The stack contains a sequence of grammar symbols with $ at the bottom.
• Initially, the stack contains the start symbol of the grammar on top of $.
• The parsing table is a two-dimensional array M[A, a], where A is a non-terminal of the grammar and a is a terminal or $.
• The parser program behaves as follows. It always considers
  – X, the symbol on top of the stack, and
  – a, the current input symbol.

Predictive Parsing…
• There are three possibilities:
  1. X = a = $: the parser halts and announces successful completion of parsing.
  2. X = a ≠ $: the parser pops X off the stack and advances the input pointer to the next symbol.
  3. X is a non-terminal: the program consults entry M[X, a], which can be an X-production or an error entry.
     – If M[X, a] = {X → uvw}, X on top of the stack is replaced by uvw (with u at the top of the stack).
     – As an output, any code associated with the X-production can be executed.
     – If M[X, a] = error, the parser calls the error recovery method.

Predictive Parsing algorithm
Here w is the input string and a denotes the symbol currently pointed to by ip.

set ip to point to the first symbol of w;
set X to the top stack symbol;
while ( X ≠ $ ) { /* stack is not empty */
    if ( X is a ) pop the stack and advance ip;
    else if ( X is a terminal ) error();
    else if ( M[X, a] is an error entry ) error();
    else if ( M[X, a] = X → Y1Y2 … Yk ) {
        output the production X → Y1Y2 … Yk;
        pop the stack;
        push Yk, Yk-1, …, Y1 onto the stack, with Y1 on top;
    }
    set X to the top stack symbol;
}

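The algorithm above can be sketched in Python for the grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → ( E ) | id. The table is entered by hand as a dict, `()` stands for ε, and primes are written with ASCII apostrophes; the names `TABLE`, `predictive_parse`, and `accepts` are assumptions. One check is added that the pseudocode leaves implicit: after the stack is reduced to $, the remaining input must also be $.

```python
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

# M[X, a] -> body of the production to apply; missing keys are error entries.
TABLE = {
    ("E", "id"): ("T", "E'"), ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", ")"): (), ("E'", "$"): (),
    ("T", "id"): ("F", "T'"), ("T", "("): ("F", "T'"),
    ("T'", "+"): (), ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): (), ("T'", "$"): (),
    ("F", "id"): ("id",), ("F", "("): ("(", "E", ")"),
}


def predictive_parse(tokens, start="E"):
    """Return the productions used, in order (a left-most derivation).
    Raises SyntaxError on an error entry or a terminal mismatch."""
    inp = list(tokens) + ["$"]
    ip = 0
    stack = ["$", start]               # the top of the stack is the list end
    output = []
    while stack[-1] != "$":
        x, a = stack[-1], inp[ip]
        if x == a:                     # matching terminal: pop and advance
            stack.pop()
            ip += 1
        elif x not in NONTERMINALS:
            raise SyntaxError(f"expected {x!r} but found {a!r}")
        elif (x, a) not in TABLE:
            raise SyntaxError(f"no entry M[{x}, {a}]")
        else:
            body = TABLE[(x, a)]
            output.append((x, body))
            stack.pop()
            stack.extend(reversed(body))   # leaves Y1 on top
    if inp[ip] != "$":                 # added check: all input consumed
        raise SyntaxError(f"unexpected {inp[ip]!r} after end of expression")
    return output


def accepts(tokens):
    try:
        predictive_parse(tokens)
        return True
    except SyntaxError:
        return False


for head, body in predictive_parse("id + id * id".split()):
    print(head, "->", " ".join(body) or "ε")
```

For id + id * id the parser outputs eleven productions, starting with E → T E' and ending with E' → ε.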
A Predictive Parser table
Grammar:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id

Parsing table:
NON-TERMINAL | id       | +          | *          | (        | )       | $
E            | E → TE’  |            |            | E → TE’  |         |
E’           |          | E’ → +TE’  |            |          | E’ → ε  | E’ → ε
T            | T → FT’  |            |            | T → FT’  |         |
T’           |          | T’ → ε     | T’ → *FT’  |          | T’ → ε  | T’ → ε
F            | F → id   |            |            | F → (E)  |         |

Predictive Parsing Simulation
INPUT: id + id * id $
The predictive parsing program uses the parsing table given under "A Predictive Parser table". In the trace below the top of the stack is on the left, and each row shows the action taken in that configuration.

STACK          | INPUT           | OUTPUT
E $            | id + id * id $  | E → TE’
T E’ $         | id + id * id $  | T → FT’
F T’ E’ $      | id + id * id $  | F → id
id T’ E’ $     | id + id * id $  | match id
T’ E’ $        | + id * id $     | T’ → ε
E’ $           | + id * id $     | E’ → +TE’
+ T E’ $       | + id * id $     | match +
T E’ $         | id * id $       | T → FT’
F T’ E’ $      | id * id $       | F → id
id T’ E’ $     | id * id $       | match id
T’ E’ $        | * id $          | T’ → *FT’
* F T’ E’ $    | * id $          | match *
F T’ E’ $      | id $            | F → id
id T’ E’ $     | id $            | match id
T’ E’ $        | $               | T’ → ε
E’ $           | $               | E’ → ε
$              | $               | accept

When Top(Stack) = input = $, the parser halts and accepts the input string.

Non-recursive predictive parsing…
• Example: G:
  E → TR
  R → +TR
  R → -TR
  R → ε
  T → 0 | 1 | … | 9
  Input: 1+2

X \ a | 0      | 1      | … | 9      | +       | -       | $
E     | E → TR | E → TR | … | E → TR | Error   | Error   | Error
R     | Error  | Error  | … | Error  | R → +TR | R → -TR | R → ε
T     | T → 0  | T → 1  | … | T → 9  | Error   | Error   | Error

FIRST and FOLLOW
• The construction of both top-down and bottom-up parsers is aided by two functions, FIRST and FOLLOW, associated with a grammar G.
• During top-down parsing, FIRST and FOLLOW allow us to choose which production to apply, based on the next input symbol.
• During panic-mode error recovery, sets of tokens produced by FOLLOW can be used as synchronizing tokens.

FIRST and FOLLOW
• We need to build a FIRST set and a FOLLOW set for each symbol in the grammar.
• The elements of FIRST and FOLLOW are terminal symbols (FIRST may also contain ε, and FOLLOW may contain the end marker $).
• FIRST(α) is the set of terminal symbols that can begin any string derived from α.
• FOLLOW(α) is the set of terminal symbols that can follow α:
  t ∈ FOLLOW(α) ⇔ ∃ derivation containing αt

Construction of a predictive parsing table
• Makes use of two functions: FIRST and FOLLOW.
FIRST
• FIRST(α) = set of terminals that begin the strings derived from α.
• If α derives ε in zero or more steps (α =>* ε), ε is in FIRST(α).
• FIRST(X), where X is a grammar symbol, can be found using the following rules:
  1- If X is a terminal, then FIRST(X) = {X}
  2- If X is a non-terminal: two cases
     a) If X → ε is a production, then add ε to FIRST(X)
     b) For each production X → Y1Y2…Yk, place a in FIRST(X) if for some i, a ∈ FIRST(Yi) and ε ∈ FIRST(Yj) for all 1 ≤ j < i.
        If ε ∈ FIRST(Yj) for all j = 1, …, k, then ε ∈ FIRST(X).
• For any string Y = X1X2…Xn:
  a- Add all non-ε symbols of FIRST(X1) to FIRST(Y)
  b- Add all non-ε symbols of FIRST(Xi), for i ≠ 1, if ε ∈ FIRST(Xj) for all j < i
  c- ε ∈ FIRST(Y) if ε ∈ FIRST(Xi) for all i

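The FIRST rules can be applied repeatedly until no set changes. The sketch below does this for the expression grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → ( E ) | id; the dict layout, the use of `()` for an ε-body, and the names `first_of_string` and `compute_first` are assumptions of this illustration.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}


def first_of_string(symbols, first):
    """FIRST of a string X1 X2 ... Xn, given the current FIRST sets of the
    non-terminals (rules a, b, c for strings)."""
    result = set()
    for x in symbols:
        fx = first.get(x, {x})          # a terminal's FIRST set is itself
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)                     # every Xi can derive ε (or n = 0)
    return result


def compute_first(grammar):
    """Iterate the rules until a fixed point is reached."""
    first = {head: set() for head in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


for head, symbols in compute_first(GRAMMAR).items():
    print(f"FIRST({head}) =", sorted(symbols))
```

Iterating to a fixed point avoids having to order the non-terminals by dependency; each pass can only add symbols, so the loop terminates.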
Construction of a predictive parsing table…
FOLLOW
• FOLLOW(A) = set of terminals that can appear immediately to the right of A in some sentential form.
  1- Place $ in FOLLOW(A), where A is the start symbol.
  2- If there is a production B → αAβ, then everything in FIRST(β), except ε, should be added to FOLLOW(A).
  3- If there is a production B → αA, or B → αAβ with ε ∈ FIRST(β), then all elements of FOLLOW(B) should be added to FOLLOW(A).

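The three FOLLOW rules can likewise be iterated to a fixed point. In this sketch the FIRST sets of the expression grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → ( E ) | id are entered by hand rather than computed; the names `FIRST`, `first_of_string`, and `compute_follow` are assumptions.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}

# FIRST sets of the non-terminals, entered by hand for this grammar.
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}


def first_of_string(symbols):
    """FIRST of a string of grammar symbols; FIRST of the empty string is {ε}."""
    result = set()
    for x in symbols:
        fx = FIRST.get(x, {x})          # a terminal's FIRST set is itself
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    follow = {head: set() for head in grammar}
    follow[start].add(END)                            # rule 1
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():          # production B -> α A β
            for body in bodies:
                for i, a in enumerate(body):
                    if a not in grammar:              # only non-terminals
                        continue
                    f_beta = first_of_string(body[i + 1:])
                    new = f_beta - {EPS}              # rule 2
                    if EPS in f_beta:                 # rule 3 (β empty or nullable)
                        new |= follow[head]
                    if not new <= follow[a]:
                        follow[a] |= new
                        changed = True
    return follow


for head, symbols in compute_follow(GRAMMAR, "E").items():
    print(f"FOLLOW({head}) =", sorted(symbols))
```

Rule 3 covers both of its cases at once because FIRST of an empty β is {ε}.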
Rules to Create FIRST
GRAMMAR:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id

FIRST rules:
1. If X is a terminal, FIRST(X) = {X}
2. If X → ε, then ε ∈ FIRST(X)
3. If X → Y1Y2…Yk and Y1…Yi-1 =>* ε and a ∈ FIRST(Yi), then a ∈ FIRST(X)

SETS:
FIRST(id) = {id}
FIRST(*) = {*}
FIRST(+) = {+}
FIRST(() = {(}
FIRST()) = {)}
FIRST(E’) = {+, ε}
FIRST(T’) = {*, ε}
FIRST(F) = {(, id}
FIRST(T) = FIRST(F) = {(, id}
FIRST(E) = FIRST(T) = {(, id}

Rules to Create FOLLOW
GRAMMAR:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id

FIRST sets:
FIRST(E’) = {+, ε}
FIRST(T’) = {*, ε}
FIRST(F) = {(, id}
FIRST(T) = {(, id}
FIRST(E) = {(, id}

FOLLOW rules:
1. If S is the start symbol, then $ ∈ FOLLOW(S)
2. If A → αBβ, and a ∈ FIRST(β) and a ≠ ε, then a ∈ FOLLOW(B)
3. If A → αB and a ∈ FOLLOW(A), then a ∈ FOLLOW(B)
3a. If A → αBβ and β =>* ε and a ∈ FOLLOW(A), then a ∈ FOLLOW(B)
A and B are non-terminals; α and β are strings of grammar symbols.

SETS:
FOLLOW(E) = {), $}          ($ by rule 1; ) by rule 2 on F → ( E ))
FOLLOW(E’) = {), $}         (FOLLOW(E) by rule 3 on E → TE’)
FOLLOW(T) = {+, ), $}       (+ by rule 2 on E → TE’; FOLLOW(E) by rule 3a, since E’ =>* ε)
FOLLOW(T’) = {+, ), $}      (FOLLOW(T) by rule 3 on T → FT’)
FOLLOW(F) = {+, *, ), $}    (* by rule 2 on T → FT’; FOLLOW(T) by rule 3a, since T’ =>* ε)

Exercise:
• Find FIRST and FOLLOW sets for the following grammar G:
  E → TR
  R → +TR
  R → -TR
  R → ε
  T → 0 | 1 | … | 9
  Answer:
  FIRST(E) = FIRST(T) = {0, 1, …, 9}
  FIRST(R) = {+, -, ε}
  FOLLOW(E) = {$}
  FOLLOW(T) = {+, -, $}
  FOLLOW(R) = {$}

Exercise…
• Consider the following grammar over the alphabet {g, h, i, b}:
  A → BCD
  B → bB | ε
  C → Cg | g | Ch | i
  D → AB | ε
  Fill in the table below with the FIRST and FOLLOW sets for the non-terminals in this grammar:

  Non-terminal | FIRST | FOLLOW
  A            |       |
  B            |       |
  C            |       |
  D            |       |

Construction of predictive parsing table
• Input: grammar G
• Output: parsing table M
• For each production of the form A → α of the grammar do:
  – For each terminal a in FIRST(α), add A → α to M[A, a]
  – If ε ∈ FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A)
  – If ε ∈ FIRST(α) and $ ∈ FOLLOW(A), add A → α to M[A, $]
• Make each undefined entry of M an error.

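The table construction can be sketched as a short loop over the productions. Here the FIRST and FOLLOW sets of the expression grammar are entered by hand, `()` stands for an ε-body, and a missing key in the resulting dict plays the role of an error entry; the names `build_table` and `first_of_string` are assumptions. Because $ is stored in the FOLLOW sets like any other symbol, the second and third bullet are handled by the same line.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}


def first_of_string(symbols):
    """FIRST of a string of grammar symbols; FIRST of the empty string is {ε}."""
    result = set()
    for x in symbols:
        fx = FIRST.get(x, {x})          # a terminal's FIRST set is itself
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    result.add(EPS)
    return result


def build_table(grammar, follow):
    """Return M as a dict {(A, a): body}; absent keys are error entries."""
    table = {}
    for head, bodies in grammar.items():
        for body in bodies:
            f_alpha = first_of_string(body)
            lookaheads = f_alpha - {EPS}
            if EPS in f_alpha:
                lookaheads |= follow[head]      # includes $ when present
            for a in lookaheads:
                if (head, a) in table:          # two productions in one cell
                    raise ValueError(f"conflict at M[{head}, {a}]: not LL(1)")
                table[(head, a)] = body
    return table


M = build_table(GRAMMAR, FOLLOW)
for (head, a), body in sorted(M.items()):
    print(f"M[{head}, {a}] = {head} -> {' '.join(body) or EPS}")
```

A cell that would receive two productions signals that the grammar is not LL(1); the sketch raises an error in that case instead of storing both.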
Rules to Build Parsing Table
GRAMMAR:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → ( E ) | id

FIRST SETS:
FIRST(E’) = {+, ε}
FIRST(T’) = {*, ε}
FIRST(F) = {(, id}
FIRST(T) = {(, id}
FIRST(E) = {(, id}

FOLLOW SETS:
FOLLOW(E) = {), $}
FOLLOW(E’) = {), $}
FOLLOW(T) = {+, ), $}
FOLLOW(T’) = {+, ), $}
FOLLOW(F) = {+, *, ), $}