Compiler Construction CS-4207: Instructor Name: Atif Ishaq

This document summarizes a lecture on compiler construction and token recognition. It discusses recognizing tokens using regular expressions and finite state machines. Transition diagrams are used to recognize different types of tokens like identifiers, keywords, operators, and numbers. The lecture also covered how a lexical analyzer works by matching tokens, consulting symbol tables, and returning token values.

Uploaded by

Faisal Shehzad
Copyright
© All Rights Reserved

Compiler Construction

CS-4207

Instructor Name: Atif Ishaq


Lecture 6
Today's Lecture

• Recognition of Tokens
• Regular Expressions and FSMs
• Transition Diagram Construction

Recognition of Tokens : Transition Diagram

• A language defined by a grammar is a (possibly infinite) set of strings.
• An automaton is a device that determines, by reading a string (word) one character at a time, whether the string belongs to a specific language.
• A finite-state automaton (FSA, e.g. an NFA) is an automaton that recognizes regular languages, i.e. the languages described by regular expressions.
• It is the simplest kind of automaton: its only memory is its current state, which is an element of a finite set.

Recognition of Tokens : Transition Diagram

• Graphically, a finite-state automaton is represented by:
  ▪ A set of labeled states, drawn as nodes in a digraph
  ▪ Directed edges, each labeled with a character, drawn between states
  ▪ One or more states designated as terminal (accepting)
  ▪ One or more states designated as initial
• On reading a character a ∈ Σ, the automaton may move from state S1 to state S2 if there is an a-labeled edge from S1 to S2.
• A string belongs to the language if, by reading the whole string, the automaton can move from an initial state to an accepting state.
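The "may move" wording above can be made concrete by tracking the set of states the automaton could be in. The following is a minimal sketch, not taken from the lecture: the three-state NFA (strings over {a, b} that end in "ab"), the table `move_on`, and the function `nfa_accepts` are all hypothetical names and choices made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { NUM_STATES = 3, INITIAL = 0, ACCEPTING = 2 };

/* Hypothetical example NFA: accepts strings over {a, b} that end in "ab".
 * move_on[q][0] is the set of states reachable from q on 'a',
 * move_on[q][1] the set reachable on 'b'. Sets are bitmasks (bit q = state q). */
static const unsigned move_on[NUM_STATES][2] = {
    /* state 0 */ { (1u << 0) | (1u << 1), (1u << 0) },
    /* state 1 */ { 0u,                    (1u << 2) },
    /* state 2 */ { 0u,                    0u        },
};

/* Returns true if some path from the initial state to the accepting
 * state spells out exactly the string s. */
bool nfa_accepts(const char *s)
{
    assert(s != NULL);
    unsigned current = 1u << INITIAL;      /* set of states we may be in */
    for (; *s != '\0'; s++) {
        int sym;
        if (*s == 'a')      sym = 0;
        else if (*s == 'b') sym = 1;
        else return false;                 /* symbol outside the alphabet */

        unsigned next = 0u;
        for (int q = 0; q < NUM_STATES; q++)
            if (current & (1u << q))
                next |= move_on[q][sym];   /* follow every matching edge */
        current = next;
    }
    return (current & (1u << ACCEPTING)) != 0;
}
```

For example, `nfa_accepts("aab")` holds because one of the possible paths ends in the accepting state, while `nfa_accepts("abb")` does not.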

Recognition of Tokens : Transition Diagram

The following diagram is an NFA that recognizes the language of all strings over Σ = {a, b} that contain an even number of a's and an even number of b's.

[Diagram: automaton for even a's and b's]
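Since the diagram itself is not reproduced here, the automaton can be sketched as a transition table. This is one possible encoding, assumed for illustration: four states that record the parity of a's and b's seen so far, with state 0 serving as both the initial and the accepting state. The names `delta` and `even_as_and_bs` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* delta[state][symbol], where symbol 0 = 'a' and symbol 1 = 'b'.
 * State numbering is an assumption of this sketch. */
static const int delta[4][2] = {
    /* 0: even a's, even b's */ { 1, 2 },
    /* 1: odd  a's, even b's */ { 0, 3 },
    /* 2: even a's, odd  b's */ { 3, 0 },
    /* 3: odd  a's, odd  b's */ { 2, 1 },
};

/* Returns true if s contains an even number of a's and an even number of b's. */
bool even_as_and_bs(const char *s)
{
    assert(s != NULL);
    int state = 0;                          /* initial state */
    for (; *s != '\0'; s++) {
        if (*s == 'a')      state = delta[state][0];
        else if (*s == 'b') state = delta[state][1];
        else return false;                  /* symbol outside {a, b} */
    }
    return state == 0;                      /* state 0 is the only accepting state */
}
```

Every state has exactly one outgoing edge per symbol, so this particular automaton is deterministic, which is a special case of an NFA.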

Recognition of Tokens : Transition Diagram

relop → < | > | <= | >= | <> | =

id → letter ( letter | digit )*
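The two definitions above can be turned into small recognizers. The sketch below is an illustration under assumptions, not the lecture's code: the enum `RelOp` and the functions `match_relop` and `match_id` are hypothetical names, and "letter" and "digit" are taken to mean what `isalpha` and `isdigit` accept in the C locale. Each function returns the length of the longest lexeme matched at the start of the input, or 0 if none matches.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { RELOP_NONE, LT, LE, NE, EQ, GT, GE } RelOp;

/* Matches the longest relational operator at the start of s.
 * Stores which operator was found in *op and returns its length (0 if none). */
size_t match_relop(const char *s, RelOp *op)
{
    assert(s != NULL && op != NULL);
    *op = RELOP_NONE;
    switch (s[0]) {
    case '<':
        if (s[1] == '=') { *op = LE; return 2; }
        if (s[1] == '>') { *op = NE; return 2; }
        *op = LT;               /* the lookahead character is not consumed */
        return 1;
    case '=':
        *op = EQ;
        return 1;
    case '>':
        if (s[1] == '=') { *op = GE; return 2; }
        *op = GT;
        return 1;
    default:
        return 0;
    }
}

/* Matches letter (letter | digit)* at the start of s and returns its length. */
size_t match_id(const char *s)
{
    assert(s != NULL);
    if (!isalpha((unsigned char)s[0]))
        return 0;               /* an identifier must start with a letter */
    size_t n = 1;
    while (isalnum((unsigned char)s[n]))
        n++;
    return n;
}
```

Reading one character past `<` or `>` before deciding is what lets `<=` and `<>` be preferred over `<`; when the extra character does not fit, it is simply left for the next token.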

Recognition of Tokens : Transition Diagram

A transition diagram for unsigned digits

A transition diagram for white spaces
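The two diagrams named above are not reproduced in this text, so the following is only a sketch of what such diagrams typically encode. It assumes the unsigned diagram accepts one or more digits (whether fractions or exponents were included cannot be told from the slide) and that white space means blank, tab, or newline. The function names `match_digits` and `match_ws` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Assumed pattern: digit digit*  (returns the lexeme length, 0 if no digit). */
size_t match_digits(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (isdigit((unsigned char)s[n]))
        n++;                    /* loop on the digit-labeled self-edge */
    return n;
}

/* Assumed pattern: (blank | tab | newline)+  (returns the run length). */
size_t match_ws(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] == ' ' || s[n] == '\t' || s[n] == '\n')
        n++;
    return n;
}
```

A lexical analyzer would normally discard the white-space match and restart at the next character rather than return a token for it.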

What Else Does a Lexical Analyzer Do?

• All keywords / reserved words are first matched as ids
• After the match, the symbol table or a special keyword table is consulted
  ▪ The keyword table contains the string form of every keyword along with its associated token value
  ▪ When a match is found, the token is returned along with its value, e.g. "then", 16
  ▪ If no match is found, it is assumed that an id has been discovered

Keyword   Token value
if        15
then      16
begin     17
...       ...
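The keyword-table lookup described above could look roughly like this. It is a minimal sketch: only the three entries visible in the table are included (the rest is elided in the slide), and `TOK_ID`, `struct keyword`, and `classify_word` are hypothetical names, as is the value chosen for `TOK_ID`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOK_ID = 1 };            /* hypothetical token value for identifiers */

struct keyword {
    const char *lexeme;         /* string form of the keyword */
    int token;                  /* associated token value */
};

/* Only the entries shown in the table; the remaining keywords are elided there. */
static const struct keyword keywords[] = {
    { "if",    15 },
    { "then",  16 },
    { "begin", 17 },
};

/* Called after a lexeme has matched the id pattern:
 * returns the keyword's token value, or TOK_ID if it is not a keyword. */
int classify_word(const char *lexeme)
{
    assert(lexeme != NULL);
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i].lexeme) == 0)
            return keywords[i].token;
    return TOK_ID;              /* no match: an ordinary identifier */
}
```

A linear search is enough for a handful of keywords; a real table would more likely be hashed or pre-loaded into the symbol table.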
Transition Diagram : Code

token nexttoken()
{
    while (1) {
        switch (state) {
        case 0:
            c = nextchar();
            if (c == blank || c == tab || c == newline) {
                state = 0;
                lexeme_beginning++;
            }
            else if (c == '<') state = 1;
            else if (c == '=') state = 5;
            else if (c == '>') state = 6;
            else state = fail();
            break;
        case 1:
            …
        case 9:
            c = nextchar();
            if (isletter(c)) state = 10;
            else state = fail();
            break;
        case 10:
            c = nextchar();
            if (isletter(c)) state = 10;
            else if (isdigit(c)) state = 10;
            else state = 11;
            break;
        …
        }
    }
}

/* Decides the next start state to check */
int fail()
{
    forward = token_beginning;
    switch (start) {
    case 0:  start = 9;  break;
    case 9:  start = 12; break;
    case 12: start = 20; break;
    case 20: start = 25; break;
    case 25: recover(); break;
    default: /* error */ break;
    }
    return start;
}
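The fragment above depends on globals and helpers that the slide does not show. A self-contained sketch of the same idea follows, under stated assumptions: the `Scanner` and `Token` types, the token kinds, and the simplified set of three diagrams (relop at start state 0, id at 9, digits at 12) are hypothetical; it peeks at the next character instead of calling `nextchar()` and retracting; and skipping one character stands in for `recover()`.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum { T_EOF, T_RELOP, T_ID, T_NUM, T_ERROR } TokenKind;
typedef struct { TokenKind kind; size_t len; } Token;

typedef struct {
    const char *lexeme_beginning;  /* start of the token being scanned */
    const char *forward;           /* next character to examine */
    int start;                     /* start state of the diagram in use */
} Scanner;

/* Rewinds the input and picks the start state of the next diagram to try. */
static int fail(Scanner *sc)
{
    sc->forward = sc->lexeme_beginning;
    switch (sc->start) {
    case 0:  sc->start = 9;  break;   /* relop failed: try id */
    case 9:  sc->start = 12; break;   /* id failed: try digits */
    default: sc->start = -1; break;   /* no diagram left */
    }
    return sc->start;
}

Token next_token(Scanner *sc)
{
    assert(sc != NULL && sc->lexeme_beginning != NULL);
    while (*sc->lexeme_beginning == ' ' || *sc->lexeme_beginning == '\t' ||
           *sc->lexeme_beginning == '\n')
        sc->lexeme_beginning++;            /* skip white space */
    sc->forward = sc->lexeme_beginning;
    sc->start = 0;

    Token t = { T_ERROR, 0 };
    if (*sc->forward == '\0') { t.kind = T_EOF; return t; }

    int state = 0;
    for (;;) {
        unsigned char c = (unsigned char)*sc->forward;  /* peek only */
        switch (state) {
        case 0:                                         /* relop diagram */
            if (c == '<')      { sc->forward++; state = 1; }
            else if (c == '=') { sc->forward++; state = 5; }
            else if (c == '>') { sc->forward++; state = 6; }
            else state = fail(sc);
            break;
        case 1:                                         /* seen '<' */
            if (c == '=' || c == '>') sc->forward++;
            t.kind = T_RELOP; goto done;
        case 5:                                         /* seen '=' */
            t.kind = T_RELOP; goto done;
        case 6:                                         /* seen '>' */
            if (c == '=') sc->forward++;
            t.kind = T_RELOP; goto done;
        case 9:                                         /* id diagram */
            if (isalpha(c)) { sc->forward++; state = 10; }
            else state = fail(sc);
            break;
        case 10:
            if (isalnum(c)) sc->forward++;
            else state = 11;
            break;
        case 11:
            t.kind = T_ID; goto done;
        case 12:                                        /* digits diagram */
            if (isdigit(c)) { sc->forward++; state = 13; }
            else state = fail(sc);
            break;
        case 13:
            if (isdigit(c)) sc->forward++;
            else { t.kind = T_NUM; goto done; }
            break;
        default:                     /* every diagram failed */
            sc->forward++;           /* skip one character (stand-in for recover()) */
            goto done;
        }
    }
done:
    t.len = (size_t)(sc->forward - sc->lexeme_beginning);
    sc->lexeme_beginning = sc->forward;
    return t;
}
```

A caller would initialize the scanner with both pointers at the start of the input, for example `Scanner sc = { src, src, 0 };`, and call `next_token(&sc)` repeatedly until it returns a token of kind `T_EOF`.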
Lecture Outcome

• Significance of context-free grammars in compiler construction
• How to resolve associativity and precedence issues in arithmetic expressions
• Focusing on unambiguous grammars for parsing

Lecture Outcome

• Token Recognition
• Transition Diagram Construction
