Lexical Analyzer in Compiler Design

07/23/2020

Lexical Analysis

Zakia Zinat Choudhury

Lecturer
Department of Computer Science & Engineering
University of Rajshahi

The Role of Lexical Analyzer

The lexical analyzer is the first phase of a compiler. It is also known as a scanner.
Its main job is to read the input characters of the source program, group
them into lexemes, and produce as output a sequence of tokens, one for each
lexeme in the source program.

Figure: Interaction between the Lexical analyzer and the Syntax analyzer
(Source Program → Lexical Analyzer → Tokens → Syntax Analyzer; the Syntax Analyzer sends a Request for Tokens back to the Lexical Analyzer)


Lexical analyzers are sometimes divided into two processes:

a) Scanning consists of the simple processes that do not require tokenization of the input, such as deletion of comments and compaction of consecutive whitespace characters into one.

b) Lexical analysis proper is the more complex portion, where the scanner produces the sequence of tokens as output.
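
The scanning step described in (a) can be sketched as a small pre-pass. This is only an illustration, assuming C-style `/* ... */` comments; the function name `scan` is hypothetical, and the sketch deliberately ignores string literals, which a real scanner would have to protect.

```python
import re

def scan(source: str) -> str:
    """Strip C-style /* ... */ comments and compact whitespace runs."""
    # Replace each comment with one space so neighbouring lexemes stay apart.
    without_comments = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)
    # Collapse every run of whitespace characters into a single space.
    return re.sub(r"\s+", " ", without_comments).strip()

print(scan("sum  =  x /* add */ +\n\ty ;"))  # prints: sum = x + y ;
```

The output of such a pre-pass would then be handed to the lexical analysis proper, which produces the tokens.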


Why are the lexical analysis and syntax analysis phases separated?

Simplicity of design is the most important consideration. The separation of lexical and syntax analysis often allows us to simplify at least one of these tasks.

Compiler efficiency is improved. A separate lexical analyzer allows us to apply specialized techniques that serve only the lexical task, not the job of parsing.

Compiler portability is enhanced. Input-device-specific peculiarities can be restricted to the lexical analyzer.


Functions of Lexical Analyzer

❖ Converting the source code into a stream of tokens
❖ Removing white spaces
❖ Removing comments
❖ Recognizing identifiers and keywords
❖ Recognizing constants
❖ Reporting an error when a lexeme does not match any pattern


Lexemes, Patterns and Tokens

Lexeme:
A lexeme is a sequence of characters in the source program that is matched by the pattern for a token.

Pattern:
A pattern is a description of the form that the lexemes of a token may take, that is, a rule describing a set of strings.

Token:
A token is a pair consisting of a token name and an optional attribute value. The token name stands for the set of strings over the source alphabet that match the token's pattern.
Typical tokens are:
1) identifiers 2) keywords 3) operators 4) special symbols 5) constants
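
To make the three terms concrete, the following sketch pairs a few token names with patterns written as regular expressions and reports which lexeme matched which pattern. The token names, the keyword list, and the helper `lexemes_with_tokens` are assumptions chosen for illustration, not part of the slides.

```python
import re

# Hypothetical token names paired with their patterns (regular expressions).
PATTERNS = [
    ("keyword",    r"\b(?:int|if|else|while|return)\b"),
    ("identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("constant",   r"\d+"),
    ("operator",   r"[+\-*/=<>]"),
    ("special",    r"[;,(){}]"),
]
# One master pattern with a named group per token name; order gives priority.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in PATTERNS))

def lexemes_with_tokens(source):
    """Return (token name, lexeme) pairs for every match found in source."""
    return [(m.lastgroup, m.group()) for m in MASTER.finditer(source)]

print(lexemes_with_tokens("int x1 = 23;"))
```

Here `x1` is a lexeme, `[A-Za-z_][A-Za-z0-9_]*` is the pattern it matches, and `identifier` is the resulting token name. Note that `finditer` silently skips characters that match no pattern, so this sketch does not report lexical errors.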

Attributes for Tokens

When more than one lexeme can match a pattern, the lexical analyzer must provide the subsequent compiler phases additional information about the particular lexeme that matched.

For each lexeme, the lexical analyzer produces as output a token of the form

(token-name, attribute-value)

❖ token-name is an abstract symbol that is used during syntax analysis
❖ attribute-value points to an entry in the symbol table for this token.


Specification of Tokens

An alphabet is any finite set of symbols. Typical examples of symbols are letters, digits, and punctuation. The set {0,1} is the binary alphabet.

A string over an alphabet is a finite sequence of symbols drawn from that alphabet. For example, banana is a string of length six. The empty string, denoted ε, is the string of length zero.

A language is any countable set of strings over some fixed alphabet. Abstract languages like Ø, the empty set, or {ε}, the set containing only the empty string, are languages under this definition.


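Terms for Parts of Strings

Prefix: any string obtained by removing zero or more symbols from the end of a string s. For example, ban, banana, and ε are prefixes of banana.

Suffix: any string obtained by removing zero or more symbols from the beginning of s. For example, nana, banana, and ε are suffixes of banana.

Substring: any string obtained by deleting a prefix and a suffix from s. For example, nan is a substring of banana.

Proper prefixes, suffixes, substrings: those prefixes, suffixes, and substrings of s that are neither ε nor s itself.

Subsequence: any string formed by deleting zero or more, not necessarily consecutive, symbols of s. For example, baan is a subsequence of banana.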
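
These terms can be checked mechanically on small strings. The sketch below uses hypothetical helper names and assumes the common textbook convention that a "proper" prefix, suffix, or substring of s is one that is neither the empty string nor s itself.

```python
def prefixes(s):
    """All strings obtained by removing zero or more symbols from the end."""
    return [s[:i] for i in range(len(s) + 1)]

def suffixes(s):
    """All strings obtained by removing zero or more symbols from the start."""
    return [s[i:] for i in range(len(s) + 1)]

def substrings(s):
    """Delete any prefix and any suffix; a set removes repeated results."""
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}

def proper(parts, s):
    """Assumed convention: keep parts that are neither empty nor s itself."""
    return [p for p in parts if p not in ("", s)]

def is_subsequence(t, s):
    """True if t can be formed by deleting symbols of s, keeping their order."""
    remaining = iter(s)
    return all(ch in remaining for ch in t)  # each lookup consumes the iterator

print(proper(prefixes("ban"), "ban"))      # prints: ['b', 'ba']
print(is_subsequence("baan", "banana"))    # prints: True
```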

Operations on Languages

Union of two languages L and M is written as
L ∪ M = {s | s is in L or s is in M}

Concatenation of two languages L and M is written as
LM = {st | s is in L and t is in M}

The Kleene Closure of a language L is written as
L* = L^0 ∪ L^1 ∪ L^2 ∪ ..., the set of strings formed by concatenating zero or more strings from L (L^0 = {ε})

The Positive Closure of a language L is written as
L+ = L^1 ∪ L^2 ∪ L^3 ∪ ..., the set of strings formed by concatenating one or more strings from L
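
For finite languages these operations map directly onto Python sets. The closures are infinite in general, so the sketch below only approximates them up to a chosen power; the function names and the `max_power` bound are assumptions made for illustration.

```python
def union(L, M):
    return L | M

def concat(L, M):
    """LM: every string of L followed by every string of M."""
    return {s + t for s in L for t in M}

def power(L, i):
    """L^i, with L^0 = {empty string}."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result

def kleene(L, max_power):
    """Finite approximation of L*: the union of L^0 .. L^max_power."""
    out = set()
    for i in range(max_power + 1):
        out |= power(L, i)
    return out

def positive(L, max_power):
    """Finite approximation of L+: the union of L^1 .. L^max_power."""
    out = set()
    for i in range(1, max_power + 1):
        out |= power(L, i)
    return out

L = {"a", "b"}
D = {"0", "1"}
print(sorted(concat(L, D)))   # prints: ['a0', 'a1', 'b0', 'b1']
print(sorted(kleene(L, 2)))   # prints: ['', 'a', 'aa', 'ab', 'b', 'ba', 'bb']
```

The only difference between the two closures is the empty string contributed by L^0, unless L itself already contains it.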


Regular Expressions

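Regular expressions describe languages by defining a pattern for finite strings of symbols. The languages they describe may be finite or infinite; for example, a* denotes the infinite set of all strings of zero or more a's.

A grammar that generates the same language as a regular expression is known as a regular grammar.

The language defined by a regular expression or a regular grammar is known as a regular language.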


Regular Expressions’ Operations

If r and s are regular expressions denoting the languages L(r) and L(s), then

Union : (r)|(s) is a regular expression denoting L(r) ∪ L(s)
Concatenation : (r)(s) is a regular expression denoting L(r)L(s)
Kleene closure : (r)* is a regular expression denoting (L(r))*
Parentheses : (r) is a regular expression denoting L(r)
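
As an illustration of how these operations combine, the sketch below builds the familiar pattern letter (letter | digit)* for identifiers out of union, concatenation, and Kleene closure, using Python's `re` module. The character sets chosen for `letter` and `digit` and the helper `is_identifier` are assumptions, not taken from the slides.

```python
import re

letter = r"[A-Za-z_]"   # assumed definition of "letter"
digit = r"[0-9]"        # assumed definition of "digit"

# Concatenation of letter with the Kleene closure of the union (letter | digit).
identifier = re.compile(f"(?:{letter})(?:(?:{letter})|(?:{digit}))*")

def is_identifier(s):
    """True if the whole string s belongs to L(letter (letter|digit)*)."""
    return identifier.fullmatch(s) is not None

print(is_identifier("x1"))   # prints: True
print(is_identifier("1x"))   # prints: False
```

The non-capturing groups `(?:...)` play the role of the parentheses rule: they group a subexpression without changing the language it denotes.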


Example of Lexical Analyzer

Source program:
int position, rate, initial;
position = rate + initial * 60;

Symbol table (maintained by the Symbol Table Manager):
Serial no   Variable Name   Variable Type
1           position        int
2           rate            int
3           initial         int

Stream of tokens produced by the Lexical Analyzer for the assignment statement:
<id,1> <=> <id,2> <+> <id,3> <*> <60>
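
The token stream in this example can be reproduced with a short sketch. It assumes the symbol table already holds the three declared names in the order shown, and it drops whitespace and the terminating semicolon so that the output mirrors the stream above; the names `TOKEN_RE` and `tokenize` are hypothetical.

```python
import re

# One named group per kind of lexeme this small example needs.
TOKEN_RE = re.compile(
    r"(?P<id>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>[=+*])|(?P<skip>[\s;]+)"
)

def tokenize(source, symbol_table):
    """Return the token stream for source, adding new names to symbol_table."""
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"lexical error at position {pos}: {source[pos]!r}")
        pos = m.end()
        kind, lexeme = m.lastgroup, m.group()
        if kind == "skip":
            continue  # whitespace and ';' are dropped, as in the stream above
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            # The attribute value is the 1-based serial number in the table.
            tokens.append(f"<id,{symbol_table.index(lexeme) + 1}>")
        else:
            tokens.append(f"<{lexeme}>")
    return tokens

table = ["position", "rate", "initial"]  # entries 1..3 from the declaration
print(" ".join(tokenize("position = rate + initial * 60;", table)))
# prints: <id,1> <=> <id,2> <+> <id,3> <*> <60>
```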


Example of Lexical Analyzer

1. int x1;
   x=23;

2. /*find the total value x and y*/
   int x, y, sum;
   sum = x + y;
   printf("Total = %d\n", sum);


Lexical Error

A lexical error is a sequence of characters that does not match the pattern of any token.
Lexical phase errors can be:
❖ Spelling errors.
❖ Exceeding the permitted length of an identifier or numeric constant.
❖ Appearance of illegal characters.
❖ Removal of a character that should be present.
❖ Replacement of a character with an incorrect character.
❖ Transposition of two characters.
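
Two of these error kinds, illegal characters and over-long identifiers, are easy to detect mechanically. The sketch below is a minimal illustration under stated assumptions: the token pattern, the limit `MAX_IDENT_LEN`, and the function `lexical_errors` are all hypothetical, and recovery simply skips one offending character.

```python
import re

MAX_IDENT_LEN = 31  # hypothetical limit on identifier length
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[-+*/=<>;,(){}]")

def lexical_errors(source):
    """Return (position, message) pairs for the lexical errors in source."""
    errors, pos = [], 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(source, pos)
        if m is None:
            # No token pattern matches here: report it and skip one character.
            errors.append((pos, f"illegal character {source[pos]!r}"))
            pos += 1
            continue
        lexeme = m.group()
        is_ident = lexeme[0].isalpha() or lexeme[0] == "_"
        if is_ident and len(lexeme) > MAX_IDENT_LEN:
            errors.append((pos, f"identifier longer than {MAX_IDENT_LEN} characters"))
        pos = m.end()
    return errors

print(lexical_errors("x = 2 @ y;"))  # prints: [(6, "illegal character '@'")]
```

A misspelling such as `whlie` still matches the identifier pattern, so a sketch like this would not flag it; in practice such mistakes usually surface in a later phase.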

Transition Diagrams
As an intermediate step in the construction of a lexical analyzer, patterns are converted into stylized flowcharts, called “transition diagrams”.

Transition diagrams have a collection of nodes or circles, called states. Each state represents a condition that could occur during the process of scanning the input looking for a lexeme that matches one of several patterns.

Edges are directed from one state of the transition diagram to another.

Figure: Transition Diagram of Relational Operators
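
A transition diagram translates almost directly into code: one branch per state, one test per outgoing edge. The figure itself is not reproduced here, so the sketch below assumes the operator set <, <=, <>, =, >, >= and uses hypothetical state numbers and attribute names (LT, LE, NE, EQ, GT, GE).

```python
def relop(text, start=0):
    """Follow the diagram from text[start]; return (token, attribute, end) or None."""
    state, pos = 0, start
    while True:
        ch = text[pos] if pos < len(text) else ""  # "" marks end of input
        if state == 0:                 # start state
            if ch == "<":
                state = 1
            elif ch == "=":
                return ("relop", "EQ", pos + 1)
            elif ch == ">":
                state = 6
            else:
                return None            # not a relational operator
        elif state == 1:               # seen "<"
            if ch == "=":
                return ("relop", "LE", pos + 1)
            if ch == ">":
                return ("relop", "NE", pos + 1)
            return ("relop", "LT", pos)    # retract: ch is not part of the lexeme
        elif state == 6:               # seen ">"
            if ch == "=":
                return ("relop", "GE", pos + 1)
            return ("relop", "GT", pos)    # retract: ch is not part of the lexeme
        pos += 1

print(relop("<= b"))   # prints: ('relop', 'LE', 2)
print(relop("<a"))     # prints: ('relop', 'LT', 1)
```

The returned `end` index shows where scanning should resume; in the retracting cases it points at the character that was looked at but not consumed.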


Finite Automata
A finite automaton (FA) is the simplest machine used to recognize patterns.

Finite automata come in two flavors:

(a) Nondeterministic finite automata (NFA) have no restrictions on the labels of their edges. A symbol can label several edges out of the same state, and ε, the empty string, is a possible label.

(b) Deterministic finite automata (DFA) have, for each state and for each symbol of the input alphabet, exactly one edge with that symbol leaving that state.
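
The difference between the two flavors shows up clearly when they are simulated. In the sketch below a DFA tracks a single current state, while an NFA tracks a set of states and follows ε-edges (written as the empty string "") through an ε-closure. The example automata, which recognise (a|b)*abb and a*b*, and all function names are assumptions chosen for illustration.

```python
def run_dfa(transitions, start, accepting, text):
    """Simulate a DFA given as a {(state, symbol): state} table."""
    state = start
    for ch in text:
        state = transitions.get((state, ch))
        if state is None:          # symbol outside the alphabet: reject
            return False
    return state in accepting

def epsilon_closure(nfa, states):
    """All states reachable from `states` using only edges labelled ""."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in nfa.get((s, ""), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def run_nfa(nfa, start, accepting, text):
    """Simulate an NFA given as a {(state, symbol): set of states} table."""
    current = epsilon_closure(nfa, {start})
    for ch in text:
        moved = {t for s in current for t in nfa.get((s, ch), ())}
        current = epsilon_closure(nfa, moved)
    return bool(current & accepting)

# DFA for (a|b)*abb: exactly one edge per state and symbol.
DFA = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 2,
       (2, "a"): 1, (2, "b"): 3, (3, "a"): 1, (3, "b"): 0}
# NFA for the same language: "a" labels two edges out of state 0.
NFA = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}
# NFA for a*b* that uses an epsilon edge from state 0 to state 1.
EPS_NFA = {(0, "a"): {0}, (0, ""): {1}, (1, "b"): {1}}

print(run_dfa(DFA, 0, {3}, "aabb"))      # prints: True
print(run_nfa(NFA, 0, {3}, "abab"))      # prints: False
print(run_nfa(EPS_NFA, 0, {1}, "aab"))   # prints: True
```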
