Chapter 3 Finite Automata and Lexical Analysis
Lexical analysis
The role of the lexical analyzer: lexical scanning, token classes, keyword recognition.
Finite automata
Alphabet, Strings and languages
Regular expressions
Finite automata (DFA and NFA)
From regular expressions to finite automata
Minimizing the number of states of a DFA
Lexical analysis
Lexical analysis (scanning) plays an important role in the compilation process of a program.
It takes the source program as input, reads it one character at a time, and produces the equivalent token stream of the program.
For example, consider the source program statement A = B + C * 50.
The corresponding token stream after the lexical analysis phase is x1 = x2 + x3 * 50, where x1, x2 and x3 are tokens.
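The scanning step described above can be sketched in Python. This is a minimal illustration, not the slides' own implementation: the token class names (NUMBER, ID, OP, SKIP) and the function name `tokenize` are assumptions made for this sketch.

```python
import re

# Hypothetical token classes for this sketch; the names are not from the slides.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z_]\w*"),
    ("OP",     r"[=+\-*/]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (token_class, lexeme) pairs, dropping white space."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":          # white space is skipped, not emitted
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

tokens = tokenize("A = B + C * 50")
```

Here the identifiers A, B and C each become an ID token, which corresponds to x1, x2 and x3 in the statement above.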
Other tasks performed by lexers are:
- skipping comments and white space;
- detecting malformed tokens (lexical errors).
Input program representation: character sequence
Analysis specification: regular expressions
Recognizing (abstract) machine: finite automata
Implementation: finite automata
Role of the Lexical Analyzer
The lexical analyzer performs the following tasks:
- Reads the source program, scans the input characters, groups them into lexemes and produces tokens as output.
- Enters the identified tokens into the symbol table.
- Strips out white space and comments from the source program.
- Correlates error messages with the source program, i.e., displays each error message together with the line number where the error occurs.
- Expands macros if they are found in the source program.
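Several of these tasks can be seen together in a small sketch. It rests on assumptions that are not in the slides: `#` starts a comment that runs to the end of the line, the symbol table is a plain dictionary mapping each identifier to the line of its first occurrence, and the function name `scan` is hypothetical. Macro expansion is left out.

```python
import re

# Assumptions for this sketch: '#' starts a comment, and the symbol table
# records the line number where each identifier first appears.
TOKEN = re.compile(r"(?P<ID>[A-Za-z_]\w*)|(?P<NUM>\d+)|(?P<OP>[=+\-*/;])")

def scan(source):
    tokens, symbol_table, errors = [], {}, []
    for line_no, line in enumerate(source.splitlines(), start=1):
        line = line.split("#", 1)[0]          # strip the comment
        pos = 0
        while pos < len(line):
            if line[pos].isspace():           # strip white space
                pos += 1
                continue
            m = TOKEN.match(line, pos)
            if m is None:                     # report the error with its line number
                errors.append(f"line {line_no}: illegal character {line[pos]!r}")
                pos += 1
                continue
            if m.lastgroup == "ID":           # enter identifiers into the symbol table
                symbol_table.setdefault(m.group(), line_no)
            tokens.append((m.lastgroup, m.group(), line_no))
            pos = m.end()
    return tokens, symbol_table, errors

program = "x = 1  # init\ny = x + 2\nz = y @ 3\n"
tokens, symbol_table, errors = scan(program)
```

The third line of `program` contains a character that no token pattern accepts, so it ends up in `errors` tagged with line 3 while scanning continues.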
Need of Lexical Analyzer
Simplicity of design of the compiler
- Removing white space and comments lets the syntax analyzer work on syntactic constructs efficiently.
Compiler efficiency is improved
- Specialized buffering techniques for reading characters speed up the compilation process.
∑ ={0,1}
reverse string of w.
L2 = {101, 10101, radar, level,….}
Union:
- If L1 and L2 are two languages, then their union, denoted by L1 ∪ L2, is the language containing all strings that belong to L1 or to L2.
of these strings is in L.
- L* = L^0 ∪ L^1 ∪ L^2 ∪ ..., where L^0 = {ε}
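For finite languages these operations can be tried out with Python sets. This is only a sketch: concatenation is included because the powers L^k need it, the function names are assumptions, and since L* is infinite in general, `star_up_to` builds just the finite part L^0 ∪ ... ∪ L^max_power.

```python
def union(l1, l2):
    return l1 | l2

def concat(l1, l2):
    """All strings xy with x in l1 and y in l2."""
    return {x + y for x in l1 for y in l2}

def power(lang, k):
    """L^k, where L^0 = {""} (the set holding only the empty string)."""
    result = {""}
    for _ in range(k):
        result = concat(result, lang)
    return result

def star_up_to(lang, max_power):
    """Finite approximation of L*: L^0 ∪ L^1 ∪ ... ∪ L^max_power."""
    result = set()
    for k in range(max_power + 1):
        result |= power(lang, k)
    return result

L1 = {"0", "01"}   # example languages chosen for this sketch
L2 = {"1"}
```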
with (a + λ).
Regular expressions
Examples:
– Represent the following sets by regular expressions:
a. {∧, ab}
b. {1, 11, 111, ...}
c. {ab, a, b, bb}
Solution
a. The set {∧, ab} is represented by the regular expression ∧ + ab.
b. The set {1, 11, 111, ...} is obtained by concatenating 1 with any element of {1}*. Therefore 1(1)* represents the given set.
c. The set {ab, a, b, bb} is represented by the regular expression ab + a + b + bb.
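The three answers can be checked with Python's `re` module. Note the change of notation, which is specific to Python and not part of the slides: union is written `|` instead of `+`, and the empty string ∧ is written as an empty alternative. The sample non-members are chosen for this sketch.

```python
import re

# Each entry: (Python pattern, strings that must match, strings that must not).
EXAMPLES = {
    "a": (r"|ab",       ["", "ab"],             ["a", "b", "abab"]),   # ∧ + ab
    "b": (r"1(1)*",     ["1", "11", "111"],     ["", "10", "0"]),      # 1(1)*
    "c": (r"ab|a|b|bb", ["ab", "a", "b", "bb"], ["", "ba", "aa"]),     # ab + a + b + bb
}

def check(pattern, members, non_members):
    """True if the pattern matches every member in full and no non-member."""
    accepts_all = all(re.fullmatch(pattern, s) for s in members)
    rejects_all = not any(re.fullmatch(pattern, s) for s in non_members)
    return accepts_all and rejects_all

results = {name: check(*spec) for name, spec in EXAMPLES.items()}
```

`re.fullmatch` is used rather than `re.match` so that the whole string, not just a prefix, has to belong to the set.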
Regular expressions - Exercises
Obtain the regular expressions for the following sets:
1. The set of all strings over {a, b} beginning and ending with 'a'.
⇒ The regular expression is a(a + b)*a. This covers strings of length at least 2; if the one-symbol string a is also meant to be included, use a + a(a + b)*a.
2. {b^2, b^5, b^8, ...}
⇒ The regular expression for {b^2, b^5, b^8, ...} is bb(bbb)*.
3. {a^(2n+1) | n > 0}
⇒ The regular expression for {a^(2n+1) | n > 0} is a(aa)+, where (aa)+ means one or more repetitions of aa.
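One way to gain confidence in such answers is to enumerate every string up to some length and compare the strings the expression matches with the set description. The sketch below does this with Python's `re` module, writing union as `|`; the length bounds and helper names are assumptions of the sketch, and a bounded check like this is evidence, not a proof.

```python
import re
from itertools import product

def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

def language_of(pattern, alphabet, max_len):
    return {s for s in strings_up_to(alphabet, max_len) if re.fullmatch(pattern, s)}

MAX_LEN = 9
# Exercise 1: strings of length >= 2 that begin and end with 'a'
expected_1 = {s for s in strings_up_to("ab", 6)
              if len(s) >= 2 and s[0] == "a" and s[-1] == "a"}
# Exercise 2: b^2, b^5, b^8, ...
expected_2 = {"b" * n for n in range(2, MAX_LEN + 1, 3)}
# Exercise 3: a^(2n+1) with n > 0, i.e. a^3, a^5, a^7, ...
expected_3 = {"a" * (2 * n + 1) for n in range(1, MAX_LEN) if 2 * n + 1 <= MAX_LEN}

ok_1 = language_of(r"a(a|b)*a", "ab", 6) == expected_1
ok_2 = language_of(r"bb(bbb)*", "b", MAX_LEN) == expected_2
ok_3 = language_of(r"a(aa)+", "a", MAX_LEN) == expected_3
```

The single string "a" is not matched by `a(a|b)*a`, which is why `expected_1` is restricted to length at least 2.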
Let L = {ab, aa, baa}, which of the following
of Finite
Automata
Final state
Transition table
A state transition table shows which state (or states, in the case of an NFA) a finite state machine will move to, based on the current state and the input.
Row – states
Column – inputs
Entries – next state
→ marks the start state
* marks a final state
Detailed description
The mathematical model of an automaton consists of:
Q: a finite set of states
Σ: a finite set of input symbols
δ : Q × Σ → Q, the transition function
Example
δ(q0, 0) = q1
δ(q0, 1) = q0
δ(q1, 0) = q1
δ(q1, 1) = q2
δ(q2, 0) = q2
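The example transition function can be stored as a dictionary and run over an input string. Two things are assumptions of this sketch because the example does not state them: q0 is taken as the start state and q2 as the only final state. The example also lists no entry for δ(q2, 1), so the sketch rejects when it reaches a missing transition.

```python
# Transition function copied from the example above.
DELTA = {
    ("q0", "0"): "q1",
    ("q0", "1"): "q0",
    ("q1", "0"): "q1",
    ("q1", "1"): "q2",
    ("q2", "0"): "q2",
}
START = "q0"        # assumption: the example does not name the start state
FINAL = {"q2"}      # assumption: the example does not name the final states

def accepts(word):
    """Run the automaton on `word`; reject if a transition is missing."""
    state = START
    for symbol in word:
        state = DELTA.get((state, symbol))
        if state is None:          # no transition defined for this pair
            return False
    return state in FINAL
```

For instance, on input 1001 the machine moves q0, q0, q1, q1, q2 and stops in q2.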
Example - DFA
Determine the DFA schematic for M = (Q, Σ, δ, q1, F), where Q = {q1, q2, q3}, Σ = {0, 1}, q1 is the start state, F = {q2} and δ is given by the table below.
Language of accepted strings
Consider the DFA shown in the figure below.
R = r3* , where r1 = 0 , r2 = 1
Exercise Solution:
The NFA will be constructed step by step by breaking the regular expression into smaller regular expressions.
R = (r1 + r2)r3 , where r1 = 01 , r2 = 2* and r3 = 0
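A step-by-step construction of this kind can be sketched in the style of Thompson's construction, where each sub-expression becomes a small NFA fragment and fragments are glued together with ε-moves. The slides do not name the method they use, so treat the fragment shapes, the class `Frag` and the helper names below as assumptions of this sketch.

```python
from itertools import count

EPS = None                      # label used for ε-moves in this sketch
_ids = count()                  # supplies fresh state numbers

class Frag:
    """An NFA fragment with one start state, one accept state and a list of edges."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges

def symbol(a):
    s, f = next(_ids), next(_ids)
    return Frag(s, f, [(s, a, f)])

def concat(x, y):
    return Frag(x.start, y.accept, x.edges + y.edges + [(x.accept, EPS, y.start)])

def union(x, y):
    s, f = next(_ids), next(_ids)
    extra = [(s, EPS, x.start), (s, EPS, y.start),
             (x.accept, EPS, f), (y.accept, EPS, f)]
    return Frag(s, f, x.edges + y.edges + extra)

def star(x):
    s, f = next(_ids), next(_ids)
    extra = [(s, EPS, x.start), (s, EPS, f),
             (x.accept, EPS, x.start), (x.accept, EPS, f)]
    return Frag(s, f, x.edges + extra)

def closure(nfa, states):
    """States reachable from `states` through ε-moves only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for (p, a, r) in nfa.edges:
            if p == q and a is EPS and r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def accepts(nfa, word):
    current = closure(nfa, {nfa.start})
    for ch in word:
        moved = {r for (p, a, r) in nfa.edges if p in current and a == ch}
        current = closure(nfa, moved)
    return nfa.accept in current

r1 = concat(symbol("0"), symbol("1"))   # r1 = 01
r2 = star(symbol("2"))                  # r2 = 2*
r3 = symbol("0")                        # r3 = 0
R = concat(union(r1, r2), r3)           # R = (r1 + r2) r3
```

Because r2 = 2* can match the empty string, the single string 0 is accepted by R, as are 010 and 220.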
Equivalence of NFA and DFA
• Two finite accepters M1 and M2 are equivalent iff L(M1) = L(M2), i.e., if both accept the same language.
• Both DFA and NFA recognize the same class of languages.
• It is important to note that every NFA has an equivalent DFA.
Problem Statement
• Let X = (Qx, ∑, δx, q0, Fx) be an NDFA which
accepts the language L(X).
• We have to design an equivalent DFA Y = (Qy,
∑, δy, q0, Fy) such that L(Y) = L(X).
NDFA to DFA Conversion - Subset Construction
Algorithm
• Input: An NDFA
• Output: An equivalent DFA
• Step 1 Create the state table from the given NDFA.
• Step 2 Create a blank state table under the possible input alphabets for the equivalent DFA.
• Step 3 Mark the start state of the DFA by q0 (same as the NDFA).
• Step 4 Find out the combination of states {Q0, Q1, ..., Qn} for each possible input alphabet.
• Step 5 Each time we generate a new DFA state under the input alphabet columns, we have to apply step 4 again; otherwise go to step 6.
• Step 6 The states which contain any of the final states of the NDFA are the final states of the equivalent DFA.
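The steps can be sketched in Python, with each DFA state represented as a frozenset of NDFA states. The NDFA used here is a hypothetical one (it accepts strings over {0, 1} that end in 01), not the one from the slides, and the sketch assumes the NDFA has no ε-moves.

```python
from collections import deque

def nfa_to_dfa(nfa_delta, start, nfa_finals, alphabet):
    """nfa_delta maps (state, symbol) -> set of states; no ε-moves assumed."""
    start_set = frozenset([start])                   # Step 3: start state of the DFA
    dfa_delta, seen = {}, {start_set}
    queue = deque([start_set])
    while queue:                                     # Step 5: repeat for each new DFA state
        current = queue.popleft()
        for symbol in alphabet:                      # Step 4: combine the reachable states
            target = frozenset(
                r for q in current for r in nfa_delta.get((q, symbol), ())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_finals = {s for s in seen if s & nfa_finals}  # Step 6: contains an NDFA final state
    return dfa_delta, start_set, dfa_finals

# Hypothetical NDFA for this sketch: strings over {0, 1} ending in 01.
NFA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
}
dfa_delta, dfa_start, dfa_finals = nfa_to_dfa(NFA, "q0", {"q2"}, "01")

def dfa_accepts(word):
    state = dfa_start
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in dfa_finals
```

Only subsets that are actually reachable from the start state are generated, so the resulting table is usually much smaller than the full set of 2^n subsets.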
• Let us illustrate the conversion of NFA(NDFA ) to DFA
through an example.
Example
Steps for converting NFA with ε to DFA:
• The ε-closure of a given state A is the set of states that can be reached from A using only ε (null) moves, including the state A itself.
01: Take the ε-closure of the starting state of the NFA as the starting state of the DFA.
02: For each input symbol, find the states that can be reached from the present state, i.e., the union of the transition values and their ε-closures for each NFA state present in the current DFA state.
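Steps 01 and 02 can be sketched as two small functions. The ε-NFA below (states A, B, C) is a hypothetical example made up for this sketch, and the names `epsilon_closure` and `move` are assumptions.

```python
def epsilon_closure(states, eps_moves):
    """All states reachable from `states` using only ε-moves, including themselves."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in eps_moves.get(q, ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def move(dfa_state, symbol, delta, eps_moves):
    """Step 02: union of the transitions on `symbol`, then the ε-closure of the result."""
    reached = {r for q in dfa_state for r in delta.get((q, symbol), ())}
    return epsilon_closure(reached, eps_moves)

# Hypothetical ε-NFA: A --ε--> B, B --ε--> C, A --0--> A, B --1--> B, C --2--> C
EPS_MOVES = {"A": {"B"}, "B": {"C"}}
DELTA = {("A", "0"): {"A"}, ("B", "1"): {"B"}, ("C", "2"): {"C"}}

start = epsilon_closure({"A"}, EPS_MOVES)   # Step 01: start state of the DFA
```

Repeating `move` for every newly found set of states, as in the subset construction, fills in the rest of the DFA table.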
DFA Minimization
DFA minimization is the task of transforming a given deterministic finite automaton (DFA) into an equivalent DFA that has a minimum number of states.
value 0.
1. Minimization of DFA Using Equivalence Theorem
03: Increment k by 1.
Find Pk by partitioning the different sets of Pk-1.
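The partitioning step can be sketched as a refinement loop. Only the "increment k" step is visible above, so the rest follows the usual formulation and should be read as an assumption: P0 separates final from non-final states, and the loop stops when Pk equals Pk-1. The transition function is assumed to be total and unreachable states are assumed to be removed beforehand. The two small DFAs are hypothetical.

```python
def minimize_partitions(states, alphabet, delta, finals):
    """Return the partition of `states` into groups of equivalent states."""
    finals = set(finals)
    # P0: non-final states versus final states (empty blocks are dropped)
    partition = [b for b in (set(states) - finals, finals & set(states)) if b]
    while True:
        def block_index(q):
            return next(i for i, b in enumerate(partition) if q in b)
        refined = []
        for block in partition:
            # States stay together only if every symbol sends them
            # to the same block of the previous partition.
            groups = {}
            for q in block:
                signature = tuple(block_index(delta[(q, a)]) for a in alphabet)
                groups.setdefault(signature, set()).add(q)
            refined.extend(groups.values())
        if len(refined) == len(partition):    # Pk equals P(k-1): stop
            return refined
        partition = refined                   # increment k and repeat

# Hypothetical DFA 1: A and B are equivalent, and so are C and D.
D1 = {("A", "0"): "B", ("A", "1"): "C", ("B", "0"): "B", ("B", "1"): "D",
      ("C", "0"): "B", ("C", "1"): "C", ("D", "0"): "B", ("D", "1"): "C"}
P1 = minimize_partitions("ABCD", "01", D1, {"C", "D"})

# Hypothetical DFA 2: already minimal, so every state ends up alone.
D2 = {("A", "0"): "B", ("A", "1"): "A", ("B", "0"): "B", ("B", "1"): "C",
      ("C", "0"): "C", ("C", "1"): "C"}
P2 = minimize_partitions("ABC", "01", D2, {"C"})
```

Each block of the final partition becomes one state of the minimized DFA. Comparing lengths is enough to detect that nothing changed, because refinement only ever splits blocks.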
DFA Minimization - Example
A language for specifying lexical analyzers
• There is a wide range of tools for constructing lexical analyzers.
– Lex
• Lex is a computer program that generates lexical analyzers.
• Lex is commonly used with the yacc parser generator.
• Lex Specification or Structure
• A LEX program has the following forms:
Auxiliary Definitions
D1 = R1
D2 = R2
...
Dn = Rn
Regular expression
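The idea of auxiliary definitions, where each name Di stands for a regular expression Ri and later definitions may reuse earlier names, can be imitated in Python. This is an illustration of the concept and not Lex syntax: the definition names, the brace notation for references and the function `expand` are assumptions of this sketch.

```python
import re

# Hypothetical auxiliary definitions D_i = R_i; a later R_i may refer to an
# earlier name written in braces.
DEFINITIONS = [
    ("letter", r"[A-Za-z]"),
    ("digit",  r"[0-9]"),
    ("id",     r"{letter}({letter}|{digit})*"),
    ("number", r"{digit}+"),
]

def expand(definitions):
    """Replace each {name} reference by the already expanded definition."""
    expanded = {}
    for name, regex in definitions:
        for earlier, body in expanded.items():
            regex = regex.replace("{" + earlier + "}", "(?:" + body + ")")
        expanded[name] = regex
    return expanded

PATTERNS = expand(DEFINITIONS)
```

After expansion every pattern is an ordinary regular expression, so `re.fullmatch(PATTERNS["id"], "x1")` succeeds while the string 1x is rejected.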
Implementation of a lexical analyzer
Three general approaches for the implementation of a lexical analyzer:
By using a lexical-analyzer generator:
The generator provides routines for reading and buffering the input.