Lecture 1 To 4 Theory of Computation
DR O. J Falana
CSC 206: THEORY OF COMPUTATION (2 Units)
9. Pumping Lemma
Pumping Lemma for Regular Languages, Pumping Lemma for Context-Free Languages
Applications in Proving Language Non-Regularity
Automata theory, also known as the Theory of Computation, is a field within computer
science and mathematics that focuses on studying abstract machines to understand the
capabilities and limitations of computation by analyzing mathematical models of how
machines can perform calculations.
A. Automata Theory
• Studies abstract computing devices (automata) and their capabilities.
• Explains how computers process languages and recognize patterns.
B. Computability Theory
• Focuses on what problems are solvable using algorithms.
• Deals with decidable vs. undecidable problems.
C. Complexity Theory
• Examines how efficiently problems can be solved.
• Categorizes problems into complexity classes like P, NP, NP-complete.
1.3 Why Study Computation Theory?
1. Understanding Computation Limits – Helps in identifying which problems are
solvable and which are impossible to compute.
2. Algorithm Design – Provides techniques for designing better and more efficient
algorithms.
3. Artificial Intelligence & Machine Learning – Automata concepts are used in pattern
recognition, natural language processing, and AI.
4. Compiler Design – Used to design programming language parsers and interpreters.
5. Cybersecurity & Cryptography – Complexity theory is used to design secure
cryptographic algorithms.
6. Regular Expressions (RE) in Systems – Regular expressions are powerful tools for
pattern matching and text processing used extensively in many systems.
7. Finite Automata in Modeling Systems – Modeling protocols and circuits: finite
automata (FA) are used to model protocols, like those in network communication,
and to design electronic circuits that operate based on a set of predefined rules or
states.
1.4 Fundamental Concepts in Computation Theory
The study of computation is based on fundamental mathematical models
and formal systems, including:
2. Alphabets (Σ)
A finite, non-empty set of symbols used to construct strings and
languages. For example, Σ = {a, b}.
1.5 Basic Terminologies of Theory of Computation
3. String
A string is a finite sequence of symbols from some alphabet. A string
is generally denoted as w, and the length of a string is denoted
as |w|. The empty string is the string with zero occurrences of symbols,
represented as ε. For example, over Σ = {a, b}, the string w = abba has |w| = 4, and |ε| = 0.
2. GRAMMARS: REGULAR EXPRESSIONS AND
LANGUAGES, CONTEXT-FREE LANGUAGES
Having learnt about strings and alphabets in the previous lecture, we now turn to another
important concept in formal language and automata theory: grammar.
Learning Outcomes
At the end of this lecture, you should be able to:
• define Grammar
• state the types of grammars available in the field of Computer Science
• describe the class of automata that can recognize strings generated by each grammar
• identify strings that are generated by a particular grammar
• describe the Chomsky hierarchy
• explain the relevance of grammar and formal languages to computer programming.
Lecture 2: Grammars and Automata
GRAMMAR
Grammar is a set of rules for forming strings in a formal language.
The rules describe how to form strings from the language's alphabet that are valid
according to the language's syntax. A grammar does not describe the meaning of the
strings or what can be done with them in whatever context - only their form.
Formal language theory, which is the discipline that studies formal grammars and
languages, is a branch of Applied Mathematics. Its applications are found in theoretical
computer science, theoretical linguistics, formal semantics, mathematical logic, and other
areas.
More precisely, a grammar is a set of rules for rewriting strings, along with a "start symbol"
from which rewriting must start.
Therefore, a grammar is usually thought of as a language generator. However, it can also
sometimes be used as the basis for a "recognizer", a function in computing that
determines whether a given string belongs to the language or is grammatically incorrect.
To describe such recognisers, formal language theory uses separate formalisms, known as
automata theory.
Elements of a Grammar
A grammar is composed of two basic elements:
• Terminal Symbols: Terminal symbols are the components of the
sentences generated using the grammar. They are represented using
lowercase letters like a, b, c, etc.
• Non-Terminal Symbols: Non-terminal symbols take part in the
generation of a sentence but are not components of the sentence.
Non-terminal symbols are also called auxiliary symbols or variables.
They are represented using capital letters like A, B, C, etc.
Representation of Grammar
2. The Semantics of Grammars
Semantics is the linguistic and philosophical study of meaning, in language, programming
languages and formal logics.
It is concerned with the relationship between signifiers, like words, phrases, signs and
symbols, and their denotations.
For example, assume the alphabet consists of a and b, the start symbol is S, and we have
the following production rules:
S => aSb
S => ba
Then we start with S, and can choose a rule to apply to it. If we choose rule 1, we obtain
the string aSb. If we choose rule 1 again, we replace S with aSb and obtain the string
aaSbb. If we now choose rule 2, we replace S with ba and obtain the string aababb, and are
done.
We can write this series of choices more briefly, using symbols:
S => aSb => aaSbb => aababb.
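The series of choices above can be replayed mechanically. The following is a minimal Python sketch, assuming a hypothetical helper named `derive` (not part of the lecture) that applies the numbered rules to the leftmost occurrence of S; the rule numbering follows the text.

```python
# Production rules from the text: rule 1 is S => aSb, rule 2 is S => ba.
RULES = {1: ("S", "aSb"), 2: ("S", "ba")}


def derive(choices, start="S"):
    """Apply the chosen rules in order and return every sentential form.

    Each rule rewrites the leftmost occurrence of its left-hand side.
    """
    current = start
    forms = [current]
    for number in choices:
        lhs, rhs = RULES[number]
        if lhs not in current:
            raise ValueError(f"rule {number} does not apply to {current!r}")
        current = current.replace(lhs, rhs, 1)  # rewrite one occurrence only
        forms.append(current)
    return forms


# Rule 1, rule 1 again, then rule 2, as in the worked example.
print(" => ".join(derive([1, 1, 2])))
```

Choosing a different sequence of rule numbers yields a different string of the language, which is the sense in which a grammar acts as a language generator.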
Example 1: Consider the grammar G where Σ = {a, b, c} is the set of terminal symbols, S is the
start symbol, and P consists of the following production rules:
1. S => aBSc
2. S => abc
3. Ba => aB
4. Bb => bb
This grammar generates the language L(G) = {a^n b^n c^n | n ≥ 1}.
Solution: the language is the set of strings that consist of 1 or more a's, followed by the
same number of b's, followed by the same number of c's.
Some examples of the derivation of strings in L(G) are:
S => aBSc => aBabcc => aaBbcc => aabbcc
OR
S => aBSc => aBaBScc => aBaBabccc => aaBBabccc => aaBaBbccc => aaaBBbccc => aaaBbbccc =>
aaabbbccc
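The two derivations above can be checked step by step. The sketch below is an illustration only: the helper `apply_rules` is a hypothetical name, and it assumes each rule is applied at the leftmost position where its left-hand side occurs.

```python
# Rules of Example 1, numbered as in the text.
RULES = {
    1: ("S", "aBSc"),
    2: ("S", "abc"),
    3: ("Ba", "aB"),
    4: ("Bb", "bb"),
}


def apply_rules(choices, start="S"):
    """Return the list of sentential forms produced by the chosen rules."""
    current = start
    forms = [current]
    for number in choices:
        lhs, rhs = RULES[number]
        position = current.find(lhs)  # leftmost occurrence of the left-hand side
        if position < 0:
            raise ValueError(f"rule {number} does not apply to {current!r}")
        current = current[:position] + rhs + current[position + len(lhs):]
        forms.append(current)
    return forms


print(" => ".join(apply_rules([1, 2, 3, 4])))
print(" => ".join(apply_rules([1, 1, 2, 3, 3, 3, 4, 4])))
```

Note that rules 3 and 4 have more than one symbol on the left-hand side, so this grammar is not context-free even though every step is a simple string rewrite.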
Language theory is a branch of Mathematics concerned with describing languages as a set
of operations over an alphabet. It is closely linked with automata theory, as automata are
used to generate and recognize formal languages.
There are several classes of formal languages, each allowing more complex language
specification than the one before it, i.e. Chomsky hierarchy, and each corresponding to a
class of automata which recognizes it.
Because automata are used as models for computation, formal languages are the preferred
mode of specification for any problem that must be computed.
TYPES OF GRAMMARS
Here are some more examples (in all cases, the alphabet is {0, 1}):
• The language {w : w contains exactly two 0s} can be described by the expression
1*01*01*
• The language {w : w contains at least two 0s} can be described by the expression
(0 ∪ 1)*0(0 ∪ 1)*0(0 ∪ 1)*.
• The language {w : 1011 is a substring of w} can be described by the expression
(0 ∪ 1)*1011(0 ∪ 1)*.
• The language {w : the length of w is odd} can be described by the expression
(0 ∪ 1)((0 ∪ 1)(0 ∪ 1))*.
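These four expressions can be tested against their informal descriptions. The sketch below uses Python's `re` module, writing ∪ as `|`; the dictionary `CASES` and the helper names are assumptions made for this illustration. It compares each expression with a direct Python predicate on every binary string up to a chosen length.

```python
import re
from itertools import product

# Each entry pairs a regular expression (with ∪ written as |) with a
# direct Python predicate describing the same language.
CASES = {
    "exactly two 0s": (r"1*01*01*", lambda w: w.count("0") == 2),
    "at least two 0s": (r"(0|1)*0(0|1)*0(0|1)*", lambda w: w.count("0") >= 2),
    "1011 is a substring": (r"(0|1)*1011(0|1)*", lambda w: "1011" in w),
    "odd length": (r"(0|1)((0|1)(0|1))*", lambda w: len(w) % 2 == 1),
}


def binary_strings(max_len):
    """Yield every string over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            yield "".join(symbols)


def mismatches(max_len=8):
    """List (name, string) pairs where expression and predicate disagree."""
    bad = []
    for name, (pattern, predicate) in CASES.items():
        for w in binary_strings(max_len):
            if bool(re.fullmatch(pattern, w)) != predicate(w):
                bad.append((name, w))
    return bad


print(mismatches())
```

`re.fullmatch` is used rather than `re.search` because a regular expression in the formal sense must describe the whole string, not just a part of it.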
Regular Expressions
Regular expressions are symbolic notations used to define search
patterns in strings. They describe regular languages and are
commonly used in tasks such as validation, searching, and parsing
A regular expression over an alphabet Σ is defined as follows:
1.Base Cases
• Empty string: ε is a regular expression that represents the language {ε}.
• Single symbols: Any symbol a ∈ Σ is a regular expression that represents {a}.
2.Recursive Rules
• Union (OR, denoted by "|"): If R₁ and R₂ are regular expressions, then R₁ | R₂ represents the set of strings in
R₁ or R₂.
• Concatenation (denoted by "•"): If R₁ and R₂ are regular expressions, then R₁R₂ represents strings where R₁
is followed by R₂.
• Kleene Star (denoted by "*"): If R is a regular expression, then R* represents zero or more occurrences of R.
Regular Languages
Regular languages are the class of languages that can be represented
using finite automata, regular expressions, or regular grammar.
These languages have predictable patterns and are computationally
efficient to recognize.
Properties of Regular Languages
1. Closure Properties
Regular languages are closed under operations like union,
concatenation, and Kleene star.
•Union: If L1 and L2 are two regular languages, their union L1 ∪
L2 will also be regular. For example, L1 = {a^n | n ≥ 0} and L2 = {b^n |
n ≥ 0}; L3 = L1 ∪ L2 = {a^n ∪ b^n | n ≥ 0} is also regular.
•Intersection: If L1 and L2 are two regular languages, their
intersection L1 ∩ L2 will also be regular. For example, L1 = {a^m b^n | n ≥
0 and m ≥ 0} and L2 = {a^m b^n ∪ b^n a^m | n ≥ 0 and m ≥ 0}; L3 = L1 ∩ L2
= {a^m b^n | n ≥ 0 and m ≥ 0} is also regular.
•Concatenation: If L1 and L2 are two regular languages, their
concatenation L1.L2 will also be regular. For example, L1 = {a^n | n ≥
0} and L2 = {b^n | n ≥ 0}; L3 = L1.L2 = {a^m . b^n | m ≥ 0 and n ≥ 0} is
also regular.
•Kleene Closure: If L1 is a regular language, its Kleene closure L1* will
also be regular. For example, L1 = (a ∪ b); L1* = (a ∪ b)*.
•Complement: If L(G) is a regular language, its complement L’(G) will
also be regular. The complement of a language can be found by
subtracting the strings which are in L(G) from all possible strings. For
example, over the alphabet {a}, L(G) = {a^n | n > 3}; L’(G) = {a^n | n ≤ 3}.
Definition:
Let Σ be a non-empty alphabet.
1. ε is a regular expression.
2. ∅ is a regular expression.
3. For each a ∈ Σ, a is a regular expression.
4. If R1 and R2 are regular expressions, then R1 ∪ R2 is a regular expression.
5. If R1 and R2 are regular expressions, then R1R2 is a regular expression.
6. If R is a regular expression, then R* is a regular expression.
You can regard 1, 2, and 3 as being the “building blocks” of regular expressions. Items 4, 5
and 6 give rules that can be used to combine regular expressions into new (and larger)
regular expressions.
To give an example, we claim that:
(0 ∪ 1)*101(0 ∪ 1)*
is a regular expression (where the alphabet Σ is equal to {0, 1}). In order to prove this, we
have to show that this expression can be built using the “rules” given in Definition above.
Here we go:
• By point 3, 0 is a regular expression.
• By point 3, 1 is a regular expression.
• Since 0 and 1 are regular expressions, by point 4, 0 ∪ 1 is a regular expression.
• Since 0 ∪ 1 is a regular expression, by point 6, (0 ∪ 1)* is a regular expression.
• Since 1 and 0 are regular expressions, by point 5, 10 is a regular expression.
• Since 10 and 1 are regular expressions, by point 5, 101 is a regular expression.
• Since (0 ∪ 1)* and 101 are regular expressions, by point 5, (0 ∪ 1)*101 is a
regular expression.
• Since (0 ∪ 1)*101 and (0 ∪ 1)* are regular expressions, by point 5, (0 ∪ 1)*101(0 ∪ 1)* is a
regular expression.
Next we define the language that is described by a regular expression.
For example:
• The regular expression (0 ∪ ε)(1 ∪ ε) describes the language {01, 0, 1, ε}.
• The regular expression 0 ∪ ε describes the language {0, ε}, whereas the regular
expression 1* describes the language {ε, 1, 11, 111, . . .}.
Therefore, the regular expression (0 ∪ ε)1* describes the language {0, 01, 011, 0111, . . . , ε,
1, 11, 111, . . .}.
Observe that this language is also described by the regular expression 01* ∪ 1*.
• The regular expression 1*∅ describes the empty language, i.e., the language ∅.
• The regular expression ∅* describes the language {ε}.
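The first few claims above can be checked by enumerating short strings. This is a hedged sketch using Python's `re` module: ∪ is written as `|` and ε as an empty alternative, and the helper `language` is a hypothetical name introduced here. The examples involving ∅ are left out because ∅ has no direct counterpart in this notation.

```python
import re
from itertools import product


def language(pattern, alphabet="01", max_len=4):
    """Strings over `alphabet` of length <= max_len described by `pattern`."""
    words = (
        "".join(symbols)
        for n in range(max_len + 1)
        for symbols in product(alphabet, repeat=n)
    )
    return {w for w in words if re.fullmatch(pattern, w)}


# (0 ∪ ε)(1 ∪ ε): the empty alternative after | plays the role of ε.
print(sorted(language(r"(0|)(1|)")))

# (0 ∪ ε)1* and 01* ∪ 1* describe the same strings (checked up to length 4).
print(language(r"(0|)1*") == language(r"01*|1*"))
```

Agreement on all strings up to a fixed length is evidence, not a proof, that two expressions describe the same language.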
2. CONTEXT-FREE GRAMMARS
A Context-Free Grammar (CFG) is more powerful than a regular grammar and is
defined as:
G=(V,Σ,P,S)
A context-free grammar is a set of recursive rules used to generate patterns of strings.
Context-free grammars are used for defining the syntax of programming languages and
their compilation.
Context-free grammars (CFGs) are used to describe context-free languages. Context-free
grammars can describe all regular languages and more, but they cannot
describe all possible languages.
Context-free grammars are studied in fields of theoretical computer science, compiler
design (in particular parsing), and linguistics. CFGs are used to describe programming
languages and parser programs in compilers.
Definition:
A Context-Free Grammar (CFG) is a 4-tuple (V, T, S, P) where:
( i ) V is a finite set called the variables (Set of non-terminal symbol). Typically,
non-terminals are represented by uppercase letters (e.g., S, A,
B).
(ii) T is a finite set, disjoint from V, called the terminals. They are usually
represented by lowercase letters (e.g., a, b, c) or specific symbols.
(iii) P is a finite set of rules (productions), each rule consisting of a variable and a string of
variables and terminals, written A => w with A ∈ V and w ∈ (V ∪ T)*.
(iv) S ∈ V is the start variable.
Examples of context-free languages:
(a) The grammar G = ({S}, {a, b}, S, P) with productions
S => aSa, S => bSb,
S => λ is context-free, where λ denotes the empty string. A typical derivation in this grammar is:
S => aSa
=> aaSaa
=> aabSbaa
=> aabbaa
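The grammar above generates the even-length palindromes over {a, b}. A small recursive check that mirrors the three productions is sketched below; it assumes the third production is S => λ (the empty string), and the function names `derives` and `derivation` are hypothetical.

```python
def derives(w):
    """Return True if S =>* w using S => aSa | bSb | λ."""
    if w == "":
        return True  # S => λ
    if len(w) >= 2 and w[0] == w[-1] and w[0] in "ab":
        return derives(w[1:-1])  # S => aSa or S => bSb, then recurse on the middle
    return False


def derivation(w):
    """Return the sentential forms of the derivation of w, starting from S."""
    if not derives(w):
        raise ValueError(f"{w!r} is not generated by the grammar")
    left, right = "", ""
    forms = ["S"]
    for symbol in w[: len(w) // 2]:
        left, right = left + symbol, symbol + right
        forms.append(left + "S" + right)
    forms.append(left + right)  # final step applies S => λ
    return forms


print(" => ".join(derivation("aabbaa")))
```

The recursion depth equals half the length of the string, which reflects the nested (stack-like) structure that a finite automaton cannot track.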
There are grammars called context-sensitive grammars which are more powerful
(meaning they can generate more complex languages that might require more memory to recognize)
than both regular grammars and context-free grammars.
3. CONTEXT SENSITIVE GRAMMARS AND LANGUAGES
A context-sensitive grammar is a formal grammar in which the left-hand sides and
right-hand sides of any production rules may be surrounded by a context of terminal and
non-terminal symbols.
Context-sensitive grammars are more general than context-free grammars, in the sense that
there are languages that can be described by a CSG but not by a context-free grammar.
A context-sensitive language is a language generated by a context-sensitive grammar.
Definition:
A context-sensitive grammar is one whose productions are all of the form
xAy => xvy
where A ∈ V and x, v, y ∈ (V ∪ T)*
“Context-sensitive” implies the fact that the actual string modification is given by A=> v,
while the x and y provide the context in which the rule may be applied.
For example: S => abc│aAbc
Ab => bA
Ac => Bbcc
bB => Bb
aB => aa │aaA
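To see which strings the example grammar produces, one can explore all derivations up to a length bound. The sketch below is an illustration under stated assumptions: the function `generate` is a hypothetical name, and the search prunes any sentential form longer than `max_len`, which is safe here because no rule in this grammar shortens a string.

```python
from collections import deque

# The example productions, one (left-hand side, right-hand side) pair each.
RULES = [
    ("S", "abc"), ("S", "aAbc"),
    ("Ab", "bA"),
    ("Ac", "Bbcc"),
    ("bB", "Bb"),
    ("aB", "aa"), ("aB", "aaA"),
]


def generate(max_len):
    """Breadth-first search over sentential forms of length <= max_len."""
    seen = {"S"}
    queue = deque(["S"])
    words = set()
    while queue:
        form = queue.popleft()
        if form.islower():  # only terminal symbols remain
            words.add(form)
            continue
        for lhs, rhs in RULES:
            start = form.find(lhs)
            while start != -1:  # try the rule at every position it matches
                new = form[:start] + rhs + form[start + len(lhs):]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
                start = form.find(lhs, start + 1)
    return words


print(sorted(generate(9), key=len))
```

Up to length 9 the only terminal strings found have equal numbers of a's, b's and c's in that order, which matches the language of Example 1 in the earlier section.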
Other Forms of Generative Grammars
Many extensions and variations on Chomsky's original hierarchy of formal grammars have
been developed, both by linguists and by computer scientists, usually either in order to
increase their expressive power or in order to make them easier to analyse or parse.
Some forms of grammars developed include:
• Tree-adjoining grammars increase the expressiveness of conventional generative
grammars by allowing rewrite rules to operate on parse trees instead of just strings.
• Affix grammars and attribute grammars allow rewrite rules to be augmented with
semantic attributes and operations, useful both for increasing grammar expressiveness
and for constructing practical language translation tools.
• Analytic Grammars
THE CHOMSKY HIERARCHY
The Chomsky hierarchy is a hierarchy of the classes of formal grammars.
The Chomsky hierarchy, as originally defined by Noam Chomsky in 1956, comprises
four types of languages, their associated grammars, and the types of machines
that recognize them.
Table 1 : Chomsky Hierarchy
• The Unrestricted grammars are classified as Type 0.
• Type 1 grammars generate context-sensitive languages.
• Type 2 grammars generate context-free languages and
• Type 3 grammars generate regular languages.
PARSING
A grammar can be used in two ways:
(a) Using the grammar to generate strings of the language.
(b) Using the grammar to recognize the strings.
“Parsing” a string is finding a derivation (or a derivation tree) for that string.
Parsing a string is like recognizing a string. The only realistic way to recognize a string
of a context-free grammar is to parse it.
CONCLUSION
In this lecture, you have been introduced to the concept of formal grammars. Grammars
are very important in the field of automata theory since they are the building blocks of
languages.
SELF EXERCISE
1. What do you understand by grammars?
2. Give examples of Context-Free Grammar
3. Distinguish among the following grammar types:
a. Regular Grammars
b. Context-Free Grammars
c. Analytical Grammars.
4. Discuss the Chomsky hierarchy.
What is the relationship amongst the various types of grammars described in the Chomsky
hierarchy?
COURSE OUTLINE
FINITE AUTOMATA
• NFA
• Regular Expressions
• Regular Languages
• Two-way finite automata
• Finite automata with output
Definition: A Nondeterministic Finite Automaton (NFA) is also
defined by a 5-tuple
An NFA differs from a DFA in that the range of δ in an NFA is the power set 2^Q. A string is accepted by
an NFA if there is some sequence of possible moves that will put the machine in a final state at the
end of the string.
Example 1: Obtain an NFA for a language consisting of all strings over {0,1} containing a 1 in
the third position from the end.
Solution:
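One possible NFA for Example 1 is sketched below in Python. It is an assumption made for illustration, not necessarily the machine intended by the lecture: state q0 loops on every symbol and, on reading a 1, may also "guess" that this 1 is third from the end by moving to q1; two further symbols then lead through q2 to the accepting state q3. The names `DELTA`, `START`, `FINAL` and `accepts` are hypothetical.

```python
from itertools import product

# Transition function: (state, symbol) -> set of possible next states.
# Missing entries mean "no move" (the empty set).
DELTA = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},  # nondeterministic guess on a 1
    ("q1", "0"): {"q2"}, ("q1", "1"): {"q2"},
    ("q2", "0"): {"q3"}, ("q2", "1"): {"q3"},
}
START, FINAL = "q0", {"q3"}


def accepts(w):
    """Track the set of all states the NFA could be in after each symbol."""
    current = {START}
    for symbol in w:
        current = set().union(*(DELTA.get((q, symbol), set()) for q in current))
    return bool(current & FINAL)


# Compare with the direct definition on every binary string up to length 7.
for n in range(8):
    for symbols in product("01", repeat=n):
        w = "".join(symbols)
        assert accepts(w) == (len(w) >= 3 and w[-3] == "1")
print("NFA agrees with the definition on all strings up to length 7")
```

The machine needs only four states because it never has to remember more than its single guess, whereas a deterministic machine must remember the last three symbols read.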
Example 2: Determine an NFA accepting the language
Solution:
We shall come back to NFA later
REGULAR EXPRESSION
Regular Languages.
The regular languages are those languages that can be constructed from the “big three” set
operations viz., (a) Union (b) Concatenation (c) Kleene star. A regular language is defined as
follows.
Definition: Let Σ be an alphabet. The class of “regular languages” over Σ is defined inductively
as follows:
Regular Expressions:
Regular expressions are designed to represent regular languages with a mathematical
tool, a tool built from a set of primitives and operations. This representation involves
a combination of strings of symbols from some alphabet Σ, parentheses and the
operators +, ⋅ and *. A regular expression is obtained from the symbols of the alphabet
(e.g. {a, b, c}), the empty string ε, and the empty set ∅ with the operations +, ⋅ and * (i.e. union,
concatenation and Kleene star).
Examples:
0 + 1 represents the set {0, 1}
1 represents the set {1}
0 represents the set {0}
(0 + 1) 1 represents the set {01, 11}
(a + b ).(b + c) represents the set {ab, bb, ac, bc}
(0 + 1)* = ε + (0 + 1) + (0 + 1)(0 + 1) + … = Σ*
(0 + 1)+ = (0 + 1)(0 + 1)* = Σ+ = Σ* − {ε}
Building Regular Expressions
Assume that Σ = {a, b, c}
a* means “zero or more instances of a concatenated together”, so a* = {λ, a, aa, aaa, …}
To say “zero or more ab’s”, we write (ab)* = {λ, ab, abab, …}.
Languages defined by Regular Expressions
There is a very simple correspondence between regular expressions and the languages they denote:
TWO-WAY FINITE AUTOMATA
Two-way finite automata are machines that can read the input string in either direction.
This type of machine has a “read head”, which can move left or right over the input
string. Like ordinary finite automata, two-way finite automata have a finite set Q of
states, and they can be either deterministic (2DFA) or nondeterministic (2NFA). Like
ordinary finite automata, they accept only regular sets. Let us assume that the
symbols of the input string occupy cells of a finite tape, one symbol per cell.
The left and right end markers ⊢ and ⊣ enclose the input string. The
end markers are not part of the input alphabet Σ.
Definition:
A 2DFA is an octuple M = (Q, Σ, ⊢, ⊣, δ, s, t, r)
where Q is a finite set of states,
Σ is a finite input alphabet,
⊢ is the left end marker, ⊢ ∉ Σ,
⊣ is the right end marker, ⊣ ∉ Σ,
δ: Q × (Σ ∪ {⊢, ⊣}) → (Q × {L, R}) is the transition function,
s ∈ Q is the start state,
t ∈ Q is the accept state, and
r ∈ Q is the reject state, r ≠ t.
δ takes a state and a symbol as arguments and returns a new state
and a direction to move the head i.e., if δ(p, b) = (q, d), then
whenever the machine is in state p and scanning a tape cell
containing symbol b, it moves its head one cell in the direction d and
enters the state q.
FINITE AUTOMATA WITH OUTPUT
Definition: A finite-state machine M = (Q, Σ, O, δ, λ, q0) consists of a finite set Q of states, a finite
input alphabet Σ, a finite output alphabet O, a transition function δ that assigns to each state and input
pair a new state, an output function λ that assigns to each state and input pair an output, and an initial
state q0. Let M = (Q, Σ, O, δ, λ, q0) be a finite-state machine. A state table is used to denote the
values of the transition function δ and the output function λ for all pairs of states and inputs.
Mealy Machine: Usually finite automata have binary output, i.e., they either accept the string or do not
accept it. This is decided on the basis of whether a final state is reached from the initial
state. Removing this restriction, we consider a model where the outputs can be chosen from
some other alphabet.
The value of the output function F(t) in the most general case is a function of the present state q(t) and
present input x(t):
F(t) = λ(q(t), x(t))
where λ is called the output function. This model is called the “Mealy machine”.
A Mealy machine is a six-tuple (Q, Σ, O, δ, λ, q0) where all the symbols except λ have the same meaning
as discussed in the section above. λ is the output function mapping Q × Σ into O.
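As a concrete illustration of F(t) = λ(q(t), x(t)), here is a minimal Python sketch of a Mealy machine. The particular machine is a hypothetical example, not one from the lecture: with Σ = O = {0, 1}, it outputs 1 exactly when the current input 1 immediately follows a 0. State A means "the previous symbol was not 0" and state B means "the previous symbol was 0".

```python
# Transition function δ: (state, input) -> next state.
DELTA = {
    ("A", "0"): "B", ("A", "1"): "A",
    ("B", "0"): "B", ("B", "1"): "A",
}
# Output function λ: (state, input) -> output symbol.
LAMBDA = {
    ("A", "0"): "0", ("A", "1"): "0",
    ("B", "0"): "0", ("B", "1"): "1",
}


def run(inputs, state="A"):
    """Feed the input string to the machine and collect one output per input."""
    outputs = []
    for x in inputs:
        outputs.append(LAMBDA[(state, x)])  # output depends on state and input
        state = DELTA[(state, x)]           # then the machine changes state
    return "".join(outputs)


print(run("0011"))
```

The two dictionaries together play the role of the state table mentioned above: one row per state, with a next state and an output for each input symbol.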
DETERMINISTIC FINITE AUTOMATA – DFA
What is an Automaton? An automaton is an abstract model of a digital computer.
It has a mechanism to read input, which is a string over a given alphabet. This
input is placed on an “input file”, which the automaton can read but cannot
change.
The input file is divided into cells, each of which can hold
one symbol. The automaton has a temporary “storage”
device, which has an unlimited number of cells, the contents of
which can be altered by the automaton. The automaton has a
control unit, which is said to be in one of a finite number of
“internal states”. The automaton can change state in a
defined way.
A model of Automaton
Types of Automaton: We have two types of automaton:
(a) Deterministic Automata
(b) Non-deterministic Automata
A deterministic automaton is one in which each move (i.e. transition
from one state to another) is uniquely determined by the current configuration.
If the internal state, input and contents of the storage are known, it is
possible to predict the future behaviour of the automaton. Such an
automaton is said to be deterministic; otherwise it is
non-deterministic.
Definition:
consecutive b’s that either began the input string or was preceded by an ‘a’.
(ii) If an ‘a’ is read and M is in state q0, q1, or q2, M returns to its initial state q0.
q0, q1 and q2 are “final states” (as given in the problem). Therefore any input string
not containing three consecutive b’s will be accepted. If we get three consecutive b’s,
then state q3 is reached (which is not a final state), and M will
remain in this state irrespective of any other symbol in the rest of the string. This state q3
is said to be a “dead state”, or M is said to be “trapped” at q3.
From the given table for δ, the DFA is drawn, where q2 is the only final state.
(Note that a DFA “accepts” a string and “recognizes” a
language. The catch here is that “accept” is used for strings and
“recognize” for languages.)
It can be seen that the DFA accepts strings that have at least one 1 and an even
number of 0s following the last 1. Hence the language L is given by
Example 3: Sketch the DFA given
and δ is given by
δ(q1, 0) = q1
δ(q2, 0) = q1
δ(q1, 1) = q2
δ(q2, 1) = q2
Automaton being
L = {a^n b : n ≥ 0}
Solution:
Therefore the DFA accepts all strings consisting of an
arbitrary number of a’s, followed by a single b. All other
input strings are rejected.
Example 5: Obtain the state table and state transition diagram
(DFA schematic) of the finite state automaton M = (Q, Σ, δ,
q0, F),
where Q = {q0, q1, q2, q3}, Σ = {a, b}, q0 is the initial state, F is
the set of final states, with the transitions defined by
δ(q0,a) = q2 δ(q3,a) = q1 δ(q2,b) = q3 δ(q2,a) = q0
(1)
(i) (ii)
(2) *Construct a finite state machine that accepts only positive integers that are evenly
divisible by 4. Hint: Use Σ = {0, 1}.
* Intermediate
Language of a DFA
A DFA A accepts string w if there is a path from q0 to an accepting (or final) state that is
labeled by w
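The path-based definition of acceptance can be illustrated with the DFA discussed earlier that rejects any string containing three consecutive b's. The transition table below is reconstructed from that description and should be read as an assumption; the helper names `path` and `accepts` are hypothetical.

```python
from itertools import product

# Assumed transitions: reading 'a' returns to q0, each 'b' advances one
# state, and q3 is the dead state reached after three consecutive b's.
DELTA = {
    ("q0", "a"): "q0", ("q0", "b"): "q1",
    ("q1", "a"): "q0", ("q1", "b"): "q2",
    ("q2", "a"): "q0", ("q2", "b"): "q3",
    ("q3", "a"): "q3", ("q3", "b"): "q3",
}
START, FINAL = "q0", {"q0", "q1", "q2"}


def path(w):
    """Return the unique sequence of states visited while reading w."""
    states = [START]
    for symbol in w:
        states.append(DELTA[(states[-1], symbol)])
    return states


def accepts(w):
    """w is accepted if its path from q0 ends in a final state."""
    return path(w)[-1] in FINAL


print(path("abba"), accepts("abba"))
print(path("abbba"), accepts("abbba"))
```

Because the automaton is deterministic, each string labels exactly one path, so "there is a path" and "the path" coincide.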
Non-deterministic Finite Automata (NFA)
A Non-deterministic Finite Automaton (NFA) is of course “non-deterministic”,
implying that the machine can exist in more than one state at the same time.
Transitions could be non-deterministic: from a state qi, the same input symbol (here 1)
can lead to qj or to qk.
• Each transition function therefore maps to a set of states.
Non-deterministic Finite Automata (NFA)
A Non-deterministic Finite Automaton (NFA) consists of:
Q ==> a finite set of states
∑ ==> a finite set of input symbols (alphabet)
q0 ==> a start state
F ==> set of accepting states
δ ==> a transition function, which is a mapping Q x ∑ ==> subset of Q
An NFA is also defined by the 5-tuple:
{Q, ∑, q0, F, δ}
How to use an NFA?
Input: a word w in ∑*
Question: Is w acceptable by the NFA?
Steps:
Start at the “start state” q0
For every input symbol in the sequence w do
Determine all possible next states from all current states, given the current input symbol in w and the
transition function
If, after all symbols in w are consumed, at least one of the current states is a final state, then
accept w;
Otherwise, reject w.
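The steps above can be sketched directly in Python. The NFA used here is an assumed machine for "strings containing 01" over {0, 1}; in particular the transitions out of q0 are an assumption made for this illustration, and the names `trace` and `accepts` are hypothetical.

```python
# Assumed NFA: q0 loops on both symbols and may move to q1 on 0;
# q1 moves to q2 on 1; q2 is accepting and loops on both symbols.
DELTA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
    ("q2", "0"): {"q2"},
    ("q2", "1"): {"q2"},
}
START, FINAL = "q0", {"q2"}


def trace(w):
    """Return the set of current states before and after each input symbol."""
    current = frozenset({START})
    history = [current]
    for symbol in w:
        following = set()
        for state in current:
            # All possible next states from every current state.
            following |= DELTA.get((state, symbol), set())
        current = frozenset(following)
        history.append(current)
    return history


def accepts(w):
    """Accept if at least one of the final current states is in F."""
    return bool(trace(w)[-1] & FINAL)


for step in trace("1001"):
    print(sorted(step))
print(accepts("1001"), accepts("110"))
```

A state with no entry for the current symbol simply contributes nothing to the next set, which is how the simulation handles transitions that are left undefined.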
NFA for strings containing 01
Regular expression: (0+1)*01(0+1)*
… an input of 0 is received?
Transition table (Φ denotes the empty set; * marks the accepting state):
states    0       1
q1        Φ       {q2}
*q2       {q2}    {q2}
Note: Omitting to explicitly show error states is just a matter of design convenience
(one that is generally followed for NFAs), i.e., this feature should not be confused
with the notion of non-determinism.
What is an “error state”?
A DFA for recognizing the key word “while”
q0 --w--> q1 --h--> q2 --i--> q3 --l--> q4 --e--> q5
Language of an NFA
An NFA accepts w if there exists at least one path from the start state to an accepting (or
final) state that is labeled by w
L(N) = { w | δ(q0,w) ∩ F ≠ Φ }
Advantages of NFA
Great for modeling regular expressions
String processing - e.g., grep, lexical analyzer
Technologies for NFAs
But DFAs and NFAs are equivalent in their power to capture languages!
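One standard way to see this equivalence is the subset construction, in which each DFA state is a set of NFA states. The sketch below is a minimal illustration, not taken from the lecture: the example NFA (for strings containing 01) and all function names are assumptions, and only subsets reachable from the start state are built.

```python
# Assumed NFA for "contains 01": (state, symbol) -> set of next states.
NFA_DELTA = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0"},
    ("q1", "1"): {"q2"},
    ("q2", "0"): {"q2"},
    ("q2", "1"): {"q2"},
}


def subset_construction(delta, start, finals, alphabet):
    """Build a DFA whose states are the reachable sets of NFA states."""
    dfa_start = frozenset({start})
    dfa_delta = {}
    seen = {dfa_start}
    todo = [dfa_start]
    while todo:
        subset = todo.pop()
        for symbol in alphabet:
            target = frozenset(
                nxt for q in subset for nxt in delta.get((q, symbol), ())
            )
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset is accepting if it contains at least one accepting NFA state.
    dfa_finals = {subset for subset in seen if subset & finals}
    return dfa_delta, dfa_start, dfa_finals


def dfa_accepts(w, dfa_delta, dfa_start, dfa_finals):
    """Run the constructed DFA: exactly one state at every step."""
    state = dfa_start
    for symbol in w:
        state = dfa_delta[(state, symbol)]
    return state in dfa_finals


DFA = subset_construction(NFA_DELTA, "q0", {"q2"}, "01")
print(len({state for state, _ in DFA[0]}), "reachable DFA states")
print(dfa_accepts("1101", *DFA), dfa_accepts("1110", *DFA))
```

In the worst case the construction can produce exponentially many subsets, which is one reason an NFA can be much smaller than an equivalent DFA.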
Differences: DFA vs. NFA
DFA
1. All transitions are deterministic. Each transition leads to exactly one state.
2. For each state, a transition on every possible symbol (alphabet) should be defined.
3. Accepts input if the last state visited is in F.
4. Sometimes harder to construct because of the number of states.
5. Practical implementation is feasible.
NFA
1. Some transitions could be non-deterministic. A transition could lead to a subset of states.
2. Not all symbol transitions need to be defined explicitly (if undefined, the machine goes to
an error state; this is just a design convenience, not to be confused with “non-determinism”).
3. Accepts input if one of the last states is in F.
4. Generally easier than a DFA to construct.
5. Practical implementations limited but emerging (e.g., Micron automata processor).
Regular Expressions
Reading: Chapter 3