
Patterns and Finite Automata Explained

This document discusses patterns, regular expressions, and finite automata (FAs). Patterns are strings that represent sets of strings and are defined inductively. Regular expressions are patterns built from atomic patterns, concatenation, union, and Kleene star, and the languages they represent are shown to be exactly the languages accepted by FAs. The document presents algorithms that reduce an NFA to a regular expression by merging edges and removing states one by one until one or two states remain, at which point the regular expression is readily obtained.

Uploaded by

jaansyda
Copyright
© Attribution Non-Commercial (BY-NC)

Formal Language and Automata Theory

Chapter 4
Patterns, Regular Expressions and Finite Automata
(includes lectures 7, 8, 9)

Transparency No. 4-1

Patterns and their defined languages

Patterns, regular expression & FAs

Σ: a finite alphabet.
A pattern is a string of symbols representing a set of strings in Σ*. The set of all patterns is defined inductively as follows:
1. Atomic patterns: a ∈ Σ, ε, ∅, #, @.
2. Compound patterns: if α and β are patterns, then so are α + β, α ∩ β, α*, α+, ~α and αβ.
For each pattern α, L(α) is the language represented by α and is defined inductively as follows:
1. L(a) = {a}, L(ε) = {ε}, L(∅) = {}, L(#) = Σ, L(@) = Σ*.
2. If L(α) and L(β) have been defined, then L(α + β) = L(α) ∪ L(β), L(α ∩ β) = L(α) ∩ L(β), L(α+) = L(α)+, L(α*) = L(α)*, L(~α) = Σ* − L(α), L(αβ) = L(α) L(β).
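The inductive definition of L can be tried out directly on a computer. The following is a minimal sketch, not part of the original slides: it assumes the alphabet {a, b}, encodes patterns as nested tuples (a hypothetical encoding chosen for this example), and computes each language only up to strings of length MAX_LEN, since Σ* itself is infinite.

```python
from itertools import product

SIGMA = "ab"   # assumed alphabet for this sketch
MAX_LEN = 4    # only strings up to this length are enumerated

# All strings over SIGMA of length <= MAX_LEN: a finite stand-in for Sigma*.
UNIVERSE = {"".join(t) for n in range(MAX_LEN + 1) for t in product(SIGMA, repeat=n)}

def concat(A, B):
    """Concatenation of two languages, truncated to MAX_LEN."""
    return {x + y for x in A for y in B if len(x + y) <= MAX_LEN}

def star(A):
    """Kleene star of A, truncated to MAX_LEN."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = concat(frontier, A) - result
        result |= frontier
    return result

def lang(p):
    """Return L(p) restricted to UNIVERSE; p is a nested tuple (hypothetical encoding)."""
    op = p[0]
    if op == "sym":   return {p[1]}          # atomic pattern a
    if op == "eps":   return {""}            # atomic pattern epsilon
    if op == "empty": return set()           # atomic pattern for the empty set
    if op == "#":     return set(SIGMA)      # any single symbol
    if op == "@":     return set(UNIVERSE)   # any string
    if op == "+":     return lang(p[1]) | lang(p[2])
    if op == "&":     return lang(p[1]) & lang(p[2])
    if op == ".":     return concat(lang(p[1]), lang(p[2]))
    if op == "*":     return star(lang(p[1]))
    if op == "~":     return UNIVERSE - lang(p[1])
    if op == "plus":
        A = lang(p[1])
        return concat(A, star(A))
    raise ValueError(f"unknown operator: {op}")
```

For instance, `lang(("*", ("#",)))` and `lang(("@",))` return the same set, and `lang(("&", ("#",), ("~", ("sym", "a"))))` returns `{"b"}`. Truncating at MAX_LEN is safe for these operators because a string of length at most MAX_LEN can only be built from pieces that are themselves at most that long.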
Transparency No. 4-2

More on patterns


We say that a string x matches a pattern α iff x ∈ L(α). Some examples:
1. Σ* = L(@) = L(#*)
2. L(x) = {x} for any x ∈ Σ*
3. For any x1,…,xn in Σ*, L(x1 + x2 + … + xn) = {x1, x2, …, xn}.
4. {x | x contains at least 3 a's} = L(@a@a@a@)
5. Σ − {a} = L(# ∩ ~a)
6. {x | x does not contain a} = L((# ∩ ~a)*)
7. {x | every a in x is followed sometime later by a b}
   = {x | either there is no a in x, or there is a b in x that is followed by no a}
   = L((# ∩ ~a)* + @b(# ∩ ~a)*)
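Example 7 can be sanity-checked by brute force. The sketch below is an illustration under stated assumptions, not part of the slides: it fixes Σ = {a, b, c}, so the subpattern for Σ − {a} from example 5 becomes the character class `[bc]` and @ becomes `[abc]*` in Python's `re` syntax, and it compares the pattern with the plain-English property on every string up to a chosen length. The function names are made up for this example.

```python
import re
from itertools import product

SIGMA = "abc"  # assumed alphabet for this sketch

# Example 7 translated to Python re syntax: "no a at all" or "some b followed by no a".
PATTERN = re.compile(r"[bc]*|[abc]*b[bc]*")

def every_a_followed_by_b(x):
    """The property stated in words: each a has a b somewhere to its right."""
    return all("b" in x[i + 1:] for i, ch in enumerate(x) if ch == "a")

def agree_up_to(max_len):
    """True if pattern and property agree on all strings of length <= max_len."""
    for n in range(max_len + 1):
        for t in product(SIGMA, repeat=n):
            x = "".join(t)
            if bool(PATTERN.fullmatch(x)) != every_a_followed_by_b(x):
                return False
    return True
```

A call such as `agree_up_to(6)` only tests finitely many strings, so it is evidence for the identity rather than a proof.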

Transparency No. 4-3

More on pattern matching


Some interesting and important questions:
1. How hard is it to determine if a given input string x matches a given pattern α? ==> An efficient algorithm exists.
2. Can every set be represented by a pattern? ==> No! The set {a^n b^n | n > 0} cannot be represented by any pattern.
3. How to determine if two given patterns α and β are equivalent (i.e., L(α) = L(β))? --- an exercise!
4. Which operations are redundant?
   ε = ~(#+ ∩ @) = ∅*
   α+ = αα*
   # = a1 + a2 + … + an if Σ = {a1,…,an}
   α + β = ~(~α ∩ ~β)
   α ∩ β = ~(~α + ~β)
   It can be shown that ~ is redundant.
Transparency No. 4-4

Equivalence of patterns, regular expr. & FAs


Recall that regular expressions are those patterns that can be built from: a ∈ Σ, ε, ∅, +, · (concatenation), and *.
Notational conventions:
α + βρ means α + (βρ)
α + β* means α + (β*)
αβ* means α(β*)
Theorem 8: Let A ⊆ Σ*. Then the following are equivalent:
1. A is regular (i.e., A = L(M) for some FA M),
2. A = L(α) for some pattern α,
3. A = L(β) for some regular expression β.
Pf: Trivial part: (3) => (2). (2) => (1) to be proved now! (1) => (3) later.
Transparency No. 4-5

(2) => (1) : Every set represented by a pattern is regular


Pf: By induction on the structure of pattern α.
Basis: α is atomic (by construction!):
1. α = a
2. α = ε
3. α = ∅
4. α = #
5. α = @ = #*
[FA diagrams for the five cases not reproduced; surviving edge labels: a; ε; a,b,c,…]

Transparency No. 4-6


Inductive cases: Let M1 and M2 be any FAs accepting L(β) and L(γ), respectively.
6. α = βγ: => L(α) = L(M1 · M2)
7. α = β*: => L(α) = L(M1*)
8. α = β + γ, α = ~β or α = β ∩ γ: By ind. hyp., L(β) and L(γ) are regular. Hence by closure properties of regular languages, L(α) is regular, too.
9. α = β+ = ββ*: Similar to case 8.
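Case 8 relies on closure properties that the slides cite without spelling out. As a hedged illustration, here are two standard constructions on deterministic FAs: complement by swapping final and non-final states, and intersection by the product construction. The tuple encoding (states, alphabet, transition dict, start, finals) and the two sample machines are assumptions made for this sketch. Both constructions assume total DFAs, meaning every state has a transition on every symbol.

```python
def complement(dfa):
    """DFA for the complement language: same machine, final states swapped."""
    states, sigma, delta, start, finals = dfa
    return (states, sigma, delta, start, states - finals)

def intersect(d1, d2):
    """Product construction: run both DFAs in lockstep, accept if both accept."""
    (q1, sigma, t1, s1, f1), (q2, _, t2, s2, f2) = d1, d2
    states = {(p, q) for p in q1 for q in q2}
    delta = {((p, q), a): (t1[p, a], t2[q, a]) for (p, q) in states for a in sigma}
    finals = {(p, q) for (p, q) in states if p in f1 and q in f2}
    return (states, sigma, delta, (s1, s2), finals)

def accepts(dfa, x):
    """Run the DFA on string x and report whether it ends in a final state."""
    _, _, delta, state, finals = dfa
    for a in x:
        state = delta[state, a]
    return state in finals

# Two sample machines over {a, b} (hypothetical, for illustration only).
EVEN_A = ({0, 1}, "ab", {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}, 0, {0})
ENDS_B = ({0, 1}, "ab", {(0, "a"): 0, (1, "a"): 0, (0, "b"): 1, (1, "b"): 1}, 0, {1})
```

With these machines, `accepts(intersect(EVEN_A, ENDS_B), "aab")` holds because the string has two a's and ends in b. Union then follows from the identity α + β = ~(~α ∩ ~β).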

Transparency No. 4-7

Some example patterns & their equivalent FAs

1. (aaa)* + (aaaaa)*
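The FA diagram for this pattern did not survive extraction. One possible NFA, offered as a sketch rather than as the machine drawn on the slide, uses two disjoint cycles: one counting a's modulo 3 and one modulo 5, each with its own start state. State names such as ("3", 0) are invented for this example.

```python
import re

# NFA with two disjoint cycles: ("3", i) counts a's modulo 3, ("5", i) modulo 5.
START = {("3", 0), ("5", 0)}
FINAL = {("3", 0), ("5", 0)}

def step(states, symbol):
    """Advance every active state by one symbol; only the symbol a is allowed."""
    if symbol != "a":
        return set()
    return {(name, (i + 1) % int(name)) for (name, i) in states}

def nfa_accepts(x):
    """Simulate the NFA by tracking the set of currently active states."""
    current = set(START)
    for symbol in x:
        current = step(current, symbol)
    return bool(current & FINAL)

# The same language written for Python's re module, with + written as |.
REFERENCE = re.compile(r"(aaa)*|(aaaaa)*")
```

Comparing `nfa_accepts("a" * n)` with `REFERENCE.fullmatch("a" * n)` for small n is a quick way to check that the two descriptions agree.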

Transparency No. 4-8

(1) => (3): Regular languages can be represented by reg. expr.

M = (Q, Σ, δ, S, F): an NFA; X ⊆ Q: a set of states; m, n ∈ Q: two states.

π_X(m,n) =def {y ∈ Σ* | ∃ a path from m to n labeled y and all intermediate states ∈ X}.
Note: L(M) = ?
π_X(m,n) can be shown to be representable by a regular expr., by induction as follows:
Let Δ(m,n) = {a | (m --a--> n) ∈ δ} = {a1,…,ak} (k ≥ 0) = the set of symbols by which we can reach from m to n. Then
Basic case: X = ∅:
1.1 if m ≠ n: π_∅(m,n) = {a1, a2, …, ak} = L(a1 + a2 + … + ak) if k > 0,
    = {} = L(∅) if k = 0.
1.2 if m = n: π_∅(m,n) = {a1, a2, …, ak, ε} = L(a1 + a2 + … + ak + ε) if k > 0,
    = {ε} = L(ε) if k = 0.
Transparency No. 4-9

Continue.


2. Inductive case: for nonempty X, let q be any state in X. Then:
π_X(m,n) = π_{X−{q}}(m,n) ∪ π_{X−{q}}(m,q) (π_{X−{q}}(q,q))* π_{X−{q}}(q,n).

By ind. hyp. (why?), there are regular expressions α, β, γ, ρ with
L([α, β, γ, ρ]) = [π_{X−{q}}(m,n), π_{X−{q}}(m,q), π_{X−{q}}(q,q), π_{X−{q}}(q,n)].
Hence π_X(m,n) = L(α) ∪ L(β) L(γ)* L(ρ) = L(α + βγ*ρ) and can be represented as a reg. expr.
Finally, L(M) = {x | s --x--> f, s ∈ S, f ∈ F} = ∪_{s∈S, f∈F} π_Q(s,f) is representable by a regular expression.
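The induction translates almost line by line into a recursive program. The following is a minimal sketch under several assumptions of its own: the transition relation is given as a dict mapping a pair of states (m, n) to the set of symbols leading from m to n, regular expressions are emitted in Python `re` syntax (| for +, the empty string for ε), and None stands for the empty-set pattern. The function names are hypothetical. The recursion makes four calls per removed state, so it is only practical for very small machines.

```python
import re

EMPTY = None  # stands for the pattern denoting the empty set

def union(r, s):
    if r is EMPTY: return s
    if s is EMPTY: return r
    return r if r == s else f"(?:{r}|{s})"

def concat(r, s):
    return EMPTY if r is EMPTY or s is EMPTY else r + s

def star(r):
    return "" if r in (EMPTY, "") else f"(?:{r})*"

def path_regex(delta, X, m, n):
    """Regex for the labels of paths m -> n whose intermediate states lie in X."""
    X = frozenset(X)
    if not X:                                   # basic case: X is empty
        r = EMPTY
        for a in sorted(delta.get((m, n), ())):
            r = union(r, re.escape(a))
        return union(r, "") if m == n else r    # add epsilon when m == n
    q = min(X)                                  # "let q be any state in X"
    Y = X - {q}
    through_q = concat(path_regex(delta, Y, m, q),
                       concat(star(path_regex(delta, Y, q, q)),
                              path_regex(delta, Y, q, n)))
    return union(path_regex(delta, Y, m, n), through_q)

def nfa_regex(delta, states, starts, finals):
    """Union of path_regex over all (start, final) pairs, with X = all states."""
    r = EMPTY
    for s in sorted(starts):
        for f in sorted(finals):
            r = union(r, path_regex(delta, states, s, f))
    return r
```

As a usage example, take a hypothetical two-state machine with `delta = {("s", "s"): {"a"}, ("s", "t"): {"b"}, ("t", "t"): {"a"}}`. Then `nfa_regex(delta, {"s", "t"}, {"s"}, {"t"})` yields an unsimplified but valid expression for the strings over {a, b} containing exactly one b.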

Transparency No. 4-10

Some examples


Example (9.3): M is given by the transition table

        0      1
>pF    {p}    {q}
 q     {r}    {}
 r     {p}    {q}

L(M) = π_{p,q,r}(p,p) = π_{p,r}(p,p) + π_{p,r}(p,q) (π_{p,r}(q,q))* π_{p,r}(q,p)
π_{p,r}(p,p) = ?
π_{p,r}(p,q) = ?
π_{p,r}(q,q) = ?
π_{p,r}(q,p) = ?

Hence L(M) = ?

Transparency No. 4-11

Another approach


The previous method is easy to prove correct and easy to implement on a computer, but hard for human computation. The strategy of the new method: reduce the number of states in the target FA and encode path information by regular expressions on the edges, until only one or two states remain: one is the start state and one is the final state.

Transparency No. 4-12

Steps


0. Assume the machine M has only one start state and one final state. The two may be identical.
1. While there exists a third state p that is neither start nor final:
   1.1 (Merge edges) For each pair of states (q,r) joined by more than one edge, with labels τ1, τ2, …, τn respectively, merge these edges into a single new edge labeled with the regular expression τ = τ1 + τ2 + … + τn.
   1.2 (Replace state p by edges; remove state) Let (p1, α1, p), …, (pn, αn, p), where pi ≠ p, be the collection of all edges in M with p as the destination state, and let (p, β1, q1), …, (p, βm, qm), where qj ≠ p, be the collection of all edges with p as the start state. Let τ be the label of the self-loop on p; if p has no self-loop, read τ* as ε. Now the state p together with all its connecting edges can be removed and replaced by a set of m x n new edges: { (pi, αi τ* βj, qj) | i in [1,n] and j in [1,m] }. The new machine is equivalent to the old one.
Transparency No. 4-13


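Merge edges: parallel edges labeled α, β, γ between the same pair of states are replaced by a single edge labeled α+β+γ.

Replace state by edges: state p has incoming edges α1, α2, α3 from p1, p2, p3, a self-loop γ, and outgoing edges β1, β2 to q1, q2. After p is removed, the new edges are:
p1 → q1: α1γ*β1    p1 → q2: α1γ*β2
p2 → q1: α2γ*β1    p2 → q2: α2γ*β2
p3 → q1: α3γ*β1    p3 → q2: α3γ*β2

Note: {p1,p2,p3} may intersect with {q1,q2}.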

Transparency No. 4-14


2. Perform 1.1 once again (merge edges). // There are one or two states now.
3. Two cases to consider:
3.1 The final machine has only one state, which is both start and final. If there is an edge labeled τ on the state, then τ* is the result; otherwise the result is ε.
3.2 The machine has one start state s and one final state f. Let (s, ss, s), (f, ff, f), (s, sf, f) and (f, fs, s) be the collection of all edges in the machine, where (sf) means the regular expression or label on the edge from s to f. The result then is
[ (ss) + (sf) (ff)* (fs) ]* (sf) (ff)*
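Steps 0 to 3 can be sketched in code as follows. This is an illustrative implementation under stated assumptions, not the author's program: edges are given as (source, label, target) triples whose labels are single symbols, expressions are emitted in Python `re` syntax (| for +, the empty string for ε), None stands for a missing edge, and the function name `eliminate` is made up. Because the edge dictionary keeps exactly one label per ordered pair of states, merging (step 1.1) happens automatically whenever an edge is added.

```python
import re

def union(r, s):
    if r is None: return s
    if s is None: return r
    return f"(?:{r}|{s})"

def concat(*parts):
    return None if any(p is None for p in parts) else "".join(parts)

def star(r):
    return "" if r is None else f"(?:{r})*"

def eliminate(edge_list, states, s, f):
    """Return a regex for the NFA with start state s and final state f, or None."""
    # Step 1.1: merge parallel edges into one label per ordered pair of states.
    E = {}
    for u, label, v in edge_list:
        E[u, v] = union(E.get((u, v)), label)
    # Step 1.2: remove every state other than s and f, rerouting paths through it.
    for p in [x for x in states if x not in (s, f)]:
        loop = star(E.get((p, p)))
        ins = [(u, r) for (u, v), r in E.items() if v == p and u != p]
        outs = [(v, r) for (u, v), r in E.items() if u == p and v != p]
        E = {k: r for k, r in E.items() if p not in k}
        for u, a in ins:
            for v, b in outs:
                E[u, v] = union(E.get((u, v)), concat(a, loop, b))
    # Step 3: read the result off the remaining one or two states.
    ss, sf, fs, ff = (E.get(k) for k in [(s, s), (s, f), (f, s), (f, f)])
    if s == f:
        return star(ss)                                   # case 3.1
    back = concat(sf, star(ff), fs)
    return concat(star(union(ss, back)), sf, star(ff))    # case 3.2
```

For a hypothetical machine with edges `[("s", "a", "m"), ("m", "b", "m"), ("m", "c", "f"), ("s", "c", "f")]`, the call `eliminate(edges, ["s", "m", "f"], "s", "f")` removes m and returns an expression for the language ab*c + c. The order in which states are removed affects the size of the result but not the language it denotes.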
Transparency No. 4-15

Example

        0        1
>p     {p,r}    {q,r}
 q     {r}      {p,q,r}
 rF    {p,q}    {q,r}

1. Another representation (state diagram), listed by edge:
p → p: 0      p → q: 1      p → r: 0,1
q → p: 1      q → q: 1      q → r: 0,1
r → p: 0      r → q: 0,1    r → r: 1
Transparency No. 4-16

Merge edges

Each pair of parallel edges labeled 0,1 is merged into a single edge labeled 0+1:
p → p: 0      p → q: 1      p → r: 0+1
q → p: 1      q → q: 1      q → r: 0+1
r → p: 0      r → q: 0+1    r → r: 1
Transparency No. 4-17

remove q

Edges into q: p → q: 1 and r → q: 0+1. Self-loop on q: 1. Edges out of q: q → p: 1 and q → r: 0+1.
After removing q, each pair of remaining states has its old edge plus a new edge through q:
p → p: 0, 11*1
p → r: 0+1, 11*(0+1)
r → p: 0, (0+1)1*1
r → r: 1, (0+1)1*(0+1)

Transparency No. 4-18

Form the final result

After merging edges once more, two states remain, >p and rF:
p → p: 0+11*1
p → r: 0+1+11*(0+1)
r → p: 0+(0+1)1*1
r → r: 1+(0+1)1*(0+1)

Final result: = [ (pp) + (pr) (rr)* (rp) ]* (pr) (rr)*

= [ (0+11*1) + (0+1+11*(0+1)) (1+(0+1)1*(0+1))* (0+(0+1)1*1) ]* (0+1+11*(0+1)) (1+(0+1)1*(0+1))*
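A hand-derived expression of this size is easy to get wrong, so a brute-force comparison against the original NFA is a useful check. The sketch below is an addition to the slides: it assumes the transition table of the example as reconstructed here, rewrites the final result in Python `re` syntax with + written as |, and compares the two on every binary string up to a chosen length. The names DELTA, RESULT and agree_up_to are invented for this example.

```python
import re
from itertools import product

# Transition table of the example NFA: DELTA[state][symbol] = set of next states.
DELTA = {
    "p": {"0": {"p", "r"}, "1": {"q", "r"}},
    "q": {"0": {"r"},      "1": {"p", "q", "r"}},
    "r": {"0": {"p", "q"}, "1": {"q", "r"}},
}
START, FINAL = "p", "r"

# The four edge labels of the two-state machine, with + written as |.
PP = "(0|11*1)"
PR = "(0|1|11*(0|1))"
RR = "(1|(0|1)1*(0|1))"
RP = "(0|(0|1)1*1)"

# [ (pp) + (pr)(rr)*(rp) ]* (pr)(rr)*
RESULT = re.compile(f"({PP}|{PR}{RR}*{RP})*{PR}{RR}*")

def nfa_accepts(x):
    """Simulate the NFA by tracking the set of currently active states."""
    current = {START}
    for symbol in x:
        current = set().union(*(DELTA[state][symbol] for state in current))
    return FINAL in current

def agree_up_to(max_len):
    """True if the NFA and the regular expression agree on all short strings."""
    return all(
        nfa_accepts("".join(t)) == bool(RESULT.fullmatch("".join(t)))
        for n in range(max_len + 1)
        for t in product("01", repeat=n)
    )
```

Agreement on all strings up to some length, for example `agree_up_to(8)`, does not prove equivalence, but a mistake in merging or state removal would very likely show up on a short string.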

Transparency No. 4-19
