Regular Expressions
The primitive regular expressions over an alphabet Σ are:
• x, for each x ∈ Σ,
• ε, the empty string, and
• ∅, indicating no strings at all.
Thus, if |Σ| = n, then there are n+2 primitive regular expressions defined over Σ.
• For each x ∈ Σ, the primitive regular expression x denotes the language {x}.
That is, the only string in the language is the string "x".
• The primitive regular expression ε denotes the language {ε}. The only
string in this language is the empty string.
• The primitive regular expression ∅ denotes the language {}. There are no
strings in this language.
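The difference between {ε} (one string) and {} (no strings) is easy to blur, so here is a minimal Python sketch of the primitives and the languages they denote. The function name `primitive_languages` and the use of the keys "ε" and "∅" are illustrative choices, and the sketch assumes that neither of those two characters is itself a symbol of Σ.

```python
def primitive_languages(sigma):
    """Map each primitive regular expression over sigma to the language it denotes."""
    langs = {x: {x} for x in sigma}   # the primitive x denotes {x}
    langs["ε"] = {""}                 # ε denotes {ε}: exactly one string, the empty one
    langs["∅"] = set()                # ∅ denotes {}: no strings at all
    return langs

langs = primitive_languages({"a", "b", "c"})
print(len(langs))        # 5, that is n + 2 for n = 3
print(len(langs["ε"]))   # 1
print(len(langs["∅"]))   # 0
```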
Precedence: * binds most tightly, then juxtaposition (concatenation), then +. For example, a+bc* is read as a+(b(c*)) and denotes the
language {a, b, bc, bcc, bccc, bcccc, ...}.
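The precedence rule can be tried out with Python's `re` module, which uses the same ordering but writes union as `|` instead of +. The helper `matches_up_to` is a hypothetical name, and the alphabet {a, b, c} and the length bound are assumptions made only to keep the enumeration small.

```python
import re
from itertools import product

def matches_up_to(pattern, alphabet="abc", max_len=4):
    """List every string over `alphabet` of length <= max_len that the pattern matches in full."""
    rx = re.compile(pattern)
    return ["".join(t)
            for n in range(max_len + 1)
            for t in product(alphabet, repeat=n)
            if rx.fullmatch("".join(t))]

# a+bc* parses as a + (b(c*)).
print(matches_up_to(r"a|bc*"))     # ['a', 'b', 'bc', 'bcc', 'bccc']

# Parenthesizing the union first gives a different language, one that includes "ac".
print(matches_up_to(r"(a|b)c*"))
```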
Examples (over Σ = {a, b, c}):
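All strings containing no more than three a's.
The expression (ε+a)(ε+a)(ε+a) describes zero, one, two, or three a's and nothing else.
Now we want to allow arbitrary strings not containing a's at the places
marked by X's:
X(ε+a)X(ε+a)X(ε+a)X
so we put (b+c)* in place of each X:
(b+c)*(ε+a)(b+c)*(ε+a)(b+c)*(ε+a)(b+c)*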
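As a sanity check on this construction, the sketch below assumes Σ = {a, b, c}, so that each X (a string with no a's) becomes (b+c)*, and compares the resulting pattern against a direct count of a's on every short string. The names `AT_MOST_THREE` and `all_strings` are illustrative, and the length bound of 6 is arbitrary.

```python
import re
from itertools import product

X = "[bc]*"       # assumed reading of X: any string containing no a's
OPT_A = "a?"      # (ε+a): zero or one a
AT_MOST_THREE = re.compile(X + (OPT_A + X) * 3)

def all_strings(alphabet="abc", max_len=6):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            yield "".join(t)

# Strings on which the pattern and the direct definition disagree.
mismatches = [s for s in all_strings()
              if bool(AT_MOST_THREE.fullmatch(s)) != (s.count("a") <= 3)]
print(mismatches)   # []
```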
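All strings which contain at least one occurrence of each of a, b, and c.
The symbols may occur in any order, so each of the six possible orders has to be listed,
with an X at every place where an arbitrary string is allowed (XaXbXcX, XaXcXbX, and so on).
Finally, replacing the X's with (a+b+c)* gives the final (unwieldy) answer: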
(a+b+c)*a(a+b+c)*b(a+b+c)*c(a+b+c)* +
(a+b+c)*a(a+b+c)*c(a+b+c)*b(a+b+c)* +
(a+b+c)*b(a+b+c)*a(a+b+c)*c(a+b+c)* +
(a+b+c)*b(a+b+c)*c(a+b+c)*a(a+b+c)* +
(a+b+c)*c(a+b+c)*a(a+b+c)*b(a+b+c)* +
(a+b+c)*c(a+b+c)*b(a+b+c)*a(a+b+c)*
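The six-way union above has one alternative per ordering of a, b, and c, so it can be generated rather than typed out. The following sketch builds it with `itertools.permutations` and checks by brute force that it accepts exactly the strings containing at least one a, one b, and one c. The names `ANY`, `alternatives`, and `EACH_SYMBOL` are illustrative, and the length bound is an arbitrary choice.

```python
import re
from itertools import permutations, product

ANY = "[abc]*"    # (a+b+c)*: an arbitrary string over {a, b, c}

# One alternative per ordering, e.g. [abc]*a[abc]*b[abc]*c[abc]* for the order a, b, c.
alternatives = [ANY + ANY.join(order) + ANY for order in permutations("abc")]
EACH_SYMBOL = re.compile("|".join(alternatives))

def all_strings(alphabet="abc", max_len=6):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            yield "".join(t)

# Compare the expression with the direct definition on every short string.
mismatches = [s for s in all_strings()
              if bool(EACH_SYMBOL.fullmatch(s)) != (set(s) == set("abc"))]
print(len(alternatives), mismatches)   # 6 []
```

The number of alternatives grows as n! with the alphabet size, which is why the text calls the answer unwieldy.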
All strings which contain no runs of a's of length greater than two.
We can fairly easily build an expression containing no a, one a, or one aa:
(b+c)*(ε+a+aa)(b+c)*
but if we want to repeat this, we need to be sure to have at least one non-a
between repetitions:
(b+c)*(ε+a+aa)(b+c)*((b+c)(b+c)*(ε+a+aa)(b+c)*)*
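The two steps can be compared directly in Python. This is a sketch under the assumption that Σ = {a, b, c}; the empty alternative in `(?:|a|aa)` plays the role of the empty string, and the names `BLOCK`, `FIRST_TRY`, and `FINAL` are illustrative. The first expression allows only one run of a's, while the repeated form should accept exactly the strings that do not contain aaa.

```python
import re
from itertools import product

NON_A = "(?:b|c)"
SHORT_RUN = "(?:|a|aa)"                            # empty string, a, or aa
BLOCK = NON_A + "*" + SHORT_RUN + NON_A + "*"      # at most one run of a's

FIRST_TRY = re.compile(BLOCK)
# Each repetition starts with a mandatory non-a, which keeps the runs apart.
FINAL = re.compile(BLOCK + "(?:" + NON_A + BLOCK + ")*")

def all_strings(alphabet="abc", max_len=6):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            yield "".join(t)

print(FIRST_TRY.fullmatch("aba") is None)          # True: two separate runs are rejected
print(FINAL.fullmatch("aba") is not None)          # True

mismatches = [s for s in all_strings()
              if bool(FINAL.fullmatch(s)) != ("aaa" not in s)]
print(mismatches)                                  # []
```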
All strings in which all runs of a's have lengths that are multiples of three.
(aaa+b+c)*
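A brute-force check of this last expression, again assuming Σ = {a, b, c}: the helper `runs_ok` (a hypothetical name) states the property directly by measuring each maximal run of a's, and the pattern is compared against it on every string up to an arbitrary length bound.

```python
import re
from itertools import product

TRIPLES = re.compile(r"(?:aaa|b|c)*")

def runs_ok(s):
    """True if every maximal run of a's in s has a length divisible by three."""
    return all(len(run) % 3 == 0 for run in re.findall(r"a+", s))

def all_strings(alphabet="abc", max_len=7):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            yield "".join(t)

mismatches = [s for s in all_strings()
              if bool(TRIPLES.fullmatch(s)) != runs_ok(s)]
print(mismatches)   # []
```

Note that a string with no a's at all is accepted, since every run it contains (there are none) trivially has a length that is a multiple of three.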