11 Regular Expressions
BBK P1 Module
2010/11 : [1]
Some definitions
[email protected]
'/^[a-z\d\.\+_\'%-]+@([a-z\d-]+\.)+[a-z]{2,6}$/i'
PHP functions that apply a
regular expression to data:
preg_match(), preg_replace()
Regular Expressions
'/^[a-z\d\.\+_\'%-]+@([a-z\d-]+\.)+[a-z]{2,6}$/i'
Are complicated!
They are a definition of a pattern. Usually
used to validate or extract data from a
string.
Regex: Delimiters
The regex definition is always bracketed
by delimiters, usually a /:
$regex = '/php/';
Matches: 'php', 'I love php'
Doesn't match: 'PHP',
'I love ph'
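A minimal sketch of the delimiter slide in runnable form. In PHP the pattern is passed as a string, so the delimiters sit inside the quotes. The `$results` array and the `#` variant are illustrative additions, not part of the slide.

```php
<?php
$regex = '/php/';   // the two slashes are the delimiters, inside the quotes

// preg_match() returns 1 when the pattern is found and 0 when it is not.
$results = array();
foreach (array('php', 'I love php', 'PHP', 'I love ph') as $subject) {
    $results[$subject] = preg_match($regex, $subject);
    echo $subject, ' => ', $results[$subject], "\n";
}

// Other punctuation can serve as the delimiter, for example #.
$hashDelimited = preg_match('#php#', 'I love php');
```

The first two subjects give 1 and the last two give 0, because matching is case sensitive by default and 'I love ph' lacks the final p.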
\s   Any whitespace character
\w   Any word character (letter, digit or underscore)
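The shorthand classes \s and \w (and \d, which appears in the later date examples) can be tried directly with preg_match(). This is only a sketch; the test strings and variable names are made up for illustration.

```php
<?php
// \s matches one whitespace character (space, tab, newline).
$hasSpace = preg_match('/\s/', 'I love php');   // 1
$noSpace  = preg_match('/\s/', 'php');          // 0

// \w matches one "word" character: a letter, a digit or an underscore.
$hasWord = preg_match('/\w/', 'php_5');         // 1
$noWord  = preg_match('/\w/', '?!');            // 0

// \d matches one digit.
$hasDigit = preg_match('/\d/', 'php5');         // 1
```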
Regex: Repetition
There are a number of special characters that
indicate that the preceding character or group
may be repeated:
?       Zero or 1 times
*       Zero or more times
+       1 or more times
{n,m}   Between n and m times
Regex: Repetition
$regex = '/ph?p/';
Matches: 'pp', 'php'
Doesn't match: 'phhp', 'pap'
$regex = '/ph*p/';
Matches: 'pp', 'php', 'phhhhp'
Doesn't match: 'pop', 'phhohp'
Regex: Repetition
$regex = '/ph+p/';
Matches: 'php', 'phhhhp'
Doesn't match: 'pp', 'phyhp'
$regex = '/ph{1,3}p/';
Matches: 'php', 'phhhp'
Doesn't match: 'pp', 'phhhhp'
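The four repetition patterns can be checked side by side. The sketch below pairs each pattern with subjects from the slides; the `$cases` and `$results` arrays are hypothetical names used only for this illustration.

```php
<?php
// Each pattern is paired with subjects taken from the slides.
$cases = array(
    '/ph?p/'     => array('pp', 'php', 'phhp', 'pap'),
    '/ph*p/'     => array('pp', 'php', 'phhhhp', 'pop'),
    '/ph+p/'     => array('php', 'phhhhp', 'pp', 'phyhp'),
    '/ph{1,3}p/' => array('php', 'phhhp', 'pp', 'phhhhp'),
);

$results = array();
foreach ($cases as $regex => $subjects) {
    foreach ($subjects as $subject) {
        // Store true for a match and false otherwise.
        $results[$regex][$subject] = (preg_match($regex, $subject) === 1);
        echo $regex, ' vs ', $subject, ': ',
             $results[$regex][$subject] ? 'match' : 'no match', "\n";
    }
}
```

In every row the first two subjects match and the last two do not, which mirrors the Matches and Doesn't match lists above.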
Regex: Anchors
So far, we have matched anywhere within a
string (either the entire data string or part of it).
We can change this behaviour by using anchors:
^   Start of string
$   End of string
Regex: Anchors
With NO anchors:
$regex = '/php/';
Matches: 'php', 'php is great',
'in php we..'
Doesn't match: 'pop'
Regex: Anchors
With start and end anchors:
$regex = '/^php$/';
Matches: 'php'
Doesn't match: 'php is great',
'in php we..', 'pop'
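To see the effect of the anchors, the same subjects can be run against both patterns. This is a sketch under the assumption that we only care about the 1 or 0 returned by preg_match(); the variable names are illustrative.

```php
<?php
$unanchored = '/php/';     // may match anywhere in the string
$anchored   = '/^php$/';   // must match the string from start to end

$subjects = array('php', 'php is great', 'in php we..', 'pop');
$results = array();
foreach ($subjects as $subject) {
    $results[$subject] = array(
        'unanchored' => preg_match($unanchored, $subject),
        'anchored'   => preg_match($anchored, $subject),
    );
}
print_r($results);
```

Only 'php' matches the anchored pattern, while the unanchored pattern matches every subject except 'pop'.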
So.. An example
Let's define a regex that matches an email:
$emailRegex = '/^[a-z\d\.\+_\'%-]+@([a-z\d-]+\.)+[a-z]{2,6}$/i';
Matches: [email protected],
[email protected]
[email protected]
Doesn't match: rob@[email protected]
not.an.email.com
So.. An example
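/^                   Start of string
[a-z\d\.\+_\'%-]+    One or more permitted characters before the @
@                    The @ separator
([a-z\d-]+\.)+       One or more domain parts, each ending in a dot
[a-z]{2,6}           com, uk, info, etc.
$/i                  End of string; i makes the match case insensitive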
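One way to keep such a pattern readable is to assemble it from named pieces. The sketch below is an illustration only: the variable names `$localPart`, `$domainParts` and `$topLevel` are hypothetical, and the result is the same string as the one-line pattern.

```php
<?php
$localPart   = '[a-z\d\.\+_\'%-]+';   // characters allowed before the @
$domainParts = '([a-z\d-]+\.)+';      // one or more parts, each followed by a dot
$topLevel    = '[a-z]{2,6}';          // com, uk, info, etc.

// ^ and $ anchor the pattern; the trailing i makes it case insensitive.
$emailRegex = '/^' . $localPart . '@' . $domainParts . $topLevel . '$/i';

echo $emailRegex, "\n";
```

Inside single quotes PHP turns \' into a plain quote and leaves \d, \. and \+ untouched, so the regex engine receives them as written.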
Phew..
So we now know how to define regular
expressions. Further explanation can be
found at:
http://www.regular-expressions.info/
We still need to know how to use them!
Boolean Matching
We can use the function preg_match() to
test whether a string matches or not.
// match an email
$input = '[email protected]';
if (preg_match($emailRegex, $input)) {
    echo 'Is a valid email';
} else {
    echo 'NOT a valid email';
}
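A self-contained version of this test might look like the following. The addresses are hypothetical and chosen only for illustration, and the comparison with === 1 is a design choice that separates a match from both 0 (no match) and false (an error in the pattern).

```php
<?php
$emailRegex = '/^[a-z\d\.\+_\'%-]+@([a-z\d-]+\.)+[a-z]{2,6}$/i';

// Hypothetical inputs, chosen only for illustration.
$inputs = array(
    'rob@example.com',
    'Rob.Smith@mail.example.co.uk',
    'rob@rob@example.com',
    'not.an.email.com',
);

$valid = array();
foreach ($inputs as $input) {
    // === 1 separates a real match from 0 (no match) and false (error).
    $valid[$input] = (preg_match($emailRegex, $input) === 1);
    echo $input, $valid[$input] ? ' is a valid email' : ' is NOT a valid email', "\n";
}
```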
Pattern replacement
We can use the function preg_replace()
to replace any matching strings.
// strip
$input =
$regex =
$clean =
// Some
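As a minimal sketch of preg_replace(), assume we want to tidy a string by collapsing each run of whitespace into a single space. The input string and pattern here are hypothetical and are not taken from the slide.

```php
<?php
// Hypothetical input: a string with untidy spacing (\t is a tab).
$input = "I   love \t php";
$regex = '/\s+/';                             // one or more whitespace characters
$clean = preg_replace($regex, ' ', $input);   // each run becomes a single space

echo $clean, "\n";   // I love php
```

preg_replace() takes the pattern, the replacement and the subject, and returns the new string; the original `$input` is left unchanged.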
Sub-references
We're not quite finished: we need to
master the concept of sub-references.
Any bracketed expression in a regular
expression is regarded as a sub-reference.
You use it to extract the bits of data you
want from the string being matched.
Easiest with an example..
Sub-reference example:
I start with a date string in a particular
format:
$str = '10, April 2007';
Extracting data..
I then pass in an extra argument to the
function preg_match():
$str = 'The date is 10, April 2007';
$regex = '/(\d+),\s(\w+)\s(\d+)/';
preg_match($regex, $str, $matches);
// $matches[0] = '10, April 2007'
// $matches[1] = '10'
// $matches[2] = 'April'
// $matches[3] = '2007'
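The extraction can be written as a complete script. In this sketch the names `$day`, `$month` and `$year` are illustrative, and the match is checked before the captured values are used.

```php
<?php
$str   = 'The date is 10, April 2007';
$regex = '/(\d+),\s(\w+)\s(\d+)/';

$matches = array();
if (preg_match($regex, $str, $matches) === 1) {
    // $matches[0] is the whole match; 1 to 3 are the bracketed parts.
    list(, $day, $month, $year) = $matches;
    echo "Day: $day, month: $month, year: $year\n";
}
```

Note that every entry in `$matches` is a string, so '10' and '2007' need a cast such as (int) before being used as numbers.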
Back-references
This technique can also be used to reference
the original text during replacements with
$1,$2,etc. in the replacement string:
$str = 'The date is 10, April 2007';
$regex = '/(\d+),\s(\w+)\s(\d+)/';
$str = preg_replace($regex,
                    '$1-$2-$3',
                    $str);
// $str = 'The date is 10-April-2007'
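Back-references can also reorder the captured parts. The sketch below repeats the dashed replacement and adds a second, hypothetical replacement string that swaps the day and the month.

```php
<?php
$str   = 'The date is 10, April 2007';
$regex = '/(\d+),\s(\w+)\s(\d+)/';

// Single quotes keep the $ signs literal for preg_replace() to interpret.
$dashed  = preg_replace($regex, '$1-$2-$3', $str);
$swapped = preg_replace($regex, '$2 $1, $3', $str);

echo $dashed, "\n";    // The date is 10-April-2007
echo $swapped, "\n";   // The date is April 10, 2007
```

Text outside the matched part ('The date is ') is copied through unchanged; only the portion matched by the pattern is replaced.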