Voting

: five plus one?
(Example: nine)

The Note You're Voting On

Steve
2 years ago
The escape sequence \g used as a backreference may not always behave as expected.
The following numbered backreferences refer to the text matching the specified capture group, as documented:
\1
\g1
\g{1}
\g-1
\g{-1}

However, the following variants refer to the subpattern code instead of the matched text:
\g<1>
\g'1'
\g<-1>
\g'-1'

With named backreferences, we may also use the \k escape sequence as well as the (?P=...) construct. The following combinations also refer to the text matching the named capture group, as documented:
\g{name}
\k{name}
\k<name>
\k'name'
(?P=name)

However, these refer to the subpattern code instead of the matched text:
g<name>
\g'name'

In the following example, the capture group searches for a single letter 'a' or 'b', and then the backreference looks for the same letter. Thus, the patterns are expected to match 'aa' and 'bb', but not 'ab' nor 'ba'.

<?php
/* Matches to the following patterns are replaced by 'xx' in the subject string 'aa ab ba bb'. */
$patterns = [
# numbered backreferences (absolute)
'/([ab])\1/', // 'xx ab ba xx'
'/([ab])\g1/', // 'xx ab ba xx'
'/([ab])\g{1}/', // 'xx ab ba xx'
'/([ab])\g<1>/', // 'xx xx xx xx' # unexpected behavior, backreference matches both 'a' and 'b'.
"/([ab])\g'1'/", // 'xx xx xx xx' # unexpected behavior, backreference matches both 'a' and 'b'.
'/([ab])\k{1}/', // 'aa ab ba bb' # No group with name "1", backreference to unset group always fails.
'/([ab])\k<1>/', // 'aa ab ba bb' # No group with name "1", backreference to unset group always fails.
"/([ab])\k'1'/", // 'aa ab ba bb' # No group with name "1", backreference to unset group always fails.
'/([ab])(?P=1)/', // NULL # Regex error: "subpattern name must start with a non-digit", (?P=) expects name not number.
# numbered backreferences (relative)
'/([ab])\-1/', // 'aa ab ba bb'
'/([ab])\g-1/', // 'xx ab ba xx'
'/([ab])\g{-1}/', // 'xx ab ba xx'
'/([ab])\g<-1>/', // 'xx xx xx xx' # unexpected behavior, backreference matches both 'a' and 'b'.
"/([ab])\g'-1'/", // 'xx xx xx xx' # unexpected behavior, backreference matches both 'a' and 'b'.
'/([ab])\k{-1}/', // 'aa ab ba bb' # No group with name "-1", backreference to unset group always fails.
'/([ab])\k<-1>/', // 'aa ab ba bb' # No group with name "-1", backreference to unset group always fails.
"/([ab])\k'-1'/", // 'aa ab ba bb' # No group with name "-1", backreference to unset group always fails.
'/([ab])(?P=-1)/', // NULL # Regex error: "subpattern name expected", (?P=) expects name not number.
# named backreferences
'/(?<name>[ab])\g{name}/', // 'xx ab ba xx'
'/(?<name>[ab])\g<name>/', // 'xx xx xx xx' # unexpected behavior, backreference matches both 'a' and 'b'.
"/(?<name>[ab])\g'name'/", // 'xx xx xx xx' # unexpected behavior, backreference matches both 'a' and 'b'.
'/(?<name>[ab])\k{name}/', // 'xx ab ba xx'
'/(?<name>[ab])\k<name>/', // 'xx ab ba xx'
"/(?<name>[ab])\k'name'/", // 'xx ab ba xx'
'/(?<name>[ab])(?P=name)/', // 'xx ab ba xx'
];

foreach (
$patterns as $pat)
echo
" '$pat',\t// " . var_export(@preg_replace($pat, 'xx', 'aa ab ba bb'), 1) . PHP_EOL;
?>

<< Back to user notes page

To Top