Perl | Anchors in Regex

Last Updated : 06 Jan, 2023

Anchors in a Perl regex do not match any character. Instead, they match a position: before, after, or between characters. They test where in the string a pattern occurs rather than which characters it contains. The constructs covered in this article are:

'^', '$', '\b', '\A', '\Z', '\z', '\G', '\p{...}', '\P{...}', '[:class:]'

Of these, ^, $, \b, \A, \Z, \z and \G are true anchors. \p{...}, \P{...} and [:class:] are character classes, which do consume a character, and are covered alongside them.

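^ or \A: Both match at the beginning of the string. With the /m modifier, ^ also matches at the start of every line, whereas \A always matches only at the very beginning of the string.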

Syntax: (/^pattern/, /\Apattern/).

Example:

perl

#!/usr/bin/perl
$str = "guardians of the galaxy";

# prints 'guardians' because the
# string starts with it
print "$&\n" if($str =~ /^guardians/);

# prints 'gua' for the same reason
print "$&\n" if($str =~ /\Agua/);

# prints nothing because the string
# does not start with 'ans'
print "$&" if($str =~ /^ans/);

Output:

guardians
gua
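
The difference between ^ and \A only shows up on multi-line input. The following is a minimal sketch of that difference; the two-line sample string and the variable names are assumptions chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical two-line sample string, chosen for illustration.
my $text = "guardians\nof the galaxy";

# Without /m, ^ matches only at the very start of the string.
my $caret_plain = ($text =~ /^of/) ? 1 : 0;

# With /m, ^ also matches just after each embedded newline.
my $caret_multi = ($text =~ /^of/m) ? 1 : 0;

# \A ignores /m and still matches only at the very start.
my $big_a_multi = ($text =~ /\Aof/m) ? 1 : 0;

print "caret without m: $caret_plain\n";
print "caret with m:    $caret_multi\n";
print "big A with m:    $big_a_multi\n";
```

Only the second test succeeds, because "of" starts the second line but not the string.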

$ or \z: Both match at the end of the string. $ also matches just before a newline that ends the string, whereas \z matches only at the very end.

Syntax: (/pattern$/, /pattern\z/).

Example:

perl

#!/usr/bin/perl
$str = "guardians of the galaxy";

# prints nothing as it is not 
# ending with 'guardians'
print "$&\n" if($str =~ /guardians$/);

# prints the pattern 'y'
print "$&\n" if($str =~ /y\z/);

# prints the pattern as it is 
# ending with 'galaxy'
print "$&" if($str =~ /galaxy$/)

Output:

y
galaxy
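
As a rough illustration of how the /m modifier changes $, the sketch below collects the last word before each position where $ can match. The sample text and the array names are assumptions made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical two-line sample text ending in a newline.
my $text = "guardians of\nthe galaxy\n";

# Without /m, $ matches only at the end of the string
# (or just before its final newline).
my @plain = $text =~ /(\w+)$/g;

# With /m, $ also matches before every embedded newline.
my @multi = $text =~ /(\w+)$/mg;

print "without m: @plain\n";
print "with m:    @multi\n";
```

Without /m only "galaxy" is captured; with /m both "of" and "galaxy" are.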

\b: It matches at a word boundary, that is, a position with a word character (\w) on one side and a non-word character (\W) on the other. The beginning and the end of the string also count as boundaries when the adjacent character is a word character.

Syntax: (/\bpattern\b/).

Example:

perl

#!/usr/bin/perl
$str = "guardians-of-the-galaxy";

# prints '-galaxy': there is a boundary between
# 'e' and '-', and another at the end of the string.
print "$&\n" if($str =~ /\b-galaxy\b/);

# prints 'guardians-': there is a boundary at the
# start of the string and between '-' and 'o'.
print "$&\n" if($str =~ /\bguardians-\b/);

# prints nothing: the 'e' is preceded by 'h', so there
# is no boundary between the two word characters.
print "$&" if($str =~ /\be-galaxy\b/);

# prints 'guardians-of-the-galaxy' as it is bounded
# by the beginning and the end of the string.
print "$&" if($str =~ /\bguardians-of-the-galaxy\b/);

Output:

-galaxy
guardians-
guardians-of-the-galaxy
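
A common use of \b is to match whole words only. The sketch below counts matches with and without the boundaries; the sample sentence and the variable names are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample sentence containing "the" inside other words.
my $text = "the other thermostat is over there, the end";

# Without \b, "the" is also found inside "other",
# "thermostat" and "there".
my $loose = () = $text =~ /the/g;

# With \b on both sides, only the standalone word counts.
my $whole = () = $text =~ /\bthe\b/g;

print "substring matches:  $loose\n";
print "whole-word matches: $whole\n";
```

The `() =` assignment forces list context, so each scalar receives the number of matches rather than a true/false value.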

\Z: It matches at the end of the string, or just before a newline that ends the string. \z, by contrast, matches only at the very end. '\z' and '\Z' both differ from $ in that they are not affected by the /m "multiline" flag, which allows $ to match at the end of any line.

perl

#!/usr/bin/perl

# Prints one: the string ends right after
# 'galaxy', so \z matches
print "one\n" if ('galaxy' =~ m/galaxy\z/);

# Prints two: \Z also matches at the very end
print "two\n" if('galaxy' =~ m/galaxy\Z/);

# Prints three: \Z matches just before
# the trailing newline
print "three\n" if ("galaxy\n" =~ m/galaxy\Z/);

# Prints four: the pattern consumes the newline,
# so \z sits at the very end
print "four\n" if ("galaxy\n" =~ m/galaxy\n\z/);

# Prints five: \Z matches at the very end as well
print "five\n" if("galaxy\n" =~ m/galaxy\n\Z/);

# Prints nothing: a newline follows 'galaxy', and
# \z does not match before a trailing newline
print "six" if("galaxy\n" =~ m/galaxy\z/);

Output:

one
two
three
four
five
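
To see $, \Z and \z side by side, including the effect of /m, the following sketch runs four patterns against one string. The two-line sample string and the table layout are assumptions made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical two-line sample string ending in a newline.
my $text = "galaxy\nquest\n";

# Each entry pairs a label with the result of one match attempt.
my @checks = (
    [ '/galaxy$/m',  $text =~ /galaxy$/m  ? 'match' : 'no match' ],
    [ '/galaxy\Z/m', $text =~ /galaxy\Z/m ? 'match' : 'no match' ],
    [ '/quest\Z/',   $text =~ /quest\Z/   ? 'match' : 'no match' ],
    [ '/quest\z/',   $text =~ /quest\z/   ? 'match' : 'no match' ],
);

printf "%-14s %s\n", @$_ for @checks;
```

Only the first and third patterns match: /m lets $ match before the embedded newline but has no effect on \Z, and \z refuses the position before the trailing newline that \Z accepts.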

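\G: It matches at the position where the previous m//g match on the same string ended, as reported by pos(). If there has been no previous match, it matches at the start of the string. Used in a loop, \G forces each match to begin exactly where the last one stopped, so the loop ends at the first position where the pattern fails or at the end of the string. With the /c modifier, a failed match leaves the position unchanged instead of resetting it, so a later pattern can continue from the same place.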

Perl

#!/usr/bin/perl
$str = "galaxy8222as";

# \G anchors each attempt at pos($str); the loop stops at
# offset 6, where '8' is not a lowercase letter
print "one: $& " while($str =~ /\G[a-z]{2}/gc);
print "\n";

# consumes pairs of digits and stops at 'a'
print "two: $& " while("1122a44" =~ /\G\d\d/gc);
print "\n";

# this literal is a separate string with its own position,
# so matching starts at offset 0 and runs to the end
print "three: $& " while("galaxy8222as" =~ /\G\w{2}/gc);

# $str is still at offset 6 because /c kept the position
# after loop one failed; '8' is not a letter, so nothing prints
print "four: $& " while($str =~ /\G[a-z]{2}/gc);

# \w{2} does match at offset 6, so this loop
# continues from there to the end of $str
print "\n";
print "five: $& " while($str =~ /\G\w{2}/gc);

Output:

one: ga one: la one: xy 
two: 11 two: 22 
three: ga three: la three: xy three: 82 three: 22 three: as 
five: 82 five: 22 five: as
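
Because /gc keeps the position after a failed match, \G is often used to try several patterns in turn at the same offset, as in a small tokenizer. The sketch below illustrates the idea; the input string, the token labels (NUM, ID, OP) and the set of operators are all assumptions made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input and token names, chosen for illustration.
my $input = "sum = 42 + x1";
my @tokens;

pos($input) = 0;    # start scanning at the beginning
while (pos($input) < length($input)) {
    # Each branch is anchored at the current position by \G.
    # /c keeps that position when a branch fails, so the
    # next branch is tried at the same offset.
    if    ($input =~ /\G\s+/gc)            { next }
    elsif ($input =~ /\G(\d+)/gc)          { push @tokens, "NUM($1)" }
    elsif ($input =~ /\G([A-Za-z_]\w*)/gc) { push @tokens, "ID($1)" }
    elsif ($input =~ /\G([=+])/gc)         { push @tokens, "OP($1)" }
    else  { die "unexpected character at offset " . pos($input) . "\n" }
}

print "@tokens\n";
```

For this input the script prints `ID(sum) OP(=) NUM(42) OP(+) ID(x1)`.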

\p{...} and \P{...}: \p{...} matches a single character that has the named Unicode property, such as IsLower or IsAlpha, whereas \P{...} is its complement and matches a character that does not have that property.

Perl

#!/usr/bin/perl

# isalpha matches letters only
print "$&" while("guardians!@#%^*123" =~ /\p{isalpha}/gc);
print "\n";

# isalnum matches letters and digits
print "$&" while("guardians!@#%^&*123" =~ /\p{isalnum}/gc);
print "\n";

# L is the Unicode "Letter" property; \P{L} is its
# complement and matches every non-letter
print "$&" while("guardians!@#%^&*123" =~ /\P{L}/gc);
print "\n";

# \p{L} matches every letter
print "$&" while("guardians!@#%^&*123" =~ /\p{L}/gc);

Output:

guardians
guardians123
!@#%^&*123
guardians
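
Unicode properties matter most when the text is not plain ASCII. The sketch below compares an ASCII-only class with \p{L} on a string that contains Greek letters; the sample string and the variable names are assumptions, and the Greek characters are written as \x{...} escapes so that the source file itself stays ASCII.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# "abc", then Greek alpha, beta and gamma, then "123".
my $text = "abc\x{3b1}\x{3b2}\x{3b3}123";

my $ascii_letters   = () = $text =~ /[a-zA-Z]/g;   # ASCII letters only
my $unicode_letters = () = $text =~ /\p{L}/g;      # any Unicode letter
my $non_letters     = () = $text =~ /\P{L}/g;      # everything else

print "ASCII letters:   $ascii_letters\n";
print "Unicode letters: $unicode_letters\n";
print "non-letters:     $non_letters\n";
```

Only counts are printed here, which avoids having to set an output encoding for the non-ASCII characters.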

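[:class:]: POSIX character classes such as digit, lower, ascii, etc. They are valid only inside a bracketed character class, which is why the syntax uses two pairs of brackets.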

Syntax: (/[[:class:]]/)

POSIX character classes are as follows:

alpha, alnum, ascii, blank, cntrl, digit, graph, lower, punct, space, upper, xdigit, word

Perl

#!/usr/bin/perl

# prints only letters
print "$&" while('guardians!@#%^&*123' =~ /[[:alpha:]]/gc);
print "\n";

# prints letters and digits
print "$&" while("guardians!@#%^&*123" =~ /[[:alnum:]]/gc);
print "\n";

# prints only digits
print "$&" while("guardians!@#%^&*123" =~ /[[:digit:]]/gc);
print "\n";

# prints every visible character, so the
# space " " and the newline are skipped.
print "$&" while("guardians!@#%^& 123\n" =~ /[[:graph:]]/gc);
print "\n";

# prints a 1 for each space " " or horizontal tab;
# the string contains one space.
print "1" while("guardians!@#%^& 123\n" =~ /[[:blank:]]/gc);
print "\n";

# prints lowercase characters
print "$&" while("Guardians!@#%^& 123\n" =~ /[[:lower:]]/gc);
print "\n";

# prints all ascii characters, including
# the space and the trailing newline
print "$&" while("guardians!@#%^& 123\n" =~ /[[:ascii:]]/gc);

Output:

guardians
guardians123
123
guardians!@#%^&123
1
guardians
guardians!@#%^& 123
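
POSIX classes can be combined inside one bracket expression and negated with a caret after the first colon. The following is a small sketch of both forms; the sample string and the array names are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Single quotes keep @ and % literal in the sample string.
my $text = 'guardians!@#%^& 123';

# Several classes can share one bracket expression.
my @letters_or_digits = $text =~ /[[:alpha:][:digit:]]/g;

# [:^class:] negates a class: here, every character
# that is not alphanumeric.
my @others = $text =~ /[[:^alnum:]]/g;

print join('', @letters_or_digits), "\n";
print join('', @others), "\n";
```

The first line of output contains the letters and digits, and the second contains the punctuation and the space.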

Tejashwi5
