
Regular Expression Repetition Cases in Python
Regular expressions can match repeated characters, or repeated groups of characters, by using a few special characters called quantifiers. The metacharacters ?, *, +, and {n,m} each specify how many times the preceding token may repeat.
The question mark is the simplest repetition operator, or quantifier. It tells the engine to try matching the preceding token zero or one time, which in effect makes that token optional.
Matching Simple HTML Tags Using Regular Expressions
The asterisk, or star, tells the engine to try matching the preceding token zero or more times. The plus tells the engine to try matching the preceding token one or more times. For example, <[A-Za-z][A-Za-z0-9]*> matches an HTML tag without any attributes.
The angle brackets are literals. The first character class matches a letter. The second character class matches a letter or a digit. The star repeats the second character class. Because we used the star, it is acceptable for the second character class to match nothing. Therefore, our regex matches a tag like <B>.
When matching <HTML>, the first character class matches H. The star then causes the second character class to be repeated three times, matching T, M, and L in turn. We could also have used <[A-Za-z0-9]+>. We did not, because that regex would also match <1>, which is not a valid HTML tag. However, if you are certain that the text you are searching does not contain such invalid tags, that regex may be adequate.
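As a minimal sketch of the difference between the two tag patterns, the snippet below runs both against a short sample string. The sample text and the variable names are made up for illustration.

```python
import re

# '<', one letter, then zero or more letters or digits, then '>'.
strict_tag = re.compile(r"<[A-Za-z][A-Za-z0-9]*>")
# Looser alternative: one or more letters or digits between the brackets.
loose_tag = re.compile(r"<[A-Za-z0-9]+>")

sample = "<B> <HTML> <h1> <1>"  # hypothetical sample text

strict_matches = strict_tag.findall(sample)
loose_matches = loose_tag.findall(sample)

print(strict_matches)  # ['<B>', '<HTML>', '<h1>']
print(loose_matches)   # ['<B>', '<HTML>', '<h1>', '<1>']
```

Only the looser pattern accepts <1>, because it does not require the first character to be a letter.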
Limiting Repetition
An additional quantifier lets you specify how many times a token may be repeated. The syntax is {min,max}, where min is zero or a positive integer giving the minimum number of matches, and max is an integer equal to or greater than min giving the maximum number of matches.
If the comma is present but max is omitted, the maximum number of matches is unlimited. As a result, {0,1} is equivalent to ?, {0,} is equivalent to *, and {1,} is equivalent to +. If both the comma and max are omitted, the engine repeats the token exactly min times.
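The correspondence between the brace forms {0,1}, {0,} and {1,} and the shorthand quantifiers ?, * and + can be checked with a small sketch like the one below. The sample strings and the helper name accepted are illustrative assumptions, not part of any library.

```python
import re

samples = ["", "a", "aa", "aaa", "b"]  # hypothetical test inputs

def accepted(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]

# Each shorthand quantifier paired with its brace form.
equivalent_pairs = [
    (r"a?", r"a{0,1}"),
    (r"a*", r"a{0,}"),
    (r"a+", r"a{1,}"),
]

for short, braces in equivalent_pairs:
    # Both spellings accept exactly the same samples.
    assert accepted(short) == accepted(braces)
    print(short, braces, accepted(short))

# With no comma and no max, the token repeats exactly min times.
exactly_three = accepted(r"a{3}")
print(exactly_three)  # ['aaa']
```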
For example, \b[1-9][0-9]{3}\b matches a number between 1000 and 9999, and \b[1-9][0-9]{2,4}\b matches a number between 100 and 99999. Note the use of the word boundaries (\b).
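The following sketch shows why word boundaries matter when matching numbers of a given length. The sample text is made up, and the third pattern deliberately drops \b for comparison.

```python
import re

text = "99 100 1000 9999 12345 100000"  # hypothetical sample text

# Exactly four digits, first digit non-zero: 1000 to 9999.
four_digits = re.findall(r"\b[1-9][0-9]{3}\b", text)
# Three to five digits, first digit non-zero: 100 to 99999.
three_to_five = re.findall(r"\b[1-9][0-9]{2,4}\b", text)
# Same four-digit pattern without word boundaries, for comparison.
no_boundaries = re.findall(r"[1-9][0-9]{3}", text)

print(four_digits)    # ['1000', '9999']
print(three_to_five)  # ['100', '1000', '9999', '12345']
print(no_boundaries)  # ['1000', '9999', '1234', '1000']
```

Without \b, the pattern also picks four-digit fragments out of longer numbers such as 12345 and 100000.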
Meta Characters Used in Repetition
Character | Meaning | Example |
---|---|---|
? | Matches zero or one of the preceding character. Note the zero case, which is easy to overlook: the pattern also matches when the character is absent. | pythonl?y matches pythony and pythonly |
* | Matches zero or more of the preceding character. | pythonl*y matches both of the above plus pythonlly, pythonllly, and so on |
+ | Matches one or more of the preceding character. | pythonl+y matches pythonly, pythonlly, pythonllly, and so on |
{n,m} | Matches n to m repetitions of the preceding character. | fo{1,2} matches fo or foo |
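The rows of the table can be verified with a short sketch such as the following. The candidate word lists and the helper name full_matches are assumptions chosen for illustration.

```python
import re

candidates = ["pythony", "pythonly", "pythonlly", "pythonllly"]

def full_matches(pattern, words):
    """Return the words that the pattern matches in full."""
    return [w for w in words if re.fullmatch(pattern, w)]

optional_l = full_matches(r"pythonl?y", candidates)  # zero or one "l"
any_l = full_matches(r"pythonl*y", candidates)       # zero or more "l"
some_l = full_matches(r"pythonl+y", candidates)      # one or more "l"
bounded_o = full_matches(r"fo{1,2}", ["f", "fo", "foo", "fooo"])

print(optional_l)  # ['pythony', 'pythonly']
print(any_l)       # ['pythony', 'pythonly', 'pythonlly', 'pythonllly']
print(some_l)      # ['pythonly', 'pythonlly', 'pythonllly']
print(bounded_o)   # ['fo', 'foo']
```

re.fullmatch() is used here so that each word must match the whole pattern; re.search() would also report partial matches inside longer strings.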
Example 1
Below is a simple example of how to use a repetition operator in Python. It uses the re.search() method with the + quantifier to find an "h" followed by one or more "e" characters.
    import re

    # Store the string to search
    s = "sheeeeeeeeple"
    print("Given string -", s)

    # Find "h" followed by one or more "e" characters
    match = re.search(r"he+", s)

    # Print the matched text
    print("Matched string -", match.group())

Here is the output of the above program -

    Given string - sheeeeeeeeple
    Matched string - heeeeeeee
Example 2
The example below again uses the re.search() method with the + quantifier, this time to find an "a" followed by one or more "l" characters.
    import re

    # Store the string to search
    s = "tutoriallllllllsPoint"
    print("Given string -", s)

    # Find "a" followed by one or more "l" characters
    match = re.search(r"al+", s)

    # Print the matched text
    print("Matched string -", match.group())

Below is the result of the above Python code -

    Given string - tutoriallllllllsPoint
    Matched string - allllllll
Example 3
The example below uses the re.findall() method with the ? quantifier, which matches zero or one occurrence of the preceding character. The pattern colou?r finds both "color" and "colour".
    import re

    s = "color or colour"
    match = re.findall(r"colou?r", s)
    print(match)
Here is the output of the above program -
['color', 'colour']
Example 4
The example below uses the re.findall() method with the * quantifier, which matches zero or more occurrences of the preceding character. The pattern go*al finds "goal" written with any number of "o" characters.
    import re

    s = "goooal! goooooal! goal!"
    match = re.findall(r"go*al", s)
    print(match)
Here is the output of the above program -
['goooal', 'goooooal', 'goal']
Example 5
The example below uses the re.findall() method with the {n} quantifier, which matches exactly n occurrences of the preceding token. The pattern finds a phone number written as three digits, three digits, and four digits separated by hyphens.
    import re

    s = "My number is 444-555-1234"
    match = re.findall(r"\d{3}-\d{3}-\d{4}", s)
    print(match)
Below is the result of the above program -
['444-555-1234']