Regular Expression 01

Uploaded by

Aamna Raza


REGULAR EXPRESSIONS IN PYTHON

Introduction
• Suppose you want to check whether 123 occurs in 'upes123python'. How would you do it?
• '123' in 'upes123python'
• To find the index of the match as well, use find() or index():
• 'upes123python'.find('123')
• 'upes123python'.index('123')
• In the examples above, the string is compared character by character against a fixed substring.
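The fixed-substring checks above can be tried directly. This is a minimal sketch; the variable names are illustrative and not part of the slides. It also shows how find() and index() differ when the substring is absent.

```python
text = 'upes123python'

# Membership test: True if the substring occurs anywhere in text.
found = '123' in text

# find() returns the index of the first occurrence, or -1 if absent.
pos_find = text.find('123')
missing_find = text.find('999')

# index() returns the same index, but raises ValueError if absent.
pos_index = text.index('123')
try:
    text.index('999')
    raised = False
except ValueError:
    raised = True

print(found, pos_find, missing_find, pos_index, raised)
```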

• Now suppose that, rather than searching for a fixed substring like '123',
you want to determine whether a string contains any three consecutive
decimal digits, as in the strings 'upes123python', 'upes456python',
'upes789python234buzz', and 'upes123pythonbuzz678'. Comparing against
a fixed substring character by character will not solve this problem.
• This is where regular expressions are used.
• With regexes in Python, you can identify patterns in a string that
you would not be able to find with the in operator or with string
methods.
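As a preview, the three-digit check might look like the sketch below. The pattern r'\d{3}' uses the \d and {n} notation that appears in later examples; the extra sample 'upespython' and the variable names are assumptions added for illustration.

```python
import re

samples = ['upes123python', 'upes456python',
           'upes789python234buzz', 'upes123pythonbuzz678',
           'upespython']   # last sample (no digits) added for contrast

# \d{3} means "three consecutive decimal digits".
three_digits = re.compile(r'\d{3}')

# search() returns a match object or None; bool() turns that into True/False.
results = {s: bool(three_digits.search(s)) for s in samples}
for s, ok in results.items():
    print(s, ok)
```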
• A regular expression is a sequence of characters that forms a search
pattern.
• It can be used to check whether a string contains the specified search
pattern.
• Python provides a built-in module, re, for working with regular
expressions.
• The general form of a call is: match = re.method_name(pattern, string)
• If the search is successful, re.search() returns a match object;
otherwise it returns None.
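Because the result is either a match object or None, it is usually tested before use. A minimal sketch, assuming a hypothetical helper named describe():

```python
import re

def describe(pattern, string):
    """Return a short description of the first match, or a 'no match' note."""
    match = re.search(pattern, string)   # match object, or None
    if match is None:
        return 'no match'
    # group() is the matched text; span() is the (start, end) index pair.
    return f'matched {match.group()!r} at {match.span()}'

print(describe('123', 'upes123python'))
print(describe('999', 'upes123python'))
```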

To use regular expressions, import Python's re module with the
following statement:

import re

Raw strings

• Patterns passed to the functions in Python's re module are usually
written as raw strings, so that backslashes reach the regex engine
unchanged. A normal string literal prefixed with 'r' or 'R' becomes
a raw string.
>>> rawstr = r'Hello! How are you?'
>>> print(rawstr)
Hello! How are you?

The difference between a normal string and a raw string is that Python translates
escape sequences (such as \n, \t etc.) in a normal string literal, while those in a
raw string are left as written. In the following example, \n inside str1 (normal
string) is translated to a newline, so the text after it is printed on the next line.
In str2, a raw string, it is printed as \n.

Example: String vs Raw String
str1 = "Hello!\nHow are you?"
print("normal string:", str1)
str2 = r"Hello!\nHow are you?"
print("raw string:", str2)

Output
normal string: Hello!
How are you?
raw string: Hello!\nHow are you?
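Why this matters for regexes can be sketched as follows. The example relies on the standard \b sequence (word boundary in re, backspace in a normal string literal); the sample text and variable names are illustrative.

```python
import re

text = 'cat concat cat'

# In a normal string, '\b' is a single backspace character.
normal_pattern = '\bcat\b'
# In a raw string, r'\b' stays backslash + b, which re reads as a word boundary.
raw_pattern = r'\bcat\b'

normal_hits = re.findall(normal_pattern, text)   # looks for backspace characters
raw_hits = re.findall(raw_pattern, text)         # looks for the whole word 'cat'

print(len('\n'), len(r'\n'))
print(normal_hits)
print(raw_hits)
```

The normal-string pattern finds nothing because the text contains no backspace characters, while the raw-string pattern finds the two standalone occurrences of 'cat'.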

Metacharacters

Some characters carry a special meaning when they appear as part of a
pattern string. Python's re module uses the following characters as
metacharacters:

. ^ $ * + ? { } [ ] \ | ( )

When a set of characters is placed inside square brackets [], the
pattern matches any one of those characters. Individual characters or
a range of characters can be listed inside the brackets. For example:

Pattern   Description
[abc]     Matches any one of the characters a, b, or c.
[a-c]     Uses a range to express the same set of characters.
[a-z]     Matches any one lowercase letter.
[0-9]     Matches any one digit.
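The sets in the table can be tried with re.findall(), which is described later in these notes. A small sketch; the sample strings are made up for illustration.

```python
import re

# Each set matches exactly one character per match.
one_of_abc = re.findall(r'[abc]', 'aXbYcZ')      # any of a, b, c
same_with_range = re.findall(r'[a-c]', 'aXbYcZ') # same set written as a range
lowercase = re.findall(r'[a-z]', 'Py3')          # lowercase letters only
digits = re.findall(r'[0-9]', 'upes123python')   # digits only

print(one_of_abc, same_with_range, lowercase, digits)
```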

The backslash \ is the escaping metacharacter. It can be followed by
various characters to signal special sequences. If you need to match a
literal [ or \, precede it with a backslash to remove its special
meaning: \[ or \\.
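A short sketch of escaping in practice. The sample string is illustrative; re.escape() is a standard re function that inserts the backslashes for you.

```python
import re

text = r'path\to[file]'

# \[ matches a literal opening bracket, \\ a literal backslash.
bracket = re.search(r'\[', text)
backslash = re.search(r'\\', text)
print(bracket.start(), backslash.start())

# re.escape() escapes every special character in a literal string.
escaped = re.escape('[file]')
literal = re.search(escaped, text)
print(literal.group())
```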

Metacharacter          Description
. (Dot)                Matches any character except a newline.
^ (Caret)              Matches pattern only at the start of the string.
$ (Dollar)             Matches pattern at the end of the string.
* (Asterisk)           Matches 0 or more repetitions of the preceding regex.
+ (Plus)               Matches 1 or more repetitions of the preceding regex.
? (Question mark)      Matches 0 or 1 repetition of the preceding regex.
[] (Square brackets)   Indicates a set of characters. Matches any single character in the
                       brackets. For example, [abc] matches a, b, or c.
| (Pipe)               Specifies alternatives. For example, P1|P2 matches either P1 or P2,
                       where P1 and P2 are two different regexes.
\ (Backslash)          Escapes special characters or signals a special sequence. For example,
                       to search for one of the special characters, precede it with a \.
[^...]                 Matches any single character not in the brackets.
(...)                  Matches whatever regular expression is inside the parentheses. For
                       example, (abc) matches the substring 'abc'.
Example

The caret sign (^) serves two purposes. In the figure's example it appears first inside square brackets, where
it negates the set: the pattern matches any character that is not an upper-case letter, lower-case letter,
digit, underscore, or space. In short, it matches the special characters in the given string. Used outside
square brackets, the caret anchors the match to the start of the string.
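Both uses of the caret can be sketched as below. The pattern [^A-Za-z0-9_ ] is an assumption reconstructed from the description above, since the figure itself is not available; the sample text is also made up.

```python
import re

text = 'Hello_World 2024! #regex @python'

# Assumed pattern: caret first inside brackets negates the set, so this
# matches anything that is not a letter, digit, underscore, or space.
special = re.findall(r'[^A-Za-z0-9_ ]', text)
print(special)

# Caret outside brackets anchors the match to the start of the string.
starts_with_hello = bool(re.search(r'^Hello', text))
starts_with_world = bool(re.search(r'^World', text))
print(starts_with_hello, starts_with_world)
```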

You can also specify a range of characters using - inside square
brackets.
• [a-e] is the same as [abcde].
• [1-4] is the same as [1234].
• [0-9] is the same as [0123456789].
You can complement (invert) the character set by placing a caret ^
at the start of the square brackets.
• [^abc] means any character except a, b, or c.
• [^0-9] means any non-digit character.

Other Special Sequences
Special sequences such as \d and \w, both used in the examples later on,
make commonly used patterns easier to write.
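A few of the standard special sequences documented for the re module can be tried as follows. This sketch covers only \d, \D, \w, \W and \s, not the full list; the sample text is illustrative.

```python
import re

text = 'Room 42, floor_3'

digits = re.findall(r'\d', text)          # \d: any decimal digit
non_digit_runs = re.findall(r'\D+', text) # \D: anything that is not a digit
words = re.findall(r'\w+', text)          # \w: letters, digits, underscore
non_word_runs = re.findall(r'\W+', text)  # \W: anything that is not \w
spaces = re.findall(r'\s', text)          # \s: whitespace

print(digits, non_digit_runs, words, non_word_runs, spaces)
```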

re.match() function
This function in the re module checks whether the specified pattern
matches at the beginning of the given string.

re.match(pattern, string)
This function returns None if no match can be found. If the match is
successful, a match object is returned, containing
information about the match: where it starts and ends, the
substring it matched, etc.

>>> import re
>>> string = "Simple is better than complex."
>>> obj = re.match("Simple", string)
>>> obj
<re.Match object; span=(0, 6), match='Simple'>
>>> obj.start()
0
>>> obj.end()
6
The match object's start() method returns the index where the match begins in the string,
and end() returns the index just past its last character.

If the pattern is not found, re.match() returns None instead of a match object.
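Since re.match() only looks at the beginning of the string, a pattern that occurs later yields None. A minimal sketch contrasting it with re.search() on the same string; the variable names are illustrative.

```python
import re

string = "Simple is better than complex."

at_start = re.match("Simple", string)       # pattern is at index 0
not_at_start = re.match("better", string)   # present, but not at the start
anywhere = re.search("better", string)      # search() scans the whole string

for label, m in [('match Simple', at_start),
                 ('match better', not_at_start),
                 ('search better', anywhere)]:
    # Always test for None before calling methods on the result.
    if m is None:
        print(label, '-> no match')
    else:
        print(label, '->', m.group(), m.span())
```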

re.search():
This function scans the string from left to right and looks for the
pattern at any position. It returns a match object for the first
occurrence only.

>>> import re
>>> string = "Simple is better than complex."
>>> obj = re.search("is", string)
>>> obj.start()
7
>>> obj.end()
9

re.findall():

This function returns a list of all non-overlapping matches of the
pattern in the string.

>>> import re
>>> string = "Simple is better than complex."
>>> obj = re.findall("ple", string)
>>> obj
['ple', 'ple']

To obtain a list of all word characters (letters, digits, and underscore) in the string:

>>> obj = re.findall(r"\w", string)
>>> obj
['S', 'i', 'm', 'p', 'l', 'e', 'i', 's', 'b', 'e', 't', 't', 'e', 'r', 't', 'h', 'a', 'n', 'c',
'o', 'm', 'p', 'l', 'e', 'x']

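To obtain a list of words, \w* comes close. Because * also allows zero
repetitions, the result contains empty strings wherever no word
character is found:

>>> obj = re.findall(r"\w*", string)
>>> obj
['Simple', '', 'is', '', 'better', '', 'than', '', 'complex', '', '']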
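If the empty strings are unwanted, one option is to require at least one word character with + instead of *. A minimal sketch on the same string:

```python
import re

string = "Simple is better than complex."

# \w+ needs one or more word characters, so it cannot match an empty string.
words = re.findall(r"\w+", string)
print(words)
```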

re.split():

This function splits a string at each occurrence of the given
pattern. It returns a list of the resulting substrings.

>>> import re
>>> string = "Simple is better than complex."
>>> obj = re.split(' ', string)
>>> obj
['Simple', 'is', 'better', 'than', 'complex.']

The string is split at each occurrence of a space ' ', returning a
list of slices, each corresponding to a word. Note that the output is
the same as that of the split() method of the built-in str object.

>>> string.split(' ')
['Simple', 'is', 'better', 'than', 'complex.']
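Where re.split() differs from str.split() is that the separator can be a pattern. A small sketch, assuming a made-up input line that mixes commas, semicolons, and uneven spacing:

```python
import re

line = 'a, b;c,  d'

# Split on a comma or semicolon followed by any amount of whitespace.
parts = re.split(r'[;,]\s*', line)
print(parts)

# maxsplit limits the number of splits; the rest of the string is kept whole.
first_only = re.split(r'[;,]\s*', line, maxsplit=1)
print(first_only)
```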

re.sub():
This function returns a new string in which every occurrence of a pattern is replaced by a substitute string.
It is used as follows:

re.sub(pattern, replacement, string)

In the example below, 'is' is replaced by 'was' everywhere in the
target string.

>>> string = "Simple is better than complex. Complex is better than complicated."
>>> obj = re.sub('is', 'was', string)
>>> obj
'Simple was better than complex. Complex was better than complicated.'
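Note that the pattern 'is' also matches inside longer words. A hedged sketch of one way to restrict the replacement to the whole word, using the standard \b word-boundary sequence; the sample text is made up.

```python
import re

text = 'This is it'

# Plain 'is' also matches the 'is' inside 'This'.
naive = re.sub('is', 'was', text)

# \b marks a word boundary, so only the standalone word is replaced.
whole_word = re.sub(r'\bis\b', 'was', text)

# count limits how many occurrences are replaced.
once = re.sub('is', 'was', text, count=1)

print(naive)
print(whole_word)
print(once)
```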

Example 1
Write a Python program that matches a string that has
an a followed by zero or more b's.

patterns = 'ab*'

* (Asterisk) Matches 0 or more repetitions of the preceding regex.

• ? The question mark indicates zero or one occurrence of the
preceding element. For example, colou?r matches both "color"
and "colour".

• * The asterisk indicates zero or more occurrences of the
preceding element. For example, ab*c matches "ac", "abc",
"abbc", "abbbc", and so on.

import re
def text_match(text):
    patterns = 'ab*'
    if re.search(patterns, text):
        return 'Found a match!'
    else:
        return 'Not matched!'

print(text_match("ac"))
print(text_match("abc"))
print(text_match("abbc"))

Output:
Found a match!
Found a match!
Found a match!

Example 2
Write a Python program that matches a string that has an a
followed by one or more b's.

patterns = 'ab+'

+ (Plus) Matches 1 or more repetitions of the preceding regex.

import re
def text_match(text):
    patterns = 'ab+'
    if re.search(patterns, text):
        return 'Found a match!'
    else:
        return 'Not matched!'

print(text_match("ab"))
print(text_match("abc"))

Sample Output:
Found a match!
Found a match!

Example 3
Write a Python program that matches a string that has an a
followed by zero or one 'b'.

import re
def text_match(text):
    patterns = 'ab?'
    if re.search(patterns, text):
        return 'Found a match!'
    else:
        return 'Not matched!'

print(text_match("ab"))
print(text_match("abc"))
print(text_match("abbc"))
print(text_match("aabbc"))

Output:
Found a match!
Found a match!
Found a match!
Found a match!
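All four inputs match because re.search() only needs the pattern to occur somewhere in the string, and 'ab?' occurs wherever there is an 'a'. If the intent is that the whole string be an a followed by zero or one b, one possible variant uses re.fullmatch(); the function name whole_string_match is hypothetical and not part of the slides.

```python
import re

def whole_string_match(text):
    # re.fullmatch() succeeds only if the entire string matches the pattern.
    if re.fullmatch('ab?', text):
        return 'Found a match!'
    return 'Not matched!'

results = {t: whole_string_match(t) for t in ['a', 'ab', 'abc', 'abbc', 'aabbc']}
for t, r in results.items():
    print(t, r)
```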

Example 4
Write a Python program that matches a string that has an a
followed by three b's.

patterns = 'ab{3}'

{n} Matches exactly n occurrences of the preceding regex.

import re
def text_match(text):
    patterns = 'ab{3}'
    if re.search(patterns, text):
        return 'Found a match!'
    else:
        return 'Not matched!'

print(text_match("abbb"))
print(text_match("aabbbbbc"))

Output:
Found a match!
Found a match!

Example 5
Write a Python program that matches a string that has
an a followed by two to three 'b'.

import re
def text_match(text):
    patterns = 'ab{2,3}'
    if re.search(patterns, text):
        return 'Found a match!'
    else:
        return 'Not matched!'

print(text_match("ab"))
print(text_match("aabbbbbc"))

Output:
Not matched!
Found a match!
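To see which part of the string the quantifier actually consumed, the match object's group() method can be inspected. A small sketch; the extra input 'abb' and the dictionary name matched are assumptions for illustration.

```python
import re

matched = {}
for text in ['ab', 'abb', 'aabbbbbc']:
    m = re.search('ab{2,3}', text)
    # group() returns the matched substring; {2,3} is greedy and takes
    # up to three b's when they are available.
    matched[text] = m.group() if m else None
    print(text, '->', matched[text])
```

In 'aabbbbbc' the match is 'abbb': the search succeeds as soon as two to three b's follow an a, even though more b's come after.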

Example 6
Write a Python program to find sequences of lowercase letters
joined with an underscore.

patterns = '^[a-z]+_[a-z]+$'

^ (Caret)   Matches pattern only at the start of the string.
$ (Dollar)  Matches pattern at the end of the string.
+ (Plus)    Matches 1 or more repetitions of the preceding regex.

import re
def text_match(text):
    patterns = '^[a-z]+_[a-z]+$'
    if re.search(patterns, text):
        return 'Found a match!'
    else:
        return 'Not matched!'

print(text_match("aab_cbbbc"))
print(text_match("aab_Abbbc"))
print(text_match("Aaab_abbbc"))

Output:
Found a match!
Not matched!
Not matched!

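Example 7
Write a Python program to find the sequences of one upper case
letter followed by lower case letters.

Note that the pattern used below, '[A-Z]+[a-z]+$', accepts one or more
upper case letters and requires the sequence to end the string.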

import re
def text_match(text):
    patterns = '[A-Z]+[a-z]+$'
    if re.search(patterns, text):
        return 'Found a match!'
    else:
        return 'Not matched!'

print(text_match("AaBbGg"))
print(text_match("Python"))
print(text_match("python"))
print(text_match("PYTHON"))
print(text_match("aA"))
print(text_match("Aa"))

Output:
Found a match!
Found a match!
Not matched!
Not matched!
Not matched!
Found a match!

Example 8
Write a Python program that matches a string that has an 'a'
followed by anything, ending in 'b'.

patterns = 'a.*b$'

. (DOT) Matches any character except a newline.

import re
def text_match(text):
    patterns = 'a.*b$'
    if re.search(patterns, text):
        return 'Found a match!'
    else:
        return 'Not matched!'

print(text_match("aabbbbd"))
print(text_match("aabAbbbc"))
print(text_match("accddbbjjjb"))

Output:
Not matched!
Not matched!
Found a match!

Example 9
Write a Python program that matches a string that has an 'a'
followed by anything, ending in a digit.

import re
def text_match(text):
    # Raw string so that \d reaches the regex engine unchanged.
    patterns = r'a.*\d$'
    if re.search(patterns, text):
        return 'Found a match!'
    else:
        return 'Not matched!'

print(text_match("aabbbbd"))
print(text_match("aabAbbbc"))
print(text_match("accddbbjjj6"))

Output:
Not matched!
Not matched!
Found a match!

Example 10
Write a Python program that matches a word containing 'z'.

patterns = r'\w*z.\w*'

The \w sequence is used to find a word character.

A word character is a character from a-z, A-Z, 0-9, including the _ (underscore) character.

Note that the . after z requires one more character of any kind to follow the z.

import re
def text_match(text):
    # Raw string so that \w reaches the regex engine unchanged.
    patterns = r'\w*z.\w*'
    if re.search(patterns, text):
        return 'Found a match!'
    else:
        return 'Not matched!'

print(text_match("The quick brown fox jumps over the lazy dog."))
print(text_match("Python Exercises."))

Output:
Found a match!
Not matched!
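To extract the word itself rather than just report a match, re.findall() can be used. The sketch below drops the dot from the pattern, which is a variation on the slide's pattern and not the original: with the dot, a z at the very end of the string has nothing left to match the dot against.

```python
import re

sentence = "The quick brown fox jumps over the lazy dog."

# Variation without the dot: any run of word characters containing a z.
z_words = re.findall(r'\w*z\w*', sentence)
print(z_words)

# With the dot, a word that ends the string in z is not matched.
with_dot = re.search(r'\w*z.\w*', 'quiz')
without_dot = re.search(r'\w*z\w*', 'quiz')
print(with_dot, without_dot.group())
```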
