Python – Filter list of strings based on the substring list
The task is to check which strings in a main list contain any of the substrings from a given list and to keep only those that match. Let us explore different methods to solve it.
Using list comprehension with any() (Most Efficient)
List comprehension is a concise and efficient way to solve problems like this. By combining it with the any() function, we can easily check if a string contains any substring from the list of substrings.
s = ["learn", "python", "with", "gfg"]
subs = ["le", "py"] # List of substrings to check for in the strings
# List comprehension that filters strings from 's' if any of the substrings in 'subs' are found in them
res = [x for x in s if any(y in x for y in subs)]
print(res)
Output
['learn', 'python']
Explanation:
- We use a list comprehension to iterate over each string in the list ‘s’.
- Inside the comprehension, the any() function checks if any substring from subs is present in the current string.
- Only the strings that meet the condition are added to the result.
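The same comprehension can be wrapped in a small reusable function. The sketch below is only illustrative: the name filter_by_substrings is hypothetical and not part of the original example. It also shows one edge case worth knowing: any() over an empty sequence is False, so an empty substring list produces an empty result.

```python
def filter_by_substrings(strings, subs):
    """Return the strings that contain at least one substring from subs.

    Hypothetical helper name, used here only for illustration.
    """
    # any() short-circuits: it stops at the first substring found in x
    return [x for x in strings if any(y in x for y in subs)]


s = ["learn", "python", "with", "gfg"]
print(filter_by_substrings(s, ["le", "py"]))  # ['learn', 'python']
print(filter_by_substrings(s, []))            # [] because any() of nothing is False
```

Because the original order of the list is preserved, the result always lists matches in the order they appear in the input.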
Let’s explore some other methods to filter a list of strings based on a list of substrings.
Using nested loops
This method spells out the logic explicitly, at the cost of more code. We use two loops: one for the list of strings and one for the list of substrings.
s = ["learn", "python", "with", "gfg"]
subs = ["le", "py"] # List of substrings to check for in the strings
res = [] # List to store the result
# Iterate through each string in 's'
for x in s:
    # Iterate through each substring in 'subs'
    for y in subs:
        # If a substring 'y' is found in the string 'x'
        if y in x:
            res.append(x)  # Add the string 'x' to the result list
            break  # Exit the inner loop once a match is found
print(res)
Output
['learn', 'python']
Explanation:
- We loop through each string in ‘s’ and each substring in subs.
- If a substring is found in the string, we add the string to the result and stop checking further substrings for that string.
- This method is easy to understand but more verbose; it performs the same substring checks as the any() version.
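To see why the break matters, the sketch below runs the nested loops with and without it. The substring list here is changed (an assumption made for illustration) so that two substrings occur in the same string; without the break, that string is appended once per matching substring.

```python
s = ["learn", "python", "with", "gfg"]
subs = ["le", "ea"]  # both substrings occur in "learn"

with_break = []
without_break = []

for x in s:
    for y in subs:
        if y in x:
            with_break.append(x)
            break  # stop at the first matching substring

for x in s:
    for y in subs:
        if y in x:
            without_break.append(x)  # no break: appended once per matching substring

print(with_break)     # ['learn']
print(without_break)  # ['learn', 'learn']
```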
Using filter with a lambda function
The filter() function provides a functional programming approach. We use a lambda function to define the condition.
s = ["learn", "python", "with", "gfg"]
subs = ["le", "py"] # List of substrings to check for in the strings
# Use filter and lambda to filter strings in 's' if any of the substrings in 'subs' are found
res = list(filter(lambda x: any(y in x for y in subs), s))
print(res)
Output
['learn', 'python']
Explanation:
- The filter() function applies the lambda function to each element in ‘s’.
- The lambda function uses any() to check if a string contains any of the substrings from subs.
- The filter function returns an iterator, which we convert into a list.
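Since filter() returns a lazy iterator, items are only tested when they are requested. The short sketch below illustrates this with next(); the variable names matches, first, and rest are chosen here for illustration.

```python
s = ["learn", "python", "with", "gfg"]
subs = ["le", "py"]

matches = filter(lambda x: any(y in x for y in subs), s)

first = next(matches)   # evaluates only until the first match
rest = list(matches)    # consumes the remaining items

print(first)            # learn
print(rest)             # ['python']
print(list(matches))    # [] since the iterator is exhausted
```

This is useful when only the first few matches are needed, but it also means the iterator can be consumed only once.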
Using regular expressions
We can use the re module to solve this problem. Regular expressions are a versatile tool for pattern matching and can be used to solve this problem effectively.
import re
s = ["learn", "python", "with", "gfg"]
subs = ["le", "py"] # List of substrings to check for in the strings
# Escape each substring so it is matched literally, then join with '|' (OR operator) to create a regular expression pattern
pattern = "|".join(map(re.escape, subs))
# List comprehension that filters strings from 's' if the pattern matches using regular expressions
res = [x for x in s if re.search(pattern, x)]
print(res)
Output
['learn', 'python']
Explanation:
- We create a regular expression pattern by joining all substrings with the pipe (|) operator, which acts as OR.
- The re.search() function checks if the pattern matches any part of the string.
- Only matching strings are added to the result.
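One caveat with this approach is that substrings may contain characters with special meaning in regular expressions. For example, an unescaped "." matches any character. The sketch below, using sample data invented for illustration, escapes each substring with re.escape() so it is matched literally, and compiles the pattern once with re.compile() before reusing it.

```python
import re

s = ["a.b", "axb", "c++", "plain"]
subs = [".", "c++"]  # both contain regex metacharacters

# Escape each substring so it is matched literally, then compile once
pattern = re.compile("|".join(map(re.escape, subs)))

res = [x for x in s if pattern.search(x)]
print(res)  # ['a.b', 'c++']
```

Note that if subs is empty, the joined pattern is an empty string, which matches every string, so an empty substring list may need to be handled separately.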