Python - Extract K length substrings
Last Updated : 15 Jan, 2025
The task is to extract all substrings of a specific length k from a given string, including overlapping ones. Let's explore various methods to extract substrings of length k from a given string in Python.
Using List Comprehension
List comprehension is a concise way to extract substrings. It iterates over every valid start index in the string and collects all substrings of length k in a single line.
Python
s = "geeksforgeeks"
k = 4
# Extracting k-length substrings using list comprehension
sub = [s[i:i+k] for i in range(len(s) - k + 1)]
print(sub)
Output
['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']
Explanation:
- List comprehension iterates over the string and slices it to get all substrings of length k.
- The range ensures we only extract valid substrings within the bounds of the string.
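The bounds handling described above can be wrapped in a small reusable helper. This is only a sketch: the function name k_substrings and the decision to raise an error for a non-positive k are assumptions made for illustration, not part of the original example.

```python
def k_substrings(s, k):
    """Return all overlapping substrings of s with length k (hypothetical helper)."""
    if k <= 0:
        # Assumption: a non-positive window length is treated as an error.
        raise ValueError("k must be a positive integer")
    # range() is empty when k > len(s), so the result is then an empty list.
    return [s[i:i + k] for i in range(len(s) - k + 1)]


print(k_substrings("geeksforgeeks", 4))
print(k_substrings("abc", 5))  # k is longer than the string
```

When k is larger than the string, the range is empty and the helper returns an empty list instead of failing.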
Let's explore some more methods and see how we can extract K length substrings from a given string.
Using a for loop
By using a simple for loop, we can extract the substrings by iterating through the string and slicing it at each step. This method is very intuitive.
Python
s = "geeksforgeeks"
k = 4
# Initialize an empty list to store substrings
sub = []
# Loop through the string and extract k-length substrings
for i in range(len(s) - k + 1):
    sub.append(s[i:i+k])
print(sub)
Output
['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']
Explanation:
- A for loop iterates through the string, slicing substrings of length k.
- Each substring is appended to the sub list.
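The same loop can also be written as a generator, which may be useful when the substrings are consumed one at a time rather than stored. This is a minimal sketch; the name iter_k_substrings is hypothetical and chosen only for this example.

```python
def iter_k_substrings(s, k):
    """Yield k-length substrings of s one at a time (hypothetical helper)."""
    for i in range(len(s) - k + 1):
        # Each slice is produced only when the caller asks for the next item.
        yield s[i:i + k]


for part in iter_k_substrings("geeks", 3):
    print(part)
```

Calling list() on the generator gives the same list that the for loop builds.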
Using zip()
The zip() function can also be used for this task. Zipping k shifted copies of the string groups its characters into k-length substrings.
Python
s = "geeksforgeeks"
k = 4
# Zip k shifted copies of s (s[0:], s[1:], ..., s[k-1:]) so each tuple holds k consecutive characters
substrings = [''.join(t) for t in zip(*(s[i:] for i in range(k)))]
print(substrings)
Output
['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']
Explanation:
- zip() walks k shifted copies of the string in parallel, so each tuple it yields contains k consecutive characters.
- zip() stops at the shortest copy, so only full-length windows are produced.
- A list comprehension joins each tuple into a substring.
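One way to apply zip() to this task is to zip k shifted copies of the string. The sketch below prints the intermediate values for a short sample string; the names shifted and windows are introduced here only for illustration.

```python
s = "geeks"
k = 3

# k shifted copies of the string: s[0:], s[1:], ..., s[k-1:]
shifted = [s[i:] for i in range(k)]
print(shifted)

# zip() pairs characters position by position and stops at the shortest copy
windows = list(zip(*shifted))
print(windows)

# Join each tuple of characters back into a string
sub = [''.join(w) for w in windows]
print(sub)
```

Because the shortest copy has len(s) - k + 1 characters, zip() yields exactly that many windows.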
Using re.findall()
The re.findall() function can be used to extract substrings based on regular expressions.
Python
import re
s = "geeksforgeeks"
k = 4
# Build the pattern from k; the lookahead (?=...) lets matches overlap
subs = re.findall(rf'(?=(\w{{{k}}}))', s)
print(subs)
Output
['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']
Explanation:
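- The regular expression (?=(\w{4})) uses a lookahead to match overlapping substrings of length 4 (the value of k here).
- \w matches only word characters (letters, digits and underscore), so windows containing spaces or punctuation are skipped.
- This method uses regular expressions, which may be overkill for simple tasks.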
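The pattern above relies on \w, which matches only word characters, so the result can differ from plain slicing when the string contains spaces or punctuation. The sketch below compares it with a variant that uses . together with re.DOTALL; the sample string and the names word_only and any_char are assumptions chosen for illustration.

```python
import re

s = "ab cd"
k = 2

# \w matches only word characters, so windows containing the space are skipped
word_only = re.findall(rf'(?=(\w{{{k}}}))', s)
print(word_only)

# '.' with re.DOTALL matches any character, including spaces and newlines
any_char = re.findall(rf'(?=(.{{{k}}}))', s, flags=re.DOTALL)
print(any_char)
```

The second pattern returns the same windows as the slicing-based methods for any input string.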