Python – Extract K length substrings

Last Updated : 15 Jan, 2025

The task is to extract all substrings of a given length k from a string, that is, every contiguous run of k characters, including overlapping ones. Let’s explore several ways to do this in Python.

Using List Comprehension

A list comprehension is a concise way to extract the substrings. It slices the string at every valid start index and collects the resulting k-length slices in a single line.

Python
s = "geeksforgeeks"
k = 4

# Extracting k-length substrings using list comprehension
sub = [s[i:i+k] for i in range(len(s) - k + 1)]
print(sub)  

Output
['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']

Explanation:

  • List comprehension iterates over the string and slices it to get all substrings of length k.
  • range(len(s) - k + 1) stops at the last index where a full k-length slice still fits, so no shorter trailing slices are produced.
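The same comprehension can be wrapped in a small function to make the edge cases explicit. This is a minimal sketch: the name k_length_substrings and the choice to raise ValueError for a non-positive k are assumptions made for illustration, not part of the original example.

```python
def k_length_substrings(s, k):
    """Return all contiguous substrings of s that have length k, left to right."""
    if k <= 0:
        # Assumed policy for this sketch: reject non-positive lengths outright.
        raise ValueError("k must be a positive integer")
    # If k > len(s), the range is empty and the result is an empty list.
    return [s[i:i + k] for i in range(len(s) - k + 1)]


print(k_length_substrings("geeksforgeeks", 4)[:3])  # ['geek', 'eeks', 'eksf']
print(k_length_substrings("abc", 3))                # ['abc']
print(k_length_substrings("abc", 5))                # []
```

A string of length n has n - k + 1 substrings of length k, so the 13-character example with k = 4 yields 10 results.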

Let’s look at a few other ways to extract k-length substrings from a string.

Using a for loop

A plain for loop does the same work explicitly: it iterates over the valid start indices and appends each k-length slice to a list. It is the list comprehension written out step by step.

Python
s = "geeksforgeeks"
k = 4

# Initialize an empty list to store substrings
sub = []

# Loop through the string and extract k-length substrings
for i in range(len(s) - k + 1):
    sub.append(s[i:i+k])
print(sub) 

Output
['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']

Explanation:

  • A for loop iterates through the string, slicing substrings of length k.
  • Each substring is appended to the sub list.
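When the substrings only need to be consumed once, the loop can yield them instead of building a list. The sketch below assumes a hypothetical generator named iter_k_substrings and uses collections.Counter to count repeated substrings as one possible use.

```python
from collections import Counter


def iter_k_substrings(s, k):
    """Yield the k-length substrings of s one at a time, left to right."""
    for i in range(len(s) - k + 1):
        yield s[i:i + k]


# Count how often each 4-character substring occurs without storing a list first.
counts = Counter(iter_k_substrings("geeksforgeeks", 4))
print(counts["geek"])  # 2
print(counts["sfor"])  # 1
```

The generator keeps only one substring in memory at a time, which can matter for long strings.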

Using zip()

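The zip() function can also be used for this task. Zipping k copies of the string, each starting one character later than the previous one, yields tuples of k consecutive characters, which can then be joined into substrings.

Python
s = "geeksforgeeks"
k = 4

# Zip k shifted views of the string: s[0:], s[1:], ..., s[k-1:]
# zip stops at the shortest view, so every tuple has exactly k characters
substrings = [''.join(chars) for chars in zip(*(s[i:] for i in range(k)))]
print(substrings)

Output
['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']

Explanation:

  • zip() receives k shifted views of the string and pairs up the characters at the same position in each view, producing one tuple per k-length window.
  • A list comprehension joins each tuple into a substring.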
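To see how zip() can produce k-length windows, it helps to print the intermediate values for a short string. This is an illustrative sketch; the variable names views and windows are chosen here and do not come from the original example.

```python
s = "abcde"
k = 3

# k shifted views of the string, each starting one character later.
views = [s[i:] for i in range(k)]
print(views)    # ['abcde', 'bcde', 'cde']

# zip pairs up characters at the same position and stops at the shortest view.
windows = list(zip(*views))
print(windows)  # [('a', 'b', 'c'), ('b', 'c', 'd'), ('c', 'd', 'e')]

# Joining each tuple gives the k-length substrings.
print([''.join(w) for w in windows])  # ['abc', 'bcd', 'cde']
```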

Using re.findall()

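re.findall() can extract the substrings with a regular expression. Because ordinary matches cannot overlap, the pattern wraps a capturing group in a zero-width lookahead.

Python
import re
s = "geeksforgeeks"
k = 4

# Build the pattern from k; the lookahead (?=...) consumes no characters,
# so a match is attempted at every position and the matches can overlap
subs = re.findall(rf'(?=(.{{{k}}}))', s, flags=re.DOTALL)
print(subs)

Output
['geek', 'eeks', 'eksf', 'ksfo', 'sfor', 'forg', 'orge', 'rgee', 'geek', 'eeks']

Explanation:

  • For k = 4 the pattern is (?=(.{4})): a lookahead that captures the next 4 characters without consuming them, so overlapping substrings are returned.
  • The dot with re.DOTALL matches any character, whereas \w would skip substrings that contain spaces or punctuation.
  • Regular expressions are usually overkill for this task; slicing is simpler and easier to read.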
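A pattern such as (?=(\w{4})) fixes the length at 4 and captures only word characters, so it misses substrings that contain spaces or punctuation. The sketch below compares the two behaviours; the helper name k_substrings_regex is hypothetical, and it assumes k is at least 1.

```python
import re


def k_substrings_regex(s, k):
    """Return overlapping k-length substrings of s using a lookahead (assumes k >= 1)."""
    # '.' with re.DOTALL matches any character, including newlines.
    pattern = re.compile(r'(?=(.{%d}))' % k, flags=re.DOTALL)
    return pattern.findall(s)


text = "ab cd"
print(re.findall(r'(?=(\w{2}))', text))  # ['ab', 'cd']
print(k_substrings_regex(text, 2))       # ['ab', 'b ', ' c', 'cd']
```

For input made up only of letters, digits, and underscores, both patterns return the same substrings.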



