Strip Spaces, Tabs, and Newlines Using Python Regular Expression



Python's re module provides several ways to handle whitespace such as spaces, tabs, and newline characters, which can be matched and removed from strings with regular expressions.

This article explains how to split a string on newline characters using different regular expressions. The following are the methods to achieve this.

Using re.split(r"[\n]", text)

The re.split() function splits a string wherever the specified regular expression pattern matches. The pattern [\n] is a character class containing only the newline character, so the string is split wherever a single newline occurs. This is useful when we want to break a string into lines.

Example

Let's assume we have a multi-line string and want to break it into a list of individual lines. By using the re.split() function with the [\n] pattern, we can match every newline character and break the text at those points.

import re
s = """I find
 Tutorialspoint
  useful"""

result = re.split(r"[\n]", s)
print(result)

Following is the output of the above code -

['I find', ' Tutorialspoint', '  useful']
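As a small sketch of how this relates to stripping whitespace: a character class with a single member, [\n], behaves the same as the bare \n escape, and the leading spaces or tabs left in each piece can be trimmed afterwards with str.strip(). The sample string below is made up for illustration.

```python
import re

# Hypothetical sample text with stray spaces and a tab around the lines.
text = "alpha \n\tbeta\n  gamma"

# A character class with one member behaves like the bare escape.
with_class = re.split(r"[\n]", text)
without_class = re.split(r"\n", text)

# Trim leading/trailing spaces and tabs from each line after splitting.
trimmed = [line.strip() for line in with_class]

print(with_class)
print(trimmed)
```

Splitting only removes the newline characters themselves; any other whitespace stays in the pieces until it is stripped separately.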

Splitting on One or More Newlines Using the Quantifier \n+

To split on several newlines in a row, use the pattern \n+ with the re.split() function. The quantifier ( + ) means one or more occurrences of the preceding character or group. Note that the + must sit outside any square brackets: inside a character class, as in [\n+], it is a literal plus sign rather than a quantifier.

Example

The following example demonstrates how to split a string containing consecutive newlines into a list of segments using the re.split() function with the \n+ pattern.

import re
s = """First paragraph.

Second paragraph.

Third paragraph."""

result = re.split(r"\n+", s)
print(result)

Following is the output of the above code -

['First paragraph.', 'Second paragraph.', 'Third paragraph.']
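To make the role of the quantifier concrete, the sketch below compares r"\n+" with the look-alike character class r"[\n+]" on a made-up string that contains both a double newline and a literal plus sign. The variable names are only illustrative.

```python
import re

# Hypothetical input: two newlines in a row, then a literal '+'.
text = "a\n\nb+c"

# Character class: splits on every single newline AND on the literal '+'.
char_class = re.split(r"[\n+]", text)

# Quantifier: splits once per run of newlines; the '+' in the data is untouched.
quantified = re.split(r"\n+", text)

print(char_class)   # ['a', '', 'b', 'c']
print(quantified)   # ['a', 'b+c']
```

The character class produces an empty string between the two newlines and also breaks the text at the plus sign, which is usually not what is wanted when collapsing blank lines.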

Splitting on Newlines with Whitespace Using re.split(r"\n\s*\n", text)

The regular expression r"\n\s*\n" is used to split a string into parts at points where blocks of text are separated by blank lines, including blank lines that contain only whitespace. Following is the breakdown of the pattern.

  • \n: This matches a newline character.
  • \s*: \s matches a whitespace character (space, tab, newline, etc.), and the '*' means "zero or more occurrences" of the preceding character or group.
  • \n: This matches another newline character.

Example

The following example demonstrates how to split on blank lines, which may themselves contain whitespace, by using the r"\n\s*\n" pattern.

import re
s = """Line 1

     Line 2 with blank space above

Line 3"""

result = re.split(r"\n\s*\n", s)
print(result)

Following is the output of the above program -

['Line 1', '     Line 2 with blank space above', 'Line 3']
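The difference from \n+ shows up when the "blank" line actually holds spaces or tabs, or when a block spans several lines. The following is a minimal sketch on a made-up string; the names are illustrative only.

```python
import re

# Hypothetical input: the separator line contains two spaces and a tab,
# and the second block spans two lines joined by a single newline.
text = "Para 1\n  \t\nPara 2\nstill para 2"

# Splits only where two newlines are separated by optional whitespace.
by_blank_line = re.split(r"\n\s*\n", text)

# Splits at every run of newlines, so the whitespace-only line survives
# as its own piece and the second block is broken in two.
by_newline_run = re.split(r"\n+", text)

print(by_blank_line)    # ['Para 1', 'Para 2\nstill para 2']
print(by_newline_run)   # ['Para 1', '  \t', 'Para 2', 'still para 2']
```

With r"\n\s*\n" a single newline inside a block is left alone, so multi-line paragraphs stay together.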
Updated on: 2025-04-21T15:45:08+05:30
