Remove URLs from string in Python

Last Updated : 23 Jul, 2025

A regular expression (regex) is a sequence of characters that defines a search pattern in text. To remove URLs from a string in Python, you can use regular expressions through the built-in re module, or check candidate tokens with urllib.parse, which is also part of the standard library. In this article, we will see how we can remove URLs from a string in Python.

Python Remove URLs from a String

Below are the ways by which we can remove URLs from a string in Python:

Using the re.sub() function
Using the re.findall() function
Using the re.search() function
Using the urllib.parse module

Python Remove URLs from String Using re.sub() function

In this example, the code defines a function 'remove_urls' to find URLs in text and replace them with a placeholder [URL REMOVED], using regular expressions for pattern matching and the re.sub() method for substitution.

Python3

import re


def remove_urls(text, replacement_text="[URL REMOVED]"):
    # Define a regex pattern to match URLs
    url_pattern = re.compile(r'https?://\S+|www\.\S+')

    # Use the sub() method to replace URLs with the specified replacement text
    text_without_urls = url_pattern.sub(replacement_text, text)

    return text_without_urls


# Example:
input_text = "Visit on GeeksforGeeks Website: https://www.geeksforgeeks.org/"
output_text = remove_urls(input_text)

print("Original Text:")
print(input_text)
print("\nText with URLs Removed:")
print(output_text)

Output

Original Text:
Visit on GeeksforGeeks Website: https://www.geeksforgeeks.org/

Text with URLs Removed:
Visit on GeeksforGeeks Website: [URL REMOVED]
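
One thing to watch with the pattern above: \S+ matches every non-whitespace character, so a comma or full stop written directly after a URL is removed along with it. The sketch below is one possible way to keep that punctuation, passing a function to sub() instead of a string. The name remove_urls_keep_punctuation and the TRAILING_PUNCTUATION set are assumptions made for illustration, and the set may need adjusting for URLs that legitimately end in one of those characters.

```python
import re

# Same pattern as in the example above.
URL_PATTERN = re.compile(r'https?://\S+|www\.\S+')

# Assumed set of characters treated as sentence punctuation rather than URL text.
TRAILING_PUNCTUATION = '.,;:!?)"\''


def remove_urls_keep_punctuation(text, replacement_text="[URL REMOVED]"):
    def replace(match):
        url = match.group(0)
        stripped = url.rstrip(TRAILING_PUNCTUATION)
        # Re-attach whatever punctuation was trimmed off the end of the match.
        return replacement_text + url[len(stripped):]

    # sub() calls replace() once for every match it finds.
    return URL_PATTERN.sub(replace, text)


print(remove_urls_keep_punctuation("Read https://www.geeksforgeeks.org/, then practice."))
# Read [URL REMOVED], then practice.
```

With the plain string replacement, the same input would lose the comma after the URL.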

Remove URLs from String Using re.findall() function

In this example, the function 'remove_urls_findall' uses the re.findall() method to collect every URL in the given text, then replaces each one with the placeholder "[URL REMOVED]" using str.replace().

Python3

import re


def remove_urls_findall(text, replacement_text="[URL REMOVED]"):
    # Collect every substring that matches the URL pattern
    url_pattern = re.compile(r'https?://\S+|www\.\S+')
    urls = url_pattern.findall(text)

    # Replace each URL that was found with the replacement text
    for url in urls:
        text = text.replace(url, replacement_text)

    return text


# Example:
input_text = "Check out the latest Python tutorials on GeeksforGeeks: https://www.geeksforgeeks.org/category/python/"
output_text_findall = remove_urls_findall(input_text)

print("\nUsing re.findall():")
print("Original Text:")
print(input_text)
print("\nText with URLs Removed:")
print(output_text_findall)

Output

Using re.findall():
Original Text:
Check out the latest Python tutorials on GeeksforGeeks: https://www.geeksforgeeks.org/category/python/

Text with URLs Removed:
Check out the latest Python tutorials on GeeksforGeeks: [URL REMOVED]
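
Because str.replace() substitutes every occurrence of a substring, the findall() approach can misbehave when one URL in the text is a prefix of another: replacing the shorter URL first also rewrites the start of the longer one and leaves its tail behind. A minimal sketch of one way around this is to replace the longest URLs first. The function name remove_urls_findall_safe and the sample text are assumptions made for illustration.

```python
import re

URL_PATTERN = re.compile(r'https?://\S+|www\.\S+')


def remove_urls_findall_safe(text, replacement_text="[URL REMOVED]"):
    urls = URL_PATTERN.findall(text)
    # Drop duplicates, then replace longer URLs first so that a short URL
    # cannot overwrite the beginning of a longer one.
    for url in sorted(set(urls), key=len, reverse=True):
        text = text.replace(url, replacement_text)
    return text


sample = "Docs: www.example.com and www.example.com/python"
print(remove_urls_findall_safe(sample))
# Docs: [URL REMOVED] and [URL REMOVED]
```

Replacing in the order the URLs were found would instead give "Docs: [URL REMOVED] and [URL REMOVED]/python" for this sample. The re.sub() approach avoids the problem entirely, since it substitutes each match where it occurs.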