Remove duplicate words from Sentence using Regular Expression
Last Updated : 12 Jul, 2025

Given a string str that represents a sentence, the task is to remove consecutive duplicate words from the sentence using a regular expression in programming languages such as C++, Java, C#, Python, etc.

Examples of Remove Duplicate Words from Sentences

Input: str = "Good bye bye world world"
Output: Good bye world
Explanation: We remove the second occurrence of bye and world from Good bye bye world world.

Input: str = "Ram went went to to to his home"
Output: Ram went to his home
Explanation: We remove the second occurrence of went and the second and third occurrences of to from Ram went went to to to his home.

Input: str = "Hello hello world world"
Output: Hello world
Explanation: We remove the second occurrence of hello and world from Hello hello world world. The match is case-insensitive, so hello counts as a repeat of Hello.

Approach

1. Get the sentence.
2. Form a regular expression to remove duplicate words from sentences.
   regex = "\\b(\\w+)(?:\\W+\\1\\b)+";
   The details of the above regular expression can be understood as follows (the backslashes are doubled because the pattern is written as a Java or C++ string literal):
   "\\b": A word boundary. Boundaries are needed for special cases. For example, in "My thesis is great", the "is" that ends "thesis" followed by the word "is" won't be matched as a repeated word.
   "(\\w+)": One or more word characters, [a-zA-Z_0-9], captured as group 1.
   "(?:\\W+\\1\\b)+": A non-capturing group (denoted by (?:...)) that matches the repeated words. It breaks down further:
      "\\W+": Matches one or more non-word characters (anything that is not a word character).
      "\\1": A back reference to the first capturing group (\\w+). It ensures that the same word that was captured earlier is repeated, because \\1 refers to the exact text captured by that group.
      "\\b": Another word boundary, to ensure that the repeated word is a whole word.
      "+": This quantifier lets the non-capturing group (?:\\W+\\1\\b) match one or more times, so any number of repeats of the word is matched.
3. Match the sentence with the Regex.
   In Java, this can be done using Pattern.matcher().
4. Return the modified sentence.

Below is the implementation of the above approach:

C++

// C++ program to remove duplicate words
// using Regular Expression or ReGex.
#include <iostream>
#include <regex>
#include <string>
using namespace std;

// Function to remove consecutive duplicate
// words from the sentence
string removeDuplicateWords(string s)
{
    // Regex matching repeated words (case-insensitive).
    const regex pattern("\\b(\\w+)(?:\\W+\\1\\b)+",
                        regex_constants::icase);

    // Replace each run of repeated words with the
    // word captured in group 1.
    return regex_replace(s, pattern, "$1");
}

// Driver Code
int main()
{
    // Test Case: 1
    string str1 = "Good bye bye world world";
    cout << removeDuplicateWords(str1) << endl;

    // Test Case: 2
    string str2 = "Ram went went to to to his home";
    cout << removeDuplicateWords(str2) << endl;

    // Test Case: 3
    string str3 = "Hello hello world world";
    cout << removeDuplicateWords(str3) << endl;

    return 0;
}

// This code is contributed by yuvraj_chandra

Java

// Java program to remove duplicate words
// using Regular Expression or ReGex.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Driver Class
class GFG {

    // Function to remove consecutive duplicate
    // words from the sentence
    public static String removeDuplicateWords(String input)
    {
        // Regex matching repeated words.
        String regex = "\\b(\\w+)(?:\\W+\\1\\b)+";
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

        // Pattern class contains matcher() method
        // to find matching between given sentence
        // and regular expression.
        Matcher m = p.matcher(input);

        // Replace each run of repeated words with the
        // word captured in group 1.
        return m.replaceAll("$1");
    }

    // Driver code
    public static void main(String args[])
    {
        // Test Case: 1
        String str1 = "Good bye bye world world";
        System.out.println(removeDuplicateWords(str1));

        // Test Case: 2
        String str2 = "Ram went went to to to his home";
        System.out.println(removeDuplicateWords(str2));

        // Test Case: 3
        String str3 = "Hello hello world world";
        System.out.println(removeDuplicateWords(str3));
    }
}

Python3

# Python program to remove duplicate words
# using Regular Expression or ReGex.
import re


# Function to remove consecutive duplicate
# words from the sentence
def removeDuplicateWords(sentence):

    # Regex matching repeated words
    regex = r'\b(\w+)(?:\W+\1\b)+'

    # Replace each run of repeated words with the
    # word captured in group 1.
    return re.sub(regex, r'\1', sentence, flags=re.IGNORECASE)


# Driver Code

# Test Case: 1
str1 = "Good bye bye world world"
print(removeDuplicateWords(str1))

# Test Case: 2
str2 = "Ram went went to to to his home"
print(removeDuplicateWords(str2))

# Test Case: 3
str3 = "Hello hello world world"
print(removeDuplicateWords(str3))

# This code is contributed by yuvraj_chandra

C#

using System;
using System.Text.RegularExpressions;

class Program {

    // Function to remove consecutive duplicate
    // words from the sentence
    static string RemoveDuplicateWords(string s)
    {
        // Regex matching repeated words.
        Regex pattern = new Regex(@"\b(\w+)(?:\W+\1\b)+",
                                  RegexOptions.IgnoreCase);

        // Replace each run of repeated words with the
        // word captured in group 1.
        return pattern.Replace(s, "$1");
    }

    // Driver Code
    static void Main()
    {
        // Test Case: 1
        string str1 = "Good bye bye world world";
        Console.WriteLine(RemoveDuplicateWords(str1));

        // Test Case: 2
        string str2 = "Ram went went to to to his home";
        Console.WriteLine(RemoveDuplicateWords(str2));

        // Test Case: 3
        string str3 = "Hello hello world world";
        Console.WriteLine(RemoveDuplicateWords(str3));
    }
}

JavaScript

// Function to remove duplicate words using Regular Expression
function removeDuplicateWords(input) {
    // Regular expression to match repeated words
    // (g: all matches, i: case-insensitive)
    let regex = /\b(\w+)(?:\W+\1\b)+/gi;

    // Replace each run of repeated words with the first occurrence
    return input.replace(regex, '$1');
}

// Test cases

// Test Case: 1
let str1 = "Good bye bye world world";
console.log(removeDuplicateWords(str1));

// Test Case: 2
let str2 = "Ram went went to to to his home";
console.log(removeDuplicateWords(str2));

// Test Case: 3
let str3 = "Hello hello world world";
console.log(removeDuplicateWords(str3));

Output

Good bye world
Ram went to his home
Hello world

Complexity of the above Programs

Time Complexity: O(n) for typical sentences, where n is the length of the string
Auxiliary Space: O(n) for the returned string, since each program builds a new string rather than editing the input in place

Author: prashant_srivastava
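To make the role of each part of the pattern more concrete, the following Python sketch lists what the regex from the Approach section actually matches, and compares it with a variant that drops the word boundaries. The helper names show_matches and remove_duplicate_words and the NO_BOUNDARY pattern are illustrative assumptions and are not part of the programs above.

```python
import re

# Same pattern as in the article, written as a Python raw string.
PATTERN = re.compile(r"\b(\w+)(?:\W+\1\b)+", flags=re.IGNORECASE)

# Hypothetical variant without word boundaries, for comparison only.
NO_BOUNDARY = re.compile(r"(\w+)(?:\W+\1)+", flags=re.IGNORECASE)


def show_matches(pattern, sentence):
    """Return (whole match, captured word) pairs found in sentence."""
    return [(m.group(0), m.group(1)) for m in pattern.finditer(sentence)]


def remove_duplicate_words(sentence):
    """Collapse each run of adjacent repeated words to its first word."""
    return PATTERN.sub(r"\1", sentence)


if __name__ == "__main__":
    # Each run of repeats is one match; group 1 holds the word to keep.
    print(show_matches(PATTERN, "Ram went went to to to his home"))
    # [('went went', 'went'), ('to to to', 'to')]

    # With boundaries, the "is" ending "thesis" is not treated as a word.
    print(show_matches(PATTERN, "My thesis is great"))
    # []

    # Without boundaries, "thesis is" yields a false match on "is is".
    print(show_matches(NO_BOUNDARY, "My thesis is great"))
    # [('is is', 'is')]

    # Only adjacent repeats are collapsed; this sentence is unchanged.
    print(remove_duplicate_words("bye world bye"))
    # bye world bye
```

The last call shows a limit of this approach: because the back reference must follow the captured word directly (separated only by non-word characters), a word that reappears later in the sentence after other words is left in place.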