Delete Duplicate Lines in Text File using Java



The Set interface does not allow duplicate elements. Its add() method adds the given element to the Set object and returns true if the element was not already present. If you try to add an element that already exists, the set is left unchanged and the method returns false.
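
As a quick illustration of this behaviour, the following minimal sketch adds the same string to a HashSet twice and prints what add() returns each time. The class name SetAddDemo is only an illustrative choice.

```java
import java.util.HashSet;
import java.util.Set;

public class SetAddDemo {
    public static void main(String[] args) {
        Set<String> lines = new HashSet<>();
        // First insertion: the element is new, so add() returns true
        boolean first = lines.add("Hello how are you");
        // Second insertion of an equal element: add() returns false
        boolean second = lines.add("Hello how are you");
        System.out.println(first);        // prints true
        System.out.println(second);       // prints false
        System.out.println(lines.size()); // prints 1
    }
}
```

The boolean returned by add() is what lets a program decide, in a single call, whether a line has been seen before.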

Problem Statement

Given a file which contains duplicate lines, write a program in Java to read the file, remove duplicate lines, and write the unique lines to a new file.

Input

Hello how are you
Hello how are you
welcome to Tutorialspoint

Output

Hello how are you
welcome to Tutorialspoint

Basic Approach

The basic approach to remove duplicate lines from a file is:

  • Step 1. Instantiate the Scanner class (or any other class that reads data from a file).
  • Step 2. Instantiate the FileWriter class (or any other class that writes data into a file).
  • Step 3. Create a Set object, for example a HashSet.
  • Step 4. Read each line of the file and store it in a String, say input.
  • Step 5. Try to add this String to the Set object.
  • Step 6. If the addition is successful, append that line to the file writer.
  • Step 7. Finally, flush the contents of the FileWriter to the output file.

If the file contains a particular line more than once, the first occurrence is added to the Set object successfully and is therefore appended to the file writer.

When the same line is encountered again later in the file, it already exists in the Set object, so the add() method returns false and the line is not written again. Because each line is written as soon as it is first seen, the output keeps the lines in their original order.
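
The filtering logic described above can be tried out in memory, without touching any files. The sketch below is only an illustration: the class UniqueLinesSketch and the helper uniqueLines() are hypothetical names, and the input list stands in for the lines read from the file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UniqueLinesSketch {
    // Hypothetical helper: keeps the first occurrence of every line
    public static List<String> uniqueLines(List<String> lines) {
        Set<String> seen = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            // add() returns false for a line that is already in the set
            if (seen.add(line)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "Hello how are you",
            "Hello how are you",
            "welcome to Tutorialspoint");
        // prints [Hello how are you, welcome to Tutorialspoint]
        System.out.println(uniqueLines(input));
    }
}
```

The HashSet is used only to test membership. The order of the result comes from the list, which receives each line the first time it is seen.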

Example

The following Java program reads the input shown above (saved here as D://sample.txt), removes the duplicate lines, and writes the unique lines to a file named output.txt.

import java.io.File;
import java.io.FileWriter;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

public class DeletingDuplicateLines {
   public static void main(String args[]) throws Exception {
      String filePath = "D://sample.txt";
      //Instantiating the Scanner and FileWriter classes;
      //try-with-resources closes both of them automatically
      try (Scanner sc = new Scanner(new File(filePath));
           FileWriter writer = new FileWriter("D://output.txt")) {
         //Instantiating the Set class to track the lines already seen
         Set<String> set = new HashSet<>();
         while (sc.hasNextLine()) {
            String input = sc.nextLine();
            //add() returns true only the first time a line is seen
            if (set.add(input)) {
               writer.append(input + "\n");
            }
         }
         writer.flush();
      }
      System.out.println("Contents added............");
   }
}

Output

Contents added............

The contents of output.txt will be:

Hello how are you
welcome to Tutorialspoint
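
The same idea can also be written with the java.nio.file API. This is a sketch of an alternative, not the program shown above: it swaps Scanner and FileWriter for a BufferedReader and a BufferedWriter obtained from Files, and it assumes Java 8 or later. The class name DedupeWithNio, the helper copyUniqueLines(), and the command-line arguments are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

public class DedupeWithNio {
    // Hypothetical helper: copies each distinct line of source to target,
    // keeping first occurrences in order, and returns how many were written
    public static int copyUniqueLines(Path source, Path target) throws IOException {
        Set<String> seen = new HashSet<>();
        int written = 0;
        // Both streams are closed automatically, even if an exception occurs
        try (BufferedReader reader = Files.newBufferedReader(source);
             BufferedWriter writer = Files.newBufferedWriter(target)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (seen.add(line)) {
                    writer.write(line);
                    writer.newLine();
                    written++;
                }
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: java DedupeWithNio <input file> <output file>
        int count = copyUniqueLines(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(count + " unique lines written");
    }
}
```

Taking the paths as arguments avoids hard-coding a drive letter, and newLine() writes the platform's line separator instead of a fixed "\n".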

Updated on: 2024-07-08T12:02:44+05:30
