
Delete Duplicate Lines in Text File using Java
The Set interface does not allow duplicate elements. Its add() method adds the given element to the Set object and returns true if the element was not already present. If you try to add an element that already exists, the set is left unchanged and add() returns false.
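The return value of add() can be seen in a minimal sketch. The class name SetAddDemo and the helper addAll are hypothetical names chosen for illustration; only HashSet and Set.add() come from the Java standard library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SetAddDemo {
    // Adds each value, in order, to a fresh HashSet and records what add() returned.
    // (Hypothetical helper, used only for this illustration.)
    static boolean[] addAll(String... values) {
        Set<String> seen = new HashSet<>();
        boolean[] results = new boolean[values.length];
        for (int i = 0; i < values.length; i++) {
            results[i] = seen.add(values[i]); // false when the value is already in the set
        }
        return results;
    }

    public static void main(String[] args) {
        boolean[] results = addAll("Hello how are you", "Hello how are you", "welcome to Tutorialspoint");
        System.out.println(Arrays.toString(results)); // prints [true, false, true]
    }
}
```

The second call returns false because the same string was already added, which is the signal the rest of this article relies on.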
Problem Statement
Given a file which contains duplicate lines, write a program in Java to read the file, remove duplicate lines, and write the unique lines to a new file.
Input
Hello how are you
Hello how are you
welcome to Tutorialspoint
Output
Hello how are you
welcome to Tutorialspoint
Basic Approach
The basic approach to removing duplicate lines from a file:
- Step 1. Instantiate the Scanner class (or any class that reads data from a file).
- Step 2. Instantiate the FileWriter class (or any class that writes data to a file).
- Step 3. Create an object of the Set interface, for example a HashSet.
- Step 4. Read each line of the file and store it in a String, say input.
- Step 5. Try to add this String to the Set object.
- Step 6. If the addition is successful, append that line to the file writer.
- Step 7. Finally, flush the FileWriter so that its contents are written to the output file.
If the file contains a particular line more than once, its first occurrence is added to the Set object and is therefore appended to the file writer.
When the same line is encountered again later in the file, it already exists in the Set object, so add() returns false and the line is not written again.
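The filtering behaviour described above can be sketched in memory, without any file I/O. This is only an illustration: the class name UniqueLinesSketch and the method uniqueLines are hypothetical, and the list of lines stands in for the contents of the input file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UniqueLinesSketch {
    // Keeps a line only the first time it is seen (hypothetical helper).
    static List<String> uniqueLines(List<String> lines) {
        Set<String> seen = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (seen.add(line)) {   // true only for the first occurrence
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Hello how are you",
                "Hello how are you",
                "welcome to Tutorialspoint");
        System.out.println(uniqueLines(lines));
        // prints [Hello how are you, welcome to Tutorialspoint]
    }
}
```

Because each line is written at the moment add() succeeds, the kept lines stay in their original order even though a HashSet itself has no defined iteration order.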
Example
The following Java program removes the duplicate lines from the above file and writes the remaining unique lines to a file named output.txt.
import java.io.File;
import java.io.FileWriter;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

public class DeletingDuplcateLines {
   public static void main(String args[]) throws Exception {
      String filePath = "D://sample.txt";
      String input = null;
      // Instantiating the Set; add() returns false for a line seen before
      Set<String> set = new HashSet<>();
      // Instantiating the Scanner and FileWriter classes;
      // try-with-resources closes both when the block ends
      try (Scanner sc = new Scanner(new File(filePath));
           FileWriter writer = new FileWriter("D://output.txt")) {
         while (sc.hasNextLine()) {
            input = sc.nextLine();
            // Write the line only the first time it is encountered
            if (set.add(input)) {
               writer.append(input + "\n");
            }
         }
         writer.flush();
      }
      System.out.println("Contents added............");
   }
}
Output
Contents added............
The contents of output.txt will be:
Hello how are you
welcome to Tutorialspoint
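As an alternative to the Scanner and FileWriter version, the same result can be sketched with the java.nio.file API and a LinkedHashSet, which drops duplicates while keeping first-occurrence order. This is a swapped-in technique, not the approach used in the example above. The class name NioDuplicateRemover, the method removeDuplicateLines, and the temporary files are assumptions made for illustration, and the sketch reads the whole file into memory, so it suits small files only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class NioDuplicateRemover {
    // Reads all lines of source, drops repeats, and writes the rest to target.
    // (Hypothetical helper; both files are read and written as UTF-8.)
    static void removeDuplicateLines(Path source, Path target) throws IOException {
        List<String> lines = Files.readAllLines(source);
        // LinkedHashSet ignores repeated elements and preserves insertion order
        Set<String> unique = new LinkedHashSet<>(lines);
        Files.write(target, unique);
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for sample.txt and output.txt
        Path source = Files.createTempFile("sample", ".txt");
        Path target = Files.createTempFile("output", ".txt");
        Files.write(source, Arrays.asList(
                "Hello how are you",
                "Hello how are you",
                "welcome to Tutorialspoint"));

        removeDuplicateLines(source, target);
        System.out.println(Files.readAllLines(target));
        // prints [Hello how are you, welcome to Tutorialspoint]
    }
}
```

For large files, the line-by-line loop in the original example is the safer choice, since it holds only the set of distinct lines in memory rather than the full file plus the set.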