Reading a large text file efficiently in C++ means keeping memory usage bounded, independent of the file size, and keeping the per-byte processing overhead low. In this article, we will learn how to read a huge text file efficiently in C++.
Read a Large Text File Efficiently in C++
An efficient way to read a large text file is to read it in fixed-size chunks, rather than one line or one character at a time, and to parse each chunk into lines with std::istringstream. Because std::ifstream already buffers its input internally, the benefit comes mainly from making fewer library calls per byte and from keeping memory usage bounded regardless of file size; the actual speedup depends on the platform and the standard library implementation.
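As a minimal sketch of chunked reading on its own, the helper below counts newline characters by reading a stream in fixed-size blocks. The function name `count_newlines_chunked` and the default chunk size of 64 KiB are assumptions made for illustration; they are not part of the article's program.

```cpp
#include <algorithm>  // std::count
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>    // std::istringstream, handy for trying this without a file
#include <vector>

// Hypothetical helper: counts '\n' characters by reading `in` in fixed-size chunks.
std::size_t count_newlines_chunked(std::istream& in,
                                   std::size_t chunk_size = 64 * 1024)
{
    assert(chunk_size > 0);
    std::vector<char> buffer(chunk_size);
    std::size_t newlines = 0;

    // read() reports failure on the final short chunk, so gcount() is
    // checked as well to avoid dropping the last bytes of the input.
    while (in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()))
           || in.gcount() > 0) {
        const std::streamsize got = in.gcount();
        newlines += static_cast<std::size_t>(
            std::count(buffer.begin(), buffer.begin() + got, '\n'));
    }
    return newlines;
}
```

The same function works on a file by passing a std::ifstream opened on the file's path. Memory usage stays at one buffer of `chunk_size` bytes no matter how large the input is.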
Approach
- Create a std::ifstream object for the text file.
- Open the file by passing its path to the std::ifstream constructor, and check that it opened successfully.
- Read the file in chunks of a fixed size, BUFFER_SIZE, and process each chunk.
- Use std::istringstream to parse each chunk into lines.
- Print the non-empty lines returned by std::getline.
- Process any data remaining after the last full chunk so that nothing is left unprocessed.
- Close the file after all of its data has been processed.
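One subtlety in the steps above is that a chunk boundary can fall in the middle of a line, so the unfinished tail of one chunk has to be joined with the start of the next. The class below is a minimal sketch of that carry-over logic, independent of any file I/O. The name `LineAssembler` and its `feed` and `finish` methods are hypothetical and introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: turns arbitrarily split chunks into complete lines.
class LineAssembler {
public:
    // Appends a chunk and returns every line it completes (without '\n').
    std::vector<std::string> feed(const char* data, std::size_t size)
    {
        assert(data != nullptr || size == 0);
        pending_.append(data, size);

        std::vector<std::string> lines;
        std::size_t start = 0;
        std::size_t pos;
        while ((pos = pending_.find('\n', start)) != std::string::npos) {
            lines.emplace_back(pending_, start, pos - start);
            start = pos + 1;
        }
        pending_.erase(0, start);  // keep only the unfinished tail
        return lines;
    }

    // Returns the final line if the input did not end with '\n'.
    std::string finish()
    {
        std::string last;
        last.swap(pending_);
        return last;
    }

private:
    std::string pending_;  // partial line carried over between chunks
};
```

For example, feeding "Hello Wo" and then "rld\n" yields no lines from the first call and the single line "Hello World" from the second. The carried-over string never grows beyond one chunk plus the longest line in the input.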
C++ Program to Read a Huge Text File
The following program illustrates how to read a huge text file efficiently in C++.
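// C++ Program to read a huge text file efficiently
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

// Declare the buffer size (number of bytes read per chunk)
const int BUFFER_SIZE = 1024;

int main()
{
    ifstream file("huge_file.txt");
    if (!file.is_open()) {
        cerr << "Error: Could not open file 'huge_file.txt'."
             << endl;
        return 1;
    }

    vector<char> buffer(BUFFER_SIZE);
    // Holds a partial line carried over from the previous chunk,
    // so that lines straddling a chunk boundary are not split
    string leftover;

    // read() fails on the final short chunk, so gcount() is also
    // checked to make sure the last bytes are still processed
    while (file.read(buffer.data(), BUFFER_SIZE)
           || file.gcount() > 0) {
        leftover.append(buffer.data(),
                        static_cast<size_t>(file.gcount()));

        // Everything before the last newline is made of complete lines
        size_t last_newline = leftover.rfind('\n');
        if (last_newline == string::npos) {
            continue; // no complete line yet, read more data
        }

        // Parse the complete part of the data into lines
        istringstream iss(leftover.substr(0, last_newline));
        string line;
        while (getline(iss, line)) {
            if (!line.empty()) {
                cout << line << '\n';
            }
        }
        // Keep only the unfinished tail for the next chunk
        leftover.erase(0, last_newline + 1);
    }

    // Process any remaining data (a last line without a newline)
    if (!leftover.empty()) {
        cout << leftover << '\n';
    }

    // close the file
    file.close();
    return 0;
}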
Output
Hello World
GeeksforGeeks
C++ Java Python
GoLang Rust JavaScript
Time Complexity: O(N), where N is the total number of characters in the text file.
Auxiliary Space: O(B + M), where B is BUFFER_SIZE and M is the length of the longest line in the text file.