File Handling Guide
The above example code declares two objects, an input file stream object, and an output file stream object. Of
course, they can be named whatever you wish, like any other C++ variable.
A disk file consists of a body of text on the disk, arranged in a way determined by your computer's Operating System
(OS), which is responsible for keeping track of the information. If the file is deleted, moved, expanded, contracted,
etc., the OS keeps track of exactly where it is on the disk and how much of it there is. The C/C++ facilities for
working with disk files actually call OS subroutines to do the work.
So before you can use a disk file, you have to establish a relationship between your file stream object and the disk
file. More exactly, you have to ask the OS to connect your stream to the file. Fortunately, this is easy: Just tell the
stream object that you want to "open" the disk file and supply the name of the disk file as a C-string; the open
member function negotiates with the OS to locate that file on the disk and establish the connection between that file
and your stream object. Continuing the example:
    ifstream my_input_file;
    ofstream my_output_file;
    my_input_file.open("input_data");
    my_output_file.open("output_data");
Now the stream my_input_file is connected to the text file on disk named "input_data" and the stream
my_output_file is connected to the text file on disk named "output_data".
Instead of creating and then opening the file streams in separate statements, you can use a constructor that takes the
file name as an argument; after doing the normal initializations, the constructor completes the initialization by
opening the named file. The above four statements would then condense down to two:
    ifstream my_input_file("input_data");
    ofstream my_output_file("output_data");
In both ways of opening a file, you can specify the file path or file name either with a C-string array or literal (as the
above examples do), or in C++11 with a std::string. For example:
string filename;
cin >> filename;
ifstream my_input_file(filename);
When opening files, especially input files, it is critical to test whether the open operation succeeded. File stream
errors are discussed in more detail below. But for now, here is one way of doing this test using a member function
that returns true if the file was successfully opened:
    if (my_input_file.is_open()) {
        // can continue, file opened correctly
    }
Now that the file streams are open, using them could not be simpler. We can read and write variable values from/to
the streams using the stream input and output operators just like with cin and cout. For example, to read an integer
and a double from the input file:
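The statement that belongs here is missing from this copy, so the following is only a minimal sketch of what such a read and the matching write might look like. The function name copy_int_and_double, the variable names int_var and double_var, and the file-name parameters are all hypothetical.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: reads one integer and one double from the named
// input file and writes both values to the named output file.
// Returns true only if both files opened and both values were read.
bool copy_int_and_double(const std::string& in_name, const std::string& out_name)
{
    std::ifstream my_input_file(in_name);
    std::ofstream my_output_file(out_name);
    if (!my_input_file.is_open() || !my_output_file.is_open())
        return false;

    int int_var = 0;
    double double_var = 0.0;
    // same syntax as: cin >> int_var >> double_var;
    if (!(my_input_file >> int_var >> double_var))
        return false;

    // same syntax as: cout << int_var << ' ' << double_var << '\n';
    my_output_file << int_var << ' ' << double_var << '\n';
    my_input_file.close();
    my_output_file.close();
    return true;
}
```

The only difference from console I/O is the name of the stream on the left of the operators.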
The contents of the input file are processed just like you were typing them in via cin, and the output going into the
file looks identical to what is written on your display with cout. You can read or write as much information from the
file as is appropriate. Since ifstream and ofstream inherit from istream and ostream, your definitions of overloaded
operators << or >> for ostream and istream will automatically work for file streams.
An easy way to prepare an input file is to use a text editor that creates plain ASCII text files, such as using the "save
as text" option in a Windows or Mac word-processor. But by far the most convenient approach is to use the same
text editor you use for writing your programs - in Unix, this is just vi or emacs. The IDE program editors all work in
terms of text files. In MSVC or CW, simply create a new file, type your text content into it, and save it. The same
editors work great for viewing the contents of an output file as well. Just be sure that the last character in the file is a
whitespace character such as a newline - this avoids some odd end-of-file behaviors.
When your program is finished reading from or writing to a file, it is considered good programming practice to
"close" the file. This is asking the OS to disconnect your program from the disk file, and save the final state of the
file on disk. For an input file, the effect of closing is minor. For an output file, it can be vital to close the file
promptly; the file system normally "buffers" information before actually recording it on the disk - this saves a lot of
time. But until the buffer is "flushed" and all the information actually written to the disk, the file is in an incomplete
state. Fortunately, closing is even easier than opening:
    my_input_file.close();
    my_output_file.close();
A couple of handy tips: If you want to read a file twice with the same stream object, read it through once until
end-of-file occurs, clear the end-of-file state with clear(), then close the file, then reopen it, and start reading it again.
Opening an input file always resets it to read starting at the beginning. You can read information that your program
has just written into an output file by closing the output file stream, and then reopening that same disk file as an input
stream.
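Both tips can be sketched as follows. This is an illustration under assumptions, not the only way to do it; the function names sum_file_twice and write_then_read_back are hypothetical, and the files are assumed to hold whitespace-separated integers.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: reads every integer in the file twice using one
// stream object, and returns the total over both passes.
long sum_file_twice(const std::string& file_name)
{
    std::ifstream input_file(file_name);
    long total = 0;
    int value = 0;

    while (input_file >> value)   // first pass, runs until end-of-file
        total += value;

    input_file.clear();           // reset the eof and fail bits
    input_file.close();
    input_file.open(file_name);   // reopening starts at the beginning again

    while (input_file >> value)   // second pass
        total += value;

    input_file.close();
    return total;
}

// Hypothetical helper: writes a value, closes the output stream, then
// reopens the same disk file as an input stream and reads the value back.
int write_then_read_back(const std::string& file_name, int value)
{
    std::ofstream output_file(file_name);
    output_file << value << '\n';
    output_file.close();          // flushes the buffered output to the file

    std::ifstream input_file(file_name);
    int result = 0;
    input_file >> result;
    input_file.close();
    return result;
}
```

Neither helper checks is_open(); a real program should, as described above.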
Now, on to the complications. These concern policies for opening files, some additional handy member functions,
and what can go wrong - which includes how you tell when you have read all the way through a file.
However, the situation is different with an output file. If you open a file for output, using only the normal default
specifications (as above) and the OS cannot find a file with that name, the open function creates a new empty file
with that name. Why? Because decades of experience show this is the most convenient and sensible policy! This is
almost certainly what you want! This is why it is the default behavior.
But what if you open a file for output using only the normal default specifications (as above) and it already exists
and the OS finds it? The most sensible and convenient policy has proven to be the following: The existing file is
deleted, and a new empty file is created with the same name. Again, this is almost certainly what you want! This is
why it is the default behavior.
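A small sketch can make the default output policy concrete. The helper names write_text and read_all are hypothetical; the sketch assumes the program is allowed to create files in its working directory.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: opens the file for output with the default
// specifications, writes the text, and closes the file.
void write_text(const std::string& file_name, const std::string& text)
{
    // creates the file if it is missing, or replaces it with an empty one
    std::ofstream output_file(file_name);
    output_file << text;
    output_file.close();
}

// Hypothetical helper: returns the entire contents of the file.
std::string read_all(const std::string& file_name)
{
    std::ifstream input_file(file_name);
    std::string contents;
    char c;
    while (input_file.get(c))     // read every character, whitespace included
        contents += c;
    input_file.close();
    return contents;
}
```

Calling write_text twice on the same file name and then read_all should return only the text from the second call, because the second open discards what the first call wrote.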
What if you want something different? Consult a reference for other member functions and opening options that you
can supply. Full flexibility is available, but it is idiomatic to use the defaults when they apply (which they usually
do).
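As one example of such an option, the standard open mode std::ios::app (not otherwise covered in this handout) adds output to the end of an existing file instead of replacing it. A minimal sketch, with the hypothetical helper name append_line:

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: adds one line to the end of the file rather than
// replacing the file's contents. The second constructor argument is the
// standard open mode std::ios::app ("append").
void append_line(const std::string& file_name, const std::string& line)
{
    std::ofstream output_file(file_name, std::ios::app);
    output_file << line << '\n';
    output_file.close();
}
```

If the file does not exist yet, opening it in append mode creates it, just as the default mode does.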
Reads the next character, skipping nothing, and returns it as an integer. If eof is encountered, the function returns the
special value defined as the macro EOF. The need to test for EOF means that this form is relatively inconvenient for
reading from a file.
istream& get(char&);
cin.get(char_variable);
Reads the next character, skipping nothing, into the supplied char variable (notice the reference parameter). The
returned value is a reference to the istream object; this returned value can be used to test the stream state.
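Because get(char&) returns the stream, it can control a loop directly. The sketch below contrasts it with the >> operator, which skips whitespace; the function names are hypothetical, and any istream (a file stream, cin, or a string stream) can be passed in.

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper: counts every character in the stream, whitespace
// included, by reading one character at a time with get.
int count_characters(std::istream& input)
{
    int count = 0;
    char c;
    while (input.get(c))   // the returned stream tests false once input runs out
        ++count;
    return count;
}

// Hypothetical helper, for contrast: counts only the characters that the
// >> operator delivers, which excludes whitespace.
int count_non_whitespace(std::istream& input)
{
    int count = 0;
    char c;
    while (input >> c)
        ++count;
    return count;
}
```

For a stream holding the five characters 'a', space, 'b', newline, 'c', the first function counts 5 and the second counts 3.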
istream& getline(char * array, int n);
input_file.getline(buffer_char_array, buffer_length);
This function reads characters into the pointed-to character array, reading until it has either encountered a '\n' or
has read n-1 characters. It terminates the string of characters with '\0' so you get a valid C string regardless of how
much was read. As long as the supplied n is less than or equal to the array length, you will not overflow the array.
Conveniently, the '\n' character is removed from the stream and is not placed into the array, which is usually what
you want in order to read and process the information in a series of lines. The returned value is a reference to the
istream object, which can be used to test the stream state. If it fills the array without finding the '\n', the input
operation fails in the same way as invalid input (see below) to let you know that a newline was not found within the
size of the line you are trying to read.
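A minimal sketch of a line-reading loop built on this function follows. The helper name read_short_lines and the buffer size of 16 are arbitrary choices for illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads lines of at most 15 characters into a
// fixed-size buffer and collects them. Reading stops at the end of the
// input, or at the first line that does not fit in the buffer.
std::vector<std::string> read_short_lines(std::istream& input)
{
    const int buffer_length = 16;          // 15 characters plus the '\0'
    char buffer[buffer_length];
    std::vector<std::string> lines;
    while (input.getline(buffer, buffer_length))
        lines.push_back(buffer);           // the '\n' is not in the buffer
    return lines;
}
```

After the loop ends, the caller can tell the two stopping conditions apart: at a normal end of input eof() is true, while after an overlong line fail() is true and eof() is not.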
If you want different behaviors from these, consult a reference for different forms of get and getline that allow
different possible terminators and different treatments of them. Note that if you want to read a line into a
std::string, there is a special function just for this purpose, declared in <string>:
istream& getline(istream&, std::string&);
Because the string automatically expands as needed, reading a line into a std::string with this function cannot
overflow, and so is by far the best way to process a file (or cin input) a line at a time.
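A line-at-a-time loop using this function might look like the sketch below; read_lines is a hypothetical name, and the stream can be a file stream, cin, or a string stream.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole stream one line at a time.
std::vector<std::string> read_lines(std::istream& input)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(input, line))   // the '\n' is consumed but not stored
        lines.push_back(line);
    return lines;
}
```

Empty lines come back as empty strings, and a final line without a trailing newline is still returned.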
interpretations of the eof condition: it is either expected as a normal part of processing (we've read all the input and
so are ready to use it), or unexpected, meaning that some input is missing (there's supposed to be more!) and so
there is an error. Unexpected eofs in file input are usually handled by informing the user and then terminating the
program.
Important: Do not test the state of the eof bit (with the eof() member function) in order to control an input reading
loop! This will not detect fail or bad conditions, and the end of file condition is not raised until you try to read past
the input; the last successful input does not set the eof condition! Instead, control the reading loop by testing
whether the stream is in the good or not-good state. Use eof() only to determine why the stream is no longer good. See the
example below.
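The difference between the two loop styles can be seen in a short sketch. Both function names are hypothetical; the first function is the anti-pattern and is shown only for contrast.

```cpp
#include <istream>
#include <sstream>

// Anti-pattern, for contrast only: tests eof() before each read.
// The eof bit is not on after the last successful read, so the loop body
// runs one extra time with a read that fails.
// Warning: with invalid input this loop never ends, because the fail bit
// stops all reading and eof is never reached.
int count_with_eof_test(std::istream& input)
{
    int count = 0;
    int value = 0;
    while (!input.eof()) {
        input >> value;    // the result of the read is never checked
        ++count;
    }
    return count;
}

// Recommended form: the read itself controls the loop.
int count_with_state_test(std::istream& input)
{
    int count = 0;
    int value = 0;
    while (input >> value)
        ++count;
    return count;
}
```

For input consisting of "1 2 3" followed by a newline, the first function counts 4 items and the second counts 3.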
Also important: Remember that once the stream is no longer good, it will stay that way, and any additional input
operations will do nothing, no matter what they are or what is in the input. You have to clear the stream by resetting
the error bits before input will work on the stream again. If you don't clear the stream state, your program will drop
through all the remaining input statements doing nothing; often, your program will appear to hang, or loop forever.
While this may seem cranky, it is a way to ensure that if your program gets incorrect input, and your code does not
detect and handle it appropriately, something obviously wrong will probably happen.
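The following sketch shows a stream that stays unusable until it is cleared. The helper name read_past_bad_token and its recover flag are hypothetical, and the input is assumed to look like "12 abc 34".

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: expects an integer, one bad token, then another
// integer. Returns the last integer, or -1 if it could not be read.
// If recover is false, the stream state is never cleared.
int read_past_bad_token(std::istream& input, bool recover)
{
    int first = 0;
    int rejected = 0;
    input >> first;           // succeeds on input such as "12 abc 34"
    input >> rejected;        // fails on "abc"; the fail bit is now on

    if (recover) {
        input.clear();        // reset the error bits
        std::string junk;
        input >> junk;        // read and discard the offending token
    }

    int last = 0;
    if (!(input >> last))     // does nothing if the stream is still not good
        return -1;
    return last;
}
```

Without the call to clear(), the final read is skipped even though a perfectly valid integer is waiting in the input.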
Returns true if none of the error bits are on and the stream is ready to use (is open). This is a good choice for
asking "is everything OK?"
if (stream_object)
if (!stream_object)
if (stream_object >> var)
while (stream_object >> var)
These tests of the whole stream object are close to using good(). The stream classes have a conversion
operator that converts the stream object to a true-false value: true if neither the fail bit nor the bad bit is on
(the same as !fail()). This differs from good() only when the eof bit alone is on.
The first test is true if the stream is in a good state; the second if the stream is not in a good state. The third
and fourth examples test the result of the input operation (remember that the result of the input operator is the
stream object itself). The fourth example form is commonly used to repeatedly read the stream until an end of
file condition.
bool is_open();
Returns true if the stream is open. A good choice for testing for a successful opening because its name makes
the purpose of the test more obvious.
bool bad();
Returns true if the bad bit is on as a result of a "hard" I/O error. It is not the opposite of good().
bool fail();
Returns true if the fail bit is on due to invalid input or the bad bit is on (odd, but Standard) (see the Basic C++
Stream I/O handout).
bool eof();
Returns true if the eof bit is on due to trying to read past the end of the file.
clear();
Resets all of the error bits to off. Does not change which characters will be read next from the stream.
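To see how these member functions report different outcomes, here is a small sketch that attempts one read and describes the resulting state. The function name describe_read and the returned strings are hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: attempts to read one integer, then describes the
// stream state using the member functions listed above.
std::string describe_read(std::istream& input)
{
    int value = 0;
    input >> value;
    if (input.good())
        return "good";                      // no error bits are on
    if (input.bad())
        return "hard I/O error";
    if (input.eof() && input.fail())
        return "eof, nothing read";         // tried to read past the end
    if (input.eof())
        return "value read, eof reached";   // read succeeded, input ended right after
    return "invalid input";                 // fail bit only
}
```

Note the fourth case: when a number is the very last thing in the input with no whitespace after it, the read succeeds but the eof bit is already on, so good() is false even though fail() is also false.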
The action your program takes on an input error depends on the type of error encountered. If it is invalid input, your
program needs to clean up the input and clear the stream, and attempt to continue. If it is an expected eof, the
program simply goes to the next step in the processing. But if the eof is unexpected, something is wrong, and the
user needs to be informed. Finally, if you are checking for hard I/O errors, you need to deal with it if it turns out to
be the problem.
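One way these cases might be combined in a single reading loop is sketched below. It assumes that an invalid token should simply be skipped and that end of file is expected; the function name sum_valid_integers and the hard_error parameter are hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: sums all the integers in the stream, skipping any
// token that is not a valid integer. Sets hard_error to true if reading
// stopped because of a hard I/O error.
int sum_valid_integers(std::istream& input, bool& hard_error)
{
    hard_error = false;
    int sum = 0;
    int value = 0;
    while (true) {
        if (input >> value) {
            sum += value;            // normal case: process the input
        } else if (input.bad()) {
            hard_error = true;       // hard I/O error: let the caller decide
            break;
        } else if (input.eof()) {
            break;                   // expected eof: go on to the next step
        } else {
            input.clear();           // invalid input: reset the error bits,
            std::string junk;
            input >> junk;           // discard the bad token, and continue
        }
    }
    return sum;
}
```

A program in which end of file could be unexpected would need an additional check after the loop, for example on how many values were read.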
A simple example
The following example program opens an input and output file, checks each one for being open, and then reads
integers from the input file and writes twice the value of each one to the output file. It continues until the input
stream is no longer good. This simple pattern would be justified if the programmer was confident that (1) the input
data was always valid (no garbage); (2) No hard I/O errors would occur; (3) If 1 and 2 turn out to be false, we can
recognize it by other means. Under these conditions, the stream input fails only at end of file, making for a very
simple file-reading loop. For some kinds of data files (e.g. string data) where invalid input cannot happen, this
approach is usually adequate. The pattern represented by this program is:
Attempt to read some input.
Check the stream state.
If the state is good, process the input.
If the state is not good, assume expected eof condition and continue processing.
#include <iostream>
#include <fstream>
using namespace std;

int main()
{
    ifstream input_file("data.input");      // open the input file
    if (!input_file.is_open()) {            // check for successful opening
        cout << "Input file could not be opened! Terminating!" << endl;
        return 1;
    }
    ofstream output_file("data.output");    // open the output file
    if (!output_file.is_open()) {           // check for successful opening
        cout << "Output file could not be opened! Terminating!" << endl;
        return 1;
    }
    // read as long as the stream is good - any problem, just quit.
    // output is each number times two on a line by itself
    int datum;
    while (input_file >> datum) {
        output_file << datum * 2 << endl;
    }

    input_file.close();
    output_file.close();
    cout << "Done!" << endl;
    return 0;
}
--- input file ---
12 34 23
34
6
89
--- output file ---
24
68
46
68
12
178