Handling Files in C
• Handling Files in C
  o UNIX File Redirection
  o C File Handling - File Pointers
    ▪ Opening a file pointer using fopen
    ▪ Standard file pointers in UNIX
    ▪ Closing a file using fclose
  o Input and Output using file pointers
    ▪ Character Input and Output with Files
    ▪ Formatted Input Output with File Pointers
    ▪ Formatted Input Output with Strings
    ▪ Whole Line Input and Output using File Pointers
  o Special Characters
    ▪ NULL, The Null Pointer or Character
    ▪ EOF, The End of File Marker
  o Other String Handling Functions
  o Conclusion
Handling Files in C
This section describes the use of C's input / output facilities for reading and writing files.
There is also a brief description of string handling functions here.
The functions are all variants on the forms of input / output which were introduced in the
previous section.
UNIX File Redirection
UNIX has a facility called redirection which allows a program to access a single input
file and a single output file very easily. The program is written to read from the keyboard
and write to the terminal screen as normal.
To run prog1 but read data from file infile instead of the keyboard, you would type
prog1 < infile
To run prog1 and write data to outfile instead of the screen, you would type
prog1 > outfile
C Programming, 16 April 2002, Sawaluddin, [email protected]
Redirection is simple, and allows a single program to read or write data to or from files or
the screen and keyboard.
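As an illustration, here is a minimal sketch of a program body that suits redirection. The function name upcase_filter is hypothetical and is not part of the original notes; it only ever reads standard input and writes standard output.

```c
#include <stdio.h>
#include <ctype.h>

/* Hypothetical filter: copy standard input to standard output,
   converting each character to upper case.
   Returns the number of characters copied. */
long upcase_filter(void)
{
    int c;              /* int so that EOF can be distinguished from data */
    long count = 0;

    while ((c = getchar()) != EOF) {
        putchar(toupper(c));
        count++;
    }
    return count;
}
```

If main in prog1 simply calls upcase_filter(), the same executable works from the keyboard and screen, or with prog1 < infile > outfile, without any change to the code.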
Some programs need to access several files for input or output, and redirection cannot do
this. In such cases you will have to use C's file handling facilities.
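C File Handling - File Pointers
C communicates with files through a file pointer, a variable of type FILE * (declared in
<stdio.h>) which identifies one open file. A file pointer called output_file is declared
like this:
FILE *output_file;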
Opening a file pointer using fopen
Your program must open a file before it can access it. This is done using the fopen
function, which returns the required file pointer. If the file cannot be opened for any
reason then the value NULL will be returned, so the result of fopen should always be
tested before the pointer is used.
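fopen takes two arguments, both strings: the first is the name of the file to be opened,
the second is an access mode, which is usually one of:
"r" open an existing file for reading
"w" create a file for writing, discarding any previous contents
"a" open or create a file for writing, appending to the end
As usual, use the man command for further details by typing man fopen.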
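A minimal sketch of the usual open-and-test pattern follows. The helper name open_output and the wording of the error message are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: open a file for writing and report a failure.
   Returns the file pointer, or NULL if the file could not be opened. */
FILE *open_output(const char *filename)
{
    FILE *output_file = fopen(filename, "w");

    if (output_file == NULL) {
        /* fopen returns NULL when the file cannot be opened */
        fprintf(stderr, "Cannot open %s for writing\n", filename);
    }
    return output_file;
}
```

The caller still has to check the result, for example by giving up if open_output("results.txt") returns NULL (results.txt is just a placeholder name).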
Standard file pointers in UNIX
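UNIX systems provide three file pointers which are automatically open to all C
programs. These are
stdin, the standard input, normally the keyboard
stdout, the standard output, normally the screen
stderr, the standard error output, also normally the screen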
Since these files are already open, there is no need to use fopen on them.
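The sketch below shows the standard file pointers being used directly, with no call to fopen. The function report_status and its messages are invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: print a normal result on stdout and
   a complaint on stderr. Returns 0 for a valid value, -1 otherwise. */
int report_status(int value)
{
    if (value < 0) {
        /* error messages go to stderr so they are not lost
           when stdout is redirected to a file */
        fprintf(stderr, "error: negative value %d\n", value);
        return -1;
    }
    fprintf(stdout, "value is %d\n", value);
    return 0;
}
```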
Closing a file using fclose
The fclose function can be used to disconnect a file pointer from a file. This is usually
done so that the pointer can be used to access a different file. Systems have a limit on the
number of files which can be open simultaneously, so it is a good idea to close a file
when you have finished using it.
fclose(output_file);
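If files are still open when a program exits, the system will close them for you. However,
it is better to close the files explicitly, since the return value of fclose tells you
whether the last of the buffered output was written successfully.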
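As a sketch of the points above, the following function closes a file and then reuses the same pointer for a second file. The function name and the text written are assumptions for illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: write one line to each of two files,
   reusing a single file pointer. Returns 0 on success, -1 on failure. */
int write_two_files(const char *first_name, const char *second_name)
{
    FILE *output_file = fopen(first_name, "w");

    if (output_file == NULL)
        return -1;
    fprintf(output_file, "first file\n");
    if (fclose(output_file) == EOF)   /* EOF here means the close failed */
        return -1;

    /* The pointer is now free to be connected to a different file. */
    output_file = fopen(second_name, "w");
    if (output_file == NULL)
        return -1;
    fprintf(output_file, "second file\n");
    return fclose(output_file) == EOF ? -1 : 0;
}
```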
Input and Output using file pointers
Character Input and Output with Files
This is done using equivalents of getchar and putchar which are called getc and putc.
Each takes an extra argument, which identifies the file pointer to be used for input or
output.
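A minimal sketch of a character-by-character copy using getc and putc is given here. The function name copy_characters is hypothetical; both files are assumed to have been opened already with fopen.

```c
#include <stdio.h>

/* Hypothetical helper: copy every character from input_file to
   output_file. Returns the number of characters copied. */
long copy_characters(FILE *input_file, FILE *output_file)
{
    int c;              /* getc returns an int so that EOF fits */
    long count = 0;

    while ((c = getc(input_file)) != EOF) {
        putc(c, output_file);
        count++;
    }
    return count;
}
```

Calling copy_characters(stdin, stdout) behaves like a loop built from getchar and putchar.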
Formatted Input Output with File Pointers
Similarly there are equivalents to the functions printf and scanf which read or write data
to files. These are called fprintf and fscanf. You have already seen fprintf being used to
write data to stderr.
The functions are used in the same way, except that the fprintf and fscanf take the file
pointer as an additional first argument.
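The following sketch writes a pair of values with fprintf and reads them back with fscanf. The function names and the "number then reading" file layout are assumptions chosen for the example.

```c
#include <stdio.h>

/* Hypothetical helper: write an id and a value on one line.
   Returns 0 on success, -1 if fprintf reported an error. */
int save_reading(FILE *output_file, int id, double value)
{
    return fprintf(output_file, "%d %f\n", id, value) < 0 ? -1 : 0;
}

/* Hypothetical helper: read back one id and value.
   Returns 0 if both items were converted, -1 otherwise. */
int load_reading(FILE *input_file, int *id, double *value)
{
    /* fscanf returns the number of items it managed to convert */
    return fscanf(input_file, "%d %lf", id, value) == 2 ? 0 : -1;
}
```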
Formatted Input Output with Strings
These are the third set of the printf and scanf families. They are called sprintf and sscanf.
sprintf
puts formatted data into a string which must have sufficient space allocated to
hold it. This can be done by declaring it as an array of char. The data is formatted
according to a control string of the same form as that for printf.
sscanf
takes data from a string and stores it in other variables as specified by the control
string. This is done in the same way that scanf reads input data into variables.
sscanf is very useful for converting strings into numeric values.
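A short sketch of both functions follows. The names format_point and parse_point and the "x,y" text layout are invented for this example; the caller is assumed to supply a buffer of at least 32 characters, which is enough for two int values and a comma.

```c
#include <stdio.h>

/* Hypothetical helper: build the text "x,y" in buffer.
   The buffer must be large enough to hold the result. */
void format_point(char *buffer, int x, int y)
{
    sprintf(buffer, "%d,%d", x, y);
}

/* Hypothetical helper: convert text of the form "x,y" into two ints.
   Returns 0 if both numbers were found, -1 otherwise. */
int parse_point(const char *text, int *x, int *y)
{
    return sscanf(text, "%d,%d", x, y) == 2 ? 0 : -1;
}
```

Testing the value returned by sscanf is what makes it safe for converting strings to numbers, since text that is not a number simply produces a smaller count.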
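Whole Line Input and Output using File Pointers
Predictably, equivalents to gets and puts exist called fgets and fputs. The programmer
should be careful in using them, since they are incompatible with gets and puts. fgets
requires the programmer to specify the maximum number of characters to be read; gets has
no such limit and can overrun its buffer, which is why it was removed from the language in
C11. fgets keeps the trailing newline character in the line it reads and fputs writes the
string exactly as given, whereas gets discards the newline and puts appends one.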
When transferring data from files to standard input / output channels, the simplest way to
avoid incompatibility with the newline is to use fgets and fputs for files and standard
channels too. For example, a line held in data_string is written to the screen using
fputs(data_string, stdout);
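A minimal sketch of line-by-line copying with fgets and fputs is shown here. The function name copy_lines and the 80 character buffer size are assumptions for this example.

```c
#include <stdio.h>

#define LINE_LENGTH 80  /* assumed buffer size for this sketch */

/* Hypothetical helper: copy input_file to output_file a line at a time.
   Returns the number of successful fgets calls; a line longer than
   the buffer is read in several pieces and counted more than once. */
int copy_lines(FILE *input_file, FILE *output_file)
{
    char data_string[LINE_LENGTH];
    int reads = 0;

    /* fgets stores at most LINE_LENGTH - 1 characters
       and returns NULL at the end of the file */
    while (fgets(data_string, LINE_LENGTH, input_file) != NULL) {
        fputs(data_string, output_file);  /* newline is already in the string */
        reads++;
    }
    return reads;
}
```

Because the same pair of functions is used at both ends, copy_lines(stdin, stdout) needs no special treatment of the newline.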
Special Characters
C makes use of some 'invisible' characters which have already been mentioned. However
a fuller description seems appropriate here.
NULL, The Null Pointer or Character
NULL is a pointer value. A pointer variable holding NULL does not reference any object
(i.e. it is a pointer to nothing). The null character, written '\0', is a different thing:
it is the zero-valued char which marks the end of a string. It is usual for functions which
return pointers to return NULL if they failed in some way. The return value can be tested,
as described in the section on fopen.
NULL is returned by read functions of the gets family when they try to read beyond the
end of an input file.
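EOF, The End of File Marker
EOF is a negative integer value, defined in <stdio.h>, which indicates the end of a file.
It is not a character stored in the file, so the result of getc should be held in an int
rather than a char. It is returned by read functions of the getc and scanf families when
they try to read beyond the end of a file.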
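The sketch below uses the value returned by fscanf to tell the end of the file apart from badly formed input. The function name sum_integers and the whitespace-separated file layout are assumptions for this example.

```c
#include <stdio.h>

/* Hypothetical helper: add up whitespace-separated integers.
   Returns 0 if the input ended cleanly, -1 if it stopped early
   at something that is not a number. */
int sum_integers(FILE *input_file, long *total)
{
    int value;
    int status;

    *total = 0;
    while ((status = fscanf(input_file, "%d", &value)) == 1)
        *total += value;

    /* status is EOF at the end of the input (or on a read error),
       and 0 when the next item could not be converted */
    return status == EOF ? 0 : -1;
}
```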
Other String Handling Functions
As well as sprintf and sscanf, the UNIX system has a number of other string handling
functions within its libraries. A number of the most useful ones are declared in the
<string.h> header, and are made available by putting the line
#include <string.h>
near the start of your program.
A full list of these functions can be seen using the man command by typing
man 3 string
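As a brief illustration of three of these functions (strlen, strcpy and strcat), here is a minimal sketch. The function join_names is hypothetical and not part of any library.

```c
#include <string.h>

/* Hypothetical helper: build "first last" in dest, which holds
   size characters. Returns 0 on success, -1 if dest is too small. */
int join_names(char *dest, size_t size, const char *first, const char *last)
{
    /* room needed: both names, one space and the terminating '\0' */
    if (strlen(first) + strlen(last) + 2 > size)
        return -1;

    strcpy(dest, first);   /* copy the first name into dest */
    strcat(dest, " ");     /* append a separating space     */
    strcat(dest, last);    /* append the last name          */
    return 0;
}
```

The result can be compared with another string using strcmp, which returns 0 when the two strings are identical.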
Conclusion
The variety of different types of input and output, using standard input or output, files or
character strings, makes C a very powerful language. The addition of character input and
output makes it highly suitable for applications where the format of data must be
controlled very precisely.