
Chap.5.File Handling

The document discusses various file handling concepts in Python including: 1. Data can be stored in text files using ASCII/Unicode characters or in binary files as a stream of bytes. 2. Text files can be opened, read from and written to using methods like open(), read(), write(), etc. 3. Binary files are used to store non-text data like objects and lists using pickling and serialization methods from the pickle module.

Uploaded by

manaspatelms
Copyright
© All Rights Reserved
Chap. 5

File Handling

Programming languages provide facilities to create and use files from within
programs.

Data Files: The data files store data pertaining to a specific application.

Data files can be stored in two ways:

1. Text Files
2. Binary Files

1. Text Files:
A text file stores information in the form of a stream of ASCII or UNICODE
characters.
In text files, each line of text is terminated (delimited) with a special
character (as per OS), known as EOL (End Of Line).
In Python, by default, text mode treats any of '\n', '\r' or '\r\n' as a line
ending when reading and translates it to the new line character '\n'; when
writing, '\n' is translated to the line ending of the OS.

The text files can be of the following types:

(a) Regular text file – These are the text files which store the text in
the same form as typed. Here the newline character ends a line
and the text translation takes place. These files have a file
extension ‘.txt’.
(b) Delimited Text Files: In these, a specific character is used to
separate the values, i.e. use of a tab or a comma after each value.
When a tab character is used to separate the values, these are
called TSV files (Tab Separated Values files), when a comma
character is used to separate the values, these are called CSV files
(Comma Separated Values files).

e.g.
Regular text file ------------- I am simple text.
TSV file ------------------------ I →am→simple→text.
CSV file ----------------------- I,am,simple,text.
2. Binary Files:
A binary file stores data and information in the form of a stream of bytes. It
stores the information in the same format in which the information is held
in memory, i.e. the data read from the file is raw (without any translation).
In binary files there is no delimiter for a line.
Binary files can take a variety of extensions. Most files with non-text
extensions are binary files.
NOTE: The text files can be opened in any text editor and are in human readable
form while binary files are not in human readable form.

Opening and Closing files

To work with a file in Python, we first need to open it in a specific mode.


The most basic file manipulation tasks include adding, modifying or deleting
data in a file, which in turn include any one or combination of the following
operations:
1. Reading data from files
2. Writing data into files
3. Appending data to files

Opening Files: To open a file we use one of the following syntaxes:

<file_objectname> = open(<file_name>)
<file_objectname> = open(<file_name>, <mode>)
* The file_objectname is also called the file handle.
e.g. myfile = open("D:\\filehandlind\\taxes.txt")

File Object / File Handle – A file object is a reference to a file on disk. It opens
and makes the file available for the different tasks.

File Access Modes – When we open a file, Python needs to know the file mode
in which the file is to be opened. The file mode governs the type of operations
possible in the opened file.
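
The most common text modes are 'r' (read, the default), 'w' (write, which discards any existing content) and 'a' (append); adding 'b' to a mode selects binary mode. The following is a minimal sketch of these three modes. The file name demo_modes.txt and the temporary directory are assumptions chosen only so the example runs anywhere.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "demo_modes.txt")

# 'w' creates the file (or truncates an existing one) for writing.
fh = open(path, "w")
fh.write("first line\n")
fh.close()

# 'a' keeps the existing content and adds new data at the end.
fh = open(path, "a")
fh.write("second line\n")
fh.close()

# 'r' is the default mode, so it can be omitted when only reading.
fh = open(path)
content = fh.read()
fh.close()

print(content)
```

Opening the same file again with 'w' would remove both lines, which is why 'a' is used for the second write.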
Closing Files: An open file is closed by calling the close() method of its
file-object.

The close() method breaks the link between the file-object and the file on the
disk. After close(), no task can be performed on that file through the file-object.

Why is closing a file important?

• Closing a file using close() is a good programming practice.
• Data from the last write operation may remain in the output buffer in
memory until close() is called. If the file is never closed, this data
may be lost.
• Some operating systems restrict the number of files that can be open at
a time, so it becomes necessary to close a file before opening a new file.
• Some operating systems treat open files as locked and private.
A file that is left open when it is no longer in use holds on to system
resources unnecessarily.
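
The effect of close() can be sketched as follows. The example also uses Python's with statement, which these notes do not cover; it is shown here only as a common way to have close() called automatically. The file name demo_close.txt is hypothetical.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "demo_close.txt")

# Explicit close(): the file object reports itself as closed afterwards.
fh = open(path, "w")
fh.write("Hello")
fh.close()
closed_after_close = fh.closed

# with statement: close() is called automatically when the block ends.
with open(path, "r") as fh2:
    text = fh2.read()
closed_after_with = fh2.closed

# Any further operation on a closed file raises ValueError.
try:
    fh2.read()
    error_raised = False
except ValueError:
    error_raised = True

print(closed_after_close, closed_after_with, error_raised)
```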
Working with text files
Reading from text files: Python provides the following read functions to
read from a data file.
1. read() <file_handle>.read() reads the whole file.
2. read(n) <file_handle>.read(n) reads at most n characters.
3. readline() <file_handle>.readline() reads a line from the file.
4. readline(n) <file_handle>.readline(n) reads at most n characters from
the line.
5. readlines() <file_handle>.readlines() reads all lines and returns a
list of all lines.
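
A minimal sketch of how these read functions behave on a small three-line file. The file name demo_read.txt and its contents are assumptions made for the example.

```python
import os
import tempfile

# Hypothetical three-line file, created only for this example.
path = os.path.join(tempfile.mkdtemp(), "demo_read.txt")
fh = open(path, "w")
fh.write("line one\nline two\nline three\n")
fh.close()

# read() with no argument returns the whole file as one string.
fh = open(path)
whole = fh.read()
fh.close()

# Each call continues from where the previous one stopped.
fh = open(path)
first_four = fh.read(4)       # at most 4 characters
rest_of_line = fh.readline()  # the rest of the first line, including '\n'
part = fh.readline(3)         # at most 3 characters of the second line
remaining = fh.readlines()    # everything left, as a list of lines
fh.close()

print(first_four, repr(rest_of_line), part, remaining)
```

Note that readline() and readlines() keep the EOL character at the end of each line.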

Writing onto text files: Python provides mainly two functions to write
onto a data file.
1. write() <file_handle>.write(str1) writes string str1 to the file.
2. writelines() <file_handle>.writelines(L) writes all strings from
list L to the file; no EOL character is added after each string.
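
A short sketch of both write functions, assuming a hypothetical file demo_write.txt. It shows that the EOL character has to be supplied by the program, since neither function adds it.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "demo_write.txt")

lines = ["alpha\n", "beta\n"]

fh = open(path, "w")
count = fh.write("header\n")  # write() returns the number of characters written
fh.writelines(lines)          # each string already ends with '\n'
fh.writelines(["x", "y"])     # no '\n' supplied, so these end up joined as "xy"
fh.close()

fh = open(path)
result = fh.read()
fh.close()

print(repr(result))
```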
flush() Function: This function forces data that is still pending in the
output buffer to be written to the disk.
myfile = open("D:\\file\\tax.txt", 'a+')
myfile.write("Hello")
myfile.flush()
myfile.close()

Removing whitespaces after reading from file


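The read functions return strings, so rstrip() and lstrip() are applied to the
line that was read, not to the file handle itself.
1. Removing whitespaces or EOL character from the end of a line.
<file_handle> = open(<file_name>, 'r')
<line> = <file_handle>.readline()
<line>.rstrip() # removes trailing whitespaces, including the EOL character
<line>.rstrip('\n') # removes only the EOL character
2. Removing whitespaces from the left of a line.
<file_handle> = open(<file_name>, 'r')
<line> = <file_handle>.readline()
<line>.lstrip() # removes leading whitespaces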
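
The difference between these calls can be sketched on a single padded line. The file name demo_strip.txt and the sample text are assumptions made for the example.

```python
import os
import tempfile

# Hypothetical file containing one line with spaces on both sides.
path = os.path.join(tempfile.mkdtemp(), "demo_strip.txt")
fh = open(path, "w")
fh.write("   padded text   \n")
fh.close()

fh = open(path, "r")
line = fh.readline()  # the string still ends with '\n'
fh.close()

no_eol = line.rstrip("\n")   # removes only the EOL character
no_trailing = line.rstrip()  # removes trailing spaces and the EOL character
no_leading = line.lstrip()   # removes leading spaces only

print(repr(no_eol), repr(no_trailing), repr(no_leading))
```

These methods return new strings; the original line is left unchanged.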
Accessing and manipulating location of a file pointer: Python provides two
functions, tell() and seek(), that help us manipulate the position of the
file-pointer so that we can read and write from the desired position.
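
A minimal sketch of tell() and seek(), assuming a hypothetical file demo_seek.txt. The file is read in binary mode ('rb') here because byte offsets are well defined in that mode; in text mode, seek() should only be given 0 or a value previously returned by tell().

```python
import os
import tempfile

# Hypothetical file containing ten characters.
path = os.path.join(tempfile.mkdtemp(), "demo_seek.txt")
fh = open(path, "w")
fh.write("0123456789")
fh.close()

fh = open(path, "rb")
start = fh.tell()       # position of the file pointer right after opening
fh.read(4)              # reading moves the file pointer forward
after_read = fh.tell()
fh.seek(7)              # move the file pointer to byte offset 7
tail = fh.read()        # read from offset 7 to the end
fh.seek(0)              # move back to the beginning of the file
head = fh.read(3)
fh.close()

print(start, after_read, tail, head)
```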
Working with Binary Files
Sometimes we need to write and read non-simple objects such as dictionaries,
tuples, lists or nested lists. Since objects have a specific structure or
hierarchy, it becomes important to store them in a way that maintains their
structure/hierarchy.
For this purpose, objects are often serialized and then stored in binary files.

Serialization/pickling - It is the process of converting a Python object hierarchy
into a byte stream in such a way that it can be reconstructed in its original form
when unpickled or de-serialized.

Unpickling/De-serialization - It is the reverse process of pickling/serialization.

• Python provides the pickle module to achieve this.
• The pickle module implements a fundamental, but powerful
algorithm for pickling and unpickling a Python object structure.

Process to work with pickle module:


1. Import pickle module.
2. Open binary file in the required file mode (read, write or append).
3. Process binary file by reading/writing objects using pickle methods.
4. Close the file.

Pickle methods:
1. dump() method – pickle.dump(<object>, <file_handle>) writes an object to a binary file.
2. load() method – <object> = pickle.load(<file_handle>) reads an object from a binary file.

Example:
>>> import pickle
>>> fh = open("D:\\file\\myfile.dat", 'wb')   # open for writing in binary mode
>>> pickle.dump("Hello", fh)
>>> fh.close()

>>> stud_details = ['Pankaj Kumar', '18', 'Male', '5698745263']
>>> fh = open("D:\\file\\myfile.dat", 'wb')   # 'wb' overwrites the earlier content
>>> pickle.dump(stud_details, fh)
>>> fh.close()

>>> stud2_details = ['Diksha', '16', 'Female', '5568745263']
>>> fh = open("D:\\file\\myfile.dat", 'ab')   # 'ab' appends after the existing object
>>> pickle.dump(stud2_details, fh)
>>> fh.close()

>>> fh = open("D:\\file\\myfile.dat", 'rb')   # open for reading in binary mode
>>> pickle.load(fh)                           # each load() returns the next object
['Pankaj Kumar', '18', 'Male', '5698745263']
>>> pickle.load(fh)
['Diksha', '16', 'Female', '5568745263']
>>> fh.close()
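
When the number of pickled objects in a file is not known in advance, one common approach is to keep calling load() until it raises EOFError. The sketch below assumes a hypothetical file students.dat in a temporary directory and reuses the two sample records from the example above.

```python
import os
import pickle
import tempfile

# Hypothetical binary file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "students.dat")

records = [
    ['Pankaj Kumar', '18', 'Male', '5698745263'],
    ['Diksha', '16', 'Female', '5568745263'],
]

# Write each record as a separate pickled object.
fh = open(path, "wb")
for record in records:
    pickle.dump(record, fh)
fh.close()

# Read objects back one at a time until the end of the file is reached.
loaded = []
fh = open(path, "rb")
while True:
    try:
        loaded.append(pickle.load(fh))
    except EOFError:
        break  # no more pickled objects in the file
fh.close()

print(loaded)
```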

Working with CSV Files

CSV – Comma Separated Values

CSV files are delimited files that store tabular data in rows and columns,
where a comma delimits every value in a row.

The separator character of CSV files is called a delimiter. The common and
most popular delimiter is comma (,). Other popular delimiters are tab (\t),
colon (:), pipe (|) and semi-colon (;) characters.
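
A minimal sketch of using a delimiter other than the comma, through the delimiter parameter of the csv module's writer and reader. The file name student_pipe.csv and the sample row are assumptions made for the example.

```python
import csv
import os
import tempfile

# Hypothetical file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "student_pipe.csv")

# Write two rows using the pipe character as the delimiter.
fh = open(path, "w", newline="")
writer = csv.writer(fh, delimiter="|")
writer.writerow(["Roll", "Name", "Marks"])
writer.writerow(["01", "Anupriya", "40"])
fh.close()

# The raw file content shows the pipe between values.
fh = open(path, "r", newline="")
raw_first_line = fh.readline().rstrip("\r\n")
fh.close()

# The same delimiter must be given when reading the file back.
fh = open(path, "r", newline="")
rows = list(csv.reader(fh, delimiter="|"))
fh.close()

print(raw_first_line, rows)
```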

Python csv module – It provides functionality to read and write tabular data in
CSV format. It provides two specific types of objects – the reader and writer
objects – to read and write into CSV files.

Opening a CSV file:

fh = open("D:\\file\\student.csv", 'w')
Passing newline = '' while opening lets the csv module handle the line endings itself.
e.g.
import csv
fh = open("D:\\file\\student.csv", "w", newline = '')
stuwriter = csv.writer(fh)
stuwriter.writerow(['Roll', 'Name', 'Marks'])
det = (['01', 'Anupriya', '40'], ['02', 'Pankaj', '40'])
stuwriter.writerows(det)
fh.close()
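
Reading the data back uses the reader object mentioned earlier. The sketch below writes the same sample rows to a hypothetical file student.csv in a temporary directory and then reads them with csv.reader; every value comes back as a string.

```python
import csv
import os
import tempfile

# Hypothetical file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "student.csv")

# Write a header row and two data rows.
fh = open(path, "w", newline="")
stuwriter = csv.writer(fh)
stuwriter.writerow(['Roll', 'Name', 'Marks'])
stuwriter.writerows([['01', 'Anupriya', '40'], ['02', 'Pankaj', '40']])
fh.close()

# csv.reader yields each row of the file as a list of strings.
fh = open(path, "r", newline="")
stureader = csv.reader(fh)
rows = []
for row in stureader:
    rows.append(row)
fh.close()

print(rows)
```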
