Introduction to Binary Files in Python
This presentation explores binary files in Python. They efficiently
store non-text data like images and objects. Python's pickle module
serialises objects into byte streams. This topic is crucial for
persistent storage of complex data structures.
by surya chandran
Text Files vs. Binary Files
Text Files
Human-readable, store characters (ASCII/Unicode).
Slower for large, complex data. Conversions needed for storage.
Example: data.txt contains "Hello World".
Binary Files
Store raw bytes (0s and 1s), not directly human-readable.
Faster for I/O. No character encoding/decoding overhead.
Example: Python object {'name': 'Alice'} stored as byte sequence.
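The difference between the two kinds of data can be sketched in a few lines. This is a minimal illustration, not a benchmark: the variable names are chosen here for the example, and it assumes UTF-8 as the text encoding.

```python
import pickle

# Text: characters must be encoded to bytes before they can be stored.
text_bytes = "Hello World".encode("utf-8")

# Binary: pickle turns a Python object directly into a byte sequence.
obj = {"name": "Alice"}
obj_bytes = pickle.dumps(obj)

# The byte sequence can be turned back into an equal object.
restored = pickle.loads(obj_bytes)

print(text_bytes)                # b'Hello World'
print(type(obj_bytes).__name__)  # bytes
print(restored == obj)           # True
```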
Working with Binary Files: Opening & Closing
Open Function
Use open() with binary modes: 'wb', 'rb', 'ab'.
Example: open("student.dat", "wb").
With Statement
Best practice: with open(...) as f:.
Ensures automatic file closure, preventing data loss.
Other Modes
'r+b' (read/write), 'w+b' (write/read, truncates), 'a+b'
(append/read).
Choose mode based on operation needed.
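A short sketch of the three basic modes used with the with statement. The file is created in a temporary folder so the example leaves nothing behind in the working directory; the path and the bytes written are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical location for the example; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "student.dat")

# 'wb' creates the file (or truncates an existing one) and writes bytes.
with open(path, "wb") as f:
    f.write(b"\x01\x02\x03")

# 'ab' adds bytes at the end without touching what is already there.
with open(path, "ab") as f:
    f.write(b"\x04")

# 'rb' reads the raw bytes back.
with open(path, "rb") as f:
    data = f.read()

print(data)      # b'\x01\x02\x03\x04'
print(f.closed)  # True: the with block closed the file automatically
```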
Writing Data to Binary Files (pickle.dump)
Import Pickle
First, import the pickle module: import pickle.
Dump Object
Use pickle.dump(object, file_object) to write Python objects.
This serialises the object to a byte stream.
Sequential Write
Data is written sequentially to the binary file.
Example: pickle.dump({'name': 'Rohan'}, file_object).
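Put together, the three steps might look like the sketch below. The records and the temporary file location are made up for the example; each call to pickle.dump adds one record after the previous one.

```python
import os
import pickle
import tempfile

# Hypothetical location and sample records for the example.
path = os.path.join(tempfile.mkdtemp(), "student.dat")
students = [
    {"roll": 1, "name": "Rohan"},
    {"roll": 2, "name": "Alice"},
]

# Open in 'wb' mode and dump the records one after another.
with open(path, "wb") as file_object:
    for student in students:
        pickle.dump(student, file_object)

# The file now holds two pickled records back to back.
size = os.path.getsize(path)
print(size > 0)  # True
```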
Reading Data from Binary Files (pickle.load)
Load Single Object
pickle.load(file_object) reads one pickled object.
This deserialises the byte stream to a Python object.
Read Multiple Records
Use a loop with try-except EOFError.
EOFError signals the end of file data.
Example Usage
student_data = pickle.load(file_object).
Process each object as it is read.
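A minimal sketch of the read loop, assuming a file that holds several pickled records. The helper name read_all_records and the sample data are hypothetical; the file is written first so the example runs on its own.

```python
import os
import pickle
import tempfile


def read_all_records(path):
    """Return every pickled object in the file, in the order written."""
    records = []
    with open(path, "rb") as file_object:
        while True:
            try:
                # Each call reads exactly one pickled object.
                records.append(pickle.load(file_object))
            except EOFError:
                # No more data: stop reading.
                break
    return records


# Create a sample file with three records (made-up data).
path = os.path.join(tempfile.mkdtemp(), "student.dat")
with open(path, "wb") as file_object:
    for name in ["Rohan", "Alice", "Meera"]:
        pickle.dump({"name": name}, file_object)

records = read_all_records(path)
for student_data in records:
    print(student_data["name"])
```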
Navigating Binary Files (seek & tell)
Tell Current Position
file_object.tell() returns the current cursor position.
It indicates bytes from the file's start.
Seek to Position
file_object.seek(offset, whence) moves the cursor.
offset is bytes to move.
Whence Options
whence can be 0 (start), 1 (current), or 2 (end).
Crucial for random access in files.
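The sketch below tries each whence value on a small file of ten known bytes, so every position is easy to check by hand. The file contents and variable names are chosen for the example only.

```python
import os
import tempfile

# Hypothetical 10-byte file: positions 0..9 hold the letters A..J.
path = os.path.join(tempfile.mkdtemp(), "letters.dat")
with open(path, "wb") as f:
    f.write(b"ABCDEFGHIJ")

with open(path, "rb") as f:
    start = f.tell()       # 0: a freshly opened file starts at the beginning

    f.seek(4, 0)           # whence 0: 4 bytes from the start
    after_seek = f.tell()  # 4
    chunk = f.read(2)      # b'EF'; reading moves the cursor to 6

    f.seek(-3, 1)          # whence 1: 3 bytes back from the current position
    relative = f.tell()    # 3

    f.seek(-2, 2)          # whence 2: 2 bytes before the end
    tail = f.read()        # b'IJ'

print(start, after_seek, chunk, relative, tail)
```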
Practical Application: Updating Records
Read to List
Read all records into a temporary in-memory list.
This allows easy modification.
Modify Records
Update the desired record(s) within the list.
Changes are made in memory.
Clear Original
Clear the original file using 'wb' mode or create a new file.
Prepare for rewritten data.
Rewrite All
Write all modified records from the list back to the file.
Ensures persistent updates.
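The four steps can be sketched as one small function. The name update_name, the record layout (a dict with 'roll' and 'name' keys), and the sample data are assumptions made for this example.

```python
import os
import pickle
import tempfile


def update_name(path, roll, new_name):
    """Change the name of the record with the given roll number.

    Returns True if a matching record was found, else False.
    """
    # Step 1: read all records into a list.
    records = []
    with open(path, "rb") as f:
        while True:
            try:
                records.append(pickle.load(f))
            except EOFError:
                break

    # Step 2: modify the matching record(s) in memory.
    updated = False
    for record in records:
        if record["roll"] == roll:
            record["name"] = new_name
            updated = True

    # Steps 3 and 4: 'wb' clears the original file, then every record
    # is written back in order.
    with open(path, "wb") as f:
        for record in records:
            pickle.dump(record, f)
    return updated


# Sample file with two made-up records.
path = os.path.join(tempfile.mkdtemp(), "student.dat")
with open(path, "wb") as f:
    pickle.dump({"roll": 1, "name": "Rohan"}, f)
    pickle.dump({"roll": 2, "name": "Alice"}, f)

found = update_name(path, 2, "Alicia")
print(found)  # True
```

Because the whole file is held in memory, this approach suits small files; for a large file, writing to a new file record by record may be a better fit.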
Conclusion & Best Practices
Efficient Storage
Binary files store complex Python objects efficiently.
pickle is Python's standard module for object serialisation.
Robust Handling
Always use with open(...) for reliable file management.
It ensures proper closure.
Error Management
Handle EOFError when reading multiple records to prevent crashes.
Advanced Navigation
Understand seek() and tell() for advanced navigation.
Be careful with variable object sizes during updates: pickled records differ in length, so overwriting one in place can corrupt the records that follow.
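A quick way to see why object sizes matter: two records with the same keys can pickle to different lengths. The records below are made up for the example.

```python
import pickle

# Same structure, different name lengths.
short_record = pickle.dumps({"roll": 1, "name": "Al"})
long_record = pickle.dumps({"roll": 1, "name": "Alexander"})

# The longer name needs more bytes, so the second record would not fit
# in the space occupied by the first.
print(len(short_record) < len(long_record))  # True
```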