
Read and Write TAR Archive Files Using Python tarfile
The ‘tar’ utility was originally introduced for the UNIX operating system. Its purpose is to collect multiple files into a single archive file, often called a tarball, which makes the files easy to distribute. The functions in the tarfile module of Python’s standard library help in creating tar archives and extracting files from them as required. Archives can be created with gzip, bz2 or lzma compression, or without any compression at all.
The main function defined in this module is open(), which is used both to write to a tar file and to read from one.
open()
This function returns a TarFile object for the file name passed to it as a parameter. It also accepts a parameter called mode, which defaults to ‘r’ and opens the archive for reading with transparent compression detection. The other modes are listed below.
Sr.No. | Mode & action |
---|---|
1 | 'r' or 'r:*' Open for reading with transparent compression. |
2 | 'r:' Open for reading without compression. |
3 | 'r:gz' Open for reading with gzip compression. |
4 | 'r:bz2' Open for reading with bzip2 compression. |
5 | 'r:xz' Open for reading with lzma compression. |
6 | 'x' or 'x:' Create a tarfile exclusively without compression. |
7 | 'x:gz' Create a tarfile with gzip compression. |
8 | 'x:bz2' Create a tarfile with bzip2 compression. |
9 | 'x:xz' Create a tarfile with lzma compression. |
10 | 'a' or 'a:' Open for appending with no compression. |
11 | 'w' or 'w:' Open for uncompressed writing. |
12 | 'w:gz' Open for gzip compressed writing. |
13 | 'w:bz2' Open for bzip2 compressed writing. |
14 | 'w:xz' Open for lzma compressed writing. |
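As a rough illustration of how these modes behave, the sketch below writes a gzip-compressed archive, reads it back with the default mode, and then shows that an exclusive-creation mode refuses to overwrite an existing file. The file names and the temporary directory are assumptions made only for this example.

```python
import os
import tarfile
import tempfile

# Work in a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "notes.txt")  # hypothetical sample file
    with open(source, "w") as f:
        f.write("hello tar\n")

    archive = os.path.join(workdir, "notes.tar.gz")

    # 'w:gz' creates (or overwrites) a gzip-compressed archive.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname="notes.txt")

    # The default mode 'r' (same as 'r:*') detects the compression itself.
    with tarfile.open(archive) as tar:
        names = tar.getnames()

    # The 'x' modes fail if the target file already exists.
    try:
        tarfile.open(archive, "x:gz")
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

print(names)
print(exclusive_failed)
```

Opening with the default mode is usually the simplest choice for reading, since the caller does not need to know which compression was used.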
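The module defines the TarFile class. Instead of calling the open() function, a TarFile object can be instantiated directly by calling its constructor.
TarFile()
This constructor also takes a file name and a mode parameter. Here, however, mode must be one of 'r', 'a', 'w' or 'x'. The compression suffixes listed above are handled only by the open() function.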
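A minimal sketch of using the constructor directly is shown below. It assumes an uncompressed archive and a made-up sample file, and it also demonstrates that a compression suffix in the mode raises a ValueError when passed to the constructor.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "zen.txt")  # hypothetical sample file
    with open(source, "w") as f:
        f.write("Simple is better than complex.\n")

    archive = os.path.join(workdir, "plain.tar")

    # The constructor handles plain (uncompressed) archives.
    tar = tarfile.TarFile(archive, mode="w")
    tar.add(source, arcname="zen.txt")
    tar.close()

    # A compression suffix is not a valid constructor mode.
    try:
        tarfile.TarFile(archive, mode="w:gz")
        rejected = False
    except ValueError:
        rejected = True

    # Read the archive back, again through the constructor.
    with tarfile.TarFile(archive, mode="r") as tar:
        members = tar.getnames()

print(members)
print(rejected)
```

In most programs tarfile.open() is the more convenient entry point, because it supports compression and mode detection.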
Other methods of this class are as follows:
add()
This method adds a file to the archive. It takes a name, which can refer to any type of file: a regular file, a directory, a symbolic link and so on. Directories are added recursively by default. To prevent recursive addition, set the recursive parameter to False.
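The effect of the recursive parameter can be sketched as follows. The directory layout is invented for the example, and the arcname parameter (which sets the name stored inside the archive) is used here only to keep temporary paths out of the archive.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical layout: a "docs" directory holding one file.
    docs = os.path.join(workdir, "docs")
    os.mkdir(docs)
    with open(os.path.join(docs, "a.txt"), "w") as f:
        f.write("a\n")

    # Default behaviour: the directory and its contents are added.
    recursive_archive = os.path.join(workdir, "recursive.tar")
    with tarfile.open(recursive_archive, "w") as tar:
        tar.add(docs, arcname="docs")
    with tarfile.open(recursive_archive) as tar:
        recursive_names = sorted(tar.getnames())

    # recursive=False: only the directory entry itself is added.
    shallow_archive = os.path.join(workdir, "shallow.tar")
    with tarfile.open(shallow_archive, "w") as tar:
        tar.add(docs, arcname="docs", recursive=False)
    with tarfile.open(shallow_archive) as tar:
        shallow_names = tar.getnames()

print(recursive_names)
print(shallow_names)
```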
addfile()
This method adds a TarInfo object to the archive. If a binary file object is also supplied, the member's data is read from it.
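One common use of addfile() is to store data that exists only in memory. The sketch below assumes a small bytes payload and an in-memory archive; the member name is made up for the example.

```python
import io
import tarfile

payload = b"Beautiful is better than ugly.\n"

# Describe the member: its name and the number of bytes to store.
info = tarfile.TarInfo(name="zen.txt")
info.size = len(payload)

# Write the archive into an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    tar.addfile(info, io.BytesIO(payload))

# Rewind and read the member back to confirm the round trip.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    extracted = tar.extractfile("zen.txt").read()

print(extracted == payload)
```

Setting info.size matters here: addfile() reads exactly that many bytes from the file object.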
extractall()
This method extracts all members of the archive into the current working directory unless another path is explicitly provided.
extract()
This method extracts the specified member to the given path. The default is the current working directory.
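The difference between the two extraction methods can be sketched as below. The file and directory names are assumptions for the example. Newer Python releases also accept a filter argument that restricts what an archive may do during extraction, so the sketch passes filter="data" only when the running interpreter supports it.

```python
import os
import tarfile
import tempfile

# Use the "data" extraction filter when this Python version provides it.
extract_kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

with tempfile.TemporaryDirectory() as workdir:
    # Build a small archive with two hypothetical members.
    for name in ("a.txt", "b.txt"):
        with open(os.path.join(workdir, name), "w") as f:
            f.write(name + "\n")
    archive = os.path.join(workdir, "pair.tar")
    with tarfile.open(archive, "w") as tar:
        for name in ("a.txt", "b.txt"):
            tar.add(os.path.join(workdir, name), arcname=name)

    out_one = os.path.join(workdir, "one")
    out_all = os.path.join(workdir, "all")
    os.mkdir(out_one)
    os.mkdir(out_all)

    with tarfile.open(archive) as tar:
        # extract() pulls out a single named member.
        tar.extract("a.txt", path=out_one, **extract_kwargs)
        # extractall() pulls out every member.
        tar.extractall(path=out_all, **extract_kwargs)

    one_files = sorted(os.listdir(out_one))
    all_files = sorted(os.listdir(out_all))

print(one_files)
print(all_files)
```

Archives from untrusted sources can contain members with unexpected paths, so extracting into a dedicated directory is a sensible habit.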
The following example opens a tar file for writing with gzip compression and adds a file to it.
>>> import tarfile
>>> fp = tarfile.open("zen.tar.gz", "w:gz")
>>> fp.add("zen.txt")
>>> fp.close()
Assuming that the ‘zen.txt’ file is present in the current working directory, it will be added to the ‘zen.tar.gz’ archive.
The following code opens the tar archive and extracts all of its files (in this case there is only one) into the current folder. To verify the result, you may delete or rename ‘zen.txt’ in the current folder beforehand.
>>> import tarfile
>>> fp = tarfile.open("zen.tar.gz", "r:gz")
>>> fp.extractall()
>>> fp.close()
You will find that the ‘zen.txt’ file appears in the current directory.
To create a tar archive consisting of the files in the current directory, use the following code. Note that the pattern '*.*' matches only names that contain a dot.
>>> import tarfile, glob
>>> fp = tarfile.open('file.tar', 'w')
>>> for file in glob.glob('*.*'):
...     fp.add(file)
...
>>> fp.close()
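A variation on this idea, shown here only as a sketch, uses a with block so that the archive is closed automatically and includes files whose names have no dot. The helper name archive_files and the sample file names are hypothetical.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_files(src_dir, archive_path):
    """Add every regular file directly inside src_dir to a new tar archive.

    Hypothetical helper for illustration; returns the stored member names.
    """
    src_dir = Path(src_dir)
    archive_path = Path(archive_path)
    with tarfile.open(str(archive_path), "w") as tar:
        for entry in sorted(src_dir.iterdir()):
            # Skip subdirectories and the archive that is being written.
            if entry.is_file() and entry.resolve() != archive_path.resolve():
                tar.add(str(entry), arcname=entry.name)
    with tarfile.open(str(archive_path)) as tar:
        return tar.getnames()


with tempfile.TemporaryDirectory() as workdir:
    # Sample files, one of them without a dot in its name.
    for name in ("README", "zen.txt"):
        Path(workdir, name).write_text("sample\n")
    names = archive_files(workdir, Path(workdir, "file.tar"))

print(sorted(names))
```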
Command line interface
Tar files can also be created and extracted through the module's command line interface. For example, the ‘lines.txt’ file is added to a tar file by executing the following command in a command window.
C:\python36>python -m tarfile -c line.tar lines.txt
The following command line options can be used.
Option | Description |
---|---|
-l or --list | List files in a tarfile. |
-c or --create | Create tarfile from source files. |
-e or --extract | Extract tarfile into the current directory if output_dir is not specified. |
-t or --test | Test whether the tarfile is valid or not. |
-v or --verbose | Verbose output. |
The following command extracts line.tar into the newdir folder under the current directory.
C:\python36>python -m tarfile -e line.tar newdir/
The following command lists all files in the tar archive.
C:\python36>python -m tarfile -l files.tar
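The same commands can also be driven from a script. The sketch below is one possible way to do this, assuming the standard subprocess module and a temporary working directory; it runs the create and list options shown above and collects the listing.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Sample input file, mirroring the lines.txt used in the commands above.
    with open(os.path.join(workdir, "lines.txt"), "w") as f:
        f.write("first line\n")

    # Equivalent of: python -m tarfile -c line.tar lines.txt
    subprocess.run(
        [sys.executable, "-m", "tarfile", "-c", "line.tar", "lines.txt"],
        cwd=workdir,
        check=True,
    )

    # Equivalent of: python -m tarfile -l line.tar
    result = subprocess.run(
        [sys.executable, "-m", "tarfile", "-l", "line.tar"],
        cwd=workdir,
        check=True,
        capture_output=True,
        text=True,
    )
    listing = result.stdout.split()

print(listing)
```

Using sys.executable keeps the child process on the same interpreter as the calling script.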
This article explained the classes and functions defined in the tarfile module.