
Scan Through a Directory Recursively in Python
A directory is a collection of files, subdirectories, or both. A directory hierarchy is formed by organizing all the files and subdirectories within a main directory, also known as the "root" directory. In a path, the levels of this hierarchy are separated by a separator character, which is "/" on Unix-like systems.
Since a directory is organized as a hierarchy (a tree), scanning it can be likened to traversing a tree. There are several ways to do that:
Using os.walk() method
Using glob.glob() method
Using os.listdir() method
Directories are managed by the operating system; therefore, Python code that needs to inspect a directory usually goes through the os module, which exposes the operating system's file and directory services.
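The os.listdir() approach appears in the list above but returns only the entries of a single directory, so any recursion has to be written by hand. The following is a minimal sketch of one way to do that, not a definitive implementation; the helper name list_files and the throwaway directory tree are assumptions made for illustration.

```python
import os
import tempfile


def list_files(top):
    """Return the paths of all non-directory entries under top, depth-first."""
    found = []
    for name in sorted(os.listdir(top)):
        full = os.path.join(top, name)
        # Recurse into real directories only; a symlink to a directory is
        # recorded as an entry but not followed, which avoids cycles.
        if os.path.isdir(full) and not os.path.islink(full):
            found.extend(list_files(full))
        else:
            found.append(full)
    return found


# Build a small throwaway tree so the sketch does not depend on the
# contents of the current directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        open(os.path.join(tmp, rel), "w").close()

    listed = [os.path.relpath(p, tmp) for p in list_files(tmp)]

print(listed)
```

Compared with os.walk(), this version needs an explicit os.path.isdir() check on every entry, which is the main reason os.walk() is usually the more convenient choice.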
Using os.walk() Method
The os.walk() function generates the file names in a directory tree by walking it either top-down or bottom-up. For each directory in the tree rooted at the directory top (including top itself), it yields a three-tuple: (path, names, filenames).
The path is a string that represents the path to the directory. The names variable contains a list of the names of the subdirectories in path, excluding the special entries '.' and '..'. The filenames variable contains a list of the names of the non-directory files in path.
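To make the difference between the top-down and bottom-up order concrete, here is a small sketch that walks the same tree both ways using the topdown parameter of os.walk(). The temporary directory and the names pkg, data and mod.py are assumptions chosen only for the demonstration.

```python
import os
import tempfile

# Create a throwaway tree: <tmp>/pkg/data plus one file (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "pkg", "data"))
    open(os.path.join(tmp, "pkg", "mod.py"), "w").close()

    # Default order: a directory is reported before its subdirectories.
    top_down = [os.path.relpath(p, tmp) for p, _, _ in os.walk(tmp)]

    # topdown=False: a directory is reported after its subdirectories.
    bottom_up = [os.path.relpath(p, tmp)
                 for p, _, _ in os.walk(tmp, topdown=False)]

print(top_down)
print(bottom_up)
```

Bottom-up order is the one to reach for when directories are modified or removed during the walk, since every child has been visited before its parent is reported.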
Example
Here, let us use the os.walk() method to display all the files and subdirectories present in the current directory and in every directory below it.
import os

path = "."
for root, d_names, f_names in os.walk(path):
    print(root, d_names, f_names)
Output
Running the program above produces the following result:
. [] ['main.py']
Example
We can also build the full path of each file. For that, we use the os.path.join() method, which joins the directory path and the file name into a single path. The resulting paths can be collected in a list using the append() method, as shown below.
import os

path = "./TEST"
fname = []
for root, d_names, f_names in os.walk(path):
    for f in f_names:
        fname.append(os.path.join(root, f))

print("fname = %s" % fname)
Output
The output for the program above is given as follows:
fname = []
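An empty list such as the one above is what os.walk() produces when it finds no files under the starting directory. It is also what it produces when the starting directory does not exist, because errors raised while listing a directory are ignored by default. The sketch below uses the onerror parameter of os.walk() to make such errors visible; the callback name remember and the deliberately missing TEST directory are assumptions for illustration.

```python
import os
import tempfile

errors = []


def remember(err):
    # os.walk() calls this with the OSError it would otherwise ignore.
    errors.append(err)


with tempfile.TemporaryDirectory() as tmp:
    # This path is never created, so it is guaranteed not to exist.
    missing = os.path.join(tmp, "TEST")

    paths = []
    for root, d_names, f_names in os.walk(missing, onerror=remember):
        for f in f_names:
            paths.append(os.path.join(root, f))

print("paths = %s" % paths)
print("errors seen:", len(errors))
```

Checking the collected errors after the walk is one way to tell an empty directory apart from a missing or unreadable one.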
Example
Using the os.walk() method, we can also choose which element of the yielded tuple to print. Let us look at an example program below.
import os

for dirpath, dirs, files in os.walk("."):
    print(dirpath)  # path of each directory visited, starting with "."

for dirpath, dirs, files in os.walk("."):
    print(dirs)  # names of the subdirectories in each directory visited

for dirpath, dirs, files in os.walk("."):
    print(files)  # names of the non-directory files in each directory visited
Output
The output for the program above is as follows:
.
[]
['main.py']
Using glob.glob() Method
The glob module is used to get the pathnames that match a specific pattern in a specific directory. The glob.glob() method returns a list of all the pathnames that match the pattern passed as its argument.
If the pattern is "*" (asterisk), it matches zero or more characters in a name; hence, it returns every file and subdirectory directly inside the directory, except that glob.glob() skips names that begin with a dot. On its own, "*" does not descend into subdirectories.
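For a recursive scan with glob.glob(), the pattern "**" can be combined with recursive=True, in which case "**" matches zero or more directory levels. The sketch below contrasts a flat "*" match with a recursive one; the temporary tree and its file names are assumptions made for the demonstration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Throwaway tree with hypothetical names:
    #   main.py, docs/index.txt, docs/api/ref.txt
    os.makedirs(os.path.join(tmp, "docs", "api"))
    for rel in ("main.py",
                os.path.join("docs", "index.txt"),
                os.path.join("docs", "api", "ref.txt")):
        open(os.path.join(tmp, rel), "w").close()

    # "*" matches only the entries directly inside tmp.
    flat = glob.glob(os.path.join(tmp, "*"))

    # "**" with recursive=True matches any number of directory levels.
    deep = glob.glob(os.path.join(tmp, "**", "*.txt"), recursive=True)

    flat_names = sorted(os.path.relpath(p, tmp) for p in flat)
    deep_names = sorted(os.path.relpath(p, tmp) for p in deep)

print(flat_names)
print(deep_names)
```

The results are sorted here because glob.glob() does not guarantee any particular order.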
Example
Let us try to print the names of all the files and subdirectories present in the current directory. This example uses the glob() method of pathlib.Path, which accepts the same kind of pattern as glob.glob(). The example is shown below.
from pathlib import Path

root_directory = Path('.')
for f in root_directory.glob("*"):
    print(f)
Output
main.py
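The pattern "*" above lists only the top level of the directory. For a recursive listing with pathlib, Path.rglob() can be used instead; it behaves like glob() with "**/" placed in front of the pattern. The following is a minimal sketch on a temporary tree, where the names sub, inner and notes.txt are assumptions for illustration.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Throwaway tree with hypothetical names.
    (root / "sub" / "inner").mkdir(parents=True)
    (root / "main.py").touch()
    (root / "sub" / "inner" / "notes.txt").touch()

    # rglob("*") yields every file and directory below root, at any depth.
    entries = sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))

print(entries)
```

Both files and directories are returned, so a check such as p.is_file() is needed when only files are wanted.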
Conclusion
We have discussed how to scan through a directory recursively in Python using the os.walk() function, and how to list the entries of a directory with a glob pattern.