Merge All CSV Files into a Single DataFrame in Python Pandas



To merge all CSV files, use the GLOB module. The os.path.join() method is used inside the concat() to merge the CSV files together. Some of the common methods we can use to merge multiple CSV Files into a single dataframe are as follows -

  • os.path.join() and glob

  • Merging CSV Files with glob Pattern

  • Merging CSV Files in a Specific Order

Using 'os.path.join()' and glob

First of all, import the required libraries. We have set pd as an alias for the Python Pandas library.

import pandas as pd
import glob
import os

This method constructs the file path by joining directory paths and file name patterns.

Example

In the below example code the glob.glob() returns a list of all file names matching the given pattern ("sales').

import pandas as pd
import glob
import os

# setting the path for joining multiple files
files = os.path.join("C:\Users\amit_\Desktop", "sales*.csv")

# list of merged files returned
files = glob.glob(files)

print("Resultant CSV after joining all CSV files at a particular location...");

# joining files with concat and read_csv
df = pd.concat(map(pd.read_csv, files), ignore_index=True)
print(df)

Sales1.csv

Car place UnitsSold
0 Audi Bangalore 80
1 Porsche Mumbai 110
2 Rolls Royce Pune 100

Sales2.csv

Car place UnitsSold
3 BMW Delhi 95
4 Mercedes Hyderabad 80
5 Lamborgini Chandigarh 80

Sales3.csv

Car place UnitsSold
6 Volvo Rajasthan 150
7 Hyundai Manipur 120
8 Toyota HP 70

Output

Resultant CSV after joining all CSV files at a particular location.

Car place UnitsSold
0 Audi Bangalore 80
1 Porsche Mumbai 110
2 Rolls Royce Pune 100
3 BMW Delhi 95
4 Mercedes Hyderabad 80
5 Lamborgini Chandigarh 80
6 Volvo Rajasthan 150
7 Hyundai Manipur 120
8 Toyota HP 70

Merging CSV Files with glob Pattern

When we are sure about the file location, you can directly use the file path without using os.path.join(), the path is hardcoded into the glob method.

Example

In the below example, the list comprehension [pd.read_csv(f) for f in files] reads each CSV file before passing it to pd.concat() for merging.

import pandas as pd
import glob

# Fetch all CSV files directly from the path
files = glob.glob("C:\Users\amit_\Desktop\sales*.csv")

# Concatenate them into a single DataFrame
df = pd.concat([pd.read_csv(f) for f in files], ignore_index=True)

# Display the final DataFrame
print(df)

Output

Car place UnitsSold
0 Audi Bangalore 80
1 Porsche Mumbai 110
2 Rolls Royce Pune 100
3 BMW Delhi 95
4 Mercedes Hyderabad 80
5 Lamborgini Chandigarh 80
6 Volvo Rajasthan 150
7 Hyundai Manipur 120
8 Toyota HP 70

Merging CSV Files in a Specific Order

If we want to merge the files in a specific order, we can sort the list of file names before concatenation. In the following example file.sort() function sorts the files in alphabetical or numerical order depending on naming.

import pandas as pd
import glob
import os

# Fetch all CSV files matching the pattern
files = glob.glob("C:\Users\amit_\Desktop\sales*.csv")

# Sort files alphabetically or numerically (depending on naming)
files.sort()

# Concatenate them into a DataFrame
df = pd.concat(map(pd.read_csv, files), ignore_index=True)

# Display the final DataFrame
print(df)

Output

Car place UnitsSold
0 Audi Bangalore 80
1 Porsche Mumbai 110
2 Rolls Royce Pune 100
3 BMW Delhi 95
4 Mercedes Hyderabad 80
5 Lamborgini Chandigarh 80
6 Volvo Rajasthan 150
7 Hyundai Manipur 120
8 Toyota HP 70
Updated on: 2024-10-09T12:20:54+05:30

16K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements