Merge Multiple Dataframes - Pandas

Last Updated : 22 Apr, 2025

Merging allows us to combine data from two or more DataFrames into one, based on shared column values or index labels. This is useful when we want to bring together related information from different sources. In Pandas there are different ways to combine DataFrames:

1. Merging DataFrames Using merge()

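We use merge() when we want to join two DataFrames on one or more common columns. It works like SQL joins and supports inner, left, right and outer joins through the how parameter. It's the most common method when the DataFrames share key columns. Since merge() takes two DataFrames at a time, we chain calls to merge more than two.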
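To make the join types concrete, here is a minimal sketch that merges the same two small DataFrames with each value of how and records which id values survive. The names left, right and results are illustrative only.

```python
import pandas as pd

left = pd.DataFrame({'id': [1, 2, 3], 'name': ['Alice', 'Bob', 'Charlie']})
right = pd.DataFrame({'id': [3, 4, 5], 'age': [25, 30, 35]})

# Merge on the shared 'id' column once per join type
results = {
    how: pd.merge(left, right, on='id', how=how)
    for how in ['inner', 'left', 'right', 'outer']
}

# Show which ids each join type keeps
for how, merged in results.items():
    print(how, sorted(merged['id'].tolist()))
```

An inner join keeps only the ids found in both DataFrames, left and right keep every id from one side, and outer keeps the ids from both sides, filling the missing values with NaN.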

Python
import pandas as pd

df1 = pd.DataFrame({'id': [1, 2, 3], 'name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'id': [3, 4, 5], 'age': [25, 30, 35]})
df3 = pd.DataFrame({'id': [5, 6, 7], 'city': ['New York', 'Los Angeles', 'Chicago']})

res = pd.merge(pd.merge(df1, df2, on='id', how='outer'), df3, on='id', how='outer')
print(res)

Output:

[Image: Merge Multiple Dataframes]
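Nesting merge() calls gets hard to read as the number of DataFrames grows. One possible alternative, sketched here with the same data, is to fold a list of DataFrames with functools.reduce from the standard library. The list name frames is an assumption made for this example.

```python
from functools import reduce

import pandas as pd

df1 = pd.DataFrame({'id': [1, 2, 3], 'name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'id': [3, 4, 5], 'age': [25, 30, 35]})
df3 = pd.DataFrame({'id': [5, 6, 7], 'city': ['New York', 'Los Angeles', 'Chicago']})

frames = [df1, df2, df3]

# reduce applies the merge pairwise: merge(merge(df1, df2), df3)
res = reduce(
    lambda left, right: pd.merge(left, right, on='id', how='outer'),
    frames,
)
print(res)
```

This should produce the same rows and columns as the nested call, and adding another DataFrame only means appending it to the list.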

2. Concatenating Multiple DataFrames Using concat()

We use concat() when we want to simply stack DataFrames together, either by adding rows (one below the other) or columns (side by side). It doesn't match rows on key columns; it places the DataFrames along the chosen axis and aligns the labels on the other axis. In concat() we can specify two kinds of join: inner or outer. Its main parameters are:

  • axis=0: Concatenation happens along the rows, so the DataFrames are stacked vertically. With axis=1 they are concatenated along the columns.
  • join='outer': Uses an outer join, which includes all columns from every DataFrame. If a column is missing from a DataFrame, its values are filled with NaN. With join='inner' only the columns common to all DataFrames are kept.
  • ignore_index=True: Resets the index after concatenation, creating a new default integer index instead of keeping the original indices of the input DataFrames.
Python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'C': [7, 8]})
df3 = pd.DataFrame({'A': [9, 10], 'D': [11, 12]})

res = pd.concat([df1, df2, df3], axis=0, join='outer', ignore_index=True)
print(res)

Output:

[Image: Concatenating Multiple Dataframes]
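The example above stacks rows with axis=0. As a complementary sketch, the snippet below uses axis=1 to place DataFrames side by side and contrasts join='outer' with join='inner'. The index labels 'x', 'y' and 'z' are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['x', 'y'])
df2 = pd.DataFrame({'C': [5, 6], 'D': [7, 8]}, index=['y', 'z'])

# axis=1 puts the frames side by side, aligning rows on the index labels;
# join='outer' keeps every label and fills the gaps with NaN
side_by_side = pd.concat([df1, df2], axis=1, join='outer')

# join='inner' keeps only labels present in every frame (here just 'y')
common_rows = pd.concat([df1, df2], axis=1, join='inner')

print(side_by_side)
print(common_rows)
```

With axis=1 the join parameter applies to the row labels rather than the column names, so the inner result contains a single row.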

3. Joining Multiple DataFrames Using join()

join() combines DataFrames based on their indices. It places two or more DataFrames side by side, and by default performs a left join on the index of the calling DataFrame.

Python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['a', 'b', 'c'])
df2 = pd.DataFrame({'C': [7, 8, 9], 'D': [10, 11, 12]}, index=['a', 'b', 'c'])
df3 = pd.DataFrame({'E': [13, 14, 15], 'F': [16, 17, 18]}, index=['a', 'b', 'c'])

res = df1.join([df2, df3])
print(res)

Output:

[Image: Join Multiple Dataframes]

Here each DataFrame has the same index labels. Using the join() method, we join the three DataFrames side by side on those labels.
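When the indices do not match exactly, the how argument of join() decides which labels are kept. The following is a small sketch with made-up data, assuming every index is unique.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3]}, index=['a', 'b', 'c'])
df2 = pd.DataFrame({'B': [4, 5]}, index=['a', 'b'])
df3 = pd.DataFrame({'C': [6, 7]}, index=['b', 'd'])

# Default how='left': keep the labels of df1 (a, b, c)
left_join = df1.join([df2, df3])

# how='inner': keep only labels found in all three frames (b)
inner_join = df1.join([df2, df3], how='inner')

# how='outer': keep every label seen in any frame (a, b, c, d)
outer_join = df1.join([df2, df3], how='outer')

print(left_join)
print(inner_join)
print(outer_join)
```

Labels that are missing from one of the DataFrames get NaN in that DataFrame's columns.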

Combining Multiple DataFrames with join(), concat() and merge() in Pandas

Example: We have four DataFrames that contain flower details. We will use merge(), concat() and join() to combine them and compare the results.

Python
import pandas as pd

df1 = pd.DataFrame({'Species': ['Rose', 'Tulip', 'Lily'], 'Color': ['Red', 'Yellow', 'White']})
df2 = pd.DataFrame({'Species': ['Rose', 'Tulip', 'Orchid'], 'Price': [2.5, 3.0, 5.0]})
df3 = pd.DataFrame({'Species': ['Rose', 'Tulip', 'Jasmine'], 'Scent': ['Fragrant', 'Mild', 'Intense']})
df4 = pd.DataFrame({'Species': ['Rose', 'Lily', 'Orchid', 'Jasmine'], 'Region': ['Europe', 'Asia', 'Tropics', 'Asia']})

# 1. Using merge(): Merge all DataFrames on 'Species' column
merged_df = pd.merge(df1, df2, on='Species', how='inner')
merged_df = pd.merge(merged_df, df3, on='Species', how='inner')
merged_df = pd.merge(merged_df, df4, on='Species', how='inner')

print("Merge Example:")
print(merged_df.to_string(index=False)) 

# 2. Using join(): Join all DataFrames on 'Species' column (first set as index)
df1.set_index('Species', inplace=True)
df2.set_index('Species', inplace=True)
df3.set_index('Species', inplace=True)
df4.set_index('Species', inplace=True)

joined_df = df1.join(df2, how='inner').join(df3, how='inner').join(df4, how='inner')

print("\nJoin Example:")
print(joined_df.to_string())

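# 3. Using concat(): Concatenate all DataFrames horizontally (along columns).
# After reset_index() the rows are aligned by position, not by 'Species',
# so the result contains one 'Species' column per DataFrame.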
concat_df = pd.concat([df1.reset_index(), df2.reset_index(), df3.reset_index(), df4.reset_index()], axis=1)

print("\nConcat Example:")
print(concat_df.to_string(index=False))

Output:

[Image: Combining Multiple DataFrames with join(), concat() and merge() in Pandas]

We can see that each method combines the DataFrames with different logic: merge() and join() match rows on 'Species', while concat() here aligns rows by position. We can choose the method according to our need.
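If we want concat() to match rows by species rather than by position, one option is to keep 'Species' as the index and use join='inner'. This is a sketch using the same flower data; the names frames and aligned are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Species': ['Rose', 'Tulip', 'Lily'], 'Color': ['Red', 'Yellow', 'White']})
df2 = pd.DataFrame({'Species': ['Rose', 'Tulip', 'Orchid'], 'Price': [2.5, 3.0, 5.0]})
df3 = pd.DataFrame({'Species': ['Rose', 'Tulip', 'Jasmine'], 'Scent': ['Fragrant', 'Mild', 'Intense']})
df4 = pd.DataFrame({'Species': ['Rose', 'Lily', 'Orchid', 'Jasmine'], 'Region': ['Europe', 'Asia', 'Tropics', 'Asia']})

# Use 'Species' as the index so that concat aligns rows on it
frames = [df.set_index('Species') for df in (df1, df2, df3, df4)]

# join='inner' keeps only the species present in all four frames
aligned = pd.concat(frames, axis=1, join='inner')
print(aligned)
```

Only 'Rose' appears in all four DataFrames, so the result has a single row, matching what the inner merge and inner join keep.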

