How to Merge Two Pandas DataFrames on Index
Last Updated: 12 Nov, 2024
Merging two pandas DataFrames on their index is necessary when working with datasets that share the same row identifiers but have different columns. The core idea is to align the rows of both DataFrames based on their indices, combining the respective columns into one unified DataFrame.
To merge two pandas DataFrames on their index, you can use the merge() function with the left_index and right_index parameters set to True. Alternatively, you can use the join() or concat() functions, which also support merging on index.
1. Merging Two Pandas DataFrames on Index using merge()
The merge() function combines two DataFrames on their indexes. If the two DataFrames share index labels, set left_index=True and right_index=True in pd.merge() to merge them.
- Types of Joins: Pandas supports different join types for merging data, such as inner, outer, left, and right joins. By default, pd.merge() performs an inner join, meaning it keeps only the rows that have matching indices in both DataFrames. You can specify the join type with the how parameter.
- Aligning Data: When merging on the index, each row in the resulting DataFrame aligns based on the index value rather than a specific column, making this approach ideal when your row labels are the critical point of reference.
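The join types described above can be sketched as follows. This is a minimal illustration, not the article's own example: the frames left and right and their values are hypothetical, chosen so that some index labels match and some do not.

```python
import pandas as pd

# Hypothetical sample frames (names and values are illustrative only)
left = pd.DataFrame({'name': ['manoj', 'manoja', 'manoji', 'manij']},
                    index=['one', 'two', 'three', 'four'])
right = pd.DataFrame({'marks': [98, 90, 78, 86]},
                     index=['one', 'two', 'three', 'six'])

# Inner join (the default): only index labels present in both frames
inner = pd.merge(left, right, left_index=True, right_index=True)

# Outer join: every index label from either frame, NaN where one side is missing
outer = pd.merge(left, right, left_index=True, right_index=True, how='outer')

# Left join: every index label from the left frame, NaN where right has no match
left_join = pd.merge(left, right, left_index=True, right_index=True, how='left')

print(inner)
print(outer)
print(left_join)
```

Here the inner result has three rows, the outer result has five, and the left result keeps all four rows of left with a missing marks value for the label 'four'.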
Python
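# import pandas module
import pandas as pd
# data1 and data2 are the student and marks DataFrames
# created in the join() example in the next section
# join two dataframes with merge (inner join on index by default)
print(pd.merge(data1, data2, left_index=True, right_index=True))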
Output:
Figure: Merge Two Pandas DataFrames on Index using merge()
2. Merging Two Pandas DataFrames on Index using join()
By default, the join() method performs a left join. In this case, all rows from the left DataFrame (data1) are kept, and matching rows from the right DataFrame (data2) are added. Rows of the left DataFrame with no match in the right one get NaN in the added columns.
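As a hedged sketch of how join() behaves beyond its default, the snippet below uses two hypothetical frames (students and marks, not the article's data1 and data2) that deliberately share an 'id' column. It assumes you want to keep both 'id' columns, so it passes lsuffix and rsuffix; without them, join() raises a ValueError for overlapping column names.

```python
import pandas as pd

# Hypothetical sample frames; both carry an 'id' column on purpose
students = pd.DataFrame({'id': [1, 2, 3, 4],
                         'name': ['manoj', 'manoja', 'manoji', 'manij']},
                        index=['one', 'two', 'three', 'four'])
marks = pd.DataFrame({'id': [1, 2, 3, 6],
                      'marks': [98, 90, 78, 86]},
                     index=['one', 'two', 'three', 'six'])

# Default left join; suffixes disambiguate the overlapping 'id' columns
left_joined = students.join(marks, lsuffix='_student', rsuffix='_marks')

# The how parameter switches the join type, as it does with merge()
inner_joined = students.join(marks, how='inner',
                             lsuffix='_student', rsuffix='_marks')
outer_joined = students.join(marks, how='outer',
                             lsuffix='_student', rsuffix='_marks')

print(left_joined)
print(inner_joined)
print(outer_joined)
```

The left join keeps all four student rows, the inner join keeps only the three shared labels, and the outer join keeps all five labels found in either frame.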
Python
# import pandas module
import pandas as pd
# create student dataframe
data1 = pd.DataFrame({'id': [1, 2, 3, 4],
                      'name': ['manoj', 'manoja', 'manoji', 'manij']},
                     index=['one', 'two', 'three', 'four'])
# create marks dataframe
data2 = pd.DataFrame({'s_id': [1, 2, 3, 6, 7],
                      'marks': [98, 90, 78, 86, 78]},
                     index=['one', 'two', 'three', 'six', 'seven'])
# join two dataframes
print(data1.join(data2))
Output:
Figure: Merge two Pandas DataFrames on Index using join()
3. Merging Two Pandas DataFrames on Index using concat()
By default, concat() performs an outer join (join='outer'). Setting axis=1 places the DataFrames side by side and aligns their rows on the index, so the result includes all rows from both DataFrames, with NaN filling the positions where one DataFrame has no matching index label.
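The difference between the default outer behaviour and an inner join in concat() can be sketched like this. The frames students and marks are hypothetical stand-ins, and the join parameter accepts only 'outer' or 'inner'.

```python
import pandas as pd

# Hypothetical sample frames with partly overlapping index labels
students = pd.DataFrame({'name': ['manoj', 'manoja', 'manoji', 'manij']},
                        index=['one', 'two', 'three', 'four'])
marks = pd.DataFrame({'marks': [98, 90, 78, 86]},
                     index=['one', 'two', 'three', 'six'])

# axis=1 aligns on the index; join='outer' is the default
outer = pd.concat([students, marks], axis=1)

# join='inner' keeps only the index labels present in every frame
inner = pd.concat([students, marks], axis=1, join='inner')

print(outer)
print(inner)
```

Unlike merge() and join(), concat() has no left or right join option, so it fits best when you want either the union or the intersection of the index labels.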
Python
# import pandas module
import pandas as pd
# join two dataframes with concat
print(pd.concat([data1, data2], axis=1))
Output:
Figure: Merging Two Pandas DataFrames on Index using concat()
Key Takeaways:
- Merging on Index: Use merge(), join(), or concat() to merge two DataFrames based on their row indices.
- Join Types: You can specify different join types (inner, outer, left, or right) depending on whether you want to keep only matching rows or include all rows from one or both DataFrames.
- Flexibility: The merge() function is more flexible than join() and concat(), allowing for more complex merging operations like merging on both columns and indices simultaneously.
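To illustrate the last point, here is a minimal sketch of merging a column of one DataFrame against the index of another. The frames and the column name s_id are hypothetical; the example assumes the key lives in a column on the left side and in the index on the right side.

```python
import pandas as pd

# Hypothetical frames: 'students' keeps the key in a column,
# 'marks' keeps the same key in its index
students = pd.DataFrame({'s_id': [1, 2, 3, 4],
                         'name': ['manoj', 'manoja', 'manoji', 'manij']})
marks = pd.DataFrame({'marks': [98, 90, 78]}, index=[1, 2, 3])

# Match the left frame's 's_id' column against the right frame's index
combined = pd.merge(students, marks,
                    left_on='s_id', right_index=True, how='left')

print(combined)
```

With how='left', the student whose s_id is 4 is kept and receives NaN for marks, because that key does not appear in the index of marks.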
Recommended Article: Pandas Merging, Joining, and Concatenating