Reverting from Multi-Index to Single Index DataFrame in Pandas
Let’s learn how to revert a multi-index DataFrame to a single-index DataFrame in Pandas using the reset_index() and droplevel() methods.
Setting Up a Multi-Index DataFrame in Pandas
To begin, let’s import Pandas and load the data (data.csv):
# Importing Pandas library
import pandas as pd
# Load the dataset into a DataFrame
df = pd.read_csv('data.csv')
# Displaying the first few rows of the dataset
df.head()
Output:
At this point, the DataFrame has only the default integer index (a RangeIndex). We can create a multi-index with the set_index() method, using the ‘region’, ‘state’, and ‘individuals’ columns as index levels:
# Setting multi-index with 'region', 'state', and 'individuals'
df_mi = df.set_index(['region', 'state', 'individuals'])
# Display the DataFrame with multi-index
display(df_mi.head())
Output:
Multi-Index Pandas DataFrame
Now, the DataFrame has a hierarchical index, commonly referred to as a multi-index.
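Because data.csv is not reproduced here, the following is a minimal sketch that checks the same transformation on a small stand-in table. Only the three column names come from the article; the rows and the 'family_members' column are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for data.csv: only the three index column names come
# from the article; the rows and the 'family_members' column are invented.
sample = pd.DataFrame({
    'region': ['East', 'East', 'West', 'West'],
    'state': ['Ohio', 'Iowa', 'Utah', 'Idaho'],
    'individuals': [120, 85, 60, 45],
    'family_members': [30, 20, 15, 10],
})

# Before set_index(), Pandas supplies a default RangeIndex (0, 1, 2, ...)
print(type(sample.index).__name__)

sample_mi = sample.set_index(['region', 'state', 'individuals'])

# After set_index(), the index is a MultiIndex with one level per column
print(type(sample_mi.index).__name__)
print(sample_mi.index.nlevels)
print(list(sample_mi.index.names))

# The columns used as index levels are no longer regular columns
print(list(sample_mi.columns))
```

Inspecting index.nlevels and index.names is a quick way to confirm which levels exist before deciding which ones to reset.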
Reverting from Multi-Index to Single Index using reset_index()
The reset_index() method moves index levels back into regular columns. Its level parameter selects which levels to reset, so you can reset some levels and keep the rest as the index. Passing drop=True discards the selected levels instead of turning them into columns.
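As a rough illustration of the difference between moving a level into the columns and discarding it, the sketch below uses a small two-level index whose values are invented for this example.

```python
import pandas as pd

# Hypothetical two-level index, invented for illustration
idx = pd.MultiIndex.from_tuples(
    [('East', 'Ohio'), ('West', 'Utah')], names=['region', 'state']
)
demo = pd.DataFrame({'value': [120, 60]}, index=idx)

# Default behaviour: the 'region' level moves back into the columns
kept = demo.reset_index(level='region')

# drop=True: the 'region' level is discarded entirely
dropped = demo.reset_index(level='region', drop=True)

print(list(kept.columns))
print(list(dropped.columns))
print(kept.index.name, dropped.index.name)
```

In both cases ‘state’ remains as the single index; the only difference is whether the ‘region’ values survive as a column.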
Method 1: Reverting Using Index Levels
You can specify the index levels to reset by position using the level parameter. In this case, we reset ‘region’ (level 0) and ‘individuals’ (level 2), which become regular columns again, keeping ‘state’ (level 1) as the single index.
# Reverting multi-index by removing level 0 and level 2
df_si_level = df_mi.reset_index(level=[0, 2])
# Display the DataFrame with a single index
display(df_si_level.head())
Output:
DataFrame with Single Index
Method 2: Reverting Using Index Names
Alternatively, you can pass the index level names in a list instead of their positions. Here, we reset the ‘region’ and ‘state’ levels, leaving ‘individuals’ as the index:
# Reverting multi-index by specifying index names
df_si_name = df_mi.reset_index(['region', 'state'])
# Display the DataFrame with a single index
display(df_si_name.head())
Output:
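Level positions and level names are interchangeable ways of addressing the same levels. A minimal sketch, using an invented three-level index with the article's level names, suggests both calls produce the same result:

```python
import pandas as pd

# Hypothetical rows; only the level names come from the article
idx = pd.MultiIndex.from_tuples(
    [('East', 'Ohio', 120), ('West', 'Utah', 60)],
    names=['region', 'state', 'individuals'],
)
demo = pd.DataFrame({'value': [1.5, 2.5]}, index=idx)

# Reset the first two levels by position and by name
by_position = demo.reset_index(level=[0, 1])
by_name = demo.reset_index(level=['region', 'state'])

# Both approaches give identical DataFrames
print(by_position.equals(by_name))

# The one remaining level becomes an ordinary single index
print(by_name.index.name)
```

Names tend to be the more readable choice, since positions silently refer to different levels if the order of the index changes.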
Method 3: Removing All Indexes
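If you want to move every index level back into the columns, you can pass all the index names to the reset_index() method. Calling reset_index() with no arguments has the same effect. Note that the result is not truly index-free: Pandas replaces the multi-index with the default integer index (0, 1, 2, ...).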
# Resetting all index levels; the default integer index takes their place
df_no_index = df_mi.reset_index(['region', 'state', 'individuals'])
# Display the DataFrame with the default integer index
display(df_no_index.head())
Output:
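Listing every level name is not strictly necessary. The sketch below, again on invented rows, compares that call with a bare reset_index() and checks what kind of index the result has:

```python
import pandas as pd

# Hypothetical rows; only the level names come from the article
idx = pd.MultiIndex.from_tuples(
    [('East', 'Ohio', 120), ('West', 'Utah', 60)],
    names=['region', 'state', 'individuals'],
)
demo = pd.DataFrame({'value': [1.5, 2.5]}, index=idx)

# Naming every level explicitly
all_named = demo.reset_index(['region', 'state', 'individuals'])

# Calling reset_index() with no arguments resets every level
no_args = demo.reset_index()

print(all_named.equals(no_args))

# The result still has an index: the default RangeIndex
print(type(no_args.index).__name__)
print(list(no_args.columns))
```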
Convert MultiIndex to Single Index using droplevel() Method
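To discard one of the index levels, use the droplevel() method. Unlike reset_index(), which by default moves a level into the columns, droplevel() removes the level’s values from the DataFrame altogether. In this example, we remove the ‘number’ level while keeping the ‘letter’ level.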
import pandas as pd
arrays = [['A', 'A', 'B', 'B'], ['one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=('letter', 'number'))
df = pd.DataFrame({'value': [10, 20, 30, 40]}, index=index)
print("Sample MultiIndex DataFrame:")
print(df)
# Dropping the second level ('number') and keeping 'letter' as the index
df_dropped = df.droplevel('number')
print("\nDataFrame after dropping the second level:")
print(df_dropped)
Output:
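For comparison, here is a small sketch on the same sample data showing that droplevel() behaves like reset_index() with drop=True, and that the remaining ‘letter’ index can contain repeated labels once ‘number’ is gone. The variable names are chosen for this example only.

```python
import pandas as pd

arrays = [['A', 'A', 'B', 'B'], ['one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=('letter', 'number'))
demo = pd.DataFrame({'value': [10, 20, 30, 40]}, index=index)

# Two ways of discarding the 'number' level
via_droplevel = demo.droplevel('number')
via_reset = demo.reset_index(level='number', drop=True)

print(via_droplevel.equals(via_reset))

# The discarded level does not reappear as a column
print(list(via_droplevel.columns))

# With 'number' gone, the labels 'A' and 'B' each appear twice
print(via_droplevel.index.is_unique)

# Selecting 'A' therefore returns more than one row
print(via_droplevel.loc['A', 'value'].tolist())
```

If later code relies on unique index labels, it may be safer to keep the level as a column with reset_index() rather than dropping it.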