Flattening a MultiIndex in pandas means converting a hierarchical index into a single-level index, which can be accomplished with the reset_index() function. By default, this function moves every level of the MultiIndex into its own column and replaces the index with a default integer index, so each row holds the values from every level. A flat structure like this is often simpler to work with for data analysis and visualization.
You can use the following basic syntax to flatten a MultiIndex in pandas:
#flatten all levels of MultiIndex
df.reset_index(inplace=True)

#flatten specific levels of MultiIndex
df.reset_index(inplace=True, level = ['level_name'])
The following examples show how to use this syntax in practice.
Example 1: Flatten All Levels of MultiIndex in Pandas
Suppose we have the following MultiIndex pandas DataFrame:
import pandas as pd

#create DataFrame
index_names = pd.MultiIndex.from_tuples([('Level1','Lev1', 'L1'),
                                         ('Level2','Lev2', 'L2'),
                                         ('Level3','Lev3', 'L3'),
                                         ('Level4','Lev4', 'L4')],
                                        names=['Full','Partial', 'ID'])

data = {'Store': ['A','B','C','D'],
        'Sales': [12, 44, 29, 35]}

df = pd.DataFrame(data, columns = ['Store','Sales'], index=index_names)

#view DataFrame
df

                  Store  Sales
Full   Partial ID
Level1 Lev1    L1     A     12
Level2 Lev2    L2     B     44
Level3 Lev3    L3     C     29
Level4 Lev4    L4     D     35
We can use the following syntax to flatten every level of the MultiIndex into columns in the DataFrame:
#flatten every level of MultiIndex
df.reset_index(inplace=True)

#view updated DataFrame
df

     Full Partial  ID Store  Sales
0  Level1    Lev1  L1     A     12
1  Level2    Lev2  L2     B     44
2  Level3    Lev3  L3     C     29
3  Level4    Lev4  L4     D     35
Notice that each level of the MultiIndex is now a column in the DataFrame.
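Because inplace=True overwrites df, the original MultiIndex is gone after the call above. As a minimal sketch, assuming you would rather keep the original DataFrame, you can omit inplace and assign the result to a new variable (the name flat_df is only illustrative):

```python
import pandas as pd

# Rebuild the example DataFrame with a three-level MultiIndex
index_names = pd.MultiIndex.from_tuples(
    [('Level1', 'Lev1', 'L1'),
     ('Level2', 'Lev2', 'L2'),
     ('Level3', 'Lev3', 'L3'),
     ('Level4', 'Lev4', 'L4')],
    names=['Full', 'Partial', 'ID'])

df = pd.DataFrame({'Store': ['A', 'B', 'C', 'D'],
                   'Sales': [12, 44, 29, 35]},
                  index=index_names)

# Without inplace=True, reset_index() returns a new DataFrame
# and leaves df untouched (flat_df is a hypothetical name)
flat_df = df.reset_index()

print(list(flat_df.columns))  # ['Full', 'Partial', 'ID', 'Store', 'Sales']
print(df.index.nlevels)       # 3, the original still has its MultiIndex
```

This makes it possible to rerun the later examples on the same df without recreating it.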
Example 2: Flatten Specific Levels of MultiIndex in Pandas
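Suppose we start again from the original MultiIndex DataFrame from the previous example. Since reset_index(inplace=True) modified df, recreate it first with the code from Example 1:

#view DataFrame
df

                  Store  Sales
Full   Partial ID
Level1 Lev1    L1     A     12
Level2 Lev2    L2     B     44
Level3 Lev3    L3     C     29
Level4 Lev4    L4     D     35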
The following code shows how to flatten just one specific level of the MultiIndex:
#flatten 'ID' level only
df.reset_index(inplace=True, level = ['ID'])
#view updated DataFrame
df
               ID Store  Sales
Full   Partial
Level1 Lev1    L1     A     12
Level2 Lev2    L2     B     44
Level3 Lev3    L3     C     29
Level4 Lev4    L4     D     35
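And the following code shows how to flatten several specific levels of the MultiIndex at once. It also starts from the original DataFrame, because after an in-place reset of 'ID' that level no longer exists in the index and naming it again raises a KeyError:
#flatten 'Partial' and 'ID' levels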
df.reset_index(inplace=True, level = ['Partial', 'ID'])
#view updated DataFrame
df
       Partial  ID Store  Sales
Full
Level1    Lev1  L1     A     12
Level2    Lev2  L2     B     44
Level3    Lev3  L3     C     29
Level4    Lev4  L4     D     35
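The two calls in this example can also be run side by side from one DataFrame. The following is a sketch under the assumption that inplace is omitted so each call returns a new object; the variable names id_only and partial_and_id are hypothetical and not part of the original example:

```python
import pandas as pd

# Rebuild the example DataFrame with a three-level MultiIndex
index_names = pd.MultiIndex.from_tuples(
    [('Level1', 'Lev1', 'L1'),
     ('Level2', 'Lev2', 'L2'),
     ('Level3', 'Lev3', 'L3'),
     ('Level4', 'Lev4', 'L4')],
    names=['Full', 'Partial', 'ID'])

df = pd.DataFrame({'Store': ['A', 'B', 'C', 'D'],
                   'Sales': [12, 44, 29, 35]},
                  index=index_names)

# Each call returns a new DataFrame, so both start from the original df
id_only = df.reset_index(level=['ID'])
partial_and_id = df.reset_index(level=['Partial', 'ID'])

# Levels that were not named stay in the index
print(list(id_only.index.names))          # ['Full', 'Partial']
print(list(id_only.columns))              # ['ID', 'Store', 'Sales']
print(list(partial_and_id.index.names))   # ['Full']
print(list(partial_and_id.columns))       # ['Partial', 'ID', 'Store', 'Sales']
```

Checking index.names and columns after each call is a quick way to confirm which levels were moved and which remain in the index.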