How to use as_index in a Pandas DataFrame groupby?

The as_index argument of the pandas groupby() operation controls whether the column you group by becomes the index of the aggregated output. With as_index=False, the grouping column stays a regular column and the output gets a default integer index (0, 1, 2, ...) instead.

The as_index argument can take a value of True or False.

The default value is True.
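
As a quick check of that default, the sketch below compares a groupby() call that omits as_index with one that sets as_index=True explicitly. The toy DataFrame and the variable names are made up purely for illustration.

```python
import pandas as pd

# Hypothetical toy data, used only for this check
toy = pd.DataFrame({'team': ['A', 'A', 'B'],
                    'points': [1, 2, 3]})

# Omitting as_index relies on the default value
default_result = toy.groupby('team').sum()

# Passing as_index=True spells the default out explicitly
explicit_result = toy.groupby('team', as_index=True).sum()

# Both results are identical, and team is the index in each
print(default_result.equals(explicit_result))
print(default_result.index.name)
```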

The following example shows how to use the as_index argument in practice.

Example: How to Use as_index in pandas groupby

Suppose we have the following pandas DataFrame that shows the number of points scored by basketball players on various teams:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'points': [12, 15, 17, 17, 19, 14, 15, 20, 24, 28]})
                            
#view DataFrame
print(df)

  team  points
0    A      12
1    A      15
2    A      17
3    A      17
4    A      19
5    B      14
6    B      15
7    C      20
8    C      24
9    C      28

We can use the following syntax to group the rows by the team column and calculate the sum of the points column, while specifying as_index=True to use team as the index of the output:

#group rows by team and calculate sum of points
print(df.groupby('team', as_index=True).sum())

      points
team        
A         80
B         29
C         72

The output shows the sum of values in the points column, grouped by the values in the team column.

Notice that the team column is used as the index of the output.
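
One practical consequence of having team as the index is that a team's total can be looked up by label. The sketch below rebuilds the same DataFrame and uses .loc for the lookup; the names totals and team_a_points are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'points': [12, 15, 17, 17, 19, 14, 15, 20, 24, 28]})

# team becomes the index of the aggregated output
totals = df.groupby('team', as_index=True).sum()

# the team values are index labels, so .loc can select a row by team name
team_a_points = totals.loc['A', 'points']
print(team_a_points)  # 80
```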

If we instead specify as_index=False then the team column will not be used as the index of the output:

#group rows by team and calculate sum of points, keeping team as a regular column
print(df.groupby('team', as_index=False).sum())

  team  points
0    A      80
1    B      29
2    C      72

Notice that team is now a regular column in the output and the index is simply a default integer index numbered from 0 to 2.
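
For this example, the same layout can also be reached by grouping with the default as_index=True and then calling reset_index() on the result. The sketch below compares the two approaches on the same data; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'points': [12, 15, 17, 17, 19, 14, 15, 20, 24, 28]})

# Option 1: keep team as a regular column directly
totals_no_index = df.groupby('team', as_index=False).sum()

# Option 2: let team become the index, then move it back into a column
totals_reset = df.groupby('team').sum().reset_index()

# Compare the two results
print(totals_no_index.equals(totals_reset))
```

Using as_index=False skips the extra reset_index() step, which is convenient when the grouped result feeds straight into a merge or an export.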

Note: You can find the complete documentation for the groupby() operation in the official pandas documentation.
