How to calculate the mean of one column in a groupby object in Pandas?

To calculate the mean of one column within each group in pandas, group the DataFrame with groupby(), select the column of interest, and call mean(). The result contains one mean value per group, which you can then use in downstream operations.
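As a minimal sketch of the mean-only case, the snippet below groups a small DataFrame and averages a single column. The DataFrame name toy and its values are hypothetical and used only for illustration.

```python
import pandas as pd

# hypothetical toy data, for illustration only
toy = pd.DataFrame({'team': ['A', 'A', 'B', 'B'],
                    'points': [10, 20, 30, 50]})

# group by team, select the points column, then take the mean of each group
mean_points = toy.groupby('team')['points'].mean()

print(mean_points)
```

Here mean_points is a pandas Series indexed by team, with a value of 15.0 for team A and 40.0 for team B.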


You can use the following syntax to calculate the mean and standard deviation of a column after using the groupby() operation in pandas:

df.groupby(['team'], as_index=False).agg({'points':['mean','std']})

This particular example groups the rows of a pandas DataFrame by the value in the team column, then calculates the mean and standard deviation of values in the points column.

The following example shows how to use this syntax in practice.

Example: Calculate Mean & Std of One Column in Pandas groupby

Suppose we have the following pandas DataFrame that contains information about basketball players on various teams:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                   'points': [12, 15, 17, 17, 19, 14, 15, 20, 24, 28],
                   'assists': [5, 5, 7, 9, 10, 14, 13, 8, 2, 7]})
                            
#view DataFrame
print(df)

  team  points  assists
0    A      12        5
1    A      15        5
2    A      17        7
3    A      17        9
4    B      19       10
5    B      14       14
6    B      15       13
7    C      20        8
8    C      24        2
9    C      28        7

We can use the following syntax to calculate the mean and standard deviation of values in the points column, grouped by the team column:

#calculate mean and standard deviation of points, grouped by team
output = df.groupby(['team'], as_index=False).agg({'points':['mean','std']})

#view results
print(output)

  team points          
         mean       std
0    A  15.25  2.362908
1    B  16.00  2.645751
2    C  24.00  4.000000

From the output we can see:

  • The mean points value for team A is 15.25.
  • The standard deviation of points for team A is 2.362908.

And so on.
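Note that the std values above are sample standard deviations, since pandas divides by n - 1 by default (ddof=1). If the population standard deviation is wanted instead, one option is to pass ddof=0 to std() directly, as in this sketch that rebuilds the same DataFrame:

```python
import pandas as pd

# same data as in the example above
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                   'points': [12, 15, 17, 17, 19, 14, 15, 20, 24, 28],
                   'assists': [5, 5, 7, 9, 10, 14, 13, 8, 2, 7]})

# sample standard deviation (ddof=1, the pandas default)
sample_std = df.groupby('team')['points'].std()

# population standard deviation (ddof=0)
population_std = df.groupby('team')['points'].std(ddof=0)

print(sample_std)
print(population_std)
```

The sample_std values match the std column shown earlier, while the population_std values are slightly smaller because they divide by n rather than n - 1.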

Because passing a list of functions to agg() produces two-level column labels, we can also rename the columns so that the output is easier to read:

#rename columns
output.columns = ['team', 'points_mean', 'points_std']

#view updated results
print(output)

  team  points_mean  points_std
0    A        15.25    2.362908
1    B        16.00    2.645751
2    C        24.00    4.000000
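As an alternative to renaming the columns afterwards, pandas (version 0.25 and later) supports named aggregation, which assigns the output column names inside agg() itself. The sketch below assumes the same DataFrame as above and reuses the names points_mean and points_std:

```python
import pandas as pd

# same data as in the example above
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
                   'points': [12, 15, 17, 17, 19, 14, 15, 20, 24, 28],
                   'assists': [5, 5, 7, 9, 10, 14, 13, 8, 2, 7]})

# named aggregation: output_name=(input_column, function)
output = df.groupby('team', as_index=False).agg(
    points_mean=('points', 'mean'),
    points_std=('points', 'std'),
)

print(output)
```

This yields flat column names (team, points_mean, points_std) directly, so no separate renaming step is needed.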

Note: The complete documentation for the groupby() operation is available in the official pandas documentation.
