Pandas: Use groupby() with size()

How do I find the number of rows in each group of a groupby?

To find the number of rows in each group, call groupby() on the DataFrame with the column (or columns) to group by, then call size() on the resulting GroupBy object. By default the result is a Series that holds the row count of each group.


The following three methods use groupby() and size() in pandas to count the number of occurrences by group:

Method 1: Count Occurrences Grouped by One Variable

df.groupby('var1').size()

Method 2: Count Occurrences Grouped by Multiple Variables

df.groupby(['var1', 'var2']).size()

Method 3: Count Occurrences Grouped by Multiple Variables and Sort by Count

df.groupby(['var1', 'var2']).size().sort_values(ascending=False)

The examples below show how to use each method in practice with this pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F'],
                   'points': [15, 22, 24, 25, 20, 35, 34, 19, 14, 12]})

#view DataFrame
print(df)

  team position  points
0    A        G      15
1    A        G      22
2    A        F      24
3    A        F      25
4    A        F      20
5    B        G      35
6    B        G      34
7    B        G      19
8    B        G      14
9    B        F      12

Example 1: Count Occurrences Grouped by One Variable

The following code shows how to use the groupby() and size() functions to count the occurrences of values in the team column:

#count occurrences of each value in team column
df.groupby('team').size()

team
A    5
B    5
dtype: int64

From the output we can see that the values A and B both occur 5 times in the team column.
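Because size() returns a Series, it can help to convert the result into a regular DataFrame before merging or exporting it. The following is a minimal sketch of one way to do that with reset_index(). The column name count and the variable name team_counts are illustrative choices, not part of the original example:

```python
import pandas as pd

#recreate the example DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F'],
                   'points': [15, 22, 24, 25, 20, 35, 34, 19, 14, 12]})

#count rows per team, then turn the Series into a DataFrame
#('count' is an assumed name for the new column)
team_counts = df.groupby('team').size().reset_index(name='count')

print(team_counts)
```

The result has one row per team, with the group label in the team column and the number of rows in the count column.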

Example 2: Count Occurrences Grouped by Multiple Variables

The following code shows how to use the groupby() and size() functions to count the occurrences of values for each combination of values in the team and position columns:

#count occurrences of values for each combination of team and position
df.groupby(['team', 'position']).size()

team  position
A     F           3
      G           2
B     F           1
      G           4
dtype: int64

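From the output we can see:

  • The combination of team A and position F occurs 3 times.
  • The combination of team A and position G occurs 2 times.
  • The combination of team B and position F occurs 1 time.
  • The combination of team B and position G occurs 4 times.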
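When grouping by multiple columns, the result is indexed by each combination of group values, so individual counts can be looked up with .loc. The following is a short sketch under that assumption. The variable names counts, a_f_count, and b_counts are illustrative:

```python
import pandas as pd

#recreate the example DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F'],
                   'points': [15, 22, 24, 25, 20, 35, 34, 19, 14, 12]})

#count rows for each combination of team and position
counts = df.groupby(['team', 'position']).size()

#look up the count for one specific combination (team A, position F)
a_f_count = counts.loc[('A', 'F')]
print(a_f_count)

#look up all counts for team B, indexed by position
b_counts = counts.loc['B']
print(b_counts)
```

The first lookup returns the single value 3, and the second returns a Series with the counts for positions F and G on team B.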

Example 3: Count Occurrences Grouped by Multiple Variables and Sort

The following code shows how to use the groupby() and size() functions to count the occurrences of values for each combination of values in the team and position columns, then sort by count:

#count occurrences for each combination of team and position and sort
df.groupby(['team', 'position']).size().sort_values(ascending=False)

team  position
B     G           4
A     F           3
      G           2
B     F           1
dtype: int64

The output shows the count of each combination of team and position values, sorted by count in descending order.

Note: To sort by count in ascending order, remove ascending=False from the sort_values() call, since ascending order is the default.
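As a quick sketch of the ascending case, the same grouping can be sorted with sort_values() and no arguments. The variable name sorted_counts is illustrative:

```python
import pandas as pd

#recreate the example DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'F', 'F', 'G', 'G', 'G', 'G', 'F'],
                   'points': [15, 22, 24, 25, 20, 35, 34, 19, 14, 12]})

#count occurrences for each combination and sort from smallest to largest
sorted_counts = df.groupby(['team', 'position']).size().sort_values()

print(sorted_counts)
```

Here the combination with the fewest rows (team B, position F) comes first and the one with the most rows (team B, position G) comes last.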
