How to Count Unique Values Using Pandas GroupBy

Using the Pandas GroupBy method, it is possible to count the number of unique values within a given column of a Pandas DataFrame. This is done by grouping the DataFrame by the column of interest and then using the size() method to count the number of unique values. This method is useful for quickly identifying the number of unique values in each group, and can be used to further analyze the data across multiple columns.


You can use the following basic syntax to count the number of unique values by group in a pandas DataFrame:

df.groupby('group_column')['count_column'].nunique()

The following examples show how to use this syntax with the following DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'G', 'F', 'F', 'G', 'G', 'F', 'F', 'F'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4, 7, 7],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12, 13, 15]})

#view DataFrame
df

	team	position points	rebounds
0	A	G	 5	11
1	A	G	 7	8
2	A	G	 7	10
3	A	F	 9	6
4	A	F	 12	6
5	B	G	 9	5
6	B	G	 9	9
7	B	F	 4	12
8	B	F	 7	13
9	B	F	 7	15

Example 1: Group By One Column & Count Unique Values

The following code shows how to count the number of unique values in the ‘points’ column for each team:

#count number of unique values in 'points' column grouped by 'team' column
df.groupby('team')['points'].nunique()

team
A    4
B    3
Name: points, dtype: int64

From the output we can see:

  • There are 4 unique ‘points’ values for team A.
  • There are 3 unique ‘points’ values for team B.

Note that we can also use the unique() function to display each unique ‘points’ value by team:

#display unique values in 'points' column grouped by 'team'
df.groupby('team')['points'].unique()

team
A    [5, 7, 9, 12]
B        [9, 4, 7]
Name: points, dtype: object

Example 2: Group By Multiple Columns & Count Unique Values

The following code shows how to count the number of unique values in the ‘points’ column, grouped by team and position:

#count number of unique values in 'points' column grouped by 'team' and 'position'
df.groupby(['team', 'position'])['points'].nunique()

team  position
A     F           2
      G           2
B     F           2
      G           1
Name: points, dtype: int64

From the output we can see:

  • There are 2 unique ‘points’ values for players in position ‘F’ on team A.
  • There are 2 unique ‘points’ values for players in position ‘G’ on team A.
  • There are 2 unique ‘points’ values for players in position ‘F’ on team B.
  • There is1 unique ‘points’ value for players in position ‘G’ on team B.

Once again, we can use the unique() function to display each unique ‘points’ value by team and position:

#display unique values in 'points' column grouped by 'team' and 'position'
df.groupby(['team', 'position'])['points'].unique()

team  position
A     F           [9, 12]
      G            [5, 7]
B     F            [4, 7]
      G               [9]
Name: points, dtype: object

x