How to Add a Count Column to a Pandas DataFrame

A count column records, for every row of a pandas DataFrame, how many rows share that row's value in one or more grouping columns. The usual way to build one is to group the DataFrame with the groupby() method and then call transform('count') on the grouped column. Because transform() returns one value per original row rather than one value per group, the result lines up with the DataFrame's index and can be assigned directly as a new column.

You can use the following basic syntax to add a ‘count’ column to a pandas DataFrame:

df['var1_count'] = df.groupby('var1')['var1'].transform('count')

This syntax adds a column called var1_count to the DataFrame. For each row, the new column holds the number of rows that share that row's value in the column called var1.
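One detail worth keeping in mind is that 'count' only counts non-missing values in the selected column, while 'size' counts every row in the group. The sketch below illustrates the difference on a small made-up DataFrame; the names df_demo, points_count, and team_size are hypothetical and exist only for this illustration.

```python
import numpy as np
import pandas as pd

# hypothetical data: team A has one missing points value
df_demo = pd.DataFrame({'team': ['A', 'A', 'B'],
                        'points': [18, np.nan, 14]})

# 'count' ignores the missing value in points
df_demo['points_count'] = df_demo.groupby('team')['points'].transform('count')

# 'size' counts every row in the group, missing or not
df_demo['team_size'] = df_demo.groupby('team')['points'].transform('size')

print(df_demo)
```

When the column being counted has no missing values, the two give the same result.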

The following example shows how to use this syntax in practice.

Example: Add Count Column in Pandas

Suppose we have the following pandas DataFrame that contains information about various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'pos': ['Gu', 'Fo', 'Fo', 'Fo', 'Gu', 'Gu', 'Fo', 'Fo'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28]})

#view DataFrame
print(df)

  team pos  points
0    A  Gu      18
1    A  Fo      22
2    A  Fo      19
3    B  Fo      14
4    B  Gu      14
5    B  Gu      11
6    B  Fo      20
7    B  Fo      28

We can use the following code to add a column called team_count that contains, for each row, the number of rows that belong to the same team:

#add column that shows total count of each team
df['team_count'] = df.groupby('team')['team'].transform('count')

#view updated DataFrame
print(df)

  team pos  points  team_count
0    A  Gu      18           3
1    A  Fo      22           3
2    A  Fo      19           3
3    B  Fo      14           5
4    B  Gu      14           5
5    B  Gu      11           5
6    B  Fo      20           5
7    B  Fo      28           5

There are 3 rows with a team value of A and 5 rows with a team value of B.

Thus:

  • For each row where the team is equal to A, the value in the team_count column is 3.
  • For each row where the team is equal to B, the value in the team_count column is 5.
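As a quick cross-check of these counts, one possible alternative is to count each team with value_counts() and then map those counts back onto the team column. This is only a sketch of a different route to the same numbers; the variable name team_count_alt is hypothetical.

```python
import pandas as pd

# same DataFrame as in the example
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'pos': ['Gu', 'Fo', 'Fo', 'Fo', 'Gu', 'Gu', 'Fo', 'Fo'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28]})

# count column built with groupby + transform
df['team_count'] = df.groupby('team')['team'].transform('count')

# alternative: count each team once, then look up the count for every row
team_count_alt = df['team'].map(df['team'].value_counts())

# check that both approaches agree on every row
print((df['team_count'] == team_count_alt).all())
```

The map() approach only works for a single grouping column, which is one reason the groupby() and transform() pattern is the more general choice.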

You can also add a ‘count’ column that groups by multiple variables.

For example, the following code shows how to add a ‘count’ column that groups by the team and pos variables:

#add column that shows total count of each team and position
df['team_pos_count'] = df.groupby(['team', 'pos'])['team'].transform('count')

#view updated DataFrame
print(df)

  team pos  points  team_count  team_pos_count
0    A  Gu      18           3               1
1    A  Fo      22           3               2
2    A  Fo      19           3               2
3    B  Fo      14           5               3
4    B  Gu      14           5               2
5    B  Gu      11           5               2
6    B  Fo      20           5               3
7    B  Fo      28           5               3

From the output we can see:

  • There is 1 row that contains A in the team column and Gu in the pos column.
  • There are 2 rows that contain A in the team column and Fo in the pos column.
  • There are 3 rows that contain B in the team column and Fo in the pos column.
  • There are 2 rows that contain B in the team column and Gu in the pos column.
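If a compact summary with one row per group is more useful than a per-row column, the same counts can be obtained with groupby() followed by size(). The sketch below assumes the DataFrame from the example; the names summary and n_rows are hypothetical.

```python
import pandas as pd

# same DataFrame as in the example
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'pos': ['Gu', 'Fo', 'Fo', 'Fo', 'Gu', 'Gu', 'Fo', 'Fo'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28]})

# one row per (team, pos) combination with the number of rows in each group
summary = df.groupby(['team', 'pos']).size().reset_index(name='n_rows')

print(summary)
```

Unlike transform('count'), this returns one row per group rather than one value per original row, so it is suited to reporting rather than to adding a column to the original DataFrame.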
