How do I create a Pandas crosstab with percentages?

To create a pandas crosstab with percentages, use the pd.crosstab() function to build a frequency table of two or more variables and pass the normalize parameter so that each cell is expressed as a proportion instead of a count. The normalize parameter can be set to 'index', 'columns', or 'all', depending on which total each cell should be divided by. The resulting values are proportions between 0 and 1, so multiply them by 100 if you want to display them as percentages. You can also use the margins parameter to add row and column totals to the table.


You can use the normalize argument within the pandas crosstab() function to create a crosstab that displays percentage values instead of counts:

pd.crosstab(df.col1, df.col2, normalize='index')

The normalize argument accepts three different values:

  • all: Display each cell as a proportion of the total count of all values.
  • index: Display each cell as a proportion of its row total.
  • columns: Display each cell as a proportion of its column total.

The following examples show how to use each of these options in practice with the following pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
                   'position':['G', 'G', 'F', 'G', 'F', 'F', 'F', 'G', 'G', 'F', 'F'],
                   'points': [22, 25, 24, 39, 34, 20, 18, 17, 20, 19, 22]})

#view DataFrame
print(df)

   team position  points
0     A        G      22
1     A        G      25
2     A        F      24
3     B        G      39
4     B        F      34
5     B        F      20
6     B        F      18
7     C        G      17
8     C        G      20
9     C        F      19
10    C        F      22

Here is what the default crosstab would look like for the count of players by team and position:

#create crosstab that displays count by team and position
pd.crosstab(df.team, df.position)

position  F  G
team
A         1  2
B         3  1
C         2  2

Example 1: Create Crosstab with Percentages Relative to All Values

We can use the crosstab() function with the argument normalize='all' to create a crosstab that displays each count as a proportion of the total count of all values:

#create crosstab that displays counts as percentage relative to total count
pd.crosstab(df.team, df.position, normalize='all')

position         F         G
team
A         0.090909  0.181818
B         0.272727  0.090909
C         0.181818  0.181818

Here is how to interpret the output:

  • Players on team A in position F account for 9.09% of total players.
  • Players on team A in position G account for 18.18% of total players.

And so on.
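
The output above contains proportions between 0 and 1, not percentages. As a minimal sketch, one way to display true percentages is to multiply the result by 100 and round it. The variable names props and pct are chosen here for illustration only:

```python
import pandas as pd

# same example DataFrame as above (only the columns the crosstab needs)
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
                   'position': ['G', 'G', 'F', 'G', 'F', 'F', 'F', 'G', 'G', 'F', 'F']})

# proportions relative to the total count of all values
props = pd.crosstab(df['team'], df['position'], normalize='all')

# scale to percentages and round to two decimal places
pct = (props * 100).round(2)
print(pct)
```

With this change, the cell for team A and position F reads 9.09 instead of 0.090909.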

Example 2: Create Crosstab with Percentages Relative to Row Totals

We can use the crosstab() function with the argument normalize='index' to create a crosstab that displays each count as a proportion of its row total:

#create crosstab that displays counts as percentage relative to row totals
pd.crosstab(df.team, df.position, normalize='index')

position         F         G
team
A         0.333333  0.666667
B         0.750000  0.250000
C         0.500000  0.500000

Here is how to interpret the output:

  • Players in position F account for 33.33% of total players on team A.
  • Players in position F account for 75% of total players on team B.
  • Players in position F account for 50% of total players on team C.

And so on.
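
A quick way to confirm that normalize='index' divides by row totals is to check that every row of the result sums to 1. The following is a small sanity-check sketch, with row_props and row_sums as illustrative names:

```python
import pandas as pd

# same example DataFrame as above (only the columns the crosstab needs)
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
                   'position': ['G', 'G', 'F', 'G', 'F', 'F', 'F', 'G', 'G', 'F', 'F']})

# proportions relative to each row (team) total
row_props = pd.crosstab(df['team'], df['position'], normalize='index')

# summing across the columns of each row should give 1 for every team
row_sums = row_props.sum(axis=1)
print(row_sums)
```

The same check with normalize='columns' and sum(axis=0) should give 1 for every position.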

Example 3: Create Crosstab with Percentages Relative to Column Totals

We can use the crosstab() function with the argument normalize='columns' to create a crosstab that displays each count as a proportion of its column total:

#create crosstab that displays counts as percentage relative to column totals
pd.crosstab(df.team, df.position, normalize='columns')

position         F    G
team
A         0.166667  0.4
B         0.500000  0.2
C         0.333333  0.4

Here is how to interpret the output:

  • Players on team A account for 16.67% of total players with a position of F.
  • Players on team B account for 50% of total players with a position of F.
  • Players on team C account for 33.33% of total players with a position of F.

And so on.
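
The margins parameter can be combined with normalize to add totals to a normalized table. The sketch below uses normalize='all' with margins=True, which adds an "All" row and an "All" column holding each team's and each position's share of all players. The name with_totals is illustrative, and the margins added for normalize='index' or normalize='columns' may differ, so inspect the output for your pandas version:

```python
import pandas as pd

# same example DataFrame as above (only the columns the crosstab needs)
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
                   'position': ['G', 'G', 'F', 'G', 'F', 'F', 'F', 'G', 'G', 'F', 'F']})

# proportions relative to the grand total, plus "All" row and column totals
with_totals = pd.crosstab(df['team'], df['position'],
                          normalize='all', margins=True)
print(with_totals)
```

Here the "All" column shows that team A accounts for 3 of the 11 players, and the "All" row shows that position F accounts for 6 of the 11 players.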

Note: You can find the complete documentation for the crosstab() function in the official pandas documentation.
