How can I calculate the cumulative percentage in Pandas?

Pandas is a popular Python library for manipulating and analyzing data. A common operation in data analysis is calculating the cumulative percentage of a column or Series, that is, the running total of the values expressed as a percentage of the overall total. In pandas, the cumsum() method returns the cumulative sum of a Series. Dividing that result by the total sum of the Series and multiplying by 100 gives the cumulative percentage.
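As a minimal sketch of that formula, the snippet below applies it to a small Series. The values are hypothetical and chosen only so the arithmetic is easy to check by hand.

```python
import pandas as pd

# Hypothetical values, used only for illustration
s = pd.Series([10, 20, 30, 40])

# Running total divided by the grand total, scaled to a percentage
cum_percent = 100 * s.cumsum() / s.sum()

print(cum_percent.tolist())  # [10.0, 30.0, 60.0, 100.0]
```

The last value is always 100, since the final running total equals the overall total.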

Calculate Cumulative Percentage in Pandas


You can use the following basic syntax to calculate the cumulative percentage of values in a column of a pandas DataFrame:

#calculate cumulative sum of column
df['cum_sum'] = df['col1'].cumsum()

#calculate cumulative percentage of column (rounded to 2 decimal places)
df['cum_percent'] = round(100 * df['cum_sum'] / df['col1'].sum(), 2)

The following example shows how to use this syntax in practice.

Example: Calculate Cumulative Percentage in Pandas

Suppose we have the following pandas DataFrame that shows the number of units a company sells during consecutive years:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'year': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'units_sold': [60, 75, 77, 87, 104, 134, 120, 125, 140, 150]})

#view DataFrame
print(df)

   year  units_sold
0     1          60
1     2          75
2     3          77
3     4          87
4     5         104
5     6         134
6     7         120
7     8         125
8     9         140
9    10         150

Next, we can use the following code to add columns that show the cumulative number of units sold and the cumulative percentage of units sold:

#calculate cumulative sum of units sold
df['cum_sum'] = df['units_sold'].cumsum()

#calculate cumulative percentage of units sold
df['cum_percent'] = round(100 * df['cum_sum'] / df['units_sold'].sum(), 2)

#view updated DataFrame
print(df)

   year  units_sold  cum_sum  cum_percent
0     1          60       60         5.60
1     2          75      135        12.59
2     3          77      212        19.78
3     4          87      299        27.89
4     5         104      403        37.59
5     6         134      537        50.09
6     7         120      657        61.29
7     8         125      782        72.95
8     9         140      922        86.01
9    10         150     1072       100.00

We interpret the cumulative percentages as follows:

  • 5.60% of all sales were made in year 1.
  • 12.59% of all sales were made in years 1 and 2 combined.
  • 19.78% of all sales were made in years 1, 2, and 3 combined.

And so on.
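One way to put this interpretation to use is to look up the first year in which the cumulative percentage reaches a given level. The sketch below rebuilds the example DataFrame and assumes a threshold of 50%; the names threshold and first_year are hypothetical and not part of pandas.

```python
import pandas as pd

# Same data as the example above
df = pd.DataFrame({'year': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'units_sold': [60, 75, 77, 87, 104, 134, 120, 125, 140, 150]})

# Cumulative percentage of units sold, rounded to 2 decimal places
df['cum_percent'] = round(100 * df['units_sold'].cumsum() / df['units_sold'].sum(), 2)

# Assumed threshold for illustration: half of all sales
threshold = 50

# Keep rows at or above the threshold and take the year of the first one
first_year = df.loc[df['cum_percent'] >= threshold, 'year'].iloc[0]

print(first_year)  # 6
```

Year 6 is the first row with a cumulative percentage of at least 50 (50.09), so half of all sales had been made by the end of year 6.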

Note that you can change the second argument of the round() function to change the number of decimal places shown.

For example, we could round the cumulative percentage to zero decimal places instead:

#calculate cumulative sum of units sold
df['cum_sum'] = df['units_sold'].cumsum()

#calculate cumulative percentage of units sold
df['cum_percent'] = round(100 * df['cum_sum'] / df['units_sold'].sum(), 0)

#view updated DataFrame
print(df)

   year  units_sold  cum_sum  cum_percent
0     1          60       60          6.0
1     2          75      135         13.0
2     3          77      212         20.0
3     4          87      299         28.0
4     5         104      403         38.0
5     6         134      537         50.0
6     7         120      657         61.0
7     8         125      782         73.0
8     9         140      922         86.0
9    10         150     1072        100.0

The cumulative percentages are now rounded to zero decimal places.
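As an alternative sketch, the same calculation can be written as a single method chain with the Series methods div(), mul(), and round(), without storing the cumulative sum in its own column. This is one possible style, not the only way to write it; the variable name units is an assumption made for readability.

```python
import pandas as pd

# Same data as the example above
df = pd.DataFrame({'year': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'units_sold': [60, 75, 77, 87, 104, 134, 120, 125, 140, 150]})

units = df['units_sold']

# cumulative sum -> share of the total -> percentage -> 2 decimal places
df['cum_percent'] = units.cumsum().div(units.sum()).mul(100).round(2)

print(df)
```

Here the number of decimal places is set by the argument passed to the round() method at the end of the chain.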


Cite this article

stats writer (2024). How can I calculate the cumulative percentage in Pandas?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/how-can-i-calculate-the-cumulative-percentage-in-pandas/

stats writer. "How can I calculate the cumulative percentage in Pandas?." PSYCHOLOGICAL SCALES, 1 Jul. 2024, https://scales.arabpsychology.com/stats/how-can-i-calculate-the-cumulative-percentage-in-pandas/.

