Using the pandas library in Python, you can sum specific columns of a DataFrame with the DataFrame.sum() method. The axis parameter controls the direction of the sum: axis=0 (the default) adds down each column and returns one total per column, while axis=1 adds across the columns and returns one total per row. Setting numeric_only=True restricts the calculation to numeric columns. In both cases the result is a Series, which can be assigned to a new column of the DataFrame. The examples below use axis=1 to add a column that holds the sum of the selected columns for each row.
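To make the effect of axis and numeric_only concrete, here is a minimal sketch. The small DataFrame, including its text column team, is hypothetical and exists only for illustration.

```python
import pandas as pd

# Hypothetical data: one text column and two numeric columns
df = pd.DataFrame({'team': ['A', 'B', 'C'],
                   'points': [18, 22, 19],
                   'assists': [5, 7, 7]})

# axis=0 (the default) adds down each column: one total per numeric column
col_totals = df.sum(axis=0, numeric_only=True)

# axis=1 adds across the columns: one total per row
row_totals = df.sum(axis=1, numeric_only=True)

print(col_totals)
print(row_totals)
```

In this sketch, numeric_only=True leaves the team column out of both results, so only points and assists contribute to the totals.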
You can use the following methods to find the sum of a specific set of columns in a pandas DataFrame:
Method 1: Find Sum of All Columns
#find sum of all columns
df['sum'] = df.sum(axis=1)
Method 2: Find Sum of Specific Columns
#specify the columns to sum
cols = ['col1', 'col4', 'col5']

#find sum of columns specified
df['sum'] = df[cols].sum(axis=1)
The following examples show how to use each method in practice with the following pandas DataFrame:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'points': [18, 22, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#view DataFrame
print(df)

   points  assists  rebounds
0      18        5        11
1      22        7         8
2      19        7        10
3      14        9         6
4      14       12         6
5      11        9         5
6      20        9         9
7      28        4        12
Example 1: Find Sum of All Columns
The following code shows how to sum the values of the rows across all columns in the DataFrame:
#define new column that contains sum of all columns
df['sum_stats'] = df.sum(axis=1)
#view updated DataFrame
df
   points  assists  rebounds  sum_stats
0      18        5        11         34
1      22        7         8         37
2      19        7        10         36
3      14        9         6         29
4      14       12         6         32
5      11        9         5         25
6      20        9         9         38
7      28        4        12         44
The sum_stats column contains the sum of the row values across all columns.
For example, here’s how the values were calculated:
- Sum of row 0: 18 + 5 + 11 = 34
- Sum of row 1: 22 + 7 + 8 = 37
- Sum of row 2: 19 + 7 + 10 = 36
And so on.
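One thing to watch for with this approach: once sum_stats has been added, it is itself a numeric column, so running df.sum(axis=1) a second time would count it as well. The sketch below, which uses only the first three rows of the data above, shows one possible way to avoid that by dropping the total column before summing again. The variable names recomputed and double_counted are illustrative.

```python
import pandas as pd

# First three rows of the example data
df = pd.DataFrame({'points': [18, 22, 19],
                   'assists': [5, 7, 7],
                   'rebounds': [11, 8, 10]})

# Row totals across all columns: 34, 37, 36
df['sum_stats'] = df.sum(axis=1)

# Summing again without care also adds sum_stats itself: 68, 74, 72
double_counted = df.sum(axis=1)

# Dropping the total column first reproduces the original totals
recomputed = df.drop(columns='sum_stats').sum(axis=1)

print(recomputed)
```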
Example 2: Find Sum of Specific Columns
The following code shows how to sum the values of the rows across only the points and assists columns in the DataFrame:
#specify the columns to sum
cols = ['points', 'assists']
#define new column that contains sum of specific columns
df['sum_stats'] = df[cols].sum(axis=1)
#view updated DataFrame
df
   points  assists  rebounds  sum_stats
0      18        5        11         23
1      22        7         8         29
2      19        7        10         26
3      14        9         6         23
4      14       12         6         26
5      11        9         5         20
6      20        9         9         29
7      28        4        12         32
The sum_stats column contains the sum of the row values across the points and assists columns only.
For example, here’s how the values were calculated:
- Sum of row 0: 18 + 5 = 23
- Sum of row 1: 22 + 7 = 29
- Sum of row 2: 19 + 7 = 26
And so on.
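The same column selection also works when you want one total per column rather than one per row. As a rough sketch using the example data, leaving axis at its default of 0 returns a Series with a total for each selected column, and summing that Series gives a single grand total. The names col_totals and grand_total are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'points': [18, 22, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# specify the columns to sum
cols = ['points', 'assists']

# one total per selected column (axis=0 is the default)
col_totals = df[cols].sum()

# single grand total over the selected columns
grand_total = col_totals.sum()

print(col_totals)
print(grand_total)
```

Here the points column totals 146 and the assists column totals 62, so the grand total is 208, which matches the sum of the eight row values in sum_stats above.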