How can I use dplyr to calculate the sum across multiple columns in a data frame?

dplyr is a popular R package that provides tools for data manipulation and transformation. A common task is summing values across multiple columns of a data frame. To get one total per row, use the mutate() function to add a new column that holds the sum of the specified columns, typically computed with base R's rowSums() function. To get one total per column instead, use the summarise() function, which collapses the rows into a single summary row. The examples in this article focus on row-wise sums with mutate().
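To make the difference between summarise() and mutate() concrete, here is a minimal sketch. The data frame scores and its columns a and b are hypothetical and used only for illustration; they are not part of the examples in this article.

```r
library(dplyr)

# Hypothetical data frame, used only for this illustration
scores <- data.frame(a = c(1, 2, 3),
                     b = c(10, 20, NA))

# summarise() collapses the rows: one total per column
col_totals <- scores %>%
  summarise(across(everything(), ~ sum(.x, na.rm = TRUE)))

# mutate() keeps the rows: one total per row, stored in a new column
row_totals <- scores %>%
  mutate(total = rowSums(across(everything()), na.rm = TRUE))

col_totals   # a = 6, b = 30
row_totals   # total = 11, 22, 3
```

With na.rm = TRUE, the missing value in b is skipped rather than turning the result into NA.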

Sum Across Multiple Columns Using dplyr


You can use the following methods to sum values across multiple columns of a data frame using dplyr:

Method 1: Sum Across All Columns

df %>%
  mutate(sum = rowSums(., na.rm=TRUE))

Method 2: Sum Across All Numeric Columns

df %>%
  mutate(sum = rowSums(across(where(is.numeric)), na.rm=TRUE))

Method 3: Sum Across Specific Columns

df %>%
  mutate(sum = rowSums(across(c(col1, col2))))

The following examples show how to use each method with the following data frame, which contains information about points scored by various basketball players during different games:

#create data frame
df <- data.frame(game1=c(22, 25, 29, 13, 22, 30),
                 game2=c(12, 10, 6, 6, 8, 11),
                 game3=c(NA, 15, 15, 18, 22, 13))

#view data frame
df

  game1 game2 game3
1    22    12    NA
2    25    10    15
3    29     6    15
4    13     6    18
5    22     8    22
6    30    11    13

Example 1: Sum Across All Columns

The following code shows how to calculate the sum of values across all columns in the data frame:

library(dplyr)

#sum values across all columns
df %>%
  mutate(total_points = rowSums(., na.rm=TRUE))

  game1 game2 game3 total_points
1    22    12    NA           34
2    25    10    15           50
3    29     6    15           50
4    13     6    18           37
5    22     8    22           52
6    30    11    13           54
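Note that the `.` placeholder inside rowSums() is a feature of the magrittr pipe (%>%); it is not available with the native R pipe (|>). As a rough alternative sketch, across(everything()) selects all columns without relying on the placeholder. The name totals is introduced here only for illustration.

```r
library(dplyr)

# Same data frame as in the examples
df <- data.frame(game1=c(22, 25, 29, 13, 22, 30),
                 game2=c(12, 10, 6, 6, 8, 11),
                 game3=c(NA, 15, 15, 18, 22, 13))

# across(everything()) selects every column, so no `.` placeholder is needed
totals <- df %>%
  mutate(total_points = rowSums(across(everything()), na.rm=TRUE))

totals$total_points   # 34 50 50 37 52 54
```

This should give the same totals as the placeholder version above, as long as every column is numeric.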

Example 2: Sum Across All Numeric Columns

The following code shows how to calculate the sum of values across all numeric columns in the data frame:

library(dplyr)

#sum values across all numeric columns
df %>%
  mutate(total_points = rowSums(across(where(is.numeric)), na.rm=TRUE))

  game1 game2 game3 total_points
1    22    12    NA           34
2    25    10    15           50
3    29     6    15           50
4    13     6    18           37
5    22     8    22           52
6    30    11    13           54
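In this data frame every column is numeric, so the result matches Example 1. The difference only shows up when a non-numeric column is present. The sketch below adds a hypothetical player column (not part of the original data) to illustrate: summing across all columns fails, while where(is.numeric) skips the character column.

```r
library(dplyr)

# Example data plus a hypothetical character column `player`
df2 <- data.frame(player=c("A", "B", "C", "D", "E", "F"),
                  game1=c(22, 25, 29, 13, 22, 30),
                  game2=c(12, 10, 6, 6, 8, 11),
                  game3=c(NA, 15, 15, 18, 22, 13))

# Summing across all columns raises an error because `player` is not numeric
all_cols_attempt <- try(
  df2 %>% mutate(total_points = rowSums(., na.rm=TRUE)),
  silent = TRUE
)
all_cols_failed <- inherits(all_cols_attempt, "try-error")

# where(is.numeric) keeps only game1, game2, and game3
numeric_only <- df2 %>%
  mutate(total_points = rowSums(across(where(is.numeric)), na.rm=TRUE))

all_cols_failed             # TRUE
numeric_only$total_points   # 34 50 50 37 52 54
```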

Example 3: Sum Across Specific Columns

The following code shows how to calculate the sum of values across the game1 and game2 columns only:

library(dplyr) 

#sum values across game1 and game2 only
df %>%
  mutate(first2_sum = rowSums(across(c(game1, game2))))

  game1 game2 game3 first2_sum
1    22    12    NA         34
2    25    10    15         35
3    29     6    15         35
4    13     6    18         19
5    22     8    22         30
6    30    11    13         41
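This example omits na.rm=TRUE, which is harmless here because game1 and game2 contain no missing values. As a hedged illustration of what happens otherwise, the sketch below sums game2 and game3 (game3 has an NA in the first row) with and without na.rm. The column names sum_default and sum_na_rm are chosen only for this illustration.

```r
library(dplyr)

# Same data frame as in the examples
df <- data.frame(game1=c(22, 25, 29, 13, 22, 30),
                 game2=c(12, 10, 6, 6, 8, 11),
                 game3=c(NA, 15, 15, 18, 22, 13))

na_demo <- df %>%
  mutate(sum_default = rowSums(across(c(game2, game3))),              # NA propagates
         sum_na_rm   = rowSums(across(c(game2, game3)), na.rm=TRUE))  # NA is skipped

na_demo$sum_default   # NA 25 21 24 30 24
na_demo$sum_na_rm     # 12 25 21 24 30 24
```

Whether to drop missing values depends on the analysis: a total of 12 for the first row treats the missing game as zero points, which may or may not be appropriate.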

Cite this article

stats writer (2024). How can I use dplyr to calculate the sum across multiple columns in a data frame?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/how-can-i-use-dplyr-to-calculate-the-sum-across-multiple-columns-in-a-data-frame/

stats writer. "How can I use dplyr to calculate the sum across multiple columns in a data frame?." PSYCHOLOGICAL SCALES, 27 Jun. 2024, https://scales.arabpsychology.com/stats/how-can-i-use-dplyr-to-calculate-the-sum-across-multiple-columns-in-a-data-frame/.