How can I utilize the across() function in dplyr to manipulate multiple columns at once?

How can I utilize the across() function in dplyr to manipulate multiple columns at once?

The across() function in dplyr allows for manipulation of multiple columns at once in a data frame. This function can be utilized by providing a set of columns or a range of columns to perform the desired operation on. It is particularly useful for performing repetitive tasks, such as applying the same function to multiple columns, or creating new columns based on existing ones. By using the across() function, data manipulation tasks can be streamlined and more efficient.

Use the across() Function in dplyr (3 Examples)


You can use the function from the package in R to apply a transformation to multiple columns.

There are countless ways to use this function, but the following methods illustrate some common uses:

Method 1: Apply Function to Multiple Columns

#multiply values in col1 and col2 by 2
df %>% 
  mutate(across(c(col1, col2), function(x) x*2))

Method 2: Calculate One Summary Statistic for Multiple Columns

#calculate mean of col1 and col2
df %>%
  summarise(across(c(col1, col2), mean, na.rm=TRUE))

Method 3: Calculate Multiple Summary Statistics for Multiple Columns

#calculate mean and standard deviation for col1 and col2
df %>%
  summarise(across(c(col1, col2), list(mean=mean, sd=sd), na.rm=TRUE))

The following examples show how to each method with the following data frame:

#create data frame
df <- data.frame(conf=c('East', 'East', 'East', 'West', 'West', 'West'),
                 points=c(22, 25, 29, 13, 22, 30),
                 rebounds=c(12, 10, 6, 6, 8, 11))

#view data frame
df

  conf points rebounds
1 East     22       12
2 East     25       10
3 East     29        6
4 West     13        6
5 West     22        8
6 West     30       11

Example 1: Apply Function to Multiple Columns

The following code shows how to use the across() function to multiply the values in both the points and rebounds columns by 2:

library(dplyr)

#multiply values in points and rebounds columns by 2
df %>% 
  mutate(across(c(points, rebounds), function(x) x*2))

  conf points rebounds
1 East     44       24
2 East     50       20
3 East     58       12
4 West     26       12
5 West     44       16
6 West     60       22

Example 2: Calculate One Summary Statistic for Multiple Columns

The following code shows how to use the across() function to calculate the mean value for both the points and rebounds columns:

library(dplyr) 

#calculate mean value of points an rebounds columns
df %>%
  summarise(across(c(points, rebounds), mean, na.rm=TRUE))

  points rebounds
1   23.5 8.833333

Note that we can also use the is.numeric function to automatically calculate a summary statistic for all of the numeric columns in the data frame:

library(dplyr) 

#calculate mean value for every numeric column in data frame
df %>%
  summarise(across(where(is.numeric), mean, na.rm=TRUE))

  points rebounds
1   23.5 8.833333

Example 3: Calculate Multiple Summary Statistics for Multiple Columns

The following code shows how to use the across() function to calculate the mean and standard deviation of both the points and rebounds columns:

library(dplyr) 

#calculate mean and standard deviation for points and rebounds columns
df %>%
  summarise(across(c(points, rebounds), list(mean=mean, sd=sd), na.rm=TRUE))

  points_mean points_sd rebounds_mean rebounds_sd
1        23.5  6.156298      8.833333    2.562551

Note: You can find the complete documentation for the across() function .

Additional Resources

The following tutorials explain how to perform other common functions using dplyr:

Cite this article

stats writer (2024). How can I utilize the across() function in dplyr to manipulate multiple columns at once?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/how-can-i-utilize-the-across-function-in-dplyr-to-manipulate-multiple-columns-at-once/

stats writer. "How can I utilize the across() function in dplyr to manipulate multiple columns at once?." PSYCHOLOGICAL SCALES, 27 Jun. 2024, https://scales.arabpsychology.com/stats/how-can-i-utilize-the-across-function-in-dplyr-to-manipulate-multiple-columns-at-once/.

stats writer. "How can I utilize the across() function in dplyr to manipulate multiple columns at once?." PSYCHOLOGICAL SCALES, 2024. https://scales.arabpsychology.com/stats/how-can-i-utilize-the-across-function-in-dplyr-to-manipulate-multiple-columns-at-once/.

stats writer (2024) 'How can I utilize the across() function in dplyr to manipulate multiple columns at once?', PSYCHOLOGICAL SCALES. Available at: https://scales.arabpsychology.com/stats/how-can-i-utilize-the-across-function-in-dplyr-to-manipulate-multiple-columns-at-once/.

[1] stats writer, "How can I utilize the across() function in dplyr to manipulate multiple columns at once?," PSYCHOLOGICAL SCALES, vol. X, no. Y, ص Z-Z, June, 2024.

stats writer. How can I utilize the across() function in dplyr to manipulate multiple columns at once?. PSYCHOLOGICAL SCALES. 2024;vol(issue):pages.

Download Post (.PDF)
Slide Up
x
PDF
Scroll to Top