
Using the dplyr package in R, you can calculate the mean across multiple columns for each row by combining the rowwise(), mutate(), and c_across() functions. You specify which columns to include, and the result is a new column that holds the mean of those columns in each row. (The summarise_at() function, by contrast, returns one mean per column rather than one per row.)

You can use the following syntax to calculate the mean value for multiple specific columns in a data frame using the dplyr package in R:

library(dplyr)

df %>%
  rowwise() %>%
  mutate(game_mean = mean(c_across(c('game1', 'game2', 'game3')), na.rm=TRUE))

This particular example calculates the mean value of each row for only the columns named game1, game2, and game3 in the data frame.

The following example shows how to use this syntax in practice.

Example: Calculate Mean for Multiple Columns Using dplyr

Suppose we have the following data frame that shows the points scored by various basketball players in three different games:

#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'),
                 game1=c(10, 12, 17, 18, 24, 29, 29, 34),
                 game2=c(8, 10, 14, 15, NA, 19, 18, 29),
                 game3=c(4, 5, 5, 9, 12, 12, 18, 20))

#view data frame
df

  team game1 game2 game3
1    A    10     8     4
2    A    12    10     5
3    A    17    14     5
4    B    18    15     9
5    B    24    NA    12
6    B    29    19    12
7    C    29    18    18
8    C    34    29    20

We can use the following syntax to calculate the mean value of each row for only the game1, game2 and game3 columns:

library(dplyr)

#calculate mean value in each row for game1, game2 and game3 columns
df %>%
  rowwise() %>%
  mutate(game_mean = mean(c_across(c('game1', 'game2', 'game3')), na.rm=TRUE))

# A tibble: 8 x 5
# Rowwise: 
  team  game1 game2 game3 game_mean
  <chr> <dbl> <dbl> <dbl>     <dbl>
1 A        10     8     4      7.33
2 A        12    10     5      9   
3 A        17    14     5     12   
4 B        18    15     9     14   
5 B        24    NA    12     18   
6 B        29    19    12     20   
7 C        29    18    18     21.7 
8 C        34    29    20     27.7

The column called game_mean displays the mean value in each row across the game1, game2 and game3 columns.

For example:

Mean value of row 1: (10 + 8 + 4) / 3 = 7.33
Mean value of row 2: (12 + 10 + 5) / 3 = 9
Mean value of row 3: (17 + 14 + 5) / 3 = 12

And so on.
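
The hand calculations above can also be checked with base R. The sketch below is one way to do it, assuming the same data frame as in this example. rowMeans() is a base R function, and the names game_cols and row_means are introduced here only for illustration. Note that row 5 has a missing game2 value, so with na.rm=TRUE its mean is taken over the two remaining values: (24 + 12) / 2 = 18.

```r
#recreate the data frame from the example
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'),
                 game1=c(10, 12, 17, 18, 24, 29, 29, 34),
                 game2=c(8, 10, 14, 15, NA, 19, 18, 29),
                 game3=c(4, 5, 5, 9, 12, 12, 18, 20))

#columns to average (illustrative helper name)
game_cols <- c('game1', 'game2', 'game3')

#row means computed with base R, ignoring NA values
row_means <- rowMeans(df[, game_cols], na.rm=TRUE)

#round to two decimals for comparison with the hand calculations
round(row_means, 2)
```

The rounded values should line up with the game_mean column shown earlier, which prints three significant digits.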

Note that we could also use the starts_with() function to specify that we’d like to calculate the mean value of each row for only the columns that start with ‘game’ in the column name:

library(dplyr)

#calculate mean value in each row for columns that start with 'game'
df %>%
  rowwise() %>%
  mutate(game_mean = mean(c_across(starts_with('game')), na.rm=TRUE))

# A tibble: 8 x 5
# Rowwise: 
  team  game1 game2 game3 game_mean
  <chr> <dbl> <dbl> <dbl>     <dbl>
1 A        10     8     4      7.33
2 A        12    10     5      9   
3 A        17    14     5     12   
4 B        18    15     9     14   
5 B        24    NA    12     18   
6 B        29    19    12     20   
7 C        29    18    18     21.7 
8 C        34    29    20     27.7

Notice that this syntax produces the same results as the previous example.
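
One detail worth noting in both outputs is the "# Rowwise:" line: the result is still a row-wise tibble, so later dplyr verbs would keep operating one row at a time. A minimal sketch of one way to handle this, assuming the same data frame as above, is to finish the pipeline with ungroup(). The names df_means and overall_mean are used here only for illustration.

```r
library(dplyr)

#recreate the data frame from the example
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'),
                 game1=c(10, 12, 17, 18, 24, 29, 29, 34),
                 game2=c(8, 10, 14, 15, NA, 19, 18, 29),
                 game3=c(4, 5, 5, 9, 12, 12, 18, 20))

#calculate row means, then drop the row-wise grouping
df_means <- df %>%
  rowwise() %>%
  mutate(game_mean = mean(c_across(starts_with('game')), na.rm=TRUE)) %>%
  ungroup()

#a later summary now runs over all rows at once
df_means %>%
  summarise(overall_mean = mean(game_mean))
```

Without ungroup(), the summarise() call would be evaluated separately for each row and return eight values instead of a single overall mean.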
