How can we calculate the mean by group in R? Can you provide some examples?

In R, the mean by group can be calculated with the group_by() function from the dplyr package, combined with a summary function. group_by() groups the data by a chosen variable, and a function such as mean() is then applied to each group to obtain its mean value. For example, to calculate the mean age of individuals in different countries, the data can be grouped by the “country” variable and mean() applied to the “age” variable. Other summary functions such as median, sum or standard deviation can be used in the same way. Base R and the data.table package offer equivalent approaches.

Calculate the Mean by Group in R (With Examples)


Often you may want to calculate the mean by group in R. There are three methods you can use to do so:

Method 1: Use base R.

aggregate(df$col_to_aggregate, list(df$col_to_group_by), FUN=mean)

Method 2: Use the dplyr package.

library(dplyr)

df %>%
  group_by(col_to_group_by) %>%
  summarise_at(vars(col_to_aggregate), list(name = mean))

Method 3: Use the data.table package.

library(data.table)

dt[ ,list(mean=mean(col_to_aggregate)), by=col_to_group_by]

The following examples show how to use each of these methods in practice.

Method 1: Calculate Mean by Group Using Base R

The following code shows how to use the aggregate() function from base R to calculate the mean points scored by team in the following data frame:

#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#view data frame
df

  team pts rebs
1    a   5    8
2    a   8    8
3    b  14    9
4    b  18    3
5    b   5    8
6    c   7    7
7    c   7    4

#find mean points scored by team
aggregate(df$pts, list(df$team), FUN=mean)

  Group.1        x
1       a  6.50000
2       b 12.33333
3       c  7.00000
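The output above uses the generic column names Group.1 and x. As a minimal sketch of an alternative, aggregate() also accepts a formula, which keeps the original column names and can summarise more than one column at once. The object names mean_pts and mean_both are chosen here for illustration only.

```r
#create the same data frame as in the example above
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#formula interface: mean of pts for each team, original column names kept
mean_pts <- aggregate(pts ~ team, data=df, FUN=mean)
mean_pts

#cbind() on the left-hand side gives the mean of several columns by team
mean_both <- aggregate(cbind(pts, rebs) ~ team, data=df, FUN=mean)
mean_both
```

The group means for pts are the same as in the output above; only the column labels differ.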

Method 2: Calculate Mean by Group Using dplyr

The following code shows how to use the group_by() and summarise_at() functions from the dplyr package to calculate the mean points scored by team in the following data frame:

library(dplyr) 

#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#find mean points scored by team
df %>%
  group_by(team) %>%
  summarise_at(vars(pts), list(name = mean))

# A tibble: 3 x 2
  team   name
  <fct> <dbl>
1 a       6.5
2 b      12.3
3 c       7  
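Since dplyr 1.0.0, summarise_at() is superseded by summarise() combined with across(), although it still works. The following sketch shows the same calculation in that newer style, assuming dplyr 1.0.0 or later is installed. The names mean_one, mean_pts and mean_all are chosen here for illustration only.

```r
library(dplyr)

#create the same data frame as in the example above
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#find mean points scored by team with a named summarise() argument
mean_one <- df %>%
  group_by(team) %>%
  summarise(mean_pts = mean(pts))
mean_one

#find mean of several columns by team with across()
mean_all <- df %>%
  group_by(team) %>%
  summarise(across(c(pts, rebs), mean))
mean_all
```

With a single function, across() keeps the original column names, so mean_all has the columns team, pts and rebs.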

Method 3: Calculate Mean by Group Using data.table

The following code shows how to use the data.table package to calculate the mean points scored by team in the following data frame:

library(data.table)

#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#convert data frame to data table 
setDT(df)

#find mean points scored by team
df[ ,list(mean=mean(pts)), by=team]

   team     mean
1:    a  6.50000
2:    b 12.33333
3:    c  7.00000
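As a further sketch, the same result can be written with .(), which data.table provides as an alias for list(), and the mean of several columns can be found by applying mean() over .SD. The names dt, mean_one, mean_pts and mean_all are chosen here for illustration only.

```r
library(data.table)

#create the same data as above, directly as a data table
dt <- data.table(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))

#find mean points scored by team using the .() alias for list()
mean_one <- dt[ , .(mean_pts=mean(pts)), by=team]
mean_one

#find mean of several columns by team; .SDcols selects the columns in .SD
mean_all <- dt[ , lapply(.SD, mean), by=team, .SDcols=c("pts", "rebs")]
mean_all
```

Creating the table with data.table() avoids the separate setDT() conversion step, but either approach gives the same group means.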
