In R, one common way to calculate the mean by group is the group_by() function from the dplyr package. You group the data by a chosen variable and then apply a summary function such as mean() to obtain one value per group. For example, to calculate the mean age of individuals in different countries, group the data by the country variable and apply mean() to the age variable. Other summary functions such as median(), sum() or sd() (standard deviation) can be used in the same way.
Calculate the Mean by Group in R (With Examples)
Often you may want to calculate the mean by group in R. There are three methods you can use to do so:
Method 1: Use base R.
aggregate(df$col_to_aggregate, list(df$col_to_group_by), FUN=mean)
Method 2: Use the dplyr package.
library(dplyr)
df %>%
group_by(col_to_group_by) %>%
summarise_at(vars(col_to_aggregate), list(name = mean))
Method 3: Use the data.table package.
library(data.table)
dt[ ,list(mean=mean(col_to_aggregate)), by=col_to_group_by]
The following examples show how to use each of these methods in practice.
Method 1: Calculate Mean by Group Using Base R
The following code shows how to use the aggregate() function from base R to calculate the mean points scored by team in the following data frame:
#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts=c(5, 8, 14, 18, 5, 7, 7),
                 rebs=c(8, 8, 9, 3, 8, 7, 4))
#view data frame
df
  team pts rebs
1    a   5    8
2    a   8    8
3    b  14    9
4    b  18    3
5    b   5    8
6    c   7    7
7    c   7    4
#find mean points scored by team
aggregate(df$pts, list(df$team), FUN=mean)
  Group.1        x
1       a  6.50000
2       b 12.33333
3       c  7.00000
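The output above uses the generic column names Group.1 and x. As a minimal sketch, aggregate() also accepts a formula, which keeps the original column names and can summarise more than one column at once. The object names mean_pts and mean_both are illustrative and not part of the original example.

```r
# Same data as in the example above
df <- data.frame(team = c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts  = c(5, 8, 14, 18, 5, 7, 7),
                 rebs = c(8, 8, 9, 3, 8, 7, 4))

# Formula interface: read "pts ~ team" as "pts by team".
# The result has columns named team and pts instead of Group.1 and x.
mean_pts <- aggregate(pts ~ team, data = df, FUN = mean)
print(mean_pts)

# cbind() on the left-hand side summarises several columns in one call
mean_both <- aggregate(cbind(pts, rebs) ~ team, data = df, FUN = mean)
print(mean_both)
```

Note that the formula interface drops rows with missing values in the listed columns by default, so results can differ from the non-formula call when the data contain NA values.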
Method 2: Calculate Mean by Group Using dplyr
The following code shows how to use the group_by() and summarise_at() functions from the dplyr package to calculate the mean points scored by team in the following data frame:
library(dplyr)
#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
pts=c(5, 8, 14, 18, 5, 7, 7),
rebs=c(8, 8, 9, 3, 8, 7, 4))
#find mean points scored by team
df %>%
group_by(team) %>%
summarise_at(vars(pts), list(name = mean))
# A tibble: 3 x 2
team name
<fct> <dbl>
1 a 6.5
2 b 12.3
3 c 7
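In dplyr 1.0.0 and later, summarise_at() is marked as superseded in favour of summarise() combined with across(). The sketch below shows both a plain summarise() call that computes several statistics for one column and an across() call that averages several columns. It assumes dplyr 1.0.0 or newer is installed; the names pts_summary, col_means, mean_pts, median_pts and sd_pts are illustrative.

```r
library(dplyr)

# Same data as in the example above
df <- data.frame(team = c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts  = c(5, 8, 14, 18, 5, 7, 7),
                 rebs = c(8, 8, 9, 3, 8, 7, 4))

# Several summaries of one column, each with an explicit output name
pts_summary <- df %>%
  group_by(team) %>%
  summarise(mean_pts   = mean(pts),
            median_pts = median(pts),
            sd_pts     = sd(pts))
print(pts_summary)

# Mean of several columns at once; output columns keep the input names
col_means <- df %>%
  group_by(team) %>%
  summarise(across(c(pts, rebs), mean))
print(col_means)
```

If a column may contain missing values, mean(pts, na.rm = TRUE) ignores them instead of returning NA for that group.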
Method 3: Calculate Mean by Group Using data.table
The following code shows how to use the data.table package to calculate the mean points scored by team in the following data frame:
library(data.table)
#create data frame
df <- data.frame(team=c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
pts=c(5, 8, 14, 18, 5, 7, 7),
rebs=c(8, 8, 9, 3, 8, 7, 4))
#convert data frame to data table
setDT(df)
#find mean points scored by team
df[ ,list(mean=mean(pts)), by=team]
team mean
1: a 6.50000
2: b 12.33333
3: c 7.00000
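setDT() converts df in place. If you would rather leave the original data frame untouched, or want the mean of several columns at once, the following sketch is one option. It uses as.data.table() to make a copy and the .SD / .SDcols idiom to apply mean() to each listed column. The names dt and team_means are illustrative.

```r
library(data.table)

# Same data as in the example above
df <- data.frame(team = c('a', 'a', 'b', 'b', 'b', 'c', 'c'),
                 pts  = c(5, 8, 14, 18, 5, 7, 7),
                 rebs = c(8, 8, 9, 3, 8, 7, 4))

# as.data.table() returns a copy, so df itself stays a plain data frame
dt <- as.data.table(df)

# .SD is the subset of data for each group, limited to the columns in .SDcols;
# lapply() applies mean() to each of those columns
team_means <- dt[, lapply(.SD, mean), by = team, .SDcols = c("pts", "rebs")]
print(team_means)
```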