How can one find the maximum value by group in R?

One way to find the maximum value by group in R is the base aggregate() function. It splits the data by one or more grouping variables and applies a summary function to each group. The basic syntax is aggregate(x, by, FUN), where x is the data to summarize, by is a list of grouping variables, and FUN is the function applied to each group (here, max). The result is a new data frame with one row per group that holds the maximum value for that group.
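As a minimal sketch of the aggregate(x, by, FUN) form described above, the following uses the built-in mtcars dataset, which is an assumption made for illustration and is not part of this tutorial's data. It finds the maximum mpg for each value of cyl:

```r
# mtcars ships with base R; it is used here only as illustrative data
max_mpg <- aggregate(x = mtcars$mpg,              # values to summarize
                     by = list(cyl = mtcars$cyl), # grouping variable, wrapped in a list
                     FUN = max)                   # function applied to each group

max_mpg
#   cyl    x
# 1   4 33.9
# 2   6 21.4
# 3   8 19.2
```

Note that by must be a list, and that the summarized column is named x unless you rename it.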

Find the Maximum Value by Group in R


Often you may want to find the maximum value of each group in a data frame in R. Fortunately this is easy to do using functions from the dplyr package.

This tutorial explains how to do so using the following data frame:

#create data frame
df <- data.frame(team = c('A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 position = c('G', 'F', 'F', 'G', 'G', 'G', 'F'),
                 points = c(12, 15, 19, 22, 34, 34, 39))

#view data frame
df

  team position points
1    A        G     12
2    A        F     15
3    A        F     19
4    B        G     22
5    B        G     34
6    B        G     34
7    B        F     39

Example 1: Find Max Value by Group

The following code shows how to find the max value by team and position:

library(dplyr)

#find max value by team and position
df %>%
  group_by(team, position) %>%
  summarise(max = max(points, na.rm=TRUE))

# A tibble: 4 x 3
# Groups:   team [?]
  team   position   max
1 A      F         19.0
2 A      G         12.0
3 B      F         39.0
4 B      G         34.0
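The na.rm=TRUE argument in the code above tells max() to ignore missing values. The data frame in this tutorial has none, so the following small sketch uses a hypothetical vector (points_with_na, made up for illustration) to show the difference:

```r
# Hypothetical vector with one missing value
points_with_na <- c(12, NA, 19)

# Without na.rm, a single NA makes the result NA
max_default <- max(points_with_na)
max_default
# [1] NA

# With na.rm = TRUE, the NA is dropped before taking the max
max_na_removed <- max(points_with_na, na.rm = TRUE)
max_na_removed
# [1] 19
```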

Example 2: Return Rows that Contain Max Value by Group

The following code shows how to return the rows that contain the max value by team and position:

library(dplyr)

#find rows that contain max points by team and position
df %>%
  group_by(team, position) %>%
  filter(points == max(points, na.rm=TRUE))

# A tibble: 5 x 3
# Groups:   team, position [4]
  team   position points
1 A      G          12.0
2 A      F          19.0
3 B      G          34.0
4 B      G          34.0
5 B      F          39.0

Example 3: Return a Single Row that Contains Max Value by Group

In the previous example, two players on team B in position G tied for the max number of points (34). If you only want to return the first player with the max value in each group, you can use the slice() function together with which.max(), which returns the index of the first maximum:

library(dplyr)

#find the first row that contains max points by team and position
df %>%
  group_by(team, position) %>%
  slice(which.max(points))

# A tibble: 4 x 3
# Groups:   team, position [4]
  team   position points
1 A      F          19.0
2 A      G          12.0
3 B      F          39.0
4 B      G          34.0
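As an alternative to slice(which.max(points)), dplyr also provides slice_max(). The sketch below assumes dplyr 1.0.0 or later is installed and rebuilds the same data frame so that it runs on its own. Setting with_ties = FALSE keeps only the first row when several rows share the max value; the result name top_rows is introduced here for illustration:

```r
library(dplyr)

# Same data frame as used throughout this tutorial
df <- data.frame(team = c('A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 position = c('G', 'F', 'F', 'G', 'G', 'G', 'F'),
                 points = c(12, 15, 19, 22, 34, 34, 39))

# Keep one row per team/position group: the row with the most points
top_rows <- df %>%
  group_by(team, position) %>%
  slice_max(points, n = 1, with_ties = FALSE) %>%
  ungroup()

top_rows
```

With the default with_ties = TRUE, both of the tied rows for team B, position G would be returned, matching the behavior of the filter() approach in Example 2.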

Additional Resources

The Complete Guide: How to Group & Summarize Data in R
How to Filter Rows in R
How to Remove Duplicate Rows in R
