The stat_summary() function in ggplot2 computes a summary statistic, such as the mean, median, or minimum, of the y values at each x value and draws the result with a geometric shape of your choice. Because the summary is calculated inside the plotting call, you can create bar charts, point plots, or line graphs of summary values without aggregating the data frame first. This makes it a convenient tool for data exploration and a useful addition to many ggplot2 graphs.
How to Use the stat_summary() Function in ggplot2
You can use the stat_summary() function in ggplot2 to create visualizations that display summary metrics of specific variables in a data frame.
The examples below show how to use the stat_summary() function in practice with this data frame in R:
#create data frame
df = data.frame(team=rep(c('A', 'B', 'C'), each=4),
                points=c(8, 12, 4, 6, 26, 21, 25, 20, 9, 18, 14, 14))

#view data frame
df

   team points
1     A      8
2     A     12
3     A      4
4     A      6
5     B     26
6     B     21
7     B     25
8     B     20
9     C      9
10    C     18
11    C     14
12    C     14
Example 1: Use stat_summary() to Visualize Mean Values with Bar Plot
The following code shows how to use the stat_summary() function to visualize the mean value in the points column of the data frame, grouped by the team column:
library(ggplot2)
library(dplyr)
#create bar plot to visualize mean points by team
df %>%
ggplot(aes(x=team, y=points)) +
stat_summary(fun='mean', geom='bar')
The bars in the bar plot represent the mean points value for each unique team value.
Notice that we used the fun argument within stat_summary() to specify the summary function to use and we used the geom argument to specify the geometric shape to use in the plot.
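As a quick check, the bar heights can be compared against group means computed directly. The sketch below uses base R's tapply(), which is not part of the original example, and rebuilds the same data frame so that it runs on its own.

```r
# rebuild the example data frame
df <- data.frame(team=rep(c('A', 'B', 'C'), each=4),
                 points=c(8, 12, 4, 6, 26, 21, 25, 20, 9, 18, 14, 14))

# compute the mean of points for each team
# these values should match the bar heights in the plot
team_means <- tapply(df$points, df$team, mean)
team_means
```

For this data the means are 7.5 for team A, 23 for team B, and 13.75 for team C.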
Example 2: Use stat_summary() to Visualize Mean Values with Scatter Plot
The following code shows how to use the stat_summary() function to visualize the mean value in the points column of the data frame, grouped by the team column, using points as the geometric shape:
library(ggplot2)
library(dplyr)
#create plot with points to visualize mean points by team
df %>%
ggplot(aes(x=team, y=points)) +
stat_summary(fun='mean', geom='point')
Notice that we used the geom argument within the stat_summary() function to specify that we’d like to use points as the geometric shape in the plot.
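Mean points are often easier to read when the raw observations are shown alongside them. The following is a minimal sketch of that idea, assuming you want the raw values in grey and the means in red; the colours and point size are illustrative choices, not part of the original example.

```r
library(ggplot2)

# rebuild the example data frame
df <- data.frame(team=rep(c('A', 'B', 'C'), each=4),
                 points=c(8, 12, 4, 6, 26, 21, 25, 20, 9, 18, 14, 14))

# draw the raw observations first, then overlay each team's mean
# as a larger red point (colours and size are illustrative)
p <- ggplot(df, aes(x=team, y=points)) +
  geom_point(colour='grey50') +
  stat_summary(fun=mean, geom='point', colour='red', size=4)

p
```

Because stat_summary() is added after geom_point(), the mean points are drawn on top of the raw observations.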
Example 3: Use stat_summary() to Visualize Minimum Values with Bar Plot
The following code shows how to use the stat_summary() function to visualize the minimum value in the points column of the data frame, grouped by the team column:
library(ggplot2)
library(dplyr)
#create bar plot to visualize minimum points by team
df %>%
ggplot(aes(x=team, y=points)) +
stat_summary(fun='min', geom='bar')
Notice that we used the fun argument within the stat_summary() function to specify that we’d like to use the minimum as the summary function.
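More than one summary can be shown in a single layer. The sketch below assumes ggplot2 version 3.3.0 or later, where stat_summary() also accepts the fun.min and fun.max arguments, and uses the pointrange geom to draw the mean as a point with a line running from the minimum to the maximum. The choice of geom and summary functions here is an illustration, not part of the original examples.

```r
library(ggplot2)

# rebuild the example data frame
df <- data.frame(team=rep(c('A', 'B', 'C'), each=4),
                 points=c(8, 12, 4, 6, 26, 21, 25, 20, 9, 18, 14, 14))

# mean as the point, minimum and maximum as the ends of the vertical line
p_range <- ggplot(df, aes(x=team, y=points)) +
  stat_summary(fun=mean, fun.min=min, fun.max=max, geom='pointrange')

p_range
```

For this data, the lines span 4 to 12 for team A, 20 to 26 for team B, and 9 to 18 for team C.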