What is the use of the quantile() function in R?

The quantile() function in R calculates sample quantiles of a numeric vector. With its default arguments it returns the minimum, first quartile, median, third quartile, and maximum, and it can return any other percentile when you supply the corresponding probabilities. This makes it useful for understanding the spread of a dataset and for generating descriptive statistics.

In statistics, quantiles are values that divide a ranked dataset into groups of equal size.

This function uses the following basic syntax:

quantile(x, probs = seq(0, 1, 0.25), na.rm = FALSE)

where:

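  • x: Numeric vector whose quantiles you want to calculate
  • probs: Numeric vector of probabilities, each between 0 and 1 (the default returns the quartiles)
  • na.rm: Whether to remove NA values before calculating (default is FALSE)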
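
As a quick illustration of the na.rm argument, here is a minimal sketch using a hypothetical vector named scores (not part of the examples below). With the default na.rm = FALSE, quantile() stops with an error when the vector contains NA, so the sketch wraps that call in tryCatch() to show the difference.

```r
#hypothetical vector that contains one missing value
scores <- c(2, 4, NA, 8, 10)

#with the default na.rm = FALSE the call fails, so catch the error
default_result <- tryCatch(quantile(scores),
                           error = function(e) "error")
default_result

#with na.rm = TRUE the NA is dropped before the quartiles are calculated
clean_result <- quantile(scores, na.rm = TRUE)
clean_result

#  0%  25%  50%  75% 100% 
# 2.0  3.5  6.0  8.5 10.0 
```

Setting na.rm = TRUE is usually what you want for descriptive statistics, but it is worth checking how many values were dropped before relying on the result.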

The following examples show how to use this function in practice.

Example 1: Calculate Quantiles of a Vector

The following code shows how to calculate quantiles of a vector in R:

#define vector of data 
data = c(1, 3, 3, 4, 5, 7, 8, 9, 12, 13, 13, 15, 18, 20, 22, 23, 24, 28)

#calculate quartiles
quantile(data, probs = seq(0, 1, 1/4))

 0%  25%  50%  75% 100% 
1.0  5.5 12.5 19.5 28.0 

#calculate quintiles
quantile(data, probs = seq(0, 1, 1/5))

 0%  20%  40%  60%  80% 100% 
1.0  4.4  8.8 13.4 21.2 28.0 

#calculate deciles
quantile(data, probs = seq(0, 1, 1/10))

 0%  10%  20%  30%  40%  50%  60%  70%  80%  90% 100% 
1.0  3.0  4.4  7.1  8.8 12.5 13.4 17.7 21.2 23.3 28.0 

#calculate specific quantiles of interest
quantile(data, probs = c(.2, .5, .9))

20%  50%  90% 
4.4 12.5 23.3
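
Several of the values above (such as 5.5 for the 25% quantile) do not appear in the data. That is because quantile() uses type = 7 by default, which linearly interpolates between neighbouring sorted values. The following is a minimal sketch that reproduces the 25% value by hand. The variable names h, lo, hi, manual, and builtin are illustrative only.

```r
#same vector as in Example 1
data <- c(1, 3, 3, 4, 5, 7, 8, 9, 12, 13, 13, 15, 18, 20, 22, 23, 24, 28)

x <- sort(data)          #sorted values (order statistics)
n <- length(x)           #18 observations
p <- 0.25                #probability of interest

h  <- (n - 1) * p + 1    #fractional position in the sorted vector: 5.25
lo <- floor(h)           #position of the lower neighbour: 5
hi <- min(lo + 1, n)     #position of the upper neighbour, clamped so p = 1 works

#linear interpolation between the two neighbouring values
manual <- x[lo] + (h - lo) * (x[hi] - x[lo])

#built-in result, returned without the "25%" name
builtin <- quantile(data, probs = p, names = FALSE)

manual
builtin
```

Both values should come out as 5.5. Other interpolation rules are available through the type argument of quantile(), and they can give slightly different answers on small samples.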

Example 2: Calculate Quantiles of Columns in Data Frame

The following code shows how to calculate the quantiles of a specific column in a data frame:

#create data frame
df <- data.frame(var1=c(1, 3, 3, 4, 5, 7, 7, 8, 12, 14, 18),
                 var2=c(7, 7, 8, 3, 2, 6, 8, 9, 11, 11, 16),
                 var3=c(3, 3, 6, 6, 8, 4, 4, 7, 10, 10, 11))

#calculate quartiles of column 'var2'
quantile(df$var2, probs = seq(0, 1, 1/4))

  0%  25%  50%  75% 100% 
 2.0  6.5  8.0 10.0 16.0 

We can also use the sapply() function to calculate the quantiles of multiple columns at once:

#calculate quartiles of every column
sapply(df, function(x) quantile(x, probs = seq(0, 1, 1/4)))

     var1 var2 var3
0%    1.0  2.0    3
25%   3.5  6.5    4
50%   7.0  8.0    6
75%  10.0 10.0    9
100% 18.0 16.0   11
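
The sapply() call above assumes every column is numeric. If a data frame also holds non-numeric columns, one option is to select the numeric columns first. The sketch below uses a hypothetical data frame named df_mixed to show the idea.

```r
#hypothetical data frame with one character column and two numeric columns
df_mixed <- data.frame(id=c('a', 'b', 'c', 'd'),
                       var1=c(1, 2, 3, 4),
                       var2=c(10, 20, 30, 40))

#flag the numeric columns
num_cols <- sapply(df_mixed, is.numeric)

#calculate quartile cut points for the numeric columns only
mixed_quantiles <- sapply(df_mixed[num_cols], quantile, probs = c(.25, .5, .75))
mixed_quantiles

#    var1 var2
#25% 1.75 17.5
#50% 2.50 25.0
#75% 3.25 32.5
```

Extra arguments such as probs are passed through sapply() to quantile(), so no anonymous function is needed here.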

Example 3: Calculate Quantiles by Group

The following code shows how to use functions from the dplyr package to calculate quantiles by a grouping variable:

library(dplyr)

#define data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C'),
                 points=c(1, 3, 3, 4, 5, 7, 7, 8, 12, 14, 18))

#define quantiles of interest
q = c(.25, .5, .75)

#calculate quantiles by grouping variable
df %>%
  group_by(team) %>%
  summarize(quant25 = quantile(points, probs = q[1]), 
            quant50 = quantile(points, probs = q[2]),
            quant75 = quantile(points, probs = q[3]))

# A tibble: 3 x 4
  team  quant25 quant50 quant75
  <chr>   <dbl>   <dbl>   <dbl>
1 A         2.5       3    3.25
2 B         6.5       7    7.25
3 C          13      14      16
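
If dplyr is not available, the same grouped quantiles can be obtained with base R. The following is a sketch of one possible approach using split() and sapply(), not the only way to do it. The name group_quantiles is illustrative.

```r
#same data frame as in Example 3
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C'),
                 points=c(1, 3, 3, 4, 5, 7, 7, 8, 12, 14, 18))

#define quantiles of interest
q <- c(.25, .5, .75)

#split points by team, calculate the quantiles for each team,
#then transpose so that each team is a row
group_quantiles <- t(sapply(split(df$points, df$team), quantile, probs = q))
group_quantiles

#   25% 50%   75%
#A  2.5   3  3.25
#B  6.5   7  7.25
#C 13.0  14 16.00
```

The result is a matrix rather than a tibble, with team names as row names and the requested quantiles as columns.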
