The quantile() function in R calculates sample quantiles of a dataset. By default it returns the minimum, first quartile, median, third quartile, and maximum, and it can return any other percentile you request through the probs argument. This makes it useful for understanding the spread of a dataset and for generating descriptive statistics.
In statistics, quantiles are values that divide a ranked dataset into equal-sized groups. For example, the three quartiles split the data into four groups that each contain about a quarter of the observations.
This function uses the following basic syntax:
quantile(x, probs = seq(0, 1, 0.25), na.rm = FALSE)
where:
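- x: Numeric vector whose quantiles you want to calculate
- probs: Numeric vector of probabilities between 0 and 1 (the default returns the quartiles)
- na.rm: Whether to remove NA values before calculating (the default is FALSE)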
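The na.rm argument matters as soon as the data contains missing values. The following is a minimal sketch of that behavior; the vector scores is a hypothetical example and does not appear elsewhere in this tutorial.

```r
# Hypothetical vector with one missing value (an assumption for illustration)
scores <- c(4, 8, 15, NA, 16, 23, 42)

# With the default na.rm = FALSE, quantile() stops with an error,
# so we wrap the call in try() to keep the script running
res <- try(quantile(scores), silent = TRUE)
failed <- inherits(res, "try-error")
failed

# With na.rm = TRUE the NA is dropped before the quantiles are calculated
q_clean <- quantile(scores, na.rm = TRUE)
q_clean
```

Here failed is TRUE, while q_clean holds the five default quantiles of the six remaining values.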
The following examples show how to use this function in practice.
Example 1: Calculate Quantiles of a Vector
The following code shows how to calculate quantiles of a vector in R:
#define vector of data
data = c(1, 3, 3, 4, 5, 7, 8, 9, 12, 13, 13, 15, 18, 20, 22, 23, 24, 28)

#calculate quartiles
quantile(data, probs = seq(0, 1, 1/4))

  0%  25%  50%  75% 100% 
 1.0  5.5 12.5 19.5 28.0 

#calculate quintiles
quantile(data, probs = seq(0, 1, 1/5))

  0%  20%  40%  60%  80% 100% 
 1.0  4.4  8.8 13.4 21.2 28.0 

#calculate deciles
quantile(data, probs = seq(0, 1, 1/10))

  0%  10%  20%  30%  40%  50%  60%  70%  80%  90% 100% 
 1.0  3.0  4.4  7.1  8.8 12.5 13.4 17.7 21.2 23.3 28.0 

#calculate specific quantiles of interest
quantile(data, probs = c(.2, .5, .9))

 20%  50%  90% 
 4.4 12.5 23.3 
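Several of the values above, such as the first quartile of 5.5, do not occur in the data. This is because quantile() by default (type = 7) interpolates linearly between neighboring sorted values. The sketch below reproduces the first quartile by hand; the variable names x, n, p, h, lo, manual, and builtin are chosen for illustration only.

```r
# Same vector as in Example 1
data <- c(1, 3, 3, 4, 5, 7, 8, 9, 12, 13, 13, 15, 18, 20, 22, 23, 24, 28)

x <- sort(data)   # quantiles are defined on the ranked data
n <- length(x)    # 18 observations
p <- 0.25         # probability of interest

# Fractional position of the quantile under the default type 7 rule
h <- (n - 1) * p + 1
lo <- floor(h)

# Interpolate between the two neighboring order statistics
# (this simple version assumes p < 1, so that x[lo + 1] exists)
manual <- x[lo] + (h - lo) * (x[lo + 1] - x[lo])
manual

# Compare with the built-in result (names = FALSE drops the "25%" label)
builtin <- quantile(data, probs = p, names = FALSE)
all.equal(manual, builtin)
```

Here h is 5.25, so the result lies a quarter of the way from the 5th sorted value (5) to the 6th (7), which gives 5.5. The type argument of quantile() selects other definitions if a different convention is needed.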
Example 2: Calculate Quantiles of Columns in Data Frame
The following code shows how to calculate the quantiles of a specific column in a data frame:
#create data frame
df <- data.frame(var1=c(1, 3, 3, 4, 5, 7, 7, 8, 12, 14, 18),
                 var2=c(7, 7, 8, 3, 2, 6, 8, 9, 11, 11, 16),
                 var3=c(3, 3, 6, 6, 8, 4, 4, 7, 10, 10, 11))

#calculate quartiles of column 'var2'
quantile(df$var2, probs = seq(0, 1, 1/4))

  0%  25%  50%  75% 100% 
 2.0  6.5  8.0 10.0 16.0 
We can also use the sapply() function to calculate the quantiles of multiple columns at once:
#calculate quartiles of every column
sapply(df, function(x) quantile(x, probs = seq(0, 1, 1/4)))

     var1 var2 var3
0%    1.0  2.0    3
25%   3.5  6.5    4
50%   7.0  8.0    6
75%  10.0 10.0    9
100% 18.0 16.0   11
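The result of sapply() here is a numeric matrix with the quantile labels as row names and the column names of the data frame as column names, so individual values can be looked up by name. The following sketch stores the result in a variable called quartiles (a name chosen for illustration) and pulls out a few values.

```r
# Same data frame as above
df <- data.frame(var1 = c(1, 3, 3, 4, 5, 7, 7, 8, 12, 14, 18),
                 var2 = c(7, 7, 8, 3, 2, 6, 8, 9, 11, 11, 16),
                 var3 = c(3, 3, 6, 6, 8, 4, 4, 7, 10, 10, 11))

# Store the matrix of quartiles instead of only printing it
quartiles <- sapply(df, quantile, probs = seq(0, 1, 1/4))

# Median of var2, looked up by row and column name
median_var2 <- quartiles["50%", "var2"]
median_var2

# Interquartile range of every column: third quartile minus first quartile
iqr_all <- quartiles["75%", ] - quartiles["25%", ]
iqr_all
```

Looking values up by name rather than by position keeps the code readable and still works if the set of probabilities is changed later.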
Example 3: Calculate Quantiles by Group
The following code shows how to use functions from the dplyr package to calculate quantiles by a grouping variable:
library(dplyr)

#define data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C'),
                 points=c(1, 3, 3, 4, 5, 7, 7, 8, 12, 14, 18))

#define quantiles of interest
q = c(.25, .5, .75)

#calculate quantiles by grouping variable
df %>%
  group_by(team) %>%
  summarize(quant25 = quantile(points, probs = q[1]),
            quant50 = quantile(points, probs = q[2]),
            quant75 = quantile(points, probs = q[3]))

# A tibble: 3 x 4
  team  quant25 quant50 quant75
1 A         2.5       3    3.25
2 B         6.5       7    7.25
3 C        13        14   16   
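If dplyr is not available, the same grouped quantiles can be obtained with base R alone. The following is one possible sketch, not the approach used in the example above: it splits the points by team, applies quantile() to each group, and transposes the result so that each team is a row. The name by_team is chosen for illustration.

```r
# Same data as in Example 3
df <- data.frame(team = c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C'),
                 points = c(1, 3, 3, 4, 5, 7, 7, 8, 12, 14, 18))

# Quantiles of interest
q <- c(.25, .5, .75)

# split() gives one vector of points per team;
# sapply() applies quantile() to each and returns a matrix with teams as columns;
# t() transposes it so that teams become rows
by_team <- t(sapply(split(df$points, df$team), quantile, probs = q))
by_team
```

This version passes all probabilities to quantile() in a single call, so adding another quantile only requires changing q.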