How can I calculate deciles in R and what are some examples of their use?

Deciles are the nine cut points that divide a data set into ten equally sized groups. In R, deciles can be calculated with the quantile() function, which takes a data set and a vector of probabilities (for deciles, 0.1 through 0.9 in steps of 0.1). The function returns the values that separate the ten groups: the first decile is the 10th percentile, the second decile is the 20th percentile, and so on up to the ninth decile at the 90th percentile.

Deciles are useful for analyzing a data set because they give a finer breakdown of its distribution than the median or quartiles alone. For example, given a data set of exam scores, we can use deciles to identify the top 10% of students (scores above the ninth decile) and the bottom 10% of students (scores below the first decile). This can help in spotting unusually high or low values and in seeing how the scores are spread.
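As a minimal sketch of the exam-score idea, the snippet below flags the lowest and highest tenth of a small vector of scores. The vector scores and the names d1, d9, bottom, and top are made up for illustration.

```r
# Hypothetical exam scores (illustrative values only)
scores <- c(56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
            89, 90, 91, 92, 93, 93, 94, 95, 97, 99)

# First and ninth deciles (the 10th and 90th percentiles)
d1 <- quantile(scores, probs = 0.1)
d9 <- quantile(scores, probs = 0.9)

# Scores at or below the first decile, and at or above the ninth decile
bottom <- scores[scores <= d1]
top    <- scores[scores >= d9]
```

With these values, bottom holds 56 and 58, and top holds 97 and 99.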

Another example of using deciles is in market research, where deciles can be used to divide a population into ten income groups to understand the purchasing power of different segments.

In summary, calculating deciles in R allows for a more comprehensive analysis of data, providing insights into the distribution and characteristics of a data set.

Calculate Deciles in R (With Examples)


In statistics, deciles are numbers that split a dataset into ten groups of equal frequency.

The first decile is the point below which 10% of all data values lie. The second decile is the point below which 20% of all data values lie, and so on.

We can use the following syntax to calculate the deciles for a dataset in R:

quantile(data, probs = seq(.1, .9, by = .1))

The following example shows how to use this function in practice.

Example: Calculate Deciles in R

The following code shows how to create a fake dataset with 20 values and then calculate the values for the deciles of the dataset:

#create dataset
data <- c(56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
          89, 90, 91, 92, 93, 93, 94, 95, 97, 99)

#calculate deciles of dataset
quantile(data, probs = seq(.1, .9, by = .1))

 10%  20%  30%  40%  50%  60%  70%  80%  90% 
63.4 67.8 76.5 83.6 88.5 90.4 92.3 93.2 95.2 

The way to interpret the deciles is as follows:

  • 10% of all data values lie below 63.4.
  • 20% of all data values lie below 67.8.
  • 30% of all data values lie below 76.5.
  • 40% of all data values lie below 83.6.
  • 50% of all data values lie below 88.5.
  • 60% of all data values lie below 90.4.
  • 70% of all data values lie below 92.3.
  • 80% of all data values lie below 93.2.
  • 90% of all data values lie below 95.2.
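One way to check this interpretation is to compute, for each decile, the share of values that lie below it. The following is a small sketch on the same dataset; the names deciles and share_below are illustrative.

```r
# Same dataset as in the example above
data <- c(56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
          89, 90, 91, 92, 93, 93, 94, 95, 97, 99)

# The nine deciles
deciles <- quantile(data, probs = seq(.1, .9, by = .1))

# For each decile, the share of values that lie strictly below it
share_below <- sapply(deciles, function(d) mean(data < d))
share_below
```

On this dataset the shares come out to 0.1, 0.2, and so on up to 0.9. With tied values at a cut point, or a sample size that is not a multiple of ten, the shares will generally only be approximate.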

It’s worth noting that the value at the 50th percentile is equal to the median value of the dataset.
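This can be confirmed with a quick check against the median() function. The names med and q50 below are illustrative.

```r
# Same dataset as in the example above
data <- c(56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
          89, 90, 91, 92, 93, 93, 94, 95, 97, 99)

# Median and 50th percentile (unname() drops the "50%" label)
med <- median(data)
q50 <- unname(quantile(data, probs = 0.5))

# TRUE if the two values agree
all.equal(med, q50)
```

Both values are 88.5 for this dataset.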

Example: Place Values into Deciles in R

To place each data value into a decile, we can use the ntile(x, ngroups) function from the dplyr package in R.

Here’s how to use this function for the dataset we created in the previous example:

library(dplyr) 

#create dataset
data <- data.frame(values=c(56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
                            89, 90, 91, 92, 93, 93, 94, 95, 97, 99))

#place each value into a decile
data$decile <- ntile(data$values, 10)

#view data
data

   values decile
1      56      1
2      58      1
3      64      2
4      67      2
5      68      3
6      73      3
7      78      4
8      83      4
9      84      5
10     88      5
11     89      6
12     90      6
13     91      7
14     92      7
15     93      8
16     93      8
17     94      9
18     95      9
19     97     10
20     99     10

The way to interpret the output is as follows:

  • The data value 56 lies between the 0th and 10th percentiles, so it falls in the first decile.
  • The data value 58 lies between the 0th and 10th percentiles, so it falls in the first decile.
  • The data value 64 lies between the 10th and 20th percentiles, so it falls in the second decile.
  • The data value 67 lies between the 10th and 20th percentiles, so it falls in the second decile.
  • The data value 68 lies between the 20th and 30th percentiles, so it falls in the third decile.

The remaining values are interpreted in the same way.
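For readers who prefer not to load an extra package, a similar grouping can be sketched in base R by passing the decile boundaries to cut(). This is an alternative approach, not the method used above, and the names values, breaks, and decile are illustrative.

```r
# Same values as in the example above
values <- c(56, 58, 64, 67, 68, 73, 78, 83, 84, 88,
            89, 90, 91, 92, 93, 93, 94, 95, 97, 99)

# Decile boundaries, including the minimum (0%) and maximum (100%)
breaks <- quantile(values, probs = seq(0, 1, by = .1))

# Assign each value to the interval it falls in;
# include.lowest = TRUE keeps the minimum value in the first group
decile <- cut(values, breaks = breaks, labels = 1:10, include.lowest = TRUE)

# Convert the factor to integer group numbers 1 through 10
decile <- as.integer(decile)
decile
```

For this dataset the result matches the decile column shown above. The two approaches can differ in general: cut() groups by value and stops with an error when the boundaries are not unique (which can happen with many tied values), while ntile() groups by rank and always produces groups of nearly equal size.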
