How to Specify Histogram Breaks in R (With Examples)

To specify the breaks in an R histogram, use the “breaks” argument of the hist() function. One option is to pass a numeric vector of the desired breakpoints between the bins. For example, to place breaks at 0, 10, 20, and 30, use breaks = c(0, 10, 20, 30). The breakpoints must span the full range of the data. More examples can be found on the help page for hist(), which you can open by running ?hist in R.
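As a minimal sketch of this idea, the following code passes explicit breakpoints to hist(). The vector scores is made up for illustration, and plot = FALSE is used only so that the bins can be inspected without drawing the plot:

```r
# Hypothetical sample values, all between 0 and 30
scores <- c(1, 4, 7, 11, 12, 15, 18, 22, 25, 29)

# Explicit breakpoints define three bins: [0, 10], (10, 20], (20, 30]
h <- hist(scores, breaks = c(0, 10, 20, 30), plot = FALSE)

h$breaks  # 0 10 20 30
h$counts  # 3 4 3
```

Dropping plot = FALSE draws the histogram with the same three bins.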


By default, the hist() function in R uses Sturges’ Rule to determine how many bins to use in a histogram.

Sturges’ Rule uses the following formula to determine the optimal number of bins to use in a histogram:

Optimal Bins = ⌈log2(n) + 1⌉

where:

  • n: The total number of observations in the dataset.
  • ⌈ ⌉: Symbols that mean “ceiling” – i.e. round the answer up to the nearest integer.

For example, if there are 31 observations in a dataset, Sturges’ Rule gives the following optimal number of bins to use in a histogram:

Optimal Bins = ⌈log2(31) + 1⌉ = ⌈4.954 + 1⌉ = ⌈5.954⌉ = 6.

According to Sturges’ Rule, we should use 6 bins in the histogram to visualize this dataset.
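We can check this arithmetic directly in R. The sketch below computes the formula by hand and then compares it with nclass.Sturges() from the grDevices package. The vector passed to it is a placeholder, since only its length matters:

```r
# Number of observations from the example above
n <- 31

# Sturges' Rule computed by hand
manual_bins <- ceiling(log2(n) + 1)
manual_bins   # 6

# Same rule via the built-in helper (only the length of the vector is used)
builtin_bins <- grDevices::nclass.Sturges(seq_len(n))
builtin_bins  # 6
```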

If you use the hist() function in R, Sturges’ Rule will be used to automatically choose the number of bins to display in the histogram.

hist(data)

Even if you pass a single number to the breaks argument, R will only treat it as a “suggestion” for how many bins to use.

hist(data, breaks=7)

However, you can use the following code to force R to use a specific number of bins in a histogram:

#create histogram with 7 bins
hist(data, breaks = seq(min(data), max(data), length.out = 8))

Note: You must set length.out to n+1, where n is your desired number of bins, because n bins require n+1 breakpoints.
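If you need this in several places, one option is to wrap the seq() call in a small function. The name equal_width_breaks() and the vector x below are hypothetical and used only for illustration:

```r
# Hypothetical helper: returns n_bins + 1 equally spaced breakpoints
# that span the range of x
equal_width_breaks <- function(x, n_bins) {
  seq(min(x), max(x), length.out = n_bins + 1)
}

# Made-up values for illustration
x <- c(1, 2, 2, 5, 7, 8, 10)

b <- equal_width_breaks(x, n_bins = 3)
b                  # 1 4 7 10

# plot = FALSE returns the bin information without drawing the plot
h <- hist(x, breaks = b, plot = FALSE)
length(h$counts)   # 3
```

The one-line seq() call shown earlier does the same thing without a helper.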

The following example shows how to use this code in practice.

Example: Specify Histogram Breaks in R

#create vector of 16 values
data <- c(2, 3, 3, 3, 4, 4, 5, 6, 8, 10, 12, 14, 15, 18, 20, 21)

If we use the hist() function, R will create the following histogram with 5 bins:

#create histogram
hist(data)

Note: R used Sturges’ Rule to determine that 5 bins was the optimal number of bins to use to visualize a dataset with 16 observations, since ⌈log2(16) + 1⌉ = ⌈4 + 1⌉ = 5.

If we attempt to use the breaks argument to specify 7 bins to use in the histogram, R will only take this as a “suggestion” and instead choose to use 10 bins:

#attempt to create histogram with 7 bins
hist(data, breaks=7)
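The reason is that when breaks is a single number, hist() hands it to pretty(), which picks “round” breakpoints rather than an exact bin count. The sketch below repeats the data vector so that it runs on its own and shows the breakpoints pretty() returns for 5 and for 7 requested bins:

```r
# Same vector as in the example above
data <- c(2, 3, 3, 3, 4, 4, 5, 6, 8, 10, 12, 14, 15, 18, 20, 21)

# Sturges' Rule suggests 5 bins; pretty() returns 6 breakpoints (5 bins)
b5 <- pretty(range(data), n = 5)
b5   # 0 5 10 15 20 25

# Asking for 7 bins: pretty() settles on steps of 2, giving 11 breakpoints
b7 <- pretty(range(data), n = 7)
b7   # 2 4 6 8 10 12 14 16 18 20 22

# hist() ends up with the same 10 bins
h7 <- hist(data, breaks = 7, plot = FALSE)
length(h7$counts)   # 10
```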

However, we can use the following code to force R to use 7 bins in the histogram:

#create histogram with 7 bins
hist(data, breaks = seq(min(data), max(data), length.out = 8))

Notice that the result is a histogram with 7 equally-spaced bins.
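To confirm the bin count without reading it off the plot, we can inspect the object that hist() returns. This is a minimal check that repeats the data vector so it runs on its own and uses plot = FALSE to skip drawing:

```r
# Same vector as in the example above
data <- c(2, 3, 3, 3, 4, 4, 5, 6, 8, 10, 12, 14, 15, 18, 20, 21)

# 8 breakpoints from the minimum to the maximum define 7 bins
h <- hist(data, breaks = seq(min(data), max(data), length.out = 8), plot = FALSE)

n_bins <- length(h$counts)
n_bins            # 7

# Every bin has the same width: (21 - 2) / 7
bin_widths <- diff(h$breaks)
bin_widths
```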
