How to Calculate Bray-Curtis Dissimilarity in R

Bray-Curtis dissimilarity measures how different two samples are in composition. It is calculated by summing the absolute differences between the paired counts in the two samples and dividing by the sum of all counts in both samples. Base R's dist() function does not offer a Bray-Curtis method, but the value can be computed with a short base R expression or with the vegdist() function from the vegan package by setting the method argument to “bray”. It is often used in ecological studies to compare the species composition of different sites or communities.


The Bray-Curtis Dissimilarity is a way to measure the dissimilarity between two different sites.

It’s often used in ecology and biology to quantify how different two sites are in terms of the species found in those sites. 

It is calculated as:

BCij = 1 – (2*Cij) / (Si + Sj)

where:

  • Cij: The sum of the lesser counts for each species found at both sites (for each species, take the smaller of its two counts, then add these up).
  • Si: The total number of specimens counted at site i
  • Sj: The total number of specimens counted at site j

The Bray-Curtis Dissimilarity always ranges between 0 and 1 where:

  • 0 indicates that two sites have zero dissimilarity. In other words, they share the exact same number of each type of species.
  • 1 indicates that two sites have complete dissimilarity. In other words, they share none of the same type of species.
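
As a minimal sketch of the formula and its two boundary cases, the helper below takes two vectors of counts for the same species in the same order. The name bray_curtis and the small example vectors are assumptions made for illustration; this is not a built-in R function.

```r
# Hypothetical helper: Bray-Curtis dissimilarity between two count vectors.
# x and y hold counts of the same species, in the same order, at sites i and j.
bray_curtis <- function(x, y) {
  c_ij <- sum(pmin(x, y))  # Cij: sum of the lesser count for each species
  s_i  <- sum(x)           # Si: total specimens at site i
  s_j  <- sum(y)           # Sj: total specimens at site j
  1 - (2 * c_ij) / (s_i + s_j)
}

# Identical counts at both sites give a dissimilarity of 0
same <- bray_curtis(c(5, 2, 0), c(5, 2, 0))

# Sites with no species in common give a dissimilarity of 1
disjoint <- bray_curtis(c(5, 2, 0), c(0, 0, 7))
```

Using pmin() keeps the code close to the definition of Cij, since it returns the element-wise smaller count for each species.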

For example, suppose a botanist goes out and counts the individuals of five different plant species (A, B, C, D, and E) at two different sites.

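The following table summarizes the data she collected:

Species   Site 1   Site 2
A              4        3
B              0        6
C              2        0
D              7        4
E              8       11
Total         21       24

Using this data, she can calculate the inputs to the Bray-Curtis dissimilarity as:

  • Cij = 3 + 0 + 0 + 4 + 8 = 15 (the lesser count for each species)
  • Si = 4 + 0 + 2 + 7 + 8 = 21
  • Sj = 3 + 6 + 0 + 4 + 11 = 24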

Plugging these numbers into the Bray-Curtis dissimilarity formula, we get:

  • BCij = 1 – (2*Cij) / (Si + Sj)
  • BCij = 1 – (2*15) / (21 + 24)
  • BCij = 0.33

The Bray-Curtis dissimilarity between these two sites is 0.33.

The following example shows how to calculate Bray-Curtis dissimilarity in R.

Example: Calculating Bray-Curtis Dissimilarity in R

First, let’s create the following data frame in R to hold our data values:

#create data frame
df <- data.frame(A=c(4, 3),
                 B=c(0, 6),
                 C=c(2, 0),
                 D=c(7, 4),
                 E=c(8, 11))

#view data frame
df

  A B C D  E
1 4 0 2 7  8
2 3 6 0 4 11

We can use the following code to calculate the Bray-Curtis dissimilarity between the two rows of the data frame:

#calculate Bray-Curtis dissimilarity
sum(apply(df, 2, function(x) abs(max(x)-min(x)))) / sum(rowSums(df))

[1] 0.3333333

The Bray-Curtis dissimilarity turns out to be 0.33.

This matches the value that we calculated earlier by hand.
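
The one-liner above divides the sum of absolute differences by the combined total rather than using Cij directly. The two forms agree because, for any two counts a and b, |a – b| = a + b – 2*min(a, b), so summing over all species gives (Si + Sj – 2*Cij) / (Si + Sj). A quick check of this, as a sketch with the same counts entered as two vectors (site1 and site2 are illustrative names):

```r
# Counts for the two sites from the example
site1 <- c(A = 4, B = 0, C = 2, D = 7, E = 8)
site2 <- c(A = 3, B = 6, C = 0, D = 4, E = 11)

# Form 1: the formula from the text, 1 - 2*Cij / (Si + Sj)
bc_formula <- 1 - 2 * sum(pmin(site1, site2)) / (sum(site1) + sum(site2))

# Form 2: sum of absolute differences over the combined total
bc_absdiff <- sum(abs(site1 - site2)) / sum(site1 + site2)

# Compare the two results (up to floating-point tolerance)
all.equal(bc_formula, bc_absdiff)

[1] TRUE
```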

Note: This code works only when the data frame has exactly two rows, with each row representing a distinct site and each column representing a species.
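
If the data frame holds more than two sites, the max/min one-liner no longer applies, since it compares the extremes of each column rather than a specific pair of rows. One possible sketch in base R loops over every pair of rows instead. Here bray_curtis_matrix is a hypothetical helper name, and the third row of df3 is made-up data added only for illustration. If the vegan package is installed, its vegdist() function with method = "bray" performs the same kind of pairwise calculation.

```r
# Hypothetical helper: pairwise Bray-Curtis dissimilarity for all rows (sites)
bray_curtis_matrix <- function(counts) {
  m <- as.matrix(counts)
  n <- nrow(m)
  out <- matrix(0, n, n, dimnames = list(rownames(m), rownames(m)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # Sum of absolute differences over the combined total of the two sites
      out[i, j] <- sum(abs(m[i, ] - m[j, ])) / sum(m[i, ] + m[j, ])
    }
  }
  out
}

# First two rows are the sites from the example; the third row is made up
df3 <- data.frame(A = c(4, 3, 0),
                  B = c(0, 6, 5),
                  C = c(2, 0, 1),
                  D = c(7, 4, 0),
                  E = c(8, 11, 2))

# Square matrix: entry [i, j] is the dissimilarity between site i and site j
bc <- bray_curtis_matrix(df3)
round(bc, 2)
```

The entry in row 1, column 2 of the result is the dissimilarity between the two original sites, so it should again come out to about 0.33.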
