How to Calculate Matthews Correlation Coefficient in R

The Matthews correlation coefficient (MCC) measures the correlation between predicted and actual binary classes and is calculated from the four cells of a confusion matrix. The numerator is the product of the true positives and true negatives minus the product of the false positives and false negatives. The denominator is the square root of the product of four sums: true positives plus false positives, true positives plus false negatives, true negatives plus false positives, and true negatives plus false negatives. The value ranges from -1 to 1, where 1 represents a perfect prediction and -1 represents a perfectly inverted prediction.


Matthews correlation coefficient (MCC) is a metric we can use to assess the performance of a classification model.

It is calculated as:

MCC = (TP*TN – FP*FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN))

where:

  • TP: Number of true positives
  • TN: Number of true negatives
  • FP: Number of false positives
  • FN: Number of false negatives

This metric is particularly useful when the two classes are imbalanced – that is, one class appears much more than the other.
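To illustrate why this matters, here is a minimal base R sketch using hypothetical counts (made up for illustration, not taken from any real model). With 20 positives and 380 negatives, a model that finds only 2 of the positives can still look accurate, while its MCC stays low.

```r
# Hypothetical counts for an imbalanced data set: 20 positives, 380 negatives
TP <- 2    # positives correctly identified
FN <- 18   # positives missed
FP <- 2    # negatives wrongly flagged as positive
TN <- 378  # negatives correctly identified

# Accuracy is dominated by the large negative class
accuracy <- (TP + TN) / (TP + TN + FP + FN)

# MCC uses all four cells of the confusion matrix
mcc_value <- (TP * TN - FP * FN) /
  sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

accuracy   # 0.95
mcc_value  # roughly 0.21
```

Accuracy alone suggests a strong model here, but the MCC shows that the predictions are only weakly related to the actual classes.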

The value for MCC ranges from -1 to 1 where:

  • -1 indicates total disagreement between predicted classes and actual classes
  • 0 indicates predictions that are no better than random guessing
  • 1 indicates total agreement between predicted classes and actual classes
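For two binary vectors, MCC is equal to the Pearson correlation between them, so the three reference points above can be sketched with base R's cor() function. The tiny vectors below are hypothetical and chosen only to hit each case.

```r
# Hypothetical actual classes for four observations
actual <- c(1, 1, 0, 0)

# Three hypothetical sets of predictions
perfect   <- c(1, 1, 0, 0)  # matches every actual class
inverted  <- c(0, 0, 1, 1)  # gets every actual class wrong
unrelated <- c(1, 0, 1, 0)  # right half the time in each class

mcc_perfect   <- cor(actual, perfect)
mcc_inverted  <- cor(actual, inverted)
mcc_unrelated <- cor(actual, unrelated)

mcc_perfect    # 1
mcc_inverted   # -1
mcc_unrelated  # 0
```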

For example, suppose a sports analyst uses a classification model to predict whether or not 400 different college basketball players get drafted into the NBA.

The following confusion matrix summarizes the predictions made by the model:

                           Actual: Drafted   Actual: Not drafted
  Predicted: Drafted             15                   5
  Predicted: Not drafted          5                 375

To calculate the MCC of the model, we can use the following formula:

  • MCC = (TP*TN – FP*FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN))
  • MCC = (15*375 – 5*5) / √((15+5)(15+5)(375+5)(375+5))
  • MCC = 0.7368

Matthews correlation coefficient turns out to be 0.7368.

This value is somewhat close to one, which indicates that the model does a decent job of predicting whether or not players will get drafted.
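The hand calculation above can be checked step by step in base R. This is only a sketch of the arithmetic; the variable names are chosen for illustration.

```r
# Counts from the basketball draft example
TP <- 15
TN <- 375
FP <- 5
FN <- 5

# Numerator: TP*TN - FP*FN
numerator <- TP * TN - FP * FN

# Denominator: square root of the product of the four sums
denominator <- sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

numerator                # 5600
denominator              # 7600
numerator / denominator  # 0.7368421
```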

The following example shows how to calculate MCC for this exact scenario using the mcc() function from the mltools package in R.

Example: Calculating Matthews Correlation Coefficient in R

library(mltools)

#define vector of actual classes
actual <- rep(c(1, 0), times=c(20, 380))

#define vector of predicted classes
preds <- rep(c(1, 0, 1, 0), times=c(15, 5, 5, 375))

#calculate Matthews correlation coefficient
mcc(preds, actual)

[1] 0.7368421

Matthews correlation coefficient is 0.7368.

This matches the value that we calculated earlier by hand.

If you’d like to calculate Matthews correlation coefficient for a confusion matrix, you can use the confusionM argument as follows:

library(mltools)

#create confusion matrix
conf_matrix <- matrix(c(15, 5, 5, 375), nrow=2)

#view confusion matrix
conf_matrix

     [,1] [,2]
[1,]   15    5
[2,]    5  375

#calculate Matthews correlation coefficient for confusion matrix
mcc(confusionM = conf_matrix)

[1] 0.7368421

Once again, the Matthews correlation coefficient is 0.7368.
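If only the vectors of predicted and actual classes are available, the four confusion matrix counts can also be tabulated with base R's table() function. The sketch below is an illustration that does not rely on mltools; it assumes that 1 marks the positive class, and the names conf_tab and mcc_value are chosen here for convenience.

```r
# Same vectors as in the example above
actual <- rep(c(1, 0), times = c(20, 380))
preds  <- rep(c(1, 0, 1, 0), times = c(15, 5, 5, 375))

# Cross-tabulate: rows are predicted classes, columns are actual classes
conf_tab <- table(predicted = preds, actual = actual)

# Extract the four counts (assuming 1 = positive class);
# as.numeric() avoids integer overflow when the counts are multiplied
TP <- as.numeric(conf_tab["1", "1"])
TN <- as.numeric(conf_tab["0", "0"])
FP <- as.numeric(conf_tab["1", "0"])
FN <- as.numeric(conf_tab["0", "1"])

# Apply the MCC formula to the tabulated counts
mcc_value <- (TP * TN - FP * FN) /
  sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

mcc_value  # 0.7368421
```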
