How can the MSE (Mean Squared Error) be calculated in R?

The Mean Squared Error (MSE) is a common measure of the accuracy of a prediction model. To calculate it in R, take the predicted values and the actual values, square the difference between each pair, and average those squared differences. Base R functions such as mean() and sum() are all that is needed. Because a lower MSE means the predictions sit closer to the actual values, the MSE can be used to compare different models fitted to the same data and to pick the most accurate one.

Calculate MSE in R


One of the most common metrics used to measure the prediction accuracy of a model is MSE, which stands for mean squared error. It is calculated as:

MSE = (1/n) * Σ(actual - prediction)²

where:

  • Σ – a fancy symbol that means “sum”
  • n – sample size
  • actual – the actual data value
  • prediction – the predicted data value

The lower the value for MSE, the more accurately a model is able to predict values.
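
As a quick check of the formula, here is a minimal sketch using three made-up values. The vectors actual and prediction and the variable names are illustrative assumptions, not data from any real model:

```r
# hypothetical values chosen only to illustrate the formula
actual     <- c(3, 5, 7)
prediction <- c(2, 5, 9)

# squared differences: (3-2)^2, (5-5)^2, (7-9)^2
sq_diff <- (actual - prediction)^2
sq_diff
# [1] 1 0 4

# MSE = (1/n) * sum of squared differences, here 5/3
n <- length(actual)
mse_manual <- (1 / n) * sum(sq_diff)
mse_manual

# mean() does the sum and the division by n in one step
mse_mean <- mean(sq_diff)
mse_mean
```

Both mse_manual and mse_mean should come out to 5/3, since the squared differences 1, 0 and 4 sum to 5 and n is 3.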

How to Calculate MSE in R

Depending on what format your data is in, there are two easy methods you can use to calculate the MSE of a regression model in R.

Method 1: Calculate MSE from Regression Model

In one scenario, you may have a fitted regression model and would simply like to calculate the MSE of the model. For example, you may have the following regression model:

#load mtcars dataset
data(mtcars)

#fit regression model
model <- lm(mpg ~ disp + hp, data = mtcars)

#get model summary
model_summ <- summary(model)

To calculate the MSE for this model, take the mean of its squared residuals:

#calculate MSE
mean(model_summ$residuals^2)

[1] 8.85917

This tells us that the MSE is 8.85917.
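
The same number can be obtained without the summary object, because residuals() works directly on a fitted lm model. The sketch below also shows a variant that divides by the residual degrees of freedom instead of n, which some regression texts also call the MSE. Which definition is wanted depends on the context, so treat this as an illustration under that assumption rather than a recommendation:

```r
# refit the model so this block runs on its own
model <- lm(mpg ~ disp + hp, data = mtcars)

# MSE as defined above: the average squared residual (divides by n)
mse_n <- mean(residuals(model)^2)

# variant that divides by the residual degrees of freedom
# (n minus the number of estimated coefficients)
mse_df <- sum(residuals(model)^2) / df.residual(model)

mse_n
mse_df

# the variant equals the squared residual standard error of the model
all.equal(mse_df, sigma(model)^2)
```

The degrees-of-freedom version is always somewhat larger than the plain average, and the gap shrinks as the sample size grows relative to the number of coefficients.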

Method 2: Calculate MSE from a List of Predicted and Actual Values

In another scenario, you may simply have a set of predicted values and the corresponding actual values, for example as two columns of a data frame:

#create data frame with a column of actual values and a column of predicted values
data <- data.frame(pred = predict(model), actual = mtcars$mpg)

#view first six lines of data
head(data)

                      pred actual
Mazda RX4         23.14809   21.0
Mazda RX4 Wag     23.14809   21.0
Datsun 710        25.14838   22.8
Hornet 4 Drive    20.17416   21.4
Hornet Sportabout 15.46423   18.7
Valiant           21.29978   18.1

In this case, you can calculate the MSE directly from the two columns:

#calculate MSE
mean((data$actual - data$pred)^2)

[1] 8.85917

This tells us that the MSE is 8.85917, which matches the MSE that we calculated using the previous method.
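
If the calculation is needed repeatedly, for instance to compare models, it can be wrapped in a small helper. The function name mse, its na.rm argument, and the second model (model_small) are hypothetical choices made for this sketch; none of them is part of base R or of the example above:

```r
# hypothetical helper, not a base R function
mse <- function(actual, predicted, na.rm = FALSE) {
  # both vectors must line up element by element
  stopifnot(length(actual) == length(predicted))
  mean((actual - predicted)^2, na.rm = na.rm)
}

# the model from this tutorial and a smaller one for comparison
model_full  <- lm(mpg ~ disp + hp, data = mtcars)
model_small <- lm(mpg ~ disp, data = mtcars)

# MSE of each model on the data it was fitted to
mse_full  <- mse(mtcars$mpg, predict(model_full))
mse_small <- mse(mtcars$mpg, predict(model_small))

mse_full
mse_small
```

Because both models are evaluated on the same data they were fitted to, the larger model cannot have a higher MSE than the smaller one nested inside it. Comparing MSE on held-out data usually gives a fairer picture of predictive accuracy.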
