How can the Mean Squared Error (MSE) be calculated in Python?

The Mean Squared Error (MSE) is a commonly used metric for measuring the accuracy of a regression model. In Python, the MSE is calculated by taking the average of the squared differences between the predicted and actual values of a dataset. This can be done with a few lines of NumPy, or with pre-built functions from popular machine learning libraries such as scikit-learn or TensorFlow. Either way, the calculation involves only simple mathematical operations and is easy to implement.
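As a quick illustration of the library route mentioned above, scikit-learn provides a ready-made mean_squared_error function (this sketch assumes scikit-learn is installed; the sample values are the ones used throughout this article):

```python
# Computing MSE with scikit-learn's built-in function
from sklearn.metrics import mean_squared_error

actual = [12, 13, 14, 15, 15, 22, 27]
pred = [11, 13, 14, 14, 15, 16, 18]

print(mean_squared_error(actual, pred))  # 17.0
```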

Calculate Mean Squared Error (MSE) in Python


The mean squared error (MSE) is a common way to measure the prediction accuracy of a model. It is calculated as:

MSE = (1/n) * Σ(actual – prediction)²

where:

  • Σ – a fancy symbol that means “sum”
  • n – sample size
  • actual – the actual data value
  • prediction – the predicted data value

The lower the value for MSE, the better a model is able to predict values accurately.
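Before wrapping the formula in a function, it can help to see it applied step by step. The following sketch translates each part of the formula into plain Python, using small toy numbers chosen purely for illustration:

```python
# Step-by-step translation of the MSE formula using plain Python
actual = [10, 12, 14]
prediction = [11, 11, 16]

# Σ(actual – prediction)²: square each difference, then sum
squared_errors = [(a - p) ** 2 for a, p in zip(actual, prediction)]  # [1, 1, 4]

# (1/n) * Σ: divide the sum by the sample size n
mse = sum(squared_errors) / len(actual)  # (1 + 1 + 4) / 3
print(mse)  # 2.0
```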

How to Calculate MSE in Python

We can create a simple function to calculate MSE in Python:

import numpy as np

def mse(actual, pred):
    # Convert inputs to NumPy arrays, then average the squared differences
    actual, pred = np.array(actual), np.array(pred)
    return np.square(np.subtract(actual, pred)).mean()

We can then use this function to calculate the MSE for two arrays: one that contains the actual data values and one that contains the predicted data values.

actual = [12, 13, 14, 15, 15, 22, 27]
pred = [11, 13, 14, 14, 15, 16, 18]

mse(actual, pred)

17.0

The mean squared error (MSE) for this model turns out to be 17.0.

In practice, the root mean squared error (RMSE) is more commonly used to assess model accuracy. As the name implies, it’s simply the square root of the mean squared error.

We can define a similar function to calculate RMSE:

import numpy as np

def rmse(actual, pred):
    # RMSE is the square root of the mean of the squared differences
    actual, pred = np.array(actual), np.array(pred)
    return np.sqrt(np.square(np.subtract(actual, pred)).mean())

We can then use this function to calculate the RMSE for two arrays: one that contains the actual data values and one that contains the predicted data values.

actual = [12, 13, 14, 15, 15, 22, 27]
pred = [11, 13, 14, 14, 15, 16, 18]

rmse(actual, pred)

4.1231

The root mean squared error (RMSE) for this model turns out to be 4.1231.
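As a sanity check, the same RMSE can be reproduced by combining scikit-learn's mean_squared_error with NumPy's square root (this sketch assumes scikit-learn is installed):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

actual = [12, 13, 14, 15, 15, 22, 27]
pred = [11, 13, 14, 14, 15, 16, 18]

# Take the square root of the MSE to get the RMSE
rmse_value = np.sqrt(mean_squared_error(actual, pred))
print(round(rmse_value, 4))  # 4.1231
```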

Additional Resources

Mean Squared Error (MSE) Calculator
How to Calculate Mean Squared Error (MSE) in Excel
