How can I calculate the Root Mean Squared Error (RMSE) in Python?

Root Mean Squared Error (RMSE) is a commonly used metric for evaluating the performance of a predictive model. It measures the typical size of the difference between the predicted values and the actual values of a dataset, expressed in the same units as the values being predicted. In Python, the RMSE can be calculated by first obtaining the squared differences between the predicted and actual values, then taking the square root of the mean of these squared differences. This can be done with NumPy array operations or with the scikit-learn library’s “mean_squared_error” function. By calculating and comparing the RMSE values of different models on the same dataset, one can see which model’s predictions are closest to the actual values.

Calculate RMSE in Python


The root mean square error (RMSE) is a metric that tells us how far apart our predicted values are from our observed values in a model, on average. It is calculated as:

RMSE = √[ Σ(Pi – Oi)² / n ]

where:

  • Σ is a fancy symbol that means “sum”
  • Pi is the predicted value for the ith observation
  • Oi is the observed value for the ith observation
  • n is the sample size
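
The formula maps almost term by term onto NumPy operations. The following is a minimal sketch rather than a definitive implementation; the function name rmse and the three sample values are made up for illustration.

```python
import numpy as np

def rmse(predicted, observed):
    """Square root of the mean squared difference between two equal-length sequences."""
    p = np.asarray(predicted, dtype=float)  # Pi
    o = np.asarray(observed, dtype=float)   # Oi
    if p.shape != o.shape:
        raise ValueError("predicted and observed must have the same shape")
    squared_diffs = (p - o) ** 2            # (Pi - Oi)^2 for every observation
    # mean() is the sum divided by n; the square root completes the formula
    return float(np.sqrt(squared_diffs.mean()))

# Illustrative values only (hypothetical)
print(rmse([2, 5, 9], [3, 5, 7]))
```

Here the squared differences are 1, 0, and 4, so the result is √(5/3), or roughly 1.291.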

This tutorial explains a simple method to calculate RMSE in Python.

Example: Calculate RMSE in Python

Suppose we have the following arrays of actual and predicted values:

actual = [34, 37, 44, 47, 48, 48, 46, 43, 32, 27, 26, 24]
pred = [37, 40, 46, 44, 46, 50, 45, 44, 34, 30, 22, 23]

To calculate the RMSE between the actual and predicted values, we can simply take the square root of the value returned by the mean_squared_error() function from the sklearn.metrics module:

# import necessary libraries
from sklearn.metrics import mean_squared_error
from math import sqrt

# calculate RMSE as the square root of the mean squared error
sqrt(mean_squared_error(actual, pred))

2.4324199198

The RMSE turns out to be 2.4324.
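
As a cross-check, the same figure can be reproduced from the formula with plain Python, without scikit-learn. This is only a sketch of the arithmetic; the variable names sse, mse, and rmse_value are chosen here for illustration.

```python
from math import sqrt

actual = [34, 37, 44, 47, 48, 48, 46, 43, 32, 27, 26, 24]
pred = [37, 40, 46, 44, 46, 50, 45, 44, 34, 30, 22, 23]

# Squared difference for each observation
squared_errors = [(p - a) ** 2 for p, a in zip(pred, actual)]

sse = sum(squared_errors)   # sum of squared errors
mse = sse / len(actual)     # mean squared error (sum divided by n)
rmse_value = sqrt(mse)      # root mean squared error

print(sse, round(mse, 4), round(rmse_value, 4))  # 71 5.9167 2.4324
```

The squared errors sum to 71 over 12 observations, giving an MSE of about 5.9167 and the same RMSE of 2.4324.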

How to Interpret RMSE

RMSE is a useful way to see how well a model is able to fit a dataset. The larger the RMSE, the larger the difference between the predicted and observed values, which means the worse a model fits the data. Conversely, the smaller the RMSE, the better a model is able to fit the data.

It can be particularly useful to compare the RMSE of two different models with each other to see which model fits the data better.
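
Such a comparison might look like the sketch below. The rmse helper and the second set of predictions (pred_b) are hypothetical, invented purely for illustration; only actual and pred_a come from the example above.

```python
import numpy as np

def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    p = np.asarray(predicted, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((p - o) ** 2)))

actual = [34, 37, 44, 47, 48, 48, 46, 43, 32, 27, 26, 24]

# Predictions from the example above
pred_a = [37, 40, 46, 44, 46, 50, 45, 44, 34, 30, 22, 23]
# Hypothetical predictions from a second model (invented for illustration)
pred_b = [35, 38, 43, 46, 47, 49, 46, 42, 33, 28, 25, 25]

scores = {"model A": rmse(pred_a, actual), "model B": rmse(pred_b, actual)}
best = min(scores, key=scores.get)  # name of the model with the lowest RMSE

for name, score in scores.items():
    print(f"{name}: RMSE = {score:.4f}")
print("Lower RMSE:", best)
```

With these made-up numbers, model B has the lower RMSE and therefore the closer fit. The comparison is only meaningful when both models are scored against the same actual values.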

Additional Resources

RMSE Calculator
How to Calculate Mean Squared Error (MSE) in Python
How to Calculate MAPE in Python
