Table of Contents

A residual plot can be created using Python by first fitting a regression model to the data using a suitable library, such as statsmodels or scikit-learn. Then, the residuals (i.e. the difference between the actual values and the predicted values) can be calculated and plotted against the independent variable(s) used in the regression. This can be done using the matplotlib library in Python. The resulting plot will show the distribution of the residuals and can be used to assess the goodness of fit of the regression model. It can also help identify any patterns or outliers in the data that may affect the accuracy of the model. Overall, creating a residual plot using Python can provide valuable insights into the performance of a regression model and aid in its improvement.

Create a Residual Plot in Python

A residual plot is a type of plot that displays the fitted values against the residual values for a .

This type of plot is often used to assess whether or not a linear regression model is appropriate for a given dataset and to check for of residuals.

This tutorial explains how to create a residual plot for a linear regression model in Python.

Example: Residual Plot in Python

For this example we’ll use a dataset that describes the attributes of 10 basketball players:

import numpy as np
import pandas as pd

#create dataset
df = pd.DataFrame({'rating': [90, 85, 82, 88, 94, 90, 76, 75, 87, 86],
                   'points': [25, 20, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 8, 5, 7, 6, 9, 9, 5],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

#view dataset
df

	rating	points	assists	rebounds
0	90	25	5	11
1	85	20	7	8
2	82	14	7	10
3	88	16	8	6
4	94	27	5	6
5	90	20	7	9
6	76	12	6	6
7	75	15	9	10
8	87	14	9	10
9	86	19	5	7

Residual Plot for Simple Linear Regression

Suppose we fit a simple linear regression model using points as the predictor variable and rating as the response variable:

#import necessary libraries import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.formula.api import ols

#fit simple linear regression model
model = ols('rating ~ points', data=df).fit()

#view model summary
print(model.summary())

We can create a residual vs. fitted plot by using the from the statsmodels library:

#define figure size
fig = plt.figure(figsize=(12,8))

#produce regression plots
fig = sm.graphics.plot_regress_exog(model, 'points', fig=fig)

Four plots are produced. The one in the top right corner is the residual vs. fitted plot. The x-axis on this plot shows the actual values for the predictor variable points and the y-axis shows the residual for that value.

Since the residuals appear to be randomly scattered around zero, this is an indication that heteroscedasticity is not a problem with the predictor variable.

Residual Plots for Multiple Linear Regression

Suppose we instead fit a multiple linear regression model using assists and rebounds as the predictor variable and rating as the response variable:

#fit multiple linear regression model
model = ols('rating ~ assists + rebounds', data=df).fit()

#view model summary
print(model.summary())

For example, here’s what the residual vs. predictor plot looks like for the predictor variable assists:

#create residual vs. predictor plot for 'assists'
fig = plt.figure(figsize=(12,8))
fig = sm.graphics.plot_regress_exog(model, 'assists', fig=fig)

Residual vs. fitted plot

And here’s what the residual vs. predictor plot looks like for the predictor variable rebounds:

#create residual vs. predictor plot for 'assists'
fig = plt.figure(figsize=(12,8))
fig = sm.graphics.plot_regress_exog(model, 'rebounds', fig=fig)

In both plots the residuals appear to be randomly scattered around zero, which is an indication that heteroscedasticity is not a problem with either predictor variable in the model.

How can a residual plot be created using Python?

Create a Residual Plot in Python

Example: Residual Plot in Python

Residual Plot for Simple Linear Regression

Residual Plots for Multiple Linear Regression

Requst a

Scale

Example: Residual Plot in Python

Residual Plot for Simple Linear Regression

Residual Plots for Multiple Linear Regression

Related terms:

Requst a

Scale