How can I create a scatterplot with a regression line in Python?

To create a scatterplot with a regression line in Python, plot the data points with Matplotlib, fit a line to them with a library such as NumPy or SciPy, and draw the fitted line on the same axes. The regression line summarizes the linear relationship between the two variables, which makes any linear trend in the data easy to see.

Create a Scatterplot with a Regression Line in Python


When you perform simple linear regression, you often want a scatterplot that shows the observed (x, y) pairs along with the estimated regression line.

Fortunately there are two easy ways to create this type of plot in Python. This tutorial explains both methods using the following data:

import numpy as np

#create data
x = np.array([1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9])
y = np.array([13, 14, 17, 12, 23, 24, 25, 25, 24, 28, 32, 33])

Method 1: Using Matplotlib

The following code shows how to create a scatterplot with an estimated regression line for this data using Matplotlib:

import matplotlib.pyplot as plt

#create basic scatterplot
plt.plot(x, y, 'o')

#obtain m (slope) and b (intercept) of linear regression line
m, b = np.polyfit(x, y, 1)

#add linear regression line to scatterplot 
plt.plot(x, m*x+b)

[Figure: Scatterplot with regression line in Python]
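
To make clearer what np.polyfit(x, y, 1) returns, the following sketch recomputes the slope and intercept from the closed-form least-squares formulas and compares them with the polyfit output. The helper name least_squares_line is hypothetical and introduced only for illustration; it is not part of NumPy.

```python
import numpy as np

# same data as in the tutorial
x = np.array([1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9])
y = np.array([13, 14, 17, 12, 23, 24, 25, 25, 24, 28, 32, 33])


def least_squares_line(x, y):
    """Hypothetical helper: slope and intercept of the ordinary least-squares line."""
    x_mean, y_mean = x.mean(), y.mean()
    # slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2)
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # the fitted line always passes through the point (x_mean, y_mean)
    intercept = y_mean - slope * x_mean
    return slope, intercept


# degree-1 polyfit returns the coefficients from highest power to lowest
m, b = np.polyfit(x, y, 1)
m_manual, b_manual = least_squares_line(x, y)

print(f"polyfit: slope={m:.4f}, intercept={b:.4f}")
print(f"manual:  slope={m_manual:.4f}, intercept={b_manual:.4f}")
```

Both approaches should agree up to floating-point rounding, which is why the tutorial can plot m*x+b directly as the regression line.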

Feel free to modify the colors of the graph as you’d like. For example, here’s how to change the individual points to green and the line to red:

#use green as color for individual points
plt.plot(x, y, 'o', color='green')

#obtain m (slope) and b (intercept) of linear regression line
m, b = np.polyfit(x, y, 1)

#use red as color for regression line
plt.plot(x, m*x+b, color='red')

[Figure: Scatterplot with regression line in numpy]
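
If you reuse this plot often, the Matplotlib steps can be wrapped in a small function. The sketch below is one possible way to do that, not the only one: the function name scatter_with_fit and its parameters are hypothetical, and it uses ax.scatter in place of plt.plot(x, y, 'o') and draws the line between the smallest and largest x values, which gives the same straight line.

```python
import numpy as np
import matplotlib.pyplot as plt

# same data as in the tutorial
x = np.array([1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9])
y = np.array([13, 14, 17, 12, 23, 24, 25, 25, 24, 28, 32, 33])


def scatter_with_fit(x, y, point_color="green", line_color="red"):
    """Hypothetical helper: scatterplot plus fitted least-squares line."""
    fig, ax = plt.subplots()

    # individual observations
    ax.scatter(x, y, color=point_color, label="observations")

    # slope and intercept of the fitted line
    m, b = np.polyfit(x, y, 1)

    # two endpoints are enough to draw a straight line
    x_line = np.array([x.min(), x.max()])
    ax.plot(x_line, m * x_line + b, color=line_color,
            label=f"y = {m:.2f}x + {b:.2f}")

    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend()
    return fig, ax


fig, ax = scatter_with_fit(x, y)
```

When running this as a script rather than in a notebook, call plt.show() or fig.savefig(...) afterwards to display or save the figure.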

Method 2: Using Seaborn

You can also use the regplot() function from the Seaborn visualization library to create a scatterplot with a regression line:

import seaborn as sns

#create scatterplot with regression line
sns.regplot(x=x, y=y, ci=None)

[Figure: Scatterplot with regression line in seaborn Python]

Note that ci=None tells Seaborn to hide the confidence interval bands on the plot. You can choose to show them if you’d like, though:

import seaborn as sns

#create scatterplot with regression line and confidence interval lines
sns.regplot(x=x, y=y)
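
The effect of the ci argument can be checked directly on the returned axes. The sketch below draws both variants side by side and also compares Seaborn's fitted line with the np.polyfit coefficients, since regplot() does not return the slope and intercept itself. It assumes a recent Seaborn release, where x and y are passed as keyword arguments; the seed value is arbitrary and only makes the bootstrapped band reproducible.

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# same data as in the tutorial
x = np.array([1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9])
y = np.array([13, 14, 17, 12, 23, 24, 25, 25, 24, 28, 32, 33])

fig, (ax_plain, ax_ci) = plt.subplots(1, 2, figsize=(10, 4))

# left panel: regression line only
sns.regplot(x=x, y=y, ci=None, ax=ax_plain)
ax_plain.set_title("ci=None")

# right panel: 95% confidence band estimated by bootstrapping
sns.regplot(x=x, y=y, ci=95, seed=0, ax=ax_ci)
ax_ci.set_title("ci=95")

# regplot does not return the coefficients, so fit them separately
m, b = np.polyfit(x, y, 1)

# the line Seaborn drew should match the NumPy fit
line_x, line_y = ax_plain.lines[0].get_data()
print(np.allclose(line_y, m * line_x + b))
```

The confidence band appears as an extra shaded region on the right-hand axes, while the fitted line itself is the same in both panels.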
