How can I create a pairs plot in Python?

A pairs plot, also known as a scatter plot matrix, visualizes the relationships between multiple variables in a dataset. In Python, a pairs plot can be created with the Seaborn library, using its pairplot function, which takes a dataset and, optionally, the variables to be plotted. The result is a grid of scatter plots in which each variable is plotted against every other variable, which makes patterns and relationships between variables easy to spot. The plot can also be customized by adding labels and titles and by adjusting the size and color of the points.

Create a Pairs Plot in Python


A pairs plot is a matrix of scatterplots that lets you understand the pairwise relationship between different variables in a dataset.

The easiest way to create a pairs plot in Python is to use the seaborn pairplot function.

The following examples show how to use this function in practice.

Example 1: Pairs Plot for All Variables

The following code shows how to create a pairs plot for every numeric variable in the seaborn dataset called iris:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

#define dataset
iris = sns.load_dataset("iris")

#create pairs plot for all numeric variables
sns.pairplot(iris)

#display the plot (needed when running as a script rather than in a notebook)
plt.show()

Pairs plot in Python

The way to interpret the matrix is as follows:

  • The distribution of each variable is shown as a histogram in the boxes along the diagonal.
  • All other boxes display a scatterplot of the relationship between each pairwise combination of variables. For example, the box in the bottom left corner of the matrix displays a scatterplot of values for petal_width vs. sepal_length.

This single plot gives us an idea of the relationship between each pair of variables in our dataset.

Example 2: Pairs Plot for Specific Variables

We can also specify only certain variables to include in the pairs plot:

sns.pairplot(iris[['sepal_length', 'sepal_width']])

Example 3: Pairs Plot with Color by Category

We can also create a pairs plot that colors each point in each plot based on some categorical variable using the hue argument:

sns.pairplot(iris, hue='species')

Pairs plot in Python with color by category

By using the hue argument, we can see how the relationships between the variables differ from one species to another.
