How can I plot the distribution of column values in Pandas?

To plot the distribution of column values in pandas, you can call hist() or plot(kind='hist') on a column to generate a histogram, which displays how often values fall into each bin. You can also call plot(kind='kde') to draw a smooth density curve, or boxplot() or plot(kind='box') to create a box plot that shows the median, quartiles, and outliers of the column's values. A scatter plot, created by calling plot(kind='scatter') on the DataFrame with x and y column names, requires two columns and shows the relationship between them rather than the distribution of a single column. Together, these methods provide a convenient way to analyze and visualize the distribution of column values in a dataset.

Plot Distribution of Column Values in Pandas


You can use the following methods to plot a distribution of column values in a pandas DataFrame:

Method 1: Plot Distribution of Values in One Column

df['my_column'].plot(kind='kde')

Method 2: Plot Distribution of Values in One Column, Grouped by Another Column

df.groupby('group_column')['values_column'].plot(kind='kde')

The following examples show how to use each method in practice with the following pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A',
                            'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B'],
                   'points': [3, 3, 4, 5, 4, 7, 7, 7, 10, 11, 
                              8, 7, 8, 9, 12, 12, 12, 14, 15, 17]})

#view DataFrame
print(df)

   team  points
0     A       3
1     A       3
2     A       4
3     A       5
4     A       4
5     A       7
6     A       7
7     A       7
8     A      10
9     A      11
10    B       8
11    B       7
12    B       8
13    B       9
14    B      12
15    B      12
16    B      12
17    B      14
18    B      15
19    B      17

Example 1: Plot Distribution of Values in One Column

The following code shows how to plot the distribution of values in the points column:

#plot distribution of values in points column
df['points'].plot(kind='kde')

Note that kind='kde' tells pandas to use kernel density estimation, which produces a smooth curve that summarizes the distribution of values for a variable. This plot type requires SciPy to be installed.
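The smoothness of the KDE curve depends on the bandwidth. As a minimal sketch, assuming SciPy is available and using illustrative bandwidth values of 0.2 and 1.0 (these values are not from the example above), the bw_method argument can be varied to compare a bumpier and a smoother estimate of the same column:

```python
import matplotlib.pyplot as plt
import pandas as pd

# same values as the points column in the example DataFrame
points = pd.Series([3, 3, 4, 5, 4, 7, 7, 7, 10, 11,
                    8, 7, 8, 9, 12, 12, 12, 14, 15, 17], name='points')

fig, ax = plt.subplots()

# smaller bw_method gives a bumpier curve, larger gives a smoother one
# (0.2 and 1.0 are assumed values chosen only for illustration)
points.plot(kind='kde', bw_method=0.2, ax=ax, label='bw_method=0.2')
points.plot(kind='kde', bw_method=1.0, ax=ax, label='bw_method=1.0')

#add x-axis label and legend
ax.set_xlabel('Points')
ax.legend()
```

If bw_method is left out, pandas passes the default on to SciPy, which picks a bandwidth automatically.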

If you'd like to create a histogram instead, you can specify kind='hist' as follows:

#plot distribution of values in points column using histogram
df['points'].plot(kind='hist', edgecolor='black')

This method uses bars to represent frequencies of values in the points column as opposed to a smooth line that summarizes the shape of the distribution.
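The number of bars is controlled by the bins argument, which defaults to 10. As a rough sketch, assuming an illustrative choice of 5 bins, the histogram can be drawn with fewer, wider bars, and NumPy can be used to print the counts behind each bar:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# same values as the points column in the example DataFrame
points = pd.Series([3, 3, 4, 5, 4, 7, 7, 7, 10, 11,
                    8, 7, 8, 9, 12, 12, 12, 14, 15, 17], name='points')

# histogram with 5 equal-width bins (bin count is an assumed value for illustration)
ax = points.plot(kind='hist', bins=5, edgecolor='black')
ax.set_xlabel('Points')

# same binning computed numerically: counts per bin and the bin edges
counts, edges = np.histogram(points, bins=5)
print(counts)  # [5 6 3 4 2]
print(edges)
```

Fewer bins give a coarser picture of the distribution, while more bins show finer detail at the cost of noisier bars.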

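Example 2: Plot Distribution of Values in One Column, Grouped by Another Column

The following code shows how to plot the distribution of values in the points column, grouped by the team column: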

import matplotlib.pyplot as plt

#plot distribution of points by team 
df.groupby('team')['points'].plot(kind='kde')

#add legend
plt.legend(['A', 'B'], title='Team')

#add x-axis label
plt.xlabel('Points')

The blue line shows the distribution of points for players on team A while the orange line shows the distribution of points for players on team B.
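A box plot, mentioned in the introduction, gives a more compact comparison of the two teams. As a minimal sketch using the same example data, DataFrame.boxplot can draw one box per team, and the quartiles behind each box can be printed alongside it:

```python
import matplotlib.pyplot as plt
import pandas as pd

#create DataFrame (same data as the example above)
df = pd.DataFrame({'team': ['A'] * 10 + ['B'] * 10,
                   'points': [3, 3, 4, 5, 4, 7, 7, 7, 10, 11,
                              8, 7, 8, 9, 12, 12, 12, 14, 15, 17]})

fig, ax = plt.subplots()

# one box per team: the inner line marks the median, the box spans the quartiles
df.boxplot(column='points', by='team', ax=ax)
ax.set_xlabel('Team')
ax.set_ylabel('Points')

# quartiles underlying each box, one row per team
quartiles = df.groupby('team')['points'].quantile([0.25, 0.5, 0.75]).unstack()
print(quartiles)
```

For this data the median is 6 for team A and 12 for team B, which matches the shift between the two density curves.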


Cite this article

stats writer (2024). How can I plot the distribution of column values in Pandas?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/how-can-i-plot-the-distribution-of-column-values-in-pandas/

stats writer. "How can I plot the distribution of column values in Pandas?." PSYCHOLOGICAL SCALES, 27 Jun. 2024, https://scales.arabpsychology.com/stats/how-can-i-plot-the-distribution-of-column-values-in-pandas/.

