Pandas makes it easy to create a bar chart that visualizes the top 10 values in a column. First load the data into a pandas DataFrame, then call the DataFrame’s plot.bar() method (or, equivalently, plot(kind='bar')) to generate the chart. The method accepts the columns to use for the x-axis and y-axis, along with optional arguments that customize the chart, such as a title, the figure size, and the rotation of the axis labels. It returns a matplotlib Axes object, so you can also add axis labels and adjust the style after the chart has been drawn.
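As a minimal sketch of these parameters, the snippet below plots a small made-up DataFrame with plot.bar(). The DataFrame name, column names, and values are assumptions chosen only for illustration, and the Agg backend is selected so the code also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this in a notebook

import pandas as pd

# hypothetical data, used only for illustration
sales = pd.DataFrame({"product": ["A", "B", "C"],
                      "units": [12, 30, 7]})

# x and y select the columns; title, figsize and rot customize the chart
ax = sales.plot.bar(x="product", y="units",
                    title="Units Sold by Product",
                    figsize=(6, 4), rot=0, legend=False)

# plot.bar() returns a matplotlib Axes, so labels can be set on it
ax.set_xlabel("Product")
ax.set_ylabel("Units")
```

Setting the labels on the returned Axes keeps all customization attached to this particular chart rather than to whichever figure happens to be active.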
You can use the following basic syntax to create a bar chart in pandas that includes only the top 10 most frequently occurring values in a specific column:
import pandas as pd
import matplotlib.pyplot as plt

#find values with top 10 occurrences in 'my_column'
top_10 = (df['my_column'].value_counts()).iloc[:10]

#create bar chart to visualize top 10 values
top_10.plot(kind='bar')
The following example shows how to use this syntax in practice.
Example: Create Bar Chart in Pandas to Visualize Top 10 Values
Suppose we have the following pandas DataFrame that contains information on the team name and points scored by 500 different basketball players:
import pandas as pd
import numpy as np
from string import ascii_uppercase
import random
from random import choice

#make this example reproducible
random.seed(1)
np.random.seed(1)

#create DataFrame
df = pd.DataFrame({'team': [choice(ascii_uppercase) for _ in range(500)],
                   'points': np.random.uniform(0, 20, 500)})

#view first five rows of DataFrame
print(df.head())

  team     points
0    E   8.340440
1    S  14.406490
2    Z   0.002287
3    Y   6.046651
4    C   2.935118
We can use the following syntax to create a bar chart that displays the top 10 most frequently occurring values in the team column:
import matplotlib.pyplot as plt

#find teams with top 10 occurrences
top_10_teams = (df['team'].value_counts()).iloc[:10]

#create bar chart of top 10 teams
top_10_teams.plot(kind='bar')
The bar chart only contains the names of the top 10 most frequently occurring teams.
The x-axis displays the team name and the y-axis displays the frequency.
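This works because value_counts() returns the counts sorted from most to least frequent, so slicing the first entries keeps the most common values. The following is a minimal sketch on a small made-up Series (the values are assumptions for illustration) that keeps the top 2 values rather than the top 10:

```python
import pandas as pd

# hypothetical data: 'B' occurs 3 times, 'A' twice, 'C' once
teams = pd.Series(["A", "B", "B", "C", "B", "A"])

# value_counts() sorts the counts in descending order by default
counts = teams.value_counts()

# keep the two most frequent values; .head(2) is equivalent to .iloc[:2]
top_2 = counts.iloc[:2]
print(top_2.to_dict())  # {'B': 3, 'A': 2}
```

When several values are tied at the cutoff, the slice keeps whichever of them value_counts() happens to list first, so the choice among tied values should not be treated as meaningful.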
Note that we can also customize the plot to make it more aesthetically pleasing:
import matplotlib.pyplot as plt

#find teams with top 10 occurrences
top_10_teams = (df['team'].value_counts()).iloc[:10]

#create bar chart of top 10 teams
top_10_teams.plot(kind='bar', edgecolor='black', rot=0)

#add axis labels
plt.xlabel('Team')
plt.ylabel('Frequency')
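Note that the edgecolor argument added a black border around each bar, and rot=0 set the rotation of the x-axis labels to 0 degrees so that they are displayed horizontally (pandas rotates bar chart labels 90 degrees by default), which makes them easier to read.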
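If you create this kind of chart often, the steps can be wrapped in a small function. The sketch below is one possible approach, not a pandas feature: the name plot_top_n, its parameters, and the demo DataFrame are all hypothetical and introduced only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this in a notebook

import matplotlib.pyplot as plt
import pandas as pd


def plot_top_n(df, column, n=10, **plot_kwargs):
    """Bar chart of the n most frequent values in df[column] (hypothetical helper)."""
    # count occurrences and keep the n most frequent values
    top_n = df[column].value_counts().iloc[:n]

    # draw on a fresh figure so earlier charts are not overwritten
    fig, ax = plt.subplots()
    top_n.plot(kind="bar", ax=ax, **plot_kwargs)

    # label the axes
    ax.set_xlabel(column)
    ax.set_ylabel("Frequency")
    return ax


# small made-up DataFrame for demonstration
demo = pd.DataFrame({"team": list("AAABBCDDDD")})
ax = plot_top_n(demo, "team", n=3, edgecolor="black", rot=0)
```

Extra keyword arguments such as edgecolor and rot are passed straight through to plot(), so the same customizations shown above still apply.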