The “NOT IN” filter in pandas excludes rows whose values appear in a given list. pandas has no dedicated “NOT IN” operator; the filter is built by negating the isin() method with the ~ operator. The isin() method on its own acts as an “IN” filter and returns the rows that match the specified values, while ~ inverts that result so that only the rows that do not match are returned.
For example, if we have a dataset of student grades and we want to retrieve all the grades except for A and B, we can use the “NOT IN” filter to exclude those grades. This would be done by specifying the values “A” and “B” in the filter, and the resulting dataset would only contain grades other than A and B.
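The grade scenario described above could be sketched as follows. The student names and grades are made up for illustration and are not from any real dataset.

```python
import pandas as pd

# Hypothetical student grade data, invented for illustration
grades = pd.DataFrame({
    'student': ['Ann', 'Ben', 'Cara', 'Dev', 'Eli'],
    'grade': ['A', 'C', 'B', 'D', 'C'],
})

# Grades we want to exclude
excluded = ['A', 'B']

# isin() flags rows with grade A or B; ~ inverts the flags
other_grades = grades[~grades['grade'].isin(excluded)]

print(other_grades)
```

Only the rows for Ben, Dev, and Eli remain, and they keep their original index labels (1, 3, and 4).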
Another example is using the “NOT IN” filter to filter out data from multiple categories. For instance, if we have a dataset of sales by product category and we want to retrieve all the sales except for those from the categories “Electronics” and “Toys”, we can use the “NOT IN” filter to exclude those categories and get the desired result.
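A minimal sketch of the sales scenario, assuming a DataFrame with hypothetical 'category' and 'amount' columns (the figures are invented). Once the unwanted categories are excluded, the remaining rows can be aggregated as usual.

```python
import pandas as pd

# Hypothetical sales data, invented for illustration
sales = pd.DataFrame({
    'category': ['Electronics', 'Toys', 'Books', 'Garden', 'Toys', 'Books'],
    'amount': [1200, 300, 150, 480, 220, 90],
})

# Categories we want to leave out
excluded_categories = ['Electronics', 'Toys']

# Keep only rows whose category is not in the excluded list
remaining = sales[~sales['category'].isin(excluded_categories)]

# Total sales per remaining category
total_by_category = remaining.groupby('category')['amount'].sum()

print(total_by_category)
```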
In summary, the “NOT IN” filter is a useful tool in Pandas for creating more specific and refined filtering conditions. It allows for the exclusion of certain values or categories from a dataset, providing more flexibility in data analysis and manipulation.
Use “NOT IN” Filter in Pandas (With Examples)
You can use the following syntax to perform a “NOT IN” filter in a pandas DataFrame:
df[~df['col_name'].isin(values_list)]
Note that the values in values_list can be either numeric values or character values.
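To see why this syntax works, it may help to build the filter in two steps. The sketch below uses a small made-up DataFrame (the column names 'code' and 'score' are hypothetical) and applies the same pattern to a string column and a numeric column.

```python
import pandas as pd

# Small made-up DataFrame with one string column and one numeric column
df = pd.DataFrame({'code': ['x', 'y', 'z', 'x'],
                   'score': [1, 2, 3, 4]})

# Step 1: isin() returns a boolean Series, True where the value is in the list
in_mask = df['code'].isin(['x'])

# Step 2: ~ flips every True to False and every False to True
not_in_mask = ~in_mask

# Step 3: indexing with the mask keeps only the rows marked True
filtered_by_code = df[not_in_mask]

# The same pattern works with numeric values
filtered_by_score = df[~df['score'].isin([2, 3])]

print(filtered_by_code)
print(filtered_by_score)
```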
The following examples show how to use this syntax in practice.
Example 1: Perform “NOT IN” Filter with One Column
The following code shows how to filter a pandas DataFrame for rows where a team name is not in a list of names:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#define list of teams we don't want
values_list = ['A', 'B']

#filter for rows where team name is not in list
df[~df['team'].isin(values_list)]

  team  points  assists  rebounds
6    C      25        9         9
7    C      29        4        12
And the following code shows how to filter a pandas DataFrame for rows where the ‘points’ column does not contain certain values:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#define list of values we don't want
values_list = [12, 15, 25]

#filter for rows where points value is not in list
df[~df['points'].isin(values_list)]

  team  points  assists  rebounds
3    B      14        9         6
4    B      19       12         6
5    B      23        9         5
7    C      29        4        12
Example 2: Perform “NOT IN” Filter with Multiple Columns
The following code shows how to filter a pandas DataFrame for rows where certain team names are not in one of several columns:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'star_team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'backup_team': ['B', 'B', 'C', 'C', 'D', 'D', 'D', 'E'],
                   'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#define list of teams we don't want
values_list = ['C', 'E']

#filter for rows where team name is not in one of several columns
df[~df[['star_team', 'backup_team']].isin(values_list).any(axis=1)]

  star_team backup_team  points  assists  rebounds
0         A           B      25        5        11
1         A           B      12        7         8
4         B           D      19       12         6
5         B           D      23        9         5
Notice that we filtered out every row where teams ‘C’ or ‘E’ appeared in either the ‘star_team’ column or the ‘backup_team’ column.
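The role of .any(axis=1) is easier to see when the intermediate boolean DataFrame is stored in a variable. The sketch below reuses the two team columns from the example; the variable names matches, drop_if_any, and drop_if_all are chosen here for illustration. It also shows .all(axis=1) as a contrasting variant, which drops a row only when every listed column contains an unwanted value.

```python
import pandas as pd

# Same team columns as in the example above
df = pd.DataFrame({'star_team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'backup_team': ['B', 'B', 'C', 'C', 'D', 'D', 'D', 'E']})

values_list = ['C', 'E']

# Boolean DataFrame: True wherever a cell holds an unwanted team
matches = df[['star_team', 'backup_team']].isin(values_list)

# Drop a row if ANY of the two columns holds an unwanted team
drop_if_any = df[~matches.any(axis=1)]

# Drop a row only if ALL of the columns hold an unwanted team
drop_if_all = df[~matches.all(axis=1)]

print(drop_if_any)
print(drop_if_all)
```

With this data, the .any() version keeps rows 0, 1, 4, and 5, while the .all() version removes only row 7, the one row where both columns contain an unwanted team.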
Cite this article
stats writer (2024). How can the “NOT IN” filter be used in Pandas, and what are some examples of its implementation?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/how-can-the-not-in-filter-be-used-in-pandas-and-what-are-some-examples-of-its-implementation/
