How can I filter a pandas dataframe for values that do not contain a certain string or character?

To filter a pandas DataFrame for values that do not contain a certain string or character, use the “str.contains()” method. It searches each value in a column for the specified string and returns a boolean Series indicating whether that value contains it. Inverting that Series, either with the “~” operator or by comparing it to False, and then using the result to index the DataFrame keeps only the rows that do not contain the string.
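The two steps described above can be sketched as follows. This is a minimal illustration, assuming a small hypothetical one-column DataFrame with no missing values; the names df, mask, and filtered_df are placeholders chosen for this sketch.

```python
import pandas as pd

# Hypothetical one-column DataFrame used only for this sketch
df = pd.DataFrame({'team': ['Nets', 'Mavs', 'Kings']})

# Step 1: boolean Series, True where the value contains 'ets'
mask = df['team'].str.contains('ets')

# Step 2: invert the mask with ~ and use it to select rows
filtered_df = df[~mask]

print(mask.tolist())
print(filtered_df)
```

Keeping the mask in its own variable makes it easy to inspect which rows matched before inverting it.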

Pandas: Filter for “Not Contains”


You can use the following methods to perform a “Not Contains” filter in a pandas DataFrame:

Method 1: Filter for Rows that Do Not Contain Specific String

filtered_df = df[df['my_column'].str.contains('some_string') == False]

Method 2: Filter for Rows that Do Not Contain One of Several Specific Strings

filtered_df = df[df['my_column'].str.contains('string1|string2|string3') == False]

The examples below show how to use each method in practice with this pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['Nets', 'Rockets', 'Mavs', 'Spurs', 'Kings', 'Nuggets'],
                   'points': [18, 22, 19, 14, 14, 11],
                   'assists': [5, 7, 7, 9, 12, 9],
                   'rebounds': [11, 8, 10, 6, 6, 5]})

#view DataFrame
print(df)

      team  points  assists  rebounds
0     Nets      18        5        11
1  Rockets      22        7         8
2     Mavs      19        7        10
3    Spurs      14        9         6
4    Kings      14       12         6
5  Nuggets      11        9         5

Example 1: Filter for Rows that Do Not Contain Specific String

The following code shows how to filter the pandas DataFrame for rows where the team column does not contain “ets” in the name:

#filter for rows that do not contain 'ets' in the 'team' column
filtered_df = df[df['team'].str.contains('ets') == False]

#view filtered DataFrame
print(filtered_df)

    team  points  assists  rebounds
2   Mavs      19        7        10
3  Spurs      14        9         6
4  Kings      14       12         6

Notice that the resulting DataFrame does not contain any rows where the value in the team column contains “ets” in the name.

In particular, the following teams were filtered out of the DataFrame:

  • Nets
  • Rockets
  • Nuggets

Each of these team names contains “ets”.
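The example DataFrame has no missing values. If a column can contain them, str.contains() may return missing results for those rows, and depending on the pandas version and column dtype, the “~” operator can then fail or the comparison with False can silently drop those rows. One common approach, sketched here under the assumption that rows with a missing team should be kept, is to pass na=False so that missing values count as "does not contain". The None entry below is hypothetical and added only for illustration.

```python
import pandas as pd

# Same team names as the example, with one hypothetical missing value
df = pd.DataFrame({'team': ['Nets', 'Rockets', 'Mavs', None, 'Kings', 'Nuggets']})

# na=False treats missing values as "does not contain 'ets'",
# so the mask is purely boolean and safe to invert with ~
mask = df['team'].str.contains('ets', na=False)

# Keep rows that do not contain 'ets' (the missing row is kept too)
filtered_df = df[~mask]
print(filtered_df)
```

Passing na=True instead would mark missing values as matches, so the inverted filter would drop them.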

Example 2: Filter for Rows that Do Not Contain One of Several Specific Strings

The following code shows how to filter the pandas DataFrame for rows where the team column does not contain “ets” or “urs” in the name:

#filter for rows that do not contain 'ets' or 'urs' in the 'team' column
filtered_df = df[df['team'].str.contains('ets|urs') == False]

#view filtered DataFrame
print(filtered_df)

    team  points  assists  rebounds
2   Mavs      19        7        10
4  Kings      14       12         6

Note: The | character means “OR” in the pattern because str.contains() treats the pattern as a regular expression by default.
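Because the pattern is interpreted as a regular expression by default, characters such as “+”, “(” or “.” have special meaning in it. The sketch below shows one way to exclude literal substrings that contain such characters: escape each one with re.escape() before joining them with “|”, or pass regex=False when there is only a single literal substring. The team labels and the exclude list are hypothetical and exist only for this illustration.

```python
import re
import pandas as pd

# Hypothetical labels containing regex metacharacters, for illustration only
df = pd.DataFrame({'team': ['Nets (A)', 'Rockets', 'Mavs+', 'Spurs']})

# Literal substrings to exclude (hypothetical)
exclude = ['(A)', '+']

# Escape each substring so '(', ')' and '+' are matched literally,
# then join with '|' so the pattern means "any of these"
pattern = '|'.join(re.escape(s) for s in exclude)

filtered_df = df[~df['team'].str.contains(pattern)]
print(filtered_df)

# For a single literal substring, regex=False skips regex parsing entirely
no_plus = df[~df['team'].str.contains('+', regex=False)]
print(no_plus)
```

Building the pattern from a list also avoids typing long “a|b|c” strings by hand when there are many substrings to exclude.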

Cite this article

stats writer (2024). How can I filter a pandas dataframe for values that do not contain a certain string or character?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/how-can-i-filter-a-pandas-dataframe-for-values-that-do-not-contain-a-certain-string-or-character/

stats writer. "How can I filter a pandas dataframe for values that do not contain a certain string or character?." PSYCHOLOGICAL SCALES, 26 Jun. 2024, https://scales.arabpsychology.com/stats/how-can-i-filter-a-pandas-dataframe-for-values-that-do-not-contain-a-certain-string-or-character/.

