Pandas: How to Filter Rows that Contain a Specific String

Pandas provides a convenient way to filter rows that contain a specific string: the str.contains() method. Called on a string column, it takes a pattern (interpreted as a regular expression by default) and returns a boolean Series indicating whether each value contains a match. Indexing the DataFrame with that Series keeps only the rows where the value is True.
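To make the boolean Series concrete, here is a minimal sketch that builds the mask first and then applies it. The small DataFrame and the names mask and filtered are assumptions chosen only for illustration:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"team": ["A", "A", "B"], "points": [11, 8, 6]})

# str.contains() returns one True/False value per row
mask = df["team"].str.contains("A")
print(mask.tolist())  # [True, True, False]

# Indexing the DataFrame with the mask keeps only the rows marked True
filtered = df[mask]
print(filtered)
```

Writing the mask on its own line is optional, but it can make longer filters easier to read and debug.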


You can use the following syntax to filter for rows that contain a certain string in a pandas DataFrame:

df[df["col"].str.contains("this string")]

This tutorial explains several examples of how to use this syntax in practice with the following DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'C'],
                   'conference': ['East', 'East', 'East', 'West', 'West', 'East'],
                   'points': [11, 8, 10, 6, 6, 5]})

#view DataFrame
df

  team conference  points
0    A       East      11
1    A       East       8
2    A       East      10
3    B       West       6
4    B       West       6
5    C       East       5

Example 1: Filter Rows that Contain a Specific String

The following code shows how to filter for rows in the DataFrame that contain ‘A’ in the team column:

df[df["team"].str.contains("A")]

  team conference  points
0    A       East      11
1    A       East       8
2    A       East      10

Only the rows where the team column contains ‘A’ are kept.
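Real data is often messier than this example. As a hedged sketch, the snippet below assumes a hypothetical column that mixes upper and lower case and has a missing value. It uses the case and na parameters of str.contains() to handle both:

```python
import pandas as pd

# Hypothetical data with mixed case and a missing value (not from the tutorial)
df = pd.DataFrame({"team": ["A", "a", None, "B"],
                   "points": [11, 8, 10, 6]})

# case=False ignores letter case
# na=False treats missing values as non-matches instead of leaving them missing
mask = df["team"].str.contains("a", case=False, na=False)

result = df[mask]
print(result)
```

Without na=False, a missing value can leave a missing entry in the mask, which typically causes an error when the mask is used to index the DataFrame.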

Example 2: Filter Rows that Contain a String in a List

The following code shows how to filter for rows in the DataFrame that contain ‘A’ or ‘B’ in the team column:

df[df["team"].str.contains("A|B")]

  team conference  points
0    A       East      11
1    A       East       8
2    A       East      10
3    B       West       6
4    B       West       6

Only the rows where the team column contains ‘A’ or ‘B’ are kept.
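If the strings to match are stored in an actual Python list, one possible approach is to build the pattern from that list. The sketch below assumes a hypothetical list named keep and escapes each entry with re.escape so that characters such as "." or "+" are matched literally rather than as regular expression syntax:

```python
import re
import pandas as pd

# Shortened version of the tutorial data, for illustration only
df = pd.DataFrame({"team": ["A", "A", "B", "C"],
                   "points": [11, 8, 6, 5]})

# Hypothetical list of substrings to keep
keep = ["A", "B"]

# Escape each entry, then join with "|" (regex "or")
pattern = "|".join(re.escape(s) for s in keep)
print(pattern)  # A|B

result = df[df["team"].str.contains(pattern)]
print(result)
```

For plain letters the escaping changes nothing, but it is a cheap safeguard when the list comes from user input or another data source.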

Example 3: Filter Rows that Contain a Partial String

In the previous examples, each search string happened to match an entire value in the team column, but str.contains() matches substrings anywhere in a value.

To filter for rows that contain a partial string, we can use the following syntax:

#identify partial string to look for
keep = ["Wes"]

#filter for rows that contain the partial string "Wes" in the conference column
df[df.conference.str.contains('|'.join(keep))]

  team conference  points
3    B       West       6
4    B       West       6

Only the rows where the conference column contains “Wes” are kept.
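The same mask can also be inverted to drop matching rows instead of keeping them. This is a minimal sketch using the tutorial's DataFrame. The variable names mask and excluded are assumptions for illustration:

```python
import pandas as pd

df = pd.DataFrame({"team": ["A", "A", "A", "B", "B", "C"],
                   "conference": ["East", "East", "East", "West", "West", "East"],
                   "points": [11, 8, 10, 6, 6, 5]})

# True for rows whose conference contains "Wes"
mask = df["conference"].str.contains("Wes")

# The ~ operator flips the mask, keeping rows that do NOT contain "Wes"
excluded = df[~mask]
print(excluded)
```

Here only the rows with the East conference remain.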
