Table of Contents

To search for a string in all columns of a Pandas DataFrame, one can use the DataFrame.eq() function to check if the string is present in each column’s values. This function will return a boolean DataFrame with True values where the string is found in the column. The boolean DataFrame can be used to subset the original DataFrame to the desired rows.

You can use the following syntax to search for a particular string in each column of a pandas DataFrame and filter for rows that contain the string in at least one column:

#define filter
mask = np.column_stack([df[col].str.contains(r"my_string", na=False) for col in df])

#filter for rows where any column contains 'my_string'
df.loc[mask.any(axis=1)]

The following example shows how to use this syntax in practice.

Example: Search for String in All Columns of Pandas DataFrame

Suppose we have the following pandas DataFrame that contains information about the first role and second role of various basketball players on a team:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'player': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'first_role': ['P Guard', 'P Guard', 'S Guard', 'S Forward',
                                  'P Forward', 'Center', 'Center', 'Center'],
                   'second_role': ['S Guard', 'S Guard', 'Forward', 'S Guard',
                                   'S Guard', 'S Forward', 'P Forward', 'P Forward']})

#view DataFrame
print(df)

  player first_role second_role
0      A    P Guard     S Guard
1      B    P Guard     S Guard
2      C    S Guard     Forward
3      D  S Forward     S Guard
4      E  P Forward     S Guard
5      F     Center   S Forward
6      G     Center   P Forward
7      H     Center   P Forward

The following code shows how to filter the pandas DataFrame for rows where the string “Guard” occurs in any column:

import numpy as np

#define filter
mask = np.column_stack([df[col].str.contains(r"Guard", na=False) for col in df])

#filter for rows where any column contains 'Guard'
df.loc[mask.any(axis=1)]

        player	first_role  second_role
0	A	P Guard	    S Guard
1	B	P Guard	    S Guard
2	C	S Guard	    Forward
3	D	S Forward   S Guard
4	E	P Forward   S Guard

Notice that each row in the resulting DataFrame contains the string “Guard” in at least one column.

You could also filter for rows where one of several strings occurs in at least one column by using the “OR” ( | ) operator in pandas.

For example, the following code shows how to filter for rows where either “P Guard” or “Center” occurs in at least one column:

import numpy as np

#define filter
mask = np.column_stack([df[col].str.contains(r"P Guard|Center", na=False) for col in df])

#filter for rows where any column contains 'P Guard' or 'Center'
df.loc[mask.any(axis=1)]

        player	first_role  second_role
0	A	P Guard	    S Guard
1	B	P Guard	    S Guard
5	F	Center	    S Forward
6	G	Center	    P Forward
7	H	Center	    P Forward

Notice that each row in the resulting DataFrame contains “P Guard” or “Center” in at least one column.

Note: It’s important to include the argument na=False within the contains() function or else you will encounter if NaN values are present in the DataFrame.

The following tutorials explain how to perform other common filtering operations in pandas:

How to search for a string in all columns of a Pandas DataFrame?

Example: Search for String in All Columns of Pandas DataFrame

Requst a

Scale

Example: Search for String in All Columns of Pandas DataFrame

Related terms:

Requst a

Scale