Table of Contents
In order to filter a pandas dataframe using a boolean column, you can use the .loc method to select the rows and columns of the dataframe that correspond to the values of the boolean column. This will allow you to select only the values that meet the criteria of the boolean column and create a new dataframe with only those values remaining.
You can use the following methods to filter the rows of a pandas DataFrame based on the values in Boolean columns:
Method 1: Filter DataFrame Based on One Boolean Column
#filter for rows where value in 'my_column' is True df.loc[df.my_column]
Method 2: Filter DataFrame Based on Multiple Boolean Columns
#filter for rows where value in 'column1' or 'column2' is True df.loc[df.column1 | df.column2]
The following examples show how to use each method in practice with the following pandas DataFrame:
import pandas as pd #create DataFrame df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G'], 'points': [18,20, 25, 40, 34, 32, 19], 'all_star': [True, False, True, True, True, False, False], 'starter': [False, True, True, True, False, False, False]}) #view DataFrame print(df) team points all_star starter 0 A 18 True False 1 B 20 False True 2 C 25 True True 3 D 40 True True 4 E 34 True False 5 F 32 False False 6 G 19 False False
Example 1: Filter DataFrame Based on One Boolean Column
We can use the following syntax to filter the pandas DataFrame to only contain rows where the value in the all_star column is True:
#filter for rows where 'all_star' is True df.loc[df.all_star] team points all_star starter 0 A 18 True False 2 C 25 True True 3 D 40 True True 4 E 34 True False
Notice that the DataFrame has been filtered to only contain rows where the value in the all_star column is True.
If you would instead like to filter for rows where all_star is False, simply type a tilde (~) in front of the column name:
#filter for rows where 'all_star' is False df.loc[~df.all_star] team points all_star starter 1 B 20 False True 5 F 32 False False 6 G 19 False False
Now the DataFrame has been filtered to only contain rows where the value in the all_star column is False.
Example 2: Filter DataFrame Based on Multiple Boolean Columns
We can use the following syntax to filter the pandas DataFrame to only contain rows where the value in the all_star column or the starter column is True:
#filter for rows where 'all_star' or 'starter' is True df.loc[df.all_star | df.starter] team points all_star starter 0 A 18 True False 1 B 20 False True 2 C 25 True True 3 D 40 True True 4 E 34 True False
If you would like to filter for rows where the value in both the all_star and starter column is True, you can use the & operator instead of the | operator:
#filter for rows where 'all_star' and 'starter' is True df.loc[df.all_star & df.starter] team points all_star starter 2 C 25 True True 3 D 40 True True
Now that the DataFrame has been filtered to only contain rows where the value in the all_star and starter column is True.
Related: