To select rows with NaN values in pandas, use the isnull() function to build a boolean mask, reduce that mask to one value per row, and use the result to index the DataFrame. Note that passing the element-wise mask directly, as in df[df.isnull()], does not filter rows: it returns a DataFrame of the same shape in which every value that was not NaN is replaced by NaN. To keep only the rows that contain at least one NaN value in a DataFrame called ‘df’, use df[df.isnull().any(axis=1)]. To keep only the rows that contain no NaN values, use df[df.notnull().all(axis=1)], which selects the same rows as df.dropna().
You can use the following methods to select rows with NaN values in pandas:
Method 1: Select Rows with NaN Values in Any Column
df.loc[df.isnull().any(axis=1)]
Method 2: Select Rows with NaN Values in Specific Column
df.loc[df['this_column'].isnull()]
The following examples show how to use each method in practice with the following pandas DataFrame:
import pandas as pd
import numpy as np

#create DataFrame (np.nan is used because the np.NaN alias was removed in NumPy 2.0)
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, np.nan, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, np.nan, 9, 9, np.nan],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, np.nan]})

#view DataFrame
print(df)
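Before selecting rows, it can help to see which columns actually hold NaN values. The following is a minimal sketch that rebuilds the same DataFrame and counts the NaN values in each column; the variable name nan_counts is only illustrative.

```python
import numpy as np
import pandas as pd

# same DataFrame as in this tutorial
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, np.nan, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, np.nan, 9, 9, np.nan],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, np.nan]})

# isnull() marks each cell True/False; sum() counts the True values per column
nan_counts = df.isnull().sum()

print(nan_counts)
```

Here the points and rebounds columns each contain one NaN value, the assists column contains two, and the team column contains none.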
Example 1: Select Rows with NaN Values in Any Column
We can use the following syntax to select rows with NaN values in any column of the DataFrame:
#create new DataFrame that only contains rows with NaNs in any column
df_nan_rows = df.loc[df.isnull().any(axis=1)]

#view results
print(df_nan_rows)

  team  points  assists  rebounds
1    B     NaN      7.0       8.0
4    E    14.0      NaN       6.0
7    H    28.0      NaN       NaN
Notice that each row in the resulting DataFrame contains a NaN value in at least one column.
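To make the mechanics of this selection more explicit, the sketch below stores the row-level boolean mask in its own variable and then inverts it with the ~ operator to get the complementary set of rows. The names nan_mask and df_complete_rows are assumptions chosen for illustration, not part of the original example.

```python
import numpy as np
import pandas as pd

# same DataFrame as in this tutorial
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, np.nan, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, np.nan, 9, 9, np.nan],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, np.nan]})

# boolean Series with one entry per row: True if the row has at least one NaN
nan_mask = df.isnull().any(axis=1)

# rows with at least one NaN (same selection as Example 1)
df_nan_rows = df.loc[nan_mask]

# rows with no NaN values at all, obtained by inverting the mask
df_complete_rows = df.loc[~nan_mask]

print(df_complete_rows)
```

Keeping the mask in a variable makes it easy to reuse for both selections, and the inverted selection should contain the same rows that df.dropna() returns.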
Example 2: Select Rows with NaN Values in Specific Column
We can use the following syntax to select rows with NaN values in the assists column of the DataFrame:
#create new DataFrame that only contains rows with NaNs in assists column
df_assists_nans = df.loc[df['assists'].isnull()]

#view results
print(df_assists_nans)

  team  points  assists  rebounds
4    E    14.0      NaN       6.0
7    H    28.0      NaN       NaN
Notice that each row in the resulting DataFrame contains a NaN value in the assists column.
There is one row with a NaN value in the points column, but this row is not selected because its value in the assists column is not NaN.
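The same idea extends to more than one specific column. The sketch below is one possible way to do this, assuming you want to check a chosen subset of columns: any(axis=1) selects rows with a NaN in at least one of the listed columns, while all(axis=1) selects rows with a NaN in every listed column. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# same DataFrame as in this tutorial
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, np.nan, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, np.nan, 9, 9, np.nan],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, np.nan]})

# rows with a NaN in the points column OR the assists column
df_either_nan = df.loc[df[['points', 'assists']].isnull().any(axis=1)]

# rows with a NaN in BOTH the assists column and the rebounds column
df_both_nan = df.loc[df[['assists', 'rebounds']].isnull().all(axis=1)]

print(df_either_nan)
print(df_both_nan)
```

With this DataFrame, the first selection keeps the rows for teams B, E, and H, and the second keeps only the row for team H.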