How to Drop Columns with NaN Values

Dropping columns with NaN values means removing columns that contain missing (NaN) values from a data set. In pandas this is done with the .dropna() method: axis=1 tells it to drop columns instead of rows, and the how and thresh arguments control how many missing values a column must have before it is removed. This is useful for removing columns that are empty or too sparse to analyze.


You can use the following methods to drop columns with NaN values from a pandas DataFrame:

Method 1: Drop Columns with Any NaN Values

df = df.dropna(axis=1)

Method 2: Drop Columns with All NaN Values

df = df.dropna(axis=1, how='all')

Method 3: Drop Columns with Fewer Than a Minimum Number of Non-NaN Values

df = df.dropna(axis=1, thresh=2)

The examples below show how to use each method in practice with the following pandas DataFrame:

import pandas as pd
import numpy as np

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'position': [np.nan, 'G', 'F', 'F', 'C', 'G'],
                   'points': [11, 28, 10, 26, 6, 25],
                   'rebounds': [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]})

#view DataFrame
print(df)

  team position  points  rebounds
0    A      NaN      11       NaN
1    A        G      28       NaN
2    A        F      10       NaN
3    B        F      26       NaN
4    B        C       6       NaN
5    B        G      25       NaN
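
Before dropping anything, it can help to check how many NaN values each column holds. The following is a minimal sketch that assumes the same DataFrame as above; the name nan_counts is only illustrative.

```python
import pandas as pd
import numpy as np

#recreate the example DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'position': [np.nan, 'G', 'F', 'F', 'C', 'G'],
                   'points': [11, 28, 10, 26, 6, 25],
                   'rebounds': [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]})

#count NaN values in each column
nan_counts = df.isna().sum()
print(nan_counts)
```

For this DataFrame the counts are 0 for team and points, 1 for position, and 6 for rebounds.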

Example 1: Drop Columns with Any NaN Values

The following code shows how to drop columns with any NaN values:

#drop columns with any NaN values (store the result under a new name so df stays intact for the next examples)
df_any = df.dropna(axis=1)

#view updated DataFrame
print(df_any)

  team  points
0    A      11
1    A      28
2    A      10
3    B      26
4    B       6
5    B      25

Notice that the position and rebounds columns were dropped since they both had at least one NaN value.
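
If only some rows should count when checking for NaN values, the subset argument may help. With axis=1, subset takes row labels rather than column names. The following is a sketch under that assumption, using the same DataFrame; the name df_subset is illustrative.

```python
import pandas as pd
import numpy as np

#recreate the example DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'position': [np.nan, 'G', 'F', 'F', 'C', 'G'],
                   'points': [11, 28, 10, 26, 6, 25],
                   'rebounds': [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]})

#drop columns that have a NaN in any of the rows labeled 1 through 5
#(row 0 is ignored when checking for NaN values)
df_subset = df.dropna(axis=1, subset=[1, 2, 3, 4, 5])
print(df_subset)
```

Here the position column should be kept, since its only NaN value is in row 0, while the rebounds column is still dropped.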

Example 2: Drop Columns with All NaN Values

The following code shows how to drop columns with all NaN values:

#drop columns with all NaN values (store the result under a new name so df stays intact)
df_all = df.dropna(axis=1, how='all')

#view updated DataFrame
print(df_all)

  team position  points
0    A      NaN      11
1    A        G      28
2    A        F      10
3    B        F      26
4    B        C       6
5    B        G      25

Notice that the rebounds column was dropped since it was the only column with all NaN values.

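Example 3: Drop Columns with Fewer Than a Minimum Number of Non-NaN Values

The thresh argument sets the minimum number of non-NaN values a column must have in order to be kept. The following code shows how to drop columns with fewer than two non-NaN values:

#drop columns with fewer than two non-NaN values
df_thresh = df.dropna(axis=1, thresh=2)

#view updated DataFrame
print(df_thresh)

  team position  points
0    A      NaN      11
1    A        G      28
2    A        F      10
3    B        F      26
4    B        C       6
5    B        G      25

Notice that the rebounds column was dropped since it was the only column with fewer than two non-NaN values. The position column was kept because it has five non-NaN values, even though it also contains one NaN.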
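
Because thresh counts non-NaN values, dropping columns based on their number of NaN values requires converting the cutoff. The following is a minimal sketch of two ways to do this, assuming the same DataFrame; max_nan, df_mask, and df_equiv are illustrative names.

```python
import pandas as pd
import numpy as np

#recreate the example DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'position': [np.nan, 'G', 'F', 'F', 'C', 'G'],
                   'points': [11, 28, 10, 26, 6, 25],
                   'rebounds': [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]})

#illustrative cutoff: drop columns with at least this many NaN values
max_nan = 2

#option 1: keep only columns whose NaN count is below the cutoff
df_mask = df.loc[:, df.isna().sum() < max_nan]

#option 2: the same rule expressed with thresh
#(a column needs at least len(df) - max_nan + 1 non-NaN values to be kept)
df_equiv = df.dropna(axis=1, thresh=len(df) - max_nan + 1)

print(df_mask)
```

Both options keep the team, position, and points columns here. The boolean mask states the rule directly in terms of NaN counts, which may be easier to read than the converted thresh value.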

Note: The complete documentation for the dropna() method is available in the official pandas documentation.
