How can I fill NaN values in a Pandas dataframe with the median value?

Filling NaN (Not a Number) values in a pandas DataFrame with the median means replacing each missing entry with the middle value of its column. First compute the median of the column or columns of interest, then pass that value to the pandas fillna() function. This keeps rows with missing values in the dataset instead of dropping them, although the filled entries are estimates rather than observed data.

Pandas: Fill NaN Values with Median (3 Examples)


You can use the fillna() function to replace NaN values in a pandas DataFrame.

Here are three common ways to use this function:

Method 1: Fill NaN Values in One Column with Median

df['col1'] = df['col1'].fillna(df['col1'].median())

Method 2: Fill NaN Values in Multiple Columns with Median

df[['col1', 'col2']] = df[['col1', 'col2']].fillna(df[['col1', 'col2']].median())

Method 3: Fill NaN Values in All Columns with Median

df = df.fillna(df.median())
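
Method 3 assumes that every column is numeric. As a hedged sketch: if the DataFrame also holds text columns, recent pandas versions may raise an error on df.median(), and passing numeric_only=True restricts the calculation to the numeric columns. The DataFrame `mixed` and its 'team' column below are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

# hypothetical DataFrame mixing a text column with numeric columns
mixed = pd.DataFrame({'team': ['A', 'B', 'C', 'D'],
                      'points': [10.0, np.nan, 14.0, 20.0],
                      'assists': [4.0, 6.0, np.nan, 8.0]})

# compute medians for the numeric columns only; 'team' is skipped
medians = mixed.median(numeric_only=True)

# fillna matches each median to its column by label
filled = mixed.fillna(medians)
print(filled)
```

Because the medians are matched by column label, the 'team' column is left unchanged.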

The examples below apply each method to this pandas DataFrame:

import numpy as np
import pandas as pd

#create DataFrame with some NaN values
df = pd.DataFrame({'rating': [np.nan, 85, np.nan, 88, 94, 90, 76, 75, 87, 86],
                   'points': [25, np.nan, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, np.nan, 5, 7, 6, 9, 9, 5],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

#view DataFrame
df

	rating	points	assists	rebounds
0	NaN	25.0	5.0	11
1	85.0	NaN	7.0	8
2	NaN	14.0	7.0	10
3	88.0	16.0	NaN	6
4	94.0	27.0	5.0	6
5	90.0	20.0	7.0	9
6	76.0	12.0	6.0	6
7	75.0	15.0	9.0	10
8	87.0	14.0	9.0	10
9	86.0	19.0	5.0	7
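
Before filling anything, it can help to count how many values are missing in each column. The snippet below is a minimal sketch of one way to do that with isna() and sum(); the variable name `nan_counts` is only illustrative.

```python
import numpy as np
import pandas as pd

# same DataFrame as above
df = pd.DataFrame({'rating': [np.nan, 85, np.nan, 88, 94, 90, 76, 75, 87, 86],
                   'points': [25, np.nan, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, np.nan, 5, 7, 6, 9, 9, 5],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

# isna() marks missing entries as True; sum() counts them per column
nan_counts = df.isna().sum()
print(nan_counts)
```

For this DataFrame the counts are 2 for rating, 1 for points, 1 for assists, and 0 for rebounds.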

Example 1: Fill NaN Values in One Column with Median

The following code shows how to fill the NaN values in the rating column with the median value of the rating column:

#fill NaNs with column median in 'rating' column
df['rating'] = df['rating'].fillna(df['rating'].median())

#view updated DataFrame 
df

	rating	points	assists	rebounds
0	86.5	25.0	5.0	11
1	85.0	NaN	7.0	8
2	86.5	14.0	7.0	10
3	88.0	16.0	NaN	6
4	94.0	27.0	5.0	6
5	90.0	20.0	7.0	9
6	76.0	12.0	6.0	6
7	75.0	15.0	9.0	10
8	87.0	14.0	9.0	10
9	86.0	19.0	5.0	7

The median value in the rating column was 86.5, so each NaN value in the rating column was filled with this value.
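
To see where 86.5 comes from, the sketch below recomputes the median of the rating values in two ways. It relies on median() skipping NaN values by default; the names `rating`, `non_missing`, and `manual` are illustrative.

```python
import numpy as np
import pandas as pd

# the original rating values, including the two NaNs
rating = pd.Series([np.nan, 85, np.nan, 88, 94, 90, 76, 75, 87, 86])

# median() ignores NaN values by default (skipna=True)
median_rating = rating.median()
print(median_rating)  # 86.5

# the same value by hand: average the two middle values
# of the eight sorted non-missing ratings (86 and 87)
non_missing = sorted(rating.dropna())
manual = (non_missing[3] + non_missing[4]) / 2
print(manual)  # 86.5
```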

Example 2: Fill NaN Values in Multiple Columns with Median

The following code shows how to fill the NaN values in both the rating and points columns with their respective column medians:

#fill NaNs with column medians in 'rating' and 'points' columns
df[['rating', 'points']] = df[['rating', 'points']].fillna(df[['rating', 'points']].median())

#view updated DataFrame
df

	rating	points	assists	rebounds
0	86.5	25.0	5.0	11
1	85.0	16.0	7.0	8
2	86.5	14.0	7.0	10
3	88.0	16.0	NaN	6
4	94.0	27.0	5.0	6
5	90.0	20.0	7.0	9
6	76.0	12.0	6.0	6
7	75.0	15.0	9.0	10
8	87.0	14.0	9.0	10
9	86.0	19.0	5.0	7
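
The points column had a median of 16.0, so its single NaN was filled with that value, while the NaN in the assists column was left alone. As a hedged alternative sketch, fillna() also accepts a dictionary that maps column names to fill values, which makes the per-column choice explicit. The names `cols`, `medians`, and `filled` are illustrative.

```python
import numpy as np
import pandas as pd

# same starting DataFrame as above
df = pd.DataFrame({'rating': [np.nan, 85, np.nan, 88, 94, 90, 76, 75, 87, 86],
                   'points': [25, np.nan, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, np.nan, 5, 7, 6, 9, 9, 5],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

# medians of the two selected columns, returned as a Series
cols = ['rating', 'points']
medians = df[cols].median()
print(medians)

# a dict of column -> value fills only the listed columns
filled = df.fillna(medians.to_dict())
print(filled)
```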

Example 3: Fill NaN Values in All Columns with Median

The following code shows how to fill the NaN values in every column with that column's median:

#fill NaNs with column medians in each column 
df = df.fillna(df.median())

#view updated DataFrame
df

	rating	points	assists	rebounds
0	86.5	25.0	5.0	11
1	85.0	16.0	7.0	8
2	86.5	14.0	7.0	10
3	88.0	16.0	7.0	6
4	94.0	27.0	5.0	6
5	90.0	20.0	7.0	9
6	76.0	12.0	6.0	6
7	75.0	15.0	9.0	10
8	87.0	14.0	9.0	10
9	86.0	19.0	5.0	7

Notice that the NaN values in each column were filled with their column median.
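
As a quick check, the sketch below fills every column with its median and then confirms that no NaN values remain. It rebuilds the original DataFrame so that it runs on its own; `filled` and `remaining` are illustrative names.

```python
import numpy as np
import pandas as pd

# same starting DataFrame as above
df = pd.DataFrame({'rating': [np.nan, 85, np.nan, 88, 94, 90, 76, 75, 87, 86],
                   'points': [25, np.nan, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, np.nan, 5, 7, 6, 9, 9, 5],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

# fill each column's NaNs with that column's median
filled = df.fillna(df.median())

# total number of NaN values left across the whole DataFrame
remaining = int(filled.isna().sum().sum())
print(remaining)  # 0
```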

The complete documentation for the fillna() function is available in the official pandas documentation.

Cite this article

stats writer (2024). How can I fill NaN values in a Pandas dataframe with the median value?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/how-can-i-fill-nan-values-in-a-pandas-dataframe-with-the-median-value/

stats writer. "How can I fill NaN values in a Pandas dataframe with the median value?." PSYCHOLOGICAL SCALES, 1 Jul. 2024, https://scales.arabpsychology.com/stats/how-can-i-fill-nan-values-in-a-pandas-dataframe-with-the-median-value/.