To fill NaN (Not a Number) values in a pandas DataFrame with the mode, identify the most frequently occurring value in a particular column or row and replace the NaN values with that value. This is a simple way to handle missing data without dropping rows, and it works for categorical as well as numeric columns. Because every missing entry receives the most common value, the dominant pattern in the data is kept, although that value becomes somewhat more frequent than before. You can do this with the built-in fillna() function in pandas by passing it the mode of the respective column or row.
Pandas: Fill NaN Values with Mode
You can use the following syntax to replace NaN values in a column of a pandas DataFrame with the mode value of the column:
df['col1'] = df['col1'].fillna(df['col1'].mode()[0])
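The [0] in this syntax is needed because mode() returns a Series rather than a single value: when several values tie for most frequent, all of them are returned in sorted order. The following is a minimal sketch of that behavior. The Series s and the variable names are hypothetical and are not part of the example used in this article.

```python
import numpy as np
import pandas as pd

# Hypothetical Series in which 1 and 2 are tied for most frequent
s = pd.Series([2, 2, 1, 1, 3, np.nan])

# mode() ignores NaN by default and returns every tied value, sorted ascending
modes = s.mode()

# [0] selects the first (smallest) of the tied modes
first_mode = modes[0]

# Replace the NaN with that single value
filled = s.fillna(first_mode)
print(filled.tolist())
```

Note that if a column contains only NaN values, mode() returns an empty Series and [0] raises an error, so it may be worth checking modes.empty before indexing.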
The following example shows how to use this syntax in practice.
Example: Replace Missing Values with Mode in Pandas
Suppose we have the following pandas DataFrame with some missing values:
import numpy as np
import pandas as pd

#create DataFrame with some NaN values
df = pd.DataFrame({'rating': [np.nan, 85, np.nan, 88, 94, 90, 75, 75, 87, 86],
                   'points': [25, np.nan, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, np.nan, 5, 7, 6, 9, 9, 7],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

#view DataFrame
df

   rating  points  assists  rebounds
0     NaN    25.0      5.0        11
1    85.0     NaN      7.0         8
2     NaN    14.0      7.0        10
3    88.0    16.0      NaN         6
4    94.0    27.0      5.0         6
5    90.0    20.0      7.0         9
6    75.0    12.0      6.0         6
7    75.0    15.0      9.0        10
8    87.0    14.0      9.0        10
9    86.0    19.0      7.0         7
We can use the fillna() function to fill the NaN values in the rating column with the mode value of the rating column:
#fill NaNs with column mode in 'rating' column
df['rating'] = df['rating'].fillna(df['rating'].mode()[0])

#view updated DataFrame
df

   rating  points  assists  rebounds
0    75.0    25.0      5.0        11
1    85.0     NaN      7.0         8
2    75.0    14.0      7.0        10
3    88.0    16.0      NaN         6
4    94.0    27.0      5.0         6
5    90.0    20.0      7.0         9
6    75.0    12.0      6.0         6
7    75.0    15.0      9.0        10
8    87.0    14.0      9.0        10
9    86.0    19.0      7.0         7
The mode value in the rating column was 75, so each of the NaN values in the rating column was filled with this value.
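The same idea can be extended to every column at once. The sketch below is one possible approach rather than part of the original example: it assumes that taking the first row of df.mode() is an acceptable way to pick one mode per column, and the names column_modes and df_filled are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Same DataFrame as in the example above
df = pd.DataFrame({'rating': [np.nan, 85, np.nan, 88, 94, 90, 75, 75, 87, 86],
                   'points': [25, np.nan, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, np.nan, 5, 7, 6, 9, 9, 7],
                   'rebounds': [11, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

# df.mode() returns one row per tied mode; the first row holds
# the smallest mode of each column
column_modes = df.mode().iloc[0]

# Passing a Series to fillna() fills each column with the value
# whose label matches the column name
df_filled = df.fillna(column_modes)
print(df_filled)
```

Here the rebounds column has two tied modes (6 and 10), so the first row of df.mode() holds 6 for that column. Since rebounds has no missing values, the tie does not affect the result.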
Note: You can find the complete online documentation for the fillna() function in the official pandas documentation.
Cite this article
stats writer (2024). How can I fill NaN values in a Pandas dataframe with the mode value?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/how-can-i-fill-nan-values-in-a-pandas-dataframe-with-the-mode-value/
stats writer. "How can I fill NaN values in a Pandas dataframe with the mode value?." PSYCHOLOGICAL SCALES, 28 Jun. 2024, https://scales.arabpsychology.com/stats/how-can-i-fill-nan-values-in-a-pandas-dataframe-with-the-mode-value/.
