A column “contains all missing values” when every one of its rows is missing (NA in R), so the column carries no information at all. Identifying such columns is an important step in data analysis, because they usually need to be removed or filled with appropriate data before the data set can be treated as accurate and complete.
R: Find Columns with All Missing Values
You can use the following methods to find columns in a data frame in R that contain all missing values:
Method 1: Use Base R
#check if each column has all missing values
all_miss <- apply(df, 2, function(x) all(is.na(x)))

#display columns with all missing values
names(all_miss[all_miss>0])
Method 2: Use purrr Package
library(purrr)

#display columns with all missing values
df %>% keep(~all(is.na(.x))) %>% names
Both methods produce the same result, but the purrr approach tends to be quicker for extremely large data frames.
The following examples show how to use each method with the following data frame in R:
#create data frame
df <- data.frame(points=c(21, 15, 10, 4, 4, 9, 12, 10),
assists=c(NA, NA, NA, NA, NA, NA, NA, NA),
rebounds=c(8, 12, 14, 10, 7, 9, 8, 5),
steals=c(NA, NA, NA, NA, NA, NA, NA, NA))
#view data frame
df
  points assists rebounds steals
1     21      NA        8     NA
2     15      NA       12     NA
3     10      NA       14     NA
4      4      NA       10     NA
5      4      NA        7     NA
6      9      NA        9     NA
7     12      NA        8     NA
8     10      NA        5     NA

Example 1: Find Columns with All Missing Values Using Base R
The following code shows how to find the columns in the data frame with all missing values:
#check if each column has all missing values
all_miss <- apply(df, 2, function(x) all(is.na(x)))

#display columns with all missing values
names(all_miss[all_miss>0])

[1] "assists" "steals"
From the output we can see that the assists and steals columns have all missing values.
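Since all-missing columns often need to be removed, the same check can be extended to drop them. The following is a minimal sketch rather than part of the original method: it assumes the same example data frame, swaps in colSums() as an alternative way to build the logical vector, and uses df_clean as an illustrative name for the result.

```r
#recreate the example data frame so this block runs on its own
df <- data.frame(points=c(21, 15, 10, 4, 4, 9, 12, 10),
                 assists=c(NA, NA, NA, NA, NA, NA, NA, NA),
                 rebounds=c(8, 12, 14, 10, 7, 9, 8, 5),
                 steals=c(NA, NA, NA, NA, NA, NA, NA, NA))

#flag columns whose count of NA values equals the number of rows
all_miss <- colSums(is.na(df)) == nrow(df)

#display columns with all missing values
names(df)[all_miss]

#keep only the columns with at least one non-missing value
#(df_clean is a hypothetical name chosen for this sketch)
df_clean <- df[, !all_miss, drop = FALSE]
names(df_clean)
```

The drop = FALSE argument keeps the result a data frame even if only one column remains.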
Example 2: Find Columns with All Missing Values Using purrr Package
The following code shows how to find the columns in the data frame with all missing values by using functions from the purrr package:
library(purrr)

#display columns with all missing values
df %>% keep(~all(is.na(.x))) %>% names

[1] "assists" "steals"
From the output we can see that the assists and steals columns have all missing values.
This matches the output from the base R method.
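If the goal is to remove these columns rather than list them, purrr's discard() function is the counterpart of keep(). The sketch below is one possible approach under the same example data, not a method from the original tutorial; df_clean is an illustrative name, and discard() is called directly instead of through the pipe.

```r
library(purrr)

#recreate the example data frame so this block runs on its own
df <- data.frame(points=c(21, 15, 10, 4, 4, 9, 12, 10),
                 assists=c(NA, NA, NA, NA, NA, NA, NA, NA),
                 rebounds=c(8, 12, 14, 10, 7, 9, 8, 5),
                 steals=c(NA, NA, NA, NA, NA, NA, NA, NA))

#drop every column in which all values are missing
#(df_clean is a hypothetical name chosen for this sketch)
df_clean <- discard(df, ~all(is.na(.x)))

#display the remaining columns
names(df_clean)
```

With this data frame, the remaining columns should be points and rebounds.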
Cite this article
stats writer (2024). Which columns contain all missing values?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/which-columns-contain-all-missing-values/
stats writer. "Which columns contain all missing values?." PSYCHOLOGICAL SCALES, 26 Jun. 2024, https://scales.arabpsychology.com/stats/which-columns-contain-all-missing-values/.
