How can we remove rows with some or all NAs in R?

To remove rows with missing values (NAs) in R, we can use the na.omit() function, which drops every row of a data frame that contains at least one NA. Alternatively, we can use the complete.cases() function, which returns a logical vector indicating which rows are complete (contain no NAs). Subsetting the data frame with this vector keeps only the complete rows, and applying complete.cases() to selected columns drops only the rows with NAs in those columns. Rows in which every value is NA can be identified by comparing rowSums(is.na(df)) with ncol(df).

Remove Rows with Some or All NAs in R


Often you may want to remove rows with all or some NAs (missing values) in a data frame in R.

This tutorial explains how to remove these rows using base R and the tidyr package. We’ll use the following data frame in each example:

#create data frame with some missing values
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

#view data frame
df

  points assists rebounds
1     12       4        5
2     NA      NA       NA
3     19       3        7
4     22      NA       12
5     32       5       NA

Remove NAs Using Base R

The following code shows how to use complete.cases() to remove all rows in a data frame that have a missing value in any column:

#remove all rows with a missing value in any column
df[complete.cases(df), ]

  points assists rebounds
1     12       4        5
3     19       3        7
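
The na.omit() function from base R gives the same result in a single call. The sketch below recreates the example data frame so that it runs on its own; the name df_clean is only an illustrative choice:

```r
#create data frame with some missing values
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

#remove all rows with a missing value in any column
df_clean <- na.omit(df)

#view result (rows 1 and 3 remain)
df_clean

#na.omit() records the positions of the dropped rows in an attribute
attr(df_clean, "na.action")
```

Unlike complete.cases(), na.omit() always checks every column, so it is most convenient when a missing value in any column should disqualify the row.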

The following code shows how to use complete.cases() to remove all rows in a data frame that have a missing value in specific columns:

#remove all rows with a missing value in the third column
df[complete.cases(df[ , 3]),]

  points assists rebounds
1     12       4        5
3     19       3        7
4     22      NA       12

#remove all rows with a missing value in either the first or third column
df[complete.cases(df[ , c(1,3)]),]

  points assists rebounds
1     12       4        5
3     19       3        7
4     22      NA       12
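
The examples above drop a row as soon as one of the checked columns is missing. To remove only the rows in which every value is NA, one possible base R approach is to count the NAs in each row with rowSums(is.na()) and compare that count to the number of columns. The names na_per_row and df_not_all_na are only illustrative:

```r
#create data frame with some missing values
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

#count the missing values in each row
na_per_row <- rowSums(is.na(df))

#keep rows that have at least one non-missing value
df_not_all_na <- df[na_per_row < ncol(df), ]

#view result
df_not_all_na
```

With this data frame, only row 2 is removed, since it is the only row with NAs in all three columns. Rows 4 and 5 are kept because each has just one missing value.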

Remove NAs Using tidyr

The following code shows how to use drop_na() from the tidyr package to remove all rows in a data frame that have a missing value in any column:

#load tidyr package
library(tidyr)

#remove all rows with a missing value in any column
df %>% drop_na()

  points assists rebounds
1     12       4        5
3     19       3        7

The following code shows how to use drop_na() from the tidyr package to remove all rows in a data frame that have a missing value in specific columns:

#load tidyr package
library(tidyr)

#remove all rows with a missing value in the rebounds column (the third column)
df %>% drop_na(rebounds)

  points assists rebounds
1     12       4        5
3     19       3        7
4     22      NA       12
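
drop_na() also accepts more than one column, which mirrors the base R example that checks the first and third columns. The sketch below assumes the tidyr package is installed and calls drop_na() directly rather than through the pipe; the name df_subset is only illustrative:

```r
#load tidyr package
library(tidyr)

#create data frame with some missing values
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

#remove all rows with a missing value in either the points or rebounds column
df_subset <- drop_na(df, points, rebounds)

#view result
df_subset
```

Three rows remain here: the ones with points values of 12, 19, and 22. The NA in the assists column of the last of these is kept, because assists is not one of the columns being checked.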
