How do I remove rows with some or all NAs in R?

In R, the na.omit() function removes every row that contains at least one NA value. It returns a new object rather than modifying the original, so assign the result to a variable if you want to keep it. Alternatively, the complete.cases() function returns a logical vector marking the rows that contain no NA values, which you can use to index the data frame or pass to subset() to keep only the complete rows.


Often you may want to remove rows with all or some NAs (missing values) in a data frame in R.

This tutorial explains how to remove these rows using base R and the tidyr package. All of the examples below use the following data frame:

#create data frame with some missing values
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

#view data frame
df

  points assists rebounds
1     12       4        5
2     NA      NA       NA
3     19       3        7
4     22      NA       12
5     32       5       NA

Remove NAs Using Base R

The following code shows how to use complete.cases() to remove all rows in a data frame that have a missing value in any column:

#remove all rows with a missing value in any column
df[complete.cases(df), ]

  points assists rebounds
1     12       4        5
3     19       3        7
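
The introduction also mentions na.omit() and subset(). As a minimal sketch, assuming the same example data frame, both should keep the same rows as the complete.cases() indexing above. The names df_complete and df_subset are only illustrative:

```r
#recreate the example data frame so this snippet runs on its own
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

#na.omit() drops every row that contains at least one NA
df_complete <- na.omit(df)

#subset() keeps the rows where complete.cases() is TRUE
df_subset <- subset(df, complete.cases(df))

#both approaches keep the same rows (original rows 1 and 3)
identical(rownames(df_complete), rownames(df_subset))
```

Note that na.omit() also attaches an "na.action" attribute recording which rows were dropped, so the two results are not strictly identical objects even though they hold the same rows.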

The following code shows how to use complete.cases() to remove all rows in a data frame that have a missing value in specific columns:

#remove all rows with a missing value in the third column
df[complete.cases(df[ , 3]),]

  points assists rebounds
1     12       4        5
3     19       3        7
4     22      NA       12

#remove all rows with a missing value in either the first or third column
df[complete.cases(df[ , c(1,3)]),]

  points assists rebounds
1     12       4        5
3     19       3        7
4     22      NA       12
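
The examples above remove rows with a missing value in some columns. To remove only the rows in which every column is missing, one possible base R approach (a sketch, not the only way) is to count the NAs per row with rowSums(is.na()) and compare that count to the number of columns. The name df_not_all_na is only illustrative:

```r
#recreate the example data frame so this snippet runs on its own
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

#count the NAs in each row and keep rows where not every column is NA
df_not_all_na <- df[rowSums(is.na(df)) != ncol(df), ]

#view result
df_not_all_na
```

With this data frame, only row 2 is dropped, because it is the only row with an NA in all three columns. Rows 4 and 5 are kept even though each has one missing value.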

Remove NAs Using tidyr

The following code shows how to use drop_na() from the tidyr package to remove all rows in a data frame that have a missing value in any column:

#load tidyr package
library(tidyr)

#remove all rows with a missing value in any column
df %>% drop_na()

  points assists rebounds
1     12       4        5
3     19       3        7

The following code shows how to use drop_na() from the tidyr package to remove all rows in a data frame that have a missing value in specific columns:

#load tidyr package
library(tidyr)

#remove all rows with a missing value in the third column
df %>% drop_na(rebounds)

  points assists rebounds
1     12       4        5
3     19       3        7
4     22      NA       12
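
drop_na() also accepts more than one column name. The following sketch mirrors the earlier base R example that checks the first and third columns, assuming tidyr is installed. The name df_both is only illustrative:

```r
#load tidyr package
library(tidyr)

#recreate the example data frame so this snippet runs on its own
df <- data.frame(points = c(12, NA, 19, 22, 32),
                 assists = c(4, NA, 3, NA, 5),
                 rebounds = c(5, NA, 7, 12, NA))

#remove all rows with a missing value in either points or rebounds
df_both <- df %>% drop_na(points, rebounds)

#view result
df_both
```

Three rows remain. The NA in the assists column is kept, because assists was not passed to drop_na().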

