How do I drop rows that contain a specific string in R?

To drop rows that contain a specific string in R, build a logical condition with grepl(), which returns TRUE for each value that contains the string, negate it with !, and use the result to subset the data frame, either with bracket indexing or inside the subset function. Note that the “!=” operator only excludes values that equal the string exactly, so on its own it does not drop rows that merely contain the string. You can then assign the resulting subset to a new data frame, or overwrite the existing data frame with it.
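
As a minimal sketch of the subset() approach, the code below uses a made-up data frame, `df_demo`, that is not part of this tutorial's data. It contrasts `!=`, which drops only exact matches, with `!grepl()`, which drops any row whose value contains the string.

```r
# Hypothetical data for illustration only
df_demo <- data.frame(team = c('Hawks', 'Nighthawks', 'Bulls'),
                      points = c(11, 8, 10),
                      stringsAsFactors = FALSE)

# Exact match: drops only the row whose team is exactly 'Hawks'
exact <- subset(df_demo, team != 'Hawks')

# Substring match: drops every row whose team contains 'awks'
partial <- subset(df_demo, !grepl('awks', team))
```

Here `exact` still contains the 'Nighthawks' row, while `partial` keeps only 'Bulls'.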


You can use the following syntax to drop rows that contain a certain string in a data frame in R:

df[!grepl('string', df$column),]

This tutorial provides several examples of how to use this syntax in practice with the following data frame in R:

#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'C'),
                 conference=c('East', 'East', 'East', 'West', 'West', 'East'),
                 points=c(11, 8, 10, 6, 6, 5))

#view data frame
df

  team conference points
1    A       East     11
2    A       East      8
3    A       East     10
4    B       West      6
5    B       West      6
6    C       East      5
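
Before the examples, it may help to see what the `!grepl(...)` part of the syntax evaluates to on its own. The sketch below uses a made-up character vector `x` (an assumption for illustration, not part of the data frame above). It also shows that grepl() returns FALSE for missing values, so rows with NA in the column are kept rather than dropped.

```r
# Hypothetical vector for illustration only
x <- c('apple', 'banana', NA, 'grape')

# TRUE where the element contains 'ap'; NA elements yield FALSE
has_ap <- grepl('ap', x)

# Negate to mark the elements (rows) to keep
keep <- !has_ap
```

The logical vector `keep` is what goes in the row position of `df[keep, ]`.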

Example 1: Drop Rows that Contain a Specific String

The following code shows how to drop all rows in the data frame that contain ‘A’ in the team column:

df[!grepl('A', df$team),]

  team conference points
4    B       West      6
5    B       West      6
6    C       East      5

Or we could drop all rows in the data frame that contain ‘West’ in the conference column:

df[!grepl('West', df$conference),]

  team conference points
1    A       East     11
2    A       East      8
3    A       East     10
6    C       East      5
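
One caveat is that grepl() treats the pattern as a case-sensitive regular expression by default, so characters such as '.' have special meaning. The sketch below uses a made-up data frame `df_demo` (not part of this tutorial's data) to show how the `fixed` and `ignore.case` arguments of grepl() change which rows are dropped.

```r
# Hypothetical data for illustration only
df_demo <- data.frame(name = c('a.b', 'axb', 'A.B', 'cd'),
                      stringsAsFactors = FALSE)

# Default: the pattern is a regular expression, so '.' matches any character
regex_kept <- df_demo[!grepl('a.b', df_demo$name), , drop = FALSE]

# fixed = TRUE: the pattern is matched literally
fixed_kept <- df_demo[!grepl('a.b', df_demo$name, fixed = TRUE), , drop = FALSE]

# ignore.case = TRUE: case-insensitive regex; the dot is escaped to match literally
nocase_kept <- df_demo[!grepl('a\\.b', df_demo$name, ignore.case = TRUE), , drop = FALSE]
```

The `drop = FALSE` argument is only needed here because `df_demo` has a single column; without it, the result would collapse to a plain vector.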

Example 2: Drop Rows that Contain a String in a List

The following code shows how to drop all rows in the data frame that contain ‘A’ or ‘B’ in the team column:

df[!grepl('A|B', df$team),]

  team conference points
6    C       East      5

We could also define a vector of strings and then remove all rows in the data frame that contain any of the strings in the vector in the team column:

#define vector of strings
remove <- c('A', 'B')

#remove rows that contain any string in the vector in the team column
df[!grepl(paste(remove, collapse='|'), df$team),]

  team conference points
6    C       East      5

Notice that both methods lead to the same result.
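
Because grepl() matches substrings, a pattern such as 'A|B' would also drop a team named 'AB'. If only exact matches should be removed, the `%in%` operator is one alternative. The sketch below uses a made-up data frame `df_demo` (an assumption for illustration, not the data frame above) to compare the two.

```r
# Hypothetical data for illustration only
df_demo <- data.frame(team = c('A', 'AB', 'B', 'C'),
                      points = c(11, 8, 6, 5),
                      stringsAsFactors = FALSE)
remove <- c('A', 'B')

# Substring match: 'AB' is also dropped because it contains 'A'
by_substring <- df_demo[!grepl(paste(remove, collapse = '|'), df_demo$team), ]

# Exact match: only rows whose team is exactly 'A' or 'B' are dropped
by_exact <- df_demo[!(df_demo$team %in% remove), ]
```

With this data, `by_substring` keeps only team 'C', while `by_exact` keeps 'AB' and 'C'.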
