How do I check if a row in one data frame exists in another in R?

In R, you can check if a row in one data frame exists in another by using the %in% operator. This operator checks whether each element of one vector is present in another vector and returns TRUE or FALSE for each element. Because %in% works on vectors rather than whole rows, you first collapse each row of both data frames into a single string with paste0() and then compare the resulting strings: a row is marked TRUE if its string also appears in the other data frame and FALSE if it does not.
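
As a quick illustration of how %in% behaves on plain vectors, here is a minimal sketch. The vectors x and y are made up for this illustration and are not part of the example that follows.

```r
# %in% tests each element on the left against the whole vector on the right
x <- c('A', 'B', 'C')
y <- c('A', 'C', 'Z')

# one TRUE/FALSE per element of x
result <- x %in% y
result
# [1]  TRUE FALSE  TRUE
```

The result always has the same length as the left-hand vector, which is why it can be stored directly as a new column.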


You can use the following syntax to add a new column to a data frame in R that shows if each row exists in another data frame:

df1$exists <- do.call(paste0, df1) %in% do.call(paste0, df2)

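This particular syntax adds a column called exists to the data frame called df1 that contains TRUE or FALSE to indicate if each row in df1 exists in another data frame called df2. It assumes that both data frames contain the same columns in the same order, since the rows are compared as pasted strings.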

The following example shows how to use this syntax in practice.

Example: Check if Row in One Data Frame Exists in Another in R

Suppose we have the following two data frames in R:

#create first data frame
df1 <- data.frame(team=c('A', 'B', 'C', 'D', 'E'),
                  points=c(12, 15, 22, 29, 24))

#view first data frame
df1

  team points
1    A     12
2    B     15
3    C     22
4    D     29
5    E     24

#create second data frame
df2 <- data.frame(team=c('A', 'D', 'F', 'G', 'H'),
                  points=c(12, 29, 15, 19, 10))

#view second data frame
df2

  team points
1    A     12
2    D     29
3    F     15
4    G     19
5    H     10

We can use the following syntax to add a column called exists to the first data frame that shows if each row exists in the second data frame:

#add new column to df1 that shows if row exists in df2
df1$exists <- do.call(paste0, df1) %in% do.call(paste0, df2)

#view updated data frame
df1

  team points exists
1    A     12   TRUE
2    B     15  FALSE
3    C     22  FALSE
4    D     29   TRUE
5    E     24  FALSE

The new exists column shows if each row in the first data frame exists in the second data frame.

From the output we can see:

  • The first row in df1 does exist in df2.
  • The second row in df1 does not exist in df2.
  • The third row in df1 does not exist in df2.

And so on.
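
The same logical vector can also be used to subset df1 directly instead of storing it as a column. The following is a minimal sketch under that assumption; the data frames are recreated so the snippet runs on its own, and the names in_df2, matched and unmatched are chosen here for illustration.

```r
# recreate the two data frames from the example
df1 <- data.frame(team=c('A', 'B', 'C', 'D', 'E'),
                  points=c(12, 15, 22, 29, 24))
df2 <- data.frame(team=c('A', 'D', 'F', 'G', 'H'),
                  points=c(12, 29, 15, 19, 10))

# logical vector: TRUE where a row of df1 also appears in df2
in_df2 <- do.call(paste0, df1) %in% do.call(paste0, df2)

# rows of df1 that exist in df2
matched <- df1[in_df2, ]

# rows of df1 that do not exist in df2
unmatched <- df1[!in_df2, ]
```

Here matched holds the rows for teams A and D, while unmatched holds the remaining three rows.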

Note that you can also use as.numeric() to display 1s and 0s instead of TRUE or FALSE in the exists column:

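#add new column to df1 that shows if row exists in df2 (1 = yes, 0 = no)
#use only the original columns, since df1 may already contain the exists column
df1$exists <- as.numeric(do.call(paste0, df1[c('team', 'points')]) %in% do.call(paste0, df2))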

#view updated data frame
df1

  team points exists
1    A     12      1
2    B     15      0
3    C     22      0
4    D     29      1
5    E     24      0

A value of 1 indicates that the row in the first data frame exists in the second.

Conversely, a value of 0 indicates that the row in the first data frame does not exist in the second.
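
One caveat with this approach: because paste0() joins the values without a separator, two different rows can occasionally collapse into the same string. The sketch below uses two made-up data frames (a and b, not part of the example above) to show the problem and one possible workaround, which is to use paste() with a separator that is assumed not to occur in the data.

```r
# hypothetical data frames chosen to produce a collision
a <- data.frame(id=c('A1', 'B2'), value=c(23, 5))
b <- data.frame(id=c('A12'), value=c(3))

# without a separator, 'A1' + '23' and 'A12' + '3' both become 'A123'
no_sep <- do.call(paste0, a) %in% do.call(paste0, b)

# with a separator the strings stay distinct: 'A1|23' vs 'A12|3'
with_sep <- do.call(paste, c(a, sep='|')) %in% do.call(paste, c(b, sep='|'))
```

In this sketch no_sep wrongly reports the first row of a as present in b, while with_sep reports no matches. The '|' separator is only an assumption; any character that cannot appear in the data would serve the same purpose.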
