How to Merge Multiple Data Frames in R (With Examples)

In R, the merge() function combines two data frames into a single data frame. You can specify the columns to use as keys with the by argument and choose the type of join (inner, left, right, or full) with the all, all.x, and all.y arguments. Because merge() accepts only two data frames at a time, merging three or more requires applying it repeatedly.
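As a quick illustration of the join types, here is a minimal sketch that merges two small data frames. The data frames left_df and right_df are hypothetical and are used only for this illustration.

```r
# Hypothetical example data, used only to illustrate the join types
left_df  <- data.frame(id = c(1, 2, 3), x = c("a", "b", "c"))
right_df <- data.frame(id = c(2, 3, 4), y = c(10, 20, 30))

# inner join (default): keep only ids found in both data frames
inner_res <- merge(left_df, right_df, by = "id")

# left join: keep every row of left_df
left_res <- merge(left_df, right_df, by = "id", all.x = TRUE)

# right join: keep every row of right_df
right_res <- merge(left_df, right_df, by = "id", all.y = TRUE)

# full join: keep every row of both data frames
full_res <- merge(left_df, right_df, by = "id", all = TRUE)
```

Rows that have no match in the other data frame are filled with NA in the columns that come from it.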


You can use one of the following two methods to merge multiple data frames in R:

Method 1: Use Base R

#put all data frames into list
df_list <- list(df1, df2, df3)

#merge all data frames in list
Reduce(function(x, y) merge(x, y, all=TRUE), df_list)

Method 2: Use Tidyverse

library(tidyverse)

#put all data frames into list
df_list <- list(df1, df2, df3)

#merge all data frames in list
df_list %>% reduce(full_join, by='variable_name')

The following examples show how to use each method in practice.

Method 1: Merge Multiple Data Frames Using Base R

Suppose we have the following data frames in R:

#define data frames
df1 <- data.frame(id=c(1, 2, 3, 4, 5),
                  revenue=c(34, 36, 40, 49, 43))

df2 <- data.frame(id=c(1, 2, 5, 6, 7),
                  expenses=c(22, 26, 31, 40, 20))

df3 <- data.frame(id=c(1, 2, 4, 5, 7),
                  profit=c(12, 10, 14, 12, 9))

We can use the following syntax to merge all of the data frames using functions from base R:

#put all data frames into list
df_list <- list(df1, df2, df3)

#merge all data frames together
Reduce(function(x, y) merge(x, y, all=TRUE), df_list)

  id revenue expenses profit
1  1      34       22     12
2  2      36       26     10
3  3      40       NA     NA
4  4      49       NA     14
5  5      43       31     12
6  6      NA       40     NA
7  7      NA       20      9

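Notice that every “id” value from each original data frame is included in the final data frame, with NA filling in wherever a data frame has no row for that id. This is because all=TRUE performs a full outer join.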
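If you only want the ids that appear in every data frame, one option is to drop all=TRUE so that merge() falls back to its default inner join. The sketch below reuses the same three data frames and names the key explicitly with by="id"; the variable name inner_merged is chosen here for illustration.

```r
#define the same data frames used above
df1 <- data.frame(id=c(1, 2, 3, 4, 5),
                  revenue=c(34, 36, 40, 49, 43))

df2 <- data.frame(id=c(1, 2, 5, 6, 7),
                  expenses=c(22, 26, 31, 40, 20))

df3 <- data.frame(id=c(1, 2, 4, 5, 7),
                  profit=c(12, 10, 14, 12, 9))

#put all data frames into list
df_list <- list(df1, df2, df3)

#inner join: keep only ids present in all three data frames
inner_merged <- Reduce(function(x, y) merge(x, y, by="id"), df_list)

inner_merged
```

Only ids 1, 2, and 5 appear in all three data frames, so the result has three rows and no NA values.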

Method 2: Merge Multiple Data Frames Using Tidyverse

Suppose we have the following data frames in R:

#define data frames
df1 <- data.frame(id=c(1, 2, 3, 4, 5),
                  revenue=c(34, 36, 40, 49, 43))

df2 <- data.frame(id=c(1, 2, 5, 6, 7),
                  expenses=c(22, 26, 31, 40, 20))

df3 <- data.frame(id=c(1, 2, 4, 5, 7),
                  profit=c(12, 10, 14, 12, 9))

We can use the following syntax to merge all of the data frames using functions from the tidyverse, a collection of packages designed for data science in R:

library(tidyverse)

#put all data frames into list
df_list <- list(df1, df2, df3)

#merge all data frames together
df_list %>% reduce(full_join, by='id')

  id revenue expenses profit
1  1      34       22     12
2  2      36       26     10
3  3      40       NA     NA
4  4      49       NA     14
5  5      43       31     12
6  6      NA       40     NA
7  7      NA       20      9
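The same pattern works with the other dplyr join functions. As a sketch, the example below swaps full_join for left_join so that only the ids in the first data frame are kept, and it loads just the dplyr and purrr packages instead of the whole tidyverse. The variable name left_merged is chosen here for illustration.

```r
library(dplyr)  #provides left_join() and the %>% pipe
library(purrr)  #provides reduce()

#define the same data frames used above
df1 <- data.frame(id=c(1, 2, 3, 4, 5),
                  revenue=c(34, 36, 40, 49, 43))

df2 <- data.frame(id=c(1, 2, 5, 6, 7),
                  expenses=c(22, 26, 31, 40, 20))

df3 <- data.frame(id=c(1, 2, 4, 5, 7),
                  profit=c(12, 10, 14, 12, 9))

#put all data frames into list
df_list <- list(df1, df2, df3)

#left join: keep only the ids found in the first data frame
left_merged <- df_list %>% reduce(left_join, by='id')

left_merged
```

The result keeps ids 1 through 5 from df1. Ids 6 and 7 are dropped, and any id missing from df2 or df3 gets NA in the corresponding column.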

Note: The tidyverse approach is often noticeably quicker when you’re working with very large data frames.
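How large the speed difference is depends on your data and package versions, so it can be worth timing both approaches yourself. The following is a rough sketch that uses system.time() on synthetic data; the helper make_df, the row count n, and the column names are made up for this illustration.

```r
library(dplyr)
library(purrr)

#create synthetic data frames with unique ids (illustrative only)
set.seed(1)
n <- 50000
make_df <- function(col) {
  out <- data.frame(id = sample(n * 2, n))
  out[[col]] <- rnorm(n)
  out
}
big_list <- list(make_df("a"), make_df("b"), make_df("c"))

#time the base R approach
base_time <- system.time(
  base_res <- Reduce(function(x, y) merge(x, y, by="id", all=TRUE), big_list)
)

#time the tidyverse approach
tidy_time <- system.time(
  tidy_res <- big_list %>% reduce(full_join, by='id')
)

#compare elapsed times in seconds
base_time["elapsed"]
tidy_time["elapsed"]
```

Both approaches should return the same set of ids, although the row order can differ because merge() sorts by the key column by default.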
