How to Perform an Outer Join in Pandas (With Example)

An outer join in pandas combines a left and a right DataFrame and returns every row from both, whether or not a row has a match in the other DataFrame. To perform one, call the merge() function and set how='outer'. For example, when two DataFrames are joined on a common column, the result contains every key that appears in either DataFrame, and the columns that come from the other side are left as missing values where there is no match. This type of join is useful when you need all of the data from both sides.


An outer join is a type of join that returns all rows from two pandas DataFrames.

You can use the following basic syntax to perform an outer join in pandas:

import pandas as pd

df1.merge(df2, on='some_column', how='outer')

The following example shows how to use this syntax in practice.

Example: How to Perform an Outer Join in Pandas

Suppose we have the following two pandas DataFrames that contain information about various basketball teams:

import pandas as pd

#create DataFrames
df1 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                    'points': [18, 22, 19, 14, 14, 11, 20, 28]})

df2 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'J', 'K'],
                    'assists': [4, 9, 14, 13, 10, 8]})

#view DataFrames
print(df1)

  team  points
0    A      18
1    B      22
2    C      19
3    D      14
4    E      14
5    F      11
6    G      20
7    H      28

print(df2)

  team  assists
0    A        4
1    B        9
2    C       14
3    D       13
4    J       10
5    K        8

We can use the following code to perform an outer join, matching the rows between the DataFrames based on the values in the team column and keeping all rows from both DataFrames:

#perform outer join
df1.merge(df2, on='team', how='outer')


  team  points  assists
0    A    18.0      4.0
1    B    22.0      9.0
2    C    19.0     14.0
3    D    14.0     13.0
4    E    14.0      NaN
5    F    11.0      NaN
6    G    20.0      NaN
7    H    28.0      NaN
8    J     NaN     10.0
9    K     NaN      8.0

The result is a DataFrame that contains all rows from each DataFrame.

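Notice that NaN values have been filled in for each row whose value in the team column appears in only one of the two DataFrames. Because NaN is a floating-point value, the points and assists columns are also shown as floats (18.0 instead of 18).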
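
If the missing values get in the way of later calculations, there are a couple of common ways to deal with them. The following is a minimal sketch that rebuilds the same two DataFrames and shows two options. The variable names merged, filled, and nullable are made up for this illustration, and using 0 as the fill value is an assumption that may not suit every dataset, since 0 assists is not the same thing as unknown assists.

```python
import pandas as pd

#recreate the two DataFrames from the example
df1 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                    'points': [18, 22, 19, 14, 14, 11, 20, 28]})

df2 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'J', 'K'],
                    'assists': [4, 9, 14, 13, 10, 8]})

#perform outer join
merged = df1.merge(df2, on='team', how='outer')

#inspect the column types after the join
print(merged.dtypes)

#option 1: replace missing values with 0 and convert back to integers
filled = (merged.fillna({'points': 0, 'assists': 0})
                .astype({'points': 'int64', 'assists': 'int64'}))
print(filled)

#option 2: keep the missing values but use the nullable integer type
nullable = merged.astype({'points': 'Int64', 'assists': 'Int64'})
print(nullable)
```

Option 2 keeps the information that a value is missing, which is usually safer when the columns will be averaged or summed later.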

Note: You can find the complete documentation for the merge function in the official pandas documentation.
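
One option described in that documentation is the indicator argument, which can help when you want to see where each row of an outer join came from. The sketch below reuses the DataFrames from the example and assumes you simply want to label and count the rows; the variable names merged and counts are chosen for illustration only.

```python
import pandas as pd

#recreate the two DataFrames from the example
df1 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                    'points': [18, 22, 19, 14, 14, 11, 20, 28]})

df2 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'J', 'K'],
                    'assists': [4, 9, 14, 13, 10, 8]})

#perform outer join and add a _merge column that records the source of each row
merged = df1.merge(df2, on='team', how='outer', indicator=True)

#each row is labeled 'both', 'left_only', or 'right_only'
print(merged[['team', '_merge']])

#count how many rows fall into each category
counts = merged['_merge'].value_counts()
print(counts)
```

Here rows labeled left_only exist only in df1, rows labeled right_only exist only in df2, and rows labeled both were matched on the team column.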
