How to Use cbind in Python (Equivalent to R)

In Python, the closest equivalent of the R cbind function depends on the data structure. For NumPy arrays, the hstack function takes a sequence of arrays and joins them horizontally (column-wise); 2D inputs must all have the same number of rows. For pandas DataFrames, which are the closest match to R data frames, the concat() function with axis=1 does the same job, and it is the approach used in the rest of this tutorial.
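
As a brief illustration of the NumPy route, here is a minimal sketch that assumes two small arrays with made-up values. Note that hstack receives the arrays as a single sequence (a tuple here), and that for 1D arrays column_stack is the function that turns each array into a column:

```python
import numpy as np

# two 2D arrays with the same number of rows (illustrative values)
a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[7], [8], [9]])

# hstack takes one sequence of arrays and joins them column-wise
combined = np.hstack((a, b))
print(combined.shape)

# for 1D arrays, column_stack places each array in its own column
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
stacked = np.column_stack((x, y))
print(stacked.shape)
```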


The cbind function in R, short for column-bind, can be used to combine data frames together by their columns.

We can use the concat() function from pandas with axis=1 to perform the equivalent operation in Python:

df3 = pd.concat([df1, df2], axis=1)

The following examples show how to use this function in practice.

Example 1: Use cbind in Python with Equal Index Values

Suppose we have the following two pandas DataFrames:

import pandas as pd

#define DataFrames
df1 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E'],
                    'points': [99, 91, 104, 88, 108]})

print(df1)

  team  points
0    A      99
1    B      91
2    C     104
3    D      88
4    E     108

df2 = pd.DataFrame({'assists': ['A', 'B', 'C', 'D', 'E'],
                    'rebounds': [22, 19, 25, 33, 29]})

print(df2)

  assists  rebounds
0       A        22
1       B        19
2       C        25
3       D        33
4       E        29

We can use the concat() function to quickly bind these two DataFrames together by their columns:

#column-bind two DataFrames into new DataFrame
df3 = pd.concat([df1, df2], axis=1)

#view resulting DataFrame
df3

  team  points assists  rebounds
0    A      99       A        22
1    B      91       B        19
2    C     104       C        25
3    D      88       D        33
4    E     108       E        29

Example 2: Use cbind in Python with Unequal Index Values

Suppose we have the following two pandas DataFrames:

import pandas as pd

#define DataFrames
df1 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E'],
                    'points': [99, 91, 104, 88, 108]})

print(df1)

  team  points
0    A      99
1    B      91
2    C     104
3    D      88
4    E     108

df2 = pd.DataFrame({'assists': ['A', 'B', 'C', 'D', 'E'],
                    'rebounds': [22, 19, 25, 33, 29]})

df2.index = [6, 7, 8, 9, 10]

print(df2)

   assists  rebounds
6        A        22
7        B        19
8        C        25
9        D        33
10       E        29

Notice that the two DataFrames do not have the same index values.
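
One way to confirm this programmatically, shown as a minimal sketch on trimmed-down versions of the two DataFrames (only one column each, for brevity), is to compare the indexes with Index.equals() and look at the labels they share with Index.intersection():

```python
import pandas as pd

# trimmed-down versions of df1 and df2 (one column each, for brevity)
df1 = pd.DataFrame({'points': [99, 91, 104, 88, 108]})
df2 = pd.DataFrame({'rebounds': [22, 19, 25, 33, 29]},
                   index=[6, 7, 8, 9, 10])

# True only if both indexes hold the same labels in the same order
same_index = df1.index.equals(df2.index)
print(same_index)

# labels that appear in both indexes
shared = df1.index.intersection(df2.index)
print(len(shared))
```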

If we attempt to use the concat() function to cbind them together, we’ll get the following result:

#attempt to column-bind two DataFrames
df3 = pd.concat([df1, df2], axis=1)

#view resulting DataFrame
df3

   team  points assists  rebounds
0     A    99.0     NaN       NaN
1     B    91.0     NaN       NaN
2     C   104.0     NaN       NaN
3     D    88.0     NaN       NaN
4     E   108.0     NaN       NaN
6   NaN     NaN       A      22.0
7   NaN     NaN       B      19.0
8   NaN     NaN       C      25.0
9   NaN     NaN       D      33.0
10  NaN     NaN       E      29.0

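This is not the result we wanted. Because concat() aligns rows by index label and the two DataFrames share no labels, every row is padded with NaN values.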

To fix this, we need to first reset the index of each DataFrame before concatenating them together:

import pandas as pd

#define DataFrames
df1 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E'],
                    'points': [99, 91, 104, 88, 108]})

df2 = pd.DataFrame({'assists': ['A', 'B', 'C', 'D', 'E'],
                    'rebounds': [22, 19, 25, 33, 29]})

df2.index = [6, 7, 8, 9, 10]

#reset index of each DataFrame
df1.reset_index(drop=True, inplace=True)
df2.reset_index(drop=True, inplace=True)

#column-bind two DataFrames
df3 = pd.concat([df1, df2], axis=1)

#view resulting DataFrame
df3

  team  points assists  rebounds
0    A      99       A        22
1    B      91       B        19
2    C     104       C        25
3    D      88       D        33
4    E     108       E        29

Notice that this DataFrame matches the one we got in the previous example.
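
If the original indexes need to be preserved, a variant worth considering is to reset the index inside the concat() call instead of using inplace=True. The sketch below uses the same data as above and relies on the fact that reset_index(drop=True) returns a new DataFrame, so df1 and df2 are left unchanged:

```python
import pandas as pd

df1 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E'],
                    'points': [99, 91, 104, 88, 108]})
df2 = pd.DataFrame({'assists': ['A', 'B', 'C', 'D', 'E'],
                    'rebounds': [22, 19, 25, 33, 29]},
                   index=[6, 7, 8, 9, 10])

# reset_index(drop=True) returns a copy with a fresh 0..n-1 index,
# so the original DataFrames keep their own indexes
df3 = pd.concat([df1.reset_index(drop=True),
                 df2.reset_index(drop=True)], axis=1)

print(df3.shape)
print(list(df2.index))
```

This gives the same combined DataFrame while df2 still carries its index of 6 through 10.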
