How to Create New Column Using Multiple If Else Conditions in Pandas

You can create a new column in a pandas DataFrame from multiple if else conditions by using the NumPy select() function. For a single condition, the NumPy where() function is enough. It takes three arguments: a boolean condition, the value to use where the condition is True, and the value to use where the condition is False. The select() function generalizes this to several conditions. It takes a list of conditions and a list of results of the same length, and each row receives the result that corresponds to the first condition it satisfies.
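
As a minimal sketch of the three-argument form of np.where(), the following snippet handles a single if else condition. The DataFrame df_demo and the column name rating are made up for illustration only.

```python
import numpy as np
import pandas as pd

# hypothetical data, for illustration only
df_demo = pd.DataFrame({'points': [15, 22, 18, 30]})

# np.where(condition, value_if_true, value_if_false)
df_demo['rating'] = np.where(df_demo['points'] >= 20, 'Good', 'Bad')

print(df_demo)
```

This works well for one condition, but nesting several np.where() calls quickly becomes hard to read, which is where np.select() helps.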


You can use the following syntax to create a new column in a pandas DataFrame using multiple if else conditions:

import numpy as np

#define conditions
conditions = [
    (df['column1'] == 'A') & (df['column2'] < 20),
    (df['column1'] == 'A') & (df['column2'] >= 20),
    (df['column1'] == 'B') & (df['column2'] < 20),
    (df['column1'] == 'B') & (df['column2'] >= 20)
]

#define results
results = ['result1', 'result2', 'result3', 'result4']

#create new column based on conditions in column1 and column2
df['new_column'] = np.select(conditions, results)

This particular example creates a column called new_column whose values are based on the values in column1 and column2 in the DataFrame.
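
One detail worth keeping in mind is that np.select() evaluates the conditions in order and uses the first one that is True for each row, so overlapping conditions should be listed from most specific to least specific. The sketch below illustrates this with hypothetical data. The names scores, overlapping, and labels are assumptions for the example, not part of the syntax above.

```python
import numpy as np
import pandas as pd

# hypothetical data, for illustration only
scores = pd.DataFrame({'value': [5, 15, 25]})

# both conditions are True for value = 25; the first one listed wins
overlapping = [
    scores['value'] > 10,
    scores['value'] > 20
]
labels = ['over_10', 'over_20']

# rows that match no condition receive the default value
scores['label'] = np.select(overlapping, labels, default='10_or_less')

print(scores)
```

Here the row with value 25 is labeled over_10 rather than over_20, because the first condition already matched. Swapping the order of the two conditions would assign over_20 to that row instead.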

The following example shows how to use this syntax in practice.

Example: Create New Column Using Multiple If Else Conditions in Pandas

Suppose we have the following pandas DataFrame that contains information about various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [15, 18, 22, 24, 12, 17, 20, 28]})

#view DataFrame
print(df)

  team  points
0    A      15
1    A      18
2    A      22
3    A      24
4    B      12
5    B      17
6    B      20
7    B      28

Now suppose we would like to create a new column called class that classifies each player into one of the following four groups:

  • Bad_A if team is A and points < 20
  • Good_A if team is A and points ≥ 20
  • Bad_B if team is B and points < 20
  • Good_B if team is B and points ≥ 20

We can use the following syntax to do so:

import numpy as np

#define conditions
conditions = [
    (df['team'] == 'A') & (df['points'] < 20),
    (df['team'] == 'A') & (df['points'] >= 20),
    (df['team'] == 'B') & (df['points'] < 20),
    (df['team'] == 'B') & (df['points'] >= 20)
]

#define results
results = ['Bad_A', 'Good_A', 'Bad_B', 'Good_B']

#create new column based on conditions in team and points
df['class'] = np.select(conditions, results)

#view updated DataFrame
print(df)

  team  points   class
0    A      15   Bad_A
1    A      18   Bad_A
2    A      22  Good_A
3    A      24  Good_A
4    B      12   Bad_B
5    B      17   Bad_B
6    B      20  Good_B
7    B      28  Good_B

The new column called class displays the classification of each player based on the values in the team and points columns.
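
In this example every row matches one of the four conditions. When some rows might match none of them, np.select() accepts a default argument that supplies the value for those rows. If default is omitted, NumPy uses 0, and some recent NumPy versions may raise a dtype error when that integer is combined with string results, so passing a string default is a reasonable precaution. The sketch below assumes a hypothetical DataFrame named players that includes a team C not covered by the conditions.

```python
import numpy as np
import pandas as pd

# hypothetical data: team 'C' matches none of the conditions
players = pd.DataFrame({'team': ['A', 'B', 'C'],
                        'points': [25, 12, 30]})

player_conditions = [
    (players['team'] == 'A') & (players['points'] < 20),
    (players['team'] == 'A') & (players['points'] >= 20),
    (players['team'] == 'B') & (players['points'] < 20),
    (players['team'] == 'B') & (players['points'] >= 20)
]

player_results = ['Bad_A', 'Good_A', 'Bad_B', 'Good_B']

# rows that satisfy no condition receive the default value 'Other'
players['class'] = np.select(player_conditions, player_results, default='Other')

print(players)
```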

Note: The complete documentation for the NumPy select() function is available in the official NumPy reference.
