In pandas, the str.strip() method removes whitespace characters from the start and end of each string in a column. You call it through the .str accessor on the column, and it returns a new Series with the leading and trailing whitespace removed. The original column is unchanged unless you assign the result back to it. This is useful for cleaning data so that values that differ only by stray spaces are treated as the same value.
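As a minimal sketch of that behavior, the snippet below strips a small Series and confirms that the original values are left untouched. The sample data and the name `city` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data; the name 'city' is illustrative only
raw = pd.Series(['  Austin', 'Boston\t', '\n New York \n'], name='city')

# .str.strip() builds a new Series; `raw` itself is left untouched
cleaned = raw.str.strip()

print(raw.tolist())      # original values still carry their whitespace
print(cleaned.tolist())  # ['Austin', 'Boston', 'New York']
```

Spaces, tabs, and newlines at either end are removed, while the space inside 'New York' is kept.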
You can use the following methods to strip whitespace from columns in a pandas DataFrame:
Method 1: Strip Whitespace from One Column
df['my_column'] = df['my_column'].str.strip()
Method 2: Strip Whitespace from All String Columns
df = df.apply(lambda x: x.str.strip() if x.dtype == 'object' else x)
The examples below show how to use each method in practice with this pandas DataFrame:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'team': ['Mavs', ' Heat', ' Nets ', 'Cavs', 'Hawks', 'Jazz '],
                   'position': ['Point Guard', ' Small Forward', 'Center ', 'Power Forward', ' Point Guard ', 'Center'],
                   'points': [11, 8, 10, 6, 22, 29]})
#view DataFrame
print(df)
team position points
0 Mavs Point Guard 11
1 Heat Small Forward 8
2 Nets Center 10
3 Cavs Power Forward 6
4 Hawks Point Guard 22
5 Jazz Center 29
Example 1: Strip Whitespace from One Column
The following code shows how to strip whitespace from every string in the position column:
#strip whitespace from position column
df['position'] = df['position'].str.strip()
#view updated DataFrame
print(df)
team position points
0 Mavs Point Guard 11
1 Heat Small Forward 8
2 Nets Center 10
3 Cavs Power Forward 6
4 Hawks Point Guard 22
5 Jazz Center 29
Notice that the leading and trailing whitespace has been stripped from each string in the position column. Spaces inside a value, such as the one in Point Guard, are left in place.
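Because print(df) pads columns for alignment, leftover whitespace is hard to spot in the printed table. One way to check the result, sketched here under the assumption that the column holds only strings, is to count how many values differ from their stripped form before and after cleaning. The variable names `padded_before` and `padded_after` are hypothetical.

```python
import pandas as pd

# Same 'position' values as the DataFrame above
df = pd.DataFrame({'position': ['Point Guard', ' Small Forward', 'Center ',
                                'Power Forward', ' Point Guard ', 'Center']})

# Count values that differ from their stripped form before cleaning
padded_before = (df['position'] != df['position'].str.strip()).sum()

df['position'] = df['position'].str.strip()

# After cleaning, no value should differ from its stripped form
padded_after = (df['position'] != df['position'].str.strip()).sum()

print(padded_before, padded_after)  # 3 0
print(df['position'].tolist())      # list output shows the exact strings
```

Printing the column with tolist() shows each string in quotes, so any remaining padding would be visible.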
Example 2: Strip Whitespace from All String Columns
The following code shows how to strip whitespace from each string in all string columns of the DataFrame:
#strip whitespace from all string columns
df = df.apply(lambda x: x.str.strip() if x.dtype == 'object' else x)
#view updated DataFrame
print(df)
team position points
0 Mavs Point Guard 11
1 Heat Small Forward 8
2 Nets Center 10
3 Cavs Power Forward 6
4 Hawks Point Guard 22
5 Jazz Center 29
Notice that leading and trailing whitespace has been stripped from both the team and position columns, which are the two string columns in the DataFrame. The numeric points column is returned unchanged.
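One caveat: the check x.dtype == 'object' only matches columns stored as object. A column stored with pandas's dedicated string dtype (for example, one converted with astype('string')) is not object, so that check would skip it. The sketch below is one possible alternative, not the only approach. It uses the type-checking functions in pandas.api.types, and the helper name `strip_if_text` is hypothetical.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype

df = pd.DataFrame({'team': ['Mavs', ' Heat', ' Nets '],
                   'position': ['Center ', ' Point Guard ', 'Center'],
                   'points': [11, 8, 10]})

# Store one column with the dedicated string dtype instead of object
df['team'] = df['team'].astype('string')

def strip_if_text(col):
    """Hypothetical helper: strip text-like columns, pass others through."""
    if is_object_dtype(col) or is_string_dtype(col):
        return col.str.strip()
    return col

df = df.apply(strip_if_text)

print(df['team'].tolist())      # ['Mavs', 'Heat', 'Nets']
print(df['position'].tolist())  # ['Center', 'Point Guard', 'Center']
```

This sketch assumes every text column contains only strings. In an object column that mixes strings with other types, .str.strip() returns NaN for the non-string values.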