
How to Rename Columns in PySpark: A Simple Guide

PySpark is a powerful tool for data manipulation and transformation. One common task in data processing is renaming columns in a DataFrame. The usual way to do this in PySpark is the DataFrame method “withColumnRenamed”, which takes two arguments: the existing column name and the new column name. It returns a new DataFrame with the column renamed; the original DataFrame is not modified, so the result must be assigned to a variable.

For example, if we have a DataFrame with columns “id”, “name”, and “age”, and we want to rename the “age” column to “years”, we can use the following code:

df = df.withColumnRenamed("age", "years")

This returns a new DataFrame with columns “id”, “name”, and “years”. To rename several columns, chain multiple “withColumnRenamed” calls, one per column.

In summary, “withColumnRenamed” covers most renaming needs in PySpark: call it once for each column you want to rename, and assign the result to a variable because the method returns a new DataFrame.

Rename Columns in PySpark (With Examples)


You can use the following methods to rename columns in a PySpark DataFrame:

Method 1: Rename One Column

#rename 'conference' column to 'conf'
df = df.withColumnRenamed('conference', 'conf')

Method 2: Rename Multiple Columns

#rename 'conference' and 'team' columns
df = (df.withColumnRenamed('conference', 'conf')
        .withColumnRenamed('team', 'team_name'))

Method 3: Rename All Columns

#specify new column names to use
col_names = ['the_team', 'the_conf', 'points_scored', 'total_assists']

#rename all column names with new names
df = df.toDF(*col_names)

The examples below show how to use each of these methods in practice with this PySpark DataFrame:

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

#define data
data = [['A', 'East', 11, 4], 
        ['A', 'East', 8, 9], 
        ['A', 'East', 10, 3], 
        ['B', 'West', 6, 12], 
        ['B', 'West', 6, 4], 
        ['C', 'East', 5, 2]] 
  
#define column names
columns = ['team', 'conference', 'points', 'assists'] 
  
#create dataframe using data and column names
df = spark.createDataFrame(data, columns) 
  
#view dataframe
df.show()

+----+----------+------+-------+
|team|conference|points|assists|
+----+----------+------+-------+
|   A|      East|    11|      4|
|   A|      East|     8|      9|
|   A|      East|    10|      3|
|   B|      West|     6|     12|
|   B|      West|     6|      4|
|   C|      East|     5|      2|
+----+----------+------+-------+

Example 1: Rename One Column in PySpark

We can use the following syntax to rename just the conference column in the DataFrame:

#rename 'conference' column to 'conf'
df = df.withColumnRenamed('conference', 'conf')

#view updated DataFrame
df.show()

+----+----+------+-------+
|team|conf|points|assists|
+----+----+------+-------+
|   A|East|    11|      4|
|   A|East|     8|      9|
|   A|East|    10|      3|
|   B|West|     6|     12|
|   B|West|     6|      4|
|   C|East|     5|      2|
+----+----+------+-------+

Notice that only the conference column has been renamed.
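
One detail worth knowing: withColumnRenamed does nothing if the DataFrame has no column with the given existing name, so a typo in that name passes silently. The sketch below shows one possible guard. The helper name rename_strict is hypothetical and not part of PySpark; it only relies on the df.columns attribute and the withColumnRenamed method.

```python
def rename_strict(df, old_name, new_name):
    """Rename one column, raising an error if the source column is absent.

    rename_strict is a hypothetical helper, not a PySpark function.
    """
    # df.columns is the list of current column names
    if old_name not in df.columns:
        raise ValueError(
            f"Column {old_name!r} not found; available columns: {df.columns}"
        )
    # withColumnRenamed returns a new DataFrame; df itself is unchanged
    return df.withColumnRenamed(old_name, new_name)
```

With the original DataFrame from this guide, rename_strict(df, 'conference', 'conf') returns the renamed DataFrame, while a misspelled name such as 'confrence' raises a ValueError instead of returning the data unchanged.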

Example 2: Rename Multiple Columns in PySpark

We can use the following syntax to rename the conference and team columns in the DataFrame:

#rename 'conference' and 'team' columns
df = (df.withColumnRenamed('conference', 'conf')
        .withColumnRenamed('team', 'team_name'))

#view updated DataFrame
df.show()

+---------+----+------+-------+
|team_name|conf|points|assists|
+---------+----+------+-------+
|        A|East|    11|      4|
|        A|East|     8|      9|
|        A|East|    10|      3|
|        B|West|     6|     12|
|        B|West|     6|      4|
|        C|East|     5|      2|
+---------+----+------+-------+

Notice that the conference and team columns have been renamed while all other column names have remained the same.
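
When there are many columns to rename, the chain can get long. A minimal sketch of a dictionary-driven alternative follows. The helper name rename_columns is hypothetical. It assumes that DataFrame.withColumnsRenamed, which accepts a dictionary of old to new names, is available from Spark 3.4 onward, and it falls back to repeated withColumnRenamed calls otherwise. The two paths should agree as long as no new name collides with an existing column name.

```python
def rename_columns(df, mapping):
    """Rename several columns from an {old_name: new_name} dictionary.

    rename_columns is a hypothetical helper, not a PySpark function.
    """
    # Assumption: withColumnsRenamed exists on Spark 3.4 and later
    if hasattr(df, "withColumnsRenamed"):
        return df.withColumnsRenamed(mapping)
    # Fallback for older versions: one withColumnRenamed call per pair
    for old_name, new_name in mapping.items():
        df = df.withColumnRenamed(old_name, new_name)
    return df
```

For the DataFrame in this guide, the call would be df = rename_columns(df, {'conference': 'conf', 'team': 'team_name'}).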

Example 3: Rename All Columns in PySpark

We can use the following syntax to rename all columns in the DataFrame:

#specify new column names to use
col_names = ['the_team', 'the_conf', 'points_scored', 'total_assists']

#rename all column names with new names
df = df.toDF(*col_names)

#view updated DataFrame
df.show()

+--------+--------+-------------+-------------+
|the_team|the_conf|points_scored|total_assists|
+--------+--------+-------------+-------------+
|       A|    East|           11|            4|
|       A|    East|            8|            9|
|       A|    East|           10|            3|
|       B|    West|            6|           12|
|       B|    West|            6|            4|
|       C|    East|            5|            2|
+--------+--------+-------------+-------------+

Notice that all of the column names have been renamed based on the new names that we specified.
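
Because toDF expects one new name for each existing column, the list of names is often easier to derive from df.columns than to type by hand. The sketch below illustrates that idea. The helper name rename_all and its transform parameter are hypothetical and not part of PySpark.

```python
def rename_all(df, transform):
    """Rename every column by applying transform to its current name.

    rename_all is a hypothetical helper built on DataFrame.toDF.
    """
    # Build exactly one new name per existing column, preserving order
    new_names = [transform(name) for name in df.columns]
    # toDF returns a new DataFrame with the given column names
    return df.toDF(*new_names)
```

For example, rename_all(df, str.upper) would upper-case every column name, and rename_all(df, lambda name: 'the_' + name) would add a prefix to each one.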


Cite this article

stats writer (2026). How to Rename Columns in PySpark: A Simple Guide. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/how-do-you-rename-columns-in-pyspark-and-what-are-some-examples-of-this-process/

stats writer. "How to Rename Columns in PySpark: A Simple Guide." PSYCHOLOGICAL SCALES, 6 Feb. 2026, https://scales.arabpsychology.com/stats/how-do-you-rename-columns-in-pyspark-and-what-are-some-examples-of-this-process/.
