How can I display the full content of a column in PySpark?

To display the full content of a column in PySpark, call the show() method with the truncate parameter set to False. The column is then printed without any truncation. Alternatively, you can use select() to pick the desired column and collect() to retrieve all of its values. This returns the complete contents of the column as a list of Row objects on the driver.

PySpark: Show Full Column Content


You can use the following methods to force a PySpark DataFrame to show the full content of each column, regardless of width:

Method 1: Use truncate=False

df.show(truncate=False) 

Method 2: Use truncate=0

df.show(truncate=0)

The examples below show how to use each method in practice with this PySpark DataFrame:

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

#define data
data = [['A', 'Andy Bob Chad Doug Eric', 136], 
        ['B', 'Frank Henry', 223], 
        ['C', 'Ian John Ken Liam Mike Noah', 450], 
        ['D', 'Oscar Prim', 290], 
        ['E', 'Quentin Ross Sarah', 189]]
  
#define column names
columns = ['store', 'employees', 'sales'] 
  
#create dataframe using data and column names
df = spark.createDataFrame(data, columns) 
  
#view dataframe
df.show()

+-----+--------------------+-----+
|store|           employees|sales|
+-----+--------------------+-----+
|    A|Andy Bob Chad Dou...|  136|
|    B|         Frank Henry|  223|
|    C|Ian John Ken Liam...|  450|
|    D|          Oscar Prim|  290|
|    E|  Quentin Ross Sarah|  189|
+-----+--------------------+-----+

Notice that some of the values in the employees column are cut off. By default, show() truncates any string longer than 20 characters, keeping the first 17 characters and appending "...".

Example 1: Show Full Column Content Using truncate=False

We can use the truncate=False argument to show the full content of each column in the PySpark DataFrame:

#view dataframe with full column content
df.show(truncate=False)

+-----+---------------------------+-----+
|store|employees                  |sales|
+-----+---------------------------+-----+
|A    |Andy Bob Chad Doug Eric    |136  |
|B    |Frank Henry                |223  |
|C    |Ian John Ken Liam Mike Noah|450  |
|D    |Oscar Prim                 |290  |
|E    |Quentin Ross Sarah         |189  |
+-----+---------------------------+-----+

Notice that we can now see the full content of the employees column.

Example 2: Show Full Column Content Using truncate=0

We can also use the truncate=0 argument to show the full content of each column in the PySpark DataFrame:

#view dataframe with full column content
df.show(truncate=0)

+-----+---------------------------+-----+
|store|employees                  |sales|
+-----+---------------------------+-----+
|A    |Andy Bob Chad Doug Eric    |136  |
|B    |Frank Henry                |223  |
|C    |Ian John Ken Liam Mike Noah|450  |
|D    |Oscar Prim                 |290  |
|E    |Quentin Ross Sarah         |189  |
+-----+---------------------------+-----+

Once again, we can now see the full content of the employees column.

Note: You can find the complete documentation for the PySpark show function in the official PySpark API reference.
