pandas provides a function called describe() that generates descriptive statistics summarizing the central tendency, dispersion, and shape of a dataset's distribution. To suppress scientific notation in pandas output globally, you can use the pd.set_option() function to set the display.float_format option to a formatting function such as '{:.2f}'.format. This displays floats with two decimal places instead of scientific notation.
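A minimal sketch of the display-option approach, assuming a small illustrative DataFrame with a single sales column. Note that display.float_format takes a callable (here '{:.2f}'.format) rather than a bare format string, and that the setting affects every float pandas displays until it is reset:

```python
import pandas as pd

# small illustrative DataFrame (assumed for this sketch)
df = pd.DataFrame({'sales': [8450550, 406530, 53000, 6000, 2000, 4000, 5400, 6500]})

# display.float_format expects a callable that turns a float into a string
pd.set_option('display.float_format', '{:.2f}'.format)

# render the summary as text using the current display option
formatted = df.describe().to_string()
print(formatted)

# restore the default float display so later output is unaffected
pd.reset_option('display.float_format')
```

Because this only changes how values are displayed, the statistics returned by describe() remain numeric.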
You can use the describe() function to generate descriptive statistics for variables in a pandas DataFrame.
To suppress scientific notation in the output of the describe() function, you can use the following methods:
Method 1: Suppress Scientific Notation When Using describe() with One Column
df['my_column'].describe().apply(lambda x: format(x, 'f'))
Method 2: Suppress Scientific Notation When Using describe() with Multiple Columns
df.describe().apply(lambda x: x.apply('{0:.5f}'.format))
The following examples show how to use each method in practice with the following pandas DataFrame:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'store': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
'sales': [8450550, 406530, 53000, 6000, 2000, 4000, 5400, 6500],
'returns':[2212200, 145200, 300, 2500, 700, 600, 800, 1200]})
#view DataFrame
print(df)
store sales returns
0 A 8450550 2212200
1 A 406530 145200
2 A 53000 300
3 A 6000 2500
4 B 2000 700
5 B 4000 600
6 B 5400 800
7 B 6500 1200
Example 1: Suppress Scientific Notation When Using describe() with One Column
If we use the describe() function to calculate descriptive statistics for the sales column, the values in the output will be displayed in scientific notation:
#calculate descriptive statistics for sales column
df['sales'].describe()
count 8.000000e+00
mean 1.116748e+06
std 2.966552e+06
min 2.000000e+03
25% 5.050000e+03
50% 6.250000e+03
75% 1.413825e+05
max 8.450550e+06
Name: sales, dtype: float64
Notice that each of the values in the output is displayed using scientific notation.
We can use the following syntax to suppress scientific notation in the output:
#calculate descriptive statistics for sales column and suppress scientific notation
df['sales'].describe().apply(lambda x: format(x, 'f'))
count 8.000000
mean 1116747.500000
std 2966551.594104
min 2000.000000
25% 5050.000000
50% 6250.000000
75% 141382.500000
max 8450550.000000
Name: sales, dtype: object
Notice that the values in the output are now shown without scientific notation.
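One caveat worth keeping in mind: the dtype: object line in the output above indicates that the formatted values are strings rather than numbers. The sketch below rebuilds only the sales column from the example and keeps a separate numeric copy for further calculations. The variable names stats and formatted are illustrative, not part of the original example:

```python
import pandas as pd

# sales column from the example DataFrame
df = pd.DataFrame({'sales': [8450550, 406530, 53000, 6000, 2000, 4000, 5400, 6500]})

# keep the numeric statistics for any further calculations
stats = df['sales'].describe()

# build a string version for display only
formatted = stats.apply(lambda x: format(x, 'f'))

print(type(stats['mean']))      # a NumPy float
print(type(formatted['mean']))  # a Python str
print(formatted['mean'])
```

If you need to do arithmetic on the statistics afterward, use the numeric result and apply the formatting only as a final display step.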
Example 2: Suppress Scientific Notation When Using describe() with Multiple Columns
If we use the describe() function to calculate descriptive statistics for each numeric column, the values in the output will be displayed in scientific notation:
#calculate descriptive statistics for each numeric column
df.describe()
sales returns
count 8.000000e+00 8.000000e+00
mean 1.116748e+06 2.954375e+05
std 2.966552e+06 7.761309e+05
min 2.000000e+03 3.000000e+02
25% 5.050000e+03 6.750000e+02
50% 6.250000e+03 1.000000e+03
75% 1.413825e+05 3.817500e+04
max 8.450550e+06 2.212200e+06
Notice that each of the values in the output is displayed using scientific notation.
We can use the following syntax to suppress scientific notation in the output:
#calculate descriptive statistics for numeric columns and suppress scientific notation
df.describe().apply(lambda x: x.apply('{0:.5f}'.format))
sales returns
count 8.00000 8.00000
mean 1116747.50000 295437.50000
std 2966551.59410 776130.93692
min 2000.00000 300.00000
25% 5050.00000 675.00000
50% 6250.00000 1000.00000
75% 141382.50000 38175.00000
max 8450550.00000 2212200.00000
Notice that the values in the output are now shown without scientific notation.
Note that in this example we used the format string '{0:.5f}' to display 5 decimal places in the output.
Feel free to change the 5 to a different number to display a different number of decimal places.
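As a rough illustration of changing the number of decimal places, the sketch below wraps the multi-column method in a small helper. The function name describe_fixed and its decimals parameter are hypothetical, introduced only for this example:

```python
import pandas as pd

def describe_fixed(data, decimals=2):
    """Hypothetical helper: describe() output as fixed-point strings."""
    # build a format string such as '{:.2f}' from the requested precision
    fmt = '{:.' + str(decimals) + 'f}'
    # format every value in every column of the describe() output
    return data.describe().apply(lambda col: col.map(fmt.format))

# same DataFrame as in the examples above
df = pd.DataFrame({'store': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'sales': [8450550, 406530, 53000, 6000, 2000, 4000, 5400, 6500],
                   'returns': [2212200, 145200, 300, 2500, 700, 600, 800, 1200]})

# display the summary with 2 decimal places instead of 5
summary = describe_fixed(df, decimals=2)
print(summary)
```

Calling describe_fixed(df, decimals=0) would display whole numbers only. As with the methods above, the returned values are strings meant for display.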