How to Calculate Standard Deviation in Pandas (With Examples)

Pandas provides the .std() method for calculating the standard deviation of a dataset. After importing the pandas library, call .std() on a single column, a selection of columns, or an entire DataFrame. The .describe() method also reports the standard deviation as part of its summary statistics. The examples below show how to calculate the standard deviation of one column, multiple columns, and all numeric columns.

You can use the std() function to calculate the standard deviation of values in a pandas DataFrame.

You can use the following methods to calculate the standard deviation in practice:

Method 1: Calculate Standard Deviation of One Column

df['column_name'].std()

Method 2: Calculate Standard Deviation of Multiple Columns

df[['column_name1', 'column_name2']].std()

Method 3: Calculate Standard Deviation of All Numeric Columns

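df.std(numeric_only=True)

Note that the std() function ignores NaN values by default (skipna=True) and returns the sample standard deviation (ddof=1, which divides by n - 1). In pandas 2.0 and later, pass numeric_only=True when the DataFrame contains non-numeric columns; otherwise df.std() raises a TypeError.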
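The NaN handling described above can be checked with a small sketch. The Series below is a hypothetical example, not part of the article's DataFrame; skipna and ddof are standard arguments of std().

```python
import numpy as np
import pandas as pd

# hypothetical Series with one missing value (not from the article's data)
s = pd.Series([25, 12, np.nan, 14, 19])

sample_sd = s.std()              # NaN is skipped; divides by n - 1 (ddof=1)
population_sd = s.std(ddof=0)    # NaN is skipped; divides by n
with_nan = s.std(skipna=False)   # NaN is not skipped, so the result is NaN

print(sample_sd, population_sd, with_nan)
```

With the default settings, the result matches what you would get by dropping the missing value first and then calling std().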

The following examples show how to use each method with the following pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#view DataFrame
print(df)

  team  points  assists  rebounds
0    A      25        5        11
1    A      12        7         8
2    B      15        7        10
3    B      14        9         6
4    B      19       12         6
5    B      23        9         5
6    C      25        9         9
7    C      29        4        12

Method 1: Calculate Standard Deviation of One Column

The following code shows how to calculate the standard deviation of one column in the DataFrame:

#calculate standard deviation of 'points' column
df['points'].std() 

6.158617655657106

The standard deviation turns out to be 6.1586.
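As a cross-check, the same value can be reproduced by hand. This is a minimal sketch that assumes the sample formula (divide the sum of squared deviations by n - 1), which is the pandas default; the variable names are illustrative.

```python
import math
import pandas as pd

points = [25, 12, 15, 14, 19, 23, 25, 29]

n = len(points)
mean = sum(points) / n                               # 20.25
squared_devs = sum((x - mean) ** 2 for x in points)  # 265.5
manual_sd = math.sqrt(squared_devs / (n - 1))        # sample formula, n - 1 = 7

pandas_sd = float(pd.Series(points).std())           # pandas default uses ddof=1

print(round(manual_sd, 4), round(pandas_sd, 4))      # 6.1586 6.1586
```

Dividing by n instead of n - 1 would give the population standard deviation, which pandas returns when you pass ddof=0.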

Method 2: Calculate Standard Deviation of Multiple Columns

The following code shows how to calculate the standard deviation of multiple columns in the DataFrame:

#calculate standard deviation of 'points' and 'rebounds' columns
df[['points', 'rebounds']].std()

points      6.158618
rebounds    2.559994
dtype: float64

The standard deviation of the ‘points’ column is 6.1586 and the standard deviation of the ‘rebounds’ column is 2.5600.

Method 3: Calculate Standard Deviation of All Numeric Columns

The following code shows how to calculate the standard deviation of every numeric column in the DataFrame:

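#calculate standard deviation of all numeric columns
df.std(numeric_only=True)

points      6.158618
assists     2.549510
rebounds    2.559994
dtype: float64

Notice that pandas did not calculate the standard deviation of the ‘team’ column since it is not a numeric column. The numeric_only=True argument is what excludes it: in pandas 2.0 and later, calling df.std() without it on a DataFrame that contains a string column raises a TypeError.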
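The introduction also mentions the .describe() method. A minimal sketch of that route, using the same DataFrame, might look like the following; by default describe() summarizes only the numeric columns, and its 'std' row holds the sample standard deviations.

```python
import pandas as pd

df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'points': [25, 12, 15, 14, 19, 23, 25, 29],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

# summary statistics (count, mean, std, min, quartiles, max) for numeric columns
summary = df.describe()

# pull out only the standard deviation row
print(summary.loc['std'])
```

This is convenient when you want the standard deviation alongside the mean and quartiles rather than on its own.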
