How to calculate a moving average by group in pandas?

To calculate a moving average by group in pandas, you can use the groupby() method to group your data, followed by the rolling() method to apply a moving window to your grouped data, and then the mean() method to calculate the mean of each window. You can also use other aggregate functions, such as sum(), to calculate other metrics in the same way.


You can use the following basic syntax to calculate a moving average by group in pandas:

#calculate 3-period moving average of 'values' by 'group'
df.groupby('group')['values'].transform(lambda x: x.rolling(3, 1).mean())

The following example shows how to use this syntax in practice.

Example: Calculate Moving Average by Group in Pandas

Suppose we have the following pandas DataFrame that shows the total sales made by two stores during five sales periods:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'store': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
                   'period': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
                   'sales': [7, 7, 9, 13, 14, 13, 13, 19, 20, 26]})

#view DataFrame
df

	store	period	sales
0	A	1	7
1	A	2	7
2	A	3	9
3	A	4	13
4	A	5	14
5	B	1	13
6	B	2	13
7	B	3	19
8	B	4	20
9	B	5	26

We can use the following code to calculate a 3-day moving average of sales for each store:

#calculate 3-day moving average of sales by store
df['ma'] = df.groupby('store')['sales'].transform(lambda x: x.rolling(3, 1).mean())

#view updated DataFrame
df

        store	period	sales	ma
0	A	1	7	7.000000
1	A	2	7	7.000000
2	A	3	9	7.666667
3	A	4	13	9.666667
4	A	5	14	12.000000
5	B	1	13	13.000000
6	B	2	13	13.000000
7	B	3	19	15.000000
8	B	4	20	17.333333
9	B	5	26	21.666667

Note: x.rolling(3, 1) means to calculate a 3-period moving average and require 1 as the minimum number of periods.

The ‘ma’ column shows the  3-day moving average of sales for each store.

To calculate a different moving average, simply change the value in the rolling() function.

For example, we could calculate the 2-day moving average of sales for each store instead:

#calculate 2-day moving average of sales by store
df['ma'] = df.groupby('store')['sales'].transform(lambda x: x.rolling(2, 1).mean())

#view updated DataFrame
df

        store	period	sales	ma
0	A	1	7	7.0
1	A	2	7	7.0
2	A	3	9	8.0
3	A	4	13	11.0
4	A	5	14	13.5
5	B	1	13	13.0
6	B	2	13	13.0
7	B	3	19	16.0
8	B	4	20	19.5
9	B	5	26	23.0

x