How do I calculate SST, SSR, and SSE in Python?

Calculating SST, SSR, and SSE in Python means computing three sums of squares for a fitted regression model: the total sum of squares (SST), the regression sum of squares (SSR), and the error sum of squares (SSE). SST measures the total variation of the response around its mean, SSR measures the variation explained by the regression model, and SSE measures the variation the model leaves unexplained. Together they show how well a regression model fits a dataset. With libraries such as pandas, statsmodels, and numpy, each value takes only a line or two of code.

Calculate SST, SSR, and SSE in Python


We often use three different values to measure how well a regression model fits a dataset:

1. Sum of Squares Total (SST) – The sum of squared differences between individual data points (yi) and the mean of the response variable (ȳ).

  • SST = Σ(yi – ȳ)²

2. Sum of Squares Regression (SSR) – The sum of squared differences between predicted data points (ŷi) and the mean of the response variable (ȳ).

  • SSR = Σ(ŷi – ȳ)²

3. Sum of Squares Error (SSE) – The sum of squared differences between predicted data points (ŷi) and observed data points (yi).

  • SSE = Σ(ŷi – yi)²
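The three definitions above translate almost word for word into code. The following is a minimal, dependency-free sketch; the function name sums_of_squares and the toy data are hypothetical, chosen only for illustration. The predictions in y_hat are the least-squares fitted values for the points (0, 1), (1, 3), (2, 2).

```python
def sums_of_squares(y, y_hat):
    """Return (SST, SSR, SSE) for observed values y and predictions y_hat."""
    y_bar = sum(y) / len(y)                                  # mean of the response
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)             # explained variation
    sse = sum((fi - yi) ** 2 for fi, yi in zip(y_hat, y))    # unexplained variation
    return sst, ssr, sse

# Hypothetical toy data: y observed at x = 0, 1, 2, and the fitted line 1.5 + 0.5x
y = [1.0, 3.0, 2.0]
y_hat = [1.5, 2.0, 2.5]

sst, ssr, sse = sums_of_squares(y, y_hat)
print(sst, ssr, sse)  # 2.0 0.5 1.5
```

In this toy case SSR and SSE add up to SST because the predictions come from a least-squares line with an intercept; for arbitrary predictions the three values need not be related this way.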

The following step-by-step example shows how to calculate each of these metrics for a given regression model in Python.

Step 1: Create the Data

First, let’s create a dataset that contains the number of hours studied and exam score received for 20 different students at a certain university:

import pandas as pd

#create pandas DataFrame
df = pd.DataFrame({'hours': [1, 1, 1, 2, 2, 2, 2, 2, 3, 3,
                             3, 4, 4, 4, 5, 5, 6, 7, 7, 8],
                   'score': [68, 76, 74, 80, 76, 78, 81, 84, 86, 83,
                             88, 85, 89, 94, 93, 94, 96, 89, 92, 97]})

#view first five rows of DataFrame
df.head()

	hours	score
0	1	68
1	1	76
2	1	74
3	2	80
4	2	76

Step 2: Fit a Regression Model

Next, we’ll use the OLS() function from the statsmodels library to fit a simple linear regression model using score as the response variable and hours as the predictor variable:

import statsmodels.api as sm

#define response variable
y = df['score']

#define predictor variable
x = df[['hours']]

#add constant to predictor variables
x = sm.add_constant(x)

#fit linear regression model
model = sm.OLS(y, x).fit()
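
To see what this fit computes, note that with one predictor and an intercept the least-squares estimates have a closed form: the slope is the ratio of the x-y cross-deviation sum to the x deviation sum, and the intercept follows from the two means. The sketch below applies that formula to the same data in plain Python. It is an illustration, not how statsmodels works internally, and the variable names are assumptions of this example.

```python
hours = [1, 1, 1, 2, 2, 2, 2, 2, 3, 3,
         3, 4, 4, 4, 5, 5, 6, 7, 7, 8]
score = [68, 76, 74, 80, 76, 78, 81, 84, 86, 83,
         88, 85, 89, 94, 93, 94, 96, 89, 92, 97]

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(score) / n

# Closed-form least-squares estimates for one predictor plus an intercept
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, score))
s_xx = sum((x - x_bar) ** 2 for x in hours)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Fitted values, one per student
fitted = [intercept + slope * x for x in hours]
print(slope, intercept)
```

The slope and intercept printed here should agree with the hours and const entries of model.params up to floating-point error, and fitted should match model.fittedvalues.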

Step 3: Calculate SST, SSR, and SSE

Lastly, we can use the following formulas to calculate the SST, SSR, and SSE values of the model:

import numpy as np

#calculate sse
sse = np.sum((model.fittedvalues - df.score)**2)
print(sse)

331.07488479262696

#calculate ssr
ssr = np.sum((model.fittedvalues - df.score.mean())**2)
print(ssr)

917.4751152073725

#calculate sst via the identity SST = SSR + SSE (holds for OLS with an intercept)
sst = ssr + sse
print(sst)

1248.5499999999995

The metrics turn out to be:

  • Sum of Squares Total (SST): 1248.55
  • Sum of Squares Regression (SSR): 917.4751
  • Sum of Squares Error (SSE): 331.0749

These values satisfy the identity SST = SSR + SSE, which holds for any least-squares fit that includes an intercept:

  • SST = SSR + SSE
  • 1248.55 = 917.4751 + 331.0749
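
Because sst in Step 3 is defined as ssr + sse, an independent check needs SST computed straight from its definition. The sketch below does that with numpy alone. As an assumption of this example, np.polyfit stands in for the statsmodels fit so that the block runs on its own. It also reports the share of variation explained, SSR / SST, which is the model's R-squared.

```python
import numpy as np

hours = np.array([1, 1, 1, 2, 2, 2, 2, 2, 3, 3,
                  3, 4, 4, 4, 5, 5, 6, 7, 7, 8])
score = np.array([68, 76, 74, 80, 76, 78, 81, 84, 86, 83,
                  88, 85, 89, 94, 93, 94, 96, 89, 92, 97])

# Least-squares line with an intercept; polyfit returns the highest power first
slope, intercept = np.polyfit(hours, score, deg=1)
fitted = intercept + slope * hours

# Each sum of squares computed from its own definition
sst_direct = np.sum((score - score.mean()) ** 2)
ssr = np.sum((fitted - score.mean()) ** 2)
sse = np.sum((fitted - score) ** 2)

# The identity should hold up to floating-point rounding
identity_holds = bool(np.isclose(sst_direct, ssr + sse))
r_squared = ssr / sst_direct
print(sst_direct, ssr, sse, identity_holds, r_squared)
```

If the statsmodels model from Step 2 is available, its centered_tss, ess, and ssr attributes report the same three quantities. Note the naming clash: statsmodels uses ssr for the sum of squared residuals, which this article calls SSE, and ess for the explained sum of squares, which this article calls SSR.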

Cite this article

stats writer (2024). How do I calculate SST, SSR, and SSE in Python?. PSYCHOLOGICAL SCALES. Retrieved from https://scales.arabpsychology.com/stats/how-do-i-calculate-sst-ssr-and-sse-in-python/

stats writer. "How do I calculate SST, SSR, and SSE in Python?." PSYCHOLOGICAL SCALES, 1 Jul. 2024, https://scales.arabpsychology.com/stats/how-do-i-calculate-sst-ssr-and-sse-in-python/.
