Table of Contents
Creating a confusion matrix in Google Sheets involves organizing and analyzing data to visualize the performance of a classification model. This involves inputting the predicted and actual values of a dataset into a table, and using built-in functions to calculate the true positive, true negative, false positive, and false negative values. These values are then used to populate the confusion matrix, which displays the accuracy of the model’s predictions. Overall, creating a confusion matrix in Google Sheets provides a comprehensive and easily interpretable way to evaluate the effectiveness of a classification model.
Create a Confusion Matrix in Google Sheets
is a type of regression we use when the is binary.
One common way to evaluate the quality of a logistic regression model is to create a confusion matrix, which is a 2×2 table that shows the predicted values from the model vs. the actual values from the test dataset.
The following step-by-step example shows how to create a confusion matrix in Google Sheets.
Step 1: Enter the Data
First, let’s enter a column of actual values for a response variable along with the predicted values by a logistic regression model:
Step 2: Create the Confusion Matrix
Next, we will create the following outline for a confusion matrix:
Next, we will type the following formula into cell E3:
=COUNTIFS($A$2:$A$21,$D3,$B$2:$B$21,E$2)
We can then click and drag this formula to each remaining cell in the confusion matrix:
Step 3: Calculate Accuracy, Precision and Recall
Once we’ve created the confusion matrix, we can calculate the following metrics:
- Accuracy: Percentage of correct predictions
- Precision: Correct positive predictions relative to total positive predictions
- Recall: Correct positive predictions relative to total actual positives
The higher the accuracy, the better a model is able to correctly classify observations.
In this example, our model has an accuracy of 0.7 which tells us that it correctly classified 70% of observations.
If we’d like, we can compare this accuracy to that of other logistic regression models to determine which model is best at classifying observations into categories of 0 or 1.
Additional Resources
The following tutorials provide additional information about logistic regression: