
A residuals vs. leverage plot graphs two quantities from a regression analysis against each other: residuals, which are the differences between observed and predicted values, and leverage, which measures how much potential each observation has to influence the fitted model. The plot is used to detect outliers and influential points in the dataset that may distort the model, which helps to assess the quality of the regression fit.

A residuals vs. leverage plot is a type of diagnostic plot that allows us to identify influential observations in a regression model.

Here is how this type of plot appears in the statistical programming language R:

[Figure: residuals vs. leverage plot produced in R]

Each observation from the dataset is shown as a single point within the plot. The x-axis shows the leverage of each point and the y-axis shows the standardized residual of each point.

Leverage measures how far an observation's predictor values lie from those of the other observations. The further away they are, the more potential the observation has to pull the fitted regression line toward itself.

Observations with high leverage can have a strong influence on the coefficients in the regression model. If removing such an observation would change the coefficients noticeably, the observation is considered influential.
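As a rough illustration of how leverage is computed, here is a minimal Python sketch for a regression with a single predictor. In that case the leverage of observation i is 1/n + (x_i - mean(x))^2 / sum_j (x_j - mean(x))^2, which is the i-th diagonal entry of the hat matrix. The function name `leverages` and the data are made up for this example.

```python
def leverages(x):
    """Leverage of each observation in a one-predictor linear regression."""
    n = len(x)
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    # Leverage depends only on the predictor values, not on the response.
    return [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]

# Hypothetical predictor values: the last one sits far from the rest.
x = [1, 2, 3, 4, 5, 20]
h = leverages(x)

for xi, hi in zip(x, h):
    print(f"x = {xi:>2}  leverage = {hi:.3f}")
```

The observation at x = 20 receives by far the largest leverage because it lies furthest from the mean of x. The leverages always sum to the number of coefficients in the model, which is 2 here (intercept and slope).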

Standardized residuals are the residuals (the differences between the actual and predicted values of each observation) divided by an estimate of their standard deviation, which puts them all on a common scale.

It’s worth noting that an observation can have a high absolute value for a standardized residual, yet have a low value for leverage.
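The sketch below shows one way to compute standardized residuals for a one-predictor regression, dividing each residual by s * sqrt(1 - h_i), where s is the residual standard error and h_i is the leverage. The function names and the data are hypothetical. The data were chosen so that one observation in the middle of the x range has a large residual but low leverage.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

def standardized_residuals(x, y):
    """Return (leverages, standardized residuals) for a one-predictor fit."""
    n = len(x)
    intercept, slope = fit_line(x, y)
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    x_mean = sum(x) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    lev = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    # Residual standard error with n - 2 degrees of freedom.
    s = math.sqrt(sum(e ** 2 for e in resid) / (n - 2))
    std = [e / (s * math.sqrt(1 - h)) for e, h in zip(resid, lev)]
    return lev, std

# Hypothetical data: the fourth observation lies well above the trend.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 14.0, 9.8, 12.1, 13.9, 16.2]
lev, std = standardized_residuals(x, y)

for i, (h, r) in enumerate(zip(lev, std), start=1):
    print(f"obs {i}: leverage = {h:.3f}, standardized residual = {r:+.2f}")
```

In this made-up dataset the fourth observation has the largest standardized residual in absolute value, yet its leverage is among the smallest because its x value is close to the mean.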

How to Interpret a Residuals vs. Leverage Plot

If any point in this plot falls outside the Cook’s distance contours (the red dashed lines), it is considered to be an influential observation. By default, R draws these contours at Cook’s distance values of 0.5 and 1.
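Cook's distance combines the two axes of the plot into a single number. For observation i it can be written as D_i = (r_i^2 / p) * h_i / (1 - h_i), where r_i is the standardized residual, h_i is the leverage, and p is the number of coefficients in the model. The following is a minimal sketch for a one-predictor regression. The function name, the data, and the 0.5 cutoff are assumptions made for this illustration, not fixed rules.

```python
def cooks_distances(x, y):
    """Cook's distance of each observation in a one-predictor regression."""
    n, p = len(x), 2  # p = number of coefficients (intercept and slope)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in resid) / (n - p)
    lev = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    # Squared standardized residual, scaled by p and by h / (1 - h).
    return [(e ** 2 / (mse * (1 - h))) / p * h / (1 - h)
            for e, h in zip(resid, lev)]

# Hypothetical data: the last point is far out in x and off the trend.
x = [1, 2, 3, 4, 5, 6, 7, 20]
y = [2.0, 4.1, 5.9, 8.2, 9.9, 12.1, 14.0, 10.0]
cooks = cooks_distances(x, y)

threshold = 0.5  # assumed cutoff, matching the inner contour mentioned above
flagged = [i for i, d in enumerate(cooks, start=1) if d > threshold]
print("Observations beyond the threshold:", flagged)
```

With this made-up data only the last observation is flagged, since it has both high leverage and a residual that is large relative to its leverage.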

Let’s refer to the residuals vs. leverage plot from earlier:

[Figure: the residuals vs. leverage plot shown earlier]

In the example above, we can see that observation #10 lies closest to the Cook’s distance line, but it doesn’t fall outside of it. This means there are no influential points in our regression model.

However, suppose we had the following residuals vs. leverage plot:

[Figure: residuals vs. leverage plot in which observation #1 lies outside the red dashed lines]

We can see that observation #1 in the top right corner falls outside of the red dashed lines. This indicates that it is an influential point.

This means that if we removed this observation from our dataset and fit the regression model again, the coefficients of the model would change substantially.
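To make this concrete, the sketch below fits a one-predictor regression twice, once with and once without a single influential point, and prints both sets of coefficients. The helper `fit_line` and the data are hypothetical and serve only to illustrate the idea.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

# Hypothetical data: the last observation is the influential one.
x = [1, 2, 3, 4, 5, 6, 7, 20]
y = [2.0, 4.1, 5.9, 8.2, 9.9, 12.1, 14.0, 10.0]

full_intercept, full_slope = fit_line(x, y)
# Refit after dropping the last observation.
reduced_intercept, reduced_slope = fit_line(x[:-1], y[:-1])

print(f"all observations:    intercept = {full_intercept:.2f}, slope = {full_slope:.2f}")
print(f"without last point:  intercept = {reduced_intercept:.2f}, slope = {reduced_slope:.2f}")
```

In this made-up example, dropping one observation out of eight changes the slope by roughly a factor of six, which is the kind of shift that makes an observation influential.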

How to Handle Influential Observations

If you create a residuals vs. leverage plot for a model and you find that one or more observations are identified as influential, there are a few things you can do:

1. Verify that the observation is not an error.

Before you take any action, you should first verify that the influential observation(s) are not a result of a data entry error or some other odd occurrence.

2. Attempt to fit another regression model.

Influential observations could indicate that the model you specified does not provide a good fit to the data. In this case, you may try a different model specification, such as a nonlinear model.

3. Remove the influential observations.

Lastly, you may decide to simply remove the influential observations if the model you specified seems to fit the data well except for the one or two influential observations.

