
To create a scatterplot with a regression line in R, use the “plot” function to plot two numerical variables against each other, fit a linear model with the “lm” function, and pass the fitted model to the “abline” function. This draws the least-squares line, the straight line that best fits the data points, on top of the scatterplot. You can customize the appearance of the scatterplot and regression line with arguments to “plot” and “abline” such as color, point shape, and line type. The resulting plot lets you visually assess whether the relationship between the two variables is approximately linear.

Create a Scatterplot with a Regression Line in R

Often when we perform simple linear regression, we’re interested in creating a scatterplot to visualize the various combinations of x and y values.

Fortunately, R makes it easy to create scatterplots using the plot() function. For example:

#create some fake data
data <- data.frame(x = c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9, 10, 11, 11),
                   y = c(13, 14, 17, 12, 23, 24, 25, 25, 24, 28, 32, 33, 35, 40, 41))

#create scatterplot of data
plot(data$x, data$y)

It’s also easy to add a regression line to the scatterplot using the abline() function.

For example:

#fit a simple linear regression model
model <- lm(y ~ x, data = data)

#add the fitted regression line to the scatterplot
abline(model)
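
Passing the model to abline() is shorthand for drawing the line defined by the fitted intercept and slope. As a minimal sketch of that equivalence, using the same fake data as above (the variable name coefs is only illustrative), the coefficients can be pulled out with coef() and handed to abline() explicitly:

```r
#same fake data as above
data <- data.frame(x = c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9, 10, 11, 11),
                   y = c(13, 14, 17, 12, 23, 24, 25, 25, 24, 28, 32, 33, 35, 40, 41))

#fit a simple linear regression model
model <- lm(y ~ x, data = data)

#extract the fitted coefficients (named "(Intercept)" and "x")
coefs <- coef(model)

#draw the same line by giving abline() the intercept (a) and slope (b)
plot(data$x, data$y)
abline(a = coefs[["(Intercept)"]], b = coefs[["x"]])
```

This should produce the same line as abline(model); the explicit form is mainly useful when the intercept and slope come from somewhere other than an lm() object.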

We can also add confidence interval lines to the plot by using the predict() function:

#define range of x values
newx = seq(min(data$x),max(data$x),by = 1)

#find 95% confidence interval for the range of x values 
conf_interval <- predict(model, newdata=data.frame(x=newx), interval="confidence",
                         level = 0.95)

#create scatterplot of values with regression line 
plot(data$x, data$y)
abline(model)

#add dashed lines (lty=2) for the 95% confidence interval
lines(newx, conf_interval[,2], col="blue", lty=2)
lines(newx, conf_interval[,3], col="blue", lty=2)
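
As an alternative to dashed lines, the confidence interval can be drawn as a shaded band. The following is a sketch rather than the only way to do it: it assumes the same fake data, uses a finer grid of x values (length.out = 100, an arbitrary choice) so the band looks smooth, and refers to the interval columns by the names predict() gives them ("lwr" and "upr") instead of by position.

```r
#same fake data and model as above
data <- data.frame(x = c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9, 10, 11, 11),
                   y = c(13, 14, 17, 12, 23, 24, 25, 25, 24, 28, 32, 33, 35, 40, 41))
model <- lm(y ~ x, data = data)

#define a fine grid of x values so the band edges look smooth
newx <- seq(min(data$x), max(data$x), length.out = 100)

#find 95% confidence interval over the grid
conf_interval <- predict(model, newdata = data.frame(x = newx),
                         interval = "confidence", level = 0.95)

#create scatterplot of values
plot(data$x, data$y, pch = 16)

#shade the region between the lower and upper bounds
polygon(c(newx, rev(newx)),
        c(conf_interval[, "lwr"], rev(conf_interval[, "upr"])),
        col = adjustcolor("blue", alpha.f = 0.2), border = NA)

#add the fitted regression line on top of the band
abline(model)
```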

Or we could instead add prediction interval lines to the plot by specifying the interval type within the predict() function:

#define range of x values
newx = seq(min(data$x),max(data$x),by = 1)

#find 95% prediction interval for the range of x values 
pred_interval <- predict(model, newdata=data.frame(x=newx), interval="prediction",
                         level = 0.95)

#create scatterplot of values with regression line 
plot(data$x, data$y)
abline(model)

#add dashed lines (lty=2) for the 95% prediction interval
lines(newx, pred_interval[,2], col="red", lty=2)
lines(newx, pred_interval[,3], col="red", lty=2)
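
To compare the two interval types, both can be drawn on one plot. This is a minimal sketch under the same assumptions as the examples above (same fake data, same colors); the legend position and labels are illustrative choices. The prediction interval is wider than the confidence interval at every x value, because it accounts for the scatter of individual observations around the line in addition to the uncertainty in the fitted line itself.

```r
#same fake data and model as above
data <- data.frame(x = c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9, 10, 11, 11),
                   y = c(13, 14, 17, 12, 23, 24, 25, 25, 24, 28, 32, 33, 35, 40, 41))
model <- lm(y ~ x, data = data)

#define range of x values
newx <- seq(min(data$x), max(data$x), by = 1)
new_data <- data.frame(x = newx)

#find both 95% intervals for the range of x values
conf_interval <- predict(model, newdata = new_data, interval = "confidence", level = 0.95)
pred_interval <- predict(model, newdata = new_data, interval = "prediction", level = 0.95)

#create scatterplot; ylim is widened so the prediction bounds stay visible
plot(data$x, data$y, ylim = range(pred_interval, data$y))
abline(model)

#add dashed lines: blue for confidence, red for prediction
lines(newx, conf_interval[, "lwr"], col = "blue", lty = 2)
lines(newx, conf_interval[, "upr"], col = "blue", lty = 2)
lines(newx, pred_interval[, "lwr"], col = "red", lty = 2)
lines(newx, pred_interval[, "upr"], col = "red", lty = 2)

#label the two interval types
legend("topleft", legend = c("95% confidence", "95% prediction"),
       col = c("blue", "red"), lty = 2)
```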

Lastly, we can make the plot more aesthetically pleasing by adding a title, changing the axes names, and changing the shape of the individual points in the plot.

plot(data$x, data$y,
     main = "Scatterplot of x vs. y", #add title
     pch=16, #specify points to be filled in
     xlab='x', #change x-axis name
     ylab='y') #change y-axis name

abline(model, col='steelblue') #specify color of regression line
