How to Create a Scatterplot with a Regression Line in R

To create a scatterplot with a regression line in R, first enter the data into a data frame, then use the plot() function to draw the scatterplot and the abline() function to draw the regression line. Fitting a model with lm() and passing it to abline() supplies the intercept and slope of the line. You can also add a title and axis labels to the plot and adjust the color and other aesthetic elements.

Often when we perform simple linear regression, we’re interested in creating a scatterplot to visualize the various combinations of x and y values.

Fortunately, R makes it easy to create scatterplots using the plot() function. For example:

#create some fake data
data <- data.frame(x = c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9, 10, 11, 11),
                   y = c(13, 14, 17, 12, 23, 24, 25, 25, 24, 28, 32, 33, 35, 40, 41))

#create scatterplot of data
plot(data$x, data$y)

Scatterplot in R

It’s also easy to add a regression line to the scatterplot using the abline() function.

For example:

#fit a simple linear regression model
model <- lm(y ~ x, data = data)

#add the fitted regression line to the scatterplot
abline(model)

Scatterplot with regression line in R
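
To see what abline() takes from the fitted model, here is a minimal sketch that extracts the intercept and slope with coef() and passes them explicitly through the a and b arguments. The variable names intercept and slope are illustrative and are not part of the original example.

```r
#same fake data as above
data <- data.frame(x = c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9, 10, 11, 11),
                   y = c(13, 14, 17, 12, 23, 24, 25, 25, 24, 28, 32, 33, 35, 40, 41))

#fit a simple linear regression model
model <- lm(y ~ x, data = data)

#coef() returns a named vector with elements "(Intercept)" and "x"
intercept <- coef(model)[["(Intercept)"]]
slope <- coef(model)[["x"]]

#draw the scatterplot, then the line y = intercept + slope * x
plot(data$x, data$y)
abline(a = intercept, b = slope)
```

This should draw the same line as abline(model). The explicit form is mainly useful when the intercept and slope come from somewhere other than an lm() object.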

We can also add confidence interval lines to the plot by using the predict() function:

#define range of x values
newx <- seq(min(data$x), max(data$x), by = 1)

#find 95% confidence interval for the range of x values 
conf_interval <- predict(model, newdata=data.frame(x=newx), interval="confidence",
                         level = 0.95)

#create scatterplot of values with regression line 
plot(data$x, data$y)
abline(model)

#add dashed lines (lty=2) for the 95% confidence interval
lines(newx, conf_interval[,2], col="blue", lty=2)
lines(newx, conf_interval[,3], col="blue", lty=2)

Regression line with confidence interval in R
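
When called with an interval argument, predict() returns a matrix with columns named fit, lwr, and upr, so the bounds can be selected by name rather than by position. The sketch below does that and also uses a finer grid of x values so the curved bounds look smoother. The choice of length.out = 100 is an assumption made for illustration, not something the original example specifies.

```r
#same fake data and model as above
data <- data.frame(x = c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9, 10, 11, 11),
                   y = c(13, 14, 17, 12, 23, 24, 25, 25, 24, 28, 32, 33, 35, 40, 41))
model <- lm(y ~ x, data = data)

#define a finer grid of x values (100 points is an arbitrary choice)
newx <- seq(min(data$x), max(data$x), length.out = 100)

#find 95% confidence interval; result has columns "fit", "lwr", "upr"
conf_interval <- predict(model, newdata = data.frame(x = newx),
                         interval = "confidence", level = 0.95)

#create scatterplot with regression line
plot(data$x, data$y)
abline(model)

#add dashed lines for the lower and upper bounds, selected by column name
lines(newx, conf_interval[, "lwr"], col = "blue", lty = 2)
lines(newx, conf_interval[, "upr"], col = "blue", lty = 2)
```

Selecting columns by name keeps the code readable and makes it harder to swap the lower and upper bounds by mistake.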

Or we could instead add prediction interval lines to the plot by specifying the interval type within the predict() function:

#define range of x values
newx <- seq(min(data$x), max(data$x), by = 1)

#find 95% prediction interval for the range of x values 
pred_interval <- predict(model, newdata=data.frame(x=newx), interval="prediction",
                         level = 0.95)

#create scatterplot of values with regression line 
plot(data$x, data$y)
abline(model)

#add dashed lines (lty=2) for the 95% prediction interval
lines(newx, pred_interval[,2], col="red", lty=2)
lines(newx, pred_interval[,3], col="red", lty=2)

Regression line with prediction interval lines in R
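
To compare the two interval types, the following sketch draws both on one plot and adds a legend. The names grid, conf, pred, conf_width, and pred_width are illustrative, and the ylim setting is an assumption added so that the wider prediction bounds are not clipped.

```r
#same fake data and model as above
data <- data.frame(x = c(1, 1, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9, 10, 11, 11),
                   y = c(13, 14, 17, 12, 23, 24, 25, 25, 24, 28, 32, 33, 35, 40, 41))
model <- lm(y ~ x, data = data)

#x values at which to evaluate both intervals
grid <- data.frame(x = seq(min(data$x), max(data$x), by = 1))

#95% confidence interval (for the mean response) and prediction interval (for a new observation)
conf <- predict(model, newdata = grid, interval = "confidence", level = 0.95)
pred <- predict(model, newdata = grid, interval = "prediction", level = 0.95)

#width of each interval at every x value
conf_width <- conf[, "upr"] - conf[, "lwr"]
pred_width <- pred[, "upr"] - pred[, "lwr"]

#widen the y-axis so the prediction bounds fit inside the plot
plot(data$x, data$y, ylim = range(pred[, c("lwr", "upr")], data$y))
abline(model)

#blue dashed lines for the confidence interval, red dashed lines for the prediction interval
matlines(grid$x, conf[, c("lwr", "upr")], col = "blue", lty = 2)
matlines(grid$x, pred[, c("lwr", "upr")], col = "red", lty = 2)

legend("topleft",
       legend = c("Fitted line", "95% confidence", "95% prediction"),
       col = c("black", "blue", "red"), lty = c(1, 2, 2))
```

The prediction interval is wider than the confidence interval at every x value because it accounts for the scatter of individual observations around the line as well as the uncertainty in the fitted line itself.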

Lastly, we can make the plot more aesthetically pleasing by adding a title, changing the axes names, and changing the shape of the individual points in the plot.

plot(data$x, data$y,
     main = "Scatterplot of x vs. y", #add title
     pch=16, #specify points to be filled in
     xlab='x', #change x-axis name
     ylab='y') #change y-axis name

abline(model, col='steelblue') #specify color of regression line

Regression line with scatterplot in R

