How to Analyze the mtcars Dataset in R

The mtcars dataset is a built-in dataset in R that contains measurements on 11 different attributes for 32 different cars. It can be explored with base R functions such as summary(), which reports descriptive statistics for each variable, plot(), which produces a graph of the data, and subset(), which filters the data down to the rows and columns of interest.

This tutorial explains how to explore, summarize, and visualize the mtcars dataset in R.

Load the mtcars Dataset

Since the mtcars dataset is a built-in dataset in R, we can load it by using the following command:

data(mtcars)

We can take a look at the first six rows of the dataset by using the head() function:

#view first six rows of mtcars dataset
head(mtcars)

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
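
As an optional extra step, here is a minimal sketch of how we might inspect the structure of the dataset with the base R functions str() and sapply(). The object name col_types is only an illustrative choice:

```r
#display the structure of the data frame
str(mtcars)

#check the storage class of each column
col_types <- sapply(mtcars, class)
col_types
```

Every column is stored as a numeric value, including coded variables such as vs and am that only take the values 0 and 1.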

Summarize the mtcars Dataset

We can use the summary() function to quickly summarize each variable in the dataset:

#summarize mtcars dataset
summary(mtcars)

      mpg             cyl             disp             hp       
 Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
 1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
 Median :19.20   Median :6.000   Median :196.3   Median :123.0  
 Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
 3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
 Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
      drat             wt             qsec             vs        
 Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
 1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
 Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
 Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
 3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
 Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
       am              gear            carb      
 Min.   :0.0000   Min.   :3.000   Min.   :1.000  
 1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
 Median :0.0000   Median :4.000   Median :2.000  
 Mean   :0.4062   Mean   :3.688   Mean   :2.812  
 3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
 Max.   :1.0000   Max.   :5.000   Max.   :8.000   

For each of the 11 variables we can see the following information:

  • Min: The minimum value.
  • 1st Qu: The value of the first quartile (25th percentile).
  • Median: The median value.
  • Mean: The mean value.
  • 3rd Qu: The value of the third quartile (75th percentile).
  • Max: The maximum value.
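
The same statistics can be calculated for a single variable. The following is a minimal sketch that uses summary() and the base R quantile() function on the mpg column. The object names mpg_summary and mpg_quartiles are only illustrative choices:

```r
#summarize a single variable
mpg_summary <- summary(mtcars$mpg)
mpg_summary

#compute the first and third quartiles directly
mpg_quartiles <- quantile(mtcars$mpg, probs = c(0.25, 0.75))
mpg_quartiles
```

Note that summary() rounds its printed values, so the first quartile of 15.425 returned by quantile() is displayed as 15.43 in the summary output.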

We can use the dim() function to get the dimensions of the dataset in terms of number of rows and number of columns:

#display rows and columns
dim(mtcars)

[1] 32 11

We can see that the dataset has 32 rows and 11 columns.

We can also use the names() function to display the column names of the data frame:

#display column names
names(mtcars)

 [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear"
[11] "carb"     
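
Once we know the column names, we can also filter the data. The following is a minimal sketch that uses table() to count the cars for each value of cyl and subset() to keep only the 4-cylinder cars. The object name four_cyl is only an illustrative choice:

```r
#count the number of cars for each cylinder value
table(mtcars$cyl)

#keep only the 4-cylinder cars and the mpg and wt columns
four_cyl <- subset(mtcars, cyl == 4, select = c(mpg, wt))
head(four_cyl)
```

Counts like these are often more informative than the mean for variables such as cyl, gear, and carb, which only take a few distinct values.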

Visualize the mtcars Dataset

We can also create some plots to visualize the values in the dataset.

For example, we can use the hist() function to create a histogram of the values for a certain variable:

#create histogram of values for mpg
hist(mtcars$mpg,
     col='steelblue',
     main='Histogram',
     xlab='mpg',
     ylab='Frequency')
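
If we'd like to see the numbers behind the histogram, one possible approach is to call hist() with plot=FALSE, which returns the bins without drawing anything. This is only a sketch: the object name mpg_hist and the choice of breaks=10 are illustrative, and R treats the breaks value as a suggestion rather than an exact number of bins:

```r
#compute histogram bins without drawing the plot
mpg_hist <- hist(mtcars$mpg, breaks=10, plot=FALSE)

#view the bin boundaries and the number of cars in each bin
mpg_hist$breaks
mpg_hist$counts
```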

We could also use the boxplot() function to create a boxplot to visualize the distribution of values for a certain variable:

#create boxplot of values for mpg
boxplot(mtcars$mpg,
        main='Distribution of mpg values',
        ylab='mpg',
        col='steelblue',
        border='black')
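
The boxplot() function also accepts a formula, which lets us compare the distribution of one variable across groups. The following is a minimal sketch that draws one boxplot of mpg for each value of cyl. The object name mpg_by_cyl and the plot title are only illustrative choices:

```r
#create boxplots of mpg for each number of cylinders
mpg_by_cyl <- boxplot(mpg ~ cyl,
                      data=mtcars,
                      main='Distribution of mpg by cylinders',
                      xlab='cyl',
                      ylab='mpg',
                      col='steelblue',
                      border='black')

#view the number of cars in each group
mpg_by_cyl$n
```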

We can also use the plot() function to create a scatterplot of any pairwise combination of variables:

#create scatterplot of mpg vs. wt
plot(mtcars$mpg, mtcars$wt,
     col='steelblue',
     main='Scatterplot',
     xlab='mpg',
     ylab='wt',
     pch=19)
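
To put a number on the relationship shown in a scatterplot, we could calculate the correlation between the two variables and add a fitted line to the plot. The following is a minimal sketch that uses the base R functions cor(), lm(), and abline(). The object name mpg_wt_cor and the red line color are only illustrative choices:

```r
#calculate the Pearson correlation between mpg and wt
mpg_wt_cor <- cor(mtcars$mpg, mtcars$wt)
mpg_wt_cor

#redraw the scatterplot and add a fitted least-squares line
plot(mtcars$mpg, mtcars$wt, col='steelblue', xlab='mpg', ylab='wt', pch=19)
abline(lm(wt ~ mpg, data=mtcars), col='red')
```

The correlation is roughly -0.87, which indicates that heavier cars tend to have lower mpg values.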

By using these built-in functions in R, we can learn a great deal about the mtcars dataset.

If you’d like to perform more advanced statistical analysis with this dataset, a natural next step is to fit linear regression models and generalized linear models using the mtcars dataset.
