How to calculate AIC in R (Including Examples)

AIC stands for Akaike Information Criterion and is used to compare the relative quality of different models fit to the same data. In base R, the AIC() function takes one or more fitted model objects and returns the AIC of each. Its k argument is the penalty applied per parameter, not the number of parameters; the default, k = 2, gives the classical AIC. For example, AIC(lm.object, k = 2) returns the AIC value for a linear regression model stored in lm.object. A version corrected for small sample sizes is available through the AICc() function in add-on packages such as AICcmodavg.
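As a quick check of how the k argument of AIC() behaves, the sketch below fits an illustrative one-predictor model to the built-in mtcars data (the choice of mpg ~ wt is only an assumption for demonstration) and compares a few calls. It uses base R only.

```r
# Illustrative model; any fitted lm object would do
fit <- lm(mpg ~ wt, data = mtcars)

# k is the penalty per parameter; the default k = 2 gives the classical AIC
aic_default <- AIC(fit)
aic_k2      <- AIC(fit, k = 2)

# Setting k = log(n) turns the same formula into the BIC
bic_via_k   <- AIC(fit, k = log(nobs(fit)))
bic_builtin <- BIC(fit)

print(c(aic_default = aic_default, aic_k2 = aic_k2))
print(c(bic_via_k = bic_via_k, bic_builtin = bic_builtin))
```

The two AIC values should agree with each other, as should the two BIC values, which shows that k changes the penalty rather than the parameter count.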


The Akaike information criterion (AIC) is a metric that is used to compare the fit of several regression models.

It is calculated as:

AIC = 2K – 2ln(L)

where:

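  • K: The number of estimated model parameters. For a linear regression model this counts the intercept, one coefficient per predictor variable, and the residual variance, so a model with just one predictor variable will have a K value of 2+1 = 3.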
  • ln(L): The log-likelihood of the model. Most statistical software can automatically calculate this value for you.
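The formula above can be applied by hand in R. The following is a minimal sketch using base R only; the model mpg ~ disp + wt on the built-in mtcars data is just an illustrative choice. The logLik() function returns ln(L), and its "df" attribute holds the parameter count K.

```r
# Fit an illustrative linear model on the built-in mtcars data
fit <- lm(mpg ~ disp + wt, data = mtcars)

# Extract the log-likelihood; its "df" attribute is the parameter count K
ll <- logLik(fit)
K  <- attr(ll, "df")   # intercept + 2 slopes + residual variance

# Apply AIC = 2K - 2ln(L) by hand
aic_manual <- 2 * K - 2 * as.numeric(ll)

# Compare against the built-in function
aic_builtin <- AIC(fit)
print(c(K = K, manual = aic_manual, builtin = aic_builtin))
```

The manual and built-in values should match, which confirms how K is counted for a linear model.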

The AIC is designed to find the model that explains the most variation in the data while penalizing models that use an excessive number of parameters.

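Once you’ve fit several regression models to the same dataset and response variable, you can compare the AIC value of each model. The lower the AIC, the better the model fit. An AIC value has no meaning on its own; only the differences between candidate models matter.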

To calculate the AIC of several regression models in R, we can use the aictab() function from the AICcmodavg package.

The following example shows how to use this function to calculate and interpret the AIC for various regression models in R.

Example: Calculate & Interpret AIC in R

Suppose we would like to fit three different regression models using variables from the mtcars dataset, with mpg as the response variable.

Here are the predictor variables we’ll use in each model:

  • Predictor variables in Model 1: disp, hp, wt, qsec
  • Predictor variables in Model 2: disp, qsec
  • Predictor variables in Model 3: disp, wt

The following code shows how to fit each of these regression models:

#fit three models
model1 <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
model2 <- lm(mpg ~ disp + qsec, data = mtcars)
model3 <- lm(mpg ~ disp + wt, data = mtcars)
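As a quick sanity check before bringing in any package, base R's AIC() also accepts several fitted models at once and returns a small data frame with the parameter count (df) and the uncorrected AIC of each. The sketch below refits the same three models so that it runs on its own; the variable names aic_table and best are only illustrative.

```r
# Same three models as above, refit so this block runs on its own
model1 <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
model2 <- lm(mpg ~ disp + qsec, data = mtcars)
model3 <- lm(mpg ~ disp + wt, data = mtcars)

# AIC() with several models returns a data frame with columns df and AIC
aic_table <- AIC(model1, model2, model3)
print(aic_table)

# Name of the model with the smallest (uncorrected) AIC
best <- rownames(aic_table)[which.min(aic_table$AIC)]
print(best)
```

Note that these are plain AIC values, without the small-sample correction.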

Next, we’ll put the models into a list and use the aictab() function to calculate the AIC of each model:

library(AICcmodavg)

#define list of models
models <- list(model1, model2, model3)

#specify model names
mod.names <- c('disp.hp.wt.qsec', 'disp.qsec', 'disp.wt')

#calculate AIC of each model
aictab(cand.set = models, modnames = mod.names)

Model selection based on AICc:

                K   AICc Delta_AICc AICcWt Cum.Wt     LL
disp.hp.wt.qsec 6 162.43       0.00   0.83   0.83 -73.53
disp.wt         4 165.65       3.22   0.17   1.00 -78.08
disp.qsec       4 173.32      10.89   0.00   1.00 -81.92

  • K: The number of estimated parameters in the model, including the intercept and the residual variance.
  • AICc: The AIC value of the model. The lowercase ‘c’ indicates that the AIC has been corrected for small sample sizes.
  • Delta_AICc: The difference between the AICc of the current model and the AICc of the best model. The best model has a value of 0.
  • AICcWt: The Akaike weight of the model, which is its relative likelihood scaled so that the weights of all candidate models sum to 1. It can be read as the weight of evidence that this model is the best of the set.
  • Cum.Wt: The cumulative sum of the AICc weights.
  • LL: The log-likelihood of the model. This tells us how likely the observed data are under the fitted model.
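To make these columns concrete, the sketch below rebuilds them by hand with base R only. It assumes the usual small-sample correction, AICc = AIC + 2K(K+1)/(n - K - 1), and the usual Akaike weight formula, exp(-Delta/2) divided by its sum over the candidate set. The object names (models, tab, and so on) are illustrative, and this is not a description of how aictab() is implemented internally.

```r
# Same three candidate models, stored in a named list
models <- list(
  disp.hp.wt.qsec = lm(mpg ~ disp + hp + wt + qsec, data = mtcars),
  disp.qsec       = lm(mpg ~ disp + qsec, data = mtcars),
  disp.wt         = lm(mpg ~ disp + wt, data = mtcars)
)

n  <- nrow(mtcars)                                        # sample size
LL <- sapply(models, function(m) as.numeric(logLik(m)))   # log-likelihoods
K  <- sapply(models, function(m) attr(logLik(m), "df"))   # parameter counts

# AIC plus the small-sample correction term
AICc <- 2 * K - 2 * LL + (2 * K * (K + 1)) / (n - K - 1)

# Difference from the best model, then Akaike weights
Delta_AICc <- AICc - min(AICc)
rel_lik    <- exp(-0.5 * Delta_AICc)
AICcWt     <- rel_lik / sum(rel_lik)

# Assemble the table, sort by AICc, and add cumulative weights
tab <- data.frame(K, AICc, Delta_AICc, AICcWt, LL)
tab <- tab[order(tab$AICc), ]
tab$Cum.Wt <- cumsum(tab$AICcWt)
tab <- tab[, c("K", "AICc", "Delta_AICc", "AICcWt", "Cum.Wt", "LL")]
print(round(tab, 2))
```

After rounding, the result should line up with the aictab() output shown above.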

By default, aictab() lists the model with the lowest AICc value first. From the output we can see that the following model has the lowest AICc value and is thus the best fitting of the three candidate models:

mpg = β0 + β1(disp) + β2(hp) + β3(wt) + β4(qsec)

Once we’ve identified this model as the best, we can proceed to analyze its results, including the R-squared value and the beta coefficients, to determine the exact relationship between the set of predictor variables and the response variable.
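A minimal sketch of that follow-up step, using base R only, might look like the following. The names best_model and best_summary are illustrative.

```r
# Refit the selected model so this block runs on its own
best_model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)

# summary() collects the coefficient table and the fit statistics
best_summary <- summary(best_model)

# Estimated beta coefficients (intercept plus one per predictor)
print(coef(best_model))

# R-squared and adjusted R-squared of the selected model
print(best_summary$r.squared)
print(best_summary$adj.r.squared)
```

Calling print(best_summary) instead shows the full regression table, including standard errors and p-values for each coefficient.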
