How to fit distributions using R’s fitdistr()

R’s fitdistr() function is a convenient way to fit a univariate probability distribution to a given set of data. It can fit a variety of distributions, including the normal, gamma, and Poisson distributions, by estimating the parameter values that best describe the data. It returns the parameter estimates together with their standard errors and the maximized log-likelihood. It does not run goodness-of-fit tests itself, so the quality of the fit has to be assessed separately, for example with a histogram or a normality test.


You can use the fitdistr() function from the MASS package in R to estimate the parameters of a distribution by maximizing the likelihood function.
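To make "maximizing the likelihood" concrete, here is a minimal sketch that fits a normal distribution by minimizing the negative log-likelihood with optim(). The helper name negloglik, the simulated data, and the starting values are assumptions made for illustration. This is not necessarily how fitdistr() computes its result internally, since a closed-form solution exists for the normal distribution.

```r
# Illustrative sketch: maximum likelihood for a normal distribution "by hand"
set.seed(1)
x <- rnorm(200, mean = 10, sd = 3)

# Negative log-likelihood (hypothetical helper); the sd is optimized on the
# log scale so that it always stays positive
negloglik <- function(par, x) {
  -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# Minimize the negative log-likelihood, starting from rough guesses
opt <- optim(par = c(median(x), log(IQR(x))), fn = negloglik, x = x)

# Back-transform to the original parameters
mle <- c(mean = opt$par[1], sd = exp(opt$par[2]))
mle
```

The resulting values should be close to the sample mean and to the standard deviation computed with a divisor of n, which are the closed-form maximum likelihood estimates for this distribution.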

This function uses the following basic syntax:

fitdistr(x, densfun, …)

where:

  • x: A numeric vector of observed data values
  • densfun: The distribution whose parameters should be estimated, given as a character string (or as a density function)

Note that the densfun argument accepts the following distribution names, passed as character strings: beta, cauchy, chi-squared, exponential, gamma, geometric, lognormal, logistic, negative binomial, normal, Poisson, t and Weibull.
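As a brief illustration of choosing a distribution by name, the sketch below fits an exponential and a Poisson distribution to simulated data. The variable names (waits, counts) and the simulation parameters are assumptions chosen only for this example.

```r
library(MASS)

set.seed(1)

# Simulated data (assumed parameters, for illustration only)
waits  <- rexp(200, rate = 0.5)     # continuous, non-negative values
counts <- rpois(200, lambda = 4)    # non-negative integer counts

# The distribution is selected by passing its name as a string
fit_exp  <- fitdistr(waits, "exponential")
fit_pois <- fitdistr(counts, "Poisson")

fit_exp$estimate    # named vector with element "rate"
fit_pois$estimate   # named vector with element "lambda"
```

Both estimates should land near the values used in the simulation (0.5 and 4), though not exactly on them, since they are computed from a finite sample.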

The following example shows how to use the fitdistr() function in practice.

Example: How to Use the fitdistr() Function to Fit Distributions in R

Suppose we use the rnorm() function in R to generate a vector of 200 values that follow a normal distribution:

#make this example reproducible
set.seed(1)

#generate sample of 200 observations that follows normal dist with mean=10 and sd=3
data <- rnorm(200, mean=10, sd=3)

#view first 6 observations in sample
head(data)

[1]  8.120639 10.550930  7.493114 14.785842 10.988523  7.538595

We can use the hist() function to create a histogram to visualize the distribution of data values:

hist(data, col='steelblue')

[Figure: histogram of the generated normal data]

We can see that the data does indeed look normally distributed.

We can then use the fitdistr() function to estimate the parameters of this distribution:

library(MASS)

#estimate parameters of distribution
fitdistr(data, "normal")

      mean          sd    
  10.1066189    2.7803148 
 ( 0.1965979) ( 0.1390157)

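The fitdistr() function estimates that the vector of values follows a normal distribution with a mean of 10.1066189 and a standard deviation of 2.7803148. The values in parentheses are the standard errors of these estimates. Both estimates are close to the true values of 10 and 3 that were used to generate the data.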
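The fitted object can also be stored and inspected rather than only printed. The following sketch, which regenerates the same data so that it runs on its own, pulls out the individual components and compares the estimates with the closed-form maximum likelihood formulas for the normal distribution. The names fit, mle_mean, and mle_sd are chosen here for illustration.

```r
library(MASS)

set.seed(1)
data <- rnorm(200, mean = 10, sd = 3)

fit <- fitdistr(data, "normal")

fit$estimate   # parameter estimates (mean, sd)
fit$sd         # standard errors of those estimates
fit$loglik     # maximized log-likelihood

# Closed-form maximum likelihood estimates for comparison
n <- length(data)
mle_mean <- mean(data)
mle_sd   <- sqrt(mean((data - mle_mean)^2))   # divides by n, not n - 1

c(mean = mle_mean, sd = mle_sd)
```

Because the maximum likelihood estimate of the standard deviation divides by n, it is slightly smaller than the value returned by sd(data), which divides by n - 1.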

The following tutorials explain how to perform other common tasks in R:

How to Plot a Normal Distribution in R

How to Perform a Shapiro-Wilk Test for Normality in R
