R’s fitdistr() function is a convenient way to fit a probability distribution to a set of data. It can fit a variety of distributions, including the normal, gamma, and Poisson distributions, by estimating the parameter values that best describe the data. It does not run goodness-of-fit tests itself, but it reports the standard error of each estimate and the maximized log-likelihood, which can be used to compare candidate distributions.
You can use the fitdistr() function from the MASS package in R to estimate the parameters of a distribution by maximizing the likelihood function.
This function uses the following basic syntax:
fitdistr(x, densfun, ...)
where:
- x: A numeric vector containing the observed data values
- densfun: The distribution whose parameters should be estimated, given as a character string
- ...: Additional arguments passed on to the density function or to the optimizer
Note that the densfun argument accepts the following distribution names: beta, cauchy, chi-squared, exponential, gamma, geometric, lognormal, logistic, negative binomial, normal, Poisson, t and weibull.
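To illustrate how the distribution name is passed, here is a minimal sketch that fits an exponential distribution. The simulated vector x and the object name fit_exp are assumptions made for illustration only; they are not part of the example that follows.

```r
library(MASS)

# Simulated data (an assumption for illustration): 500 draws from an exponential distribution
set.seed(1)
x <- rexp(500, rate = 2)

# Pass the distribution name to densfun as a character string
fit_exp <- fitdistr(x, "exponential")
fit_exp

# For the exponential distribution, the maximum likelihood estimate of the rate is 1 / mean(x)
1 / mean(x)
```

The estimated rate printed by fitdistr() should match 1 / mean(x), since the exponential distribution has a closed-form maximum likelihood estimate.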
The following example shows how to use the fitdistr() function in practice.
Example: How to Use fitdistr() Function to Fit Distributions in R
Suppose we use the rnorm() function in R to generate a vector of 200 values that follow a normal distribution:
#make this example reproducible
set.seed(1)
#generate sample of 200 observations that follows normal dist with mean=10 and sd=3
data <- rnorm(200, mean=10, sd=3)
#view first 6 observations in sample
head(data)
[1]  8.120639 10.550930  7.493114 14.785842 10.988523  7.538595
We can use the hist() function to create a histogram to visualize the distribution of data values:
hist(data, col='steelblue')
We can see that the data does indeed look normally distributed.
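One way to make this visual check more concrete is to overlay a normal density curve on the histogram. The following is only a sketch: it redraws the histogram on the density scale and uses the sample mean and sample standard deviation for the curve, and the object name h is chosen for illustration.

```r
# Recreate the sample from the example
set.seed(1)
data <- rnorm(200, mean = 10, sd = 3)

# Draw the histogram on the density scale so a density curve can be overlaid
h <- hist(data, freq = FALSE, col = 'steelblue')

# Overlay a normal density that uses the sample mean and standard deviation
curve(dnorm(x, mean = mean(data), sd = sd(data)), add = TRUE, lwd = 2)
```

If the bars follow the curve closely, a normal distribution is a plausible choice for densfun.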
We can then use the fitdistr() function to estimate the parameters of this distribution:
library(MASS)
#estimate parameters of distribution
fitdistr(data, "normal")
      mean          sd
  10.1066189    2.7803148
 ( 0.1965979) ( 0.1390157)
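The fitdistr() function estimates that the vector of values follows a normal distribution with a mean of 10.1066189 and a standard deviation of 2.7803148. The values in parentheses are the standard errors of these estimates. Both estimates are reasonably close to the true values (mean=10 and sd=3) that were used to generate the sample.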
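The fitted object can also be stored and inspected. The sketch below assumes the object is saved as fit (a name chosen for illustration) and compares the estimates with the closed-form maximum likelihood formulas for the normal distribution.

```r
library(MASS)

# Recreate the sample from the example
set.seed(1)
data <- rnorm(200, mean = 10, sd = 3)

# Store the fitted object so its components can be accessed
fit <- fitdistr(data, "normal")

fit$estimate   # parameter estimates (mean and sd)
fit$sd         # standard errors (the values printed in parentheses)
fit$loglik     # maximized log-likelihood

# Closed-form maximum likelihood estimates for the normal distribution
n <- length(data)
mle_mean <- mean(data)
mle_sd <- sqrt(sum((data - mle_mean)^2) / n)   # divides by n, not n - 1
c(mean = mle_mean, sd = mle_sd)
```

Because the maximum likelihood estimate of the standard deviation divides by n rather than n - 1, it is slightly smaller than the value returned by sd(data).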