How can the ecdf() function be used in R?

The ecdf() function in R constructs the empirical cumulative distribution function (ECDF) of a dataset. It takes a numeric vector as input and returns a step function that, for any value x, gives the proportion of data points less than or equal to x. Plotting this function is a useful way to visualize the distribution of a dataset and to compare it with a theoretical distribution. The object returned by ecdf() can also be passed to quantile() to obtain quantiles of the data, which makes it a handy tool for descriptive statistics and data analysis.


You can use the ecdf function in R to calculate and plot an empirical cumulative distribution function.

Here is the most common way to use this function:

#calculate empirical cumulative distribution function of data
p = ecdf(data)

#plot empirical cumulative distribution function
plot(p)
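
It may help to see that the object returned by ecdf is itself a function that can be evaluated at any value. The sketch below uses a small hypothetical vector x (not part of this tutorial's dataset), chosen so the results can be checked by hand:

```r
# small hypothetical vector, used only for illustration
x <- c(1, 2, 2, 3, 5)

# ecdf() returns a function
Fn <- ecdf(x)

# proportion of values less than or equal to 2 (3 out of 5 values)
Fn(2)            # 0.6

# the function is vectorized over its argument
Fn(c(0, 3, 10))  # 0.0 0.8 1.0

# the same proportion computed by hand
mean(x <= 2)     # 0.6
```

Evaluating Fn at a point gives the same result as computing the share of observations at or below that point directly.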

The following example shows how to use this function in practice.

Example: How to Use ecdf() Function in R

For this example, let’s create a vector of 1,000 random values that follow a standard normal distribution:

#make this example reproducible
set.seed(1)

#create vector of 1,000 random values that follow standard normal distribution
data = rnorm(1000)

#view first six values in vector
head(data)

[1] -0.6264538  0.1836433 -0.8356286  1.5952808  0.3295078 -0.8204684

We can use the ecdf function to calculate the empirical cumulative distribution function of this dataset and then use the plot function to visualize it:

#calculate empirical cumulative distribution function of data
p = ecdf(data)

#plot empirical cumulative distribution function
plot(p)
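
Besides plotting it, the ecdf object can be evaluated directly. The following is a minimal sketch that recreates the same data so it runs on its own; the cutoff values 0, -1 and 1 are arbitrary choices for illustration:

```r
# recreate the data from this example so the snippet is self-contained
set.seed(1)
data <- rnorm(1000)
p <- ecdf(data)

# proportion of observations less than or equal to 0
p(0)

# proportion of observations in the interval (-1, 1]
p(1) - p(-1)

# quartiles of the data, obtained from the ecdf object
quantile(p, probs = c(0.25, 0.5, 0.75))
```

Because the data are drawn from a standard normal distribution, these values should be close to (but not exactly equal to) the theoretical ones, such as 0.5 for p(0).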

Note that you can also use the xlab, ylab and main arguments within the plot function to add an x-axis label, y-axis label and title to the plot, respectively:

#calculate empirical cumulative distribution function of data
p = ecdf(data)

#plot empirical cumulative distribution function with axis labels and title
plot(p, xlab='x', ylab='CDF', main='CDF of Data') 

[Figure: plot of the empirical cumulative distribution function produced by the ecdf function in R]

The x-axis displays the values from the dataset.

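The y-axis displays the value of the empirical cumulative distribution function, i.e. the proportion of values in the dataset that are less than or equal to each x value.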
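
One way to compare the empirical CDF with a theoretical distribution is to overlay the theoretical curve on the same plot. The sketch below assumes the standard normal distribution is the reference (since the data were generated with rnorm) and uses the base R functions curve and pnorm; the colors, line widths and legend position are arbitrary choices:

```r
# recreate the data from this example so the snippet is self-contained
set.seed(1)
data <- rnorm(1000)
p <- ecdf(data)

# plot the empirical CDF
plot(p, xlab='x', ylab='CDF', main='Empirical vs. Theoretical CDF')

# overlay the standard normal CDF on the existing plot
curve(pnorm(x), add=TRUE, col='red', lwd=2)

# add a legend to distinguish the two curves
legend('topleft', legend=c('Empirical CDF', 'Standard normal CDF'),
       col=c('black', 'red'), lwd=c(1, 2))

# largest vertical gap between the two curves at the observed data points
max(abs(p(data) - pnorm(data)))
```

If the data really follow the reference distribution, the two curves should lie nearly on top of each other and the reported gap should be small.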
