A normal distribution, also known as a Gaussian distribution, is a commonly used probability distribution that is symmetric and bell-shaped. It is characterized by its mean, which sets the center of the distribution, and its standard deviation, which sets the spread. In R, you can draw a random sample from a normal distribution with the rnorm() function, which takes the number of observations, the mean, and the standard deviation as arguments. For example, rnorm(100, 0, 1) returns a vector of 100 random values drawn from a normal distribution with a mean of 0 and a standard deviation of 1. The related dnorm() function does not generate random values. Instead, it returns the density of the distribution at a given point, so it can be passed to the curve() function along with a range of x-values to plot the normal curve itself.
Generate a Normal Distribution in R (With Examples)
You can quickly generate a normal distribution in R by using the rnorm() function, which uses the following syntax:
rnorm(n, mean=0, sd=1)
where:
- n: Number of observations.
- mean: Mean of normal distribution. Default is 0.
- sd: Standard deviation of normal distribution. Default is 1.
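As a minimal sketch of how the defaults behave, the snippet below checks that omitting mean and sd gives the same draws as spelling them out, and that passing mean and sd is equivalent to shifting and scaling standard normal draws. The seed value 42 and the variable names are assumptions chosen for illustration only.

```r
# Reseeding before each call makes the two draws directly comparable.
set.seed(42)
x_default <- rnorm(5)                     # relies on mean = 0, sd = 1

set.seed(42)
x_explicit <- rnorm(5, mean = 0, sd = 1)  # same call with the defaults spelled out

identical(x_default, x_explicit)          # TRUE

# Shifting and scaling standard normal draws gives the same values
# (up to floating-point rounding) as passing mean and sd directly.
set.seed(42)
x_scaled <- 10 + 3 * rnorm(5)

set.seed(42)
x_direct <- rnorm(5, mean = 10, sd = 3)

all.equal(x_scaled, x_direct)             # TRUE
```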
This tutorial shows an example of how to use this function to generate a normal distribution in R.
Related: A Guide to dnorm, pnorm, qnorm, and rnorm in R
Example: Generate a Normal Distribution in R
The following code shows how to generate a normal distribution in R:
#make this example reproducible
set.seed(1)
#generate sample of 200 obs. that follows normal dist. with mean=10 and sd=3
data <- rnorm(200, mean=10, sd=3)
#view first 6 observations in sample
head(data)
[1]  8.120639 10.550930  7.493114 14.785842 10.988523  7.538595
We can quickly find the mean and standard deviation of this sample, which should be close to, but not exactly equal to, the values of 10 and 3 that we specified:
#find mean of sample
mean(data)
[1] 10.10662
#find standard deviation of sample
sd(data)
[1] 2.787292
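As a rough illustration of why sample statistics differ slightly from the parameters passed to rnorm(), the sketch below compares a sample of 200 with a much larger one. The larger sample size of 100,000 and the variable names are assumptions chosen for illustration, not part of the original example.

```r
set.seed(1)

# Same settings as the example in this tutorial
small_sample <- rnorm(200, mean = 10, sd = 3)

# Hypothetical larger sample; the size is chosen only for illustration
large_sample <- rnorm(100000, mean = 10, sd = 3)

# Absolute distance of each sample estimate from the specified parameters
abs(mean(small_sample) - 10)
abs(mean(large_sample) - 10)
abs(sd(small_sample) - 3)
abs(sd(large_sample) - 3)
```

Estimates from larger samples tend to fall closer to the specified mean and standard deviation, although any single run can vary.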
We can also create a quick histogram to visualize the distribution of data values:
hist(data, col='steelblue')
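To compare the histogram with the theoretical bell curve, one option is to draw the histogram on the density scale and overlay the normal density using dnorm() and curve(). This is a sketch under the assumption that the sample was generated with mean=10 and sd=3 as in the example. The line width is an arbitrary styling choice.

```r
# Recreate the sample so the snippet runs on its own
set.seed(1)
data <- rnorm(200, mean = 10, sd = 3)

# freq = FALSE puts the y-axis on the density scale,
# so the bars are comparable with a density curve
hist(data, col = "steelblue", freq = FALSE)

# Overlay the density of the normal distribution the sample was drawn from
curve(dnorm(x, mean = 10, sd = 3), add = TRUE, lwd = 2)
```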
We can even perform a Shapiro-Wilk test to see if the dataset comes from a normal population:
shapiro.test(data)
Shapiro-Wilk normality test
data: data
W = 0.99274, p-value = 0.4272
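The p-value of the test turns out to be 0.4272. Since this value is not less than .05, we fail to reject the null hypothesis that the sample comes from a normally distributed population. In other words, the test provides no evidence against normality, which is what we would expect for data generated by rnorm().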
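If the decision needs to be made in code rather than read off the printed output, the object returned by shapiro.test() can be inspected directly. The sketch below assumes a significance level of 0.05. The names result, alpha, and decision are hypothetical and used only for illustration.

```r
# Recreate the sample so the snippet runs on its own
set.seed(1)
data <- rnorm(200, mean = 10, sd = 3)

# shapiro.test() returns a list-like "htest" object
result <- shapiro.test(data)
alpha <- 0.05  # assumed significance level

result$statistic  # the W statistic
result$p.value    # the p-value as a plain number

# Compare the p-value with the chosen significance level
if (result$p.value < alpha) {
  decision <- "reject the null hypothesis of normality"
} else {
  decision <- "fail to reject the null hypothesis of normality"
}
decision
```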
Additional Resources
How to Plot a Normal Distribution in R
A Guide to dnorm, pnorm, qnorm, and rnorm in R
How to Perform a Shapiro-Wilk Test for Normality in R