
In R, multivariate normality tests check whether a set of variables jointly follows a multivariate normal distribution. Several packages on CRAN provide such tests: the mvnormtest package offers a multivariate Shapiro-Wilk test, while the QuantPsyc and energy packages used in this tutorial provide Mardia’s Test and the Energy Test. Some of these functions expect the data in a numeric matrix format, so it is important to make sure the data is in the correct format before attempting to run any of the multivariate normality tests.
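Since the data may need to be in matrix format, the conversion can be sketched as follows. This is a minimal base R example; the data frame and its column names are made up for illustration.

```r
# illustrative data frame (column names are assumptions, not from a real dataset)
set.seed(0)
df <- data.frame(x1 = rnorm(50),
                 x2 = rnorm(50),
                 x3 = rnorm(50))

# convert to a numeric matrix: one row per observation, one column per variable
m <- as.matrix(df)

# confirm the format before running a test
is.matrix(m)   # should be TRUE
is.numeric(m)  # should be TRUE
dim(m)         # observations x variables

# if a function expects variables in rows instead (mshapiro.test() in
# mvnormtest appears to; check its help page), transpose the matrix
m_t <- t(m)
```

If any column is a factor or character vector, as.matrix() returns a character matrix, so it is worth checking is.numeric() before passing the result to a test.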

When we’d like to test whether or not a single variable is normally distributed, we can create a Q-Q plot to visualize the distribution or we can perform a formal statistical test like an Anderson-Darling Test or a Jarque-Bera Test.
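As a rough sketch of the single-variable case, the following code draws a Q-Q plot and runs a formal test. It uses the Shapiro-Wilk test from base R as a stand-in, since the Anderson-Darling and Jarque-Bera tests require additional packages; the simulated variable x is an assumption for illustration.

```r
#create a single simulated variable
set.seed(0)
x <- rnorm(50)

#create Q-Q plot; points close to the reference line suggest normality
qqnorm(x)
qqline(x)

#perform Shapiro-Wilk test (base R stand-in for the tests named above)
result <- shapiro.test(x)
result$p.value
```

A p-value that is not less than .05 means we fail to reject the null hypothesis that the variable is normally distributed.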

However, when we’d like to test whether or not several variables are normally distributed as a group, we must perform a multivariate normality test.

This tutorial explains how to perform the following multivariate normality tests for a given dataset in R:

Mardia’s Test
Energy Test
Multivariate Kurtosis and Skew Tests

Related: If we’d like to identify outliers in a multivariate setting, we can use the Mahalanobis distance.
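A minimal sketch of that idea, using the mahalanobis() function from base R, might look like the following. The 0.999 chi-square cutoff is a common convention assumed here for illustration, not a fixed rule.

```r
#create dataset
set.seed(0)

data <- data.frame(x1 = rnorm(50),
                   x2 = rnorm(50),
                   x3 = rnorm(50))

#squared Mahalanobis distance of each row from the sample mean
d2 <- mahalanobis(data, center = colMeans(data), cov = cov(data))

#assumed cutoff: 0.999 quantile of chi-square with df = number of variables
cutoff <- qchisq(0.999, df = ncol(data))

#row numbers of observations flagged as potential outliers
outliers <- which(d2 > cutoff)
outliers
```

Under multivariate normality the squared distances approximately follow a chi-square distribution with degrees of freedom equal to the number of variables, which is why a chi-square quantile is used as the cutoff.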

Example: Mardia’s Test in R

Mardia’s Test determines whether or not a group of variables follows a multivariate normal distribution. The null and alternative hypotheses for the test are as follows:

H₀ (null): The variables follow a multivariate normal distribution.

H_a (alternative): The variables do not follow a multivariate normal distribution.

The following code shows how to perform this test in R using the QuantPsyc package:

library(QuantPsyc)

#create dataset
set.seed(0)

data <- data.frame(x1 = rnorm(50),
                   x2 = rnorm(50),
                   x3 = rnorm(50))

#perform Multivariate normality test
mult.norm(data)$mult.test

          Beta-hat      kappa     p-val
Skewness  1.630474 13.5872843 0.1926626
Kurtosis 13.895364 -0.7130395 0.4758213

The mult.norm() function tests for multivariate normality using both the skewness and kurtosis of the dataset. Since neither p-value is less than .05, we fail to reject the null hypothesis of the test. We don’t have evidence to say that the three variables in our dataset do not follow a multivariate normal distribution.
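To see what is behind the skewness and kurtosis rows, the textbook form of Mardia’s statistics can be sketched in base R. The function name mardia_by_hand is hypothetical, and this sketch uses the covariance matrix with divisor n, so its numbers may differ slightly from what a package reports if that package uses a different convention.

```r
#hypothetical helper: textbook Mardia skewness and kurtosis statistics
mardia_by_hand <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)

  #center the data and compute the covariance matrix with divisor n
  xc <- scale(x, center = TRUE, scale = FALSE)
  S <- crossprod(xc) / n

  #matrix of d_ij = (x_i - mean)' S^-1 (x_j - mean)
  D <- xc %*% solve(S) %*% t(xc)

  #multivariate skewness and kurtosis
  b1 <- sum(D^3) / n^2
  b2 <- sum(diag(D)^2) / n

  #skewness: n * b1 / 6 is approximately chi-square with p(p+1)(p+2)/6 df
  skew_stat <- n * b1 / 6
  skew_df <- p * (p + 1) * (p + 2) / 6

  #kurtosis: standardized b2 is approximately standard normal
  kurt_stat <- (b2 - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)

  list(skewness = b1, skew_stat = skew_stat, skew_df = skew_df,
       skew_p = pchisq(skew_stat, df = skew_df, lower.tail = FALSE),
       kurtosis = b2, kurt_stat = kurt_stat,
       kurt_p = 2 * pnorm(abs(kurt_stat), lower.tail = FALSE))
}

#create dataset
set.seed(0)

data <- data.frame(x1 = rnorm(50),
                   x2 = rnorm(50),
                   x3 = rnorm(50))

res <- mardia_by_hand(data)
res$skew_p
res$kurt_p
```

The skewness statistic is compared to a chi-square distribution and the kurtosis statistic to a standard normal distribution, which is why the test reports two separate p-values.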

Example: Energy Test in R

An Energy Test is another statistical test that determines whether or not a group of variables follows a multivariate normal distribution. The null and alternative hypotheses for the test are as follows:

H₀ (null): The variables follow a multivariate normal distribution.

H_a (alternative): The variables do not follow a multivariate normal distribution.

The following code shows how to perform this test in R using the energy package:

library(energy)

#create dataset
set.seed(0)

data <- data.frame(x1 = rnorm(50),
                   x2 = rnorm(50),
                   x3 = rnorm(50))

#perform Multivariate normality test
mvnorm.etest(data, R=100)

	Energy test of multivariate normality: estimated parameters

data:  x, sample size 50, dimension 3, replicates 100
E-statistic = 0.90923, p-value = 0.31

The p-value of the test is 0.31. Since this is not less than .05, we fail to reject the null hypothesis of the test. We don’t have evidence to say that the three variables in our dataset do not follow a multivariate normal distribution.

Note: The argument R=100 specifies 100 bootstrapped replicates to be used when performing the test. For datasets with smaller sample sizes, you may increase this number to produce a more reliable estimate of the p-value.
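To illustrate how the number of replicates affects the result, here is a rough base R sketch of a replicate-based p-value. It is not the energy package’s implementation: the statistic kurtosis_dev is a stand-in (the absolute deviation of Mardia’s kurtosis from its asymptotic expected value), and both function names are hypothetical.

```r
#stand-in statistic (NOT the energy statistic): absolute deviation of
#Mardia's multivariate kurtosis from its asymptotic expected value p(p+2)
kurtosis_dev <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  xc <- scale(x, center = TRUE, scale = FALSE)
  S <- crossprod(xc) / n
  d2 <- rowSums((xc %*% solve(S)) * xc)  #squared Mahalanobis distances
  abs(mean(d2^2) - p * (p + 2))
}

#hypothetical helper: p-value from R datasets simulated under the null
boot_pvalue <- function(x, stat, R = 100) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  observed <- stat(x)
  #simulate R normal datasets of the same size and recompute the statistic
  simulated <- replicate(R, stat(matrix(rnorm(n * p), nrow = n)))
  #proportion of simulated statistics at least as large as the observed one
  mean(simulated >= observed)
}

#create dataset
set.seed(0)

data <- data.frame(x1 = rnorm(50),
                   x2 = rnorm(50),
                   x3 = rnorm(50))

p_100 <- boot_pvalue(data, kurtosis_dev, R = 100)
p_1000 <- boot_pvalue(data, kurtosis_dev, R = 1000)
```

In this sketch the p-value is a proportion out of R replicates, so with R = 100 it can only take values in steps of 0.01. Using more replicates gives a finer and more stable estimate at the cost of longer run time.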

Related tutorials:

How to Create & Interpret a Q-Q Plot in R
How to Conduct an Anderson-Darling Test in R
How to Conduct a Jarque-Bera Test in R
How to Perform a Shapiro-Wilk Test in R
