In R, multivariate normality tests determine whether a set of variables is jointly normally distributed. Several CRAN packages provide such tests. For example, the mvnormtest package provides mshapiro.test(), a multivariate generalization of the Shapiro-Wilk test. This function expects the data as a numeric matrix, so it is important to make sure the data is in the correct format before attempting to run it. Univariate tests such as the Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling tests assess one variable at a time and do not by themselves establish multivariate normality.
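As a minimal sketch of the matrix-format requirement, the code below converts a data frame to a matrix before calling mshapiro.test() from the mvnormtest package. It assumes the package is installed; the dataset and the variable names `m` and `res` are illustrative. To my understanding, mshapiro.test() treats each row as a variable and each column as an observation, which is why the matrix is transposed.

```r
library(mvnormtest)

# illustrative dataset: 50 observations of 3 variables
set.seed(0)
data <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))

# convert to a numeric matrix and transpose so that
# rows are variables and columns are observations (3 x 50)
m <- t(as.matrix(data))

# multivariate Shapiro-Wilk test
res <- mshapiro.test(m)
res$p.value
```

Passing the data frame directly, or forgetting the transpose, is a common source of errors with this function.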

When we’d like to test whether or not a single variable is normally distributed, we can create a Q-Q plot to visualize the distribution or we can perform a formal statistical test like an Anderson-Darling test or a Jarque-Bera test.
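For a single variable, a quick check might look like the sketch below. It uses only base R: qqnorm() and qqline() draw the Q-Q plot, and shapiro.test() stands in as the formal test, since the Anderson-Darling and Jarque-Bera tests require additional packages. The variable `x` is simulated for illustration.

```r
# illustrative data: 50 draws from a standard normal distribution
set.seed(0)
x <- rnorm(50)

# Q-Q plot: points close to the reference line suggest normality
qqnorm(x, main = "Normal Q-Q Plot of x")
qqline(x)

# formal test (Shapiro-Wilk, used here in place of Anderson-Darling or Jarque-Bera)
sw <- shapiro.test(x)
sw$p.value
```

A p-value that is not less than the chosen significance level means we fail to reject the hypothesis that `x` is normally distributed.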

However, when we’d like to test whether or not *several* variables are normally distributed as a group, we must perform a **multivariate normality test**.

This tutorial explains how to perform the following multivariate normality tests for a given dataset in R:

- Mardia’s Test
- Energy Test
- Multivariate Kurtosis and Skew Tests

**Related:** If we’d like to identify outliers in a multivariate setting, we can use the Mahalanobis distance.
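A minimal sketch of the Mahalanobis approach, using the base R function mahalanobis(), is shown here. The dataset is simulated, and the 0.999 chi-square cutoff is an illustrative choice (an assumption, not a fixed rule).

```r
# illustrative dataset: 50 observations of 3 variables
set.seed(0)
data <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))

# squared Mahalanobis distance of each row from the sample mean
d2 <- mahalanobis(data, center = colMeans(data), cov = cov(data))

# under multivariate normality, d2 is approximately chi-square
# distributed with degrees of freedom equal to the number of variables
cutoff <- qchisq(0.999, df = ncol(data))

# row indices of observations flagged as potential outliers
outliers <- which(d2 > cutoff)
outliers
```

Observations whose squared distance exceeds the cutoff are candidates for closer inspection rather than automatic removal.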

**Example: Mardia’s Test in R**

**Mardia’s Test** determines whether or not a group of variables follows a multivariate normal distribution. The null and alternative hypotheses for the test are as follows:

H_{0} (null): The variables follow a multivariate normal distribution.

H_{a} (alternative): The variables *do not* follow a multivariate normal distribution.

The following code shows how to perform this test in R using the **QuantPsyc** package:

    library(QuantPsyc)

    #create dataset
    set.seed(0)

    data <- data.frame(x1 = rnorm(50),
                       x2 = rnorm(50),
                       x3 = rnorm(50))

    #perform Multivariate normality test
    mult.norm(data)$mult.test

              Beta-hat      kappa     p-val
    Skewness  1.630474 13.5872843 0.1926626
    Kurtosis 13.895364 -0.7130395 0.4758213

The **mult.norm()** function tests for multivariate normality using both the skewness and the kurtosis of the dataset. Since neither p-value (0.1927 for skewness, 0.4758 for kurtosis) is less than .05, we fail to reject the null hypothesis of the test. We don’t have evidence to say that the three variables in our dataset do not follow a multivariate normal distribution.
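To make the skewness and kurtosis statistics more concrete, here is a rough base R sketch of Mardia’s coefficients following the textbook definitions. It is not the QuantPsyc implementation: it assumes the maximum likelihood covariance estimate (dividing by n), so the numbers may differ slightly from the mult.norm() output. All variable names are illustrative.

```r
# illustrative dataset: 50 observations of 3 variables
set.seed(0)
data <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))

X <- as.matrix(data)
n <- nrow(X)
p <- ncol(X)

# center the columns and estimate the covariance (assumption: divide by n)
Xc <- scale(X, center = TRUE, scale = FALSE)
S <- crossprod(Xc) / n

# n x n matrix of Mahalanobis cross-products between observations
D <- Xc %*% solve(S) %*% t(Xc)

# Mardia's multivariate skewness and kurtosis coefficients
b1 <- sum(D^3) / n^2
b2 <- sum(diag(D)^2) / n

# skewness test: n * b1 / 6 is approximately chi-square under the null
skew_stat <- n * b1 / 6
skew_df <- p * (p + 1) * (p + 2) / 6
skew_p <- pchisq(skew_stat, df = skew_df, lower.tail = FALSE)

# kurtosis test: standardized b2 is approximately standard normal under the null
kurt_z <- (b2 - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)
kurt_p <- 2 * pnorm(abs(kurt_z), lower.tail = FALSE)

c(skewness_p = skew_p, kurtosis_p = kurt_p)
```

The skewness statistic is compared with a chi-square distribution and the kurtosis statistic with a standard normal distribution, which is why the two rows of the test output report different kinds of test values.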

**Example: Energy Test in R**

An **Energy Test** is another statistical test that determines whether or not a group of variables follows a multivariate normal distribution. The null and alternative hypotheses for the test are as follows:

H_{0} (null): The variables follow a multivariate normal distribution.

H_{a} (alternative): The variables *do not* follow a multivariate normal distribution.

The following code shows how to perform this test in R using the **energy** package:

    library(energy)

    #create dataset
    set.seed(0)

    data <- data.frame(x1 = rnorm(50),
                       x2 = rnorm(50),
                       x3 = rnorm(50))

    #perform Multivariate normality test
    mvnorm.etest(data, R=100)

        Energy test of multivariate normality: estimated parameters

    data:  x, sample size 50, dimension 3, replicates 100
    E-statistic = 0.90923, p-value = 0.31

The p-value of the test is **0.31**. Since this is not less than .05, we fail to reject the null hypothesis of the test. We don’t have evidence to say that the three variables in our dataset do not follow a multivariate normal distribution.

**Note:** The argument R=100 specifies 100 bootstrapped replicates to be used when performing the test. For datasets with smaller sample sizes, you may increase this number to produce a more reliable estimate of the p-value.
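As a rough illustration of the effect of R, the sketch below runs the same test with 100 and with 1000 replicates. It assumes the energy package is installed; the names `res_100` and `res_1000` are illustrative. The p-value is a proportion computed over the replicates, so more replicates give it finer resolution at the cost of a longer run time.

```r
library(energy)

# illustrative dataset: 50 observations of 3 variables
set.seed(0)
data <- as.matrix(data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50)))

# same test, different numbers of bootstrap replicates
res_100 <- mvnorm.etest(data, R = 100)
res_1000 <- mvnorm.etest(data, R = 1000)

# the E-statistic depends only on the data; only the p-value estimate changes
c(E_100 = unname(res_100$statistic), E_1000 = unname(res_1000$statistic))
c(p_100 = res_100$p.value, p_1000 = res_1000$p.value)
```

If the two p-values lead to different conclusions at the chosen significance level, that is a sign that more replicates are needed before drawing a conclusion.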
