How can a Mann-Whitney U Test be conducted in Python?

The Mann-Whitney U test is a nonparametric statistical test used to compare two independent groups of data. In Python, it can be conducted with the mannwhitneyu function from SciPy's scipy.stats module. This function takes two arrays of data and returns the U statistic and a p-value, which can then be used to determine whether there is a statistically significant difference between the two groups. The steps are: import scipy.stats, define the two data arrays, and call mannwhitneyu on them.

Conduct a Mann-Whitney U Test in Python


A Mann-Whitney U test is used to compare the differences between two independent samples when the sample distributions are not normally distributed and the sample sizes are small (n < 30).

It is considered to be the nonparametric equivalent to the two-sample t-test.

This tutorial explains how to conduct a Mann-Whitney U test in Python.

Example: Mann-Whitney U Test in Python

Researchers want to know if a fuel treatment leads to a change in the average mpg of a car. To test this, they measure the mpg of 12 cars with the fuel treatment and 12 cars without it.

Since the sample sizes are small and the researchers suspect that the sample distributions are not normally distributed, they decided to perform a Mann-Whitney U test to determine if there is a statistically significant difference in mpg between the two groups.

Perform the following steps to conduct a Mann-Whitney U test in Python.

Step 1: Create the data.

First, we’ll create two arrays to hold the mpg values for each group of cars:

group1 = [20, 23, 21, 25, 18, 17, 18, 24, 20, 24, 23, 19]
group2 = [24, 25, 21, 22, 23, 18, 17, 28, 24, 27, 21, 23]

Step 2: Conduct a Mann-Whitney U Test.

Next, we’ll use the mannwhitneyu() function from the scipy.stats library to conduct a Mann-Whitney U test, which uses the following syntax:

mannwhitneyu(x, y, use_continuity=True, alternative='two-sided')

where:

  • x: an array of sample observations from group 1
  • y: an array of sample observations from group 2
  • use_continuity: whether a continuity correction (1/2) should be applied. Default is True.
  • alternative: defines the alternative hypothesis. Options are 'two-sided', 'less', and 'greater'. The default is 'two-sided' in current SciPy releases; older releases defaulted to None, which computed a p-value half the size of the 'two-sided' p-value, so it is safest to pass this argument explicitly.
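
To make the effect of the alternative argument concrete, here is a minimal sketch on made-up data. The samples x_demo and y_demo are hypothetical values chosen only for illustration and are not part of the mpg example. The test is run once per option so the p-values can be compared side by side.

```python
from scipy import stats

# Hypothetical toy samples: every x value lies below every y value.
x_demo = [1, 2, 3, 4, 5]
y_demo = [6, 7, 8, 9, 10]

# Run the test once for each alternative hypothesis and collect the p-values.
p_values = {}
for alt in ("two-sided", "less", "greater"):
    res = stats.mannwhitneyu(x_demo, y_demo, alternative=alt)
    p_values[alt] = res.pvalue
    print(f"{alt:>9}: U = {res.statistic}, p = {res.pvalue:.4f}")
```

Because x_demo sits entirely below y_demo, 'less' should give the smallest p-value and 'greater' the largest, with 'two-sided' in between.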

Here’s how to use this function in our specific example:

import scipy.stats as stats

#perform the Mann-Whitney U test
stats.mannwhitneyu(group1, group2, alternative='two-sided')

(statistic=50.0, pvalue=0.2114)
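
As a sanity check on what the statistic means, the sketch below recomputes U by hand. It counts, over all 144 pairs, how often a group1 value exceeds a group2 value, scoring ties as one half. The name u_manual is introduced here for illustration, and the comparison assumes a recent SciPy release, where statistic is the U value for the first sample.

```python
from scipy import stats

group1 = [20, 23, 21, 25, 18, 17, 18, 24, 20, 24, 23, 19]
group2 = [24, 25, 21, 22, 23, 18, 17, 28, 24, 27, 21, 23]

# The result object exposes the U statistic and p-value as attributes.
result = stats.mannwhitneyu(group1, group2, alternative="two-sided")

# Manual U for group1: 1 point per pair where a > b, 0.5 per tie, 0 otherwise.
u_manual = sum(
    1.0 if a > b else 0.5 if a == b else 0.0
    for a in group1
    for b in group2
)

print(f"SciPy U = {result.statistic}, manual U = {u_manual}")
print(f"p-value = {result.pvalue:.4f}")
```

Both U values should agree with the statistic reported above. A U well below half the number of pairs (here 144 / 2 = 72) indicates that group1 values tend to be lower than group2 values, although the p-value decides whether that tendency is statistically significant.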

Step 3: Interpret the results.

In this example, the Mann-Whitney U Test uses the following null and alternative hypotheses:

H0: The mpg is equal between the two groups

HA: The mpg is not equal between the two groups

Since the p-value (0.2114) is not less than 0.05, we fail to reject the null hypothesis.

This means we do not have sufficient evidence to say that the mpg differs between the two groups.
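
The decision rule above can also be written out in a few lines. This is a minimal sketch: alpha and decision are names introduced here for illustration, with alpha set to the 0.05 threshold used in this example.

```python
from scipy import stats

group1 = [20, 23, 21, 25, 18, 17, 18, 24, 20, 24, 23, 19]
group2 = [24, 25, 21, 22, 23, 18, 17, 28, 24, 27, 21, 23]

alpha = 0.05  # significance level used in this example

result = stats.mannwhitneyu(group1, group2, alternative="two-sided")

# Compare the p-value with alpha to decide whether to reject H0.
if result.pvalue < alpha:
    decision = "reject H0"
else:
    decision = "fail to reject H0"

print(f"p = {result.pvalue:.4f} -> {decision}")
```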
