How can I perform the Nemenyi Post-Hoc Test in Python?

The Nemenyi post-hoc test is a non-parametric test used after a significant Friedman test to determine which pairs of groups differ. In Python, it is available in the scikit-posthocs library through the posthoc_nemenyi_friedman() function, which returns a matrix of p-values, one for each pairwise comparison of groups.


The Friedman Test is a non-parametric alternative to the Repeated Measures ANOVA. It is used to determine whether or not there is a statistically significant difference between the means of three or more groups in which the same subjects show up in each group.

If the p-value of the Friedman test is statistically significant, we can then perform the Nemenyi post-hoc test to determine exactly which groups are different.

The following step-by-step example shows how to perform the Nemenyi test in Python.

Step 1: Create the Data

Suppose a researcher wants to know whether the reaction times of patients are equal on three different drugs. To test this, the researcher measures the reaction time (in seconds) of 10 different patients on each of the three drugs.

We can create the following three arrays that contain the response times for each patient on each of the three drugs:

group1 = [4, 6, 3, 4, 3, 2, 2, 7, 6, 5]
group2 = [5, 6, 8, 7, 7, 8, 4, 6, 4, 5]
group3 = [2, 2, 5, 3, 2, 2, 1, 4, 3, 2]

Step 2: Perform the Friedman Test

Next, we’ll perform the Friedman Test using the friedmanchisquare() function from the scipy.stats library:

from scipy import stats

#perform Friedman Test
stats.friedmanchisquare(group1, group2, group3)

FriedmanchisquareResult(statistic=13.3513513, pvalue=0.00126122012)
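To see where this statistic comes from, it can help to reproduce it by hand. The sketch below is our own illustration, not part of the original workflow: it ranks the three drugs within each patient, applies the standard tie-corrected Friedman formula, and looks up the p-value from a chi-square distribution with k - 1 degrees of freedom. The variable names (rank_sums, tie_term, correction) are ours.

```python
import numpy as np
from scipy.stats import chi2, rankdata

group1 = [4, 6, 3, 4, 3, 2, 2, 7, 6, 5]
group2 = [5, 6, 8, 7, 7, 8, 4, 6, 4, 5]
group3 = [2, 2, 5, 3, 2, 2, 1, 4, 3, 2]

# rows = patients (blocks), columns = drugs (groups)
data = np.array([group1, group2, group3]).T
n, k = data.shape

# rank the three drugs within each patient; tied values share the average rank
ranks = rankdata(data, axis=1)
rank_sums = ranks.sum(axis=0)

# Friedman statistic before the tie correction
stat = 12.0 / (n * k * (k + 1)) * np.sum(rank_sums ** 2) - 3 * n * (k + 1)

# tie correction: sum of (t^3 - t) over each set of tied values within a patient
tie_term = 0
for row in data:
    _, counts = np.unique(row, return_counts=True)
    tie_term += np.sum(counts ** 3 - counts)
correction = 1 - tie_term / (n * k * (k ** 2 - 1))

stat_corrected = stat / correction
p_value = chi2.sf(stat_corrected, df=k - 1)

print(rank_sums)
print(stat_corrected, p_value)
```

The rank sums for the three drugs come out to 21.5, 27 and 11.5, and the corrected statistic and p-value agree with the friedmanchisquare() output above.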

The Friedman Test uses the following null and alternative hypotheses:

The null hypothesis (H0): The mean for each population is equal.

The alternative hypothesis (Ha): At least one population mean is different from the rest.

In this example, the test statistic is 13.35135 and the corresponding p-value is 0.00126. Since this p-value is less than 0.05, we can reject the null hypothesis that the mean response time is the same for all three drugs.

In other words, we have sufficient evidence to conclude that the type of drug used leads to statistically significant differences in response time.
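The same decision can be made in code rather than by reading the printed result. The following is a minimal sketch; the 0.05 threshold matches the one used in this example, and the names alpha and reject_null are ours.

```python
from scipy import stats

group1 = [4, 6, 3, 4, 3, 2, 2, 7, 6, 5]
group2 = [5, 6, 8, 7, 7, 8, 4, 6, 4, 5]
group3 = [2, 2, 5, 3, 2, 2, 1, 4, 3, 2]

alpha = 0.05  # significance level used in this example

# friedmanchisquare() returns the statistic and the p-value as a pair
statistic, p_value = stats.friedmanchisquare(group1, group2, group3)

# reject the null hypothesis only if the p-value falls below alpha
reject_null = p_value < alpha

if reject_null:
    print("Significant: proceed to the Nemenyi post-hoc test")
else:
    print("Not significant: no post-hoc test needed")
```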

Step 3: Perform the Nemenyi Test

Next, we can perform the Nemenyi post-hoc test to determine exactly which groups have different means. To do so, we first need to install the scikit-posthocs library:

pip install scikit-posthocs

Next, we’ll use the posthoc_nemenyi_friedman() function to perform the Nemenyi post-hoc test:

import scikit_posthocs as sp
import numpy as np

#combine three groups into one array
data = np.array([group1, group2, group3])

#perform Nemenyi post-hoc test
sp.posthoc_nemenyi_friedman(data.T)

          0         1         2
0  1.000000  0.437407  0.065303
1  0.437407  1.000000  0.001533
2  0.065303  0.001533  1.000000

Note: We had to transpose the numpy array (data.T) in order to perform the post-hoc test correctly.
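The reason for the transpose is that, by default, posthoc_nemenyi_friedman() treats rows as blocks (here, patients) and columns as groups (here, drugs). A quick shape check, shown as a small sketch below, makes the difference visible:

```python
import numpy as np

group1 = [4, 6, 3, 4, 3, 2, 2, 7, 6, 5]
group2 = [5, 6, 8, 7, 7, 8, 4, 6, 4, 5]
group3 = [2, 2, 5, 3, 2, 2, 1, 4, 3, 2]

data = np.array([group1, group2, group3])

# one row per drug, one column per patient
print(data.shape)

# after transposing: one row per patient, one column per drug
print(data.T.shape)

# the first row of data.T holds patient 1's times on the three drugs
print(data.T[0])
```

Without the transpose, the function would interpret the array as 3 subjects measured under 10 conditions.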

The Nemenyi post-hoc test returns the p-values for each pairwise comparison of means. From the output we can see the following p-values:

  • P-value of group 0 vs. group 1: 0.4374
  • P-value of group 0 vs. group 2: 0.0653
  • P-value of group 1 vs. group 2: 0.0015
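When the Nemenyi test follows a Friedman test, it compares groups by their mean rank across subjects, so looking at those mean ranks helps explain the pattern of p-values. The sketch below is our own illustration using scipy.stats.rankdata; it is not a call into scikit-posthocs, and it does not compute the p-values themselves.

```python
import numpy as np
from scipy.stats import rankdata

group1 = [4, 6, 3, 4, 3, 2, 2, 7, 6, 5]
group2 = [5, 6, 8, 7, 7, 8, 4, 6, 4, 5]
group3 = [2, 2, 5, 3, 2, 2, 1, 4, 3, 2]

# rows = patients, columns = drugs (positions 0, 1, 2 in the output)
data = np.array([group1, group2, group3]).T

# rank the drugs within each patient, then average the ranks per drug
ranks = rankdata(data, axis=1)
mean_ranks = ranks.mean(axis=0)

# absolute difference in mean rank for every pair of drugs
rank_gaps = np.abs(mean_ranks[:, None] - mean_ranks[None, :])

print(mean_ranks)
print(rank_gaps)
```

The mean ranks are 2.15, 2.70 and 1.15. The largest gap (1.55) is between positions 1 and 2, which is the pair with the smallest p-value, while the smallest gap (0.55) is between positions 0 and 1, the pair with the largest p-value.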

At α = .05, the only two groups that have statistically significantly different means are group 1 and group 2, as labeled in the output.

Note: The output labels the groups by position (0, 1, 2) rather than by their original numbers (1, 2, 3). Thus, the groups from the original data that are significantly different are groups 2 and 3.
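To avoid mixing up the 0-based labels with the original group names, the significant pairs can be pulled out programmatically. The sketch below uses only the standard library and hard-codes the p-values copied from the output above; the labels list is our own mapping from positions 0, 1, 2 back to the original variable names.

```python
from itertools import combinations

# p-values copied from the Nemenyi output shown above
p_values = [
    [1.000000, 0.437407, 0.065303],
    [0.437407, 1.000000, 0.001533],
    [0.065303, 0.001533, 1.000000],
]

# original names for positions 0, 1, 2 (our own mapping)
labels = ["group1", "group2", "group3"]
alpha = 0.05

# keep each unordered pair whose p-value falls below alpha
significant = [
    (labels[i], labels[j], p_values[i][j])
    for i, j in combinations(range(len(labels)), 2)
    if p_values[i][j] < alpha
]

print(significant)
```

With these values the list contains a single pair, group2 and group3, which matches the conclusion above.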
