What is Dixon’s Q Test and can you provide an example?

Dixon’s Q Test is a statistical method used to identify a single outlier in a small data set. It checks whether the largest or smallest value lies unusually far from its nearest neighbor. The test calculates a test statistic, known as Q, which is the gap between the suspected value and its nearest neighbor divided by the range of the data. If Q exceeds the critical value for the given sample size and confidence level, the suspected value is considered an outlier.

For example, let’s say we have a data set of 10 numbers: 5, 6, 7, 8, 9, 10, 11, 12, 13, 100. Here Q = (100 – 13) / (100 – 5) = 87 / 95 ≈ 0.916, which is well above the 95% critical value of 0.466 for a sample size of 10, so the Q Test would identify 100 as an outlier. This test is commonly used in quality control and can be helpful in identifying data points that may be erroneous or need further investigation.

Dixon’s Q Test: Definition + Example


Dixon’s Q Test, often referred to simply as the Q Test, is a statistical test that is used for detecting outliers in a dataset.

The test statistic for the Q test is as follows:

Q = |xa – xb| / R

where xa is the suspected outlier, xb is the data point closest to xa, and R is the range of the dataset. In most cases, xa is the maximum value in the dataset but it can also be the minimum value.
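The formula can be sketched in base R as follows. The helper name dixon_q and its side argument are hypothetical, introduced only for illustration and not part of any package. The sketch assumes a numeric vector with at least three values and a nonzero range.

```r
# Hypothetical helper (not from any package): computes Q = |xa - xb| / R
# for the largest ("max") or smallest ("min") value of a numeric vector.
dixon_q <- function(x, side = c("max", "min")) {
  side <- match.arg(side)
  x <- sort(x)                      # ascending order
  n <- length(x)
  stopifnot(n >= 3, x[n] > x[1])    # need at least 3 values and a nonzero range
  r <- x[n] - x[1]                  # R: range of the dataset
  # gap between the suspected value and its nearest neighbor
  gap <- if (side == "max") x[n] - x[n - 1] else x[2] - x[1]
  gap / r
}

# Ten-value example from the introduction: the suspected outlier is 100
q_max <- dixon_q(c(5, 6, 7, 8, 9, 10, 11, 12, 13, 100))
print(q_max)  # (100 - 13) / (100 - 5) = 87 / 95
```

Passing side = "min" applies the same formula to the smallest value instead of the largest.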

It’s important to note that the Q test is typically performed on small datasets and the test assumes that the data is normally distributed. It’s also important to note that the Q test should only be conducted one time for a given dataset.

How to Conduct Dixon’s Q Test By Hand

Suppose we have the following dataset: 

1, 3, 5, 7, 8, 9, 13, 25

We can follow the steps below to conduct Dixon’s Q Test by hand and determine whether the maximum value in this dataset is an outlier:

Step 1. State the hypotheses. 

The null hypothesis (H0): The max is not an outlier.

The alternative hypothesis (Ha): The max is an outlier.

Step 2. Determine a significance level to use.

Common choices are 0.1, 0.05, and 0.01. We will use a 0.05 level of significance for this example, which corresponds to a 95% confidence level.

Step 3. Find the test statistic.

Q = |xa – xb| / R

In this case, our max value is xa = 25, the closest value to it is xb = 13, and our range is R = 25 – 1 = 24. Thus, the test statistic is Q = |25 – 13| / 24 = 12 / 24 = 0.5.

Next, we can compare this test statistic to the Q test critical values, which are shown below for various sample sizes (n) and confidence levels:

n       90%       95%       99%
3    0.941    0.970    0.994
4    0.765    0.829    0.926
5    0.642    0.710    0.821
6    0.560    0.625    0.740
7    0.507    0.568    0.680
8    0.468    0.526    0.634
9    0.437    0.493    0.598
10 0.412    0.466    0.568
11 0.392    0.444    0.542
12 0.376    0.426    0.522
13 0.361    0.410    0.503
14 0.349    0.396    0.488
15 0.338    0.384    0.475
16 0.329    0.374    0.463
17 0.320    0.365    0.452
18 0.313    0.356    0.442
19 0.306    0.349    0.433
20 0.300    0.342    0.425
21 0.295    0.337    0.418
22 0.290    0.331    0.411
23 0.285    0.326    0.404
24 0.281    0.321    0.399
25 0.277    0.317    0.393
26 0.273    0.312    0.388
27 0.269    0.308    0.384
28 0.266    0.305    0.380
29 0.263    0.301    0.376
30 0.260    0.298    0.372

The critical value for a sample size of 8 and a confidence level of 95% is 0.526.

Step 4. Reject or fail to reject the null hypothesis.

Since our test statistic Q (0.5) is less than the critical value (0.526), we fail to reject the null hypothesis.
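Steps 3 and 4 can also be reproduced in base R as a quick check on the arithmetic. This is a minimal sketch: the variable names are illustrative, and the lookup vector only copies the 95% critical values for n = 4 to 10 from the table above.

```r
# Worked example from the steps above, using base R only
data <- c(1, 3, 5, 7, 8, 9, 13, 25)
x <- sort(data)
n <- length(x)

# Step 3: test statistic for the maximum value
q_stat <- (x[n] - x[n - 1]) / (x[n] - x[1])   # (25 - 13) / (25 - 1)

# 95% critical values copied from the table above (n = 4 to 10 only)
q_crit_95 <- c("4" = 0.829, "5" = 0.710, "6" = 0.625, "7" = 0.568,
               "8" = 0.526, "9" = 0.493, "10" = 0.466)
q_crit <- q_crit_95[[as.character(n)]]

# Step 4: reject H0 only if Q exceeds the critical value
reject_h0 <- q_stat > q_crit
cat("Q =", q_stat, " critical value =", q_crit, " reject H0:", reject_h0, "\n")
```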

Step 5. Interpret the results. 

Since we failed to reject the null hypothesis, we conclude that the max value 25 is not an outlier in this dataset.

How to Conduct Dixon’s Q Test in R

To conduct Dixon’s Q Test on the same dataset in R, we can use the dixon.test() function from the outliers library, which uses the following syntax:

dixon.test(data, type = 10, opposite = FALSE)

  • data: a numeric vector of data values
  • type: the variant of the formula used to compute the test statistic Q. Set to 10 to use the formula outlined earlier.
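  • opposite: If FALSE (the default), the test is applied to the value that lies farthest from the sample mean, which is the maximum value in this example. If TRUE, the test is applied to the value at the opposite end of the data, which is the minimum value in this example.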

Note: The complete documentation for dixon.test() is available in the outliers package reference manual.

The following code illustrates how to conduct Dixon’s Q Test to determine if the maximum value in the dataset is an outlier.

#load the outliers library
library(outliers)

#create data
data <- c(1, 3, 5, 7, 8, 9, 13, 25)

#conduct Dixon's Q Test
dixon.test(data, type = 10)

#	Dixon test for outliers
#
#data:  data
#Q = 0.5, p-value = 0.06913
#alternative hypothesis: highest value 25 is an outlier

From the output we can see that the test statistic is Q = 0.5 and the corresponding p-value is 0.06913. Thus, we fail to reject the null hypothesis at a 0.05 significance level and conclude that 25 is not an outlier. This matches the result we got by hand.
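Because dixon.test() returns a standard htest object, the statistic and p-value can be extracted and compared with the significance level in code rather than read from the printed output. The following is a minimal sketch that assumes the outliers package is installed; the requireNamespace() guard simply skips the example otherwise, and alpha and the other variable names are illustrative.

```r
alpha <- 0.05  # significance level chosen in Step 2
data <- c(1, 3, 5, 7, 8, 9, 13, 25)

if (requireNamespace("outliers", quietly = TRUE)) {
  result <- outliers::dixon.test(data, type = 10)
  q_stat <- unname(result$statistic)   # test statistic Q
  p_val <- result$p.value              # p-value shown in the printed output
  if (p_val < alpha) {
    cat("Reject H0: the tested value is flagged as an outlier\n")
  } else {
    cat("Fail to reject H0: no evidence that the tested value is an outlier\n")
  }
} else {
  message("Package 'outliers' is not installed; skipping the example.")
}
```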
