Dixon's Q test


In statistics, Dixon's Q test, or simply the Q test, is used for identification and rejection of outliers. The test assumes a normal distribution and, per Robert Dean and Wilfrid Dixon, and others, should be used sparingly and never more than once in a data set. To apply a Q test for bad data, arrange the data in order of increasing values and calculate Q as defined:


Q = gap / range

Where gap is the absolute difference between the outlier in question and the closest number to it. If Q > Qtable, where Qtable is a reference value corresponding to the sample size and confidence level, then reject the questionable point. Note that only one point may be rejected from a data set using a Q test.
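The statistic can be sketched as a small helper (the function name is hypothetical; this is a minimal sketch of the single-outlier version, not Dixon's full family of tests):

```python
# Minimal sketch of Dixon's Q statistic (hypothetical helper name).
def dixon_q(data):
    """Return Q for the most extreme value in `data`."""
    s = sorted(data)            # arrange in increasing order
    gap_low = s[1] - s[0]       # gap if the minimum is the suspect point
    gap_high = s[-1] - s[-2]    # gap if the maximum is the suspect point
    data_range = s[-1] - s[0]
    return max(gap_low, gap_high) / data_range
```

The returned value is then compared against Qtable for the given sample size and confidence level; the suspect point is rejected only if Q > Qtable, and at most once per data set.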

Example

Consider the data set:

0.189 ,   0.167 ,   0.187 ,   0.183 ,   0.186 ,   0.182 ,   0.181 ,   0.184 ,   0.181 ,   0.177

Now rearrange in increasing order:

0.167 ,   0.177 ,   0.181 ,   0.181 ,   0.182 ,   0.183 ,   0.184 ,   0.186 ,   0.187 ,   0.189

We hypothesize that 0.167 is an outlier. Calculate Q:

Q = gap / range = (0.177 − 0.167) / (0.189 − 0.167) = 0.455.

With 10 observations and at 90% confidence, Q = 0.455 > 0.412 = Qtable, so we conclude 0.167 is an outlier. However, at 95% confidence, Q = 0.455 < 0.466 = Qtable, so 0.167 is not considered an outlier. This means that for this example we can be 90% sure that 0.167 is an outlier, but we cannot be 95% sure.
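The arithmetic above can be checked directly; the critical values 0.412 and 0.466 are the ones quoted for N = 10 at 90% and 95% confidence:

```python
# Worked example: Q statistic for the suspect value 0.167.
gap = 0.177 - 0.167            # distance to the nearest neighbour
data_range = 0.189 - 0.167     # max - min of the sorted data
q = gap / data_range
print(round(q, 3))             # 0.455
print(q > 0.412)               # True  -> rejected at 90% confidence
print(q < 0.466)               # True  -> retained at 95% confidence
```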

McBane notes: Dixon provided related tests intended to search for more than one outlier, but they are much less frequently used than the r10 or Q version that is intended to eliminate a single outlier.

Table

This table summarizes the limit values (critical values) of the two-tailed test, as commonly published:

Number of values N:   3      4      5      6      7      8      9      10
Q90%:                 0.941  0.765  0.642  0.560  0.507  0.468  0.437  0.412
Q95%:                 0.970  0.829  0.710  0.625  0.568  0.526  0.493  0.466
Q99%:                 0.994  0.926  0.821  0.740  0.680  0.634  0.598  0.568
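The rejection rule then reduces to a table lookup. A sketch at 90% confidence, using the commonly published critical values (the N = 10 entry, 0.412, matches the worked example above; the function name is hypothetical):

```python
# Commonly published 90%-confidence critical values for Dixon's Q test.
Q90 = {3: 0.941, 4: 0.765, 5: 0.642, 6: 0.560,
       7: 0.507, 8: 0.468, 9: 0.437, 10: 0.412}

def reject_outlier(data, q_table=Q90):
    """Return the suspect value if Q > Qtable, else None."""
    s = sorted(data)
    gap_low, gap_high = s[1] - s[0], s[-1] - s[-2]
    # The suspect is whichever extreme sits farther from its neighbour.
    suspect, gap = (s[0], gap_low) if gap_low > gap_high else (s[-1], gap_high)
    q = gap / (s[-1] - s[0])
    return suspect if q > q_table[len(s)] else None
```

Applied to the data set above, the lookup rejects 0.167 at 90% confidence, in agreement with the hand calculation.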
