Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

PenDigits (version#06)

The 10 classes in this data set correspond to the digits 0 to 9, with examples written in different handwriting. Class 4, defined here as the outlier class, was downsampled to only 20 objects. After preprocessing, this data set has 16 numeric attributes and 9868 instances, divided into 20 outliers (0.2%) and 9848 inliers (99.8%). The data set is already normalized, i.e., all 16 attributes (spatial coordinates) have the same range [0,100]. It has been used in this form in [1,2].

References:

[1] H.-P. Kriegel, P. Kroeger, E. Schubert, and A. Zimek. Interpreting and unifying outlier scores. In Proc. SDM, pages 13-24, 2011.
[2] E. Schubert, R. Wojdanowski, A. Zimek, and H.-P. Kriegel. On evaluation of outlier rankings and outlier scores. In Proc. SDM, pages 1047-1058, 2012.

Download all data set variants used (2.1 MB). You can also access the original data (merge the training and test files, pendigits.tra and pendigits.tes).
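As a rough illustration of the preprocessing described above (merging the training and test files, then downsampling class 4 to 20 outliers), the following is a minimal sketch. It assumes the original files hold one object per line as comma-separated integers with the class label last. The helper names parse_line, downsample_class and load_merged, as well as the seed, are hypothetical, and a random sample will not necessarily reproduce the exact 20 outliers of this version. Normalization and duplicate removal are not shown.

```python
import random


def parse_line(line):
    """Split one comma-separated line into (attributes, class label)."""
    values = [int(v) for v in line.strip().split(",")]
    return values[:-1], values[-1]


def downsample_class(rows, outlier_class=4, n_outliers=20, seed=0):
    """Keep all other classes as inliers plus a random sample of one class.

    Returns (attributes, is_outlier) pairs. The random choice is an
    assumption: the description does not say how the 20 objects were picked.
    """
    rng = random.Random(seed)
    inliers = [(x, False) for x, label in rows if label != outlier_class]
    candidates = [x for x, label in rows if label == outlier_class]
    outliers = [(x, True) for x in rng.sample(candidates, n_outliers)]
    return inliers + outliers


def load_merged(paths):
    """Read and concatenate several files, e.g. pendigits.tra and pendigits.tes."""
    rows = []
    for path in paths:
        with open(path) as handle:
            rows.extend(parse_line(line) for line in handle if line.strip())
    return rows
```

With the two original files in the working directory, a call such as downsample_class(load_merged(["pendigits.tra", "pendigits.tes"])) would yield all objects of the nine inlier classes plus 20 sampled objects of class 4.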

Normalized, without duplicates

This version contains 16 attributes, 9868 objects, 20 outliers (0.20%)

Download raw algorithm results (83.1 MB)
Download raw algorithm evaluation table (56.3 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The maximum F1-measure is provided in addition to the measures in the original publication.

Algorithm        k      P@n  Adj. P@n       AP  Adj. AP   Max-F1  Adj. Max-F1  ROC AUC
KNN              5  0.05000   0.04807  0.07238  0.07050  0.15714      0.15543  0.98396
KNN             11  0.00000  -0.00203  0.09878  0.09695  0.20635      0.20474  0.98912
KNN             12  0.00000  -0.00203  0.09931  0.09748  0.21176      0.21016  0.98893
KNN             15  0.00000  -0.00203  0.09466  0.09282  0.22222      0.22064  0.98763
KNNW             1  0.00000  -0.00203  0.02395  0.02197  0.08989      0.08804  0.91353
KNNW            18  0.00000  -0.00203  0.07886  0.07699  0.18543      0.18378  0.98788
KNNW            20  0.00000  -0.00203  0.08003  0.07817  0.18072      0.17906  0.98801
KNNW            23  0.00000  -0.00203  0.08105  0.07919  0.17722      0.17554  0.98794
LOF             39  0.15000   0.14827  0.05825  0.05634  0.17391      0.17224  0.96115
LOF             40  0.10000   0.09817  0.05994  0.05803  0.19048      0.18883  0.96116
LOF             41  0.15000   0.14827  0.06171  0.05980  0.19048      0.18883  0.96187
LOF             61  0.10000   0.09817  0.05419  0.05227  0.18605      0.18439  0.96264
SimplifiedLOF   36  0.00000  -0.00203  0.05716  0.05525  0.23077      0.22921  0.95042
SimplifiedLOF   38  0.05000   0.04807  0.05884  0.05693  0.22642      0.22484  0.95187
SimplifiedLOF   57  0.15000   0.14827  0.05499  0.05307  0.17857      0.17690  0.95825
SimplifiedLOF   61  0.15000   0.14827  0.05358  0.05166  0.17857      0.17690  0.95940
LoOP            48  0.00000  -0.00203  0.04942  0.04749  0.19231      0.19067  0.95059
LoOP            59  0.05000   0.04807  0.05170  0.04977  0.18182      0.18016  0.95415
LoOP            60  0.10000   0.09817  0.05155  0.04963  0.18182      0.18016  0.95440
LoOP            82  0.10000   0.09817  0.04889  0.04695  0.18182      0.18016  0.95588
LDOF            29  0.05000   0.04807  0.05604  0.05413  0.09524      0.09340  0.70227
LDOF            77  0.10000   0.09817  0.01441  0.01241  0.10000      0.09817  0.83419
LDOF            80  0.10000   0.09817  0.01450  0.01250  0.10256      0.10074  0.83538
LDOF            85  0.05000   0.04807  0.01405  0.01204  0.09756      0.09573  0.84212
ODIN             6  0.00424   0.00222  0.00325  0.00123  0.00792      0.00590  0.62927
ODIN           100  0.00000  -0.00203  0.03603  0.03407  0.09459      0.09276  0.96531
FastABOD         3  0.10000   0.09817  0.04824  0.04631  0.16667      0.16497  0.82347
FastABOD        98  0.00000  -0.00203  0.02856  0.02658  0.07080      0.06891  0.95920
KDEOS           21  0.05000   0.04807  0.00498  0.00296  0.05128      0.04936  0.58601
KDEOS           33  0.05000   0.04807  0.05303  0.05111  0.09524      0.09340  0.68153
KDEOS           38  0.05000   0.04807  0.05349  0.05157  0.09524      0.09340  0.71560
KDEOS          100  0.05000   0.04807  0.01001  0.00800  0.05128      0.04936  0.85813
LDF              7  0.05000   0.04807  0.05846  0.05655  0.20290      0.20128  0.82476
LDF             12  0.15000   0.14827  0.08384  0.08198  0.19355      0.19191  0.97759
INFLO           23  0.10000   0.09817  0.03648  0.03452  0.16000      0.15829  0.88332
INFLO           56  0.10000   0.09817  0.04481  0.04287  0.15385      0.15213  0.93789
INFLO           64  0.05000   0.04807  0.04311  0.04117  0.17391      0.17224  0.94284
INFLO          100  0.00000  -0.00203  0.03525  0.03330  0.15385      0.15213  0.94699
COF             29  0.15000   0.14827  0.09160  0.08976  0.22222      0.22064  0.96138
COF             38  0.15000   0.14827  0.10417  0.10235  0.19231      0.19067  0.96804
COF             50  0.15000   0.14827  0.14789  0.14616  0.17949      0.17782  0.96549
COF             59  0.20000   0.19838  0.13310  0.13134  0.20000      0.19838  0.96654
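
The adjusted columns in the table above are numerically consistent with a standard adjustment for chance, in which the expected value under a random ranking is taken to be the outlier rate (here 20 / 9868). The following minimal sketch illustrates that relationship. The function name adjust_for_chance is hypothetical, and the formula is inferred from the tabulated values rather than taken from the evaluation code itself.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that the outlier rate maps to 0 and a perfect score stays 1.

    Assumption: the expected value of the measure under a random ranking
    is the outlier rate n_outliers / n_objects.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


# Values taken from the KNN, k = 5 row of the table.
N_OUTLIERS, N_OBJECTS = 20, 9868
adj_p_at_n = adjust_for_chance(0.05000, N_OUTLIERS, N_OBJECTS)
adj_ap = adjust_for_chance(0.07238, N_OUTLIERS, N_OBJECTS)
print(f"{adj_p_at_n:.5f} {adj_ap:.5f}")  # prints "0.04807 0.07050"
```

A P@n of 0 maps to a slightly negative adjusted value (-0.00203 in the table), which indicates a result marginally worse than the random-ranking expectation.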

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO