Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

PenDigits (version#09)

The 10 classes in this data set correspond to the digits 0 to 9, with examples drawn from different handwriting. Class 4, defined here as the outlier class, was downsampled to 20 objects. After preprocessing, the data set has 16 numeric attributes and 9868 instances: 20 outliers (0.2%) and 9848 inliers (99.8%). The data set is already normalized, i.e., all 16 attributes (spatial coordinates) have the same range [0,100]. It has been used in this form in [1,2].
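
The downsampling step described above can be sketched as follows. This is a minimal illustration, not the procedure used to build the published variants: the function name, the `seed` parameter, and the use of uniform random sampling are assumptions, and the sketch will not reproduce the exact 20 objects of this version.

```python
import numpy as np

def downsample_outlier_class(X, y, outlier_class=4, n_outliers=20, seed=0):
    """Keep all objects of the other classes and a random sample of one class.

    Hypothetical helper: the sampling scheme and seed are assumptions.
    Returns the reduced attribute matrix and a boolean outlier indicator.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    outlier_idx = np.flatnonzero(y == outlier_class)
    inlier_idx = np.flatnonzero(y != outlier_class)
    # Sample without replacement so no outlier is kept twice.
    kept = rng.choice(outlier_idx, size=n_outliers, replace=False)
    # Sort indices to preserve the original object order.
    keep = np.sort(np.concatenate([inlier_idx, kept]))
    return X[keep], y[keep] == outlier_class

# Possible usage, assuming the original files are comma-separated with the
# class label in the last column (an assumption about the file layout):
#   data = np.vstack([np.loadtxt("pendigits.tra", delimiter=","),
#                     np.loadtxt("pendigits.tes", delimiter=",")])
#   X, is_outlier = downsample_outlier_class(data[:, :16], data[:, 16])
```

With the merged original data as input, the result should have the 9848 inliers and 20 outliers stated above, provided class 4 is the only class that is reduced.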

References:

[1] H.-P. Kriegel, P. Kroeger, E. Schubert, and A. Zimek. Interpreting and unifying outlier scores. In Proc. SDM, pages 13-24, 2011.
[2] E. Schubert, R. Wojdanowski, A. Zimek, and H.-P. Kriegel. On evaluation of outlier rankings and outlier scores. In Proc. SDM, pages 1047-1058, 2012.

Download all data set variants used (2.1 MB). You can also access the original data (merge the train and test files, pendigits.tra and pendigits.tes).

Normalized, without duplicates

This version contains 16 attributes, 9868 objects, 20 outliers (0.20%)

Download raw algorithm results (83.1 MB)
Download raw algorithm evaluation table (59.1 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one k, only the smallest k is given).
The Maximum F1-Measure is provided as a complement to the measures in the original publication.

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN              1   0.15000   0.14827   0.06193   0.06002   0.17857   0.17690   0.94565
KNN              7   0.05000   0.04807   0.13860   0.13685   0.27397   0.27250   0.99254
KNNW             4   0.15000   0.14827   0.07025   0.06836   0.15909   0.15738   0.97206
KNNW            12   0.05000   0.04807   0.10735   0.10553   0.24242   0.24089   0.98960
KNNW            18   0.05000   0.04807   0.11067   0.10886   0.23881   0.23726   0.99054
KNNW            19   0.05000   0.04807   0.10996   0.10815   0.24000   0.23846   0.99054
LOF             33   0.10000   0.09817   0.09015   0.08830   0.25000   0.24848   0.97653
LOF             41   0.10000   0.09817   0.09267   0.09082   0.21978   0.21820   0.97693
LOF             45   0.10000   0.09817   0.08788   0.08603   0.20930   0.20770   0.97727
LOF             52   0.15000   0.14827   0.08705   0.08520   0.18182   0.18016   0.97487
SimplifiedLOF    3   0.10000   0.09817   0.01786   0.01586   0.10000   0.09817   0.58818
SimplifiedLOF   39   0.10000   0.09817   0.07628   0.07441   0.18182   0.18016   0.97398
SimplifiedLOF   45   0.10000   0.09817   0.07514   0.07326   0.16000   0.15829   0.97560
SimplifiedLOF   62   0.10000   0.09817   0.07871   0.07684   0.15152   0.14979   0.97268
LoOP            36   0.10000   0.09817   0.04652   0.04458   0.11765   0.11586   0.96214
LoOP            54   0.05000   0.04807   0.05787   0.05596   0.16000   0.15829   0.96849
LoOP            58   0.05000   0.04807   0.05832   0.05641   0.16000   0.15829   0.96876
LoOP            71   0.10000   0.09817   0.05706   0.05515   0.12500   0.12322   0.96973
LDOF             3   0.05000   0.04807   0.00536   0.00334   0.05000   0.04807   0.55617
LDOF            32   0.05000   0.04807   0.01083   0.00882   0.06667   0.06477   0.78603
LDOF            95   0.00000  -0.00203   0.01353   0.01152   0.04615   0.04422   0.81946
LDOF            99   0.00000  -0.00203   0.01331   0.01131   0.04724   0.04531   0.82428
ODIN             4   0.00592   0.00390   0.00295   0.00093   0.01139   0.00938   0.56528
ODIN            89   0.00000  -0.00203   0.04315   0.04120   0.11940   0.11761   0.96502
ODIN            98   0.00000  -0.00203   0.04423   0.04228   0.11765   0.11586   0.96573
ODIN            99   0.00000  -0.00203   0.04359   0.04164   0.10526   0.10345   0.96639
FastABOD         3   0.00000  -0.00203   0.02009   0.01810   0.08696   0.08510   0.86038
FastABOD        69   0.00000  -0.00203   0.04944   0.04751   0.16071   0.15901   0.96996
FastABOD        94   0.00000  -0.00203   0.05060   0.04867   0.15517   0.15346   0.97146
FastABOD       100   0.00000  -0.00203   0.05018   0.04825   0.15254   0.15082   0.97169
KDEOS           92   0.05000   0.04807   0.00823   0.00621   0.05000   0.04807   0.83601
KDEOS          100   0.05000   0.04807   0.00883   0.00681   0.05405   0.05213   0.84106
LDF             10   0.10000   0.09817   0.13490   0.13315   0.25806   0.25656   0.98528
LDF             12   0.15000   0.14827   0.14455   0.14282   0.29412   0.29268   0.98311
LDF             17   0.20000   0.19838   0.12224   0.12046   0.22535   0.22378   0.98094
INFLO            1   0.10000   0.09817   0.01481   0.01281   0.10526   0.10345   0.54042
INFLO           55   0.05000   0.04807   0.04976   0.04783   0.13043   0.12867   0.95882
INFLO           70   0.05000   0.04807   0.05186   0.04993   0.12500   0.12322   0.96005
COF             31   0.10000   0.09817   0.15006   0.14834   0.26374   0.26224   0.99145
COF             42   0.10000   0.09817   0.19010   0.18846   0.32877   0.32740   0.98892
COF             46   0.15000   0.14827   0.22898   0.22741   0.30380   0.30238   0.98961
COF             54   0.20000   0.19838   0.21216   0.21056   0.29213   0.29070   0.99029
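
The relation between the raw and adjusted columns in the table can be illustrated with a short sketch. It assumes the usual adjustment for chance, (value - expected) / (1 - expected), with the expected value taken as the outlier fraction (20 / 9868 here). Applying the same expectation to Max-F1 is inferred from the numbers in the table rather than from a stated definition. The function names are hypothetical, and ties in the scores are simply kept in input order, which may differ from how the published results treat ties.

```python
import numpy as np

def ranking_measures(scores, is_outlier):
    """P@n, AP, Max-F1 and ROC AUC for outlier scores (higher = more outlying)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(is_outlier, dtype=bool)
    # Rank by descending score; tied scores keep input order (a simplification).
    hits = labels[np.argsort(-scores, kind="stable")]
    n_out = int(hits.sum())
    n_in = len(hits) - n_out
    ranks = np.arange(1, len(hits) + 1)
    tp = np.cumsum(hits)          # outliers among the top-i ranked objects
    precision = tp / ranks
    return {
        # Precision among the top n objects, with n = number of outliers.
        "P@n": float(precision[n_out - 1]),
        # Mean of the precision values at the rank of each outlier.
        "AP": float(precision[hits].mean()),
        # F1 at cutoff i is 2*TP / (i + number of outliers); take the maximum.
        "Max-F1": float((2 * tp / (ranks + n_out)).max()),
        # Fraction of (outlier, inlier) pairs with the outlier ranked first.
        "ROC AUC": float(tp[~hits].sum() / (n_out * n_in)),
    }

def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale so a random ranking scores about 0 and a perfect one scores 1."""
    expected = n_outliers / n_objects  # assumed chance level: outlier fraction
    return (value - expected) / (1.0 - expected)

# Raw P@n values taken from the table; 20 outliers among 9868 objects.
for raw in (0.15000, 0.00000):
    print(raw, round(adjust_for_chance(raw, 20, 9868), 5))
```

For these two raw values the table reports adjusted values of 0.14827 and -0.00203. The negative value shows that a P@n of 0 is slightly worse than the chance level. ROC AUC has no adjusted column in the table.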

Plots

Plots (not reproduced here) are provided for: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.

Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO