Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

PenDigits (version#03)

The 10 classes in this data set correspond to the digits 0 to 9, with examples written by different writers. Class 4, defined here as the outlier class, was downsampled to only 20 objects. After preprocessing, the data set has 16 numeric attributes and 9868 instances: 20 outliers (0.2%) and 9848 inliers (99.8%). The data set is already normalized in the sense that all 16 attributes (spatial coordinates) share the same range [0,100]. It has been used in this form in [1,2].

References:

[1] H.-P. Kriegel, P. Kroeger, E. Schubert, and A. Zimek. Interpreting and unifying outlier scores. In Proc. SDM, pages 13-24, 2011.
[2] E. Schubert, R. Wojdanowski, A. Zimek, and H.-P. Kriegel. On evaluation of outlier rankings and outlier scores. In Proc. SDM, pages 1047-1058, 2012.

Download all data set variants used (2.1 MB). The original data is also available; merge its training and test files (pendigits.tra and pendigits.tes).
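The preprocessing described above (merge the two original files, remove duplicates, keep only 20 objects of class 4) can be sketched roughly as follows. This is a minimal illustration, not the authors' script. It assumes that each row of the original files holds 16 comma-separated coordinates followed by the class label. The function names `load_rows` and `prepare`, the random selection with a fixed seed, and the order of the steps are all assumptions; which 20 objects were actually kept is not stated here.

```python
import csv
import io
import random

OUTLIER_CLASS = "4"  # class treated as outlier, per the description above
N_OUTLIERS = 20      # number of outlier objects kept, per the description above


def load_rows(text):
    """Parse comma-separated rows (assumed: coordinates, then class label)."""
    rows = []
    for rec in csv.reader(io.StringIO(text)):
        rec = [field.strip() for field in rec]
        if any(rec):  # skip empty lines
            rows.append(tuple(rec))
    return rows


def prepare(train_text, test_text, seed=0):
    """Merge both files, drop exact duplicates, downsample the outlier class."""
    merged = load_rows(train_text) + load_rows(test_text)
    unique = list(dict.fromkeys(merged))  # removes duplicates, keeps first-seen order
    inliers = [r for r in unique if r[-1] != OUTLIER_CLASS]
    outliers = [r for r in unique if r[-1] == OUTLIER_CLASS]
    # ASSUMPTION: a seeded random sample; the original selection is not documented here.
    rng = random.Random(seed)
    kept = rng.sample(outliers, min(N_OUTLIERS, len(outliers)))
    return inliers, kept


# Hypothetical usage with the original files in the working directory:
#   inliers, outliers = prepare(open("pendigits.tra").read(),
#                               open("pendigits.tes").read())
```

Because the sample is random, this sketch will generally not reproduce the exact 20 outliers of the published variant; use the downloadable data set when comparing against the results below.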

Normalized, without duplicates

This version contains 16 attributes, 9868 objects, 20 outliers (0.20%)

Download raw algorithm results (83.1 MB)
Download raw algorithm evaluation table (56.2 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure; when the same score was achieved for more than one value of k, only the smallest k is given.
The maximum F1-measure is reported in addition to the measures in the original publication.

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN              3   0.10000   0.09817   0.12332   0.12154   0.23656   0.23501   0.99136
KNN              8   0.05000   0.04807   0.15969   0.15799   0.30357   0.30216   0.99417
KNN             10   0.05000   0.04807   0.14946   0.14773   0.33663   0.33529   0.99375
KNNW             1   0.05000   0.04807   0.04360   0.04166   0.14545   0.14372   0.95877
KNNW            19   0.00000  -0.00203   0.13341   0.13165   0.28319   0.28173   0.99329
KNNW            31   0.05000   0.04807   0.13023   0.12846   0.30476   0.30335   0.99265
LOF             15   0.20000   0.19838   0.05623   0.05431   0.21053   0.20892   0.93826
LOF             35   0.15000   0.14827   0.07391   0.07203   0.17143   0.16975   0.97553
LOF             54   0.15000   0.14827   0.06440   0.06250   0.15385   0.15213   0.97638
SimplifiedLOF   10   0.15000   0.14827   0.03223   0.03027   0.15385   0.15213   0.65453
SimplifiedLOF   29   0.10000   0.09817   0.05887   0.05696   0.17778   0.17611   0.96146
SimplifiedLOF   34   0.10000   0.09817   0.06498   0.06308   0.17021   0.16853   0.96833
SimplifiedLOF   74   0.10000   0.09817   0.05958   0.05767   0.14286   0.14112   0.97438
LoOP            26   0.15000   0.14827   0.04119   0.03924   0.15385   0.15213   0.92713
LoOP            40   0.15000   0.14827   0.05269   0.05077   0.18182   0.18016   0.95747
LoOP            81   0.15000   0.14827   0.05838   0.05647   0.15789   0.15618   0.97152
LoOP            99   0.10000   0.09817   0.05493   0.05301   0.13953   0.13779   0.97249
LDOF            33   0.10000   0.09817   0.07216   0.07028   0.14286   0.14112   0.66378
LDOF            39   0.15000   0.14827   0.03540   0.03344   0.15385   0.15213   0.69495
LDOF            78   0.15000   0.14827   0.03025   0.02828   0.18605   0.18439   0.77990
LDOF           100   0.05000   0.04807   0.02217   0.02019   0.13333   0.13157   0.78623
ODIN            66   0.03750   0.03555   0.03833   0.03638   0.10000   0.09817   0.96434
ODIN            99   0.00000  -0.00203   0.05328   0.05136   0.12030   0.11851   0.97917
ODIN           100   0.00000  -0.00203   0.05325   0.05133   0.12214   0.12035   0.97917
FastABOD         3   0.10000   0.09817   0.03147   0.02950   0.10811   0.10630   0.88138
FastABOD        49   0.05000   0.04807   0.06885   0.06696   0.18868   0.18703   0.98250
FastABOD        72   0.05000   0.04807   0.07150   0.06961   0.15748   0.15577   0.98314
FastABOD        94   0.05000   0.04807   0.07821   0.07633   0.15730   0.15559   0.98227
KDEOS           21   0.05000   0.04807   0.00495   0.00293   0.05128   0.04936   0.59470
KDEOS           33   0.05000   0.04807   0.05283   0.05091   0.09524   0.09340   0.66845
KDEOS           60   0.05000   0.04807   0.05421   0.05229   0.09524   0.09340   0.77507
KDEOS          100   0.05000   0.04807   0.01032   0.00831   0.05882   0.05691   0.85806
LDF              7   0.20000   0.19838   0.07767   0.07579   0.20000   0.19838   0.81989
LDF             10   0.15000   0.14827   0.10241   0.10059   0.21429   0.21269   0.98421
LDF             14   0.10000   0.09817   0.09833   0.09650   0.23684   0.23529   0.98219
INFLO           20   0.15000   0.14827   0.03027   0.02830   0.16216   0.16046   0.78762
INFLO           34   0.10000   0.09817   0.04059   0.03864   0.17391   0.17224   0.93011
INFLO           81   0.15000   0.14827   0.04893   0.04700   0.15789   0.15618   0.96345
INFLO           98   0.10000   0.09817   0.04472   0.04278   0.12766   0.12589   0.96583
COF             21   0.20000   0.19838   0.06917   0.06728   0.20000   0.19838   0.97332
COF             34   0.15000   0.14827   0.13562   0.13386   0.25743   0.25592   0.98631
COF             55   0.10000   0.09817   0.17490   0.17323   0.31429   0.31289   0.98318
COF             56   0.10000   0.09817   0.17280   0.17112   0.33846   0.33712   0.98230
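The adjusted columns in the table appear consistent with the usual adjustment for chance, (value - expected) / (1 - expected), where the expected value is the outlier fraction 20/9868. The short sketch below recomputes a few adjusted values from raw values copied from the table. The function name `adjust_for_chance` is chosen here for illustration, and small deviations in the last digit can occur because the raw inputs are themselves rounded.

```python
N_OBJECTS = 9868  # objects in this data set variant
N_OUTLIERS = 20   # outliers in this data set variant


def adjust_for_chance(value, expected):
    """Rescale so that the expected value maps to 0 and a perfect score to 1."""
    return (value - expected) / (1.0 - expected)


# Expected value of a random ranking, taken here as the outlier fraction.
expected = N_OUTLIERS / N_OBJECTS

# (raw value, adjusted value as reported) pairs copied from the table above.
table_pairs = [
    (0.10000, 0.09817),   # P@n,    KNN k=3
    (0.00000, -0.00203),  # P@n,    KNNW k=19
    (0.12332, 0.12154),   # AP,     KNN k=3
    (0.23656, 0.23501),   # Max-F1, KNN k=3
]

for raw, reported in table_pairs:
    recomputed = adjust_for_chance(raw, expected)
    print(f"raw={raw:.5f}  recomputed={recomputed:.5f}  reported={reported:.5f}")
```

A negative adjusted value, as for P@n = 0, simply means the result is below what a random ranking would achieve on average.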

Plots

Plots are provided for each evaluation measure (precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC) and for the diversity of the methods.
A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF
G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO
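For readers who want to evaluate their own outlier scores on this data set, the unadjusted measures listed under Plots can be computed from a ranking roughly as follows. This is a minimal sketch, not the evaluation code behind the published results. It assumes that higher scores mean more outlying, that n in P@n equals the number of outliers, and that the maximum F1 is taken over all cut-off ranks. Ties are broken arbitrarily by sort order here, so results for methods producing many tied scores may differ from the published ones. The function name `evaluate_ranking` is hypothetical.

```python
def evaluate_ranking(scores, labels):
    """scores: higher = more outlying; labels: 1 for outlier, 0 for inlier."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranked = [labels[i] for i in order]
    n_out = sum(ranked)
    n_in = len(ranked) - n_out
    if n_out == 0 or n_in == 0:
        raise ValueError("need at least one outlier and one inlier")

    hits = 0           # outliers seen so far
    inliers_seen = 0   # inliers seen so far
    ap_sum = 0.0       # sum of precision values at the ranks of the outliers
    max_f1 = 0.0
    correct_pairs = 0  # (outlier, inlier) pairs with the outlier ranked first

    for rank, is_outlier in enumerate(ranked, start=1):
        if is_outlier:
            hits += 1
            ap_sum += hits / rank
            correct_pairs += n_in - inliers_seen
        else:
            inliers_seen += 1
        if hits:
            precision = hits / rank
            recall = hits / n_out
            max_f1 = max(max_f1, 2 * precision * recall / (precision + recall))

    return {
        "P@n": sum(ranked[:n_out]) / n_out,
        "AP": ap_sum / n_out,
        "Max-F1": max_f1,
        "ROC AUC": correct_pairs / (n_out * n_in),
    }


# Toy example with four objects, two of them outliers.
result = evaluate_ranking([0.9, 0.8, 0.7, 0.1], [1, 0, 1, 0])
print(result)
```

ROC AUC is computed here as the fraction of outlier/inlier pairs in which the outlier is ranked ahead of the inlier, which is equivalent to the area under the ROC curve when there are no ties.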