Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

PenDigits (version#07)

The 10 classes in this data set correspond to the digits 0 to 9, with examples written in different handwriting. Class 4, defined here as the outlier class, was downsampled to only 20 objects. After preprocessing, the data set has 16 numeric attributes and 9868 instances, divided into 20 outliers (0.2%) and 9848 inliers (99.8%). The data set is already normalized, i.e., all 16 attributes (spatial coordinates) have the same range [0,100]. It has been used in this form in [1,2].
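The downsampling step described above can be illustrated with a minimal sketch. The function name `downsample_class`, the uniform random sampling, and the fixed seed are assumptions made for illustration; the source does not state how the 20 objects of class 4 were selected. The toy data stands in for the real digit records.

```python
import random

def downsample_class(rows, outlier_label, n_keep, seed=0):
    """Keep every row of the other classes and a random sample of
    n_keep rows of outlier_label.

    rows is a list of (features, label) pairs. Uniform sampling with a
    fixed seed is an assumption, not the documented procedure.
    """
    rng = random.Random(seed)
    outliers = [r for r in rows if r[1] == outlier_label]
    inliers = [r for r in rows if r[1] != outlier_label]
    kept = rng.sample(outliers, min(n_keep, len(outliers)))
    return inliers + kept

# Toy data: 10 classes with 50 points each (not the real PenDigits records).
toy = [((i, c), c) for c in range(10) for i in range(50)]
reduced = downsample_class(toy, outlier_label=4, n_keep=20)
n_out = sum(1 for _, label in reduced if label == 4)

# Proportion stated for this version: 20 outliers among 9868 instances.
outlier_fraction = 20 / 9868
print(n_out, len(reduced), round(100 * outlier_fraction, 2))
```

With the counts given above, 20 / 9868 is roughly 0.2%, which matches the stated proportion of outliers.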

References:

[1] H.-P. Kriegel, P. Kroeger, E. Schubert, and A. Zimek. Interpreting and unifying outlier scores. In Proc. SDM, pages 13-24, 2011.
[2] E. Schubert, R. Wojdanowski, A. Zimek, and H.-P. Kriegel. On evaluation of outlier rankings and outlier scores. In Proc. SDM, pages 1047-1058, 2012.

Download all data set variants used (2.1 MB). You can also access the original data (merge the training and test files, pendigits.tra and pendigits.tes).

Normalized, without duplicates

This version contains 16 attributes and 9868 objects, of which 20 are outliers (0.20%).
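As a rough sketch of how the training and test files could be merged and duplicates removed, the following assumes that each record is a comma-separated line of integer attributes followed by a class label, and that two objects count as duplicates when their attribute vectors are identical. Both points are assumptions; the helper `merge_and_deduplicate` is hypothetical, and the toy strings below use 3 attributes instead of 16.

```python
import csv
import io

def merge_and_deduplicate(*texts):
    """Parse comma-separated texts, concatenate their rows, and keep only
    the first occurrence of each attribute vector.

    Assumes the class label is the last field of every record.
    """
    seen = set()
    merged = []
    for text in texts:
        for record in csv.reader(io.StringIO(text)):
            if not record:
                continue  # skip blank lines
            *features, label = (int(v.strip()) for v in record)
            key = tuple(features)
            if key in seen:
                continue  # duplicate attribute vector: drop it
            seen.add(key)
            merged.append((key, label))
    return merged

# Toy stand-ins for the contents of pendigits.tra and pendigits.tes.
train_text = "0,100,50,4\n10, 20, 30,1\n"
test_text = "10,20,30,1\n5,5,5,7\n"
merged = merge_and_deduplicate(train_text, test_text)
print(len(merged))
```

Reading the real files would only require passing their contents, for example via `open(path).read()`, in place of the toy strings.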

Download raw algorithm results (83.1 MB)
Download raw algorithm evaluation table (58.7 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The maximum F1-measure is provided as a complement to the measures in the original publication.
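For orientation, the unadjusted measures can be computed from an outlier ranking roughly as follows. This is a minimal sketch, not the evaluation code used for the study: the function names are illustrative, labels are assumed to be 1 for outliers and 0 for inliers, higher scores are assumed to mean "more outlying", and ties in the ranking are broken by input order.

```python
def ranking_measures(scores, labels):
    """Return (P@n, average precision, maximum F1) for one ranking.

    n is the number of true outliers. Ties are broken by input order,
    which is a simplifying assumption.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    n_out = sum(ranked)
    p_at_n = sum(ranked[:n_out]) / n_out
    hits, precision_sum, max_f1 = 0, 0.0, 0.0
    for rank, is_outlier in enumerate(ranked, start=1):
        if is_outlier:
            hits += 1
            precision_sum += hits / rank  # precision at this outlier's rank
        if hits:
            precision, recall = hits / rank, hits / n_out
            max_f1 = max(max_f1, 2 * precision * recall / (precision + recall))
    return p_at_n, precision_sum / n_out, max_f1

def roc_auc(scores, labels):
    """ROC AUC as the fraction of (outlier, inlier) pairs ranked correctly,
    counting tied pairs as one half."""
    out = [s for s, l in zip(scores, labels) if l == 1]
    inl = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((o > i) + 0.5 * (o == i) for o in out for i in inl)
    return wins / (len(out) * len(inl))

# Toy ranking with two outliers among five objects.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [0, 1, 0, 1, 0]
p_at_n, ap, max_f1 = ranking_measures(scores, labels)
auc = roc_auc(scores, labels)
print(p_at_n, ap, round(max_f1, 4), auc)
```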

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN             10   0.00000  -0.00203   0.10093   0.09910   0.22034   0.21876   0.99031
KNN             11   0.00000  -0.00203   0.10111   0.09929   0.22430   0.22272   0.99027
KNN             12   0.00000  -0.00203   0.10135   0.09952   0.21622   0.21462   0.99028
KNN             21   0.02500   0.02302   0.08041   0.07855   0.18750   0.18585   0.98582
KNNW             1   0.10000   0.09817   0.03967   0.03772   0.13559   0.13384   0.91788
KNNW            20   0.00000  -0.00203   0.08768   0.08582   0.19403   0.19239   0.98905
KNNW            24   0.00000  -0.00203   0.08897   0.08712   0.21127   0.20967   0.98905
LOF              3   0.10000   0.09817   0.02152   0.01953   0.13793   0.13618   0.57841
LOF             18   0.05000   0.04807   0.04689   0.04496   0.15556   0.15384   0.95090
LOF             32   0.05000   0.04807   0.05295   0.05103   0.13333   0.13157   0.96332
SimplifiedLOF    3   0.10000   0.09817   0.02050   0.01852   0.13333   0.13157   0.58862
SimplifiedLOF   54   0.05000   0.04807   0.04367   0.04173   0.11348   0.11167   0.95502
SimplifiedLOF   57   0.05000   0.04807   0.04436   0.04242   0.11321   0.11141   0.95480
LoOP             3   0.10000   0.09817   0.02083   0.01884   0.12500   0.12322   0.57901
LoOP             6   0.10000   0.09817   0.02342   0.02144   0.13333   0.13157   0.60913
LoOP            76   0.05000   0.04807   0.04289   0.04094   0.13333   0.13157   0.94818
LoOP           100   0.05000   0.04807   0.03956   0.03761   0.11268   0.11087   0.94993
LDOF             6   0.10000   0.09817   0.02178   0.01979   0.11111   0.10931   0.48689
LDOF             7   0.05000   0.04807   0.02288   0.02090   0.08696   0.08510   0.48994
LDOF            57   0.05000   0.04807   0.01701   0.01501   0.09302   0.09118   0.83063
ODIN            45   0.01136   0.00936   0.01870   0.01671   0.04684   0.04490   0.93149
ODIN            77   0.00000  -0.00203   0.03541   0.03345   0.09910   0.09727   0.95306
ODIN            90   0.00000  -0.00203   0.03975   0.03780   0.09649   0.09466   0.95747
ODIN            94   0.00000  -0.00203   0.03900   0.03705   0.09692   0.09508   0.95947
FastABOD         4   0.05000   0.04807   0.02393   0.02194   0.09615   0.09432   0.88544
FastABOD        49   0.00000  -0.00203   0.03764   0.03568   0.12613   0.12435   0.96531
FastABOD        94   0.00000  -0.00203   0.04180   0.03986   0.10000   0.09817   0.97017
FastABOD        98   0.00000  -0.00203   0.04160   0.03966   0.10127   0.09944   0.97025
KDEOS            2   0.00000  -0.00203   0.00231   0.00029   0.00649   0.00448   0.52345
KDEOS            7   0.00000  -0.00203   0.00389   0.00187   0.02857   0.02660   0.54759
KDEOS          100   0.00000  -0.00203   0.00673   0.00471   0.01793   0.01594   0.85274
LDF              6   0.15000   0.14827   0.04527   0.04333   0.16216   0.16046   0.70419
LDF             11   0.05000   0.04807   0.07088   0.06899   0.21212   0.21052   0.97623
LDF             12   0.00000  -0.00203   0.07469   0.07281   0.19512   0.19349   0.97716
INFLO            3   0.10000   0.09817   0.02551   0.02353   0.14815   0.14642   0.56012
INFLO           86   0.05000   0.04807   0.03888   0.03693   0.11364   0.11184   0.94097
INFLO          100   0.00000  -0.00203   0.03571   0.03375   0.10127   0.09944   0.94192
COF             34   0.10000   0.09817   0.08435   0.08249   0.14925   0.14753   0.97110
COF             41   0.05000   0.04807   0.08837   0.08652   0.16279   0.16109   0.96632
COF             59   0.05000   0.04807   0.09829   0.09646   0.11111   0.10931   0.96105
COF             77   0.15000   0.14827   0.07312   0.07123   0.15789   0.15618   0.96333
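The adjusted columns in the table above are numerically consistent with the usual adjustment for chance, (value - expected) / (1 - expected), with expected = 20/9868 (the proportion of outliers) for P@n, AP, and Max-F1 alike. That reading is inferred from the tabulated numbers, not from a definition given on this page, so the sketch below should be taken as a plausibility check only; the function name `adjust_for_chance` is illustrative.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that the expected value of a random ranking
    maps to 0 and a perfect result stays at 1.

    Using n_outliers / n_objects as the expected value is an assumption
    inferred from the table, not a definition stated in the source.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1 - expected)

# (raw value, adjusted value as tabulated) pairs copied from the table.
pairs = [
    (0.00000, -0.00203),  # P@n of 0
    (0.10000, 0.09817),   # P@n of 0.1
    (0.15000, 0.14827),   # P@n of 0.15
    (0.10093, 0.09910),   # AP, KNN with k = 10
    (0.22034, 0.21876),   # Max-F1, KNN with k = 10
]
deviations = [abs(adjust_for_chance(raw, 20, 9868) - tab) for raw, tab in pairs]
print(max(deviations) < 2e-5)
```

The small remaining deviations come from the table values being rounded to five decimals.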

Plots

[Figures: Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO