Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

PenDigits (version#05)

The 10 classes in this data set correspond to the digits 0 to 9, with examples handwritten by different people. Class 4, defined here as the outlier class, was downsampled to only 20 objects. After preprocessing, the data set has 16 numeric attributes and 9868 instances: 20 outliers (0.2%) and 9848 inliers (99.8%). The data set is already normalized, i.e., all 16 attributes (spatial coordinates) have the same range [0,100]. It has been used in this form in [1,2].
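
The downsampling step described above can be sketched as follows. This is a minimal illustration and not the authors' procedure: the function name, the use of NumPy, the random selection, and the seed are all assumptions. The 20 outliers actually retained are fixed by the downloadable data files, so a random draw will generally not reproduce them.

```python
import numpy as np

def downsample_outlier_class(X, y, outlier_class=4, n_keep=20, seed=0):
    """Keep all objects of the other classes plus n_keep objects of outlier_class.

    Hypothetical helper: the random choice and the seed are assumptions,
    not the selection used to build the published data set.
    Returns the reduced attribute matrix and a boolean outlier mask.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)

    outlier_idx = np.flatnonzero(y == outlier_class)
    inlier_idx = np.flatnonzero(y != outlier_class)

    # Draw n_keep distinct objects from the outlier class.
    kept_outliers = rng.choice(outlier_idx, size=n_keep, replace=False)

    # Preserve the original object order in the reduced data set.
    keep = np.sort(np.concatenate([inlier_idx, kept_outliers]))
    return X[keep], y[keep] == outlier_class
```

Applied to the merged original data with class 4 as the outlier class, the result should have the sizes stated above: 9868 objects, of which 20 are outliers.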

References:

[1] H.-P. Kriegel, P. Kroeger, E. Schubert, and A. Zimek. Interpreting and unifying outlier scores. In Proc. SDM, pages 13-24, 2011.
[2] E. Schubert, R. Wojdanowski, A. Zimek, and H.-P. Kriegel. On evaluation of outlier rankings and outlier scores. In Proc. SDM, pages 1047-1058, 2012.

Download all data set variants used (2.1 MB). You can also access the original data; to reproduce this data set from it, merge the training and test files (pendigits.tra and pendigits.tes).
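
Merging the training and test files could look like the sketch below. It assumes that each line of pendigits.tra and pendigits.tes holds the 16 attribute values followed by the class label, separated by commas; this file layout and the function name are assumptions and should be checked against the original data.

```python
import numpy as np

def load_and_merge(sources):
    """Load comma-separated files and stack them into one data set.

    Assumes (not confirmed by this page) that the last column of each
    file is the class label and all other columns are attributes.
    `sources` may contain file names or open file-like objects.
    """
    parts = [np.loadtxt(src, delimiter=",", ndmin=2) for src in sources]
    data = np.vstack(parts)
    X = data[:, :-1]             # attribute columns
    y = data[:, -1].astype(int)  # class labels
    return X, y

# Example call, assuming both files are in the working directory:
# X, y = load_and_merge(["pendigits.tra", "pendigits.tes"])
```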

Normalized, without duplicates

This version contains 16 attributes, 9868 objects, 20 outliers (0.20%)

Download raw algorithm results (83.1 MB)
Download raw algorithm evaluation table (61.8 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is provided in addition to the measures used in the original publication.

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN              1   0.05000   0.04807   0.04369   0.04175   0.11538   0.11359   0.96671
KNN             15   0.00000  -0.00203   0.09408   0.09224   0.19310   0.19146   0.98909
KNN             18   0.00000  -0.00203   0.09225   0.09041   0.20472   0.20311   0.98850
KNNW             1   0.05000   0.04807   0.03284   0.03087   0.09346   0.09162   0.93748
KNNW            25   0.00000  -0.00203   0.07938   0.07751   0.16832   0.16663   0.98772
KNNW            31   0.00000  -0.00203   0.08033   0.07846   0.16923   0.16754   0.98758
KNNW            39   0.00000  -0.00203   0.07818   0.07631   0.18045   0.17879   0.98671
LOF             13   0.05000   0.04807   0.01415   0.01215   0.06400   0.06210   0.73136
LOF             45   0.00000  -0.00203   0.04791   0.04598   0.15686   0.15515   0.95068
LOF             99   0.05000   0.04807   0.03916   0.03721   0.10695   0.10514   0.95773
SimplifiedLOF    1   0.00000  -0.00203   0.00225   0.00022   0.00594   0.00392   0.50932
SimplifiedLOF   84   0.00000  -0.00203   0.03763   0.03568   0.11828   0.11649   0.95019
SimplifiedLOF   93   0.00000  -0.00203   0.03824   0.03628   0.11640   0.11461   0.95225
SimplifiedLOF  100   0.00000  -0.00203   0.03745   0.03550   0.10837   0.10656   0.95425
LoOP            15   0.05000   0.04807   0.00974   0.00773   0.05128   0.04936   0.60409
LoOP            85   0.00000  -0.00203   0.03253   0.03056   0.10127   0.09944   0.94632
LoOP            96   0.00000  -0.00203   0.03361   0.03165   0.09032   0.08848   0.94954
LoOP            99   0.00000  -0.00203   0.03334   0.03138   0.09091   0.08906   0.95080
LDOF             2   0.00000  -0.00203   0.00261   0.00059   0.01125   0.00924   0.46852
LDOF            12   0.00000  -0.00203   0.00325   0.00122   0.02410   0.02211   0.54138
LDOF            94   0.00000  -0.00203   0.00605   0.00403   0.02088   0.01889   0.76013
ODIN             8   0.00680   0.00479   0.00339   0.00136   0.01277   0.01076   0.59709
ODIN            97   0.00000  -0.00203   0.03047   0.02850   0.08696   0.08510   0.96342
ODIN           100   0.00000  -0.00203   0.03107   0.02910   0.07746   0.07559   0.96398
FastABOD         3   0.00000  -0.00203   0.01667   0.01467   0.06897   0.06707   0.87502
FastABOD         9   0.00000  -0.00203   0.02399   0.02201   0.08642   0.08456   0.92801
FastABOD        98   0.00000  -0.00203   0.02858   0.02660   0.06918   0.06729   0.95280
FastABOD       100   0.00000  -0.00203   0.02855   0.02658   0.06940   0.06751   0.95285
KDEOS           13   0.05000   0.04807   0.00523   0.00321   0.05000   0.04807   0.49950
KDEOS          100   0.00000  -0.00203   0.00534   0.00332   0.01333   0.01133   0.80286
LDF             10   0.10000   0.09817   0.03965   0.03770   0.11494   0.11315   0.91616
LDF             15   0.00000  -0.00203   0.07681   0.07493   0.19820   0.19657   0.97902
LDF             21   0.05000   0.04807   0.08718   0.08533   0.17544   0.17376   0.97983
INFLO            1   0.00000  -0.00203   0.00180  -0.00022   0.00426   0.00224   0.43636
INFLO           75   0.00000  -0.00203   0.02933   0.02736   0.09877   0.09694   0.93764
INFLO           91   0.00000  -0.00203   0.03111   0.02914   0.09326   0.09142   0.94352
INFLO          100   0.00000  -0.00203   0.03056   0.02859   0.08421   0.08235   0.94699
COF             54   0.10000   0.09817   0.11090   0.10909   0.20000   0.19838   0.98697
COF             70   0.20000   0.19838   0.10623   0.10442   0.23256   0.23100   0.98409
COF             98   0.15000   0.14827   0.09702   0.09519   0.24490   0.24336   0.98016
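
For orientation, the measures in the table can be computed from an outlier ranking roughly as in the sketch below. It is an illustration only: the function names are hypothetical, ties in the scores are broken by input order (the evaluation software used for the table may treat ties differently), and the adjustment for chance uses the form (value - |O|/N) / (1 - |O|/N), where |O| is the number of outliers and N the number of objects. That form is consistent with the adjusted values in the table, e.g. P@n = 0.05 with 20 outliers among 9868 objects gives 0.04807.

```python
import numpy as np

def _hits_in_rank_order(scores, is_outlier):
    """Outlier flags sorted by decreasing score (ties: input order, an assumption)."""
    scores = np.asarray(scores, dtype=float)
    is_outlier = np.asarray(is_outlier, dtype=bool)
    order = np.argsort(-scores, kind="stable")
    return is_outlier[order]

def precision_at_n(scores, is_outlier):
    """P@n with n = number of true outliers."""
    hits = _hits_in_rank_order(scores, is_outlier)
    n = int(hits.sum())
    return float(hits[:n].mean())

def average_precision(scores, is_outlier):
    """Mean of the precision values at the ranks of the true outliers."""
    hits = _hits_in_rank_order(scores, is_outlier)
    ranks = np.flatnonzero(hits) + 1              # 1-based ranks of outliers
    found = np.arange(1, len(ranks) + 1)          # outliers found up to each rank
    return float((found / ranks).mean())

def max_f1(scores, is_outlier):
    """Largest F1 score over all cut-off positions in the ranking."""
    hits = _hits_in_rank_order(scores, is_outlier)
    tp = np.cumsum(hits)                          # true positives in the top k
    k = np.arange(1, len(hits) + 1)
    # F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (k + number of outliers)
    return float((2.0 * tp / (k + hits.sum())).max())

def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale so that the expected value of a random ranking maps to 0."""
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)
```

With this adjustment, a P@n of 0 on this data set maps to a slightly negative value (-0.00203), which is why that value appears so often in the Adj. P@n column.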

Plots

[Figures: one plot each for precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Method labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO