Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

PenDigits (version#10)

The 10 classes in this data set correspond to the digits 0 to 9, with examples produced in different handwriting. Class 4, defined here as the outlier class, was downsampled to only 20 objects. After preprocessing, the data set has 16 numeric attributes and 9868 instances, divided into 20 outliers (0.2%) and 9848 inliers (99.8%). The data set is already normalized, i.e., all 16 attributes (spatial coordinates) have the same range [0,100]. It has been used in this form in [1,2].

References:

[1] H.-P. Kriegel, P. Kroeger, E. Schubert, and A. Zimek. Interpreting and unifying outlier scores. In Proc. SDM, pages 13-24, 2011.
[2] E. Schubert, R. Wojdanowski, A. Zimek, and H.-P. Kriegel. On evaluation of outlier rankings and outlier scores. In Proc. SDM, pages 1047-1058, 2012.

Download all data set variants used (2.1 MB). You can also access the original data (merge the training and test files, pendigits.tra and pendigits.tes).
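The preprocessing described above (merge the two original files, then downsample class 4 to 20 objects) can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes each original file is comma-separated with 16 integer attributes followed by the class label, and it picks the 20 retained objects at random with a fixed seed, whereas the selection actually used for this data set is not stated here. The function names `load_rows`, `downsample_class` and `build_variant` are hypothetical.

```python
import csv
import random


def load_rows(path):
    """Read one file, assuming comma-separated rows of 16 integer
    attributes followed by the class label (an assumption about the format)."""
    with open(path, newline="") as f:
        return [[int(v) for v in row] for row in csv.reader(f) if row]


def downsample_class(rows, label, keep, seed=0):
    """Keep all rows of other classes and `keep` randomly chosen rows of
    class `label`. Random selection is an assumption for illustration."""
    rng = random.Random(seed)
    target = [r for r in rows if r[-1] == label]
    others = [r for r in rows if r[-1] != label]
    return others + rng.sample(target, keep)


def build_variant(train_path, test_path, outlier_label=4, keep=20, seed=0):
    """Merge training and test files, then downsample the outlier class."""
    rows = load_rows(train_path) + load_rows(test_path)
    return downsample_class(rows, outlier_label, keep, seed)
```

With the original files in the working directory, `build_variant("pendigits.tra", "pendigits.tes")` would return the merged rows with class 4 reduced to 20 objects; the exact objects retained will generally differ from those in the published variant.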

Normalized, without duplicates

This version contains 16 attributes, 9868 objects, 20 outliers (0.20%)

Download raw algorithm results (83.1 MB)
Download raw algorithm evaluation table (58.9 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures in the original publication.
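The tie-breaking rule (highest score wins, and among equal scores the smallest k is reported) can be sketched as below. The helper name `best_k` is hypothetical, and the example dictionary holds only the three FastABOD P@n values that appear in the table, not the full range of k that was evaluated.

```python
def best_k(scores):
    """Return (k, score) for the highest score in a {k: score} mapping.

    Ties are broken in favour of the smallest k, by maximising the
    pair (score, -k)."""
    return max(scores.items(), key=lambda kv: (kv[1], -kv[0]))


# P@n of FastABOD for the three k values listed in the table (partial data).
fastabod_p_at_n = {5: 0.15, 6: 0.15, 100: 0.05}
print(best_k(fastabod_p_at_n))  # (5, 0.15)
```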

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN              2   0.10000   0.09817   0.07800   0.07612   0.17582   0.17415   0.98256
KNN             12   0.00000  -0.00203   0.12549   0.12371   0.29703   0.29560   0.99212
KNN             14   0.00000  -0.00203   0.12796   0.12618   0.30769   0.30629   0.99176
KNNW             1   0.15000   0.14827   0.06773   0.06584   0.17778   0.17611   0.92906
KNNW            19   0.00000  -0.00203   0.10740   0.10559   0.26866   0.26717   0.99160
KNNW            21   0.00000  -0.00203   0.10840   0.10659   0.26984   0.26836   0.99158
KNNW            25   0.00000  -0.00203   0.10964   0.10783   0.25424   0.25272   0.99150
LOF              3   0.15000   0.14827   0.02591   0.02393   0.15000   0.14827   0.56970
LOF              5   0.05000   0.04807   0.03088   0.02891   0.16327   0.16157   0.59388
LOF             50   0.00000  -0.00203   0.04593   0.04399   0.11679   0.11499   0.96469
LOF             73   0.00000  -0.00203   0.04186   0.03991   0.10769   0.10588   0.96581
SimplifiedLOF    3   0.15000   0.14827   0.03424   0.03228   0.19355   0.19191   0.53821
SimplifiedLOF   47   0.00000  -0.00203   0.04481   0.04287   0.12500   0.12322   0.96198
SimplifiedLOF   67   0.00000  -0.00203   0.04268   0.04074   0.10778   0.10597   0.96680
LoOP             3   0.15000   0.14827   0.02866   0.02668   0.15000   0.14827   0.54799
LoOP             4   0.15000   0.14827   0.03270   0.03074   0.17647   0.17480   0.55501
LoOP            83   0.00000  -0.00203   0.03600   0.03404   0.09143   0.08958   0.96167
LoOP            98   0.00000  -0.00203   0.03521   0.03325   0.08556   0.08370   0.96231
LDOF             3   0.05000   0.04807   0.00582   0.00380   0.05263   0.05071   0.58098
LDOF             5   0.05000   0.04807   0.02094   0.01895   0.08696   0.08510   0.58501
LDOF            18   0.00000  -0.00203   0.01234   0.01033   0.09836   0.09653   0.48827
LDOF            91   0.05000   0.04807   0.01335   0.01135   0.06667   0.06477   0.75034
ODIN            81   0.01250   0.01049   0.03100   0.02903   0.08387   0.08201   0.95779
ODIN            95   0.00000  -0.00203   0.03475   0.03279   0.09865   0.09682   0.96278
ODIN           100   0.00000  -0.00203   0.03710   0.03515   0.09804   0.09621   0.96432
FastABOD         5   0.15000   0.14827   0.05369   0.05176   0.15789   0.15618   0.95098
FastABOD         6   0.15000   0.14827   0.06140   0.05950   0.15789   0.15618   0.95459
FastABOD       100   0.05000   0.04807   0.05632   0.05440   0.12295   0.12117   0.97979
KDEOS           16   0.05000   0.04807   0.00477   0.00275   0.05128   0.04936   0.51186
KDEOS           30   0.05000   0.04807   0.02780   0.02582   0.09091   0.08906   0.64922
KDEOS           33   0.05000   0.04807   0.02800   0.02603   0.09091   0.08906   0.67038
KDEOS           98   0.00000  -0.00203   0.00729   0.00527   0.04082   0.03887   0.82214
LDF              5   0.15000   0.14827   0.04994   0.04801   0.16393   0.16224   0.74521
LDF              9   0.10000   0.09817   0.07633   0.07445   0.24242   0.24089   0.96239
LDF             12   0.05000   0.04807   0.10002   0.09819   0.22400   0.22242   0.97791
INFLO            2   0.10000   0.09817   0.01759   0.01559   0.14286   0.14112   0.51105
INFLO            6   0.00000  -0.00203   0.02238   0.02039   0.15385   0.15213   0.57009
INFLO           87   0.00000  -0.00203   0.03255   0.03058   0.08531   0.08345   0.95633
INFLO           98   0.00000  -0.00203   0.03174   0.02977   0.07500   0.07312   0.95712
COF             22   0.20000   0.19838   0.07161   0.06973   0.20513   0.20351   0.95520
COF             34   0.10000   0.09817   0.12203   0.12025   0.27586   0.27439   0.94615
COF             36   0.15000   0.14827   0.12969   0.12792   0.26415   0.26266   0.94365
COF             95   0.15000   0.14827   0.11620   0.11440   0.18803   0.18639   0.96695
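The adjusted columns in the table are consistent with an adjustment for chance of the form (value - expected) / (1 - expected), where the expected value is taken to be the outlier fraction, 20/9868. The sketch below makes that relationship explicit. It is an illustration under that assumption, and the function name `adjust_for_chance` is hypothetical. Because the table entries are rounded to five digits, recomputing an adjusted value from a rounded AP or Max-F1 entry may differ from the table in the last digit.

```python
def adjust_for_chance(value, n_outliers, n_total):
    """Rescale a measure so that the value expected by chance maps to 0
    and a perfect result maps to 1.

    Assumes the chance-level expectation equals the outlier fraction."""
    expected = n_outliers / n_total
    return (value - expected) / (1.0 - expected)


# P@n values from the table (20 outliers among 9868 objects).
print(round(adjust_for_chance(0.10, 20, 9868), 5))  # 0.09817
print(round(adjust_for_chance(0.00, 20, 9868), 5))  # -0.00203
```

A negative adjusted value, as in the second call, indicates a result below the level expected from a random ranking.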

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity
Method labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO