Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

PenDigits (version#08)

The 10 classes in this data set correspond to the digits 0 to 9, with examples written in different handwriting. Class 4, defined here as the outlier class, was downsampled to only 20 objects. After preprocessing, the data set has 16 numeric attributes and 9868 instances, divided into 20 outliers (0.2%) and 9848 inliers (99.8%). The data set is already normalized, i.e., all 16 attributes (spatial coordinates) have the same range [0,100]. It has been used in this form in [1,2].

References:

[1] H.-P. Kriegel, P. Kroeger, E. Schubert, and A. Zimek. Interpreting and unifying outlier scores. In Proc. SDM, pages 13-24, 2011.
[2] E. Schubert, R. Wojdanowski, A. Zimek, and H.-P. Kriegel. On evaluation of outlier rankings and outlier scores. In Proc. SDM, pages 1047-1058, 2012.

Download all data set variants used (2.1 MB). You can also access the original data (merge the training and test files, pendigits.tra and pendigits.tes).
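The preparation described above (merge the two original files, keep all non-4 digits as inliers, keep 20 objects of class 4 as outliers) can be sketched as follows. This is only an illustration under assumptions: the original files are assumed to be comma-separated with the class label as the last value, the function names are hypothetical, and the random sample with a fixed seed is a placeholder, since the page does not state which 20 objects of class 4 were kept.

```python
import random


def parse_rows(lines):
    """Parse comma-separated lines into (attributes, class_label) pairs.

    Assumption: every non-empty line holds integer attributes followed
    by the integer class label as the last value.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        values = [int(v) for v in line.split(",")]
        rows.append((tuple(values[:-1]), values[-1]))
    return rows


def build_outlier_dataset(rows, outlier_class=4, n_outliers=20, seed=0):
    """Return (attributes, is_outlier) pairs.

    All objects outside `outlier_class` are kept as inliers; the outlier
    class is downsampled to at most `n_outliers` objects. The sampling
    strategy and the seed are placeholders, not the original selection.
    """
    inliers = [(x, False) for x, label in rows if label != outlier_class]
    candidates = [x for x, label in rows if label == outlier_class]
    rng = random.Random(seed)
    kept = rng.sample(candidates, min(n_outliers, len(candidates)))
    return inliers + [(x, True) for x in kept]


def load_and_build(paths=("pendigits.tra", "pendigits.tes")):
    """Merge the given files and build the outlier data set."""
    rows = []
    for path in paths:
        with open(path) as handle:
            rows.extend(parse_rows(handle))
    return build_outlier_dataset(rows)
```

Calling load_and_build() in a directory that contains both original files would return one list with the inliers first and the sampled outliers last; the outlier fraction can then be checked against the 20 of 9868 objects stated above.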

Normalized, without duplicates

This version contains 16 attributes, 9868 objects, 20 outliers (0.20%)

Download raw algorithm results (83.1 MB)
Download raw algorithm evaluation table (56.4 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved twice, only the smallest k is given).
The maximum F1-measure is reported in addition to the measures in the original publication.

Algorithm        k  P@n      Adj. P@n  AP       Adj. AP  Max-F1   Adj. MF1  ROC AUC
KNN              1  0.00000  -0.00203  0.06437  0.06247  0.21505  0.21346   0.96347
KNN              4  0.05000   0.04807  0.07220  0.07031  0.18367  0.18202   0.97895
KNN             10  0.00000  -0.00203  0.09161  0.08977  0.20339  0.20177   0.98988
KNN             12  0.00000  -0.00203  0.09340  0.09156  0.20988  0.20827   0.98983
KNNW             1  0.00000  -0.00203  0.05278  0.05086  0.17857  0.17690   0.95246
KNNW            21  0.00000  -0.00203  0.08372  0.08186  0.19718  0.19555   0.98899
KNNW            22  0.00000  -0.00203  0.08359  0.08173  0.20000  0.19838   0.98898
LOF              1  0.05000   0.04807  0.02803  0.02606  0.09091  0.08906   0.50918
LOF             23  0.00000  -0.00203  0.03056  0.02859  0.10638  0.10457   0.94489
LOF             41  0.05000   0.04807  0.03392  0.03196  0.09677  0.09494   0.95000
LOF             91  0.00000  -0.00203  0.02705  0.02508  0.07692  0.07505   0.95329
SimplifiedLOF    1  0.05000   0.04807  0.01926  0.01727  0.08696  0.08510   0.57830
SimplifiedLOF   41  0.00000  -0.00203  0.03297  0.03101  0.11538  0.11359   0.94101
SimplifiedLOF   91  0.00000  -0.00203  0.02871  0.02674  0.08511  0.08325   0.95043
LoOP             1  0.05000   0.04807  0.01509  0.01309  0.08333  0.08147   0.57640
LoOP            66  0.00000  -0.00203  0.02973  0.02776  0.11111  0.10931   0.94060
LoOP            99  0.00000  -0.00203  0.02770  0.02573  0.07547  0.07359   0.94720
LDOF             3  0.05000   0.04807  0.01060  0.00859  0.07143  0.06954   0.52398
LDOF            98  0.00000  -0.00203  0.00669  0.00467  0.02597  0.02400   0.76377
ODIN            20  0.05000   0.04807  0.00776  0.00574  0.05128  0.04936   0.79849
ODIN            37  0.05000   0.04807  0.01833  0.01633  0.07407  0.07219   0.89905
ODIN            94  0.00000  -0.00203  0.02886  0.02689  0.06936  0.06747   0.95497
ODIN            95  0.00000  -0.00203  0.02809  0.02612  0.06486  0.06297   0.95513
FastABOD         3  0.05000   0.04807  0.03704  0.03508  0.08696  0.08510   0.91028
FastABOD         4  0.05000   0.04807  0.03513  0.03317  0.12069  0.11890   0.93354
FastABOD        82  0.00000  -0.00203  0.03705  0.03510  0.10732  0.10550   0.96770
FastABOD        98  0.00000  -0.00203  0.03650  0.03455  0.10417  0.10235   0.96786
KDEOS           46  0.05000   0.04807  0.05472  0.05280  0.09524  0.09340   0.72566
KDEOS           60  0.10000   0.09817  0.02185  0.01986  0.10000  0.09817   0.78562
KDEOS           63  0.10000   0.09817  0.02677  0.02480  0.10526  0.10345   0.79311
KDEOS          100  0.00000  -0.00203  0.01071  0.00870  0.05405  0.05213   0.85043
LDF             10  0.05000   0.04807  0.04714  0.04520  0.13861  0.13686   0.96451
LDF             13  0.10000   0.09817  0.05324  0.05132  0.11429  0.11249   0.96714
LDF             14  0.10000   0.09817  0.05349  0.05157  0.11111  0.10931   0.96215
LDF             43  0.00000  -0.00203  0.04017  0.03822  0.10067  0.09884   0.97419
INFLO            1  0.05000   0.04807  0.01496  0.01296  0.08333  0.08147   0.53116
INFLO           59  0.00000  -0.00203  0.02247  0.02049  0.08696  0.08510   0.92430
INFLO           70  0.00000  -0.00203  0.02285  0.02087  0.07143  0.06954   0.92765
INFLO           95  0.00000  -0.00203  0.02223  0.02024  0.07273  0.07084   0.93526
COF             27  0.15000   0.14827  0.08002  0.07815  0.20690  0.20529   0.95206
COF             36  0.15000   0.14827  0.12723  0.12546  0.20690  0.20529   0.95577
COF             70  0.20000   0.19838  0.10725  0.10543  0.20513  0.20351   0.96646
COF            100  0.10000   0.09817  0.05899  0.05708  0.14599  0.14425   0.97191
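
The adjusted columns in the table are numerically consistent with a standard adjustment for chance, (value - expected) / (1 - expected), where the expected value of a random ranking is taken to be the outlier fraction 20/9868. The sketch below works under that assumption; the function name is hypothetical, and the original publication should be consulted for the authoritative definition.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that a random ranking scores about 0
    and a perfect ranking scores 1.

    Assumption: the expected value under a random ranking is the
    fraction of outliers in the data set.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


# Values taken from the table above (20 outliers, 9868 objects).
for raw in (0.0, 0.05, 0.10, 0.15):
    print(f"{raw:.5f} -> {adjust_for_chance(raw, 20, 9868):.5f}")
```

For the P@n values 0.00000, 0.05000, 0.10000, and 0.15000, this yields approximately -0.00203, 0.04807, 0.09817, and 0.14827, matching the Adj. P@n column. With only 20 outliers among 9868 objects the correction is small, which is why the adjusted and unadjusted columns differ only in the third decimal place.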

Plots

Precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC (one plot per measure)
Diversity (methods labeled A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO)
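
For readers who want to recompute the unadjusted measures named on this page from a raw result, the following is a minimal sketch. It assumes the input is a list of 0/1 labels (1 = outlier) already sorted by decreasing outlier score and that there are no tied scores; it is not the implementation used to produce the results here, and the function names are hypothetical.

```python
def precision_at_n(ranked_labels, n=None):
    """Fraction of outliers among the top n objects.

    By default n is the number of outliers in the data set.
    """
    if n is None:
        n = sum(ranked_labels)
    return sum(ranked_labels[:n]) / n


def average_precision(ranked_labels):
    """Mean of the precision values at the rank of each outlier."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def max_f1(ranked_labels):
    """Largest F1 score over all cut-off ranks."""
    n_outliers = sum(ranked_labels)
    best, hits = 0.0, 0
    for rank, label in enumerate(ranked_labels, start=1):
        hits += label
        if hits:
            precision = hits / rank
            recall = hits / n_outliers
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


def roc_auc(ranked_labels):
    """Fraction of (outlier, inlier) pairs in which the outlier is
    ranked first. Ties are not handled (assumed absent)."""
    n_outliers = sum(ranked_labels)
    n_inliers = len(ranked_labels) - n_outliers
    inliers_seen, correct_pairs = 0, 0
    for label in ranked_labels:
        if label:
            correct_pairs += n_inliers - inliers_seen
        else:
            inliers_seen += 1
    return correct_pairs / (n_outliers * n_inliers)


# Toy ranking: outliers at ranks 1 and 3 among five objects.
ranking = [1, 0, 1, 0, 0]
print(precision_at_n(ranking), average_precision(ranking),
      max_f1(ranking), roc_auc(ranking))
```

On this data set, where only 20 of 9868 objects are outliers, P@n with n = 20 can take only 21 distinct values, which explains the coarse steps (0.00, 0.05, 0.10, ...) in the P@n column, while ROC AUC can remain high even when few outliers reach the very top of the ranking.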