Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

PenDigits (version#02)

The 10 classes in this data set correspond to the digits 0 to 9, with examples produced by different writers. Class 4, defined here as the outlier class, was downsampled to only 20 objects. After preprocessing, the data set has 16 numeric attributes and 9868 instances, divided into 20 outliers (0.2%) and 9848 inliers (99.8%). The data set is already normalized, i.e., all 16 attributes (spatial coordinates) have the same range [0,100]. It has been used in this form in [1,2].
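The downsampling step described above can be illustrated with a minimal sketch. It assumes the 20 retained objects are chosen uniformly at random, which the text does not state; the function name `downsample_class`, the seed, and the toy labels are hypothetical and stand in for the real PenDigits labels.

```python
import random


def downsample_class(labels, outlier_class, n_keep, seed=0):
    """Return sorted indices of the objects to keep.

    All objects of the other classes are kept; only n_keep objects of
    outlier_class survive. Choosing them uniformly at random is an
    assumption made for this sketch.
    """
    rng = random.Random(seed)
    outlier_idx = [i for i, y in enumerate(labels) if y == outlier_class]
    inlier_idx = [i for i, y in enumerate(labels) if y != outlier_class]
    return sorted(inlier_idx + rng.sample(outlier_idx, n_keep))


# Toy stand-in for the real labels: 10 digit classes, 50 objects each.
toy_labels = [digit for digit in range(10) for _ in range(50)]
kept = downsample_class(toy_labels, outlier_class=4, n_keep=5)
toy_outliers = sum(1 for i in kept if toy_labels[i] == 4)

# Outlier rate of the actual data set, from the counts stated above.
outlier_rate = 20 / 9868

print(f"toy: {toy_outliers} outliers among {len(kept)} objects")
print(f"PenDigits (version#02) outlier rate: {outlier_rate:.2%}")
```

With the real data, `labels` would be the class column of the merged training and test files, `outlier_class` would be 4, and `n_keep` would be 20.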

References:

[1] H.-P. Kriegel, P. Kroeger, E. Schubert, and A. Zimek. Interpreting and unifying outlier scores. In Proc. SDM, pages 13-24, 2011.
[2] E. Schubert, R. Wojdanowski, A. Zimek, and H.-P. Kriegel. On evaluation of outlier rankings and outlier scores. In Proc. SDM, pages 1047-1058, 2012.

Download all data set variants used (2.1 MB). The original data is also available; merge the training and test files (pendigits.tra and pendigits.tes).

Normalized, without duplicates

This version contains 16 attributes, 9868 objects, 20 outliers (0.20%)

Download raw algorithm results (83.1 MB) Download raw algorithm evaluation table (57.9 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved more than once, only the smallest k is given).
The maximum F1-measure is reported in addition to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. Max-F1 ROC AUC
KNN 2 0.05000 0.04807 0.07658 0.07470 0.20870 0.20709 0.97849
KNN 12 0.05000 0.04807 0.11566 0.11386 0.27869 0.27722 0.99178
KNN 13 0.05000 0.04807 0.11841 0.11662 0.26357 0.26207 0.99165
KNNW 1 0.10000 0.09817 0.07087 0.06898 0.19231 0.19067 0.95639
KNNW 22 0.00000 -0.00203 0.10263 0.10080 0.22667 0.22510 0.99113
KNNW 26 0.00000 -0.00203 0.10381 0.10199 0.22807 0.22650 0.99095
KNNW 27 0.00000 -0.00203 0.10335 0.10153 0.23077 0.22921 0.99086
LOF 3 0.10000 0.09817 0.03541 0.03345 0.13793 0.13618 0.68122
LOF 47 0.00000 -0.00203 0.05608 0.05417 0.17284 0.17116 0.96958
LOF 51 0.00000 -0.00203 0.05686 0.05495 0.16667 0.16497 0.96994
SimplifiedLOF 3 0.10000 0.09817 0.03216 0.03020 0.13333 0.13157 0.64136
SimplifiedLOF 56 0.00000 -0.00203 0.04762 0.04569 0.10573 0.10391 0.97268
SimplifiedLOF 82 0.00000 -0.00203 0.05125 0.04933 0.12500 0.12322 0.97173
SimplifiedLOF 98 0.00000 -0.00203 0.04831 0.04637 0.13433 0.13257 0.96991
LoOP 6 0.05000 0.04807 0.02898 0.02700 0.12821 0.12643 0.64684
LoOP 12 0.10000 0.09817 0.02215 0.02017 0.10000 0.09817 0.63664
LoOP 82 0.05000 0.04807 0.04552 0.04359 0.10577 0.10395 0.96844
LoOP 100 0.05000 0.04807 0.04619 0.04425 0.11043 0.10862 0.96771
LDOF 7 0.10000 0.09817 0.03737 0.03542 0.11111 0.10931 0.69066
LDOF 8 0.10000 0.09817 0.02602 0.02404 0.11765 0.11586 0.69650
LDOF 17 0.05000 0.04807 0.05536 0.05344 0.09524 0.09340 0.53565
LDOF 100 0.05000 0.04807 0.01346 0.01146 0.06667 0.06477 0.81469
ODIN 4 0.00592 0.00390 0.00380 0.00178 0.01139 0.00938 0.63497
ODIN 78 0.00000 -0.00203 0.03268 0.03071 0.09016 0.08832 0.95490
ODIN 99 0.00000 -0.00203 0.04510 0.04316 0.13559 0.13384 0.95249
FastABOD 4 0.10000 0.09817 0.04022 0.03827 0.10526 0.10345 0.93898
FastABOD 13 0.05000 0.04807 0.04276 0.04082 0.12727 0.12550 0.95723
FastABOD 74 0.00000 -0.00203 0.04977 0.04784 0.12121 0.11943 0.96723
FastABOD 100 0.00000 -0.00203 0.04880 0.04687 0.11940 0.11761 0.96766
KDEOS 73 0.05000 0.04807 0.00765 0.00563 0.05128 0.04936 0.79073
KDEOS 100 0.00000 -0.00203 0.00573 0.00372 0.01368 0.01168 0.82003
LDF 2 0.10000 0.09817 0.02346 0.02148 0.11765 0.11586 0.69940
LDF 18 0.10000 0.09817 0.08304 0.08118 0.22680 0.22523 0.97605
LDF 23 0.05000 0.04807 0.07432 0.07244 0.18018 0.17852 0.98215
INFLO 2 0.05000 0.04807 0.01381 0.01180 0.07692 0.07505 0.67044
INFLO 82 0.05000 0.04807 0.04408 0.04214 0.11475 0.11296 0.95486
INFLO 99 0.05000 0.04807 0.04497 0.04303 0.11921 0.11742 0.95183
INFLO 100 0.05000 0.04807 0.04442 0.04247 0.12000 0.11821 0.95172
COF 51 0.30000 0.29858 0.20193 0.20031 0.33962 0.33828 0.99188
COF 86 0.45000 0.44888 0.30682 0.30541 0.45000 0.44888 0.98780
COF 90 0.45000 0.44888 0.33474 0.33339 0.48649 0.48544 0.98570
COF 99 0.45000 0.44888 0.35195 0.35064 0.45000 0.44888 0.98562
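
Two conventions behind the table can be sketched in code. The first is the adjustment for chance: the adjusted columns are consistent with subtracting the expected value of a random ranking (the outlier rate, 20/9868) and rescaling so that a perfect result stays at 1. This formula is assumed here because it reproduces the listed values, not quoted from this page. The second is the tie-breaking rule stated above the table (smallest k among equal scores). The function names are hypothetical, and the dictionaries hold only the KNN rows listed in the table, not the full range of k that was evaluated.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that the chance level maps to 0 and 1 stays 1.

    Assumes the expected value under a random ranking is the outlier
    rate n_outliers / n_objects.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


def best_k(scores_by_k):
    """Return the smallest k that achieves the maximum score."""
    best = max(scores_by_k.values())
    return min(k for k, score in scores_by_k.items() if score == best)


N_OUTLIERS, N_OBJECTS = 20, 9868

# KNN rows from the table (only the listed k values).
knn_p_at_n = {2: 0.05000, 12: 0.05000, 13: 0.05000}
knn_ap = {2: 0.07658, 12: 0.11566, 13: 0.11841}

for raw in (0.00000, 0.05000, 0.10000, 0.45000):
    adjusted = adjust_for_chance(raw, N_OUTLIERS, N_OBJECTS)
    print(f"P@n {raw:.5f} -> adjusted {adjusted:.5f}")

print("best k for KNN by P@n:", best_k(knn_p_at_n))  # tie, smallest k wins
print("best k for KNN by AP: ", best_k(knn_ap))
```

Under this assumption, a P@n of 0 maps to a slightly negative adjusted value, which matches the -0.00203 entries in the table.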

Plots

[Plots not reproduced here: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Method labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO