Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

Stamps (2% of outliers, version #06)

A data set representing forged (photocopied or scanned+printed) stamps and genuine (ink) stamps. The features are based on color and printing properties of the stamps. Forged stamps are considered to be outliers. The stamps data set is not taken from the UCI repository, but was used in [1].

References:

[1] B. Micenková, J. van Beusekom, and F. Shafait. Stamp verification for automated document authentication. In 5th Int. Workshop on Computational Forensics, 2012.

Download all data set variants used (371.2 kB).

Normalized, without duplicates

This version contains 9 attributes, 315 objects, 6 outliers (1.90%)

Download raw algorithm results (2.7 MB)
Download raw algorithm evaluation table (40.9 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures used in the original publication.

Algorithm      k    P@n      Adj. P@n  AP       Adj. AP   Max-F1   Adj. MF1  ROC AUC
KNN            1    0.00000  -0.01942  0.11662  0.09947   0.23529  0.22045   0.91559
KNN            2    0.00000  -0.01942  0.12533  0.10835   0.30000  0.28641   0.92503
KNN            3    0.00000  -0.01942  0.11836  0.10124   0.31579  0.30250   0.91909
KNNW           1    0.00000  -0.01942  0.08267  0.06486   0.21053  0.19520   0.82740
KNNW           3    0.00000  -0.01942  0.11335  0.09613   0.30000  0.28641   0.91586
KNNW           4    0.00000  -0.01942  0.11383  0.09662   0.30000  0.28641   0.91586
LOF            1    0.00000  -0.01942  0.01841  -0.00065  0.04317  0.02459   0.43177
LOF            14   0.00000  -0.01942  0.08061  0.06276   0.21818  0.20300   0.87810
LOF            21   0.00000  -0.01942  0.08575  0.06800   0.21277  0.19748   0.88619
LOF            22   0.00000  -0.01942  0.08596  0.06822   0.21739  0.20220   0.88619
SimplifiedLOF  1    0.00000  -0.01942  0.01923  0.00019   0.03774  0.01905   0.26214
SimplifiedLOF  24   0.00000  -0.01942  0.08493  0.06716   0.22642  0.21139   0.88565
LoOP           1    0.00000  -0.01942  0.01905  0.00000   0.03738  0.01869   0.25728
LoOP           24   0.00000  -0.01942  0.08182  0.06399   0.22222  0.20712   0.88026
LDOF           2    0.00000  -0.01942  0.01338  -0.00578  0.04054  0.02191   0.20550
LDOF           37   0.00000  -0.01942  0.07600  0.05806   0.18750  0.17172   0.87109
LDOF           39   0.00000  -0.01942  0.07606  0.05812   0.20000  0.18447   0.87109
ODIN           9    0.01667  -0.00243  0.05021  0.03177   0.10526  0.08789   0.74434
ODIN           33   0.00000  -0.01942  0.09730  0.07977   0.22642  0.21139   0.89105
ODIN           35   0.00000  -0.01942  0.09271  0.07510   0.23529  0.22045   0.88592
FastABOD       3    0.00000  -0.01942  0.06202  0.04380   0.13333  0.11650   0.83873
FastABOD       29   0.00000  -0.01942  0.09413  0.07654   0.22222  0.20712   0.89752
FastABOD       37   0.00000  -0.01942  0.09403  0.07644   0.21739  0.20220   0.89806
KDEOS          2    0.16667  0.15049   0.06819  0.05009   0.20000  0.18447   0.43635
KDEOS          21   0.16667  0.15049   0.08734  0.06962   0.22222  0.20712   0.73355
KDEOS          63   0.00000  -0.01942  0.08177  0.06394   0.19672  0.18112   0.87972
LDF            1    0.00000  -0.01942  0.01941  0.00037   0.04505  0.02650   0.45766
LDF            10   0.00000  -0.01942  0.09732  0.07979   0.26667  0.25243   0.90022
INFLO          1    0.00000  -0.01942  0.01398  -0.00516  0.03883  0.02017   0.16046
INFLO          100  0.00000  -0.01942  0.07799  0.06008   0.19048  0.17476   0.87487
COF            1    0.00000  -0.01942  0.01923  0.00019   0.03774  0.01905   0.26214
COF            10   0.00000  -0.01942  0.12250  0.10546   0.25000  0.23544   0.91909
COF            23   0.00000  -0.01942  0.12242  0.10538   0.30769  0.29425   0.92395
COF            25   0.00000  -0.01942  0.12171  0.10466   0.27907  0.26507   0.92449
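
The adjusted columns in the table above are numerically consistent with the adjustment for chance used in the original publication, (m - |O|/N) / (1 - |O|/N), where m is the raw measure, |O| the number of outliers and N the number of objects. The following Python sketch illustrates that rescaling under this assumption. The function name adjust_for_chance is illustrative and not taken from the authors' code; whether |O|/N is the exact expected value of Max-F1 under a random ranking is not stated here, only that the same rescaling matches the Adj. MF1 column.

```python
def adjust_for_chance(value: float, n_outliers: int, n_objects: int) -> float:
    """Rescale a measure so that the assumed chance level |O|/N maps to 0
    and a perfect score of 1 stays at 1."""
    expected = n_outliers / n_objects  # assumed score of a random ranking
    return (value - expected) / (1.0 - expected)


# Values taken from the table above (315 objects, 6 outliers).
print(round(adjust_for_chance(0.00000, 6, 315), 5))  # P@n of KNN, k=1 -> -0.01942
print(round(adjust_for_chance(0.11662, 6, 315), 5))  # AP of KNN, k=1  -> 0.09947
```

With this rescaling, a value of 0 corresponds to the assumed chance level and negative values indicate results below it, which is why a P@n of 0 appears as -0.01942 throughout the table.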

Plots

Precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.
Method labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO
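
For readers who want to recompute the unadjusted measures listed above from raw outlier scores, the following Python sketch shows one common way to derive P@n (with n equal to the number of outliers), average precision, maximum F1 and ROC AUC from a ranking. It is a minimal illustration, not the authors' implementation: the function name ranking_measures and the toy data are hypothetical, higher scores are assumed to mean "more outlying", and tied scores are not treated specially, so results on rankings with ties may differ from the published values.

```python
from typing import Dict, Sequence


def ranking_measures(scores: Sequence[float], labels: Sequence[int]) -> Dict[str, float]:
    """Compute P@n, AP, Max-F1 and ROC AUC from outlier scores.

    labels: 1 for outlier, 0 for inlier. Assumes no tied scores.
    """
    n_out = sum(labels)
    n_in = len(labels) - n_out
    # Rank objects from most to least outlying.
    ranked = [lab for _, lab in sorted(zip(scores, labels), key=lambda t: -t[0])]

    hits = 0          # outliers seen so far
    inliers_seen = 0  # inliers seen so far
    ap_sum = 0.0      # sum of the precision at each outlier's rank
    max_f1 = 0.0
    concordant = 0    # (outlier, inlier) pairs ranked in the right order
    for rank, lab in enumerate(ranked, start=1):
        if lab == 1:
            hits += 1
            ap_sum += hits / rank
            concordant += n_in - inliers_seen
        else:
            inliers_seen += 1
        # F1 when the top `rank` objects are flagged as outliers:
        # 2PR / (P + R) with P = hits/rank and R = hits/n_out.
        max_f1 = max(max_f1, 2 * hits / (rank + n_out))

    return {
        "P@n": sum(ranked[:n_out]) / n_out,
        "AP": ap_sum / n_out,
        "Max-F1": max_f1,
        "ROC AUC": concordant / (n_out * n_in),
    }


# Hypothetical toy ranking: 6 objects, 2 outliers at ranks 2 and 4.
print(ranking_measures([0.9, 0.8, 0.7, 0.6, 0.5, 0.4], [0, 1, 0, 1, 0, 0]))
```

On the toy ranking, P@n and AP are both 0.5, Max-F1 is 2/3 (reached when the top four objects are flagged), and ROC AUC is 0.625.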

Not normalized, without duplicates

This version contains 9 attributes, 315 objects, 6 outliers (1.90%)

Download raw algorithm results (2.7 MB)
Download raw algorithm evaluation table (40.9 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures used in the original publication.

Algorithm      k    P@n      Adj. P@n  AP       Adj. AP   Max-F1   Adj. MF1  ROC AUC
KNN            1    0.00000  -0.01942  0.14095  0.12427   0.29412  0.28041   0.93231
KNN            2    0.00000  -0.01942  0.14800  0.13146   0.33333  0.32039   0.93905
KNN            3    0.00000  -0.01942  0.14867  0.13214   0.35294  0.34038   0.93851
KNNW           1    0.00000  -0.01942  0.08837  0.07066   0.22222  0.20712   0.83927
KNNW           5    0.00000  -0.01942  0.13847  0.12174   0.35294  0.34038   0.93312
LOF            1    0.00000  -0.01942  0.01822  -0.00084  0.04367  0.02510   0.43015
LOF            87   0.00000  -0.01942  0.08867  0.07097   0.23077  0.21583   0.89105
LOF            97   0.00000  -0.01942  0.09488  0.07731   0.22642  0.21139   0.89860
SimplifiedLOF  1    0.00000  -0.01942  0.01929  0.00025   0.03785  0.01917   0.26861
SimplifiedLOF  24   0.00000  -0.01942  0.08380  0.06601   0.22222  0.20712   0.88403
LoOP           1    0.00000  -0.01942  0.01905  0.00000   0.03738  0.01869   0.26052
LoOP           21   0.00000  -0.01942  0.07593  0.05799   0.21429  0.19903   0.87001
LoOP           24   0.00000  -0.01942  0.07923  0.06135   0.21429  0.19903   0.87702
LDOF           2    0.00000  -0.01942  0.01336  -0.00580  0.04110  0.02248   0.20388
LDOF           21   0.00000  -0.01942  0.06725  0.04914   0.19672  0.18112   0.84898
LDOF           100  0.00000  -0.01942  0.07577  0.05783   0.19355  0.17789   0.87109
ODIN           1    0.00000  -0.01942  0.01765  -0.00143  0.03774  0.01905   0.34169
ODIN           36   0.00000  -0.01942  0.08679  0.06906   0.20000  0.18447   0.89374
ODIN           54   0.00000  -0.01942  0.08873  0.07104   0.22642  0.21139   0.87594
ODIN           57   0.00000  -0.01942  0.09710  0.07957   0.22222  0.20712   0.87783
FastABOD       3    0.00000  -0.01942  0.07485  0.05689   0.18519  0.16936   0.86624
FastABOD       19   0.00000  -0.01942  0.10501  0.08763   0.25000  0.23544   0.90831
FastABOD       52   0.00000  -0.01942  0.11210  0.09486   0.25000  0.23544   0.91532
KDEOS          2    0.00000  -0.01942  0.01533  -0.00379  0.03987  0.02122   0.32902
KDEOS          63   0.00000  -0.01942  0.07881  0.06092   0.20690  0.19150   0.87594
KDEOS          96   0.00000  -0.01942  0.08333  0.06553   0.18868  0.17293   0.88403
LDF            1    0.00000  -0.01942  0.01909  0.00004   0.04484  0.02630   0.45334
LDF            74   0.00000  -0.01942  0.11214  0.09490   0.27907  0.26507   0.91586
LDF            82   0.00000  -0.01942  0.11356  0.09635   0.26667  0.25243   0.91748
INFLO          1    0.00000  -0.01942  0.01419  -0.00495  0.03909  0.02043   0.17611
INFLO          99   0.00000  -0.01942  0.08420  0.06641   0.21053  0.19520   0.88457
INFLO          100  0.00000  -0.01942  0.08433  0.06655   0.21053  0.19520   0.88511
COF            13   0.00000  -0.01942  0.13477  0.11796   0.25806  0.24366   0.92449
COF            21   0.00000  -0.01942  0.12494  0.10795   0.30769  0.29425   0.92341
COF            24   0.00000  -0.01942  0.12895  0.11204   0.27778  0.26375   0.92826
COF            70   0.16667  0.15049   0.08473  0.06695   0.16667  0.15049   0.84736
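
The rule that only the smallest k is listed when several values of k reach the same best score can be sketched in Python as follows. This is an illustration only: the helper name best_k is hypothetical, and the input uses just the three LoOP rows shown in the table above rather than the full range of k values from the raw evaluation table.

```python
from typing import Dict


def best_k(scores_by_k: Dict[int, float]) -> int:
    """Return the k with the highest score; ties go to the smallest k."""
    # Sort key: higher score first, then smaller k (hence the negated k).
    return max(scores_by_k, key=lambda k: (scores_by_k[k], -k))


# Max-F1 of LoOP for the k values listed in the table above
# (other k values from the full evaluation are not included here).
loop_max_f1 = {1: 0.03738, 21: 0.21429, 24: 0.21429}
print(best_k(loop_max_f1))  # 21
```

This matches the table, where LoOP with k = 21 and k = 24 share the same Max-F1 and the k = 24 row is the one with the higher AP and ROC AUC.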

Plots

Precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.
Method labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO