Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

Stamps (2% of outliers, version #05)

A data set representing forged (photocopied or scanned+printed) stamps and genuine (ink) stamps. The features are based on color and printing properties of the stamps. Forged stamps are considered to be outliers. The stamps data set is not taken from the UCI repository, but was used in [1].

References:

[1] B. Micenková, J. van Beusekom, and F. Shafait. Stamp verification for automated document authentication. In 5th Int. Workshop on Computational Forensics, 2012.

Download all data set variants used (371.2 kB).

Normalized, without duplicates

This version contains 9 attributes, 315 objects, 6 outliers (1.90%)

Download raw algorithm results (2.7 MB)
Download raw algorithm evaluation table (41.1 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one k, only the smallest k is given).
The Maximum F1-Measure is reported as a complement to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 1 0.00000 -0.01942 0.11768 0.10055 0.23529 0.22045 0.91613
KNN 3 0.00000 -0.01942 0.11957 0.10247 0.27273 0.25861 0.92287
KNN 4 0.00000 -0.01942 0.10966 0.09237 0.27907 0.26507 0.91424
KNNW 1 0.00000 -0.01942 0.06969 0.05163 0.18182 0.16593 0.74083
KNNW 6 0.00000 -0.01942 0.10928 0.09199 0.26667 0.25243 0.91370
LOF 1 0.00000 -0.01942 0.03338 0.01461 0.11765 0.10051 0.52400
LOF 15 0.00000 -0.01942 0.08545 0.06770 0.24490 0.23024 0.88188
SimplifiedLOF 1 0.00000 -0.01942 0.02103 0.00202 0.07143 0.05340 0.37487
SimplifiedLOF 27 0.00000 -0.01942 0.08042 0.06256 0.21818 0.20300 0.87756
SimplifiedLOF 34 0.00000 -0.01942 0.07805 0.06015 0.22222 0.20712 0.87271
LoOP 1 0.00000 -0.01942 0.02242 0.00344 0.07143 0.05340 0.40939
LoOP 25 0.00000 -0.01942 0.07399 0.05601 0.19355 0.17789 0.86785
LoOP 27 0.00000 -0.01942 0.07343 0.05543 0.20000 0.18447 0.86624
LDOF 2 0.00000 -0.01942 0.01503 -0.00410 0.04040 0.02177 0.30798
LDOF 58 0.00000 -0.01942 0.07654 0.05861 0.21053 0.19520 0.87217
ODIN 1 0.00000 -0.01942 0.02110 0.00209 0.04149 0.02288 0.41046
ODIN 20 0.00000 -0.01942 0.09960 0.08212 0.22222 0.20712 0.85383
ODIN 36 0.00000 -0.01942 0.08994 0.07227 0.22222 0.20712 0.88565
ODIN 52 0.00000 -0.01942 0.08695 0.06922 0.22642 0.21139 0.87837
FastABOD 3 0.00000 -0.01942 0.04362 0.02505 0.10256 0.08514 0.71521
FastABOD 42 0.00000 -0.01942 0.08667 0.06893 0.18462 0.16878 0.88242
FastABOD 76 0.00000 -0.01942 0.08781 0.07010 0.18462 0.16878 0.88403
KDEOS 44 0.16667 0.15049 0.07860 0.06071 0.16667 0.15049 0.81338
KDEOS 54 0.16667 0.15049 0.12994 0.11304 0.26667 0.25243 0.87972
KDEOS 57 0.16667 0.15049 0.12200 0.10495 0.28571 0.27184 0.88242
KDEOS 72 0.00000 -0.01942 0.10324 0.08583 0.19231 0.17662 0.89213
LDF 1 0.00000 -0.01942 0.04015 0.02152 0.14286 0.12621 0.54935
LDF 9 0.00000 -0.01942 0.09204 0.07441 0.24000 0.22524 0.89536
INFLO 1 0.00000 -0.01942 0.03317 0.01440 0.12500 0.10801 0.46629
INFLO 62 0.00000 -0.01942 0.07007 0.05202 0.18462 0.16878 0.86030
INFLO 87 0.00000 -0.01942 0.07036 0.05231 0.17647 0.16048 0.85976
INFLO 92 0.00000 -0.01942 0.06865 0.05056 0.18750 0.17172 0.85653
COF 1 0.00000 -0.01942 0.02109 0.00208 0.07143 0.05340 0.37594
COF 12 0.00000 -0.01942 0.09984 0.08236 0.19048 0.17476 0.89698
COF 24 0.00000 -0.01942 0.09954 0.08205 0.24490 0.23024 0.90453
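
The adjusted columns in the table above appear consistent with the usual adjustment for chance, (value - expected) / (1 - expected), where the expected value under a random ranking is taken as the outlier rate 6/315. The following is a minimal sketch under that assumption; the function name `adjust_for_chance` is illustrative and not taken from the original study.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so a random ranking scores about 0 and a perfect one scores 1.

    Assumption: the expected value under a random ranking equals the
    outlier rate n_outliers / n_objects.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


N_OBJECTS, N_OUTLIERS = 315, 6  # sizes stated for this data set version

# P@n can only take the values 0, 1/6, 2/6, ... when there are 6 outliers.
adj_zero = round(adjust_for_chance(0.0, N_OUTLIERS, N_OBJECTS), 5)
adj_one_hit = round(adjust_for_chance(1 / 6, N_OUTLIERS, N_OBJECTS), 5)

# AP of KNN with k = 1, copied from the table above (already rounded).
adj_ap = adjust_for_chance(0.11768, N_OUTLIERS, N_OBJECTS)

print(adj_zero)     # -0.01942, the Adj. P@n value of every row with P@n = 0
print(adj_one_hit)  # 0.15049, the Adj. P@n value of the rows with P@n = 0.16667
print(adj_ap)       # close to the tabulated Adj. AP of 0.10055
```

Because the table entries are rounded to five decimals, values recomputed from them can differ from the tabulated adjusted values in the last digit.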

Plots

[Plots: Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity]
A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF
G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO
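
The unadjusted measures named in the plots above can all be computed from a ranking of the objects by outlier score together with the ground-truth labels. The sketch below uses the textbook definitions in plain Python; the function names and the toy data are illustrative, and tie handling in the authors' implementation may differ.

```python
def rank_labels(scores, labels):
    """Return the labels ordered by decreasing outlier score (1 = outlier, 0 = inlier)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]


def precision_at_n(scores, labels):
    """Fraction of true outliers among the top n objects, with n = number of outliers."""
    n = sum(labels)
    return sum(rank_labels(scores, labels)[:n]) / n


def average_precision(scores, labels):
    """Mean of the precision values measured at the rank of each true outlier."""
    hits, total = 0, 0.0
    for rank, label in enumerate(rank_labels(scores, labels), start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits


def max_f1(scores, labels):
    """Largest F1 score over all cut-off ranks of the ranking."""
    n, hits, best = sum(labels), 0, 0.0
    for rank, label in enumerate(rank_labels(scores, labels), start=1):
        hits += label
        # F1 = 2PR / (P + R) with P = hits / rank and R = hits / n
        best = max(best, 2 * hits / (rank + n))
    return best


def roc_auc(scores, labels):
    """Fraction of (outlier, inlier) pairs ranked correctly; ties count one half."""
    outliers = [s for s, l in zip(scores, labels) if l]
    inliers = [s for s, l in zip(scores, labels) if not l]
    wins = sum((o > i) + 0.5 * (o == i) for o in outliers for i in inliers)
    return wins / (len(outliers) * len(inliers))


# Toy ranking (not from the Stamps data): six objects, two of them outliers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 0, 0, 0]

print(precision_at_n(scores, labels))     # 0.5
print(average_precision(scores, labels))  # (1/1 + 2/3) / 2, about 0.8333
print(max_f1(scores, labels))             # 0.8, reached at cut-off rank 3
print(roc_auc(scores, labels))            # 0.875, 7 of 8 pairs ordered correctly
```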

Not normalized, without duplicates

This version contains 9 attributes, 315 objects, 6 outliers (1.90%)

Download raw algorithm results (2.7 MB)
Download raw algorithm evaluation table (41.2 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one k, only the smallest k is given).
The Maximum F1-Measure is reported as a complement to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 1 0.00000 -0.01942 0.14264 0.12600 0.27778 0.26375 0.93258
KNN 3 0.00000 -0.01942 0.15753 0.14117 0.34286 0.33010 0.94175
KNNW 1 0.00000 -0.01942 0.07395 0.05596 0.20000 0.18447 0.75593
KNNW 5 0.00000 -0.01942 0.13685 0.12008 0.28571 0.27184 0.93150
KNNW 6 0.00000 -0.01942 0.13615 0.11938 0.30769 0.29425 0.93204
LOF 1 0.00000 -0.01942 0.04147 0.02286 0.15385 0.13742 0.50944
LOF 12 0.00000 -0.01942 0.08897 0.07128 0.21429 0.19903 0.89105
LOF 17 0.00000 -0.01942 0.08664 0.06891 0.24490 0.23024 0.88511
SimplifiedLOF 1 0.00000 -0.01942 0.02336 0.00440 0.08696 0.06923 0.38269
SimplifiedLOF 25 0.00000 -0.01942 0.07758 0.05966 0.21429 0.19903 0.87379
SimplifiedLOF 26 0.00000 -0.01942 0.07762 0.05971 0.21429 0.19903 0.87379
LoOP 1 0.00000 -0.01942 0.02470 0.00576 0.08696 0.06923 0.41721
LoOP 41 0.00000 -0.01942 0.07322 0.05523 0.20000 0.18447 0.86570
LoOP 47 0.00000 -0.01942 0.07456 0.05659 0.19672 0.18112 0.86839
LDOF 2 0.00000 -0.01942 0.01524 -0.00388 0.04054 0.02191 0.31877
LDOF 62 0.00000 -0.01942 0.07291 0.05491 0.18750 0.17172 0.86570
LDOF 63 0.00000 -0.01942 0.07290 0.05490 0.19672 0.18112 0.86570
ODIN 1 0.00000 -0.01942 0.02146 0.00246 0.04237 0.02378 0.41667
ODIN 46 0.00000 -0.01942 0.08898 0.07129 0.21053 0.19520 0.88889
ODIN 56 0.00000 -0.01942 0.09603 0.07847 0.22222 0.20712 0.88673
ODIN 60 0.00000 -0.01942 0.10361 0.08620 0.20833 0.19296 0.88080
FastABOD 3 0.00000 -0.01942 0.04795 0.02947 0.12121 0.10415 0.74488
FastABOD 12 0.00000 -0.01942 0.08822 0.07051 0.20000 0.18447 0.86462
FastABOD 70 0.00000 -0.01942 0.10592 0.08856 0.19355 0.17789 0.89752
FastABOD 72 0.00000 -0.01942 0.10660 0.08925 0.20000 0.18447 0.89644
KDEOS 54 0.16667 0.15049 0.09467 0.07709 0.18182 0.16593 0.86084
KDEOS 58 0.00000 -0.01942 0.09155 0.07391 0.21053 0.19520 0.86839
KDEOS 64 0.00000 -0.01942 0.09974 0.08226 0.17910 0.16316 0.88673
KDEOS 86 0.00000 -0.01942 0.08772 0.07001 0.18462 0.16878 0.88727
LDF 1 0.00000 -0.01942 0.03681 0.01811 0.13333 0.11650 0.52454
LDF 74 0.00000 -0.01942 0.10400 0.08660 0.27907 0.26507 0.90831
LDF 80 0.00000 -0.01942 0.10660 0.08925 0.26667 0.25243 0.91154
INFLO 1 0.16667 0.15049 0.04481 0.02626 0.16667 0.15049 0.48732
INFLO 66 0.00000 -0.01942 0.07688 0.05895 0.19672 0.18112 0.87325
INFLO 99 0.00000 -0.01942 0.07560 0.05765 0.20690 0.19150 0.87055
COF 8 0.16667 0.15049 0.10117 0.08371 0.25000 0.23544 0.83225
COF 11 0.16667 0.15049 0.11625 0.09909 0.22222 0.20712 0.88619
COF 13 0.00000 -0.01942 0.11382 0.09661 0.23529 0.22045 0.90183
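
The rule stated before the table (best score per measure, smallest k on ties) can be illustrated with the three SimplifiedLOF rows above. This is only a sketch: the real selection runs over every tested k in the raw results, the helper `best_k` is a hypothetical name, and the comparison here works on the rounded table values.

```python
# (k, P@n, AP, Max-F1, ROC AUC) for SimplifiedLOF, copied from the table above.
rows = [
    (1, 0.00000, 0.02336, 0.08696, 0.38269),
    (25, 0.00000, 0.07758, 0.21429, 0.87379),
    (26, 0.00000, 0.07762, 0.21429, 0.87379),
]
measures = ["P@n", "AP", "Max-F1", "ROC AUC"]


def best_k(rows, column):
    """Return the k with the highest score in `column`; equal scores go to the smallest k."""
    # Sorting key: descending score first, then ascending k as the tie-breaker.
    return min(rows, key=lambda row: (-row[column], row[0]))[0]


best = {name: best_k(rows, index + 1) for index, name in enumerate(measures)}
print(best)  # {'P@n': 1, 'AP': 26, 'Max-F1': 25, 'ROC AUC': 25}
```

This also suggests why k = 1 rows appear for most methods: P@n is 0 for nearly every setting, so the smallest tested k wins the tie for that measure.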

Plots

[Plots: Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity]
A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF
G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO