Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

SpamBase (40% of outliers)

A data set representing emails classified as spam (outliers) or nonspam.

Download all data set variants used (25.4 MB). The original data (spambase.data) is also available.

Normalized, without duplicates

This version contains 57 attributes and 4207 objects, of which 1679 are outliers (39.91%).

Download raw algorithm results (37.5 MB) Download raw algorithm evaluation table (70.4 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The Maximum F1-Measure is reported as a complement to the measures in the original publication.
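
The "Adj." columns rescale a measure for chance, so that a random ranking is expected to score 0 and a perfect ranking scores 1. The following is a minimal sketch, assuming the adjustment subtracts the outlier rate |O|/N as the expected value and divides by 1 - |O|/N; the helper name `adjust_for_chance` is hypothetical, and the counts are those stated for this data set variant. Applied to the KNN row with k = 10, it agrees with the tabulated Adj. P@n and Adj. AP values up to rounding.

```python
def adjust_for_chance(score: float, n_outliers: int, n_objects: int) -> float:
    """Rescale a score so the assumed chance level maps to 0 and a perfect score to 1.

    Assumption: the expected score of a random ranking equals the
    outlier rate n_outliers / n_objects.
    """
    expected = n_outliers / n_objects
    return (score - expected) / (1.0 - expected)


# Counts stated for this data set variant.
N_OBJECTS = 4207
N_OUTLIERS = 1679

# Raw scores copied from the KNN, k = 10 row of the table.
adj_p_at_n = adjust_for_chance(0.43419, N_OUTLIERS, N_OBJECTS)
adj_ap = adjust_for_chance(0.41858, N_OUTLIERS, N_OBJECTS)

# The table reports 0.05840 (Adj. P@n) and 0.03242 (Adj. AP) for this row.
print(f"Adj. P@n = {adj_p_at_n:.5f}, Adj. AP = {adj_ap:.5f}")
```

A negative adjusted value, as in several LOF and KDEOS rows, therefore means the raw score fell below the outlier rate of about 0.399.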

Algorithm       k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN            10   0.43419   0.05840   0.41858   0.03242   0.57590   0.29424   0.55615
KNN            24   0.44253   0.07227   0.41515   0.02671   0.59861   0.33201   0.56926
KNN            37   0.43895   0.06633   0.41499   0.02645   0.60133   0.33654   0.57174
KNN            63   0.44074   0.06930   0.41623   0.02852   0.59773   0.33056   0.57347
KNNW           54   0.43419   0.05840   0.40978   0.01778   0.59252   0.32189   0.55745
KNNW           88   0.43300   0.05641   0.41249   0.02229   0.59853   0.33189   0.56490
KNNW          100   0.43300   0.05641   0.41273   0.02268   0.59771   0.33053   0.56618
LOF             2   0.37880  -0.03378   0.41634   0.02870   0.57080   0.28574   0.47384
LOF             3   0.37522  -0.03973   0.42106   0.03654   0.57065   0.28549   0.46977
LOF            93   0.32817  -0.11803   0.35683  -0.07033   0.57731   0.29658   0.44077
SimplifiedLOF   2   0.42644   0.04551   0.45627   0.09515   0.57070   0.28557   0.50119
SimplifiedLOF  84   0.31149  -0.14578   0.34858  -0.08407   0.57461   0.29209   0.41348
LoOP            1   0.38892  -0.01693   0.45370   0.09087   0.57051   0.28525   0.48381
LoOP            2   0.41453   0.02569   0.43646   0.06218   0.57051   0.28525   0.49658
LDOF            2   0.38058  -0.03081   0.43957   0.06736   0.57079   0.28573   0.46696
LDOF            5   0.39428  -0.00801   0.41783   0.03117   0.57051   0.28525   0.47960
LDOF            6   0.40143   0.00388   0.41233   0.02201   0.57099   0.28606   0.47815
LDOF           75   0.33175  -0.11208   0.35358  -0.07574   0.57477   0.29235   0.42636
ODIN           21   0.41520   0.02680   0.40433   0.00872   0.57183   0.28745   0.51495
ODIN           42   0.40436   0.00877   0.40546   0.01059   0.57192   0.28760   0.51862
ODIN           47   0.40357   0.00744   0.40513   0.01003   0.57260   0.28874   0.51910
ODIN           81   0.40514   0.01006   0.40378   0.00779   0.57506   0.29283   0.51822
FastABOD        3   0.36450  -0.05757   0.37194  -0.04519   0.57407   0.29119   0.43721
FastABOD        6   0.37165  -0.04568   0.36257  -0.06079   0.57402   0.29111   0.42782
KDEOS           3   0.35557  -0.07244   0.41306   0.02323   0.57123   0.28646   0.45831
KDEOS          15   0.33413  -0.10812   0.35297  -0.07676   0.57211   0.28793   0.41425
KDEOS          98   0.38535  -0.02288   0.39074  -0.01391   0.57075   0.28566   0.47590
KDEOS         100   0.38475  -0.02387   0.39047  -0.01436   0.57085   0.28582   0.47667
LDF             3   0.39547  -0.00603   0.43229   0.05524   0.57128   0.28655   0.48171
LDF             7   0.40858   0.01578   0.39506  -0.00672   0.57158   0.28703   0.48128
LDF            98   0.39845  -0.00107   0.39011  -0.01496   0.59049   0.31852   0.53118
LDF           100   0.40322   0.00686   0.39282  -0.01044   0.59039   0.31835   0.53641
INFLO           3   0.36689  -0.05361   0.41358   0.02411   0.57060   0.28541   0.47381
INFLO           4   0.36927  -0.04964   0.40471   0.00935   0.57060   0.28541   0.47099
INFLO          92   0.33413  -0.10812   0.35934  -0.06616   0.57387   0.29085   0.44855
COF             1   0.38892  -0.01693   0.45046   0.08548   0.57060   0.28541   0.48259
COF             2   0.42942   0.05047   0.44851   0.08223   0.57070   0.28558   0.49945
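
Because each table row is a whitespace-separated record (algorithm, k, then seven scores), the best k per method for a given measure can be recovered with a few lines of code. The following is a rough sketch that works on rows pasted from this page; the function names `parse_rows` and `best_k` are hypothetical, the column names are typed by hand because some of them contain spaces, and only four rows are included for brevity. It makes no assumption about the format of the downloadable raw evaluation table.

```python
# Measure columns in table order; typed by hand because names such as
# "Adj. P@n" contain spaces and cannot be split on whitespace.
COLUMNS = ["P@n", "Adj. P@n", "AP", "Adj. AP", "Max-F1", "Adj. MF1", "ROC AUC"]

# A few rows copied from the table above (normalized, without duplicates).
ROWS = """\
KNN 10 0.43419 0.05840 0.41858 0.03242 0.57590 0.29424 0.55615
KNN 63 0.44074 0.06930 0.41623 0.02852 0.59773 0.33056 0.57347
LOF 2 0.37880 -0.03378 0.41634 0.02870 0.57080 0.28574 0.47384
LOF 3 0.37522 -0.03973 0.42106 0.03654 0.57065 0.28549 0.46977
"""


def parse_rows(text):
    """Turn pasted table rows into (algorithm, k, {measure: score}) tuples."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2 + len(COLUMNS):
            continue  # skip blank lines and the header line
        algorithm, k = fields[0], int(fields[1])
        scores = dict(zip(COLUMNS, map(float, fields[2:])))
        records.append((algorithm, k, scores))
    return records


def best_k(records, measure):
    """Return {algorithm: (k, score)} with the highest score for one measure."""
    best = {}
    for algorithm, k, scores in records:
        value = scores[measure]
        # Strict '>' keeps the first (smallest) k on ties when rows are sorted by k.
        if algorithm not in best or value > best[algorithm][1]:
            best[algorithm] = (k, value)
    return best


records = parse_rows(ROWS)
best_auc = best_k(records, "ROC AUC")
for algorithm, (k, value) in best_auc.items():
    print(f"{algorithm}: best ROC AUC {value:.5f} at k = {k}")
```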

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, without duplicates

This version contains 57 attributes and 4207 objects, of which 1679 are outliers (39.91%).

Download raw algorithm results (36.3 MB) Download raw algorithm evaluation table (71.9 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The Maximum F1-Measure is reported as a complement to the measures in the original publication.
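
Many rows in the tables on this page share a Max-F1 of 0.57051. A quick check, sketched here under the assumption that F1 is the usual harmonic mean of precision and recall, suggests that this is the score of the trivial solution that flags every object as an outlier (precision equal to the outlier rate, recall equal to 1). The function name `f1_score` is illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Counts stated for this data set variant.
N_OBJECTS = 4207
N_OUTLIERS = 1679

# Flagging every object as an outlier finds all outliers (recall 1),
# with precision equal to the outlier rate.
trivial_precision = N_OUTLIERS / N_OBJECTS
trivial_recall = 1.0
trivial_f1 = f1_score(trivial_precision, trivial_recall)

print(f"Max-F1 of the trivial solution: {trivial_f1:.5f}")
```

Under that reading, a Max-F1 of 0.57051 indicates that no threshold on the ranking beat the trivial solution, so only values above it reflect useful ranking information.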

Algorithm       k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN            40   0.59202   0.32105   0.63802   0.39761   0.63679   0.39557   0.73230
KNN            93   0.59797   0.33097   0.63665   0.39532   0.64286   0.40566   0.73949
KNN            99   0.59738   0.32997   0.63665   0.39533   0.64438   0.40819   0.73965
KNN           100   0.59678   0.32898   0.63661   0.39526   0.64455   0.40848   0.73959
KNNW           51   0.59381   0.32403   0.63347   0.39004   0.62471   0.37545   0.72342
KNNW           99   0.59202   0.32105   0.63609   0.39439   0.63852   0.39843   0.73327
KNNW          100   0.59202   0.32105   0.63603   0.39430   0.63866   0.39867   0.73338
LOF            97   0.49077   0.15256   0.46714   0.11324   0.57525   0.29315   0.58735
LOF           100   0.48958   0.15057   0.46917   0.11661   0.57773   0.29727   0.58895
SimplifiedLOF   1   0.37105  -0.04667   0.45119   0.08669   0.57051   0.28525   0.47929
SimplifiedLOF 100   0.44253   0.07227   0.43347   0.05720   0.57304   0.28947   0.54065
LoOP            1   0.37105  -0.04667   0.45523   0.09341   0.57051   0.28525   0.48121
LoOP           95   0.42228   0.03857   0.40645   0.01224   0.57051   0.28525   0.51992
LoOP          100   0.41930   0.03362   0.40943   0.01719   0.57051   0.28525   0.52292
LDOF            2   0.37641  -0.03775   0.43620   0.06175   0.57051   0.28525   0.44212
LDOF           99   0.32996  -0.11506   0.34836  -0.08443   0.57060   0.28541   0.42269
ODIN            1   0.36858  -0.05078   0.39101  -0.01345   0.57350   0.29023   0.48875
ODIN           58   0.37867  -0.03400   0.37848  -0.03431   0.59712   0.32955   0.50729
ODIN           98   0.39238  -0.01117   0.38824  -0.01807   0.59369   0.32384   0.52115
ODIN          100   0.39198  -0.01185   0.38885  -0.01706   0.59376   0.32394   0.52225
FastABOD        3   0.54914   0.24969   0.57080   0.28574   0.58572   0.31058   0.65547
FastABOD       46   0.55926   0.26654   0.56513   0.27630   0.58624   0.31143   0.65294
FastABOD       53   0.55867   0.26555   0.56507   0.27621   0.58737   0.31331   0.65287
KDEOS           3   0.35021  -0.08136   0.41203   0.02152   0.57060   0.28541   0.45364
KDEOS           5   0.34485  -0.09028   0.35656  -0.07079   0.57212   0.28794   0.42500
KDEOS          97   0.38773  -0.01891   0.38336  -0.02618   0.57089   0.28590   0.48840
KDEOS         100   0.38714  -0.01991   0.38388  -0.02532   0.57080   0.28574   0.49001
LDF            66   0.55092   0.25266   0.57867   0.29885   0.57918   0.29969   0.64729
LDF           100   0.54378   0.24077   0.58839   0.31501   0.60389   0.34081   0.67491
INFLO           2   0.34723  -0.08631   0.38840  -0.01780   0.57060   0.28541   0.46371
INFLO          96   0.54883   0.24918   0.43282   0.05613   0.57051   0.28525   0.59052
INFLO         100   0.54620   0.24480   0.43361   0.05744   0.57051   0.28525   0.58725
COF             1   0.37165  -0.04568   0.45051   0.08556   0.57133   0.28663   0.48065
COF           100   0.46575   0.11093   0.49722   0.16329   0.57051   0.28525   0.55950

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO