Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

SpamBase (39% outliers)

A data set representing emails classified as spam (outliers) or nonspam.

Download all data set variants used (25.4 MB). You can also access the original data (spambase.data).

Normalized, duplicates

This version contains 57 attributes, 4601 objects, and 1813 outliers (39.40%).

Download raw algorithm results (38.3 MB). Download raw algorithm evaluation table (71.4 kB).

Best Parameters

The following table lists the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures in the original publication.
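
As a rough illustration of what the Max-F1 column measures, the sketch below assumes the usual definition: the largest F1 score obtained over all cutoffs of an outlier ranking. The function name `max_f1` and the toy labels are hypothetical and are not taken from the study's code.

```python
def max_f1(ranked_labels):
    """Largest F1 over all cutoffs of a ranking.

    ranked_labels: 1 (outlier) / 0 (inlier), ordered from the highest
    to the lowest outlier score. Ties in the scores are ignored here.
    """
    total_outliers = sum(ranked_labels)
    if total_outliers == 0:
        return 0.0
    best, true_pos = 0.0, 0
    for cutoff, label in enumerate(ranked_labels, start=1):
        true_pos += label
        # F1 = 2 * precision * recall / (precision + recall)
        #    = 2 * TP / (cutoff + total_outliers)
        best = max(best, 2.0 * true_pos / (cutoff + total_outliers))
    return best


toy_ranking = [1, 0, 1, 1, 0, 0]  # illustrative labels only, not study data
toy_score = max_f1(toy_ranking)   # best cutoff is the top 4: 2*3 / (4+3) = 6/7

# Flagging every object is always one of the cutoffs, so under this
# definition Max-F1 cannot fall below 2*|O| / (N + |O|).
# With the counts stated for this data set (1813 outliers, 4601 objects):
floor_spambase = 2 * 1813 / (4601 + 1813)  # about 0.56533
```

Under this assumed definition, a Max-F1 of about 0.56533 on this data set means the ranking did no better than labelling every object an outlier.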

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 28 0.45174 0.09521 0.41949 0.04199 0.60904 0.35481 0.59210
KNN 32 0.44953 0.09157 0.41975 0.04243 0.61042 0.35707 0.59372
KNN 33 0.44898 0.09066 0.42010 0.04299 0.61037 0.35700 0.59428
KNNW 95 0.43574 0.06881 0.40982 0.02603 0.60423 0.34687 0.57799
KNNW 100 0.43519 0.06790 0.41021 0.02668 0.60524 0.34853 0.57873
LOF 4 0.37948 -0.02403 0.41804 0.03959 0.56594 0.28367 0.48569
LOF 10 0.38279 -0.01857 0.39834 0.00709 0.56864 0.28814 0.48950
LOF 12 0.38389 -0.01675 0.38879 -0.00867 0.56771 0.28660 0.48813
LOF 99 0.28627 -0.17787 0.34047 -0.08841 0.58367 0.31293 0.42423
SimplifiedLOF 2 0.43133 0.06153 0.44662 0.08677 0.56554 0.28302 0.50486
SimplifiedLOF 4 0.42802 0.05607 0.43936 0.07478 0.56533 0.28266 0.50971
SimplifiedLOF 100 0.28737 -0.17605 0.33246 -0.10163 0.57808 0.30372 0.40967
LoOP 1 0.36073 -0.05498 0.44145 0.07823 0.56533 0.28266 0.47351
LoOP 2 0.41809 0.03968 0.43685 0.07065 0.56533 0.28266 0.50395
LoOP 4 0.40265 0.01420 0.41931 0.04170 0.56533 0.28266 0.50485
LDOF 2 0.37229 -0.03591 0.42719 0.05469 0.56533 0.28266 0.45660
LDOF 8 0.40596 0.01966 0.40195 0.01304 0.56656 0.28470 0.46863
LDOF 91 0.31660 -0.12780 0.34037 -0.08858 0.57727 0.30237 0.42822
ODIN 1 0.38090 -0.02169 0.39642 0.00392 0.58059 0.30786 0.51110
ODIN 24 0.41577 0.03586 0.40454 0.01733 0.57241 0.29435 0.52630
ODIN 25 0.41692 0.03775 0.40449 0.01724 0.57185 0.29343 0.52604
FastABOD 3 0.35852 -0.05862 0.36308 -0.05110 0.57764 0.30298 0.44784
FastABOD 20 0.36018 -0.05589 0.34934 -0.07378 0.58020 0.30721 0.43414
FastABOD 39 0.36293 -0.05134 0.34879 -0.07469 0.58001 0.30690 0.43356
KDEOS 3 0.37617 -0.02950 0.41202 0.02967 0.56577 0.28339 0.47070
KDEOS 16 0.34749 -0.07683 0.35595 -0.06287 0.56727 0.28587 0.43544
KDEOS 97 0.39106 -0.00492 0.37519 -0.03111 0.56550 0.28295 0.48371
KDEOS 100 0.38941 -0.00765 0.37561 -0.03043 0.56550 0.28295 0.48491
LDF 4 0.40761 0.02239 0.42531 0.05160 0.56603 0.28382 0.49250
LDF 9 0.42140 0.04515 0.41184 0.02937 0.56569 0.28326 0.50644
LDF 11 0.41754 0.03877 0.42173 0.04570 0.56744 0.28615 0.52427
LDF 67 0.25483 -0.22975 0.31866 -0.12441 0.58515 0.31538 0.36532
INFLO 4 0.37838 -0.02585 0.41026 0.02676 0.56541 0.28281 0.49186
INFLO 99 0.30116 -0.15329 0.34082 -0.08783 0.56846 0.28784 0.42574
COF 2 0.43298 0.06426 0.44629 0.08622 0.56533 0.28266 0.51483
COF 75 0.30061 -0.15420 0.32343 -0.11654 0.57437 0.29759 0.38441
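
The selection rule described for this table (highest score per measure, smallest k on ties) can be sketched as follows. This is a minimal illustration, not the study's tooling: the helper names `parse_row` and `best_k` are hypothetical, and the input is only the three LoOP rows listed above rather than the full raw results.

```python
# Column names in table order, after the "Algorithm" and "k" columns.
COLUMNS = ["P@n", "Adj. P@n", "AP", "Adj. AP", "Max-F1", "Adj. MF1", "ROC AUC"]

# The three LoOP rows, copied verbatim from the table.
ROWS = """\
LoOP 1 0.36073 -0.05498 0.44145 0.07823 0.56533 0.28266 0.47351
LoOP 2 0.41809 0.03968 0.43685 0.07065 0.56533 0.28266 0.50395
LoOP 4 0.40265 0.01420 0.41931 0.04170 0.56533 0.28266 0.50485
"""


def parse_row(line):
    """Split one whitespace-separated table row into a record."""
    name, k, *values = line.split()
    record = {"algorithm": name, "k": int(k)}
    record.update(zip(COLUMNS, map(float, values)))
    return record


def best_k(rows, measure):
    """k of the highest score; equal scores are resolved by the smallest k."""
    return min(rows, key=lambda r: (-r[measure], r["k"]))["k"]


rows = [parse_row(line) for line in ROWS.splitlines()]
best = {measure: best_k(rows, measure) for measure in COLUMNS}
```

Among these three rows, the Max-F1 column is tied at 0.56533, so the tie-break alone decides which k is reported for that measure.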

Plots

[Plots: Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, duplicates

This version contains 57 attributes, 4601 objects, and 1813 outliers (39.40%).

Download raw algorithm results (37.5 MB). Download raw algorithm evaluation table (72.4 kB).

Best Parameters

The following table lists the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 83 0.60894 0.35463 0.64914 0.42098 0.65147 0.42483 0.75543
KNN 88 0.60894 0.35463 0.64981 0.42208 0.65010 0.42257 0.75627
KNN 89 0.60894 0.35463 0.64974 0.42197 0.65022 0.42276 0.75642
KNN 96 0.61280 0.36100 0.64925 0.42117 0.64988 0.42220 0.75558
KNNW 93 0.60452 0.34735 0.64666 0.41689 0.63939 0.40489 0.74785
KNNW 100 0.60287 0.34462 0.64684 0.41718 0.64140 0.40820 0.74887
LOF 68 0.43519 0.06790 0.39385 -0.00032 0.56825 0.28748 0.53115
LOF 100 0.47932 0.14072 0.44501 0.08410 0.56677 0.28504 0.58079
SimplifiedLOF 2 0.42030 0.04333 0.43211 0.06281 0.56533 0.28266 0.48640
SimplifiedLOF 100 0.40265 0.01420 0.38705 -0.01154 0.57422 0.29733 0.51324
LoOP 1 0.35080 -0.07137 0.43695 0.07080 0.56533 0.28266 0.47371
LoOP 2 0.41258 0.03058 0.42917 0.05797 0.56533 0.28266 0.49657
LoOP 100 0.39658 0.00418 0.37719 -0.02782 0.56533 0.28266 0.50059
LDOF 2 0.37400 -0.03308 0.41745 0.03863 0.56550 0.28295 0.43956
LDOF 98 0.30667 -0.14419 0.32922 -0.10698 0.57274 0.29489 0.40665
ODIN 1 0.37506 -0.03133 0.39489 0.00140 0.58043 0.30759 0.50896
ODIN 53 0.35888 -0.05803 0.36813 -0.04277 0.60014 0.34011 0.49974
ODIN 100 0.38868 -0.00886 0.37831 -0.02597 0.58961 0.32274 0.51812
FastABOD 6 0.55047 0.25814 0.54188 0.24397 0.58247 0.31096 0.64956
FastABOD 83 0.56040 0.27453 0.56429 0.28095 0.58085 0.30829 0.65383
FastABOD 94 0.55874 0.27180 0.56439 0.28112 0.58105 0.30861 0.65397
FastABOD 98 0.55929 0.27271 0.56438 0.28111 0.58128 0.30900 0.65399
KDEOS 3 0.36018 -0.05589 0.39104 -0.00496 0.56559 0.28310 0.45737
KDEOS 5 0.36349 -0.05043 0.35719 -0.06082 0.56686 0.28519 0.43973
KDEOS 100 0.36955 -0.04042 0.36989 -0.03987 0.56550 0.28295 0.46953
LDF 98 0.51903 0.20626 0.58043 0.30759 0.59419 0.33030 0.66345
LDF 100 0.51848 0.20535 0.58088 0.30834 0.59794 0.33649 0.66622
INFLO 3 0.37562 -0.03041 0.38762 -0.01061 0.56550 0.28295 0.48549
INFLO 99 0.52717 0.21970 0.40640 0.02038 0.56533 0.28266 0.56759
INFLO 100 0.52698 0.21937 0.40782 0.02274 0.56533 0.28266 0.56961
COF 2 0.40982 0.02603 0.42918 0.05798 0.56594 0.28368 0.49178
COF 100 0.42857 0.05698 0.41271 0.03081 0.57978 0.30652 0.52895
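
The adjusted columns appear to follow the usual adjustment for chance, (value - expected) / (1 - expected), with the expected value taken to be the outlier rate |O|/N. That reading is an assumption, checked below against a single row (KNN, k = 83) and the object counts stated for this variant. The helper name `adjust_for_chance` is hypothetical.

```python
N_OBJECTS = 4601
N_OUTLIERS = 1813
EXPECTED = N_OUTLIERS / N_OBJECTS  # outlier rate, about 0.39404


def adjust_for_chance(value, expected=EXPECTED):
    """Rescale so that `expected` maps to 0 and a perfect score of 1 stays 1."""
    return (value - expected) / (1.0 - expected)


# KNN, k = 83: (raw value, adjusted value as listed in the table)
knn_83 = {
    "P@n": (0.60894, 0.35463),
    "AP": (0.64914, 0.42098),
    "Max-F1": (0.65147, 0.42483),
}

recomputed = {m: adjust_for_chance(raw) for m, (raw, _) in knn_83.items()}

# Largest gap between recomputed and listed values. Small differences are
# expected because the table entries are rounded to five decimals.
max_gap = max(abs(recomputed[m] - listed) for m, (_, listed) in knn_83.items())
```

A negative adjusted value then simply indicates a raw score below the outlier rate, which is what several LOF, KDEOS, and LDOF rows show.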

Plots

[Plots: Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO