Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

InternetAds (2% of outliers, version #01)

The data set consists of images from web pages, classified as ads or not. The goal is to learn to remove ads automatically from web pages while retaining regular images. Ads are considered outliers.

Download all data set variants used (6.0 MB). You can also access the original data (ad.data).

Normalized, without duplicates

This version contains 1555 attributes, 1630 objects, and 32 outliers (1.96%).

Download raw algorithm results (10.2 MB). Download raw algorithm evaluation table (56.4 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for several values of k, only the smallest k is given).
The Maximum F1-Measure is reported in addition to the measures in the original publication.

Algorithm      k    P@n      Adj. P@n  AP       Adj. AP  Max-F1   Adj. MF1  ROC AUC
KNN            1    0.43750  0.42624   0.42065  0.40905  0.49057  0.48036   0.85612
KNN            3    0.43750  0.42624   0.42586  0.41437  0.54167  0.53249   0.82547
KNNW           2    0.46875  0.45811   0.42308  0.41153  0.52830  0.51886   0.84296
KNNW           3    0.46875  0.45811   0.44158  0.43039  0.52000  0.51039   0.84494
KNNW           5    0.43750  0.42624   0.44380  0.43266  0.52000  0.51039   0.83698
KNNW           6    0.43750  0.42624   0.44315  0.43200  0.54167  0.53249   0.83041
LOF            5    0.40625  0.39436   0.33684  0.32356  0.45614  0.44525   0.90290
LOF            10   0.43750  0.42624   0.38635  0.37406  0.45614  0.44525   0.86282
LOF            28   0.40625  0.39436   0.39018  0.37797  0.44828  0.43723   0.80427
LOF            49   0.37500  0.36248   0.37449  0.36196  0.46154  0.45076   0.74799
SimplifiedLOF  6    0.37500  0.36248   0.34510  0.33198  0.42105  0.40946   0.90147
SimplifiedLOF  9    0.43750  0.42624   0.40572  0.39382  0.45283  0.44187   0.87701
SimplifiedLOF  10   0.43750  0.42624   0.42588  0.41439  0.46667  0.45599   0.88325
SimplifiedLOF  52   0.40625  0.39436   0.39007  0.37786  0.47059  0.45999   0.76228
LoOP           8    0.40625  0.39436   0.39383  0.38169  0.47059  0.45999   0.90920
LoOP           16   0.50000  0.48999   0.45446  0.44354  0.51613  0.50644   0.88866
LoOP           22   0.46875  0.45811   0.45669  0.44581  0.52000  0.51039   0.87477
LoOP           25   0.46875  0.45811   0.44856  0.43752  0.54902  0.53999   0.86856
LDOF           13   0.43750  0.42624   0.37974  0.36732  0.44828  0.43723   0.86814
LDOF           27   0.50000  0.48999   0.42146  0.40988  0.52830  0.51886   0.85805
LDOF           50   0.50000  0.48999   0.43616  0.42487  0.52632  0.51683   0.81463
ODIN           7    0.14085  0.12364   0.11381  0.09607  0.23077  0.21537   0.87089
ODIN           58   0.28423  0.26989   0.19668  0.18059  0.36364  0.35089   0.83317
ODIN           96   0.31250  0.29873   0.20746  0.19159  0.31461  0.30088   0.82552
ODIN           100  0.31250  0.29873   0.20785  0.19199  0.31461  0.30088   0.82586
FastABOD       16   0.40625  0.39436   0.38321  0.37086  0.43077  0.41937   0.82437
FastABOD       17   0.43750  0.42624   0.38911  0.37688  0.43750  0.42624   0.81888
FastABOD       25   0.43750  0.42624   0.37400  0.36146  0.46154  0.45076   0.81207
KDEOS          7    0.18750  0.17123   0.09952  0.08148  0.20690  0.19101   0.72491
KDEOS          55   0.09375  0.07560   0.12310  0.10554  0.15000  0.13298   0.76420
KDEOS          69   0.06250  0.04373   0.07462  0.05609  0.16883  0.15219   0.79353
LDF            3    0.04706  0.02798   0.04180  0.02262  0.08602  0.06772   0.71176
LDF            4    0.02075  0.00114   0.04198  0.02280  0.11178  0.09399   0.78309
INFLO          6    0.34375  0.33061   0.33120  0.31781  0.41509  0.40338   0.89131
INFLO          9    0.46875  0.45811   0.40309  0.39114  0.50000  0.48999   0.87402
INFLO          27   0.40625  0.39436   0.42564  0.41414  0.47273  0.46217   0.83200
COF            4    0.21875  0.20311   0.08839  0.07013  0.22222  0.20665   0.63001
COF            5    0.21875  0.20311   0.09864  0.08059  0.22222  0.20665   0.66342
COF            8    0.21875  0.20311   0.18469  0.16836  0.24000  0.22478   0.58276
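
The adjusted columns in the table above are consistent with the usual adjustment for chance, in which a measure is rescaled so that a random ranking scores 0 and a perfect ranking scores 1. The sketch below assumes the expected value under a random ranking equals the outlier fraction (32 of 1630 objects for this variant). The function name adjust_for_chance is hypothetical, and the formula is inferred from the listed values rather than taken from the evaluation code.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that a random ranking scores 0 and a perfect one 1.

    Assumption: the expected value under a random ranking is the
    outlier fraction n_outliers / n_objects.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1 - expected)


# KNN with k = 1 in the table above: P@n = 0.43750, AP = 0.42065.
adj_p_at_n = adjust_for_chance(0.43750, n_outliers=32, n_objects=1630)
adj_ap = adjust_for_chance(0.42065, n_outliers=32, n_objects=1630)

# Compare with the Adj. P@n and Adj. AP columns of the same row.
print(f"{adj_p_at_n:.5f} {adj_ap:.5f}")
```

Because the inputs are the five-decimal values printed in the table, small differences in the last digit are possible for other rows.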

Plots

[Plots: Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO
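
The unadjusted measures named above can all be computed from a ranking of the objects by decreasing outlier score. The following is a minimal sketch using textbook definitions, not the evaluation code behind these results. It assumes a list of 0/1 labels (1 = outlier) already sorted by decreasing score, at least one outlier and one inlier, and no tied scores. All function names are hypothetical.

```python
def precision_at_n(ranked, n=None):
    """Fraction of outliers among the top n; n defaults to the outlier count."""
    n = sum(ranked) if n is None else n
    return sum(ranked[:n]) / n


def average_precision(ranked):
    """Mean of the precision values at the rank of each outlier."""
    hits, total = 0, 0.0
    for rank, is_outlier in enumerate(ranked, start=1):
        if is_outlier:
            hits += 1
            total += hits / rank
    return total / hits


def max_f1(ranked):
    """Largest F1 score over all cut-off ranks."""
    n_outliers = sum(ranked)
    hits, best = 0, 0.0
    for rank, is_outlier in enumerate(ranked, start=1):
        hits += is_outlier
        # With P = hits/rank and R = hits/n_outliers, F1 = 2*hits/(rank + n_outliers).
        best = max(best, 2 * hits / (rank + n_outliers))
    return best


def roc_auc(ranked):
    """Fraction of (outlier, inlier) pairs in which the outlier is ranked first."""
    n_outliers = sum(ranked)
    n_inliers = len(ranked) - n_outliers
    inliers_seen, misordered = 0, 0
    for is_outlier in ranked:
        if is_outlier:
            misordered += inliers_seen
        else:
            inliers_seen += 1
    return 1 - misordered / (n_outliers * n_inliers)


# Toy ranking of five objects, two of them outliers (illustrative data only).
ranked = [1, 0, 1, 0, 0]
print(precision_at_n(ranked), average_precision(ranked), max_f1(ranked), roc_auc(ranked))
```

How tied outlier scores are treated is left open here; a real evaluation has to define it.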

Normalized, with duplicates

This version contains 1555 attributes, 2867 objects, and 57 outliers (1.99%).

Download raw algorithm results (12.0 MB). Download raw algorithm evaluation table (66.7 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for several values of k, only the smallest k is given).
The Maximum F1-Measure is reported in addition to the measures in the original publication.

Algorithm      k    P@n      Adj. P@n  AP       Adj. AP  Max-F1   Adj. MF1  ROC AUC
KNN            3    0.44123  0.42989   0.39915  0.38697  0.47826  0.46768   0.86185
KNN            5    0.46069  0.44975   0.41390  0.40201  0.50000  0.48986   0.85041
KNN            6    0.44386  0.43258   0.42221  0.41049  0.50000  0.48986   0.84169
KNN            10   0.42780  0.41619   0.41221  0.40028  0.54762  0.53844   0.79540
KNNW           5    0.36842  0.35561   0.36937  0.35658  0.40000  0.38783   0.86430
KNNW           14   0.45614  0.44511   0.42647  0.41484  0.52747  0.51789   0.84084
KNNW           18   0.43860  0.42721   0.43536  0.42391  0.53763  0.52826   0.82660
KNNW           23   0.43860  0.42721   0.42615  0.41451  0.54545  0.53623   0.81273
LOF            7    0.03483  0.01525   0.03597  0.01641  0.09593  0.07759   0.69793
LOF            8    0.03537  0.01580   0.03626  0.01671  0.08999  0.07153   0.70204
LOF            9    0.03522  0.01565   0.03635  0.01680  0.08623  0.06770   0.71182
SimplifiedLOF  6    0.03286  0.01324   0.03235  0.01272  0.07907  0.06039   0.66195
SimplifiedLOF  8    0.03682  0.01728   0.03469  0.01511  0.07780  0.05910   0.67881
SimplifiedLOF  22   0.03390  0.01430   0.03345  0.01384  0.06671  0.04778   0.69941
LoOP           1    0.19028  0.17386   0.09290  0.07450  0.19802  0.18175   0.61993
LoOP           27   0.07018  0.05131   0.09704  0.07872  0.16949  0.15264   0.73719
LoOP           71   0.08772  0.06921   0.06813  0.04923  0.19820  0.18193   0.74103
LoOP           78   0.10526  0.08711   0.06689  0.04797  0.15810  0.14103   0.76300
LDOF           73   0.10526  0.08711   0.06734  0.04842  0.15854  0.14147   0.75647
LDOF           74   0.10526  0.08711   0.06967  0.05080  0.16667  0.14976   0.75966
LDOF           76   0.08772  0.06921   0.06887  0.04998  0.15748  0.14039   0.76213
ODIN           10   0.09677  0.07845   0.07135  0.05251  0.16547  0.14854   0.77184
ODIN           89   0.29618  0.28190   0.16460  0.14765  0.31788  0.30404   0.76284
ODIN           100  0.29618  0.28190   0.16561  0.14868  0.32432  0.31062   0.76422
FastABOD       73   0.03509  0.01551   0.06342  0.04442  0.15261  0.13542   0.78841
FastABOD       85   0.03509  0.01551   0.06252  0.04350  0.15385  0.13668   0.78327
KDEOS          2    0.03509  0.01551   0.03612  0.01657  0.09651  0.07819   0.64574
KDEOS          10   0.03509  0.01551   0.03673  0.01719  0.07871  0.06002   0.68845
KDEOS          11   0.03509  0.01551   0.03681  0.01727  0.08000  0.06134   0.68549
KDEOS          73   0.07018  0.05131   0.03394  0.01434  0.07812  0.05943   0.66278
LDF            1    0.06667  0.04773   0.02643  0.00668  0.09600  0.07766   0.44238
LDF            2    0.03968  0.02020   0.02939  0.00970  0.11200  0.09399   0.52183
LDF            10   0.03125  0.01160   0.02931  0.00962  0.07171  0.05288   0.64982
INFLO          7    0.03488  0.01531   0.03396  0.01437  0.09524  0.07689   0.68387
INFLO          8    0.03543  0.01586   0.03480  0.01522  0.08999  0.07153   0.69674
INFLO          22   0.03428  0.01469   0.03359  0.01399  0.06827  0.04937   0.70035
COF            74   0.10526  0.08711   0.06712  0.04820  0.17647  0.15977   0.70711
COF            98   0.08772  0.06921   0.05033  0.03107  0.12687  0.10915   0.73091
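
The selection rule stated before the table (best score per measure, smallest k on ties) can be illustrated with the three LDOF rows above. This is a minimal sketch that works only on the five-decimal values printed here, so a tie means equal printed values. The helper best_k and the row layout are hypothetical, and the full evaluation table covers more values of k than these three.

```python
# LDOF rows copied from the table above: (k, P@n, AP, Max-F1, ROC AUC).
ldof_rows = [
    (73, 0.10526, 0.06734, 0.15854, 0.75647),
    (74, 0.10526, 0.06967, 0.16667, 0.75966),
    (76, 0.08772, 0.06887, 0.15748, 0.76213),
]

# Column index of each measure inside a row tuple.
MEASURES = {"P@n": 1, "AP": 2, "Max-F1": 3, "ROC AUC": 4}


def best_k(rows, column):
    """Return the k with the highest score; ties go to the smallest k."""
    return min(rows, key=lambda row: (-row[column], row[0]))[0]


best = {name: best_k(ldof_rows, column) for name, column in MEASURES.items()}
print(best)  # {'P@n': 73, 'AP': 74, 'Max-F1': 74, 'ROC AUC': 76}
```

P@n is tied between k = 73 and k = 74, so the smaller k is reported. This matches the three values of k listed for LDOF.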

Plots

[Plots: Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO