Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

InternetAds (2% of outliers, version #06)

The data set consists of images from web pages, classified as ads or not. The goal is to learn to remove ads automatically from web pages while retaining regular images. Ads are considered outliers.

Download all data set variants used (6.0 MB). You can also access the original data (ad.data).

Normalized, without duplicates

This version contains 1555 attributes, 1630 objects, 32 outliers (1.96%)

Download raw algorithm results (10.1 MB). Download raw algorithm evaluation table (57.2 kB).

Best Parameters

The following table lists, for each method and evaluation measure, the parameter settings that achieved the best results (overall and per method). When the same score was achieved for several values of k, only the smallest k is given.
The Maximum F1-Measure is provided in addition to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. Max-F1 ROC AUC
KNN 2 0.41563 0.40392 0.43416 0.42283 0.51064 0.50084 0.88472
KNN 3 0.44375 0.43261 0.43740 0.42613 0.52000 0.51039 0.86581
KNNW 1 0.43510 0.42378 0.41653 0.40485 0.45946 0.44864 0.90341
KNNW 2 0.40625 0.39436 0.46600 0.45531 0.53061 0.52121 0.89449
KNNW 4 0.40625 0.39436 0.47411 0.46358 0.52174 0.51216 0.89283
KNNW 8 0.46875 0.45811 0.46468 0.45396 0.51064 0.50084 0.86183
LOF 9 0.31250 0.29873 0.31885 0.30521 0.33333 0.31998 0.91149
LOF 47 0.43750 0.42624 0.36561 0.35291 0.44444 0.43332 0.82805
LOF 55 0.43750 0.42624 0.40922 0.39739 0.47273 0.46217 0.82227
LOF 57 0.43750 0.42624 0.40904 0.39721 0.49057 0.48036 0.81613
SimplifiedLOF 11 0.34375 0.33061 0.36419 0.35146 0.35484 0.34192 0.92063
SimplifiedLOF 29 0.40625 0.39436 0.43275 0.42139 0.44828 0.43723 0.88387
SimplifiedLOF 48 0.43750 0.42624 0.40682 0.39494 0.45161 0.44063 0.84819
SimplifiedLOF 70 0.43750 0.42624 0.40857 0.39673 0.49057 0.48036 0.81734
LoOP 10 0.53125 0.52186 0.46160 0.45082 0.54839 0.53934 0.91519
LoOP 14 0.53125 0.52186 0.47819 0.46774 0.55738 0.54851 0.93586
LoOP 15 0.53125 0.52186 0.47896 0.46853 0.54237 0.53321 0.93727
LoOP 17 0.50000 0.48999 0.47884 0.46840 0.52308 0.51353 0.93844
LDOF 20 0.37500 0.36248 0.37100 0.35840 0.45714 0.44627 0.92626
LDOF 51 0.40625 0.39436 0.43489 0.42358 0.49315 0.48300 0.89451
LDOF 52 0.40625 0.39436 0.42600 0.41450 0.50000 0.48999 0.88766
LDOF 70 0.43750 0.42624 0.42621 0.41472 0.45161 0.44063 0.87038
ODIN 37 0.25000 0.23498 0.18283 0.16646 0.32500 0.31148 0.90310
ODIN 97 0.35000 0.33698 0.20545 0.18954 0.36364 0.35089 0.85835
FastABOD 15 0.07018 0.05156 0.11680 0.09911 0.29139 0.27720 0.89490
FastABOD 23 0.46875 0.45811 0.42221 0.41064 0.50000 0.48999 0.88603
FastABOD 27 0.46875 0.45811 0.40917 0.39734 0.51429 0.50456 0.87964
KDEOS 6 0.25000 0.23498 0.13539 0.11808 0.25806 0.24321 0.78862
KDEOS 7 0.25000 0.23498 0.11568 0.09797 0.27273 0.25816 0.78289
KDEOS 53 0.18750 0.17123 0.16024 0.14342 0.23256 0.21719 0.77742
KDEOS 70 0.12500 0.10748 0.09885 0.08081 0.23333 0.21798 0.82333
LDF 1 0.04688 0.02779 0.03975 0.02052 0.09677 0.07869 0.63128
LDF 3 0.04438 0.02524 0.04101 0.02181 0.09286 0.07469 0.71965
LDF 5 0.02110 0.00149 0.03727 0.01799 0.10060 0.08259 0.74098
LDF 100 0.00000 -0.02003 0.03732 0.01805 0.11354 0.09579 0.64069
INFLO 11 0.37500 0.36248 0.35797 0.34511 0.38356 0.37122 0.92154
INFLO 19 0.43750 0.42624 0.38430 0.37197 0.44444 0.43332 0.90746
INFLO 72 0.43750 0.42624 0.42080 0.40920 0.47458 0.46405 0.84016
INFLO 78 0.43750 0.42624 0.42575 0.41425 0.46429 0.45356 0.83556
COF 2 0.12500 0.10748 0.12039 0.10277 0.14545 0.12834 0.70581
COF 9 0.21875 0.20311 0.12038 0.10277 0.25974 0.24492 0.55256
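
The adjusted columns in the table above appear consistent with a simple adjustment for chance, in which the outlier rate (32 of 1630 objects) is subtracted as the expected value of a random ranking and the result is rescaled so that a perfect score stays at 1. The sketch below is an assumption reconstructed from the tabulated values, not code from the publication; the function name `adjust_for_chance` is hypothetical.

```python
def adjust_for_chance(value: float, n_outliers: int, n_objects: int) -> float:
    """Rescale a measure so that the assumed chance level maps to 0 and a perfect score to 1."""
    # Assumption: the expected value under a random ranking is the outlier rate.
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


# Data set size for the version without duplicates.
n_outliers, n_objects = 32, 1630

# Unadjusted values copied from the KNN (k = 3) and LDF (k = 100) rows.
knn3_p_at_n = adjust_for_chance(0.44375, n_outliers, n_objects)
knn3_ap = adjust_for_chance(0.43740, n_outliers, n_objects)
knn3_max_f1 = adjust_for_chance(0.52000, n_outliers, n_objects)
ldf100_p_at_n = adjust_for_chance(0.00000, n_outliers, n_objects)

print(f"{knn3_p_at_n:.5f} {knn3_ap:.5f} {knn3_max_f1:.5f} {ldf100_p_at_n:.5f}")
# prints: 0.43261 0.42613 0.51039 -0.02003
```

The printed values match the adjusted entries of the corresponding rows, including the negative adjusted P@n for LDF at k = 100, where no outlier appears among the top n objects.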

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity (legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO)

Normalized, with duplicates

This version contains 1555 attributes, 2867 objects, 57 outliers (1.99%)

Download raw algorithm results (12.0 MB). Download raw algorithm evaluation table (65.9 kB).

Best Parameters

The following table lists, for each method and evaluation measure, the parameter settings that achieved the best results (overall and per method). When the same score was achieved for several values of k, only the smallest k is given.
The Maximum F1-Measure is provided in addition to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. Max-F1 ROC AUC
KNN 1 0.39541 0.38315 0.41711 0.40529 0.42105 0.40931 0.91373
KNN 9 0.42105 0.40931 0.42840 0.41680 0.54762 0.53844 0.84268
KNN 10 0.42105 0.40931 0.42822 0.41662 0.56471 0.55588 0.82379
KNN 14 0.42312 0.41141 0.41616 0.40431 0.54762 0.53844 0.78849
KNNW 3 0.40000 0.38783 0.42017 0.40841 0.43077 0.41922 0.90569
KNNW 4 0.43860 0.42721 0.42344 0.41174 0.43860 0.42721 0.90342
KNNW 13 0.42105 0.40931 0.45037 0.43922 0.52273 0.51305 0.87517
KNNW 28 0.42105 0.40931 0.43241 0.42090 0.55814 0.54918 0.81975
LOF 7 0.04746 0.02814 0.04454 0.02515 0.10857 0.09049 0.71983
LOF 8 0.05047 0.03121 0.04708 0.02775 0.10582 0.08768 0.74989
LOF 11 0.04309 0.02368 0.04388 0.02449 0.09231 0.07390 0.77763
SimplifiedLOF 9 0.04925 0.02997 0.04400 0.02461 0.09198 0.07357 0.73260
SimplifiedLOF 12 0.03969 0.02021 0.03968 0.02020 0.07915 0.06047 0.75155
LoOP 1 0.23414 0.21861 0.14121 0.12378 0.25243 0.23726 0.65969
LoOP 27 0.10526 0.08711 0.09877 0.08049 0.19048 0.17406 0.83375
LDOF 73 0.12281 0.10501 0.10942 0.09136 0.15758 0.14049 0.81065
LDOF 76 0.12281 0.10501 0.11538 0.09743 0.17065 0.15383 0.81866
LDOF 80 0.12281 0.10501 0.11109 0.09306 0.15917 0.14211 0.82203
ODIN 35 0.29240 0.27804 0.19905 0.18280 0.36782 0.35499 0.87046
ODIN 85 0.39207 0.37974 0.25721 0.24215 0.44118 0.42984 0.86822
ODIN 86 0.39207 0.37974 0.25726 0.24219 0.44118 0.42984 0.86808
FastABOD 30 0.03968 0.02020 0.06212 0.04310 0.15645 0.13934 0.79754
FastABOD 73 0.04348 0.02408 0.06688 0.04795 0.15810 0.14103 0.79272
FastABOD 86 0.04348 0.02408 0.06642 0.04748 0.16535 0.14842 0.78828
KDEOS 9 0.03333 0.01372 0.04481 0.02543 0.10323 0.08504 0.76083
KDEOS 10 0.03333 0.01372 0.04445 0.02506 0.10368 0.08550 0.75489
KDEOS 73 0.08772 0.06921 0.03795 0.01844 0.09174 0.07332 0.70401
LDF 1 0.05435 0.03517 0.02231 0.00248 0.08772 0.06921 0.34781
LDF 2 0.03788 0.01836 0.03103 0.01138 0.11523 0.09728 0.52704
LDF 11 0.00000 -0.02028 0.02336 0.00355 0.04866 0.02936 0.54471
INFLO 7 0.04754 0.02822 0.04361 0.02421 0.11064 0.09260 0.72545
INFLO 8 0.05055 0.03129 0.04579 0.02643 0.10775 0.08965 0.74564
INFLO 19 0.03933 0.01984 0.03929 0.01980 0.07757 0.05886 0.75098
COF 74 0.12281 0.10501 0.08588 0.06734 0.21053 0.19451 0.78642
COF 76 0.12281 0.10501 0.08690 0.06838 0.20779 0.19172 0.79215
COF 84 0.14035 0.12291 0.06375 0.04476 0.14620 0.12888 0.73334
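
The selection rule described for the table above (best score per measure, with ties resolved in favour of the smallest k) can be sketched as follows. This is an illustration only: the input is limited to the three LDOF rows shown in the table, not the raw evaluation table, whose file format is not described here, and the helper `best_k` is hypothetical.

```python
# LDOF rows copied from the table above: (k, P@n, AP, Max-F1, ROC AUC).
rows = [
    (73, 0.12281, 0.10942, 0.15758, 0.81065),
    (76, 0.12281, 0.11538, 0.17065, 0.81866),
    (80, 0.12281, 0.11109, 0.15917, 0.82203),
]

# Column index of each evaluation measure within a row tuple.
measures = {"P@n": 1, "AP": 2, "Max-F1": 3, "ROC AUC": 4}


def best_k(rows, column):
    """Return the k with the highest score in `column`; ties go to the smallest k."""
    # Sorting key: negated score first (higher is better), then k (smaller is better).
    return min(rows, key=lambda row: (-row[column], row[0]))[0]


best = {name: best_k(rows, column) for name, column in measures.items()}
print(best)
# prints: {'P@n': 73, 'AP': 76, 'Max-F1': 76, 'ROC AUC': 80}
```

All three rows share the same P@n, so the tie-break keeps k = 73 for that measure, while the remaining measures each have a unique maximum. This is why a single method can contribute several rows to the table.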

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity (legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO)