Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

InternetAds (10% outliers, version #06)

The data set consists of images from web pages, classified as ads or not. The goal is to learn to remove ads automatically from web pages while retaining regular images. Ads are considered outliers.

Download all data set variants used (6.0 MB). You can also access the original data (ad.data).

Normalized, without duplicates

This version contains 1555 attributes and 1775 objects, of which 177 are outliers (9.97%).

Download raw algorithm results (13.1 MB). Download raw algorithm evaluation table (72.7 kB).

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved more than once, only the smallest k is given.
The Maximum F1-Measure is provided as a complement to the measures in the original publication.

Algorithm        k  P@n      Adj. P@n  AP       Adj. AP  Max-F1   Adj. MF1  ROC AUC
KNN              4  0.47497  0.41681   0.51728  0.46381  0.48020  0.42262   0.81989
KNN              5  0.48424  0.42711   0.50893  0.45454  0.49020  0.43373   0.80641
KNN             22  0.45865  0.39869   0.47558  0.41749  0.52143  0.46842   0.70469
KNNW             6  0.45763  0.39755   0.48268  0.42538  0.48426  0.42714   0.82275
KNNW            14  0.47458  0.41638   0.52484  0.47221  0.50000  0.44462   0.79141
KNNW            23  0.48588  0.42893   0.51664  0.46310  0.50965  0.45534   0.76222
KNNW            35  0.46893  0.41010   0.51019  0.45594  0.52297  0.47013   0.73843
LOF             35  0.43503  0.37245   0.43512  0.37255  0.43889  0.37674   0.77437
LOF             71  0.47458  0.41638   0.52834  0.47610  0.53247  0.48068   0.76620
LOF             96  0.47458  0.41638   0.53251  0.48073  0.57143  0.52396   0.75429
SimplifiedLOF   35  0.46328  0.40383   0.46102  0.40132  0.48101  0.42353   0.79149
SimplifiedLOF   51  0.48588  0.42893   0.50846  0.45402  0.48725  0.43046   0.78892
SimplifiedLOF   96  0.48588  0.42893   0.53960  0.48860  0.55431  0.50494   0.76866
SimplifiedLOF  100  0.48588  0.42893   0.53755  0.48633  0.55479  0.50548   0.76501
LoOP            32  0.46893  0.41010   0.40963  0.34424  0.47872  0.42099   0.76192
LoOP            86  0.45198  0.39128   0.46665  0.40758  0.45977  0.39993   0.78632
LoOP            99  0.44068  0.37873   0.47559  0.41750  0.46349  0.40407   0.78424
LDOF            66  0.47458  0.41638   0.45680  0.39664  0.48000  0.42240   0.78259
LDOF            96  0.44068  0.37873   0.45964  0.39979  0.45036  0.38948   0.78674
ODIN             9  0.21102  0.12363   0.20290  0.11461  0.33898  0.26577   0.70995
ODIN            31  0.33037  0.25620   0.23032  0.14507  0.37727  0.30830   0.69139
ODIN            33  0.33288  0.25899   0.22794  0.14242  0.38051  0.31189   0.69079
ODIN            34  0.33828  0.26498   0.22923  0.14385  0.37762  0.30869   0.69186
FastABOD        17  0.41808  0.35362   0.39581  0.32889  0.45205  0.39136   0.79377
FastABOD        19  0.42373  0.35990   0.39252  0.32523  0.44375  0.38214   0.79726
FastABOD        25  0.44068  0.37873   0.39835  0.33171  0.44318  0.38151   0.79300
FastABOD        26  0.43503  0.37245   0.39842  0.33179  0.43889  0.37674   0.79479
KDEOS           56  0.15254  0.05868   0.17456  0.08313  0.23666  0.15211   0.64354
KDEOS           65  0.19774  0.10888   0.16589  0.07350  0.24544  0.16186   0.66268
KDEOS           66  0.19774  0.10888   0.16960  0.07762  0.24615  0.16266   0.66403
LDF            100  0.20339  0.11515   0.16575  0.07335  0.34312  0.27036   0.65104
INFLO           68  0.44068  0.37873   0.50568  0.45093  0.47510  0.41696   0.78741
INFLO           96  0.48588  0.42893   0.52613  0.47365  0.51634  0.46277   0.78078
INFLO           97  0.48588  0.42893   0.52480  0.47217  0.51803  0.46465   0.77937
INFLO           99  0.49153  0.43521   0.52438  0.47170  0.51634  0.46277   0.77830
COF              4  0.25424  0.17163   0.17565  0.08434  0.25455  0.17198   0.61199
COF              8  0.24859  0.16536   0.19018  0.10049  0.25858  0.17645   0.60684
COF              9  0.25424  0.17163   0.18760  0.09761  0.26875  0.18775   0.56868
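
The adjusted columns appear to follow the usual adjustment for chance, in which the value expected from a random ranking (the outlier rate) is mapped to 0 and a perfect result to 1. The sketch below is an illustration, not the authors' code: the helper name adjust_for_chance is hypothetical, and applying the same formula to Max-F1 is an assumption that the published numbers happen to support. It recomputes the adjusted values of the KNN, k = 4 row from the 177 outliers among 1775 objects.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that a random ranking scores 0 and a perfect one scores 1.

    Assumes the expected value under a random ranking equals the outlier rate
    and that the best achievable value is 1.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1 - expected)


# Unadjusted and published adjusted values, copied from the KNN, k = 4 row.
raw = {"P@n": 0.47497, "AP": 0.51728, "Max-F1": 0.48020}
published = {"P@n": 0.41681, "AP": 0.46381, "Max-F1": 0.42262}

recomputed = {name: adjust_for_chance(v, 177, 1775) for name, v in raw.items()}
max_deviation = max(abs(recomputed[name] - published[name]) for name in raw)

for name in raw:
    print(f"{name}: recomputed {recomputed[name]:.5f}, published {published[name]:.5f}")
```

Under these assumptions the recomputed values agree with the published ones up to rounding in the fifth decimal place.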

Plots

Plots of: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO
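
For readers unfamiliar with the unadjusted measures named above, the following is a minimal sketch of how each can be computed from a list of outlier scores and ground-truth labels. It is not the evaluation code behind the published numbers: the function names and the toy data are hypothetical, and tied scores are simply kept in input order, whereas the published evaluation may treat ties differently.

```python
def rank_labels(scores, labels):
    """Return the labels ordered by decreasing outlier score (ties keep input order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]


def precision_at_n(ranked, n=None):
    """Fraction of true outliers among the top n; n defaults to the number of outliers."""
    n = sum(ranked) if n is None else n
    return sum(ranked[:n]) / n


def average_precision(ranked):
    """Mean of the precision values taken at the rank of each true outlier."""
    hits, total = 0, 0.0
    for pos, is_outlier in enumerate(ranked, start=1):
        if is_outlier:
            hits += 1
            total += hits / pos
    return total / hits  # assumes at least one outlier


def max_f1(ranked):
    """Largest F1 score over all possible cut-off ranks."""
    n_out, hits, best = sum(ranked), 0, 0.0
    for pos, is_outlier in enumerate(ranked, start=1):
        hits += is_outlier
        if hits:
            precision, recall = hits / pos, hits / n_out
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


def roc_auc(ranked):
    """Fraction of (outlier, inlier) pairs in which the outlier is ranked first."""
    n_out = sum(ranked)
    n_in = len(ranked) - n_out
    inliers_seen, misordered = 0, 0
    for is_outlier in ranked:
        if is_outlier:
            misordered += inliers_seen
        else:
            inliers_seen += 1
    return 1 - misordered / (n_out * n_in)


# Toy data: six objects, two of them outliers (label 1).
ranked = rank_labels([0.9, 0.8, 0.7, 0.6, 0.5, 0.4], [1, 0, 1, 0, 0, 0])
results = {
    "P@n": precision_at_n(ranked),
    "AP": average_precision(ranked),
    "Max-F1": max_f1(ranked),
    "ROC AUC": roc_auc(ranked),
}
print(results)
```

On the toy ranking, one of the two outliers is in the top two positions and a single inlier is ranked above an outlier, so P@n is 1/2 and ROC AUC is 7/8.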

Normalized, with duplicates

This version contains 1555 attributes and 3122 objects, of which 312 are outliers (9.99%).

Download raw algorithm results (13.7 MB). Download raw algorithm evaluation table (73.8 kB).

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved more than once, only the smallest k is given.
The Maximum F1-Measure is provided as a complement to the measures in the original publication.

Algorithm        k  P@n      Adj. P@n  AP       Adj. AP  Max-F1   Adj. MF1  ROC AUC
KNN              5  0.40967  0.34413   0.42418  0.36025  0.42331  0.35928   0.83929
KNN             14  0.47918  0.42135   0.52125  0.46810  0.50903  0.45451   0.79753
KNN             16  0.48123  0.42363   0.50766  0.45300  0.48668  0.42968   0.78615
KNNW            15  0.41346  0.34834   0.46878  0.40980  0.45305  0.39232   0.83827
KNNW            30  0.47756  0.41956   0.51179  0.45759  0.48494  0.42776   0.81500
KNNW            33  0.49038  0.43380   0.51111  0.45682  0.49038  0.43380   0.81010
KNNW            44  0.48718  0.43024   0.50794  0.45331  0.49673  0.44085   0.79510
LOF              7  0.12708  0.03016   0.13856  0.04291  0.25065  0.16745   0.64225
LOF              9  0.13953  0.04400   0.14126  0.04592  0.26506  0.18346   0.63101
LOF             11  0.12558  0.02849   0.13641  0.04053  0.26731  0.18596   0.64085
SimplifiedLOF    9  0.13655  0.04068   0.13027  0.03370  0.23085  0.14545   0.60567
SimplifiedLOF   16  0.12729  0.03039   0.12811  0.03130  0.23972  0.15531   0.61481
SimplifiedLOF   61  0.12672  0.02976   0.12669  0.02973  0.23083  0.14543   0.61685
LoOP            55  0.20833  0.12043   0.15736  0.06380  0.27182  0.19097   0.65947
LoOP            73  0.20833  0.12043   0.16583  0.07321  0.27457  0.19403   0.67313
LoOP            81  0.19551  0.10619   0.16301  0.07008  0.30264  0.22521   0.67585
LoOP            92  0.17949  0.08838   0.15996  0.06669  0.28283  0.20320   0.67868
LDOF            75  0.20833  0.12043   0.15838  0.06494  0.27920  0.19917   0.65822
LDOF            99  0.20192  0.11331   0.16234  0.06933  0.29125  0.21255   0.67594
LDOF           100  0.20192  0.11331   0.16235  0.06934  0.29125  0.21255   0.67561
ODIN            31  0.34259  0.26960   0.22765  0.14189  0.38235  0.31377   0.71430
ODIN            37  0.37137  0.30157   0.23601  0.15118  0.39815  0.33132   0.70931
ODIN            44  0.39066  0.32300   0.23694  0.15221  0.39350  0.32615   0.70368
ODIN            87  0.36868  0.29858   0.24635  0.16267  0.37811  0.30906   0.70277
FastABOD        33  0.14103  0.04565   0.17761  0.08630  0.32232  0.24708   0.72631
FastABOD        73  0.16026  0.06702   0.18048  0.08949  0.32512  0.25019   0.72400
FastABOD        88  0.14423  0.04921   0.17891  0.08774  0.33232  0.25819   0.72334
FastABOD       100  0.14744  0.05277   0.18165  0.09078  0.32858  0.25403   0.72589
KDEOS            5  0.08852  -0.01269  0.10424  0.00479  0.23744  0.15277   0.53896
KDEOS            6  0.12179  0.02429   0.11365  0.01523  0.23621  0.15140   0.56370
KDEOS           10  0.09936  -0.00064  0.12253  0.02511  0.23291  0.14774   0.59744
LDF              2  0.23397  0.14892   0.12787  0.03103  0.24654  0.16288   0.47302
LDF             30  0.09359  -0.00705  0.10558  0.00627  0.18567  0.09526   0.53238
INFLO            9  0.13953  0.04400   0.13302  0.03676  0.24649  0.16283   0.62183
INFLO           16  0.12729  0.03039   0.12802  0.03120  0.24291  0.15885   0.62399
COF             37  0.14103  0.04565   0.14195  0.04668  0.25033  0.16709   0.65795
COF             84  0.19872  0.10975   0.15559  0.06183  0.26945  0.18834   0.65698
COF             86  0.20833  0.12043   0.15506  0.06124  0.26395  0.18223   0.64644
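
The selection rule stated above (best score per measure, smallest k on ties) can be sketched as follows. This is an illustration under stated assumptions, not the script that produced the table: the names parse and best_k are hypothetical, the input is limited to the three LDOF rows shown above rather than the full evaluation table, and ties are detected on the rounded published values.

```python
COLUMNS = ["P@n", "Adj. P@n", "AP", "Adj. AP", "Max-F1", "Adj. MF1", "ROC AUC"]

# LDOF rows copied from the table of the variant with duplicates.
TABLE = """\
LDOF 75 0.20833 0.12043 0.15838 0.06494 0.27920 0.19917 0.65822
LDOF 99 0.20192 0.11331 0.16234 0.06933 0.29125 0.21255 0.67594
LDOF 100 0.20192 0.11331 0.16235 0.06934 0.29125 0.21255 0.67561
"""


def parse(table):
    """Split whitespace-separated rows into (algorithm, k, {measure: value}) tuples."""
    rows = []
    for line in table.splitlines():
        name, k, *values = line.split()
        rows.append((name, int(k), dict(zip(COLUMNS, map(float, values)))))
    return rows


def best_k(rows, measure):
    """Return the k with the highest score for the measure; ties go to the smallest k."""
    return min(rows, key=lambda row: (-row[2][measure], row[1]))[1]


rows = parse(TABLE)
for measure in ("P@n", "AP", "Max-F1", "ROC AUC"):
    print(measure, "is best at k =", best_k(rows, measure))
```

In these rows k = 99 and k = 100 share the same rounded Max-F1, so the rule picks k = 99 for that measure, while k = 100 still appears in the table because it has the highest AP.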

Plots

Plots of: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO