Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

InternetAds (10% of outliers, version #03)

The data set consists of images from web pages, classified as ads or not. The goal is to learn to remove ads automatically from web pages while retaining regular images. Ads are considered outliers.

Download all data set variants used (6.0 MB). The original data (ad.data) is also available.

Normalized, without duplicates

This version contains 1555 attributes and 1775 objects, of which 177 are outliers (9.97%).

Download raw algorithm results (13.0 MB). Download raw algorithm evaluation table (73.2 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved more than once, only the smallest k is given).
The Maximum F1-Measure is a complementary measure, given in addition to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 5 0.46893 0.41010 0.48680 0.42995 0.46900 0.41019 0.78810
KNN 7 0.47511 0.41697 0.48736 0.43058 0.48649 0.42961 0.75623
KNN 8 0.46465 0.40536 0.49594 0.44011 0.50196 0.44680 0.74764
KNN 20 0.46147 0.40182 0.47178 0.41327 0.51948 0.46626 0.69799
KNNW 7 0.42373 0.35990 0.44715 0.38591 0.44988 0.38894 0.78542
KNNW 14 0.48588 0.42893 0.50533 0.45054 0.48588 0.42893 0.76708
KNNW 21 0.46328 0.40383 0.50960 0.45528 0.50000 0.44462 0.74786
KNNW 40 0.47458 0.41638 0.50559 0.45082 0.52857 0.47635 0.71732
LOF 39 0.41243 0.34735 0.43506 0.37249 0.45136 0.39059 0.76525
LOF 97 0.48023 0.42265 0.52510 0.47249 0.55351 0.50405 0.74206
LOF 99 0.47458 0.41638 0.52519 0.47260 0.55882 0.50996 0.74077
SimplifiedLOF 39 0.43503 0.37245 0.45408 0.39362 0.47020 0.41152 0.77945
SimplifiedLOF 97 0.49153 0.43521 0.52822 0.47596 0.53957 0.48857 0.75532
SimplifiedLOF 99 0.49153 0.43521 0.52864 0.47643 0.54152 0.49073 0.75397
LoOP 35 0.44633 0.38500 0.37989 0.31121 0.45143 0.39067 0.73743
LoOP 46 0.44068 0.37873 0.39381 0.32667 0.45614 0.39590 0.75242
LoOP 95 0.41808 0.35362 0.44937 0.38838 0.43529 0.37275 0.76722
LoOP 100 0.42938 0.36617 0.45215 0.39146 0.44444 0.38291 0.76641
LDOF 66 0.43503 0.37245 0.42165 0.35759 0.44444 0.38291 0.75567
LDOF 75 0.42938 0.36617 0.43735 0.37503 0.44444 0.38291 0.75698
LDOF 97 0.42938 0.36617 0.44495 0.38347 0.43396 0.37127 0.76980
LDOF 99 0.42938 0.36617 0.44492 0.38343 0.43125 0.36825 0.76984
ODIN 18 0.25440 0.17181 0.18458 0.09426 0.29362 0.21538 0.66513
ODIN 32 0.28043 0.20073 0.19077 0.10114 0.30400 0.22691 0.65808
ODIN 33 0.28110 0.20148 0.18959 0.09983 0.30707 0.23032 0.65737
ODIN 36 0.26601 0.18471 0.18809 0.09816 0.30801 0.23136 0.65948
FastABOD 25 0.42938 0.36617 0.38888 0.32119 0.46358 0.40416 0.76438
FastABOD 26 0.43503 0.37245 0.38445 0.31627 0.46557 0.40638 0.76187
FastABOD 28 0.44068 0.37873 0.38583 0.31780 0.45860 0.39863 0.76636
KDEOS 13 0.24294 0.15908 0.16158 0.06871 0.24642 0.16295 0.61130
KDEOS 63 0.18644 0.09633 0.15955 0.06646 0.26644 0.18519 0.65330
KDEOS 72 0.20339 0.11515 0.16864 0.07656 0.25352 0.17084 0.65790
LDF 99 0.37288 0.30342 0.20480 0.11672 0.41432 0.34945 0.66994
INFLO 56 0.40678 0.34107 0.46546 0.40625 0.42449 0.36074 0.76959
INFLO 99 0.48023 0.42265 0.50889 0.45449 0.50000 0.44462 0.76453
INFLO 100 0.48023 0.42265 0.50904 0.45466 0.49853 0.44299 0.76386
COF 4 0.18644 0.09633 0.16063 0.06765 0.24592 0.16239 0.62412
COF 7 0.20339 0.11515 0.14842 0.05410 0.21348 0.12637 0.55102
COF 54 0.19209 0.10260 0.14620 0.05163 0.25243 0.16962 0.57413
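
The "Adj." columns can be related to their unadjusted counterparts with a short calculation. The sketch below assumes the usual adjustment for chance, (value - |O|/N) / (1 - |O|/N), where |O| is the number of outliers and N the number of objects. The formula is not stated on this page and is an assumption here; the function name `adjust_for_chance` is hypothetical. Applied to the "KNN 5" row it reproduces the tabulated adjusted values up to rounding of the inputs.

```python
def adjust_for_chance(value: float, n_outliers: int, n_objects: int) -> float:
    """Rescale a measure so that the outlier rate |O|/N maps to 0 and a perfect score to 1.

    Assumed formula (not stated on this page): (value - |O|/N) / (1 - |O|/N).
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


# Data set size of the variant without duplicates, as stated above.
N_OBJECTS, N_OUTLIERS = 1775, 177

# Unadjusted values copied from the "KNN 5" row: P@n, AP, Max-F1.
knn5 = {"P@n": 0.46893, "AP": 0.48680, "Max-F1": 0.46900}

adjusted = {
    name: adjust_for_chance(value, N_OUTLIERS, N_OBJECTS)
    for name, value in knn5.items()
}

for name, value in adjusted.items():
    print(f"Adj. {name}: {value:.5f}")
```

Because the inputs are themselves rounded to five decimals, the last printed digit may differ by one from the table.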

Plots

Plots (not reproduced here) are provided for: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO
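
For readers unfamiliar with the measures named above, the following is a minimal pure-Python sketch of how the unadjusted ones are commonly computed from outlier scores and ground-truth labels. It is an illustration under stated assumptions, not the code used to produce the results on this page: the toy `scores` and `labels` are made up, higher scores are assumed to mean "more outlying", and tied scores are ranked in input order.

```python
def ranked_labels(scores, labels):
    """Return the labels ordered by decreasing outlier score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]


def precision_at_n(scores, labels, n=None):
    """Fraction of true outliers among the top n; n defaults to the number of outliers."""
    n = sum(labels) if n is None else n
    return sum(ranked_labels(scores, labels)[:n]) / n


def average_precision(scores, labels):
    """Mean of the precision values taken at the rank of each true outlier."""
    hits, total = 0, 0.0
    for rank, is_outlier in enumerate(ranked_labels(scores, labels), start=1):
        if is_outlier:
            hits += 1
            total += hits / rank
    return total / hits


def max_f1(scores, labels):
    """Largest F1 score over all cut-off ranks; F1 = 2*TP / (rank + number of outliers)."""
    n_outliers = sum(labels)
    best, hits = 0.0, 0
    for rank, is_outlier in enumerate(ranked_labels(scores, labels), start=1):
        hits += is_outlier
        best = max(best, 2 * hits / (rank + n_outliers))
    return best


def roc_auc(scores, labels):
    """Probability that a random outlier scores higher than a random inlier (ties count 0.5)."""
    outliers = [s for s, y in zip(scores, labels) if y]
    inliers = [s for s, y in zip(scores, labels) if not y]
    wins = sum((o > i) + 0.5 * (o == i) for o in outliers for i in inliers)
    return wins / (len(outliers) * len(inliers))


# Toy example (made-up values): 10 objects, 3 of them outliers (label 1).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

print(precision_at_n(scores, labels))
print(average_precision(scores, labels))
print(max_f1(scores, labels))
print(roc_auc(scores, labels))
```

The pairwise ROC AUC is quadratic in the number of objects, which is fine for a toy example but would be replaced by a rank-based computation on data of the size used here.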

Normalized, with duplicates

This version contains 1555 attributes and 3122 objects, of which 312 are outliers (9.99%).

Download raw algorithm results (13.7 MB). Download raw algorithm evaluation table (73.8 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved more than once, only the smallest k is given).
The Maximum F1-Measure is a complementary measure, given in addition to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 8 0.46637 0.40711 0.52837 0.47600 0.48244 0.42497 0.84991
KNN 13 0.46256 0.40288 0.53009 0.47791 0.51009 0.45570 0.80494
KNN 15 0.46717 0.40801 0.51710 0.46348 0.50564 0.45075 0.79141
KNNW 13 0.43269 0.36970 0.47450 0.41615 0.46279 0.40314 0.84438
KNNW 25 0.46474 0.40531 0.52242 0.46939 0.47147 0.41279 0.82736
KNNW 46 0.48397 0.42668 0.51005 0.45565 0.48631 0.42928 0.79472
LOF 9 0.13649 0.04061 0.14508 0.05016 0.26600 0.18451 0.66000
LOF 10 0.13672 0.04087 0.14389 0.04883 0.26081 0.17874 0.65651
SimplifiedLOF 16 0.13136 0.03492 0.13286 0.03658 0.24702 0.16341 0.63415
SimplifiedLOF 17 0.13419 0.03805 0.13348 0.03727 0.24741 0.16385 0.63084
LoOP 74 0.21154 0.12399 0.17019 0.07806 0.28073 0.20087 0.67961
LoOP 76 0.21154 0.12399 0.17191 0.07996 0.28777 0.20869 0.68438
LoOP 85 0.19872 0.10975 0.16969 0.07749 0.30263 0.22520 0.68753
LoOP 91 0.19872 0.10975 0.16909 0.07684 0.30083 0.22320 0.68812
LDOF 78 0.19872 0.10975 0.16800 0.07563 0.28288 0.20326 0.67063
LDOF 95 0.19231 0.10263 0.17291 0.08108 0.30044 0.22276 0.68663
LDOF 100 0.19231 0.10263 0.17293 0.08110 0.30416 0.22690 0.68590
ODIN 23 0.27103 0.19010 0.22497 0.13892 0.37101 0.30117 0.72841
ODIN 78 0.38668 0.31858 0.24618 0.16248 0.39073 0.32308 0.71758
ODIN 81 0.38660 0.31849 0.24749 0.16394 0.39203 0.32452 0.71822
ODIN 85 0.38269 0.31415 0.24955 0.16623 0.38796 0.32000 0.71755
FastABOD 35 0.12179 0.02429 0.18302 0.09231 0.34448 0.27169 0.73347
FastABOD 74 0.16346 0.07058 0.18820 0.09807 0.33981 0.26650 0.73534
FastABOD 99 0.15385 0.05990 0.18923 0.09921 0.33708 0.26347 0.73810
FastABOD 100 0.15705 0.06346 0.18931 0.09929 0.33683 0.26319 0.73807
KDEOS 4 0.09936 -0.00064 0.10738 0.00827 0.23983 0.15542 0.54586
KDEOS 18 0.10256 0.00292 0.12000 0.02229 0.23475 0.14978 0.60726
KDEOS 77 0.11859 0.02072 0.11734 0.01933 0.22057 0.13402 0.58424
KDEOS 87 0.11538 0.01716 0.12186 0.02436 0.22565 0.13967 0.60241
LDF 1 0.21314 0.12577 0.11167 0.01304 0.21691 0.12996 0.40446
LDF 2 0.19231 0.10263 0.11586 0.01770 0.21785 0.13100 0.42387
LDF 29 0.15025 0.05590 0.11132 0.01265 0.18616 0.09579 0.53387
INFLO 10 0.13672 0.04087 0.13468 0.03860 0.25741 0.17496 0.63439
INFLO 16 0.13136 0.03492 0.13325 0.03701 0.25999 0.17782 0.63918
INFLO 17 0.13419 0.03805 0.13484 0.03878 0.25608 0.17348 0.64393
COF 39 0.13462 0.03853 0.14348 0.04838 0.25341 0.17052 0.65693
COF 84 0.20192 0.11331 0.15700 0.06341 0.26374 0.18199 0.64572
COF 86 0.19872 0.10975 0.15694 0.06333 0.27023 0.18920 0.65263
COF 100 0.22436 0.13824 0.14980 0.05540 0.24200 0.15784 0.63162
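
The selection rule stated above the table (best score per method and measure, smallest k on ties) can be sketched as follows. The four LoOP rows are copied from the table above. The helper names `parse_rows` and `best_k` are hypothetical, and the sketch operates only on the rows listed here, not on the full raw results.

```python
MEASURES = ["P@n", "Adj. P@n", "AP", "Adj. AP", "Max-F1", "Adj. MF1", "ROC AUC"]

# LoOP rows copied from the table above (variant with duplicates).
ROWS = """\
LoOP 74 0.21154 0.12399 0.17019 0.07806 0.28073 0.20087 0.67961
LoOP 76 0.21154 0.12399 0.17191 0.07996 0.28777 0.20869 0.68438
LoOP 85 0.19872 0.10975 0.16969 0.07749 0.30263 0.22520 0.68753
LoOP 91 0.19872 0.10975 0.16909 0.07684 0.30083 0.22320 0.68812
"""


def parse_rows(text):
    """Parse whitespace-delimited rows into (method, k, {measure: value}) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 + len(MEASURES):
            continue  # skip blank or malformed lines
        values = dict(zip(MEASURES, map(float, parts[2:])))
        rows.append((parts[0], int(parts[1]), values))
    return rows


def best_k(rows, method, measure):
    """Return the k with the highest score for a method; ties go to the smallest k."""
    candidates = [(-values[measure], k) for name, k, values in rows if name == method]
    return min(candidates)[1]


rows = parse_rows(ROWS)
for measure in MEASURES:
    print(f"LoOP best k for {measure}: {best_k(rows, 'LoOP', measure)}")
```

Sorting on the pair (negated score, k) implements both criteria at once: the minimum has the highest score, and among equal scores the smallest k. For P@n, k = 74 and k = 76 tie at 0.21154, so 74 is selected.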

Plots

Plots (not reproduced here) are provided for: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO