Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

InternetAds (10% of outliers, version #01)

The data set consists of images from web pages, classified as ads or not. The goal is to learn to remove ads automatically from web pages while retaining regular images. Ads are considered outliers.

Download all data set variants used (6.0 MB). You can also access the original data (ad.data).

Normalized, without duplicates

This version contains 1555 attributes, 1775 objects, and 177 outliers (9.97%).

Download raw algorithm results (13.0 MB)
Download raw algorithm evaluation table (73.3 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is provided as a supplement to the measures in the original publication.

Algorithm      k    P@n      Adj. P@n  AP       Adj. AP   Max-F1   Adj. Max-F1  ROC AUC
KNN            5    0.46973  0.41100   0.50002  0.44464   0.48951  0.43297      0.79598
KNN            21   0.45297  0.39238   0.47541  0.41730   0.52381  0.47107      0.70221
KNNW           7    0.40113  0.33480   0.45347  0.39293   0.43961  0.37754      0.80246
KNNW           12   0.46893  0.41010   0.49694  0.44122   0.48297  0.42570      0.78681
KNNW           20   0.46893  0.41010   0.50467  0.44981   0.50000  0.44462      0.76044
KNNW           42   0.45763  0.39755   0.49433  0.43832   0.51079  0.45660      0.72275
LOF            38   0.40678  0.34107   0.42118  0.35707   0.41245  0.34737      0.76839
LOF            94   0.47458  0.41638   0.52605  0.47355   0.55094  0.50120      0.75205
LOF            95   0.48023  0.42265   0.52643  0.47397   0.55094  0.50120      0.75081
SimplifiedLOF  48   0.46893  0.41010   0.47759  0.41973   0.48276  0.42547      0.78535
SimplifiedLOF  94   0.48588  0.42893   0.53078  0.47881   0.53901  0.48795      0.76582
SimplifiedLOF  100  0.48588  0.42893   0.52784  0.47554   0.54483  0.49441      0.76128
LoOP           31   0.45198  0.39128   0.38935  0.32171   0.45198  0.39128      0.74193
LoOP           36   0.43503  0.37245   0.39868  0.33207   0.46027  0.40049      0.75302
LoOP           92   0.41243  0.34735   0.44917  0.38815   0.42907  0.36583      0.77646
LoOP           100  0.42373  0.35990   0.45410  0.39364   0.43986  0.37782      0.77553
LDOF           21   0.43503  0.37245   0.33241  0.25846   0.43503  0.37245      0.74344
LDOF           66   0.43503  0.37245   0.42407  0.36027   0.44382  0.38222      0.76475
LDOF           94   0.40678  0.34107   0.44300  0.38131   0.42894  0.36569      0.77994
ODIN           13   0.27405  0.19364   0.21156  0.12423   0.34982  0.27781      0.70386
ODIN           26   0.31483  0.23893   0.23158  0.14647   0.38479  0.31664      0.69150
ODIN           35   0.31759  0.24200   0.23289  0.14792   0.37915  0.31038      0.69189
ODIN           99   0.33161  0.25758   0.22058  0.13425   0.34254  0.26972      0.68188
FastABOD       18   0.43503  0.37245   0.38107  0.31252   0.45485  0.39447      0.77283
FastABOD       20   0.44633  0.38500   0.38371  0.31544   0.45062  0.38977      0.77377
FastABOD       25   0.43503  0.37245   0.38488  0.31674   0.45424  0.39379      0.77664
KDEOS          64   0.23164  0.14653   0.17123  0.07943   0.26054  0.17863      0.66523
KDEOS          65   0.22599  0.14026   0.17778  0.08670   0.26496  0.18354      0.66395
LDF            99   0.30508  0.22811   0.17804  0.08700   0.37352  0.30413      0.66369
LDF            100  0.31638  0.24066   0.17931  0.08841   0.37264  0.30315      0.66386
INFLO          49   0.42373  0.35990   0.45934  0.39945   0.42991  0.36676      0.78142
INFLO          94   0.49718  0.44148   0.51405  0.46023   0.52201  0.46907      0.77493
INFLO          100  0.48023  0.42265   0.51007  0.45581   0.52733  0.47498      0.77038
COF            5    0.21469  0.12771   0.16880  0.07673   0.25108  0.16813      0.62593
COF            9    0.27119  0.19046   0.18659  0.09650   0.29012  0.21150      0.57606
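
The selection rule stated before the table (best score per measure, with ties resolved in favor of the smallest k) can be illustrated with a minimal Python sketch. It uses only the three LOF rows listed in the table, not the full raw results, so it shows the rule rather than reproducing the original selection. The names LOF_ROWS, MEASURES, and best_k are hypothetical and introduced here for illustration only.

```python
# Rows copied from the LOF entries of the table above:
# (k, P@n, AP, Max-F1, ROC AUC). Adjusted columns are omitted for brevity.
LOF_ROWS = [
    (38, 0.40678, 0.42118, 0.41245, 0.76839),
    (94, 0.47458, 0.52605, 0.55094, 0.75205),
    (95, 0.48023, 0.52643, 0.55094, 0.75081),
]

# Column index of each measure inside a row tuple (hypothetical layout).
MEASURES = {"P@n": 1, "AP": 2, "Max-F1": 3, "ROC AUC": 4}


def best_k(rows, measure):
    """Return the k with the highest score; ties go to the smallest k."""
    col = MEASURES[measure]
    # Sort key: higher score first (negated), then smaller k.
    best = min(rows, key=lambda row: (-row[col], row[0]))
    return best[0]


for name in MEASURES:
    print(name, best_k(LOF_ROWS, name))
```

Among these rows, Max-F1 takes the same value for k=94 and k=95, so the tie rule selects k=94, while P@n and AP are highest at k=95 and ROC AUC at k=38.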

Plots

[Plots not reproduced here: Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Normalized, with duplicates

This version contains 1555 attributes, 3122 objects, and 312 outliers (9.99%).

Download raw algorithm results (13.7 MB)
Download raw algorithm evaluation table (74.0 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is provided as a supplement to the measures in the original publication.

Algorithm      k    P@n      Adj. P@n  AP       Adj. AP   Max-F1   Adj. Max-F1  ROC AUC
KNN            8    0.47516  0.41689   0.52764  0.47519   0.47619  0.41803      0.85807
KNN            9    0.49213  0.43574   0.55621  0.50693   0.49753  0.44174      0.85340
KNN            13   0.46267  0.40301   0.52708  0.47457   0.50916  0.45466      0.81570
KNNW           14   0.43269  0.36970   0.48006  0.42233   0.45400  0.39338      0.85314
KNNW           26   0.47115  0.41243   0.53226  0.48032   0.47988  0.42213      0.83696
KNNW           88   0.49359  0.43736   0.50834  0.45375   0.49435  0.43820      0.77574
KNNW           99   0.49359  0.43736   0.50382  0.44873   0.49675  0.44088      0.77038
LOF            9    0.12799  0.03117   0.14174  0.04645   0.27019  0.18916      0.65109
LOF            10   0.12995  0.03335   0.14094  0.04556   0.26893  0.18776      0.64868
SimplifiedLOF  8    0.11869  0.02084   0.12116  0.02359   0.23737  0.15270      0.59128
SimplifiedLOF  10   0.12995  0.03335   0.12938  0.03272   0.23694  0.15221      0.61224
SimplifiedLOF  65   0.12541  0.02830   0.12553  0.02843   0.22994  0.14444      0.61322
LoOP           1    0.22115  0.13468   0.14720  0.05251   0.24082  0.15653      0.58158
LoOP           73   0.18910  0.09907   0.16286  0.06991   0.27054  0.18955      0.65874
LoOP           83   0.20192  0.11331   0.16285  0.06990   0.29534  0.21710      0.66810
LoOP           100  0.19551  0.10619   0.15943  0.06610   0.27248  0.19170      0.66916
LDOF           79   0.20513  0.11687   0.16297  0.07003   0.28433  0.20487      0.64906
LDOF           80   0.20513  0.11687   0.16357  0.07070   0.28655  0.20733      0.64965
LDOF           100  0.20513  0.11687   0.16550  0.07284   0.28182  0.20208      0.66523
ODIN           16   0.22752  0.14175   0.19784  0.10877   0.33415  0.26022      0.71676
ODIN           31   0.34160  0.26850   0.22954  0.14399   0.38251  0.31395      0.71121
ODIN           47   0.37227  0.30257   0.23739  0.15272   0.37544  0.30610      0.69623
ODIN           89   0.37067  0.30080   0.24314  0.15910   0.37968  0.31080      0.69696
FastABOD       73   0.16346  0.07058   0.18969  0.09972   0.33283  0.25875      0.74467
FastABOD       90   0.15705  0.06346   0.18916  0.09913   0.34244  0.26942      0.74505
FastABOD       100  0.16026  0.06702   0.19160  0.10184   0.33968  0.26636      0.74701
KDEOS          2    0.09615  -0.00420  0.11752  0.01954   0.25779  0.17538      0.57977
KDEOS          11   0.09936  -0.00064  0.12106  0.02347   0.23308  0.14793      0.59600
KDEOS          76   0.11859  0.02072   0.11574  0.01756   0.21585  0.12878      0.56921
LDF            1    0.21154  0.12399   0.12341  0.02608   0.21441  0.12718      0.40666
LDF            4    0.20192  0.11331   0.12248  0.02504   0.22222  0.13586      0.46563
LDF            63   0.10183  0.00210   0.10183  0.00210   0.18483  0.09432      0.51032
INFLO          10   0.12995  0.03335   0.12950  0.03284   0.25366  0.17079      0.61805
INFLO          11   0.12761  0.03075   0.12894  0.03222   0.25203  0.16898      0.62001
COF            2    0.04494  -0.06110  0.10768  0.00860   0.25739  0.17494      0.57864
COF            50   0.15385  0.05990   0.14433  0.04932   0.24299  0.15894      0.64673
COF            79   0.21795  0.13112   0.15491  0.06107   0.25000  0.16673      0.64000
COF            80   0.21795  0.13112   0.15714  0.06355   0.25139  0.16827      0.64500
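
The adjusted columns in the table above appear consistent with an adjustment for chance of the form (value - expected) / (1 - expected), where the expected value for a random ranking is taken to be the outlier rate (312 / 3122 for this variant). This reading is an assumption checked only against the rows quoted below, and the original publication remains the authority on the definitions. The function name adjust_for_chance and the dictionary CHECKS are hypothetical, introduced here for illustration.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so the assumed chance level maps to 0 and a perfect score to 1."""
    expected = n_outliers / n_objects  # assumed expected value under a random ranking
    return (value - expected) / (1.0 - expected)


# (raw value, listed adjusted value) pairs copied from the table above
# (3122 objects, 312 outliers).
CHECKS = {
    "KNN k=8, P@n": (0.47516, 0.41689),
    "KNN k=8, AP": (0.52764, 0.47519),
    "KNN k=8, Max-F1": (0.47619, 0.41803),
    "KDEOS k=2, P@n": (0.09615, -0.00420),
    "COF k=2, P@n": (0.04494, -0.06110),
}

for label, (raw, listed) in CHECKS.items():
    recomputed = adjust_for_chance(raw, 312, 3122)
    print(f"{label}: recomputed {recomputed:.5f}, listed {listed:.5f}")
```

Because the raw values are themselves rounded to five decimals, the recomputed numbers can differ from the listed ones in the last digit. Under this reading, the negative adjusted values for KDEOS and COF at k=2 correspond to a P@n below the outlier rate, that is, below what a random ranking would be expected to achieve.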

Plots

[Plots not reproduced here: Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO