Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

InternetAds (5% of outliers version#04)

The data set consists of images from web pages, classified as ads or not. The goal is to learn to remove ads automatically from web pages while retaining regular images. Ads are considered outliers.

Download all data set variants used (6.0 MB). You can also access the original data (ad.data).

Normalized, without duplicates

This version contains 1555 attributes, 1682 objects, 84 outliers (4.99%)

Download raw algorithm results (10.5 MB). Download raw algorithm evaluation table (67.4 kB).

Best Parameters

The following table lists the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is provided as a complement to the measures in the original publication.

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN              2   0.42560   0.39540   0.43113   0.40123   0.46774   0.43976   0.83362
KNN              3   0.42143   0.39102   0.43114   0.40124   0.46875   0.44082   0.82349
KNN             13   0.40189   0.37045   0.38467   0.35233   0.47059   0.44276   0.70664
KNNW             4   0.41667   0.38600   0.40204   0.37061   0.43312   0.40332   0.83576
KNNW             8   0.44048   0.41106   0.42985   0.39988   0.47244   0.44471   0.81329
KNNW            15   0.42857   0.39853   0.43334   0.40356   0.48276   0.45557   0.77083
KNNW            17   0.44048   0.41106   0.43477   0.40505   0.47826   0.45084   0.76200
LOF             21   0.35714   0.32335   0.37659   0.34382   0.40000   0.36846   0.82053
LOF             39   0.45238   0.42359   0.42862   0.39859   0.48227   0.45505   0.80181
LOF             50   0.44048   0.41106   0.44405   0.41483   0.51852   0.49321   0.77928
SimplifiedLOF   21   0.39286   0.36094   0.39990   0.36835   0.41830   0.38772   0.83675
SimplifiedLOF   33   0.47619   0.44866   0.44947   0.42053   0.50331   0.47720   0.83315
SimplifiedLOF   50   0.47619   0.44866   0.46002   0.43163   0.51064   0.48491   0.80198
SimplifiedLOF   51   0.46429   0.43613   0.46068   0.43233   0.50360   0.47750   0.80513
LoOP            39   0.36905   0.33588   0.37846   0.34579   0.40909   0.37803   0.84009
LoOP            92   0.47619   0.44866   0.43365   0.40388   0.47619   0.44866   0.80544
LoOP           100   0.47619   0.44866   0.43555   0.40588   0.48521   0.45815   0.80108
LDOF            33   0.38095   0.34841   0.36470   0.33130   0.39785   0.36620   0.85200
LDOF            79   0.44048   0.41106   0.40869   0.37760   0.44311   0.41384   0.81216
LDOF            90   0.44048   0.41106   0.41389   0.38308   0.44970   0.42078   0.81062
ODIN            10   0.22743   0.18682   0.16773   0.12398   0.28470   0.24710   0.78007
ODIN            15   0.26190   0.22311   0.18701   0.14428   0.33180   0.29667   0.76633
ODIN            72   0.32020   0.28446   0.20099   0.15899   0.32787   0.29254   0.74952
ODIN           100   0.31960   0.28383   0.20724   0.16557   0.32955   0.29430   0.74224
FastABOD        10   0.36905   0.33588   0.32740   0.29205   0.40816   0.37705   0.80453
FastABOD        23   0.42857   0.39853   0.34428   0.30981   0.43210   0.40225   0.79406
KDEOS           10   0.20238   0.16045   0.11982   0.07356   0.20809   0.16647   0.70840
KDEOS           11   0.19048   0.14792   0.11646   0.07002   0.21469   0.17341   0.69560
KDEOS           61   0.15476   0.11033   0.13596   0.09054   0.20501   0.16322   0.74701
KDEOS           71   0.16667   0.12286   0.14586   0.10096   0.21198   0.17056   0.73987
LDF              4   0.09809   0.05068   0.08606   0.03802   0.17354   0.13010   0.64473
LDF             99   0.05952   0.01009   0.07549   0.02690   0.19186   0.14938   0.61772
INFLO           33   0.45238   0.42359   0.43178   0.40191   0.48101   0.45373   0.83866
INFLO           79   0.47619   0.44866   0.44620   0.41709   0.49032   0.46353   0.78817
INFLO           84   0.47619   0.44866   0.44626   0.41716   0.49351   0.46688   0.78709
INFLO           93   0.47619   0.44866   0.44787   0.41884   0.49351   0.46688   0.77950
COF              4   0.16667   0.12286   0.11139   0.06468   0.21256   0.17117   0.62823
COF             13   0.21429   0.17298   0.12366   0.07759   0.23611   0.19596   0.52182
COF             69   0.20238   0.16045   0.14070   0.09553   0.21519   0.17394   0.50863
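
The relation between each measure and its adjusted counterpart in the table can be illustrated with a short sketch. It assumes the usual adjustment for chance, (value - expected) / (1 - expected), with the expected value of a random ranking taken as the fraction of outliers in the data set. The function name `adjust_for_chance` is hypothetical. That the same expected value also applies to Max-F1 is an observation from the Adj. MF1 column, not a definition stated on this page. Because the inputs are the rounded values printed in the table, the results may differ from the tabulated ones in the last digit.

```python
def adjust_for_chance(value: float, n_outliers: int, n_objects: int) -> float:
    """Rescale a measure so that a random ranking maps to 0 and a perfect one to 1.

    Assumption: the expected value under a random ranking equals the
    fraction of outliers in the data set.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


# Values copied from the KNN, k = 2 row (1682 objects, 84 outliers).
N_OBJECTS, N_OUTLIERS = 1682, 84
row = {"P@n": 0.42560, "AP": 0.43113, "Max-F1": 0.46774}

adjusted = {
    name: adjust_for_chance(value, N_OUTLIERS, N_OBJECTS)
    for name, value in row.items()
}

for name, value in adjusted.items():
    print(f"Adj. {name}: {value:.5f}")
```

A value of 0 after adjustment corresponds to the performance expected from a random ranking, which is why some adjusted scores for weak methods can be negative.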

Plots

One plot each for: Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity.
Algorithm labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Normalized, with duplicates

This version contains 1555 attributes, 2957 objects, 147 outliers (4.97%)

Download raw algorithm results (12.6 MB). Download raw algorithm evaluation table (72.4 kB).

Best Parameters

The following table lists the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is provided as a complement to the measures in the original publication.

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN              4   0.48586   0.45896   0.52372   0.49880   0.54148   0.51750   0.88720
KNN              6   0.51377   0.48833   0.56354   0.54071   0.59912   0.57815   0.88280
KNN              7   0.50868   0.48297   0.56484   0.54207   0.59821   0.57720   0.87873
KNNW             9   0.47619   0.44879   0.49864   0.47242   0.48352   0.45650   0.88714
KNNW            10   0.49660   0.47026   0.51159   0.48604   0.51613   0.49082   0.88709
KNNW            16   0.48980   0.46311   0.54340   0.51952   0.58182   0.55994   0.87177
KNNW            21   0.48299   0.45595   0.53828   0.51413   0.60000   0.57907   0.85672
LOF              7   0.10185   0.05487   0.10402   0.05715   0.21190   0.17067   0.74751
LOF              8   0.10164   0.05465   0.10785   0.06118   0.23932   0.19952   0.76341
SimplifiedLOF    8   0.11126   0.06477   0.10361   0.05672   0.18962   0.14722   0.74694
LoOP             1   0.19410   0.15195   0.09906   0.05192   0.19649   0.15446   0.60509
LoOP            13   0.17687   0.13381   0.15956   0.11560   0.25231   0.21320   0.78368
LoOP            22   0.15646   0.11233   0.15981   0.11586   0.23905   0.19925   0.77877
LoOP            73   0.15646   0.11233   0.15603   0.11188   0.25940   0.22066   0.77984
LDOF            75   0.17007   0.12665   0.14978   0.10531   0.26915   0.23092   0.77796
LDOF            76   0.16327   0.11949   0.15013   0.10567   0.27197   0.23388   0.77756
LDOF            99   0.16327   0.11949   0.14883   0.10430   0.24516   0.20567   0.78759
ODIN            42   0.32856   0.29343   0.25371   0.21467   0.43602   0.40652   0.83295
ODIN            93   0.41497   0.38436   0.28746   0.25018   0.47383   0.44630   0.82061
ODIN            99   0.42207   0.39184   0.29079   0.25369   0.46961   0.44187   0.81847
ODIN           100   0.42207   0.39184   0.29096   0.25387   0.46961   0.44187   0.81851
FastABOD        35   0.02721  -0.02368   0.11807   0.07193   0.28065   0.24301   0.77071
FastABOD        36   0.02721  -0.02368   0.11796   0.07182   0.28070   0.24307   0.77006
FastABOD        72   0.10204   0.05507   0.12000   0.07396   0.26292   0.22437   0.76661
FastABOD        99   0.10204   0.05507   0.12100   0.07502   0.27169   0.23359   0.76483
KDEOS            2   0.03786  -0.01247   0.06777   0.01900   0.17615   0.13306   0.61472
KDEOS           10   0.06122   0.01211   0.08130   0.03324   0.17500   0.13184   0.69051
KDEOS           73   0.08844   0.04075   0.07428   0.02585   0.13920   0.09417   0.66519
LDF              1   0.09184   0.04433   0.05255   0.00299   0.14796   0.10339   0.36902
LDF             13   0.04622  -0.00368   0.05661   0.00726   0.10920   0.06260   0.56091
INFLO            7   0.10201   0.05503   0.09632   0.04905   0.19438   0.15224   0.73033
INFLO            8   0.10180   0.05481   0.10533   0.05853   0.22378   0.18317   0.76620
INFLO            9   0.10015   0.05307   0.10422   0.05735   0.22772   0.18732   0.76348
COF             35   0.08163   0.03359   0.10413   0.05726   0.21578   0.17476   0.72427
COF             76   0.13605   0.09086   0.11285   0.06644   0.25376   0.21473   0.69554
COF             80   0.16327   0.11949   0.11186   0.06540   0.25446   0.21546   0.69394
COF             89   0.17007   0.12665   0.10614   0.05938   0.23278   0.19264   0.69076
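
The selection rule described for this table (best score per method and measure, smallest k on ties) can be sketched as follows. This is only an illustration: the function name `best_parameters` and the `(algorithm, k, scores)` tuple layout are assumptions, not the format of the raw evaluation table. The example rows are two ODIN rows copied from the table, restricted to P@n and AP, compared at the printed precision.

```python
from collections import defaultdict


def best_parameters(rows, measures):
    """Return {algorithm: sorted k values that are best for at least one measure}.

    rows: iterable of (algorithm, k, scores), where scores maps a measure
    name to its value (hypothetical layout, chosen for this sketch).
    """
    by_algorithm = defaultdict(list)
    for algorithm, k, scores in rows:
        by_algorithm[algorithm].append((k, scores))

    best = {}
    for algorithm, entries in by_algorithm.items():
        selected = set()
        for measure in measures:
            # Highest score wins; among equal scores the smallest k wins.
            k_best, _ = min(entries, key=lambda e: (-e[1][measure], e[0]))
            selected.add(k_best)
        best[algorithm] = sorted(selected)
    return best


rows = [
    ("ODIN", 99, {"P@n": 0.42207, "AP": 0.29079}),
    ("ODIN", 100, {"P@n": 0.42207, "AP": 0.29096}),
]
selection = best_parameters(rows, ["P@n", "AP"])
print(selection)
```

In this example k = 99 is kept for P@n because of the tie-breaking rule, while k = 100 is kept because it has the higher AP, so both rows appear in the result.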

Plots

One plot each for: Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity.
Algorithm labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO
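
For readers who want to recompute the unadjusted measures from the raw algorithm results, the following is a minimal sketch of how P@n, AP, Max-F1 and ROC AUC can be derived from one outlier ranking. It assumes that higher scores mean "more outlying", that n is the number of true outliers, and that there are no tied scores. The tabulated values may rely on tie handling that this sketch does not implement. The function name `evaluate_ranking` and the toy data are hypothetical.

```python
def evaluate_ranking(scores, labels):
    """Compute ranking measures; labels are 1 for outliers and 0 for inliers."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranked = [labels[i] for i in order]
    n_out = sum(ranked)
    n_in = len(ranked) - n_out
    if n_out == 0 or n_in == 0:
        raise ValueError("need at least one outlier and one inlier")

    hits = 0            # outliers seen so far
    inliers_seen = 0    # inliers seen so far
    ap_sum = 0.0        # sum of precision values at the ranks of outliers
    max_f1 = 0.0
    correct_pairs = 0   # (outlier, inlier) pairs ranked in the right order
    for rank, is_outlier in enumerate(ranked, start=1):
        if is_outlier:
            hits += 1
            ap_sum += hits / rank
            correct_pairs += n_in - inliers_seen
        else:
            inliers_seen += 1
        if hits:
            precision, recall = hits / rank, hits / n_out
            max_f1 = max(max_f1, 2 * precision * recall / (precision + recall))

    return {
        "P@n": sum(ranked[:n_out]) / n_out,
        "AP": ap_sum / n_out,
        "Max-F1": max_f1,
        "ROC AUC": correct_pairs / (n_out * n_in),
    }


# Toy example: five objects, two of them outliers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 0]
result = evaluate_ranking(scores, labels)
print(result)
```

ROC AUC is computed here as the fraction of outlier/inlier pairs in which the outlier is ranked first, which is equivalent to the area under the ROC curve when there are no ties.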