Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

InternetAds (19% of outliers)

The data set consists of images from web pages, classified as ads or not. The goal is to learn to remove ads automatically from web pages while retaining regular images. Ads are considered outliers.

Download all data set variants used (6.0 MB). You can also access the original data (ad.data).

Normalized, without duplicates

This version contains 1555 attributes and 1966 objects, of which 368 are outliers (18.72%).

Download raw algorithm results (15.2 MB)
Download raw algorithm evaluation table (76.5 kB)

Best Parameters

The following table contains the best results, overall and per method, for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The Maximum F1-Measure is complementary to the measures in the original publication.
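The selection rule described above (highest score per method, ties resolved in favour of the smallest k) can be sketched as follows. This is only an illustration under assumptions: the `Row` type and the `best_k` helper are hypothetical names, not part of the published evaluation code, and the sample rows are the KNNW entries from the table in this section, restricted to two measures.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One table row: a method, its parameter k, and its scores (hypothetical type)."""
    algorithm: str
    k: int
    scores: dict  # measure name -> value


def best_k(rows, measure):
    """Return {algorithm: (k, score)} with the highest score per algorithm.

    When several k reach the same score, the smallest k is kept.
    """
    best = {}
    for row in rows:
        score = row.scores[measure]
        current = best.get(row.algorithm)
        if (current is None
                or score > current[1]
                or (score == current[1] and row.k < current[0])):
            best[row.algorithm] = (row.k, score)
    return best


# KNNW rows from the "Normalized, without duplicates" table (two measures only).
rows = [
    Row("KNNW", 18, {"P@n": 0.47283, "ROC AUC": 0.73971}),
    Row("KNNW", 20, {"P@n": 0.49185, "ROC AUC": 0.73896}),
    Row("KNNW", 22, {"P@n": 0.49185, "ROC AUC": 0.73687}),
    Row("KNNW", 79, {"P@n": 0.47283, "ROC AUC": 0.70495}),
]

print(best_k(rows, "P@n"))      # k=20 and k=22 tie on P@n; the smaller k is kept
print(best_k(rows, "ROC AUC"))
```

In this sample, k=22 still appears in the table because it is the best KNNW setting for another measure (Max-F1), which is why a method can have several rows.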

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN             12   0.45792   0.33308   0.53742   0.43090   0.47569   0.35495   0.72227
KNN             15   0.46635   0.34345   0.54146   0.43586   0.48998   0.37253   0.71929
KNN             51   0.47671   0.35620   0.51750   0.40639   0.49624   0.38023   0.69034
KNN             53   0.47446   0.35343   0.52234   0.41235   0.49851   0.38302   0.69061
KNNW            18   0.47283   0.35142   0.50012   0.38500   0.48493   0.36631   0.73971
KNNW            20   0.49185   0.37483   0.50884   0.39573   0.49242   0.37554   0.73896
KNNW            22   0.49185   0.37483   0.51534   0.40373   0.49655   0.38061   0.73687
KNNW            79   0.47283   0.35142   0.53653   0.42980   0.48154   0.36214   0.70495
LOF             98   0.45924   0.33471   0.55448   0.45188   0.48264   0.36350   0.74088
LOF             99   0.46196   0.33805   0.55449   0.45190   0.48185   0.36252   0.74070
SimplifiedLOF   35   0.45652   0.33137   0.43895   0.30974   0.45722   0.33222   0.66670
SimplifiedLOF   98   0.45109   0.32468   0.54472   0.43987   0.47304   0.35169   0.74307
SimplifiedLOF   99   0.45380   0.32802   0.54485   0.44003   0.47351   0.35227   0.74289
LoOP            58   0.44022   0.31131   0.37556   0.23176   0.45512   0.32964   0.65873
LoOP            64   0.44837   0.32134   0.38894   0.24822   0.45385   0.32807   0.66410
LoOP           100   0.42935   0.29793   0.45133   0.32498   0.43185   0.30102   0.70066
LDOF            89   0.43478   0.30462   0.42493   0.29250   0.43572   0.30578   0.68300
LDOF            96   0.43478   0.30462   0.43558   0.30560   0.43691   0.30723   0.69169
LDOF            97   0.43478   0.30462   0.43585   0.30593   0.43631   0.30650   0.69313
LDOF            98   0.43207   0.30128   0.43573   0.30578   0.43562   0.30565   0.69359
ODIN             7   0.26454   0.09518   0.24569   0.07198   0.35393   0.20515   0.60536
ODIN            17   0.25386   0.08204   0.23342   0.05689   0.36299   0.21629   0.58365
FastABOD        24   0.41304   0.27787   0.42162   0.28842   0.45967   0.33524   0.73389
FastABOD        49   0.40489   0.26784   0.41115   0.27554   0.46350   0.33996   0.72633
FastABOD        83   0.42663   0.29459   0.41944   0.28575   0.44852   0.32152   0.72351
FastABOD        90   0.41848   0.28456   0.42180   0.28864   0.45359   0.32776   0.72474
KDEOS           12   0.20109   0.01711   0.21704   0.03673   0.31620   0.15873   0.52309
KDEOS           35   0.18750   0.00039   0.21447   0.03358   0.36399   0.21753   0.57775
KDEOS           36   0.19022   0.00373   0.21354   0.03243   0.36562   0.21953   0.57646
KDEOS           72   0.22826   0.05054   0.20446   0.02126   0.33020   0.17595   0.55180
LDF            100   0.46467   0.34139   0.35504   0.20652   0.46713   0.34442   0.68495
INFLO           38   0.45380   0.32802   0.41102   0.27538   0.45856   0.33388   0.66458
INFLO           98   0.44293   0.31465   0.51595   0.40448   0.46235   0.33853   0.72963
INFLO           99   0.44565   0.31799   0.51518   0.40353   0.46422   0.34084   0.72930
COF             10   0.25000   0.07728   0.25616   0.08487   0.36275   0.21600   0.59884
COF             18   0.31250   0.15418   0.25425   0.08251   0.32215   0.16605   0.56519
COF             73   0.26087   0.09066   0.26029   0.08995   0.31583   0.15828   0.53772
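The adjusted columns in this table are numerically consistent with a linear adjustment for chance, (M - r) / (1 - r), where M is the raw measure and r = 368/1966 is the fraction of outliers in this variant. The sketch below checks this for the KNN, k = 12 row. It is an inference from the numbers on this page, not a copy of the authors' code, and the function name `adjust_for_chance` is hypothetical.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that the outlier rate maps to 0 and a perfect score to 1.

    Assumption: the expected value under a random ranking is taken to be the
    outlier rate r = n_outliers / n_objects.
    """
    rate = n_outliers / n_objects
    return (value - rate) / (1.0 - rate)


# Data set sizes of the "Normalized, without duplicates" variant.
N_OBJECTS, N_OUTLIERS = 1966, 368

# (raw value, reported adjusted value) for KNN with k = 12, taken from the table.
knn_k12 = {
    "P@n": (0.45792, 0.33308),
    "AP": (0.53742, 0.43090),
    "Max-F1": (0.47569, 0.35495),
}

for measure, (raw, reported) in knn_k12.items():
    adjusted = adjust_for_chance(raw, N_OUTLIERS, N_OBJECTS)
    print(f"{measure}: computed {adjusted:.5f}, reported {reported:.5f}")
```

Small differences in the last digit are expected because the raw values in the table are already rounded to five decimals. Under this reading, an adjusted value near 0 means the ranking is no better than the outlier rate would give by chance.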

Plots

Plot titles (images omitted): Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity.
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Normalized, with duplicates

This version contains 1555 attributes and 3264 objects, of which 454 are outliers (13.91%).

Download raw algorithm results (14.5 MB)
Download raw algorithm evaluation table (76.8 kB)

Best Parameters

The following table contains the best results, overall and per method, for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The Maximum F1-Measure is complementary to the measures in the original publication.

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN              6   0.41108   0.31593   0.42932   0.33712   0.45554   0.36757   0.81491
KNN             15   0.44400   0.35417   0.53070   0.45488   0.47520   0.39041   0.78493
KNN             54   0.46612   0.37986   0.47507   0.39026   0.48341   0.39995   0.72417
KNN             60   0.46120   0.37414   0.47912   0.39497   0.48368   0.40026   0.72073
KNNW            19   0.40749   0.31176   0.48342   0.39996   0.49493   0.41332   0.81491
KNNW            38   0.48018   0.39619   0.51701   0.43898   0.48721   0.40436   0.79435
KNNW            57   0.49119   0.40898   0.51411   0.43561   0.49485   0.41323   0.77588
KNNW            90   0.48458   0.40131   0.51253   0.43377   0.50426   0.42417   0.75647
LOF              8   0.14119   0.00244   0.16887   0.03459   0.30222   0.18949   0.60041
LOF             16   0.13685  -0.00260   0.15664   0.02038   0.30929   0.19769   0.58036
LOF             38   0.16920   0.03497   0.16321   0.02802   0.28407   0.16840   0.56676
SimplifiedLOF    8   0.14133   0.00260   0.15381   0.01710   0.29835   0.18498   0.56410
SimplifiedLOF   21   0.15612   0.01978   0.16119   0.02566   0.28944   0.17464   0.58288
SimplifiedLOF   38   0.16920   0.03497   0.16444   0.02944   0.28420   0.16855   0.57501
LoOP            48   0.22467   0.09940   0.17786   0.04503   0.29497   0.18106   0.60627
LoOP            90   0.15639   0.02009   0.18561   0.05403   0.31847   0.20836   0.62661
LoOP           100   0.16960   0.03544   0.18457   0.05283   0.32254   0.21309   0.62938
LDOF            74   0.18062   0.04823   0.18007   0.04760   0.31014   0.19868   0.60364
LDOF            99   0.17401   0.04056   0.18809   0.05691   0.33740   0.23035   0.61943
LDOF           100   0.17401   0.04056   0.18821   0.05705   0.33740   0.23035   0.62013
ODIN             1   0.16133   0.02583   0.18527   0.05364   0.33278   0.22498   0.64963
ODIN            10   0.21566   0.08894   0.20229   0.07341   0.32244   0.21296   0.65877
ODIN            39   0.29590   0.18214   0.22989   0.10547   0.32113   0.21145   0.65314
FastABOD        94   0.26432   0.14546   0.22728   0.10244   0.38984   0.29126   0.71170
FastABOD        98   0.26652   0.14801   0.22820   0.10350   0.38984   0.29126   0.71246
FastABOD        99   0.26652   0.14801   0.22811   0.10340   0.38885   0.29011   0.71247
KDEOS            2   0.16525   0.03039   0.14483   0.00667   0.26528   0.14657   0.53999
KDEOS            7   0.10879  -0.03519   0.14211   0.00350   0.29623   0.18253   0.53402
KDEOS           12   0.11674  -0.02596   0.15311   0.01629   0.28873   0.17382   0.56749
LDF              2   0.25091   0.12988   0.16369   0.02857   0.27188   0.15425   0.45325
LDF              6   0.21806   0.09173   0.16856   0.03423   0.24422   0.12211   0.47891
LDF             46   0.20010   0.07087   0.16188   0.02647   0.24422   0.12211   0.53886
INFLO           21   0.15612   0.01978   0.16014   0.02445   0.29791   0.18447   0.58620
INFLO           38   0.16920   0.03497   0.16477   0.02982   0.28470   0.16913   0.58117
COF              9   0.15198   0.01497   0.16418   0.02914   0.31922   0.20923   0.58941
COF             17   0.19604   0.06614   0.15832   0.02234   0.26575   0.14712   0.53913
COF             87   0.17621   0.04312   0.18218   0.05005   0.31491   0.20422   0.61985
COF             88   0.18062   0.04823   0.18287   0.05085   0.31579   0.20524   0.61918
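Several entries in this variant fall below what a random ranking would be expected to achieve (negative adjusted values, or ROC AUC below 0.5). A minimal sketch for parsing whitespace-separated rows of this table and flagging such entries is given below. The helper `parse_row` and the column list are illustrative assumptions, and the three sample rows are copied from the table above.

```python
COLUMNS = ["P@n", "Adj. P@n", "AP", "Adj. AP", "Max-F1", "Adj. MF1", "ROC AUC"]


def parse_row(line):
    """Split one table row into (algorithm, k, {measure: value})."""
    parts = line.split()
    algorithm, k = parts[0], int(parts[1])
    values = [float(x) for x in parts[2:]]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    return algorithm, k, dict(zip(COLUMNS, values))


sample = """\
LOF 16 0.13685 -0.00260 0.15664 0.02038 0.30929 0.19769 0.58036
KDEOS 7 0.10879 -0.03519 0.14211 0.00350 0.29623 0.18253 0.53402
LDF 2 0.25091 0.12988 0.16369 0.02857 0.27188 0.15425 0.45325
"""

parsed = [parse_row(line) for line in sample.splitlines()]

# Below chance: a negative adjusted measure, or ROC AUC under 0.5.
below = [
    (algorithm, k, measure)
    for algorithm, k, scores in parsed
    for measure, value in scores.items()
    if (measure.startswith("Adj.") and value < 0.0)
    or (measure == "ROC AUC" and value < 0.5)
]

for algorithm, k, measure in below:
    print(f"{algorithm} (k={k}) is below chance on {measure}")
```

Because the rows are split on arbitrary whitespace, the same parser works whether the columns are separated by single spaces or padded for alignment.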

Plots

Plot titles (images omitted): Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity.
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO