Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

InternetAds (10% of outliers, version #07)

The data set consists of images from web pages, classified as ads or not. The goal is to learn to remove ads automatically from web pages while retaining regular images. Ads are considered outliers.

Download all data set variants used (6.0 MB). You can also access the original data (ad.data).

Normalized, without duplicates

This version contains 1555 attributes, 1775 objects, and 177 outliers (9.97%).

Download raw algorithm results (13.0 MB). Download raw algorithm evaluation table (72.9 kB).

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure; when the same score was achieved for more than one value of k, only the smallest k is given.
The maximum F1-measure is complementary: it is reported in addition to the measures in the original publication.

Algorithm      k    P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. MF1  ROC AUC
KNN            6    0.48276   0.42547   0.49974   0.44432   0.49102   0.43464   0.79710
KNN            7    0.49859   0.44305   0.51309   0.45916   0.50549   0.45072   0.78885
KNN            8    0.48929   0.43272   0.51453   0.46076   0.49351   0.43741   0.78046
KNN            20   0.47566   0.41758   0.50045   0.44511   0.53247   0.48068   0.73772
KNNW           9    0.41808   0.35362   0.46277   0.40327   0.46050   0.40074   0.81384
KNNW           12   0.50282   0.44776   0.49338   0.43726   0.50568   0.45093   0.80796
KNNW           31   0.48588   0.42893   0.52836   0.47612   0.52318   0.47036   0.76771
KNNW           33   0.48588   0.42893   0.52799   0.47571   0.52492   0.47230   0.76467
LOF            57   0.49153   0.43521   0.53719   0.48593   0.52280   0.46994   0.79346
LOF            73   0.53107   0.47913   0.55674   0.50764   0.53722   0.48596   0.78807
LOF            99   0.52542   0.47286   0.56711   0.51916   0.57244   0.52508   0.77837
SimplifiedLOF  57   0.49718   0.44148   0.53477   0.48324   0.50993   0.45565   0.80642
SimplifiedLOF  95   0.54802   0.49796   0.56608   0.51802   0.56051   0.51183   0.79304
SimplifiedLOF  99   0.54802   0.49796   0.56877   0.52100   0.56535   0.51721   0.79108
SimplifiedLOF  100  0.54802   0.49796   0.56828   0.52046   0.56655   0.51854   0.79031
LoOP           22   0.49153   0.43521   0.39655   0.32971   0.49432   0.43831   0.72483
LoOP           31   0.48588   0.42893   0.40588   0.34007   0.51515   0.46145   0.72466
LoOP           99   0.44633   0.38500   0.47236   0.41392   0.47436   0.41614   0.79287
LoOP           100  0.44633   0.38500   0.47253   0.41410   0.47097   0.41237   0.79283
LDOF           28   0.48588   0.42893   0.36805   0.29805   0.48680   0.42996   0.73478
LDOF           31   0.48023   0.42265   0.37536   0.30617   0.49249   0.43628   0.73357
LDOF           99   0.43503   0.37245   0.46044   0.40068   0.46057   0.40082   0.79050
ODIN           16   0.24441   0.16072   0.20077   0.11224   0.32303   0.24805   0.69736
ODIN           26   0.26554   0.18419   0.20997   0.12246   0.34459   0.27200   0.68812
ODIN           37   0.25359   0.17091   0.20609   0.11816   0.34955   0.27750   0.68335
ODIN           99   0.28609   0.20702   0.20070   0.11217   0.33680   0.26334   0.68414
FastABOD       21   0.45198   0.39128   0.38324   0.31493   0.46429   0.40495   0.79232
FastABOD       22   0.45198   0.39128   0.38043   0.31181   0.46154   0.40190   0.79287
FastABOD       25   0.44068   0.37873   0.38647   0.31851   0.46057   0.40082   0.79091
FastABOD       28   0.45198   0.39128   0.38379   0.31554   0.46914   0.41034   0.79028
KDEOS          57   0.18079   0.09005   0.17334   0.08178   0.23816   0.15377   0.63987
KDEOS          61   0.23729   0.15281   0.16225   0.06945   0.25895   0.17687   0.65528
KDEOS          62   0.24859   0.16536   0.16984   0.07788   0.25220   0.16937   0.65559
LDF            98   0.37288   0.30342   0.21524   0.12831   0.41646   0.35183   0.70373
LDF            100  0.38418   0.31597   0.22025   0.13388   0.41546   0.35071   0.70486
INFLO          71   0.49153   0.43521   0.51587   0.46224   0.49153   0.43521   0.79900
INFLO          95   0.50282   0.44776   0.53881   0.48773   0.52396   0.47123   0.79767
INFLO          99   0.50282   0.44776   0.54288   0.49225   0.53247   0.48068   0.79751
COF            7    0.16384   0.07123   0.16247   0.06970   0.23920   0.15493   0.59856
COF            11   0.23164   0.14653   0.14679   0.05228   0.24737   0.16400   0.57040
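
The adjusted columns in the table above are consistent with an adjustment for chance: the value expected from a random ranking (assumed here to be the outlier fraction, 177/1775) is subtracted, and the result is rescaled so that a perfect ranking still scores 1. The Python sketch below illustrates that assumption; it is not the code that produced the table, and the function name `adjust_for_chance` is hypothetical.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so a random ranking scores about 0 and a perfect one scores 1.

    Assumption: the expected value under a random ranking is the outlier fraction.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


# KNN, k = 6, from the table above: P@n = 0.48276, Adj. P@n = 0.42547
adj_p_at_n = adjust_for_chance(0.48276, n_outliers=177, n_objects=1775)
# Same row: Max-F1 = 0.49102, Adj. MF1 = 0.43464
adj_max_f1 = adjust_for_chance(0.49102, n_outliers=177, n_objects=1775)
print(f"{adj_p_at_n:.5f} {adj_max_f1:.5f}")
```

Because the inputs are themselves rounded to five decimals, recomputed values can differ from the tabulated ones in the last digit.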

Plots

[Plots not reproduced: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Method key: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO
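
As a reading aid for the ranking-based measures named above, the following Python sketch computes precision at n (with n set to the number of outliers), average precision, and maximum F1 from a list of outlier scores. It assumes the standard textbook definitions, that higher scores mean "more outlying", and that ties are broken by input order; it is not the evaluation code behind the reported numbers, and the function name `ranking_measures` and the toy data are hypothetical.

```python
def ranking_measures(scores, labels):
    """Return (P@n, AP, max F1) for outlier scores and 0/1 labels (1 = outlier)."""
    n_outliers = sum(labels)
    # Rank objects from most to least outlying (stable sort keeps input order on ties).
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    precision_sum = 0.0
    p_at_n = 0.0
    max_f1 = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at the rank of each outlier
        precision = hits / rank
        recall = hits / n_outliers
        if hits > 0:
            max_f1 = max(max_f1, 2 * precision * recall / (precision + recall))
        if rank == n_outliers:
            p_at_n = precision  # precision among the top n = |outliers| objects
    return p_at_n, precision_sum / n_outliers, max_f1


# Toy ranking: six objects, two of which are outliers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 0, 0, 0]
p_at_n, ap, max_f1 = ranking_measures(scores, labels)
print(p_at_n, ap, max_f1)
```

In this toy ranking the outliers sit at ranks 1 and 3, so P@n is 1/2, AP is (1/1 + 2/3)/2, and the best F1 is reached by cutting the ranking after rank 3.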

Normalized, with duplicates

This version contains 1555 attributes, 3122 objects, and 312 outliers (9.99%).

Download raw algorithm results (13.7 MB). Download raw algorithm evaluation table (73.9 kB).

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure; when the same score was achieved for more than one value of k, only the smallest k is given.
The maximum F1-measure is complementary: it is reported in addition to the measures in the original publication.

Algorithm      k    P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. MF1  ROC AUC
KNN            5    0.42953   0.36619   0.47111   0.41239   0.45877   0.39868   0.84783
KNN            8    0.46554   0.40620   0.51419   0.46024   0.47045   0.41165   0.84765
KNN            10   0.46099   0.40115   0.53335   0.48154   0.50598   0.45112   0.82689
KNNW           13   0.42949   0.36614   0.47879   0.42092   0.45205   0.39122   0.84705
KNNW           22   0.48077   0.42312   0.51629   0.46258   0.48232   0.42484   0.83401
KNNW           25   0.47115   0.41243   0.51840   0.46493   0.47360   0.41515   0.82753
LOF            9    0.14127   0.04593   0.15004   0.05567   0.28972   0.21086   0.66758
LOF            11   0.13777   0.04204   0.14502   0.05009   0.26264   0.18077   0.66760
SimplifiedLOF  9    0.13821   0.04253   0.13327   0.03703   0.23944   0.15500   0.62297
SimplifiedLOF  10   0.13725   0.04146   0.13517   0.03915   0.23778   0.15314   0.62975
SimplifiedLOF  17   0.13317   0.03693   0.13347   0.03726   0.24884   0.16544   0.63335
LoOP           71   0.25321   0.17029   0.17325   0.08145   0.28192   0.20219   0.67830
LoOP           73   0.23718   0.15248   0.17732   0.08598   0.28943   0.21054   0.68450
LoOP           76   0.21474   0.12755   0.17628   0.08482   0.29921   0.22140   0.68679
LoOP           78   0.20833   0.12043   0.17558   0.08404   0.31087   0.23436   0.68655
LDOF           77   0.22756   0.14180   0.17363   0.08187   0.29683   0.21876   0.67201
LDOF           78   0.22756   0.14180   0.17383   0.08210   0.29737   0.21936   0.67233
LDOF           99   0.21154   0.12399   0.17337   0.08159   0.30627   0.22925   0.68127
ODIN           23   0.28371   0.20418   0.22446   0.13835   0.36118   0.29025   0.73141
ODIN           39   0.35884   0.28765   0.24215   0.15801   0.38095   0.31222   0.71889
ODIN           46   0.37073   0.30086   0.24429   0.16038   0.37798   0.30891   0.71314
ODIN           86   0.36293   0.29220   0.24821   0.16473   0.37186   0.30212   0.71067
FastABOD       31   0.15385   0.05990   0.19308   0.10349   0.36382   0.29318   0.74543
FastABOD       33   0.15064   0.05633   0.19315   0.10357   0.36620   0.29582   0.74483
FastABOD       72   0.16987   0.07770   0.19600   0.10673   0.35153   0.27953   0.74394
FastABOD       73   0.16667   0.07414   0.19694   0.10777   0.35218   0.28025   0.74473
KDEOS          2    0.12179   0.02429   0.12278   0.02538   0.26689   0.18550   0.58942
KDEOS          10   0.10363   0.00411   0.12654   0.02955   0.24164   0.15744   0.61188
KDEOS          11   0.10256   0.00292   0.12641   0.02941   0.24011   0.15574   0.61388
KDEOS          76   0.12500   0.02785   0.11954   0.02178   0.21647   0.12948   0.58400
LDF            1    0.20833   0.12043   0.11562   0.01742   0.21320   0.12584   0.40888
LDF            30   0.12224   0.02478   0.10719   0.00806   0.18567   0.09526   0.53898
INFLO          9    0.14127   0.04593   0.13631   0.04041   0.25769   0.17527   0.63274
INFLO          10   0.13725   0.04146   0.13742   0.04164   0.26047   0.17836   0.64497
INFLO          11   0.13777   0.04204   0.13709   0.04128   0.26122   0.17919   0.64309
COF            43   0.15064   0.05633   0.14722   0.05253   0.25730   0.17484   0.65781
COF            78   0.20192   0.11331   0.16296   0.07003   0.26834   0.18711   0.65385
COF            80   0.21795   0.13112   0.15938   0.06604   0.25731   0.17485   0.64986
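
The selection rule behind the table above (best score per method and measure, smallest k on ties) can be sketched in Python as follows. This is an illustration only: the function name `best_k_per_measure` is hypothetical, and the input is limited to the three LDOF rows copied from the table, whereas the actual selection would run over all tested values of k in the raw evaluation table.

```python
def best_k_per_measure(rows, measures):
    """For each measure, return the k with the highest value; ties go to the smallest k."""
    best = {}
    for measure in measures:
        # Sort key: higher value first, then smaller k.
        winner = min(rows, key=lambda row: (-row[measure], row["k"]))
        best[measure] = winner["k"]
    return best


# LDOF rows copied from the table above (unadjusted measures only).
ldof_rows = [
    {"k": 77, "P@n": 0.22756, "AP": 0.17363, "Max-F1": 0.29683, "ROC AUC": 0.67201},
    {"k": 78, "P@n": 0.22756, "AP": 0.17383, "Max-F1": 0.29737, "ROC AUC": 0.67233},
    {"k": 99, "P@n": 0.21154, "AP": 0.17337, "Max-F1": 0.30627, "ROC AUC": 0.68127},
]
best = best_k_per_measure(ldof_rows, ["P@n", "AP", "Max-F1", "ROC AUC"])
print(best)
```

Here k = 77 and k = 78 tie on P@n, so the smaller value, 77, is reported for that measure; each of the three rows is the best for at least one measure, which is why all three appear in the table.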

Plots

[Plots not reproduced: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Method key: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO
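
ROC AUC, the last of the ranking measures reported here, can be read as the probability that a randomly chosen outlier receives a higher score than a randomly chosen inlier. The Python sketch below computes it directly from that pairwise definition, counting ties as half; it is a quadratic-time illustration under that assumption, not the implementation used for the reported results, and the function name `roc_auc` and the toy data are hypothetical.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (outlier, inlier) pairs ranked correctly.

    Labels are 0/1 with 1 = outlier; higher scores mean "more outlying".
    """
    outliers = [s for s, y in zip(scores, labels) if y == 1]
    inliers = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for o in outliers:
        for i in inliers:
            if o > i:
                wins += 1.0   # outlier correctly ranked above inlier
            elif o == i:
                wins += 0.5   # ties count as half
    return wins / (len(outliers) * len(inliers))


# Toy ranking: six objects, two of which are outliers.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 0, 1, 0, 0, 0]
auc = roc_auc(toy_scores, toy_labels)
print(auc)
```

In the toy data, 7 of the 8 outlier/inlier pairs are ordered correctly, giving 0.875. A random ranking scores about 0.5 on this measure regardless of the outlier fraction.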