Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

InternetAds (5% of outliers, version #03)

The data set consists of images from web pages, classified as ads or not. The goal is to learn to remove ads automatically from web pages while retaining regular images. Ads are considered outliers.

Download all data set variants used (6.0 MB). You can also access the original data (ad.data).

Normalized, without duplicates

This version contains 1555 attributes, 1682 objects, and 84 outliers (4.99%).

Download raw algorithm results (10.4 MB)
Download raw algorithm evaluation table (67.2 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one k, only the smallest k is given).
The Maximum F1-Measure is provided in addition to the measures in the original publication.

Algorithm      k    P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. MF1  ROC AUC
KNN            2    0.47180   0.44404   0.46650   0.43846   0.47273   0.44501   0.81560
KNN            16   0.44841   0.41942   0.44067   0.41127   0.53435   0.50987   0.72670
KNNW           4    0.47619   0.44866   0.44245   0.41315   0.49333   0.46670   0.81261
KNNW           14   0.50000   0.47372   0.48193   0.45470   0.52632   0.50142   0.78285
KNNW           16   0.48810   0.46119   0.48527   0.45822   0.53691   0.51257   0.77698
KNNW           21   0.50000   0.47372   0.48804   0.46113   0.53237   0.50779   0.76452
LOF            28   0.41667   0.38600   0.41312   0.38227   0.43939   0.40993   0.81628
LOF            51   0.47619   0.44866   0.48730   0.46035   0.51064   0.48491   0.78656
LOF            63   0.47619   0.44866   0.49167   0.46495   0.52288   0.49780   0.78434
LOF            95   0.46429   0.43613   0.48793   0.46101   0.54135   0.51724   0.76335
SimplifiedLOF  28   0.44048   0.41106   0.44028   0.41086   0.47134   0.44355   0.82410
SimplifiedLOF  44   0.51190   0.48625   0.49018   0.46338   0.51765   0.49229   0.80901
SimplifiedLOF  56   0.48810   0.46119   0.50932   0.48353   0.52349   0.49844   0.79729
SimplifiedLOF  83   0.47619   0.44866   0.49859   0.47223   0.53623   0.51185   0.78100
LoOP           59   0.41667   0.38600   0.42779   0.39771   0.43243   0.40260   0.81401
LoOP           81   0.51190   0.48625   0.45650   0.42793   0.51190   0.48625   0.80971
LoOP           100  0.48810   0.46119   0.47482   0.44721   0.50847   0.48264   0.80333
LDOF           49   0.44048   0.41106   0.41649   0.38582   0.45714   0.42861   0.81668
LDOF           86   0.48810   0.46119   0.45097   0.42211   0.49057   0.46379   0.81036
ODIN           30   0.35661   0.32279   0.22299   0.18214   0.38919   0.35708   0.72379
ODIN           75   0.37381   0.34089   0.23554   0.19536   0.37696   0.34421   0.73775
ODIN           77   0.37381   0.34089   0.23595   0.19579   0.37696   0.34421   0.73842
ODIN           99   0.36728   0.33402   0.23552   0.19534   0.36757   0.33432   0.73937
FastABOD       19   0.47619   0.44866   0.39026   0.35821   0.49412   0.46753   0.79561
FastABOD       23   0.47619   0.44866   0.39755   0.36589   0.50279   0.47666   0.80107
KDEOS          13   0.23810   0.19805   0.15409   0.10962   0.24242   0.20260   0.64620
KDEOS          14   0.23810   0.19805   0.14867   0.10392   0.24390   0.20416   0.63537
KDEOS          66   0.21429   0.17298   0.15259   0.10805   0.22295   0.18210   0.72445
LDF            2    0.07885   0.03043   0.06943   0.02051   0.16406   0.12012   0.51658
LDF            100  0.01190   -0.04004  0.10227   0.05508   0.26383   0.22513   0.67070
INFLO          28   0.41667   0.38600   0.41828   0.38771   0.45333   0.42460   0.81678
INFLO          44   0.50000   0.47372   0.45308   0.42433   0.50299   0.47687   0.81072
INFLO          78   0.47619   0.44866   0.48426   0.45715   0.51748   0.49212   0.79661
INFLO          86   0.50000   0.47372   0.49312   0.46648   0.51034   0.48461   0.79541
COF            3    0.19048   0.14792   0.13538   0.08993   0.22378   0.18297   0.60012
COF            4    0.19048   0.14792   0.13425   0.08874   0.23474   0.19452   0.62188
COF            5    0.21429   0.17298   0.12323   0.07714   0.25243   0.21313   0.58534
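
The adjusted columns in the table above appear consistent with the usual adjustment for chance, (value - expected) / (1 - expected), where the expected value under a random ranking is taken to be the outlier fraction (84 / 1682 for this variant). The following Python sketch checks that reading against the KNN k=2 row. The function name `adjust_for_chance` is illustrative and not taken from the publication, and the assumption that the maximum attainable value is 1 is mine.

```python
def adjust_for_chance(value: float, n_outliers: int, n_objects: int) -> float:
    """Rescale a measure so that a random ranking scores 0 and a perfect one 1.

    Assumption: the expected value under a random ranking is the outlier
    fraction n_outliers / n_objects, and the maximum attainable value is 1.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


# Data set size as stated for this variant: 1682 objects, 84 outliers.
N_OBJECTS, N_OUTLIERS = 1682, 84

# (raw, listed adjusted) pairs copied from the KNN k=2 row of the table.
knn_k2 = {
    "P@n": (0.47180, 0.44404),
    "AP": (0.46650, 0.43846),
    "Max-F1": (0.47273, 0.44501),
}

for name, (raw, listed) in knn_k2.items():
    adjusted = adjust_for_chance(raw, N_OUTLIERS, N_OBJECTS)
    # Inputs are rounded to five decimals, so small deviations are expected.
    print(f"{name}: computed={adjusted:.5f} listed={listed:.5f}")
```

Because the adjustment is a monotone rescaling, it does not change which k is best for a given measure; it mainly makes values comparable across data sets with different outlier rates. Negative adjusted values, such as the LDF k=100 entry for Adj. P@n, indicate a result below the random-ranking expectation.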

Plots

[Plots not reproduced: Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity.]
Method key: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO
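
For readers who want to recompute the unadjusted measures named above from an outlier ranking, the following is a minimal Python sketch on a made-up eight-object example. It assumes binary labels (1 = outlier), higher scores meaning more outlying, and no tied scores. The reported results may treat ties differently, so this is not the evaluation code behind the tables, and all function names are illustrative.

```python
def rank_labels(scores, labels):
    """Return the labels ordered by decreasing outlier score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]


def precision_at_n(ranked, n):
    """Fraction of true outliers among the top n ranked objects."""
    return sum(ranked[:n]) / n


def average_precision(ranked):
    """Mean of the precision values at the rank of each true outlier."""
    hits, total = 0, 0.0
    for rank, is_outlier in enumerate(ranked, start=1):
        if is_outlier:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def max_f1(ranked):
    """Largest F1 score over all cut-off ranks."""
    n_outliers = sum(ranked)
    best, hits = 0.0, 0
    for rank, is_outlier in enumerate(ranked, start=1):
        hits += is_outlier
        if hits:
            precision, recall = hits / rank, hits / n_outliers
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


def roc_auc(ranked):
    """Fraction of (outlier, inlier) pairs with the outlier ranked first."""
    n_out = sum(ranked)
    n_in = len(ranked) - n_out
    inliers_seen, correct_pairs = 0, 0
    for is_outlier in ranked:
        if is_outlier:
            correct_pairs += n_in - inliers_seen
        else:
            inliers_seen += 1
    return correct_pairs / (n_out * n_in)


# Made-up toy data: 8 objects, 3 of them outliers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0, 0, 1, 0]
ranked = rank_labels(scores, labels)

print("P@n    ", precision_at_n(ranked, sum(labels)))
print("AP     ", average_precision(ranked))
print("Max-F1 ", max_f1(ranked))
print("ROC AUC", roc_auc(ranked))
```

Here n is set to the number of true outliers, which is one common choice for precision at n; other choices of n are possible.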

Normalized, with duplicates

This version contains 1555 attributes, 2957 objects, and 147 outliers (4.97%).

Download raw algorithm results (12.6 MB)
Download raw algorithm evaluation table (72.0 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one k, only the smallest k is given).
The Maximum F1-Measure is provided in addition to the measures in the original publication.

Algorithm      k    P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. MF1  ROC AUC
KNN            2    0.44058   0.41132   0.47251   0.44492   0.44562   0.41662   0.89272
KNN            4    0.48404   0.45705   0.52517   0.50033   0.51556   0.49021   0.88522
KNN            5    0.47985   0.45264   0.52153   0.49650   0.52586   0.50106   0.87924
KNNW           6    0.45578   0.42731   0.47749   0.45016   0.48413   0.45714   0.88599
KNNW           9    0.47619   0.44879   0.51654   0.49124   0.49817   0.47192   0.88176
KNNW           10   0.48299   0.45595   0.51622   0.49091   0.49606   0.46970   0.87887
KNNW           22   0.46939   0.44163   0.49500   0.46859   0.52336   0.49843   0.84088
LOF            7    0.09627   0.04900   0.09756   0.05035   0.21528   0.17423   0.73025
LOF            9    0.09626   0.04899   0.09875   0.05160   0.20938   0.16802   0.74143
SimplifiedLOF  8    0.09431   0.04693   0.08900   0.04134   0.17517   0.13202   0.70543
SimplifiedLOF  9    0.09410   0.04671   0.08997   0.04237   0.17342   0.13018   0.71220
LoOP           27   0.16327   0.11949   0.15346   0.10917   0.28070   0.24307   0.77825
LoOP           32   0.17007   0.12665   0.14599   0.10132   0.27451   0.23656   0.78023
LoOP           72   0.19728   0.15529   0.14465   0.09991   0.29306   0.25608   0.77405
LoOP           75   0.20408   0.16244   0.14095   0.09601   0.26923   0.23100   0.77551
LDOF           75   0.18367   0.14097   0.15218   0.10783   0.26816   0.22987   0.77480
LDOF           76   0.18367   0.14097   0.15493   0.11072   0.27350   0.23550   0.77758
LDOF           91   0.18367   0.14097   0.14972   0.10524   0.25656   0.21767   0.78011
ODIN           100  0.44397   0.41488   0.28347   0.24598   0.45113   0.42241   0.80355
FastABOD       29   0.04762   -0.00220  0.12325   0.07739   0.28239   0.24485   0.77197
FastABOD       73   0.14966   0.10518   0.12744   0.08180   0.27036   0.23219   0.76722
FastABOD       74   0.14966   0.10518   0.12820   0.08259   0.27124   0.23312   0.76790
FastABOD       100  0.14966   0.10518   0.12783   0.08220   0.28339   0.24590   0.76802
KDEOS          2    0.03165   -0.01901  0.06871   0.01999   0.17907   0.13613   0.62510
KDEOS          10   0.06122   0.01211   0.08214   0.03413   0.16129   0.11741   0.69171
KDEOS          11   0.06122   0.01211   0.08191   0.03388   0.16232   0.11850   0.69191
KDEOS          75   0.10204   0.05507   0.08005   0.03193   0.14963   0.10514   0.67849
LDF            1    0.14531   0.10059   0.06956   0.02088   0.20690   0.16541   0.37214
LDF            17   0.05106   0.00142   0.05529   0.00587   0.10530   0.05850   0.54874
INFLO          7    0.09627   0.04900   0.09489   0.04754   0.20883   0.16744   0.72861
INFLO          8    0.09380   0.04639   0.09446   0.04709   0.21421   0.17310   0.72935
COF            75   0.18367   0.14097   0.12250   0.07659   0.25049   0.21128   0.71910
COF            76   0.18367   0.14097   0.12688   0.08120   0.26068   0.22201   0.72481
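
The selection rule stated above the table (best score per measure, smallest k on ties) can be sketched in Python as follows. The input holds only the three LDOF rows copied from this table, not the full per-k results available in the raw download, and the helper name `best_k_per_measure` is illustrative rather than part of any published tooling.

```python
# LDOF rows copied from the table above: k -> unadjusted measure values.
ldof_rows = {
    75: {"P@n": 0.18367, "AP": 0.15218, "Max-F1": 0.26816, "ROC AUC": 0.77480},
    76: {"P@n": 0.18367, "AP": 0.15493, "Max-F1": 0.27350, "ROC AUC": 0.77758},
    91: {"P@n": 0.18367, "AP": 0.14972, "Max-F1": 0.25656, "ROC AUC": 0.78011},
}


def best_k_per_measure(rows):
    """Return {measure: k} with the highest score, preferring the smallest k on ties."""
    measures = list(next(iter(rows.values())))
    best = {}
    for measure in measures:
        # Sort key: higher score first (negated), then smaller k.
        best[measure] = min(rows, key=lambda k: (-rows[k][measure], k))
    return best


best = best_k_per_measure(ldof_rows)
for measure, k in best.items():
    print(f"{measure}: best k = {k}")
```

On these rows all three values of k tie on P@n, so the rule keeps k=75 for that measure, while k=76 and k=91 are selected by other measures. That matches the three LDOF rows being listed.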

Plots

[Plots not reproduced: Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity.]
Method key: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO