Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

InternetAds (5% of outliers version#05)

The data set consists of images from web pages, classified as ads or not. The goal is to learn to remove ads automatically from web pages while retaining regular images. Ads are considered outliers.

Download all data set variants used (6.0 MB). You can also access the original data (ad.data).

Normalized, without duplicates

This version contains 1555 attributes, 1682 objects, 84 outliers (4.99%)
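
The object and outlier counts determine the chance level that the adjusted measures in the table below correct for. The following is a minimal sketch, assuming the adjustment has the form (value - |O|/N) / (1 - |O|/N), where |O| is the number of outliers and N the number of objects. The function names are illustrative and not part of the published material.

```python
def outlier_rate(n_outliers, n_objects):
    """Fraction of objects labeled as outliers."""
    return n_outliers / n_objects


def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that the assumed chance level maps to 0 and a perfect result to 1."""
    # Assumption: a random ranking is expected to score |O|/N on the unadjusted measure.
    expected = outlier_rate(n_outliers, n_objects)
    return (value - expected) / (1.0 - expected)


# Counts stated for this variant of the data set.
N_OBJECTS, N_OUTLIERS = 1682, 84

rate_percent = round(100 * outlier_rate(N_OUTLIERS, N_OBJECTS), 2)

# Unadjusted values taken from the KNN, k = 2 row of the table below.
adj_p_at_n = adjust_for_chance(0.50940, N_OUTLIERS, N_OBJECTS)
adj_ap = adjust_for_chance(0.51298, N_OUTLIERS, N_OBJECTS)
adj_max_f1 = adjust_for_chance(0.53237, N_OUTLIERS, N_OBJECTS)

print(rate_percent, round(adj_p_at_n, 5), round(adj_ap, 5), round(adj_max_f1, 5))
```

Under this assumption, the computed values agree with the stated outlier percentage and with the Adj. P@n, Adj. AP and Adj. MF1 entries of that row to within rounding. ROC AUC has no adjusted column in the tables and is left as is.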

Download raw algorithm results (10.6 MB)
Download raw algorithm evaluation table (66.0 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved twice, only the smallest k is given).
The Maximum F1-Measure is reported in addition to the measures used in the original publication.

Algorithm      k    P@n      Adj. P@n  AP       Adj. AP  Max-F1   Adj. MF1  ROC AUC
KNN            2    0.50940  0.48361   0.51298  0.48738  0.53237  0.50779   0.84014
KNN            3    0.51190  0.48625   0.54736  0.52356  0.58015  0.55808   0.82718
KNN            8    0.48810  0.46119   0.49264  0.46597  0.59091  0.56940   0.75904
KNNW           4    0.50000  0.47372   0.50299  0.47687  0.51220  0.48655   0.84513
KNNW           7    0.53571  0.51131   0.54377  0.51979  0.56296  0.53999   0.82902
KNNW           9    0.53571  0.51131   0.54690  0.52309  0.57576  0.55346   0.81599
KNNW           16   0.50000  0.47372   0.53059  0.50591  0.59542  0.57415   0.78316
LOF            19   0.46429  0.43613   0.45119  0.42234  0.49333  0.46670   0.81918
LOF            30   0.51190  0.48625   0.47619  0.44865  0.53416  0.50967   0.79165
LOF            53   0.45238  0.42359   0.50024  0.47397  0.55118  0.52759   0.76192
LOF            75   0.46429  0.43613   0.48599  0.45897  0.56489  0.54201   0.75347
SimplifiedLOF  19   0.47619  0.44866   0.48685  0.45987  0.51064  0.48491   0.84078
SimplifiedLOF  24   0.52381  0.49878   0.50830  0.48245  0.54088  0.51675   0.82792
SimplifiedLOF  31   0.52381  0.49878   0.52235  0.49724  0.55128  0.52769   0.81506
SimplifiedLOF  73   0.46429  0.43613   0.50074  0.47450  0.56250  0.53950   0.77070
LoOP           16   0.48810  0.46119   0.42135  0.39093  0.50340  0.47730   0.85154
LoOP           48   0.54762  0.52384   0.48268  0.45548  0.56000  0.53687   0.81518
LoOP           53   0.53571  0.51131   0.49468  0.46811  0.58442  0.56257   0.81075
LoOP           91   0.52381  0.49878   0.52123  0.49606  0.55172  0.52816   0.78762
LDOF           25   0.45238  0.42359   0.40083  0.36933  0.46667  0.43863   0.83748
LDOF           54   0.48810  0.46119   0.51057  0.48484  0.53333  0.50880   0.80897
LDOF           75   0.52381  0.49878   0.50038  0.47412  0.53793  0.51364   0.79403
ODIN           20   0.26681  0.22827   0.21263  0.17124  0.35341  0.31943   0.77470
ODIN           95   0.34712  0.31280   0.22523  0.18450  0.38298  0.35054   0.74743
ODIN           97   0.34712  0.31280   0.22759  0.18699  0.38298  0.35054   0.74712
FastABOD       21   0.51190  0.48625   0.37223  0.33923  0.51190  0.48625   0.81092
FastABOD       24   0.52381  0.49878   0.37945  0.34683  0.52695  0.50208   0.80952
FastABOD       25   0.52381  0.49878   0.38547  0.35316  0.53254  0.50797   0.80986
KDEOS          7    0.14286  0.09780   0.15192  0.10734  0.19130  0.14879   0.67841
KDEOS          61   0.14286  0.09780   0.13250  0.08690  0.21094  0.16946   0.72570
KDEOS          64   0.23810  0.19805   0.14998  0.10529  0.24719  0.20762   0.72328
LDF            4    0.08557  0.03751   0.07914  0.03073  0.17404  0.13062   0.63552
LDF            6    0.07516  0.02654   0.07575  0.02716  0.15857  0.11434   0.67348
LDF            100  0.07143  0.02262   0.09089  0.04310  0.23529  0.19510   0.65341
INFLO          19   0.46429  0.43613   0.45683  0.42828  0.51701  0.49162   0.83486
INFLO          24   0.52381  0.49878   0.48721  0.46025  0.54658  0.52275   0.82541
INFLO          50   0.50000  0.47372   0.53085  0.50619  0.54815  0.52440   0.79630
INFLO          99   0.51190  0.48625   0.51887  0.49358  0.55556  0.53219   0.77510
COF            4    0.13095  0.08527   0.08764  0.03968  0.15011  0.10544   0.58617
COF            9    0.16667  0.12286   0.13603  0.09062  0.20290  0.16100   0.55667
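
The unadjusted measure columns above (P@n, AP, Max-F1, ROC AUC) are all computed from a ranking of the objects by outlier score. The following is a minimal sketch of the usual definitions, not the code that produced these results. It assumes binary labels (1 = outlier, 0 = inlier), scores where larger means more outlying, and n equal to the number of outliers for P@n. Tied scores are not treated specially for P@n, AP and Max-F1, so results on data with many ties may differ from the published numbers. The toy labels and scores are made up for illustration.

```python
def ranked_labels(labels, scores):
    """Labels reordered by decreasing outlier score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]


def precision_at_n(labels, scores, n=None):
    """Fraction of true outliers among the top n; n defaults to the number of outliers."""
    n = sum(labels) if n is None else n
    return sum(ranked_labels(labels, scores)[:n]) / n


def average_precision(labels, scores):
    """Mean of the precision values taken at the rank of each true outlier."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked_labels(labels, scores), start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits


def max_f1(labels, scores):
    """Largest F1 score over all cut-off ranks of the ranking."""
    n_outliers = sum(labels)
    hits, best = 0, 0.0
    for rank, label in enumerate(ranked_labels(labels, scores), start=1):
        hits += label
        if hits:
            precision, recall = hits / rank, hits / n_outliers
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


def roc_auc(labels, scores):
    """Share of (outlier, inlier) pairs ranked correctly; tied pairs count one half."""
    outliers = [s for s, l in zip(scores, labels) if l]
    inliers = [s for s, l in zip(scores, labels) if not l]
    wins = sum((o > i) + 0.5 * (o == i) for o in outliers for i in inliers)
    return wins / (len(outliers) * len(inliers))


# Toy example (hypothetical data): two outliers among six objects.
labels = [1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
print(precision_at_n(labels, scores), average_precision(labels, scores),
      max_f1(labels, scores), roc_auc(labels, scores))
```

In the toy ranking the two outliers sit at ranks 1 and 3, so one inlier is ranked above an outlier and every measure falls short of its maximum of 1.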

Plots

[Plots: Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Normalized, with duplicates

This version contains 1555 attributes, 2957 objects, 147 outliers (4.97%)

Download raw algorithm results (12.6 MB)
Download raw algorithm evaluation table (71.5 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved twice, only the smallest k is given).
The Maximum F1-Measure is reported in addition to the measures used in the original publication.

Algorithm      k    P@n      Adj. P@n  AP       Adj. AP  Max-F1   Adj. MF1  ROC AUC
KNN            5    0.50952  0.48387   0.54346  0.51958  0.52830  0.50363   0.88486
KNN            9    0.46536  0.43739   0.51980  0.49468  0.56075  0.53777   0.85093
KNNW           9    0.46259  0.43447   0.50870  0.48300  0.49550  0.46910   0.88043
KNNW           14   0.46939  0.44163   0.52217  0.49717  0.50943  0.48377   0.86980
KNNW           20   0.48299  0.45595   0.51497  0.48960  0.51675  0.49147   0.85304
KNNW           25   0.46259  0.43447   0.50656  0.48075  0.52968  0.50508   0.83953
LOF            9    0.10128  0.05427   0.10113  0.05411  0.21622  0.17521   0.74402
LOF            10   0.10163  0.05463   0.10011  0.05303  0.20935  0.16799   0.73780
SimplifiedLOF  9    0.09902  0.05189   0.09147  0.04395  0.17324  0.12999   0.70683
SimplifiedLOF  10   0.09859  0.05144   0.09257  0.04510  0.17984  0.13693   0.71354
LoOP           1    0.17412  0.13092   0.10376  0.05687  0.19397  0.15180   0.60726
LoOP           2    0.17007  0.12665   0.14828  0.10372  0.19169  0.14941   0.68470
LoOP           34   0.14966  0.10518   0.13743  0.09231  0.26354  0.22502   0.77385
LoOP           72   0.16327  0.11949   0.13532  0.09008  0.26577  0.22736   0.76575
LDOF           75   0.16327  0.11949   0.12978  0.08425  0.23333  0.19323   0.75149
LDOF           78   0.16327  0.11949   0.13551  0.09029  0.24615  0.20672   0.76091
LDOF           79   0.16327  0.11949   0.13586  0.09065  0.24615  0.20672   0.76231
ODIN           80   0.36871  0.33568   0.27094  0.23280  0.42051  0.39020   0.81065
ODIN           100  0.39738  0.36585   0.28018  0.24252  0.43200  0.40229   0.80724
FastABOD       73   0.09524  0.04791   0.12190  0.07597  0.26452  0.22604   0.76755
FastABOD       82   0.09524  0.04791   0.12441  0.07860  0.27053  0.23237   0.77444
FastABOD       93   0.09524  0.04791   0.12543  0.07967  0.28763  0.25036   0.77352
FastABOD       97   0.09524  0.04791   0.12556  0.07981  0.28763  0.25036   0.77350
KDEOS          10   0.05442  0.00496   0.08023  0.03211  0.17415  0.13095   0.67949
KDEOS          11   0.05442  0.00496   0.08074  0.03265  0.17272  0.12944   0.68588
KDEOS          75   0.08844  0.04075   0.07450  0.02608  0.14189  0.09700   0.64674
LDF            2    0.13288  0.08752   0.06909  0.02039  0.16022  0.11629   0.50863
LDF            3    0.15238  0.10804   0.06875  0.02004  0.17021  0.12680   0.48811
LDF            12   0.05833  0.00907   0.05571  0.00631  0.10650  0.05975   0.53343
INFLO          9    0.10143  0.05442   0.09685  0.04961  0.21099  0.16971   0.72772
INFLO          10   0.10163  0.05463   0.09754  0.05033  0.20790  0.16646   0.73085
COF            42   0.12245  0.07654   0.11188  0.06542  0.23722  0.19732   0.73539
COF            78   0.17007  0.12665   0.12737  0.08172  0.27249  0.23444   0.72919
COF            91   0.17687  0.13381   0.10992  0.06335  0.23333  0.19323   0.70177
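
The tie-breaking rule stated for this table (when the same score is achieved more than once, only the smallest k is given) can be sketched as follows. This is a minimal illustration, assuming the selection is made per method and per measure on the scores as printed. The row layout and the function name best_k are hypothetical, and only the three LDOF rows from the table above are used as input rather than the full raw evaluation table.

```python
# (method, k, {measure: score}) for the three LDOF rows of the table above.
ROWS = [
    ("LDOF", 75, {"P@n": 0.16327, "AP": 0.12978, "Max-F1": 0.23333, "ROC AUC": 0.75149}),
    ("LDOF", 78, {"P@n": 0.16327, "AP": 0.13551, "Max-F1": 0.24615, "ROC AUC": 0.76091}),
    ("LDOF", 79, {"P@n": 0.16327, "AP": 0.13586, "Max-F1": 0.24615, "ROC AUC": 0.76231}),
]


def best_k(rows, method, measure):
    """Return the k with the highest score for one method and measure.

    Ties on the score are resolved in favor of the smallest k.
    """
    candidates = [(k, values[measure]) for name, k, values in rows if name == method]
    # Sort key: highest score first, then smallest k.
    k, _score = min(candidates, key=lambda c: (-c[1], c[0]))
    return k


for measure in ("P@n", "AP", "Max-F1", "ROC AUC"):
    print(measure, best_k(ROWS, "LDOF", measure))
```

On these rows, P@n is tied across all three values of k and Max-F1 is tied between k = 78 and k = 79, so the smallest k wins in both cases. Scores that differ only beyond the five printed decimals would be treated as ties by this sketch.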

Plots

[Plots: Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO