Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

WDBC (version#09)

This data set describes nuclear characteristics for breast cancer diagnosis. We consider benign examples as inliers and malignant examples as outliers. In the preprocessing, we follow Zhang et al. [1] and downsample the outliers to 10 instances. The processed data set has 30 numeric attributes and 367 instances: 10 outliers (2.72%) and 357 inliers (97.28%).
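
The downsampling step described above can be sketched as follows. This is only an illustration: the class sizes (357 benign, 212 malignant) are those of the original UCI file and are stated here as an assumption, and the function name, the random seed, and the use of uniform random sampling are hypothetical choices, not the selection actually used for this data set.

```python
import numpy as np

def downsample_outliers(labels, n_outliers=10, seed=0):
    """Return sorted indices keeping every inlier and a random subset of outliers.

    labels: array of 0 (inlier, benign) / 1 (outlier, malignant).
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    inlier_idx = np.flatnonzero(labels == 0)
    outlier_idx = np.flatnonzero(labels == 1)
    # Sample without replacement so that no outlier is kept twice.
    kept_outliers = rng.choice(outlier_idx, size=n_outliers, replace=False)
    return np.sort(np.concatenate([inlier_idx, kept_outliers]))

# Assumed class sizes of the original WDBC data: 357 benign, 212 malignant.
labels = np.array([0] * 357 + [1] * 212)
keep = downsample_outliers(labels)
n_total = keep.size
n_out = int(labels[keep].sum())
print(n_total, n_out, f"{100 * n_out / n_total:.2f}%")  # 367 10 2.72%
```

The printed counts match the figures given above regardless of the seed, because only the number of retained outliers enters the percentages.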

References:

[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.

Download all data set variants used (1.1 MB). You can also access the original data (wdbc.data).

Normalized, without duplicates

This version contains 30 attributes and 367 objects, 10 of which are outliers (2.72%).
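
A minimal sketch of what "normalized, without duplicates" could look like in code. It assumes min-max scaling of every attribute to [0, 1] and removal of exact duplicate rows; the function name and the toy matrix are hypothetical, and the actual preprocessing pipeline may differ in its details.

```python
import numpy as np

def normalize_and_deduplicate(X):
    """Min-max scale each column to [0, 1], then drop duplicate rows."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns map to 0 instead of dividing by zero
    X_scaled = (X - lo) / span
    # np.unique with axis=0 removes duplicate rows (and sorts the remaining rows).
    return np.unique(X_scaled, axis=0)

# Toy matrix for illustration only: rows 0 and 2 are duplicates.
X = np.array([[1.0, 10.0], [3.0, 30.0], [1.0, 10.0], [2.0, 20.0]])
Z = normalize_and_deduplicate(X)
print(Z)
```

In a real pipeline the outlier labels would have to be carried along with the rows, for example by deduplicating indices rather than the matrix itself.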

Download raw algorithm results (3.3 MB). Download raw algorithm evaluation table (39.8 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for several values of k, only the smallest k is given).
The maximum F1-measure is reported in addition to the measures in the original publication.

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN              3   0.40000   0.38319   0.36862   0.35093   0.42105   0.40484   0.90658
KNN              9   0.40000   0.38319   0.40772   0.39113   0.47059   0.45576   0.92773
KNN             81   0.40000   0.38319   0.40935   0.39280   0.44444   0.42888   0.93725
KNN            100   0.40000   0.38319   0.40837   0.39179   0.44444   0.42888   0.93866
KNNW             6   0.40000   0.38319   0.35229   0.33414   0.40000   0.38319   0.90840
KNNW            19   0.40000   0.38319   0.38673   0.36956   0.44444   0.42888   0.92325
KNNW            99   0.40000   0.38319   0.40113   0.38436   0.44444   0.42888   0.93109
LOF             12   0.40000   0.38319   0.39839   0.38154   0.42105   0.40484   0.92241
LOF             25   0.40000   0.38319   0.49583   0.48170   0.45455   0.43927   0.94734
LOF             45   0.40000   0.38319   0.48229   0.46779   0.50000   0.48599   0.94034
SimplifiedLOF   21   0.40000   0.38319   0.42763   0.41159   0.42857   0.41257   0.94370
SimplifiedLOF   27   0.40000   0.38319   0.48999   0.47570   0.48000   0.46543   0.95070
SimplifiedLOF   77   0.40000   0.38319   0.49308   0.47888   0.50000   0.48599   0.94678
SimplifiedLOF   79   0.40000   0.38319   0.49515   0.48101   0.50000   0.48599   0.94762
LoOP            21   0.40000   0.38319   0.41846   0.40217   0.42105   0.40484   0.94286
LoOP            33   0.40000   0.38319   0.47404   0.45931   0.48000   0.46543   0.95042
LoOP            35   0.40000   0.38319   0.46238   0.44732   0.50000   0.48599   0.94706
LoOP            73   0.40000   0.38319   0.48563   0.47122   0.47059   0.45576   0.94874
LDOF            23   0.40000   0.38319   0.42746   0.41142   0.45455   0.43927   0.94958
LDOF            27   0.40000   0.38319   0.46695   0.45202   0.47619   0.46152   0.96050
LDOF            52   0.30000   0.28039   0.49752   0.48345   0.48649   0.47210   0.95826
LDOF            93   0.40000   0.38319   0.45312   0.43780   0.54545   0.53272   0.95350
ODIN            56   0.30000   0.28039   0.24568   0.22455   0.36364   0.34581   0.92367
ODIN            69   0.30000   0.28039   0.30176   0.28220   0.44444   0.42888   0.93459
ODIN            75   0.30000   0.28039   0.31086   0.29155   0.41379   0.39737   0.93557
ODIN            86   0.30000   0.28039   0.35968   0.34175   0.42857   0.41257   0.93403
FastABOD         4   0.40000   0.38319   0.34390   0.32552   0.40000   0.38319   0.92801
FastABOD         7   0.30000   0.28039   0.34124   0.32279   0.48485   0.47042   0.92913
FastABOD        39   0.40000   0.38319   0.43760   0.42184   0.46667   0.45173   0.94258
KDEOS           33   0.20000   0.17759   0.08462   0.05898   0.20000   0.17759   0.79804
KDEOS           62   0.00000  -0.02801   0.13831   0.11418   0.28571   0.26571   0.90420
KDEOS           75   0.10000   0.07479   0.14497   0.12102   0.30303   0.28351   0.90140
LDF              7   0.40000   0.38319   0.44364   0.42805   0.46154   0.44646   0.89692
LDF              8   0.30000   0.28039   0.48018   0.46562   0.46154   0.44646   0.93445
LDF             10   0.40000   0.38319   0.45541   0.44016   0.47059   0.45576   0.93025
INFLO           19   0.40000   0.38319   0.41665   0.40030   0.40000   0.38319   0.94482
INFLO           61   0.40000   0.38319   0.49808   0.48403   0.50000   0.48599   0.95182
INFLO           72   0.40000   0.38319   0.46177   0.44669   0.52941   0.51623   0.95518
COF             12   0.30000   0.28039   0.25060   0.22961   0.31579   0.29662   0.85266
COF             14   0.20000   0.17759   0.23108   0.20954   0.36364   0.34581   0.88655
COF             17   0.30000   0.28039   0.30232   0.28278   0.31250   0.29324   0.86947
COF             26   0.30000   0.28039   0.28904   0.26912   0.42857   0.41257   0.87339
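
The selection rule behind this table (best score per method and measure, smallest k on ties) can be sketched as below. The function name, the dictionary-based row layout, and the example rows are hypothetical and serve only as an illustration; they are not taken from the raw evaluation table.

```python
def best_k_per_measure(rows, measures):
    """For each (algorithm, measure), return the smallest k reaching the top score.

    rows: iterable of dicts with keys "algorithm", "k", and one key per measure.
    """
    best = {}
    for row in sorted(rows, key=lambda r: r["k"]):  # visit rows in ascending k
        for m in measures:
            key = (row["algorithm"], m)
            # Strict '>' keeps the earlier (smaller) k when scores tie.
            if key not in best or row[m] > best[key][1]:
                best[key] = (row["k"], row[m])
    return best

# Hypothetical evaluation rows for illustration only (not taken from the study).
rows = [
    {"algorithm": "KNN", "k": 1, "P@n": 0.3, "ROC AUC": 0.90},
    {"algorithm": "KNN", "k": 2, "P@n": 0.4, "ROC AUC": 0.91},
    {"algorithm": "KNN", "k": 3, "P@n": 0.4, "ROC AUC": 0.93},
]
best = best_k_per_measure(rows, ["P@n", "ROC AUC"])
print(best[("KNN", "P@n")])      # (2, 0.4): k=2 and k=3 tie, the smaller k wins
print(best[("KNN", "ROC AUC")])  # (3, 0.93)
```

Because one k can be best for several measures at once, a method may contribute fewer rows to the table than there are measures.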

Plots

[Plots not reproduced here: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity. Method labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO.]
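
For readers who want to recompute the unadjusted measures named under Plots from a list of outlier scores, the following sketch uses their standard textbook definitions. It assumes that higher scores mean "more outlying", that n equals the number of outliers, and that at least one outlier and one inlier are present. The function name is hypothetical, and tied scores are broken arbitrarily here, which may differ from the handling in the original evaluation.

```python
import numpy as np

def ranking_measures(scores, labels):
    """P@n (n = number of outliers), average precision, max F1, and ROC AUC.

    scores: higher means more outlying; labels: 1 for outlier, 0 for inlier.
    Ties in the scores are broken arbitrarily by the sort order.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")
    hits = labels[order]
    n_out = int(hits.sum())
    n_in = hits.size - n_out
    tp = np.cumsum(hits)                    # true positives among the top i
    ranks = np.arange(1, hits.size + 1)
    precision = tp / ranks
    recall = tp / n_out
    p_at_n = precision[n_out - 1]
    avg_prec = precision[hits == 1].mean()  # mean precision at each outlier's rank
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    # ROC AUC: fraction of (outlier, inlier) pairs ranked in the right order.
    inliers_below = n_in - (ranks - tp)     # inliers ranked after each position
    auc = inliers_below[hits == 1].sum() / (n_out * n_in)
    return p_at_n, avg_prec, f1.max(), auc

# Toy ranking for illustration only: outliers at ranks 1 and 3 of 5 objects.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 0]
p_at_n, avg_prec, max_f1, auc = ranking_measures(scores, labels)
print(p_at_n, avg_prec, max_f1, auc)
```

In the toy ranking, P@n is 1/2, average precision and ROC AUC are both 5/6, and the maximum F1 score of 0.8 is reached at rank 3.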

Not normalized, without duplicates

This version contains 30 attributes and 367 objects, 10 of which are outliers (2.72%).

Download raw algorithm results (3.1 MB). Download raw algorithm evaluation table (37.4 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for several values of k, only the smallest k is given).
The maximum F1-measure is reported in addition to the measures in the original publication.

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN              1   0.60000   0.58880   0.66653   0.65719   0.62500   0.61450   0.97745
KNN              6   0.60000   0.58880   0.67141   0.66220   0.66667   0.65733   0.96863
KNN              7   0.70000   0.69160   0.66507   0.65569   0.70000   0.69160   0.96723
KNNW             1   0.60000   0.58880   0.72774   0.72012   0.70588   0.69764   0.98319
KNNW            15   0.70000   0.69160   0.67929   0.67031   0.70000   0.69160   0.97199
LOF             14   0.60000   0.58880   0.64753   0.63766   0.63158   0.62126   0.89748
LOF             18   0.60000   0.58880   0.65638   0.64676   0.62500   0.61450   0.91401
LOF             44   0.60000   0.58880   0.61862   0.60793   0.63636   0.62618   0.96779
LOF             54   0.60000   0.58880   0.60665   0.59563   0.66667   0.65733   0.95350
SimplifiedLOF   22   0.60000   0.58880   0.64885   0.63902   0.62500   0.61450   0.93193
SimplifiedLOF   25   0.60000   0.58880   0.65727   0.64767   0.62500   0.61450   0.92717
SimplifiedLOF   35   0.60000   0.58880   0.64760   0.63773   0.66667   0.65733   0.92129
SimplifiedLOF   73   0.60000   0.58880   0.62606   0.61558   0.66667   0.65733   0.96891
LoOP            30   0.60000   0.58880   0.64244   0.63242   0.60000   0.58880   0.91078
LoOP            31   0.60000   0.58880   0.64758   0.63771   0.63158   0.62126   0.91078
LoOP            77   0.50000   0.48599   0.61157   0.60069   0.63636   0.62618   0.96583
LoOP            80   0.60000   0.58880   0.61936   0.60870   0.63636   0.62618   0.96835
LDOF            30   0.60000   0.58880   0.62969   0.61932   0.60000   0.58880   0.94538
LDOF            35   0.50000   0.48599   0.63292   0.62264   0.63636   0.62618   0.94874
LDOF            40   0.60000   0.58880   0.64548   0.63555   0.63636   0.62618   0.93894
LDOF            89   0.50000   0.48599   0.60532   0.59426   0.60870   0.59773   0.96527
ODIN            52   0.50000   0.48599   0.60692   0.59591   0.55556   0.54311   0.95910
ODIN            55   0.60000   0.58880   0.60327   0.59216   0.60000   0.58880   0.96036
ODIN            56   0.60000   0.58880   0.60221   0.59107   0.60000   0.58880   0.96148
ODIN            79   0.60000   0.58880   0.59414   0.58277   0.66667   0.65733   0.88683
FastABOD         4   0.60000   0.58880   0.75785   0.75107   0.66667   0.65733   0.98599
FastABOD         6   0.70000   0.69160   0.71483   0.70684   0.70000   0.69160   0.98431
KDEOS            4   0.10000   0.07479   0.09659   0.07128   0.16667   0.14332   0.72241
KDEOS           57   0.10000   0.07479   0.13387   0.10961   0.32727   0.30843   0.86975
KDEOS           59   0.10000   0.07479   0.13656   0.11237   0.31579   0.29662   0.87199
KDEOS          100   0.00000  -0.02801   0.12188   0.09729   0.30508   0.28562   0.88964
LDF              7   0.60000   0.58880   0.64991   0.64010   0.60000   0.58880   0.89048
LDF             11   0.70000   0.69160   0.61557   0.60480   0.70000   0.69160   0.90980
LDF             42   0.70000   0.69160   0.63776   0.62762   0.70000   0.69160   0.95238
INFLO           21   0.60000   0.58880   0.64142   0.63138   0.60000   0.58880   0.89020
INFLO           27   0.60000   0.58880   0.64818   0.63833   0.61538   0.60461   0.89384
INFLO           48   0.50000   0.48599   0.61445   0.60365   0.57143   0.55942   0.97283
INFLO           62   0.60000   0.58880   0.62775   0.61732   0.66667   0.65733   0.96947
COF             18   0.70000   0.69160   0.68451   0.67567   0.70000   0.69160   0.94328
COF             19   0.70000   0.69160   0.70300   0.69468   0.73684   0.72947   0.93725
COF             34   0.70000   0.69160   0.72914   0.72155   0.72727   0.71963   0.92409
COF             74   0.70000   0.69160   0.69128   0.68263   0.70588   0.69764   0.97157
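
The adjusted columns in this table are numerically consistent with an adjustment for chance of the form (value - expected) / (1 - expected), where the expected value for a random ranking is taken to be the outlier rate (here 10/367). The sketch below checks this against a few P@n entries; the function name is hypothetical, and the formula is stated as an assumption that matches the tabulated values rather than as a quotation of the evaluation code.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that its expected value under random ranking is 0.

    Assumes the expected value of the measure for a random ranking equals the
    outlier rate n_outliers / n_objects and that its maximum is 1.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)

# P@n values from the table (10 outliers among 367 objects).
for raw in (0.7, 0.6, 0.1, 0.0):
    print(f"{raw:.5f} -> {adjust_for_chance(raw, 10, 367):.5f}")
# 0.70000 -> 0.69160
# 0.60000 -> 0.58880
# 0.10000 -> 0.07479
# 0.00000 -> -0.02801
```

A perfect score of 1 stays at 1 after adjustment, while scores below the outlier rate become negative, as in the KDEOS row with P@n = 0.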

Plots

[Plots not reproduced here: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity. Method labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO.]