Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

WDBC (version#07)

This data set describes characteristics of cell nuclei for breast cancer diagnosis. Again, we treat benign cases as inliers and malignant cases as outliers. In the preprocessing, we follow Zhang et al. [1], downsampling the outliers to 10 instances. The processed data set has 30 numeric attributes and 367 instances, namely 10 outliers (2.72%) and 357 inliers (97.28%).
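The preprocessing described above can be sketched in Python as follows. This is a minimal illustration, not the authors' script: the function names `downsample_outliers` and `outlier_rate`, the fixed seed, and the use of uniform random sampling are assumptions, since the text only states that the outliers were downsampled to 10. The demo uses synthetic rows with the class counts of the original WDBC data (357 benign, 212 malignant) rather than the real file.

```python
import random


def downsample_outliers(rows, n_outliers=10, seed=0):
    """Keep every benign row and a random sample of the malignant rows.

    rows: list of (features, label) pairs, with label 'B' (benign, inlier)
    or 'M' (malignant, outlier). The sampling scheme is an assumption.
    """
    inliers = [r for r in rows if r[1] == "B"]
    outliers = [r for r in rows if r[1] == "M"]
    rng = random.Random(seed)
    return inliers + rng.sample(outliers, n_outliers)


def outlier_rate(rows):
    """Fraction of rows labelled as outliers ('M')."""
    return sum(1 for _, label in rows if label == "M") / len(rows)


# Synthetic stand-in: 30 attributes per row, 357 benign and 212 malignant rows.
demo = [([0.0] * 30, "B")] * 357 + [([1.0] * 30, "M")] * 212
processed = downsample_outliers(demo)
print(len(processed), f"{100 * outlier_rate(processed):.2f}%")
```

With these counts the sketch yields 367 instances and an outlier rate of 2.72%, matching the figures quoted above.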

References:

[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.

Download all data set variants used (1.1 MB). You can also access the original data (wdbc.data).

Normalized, without duplicates

This version contains 30 attributes, 367 objects, 10 outliers (2.72%)
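As a rough illustration of what "normalized, without duplicates" could mean in practice, the sketch below drops rows with identical attribute values and then rescales each attribute to [0, 1]. Both choices are assumptions: this page does not state which normalization was applied or how duplicates were defined, and the helper names `drop_duplicates` and `min_max_normalize` are hypothetical.

```python
def drop_duplicates(data):
    """Remove rows whose attribute values are identical, keeping the first."""
    seen = set()
    unique = []
    for row in data:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def min_max_normalize(data):
    """Rescale each attribute (column) to [0, 1]; constant columns map to 0."""
    columns = list(zip(*data))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in data
    ]


# Tiny two-attribute example with one duplicated row.
sample = [[1.0, 10.0], [3.0, 30.0], [1.0, 10.0], [2.0, 20.0]]
cleaned = min_max_normalize(drop_duplicates(sample))
print(cleaned)
```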

Download raw algorithm results (3.2 MB) Download raw algorithm evaluation table (38.1 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when several values of k achieved the same score, only the smallest k is given).
The maximum F1-measure is reported as a complement to the measures in the original publication.
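The "Adj." columns report chance-adjusted versions of the corresponding measures. The sketch below assumes the usual form (value - expected) / (1 - expected), with the expected value of a random ranking taken to be the outlier rate 10/367. This page does not spell the formula out, so treat it as an assumption that happens to be consistent with the tabulated values; the function name `adjust_for_chance` is illustrative.

```python
def adjust_for_chance(value, n_outliers=10, n_objects=367):
    """Chance-adjust a measure, assuming its expected value under a random
    ranking equals the outlier rate and its optimum equals 1."""
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


# P@n = 0.20000 and P@n = 0.00000 from the table rows
print(f"{adjust_for_chance(0.20000):.5f}")  # 0.17759
print(f"{adjust_for_chance(0.00000):.5f}")  # -0.02801
```

A negative adjusted value, as in the KDEOS rows, indicates a result below what a random ranking would be expected to achieve.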

Algorithm       k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN             4   0.20000   0.17759   0.36197   0.34410   0.46154   0.44646   0.94594
KNN             6   0.30000   0.28039   0.36838   0.35069   0.44444   0.42888   0.94482
KNN            12   0.30000   0.28039   0.37588   0.35840   0.46154   0.44646   0.94678
KNN            14   0.30000   0.28039   0.37776   0.36033   0.46154   0.44646   0.94622
KNNW           10   0.30000   0.28039   0.33096   0.31222   0.44444   0.42888   0.94454
KNNW           13   0.30000   0.28039   0.34205   0.32362   0.44444   0.42888   0.94622
KNNW           34   0.30000   0.28039   0.36124   0.34335   0.46154   0.44646   0.94538
KNNW           73   0.30000   0.28039   0.37591   0.35843   0.46154   0.44646   0.94426
LOF            15   0.50000   0.48599   0.51217   0.49850   0.50000   0.48599   0.96050
LOF            77   0.60000   0.58880   0.43495   0.41912   0.60000   0.58880   0.94846
SimplifiedLOF  15   0.40000   0.38319   0.45051   0.43512   0.45455   0.43927   0.95854
SimplifiedLOF  26   0.50000   0.48599   0.41310   0.39666   0.50000   0.48599   0.95434
SimplifiedLOF  43   0.50000   0.48599   0.43381   0.41795   0.52174   0.50834   0.95966
SimplifiedLOF  63   0.50000   0.48599   0.41548   0.39911   0.57143   0.55942   0.95742
LoOP           42   0.40000   0.38319   0.43975   0.42406   0.50000   0.48599   0.95966
LoOP           43   0.40000   0.38319   0.42809   0.41207   0.50000   0.48599   0.96022
LoOP           51   0.40000   0.38319   0.41113   0.39463   0.54545   0.53272   0.95798
LoOP           83   0.50000   0.48599   0.42639   0.41032   0.54545   0.53272   0.95574
LDOF           29   0.40000   0.38319   0.37251   0.35493   0.48000   0.46543   0.95294
LDOF           44   0.40000   0.38319   0.41621   0.39986   0.52174   0.50834   0.95994
LDOF           47   0.40000   0.38319   0.39689   0.38000   0.54545   0.53272   0.95770
ODIN           60   0.40000   0.38319   0.30367   0.28416   0.45455   0.43927   0.94650
ODIN           62   0.50000   0.48599   0.32422   0.30529   0.50000   0.48599   0.94636
ODIN          100   0.50000   0.48599   0.39285   0.37584   0.50000   0.48599   0.94174
FastABOD       15   0.20000   0.17759   0.35173   0.33357   0.50000   0.48599   0.95574
FastABOD       63   0.30000   0.28039   0.39300   0.37599   0.47059   0.45576   0.95854
FastABOD       71   0.40000   0.38319   0.39383   0.37685   0.45161   0.43625   0.95770
FastABOD       74   0.40000   0.38319   0.39781   0.38094   0.45714   0.44194   0.95826
KDEOS          70   0.00000  -0.02801   0.12140   0.09679   0.26316   0.24252   0.89664
KDEOS          76   0.10000   0.07479   0.13876   0.11464   0.26087   0.24017   0.90616
KDEOS          79   0.20000   0.17759   0.13969   0.11559   0.25455   0.23366   0.90168
KDEOS          81   0.20000   0.17759   0.14295   0.11894   0.24000   0.21871   0.89804
LDF             4   0.40000   0.38319   0.48725   0.47289   0.57143   0.55942   0.90560
LDF            23   0.40000   0.38319   0.38787   0.37072   0.48000   0.46543   0.93866
LDF            27   0.50000   0.48599   0.39898   0.38215   0.54545   0.53272   0.92969
INFLO          44   0.40000   0.38319   0.41203   0.39556   0.54545   0.53272   0.95574
INFLO          85   0.50000   0.48599   0.43255   0.41665   0.54545   0.53272   0.95154
INFLO          89   0.50000   0.48599   0.44835   0.43290   0.55556   0.54311   0.95294
COF            11   0.40000   0.38319   0.37024   0.35260   0.47059   0.45576   0.93473
COF            12   0.40000   0.38319   0.39178   0.37475   0.47059   0.45576   0.93669
COF            21   0.20000   0.17759   0.32252   0.30354   0.37500   0.35749   0.94146
COF            49   0.20000   0.17759   0.32948   0.31070   0.50000   0.48599   0.90952

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity
A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF
G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, without duplicates

This version contains 30 attributes, 367 objects, 10 outliers (2.72%)

Download raw algorithm results (3.1 MB) Download raw algorithm evaluation table (31.0 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when several values of k achieved the same score, only the smallest k is given).
The maximum F1-measure is reported as a complement to the measures in the original publication.
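The tie-breaking rule (highest score wins, ties go to the smallest k) can be sketched as follows. The helper name `best_k` is hypothetical, and the dictionaries hold only the three KNNW rows listed in the table for this variant; the actual selection would run over every evaluated value of k.

```python
def best_k(scores_by_k):
    """Return (k, score) with the highest score; ties go to the smallest k."""
    return max(scores_by_k.items(), key=lambda kv: (kv[1], -kv[0]))


# KNNW rows of this variant: k -> AP, and k -> ROC AUC
knnw_ap = {1: 0.81241, 2: 0.85472, 6: 0.86098}
knnw_auc = {1: 0.99062, 2: 0.99300, 6: 0.99300}

print(best_k(knnw_ap))   # (6, 0.86098)
print(best_k(knnw_auc))  # (2, 0.993): k = 2 and k = 6 tie, the smaller k wins
```

This is why a method can appear in several rows: each row is the best k for at least one of the evaluation measures.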

Algorithm       k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN             1   0.80000   0.79440   0.87264   0.86907   0.84211   0.83768   0.99328
KNNW            1   0.80000   0.79440   0.81241   0.80715   0.84211   0.83768   0.99062
KNNW            2   0.80000   0.79440   0.85472   0.85065   0.84211   0.83768   0.99300
KNNW            6   0.80000   0.79440   0.86098   0.85708   0.80000   0.79440   0.99300
LOF            12   0.80000   0.79440   0.86508   0.86130   0.84211   0.83768   0.99076
LOF            13   0.80000   0.79440   0.86615   0.86240   0.84211   0.83768   0.99076
LOF            15   0.80000   0.79440   0.86087   0.85697   0.84211   0.83768   0.99160
SimplifiedLOF  14   0.80000   0.79440   0.84275   0.83834   0.82353   0.81859   0.98768
SimplifiedLOF  19   0.80000   0.79440   0.84770   0.84344   0.84211   0.83768   0.98852
SimplifiedLOF  21   0.80000   0.79440   0.84996   0.84576   0.84211   0.83768   0.98936
LoOP           20   0.80000   0.79440   0.82940   0.82463   0.80000   0.79440   0.98711
LoOP           24   0.80000   0.79440   0.84314   0.83875   0.82353   0.81859   0.98711
LoOP           35   0.80000   0.79440   0.83481   0.83019   0.84211   0.83768   0.98571
LDOF           20   0.80000   0.79440   0.72491   0.71720   0.80000   0.79440   0.97171
LDOF           29   0.80000   0.79440   0.79229   0.78647   0.84211   0.83768   0.97731
LDOF           37   0.80000   0.79440   0.83004   0.82528   0.80000   0.79440   0.98291
LDOF           48   0.70000   0.69160   0.80974   0.80441   0.76190   0.75524   0.98431
ODIN           29   0.56000   0.54768   0.53417   0.52112   0.57143   0.55942   0.97241
ODIN           49   0.75000   0.74300   0.78230   0.77620   0.77778   0.77155   0.90924
ODIN           50   0.80000   0.79440   0.76683   0.76029   0.80000   0.79440   0.91022
FastABOD        3   0.80000   0.79440   0.79984   0.79423   0.80000   0.79440   0.98711
FastABOD        4   0.80000   0.79440   0.82829   0.82348   0.84211   0.83768   0.99132
FastABOD       12   0.80000   0.79440   0.85089   0.84671   0.84211   0.83768   0.99384
FastABOD       18   0.80000   0.79440   0.85867   0.85471   0.80000   0.79440   0.99384
KDEOS           2   0.00000  -0.02801   0.02825   0.00103   0.06504   0.03885   0.46541
KDEOS          62   0.00000  -0.02801   0.12157   0.09697   0.30508   0.28562   0.89692
KDEOS          97   0.00000  -0.02801   0.11640   0.09165   0.31250   0.29324   0.89300
LDF             7   0.80000   0.79440   0.88557   0.88237   0.82353   0.81859   0.99468
LDF             9   0.80000   0.79440   0.87100   0.86738   0.84211   0.83768   0.99328
INFLO          15   0.80000   0.79440   0.83381   0.82915   0.82353   0.81859   0.98459
INFLO          19   0.80000   0.79440   0.82870   0.82390   0.84211   0.83768   0.98347
INFLO          21   0.80000   0.79440   0.83739   0.83283   0.84211   0.83768   0.98655
COF             9   0.80000   0.79440   0.68220   0.67330   0.80000   0.79440   0.98866
COF            12   0.80000   0.79440   0.87627   0.87280   0.81818   0.81309   0.99552
COF            13   0.80000   0.79440   0.88902   0.88591   0.84211   0.83768   0.99524

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity
A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF
G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO