Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

WDBC (version#02)

This data set describes characteristics of cell nuclei for breast cancer diagnosis. Again, we treat benign examples as inliers and malignant examples as outliers. In the preprocessing, we follow Zhang et al. [1] and downsample the outliers to 10 instances. The processed data set has 30 numeric attributes and 367 instances: 10 outliers (2.72%) and 357 inliers (97.28%).
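The downsampling step and the resulting class proportions can be illustrated with a minimal sketch. It is not the procedure used to build the published variants: the function name, the random seed, and the uniform sampling are assumptions, and the stand-in labels only mirror the class counts of the original WDBC data (357 benign, 212 malignant) instead of reading wdbc.data.

```python
import random

def downsample_outliers(labels, outlier_label="M", n_outliers=10, seed=0):
    """Keep every inlier and a uniform random sample of n_outliers outliers.

    Returns the sorted indices of the retained instances.
    Hypothetical helper; the seed and sampling scheme are assumptions.
    """
    rng = random.Random(seed)
    outlier_idx = [i for i, y in enumerate(labels) if y == outlier_label]
    inlier_idx = [i for i, y in enumerate(labels) if y != outlier_label]
    kept_outliers = rng.sample(outlier_idx, n_outliers)
    return sorted(inlier_idx + kept_outliers)

# Stand-in labels with the class counts of the original data:
# 357 benign ("B") and 212 malignant ("M") instances.
labels = ["B"] * 357 + ["M"] * 212

kept = downsample_outliers(labels)
n_total = len(kept)
n_out = sum(1 for i in kept if labels[i] == "M")
outlier_pct = round(100 * n_out / n_total, 2)
inlier_pct = round(100 * (n_total - n_out) / n_total, 2)
print(n_total, n_out, outlier_pct, inlier_pct)
```

Different seeds select different malignant instances, so a sketch like this reproduces the counts and percentages above but not necessarily the exact instances of the published data set.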

References:

[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.

Download all data set variants used (1.1 MB). You can also access the original data (wdbc.data).

Normalized, without duplicates

This version contains 30 attributes, 367 objects, 10 outliers (2.72%)

Download raw algorithm results (3.2 MB) Download raw algorithm evaluation table (39.4 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure; when several values of k achieved the same score, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures in the original publication.
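The measures reported here can all be computed from a ranking of the objects by outlier score. The following is a minimal sketch, not the implementation used in the study: the function names and the toy scores are hypothetical, ties between scores are ignored, and n defaults to the number of outliers.

```python
def ranking(scores, labels):
    """Return the labels (1 = outlier, 0 = inlier) ordered by decreasing score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]

def precision_at_n(ranked, n=None):
    """Fraction of outliers among the top n objects (default: n = number of outliers)."""
    n = sum(ranked) if n is None else n
    return sum(ranked[:n]) / n

def average_precision(ranked):
    """Mean of the precision values taken at the rank of each outlier."""
    hits, total = 0, 0.0
    for rank, y in enumerate(ranked, start=1):
        if y:
            hits += 1
            total += hits / rank
    return total / hits

def max_f1(ranked):
    """Largest F1 score over all cutoffs of the ranking."""
    n_out, hits, best = sum(ranked), 0, 0.0
    for rank, y in enumerate(ranked, start=1):
        hits += y
        # F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (rank + n_out)
        best = max(best, 2 * hits / (rank + n_out))
    return best

def roc_auc(ranked):
    """Fraction of (outlier, inlier) pairs in which the outlier is ranked first."""
    n_out = sum(ranked)
    n_in = len(ranked) - n_out
    inliers_seen, wrong = 0, 0
    for y in ranked:
        if y:
            wrong += inliers_seen  # inliers ranked ahead of this outlier
        else:
            inliers_seen += 1
    return 1 - wrong / (n_out * n_in)

# Hypothetical toy data: six objects, two of them outliers.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 0, 0]
ranked = ranking(scores, labels)
print(precision_at_n(ranked), average_precision(ranked), max_f1(ranked), roc_auc(ranked))
```

Fractional values such as a P@n of 0.45 with 10 outliers suggest that the study's implementation handles tied scores specially, which this sketch does not attempt.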

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN             12   0.50000   0.48599   0.35911   0.34116   0.50000   0.48599   0.91485
KNN             65   0.40000   0.38319   0.36815   0.35045   0.45455   0.43927   0.93389
KNN             73   0.40000   0.38319   0.36045   0.34253   0.45455   0.43927   0.93417
KNNW            13   0.40000   0.38319   0.33533   0.31671   0.45455   0.43927   0.91709
KNNW            80   0.40000   0.38319   0.35823   0.34025   0.45455   0.43927   0.92857
LOF             13   0.50000   0.48599   0.40474   0.38806   0.50000   0.48599   0.89272
LOF             17   0.50000   0.48599   0.41004   0.39351   0.54545   0.53272   0.89496
LOF            100   0.40000   0.38319   0.38836   0.37123   0.47059   0.45576   0.93557
SimplifiedLOF   16   0.40000   0.38319   0.42530   0.40920   0.44444   0.42888   0.89468
SimplifiedLOF   19   0.50000   0.48599   0.36955   0.35189   0.54545   0.53272   0.89720
SimplifiedLOF   20   0.50000   0.48599   0.36734   0.34962   0.57143   0.55942   0.89440
SimplifiedLOF   99   0.40000   0.38319   0.39790   0.38103   0.47059   0.45576   0.93193
LoOP            20   0.50000   0.48599   0.38041   0.36306   0.57143   0.55942   0.89188
LoOP            65   0.40000   0.38319   0.40062   0.38383   0.50000   0.48599   0.92073
LoOP            99   0.40000   0.38319   0.38713   0.36996   0.48000   0.46543   0.93081
LDOF            22   0.40000   0.38319   0.38722   0.37006   0.54545   0.53272   0.91709
LDOF            24   0.50000   0.48599   0.36318   0.34534   0.50000   0.48599   0.90700
LDOF            31   0.50000   0.48599   0.39278   0.37577   0.52632   0.51305   0.91092
LDOF           100   0.40000   0.38319   0.37483   0.35732   0.50000   0.48599   0.93053
ODIN            78   0.45000   0.43459   0.32510   0.30619   0.47619   0.46152   0.90588
ODIN           100   0.40000   0.38319   0.34102   0.32256   0.45455   0.43927   0.91373
FastABOD        10   0.40000   0.38319   0.35736   0.33936   0.44444   0.42888   0.94678
FastABOD        49   0.50000   0.48599   0.39036   0.37328   0.50000   0.48599   0.94230
FastABOD        84   0.50000   0.48599   0.39281   0.37581   0.50000   0.48599   0.94370
KDEOS           70   0.20000   0.17759   0.11304   0.08819   0.21053   0.18841   0.84818
KDEOS           82   0.00000  -0.02801   0.13131   0.10698   0.28571   0.26571   0.86443
KDEOS           83   0.00000  -0.02801   0.13244   0.10814   0.31250   0.29324   0.86359
KDEOS           84   0.00000  -0.02801   0.13255   0.10826   0.29412   0.27435   0.86443
LDF              2   0.40000   0.38319   0.46689   0.45196   0.53333   0.52026   0.80756
LDF             17   0.50000   0.48599   0.35093   0.33275   0.50000   0.48599   0.90364
LDF             62   0.40000   0.38319   0.33667   0.31809   0.45455   0.43927   0.93613
INFLO           20   0.50000   0.48599   0.36426   0.34646   0.52174   0.50834   0.89384
INFLO           45   0.40000   0.38319   0.37000   0.35235   0.54545   0.53272   0.92605
INFLO           73   0.40000   0.38319   0.39040   0.37333   0.48000   0.46543   0.93473
INFLO           82   0.40000   0.38319   0.40318   0.38646   0.50000   0.48599   0.93333
COF             10   0.40000   0.38319   0.35246   0.33432   0.40000   0.38319   0.86218
COF             11   0.30000   0.28039   0.35983   0.34190   0.36364   0.34581   0.85938
COF             13   0.40000   0.38319   0.30996   0.29063   0.44444   0.42888   0.85294
COF             70   0.30000   0.28039   0.28112   0.26098   0.33333   0.31466   0.90224
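The adjusted columns follow the adjustment for chance described in the original publication: a measure m is rescaled to (m - e) / (1 - e), where e = |O| / N is the value expected from a random ranking, with |O| outliers among N objects. The sketch below checks this against a few table entries. The function name is hypothetical, and applying the same formula to Max-F1 is an assumption that is merely consistent with the values in the table.

```python
def adjust_for_chance(value, n_outliers=10, n_objects=367):
    """Rescale a measure so that a random ranking scores 0 and a perfect one scores 1."""
    expected = n_outliers / n_objects  # 10 / 367, about 0.02725
    return (value - expected) / (1 - expected)

# (raw value, adjusted value) pairs copied from the table above.
table_pairs = [
    (0.50000, 0.48599),   # P@n, KNN k=12
    (0.40000, 0.38319),   # P@n, KNN k=65
    (0.00000, -0.02801),  # P@n, KDEOS k=82
    (0.35911, 0.34116),   # AP, KNN k=12
    (0.57143, 0.55942),   # Max-F1, SimplifiedLOF k=20
]

# Largest deviation between the recomputed and the tabulated adjusted values.
max_error = max(abs(adjust_for_chance(raw) - adj) for raw, adj in table_pairs)
print(f"largest deviation: {max_error:.1e}")
```

The tolerance needed for agreement is on the order of the rounding of the table entries to five decimals.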

Plots

[Plots: Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity. Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO]

Not normalized, without duplicates

This version contains 30 attributes, 367 objects, 10 outliers (2.72%)

Download raw algorithm results (3.1 MB) Download raw algorithm evaluation table (30.6 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure; when several values of k achieved the same score, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures in the original publication.

Algorithm        k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN              1   0.80000   0.79440   0.87349   0.86995   0.85714   0.85314   0.99398
KNN              3   0.80000   0.79440   0.87851   0.87511   0.85714   0.85314   0.99468
KNNW             1   0.80000   0.79440   0.89030   0.88723   0.88889   0.88578   0.99202
KNNW             4   0.80000   0.79440   0.88559   0.88238   0.85714   0.85314   0.99468
LOF             11   0.70000   0.69160   0.87215   0.86857   0.77778   0.77155   0.99496
LOF             13   0.70000   0.69160   0.87530   0.87181   0.77778   0.77155   0.99496
LOF             14   0.80000   0.79440   0.85585   0.85181   0.80000   0.79440   0.99384
SimplifiedLOF   14   0.70000   0.69160   0.86122   0.85733   0.81818   0.81309   0.99272
SimplifiedLOF   19   0.70000   0.69160   0.85544   0.85139   0.77778   0.77155   0.99440
SimplifiedLOF   27   0.80000   0.79440   0.86512   0.86134   0.80000   0.79440   0.99440
LoOP            14   0.70000   0.69160   0.71858   0.71069   0.70000   0.69160   0.98207
LoOP            32   0.70000   0.69160   0.85068   0.84650   0.77778   0.77155   0.99244
LoOP            33   0.70000   0.69160   0.85760   0.85361   0.77778   0.77155   0.99328
LDOF            18   0.70000   0.69160   0.67869   0.66969   0.70000   0.69160   0.97955
LDOF            19   0.70000   0.69160   0.74290   0.73570   0.77778   0.77155   0.98403
LDOF            39   0.70000   0.69160   0.82055   0.81552   0.72727   0.71963   0.99300
ODIN            40   0.70000   0.69160   0.73696   0.72960   0.70000   0.69160   0.98711
ODIN            41   0.70000   0.69160   0.78426   0.77822   0.75000   0.74300   0.98739
ODIN            53   0.66667   0.65733   0.78700   0.78103   0.72727   0.71963   0.98768
FastABOD         4   0.80000   0.79440   0.86879   0.86511   0.85714   0.85314   0.99440
FastABOD        17   0.80000   0.79440   0.88938   0.88628   0.85714   0.85314   0.99524
KDEOS            4   0.20000   0.17759   0.11825   0.09355   0.25000   0.22899   0.69832
KDEOS           61   0.00000  -0.02801   0.13555   0.11133   0.32787   0.30904   0.90952
KDEOS           62   0.00000  -0.02801   0.12992   0.10555   0.33898   0.32047   0.90448
LDF              7   0.70000   0.69160   0.84106   0.83661   0.77778   0.77155   0.99216
LDF             10   0.80000   0.79440   0.85436   0.85028   0.84211   0.83768   0.99160
INFLO           11   0.70000   0.69160   0.76893   0.76245   0.76190   0.75524   0.97927
INFLO           15   0.70000   0.69160   0.82999   0.82522   0.77778   0.77155   0.98880
INFLO           18   0.70000   0.69160   0.85878   0.85482   0.77778   0.77155   0.99356
INFLO           19   0.70000   0.69160   0.85179   0.84763   0.77778   0.77155   0.99384
COF             13   0.60000   0.58880   0.79833   0.79268   0.75000   0.74300   0.98978
COF             20   0.80000   0.79440   0.85139   0.84723   0.80000   0.79440   0.98641
COF             23   0.80000   0.79440   0.86414   0.86033   0.84211   0.83768   0.98613
COF             45   0.80000   0.79440   0.87313   0.86957   0.84211   0.83768   0.98852
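The selection rule stated above the table (best score per measure, smallest k on ties) can be sketched as follows. The helper name is hypothetical, and only the two KNN rows of this table are used as input, whereas the actual selection ran over all evaluated values of k.

```python
def best_k(rows, measure):
    """Return the row with the highest value of `measure`; ties go to the smallest k."""
    return min(rows, key=lambda row: (-row[measure], row["k"]))

# The two KNN rows from the table above (non-normalized data).
knn_rows = [
    {"k": 1, "P@n": 0.80000, "AP": 0.87349, "Max-F1": 0.85714, "ROC AUC": 0.99398},
    {"k": 3, "P@n": 0.80000, "AP": 0.87851, "Max-F1": 0.85714, "ROC AUC": 0.99468},
]

# Map each measure to the k that would be reported for it.
selected = {m: best_k(knn_rows, m)["k"] for m in ("P@n", "AP", "Max-F1", "ROC AUC")}
print(selected)
```

P@n and Max-F1 are tied between k = 1 and k = 3, so the rule reports k = 1 for both, while AP and ROC AUC are highest at k = 3. This is why a method can appear with fewer rows than there are measures.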

Plots

[Plots: Precision at n; Adjusted precision at n; Average precision; Adjusted average precision; Maximum F1 score; Adjusted maximum F1 score; ROC AUC; Diversity. Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO]