Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

WDBC (version#01)

This data set describes characteristics of cell nuclei for breast cancer diagnosis. We treat benign cases as inliers and malignant cases as outliers. In the preprocessing, we follow Zhang et al. [1] and downsample the outlier class to 10 instances. The processed data set has 30 numeric attributes and 367 instances, namely 10 outliers (2.72%) and 357 inliers (97.28%).
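The downsampling step and the resulting outlier rate can be illustrated with a minimal sketch. It assumes the class counts of the original WDBC file (357 benign, 212 malignant) and the labels "B" and "M" used there; the function name `downsample_outliers` and the random seed are hypothetical, and a random draw will generally not reproduce the exact 10 outliers selected for this data set.

```python
import random

def downsample_outliers(labels, n_outliers, seed=0):
    """Return sorted indices keeping every inlier and a random sample of outliers.

    Assumption: "B" marks inliers (benign) and "M" marks outliers (malignant).
    """
    rng = random.Random(seed)  # hypothetical seed, for repeatability only
    inlier_idx = [i for i, y in enumerate(labels) if y == "B"]
    outlier_idx = [i for i, y in enumerate(labels) if y == "M"]
    kept_outliers = rng.sample(outlier_idx, n_outliers)
    return sorted(inlier_idx + kept_outliers)

# Class counts of the original WDBC file: 357 benign, 212 malignant.
labels = ["B"] * 357 + ["M"] * 212
kept = downsample_outliers(labels, n_outliers=10)

n_total = len(kept)
n_out = sum(1 for i in kept if labels[i] == "M")
print(n_total, n_out, f"{100 * n_out / n_total:.2f}%")
```

The printed counts correspond to the 367 instances and the 2.72% outlier rate stated above.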

References:

[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.

Download all data set variants used (1.1 MB). You can also access the original data (wdbc.data).

Normalized, without duplicates

This version contains 30 attributes and 367 objects, of which 10 are outliers (2.72%).

Download raw algorithm results (3.3 MB)
Download raw algorithm evaluation table (40.1 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure; when the same score was achieved for more than one k, only the smallest k is given.
The Maximum F1-Measure is reported as a supplement to the measures in the original publication.

Algorithm      k    P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. MF1  ROC AUC
KNN            3    0.60000   0.58880   0.48659   0.47221   0.60000   0.58880   0.91681
KNN            4    0.60000   0.58880   0.54380   0.53102   0.66667   0.65733   0.90784
KNN            7    0.60000   0.58880   0.61650   0.60576   0.66667   0.65733   0.92325
KNN            94   0.60000   0.58880   0.59826   0.58700   0.63158   0.62126   0.93193
KNNW           5    0.60000   0.58880   0.46115   0.44605   0.60000   0.58880   0.91289
KNNW           17   0.60000   0.58880   0.57934   0.56755   0.66667   0.65733   0.91989
KNNW           100  0.60000   0.58880   0.59145   0.58000   0.63158   0.62126   0.92521
LOF            13   0.60000   0.58880   0.60579   0.59475   0.60870   0.59773   0.89552
LOF            28   0.60000   0.58880   0.67168   0.66249   0.70588   0.69764   0.90644
LOF            96   0.60000   0.58880   0.63019   0.61983   0.66667   0.65733   0.93221
SimplifiedLOF  23   0.60000   0.58880   0.58873   0.57721   0.63636   0.62618   0.88207
SimplifiedLOF  26   0.60000   0.58880   0.58964   0.57814   0.66667   0.65733   0.88627
SimplifiedLOF  39   0.60000   0.58880   0.64803   0.63817   0.66667   0.65733   0.89412
SimplifiedLOF  94   0.60000   0.58880   0.62587   0.61539   0.66667   0.65733   0.92185
LoOP           35   0.50000   0.48599   0.54454   0.53179   0.63636   0.62618   0.88123
LoOP           38   0.60000   0.58880   0.54912   0.53649   0.63636   0.62618   0.88796
LoOP           94   0.60000   0.58880   0.60146   0.59030   0.63158   0.62126   0.91933
LDOF           22   0.50000   0.48599   0.36544   0.34766   0.50000   0.48599   0.85014
LDOF           83   0.50000   0.48599   0.55933   0.54698   0.58824   0.57670   0.91401
LDOF           88   0.50000   0.48599   0.57163   0.55963   0.60870   0.59773   0.90980
LDOF           94   0.50000   0.48599   0.57213   0.56015   0.60870   0.59773   0.91373
ODIN           92   0.50000   0.48599   0.32862   0.30981   0.54545   0.53272   0.90686
ODIN           94   0.50000   0.48599   0.34717   0.32889   0.57143   0.55942   0.91008
ODIN           100  0.50000   0.48599   0.38808   0.37094   0.57143   0.55942   0.91204
FastABOD       17   0.50000   0.48599   0.44011   0.42443   0.54545   0.53272   0.93641
FastABOD       44   0.50000   0.48599   0.51323   0.49959   0.60870   0.59773   0.94062
FastABOD       67   0.50000   0.48599   0.52289   0.50953   0.60870   0.59773   0.94370
FastABOD       89   0.50000   0.48599   0.51709   0.50356   0.60870   0.59773   0.94426
KDEOS          2    0.10000   0.07479   0.05101   0.02443   0.13793   0.11378   0.54692
KDEOS          74   0.00000   -0.02801  0.08773   0.06217   0.20690   0.18468   0.80644
KDEOS          100  0.00000   -0.02801  0.08225   0.05654   0.18919   0.16648   0.81176
LDF            4    0.70000   0.69160   0.67433   0.66521   0.70000   0.69160   0.93529
LDF            35   0.60000   0.58880   0.62037   0.60974   0.63158   0.62126   0.94426
INFLO          25   0.60000   0.58880   0.52317   0.50981   0.60000   0.58880   0.89244
INFLO          49   0.60000   0.58880   0.59998   0.58877   0.66667   0.65733   0.90756
INFLO          94   0.60000   0.58880   0.61093   0.60003   0.66667   0.65733   0.92717
COF            35   0.60000   0.58880   0.48118   0.46665   0.60870   0.59773   0.91345
COF            43   0.60000   0.58880   0.54885   0.53621   0.60000   0.58880   0.91036
COF            51   0.50000   0.48599   0.43925   0.42354   0.60870   0.59773   0.92381
COF            53   0.60000   0.58880   0.46274   0.44769   0.66667   0.65733   0.92045
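
The adjusted columns in the table above appear consistent with a standard adjustment for chance, (value - expected) / (1 - expected), where the expected value of a random ranking is taken to be the outlier rate, 10/367. The sketch below assumes that formula; the function name `adjust_for_chance` is hypothetical and this is not necessarily how the values were originally computed.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that a random ranking scores 0 and a perfect one scores 1.

    Assumption: the expected value under a random ranking is n_outliers / n_objects.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)

# Unadjusted values taken from the P@n and AP columns above.
for value in (0.70000, 0.60000, 0.50000, 0.10000, 0.00000, 0.48659):
    adjusted = adjust_for_chance(value, n_outliers=10, n_objects=367)
    print(f"{value:.5f} -> {adjusted:.5f}")
```

Under this assumption, P@n = 0.60000 maps to 0.58880 and P@n = 0.00000 maps to -0.02801, as listed in the Adj. P@n column. A negative adjusted value means the result is worse than what a random ranking would be expected to achieve.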

Plots

[Plots not reproduced: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Method key: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, without duplicates

This version contains 30 attributes and 367 objects, of which 10 are outliers (2.72%).

Download raw algorithm results (3.1 MB)
Download raw algorithm evaluation table (30.5 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure; when the same score was achieved for more than one k, only the smallest k is given.
The Maximum F1-Measure is reported as a supplement to the measures in the original publication.

Algorithm      k    P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. MF1  ROC AUC
KNN            1    0.80000   0.79440   0.85293   0.84881   0.85714   0.85314   0.97031
KNN            2    0.80000   0.79440   0.88842   0.88530   0.84211   0.83768   0.99216
KNN            3    0.80000   0.79440   0.87987   0.87650   0.84211   0.83768   0.99412
KNNW           1    0.90000   0.89720   0.82303   0.81807   0.90000   0.89720   0.93529
KNNW           7    0.80000   0.79440   0.88133   0.87801   0.84211   0.83768   0.99328
LOF            10   0.80000   0.79440   0.88151   0.87819   0.88889   0.88578   0.99076
SimplifiedLOF  11   0.80000   0.79440   0.80845   0.80309   0.80000   0.79440   0.97619
SimplifiedLOF  12   0.80000   0.79440   0.84028   0.83580   0.84211   0.83768   0.98039
SimplifiedLOF  13   0.80000   0.79440   0.86229   0.85843   0.84211   0.83768   0.98431
SimplifiedLOF  20   0.80000   0.79440   0.85264   0.84851   0.84211   0.83768   0.98627
LoOP           23   0.70000   0.69160   0.83440   0.82976   0.78261   0.77652   0.98571
LoOP           25   0.80000   0.79440   0.85340   0.84929   0.80000   0.79440   0.98403
LoOP           29   0.80000   0.79440   0.85408   0.85000   0.82353   0.81859   0.98067
LoOP           46   0.80000   0.79440   0.85319   0.84908   0.84211   0.83768   0.97563
LDOF           19   0.80000   0.79440   0.70713   0.69893   0.80000   0.79440   0.97675
LDOF           80   0.80000   0.79440   0.84491   0.84056   0.84211   0.83768   0.98319
LDOF           87   0.80000   0.79440   0.85568   0.85164   0.88889   0.88578   0.98319
ODIN           48   0.80000   0.79440   0.78005   0.77389   0.80000   0.79440   0.91555
ODIN           55   0.80000   0.79440   0.78762   0.78168   0.84211   0.83768   0.93375
ODIN           68   0.80000   0.79440   0.82332   0.81837   0.84211   0.83768   0.96667
ODIN           97   0.80000   0.79440   0.82076   0.81574   0.84211   0.83768   0.97521
FastABOD       3    0.80000   0.79440   0.86614   0.86239   0.85714   0.85314   0.98880
KDEOS          2    0.10000   0.07479   0.03748   0.01052   0.10811   0.08313   0.46793
KDEOS          62   0.00000   -0.02801  0.11214   0.08727   0.28571   0.26571   0.88459
KDEOS          69   0.00000   -0.02801  0.11128   0.08638   0.29508   0.27534   0.88319
KDEOS          100  0.00000   -0.02801  0.11145   0.08656   0.29032   0.27044   0.88655
LDF            4    0.80000   0.79440   0.87516   0.87166   0.84211   0.83768   0.99272
LDF            10   0.80000   0.79440   0.86737   0.86365   0.88889   0.88578   0.98599
LDF            36   0.80000   0.79440   0.91389   0.91148   0.84211   0.83768   0.99692
INFLO          13   0.80000   0.79440   0.83694   0.83237   0.80000   0.79440   0.98123
INFLO          20   0.80000   0.79440   0.85583   0.85179   0.84211   0.83768   0.98235
INFLO          22   0.80000   0.79440   0.86229   0.85843   0.84211   0.83768   0.98431
COF            13   0.80000   0.79440   0.86297   0.85914   0.84211   0.83768   0.98095
COF            35   0.80000   0.79440   0.86142   0.85753   0.88889   0.88578   0.96765
COF            95   0.80000   0.79440   0.89294   0.88994   0.88889   0.88578   0.99356
COF            97   0.80000   0.79440   0.89471   0.89176   0.88889   0.88578   0.99356
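
For readers who want to recompute the unadjusted measure columns from a ranking, the following minimal sketch uses the textbook definitions of precision at n (with n set to the number of outliers), average precision, and maximum F1 over all cutoffs. The function names and the toy ranking are hypothetical, and tied outlier scores are not handled, so results may differ from the published values where ties occur.

```python
def precision_at_n(ranked_labels, n=None):
    """Fraction of true outliers among the top n; n defaults to the number of outliers."""
    if n is None:
        n = sum(ranked_labels)
    return sum(ranked_labels[:n]) / n

def average_precision(ranked_labels):
    """Mean of the precision values at the rank of each true outlier."""
    hits, total = 0, 0.0
    for rank, is_outlier in enumerate(ranked_labels, start=1):
        if is_outlier:
            hits += 1
            total += hits / rank
    return total / hits

def max_f1(ranked_labels):
    """Largest F1 score over all cutoffs; F1 at a cutoff equals 2*hits / (rank + n_outliers)."""
    n_outliers = sum(ranked_labels)
    hits, best = 0, 0.0
    for rank, is_outlier in enumerate(ranked_labels, start=1):
        hits += is_outlier
        best = max(best, 2 * hits / (rank + n_outliers))
    return best

# Toy ranking, most outlying object first: 1 = outlier, 0 = inlier.
ranked = [1, 1, 0, 1, 0, 0, 0, 0]
p_at_n = precision_at_n(ranked)
ap = average_precision(ranked)
mf1 = max_f1(ranked)
print(f"P@n={p_at_n:.5f} AP={ap:.5f} Max-F1={mf1:.5f}")
```

With 10 outliers, a Max-F1 of 0.88889 in the table corresponds to 16/18 under this definition, that is, 8 outliers found within the top 8 ranks.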

Plots

[Plots not reproduced: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Method key: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO