Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

WDBC (version#04)

This data set describes nuclear characteristics for breast cancer diagnosis. Again, we consider examples of benign cancer as inliers and malignant cancer as outliers. In the preprocessing, we follow Zhang et al. [1], downsampling the outliers to 10. The processed database has 30 numeric attributes and 367 instances, namely 10 outliers (2.72%) and 357 inliers (97.28%).
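The class counts above can be reproduced with a short sketch. It assumes the 10 outliers are drawn uniformly at random from the 212 malignant records of the original WDBC data; the function name `downsample_outliers` and the seed are hypothetical, and the selection actually used for this version is not specified here.

```python
import random

def downsample_outliers(labels, outlier_label, n_keep, seed=0):
    """Return sorted indices of all inliers plus n_keep randomly chosen outliers."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    outlier_idx = [i for i, y in enumerate(labels) if y == outlier_label]
    inlier_idx = [i for i, y in enumerate(labels) if y != outlier_label]
    kept_outliers = rng.sample(outlier_idx, n_keep)
    return sorted(inlier_idx + kept_outliers)

# Class counts of the original WDBC data: 357 benign (B), 212 malignant (M).
labels = ["B"] * 357 + ["M"] * 212

kept = downsample_outliers(labels, "M", n_keep=10)
n_outliers = sum(1 for i in kept if labels[i] == "M")
outlier_share = 100.0 * n_outliers / len(kept)

print(len(kept), n_outliers, round(outlier_share, 2))
```

With these counts the sketch yields 367 instances, 10 of them outliers, which is the 2.72% share stated above.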

References:

[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.

Download all data set variants used (1.1 MB). You can also access the original data (wdbc.data).

Normalized, without duplicates

This version contains 30 attributes, 367 objects, 10 outliers (2.72%)
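As a rough illustration of what "normalized, without duplicates" could mean in practice, the sketch below removes exact duplicate rows and then rescales each attribute to [0, 1]. Min-max scaling, the order of the two steps, and the helper names `drop_duplicates` and `min_max_normalize` are assumptions for illustration; this page does not state the exact procedure.

```python
def drop_duplicates(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique

def min_max_normalize(rows):
    """Rescale every attribute (column) linearly to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for v, lo, hi in zip(row, lows, highs)
        )
        for row in rows
    ]

# Toy data with two attributes; the third row duplicates the first.
rows = [(1.0, 10.0), (3.0, 20.0), (1.0, 10.0), (5.0, 20.0)]
unique_rows = drop_duplicates(rows)
normalized = min_max_normalize(unique_rows)
print(normalized)
```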

Download raw algorithm results (3.3 MB) Download raw algorithm evaluation table (36.1 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The Maximum F1-Measure is reported in addition to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 2 0.50000 0.48599 0.51125 0.49756 0.54545 0.53272 0.95098
KNN 7 0.50000 0.48599 0.53156 0.51843 0.62500 0.61450 0.94566
KNN 94 0.50000 0.48599 0.54766 0.53499 0.62500 0.61450 0.95798
KNN 96 0.50000 0.48599 0.54712 0.53443 0.62500 0.61450 0.95854
KNNW 4 0.50000 0.48599 0.45488 0.43961 0.50000 0.48599 0.95126
KNNW 41 0.50000 0.48599 0.53115 0.51801 0.62500 0.61450 0.95070
KNNW 99 0.50000 0.48599 0.53913 0.52622 0.62500 0.61450 0.95462
LOF 11 0.60000 0.58880 0.57304 0.56108 0.63158 0.62126 0.95014
LOF 15 0.50000 0.48599 0.64857 0.63873 0.66667 0.65733 0.95966
LOF 18 0.50000 0.48599 0.66049 0.65098 0.66667 0.65733 0.96527
SimplifiedLOF 14 0.50000 0.48599 0.45707 0.44186 0.52174 0.50834 0.95938
SimplifiedLOF 18 0.50000 0.48599 0.63652 0.62634 0.58824 0.57670 0.96611
SimplifiedLOF 19 0.50000 0.48599 0.61981 0.60916 0.62500 0.61450 0.96527
SimplifiedLOF 22 0.50000 0.48599 0.62246 0.61189 0.62500 0.61450 0.96639
LoOP 19 0.50000 0.48599 0.50866 0.49489 0.56000 0.54768 0.96134
LoOP 21 0.50000 0.48599 0.56642 0.55427 0.53846 0.52553 0.96415
LoOP 27 0.50000 0.48599 0.57787 0.56604 0.62500 0.61450 0.95798
LoOP 32 0.50000 0.48599 0.58334 0.57167 0.62500 0.61450 0.95770
LDOF 18 0.50000 0.48599 0.44003 0.42434 0.50000 0.48599 0.95770
LDOF 41 0.50000 0.48599 0.58446 0.57282 0.62500 0.61450 0.95994
LDOF 44 0.50000 0.48599 0.59228 0.58086 0.62500 0.61450 0.96190
LDOF 72 0.50000 0.48599 0.56259 0.55034 0.62500 0.61450 0.96471
ODIN 59 0.50000 0.48599 0.43157 0.41564 0.50000 0.48599 0.94594
ODIN 77 0.50000 0.48599 0.45893 0.44378 0.55556 0.54311 0.94888
ODIN 79 0.50000 0.48599 0.46073 0.44562 0.55556 0.54311 0.94986
ODIN 100 0.50000 0.48599 0.43441 0.41856 0.55556 0.54311 0.95168
FastABOD 7 0.60000 0.58880 0.53564 0.52263 0.60000 0.58880 0.96863
FastABOD 12 0.60000 0.58880 0.58870 0.57717 0.63636 0.62618 0.97227
FastABOD 63 0.50000 0.48599 0.60478 0.59371 0.60870 0.59773 0.97423
KDEOS 3 0.10000 0.07479 0.04233 0.01551 0.10000 0.07479 0.58347
KDEOS 84 0.00000 -0.02801 0.13045 0.10609 0.26667 0.24613 0.90084
KDEOS 85 0.00000 -0.02801 0.12662 0.10215 0.27907 0.25888 0.89748
LDF 3 0.50000 0.48599 0.57480 0.56289 0.62500 0.61450 0.88011
LDF 5 0.60000 0.58880 0.54465 0.53190 0.62500 0.61450 0.93389
LDF 12 0.50000 0.48599 0.57514 0.56324 0.62500 0.61450 0.92381
LDF 38 0.50000 0.48599 0.54250 0.52968 0.62500 0.61450 0.95518
INFLO 17 0.50000 0.48599 0.53183 0.51871 0.52632 0.51305 0.95966
INFLO 22 0.50000 0.48599 0.53117 0.51803 0.55556 0.54311 0.96162
INFLO 25 0.50000 0.48599 0.54511 0.53237 0.62500 0.61450 0.95826
INFLO 45 0.50000 0.48599 0.57930 0.56751 0.62500 0.61450 0.95490
COF 14 0.50000 0.48599 0.39556 0.37863 0.50000 0.48599 0.94090
COF 23 0.50000 0.48599 0.44550 0.42996 0.62500 0.61450 0.94426
COF 33 0.50000 0.48599 0.50273 0.48880 0.62500 0.61450 0.94314
COF 36 0.50000 0.48599 0.39747 0.38060 0.52632 0.51305 0.94762
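The adjusted columns in the table above are consistent with an adjustment for chance of the form (value - expected) / (1 - expected), where the expected value is the outlier rate 10/367. The sketch below checks this against a few table entries; the function name `adjust_for_chance` is hypothetical, and the formula is inferred from the reported numbers rather than quoted from this page.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that the chance level maps to 0 and a perfect score to 1."""
    expected = n_outliers / n_objects  # value expected from a random ranking
    return (value - expected) / (1.0 - expected)

N_OUTLIERS, N_OBJECTS = 10, 367

# (raw value, adjusted value as printed in the table)
checks = [
    (0.50000, 0.48599),   # P@n, most rows
    (0.60000, 0.58880),   # P@n, e.g. LOF k=11
    (0.00000, -0.02801),  # P@n, KDEOS k=84
    (0.51125, 0.49756),   # AP, KNN k=2
    (0.54545, 0.53272),   # Max-F1, KNN k=2
]

for raw, reported in checks:
    adjusted = adjust_for_chance(raw, N_OUTLIERS, N_OBJECTS)
    print(f"{raw:.5f} -> {adjusted:.5f} (table: {reported:.5f})")
```

Because the raw values in the table are themselves rounded to five decimals, agreement in the last digit is only approximate in general.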

Plots

Plot panels (images not included): Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity.
Method labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, without duplicates

This version contains 30 attributes, 367 objects, 10 outliers (2.72%)

Download raw algorithm results (3.1 MB) Download raw algorithm evaluation table (34.1 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The Maximum F1-Measure is reported in addition to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 2 0.80000 0.79440 0.68832 0.67959 0.80000 0.79440 0.97801
KNN 6 0.80000 0.79440 0.82383 0.81890 0.80000 0.79440 0.98992
KNN 11 0.70000 0.69160 0.79567 0.78994 0.81818 0.81309 0.98515
KNNW 3 0.80000 0.79440 0.70028 0.69188 0.80000 0.79440 0.97647
KNNW 11 0.80000 0.79440 0.80963 0.80430 0.80000 0.79440 0.98459
KNNW 22 0.70000 0.69160 0.79606 0.79035 0.81818 0.81309 0.98543
LOF 16 0.70000 0.69160 0.76399 0.75738 0.72000 0.71216 0.98263
LOF 20 0.60000 0.58880 0.76863 0.76215 0.75000 0.74300 0.98459
LOF 23 0.60000 0.58880 0.77453 0.76822 0.78261 0.77652 0.98179
SimplifiedLOF 23 0.70000 0.69160 0.73075 0.72321 0.70000 0.69160 0.97087
SimplifiedLOF 27 0.60000 0.58880 0.74694 0.73985 0.72727 0.71963 0.97675
SimplifiedLOF 29 0.70000 0.69160 0.75227 0.74533 0.72727 0.71963 0.97535
SimplifiedLOF 75 0.60000 0.58880 0.74912 0.74209 0.78261 0.77652 0.95322
LoOP 21 0.60000 0.58880 0.60221 0.59107 0.60000 0.58880 0.95602
LoOP 46 0.60000 0.58880 0.72007 0.71223 0.69565 0.68713 0.96975
LoOP 81 0.60000 0.58880 0.73552 0.72811 0.75000 0.74300 0.91905
LDOF 25 0.60000 0.58880 0.57457 0.56266 0.60000 0.58880 0.94174
LDOF 47 0.60000 0.58880 0.72210 0.71431 0.72000 0.71216 0.94594
LDOF 100 0.60000 0.58880 0.74608 0.73897 0.72000 0.71216 0.95210
ODIN 55 0.63333 0.62306 0.69163 0.68299 0.64000 0.62992 0.89314
ODIN 76 0.60000 0.58880 0.75457 0.74770 0.78261 0.77652 0.92087
ODIN 100 0.56667 0.55453 0.71691 0.70898 0.69565 0.68713 0.96120
FastABOD 4 0.80000 0.79440 0.64706 0.63718 0.80000 0.79440 0.97703
FastABOD 92 0.70000 0.69160 0.74629 0.73919 0.76190 0.75524 0.98487
KDEOS 2 0.10000 0.07479 0.04840 0.02175 0.14815 0.12429 0.55308
KDEOS 63 0.00000 -0.02801 0.11161 0.08672 0.28070 0.26055 0.87731
KDEOS 100 0.00000 -0.02801 0.11179 0.08691 0.30508 0.28562 0.87675
LDF 8 0.80000 0.79440 0.80657 0.80115 0.80000 0.79440 0.98319
LDF 10 0.70000 0.69160 0.80962 0.80428 0.78261 0.77652 0.98627
LDF 11 0.70000 0.69160 0.76730 0.76078 0.81818 0.81309 0.98487
LDF 15 0.70000 0.69160 0.80344 0.79794 0.81818 0.81309 0.99132
INFLO 17 0.60000 0.58880 0.61063 0.59972 0.60000 0.58880 0.96022
INFLO 52 0.50000 0.48599 0.72478 0.71707 0.75000 0.74300 0.89468
INFLO 86 0.60000 0.58880 0.75108 0.74411 0.75000 0.74300 0.97647
COF 18 0.80000 0.79440 0.74402 0.73685 0.80000 0.79440 0.92969
COF 23 0.80000 0.79440 0.80248 0.79695 0.84211 0.83768 0.97535
COF 40 0.80000 0.79440 0.83122 0.82649 0.84211 0.83768 0.98852
COF 42 0.80000 0.79440 0.82401 0.81908 0.84211 0.83768 0.98950
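The tie-breaking rule for the table above (best score per measure, smallest k on ties) can be sketched as follows. The helper `best_k` is hypothetical, and the input is limited to the three KNN rows listed in the table; the full evaluation would search over every tested k, not only these rows.

```python
# KNN rows from the table above: (k, P@n, AP, Max-F1, ROC AUC)
rows = [
    (2, 0.80000, 0.68832, 0.80000, 0.97801),
    (6, 0.80000, 0.82383, 0.80000, 0.98992),
    (11, 0.70000, 0.79567, 0.81818, 0.98515),
]
measures = ["P@n", "AP", "Max-F1", "ROC AUC"]

def best_k(rows, column):
    """Return the k with the highest score in `column`; ties go to the smallest k."""
    # Sort key: highest score first (negated), then smallest k.
    return min(rows, key=lambda r: (-r[column], r[0]))[0]

best = {name: best_k(rows, i + 1) for i, name in enumerate(measures)}
print(best)
```

On these rows, P@n is tied at 0.80000 for k=2 and k=6, so the rule selects k=2, while AP and ROC AUC select k=6 and Max-F1 selects k=11.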

Plots

Plot panels (images not included): Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity.
Method labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO