Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

Shuttle (version#04)

This dataset has been preprocessed in several different ways in the literature. We follow the procedure of Zhang et al. [1]: classes 1, 3, 4, 5, 6 and 7 are treated as inliers and class 2 as outliers, and 1000 inliers are selected alongside the 13 outliers. Instances are selected from the test set only. The processed dataset consists of 1013 instances described by 9 attributes, with 13 outliers (1.28%) and 1000 inliers (98.72%).
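
The construction described above can be sketched roughly as follows. This is a minimal illustration, not the authors' script: the function names are hypothetical, rows are assumed to carry the class label as their last element, and uniform random sampling with a fixed seed is an assumption, since the text does not say how the 1000 inliers were chosen.

```python
import random

OUTLIER_CLASS = 2  # class 2 is the outlier class, as stated in the text


def build_variant(rows, n_inliers=1000, seed=0):
    """Split rows into a sample of inliers and all outliers (class 2).

    Each row is a sequence whose last element is the class label.
    Random sampling with a fixed seed is an assumption made for
    this sketch; the selection rule is not given in the text.
    """
    outliers = [r for r in rows if r[-1] == OUTLIER_CLASS]
    inliers = [r for r in rows if r[-1] != OUTLIER_CLASS]
    rng = random.Random(seed)
    sampled = rng.sample(inliers, min(n_inliers, len(inliers)))
    return sampled, outliers


def outlier_percentage(n_outliers, n_inliers):
    """Share of outliers in the combined dataset, in percent."""
    return 100.0 * n_outliers / (n_outliers + n_inliers)


print(f"{outlier_percentage(13, 1000):.2f}%")  # 1.28%
```

With 13 outliers and 1000 inliers this reproduces the 1.28% figure quoted above.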

References:

[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.

Download all data set variants used (328.2 kB). The original data is also available (shuttle.tst; [1] uses only the test set).

Normalized, without duplicates

This version contains 9 attributes, 1013 objects, 13 outliers (1.28%)
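
As an illustration of what "normalized, without duplicates" could mean in practice, the sketch below rescales every attribute to [0, 1] and drops repeated attribute vectors. Both choices are assumptions made for this example (min-max scaling, exact-match duplicate removal); this page does not spell out the exact procedure, and the function names are hypothetical.

```python
def min_max_normalize(rows):
    """Rescale each attribute (column) linearly to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    normalized = []
    for row in rows:
        normalized.append(tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for v, lo, hi in zip(row, lows, highs)
        ))
    return normalized


def drop_duplicates(rows):
    """Keep the first occurrence of each attribute vector, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

For the "not normalized" variant further down, only the duplicate-removal step would apply under the same assumptions.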

Download raw algorithm results (8.5 MB)
Download raw algorithm evaluation table (45.4 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one k, only the smallest k is given.
The Maximum F1-Measure is provided as a complement to the measures in the original publication.
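
The "Adj." columns report measures adjusted for chance. A minimal sketch, assuming the usual adjustment (value minus expected value, divided by one minus expected value) with the outlier rate |O|/N as the expected value of a random ranking; the function name is hypothetical. The adjusted values tabulated for this dataset are consistent with this formula for N = 1013 and |O| = 13.

```python
def adjust_for_chance(value, n_outliers, n_total):
    """Rescale a measure so that a random ranking scores about 0 and a perfect one 1."""
    expected = n_outliers / n_total  # expected value under a random ranking
    return (value - expected) / (1.0 - expected)


# P@n = 5/13 (5 of the 13 outliers among the top 13), as in the KNN rows
print(round(adjust_for_chance(5 / 13, 13, 1013), 5))   # 0.37662
# Max-F1 = 0.56250, as in the KNN k=1 row
print(round(adjust_for_chance(0.5625, 13, 1013), 5))   # 0.55681
```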

Algorithm        k  P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. Max-F1  ROC AUC
KNN              1  0.38462   0.37662   0.33644   0.32782   0.56250   0.55681      0.82862
KNN              6  0.38462   0.37662   0.46438   0.45742   0.63415   0.62939      0.99146
KNNW             3  0.38462   0.37662   0.31627   0.30738   0.51429   0.50797      0.91254
KNNW             5  0.38462   0.37662   0.36619   0.35795   0.54545   0.53955      0.97815
KNNW             8  0.38462   0.37662   0.40233   0.39456   0.54545   0.53955      0.98623
LOF              5  0.30769   0.29869   0.12072   0.10929   0.30769   0.29869      0.84008
LOF             10  0.23077   0.22077   0.31050   0.30154   0.57143   0.56586      0.98531
LOF             11  0.15385   0.14285   0.30752   0.29852   0.59091   0.58559      0.98585
LOF             12  0.15385   0.14285   0.29408   0.28491   0.60465   0.59951      0.98523
SimplifiedLOF    8  0.30769   0.29869   0.15390   0.14290   0.37037   0.36219      0.88131
SimplifiedLOF   14  0.23077   0.22077   0.30387   0.29482   0.53333   0.52727      0.98500
SimplifiedLOF   15  0.15385   0.14285   0.29622   0.28707   0.55319   0.54738      0.98446
LoOP             8  0.23077   0.22077   0.10947   0.09789   0.29630   0.28715      0.84454
LoOP            18  0.15385   0.14285   0.28503   0.27574   0.52174   0.51552      0.98323
LoOP            21  0.15385   0.14285   0.28745   0.27819   0.52174   0.51552      0.98415
LoOP            28  0.23077   0.22077   0.29367   0.28449   0.50000   0.49350      0.98400
LDOF            15  0.23077   0.22077   0.09451   0.08273   0.23077   0.22077      0.88146
LDOF            51  0.15385   0.14285   0.26437   0.25481   0.48649   0.47981      0.97800
LDOF            62  0.23077   0.22077   0.26770   0.25818   0.45833   0.45129      0.97969
LDOF            67  0.23077   0.22077   0.26472   0.25517   0.44444   0.43722      0.98054
ODIN            43  0.23077   0.22077   0.40471   0.39697   0.63415   0.62939      0.98869
ODIN            45  0.23077   0.22077   0.41373   0.40610   0.65000   0.64545      0.98869
ODIN            88  0.51282   0.50649   0.34165   0.33310   0.57143   0.56586      0.96135
FastABOD         5  0.23077   0.22077   0.15390   0.14290   0.25455   0.24485      0.82162
FastABOD        77  0.23077   0.22077   0.23016   0.22016   0.34483   0.33631      0.83031
FastABOD        82  0.23077   0.22077   0.23033   0.22033   0.34483   0.33631      0.83085
FastABOD       100  0.23077   0.22077   0.22979   0.21977   0.34483   0.33631      0.83292
KDEOS           67  0.15385   0.14285   0.23354   0.22358   0.47619   0.46938      0.97477
KDEOS           71  0.30769   0.29869   0.25587   0.24619   0.42105   0.41353      0.97831
KDEOS           72  0.30769   0.29869   0.25341   0.24370   0.42105   0.41353      0.97854
KDEOS           92  0.38462   0.37662   0.21114   0.20089   0.41379   0.40617      0.93785
LDF              6  0.23077   0.22077   0.27585   0.26643   0.50000   0.49350      0.98123
LDF             10  0.07692   0.06492   0.33315   0.32448   0.61905   0.61410      0.98769
INFLO            5  0.15385   0.14285   0.07063   0.05855   0.20690   0.19659      0.68515
INFLO           12  0.15385   0.14285   0.24378   0.23395   0.48889   0.48224      0.93677
INFLO           15  0.15385   0.14285   0.26907   0.25957   0.47826   0.47148      0.98254
COF             20  0.46154   0.45454   0.37375   0.36561   0.47826   0.47148      0.98654
COF             26  0.38462   0.37662   0.36888   0.36068   0.54167   0.53571      0.98708
COF             27  0.38462   0.37662   0.37072   0.36254   0.54167   0.53571      0.98723
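
The tie rule stated before the table (smallest k wins when several k reach the same score) can be sketched as below. The dictionary layout and the function name are hypothetical; the numbers are the four LoOP rows from the table above, which only stand in for the full set of k values in the raw evaluation table.

```python
def best_k(results, measure):
    """Return the smallest k that reaches the maximum value of `measure`.

    `results` maps k -> {measure name: value}; this layout is an
    assumption made for the sketch, not the format of the raw files.
    """
    top = max(row[measure] for row in results.values())
    return min(k for k, row in results.items() if row[measure] == top)


# LoOP rows from the table above (normalized, without duplicates)
loop_rows = {
    8:  {"P@n": 0.23077, "AP": 0.10947, "Max-F1": 0.29630, "ROC AUC": 0.84454},
    18: {"P@n": 0.15385, "AP": 0.28503, "Max-F1": 0.52174, "ROC AUC": 0.98323},
    21: {"P@n": 0.15385, "AP": 0.28745, "Max-F1": 0.52174, "ROC AUC": 0.98415},
    28: {"P@n": 0.23077, "AP": 0.29367, "Max-F1": 0.50000, "ROC AUC": 0.98400},
}

print({m: best_k(loop_rows, m) for m in ["P@n", "AP", "Max-F1", "ROC AUC"]})
# {'P@n': 8, 'AP': 28, 'Max-F1': 18, 'ROC AUC': 21}
```

Each of the four listed k values is the winner for one measure, which matches the way the table lists one row per best k. The ties on P@n (k = 8 and 28) and on Max-F1 (k = 18 and 21) are resolved towards the smaller k.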

Plots

[Plots not reproduced here: Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity.]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, without duplicates

This version contains 9 attributes, 1013 objects, 13 outliers (1.28%)

Download raw algorithm results (8.3 MB)
Download raw algorithm evaluation table (44.8 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one k, only the smallest k is given.
The Maximum F1-Measure is provided as a complement to the measures in the original publication.

Algorithm        k  P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. Max-F1  ROC AUC
KNN              2  0.23077   0.22077   0.13575   0.12452   0.28571   0.27643      0.80992
KNN              8  0.23077   0.22077   0.19640   0.18595   0.32836   0.31963      0.96812
KNNW             5  0.23077   0.22077   0.12003   0.10860   0.25000   0.24025      0.85477
KNNW            13  0.23077   0.22077   0.15768   0.14673   0.26667   0.25713      0.95277
KNNW            15  0.23077   0.22077   0.15975   0.14883   0.27586   0.26645      0.95262
LOF              1  0.23077   0.22077   0.08746   0.07559   0.27273   0.26327      0.50523
LOF             14  0.23077   0.22077   0.19104   0.18052   0.37037   0.36219      0.94800
LOF             17  0.23077   0.22077   0.19037   0.17985   0.37500   0.36687      0.94962
LOF            100  0.07692   0.06492   0.14355   0.13242   0.31746   0.30859      0.95254
SimplifiedLOF    2  0.23077   0.22077   0.08183   0.06989   0.26087   0.25126      0.56785
SimplifiedLOF   17  0.15385   0.14285   0.16105   0.15014   0.33333   0.32467      0.92946
SimplifiedLOF   22  0.15385   0.14285   0.17462   0.16389   0.32258   0.31377      0.95508
SimplifiedLOF   23  0.15385   0.14285   0.17374   0.16300   0.31250   0.30356      0.95854
LoOP             2  0.23077   0.22077   0.08110   0.06915   0.26087   0.25126      0.61988
LoOP            29  0.15385   0.14285   0.17449   0.16376   0.31373   0.30480      0.95538
LoOP            32  0.15385   0.14285   0.17292   0.16217   0.32727   0.31853      0.95546
LoOP            35  0.07692   0.06492   0.16999   0.15920   0.32432   0.31554      0.95646
LDOF             2  0.23077   0.22077   0.06671   0.05457   0.24000   0.23012      0.56608
LDOF             3  0.23077   0.22077   0.09663   0.08488   0.30000   0.29090      0.49400
LDOF            42  0.15385   0.14285   0.16779   0.15697   0.30000   0.29090      0.96031
ODIN             8  0.15385   0.14285   0.04813   0.03576   0.16000   0.14908      0.72262
ODIN            36  0.15385   0.14285   0.20331   0.19295   0.34615   0.33765      0.95250
ODIN            52  0.15385   0.14285   0.21983   0.20969   0.47368   0.46684      0.93915
ODIN            60  0.07692   0.06492   0.22158   0.21146   0.47368   0.46684      0.92108
FastABOD         3  0.15385   0.14285   0.05853   0.04629   0.19048   0.17995      0.59877
FastABOD         4  0.15385   0.14285   0.05856   0.04632   0.20000   0.18960      0.59077
FastABOD         7  0.15385   0.14285   0.06070   0.04849   0.20000   0.18960      0.60285
FastABOD       100  0.15385   0.14285   0.06952   0.05742   0.20000   0.18960      0.57662
KDEOS            4  0.07692   0.06492   0.02405   0.01136   0.08696   0.07509      0.52338
KDEOS           62  0.07692   0.06492   0.07123   0.05916   0.14815   0.13707      0.89123
KDEOS           84  0.07692   0.06492   0.06912   0.05702   0.18182   0.17118      0.87915
LDF             10  0.23077   0.22077   0.23287   0.22290   0.45000   0.44285      0.97192
LDF             12  0.30769   0.29869   0.24636   0.23656   0.42857   0.42114      0.97500
INFLO            1  0.23077   0.22077   0.08161   0.06968   0.23077   0.22077      0.62008
INFLO           22  0.07692   0.06492   0.15695   0.14599   0.29412   0.28494      0.95177
INFLO           50  0.07692   0.06492   0.14332   0.13218   0.29730   0.28816      0.95685
INFLO           81  0.07692   0.06492   0.13109   0.11980   0.31746   0.30859      0.93285
COF              2  0.15385   0.14285   0.06684   0.05471   0.22222   0.21211      0.52773
COF             18  0.15385   0.14285   0.10715   0.09554   0.20000   0.18960      0.86415
COF             76  0.07692   0.06492   0.09109   0.07927   0.22414   0.21405      0.93015
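
For readers who want to recompute the unadjusted columns from a list of outlier scores, the sketch below gives textbook definitions of the four measures. It is an illustration under stated assumptions, not the evaluation code behind the tables: function names are hypothetical, labels are 1 for outliers and 0 for inliers, higher scores mean "more outlying", and tied scores are kept in input order for the rank-based measures, which may differ from how the original evaluation treats ties.

```python
def ranked_labels(scores, labels):
    """Labels (1 = outlier, 0 = inlier) sorted by decreasing outlier score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]


def precision_at_n(scores, labels):
    """Fraction of true outliers among the top n, with n = number of outliers."""
    n = sum(labels)
    return sum(ranked_labels(scores, labels)[:n]) / n


def average_precision(scores, labels):
    """Mean of the precision values taken at the rank of each true outlier."""
    hits, total = 0, 0.0
    for rank, lab in enumerate(ranked_labels(scores, labels), start=1):
        if lab:
            hits += 1
            total += hits / rank
    return total / hits


def max_f1(scores, labels):
    """Best F1 score over all cut-off positions of the ranking."""
    n_out, hits, best = sum(labels), 0, 0.0
    for rank, lab in enumerate(ranked_labels(scores, labels), start=1):
        hits += lab
        # F1 = 2PR / (P + R) with P = hits/rank and R = hits/n_out
        best = max(best, 2 * hits / (rank + n_out))
    return best


def roc_auc(scores, labels):
    """Probability that a random outlier outscores a random inlier (ties count 1/2)."""
    out = [s for s, lab in zip(scores, labels) if lab]
    inl = [s for s, lab in zip(scores, labels) if not lab]
    wins = sum((o > i) + 0.5 * (o == i) for o in out for i in inl)
    return wins / (len(out) * len(inl))


scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 0]
print(precision_at_n(scores, labels))  # 0.5
print(max_f1(scores, labels))          # 0.8
```

In the toy example the two outliers sit at ranks 1 and 3, so only one of them falls within the top n = 2, while cutting the ranking after rank 3 gives the best F1.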

Plots

[Plots not reproduced here: Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity.]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO