Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

Shuttle (version#03)

This dataset has been preprocessed in different ways in the literature. We follow the procedure of Zhang et al. [1], treating classes 1, 3, 4, 5, 6 and 7 as inliers and class 2 as outliers, and selecting 1000 inliers and 13 outliers. Instances are selected from the test set only. The processed dataset consists of 1013 instances described by 9 attributes, with 13 outliers (1.28%) and 1000 inliers (98.72%).
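The labeling rule described above can be sketched as follows. This is only an illustration, not the preprocessing script actually used: the function names, the in-memory row format, and the seeded random sampling of inliers are assumptions of this sketch (the text does not state how the 1000 inliers were chosen).

```python
import random

OUTLIER_CLASS = 2                      # class 2 is treated as the outlier class
INLIER_CLASSES = {1, 3, 4, 5, 6, 7}    # all remaining classes are inliers


def to_outlier_task(rows, n_inliers, seed=0):
    """Label rows as inlier/outlier and keep only n_inliers inliers.

    rows: list of (attributes, class_label) pairs (hypothetical format).
    Random sampling of the inliers is an assumption of this sketch.
    """
    outliers = [(x, "outlier") for x, c in rows if c == OUTLIER_CLASS]
    inliers = [(x, "inlier") for x, c in rows if c in INLIER_CLASSES]
    rng = random.Random(seed)
    kept = rng.sample(inliers, n_inliers)
    return kept + outliers


def percentage(part, total):
    """Share of `part` in `total`, in percent, rounded to two decimals."""
    return round(100.0 * part / total, 2)


# Shares reported for the processed dataset: 13 outliers, 1000 inliers.
print(percentage(13, 1013), percentage(1000, 1013))
```

With 13 outliers among 1013 instances, the two printed shares correspond to the percentages quoted in the description.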

References:

[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.

Download all data set variants used (328.2 kB). The original data is also available (shuttle.tst; [1] uses only the test set).

Normalized, without duplicates

This version contains 9 attributes and 1013 objects, of which 13 are outliers (1.28%).
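As an illustration of what such a variant involves, the sketch below removes duplicate objects and then rescales the attributes. It assumes that "normalized" means per-attribute min-max scaling to [0, 1] and that "without duplicates" means keeping the first of several identical attribute vectors; neither detail is stated on this page, and the function names are hypothetical.

```python
def remove_duplicates(points):
    """Keep the first occurrence of each attribute vector."""
    seen = set()
    unique = []
    for p in points:
        key = tuple(p)
        if key not in seen:
            seen.add(key)
            unique.append(list(p))
    return unique


def min_max_normalize(points):
    """Scale every attribute to [0, 1]; constant attributes map to 0."""
    columns = list(zip(*points))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(p, lows, highs)]
        for p in points
    ]


# Toy data (not taken from the Shuttle dataset).
points = [[0, 10], [5, 10], [0, 10], [10, 10]]
print(min_max_normalize(remove_duplicates(points)))
```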

Download raw algorithm results (8.6 MB). Download raw algorithm evaluation table (45.3 kB).

Best Parameters

The following table lists the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for several values of k, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures in the original publication.
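The "Adj." columns of the table in this section appear consistent with the usual adjustment for chance, (m - e) / (1 - e), where m is the raw measure and e is its expected value under a random ranking, taken here to be the outlier rate 13/1013. The sketch below assumes exactly that; the function name is hypothetical.

```python
def adjust_for_chance(value, n_outliers, n_total):
    """Rescale a measure so that its expected value under random ranking is 0.

    Assumes the expected value equals the outlier rate n_outliers / n_total.
    """
    expected = n_outliers / n_total
    return (value - expected) / (1.0 - expected)


N_OUTLIERS, N_TOTAL = 13, 1013

# P@n = 3/13 (3 of the top 13 objects are outliers), as for KNN with k = 2.
print(round(adjust_for_chance(3 / 13, N_OUTLIERS, N_TOTAL), 5))

# P@n = 0, as for SimplifiedLOF with k = 99: the adjusted value is negative.
print(round(adjust_for_chance(0.0, N_OUTLIERS, N_TOTAL), 5))
```

Under this assumption the two calls give 0.22077 and -0.013, which agree with the tabulated Adj. P@n values for those rows.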

Algorithm      k    P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. MF1  ROC AUC
KNN            2    0.23077   0.22077   0.17581   0.16509   0.35000   0.34155   0.87415
KNN            3    0.23077   0.22077   0.22509   0.21502   0.37931   0.37124   0.97338
KNNW           4    0.23077   0.22077   0.16086   0.14995   0.25926   0.24963   0.91692
KNNW           7    0.23077   0.22077   0.18337   0.17275   0.27451   0.26508   0.95762
KNNW           9    0.23077   0.22077   0.18375   0.17314   0.26415   0.25458   0.95885
KNNW           10   0.23077   0.22077   0.17783   0.16715   0.26415   0.25458   0.95931
LOF            11   0.30769   0.29869   0.21717   0.20699   0.43750   0.43019   0.94762
LOF            12   0.38462   0.37662   0.23468   0.22473   0.45161   0.44448   0.94600
SimplifiedLOF  12   0.07692   0.06492   0.17465   0.16392   0.31818   0.30932   0.93785
SimplifiedLOF  15   0.15385   0.14285   0.17464   0.16391   0.35556   0.34718   0.94231
SimplifiedLOF  25   0.23077   0.22077   0.14290   0.13176   0.27586   0.26645   0.93492
SimplifiedLOF  99   0.00000   -0.01300  0.11379   0.10227   0.22500   0.21493   0.94469
LoOP           18   0.23077   0.22077   0.18437   0.17377   0.32432   0.31554   0.94292
LoOP           21   0.30769   0.29869   0.18147   0.17083   0.32000   0.31116   0.94531
LoOP           22   0.30769   0.29869   0.18414   0.17354   0.34783   0.33935   0.94592
LoOP           23   0.30769   0.29869   0.18422   0.17361   0.33333   0.32467   0.94685
LDOF           24   0.07692   0.06492   0.15520   0.14422   0.33333   0.32467   0.89269
LDOF           26   0.07692   0.06492   0.15817   0.14723   0.31034   0.30138   0.90531
LDOF           41   0.23077   0.22077   0.11772   0.10625   0.23077   0.22077   0.89177
LDOF           99   0.07692   0.06492   0.10387   0.09222   0.19355   0.18306   0.92600
ODIN           10   0.15385   0.14285   0.06420   0.05203   0.15385   0.14285   0.86142
ODIN           23   0.15385   0.14285   0.13246   0.12119   0.22535   0.21528   0.94146
ODIN           26   0.15385   0.14285   0.14742   0.13633   0.24242   0.23258   0.94092
ODIN           27   0.07692   0.06492   0.12372   0.11233   0.25000   0.24025   0.93246
FastABOD       3    0.15385   0.14285   0.06457   0.05241   0.19048   0.17995   0.65508
FastABOD       7    0.15385   0.14285   0.09528   0.08352   0.22222   0.21211   0.67177
FastABOD       8    0.15385   0.14285   0.09624   0.08449   0.22222   0.21211   0.67400
FastABOD       60   0.15385   0.14285   0.08633   0.07445   0.22222   0.21211   0.70285
KDEOS          30   0.07692   0.06492   0.07230   0.06024   0.14966   0.13861   0.90677
KDEOS          40   0.23077   0.22077   0.08692   0.07505   0.23077   0.22077   0.84723
KDEOS          44   0.23077   0.22077   0.10398   0.09233   0.28571   0.27643   0.84562
KDEOS          58   0.15385   0.14285   0.22122   0.21110   0.26667   0.25713   0.87200
LDF            4    0.30769   0.29869   0.21040   0.20013   0.38710   0.37913   0.93562
LDF            6    0.23077   0.22077   0.20338   0.19302   0.37838   0.37030   0.95708
LDF            8    0.15385   0.14285   0.19699   0.18655   0.41026   0.40259   0.94646
INFLO          14   0.30769   0.29869   0.16950   0.15870   0.32143   0.31261   0.92654
INFLO          18   0.30769   0.29869   0.18049   0.16984   0.34783   0.33935   0.94123
COF            17   0.15385   0.14285   0.16376   0.15289   0.25243   0.24271   0.94969
COF            45   0.30769   0.29869   0.16846   0.15765   0.30769   0.29869   0.92523
COF            48   0.23077   0.22077   0.15690   0.14594   0.31579   0.30689   0.92123

Plots

[Plots: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, without duplicates

This version contains 9 attributes and 1013 objects, of which 13 are outliers (1.28%).

Download raw algorithm results (8.3 MB). Download raw algorithm evaluation table (44.4 kB).

Best Parameters

The following table lists the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for several values of k, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures in the original publication.
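The tie-breaking rule stated above (smallest k among equally good results) can be sketched as follows. The function name and the dictionary layout are hypothetical; the example values are an excerpt of the LOF rows of the table in this section, so the selection is made over three values of k only and on rounded scores.

```python
def best_k(results, measure):
    """Return the smallest k that attains the maximum of `measure`.

    results: dict mapping k -> dict of measure name -> score
    (a hypothetical layout chosen for this sketch).
    """
    best_score = max(scores[measure] for scores in results.values())
    return min(k for k, scores in results.items() if scores[measure] == best_score)


# Excerpt of the LOF rows (not normalized, without duplicates).
lof_rows = {
    8: {"P@n": 0.38462, "AP": 0.19004, "ROC AUC": 0.91754},
    17: {"P@n": 0.38462, "AP": 0.32767, "ROC AUC": 0.98100},
    18: {"P@n": 0.38462, "AP": 0.33111, "ROC AUC": 0.98100},
}

for measure in ("P@n", "AP", "ROC AUC"):
    print(measure, best_k(lof_rows, measure))
```

On this excerpt the rule selects k = 8 for P@n (all three rows tie), k = 18 for AP, and k = 17 for ROC AUC (k = 17 and k = 18 tie).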

Algorithm      k    P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. MF1  ROC AUC
KNN            3    0.30769   0.29869   0.20176   0.19138   0.34783   0.33935   0.95088
KNN            9    0.30769   0.29869   0.25595   0.24628   0.38095   0.37290   0.97435
KNNW           15   0.30769   0.29869   0.22195   0.21184   0.30769   0.29869   0.96477
LOF            8    0.38462   0.37662   0.19004   0.17951   0.38462   0.37662   0.91754
LOF            17   0.38462   0.37662   0.32767   0.31893   0.52941   0.52329   0.98100
LOF            18   0.38462   0.37662   0.33111   0.32242   0.54545   0.53955   0.98100
SimplifiedLOF  19   0.38462   0.37662   0.27240   0.26294   0.40816   0.40047   0.97608
SimplifiedLOF  23   0.38462   0.37662   0.30379   0.29474   0.44898   0.44182   0.97969
SimplifiedLOF  25   0.38462   0.37662   0.30991   0.30094   0.42857   0.42114   0.97892
LoOP           25   0.30769   0.29869   0.27019   0.26070   0.42553   0.41806   0.97700
LoOP           33   0.30769   0.29869   0.26637   0.25683   0.42857   0.42114   0.97269
LoOP           35   0.30769   0.29869   0.28886   0.27961   0.42857   0.42114   0.97308
LoOP           40   0.38462   0.37662   0.23895   0.22906   0.38710   0.37913   0.96708
LDOF           9    0.30769   0.29869   0.11227   0.10073   0.32000   0.31116   0.61492
LDOF           47   0.23077   0.22077   0.27372   0.26428   0.47059   0.46371   0.97569
LDOF           50   0.30769   0.29869   0.26242   0.25283   0.44444   0.43722   0.97654
ODIN           42   0.23077   0.22077   0.32825   0.31952   0.52941   0.52329   0.97231
ODIN           53   0.30769   0.29869   0.31138   0.30243   0.58065   0.57519   0.95035
ODIN           84   0.38462   0.37662   0.24988   0.24012   0.46667   0.45973   0.91431
FastABOD       3    0.15385   0.14285   0.09311   0.08132   0.22222   0.21211   0.63369
FastABOD       24   0.15385   0.14285   0.12289   0.11148   0.25000   0.24025   0.58554
FastABOD       75   0.23077   0.22077   0.13087   0.11957   0.25000   0.24025   0.57515
FastABOD       100  0.23077   0.22077   0.13130   0.12001   0.25000   0.24025   0.57523
KDEOS          37   0.15385   0.14285   0.07461   0.06258   0.17143   0.16066   0.87377
KDEOS          54   0.07692   0.06492   0.11805   0.10658   0.25000   0.24025   0.92038
KDEOS          56   0.07692   0.06492   0.11195   0.10040   0.27451   0.26508   0.92369
KDEOS          64   0.07692   0.06492   0.11716   0.10568   0.27451   0.26508   0.93808
LDF            11   0.53846   0.53246   0.36762   0.35940   0.62069   0.61576   0.98000
LDF            15   0.53846   0.53246   0.42882   0.42139   0.64706   0.64247   0.98731
INFLO          3    0.30769   0.29869   0.10993   0.09836   0.30769   0.29869   0.58492
INFLO          20   0.30769   0.29869   0.28063   0.27128   0.41379   0.40617   0.97038
INFLO          21   0.30769   0.29869   0.28103   0.27169   0.42857   0.42114   0.97008
COF            2    0.23077   0.22077   0.10520   0.09357   0.27273   0.26327   0.61142
COF            6    0.15385   0.14285   0.10783   0.09624   0.27586   0.26645   0.60577
COF            22   0.15385   0.14285   0.18899   0.17844   0.25000   0.24025   0.94808

Plots

[Plots: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO