Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

Shuttle (version#08)

This dataset has been preprocessed in different variants in the literature. We follow the procedure of Zhang et al. [1]: classes 1, 3, 4, 5, 6 and 7 serve as inliers and class 2 as outliers, with 1000 inliers and 13 outliers selected from the test set. The processed dataset consists of 1013 instances described by 9 attributes, with 13 outliers (1.28%) and 1000 inliers (98.72%).
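The selection step described above can be sketched as follows. This is only an illustration: the function names, the seed, and the use of uniform random sampling are assumptions, since the text states how many instances of each kind were selected but not how they were drawn.

```python
import random

INLIER_CLASSES = {1, 3, 4, 5, 6, 7}
OUTLIER_CLASS = 2


def select_instances(class_labels, n_inliers=1000, n_outliers=13, seed=0):
    """Return (inlier_indices, outlier_indices) into class_labels.

    ASSUMPTION: uniform random sampling with a fixed seed. The source only
    gives the subset sizes, not the sampling scheme.
    """
    rng = random.Random(seed)
    inliers = [i for i, c in enumerate(class_labels) if c in INLIER_CLASSES]
    outliers = [i for i, c in enumerate(class_labels) if c == OUTLIER_CLASS]
    return sorted(rng.sample(inliers, n_inliers)), sorted(rng.sample(outliers, n_outliers))


def outlier_percentage(n_outliers, n_inliers):
    """Share of outliers in the processed dataset, in percent."""
    return 100.0 * n_outliers / (n_outliers + n_inliers)


print(round(outlier_percentage(13, 1000), 2))  # 1.28
```

With 13 outliers and 1000 inliers, the outlier share is 13/1013, which rounds to the 1.28% quoted above.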

References:

[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.

All dataset variants used are available for download (328.2 kB). The original data are also available (shuttle.tst; [1] uses only the test set).

Normalized, without duplicates

This version contains 9 attributes, 1013 objects, 13 outliers (1.28%)

Downloads: raw algorithm results (8.5 MB); raw algorithm evaluation table (44.4 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The maximum F1-measure is reported in addition to the measures in the original publication.
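The adjusted columns in the table rescale each measure for chance. As a minimal sketch, assuming the adjustment defined in the original publication, (m - |O|/N) / (1 - |O|/N) for |O| outliers among N objects, the following reproduces tabulated values. The function name is illustrative, and applying the same rescaling to the maximum F1-measure is an inference from the tabulated numbers, not something stated on this page.

```python
def adjust_for_chance(score, n_outliers, n_objects):
    """Rescale a measure so that the chance level |O|/N maps to 0 and 1 stays 1.

    ASSUMPTION: this is the adjustment from the original publication; its use
    for Max-F1 is inferred from the tabulated values.
    """
    chance = n_outliers / n_objects
    return (score - chance) / (1.0 - chance)


N_OBJECTS, N_OUTLIERS = 1013, 13

# P@n = 4/13 (0.30769 in the table) gives the tabulated Adj. P@n.
print(round(adjust_for_chance(4 / 13, N_OUTLIERS, N_OBJECTS), 5))  # 0.29869

# A score of 0 falls below the chance level, hence a negative adjusted value.
print(round(adjust_for_chance(0.0, N_OUTLIERS, N_OBJECTS), 5))  # -0.013
```

The second call matches the KDEOS row with P@n = 0.00000 and Adj. P@n = -0.01300.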

Algorithm       k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN             1   0.30769   0.29869   0.26420   0.25463   0.54545   0.53955   0.78277
KNN             3   0.30769   0.29869   0.42528   0.41780   0.65000   0.64545   0.99123
KNNW            5   0.30769   0.29869   0.36187   0.35357   0.54545   0.53955   0.98669
KNNW            6   0.30769   0.29869   0.37602   0.36790   0.54545   0.53955   0.98846
KNNW            7   0.30769   0.29869   0.37385   0.36571   0.57778   0.57229   0.98846
LOF            10   0.46154   0.45454   0.31103   0.30208   0.51282   0.50649   0.97900
LOF            11   0.38462   0.37662   0.35441   0.34601   0.61111   0.60606   0.98285
LOF            13   0.46154   0.45454   0.35381   0.34541   0.62069   0.61576   0.98062
SimplifiedLOF  15   0.38462   0.37662   0.34387   0.33534   0.61111   0.60606   0.98477
SimplifiedLOF  16   0.38462   0.37662   0.35114   0.34270   0.61111   0.60606   0.98492
SimplifiedLOF  17   0.38462   0.37662   0.34749   0.33901   0.57895   0.57347   0.98508
LoOP           12   0.38462   0.37662   0.21160   0.20135   0.42857   0.42114   0.95292
LoOP           22   0.30769   0.29869   0.28373   0.27442   0.45455   0.44745   0.97985
LoOP           26   0.30769   0.29869   0.28709   0.27782   0.51429   0.50797   0.97646
LoOP           27   0.30769   0.29869   0.28732   0.27806   0.51429   0.50797   0.97638
LDOF           19   0.30769   0.29869   0.15687   0.14591   0.33333   0.32467   0.92123
LDOF           35   0.30769   0.29869   0.23366   0.22369   0.40000   0.39220   0.95823
LDOF           44   0.23077   0.22077   0.22041   0.21028   0.44444   0.43722   0.95900
LDOF           59   0.15385   0.14285   0.17568   0.16497   0.30508   0.29605   0.96415
ODIN           30   0.23077   0.22077   0.18842   0.17786   0.32727   0.31853   0.95446
ODIN           32   0.23077   0.22077   0.20075   0.19036   0.35294   0.34453   0.95846
ODIN           42   0.23077   0.22077   0.27027   0.26078   0.48485   0.47815   0.95473
ODIN           45   0.23077   0.22077   0.27667   0.26726   0.48485   0.47815   0.95477
FastABOD        3   0.15385   0.14285   0.10447   0.09283   0.21053   0.20026   0.76762
FastABOD        7   0.23077   0.22077   0.15175   0.14073   0.24138   0.23152   0.76169
FastABOD       64   0.23077   0.22077   0.17669   0.16599   0.33333   0.32467   0.76323
FastABOD       76   0.23077   0.22077   0.17709   0.16639   0.33333   0.32467   0.76369
KDEOS          11   0.07692   0.06492   0.02954   0.01692   0.10000   0.08830   0.66723
KDEOS          69   0.00000  -0.01300   0.08382   0.07191   0.23684   0.22692   0.91331
KDEOS          96   0.07692   0.06492   0.09737   0.08564   0.20779   0.19749   0.93285
LDF             5   0.38462   0.37662   0.18669   0.17612   0.46667   0.45973   0.79123
LDF             8   0.38462   0.37662   0.37840   0.37032   0.66667   0.66233   0.98708
INFLO           7   0.38462   0.37662   0.16725   0.15643   0.42857   0.42114   0.72508
INFLO          14   0.38462   0.37662   0.27462   0.26519   0.45000   0.44285   0.97262
INFLO          15   0.30769   0.29869   0.25718   0.24752   0.40909   0.40141   0.97308
COF            12   0.38462   0.37662   0.29803   0.28891   0.45161   0.44448   0.98092
COF            21   0.38462   0.37662   0.40887   0.40118   0.63415   0.62939   0.99069
COF            25   0.30769   0.29869   0.40650   0.39879   0.66667   0.66233   0.99054
COF            27   0.38462   0.37662   0.41253   0.40489   0.64865   0.64408   0.99008

Plots

[Plots not reproduced here: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Method key: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, without duplicates

This version contains 9 attributes, 1013 objects, 13 outliers (1.28%)

Downloads: raw algorithm results (8.3 MB); raw algorithm evaluation table (44.5 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The maximum F1-measure is reported in addition to the measures in the original publication.
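The selection rule above (best score per measure, with ties resolved in favor of the smallest k) can be sketched as follows. The function name and the dictionary layout are illustrative assumptions, and the toy input contains only the three LDF rows listed in the table for this variant rather than the full range of k values evaluated.

```python
def best_k(results, measure):
    """Return the smallest k that attains the highest score for `measure`.

    `results` maps k -> {measure name -> score}; this layout is hypothetical.
    """
    top = max(scores[measure] for scores in results.values())
    return min(k for k, scores in results.items() if scores[measure] == top)


# LDF rows for the non-normalized variant, copied from the table.
ldf = {
    9:  {"P@n": 0.23077, "AP": 0.21194, "Max-F1": 0.39216, "ROC AUC": 0.97154},
    10: {"P@n": 0.30769, "AP": 0.21774, "Max-F1": 0.37736, "ROC AUC": 0.97154},
    29: {"P@n": 0.07692, "AP": 0.16480, "Max-F1": 0.41667, "ROC AUC": 0.90985},
}

best = {m: best_k(ldf, m) for m in ("P@n", "AP", "Max-F1", "ROC AUC")}
print(best)  # {'P@n': 10, 'AP': 10, 'Max-F1': 29, 'ROC AUC': 9}
```

Here k = 9 and k = 10 share the same rounded ROC AUC, so the smaller k is reported for that measure. This is why a method can appear in several rows: each row is the best k for at least one measure. Ties in the rounded table values need not be exact ties in the raw results.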

Algorithm       k       P@n  Adj. P@n        AP   Adj. AP    Max-F1  Adj. MF1   ROC AUC
KNN             2   0.30769   0.29869   0.14636   0.13527   0.30769   0.29869   0.81323
KNN             5   0.23077   0.22077   0.18389   0.17328   0.30380   0.29475   0.96473
KNNW            3   0.23077   0.22077   0.10918   0.09760   0.24390   0.23407   0.71138
KNNW            8   0.23077   0.22077   0.14598   0.13488   0.28571   0.27643   0.93854
KNNW           12   0.23077   0.22077   0.15584   0.14486   0.27586   0.26645   0.95192
LOF             1   0.23077   0.22077   0.06980   0.05771   0.25000   0.24025   0.50362
LOF            12   0.23077   0.22077   0.21417   0.20395   0.41509   0.40749   0.96146
LOF            13   0.23077   0.22077   0.21420   0.20398   0.39216   0.38425   0.96669
SimplifiedLOF   2   0.23077   0.22077   0.06935   0.05725   0.23077   0.22077   0.64415
SimplifiedLOF  22   0.23077   0.22077   0.18801   0.17745   0.33333   0.32467   0.96623
SimplifiedLOF  24   0.15385   0.14285   0.18681   0.17624   0.30303   0.29397   0.96769
SimplifiedLOF  95   0.07692   0.06492   0.16814   0.15733   0.40000   0.39220   0.93354
LoOP            2   0.23077   0.22077   0.07081   0.05873   0.25000   0.24025   0.67969
LoOP           24   0.15385   0.14285   0.19637   0.18592   0.34286   0.33431   0.96808
LoOP           25   0.15385   0.14285   0.19444   0.18396   0.36364   0.35536   0.96685
LDOF            2   0.23077   0.22077   0.05688   0.04462   0.23077   0.22077   0.53308
LDOF           29   0.23077   0.22077   0.15998   0.14906   0.32000   0.31116   0.95131
LDOF           39   0.23077   0.22077   0.16890   0.15810   0.30769   0.29869   0.95738
LDOF           40   0.23077   0.22077   0.16710   0.15628   0.31579   0.30689   0.95862
ODIN           33   0.15385   0.14285   0.22819   0.21816   0.40816   0.40047   0.95496
ODIN           45   0.15385   0.14285   0.24976   0.24001   0.45000   0.44285   0.95092
ODIN           54   0.15385   0.14285   0.21601   0.20581   0.46154   0.45454   0.92477
ODIN           93   0.23077   0.22077   0.15742   0.14647   0.32787   0.31913   0.90350
FastABOD        3   0.15385   0.14285   0.05658   0.04432   0.19048   0.17995   0.63646
FastABOD       76   0.15385   0.14285   0.06309   0.05091   0.19355   0.18306   0.57700
FastABOD       98   0.15385   0.14285   0.06351   0.05134   0.19355   0.18306   0.57692
KDEOS          48   0.07692   0.06492   0.05496   0.04268   0.12381   0.11242   0.87738
KDEOS          70   0.15385   0.14285   0.08493   0.07303   0.16000   0.14908   0.83546
KDEOS          98   0.15385   0.14285   0.14371   0.13257   0.21053   0.20026   0.84531
KDEOS          99   0.15385   0.14285   0.11055   0.09898   0.22222   0.21211   0.84446
LDF             9   0.23077   0.22077   0.21194   0.20169   0.39216   0.38425   0.97154
LDF            10   0.30769   0.29869   0.21774   0.20757   0.37736   0.36926   0.97154
LDF            29   0.07692   0.06492   0.16480   0.15394   0.41667   0.40908   0.90985
INFLO           1   0.23077   0.22077   0.07133   0.05926   0.23077   0.22077   0.64915
INFLO          19   0.15385   0.14285   0.17277   0.16202   0.29851   0.28939   0.95577
INFLO          29   0.07692   0.06492   0.15236   0.14134   0.29032   0.28110   0.95646
INFLO         100   0.07692   0.06492   0.14590   0.13480   0.32727   0.31853   0.92269
COF             2   0.23077   0.22077   0.06773   0.05561   0.23077   0.22077   0.62085
COF            10   0.15385   0.14285   0.09354   0.08175   0.29167   0.28246   0.75931
COF            18   0.15385   0.14285   0.11425   0.10274   0.23529   0.22535   0.93677
COF            20   0.15385   0.14285   0.11391   0.10240   0.20455   0.19420   0.94015

Plots

[Plots not reproduced here: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Method key: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO