Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

Shuttle (version#10)

This dataset has been preprocessed in different variants in the literature. We follow the procedure of Zhang et al. [1], using classes 1, 3, 4, 5, 6 and 7 as inliers and class 2 as outliers, and selecting 1000 inliers and 13 outliers. Instances are selected from the test set. The processed dataset consists of 1013 instances described by 9 attributes, with 13 outliers (1.28%) and 1000 inliers (98.72%).
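The class roles and proportions stated above can be checked with a short sketch. The names `INLIER_CLASSES`, `OUTLIER_CLASS` and `is_outlier` are hypothetical and chosen only for illustration. How the 1000 inliers were drawn from the test set is not stated here, so no sampling step is shown.

```python
# Hypothetical helper names; class roles and counts are taken from the description above.
INLIER_CLASSES = {1, 3, 4, 5, 6, 7}
OUTLIER_CLASS = 2


def is_outlier(class_label: int) -> bool:
    """Return True for class 2 (outlier) and False for the inlier classes."""
    if class_label == OUTLIER_CLASS:
        return True
    if class_label in INLIER_CLASSES:
        return False
    raise ValueError(f"unexpected class label: {class_label}")


# Counts stated for the processed dataset.
n_inliers, n_outliers = 1000, 13
n_total = n_inliers + n_outliers

# Percentages rounded to two decimals, as reported in the text.
outlier_pct = round(100 * n_outliers / n_total, 2)
inlier_pct = round(100 * n_inliers / n_total, 2)

print(n_total, outlier_pct, inlier_pct)
```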

References:

[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.

Download all dataset variants used (328.2 kB). You can also access the original data (shuttle.tst; [1] uses only the test set).

Normalized, without duplicates

This version contains 9 attributes, 1013 objects, and 13 outliers (1.28%).
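The variant name refers to two preprocessing steps: normalization and duplicate removal. This page does not state how either step was carried out. The sketch below therefore assumes attribute-wise min-max scaling to [0, 1] and removal of exact duplicate attribute vectors. Both function names are hypothetical, and the toy data is made up for illustration.

```python
def min_max_normalize(rows):
    """Scale each attribute (column) linearly to [0, 1].

    Assumption: min-max scaling; constant columns are mapped to 0.0.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        )
        for row in rows
    ]


def drop_duplicates(rows):
    """Keep the first occurrence of each attribute vector, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Toy data (not from the Shuttle dataset): the third row duplicates the first.
toy = [(1.0, 10.0), (3.0, 10.0), (1.0, 10.0), (5.0, 10.0)]
deduped = drop_duplicates(toy)
normalized = min_max_normalize(deduped)
print(normalized)
```

The order of the two steps does not matter for exact duplicates under min-max scaling, since identical vectors stay identical after scaling.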

Download raw algorithm results (8.7 MB)
Download raw algorithm evaluation table (50.1 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The maximum F1 measure is reported in addition to the measures in the original publication.

Algorithm        k  P@n      Adj. P@n  AP       Adj. AP   Max-F1   Adj. MF1  ROC AUC
KNN              1  0.07692   0.06492  0.02330   0.01061  0.09524   0.08348  0.44827
KNN              3  0.07692   0.06492  0.04356   0.03112  0.09524   0.08348  0.81762
KNN             98  0.00000  -0.01300  0.02854   0.01591  0.11111   0.09956  0.63469
KNNW             1  0.07692   0.06492  0.02298   0.01028  0.10526   0.09363  0.30154
KNNW            11  0.07692   0.06492  0.03321   0.02065  0.10000   0.08830  0.72785
LOF              1  0.00000  -0.01300  0.01686   0.00408  0.06897   0.05686  0.43808
LOF              6  0.00000  -0.01300  0.03146   0.01886  0.07568   0.06366  0.78208
LOF             81  0.00000  -0.01300  0.04414   0.03171  0.15385   0.14285  0.72138
LOF             95  0.00000  -0.01300  0.04660   0.03421  0.15385   0.14285  0.72423
SimplifiedLOF    1  0.07692   0.06492  0.02162   0.00891  0.08333   0.07142  0.50885
SimplifiedLOF   96  0.00000  -0.01300  0.03945   0.02696  0.10870   0.09711  0.76277
SimplifiedLOF   99  0.00000  -0.01300  0.03944   0.02696  0.10417   0.09252  0.76608
LoOP             1  0.07692   0.06492  0.02184   0.00913  0.08333   0.07142  0.50877
LoOP            12  0.00000  -0.01300  0.03127   0.01867  0.07246   0.06041  0.78992
LoOP            97  0.00000  -0.01300  0.03142   0.01883  0.08403   0.07213  0.76131
LoOP            99  0.00000  -0.01300  0.03204   0.01945  0.08264   0.07072  0.76400
LDOF             2  0.07692   0.06492  0.02467   0.01199  0.08333   0.07142  0.50585
LDOF             3  0.07692   0.06492  0.02933   0.01671  0.11765   0.10618  0.35654
LDOF            12  0.00000  -0.01300  0.04493   0.03252  0.09677   0.08503  0.84462
LDOF            15  0.00000  -0.01300  0.04367   0.03124  0.09804   0.08631  0.84754
ODIN             2  0.01681   0.00403  0.01659   0.00381  0.03390   0.02134  0.61223
ODIN             8  0.00000  -0.01300  0.03814   0.02563  0.08511   0.07321  0.78900
ODIN            12  0.00000  -0.01300  0.04887   0.03650  0.11765   0.10618  0.77712
FastABOD         3  0.00000  -0.01300  0.01108  -0.00177  0.03774   0.02523  0.31600
FastABOD         7  0.07692   0.06492  0.01544   0.00264  0.08000   0.06804  0.27792
FastABOD        36  0.07692   0.06492  0.01934   0.00659  0.10000   0.08830  0.26431
KDEOS           25  0.00000  -0.01300  0.02125   0.00853  0.05155   0.03922  0.69192
KDEOS           69  0.15385   0.14285  0.03341   0.02084  0.15385   0.14285  0.62185
KDEOS           88  0.07692   0.06492  0.04031   0.02783  0.14815   0.13707  0.65208
KDEOS           94  0.15385   0.14285  0.03701   0.02449  0.16000   0.14908  0.66554
LDF              1  0.00000  -0.01300  0.01774   0.00497  0.07018   0.05809  0.41685
LDF              4  0.00000  -0.01300  0.02775   0.01511  0.08451   0.07261  0.71585
LDF             36  0.00000  -0.01300  0.03435   0.02180  0.13043   0.11913  0.67946
LDF             63  0.00000  -0.01300  0.03591   0.02338  0.13043   0.11913  0.69762
INFLO            1  0.07692   0.06492  0.02238   0.00967  0.09091   0.07909  0.48438
INFLO           98  0.00000  -0.01300  0.03604   0.02351  0.09091   0.07909  0.79885
INFLO          100  0.00000  -0.01300  0.03554   0.02301  0.09302   0.08123  0.79408
COF              1  0.07692   0.06492  0.02138   0.00865  0.08333   0.07142  0.50681
COF              2  0.07692   0.06492  0.02592   0.01325  0.10000   0.08830  0.46515
COF             13  0.07692   0.06492  0.02328   0.01059  0.10526   0.09363  0.38338
COF             71  0.00000  -0.01300  0.01877   0.00601  0.04380   0.03136  0.63969
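The adjusted columns in the table above are consistent with an adjustment for chance of the form (value - expected) / (1 - expected), where the expected value is the outlier rate 13/1013. This reading is inferred from the table values rather than stated on this page, and the function name `adjust_for_chance` is hypothetical. A minimal sketch under that assumption:

```python
def adjust_for_chance(value, n_outliers=13, n_objects=1013):
    """Rescale a measure so that the chance level maps to 0 and a perfect score to 1.

    Assumption: the expected value under a random ranking is the outlier rate
    n_outliers / n_objects, and the maximum attainable value is 1.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


# P@n values taken from the table above; compare with the Adj. P@n column.
for p_at_n in (0.07692, 0.00000, 0.15385):
    print(p_at_n, round(adjust_for_chance(p_at_n), 5))
```

Under this adjustment a value of 0 means chance-level performance, which is why a P@n of 0.00000 yields a small negative adjusted value.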

Plots

[Plots: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO
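The unadjusted evaluation measures listed above can be computed from an outlier ranking roughly as follows. This is a plain-Python sketch using the standard textbook definitions, not the code that produced the results on this page. All function names and the toy scores are hypothetical. Ties in the scores are broken by input order here, which is a simplification and may differ from the original evaluation.

```python
def rank_labels(scores, labels):
    """Return the labels ordered by decreasing outlier score (stable for ties)."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    return [label for _, label in pairs]


def precision_at_n(scores, labels, n=None):
    """Fraction of true outliers among the top n; n defaults to the outlier count."""
    ranked = rank_labels(scores, labels)
    if n is None:
        n = sum(labels)
    return sum(ranked[:n]) / n


def average_precision(scores, labels):
    """Mean of the precision values taken at the rank of each true outlier."""
    hits, total = 0, 0.0
    for rank, is_out in enumerate(rank_labels(scores, labels), start=1):
        if is_out:
            hits += 1
            total += hits / rank
    return total / hits


def max_f1(scores, labels):
    """Largest F1 score over all cut-off ranks."""
    n_out = sum(labels)
    hits, best = 0, 0.0
    for rank, is_out in enumerate(rank_labels(scores, labels), start=1):
        hits += is_out
        if hits:
            precision, recall = hits / rank, hits / n_out
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


def roc_auc(scores, labels):
    """Fraction of (outlier, inlier) pairs ranked correctly; ties count one half."""
    out = [s for s, l in zip(scores, labels) if l]
    inl = [s for s, l in zip(scores, labels) if not l]
    wins = sum(1.0 if o > i else 0.5 if o == i else 0.0 for o in out for i in inl)
    return wins / (len(out) * len(inl))


# Toy ranking (not from the Shuttle dataset): 1 marks an outlier, 0 an inlier.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 0]
print(precision_at_n(scores, labels), average_precision(scores, labels))
print(max_f1(scores, labels), roc_auc(scores, labels))
```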

Not normalized, without duplicates

This version contains 9 attributes, 1013 objects, and 13 outliers (1.28%).

Download raw algorithm results (8.3 MB)
Download raw algorithm evaluation table (43.7 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The maximum F1 measure is reported in addition to the measures in the original publication.

Algorithm        k  P@n      Adj. P@n  AP       Adj. AP   Max-F1   Adj. MF1  ROC AUC
KNN              3  0.30769   0.29869  0.16978   0.15898  0.30769   0.29869  0.94585
KNN              4  0.30769   0.29869  0.16342   0.15255  0.32000   0.31116  0.93538
KNN              7  0.30769   0.29869  0.20373   0.19338  0.30769   0.29869  0.96612
KNNW             3  0.23077   0.22077  0.12308   0.11168  0.24000   0.23012  0.70438
KNNW             7  0.23077   0.22077  0.15404   0.14304  0.27586   0.26645  0.92869
KNNW             8  0.23077   0.22077  0.16005   0.14913  0.27586   0.26645  0.94046
KNNW            10  0.23077   0.22077  0.15824   0.14729  0.26667   0.25713  0.94677
LOF              4  0.38462   0.37662  0.17927   0.16860  0.46667   0.45973  0.64746
LOF             10  0.38462   0.37662  0.25066   0.24092  0.41379   0.40617  0.97215
LOF             11  0.38462   0.37662  0.25651   0.24685  0.43137   0.42398  0.97177
SimplifiedLOF    6  0.38462   0.37662  0.16626   0.15542  0.38710   0.37913  0.65962
SimplifiedLOF    9  0.30769   0.29869  0.17587   0.16515  0.43750   0.43019  0.84585
SimplifiedLOF   16  0.30769   0.29869  0.21994   0.20980  0.41379   0.40617  0.96300
SimplifiedLOF   29  0.07692   0.06492  0.19333   0.18284  0.37736   0.36926  0.96515
LoOP             8  0.30769   0.29869  0.14119   0.13002  0.35294   0.34453  0.71019
LoOP            21  0.23077   0.22077  0.22197   0.21186  0.40000   0.39220  0.96269
LoOP            22  0.23077   0.22077  0.22711   0.21706  0.40000   0.39220  0.96246
LoOP            29  0.23077   0.22077  0.20827   0.19798  0.36364   0.35536  0.96469
LDOF            11  0.30769   0.29869  0.14383   0.13270  0.37037   0.36219  0.63885
LDOF            26  0.30769   0.29869  0.19836   0.18794  0.34286   0.33431  0.95115
LDOF            60  0.15385   0.14285  0.18084   0.17019  0.31250   0.30356  0.96185
ODIN            42  0.19231   0.18181  0.16072   0.14981  0.30000   0.29090  0.93638
ODIN            43  0.23077   0.22077  0.16257   0.15168  0.30000   0.29090  0.93431
ODIN            49  0.23077   0.22077  0.17698   0.16628  0.33333   0.32467  0.93035
ODIN            58  0.23077   0.22077  0.16781   0.15699  0.37500   0.36687  0.92562
FastABOD         3  0.15385   0.14285  0.06980   0.05771  0.22222   0.21211  0.59200
FastABOD        70  0.15385   0.14285  0.08324   0.07132  0.22222   0.21211  0.56162
KDEOS           12  0.15385   0.14285  0.04527   0.03285  0.15385   0.14285  0.66585
KDEOS           53  0.00000  -0.01300  0.06205   0.04986  0.14208   0.13092  0.90192
KDEOS           92  0.07692   0.06492  0.06851   0.05640  0.12766   0.11632  0.86369
LDF              4  0.46154   0.45454  0.18127   0.17063  0.46154   0.45454  0.64346
LDF              8  0.38462   0.37662  0.34704   0.33855  0.58065   0.57519  0.98446
LDF              9  0.38462   0.37662  0.34284   0.33429  0.60606   0.60094  0.98362
INFLO            6  0.38462   0.37662  0.14586   0.13475  0.38462   0.37662  0.68492
INFLO           15  0.30769   0.29869  0.21763   0.20745  0.36842   0.36021  0.94615
INFLO           62  0.07692   0.06492  0.18129   0.17065  0.34043   0.33185  0.95985
COF              2  0.23077   0.22077  0.08347   0.07155  0.26087   0.25126  0.58615
COF             27  0.15385   0.14285  0.14105   0.12988  0.27586   0.26645  0.93292
COF             50  0.07692   0.06492  0.14689   0.13580  0.34043   0.33185  0.90177
COF             53  0.07692   0.06492  0.14670   0.13561  0.35294   0.34453  0.89754
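The rule used to build the table above (best value per measure, smallest k on ties) can be sketched as follows. The function name `best_k` and the dictionary layout are hypothetical. The example uses only the three LOF rows listed above, not the full set of k values from the raw results.

```python
def best_k(results, measure):
    """Return the smallest k that achieves the highest value of the given measure.

    results maps k to a dict of measure name -> value.
    """
    best_value = max(row[measure] for row in results.values())
    return min(k for k, row in results.items() if row[measure] == best_value)


# LOF rows from the table above (not normalized, without duplicates).
lof = {
    4: {"P@n": 0.38462, "AP": 0.17927, "Max-F1": 0.46667, "ROC AUC": 0.64746},
    10: {"P@n": 0.38462, "AP": 0.25066, "Max-F1": 0.41379, "ROC AUC": 0.97215},
    11: {"P@n": 0.38462, "AP": 0.25651, "Max-F1": 0.43137, "ROC AUC": 0.97177},
}

for measure in ("P@n", "AP", "Max-F1", "ROC AUC"):
    print(measure, best_k(lof, measure))
```

All three rows tie on P@n, so the smallest k is selected for that measure, while average precision and ROC AUC each select a different row. This is why a method can appear with several values of k in the table.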

Plots

[Plots: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO