Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

Shuttle (version#05)

This dataset has been preprocessed in several different ways in the literature. We follow the procedure of Zhang et al. [1], using classes 1, 3, 4, 5, 6 and 7 as inliers and class 2 as outliers, and selecting 1000 inliers and 13 outliers. Instances are selected from the test set. The processed dataset consists of 1013 instances described by 9 attributes, with 13 outliers (1.28%) and 1000 inliers (98.72%).
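The selection described above can be sketched as follows. This is only an illustration: the function names, the in-memory row format (a tuple of attributes plus a class label) and the use of uniform random sampling for the 1000 inliers are assumptions, since the page does not state how the inliers were chosen.

```python
import random

INLIER_CLASSES = {1, 3, 4, 5, 6, 7}  # inlier classes as stated in the text
OUTLIER_CLASS = 2                    # outlier class as stated in the text


def select_subset(rows, n_inliers=1000, seed=0):
    """Build an outlier detection dataset from labeled rows.

    rows: list of (attributes, class_label) pairs (hypothetical format).
    Returns a list of (attributes, is_outlier) pairs.
    """
    outliers = [attrs for attrs, label in rows if label == OUTLIER_CLASS]
    inliers = [attrs for attrs, label in rows if label in INLIER_CLASSES]
    # Assumption: inliers are drawn uniformly at random without replacement.
    rng = random.Random(seed)
    sampled = rng.sample(inliers, n_inliers)
    return [(a, False) for a in sampled] + [(a, True) for a in outliers]


def outlier_rate(dataset):
    """Fraction of objects labeled as outliers."""
    return sum(1 for _, is_outlier in dataset if is_outlier) / len(dataset)
```

With 1000 inliers and 13 outliers, `outlier_rate` returns 13/1013, which is the 1.28% reported above.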

References:

[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.

Download all data set variants used (328.2 kB). You can also access the original data (shuttle.tst; [1] uses only the test set).

Normalized, without duplicates

This version contains 9 attributes, 1013 objects, 13 outliers (1.28%)
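As a rough sketch of what "normalized, without duplicates" could mean in code: the example below removes exact duplicate attribute vectors and rescales each attribute linearly to [0, 1]. Both function names are hypothetical, and min-max scaling is an assumption, because this page does not say which normalization was applied.

```python
def drop_duplicates(rows):
    """Keep the first occurrence of each attribute vector, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def min_max_normalize(rows):
    """Rescale every attribute (column) linearly to the range [0, 1].

    Assumption: min-max scaling; constant attributes are mapped to 0.0.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        )
        for row in rows
    ]
```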

Download raw algorithm results (8.6 MB) Download raw algorithm evaluation table (47.2 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved twice, only the smallest k is given).
The Maximum F1-Measure is provided in addition to the measures in the original publication.
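The row selection rule (best k per measure, ties resolved in favor of the smallest k) can be sketched as follows. The function name, the dictionary layout and the toy numbers in the usage note are illustrative assumptions, not taken from the raw results.

```python
def best_rows(results, measures):
    """Pick, for each measure, the smallest k that achieves the best score.

    results: {k: {measure_name: score}} for one method (hypothetical layout).
    Returns {k: [measures for which this k is the reported best]}, sorted by k.
    """
    chosen = {}
    for measure in measures:
        top = max(scores[measure] for scores in results.values())
        # Ties on the best score are broken by taking the smallest k.
        k = min(k for k, scores in results.items() if scores[measure] == top)
        chosen.setdefault(k, []).append(measure)
    return dict(sorted(chosen.items()))
```

For example, with made-up scores `{1: {"P@n": 0.1, "AUC": 0.6}, 2: {"P@n": 0.1, "AUC": 0.9}, 3: {"P@n": 0.05, "AUC": 0.9}}`, the rule reports k=1 for P@n and k=2 for AUC. This is why a method can appear with several values of k in the table: each listed row is the best for at least one measure.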

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 1 0.07692 0.06492 0.04377 0.03134 0.12698 0.11563 0.60369
KNN 3 0.07692 0.06492 0.11191 0.10036 0.22000 0.20986 0.93723
KNNW 1 0.07692 0.06492 0.03720 0.02469 0.14815 0.13707 0.45031
KNNW 8 0.07692 0.06492 0.07466 0.06263 0.15686 0.14590 0.88631
KNNW 11 0.07692 0.06492 0.07542 0.06340 0.17021 0.15943 0.88269
LOF 4 0.07692 0.06492 0.07739 0.06539 0.18182 0.17118 0.89608
LOF 8 0.07692 0.06492 0.09774 0.08601 0.31111 0.30216 0.79369
LOF 9 0.15385 0.14285 0.10413 0.09248 0.31111 0.30216 0.80123
SimplifiedLOF 1 0.07692 0.06492 0.03503 0.02248 0.11765 0.10618 0.52096
SimplifiedLOF 7 0.00000 -0.01300 0.07221 0.06015 0.16949 0.15869 0.87485
SimplifiedLOF 10 0.00000 -0.01300 0.08353 0.07161 0.20588 0.19556 0.85069
LoOP 1 0.07692 0.06492 0.03147 0.01887 0.11111 0.09956 0.53596
LoOP 7 0.00000 -0.01300 0.06877 0.05666 0.16393 0.15307 0.86900
LoOP 10 0.07692 0.06492 0.07958 0.06762 0.18182 0.17118 0.84658
LoOP 13 0.07692 0.06492 0.07628 0.06427 0.20290 0.19254 0.83681
LDOF 2 0.07692 0.06492 0.04371 0.03127 0.12500 0.11362 0.51000
LDOF 10 0.00000 -0.01300 0.06053 0.04831 0.15652 0.14556 0.88538
LDOF 23 0.00000 -0.01300 0.07634 0.06433 0.22222 0.21211 0.82885
LDOF 25 0.00000 -0.01300 0.07663 0.06463 0.21277 0.20253 0.81215
ODIN 20 0.07692 0.06492 0.09991 0.08820 0.26923 0.25973 0.86027
ODIN 24 0.00000 -0.01300 0.09149 0.07968 0.20000 0.18960 0.88158
FastABOD 3 0.07692 0.06492 0.02812 0.01549 0.10256 0.09090 0.49900
FastABOD 5 0.15385 0.14285 0.03105 0.01846 0.15385 0.14285 0.49085
FastABOD 6 0.15385 0.14285 0.03153 0.01894 0.16000 0.14908 0.47985
KDEOS 34 0.23077 0.22077 0.11013 0.09856 0.23077 0.22077 0.83200
KDEOS 43 0.23077 0.22077 0.13053 0.11923 0.28571 0.27643 0.80115
KDEOS 44 0.23077 0.22077 0.13111 0.11981 0.28571 0.27643 0.79823
KDEOS 98 0.15385 0.14285 0.09963 0.08793 0.22222 0.21211 0.85700
LDF 4 0.07692 0.06492 0.09849 0.08677 0.26667 0.25713 0.85354
LDF 5 0.15385 0.14285 0.09355 0.08177 0.23729 0.22737 0.77500
INFLO 1 0.07692 0.06492 0.02312 0.01042 0.08163 0.06969 0.49346
INFLO 10 0.07692 0.06492 0.07906 0.06709 0.18182 0.17118 0.88369
INFLO 11 0.07692 0.06492 0.07576 0.06375 0.19178 0.18127 0.88092
INFLO 13 0.00000 -0.01300 0.06999 0.05790 0.17073 0.15995 0.88938
COF 1 0.07692 0.06492 0.03512 0.02258 0.11765 0.10618 0.52192
COF 14 0.07692 0.06492 0.05308 0.04077 0.15000 0.13895 0.69946
COF 71 0.00000 -0.01300 0.04707 0.03468 0.15217 0.14115 0.81762
COF 79 0.00000 -0.01300 0.04650 0.03410 0.15909 0.14816 0.72508
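The unadjusted measures in the table above can be computed from a ranking of the objects by decreasing outlier score. The sketch below uses the standard textbook definitions and assumes a ranking without ties; the function names are illustrative, and the tool used to produce the raw results may handle ties differently.

```python
def precision_at_n(ranked_labels, n=None):
    """Fraction of true outliers among the top n (default: n = number of outliers)."""
    if n is None:
        n = sum(ranked_labels)
    return sum(ranked_labels[:n]) / n


def average_precision(ranked_labels):
    """Mean of the precision values at the rank of each true outlier."""
    hits = 0
    total = 0.0
    for rank, is_outlier in enumerate(ranked_labels, start=1):
        if is_outlier:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def max_f1(ranked_labels):
    """Largest F1 score over all cut-off ranks."""
    n_outliers = sum(ranked_labels)
    best = 0.0
    hits = 0
    for rank, is_outlier in enumerate(ranked_labels, start=1):
        hits += is_outlier
        # F1 = 2PR / (P + R) with P = hits/rank and R = hits/n_outliers
        best = max(best, 2 * hits / (rank + n_outliers))
    return best


def roc_auc(ranked_labels):
    """Fraction of (outlier, inlier) pairs in which the outlier is ranked first."""
    n_outliers = sum(ranked_labels)
    n_inliers = len(ranked_labels) - n_outliers
    outliers_seen = 0
    correct_pairs = 0
    for is_outlier in ranked_labels:
        if is_outlier:
            outliers_seen += 1
        else:
            correct_pairs += outliers_seen
    return correct_pairs / (n_outliers * n_inliers)
```

Here `ranked_labels` is a list of 0/1 flags (1 for a true outlier) ordered from the highest to the lowest outlier score.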

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity
A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF
G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, without duplicates

This version contains 9 attributes, 1013 objects, 13 outliers (1.28%)

Download raw algorithm results (8.2 MB) Download raw algorithm evaluation table (44.3 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved twice, only the smallest k is given).
The Maximum F1-Measure is provided in addition to the measures in the original publication.
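The adjusted columns (Adj. P@n, Adj. AP, Adj. MF1) in these tables are consistent with an adjustment for chance in which the expected value of a random ranking is taken to be the outlier rate and the maximum value is 1. The following minimal sketch reproduces the tabulated values under that assumption; the function name is illustrative.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that the expected random score maps to 0 and a perfect score to 1.

    Assumption: the expected value under a random ranking is n_outliers / n_objects.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)
```

For this dataset (13 outliers among 1013 objects), a P@n of 4/13 (0.30769) is mapped to 0.29869 and a P@n of 0 is mapped to -0.01300, matching the table entries.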

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 3 0.30769 0.29869 0.16714 0.15632 0.30769 0.29869 0.94746
KNN 5 0.30769 0.29869 0.19726 0.18683 0.31169 0.30274 0.96742
KNNW 4 0.23077 0.22077 0.12357 0.11218 0.25000 0.24025 0.84000
KNNW 13 0.23077 0.22077 0.16715 0.15633 0.26667 0.25713 0.95531
LOF 4 0.30769 0.29869 0.14726 0.13617 0.35897 0.35064 0.65146
LOF 14 0.30769 0.29869 0.25669 0.24703 0.47368 0.46684 0.96723
LOF 15 0.30769 0.29869 0.25770 0.24805 0.50000 0.49350 0.96377
SimplifiedLOF 5 0.30769 0.29869 0.14864 0.13757 0.35714 0.34879 0.68877
SimplifiedLOF 18 0.23077 0.22077 0.23124 0.22125 0.41667 0.40908 0.96646
SimplifiedLOF 19 0.23077 0.22077 0.23723 0.22731 0.45000 0.44285 0.96531
LoOP 7 0.30769 0.29869 0.11831 0.10685 0.35714 0.34879 0.65508
LoOP 27 0.15385 0.14285 0.21549 0.20529 0.40909 0.40141 0.95954
LDOF 11 0.30769 0.29869 0.11651 0.10503 0.30769 0.29869 0.65338
LDOF 34 0.23077 0.22077 0.21796 0.20779 0.43478 0.42743 0.96238
LDOF 39 0.23077 0.22077 0.22694 0.21689 0.43478 0.42743 0.97000
LDOF 40 0.23077 0.22077 0.23114 0.22115 0.43478 0.42743 0.97000
ODIN 50 0.23077 0.22077 0.27191 0.26245 0.52941 0.52329 0.95577
ODIN 61 0.30769 0.29869 0.24872 0.23896 0.47059 0.46371 0.93715
FastABOD 3 0.15385 0.14285 0.05998 0.04776 0.20000 0.18960 0.62554
FastABOD 71 0.15385 0.14285 0.07094 0.05886 0.20690 0.19659 0.57677
FastABOD 100 0.15385 0.14285 0.07241 0.06035 0.20690 0.19659 0.57385
KDEOS 52 0.23077 0.22077 0.10829 0.09670 0.23077 0.22077 0.91385
KDEOS 60 0.15385 0.14285 0.14132 0.13015 0.23881 0.22891 0.92892
KDEOS 63 0.15385 0.14285 0.12470 0.11333 0.22785 0.21781 0.93154
KDEOS 100 0.23077 0.22077 0.11032 0.09875 0.26087 0.25126 0.89662
LDF 8 0.30769 0.29869 0.24244 0.23260 0.50000 0.49350 0.96746
LDF 14 0.38462 0.37662 0.26753 0.25801 0.45833 0.45129 0.96300
INFLO 5 0.30769 0.29869 0.11446 0.10295 0.35714 0.34879 0.67885
INFLO 17 0.15385 0.14285 0.19999 0.18959 0.39024 0.38232 0.94038
INFLO 19 0.23077 0.22077 0.20645 0.19613 0.38298 0.37496 0.94254
INFLO 51 0.07692 0.06492 0.16127 0.15036 0.32653 0.31778 0.95892
COF 2 0.23077 0.22077 0.09030 0.07847 0.27273 0.26327 0.64869
COF 21 0.15385 0.14285 0.10801 0.09642 0.20000 0.18960 0.92638

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity
A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF
G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO