Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

Shuttle (version#06)

This dataset has been preprocessed in different ways in the literature. We follow the procedure of Zhang et al. [1]: classes 1, 3, 4, 5, 6 and 7 are treated as inliers and class 2 as outliers, and 1000 inliers and 13 outliers are selected from the test set. The processed dataset consists of 1013 instances described by 9 attributes, with 13 outliers (1.28%) and 1000 inliers (98.72%).
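The labeling and selection described above can be illustrated with a minimal Python sketch. The function names, the in-memory row format `(attributes, class_label)`, and the use of seeded uniform random sampling are assumptions made for illustration; this page does not state how the 1000 inliers were drawn.

```python
import random

INLIER_CLASSES = {1, 3, 4, 5, 6, 7}
OUTLIER_CLASS = 2


def build_variant(rows, n_inliers=1000, n_outliers=13, seed=0):
    """Relabel and subsample rows given as (attributes, class_label) pairs.

    Returns a list of (attributes, is_outlier) pairs.
    Assumption: uniform random sampling without replacement; the actual
    selection procedure is not described on this page.
    """
    rows = list(rows)
    rng = random.Random(seed)
    inliers = [attrs for attrs, label in rows if label in INLIER_CLASSES]
    outliers = [attrs for attrs, label in rows if label == OUTLIER_CLASS]
    chosen_in = rng.sample(inliers, n_inliers)
    chosen_out = rng.sample(outliers, min(n_outliers, len(outliers)))
    return [(a, False) for a in chosen_in] + [(a, True) for a in chosen_out]


def outlier_percentage(dataset):
    """Percentage of objects labeled as outliers."""
    n_out = sum(1 for _, is_outlier in dataset if is_outlier)
    return 100.0 * n_out / len(dataset)
```

With 13 outliers among 1013 objects, `outlier_percentage` yields 13 / 1013 * 100, which rounds to the 1.28% quoted above.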

References:

[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.

Download all data set variants used (328.2 kB). The original data is also available (shuttle.tst; [1] uses only the test set).

Normalized, without duplicates

This version contains 9 attributes, 1013 objects, 13 outliers (1.28%)
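As a rough illustration of what "normalized, without duplicates" can mean in practice, the sketch below removes repeated attribute vectors and rescales every attribute to [0, 1]. Both the min-max scheme and the order of the two steps are assumptions; this page does not specify the exact procedure, and the function names are hypothetical.

```python
def remove_duplicates(points):
    """Keep the first occurrence of each distinct attribute vector."""
    seen = set()
    unique = []
    for p in points:
        key = tuple(p)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def min_max_normalize(points):
    """Rescale each attribute to [0, 1] (assumed normalization scheme).

    Constant attributes are mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*points))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(p, lows, highs)
        )
        for p in points
    ]
```

For example, `min_max_normalize(remove_duplicates(points))` applies both steps to a list of attribute tuples.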

Download raw algorithm results (8.5 MB)
Download raw algorithm evaluation table (45.7 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one k, only the smallest k is given).
The Maximum F1-Measure is provided as a complement to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 1 0.30769 0.29869 0.27588 0.26646 0.51429 0.50797 0.81535
KNN 3 0.23077 0.22077 0.35849 0.35015 0.59091 0.58559 0.98723
KNNW 5 0.38462 0.37662 0.30790 0.29890 0.45000 0.44285 0.97923
KNNW 8 0.30769 0.29869 0.32945 0.32074 0.49057 0.48394 0.98454
KNNW 10 0.30769 0.29869 0.32718 0.31843 0.52000 0.51376 0.98446
LOF 4 0.30769 0.29869 0.11949 0.10805 0.34783 0.33935 0.67938
LOF 14 0.30769 0.29869 0.30925 0.30027 0.52000 0.51376 0.98569
LOF 16 0.23077 0.22077 0.29744 0.28831 0.53659 0.53056 0.98400
SimplifiedLOF 15 0.30769 0.29869 0.26222 0.25263 0.48148 0.47474 0.98100
SimplifiedLOF 17 0.30769 0.29869 0.27869 0.26931 0.52000 0.51376 0.98292
LoOP 18 0.30769 0.29869 0.22770 0.21766 0.34483 0.33631 0.97385
LoOP 33 0.23077 0.22077 0.31482 0.30592 0.50000 0.49350 0.98308
LDOF 28 0.30769 0.29869 0.17345 0.16271 0.33333 0.32467 0.95185
LDOF 78 0.23077 0.22077 0.29536 0.28620 0.51282 0.50649 0.97931
LDOF 83 0.23077 0.22077 0.29747 0.28833 0.50000 0.49350 0.98054
ODIN 39 0.15385 0.14285 0.31708 0.30820 0.57778 0.57229 0.98512
ODIN 77 0.49231 0.48571 0.38076 0.37271 0.62069 0.61576 0.97696
ODIN 82 0.61538 0.61038 0.35530 0.34692 0.61538 0.61038 0.96042
ODIN 90 0.61538 0.61038 0.37056 0.36237 0.64000 0.63532 0.92858
FastABOD 18 0.23077 0.22077 0.15649 0.14552 0.23810 0.22819 0.80285
FastABOD 48 0.23077 0.22077 0.17583 0.16512 0.28571 0.27643 0.80792
FastABOD 92 0.23077 0.22077 0.17682 0.16611 0.28571 0.27643 0.81162
FastABOD 100 0.23077 0.22077 0.17636 0.16565 0.28571 0.27643 0.81223
KDEOS 91 0.23077 0.22077 0.17026 0.15948 0.30303 0.29397 0.96000
KDEOS 93 0.23077 0.22077 0.17268 0.16193 0.28571 0.27643 0.96092
KDEOS 98 0.23077 0.22077 0.17506 0.16434 0.30769 0.29869 0.96062
KDEOS 100 0.23077 0.22077 0.17554 0.16482 0.30000 0.29090 0.95869
LDF 5 0.15385 0.14285 0.29716 0.28802 0.60000 0.59480 0.97569
LDF 6 0.23077 0.22077 0.27600 0.26659 0.50000 0.49350 0.97815
LDF 12 0.23077 0.22077 0.32381 0.31502 0.56522 0.55957 0.98708
LDF 13 0.23077 0.22077 0.32583 0.31707 0.55556 0.54978 0.98700
INFLO 12 0.30769 0.29869 0.19625 0.18580 0.36735 0.35912 0.88438
INFLO 25 0.15385 0.14285 0.27600 0.26659 0.45283 0.44572 0.98046
INFLO 26 0.15385 0.14285 0.27798 0.26860 0.46154 0.45454 0.98023
INFLO 31 0.15385 0.14285 0.27969 0.27033 0.45455 0.44745 0.97962
COF 14 0.38462 0.37662 0.24966 0.23991 0.41667 0.40908 0.96654
COF 29 0.23077 0.22077 0.38916 0.38122 0.61905 0.61410 0.98815
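The adjusted columns in the table can be reproduced from the unadjusted ones with the usual adjustment for chance, sketched below in Python. The expected value |O|/N (here 13/1013) is the standard choice for P@n and AP. That the same expected value is used for Adj. MF1 is an inference from the table values, which it matches to about five decimals; it is not stated on this page. The function name is hypothetical.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Adjust a measure for chance: (value - expected) / (1 - expected).

    Assumption: the expected value of a random ranking is taken to be
    n_outliers / n_objects for every adjusted measure in the table.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


# First KNN row (k = 1) of the table above:
adj_p_at_n = adjust_for_chance(0.30769, 13, 1013)  # table lists 0.29869
adj_ap = adjust_for_chance(0.27588, 13, 1013)      # table lists 0.26646
adj_mf1 = adjust_for_chance(0.51429, 13, 1013)     # table lists 0.50797
```

Under this adjustment a perfect result stays at 1, while a result at the level expected from a random ranking maps to 0.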

Plots

Plots are provided for: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, without duplicates

This version contains 9 attributes, 1013 objects, 13 outliers (1.28%)

Download raw algorithm results (8.3 MB)
Download raw algorithm evaluation table (44.9 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one k, only the smallest k is given).
The Maximum F1-Measure is provided as a complement to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 2 0.30769 0.29869 0.15643 0.14546 0.30769 0.29869 0.81369
KNN 7 0.23077 0.22077 0.24029 0.23041 0.42308 0.41558 0.97577
KNNW 3 0.23077 0.22077 0.11292 0.10139 0.24000 0.23012 0.71404
KNNW 6 0.23077 0.22077 0.15050 0.13946 0.27586 0.26645 0.92377
KNNW 12 0.23077 0.22077 0.17997 0.16931 0.26531 0.25576 0.96092
KNNW 13 0.23077 0.22077 0.18127 0.17062 0.26531 0.25576 0.96062
LOF 9 0.30769 0.29869 0.17780 0.16711 0.37037 0.36219 0.93562
LOF 14 0.23077 0.22077 0.22988 0.21986 0.40909 0.40141 0.94685
LOF 20 0.07692 0.06492 0.17473 0.16400 0.36735 0.35912 0.95915
SimplifiedLOF 2 0.23077 0.22077 0.09322 0.08143 0.30000 0.29090 0.62831
SimplifiedLOF 20 0.15385 0.14285 0.19724 0.18681 0.34483 0.33631 0.96277
SimplifiedLOF 23 0.15385 0.14285 0.19723 0.18679 0.30303 0.29397 0.96500
SimplifiedLOF 99 0.07692 0.06492 0.17670 0.16600 0.36364 0.35536 0.95192
LoOP 2 0.23077 0.22077 0.09095 0.07913 0.30000 0.29090 0.67438
LoOP 26 0.15385 0.14285 0.19913 0.18872 0.35294 0.34453 0.96292
LoOP 28 0.23077 0.22077 0.19930 0.18889 0.33333 0.32467 0.96254
LoOP 29 0.15385 0.14285 0.19527 0.18481 0.32653 0.31778 0.96331
LDOF 3 0.23077 0.22077 0.08301 0.07109 0.24000 0.23012 0.52508
LDOF 39 0.23077 0.22077 0.17512 0.16440 0.27451 0.26508 0.95738
LDOF 51 0.15385 0.14285 0.15933 0.14841 0.30303 0.29397 0.96031
LDOF 58 0.15385 0.14285 0.15798 0.14704 0.31746 0.30859 0.95962
ODIN 45 0.20513 0.19479 0.20069 0.19030 0.42857 0.42114 0.95169
ODIN 53 0.23077 0.22077 0.23570 0.22576 0.47368 0.46684 0.94665
ODIN 59 0.17949 0.16882 0.23009 0.22008 0.50000 0.49350 0.93577
ODIN 68 0.25641 0.24674 0.20753 0.19723 0.40000 0.39220 0.92469
FastABOD 3 0.15385 0.14285 0.06284 0.05066 0.21053 0.20026 0.62446
FastABOD 69 0.15385 0.14285 0.07675 0.06474 0.21053 0.20026 0.57831
KDEOS 86 0.07692 0.06492 0.09656 0.08482 0.16495 0.15409 0.91115
KDEOS 87 0.15385 0.14285 0.09836 0.08664 0.16327 0.15239 0.91092
KDEOS 97 0.15385 0.14285 0.10726 0.09565 0.21053 0.20026 0.90808
LDF 4 0.30769 0.29869 0.12516 0.11378 0.34483 0.33631 0.63146
LDF 11 0.23077 0.22077 0.29009 0.28086 0.54545 0.53955 0.97546
INFLO 9 0.30769 0.29869 0.11521 0.10371 0.30769 0.29869 0.77654
INFLO 19 0.23077 0.22077 0.18701 0.17644 0.32258 0.31377 0.94954
INFLO 20 0.23077 0.22077 0.18518 0.17459 0.33333 0.32467 0.95438
INFLO 52 0.07692 0.06492 0.16435 0.15349 0.29032 0.28110 0.96038
COF 2 0.23077 0.22077 0.07516 0.06313 0.27273 0.26327 0.58046
COF 24 0.15385 0.14285 0.15835 0.14741 0.25316 0.24346 0.94923
COF 36 0.07692 0.06492 0.15471 0.14373 0.36000 0.35168 0.93762
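The tie-breaking rule stated for the table (report the smallest k when a score is reached more than once) can be sketched as follows. The function name and the dictionary layout are hypothetical, and the example uses only the four KNNW rows listed above rather than the full raw results.

```python
def best_k_per_measure(results):
    """Return {measure: (k, score)} with the best score per measure.

    results maps k to a dict of measure name -> score. Iterating over k
    in ascending order and replacing only on a strictly larger score
    keeps the smallest k among ties.
    """
    best = {}
    for k in sorted(results):
        for measure, score in results[k].items():
            if measure not in best or score > best[measure][1]:
                best[measure] = (k, score)
    return best


# KNNW rows from the table above (not normalized, without duplicates)
knnw = {
    3: {"P@n": 0.23077, "AP": 0.11292, "Max-F1": 0.24000, "ROC AUC": 0.71404},
    6: {"P@n": 0.23077, "AP": 0.15050, "Max-F1": 0.27586, "ROC AUC": 0.92377},
    12: {"P@n": 0.23077, "AP": 0.17997, "Max-F1": 0.26531, "ROC AUC": 0.96092},
    13: {"P@n": 0.23077, "AP": 0.18127, "Max-F1": 0.26531, "ROC AUC": 0.96062},
}
best = best_k_per_measure(knnw)
```

On these four rows, P@n is tied at 0.23077 for every k, so k = 3 is selected; AP is best at k = 13, Max-F1 at k = 6, and ROC AUC at k = 12.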

Plots

Plots are provided for: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO