Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

PageBlocks (10% of outliers)

The data set describes blocks of different types found in document pages. Distinguishing these block types, in particular separating text from pictures or graphics, is an essential step in document analysis. Blocks whose content is text are labeled here as inliers; all other blocks are labeled as outliers.
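
The labeling rule above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: it assumes that the original data encodes the block type as an integer class with 1 meaning text, and the names `TEXT_CLASS` and `label_block` are hypothetical.

```python
# Assumption: text blocks carry class code 1 in the original data;
# check the documentation of the original data before relying on this.
TEXT_CLASS = 1


def label_block(block_class: int) -> str:
    """Return 'inlier' for text blocks and 'outlier' for every other type."""
    return "inlier" if block_class == TEXT_CLASS else "outlier"


# Hypothetical class codes for five blocks, for illustration only.
example_classes = [1, 1, 2, 5, 1]
labels = [label_block(c) for c in example_classes]
print(labels)
```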

Download all data set variants used (14.6 MB). You can also access the original data (page-blocks.data.Z).

Normalized, duplicates

This version contains 10 attributes and 5473 objects, of which 560 (10.23%) are outliers.

Download raw algorithm results (46.0 MB)
Download raw algorithm evaluation table (75.0 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures used in the original publication.
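
The tie-breaking rule can be illustrated with a short sketch. The helper `best_k` is hypothetical, and the example uses the rounded Max-F1 values printed for SimplifiedLOF in the "Not normalized, duplicates" table on this page; the actual selection was presumably made over all tested values of k and on unrounded scores.

```python
def best_k(score_by_k: dict[int, float]) -> int:
    """Return the smallest k that achieves the maximum score."""
    best_score = max(score_by_k.values())
    return min(k for k, score in score_by_k.items() if score == best_score)


# Max-F1 of SimplifiedLOF as printed (rounded) on this page,
# "Not normalized, duplicates" variant. k = 97 and k = 98 tie.
simplified_lof_max_f1 = {97: 0.53157, 98: 0.53157, 100: 0.53107}
chosen_k = best_k(simplified_lof_max_f1)
print(chosen_k)  # 97, the smaller of the two tied values of k
```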

Algorithm        k  P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. MF1  ROC AUC
KNN             41  0.41429   0.34752   0.44167   0.37803   0.43831   0.37429   0.75938
KNN             52  0.42857   0.36344   0.44352   0.38009   0.43158   0.36679   0.75782
KNN            100  0.40357   0.33559   0.44941   0.38665   0.42573   0.36027   0.84082
KNNW            60  0.41250   0.34553   0.42944   0.36440   0.43492   0.37051   0.75451
KNNW            74  0.42500   0.35946   0.43355   0.36898   0.43178   0.36701   0.75616
KNNW           100  0.41071   0.34355   0.43617   0.37191   0.42590   0.36046   0.76448
LOF             60  0.50357   0.44699   0.48218   0.42316   0.50968   0.45380   0.81871
LOF             65  0.49821   0.44102   0.48422   0.42543   0.50752   0.45139   0.81583
LOF             76  0.51071   0.45494   0.47665   0.41700   0.51852   0.46364   0.81226
SimplifiedLOF   60  0.51607   0.46091   0.48408   0.42527   0.52553   0.47145   0.79972
SimplifiedLOF   84  0.50357   0.44699   0.49068   0.43263   0.51815   0.46322   0.81161
SimplifiedLOF   98  0.50893   0.45295   0.48423   0.42544   0.53307   0.47985   0.80468
LoOP            61  0.44643   0.38333   0.41925   0.35305   0.47279   0.41270   0.78332
LoOP            64  0.46786   0.40720   0.42311   0.35736   0.46892   0.40838   0.78434
LoOP            82  0.45536   0.39328   0.43235   0.36765   0.46269   0.40144   0.79053
LoOP            86  0.44821   0.38532   0.43154   0.36675   0.45972   0.39814   0.79381
LDOF            82  0.49464   0.43704   0.47476   0.41489   0.49923   0.44215   0.82978
LDOF            88  0.50000   0.44301   0.47594   0.41621   0.50563   0.44928   0.82910
ODIN            96  0.42857   0.36344   0.36324   0.29066   0.43358   0.36902   0.72642
ODIN            98  0.43036   0.36543   0.36599   0.29372   0.43173   0.36696   0.72948
ODIN           100  0.42721   0.36192   0.36717   0.29504   0.42806   0.36287   0.73064
FastABOD         3  0.33929   0.26398   0.35264   0.27885   0.36044   0.28754   0.68121
KDEOS           75  0.21964   0.13070   0.19056   0.09830   0.28766   0.20646   0.69122
KDEOS           84  0.21429   0.12473   0.19118   0.09899   0.29309   0.21252   0.69333
KDEOS           91  0.21429   0.12473   0.19073   0.09849   0.29792   0.21790   0.69507
KDEOS           95  0.20893   0.11876   0.18948   0.09709   0.29857   0.21862   0.69477
LDF             42  0.53393   0.48080   0.50741   0.45126   0.53680   0.48401   0.83024
LDF             59  0.53929   0.48677   0.53266   0.47939   0.54025   0.48785   0.82554
LDF             60  0.53750   0.48478   0.53346   0.48029   0.54044   0.48806   0.82523
LDF             62  0.53750   0.48478   0.53167   0.47829   0.54467   0.49277   0.82393
INFLO           65  0.46071   0.39924   0.41602   0.34946   0.46782   0.40715   0.76004
INFLO           66  0.46429   0.40322   0.41520   0.34854   0.47059   0.41024   0.76062
INFLO           80  0.45179   0.38930   0.41106   0.34393   0.45503   0.39291   0.76796
COF             70  0.48036   0.42113   0.45635   0.39439   0.48681   0.42831   0.76919
COF             71  0.46786   0.40720   0.45581   0.39378   0.48061   0.42140   0.77017
COF             99  0.49107   0.43306   0.43447   0.37001   0.49952   0.44248   0.72153
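
The adjusted columns in the table are consistent with the usual adjustment for chance, (m - r) / (1 - r), where m is the raw measure and r = 560 / 5473 is the outlier rate of this data set version. The sketch below checks this against the KNN row for k = 41. The formula is inferred here from the printed values and is assumed to match the definition in the original publication; `adjust_for_chance` is a hypothetical helper name.

```python
N_OBJECTS = 5473
N_OUTLIERS = 560
# Outlier rate, used as the chance-level baseline for all three measures.
OUTLIER_RATE = N_OUTLIERS / N_OBJECTS


def adjust_for_chance(measure: float, baseline: float = OUTLIER_RATE) -> float:
    """Rescale a measure so that the baseline maps to 0 and a perfect score to 1."""
    return (measure - baseline) / (1.0 - baseline)


# (raw, adjusted) pairs as printed for KNN with k = 41.
knn_k41 = {
    "P@n": (0.41429, 0.34752),
    "AP": (0.44167, 0.37803),
    "Max-F1": (0.43831, 0.37429),
}

# Largest gap between the recomputed and the printed adjusted values.
max_deviation = max(
    abs(adjust_for_chance(raw) - adjusted) for raw, adjusted in knn_k41.values()
)
print(f"{max_deviation:.1e}")
```

Because the printed values are rounded to five decimals, the recomputed and printed adjusted values can differ slightly in the last digit.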

Plots

Plots (images not reproduced in this text version): Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity.
Method legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, duplicates

This version contains 10 attributes and 5473 objects, of which 560 (10.23%) are outliers.

Download raw algorithm results (47.3 MB)
Download raw algorithm evaluation table (73.5 kB)

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures used in the original publication.

Algorithm        k  P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. MF1  ROC AUC
KNN              4  0.19464   0.10285   0.20491   0.11428   0.20635   0.11589   0.50624
KNN             27  0.23036   0.14263   0.20453   0.11386   0.23626   0.14921   0.54437
KNN             36  0.23393   0.14661   0.20337   0.11257   0.23393   0.14661   0.54515
KNN             71  0.21964   0.13070   0.19697   0.10544   0.22528   0.13697   0.54599
KNNW            37  0.20536   0.11478   0.20434   0.11365   0.23259   0.14512   0.53581
KNNW            50  0.21071   0.12075   0.20382   0.11307   0.23578   0.14867   0.53908
KNNW            68  0.23214   0.14462   0.20210   0.11115   0.23396   0.14664   0.54121
KNNW           100  0.22679   0.13865   0.19922   0.10794   0.22977   0.14197   0.54288
LOF             91  0.54107   0.48876   0.54658   0.49490   0.54204   0.48984   0.86145
LOF            100  0.53929   0.48677   0.54893   0.49752   0.54414   0.49218   0.86003
SimplifiedLOF   97  0.51607   0.46091   0.51280   0.45727   0.53157   0.47818   0.82976
SimplifiedLOF   98  0.51964   0.46489   0.51325   0.45777   0.53157   0.47818   0.83121
SimplifiedLOF  100  0.51964   0.46489   0.51471   0.45940   0.53107   0.47762   0.83307
LoOP           100  0.48929   0.43107   0.49399   0.43632   0.49709   0.43977   0.81555
LDOF            98  0.45714   0.39527   0.45692   0.39501   0.45822   0.39647   0.81152
LDOF           100  0.45357   0.39129   0.45928   0.39764   0.45714   0.39527   0.81396
ODIN           100  0.41382   0.34701   0.38472   0.31459   0.42036   0.35429   0.81009
FastABOD         3  0.18750   0.09489   0.18818   0.09564   0.20670   0.11628   0.44636
KDEOS           98  0.13393   0.03521   0.13667   0.03827   0.25154   0.16622   0.63679
KDEOS          100  0.12857   0.02924   0.13755   0.03924   0.25304   0.16789   0.63915
LDF             66  0.56250   0.51263   0.56827   0.51906   0.56383   0.51411   0.87118
LDF             91  0.59464   0.54844   0.57888   0.53088   0.60870   0.56409   0.85966
LDF             95  0.59821   0.55242   0.57659   0.52833   0.61301   0.56890   0.85611
LDF             98  0.60536   0.56037   0.57454   0.52605   0.61224   0.56805   0.85320
INFLO          100  0.50179   0.44500   0.48963   0.43145   0.50923   0.45329   0.80185
COF             89  0.48571   0.42709   0.48833   0.43001   0.51498   0.45970   0.77090
COF             98  0.50536   0.44898   0.49766   0.44040   0.51313   0.45764   0.77726
COF             99  0.51071   0.45494   0.49893   0.44182   0.51237   0.45679   0.77373
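
For readers unfamiliar with the Max-F1 column, the following sketch computes a maximum F1 score from an outlier ranking in the standard way: the F1 score is evaluated at every cutoff of the ranking and the largest value is kept. It assumes that higher scores mean more outlying and it ignores tied scores; whether the published numbers were computed in exactly this way (in particular for ties) is not stated on this page. The function name `max_f1` and the toy data are hypothetical.

```python
def max_f1(scores: list[float], is_outlier: list[int]) -> float:
    """Maximum F1 score over all cutoffs of the ranking induced by `scores`."""
    # Rank objects from most to least outlying.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_outliers = sum(is_outlier)
    hits = 0
    best = 0.0
    for rank, index in enumerate(order, start=1):
        hits += is_outlier[index]
        if hits:
            precision = hits / rank
            recall = hits / total_outliers
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy ranking: two true outliers among five objects.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 0, 1, 0, 0]
toy_max_f1 = max_f1(toy_scores, toy_labels)
print(round(toy_max_f1, 5))  # 0.8, reached at the cutoff after the third object
```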

Plots

Plots (images not reproduced in this text version): Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity.
Method legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO