Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

PageBlocks (9% of outliers)

The data set describes blocks of different types detected in document pages. Distinguishing these block types is an essential step in document analysis, in particular for separating text from pictures or graphics. Blocks whose content is text are labeled as inliers here; all other blocks are labeled as outliers.
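The labeling rule above can be sketched in a few lines. This is only an illustration: it assumes that the original data encodes the block type as an integer class code and that code 1 stands for text, which should be checked against the description of the original data. The names `TEXT_CLASS`, `label_block` and `label_blocks` are hypothetical.

```python
# Assumption: class code 1 denotes "text" in the original page-blocks data;
# every other class code is treated as a non-text block.
TEXT_CLASS = 1


def label_block(class_code: int) -> str:
    """Return 'inlier' for text blocks and 'outlier' for any other block type."""
    return "inlier" if class_code == TEXT_CLASS else "outlier"


def label_blocks(class_codes):
    """Apply the labeling rule to a sequence of class codes."""
    return [label_block(code) for code in class_codes]


# Toy class codes, not taken from the data set.
labels = label_blocks([1, 1, 2, 5, 1])
print(labels)
```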

Download all data set variants used (14.6 MB). The original data (page-blocks.data.Z) is also available.

Normalized, without duplicates

This version contains 10 attributes, 5393 objects, 510 outliers (9.46%)
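As a rough illustration of what a "normalized, without duplicates" variant may involve, the sketch below removes repeated attribute vectors and rescales each attribute to [0, 1]. Both the min-max scaling and the order of the two steps are assumptions made for this example; this page does not state the exact procedure. The helper names are hypothetical. The last function reproduces the outlier percentage quoted above from the two counts.

```python
def drop_duplicates(rows):
    """Keep the first occurrence of each attribute vector."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def min_max_normalize(rows):
    """Scale every attribute (column) to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def outlier_percentage(n_outliers, n_objects):
    """Share of outliers in percent."""
    return 100.0 * n_outliers / n_objects


# Toy data with one duplicated row, not taken from the data set.
toy = [(1.0, 10.0), (3.0, 10.0), (1.0, 10.0), (5.0, 30.0)]
unique = drop_duplicates(toy)
normalized = min_max_normalize(unique)

print(len(unique))                              # 3
print(round(outlier_percentage(510, 5393), 2))  # 9.46
```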

Download raw algorithm results (45.6 MB). Download raw algorithm evaluation table (75.1 kB).

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures in the original publication.

Algorithm      k    P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. Max-F1  ROC AUC
KNN            99   0.47255   0.41746   0.50974   0.45854   0.50594   0.45434      0.88387
KNN            100  0.47255   0.41746   0.50991   0.45872   0.50601   0.45441      0.88392
KNNW           59   0.44314   0.38498   0.47188   0.41673   0.45966   0.40322      0.85745
KNNW           75   0.44706   0.38931   0.48240   0.42834   0.45741   0.40074      0.86797
KNNW           100  0.43529   0.37631   0.49202   0.43897   0.45307   0.39595      0.87602
LOF            99   0.47255   0.41746   0.43494   0.37592   0.47676   0.42211      0.79429
LOF            100  0.47451   0.41963   0.43479   0.37576   0.47732   0.42273      0.79449
SimplifiedLOF  58   0.46275   0.40663   0.41603   0.35503   0.46391   0.40792      0.77314
SimplifiedLOF  88   0.45294   0.39580   0.43433   0.37525   0.47285   0.41779      0.78306
SimplifiedLOF  99   0.45490   0.39797   0.43627   0.37739   0.47685   0.42221      0.78223
LoOP           78   0.43725   0.37848   0.39330   0.32993   0.44068   0.38226      0.77006
LoOP           96   0.43333   0.37415   0.40396   0.34171   0.44324   0.38509      0.77258
LoOP           99   0.43137   0.37198   0.40472   0.34254   0.44244   0.38420      0.77409
LDOF           94   0.46863   0.41313   0.44480   0.38681   0.47476   0.41990      0.85946
LDOF           100  0.46667   0.41096   0.45012   0.39269   0.47686   0.42222      0.86263
ODIN           89   0.44695   0.38919   0.37417   0.30881   0.45464   0.39768      0.76272
ODIN           97   0.45190   0.39465   0.38286   0.31841   0.45247   0.39529      0.76347
ODIN           100  0.44734   0.38962   0.38685   0.32281   0.45369   0.39663      0.76441
FastABOD       26   0.33333   0.26370   0.34121   0.27241   0.36010   0.29327      0.69240
FastABOD       27   0.33725   0.26804   0.34116   0.27235   0.36132   0.29462      0.69244
FastABOD       28   0.33529   0.26587   0.34112   0.27231   0.36149   0.29480      0.69250
FastABOD       40   0.33333   0.26370   0.34053   0.27165   0.36570   0.29945      0.69210
KDEOS          92   0.19804   0.11428   0.17942   0.09372   0.27904   0.20374      0.69111
KDEOS          94   0.20588   0.12294   0.17916   0.09342   0.28175   0.20673      0.69165
KDEOS          97   0.19216   0.10778   0.17881   0.09305   0.28384   0.20904      0.69277
KDEOS          98   0.19804   0.11428   0.17877   0.09300   0.28234   0.20739      0.69325
LDF            58   0.46863   0.41313   0.46427   0.40832   0.47276   0.41770      0.77672
LDF            69   0.47451   0.41963   0.45998   0.40357   0.47498   0.42014      0.77759
LDF            100  0.47255   0.41746   0.45456   0.39759   0.48131   0.42714      0.80309
INFLO          94   0.44118   0.38281   0.38868   0.32483   0.44534   0.38741      0.75200
INFLO          100  0.43529   0.37631   0.39101   0.32741   0.44068   0.38226      0.75656
COF            73   0.43333   0.37415   0.40999   0.34837   0.43933   0.38077      0.73823
COF            76   0.43529   0.37631   0.41137   0.34990   0.44093   0.38254      0.73660
COF            77   0.43529   0.37631   0.41317   0.35187   0.44397   0.38590      0.73730
COF            82   0.43529   0.37631   0.41422   0.35303   0.43797   0.37927      0.73652
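The adjusted columns in the table above appear to follow the usual adjustment for chance, (value - |O|/N) / (1 - |O|/N), where |O| is the number of outliers and N the number of objects. The sketch below applies that formula to the KNN row with k = 99 and compares the result with the tabulated adjusted values. The function name `adjust_for_chance` is illustrative, and the tolerance accounts for the five-decimal rounding of the table entries.

```python
def adjust_for_chance(value, n_outliers, n_objects):
    """Rescale a measure so that the chance level maps to 0 and a perfect score to 1.

    The chance level is taken to be the outlier rate |O|/N.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


N_OBJECTS, N_OUTLIERS = 5393, 510

# Values copied from the KNN row with k = 99: (raw value, tabulated adjusted value).
knn_k99 = {
    "P@n": (0.47255, 0.41746),
    "AP": (0.50974, 0.45854),
    "Max-F1": (0.50594, 0.45434),
}

for measure, (raw, tabulated) in knn_k99.items():
    adjusted = adjust_for_chance(raw, N_OUTLIERS, N_OBJECTS)
    agrees = abs(adjusted - tabulated) < 1e-5
    print(f"{measure}: adjusted={adjusted:.5f} tabulated={tabulated:.5f} agrees={agrees}")
```

A score equal to the outlier rate maps to 0 and a score of 1 stays at 1, so adjusted values can be compared across data sets with different outlier proportions.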

Plots

[Plots not reproduced here: Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, without duplicates

This version contains 10 attributes, 5393 objects, 510 outliers (9.46%)

Download raw algorithm results (46.9 MB). Download raw algorithm evaluation table (75.2 kB).

Best Parameters

The following table contains the best results (overall and per method) for each method and evaluation measure. When the same score was achieved for more than one value of k, only the smallest k is given.
The Maximum F1-Measure is reported in addition to the measures in the original publication.

Algorithm      k    P@n       Adj. P@n  AP        Adj. AP   Max-F1    Adj. Max-F1  ROC AUC
KNN            4    0.20588   0.12294   0.21554   0.13361   0.21644   0.13460      0.55609
KNN            27   0.21373   0.13160   0.21458   0.13255   0.24636   0.16765      0.59270
KNN            31   0.23922   0.15976   0.21421   0.13214   0.24548   0.16668      0.59341
KNN            64   0.22549   0.14460   0.20753   0.12476   0.23679   0.15708      0.59608
KNNW           16   0.21961   0.13810   0.21408   0.13199   0.22378   0.14270      0.56913
KNNW           37   0.20784   0.12511   0.21467   0.13264   0.24185   0.16266      0.58572
KNNW           50   0.20196   0.11861   0.21405   0.13197   0.24592   0.16716      0.58906
KNNW           100  0.20000   0.11644   0.20903   0.12642   0.24022   0.16086      0.59328
LOF            100  0.55490   0.50841   0.57464   0.53021   0.56410   0.51858      0.90368
SimplifiedLOF  95   0.54510   0.49759   0.53491   0.48633   0.54691   0.49958      0.86643
SimplifiedLOF  99   0.54314   0.49542   0.53717   0.48883   0.55000   0.50300      0.87074
SimplifiedLOF  100  0.54314   0.49542   0.53809   0.48985   0.54934   0.50227      0.87188
LoOP           91   0.50196   0.44994   0.50831   0.45696   0.50669   0.45517      0.84183
LoOP           99   0.49804   0.44561   0.51859   0.46831   0.51316   0.46231      0.85150
LoOP           100  0.49608   0.44345   0.51979   0.46963   0.51209   0.46113      0.85288
LDOF           97   0.45490   0.39797   0.48061   0.42636   0.47081   0.41554      0.83705
LDOF           100  0.46275   0.40663   0.48355   0.42961   0.46887   0.41340      0.84114
ODIN           99   0.43062   0.37115   0.39948   0.33676   0.43259   0.37333      0.83636
ODIN           100  0.43024   0.37073   0.40129   0.33876   0.43090   0.37146      0.83729
FastABOD       3    0.17647   0.09046   0.17999   0.09435   0.18547   0.10040      0.45964
FastABOD       4    0.17059   0.08396   0.18009   0.09446   0.18598   0.10096      0.45716
FastABOD       12   0.16471   0.07746   0.18065   0.09507   0.18373   0.09848      0.45748
KDEOS          97   0.14510   0.05581   0.13982   0.04998   0.25129   0.17310      0.67102
KDEOS          99   0.14314   0.05364   0.14056   0.05080   0.25248   0.17441      0.67258
KDEOS          100  0.14118   0.05148   0.14115   0.05145   0.25205   0.17393      0.67392
LDF            81   0.58824   0.54523   0.59752   0.55549   0.62138   0.58183      0.90477
LDF            87   0.60980   0.56905   0.60002   0.55825   0.62534   0.58621      0.90474
LDF            100  0.63137   0.59287   0.59913   0.55726   0.63541   0.59733      0.90219
INFLO          98   0.51961   0.46943   0.51605   0.46551   0.52805   0.47876      0.83277
INFLO          99   0.51765   0.46727   0.51689   0.46643   0.53025   0.48119      0.83274
INFLO          100  0.52157   0.47160   0.51734   0.46693   0.52802   0.47872      0.83250
COF            100  0.52549   0.47593   0.51595   0.46539   0.53044   0.48140      0.79838
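The way rows end up in a table like the one above (one row per k that is best for at least one measure, smallest k on ties) can be sketched as follows. The record layout and the function name `best_k` are assumptions made for this example and do not describe the format of the downloadable evaluation table. The FastABOD values are copied from the table above; the tie example uses made-up scores.

```python
def best_k(rows, measure):
    """Return the smallest k that reaches the maximum score for the given measure."""
    best_score = max(row[measure] for row in rows)
    return min(row["k"] for row in rows if row[measure] == best_score)


# FastABOD rows from the table above (unadjusted measures only).
fastabod = [
    {"k": 3, "P@n": 0.17647, "AP": 0.17999, "Max-F1": 0.18547, "ROC AUC": 0.45964},
    {"k": 4, "P@n": 0.17059, "AP": 0.18009, "Max-F1": 0.18598, "ROC AUC": 0.45716},
    {"k": 12, "P@n": 0.16471, "AP": 0.18065, "Max-F1": 0.18373, "ROC AUC": 0.45748},
]

for measure in ("P@n", "AP", "Max-F1", "ROC AUC"):
    print(measure, best_k(fastabod, measure))
# P@n 3
# AP 12
# Max-F1 4
# ROC AUC 3

# Hypothetical scores with a tie between k = 5 and k = 3: the smaller k is reported.
tied = [{"k": 5, "score": 0.9}, {"k": 3, "score": 0.9}, {"k": 7, "score": 0.8}]
print(best_k(tied, "score"))  # 3
```

Each of the three FastABOD rows is the best choice for at least one measure, which is why all three appear in the table.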

Plots

[Plots not reproduced here: Precision at n, Adjusted precision at n, Average precision, Adjusted average precision, Maximum F1 score, Adjusted maximum F1 score, ROC AUC, Diversity]
Legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO