Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

PageBlocks (5% outliers, version #10)

The data set contains information about different types of blocks in document pages. Distinguishing these block types is an essential step in document analysis, namely for separating text from pictures or graphics. Blocks whose content is text are labeled here as inliers; all other blocks are labeled as outliers.

Download all data set variants used (14.6 MB). You can also access the original data (page-blocks.data.Z).

Normalized, without duplicates

This version contains 10 attributes, 5139 objects, 256 outliers (4.98%)
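The percentage quoted above is the outlier count divided by the object count. A minimal sketch of that arithmetic follows; the function name `outlier_percentage` is an illustrative assumption and not part of the published material.

```python
def outlier_percentage(n_outliers: int, n_objects: int) -> float:
    """Share of outliers among all objects, in percent, rounded to two decimals."""
    return round(100.0 * n_outliers / n_objects, 2)


# Variants without duplicates: 256 outliers among 5139 objects.
print(outlier_percentage(256, 5139))  # 4.98
# Variants with duplicates: 258 outliers among 5171 objects.
print(outlier_percentage(258, 5171))  # 4.99
```

The same ratio, kept as a fraction rather than a percentage, is also the precision a random ranking would be expected to reach.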

Download raw algorithm results (43.5 MB). Download raw algorithm evaluation table (71.5 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when several values of k achieved the same score, only the smallest k is given).
The Maximum F1-Measure is provided in addition to the measures in the original publication.
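The "Adj." columns of the table appear to follow the usual adjustment for chance, (X - E) / (1 - E), where E is the value expected from a random ranking. The sketch below assumes E = |O| / N (256 outliers among 5139 objects for this variant) and applies the same formula to P@n, AP and Max-F1; the function name `adjust_for_chance` is hypothetical. The raw and adjusted values are copied from the KNN, k = 7 row.

```python
def adjust_for_chance(value: float, n_outliers: int, n_objects: int) -> float:
    """Rescale a measure so that a random ranking scores 0 and a perfect one scores 1.

    Assumption: the expected value under a random ranking is n_outliers / n_objects.
    """
    expected = n_outliers / n_objects
    return (value - expected) / (1.0 - expected)


# (raw, reported adjusted) pairs copied from the KNN, k = 7 row.
reported = {
    "P@n": (0.49219, 0.46556),
    "AP": (0.46048, 0.43219),
    "Max-F1": (0.49906, 0.47280),
}

for name, (raw, adjusted) in reported.items():
    recomputed = adjust_for_chance(raw, n_outliers=256, n_objects=5139)
    print(f"{name}: recomputed {recomputed:.5f}, reported {adjusted:.5f}")
```

Small differences in the last digit are to be expected, because the raw values in the table are themselves rounded to five decimals.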

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 7 0.49219 0.46556 0.46048 0.43219 0.49906 0.47280 0.91578
KNN 41 0.46094 0.43268 0.50172 0.47560 0.49481 0.46832 0.92314
KNN 60 0.48047 0.45323 0.50303 0.47697 0.50691 0.48106 0.92111
KNNW 14 0.49609 0.46968 0.46386 0.43576 0.49609 0.46968 0.91498
KNNW 20 0.48828 0.46145 0.47578 0.44830 0.50734 0.48151 0.91921
KNNW 60 0.46094 0.43268 0.49790 0.47157 0.47463 0.44708 0.92415
KNNW 72 0.46484 0.43679 0.50067 0.47449 0.48992 0.46318 0.92387
LOF 53 0.46094 0.43268 0.41691 0.38634 0.47228 0.44461 0.88106
LOF 56 0.46875 0.44090 0.41589 0.38527 0.47419 0.44662 0.88596
LOF 100 0.44531 0.41623 0.41639 0.38579 0.45339 0.42473 0.93075
SimplifiedLOF 57 0.47266 0.44501 0.42516 0.39502 0.48133 0.45414 0.83757
SimplifiedLOF 84 0.47656 0.44912 0.41327 0.38251 0.47656 0.44912 0.86262
SimplifiedLOF 100 0.47656 0.44912 0.41453 0.38383 0.47656 0.44912 0.88221
LoOP 99 0.46094 0.43268 0.39522 0.36351 0.46743 0.43951 0.86166
LoOP 100 0.46484 0.43679 0.39440 0.36265 0.46923 0.44140 0.86245
LDOF 97 0.46875 0.44090 0.44490 0.41580 0.47558 0.44809 0.90542
LDOF 100 0.46875 0.44090 0.44701 0.41801 0.47347 0.44587 0.90706
ODIN 99 0.44252 0.41330 0.36017 0.32663 0.45208 0.42335 0.81418
ODIN 100 0.44336 0.41418 0.35968 0.32611 0.45091 0.42212 0.81513
FastABOD 30 0.42969 0.39979 0.39752 0.36594 0.43697 0.40746 0.81359
FastABOD 79 0.44922 0.42034 0.40119 0.36980 0.45010 0.42127 0.81164
FastABOD 99 0.44531 0.41623 0.40218 0.37084 0.45059 0.42179 0.81106
KDEOS 98 0.11719 0.07090 0.11961 0.07345 0.23263 0.19240 0.75248
KDEOS 100 0.11719 0.07090 0.12042 0.07431 0.23591 0.19585 0.75358
LDF 99 0.48828 0.46145 0.47912 0.45182 0.55908 0.53596 0.94275
LDF 100 0.48438 0.45734 0.48023 0.45299 0.56023 0.53718 0.94284
INFLO 56 0.45312 0.42445 0.38123 0.34878 0.45738 0.42893 0.78203
INFLO 75 0.46875 0.44090 0.37068 0.33769 0.47244 0.44478 0.78334
INFLO 96 0.45312 0.42445 0.36963 0.33658 0.46000 0.43169 0.79750
COF 69 0.46094 0.43268 0.43462 0.40498 0.46667 0.43871 0.78950
COF 70 0.46484 0.43679 0.43453 0.40489 0.46667 0.43871 0.79041
COF 98 0.43750 0.40801 0.42778 0.39778 0.47807 0.45071 0.81701
COF 100 0.43750 0.40801 0.42896 0.39902 0.47768 0.45029 0.82088
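The selection rule behind the table (best score per method and measure, smallest k on ties) can be sketched as follows. This is only an illustration on a few rows copied from the table above, not the authors' tooling; `parse_row` and `best_k` are hypothetical helper names, and the whitespace-separated row layout is assumed to match the rows as printed here.

```python
MEASURES = ["P@n", "Adj. P@n", "AP", "Adj. AP", "Max-F1", "Adj. MF1", "ROC AUC"]


def parse_row(line: str) -> dict:
    """Split one whitespace-separated table row into a dictionary."""
    parts = line.split()
    row = {"algorithm": parts[0], "k": int(parts[1])}
    row.update(zip(MEASURES, map(float, parts[2:])))
    return row


def best_k(rows: list, algorithm: str, measure: str) -> int:
    """Return the k with the highest score; ties go to the smallest k."""
    candidates = [r for r in rows if r["algorithm"] == algorithm]
    best = max(candidates, key=lambda r: (r[measure], -r["k"]))
    return best["k"]


# Rows copied from the table above (KNN and SimplifiedLOF only).
TABLE = """\
KNN 7 0.49219 0.46556 0.46048 0.43219 0.49906 0.47280 0.91578
KNN 41 0.46094 0.43268 0.50172 0.47560 0.49481 0.46832 0.92314
KNN 60 0.48047 0.45323 0.50303 0.47697 0.50691 0.48106 0.92111
SimplifiedLOF 57 0.47266 0.44501 0.42516 0.39502 0.48133 0.45414 0.83757
SimplifiedLOF 84 0.47656 0.44912 0.41327 0.38251 0.47656 0.44912 0.86262
SimplifiedLOF 100 0.47656 0.44912 0.41453 0.38383 0.47656 0.44912 0.88221
"""

rows = [parse_row(line) for line in TABLE.splitlines()]

print(best_k(rows, "KNN", "P@n"))            # 7
print(best_k(rows, "KNN", "ROC AUC"))        # 41
print(best_k(rows, "SimplifiedLOF", "P@n"))  # 84 (tie with k = 100, smallest k wins)
```

Applied to the full raw evaluation table rather than to these excerpted rows, the same rule would be run over every tested k.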

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity
A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF
G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO
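The measures named in the plot titles above are all computed from a ranking of objects by outlier score. The following is a minimal sketch of their common textbook definitions, not the implementation used for the published results; the function names and the toy scores and labels are hypothetical, and score ties are not treated specially.

```python
def rank_labels(scores, labels):
    """Return labels (1 = outlier, 0 = inlier) ordered by decreasing score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def precision_at_n(ranked, n):
    """Fraction of true outliers among the top n ranked objects."""
    return sum(ranked[:n]) / n


def average_precision(ranked):
    """Mean of the precision values taken at the rank of each true outlier."""
    hits, total = 0, 0.0
    for rank, is_outlier in enumerate(ranked, start=1):
        if is_outlier:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def max_f1(ranked):
    """Largest F1 score over all possible cut-off ranks."""
    n_outliers = sum(ranked)
    best, hits = 0.0, 0
    for rank, is_outlier in enumerate(ranked, start=1):
        hits += is_outlier
        if hits:
            precision, recall = hits / rank, hits / n_outliers
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


def roc_auc(ranked):
    """Fraction of (outlier, inlier) pairs in which the outlier is ranked first."""
    n_outliers = sum(ranked)
    n_inliers = len(ranked) - n_outliers
    outliers_seen = correct_pairs = 0
    for is_outlier in ranked:
        if is_outlier:
            outliers_seen += 1
        else:
            correct_pairs += outliers_seen
    return correct_pairs / (n_outliers * n_inliers)


# Hypothetical toy data: six objects, two of them outliers.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 0, 0]
ranked = rank_labels(scores, labels)

print(f"P@n     {precision_at_n(ranked, n=sum(labels)):.4f}")  # 0.5000
print(f"AP      {average_precision(ranked):.4f}")              # 0.8333
print(f"Max-F1  {max_f1(ranked):.4f}")                         # 0.8000
print(f"ROC AUC {roc_auc(ranked):.4f}")                        # 0.8750
```

Here n is set to the number of true outliers, which is consistent with the P@n values in the tables being multiples of 1/256 or 1/258 up to rounding.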

Normalized, with duplicates

This version contains 10 attributes, 5171 objects, 258 outliers (4.99%)

Download raw algorithm results (43.6 MB). Download raw algorithm evaluation table (72.3 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when several values of k achieved the same score, only the smallest k is given).
The Maximum F1-Measure is provided in addition to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 8 0.46512 0.43703 0.42526 0.39508 0.47217 0.44445 0.83422
KNN 82 0.43023 0.40031 0.46918 0.44131 0.47896 0.45160 0.91480
KNN 89 0.43411 0.40439 0.46906 0.44118 0.48387 0.45677 0.91483
KNN 100 0.45736 0.42887 0.46899 0.44111 0.49635 0.46990 0.91410
KNNW 13 0.46899 0.44111 0.43779 0.40827 0.48780 0.46091 0.83196
KNNW 15 0.47287 0.44519 0.44055 0.41117 0.48671 0.45975 0.83378
KNNW 100 0.42636 0.39623 0.45932 0.43093 0.44498 0.41583 0.91272
LOF 30 0.53101 0.50638 0.46237 0.43414 0.55950 0.53637 0.84356
LOF 100 0.42248 0.39215 0.38291 0.35051 0.43038 0.40047 0.88510
SimplifiedLOF 30 0.50775 0.48190 0.44431 0.41513 0.51565 0.49022 0.84383
SimplifiedLOF 39 0.54651 0.52270 0.47232 0.44461 0.54971 0.52606 0.83371
SimplifiedLOF 41 0.53488 0.51046 0.47524 0.44768 0.53988 0.51571 0.83169
LoOP 36 0.45736 0.42887 0.37993 0.34737 0.48629 0.45931 0.82487
LoOP 38 0.46512 0.43703 0.38609 0.35385 0.47985 0.45254 0.82604
LoOP 43 0.48062 0.45335 0.39762 0.36599 0.48263 0.45546 0.82027
LoOP 47 0.48062 0.45335 0.40164 0.37022 0.48413 0.45704 0.81561
LDOF 36 0.42248 0.39215 0.37033 0.33727 0.43636 0.40676 0.89413
LDOF 100 0.50000 0.47374 0.44717 0.41814 0.50104 0.47483 0.87030
ODIN 71 0.45252 0.42377 0.33855 0.30382 0.45594 0.42737 0.77441
ODIN 96 0.43235 0.40254 0.36033 0.32674 0.45122 0.42240 0.80226
ODIN 100 0.43632 0.40672 0.36019 0.32659 0.45415 0.42548 0.80649
FastABOD 3 0.39922 0.36768 0.33128 0.29617 0.40611 0.37493 0.78295
FastABOD 14 0.42248 0.39215 0.37074 0.33770 0.42248 0.39215 0.75158
FastABOD 70 0.40310 0.37176 0.37195 0.33897 0.42765 0.39759 0.74731
FastABOD 86 0.40310 0.37176 0.37278 0.33985 0.42358 0.39331 0.74625
KDEOS 42 0.12403 0.07803 0.10140 0.05421 0.17726 0.13406 0.70211
KDEOS 100 0.11628 0.06987 0.10968 0.06293 0.21266 0.17131 0.71314
LDF 23 0.51550 0.49006 0.46579 0.43773 0.53306 0.50854 0.84607
LDF 24 0.51938 0.49414 0.47716 0.44970 0.53219 0.50762 0.84340
LDF 28 0.50775 0.48190 0.49155 0.46485 0.52174 0.49662 0.84140
LDF 100 0.41860 0.38807 0.43200 0.40217 0.45775 0.42928 0.93120
INFLO 31 0.47287 0.44519 0.38259 0.35017 0.47451 0.44691 0.77076
INFLO 37 0.49225 0.46558 0.39343 0.36158 0.49520 0.46869 0.75657
INFLO 41 0.47674 0.44927 0.39663 0.36494 0.48447 0.45740 0.75881
COF 31 0.50388 0.47782 0.44204 0.41274 0.50628 0.48035 0.82025
COF 36 0.53488 0.51046 0.47539 0.44784 0.55928 0.53614 0.80632
COF 38 0.52713 0.50230 0.48230 0.45512 0.56103 0.53798 0.80930
COF 39 0.51938 0.49414 0.48572 0.45871 0.56018 0.53708 0.80425

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity
A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF
G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, without duplicates

This version contains 10 attributes, 5139 objects, 256 outliers (4.98%)

Download raw algorithm results (44.6 MB). Download raw algorithm evaluation table (71.0 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when several values of k achieved the same score, only the smallest k is given).
The Maximum F1-Measure is provided in addition to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 1 0.17188 0.12846 0.12893 0.08326 0.17565 0.13243 0.52260
KNN 9 0.14062 0.09557 0.13709 0.09185 0.15005 0.10549 0.57715
KNN 54 0.10938 0.06268 0.12949 0.08386 0.14583 0.10105 0.58782
KNNW 1 0.14844 0.10379 0.12697 0.08120 0.17788 0.13478 0.51515
KNNW 3 0.16016 0.11613 0.13116 0.08561 0.16941 0.12587 0.53225
KNNW 19 0.14062 0.09557 0.13675 0.09149 0.15637 0.11214 0.57499
KNNW 73 0.11328 0.06679 0.13142 0.08588 0.14933 0.10474 0.58478
LOF 90 0.48047 0.45323 0.47416 0.44659 0.49775 0.47142 0.92249
LOF 99 0.49219 0.46556 0.47657 0.44913 0.52504 0.50014 0.92121
LOF 100 0.49609 0.46968 0.47638 0.44893 0.52668 0.50186 0.92087
SimplifiedLOF 97 0.48047 0.45323 0.45162 0.42287 0.49187 0.46523 0.91896
SimplifiedLOF 100 0.48438 0.45734 0.45466 0.42607 0.49004 0.46330 0.92072
LoOP 97 0.44531 0.41623 0.43898 0.40957 0.45106 0.42228 0.90051
LoOP 100 0.44531 0.41623 0.44132 0.41203 0.45396 0.42533 0.90409
LDOF 87 0.39062 0.35868 0.39660 0.36497 0.42553 0.39541 0.87263
LDOF 98 0.41016 0.37923 0.40408 0.37284 0.41913 0.38868 0.88533
LDOF 100 0.41016 0.37923 0.40633 0.37521 0.42254 0.39226 0.88779
ODIN 98 0.39819 0.36664 0.36315 0.32976 0.41772 0.38719 0.87036
ODIN 100 0.39453 0.36279 0.36704 0.33386 0.42217 0.39188 0.87285
FastABOD 3 0.12109 0.07502 0.10831 0.06156 0.13105 0.08550 0.43577
FastABOD 6 0.11719 0.07090 0.10939 0.06270 0.13855 0.09339 0.43466
FastABOD 15 0.11719 0.07090 0.11009 0.06344 0.13333 0.08790 0.43491
FastABOD 100 0.11328 0.06679 0.10960 0.06292 0.13084 0.08527 0.43590
KDEOS 95 0.09766 0.05035 0.09293 0.04537 0.16758 0.12394 0.72982
KDEOS 100 0.08984 0.04213 0.09475 0.04729 0.16958 0.12604 0.73543
LDF 63 0.48438 0.45734 0.47767 0.45029 0.51585 0.49047 0.91908
LDF 72 0.50391 0.47790 0.48801 0.46117 0.53737 0.51311 0.91825
LDF 81 0.50781 0.48201 0.48556 0.45859 0.55536 0.53205 0.91685
LDF 94 0.51953 0.49434 0.47975 0.45247 0.54965 0.52603 0.91268
INFLO 87 0.46094 0.43268 0.41759 0.38705 0.47881 0.45149 0.85430
INFLO 96 0.47266 0.44501 0.42638 0.39630 0.47619 0.44873 0.87029
INFLO 100 0.46875 0.44090 0.43017 0.40029 0.47714 0.44973 0.87604
COF 88 0.48438 0.45734 0.44770 0.41874 0.50000 0.47379 0.84341
COF 90 0.50391 0.47790 0.44952 0.42066 0.51024 0.48457 0.84298
COF 95 0.50391 0.47790 0.45418 0.42557 0.51685 0.49152 0.84126
COF 100 0.50000 0.47379 0.45897 0.43060 0.51246 0.48690 0.84105

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity
A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF
G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, with duplicates

This version contains 10 attributes, 5171 objects, 258 outliers (4.99%)

Download raw algorithm results (44.7 MB). Download raw algorithm evaluation table (71.8 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when several values of k achieved the same score, only the smallest k is given).
The Maximum F1-Measure is provided in addition to the measures in the original publication.

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. MF1 ROC AUC
KNN 3 0.17054 0.12698 0.16461 0.12074 0.18182 0.13885 0.55799
KNN 14 0.18605 0.14330 0.16304 0.11908 0.19928 0.15723 0.57872
KNNW 1 0.16279 0.11883 0.15933 0.11518 0.19417 0.15186 0.52854
KNNW 2 0.17054 0.12698 0.15877 0.11460 0.18994 0.14741 0.53673
KNNW 7 0.15891 0.11475 0.16324 0.11930 0.17955 0.13647 0.55640
KNNW 69 0.13953 0.09435 0.15392 0.10949 0.16733 0.12360 0.57616
LOF 59 0.46512 0.43703 0.47779 0.45036 0.47059 0.44279 0.89459
LOF 79 0.49612 0.46966 0.48362 0.45650 0.51138 0.48572 0.89212
LOF 97 0.50775 0.48190 0.47303 0.44536 0.54930 0.52563 0.88867
LOF 100 0.51550 0.49006 0.47144 0.44369 0.54804 0.52431 0.88842
SimplifiedLOF 83 0.48837 0.46150 0.47699 0.44952 0.49558 0.46909 0.90053
SimplifiedLOF 96 0.49612 0.46966 0.47752 0.45008 0.51254 0.48695 0.89954
SimplifiedLOF 98 0.50388 0.47782 0.47708 0.44962 0.51916 0.49391 0.89942
SimplifiedLOF 100 0.50388 0.47782 0.47627 0.44876 0.52373 0.49871 0.89886
LoOP 97 0.47287 0.44519 0.48258 0.45540 0.48429 0.45721 0.89355
LoOP 99 0.46899 0.44111 0.48382 0.45671 0.48799 0.46110 0.89451
LoOP 100 0.47287 0.44519 0.48398 0.45688 0.49071 0.46396 0.89451
LDOF 99 0.46124 0.43295 0.46021 0.43186 0.47084 0.44305 0.90566
LDOF 100 0.46124 0.43295 0.46102 0.43272 0.47495 0.44737 0.90635
ODIN 99 0.40891 0.37787 0.38508 0.35279 0.41683 0.38621 0.87928
ODIN 100 0.41113 0.38021 0.38754 0.35537 0.41449 0.38374 0.88055
FastABOD 3 0.15504 0.11067 0.13879 0.09356 0.20772 0.16611 0.48087
KDEOS 82 0.08915 0.04132 0.08524 0.03720 0.15780 0.11357 0.70543
KDEOS 100 0.08527 0.03724 0.09268 0.04503 0.17472 0.13138 0.72460
LDF 44 0.45349 0.42479 0.48323 0.45609 0.48841 0.46155 0.89531
LDF 61 0.51163 0.48598 0.48637 0.45940 0.52980 0.50511 0.88414
LDF 62 0.51550 0.49006 0.48612 0.45913 0.53223 0.50767 0.88308
LDF 70 0.49612 0.46966 0.47584 0.44832 0.54110 0.51700 0.87577
INFLO 89 0.48837 0.46150 0.46227 0.43404 0.49485 0.46832 0.86591
INFLO 95 0.48450 0.45743 0.46244 0.43421 0.50000 0.47374 0.86275
INFLO 100 0.49612 0.46966 0.45933 0.43094 0.51761 0.49227 0.85791
COF 75 0.47287 0.44519 0.45744 0.42894 0.49069 0.46395 0.83396
COF 97 0.51550 0.49006 0.46694 0.43895 0.52261 0.49754 0.81244
COF 100 0.51550 0.49006 0.46701 0.43902 0.53125 0.50663 0.80972

Plots

Precision at n
Adjusted precision at n
Average precision
Adjusted average precision
Maximum F1 score
Adjusted maximum F1 score
ROC AUC
Diversity
A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF
G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO