Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

InternetAds (2% of outliers, version #05)

The data set consists of images from web pages, classified as ads or not. The goal is to learn to remove ads automatically from web pages while retaining regular images. Ads are considered outliers.

Download all data set variants used (6.0 MB). You can also access the original data (ad.data).

Normalized, without duplicates

This version contains 1555 attributes, 1630 objects, and 32 outliers (1.96%).

Download raw algorithm results (10.1 MB). Download raw algorithm evaluation table (57.1 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The maximum F1 measure is reported in addition to the measures in the original publication.
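The "Adj." columns can be read as chance-corrected versions of the raw measures. The following is a minimal sketch, assuming the adjustment has the form (m - |O|/N) / (1 - |O|/N), where |O|/N is the proportion of outliers, taken here as the score expected from a random ranking. The function name adjust_for_chance is illustrative and not part of the published tooling. The inputs are the P@n and AP values of the KNN, k = 1 row below.

```python
def adjust_for_chance(score, n_outliers, n_objects):
    """Rescale a measure so that the expected score of a random ranking maps to 0
    and a perfect score maps to 1 (assumed form of the adjustment)."""
    expected = n_outliers / n_objects  # outlier proportion |O|/N
    return (score - expected) / (1.0 - expected)


# Values from the KNN, k = 1 row (this version: 1630 objects, 32 outliers).
adj_p_at_n = adjust_for_chance(0.48958, 32, 1630)
adj_ap = adjust_for_chance(0.46724, 32, 1630)

print(round(adj_p_at_n, 5), round(adj_ap, 5))
```

Under this assumption the two results agree with the tabulated Adj. P@n (0.47936) and Adj. AP (0.45657) for that row to five decimal places.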

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. Max-F1 ROC AUC
KNN 1 0.48958 0.47936 0.46724 0.45657 0.53571 0.52642 0.85557
KNN 2 0.50000 0.48999 0.47190 0.46132 0.56140 0.55262 0.83652
KNN 4 0.53125 0.52186 0.44053 0.42933 0.58824 0.57999 0.82517
KNNW 2 0.46875 0.45811 0.48673 0.47645 0.53571 0.52642 0.85974
KNNW 5 0.53125 0.52186 0.51653 0.50685 0.59259 0.58443 0.84883
KNNW 9 0.53125 0.52186 0.51047 0.50066 0.60714 0.59928 0.83048
LOF 10 0.37500 0.36248 0.36769 0.35503 0.41379 0.40205 0.86440
LOF 14 0.43750 0.42624 0.36934 0.35671 0.45161 0.44063 0.86293
LOF 21 0.43750 0.42624 0.39761 0.38555 0.50980 0.49999 0.84776
LOF 99 0.43750 0.42624 0.40254 0.39058 0.49057 0.48036 0.77707
SimplifiedLOF 14 0.43750 0.42624 0.41290 0.40115 0.47458 0.46405 0.87496
SimplifiedLOF 21 0.43750 0.42624 0.45494 0.44403 0.50000 0.48999 0.86166
SimplifiedLOF 53 0.43750 0.42624 0.40910 0.39727 0.53061 0.52121 0.81004
LoOP 11 0.53125 0.52186 0.40670 0.39482 0.56667 0.55799 0.88720
LoOP 15 0.56250 0.55374 0.44404 0.43290 0.57143 0.56285 0.88682
LoOP 25 0.50000 0.48999 0.45566 0.44476 0.51515 0.50544 0.87485
LDOF 25 0.40625 0.39436 0.39746 0.38539 0.45070 0.43970 0.87158
LDOF 45 0.46875 0.45811 0.39812 0.38607 0.49057 0.48036 0.84265
LDOF 54 0.46875 0.45811 0.43799 0.42674 0.48889 0.47865 0.83188
LDOF 59 0.43750 0.42624 0.43795 0.42669 0.52174 0.51216 0.82830
ODIN 50 0.28125 0.26686 0.18812 0.17186 0.30769 0.29383 0.84007
ODIN 97 0.34375 0.33061 0.22501 0.20949 0.34375 0.33061 0.83072
ODIN 100 0.34375 0.33061 0.22511 0.20959 0.34375 0.33061 0.83087
FastABOD 9 0.40625 0.39436 0.38331 0.37096 0.41975 0.40813 0.85299
FastABOD 12 0.37500 0.36248 0.38817 0.37592 0.43836 0.42711 0.85785
FastABOD 15 0.37500 0.36248 0.39731 0.38524 0.43836 0.42711 0.85482
FastABOD 18 0.40625 0.39436 0.36664 0.35396 0.44444 0.43332 0.82683
KDEOS 8 0.21875 0.20311 0.15365 0.13671 0.26506 0.25034 0.75542
KDEOS 18 0.15625 0.13935 0.09491 0.07679 0.18182 0.16543 0.79126
LDF 4 0.05000 0.03098 0.04166 0.02247 0.09223 0.07405 0.67395
LDF 6 0.04147 0.02228 0.03927 0.02003 0.07988 0.06146 0.74330
LDF 100 0.00000 -0.02003 0.03843 0.01918 0.12448 0.10695 0.66595
INFLO 14 0.43750 0.42624 0.39822 0.38617 0.44776 0.43670 0.86527
INFLO 21 0.43750 0.42624 0.44150 0.43031 0.49123 0.48104 0.85588
INFLO 54 0.43750 0.42624 0.42530 0.41379 0.53061 0.52121 0.81983
COF 4 0.06250 0.04373 0.05808 0.03922 0.13665 0.11936 0.67130
COF 5 0.15625 0.13935 0.07153 0.05293 0.16949 0.15286 0.65595
COF 6 0.15625 0.13935 0.07602 0.05752 0.19718 0.18111 0.57548
COF 13 0.15625 0.13935 0.10768 0.08981 0.17391 0.15737 0.49536

Plots

[Plots not reproduced here: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Method legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Normalized, with duplicates

This version contains 1555 attributes, 2867 objects, and 57 outliers (1.99%).

Download raw algorithm results (12.0 MB). Download raw algorithm evaluation table (65.6 kB).

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The maximum F1 measure is reported in addition to the measures in the original publication.
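The selection rule described above (highest score per measure, smallest k on ties) can be sketched as follows. This is an illustration only: the input is an excerpt of the four LDOF rows from the table below rather than the raw evaluation table, whose file format is not described here, and the helper name best_k is hypothetical.

```python
# Excerpt of the LDOF rows from the table below: (k, {measure: score}).
rows = [
    (12, {"P@n": 0.12281, "AP": 0.10118, "Max-F1": 0.17949, "ROC AUC": 0.78846}),
    (73, {"P@n": 0.15789, "AP": 0.10179, "Max-F1": 0.18182, "ROC AUC": 0.77515}),
    (74, {"P@n": 0.15789, "AP": 0.10381, "Max-F1": 0.18440, "ROC AUC": 0.77813}),
    (75, {"P@n": 0.15789, "AP": 0.10512, "Max-F1": 0.18440, "ROC AUC": 0.78051}),
]


def best_k(rows, measure):
    """Return the k with the highest score for `measure`; ties go to the smallest k."""
    k, _scores = min(rows, key=lambda row: (-row[1][measure], row[0]))
    return k


best = {m: best_k(rows, m) for m in ("P@n", "AP", "Max-F1", "ROC AUC")}
print(best)
```

On this excerpt each of the four rows is selected for at least one measure, which is consistent with all four appearing in the table.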

Algorithm k P@n Adj. P@n AP Adj. AP Max-F1 Adj. Max-F1 ROC AUC
KNN 2 0.50526 0.49523 0.49657 0.48636 0.54902 0.53987 0.88712
KNN 3 0.50600 0.49598 0.49394 0.48367 0.54545 0.53623 0.87647
KNN 4 0.49954 0.48939 0.50634 0.49632 0.58947 0.58115 0.87634
KNNW 3 0.44737 0.43616 0.45711 0.44610 0.49600 0.48578 0.88437
KNNW 4 0.49708 0.48687 0.49032 0.47998 0.52991 0.52038 0.88390
KNNW 6 0.49123 0.48091 0.51642 0.50661 0.56000 0.55107 0.88114
KNNW 10 0.49123 0.48091 0.50866 0.49869 0.58427 0.57584 0.87057
LOF 7 0.05057 0.03131 0.04660 0.02726 0.10842 0.09033 0.74248
LOF 10 0.04292 0.02350 0.04538 0.02602 0.10250 0.08430 0.78283
SimplifiedLOF 8 0.04658 0.02724 0.04275 0.02333 0.09058 0.07213 0.73342
SimplifiedLOF 10 0.04054 0.02108 0.04123 0.02178 0.09272 0.07431 0.75328
LoOP 2 0.22222 0.20645 0.13730 0.11980 0.24096 0.22557 0.73020
LoOP 72 0.12281 0.10501 0.09104 0.07260 0.18677 0.17027 0.79678
LDOF 12 0.12281 0.10501 0.10118 0.08295 0.17949 0.16284 0.78846
LDOF 73 0.15789 0.14081 0.10179 0.08357 0.18182 0.16522 0.77515
LDOF 74 0.15789 0.14081 0.10381 0.08563 0.18440 0.16785 0.77813
LDOF 75 0.15789 0.14081 0.10512 0.08697 0.18440 0.16785 0.78051
ODIN 16 0.19444 0.17810 0.14132 0.12391 0.26515 0.25025 0.85974
ODIN 91 0.35234 0.33920 0.25117 0.23598 0.42105 0.40931 0.85416
ODIN 100 0.35234 0.33920 0.25217 0.23700 0.42105 0.40931 0.85298
FastABOD 28 0.03597 0.01642 0.06004 0.04097 0.15244 0.13525 0.78021
FastABOD 77 0.03509 0.01551 0.07021 0.05135 0.18677 0.17027 0.78949
FastABOD 89 0.03509 0.01551 0.07140 0.05257 0.20077 0.18456 0.78714
FastABOD 98 0.03509 0.01551 0.07170 0.05287 0.20077 0.18456 0.78759
KDEOS 2 0.03509 0.01551 0.03470 0.01511 0.09296 0.07457 0.68551
KDEOS 6 0.03333 0.01372 0.03968 0.02020 0.08725 0.06873 0.70079
KDEOS 11 0.03333 0.01372 0.03943 0.01994 0.07966 0.06100 0.72656
LDF 1 0.04545 0.02609 0.02887 0.00917 0.13065 0.11302 0.44360
LDF 8 0.01923 -0.00066 0.02628 0.00652 0.05239 0.03317 0.59680
INFLO 7 0.05065 0.03140 0.04528 0.02591 0.11127 0.09324 0.73788
INFLO 10 0.04292 0.02350 0.04261 0.02319 0.10399 0.08582 0.75338
COF 75 0.12281 0.10501 0.07309 0.05428 0.20619 0.19008 0.75032
COF 81 0.12281 0.10501 0.07605 0.05731 0.21192 0.19593 0.73826
COF 84 0.14035 0.12291 0.07353 0.05474 0.19608 0.17977 0.74284

Plots

[Plots not reproduced here: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Method legend: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO