Supplementary Material for
On the Evaluation of Unsupervised Outlier Detection: Measures, Datasets, and an Empirical Study
by G. O. Campos, A. Zimek, J. Sander, R. J. G. B. Campello, B. Micenková, E. Schubert, I. Assent and M. E. Houle
Data Mining and Knowledge Discovery 30(4): 891-927, 2016, DOI: 10.1007/s10618-015-0444-8

Waveform (version#10)

This dataset represents three classes of waves. Class 0 was defined here as the outlier class and downsampled to 100 objects. After preprocessing, the dataset has 21 numeric attributes and 3443 instances, divided into 100 outliers (2.9%) and 3343 inliers (97.1%) [1].

References:

[1] A. Zimek, M. Gaudet, R. J. G. B. Campello, and J. Sander. Subsampling for efficient and effective unsupervised outlier detection ensembles. In Proc. KDD, pages 428-436, 2013.

Download all data set variants used (5.1 MB). You can also access the original data (waveform.data.Z).

Normalized, without duplicates

This version contains 21 attributes and 3443 objects, of which 100 are outliers (2.90%).

Download raw algorithm results (30.1 MB)
Download raw algorithm evaluation table (64.9 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The Maximum F1-Measure is reported in addition to the measures in the original publication.
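The tie-breaking rule stated above (report only the smallest k when several values of k reach the same score) can be sketched as follows. This is a minimal illustration, not the code used to produce the table: the function name `best_k_per_method` and the `(algorithm, k, score)` row layout are assumptions, and the sample rows are P@n values copied from the KNN and LOF entries of the table below.

```python
# Hypothetical layout: one (algorithm, k, score) tuple per evaluated setting.
# Scores are P@n values for KNN and LOF taken from the table in this section.
rows = [
    ("KNN", 77, 0.24000),
    ("KNN", 62, 0.24000),
    ("KNN", 94, 0.24000),
    ("LOF", 96, 0.20000),
    ("LOF", 78, 0.20000),
]


def best_k_per_method(rows):
    """Return {algorithm: (k, score)} with the highest score per algorithm.

    When several k reach the same score, the smallest k is kept.
    """
    best = {}
    for algorithm, k, score in rows:
        current = best.get(algorithm)
        if (
            current is None
            or score > current[1]
            or (score == current[1] and k < current[0])
        ):
            best[algorithm] = (k, score)
    return best


result = best_k_per_method(rows)
print(result)
```

The comparison is done per algorithm and per measure, so the same routine would be run once for each column of the table.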

Algorithm        k      P@n  Adj. P@n       AP  Adj. AP   Max-F1  Adj. MF1  ROC AUC
KNN             62  0.24000   0.21727  0.17454  0.14985  0.25137   0.22897  0.77119
KNN             77  0.24000   0.21727  0.18202  0.15755  0.25137   0.22897  0.77551
KNN             94  0.24000   0.21727  0.18482  0.16044  0.25556   0.23329  0.77367
KNN            100  0.24000   0.21727  0.18546  0.16110  0.24865   0.22617  0.77347
KNNW            47  0.23000   0.20697  0.15331  0.12798  0.23585   0.21299  0.76025
KNNW           100  0.23000   0.20697  0.17002  0.14519  0.24339   0.22075  0.76786
LOF             78  0.20000   0.17607  0.13005  0.10403  0.21395   0.19044  0.75469
LOF             96  0.20000   0.17607  0.13881  0.11304  0.22222   0.19896  0.75600
LOF            100  0.20000   0.17607  0.14052  0.11481  0.22472   0.20153  0.75542
SimplifiedLOF  100  0.17000   0.14517  0.10626  0.07952  0.18797   0.16368  0.72948
LoOP            98  0.15000   0.12457  0.09690  0.06988  0.16997   0.14514  0.72059
LoOP           100  0.15000   0.12457  0.09850  0.07153  0.17530   0.15063  0.72368
LDOF            41  0.06000   0.03188  0.05167  0.02330  0.10833   0.08166  0.67135
LDOF            62  0.09000   0.06278  0.05167  0.02330  0.09903   0.07208  0.67448
LDOF           100  0.09000   0.06278  0.05419  0.02590  0.10172   0.07485  0.68824
ODIN            88  0.08000   0.05248  0.05346  0.02514  0.10390   0.07709  0.69139
ODIN           100  0.04625   0.01772  0.05565  0.02740  0.11018   0.08357  0.69678
FastABOD        10  0.04000   0.01128  0.05078  0.02239  0.12283   0.09660  0.66448
FastABOD        18  0.07000   0.04218  0.05102  0.02263  0.10435   0.07756  0.66721
FastABOD        24  0.04000   0.01128  0.05145  0.02308  0.10313   0.07630  0.66875
FastABOD        40  0.05000   0.02158  0.05041  0.02201  0.10832   0.08165  0.67309
KDEOS            5  0.07000   0.04218  0.03543  0.00658  0.07955   0.05201  0.55424
KDEOS           16  0.03000   0.00098  0.03584  0.00700  0.07496   0.04729  0.57584
KDEOS           99  0.03000   0.00098  0.03443  0.00554  0.07780   0.05021  0.59240
LDF             16  0.26000   0.23786  0.23791  0.21511  0.29487   0.27378  0.78892
LDF             19  0.29000   0.26876  0.25431  0.23200  0.31847   0.29808  0.77864
LDF             34  0.27000   0.24816  0.26579  0.24382  0.31429   0.29377  0.76486
LDF             49  0.27000   0.24816  0.25824  0.23605  0.33333   0.31339  0.77394
INFLO           83  0.13000   0.10398  0.08370  0.05629  0.15385   0.12853  0.70703
INFLO           94  0.12000   0.09368  0.08331  0.05589  0.15873   0.13357  0.70921
INFLO           96  0.13000   0.10398  0.08313  0.05570  0.16043   0.13531  0.70471
INFLO          100  0.13000   0.10398  0.08548  0.05812  0.15888   0.13372  0.70493
COF             77  0.31000   0.28936  0.25442  0.23212  0.31285   0.29229  0.76790
COF             79  0.31000   0.28936  0.25751  0.23530  0.34682   0.32728  0.76803
COF             99  0.30000   0.27906  0.26955  0.24770  0.31250   0.29193  0.77594
COF            100  0.28000   0.25846  0.27072  0.24890  0.31847   0.29808  0.77311
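
The adjusted columns in the table above appear consistent with an adjustment for chance of the form (score - |O|/N) / (1 - |O|/N), where |O| = 100 is the number of outliers and N = 3443 the number of objects, so that a random ranking is expected to score about 0 and a perfect ranking scores 1. The sketch below checks this reading against the KNN row for k = 62; the function name `adjust_for_chance` is an illustrative assumption, and agreement is only expected up to the rounding of the printed scores.

```python
def adjust_for_chance(score, n_outliers, n_objects):
    """Rescale a score so that the chance level |O|/N maps to 0 and 1 stays 1."""
    expected = n_outliers / n_objects
    return (score - expected) / (1.0 - expected)


N_OUTLIERS, N_OBJECTS = 100, 3443

# Raw P@n, AP and Max-F1 for KNN with k = 62, copied from the table above.
for name, raw in (("P@n", 0.24000), ("AP", 0.17454), ("Max-F1", 0.25137)):
    adjusted = adjust_for_chance(raw, N_OUTLIERS, N_OBJECTS)
    print(f"{name}: {raw:.5f} -> {adjusted:.5f}")
```

A score below the chance level |O|/N yields a negative adjusted value under this formula.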

Plots

[Plots not reproduced here: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Method labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO

Not normalized, without duplicates

This version contains 21 attributes and 3443 objects, of which 100 are outliers (2.90%).

Download raw algorithm results (30.2 MB)
Download raw algorithm evaluation table (64.6 kB)

Best Parameters

The following table contains the best (overall and per-method) results for each method and evaluation measure (when the same score was achieved for more than one value of k, only the smallest k is given).
The Maximum F1-Measure is reported in addition to the measures in the original publication.
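As a rough illustration of the Maximum F1-Measure mentioned above, the sketch below assumes it is the largest F1 score obtained over all cut-off positions of an outlier ranking. The function name `max_f1` and the toy ranking are hypothetical, and tied outlier scores are not handled here, so this is not necessarily how the reported values were computed.

```python
def max_f1(labels_in_rank_order):
    """Largest F1 over all cut-offs of a ranking.

    labels_in_rank_order: 1 for outlier, 0 for inlier, most outlying first.
    Ties between outlier scores are ignored in this sketch.
    """
    total_outliers = sum(labels_in_rank_order)
    if total_outliers == 0:
        return 0.0
    best, hits = 0.0, 0
    for cutoff, label in enumerate(labels_in_rank_order, start=1):
        hits += label
        # With precision = hits / cutoff and recall = hits / total_outliers,
        # the harmonic mean simplifies to 2 * hits / (cutoff + total_outliers).
        f1 = 2 * hits / (cutoff + total_outliers)
        best = max(best, f1)
    return best


# Toy ranking with three outliers among eight objects (made-up data).
toy_ranking = [1, 0, 1, 0, 0, 1, 0, 0]
print(round(max_f1(toy_ranking), 5))
```

Because the cut-off is chosen after the fact, Max-F1 is an optimistic summary of a ranking and is best read alongside P@n and AP.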

Algorithm        k      P@n  Adj. P@n       AP  Adj. AP   Max-F1  Adj. MF1  ROC AUC
KNN             40  0.28000   0.25846  0.22518  0.20200  0.30000   0.27906  0.78886
KNN             70  0.27000   0.24816  0.24480  0.22221  0.30882   0.28815  0.79615
KNN             96  0.28000   0.25846  0.25273  0.23038  0.30657   0.28583  0.80088
KNN             99  0.28000   0.25846  0.25221  0.22984  0.30657   0.28583  0.80150
KNNW            75  0.25000   0.22757  0.21917  0.19582  0.29787   0.27687  0.78852
KNNW            79  0.26000   0.23786  0.22049  0.19717  0.29787   0.27687  0.78904
KNNW           100  0.26000   0.23786  0.22524  0.20206  0.29787   0.27687  0.79179
LOF             99  0.23000   0.20697  0.19985  0.17591  0.27972   0.25817  0.78325
LOF            100  0.24000   0.21727  0.19936  0.17541  0.27972   0.25817  0.78334
SimplifiedLOF   93  0.20000   0.17607  0.12534  0.09918  0.20853   0.18486  0.74552
SimplifiedLOF   98  0.20000   0.17607  0.12861  0.10254  0.21395   0.19044  0.74700
SimplifiedLOF   99  0.20000   0.17607  0.12915  0.10310  0.21395   0.19044  0.74759
SimplifiedLOF  100  0.20000   0.17607  0.13038  0.10437  0.21277   0.18922  0.74757
LoOP            99  0.17000   0.14517  0.11540  0.08894  0.20657   0.18284  0.74019
LoOP           100  0.18000   0.15547  0.11507  0.08860  0.20755   0.18384  0.74037
LDOF            89  0.09000   0.06278  0.05912  0.03097  0.11961   0.09328  0.69906
LDOF            91  0.09000   0.06278  0.05956  0.03143  0.11966   0.09332  0.70122
LDOF           100  0.08000   0.05248  0.06011  0.03200  0.11589   0.08945  0.70553
ODIN            69  0.05250   0.02416  0.04736  0.01887  0.10095   0.07406  0.68151
ODIN           100  0.05000   0.02158  0.05387  0.02556  0.11006   0.08344  0.70310
FastABOD        15  0.05000   0.02158  0.03419  0.00530  0.06984   0.04202  0.53686
FastABOD        16  0.04000   0.01128  0.03399  0.00510  0.07166   0.04389  0.53818
KDEOS            7  0.02000  -0.00931  0.04214  0.01348  0.06457   0.03659  0.53729
KDEOS           17  0.06000   0.03188  0.03983  0.01110  0.09770   0.07071  0.57463
KDEOS          100  0.03000   0.00098  0.03299  0.00407  0.07509   0.04742  0.57901
LDF             14  0.35000   0.33056  0.31281  0.29225  0.36264   0.34357  0.79495
LDF             18  0.34000   0.32026  0.34028  0.32054  0.35036   0.33093  0.80031
LDF             65  0.31000   0.28936  0.32932  0.30926  0.38095   0.36243  0.81436
LDF            100  0.33000   0.30996  0.32387  0.30364  0.34074   0.32102  0.82525
INFLO           77  0.13000   0.10398  0.08200  0.05454  0.15528   0.13001  0.71414
INFLO           93  0.12000   0.09368  0.09327  0.06614  0.17121   0.14641  0.71776
INFLO           97  0.12000   0.09368  0.09596  0.06892  0.16598   0.14103  0.71919
INFLO           99  0.12000   0.09368  0.09561  0.06856  0.16725   0.14234  0.72102
COF             71  0.27000   0.24816  0.28366  0.26223  0.31507   0.29458  0.79283
COF             93  0.33000   0.30996  0.32189  0.30160  0.34532   0.32574  0.79032
COF             97  0.33000   0.30996  0.32480  0.30460  0.36364   0.34460  0.78780
COF            100  0.32000   0.29966  0.33070  0.31068  0.36000   0.34086  0.78803

Plots

[Plots not reproduced here: precision at n, adjusted precision at n, average precision, adjusted average precision, maximum F1 score, adjusted maximum F1 score, ROC AUC, and diversity.]
Method labels: A: KNN, B: KNNW, C: LOF, D: SimplifiedLOF, E: LoOP, F: LDOF, G: ODIN, H: KDEOS, I: COF, J: FastABOD, K: LDF, L: INFLO