Shuttle

This dataset has been preprocessed in several different ways in the literature. We follow the procedure of Zhang et al. [1]: classes 1, 3, 4, 5, 6 and 7 are treated as inliers and class 2 as outliers, and instances are selected from the test set only. The processed dataset consists of 1013 instances described by 9 attributes, with 13 outliers (1.28%) and 1000 inliers (98.72%).
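The selection described above could be reproduced roughly as in the following sketch. It is an illustration under stated assumptions, not the script used to build the published variants: it assumes each row of shuttle.tst is whitespace-separated with 9 integer attributes followed by the class label, and it assumes the 1000 inliers are drawn uniformly at random (the description does not say how they are picked). The names parse_rows and build_variant are hypothetical.

```python
import random

INLIER_CLASSES = {1, 3, 4, 5, 6, 7}
OUTLIER_CLASS = 2


def parse_rows(lines):
    """Parse whitespace-separated rows into (attributes, class_label) pairs.

    Assumption: the last column of each row is the class label.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        values = [int(f) for f in fields]
        rows.append((values[:-1], values[-1]))
    return rows


def build_variant(rows, n_inliers=1000, seed=0):
    """Keep all class-2 rows as outliers and sample n_inliers inlier rows.

    Assumption: inliers are sampled uniformly at random without replacement;
    a different seed would give a different variant.
    Returns (attributes, is_outlier) pairs, with is_outlier in {0, 1}.
    """
    rng = random.Random(seed)
    inliers = [r for r in rows if r[1] in INLIER_CLASSES]
    outliers = [r for r in rows if r[1] == OUTLIER_CLASS]
    sampled = rng.sample(inliers, n_inliers)
    return [(x, 0) for x, _ in sampled] + [(x, 1) for x, _ in outliers]


def outlier_percentage(variant):
    """Share of outliers in the variant, in percent, rounded to 2 decimals."""
    n_outliers = sum(label for _, label in variant)
    return round(100.0 * n_outliers / len(variant), 2)
```

With a local copy of the test file, one would call build_variant(parse_rows(open("shuttle.tst"))). Different seed values would be one plausible way to obtain several variants of the same size, though whether the published versions were generated this way is not stated here.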

References:

[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.

Download all dataset variants (328.2 kB). Access the original data (shuttle.tst; [1] uses only the test set).

Shuttle (version#01)
Shuttle (version#02)
Shuttle (version#03)
Shuttle (version#04)
Shuttle (version#05)
Shuttle (version#06)
Shuttle (version#07)
Shuttle (version#08)
Shuttle (version#09)
Shuttle (version#10)