Shuttle
This dataset has been preprocessed in different variants in the literature. We follow the procedure of Zhang et al. [1], using classes 1, 3, 4, 5, 6 and 7 as inliers and class 2 as outlier, selecting 1000 inliers vs. 13 outliers (class 2). The selection of instances is based on the test set. The processed dataset consists of 1013 instances represented in 9 attributes, with 13 outliers (1.28%) and 1000 inliers (98.72%).
References:
[1] K. Zhang, M. Hutter, and H. Jin. A new local distance-based outlier detection approach for scattered real-world data. In Proc. PAKDD, pages 813-822, 2009.
Download all data set variants (328.2 kB). Access original data (shuttle.tst, [1] only uses test set)