Accepted article at DSE (Data Science and Engineering)
Florian Richter, Yifeng Lu, Daniyal Kazempour, Thomas Seidl
OPTICS is a popular tool for visually analyzing the clustering structure of a dataset. The resulting two-dimensional plots show very dense areas and cluster candidates in the data as troughs. Each horizontal slice corresponds to the outcome of a density-based clustering, with the height of the slice acting as the density threshold for clusters. However, in highly dynamic and rapidly changing applications, a complex and finely detailed visualization slows down knowledge discovery. Instead, a framework that provides fast but coarse insights is required to point out structures in the data quickly. The user can then choose which direction to emphasize for refinement.
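To make the idea of a horizontal slice concrete, the following minimal sketch cuts a reachability plot at a given height and reads off the troughs as clusters. It is a simplified illustration rather than the method of this article: it assumes the reachability values are already given in OPTICS cluster order, it ignores core distances (which the original extraction procedure also consults), and the names `cut_reachability` and `min_size` are hypothetical.

```python
import math
from typing import List


def cut_reachability(reachability: List[float], threshold: float,
                     min_size: int = 2) -> List[int]:
    """Label points (in cluster order) by slicing the plot at `threshold`.

    A point whose reachability exceeds the threshold starts a new candidate
    group; subsequent points at or below the threshold join that group.
    Groups smaller than `min_size` are treated as noise (label -1).
    """
    groups: List[List[int]] = []
    current: List[int] = []
    for index, value in enumerate(reachability):
        if value > threshold:
            # A peak above the slice closes the previous trough.
            if current:
                groups.append(current)
            current = [index]
        else:
            current.append(index)
    if current:
        groups.append(current)

    labels = [-1] * len(reachability)
    next_label = 0
    for group in groups:
        if len(group) >= min_size:
            for index in group:
                labels[index] = next_label
            next_label += 1
    return labels


# Toy reachability plot; the first point has undefined (infinite) reachability.
plot = [math.inf, 0.2, 0.3, 0.2, 1.5, 0.4, 0.3, 2.0, 0.5]
coarse = cut_reachability(plot, threshold=1.0)
fine = cut_reachability(plot, threshold=0.35)
```

A high slice merges points into a few large clusters, while a lower slice keeps only the densest troughs and marks the remaining points as noise.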
We develop AMTICS, a novel and efficient divide-and-conquer approach that pre-clusters data in distributed instances and afterward aligns the results in a hierarchy. An interactive online phase ensures low complexity while giving the user full control over the partial cluster instances. The offline phase reveals the current clustering structure of the data at any time and with low complexity.