Lehr- und Forschungseinheit für Datenbanksysteme
print


Breadcrumb Navigation


Content

Accepted Paper at VLDB 2022

SCAR - Spectral Clustering Accelerated and Robustified

11.07.2022

Authors

Ellen Hohma, Christian Frey, Anna Beer, Thomas Seidl

vldb2022_logo

48th International Conference on Very Large Databases (VLDB 2022),
05–09 September 2022, Sydney, Australia (and hybrid) 

 

Abstract

Spectral clustering is one of the most advantageous clustering approaches. However, standard Spectral Clustering is sensitive to noisy input data and has a high runtime complexity. Tackling one of these problems often exacerbates the other. As real-world datasets are often large and compromised by noise, we need to improve both robustness and runtime at once. Thus, we propose Spectral Clustering - Accelerated and Robust (SCAR), an accelerated, robustified spectral clustering method. In an iterative approach, we achieve robustness by separating the data into two latent components: cleansed and noisy data. We accelerate the eigendecomposition – the most time-consuming step – based on the Nyström method. We compare SCAR to related recent state-of-the-art algorithms in extensive experiments. SCAR surpasses its competitors in terms of speed and clustering quality on highly noisy data.