Lehr- und Forschungseinheit für Datenbanksysteme



Accepted Paper at IAL Workshop at ECML PKDD 2022

Accelerating Diversity Sampling for Deep Active Learning By Low-Dimensional Representations



Sandra Gilhuber, Max Berrendorf, Yunpu Ma, and Thomas Seidl


6th International Workshop on Interactive Adaptive Learning (IAL2022) co-located with the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2022),
19–23 September 2022, Grenoble, France



Selecting diverse instances for annotation is one of the key factors of successful active learning strategies.
To this end, existing methods often operate on high-dimensional latent representations. In this work, we propose to use the low-dimensional vector of predicted probabilities instead, which can be seamlessly integrated into existing methods. We empirically demonstrate that this considerably decreases the query time, i.e., the time needed to select an instance for annotation, while at the same time improving results. Low query times are relevant for active learning researchers, who use a (fast) oracle for simulated annotation and are thus often constrained by query time. They are also practically relevant for complex annotation tasks in which only a small pool of skilled domain experts with a limited time budget is available for annotation.
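
The idea of swapping the representation can be illustrated with a minimal sketch. It is not the authors' implementation: greedy k-center selection is assumed here as one common diversity sampling strategy, the data is random, and the names `softmax`, `k_center_greedy`, `latent`, `head`, and `probs` are hypothetical. The same selection routine runs once on a 512-dimensional latent representation and once on the 10-dimensional vector of predicted probabilities; only the input changes.

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Row-wise softmax that turns logits into predicted class probabilities."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def k_center_greedy(features: np.ndarray, budget: int, first: int = 0) -> list:
    """Greedily pick `budget` row indices that are far apart in Euclidean distance.

    Assumed stand-in for a diversity sampling strategy; the cost of each step
    grows with the number of columns in `features`.
    """
    selected = [first]
    # Distance of every pool point to its nearest already selected point.
    min_dist = np.linalg.norm(features - features[first], axis=1)
    while len(selected) < budget:
        idx = int(np.argmax(min_dist))  # farthest point from the current selection
        selected.append(idx)
        new_dist = np.linalg.norm(features - features[idx], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return selected


# Hypothetical unlabeled pool: random stand-ins for a trained model's outputs.
rng = np.random.default_rng(0)
n_pool, n_latent, n_classes = 1000, 512, 10
latent = rng.normal(size=(n_pool, n_latent))         # high-dimensional latent representation
head = rng.normal(size=(n_latent, n_classes))        # hypothetical classification head
probs = softmax(latent @ head / np.sqrt(n_latent))   # low-dimensional predicted probabilities

query_latent = k_center_greedy(latent, budget=20)    # distances computed in 512 dimensions
query_probs = k_center_greedy(probs, budget=20)      # distances computed in 10 dimensions
```

Because every distance computation in the second call touches 10 instead of 512 columns, the per-query cost drops accordingly in this sketch. Whether the selected instances also lead to better models is an empirical question that random data like this cannot answer.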

Our code is available on GitHub.