Lehr- und Forschungseinheit für Datenbanksysteme

Practical Course: Deep Clustering (Summer Semester 2024)

News

  • Registration for the practical takes place via Moodle. Central registration opens on 1 March and closes on 31 March.

Organisation

  • Volume: 12 ECTS
  • Lecturer: Prof. Dr. Thomas Seidl
  • Contact: Collin Leiber
  • Audience: The course is aimed at master's students in Informatics, Media Informatics, Statistics, and Data Science
  • Registration: Moodle

Time and Locations

All times are s.t. (sine tempore). Please consult Moodle for an up-to-date schedule!

When                    Where                    Start
Mon, 16:00 - 20:00 h    Oettingenstr. 67, 151    15.04.24

Content

    In this practical course, we deal with the topic of deep clustering. This term describes the combination of clustering with concepts from the field of deep learning. Corresponding methods have become popular in recent years and have achieved very good results on image and text data sets.

    Clustering describes the task of automatically dividing objects into suitable groups, so-called clusters. Only the similarity between objects is considered, so no training data with known labels is required. This is also referred to as unsupervised learning.
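
    To make the idea concrete, the following is a minimal sketch of one classical clustering algorithm, k-means, written in plain Python. The choice of k-means, the function name kmeans, and the toy data are assumptions made for illustration; they are not taken from the course material. Note that the algorithm only sees the points themselves and never any labels.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct points as initial centers
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Two well-separated toy groups in 2D (illustrative data, no labels given).
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centers = kmeans(data, k=2)
print(labels)
```

    The first three points and the last three points end up in different clusters; which of the two cluster indices each group receives depends on the random initialization.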

    Identifying clusters in high-dimensional data such as images, text, or videos can be very difficult because of the curse of dimensionality: as the number of dimensions grows, the distances between samples become increasingly similar, so that near and far neighbors are hard to tell apart. For this reason, clustering is often combined with some form of dimensionality reduction. Here, we can use linear transformations, e.g., PCA, or non-linear transformations, e.g., autoencoders. Non-linear transformations are more flexible and therefore suited to more complex clustering tasks. In deep clustering, a deep learning-based representation learning method is supplemented by a specific clustering loss. Such combinations will be studied in this practical.
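
    The curse of dimensionality mentioned above can be observed with a small experiment. The sketch below is an illustration only: it assumes uniformly random data in the unit hypercube, and relative_contrast is a hypothetical helper name. It measures how much the farthest point differs from the nearest one, relative to the nearest distance, for a single query point.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for distances from one random query
    point to n_points uniform random points in the unit hypercube."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


# The contrast shrinks as the dimensionality grows: nearest and farthest
# neighbors become almost equally far away.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

    A shrinking contrast means that distance-based similarity carries less information in the original space, which is one motivation for clustering in a learned, lower-dimensional representation instead.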

    In the course of the practical, we will first introduce theoretical foundations. Afterward, modern deep clustering algorithms will be implemented and evaluated. Here, we will deal intensively with relevant research publications. The practical assignments should be performed in small groups.

    Prerequisites:

    • Basic knowledge of deep learning (Autoencoders, GANs, ...)
    • Programming knowledge: Python, PyTorch, Git
    • Interest in scientific work

    Helpful references:

    • Deep Clustering Survey 1: Link
    • Deep Clustering Survey 2: Link
    • List of Deep Clustering Algorithms: Link
    • Implementations of Deep Clustering Algorithms: Link