Practical Course Deep Clustering (Summer Semester 2024)
News
- Registration for the practical will take place via Moodle. Central registration starts on the 1st of March and is open until the 31st of March.
Organisation
- Volume: 12 ECTS (Doppelpraktikum)
- Lecturer: Prof. Dr. Thomas Seidl
- Contact: Collin Leiber
- Audience: The course is aimed at master's students in Informatics, Media Informatics, Statistics and Data Science
- Registration: Moodle
Time and Locations
All times are s.t. (sine tempore). Please consult Moodle for an up-to-date schedule!
When | Where | Start |
Mon, 16:00 - 20:00 h | Oettingenstr. 67, 151 | 15.04.24 |
Content
In this practical course, we deal with the topic of deep clustering. This term describes the combination of clustering with concepts from the field of deep learning. Corresponding methods have become popular in recent years and have achieved very good results on image and text data sets.
Clustering describes the task of automatically dividing objects into suitable groups, so-called clusters. Only the similarity between objects is considered, so no training data with known labels is required. This is also referred to as unsupervised learning.
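To make the idea of label-free grouping concrete, here is a minimal sketch of k-means, one classic clustering algorithm. It is only an illustration and not necessarily a method used in the practical. The function names, the toy points, and the choice to initialize the centers with the first k points are assumptions made for this example; random or k-means++ initialization is more common in practice.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on a list of tuples."""
    # Simplification for this sketch: the first k points act as initial centers.
    centers = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Toy data: two well-separated groups, no labels provided.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centers = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The algorithm only ever compares distances between points and centers; at no step does it need to know which group a point "should" belong to.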
Identifying clusters in high-dimensional data such as images, text, or videos can be very difficult because of the curse of dimensionality: as the number of dimensions grows, the distances between samples become more and more alike, so near and far neighbors are hard to tell apart. For this reason, clustering is often combined with some form of dimensionality reduction. Here, we can use linear transformations, e.g., PCA, or non-linear transformations, e.g., autoencoders. Non-linear transformations are more flexible and therefore better suited to complex clustering tasks. In deep clustering, a deep learning-based representation learning method is supplemented by a specific clustering loss. Such combinations will be studied in this practical.
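The combination of representation learning and a clustering loss can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not one of the specific algorithms covered in the practical: the autoencoder layer sizes, the k-means-style clustering term, the loss weight, and the random toy data are all choices made for this example.

```python
import torch
from torch import nn
import torch.nn.functional as F


class Autoencoder(nn.Module):
    """Small fully connected autoencoder (layer sizes are illustrative)."""

    def __init__(self, input_dim: int, embedding_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 32), nn.ReLU(), nn.Linear(32, embedding_dim)
        )
        self.decoder = nn.Sequential(
            nn.Linear(embedding_dim, 32), nn.ReLU(), nn.Linear(32, input_dim)
        )

    def forward(self, x):
        z = self.encoder(x)          # low-dimensional embedding
        return z, self.decoder(z)    # embedding and reconstruction


def squared_distances(z, centers):
    """Squared Euclidean distances, shape (n_samples, n_clusters)."""
    return ((z.unsqueeze(1) - centers.unsqueeze(0)) ** 2).sum(dim=2)


def deep_clustering_loss(x, x_rec, z, centers, weight=1.0):
    """Reconstruction loss plus a k-means-style clustering loss (assumed form)."""
    rec_loss = F.mse_loss(x_rec, x)
    # Pull every embedded sample towards its nearest cluster center.
    cluster_loss = squared_distances(z, centers).min(dim=1).values.mean()
    return rec_loss + weight * cluster_loss


torch.manual_seed(0)
x = torch.randn(64, 20)                     # random toy data, not a real data set
model = Autoencoder(input_dim=20, embedding_dim=2)
centers = nn.Parameter(torch.randn(3, 2))   # 3 learnable cluster centers
optimizer = torch.optim.Adam(list(model.parameters()) + [centers], lr=1e-2)

for step in range(50):
    z, x_rec = model(x)
    loss = deep_clustering_loss(x, x_rec, z, centers)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

with torch.no_grad():
    z, _ = model(x)
    labels = squared_distances(z, centers).argmin(dim=1)  # hard assignments
```

The encoder and the cluster centers are optimized jointly, so the embedding is shaped by the clustering objective. The reconstruction term counteracts the trivial solution of mapping all samples to a single point.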
In the course of the practical, we will first introduce theoretical foundations. Afterward, modern deep clustering algorithms will be implemented and evaluated. Here, we will deal intensively with relevant research publications. The practical assignments should be performed in small groups.
Prerequisites:
- Basic knowledge of deep learning (Autoencoders, GANs, ...)
- Programming knowledge: Python, PyTorch, Git
- Interest in scientific work
Helpful references:
- Deep Clustering Survey 1: Link
- Deep Clustering Survey 2: Link
- List of Deep Clustering Algorithms: Link
- Implementations of Deep Clustering Algorithms: Link