Master Seminar "Foundation Models in AI" (SS 2023)
- Contact: Prof. Dr. Volker Tresp, Gengyuan Zhang
- Supervisors: email@example.com (Gengyuan Zhang), firstname.lastname@example.org (Yao Zhang), email@example.com (Ruotong Liao), firstname.lastname@example.org (Dr. Jindong Gu)
- Required: Lecture "Machine Learning" or equivalent.
Registration: via uni2work (central allocation)
All news will be announced here and on Uni2Work.
Dates and Location
|Kick-off||16:00-18:00||Room: Oettingenstr. 67 (C) - C 003||19.04.2023 (Wed)|
- kickoff slides
A foundation model is a large artificial intelligence model that is trained on massive data at scale (usually by self-supervised learning) and can be adapted to a wide range of downstream tasks in fields such as healthcare and education. Such models manifest capacities for perception, reasoning, and manipulation.
Early examples of foundation models are large-scale pre-trained language models such as BERT and GPT-3, which show great transferability on linguistic tasks. Subsequently, multimodal foundation models that bridge vision and language, for example DALL-E and Flamingo, have drawn increasing attention; they attempt to understand richer sources of data and to interact across multiple modalities.
This seminar will cover different dimensions of foundation models, ranging from model architectures, to trending training paradigms such as prompt training, to an in-depth understanding of the risks and capabilities of current foundation models.
 Bommasani, Rishi, et al. "On the opportunities and risks of foundation models." *arXiv preprint arXiv:2108.07258* (2021).
 Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." *arXiv preprint arXiv:1810.04805* (2018).
 Brown, Tom, et al. "Language models are few-shot learners." *Advances in neural information processing systems* 33 (2020): 1877-1901.
 Ramesh, Aditya, et al. "Zero-shot text-to-image generation." *International Conference on Machine Learning*. PMLR, 2021.
Alayrac, Jean-Baptiste, et al. "Flamingo: a visual language model for few-shot learning." *Advances in Neural Information Processing Systems* 35 (2022): 23716-23736.
- kickoff-ai (393 KByte)