Contact
Ludwig-Maximilians-Universität München
Lehrstuhl für Datenbanksysteme und Data Mining
Oettingenstraße 67
80538 München
Germany
Room: E U103
Phone: +49-89-2180-9186
Email: zhang@dbs.ifi.lmu.de
Biography
I received my bachelor's degree from Zhejiang University, China, in 2018 and my master's degree from the Technical University of Munich in 2021, after which I joined LMU as a Ph.D. student in computer science. My research focuses on multimodal learning and reasoning with video/image and language data. If you are interested in a master's thesis or a research project on these topics, please get in touch with me at zhang@dbs.ifi.lmu.de.
Research Interests
- Multimodal reasoning
- Video understanding
Teaching
- WS23/24: Machine Learning
- WS23/24: Master Seminar: Foundation Models in AI
- WS23/24: Master Seminar: Knowledge Graph with Machine Learning
- SS23: Machine Learning
- SS23: Master Seminar: Foundation Models in AI
- SS23: Master Seminar: Knowledge Graph with Machine Learning
- SS22: Machine Learning
Publications
- Gengyuan Zhang, Jisen Ren, Jindong Gu, and Volker Tresp. Multi-event video-text retrieval. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 22113–22123, 2023.
- Zhen Han∗, Gengyuan Zhang∗, Yunpu Ma, and Volker Tresp. Time-dependent entity embedding is not all you need: A re-evaluation of temporal knowledge graph completion models under a unified framework. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 8104–8118. Association for Computational Linguistics, November 2021.
- Gengyuan Zhang, Yurui Zhang, Kerui Zhang, and Volker Tresp. Can vision-language models be a good guesser? Exploring VLMs for times and location reasoning. arXiv preprint arXiv:2307.06166, 2023.
- Jindong Gu, Zhen Han, Shuo Chen, Ahmad Beirami, Bailan He, Gengyuan Zhang, Ruotong Liao, Yao Qin, Volker Tresp, and Philip Torr. A systematic survey of prompt engineering on vision-language foundation models. arXiv preprint arXiv:2307.12980, 2023.
- Yao Zhang, Haokun Chen, Ahmed Frikha, Yezi Yang, Denis Krompass, Gengyuan Zhang, Jindong Gu, and Volker Tresp. CL-CrossVQA: A continual learning benchmark for cross-domain visual question answering. arXiv preprint arXiv:2211.10567, 2022.