Lehr- und Forschungseinheit für Datenbanksysteme

Accepted paper at EMNLP 2020

An Unsupervised Joint System for Text Generation from Knowledge Graphs and Semantic Parsing

8 October 2020

Authors

Martin Schmitt, Sahand Sharifzadeh, Volker Tresp, Hinrich Schütze


The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020),
16-20 November 2020, Virtual


Abstract

Knowledge graph (KG) schemas can vary greatly from one domain to another. Therefore, supervised approaches to graph-to-text generation and text-to-graph knowledge extraction (semantic parsing) will always suffer from a shortage of domain-specific parallel graph-text data, while adapting a model trained on a different domain is often impossible due to little or no overlap in entities and relations. This situation calls for an approach that (1) does not need large amounts of annotated data and (2) is easy to adapt to new KG schemas. To this end, we present the first approach to fully unsupervised text generation from KGs and KG generation from text. Inspired by recent work on unsupervised machine translation, we serialize a KG as a sequence of facts and frame both tasks as sequence translation. By means of a shared sequence encoder and decoder, our model learns to map both graphs and texts into a joint semantic space and thus generalizes over different surface representations with the same meaning. We evaluate our approach on WebNLG v2.1 and a new benchmark leveraging scene graphs from Visual Genome. Our system outperforms strong baselines for both text↔graph tasks without any manual adaptation from one dataset to the other. In additional experiments, we investigate the impact of using different unsupervised objectives.
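
To make the idea of serializing a KG as a sequence of facts more concrete, here is a minimal Python sketch. It is not the authors' implementation: the separator tokens `<fact>` and `<sep>`, the function names `serialize_graph` and `deserialize_graph`, and the example triples are all assumptions chosen for illustration, since the abstract does not specify the actual format. The sketch only shows how a graph could be flattened into a token sequence that a sequence-to-sequence model might consume, and how such a sequence could be parsed back into facts.

```python
from typing import List, Tuple

# A fact is a (subject, relation, object) triple.
Fact = Tuple[str, str, str]

# Hypothetical separator tokens; the abstract does not state the real format.
FACT_SEP = "<fact>"
FIELD_SEP = "<sep>"


def serialize_graph(facts: List[Fact]) -> str:
    """Flatten a list of triples into one whitespace-tokenized sequence."""
    parts = [f" {FIELD_SEP} ".join(fact) for fact in facts]
    return f" {FACT_SEP} ".join(parts)


def deserialize_graph(sequence: str) -> List[Fact]:
    """Parse a serialized sequence back into triples, skipping malformed facts."""
    facts: List[Fact] = []
    for chunk in sequence.split(FACT_SEP):
        fields = [field.strip() for field in chunk.split(FIELD_SEP)]
        # Keep only well-formed facts with three non-empty fields.
        if len(fields) == 3 and all(fields):
            facts.append((fields[0], fields[1], fields[2]))
    return facts


if __name__ == "__main__":
    # Illustrative scene-graph-style triples, not taken from any dataset.
    graph = [("dog", "sitting on", "couch"), ("couch", "has color", "red")]
    sequence = serialize_graph(graph)
    print(sequence)
    print(deserialize_graph(sequence))
```

Skipping malformed facts in `deserialize_graph` is a design choice made here because a decoder's output is not guaranteed to be well-formed; a real system might handle such cases differently.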