Multilingual text generation from structured formal representations

University dissertation from University of Gothenburg

Abstract: This thesis aims to identify the optimal ways in which natural language generation techniques can be brought to bear upon the problem of processing a structured body of information in order to devise a coherent presentation of text content in multiple languages. We investigate how chains of referential expressions are realized in English, Swedish and Hebrew, and suggest several coreference strategies that can be used to generate coherent descriptions about paintings. The suggested strategies focus on the need to produce paragraph-sized written natural language descriptions from formal structured representations presented in the Semantic Web. We account for principles of coreference by introducing a new modularized approach to automatically generate chains of referential expressions from ontologies. We demonstrate the feasibility of the approach by implementing a system where a Semantic Web domain ontology serves as the background knowledge representation and where the language-specific coreference strategies are incorporated. The system uses both the principles of discourse structures and coreference strategies to guide the generation process. We show how the system successfully generates coherent, well-formed descriptions in multiple languages.
