Structure-driven derivation of inter-lingual functor-argument trees for multi-lingual generation

University dissertation from Linköping : Univ

Abstract:

We show how an inter-lingual representation of messages can be exploited for natural language generation of technical documentation into Swedish and English in a system called Genie. Genie has a conceptual knowledge base of the facts considered true in the domain. A user queries the knowledge base for the facts she wants the document to include. The responses constitute the messages, which are multi-lingually generated into Swedish and English texts. The particular kind of conceptual message representation that is chosen allows for two assumptions about inter-linguality: (i) Syntactic compositionality, viz. the linguistic expression for a message is a function of the expressions obtained from the parts of the message. (ii) A message has in itself an adequate expression, which restricts the size of the input to generation. These assumptions underlie a grammar that maps individual messages to linguistic categories in three steps. The first step constructs a functor-argument tree over the target language syntax using a non-directed unification categorial grammar. The tree is an intermediate representation that includes the message and the assumptions. It lies closer to the target languages but is still language neutral. The second step instantiates the tree with linguistic material according to the target language. The final step uses the categorial grammar application rule on each node of the tree to obtain the resulting basic category. It contains an immediate representation for the linguistic expression of the message, and is trivially converted into a string of words. Some example texts in the genre have been studied. Their sublanguage traits clearly enable generation by the proposed method. The results indicate that Genie, and possibly other comparable systems that have a conceptual message representation, benefits in efficiency and ease of maintenance of the linguistic resources by making use of the knowledge-intensive multi-lingual generation method described here.
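
The three steps described above can be illustrated with a toy sketch. It is not Genie's implementation and does not use a unification categorial grammar: the message format, the names `Node`, `build_tree`, `instantiate`, `apply_rule`, `generate`, and the two-language `LEXICON` are all hypothetical, chosen only to show a language-neutral tree being built first, filled with language-specific material second, and reduced by function application third.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple, Union

# Hypothetical message format: a concept name, or a tuple (functor, arg, ...).
Message = Union[str, Tuple]


@dataclass
class Node:
    concept: str
    args: List["Node"] = field(default_factory=list)
    # Filled in step 2: a word (basic entry) or a function over argument strings.
    entry: Union[str, Callable[..., str], None] = None


# Assumed toy lexicon; functors are modelled as plain Python functions.
LEXICON = {
    "en": {
        "contain": lambda subj, obj: f"{subj} contains {obj}",
        "engine": "the engine",
        "piston": "a piston",
    },
    "sv": {
        "contain": lambda subj, obj: f"{subj} innehåller {obj}",
        "engine": "motorn",
        "piston": "en kolv",
    },
}


def build_tree(message: Message) -> Node:
    """Step 1: build a language-neutral functor-argument tree."""
    if isinstance(message, str):
        return Node(message)
    head, *parts = message
    return Node(head, [build_tree(p) for p in parts])


def instantiate(tree: Node, lang: str) -> Node:
    """Step 2: attach target-language material to every node."""
    children = [instantiate(a, lang) for a in tree.args]
    return Node(tree.concept, children, LEXICON[lang][tree.concept])


def apply_rule(node: Node) -> str:
    """Step 3: apply each functor to its reduced arguments, bottom-up."""
    if not node.args:
        return node.entry
    return node.entry(*(apply_rule(a) for a in node.args))


def generate(message: Message, lang: str) -> str:
    return apply_rule(instantiate(build_tree(message), lang))


message = ("contain", "engine", "piston")
print(generate(message, "en"))
print(generate(message, "sv"))
```

In this sketch the same tree from step 1 serves both languages; only the lexicon lookup in step 2 differs, which mirrors the abstract's point that the tree stays language neutral until it is instantiated.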
