Natural Language Processing Methods for Automatic Illustration of Text

University dissertation from Department of Computer Science, Lund University

Abstract: This thesis describes methods for automatically creating illustrations of natural-language text. The main focus of the work is converting texts that describe sequences of events in a physical world into animated images, a task we call text-to-scene conversion. The first part of the thesis describes Carsim, a system that automatically illustrates traffic accident newspaper reports written in Swedish. This system is the first text-to-scene conversion system for non-invented texts. The second part of the thesis focuses on methods to generalize the NLP components of Carsim so that the system is more easily portable to new application domains. Specifically, we develop methods to sidestep the scarcity of the annotated data needed for training and testing NLP methods. We present a method to annotate the Swedish side of a parallel corpus with shallow semantic information in the FrameNet standard. This corpus is then used to train a semantic role labeler for Swedish text.
