Search for dissertations about: "language corpora"
Showing results 1 - 5 of 59 Swedish dissertations containing the words language corpora.
-
1. Why the pond is not outside the frog? Grounding in contextual representations by neural language models
Abstract : In this thesis, to build a multi-modal system for language generation and understanding, we study grounded neural language models. Literature in psychology informs us that spatial cognition involves different aspects of knowledge that include visual perception and human interaction with the world.
-
2. Recycling Translations : Extraction of Lexical Data from Parallel Corpora and their Application in Natural Language Processing
Abstract : The focus of this thesis is on re-using translations in natural language processing. It involves the collection of documents and their translations in an appropriate format, the automatic extraction of translation data, and the application of the extracted data to different tasks in natural language processing.
-
3. Splitting rocks: Learning word sense representations from corpora and lexica
Abstract : The representation of written language semantics is a central problem of language technology and a crucial component of many natural language processing applications, from part-of-speech tagging to text summarization. These representations of linguistic units, such as words or sentences, allow computer applications that work with language to process and manipulate the meaning of text.
-
4. Morphosyntactic Corpora and Tools for Persian
Abstract : This thesis presents open source resources in the form of annotated corpora and modules for automatic morphosyntactic processing and analysis of Persian texts. More specifically, the resources consist of an improved part-of-speech tagged corpus and a dependency treebank, as well as tools for text normalization, sentence segmentation, tokenization, part-of-speech tagging, and dependency parsing for Persian.
-
5. Studies in Corpora and Idioms : Getting the cat out of the bag
Abstract : “Idiomatic” expressions, usually called “idioms”, such as a dime a dozen, a busman’s holiday, or to have bats in your belfry are a curious part of any language: they usually have a fixed lexical (why a busman?) and structural composition (only dime and dozen in direct conjunction mean ‘common, ordinary’), can be semantically obscure (why bats?), yet are widely recognized in the speech community, in spite of being so rare that only large corpora can provide us with access to sufficient empirical data on their use. In this compilation thesis, four published studies focusing on idioms in corpora are presented.