Search for dissertations about: "corpus linguistics"

Showing results 1 - 5 of 80 Swedish dissertations containing the words corpus linguistics.

  1. Morphosyntactic Corpora and Tools for Persian

    Author : Mojgan Seraji; Joakim Nivre; Carina Jahani; Jan Hajic; Uppsala universitet
    Keywords : NATURAL SCIENCES; Persian; language technology; corpus; treebank; preprocessing; segmentation; part-of-speech tagging; dependency parsing; Computational Linguistics;

    Abstract : This thesis presents open source resources in the form of annotated corpora and modules for automatic morphosyntactic processing and analysis of Persian texts. More specifically, the resources consist of an improved part-of-speech tagged corpus and a dependency treebank, as well as tools for text normalization, sentence segmentation, tokenization, part-of-speech tagging, and dependency parsing for Persian.

  2. The Multilingual Forest : Investigating High-quality Parallel Corpus Development

    Author : Yvonne Adesam; Martin Volk; Joakim Nivre; Koenraad de Smedt; Stockholms universitet
    Keywords : NATURAL SCIENCES; treebank; syntax; alignment; corpus; annotation projection; multilingual; tagging; parsing; Computational Linguistics;

    Abstract : This thesis explores the development of parallel treebanks, collections of language data consisting of texts and their translations, with syntactic annotation and alignment, linking words, phrases, and sentences to show translation equivalence. We describe the semi-manual annotation of the SMULTRON parallel treebank, consisting of 1,000 sentences in English, German and Swedish.

  3. Clefts in English and Swedish: A contrastive study of IT-clefts and WH-clefts in original texts and translations

    Author : Mats Johansson; Engelska
    Keywords : HUMANITIES; English language and literature; information structure; ground; focus; discourse topic; topic; theme; discourse; fronting; wh-clefts; it-clefts; pseudo-cleft constructions; cleft constructions; bidirectional translation corpus; translation; corpus linguistics; contrastive linguistics; Swedish; English; Scandinavian languages and literature; Linguistics;

    Abstract : This study investigates the use of cleft constructions in English and Swedish on the basis of a bidirectional translation corpus consisting of original English and Swedish texts and their translations into the other language. This design minimizes the problems inherent in corpora of original texts alone, viz.

  4. The Balochi Language of Turkmenistan : A corpus-based grammatical description

    Author : Serge Axenov; Carina Jahani; Åke Viberg; Elena Bashir; Uppsala universitet
    Keywords : Iranian languages; Balochi; dialectology; phonology; morphology; syntax; descriptive linguistics; sociolinguistics; unwritten languages; fieldwork; Linguistics;

    Abstract : This dissertation is a synchronic description of the Balochi language as spoken in Turkmenistan. The dissertation consists of three main parts: sound structure, word- and phrase-level morphosyntax, and clause structure.

  5. Learning Idiomaticity : A Corpus-Based Study of Idiomatic Expressions in Learners' Written Production

    Author : Maria Wiktorsson; Engelska
    Keywords : HUMANITIES; corpus linguistics; idiom principle; open choice principle; construction grammar; compositionality; conventionalisation; idiom; formulae; collocation; prefab; Swedish learners of English; L2; Idiomaticity; L1; English language and literature; Linguistics;

    Abstract : The aim of this study is to investigate how Swedish learners of English (at different levels of proficiency) master idiomaticity in their target language. I argue that idiomaticity can be related to the storage and use of multi-word expressions that are preferred by native speakers.