Search for dissertations about: "corpus building"

Showing results 1 - 5 of 13 Swedish dissertations containing the words corpus building.

  1. A Local Grammar of Cause and Effect : A Corpus-driven Study

    Author : Christopher Allen; Geoffrey Barnbrook; Christopher Gledhill; Susan Hunston; University of Birmingham, UK
    Keywords : HUMANIORA; HUMANITIES; Local grammar; Biomedical English; Corpus-driven; Humaniora; Humanities;

    Abstract : This thesis puts forward a specialized, functional grammar of cause and effect within the sub-genre of biomedical research articles. Building on research into the local grammars of dictionary definitions and evaluation, the thesis describes the application of a corpus-driven methodology to the description of the principal lexical grammatical patterns which underpin causation in scientific writing.

  2. Studies in Corpora and Idioms : Getting the cat out of the bag

    Author : David Minugh; Nils-Lennart Johannesson; Maria Kuteeva; Karin Aijmer; Stockholms universitet
    Keywords : HUMANIORA; HUMANITIES; Coll corpus; corpora; corpus creation; idioms; idiom variation; idiom-breaking; online newspapers; student newspapers; college newspapers; English language; Engelska språket; English; engelska;

    Abstract : “Idiomatic” expressions, usually called “idioms”, such as a dime a dozen, a busman’s holiday, or to have bats in your belfry are a curious part of any language: they usually have a fixed lexical (why a busman?) and structural composition (only dime and dozen in direct conjunction mean ‘common, ordinary’), can be semantically obscure (why bats?), yet are widely recognized in the speech community, in spite of being so rare that only large corpora can provide us with access to sufficient empirical data on their use. In this compilation thesis, four published studies focusing on idioms in corpora are presented.

  3. Resource Lean and Portable Automatic Text Summarization

    Author : Martin Hassel; Hercules Dalianis; Viggo Kann; Kerstin Severinson Eklundh; Horacio Saggion; KTH
    Keywords : NATURVETENSKAP; NATURAL SCIENCES; holsum; language independent; holistic; summarization; lexical semantics; co-occurrence statistics; word space model; bag-of-words; bag-of-concepts; random indexing; swesum; news corpus; extract corpus; Computer science; Datalogi;

    Abstract : Today, with digitally stored information available in abundance, even for many minor languages, this information must by some means be filtered and extracted in order to avoid drowning in it. Automatic summarization is one such technique, where a computer summarizes a longer text into a shorter, non-redundant form.

  4. Clefts in English and Swedish: A contrastive study of IT-clefts and WH-clefts in original texts and translations

    Author : Mats Johansson; Engelska
    Keywords : HUMANIORA; HUMANITIES; Engelska språk och litteratur ; English language and literature; information structure; ground; focus; discourse topic; topic; theme; discourse; fronting; wh-clefts; it-clefts; pseudo-cleft constructions; cleft constructions; bidirectional translation corpus; translation; corpus linguistics; contrastive linguistics; Swedish; English; Scandinavian languages and literature; Nordiska språk språk och litteratur ; Linguistics; Lingvistik;

    Abstract : This study investigates the use of cleft constructions in English and Swedish on the basis of a bidirectional translation corpus consisting of original English and Swedish texts and their translations into the other language. This design minimizes the problems inherent in corpora of original texts alone, viz. [...]

  5. Building Knowledge Graphs : Processing Infrastructure and Named Entity Linking

    Author : Marcus Klang; Robotik och Semantiska System
    Keywords : NATURVETENSKAP; NATURAL SCIENCES; natural language processing; machine learning; computational linguistics; named entity linking;

    Abstract : Things such as organizations, persons, or locations are ubiquitous in all texts circulating on the internet, particularly in the news, forum posts, and social media. Today, there is more written material than any single person can read through during a typical lifespan.