Search for dissertations about: "Corpus creation"

Showing results 1-5 of 12 Swedish dissertations containing the words "Corpus creation".

  1. Studies in Corpora and Idioms : Getting the cat out of the bag

    Author : David Minugh; Nils-Lennart Johannesson; Maria Kuteeva; Karin Aijmer; Stockholms universitet
    Keywords : HUMANITIES; Coll corpus; corpora; corpus creation; idioms; idiom variation; idiom-breaking; online newspapers; student newspapers; college newspapers; English language; English;

    Abstract : “Idiomatic” expressions, usually called “idioms”, such as a dime a dozen, a busman’s holiday, or to have bats in your belfry are a curious part of any language: they usually have a fixed lexical (why a busman?) and structural composition (only dime and dozen in direct conjunction mean ‘common, ordinary’), can be semantically obscure (why bats?), yet are widely recognized in the speech community, in spite of being so rare that only large corpora can provide us with access to sufficient empirical data on their use. In this compilation thesis, four published studies focusing on idioms in corpora are presented.

  2. Idioms Unlimited. A study of non-canonical forms of English verbal idioms in the British National Corpus

    Author : Elisabeth Gustawsson; Göteborgs universitet
    Keywords : HUMANITIES; English; idiom; corpus; transparency; isomorphicity; modification; derivation; permutation; non-canonical; figurative;

    Abstract : The study is a corpus-based investigation of the semantic, lexical, and grammatical flexibility of English verbal idioms, focusing on qualitative analyses of examples of current British English usage. 300 verbal idioms - i.e. idioms that consist of a verb and a complement, e.

  3. Bootstrapping Named Entity Annotation by Means of Active Machine Learning

    Author : Fredrik Olsson; Göteborgs universitet
    Keywords : NATURAL SCIENCES; corpus creation; data annotation; active learning; named entity recognition; machine learning; computational linguistics; natural language processing; information refinement;

    Abstract : This thesis describes the development and in-depth empirical investigation of a method, called BootMark, for bootstrapping the marking up of named entities in textual documents. The reason for working with documents, as opposed to, for instance, sentences or phrases, is that the BootMark method is concerned with the creation of corpora.

  4. Bootstrapping Named Entity Annotation by Means of Active Machine Learning: A Method for Creating Corpora

    Author : Fredrik Olsson; RISE
    Keywords : NATURAL SCIENCES; corpus creation; data annotation; active learning; named entity recognition; machine learning; computational linguistics; nlp;

    Abstract : This thesis describes the development and in-depth empirical investigation of a method, called BootMark, for bootstrapping the marking up of named entities in textual documents. The reason for working with documents, as opposed to, for instance, sentences or phrases, is that the BootMark method is concerned with the creation of corpora.

  5. Shades of Certainty : Annotation and Classification of Swedish Medical Records

    Author : Sumithra Velupillai; Hercules Dalianis; Martin Hassel; Sabine Bergler; Stockholms universitet
    Keywords : SOCIAL SCIENCES; Clinical documentation; Certainty level classification; Annotation; E-health; Corpus creation; De-identification; Speculative language; Medical Records; Swedish; Natural Language Processing; Language Technology; Computer and Systems Sciences;

    Abstract : Access to information is fundamental in health care. This thesis presents research on Swedish medical records with the overall goal of building intelligent information access tools that can aid health personnel, researchers and other professions in their daily work, and, ultimately, improve health care in general.