A Local Grammar of Cause and Effect: A Corpus-driven Study

University dissertation from University of Birmingham

Abstract: This thesis puts forward a specialized, functional grammar of cause and effect within the sub-genre of biomedical research articles. Building on research into the local grammars of dictionary definitions and evaluation, the thesis describes the application of a corpus-driven methodology to the description of the principal lexical grammatical patterns which underpin causation in scientific writing. The source of data is the 2 million-word Halmstad Biomedical Corpus, constructed from 589 on-line research articles published since 1997. These articles were sampled in accordance with a standard library classification system across the broad spectrum of the biomedical research literature. On the basis of lexical grammatical patterns identified in the corpus, a total of five functional sub-types of causation are put forward. The local grammar itself is a description of these sub-types based on the Hallidayan notion of system along the syntagm, coupled with the identification of the paradigmatic contents of these systems as a closed set of 37 semantic categories specific to the biomedical domain. A preliminary evaluation of the grammar is then offered in terms of hand-parsing experiments using a test corpus. Finally, potential NLP applications of the grammar are described in terms of on-line information extraction, ontology building and text summarization.