Genomic feature identification in trypanosomatid parasites

University dissertation from Stockholm : Karolinska Institutet, Department of Cell and Molecular Biology

Abstract: The trypanosomatid parasites cause death and suffering in humans and livestock alike. Current drugs lack efficacy and cause severe side effects, and no vaccines are available. Increased knowledge of the biology of the parasites is vital for the development of new drugs. Research on these ancient eukaryotes has already led to the discovery of mechanisms of broader relevance, such as RNA editing, trans splicing and antigenic variation. Post-transcriptional regulation is an important part of the regulatory networks of most higher organisms, including humans. In the kinetoplastids, only a very limited part of the control of gene expression is exerted at the transcriptional level. Genes are expressed as long polycistronic pre-mRNAs, and individual messages are formed by trans splicing and polyadenylation. Even genes that are not coregulated can be on the same polycistronic pre-mRNA. The trypanosomatids can therefore be regarded as models for post-transcriptional regulation in the more complex eukaryotes.

The progress of the human and other genome projects shows how a complete genomic sequence can increase the efficiency of traditional molecular biology. Computer-aided and fully automated genome sequence analysis tools allow the discovery of novel features and help direct hypothesis-driven experiments. We have sequenced the genome of Trypanosoma cruzi as part of a three-centre collaboration, and provided an extensive annotation that identifies biologically interesting features. To this end we have used available informatics tools where possible, and developed some new programs. The focus was on integrating current molecular biology knowledge in large-scale analyses, and on arriving at experimentally testable hypotheses.

This thesis is based on five papers (I-V). Paper I describes a gene-finding and annotation program that we constructed for the annotation of the genome, which is described in paper III. For paper III, we collaborated with experts in several areas to investigate the gene content of T. cruzi. In paper II we present global base skew features in the genome. In paper IV we describe a model of trans splicing in Trypanosoma brucei and its application at the genome level. In paper V, we apply the trans splice model to predict message boundaries in Trypanosoma cruzi and, based on these predictions, find that upstream open reading frames are common. We hypothesise that these generally repress translation.
