The deep evolutionary roots of non-coding RNA - a comparative genomics approach

University dissertation from Stockholm : Department of Molecular Biology and Functional Genomics, Stockholm University

Abstract: Non-coding RNAs (ncRNAs) are a diverse group of genes that do not encode proteins but function exclusively at the level of RNA. They were originally suggested to be remnants of a pre-DNA stage of life known as the RNA world. More recent work, however, has uncovered a rich repertoire of previously unknown families, with possible consequences for our understanding of the origin and evolution of the modern RNA infrastructure. The main goal of this thesis was therefore to re-examine the evolutionary history of RNAs and theories regarding the transition from an RNA world in light of recent advances in molecular and computational biology.

Using comparative genomics approaches and sequence data from all domains of life, my work shows that the majority of known RNAs exhibit a highly domain-specific distribution, compatible with ongoing emergence rather than deep ancestry. Focusing on small nucleolar RNAs (snoRNAs), I find that the eukaryote ancestor possessed a complex snoRNA infrastructure, but that intronic snoRNAs are mobile over larger evolutionary time scales. The latter has consequences for predictions made by the Introns-first hypothesis, a framework explaining the emergence of introns in an RNA world, which we revisited in light of advances in our understanding of the evolutionary dynamics of introns.

A more in-depth analysis of ncRNA mobility across vertebrates found intronic copies of both snoRNAs and miRNAs to be more stable than intergenic ones, suggesting that this arrangement may be a consequence of co-expression. In addition, snoRNAs are frequently located in highly expressed genes, in line with their role in ribosome biogenesis. Finally, a closer examination of the genomic distribution of two essential ncRNAs, snoRNA U3 and the spliceosomal RNA U1, shows that both are present in numerous copies across vertebrate genomes. Using next-generation sequencing data, I tested whether this is the result of genetic drift or reflects a requirement for many copies.
