Bridging the past to the present: investigating species boundaries with herbarium specimens and next-generation-sequencing. A case study of circumpolar Silene sect. Physolychnis

Abstract: In recent decades, the genetic information held in herbarium collections has been partly revealed, largely thanks to fast-moving high-throughput sequencing technologies. However, the genetic yield from herbarium material is limited by DNA degradation over time, often worsened by certain preservation methods. The primary goal of this thesis is to use high-throughput sequencing and target capture to optimize the use of herbaria in systematics, with a focus on disentangling evolutionary relationships within diploid circumpolar members of Silene sect. Physolychnis. In Manuscript I, I investigate whether Silene herbarium specimens can yield long DNA reads using target capture and a Silene-specific bait panel that enriches 48 low-copy nuclear genes. To isolate long DNA fragments, I optimize wet lab protocols to increase the DNA output and the recovery of long fragments after size selection. Sequencing is performed with highly accurate HiFi reads from PacBio SMRT Sequel. The results show that specimens less than thirty years old yield long DNA reads with high sequencing depth, giving hope for accessing genomic complexity from young herbarium material. In Manuscript II, the design of a target capture experiment is described step by step. Even though universal kits are available, the development of group-specific kits is gaining popularity. This review shows how to choose orthologous genes and how to design probes tailored to a group of interest. It breaks the target capture itself down into three steps: hybridization, incubation and wash. In Manuscript III, I give an overview of species delimitation methods based on multi-locus datasets. I describe how multi-locus approaches have revealed gene tree discordance caused by incomplete lineage sorting and gene flow. I emphasize that coalescent-based methods are the models of choice for accounting for incomplete lineage sorting, and I describe how they are implemented.
I also describe how allopolyploidization impacts species boundaries, and how it is represented in multilabelled and network-like trees. In Manuscript IV, I test a recently proposed taxonomy of the diploid and circumpolar Silene uralensis group using 42 low-copy nuclear genes and the multispecies coalescent model STACEY. The phylogenetic inference provides little support for the proposed taxonomy of the complex, suggesting that the group needs taxonomic revision. The group includes diploid and polyploid taxa that are difficult to tell apart morphologically. To identify allopolyploid plants, I developed a novel method based on allele phasing. The method proves effective at identifying allopolyploids, even from very old herbarium specimens. This thesis shows that herbarium material may hold unsuspected potential for studying genomic complexity and disentangling taxonomically complex groups. This is made possible by the rapid development of high-throughput sequencing, bioinformatic pipelines and computational resources.
