Population genomic analyses of regulatory variation and selection in Brassicaceae species

Abstract: The impact of selection on regulatory variation and the contribution of regulatory changes to phenotypic variation have long been debated in evolutionary genetics. Because cis-regulatory elements such as promoters and enhancers can be difficult to identify, it has been more challenging to quantify the impact of selection on variation in cis-regulatory regions than in protein-coding regions. In this thesis, I use genomic tools to investigate gene expression variation and selection in Brassicaceae species. First, I investigated the genomic impact of selection on putative cis-regulatory regions in the genome of the crucifer species Capsella grandiflora (Paper I). I used an assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) to empirically identify putative cis-regulatory regions as those located in accessible chromatin regions (ACRs). Based on whole-genome resequencing data from a natural population, I then showed that ACRs are under stronger purifying selection than other intergenic regions and that they are depleted for transposable element (TE) insertions and enriched for expression quantitative trait loci (eQTL), as would be expected if ACRs are enriched for functional elements affecting gene expression. Second, I explored how the location and silencing of TEs affect selection against TEs (Paper II). Specifically, I tested a trade-off model of epigenetic TE silencing, according to which the positive effects of TE silencing on preventing TE movement conflict with negative effects of TE silencing on nearby gene expression. I found that TE silencing through the RNA-directed DNA methylation (RdDM) pathway affects selection against TEs close to genes in C. grandiflora, which is consistent with the trade-off model.
Third, I used Arabidopsis thaliana single-cell expression data to investigate the relationship between gene body methylation (gbM) and transcriptional regulation (Paper III). I found a negative correlation between gbM and gene expression noise, as well as a positive correlation between gbM and gene expression consistency and potentially intron retention. Fourth, I investigated the impact of demographic history on genomic signatures of selection at linked sites (linked selection) (Paper IV). This study revealed that neutral genetic diversity in C. grandiflora, which has a stable effective population size, is influenced by linked selection, whereas in Arabidopsis lyrata, which underwent a recent and strong bottleneck, neutral diversity is mainly affected by population size change. In summary, this thesis offers new insights into the determinants of gene expression variation, selection on genomic features linked to altered gene expression, and the effect of demographic history on patterns of linked selection in Brassicaceae.
