Elucidating the principles of gene regulation in the human genome

Abstract: Gene regulation is largely controlled by two groups of cis-regulatory elements: (i) promoters, which initiate stable mRNAs, and (ii) enhancers, which generate bidirectionally transcribed, unstable enhancer RNAs (eRNAs). While promoters tend to be ubiquitously active, enhancers are transcribed in a strongly cell-type-specific manner. Enhancers also contain the vast majority of known noncoding genetic variants. Therefore, it is essential to identify enhancers with high sensitivity in each tissue of interest. In paper I, we developed a new method, Native Elongating Transcripts using Cap Analysis of Gene Expression (NET-CAGE). This method allowed us to study transcription from the 5' ends of both mRNAs and eRNAs with high sensitivity and at single-nucleotide resolution in cell lines and tissues. By monitoring transcriptional dynamics following cellular stimulation, we revealed that enhancer-promoter pairs are generally activated simultaneously. A comparison of mRNAs and eRNAs across five cell lines showed that promoters are transcribed more ubiquitously, while enhancers are more cell-type specific. In paper II, we used NET-CAGE to study the role of the DUX4 gene in human embryonic genome activation (EGA). We discovered thousands of new bidirectionally transcribed enhancers that are specifically active in DUX4-induced human embryonic stem cells. Furthermore, we identified the enhancers that potentially regulate ZSCAN4, a functionally significant gene in EGA. We also found that the chromatin architecture is thoroughly reorganized following DUX4 induction, and that both accessible regions and transcribed enhancers are associated with the ERVL-MaLR repeat element. In paper III, we studied the differences between promoters and enhancers in the context of transcription factors (TFs), evolution, and genetic variation in human retinal pigment epithelium (RPE) cells. We showed that RPE enhancers are enriched for AT-rich TFs, whereas promoters are enriched for non-specific TFs such as the GC-rich SP family. A larger fraction of enhancers than of promoters or coding exons is primate-specific. Enrichment of disease-associated SNPs was significantly higher in enhancers than in promoters. These SNPs also overlapped TF motifs, potentially disrupting TF binding at enhancers.
