Exploring the multiscale 3D architecture of human chromosome contact networks

Abstract: Cells regulate genes to coordinate essential functions that allow them to grow, divide, specialize, and respond to stress. While regulatory proteins are the most common means of controlling these genes, the DNA’s 3D structure also plays a critical role, as it affects how proteins access genes and how regulatory DNA elements interact over large genomic distances. This thesis explores the latter aspect of gene regulation by mapping DNA’s multiscale 3D architecture and exploring the within-scale variability. To study these aspects, we analyzed empirical DNA-DNA contact data from a technique known as Hi-C, which measures the contact frequency between pairs of points on DNA. To infer multiscale 3D DNA structures from this data set, we adopted and developed a community detection framework that finds groups of interconnected DNA regions. Rooted in network science, this approach allowed us to study the DNA’s ensemble-averaged 3D organization while embracing its complexity and variability. In this work, we mapped DNA’s multiscale 3D architecture and demonstrated how our community detection algorithm charts the structural scales in regimes that are often opaque to other computational tools. We also addressed several specific research questions. First, we explored cross-scale 3D structures, quantifying the extent to which their interactions are hierarchical. Second, we determined the scales at which the 3D structures appear most robust and quantified the DNA’s structural ambiguities. Third, we explored the association between DNA’s 3D architecture and epigenetic states. Finally, we demonstrated how our framework applies to another type of DNA contact data (HiChIP), which may help to better understand spatial rearrangements in cancer cells.