Computational studies of mutational sequence signatures in cancer genomes

Abstract: Cancer typically forms when mutational processes modify key cancer driver genes, resulting in positive selection and tumor growth. As such, mutational processes are at the core of the disease. Trinucleotide mutational signatures have emerged in the last decade as essential tools for the analysis of mutational processes. These models describe the relative probability of mutagenesis at different trinucleotide contexts for a variety of mutational processes. In UV-exposed cancers, the DNA sequence “TTCCG” constitutes UV mutation hotspots in active promoters, which cannot be represented by trinucleotide-based mutational signatures. In the first study, we expand the mutational signature of UV and demonstrate that its trinucleotide profile depends on cytosine methylation, and that this stems from increased cyclobutane pyrimidine dimer (CPD) formation at methylated sites. We also show that incorporating longer sequence patterns into the signature model better describes the UV mutational process. Furthermore, we show that such extended signature models increase accuracy when separating driver mutations from passengers. The strongest effect on mutation probability came from TTCCG in expressed promoters, but other sequence patterns also significantly modulate mutation probability in UV-exposed melanoma. In the second study, we build on this concept and develop a bioinformatics tool capable of estimating the effect of longer sequence patterns on mutation rates in conjunction with trinucleotide contexts. We then apply the tool to 27 cancer types to explore sequence patterns with modulating effects on mutation frequencies. Homopolymer patterns were found to have the strongest effects, but pentamers of higher complexity were also found to increase mutation rates in multiple cancers. Finally, in the last study we analyze the mutational properties of small intestine neuroendocrine tumors (SI-NETs).
Mutational signature analysis reveals a lack of mutagenic processes active in these tumors, consistent with their notably low mutation burden and low frequency of driver mutations. The most striking result of this study is that, despite the multifocal nature of this cancer type (multiple tumors in close proximity), each tumor has evolved independently. In summary, this thesis demonstrates the utility of mutational signatures and highlights novel approaches to signature analysis that incorporate longer sequence patterns.
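
To make the two central notions of the abstract concrete, the following is a minimal Python sketch of (a) how a substitution can be assigned to one of the conventional 96 pyrimidine-normalized trinucleotide channels and (b) how one might check whether a mutated position falls inside a longer pattern such as TTCCG on either strand. It is an illustration only and not the tool developed in the thesis; the function names (`trinucleotide_channel`, `overlaps_pattern`) and the toy mutations are hypothetical and invented for this example.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def trinucleotide_channel(ref_context: str, alt: str) -> str:
    """Label a substitution by its pyrimidine-normalized trinucleotide context.

    ref_context is the reference 3-mer centered on the mutated base and
    alt is the mutant base. Events with a purine (A or G) at the center
    are mapped to the opposite strand, which yields the conventional
    96 channels, written for example as T[C>T]C.
    """
    ref_context, alt = ref_context.upper(), alt.upper()
    if ref_context[1] in "AG":
        ref_context, alt = revcomp(ref_context), revcomp(alt)
    five, ref, three = ref_context
    return f"{five}[{ref}>{alt}]{three}"


def overlaps_pattern(window: str, center: int, pattern: str) -> bool:
    """Return True if `pattern`, on either strand, covers index `center`."""
    window = window.upper()
    for motif in {pattern, revcomp(pattern)}:
        start = window.find(motif)
        while start != -1:
            if start <= center < start + len(motif):
                return True
            start = window.find(motif, start + 1)
    return False


# Toy input (invented): pairs of reference 3-mer and mutant base.
toy_mutations = [("TCC", "T"), ("GGA", "A"), ("ACG", "T")]
spectrum = Counter(trinucleotide_channel(ctx, alt) for ctx, alt in toy_mutations)
```

In this sketch, the first two toy mutations describe the same event seen from opposite strands, so they are counted in the same channel. A trinucleotide channel alone cannot distinguish a T[C>T]C mutation inside TTCCG from one elsewhere, which is the limitation that the extended signature models described above address.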
