Bioinformatic methods in rare disease genomics

Abstract: The overarching goal of medical genetics is to map genotype to phenotype and to understand how genomic variation affects human health. Rare disease genomics commonly rests on a Mendelian assumption: one disease, one variant. This is a simplification; in practice it means that when we observe the phenotype of a rare disease patient, we suspect that one or two genetic variants in a single gene cause the disease. At first this may sound like a simple problem, especially compared to other fields of genomics, such as cancer and common disease, where multiple unrelated loci are expected to act together to cause the biological state. However, finding this variant among the few million variants that each individual carries in their genome can be a daunting task. This thesis focuses on the problem of finding the causative variants in patients with suspected rare inherited disorders, although some of the tools and methods are applicable in other areas as well. Many challenges arise in sequencing analysis as the amount of data grows, requiring the development of novel methods and algorithms to handle and interpret the massive amounts of data. A whole genome sequencing experiment produces hundreds of millions of short sequence reads for a single individual. These reads are mapped to a reference genome, and the positions and regions that differ from the reference are identified, or “called”, as variants. The variants are annotated with as much relevant information as possible, so that prediction algorithms and humans can determine which variant, or small number of variants, among the millions identified is pathogenic in a particular genomic or phenotypic context. This thesis was written in parallel with the establishment of a genomics platform in the Stockholm region, set up to provide the hospitals with state-of-the-art genome analysis. 
The tools and methods developed during these years were immediately implemented and tested in a production setting. In this thesis I illustrate the field of Clinical Genomics from different perspectives, ranging from the components of a rare disease analysis pipeline to the integration of whole genome sequencing in a clinical setting via a close-up case study.
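
To make the prioritization step described above more concrete, the following is a minimal Python sketch of how annotated variants might be narrowed down under the simplified "one or two variants in one gene" assumption. It is an illustration only and not the pipeline developed in this thesis: the `Variant` record, its field names, the `candidate_genes` function, and the 1% population frequency cutoff are all hypothetical choices made for the example, and only a simple recessive pattern is considered.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical annotated variant; field names are illustrative only."""
    gene: str
    position: int
    alt_alleles: int        # 1 = heterozygous, 2 = homozygous alternative
    population_freq: float  # allele frequency in a reference population


def candidate_genes(variants, max_freq=0.01):
    """Return genes whose rare variants fit a simple recessive pattern.

    A gene qualifies if its rare variants carry at least two alternative
    alleles in total: one homozygous variant, or two or more heterozygous
    variants (possible compound heterozygosity; phase is not checked).
    The default max_freq of 1% is an assumed cutoff, not a recommendation.
    """
    rare_by_gene = defaultdict(list)
    for v in variants:
        if v.population_freq <= max_freq:
            rare_by_gene[v.gene].append(v)

    candidates = {}
    for gene, rare in rare_by_gene.items():
        if sum(v.alt_alleles for v in rare) >= 2:
            candidates[gene] = rare
    return candidates


# Made-up example data: gene names, positions and frequencies are invented.
EXAMPLE_VARIANTS = [
    Variant("GENE_A", 100, 2, 0.0001),  # rare, homozygous
    Variant("GENE_B", 200, 1, 0.0005),  # rare, single heterozygous
    Variant("GENE_C", 300, 1, 0.002),   # rare, heterozygous
    Variant("GENE_C", 350, 1, 0.0),     # rare, heterozygous, same gene
    Variant("GENE_D", 400, 2, 0.3),     # common, homozygous
]

print(sorted(candidate_genes(EXAMPLE_VARIANTS)))  # ['GENE_A', 'GENE_C']
```

In a real analysis the annotation would include far more than population frequency (for example predicted consequence and inheritance in the family), and a dominant model would need separate handling; the sketch only shows how the Mendelian assumption reduces millions of variants to a short list of genes.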
