Interrogation of Nucleic Acids by Parallel Threading

University dissertation from Stockholm : KTH

Abstract: Advancements in the field of biotechnology are expanding the scientific horizon, and a promising era of personalized medicine for improved health is envisioned. The amount of genetic data is growing at an ever-escalating pace due to the availability of novel technologies that allow massively parallel sequencing and whole-genome genotyping, supported by advancements in computer science and information technologies. As the amount of information stored in databases throughout the world grows and our knowledge deepens, genetic signatures of significant importance are discovered. Such a set, surfacing from the data mining process, may include causative or marker single nucleotide polymorphisms (SNPs) that reveal predisposition to disease, or gene expression signatures that profile a pathological state. When targeting a reduced set of signatures in a large number of samples for diagnostic or fine-mapping purposes, efficient interrogation and scoring require appropriate preparations. These needs are met by miniaturized and parallelized platforms that allow low sample and template consumption.

This doctoral thesis describes an attempt to tackle some of these challenges through the design and implementation of a novel assay denoted Trinucleotide Threading (TnT). The method permits multiplex amplification of a medium-sized set of specific loci and was adapted to genotyping, gene expression profiling and digital allelotyping. Utilizing a reduced number of nucleotides permits specific amplification of targeted loci while preventing the generation of spurious amplification products. This method was applied to genotype 96 individuals for 75 SNPs. In addition, the accuracy of genotyping from minute amounts of genomic DNA was confirmed. This procedure was performed using a robotic workstation running custom-made scripts, and a software tool was implemented to facilitate the assay design.
Furthermore, a statistical model was derived from the molecular principles of the genotyping assay, and an Expectation-Maximization algorithm was chosen to call the generated genotypes automatically. The TnT approach was also adapted to profiling signature gene sets for the Swedish Human Protein Atlas Program. Here, 18 protein epitope signature tags (PrESTs) were targeted in eight different cell lines employed in the program, and the results demonstrated high concordance rates with real-time PCR approaches. Finally, an assay for digital estimation of allele frequencies in large cohorts was set up by combining the TnT approach with a second-generation sequencing system. Allelotyping was performed by targeting 147 polymorphic loci in a genomic pool of 462 individuals. Subsequent interrogation was carried out on a state-of-the-art massively parallelized Pyrosequencing instrument. The experiment generated more than 200,000 reads, and with bioinformatic support the clonally amplified fragments and their corresponding sequence reads were converted to a precise set of allele frequencies.
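
The abstract does not describe how sequence reads were converted to allele frequencies. As a rough illustration of the general idea of digital allelotyping only, the sketch below assumes that each read has already been assigned to a locus and an allele, and estimates each allele's frequency as its share of the reads at that locus. The function name, the input layout and the locus identifiers are hypothetical and are not taken from the thesis.

```python
from collections import Counter, defaultdict


def allele_frequencies(read_calls):
    """Estimate per-locus allele frequencies from read-level calls.

    read_calls: iterable of (locus_id, allele) pairs, one per sequence read.
    This input layout is an assumption made for illustration.
    """
    # Count how many reads support each allele at each locus.
    counts = defaultdict(Counter)
    for locus, allele in read_calls:
        counts[locus][allele] += 1

    # Convert the counts at each locus into fractions of that locus's reads.
    freqs = {}
    for locus, allele_counts in counts.items():
        total = sum(allele_counts.values())
        freqs[locus] = {a: n / total for a, n in allele_counts.items()}
    return freqs


# Hypothetical example data: six reads covering two made-up loci.
reads = (
    [("locus_demo_1", "A")] * 3
    + [("locus_demo_1", "G")] * 1
    + [("locus_demo_2", "C")] * 2
)
result = allele_frequencies(reads)
```

In a pooled sample, read fractions approximate population allele frequencies only if amplification and sequencing are unbiased across alleles; a real pipeline would also need read filtering and error handling, which this sketch omits.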
