On Transcriptome Sequencing

University dissertation from Stockholm : KTH

Abstract: This thesis is about the use of massive DNA sequencing to investigate the transcriptome. During recent decades, several studies have made it clear that the transcriptome comprises a more complex set of biochemical machinery than was previously believed. The majority of the genome can be expressed as transcripts, and overlapping and antisense transcription is widespread. New technologies for the interrogation of nucleic acids have made it possible to investigate such cellular phenomena in much greater detail than ever before. For each application, special requirements need to be met. The work presented in this thesis focuses on the transcriptome and the development of technology for its analysis. In paper I, we report our development of an automated approach for sample preparation. The procedure was benchmarked against a publicly available reference data set, and we note that our approach outperformed similar manual procedures in terms of reproducibility. In the work reported in papers II-IV, we used different massive sequencing technologies to investigate the transcriptome. In paper II, we describe a concatemerization approach that increased throughput by 65% using 454 sequencing, and we identify classes of transcripts not previously described in Populus. Papers III and IV both report studies based on SOLiD sequencing. In the former, we investigated transcripts and proteins for 13% of human genes and detected a massive overlap for the upper 50% of transcriptional levels. In the work described in paper IV, we investigated transcription in non-genic regions of the genome and detected expression from a large number of previously unknown loci.