Ensemble methods for protein structure prediction

University dissertation from Stockholm: Department of Biochemistry and Biophysics, Stockholm University

Abstract: Proteins play an essential role in virtually all of life's processes. Their function is tightly coupled to the three-dimensional structure they adopt.

Solving protein structures experimentally is a complicated, time- and resource-consuming endeavor. Given the rapid growth in the number of known protein sequences, it is very likely that only a small fraction of known proteins will ever have their structures solved experimentally. Recently, computational methods for protein structure prediction have become increasingly accurate and promise to bridge this gap.

In this work, we show how the rapidly growing amount of available biological data can be used to improve the accuracy of protein structure prediction. We discuss the use of multiple sources of structural information to improve the quality of predicted models. We also discuss methods for estimating the quality of predicted models. In particular, we present a novel approach to clustering-based quality assessment that runs nearly 50 times faster than other methods of comparable accuracy, which makes it possible to tackle much larger problems.

Additionally, this thesis discusses the impact that recent breakthroughs in sequencing, and the consequent rapid growth of sequence data, have had on the prediction of residue-residue contacts. We propose a novel methodology that predicts such contacts with previously unattained accuracy. These contacts can in turn be used to guide protein modeling, making it possible to determine protein structures that have been unattainable by conventional prediction methods.

Finally, a considerable part of this dissertation discusses the community efforts in protein structure prediction, as embodied by the CASP (Critical Assessment of protein Structure Prediction) experiments.
