Deep learning solutions to protein quaternary structure

Abstract: Interactions between proteins are directly involved in most biological processes and are essential for the correct functioning of every form of life. Protein-protein interactions can give rise to functional assemblies comprising hundreds of protein chains. Given the enormous complexity of protein interactions and their pivotal role in life’s mechanics, the need to understand these mechanisms fully is as great as the challenge of doing so. Over the last few decades, experimental procedures have steadily improved, dramatically increasing the structural data available for protein interactions. Unfortunately, experimental methods require considerable time and resources and cannot always be applied with the same degree of success. Several computational methods have been developed in parallel with experimental procedures to overcome these limitations. This thesis therefore focuses on screening existing computational methods and adapting them to improve the overall accuracy of structure prediction for protein complexes. In the first paper, I propose a simple rigid-body docking framework to test several interface predictors and their ability to drive a protein-protein docking procedure. In the second paper, I present a method that adapts the trRosetta deep neural network to predict inter-residue distance and dihedral angle constraints for full protein complexes. The same concept is improved in the third paper with FoldDock, an adaptation of AlphaFold2 that takes multiple protein sequences as input and produces the corresponding complex. Finally, in the fourth paper, the FoldDock pipeline is applied to a large dataset of pairwise protein interactions derived from the hu.MAP and HuRI datasets, resulting in the characterization of more than 3000 high-confidence structural models.