Accurate modeling of protein structures by homology

Biasini, Marco. Accurate modeling of protein structures by homology. 2013, PhD Thesis, University of Basel, Faculty of Science.


Official URL: http://edoc.unibas.ch/diss/DissB_10597


Proteins are macromolecules that play a crucial role in virtually every process in the living cell. Determining the three-dimensional structure of a protein is key to understanding its function and mode of action. Preferably, the structure is solved by an experimental technique such as X-ray crystallography, nuclear magnetic resonance (NMR), or electron microscopy (EM). In many instances, however, experimental structures are unavailable or cannot be readily determined. Computational modeling techniques, e.g. comparative modeling, fill this gap and produce structures at a fast pace. State-of-the-art methods are capable of generating models that are accurate down to the level of side chains. These models are a useful tool for designing experiments, e.g. site-directed mutagenesis, for virtual screening, and for identifying proteins of similar function. Despite recent advances, comparative modeling still has substantial room for improvement in many areas. In this thesis, we develop techniques that address some of the shortcomings of today's methods. As a solid foundation for this work, the OpenStructure software framework was developed, which makes it convenient to implement new methods and to integrate them seamlessly with existing programs.
Computational modeling often requires comparisons of models and/or template structures. Standard structure similarity measures, such as RMSD and GDT, are based on global superposition of structures, and their results are not meaningful when applied to structures exhibiting domain movements. For unsupervised comparison of structures on a large scale, a similarity measure based on internal distances was developed, which is largely insensitive to domain movements. In analogy to the global distance test (GDT), the similarity measure is referred to as the local distance difference test (lDDT).
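The idea of a superposition-free score based on internal distances can be illustrated with a minimal sketch. It is not the thesis implementation: it uses one point per residue, and the function name `lddt_sketch`, the 15 Å inclusion radius and the tolerance thresholds (0.5, 1, 2 and 4 Å) are assumptions chosen for illustration, since the abstract does not state them.

```python
import math
from itertools import combinations


def lddt_sketch(reference, model, inclusion_radius=15.0,
                thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Fraction of local reference distances that are preserved in the model.

    `reference` and `model` are equally long lists of (x, y, z) tuples, one
    point per residue (an assumption; a full implementation works on atoms).
    Only pairs closer than `inclusion_radius` in the reference are scored,
    so no superposition of the two structures is needed.
    """
    if len(reference) != len(model):
        raise ValueError("reference and model must have the same length")

    preserved = 0
    total = 0
    for i, j in combinations(range(len(reference)), 2):
        d_ref = math.dist(reference[i], reference[j])
        if d_ref >= inclusion_radius:
            continue  # long-range pair: ignored, e.g. across domains
        d_mod = math.dist(model[i], model[j])
        diff = abs(d_ref - d_mod)
        total += len(thresholds)
        preserved += sum(diff < t for t in thresholds)
    return preserved / total if total else 0.0


# Two small rigid "domains" that are far apart in the reference.
domain_a = [(0, 0, 0), (3, 0, 0), (0, 3, 0), (0, 0, 3)]
domain_b = [(30, 0, 0), (33, 0, 0), (30, 3, 0), (30, 0, 3)]
reference = domain_a + domain_b

# Model 1: domain B is shifted as a rigid body (a domain movement).
moved_b = [(x, y + 10, z) for x, y, z in domain_b]
domain_move_score = lddt_sketch(reference, domain_a + moved_b)

# Model 2: a single residue inside domain A is misplaced (a local error).
distorted_a = [(0, 0, 0), (6, 0, 0), (0, 3, 0), (0, 0, 3)]
local_error_score = lddt_sketch(reference, distorted_a + domain_b)

print(domain_move_score, local_error_score)
```

In this toy case the rigid shift of one domain leaves every scored distance unchanged, whereas the misplaced residue lowers the score. A score based on global superposition would penalize the domain shift as well.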
A critical step of template-based modeling is the selection of suitable template structures. For well-characterized protein families, 20 or more alternative experimental template structures are often available. While all templates may share a similar overall topology, the relative orientation of sub-domains often differs significantly. Such intrinsic movements limit the assignment of consistent structural constraints for the comparative modeling step. An efficient and robust procedure is proposed to identify stable structural building blocks in ensembles of structures using contact-overlap map consistency (COM).
The ability of a structural model to answer a particular biological research question is determined by its accuracy. Since models may contain substantial errors, reliable quality estimates are fundamental to determining their usefulness. We develop techniques to assign quality estimates to models, which expand on the typical potential of mean force (PMF) formalism used in the field. By relating the model's PMF energy to the energies of experimental structures, we obtain a Z-score that indicates whether the model is of comparable quality to experimentally determined structures. In a second scoring function, the PMF scores are complemented with distance restraints from evolutionarily related experimental structures. These restraints help to discriminate between correct and incorrect folds and greatly improve the accuracy of the scoring function.
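The Z-score idea can be sketched as follows. This is only an illustration under stated assumptions: the function name `pmf_z_score` is hypothetical, the energies are made-up numbers rather than results, and the use of the sample standard deviation of a plain reference set is an assumption. The abstract does not specify how the reference set is chosen or whether energies are normalized, for example by protein size.

```python
from statistics import mean, stdev


def pmf_z_score(model_energy, reference_energies):
    """Express a model's energy in standard deviations from the reference mean.

    `reference_energies` are PMF energies of experimental structures
    (assumed to be comparable to the model, e.g. of similar size).
    A value near 0 means the model scores like a typical experimental
    structure; a large positive value means a less favorable energy.
    """
    mu = mean(reference_energies)
    sigma = stdev(reference_energies)  # sample standard deviation (assumption)
    if sigma == 0:
        raise ValueError("reference energies have zero spread")
    return (model_energy - mu) / sigma


# Made-up illustrative energies, not data from the thesis.
experimental = [-1.0, -2.0, -3.0]
typical_model = pmf_z_score(-2.0, experimental)
poor_model = pmf_z_score(0.0, experimental)
print(typical_model, poor_model)
```

Normalizing against experimental structures makes scores comparable across proteins, which raw PMF energies are not.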
A novel modeling pipeline for the SWISS-MODEL expert system for comparative modeling is presented. For template and model selection, the pipeline builds on the scoring functions developed in this thesis and combines them with probability-based reliability estimates. The pipeline is embedded in a new web interface that leverages the capabilities of modern web browsers to perform the modeling interactively.
Finally, computational models are often improved by incorporating experimental restraints, e.g. from electron density maps, proteomics cross-links, or mutation studies. Likewise, at resolutions worse than 2.5 Å, X-ray density maps are often insufficiently defined to allow completely automated model building and can benefit from the incorporation of computational techniques. We explore the application of computational sampling techniques to automated model building with ARP/wARP at low resolution, with the aim of improving model completeness and reducing fragmentation.
Advisors: Schwede, Torsten
Committee Members: Torda, Andrew
Faculties and Departments: 05 Faculty of Science > Departement Biozentrum > Computational & Systems Biology > Bioinformatics (Schwede)
Item Type: Thesis
Thesis no: 10597
Number of Pages: 210
Last Modified: 30 Jun 2016 10:54
Deposited On: 09 Dec 2013 15:42