A Rigorous and Efficient Method To Reweight Very Large Conformational Ensembles Using Average Experimental Data and To Determine Their Relative Information Content

Leung, Hoi Tik Alvin and Bignucolo, Olivier and Aregger, Regula and Dames, Sonja A. and Mazur, Adam and Bernèche, Simon and Grzesiek, Stephan. (2016) A Rigorous and Efficient Method To Reweight Very Large Conformational Ensembles Using Average Experimental Data and To Determine Their Relative Information Content. Journal of Chemical Theory and Computation, 12 (1). pp. 383-394.

Full text not available from this repository.

Official URL: http://edoc.unibas.ch/41205/



Flexible polypeptides such as unfolded proteins may access an astronomical number of conformations. The most advanced simulations of such states usually comprise tens of thousands of individual structures. In principle, a comparison of parameters predicted from such ensembles to experimental data provides a measure of their quality. In practice, analyses that go beyond the comparison of unbiased average data have been impossible to carry out on the entirety of such very large ensembles and have, therefore, been restricted to much smaller subensembles and/or nondeterministic algorithms. Here, we show that such very large ensembles, on the order of 10⁴ to 10⁵ conformations, can be analyzed in full by a maximum entropy fit to experimental average data. Maximizing the entropy of the population weights of individual conformations under experimental χ² constraints is a convex optimization problem, which can be solved in a very efficient and robust manner to a unique global solution even for very large ensembles. Since the population weights can be determined reliably, the reweighted full ensemble presents the best model of the combined information from simulation and experiment. Furthermore, since the reduction of entropy due to the experimental constraints is well-defined, its value provides a robust measure of the information content of the experimental data relative to the simulated ensemble and an indication for the density of the sampling of conformational space. The method is applied to the reweighting of a 35 000 frame molecular dynamics trajectory of the nonapeptide EGAAWAASS by extensive NMR ³J coupling and RDC data. The analysis shows that RDCs provide significantly more information than ³J couplings and that a discontinuity in the RDC pattern at the central tryptophan is caused by a cluster of helical conformations. Reweighting factors are moderate and consistent with errors in MD force fields of less than 3 kT. The required reweighting is larger for an ensemble derived from a statistical coil model, consistent with its coarser nature. We call the method COPER, for convex optimization for ensemble reweighting. Similar advantages of large-scale efficiency and robustness can be obtained for other ensemble analysis methods with convex targets and constraints, such as constrained χ² minimization and the maximum occurrence method.
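
As an illustration of the idea in the abstract, the following is a minimal sketch and not the authors' COPER implementation. It assumes a single experimental observable with value exp_value and uncertainty sigma, a uniform prior over conformations, and the constraint χ² = (⟨pred⟩ − exp_value)² / sigma² ≤ chi2_max. Under these assumptions the maximum entropy weights have the exponential form w_i ∝ exp(−λ·pred_i), and the single multiplier λ can be found by bisection because the weighted average decreases monotonically with λ. The function names maxent_reweight and relative_entropy and all parameter names are hypothetical; the general case with many observables needs a proper convex solver.

```python
import math


def maxent_reweight(pred, exp_value, sigma, chi2_max=1.0, tol=1e-12, max_iter=200):
    """Toy maximum entropy reweighting for ONE observable (assumed setup).

    pred      : predicted value of the observable for each conformation
    exp_value : experimental average, sigma: its uncertainty
    Returns weights that sum to 1 and satisfy chi2 <= chi2_max.
    """
    n = len(pred)
    prior_mean = sum(pred) / n
    half_width = sigma * math.sqrt(chi2_max)
    # Closest ensemble average to the prior mean that still meets the constraint.
    target = min(max(prior_mean, exp_value - half_width), exp_value + half_width)
    if target == prior_mean:
        return [1.0 / n] * n  # the unbiased ensemble already fits the data
    if not min(pred) < target < max(pred):
        raise ValueError("constraint cannot be met by reweighting this ensemble")

    def weights(lam):
        expo = [-lam * p for p in pred]
        shift = max(expo)  # subtract the maximum to avoid overflow
        unnorm = [math.exp(e - shift) for e in expo]
        z = sum(unnorm)
        return [u / z for u in unnorm]

    def mean(lam):
        return sum(w * p for w, p in zip(weights(lam), pred))

    # mean(lam) decreases with lam (its derivative is minus the variance),
    # so first bracket the root, then bisect.
    lam_lo, lam_hi = -1.0, 1.0
    while mean(lam_lo) < target:
        lam_lo *= 2.0
    while mean(lam_hi) > target:
        lam_hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lam_lo + lam_hi)
        if mean(mid) > target:
            lam_lo = mid
        else:
            lam_hi = mid
        if lam_hi - lam_lo < tol:
            break
    return weights(0.5 * (lam_lo + lam_hi))


def relative_entropy(w):
    """Entropy relative to the uniform prior: 0 if unchanged, negative otherwise."""
    n = len(w)
    return -sum(wi * math.log(wi * n) for wi in w if wi > 0.0)


if __name__ == "__main__":
    pred = [0.0, 1.0, 2.0, 3.0, 4.0]  # made-up predictions, illustration only
    w = maxent_reweight(pred, exp_value=3.0, sigma=0.5)
    print("weights:", w)
    print("entropy reduction:", relative_entropy(w))
```

In this sketch the value returned by relative_entropy plays the role of the entropy reduction described in the abstract: the more negative it is, the more the data had to reshape the simulated ensemble.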
Faculties and Departments: 05 Faculty of Science > Departement Biozentrum > Services Biozentrum > Research IT (Podvinec)
05 Faculty of Science > Departement Biozentrum > Structural Biology & Biophysics > Structural Biology (Grzesiek)
UniBasel Contributors: Grzesiek, Stephan and Leung, Hoi Tik Alvin and Bignucolo, Olivier and Bernèche, Simon and Mazur, Adam and Podvinec, Michael
Item Type: Article, refereed
Article Subtype: Research Article
Publisher: American Chemical Society
Note: Publication type according to Uni Basel Research Database: Journal article
Identification Number:
Last Modified: 18 Nov 2016 09:18
Deposited On: 23 Aug 2016 08:07
