A critical assessment of methods to recover information from averaged data
Abstract
Conformational heterogeneity is key to the function of many biomacromolecules, but until recently only a few groups had tried to characterize it. Now, thanks to the increased throughput of experimental data and to increased computational power, the characterization of protein structural variability has become increasingly popular. Several groups have devoted their efforts to creating quantitative, reliable and accurate protocols for extracting such information from averaged data. We analyze here different approaches, discussing the strengths and weaknesses of each. All approaches can roughly be clustered into two groups: those satisfying the maximum entropy principle and those recovering ensembles composed of a restricted number of molecular conformations. In the first case, the solution focuses on the features that are common to all of the infinitely many solutions satisfying the experimental data; in the second case, the reconstructed ensemble shows the conformational regions where a large probability can be placed. The upper limits for conformational probabilities (MaxOcc) can also be calculated. We also give an overview of the mainstream experimental observables, with considerations on the assumptions underlying their usage.
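The contrast between the two groups of approaches can be illustrated with a minimal sketch. It is not the protocol of any method assessed in the article. It assumes a single averaged observable, five hypothetical conformers with invented predicted values, a uniform prior and data fitted exactly with no experimental error. All names (`predicted`, `measured_average`, `maxent_weights`, `two_state_weights`, `max_occ`) are illustrative assumptions. The sketch builds a maximum entropy ensemble, one sparse two-conformer ensemble that reproduces the same average, and the largest weight each conformer could take while the average is still matched, in the spirit of MaxOcc.

```python
import math

# Hypothetical toy data (invented): one observable predicted for five
# conformers, and the measured ensemble average of that observable.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
measured_average = 2.5


def weighted_average(values, weights):
    return sum(v * w for v, w in zip(values, weights))


def exponential_weights(values, lam):
    """Normalized weights proportional to exp(-lam * value)."""
    exponents = [-lam * v for v in values]
    top = max(exponents)  # subtract the maximum for numerical stability
    raw = [math.exp(e - top) for e in exponents]
    total = sum(raw)
    return [r / total for r in raw]


def maxent_weights(values, target, lo=-50.0, hi=50.0, iterations=200):
    """Maximum entropy weights (uniform prior) matching one average.

    The average decreases monotonically as lam grows, so the Lagrange
    multiplier lam can be located by bisection.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if weighted_average(values, exponential_weights(values, mid)) > target:
            lo = mid  # average still too high: a larger lam is needed
        else:
            hi = mid
    return exponential_weights(values, 0.5 * (lo + hi))


def two_state_weights(values, target):
    """One sparse solution: mix only the two extreme conformers."""
    low, high = min(values), max(values)
    w_high = (target - low) / (high - low)
    weights = [0.0] * len(values)
    weights[values.index(low)] = 1.0 - w_high
    weights[values.index(high)] = w_high
    return weights


def max_occ(values, target, index):
    """Largest weight conformer `index` can take while the remaining
    weight, placed on the most favourable partner, restores the average.
    Assumes min(values) <= target <= max(values)."""
    f = values[index]
    if f == target:
        return 1.0
    partner = max(values) if f < target else min(values)
    return (partner - target) / (partner - f)


maxent = maxent_weights(predicted, measured_average)
sparse = two_state_weights(predicted, measured_average)
upper_limits = [max_occ(predicted, measured_average, i)
                for i in range(len(predicted))]

print("maximum entropy weights:", [round(w, 3) for w in maxent])
print("two-conformer weights:  ", sparse)
print("upper limits per conformer:", [round(u, 3) for u in upper_limits])
```

Both ensembles reproduce the same average, which is the underdetermination the abstract refers to. The maximum entropy solution spreads weight over every conformer, the sparse solution concentrates it on two, and the upper limits bound what any compatible ensemble could assign to a single conformer in this toy setting.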
- This article is part of the themed collection: Exploring the conformational heterogeneity of biomolecules: theory and experiments