Open Access Article
This Open Access Article is licensed under a
Creative Commons Attribution 3.0 Unported Licence

Dynamic design: manipulation of millisecond timescale motions on the energy landscape of cyclophilin A

Jordi Juárez-Jiménez*a, Arun A. Gupta a, Gogulan Karunanithyb, Antonia S. J. S. Meya, Charis Georgiou§ a, Harris Ioannidisa, Alessio De Simone a, Paul N. Barlowa, Alison N. Hulmea, Malcolm D. Walkinshawc, Andrew J. Baldwinb and Julien Michel*a
aEaStCHEM School of Chemistry, University of Edinburgh, David Brewster Road, Edinburgh, EH9 3FJ, UK. E-mail:;
bDepartment of Chemistry, Physical and Theoretical Chemistry Laboratory, University of Oxford, South Parks Road, Oxford, OX1 3QZ, UK
cSchool of Biological Sciences, Michael Swann Building, Max Born Crescent, Edinburgh, EH9 3BF, UK

Received 18th September 2019 , Accepted 14th January 2020

First published on 15th January 2020

Proteins need to interconvert between many conformations in order to function, many of which are formed transiently, and sparsely populated. Particularly when the lifetimes of these states approach the millisecond timescale, identifying the relevant structures and the mechanism by which they interconvert remains a tremendous challenge. Here we introduce a novel combination of accelerated MD (aMD) simulations and Markov state modelling (MSM) to explore these ‘excited’ conformational states. Applying this to the highly dynamic protein CypA, a protein involved in immune response and associated with HIV infection, we identify five principally populated conformational states and the atomistic mechanism by which they interconvert. A rational design strategy predicted that the mutant D66A should stabilise the minor conformations and substantially alter the dynamics, whereas the similar mutant H70A should leave the landscape broadly unchanged. These predictions are confirmed using CPMG and R solution state NMR measurements. By efficiently exploring functionally relevant, but sparsely populated conformations with millisecond lifetimes in silico, our aMD/MSM method has tremendous promise for the design of dynamic protein free energy landscapes for both protein engineering and drug discovery.


Capturing the complete dynamics of proteins to understand their many functions has long been a goal of structural biology. Experimentally, trapping proteins into relevant conformational states and characterising these at atomic resolution provides tremendous insight.1–3 Such static ‘snapshots’ are inherently incomplete and especially in cases where proteins are particularly dynamic, understanding function can require characterisations of conformational states that are formed only transiently, and populated as sparsely as a few per cent.4–6 These conformers are typically challenging to characterise experimentally. Solution-state NMR has proven unique in its ability to detect and structurally characterise these functionally relevant states, whose lifetimes can be on the order of a few milliseconds and are otherwise ‘invisible’ to experimental measures.6–8 Such states can play important roles in processes as diverse as protein folding, molecular recognition and catalysis, but remain challenging to characterise.9–16 To engineer proteins with new functions, and for drug discovery, there is a clear need to be able to characterise, explore and manipulate the full set of conformational states accessed by proteins, and the mechanism by which they interconvert.

Molecular dynamics (MD) simulations provide an attractive means to accomplishing this goal in silico. To characterise low amplitude structural fluctuations in local minima, simulations on timescales of a few hundred nanoseconds may be sufficient.17,18 Substantially longer simulations are required in order to sample sparsely populated ‘excited’ conformational states with millisecond lifetimes. Such a ‘brute force’ calculation is possible at present only for relatively small proteins using dedicated hardware that is not widely available.19 To make progress, two broad categories of enhanced sampling methods have been developed with the goal of increasing total effective time covered by a simulation. Predefined collective variables can be exhaustively sampled (metadynamics,20–22 umbrella sampling,23–25 steered MD26–28), or the potential energy function of the forcefield can be explicitly altered (scaled MD29–32 accelerated MD (aMD)33,34). The former class of methods require the degrees of freedom underpinning the motion of interest to be supplied in advance, which pre-supposes that the relevant motions are already known. The latter class of methods can overestimate the population of high energy regions of the free energy surface (FES) and so explore unphysical regions of conformational space.

Aiming to improve upon these methods, Markov State Modelling (MSM) based sampling approaches have emerged as an exciting method for sampling protein dynamics on millisecond timescales.2,35–41 These techniques may be used to sample conformational space using a series of relatively short MD trajectories. These trajectories are analysed with the goal of obtaining a discrete set of interconverting conformational ‘states’, where each has no knowledge of any previously visited state. This method allows a series of independently generated MD trajectories to be combined into a complete description of the energy landscape of the protein. MSM have been shown to reproduce experimental observables in model systems where the structures of the relevant states were known in advance,35,42,43 and in cases where the timescales for conformational transitions were sufficiently short to be validated with ‘brute force’ MD methods.19,37,44

In this work, we generate a protocol suitable for sampling transitions between structurally ambiguous states separated by millisecond kinetic barriers (Fig. 1b). This is achieved by harnessing the ability of aMD to efficiently explore conformational space, while relying on the MSM procedures to recover a more accurate thermodynamic and kinetic description of the system. Because this approach provides atomically detailed structural ensembles and detailed information about how they interconvert, it is well suited to both generate experimentally testable mechanistic hypotheses and enable the rational design of conformational ensembles.

image file: c9sc04696h-f1.tif
Fig. 1 Calculation the dynamic ensemble of WT CypA. (a) An X-ray diffraction structure of CypA, with the 100s and 70s loops indicated. A number of residues have been previously determined to undergo conformational exchange69,70 which is evidenced by a large value of Rex for these residues, measured by NMR. These values are consistent with the formation of transiently sampled, but sparsely populated alternative loop conformations. (b) It is desirable to determine the nature of the interconverting structures, their populations and mechanism by which they exchange. We accomplish this for CypA using a combination of aMD, MD and MSM methodologies (detailed methods). (c) A 100 microstates MSM for wild-type CypA that describes 70s and 100s loop motions. The width of the arrows is proportional to the interconversion rates. (d) The microstates were clustered into a minimal set of sub-states that together contain the relevant amplitudes of motion and timescales of state-to-state interconversion present in the full ensemble. The calculated rates and populations are indicated. Error bars on reported populations and mean first passage times (MFPT) were obtained by bootstrapping of the MD trajectories assigned to the individual microstates.

We tested the novel aMD/MSM protocol by both characterising and rationally modifying the free energy surface of human cyclophilin A (CypA). This enzyme is a human prolyl isomerase whose incorporation into new virus assemblies is essential for HIV-1 infectivity45–48 and HCV replication,49 making it a major drug target.50–57 This protein has been widely studied,58–65 revealing intricate networks of allosteric regulation,66–68 and possesses two functional loops, residues 65–77 (70s loop, Fig. 1a) and 100–110 (100s loop, Fig. 1a) that have been established by NMR to undergo substantial dynamics on the millisecond timescale.69–71 An atomistic picture of the conformational changes has proven elusive. Our aMD/MSM procedure reveals a range of conformational sub-states of CypA where both the 100s and 70s loops exchange between open and closed states (Fig. 1c and d).

To manipulate the energy surface, we implemented a design strategy based on hydrogen bonding patterns to stabilise specifically sparsely populated conformers with millisecond lifetimes (Fig. 2). This procedure led us to a previously unidentified mutation D66A. Whereas WT has the 70s loop predominantly in the closed state, consistent with the majority of X-ray structures of CypA solved to date,52,55,71–76 D66A was predicted to predominantly occupy the open state of the 70s loop. Similarly, as a negative control, a second similar mutation H70A was predicted by aMD/MSM to closely resemble the WT (Fig. 3).

image file: c9sc04696h-f2.tif
Fig. 2 Design of ensemble disrupting mutation. (a) Specific interactions made by D66 with 70s loop residues in representative MD snapshots of the closed, intermediate and open states. (b) The probability distribution of the number of intra-molecular H-bonds between D66 and 70s loop residues in the 5 sub-states. These are compared to X-ray structure PDB 1AK4 (black),72 and NMR ensemble PDB 2N0T (grey).63 (c) The difference in the average number of intra-molecular H-bonds in the 70s closed (orange/red) and 70s open (blue/purple) states. Residue D66 stands out as ‘designable’ as it has substantially more H-bonds stabilising the closed state than the open state of the 70s loop.

image file: c9sc04696h-f3.tif
Fig. 3 Conformational ensembles of mutant CypA proteins D66A and H70A. (a) MSM model for D66A. (b) MSM model for H70A. Other details as in Fig. 1c. (c) Heatmap of calculated mean first passage times (MFPT) between non-adjacent macro-states. The symbols ‘c’ and ‘o’ denote transitions between closed and open loop conformations.

Using a combination of R and CPMG NMR experiments, with X-ray crystallography we tested these predictions (Fig. 4 and 5) and verified that exactly as predicted, the 70s loop is substantially disordered in D66A, and adopts a conformation very similar to the minor conformations populated in WT ensemble. Also consistent with the predictions of aMD/MSM, the free energy landscape of H70A was largely the same as for WT.

image file: c9sc04696h-f4.tif
Fig. 4 Comparison of biophysical characterisation and MD/MSM conformational ensembles for WT CypA, D66A and H70A. (a) Binding isotherms for WT and D66A measured by ITC against an inhibitor. (b) X-ray structure of WT and D66A bound to inhibitor. Magenta meshes indicate 2FoFc electron density map at an isocontour of 1.0σ for the inhibitor and protein residues 65 to 75. (c) 1H–15N HSQC correlation spectra of WT and D66A at (10 °C), residues showing significant CSPs are highlighted. (d) Random coil index S2 values calculated from Talos+ for WT and D66A (blue) revealing increased disorder in the vicinity of the 70s loop for D66A. (e) Measured CSP per residue with respect to WT (orange) for D66A (blue) and H70A (red).70 (f) Calculated CSPs per residue from the computed ensembles. (g) Measured steady-state 15N–{1H} heteronuclear NOE transfers for WT and D66A in the vicinity of the 70s loop. (h) Calculated 15N–{1H} heteronuclear NOE transfers in the vicinity of the 70s loop from the computed ensembles.

image file: c9sc04696h-f5.tif
Fig. 5 Characterisation of μs–ms dynamics in CypA and D66A by CPMG and R NMR measurements. (a) Estimated contribution to relaxation from exchange, taken as the average of the difference between the lowest and highest frequency measurements from CPMG dispersion curves. (b) Comparison of selected dispersion profiles measured for CypA (orange) and D66A (blue) at 283 K and 600 MHz. (c) The experimentally determined chemical shift changes, Δω, from analysis of the CPMG/R curves for WT and (d) D66A. (e) Comparison of magnitude of Δω between CypA and D66A from HSQC measurements (x-axis) and from CPMG/R measurements on the WT (y-axis). The agreement between the two is remarkably good. (f) Comparison of magnitude of Δω between CypA and D66A from HSQC measurements (x-axis) and from CPMG/R measurements on D66A (y-axis). The agreement is poor revealing that the ‘excited state’ in D66A is not the ground state of WT CypA.

Overall, we demonstrate that the aMD/MSM method accurately predicts the conformational fluctuations of the challenging and dynamic protein CypA including states with millisecond lifetimes. The resulting free energy surface is sufficiently accurate to enable the design of mutants that stabilise specific conformations. aMD/MSM provides a powerful tool for not only the atomistic characterisation, but also the manipulation of the free energy surface of dynamic proteins, including controlling states with millisecond lifetimes.


Atomic-level insights in millisecond timescale CypA loop motion mechanisms

All-atom accelerated molecular dynamics (aMD) simulations33,34 were initiated from an X-ray structure of wild type CypA (PDB ID 1AK4[thin space (1/6-em)]72). In aMD conformational sampling is accelerated by adding a biasing potential to the molecular mechanics potential energy function when it falls below a threshold (ESI methods). Structures sampled from the aMD trajectory were used to seed sets of equilibrium MD trajectories, which were used for Markov state modelling (MSM) analysis. In a MSM, the protein dynamics are modelled as a set of Markov jump processes between discrete conformational states. These Markov processes can be thought of as jumps between nodes in a network, with the edges representing transition probabilities between these nodes, and where no knowledge of previously visited nodes is required to estimate the probability for a particular transition. The procedure enables reconstruction of a conformational ensemble from sets of independently run short MD trajectories that have each individually explored only a small portion of that conformational ensemble. Three rounds of MD simulations and MSM estimations were required to produce a 100 microstate MSM (Fig. 1c) that was numerically stable, corresponding to a cumulative MD sampling time of 187.5 μs. Details about model validation against experimental observables are given in the ESI (Fig. S1–S5).

To obtain physical insight into the calculated ensemble, the 100 MSM microstates were further lumped into a coarser 5-macrostate model (see ESI methods). This five state conformational landscape reveals that both the 100s and 70s loops can adopt ‘open’ and ‘closed’ conformations. While the interconversions of both loops were largely independent, the interconversions of the 100s loop was one order of magnitude faster than the 70s loop (Fig. 1d and S6), suggesting that the slower loop dynamics of the protein are largely associated with the opening and closing of the 70s loop. The most populated state for WT CypA, (100s open/70s closed, Fig. 1c and d orange, 41 ± 6%) closely resembles the conformation typically observed in X-ray structures of CypA (Fig. S7), and fluctuations within the state were small (Cα RMSF values of 2.7 Å and 1.1 Å for the 100s and 70s loops, respectively). From this state, the 100s loop can rapidly close to adopt the 100s closed/70s closed state (Fig. 1c and d, red, RMSF 2.6 Å and 1.1 Å). While these two states share a common conformation of the 70s loop, the 100s loop is in a significantly different position with values of Cα RMSD with respect to the crystal structure >4.5 Å. No major conformational fluctuations outside the 70s and 100s loop regions are observed between macrostates (Fig. S8).

These two conformational states must overcome a substantially larger kinetic barrier to adopt open conformations of the 70s loop (100s open/70s open, magenta and 100s closed/70s open, blue). In contrast to the 70s closed states (red/orange), the 70s loop in its open conformations is relatively disordered, as evidenced by RMSF values >4 Å. These two 70s open states can interconvert, with the 100s loop opening/closing in an analogous fashion to the inter-conversions observed in the 70s closed states. While fluctuations of the 100s loop are rapid and largely independent of the 70s loop conformations, the opening/closing transition of the 70s loop was more complex, passing via an intermediate fifth state (teal) where both 70s and 100s loops display positional fluctuations that are intermediate between open and closed (RMSF ca. 2.9 Å). Qualitatively, it is important for the 100s loop to simply ‘not be in the way’, as some conformations of the open 100s state can effectively block the opening of the 70s loop.

To validate these findings we sought a mutation that would stabilise the group of minor, 70s open, conformations (teal/blue/purple). The number of hydrogen bonds each residue makes to the 70s loop was assessed in both the open and closed forms (Fig. 2b). Notably, D66 was found to adopt a large number of H-bonds in the closed form, and substantially fewer in the open form (Fig. 2c), making this residue stand out as ‘designable’ and provide a target for altering the free energy landscape of CypA. The critical role of D66 in the proposed 70s loop motions mechanism is not supported by a recent NMR structural ensemble that has been proposed to describe the CypA 70s loop conformational changes measured by CPMG experiments.63 Analysis of the structures in that NMR bundle shows little changes in hydrogen bonding interactions with the 70s-loop (Fig. 2b). Further none of the structures in the NMR bundle contain a 70s-loop conformation similar to that commonly observed in high-resolution X-ray structures of CypA (Fig. S7). aMD/MSM simulations of D66A were performed as described for WT CypA (Fig. 1). The resulting ensemble predicted that the population of the previous ground state was substantially reduced (from ca. 40% to ca. 15%, Fig. 3a), and the intermediate (teal) state was increased in a compensatory fashion to become the new ground state (from ca. 5% to ca. 40%). Overall with respect to CypA the population of states in which the 70s-loop is closed/open transitioned from ca. 70[thin space (1/6-em)]:[thin space (1/6-em)]30% to 25[thin space (1/6-em)]:[thin space (1/6-em)]75%. Moreover, the mean first passage times between the conformational states were predicted to decrease by 10–15 fold with respect to wild type (Fig. 3a and c). Finally, as a negative control, we performed aMD/MSM simulations on the mutation H70A.70 The hydrogen bonding pattern for this residue was predicted to be broadly similar when the 70s loop is either open or closed (Fig. 2c). Consistent with this, aMD/MSM simulations determined that the conformational ensemble for H70A should resemble WT (Fig. 3b).

Experimental validation of the calculated ensembles

To establish the validity of mechanistic hypotheses originating from the aMD/MSM models, we sought to compare our results to experimental data. Initially, we examined binding of an inhibitor against both WT and D66A. The Kd for D66A was 12-fold weaker than for WT (Fig. 4a), indicating a substantial change in the underlying protein conformations. Crystallisation of D66A proved challenging as we might expect for a protein with increased disorder, although both WT and D66A were successfully crystallised in the presence of an inhibitor51 and their structures were solved (Table S2). Comparison of electron density profiles revealed that while the 70s-loop adopts as expected a closed conformation in the WT complex (Fig. 4b orange), electron density in the 70s loop for the D66A complex was largely absent (Fig. 4b blue), consistent with the notion that this loop occupies a predominantly disordered conformation in D66A.

To probe the conformational dynamics more directly, isotopically enriched samples of WT and D66A CypA were prepared for NMR analysis (Fig. S9 and S10). 1H–15N HSQC spectra of both proteins revealed that while both were predominantly well structured (Fig. 4c and S11), there are substantial differences in the ground-state ensembles, exactly as predicted from the simulations. On a per-residue basis, the largest deviations in observed chemical shift positions between D66A and WT were found in the 70s loop region (Fig. 4e). The changes in chemical shift were quantitatively estimated by calculating the chemical shifts of individual conformers in the aMD/MSM ensembles using the program shiftx2.77 The calculated CSP values are consistent with the experimental observation of significant changes in the 70s loop region, though the magnitude of the CSPs is underestimated (Fig. 4f).

To determine if there are any changes in fast timescale motions of the 70s loop, the random coil index (RCI) was calculated for WT and D66A from the observed chemical shifts (Fig. 4d). These confirmed that the 70s loop is substantially more disordered in D66A, as expected from the simulations. This finding was further corroborated by measurement of the H–N steady-state heteronuclear NOE, where substantially lower values were recorded in the 70s loop for the D66A versus the WT. The heteronuclear NOEs were calculated from the MD ensembles, and also found to be in reasonable qualitative agreement with the experimental measurements (Fig. 4g and h).

To further characterise the fast timescale dynamics, R2 and R1 values were additionally measured at two magnetic field strengths, and a model-free analysis was performed. Such an analysis requires successively more complex motional models to be fitted to individual residues, in order to achieve an overall self-consistent picture of the dynamics. The procedure was unable to provide a self-consistent picture of relaxation for either CypA or D66A, with large variations in parameters between adjacent residues and ‘fast’ local fluctuations measured to be comparable to the global tumbling rate (Fig. S12). This is likely because of the substantial contributions of chemical exchange to the individual measurements (vide infra). Of the measurements obtained, the heteronuclear NOE nevertheless strikes a reasonable compromise between being relatively resistant to the effects of chemical exchange, whilst still being sensitive to relatively rapid local fluctuations within the molecule (Fig. 4g and h).

While the fast ps/ns timescale dynamics indicate substantial differences in the 70s loop, the major goal was to determine if the structure of D66A resembles the excited state of WT CypA, thus validating the design strategy. A combination of R and CPMG experiments were employed to study these conformational fluctuations on the millisecond timescale (Fig. 5a, b and S13). The CPMG data for WT strongly resembles that obtained previously.69,70 The combined data were analysed globally within a modified version of the software CATIA.78 Such an analysis can ascertain a minimally complex binding model, and provide experimental measures of the kinetic and thermodynamic parameters that govern the exchange. Moreover, the analysis also provides chemical shifts (Fig. 5c and d), which characterise the structural changes that accompany the millisecond conformational rearrangements. For WT, a two-state model was found to explain well the combined dataset to within experimental error. The exchange rate kex was determined to be approximately 2200 ± 90 s−1 and a population of excited state was determined to be approximately 2.0 ± 0.1% (Table S3).

Remarkably, the chemical shift differences directly measured from WT and D66A HSQC spectra were in good agreement with the chemical shift differences between ground and excited states of WT obtained by CPMG/R measurements (Fig. 5e). This correlation strongly supports the in silico design strategy demonstrating that the ground state of D66A is similar to the ‘excited’ minor conformational state adopted by the WT, and that the design process effectively swapped the major and minor conformers.

Having established that the ground state of D66A resembles the ‘excited’ state of WT, we acquired CPMG and R data to further characterise its millisecond dynamics. The raw exchange rates Rex for D66A in the 70s loop Rex were found substantially smaller than those detected for the WT protein, and residues that had large Rex values in WT were reduced to zero in D66A (e.g. A66, G74 Fig. 5) indicating the success of the design protocol. Consistent with this, the pattern of specific residues involved with the conformational dynamics was very different (Fig. 5 and Table S3) to WT. As with WT, the data could be well explained by fitting to a 2-state model, where the exchange rate was similar to WT (kex ca. 2000 ± 350 s−1), but the population of the excited state was much smaller, very close to our detection threshold (pB ca. 0.5 ± 0.1%). Analysis of the excited state chemical shifts shows that this new state is not moving towards a particularly disordered state, nor the ground state of CypA. Instead, since a number of residues in the 80s loop were found to have relatively large Rex values in D66A this suggests that the excited state involves rearrangements of the 80s loop, outwith the scope of the present simulations that were designed to resolve millisecond timescale motions of the 70s and 100s loops. Overall the key finding is that the NMR data shows that the ‘ground’ conformational state of D66A has the 70s loop in a less structured conformation, consistent with what is expected for the ‘open’ conformations identified by MD simulations. The correlation of chemical shifts suggests that this conformation is similar to the ‘excited’ state adopted by the WT protein.


The work demonstrates that our new hybrid method aMD/MSM can efficiently provide a comprehensive description of protein free energy surfaces, including a characterisation of sparsely populated conformational states with millisecond lifetimes. The simulations are hypothesis generating and are sufficiently accurate to be amenable to rational design strategies whose goal is to adjust the relative populations of the different conformational states (Fig. 3).

It is interesting to compare different ensembles of WT CypA available in the literature. A notable ensemble of WT CypA63 was calculated recently from solution-state NMR data. Our aMD/MSM ensemble is also in good agreement with NOE, J-coupling and RDC solution-state NMR data (Fig. S3). However while both ensembles suggest large amplitude motions of the 70s loop, the NMR ensemble does not support the critical role of D66 in a conformational change between open and closed loop conformations, and is also inconsistent with the closed 70s loop conformation observed by X-ray diffraction experiments.

To further test our method, we constructed an ensemble of CypA purely from our aMD trajectories. Unlike the aMD/MSM ensemble, the resulting aMD ensemble predicts that CypA adopts predominantly a 100s open/70s open conformation (Fig. S14), a prediction inconsistent with experimental data. Taken together, the aMD/MSM ensemble is the only method that characterises the conformational exchange, revealing the presence of both open and closed conformations of 70s loop in the WT with sufficient precision to provide the means to redesign the conformational landscape.

The resulting conformational states of CypA also provide insight into function. A double mutant (D66N/R69H) of the rhesus macaque homologue TRIMCyp adopts a 70s open conformation, which has been linked to potent HIV-2 restriction.79 These observations lead to the interesting speculation that the sparsely populated open 70s conformation in human CypA plays an important and unrealised role in the pathway of HIV-2 infectivity, suggesting stabilisation of this conformation could lead to novel therapeutics.

Approaches have been developed based on using antibodies to stabilise transiently populated conformation states to facilitate ligand discovery efforts.80–82 Our results open an exciting opportunity for determining this in silico as part of rational drug design endeavours. In the specific case of CypA, current treatment methods have insufficient isoform selectivity.53,54 We envision that aMD/MSM can be used to selectively target transient states that are differently populated among different protein isoforms, hence paving the way for the rational design of isoform selective inhibitors.

There has been substantial discussion on the relationship between millisecond timescale motions in CypA and its catalytic cycle.83 Initial measurements suggested that the two could be linked, although this has been disputed by later more detailed measurements.64,68 Our results indicate that the opening and closing of the 70s loop is largely independent of the rest of the protein, supporting the notion that its fluctuations are disconnected from the active site.

There remain technical challenges when comparing aMD/MSM simulations to experimental data. A key issue is the different number of states used to explain the MD and NMR data. In the case of MD one starts with hundreds of thousands of snapshots that lead to a minimal 5-macrostates description after application of standard MSM kinetic clustering methodologies. In the case of NMR the existence of states in intermediate exchange is inferred by fitting phenomenological models to experimental data, leading almost always to a small number of states to avoid overfitting the data. What constitutes a conformational state is also fundamentally different between both methodologies. To resolve different states in NMR a difference in chemical shifts is needed, but this need not be the case in MD that uses a kinetic clustering approach.

To address this discrepancy it is necessary to calculate chemical shifts from the structural ensembles and combine them appropriately to encapsulate the effects of intermediate exchange in order to accurately capture the effects of millisecond dynamics in NMR data. In this case, we found that there is a significant degree of overlap of predicted chemical shifts for residues in the 70s loop among MD macrostates, and the major separation is identified along the opening of the 70s loop (Fig. S15). We therefore suggest that the best mapping between MD and NMR states is along this variable between closed (orange + red) and open (teal + magenta + blue) states with a lowly populated intermediate state not evident in NMR experiments. We note that chemical shift calculation algorithms such as shiftX2 are trained on static structures and not MD trajectories, and so their accuracy in such applications has not been extensively validated. Nevertheless, the broad agreement between the predicted and measured CSPs between WT/D66A and H70A/WT are encouraging. However agreement between chemical shifts values calculated for one protein calculated in isolation is less impressive, perhaps suggesting that systematic deficiencies in the chemical shift calculations are removed when one takes the difference between two simulated states. Moreover, using the expectation values for the CSPs from shiftX2 with the rate constants predicted by the MSM leads to a large overestimation of Rex values, on the order of hundreds per second. This suggests that a combination of both the populations and the kinetics together with the estimated chemical shifts are together not in perfect agreement. This can partially explained by noting that while the CPMG and R data were acquired at 10 °C, the simulations were performed at 25 °C. These technical challenges aside, it is clear that the main predictions of aMD/MSM are fundamentally correct, and the key conformational ensembles predicted are sufficiently accurate to enable rational design efforts and to generate and test structural hypotheses.

Overall we demonstrate that aMD/MSM can provide atomic-level insights into the thermodynamics and kinetics of independent conformational sub-ensembles that characterise the free energy surface of CypA. The picture is chemically intuitive, and sufficiently accurate for rational design efforts aimed at stabilising particular conformational states over others, even when the lifetimes of the individual states reaches the millisecond timescale. We anticipate that this protocol can greatly facilitate future protein structure–dynamics–function relationship studies and will unlock new opportunities in bioengineering and drug discovery.


Molecular dynamic simulations were performed using AMBER16 (ref. 84) and GROMACS5.0 (ref. 85) software packages and processed using MDTRAJ.86 MSMs were built and analysed using PyEMMA PALES88 and BME89 were used to compute RDCs, chemical shifts were calculated with ShiftX2 (ref. 77) and MultiEx.90 NMRPipe91 and Sparky 3.1 (ref. 92) were used to process all spectra. The CCP4 suite93 was used for the structure refinement of crystallographic data. Detailed simulation and experimental methods are provided in the ESI.

Conflicts of interest

There are no conflicts to declare.


Authors thank Prof. Beat Vogeli and Prof. Carlo Camilloni for kindly providing sets of NMR observables for Cyp A, and Dr Xavier Hanoulle and Prof. Guy Lippens for kindly providing the NMR residue assignment of wild type CypA. J. M. was supported by a Royal Society University Research Fellowship. The research leading to these results has received funding from the European Union's Horizon 2020 Research and Innovation Program under the Marie Sklodowska-Curie grant agreement No. 655667 awarded to J. J.-J. and from the European Research Council under the European Seventh Framework Programme (FP7/2007–2013)/ERC grant agreement No. 336289 and from EPSRC (grant no. EP/P011330/1). This project made use of time on the ARCHER UK National Supercomputing Service ( granted via the UK High-End Computing Consortium for Biomolecular Simulation, HECBioSim (, supported by EPSRC (grant no. EP/L000253/1). This work was supported by Wellcome Trust Multi-User Equipment grant 101527/Z/13/Z and the Diamond Light Source for beamtime (BAG proposal MX18515).


  1. G. M. Lee and C. S. Craik, Trapping Moving Targets with Small Molecules, Science, 2009, 324, 213–215 CrossRef CAS PubMed.
  2. S. Chen, et al., The dynamic conformational landscape of the protein methyltransferase SETD8, eLife, 2019, 8, e45403 CrossRef PubMed.
  3. G. M. Lee, T. Shahian, A. Baharuddin, J. E. Gable and C. S. Craik, Enzyme Inhibition by Allosteric Capture of an Inactive Conformation, J. Mol. Biol., 2011, 411, 999–1016 CrossRef CAS PubMed.
  4. F. A. A. Mulder, A. Mittermaier, B. Hon, F. W. Dahlquist and L. E. Kay, Studying excited states of proteins by NMR spectroscopy, Nat. Struct. Biol., 2001, 8, 932–935 CrossRef CAS PubMed.
  5. P. Vallurupalli, G. Bouvignies and L. E. Kay, Studying ‘invisible’ excited protein states in slow exchange with a major state conformation, J. Am. Chem. Soc., 2012, 134, 8148–8161 CrossRef CAS PubMed.
  6. A. J. Baldwin and L. E. Kay, NMR spectroscopy brings invisible protein states into focus, Nat. Chem. Biol., 2009, 5, 808–814 CrossRef CAS PubMed.
  7. S.-R. Tzeng and C. G. Kalodimos, Allosteric inhibition through suppression of transient conformational states, Nat. Chem. Biol., 2013, 9, 462–465 CrossRef CAS PubMed.
  8. T. Yuwen, A. Sekhar, A. J. Baldwin, P. Vallurupalli and L. E. Kay, Measuring Diffusion Constants of Invisible Protein Conformers by Triple-Quantum 1H CPMG Relaxation Dispersion, Angew. Chem., Int. Ed., 2018, 57, 16777–16780 CrossRef CAS PubMed.
  9. B. J. Grant, A. A. Gorfe and J. A. McCammon, Large conformational changes in proteins: Signaling and other functions, Curr. Opin. Struct. Biol., 2010, 20, 142–147 CrossRef CAS PubMed.
  10. D. D. Boehr, R. Nussinov and P. E. Wright, The role of dynamic conformational ensembles in biomolecular recognition, Nat. Chem. Biol., 2009, 5, 789–796 CrossRef CAS PubMed.
  11. S.-R. Tzeng and C. G. Kalodimos, Dynamic activation of an allosteric regulatory protein, Nature, 2009, 462, 368–372 CrossRef CAS PubMed.
  12. S.-R. Tzeng and C. G. Kalodimos, Protein activity regulation by conformational entropy, Nature, 2012, 488, 236–240 CrossRef CAS PubMed.
  13. J. S. Fraser, et al., Accessing protein conformational ensembles using room-temperature X-ray crystallography, Proc. Natl. Acad. Sci. U. S. A., 2011, 108, 16247–16252 CrossRef CAS PubMed.
  14. K. Henzler-Wildman and D. Kern, Dynamic personalities of proteins, Nature, 2007, 450, 964–972 CrossRef CAS PubMed.
  15. P. Vallurupalli, N. Chakrabarti, R. Pomès and L. E. Kay, Atomistic picture of conformational exchange in a T4 lysozyme cavity mutant: an experiment-guided molecular dynamics study, Chem. Sci., 2016, 7, 3602–3613 RSC.
  16. Y. Wang, E. Papaleo and K. Lindorff-Larsen, Mapping transiently formed and sparsely populated conformations on a complex energy landscape, eLife, 2016, 5, e17505 CrossRef PubMed.
  17. T. Maximova, R. Moffatt, B. Ma, R. Nussinov and A. Shehu, Principles and Overview of Sampling Methods for Modeling Macromolecular Structure and Dynamics, PLoS Comput. Biol., 2016, 12, e1004619 CrossRef PubMed.
  18. D. J. Huggins, et al., Biomolecular simulations: From dynamics and mechanisms to computational assays of biological activity, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2019, 9, e1393 Search PubMed.
  19. D. E. Shaw, et al., Atomic-Level Characterization of the Structural Dynamics of Proteins, Science, 2010, 330, 341–346 CrossRef CAS PubMed.
  20. A. Laio and M. Parrinello, Escaping Free-Energy Minima, Proc. Natl. Acad. Sci. U. S. A., 2002, 99, 12562–12566 CrossRef CAS PubMed.
  21. V. Leone, F. Marinelli, P. Carloni and M. Parrinello, Targeting biomolecular flexibility with metadynamics, Curr. Opin. Struct. Biol., 2010, 20, 148–154 CrossRef CAS PubMed.
  22. C. Camilloni, A. Cavalli and M. Vendruscolo, Replica-averaged metadynamics, J. Chem. Theory Comput., 2013, 9, 5610–5617 CrossRef CAS PubMed.
  23. J. Kästner, Umbrella sampling, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2011, 1, 932–942 Search PubMed.
  24. M. Souaille and B. Roux, Extension to the weighted histogram analysis method: combining umbrella sampling with free energy calculations, Comput. Phys. Commun., 2001, 135, 40–57 CrossRef CAS.
  25. C. Bartels and M. Karplus, Probability Distributions for Complex Systems: Adaptive Umbrella Sampling of the Potential Energy, J. Phys. Chem. B, 1998, 102, 865–880 CrossRef CAS.
  26. B. Isralewitz, M. Gao and K. Schulten, Steered molecular dynamics and mechanical functions of proteins, Curr. Opin. Struct. Biol., 2001, 11, 224–230 CrossRef CAS PubMed.
  27. S. Park and K. Schulten, Calculating potentials of mean force from steered molecular dynamics simulations, J. Chem. Phys., 2004, 120, 5946–5961 CrossRef CAS PubMed.
  28. H. Xiong, A. Crespo, M. Marti, D. Estrin and A. E. Roitberg, Free Energy Calculations with Non-Equilibrium Methods: Applications of the Jarzynski Relationship, Theor. Chem. Acc., 2006, 116, 338–346 Search PubMed.
  29. W. Sinko, Y. Miao, C. A. F. de Oliveira and J. A. McCammon, Population based reweighting of scaled molecular dynamics, J. Phys. Chem. B, 2013, 117, 12759–12768 CrossRef CAS PubMed.
  30. Y. Miao and J. A. McCammon, Unconstrained enhanced sampling for free energy calculations of biomolecules: a review, Mol. Simul., 2016, 42, 1046–1055 CrossRef CAS PubMed.
  31. K. A. Feenstra, B. Hess and H. J. C. Berendsen, Improving efficiency of large time-scale molecular dynamics simulations of hydrogen-rich systems, J. Comput. Chem., 1999, 20, 786–798 CrossRef CAS.
  32. H. Tsujishita, I. Moriguchi and S. Hirono, Potential-scaled molecular dynamics and potential annealing: effective conformational search techniques for biomolecules, J. Phys. Chem., 1993, 97, 4416–4420 CrossRef CAS.
  33. D. Hamelberg, J. Mongan and J. A. McCammon, Accelerated molecular dynamics: A promising and efficient simulation method for biomolecules, J. Chem. Phys., 2004, 120, 11919–11929 CrossRef CAS PubMed.
  34. L. C. T. Pierce, R. Salomon-Ferrer, C. A. F. de Oliveira, J. A. McCammon and R. C. Walker, Routine access to millisecond time scale events with accelerated molecular dynamics, J. Chem. Theory Comput., 2012, 8, 2997–3002 CrossRef CAS PubMed.
  35. F. Noé, H. Wu, J. H. Prinz and N. Plattner, Projected and hidden Markov models for calculating kinetics and metastable states of complex molecules, J. Chem. Phys., 2013, 139, 184114 CrossRef PubMed.
  36. B. E. Husic and V. S. Pande, Markov State Models: From an Art to a Science, J. Am. Chem. Soc., 2018, 140, 2386–2396 CrossRef CAS PubMed.
  37. S. Olsson and F. Noé, Mechanistic models of chemical exchange induced relaxation in protein NMR, J. Am. Chem. Soc., 2017, 139, 200–210 CrossRef CAS PubMed.
  38. J. D. Chodera and F. Noé, Markov state models of biomolecular conformational dynamics, Curr. Opin. Struct. Biol., 2014, 25, 135–144 CrossRef CAS PubMed.
  39. N. Plattner and F. Noé, Protein conformational plasticity and complex ligand-binding kinetics explored by atomistic simulations and Markov models, Nat. Commun., 2015, 6, 7653 CrossRef PubMed.
  40. N. Plattner, S. Doerr, G. De Fabritiis and F. Noé, Complete protein–protein association kinetics in atomic detail revealed by molecular dynamics simulations and Markov modelling, Nat. Chem., 2017, 9, 1005–1011 CrossRef CAS PubMed.
  41. R. D. Malmstrom, C. T. Lee, A. T. Van Wart and R. E. Amaro, Application of Molecular-Dynamics Based Markov State Models to Functional Proteins, J. Chem. Theory Comput., 2014, 10, 2648–2657 CrossRef CAS PubMed.
  42. J. H. Prinz, B. Keller and F. Noé, Probing molecular kinetics with Markov models: Metastable states, transition pathways and spectroscopic observables, Phys. Chem. Chem. Phys., 2011, 13, 16912–16927 RSC.
  43. S. Olsson, H. Wu, F. Paul, C. Clementi and F. Noé, Combining experimental and simulation data of molecular processes via augmented Markov models, Proc. Natl. Acad. Sci. U. S. A., 2017, 114, 8265–8270 CrossRef CAS PubMed.
  44. K. Lindorff-Larsen, P. Maragakis, S. Piana and D. E. Shaw, Picosecond to Millisecond Structural Dynamics in Human Ubiquitin, J. Phys. Chem. B, 2016, 120, 8313–8320 CrossRef CAS PubMed.
  45. M. Thali, et al., Functional association of cyclophilin A with HIV-1 virions, Nature, 1994, 372, 363–365 CrossRef CAS PubMed.
  46. E. K. Franke, H. E. H. Yuan and J. Luban, Specific incorporation of cyclophilin A into HIV-1 virions, Nature, 1994, 372, 359–362 CrossRef CAS PubMed.
  47. M. BonHomme, S. Wong, C. Carter and S. Scarlata, The pH dependence of HIV-1 capsid assembly and its interaction with cyclophilin A, Biophys. Chem., 2003, 105, 67–77 CrossRef CAS.
  48. G. J. Towers, et al., Cyclophilin A modulates the sensitivity of HIV-1 to host restriction factors, Nat. Med., 2003, 9, 1138–1143 CrossRef CAS.
  49. U. Chatterji, et al., The Isomerase Active Site of Cyclophilin A Is Critical for Hepatitis C Virus Replication, J. Biol. Chem., 2009, 284, 16998–17005 CrossRef CAS PubMed.
  50. J.-F. Guichou, et al., Structure-based design, synthesis, and biological evaluation of novel inhibitors of human cyclophilin A, J. Med. Chem., 2006, 49, 900–910 CrossRef CAS PubMed.
  51. A. Ahmed-Belkacem, et al., Fragment-based discovery of a new family of non-peptidic small-molecule cyclophilin inhibitors with potent antiviral activities, Nat. Commun., 2016, 7, 12777 CrossRef CAS PubMed.
  52. V. A. Steadman, et al., Discovery of Potent Cyclophilin Inhibitors Based on the Structural Simplification of Sanglifehrin A, J. Med. Chem., 2017, 60, 1000–1017 CrossRef CAS PubMed.
  53. Z. K. Sweeney, J. Fu and B. Wiedmann, From chemical tools to clinical medicines: Nonimmunosuppressive cyclophilin inhibitors derived from the cyclosporin and sanglifehrin scaffolds, J. Med. Chem., 2014, 57, 7145–7159 CrossRef CAS PubMed.
  54. A. De Simone, et al., A computationally designed binding mode flip leads to a novel class of potent tri-vector cyclophilin inhibitors, Chem. Sci., 2019, 10, 542–547 RSC.
  55. C. Georgiou, et al., Pushing the Limits of Detection of Weak Binding Using Fragment-Based Drug Discovery: Identification of New Cyclophilin Binders, J. Mol. Biol., 2017, 429, 2556–2570 CrossRef CAS PubMed.
  56. H. Tang, Cyclophilin Inhibitors as a Novel HCV Therapy, Viruses, 2010, 2, 1621–1634 CrossRef CAS PubMed.
  57. P. Nigro, G. Pompilio and M. C. Capogrossi, Cyclophilin A: a key player for human disease, Cell Death Dis., 2013, 4, e888 CrossRef CAS PubMed.
  58. P. K. Agarwal, A. Geist and A. Gorin, Protein Dynamics and Enzymatic Catalysis: Investigating the Peptidyl–Prolyl Cis–Trans Isomerization Activity of Cyclophilin A, Biochemistry, 2004, 43, 10605–10618 CrossRef CAS PubMed.
  59. P. Wang and J. Heitman, The cyclophilins, Genome Biol., 2005, 6, 226 CrossRef PubMed.
  60. J. Schlegel, G. S. Armstrong, J. S. Redzic, F. Zhang and E. Z. Eisenmesser, Characterizing and controlling the inherent dynamics of cyclophilin-A, Protein Sci., 2009, 18, 811–824 CAS.
  61. T. L. Davis, et al., Structural and biochemical characterization of the human cyclophilin family of peptidyl-prolyl isomerases, PLoS Biol., 2010, 8, e1000439 CrossRef PubMed.
  62. L. C. McGowan and D. Hamelberg, Conformational Plasticity of an Enzyme during Catalysis: Intricate Coupling between Cyclophilin A Dynamics and Substrate Turnover, Biophys. J., 2013, 104, 216–226 CrossRef CAS PubMed.
  63. C. N. Chi, et al., A Structural Ensemble for the Enzyme Cyclophilin Reveals an Orchestrated Mode of Action at Atomic Resolution, Angew. Chem., Int. Ed., 2015, 54, 11657–11661 CrossRef CAS PubMed.
  64. R. Otten, et al., Rescue of conformational dynamics in enzyme catalysis by directed evolution, Nat. Commun., 2018, 9, 1314 CrossRef PubMed.
  65. C. Camilloni, et al., Cyclophilin A catalyzes proline isomerization by an electrostatic handle mechanism, Proc. Natl. Acad. Sci. U. S. A., 2014, 111, 10203–10208 CrossRef CAS PubMed.
  66. M. J. Holliday, C. Camilloni, G. S. Armstrong, M. Vendruscolo and E. Z. Eisenmesser, Networks of Dynamic Allostery Regulate Enzyme Function, Structure, 2017, 25(2), 276–286 CrossRef CAS.
  67. E. Papaleo, L. Sutto, F. L. Gervasio and K. Lindorff-Larsen, Conformational Changes and Free Energies in a Proline Isomerase, J. Chem. Theory Comput., 2014, 10, 4169–4174 CrossRef CAS PubMed.
  68. P. Wapeesittipan, A. S. J. S. Mey, M. D. Walkinshaw and J. Michel, Allosteric effects in cyclophilin mutants may be explained by changes in nano-microsecond time scale motions, Commun. Chem., 2019, 2, 41 CrossRef.
  69. E. Z. Eisenmesser, D. A. Bosco, M. Akke and D. Kern, Enzyme dynamics during catalysis, Science, 2002, 295, 1520–1523 CrossRef CAS PubMed.
  70. E. Z. Eisenmesser, et al., Intrinsic dynamics of an enzyme underlies catalysis, Nature, 2005, 438, 117–121 CrossRef CAS PubMed.
  71. J. S. Fraser, et al., Hidden alternative structures of proline isomerase essential for catalysis, Nature, 2009, 462, 669–673 CrossRef CAS PubMed.
  72. T. R. Gamble, et al., Crystal structure of human cyclophilin A bound to the amino-terminal domain of HIV-1 capsid, Cell, 1996, 87, 1285–1294 CrossRef CAS PubMed.
  73. G. Kontopidis, P. Taylor and M. D. Walkinshaw, Enzymatic and structural characterization of non-peptide ligand–cyclophilin complexes, Acta Crystallogr., Sect. D: Biol. Crystallogr., 2004, 60, 479–485 CrossRef PubMed.
  74. B. R. Howard, F. F. Vajdos, S. Li, W. I. Sundquist and C. P. Hill, Structural insights into the catalytic mechanism of cyclophilin A, Nat. Struct. Mol. Biol., 2003, 10, 475–481 CrossRef CAS PubMed.
  75. D. A. Keedy, L. R. Kenner and M. Warkentin, et al., Mapping the conformational landscape of a dynamic enzyme by multitemperature and XFEL crystallography, eLife, 2015, 4, e07574 CrossRef PubMed.
  76. C. Liu, et al., Cyclophilin A stabilizes the HIV-1 capsid through a novel non-canonical binding site, Nat. Commun., 2016, 7, 10714 CrossRef CAS PubMed.
  77. B. Han, Y. Liu, S. W. Ginzinger and D. S. Wishart, SHIFTX2: Significantly improved protein chemical shift prediction, J. Biomol. NMR, 2011, 50, 43–57 CrossRef CAS PubMed.
  78. D. F. Hansen, P. Vallurupalli, P. Lundström, P. Neudecker and L. E. Kay, Probing chemical shifts of invisible states of proteins with relaxation dispersion NMR spectroscopy: How well can we do?, J. Am. Chem. Soc., 2008, 130, 2667–2675 CrossRef CAS PubMed.
  79. A. J. Price, et al., Active site remodeling switches HIV specificity of antiretroviral TRIMCyp, Nat. Struct. Mol. Biol., 2009, 16, 1036–1042 CrossRef CAS PubMed.
  80. A. D. G. Lawson, Antibody-enabled small-molecule drug discovery, Nat. Rev. Drug Discovery, 2012, 11, 519–525 CrossRef CAS PubMed.
  81. C. J. Hutchings, M. Koglin, W. C. Olson and F. H. Marshall, Opportunities for therapeutic antibodies directed at G-protein-coupled receptors, Nat. Rev. Drug Discovery, 2017, 16, 787–810 CrossRef CAS PubMed.
  82. B. G. Tehan and J. A. Christopher, The use of conformationally thermostabiliseded GPCRs in drug discovery: application to fragment, structure and biophysical techniques, Curr. Opin. Pharmacol., 2016, 30, 8–13 CrossRef CAS PubMed.
  83. S. C. L. Kamerlin and A. Warshel, At the dawn of the 21st century: Is dynamics the missing link for understanding enzyme catalysis, Proteins, 2010, 78, 1339–1375 CAS.
  84. D. A. Caseet al., Amber 16, University of California, San Francisco, 2016,  DOI:10.1002/jcc.23031.
  85. M. J. Abraham, et al., GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers, SoftwareX, 2015, 1–2, 19–25 CrossRef.
  86. R. T. McGibbon, et al., MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories, Biophys. J., 2015, 109, 1528–1532 CrossRef CAS PubMed.
  87. M. K. Scherer, et al., PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models, J. Chem. Theory Comput., 2015, 11, 5525–5542 CrossRef CAS PubMed.
  88. M. Zweckstetter, NMR: Prediction of molecular alignment from structure using the PALES software, Nat. Protoc., 2008, 3, 679–690 CrossRef CAS.
  89. S. Bottaro, T. Bengtsen and K. Lindorff-Larsen, Integrating Molecular Simulation and Experimental Data: A Bayesian/Maximum Entropy Reweighting Approach, bioRxiv, 2018, 457952.
  90. A. J. Baldwin, An exact solution for R2,eff in CPMG experiments in the case of two site chemical exchange, J. Magn. Reson., 2014, 244, 114–124 CrossRef CAS PubMed.
  91. F. Delaglio, et al., NMRPipe: A multidimensional spectral processing system based on UNIX pipes, J. Biomol. NMR, 1995, 6, 277–293 CrossRef CAS PubMed.
  92. T. Goddard and D. G. Kneller, Sparky 3, Univ. California, San Fr., 2004, vol. 14, p. 15 Search PubMed.
  93. M. D. Winn, et al., Overview of the CCP4 suite and current developments, Acta Crystallogr., Sect. D: Biol. Crystallogr., 2011, 67, 235–242 CrossRef CAS PubMed.


Electronic supplementary information (ESI) available: Detailed Materials and methods section, MSM validation procedures, LC-MS analysis for CypA and D66A, additional NMR experiments, CypA assignment, X-ray refinement statistics, exchange parameters for CypA and D66A, measured Rex values. Instructions to download the available datasets. Fig. S1–S15 and Tables S1–S5. See DOI: 10.1039/c9sc04696h
Present address: Department of Chemistry, University of Warwick, Coventry, CV4 7AL, United Kingdom
§ Present address: Department of Biomedicine, University of Bergen, 5020 Bergen, Norway.
Present address: Sygnature Discovery, Biocity, Nottingham NG1 1GR, United Kingdom.

This journal is © The Royal Society of Chemistry 2020