Atomic Level Characterisation of Millisecond-Time Scale Protein Motions through a Combined Molecular Simulations and NMR Approach

Advances in biomolecular sciences are closely linked to our ability to chart the energy landscapes of biomolecules with atomic details. Here we validate a new paradigm to characterise thermodynamics and kinetics of millisecond timescale conformational transitions between ground state and transient excited states in the enzyme cyclophilin A (CypA). We describe a novel methodology that combines molecular dynamics simulations and Markov State modelling with NMR measurements to provide atomic-level insights into the nature of CypA transient conformational states. The computed conformational ensembles also enabled the predictive design and experimental validation of a single-site mutant that dramatically perturbs millisecond timescale loop motions, converting a CypA excited state into the ground state. The resulting models open up new horizons for targeting CypA with inhibitors and pave the way towards rational design of protein energy landscapes for protein engineering and drug discovery purposes.


3
The structural characterization of conformational changes between long-lived lowenergy (ground) states and short-lived sparsely populated (excited) states in biological macromolecules remains a major unsolved challenge in structural biology. Despite advances in integrating data from multiple biophysical techniques, 1-12 atomistic descriptions of the conformational changes that link the ground and excited states of a protein remain scarce. New approaches to provide such descriptions could transform the fields of drug discovery and protein engineering. [13][14][15] While X-ray diffraction provides atomic resolution structures, it struggles to resolve lowly populated states (ca. <20%), and time-resolution of conformational motions has been possible in only a handful of special cases. 16,17 Molecular dynamics (MD) simulations provide atomic resolution, but their accuracy is limited by the approximate nature of potential energy functions . MD also only readily characterizes transient states that exchange with a ground state within few microseconds, and efforts to surpass this timescale are generally accompanied by loss of temporal resolution. [18][19][20][21] relaxation experiments provide temporal resolution on a broad range of timescales but lack spatial resolution in comparison with X-ray diffraction or molecular modelling techniques. [22][23][24][25] Here we implement and validate an approach that combines MD, NMR and X-ray crystallography, to characterize in atomic detail a millisecond timescale conformational change in the human enzyme cyclophilin A (CypA). [26][27][28] Extensive biophysical studies have previously sought to characterize its diverse millisecond timescale motions using X-ray and NMR, but a detailed atomistic description of the conformational changes involved has remained elusive. [29][30][31][32][33] CypA is a major drug target, but atomic level description of its ground state has proven insufficient to develop isoform selective CypA inhibitors. [34][35][36][37][38][39][40][41] The structural ensembles reported herein give key insights into the 4 nature of a millisecond conformational change in CypA, and these guided the design and experimental validation of a single-site mutant that converts a transient state into a ground state, thereby unlocking new opportunities for CypA inhibitors design efforts.

An experimentally validated ensemble of Cyclophilin A conformations
The workflow outlined in Fig. 1a was applied to CypA. Previous NMR results had identified a set of residues displaying significant chemical-exchange contribution (R ex > 4 s -1 ) to the transverse relaxation rate R 2 . 29 The majority of those residues clustered in the region between Asp66 and Gly75, herein referred to as the 70s -loop. Accelerated molecular dynamics (aMD) simulations were initiated from the X-ray structure of wild type CypA that entailed enhanced motions specifically in the vicinity of the 70s -loop while maintaining the integrity of the overall fold (Fig 1a, step I and SI for details.) 42,43  aMD-biased simulations successfully generated large-scale rearrangements of the 70s -loop as well as of the loop between residues Ala101 and Gln111 (hereafter referred 6 to as the 100s -loop) on timescales of a few hundred ns . To recover the equilibrium properties of the aMD-generated conformational changes, CypA snapshots that were representative of the conformational transition were pooled with snapshots from the initial and final conformational states sampled during the aMD simulations. These The conformational ensemble was cross-validated by back-calculation of a set of NMR observables that included standard and exact NOEs, 3 J-coupling measurements 8 and Residual Dipolar Couplings (RDCs). 6,44 Observables were also back-calculated from a 40-structures CYANA NMR ensemble, 31 a 20 structures DYANA NMR ensemble, 45 and a representative CypA X-ray structure. 46 The MD ensemble was found to reproduce about 90% and 95% of derived proton-proton distances from eNOE and NOE measurements, which was comparable to both NMR ensembles and slightly better than the X-ray structure. 3 J-coupling values were also computed for backbone HN-HA, HN-C, and HN-CB pairs (Fig 2b). HN-HA RMSDs values for the MD ensemble are slightly above those reported in benchmark studies of smaller proteins, 47,48 and above those obtained with the CYANA ensemble. The DYANA ensemble that was not refined against the particular set of 3 J coupling values used here shows poorer agreement. The RMSD for 3 J coupling values derived from the X-ray structure is similar to the one obtained from the MD ensemble. There is little difference in accuracy for HN-C and HN-CB pairs between the MD and NMR ensembles, whereas the X-ray structure is slightly worse. Finally, the RDCs are better reproduced with the CYANA ensemble, whereas the X-ray structure, MD and DYANA ensembles are of similar accuracy (Fig   2c). Overall the accuracy of the MD ensemble, which does not include any refinement step against NOEs, 3 J-coupling or RDCs, was deemed encouraging, and further insights into loop motion mechanisms were sought by detailed inspection of the MSM.

Atomic-level insights in millisecond timescale CypA loop motion mechanisms
A MSM description of CypA 70s-and 100s-loop dynamics is presented in Fig. 2d.
For ease of interpretation, the MSM states were further grouped into a coarser, five-state description (see SI for details). The most populated macro-state (orange in Fig. 2d) resembles the conformation observed in most X-ray structures reported for CypA, with   The well dispersed HSQC spectra recorded for the D66A mutant indicated the protein was natively folded (Fig. S6). A chemical shift perturbation (CSP) analysis , with respect to CypA, indicated strikingly larger changes for the D66A mutant compared to 14 the previously reported H70A mutant (Fig. 4a, Table S1). The largest deviations are observed in the 70s -loop region with CSP values of up to 9.0 ppm. Additional perturbations occur around residues 50 and 110 that flank the 70s-loop region. These measurements are in good agreement with the predicted CSP patterns obtained through back-calculation of chemical shifts from the computed ensembles for CypA, D66A and H70A (Fig. 4b). Moreover, the neighbour corrected intrinsically disordered protein (ncIDP) web server was used to compare chemical shift changes with respect to random coil values (Fig 4c). The collective displacement of chemical shifts in the 70s -loop towards random-coil values supported the MD predicted order-disorder transition.
Further corroboration was provided by co-crystallization and X-ray diffraction experiments on CypA and D66A in complex with inhibitor 1 (Table S2). 36 Comparison of electron density profiles revealed that the 70s-loop adopts an ordered and closed conformation in the CypA:1 complex (Fig 4d) while in the D66A:1 complex electron density in the 70s-loop region is largely absent (Fig. 4e), consistent with a predominantly disordered conformation. The electron density for other parts of the protein and for 1 was similar, suggesting that the conformational change is restricted to the 70s-loop region. Moreover, the ITC estimated K d value of 1 is ca. 12-fold weaker for D66A than for CypA ( Fig. 4f and 4g) suggesting that the order-disorder transition of the 70s-loop weakens binding of compounds to the Abu and Pro pockets without completely disrupting the CypA active site. This is in line with the conservation of binding mode between CypA and D66A observed for 1 by X-ray crystallography.
The narrow difference between effective correlation time for internal motions of a large number of residues and global tumbling times for both CypA and D66A precluded the obtention of generalized S 2 order parameters to characterize differences in ps -ns timescale motions between CypA and D66A (Fig. S7) D66A are consistently lower than for CypA in the 70s-loop region, even though the global tumbling times are similar (9.1 and 9.2 ns respectively). This suggests increased ps -ns time scale motions for D66A (Fig. 4h), an observation that is in line with the back-calculated NOE values, which show a significant dip in the 70s -loop region for D66A (Fig. 4i).
CPMG measurements turned out to be insufficient to reliably fit intermediate exchange parameters for D66A. However additional R 1 experiments provided sufficient data to enable a combined fit of the R 1 and CPMG measurements (Fig. S7).
The data for CypA and D66A could be fitted in both cases to a two-state exchange model with similar k ex values (ca. 2000 s -1 ) for CypA and D66A, but with significant differences in excited state populations (p b ca. 0.5% for D66A and ca. 2% for CypA).
The fit also enabled determination of the magnitude of the chemical shift changes  for residues showing significant dispersions for CypA, and the magnitude and sign of chemical shift changes for D66A.
Remarkably the pattern and magnitude of between ground state and excited state is dramatically different between CypA and D66A (Fig. 5a, 5b, Table S3). Major differences are also apparent in the comparison of measured R ex values for CypA and D66A, which shows that the large dispersions measured for residues in the 70s-loop in CypA have been completely quenched in D66A (Fig. 5c, Table S4) while R ex contributions can be measured in D66A for residues outside the 70s region . This suggests that D66A may undergo different millisecond exchange processes. CPMG dispersion profiles for individual residues (Fig. 5d) further support this hypothesis .
Residues in the 70s-loop region (e.g. Asp/Ala66; Gly74) that exhibited strong dispersion in CypA are no longer sensitive to CPMG pulses in D66A, whereas other residues show more pronounced or similar dispersions elsewhere in the structure (e.g. Gly80, Asn87). Furthermore, the difference in chemical shifts measured between CypA and D66A proteins in their ground states correlates well with the difference in chemical shifts between CypA ground and excited states (Fig. 5e). This provides additional evidence that the exchange process involving the 70s -loop in CypA is an order-disorder transition. By contrast the differences in chemical shifts measured between CypA and D66A ground states do not correlate with the differences in chemical shifts between D66A ground and excited states (Fig. 5f). Altogether the data suggests that the exchange process measured for D66A is unrelated to an order-disorder transition of the 70s -loop region. between CypA ground and excited states as determined fro m exchange para meters . f Sa me as e but for differences in chemical shifts between D66A and CypA as observed by HSQC measure ments, and between ground and excited states of D66A as determined from exchange parameters.

Discussion
The present results demonstrate comprehensive description of specific millisecondtimescale processes in proteins by combining molecular simulations with NMR measurements. The resulting conformational ensemble for CypA describes a range of NMR observables with accuracy comparable to that of NMR structure-refinement protocols. Yet there are remarkable structural differences. The DYANA ensemble does not display large amplitude conformational changes of the 70s-loop or the 100s -loop. 45 Conformational variability in the CYANA ensemble is mostly in the 70s-loop region, 31 but the closed state structures in that ensemble do not resemble that observed in the Xray structure. By contrast the present MD-derived ensemble suggests that both the 70sloop and 100s-loop undergo large amplitude motions. The source of these discrepancies may be linked to the nature of the NMR measurements used to validate the ensembles.
Indeed, back-calculated NOE, J-coupling and RDC values from a single X-ray structure are only slightly less accurate than those obtained with the various NMR and MD ensembles (Fig. 2a-c) Our results open the door to in silico generation of such hypotheses for manipulating thermodynamics and kinetic parameters in a protein ensemble, information that is lacking in traditional workflows for NMR-based structure calculations. There has been much debate about the relationship between millisecond timescale motions in CypA and catalysis. 51 The present results suggest that the previously reported millisecond timescale process involving the 70s -loop region in CypA is a local order-disorder transition that does not dramatically perturb the shape of the active site, as evidenced by X-ray and ITC measurements on a CypA-ligand complex. These results are in line with previously reported observations that millisecond time scale motions of the 70s-loop do not appear to be linked with the catalytic cycle. 52,53 Previous work has shown that while human CypA only weakly restricts the HIV-2 virus, mutations in the 70s-loop region of RhTC confer potent HIV-2 restriction. 49 The proposed mechanism involved alternative conformations of the 70s-loop thought to be inaccessible to CypA. 50 Our results suggest instead that CypA's energy landscape readily permits exploration of large scale rearrangement of the 70s-loop; thus detailed characterisation of transient conformational states emerge as a powerful strategy to understand how mutations in protein sequence may evolve new function.
Alternative conformational states of the 70s -loop in CypA also offers prospects for the design of next-generation inhibitors that could potentially overcome isoformselectivity concerns. 54,55 The D66A mutant offers a template for such efforts and further work to validate the mechanism of order-disorder transition in the 70s -loop of other cyclophilin isoforms is warranted. Engineering methodologies based on antibodies have been developed to trap transient conformational states and facilitate ligand discovery efforts. [56][57][58] Our results suggest that molecular simulations may be used enhance the effectiveness of conformational trapping technologies.
The ability to describe millisecond timescale conformational changes in proteins by MD simulations is shedding new light on NMR relaxation measurements. The interpretation of such experiments is traditionally performed by fitting data to two or three states phenomenological models. By contrast MSMs constructed from MD datasets typically use a much larger number of states to describe protein dynamics. In the present case there is an apparent discrepancy between CPMG-derived populations for the CypA excited state (ca. 2%, Table S3) Table   S5) that are in closer agreement with the NMR measurements. This suggests that twostate CPMG analyses may underestimate the populations of minor states owing to a degree of degeneracy of chemical shifts with respect to different protein conformations.
Overall this work has demonstrated that molecular simulation methodologies may now be combined with NMR to offer atomic-level details about millisecond-timescale conformational changes in proteins. Such ability is expected to be widely enabling for protein structure-dynamics-function relationships studies, and for unlocking new opportunities in bioengineering or drug discovery.

Methods
Detailed simulation and experimental methods are provided in the SI.