Hierarchical dynamics in allostery following ATP hydrolysis monitored by single molecule FRET measurements and MD simulations

We report on a study that combines advanced fluorescence methods with molecular dynamics (MD) simulations to cover timescales from nanoseconds to milliseconds for a large protein. This allows us to delineate how ATP hydrolysis in a protein causes allosteric changes at a distant protein binding site, using the chaperone Hsp90 as test system. The allosteric process occurs via hierarchical dynamics involving timescales from nano- to milliseconds and length scales from Ångstroms to several nanometers. We find that hydrolysis of one ATP is coupled to a conformational change of Arg380, which in turn passes structural information via the large M-domain α-helix to the whole protein. The resulting structural asymmetry in Hsp90 leads to the collapse of a central folding substrate binding site, causing the formation of a novel collapsed state (closed state B) that we characterise structurally. We presume that similar hierarchical mechanisms are fundamental for information transfer induced by ATP hydrolysis through many other proteins.

After spectral separation pinholes with a diameter of 150µm refined the detection volume to 8fL. Finally, the two photon streams were separated by polarizing beam splitters into their parallel and perpendicular parts and recorded by single-photon detectors (two SPCM-AQR-14, PerkinElmer and two PDMseries APDs, Micro Photon Devices). Time-correlated single photon counting with picosecond resolution and data collection was performed by a smFRET data analysis.
Donor and acceptor photon streams were collected into 1 ms bins. Bursts were selected by a minimum threshold of 100 photons. For every burst the FRET efficiency E and a stoichiometry S was determined as detailed in Ref. 3 . FRET populations were determined within the region of 0.3 < S < 0.7 of corrected E vs S plots. From the FRET efficiency an apparent donor-acceptor distance R E between the two dyes is determined as 3 where R 0 denotes the Förster radius. We use the efficiency-averaged apparent distance R E , because we assume that during a burst the complete accessible volume of the respective dye is sampled homogeneously, i.e., E is already an average efficiency. R E can be calculated by the FPS software 4 as is visualized in Fig. S2. Note that R E is also used as a variable for the average from many efficiency averaged bursts, which is only equivalent to R E in Equation 1 if protein dynamics is negligible.
For every FRET label pair, the Förster radius R 0 was calculated from the donor quantum efficiency Q D , the spectral overlap J, and the relative dipole orientation factor κ 2 Here the dipole orientation factor is assumed to be κ 2 = 2/3 for isotropic coupling. The position-specific donor quantum yield Q D is given in terms of in dependence of the measured lifetime τ D as well as the parameters Q D0 and lifetime τ D0 specified by the manufacturer 5 as Based on the efficiency distributions (see Fig. 2, S3 and S4) and the respective R E distribution, we determined the expectation value µ of the distance between the two dyes and the apparent distance fluctuation σ via a Photon Distribution Analysis (PDA). 6,7 To that end, we fitted a shot-noise-broadened sum of Gaussian distance distributions to our E histogram, and performed an axis transformation from E-space to R E -space via Equation (1). Specifically, we transform only the bin edges. The Gaussian distance fraction of each histogram bin is obtained by integrating from the left to the right edge and extracting the mean distance.
Additionally, the E-specific shot-noise contribution is calculated for each bin, which can be described by a beta-function. We derive its parameters from the number of photons per burst, direct excitation and leakage similar to Ref. 7 . The final convolution is gained by simply summing all bin-wise shot-noise beta-functions weighted with the value of the Gaussians for each bin. This essentially assumes a distance delta-function (no protein dynamics) for each bin that undergoes a broadening by shot-noise, which simplifies the convolution greatly.
The amplitudes, mean values µ and width σ Gauss of each Gaussian are then optimised by minimising the squared residuals between data and fit.
Uncertainties in single-molecule experiments.
The widths of the shot-noise-broadened Gaussian fits is caused by shot-noise and structural fluctuations of the protein itself. While σ Gauss is therefore not directly related to an uncertainty of the mean value µ, in our case it is a good indication for the uncertainty of the fit. In addition, in Fig. 4  χ 2 red values were calculated by varying one lifetime while all other fit parameters were fixed to their fit minimum. For τ 1 we obtained a range of 13% in which χ 2 red did not change more than 0.01. For τ 2 we obtained 7%. These values agree well with the standard deviations obtained in the first case.
E vs. τ D(A) analysis was performed for the stoichiometry-filtered burst subset of the Hsp90 closed state region. The bursts are smeared along the dynamic FRET line which is evidence for millisecond dynamics of Hsp90 closed state A and B. We emphasise that the dynamic line is not the result of a fit, but an analytic solution. [9][10][11] Comparing the measured data to these limiting cases, Fig. 2d  accounts for differences of the IRF shift in the parallel and perpendicular channel and the G-factor considers differences of parallel and perpendicular detection efficiencies. to Barth et al. 12 we calculated DonxDon, FRETxFRET and DonxFRET using the PAM software 8 and Matlab R2017a. We tested several fit models, also including triplet terms (see below of an overview of the tested models). In a first approach we fixed the triplet fraction and triplet relaxation time to values that were separately determined to reduce the number of free parameters. We measured free Atto550 and free Atto647N dyes in solution and applied fit model A which contains a simple 3D diffusion term and a triplet term. However, as triplet characteristics are sensitive towards dye environments and further experimental parameters (e.g. laser excitation power), it is essentially preferable to derive them from the experiments directly. Therefore, we additionally calculated acceptor acceptor correlations (par x perp) for each data set. Obtained triplet times τ T were in the range of 1-10µs with fractions around 5% which is consistent with previous studies 13 . This was also expected because we used sufficiently low laser powers which were within the linear range of a laser power vs. count rate plot (not shown here).

FRET-FCS analysis
As the triplet fraction is very low, we proceeded similar to Barth et al. 12 and we fit the experimental data without a triplet term but with a kinetic term instead (model B). We obtained a fast kinetic rate which is on the same timescale as the triplet rate but with a fraction of more than 10%. We concluded, that an additional process on this timescale must be present which is not related to the triplet state of the dyes used. Finally, best fits were obtained in model C, which includes two kinetic terms and no triplet term. Thereby we obtained a second slower kinetic rate on the hundred microsecond timescale. Note, that the fact, that we can fit DonxDon, FRETxFRET and DonxFRET with model C whereas AccxAcc is well described by model A, is a negative control, because for AccxAcc one would not expect to see conformational dynamics.
Based on the idea of separating conformational dynamics from diffusive dynamics 11 we fit DonxDon, FRETxFRET and DonxFRET with globally linked relaxation times τ K and τ L .
We obtained diffusion times τ D in the range of 2-4 ms, depending on the analysed FRET pair and the nucleotide condition. As aggregates impede correct interpretation of correlation data we applied the software-integrated algorithm for filtering 8 . To stay consistent we applied the same aggregate filter parameters for all data sets (threshold 40, time window 10, add window 3).
Tested fit models: Assuming a gaussian shape of the confocal volume we set the geometric factor γ to 1/ √ 8.
N is the average number of particles in the confocal volume. T and τ T are triplet fraction and triplet relaxation time, respectively. Diffusion is characterized by the diffusion time τ D .
τ K , τ L with fractions K and L describe two relaxation times with corresponding fractions, respectively. The setup-related factor ρ describes the ratio between the axial and lateral diameter of the confocal volume and was determined beforehand. It was measured before each experiment by scanning of matrix-immobilized polymer beads which are smaller than

Anisotropy criterion
Sufficient rotational freedom of protein-coupled FRET dyes is a prerequisite for reliable distance measurements and can be measured by time-resolved anisotropy experiments 5 .
Nucleotide-and subpopulation-specific time-resolved anisotropies were determined for each FRET pair. We did so by identifying all photons of a population from the 2D ES plot, histogramming their microtimes and calculating the anisotropies r(t): Here, the subscripts DD and AA denote photons from the green channel after green excitation and photons from the red channel after red excitation, respectively. Polarisation of the photons is indicated by the superscripts and ⊥. I is the number of photons in the respective microtime bin and g a detection correction parameter for green (g G ) and red (g R ) detection. The combined residual anisotropies were calculated as a geometric mean of the donor and acceptor residual anisotropies 5 : We determined the combined residual anisotropies based on the FRET-subpopulations (r a c ), as well as based on the donor-and acceptor-only subpopulations (r b c ). We decided to show both values because r a c suffered from weak statistics. Although r b c can be biased towards lower values due to remaining free dyes in solution, we believe that this is an important information, especially as the amount of free dye is very low. All values are shown in Supplementary Table 1.

Protein modeling.
The yeast wild type Hsp90 dimer model was created by applying MODELLER 14 to one Hsp90 monomer (chain A) from the yeast Hsp90 crystal structure (PDB ID 2CG9) 15 to add missing loops and revert point mutations. The dimer was then reconstituted in vmd 16 by copying the full length monomer and aligning both monomers with the protein backbone of 2CG9. ATP was introduced into the binding site according to the coordinates in 2CG9. We need to point out that this crystal structure does contain ATP coordinates, but was actually crystallized using AMPPNP. For modeling structures with bound ADP and phosphate, we manually introduced a geometry change of the ATP γ-phosphate as it should appear in a S N 2 nucleophilic attack by a water molecule. The resulting ADP + Pi complex thus represents a structure immediately after the hydrolysis reaction and before any relaxation of the protein.

MD simulations and data analysis.
All simulations of the Hsp90 dimer water were carried out using Gromacs 2016 (Ref. 17) using the Amber99SB*ILDN-parmbsc0χ OL3 + AMBER99ATP/ADP force field, 18 which is an extension of AMBER99SB*ILDN [19][20][21] and contains improved parameters of ATP/ADP, 22,23 glycosidic torsions 24 and magnesium. 25 We modelled P i as H 2 PO 4 -, as would be expected from the addition of a water molecule to P γ . Missing H 2 PO 4 and AMPPNP parameters were generated with antechamber 26 and acpype 27 using GAFF atomic parameters 28 and AM1/BCC charges 29 based on a protocol we have used before. 30 Minimum angles involving the N-H group between P β and P γ were derived from an AMPPNP structure minimised at the B3LYP/6-31G* level using Gaussian09. 31 The quality of AMPPNP simulation parameters was checked by comparison of a AMPPNP structure minimised in vacuo with the quantum mechanically minimised structure.
The simulation system consisted of a dodecahedral box of 17.5 nm side length filled with ca. 120,000 TIP3P water molecules. 32 Sodium and chloride ions were added to result in a charge neutral box with a 154 mM ion solution. We used a 2 fs time step and constrained hydrogen bonds by the LINCS algorithm. 33 Electrostatics were described by the particle mesh Ewald (PME) method. 34 Cutoffs were set to 1 nm for van der Waals interactions and a minimum of 1 nm for PME real space. Simulations were carried out in a NPT ensemble with the temperature set to 310 K and the pressure to 1 bar. Temperature control was achieved by the Bussi velocity rescaling thermostat 35  conPCA builds a correlation matrix with distance variances σ i and σ j which after diagonalisation yields n eigenvectors e (n) that are aligned with the maximal correlation within the data set, and n eigenvalues v (n) that determine the contribution of eigenvector n to the overall correlation. The principal components P C n are then obtained via the projection P C n = e (n) · d.
Visualization of molecular data was performed with vmd. 16 For nonequilibrium targeted molecular dynamics simulations, 39 Comparing smFRET data to MD simulation data.
Usually, FRET experiments and MD simulations are compared by converting dye distances into C β -atom distances, using geometric arguments concerning linker lengths and flexibility. Since the latter usually involves ill-defined assumptions, we instead compared measured and calculated R E . That is, we directly calculate the expected R E for each simulations snapshot using the FPS software, 4 which analyses the volume a dye can access within the linker length around a given C β -atom. 3,41 This yields the R E distributions shown in Fig. 3a, which can be directly compared to the mean experimental distance µ.
The approach assumes isotropic averaging of dipoles during FRET, which can be verified via the low combined anisotropy of the FRET dyes (see Tab. S1 and Ref. 3 ). Even in case of partial anisotropic averaging, this approach is preferential over simple C β estimation, as the volumes accessible to the fluorophores significantly depend on the structures appearing during the MD simulation.

SUPPLEMENTARY TABLES smFRET distance measurements
Tab. S 1. Results from smFRET and anisotropy measurements. For each FRET pair ('donor position'-'acceptor position') and nucleotide condition Förster radii R 0 , fractions of the closed states A and B, mean fluorophore distances µ of closed states A and B, uncertainties σ and combined residual anisotropies determined from FRET populations r a C and donor-and acceptor-only populations r b C are summarised. Note, that for one distance in principle two smFRET experiments can be examined which is due to swapping of the donor-and acceptor-dye position. Swapping of donor-and acceptor-dyes can have a small effect on the Förster radius because of its environmentsensitivity. As seen from the table, we performed these measurements for some distances as a check for self-consistency. For details on the determination of the combined residual anisotropies, please refer to the SI Methods part C. For details on the error determination, please refer to the Methods section in the main text.

FRET
Nucleotide  .0 ml, respectively. The first one is located within the exclusion volume of the column and therefore probably related to aggregates. We identify the second one as labelled Hsp90 dimers. The SEC profiles show, that amount of aggregates is very low with respect to the one of labelled Hsp90 dimers. Note, that in experiment, burst threshold and a stoichiometry filter further assure that aggregates are not involved in the data analysis. Right: ATPase activity test for the FRET pair 142-597 (all other FRET pairs have been tested and published before 5 ). Hsp90 heterodimers were obtained by heating the respective homodimers, D142C and A597C, at 1:1 ratio for 43min at 43 • C which promotes exchange of the monomers. Subsequent spin down (2h, 4 • C, 16.9g) was performed to remove potential aggregates. The ATPase assay was performed at 37 • C according to previous ATPase tests 42,43 . Absorbance at 340nm was monitored on a Lambda35 UV-VIS spectrometer (Perkin Elmer). 2 mM ATP was added to 0.2mM NADH, 10 u/ml lactate dehydrogenase, 6 u/ml pyruvate kinase and phosphoenolpyruvate solved in 40 mM HEPES, 150 mM KCl and 10 mM MgCl 2 . The Hsp90 heterodimer 142-597 was added after the signal was stable.
To determine the ATPase background, the reaction was stopped by radiciol (R2146-1MG, Sigma Aldrich) which specifically inhibits the ATPase activity of Hsp90. The ATP-turnover rate was determined from the slope of a linear fit to the decay of the signal after protein addition. We took the average from three tests and obtained k Hsp90 =(2.1±0.2) min −1 which agrees well with previously determined values 5 . Here, the C β -C β distance between position 452 and 298 is shown exemplarily for the 2CG9 structure (black dashed line). In order to account for the difference, for each structural MD snapshot the expected R FPS E is calculated by the FPS Software. 4 Here, this is shown at the 2CG9 structure (purple line). Parameters used to calculate the accessible volumes of the dyes (blue and magenta spheres) were taken from Ref. 3  Considering the latter parameters, we arrive at the same distances for swapped dye pairs. In that manner we can exclude positions-specific photophysics-based uncertainties.   [44][45][46] confirms that ATP stabilizes the closed state, and its hydrolysis is needed for the formation of the open state. d E-S plot of the asymmetric E33-WT dimer with 2mM ATP. In contrast to the symmetric E33A-E33A variant, Hsp90 is mainly found in the open state. The capability of Hsp90 to almost fully reopen is evidence for a single ATP hydrolysis to suffice for the reopening.  The obtained AccxAcc correlation can be fit with model A, which includes simple 3D diffusion and a term for triplet kinetics only. The fact, that an additional kinetic term to fit AccxAcc is not necessary provides evidence, that we indeed see protein dynamics in a-d. Bottom: FRET-FCS fit results for FRET pair 298-452 of Hsp90. Shown are the relaxation times for the fast (orange) and the slow kinetic mode (purple), τ K and τ L , respectively and the corresponding weights of the kinetic modes K and L for different nucleotide conditions. For each weight three data points are shown which are the weights obtained from the DonxDon correlation (green cycle), from the FRETxFRET correlation (red diamond) and from the FRETxDon correlation (blue square). We could not observe a significant trend with respect to the nucleotide present. See the paragraph "FRET-FCS analysis" on SI p. 6 for a more detailed discussion.  The obtained AccxAcc correlation can be fit with model A, which includes simple 3D diffusion and a term for triplet kinetics only. The fact, that an additional kinetic term to fit AccxAcc is not necessary provides evidence, that we indeed see protein dynamics in a-d. Bottom: FRET-FCS fit results for FRET pair 327-452 of Hsp90. Shown are the relaxation times for the fast (orange) and the slow kinetic mode (purple), τ K and τ L , respectively and the corresponding weights of the kinetic modes K and L for different nucleotide conditions. For each weight three data points are shown which are the weights obtained from the DonxDon correlation (green cycle), from the FRETxFRET correlation (red diamond) and from the FRETxDon correlation (blue square). As for the FRET pair shown before, we could not observe a significant trend with respect to the nucleotide present. See the paragraph "FRET-FCS analysis" on SI p. 6 for a more detailed discussion   Protein structure as transparent cartoon, Arg380 and ATP as sticks, magnesium ion as van der Waals sphere. Volumes with a minimal residence probability P ≥ 0.4 for a water molecule to be found within 500-1000 ns of free MD simulation ("water densities") 48 as grey isosurfaces. b Comparison of position of Arg380 in Hsp90 (PDB ID 2CG9) 15 and Lys307 in MutL (PDB ID 1B63) 49 .