Experimental and computational evidence on conformational fluctuations as a source of catalytic defects in genetic diseases

Theoretical and experimental evidence has shown that protein function, regulation and degradation are intrinsically linked to the dynamic and fluctuating nature of protein ensembles. However, the effect of missense mutations on catalytic performance are often interpreted from conformational analyses derived from X-ray crystallography, molecular dynamics and modeling, while effects on conformational fluctuations at the active site as the source of catalytic defects are rarely investigated. Here, we explore the role of conformational fluctuations in the catalytic efficiency of WT and three missense mutations in the UDP-galactose 40-epimerase (GALE) protein causing type III galactosemia. Using comprehensive molecular dynamics simulations and small angle X-ray scattering we correlate low NAD binding affinity in some mutants with an increased population of non-competent conformations for NAD binding. Proteolysis studies combined with thermodynamic calculations reveal that mutations affecting GALE catalytic performance favor larger conformational fluctuations at the N-terminal domain and NAD binding site, shifting the equilibrium towards non-binding competent states in the native ensemble. Therefore, we provide a novel ensemble-based thermodynamic mechanism to explain catalytic defects caused by missense mutations that links large and transient conformational fluctuations and loss of catalytic efficiency and substrate/coenzyme affinity. In the context of this mechanism, we propose that allosteric ligands aimed at modulating these transient conformational fluctuations might correct catalytic defects in inherited metabolic diseases, representing a different approach to current small ligand therapies which target the low stability, but not catalytic defects, of mutations.


Introduction
Conformational uctuations are essential to understanding protein function, regulation and degradation. [1][2][3][4] Fluctuations may be either local or involve large changes (unfolding) of protein structure. 1,[5][6][7][8] Importantly, the role of dynamic alterations in the pathogenic mechanisms of loss-of-function genetic diseases, such as catalytic defects and protein misfolding, are poorly understood. Catalytic defects are oen identied experimentally by enzyme kinetic analyses and structurally explained by the combination of X-ray diffraction crystal structures, molecular dynamics simulations and modeling procedures, oen providing a static view of the local structural changes at the active sites to explain catalytic defects. Protein destabilization and misfolding are customarily identied by in vitro experiments such as thermal denaturation/inactivation and proteolysis, [9][10][11][12][13][14] and again, the destabilizing effect can be rationalized using structural information, as well as energy and sequence-based algorithms. [15][16][17][18][19] Novel therapies aimed at overcoming destabilizing effects of mutations using small ligands (natural and pharmacological chaperones) are being described. [20][21][22][23][24] Despite the emergent importance of native state dynamics on protein functionality, its role in mutation-associated catalytic defects in inherited metabolic diseases has been largely unexplored. Here, we investigate the relationship between dynamic alterations and catalytic defects in three catalytic mutations of UDP-galactose 4 0 -epimerase (GALE, EC 5.1.3.2) causing type III galactosemia. GALE catalyzes the interconversion between UDP-galactose and UDP-glucose using NAD + as cofactor. 25 Two common GALE disease-causing mutations (p.N34S and p.G90E) decrease NAD + binding affinity, selectively decrease protein unfolding cooperativity and facilitate proteolytic cleavage at the loop formed by residues 34-45 near the NAD + binding site. 12 The impaired catalytic performance in these two mutants may arise from structural uctuations of the NAD + binding site that energetically penalize coenzyme binding. The p.V94M mutant greatly reduces GALE catalytic efficiency, 25,26 possibly due to binding of the substrate in a non-competent fashion for catalysis, 27 without largely affecting protein stability and exibility. 12 We have combined molecular dynamic (MD) simulations and small angle X-ray scattering (SAXS) to ascertain the effects of these mutations on the native conformation and dynamics. Furthermore, we have experimentally characterized the overall conformation and relative population of partially unfolded states under native conditions using partial proteolysis in combination with thermodynamic analyses. Our results allow to propose a thermodynamic mechanism that explains how mutational defects on NAD + binding occur due to an increment of the population of non-competent binding states and changes in their structure, and as result, allosteric ligands that bind to and stabilize the N-terminal domain of GALE might be used to overcome functional defects in type III galactosemia. We propose that these results may generally apply to inherited metabolic diseases in which point mutations, distant from the enzyme active site, inuence binding or catalysis by similar mechanisms.

Molecular dynamics simulations
We performed molecular dynamics (MD) simulations of wild-type and mutant GALE both in monomeric and dimeric state, and in presence and absence of the NAD + cofactor. In total we performed 16 independent MD simulations of eight monomeric and eight dimeric systems. For each assembly state we investigated apo and NAD + -bound states of the wildtype (WT) enzyme as well as the three mutants p.N34S, p.G90E, and p.V94M. All simulations were based on a crystal structure of GALE in complex with NAD + (PDB:1EK5 28 ) that was prepared for simulations using MOE's protonate3D and mutate function. 29 MD simulations were performed using the GPU code of pmemd in Amber12. 30 Amino acids were parameterized using the Amber force eld 99SB with ILDN corrections, 31 whilst the NAD + co-factors were described according to parameters from Walker and Pavelites. 32, 33 We employed an extensive equilibration protocol 34 before performing unrestrained sampling simulations for 100 ns in NpT ensemble at 300 K and 1 bar. We stored 1000 equal-spaced snapshots to trajectory for later analysis. Shorter sampling periods (i.e. 10 ps) yielded essentially the same results (data not shown).
We checked the simulations for energetic and structural stability and found maximum Ca RMSD values to be below 3.5Å over all 692 protein residues. Resulting conformational ensembles were analyzed using ptraj and cpptraj from AmberTools. 35 We extracted residue-wise B-factors derived from root mean squared uctuations of Ca atoms following a global alignment. Additionally, we calculated thermodynamic entropies from the distributions of protein backbone torsion angles and summarized the three individual degrees of freedom to residue-wise and total entropies. 36 To study the behavior of the 34-45 loop, we performed an alignment to the whole respective protein subunits and recorded occurring loop movements. Subsequently, we clustered the loop structures according to their Ca RMSD into two states using a hierarchical agglomerative clustering approach. Respective cluster centroids were extracted as representative structures and compared in terms of structure and cluster occupancy. Values for dimeric systems are reported as average over both individual sub-units. Structures were visualized using Pymol. 37 2.2. Protein expression, purication and preparation GALE enzymes were expressed and puried as described 12 and stored in HEPES (4-(2-hydroxyethyl)piperazine-1-ethanesulfonic acid)-KOH 50 mM, 1 mM TCEP (Tris(2-carboxyethyl)phosphine hydrochloride), pH 7.4. Thermolysin from Bacillus thermoproteolyticus Rokko was purchased from Sigma-Aldrich, exchanged to (HEPES)-KOH 50 mM, CaCl 2 100 mM, 1 mM TCEP, pH 7.4. Proteins were ash frozen in liquid nitrogen and stored at À80 C. Protein concentration was measured spectrophotometrically based on protein primary sequences. 12

Small-angle X-ray scattering (SAXS)
Small-angle X-ray scattering (SAXS) measurements of wild-type and p.G90E GALE mutant in solution were performed at Diamond Light Source beamline B21 (Oxforshire, UK), using a BioSAXS robot for sample loading from solutions of both proteins at different concentrations in 50 mM HEPES-KOH pH 7.4 and 1 mM TCEP at 277 K. During the experiment the samples were exposed for 300 s in 10 s acquisition blocks using a sample to detector distance of 3.9 m and X-ray wavelength of 1Å. The data were analyzed, buffer-subtracted, scaled, and merged using the Scatter soware package (http://www.bioisis.net). This soware was also used to check possible radiation damage of the samples by visual inspection of the Guinier region as a function of exposure time during data collection. R G and D max values were calculated with PRIMUS and GNOM, and shape estimation was carried out with DAMMIF/DAMMIN. Real-space scattering proles of atomic models were calculated using CRYSOL and the nal ab initio models were superimposed with the high-resolution structure using the program SUPCOMB. The program OLIGOMER was used to calculate the ratio of the species present in the solutions. All the SAXS programs are included in the ATSAS programs package. 38

Urea induced denaturation
Urea stock solutions were daily prepared in HEPES-KOH 50 mM, 1 mM TCEP, pH 7.4 at a 9 M concentration based on refractive index measurements. GALE enzymes (5 mM in protein monomer) were incubated in HEPES-KOH 50 mM, 1 mM TCEP, pH 7.4, in the presence of urea (0-4 M) for 16 h at 25 C. Far-UV CD spectra were acquired in a Jasco J-710 spectropolarimeter using 1 mm path-length cuvettes at a 100 nm min À1 scan rate and 4 scans were registered and averaged.

Proteolysis kinetics and thermodynamic analysis
GALE enzymes (20 mM in protein monomer) were incubated for at least 6 h at 25 C in HEPES-KOH 50 mM, 1 mM TCEP, pH 7.4, in the presence of urea (0-0.6 M), and in the absence or presence of NAD + 80 mM. Proteolysis was initiated by adding thermolysin to a nal concentration of 0.1-0.6 mM. Aliquots were withdrawn at different times, and proteolysis was quenched by adding EDTA (ethylenediaminetetraacetic acid) pH 8 (to a 20 mM nal concentration) and mixed with an equal volume of Laemmli's buffer. Control experiments without thermolysin were prepared in the same way and used to normalize the proteolysis kinetics. Proteolysis reactions were evaluated using 12% acrylamide SDS-PAGE (sodium dodecyl sulfate-polyacrylamide gel electrophoresis) and the bands corresponding to full length GALE were scanned and analyzed using the ImageJ soware (http://rsbweb.nih.gov/ij/).
The apparent rate constants for proteolysis (k p ) were determined using a single exponental function. For the estimation of equilibrium m values, k p was corrected for the inhibitory effect of urea on thermolysin degradation using the values previously reported. 39 The corrected rate constants (k 0 p ) were used to determine the equilibrium m values between the native and cleavable state (X) by determining their urea concentration dependence using the following equation: where R is the ideal gas constant (1.987 cal mol À1 K À1 ) and T is the absolute temperature (298.15 K). The theoretical equilibrium m value for the global unfolding of GALE monomers was determined using the correlations from. 40 To estimate the effect of mutations on the thermodynamic stability of the native vs. the cleavable state, we have assumed that mutations do not affect the intrinsic proteolytic step (X / F), and thus, changes in the overall proteolysis rate aer normalization using protease concentration (k WT and k mut ) reect changes in the equilibrium constant between N and X (N 4 X; 41 ), and therefore in their free energy changes (DDG), following eqn (2):

Molecular dynamics (MD) simulations identify changes
in conformational states in the native ensemble due to catalytic GALE mutations GALE disease-associated variants affect catalytic properties and binding of NAD + increasing proteolysis susceptibility. 12,25,26,[42][43][44] Indeed, based on proteolysis sensitivity measurements and the crystal structure of GALE, 12 we have proposed that certain catalytic mutations (e.g. p.N34S and p.G90E) could affect the local exibility of the loop 34-45, which encloses the NAD + binding site. Knowing that p.G90E and p.V94M present only 0.1% and 3% respectively of the wild-type catalytic activity, and p.N34S shows 4-fold lower affinity for NAD + which is abolished in p.G90E, we hypothesize that the catalytic defects and NAD + binding effects could be associated with conformational changes arising from dynamic alterations in the native state ensemble. MD simulations were used to characterize the conformational microstates present in native GALE ensembles at atomic resolution. GALE monomeric enzymes are rather unstable over 100 ns simulation time, and therefore, subsequent analyses were carried out using the functional GALE dimer only. We have focused our analyses on the 34-45 loop where major conformational transitions are found (reaching RMSD values up to 8Å in apo GALE systems; see Fig. S1 † for a 2D RMSD plot). By contrast, NAD + -bound GALE complexes show drastically reduced loop mobility (Fig. S1 †), supporting its stabilizing effect towards proteolysis. 12 The aggregate simulation time of 1.6 ms of the dimer (i.e. 3.2 ms sampling time for individual loops) make us condent on the observed trends for loop dynamics, despite individual simulation (100 ns) is at the lower limit for investigating changes in loop movements. 45 In this regard, enhanced sampling methods 46 could be helpful to provide further insight into the loop trends described here.
We performed a cluster analysis to investigate plausible loop conformations during molecular dynamics simulations (Fig. 1A-D). MD trajectories were clustered into two states: the major one representing a ground state (close to that of the crystal structure) and the minor excited (i.e. high-energy) state. The relative populations of the major and minor states depend on the cut-off values used in the clustering procedure, even though the qualitative trends described here for the effect of mutations and NAD + binding are robust.
In the absence of NAD + , the minor conformation is more populated in the p.N34S and p.G90E mutants (Fig. 1E), defective in NAD + binding, while generally the thermolysin cleavage site Ala38-Phe39 is more solvent exposed in these high-energy states ( Fig. 1A and C). In the presence of NAD + , a much lower population of the high-energy states is found, except for p.V94M (Fig. 1E), an in all cases the conformation of the ground and excited states is more similar (Fig. 1B and D). These analyses support that two conformational states exist in equilibrium in the absence of NAD + , with the more populated displaying higher affinity for NAD + due to higher structural complementarity that likely prevents from thermolysin cleavage (Fig. 1A-D). Therefore, the lower affinity for NAD + in p.N34S and the inability of p.G90E to bind it, can be partly explained by a shi in the conformational equilibrium towards non-binding competent states, remodeling the protein energy landscape in such a way that cofactor binding is thermodynamically penalized. 4,47 According to this proposed mechanism, p.N34S and p.G90E could be more susceptible to proteolytic attack at the 34-45 loop since they increase the population of cleavable (excited) non-binding competent states, while NAD + binding reduces the population of cleavable states explaining its protective effect. 12 Alternatively, accelerated proteolysis and low binding affinity could also be explained by enhancement of local dynamics at the NAD + binding site that impedes its correct spatial complementarity towards the cofactor. To distinguish between these two mechanisms, we have analyzed local dynamics at the residue level by MD simulations (Fig. 1F). Interestingly, we nd that the exibility around the cleavage site (loop 34-45) does not correlate with the experimentally determined proteolysis susceptibility ( Fig. 1F and ref. 12), implying that proteolysis susceptibility is not directly linked to local dynamics in this case. Additionally, p.N34S and p.G90E facilitate thermally induced partial unfolding 12 which supports a population-shi mechanism. Addition of NAD + decreases the local dynamics around the cleavage site, which likely reects the shi towards binding competent states (Fig. 1E).

Conformation of WT and p.G90E in solution studied by small-angle X-ray scattering (SAXS)
To investigate experimentally the effect of a catalytic mutation on GALE conformational ensemble, we have determined the low-resolution conformation of WT and p.G90E by SAXS. The overall parameters for SAXS modeling are summarized in Table S1. † Scattering proles of both proteins show differences in the radius of gyration calculated both in real and reciprocal space, as well as in their corresponding D max values, and Kratky plots suggested an increase in the exibility in the p.G90E mutant ( Fig. 2A-D). Envelope reconstruction clearly results in a more elongated shape of the p.G90E mutant than the WT protein ( Fig. 2E and F). The ab initio shape reconstructions superimposed well with the available crystallographic data (PDB entry 1EK6), but exhibited a molecular envelope larger than that observed in the crystal structure, especially in the p.G90E mutant.
In combination with our results using MD simulations (Fig. 1), these SAXS analyses support the existence of conformational uctuations (i.e. high-energy states) in GALE in solution, which are enhanced by the p.G90E mutation.

Thermodynamic characterization of high-energy states under native conditions by proteolysis
We have used proteolysis to evaluate the effect of GALE mutations on the energetics of the native state ensemble, particularly on high-energy excited (cleavable) states, such as those detected by MD simulations and difficult to characterize experimentally. Proteolysis of GALE under native conditions is well-described by the following mechanism: N 5 X 0 F, where X is the cleavable state in equilibrium with the native state (N), and undergoes proteolytic cleavage to produce a proteolyzed state F. The step N 4 X is governed by a equilibrium constant K, while the step X / F is determined by an intrinsic proteolysis rate constant k. To provide insight into the conformation of X, we have measured proteolysis kinetics in the presence of low urea concentrations. Urea and protease concentrations were kept very low to minimize deviations from native conditions (Fig. S2 †) and to ensure that the proteolysis step X / F is rate-limiting, and thus, the overall proteolysis rate constant k p z Kk. 8,41,48 Experimental k p values (Fig. 3A-D) were determined from rst-order kinetic decays (Fig. S3 †). Based on this mechanism, and upon correction of k p values due to modest inhibition caused by urea, 39 changes in proteolysis kinetics were used to determine equilibrium m values using eqn (1) (Fig. 3E-H). These equilibrium m values (Fig. 3I) are proportional to the magnitude of the conformational change between native and cleavable states ( Fig. 3J; i.e. degree of unfolding), and thus provide structural information on the cleavable state X.
The sensitivity to thermolysin degradation correlates well with the effect of urea on proteolysis kinetics (Fig. 3I-K): the higher the sensitivity to proteolysis of a GALE variant, the higher the effect of urea. For WT and p.V94M, we found fairly similar equilibrium m values, on average a 0.55 kcal mol À1 M À1 , which implies that $15% of the native structure ($50 residues) is unfolded in the cleavable state X (Fig. 3J). In the case of p.N34S and p.G90E, the equilibrium m value is around 1.4 kcal mol À1 M À1 , and thus, about $40% of the native structure ($130 residues) is unfolded in the cleavable state for these two mutants. Therefore, p.N34S and p.G90E populate a more extensively unfolded cleavable state under native conditions, and since proteolysis occur at Ala38-Phe39, 12 likely involves the N-terminal domain that contains the NAD + binding site. Moreover, a direct comparison of proteolysis rate constants normalized by protease concentration provides a measure of the thermodynamic destabilization induced by p.N34S and p.G90E on the N 4 X equilibrium (eqn (2)), which amounts to 1.5-2 kcal mol À1 , while the effect of p.V94M is marginal (Fig. 3K). These results show that p.N34S and p.G90E shi the N 4 X towards the cleavable state by a factor of 22.5 AE 4.5 and 14.1 AE 3.1. Overall, in the context of the population-shi mechanism, the lower affinity for NAD + in p.N34S and p.G90E is explained by the stabilization of non-competent binding states and changes in their structure under native conditions. Addition of NAD + has a stabilizing effect towards proteolysis ( Fig. 3 and S3 †), although this effect is variantdependent. In p.G90E, only a 20% decrease in the proteolysis rate constant is found, while this effect is of 40% in WT, 60% in p.N34S and 130% in p.V94M. The stabilization agrees well with their corresponding affinities for NAD + . 12 Importantly, the stabilization exerted by NAD + does not cause large changes in the equilibrium m values (Fig. 3I), indicating that NAD + binding does not alter the structure of the cleavable state but instead simply shis the conformational equilibrium towards more protease-resistant NAD + binding competent states.

Discussion
Inherited metabolic diseases are oen caused by missense mutations impairing enzyme activity, regulatory properties and/or intracellular stability. 21,24,[49][50][51][52][53] Two of the most common pathogenic mechanisms either involve native state destabilization or catalytic alterations, which are customarily considered as separate mechanisms. Here we explore the possibility of a relationship between these two mechanisms, by describing how two mutations in GALE associated to type III galactosemia affect the affinity for NAD + , and therefore their catalytic performance, promoting unfolding of the N-terminal domain and NAD + binding site. We identify alterations in the conformational ensemble of p.N34S and p.G90E consistent with a shi in the conformational equilibrium towards partially unfolded non-competent states for NAD + binding as shown by a combination of proteolysis and SAXS experiments and MD simulations. Therefore, we provide experimental and computational evidence to support an ensemble-based mechanism to explain catalytic mutations, in which structural and chemical perturbations caused by the mutations shi the equilibrium towards inactive conformations in the native state ensemble.
It must be noted that conformational uctuations at the NAD + binding site occur even in GALE variants with WT-like binding affinity ( Fig. 1 and 3). Non-binding competent excited states can be used by nature to provide the appropriate binding affinity for enzyme function at physiologically relevant ligand concentrations, allowing conformational excursions to states with different degrees of unfolding. 6,47 In this context, p.N34S and p.G90E are deleterious because they shi the conformational equilibrium towards more unstructured partially unfolded conformations in the native state ensemble thus energetically penalizing NAD + binding ( Fig. 4A and B). These excited states seem to be populated to signicant levels, as seen by the ability of our MD simulations to detect them even at the low microsecond sampling time scale (Fig. 1), although variants with low binding affinity for NAD + clearly promote the population of excited states. This population-shi is reected in our SAXS analyses as an expansion in the average ensemble in solution (Fig. 2), thus overall supporting this conformational-selection mechanism for the binding of NAD + to GALE. In addition, proteolysis experiment revealed an increase in population of these excited states in p.N34S and p.G90E as well as a change in their overall structure, showing a higher degree of unfolding (Fig. 3). Importantly, since the population of these excited states is inversely proportional to NAD + binding affinity, it is plausible that allosteric effectors would be able to bind to WT-like excited states destabilized by catalytic mutations such as p.N34S and p.G90E, thus partially restoring the binding affinity for NAD + (Fig. 4C).
So far, we have identied a thermodynamic signature for mutations affecting NAD + binding to GALE. Consistently, this signature is not found in p.V94M, a mutant strongly reducing catalytic efficiency with mild effects on the apparent affinities for the substrate and cofactor. 12,25 Previous MD simulations have suggested little effect of p.V94M on the interaction with substrate and cofactor. 27 However, p.V94M seems to severely affect the ability of GALE active site to bring together the NAD + cofactor, the substrate and the catalytic Tyr157, leading to non-optimal ternary complex for catalysis. 25 In the case of p.N34S and p.G90E, their low binding affinity for NAD + (Fig. 4 and ref. 12) is caused by a shi in the conformational equilibrium of the apo-state towards NAD + non-binding competent states, and in addition, p.G90E could develop unfavorable electrostatic interactions that may also contribute to its apparent inability to bind the cofactor and its dramatically low activity. 12,25

Conclusions
In conclusion, we describe a computational and experimental approach that directly links mutational effects on native state uctuations to catalytic and coenzyme binding defects in an inherited metabolic disease. We propose that our strategy, in the context of an ensemble-based mechanism, would be helpful to understand better mutational effects on catalysis and regulation (such as allostery and cooperativity), which are frequently altered in inborn errors of metabolism such as phenylketonuria, 10,50,54 classical homocystinuria, 13,55 inherited galactosemias 25,56 and primary hyperoxaluria type I. 49 Interestingly, several previous works allow to propose some correlation between altered proteolytic susceptibility and allosteric response in inborn errors of metabolism, 10,13,57 but not for catalytic alterations. Therefore, we advance the general hypothesis that disease-associated point mutations oen adversely disturb native-state dynamics. More generally, we expect that future work with mutations associated to other metabolic diseases will test whether our ensemble-based mechanism provides a unifying framework to reconcile analysis from enzyme kinetic and ligand binding studies, X-ray crystal structures, extensive molecular dynamic simulations and experimental ensemble studies using techniques such as SAXS and proteolysis. From a therapeutic viewpoint, we propose a strategy aimed at identifying molecules which can (partially) reverse the changes in active-site conformational uctuations upon binding to cryptic allosteric sites (Fig. 4), alternative to current approaches that focus on compounds which promote proper folding of affected proteins. Fig. 4 Pictorial representation of the effect of catalytic mutations on the structure of excited states and NAD + binding affinity and its potential correction by allosteric effectors. Panels (A) and (B) show the equilibrium between native GALE WT (A) and p.G90E (B) and the excited, cleavable states as inferred from native proteolysis experiments. In panel (C), the potential effect of an allosteric ligand binding to an unstructured region of the excited state of a catalytic mutant leading to correction of its NAD + binding affinity by a coupled bindingfolding process. For sake of simplicity, only the GALE monomer is displayed (based on PDB:1EK5).