Cristina Duran a, Guillem Casadevall a and Sílvia Osuna *ab
a Departament de Química, Institut de Química Computacional i Catàlisi, Universitat de Girona, c/Maria Aurèlia Capmany 69, 17003, Girona, Spain. E-mail: silvia.osuna@udg.edu
b ICREA, Pg. Lluís Companys 23, 08010, Barcelona, Spain
First published on 18th March 2024
Enzymes exhibit diverse conformations, as represented in the free energy landscape (FEL). Such conformational diversity provides enzymes with the ability to evolve towards novel functions. The challenge lies in identifying mutations that enhance specific conformational changes, especially when they are located at sites distal from the active site cavity. The shortest path map (SPM) method, which we developed to address this challenge, constructs a graph based on the distances and correlated motions of residues observed in nanosecond timescale molecular dynamics (MD) simulations. We recently introduced a template-based AlphaFold2 (tAF2) approach coupled with 10 nanosecond MD simulations to quickly estimate the conformational landscape of enzymes and assess how the FEL is shifted after mutation. In this study, we evaluate the potential of SPM when coupled with tAF2-MD in estimating conformational heterogeneity and identifying key conformationally relevant positions. The selected model system is the beta subunit of tryptophan synthase (TrpB). We compare how the SPM pathways differ when integrating tAF2 with MD simulations of different lengths, from as short as 10 ns up to 50 ns, and considering two distinct combinations of Amber forcefield and water model (ff14SB/TIP3P versus ff19SB/OPC). The new methodology can more effectively capture the distal mutations found in laboratory evolution, thus showcasing the efficacy of tAF2-MD-SPM in rapidly estimating enzyme dynamics and identifying the key conformationally relevant hotspots for computational enzyme engineering.
Tracing an enzymatic cycle in detail, the core steps within a generic catalytic pathway include: (i) substrate binding in the catalytic pocket, often involving exploration of additional conformational states with properly positioned loops and flexible domains facilitating active site access;9,10 (ii) substrate(s) activation for enzyme–substrate (ES) formation; (iii) transition state stabilisation leading to the formation of reaction intermediates and products; (iv) product release, often accompanied by conformational alterations resetting the catalytic cycle. Each of these steps is pivotal for enhanced catalytic activity.
The experimental and computational (or combined) studies exploring the conformational landscape of natural and laboratory-evolved enzymes highlight the key role of mutations at the active site but also at distal sites, in changing the stabilities of the pre-existing conformations.1,3–5,16 This can be, for instance, shown experimentally through NMR and room-temperature X-ray crystallography as shown for some Kemp eliminases,17,18 or by the evolution of B-factors in static X-ray structures of the multiple enzyme variants generated via directed evolution (DE) along an evolutionary trajectory.2,16 From a computational perspective, molecular dynamics (MD) simulations and enhanced sampling methods can be applied to estimate this ensemble of conformations, i.e., the free energy landscape (FEL, see Fig. 1).4,13,19,20 In the reconstructed FEL, the relative stability of thermally accessible conformations, as well as the kinetic barriers separating them are displayed. The barrier height separating a specific pair of conformational states dictates the timescale of the associated transition. Conformational changes directly influencing catalytic function encompass side-chain conformational shifts, loop motions often crucial for substrate binding/product release, and in some cases allosteric transitions.11
Fig. 1 Shortest path map (SPM) construction, workflow and applications. SPM is a correlation-based tool that can be used to identify conformationally relevant positions of importance for inducing a population shift in the FEL. The first step requires the estimation of the conformational heterogeneity of the studied system (FEL estimation) using, for instance, MD simulations. From these simulations the distance and correlation matrices are constructed, which are used to generate a first complex graph (shown in step 2, middle panel). This graph contains all protein residues represented as spheres, and edges of different length (weighted according to the correlation value) that link those positions situated, on average, less than 6 Å apart along the MD run. This complex graph is further simplified by finding the edges that are shorter, i.e., more correlated, providing the final SPM graph that can be plotted on top of the 3D structure (step 3). In the SPM graph, the sphere size and edges are weighted according to the number of times the pair of residues has been included in the shortest paths (i.e., the size qualitatively represents the importance of the identified positions for the conformational dynamics of the enzyme). SPM has been used to rationalise DE mutations,11 to study and understand the allosteric regulation within monomers in multimers,12,13 and more recently to design new enzyme variants starting from natural scaffolds.14,15 SPM is a strategy to reduce the number of potential hotspots, thus reducing the sequence space for enzyme engineering. The figure has been generated with a model system: the SPM graph and the displayed structure correspond to the beta subunit of tryptophan synthase.
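The construction summarised in Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration and not the published SPM implementation: the function names (`build_graph`, `dijkstra`, `shortest_path_map`), the edge length taken as -log|C_ij|, and the normalisation of edge usage by its maximum before applying the significance threshold are assumptions made for the example. Only the 6 Å distance cut-off and the 0.3 significance threshold are taken from the text, and the toy matrices are invented.

```python
import heapq
import math
from collections import Counter


def build_graph(mean_dist, corr, dist_cutoff=6.0):
    """Link residue pairs whose mean distance is below dist_cutoff (in Å)."""
    n = len(mean_dist)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            c = abs(corr[i][j])
            if mean_dist[i][j] < dist_cutoff and c > 0.0:
                # Assumed weighting: strongly correlated pairs get short edges.
                length = -math.log(min(c, 1.0))
                graph[i][j] = length
                graph[j][i] = length
    return graph


def dijkstra(graph, source):
    """Return the predecessor of every node reachable from source."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return prev


def shortest_path_map(mean_dist, corr, dist_cutoff=6.0, significance=0.3):
    """Count how often each edge lies on a shortest path and keep the frequent ones."""
    graph = build_graph(mean_dist, corr, dist_cutoff)
    usage = Counter()
    for source in graph:
        prev = dijkstra(graph, source)
        for target in prev:
            if target <= source:
                continue  # visit each unordered residue pair once
            node = target
            while node != source:
                parent = prev[node]
                usage[tuple(sorted((node, parent)))] += 1
                node = parent
    if not usage:
        return {}
    top = max(usage.values())
    return {e: c / top for e, c in usage.items() if c / top >= significance}


# Toy 4-residue system (invented numbers): a well-correlated chain 0-1-2-3
# plus a weakly correlated contact between residues 0 and 2.
far = 9.0
mean_dist = [[0.0, 4.0, 5.5, far],
             [4.0, 0.0, 4.0, far],
             [5.5, 4.0, 0.0, 4.0],
             [far, far, 4.0, 0.0]]
corr = [[1.0, 0.9, 0.2, 0.5],
        [0.9, 1.0, 0.8, 0.1],
        [0.2, 0.8, 1.0, 0.9],
        [0.5, 0.1, 0.9, 1.0]]
spm = shortest_path_map(mean_dist, corr)
print(spm)
```

In this toy case the weakly correlated 0-2 contact never lies on a shortest path, so it is absent from the resulting map, while the central 1-2 edge carries the largest normalised weight.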
The evaluation of the FEL and understanding how it is altered after mutation provides crucial insights for comprehending and engineering enzyme function.4 The mutations introduced at the active site and often at distal sites induce a long-range conformational effect impacting enzymatic catalysis. Triggered by the introduced mutations, catalytically productive conformational states are stabilised, while the non-beneficial ones for the new functionality are disfavoured, thus transforming computational enzyme redesign into a population shift problem.11 These discoveries boosted the exploration of enzyme conformational dynamics for enzyme redesign.3,4,16 The reconstruction of ancestral enzymes displaying higher flexibility compared to modern counterparts, and their utilization as initial scaffolds for enzyme redesign, brought interesting new insights.21 Increased flexibility in many ancestral variants proved crucial for attaining superior levels of catalytic activity with only a few mutations at the active site. Various ancestrally-reconstructed enzymes have since been employed as starting points for enzyme design, for example, to enhance some residual catalytic promiscuity within an enzyme family, or to alter the allosteric regulation of some heterodimeric enzymes, among others.21–23
Although the above-mentioned examples highlight the importance of the enzyme conformational heterogeneity for function, what remains challenging is the identification of which mutations should be introduced to enhance a given conformational change, especially if those involve mutating distal sites.11 Along these lines, we developed the shortest path map (SPM) tool, which relies on the construction of a graph based on the computed mean distances and correlation values obtained from MD simulations, following a similar strategy to the protocol reported by Sethi et al. for investigating allosterically-regulated enzymes (see Fig. 1).20 Instead of identifying communities in this first complex graph (see step 2 in Fig. 1), SPM focuses on the identification of the shortest path lengths.24 SPM therefore reduces the sequence space to a smaller number of conformationally-relevant positions and, importantly, has the potential to identify the challenging distal activity-enhancing positions.11 Indeed, we successfully applied SPM for identifying DE mutations in the retro-aldolase, monoamine oxidase and tryptophan synthase enzymes, suggesting its potential application for the rational design of enzyme variants (Fig. 1).11 Our SPM tool was also applied by the Mulholland lab to evaluate the changes in the dynamical networks at the transition-state ensemble along DE of a computationally designed Kemp eliminase.25 We have also used SPM to investigate the allosteric communication within monomers, and have recently found that SPM is highly complementary to distance fluctuation analysis and dynamical non-equilibrium MD simulations for investigating allosteric systems.12,13
Despite the successes in identifying key positions targeted in DE, the application of SPM in computational enzyme design is not direct, as it identifies multiple conformationally relevant positions (usually around 50–70 residues are included) and, most importantly, it does not indicate which specific amino-acid substitution should be introduced to achieve the desired conformational change.11 We combined SPM and ancestral sequence reconstruction to mitigate some of these limitations and design new stand-alone tryptophan synthase B (TrpB) variants.15 As the ancestral LBCA TrpB was known to display stand-alone activity, our approach focused on including the LBCA amino acid in those non-conserved SPM positions. The stand-alone activity of the new SPM6-TrpB variant was increased 7-fold (in terms of kcat). It is worth highlighting that, by testing only a single variant, the fold increase in kcat was similar to the 9-fold obtained by DE, which required the generation and screening of more than 3000 variants. In a recent pre-print, we have further demonstrated the power of our SPM methodology for redesigning natural scaffolds and boosting an existing side-activity into nature-like catalytic activities. In particular, we were able to increase the esterase catalytic efficiency of a hydroxynitrile lyase (HNL) more than 1300-fold, and we actually surpassed the activity of the esterase taken as reference.14 These studies therefore provide further evidence for the potential of our SPM methodology for computational enzyme redesign.
The neural network AlphaFold2 (AF2) revolutionized the structural biology and protein design fields, as it is able to predict the folded structure of proteins and enzymes from the primary sequence with high levels of precision.26–29 The AF2 neural network exploits data on the evolutionary, physical and geometric constraints of existing protein structures. AF2 is acknowledged as a pivotal milestone in protein structure prediction and has boosted the utilization of deep-learning techniques for numerous other applications.29 Despite the remarkable efficacy of AF2 algorithms in predicting the native lowest energy structure of proteins, using AF2 for understanding and engineering function directly from the acquired single static structure is not straightforward. However, some recent research has indicated that AF2 can also be used to predict multiple conformations of the same protein, and could thereby be used to explore the conformational adaptability of biological systems.30–32 In this regard, we have recently developed a template-based AF2 approach to estimate the conformational landscape of enzymes, and quickly assess how it is shifted by mutations.33,34 This is promising as it suggests that AF2 could be employed for evaluating the impact of introduced mutations on the conformational landscape at a significantly reduced computational cost (hours instead of days or weeks of simulation time), thus accelerating the creation of conformationally-driven enzyme design protocols.4,11
In this study, we evaluate the potential of our developed SPM tool and template based AF2 (tAF2) approach coupled to MD simulations for estimating the conformational heterogeneity and identifying key active site and distal conformationally-relevant positions. We first compare the reconstructed FEL from tAF2 coupled to 10–50 nanosecond timescale MD simulations employing different forcefields and water models. Second, the predictions from tAF2-MD are compared to the previously reconstructed FELs from multiple replica well-tempered metadynamics simulations.15 Finally, we generate the SPM maps using the tAF2-MD data and considering different conformational states. Our results show the potential of SPM and tAF2-MD, for estimating the conformational heterogeneity and identifying the key conformationally-relevant hotspots. The developed approach has a reduced computational cost (results in a few hours instead of days/weeks or even months compared to metadynamics) and can be used to computationally generate and screen multiple enzyme variants, thus potentially giving access to nature-like activities.
Fig. 2 SPM and reconstructed free energy landscapes (FELs) of PfTrpB and the evolved variant 0B2-PfTrpB. The catalytically relevant COMM domain that covers the active site of the enzyme (shown in teal) can adopt different conformations along the catalytic cycle: open (O) states are adopted in the resting state E(Ain), partially closed (PC) at the reaction intermediates E(Aex1) and E(A–A) and closed (C) at E(Q2) states. Representation of the computed SPM for PfTrpB with DE mutations highlighted: two mutations are predicted (shown in orange), three are adjacent to SPM residues (teal) and one is distal (grey).19 The pyridoxal phosphate cofactor is shown in dark grey. The previously reconstructed FELs from the developed tAF2-MD protocol in the red-blue colourmap (blue for the most stable conformations, red for the least stable ones)33 are shown on top of the FELs computed from the multiple walker well-tempered metadynamics simulations (shown in grey).19 The AF2 predictions are marked on the 2D FEL using vertical lines coloured from orange to dark blue depending on the MSA depth: AF2 predictions obtained with an MSA depth of 32 are shown with a vertical orange line, 64 in light orange, 128 in light brown, 256 in light cyan, 512 in cyan, 1024 in teal, and 5120 in dark blue.
In our previous publication we found that the conformational heterogeneity of related TrpB variants exhibiting different levels of stand-alone activity could be estimated by performing 10 ns MD simulations starting from the ensemble of structures generated by our template-based AF2 (tAF2) approach.33,34 This approach generates multiple structures by running AF2 with different starting templates and multiple sequence alignments (MSAs) of different depths, followed by short nanosecond timescale MD simulations. As we found in our previous study, the conformational landscapes reconstructed from these multiple replica 10 ns MD simulations, starting at different tAF2 predictions, were qualitatively in line with the previously reconstructed, computationally expensive FELs obtained from well-tempered multiple-walker metadynamics simulations.15 The computational cost associated with the two approaches is dramatically different: estimations can be obtained in a matter of hours for the former, whereas multiple weeks are needed for the latter. Still, some major differences were observed in the conformational landscapes, which we evaluate further in this study. In this section, we extended the simulation time of the MD simulations to test whether a better exploration of the conformational landscape could be obtained. To that end, we extended the 10 ns MD simulations up to 50 ns and reconstructed and compared the FELs (see Fig. 3 and 4, and the Materials and methods section for a detailed description). We also tested the effect of using the recommended AMBER ff19SB forcefield and the improved OPC water model (as opposed to the previously used ff14SB and TIP3P). OPC was found to accurately reproduce the electrostatic properties of water, which is particularly important for balancing the interactions between the protein residues and water molecules, especially for disordered regions, as most water models tend to favour overly compact structures.39,40
Fig. 3 Estimated FEL of PfTrpB. The estimated FEL from the accumulated multiple replica nanosecond timescale MD simulations, performed starting at the ensemble of template-based AF2 predictions, is shown in colour on top of the previously reconstructed FEL of the PfTrpB variant (shown in grey scale).19 FELs are displayed every 10 ns of MD simulation time and using two different combinations of forcefield and water model: (A) ff14SB and TIP3P water, and (B) ff19SB and OPC water. The x axis denotes the ensemble of structures generated from X-ray data for the open-to-closed transition of the COMM domain, which ranges from 1–5 (open structures, O), 6–10 (partially closed, PC), to 11–15 (closed, C); the y axis is the mean square deviation (MSD, in Å2) from the path of O-to-C structures generated. Most stable conformations are shown in blue, whereas higher energy regions are shown in red.
Fig. 4 Estimated FEL of 0B2-PfTrpB. The estimated FEL from the accumulated multiple replica nanosecond timescale MD simulations, performed starting at the ensemble of template-based AF2 predictions, is shown in colour on top of the previously reconstructed FEL of the PfTrpB variant (shown in grey scale).19 FELs are displayed every 10 ns of MD simulation time and using two different combinations of forcefield and water model: (A) ff14SB and TIP3P water, and (B) ff19SB and OPC water. The x axis denotes the ensemble of structures generated from X-ray data for the open-to-closed transition of the COMM domain, which ranges from 1–5 (open, O), 6–10 (partially closed, PC), to 11–15 (closed, C); the y axis is the mean square deviation (MSD, in Å2) from the path of O-to-C structures generated. Most stable conformations are shown in blue, whereas higher energy regions are shown in red.
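The two axes used in Fig. 3 and 4 (progress along a path of 15 reference structures, and MSD from that path) resemble the path collective variables commonly used in enhanced sampling. The sketch below assumes that standard definition, s = Σ i·exp(-λ·d_i) / Σ exp(-λ·d_i) and z = -(1/λ)·ln Σ exp(-λ·d_i), where d_i is the MSD of a frame to reference structure i. The function name `path_cvs`, the value of λ and the toy MSD values are assumptions for illustration; the text does not state how the axes were computed.

```python
import numpy as np


def path_cvs(msd_to_refs, lam):
    """Return (s, z) for one frame from its MSD (Å^2) to each reference structure.

    s is the progress along the path (1 .. number of references) and
    z is the distance from the path (same units as the MSD).
    """
    msd = np.asarray(msd_to_refs, dtype=float)
    index = np.arange(1, msd.size + 1)
    # Shift by the smallest MSD so the exponentials do not underflow.
    d_min = msd.min()
    weights = np.exp(-lam * (msd - d_min))
    s = float(np.sum(index * weights) / np.sum(weights))
    z = float(d_min - np.log(np.sum(weights)) / lam)
    return s, z


# Toy frame sitting exactly on reference 3 of a 15-structure path (invented MSDs).
toy_msd = [4.0 * (i - 3) ** 2 for i in range(1, 16)]
s_val, z_val = path_cvs(toy_msd, lam=2.3)  # lam is an assumed smoothing parameter
print(round(s_val, 3), round(z_val, 3))
```

With this definition, a frame identical to one of the open reference structures maps to s between 1 and 5 with z close to zero, and frames that leave the X-ray-derived path show up at larger z.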
In Fig. 3 and 4, the reconstructed FELs from the multiple replica 10 ns MD simulations from our previous study are compared to the new FELs generated from 50 ns MD. The FEL obtained from well-tempered multiple-walker metadynamics is also shown in grey for comparison.15 The 10 ns MD simulations already suggested some differences in the conformational heterogeneity between systems, and the extension of the simulations up to 50 ns confirmed these estimations. Still, the analysis of the conformational space sampled every 10 ns of MD simulation shows that, especially in the 20–30 ns timeframe, the obtained FEL is already very similar to the one explored after 50 ns (see Fig. 3, 4, S1 and S2†). This is especially relevant for the open conformation of the COMM domain in the evolved 0B2-PfTrpB, which after 20–30 ns is substantially more sampled and thus stabilised, as expected from the metadynamics FEL. This stabilisation is even more evident in the FELs from the ff19SB-OPC simulations. Thanks to the improved water model and forcefield, and as found for disordered proteins, ff19SB-OPC stabilises a much more open conformation of the COMM domain (Fig. S2 and S3†). For wild-type PfTrpB the open conformation is more sampled in the 50 ns FEL, but it is not stabilised as in the case of 0B2-PfTrpB, in line with its lower conformational heterogeneity. Altogether, the extension of the multiple replica MD simulations up to 20–30 ns shows a better agreement with the computationally intensive FELs reconstructed from the well-tempered metadynamics simulations.19 This analysis suggests that the TrpB conformational landscape can be properly sampled and estimated from the developed tAF2 approach when coupled with 20–30 ns MD starting from 60 and 59 tAF2 structures for PfTrpB and 0B2-PfTrpB, respectively. As shown in Fig. S4–S6,† ff19SB/OPC provides a better description of the secondary structure of TrpB as compared to X-ray data; this combination is therefore more appropriate for evaluating TrpB conformational dynamics.
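One common way to turn pooled, unbiased MD frames into an approximate FEL is Boltzmann inversion of a 2D histogram, F = -k_B·T·ln P. The sketch below illustrates that idea on the two path coordinates; it is an assumed procedure, not necessarily the one used for Fig. 3 and 4. The function name `free_energy_surface`, the bin counts, the temperature of 300 K and the synthetic data are all hypothetical.

```python
import numpy as np

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def free_energy_surface(s, z, bins=(15, 20), temperature=300.0):
    """Estimate a 2D free energy surface (kcal/mol) from pooled frames.

    s, z: 1D arrays with the path progress and the distance from the path
    of every frame, concatenated over all replicas.
    Bins that were never visited are returned as +inf.
    """
    hist, s_edges, z_edges = np.histogram2d(s, z, bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):
        fel = -KB_KCAL * temperature * np.log(prob)
    fel -= fel[np.isfinite(fel)].min()  # set the global minimum to zero
    return fel, s_edges, z_edges


# Synthetic example: frames from two replica sets pooled together (invented data).
rng = np.random.default_rng(0)
s_frames = np.concatenate([rng.normal(3.0, 0.8, 3000), rng.normal(12.0, 0.8, 1000)])
z_frames = np.abs(rng.normal(1.0, 0.5, 4000))
fel, s_edges, z_edges = free_energy_surface(s_frames, z_frames)
print(fel.shape, np.isfinite(fel).sum())
```

Recomputing such a surface on cumulative 10 ns windows of the pooled trajectories is one simple way to check, as done visually in Fig. 3 and 4, when the estimated landscape stops changing.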
We generated the SPM graphs every 10 ns of the multiple replica MD, started from a different tAF2 output structure, and compared them with respect to the SPM generated from the metadynamics simulations.19 We also assessed whether the SPM graphs contained DE mutation sites, as we did in our previous publication.19 All SPM graphs generated at this point make use of the default parameters of distance and significance thresholds (set to 6 Å and 0.3, respectively, see methods). Similarly to what is found in the previous section regarding FEL convergence, smaller differences between the computed SPM are found after 20–30 ns of MD. It is interesting to observe that the SPM computed in the 0–10 ns timeframe contains a substantially reduced number of residues, and this number is expanded when the MD simulation time is increased (Fig. 5). This is particularly evident in PfTrpB, as the first SPM contains 46 residues and is further increased to up to 81. In the case of 0B2-PfTrpB the number of included residues differs only from 63 to 55 (see Fig. 5). Altogether this comparison suggests a higher interlinked communication in the most evolved variant, which can be successfully captured after short 20–30 ns MD simulations from the multiple tAF2 output structures. The comparison of the SPM with the one obtained for PfTrpB with the multiple-walker well-tempered metadynamics simulations19 reveals similar SPM pathways (Fig. 2 and 6). However, it should be mentioned that in terms of identifying DE mutations, the new SPM coming from the tAF2 and 20–30 ns MD simulations captures two additional mutations (T292S and I68V), which in the previously published SPM based on metadynamics, were not predicted. The same distance and significance thresholds were used for SPM construction, thus the conformational space sampled was the most important difference between both strategies (tAF2+MD versus metadynamics).
Fig. 6 Computed SPM of PfTrpB. Representation of the generated SPM graphs using the last 20 ns of the MD trajectories with: (A) default settings (i.e., considering a threshold of 6.0 Å for the distance matrix computed from the MD data without any filtering of structures), (B) only the distance information from the structures that present a closed conformation of the COMM domain, (C) only the distance information from the open structures of the COMM, and (D) combining the distance information of (B) and (C). The number of contained residues in the SPMs is shown (N). Directed evolution (DE) mutations37 are labelled and highlighted in yellow if contained in the SPM, or green if not contained but making non-covalent interactions with SPM positions. The size of spheres and the thickness of edges in the SPM plots are weighted according to their importance for the conformational dynamics of the enzyme: those pairs of residues that have a higher contribution to the conformational dynamics are represented with a thicker edge and larger sphere. In all cases, a threshold for the significance of 0.3 is used.
Finally, as the SPM outcome is dependent on the mean distances between all combinations of residues that compose the protein, we decided to evaluate how the SPM graphs differ when: (1) considering only open or closed states of the COMM domain, and (2) generating a new distance matrix combining the information from the individual open and closed distance matrices (see methods for a full description, and Fig. S7†). For PfTrpB, the SPMs generated considering either closed or open states contain a reduced number of identified positions compared with the SPM generated with the standard protocol (i.e., considering the full trajectory for computing the distance matrix). It is interesting to note that the SPM for the open state contains a larger number of residues than the one generated with closed states (80 versus 69), thus suggesting a reduced communication in the closed state in PfTrpB. This observation is in line with the inability of PfTrpB to adopt productively closed conformations, consistent with its lower stand-alone catalytic activity.19 The connection between the COMM domain and the active site is also different in the two SPMs, as the one for the open state is more similar to the default SPM (see Fig. 6). In terms of identifying the DE mutations, a reduced number of positions is identified in comparison with the default parameters. This suggests that considering the whole ensemble of MD trajectories is a more appropriate strategy for SPM construction for capturing DE hotspots. Another interesting observation is that the SPM computed considering a distance matrix containing the information from the closed and open matrices (Fig. 6) contains a substantially lower number of positions (50 versus 102 for the default).
In 0B2-PfTrpB, the SPM for the closed state contains a much larger number of positions (110), in comparison with the one for the open state (55) and the default SPM (47, see Fig. 7). This comparison suggests that the distal mutations introduced in 0B2-PfTrpB stabilise the closed state, thus enhancing the communication between the COMM domain, the active site, and the interface region between the beta subunits. This fact is also in line with the reconstructed FELs reported in the previous section, and the well-tempered metadynamics published in the previous study that show the stabilisation of the catalytically competent closed state.19 The SPM for the closed state actually contains a higher number of DE mutations (3 are contained, 3 are adjacent), especially if compared with the default SPM (1 mutation is contained in the path, 4 are adjacent, 1 is distal). The SPM obtained from the distance matrix generated from the individual contributions of the closed and open state contains a reduced number of positions (37), and actually is very similar to the default SPM (47 residues identified).
Fig. 7 Computed SPM of 0B2-PfTrpB. Representation of the generated SPM graphs using the last 20 ns of the MD trajectories with: (A) default settings (i.e., considering a threshold of 6.0 Å for the distance matrix computed from the MD data without any filtering of structures), (B) only the distance information from the structures that present a closed conformation of the COMM domain, (C) only the distance information from the open structures of the COMM, and (D) combining the distance information of (B) and (C). The number of contained residues in the SPMs is shown (N). Directed evolution (DE) mutations37 are labelled and highlighted in yellow if contained in the SPM, or green if not contained but making non-covalent interactions with SPM positions. The size of spheres and the thickness of edges in the SPM plots are weighted according to their importance for the conformational dynamics of the enzyme: those pairs of residues that have a higher contribution to the conformational dynamics are represented with a thicker edge and larger sphere. In all cases, a threshold for the significance of 0.3 is used.
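The labelling of DE mutation sites used in Fig. 6 and 7 (contained in the SPM, adjacent to an SPM position, or distal) can be written as a small helper. This is only a sketch: the function name `classify_mutations` is hypothetical, the `contacts` dictionary stands in for whatever non-covalent interaction analysis is used (not specified in the text), and the residue indices are toy values rather than TrpB results.

```python
def classify_mutations(mutation_sites, spm_residues, contacts):
    """Label each mutated position relative to a set of SPM residues.

    contacts maps a residue index to the residues it interacts with
    (assumed to come from a separate contact analysis).
    """
    spm = set(spm_residues)
    labels = {}
    for site in mutation_sites:
        if site in spm:
            labels[site] = "contained"
        elif spm.intersection(contacts.get(site, ())):
            labels[site] = "adjacent"
        else:
            labels[site] = "distal"
    return labels


# Toy, hypothetical residue indices (not taken from the TrpB results).
spm_residues = [3, 4, 7, 9]
contacts = {2: [3, 5], 6: [5], 7: [4, 9]}
labels = classify_mutations([7, 2, 6], spm_residues, contacts)
print(labels)
```

Counting the three labels for each SPM variant (default, open only, closed only, combined) gives the kind of summary reported for the DE mutations in the text.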
In this study, we aimed to evaluate the potential of the SPM tool, especially if combined with our previously developed template based AF2 (tAF2) approach coupled to MD simulations,33 for quickly estimating the conformational heterogeneity and identifying key conformationally relevant positions. We first compared the reconstructed FEL obtained from the tAF2 method in conjunction with short nanosecond timescale MD simulations using two different forcefields and water models. The conclusions derived from these new FELs match those of the computationally much more demanding well-tempered multiple walker metadynamics simulations. The most evolved 0B2-PfTrpB exhibits a much more stabilised closed conformation of the COMM domain, as well as a higher conformational heterogeneity, as described by the previously reported metadynamics simulations.19 Our analysis indicates that the TrpB conformational landscape can be properly sampled and estimated from the developed tAF2 approach especially when coupled with 20–30 ns MD simulations. Still, as reported in our earlier study,33 10 ns MD simulations starting from the multiple outputs from tAF2 can provide some hints about the conformational heterogeneity of the systems. The comparison between ff14SB/TIP3P and ff19SB/OPC indicates that the latter seems to provide a better description of the open and closed states of TrpB.
We also assessed the differences in the SPM graphs obtained via the tAF2 approach coupled to nanosecond timescale MD simulations. SPM requires the correlation and distance matrix, therefore we decided to evaluate in more detail the effect of the distance matrix on the generated output graphs. We compared the SPM computed using the last 20 ns of the 50 ns MD trajectory, with the SPMs derived from the distance matrix containing only either closed or open states of the COMM domain. We also evaluated the effect on the SPM when a distance matrix combining the information from the individual open and closed distance matrices was used. The SPM of PfTrpB using 20 ns of the 50 ns MD data is comparable to the one obtained with the multiple replica well-tempered metadynamics simulations.19 However, the SPM generated via the tAF2 approach coupled to short MD simulations predicts two additional DE mutations (positions T292S and I68V). This suggests that the tAF2-MD approach, at least in TrpB, is a suitable strategy for identifying key positions targeted with DE. The SPMs obtained considering only open or closed states include a reduced number of DE mutations. In the case of PfTrpB, the SPM generated for both closed and open states shows fewer identified positions, as compared to the one produced considering the last 20 ns of the MD trajectory. Notably, the SPM for the open state has more residues than the one for the closed state, indicating a limited communication in the closed state of PfTrpB. This aligns with the inability of PfTrpB to adopt productive closed conformations,19 consistent with its lower stand-alone catalytic activity. These SPMs also identify a reduced number of DE mutations. Interestingly, the SPM for the closed state of the stand-alone 0B2-PfTrpB variant features a larger number of positions compared to the open state SPM and the default. 
This suggests that the introduced distal mutations stabilise the closed state and enhance the communication between the COMM domain, the active site, and the interface region between the beta subunits. This higher communication is in line with the stabilisation of the catalytically competent closed state, as we observed in our previous publications.19,33 The SPM for the closed state also contains more distal DE mutations, especially when compared to the default SPM. This also highlights the potential of tAF2-MD-SPM for rationalising the effect of DE mutations and design.
In this study we show that 20 ns MD simulations using ff19SB with OPC water model and starting from the multiple output structures provided by the developed tAF2 protocol,33 allow the estimation of the conformational heterogeneity of systems differing in only a few mutations (98.4% of sequence identity). Most importantly, these simulations allow the construction of SPM graphs that are comparable to those obtained after performing multiple-walker well-tempered metadynamics simulations. SPM can also be used to study the conformationally relevant positions at the different open and closed states of the protein, which identify fewer DE mutations but are useful to study how the introduced mutation altered the inter-residue communication. This study demonstrates the potential application of the developed tAF2-MD-SPM for the fast computational evaluation, redesign and ranking of new enzyme variants. This is exciting for achieving the ultimate goal of computationally redesigning new enzymes with nature-like catalytic efficiencies.
For the evaluation of the SPM graph considering only open or closed states of the COMM domain, two clusters were obtained by fitting the trajectories, described in terms of the open-to-closed path, to two Gaussian distributions using the Gaussian mixture model implemented in the scikit-learn package.47 For each of the open and closed states, a distance matrix and a DCC matrix were obtained. From these matrices onwards, the SPM procedure is the same as in the default protocol. Finally, the SPM graph was also generated from a new distance matrix that combines the individual open and closed distance matrices: using the previously computed clusters, a single distance matrix was created. In this case, however, the DCC matrix was constructed with all data, without filtering by the open/closed clusters. The rest of the SPM analysis was identical to the default procedure described above. As discussed in the Results section, the SPM graphs were constructed using different trajectory lengths.
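The state-splitting step described above can be illustrated with scikit-learn's `GaussianMixture`. The sketch assumes that each frame is described by a single open-to-closed path coordinate, that the component with the larger mean corresponds to the closed state, and that the trajectory coordinates are already aligned. The helper names (`closed_state_mask`, `mean_distance_matrix`, `dcc_matrix`) and the synthetic data are hypothetical. The rule used to combine the open and closed distance matrices is not given in the text, so it is not sketched here.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def closed_state_mask(path_progress, random_state=0):
    """Fit a two-component GMM to the path coordinate; True marks 'closed' frames."""
    x = np.asarray(path_progress, dtype=float).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=random_state).fit(x)
    labels = gmm.predict(x)
    # Assumption: the component with the larger mean path index is the closed state.
    closed = int(np.argmax(gmm.means_.ravel()))
    return labels == closed


def mean_distance_matrix(coords):
    """coords: (n_frames, n_residues, 3) -> mean inter-residue distances."""
    diff = coords[:, :, None, :] - coords[:, None, :, :]
    return np.linalg.norm(diff, axis=-1).mean(axis=0)


def dcc_matrix(coords):
    """Dynamic cross-correlation of residue displacements (aligned frames assumed)."""
    disp = coords - coords.mean(axis=0)
    cov = np.einsum("tik,tjk->ij", disp, disp) / coords.shape[0]
    norm = np.sqrt(np.diag(cov))
    return cov / np.outer(norm, norm)


# Synthetic data: 200 'open' and 200 'closed' frames of a 5-residue toy system.
rng = np.random.default_rng(0)
progress = np.concatenate([rng.normal(3.0, 0.5, 200), rng.normal(13.0, 0.5, 200)])
coords = rng.normal(size=(400, 5, 3))

closed = closed_state_mask(progress)
dist_closed = mean_distance_matrix(coords[closed])
dist_open = mean_distance_matrix(coords[~closed])
dcc_closed = dcc_matrix(coords[closed])
dcc_open = dcc_matrix(coords[~closed])
print(int(closed.sum()), dist_closed.shape, dcc_open.shape)
```

Each pair of per-state matrices can then be passed to the same graph construction used for the default SPM.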
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3fd00156c
This journal is © The Royal Society of Chemistry 2024