Michael
Gastegger
*a,
Kristof T.
Schütt
ab and
Klaus-Robert
Müller
abcd
aMachine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany. E-mail: michael.gastegger@tu-berlin.de
bBerlin Institute for the Foundations of Learning and Data, 10587 Berlin, Germany
cDepartment of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
dMax-Planck-Institut für Informatik, 66123 Saarbrücken, Germany
First published on 23rd July 2021
Fast and accurate simulation of complex chemical systems in environments such as solutions is a long standing challenge in theoretical chemistry. In recent years, machine learning has extended the boundaries of quantum chemistry by providing highly accurate and efficient surrogate models of electronic structure theory, which previously have been out of reach for conventional approaches. Those models have long been restricted to closed molecular systems without accounting for environmental influences, such as external electric and magnetic fields or solvent effects. Here, we introduce the deep neural network FieldSchNet for modeling the interaction of molecules with arbitrary external fields. FieldSchNet offers access to a wealth of molecular response properties, enabling it to simulate a wide range of molecular spectra, such as infrared, Raman and nuclear magnetic resonance. Beyond that, it is able to describe implicit and explicit molecular environments, operating as a polarizable continuum model for solvation or in a quantum mechanics/molecular mechanics setup. We employ FieldSchNet to study the influence of solvent effects on molecular spectra and a Claisen rearrangement reaction. Based on these results, we use FieldSchNet to design an external environment capable of lowering the activation barrier of the rearrangement reaction significantly, demonstrating promising venues for inverse chemical design.
Recently, machine learning (ML) methods have emerged as a powerful strategy to overcome this trade-off between accuracy and efficiency inherent to computational chemistry approaches.5–8 Highly efficient ML models now provide access not only to interatomic potential energy surfaces,9–17 but also to a growing range of molecular properties.18–24 Specialized ML architectures have been developed for the prediction of vectorial and tensorial quantities,25–27 such as dipole moments,28 polarizabilities29 and non-adiabatic coupling vectors.30 Such advances pave the way for using ML models in practical applications, with the simulation of infrared,28 Raman,31,32 ultraviolet33 and nuclear magnetic resonance (NMR) spectra34 being only a few examples. At the same time, there is an ongoing effort to incorporate more physical knowledge into ML algorithms, giving rise to semi-empirical ML schemes35,36 and even models based on electron densities37–39 and wavefunctions.40–42
Most of these approaches operate on closed systems, where the molecule is not subjected to any external environment. Christensen et al.43 proposed a general framework for modeling response properties with kernel methods. In this context, and in order to model electric field dependent properties, the FCHL representation44 was extended by an electric field model based on rough estimates of atomic partial charges. This makes it possible to predict various response properties beyond atomic forces, such as dipole moments across compositional space as well as static infrared spectra. Note that this approach inherently relies on molecular representations which are able to capture the required perturbations of the energy.
In this work, we propose the FieldSchNet deep learning framework, which models the interactions with arbitrary environments in the form of vector fields. As a consequence, our model is able to describe various implicit and explicit interactions with the molecular environment using external fields as a physically motivated interface. Moreover, the field-based formalism automatically grants access to first and higher order response functions of the potential energy, with polarizabilities and nuclear magnetic shielding tensors being only a few examples. This enables the prediction of a wide range of molecular spectra (e.g. infrared, Raman and NMR) without the need for additional, specialized ML models.
FieldSchNet can be used as a polarizable continuum model for solvation45 or to interact with an electrostatic field generated by explicit external charges in an QM/MM-like setup46 – an approach we refer to as ML/MM. As the model offers speed-ups of up to four orders of magnitude compared to the original electronic structure reference, we can employ FieldSchNet to investigate the influence of solvent effects on the Claisen rearrangement reaction of allyl-p-tolyl ether – a study out of reach for conventional high-level electronic structure or ML approaches. While these simulations would take 18 years with the original electronic structure reference, they could be performed within 5 hours with FieldSchNet.
FieldSchNet goes well beyond the scope of previous ML approaches by providing an analytic description of a chemical system in its environment. We exploit this feature by designing an external field in order to minimize the height of the reaction barrier in the above Claisen reaction. Hence, FieldSchNet constitutes a unified framework that enables not only the prediction of spectroscopic properties of molecules in solution, but even the inverse chemical design of catalytic environments.
xl+1i = xli + wli + uli + vli, | (1) |
Starting from an initial embedding depending only on the respective atom type. Here, wli is the standard SchNet interaction (Fig. 1a, left block), while the added terms uli and vli correspond to dipole-field and dipole–dipole interactions (Fig. 1a, right block). The SchNet interaction update then takes the form
(2) |
The radial interaction functions Wlq depend on the interatomic distance rij and are learned from reference data. A fully-connected neural network is applied afterwards performing a non-linear transformation. From here on, we will represent networks of this type as NNlw, where the superscript l indicates the current interaction layer and the subscript different sets of parameters.
In terms of rotational symmetry, the SchNet feature refinements can be interpreted as charge–charge interactions, as the features of xi are scalars. Hence, the internal structure of SchNet, as well as the generated representation, is invariant with respect to rotations and translations of the molecule. This is an important requirement for machine learning potentials, as the energy of an atomic system exhibits the same symmetry.47 However, this invariance breaks down in the presence of external fields. In this case, models also need to be constructed such that they are able to resolve rotations and translations relative to an external frame of reference.
FieldSchNet achieves this requirement by introducing an additional vector valued representation based on atomic dipole moments (left side of Fig. 1a), which is refined analogous to eqn (1):
(3) |
Based on these features, FieldSchNet models the interaction between molecule and external fields uli with the term
uli = NNlε[(μli)Tεext(Ri)], | (4) |
In addition to the dipole-field interaction, FieldSchNet introduces an update vli based on the interaction between neighboring dipoles
(5) |
(6) |
For each external field, FieldSchNet adds a set of dipole features and expands the update in eqn (1) by the corresponding dipole-field and dipole–dipole terms. As the scalar features xi of the next layer depend on the dipole and field interactions of the current layer, those in turn are coupled with the preceding scalar features via the dipoles (eqn (3)). This allows FieldSchNet to capture interactions with external fields far beyond the linear regime.
As visualized in Fig. 1d at the example of an ethanol molecule, the FieldSchNet descriptor exhibits the same symmetry as the molecule (top). Upon introducing an electric field in the z-axis of the molecule, the descriptor adapts to the changed environment and the original symmetry is broken (middle). This effect can also be observed in the response of the descriptor with respect to the field (bottom). At the same time, multiple successive SchNet and dipole–dipole updates enable the efficient construction of higher-order features of the molecular structure (see Fig. 1c), going beyond the purely radial dependence of charge-like features. Although inspired by electric dipoles μ, the dipole-like representations in FieldSchNet are auxiliary constructs and may also represent other quantities, e.g. the nuclear magnetic moments Ii when modeling magnetic fields. An appropriate form of the dipoles corresponding to each field is learned purely data-driven.
Once the FieldSchNet representation xi has been constructed, the potential energy is predicted via the atomic energy contributions typical for atomistic networks (Fig. 1b)
(7) |
Since the features xi depend on the interactions between dipoles and fields of each previous layer and the dipoles are in turn constructed from the features, the potential energy is now a function of the atomic positions, nuclear charges and all external fields as well as initial atomic dipoles. This makes it possible to obtain response properties from the energy model by taking the corresponding derivatives.
(8) |
aWli = NNla(xli), | (9) |
(10) |
(11) |
These charges are fully polarized charges, depending not only on the molecular structure but also the charge distribution of the environment.
We train a single FieldSchNet model on reference data generated with the PBE0 functional for an ethanol molecule in vacuum. The dataset is based on the MD17 ethanol data53 and contains a total of 10000 structures (see ESI† for more details). A combined loss function incorporates the energy (E), atomic forces (F), dipole moment (μ), polarizability (α) and nuclear shielding tensors (σ) as target properties. Details on reference data generation and training are provided in the ESI.† Excellent fits were obtained for all quantities, as can be seen based on the test accuracy reported in ESI Table S3.† Energies and force predictions fall well within chemical accuracy, with mean absolute errors (MAEs) of 0.017 kcal mol−1 and 0.128 kcal mol−1 Å−1. The response properties of the electric field exhibit MAEs as low as 0.04 D (μ) and 0.008 bohr3 (α) compared to value ranges of 4.56 D and 13.58 bohr3 present in the reference data. In a similar manner, FieldSchNet yields low MAEs for the shielding tensor σ, exhibiting an error of 0.123 ppm for hydrogen (range of 29.43 ppm) and 0.194 ppm for carbon atoms (range of 153.94 ppm).
In the following, we simulate a range of spectra using the multi-property FieldSchNet model (see ESI† for computational details). Fig. 2a shows infrared spectra obtained from static calculations and the dipole time-autocorrelation functions simulated by molecular dynamics and ring-polymer PIMD. In addition, a static spectrum obtained with the reference method, as well as the gas-phase experimental spectrum are provided.54 While the FieldSchNet model is able to reproduce the static reference almost exactly, comparison of both spectra to experiment demonstrates the drawbacks of relying on purely static calculations for the prediction of vibrational spectra in general.
The potential of the machine learning model becomes apparent when going beyond the static picture. In order to predict vibrational spectra from molecular dynamics simulations, a large number of successive computations are necessary, which quickly become prohibitive when relying on electronic structure methods. However, due to the computational efficiency of FieldSchNet, these simulations can be carried out with little effort. A single evaluation of all response properties takes 220 ms on a Nvidia Tesla P100 GPU compared to a computation time of 207 s with the original electronic structure method on a Intel Xeon E5-2690 CPU, a speedup by almost three orders of magnitude. As a consequence, a simulation which would take 240 days with conventional approaches can now be performed in approximately 6 hours. This huge step in efficiency grows even stronger for larger systems.
The benefits of performing molecular dynamics simulations can be observed in the spectrum recovered in this manner, as it exhibits a much better agreement with experiment in the low frequency regions. Nevertheless, this approach still fails to reproduce positions and intensities of the bands associated with the stretching vibrations of the C–H bonds (∼3000 cm−1) and the O–H bond (∼3600 cm−1). These differences are primarily due to the neglect of anharmonic and nuclear quantum effects. One way to include these effects is via PIMD, where multiple coupled replicas of the molecule are simulated. While this approach is computationally more demanding than classical molecular dynamics due to the additional replicas, the simulations can still be carried out efficiently with FieldSchNet. As can be seen in Fig. 2a, accounting for anharmonic effects does indeed shift the C–H stretching vibrations to the experimental wavelengths and improves the position the O–H band, yielding a predicted spectrum close to experiment.
In addition to infrared spectra, FieldSchNet enables the simulation of Raman spectra, which rely on molecular polarizabilities α. Since the full polarizability tensor is predicted by the model, it is possible to compute polarized as well as depolarized Raman spectra. Fig. 2b depicts both types of spectra as obtained with FieldSchNet via path integral molecular dynamics, as well as their experimental counterparts.55 In both cases, very good agreement with experiment is observed, with the most prominent difference being once again the vibrations in the O–H stretching regions. The quality of the predicted spectra is particularly striking, when comparing to the statically computed polarized Raman spectrum, which fails to reproduce the shapes and magnitudes of several peaks.
Beyond vibrational spectra, FieldSchNet can be used to obtain NMR chemical shifts via the nuclear shielding tensors σi. In this manner, chemical shifts can be obtained for all NMR active isotopes, which in the case of ethanol are 1H, 13C and 17O. The predicted and reference chemical shifts for the equilibrium configuration of ethanol are provided in Fig. 2c and d. In addition, the distribution of shifts sampled during PIMD simulations are shown. 17O shifts are omitted from the analysis, as only one oxygen nucleus is present. However, the shifts are still reproduced accurately and the associated error is given in ESI Table S3.† FieldSchNet predictions agree closely with the reference method for the 1H and 13C isotopes. The 1H chemical shifts of the hydrogens in the CH2 and CH3 groups are close to their expected experimental values of 1.2 ppm and 3.8 ppm, respectively. The peak shifted to 1 ppm is associated with the hydrogen atom in plane with the O–H bond. The resulting band structure vanishes in the molecular dynamics simulation due to rotations of the methyl group. A major disagreement with experiment is the shift of the hydrogen in the O–H group which shows uncharacteristically low values of 1 ppm. This can be attributed to a shortcoming of the reference method, which exhibits the same behavior. The 13C shifts of the CH2 and CH3 carbon atoms agree almost perfectly with their experimental values of ∼60 ppm and ∼20 ppm.
By using an expression for the external field adapted from the reaction field approach of Onsager,48 FieldSchNet can operate as a machine learning model for polarizable continuum solvents with an explicit dependency on ε (see Section 2.2). To study this mode of operation, we train such a polarizable continuum FieldSchNet (pc-FieldSchNet) on a reference data set composed of computations for a ethanol molecule in the gas phase, as well as ethanol (ε = 24.3) and water (ε = 80.4) continuum solvents. To this end, the 10000 ethanol structures in vacuum were recomputed for each environment (ethanol and water) and the combined dataset was then screened for numerical instabilities due to the cavity generation process in the PCM computations, yielding a total of 27990 reference structures (see ESI† for details). Again, the model is trained to predict the potential energy, the atomic forces and the response properties for the molecular spectra. pc-FieldSchNet reproduces all quantities with high accuracy, comparable with that of the model for response properties in vacuum (see ESI Table S3†). This demonstrates that pc-FieldSchNet is able to learn the correct dependence on the specific dielectric constant ε.
ESI Table S3† furthermore shows results for two benchmark datasets of methanol (ε = 32.63) and toluene (ε = 10.3), solvents that have not been included in training. We find that pc-FieldSchNet generalizes well to these solvents with errors comparable to the prediction of solutions used for training. The accuracy for toluene is slightly lower with the polarizability showing particularly high mean absolute errors of 0.243 bohr3. However, this is can be attributed to an insufficient sampling of the regions of low polarity, as the most similar solvent included in training is vacuum. The methanol dataset is reproduced with high accuracy, demonstrating that the model is able to generalize across unseen continuum solvents.
We train a FieldSchNet model on a set of PBE0 reference data of 30000 ethanol configurations polarized by external charge distributions that have been sampled from a classical force field (three classical ethanol environments for each vacuum structure, more details on reference data generation and model training can be found in the ESI†). The test set performance of the resulting model is provided in ESI Table S3.† We observe slightly increased errors for the ML/MM model in energy and forces compared to the vacuum and continuum models due to the more complex environment. Still, the model is able to reach high accuracy.
ML/MM molecular dynamics simulations are carried out for a single machine learning ethanol surrounded by a solvent box of 1250 ethanol molecules treated at force field level (see ESI† for computational details.). The resulting infrared spectrum is depicted in Fig. 3, alongside spectra computed in the same manner using the vacuum model (Section 3.1) and continuum model (ε = 24.3), as well as an experimental spectrum of liquid ethanol.57 Comparing the gas phase and continuum models, we find that in this case implicit solvent effects lead to no major improvements with respect to experiment. The spectrum simulated via the ML/MM approach on the other hand, yields significantly better predictions. The low wavelength regions in particular show excellent agreement with experiment. The high wavelength regions are shifted to higher wavelengths since anharmonic effects are neglected in the classical ML/MM simulation. However, we still observe the broadening and red-shift of the O–H stretching vibration present in the experimental spectrum. This effect is caused by hydrogen bonding between the O–H groups in the machine learning region and the surrounding ethanols (see inset Fig. 3). Continuum models fail to account for these kind of interactions, as they neglect the structure of the solvent (see ML/PCM Fig. 3).
While we have trained the FieldSchNet ML/MM model using a rather large training set, adaptive sampling schemes can drastically reduce the amount of required reference calculations in practice.28,58–60 To demonstrate this, we select a representative set of configurations from the ML/MM ethanol reference data (see ESI† for details). An initial ensemble of models is first trained on a small subset (100 configurations) randomly selected from the original data. We then use variance of the predictions of this ensemble to estimate the uncertainty associated with the other samples in the ethanol reference data. The 100 configurations with the highest uncertainty are added to the initial dataset, the FieldSchNet ensemble is retrained, and the procedure is repeated until the desired accuracy is obtained. The final data set contains only 2000 configurations in contrast to the initial 30000 while still yielding an accurate IR spectrum (see ESI Fig. S4†).
Beyond that, the FieldSchNet ML/MM model can be extended in a systematic manner using adaptive sampling to include other solutes or solvents. We demonstrate this by constructing an ML/MM model for liquid methanol based on the FieldSchNet ensemble for ethanol from above. The system consists of a single ML-modeled methanol suspended in a box of 1859 methanol molecules treated at force field level. The predictive uncertainty of the pretrained ensemble is monitored and methanol ML/MM configurations with high uncertainty are selected. We recompute these geometries with the electronic structure reference and retrain our models on the expanded data set.
As can be seen in Fig. 4 even a single adaptive sampling iteration where the model has been trained 100 additional methanol configuration is enough to yield a qualitatively accurate infrared spectrum of liquid methanol in a ML/MM simulation. The spectrum accurately reproduces the difference between ethanol and methanol in the angular deformation vibrations at 1500 cm−1 and the C–C–O asymmetric (1080 cm−1) and symmetric (900 cm−1) stretching vibrations observed in ethanol are now absent.
Fig. 4 Infrared spectrum of methanol obtained via adaptive sampling. ML/MM spectrum of liquid methanol as obtained with a FieldSchNet model trained on 100 additional methanol configurations selected with adaptive sampling (blue). An experimental spectrum for liquid methanol is shown in gray.57 |
An example for such a reaction is the Claisen rearrangement of allyl-p-tolyl ether (Fig. 5a). The presence of water as a solvent accelerates this reaction by a factor of 300 compared to reaction rates in the gas-phase.61 Computational and experimental studies have determined that the main reason for this acceleration is explicit hydrogen bonding between the transition state and the water molecules of the solvent. These lead to a lowered barrier and promote the reaction.62,63 The combination of computational efficiency and accuracy with the ability to perform ML/MM simulations makes FieldSchNet well suited for modeling such reactions. Beyond that, the model provides access to a range of properties which can be used to characterize the different species formed during reaction.
We train two FieldSchNet models to simulate the first step in the Claisen rearrangement of allyl-p-tolyl ether. The first model is trained on 61000 PBE0 reference configurations sampled from a metadynamics trajectory of the reaction simulated at a lower level of theory (PBE, see ESI† for details). A second model is a ML/MM model based on the reference configurations determined above augmented by different MM charge distributions sampled via the TIP3P force field for water64 (three charge configurations for every third structure of the metadynamics simulation for a total of 61002 structures, see ESI† for details). Errors for both models are reported in ESI Table S4.†
We perform umbrella sampling65 in vacuum as well as a solvent box containing ∼7000 TIP3P water molecules using the respective FieldSchNet models (see ESI†). The difference between the two bonds broken and formed during rearrangement is chosen as reaction coordinate (indicated as rCO and rCC in Fig. 5a). The speedup offered by FieldSchNet for this system is even more pronounced than for ethanol. A single computation which takes 1.6 hours with the reference method can now be performed in 180 ms, corresponding to an acceleration by a factor of over ∼30000.
The resulting free energy barriers are depicted in Fig. 5b, along with potential energy barriers computed in vacuum using the reference method and vacuum model. The FieldSchNet ML/MM model correctly predicts a lower activation barrier (30.08 kcal mol−1) for the aqueous environment compared to the gas phase reaction (33.35 kcal mol−1). The overall difference in the barrier height ΔΔG = 3.28 kcal mol−1 is close to the experimental value of ΔΔG = 4 kcal mol−1.61 Analyzing the configurations sampled during the ML/MM simulation, we observe the hydrogen bonding between the ether oxygen and water molecules in the solvent responsible for the acceleration of the reaction.62,63Fig. 5c shows the radial distribution function between hydrogens in the solvent and the oxygen of the transition state. A pronounced peak at a distance of 2 Å indicates that hydrogen bonds between solvent and solute form frequently at this stage.
The NMR chemical shifts provided by FieldSchNet can be used to trace structural changes occurring during the rearrangement and allow to connect theoretical predictions to experiment. Fig. 5d depicts the 13C chemical shifts predicted for different stages of the reaction. For example, it shows how the first (C12) and third (C16) carbon atom in the allyl ether exchange chemical environments during reaction. The former moves from typical shifts of allyl ether groups (∼68 ppm) to shifts associated with terminal carbons in conventional allyl groups (∼130 ppm). At the same time, the C16 undergoes this process in reverse, ending at shifts of ∼50 ppm characteristic for this position in allyl substituents. Another change of interest are the shifts of the carbon C2 in the aromatic ring where a new bond forms. This atom starts at typical values for aromatic carbons (∼120 ppm), staying there during the transitions state. Upon formation of the product, the aromaticity of the ring is lost and the shift moves to regions more indicative for carbon atoms in cyclic ketones (∼25–50 ppm).
The external field in the ML/MM FieldSchNet model used to describe the reaction depends on the charge distribution of the environment. Thus, the external charges can be optimized to lower the reaction barrier by minimizing
(12) |
Fig. 6a shows the evolution of the barrier during various stages of the optimization procedure. By designing an optimal environment, the activation barrier can be reduced by ∼25 kcal mol−1, lowering it from an initial 35 kcal mol−1 to 10 kcal mol−1. These findings correspond to a rate acceleration by a factor of ∼2 × 1018 at 300 K. Fig. 6b and c show the optimized environment in presence of the transition state, visualizing regions of negative (b) and positive (c) charge. Motifs such as the strong negative density close to the oxygen atom and the neighboring carbons involved in the forming bond are in strong agreement with experimental studies, where it was found that electron donating groups in these regions promote the reaction.62,67
We outline how such an optimized field can guide design endeavors by translating these motifs into an atomic environment of amino acids as would be present in an enzymatic active site. Negatively charged aspartic acid residues (Asp) are placed close to regions exhibiting high negative density, while regions of high positive charge are populated with lysine (Lys) molecules. Using eqn (12), we optimize the placement of the amino acids based on the electrostatic field derived from their Hirshfeld atomic partial charges.68 The resulting environment is shown in Fig. 6d. Finally, we recompute the reaction barrier in presence of the amino acid charge distribution with the original electronic structure reference (Fig. 6e). Although the optimal environment cannot be reconstructed perfectly due to the constraint imposed by the structure of the molecules, this relatively straightforward approach leads to a significant reduction of the barrier. The activation energy is lowered from 35 kcal mol−1 to 22 kcal mol−1, which still corresponds to an acceleration by a factor of ∼3 × 109. This demonstrates that the theoretically optimized environment Fig. 6b and d can be mimicked retaining the same trend on the barrier and reaction speed.
Computational spectroscopy can benefit greatly from the FieldSchNet framework, as it provides efficient means for computing high quality spectra as we have demonstrated for IR and Raman spectra as well as NMR chemical shifts. Combining the latter with nuclear spin–spin coupling tensors, i.e. response properties of the magnetic moments, enables the accurate prediction of nuclear magnetic resonance spectra. In principle, all other spectroscopic quantities, derived via the response formalism, can be modeled by FieldSchNet as well.
The ability of our neural network to operate as a continuum model for solvation is not only highly attractive for applications in drug design, where efficient models of solvent effects are much sought after, but serves as a starting point for developing more powerful models, which could for example consider structural aspects of the solvent. When treating interactions with the environment explicitly, the introduced ML/MM procedure proves to be a powerful tool combining the accuracy of machine learning potentials with the even higher speed of force fields. While the presented study of solvent effects on the Claisen rearrangement reaction of allyl-p-tolyl ether would require 18 CPU years with the electronic structure reference, it was performed within 5 hours with FieldSchNet on a single GPU. This greatly expands the range of application of ML models and brings the simulation of large, biologically relevant systems, such as enzymatic reactions, within reach.
Moreover, the fully analytic nature of FieldSchNet enables inverse design as we have illustrated by minimizing the reaction barrier of the Claisen rearrangement through interactions with an optimized environment. Coupling this to a generative model of molecular structure,66 the charge distributions found using FieldSchNet may be populated in a fully automated fashion. Possible application of FieldSchNet to inverse design tasks include the targeted functionalization of compounds or the creation of enzyme cavities promoting reactions.
FieldSchNet constitutes a unified framework for describing reactions and spectroscopic properties in solution. Beyond that, it provides insights on how these quantities can be controlled via the molecular environment. As this opens up new avenues for designing workflows tightly integrated with experiment, we expect FieldSchNet to become a valuable tool for chemical research and discovery.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d1sc02742e |
This journal is © The Royal Society of Chemistry 2021 |