Open Access Article
I. S.
Novikov
a,
Y. V.
Suleimanov
*bc and
A. V.
Shapeev
*a
aSkolkovo Institute of Science and Technology, Skolkovo Innovation Center, Nobel St. 3, Moscow 143026, Russia. E-mail: a.shapeev@skoltech.ru
bComputation-based Science and Technology Research Center, Cyprus Institute, 20 Kavafi Street, Nicosia 2121, Cyprus
cDepartment of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA. E-mail: ysuleyma@mit.edu
First published on 9th November 2018
We propose a methodology for the fully automated calculation of thermal rate coefficients of gas phase chemical reactions, which is based on combining ring polymer molecular dynamics (RPMD) and machine-learning interatomic potentials actively learning on-the-fly. Based on the original computational procedure implemented in the RPMDrate code, our methodology gradually and automatically constructs the potential energy surfaces (PESs) from scratch with the data set points being selected and accumulated during the RPMDrate simulation. Such an approach ensures that our final machine-learning model provides a reliable description of the PES that avoids artifacts during exploration of the phase space by RPMD trajectories. We tested our methodology on two representative thermally activated chemical reactions studied recently by RPMDrate at temperatures within the interval of 300–1000 K. The corresponding PESs were generated by fitting to only a few thousand automatically generated structures (less than 5000) while the RPMD rate coefficients showed deviation from the reference values within the typical convergence error of RPMDrate. In future, we plan to apply our methodology to chemical reactions that proceed via complex-formation thus providing a completely general tool for calculating RPMD thermal rate coefficients for any polyatomic gas phase chemical reaction.
Despite the instantaneous success of RPMDrate code,16 the current version is restricted to a limited number of chemical reactions for which the underlying potential energy surfaces (PESs) are available in an analytical form. For the code to become a generally useful tool, efficient ways to couple RPMD with electronic structure evaluations are required. In principle, a PES can be calculated “on-the-fly”, but even with the most advanced supercomputers, it is extremely CPU-intensive and is generally limited to fairly short propagation times. This challenge has been partially solved by approximating a limited number of quantum-mechanical calculations (typically tens of thousands), constructing a PES using the permutation invariant polynomial-neural network (PIP-NN) method.17–19 However, during preliminary RPMDrate simulations for several polyatomic systems, convergence issues were detected due to artifacts in the PIP-NN PESs resulting from a lack of points in data sets in certain areas (see, e.g., the supporting information file of ref. 20). As compared to classical trajectories that are normally used for verification of the PESs, RPMD trajectories provide more enhanced sampling of the phase space by using ring polymer beads, which can enter the potential artifact zones.
During the past few years, the application of machine learning to constructing PESs has gained a lot of attention.21–48 The methods are based on neural networks,21–33,45,46,48 Gaussian processes,34–38 and other methods.39,40 Also, closely related are energy-free (i.e., non-conservative) machine-learning force fields.41–43 Among these is the moment tensor potential (MTP).39,49 We use the MTP as an interatomic interaction model in this work.
The goal of this work is to propose an algorithm for automatically constructing an approximation to the reference PES for any given molecular system for the subsequent calculation of RPMD thermal rate coefficients. The main challenge in automatically constructing such an approximation is to automatically assemble the training set that can be used to fit a good potential. A natural idea would be to use RPMDrate itself to sample the needed configurations for training, but the original version of RPMDrate requires a fitted potential to run. This seems to be a vicious circle: we need a training set in order to fit a potential, while we need a potential in order to sample a relevant training set. We resolve this challenge by applying the active learning (AL) approach, proposed in ref. 50 for linearly parametrized potentials and extended to nonlinearly parametrized models in ref. 49 and 51. The idea of the approach is to let RPMDrate sample the needed configurations, and for each configuration, decide on-the-fly whether a potential can yield reliable energies and forces or if it needs to be trained on this configuration. The underlying algorithm for choosing configurations for training is based on a D-optimality criterion for selecting the configurations in the training set (after computing its energy and forces using an ab initio potential). The core of this criterion is the so-called maxvol algorithm, proposed in ref. 52. We refer to the combined approach as AL-MTP (active-learning moment tensor potential).
In this paper, we propose and test a combination of AL-MTP and RPMDrate for predicting chemical reaction thermal rate coefficients. For the present study, we have selected two exemplifying systems, namely, OH + H2 → H + H2O and CH4 + CN → CH3 + HCN, recently studied using RPMDrate.53,54 The RPMD rate coefficients and the corresponding analytical PESs54,55 for these chemical reactions were readily available to us at the time we started this project. As our main purpose is to demonstrate the feasibility of our new approach, we consider these PESs as ab initio models and compare the rate coefficients predicted by these models to the ones calculated using the MTPs. We emphasize that although the practical purpose would be to fit machine-learning PESs to accurate quantum-mechanical models and hence calculate accurate reaction rates, the purpose of this work is to test the accuracy of our approach and hence we fit our PESs to the existing accurate and efficient PESs for which we can compute the reaction rates used as a reference for our models.
. Each contribution is further expanded as a linear combination of basis functions Bα,![]() | (1) |
![]() | (2) |
![]() | (3) |
is another set of parameters to be fitted and φβ are the radial basic functions (expressed through the Chebyshev polynomials and ensuring a smooth cut-off to 0 for r > Rcut). One can think of the functions fμ as the ones that define the shells of neighboring atoms, while the coefficients
express the relative weights of atomic species zj in the μ-th shell of the i-th atom.
We then construct our basis functions Bα as different contractions of the moment tensor descriptors (2) to a scalar, such as
| B0(ri) = M0,0(ri), |
| B1(ri) = M0,0(ri)M1,0(ri), |
| B2(ri) = M0,2(ri):M1,2(ri),… |
and hence we denote the MTP energy of a configuration x by E = E(θ;x).
![]() | (4) |
In order to find the active set, we use the so-called maxvol algorithm proposed in ref. 52. For a configuration x*, we then define its extrapolation grade γ(x*) as the maximum, by the absolute value, a factor by which the above determinant can increase if we try to replace each x(i) by x*. We emphasize that γ(x*) does not depend on the ab initio data, it depends only on the geometric information of the configuration x*. Thus, it is not necessary to carry out ab initio calculations to calculate the extrapolation grade.
In order to formulate our active learning algorithm, we introduce two thresholds: γth and Γth, 1 < γth < Γth. These thresholds define the bounds of permissible extrapolation. Thus, our AL algorithm can be systemized as follows:
• For each configuration x* occurring in the RPMDrate simulation, we calculate γ(x*). If γ(x*) < γth, then x* will not be added to the training set. Otherwise, there are two possibilities:
(a) γth ≤ γ(x*) < Γth. In this case, we think of γ(x*) as sufficiently high for x* to be added to the training set, but not too high to terminate the RPMDrate simulation. Hence, in this case, we mark (save to a file) the configuration x* and proceed with the RPMDrate simulation.
(b) γ(x*) ≥ Γth. In this case, the extrapolation grade is too high, therefore we add x* to the training set and terminate the RPMDrate simulation. We then update the active set with the marked configurations, calculate their ab initio energies and forces, add them to the training set, refit the potential, and repeat the entire RPMDrate simulation from the beginning.
As a result, our algorithm will restart RPMDrate several times until the training set covers the needed region in the phase space. We emphasize that the potential is fixed at each RPMDrate run, thus ensuring that the code samples a proper canonical ensemble at each run.
Through the algorithm described above, our potential is trained in a fully automatic manner, removing the need for tedious manual analysis of the quality of the PES being constructed. The scheme of our AL-MTP algorithm is shown in Fig. 1.
We study the first reaction, OH + H2 → H + H2O, at T = 300 K and T = 1000 K with nbeads = 1 at both temperatures, nbeads = 128 at the low temperature and nbeads = 16 at the high temperature. We run the second reaction, CH4 + CN → CH3 + HCN, at T = 300 K and T = 600 K with the same number of ring polymer beads at the low and the high temperatures as for the first reaction.
The remaining input parameters for the RPMDrate simulations are similar to those used in numerous studies of thermally activated chemical reactions.5 In order to obtain the PMF profiles for both chemical reactions, we divide the interval −0.05 ≤ ξ ≤ 1.05 into 111 windows of width 0.01. The umbrella force constant was set to ki = 2.72 ((T/K) eV) for each window centered at ξi, i = 1,…,111. In every window, we run 80 constrained RPMD trajectories with the sampling period of 50 ps and the equilibration period of 15 ps. Finally, the propagation time step was set to equal 0.0001 ps.
For the calculation of κ, we choose slightly different parameters depending on the chemical reaction. For the OH + H2 system, all the calculations (except the computation with nbeads = 128) are carried out along 20
000 unconstrained child trajectories (Ntotalchild) with the equilibration time of 10 ps (tequilibration) and 100 child trajectories per one initially constrained configuration (Nchild). All the unconstrained child trajectories run for tchild = 0.05 ps with the time step dt = 0.00005 ps. For the case of nbeads = 128, we increase the number of unconstrained child trajectories up to 25
000 and the time step is set to 0.0001 ps. For the CH4 + CN system, we take the following parameters: Ntotalchild = 50
000, tequilibration = 5 ps, Nchild = 100, tchild = 0.06 ps, and dt = 0.0001 ps.
As mentioned above, we consider the potentials described in ref. 54 and 55 as ab initio models for the present exemplifying study. The potential for the OH + H2 → H + H2O reaction was developed using Neural Network (NN) fitting55 and is denoted as NN1 PES. Another potential, applied for the CH4 + CN system, is a combination of various semi-empirical potentials54 including 34 parameters that were obtained after fitting this potential on the data set, describing the stationary points, the reaction path and the reaction swath. For simplicity, we shall call this potential CH4 + CN PES though we note that its original abbreviation is different (PES2017). We also note that the previous RPMD studies using these PESs demonstrated very good agreement with the experimental measurements of rate coefficients.53,54
We fit the MTP with 92 basis functions Bα, 4 radial functions fμ and 12 radial basis functions φβ. This results in approximately 300 and 500 MTP parameters for OH + H2 and CH4 + CN, respectively. We choose Rcut = 4 and 6 Å, respectively, for these systems. The active learning was performed with γth = 2 and Γth = 10, thus the interval of high but permissible grades is [2,10).
As described above, we need to compute kQTST (the first step of RPMDrate) and κ (the second step of RPMDrate). In order to obtain kQTST, we focus only on the region connecting the reactants with the transition state (i.e., ξ ∈ (−0.05, 1.05)). During the second step – computation of κ – the RPMD trajectories visit the products region, i.e., ξ > 1.05. The geometries of the configurations in the reactant and product regions are different and, thus, we use a slightly different MTP for the calculation of kQTST and κ trained on two data sets. More precisely, during the first RPMDrate step, we form the reactant set (kQTST set) that consists of configurations selected from the reactant region and learn on-the-fly the first MTP. During the second RPMDrate step, we start from the MTP and the training set derived after the first step, update the training set with the additional configurations (the product set, or, κ set) and learn on-the-fly the second MTP. Thus, the two MTPs differ by their training sets—the first training set is a subset of the second one. Having computed kQTST and κ, respectively, by these two MTPs, we obtain the final RPMD rate coefficient.
| T = 300 K | T = 300 K | T = 1000 K | T = 1000 K | |
|---|---|---|---|---|
| n beads = 1 | n beads = 128 | n beads = 1 | n beads = 16 | |
| k AIQTST (cm3 s−1) | 5.74 × 10−16 | 2.37 × 10−14 | 2.78 × 10−12 | 3.72 × 10−12 |
| k MTPQTST (cm3 s−1) | 5.37 × 10−16 | 1.84 × 10−14 | 2.91 × 10−12 | 3.97 × 10−12 |
| Error (%) | 6.5 | 22.3 | 4.7 | 6.7 |
| κ AI | 0.613 | 0.528 | 0.666 | 0.599 |
| κ MTP | 0.626 | 0.527 | 0.649 | 0.589 |
| Error (%) | 2.1 | 0.2 | 2.6 | 1.7 |
| k AIRPMD (cm3 s−1) | 3.52 × 10−16 | 1.25 × 10−14 | 1.85 × 10−12 | 2.23 × 10−12 |
| k MTPRPMD (cm3 s−1) | 3.36 × 10−16 | 9.70 × 10−15 | 1.89 × 10−12 | 2.34 × 10−12 |
| Error (%) | 4.5 | 22.4 | 2.2 | 4.9 |
| T = 300 K | T = 300 K | T = 600 K | T = 600 K | |
|---|---|---|---|---|
| n beads = 1 | n beads = 128 | n beads = 1 | n beads = 16 | |
| k AIQTST (cm3 s−1) | 1.69 × 10−13 | 1.13 × 10−11 | 6.10 × 10−12 | 3.63 × 10−11 |
| k MTPQTST (cm3 s−1) | 1.61 × 10−13 | 1.35 × 10−11 | 6.17 × 10−12 | 3.48 × 10−11 |
| Error (%) | 4.7 | 19.5 | 1.1 | 4.1 |
| κ AI | 0.267 | 0.184 | 0.304 | 0.250 |
| κ MTP | 0.256 | 0.185 | 0.317 | 0.251 |
| Error (%) | 4.1 | 0.5 | 4.3 | 0.4 |
| k AIRPMD (cm3 s−1) | 4.51 × 10−14 | 2.08 × 10−12 | 1.85 × 10−12 | 9.07 × 10−12 |
| k MTPRPMD (cm3 s−1) | 4.12 × 10−14 | 2.50 × 10−12 | 1.95 × 10−12 | 8.73 × 10−12 |
| Error (%) | 8.6 | 20.2 | 5.4 | 3.7 |
The number of configurations selected in the reactant region (kQTST set size), the product region (κ set size) and the total training set (kRPMD set size) is reported in Table 3. As can be seen, we select many more configurations from the reactant region than we add from the product region (see Fig. 6). The reason for this is as follows. During the first RPMDrate step, we need to approximate the PMF difference between the reactants and the transition state as accurately as possible due to its exponential contribution to kQTST. Thus, we need to predict the PMF profile across each umbrella window, especially near the transition state. This fact is illustrated in Fig. 7. Indeed, most of the configurations for both systems were selected for ξ ∈ (0.95, 1.05), i.e., near the transition state. Then, it happens that for the purposes of calculating RPMD rate coefficients, a potential that is well-trained in the reactant region needs much less data to be fitted in the product region (only a few bonds significantly differ, while most of the bonds in the molecular systems are the same in both regions).
| OH + H2 → H + H2O | CH4 + CN → CH3 + HCN | ||||||
|---|---|---|---|---|---|---|---|
| T, nbeads | k QTST set size | κ set size | k RPMD set size | T, nbeads | k QTST set size | κ set size | k RPMD set size |
| 300 K, 1 | 1401 | 96 | 1497 | 300 K, 1 | 3348 | 581 | 3929 |
| 300 K, 128 | 1816 | 44 | 1860 | 300 K, 128 | 4138 | 380 | 4518 |
| 1000 K, 1 | 1784 | 123 | 1907 | 600 K, 1 | 3904 | 544 | 4448 |
| 1000 K, 16 | 2014 | 83 | 2097 | 600 K, 16 | 4572 | 320 | 4892 |
The size of a total training set significantly depends on the number of different atomic types in the molecule. Note that the number of parameters
grows as the square of the number of atomic types (as they depend on pairs of types of interacting atoms, (zi,zj)). Thus, there are 2.25 times more coefficients in the potential for the CH4 + CN system than for the OH + H2 one. As can be seen from Table 3, we need approximately 2.25 times more configurations in the training sets for the CH4 + CN system than for the OH + H2 one. This confirms that the number of coefficients grows quadratically with the number of atomic types and thus the proposed algorithms should be applicable for large molecular systems, the direct description of which is problematic due to high dimensionality.
We additionally test how the accuracy improves when the number of MTP parameters increases. This test was done for the CH4 + CN system, T = 300 K, nbeads = 1. In Fig. 8, the PMF profile and the transmission coefficient are plotted for three potentials, with 150, 250, and 500 parameters, respectively. As can be seen, the training set size increases with the number of parameters, and so does the accuracy.
The remaining two factors that affect the size of our training set are the number of ring polymer beads and the temperature. Increasing the number of ring polymer beads leads to more enhanced phase space exploration thus more configurations are necessary in the training set. The same is valid for the temperature factor: as the temperature increases, the energy dispersion increases and therefore we need more configurations in the dataset in order to describe all possible energy levels. We attribute both correlations to the fact that a higher temperature and higher number of beads imply that we need to sample a larger region in the phase space and therefore collect more configurations for training. In any case, the maximal training set size is less than 5000, thus, we needed to carry out less than 5000 ab initio calculations in order to obtain an MTP for both exemplifying chemical systems considered in the present study.
In future, we plan to extend our methodology to chemical reactions that proceed via complex formation in order to propose a completely general tool for calculating RPMD rate coefficients for any polyatomic chemical reaction. In principle, the Bennett–Chandler factorization can also be implemented in this case5 though the contribution from the real-time propagation of the dynamic factor significantly increases leading to possible alterations to the AL-MTP algorithm. This work is currently ongoing. Finally, we would like to note that our AL-MTP approach could be used in calculations of other dynamical properties (such as RPMD diffusion coefficients); the applicability of the algorithm does not depend on a predicted physical quantity.
| This journal is © the Owner Societies 2018 |