Tomoya Shiota,*ab Kenji Ishiharab and Wataru Mizukami*ab
aGraduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka 560-8531, Japan. E-mail: shiota.tomoya.ss@gmail.com; mizukami.wataru.qiqb@osaka-u.ac.jp
bCenter for Quantum Information and Quantum Biology, Osaka University, 1–2 Machikaneyama, Toyonaka 560-8531, Japan
First published on 16th July 2024
Accurate prediction of diverse chemical properties is crucial for advancing molecular design and materials discovery. Here we present a versatile approach that uses the intermediate information of a universal neural network potential as a general-purpose descriptor for chemical property prediction. Our method is based on the insight that a sophisticated neural network architecture trained as a universal force field learns transferable representations of atomic environments. We show that transfer learning with graph neural network potentials such as M3GNet and MACE achieves accuracy comparable to state-of-the-art methods for predicting NMR chemical shifts, using both quantum machine learning and a standard classical regression model, despite the compactness of their descriptors. In particular, the MACE descriptor demonstrates the highest accuracy to date on 13C NMR chemical shift benchmarks for drug molecules. This work provides an efficient way to accurately predict properties, potentially accelerating the discovery of new molecules and materials.
With machine learning, physics-inspired descriptors that characterize the chemical space have been developed and serve as the cornerstone for building efficient and highly accurate models.21,23,28–40 Smooth overlap of atomic positions (SOAP),14,18,21,28,31,32,41 Faber–Christensen–Huang–Lilienfeld (FCHL),29,30,33,34,38 and similar descriptors offer atom-level descriptions within molecular or material environments based on physical insights and are effective in regressing chemical quantities, such as interatomic potentials (IAP) and nuclear magnetic resonance (NMR) chemical shifts.11,14,15,21,23,24,28–38,41–43 Notably, IAPs built using descriptors and Gaussian process regression (GPR)14 have been termed Gaussian approximation potentials (GAP) and have found success in the exploration of the chemical space of molecules and materials.14,21,37 Both kernel ridge regression (KRR) and GPR have been employed to improve the accuracy of NMR chemical shift prediction.29,30,41–44 However, the dimensionality of the descriptors becomes a barrier to generalization and high accuracy as the molecular or material composition becomes more diverse owing to the addition of different types of elements.19,35,39,45
Recently, deep-learning models based on graph neural networks (GNNs) have been proposed to describe chemical spaces using graph representations.9,10,17–20,24,25,45–64 In most GNN-based IAPs, atoms within a molecular or material environment are represented as nodes, and their local connectivity as edges in a graph. The graph is then convolved to embed atom-specific information within each node, and further processed using multilayer perceptrons (MLP) to predict target observables. In molecular and materials simulation and modeling, the consideration of symmetry is extremely important. It is desirable for GNNs to be invariant or equivariant to symmetry operations such as translation, rotation, and reflection for the models to make physically meaningful predictions. GNNs that possess these properties are referred to as invariant GNNs or equivariant GNNs. The universal GNN-based IAPs proposed thus far have been designed to satisfy these symmetries. Recently, E(3) or SE(3) equivariant GNN-based IAPs (e.g., Allegro,61 GNoME,65 MACE62–64) have demonstrated superior performance compared to E(3) invariant GNN-based IAPs (e.g., MEGNet,52 M3GNet10).66,67
Similarly, GNN-based models have been developed to predict NMR chemical shifts.46,47,49,50,59,68 DFT-level calculations of 1H and 13C NMR chemical shifts can achieve an accuracy of 1–2% of the respective shift ranges of approximately 10 ppm and 200 ppm.69,70 Machine learning models trained on DFT-level datasets are therefore bounded by this level of precision, giving target accuracies of 0.2 ppm for 1H and 2 ppm for 13C.30 For example, Guan et al. achieved 0.16 ppm for 1H and 1.26 ppm for 13C by training the SchNet architecture51 on molecular NMR chemical shifts (CASCADE).46
However, scalability remains an issue because the cost of optimizing GNN and MLP parameters grows with dataset size. Han et al. addressed this issue by constraining the nodes in a GNN to heavy elements only, making the construction of scalable GNN-based NMR chemical shift models feasible while achieving state-of-the-art prediction accuracy comparable to that of CASCADE.68 Furthermore, NMR chemical shifts of nuclei beyond hydrogen and carbon have become crucial for understanding systems involving a wide range of elements, such as proteins and solids.71–77 Consequently, efforts are being made to develop machine learning models for NMR chemical shifts of nuclei such as 15N, 17O, and 19F.73–76 These nuclei exhibit wide chemical shift ranges of about 600, 2500, and 500 ppm for 15N, 17O, and 19F, respectively. By analogy with the 1H and 13C targets, the target accuracy for these nuclei is set at 25 ppm for 15N and 5 ppm for 19F.71–77
Notably, both descriptor-based and GNN-based methods face challenges: the former incurs increased learning costs as the composition becomes more complex, and the latter incurs increased parameter-optimization costs with larger training datasets. To address these issues simultaneously, we focused on the potential utility of the outputs of pre-trained GNN-based IAPs as descriptors. We treat these outputs as GNN transfer-learning (GNN-TL) descriptors and build machine-learning models on them for predicting chemical properties. Note that there are existing studies applying pre-trained GNN potentials to other tasks, particularly generative modeling.78–81
The remainder of this paper is organized as follows. Section 2 details the GNN-TL descriptor and the kernel method, implemented on both classical and quantum computers, for predicting NMR chemical shifts of 1H, 13C, 15N, 17O, and 19F. Section 3 presents the performance of our developed machine learning models. Section 4 discusses the benefits and applications of the GNN-TL descriptor. Finally, Section 5 concludes the paper.
When fed with the atomic coordinates of a molecule with N atoms, denoted by {Zi, Ri}, where Zi is the atomic number of the ith atom and Ri its three-dimensional position vector, the GNN layers generate a set of vectors {Gi}, each of which reflects the environment of the ith atom in the molecule. This is referred to as the GNN-TL descriptor. The GNN layers of MEGNet and M3GNet output GNN-TL descriptors with dimensions of 32 and 64 per atom, respectively. MACE, on the other hand, is a GNN architecture that predicts the energy in the form of an atomic cluster expansion. As in ref. 86, only the output of the first GNN layer, corresponding to the one-body term of the many-body expansion, is used as the GNN-TL descriptor. The dimensions of this GNN-TL descriptor are 128, 256, 96, and 224 per atom for MACE-MP0-small, MACE-MP0-large, MACE-OFF23-small, and MACE-OFF23-large, respectively.
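For concreteness, the extraction step can be sketched with the mace package's ASE interface. Note that the mace_off entry point, the get_descriptors method, and its invariants_only/num_layers arguments are assumptions about the current mace-torch API, not details taken from this paper:

```python
# A minimal sketch of extracting per-atom GNN-TL descriptors {G_i} from a
# pre-trained MACE model. The mace_off entry point and get_descriptors
# method are assumptions about the mace-torch API and may vary by version.
from ase.build import molecule
from mace.calculators import mace_off

calc = mace_off(model="small", device="cpu")  # MACE-OFF23-small
atoms = molecule("CH3CH2OH")                  # ethanol: {Z_i, R_i} via ASE

# invariants_only keeps rotation-invariant node features; num_layers=1
# restricts the output to the first GNN layer, as done in this work.
G = calc.get_descriptors(atoms, invariants_only=True, num_layers=1)
print(G.shape)  # expected (n_atoms, 96) for MACE-OFF23-small
```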
Using GNN-TL descriptors as input, a regression model was constructed to predict NMR chemical shielding constants. For the regressor, one can choose among methodologies such as GPR, KRR, or feed-forward neural networks (NNs), contingent on the specific task. In this study, to ensure a maximally fair comparison with other descriptor-based techniques, we adopted KRR.
KRR combines the merits of ridge regression, which offers regularization to mitigate overfitting, with the kernel method, facilitating nonlinear regression. In kernel methods, the data – in the context of our study, the GNN-TL descriptors – are mapped into a high-dimensional feature space through a non-linear kernel function. The Laplacian and Gaussian kernels were applied:
k(Gi, Gj) = exp(−γ‖Gi − Gj‖_p^p),  (1)

where γ is a kernel hyperparameter, and p = 1 yields the Laplacian kernel while p = 2 yields the Gaussian kernel.
The prediction σt of the target chemical property of a target atom is derived from its GNN-TL descriptor Gt as follows:

σt = Σi αi k(Gt, Gi),  (2)

where the sum runs over all training atoms i.
The coefficient vector α is obtained from the training data by solving the ridge-regularized linear system

α = (K + λI)−1σ,  (3)

where Kij = k(Gi, Gj) is the kernel matrix of the training descriptors, λ is the regularization hyperparameter, and σ is the vector of training targets (here, chemical shielding constants).
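A compact numpy sketch of eqs (1)–(3), with p = 1 for the Laplacian kernel and p = 2 for the Gaussian kernel (function and variable names are ours, for illustration only):

```python
import numpy as np

def kernel(Ga, Gb, gamma, p=1):
    """Eq. (1): p=1 gives the Laplacian kernel, p=2 the Gaussian kernel."""
    diff = np.abs(Ga[:, None, :] - Gb[None, :, :])    # pairwise |G_i - G_j|
    return np.exp(-gamma * (diff ** p).sum(axis=-1))  # exp(-gamma * ||.||_p^p)

def fit(G_train, sigma_train, gamma, p=1, lam=1e-8):
    """Eq. (3): alpha = (K + lambda*I)^(-1) sigma."""
    K = kernel(G_train, G_train, gamma, p)
    return np.linalg.solve(K + lam * np.eye(len(K)), sigma_train)

def predict(G_query, G_train, alpha, gamma, p=1):
    """Eq. (2): sigma_t = sum_i alpha_i * k(G_t, G_i)."""
    return kernel(G_query, G_train, gamma, p) @ alpha
```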
All computations related to the KRR were executed using Scikit-learn v.1.2.2,87 and the hyperparameters of each model were tuned using Optuna v.2.10.88 For dataset sizes of up to 50K items, we conducted hyperparameter optimization for 100 iterations with ten-fold cross-validation, while for those at 100K, we limited the optimization to 10 iterations.
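The tuning loop described above can be sketched as follows; the search ranges and the placeholder data are illustrative assumptions, not the settings used in this work:

```python
# A sketch of the hyperparameter search: scikit-learn KernelRidge with a
# Laplacian kernel, tuned by Optuna under ten-fold cross-validation.
import numpy as np
import optuna
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))  # placeholder 64-dim GNN-TL descriptors
y = rng.normal(size=500)        # placeholder shielding constants (ppm)

def objective(trial):
    model = KernelRidge(
        kernel="laplacian",
        alpha=trial.suggest_float("alpha", 1e-10, 1e-1, log=True),  # lambda in eq. (3)
        gamma=trial.suggest_float("gamma", 1e-6, 1e0, log=True),    # gamma in eq. (1)
    )
    # ten-fold cross-validation, as used for datasets up to 50K
    return cross_val_score(model, X, y, cv=10,
                           scoring="neg_mean_absolute_error").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=100)
```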
The quantum-kernel method leverages quantum computers to compute kernels,16,89,90 which is achieved by embedding feature vectors generated by classical computers into quantum states. The method calculates the inner product of these quantum states to derive the desired kernels. Embedding a feature vector into a quantum state corresponds to mapping it onto a Hilbert space whose dimension grows as 2^n with the number n of quantum bits (qubits). Using the kernel matrix constructed on a quantum computer, we performed KRR, denoted as quantum KRR (QKRR).
In this study, we adopted the natural parameterized quantum circuit (NPQC) kernel, which has been shown, both theoretically and in hardware experiments, to possess performance characteristics similar to those of the Gaussian kernel.91–93 All computations were conducted using Scikit-qulacs.87,94,95 The quantum kernel was constructed in a 10-qubit space. Hyperparameters for the quantum kernel were determined through a grid search, giving c = 1.5 and 40 repetitions of the embedding circuit. The regularization hyperparameter in QKRR was determined using 10 iterations of randomized search.
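The exact NPQC circuit is specified in refs. 91–93. As a generic illustration of the fidelity-kernel idea only, a sketch with qulacs is shown below; the RY/CNOT embedding is a placeholder, not the NPQC ansatz:

```python
# A generic fidelity-kernel sketch with qulacs: features are embedded into
# a 10-qubit state and the kernel is |<phi(x)|phi(x')>|^2. The RY/CNOT
# embedding is a placeholder illustration, not the NPQC circuit (refs. 91-93).
from qulacs import QuantumCircuit, QuantumState
from qulacs.state import inner_product

N_QUBITS = 10  # the quantum kernel in this work uses a 10-qubit space

def embed(x):
    """Map a (suitably scaled) feature vector onto a quantum state."""
    state = QuantumState(N_QUBITS)
    state.set_zero_state()
    circuit = QuantumCircuit(N_QUBITS)
    for q in range(N_QUBITS):
        circuit.add_RY_gate(q, float(x[q % len(x)]))  # angle embedding
    for q in range(N_QUBITS - 1):
        circuit.add_CNOT_gate(q, q + 1)               # entangling layer
    circuit.update_quantum_state(state)
    return state

def quantum_kernel(x1, x2):
    return abs(inner_product(embed(x1), embed(x2))) ** 2
```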
In Section 3.2, we focused on the accuracy of the GNN-TL descriptor in predicting NMR chemical shifts, which are key to understanding molecular details (e.g., interatomic distances and bond angles). This scenario provides an ideal test for determining how well the GNN-TL descriptor works in our study.
Our analysis began by comparing quantum kernel learning, in which the kernels are evaluated on a quantum computer emulator, with traditional kernel learning methods. We then checked the accuracy of the GNN-TL descriptors across the different pretrained GNN models.
Finally, we compared our GNN-TL descriptor against well-established physics-inspired descriptors. This comparison demonstrates the advantages of the proposed descriptor in terms of efficiency and accuracy. Furthermore, it highlights its potential for accurately predicting chemical properties, which is crucial for advancing research in the molecular and material sciences.
In Table 1, we present the scaling of the SOAP, FCHL19, and various GNN-TL descriptors as the number of elemental species increases. The table also summarizes the descriptor dimensions for the 5, 10, and 89 elemental species that make up the QM9, QMugs,85 and MPF.2021.8 or MPtrj datasets,10 respectively. Remarkably, with an increase in the number of element types, both SOAP and FCHL19 exhibit quadratic scaling. For example, when representing the five elements in the QM9 dataset, the SOAP and FCHL19 descriptors have dimensions of 5740 and 740, respectively. This dimensional disparity grows with the number of element types: to represent 89 elements, the dimensions increase to 1 737 120 and 162 336, respectively. These dimensions are hundreds to tens of thousands of times larger than those of the compact GNN-TL descriptors, which range from 32 to 256 per atom. Owing to their constant dimensionality, irrespective of the number of elements, the GNN-TL descriptors are overwhelmingly compact.
| N_elem | SOAPa | FCHL19a | SchNet GNN-TL | MEGNet GNN-TL | M3GNet GNN-TL | MACE-MP0-small GNN-TL | MACE-MP0-large GNN-TL | MACE-OFF23-small GNN-TL | MACE-OFF23-large GNN-TL |
|---|---|---|---|---|---|---|---|---|---|
| Scaling | O(N_elem^2) | O(N_elem^2) | O(1) | O(1) | O(1) | O(1) | O(1) | O(1) | O(1) |
| 5 | 5740 | 740 | 128 | 32 | 64 | 128 | 256 | 96 | 224 |
| 10 | 22 680 | 2440 | — | — | 64 | 128 | 256 | 96 | 224 |
| 89 | 1 737 120 | 162 336 | — | — | 64 | 128 | 256 | — | — |

a SOAP and FCHL19 were generated with Dscribe 0.4.0 (ref. 28) and QML 0.4.0,12,100 respectively. The default hyperparameters were selected as in the QM9NMR paper.
NMR chemical shifts δ were obtained from the predicted shielding constants σ via the shielding constant σref of a reference substance:

δ = σref − σ.  (4)
The reference substances selected for the various nuclei in this study are widely recognized and commonly adopted in the literature.29,101–104 Specifically, tetramethylsilane was selected for both 1H and 13C, nitromethane (MeNO2) for 15N, water-17O (H217O) for 17O, and trichlorofluoromethane (CFCl3) for 19F. We determined the chemical shielding constants for these well-established reference substances as follows: 31.7608 ppm for 1H, 187.0521 ppm for 13C, −147.8164 ppm for 15N, 325.8642 ppm for 17O, and 171.2621 ppm for 19F. These constants were evaluated by calculations at the mPW1PW91 (ref. 105)/6-311+G(2d,p) level using density functional theory (DFT) and gauge-including atomic orbital (GIAO)106 methods. Structure optimization was conducted at the B3LYP107/6-31G(2df,p) level in alignment with the methodologies employed for the QM9 NMR dataset. All calculations were performed using the Gaussian 16 software suite.108
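In code, eq. (4) together with these reference constants amounts to a lookup and a subtraction:

```python
# Eq. (4) applied with the reference shielding constants computed in this
# work at the mPW1PW91/6-311+G(2d,p) GIAO level (values in ppm).
SIGMA_REF = {"1H": 31.7608, "13C": 187.0521, "15N": -147.8164,
             "17O": 325.8642, "19F": 171.2621}

def shielding_to_shift(sigma, nucleus):
    """delta = sigma_ref - sigma, in ppm."""
    return SIGMA_REF[nucleus] - sigma

print(shielding_to_shift(160.0, "13C"))  # 27.0521 ppm
```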
In our study, we utilized the QM9NMR dataset, which contains approximately 134K small organic molecules composed of C, N, O, and F, each with no more than nine heavy atoms (H is not counted).29,82 This dataset provides detailed NMR chemical shielding constants for these molecules. To analyze how the model accuracy changes with training data size, we adopted an approach similar to that used in the original publication of the QM9NMR dataset.29 Specifically, for 13C, of a total of 831K data points, we randomly withheld 50K data points to build our test set. Subsequently, from the remaining 13C NMR chemical shifts, we randomly selected subsets containing 100, 200, 500, 1K, 2K, 5K, 10K, 50K, 100K, and 200K data points to create various training sets. For the other nuclei (i.e., 1H, 15N, 17O, and 19F), the test sets were similarly established by withholding 50K, 30K, 50K, and 1K data points, respectively. The training size for 19F was set to 2K, whereas the other nuclei were trained on datasets of 100K data points. In addition to the QM9NMR dataset, we sought to validate the performance of our model on external datasets. Hence, we employed the two sets of molecules provided in another study:29 one consisting of 40 drug molecules from the GDB17 universe and another containing 12 drugs with 17 or more heavy atoms.
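A minimal sketch of this splitting protocol (array names and the fixed seed are illustrative):

```python
# Sketch of the learning-curve protocol: withhold a fixed 50K test set for
# 13C, then draw nested training subsets from the remaining pool.
import numpy as np

rng = np.random.default_rng(0)           # illustrative fixed seed
perm = rng.permutation(831_000)          # ~831K 13C data points
test_idx = perm[:50_000]                 # 50K held-out test set
pool = perm[50_000:]

train_sets = {n: pool[:n] for n in
              (100, 200, 500, 1_000, 2_000, 5_000,
               10_000, 50_000, 100_000, 200_000)}
```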
Fig. 2 shows the relationship between the mean absolute error (MAE) for the 13C NMR shielding constant predictions and the training data size. Both QKRR and KRR demonstrated consistent improvements in predictive accuracy with an increase in training size. Notably, the quantum kernel exhibited a performance comparable to that of the Laplacian kernel. For a training size of 100K, the MAE for the 13C predictions was 2.28 ppm. In a comparative study by Gupta et al., the KRR models using the Coulomb matrix (CM),109 SOAP, and FCHL descriptors reported MAEs of approximately 4, 2.1, and 1.88 ppm, respectively, for the same training size.29 Compared with the CM descriptor, our GNN-TL descriptor showed significantly better predictive capabilities, achieving an MAE that was nearly half that of the CM descriptor. Although our method did not exceed the accuracy levels of SOAP and FCHL, the performance of the GNN-TL descriptor was competitive, highlighting its potential as a robust descriptor.
Next, we compared the performance of the GNN-TL descriptors derived from different IAP architectures. Recently, and independently of our work, a predictive model for 13C NMR chemical shielding was proposed using a pretrained IAP known as SchNet, a pioneering GNN, as a descriptor.110 This model was trained on 400 data points of 13C NMR chemical shielding constants of molecules in the QM9 dataset,82 with the SchNet GNN-TL descriptor as input to a feed-forward NN for regression. The predictive accuracy of the SchNet/NN was a root mean-squared error (RMSE) of 12.8 ppm. For a fair comparison with their model, we applied KRR using pre-trained MEGNet, M3GNet, and MACE GNN-TL descriptors, setting our training data size to 400 data points of 13C NMR chemical shielding constants. To account for the influence of random sampling, we created 10 different training sets, each comprising 400 data points, and quantified the effect of potential data bias by calculating the mean RMSE and standard deviation (STD) for each model. Detailed verification, including kernel-function dependencies, can be found in the Appendix. Table 2 summarizes the results of this comparative study, reporting KRR with the Gaussian kernel, which showed superior accuracy to the Laplacian kernel.
In contrast to the SchNet/NN model's RMSE of 12.8 ppm, the MEGNet/KRR model shows significantly lower predictive accuracy with an RMSE of 20.08 ± 0.55 ppm, suggesting that the MEGNet descriptor is less effective for 13C NMR chemical shielding data. The M3GNet/KRR model demonstrates a substantial improvement with an RMSE of 10.02 ± 0.37 ppm. Models using MACE descriptors show even greater accuracy: the MACE-MP-0-small/KRR and MACE-MP-0-large/KRR models achieve RMSEs of 9.77 ± 0.34 ppm and 9.74 ± 0.27 ppm, respectively. The best performance is observed with the MACE-OFF23-small/KRR model, which has an RMSE of 8.05 ± 0.19 ppm, with the MACE-OFF23-large/KRR model close behind at 8.15 ± 0.42 ppm. These results highlight the superior performance of the MACE descriptors, particularly MACE-OFF23-small, in enhancing the accuracy of KRR models for predicting 13C NMR chemical shielding. A more detailed discussion of the nuances of these architectural differences is presented in Section 4.1.
The accuracy of KRR models incorporating the M3GNet GNN-TL descriptor with a Laplacian kernel was evaluated on the test set for each of the five nuclei. Table 3 lists the statistical performance metrics for predicting NMR chemical shifts. Across all nuclei, the MAE for the test set remained below 5 ppm. The MAEs for 1H and 19F were notably low at 0.18 ppm and 2.65 ppm, respectively, indicating a high degree of prediction accuracy for these nuclei in unseen molecular environments. The MAE for 17O, although higher at 4.95 ppm, still reflects a reasonable predictive capability given the complexity of oxygen chemical shifts. The STD and interquartile range (IQR) values in Table 3 describe the distribution of chemical shifts within the training data rather than the accuracy of the model itself. Thus, the higher STD and IQR values for 17O do not indicate a lack of model precision but rather the natural variability inherent in the 17O chemical shifts within the training data. The MAE/STD ratio can still offer insights into model performance relative to data variability. For example, the relatively low ratio for 17O (2.21%) suggests that the model predictions are consistent with the diversity of the training data. On the other hand, the higher ratios for 1H (9.09%) and 19F (7.78%) indicate that the accuracy of these models is not as high as desired, particularly when considering the range of chemical shifts represented in the training dataset. The maximum absolute error (MaxAE) for all nuclei is comparable to the STD of the training data. This is attributed to random sampling and is expected to improve with more sophisticated data-point sampling techniques, such as active learning.
| | 1H | 13C | 15N | 17O | 19F |
|---|---|---|---|---|---|
| MAE (ppm) | 0.18 | 2.28 | 3.42 | 4.95 | 2.65 |
| MaxAE (ppm) | 7.50 | 68.58 | 71.62 | 279.84 | 39.31 |
| STD (ppm) | 1.98 | 51.96 | 119.58 | 224.40 | 34.07 |
| IQR (ppm) | 2.34 | 59.93 | 211.19 | 354.25 | 36.77 |
| MAE/STD (%) | 9.09 | 4.38 | 2.86 | 2.21 | 7.78 |
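The metrics in Table 3 can be reproduced from predictions and training targets as follows (note that STD and IQR are computed on the training data, not on the errors):

```python
# Computing the Table 3 metrics: MAE and MaxAE from test errors,
# STD and IQR from the spread of the training targets.
import numpy as np

def table3_metrics(y_true, y_pred, y_train):
    err = np.abs(np.asarray(y_pred) - np.asarray(y_true))
    std = np.std(y_train)
    q75, q25 = np.percentile(y_train, [75, 25])
    return {"MAE (ppm)": err.mean(), "MaxAE (ppm)": err.max(),
            "STD (ppm)": std, "IQR (ppm)": q75 - q25,
            "MAE/STD (%)": 100 * err.mean() / std}
```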
Subsequently, these models were employed to predict the NMR chemical shifts of a single molecule, C5H5N2OF, containing five elements, which was not included in the training data. The results are shown in Fig. 3. The MAE for each nucleus was 0.08 ppm for 1H, 1.03 ppm for 13C, 6.45 ppm for 15N, 2.86 ppm for 17O, and 6.73 ppm for 19F. The remarkably low MAEs for 1H and 13C underscore the high accuracy of our model for these nuclei, with predictions that closely mirror the calculated values. As the MAE values indicate, the model also performed well for the more challenging 15N and 17O nuclei, whose chemical shifts can be significantly affected by subtle changes in molecular structure and environment. The 19F nucleus, while having a higher MAE, showed good agreement with the DFT/GIAO calculations, suggesting that the model predictions are robust even for nuclei with typically wider chemical shift ranges. These results demonstrate the strong predictive power of the model and its potential as a reliable tool for accurately predicting NMR chemical shifts across a variety of nuclei, even in molecules beyond the scope of the training data.
We then expanded our assessment to evaluate the predictive ability of our model for molecules larger than those in the QM9NMR dataset. To this end, we incorporated the test sets provided in ref. 29, which comprise 40 drug molecules from the GDB17 universe and a further 12 drugs with 17 or more heavy atoms. See ref. 29 for the structures of these molecules.
Table 4 presents the benchmark results for each test set using our M3GNet GNN-TL and MACE-OFF23-small GNN-TL descriptors. For comparison, we used the FCHL descriptor results from Gupta's study.29 To ensure a fair comparison, we employed our GNN-TL descriptor models trained on 100K 13C chemical shielding constants. For both models, increased molecular size in the test set correlated with a deterioration in the MAE. Although our M3GNet GNN-TL descriptor did not match the 1.88 ppm achieved by the FCHL descriptor for the QM9 50K test set, it exhibited an MAE approximately 0.3 ppm lower for the 40-molecule GDB17 test set. The MACE-OFF23-small GNN-TL descriptor performed even better, with an MAE of 1.87 ppm for the QM9 50K test set, closely matching the FCHL descriptor, and an MAE of 2.83 ppm for the 40-molecule GDB17 set, significantly outperforming it. For the set of 12 drugs with 17 or more heavy atoms, the M3GNet descriptor showed an MAE of 4.21 ppm, comparable to the FCHL descriptor and indicating that the M3GNet GNN-TL descriptor is less affected by increasing molecular size, while the MACE-OFF23-small descriptor again significantly outperformed FCHL with an MAE of 3.85 ppm.
For a detailed comparison, Fig. 6 illustrates the molecule-specific MAE values for both drug test sets. The molecular structures are provided in ref. 29. Our M3GNet and MACE-OFF23-small GNN-TL descriptor-based prediction models kept the highest MAE values for individual molecules across both test sets below 10 ppm. Intriguingly, the desflurane molecule, which posed the greatest challenge, showed MAE values of 53.3 ppm, 9.35 ppm, and 8.31 ppm for the FCHL, M3GNet, and MACE-OFF23-small GNN-TL descriptor models, respectively. This corresponds to an approximately 80% reduction in the MAE with our descriptors, likely attributable to differences in the spatial domain encompassed by each descriptor.
The cutoff radius for the FCHL descriptor was determined through a grid search29 and settled at 4.0 Å. In this scenario, the two fluorine atoms in the terminal trifluoromethyl group (CF3) of the desflurane molecule, which lie beyond 4 Å from the CF2H carbon, were neglected. In contrast, our M3GNet descriptor uses a 6 Å cutoff radius for the initial graph construction and a 5 Å cutoff for three-body interactions during graph convolution, capturing the entire CF3 group. This suggests that the descriptor adequately accounts for the influence of the terminal trifluoromethyl group. Additionally, the intrinsic ability of GNN-TL descriptors to account for environments beyond their cutoff radius, owing to graph convolution, may have contributed to the substantial improvement in MAE. Notably, the MACE-OFF23-small model, with a 4.5 Å cutoff, achieves the highest accuracy even though it does not directly capture the fluorine atom at a distance of 4.65 Å in the CF3 group. In summary, the proposed M3GNet and MACE GNN-TL descriptors can predict 13C NMR chemical shifts for molecules outside the training dataset with an accuracy comparable to that of the state-of-the-art FCHL descriptor.
Lastly, to explore further practical applications of the constructed models, we validated NMR chemical shielding constants predicted from semi-empirical PM7-level geometries against values obtained using the DFT/GIAO-level structures of the training data. This validation was performed on the QM9 50K holdout set and the two drug-molecule test sets provided by ref. 29, using the M3GNet/KRR model for 13C. The MAE values for each molecule in the drug datasets can be found in Fig. 6b and d. For the QM9 50K holdout set, the MAE was 3.61 ppm, a significant deterioration of 1.33 ppm compared with DFT-level input geometries. Conversely, predictions for the 40-drug and 12-drug test sets showed only minor deteriorations of 0.23 ppm and 0.04 ppm, respectively. These results suggest that even with more readily available PM7-level geometries as inputs, the model remains robustly transferable for extrapolative predictions on molecules larger than those in the training data.
The pre-trained M3GNet model was trained on 187 687 ionic steps spanning 62 783 compounds, including 187 687 energies, 16 875 138 force components, and 1 689 183 stress components. This diverse dataset covers 89 elements of the periodic table. The model is not limited to learning only the energies associated with these elements but extends to atomic-level forces. Moreover, M3GNet training includes not only stable structures but also the trajectories of structure optimization. The ingestion of vast amounts of data from crystalline systems may have endowed M3GNet with enhanced expressive power, potentially making it adept at interpolating molecular systems. The pre-trained MACE-MP0 model was trained on roughly ten times more energy data for crystalline systems, potentially contributing to the improved accuracy of the 13C NMR chemical shift predictions shown in Table 2. On the other hand, the MACE-OFF23 model, which is specialized for molecules containing 10 elemental species, was trained on a dataset comprising about 1M energy data points, with structures containing up to 150 atoms. This extensive molecular training set may make it more suitable for predicting molecular NMR chemical shifts. Thus, the training data for IAPs, much like their architectures, could be a crucial factor in determining the performance of the descriptors.
Comparative evaluations with other renowned descriptors, such as SOAP, suggest that the GNN-TL descriptor can match, if not surpass, the performance of its contemporaries while maintaining a far more compact representation. This is especially important for large, chemically diverse datasets, where the dimensionality of conventional descriptors grows rapidly with the number of element types.
Architectural choice plays a pivotal role in the performance of GNN-TL descriptors. Moreover, the diversity and vastness of the training dataset, which encompasses myriad elemental types and structural configurations, augment the robustness and versatility of the GNN.
Our proposed approach has strong potential for creating a unified framework capable of predicting various atomic and molecular properties simultaneously, with significant implications for accelerating materials and molecular research. Such combined predictions could enable more comprehensive understanding and faster innovation in fields such as catalysis, drug discovery, and materials design.
The union of transfer learning with pretrained GNNs not only augments prediction accuracy but also drastically reduces learning costs, presenting a cost-effective and efficient alternative to more computationally intensive methods. As we move toward an era in which data-driven insights and models govern the pace of innovation, our research offers a promising pathway for future endeavors in the domain of chemical property predictions with both classical and quantum computers.
Note added – as we were finalizing this manuscript, we became aware of recent articles86,110,113 that also utilize intermediate information from graph neural network potentials. In Section 3.2, we added a direct comparison between our results and theirs. Elijošius et al. applied the pre-trained MACE descriptor to generative modeling of molecules.86
| GNN-TL descriptor | Gaussian kernel RMSE (ppm) | Laplacian kernel RMSE (ppm) |
|---|---|---|
| MEGNet | 20.08 ± 0.55 | 21.12 ± 0.56 |
| M3GNet | 10.02 ± 0.37 | 10.31 ± 0.38 |
| MACE-MP-0-small | 9.77 ± 0.34 | 10.78 ± 0.31 |
| MACE-MP-0-large | 9.74 ± 0.27 | 10.17 ± 0.30 |
| MACE-OFF23-small | 8.05 ± 0.19 | 8.64 ± 0.13 |
| MACE-OFF23-large | 8.15 ± 0.42 | 8.77 ± 0.21 |
Next, Table 6 shows the accuracy of KRR models using M3GNet and MACE-OFF23-small GNN-TL descriptors trained on a 100K 13C training set. Unlike models trained on the 400 13C training set, the KRR models with M3GNet GNN-TL descriptors consistently showed higher accuracy with the Laplacian kernel compared to the Gaussian kernel. Conversely, the results for MACE-OFF23-small GNN-TL descriptors were similar to those for models trained on the 400 13C training set, with the Gaussian kernel models demonstrating higher accuracy. This suggests that the appropriate kernel function may vary depending on the size of the training data.
| Test set (MAE, ppm) | M3GNet (Gaussian) | M3GNet (Laplacian) | MACE-OFF23-small (Gaussian) | MACE-OFF23-small (Laplacian) |
|---|---|---|---|---|
| 50K QM9 | 2.35 | 2.28 | 1.87 | 2.10 |
| 40 drugs | 3.98 | 3.46 | 2.83 | 3.21 |
| 12 drugs | 5.14 | 4.21 | 3.85 | 3.93 |
Finally, these results informed the choice of kernel functions for the KRR models presented in the Results section. For models trained on 400 13C data points, all KRR models using GNN-TL descriptors employed the Gaussian kernel. In contrast, for models trained on 100K 13C data points, the Laplacian kernel was used for KRR models with M3GNet GNN-TL descriptors, whereas the Gaussian kernel was employed for models with MACE-OFF23-small GNN-TL descriptors.
The accuracy of the GNN-TL descriptors was also validated using the molecular structures of two drug molecule data sets reported in ref. 29. The predicted 13C NMR shielding constants for each drug molecule using the M3GNet and MACE-OFF23 GNN-TL/KRR models are shown in Fig. 6a and c. These predictions are accompanied by the values predicted by the FCHL/KRR model.29 The prediction results of the M3GNet/KRR model using PM7-level optimized geometries, along with the prediction results using DFT-level geometries, are shown in Fig. 6b and d.
Fig. 6 Comparison of 13C NMR shielding constant predictions using different descriptors for (a) 40 drug molecules from the GDB17 universe and (c) 12 drugs with 17 or more heavy atoms. The predictions were made using the KRR model with the FCHL descriptor (red), the M3GNet GNN-TL descriptor (blue), and the MACE-OFF23-small GNN-TL descriptor (green). The FCHL results were taken from ref. 29. The corresponding results for the M3GNet/KRR model using DFT-level and PM7-level geometries are shown in (b) and (d).