Calvin Yiu a, Ben Honoré a, Will Gerrard a, Jose Napolitano-Farina b, Dave Russell b, Iuni Margaret Laura Trist c, Ruth Dooley c and Craig P. Butts *a
aSchool of Chemistry, University of Bristol, UK. E-mail: Craig.Butts@Bristol.ac.uk
bGenentech Inc., USA
cEvotec (UK) Ltd, Milton Park, Abingdon, UK
First published on 31st March 2025
Predicting 3D-aware Nuclear Magnetic Resonance (NMR) properties is critical for determining the 3D structure and dynamics, both stereochemical and conformational, of molecules in solution. Existing tools for such predictions are limited, being either relatively slow quantum chemical methods such as Density Functional Theory (DFT), or niche parameterised empirical or machine learning methods that only predict a single parameter type, often across only a limited chemical space. We present here IMPRESSION-Generation 2 (G2), a transformer-based neural network which can be used as a much faster alternative to high-level DFT calculations in computational workflows, predicting multiple classes of NMR parameter simultaneously with time-savings of several orders of magnitude. IMPRESSION-G2 is the first system that simultaneously predicts all NMR chemical shifts, as well as scalar couplings for 1H, 13C, 15N and 19F nuclei up to 4 bonds apart, in a single prediction event starting from a 3D molecular structure. Predictions take <50 ms for, on average, ∼5000 chemical shifts and scalar couplings per molecule, which is approximately 10⁶ times faster than DFT-based NMR predictions starting from a 3D structure. When combined with fast GFN2-xTB geometry optimisations to generate the 3D input structures themselves in just a few seconds, a complete workflow for NMR predictions on a new molecule is 10³–10⁴ times faster than a wholly DFT-based workflow.
The accuracy of this multi-parameter predictor in reproducing DFT-quality results for a wide chemical space of organic molecules up to ∼1000 g mol−1 containing C, H, N, O, F, Si, P, S, Cl, Br exceeds that of existing state-of-the-art empirical or machine learning systems (∼0.07 ppm for 1H chemical shifts, ∼0.8 ppm for 13C chemical shifts, <0.15 Hz for 3JHH scalar coupling constants) and, critically, it also demonstrates generalisability when tested against molecules from sources that are completely independent of its own training data. When compared to experimental NMR data for ∼5000 compounds, IMPRESSION-G2 gives results in minutes on a standard laptop which are almost indistinguishable from DFT results that took days on a large scale High Performance Computing system. This accuracy and speed of IMPRESSION-G2 coupled to GFN-xTB shows that it can be used to simply replace DFT for predicting 3D-aware NMR parameters inside the wide chemical space of its training data.
Traditional rapid empirical methods for predicting NMR chemical shifts are limited mostly to 2-dimensional structures and cannot readily deal with 3-dimensional conformational or stereochemical analysis. For example, the additivity rules of Pretsch5 and HOSE codes6 are inherently ‘flat’, with some modifications to account for 3-dimensionality, e.g. flat-but-stereochemically-aware HOSE codes7 or conformational ensemble models for experimental systems.8–10 The most accurate tools for fully 3D-aware NMR predictions are quantum chemical calculations, typically based on Density Functional Theory (DFT).11–14 The best DFT methods can reproduce experiment to within 1–2% of the appropriate range of parameter values, i.e. 0.2–0.3/2–4 ppm15,16 on ranges of ∼10/∼200 ppm for 1H and 13C chemical shifts respectively, across a very wide range of chemical structure space. However, accurate DFT is very slow, especially when calculating for multiple molecules and/or conformers: full workflows typically take hours to days of CPU time for NMR predictions for each 3D geometry of a molecule of moderate size (say 30–40 non-H atoms). Naturally, if multiple conformers or isomers must be considered then the computation time can grow to days or months for a single study, which rapidly becomes impractical.
Scalar coupling constants are more directly linked to 3-dimensional structure than chemical shifts, through their strong dependence on the dihedral angles of the intervening bonds between the coupled nuclei. Generic Karplus-style empirical relationships, such as that from Haasnoot et al.,17 provide a partial solution for specific coupling types, e.g. 3-bond 1H–1H and 1H–13C, but they lose accuracy for even moderately complex structures, for example where heteroatoms introduce stereoelectronic effects. While bespoke versions of these can be optimised to deal with specific sub-structures, as is often done for carbohydrates,18 they are conversely not generalisable to molecules outside that specific chemical space. Finally, many NMR parameters which have great potential value in molecular structure elucidation, for example 15N chemical shifts and 1-bond 1H–13C scalar coupling constants, 1JCH, are much more rarely used in quantitative comparison simply because there are no reliable, fast and accurate predictive methods for them. But what if there were?
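To make the dihedral-angle dependence concrete, the sketch below evaluates the classic three-term Karplus form, J(φ) = A cos²φ + B cos φ + C. The function name and the coefficient values are illustrative assumptions only; they are not the parameters of Haasnoot et al. or of any fitted model, and the generalised equation of ref. 17 contains additional substituent terms that are not shown here.

```python
import math


def karplus_coupling(dihedral_deg, a, b, c):
    """Three-term Karplus relation: J = a*cos^2(phi) + b*cos(phi) + c (in Hz)."""
    phi = math.radians(dihedral_deg)
    return a * math.cos(phi) ** 2 + b * math.cos(phi) + c


# Placeholder coefficients in Hz, chosen only for illustration; they are
# NOT the parameters of Haasnoot et al. or of any fitted model.
A, B, C = 8.0, -1.0, 1.5

# Print the predicted vicinal coupling at a few representative dihedral angles.
for angle in (0, 60, 90, 180):
    coupling = karplus_coupling(angle, A, B, C)
    print(f"phi = {angle:3d} deg -> 3J = {coupling:.2f} Hz")
```

A single set of (A, B, C) describes only one coupling type in one chemical environment, which is why such relationships need re-parameterising for each sub-structure class.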
Machine learning systems, trained on DFT-computed NMR parameters for 3D molecular structures, offer a solution to all of these issues. They are much faster to run than DFT NMR predictions, executing in seconds rather than hours or days. Machines for 2D-based (no conformation or stereochemistry) predictions of 1H and 13C chemical shifts exist and are typically trained on many thousands of experimental chemical shifts from the literature.19–22 These experimental chemical shift datasets are of variable quality due to limitations in measurement accuracy and errors of reporting by researchers. Training such machines for prediction of scalar couplings, however, is generally not even possible because large, accurate and validated experimental databases simply do not exist with the associated 3D molecular structures that are critical to scalar coupling constants (e.g. 3JHH/CH values). By contrast, large datasets of both DFT-computed chemical shifts and scalar couplings can be generated accurately and fully validated, with a direct match of those parameters to a single 3D structure. Also, datasets of DFT-generated structures can readily be made more diverse, as they are not limited only to chemical structures similar to previously experimentally studied molecules. The only limitations are then how accurately the machine reproduces the DFT result and how accurate the DFT method is in reproducing experiment. Paruzzo et al. first reported this approach for machine learning-based prediction of DFT-like solid-state NMR chemical shifts with ShiftML, based on a kernel-ridge regression approach.23,24 Soon after, we demonstrated a similar architecture for solution-state NMR predictions with the first generation of our IMPRESSION model,25 which could generate predictions comparable to DFT with mean absolute deviations of 0.23 ppm (δ1H) and 2.45 ppm (δ13C) for chemical shifts, as well as predicting 1JCH (MAD = 0.87 Hz).
IMPRESSION was trained on 882 chemical structures, covering the same relatively limited chemical space as ShiftML (C, H, N, O, F only), and was limited in training dataset size by the kernel ridge regression architecture and the resulting memory demands of its molecular representation. Guan et al.26 later reported CASCADE, two separate message passing neural networks that provide 1H or 13C chemical shift predictions respectively. Both CASCADE machines were trained on ∼8000 DFT-derived molecular structures (DFT8K) and provided accuracies approaching 0.10 ppm (δ1H) and 1.26 ppm (δ13C) against an internal hold-out of structures from that same training data, with testing outcomes against external datasets not reported.
Herein we introduce our second generation system with a transformer-based neural network architecture, IMPRESSION-G2. This simultaneously predicts all defined types of scalar coupling constants and chemical shifts with DFT-like accuracy but much higher computational efficiency. Its performance is assessed against both computed and experimental external test sets to ensure generalisability, and we demonstrate that it can effectively replace DFT in such workflows, while providing orders of magnitude in time-savings.
Fig. 1 Workflow for training IMPRESSION-Generation 2 including DFT methodology and training and testing dataset sources. Full details can be found in the ESI.†
NMR parameters for each training molecule were predicted with DFT from a single 3D structure, using mPW1PW91/6-311g(d,p) for geometry optimisation, and ωB97XD/6-311g(d,p) for NMR predictions,33–37 providing 739 913 chemical shift environments (330 411 δ1H; 306 458 δ13C) and 5 696 784 scalar coupling constants (including 307 270 1JCH; 486 884 2JCH; 672 433 3JCH; 705 737 4JCH; 134 051 2JHH; 217 940 3JHH; 333 010 4JHH) with the latter divided into their labelled sets, nJXY, depending on the number of bonds (n) between the coupled pairs of nuclei (X and Y). Details of the DFT workflows and neural network architecture are available in the ESI.†
Entry | Predictor | Training dataset | Testing dataset | 3D geometry | δ1H/ppm | δ13C/ppm | 3JHH/Hz | 2JHH/Hz | 3JCH/Hz | 2JCH/Hz | 1JCH/Hz |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | IMPRESSION-G2 | IG2 | Internal | DFT | 0.07 | 0.76 | 0.12 | 0.13 | 0.15 | 0.15 | 0.36 |
2 | IMPRESSION-G2 | IG2 | CSD-500 | DFT | 0.09 | 0.97 | 0.14 | 0.14 | 0.19 | 0.18 | 0.44 |
3 | IMPRESSION-G2 | IG2 | DFT8Ka | DFT | 0.09 | 1.27 | 0.14 | 0.17 | 0.20 | 0.20 | 0.54 |
4 | IMPRESSION-G2 | IG2 | CSD-500 | GFN2-xTB | 0.13 | 1.18 | 0.31 | 0.36 | 0.29 | 0.25 | 0.68 |
5 | IMPRESSION-G2 | IG2 | DFT8Ka | GFN2-xTB | 0.13 | 1.46 | 0.31 | 0.33 | 0.32 | 0.27 | 0.74 |
6 | IMPRESSION25 | IG1 | CSD-500 | DFT | 0.23 | 2.45 | — | — | — | — | 0.87 |
7 | CASCADE26 | DFT8Kb | Internal | DFT | 0.10 | 1.26 | — | — | — | — | — |

a Test result against all molecules in DFT8K, recalculated using the same DFT method (ωB97XD/6-311g(d,p)) used for IMPRESSION-G2. b Testing result reported by Guan et al.,26 with both training and testing sets calculated using the mPW1PW91/6-311+G(d,p) DFT method.
Beyond simple time-saving, the 1H and 13C chemical shift performance of IMPRESSION-G2 against DFT on an internal holdout improves on the current gold standard CASCADE predictor performance (Table 1, entry 7; MAD = 0.10 ppm 1H, 1.26 ppm 13C). We note that this improvement in performance exceeds what is expected solely on the basis of the slightly (∼2×) larger training dataset used for IMPRESSION-G2, as a 10-fold increase in training size is generally required to deliver a 2-fold improvement in accuracy. This suggests that the transformer architecture of IMPRESSION-G2, with attention passed between NMR parameters, also offers some benefits to accuracy during training.
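The scaling argument above can be checked with a short calculation. The sketch below assumes a simple power law in which the error falls by a fixed factor for every 10-fold increase in training set size; the function name and the power-law form are our illustrative assumptions, and the comparison is only indicative because entries 1 and 7 of Table 1 use different DFT methods and different hold-out sets.

```python
import math


def expected_error_ratio(size_ratio, gain_per_tenfold=2.0):
    """Expected error ratio (new/old) when the training set grows by size_ratio.

    Assumes error ~ N**(-k), with k fixed so that a 10-fold larger training
    set improves accuracy by gain_per_tenfold (k = log10(gain_per_tenfold)).
    """
    exponent = math.log10(gain_per_tenfold)
    return size_ratio ** (-exponent)


# A ~2x larger training set, as quoted in the text.
ratio = expected_error_ratio(2.0)

# CASCADE internal hold-out MADs from Table 1, entry 7 (ppm).
cascade_mad = {"1H": 0.10, "13C": 1.26}
# IMPRESSION-G2 internal hold-out MADs from Table 1, entry 1 (ppm).
g2_mad = {"1H": 0.07, "13C": 0.76}

for nucleus, mad in cascade_mad.items():
    expected = mad * ratio
    print(f"{nucleus}: expected from data size alone ~{expected:.3f} ppm, "
          f"observed {g2_mad[nucleus]:.2f} ppm")
```

Under this assumption a doubling of the training data accounts for only about a 20% reduction in error, so the observed MADs sit below what data volume alone would suggest.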
The key test of any machine learning system is how it performs in generalisation tasks, i.e. predictions on external sets of molecules that are entirely independent of those on which it was trained. Here we first compared against the relatively forgiving CSD-500 testing set used for the original IMPRESSION report (410 chemical structures containing C, H, N, O reported by Paruzzo et al. for ShiftML,23 comprising 8475 δ1H and 7523 δ13C environments). IMPRESSION-G2 again provided excellent performance (MAD = 0.09 ppm 1H, 0.97 ppm 13C; Table 1, entry 2) that is ∼2.5-times better than the original IMPRESSION25 using the same test (Table 1, entry 6). We also tested IMPRESSION-G2 against CASCADE's more challenging DFT8K dataset of molecules,26 which contains a greater diversity of elements than CSD-500 and is sourced from a database (NMRShiftDB) that is entirely independent of those used to curate our training set. Excellent performance was again observed (MAD = 0.09 ppm 1H, 1.27 ppm 13C; Table 1, entry 3), suggesting the IMPRESSION-G2 model is indeed generalisable across a wide chemical space for 1H and 13C chemical shift prediction in molecules containing C, H, N, O, F, Si, P, S, Cl, Br, including those from independent data sources.
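For reference, the mean absolute deviation (MAD) quoted throughout is the average unsigned difference between each predicted parameter and its reference value. The sketch below is a minimal implementation; the function name and the three example shifts are hypothetical and are not taken from any of the test sets.

```python
def mean_absolute_deviation(predicted, reference):
    """Average of |predicted - reference| over paired values (same units)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one value is required")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


# Hypothetical 1H chemical shifts (ppm), for illustration only.
model_shifts = [7.30, 2.10, 1.25]
dft_shifts = [7.26, 2.17, 1.25]

print(f"MAD = {mean_absolute_deviation(model_shifts, dft_shifts):.3f} ppm")
```

Because the MAD is an average over all environments, a low value does not exclude occasional large errors for individual nuclei.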
Chemical shifts for 15N and 19F are also predicted simultaneously by IMPRESSION-G2, and with good accuracy (MAD = 2.26 ppm 15N, 2.60 ppm 19F). These accuracies of <3 ppm are comparable to the best reported DFT methods for 19F,38,39 and a substantial improvement for 15N over the original kernel ridge-based IMPRESSION model (MAD = 6.20 ppm).40
It is important to note that while predicting dozens of chemical shifts per molecule, IMPRESSION-G2 is simultaneously predicting hundreds to thousands of coupling constants for each molecule, providing extremely efficient computation. The accuracy of these scalar coupling constant predictions is also very high (Table 1). Multiple-bond 1H–1H and 1H–13C coupling constants were predicted with high accuracy against the internal hold-out (MAD ≤ 0.15 Hz; Table 1, entry 1) and both the CSD-500 and DFT8K external testing sets (MAD ≤ 0.20 Hz; Table 1, entries 2 and 3). Similarly, one-bond 1H–13C scalar coupling predictions (MAD ∼ 0.5 Hz; Table 1, entries 2 and 3) were nearly twice as accurate as those from the original IMPRESSION system (Table 1, entry 6).40 IMPRESSION-G2 thus represents a step-change in NMR parameter prediction: DFT is currently the only generalisable tool for predicting scalar coupling constants across 1–4 bonds, and IMPRESSION-G2 is the first system capable of reproducing such DFT calculations, but much more rapidly than DFT can achieve.
It should be noted that the main time-limiting feature for IMPRESSION-G2 workflows is how long it takes to generate the 3D structures prior to NMR prediction. The accuracies described above were achieved by starting from DFT-based 3D molecular geometries, i.e. the overall workflow to achieve this accuracy still required a slow (minutes to hours) DFT geometry optimisation prior to running IMPRESSION-G2. Gratifyingly, IMPRESSION-G2 predictions are still accurate when executed on 3D molecular structures derived from much more rapid GFN2-xTB optimisations.41 This was tested against both CSD-500 and DFT8K (Table 1, entries 4 and 5 compared to entries 2 and 3) and these calculations took only a few seconds per molecule to deliver the combined 3D geometry optimisation and IMPRESSION-G2 NMR prediction, i.e. ∼10⁴ times faster than a full DFT workflow.
Fig. 2 Table of mean absolute deviations and overlay of error distributions for DFT ωB97XD/6-311g(d,p) and IMPRESSION-Generation 2 vs. Exp5K, as well as DFT ωB97XD/6-311g(d,p) vs. IMPRESSION.
The overlay of error distributions in Fig. 2 for the IMPRESSION-G2 and DFT methods further demonstrates the comparability of these two approaches across the ∼5k molecules. However, the DFT predictions (DFT geometry optimisations and NMR predictions) took days even on a large parallelised high-performance computing cluster, while IMPRESSION-G2 (GFN2-xTB geometry optimisations and NMR predictions) completed the whole dataset in <10 minutes on a standard laptop. While each individual IMPRESSION-G2 prediction differs from what DFT would predict, the ensemble of predictions from IMPRESSION (green) is nearly indistinguishable from DFT (blue). This strongly supports our conclusion that IMPRESSION-G2 can be used as a drop-in replacement for DFT when predicting NMR parameters for any molecules inside the chemical space (C, H, N, O, F, Si, P, S, Cl, Br) for which it has been trained.
Finally, we explored what improvements IMPRESSION-G2 provides in the critical application of molecular 3D-structure determination. This is demonstrated here through the enhanced 3D-structure discrimination offered by predicted coupling constants with IMPRESSION-G2, as opposed to just chemical shifts, in the diastereomeric determination of strychnine. Strychnine is a well-studied example, has a nearly perfectly ‘rigid’ structure,42 and has rigorously validated and tested NMR assignments of both chemical shifts and couplings across multiple literature reports. Consequently, it allows us to test IMPRESSION-G2 while avoiding complications arising from errors in conformer population averaging and the misassignments of experimental NMR data that abound in literature reports.
The MADs between IMPRESSION-G2 predictions and experimental values for 1H- and 13C-based NMR parameters for the diastereomers of strychnine, 1–7, that were found here to be stable by computation are shown in the table in Fig. 3. In every case the correct diastereomer, 1, has the best fit. However, if one only considers chemical shifts then there is a lack of certainty, with other plausible fits also having low average deviations close to the performance limits of IMPRESSION-G2 (highlighted in green where MAD <0.2 ppm for 1H and <3 ppm for 13C). Using predicted coupling constants provides much more effective discrimination of diastereomers, with both 1H–1H and 1H–13C scalar couplings suggesting only diastereomer 1 as a plausible solution. An alternative analysis using the more discriminating χ2-reduced statistic reinforces this finding. χ2-Reduced should provide values close to 1 for good fits and ideally values >2 for incorrect structures (see ESI† for details). Here, the χ2-reduced achieved with the combined 1H–1H and 1H–13C coupling constants offers a very clear fit for 1 (χ2-reduced = 1.01) with the next best option being 7, which can be definitively excluded based on a six-fold higher χ2-reduced of 5.89. By contrast, the differentiation when considering only chemical shifts is much weaker, with less than 2-fold discrimination between diastereomers 1 and 6 (1.52 and 2.93). Unsurprisingly, the combination of J and δ together also provides a clear discrimination between the correct structure, 1, and all other options and this is clearly illustrated in Fig. 3, top.
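A χ2-reduced value of this kind can be computed as sketched below. This is a generic textbook form, assuming a single expected error σ per parameter type and no fitted parameters; the exact definition and σ values used for Fig. 3 are given in the ESI and may differ. The function name and example numbers are hypothetical.

```python
def reduced_chi_squared(predicted, experimental, sigma, n_fitted=0):
    """Sum of squared, sigma-normalised residuals divided by degrees of freedom.

    sigma is the expected prediction error for this parameter type (an
    assumption here); n_fitted is the number of fitted parameters, if any.
    """
    if len(predicted) != len(experimental):
        raise ValueError("predicted and experimental must have the same length")
    dof = len(predicted) - n_fitted
    if dof <= 0:
        raise ValueError("need more observations than fitted parameters")
    chi2 = sum(((p - e) / sigma) ** 2 for p, e in zip(predicted, experimental))
    return chi2 / dof


# Hypothetical coupling constants (Hz) for one candidate structure.
predicted_j = [7.1, 3.4, 10.6, 1.9]
measured_j = [7.4, 3.1, 10.5, 2.3]
print(f"chi2-reduced = {reduced_chi_squared(predicted_j, measured_j, sigma=0.3):.2f}")

# Discrimination ratios from the values quoted in the text.
print(f"couplings, 7 vs 1: {5.89 / 1.01:.1f}-fold")
print(f"shifts, 6 vs 1: {2.93 / 1.52:.1f}-fold")
```

Values near 1 indicate residuals of about the expected size, while values well above 1 indicate that a candidate structure disagrees with experiment by more than the predictor's own error.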
In common with other machine learning systems for NMR prediction, IMPRESSION-G2 achieves its highest accuracy when tested against internal hold-outs from its own training dataset of molecules, but crucially it also provides excellent accuracy for molecules within its chemical space (C, H, N, O, F, Si, P, S, Cl, Br; MR < 1000 g mol−1) that are sourced entirely independently of its training data. IMPRESSION-G2 reproduces experimental data with error distributions that are comparable to those achievable by DFT, and can provide similarly excellent diastereomeric discrimination to DFT, but in seconds rather than hours. Consequently, we believe IMPRESSION-G2 is the first plausible machine learning replacement for DFT for the prediction of 3D-sensitive NMR parameters, with time-savings that make it possible to predict millions of parameters for thousands of structures in minutes.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4sc07858f
This journal is © The Royal Society of Chemistry 2025