Open Access Article
This Open Access Article is licensed under a
Creative Commons Attribution 3.0 Unported Licence

AIQM2: organic reaction simulations beyond DFT

Yuxinxin Chen ac and Pavlo O. Dral *abc
aState Key Laboratory of Physical Chemistry of Solid Surfaces, Department of Chemistry, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Xiamen University, Xiamen 361005, China. E-mail: dral@xmu.edu.cn
bInstitute of Physics, Faculty of Physics, Astronomy, and Informatics, Nicolaus Copernicus University in Toruń, ul. Grudziądzka 5, 87-100 Toruń, Poland
cAitomistic, Shenzhen 518000, China

Received 16th April 2025 , Accepted 7th August 2025

First published on 7th August 2025


Abstract

Density functional theory (DFT) is the workhorse of reaction simulations, but it often suffers from either prohibitive cost or insufficient accuracy. In this work, we report AIQM2—the universal AI-enhanced QM method 2—as the first method that enables fast and accurate large-scale organic reaction simulations for practically relevant system sizes and time scales beyond what is possible with DFT. This breakthrough is based on the high speed of AIQM2, which is orders of magnitude faster than common DFT, while its accuracy in reaction energies, transition state optimizations, and barrier heights is at least at the level of DFT and often approaches the gold-standard coupled cluster accuracy. AIQM2 can be used out of the box without any further retraining. Compared to pure machine learning potentials, AIQM2 possesses high transferability and robustness in simulations without catastrophic breakdowns. We showcase the superiority of AIQM2 compared to traditional DFT by performing an extensive reaction dynamics study overnight and revising the mechanism and product distribution reported in the previous investigation of the bifurcating pericyclic reaction.


Introduction

Reaction simulations provide atomistic-level information on the efficiency and mechanism of chemical transformations, guiding both experimental reaction design and in silico reaction planning. The common tools to simulate reactions are based on the concept of the potential energy surface (PES).1 Critical static points along the reaction pathway on the PES, such as minima (reactants, products, intermediates) and first-order saddle points (transition states, TS), can be used to derive kinetics and thermochemical properties. The key challenge is that to ensure high simulation accuracy, the PES ideally must be evaluated with chemical accuracy (1 kcal mol−1 error in energies), because the reaction rate depends exponentially on the Gibbs energy of activation according to the Eyring equation. Similarly, the product distribution under thermodynamic control strongly depends on the difference in the products' Gibbs free energies.

Minor changes to the shape of the PES, including the locations of critical points, can have large effects on reaction mechanisms and pathways,2e.g., synchronous and step-wise mechanisms in the widely explored Diels–Alder reactions3 and ambimodal TSs in bifurcating reactions.4 To gain deeper insight into the conformational changes and energy flow during chemical transformations, the reaction dynamics must be investigated, where bonds breaking and formation are directly simulated by performing molecular dynamics (MD) trajectory propagations.5 One example is post-TS MD, which provides branching ratios and energy partitioning for reactions with diverging product channels—cases in which traditional Eyring's TS theory fails.6 While performing MD calculations is highly desirable for modeling reactions, an ensemble of reactive trajectories is usually required to obtain statistically robust and converged results, thus requiring computationally efficient methods for obtaining PESs.

Thus, the fast generation of well-shaped PES in the reactive region is required, which is a tall order for traditional quantum mechanical (QM) methods (Fig. 1a), because, generally, less expensive calculations are less accurate. For example, methods that achieve chemical accuracy on barrier heights—e.g., coupled-cluster methods and beyond,7 and double hybrid functionals with large basis sets8—may not be feasible for large systems and TS structure optimizations. The only practical solution widely adopted by the community is the combination schemes. In these schemes, cheaper and less accurate QM methods, such as commonly employed density functional theory (DFT) with relatively small basis sets or even semi-empirical QM methods, are used to obtain the geometries and frequencies of TS and minima. More expensive methods, such as the gold-standard CCSD(T),9 are then used to calculate the single-point energies on these geometries. These combination schemes are, obviously, approximations assuming that the geometry does not change much between the levels of theory, which is not necessarily true.10


image file: d5sc02802g-f1.tif
Fig. 1 Selected types of methods used for simulating organic reactions. (a) Traditional QM methods exemplified by semi-empirical QM, density functional theory (DFT), the gold-standard CCSD(T), and universal (machine learning) interatomic potentials (UIPs)—each with its strengths and weaknesses. (b) AIQM2 proposed in this work eliminates the above weaknesses through careful design. (c) Various applications of AIQM2 in reaction simulations such as transition state (TS) optimization, reactive dynamics, and the accurate and fast calculation of reaction barriers.

The problem of high computational cost is particularly pertinent in reaction dynamics simulations, where one often must sacrifice the number of trajectories and, hence, the quality of the statistical analysis. Indeed, most reported studies propagated too few trajectories, leading to statistically unreliable product ratios,11 although, in these studies, accuracy has already been sacrificed by using relatively cheap DFT methods.

The rise of machine learning (ML) promises to speed up reaction simulations by directly predicting key properties, such as reaction outcome,12 yields,13–16 barrier heights,17–27 rate constants,28,29 as well as TS geometry.30–37 However, ideally, we would like to use the ML models as surrogate models to describe PES at the QM level. This would allow us to obtain similar levels of insight and to use existing, mature techniques for reaction simulations and analysis. Such models are popular ML interatomic potentials (MLIPs), where either an accurate description of the system-specific PES is established or universal interatomic potentials (UIPs) are constructed in the hope of covering the target chemical space.38

The construction of the system-specific MLIPs requires considerable effort because it requires careful selection of the training points on the PES and resource-intensive calculations with high-level reference QM methods.39 After all this work, these potentials cannot easily be reused for similar systems, and additional effort is needed to extend their applicability. A highly desirable alternative is UIPs, which promise to remove the need for repeated retraining while maintaining adequate accuracy for molecules outside the training set.

In principle, UIPs are pre-trained MLIP models, but constructing UIPs is much more challenging than training a system-specific MLIP because it requires an ML model with good transferability capabilities, large and representative datasets covering the chemical space of interest, as well as training and validation strategies ensuring good generalizing ability of the model, etc. Another important consideration in UIPs is the quality of the reference data. Most state-of-the-art (SOTA) UIPs target the DFT level because most of the available data are at the DFT level with relatively small basis sets.40–46 Hence, such potentials are bound not just to fail to achieve chemical accuracy in most cases but also to be inferior even to the reference DFT level they were trained on.

Only a few UIPs have gone beyond the DFT level and targeted the gold-standard coupled cluster level: AIQM1 (ref. 47) and ANI-1ccx48 are the first and only successful examples. AIQM1 is generally more transferable and robust than ANI-1ccx because AIQM1 is based on a Δ-learning49 approach, where MLIP is used only to correct the difference between the target (coupled cluster) and baseline (semi-empirical) QM levels, while ANI-1ccx is a pure MLIP model directly trained on the target coupled-cluster level. One of the major disadvantages of AIQM1 is its limitation to CHNO elements, which can be mitigated by using an extrapolative ONIOM approach,50 where AIQM1 is used for the major core region of the atomistic system, while lower-level methods such as ANI-2x51 and/or GFN2-xTB52 are used for the remaining parts containing other elements.

Both AIQM1 and ANI-1ccx generally achieve chemical accuracy within the scope of the training data, but we found that they have subpar performance in describing reaction barriers.53 Hence, until now, no theoretical method has been reported that can be used for organic reaction simulations—including transition state search, barrier height calculations, and reactive dynamics—with the robustness and transferability of DFT approaches, while being orders of magnitude faster.

Here, we introduce AIQM2, the second-generation general-purpose AI-enhanced quantum mechanical method, which approaches gold-standard coupled-cluster level accuracy at the cost of semi-empirical methods—i.e., beyond those of typical DFT approaches (Fig. 1b). As we show in this work, AIQM2 consistently improves upon its predecessor AIQM1 and the related universal ANI-1ccx method, especially for transition state optimizations and barrier heights. It also outperforms hybrid and double-hybrid DFT methods with a common double-ζ-quality basis set in reaction energies of large systems and some non-covalent interactions. In contrast to DFT, AIQM2 provides uncertainty estimates for its predictions, increasing the trustworthiness of the simulations with this method (Fig. 1b and c).

To showcase the efficiency and robustness of AIQM2, we propagated thousands of high-quality trajectories for a bifurcating pericyclic reaction in parallel on 16 CPUs within one day and obtained the product distribution at nearly coupled-cluster level accuracy. This way we effectively revise the original, much more expensive and less accurate, DFT results. AIQM2 is publicly available in our open-source software MLatom54 at https://github.com/dralgroup/mlatom.

To facilitate progress in ML-assisted computational chemistry, we have included the methods introduced in this work into the library of Universal and Updatable AI-enhanced QM foundational models (UAIQM).55 It is an umbrella platform that collects various models in a coherent library, enabling the automatic choice of the model according to the user's needs and, if required, the improvement of the models. UAIQM hosts both published (e.g., AIQM1 and ANI-1ccx) and unpublished (e.g., AIQM2, when this work was submitted) models so that the latest developments can be made accessible to the public timely.

Results and discussion

AIQM2 design

AIQM2 follows the same principle of Δ-learning49 as in AIQM1 by applying neural network corrections to the baseline semi-empirical QM method for higher accuracy at a lower cost. We have chosen GFN2-xTB52 as the baseline of AIQM2 due to its robustness and broad applicability in reaction exploration.56 In this regard, the change of baseline makes AIQM1 and AIQM2 two distinct methods, both in terms of composition and application preference. An explicit dispersion correction is added, because neither semi-empirical QM nor local neural network models can properly describe the long-range noncovalent interactions on their own.

Specifically, AIQM2 consists of 3 parts (Fig. 1b): the semi-empirical baseline GFN2-xTB* (asterisk indicates the special version of GFN2-xTB with D4 dispersion corrections57 removed), the correction predicted by the ensemble of 8 ANI neural networks,58 and the D4 dispersion correction (for the ωB97X59 functional). Thus, the energy predicted by AIQM2 can be formulated as:

 
EAIQM2 = EGFN2-xTB* + ENN + ED4(ωB97X),(1)
and energy derivative properties are the sum of each term's derivatives. Note that the Hessians provided by GFN2-xTB* and D4 are calculated by numerical differentiation on analytical gradients, while neural networks provide analytical Hessians; the final Hessian is the sum of these three Hessians. Other electronic structure properties, such as partial charges and dipole moments, are inherited from the baseline GFN2-xTB* calculations.

Several modifications were made to the original architecture used in ANI-1x and ANI-1ccx. The activation function was changed from CELU (Continuously Differentiable Exponential Linear Units) to GELU (Gaussian Error Linear Unit), which has been shown to provide better performance for higher-order energy derivatives.60 The angular cutoff in the atomic environment vector (AEV) was increased from 3.5 Å to 4.0 Å for a better description of long-range interactions during training and inference.47

In terms of the training data, the ANI model corrections were trained on the data derived from the ANI-1ccx and ANI-1x datasets.61 They were generated using an extrapolated coupled cluster method termed CCSD(T)*/CBS61 and a popular DFT method ωB97X59/def2-TZVPP,62 respectively. The asterisk in CCSD(T)* refers to the estimated TightPNO accuracy of DLPNO-CCSD(T)63 and the complete basis set extrapolation uses the formula by Halkier64 and Helgaker.65 Details about CCSD(T)*/CBS can be found in ref. 61. GFN2-xTB* data was generated for all the data points in both datasets and subtracted from the reference level to get the Δ-values49 for training. The models were first trained on the differences in energies and forces between ωB97X59/def2-TZVPP62 and GFN2-xTB*. This led to the AIQM2@DFT* approach approximating the ωB97X/def2-TZVPP level. To approximate the ωB97X-D4/def2-TZVPP level with explicit dispersion corrections, we can add the D4 corrections for the ωB97X functional, yielding the AIQM2@DFT method. The final models in AIQM2 were obtained by transfer learning on the energy difference between CCSD(T)*/CBS and GFN2-xTB*.61 The D4 corrections for the ωB97X functional were removed from the reference CCSD(T)*/CBS. Removing the D4 corrections during training was necessary because we added the explicit D4 corrections for the ωB97X functional back to the model when making predictions. In transfer learning, the first and the third layers were fixed as in ANI-1ccx and AIQM1, which effectively reduced the number of parameters for training and prevented overfitting to a smaller high-level dataset.

An advantage of AIQM2 is also that it provides uncertainty estimates of its predictions, in contrast to DFT approaches. The uncertainty of the calculation is defined as the standard deviation of the ANI models in the ensemble as is done for both AIQM1 and ANI-1ccx.66 We can interpret uncertainty as how different a new system, for which we want to make predictions, is from the molecules in the training set of AIQM2. The uncertainties are reported in this work as the error bars of the predictions, i.e., energy estimates from averaging ANI model predictions ± standard deviation of the ANI models. The uncertainty for reaction energies is calculated as the standard deviation of the relative energies predicted by the ANI models. We calibrated the uncertainty on the standard CHNO dataset as done in the literature66 which yielded the threshold of 0.36 kcal mol−1, i.e., the prediction is deemed reliable if the uncertainty is less than 0.36 kcal mol−1. From our previous tests on AIQM1 and ANI-1ccx, we have numerical evidence that highly uncertain predictions overall lead to bigger errors.66

Revision of the bifurcating pericyclic reaction

Here we demonstrate the power of AIQM2 by revising the earlier state-of-the-art simulations67 of a paradigmatic pericyclic reaction, by performing resource-intensive downhill molecular dynamics (MD) simulations (Fig. 2). These simulations involve thousands of quasi-classical trajectories from the region around the ambimodal TS, leading to more than one product.
image file: d5sc02802g-f2.tif
Fig. 2 Energy profile, transition state geometries, and quasi-classical trajectories of ambimodal reaction. (a) Comparison of barrier heights and reaction energies in kcal mol−1 obtained with different methods. Energies are reported on the geometries optimized with the corresponding methods, except for CCSD(T)*/CBS energies evaluated on the B3LYP-D3/6-31G*-optimized geometries. Values after ± are the uncertainties of the relative energies given by AIQM2. (b) Critical bond lengths in angstroms (Å) of transition states optimized by various methods. The Cope rearrangement transition state is not found using GFN2-xTB and AIQM1, thus not presented. The same color scheme is used for each method as (a). (c) Distribution of 1000 quasi-classical trajectories starting from the ambimodal transition state (B3LYP product ratio is taken from ref. 67). B3LYP-D3 – B3LYP-D3/6-31G*, CCSD(T)* – CCSD(T)*/CBS, TS-Ambi – ambimodal TS, TS-Cope – Cope rearrangement TS.

AIQM2 enables extensive explorations: overnight (100 CPU-hours) on commodity hardware, we could propagate 1000 quasi-classical trajectories, each 500 fs long in both forward and backward directions with 0.1 fs time step (in total, 10 million single-point evaluations, see computational details). These simulations on a smaller scale (only 117 reactive trajectories with 1 fs time step) were performed earlier67 with a much costlier DFT method B3LYP-D3/6-31G*, which would take 160 CPU-years for a thousand trajectories with 0.1 fs time step. Even with such extensive computational resources spent on DFT calculations, the number of trajectories might still have been insufficient to draw definite conclusions because of the low precision and potential for missing rare events, as shown in a recent analysis.68 AIQM2 simulations with 1000 trajectories, hence, furnish much higher statistical certainty.

In addition to propagating ten times more trajectories, we use the time step that is ten times smaller than in the original DFT study.67 Our AIQM2 simulations with 0.1 fs lead to energy-conservative dynamics, while we show that the 1 fs time step, employed in the original study, leads to rather large violations in energy conservation (Fig. 3b). The total energy fluctuations in 1 fs time-step trajectories reach up to several kcal mol−1, which might influence the conclusions drawn in the original DFT study. Indeed, our calculations with the same method, AIQM2, show that, e.g., for an exemplary trajectory, product P2 is formed 9 fs later when the trajectory is propagated with 0.1 fs.


image file: d5sc02802g-f3.tif
Fig. 3 Total energies and their standard deviations along quasi-classical trajectories propagated with 1 fs and 0.1 fs using AIQM2. 1 fs was used in the original DFT study67 and is given here only for comparison; all results in this study are obtained with 0.1 fs time-step trajectories. (a) Comparison of the standard deviation of total energies in kcal mol−1 on 1000 trajectories propagated with 1 and 0.1 fs. (b) Comparison of total energy deviations with respect to the mean value on representative trajectories for P1 and P2. Figures on the right side show the snapshots where the products are first observed. Blue represents trajectories with the 1 fs time step, and orange represents those with the 0.1 fs time step. Time 0 fs corresponds to the initial geometry near the transition state.

Most importantly, it is known69 that the quality of the downhill dynamics is strongly influenced by the accuracy of the QM method. Here our AIQM2 approach also has a significant advantage over B3LYP-D3/6-31G* as judged by the analysis of the PES stationary points. The barrier height of ambimodal TS at AIQM2 is almost the same as CCSD(T)*/CBS value. Note that the CCSD(T)*/CBS values are derived from DFT optimized geometries to avoid potential bias; we cannot compare to the geometries at the CCSD(T)* level due to its cost and absence of analytical energy derivatives. TS geometry optimized with AIQM2 qualitatively resembles that at B3LYP-D3/6-31G* with some noticeable differences in bond lengths. In comparison, B3LYP-D3/6-31G* underestimates the barrier by around 3 kcal mol−1. For the minima, the error between AIQM2 and reference CCSD(T)*/CBS is within ca. 2 kcal mol−1, while both B3LYP-D3 (ca. 10 kcal mol−1) and GFN2-xTB have huge errors (more than 25 kcal mol−1) compared to the same reference. Such large errors in B3LYP-D3 used in the previous study67 certainly influence the quality of the downhill dynamics.

Having at our disposal (1) a more accurate method, (2) more trajectories, and (3) better energy conservation, we turn to analyzing the simulation results. AIQM2 results differ from the DFT results in two major aspects. First, the product distribution changed from 1[thin space (1/6-em)]:[thin space (1/6-em)]10 to 1[thin space (1/6-em)]:[thin space (1/6-em)]52, showing that the formation of P1 has a much smaller role than previously thought. Second, the AIQM2 results show that the reaction proceeds via a dynamically stepwise mechanism rather than a concerted mechanism, as previously thought. The average time gap between the formation of the first and second bond is 205 fs for the major P2 product. The DFT study reported a 60 fs time gap. In AIQM2 trajectories leading to P2, the C3–C13 bond formation is delayed, and most of the C3–C13 bond lengths oscillate between 2.5 Å and 3 Å. This might potentially be caused by the PES curvature shaped by the “neighboring” TS corresponding to the Cope rearrangement connecting the competing products P2 and P1 (Fig. 2a and b). Once the first C2–C15 bond is formed, the formed adduct is hot and cannot immediately ‘decide’ which bond to form; eventually, the shorter C3–C13 bond is formed most of the time because it is typically shorter in the initial conditions. The role of the nearby TS is supported by the observation that the Cope rearrangement's barrier height at B3LYP-D3/6-31G* is much smaller than at AIQM2, which might explain the difference in product ratio. We did not observe the transformation of P2 to P1 in AIQM2 trajectories, as the barrier is rather high (27.24 kcal mol−1) compared to the small barrier at DFT (10.23 kcal mol−1).

Summarizing the analysis of the quasi-classical trajectories, we revised the previous state-of-the-art.65 While we leave more detailed explorations to future research, our dynamics study showcased the power of the AIQM2 approach, which should be considered instead of the common DFT methods when performing such simulations.

Several additional remarks are due. The B3LYP-D3/6-31G* calculations might have suffered from substantial basis-set superposition error (BSSE), which might have been rather large in TSs. An advantage of AIQM2 is that it was trained on the CCSD(T)*/CBS-level data, which were generated using a complete basis set (CBS) extrapolation scheme. By construction, CBS does not suffer from BSSE. AIQM2 can, therefore, provide a more accurate description in cases where BSSE becomes an issue for common approaches with the medium-sized basis sets like in B3LYP-D3/6-31G*.

Obviously, among the zoo of DFT functionals, there is always a high chance of finding a better functional than B3LYP/6-31G*. Particularly, M06-2X is now commonly used in the pericyclic reaction simulations because benchmarks showed its better accuracy for such simulations. For this particular reaction, M06-2X/6-31G* is more accurate than B3LYP-D3/6-31G* according to our comparison with CCSD(T)*/CBS (Table 1) and achieves similar accuracy to AIQM2. It does not help that M06-2X is even slower than B3LYP-D3, which would strongly hamper the extensive evaluation of many trajectories with small time steps. AIQM2 also provides error bars, allowing users to at least get an idea of how far the predictions might be from the target CCSD(T)/CBS values, while none of the DFT approaches offer such a handy feature.

Table 1 Comparison of various methods on energy profiles of ambimodal reaction and the estimated CPU time required to propagate 1000 trajectories with the 0.1 fs time step
  CCSD(T)*/CBS//B3LYP-D3/6-31G* AIQM2//AIQM2 B3LYP-D3/6-31G*//B3LYP-D3/6-31G* M06-2X/6-31G*//B3LYP-D3/6-31G* M06-2X/6-31G*//M06-2X/6-31G*
Barrier (TS-Ambi) 19.1 19.0 ± 1.7 16.6 17.8 17.6
Reaction energy (P1) −18.5 −16.4 ± 2.5 −7.0 −20.3 −19.7
Barrier (TS-Cope) 16.9 15.3 ± 1.8 16.6 15.5 15.4
Reaction energy (P2) −13.3 −11.9 ± 1.8 −1.5 −14.2 −13.6
Time cost estimate for 1000 trajectories 100 CPU-hours 160 CPU-years 200 CPU-years


Finally, our approach has another huge advantage over common composite schemes, such as when semi-empirical methods are used for pre-screening possible transition states and calculating their ensembles, followed by refinement with the DFT methods. In our example of the pericyclic reaction, AIQM2 could find both the ambimodal and Cope rearrangement TSs, while the popular semi-empirical GFN2-xTB failed to locate the latter. In addition, GFN2-xTB has huge errors (more than 25 kcal mol−1) for the minima too. It means that the composite schemes relying on the semi-empirical methods are likely missing key conformers and TSs. Our approach is a big step forward in replacing the need for composite approaches by providing an all-in-one solution for all steps of the workflow from geometry optimization and ensemble sampling, to dynamics and ultimately reliable energetics.

Overall AIQM2 performance

It is imperative to judge the quality of universal models by evaluating their performance on independent tests that are closer to the real-world applications. We impose a strict requirement on our new method to be applicable for performing molecular simulations with higher speed and with accuracy at least comparable to, or better than, DFT. Hence, we evaluate our new method on the GMTKN55 benchmark8 – a standard test for DFT methods. This benchmark covers various sizes of molecules and types of relative energies, which provides a comprehensive evaluation of AIQM2 beyond reaction properties. The reference values in the GMTKN55 benchmark are taken from high-level QM calculations which are expected to have errors within 1 kcal mol−1 relative to experiment. Hence, our evaluation will show the errors that might be expected when compared to the experiment.

We compare the performance of AIQM2 to that of the DFT methods commonly used in practice, i.e., B3LYP-D4/6-31G* and ωB97X-D4/def2-TZVPP; we also include a double-hybrid functional representative DSD-BLYP-D4/6-31G*. As is shown in Fig. 4, AIQM2 gives overall much better results than B3LYP-D4/6-31G* and DSD-BLYP-D4/6-31G* in terms of WTMAD-2, which is a weighted metric used to balance the different scales of various reaction energies. Note that the double-hybrid functional DSD-BLYP was recommended in the original GMTKN55 benchmark based on the evaluation with a much more expensive def2-QZVP, which is rarely used in practice. We instead choose the 6-31G* basis set for this functional from a practical perspective as one of the most commonly used in DFT studies, which is not intended for rigorous evaluation of the functional itself. Compared with the hybrid functional ωB97X with the more expensive triple-ζ basis set, AIQM2 can achieve comparable accuracy while maintaining the cost of a semi-empirical method. Overall, AIQM2 offers an outstanding cost-accuracy trade-off for predicting thermochemical, kinetic, and noncovalent properties: it is significantly faster than DFT approaches while maintaining competitive accuracy.


image file: d5sc02802g-f4.tif
Fig. 4 Performance assessment of AIQM2 compared to universal ML models ANI-1ccx and AIQM1 targeting coupled cluster accuracy and a selection of QM methods on the GMTKN55 data. Results are shown for neutral, closed-shell CHNO-containing compounds, as AIQM2 was designed to target coupled cluster accuracy on such types of molecules. Numbers are mean absolute errors in kcal mol−1, color-coded for each subset so that green corresponds to the smallest and red – to the largest errors (green – the best). WTMAD-2 is the weighted mean absolute deviation-2 in kcal mol−1 as defined in ref. 8. Time is reported in CPU hours. Notations and definitions of the reaction and interaction energies follow the original GMTKN55 benchmark paper;8 note that our benchmarks for DFT methods differ from the original ones, because we performed them with commonly used, smaller basis sets.

Considering the performance in each category, AIQM2 exhibits the lowest WTMAD-2 value of 1.57 kcal mol−1 among all tested methods for isomerization reaction energies of large systems. This is partly due to the good extensibility of the ANI neural network70 and the rich conformers presented in the training data, as exemplified in the performance of ANI-1ccx on the ISOL24 dataset. However, in cases where strong effects come from atoms outside the cutoff, such as the conjugated system, the ANI neural network may not work properly. Thus, baseline GFN2-xTB plays another important role in the robustness and accuracy of AIQM2. For example, in the C60ISO dataset, GFN2-xTB provides a good starting point for AIQM2 to approach mean absolute error (MAE) as low as 3.18 kcal mol−1, considering the average reaction energies in C60ISO are ca. 100 kcal mol−1. In the generally problematic non-covalent interactions, AIQM2 can even improve on some difficult cases for DFT, such as those in the WATER27 dataset, due to the extensive representation of water clusters in the training data, as known for the ANI-1ccx48 and AIQM1 (ref. 47) methods.

Performance of AIQM2 for transition states and barrier heights

Another significant improvement of AIQM2 over related methods lies in its prediction of barrier heights. This presents a big challenge for AIQM1 and ANI-1ccx, as shown in the previous study,53 although they all target the CCSD(T)/CBS level. Compared to AIQM1, the WTMAD-2 of AIQM2 decreased from 5.45 to 2.24 kcal mol−1 for the datasets benchmarking the barrier heights (Fig. 4).

The most prominent change is in the BHPERI dataset, where the overall MAE is the lowest among the tested methods. Thus, we dive into the details of BHPERI and check whether AIQM2 can produce accurate geometries for TSs. Representative reactions in BHPERI and the comparison of barrier heights and TS geometries are shown in Fig. 5, which also demonstrates that the uncertainty of AIQM2 predictions is lower than that of the less accurate AIQM1.53 All TS structures were successfully found by AIQM2 and exhibited one imaginary frequency, with the normal mode vibration corresponding to the correct reaction.


image file: d5sc02802g-f5.tif
Fig. 5 Representative reactions in BHPERI and their transition state structures. For the first three reactions, the reference barrier heights are approximated with the composite scheme to approach the CBS limit on the CCSD(T)/cc-pVTZ optimized geometries.71 The latter 3 reactions adopted reference values from the BHPERI dataset of the GMTKN55 2017 version, i.e., barriers are calculated with W1–F12 and W2–F12 (ref. 72) methods on geometries from the CBS-QB3 (ref. 73) protocol. Orange is used for reference barrier heights, and blue for AIQM2. Barrier heights of AIQM2 are calculated using reference geometries to be consistent with the GMTKN55 benchmark. RMSD in Ångstrom shows the structural difference between the reference and the AIQM2-optimized TS. The same color scheme is used when overlapping their structures.

Unfortunately, we cannot evaluate the quality of the TS geometry using the BHPERI dataset because it does not contain accurate reference geometries (which were optimized at B3LYP/6-31G* level). As a non-representative comparison, we report only the root-mean-square deviations (RMSDs) of three AIQM2-optimized TS structures relative to the CCSD(T)/cc-pVTZ geometries reported in a more recent work71 (Fig. 5): the Diels–Alder reaction starting from s-cis-butadiene and ethylene, the electrocyclic ring-opening reaction starting from cyclobutene, and the sigmatropic rearrangement originating from the E-isomer of 1,3-pentadiene. This recent work also reported higher-level energetics achieved with hierarchical focal point analyses (FPA), extrapolating to the ab initio limit. AIQM2 agrees well with the reference TS structures, with the RMSDs less than 0.1 Å.

Conclusions and outlook

In this work, we report the second-generation general-purpose AI-enhanced Quantum Mechanical Method (AIQM2). AIQM2 offers remarkable speed and accuracy for reaction simulations by providing robust results for various applications out of the box. The overall accuracy of AIQM2 is rivaling the established DFT approaches, but the simulations are orders of magnitude faster. As we show, this can be exploited for efficient TS optimizations and extensive dynamics investigations of reaction mechanisms.

We exploited AIQM2's excellent performance to lift the accuracy in the quasi-classical downhill molecular dynamics, providing more statistically significant results with trajectories featuring energy conservation. This enabled us to obtain a new chemical insight into a bifurcating pericyclic reaction previously studied with fewer, non-energy-conserving trajectories obtained with a much slower and less accurate DFT approach. We show that this reaction proceeds via a stepwise mechanism rather than a concerted one, as previously believed. We also revise the product distribution in this ambimodal reaction to show that the minor product plays a much smaller role, which we attribute to the proximity of another TS corresponding to the Cope rearrangement, not described properly by B3LYP.

AIQM2 has an overall improved accuracy compared to its predecessor AIQM1, particularly for TSs and barrier heights, which are important for the quantum chemical reaction simulations. AIQM2 already has a growing track record of successful applications in real-world simulations beyond organic reactions, due to its universality, high speed, and accuracy. This method has been shown to provide outstanding IR spectra for a broad range of chemical compounds.74 Recently, it has also been used in the investigation of large systems, where AIQM2 enabled fast and accurate optimizations (e.g., within 90 minutes on 32 GPUs for a 714-atom noncovalent complex), leading to high-quality UV/vis spectra calculated on these optimized geometries.75

All the power of AIQM2 is publicly available in MLatom at https://github.com/dralgroup/mlatom under the permissive MIT license, which lifts any limitations on the reuse of the method. The method is also included in the Aitomic package76 and can be used to perform online calculations on the Aitomistic Hub at https://aitomistic.xyz and XACS platform at https://xacscloud.com/, further enhancing its availability.

Computational details

All calculations were performed with the MLatom program.54 The D4 dispersion corrections were calculated via MLatom's interface to the dftd4 program,77,78 the ANI contributions via the interface to TorchANI,58 and the GFN2-xTB(*) calculations were performed via the interface to the locally modified xtb program.79 AIQM1 calculations require the ODM2* contributions which were calculated with the MNDO program80 interfaced to MLatom. MLatom was used for the B3LYP (VWN5) and ωB97X calculations via the interface to PySCF81–83 and DSD-BLYP via the interface to Orca program.84,85 Additional CCSD(T)*/CBS calculations were performed following the methodology in ref. 61 using the open-source MLatom implementation,54 where the required energy components were calculated via the interface to the Orca program.84,85

In the simulations of the bifurcating pericyclic reaction, we followed the simulation setup and analysis protocol analogous to those in the original study,67i.e., initial conditions for quasi-classical trajectories were sampled from the harmonic quantum Boltzmann distribution starting at the ambimodal transition state, as implemented86 in MLatom.54 The trajectories were started near the TS region and essentially led to “downhill” dynamics to either products or reactants; forward and backward trajectories were launched from the same initial conditions, differing only in the sign of the initial velocities. Each set of such forward and backward trajectories, starting with the same initial conditions, yields a combined trajectory, where the time steps in the backward trajectories are reversed. Reactive trajectories were defined as those combined trajectories that contain reactants and one of the products; nonreactive combined trajectories contain either only reactants or only one of the products.

Author contributions

P. O. D.: conceptualization, method design, original draft, figures. Y. C.: method development and implementations, calculations, analysis, manuscript revision, figures.

Conflicts of interest

The authors declare no competing interests.

Data availability

AIQM2 is publicly available in the MLatom at https://github.com/dralgroup/mlatom. Data with energies and optimized coordinates for the performed benchmarks and ambimodal reaction analysis, as well as datasets used for training AIQM2, are available at https://github.com/707-Moira/aiqm2.

Acknowledgements

P. O. D. acknowledges funding from the National Natural Science Foundation of China (funding via the Outstanding Youth Scholars (Overseas, 2021) project) and via the Lab project of the State Key Laboratory of Physical Chemistry of Solid Surfaces. We also acknowledge Aitomistic for financial support and hosting the AIQM2 methods online on the Aitomistic Hub at https://aitomistic.xyz/ making it available for a wider usership.

References

  1. M. Meuwly, Machine Learning for Chemical Reactions, Chem. Rev., 2021, 121, 10218–10239 CrossRef.
  2. H. B. Schlegel, Geometry optimization, Wiley Interdiscip. Rev.:Comput. Mol. Sci., 2011, 1, 790–809 Search PubMed.
  3. M. Linder and T. Brinck, On the method-dependence of transition state asynchronicity in Diels-Alder reactions, Phys. Chem. Chem. Phys., 2013, 15, 5108–5114 RSC.
  4. A. Sengupta and K. N. Houk, Origins of Periselectivity and Regioselectivity in Ambimodal Tripericyclic [8+6]/[6+4]/[4+2] Intramolecular Cycloadditions of a Heptafulvenyl-Fulvene, J. Phys. Chem. A, 2023, 127, 7976–7983 CrossRef PubMed.
  5. S. Pratihar, X. Ma, Z. Homayoon, G. L. Barnes and W. L. Hase, Direct Chemical Dynamics Simulations, J. Am. Chem. Soc., 2017, 139, 3570–3590 CrossRef PubMed.
  6. A. Agarwal, P. J. Boruah, B. Sarkar and A. K. Paul, Post-Transition-State Direct Dynamics Simulations on the Ozonolysis of Catechol, J. Phys. Chem. A, 2022, 126, 5314–5327 CrossRef PubMed.
  7. J. Zheng, Y. Zhao and D. G. Truhlar, The DBH24/08 Database and Its Use to Assess Electronic Structure Model Chemistries for Chemical Reaction Barrier Heights, J. Chem. Theory Comput., 2009, 5, 808–821 CrossRef PubMed.
  8. L. Goerigk, et al., A look at the density functional theory zoo with the advanced GMTKN55 database for general main group thermochemistry, kinetics and noncovalent interactions, Phys. Chem. Chem. Phys., 2017, 19, 32184–32215 RSC.
  9. K. Raghavachari, G. W. Trucks, J. A. Pople and M. Head-Gordon, A fifth-order perturbation comparison of electron correlation theories, Chem. Phys. Lett., 1989, 157, 479–483 CrossRef CAS.
  10. L. Simon and J. M. Goodman, How reliable are DFT transition structures? Comparison of GGA, hybrid-meta-GGA and meta-GGA functionals, Org. Biomol. Chem., 2011, 9, 689–700 RSC.
  11. S. P. Tyukina, J. A. Velmiskina, A. O. Dmitrienko and M. G. Medvedev, Binomial Uncertainty in Molecular Dynamics-Based Reactions Analysis, J. Phys. Chem. Lett., 2024, 15, 2105–2110 CrossRef CAS PubMed.
  12. V. Venkatasubramanian and V. Mann, Artificial intelligence in reaction prediction and chemical synthesis, Curr. Opin. Chem. Eng., 2022, 36, 100749 CrossRef.
  13. D. T. Ahneman, J. G. Estrada, S. Lin, S. D. Dreher and A. G. Doyle, Predicting reaction performance in C-N cross-coupling using machine learning, Science, 2018, 360, 186–190 CrossRef CAS.
  14. A. L. Haywood, et al., Kernel Methods for Predicting Yields of Chemical Reactions, J. Chem. Inf. Model., 2022, 62, 2077–2092 CrossRef CAS.
  15. F. Sandfort, F. Strieth-Kalthoff, M. Kühnemund, C. Beecks and F. Glorius, A Structure-Based Platform for Predicting Chemical Reactivity, Chem, 2020, 6, 1379–1390 CAS.
  16. P. Schwaller, A. C. Vaucher, T. Laino and J.-L. Reymond, Prediction of chemical reaction yields using deep learning, Mach. Learn.: Sci. Technol., 2021, 2, 015016 Search PubMed.
  17. H.-C. Chang, M.-H. Tsai and Y.-P. Li, Enhancing Activation Energy Predictions under Data Constraints Using Graph Neural Networks, J. Chem. Inf. Model., 2025, 65, 1367–1377 CrossRef CAS.
  18. S. Choi, Y. Kim, J. W. Kim, Z. Kim and W. Y. Kim, Feasibility of Activation Energy Prediction of Gas-Phase Reactions by Machine Learning, Chem. Eur J., 2018, 24, 12354–12358 CrossRef CAS.
  19. X. Garcia-Andrade, P. Garcia Tahoces, J. Perez-Rios and E. Martinez Nunez, Barrier Height Prediction by Machine Learning Correction of Semiempirical Calculations, J. Phys. Chem. A, 2023, 127, 2274–2283 CrossRef CAS PubMed.
  20. C. A. Grambow, L. Pattanaik and W. H. Green, Deep Learning of Activation Energies, J. Phys. Chem. Lett., 2020, 11, 2992–2997 Search PubMed.
  21. E. Heid and W. H. Green, Machine Learning of Reaction Properties via Learned Representations of the Condensed Graph of Reaction, J. Chem. Inf. Model., 2022, 62, 2101–2110 CrossRef CAS.
  22. S. Heinen, G. F. von Rudorff and O. A. von Lilienfeld, Toward the design of chemical reactions: Machine learning barriers of competing mechanisms in reactant space, J. Chem. Phys., 2021, 155, 064105 CrossRef CAS PubMed.
  23. I. Ismail, C. Robertson and S. Habershon, Successes and challenges in using machine-learned activation energies in kinetic simulations, J. Chem. Phys., 2022, 157, 014109 CrossRef CAS PubMed.
  24. T. I. Madzhidov, et al., Structure–reactivity relationship in Diels–Alder reactions obtained using the condensed reaction graph approach, J. Struct. Chem., 2017, 58, 650–656 CrossRef CAS.
  25. K. A. Spiekermann, L. Pattanaik and W. H. Green, Fast Predictions of Reaction Barrier Heights: Toward Coupled-Cluster Accuracy, J. Phys. Chem. A, 2022, 126, 3976–3986 CrossRef CAS PubMed.
  26. T. Stuyver and C. W. Coley, Quantum chemistry-augmented neural networks for reactivity prediction: Performance, generalizability, and explainability, J. Chem. Phys., 2022, 156, 084104 Search PubMed.
  27. Q. Zhao, D. M. Anstine, O. Isayev and B. M. Savoie, Delta(2) machine learning for reaction property prediction, Chem. Sci., 2023, 14, 13392–13401 RSC.
  28. E. Komp and S. Valleau, Machine Learning Quantum Reaction Rate Constants, J. Phys. Chem. A, 2020, 124, 8607–8613 CrossRef CAS PubMed.
  29. P. L. Houston, A. Nandi and J. M. Bowman, A Machine Learning Approach for Prediction of Rate Constants, J. Phys. Chem. Lett., 2019, 10, 5250–5258 CrossRef CAS.
  30. R. Barrett and J. Westermayr, Reinforcement Learning for Traversing Chemical Structure Space: Optimizing Transition States and Minimum Energy Paths of Molecules, J. Phys. Chem. Lett., 2024, 15, 349–356 CrossRef CAS PubMed.
  31. C. Duan, Y. Du, H. Jia and H. J. Kulik, Accurate transition state generation with an object-aware equivariant elementary reaction diffusion model, Nat. Comput. Sci., 2023, 3, 1045–1055 CrossRef PubMed.
  32. R. Jackson, W. Zhang and J. Pearson, TSNet: predicting transition state structures with tensor field networks and transfer learning, Chem. Sci., 2021, 12, 10022–10040 RSC.
  33. S. Kim, J. Woo and W. Y. Kim, Diffusion-based generative AI for exploring transition states from 2D molecular graphs, Nat. Commun., 2024, 15, 341 CrossRef CAS PubMed.
  34. M. Z. Makos, N. Verma, E. C. Larson, M. Freindorf and E. Kraka, Generative adversarial networks for transition state geometry prediction, J. Chem. Phys., 2021, 155, 024116 CrossRef CAS.
  35. L. Pattanaik, J. B. Ingraham, C. A. Grambow and W. H. Green, Generating transition states of isomerization reactions with deep learning, Phys. Chem. Chem. Phys., 2020, 22, 23618–23626 RSC.
  36. E. C. Yuan, et al., Analytical ab initio hessian from a deep learning potential for transition state optimization, Nat. Commun., 2024, 15, 8865 CrossRef CAS.
  37. J. Zhang, et al., Deep reinforcement learning of transition states, Phys. Chem. Chem. Phys., 2021, 23, 6888–6895 RSC.
  38. Y. Yang, S. Zhang, K. D. Ranasinghe, O. Isayev and A. E. Roitberg, Machine Learning of Reactive Potentials, Annu. Rev. Phys. Chem., 2024, 75, 371–395 CrossRef CAS.
  39. G. D. Fabritiis, Machine Learning Potentials: A Roadmap Toward Next-Generation Biomolecular Simulations, arXiv, 2024, preprint, 2408.12625 [physics.chem-ph],  DOI:10.48550/arXiv.2408.12625.
  40. D. M. Anstine, R. Zubatyuk and O. Isayev, AIMNet2: a neural network potential to meet your neutral, charged, organic, and elemental-organic needs, Chem. Sci., 2025, 16, 10228–10244 RSC.
  41. C. Chen and S. P. Ong, A universal graph deep learning interatomic potential for the periodic table, Nat. Comput. Sci., 2022, 2, 718–728 CrossRef PubMed.
  42. K. Choudhary, et al., Unified graph neural network force-field for the periodic table: solid state applications, Digit. Discov., 2023, 2, 346–355 RSC.
  43. B. Deng, et al., CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling, Nat. Mach. Intell., 2023, 5, 1031–1041 CrossRef.
  44. M. Nováček and J. Řezáč, PM6-ML: The synergy of semiempirical quantum chemistry and machine learning transformed into a practical computational method, J. Chem. Theory Comput., 2025, 21, 678–690 CrossRef PubMed.
  45. S. Takamoto, et al., Towards universal neural network potential for material discovery applicable to arbitrary combination of 45 elements, Nat. Commun., 2022, 13, 2991 CrossRef CAS.
  46. F. Xie, T. Lu, S. Meng and M. Liu, GPTFF: A high-accuracy out-of-the-box universal AI force field for arbitrary inorganic materials, Sci. Bull., 2024, 69, 3525–3532 CrossRef.
  47. P. Zheng, R. Zubatyuk, W. Wu, O. Isayev and P. O. Dral, Artificial Intelligence-Enhanced Quantum Chemical Method with Broad Applicability, Nat. Commun., 2021, 12, 7022 CrossRef CAS.
  48. J. S. Smith, et al., Approaching coupled cluster accuracy with a general-purpose neural network potential through transfer learning, Nat. Commun., 2019, 10, 2903 CrossRef.
  49. R. Ramakrishnan, P. O. Dral, M. Rupp and O. A. von Lilienfeld, Big Data Meets Quantum Chemistry Approximations: The Δ-Machine Learning Approach, J. Chem. Theory Comput., 2015, 11, 2087–2096 CrossRef CAS PubMed.
  50. Z. Yan, D. Wei, X. Li and L. W. Chung, Accelerating reliable multiscale quantum refinement of protein-drug systems enabled by machine learning, Nat. Commun., 2024, 15, 4181 CrossRef CAS.
  51. C. Devereux, et al., Extending the Applicability of the ANI Deep Learning Molecular Potential to Sulfur and Halogens, J. Chem. Theory Comput., 2020, 16, 4192–4202 CrossRef CAS PubMed.
  52. C. Bannwarth, S. Ehlert and S. Grimme, GFN2-xTB-An Accurate and Broadly Parametrized Self-Consistent Tight-Binding Quantum Chemical Method with Multipole Electrostatics and Density-Dependent Dispersion Contributions, J. Chem. Theory Comput., 2019, 15, 1652–1671 CrossRef CAS PubMed.
  53. Y. Chen, et al., Benchmark of general-purpose machine learning-based quantum mechanical method AIQM1 on reaction barrier heights, J. Chem. Phys., 2023, 158, 074103 CrossRef CAS.
  54. P. O. Dral, et al., MLatom 3: A Platform for Machine Learning-Enhanced Computational Chemistry Simulations and Workflows, J. Chem. Theory Comput., 2024, 20, 1193–1213 CrossRef CAS PubMed.
  55. Y. Chen, Y.-F. Hou, O. Isayev & P. O. Dral, Universal and Updatable Artificial Intelligence-Enhanced Quantum Chemical Foundational Models, ChemRxiv, 2024, preprint,  DOI:10.26434/chemrxiv-2024-604wb.
  56. S. Grimme, Exploration of Chemical Compound, Conformer, and Reaction Space with Meta-Dynamics Simulations Based on Tight-Binding Quantum Chemical Calculations, J. Chem. Theory Comput., 2019, 15, 2847–2862 CrossRef CAS PubMed.
  57. E. Caldeweyher, C. Bannwarth and S. Grimme, Extension of the D3 dispersion coefficient model, J. Chem. Phys., 2017, 147, 034112 CrossRef PubMed.
  58. X. Gao, F. Ramezanghorbani, O. Isayev, J. S. Smith and A. E. Roitberg, TorchANI: A Free and Open Source PyTorch-Based Deep Learning Implementation of the ANI Neural Network Potentials, J. Chem. Inf. Model., 2020, 60, 3408–3415 CrossRef CAS.
  59. J.-D. Chai and M. Head-Gordon, Long-range corrected hybrid density functionals with damped atom-atom dispersion corrections, Phys. Chem. Chem. Phys., 2008, 10, 6615–6620 RSC.
  60. S. F. Alavi, et al., ANI-1ccx-gelu Universal Interatomic Potential and Its Fine-Tuning: Toward Accurate and Efficient Anharmonic Vibrational Frequencies, J. Phys. Chem. Lett., 2025, 16, 483–493 CrossRef CAS PubMed.
  61. J. S. Smith, et al., The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules, Sci. Data, 2020, 7, 134 CrossRef CAS PubMed.
  62. F. Weigend and R. Ahlrichs, Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: Design and assessment of accuracy, Phys. Chem. Chem. Phys., 2005, 7, 3297–3305 RSC.
  63. Y. Guo, et al., Communication: An improved linear scaling perturbative triples correction for the domain based local pair-natural orbital based singles and doubles coupled cluster method [DLPNO-CCSD(T)], J. Chem. Phys., 2018, 148, 011101 CrossRef PubMed.
  64. A. Halkier, T. Helgaker, P. Jørgensen, W. Klopper and J. Olsen, Basis-set convergence of the energy in molecular Hartree–Fock calculations, Chem. Phys. Lett., 1999, 302, 437–446 CrossRef CAS.
  65. T. Helgaker, W. Klopper, H. Koch and J. Noga, Basis-set convergence of correlated calculations on water, J. Chem. Phys., 1997, 106, 9639–9646 CrossRef CAS.
  66. P. Zheng, W. Yang, W. Wu, O. Isayev and P. O. Dral, Toward Chemical Accuracy in Predicting Enthalpies of Formation with General-Purpose Data-Driven Methods, J. Phys. Chem. Lett., 2022, 13, 3479–3491 CrossRef CAS PubMed.
  67. P. Yu, et al., Mechanisms and Origins of Periselectivity of the Ambimodal [6 + 4] Cycloadditions of Tropone to Dimethylfulvene, J. Am. Chem. Soc., 2017, 139, 8251–8258 CrossRef CAS PubMed.
  68. Y. F. Hou, Q. Zhang and P. O. Dral, Surprising Dynamics Phenomena in the Diels-Alder Reaction of C(60) Uncovered with AI, J. Org. Chem., 2024, 89, 15041–15047 CrossRef CAS PubMed.
  69. T. A. Young, T. Johnston-Wood, H. Zhang and F. Duarte, Reaction dynamics of Diels-Alder reactions from machine learned potentials, Phys. Chem. Chem. Phys., 2022, 24, 20820–20827 RSC.
  70. J. S. Smith, O. Isayev and A. E. Roitberg, ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost, Chem. Sci., 2017, 8, 3192–3203 RSC.
  71. P. Vermeeren, et al., Pericyclic reaction benchmarks: hierarchical computations targeting CCSDT(Q)/CBS and analysis of DFT performance, Phys. Chem. Chem. Phys., 2022, 24, 18028–18042 RSC.
  72. A. Karton and J. M. Martin, Explicitly correlated Wn theory: W1-F12 and W2-F12, J. Chem. Phys., 2012, 136, 124114 CrossRef PubMed.
  73. J. A. Montgomery, M. J. Frisch, J. W. Ochterski and G. A. Petersson, A complete basis set model chemistry. VII. Use of the minimum population localization method, J. Chem. Phys., 2000, 112, 6532–6542 CrossRef CAS.
  74. Y.-F. Hou, C. Wang and P. O. Dral, Accurate and Affordable Simulation of Molecular Infrared Spectra with AIQM Models, J. Phys. Chem. A, 2025, 129, 3613–3623 CrossRef CAS PubMed.
  75. J.-C. Song, et al., Alternating Donor–Acceptor Thienoacenes Featuring Up to 23 Linearly Fused Rings, Org. Lett., 2025, 27, 3753–3759 CrossRef CAS.
  76. P. O. Dral, Y. Chen and J. Li, Aitomic: A package for AI-enhanced atomistic simulations, Aitomistic, Shenzhen, China, http://MLatom.com/aitomic (accessed on August 9, 2025), 20252 Search PubMed.
  77. E. Caldeweyher, J.-M. Mewes, S. Ehlert and S. Grimme, Extension and evaluation of the D4 London-dispersion model for periodic systems, Phys. Chem. Chem. Phys., 2020, 22, 8499–8512 RSC.
  78. E. Caldeweyher, S. Ehlert & S. Grimme, DFT-D4, Version 3.5.0 (Mulliken Center for Theoretical Chemistry), University of Bonn, 2024 Search PubMed.
  79. Semiempirical extended tight-binding program package xtb, https://github.com/grimme-lab/xtb (accessed on August 9, 2025).
  80. W. Thiel, with contributions from M. Beck, S. Billeter, R. Kevorkiants, M. Kolb, A. Koslowski, S. Patchkovskii, A. Turner, E.-U. Wallenborn, W. Weber, L. Spörkel, and P. O. Dral, MNDO2020: a semiempirical quantum chemistry program v. development version (Max-Planck-Institut für Kohlenforschung, Mülheim an der Ruhr), https://mndo.kofo.mpg.de (accessed on August 9, 2025), 2020.
  81. Q. Sun, Libcint: An efficient general integral library for Gaussian basis functions, J. Comput. Chem., 2015, 36, 1664–1671 CrossRef CAS PubMed.
  82. Q. Sun, et al., PySCF: the Python-based simulations of chemistry framework, Wiley Interdiscip. Rev.:Comput. Mol. Sci., 2017, 8, e1340 Search PubMed.
  83. Q. Sun, et al., Recent developments in the PySCF program package, J. Chem. Phys., 2020, 153, 024109 CrossRef CAS PubMed.
  84. F. Neese, Software update: the ORCA program system, version 4.0, Wiley Interdiscip. Rev.:Comput. Mol. Sci., 2018, 8, e1327 Search PubMed.
  85. F. Neese, The ORCA program system, Wiley Interdiscip. Rev.:Comput. Mol. Sci., 2012, 2, 73–78 CAS.
  86. Y. F. Hou, L. Zhang, Q. Zhang, F. Ge and P. O. Dral, Physics-informed active learning for accelerating quantum chemical simulations, J. Chem. Theory Comput., 2024, 20, 7744–7754 CAS.

This journal is © The Royal Society of Chemistry 2025
Click here to see how this site uses Cookies. View our privacy policy here.