Revisiting a Large and Diverse Data Set for Barrier Heights and Reaction Energies: Best Practices in Density Functional Theory Calculations for Chemical Kinetics
Abstract
Accurate prediction of barrier heights and reaction energies is of paramount importance for reaction kinetics. For computational efficiency, such calculations are typically performed with density functional theory (DFT), whose accuracy depends critically on the choice of functional. The RDB7 data set (Sci Data 9, 417 (2022)) is a diverse chemical kinetics data set that covers 11926 reactions and their barriers and can be used to assess present-day functionals. Strikingly, the RDB7 barrier heights reported using a reputable rung-4 hybrid functional ($\omega$B97X-D3) exhibited significantly larger errors than seen in other benchmarks. Here, we identify the sources of error and, to the extent possible, address them. We categorize the barrier heights and reaction energies into three subsets based on orbital stability analysis. The ``easy'' subset has orbitals that are stable at the mean-field Hartree-Fock (HF) level, which implies weak correlation effects. An ``intermediate'' subset exhibits spin symmetry breaking at the HF level, but the restricted orbitals are stable at the dynamically correlated $\kappa$ orbital-optimized second-order M{\o}ller-Plesset ($\kappa$-OOMP2) level with $\kappa=1.45$. Although this subset is more challenging than the easy one, such stability implies that correlation effects are still not strong. The remaining ``difficult'' subset is expected to be significantly affected by strong electron correlation, which can affect the accuracy of standard DFT. With this data classification, we performed new benchmarks with unrestricted $\omega$B97X-D3 as well as two other hybrid functionals, $\omega$B97M-V and MN15, and the double-hybrid $\omega$B97M(2) functional. The RMSD values on the easy subset are comparable to prior high-quality benchmark studies, while the performance of all functionals on the intermediate subset is consistently worse. By far the largest errors lie in the difficult subset involving strongly correlated species.
We refined some of the previous reference values to further assess the two key error sources: the density functional and its associated orbitals, and the reduced reliability of the previous RHF:RCCSD(T)-F12 reference. We propose our orbital stability classification as a best-practice approach for DFT calculations in chemical kinetics on systems with an even number of electrons, as it provides useful information about the expected accuracy. We strongly recommend the routine use of orbital stability analysis in DFT calculations, because the resulting spin-polarized solutions significantly reduce the strong correlation errors seen with spin-restricted orbitals.
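The three-way classification described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `Difficulty`, `classify_species`, `classify_reaction`, and `rmsd_by_subset` are hypothetical, the two boolean flags are assumed to come from stability analyses run elsewhere (restricted HF and restricted $\kappa$-OOMP2 with $\kappa=1.45$), and assigning a reaction the hardest label among its species is an assumption made here for concreteness.

```python
from enum import IntEnum
from math import sqrt


class Difficulty(IntEnum):
    """Hypothetical labels for the three subsets; higher means harder."""
    EASY = 0
    INTERMEDIATE = 1
    DIFFICULT = 2


def classify_species(rhf_stable: bool, r_oomp2_stable: bool) -> Difficulty:
    """Label one species from two precomputed stability flags (assumed inputs)."""
    if rhf_stable:
        # Restricted orbitals are stable at the HF level: weak correlation.
        return Difficulty.EASY
    if r_oomp2_stable:
        # HF breaks spin symmetry, but restricted kappa-OOMP2 orbitals are stable.
        return Difficulty.INTERMEDIATE
    # Neither level yields stable restricted orbitals: likely strong correlation.
    return Difficulty.DIFFICULT


def classify_reaction(species_flags) -> Difficulty:
    """Assumption: a reaction inherits the hardest label among its species."""
    return max(classify_species(hf, oomp2) for hf, oomp2 in species_flags)


def rmsd_by_subset(records) -> dict:
    """Root-mean-square deviation per subset.

    records: iterable of (Difficulty, computed_value, reference_value).
    """
    squared_errors = {}
    for label, computed, reference in records:
        squared_errors.setdefault(label, []).append((computed - reference) ** 2)
    return {label: sqrt(sum(v) / len(v)) for label, v in squared_errors.items()}
```

Reporting errors per subset in this way, rather than as a single aggregate, is what allows the expected accuracy of a functional to be tied to the stability class of the species involved.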