Identifying systematic DFT errors in catalytic reactions

Using CO 2 reduction reactions as examples, we present a widely applicable method for identifying the main source of errors in density functional theory (DFT) calculations. The method has broad applications for error correction in DFT calculations in general, as it relies on the dependence of the applied exchange – correlation functional on the reaction energies rather than on errors versus the experimental data. As a result, improved energy corrections can now be determined for both gas phase and adsorbed reaction species, particularly interesting within heterogeneous catalysis. We show that for the CO 2 reduction reactions, the main source of error is associated with the C  O bonds and not the typically energy corrected OCO backbone.

Identifying systematic DFT errors in catalytic reactions † Rune Christensen, Heine A. Hansen and Tejs Vegge* Using CO 2 reduction reactions as examples, we present a widely applicable method for identifying the main source of errors in density functional theory (DFT) calculations. The method has broad applications for error correction in DFT calculations in general, as it relies on the dependence of the applied exchangecorrelation functional on the reaction energies rather than on errors versus the experimental data. As a result, improved energy corrections can now be determined for both gas phase and adsorbed reaction species, particularly interesting within heterogeneous catalysis. We show that for the CO 2 reduction reactions, the main source of error is associated with the CO bonds and not the typically energy corrected OCO backbone.
Electroreduction of CO 2 using electricity from renewable sources has the potential to supply carbon neutral transportation fuels, but improved electrocatalysts are needed. Although research within this area has been conducted for decades, 1,2 current challenges include low efficiency, product selectivity and stability of the catalysts. In recent years, significant progress in understanding catalytic activity from fundamental reaction mechanisms has been made for a variety of different heterogeneous electrocatalysts using density functional theory (DFT). 3,4 Within DFT, several levels of calculational complexity in describing the exchangecorrelation functional exist. Although higher level methods can be used in some cases, computational cost will often limit treatment of the exchange energy to be performed using the Generalized Gradient Approximation (GGA) for extensive studies of heterogeneous catalysis. 5 For this reason, this work focuses solely on improving the accuracy of calculations relying on functionals with GGA type exchange.
The RPBE 6 and BEEF-vdW 7 exchange-correlation functionals have been developed for catalysis studies and showed to be well-suited for determining chemisorption energies. 8,9 However, experimental gas phase reaction enthalpies for CO 2 reduction to products of interest can, in many cases, not be reproduced within an error of 0.5 eV per CO 2 , due to significant systematic errors. [10][11][12] These inaccuracies can prevent, e.g., accurate determination of product selectivity. To remain at a sufficiently low level of computational cost while improving accuracy, energy corrections have previously been applied in a fitting procedure, where the mean absolute error (MAE) versus experimental data has been minimized for a given set of reactions. [10][11][12] As a result, energy corrections were previously applied to molecules containing an oxygen-carbon-oxygen (OCO) backbone structure, e.g. CO 2 and HCOOH, for the RPBE functional 10 with an additional correction on the H 2 molecule for the BEEF-vdW functional. 11,12 This correction scheme has subsequently been widely accepted and applied in a large number of high impact papers. 10,13,14 The minimized MAE can vary by as little as 0.01 eV for different choices of corrections, 11 making the choice of corrections based solely on the minimized MAE vulnerable to both minor calculational and experimental inaccuracies. Here, we present a new and robust approach for identification of the specific molecules or molecular components requiring an energy correction to obtain the needed accuracy. Here, we demonstrate it for CO 2 reduction reactions, but the approach is generally applicable, e.g. in studies of heterogeneous catalysis of organic compounds or oxygen reduction.
Although inferior to RPBE and BEEF-vdW for determination of chemisorption energies, 8,9 we find the PBE functional, 15 which is also frequently used within heterogeneous catalysis, able to reproduce most experimental gas phase enthalpies of reaction for CO 2 reduction reactions with sufficient accuracy. The PBE and RPBE functionals differ only in enhancement factor in the GGA exchange energy. 6 If the calculated energy of one specific molecule or molecular component is particularly sensitive towards changes in the enhancement factor, it will dominate the ability to reproduce the experimental enthalpies of reaction. Sensitivity towards changes in the enhancement factor can be probed by calculating the enthalpies of reaction with GGA type functionals with different enhancement factors. The enhancement factor in the BEEF-vdW functional is composed of a sum of Legendre polynomials with expansion coefficients determined in a machine learning process to obtain the best GGA functional for a range of databases containing different data such as enthalpies of formation, chemisorption energies, reaction barriers and van der Waals interactions. Intrinsic to the functional is an ensemble of functionals with perturbed expansion coefficients. Perturbations are included in the ensemble of functionals based on how well the perturbed functional performs and a "temperature" parameter such that one standard deviation in a quantity calculated using the generated ensemble of functionals corresponds to the predicted calculational error. 7 The use of ensemble functionals is a computationally very efficient method for calculating enthalpies of formation with a large number of functionals with different enhancement factors. Here, 2000 different ensemble functionals have been examined. The vdW-DF 16 and vdW-DF2 17 functionals are also examined for comparison as they contain vdW correlation similar to that of BEEF-vdW.
Here, the reaction enthalpy is calculated for a range of different CO 2 reduction reactions listed in Table 1. The gas phase reactions in the 'primary set' are identical to those examined previously for establishing general energy corrections. † [10][11][12] In addition to the primary set, a 'verification set' is also introduced. With the exception of the reduction to dimethyl ether, i.e. reaction (15), the verification set consists of reactions with product molecules containing carbon-oxygen double bonds (CO). The stoichiometry of all gas phase reactions is normalized to one CO 2 reactant molecule. This is not required but this simplifies data treatment.
In addition to gas phase molecules, functional dependent errors can, in contrast to previously used correctional approaches, also be examined for surface adsorbates. A highly important examplecarboxyl (COOH*) adsorption on a Cu (111) surfaceis presented. By comparing the functional dependence of a surface reaction with a similar gas phase reaction, it can be determined whether a similar correction should be applied for the two. To obtain functional dependence comparable to the gas phase, adsorbed methyl (CH 3 *) has been used as the reactant as both COOH* and CH 3 * bond to a single copper atom through a carbon atom with 3 additional covalent bonds. The compared reactions, (*1a) and (*1b), can be seen in Table 1.
All calculations have been performed using the Vienna ab initio Simulation Package (VASP) [18][19][20][21] and the Atomic Simulation Environment (ASE), 22 which has been used to generate the ensembles. † Examples of correlated reaction enthalpies can be seen in Fig. 1. The observed linear correlations indicate that the functional dependence is dominated by a single molecule or molecular component, or by linearly dependent molecules or molecular components. By assuming a specific molecule or molecular component to dominate the functional dependence, it is possible to predict the slope by dividing the change in the number of occurrences in the y-axis reaction with the change in the x-axis reaction as exemplified below. The predicted slope can subsequently be validated against the observed slope. Assuming molecules with an oxygen-carbon-oxygen (OCO) backbone structure to be the major source of functional dependence and, thus, error, 10-12 the slope for the reactions in Fig. 1a is predicted to be 1.0 (−1/−1) since in both reactions an OCO backbone is present in the CO 2 reactant, but not in the product, giving a change of −1 in both reactions. This fits well with the observed slope. Fig. 1b and c show the functional dependence approximately following lines with a slope of 0.5. This disagrees with the general assumption that the OCO backbone dominates the functional dependence, as this assumption predicts a slope of 0 for Fig. 1b, since the OCO backbone is present in both reactant and product (HCOOH), giving a change of 0 in the y-axis reaction (0/−1), and a slope of 1.0 for Fig. 1c, which is similar to Fig. 1a in terms of changes in the OCO backbones. The observed slopes do, however, fit with the CO bonds dominating the functional dependence. In both reactions plotted on the y-axis, one of the double bonds in the CO 2 reactant is preserved in the product (HCOOH and H 2 CO) and both are broken in reaction (3) plotted on the x-axis, resulting in a predicted slope of 0.5 (−1/−2). In a more indepth quantitative analysis of the total reaction set, one can find the slopes obtained through linear regression to agree very well with the slopes predicted under the assumption that the CO bonds dominate the functional dependence. † Fig. 1d compares the two reactions in the adsorbate set. They are observed to show the same functional dependence, and COOH* (carboxyl) should thus be corrected with the same energy correction as HCOOH (formic acid). From the geometry and bond lengths, this is expected as the CO bond appears to be present in COOH*. † Dispersion forces included in the vdW functionals have an effect on the adsorbate reactions as the vdW functionals appear to be on a lower but parallel line to the non-vdW functionals, although too few non-vdW data points are present for definite conclusions. The offset does not impact the analysis of the functional dependence.
Having identified the CO bond to be dominating the functional dependence, energy corrections are applied based on the number of CO bonds in a molecule rather than to molecules with an OCO backbone. The magnitude of the energy correction is then determined by minimizing the MAE versus the experimental data for the primary set of reactions shown in Table 1 in a procedure identical to the one previously applied for determining corrections using the same reference data. [10][11][12] To make a direct comparison with the previously used corrections for the BEEF-vdW functional, corrections are also applied to the H 2 molecules. A comparison of corrections can be seen in Fig. 2. For both types of corrections (CO and OCO), the optimal magnitudes of corrections are 0.10 eV for the H 2 molecules and 0.29 eV for CO 2 (0.15 eV per CO bond). These are similar although not identical to what have been found previously (0.09 eV and 0.33 eV, 11 and 0.09 eV and 0.41 eV (ref. 12)). The magnitude of corrections is the same within 0.01 eV per bond if the reactions in the verification set are included in the minimization of the MAE. By applying a correction to CO instead of OCO, a significant reduction in post-correction errors are observed for the reactions, where the main product either has the OCO backbone but with only one CO bond or has the CO bond without the OCO backbone structure (see Fig. 2). This is seen to be the case in both the primary and the verification sets. This trend is also observed for the RPBE functional. In the case of the reduction to HCOOH (reaction (3)), the post-correction error in the calculations with BEEF-vdW changes from 0.13 eV to −0.02 eV. Previously, errors of 0.15 eV 11 and 0.17 eV 12 were obtained for this reaction using the OCO backbone and H 2 corrected BEEF-vdW functional. For HCOOCH 3 (reaction (10)) and CH 2 O (reaction (11)), significantly different experimental gas phase enthalpies of formation are available in the NIST database, which is used as the source for all reference data. 23 The previously used experimental value for HCOOCH 3 is by far the highest in the database and is extrapolated from the liquid phase enthalpy of formation. 23 Using one of the three alternative experimental  references in the NIST database, the error for reaction (10) follows the trend of the other reactions and can be decreased to a few meV after the CO correction. The effect of changing the correction scheme and applying corrections to the adsorbate can be very significant as exemplified by the reduction of COOH* to HCOOH. Computed using the BEEF-vdW functional, the reaction energy changes significantly from having a free energy change of −0.82 eV with OCO corrections, to −1.11 eV with CO corrections.
Optimal magnitudes of corrections have been determined for all ensemble functionals. The standard deviation is then calculated for the ensemble of corrected functionals and plotted as error bars in Fig. 2. This can be used as a measure of how well the correction performs for the class of functionals. The ensemble standard deviation is generally decreased with the CO correction for reactions where it applies differently than the OCO correction, e.g. for reaction (2), the reduction to formic acid. With the CO corrections, the standard deviation for reaction (0) is significantly larger than for the other reactions. As this is the only depicted reaction including CO, the relatively large standard deviation suggests that the energy of CO could also be functional dependent and require correction on the order of 0.2 eV with certain functionals. CO has previously been found to require corrections using the PBE functional. 9,24 A functional dependent error is also found for carbon-carbon (CC) double bonds, as described in the ESI. † It is, however, of minor importance for the reactions considered here.
The demonstrated method is not limited to the presented case or catalytic reactions and can be used to identify a dominating, error-causing structure, molecule or molecular component in other cases, where functional dependence is observed.
For the different GGA functionals, the difference in the enhancement factor will be small for low density gradients and increase as the density gradient increases. 7 This can explain why functional dependence is most notable for the CO bond, the CC bond, and the CO molecule, as they probably give rise to the largest density gradients for the species in the reactions. In the future, the reduced density gradient could potentially be used directly for qualitative identification of molecules or molecular components with large functional dependence.