Wan-Lu
Li
abcf,
Kaixuan
Chen
abc,
Elliot
Rossomme
ab,
Martin
Head-Gordon
abc and
Teresa
Head-Gordon
*abcde
aKenneth S. Pitzer Center for Theoretical Chemistry, USA
bDepartment of Chemistry, USA
cChemical Sciences Division, Lawrence Berkeley National Laboratory, USA. E-mail: thg@berkeley.edu
dDepartment of Chemical and Biomolecular Engineering, USA
eDepartment of Bioengineering, University of California, Berkeley, Berkeley, California 94720, USA
fDepartment of Nanoengineering and Materials Science and Engineering Program, University of California San Diego, La Jolla, California 92093, USA
First published on 14th September 2023
We present an investigation into the transferability of pseudopotentials (PPs) with a nonlinear core correction (NLCC) using the Goedecker, Teter, and Hutter (GTH) protocol across a range of pure GGA, meta-GGA and hybrid functionals, and their impact on thermochemical and non-thermochemical properties. The GTH-NLCC PP for the PBE density functional demonstrates remarkable transferability to the PBE0 and ωB97X-V exchange–correlation functionals, and relative to no NLCC, improves agreement significantly for thermochemical benchmarks compared to all-electron calculations. On the other hand, the B97M-rV meta-GGA functional performs poorly with the PBE-derived GTH-NLCC PP, which is mitigated by reoptimizing the NLCC parameters for this specific functional. The findings reveal that atomization energies exhibit the greatest improvements from use of the NLCC, which thus provides an important correction needed for covalent interactions relevant to applications involving chemical reactivity. Finally we test the NLCC-GTH PPs when combined with medium-size TZV2P molecularly optimized (MOLOPT) basis sets which are typically utilized in condensed phase simulations, and show that they lead to consistently good results when compared to all-electron calculations for atomization energies, ionization potentials, barrier heights, and non-covalent interactions, but lead to somewhat larger errors for electron affinities.
However, in a self-consistent field calculation, the charge distribution undergoes changes when a free atom is inserted into a molecule or solid, and the potential within the muffin-tin region will generally differ from the potential within a muffin-tin sphere of the same radius around the reference atom. As a result, the scattering properties undergo modifications, and the PP constructed based on the isolated atom might not accurately reproduce the altered scattering properties due to the new chemical environment. In spin-polarized calculations, the concept of an invariant muffin-tin sphere is also not applicable since the spin polarization deviates across different chemical environments, despite the similarity in the total charge within the invariant muffin-tin sphere. In other words, the typical assumption for PPs that assume a linear relationship between the charge distribution and potential leads to inaccuracies because the electron–electron interactions in the core region are highly localized such that they exhibit a nonlinear relationship between the electron density and the associated potential.
One approach to address this is by incorporating nonlinear core corrections (NLCC)9 into the PP optimization. For the NLCC schemes, the spin and charge densities within the muffin-tin sphere are not solely determined by the valence electrons, but instead encompass the combined contributions of the valence charge and the core charge. The NLCC method allows for a more realistic treatment of the core electron behavior by taking into account the strong electron–electron interactions and the asymmetric charge density distribution near the atomic nucleus. In accordance with the principles underlying the GTH PPs, Willand et al. in 2013 put forward a NLCC method in which the core charge density is represented by a single Gaussian function,10 which captures the essential characteristics of the core charge. The amplitude and width of this Gaussian core charge distribution were subsequently optimized through a rigorous fitting procedure similar to the regular GTH PP optimization11–14 but including not just ground state but also excited states and ionized electronic configurations. By employing this methodology, the NLCC-PP achieves a high level of precision in describing the atomization energies in molecular systems compared to all-electron calculations and reliability for high-pressure phases of crystalline solids. Together with an adequate treatment of dispersion, such as the empirical Grimme D2 (ref. 15) and D3 (ref. 16) schemes, the inclusion of NLCC corrections allows for a comprehensive description of weakly bound intermolecular interactions, achieving an average error of approximately 0.5 kcal mol−1.
Recently, we presented a study on the optimization and transferability of small and large core GTH PPs for Density Functional Theory (DFT) functionals applied to molecular systems and condensed phase simulations for electrocatalysis.13 We also have recently conducted a systematic exploration of the extent of PP inconsistency errors (PPIEs)17 that arise from the transfer of a different PP relative to the chosen level of DFT when evaluating energy differences. To address these errors, we have developed and implemented empirical atom- and density functional approximation specific PP corrections.17 But in neither case have we considered the role of NLCC PPs and their influence on the accuracy possible with different DFT functionals outside the original parameterization of the PBE functional10 and when combined with less complete basis sets such as the molecularly optimized (MOLOPT) basis sets used in AIMD application studies in CP2K.18
In this work, we directly use the 2013 NLCC HGH-PP developed for the PBE functional,10 and investigate its transferability and accuracy across various DFT functionals including the hybrid GGA PBE0,19 the semi-empirical hybrid ωB97M-rV,20 and the meta-GGA B97M-rV,21,22 as measured on performance for thermochemical properties, barrier heights, isomerization energies, and non-covalent interactions. In addition, we also combine the GTH-NLCC method with the medium-sized TZVP MOLOPT basis sets that are used to enhance the computational efficiency in CP2K23 and compare them to the more complete def2-TZVPPD basis set results. Although the NLCC GTH-PP results are more transferable and yield excellent results for the hybrid functionals regardless of basis set, the B97M-rV functional exhibits a large PPIE. Hence we undertook a reoptimization process to enhance the accuracy of the NLCC parameters for this meta-GGA functional which brings it into line with the other DFT/basis set combinations. As we show below, the use of the NLCC correction improves overall accuracy over the full range of DFT functionals and basis sets considered here, minimizing errors across all data sets but especially for thermochemical properties that are most relevant for applications involving chemical reactivity.
(1) |
The nonlocal part is expressed as
(2) |
(3) |
The nonlinear core correction is given by
(4) |
(5) |
(6) |
Meta-GGA functionals stand out for their ability to enhance the precision of GGAs by incorporating local kinetic energy density considerations. This distinctive feature empowers meta-GGAs to provide a more refined treatment of diverse chemical bonds, including covalent, metallic, and weak bonds, in contrast to the capabilities of LDAs and GGAs.25,26
EmGGAxc[ρ, τ] = d3rf [ρ(r), ∇ρ(r), τ(r)] | (7) |
(8) |
However, in the CP2K code,18 an assumption is made that the kinetic energy density stemming from core orbitals is excluded from consideration, with only valence orbitals taken into account. Consequently, during meta-GGA calculations, the employed variables encompass the core charge augmented density and density gradient, accompanied by the kinetic energy density derived solely from valence orbitals. To address this limitation, we reoptimized the NLCC parameters for the meta-GGA B97M-rV functional.
In order to examine the NLCC correction for different chemical properties, we chose a number of benchmark data sets. We follow the convention outlined by Mardirossian and Head-Gordon32 to organize datasets into categories: (1) thermochemistry (TC), (2) barrier heights (BH), (3) isomerization energies (I), and (4) non-covalent interactions (NC), ranging from most difficult to easier in regards demands on the PP accuracy. Note however we are not using their MGDB84 dataset here. For the TC category we used the whole series of G2 datasets33–35 that evaluates the covalent bond formation energy, as well as the TAE140nonMR and TAE140MR36 for easy and difficult atomization energies, respectively. For the BH category we used HTBH38,37 NHTBH38,38 and WCPT27 (ref. 39) datasets representing hydrogen transfer barrier heights, non-hydrogen transfer barrier heights, and barrier heights of water-catalysed proton–transfer reactions, respectively. For the I category we used ID (isomerization energies “difficult”) and ISOMERIZATION20 (ref. 36) for isomerization energies; G21EA40,41 and G21IP40,41 for adiabatic electron affinities and ionization potentials of atoms and small molecules, respectively. For the NC category we used the NCED (non-covalent “easy” dimers) and S66 (ref. 42 and 43) describing binding energies between organic molecules and biomolecules, and a NCD (non-covalent “difficult”) CT20 (ref. 44) dataset for binding energies of charge-transfer complexes.
PP | GTH | GTH | GTH-NLCC | GTH-NLCC | |
---|---|---|---|---|---|
Basis set | Dataset | MOLOPT | def2-TZVPPD | MOLOPT | def2-TZVPPD |
PBE | G2 | 10.13 | 15.55 | 6.06 | 1.90 |
TAE140nonMR | 10.20 | 12.25 | 3.36 | 1.32 | |
TAE140MR | 12.61 | 9.92 | 2.61 | 1.70 | |
PBE0 | G2 | 6.68 | 11.05 | 1.80 | 5.68 |
TAE140nonMR | 7.69 | 8.81 | 2.34 | 3.53 | |
TAE140MR | 10.65 | 6.96 | 3.11 | 2.76 | |
ωB97X-V | G2 | 4.98 | 9.66 | 4.03 | 3.99 |
TAE140nonMR | 4.05 | 4.16 | 4.48 | 2.60 | |
TAE140MR | 8.25 | 4.35 | 4.26 | 4.82 |
To assess the errors incurred for the GTH-NLCC parameters optimized for PBE,10Table 1 and Fig. 1(a)–(c) shows a significant reduction in the MADs and MAPE regardless of basis set relative to the all-electron calculations. This correction effectively addresses the issue of underestimating the covalent interactions arising from the original GTH/MOLOPT parameterization. For the PBE functional, the GTH-NLCC combined with more complete basis set def2-TZVPPD yields the best results, with the MAD generally less than 2 kcal mol−1 and MAPE less than 1.2%. Such consistency with the all-electron calculation has been corroborated in the study conducted by Willand et al.10 where they use the standard G2-1 dataset containing only 54 molecules. However the results using GTH-NLCC/MOLOPT, the basis set relied upon in condensed phase AIMD simulation work, is an impressive reduction in overall error as well.
Often PPs developed for one DFT functional, typically PBE, are transferred for use with other DFT functionals. Table 1 and Fig. 1(a)–(c) show that the underestimation error incurred by PBE can be alleviated by utilizing higher hierarchical density functionals, such as hybrid PBE0 and ωB97X-V functionals that incorporate a fraction of non-local exact exchange typically derived from Hartree–Fock theory, better describing both localized electron–electron interactions such as covalent bonding and delocalized interactions such as dispersion forces and van der Waals interactions.45–48 And yet the hybrid functionals also benefit from the NLCC PPs transferred from PBE. Notably, the combination of GTH-NLCC and MOLOPT basis sets exhibits a remarkable capacity to faithfully reproduce all-electron outcomes, when compared to the larger basis sets used in electronic structure codes. This underscores the extensive utility of MOLOPT basis in practical computations, as it adeptly strikes a fine balance between precision and computational efficiency.
Table 2 and Fig. 2 depict the errors in atomization energies predicted by various combinations of PPs and basis sets when evaluated across the G2, TAE140nonMR, and TAE140MR datasets for the semi-empirical meta-GGA B97M-rV functional. Although the GTH-NLCC PP for PBE was highly transferable to the two hybrid functionals, one of which is also a semi-empirical functional developed by the same data-driven procedure,49 we find the transferability is quite poor for B97M-rV meta-GGA functional,21,22 yielding large MAD errors (>rbin 50 kcal mol−1) regardless of the completeness of the basis sets when compared to all-electron calculations (Table 2). Notably two clear trends emerge, demonstrating first that the original GTH/MOLOPT approach significantly underestimates the atomically covalent interaction, whereas the existing NLCC correction (GTH-NLCC-2013) exhibits an adverse and now substantial overestimation of the interaction. By comparing the MADs predicted by the original GTH/MOLOPT approach (Table S2†), it is also apparent that under the existing NLCC correction the lighter elements exhibit significantly larger MADs compared to their heavier counterparts. This disparity is likely attributed to the greater sensitivity to the core region for lighter elements, while conversely the stronger screening effect8 exhibited by heavier elements helps maintain the invariance of the charge density in the core region, resulting in lower MAD values. Nonetheless, we found it imperative in this case to reparameterize the NLCC PPs specifically for the B97M-rV functional. After reoptimization of the NLCC parameters at the level of B97M-rV for only the lighter main-group elements (B, C, N, O, and F), leaving the original GTH parameters in place for the heavier atoms, we get more consistent results with a MAD less than 6 kcal mol−1 using either the MOLOPT or more complete basis sets (Table 2 and Fig. 2).
PP | GTH | GTH | GTH-NLCC | GTH-NLCC | GTH-NLCC-OPT | GTH-NLCC-OPT |
---|---|---|---|---|---|---|
Basis set | MOLOPT | def2-TZVPPD | MOLOPT | def2-TZVPPD | MOLOPT | def2-TZVPPD |
Dataset | ||||||
G2 | 25.42 | 29.96 | 68.76 | 61.39 | 6.63 | 5.95 |
TAE140nonMR | 19.38 | 20.36 | 55.23 | 51.93 | 5.62 | 4.16 |
TAE140MR | 11.87 | 9.07 | 38.71 | 42.43 | 6.25 | 5.24 |
Additionally, we investigated the transferability and errors for the standard and NLCC PPs and basis set combinations for other thermochemical properties including electron affinities (EA) and ionization potentials (IP) derived from the G21EA and G21IP datasets41,50 as shown in Fig. 3. As detailed in Fig. S1 and S2,† ionization involves a larger energy scale than that found for electron affinity energies, as it is harder to remove an electron from the highest occupied molecular orbital (HOMO) than to attach an electron into the lowest unoccupied molecular orbital (LUMO) because the former is much deeper than the latter relative to the vacuum level. Overall we find that IPs are fairly insensitive to both PPs and basis set completeness for the pure GGA and hybrid DFT functionals, whereas the B97M-rV meta GGA benefits from NLCC optimization, after which it yields comparable results to the other DFT functionals and regardless of basis set. These figures also highlight a significant trend in which the EA is far more sensitive to the completeness of the basis set, and that the NLCC has a more minor corrective effect on the errors due to overbinding.
Table 3 provides an assessment of PP/basis set combinations for other non-reactive properties including binding energies of non-covalent interactions in organic molecules and biomolecules (S66),42,43 binding energies of charge–transfer complexes (CT20),44 isomerization energies (ISOMERIZATION20),36 hydrogen transfer barrier heights (HTBH38),37 non-hydrogen transfer barrier heights (NHTBH38),38 and barrier heights for water-catalyzed proton–transfer reactions (WCPT27).39 In Table 3 we consider the accuracy and transferability of the PBE PP to the other functionals. Clearly, the NLCC effect is negligible for these non-thermochemical properties as the chemical environment is not dramatically changed before and after the reaction and the spin polarized state is less involved compared to the atomization process. Similar to the other properties examined previously, the GTH-NLCC pseudopotential at the PBE level exhibits poor transferability to the B97M-rV functional, resulting in significantly worse results compared to the GTH/MOLOPT approach without correction. However, with utilization of optimized NLCC parameters for the B97M-rV functional, substantial improvements can be achieved for non-thermal properties as well.
Functional | PP/basis | S66 | CT20 | ISOMER20 | HTBH38 | NHTBH38 | WCPT27 |
---|---|---|---|---|---|---|---|
PBE | GTH/MOLOPT | 0.12 | 0.11 | 0.63 | 1.13 | 2.00 | 0.43 |
GTH/def2-TZVPPD | 0.06 | 0.04 | 0.99 | 0.95 | 1.00 | 0.90 | |
GTH-NLCC/MOLOPT | 0.14 | 0.10 | 1.57 | 0.38 | 1.84 | 0.49 | |
GTH-NLCC/def2-TZVPPD | 0.08 | 0.05 | 0.72 | 0.61 | 0.88 | 1.15 | |
PBE0 | GTH/MOLOPT | 0.22 | 0.12 | 0.74 | 0.83 | 1.67 | 0.44 |
GTH/def2-TZVPPD | 0.07 | 0.07 | 0.78 | 0.71 | 1.03 | 1.04 | |
GTH-NLCC/MOLOPT | 0.23 | 0.12 | 0.68 | 0.37 | 1.87 | 0.45 | |
GTH-NLCC/def2-TZVPPD | 0.12 | 0.08 | 0.78 | 0.76 | 1.47 | 1.31 | |
ωB97M-rV | GTH/MOLOPT | 0.25 | 0.11 | 0.89 | 0.52 | 1.41 | 0.62 |
GTH/def2-TZVPPD | 0.10 | 0.09 | 0.38 | 0.51 | 0.58 | 1.03 | |
GTH-NLCC/MOLOPT | 0.24 | 0.11 | 0.93 | 0.23 | 1.63 | 0.57 | |
GTH-NLCC/def2-TZVPPD | 0.11 | 0.08 | 0.37 | 0.51 | 1.03 | 1.28 | |
B97M-rV | GTH/MOLOPT | 0.10 | 0.14 | 1.56 | 1.36 | 2.19 | 0.50 |
GTH/def2-TZVPPD | 0.22 | 0.09 | 1.46 | 1.20 | 1.53 | 0.80 | |
GTH-NLCC-2013/MOLOPT | 0.30 | 0.21 | 5.74 | 3.96 | 2.40 | 2.05 | |
GTH-NLCC-2013/def2-TZVPPD | 0.24 | 0.17 | 4.67 | 4.00 | 2.06 | 2.70 | |
B97M-rV | GTH-NLCC-OPT/MOLOPT | 0.27 | 0.16 | 0.61 | 0.79 | 2.58 | 0.62 |
GTH-NLCC-OPT/def2-TZVPPD | 0.23 | 0.11 | 0.42 | 0.67 | 0.75 | 0.57 |
In this work we tested the transferability and accuracy of the original NLCC parameters to the GTH pseudopotential developed for the PBE GGA functional51 and associated with the TZV2P MOLOPT basis set,23 to hybrid DFT functionals including PBE0 (ref. 19) and the range-separated hybrid, meta-GGA ωB9X-V,20 as well as the B97M-rV meta-GGA functional. Our investigation of transferability and accuracy encompassed a wide range of both thermochemistry and non-thermochemistry datasets, and our findings indicate that the NLCC correction has the largest impact on atomization energies, effectively rectifying the original GTH/MOLOPT method and resulting in an error reduction of less than 1% when compared to all-electron calculations. Importantly, the energy-consistent Stuttgart–Dresden relativistic PPs52,53 have also been specifically tailored to faithfully reproduce quantities such as atomic excitation and ionization energies, and showcased remarkable congruence between PPs and all-electron outcomes pertaining to dissociation energies across an array of atoms, ions, and dimers54–59 a principle that is in alignment with the NLCC concept. However, we have previously addressed PP transferability errors, and in that work by Rossomme and co-workers we conducted a comprehensive comparison involving the Stuttgart–Dresden effective core potentials.17 Notably, Table S8† within that paper unveiled a significant insight: the performance of the Stuttgart–Dresden PP, even in conjunction with the def2-QZVPPD basis set, is comparable to the GTH results without element-specific “atomic” corrections.17
For all properties investigated, the existing NLCC parameters optimized at the level of PBE have good transferability to PBE0 and ωB97M-rV while maintaining satisfactory accuracy using the TZV2P MOLOPT basis sets, although electron affinities were found to be more sensitive to the completeness of basis sets than the NLCC correction. In contrast, the GTH-NLCC PP for PBE shows poor transferability to the meta-GGA B97M-rV functional, with significant deviations of over 50 kcal mol−1 in atomization energies when compared to all-electron calculations, thus requiring a reoptimization of the NLCC parameters specifically for this DFT functional. We note that similar PPIEs were found for the SCAN functional as described by Rossomme and co-workers,17 indicating that meta-GGA functionals may be unusually sensitive to PP replacement of core electrons. Fortunately, our optimization of the GTH-NLCC PP for B97M-rV overcame the drastic errors using the standard GTH/MOLOPT, in which optimization was only necessary for lighter second-row elements while heavier elements did not require any changes, which may be insightful for improving other meta-GGA functionals. We note that the errors in the PPs for the meta-GGA may arise from the assumption made regarding the kinetic energy density, which neglects the contribution of the core charge and focuses solely on the valence density (which is also an assumption in the CP2K program). This highlights the need, and opportunity, for developing an NLCC correction for the kinetic energy density since meta-GGAs may offer better DFT accuracy at an affordable cost.
In addition to transferability, the GTH-NLCC PPs combined with the MOLOPT basis sets performed overall as well as the complete basis set calculations for PBE and hybrid functionals, and comparably for B97M-rV after optimization of its non-transferable GTH-NLCC PP. Thus our conclusion is that the GTH-NLCC PP/MOLOPT combination can be reliably employed in large-scale systems, effectively reducing computational costs while maintaining good accuracy. In summary, we believe these findings contribute to a better understanding of the sources of error in calculations using PPs and provide a protocol for enhancing the reliability of other DFT functionals when combined with chosen PPs and basis sets.
Footnote |
† Electronic supplementary information (ESI) available: Electronic configurations of core and valence regions for pseudopotentials of elements studied; comparison of mean absolute deviations for a specific atom type within the G2 dataset; distributions of MADs and all-electron calculations for IP and EA datasets; NLCC performance on non-thermochemical properties. See DOI: https://doi.org/10.1039/d3sc03709f |
This journal is © The Royal Society of Chemistry 2023 |