Minho
Kim
^{a},
Tim
Gould
^{b},
Dario
Rocca
^{a} and
Sébastien
Lebègue
*^{a}
^{a}Université de Lorraine and CNRS, LPCT, UMR 7019, Vandoeuvre-lès-Nancy 54506, France. E-mail: sebastien.lebegue@univ-lorraine.fr
^{b}Queensland Micro- and Nanotechnology Centre, Griffith University, Nathan, Queensland 4111, Australia

Received
4th August 2020
, Accepted 12th September 2020

First published on 17th September 2020

Biomolecules have complex structures, and noncovalent interactions are crucial to determine their conformations and functionalities. It is therefore critical to be able to describe them in an accurate but efficient manner in these systems. In this context density functional theory (DFT) could provide a powerful tool to simulate biological matter either directly for relatively simple systems or coupled with classical simulations like the QM/MM (quantum mechanics/molecular mechanics) approach. Additionally, DFT could play a fundamental role to fit the parameters of classical force fields or to train machine learning potentials to perform large scale molecular dynamics simulations of biological systems. Yet, local or semi-local approximations used in DFT cannot describe van der Waals (vdW) interactions, one of the essential noncovalent interactions in biomolecules, since they lack a proper description of long range correlation effects. However, many efficient and reasonably accurate methods are now available for the description of van der Waals interactions within DFT. In this work, we establish the accuracy of several state-of-the-art vdW-aware functionals by considering 275 biomolecules including interacting DNA and RNA bases, peptides and biological inhibitors and compare our results for the energy with highly accurate wavefunction based calculations. Most methods considered here can achieve close to predictive accuracy. In particular, the non-local vdW-DF2 functional is revealed to be the best performer for biomolecules, while among the vdW-corrected DFT methods, uMBD is also recommended as a less accurate but faster alternative.

However, one persistent problem of density functional theory is the description of van der Waals (vdW) interactions, which are key to the structure and stability of many biological entities, such as π–π base stacking interaction,^{2,4} base–backbone interaction,^{3} polypeptide folding^{1} and amino acid–nucleic acid complex interaction.^{11,12} Numerous efforts have been made to improve the accuracy of density functional theory in describing quantitatively vdW interactions.^{13–24} The most straightforward solution is to add a dipole–dipole correction to a local or semi-local exchange–correlation functional. More complex vdW density functionals^{19–24} introduce a nonlocal exchange–correlation kernel to take into account nonlocal correlation effects. Methods involving many-body interactions^{15–18} allow the description of collective effects that are missed in other theories.

While the precision of these methods for molecules and solids is now well established,^{25,26} their performance in the context of biological systems has not yet been comprehensively assessed, except for a single or few categories of biomolecules.^{27–31} This situation makes choosing a suitable method difficult^{28} in a biological context. There is thus an urgent need to assess the performance of state-of-the-art methods using an integrated benchmark set of biomolecules.

Accuracy in biological systems is not the only factor that constrains the choice of approximation, however. Increasing computational cost of hybrid functionals and double-hybrid functionals limits their general applicability. Cost issues are especially important in periodic systems, where the Hartree–Fock hole can become highly delocalized which leads to very slow convergence of very expensive calculations. Some functionals, like B3LYP, are limited in scope and perform very poorly in small gap solid state systems.^{32}

These high costs or methodological failures make it difficult to apply many approximations to biological/nanoparticle (bio/nano) interfaces, which mix solid state and molecular physics. Such systems are of increasing interest to researchers because of their potential applications in medical applications. There is thus an urgent need to ensure that GGA approaches (e.g., PBE) that are known to treat both molecular and solid state systems in a consistent fashion can be enhanced to reproduce weak dispersion forces.

In this work, we present an extensive evaluation of several state-of-the-art DFT-related methods used to compute the energy (binding or conformation) for various biomolecular systems ranging from peptides and nucleotide base pairs and backbones to biological inhibitors with macrocycles, as contained in four datasets. Our paper is organized as follows: after a short reminder of the different testsets and methods, our results are presented and discussed, while in the last part we offer our conclusions.

We chose PBE as a base functional due to its high speed and versatility in treating solid state and molecular systems. This versatility is not shared by methods like B3LYP which typically outperform PBE in molecules but perform very poorly in solids.^{32} Poor performance in solids means that otherwise successful molecular approximation cannot be used to treat the difficult bio/nano systems that will be of increasing scientific interest.

The vdW-corrected DFT methods compensate for the lack of long range correlation in local or semi-local DFT by simply adding a dispersion correction, E_{disp} to the DFT energy:

E_{total} = E_{DFT} + E_{disp} | (1) |

(2) |

While the pairwise methods are successful for small molecules and molecular crystals, for larger systems many-body interactions become significant.^{44} The many-body dispersion (MBD) method^{15} provides an efficient way to evaluate higher-order many-body interactions based on the coupled fluctuating dipole model (CFDM),^{45} where atoms are depicted as a collection of coupled quantum harmonic oscillators and whose Hamiltonian is written as:

(3) |

A nonlocal vdW density functional, on the other hand, attempts to design an explicit expression of the nonlocal correlation, which has the form of

(4) |

E_{xc}[n] = E_{x}[n] + E^{0}_{c}[n] + E^{nl}_{c}[n] | (5) |

First, we evaluated the accuracy of the different methods for individual pair-stacking energies of inter-base pairs, ΔE_{stack,ij}, in terms of intrastrand sandwich stacking and interstrand parallel-displaced stacking as shown in Fig. 1A. PBE shows the largest error (MAD = 4.91 kcal mol^{−1}, see Fig. 1B) among the tested methods, and a high NMAD (0.99) close to unity indicates that the error of PBE is as large as the average binding energy of the inter-base pairs. It is notable that PBE shows 4-fold poorer estimation for intrastrand stacking interaction (MAD = 6.40 kcal mol^{−1}) than for interstrand interaction (MAD = 1.42 kcal mol^{−1}) as shown in Fig. S1 (ESI†), indicating that the errors of PBE mostly are attributed to the lack of dispersion-type attraction between aromatic rings. The lack of dispersion in PBE also results in its underbinding nature in the positive mean deviation (MD = +3.91 kcal mol^{−1}). The meta-GGA functionals, M06-2X and M11, without dispersion correction also show large errors (MAD = 1.02 and 1.81 kcal mol^{−1})^{29} (Table S1, ESI†), since their kinetic energy dependence of nonlocal correlation still lacks long-range correlation as discussed before.^{55}

The D3, TS and TS + SCS effectively correct the lack of dispersion in PBE and reduce the MAD to around 0.42 kcal mol^{−1}, which is nine times smaller than PBE. TS and TS + SCS slightly overestimate 2-body dispersion-type attraction as shown in their negative MDs (MD = −0.41 kcal mol^{−1} by both of TS and TS + SCS) because of the lack of many-body screening, while D3 shows mild underbinding trend (MD = +0.34 kcal mol^{−1}).

The many-body approaches further improve the binding energies of interstrand parallel displaced stacking by including repulsive many-body interaction (MD = −0.19 kcal mol^{−1} by MBD@rsSCS, MD = −0.15 kcal mol^{−1} by MBD@rsSCS/FI, and MD = −0.11 kcal mol^{−1} by uMBD), while they incur somewhat underestimated binding of intrastrand sandwich stacking structures (MD = +1.29 kcal mol^{−1} by MBD@rsSCS, MD = +1.20 kcal mol^{−1} by MBD@rsSCS/FI, and MD = +0.79 kcal mol^{−1} by uMBD) as shown in Fig. S1 (ESI†). Overall, the MBD@rsSCS (MAD = 0.49 kcal mol^{−1}) and MBD@rsSCS/FI (MAD = 0.47 kcal mol^{−1}) show comparable accuracy with the pairwise methods, while only uMBD (MAD = 0.31 kcal mol^{−1}) shows better accuracy than any other vdW-corrected DFT methods. The pairwise D3 method also shares a slightly underbinding trend with the many-body methods, since it employs a unique scaling of C_{6} as a function of coordination number.

Most of the nonlocal vdW density functionals further enhance the accuracy of pairwise stacking energies by a factor of 2 in comparison with the vdW-corrected DFT methods. Among the tested methods, rev-vdW-DF2 shows the best MAD in overall (MAD = 0.22 kcal mol^{−1}), which is comparable to the best of hybrid and long-range corrected hybrid functional, such as B3LYP-D3(BJ) (MAD = 0.16 kcal mol^{−1}) and ωB97M-V (MAD = 0.18 kcal mol^{−1})^{29} while vdW-DF2 (MAD = 0.25 kcal mol^{−1}) and SCAN + rVV10 (MAD = 0.23 kcal mol^{−1}) are also comparable to rev-vdW-DF2. It is notable that only vdW-DF-cx (MAD = 0.44 kcal mol^{−1}) shows an error as large as the vdW-corrected DFT methods, and it mostly overbinds the inter-base pairs as shown in its negative MD (MD = −0.44 kcal mol^{−1}).

Next, we compare the accuracy of the methods for the 10 B-DNA base-pair steps based on their full four-body stacking energy, ΔE_{4stack} shown in Fig. 2 as well as its pairwise stacking contribution, , and 4-body nonadditivity contribution, ΔE_{ABCD} = ΔE_{4stack} − ΔE_{stack} shown in Fig. S2 (ESI†).

As shown in Fig. 2B, PBE shows unacceptably large errors for ΔE_{4stack} (MAD = 15.82 kcal mol^{−1}). Based on the decomposition of ΔE_{4stack}, most of the PBE errors can be attributed to the pairwise stacking contributions, ΔE_{stack} (MAD = 15.64 kcal mol^{−1}), where additive pairwise dispersion energy are dominant. The vdW-corrected DFT methods reduce the errors by factor of 8 to 10, and uMBD shows the best performance (MAD = 1.43 kcal mol^{−1}) among them. It is notable that pairwise TS and TS + SCS equally overestimate the attraction in the pairwise stacking contribution, ΔE_{stack} (MD = −1.65 kcal mol^{−1} by TS and −1.64 kcal mol^{−1} by TS + SCS), while TS appears to be more repulsive ΔE_{ABCD} (MD = +0.18 kcal mol^{−1} by TS and −0.13 kcal mol^{−1} by TS + SCS) as shown in Fig. S2 (ESI†). Therefore, we can conclude that the long-range electrodynamic screening effect becomes more significant from the 2-body pairwise stacking to 4-body stacking, and this contributes to ΔE_{ABCD}. The D3 method, on the other hand, shows repulsive character in both pairwise stacking and 4-body nonadditive contributions, which is also a shared feature with many-body methods as in the inter-base pairs.

Many-body methods have a more repulsive character than the pairwise methods, showing an underestimated attraction in ΔE_{stack} (MD = +1.64 kcal mol^{−1} by MBD@rsSCS and +1.69 kcal mol^{−1} by MBD@rsSCS/FI) and an overestimated repulsion in ΔE_{ABCD} (MD = +0.61 kcal mol^{−1} by MBD@rsSCS and +0.63 kcal mol^{−1} by MBD@rsSCS/FI) as shown in Fig. S2 (ESI†). As a result, they tend to more underbind the base-pair step than the pairwise methods, giving MD = +2.02 kcal mol^{−1}, +2.08 kcal mol^{−1}, and +1.43 kcal mol^{−1} by MBD@rsSCS, MBD@rsSCS/FI, and uMBD respectively. By replacing the point-dipole CFDM to the smeared-out dipole CFSDM, uMBD mostly corrects the short-to-moderate range interaction of ΔE_{stack} (MD = +1.06 kcal mol^{−1} by uMBD), while having a comparable MD (+0.60 kcal mol^{−1}) for ΔE_{ABCD} with other many-body methods.

Most of the nonlocal density functionals further improve the accuracy, and rev-vdW-DF2 shows the best performance (MAD = 0.15 kcal mol^{−1}). vdW-DF2 and SCAN + rVV10 show mild errors (MAD = 0.96 kcal mol^{−1} by vdW-DF2 and 0.69 kcal mol^{−1} by SCAN + rVV10), and only vdW-DF-cx shows as large error (MAD = 1.73 kcal mol^{−1}) as the vdW-corrected DFT methods. Based on the decomposition of ΔE_{4stack} in Fig. S2 (ESI†), the errors in vdW-DF-cx come from overestimated attraction in the pairwise stacking energies (MD = −1.74 kcal mol^{−1}), which was also pointed out from the individual inter-base pair subset as discussed above. On the other hand, rev-vdW-DF2 underestimates the 4-body nonadditive repulsion (MD = −0.49 kcal mol^{−1} for ΔE_{ABCD}) and pairwise stacking attraction (MD = +0.86 kcal mol^{−1} for ΔE_{4stack}), leading to an error cancellation in total.

Overall, the errors of the tested methods mostly show extensive properties. The MADs of the tested methods increase from 2-body inter-base pairs to 4-body base-pair steps by a factor of 3 to 4, but their NMADs remain comparable. It is notable that only rev-vdW-DF2 shows a marginal dependence on the size of system with comparable MADs for the inter-base pairs (0.22 kcal mol^{−1}) and the base-pair steps (0.15 kcal mol^{−1}).

As shown in Fig. 3, PBE shows the largest error among all of the tested methods in case of both vs. the average conformation energy and vs. the most stable conformer. Both of the vdW-corrected DFT methods and the nonlocal vdW density functionals largely reduces the MAD and NMAD. However, in case of vs. thermal average energy (Fig. 3D), pairwise TS and TS + SCS have comparable or larger MAD (1.53 and 1.85 kcal mol^{−1}, respectively) than PBE (MAD = 1.84 kcal mol^{−1}), since they predict the 5z conformation as the most stable and far more stable than the 2p conformation by 1.72 kcal mol^{−1} and 1.90 kcal mol^{−1}, respectively. All of the nonlocal density functionals also predict the 5z conformation as the most stable but more stable than the 2p conformation by a difference within 1 kcal mol^{−1}, giving smaller MAD than TS and TS + SCS.

Regardless of the definition of the conformation energy, rev-vdW-DF2 generally shows the best performance in the UpU23 set (MAD = 0.32 kcal mol^{−1}vs. average conformation energy, 0.38 kcal mol^{−1}vs. the most stable conformer, and 0.31 kcal mol^{−1}vs. thermal average energy, respectively), which has 5-fold to 6-fold smaller MADs than PBE, and better than any other (long-range-corrected) hybrid and meta-GGA functionals^{28} as shown in Table S2 (ESI†). uMBD and SCAN + rVV10 also show comparable MADs (MAD = 0.35, 0.37, and 0.39 kcal mol^{−1} by uMBD and 0.34, 0.40 and 0.34 kcal mol^{−1} by SCAN + rVV10, respectively) with rev-vdW-DF2.

Among the vdW-corrected DFT methods, uMBD shows the best performance over the tested methods and hybrid and meta-GGA functionals in Table S2 (ESI†), regardless of the definition of conformation energies, while other many-body methods, MBD@rsSCS/FI and MBD@rsSCS, are also as accurate as uMBD within a MAD difference of 0.1 kcal mol^{−1} (MAD = 0.42, 0.43 and 0.43 kcal mol^{−1}, respectively, by MBD@rsSCS/FI, and MAD = 0.44, 0.45 and 0.45 kcal mol^{−1}, respectively, by MBD@rsSCS). The pairwise D3 method also shows a comparable accuracy (MAD = 0.46, 0.50 and 0.47 kcal mol^{−1}, respectively) to the many-body methods. Many-body vdW-corrected DFT methods and D3 predict the 2p conformation as the most stable in agreement with CCSD(T) (see Fig. S4, ESI†), and therefore they show 4 times smaller MAD than TS and TS + SCS in terms of thermal average energy. It is also notable that the D3 correction combined with M06 deteriorates its MAD regardless of the definition of the conformation energy, due to the problem of this functional with the prediction of nonequilibrium geometries, as discussed before.^{28}

As shown in Fig. 4B–D, regardless of the definition of conformation energies, PBE shows the largest error among the tested methods (MAD = 1.62 kcal mol^{−1}vs. average conformation energy, 3.66 kcal mol^{−1}vs. the most stable conformer, and 2.32 kcal mol^{−1}vs. thermal averaged energy, respectively), comparable to B3LYP as shown in Table S3 (ESI†). On the other hand, nonlocal vdW density functionals show a better performance than the vdW-corrected DFT methods. Interestingly, the pairwise TS (MAD = 0.93, 1.25 and 1.08 kcal mol^{−1}, respectively) and TS + SCS methods (MAD = 0.92, 1.18 and 0.98 kcal mol^{−1}, respectively) are slightly better than D3 (MAD = 0.94, 1.43 and 1.23 kcal mol^{−1}, respectively) as well as many-body MBD@rsSCS (MAD = 0.96, 1.57 and 1.30 kcal mol^{−1}, respectively) and MBD@rsSCS/FI (MAD = 0.94, 1.54 and 1.29 kcal mol^{−1}, respectively). The uMBD also shows comparable or slightly larger MAD (0.89, 1.32 and 1.17 kcal mol^{−1}, respectively) than the pairwise methods. D3 shows a better performance when coupled with ωB97X, while it worsens the accuracy of the M06 and M06-2X functionals^{28} (see Table S3, ESI†). Nonlocal vdW density functionals show comparable accuracy with the hybrid meta-GGA M06 and M06-2X functionals. Among the nonlocal vdW density functionals, vdW-DF2 shows the best performance (MAD = 0.35, 0.43 and 0.35 kcal mol^{−1}, respectively), and SCAN + rVV10 (MAD = 0.53, 0.58 and 0.56 kcal mol^{−1}, respectively) is the second best. rev-vdW-DF2 (MAD = 0.57, 0.72 and 0.67 kcal mol^{−1}, respectively) and vdW-DF-cx (MAD = 0.60, 0.80 and 0.69 kcal mol^{−1}, respectively) have slightly larger errors than the other nonlocal functionals, but still better than the vdW-corrected DFT methods.

According to the comparison of individual relative energies of Phe-Gly-Gly tripeptides in Fig. S6 (ESI†), the vdW-corrected DFT methods and most of the nonlocal van der Waals density functionals share common features. To discuss this, we can categorize the Phe-Gly-Gly conformations into three groups according to their structures in Fig. S5A (ESI†). (i) Stretched subset: 366, 300 and 691st local minima structures in the AMBER 1500 conformation set,^{33} which has large separation between phenyl group and carboxylic end, and thereby their pairwise dispersion interaction is the smallest among the 11 conformations, making them less stable than 99 based on the vdW-corrected DFT methods. (ii) Folded subset: 444, 215, 357, 114 and 224th local minima structures in the AMBER 1500 conformation set, which have the most folded conformation with a common hydrogen bond between the OH of carboxylic acid and the O of the middle Gly unit (green dashed line in Fig. S5A, ESI†) that gives them a more folded carboxylic end than 99, 412 and 470 and thereby predicted to have the strongest pairwise attraction-type dispersion interaction. (iii) Intermediate subset: 99, 412 and 470th local minima structures in the AMBER 1500 conformation set, which have comparably folded conformations with the Folded subset, but they don’t make a hydrogen bond at the carboxylic end and thereby predicted to be less stable than the Folded subset based on most of the tested methods except for vdW-DF2 and SCAN + rVV10. For the Folded and Intermediate subset, their results are well correlated with CCSD(T) in case of vs. the most stable conformer, while for the Folded subset, their results are well correlated with CCSD(T) in case of vs. thermal average energy. vdW-DF2 is the only method that shows a unique trend strongly correlated with the CCSD(T) results and the smallest MAD as mentioned before. SCAN + rVV10 shows the similar trend with vdW-DF-cx and rev-vdW-DF2, but its prediction of the most stable conformer, 99, is slightly better than them and thereby leading it to be the second best in the PCONF21 set.

As shown in Fig. 5, PBE shows the largest error for predicting conformation energies relative to the average conformation energy (MAD = 1.78 kcal mol^{−1} and NMAD = 0.35), the energy of the most stable conformer (MAD = 2.55 kcal mol^{−1} and NMAD = 0.21), and the thermal average energy (MAD = 2.04 kcal mol^{−1} and NMAD = 0.16).

Among the vdW-corrected DFT methods, the D3 method shows the smallest error (MAD = 0.75 kcal mol^{−1} and NMAD = 0.16 vs. average conformation energy, MAD = 1.12 kcal mol^{−1} and NMAD = 0.09 vs. the most stable conformer, and MAD = 1.04 kcal mol^{−1} and NMAD = 0.09 vs. thermal average energy, respectively), which is also comparable to the many-body MBD@rsSCS, MBD@rsSCS/FI and uMBD (MAD = 0.79, 1.16 and 1.09 kcal mol^{−1} by MBD@rsSCS, MAD = 0.79, 1.15 and 1.08 kcal mol^{−1} by MBD@rsSCS/FI, and MAD = 0.84, 1.21 and 1.15 kcal mol^{−1}, respectively) within a MAD difference of 0.1 kcal mol^{−1}. Both of TS and TS + SCS shows larger errors (MAD = 1.08, 1.43 and 1.35 kcal mol^{−1} by TS and MAD = 1.02, 1.34 and 1.27 kcal mol^{−1} by TS + SCS, respectively) than any other vdW-corrected DFT methods and nonlocal vdW density functionals, which are attributed to their lack of many-body interaction. The M06-2X and M06L functionals show comparable accuracy with vdW-corrected DFT methods,^{28} while the inclusion of the D3 correction in these functionals again slightly deteriorates their accuracy as shown in Table S4 (ESI†). On the other hand, B3LYP with D3 correction can improve its accuracy (MAD = 0.83 kcal mol^{−1} and NMAD = 0.06 vs. average conformation energy, MAD = 0.57 kcal mol^{−1} and NMAD = 0.13 vs. the most stable conformer, and MAD = 0.78 kcal mol^{−1} and NMAD = 0.06 vs. thermal average energy, respectively) more than any other functionals considered in this work.

Among the nonlocal vdW density functionals, rev-vdW-DF2 shows the best performance (MAD = 0.61, 0.93 and 0.88 kcal mol^{−1}, respectively) and vdW-DF-cx also shows a slightly better accuracy (MAD = 0.64, 0.93 and 0.87 kcal mol^{−1}) than the vdW-corrected DFT methods. vdW-DF2 shows a comparable accuracy (MAD = 0.72, 1.14 and 1.08 kcal mol^{−1}) with the many-body vdW-corrected DFT methods. It is notable that only SCAN + rVV10 shows larger MAD than most of the vdW-corrected DFT methods when the conformer energies are defined with respect to the energy of the most stable conformer and the thermal average energy. As shown in Fig. S8 (ESI†), the large MAD of SCAN + rVV10 is attributed to the conformations of SANGLI, where SCAN + rVV10 overestimates the stability of the most state conformer as shown in Fig. S9 (ESI†).

rev-vdW-DF2 works better than the other tested functionals in most of benchmark sets except for PCONF21 set, where vdW-DF2 shows unique accuracy over other methods, and it also has comparable MAD with the best of hybrid and meta-GGA functionals within a 0.3 kcal mol^{−1} difference. Of the vdW-corrected DFT methods, the pairwise D3 and the many-body uMBD method individually reach comparable accuracy to the best vdW-corrected DFT method regardless of the systems. Including a dispersion correction like in vdW-corrected DFT methods mostly improves the accuracy of PBE and hybrid functionals, while it makes worse in case of hybrid meta-GGA functionals, indicating that the vdW-corrected DFT methods require a proper choice of a base functional for further improvements.

The radar plot of the MADs in Fig. 6 illustrates how uniformly a method conserves its accuracy for diverse benchmark sets. According to the radar plots, vdW-corrected DFT methods improves the accuracy of DFT-PBE comparably to each other, but the pairwise TS and TS + SCS show as large error of conformation energy relative to the thermal average energy as DFT-PBE in the UpU23 set. The nonlocal vdW density functionals show further improvements than vdW-corrected methods. While the vdW-corrected DFT methods show larger MADs than chemical (∼1 kcal mol^{−1}) accuracy in 2 to 3 benchmark sets, most of the nonlocal vdW density functionals reach chemical accuracy except for at most 1 benchmark set.

To comprehend the accuracy of the tested methods in the overall benchmark set, we introduce the weighted total mean absolute deviation (WTMAD-2), which is defined in the previous work on the GMTKN55 benchmark.^{28} Since DNA inter-base pair and DNA base-pair step have different energy scales, the WTMAD-2 was averaged over 5 benchmark sets – the DNA inter-base pairs, the DNA base-pair steps, the UpU23 set, the PCONF21 set and the macrocycles in MPCONF196 set.

According to the WTMAD-2, the nonlocal vdW density functionals have 4.2 to 6.5 times smaller WTMAD-2's than PBE, which are always better than vdW-corrected DFT methods and fall into the chemical accuracy (1 kcal mol^{−1}) regardless of the definition of conformation energy. Among them, vdW-DF2 and rev-vdW-DF2 show the best accuracy (WTMAD-2 = 0.78, 0.74 and 0.76 kcal mol^{−1}vs. average conformation energy, vs. the most stable conformer and vs. thermal average energy, respectively, by vdW-DF2 and WTMAD-2 = 0.82, 0.77 and 0.79 kcal mol^{−1}, respectively, by rev-vdW-DF2), though all of nonlocal functionals have comparable WTMAD-2 within 0.25 kcal mol^{−1}. As shown in the radar plot of Fig. 6 and the MADs in Fig. 1–5, vdW-DF2 and rev-vdW-DF2 have comparable accuracy to each other, but more specifically, vdW-DF2 show better accuracy in PCONF21 set, while rev-vdW-DF2 works better otherwise. SCAN + rVV10 also shows comparable accuracy (WTMAD-2 = 0.85, 0.84 and 0.86 kcal mol^{−1}, respectively) with vdW-DF2 and rev-vdW-DF2, while vdW-DF-cx shows slightly larger WTMAD-2 (0.97, 0.97 and 0.96 kcal mol^{−1}, respectively) yet still smaller than the vdW-corrected DFT methods.

The vdW-corrected DFT methods show slightly larger WTMAD-2 than those of the nonlocal functionals as well as the chemical accuracy, yet they still have reasonable accuracy that is 2.6 to 4.0 times better than DFT-PBE. Among the vdW-corrected DFT methods, uMBD shows the smallest WTMAD-2 (1.20, 1.21 and 1.25 kcal mol^{−1}, respectively), while the WTMAD-2's of other vdW-corrected DFT methods are also comparable within 0.24 kcal mol^{−1}.

Of the tested methods, vdW-DF2 shows the best performance for the relative energetics of conformers over other state-of-the-art vdW methods and higher rungs of density functionals. Among the vdW-corrected DFT methods, uMBD and D3 show a comparable accuracy to the nonlocal vdW density functionals for biomolecular interactions. It should be noted, however, that D3 does not perform well in metals and some other solids,^{18,59,60} meaning caution should be advised when using it to study interactions between biomolecules and solids. In this regard, nonlocal vdW density functionals seem to perform reasonably,^{61,62} although their use for large systems remain limited.

Overall, we anticipate that our findings will be helpful to researchers using DFT to study the interactions between biomolecules, and especially the dynamics of solvent-biomolecules systems.^{6} It should also be useful for developing machine-learned force fields,^{63} where good treatment of all interactions is important.

- A. Tkatchenko, M. Rossi, V. Blum, J. Ireta and M. Scheffler, Phys. Rev. Lett., 2011, 106, 118102 Search PubMed .
- J. Šponer, J. E. Šponer, A. Mládek, P. Jurečka, P. Banáš and M. Otyepka, Biopolymers, 2013, 99, 978–988 Search PubMed .
- M. Chawla, E. Chermak, Q. Zhang, J. M. Bujnicki, R. Oliva and L. Cavallo, Nucleic Acids Res., 2017, 45, 11019–11032 Search PubMed .
- O. B. Ol’ha, K. S. Tsiupa and D. M. Hovorun, Sci. Rep., 2018, 8, 10371 Search PubMed .
- P. K. Biswas and S. Chakraborty, Nucleic Acids Res., 2019, 47, 2757–2765 Search PubMed .
- M. Stöhr and A. Tkatchenko, Sci. Adv., 2019, 5, eaax0024 Search PubMed .
- V. Genna, M. Marcia and M. D. Vivo, J. Am. Chem. Soc., 2019, 141, 10770–10776 Search PubMed .
- L. O. Jones, M. A. Mosquera, G. C. Schatz and M. A. Ratner, J. Am. Chem. Soc., 2020, 142, 3281–3295 Search PubMed .
- A. Lunghi and S. Sanvito, Sci. Adv., 2019, 5, eaaw2210 Search PubMed .
- R. Galvelis, S. Doerr, J. M. Damas, M. J. Harvey and G. De Fabritiis, J. Chem. Inf. Model., 2019, 59, 3485–3493 Search PubMed .
- N. M. Luscombe, R. A. Laskowski and J. M. Thornton, Nucleic Acids Res., 2001, 29, 2860–2874 Search PubMed .
- S. Jones, D. T. Daley, N. M. Luscombe, H. M. Berman and J. M. Thornton, Nucleic Acids Res., 2001, 29, 943–954 Search PubMed .
- S. Grimme, S. Ehrlich and L. Goerigk, J. Comput. Chem., 2011, 32, 1456–1465 Search PubMed .
- A. Tkatchenko and M. Scheffler, Phys. Rev. Lett., 2009, 102, 073005 Search PubMed .
- A. Tkatchenko, R. A. DiStasio Jr, R. Car and M. Scheffler, Phys. Rev. Lett., 2012, 108, 236402 Search PubMed .
- A. Ambrosetti, A. M. Reilly, R. A. DiStasio Jr and A. Tkatchenko, J. Chem. Phys., 2014, 140, 18A508 Search PubMed .
- T. Gould, S. Lebègue, J. G. Ángyán and T. Bučko, J. Chem. Theory Comput., 2016, 12, 5920–5930 Search PubMed .
- M. Kim, W. J. Kim, T. Gould, E. K. Lee, S. Lebègue and H. Kim, J. Am. Chem. Soc., 2020, 142, 2346–2354 Search PubMed .
- M. Dion, H. Rydberg, E. Schröder, D. C. Langreth and B. I. Lundqvist, Phys. Rev. Lett., 2004, 92, 246401 Search PubMed .
- K. Lee, É. D. Murray, L. Kong, B. I. Lundqvist and D. C. Langreth, Phys. Rev. B: Condens. Matter Mater. Phys., 2010, 82, 081101 Search PubMed .
- R. Sabatini, T. Gorni and S. De Gironcoli, Phys. Rev. B: Condens. Matter Mater. Phys., 2013, 87, 041108 Search PubMed .
- I. Hamada, Phys. Rev. B: Condens. Matter Mater. Phys., 2014, 89, 121103 Search PubMed .
- K. Berland and P. Hyldgaard, Phys. Rev. B: Condens. Matter Mater. Phys., 2014, 89, 035412 Search PubMed .
- H. Peng, Z.-H. Yang, J. P. Perdew and J. Sun, Phys. Rev. X, 2016, 6, 041005 Search PubMed .
- M. Kim, W. J. Kim, E. K. Lee, S. Lebegue and H. Kim, Int. J. Quantum Chem., 2016, 116, 598–607 Search PubMed .
- F. Tran, L. Kalantari, B. Traoré, X. Rocquefelte and P. Blaha, Phys. Rev. Mater., 2019, 3, 063602 Search PubMed .
- H. Kruse, A. Mladek, K. Gkionis, A. Hansen, S. Grimme and J. Šponer, J. Chem. Theory Comput., 2015, 11, 4972–4991 Search PubMed .
- L. Goerigk, A. Hansen, C. Bauer, S. Ehrlich, A. Najibi and S. Grimme, Phys. Chem. Chem. Phys., 2017, 19, 32184–32215 Search PubMed .
- H. Kruse, P. Banas and J. Šponer, J. Chem. Theory Comput., 2018, 15, 95–115 Search PubMed .
- J. Rezac, D. Bm, O. Gutten and L. Rulisek, J. Chem. Theory Comput., 2018, 14, 1254–1266 Search PubMed .
- J. Antony and S. Grimme, Phys. Chem. Chem. Phys., 2006, 8, 5287–5293 Search PubMed .
- J. Paier, M. Marsman and G. Kresse, J. Chem. Phys., 2007, 127, 024103 Search PubMed .
- D. Řeha, H. Valdes, J. Vondrášek, P. Hobza, A. Abu-Riziq, B. Crews and M. S. de Vries, Chem. – Eur. J., 2005, 11, 6803–6817 Search PubMed .
- L. Goerigk, A. Karton, J. M. Martin and L. Radom, Phys. Chem. Chem. Phys., 2013, 15, 7028–7031 Search PubMed .
- J. P. Perdew, K. Burke and M. Ernzerhof, Phys. Rev. Lett., 1996, 77, 3865 Search PubMed .
- C. Lee, W. Yang and R. G. Parr, Phys. Rev. B: Condens. Matter Mater. Phys., 1988, 37, 785–789 Search PubMed .
- A. D. Becke, J. Chem. Phys., 1993, 98, 5648 Search PubMed .
- P. J. Stephens, F. Devlin, C. Chabalowski and M. J. Frisch, J. Phys. Chem., 1994, 98, 11623–11627 Search PubMed .
- J.-D. Chai and M. Head-Gordon, Phys. Chem. Chem. Phys., 2008, 10, 6615–6620 Search PubMed .
- Y. Zhao and D. G. Truhlar, J. Chem. Phys., 2006, 125, 194101 Search PubMed .
- Y. Zhao and D. G. Truhlar, Theor. Chem. Acc., 2008, 120, 215–241 Search PubMed .
- R. Peverati and D. G. Truhlar, J. Phys. Chem. Lett., 2011, 2, 2810–2817 Search PubMed .
- N. Mardirossian and M. Head-Gordon, J. Chem. Phys., 2016, 144, 214110 Search PubMed .
- J. F. Dobson, Int. J. Quantum Chem., 2014, 114, 1157–1161 Search PubMed .
- A. Donchev, J. Chem. Phys., 2006, 125, 074713 Search PubMed .
- J. P. Perdew and W. Yue, Phys. Rev. B: Condens. Matter Mater. Phys., 1986, 33, 8800 Search PubMed .
- E. D. Murray, K. Lee and D. C. Langreth, J. Chem. Theory Comput., 2009, 5, 2754–2762 Search PubMed .
- A. Becke, J. Chem. Phys., 1986, 85, 7184–7187 Search PubMed .
- J. Sun, A. Ruzsinszky and J. P. Perdew, Phys. Rev. Lett., 2015, 115, 036402 Search PubMed .
- G. Kresse and J. Hafner, Phys. Rev. B: Condens. Matter Mater. Phys., 1993, 47, 558 Search PubMed .
- G. Kresse and J. Furthmüller, Phys. Rev. B: Condens. Matter Mater. Phys., 1996, 54, 11169 Search PubMed .
- G. Kresse and J. Furthmüller, Comput. Mater. Sci., 1996, 6, 15–50 Search PubMed .
- T. Björkman, J. Chem. Phys., 2014, 141, 074708 Search PubMed .
- P. E. Blöchl, Phys. Rev. B: Condens. Matter Mater. Phys., 1994, 50, 17953 Search PubMed .
- N. Marom, A. Tkatchenko, M. Rossi, V. V. Gobre, O. Hod, M. Scheffler and L. Kronik, J. Chem. Theory Comput., 2011, 7, 3944–3951 Search PubMed .
- J. Jiang, Y. Wu, Z.-X. Wang and C. Wu, J. Chem. Theory Comput., 2010, 6, 1199–1209 Search PubMed .
- W. Qin, X. Li, W.-W. Bian, X.-J. Fan and J.-Y. Qi, Biomaterials, 2010, 31, 1007–1016 Search PubMed .
- Q. Wang, M.-h. Wang, K.-f. Wang, Y. Liu, H.-p. Zhang, X. Lu and X.-d. Zhang, Biomed. Mater., 2015, 10, 032001 Search PubMed .
- M. P. Andersson, Phys. Chem. Chem. Phys., 2016, 18, 19118–19122 Search PubMed .
- N. V. Ilawe, J. A. Zimmerman and B. M. Wong, J. Chem. Theory Comput., 2015, 11, 5426–5435 Search PubMed .
- K. Lee, Y. Morikawa and D. C. Langreth, Phys. Rev. B: Condens. Matter Mater. Phys., 2010, 82, 155461 Search PubMed .
- C. N. M. Ouma, P. Modisha and D. Bessarabov, RSC Adv., 2018, 8, 31895–31904 Search PubMed .
- S. Chmiela, H. E. Sauceda, K.-R. Müller and A. Tkatchenko, Nat. Commun., 2018, 9, 1–10 Search PubMed .

## Footnote |

† Electronic supplementary information (ESI) available: The definition of conformation energy, the graphs and tables for the error statistics of the tested methods and the structures of individual conformers in the benchmark set. See DOI: 10.1039/d0cp04137h |

This journal is © the Owner Societies 2020 |