S66x8 Noncovalent Interactions Revisited: New Benchmark and Performance of Composite Localized Coupled-Cluster Methods

The S66x8 noncovalent interactions benchmark has been re-evaluated at the"sterling silver"level, using explicitly correlated MP2-F12 near the complete basis set limit, CCSD(F12*)/aug-cc-pVTZ-F12, and a (T) correction from conventional CCSD(T)/sano-V{D,T}Z+ calculations. The revised reference value disagrees by 0.1 kcal/mol RMS with the original Hobza benchmark and its revision by Brauer et al, but by only 0.04 kcal/mol variety from the"bronze"level data in Kesharwani et al., Aust. J. Chem. 71, 238-248 (2018). We then used these to assess the performance of localized-orbital coupled cluster approaches with and without counterpoise corrections, such as PNO-LCCSD(T) as implemented in MOLPRO, DLPNO-CCSD (T1) as implemented in ORCA, and LNO-CCSD(T) as implemented in MRCC, for their respective"Normal","Tight", and"very Tight"settings. We also considered composite approaches combining different basis sets and cutoffs. Furthermore, in order to isolate basis set convergence from domain truncation error, for the aug-cc-pVTZ basis set we compared PNO, DLPNO, and LNO approaches with canonical CCSD(T). We conclude that LNO-CCSD(T) with veryTight criteria performs very well for"raw"(CP-uncorrected), but struggles to reproduce counterpoise-corrected numbers even for veryVeryTight criteria: this means that accurate results can be obtained using either extrapolation from basis sets large enough to quench basis set superposition error (BSSE) such as aug-cc-pV{Q,5}Z, or using a composite scheme such as Tight{T,Q}+1.11[vvTight(T) - Tight(T)]. In contrast, PNO-LCCSD(T) works best with counterpoise, while performance with and without counterpoise is comparable for DLPNO-CCSD(T1). Among more economical methods, the highest accuracies are seen for dRPA75-D3BJ, {\omega}B97M-V, {\omega}B97M(2), revDSD-PBEP86-D4, and DFT(SAPT) with a TDEXX or ATDEXX kernel.

NCIs in the condensed phase are often determined via NMR spectroscopy, [18][19][20] while in the gas phase, rotational spectroscopy is a reliable technique for polar molecules; for recent studies, see ref. 21-24. Significant efforts are still ongoing to understand and quantitatively measure different types of noncovalent interactions, present in bulk as well as interfaces. Unfortunately, results from experimentally studied noncovalently-bonded systems often include dynamical and (or) environmental effects, thus rendering them inappropriate to use as a reference for developing semi-empirical methods. Hence, accurate wavefunction ab initio methods are often essential for obtaining highly precise NCI energies. Over the past few years, several datasets have been proposed [25][26][27][28][29][30][31][32][33][34][35][36][37][38] for intra-and intermolecular noncovalent interactions involving biologically important organic molecules of different sizes. The strong and weak NCIs in biomolecules are particularly interesting because they often play prominent roles in determining their favorable structures, specific binding sites, and dynamics. 11 Hobza's S66 35 and its extended version, S66x8, 37 are two such datasets. The S66x8 set comprises equilibrium and angular-displaced nonequilibrium geometries of 66 dimersbuilt from 14 different monomers representative of biomolecule fragments. These nonequilibrium structures were generated by multiplying the equilibrium intermonomer distances (r e ) by factors of 0.90, 0.95, 1.00, 1.05, 1.10, 1.25, 1.50, and 2.00, respectively, while freezing the other degrees of freedom. All dimers of S66x8 can be categorized into four subsets: hydrogen bonds, p-stacking, London dispersion complexes, and mixedinfluence interactions.
It is already well established that even at the complete basis set (CBS) limit, second-order Møller-Plesset perturbation theory (MP2) performs poorly for atomization energies, electron affinities, and barrier heights (see ref. 39 and references therein). However, for noncovalent interactions, MP2 with a large basis set can be considered a starting point, which, combined with high-level corrections (e.g., CCSD(T)-MP2 in a smaller basis), can lead to more accurate ab initio energies. 35,37 [CCSD(T) refers to coupled-cluster theory, including singles, doubles, and perturbative triples 40,41 ].
The original S66x8 NCI reference energies were obtained by adding a [CCSD(T)-MP2]/AVDZ high-level correction (HLC) to the extrapolated MP2/AV{T,Q}Z with full counterpoise (CP) correction. In ref. 42 Brauer et al. improved the reference energies by combining explicitly correlated MP2-F12 energies at the CBS limit with an HLC from [CCSD(Tc sc )-F12b À MP2-F12]/cc-pVDZ-F12. Depending on the ab initio methods used for high-level correction, Martin and coworkers 43 proposed a hierarchy of revisited standards for NCI energies: gold, silver or sterling silver, and bronze (see Section IIIa for further details). In practice, the ''sterling silver'' (i.e., an alloy of 92.5% pure silver and 7.5% other metals, usually copper) level NCI energies are a low-cost version of the original ''silver'' standard. Due to the daunting computational cost, the authors in ref. 43 computed the interaction energies at the ''gold'' level for only 18 dimers out of 66 present in S66. However, they successfully calculated the ''sterling silver'' and ''bronze'' standard energetics of the S66 and S66x8 sets, respectively.
Although canonical CCSD(T) or explicitly correlated CCSD(T)-F12 is often desired for highly accurate NCI energies, the steep N 7 cost scaling of these methods with the system size (N) makes them prohibitively expensive for applications on larger systems. Hence, over the past few years, localized coupled-cluster methods, such as the PNO-LCCSD(T) [pair natural orbital localized coupled-cluster with singles, doubles, and perturbative triples] approach of the Werner group, 44 DLPNO-CCSD(T) [domain localized pair natural orbital CCSD(T)] and related methods by the Neese group, 45,46 and the LNO-CCSD(T) [localized natural orbital CCSD(T)] method of Kallay and coworkers [47][48][49] have gained considerable attention. These methods' attractive linear scaling behavior (for sufficiently large systems) allows calculations on systems consisting of hundreds of atoms. Often, with tight accuracy cutoffs and large basis sets, they can achieve accuracy comparable to canonical CCSD(T). Examples of recent use include the energetics of the (H 2 O) 20 cages using PNO-LCCSD(T)-F12b, 50 (the F12b suffix refers to explicit correlation 51 ), the noncovalent interaction energies of seven large dimers (L7 set 32 ) with LNO-CCSD(T), 52 the main group thermochemistry, barrier heights, intra-, and intermolecular interaction energies of GMTKN55 53 using DLPNO-CCSD(T), 54 and benchmark studies on the Ru(II) complexes involved in hydroarylation, 55 highly delocalized polypyrroles (POLYPYR 56 set), and metal-organic barrier heights (MOBH35, 35 reactions [57][58][59]. Granted the linear cost scaling, the accuracy of localized approaches is subject to various predefined cutoffs, and tuned fixed combinations of these cutoffs are given as keywords in several codes. The available options for DLPNO-CCSD(T) and related methods in ORCA 60 are LoosePNO, NormalPNO, TightPNO, and VeryTightPNO (see Table 1 in ref. 61 for definitions). The accuracy presets in LNO-CCSD(T) are set to Normal, Tight, vTight, or vvTight (see Table 1  Recently, S66 noncovalent interactions were studied with LNO-CCSD(T) by Kállay and Nagy, 49 and with eight low-cost LNO-CCSD(T)-based composite schemes by ourselves, succinctly reported in a conference proceeding extended abstract. 62 Both studies showed a smooth error convergence by gradually tightening the accuracy thresholds and increasing the basis set size with respect to the ''silver'' 43 standard reference energies. For our composite schemes, a two-point CBS extrapolation with the same accuracy cutoff was employed, and the effect of a tighter cutoff was estimated as a scaled additivity correction in a relatively small basis set. With half-counterpoise 63,64 correction (i.e., the average of full counterpoise-corrected and uncorrected results), some low-cost composite LNO-CCSD(T) methods performed very well.
That being said, the main objectives of the present study can be briefly summarized as: (a) climbing one step of the hierarchical ladder of NCI reference energies toward ''sterling silver'' standard S66x8 interaction energies; (b) evaluating the performance of pure and composite localized coupled-cluster methods based on LNO-CCSD(T), PNO-LCCSD(T), and DLPNO-CCSD(T 1 ) against the S66x8 set; (c) recommending a localized coupled cluster method which can be used for calculating accurate NCI energies, thus avoiding the expensive canonical coupled-cluster options.

II. Computational methods
The Molpro 2021.3 65 program suite was used for the conventional and explicitly correlated ab initio single-point calculations. The MRCC 2020, 66 Molpro 2021.3, 65 and ORCA 5.0.1 60 packages were employed for the LNO-CCSD(T), PNO-CCSD(T), and DLPNO-CCSD(T 1 ) calculations, respectively. The reference geometries of the S66x8 dataset were extracted from the Benchmark Energy & Geometry Database website (https://www.begdb. org/) and were used without further optimization. All electronic structure calculations were performed jointly on the Faculty of Chemistry's Linux cluster, ChemFarm, at the Weizmann Institute of Science, and the Karton group's Linux cluster at the University of Western Australia (UWA).
For the MP2-F12 steps, the 3C(D) 76 approach might have led to somewhat faster basis set convergence; however, while it is both size-consistent and geminal BSSE-free like 3C(Fix), it is not orbital-invariant. As we were already using quite large basis sets, we opted for the orbital-invariant 3C(Fix) default instead, which use fixed geminal amplitudes fixed by the cusp conditions. 77 CCSD-F12a, 78 CCSD-F12b, 78 and CCSD(F12*), 74,75 represent progressively more rigorous approximations to exact CCSD-F12-for a detailed analysis see Köhn and Tew 75 and ref. 79-the latter of which has extensive numerical data for total atomization energies. For interaction energies of water clusters, Manna et al. 80 showed clearly that CCSD(F12*) exhibits the fastest basis set convergence of all three methods. Specifically for S66, the various approximations to CCSD-F12 were investigated in some detail, 43 and CCSD(F12*) was found to be the most accurate by far. The repeatedly made claim, e.g. ref. 81, that CCSD-F12a is superior to CCSD-F12b for small basis sets, rests on an error compensation between basis set incompleteness of these small basis sets and the known overbinding 43,79,80 of F12a. The senior authors eschew ''right answers for the wrong reasons'' to the greatest extent possible.
For the conventional CCSD and CCSD(T), we employed the semi-augmented sano-pVnZ+ (n = D, T) atomic natural orbital basis sets of Neese and Valeev. 82 Appropriate auxiliary basis sets for JKfit 83 (Coulomb and exchange) and MP2fit 84,85 (density fitting in MP2) were used across the board. As recommended by Werner and coworkers 86 for VTZ-F12 and VQZ-F12, the geminal exponent b = 1.0a 0 À1 was employed for the haVTZ-F12 and haVQZ-F12 basis sets. For the localized CCSD(T) calculations, haVnZ (n = T, Q, and 5) basis sets, together with the corresponding haVnZ/MP2fit 84,85 (n = T, Q, and 5), were used. As the present paper was being revised, ref. 87 was brought to our attention, where it was found that substituting haV(n+1)Z/MP2Fit for haVnZ/MP2Fit significantly reduces error in the absolute correlation energy. We tested this at 1.0r e for water dimer (system 1) and benzene dimer pi-stacked (system 24) with Tight cutoffs and the haVTZ and haVQZ basis sets. While the changes in absolute correlation energies are close to what is reported in the ESI of ref. 87, they cancel almost exactly between dimer and separate monomers, such that the interaction energies remain unchanged to two decimal places (see also discussion of Tables 4 and 5 in Sherrill et al. 88 ).
Concerning accuracy thresholds, Normal, Tight, vTight, and vvTight were considered for LNO-CCSD(T), whereas for the PNO-LCCSD(T) calculations, we applied the Default and Tight settings. On the other hand, for DLPNO-CCSD(T 1 ), we employed NormalPNO, TightPNO, and VeryTightPNO (TightPNO with T CutMKN , T CutPNO , and T CutPairs tightened to 10 À4 , 10 À8 , and 10 À6 , respectively. 89 For the detailed description of these cutoff parameters, see ref. 61) settings together with the RIJCOSX (resolution of the identity in combination with the chain of spheres 90 algorithm) approximation. (T 1 ) refers to the more rigorous full iterative triples correction, 91 and (T 0 ), often confusingly referred to as (T), refers to the older, more economical noniterative approach 46 in which the offdiagonal Fock matrix elements are neglected. Unlike (T 0 ), (T 1 ) entails nontrivial I/O bandwidth requirements.
To investigate the dependence of the DLPNO-CCSD(T 1 ) correlation on the size of the PNO space, we also adjusted the T CutPNO threshold in DLPNO-CCSD(T 1 )/Tight from the default value 10 À7 to 10 À6 . Two-point PNO extrapolations were also carried out to the complete PNOs space limit (CPS), using the simple two points extrapolation scheme proposed by Altun et al., 92 Similar to our previous studies, 62,64,80 in addition to full counterpoise (CP) corrections, 71 we have also considered the average of raw and CP, i.e., 'half-CP', to calculate the dissociation energies of S66x8. A two-point basis set extrapolation was carried out using the following expression from Halkier et al., 93 , where L refers to the basis set cardinal number and a is the basis set extrapolation exponent. For RI-MP2-F12/haV{T,Q}Z-F12, where haV{T,Q}Z-F12 denotes the extrapolation from haVTZ-F12 and haVQZ-F12 basis sets, we used the same a value of 4.6324 initially obtained for the S66 set. 43 Like the W1 and W2 theories, 94 for the localized CC methods, we employed a = 3 and 3.22 for the haV{Q,5}Z and haV{T,Q}Z extrapolations, respectively. The optimal values of the linear combination coefficients for the composite localized CC schemes were obtained by minimizing the root mean square deviations (RMSD) from the new reference energies. Most of the DFT results were extracted from the ESI of ref. 42; the revDSD functionals from ref. 95 were evaluated using ORCA, while for oB97M-V 96 and oB97M(2), 97 the reference implementations in Q-CHEM 5 were used. 98
Originally proposed 43 Our best estimates of the S66x8 dissociation energies are listed in Table 1. Relative to the ''sterling silver'' standard dissociation energies, the RMS deviation for Hobza's 37 original S66x8 reference is 0.103 kcal mol À1 , which is marginally reduced to 0.096 kcal mol À1 for the revised reference proposed by Brauer et al. 42 However, the ''bronze'' quality dissociation energies lately proposed by Kesharwani et al. 43 have only 0.041 kcal mol À1 deviation. Among the four subsets, the most significant contribution to the RMS error comes from the p-stack complexes. In general, the error is most prominent for the compressed geometries, which gets dramatically reduced for the relaxed cases (see Table 2).
As the present paper was being finalized for submission, a paper by Nagy, Kállay, and coworkers 99 was published in which they revisited the S66 reference energies at what they call a ''14karat'' [sic] ''gold'' (14k-gold) level. Their levels of theory through CCSD are nearly identical to those used in the ''silver'' and ''sterling silver'' datasets of ref. 43, but their (T) treatment is effectively that of the ''gold'' level (used only for a subset of 18 systems ref. 43 owing to computational cost), namely CCSD(T)/ haV{T,Q}Z albeit with a DF approximation.
Essentially all of the discrepancies between ''14k-gold'' on the one hand and ''silver'' and ''sterling silver'' on the other hand can be attributed to the more economical triples corrections in these latter reference standards, namely CCSD(T)/ haV{D,T}Z and CCSD(T)/sano-{D,T}Z+, respectively.
For the 18 systems where we were able to obtain ''gold'' answers in ref. 43, the 14k-gold values of Nagy et al. 99 agree to 0.008 kcal mol À1 RMS. The ''silver'' is very close to both, with RMSD of just 0.011 kcal mol À1 from the ''14k-gold'' for the whole S66 set, and of just 0.006 kcal mol À1 from the ''gold'' for the subset of 18 systems where the latter is available. In contrast, the more economical ''sterling silver'' level used for S66x8 in the present study deviates by a somewhat larger 0.027 kcal mol À1 from ''14k-gold'' for S66. For the 18-system ''gold'' subset, we find a not dissimilar 0.024 kcal mol À1 deviation.
Keeping also in mind residual uncertainties in other components, we conclude from the above that 0.03 kcal mol À1 RMS is a realistic estimate of the uncertainty in our revised S66x8 data.

(b) LNO-CCSD(T)-based methods
In ref. 62 we benchmarked the performance of pure and composite LNO-CCSD(T) methods with respect to the S66 noncovalent interaction energies (''silver'' standard 43 ). In the present study, we investigate the performance of LNO-CCSD(T)based methods for the S66x8 set using the different basis sets and accuracy thresholds available.
Like in ref. 62 here too, we find a consistent improvement of accuracy with increasing basis set size and tightening of the thresholds; counterpoise correction also offers an appreciable improvement (see upper blocks of Table 3). With vTight threshold and half-CP, LNO-CCSD(T)/haV5Z is the best pick (0.056 kcal mol À1 ) among the single basis set approaches tested here. Except for the ''Normal'' settings, the RMS deviations with half-CP are marginally better than the full-CP option for all other accuracy thresholds.
Irrespective of the choice of accuracy threshold, a two-point CBS extrapolation improves the RMS error statistics across the board. Among all the LNO-CCSD(T)/CBS methods listed in Table 3, the LNO-CCSD(T, vTight)/haV{T,Q}Z half-CP is the best pick (0.025 kcal mol À1 ) for S66x8 noncovalent interactions. With the ''Normal'' setting and counterpoise uncorrected interaction energies, the low-cost LNO-CCSD(T)/haV{T,Q}Z fortuitously outperforms LNO-CCSD(T)/haV{Q,5}Z, and gradually tightening the accuracy threshold narrows that performance gap. The LNO-CCSD(T)/CBS methods with half-CP have slightly lower RMSD than their full-CP counterparts; however, the differences are within the uncertainty of the reference data. (Needless to say, at the true CBS limit, the difference between raw and CP-corrected values should be zero: a significant discrepancy between raw and CP in a CBS extrapolation suggests that the underlying calculations are inadequate, or that the extrapolation procedure is problematic.) Closer scrutiny of LNO-CCSD(T) performance for the subsets of S66x8 reveals that, for any individual haVnZ basis set, CP correction always degrades performance for hydrogen bonds, but improves it for the three other subsets. Upon CBS extrapolation, we find a significant raw-CP difference for hydrogen bonds with Normal cutoffs and haV{Q,5}Z basis sets, favoring full CP, while the opposite is true for the other three subsets. With Tight cutoffs, the raw-CP differences upon extrapolation approach the uncertainty in the reference values; with vTight cutoffs, one can no longer meaningfully choose between raw and CP. One might elect to use half-CP and use the difference between raw and CP as a crude uncertainty bar; on the other hand, the ''raw'' calculations are more convenient, especially when applied to intramolecular interactions where CP correction would be impractical. Using the vTight threshold and half-CP, LNO-CCSD(T)/haV{T,Q}Z and LNO-CCSD(T)/haV5Z are statistically tied for best pick for all four subsets of S66x8 (see the   Table 3). Hence, the more economical of the two would seem preferable. Let us take a closer look at the composite LNO-CCSD(T) schemes (a.k.a, cLNOs 62 ) with different accuracy thresholds and basis sets. Regardless of CP correction or lack thereof, the RMS deviations of all the cLNO methods are below 0.1 kcal mol À1 for the complete S66x8 set. With RMS error of 0.024 and 0.028 kcal mol À1 , respectively, vTight{T,Q} + 0.31[vvTight À vTight]/T half-CP and vTight{Q,5} + 0.71[vvTight À vTight]/T half-CP are the two best picks among the eleven cLNOs listed in Table 3. However, the first cLNO scheme is clearly preferred due to the lower computational cost. Either with ''raw'' or half-CP, the most affordable composite scheme Normal{T,Q} + c 1 [vTight À Normal]/T performs similarly to the very expensive LNO-CCSD(T, vTight)/haV5Z half-CP.

(c) PNO-LCCSD(T)-based methods
The RMS deviations for the pure and composite PNO-LCCSD(T) methods using different basis sets and accuracy thresholds are listed in Table 4.
Overall, the counterpoise uncorrected results indicate a consistent improvement in accuracy with increasing basis set size for any threshold. Among the PNO-LCCSD(T)/haVnZ methods, PNO-LCCSD(T, Tight)/haVQZ offers the best performance (0.071 kcal mol À1 ), followed by PNO-LCCSD(T, Default)/haV5Z (0.075 kcal mol À1 ). Further investigation reveals that the former method does better for the hydrogen-bonded systems and London dispersion complexes, whereas the latter is preferred for the p-stacks. Using a full-CP correction worsens performance across the board. However, half-CP can still perform close to the ''raw'' accuracy with a large basis set. Now, comparing the performance of ''raw'' and half-CP variants of  PNO-LCCSD(T, Tight)/haVQZ for the four subsets of S66x8, we have found that for the hydrogen bonding complexes, ''raw'' is preferred, but for the p-stack dimers, ''half-CP'' variant offers significantly better accuracy than its half-CP counterpart. For the London dispersion and mixed influence dimers, there is very little to choose between these two methods (see Table 4). Now, what is the effect of CBS extrapolation? Both haV{T,Q}Z and haV{Q,5}Z, in combination with the full-CP and Default threshold, perform remarkably well (0.027 and 0.028 kcal mol À1 , respectively). Using a tighter threshold marginally improves the accuracy of PNO-LCCSD(T)/haV{T,Q}Z (0.020 kcal mol À1 ). Hence, irrespective of the choice of accuracy threshold, the standard PNO-LCCSD(T)/CBS methods with full-CP perform similar to the best cLNO schemes.
Unlike for cLNOs, we have not found any significant improvement in accuracy when PNO-LCCSD(T)-based composite schemes (i.e., cPNOs) are considered instead of the standard PNO-LCCSD(T)/haVnZ or PNO-LCCSD(T)/CBS methods with full-CP. However, for special cases where counterpoise correction is not applicable (in other words using ''raw'' energies), the composite methods outperform the standard PNO-LCCSD(T) alternatives. With RMS deviations of 0.033 and 0.034 kcal mol À1 , respectively, Default{Q,5} + 0.15[Tight À Default]/T and Default{Q,5} + 0.33[Tight À Default]/Q are the Table 3 Root-mean-square deviations (RMSDs, kcal mol À1 ) of pure and composite LNO-CCSD(T) methods with respect to the ''sterling silver'' reference. Heat mapping is from red (worst) via yellow to green (best) Table 4 Root-mean-square deviations (RMSDs, kcal mol À1 ) of the pure and composite PNO-LCCSD(T) methods with respect to the ''sterling silver'' reference. Heat mapping is from red (worst) via yellow to green (best)

PCCP Paper
Open two best picks in the counterpoise uncorrected category. In the same vein, comparing the top performers from the ''raw'' cLNO and cPNO schemes, we have found that for hydrogen bonding and p-stack complexes, cLNOs are preferred, whereas cPNOs offer better accuracy for London dispersion complexes. For the mixed influence subset, there is very little to choose between cLNO and cPNO.
(d) DLPNO-CCSD(T 1 )-based methods Table 5 summarizes the RMSDs of the pure and composite DLPNO-CCSD(T 1 ) methods for different basis sets, accuracy thresholds, and T cutPNO combinations. The ''raw'' results indicate a gradual improvement of accuracy with increasing basis set size and tightening of the thresholds, provided that we use the default T cutPNO (i.e., 3.33 Â 10 À7 and 1.0 Â 10 À7 for Normal and TightPNO, respectively). Moving from ''raw'' to full-CP correction further improves the performance of DLPNO-CCSD(T 1 , Normal)/haVTZ, but stays more or less indifferent when haVQZ is employed. However, while using the Tight threshold, the performance of ''raw'' and full-CP DLPNO-CCSD(T 1 )/haVTZ are comparable, but with haVQZ basis set, ''raw'' performs better.
Fortuitously, the strategy of using the threshold T CutPNO = 1.0 Â 10 À6 in TightPNO and a haVTZ basis set offers the best performance (0.309 kcal mol À1 ) for ''raw'' interaction energies. The RMS error counter-intuitively increases as the T CutPNO parameter becomes tighter. When CP correction is included, we observe the reverse trend. Irrespective of the choice of T CutPNO , half-CP correction is more beneficial than full-CP. With the same basis set, tightening the accuracy threshold further (i.e., VeryTightPNO) is only beneficial for half-CP. Using a more extensive basis set and default T CutPNO improves accuracy throughout, with and without CP correction. Although ''raw'' DLPNO-CCSD(T 1 , Tight)/haVTZ underperforms ''raw'' DLPNO-CCSD(T 1 , Tight)/haVQZ for p-stacks, London dispersion, and mixed influence subsets, these two methods offer similar performance for H-bonds. Now, what is the effect of CBS extrapolation? Except for the hydrogen bonding, we noticed an improvement in performance across the board when the Normal threshold is employed. By tightening the accuracy threshold further, we found that CBS extrapolation does more harm than good for ''raw'' but improves the performance for full and half-CP. With full-CP correction, the DLPNO-CCSD(T 1 , Tight)/haV{T,Q}Z (0.095 kcal mol À1 ) is the best pick among the pure methods, followed by its half-CP version (0.117 kcal mol À1 ). Using DLPNO-CCSD(T 1 , Tight)/haVQZ/T CutPNO = 10 À7 and DLPNO-CCSD(T 1 , Tight)/haVTZ/CPS interaction energies for two-point CBS extrapolation worsen the performance regardless of considering a CP correction.
Among the counterpoise-uncorrected DLPNO-CCSD(T 1 )based composite schemes, a relatively low-cost composite method, Normal/{T,Q} + 0.93[Tight/CPS À Normal]/T offers the best accuracy (0.072 kcal mol À1 ). The use of counterpoise correction only offers marginal improvement (RMSD = 0.065 and 0.067 kcal mol À1 , with full and half-CP, respectively). With an RMS error of 0.059 kcal mol À1 , the full CP corrected composite method, Tight{T,Q} + 1.1[vTight À Tight]/T is the best pick among all the standard and composite DLPNO-CCSD(T 1 ) listed in Table 5.
DLPNO-CCSD(T 1 ) is much more demanding in terms of I/O storage and bandwidth requirements than DLPNO-CCSD(T 0 ), and this will hold especially true for the largest basis sets. It was previously observed by Iron and Janes 57,58 and by Efremenko and Martin,55 both in the context of organometallic catalysis, that the (T 1 )-(T 0 ) difference is relatively insensitive to the basis set; hence, we considered here a two-tier composite method, DLPNO-CCSD(T 0 )/haVQZ + c 1 [DLPNO-CCSD(T 0 )/haVQZ À DLPNO-CCSD(T 0 )/haVTZ] + c 2 [DLPNO-CCSD(T 1 )/haVTZ À DLPNO-CCSD(T 0 )/haVTZ], where the CBS extrapolation is carried out at the DLPNO-CCSD(T 0 ) level and the (T 1 ) À (T 0 ) difference is evaluated in a smaller basis set. With a rootmean-square error of 0.079 kcal mol À1 , the performance of this cDLPNO-scheme is comparable to the best pick in the ''raw'' category, Normal{T,Q} + 0.93[Tight/CPS À Normal]/T. While using counterpoise-uncorrected energies, the optimized coefficient for the (T 1 ) À (T 0 ) contribution is anomalously large (c 2 = 3.33). However, with full-CP correction, c 2 is reduced to 1.33, and the RMS error improves to 0.068 kcal mol À1 , which is not very far from the accuracy of the more expensive DLPNO-CCSD(T 1 )-based composite method, Tight{T,Q} + 1.1[vTight À Tight]/T (see Table 5). These two-tier cDLPNO methods may be an attractive option for larger systems.

(e) A few observations regarding computational requirements
This is not the appropriate place to enter into a discussion on the relative computational efficiency of the various codes-particularly since our computing cluster is extremely heterogeneous in terms of CPUs, and a fair comparison would require not only ''all else being equal'' but a broad and representative sample.
That being said, we can state a few observations concerning the effect of different accuracy settings. For illustration, wall clock timings for system 26 (stacked uracil dimer) at 1.0r e have been given in the ESI. † All these calculations were carried out on 8 cores of an otherwise idle 2.4GHz Intel Xeon Gold Intel(R) Xeon(R) Gold 6240R ''Cascade Lake'' node with 192GB of RAM-the ORCA jobs, which are more demanding in terms of I/O bandwidth, were run on a machine with the same CPU and 2.9TB of striped SSD.
As a rule, for LNO-CCSD(T), turning up accuracy from Normal to Tight requires between triple and quintuple the wall clock time.
The additional expense going from Tight to vTight tends to be smaller-doubling to tripling wall clock time.
As for the dependence of wall clock times on the basis set size, naively one would expect it to tend towards linear scaling with the number of basis functions for sufficiently large systems and basis sets. In practice, we are not yet operating in such a size regime: the numbers of basis functions are in a 848 : 408 = 2.08 ratio for haVTZ : haVDZ, and a 1520 : 848 = 1.79 ratio for haVQZ : haVTZ, but the wall clock times definitely go up faster than that, by a factor of 3-4. Still, this is much gentler than the N 4 scaling of the canonical calculation-so while canonical CCSD(T) calculations in the smallest haVDZ basis set may not be abysmally more demanding than their linearized counterpart, a wide chasm opens for the larger basis set. This militates in favor of a composite method where basis set extrapolation is carried out using localized methods, and the difference between canonical and localized method in the smallest basis set used as a differential correction.
Another observation that may be relevant here concerns the PNO cutoff extrapolation of the T cutPNO = 1.0 Â 10 À{6,7} type. The more expensive T cutPNO = 1.0 Â 10 À7 calculation is equivalent to standard Tight, while the T cutPNO = 1.0 Â 10 À6 calculation is not much more expensive than Normal. So, it would seem that this is an economical as well as accurate option.
(f) How far can we go with canonical CCSD(T)?
We could do the canonical CCSD(T) calculations with density fitting and haVnZ (n = D and T) basis set for the whole S66x8 set. Relative to the ''sterling silver'' reference, DF-CCSD(T)/haVDZ offers RMS deviations of 1.16, 0.90, and 0.53 kcal mol À1 with ''raw'', full-CP, and half-CP correction, respectively. Except for NCIs involving hydrogen bonds, CP correction is beneficial across the board. Using a larger basis set improves RMSDs substantially, and on top of that, a two-point CBS extrapolation (using Schwenke's 100 formula) further ameliorates the statistics. With full-CP, DF-CCSD(T)/haV{D,T}Z offers an RMS deviation of 0.11 kcal mol À1 . For p-stack, London dispersion, and mixed influence subsets, full-CP performs noticeably better than the ''raw'' and half-CP. The only exceptions are the Hbonded systems, where half-CP wins the race. Compared to the ''sterling silver'' level HLC, the canonical [CCSD(T)-MP2]/haVTZ energies offer RMS errors of 0.04, 0.05, and 0.07 kcal mol À1 with ''raw'', full, and half CP correction. Now, let us compare the performance of different localized coupled cluster methods relative to the canonical DF-CCSD(T) interaction energies. LNO-CCSD(T) with vTight and vvTight settings perform remarkably if we do not use any CP correction. Irrespective of the choice of accuracy threshold, full-CP correction is beneficial for the pure PNO-CCSD(T) methods, but they are not even close to ''raw'' LNO-CCSD(T)/vTight. However, with full CP correction, the PNO-based composite method, Tight + A(Tight À Default), offers accuracy similar to ''raw'' LNO-CCSD(T)/vTight. For DLPNO-CCSD(T 1 ), it does not make a significant difference whether we use counterpoise correction or not (see Table 6).
(g) Some remarks on the evaluation of more approximate methods, such as DFT functionals and SAPT(DFT) The revised reference data prompt the question: to what extent do the revised values affect or modify prior observations (e.g., in ref. 42) on the performance of DFT functionals for the S66x8 dataset?
The easiest way to see this would be to take the Excel workbook in the ESI † of the said paper, splice in our new reference data, and compare the published tables in the paper with the dynamic versions in leaf ''Summary'' of the ESI, † particularly for Tables 14 and 15. Generally speaking, the conclusions from ref. 42 are unaffected: for instance, the RMSDs for BLYP-D3BJ and BP86-D3BJ with the def2-QZVP 101 basis set and full counterpoise change from 0.23 and 0.58 kcal mol À1 , respectively, to 0.22 and 0.65 kcal mol À1 . This does not affect the superiority of BLYP over BP86 in this context. (The corresponding changes for B3LYP-D3BJ and PBE0-D3BJ are from 0.20 and 0.35 kcal mol À1 , respectively, to 0.23 and 0.36 kcal mol À1 .) Among double hybrids, all with haVQZ basis set and half-counterpoise, B2PLYP-D3BJ 102 and B2GP-PLYP-D3BJ 103 actually improve from 0.19 and 0.22 kcal mol À1 to 0.15 and 0.18 kcal mol À1 , respectively, while DSD-PBEP86-D3BJ 104 deteriorates from 0.20 to 0.26 kcal mol À1 . However, we previously found the revised version of the latter, revDSD-PBEP86-D3BJ, 95 to be an improvement over DSD-PBEP86-D3BJ across the board, and this is also seen here for the new S66x8 reference, RMSD = 0.19 kcal mol À1 . Using the more up-to-date D4 105 dispersion correction, this latter RMSD drops to 0.17 kcal mol À1 for revDSD-PBE86-D4. With RMSD of 0.10 kcal mol À1 , dRPA75 106 with a customfitted 42 D3BJ correction was the best performer in ref. 42 and Table 5 Root-mean-square deviations (RMSDs, kcal mol À1 ) of the pure and composite DLPNO-CCSD(T 1 ) methods with respect to the ''sterling silver'' reference. Heat mapping is from red (worst) via yellow to green (best) continues to be so here (0.12 kcal mol À1 ). [As explained in ref. 42, the D3BJ correction has a small coefficient and is similar to that found for CCSD: in effect, it compensates for the missing dispersion terms from (T), which start at fourth order in symmetry-adapted perturbation theory (SAPT) 107 ] As a parenthetical remark, the combinatorially optimized, range-separated hybrid, oB97M-V 96 and double hybrid, oB97M(2) 97 functionals were not yet available to us when ref.
42 was published; we find here 0.15 and 0.14 kcal mol À1 rootmean-square deviations relative to the new reference, respectively. It means oB97M-V outperforms all other rung 4 functionals, plus all rung 5 functionals considered in ref. 42 other than dRPA75-D3BJ and oB97M (2). 108 i.e., SAPT using DFT orbitals, has recently gained some currency as a relatively low-cost/high-accuracy approach for noncovalent interactions. Heßelmann 109 applied such approaches to S66x8 using asymptotically corrected PBE0ac orbitals and three different response kernels, namely ALDA (adiabatic local density approximation), TDEXX (timedependent exact exchange), and ATDEXX (adiabatic TDEXX). The RMSD between Table 5 of his ESI † and the present ''sterling silver'' reference data is 0.224 kcal mol À1 using ALDA, but this drops to 0.150 kcal mol À1 for TDEXX and slightly further to 0.136 kcal mol À1 for ATDEXX. Most of the improvement results from the p-stacked subset (see ESI † for details); suffice to say that both TDEXX and ATDEXX Table 6 Root-mean-square deviations (RMSDs, kcal mol À1 ) of different localized orbital coupled cluster methods with respect to the canonical DF-CCSD(T)/haVTZ level interaction energies of S66x8. Raw, CP, and half-CP represents the counterpoise uncorrected, full-, and half-CP corrected results. DCP represents the size of the counterpoise correction; DDCP is the deviation between DCP with this particular localized method and with canonical CCSD(T) are competitive with the best DFT functionals considered here, and not greatly inferior to previous wavefunction calculation sets.

IV. Conclusions
We have successfully calculated the ''sterling silver'' standard noncovalent interaction energies of the entire S66x8 set. Analyzing the RMS errors of three other S66x8 reference dissociation energies available in the literature, we can safely conclude that the ''sterling silver'' reference energies are markedly better than Hobza's original ones 37 as well as the earlier revised version proposed by our group, 42 but only marginally better than the ''bronze'' 43 level dissociation energies obtained at much lower cost.
Additionally, examining the RMS deviations of a variety of pure and composite localized coupled cluster methods for the new S66x8 reference, we can conclude the following: (ii) Even with the Default threshold, PNO-LCCSD(T)/ haV{T,Q}Z full-CP performs remarkably sound-which gets even better with a tighter threshold. This remarkable result of the PNO-LCCSD(T, Tight)/haV{T,Q}Z full-CP could be due to the benefits from fortuitous error compensation. Using a composite scheme does not have any added advantage over the pure PNO-LCCSD(T) methods. However, for the intramolecular interactions, the low-cost cPNO method, Default{Q,5} + 0.15[Tight À Default]/T still has the edge over the pure methods due to its superiority for the p stack complexes.
(iii) Even with CPS and CBS extrapolation, the pure DLPNO-CCSD(T 1 ) methods are not up to the mark for S66x8 noncovalent interactions. An RMS deviation of 0.059 kcal mol À1 is the best accuracy we can achieve by employing a composite scheme, which is well behind the best picks among the PNO or LNO-based composite methods.

Note added in revision
While the present paper was being revised in response to peer reviewer comments, a study by Förster 110 appeared that compares the performance of RPA, second-order screened exchange 111 (SOSEX), and second-order statically screened exchange 112,113 for a variety of chemical problems, including S66x8. Extracting the RPA, RPA + SOSEX(W, n c ), and RPA + SOSEX(W(0), W(0)) level interaction energies from the ESI of ref. 110, we have evaluated the RMS deviations relative to the ''sterling silver'' reference data (see the ESI †). With full-CP correction and CBS extrapolation, RPA, RPA + SOSEX(W, n c ), and RPA + SOSEX(W(0), W(0)) has RMSD of 0.44, 0.35, 0.40 kcal mol À1 , respectively-note that none of these include separate dispersion corrections, unlike the better-performing DFT options in the present paper.

Conflicts of interest
The authors declare no conflicts of interest.