An extensive assessment of the performance of pairwise and many-body interaction potentials in reproducing ab initio benchmark binding energies for water clusters n = 2–25 †

We assess the performance of 7 pairwise additive (TIP3P, TIP4P, TIP4P-ice, TIP5P, OPC, SPC, SPC/E) and 8 families of many-body potentials (q-AQUA, HIPPO, AMOEBA, EFP, TTM, WHBB, MB-pol, MB-UCB) in reproducing high-level ab initio benchmark values, CCSD(T) or MP2 at the complete basis set (CBS) limit for the binding energy and the many-body expansion (MBE) of water clusters n = 2–11, 16–17, 20, 25. By including a large range of cluster sizes having dissimilar hydrogen bonding networks, we obtain an understanding of how these potentials perform for different hydrogen bonding arrangements that are mostly outside of their parameterization range. While it is appropriate to compare the results of ab initio based many-body potentials directly to the electronic binding energies ( D e ’s), the pairwise additive ones are compared to the enthalpies at T = 298 K, D H (298 K), as the latter class of force fields are parametrized to reproduce enthalpies (implicitly accounting for zero-point energy corrections) rather than binding energies. We find that all pairwise additive potentials considered overestimate the reference D H values for the n = 2–25 clusters by 4 13%. For the water dimer ( n = 2) in particular, the errors are in the range 83–119% for the pairwise additive potentials studied since these are based on an effective rather than the true 2-body interaction specifically designed as a means of partially accounting for the missing many-body terms. This stronger 2-body interaction is achieved by an enhanced monomer dipole moment that mimics its increase from the gas phase monomer to the condensed phase value. Indeed, for cluster sizes n Z 4 the percent deviations become slightly smaller (albeit all exceeding 13%). In contrast, we find that the many-body potentials perform more accurately in reproducing the electronic binding energies ( D e ’s) throughout the entire cluster range ( n = 2–25), all reproducing the ab initio benchmark binding energies within (cid:2) 7% of the respective CBS values. We further assess the ability of a subset of the many-body potentials (MB-UCB, q-AQUA, MB-pol, and TTM2.1-F) to also reproduce the magnitude of the ab initio many-body energy terms for water cluster sizes n = 7, 10, 16 and 17. The potentials show an overall good agreement with the available benchmark values. However, we identify characteristic differences upon comparing the many-body terms at both the ab initio -optimized geometries and the respective potential-optimized geometries to the reference ab initio values. Additionally, by applying this analysis to a wide range of cluster sizes, trends in the MBE of the potentials with increasing cluster size can be identified. Finally, in an attempt to draw a parallel between the pairwise additive and many-body potentials, we report the analysis of the individual molecular dipole moments for water clusters with 1 to B 4 solvation shells with the TTM2.1-F potential. We find that the internally solvated water molecules have in general a larger molecular dipole moment ranging from 2.6–3.0 D. This justifies the use of an enhanced, with respect to the gas-phase value, molecular dipole moment for the pairwise additive potentials, which is intended to fold in the many body terms into an effective (enhanced) pairwise interaction through the choice of the charges. These results have important implications for the development of future generations of efficient, transferable, and highly accurate classical interaction potentials in both the pairwise additive and many-body categories.


I. Introduction
The ubiquity of water in nature has led to prolific scientific research aimed at understanding the complex interactions between water molecules in extended hydrogen bonding networks.[3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20] Small and medium size water clusters (i.e., up to n = 30) are simple enough to be studied using high level ab initio quantum calculations (the largest electronic structure calculation to date being the CCSD(T)/aug-cc-pVTZ single point energy calculation of CH 4 @(H 2 O) 20 ), 21 keeping in mind the challenge of obtaining the lowest energy structures especially for n 4 15. 22,23However, molecular dynamics simulations of water's macroscopic properties are often too computationally expensive for these high level, first principles calculations to be feasible, pushing for the development of classical interaction potentials parameterized to either macroscopic properties or ab initio cluster data.This task is no easy feat given the complexities of water's behavior including the importance of cooperative hydrogen bonding 24 and polarizability. 25For example, it has been demonstrated through Monte Carlo temperature basin paving that tens of thousands of water cluster conformers exist within 5 kcal mol À1 above the putative minima for medium-sized water clusters (n 4 9) [https://sites.uw.edu/wdbase/database-of-water-clusters/]. 26n order to reproduce water's macroscopic behavior, potentials must be able to accurately reproduce the energetics in addition to the geometries and dynamics of these cluster networks.
Since water potential development has been, and still is, an active field of research and more benchmark ab initio data continuously become available, it is necessary to intermittently compare and benchmark the developed potentials, especially including data outside their parametrization range, in order to assess their accuracy and identify areas that they can be improved. 27,28In this study, we focus on the structures, binding energies, and the many-body terms (comprising the binding energies) of the local minima for a wide range of cluster sizes and geometries.In our opinion, the compilation of the most accurate to date energetics for such a large range of water clusters can prove a useful resource and reference point for future studies.It should be noted that other properties, such as the harmonic frequencies and forces, are also important quantities to examine.While we do not investigate these other properties explicitly, obtaining accurate minimum energy structures is contingent on accurate forces near the minima.Further, the accuracy of the local minima will likely affect the accuracy of the harmonic frequencies, which are evaluated at these stationary points.Therefore, the ability to reproduce the structures of the local minima can be considered a precursor to obtaining accurate harmonic frequencies for the ''right reasons'' instead of a cancellation of errors (i.e., obtaining accurate harmonic frequencies at an inaccurate minimum energy structure).For this reason, we focus on the structures and the binding energies at the local minima, while acknowledging that other properties are also important to examine in the future for different applications.
The foundations of the developed water potentials can be best understood when discussed in the context of the manybody expansion (MBE) of the system's intermolecular interactions.The MBE is a combinatorial approach 29 that partitions the binding energy (BE) of a given system into its constituent nbody terms, according to eqn (1), where DE IJ. ..K is the energy of a subsystem of k ''bodies'' (water molecules).A complete MBE extends to the n-body term, where n is the number of fragments in a given system.Different approximations to represent the system's total energy, BE, give rise to two major classes of water potentials, namely pairwise additive and many-body potentials.Pairwise additive potentials (such as TIP3P, 30 TIP4P, 30,31 TIP5P, 32 SPC, 33 SPC/E, 34 OPC, 35 TIP4P/ice 36 ) approximate the total binding energy of the system through an effective two-body interaction between all pairs of molecules, This effective two-body interaction in the pairwise additive potentials can be modeled by simple electrostatic and dispersion terms using point charges and simple functional forms.
The number and locations of the point charges, ranging from 2 to 6 points, are fixed for each model uniquely defining the monomer permanent dipole moment.In this study, only rigid pairwise additive models (i.e., potentials for which the monomer geometry is fixed and correspondingly the 1-body term is zero) will be investigated.All higher order terms beyond the (effective) 2-body term in the expansion of pairwise potentials are exactly zero by construction.Importantly, the effective twobody potential differs from the ''true'' two-body potential (i.e., the one in the water dimer) as it attempts to fold the missing many-body higher order terms through the use of charges that produce molecular dipoles that are enhanced with respect to the gas phase isolated monomer values.Pairwise additive potentials are typically parameterized to reproduce various macroscopic properties of water, such as the density, specific heat, radial distribution function, and phase changes.Most of them are fitted to emulate the enthalpy of liquid water via classical (Newtonian) molecular dynamics simulations, effectively not considering zero-point energy effects explicitly, as these are assumed to be somehow implicitly included in the functional form and parametrization.Because of this, these potentials offer a less direct comparison to the water cluster ab initio binding energies (D e 's) but rather to enthalpies at 298 K (DH(298 K)'s), as they implicitly include zero-point energy corrections and temperature effects.
In contrast, many-body potentials include higher order contributions from the MBE, either implicitly or explicitly.

This journal is © the Owner Societies 2023
This can be achieved by incorporating an effective manybody term, that accounts for electrostatic, polarization, dispersion, and other interactions that implicitly incorporate higher order terms in the expansion.Potentials in this category include, among others, the Thole-type models (TTM), [37][38][39][40][41] the family of the effective fragment potentials (EFP), [42][43][44][45] the MB-UCB, 46 AMOEBA, [47][48][49][50][51] and HIPPO 52 potentials.Alternatively, the higher order terms in the MBE can be explicitly accounted for by defining fitted polynomials, that describe higher order terms up to a specified order of the expansion such as in the q-AQUA, 53 MB-pol, 54,55 WHBB, 56 and CC-pol [57][58][59] potentials.The challenge with explicitly fitted polynomials for the MBE terms is determining the point, at which the expansion should be truncated.It has been shown that the many-body expansion for water clusters converges at the 4-body term. 24,60However, it has been recently proposed that the 5-body term is necessary for the forces to converge. 61For explicit many-body potentials, the expansion is often truncated at the 3-body term.However, it should be noted that a 4-body term was recently fitted for use in explicit many-body potentials which was recently implemented in q-AQUA. 53,62The 1-body term is often described by the Partridge-Schwenke 63 potential energy surface (PES), in some instances modified to mimic the dissociation limit in condensed environments upon OH extension 38 or individual terms describing anharmonic O-H stretches and H-O-H bends with a Urey-Bradley 64 term describing the coupling between these degrees of freedom.Naturally, the 1-body term is zero when the many-body potential is rigid (i.e., for the CC-pol and rigid pairwise additive potentials).Because the many-body potentials are trying to approximate the Born-Oppenheimer PES and include zeropoint effects explicitly via path integral nuclear statistical simulations [65][66][67] it is appropriate to compare the results with this family of potentials to the ab initio cluster electronic binding energies (D e 's).This paper provides a comprehensive assessment of a set of existing classical interaction potentials for water with an emphasis on their performance in reproducing the binding energies and enthalpies of water cluster sizes lying outside of their parameterization range.The details of the collection of pairwise additive and many-body polarizable potentials considered in this study are given in Table S1 of the ESI.† This comparison provides insight into how well these water potentials reproduce complex water interactions of systems with varying hydrogen bonding arrangements and sizes, as well as the water cluster size regimes they perform optimally.To facilitate this comparison, we provide a compilation of existing MP2/CBS, CCSD(T)/CBS, SAMBA 68 and DMC 69,70 binding energies from earlier published works for water clusters within the range n = 2-25, 68,[71][72][73][74][75][76][77][78][79][80][81][82][83][84][85] against which the results obtained with the water potentials are compared.This includes water cluster sizes both within the parameterization range of the many-body potentials but also larger sizes that can be used to identify trends with cluster size.Further, we find it important that the cluster minimum energy structures are optimized with each potential to gauge how well these potentials reproduce the structure (minimum energy geometry) of the water cluster in addition to its energetics.For select many-body potentials (MB-UCB, q-AQUA, MB-pol, and TTM2.1-F), we also report a many-body decomposition both at the ab initio-optimized and potential-optimized geometries.The resulting many-body terms are compared with the MP2 results for these clusters (n = 7, 10, 16, 17). 60,86From this analysis, we subsequently analyze the composition (n-body terms) of the binding energy (D e ) rather than simply its total magnitude.Lastly, we analyze the molecular dipole moments in water clusters with 1 to B4 solvation shells using the TTM2.1-Fpotential to quantitatively account for the effect of the environment on the molecular dipole moments, thus providing a connection between the many-body and pairwise additive potentials, which use an enhanced fragment dipole moment to fold in the many-body terms into the effective 2-body term.Through this study, we strive to determine the strengths and weakness of various existing classical interaction potentials for water applicable in the cluster regime (structures and energetics), thus identifying opportunities for their further improvement and refinement.

II. Computational details a. Geometry optimizations and binding energies
For all geometry optimizations performed with each of the potentials, the starting geometries were the optimized ones at the CCSD(T)/aVDZ for n = 2-6, 82,87 MP2/aVDZ for n = 7, 75 CCSD(T)/aVDZ for n = 8, 82,87 MP2/6-31G* for n = 9, 74 MP2/aVDZ for n = 10, 75 MP2/aVTZ for n = 11, 81,82 16, 17, 82,88 and 20, 21,72,89 and MP2/aVDZ for n = 25 75 taken from the respective ESI † of published works.The reported binding energies were calculated at the optimized structure with each water potential.The optimized geometries with the potentials have retained the oxygen framework and hydrogen bond topology except for a few clusters; these cases are denoted as NSP (Not a Stationary Point).Note that the developers of the MB-pol potential oftentimes report the binding energies with that potential at the MP2/aVTZ optimized geometries.In our opinion this introduces an unnecessary complication since it has been reported that the MP2 level of theory overestimates the hydrogen bonded OH stretches 87 in water clusters and, consequently, the corresponding red shifts 90 with respect to CCSD(T). 87Additionally, a critical test of the accuracy of an interaction potential, besides the total cluster binding energies, rests with its ability to also yield accurate cluster geometries.The optimized structure needs to be eventually reported at the potential minima for the purpose of comparing harmonic frequencies and it will be confusing to report the binding energy at a non-stationary point (MP2/aVTZ optimized geometries) and the frequencies at the potential minimum that has a different energy.For this purpose, we report all cluster binding energies with the MB-pol potential at the corresponding cluster minima with that potential, while alerting the reader that these will be different than the ones previously reported by the MB-pol developers at the MP2/aVTZ geometries. 28,91The binding energies, D e , were computed as: where E cluster is the total energy of the cluster, E ref is the energy of a water monomer optimized with the same level of theory and basis set as E cluster was computed with, and n is the number of water molecules in the system.For many of the cluster sizes examined, the uncertainties in the CBS estimates were given in the original publications.However, for n = 7, 9, 10, 20, and 25, no uncertainties or error estimates were given in the published results.To remedy this, we have utilized the protocol suggested by Miliordos and Xantheas 82 to estimate the CCSD(T)/CBS or MP2/CBS limits and uncertainties from values at smaller basis sets.For n = 7, 9, and 10, the energies were computed at the MP2/aVDZ optimized geometries, so we have estimated the CCSD(T)/CBS and uncertainty from the CCSD(T)/aVDZ//MP2/aVDZ calculations.
For n = 20 we estimated the MP2/CBS limit and its uncertainty from the MP2/aVQZ//MP2/aVTZ single point energy calculations.For n = 25 we estimated the MP2/CBS limit from the MP2/aVDZ//MP2/aVDZ calculation according to the protocol suggested by Miliordos and Xantheas. 82The uncertainties obtained using this protocol (indicated in the plots as a shaded region) are in general consistent with the benchmark values originally reported.
Following the earlier discussion, we compare the binding energies with the many-body potentials to the ab initio binding energies (D e 's) and the ones obtained with the pairwise addition potentials to the zero-point energy corrected ones (D 0 's) (contained in the ESI † of this paper) and the enthalpies at T = 298 K, DH(298 K).The D 0 and DH(298 K) values for n = 2-10 were obtained from Temelso et al. 2011, 73 in which the anharmonic corrections were estimated by scaling the RI-MP2/aug-cc-pVDZ frequencies and thermal corrections by empirically-determined values.For the remaining cluster sizes (n = 11, 16, 17, 20, 25), we follow a similar protocol by scaling the B3LYP 92 /aug-cc-pVDZ 93 harmonic ZPEs by 0.976 and the thermal corrections to the enthalpies by 1.106, as recommended.It has been established that a high-order electron correlation is necessary for the accuracy of individual frequencies, whereas B3LYP performs comparably to the MP2 level of theory in the total zero point energy estimation. 94he harmonic zero-point energy corrections and corresponding thermal corrections to the enthalpies were computed using Gaussian 16 95 for all cluster sizes, except n = 20, 25 which were computed using the NWChem 6.8 electronic structure suite. 96he closeness of the optimized geometries (x i , y i , z i ) with the various potentials considered in this study to the respective reference ab initio geometries (x i,ref , y i,ref , z i,ref ), is captured by the root-mean-squares-deviation (RMSD), where n is the number of atoms in the water cluster.The RMSD values were minimized using the Kabsch algorithm, [97][98][99] to account for any translation or rotation of the molecule upon optimization.
Pairwise additive potentials.Gromacs 2020.2 was utilized for all optimizations using the pairwise additive potentials. 100The topology files for the TIP4P-ice 36 and OPC 35 pairwise potential were added manually.The TIP3P, 30 TIP4P, 30,31 TIP5P, 32 SPC, 33 and SPC/E 34 topology files used were the ones available within Gromacs.The structures were optimized using both mixed and double precision.The molecular geometries were constrained by the LINCS algorithm 101 to model rigid water molecules.The steep integrator was used to optimize the geometries with each of the potentials.The minimization was converged when the maximum force was either less than 0.005 kcal mol À1 Å À1 or until machine precision was reached (for the case of mixed precision).
Many-body potentials.Tinker 8.8 2020 [102][103][104] was utilized to optimize the clusters with the AMOEBA03, 47 AMOEBA14, 48 i-AMOEBA, 49 AMOEBA+, 50 AMOEBA + CF. 51 GAMESS (2019 R2) 105 was used for all of the effective fragment potential optimizations (EFP1, 42,43 EFP1-D 44 and EFP2-D 45 ).The optimizations with the MB-pol 54,55 potential were performed with the legacy MB-pol distribution of the code. 54The TTM2.1-F 37 and TTM3-F 38 optimizations were performed using the open-source distribution (TTM2.1-F:upon request from the authors, TTM3-F: https://www.pnnl.gov/science/ttm3f.asp).The MB-UCB 46 and WHBB 56 results were taken from published works 46,56 and, when not available, optimized using an in-house code from the respective development groups (see Acknowledgements).The binding energies using the CC-pol 57,58,68 potential were taken from published works for clusters sizes n = 2, 59 3, 57 and 6. 57 The code for q-AQUA 53 was obtained from the developers and optimized with the L-BFGS algorithm using the Scipy 106 python package until the maximum force was below 0.0001 hartree bohr À1 .b. Many-body expansion for the clusters n = 7, 10, 16, 17 The MBE was performed up to the 5th order (i.e., up to the 5-body) for water clusters of sizes n = 7, 10, 16, 17 (for the sphere/ ''internally solvated'' and 5525/''all surface'' isomers) 88,107 with the MB-UCB, q-AQUA, MB-pol, and TTM2.1-F potentials.The many-body terms for the potentials are computed both at the ab initio-optimized geometry (at which the reference values were calculated) and the potential-optimized geometries.The values obtained from the expansion with the many-body potentials are compared to the MP2 values with large basis sets.The manybody terms for n = 7, 10 were previously reported at the MP2/ aV5Z-V5Z level of theory. 60Although the previously reported ab initio MBE values were obtained for a slightly different conformation than the minimum energy structure examined in this study, the binding energies of these conformations including margins of error (À57.6 AE 0.2 kcal mol À1 and À93.1 AE 0.3 kcal mol À1 for n = 7 and 10, respectively) are o0.2 kcal mol À1 above the MP2/CBS at the minimum energy structures considered in this paper (see Fig. S1 in the ESI †).The MP2/CBS many-body terms for n = 16 were estimated using the BSSE-corrected MP2/ aVTZ values for the 1-, 3-and 4-body terms.It has been previously established for the n = 7 and 10 clusters that the MP2/aVTZ BSSEcorrected 4-body terms yield values that are very close (within 0.1 kcal mol À1 ) to the corresponding values at the MP2/aV5Z-V5Z level. 60The 2-and 3-body terms at the CBS limit were estimated by taking the average of the respective uncorrected and BSSEcorrected MP2 terms with the aug-cc-pVTZ 93 basis set.When this is performed for the n = 7 and 10 clusters (see Table S4 in ESI †), the CBS estimate for the 2-body term is within 0.7 kcal mol À1 and the 3-body term within 0.01 of the MP2/aV5Z-V5Z values. 60The reference MP2 values for the n = 16 cluster were calculated using the NWChem 6.8 electronic structure package. 96

c. Dipole moment analysis
The molecular dipole moments were computed with the TTM2.1-F 37potential for the n = 5, 8, 17, 53 and 102 water clusters resembling progressively increasing solvation shells around a single molecule (or solvated dimer in the case of n = 8) lying in the center of the cluster.(H 2 O) 5 , the simplest cluster mimicking the first solvation shell is a tetrahedrally coordinated water pentamer, is optimized at the MP2/aug-cc-pVTZ level keeping the O-O-O angles at tetrahedral values (109.51);note that this arrangement is not a stationary point on the PES and will collapse to the ring minimum upon full geometry optimization.Similarly, (H 2 O) 8 mimics a fully solvated water dimer (n = 8), in which the 3 + 3 water molecules solvating each of the two fragments are kept at tetrahedral orientations and the structure is optimized at the MP2/aug-cc-pVTZ level under the tetrahedral constraints; again, this is not a stationary point on the water octamer PES and will collapse to the cube minimum 71,108 upon full geometry optimization.The (H 2 O) 17 network corresponds to a monomer with two tetrahedrallycoordinated solvation shells optimized at the MP2/aug-cc-pVTZ level of theory, while (H 2 O) 53 and (H 2 O) 102 are water clusters with B3 and B4 solvation shells, respectively.These two last geometries were optimized at the PBE96 109 -D3 110 /def2svpd 111 level of theory.The Cartesian geometries of these n = 5, 8, 17, 53 and 102 clusters are listed in the ESI.† Note that accurate ab initio dipole moment surface for water has been previously reported and used in numerous calculations of IR spectra using the WHBB potential. 56,112Since the scope of our study is to provide a link between pairwise additive and many-body potentials via the inclusion of a permanent enhanced dipole moment in the former, we rely on trends in the magnitude of the dipole moment for different clusters environments.For that reason, we have chosen to report the dipole moments with the TTM2.1-Fpotential in Section III.e.The TTM2.1-F potential has 3 inducible point dipoles located at the atom sites (see reference for additional details). 37That said, the TTM2.1-Fpotential includes a static contribution, which accounts for the changes in the intramolecular geometry using the Partridge-Schwenke 1-body potential energy surface, 63 and a dynamic contribution from the point dipoles induced due to the field from the surrounding molecules that contribute to the total molecular dipole moment.This is in contrast to pairwise potentials which have a single, fixed, molecular dipole moment that is independent of the extended environment and it is larger than the gas phase value to attempt to ''fold in'' the many-body effects into the pairwise additive term.The reference benchmark binding energies of the water cluster minima for the various clusters considered in this study (see Fig. 1 for the various isomers) are compiled in Table 1.The values in italics indicate that the MP2 or CCSD(T) binding energies were obtained with a single basis set (typically aug-cc-pVTZ or larger) and were not extrapolated to the CBS limit.The remaining MP2 and CCSD(T) values correspond to binding energies at the CBS limit.For the most accurate comparisons, the CCSD(T)/CBS values, when available, were used as the reference values in the subsequent sections.When the CCSD(T)/CBS values were not previously available in the literature (for water cluster sizes n = 11 and 25), the MP2/CBS values were alternatively used as the references.However, in general the CCSD(T)/CBS and MP2/CBS values are very close to each other in the cluster regime considered here. 82Also note that when multiple reference values are available for a given level of theory, the values that were explicitly extrapolated to the CBS (rather than applying empirical corrections i.e., d CCSD(T) MP2 ) were selected as references.In this manner, the reference values from various sources are notably consistent.The ab initio D e and D 0 /DH(298 K) (in parentheses) values used as references for subsequent sections are indicated in bold in Table 1.

b. Cluster binding energies and geometries with pairwise additive potentials
The binding energies for each water cluster with the pairwise additive potentials considered in this study, obtained using Gromacs in double precision, are listed in Table 2.The results in mixed precision can be found in the ESI † (Table S1).The differences between the Gromacs implementation in mixed and double precision are small, yielding differences r0.05 kcal mol À1 in the binding energies.Because the double precision optimizations result in lower energy structures, we have opted to utilize these results for the succeeding analyses.As a measure of the closeness between the cluster geometries optimized with the potentials and the best available ab initio results (see references in Section IIa and Table 1), the RMSD between these two structures is listed in Table 3.This comparison has been previously used to quantify the differences between the MP2 and CCSD(T) geometries for clusters n = 2-6. 87he absolute energetic difference (kcal mol À1 ) and percent deviation from the best available reference values at the CCSD(T) or, when unavailable, at the MP2 level of theory corrected for thermally corrected enthalpies (DH(298 K)) using the scaled harmonic B3LYP/aug-cc-pVDZ frequencies are plotted in Fig. 2. A comparison of the binding energies with the pairwise additive potentials against the reference D e values (not corrected for ZPE) and D 0 (ZPE-corrected energies) can be found in the ESI † (Fig. S2-S5).It should be noted that the pairwise potentials yield results that align quite well with the D e values for cluster sizes n = 6-25, for which TIP4P and SPC yield errors of AE4%.However, since the potentials are fitted to reproduce enthalpies and other thermodynamic quantities, it is more appropriate to compare the potentials to zero-point energy corrected energies or thermally corrected enthalpies.This allows for a more appropriate comparison given that the pairwise additive potentials implicitly incorporate zero-point energy corrections (unlike the many-body potentials).Therefore, the apparent ''better agreement'' of the cluster binding energies with the pairwise additive potentials when the reference D e values are used (Fig. S2, ESI †) compared to the ones when the reference DH(298 K) values are used (Fig. 2) is for the wrong reason since it is based on the comparison of dissimilar quantities.
Since the effective two-body term of the pairwise potentials deviates from the true two-body (water dimer) term, it is expected that the binding energies of the small water clusters are significantly misestimated with the pairwise potentials relative to the reference DH(298 K) values.As the water cluster size grows, we see that better agreement with the benchmark reference values and a relatively constant percent deviation is achieved beyond n = 6.That said, the percent deviations exceed 13% for all of the potentials considered.It is worth pointing out that the binding energies with the OPC and TIP4P-ice potentials significantly overestimate the CBS reference binding energies, even for relatively large cluster sizes, a fact that is not surprising given similarities in their fitted parameters.Since TIP4P-ice is parameterized to better reproduce the phase diagram of the various forms of ice, while the original TIP4P potential targeted liquid water, this analysis demonstrates and to some extent quantifies the difference in the respective effective 2-B terms in these two regions of water's phase diagram and the inability to simultaneously reproduce both with a simple functional form and just two parameters (e and s of the Lennard-Jones term).In other words, the new parametrization in the TIP4P-ice potential brings the ice phase diagram closer to experiment compared to TIP4P but at the same time has a profound effect of being less accurate than TIP4P in the cluster regime.From the results of Table 3, it is also surprising that the TIP4P-ice potential is the least accurate in estimating cluster geometries, which can be thought of as closer to ice-like rather than liquid water structures.The TIP4P, SPC, TIP3P, and TIP5P potentials perform similarly and are grouped together B15-25% deviation from the benchmark values.
The reparameterization of SPC to yield SPC/E (extended SPC) included a polarization correction and tweaks to the point charges to improve upon the density, radial distribution function (RDF), and the diffusion coefficient. 34The SPC and SPC/E optimized geometries in this study aligned quite closely (see Fig. 3) according  for SPC/E), 34 we find the same to be true for the cluster energetics (see Table S5 in ESI †).While both potentials overestimate the cluster enthalpies, SPC/E does so to a greater extent.That said, the polarization correction added in SPC/E moves the energies toward a larger error.This suggests that the polarization correction (in SPC/E) is too large and produces inaccurate energetics for small to medium water clusters.Further, water potentials with the same number of point charge sites follow similar trends in the percent deviation in the cluster range n = 2-25.This suggests that the ability of these potentials to capture the interactions of various water cluster sizes and geometries is, to some extent, limited by the number of sites in the model.The 4-site models tend to perform most consistently across different hydrogen bonding arrangements having the smallest variation between isomers, as depicted in Fig. 3.In addition, the RMSD values comparing the optimized geometries between the potentials and the MP2 or CCSD(T) geometries (Table 3) show similar trends for potentials with the same number of charge sites.In general, the 3-site potentials exhibit the largest changes in structure (largest RMSD), indicating that these potentials struggle the most in reproducing the water cluster geometries.For the n = 3, 4 ring structures, the 3-site potentials optimize to a completely planar structure, unlike that of the CCSD(T) optimized structure.Further, none of the 3-site potentials yield an optimized structure resembling a prism for n = 6.Instead, upon optimization, one of the 3-membered rings opens resulting in a ''semi-prism''-or ''glove''-like structure.Similarities in the RMSD values are present in the other n-site models.Importantly, this showcases the limitation that a fixed number of point charges and their location puts on the ability of these water potentials to describe various hydrogen bonding networks, given that the charge density in the water molecule is not constant but varies across those different hydrogen bonding arrangements.
The water cluster configurations especially for the smaller (n o 17) clusters present a challenge for these pairwise additive potentials, because most water molecules in these smaller clusters are 2-or 3-coordinated, whereas many of these pairwise additive potentials were parameterized to reproduce bulk properties in which the average coordination of each water molecules is estimated to be between 4 and 5. 113 In other words, up to that cluster regime all atoms lie mostly on the surface of the cluster driven by the need to maximize hydrogen bonding.Despite the challenges presented by the cluster regime, the potentials produce energies that deviate from the reference by a consistent amount (albeit shift by 413% from the reference) beyond n = B6.

c. Cluster binding energies and geometries with many-body potentials
The cluster binding energies with the many-body potentials at the respective potential minima are listed in Table 4.For completeness, the corresponding results with the EFP1(RHF), EFP1(DFT), EFP1(RHF)-D3, EFP2(E6 + E7), EFP2(E6), AMOEBA03, AMOEBA14, iAMOEBA, and AMOEBA+ potentials are reported in the ESI † (Table S3).These potentials are variants (often older versions) of the families of potentials discussed in this section.
In general, these families of potentials perform progressively better along the development series.For the AMOEBA potentials (AMOEBA03 -AMOEBA14 -iAMOEBA -AMOEBA+ -AMOEBA + CF), we see similar behavior on water clusters n 4 10 (Fig. S6, ESI †).All potentials tend to underestimate the binding energies of large cluster sizes with AMOEBA + CF (the most recently developed potential) performing the best compared to earlier versions.Despite being the earliest potential in the development series, AMOEBA03 performs similarly to AMOEBA + CF.What it lacks is the marked consistency in the n = 5-10 range.The iAMOEBA potential underestimates the interactions of all water clusters examined in this study.However, iAMOEBA (''inexpensive AMOEBA'') is intended to be a faster alternative with a simpler functional form. 49That said, it is unsurprisingly that iAMOEBA performs less accurately than the other potentials for the binding energies.The Effective Fragment Potentials (EFP) perform better with increasing complexity and level of theory used to fit the potentials (see Fig. S7 in ESI †).EFP(RHF) and EFP(DFT) have quite large percent errors relative to the MP2/CBS or CCSD(T)/ CBS reference values (ranging from 2.2 to 12.3 and À47.7 to À28.3 kcal mol À1 , respectively).By adding Grimme's D3 dispersion correction to EFP(RHF), the potential now underestimates the total binding energies of the clusters (errors ranging from À18.3 to À5.3 kcal mol À1 ).It should be noted that the EFP1(RHF) and EFP1(DFT) potentials were intended to produce RHF and DFT (B3LYP) water-water interactions, not the MP2 and CCSD(T) interactions. 43The EFP2 potentials perform most accurately out of the EFP variants.This is expected given that the EFP2 potential explicitly includes additional physical interactions such as exchange-repulsion, dispersion, and charge transfer. 45Further, incorporating higher order terms in the dispersion correction (E6, E7, E8) produces the most accurate results with the EFP2(E6 + E7 + E8) typically performing most accurately.The Thole Type Models (TTM2.1-Fand TTM3-F) differ in the number of polarizable sites, the dipole moment surface (DMS), and the functional form of the pairwise dispersion interaction.TTM3-F was fitted to better produce the vibrational spectra of water clusters and liquid water by better accounting for the difference in the DMS in the gas phase (linear) versus liquid phase (nonlinear effective DMS). 38TM3-F has only one polarizable site per water molecule whereas TTM2.1-F has three, one on each atom site.TTM2.1-F outperforms TTM3-F on the energetics for most cluster sizes.However, TTM3-F produces notably accurate binding energies on the water hexamer (within 0.5% of the reference value for all hexamer isomers examined).WHBB-5 and WHBB-6 differ only in the order of polynomial used to fit the 3-body term.WHBB-5 (5th order polynomial) generally performs better than WHBB-6 on total binding energies, despite the slightly simpler functional form of the 3-body term.For n = 20, the difference in binding energy sometimes exceeds 10 kcal mol À1 with WHBB-6 underestimating the binding energies of large clusters.The remainder of the section will be focused on the best performing (often the most recently developed) variants of each family.
The absolute energetic difference (kcal mol À1 ) and percent deviation of the binding energies with the many-body potentials from the available most accurate ab initio values are depicted in Fig. 4. The results from the potentials that are explicitly fitted to reproduce the MBE (q-AQUA, MB-pol, WHBB, CC-pol) are in the right panels whereas the results from the implicit many-body potentials (TTM, HIPPO, MB-UCB, AMOEBA, EFP) are in the left panels.The MB-pol and AMOEBA + CF potentials perform exceptionally well in the small cluster regime with MB-pol yielding errors o1% for n = 2-8 and AMOEBA + CF yielding errors o0.6% for n = 3, 5-8.This is not surprising given that these potentials are parameterized on small (dimer, trimer, tetramer) water clusters.
Beyond n = 10 we see more pronounced inconsistencies in some water potentials.For example, MB-pol, WHBB-6, AMOEBA + CF, and HIPPO begin to drift upward, indicating an underestimation of the binding energies of larger water clusters.EFP2 also drifts upward and gives better estimates for larger clusters than smaller clusters.The potentials that perform most accurately on the larger water cluster regime (n = 10-25) include TTM2.1-F, and q-AQUA for which we have maximum errors within that range of B2.5% and B1.1%, respectively.A recent study 21 has confirmed these findings for the most stable isomer of the cage (H 2 O) 20  The RMSD values comparing the geometries obtained with these many-body potentials with the reference ones at the MP2 or CCSD(T) levels of theory are listed in Table 5.In general, the RMSD values with the many-body potentials are typically an order of magnitude smaller than the corresponding ones with the pairwise additive potentials, indicating a better proximity of the minimum cluster geometries to the respective reference ab initio ones.More specifically, q-AQUA, MBpol, and EFP2 have average RMSDs of 0.026, 0.046, and 0.059 Å, respectively.The potentials with the largest average RMSD are the TTM models (TTM3-F: 0.108 Å, TTM2.1-F: 0.132 Å)  While the total cluster binding energy provides useful comparisons, it does not itself afford insight into whether an agreement is inadvertent, i.e., whether the right answer is achieved for the right reason or through fortuitous cancellation of errors.This is particularly important for the many-body potentials, which are founded on the premise of the MBE.For this reason, we will examine the magnitude of the many-body terms for various cluster sizes with the MB-pol, q-AQUA, MB-UCB, and TTM2.1-F potentials and compare them with available ab initio results.The many-body expansion was performed for water clusters of size n = 7, 10, 16, 17 to compare the ability of the many-body potentials to accurately reproduce terms in the manybody expansion.Importantly, different cluster sizes shed light onto how well these potentials perform on different sizes of clusters and hydrogen bonding arrangements.Accurate reference values for the many-body terms of the n = 7, 10 clusters were reported recently 60 at the MP2/aV5Z-V5Z level.The n = 16 reference values were computed at the MP2/aVTZ level (BSSEcorrected).No reference values for the many-body terms are available for the isomers of the n = 17 cluster, however this cluster size was deemed important to examine since it represents the transition between the ''all surface'' and ''internally solvated'' water arrangements (Table 6).
The magnitudes of the MBE terms of water clusters n = 7, 10, 16 for the four many-body potentials considered here are shown in Fig. 6 along with the respective MP2 and CCSD(T) reference values.The left panels show the many-body terms at the ab initio optimized geometries and the right panels show the many-body terms at the respective potential-optimized geometries.By comparing the MBE at both the ab initio optimized geometry and potential optimized geometries different, we get two different perspectives.At the ab initio optimized geometry, we get a sense of how well the potentials reproduce the energetics at the same geometry.However, because this ab initio geometry is not the potential's minima, we also find it informative to examine how the MB terms compare at the minimum energy structure of the potential.
First, we will discuss the reference ab initio benchmarks used for n = 7, 10, 16.As previously discussed by Heindel This journal is © the Owner Societies 2023 et al., 60 the MP2/CBS estimates of the 2-body term is slightly smaller in energy than the CCSD(T)/CBS whereas the 3-body term is larger in energy (by o5% of the 3-body energy).For this reason, we have included CCSD(T)/aVTZ MBE results for the n = 7 cluster 60 to compare with the MP2 reference values.Additionally, the MP2/aVTZ values for the n = 7 many-body terms are shown to demonstrate the differences between a triple zeta (aVTZ) and the much larger five zeta basis set (aV5Z; oftentimes used as a representative of the CBS limit).The basis set size significantly affects the 2-body term, as previously reported. 60In contrast, the 3-and 4-body terms vary less with basis set size and even with the level of electron correlation. 60Because the 2-body term varies significantly with basis set size, it is more difficult to conclude which potential best describes the 2-body term.
We will now focus on the MBE at the ab initio optimized geometries (left panels of Fig. 6).Note that the MB-UCB values are not available at these geometries.The MB-pol, q-AQUA, and TTM2.1-F potentials describe the 1-body term identically with the Partridge-Schwenke potential energy surface, 63 so the computed 1-body terms at the same geometries are identical.For all clusters considered, we see that the 2-body term of TTM2.1-F is the largest and the q-AQUA and MB-pol values are very close in energy (q-AQUA is slightly lower by o0.8 kcal mol À1 ).Additionally, q-AQUA and MB-pol lie closest to the MP2/CBS value.As mentioned earlier, we expect the CCSD(T)/CBS to be slightly lower in energy than the MP2/CBS value which puts the q-AQUA and MB-pol 2-body term within that anticipated range.For the 3-body term, we see that the q-AQUA value aligns closely with the reference values with MB-pol o0.1 kcal mol À1 away for n = 7, 10 and B1.5 kcal mol À1 away for n = 16.The 3-body term of TTM2.1-F is even smaller indicating that this potential is underestimating the 3-body term with respect to the reference.Lastly, we find that the q-AQUA potential overestimates the 4-body term by 0.2-1.3kcal mol À1 with respect to the MP2/CBS reference values.The TTM2.1-F potential slightly overestimates the 4-body term (0.1-0.4 kcal mol À1 ) whereas the MB-pol potential slightly underestimates the 4-body term for n = 7, 10 but slightly overestimates the 4-body term for n = 16 relative to the MP2/CBS estimates.Now, let us examine how the MBE of the clusters change when the MBE is performed at the respective potentialoptimized geometries (right panels of Fig. 6).Note that the same ab initio reference values are used for the left and right panels.First, we notice that the 1-body term decreases significantly upon optimization (often by a factor of B1/2).Because each of the potentials have differing functional forms for the 2-body and higher terms, we see that MB-pol, q-AQUA, and TTM2.1-F all have different 1-body terms (different intramolecular geometries) at the respective potential-optimized geometries despite having the same functional form for the 1-body term.While each of these potentials underestimate the 1-body term relative to the MP2 reference values, this is, in part, due to the inherent overestimation of the deformation energy from MP2 relative to CCSD(T). 87Utilizing a CCSD(T)-optimized geometry to compute the 1-body term would provide a more accurate comparison of the 1-body term.The 2-body term of TTM2.1-F matches closely with the CCSD(T)/aVTZ value, however the CCSD(T)/CBS is expected to be lower in energy than the CCSD(T)/aVTZ value.This means that MB-pol, MB-UCB, and q-AQUA, which all yield 2-body terms within a few tenths of a kcal mol À1 from one another, likely produce 2-body terms closer to CCSD(T)/CBS.The 3-body terms differ for the potentials with MB-UCB being the most stabilizing followed by q-AQUA, MB-pol, and TTM2.1-F, in that order.MB-UCB aligns closely with the ab initio reference values (MP2/CBS: À12.19 kcal mol À1 , CCSD(T)/ aVTZ: À11.64 kcal mol À1 , MB-UCB: À11.95 kcal mol À1 for n = 7) at its respective optimized geometry.MB-pol and q-AQUA yield very similar 3-body terms for n = 7, 10.For n = 16, 17, q-AQUA yields larger 3-body terms than MB-pol by 41.5 kcal mol À1 .The TTM2.1-F potential underestimates this 3-body interaction.Interestingly, while TTM2.1-F has shown consistency in reproducing the total ab initio reference cluster binding energies across the n = 2-25 range, this agreement is fortuitous since it overestimates the 2-and underestimates the 1-and 3-body terms by about the same amount resulting in their deviations from the ab initio reference values to cancel out upon their summation.For the 4-body term, however, TTM2.1-F is in near perfect agreement with the MP2 benchmarks (À0.841/À0.841,À1.878/ À1.709, À2.142/À2.164) at its optimized geometry.The q-AQUA and MB-UCB potentials overestimate the 4-body interaction at their respective optimized geometry relative to the reference 4-body values.MB-pol typically yields 4-body energies that are slightly smaller than the reference values.

e. Connecting pairwise additive potentials with many-body potentials through the molecular dipole moment
The consistent performance of the pairwise additive on water cluster binding energies (especially n 4 6) implies that the underlying physical interactions in water could be sufficiently described with an effective pairwise interaction.As described earlier, to provide a connection between many-body and pairwise additive potentials, we have performed a dipole moment analysis on water clusters of increasing solvation.Molecular dipole moments were computed using the TTM2.1-Fpotential for the following systems: the singly solvated monomer, the singly solvated dimer, the doubly solvated monomer  5 RMSD (Å) comparing the geometries optimized with the various many-body potentials (q-AQUA, MB-pol, TTM2.1-F,TTM3-F, AMOEBA + CF, EFP2) against the MP2 or CCSD(T) reference geometries.Note that we do not report the RMSDs for the WHBB-5, WHBB-6, MB-UCB potentials since the binding energies were taken from previously published results.The rightmost column shows the level of theory and basis set used to optimize the structure that is used as the reference  114 we see that molecules with higher coordination numbers tend to have higher molecular dipole moments (i.e., AADD and AAADD in Fig. 9).Interestingly, the pairwise additive permanent dipole moments (2.18-2.48D) 35,115 for the potentials considered in this study are all smaller than those estimated for liquid water 13,116 and those calculated for water clusters here.Some pairwise additive potentials incorporate polarization corrections (i.e., SPC/E) which has been shown to improve performance on the liquid regime.However, this causes an even larger overestimation in the binding energies of the water clusters in this study, in contrast to SPC (without that polarization correction).Because we see differences in the average molecular dipole moment in different water cluster sizes and different phases of water (in agreement with previous works), 13,114,117 this helps to explain the limitations of pairwise additive potentials (with a constant dipole moment) in successfully transferring to different system sizes.More specifically, the pairwise potentials incorporate an enhanced (with respect to the isolated monomer value) dipole moment in a ''static'' fashion, meaning that this enhanced dipole moment is achieved through larger (permanent) charges on the atom sites.These enhanced molecular dipole moments are, however, not environment dependent.For this reason, the enhanced permanent dipole moment   This journal is © the Owner Societies 2023 exhibited by the pairwise potentials is inappropriate for describing the small water cluster environments, for which we observe much smaller molecular dipole moments (using the TTM2.1-Fpotential) than we do for large clusters.The flexible pairwise potentials will include some environment dependence, given that the R OH values can change depending on the surrounding environment.However, this will not remedy the fact that the enhanced molecular dipole moment is inappropriate for small  clusters, limiting the applicability of these potentials to the small cluster regime.Ultimately, this demonstrates the need for an environment dependent, inducible molecular dipole moment to ensure the transferability to a wide range of cluster sizes and phases.Nevertheless, our analysis supports the use of larger dipole moments (through the choice of charges) for the pairwise additive potentials, as this is viewed as an attempt to fold in the many-body terms into an effective 2-body term.

IV. Conclusions
By utilizing high-level MP2 and CCSD(T) benchmarks for the binding energies and geometries of water clusters, we gain insight into how classical interaction potentials for water perform in a relatively wide range of cluster sizes that lies outside their parametrization range.The pairwise potentials considered here were found to severely overestimate the binding energies of small water clusters.Given that these potentials only exhibit an effective pairwise interaction (effective 2-body term), this behavior is somewhat expected.However, beyond n = B6 the performance of these potentials improves slightly, and the energies deviate from the reference values by a relatively consistent percentage for each respective potential (the smallest error being 13%).Interestingly, the results from the pairwise additive potentials align more closely to the D e binding energies with the TIP4P and SPC potentials producing energies that lie within 4% of the reference ab initio values for n = 6-25.However, because these pairwise potentials implicitly incorporate zero-point energy corrections in their functional form (unlike the many-body potentials), it is more appropriate to compare to zero-point corrected energies or thermally corrected enthalpies, despite the better apparent better alignment to the D e binding energies, which can be considered as fortuitous.The many-body potentials perform more accurately than the pairwise additive ones over the entire n = 2-25 range.Notably, for cluster sizes n = 2-8 the MB-pol potential produces cluster binding energies that are within 1% of the reference ab initio values.Additionally, AMOEBA + CF produces binding energies within 1% of the reference values for the n = 3, 5-8 and q-AQUA produces binding energies within B1% of the reference values for all clusters examined.However, beyond that size range (n 4 10), the MB-pol, AMOEBA + CF, HIPPO, and WHBB-6 potentials consistently underestimate the binding energy of larger clusters.The potentials that perform most consistently across the entire n = 2-25 range include q-AQUA and TTM2.1-F which all give binding energies within B2.5% of the reference ab initio values.
The MBE analysis of the water cluster binding energies, carried out with the TTM2.1-F,MB-UCB, MB-pol, and q-AQUA many-body potentials, proved a valuable tool to further analyze the components of the total interaction.Because of the high number of individual n-body terms that are summed to get the total n-body term, this proves to be a challenging task.However, this test is valuable because it reveals errors in the many-body potentials that are not obvious from the comparison of the binding energy alone.For example, in terms of the absolute binding energy, the TTM2.1-Fpotential performs well, especially for larger water cluster sizes, which tend to be underestimated by MB-pol, AMOEBA + CF, WHBB-6, and HIPPO.However, upon analyzing the magnitudes of the many-body terms of the total interaction, it was found that TTM2.1-F overestimates the 2-body and underestimates the 1-and 3body terms so the errors in those major components of the total interaction cancel out.It is this fortuitous cancellation of errors that allows TTM2.1-F to give a better representation of the total binding energy producing binding energies that are within 2.5% across the entire range of cluster sizes examined.MB-UCB, on the other hand, is the most inconsistent among the above three many-body potentials with regards to the total binding energy, producing values within AE7% of the references.However, the composition of the total energy aligns closely with the MP2 reference values for the MBE of the clusters considered.MB-pol and q-AQUA perform very well in the many-body expansion at the ab initio optimized geometry.They differ most significantly in the 4-body energy.The MB-pol potential uses the TTM4-F potential to describe the 4-body term whereas the q-AQUA potential uses a 4-body term fitted to ab initio 4-body terms.At the potential-optimized geometries, MB-pol tends to underestimate the 3-body term more with increasing cluster size (with respect to MP2), which may be the cause of the drift in the total binding energy for clusters n 4 10.The 2-body terms are quite similar for the q-AQUA potential and the MBpol potentials.However, q-AQUA tends to have a larger 3-body and 4-body term which helps it to give better binding energies for large clusters in addition to small clusters.However, q-AQUA is lacking a 5-body term which, while small, can contribute a few tenths of a kcal mol À1 for n 4 16.Overall, our analysis demonstrates the immense scientific progress that has been made in the field of potential development.Despite many different strategies in the development of many-body potentials, we find good performance of the potentials on the binding energies of water clusters n = 2-25 and on the MB terms.The fact that MB-UCB very closely represents the MBE demonstrates the promise of implicit many-body potentials in both accurately describing the MBE (as accurately as explicit many-body potentials) while at the same time offering a much simpler and faster alternative (for example, TTM3-F and MB-pol differ by B7Â in a single step of a MD simulation with 256 water molecules). 27astly, the analysis of the molecular dipole moments of water clusters with 1 to B4 solvation shells using TTM2.1-Fdemonstrates the influence of the local environment (first solvation shell) and extended environment (subsequent solvation shells) on the molecular dipole moment of water.In agreement with Kemp and Gordon, 114 we see that molecules with a lower coordination number tend to have smaller dipole moments.In addition, we see that the internally solvated water molecules have a generally increasing molecular dipole moment as we increase the number of solvation shells.This justifies the use of an enhanced molecular dipole moment in the pairwise additive potentials to model larger water aggregates.However, this also a. Reference ab initio binding energies of water clusters n = 2-11, 16, 17, 20, 25

Fig. 2
Fig. 2 Absolute energy difference (top, kcal mol À1 ) and percent deviation (bottom) of the binding energy with the pairwise additive potentials (TIP3P, TIP4P, TIP5P, TIP4P-ice, SPC, SPC/E, OPC) relative to the DH(298 K) reference value for the lowest energy structures for water cluster sizes n = 2-11, 16, 17, 20, 25.The points for which the potential optimizes to a different structure (NSP in Table2) have been excluded.The shaded gray regions represent the uncertainty in the extrapolation to the CBS in the D e calculation.
cluster, for which TTM2.1-F overestimates the binding energy by 3.4 kcal mol À1 (1.7%) of the CCSD(T)/CBS value, whereas MB-pol underestimates it by 4.1 kcal mol À1 (2.1%) (cf.Fig. 3 of ref. 21).Nonetheless, all many-body potentials produce binding energies within 7% of the CBS reference values in the range n = 2-25.However, across the whole range of cluster sizes considered, we see that potentials perform optimally in different regions of cluster sizes.The percentage deviation from the CBS values for the various isomers of the n = 6, 8, 16, 20 clusters is shown in Fig. 5. MB-UCB exhibits inconsistencies for open structures including the ring and cage hexamers, and the pentagonal dodecahedron (n = 20).WHBB-5 underestimates the binding energy of the ring hexamer.Overall, the rest of the potentials (MB-pol, AMOEBA + CF, TTM2.1-F,TTM3-F, EFP2) perform consistently with different hydrogen bonding arrangements, varying by at most 3% across the isomers of a given water cluster size.

Fig. 4
Fig. 4 Absolute energy difference (top, kcal mol À1 ) and percent deviation (bottom) of the binding energy (D e ) of the many-body potentials relative to the CBS value for the lowest energy structure for water cluster n = 2-11, 16, 17, 20, 25.The results with the implicit many-body potentials (EFP2, TTM2.1-F,TTM3-F, AMOEBA + CF, MB-UCB, HIPPO) are in the left panels and the results with the explicit many-body potentials (MB-pol, WHBB-5, WHBB-6, q-AQUA, CC-pol) are in the right panels.The shaded gray regions represent the uncertainty in the extrapolation to the CBS in the D e calculation.
, and systems with roughly B3 (n = 53) and B4 (n = 102) solvation shells.The geometries of these clusters are shown in Fig. 7.The singly and doubly solvated clusters were optimized under constrained tetrahedral (109.51)O-O-O angles.Histograms showing the distributions of the molecular dipole moments in each system of interest is depicted in Fig. 8.For the larger water clusters, we observe a larger average molecular dipole moment and range of dipole moments.Further, we notice a difference in the distributions between the tetrahedrally coordinated (n = 5, 8, 17) and fully relaxed (n = 53, 102) water clusters.The tetrahedrally-coordinated systems more clearly separate the dipole moments of the molecules in different solvation shells.The molecular dipole moments shown inTable

Fig. 6
Fig. 6 Magnitude of many-body terms (kcal mol À1 ) at the ab initio-optimized geometry (left) and potential-optimized geometries (right) up to the 4-body for n = 7 (top panels), n = 10 (middle panels), and n = 16 (bottom panels).The MP2 and CCSD(T) reference values are shown in grayscale.

Fig. 7
Fig.7Geometries of the water clusters used in the molecular dipole moment analysis with the TTM2.1-Fpotential.From left to right, the singly solvated monomer, singly solvated dimer, doubly solvated monomer, a cluster with B3 solvation shells, and a cluster with B4 solvation shells.The singly and doubly solvated systems are constrained to enforce tetrahedral binding angles.The larger systems were optimized without any constraints.The blue shaded region indicates the central molecule(s).

Fig. 8
Fig.8The distribution of the molecular dipole moments (D) calculated by the TTM2.1-Fpotential.The average molecular dipole moment is indicated by the black vertical dotted line for each cluster.The dipole moment of the central water molecule (indicated by a blue shaded area in Fig.7) is indicated by a red vertical dashed line.

Fig. 9
Fig. 9 The molecular dipole moments plotted against the distance from the center of mass of a molecule to the center of mass of the water cluster with the TTM2.1-Fpotential.The colors indicate the hydrogen bonding environment of that water molecule (A: hydrogen bond acceptor, D: hydrogen bond donor).

Table 3 RMSD
(Å) comparing the geometries optimized with the various pairwise additive potentials against the MP2 or CCSD(T) reference geometries.NSP (not a stationary point) means that the water cluster optimized to a different structure.The rightmost column shows the level of theory and basis set used to optimize the structure that is used as the reference to the RMSD values.Both models have a H-O-H angle of 109.51 and O-H bonds of 1.0 Å with slightly different point charges on the atoms resulting in an increased dipole moment for SPC/E (2.27 D vs. 2.35 D for SPC).SPC and SPC/E have been shown to produce potential energies that differ by B9% (reported potential energy of MD simulations: À37.7 kJ mol À1 for SPC and À41.4 kJ mol À1