Understanding three-body contributions to coarse-grained force-fields

Coarse-graining (CG) is a systematic reduction of the number of degrees of freedom (DOF) used to describe a system of interest. CG can be thought of as a projection on coarse-grained DOF and is therefore dependent on the functions used to represent the CG force field. In this work, we show that naive extensions of the coarse-grained force-field can result in unphysical parametrizations of the CG potential energy surface (PES). This issue can be elevated by coarse-graining the two- and three-body forces separately, which also helps to evaluate the importance of many-body interactions for a given system. The approach is illustrated on liquid water, where three-body interactions are essential to reproduce the structural properties, and liquid methanol, where two-body interactions are sufficient to reproduce the main features of the atomistic system.

Coarse-graining (CG) is a systematic reduction of the number of degrees of freedom (DOF) used to describe a system of interest. CG can be thought of as a projection on coarse-grained DOF and is therefore dependent on the functions used to represent the CG force field. In this work, we show that naive extensions of the coarse-grained force-field can result in unphysical parametrizations of the CG potential energy surface (PES). This issue can be elevated by coarse-graining the twoand three-body forces separately, which also helps to evaluate the importance of many-body interactions for a given system. The approach is illustrated on liquid water, where three-body interactions are essential to reproduce the structural properties, and liquid methanol, where two-body interactions are sufficient to reproduce the main features of the atomistic system.
Coarse-graining (CG) is a systematic way of reducing the number of degrees of freedom describing a specific physical system. A typical but by no means complete list of coarsegraining procedures includes (i) a renormalization group analysis in the vicinity of a critical point, where degrees of freedom (e.g. spins) are blocked together; 1 (ii) the formulation of system dynamics in terms of a master equation, with the entire phase space represented by a few states; 2 (iii) parametrizations of classical force-fields, in which electronic degrees of freedom are incorporated into classical interaction potentials; 3 (iv) united-atom or coarser classical particle-based force fields, with light atoms (e.g. hydrogens) incorporated into the heavier ones. 4 Coarse-graining often relies on a certain time-scale separation: due to some parts of the system evolving on a significantly slower timescale than others. For example, diffusion and rotation of molecules, or parts of molecules, is a much slower process than a characteristic bond vibration. It is then possible to combine several coherently moving atoms into a single interaction site. This reduces the computational costs in several ways. First, the coarsegrained system has less degrees of freedom. Second, smoother (softer) interaction potentials result in a smaller friction, hence, faster dynamics, which can now be propagated using a bigger simulation time step. Though the connection between atomistic and coarse-grained dynamics is rather non-trivial, [5][6][7][8] it often helps to reach ten to a hundred times longer simulation times.
The coarse-graining procedure in itself involves three steps: choice of coarse-grained degrees of freedom and a mapping of the fine-to the coarse-grained description, identification of a merit function which quantifies the difference between the fine-and coarse-grained representations, and determination of the coarse-grained PES. Consistency between the coarse-grained and the fine-grained models requires consistency of the equilibrium probability densities which, given a specific mapping, results in unique expressions for the CG masses and interaction potential. 9 Evaluating this many-body potential of mean force (PMF) is, however, as computationally demanding as propagating the fine-grained system, counteracting the idea of coarse-graining to reduce computational cost. In practice, the many-body PMF is approximated with or, in other words, projected onto basis functions that are used to represent the coarse-grained bonded (bond, angle and dihedral) and non-bonded interactions.
By choosing an appropriate projection operator one can, for example, reproduce structural quantities of the atomistic system. Iterative Boltzmann inversion (IBI), 10 inverse Monte Carlo (IMC), 11 or relative entropy minimization 12 schemes have been developed for this purpose. An alternative route is to match the forces of the CG system to the ones of the atomistic description, employing the force matching (FM) or multiscale coarse-graining (MS-CG) methods. 9,13 FM can also be connected to structure based CG via YBG theory. 14,15 Non-bonded interaction potentials of practically all classical force-fields are represented by pair potentials. This limits the CG force-field functions in the out-of the box molecular dynamics packages, and hence the accuracy of the CG model. Indeed, for locally structured liquids, where liquid water is a prominent example, as well as mixtures, CG pair potentials fail to capture the structural properties. [16][17][18][19][20] In this respect the many-body terms become important: by adding a short-ranged non-bonded three-body potential one can significantly improve the structural and thermodynamic properties. 16,17,[21][22][23][24][25] An alternative way of incorporating many-body effects is by using local density dependent potentials 26 or by using machine learning to predict many-body contributions. 27 Let us, however, re-examine the effect of short-ranged three-body potentials, such as the three-body Stillinger-Weber (SW) potential, 28 on the structure and thermodynamics of liquid water. 16,[21][22][23][24] Note that in our case f (θ ijk ) is a tabulated function of θ ijk . In what follows, all coarse-grained MD simulations are performed using the LAMMPS package, 29 atomistic simulations using the GROMACS 30 package, and coarse-graining using the VOTCA package. 31 We employ the SPC/E water model 32 and the OPLS-AA force field 3,33 without constraints for methanol. More details are provided in the Supporting Information.  2-body IBI, 2-body force matching, 2-and 3-body force matching refer to the three different CG parametrizations: the iterative Boltzmann inversion of the radial distribution function, force-matching with a tabulated pair potential, and force matching with the tabulated pair potential, as well as, the three-body SW potential. Fig. 1(b) illustrates how the incorporation of the three-body potential leads to a significant improvement of the CG model compared to the CG model with only two-body FM interactions. In fact, not only the structure, represented by the radial distribution function in Fig. 1(b), but also the thermodynamic properties are significantly improved after extending the CG basis set. It turns out, however, that the addition of the SW potential leads to a significantly more attractive two-body potential (see Fig. 1(a), dotted green line) as compared to the CG models with pair potentials only (see Fig. 1(a), solid red line (IBI) and dashed blue line (FM)). 16,31 The same holds for liquid methanol which is shown in the supporting information. This is surprising, since the corresponding peaks in the radial distribution functions have the same height (solid red and dotted green lines in Fig. 1(b)). One might even jump into conclusion that the three-body term is as important as the twobody one, i.e. it cannot be treated as a part of a perturbative expansion of PMF into the many-body potentials.
In order to understand this discrepancy, let us examine the pair potential of mean force, U PMF (r) = − r 0 F r (r ′ ) dr ′ between two coarse-grained sites. Here F r (r) is the radial component of the total force on a CG bead averaged over all pairs of CG beads with distance r. This force is evaluated in the CG simulation run for different CG interactions or in the atomistic simulation by employing the CG mapping. Note that U PMF (r) is equivalent to the radial distribution function, since g(r) ∼ exp(−U PMF (r)/k B T ). It is, however, easier to interpret its decomposition on two-and three-body contributions.
Such a decomposition is shown in Fig.2 for liquid water and methanol. Here, strong attractive forces manifest themselves in a large dip in the two-body contribution to the PMF within the first coordination shell (dashed line). This attraction is, however, compensated by a short-range repulsive force coming from the three-body SW interaction (dotted line). Similar compensation can be observed for liquid methanol. Here, both the radial distribution function and the two-body PMF are perfectly reproduced by the coarse-gained force-fields with or without the inclusion of the three-body term (parametrizations using pair potentials only are shown as crosses and dots in Fig. 2(b)). Hence, one might anticipate that the contribution of the three-body term should be very small, which is obviously not   Figure 3. Splitting of the CG two-body potentials of mean force for different CG parametrization schemes for (a) SPC/E water and (b) methanol. The results shown are for the concurrent two-body and three-body FM parametrization and the three-body FM parametrization using the residual force, ∆fFM. The same trend is observed for the IBI pair force, ∆fIBI. the case. This indicates that the SW three-body potential is "not orthogonal" to the pair potential when the liquid structure is chosen as a target observable. This results in the motivation to test a different parametrization scheme. Following the idea of the Gram-Schmidt orthogonalization, we now parametrize the three-body SW potential using the residual force ∆f i , i.e., after subtracting the two-body force from the total force on each CG bead i:

2-body IBI 2-body FM 2-and 3-body FM
The two-body force is the force obtained using either the FM or the IBI parametrization with pair potentials only.
The results are shown Fig. 3. Both for water and methanol, the parametrization of the residual force significantly reduces the three-body contribution to the two-body PMF. As a result, the attractive part of the two-body contribution to PMF is also reduced. It is of course important to check whether the parametrization using the residual force is still capable of reproducing the liquid structure. In Fig. 4, we compare the radial distribution functions and the angular distributions of CG beads within the first coordination shell for several parametrization schemes for liquid water. The completely unconstrained parametrization performs slightly better than the one with the residual force, but the overall agreement (especially taking into account the results without any three-body contributions) is satisfactory. In fact, for liquid methanol (see the supporting information) the agreement is very good.
It is known that the incorporation of three-body terms improves the thermodynamic properties of CG water. 16,21,23 Here, we assess the performance of the CG models in terms of the internal pressure p and the enthalpy of vaporization ∆H. Details can be found in the supporting information. The internal (virial) pressures of all CG models are given in Table I. Ideally, the pressure of the CG models (simulated at the density of the atomistic model) should be 1 atm. It can be clearly seen that the CG models with two-and threebody interactions parametrized simultaneously still yield the best results, both for water and methanol. The pressure reduction, to a large extent, comes from the significantly more attractive pair potential (see Fig. 1). In fact, for liquid methanol the addition of the shortranged three-body potential parametrized using the residual force even leads to slightly inferior results compared to both two-body parametrizations. Note that we do not use any pressure corrections nor any explicit pressure matching schemes. 35 We also evaluate the enthalpy of vaporization, for which experimental values are readily available. 23,34 Table I lists the values of the enthalpy of vaporization taking into account the actual average liquid pressure of each CG parameterization. Essentially, the same conclusions can be drawn as comparing the pressure of the different CG models: The best values are obtained for the CG models with two-and three-body interactions parametrized simultaneously. Note that the overall discrepancies of the enthalpies of vaporization can not be attributed to the atomistic force fields (SPC/E and OPLS) as the all-atom results are close to the experimental values in accordance with previous work. 23,34 To conclude, we have shown that adding the Stillinger-Weber three-body potential to the coarse-grained force-field leads to a strongly attractive pair interaction potential. Shortrange attraction is then compensated by the three-body contribution in a way that the total pair potential of mean force does not change. Parametrization of the three-body term with residual forces reduces the three-body contribution to the pair potential but at the same time worsens thermodynamic properties of the coarse-grained model, quantified by pressure and vaporization enthalpy. Our work indicates that the three-body Stillinger-Weber potential is not optimal for capturing three-body interactions in soft condensed matter systems. The proposed approach helps to better understand the importance of many-body terms in the coarse-grained force field and paves the way to the development of computationally efficient many-body potentials tailored to reproduce both thermodynamic and structural properties.