A transferable double exponential potential for condensed phase simulations of small molecules

The Lennard–Jones potential is the most widely-used function for the description of non-bonded interactions in transferable force fields for the condensed phase. This is not because it has an optimal functional form, but rather it is a legacy resulting from when computational expense was a major consideration and this potential was particularly convenient numerically. At present, it persists because the effort that would be required to re-write molecular modelling software and train new force fields has, until now, been prohibitive. Here, we present Smirnoff-plugins as a flexible framework to extend the Open Force Field software stack to allow custom force field functional forms. We deploy Smirnoff-plugins with the automated Open Force Field infrastructure to train a transferable, small molecule force field based on the recently-proposed double exponential functional form, on over 1000 experimental condensed phase properties. Extensive testing of the resulting force field shows improvements in transfer free energies, with acceptable conformational energetics, run times and convergence properties compared to state-of-the-art Lennard–Jones based force fields.


S1 The Buckingham 6-8 Potential
The water model used as a starting point for our DE fit in this work is a four-point model, employing the Buckingham 6-8 non-bonded potential: 1 U B68 (r) = Aexp(−br) − f damp,6 (r) C 6 r 6 − f damp, 8 (r) Here, Pauli repulsion is modelled by the first exponential term, and the leading order terms of the attractive dispersion interaction (C 6 and C 8 ) are retained. 2 The basic form of the (ζr) k k!
with ζ = 35.8967 nm −1 for the B68 water model. 1 Figure S1 shows the O-O interaction potential for the B68 water model, using previously published A, b, C 6 and C 8 force field parameters. 1 Also plotted is a DE potential with parameters obtained from a non-linear least-squares fitting to the B68 potential. The resulting DE-B68 parameters (fit) are tabulated in Table S1. As expected from the overall shapes of the potential energy surfaces ( Figure 1 in main text), the DE fit parameters are similar to the previously used DE-TIP3P parameters. 4 In particular, the DE-B68 potential has a nearly identical O-O equilibrium separation distance (r m ) and a larger well depth (ϵ), as compared to DE-TIP3P.
Table S1 also shows the same parameters after training to water condensed phase properties (DE-B68 (trained)). Little change from the initial fit parameters are required to recover the condensed phase data, as expected since the underlying B68 model has been trained against the same data. Also noteworthy is the similarity between the final, trained electrostatic parameters (O-X and q H ) and those of the LJ-based TIP4P-FB water model. 2 We have also tested the performance of our implementation of the DE-B68 water model, accessed through the smirnoff-plugins interface to OpenMM. 5 For a box of 1000 4-point water molecules, we obtain throughput of 225 ns/day with a Tesla V100 32G NVLink 2.0.
This compares favourably with a throughput of 130 ns/day for the more complex B68 model on the same resources. Thus, the flexibility of the DE functional form has the potential to combine good accuracy (see Figure 2 in main text) with acceptable computational cost.

S2 Analysis of the co-optimised DE-B68 water model
In the training of the DE-FF non-bonded parameters, we chose to co-optimise the parameters of the DE-B68 water model (starting from the trained parameters in Table S1). It is possible that the parameters will significantly change such that it is no longer suitable as a pure liquid water model. To guard against this possibility, we also included pure water density target data in the fitting of DE-FF. Figure S5 confirms that the final water model is still sufficiently accurate. The biggest change is in the density (probably because we did not change the electrostatics of the water model), but even this changes by less than 1 % at room temperature. From the overall statistics presented in Table 1 in the main text, we also separated out the enthalpy of mixing data for mixtures involving water (Table S2 and Figure S6), and those not involving water (Table S3 and Figure S7). These confirm that much of the improvement in this quantity comes from the improved description of interactions between small molecules and water, described by the DE-FF and co-optimised DE-B68 models respectively.

S4 Convergence of Free Energy Calculations
The widely-used LJ soft-core potential is given by: 10 where r sc,ij = r m,ij (0.25(1 − λ) + (r ij /r m,ij ) 6 ) 1 6 . Figure S16(a) shows a typical variation of the potential energy surface with λ, including a gradual decrease to zero as the potential is turned off. Since the DE-FF has no singularity at r = 0, we might expect to be able to linearly interpolate the potential to zero: However, scaling the DE-FF potential in this way leads to a sudden reduction in the potential at small values of λ ( Figure S16(b)). This would manifest as poor convergence in free energy calculations, due to poor phase space overlap at the λ end states ( Figure S17).
Instead, we choose a scaling of the DE-FF that more closely resembles the behaviour of the LJ soft-core potential: where α s = (1.1 + λ(α − 1.1)) and β s = (1 + λ(β − 1)), which reduces the difference in the values of α and β to 0.1, further softening the potential during the scaling. Figures S16 and S17 confirm smooth scaling of the potential to zero and good overlap at all λ windows. We note that this scaling is not necessarily optimal, and further convergence improvements may be possible. Table S7 further investigates the convergence of the aqueous and non-aqueous free energy calculations with the employed λ schedule for the LJ soft-core and DE-FF force fields. In general, the variation across triplicate runs and convergence with the number of λ windows is slightly better for the LJ soft-core potential. However, all calculations are well converged even using just six evenly-spaced windows.
Finally, it has been previously noted that the soft-core LJ potential can lead to unwanted minima at intermediate alchemical states, in which two atoms become trapped at short distances from each other. 11 Figure S18 illustrates this situation for λ = 0.5. With the unmodified LJ potential, the two atoms are strongly repelled, but with the soft-core LJ potential there is a region at small r with small or zero force between the atoms. In contrast, the DE-FF potential remains repulsive at short range, but is softer than the unmodified LJ potential, which should lead to more robust free energy calculations. 11 Table S7: Convergence of free energy results with number of λ windows. The free energy (kcal/mol) required to annihilate a molecule of ethanol in aqueous and non-aqueous (2-Methylpyridine) solvent is computed using 6, 11 and 16λ windows. The calculations are performed in triplicate, using a simulation length of 2 ns, and the standard deviation across the runs is reported in parentheses. Figure S17: Overlap matrices extracted from non-aqueous free energy calculations of ethanol in 2-methylpyridine for 11 equally-spaced λ windows for the vdW annihilation stage. a) Sage shows good overlap as expected with the soft-core LJ potential. b) DE-FF shows poor overlap with the end state when using linear scaling due to the sudden disappearance of the potential at small λ. c) This is corrected using our scaled protocol (eq 5), which shows good overlap. 22

S5 Supplementary Computational Details
All simulations were run with OpenMM-7.6.0 on the cuda platform. 5 All liquid simulation boxes were created using PackMOL 12 via OpenFF-Evaluator. The systems were parameterised using the openff-toolkit-0.10.6 and smirnoff-plugins-0.0.2. ForceBalance-1.9.2 was used to optimise the non-bonded parameters via an interface to OpenFF-Evaluator-0.3.11. 13 A non-bonded cutoff distance of 9Å was used in all liquid simulations and a switching function was applied to smoothly reduce the vdW potential to zero over the last 1Å. The convergence of the solvation-free energy calculations with respect to the cutoff range was investigated to ensure the correct application of the long-range correction with the custom functional form ( Figure S19). All statistical performance metrics are reported with 95% confidence intervals from 1000 iterations of bootstrapping with replacement.  Figure S19: Dependence of the aqueous solvation free energy of methanol with non-bonded cut-off distance. Through the use of the long-ranged correction, the free energy is converged after 7Å. Figure S20: Regression tests for absolute hydration free energies in Absolv. Free energy changes are computed in triplicate in Absolv using non-equilibrium (neq) and equilibrium procedures, with independent replicas (eq-indep) and replica exchange (eq-repex). Also shown for comparison are the same quantities computed using the Yank 14 and Gromacs 15 software packages.