Open Access Article
This Open Access Article is licensed under a
Creative Commons Attribution 3.0 Unported Licence

Improved reweighting protocols for variationally enhanced sampling simulations with multiple walkers

Baltzar Stevensson * and Mattias Edén *
Department of Materials and Environmental Chemistry, Stockholm University, SE-106 91 Stockholm, Sweden. E-mail: baltzar.stevensson@mmk.su.se; mattias.eden@mmk.su.se

Received 29th August 2022 , Accepted 2nd July 2023

First published on 3rd July 2023


Abstract

In molecular dynamics simulations utilizing enhanced-sampling techniques, reweighting is a central component for recovering the targeted ensemble averages of the “unbiased” system by calculating and applying a bias-correction function c(t). We present enhanced reweighting protocols for variationally enhanced sampling (VES) simulations by exploiting a recent reweighting method, originally introduced in the metadynamics framework [Giberti et al. J. Chem. Theory Comput., 2020, 16, 100–107], which was modified and extended to multiple-walker simulations: these may be implemented either as “independent” walkers (associated with one unique correction function per walker) or “cooperative” ones that all share one correction function, which is the hitherto only explored option. When each case is combined with the two possibilities of determining c(t) by time integration up to either t or over the entire simulation period image file: d2cp04009c-t1.tif, altogether four reweighting options result. Their relative merits were assessed by well-tempered VES simulations of two model problems: locating the free-energy difference between two metastable molecular conformations of the N-acetyl-L-alanine methylamide dipeptide, and the recovery of an a priori known distribution when one water molecule in the liquid phase is perturbed by a periodic free-energy function. The most rapid convergence occurred for large cooperative walkers, regardless of the upper integration limit, but integrating up to t proved advantageous for small walker ensembles. That novel reweighting method compared favorably to the standard VES reweighting, as well as to current state-of-the-art reweighting options introduced for metadynamics simulations that estimate c(t) by integration over the collective variables. For further gains in computational speed and accuracy, we also introduce analytical solutions for c(t), as well as offering further insight into its features by approximative analytical expressions in the “high-temperature” regime.


1. Introduction

Enhanced sampling (ES) techniques for molecular dynamics (MD) simulations, such as umbrella sampling,1,2 replica exchange,3,4 and steered MD5,6 along with the more recent options of metadynamics,7–16 variationally enhanced sampling (VES),17–23 and machine learning,24–26 offer powerful means to accelerate the convergence of MD simulations via an enhanced free-energy surface-sampling by avoiding that states are revisited. Hence, they may address challenging systems featuring multiple local energy minima separated by high barriers that would require a prohibitive time-scale by classical MD simulations. This is accomplished by adding a time-dependent bias potential, V(s, t), which depends on a set of collective variables (CVs) of the system, denoted by s.7–13 Numerous refinements of ES protocols have been presented, encompassing the precise choices of CVs14,16,23,24,26–28 and bias-potential parametrization,10,15,22,29,30 along with miscellaneous very recent options,27,31,32 as well as improved reweighting procedures12,33–38 to recover the targeted unbiased free energy, F(s), from the sum F(s) + V(s, t) that governs the biased MD trajectory.

In particular metadynamics has been widely applied for modeling (bio)chemical processes, such as the nucleation and growth of carbon nano-tubes,39–41 proton transfers,42–44 conformations of biomolecules in solution,28,45–47 as well as their binding at inorganic surfaces.48–50 Also the herein utilized VES procedure17–23 has been employed for modeling of molecular conformations in solutions,18,51 crystal nucleation,21,30 whereas the ability of VES to handle large sets of CVs have enabled simulations of protein folding.22,52

Two powerful options for accelerating the convergence of the ensemble-averaged free energy and other observables from ES/MD simulations involve either (I) calculation of the time-dependent bias-correction function, c(t), by analyzing non-equilibrated systems using time integration, as in the very recent “iterative trajectory reweighting” (ITRE) algorithm introduced for metadynamics simulations by Giberti et al.,38 or (II) an enhanced simultaneous sampling of the CV space by performing multiple (NW) computations in parallel, each referred to as a “walker” and subjected to the same bias potential.10,14,16,19,27,46,53 Using several such walkers may greatly accelerate the convergence of ES/MD simulations.10,14,16,19,27,53 For multiple-walker ES simulations, we introduce the concepts of “independent” and “cooperative” walkers, which are associated with NW distinct bias-correction functions {cw(t)}, and one shared c(t) function [denoted by cΣ(t)], respectively; see Section 3. Section 2 outlines the salient features of VES and reweighting, moreover reviewing the hitherto (probably) most efficient and accurate options for determining c(t).36–38

By coupling options (I) and (II)—and thereby generalizing a modified ITRE protocol to multiple-walker simulations and integrating it with the VES framework17–19,23—we demonstrate that an overall accelerated convergence results relative to the reweighting procedure of the original VES protocol.17–20 The latter is moreover shown to be equivalent to the balanced exponential (BE) reweighting of Schäfer and Settanni37 when implemented within the VES scope. Our proposed reweighting method was selected from assessments of four new protocols for estimating c(t) by combining the options of (A) independent or cooperative walker ensembles with (B) estimations of c(t) by time-integration up to either t or across the entire MD simulation time period image file: d2cp04009c-t2.tif. The convergence properties of the altogether four distinct reweighting procedures were evaluated for two model problems (Section 5): (i) locating the free-energy difference between two metastable molecular conformations of the N-acetyl-L-alanine methylamide dipeptide, which is a widely exploited system for benchmarking ES developments.12,14,17,26,27,36,37,54 (ii) The convergence to a known distribution when one water molecule in the liquid phase is subjected to a known (artificial) free-energy perturbation.

We discuss the relative merits of the novel reweighting options for “small” and “large” ensembles of both cooperative and independent walkers for well-tempered VES implementations with both “good” and “poor” collective-variable selections, the choice of which strongly affects the convergence of the computed ensemble-averaged observables. Nonetheless because an optimal choice of collective variable often remains a priori unknown, simulation protocols that simultaneously offer quick convergence and high accuracy even for unfavorable collective variables are desirable. The overall most rapid convergence resulted for simulations with cooperative walkers combined with c(t) calculated by time integration up to t (rather than to image file: d2cp04009c-t3.tif). That herein advocated “MtΣ” procedure compared favorably both to the standard reweighting procedure in VES17–19 and the state-of-the-art reweighting metadynamics protocol by Tiwary and Parrinello36 (when implemented within VES). For further computational speed and accuracy enhancements, we also provide analytical solutions for the c(t) function of the MtΣ protocol (Section 3.2). These time savings and accuracy-boosts are expected to be of great utility for further ES/MD-simulation studies.

2. Theoretical background

2.1 Ensemble-averaged observables from biased trajectories

The ensemble average 〈O(R)〉 of an observable O(R) that depends on spatial coordinates R is defined7–10,14,16,19,55
 
image file: d2cp04009c-t4.tif(1)
It may be calculated according to
 
image file: d2cp04009c-t5.tif(2)
where U(R) is the internal energy of the system, and β = (kBT)−1, where kB and T are Boltzmann's constant and the absolute temperature, respectively. The partition functions of the biased (ZV) and unbiased (Z) ensembles are given by
 
image file: d2cp04009c-t6.tif(3)
and
 
image file: d2cp04009c-t7.tif(4)
respectively.

The unbiased probability distribution, P(s), is defined by

 
image file: d2cp04009c-t8.tif(5)
where δ(x) is the Dirac delta function. After integration over R, the probability distribution may be expressed as
 
P(s) = Z−1[thin space (1/6-em)]exp{−βF(s)},(6)
where “s” implies either that the collective variable(s) is/are independent on spatial coordinates or depend(s) only on a specifically selected subset thereof. The introduction of the two exp{±βV(s(R), t)} factors in eqn (2) along with the multiplication of both its nominator and denominator by ZV implies that the biased simulation sample configurations from the V(s, t)-biased probability distribution,7–10,14,16,19
 
PV(s, t) = ZV−1[thin space (1/6-em)]exp{−β[F(s) + V(s, t)]}.(7)

By equating the ratio ZV/Z of eqn (2) with the exponentiated bias-correction function, i.e., ZV/Z = exp{−βc(t)}, eqn (2) may be written

 
image file: d2cp04009c-t9.tif(8)
By assuming ergodicity—i.e., a time/ensemble-averaging equivalence, 〈O(R)〉 may be estimated from the time-average over the biased MD-generated trajectory, according to7–10,14,16,19,27,36–38
 
image file: d2cp04009c-t10.tif(9)
Eqn (9) is the prevailing route to calculate ensemble averages from ES simulations, where the calculation of c(t) becomes the central task for recovering F(s) and 〈O(R)〉 from the biased trajectories.7–10,14,16,19

Moreover, for a well-tempered ensemble over a sufficiently long time, the bias potential V(s, t) is related to F(s) via the bias factor γ by11–15,19,20

 
V(s, t) = −(1 − γ−1)F(s),(10)
whereas PV(s, t) [eqn (7)] is related to P(s), according to
 
image file: d2cp04009c-t11.tif(11)

2.2 Variationally enhanced sampling

The variationally enhanced sampling17–23 protocol aims at minimizing a bias-potential-dependent functional, Λ(V(s, t)), which for a time-dependent bias potential, V(s, t), and a well-tempered target distribution, PV(s, t) (eqn (11)) is given by17,19–21
 
image file: d2cp04009c-t12.tif(12)
The global minimum of the convex functional Λ(V(s, t)) is18,19
 
V(s, t) = −F(s) − (βγ)−1[thin space (1/6-em)]ln[thin space (1/6-em)]P(s) − β−1[thin space (1/6-em)]ln[thin space (1/6-em)]ZV.(13)
For practical computations, the bias potential is expanded in a suitable set of k basis functions, whose corresponding {αk} expansion coefficients are the variational parameters that are updated iteratively during the minimization of Λ(V(s, t)).17,19–22 For the present calculations that involve s-periodic bias potentials, we employed a combined cosine and sine Fourier series,
 
image file: d2cp04009c-t13.tif(14)
where α0 = 0 in previous17–23 as well as our current VES implementations.

With the recent exception of Yang and Parrinello,23 who combined the time-lagged independent component analysis56 and VES, all previous reweighting implementations were tantamount to using a constant bias-correction function, i.e., effectively c(t) = 0, which only holds strictly for “late” time-points of the MD simulation; see Sections 2.3.1, 3.1 and 5.3. Herein, we demonstrate that state-of-the-art reweighting procedures from the metadynamics context that employ time-dependent bias-correction functions36,38 may offer both a more rapid and a reliable convergence of the modeled observables relative to the standard VES reweighting implementation with c(t) = 0.

2.3 Efficient strategies for calculating the time-dependent bias correction

Here we review current state-of-the-art approaches—all from the realm of (well-tempered) metadynamics—for estimating the bias-potential-correction function by integrating either over collective variables36,37 or over time.38,54
2.3.1 Integration over collective variables. Tiwary and Parrinello36 showed that c(t), expressed by an integration over the entire CV space,
 
image file: d2cp04009c-t14.tif(15)
may be estimated by using the well-tempered relation for the free energy
 
image file: d2cp04009c-t15.tif(16)
Becuase eqn (16) is only exact once the entire CV space is sampled, which formally demands that t → ∞, reweighting viaeqn (15) and (16) remains accurate after a “transient” time period on a (sub)ns scale,19,36 where the a priori unknown lower limit is herein denoted by tmin. The calculations may otherwise introduce non-negligible errors and no universal and truly accurate procedure accounting for the unpopulated CV values is hitherto presented. Notably, as illustrated in Section 5.2, the same caveat applies to the standard VES reweighting (Sections 2.2 and 3.1).

The combination of eqn (15) and (16) is the key feature of the procedure by Tiwary any Parrinello36 for estimating c(t) via an integration over the CVs; it is henceforth denoted by MTPΣ, and is associated with a bias-potential-correction function cTPΣ(t) obtained from eqn (15), where the subscript Σ stresses the use of cooperative walkers for multi-walker simulations. We have successfully utilized the Tiwary–Parrinello reweighting protocol within the VES formalism for studying biomolecular binding at calcium phosphate surfaces.57,58

Schäfer and Settanni37 suggested by their “balanced exponential” protocol that a better estimate of c(t) [eqn (15)] may be obtained, associated with the readily calculated bias-correction function given by the average value of V(s,t) over the CVs:37

 
image file: d2cp04009c-t16.tif(17)
Metadynamics simulations utilizing reweighting by the BE procedure compared favorably37 with that of Tiwary and Parrinello,36 as well as with an earlier option introduced by Bonomi et al.12 (the latter is not considered further herein). Incidentally, for VES implementations with V(s,t) expressed according to eqn (14), then eqn (17) evaluates to
 
c(t) = α0 = 0.(18)
Hence, the bias-correction function from the metadynamics-stemming BE reweighting coincides with that of VES. Consequently, we will in the following refer to this reweighting method as MVESΣ but we will cite both ref. 17 and 37 to emphasize that once the BE reweighting is applied within the VES context, it reduces to the “standard” VES reweighting with a time-independent bias-correction function.

2.3.2 Integration over time. Giberti et al.38 recently introduced the ITRE reweighting procedure for calculating c(t) and F(s) from metadynamics simulations, which they argued gives significant benefits to s-integration-based reweighting counterparts, such as those of ref. 12, 19 and 36. The ITRE protocol estimates F(s) from the unbiased distribution by integrating eqn (9) up to time-point t (Fig. 1a),
 
image file: d2cp04009c-t17.tif(19)
whereupon combination of eqn (15) and (19) yields the expression38
 
image file: d2cp04009c-t18.tif(20)

We refer to ref. 38 for details on practical ITRE implementations within the metadynamics scope, whereas Section 3.1 presents a modified procedure implemented herein within VES, along with extensions to multiple-walker simulations.


image file: d2cp04009c-f1.tif
Fig. 1 (a) The time-integration limits of eqn (21) with (a) Γ = t and (b) image file: d2cp04009c-t155.tif depicted for the specific case of estimating c(t4) for n = 8 time-points {tj} [eqn (23)]. (c) Graphical illustration of the estimation of ctw(t4) for one walker by using the analytical solution of eqn (27), which involves the parameters A [eqn (25a)], C [eqn (25c)], and D [eqn (25d)]. The bias-potential terms for parameter C, V(sw(τk), τk) and for parameter D, V(sw(t4), t4), are depicted by green and black dots, respectively, whereas parameter A involves both V(sw(τk), t4) (red dots) and V(sw(τk), τk). Note that the index 0 ≤ k ≤ 3 implies that τ is integrated up to t4, as in (a).

3. Enhanced reweighting procedures

3.1 New reweighting protocols

The metadynamics-associated ITRE strategy by Giberti et al.38 for estimating c(t) by integrating eqn (20) over the history of V(s, t) is readily generalized to multiple-walker simulations within the VES framework, which may involve NW independent or cooperative walkers. Note that for a given simulation, all—independent and/or collective—walkers share the same bias potential and only differ in their bias-correction functions (vide infra). Hence, “independent walkers” should not be confused with “independent simulations”. Moreover, the time-integration of eqn (20) may for each case be evaluated by either using t (as in ref. 38) or image file: d2cp04009c-t19.tif as upper integration limit (Fig. 1a,b), which lead to time-dependent [P(s, t)] and time-independentimage file: d2cp04009c-t20.tif distribution functions, respectively. Consequently, combination of both pairs of “walker” and “time-integration” options furnishes four new reweighting methods: each one is denoted by image file: d2cp04009c-t21.tif, where the time-integration limit is given by the superscript Γ = t or image file: d2cp04009c-t22.tif, while the subscript image file: d2cp04009c-t23.tif identifies the scenarios of either cooperativeimage file: d2cp04009c-t24.tif or independentimage file: d2cp04009c-t25.tif walkers, where “w” is an index w = {1, 2, …, NW}.

Each independent walker w associates with its “own/unique” function, cΓw(t), calculated by a generalized form of eqn (20), according to

 
image file: d2cp04009c-t26.tif(21)
where sw(τ) is the CV value of walker w at time-point τ. All cooperative walkers, on the other hand, share the same bias-correction function, cΓΣ(t), which is obtained from
 
image file: d2cp04009c-t27.tif(22)
Eqn (21) and (22) were in practice implemented by sampling each sw(τ) function at n discrete time-points
 
image file: d2cp04009c-t28.tif(23)
with image file: d2cp04009c-t29.tif, yielding a self-consistent system of equations with the solution {c(t0), c(t1), …, c(tn−1)}, as depicted schematically in Fig. 1a,b.

Because the ITRE protocol38—along with its generalized multiple-walker expressions given herein [eqn (21) and (22)]—compute c(t) by time integration up to either t (ref. 38) or to image file: d2cp04009c-t30.tif, they do not explicitly assume ergodicity and are thereby less prone to acquire systematic errors that may decelerate the metadynamics/VES convergence. Notably, that contrasts with the MTPΣ and MVESΣ approaches of ref. 17, 36, and 37 that utilize s-integration and rely on the validity of eqn (16); see Section 2.3.1. Albeit their reweighting accuracy may improve by restricting the evaluation of eqn (15) to the data with ttmin, the precise value of tmin is a priori unknown and must be deduced empirically.19,36

The reweighting evaluations herein included the entire time domain to enable continuous “convergence curves” for all methods and simulation periods image file: d2cp04009c-t31.tif (Section 5), thereby offering practical assessments of MTPΣ and MVESΣ against the new time-integration-based reweighting protocols, none of which requires any t < tmin truncation of the data set and consequently also no assumptions about the unknown limit tmin. This feature is a decisive advantage of reweighting by time-integration.38

3.2 Analytical solutions for the bias-correction function c(t)

Albeit the computational efforts of the reweighting stage remain truly marginal as compared with those of its underlying MD simulations and eqn (22) may be solved in ≈3 iterations,38 here we provide analytical solutions for the bias-correction functions ctΣ(t) and ctw(t), whose computational costs match that of one sole numerical iteration cycle, thereby offering significant advantages both in terms of speed and accuracy.

To solve eqn (22) analytically, we express it according to

 
image file: d2cp04009c-t32.tif(24)
where each parameter A, B, C, and D is given by a summation over exponentiated functions evaluated at time-points {tj} [eqn (23)], each separated by Δτ:
 
image file: d2cp04009c-t33.tif(25a)
 
B = ΔτNW,(25b)
 
image file: d2cp04009c-t34.tif(25c)
 
image file: d2cp04009c-t35.tif(25d)
Fig. 1c illustrates the various time-dependent bias-potential components [V(sw(τk), tj)] of eqn (25) for one walker and tj value. Here, the C and D parameters depend only on the values V(sw(τk), τk) (green dots) and V(sw(tj), tj) (black dot), respectively, whereas A involves both V(sw(τk), tj) (red dots) and V(sw(τk), τk).

By identifying xj ≡ exp{−βc(tj)}, eqn (24) may be represented as

 
Dxj2 + (CB)xjA = 0, 0 ≤ jn − 1,(26)
which may be solved analytically:
 
image file: d2cp04009c-t36.tif(27)
Note that A = C = 0 at t0 = 0, which in the absence of a bias potential [V(sw(t0), t0) = 0] implies that B = D and c(t0) = 0. For each consecutive time-point tj (j = 0, 1, …, n − 1), eqn (25) are evaluated, whereupon c(tj) is determined from eqn (27).

Notably, eqn (27) applies to calculations of c(t) for the practically most relevant scenario of cooperative walkers, whereas that for independent walkers follows trivially because each of the sums AD in eqn (25) collapses into one sole term for each walker w. These analytical solutions of image file: d2cp04009c-t37.tif were employed in all computations presented below.

4. Computational methods

4.1 General simulation conditions

All atomistic MD simulations involved NVT ensembles at T = 37 °C, utilizing the GROMACS v2018.1 platform.59 The equations of motion were integrated in steps of 0.9 fs by using the velocity Verlet integrator.55 The Coulomb interactions were calculated with a smoothed particle–mesh Ewald summation60 of order four and a tolerance of 10−5, using a Fourier spacing of 0.12 and a switch distance of 1.2 nm, while the van der Waals interactions were truncated at 1.2 nm. The temperature was controlled by the velocity rescale thermostat61 with a 1.0 ps time constant.

The well-tempered VES simulations17–19 employed the PLUMED2.4 software.62 To enhance the configurational sampling, a well-tempered target distribution with bias factor γ = 5 was employed along with the CV. The time-dependent bias potential, V(s, t), was expanded out to order NF = 6 in the CV (eqn (14)). The well-tempered target distribution PV(s, t) = exp{βV(s, t)/(γ − 1)} and the {αk(t)} coefficients were calculated iteratively during the simulation by using the averaged-stochastic-gradient descent algorithm63 with a step size of μ = 1.0 to minimize the variational functional eqn (12) by the procedures described in ref. 17 and 63. The time integration spans the interval Δτ between each bias-potential update.17 To minimize numerical errors, the Fourier coefficients {αk(t)} were updated every Δτ = 0.9 ps and then stored at each time-point, while PV(s, t) was updated every 0.45 ns.

The accuracy of the bias-potential evaluation was improved by sampling different CV domains by employing NW walkers, each operating within an independently generated system and starting from different configurations to ensure that they sample different MD trajectories.

4.2 Alanine dipeptide simulations

One N-acetyl-L-alanine methylamide molecule—henceforth referred to as “alanine dipeptide” (Fig. 2)—was simulated in vacuum by using the CHARMM36/CMAP all-atom force field (July 2017).64 The volume of the cubic cell was kept constant at V = 36.7 nm3 to avoid undesirable boundary-condition effects. The simulations were performed with two distinct CVs represented by either torsion angle s = ϕ or s = ψ (Fig. 2), as well as with ensembles of NW = {4, 8, 16, 64} cooperative and independent walkers. As described further in Section 5.1, every simulated {s, NW} combination was reweighted by each novel image file: d2cp04009c-t38.tif protocol along with MTPΣ (ref. 36) and MVESΣ (ref. 17 and 37), by evaluating ΔF between two metastable molecular conformations for increasing image file: d2cp04009c-t39.tif (see Section 5.1).
image file: d2cp04009c-f2.tif
Fig. 2 Illustration of the two “C7eq” and “C7ax” molecular conformations of N-acetyl-L-alanine methylamide (“alanine dipeptide”) with the two torsion angles ϕ and ψ indicated, along with the free-energy surface.

The convergence of each reweighting method was assessed from Nsim = {16, 16, 8, 6} independent but nominally identical simulations for the respective walker ensembles with NW = {4, 8, 16, 64} by calculating the root-mean-square (rms) deviation of image file: d2cp04009c-t40.tif to the fully converged reference value ΔFref = 6.80 kJ mol−1, i.e., we evaluated the entity rmsimage file: d2cp04009c-t41.tif. ΔFref was determined from three independent VES/MD simulations that utilized both CVs, s = {ϕ, ψ}, for a long simulation period of image file: d2cp04009c-t42.tif (using one walker). Moreoever, because the bias-potential V(s, t) converged well within 20 ns, employing the standard VES reweighting protocol17–19 for t > 20 ns yielded the same value of ΔFref = 6.80 kJ mol−1 for all three simulations. Note that all reweighting methods converge to the same result (see Section 5.1). The initial configuration of each walker of the subsequent Nsim simulations constituted a randomly selected frame for t > 20 ns of the fully converged 100 ns MD simulations.

The variance among the rmsimage file: d2cp04009c-t43.tif results of the Nsim simulations was determined by the Jackknife method65 according to

 
image file: d2cp04009c-t44.tif(28)
 
image file: d2cp04009c-t45.tif(29)

4.3 Analytical potential-model simulations

We simulated an NVT ensemble of 1000 water molecules with the force field of ref. 66 in a cubic box of equal axis lengths lx = ly = lz = 3.1 nm. An internal reference point was obtained by restricting the position of one molecule at origo by an harmonic potential with a large force constant (κ = 250 kJ mol−1). The center-of-mass of one other water molecule, referred to as “A”, was restricted at y = 1.3 nm. The intermolecular separation was around ly/2, which ensured that the two molecules are further apart than the cutoff distance of 1.2 nm for all Coulomb and van der Waals interactions. The position of molecule A along the x direction was subjected to the periodic free-energy function
 
F(x)/(kJ mol−1) = 5[thin space (1/6-em)]cos{6x2π/lx},(30)
which possesses six local energy minima, all separated by a barrier of 10 kJ mol−1.

The system was simulated with the collective variable s = 2πx/lx (which is the optimal choice), NW = 6, and γ = 5, which ensures the asymptotic relationship V(x) = −(1 − γ−1)F(x) = −(4/5)F(x):

 
V(x)/(kJ mol−1) = −4[thin space (1/6-em)]cos{6x2π/lx}.(31)
The bias potential was applied to molecule A along the x direction, whereas no other direction or molecule was biased. Hence, all walkers shared the same energy minimum at x ≈ 0.8 nm (see Section 5.2), yet at slightly different positions to ensure that each follows a unique trajectory. The results presented below are averages over 32 independent but nominally identical simulations.

5. Results and discussion

5.1 Alanine dipeptide conformations

Owing to its extensive use in developments of metadynamics and other ES methods,12,14,17,19,26,27,36,37,54 the alanine dipeptide molecule in vacuum was selected for benchmarking the new image file: d2cp04009c-t46.tif reweighting options, which are contrasted with the already established state-of-the-art MTPΣ (ref. 36) and MVESΣ (ref. 17 and 37) schemes. The convergence offered by each protocol was evaluated for (i) increasing simulation time image file: d2cp04009c-t47.tif, when (ii) using either of the torsion angle ϕ or ψ as CV, and (iii) variable-sized walker ensembles with NW = {4, 8, 16, 64}.

Here, we assessed the convergence of the free-energy difference between two metastable molecular conformations that form a seven-atom membered cyclic structure, labeled “C7”, and stabilized by an internal hydrogen bond. These two conformations are shown in Fig. 2, along with the F(s) contours plotted against ϕ and ψ. The three methyl groups of the molecule may assume either equatorial (C7eq) or axial (C7ax) orientations relative to the ring, respectively. We assessed the convergence performance of each image file: d2cp04009c-t48.tif reweighting procedure via the free-energy difference

 
image file: d2cp04009c-t49.tif(32)
between two torsion-angle domains image file: d2cp04009c-t50.tif centered at image file: d2cp04009c-t51.tif, with Ω(C7eq0) = {−81°, 71°} and Ω(C7ax0) = {74°, −67°}, respectively. The probability image file: d2cp04009c-t52.tif of domain image file: d2cp04009c-t53.tif is
 
image file: d2cp04009c-t54.tif(33)
where the function image file: d2cp04009c-t55.tif is unity throughout image file: d2cp04009c-t56.tif and zero otherwise.

5.1.1 Role of walkers and choice of collective variable for convergence. For each collective variable s = ϕ and s = ψ, Fig. 3 plots the convergence function, rms(ΔF − ΔFref), for increasing MD simulation intervals image file: d2cp04009c-t57.tif and walker ensembles with 4 to 64 members. All reweighting schemes converge markedly more rapidly for the simulations with s = ϕ than those for s = ψ. Hence, s = ϕ is a “good” choice of CV because it manifests pronounced transition-state barriers between the energy minima (Fig. 2), which are readily compensated for by the bias potential V(ϕ, t), whereas the selection s = ψ leads to a “hystereses” behavior that results in slow convergence.14,19
image file: d2cp04009c-f3.tif
Fig. 3 Plots of rmsimage file: d2cp04009c-t156.tif [eqn (32)] against the simulation interval image file: d2cp04009c-t157.tif (log scale) for the herein proposed {MtΣ, Mtw, image file: d2cp04009c-t158.tif, image file: d2cp04009c-t159.tif} reweighting protocols, along with those by Tiwary and Parrinello36 (MTPΣ) and that of Schäfer and Settanni,37 which is identical to the original VES reweighting17,18 (MVESΣ). The VES/MD simulations employed ensembles of (a, b) 4, (c, d) 8, (e, f) 16, and (g, h) 64 walkers along with collective variables of s = ϕ (left panel) and s = ψ (right panel). Each dotted rectangle and horizontal red dotted line in (a)–(d) marks the respective regions of “near” and “sufficient” convergence to the correct reference energy value ΔFref = 6.80 kJ mol−1. The relative performances of the various reweighting protocols in the near-convergence regimes are more transparent in the zoomed plots shown in Fig. 4. Note that the vertical plot ranges varies among the rows of graphs and that the converge accelerate consistently for increasing walker ensembles. Each rms(ΔF − ΔFref) curve resulted from (a)–(d) 16, (e, f) 8, and (g, h) 6 independent simulations; Fig. S1 and S2 (ESI) plot the data uncertainties.

These features are more transparent in the corresponding plots of Fig. 4 which are zoomed around the “near-convergence” domain (dotted rectangles in Fig. 3). Indeed, Fig. 4b and Table 1 reveals that for simulations with s = ψ and NW = 4, only the data reweighted by MtΣ, MTPΣ and MVESΣ reach below the convergence threshold (horizontal red dotted lines in Fig. 3 and 4) within our longest evaluated value of image file: d2cp04009c-t58.tif, requiring the corresponding simulation periods of 13.7 ns, 14.0 ns, and 19.4 ns, respectively. However, while these rms results over Nsim = 16 independent simulations meet the convergence criterion, only the MtΣ and MTPΣ reweighting schemes offer convergence for all simulations (upon omission of obvious outliers; see Table 1), both requiring image file: d2cp04009c-t59.tif. Fig. S1 and S2 (ESI) show the convergence curves with ±σ spreads among the Nsim simulations of each reweighting scheme.


image file: d2cp04009c-f4.tif
Fig. 4 Zoomed convergence plots of the near-convergence region shown by dotted rectangles in Fig. 3.
Table 1 Simulation time image file: d2cp04009c-t149.tif (in ns) to reach convergencea
Alanine dipeptide
sNWb N sim M t Σ

image file: d2cp04009c-t150.tif

M t w

image file: d2cp04009c-t151.tif

M TPΣ M VESΣ
a The VES/MD simulation time required for reaching convergence image file: d2cp04009c-t153.tif of either rms(ΔF − ΔFref) (alanine dipeptide; Fig. 3 and 4) or DKL (analytical potential model; Fig. 6b,c) when employing the given reweighting protocol. The values within parentheses represent the corresponding image file: d2cp04009c-t154.tif values required for all of the Nsim independent simulations to converge; each superscript marks the number of outlier data-curves that were omitted to give the as-stated result. b Collective variable and size of walker ensemble. c Number of bins employed to evaluate eqn (35) with Ns = 6 (Fig. 6b) or Ns = 48 (Fig. 6c).
ϕ − 4 16 5.78(10.71) 20.0(−) 6.93(14.52) −(−) 5.89(11.81) 3.78(11.51)
ϕ − 8 16 1.45(4.13) 1.45(9.36) 9.87(13.22) 1.45(7.65) 1.10(7.98) 1.57(9.72)
ϕ − 16 8 1.47(1.671) 1.46(2.271) 9.59(12.01) 1.89(2.171) 1.97(2.311) 1.01(1.751)
ϕ − 64 6 0.22(0.43) 0.20(0.530) 6.30(7.95) 0.66(1.08) 0.71(1.08) 0.81(1.03)
ψ − 4 16 13.7(18.63) −(−) −(−) −(−) 14.0(19.23) 19.4(−)
ψ − 8 16 17.0(17.62) 17.1(18.62) −(−) 15.2(18.62) 17.0(18.62) 17.4(18.62)
ψ − 16 8 9.8(10.01) 9.8(10.41) −(−) 10.9(11.72) 9.64(10.42) 9.96(10.72)
ψ − 64 6 0.32(2.66) 0.29(2.82) −(−) 7.34(9.531) 0.45(9.21) 0.90(9.81)

Analytical potential
N s N sim M t Σ

image file: d2cp04009c-t152.tif

M TPΣ M VESΣ
6 32 0.150(0.237) 0.170(0.269) 0.240(0.560) 0.700(1.29)
48 32 0.210(0.285) 0.220(0.503) 0.270(0.578) 0.770(1.38)


The overall most rapid rms(ΔF − ΔFref) convergence resulted when employing (moderately) large walker ensembles (Fig. 3c–h and 4c–h). This property is most evident for the simulations with the “unfavorable” CV ψ, where all walker ensembles with NW ≥ 8 secured proper convergence within 20 ns for all reweighting schemes but Mtw. Moreover, regardless of the precise choice of CV, Fig. 3 and 4 reveal that cooperative walkers (i.e., MtΣ and image file: d2cp04009c-t60.tif) are markedly more favorable than independent ones (i.e., Mtw and image file: d2cp04009c-t61.tif). Their advantage emphasizes progressively for increasing NW, which reflects the enhanced statistics provided by sets of cooperative walkers for improved c(t) estimates.

We remind that Fig. 3 and 4 employ a logarithmic time scale and that the differences in convergence merits are substantial between the favorable {MtΣ, image file: d2cp04009c-t62.tif} and worse {Mtw, image file: d2cp04009c-t63.tif} pairs of reweighting protocols. For the herein most favorable {s = ϕ, NW = 64} simulation scenario, the reweighting method with collective walkers (image file: d2cp04009c-t64.tif) required image file: d2cp04009c-t65.tif to attain “sufficient” convergence of rms(ΔF − ΔFref), whereas its counterpart with independent walkers image file: d2cp04009c-t66.tif demanded 670 ps (Table 1). For the NW = 64 walker ensemble with the less favorable CV s = ψ, the image file: d2cp04009c-t67.tif scheme required 290 ps for convergence, whereas image file: d2cp04009c-t68.tif necessitated 7.34 ns (Fig. 4h). The differences in convergence properties among the MtΣ and Mtw schemes are even larger (Table 1): MtΣ offers very similar convergence as image file: d2cp04009c-t69.tif for NW = 64 regardless of the choice of s. In contrast, the Mtw counterpart does not converge within 20 ns for the “difficult” s = ψ case, irrespective of the walker-ensemble size, while it takes 6–29 times longer to converge (relative to MtΣ) for all s = ϕ scenarios with NW ≥ 8. Besides the expected finding that larger walker ensembles accelerate the convergence relative to smaller ones, we conclude that cooperative walkers are preferred to independent ones.

5.1.2 Role of time-integration limit for convergence. Because the MTPΣ and MVESΣ reweighting schemes utilize CV integration, we here only contrast the novel {MtΣ, image file: d2cp04009c-t70.tif, Mtw, image file: d2cp04009c-t71.tif} protocols generalized from the ITRE procedure. For the smallest walker ensembles with NW = 4 of Fig. 3a,b and 4a,b, significant differences are observed for the rms(ΔF − ΔFref) results between the protocols employing Γ = t relative to image file: d2cp04009c-t72.tif for both cooperative and independent walker ensembles, where the integration limit t is favorable throughout. Here, the MtΣ reweighting scheme outperforms any other time-integration-based reweighting option.

For larger ensembles with at least 8 independent walkers, the results of Fig. 3c–h and 4c–h reveal that the image file: d2cp04009c-t73.tif protocol with integration limit image file: d2cp04009c-t74.tif offers a substantially faster reweighting convergence than its Mtw sister scheme. Nontetheless, image file: d2cp04009c-t75.tif, and (in particular) Mtw, remain overall inferior to their cooperative-walker based image file: d2cp04009c-t76.tif and MtΣ counterparts (Section 5.1.1). The latter procedures exhibit nearly equal reweighting performances both far from (Fig. 3) and near/at (Fig. 4) convergence, regardless of the precise simulation period and the number of walkers. We attribute the property of a largely immaterial choice of integration limit Γ = t (MtΣ) or image file: d2cp04009c-t77.tif for the ensembles of cooperative walkers to their accompanying enhanced sampling of the CV domain, thereby also naturally reducing the differences between averages over time or over CVs.

When all results of Fig. 3 and 4 are taken together for the {MtΣ, image file: d2cp04009c-t78.tif, Mtw, image file: d2cp04009c-t79.tif} reweighting procedures evaluated for different CVs, walker-ensemble types and sizes, the overall best and near-equal performances are observed for the cooperative-walkers based MtΣ and image file: d2cp04009c-t80.tif protocols. Yet we recommend using the MtΣ scheme due to its markedly faster convergence also for very small walker ensembles (Fig. 3a,b and 4a,b), along with much more rapid reweighting calculations (Table S1, ESI).

5.1.3 Relative reweighting merits. Here, we focus on contrasting the two best time-integration protocols, MtΣ and image file: d2cp04009c-t81.tif, with the s-integration based MTPΣ and MVESΣ schemes. Relative to the MTPΣ protocol introduced by Tiwary and Parrinello36, Schäfer and Settanni37 highlighted primarily two advantages with their balanced-exponential reweighting method: (i) a faster convergence for “short” simulation periods, and (ii) lower reweighted-observable uncertainties/variabilities. At least within the scope of VES—for which the BE and VES reweighting procedures become identical (MVESΣ)—the second claim is generally neither born out by our assessments for the alanine dipeptide (vide infra) nor for the case examined in Section 5.2. The perhaps more important claim (i) of better reweighting convergence properties for short MD simulation periods, however, appears to depend significantly on the particular choice of CV(s) and walker-ensemble size: for relatively large number of walkers NW = {16, 64}, the MVESΣ protocol indeed offers more rapid convergence regardless of s = {ϕ, ψ} (Fig. 3). Yet, these improvements are typically only pronounced in regimes too far from a reasonable convergence demand. Throughout both regimes of “near” and “sufficient” convergence, the VES/BE reweighting consistently only outperformed all other methods for the cases of s = ϕ with NW = {4, 16}; see Fig. 3a,e and 4a,e.

A lack of reliability appears to be the main deficiency of the original VES reweighting (Fig. 3–5, Fig. S1, and S2, ESI): for the largest-walker simulations close to convergence (Fig. 4g,h), the highly oscillatory rms(ΔF − ΔFref) curve of MVESΣ renders it inferior relative to its primary MtΣ, image file: d2cp04009c-t82.tif, and MTPΣ competitors. The required simulation periods to attain convergence image file: d2cp04009c-t83.tif among the protocols for {s = ϕ, NW = 64 } increase according to image file: d2cp04009c-t84.tif (Table 1), while the case of s = ψ only differs in that image file: d2cp04009c-t85.tif. The 8-walker ensemble evaluations for s = ϕ shown in Fig. 4c reveal a similar trend, except that now the MTPΣ protocol performs overall best, both to reach convergence and within the “near convergence” regime (<1 ns). Hence, as for the VES/BE reweighting, the Tiwary and Parrinello scheme manifests an uneven performance among the simulations in Fig. 3 and 4, in contrast to the two best time-integration-based protocols (i.e., MtΣ and image file: d2cp04009c-t86.tif), whose convergence improve monotonically for increasing walker ensembles (Table 1). The uneven performance of both s-integration-based reweighting protocols—which is most pronounced for VES/BE—presumably originates from the herein strict evaluations that involved reweighting based on the entire simulated time domain; this is examined further in Section 5.2.

Concerning the simulation periods for reaching convergence of rms(ΔF − ΔFref) for the altogether eight {s, NW} combinations (Table 1), the TP and VES/BE protocols offer the most rapid convergence in two cases each, whereas the MtΣ/image file: d2cp04009c-t87.tif counterparts accomplishes that in three cases, where we remind that the image file: d2cp04009c-t88.tif and MtΣ schemes reveal essentially equal image file: d2cp04009c-t89.tif values throughout, except for the smallest walker ensembles of each s = {ϕ, ψ} angle (Section 5.1.1). Moreoever, whenever the MtΣ/image file: d2cp04009c-t90.tif methods do not offer the most rapid convergence, their performances remain close to the best method. Notably, for the much less forgiving criterion that all Nsim simulations of each method must converge, however, the MtΣ procedure perform best throughout the eight evaluated {s, NW} scenarios (Table 1); yet, the difference to the second-best reweighting scheme is often marginal.

The high precision and reliability of the MtΣ protocol is gratifying when considering the linear scaling of the total simulation time against Nsim, thereby in practice requiring a reasonable small number of independent simulations to accomplish an accurate average/rms value of 〈O(R)〉. Hence, it is desirable that the reweighting method yields the lowest possible spread (variance) around the a priori unknown average/rms result, such that one sole simulation and its subsequent observable-reweighting may be expected to approximate well the fully converged value obtained from a (very) large number of Nsim independent simulations.

Fig. 5 plots the variance—σ2[rms(ΔF − ΔFref)] calculated from eqn (28)—observed from each reweighting protocol of Fig. 3 for increasing image file: d2cp04009c-t91.tif among the Ns simulations, employing Nsim = {16, 16, 8, 6} for the respective walker ensembles with NW = {4, 8, 16, 64}. The MtΣ, image file: d2cp04009c-t92.tif and MTPΣ reweighting schemes offer the overall lowest variances, all of which are similar. As for the higher convergence rate of the MtΣ method relative to image file: d2cp04009c-t93.tif for both s= {ϕ, ψ} with NW = 4 (Fig. 3a,b and 4a,b), it also features lower data spreads. The VES/BE protocol manifests irregular variances, which typically remain larger than those of the three best reweighting schemes for the evaluated {s, NW} cases (Fig. 5). Also along the observations for the rms(ΔF − ΔFref) convergence in Section 5.1.1, the variances of the two Mtw and image file: d2cp04009c-t94.tif methods with independent walkers are inferior relative to their collective-walker counterparts. In particular, surprisingly high variances are observed at long simulation periods for the image file: d2cp04009c-t95.tif scheme (Fig. 5), for which we have no satisfactory explanation.


image file: d2cp04009c-f5.tif
Fig. 5 The variance, σ2[rms(ΔF − ΔFref)]; eqn (28), plotted against the simulation period of each evaluated reweighting protocol of Fig. 3. Note the different vertical scales in the (a and b) plots relative to those of (c, e, g), and (d, h, f).

5.2 Entropy assessments of an analytical free-energy model

The evaluations of the four MtΣ, image file: d2cp04009c-t96.tif, Mtw, and image file: d2cp04009c-t97.tif protocols for the alanine dipeptide suggested that simulations with reasonably large ensembles (NW > 4) of cooperative walkers are preferable for obtaining the most accurate results, with the MtΣ reweighting scheme offering the overall most favorable results (Fig. 3–5 and Table 1). Our second benchmarking scenario of liquid water with an artificial free-energy perturbation [eqn (30)] applied to one molecule employs an optimal CV, s = 2πx/lx (Section 4.3), along with a modest walker ensemble of NW = 6. Consequently, we focussed on comparing the cooperative-walker based MtΣ and image file: d2cp04009c-t98.tif protocols (which are expected to consistently gain merits relative to their Mtw, and image file: d2cp04009c-t99.tif counterparts for increasing NW) with the CV-integration associated MTPΣ (ref. 36) and MVESΣ (ref. 17 and 37) reweighting procedures.

The present problem-design with an a priori known F(x) function acting on one water molecule along x, implies that a converged VES/MD simulation should reproduce the periodic free-energy and bias-potential functions of eqn (30) and (31), respectively. Fig. 6a plots F(x) with its six local energy minima image file: d2cp04009c-t100.tif indicated, along with the corrected bias-potential functions estimated from the MtΣ protocol for the simulation periods of image file: d2cp04009c-t101.tif and 1.0 ns. Because all walkers were initially confined to the image file: d2cp04009c-t102.tif domain centered at x = 0.8 nm [i.e., image file: d2cp04009c-t103.tif], this population remains, as expected, strongly favored for the very short image file: d2cp04009c-t104.tif simulation interval. In the limits of very short and long simulation periods image file: d2cp04009c-t105.tif, all examined reweighting protocols {MtΣ, image file: d2cp04009c-t106.tif, MTPΣ, MVESΣ } yield identical results to that shown for MtΣ with image file: d2cp04009c-t107.tif in Fig. 6a.


image file: d2cp04009c-f6.tif
Fig. 6 (a) Corrected bias potential, Vcorr(x, t) = V(x, t) − c(t), obtained from the MtΣ protocol for short (green trace) and long (red trace) simulation intervals image file: d2cp04009c-t160.tif and image file: d2cp04009c-t161.tif, respectively. The dotted curve represents the applied free-energy function F(x) (eqn (30)), whose six minima image file: d2cp04009c-t162.tif are indicated beneath. (b) Convergence of the Kullback–Leibler divergence [DKL; eqn (35)] with Ns = 6 for increasing image file: d2cp04009c-t163.tif (log scale) for the MtΣ and image file: d2cp04009c-t164.tif schemes proposed herein, as well as those of MTPΣ (ref. 36) and MVESΣ (ref. 37). All simulations employed NW = 6 and the collective variable s = 2πx/lx. (c) As in (b), but using a finer grid of Ns = 48 bins of the CV. Each curve is an average over 32 independent MD/VES simulations, whose variations (data spread) are shown in Fig. S3 (ESI). The horizontal red dotted lines in (b and c) mark the threshold of “sufficient” convergence, while the region around “near” convergence (dotted rectangle) is zoomed in the inset graphs (using a linear image file: d2cp04009c-t165.tif scale). (d) Bias-correction function c(t) associated with each reweighting protocol evaluated in (b) and shown for one representative simulation (see Section 5.4).

The convergence of each reweighting protocol was monitored via the estimated relative entropy, which was assessed by calculating the Kullback–Leibler divergence (DKL)26,67 between the reweighted distribution and image file: d2cp04009c-t108.tif:

 
image file: d2cp04009c-t109.tif(34)
Eqn (34) was in practice evaluated by a discretization into Ns bins according to
 
image file: d2cp04009c-t110.tif(35)
A properly converged VES/MD simulation should reproduce the known reference distribution, which for the most straightforward choice of Ns = 6 becomes image file: d2cp04009c-t111.tif, meaning that all walkers distribute evenly among the six F(x) minima (Fig. 6a).

Fig. 6b plots the average DKL response obtained from 32 independent simulations for increasing image file: d2cp04009c-t112.tif. We define DKL ≤ 0.12 as “sufficient” convergence, which is indicated by the dotted red line in Fig. 6b. As expected, DKL evolves from its initial value ln{6}—i.e., with all walkers confined to image file: d2cp04009c-t113.tif—to equally populated energy minima (DKL = 0), as in Fig. 6a for image file: d2cp04009c-t114.tif. The DKL evolution among the various reweighting protocols vary significantly for increasing image file: d2cp04009c-t115.tif, with the value required to reach convergence image file: d2cp04009c-t116.tif increasing according to MtΣ < image file: d2cp04009c-t117.tif < MTPΣ < MVESΣ, and translating into 150 ps, 170 ps, 240 ps, and 700 ps, respectively (Table 1). This corresponds to ≈4.6 times faster convergence of the “best” (MtΣ) scheme relative to the “worst” (MVESΣ). The relative convergence order remains essentially strict throughout all regions from “far” to “near”, and “sufficient” convergence, except for MVESΣ (vide infra). Besides an overall slightly decelerated convergence of all methods, all findings concerning their relative merits also hold for the finer discretization with Ns = 48 shown in Fig. 6c. Morever, for the more stringent (“worst-case”) convergence criterion that all 32 simulations from each method must reach convergence, Table 1 confirms a lower image file: d2cp04009c-t118.tif period required for MtΣ than any other reweighting scheme, thereby fully corroborating the inferences made from the alanine dipeptide evaluations.

Hence, the results of Fig. 6b,c and Table 1 suggest that the MtΣ protocol introduced herein outperforms its image file: d2cp04009c-t119.tif counterpart, as well as both the Tiwary–Parrinello36 and VES/BE17,37 options. Notably, the performance of MVESΣ is remarkably poor for the present simple model system with a small walker ensemble, except for the image file: d2cp04009c-t120.tif domain far from any acceptable convergence threshold: in this image file: d2cp04009c-t121.tif-regime, the claim37 of a more accurate reweighting than MTPΣ is indeed born out (yet, see Section 5.1.3). Fig. S3 (ESI) presents the spread of {DKL} values observed among the Nsim = 32 simulations of each reweighting method of Fig. 6. The results are commensurate with those discussed for the alanine dipeptide (Fig. 5, and Fig. S1 and S2[ESI]): the MtΣ, image file: d2cp04009c-t122.tif, and MTPΣ protocols reveal overall similar spreads, whereas that of VES/BE is typically wider.

Given that all our evaluations included the entire simulated trajectories in the reweighting, which may deteriorate the convergence of the methods based on CV integration (Sections 2.3.1 and 3.1), we also examined their “optimal” performance by locating tmin for each of MTPΣ and MVESΣ, whereupon the DKL curves were re-evaluated, only retaining the simulated data beyond tmin for each Ns = {6, 48} scenario. Fig. S4 (ESI) presents the results. Whereas essentially no improvement resulted for the MTPΣ method, a substantially enhanced performance is observed for MVESΣ, with the MTPΣ and MVESΣ methods now revealing the same convergence at image file: d2cp04009c-t123.tif for Ns = 6, and image file: d2cp04009c-t124.tif for Ns = 48. Although a significant time-span t < 100 ps of the simulated data was discarded, however, the MTPΣ and MVESΣ reweighting schemes remain inferior to MtΣ (Fig. S4 (ESI) and Table 1). As expected, the data truncation gave no convergence improvements for any of the t-integration methods MtΣ and image file: d2cp04009c-t125.tif (not shown). These results underscore the benefits of reweighting by time integration: no further efforts of locating the optimal tmin value are required, thereby also eliminating any potential systematic errors introduced by the data truncation38 (Section 3.1). Yet, once undertaken for the CV-integration-based schemes, no dramatic differences are expected in the reweighting accuracy between s/t-integration-based methods.

5.3 Summary and further considerations

No single reweighting procedure is ever likely to outperform all others for MD simulations of any conceivable system and evaluation criterion. Yet the findings of Fig. 3–6 and Fig. S1–S4 (ESI) altogether consolidate the MtΣ protocol as the method of choice for both large and small walker ensembles, regardless of (sub)optimal or “bad” choices of the CV: at the worst for a given modeled system, MtΣ is expected to deliver a comparable (or slightly inferior) reweighting accuracy compared with the globally best reweighting scheme. Worth underscoring is that VES/MD implementations employing time-integration-based estimates of the bias-correction function typically offer better accuracy and precision of the reweighted observable than the hitherto utilized standard VES reweighting17–22 with c(t) = 0.

A (markedly) longer computational time for performing the reweighting constitutes a minor practical disadvantage of the novel time-integration image file: d2cp04009c-t126.tif reweighting protocols relative to those utilizing CV averaging.17,36,37 This feature, inherited from the ITRE protocol (see Giberti et al.38), is particularly pronounced for the image file: d2cp04009c-t127.tif schemes which estimate c(t) by integration over the entire MD-simulation time-span. Table S1 (ESI) contrasts the scaling of the number of floating-point operations required for each reweighting procedure evaluated herein, along with concrete CPU clock-timings for the simulations of Fig. 3a. Notably, however, these deficiencies of ITRE-derived protocols are in practice immaterial because the MD simulation (even for one walker) is orders-of-magnitude more time consuming than the subsequent reweighting stage, thereby rendering the time spent for the latter a largely irrelevant priority for selecting a reweighting protocol.

5.4 Analytical c(t) expressions in the high-temperature limit

To gain further insight into the nature of the bias-correction functions, we consider a limiting high-temperature scenario of βF(s) ≪ 1 and βV(s, t) ≪ 1 with β = (kBT)−1. Then, approximate analytical expressions may be obtained for the bias-correction function associated with each herein introduced image file: d2cp04009c-t128.tif reweighting method, as well as that of Tiwary and Parrinello.36

For simulations with independent walkers, a Taylor expansion of the exponential functions of eqn (21) to first order yields

 
image file: d2cp04009c-t129.tif(36)
which applies to one independent walker. Hence, cΓw(t) is obtained as the time-average of the bias-potential taken up to either Γ = t or image file: d2cp04009c-t130.tif. The same procedure applied to cooperative walkers [eqn (22)] yields a readily calculated average over the set of NW correction functions {cΓw(t)}:
 
image file: d2cp04009c-t131.tif(37)

We next consider the Tiwary–Parrinello reweighting protocol,36 for which application of the high-temperature approximation to eqn (15) gives the following result:

 
image file: d2cp04009c-t132.tif(38)
When using well-tempered VES17–19 with the bias potential of eqn (14), the integral image file: d2cp04009c-t133.tif in eqn (38) evaluates to α0 = 0, whereupon the bias-correction function may be expressed
 
image file: d2cp04009c-t134.tif(39)
in the high-temperature limit. Likewise, the BE reweighting37 and its VES equivalent17 implies that cVESΣ(t) = 0 throughout (Section 2).

Albeit approximate, eqn (36)–(39) offer reasonably tractable analytical expressions of c(t), as estimated by integration over either time38 or over CVs.36,37 For instance, Bonomi et al.12 derived an equality between the time-derivatives of c(t) and 〈V(s, t)〉s, which follows trivially by applying either of eqn (36) or (39). We next consider the converged results of Fig. 6d, all of which provided image file: d2cp04009c-t135.tif. (Because the function image file: d2cp04009c-t136.tif depends on the precise simulation period [eqn (22)], we employed image file: d2cp04009c-t137.tif for the results plotted in Fig. 6d). Here, eqn (14), (31), and (39) with α11(t) = − 4.00 kJ mol−1 predict that image file: d2cp04009c-t138.tif. Likewise, eqn (36) and (37) yield image file: d2cp04009c-t139.tif. Hence, in a minimum of computational efforts, the value of c(t) for t → ∞ is predicted with a relative error of 14% as compared with the accurate reference value of Fig. 6d.

6. Conclusions

We have generalized the recently proposed metadynamics ITRE reweighting protocol38 to multiple-walker ensembles implemented within VES, moreover introducing and examining the usage of “independent” and “cooperative” walkers. For well-tempered VES simulations of two model cases, viz. the molecular conformations of N-acetyl-L-alanine methylamide, and a water molecule in the liquid phase subjected to a periodic free-energy function, we examined the relative merits of current state-of-the-art reweighting methods introduced in the VES or metadynamics contexts17,36,37 against four new options: the latter resulted by combining either independent or cooperative walkers with the bias-correction function c(t) estimated by time integration up to either t or across the entire simulation period image file: d2cp04009c-t140.tif; see eqn (21) and (22).

The use of multiple-walker ES/MD simulations accelerates the convergence of the reweighted observables. For all but very small walker ensembles, the MtΣ and image file: d2cp04009c-t141.tif methods that utilize cooperative walkers are superior to those with independent walkers (Mtw and image file: d2cp04009c-t142.tif), with the performance-differences growing for increasing NW. The precise upper time-integration limit of t (MtΣ) or image file: d2cp04009c-t143.tif is not critical for c(t) estimates for MD simulations with (moderately) large cooperative walker ensembles. For small walker ensembles (NW < 8), on the other hand, the advantages of cooperative walkers are minor compared to independent ones. Here, the choice of time-integration limit becomes much more critical—strongly favoring the MtΣ/Mtw protocols relative to their image file: d2cp04009c-t144.tif counterparts—while the precise selection of “good” or “bad” collective variable(s) crucially underpins the convergence of all reweighting protocols.

Although no single reweighting protocol is expected to significantly outperform all others for any conceivable simulation scenario, out of the herein contrasted reweighting methods, the MtΣ scheme with cooperative walkers and the bias-correction function determined by time-integration up to t appears to be the overall most dependable option: it offers a superior accuracy for small walker ensembles than its primary and otherwise equivalent competitor image file: d2cp04009c-t145.tif, along with much more rapid reweighting calculations. Notably, both multiple-walker MtΣ and image file: d2cp04009c-t146.tif protocols are readily implemented in other ES methods, such as metadynamics. Moreover, we demonstrated that reweighting of VES-derived observables by the MtΣ procedure may be accelerated further by exploiting an analytical solution of its bias-correction function, as well as that qualitative insight into c(t) may be gained by approximative analytical expressions in the “high-temperature” regime for all six reweighting protocols that were considered. Computer code for implementing the new reweighting procedures are available at https://www.su.se/profiles/baltzar-1.187342 or may be obtained from the authors on request.

The herein recommended MtΣ scheme provides a better—or at worst comparable—reweighting accuracy as that of the currently best collective-variable-integration methods of Tiwary–Parrinello36 (MTPΣ) and the “balanced exponential” (BE) of Schäfer and Settanni.37 We demonstrated that when the BE reweighting is implemented within the VES framework, it becomes identical to the original VES implementation17–20 (MVESΣ) with a constant bias-correction function. However, VES/BE reweighting often converged slower than the MtΣ, image file: d2cp04009c-t147.tif, and MTPΣ options, while typically giving a larger spread of reweighted observable-values between independent simulations. Albeit that deficiency may be alleviated by following the standard procedure of omitting the initial part of the simulated trajectory in the reweighting to improve accuracy, time-integration-based reweighting schemes offer decisive advantages by not requiring any such additional efforts/precautions along with their accompanying possible introduction of systematic errors. We conclude that enhanced reweighting of the VES/MD-derived observables are expected by embracing time-dependent c(t) ≠ 0 options, such as the MTPΣ reweighting36 and its time-integration MtΣ, image file: d2cp04009c-t148.tif counterparts introduced herein.

Author contributions

BS – conceptualization, investigation, formal analysis and software; ME – funding acquisition and supervision; ME and BS – writing, original draft and revisions.

Conflicts of interest

There are no conflicts of interest to declare.

Acknowledgements

This work was supported by the Swedish Foundation for Strategic Research (funder ID 501100001729; project RMA15–0110), and in part by the Swedish Research Council (project VR 2022-03652). The computations were enabled by resources provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS) and the Swedish National Infrastructure for Computing (SNIC) at NSC, partially funded by the Swedish Research Council through grant agreements no. 2022-06725 and no. 2018-05973. We thank two anonymous reviewers for helpful suggestions that improved the manuscript.

References

  1. G. M. Torrie and J. P. Valleau, Monte Carlo free energy estimates using non-Boltzmann sampling: Application to the sub-critical Lennard-Jones fluid, Chem. Phys. Lett., 1974, 28, 578–581 CrossRef CAS .
  2. G. M. Torrie and J. P. Valleau, Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling, J. Comput. Phys., 1977, 23, 187–199 CrossRef .
  3. U. H. E. Hansmann, Parallel tempering algorithm for conformational studies of biological molecules, Chem. Phys. Lett., 1997, 281, 140–150 CrossRef CAS .
  4. Y. Sugita and Y. Okamoto, Replica-exchange molecular dynamics method for protein folding, Chem. Phys. Lett., 1999, 314, 141–151 CrossRef CAS .
  5. S. Park and K. Schulten, Calculating potentials of mean force from steered molecular dynamics simulations, J. Chem. Phys., 2004, 120, 5946 CrossRef CAS PubMed .
  6. K. M. Bal, Reweighted Jarzynski sampling: Acceleration of rare events and free energy calculation with a bias potential learned from nonequilibrium work, J. Chem. Theory Comput., 2021, 17, 6766–6774 CrossRef CAS PubMed .
  7. A. Laio and M. Parrinello, Escaping free-energy minima, Proc. Natl. Acad. Sci. U. S. A., 2002, 99, 12562–12566 CrossRef CAS PubMed .
  8. A. Laio, A. Rodriguez-Fortea, F. L. Gervasio, M. Ceccarelli and M. Parrinello, Assessing the accuracy of metadynamics, J. Phys. Chem. B, 2005, 109, 6714–6721 CrossRef CAS .
  9. G. Bussi, F. L. Gervasio, A. Laio and M. Parrinello, Free-energy landscape for β hairpin folding from combined parallel tempering and metadynamics, J. Am. Chem. Soc., 2006, 128, 13435–13441 CrossRef CAS PubMed .
  10. A. Laio and F. L. Gervasio, Metadynamics: a method to simulate rare events and reconstruct the free energy in biophysics, chemistry and material science, Rep. Prog. Phys., 2008, 71, 126601 CrossRef .
  11. A. Barducci, G. Bussi and M. Parrinello, Well-tempered metadynamics: A smoothly converging and tunable free-energy method, Phys. Rev. Lett., 2008, 100, 020603 CrossRef PubMed .
  12. M. Bonomi, A. Barducci and M. Parrinello, Reconstructing the equilibrium Boltzmann distribution from well-tempered metadynamics, J. Comp. Chem., 2009, 30, 1615–1621 CrossRef CAS .
  13. M. Bonomi and M. Parrinello, Enhanced sampling in the well-tempered ensemble, Phys. Rev. Lett., 2010, 104, 190601 CrossRef CAS PubMed .
  14. A. Barducci, M. Bonomi and M. Parrinello, Metadynamics, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2011, 1, 826–843 CAS .
  15. J. F. Dama, M. Parrinello and G. A. Voth, Well-tempered metadynamics converges asymptotically, Phys. Rev. Lett., 2014, 112, 240602 CrossRef PubMed .
  16. G. Bussi and A. Laio, Using metadynamics to explore complex free-energy landscapes, Nat. Rev. Phys., 2020, 2, 200–212 CrossRef .
  17. O. Valsson and M. Parrinello, Variational approach to enhanced sampling and free energy calculations, Phys. Rev. Lett., 2014, 113, 090601 CrossRef CAS PubMed .
  18. O. Valsson and M. Parrinello, Well-tempered variational approach to enhanced sampling, J. Chem. Theory Comput., 2015, 11, 1996–2002 CrossRef CAS PubMed .
  19. O. Valsson, P. Tiwary and M. Parrinello, Enhancing important fluctuations: Rare events and metadynamics from a conceptual viewpoint, Annu. Rev. Phys. Chem., 2016, 67, 159–184 CrossRef CAS PubMed .
  20. O. Valsson and M. Parrinello, Variationally enhanced sampling, in Handbook of Materials Modeling, ed. W. Andreoni and S. Yip, Springer, Cham, 2020, pp. 621–634 Search PubMed .
  21. P. M. Piaggi, O. Valsson and M. Parrinello, A variational approach to nucleation simulation, Faraday Discuss., 2016, 195, 557–568 RSC .
  22. P. Shaffer, O. Valsson and M. Parrinello, Enhanced, targeted sampling of high-dimensional free-energy landscapes using variationally enhanced sampling, with an application to chignolin, Proc. Natl. Acad. Sci. U. S. A., 2016, 113, 1150–1155 CrossRef CAS PubMed .
  23. Y. I. Yang and M. Parrinello, Refining collective coordinates and improving free energy representation in variational enhanced sampling, J. Chem. Theory Comput., 2018, 14, 2889–2894 CrossRef CAS PubMed .
  24. L. Mones, N. Bernstein and G. Csányi, Exploration, sampling, and reconstruction of free energy surfaces with Gaussian process regression, J. Chem. Theory Comput., 2016, 12, 5100–5110 CrossRef CAS .
  25. L. Bonati, V. Rizzi and M. Parrinello, Data-driven collective variables for enhanced sampling, J. Phys. Chem. Lett., 2020, 11, 2998–3004 CrossRef CAS .
  26. J. Rydzewski and O. Valsson, Multiscale reweighted stochastic embedding: Deep learning of collective variables for enhanced sampling, J. Phys. Chem. A, 2021, 125, 6286–6302 CrossRef CAS PubMed .
  27. M. Invernizzi, P. M. Piaggi and M. Parrinello, Unified approach to enhanced sampling, Phys. Rev. X, 2020, 10, 041034 CAS .
  28. Q. Liao, Chapter four-enhanced sampling and free energy calculations for protein simulations, Prog. Mol. Biol. Trans. Sci., 2020, 170, 177–213 CAS .
  29. R. Demuynck, S. M. J. Rogge, L. Vanduyfhuys, J. Wieme, M. Waroquier and V. Van Speybroeck, Efficient construction of free energy profiles of breathing metal-organic frameworks using advanced molecular dynamics simulations, J. Chem. Theory C, 2017, 13, 5861–5873 CrossRef CAS .
  30. B. Pampel and O. Valsson, Improving the efficiency of variationally enhanced sampling with wavelet-based bias potentials, J. Chem. Theory Comput., 2022, 18, 4127–4141 CrossRef CAS PubMed .
  31. K. M. Bal and E. C. Neyts, Merging metadynamics into hyperdynamics: Accelerated molecular simulations reaching time scales from microseconds to seconds, J. Chem. Theory Comput., 2015, 11, 4545–4554 CrossRef CAS PubMed .
  32. P. Kříž, Z. Šućur and V. Spiwok, Free-energy surface prediction by flying Gaussian method: Multisystem representation, J. Phys. Chem. B, 2017, 121, 10479–10483 CrossRef PubMed .
  33. S. Kumar, D. Bouzida, R. H. Swendsen, P. A. Kollman and J. M. Rosenberg, The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method, J. Comput. Chem., 1992, 13, 1011–1021 CrossRef CAS .
  34. B. Roux, The calculation of the potential of mean force using computer simulations, Comput. Phys. Comm., 1995, 91, 275–282 CrossRef CAS .
  35. L. Donati and B. G. Keller, Girsanov reweighting for metadynamics simulations, J. Chem. Phys., 2018, 149, 072335 CrossRef PubMed .
  36. P. Tiwary and M. Parrinello, A time-independent free energy estimator for metadynamics, J. Phys. Chem. B, 2015, 119, 736–742 CrossRef CAS PubMed .
  37. T. M. Schäfer and G. Settanni, Data reweighting in metadynamics simulations, J. Chem. Theory Comput., 2020, 16, 2042–2052 CrossRef .
  38. F. Giberti, B. Cheng, G. A. Tribello and M. Ceriotti, Iterative unbiasing of quasi-equilibrium sampling, J. Chem. Theory Comput., 2020, 16, 100–107 CrossRef CAS PubMed .
  39. F. Pietrucci, Strategies for the exploration of free energy landscapes: Unity in diversity and challenges ahead, Rev. Phys., 2017, 2, 32–45 CrossRef .
  40. S. K. Veesam, S. Ravipati and S. N. Punnathanam, Recent advances in thermodynamics and nucleation of gas hydrates using molecular modeling, Curr. Opin. Chem. Eng., 2019, 23, 14–20 CrossRef .
  41. S. Fukuhara, K. M. Bal, E. C. Neyts and Y. Shibuta, Accelerated molecular dynamics simulation of large systems with parallel collective variable-driven hyperdynamics, Comp. Mat. Sci., 2020, 177, 109581 CrossRef CAS .
  42. J. J. Varghese and S. H. Mushrif, Origins of complex solvent effects on chemical reactivity and computational tools to investigate them: a review, React. Chem. Eng., 2019, 4, 165 RSC .
  43. S. Xu and E. A. Carter, Theoretical insights into heterogeneous (photo)electrochemical CO2 reduction, Chem. Rev., 2019, 119, 6631–6669 CrossRef CAS PubMed .
  44. P. Liu and D. Mei, Identifying free energy landscapes of proton-transfer processes between Brønsted acid sites and water clusters inside the zeolite pores, J. Phys. Chem. C, 2020, 124, 22568–22576 CrossRef CAS .
  45. J. Coines, L. Raich and C. Rovira, Modeling catalytic reaction mechanisms in glycoside hydrolases, Curr. Opin. Chem. Biol., 2019, 53, 183–191 CrossRef CAS PubMed .
  46. P. Ibrahim and T. Clark, Metadynamics simulations of ligand binding to GPCRs, Curr. Opin. Struct. Biol., 2019, 55, 129–137 CrossRef CAS PubMed .
  47. A. V. Dongre, S. Das, A. Bellur, S. Kumar, A. Chandrashekarmath, T. Karmakar, P. Balaram, S. Balasubramanian and H. Balaram, Structural basis for the hyperthermostability of an archaeal enzyme induced by succinimide formation, Biophys. J., 2021, 120, 3732–3746 CrossRef CAS PubMed .
  48. Z. Xu, Y. Yang, Z. Wang, D. Mkhonto, C. Shang, Z.-P. Liu, Q. Cui and N. Sahai, Small molecule-mediated control of hydroxyapatite growth: Free energy calculations benchmarked to density functional theory, J. Comput. Chem., 2014, 35, 70–81 CrossRef CAS PubMed .
  49. Q. Wang, M. Wang, K. Wang, Y. Liu, H. Zhang, X. Lu and X. Zhang, Computer simulation of biomolecule-biomaterial interactions at surfaces and interfaces, Biomed. Mater., 2015, 10, 032001 CrossRef PubMed .
  50. H. Heinz and H. Ramezani-Dakhel, Simulations of inorganic-bioorganic interfaces to discover new materials: insights, comparisons to experiment, challenges, and opportunities, Chem. Soc. Rev., 2016, 45, 412–448 RSC .
  51. J. McCarty, O. Valsson, P. Tiwary and M. Parrinello, Variationally optimized free-energy flooding for rate calculation, Phys. Rev. Lett., 2015, 115, 070601 CrossRef .
  52. P. Shaffer, O. Valsson and M. Parrinello, Hierarchical protein free energy landscapes from variationally enhanced sampling, J. Chem. Theory Comput., 2016, 12, 5751–5757 CrossRef CAS PubMed .
  53. P. Raiteri, A. Laio, F. L. Gervasio, C. Micheletti and M. Parrinello, Efficient reconstruction of complex free energy landscapes by multiple walkers metadynamics, J. Phys. Chem. B, 2006, 110, 3533–3539 CrossRef CAS PubMed .
  54. F. Giberti, G. A. Tribello and M. Ceriotti, Global free-energy landscapes as a smoothly joined collection of local maps, J. Chem. Theory Comput., 2021, 17, 3292–3308 CrossRef CAS PubMed .
  55. M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Clarendon Press, Oxford, 1987 Search PubMed .
  56. L. Molgedey and H. G. Schuster, Separation of a mixture of independent signals using time delayed correlations, Phys. Rev. Lett., 1994, 72, 3634–3637 CrossRef .
  57. B. Stevensson and M. Edén, Metadynamics simulations of the pH-dependent adsorption of phosphoserine and citrate on disordered apatite surfaces: What interactions govern the molecular binding?, J. Phys. Chem. B, 2021, 125, 11987–12003 CrossRef CAS PubMed .
  58. R. Mathew, B. Stevensson, M. Pujari-Palmer, C. S. Wood, P. R. A. Chivers, C. D. Spicer, H. Autefage, M. M. Stevens, H. Engqvist and M. Edén, Nuclear magnetic resonance and metadynamics simulations reveal the atomistic binding of L-serine and O-phospho-L-serine at disordered calcium phosphate surfaces of biocements, Chem. Mater., 2022, 34, 8815–8830 CrossRef CAS PubMed .
  59. M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess and E. Lindahl, GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers, SoftwareX, 2015, 1–2, 19–25 CrossRef .
  60. U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen, A smooth particle mesh Ewald method, J. Chem. Phys., 1995, 103, 8577–8593 CrossRef CAS .
  61. G. Bussi, D. Donadio and M. Parrinello, Canonical sampling through velocity rescaling, J. Chem. Phys., 2007, 126, 014101 CrossRef PubMed .
  62. G. A. Tribello, M. Bonomi, D. Branduardi, C. Camilloni and G. Bussi, PLUMED 2: New feathers for an old bird, Comput. Phys. Comm., 2014, 185, 604–613 CrossRef CAS .
  63. F. Bach and E. Moulines, Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n), in Advances in neural information processing systems, ed. C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, Curran Associates, Inc., Red Hook, NY, 2013, vol. 26, pp. 773–781 Search PubMed .
  64. P. Bjelkmar, P. Larsson, M. A. Cuendet, B. Hess and E. Lindahl, Implementation of the CHARMM force field in GROMACS: Analysis of protein stability effects from correction maps, virtual interaction sites, and water models, J. Chem. Theory Comput., 2010, 6, 459–466 CrossRef CAS PubMed .
  65. B. Efron and C. Stein, The jackknife estimate of variance, Ann. Statist., 1981, 9, 586–596 Search PubMed .
  66. W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey and M. L. Klein, Comparison of simple potential functions for simulating liquid water, J. Chem. Phys., 1983, 79, 926–935 CrossRef CAS .
  67. S. Kullback and R. A. Leibler, On information and sufficiency, Annals Math. Stat., 1951, 22, 79–86 CrossRef .

Footnote

Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2cp04009c

This journal is © the Owner Societies 2023