Improved reweighting protocols for variationally enhanced sampling simulations with multiple walkers

Baltzar Stevensson; Mattias Edén

doi:10.1039/D2CP04009C

Open Access Article

This Open Access Article is licensed under a
Creative Commons Attribution 3.0 Unported Licence

DOI: 10.1039/D2CP04009C (Paper) Phys. Chem. Chem. Phys., 2023, 25, 22063-22078

Baltzar Stevensson * and Mattias Edén *
Department of Materials and Environmental Chemistry, Stockholm University, SE-106 91 Stockholm, Sweden. E-mail: baltzar.stevensson@mmk.su.se; mattias.eden@mmk.su.se

Received 29th August 2022, Accepted 2nd July 2023

First published on 3rd July 2023

Abstract

In molecular dynamics simulations utilizing enhanced-sampling techniques, reweighting is a central component for recovering the targeted ensemble averages of the “unbiased” system by calculating and applying a bias-correction function c(t). We present enhanced reweighting protocols for variationally enhanced sampling (VES) simulations by exploiting a recent reweighting method, originally introduced in the metadynamics framework [Giberti et al. J. Chem. Theory Comput., 2020, 16, 100–107], which was modified and extended to multiple-walker simulations: these may be implemented either as “independent” walkers (associated with one unique correction function per walker) or “cooperative” ones that all share one correction function, which is the hitherto only explored option. When each case is combined with the two possibilities of determining c(t) by time integration either up to t or over the entire simulation period, altogether four reweighting options result. Their relative merits were assessed by well-tempered VES simulations of two model problems: locating the free-energy difference between two metastable molecular conformations of the N-acetyl-L-alanine methylamide dipeptide, and the recovery of an a priori known distribution when one water molecule in the liquid phase is perturbed by a periodic free-energy function. The most rapid convergence occurred for large ensembles of cooperative walkers, regardless of the upper integration limit, but integrating up to t proved advantageous for small walker ensembles. That novel reweighting method compared favorably to the standard VES reweighting, as well as to current state-of-the-art reweighting options introduced for metadynamics simulations that estimate c(t) by integration over the collective variables. For further gains in computational speed and accuracy, we also introduce analytical solutions for c(t) and offer further insight into its features by approximative analytical expressions in the “high-temperature” regime.

1. Introduction

Enhanced sampling (ES) techniques for molecular dynamics (MD) simulations, such as umbrella sampling,^1,2 replica exchange,^3,4 and steered MD^5,6 along with the more recent options of metadynamics,^7–16 variationally enhanced sampling (VES),^17–23 and machine learning,^24–26 offer powerful means to accelerate the convergence of MD simulations via an enhanced free-energy surface-sampling by avoiding that states are revisited. Hence, they may address challenging systems featuring multiple local energy minima separated by high barriers that would require a prohibitive time-scale by classical MD simulations. This is accomplished by adding a time-dependent bias potential, V(s, t), which depends on a set of collective variables (CVs) of the system, denoted by s.^7–13 Numerous refinements of ES protocols have been presented, encompassing the precise choices of CVs^{14,16,23,24,26–28} and bias-potential parametrization,^{10,15,22,29,30} along with miscellaneous very recent options,^27,31,32 as well as improved reweighting procedures^12,33–38 to recover the targeted unbiased free energy, F(s), from the sum F(s) + V(s, t) that governs the biased MD trajectory.

In particular, metadynamics has been widely applied for modeling (bio)chemical processes, such as the nucleation and growth of carbon nano-tubes,^39–41 proton transfers,^42–44 conformations of biomolecules in solution,^28,45–47 as well as their binding at inorganic surfaces.^48–50 The herein utilized VES procedure^17–23 has also been employed for modeling molecular conformations in solution^18,51 and crystal nucleation,^21,30 whereas the ability of VES to handle large sets of CVs has enabled simulations of protein folding.^22,52

Two powerful options for accelerating the convergence of the ensemble-averaged free energy and other observables from ES/MD simulations involve either (I) calculation of the time-dependent bias-correction function, c(t), by analyzing non-equilibrated systems using time integration, as in the very recent “iterative trajectory reweighting” (ITRE) algorithm introduced for metadynamics simulations by Giberti et al.,³⁸ or (II) an enhanced simultaneous sampling of the CV space by performing multiple (N_W) computations in parallel, each referred to as a “walker” and subjected to the same bias potential.^{10,14,16,19,27,46,53} Using several such walkers may greatly accelerate the convergence of ES/MD simulations.^{10,14,16,19,27,53} For multiple-walker ES simulations, we introduce the concepts of “independent” and “cooperative” walkers, which are associated with N_W distinct bias-correction functions {c_w(t)}, and one shared c(t) function [denoted by c_Σ(t)], respectively; see Section 3. Section 2 outlines the salient features of VES and reweighting, moreover reviewing the hitherto (probably) most efficient and accurate options for determining c(t).^36–38

By coupling options (I) and (II), thereby generalizing a modified ITRE protocol to multiple-walker simulations and integrating it with the VES framework,^17–19,23 we demonstrate that an overall accelerated convergence results relative to the reweighting procedure of the original VES protocol.^17–20 The latter is moreover shown to be equivalent to the balanced exponential (BE) reweighting of Schäfer and Settanni³⁷ when implemented within the VES scope. Our proposed reweighting method was selected from assessments of four new protocols for estimating c(t) by combining the options of (A) independent or cooperative walker ensembles with (B) estimations of c(t) by time-integration either up to t or across the entire MD simulation time period. The convergence properties of the altogether four distinct reweighting procedures were evaluated for two model problems (Section 5): (i) locating the free-energy difference between two metastable molecular conformations of the N-acetyl-L-alanine methylamide dipeptide, which is a widely exploited system for benchmarking ES developments;^{12,14,17,26,27,36,37,54} and (ii) the convergence to a known distribution when one water molecule in the liquid phase is subjected to a known (artificial) free-energy perturbation.

We discuss the relative merits of the novel reweighting options for “small” and “large” ensembles of both cooperative and independent walkers for well-tempered VES implementations with both “good” and “poor” collective-variable selections, the choice of which strongly affects the convergence of the computed ensemble-averaged observables. Nonetheless, because an optimal choice of collective variable often remains a priori unknown, simulation protocols that simultaneously offer quick convergence and high accuracy even for unfavorable collective variables are desirable. The overall most rapid convergence resulted for simulations with cooperative walkers combined with c(t) calculated by time integration up to t (rather than over the entire simulation period). That herein advocated “M^t_Σ” procedure compared favorably both to the standard reweighting procedure in VES^17–19 and to the state-of-the-art metadynamics reweighting protocol by Tiwary and Parrinello³⁶ (when implemented within VES). For further computational speed and accuracy enhancements, we also provide analytical solutions for the c(t) function of the M^t_Σ protocol (Section 3.2). These time savings and accuracy boosts are expected to be of great utility for further ES/MD-simulation studies.

2. Theoretical background

2.1 Ensemble-averaged observables from biased trajectories

The ensemble average 〈O(R)〉 of an observable O(R) that depends on spatial coordinates R is defined^{7–10,14,16,19,55}


	(1)

It may be calculated according to


	(2)

where U(R) is the internal energy of the system, and β = (k_BT)⁻¹, where k_B and T are Boltzmann's constant and the absolute temperature, respectively. The partition functions of the biased (Z_V) and unbiased (Z) ensembles are given by


Z_V = ∫dR exp{−β[U(R) + V(s(R), t)]}	(3)

and


Z = ∫dR exp{−βU(R)},	(4)

respectively.

The unbiased probability distribution, P(s), is defined by


P(s) = Z⁻¹∫dR δ(s − s(R)) exp{−βU(R)},	(5)

where δ(x) is the Dirac delta function. After integration over R, the probability distribution may be expressed as


P(s) = Z⁻¹exp{−βF(s)},	(6)

where “s” implies either that the collective variable(s) is/are independent of the spatial coordinates or depend(s) only on a specifically selected subset thereof. The introduction of the two exp{±βV(s(R), t)} factors in eqn (2), along with the multiplication of both its numerator and denominator by Z_V, implies that the biased simulation samples configurations from the V(s, t)-biased probability distribution,^{7–10,14,16,19}


P_V(s, t) = Z_V⁻¹exp{−β[F(s) + V(s, t)]}.	(7)

By equating the ratio Z_V/Z of eqn (2) with the exponentiated bias-correction function, i.e., Z_V/Z = exp{−βc(t)}, eqn (2) may be written


	(8)

By assuming ergodicity, i.e., a time/ensemble-averaging equivalence, 〈O(R)〉 may be estimated from the time-average over the biased MD-generated trajectory, according to^{7–10,14,16,19,27,36–38}


	(9)

Eqn (9) is the prevailing route to calculate ensemble averages from ES simulations, where the calculation of c(t) becomes the central task for recovering F(s) and 〈O(R)〉 from the biased trajectories.^{7–10,14,16,19}
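The reweighting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each trajectory frame receives the weight exp{β[V(s(t), t) − c(t)]}, and the average is normalized by the sum of the weights, which may differ from the exact normalization of eqn (9). The function name `reweighted_average` and its arguments are hypothetical and not part of any package used in this work.

```python
import numpy as np

KB = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1


def reweighted_average(obs, bias, c, temperature=310.15):
    """Weighted time average of `obs` with weights exp{beta [V - c]}.

    obs, bias, c: 1-D sequences sampled at the same time-points, where
    bias[i] is V(s(t_i), t_i) and c[i] is c(t_i), both in kJ/mol.
    The self-normalization (division by the weight sum) is an assumption.
    """
    beta = 1.0 / (KB * temperature)
    log_w = beta * (np.asarray(bias, dtype=float) - np.asarray(c, dtype=float))
    log_w -= log_w.max()  # guards against overflow; cancels on normalization
    w = np.exp(log_w)
    return float(np.sum(w * np.asarray(obs, dtype=float)) / np.sum(w))
```

With a vanishing bias and c(t) = 0, all weights are equal and the function reduces to a plain time average.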

Moreover, for a well-tempered ensemble over a sufficiently long time, the bias potential V(s, t) is related to F(s) via the bias factor γ by^{11–15,19,20}


V(s, t) = −(1 − γ⁻¹)F(s),	(10)

whereas P_V(s, t) [eqn (7)] is related to P(s), according to


	(11)

2.2 Variationally enhanced sampling

The variationally enhanced sampling^17–23 protocol aims at minimizing a bias-potential-dependent functional, Λ(V(s, t)), which for a time-dependent bias potential, V(s, t), and a well-tempered target distribution, P_V(s, t) (eqn (11)) is given by^17,19–21


	(12)

The global minimum of the convex functional Λ(V(s, t)) is^18,19


V(s, t) = −F(s) − (βγ)⁻¹lnP(s) − β⁻¹lnZ_V.	(13)

For practical computations, the bias potential is expanded in a suitable set of k basis functions, whose corresponding {α_k} expansion coefficients are the variational parameters that are updated iteratively during the minimization of Λ(V(s, t)).^17,19–22 For the present calculations that involve s-periodic bias potentials, we employed a combined cosine and sine Fourier series,


	(14)

where α₀ = 0 in previous^17–23 as well as our current VES implementations.

With the recent exception of Yang and Parrinello,²³ who combined the time-lagged independent component analysis⁵⁶ and VES, all previous reweighting implementations were tantamount to using a constant bias-correction function, i.e., effectively c(t) = 0, which only holds strictly for “late” time-points of the MD simulation; see Sections 2.3.1, 3.1 and 5.3. Herein, we demonstrate that state-of-the-art reweighting procedures from the metadynamics context that employ time-dependent bias-correction functions^36,38 may offer both a more rapid and a reliable convergence of the modeled observables relative to the standard VES reweighting implementation with c(t) = 0.

2.3 Efficient strategies for calculating the time-dependent bias correction

Here we review current state-of-the-art approaches—all from the realm of (well-tempered) metadynamics—for estimating the bias-potential-correction function by integrating either over collective variables^36,37 or over time.^38,54

2.3.1 Integration over collective variables. Tiwary and Parrinello³⁶ showed that c(t), expressed by an integration over the entire CV space,


	(15)

may be estimated by using the well-tempered relation for the free energy


	(16)

Because eqn (16) is only exact once the entire CV space is sampled, which formally demands that t → ∞, reweighting via eqn (15) and (16) becomes accurate only after a “transient” time period on a (sub)ns scale,^19,36 where the a priori unknown lower limit is herein denoted by t_min. The calculations may otherwise introduce non-negligible errors, and no universal and truly accurate procedure that accounts for the unpopulated CV values has hitherto been presented. Notably, as illustrated in Section 5.2, the same caveat applies to the standard VES reweighting (Sections 2.2 and 3.1).

The combination of eqn (15) and (16) is the key feature of the procedure by Tiwary and Parrinello³⁶ for estimating c(t) via an integration over the CVs; it is henceforth denoted by M^TP_Σ, and is associated with a bias-potential-correction function c^TP_Σ(t) obtained from eqn (15), where the subscript Σ stresses the use of cooperative walkers for multi-walker simulations. We have successfully utilized the Tiwary–Parrinello reweighting protocol within the VES formalism for studying biomolecular binding at calcium phosphate surfaces.^57,58
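A minimal Python sketch of this CV-integration estimate follows. It assumes that eqn (16) amounts to inverting eqn (10), i.e., F(s) = −γ/(γ − 1)·V(s, t) up to a constant, and that eqn (15) follows the sign convention Z_V/Z = exp{−βc(t)} of Section 2.1. The function name `c_tiwary_parrinello` and the uniform-grid representation of V(s, t) are hypothetical choices made for illustration.

```python
import numpy as np

KB = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1


def c_tiwary_parrinello(bias_on_grid, gamma, temperature=310.15):
    """Estimate c(t) from V(s, t) tabulated on a uniform CV grid.

    The grid spacing cancels in the ratio of the two integrals, so plain
    sums over the grid points suffice.
    """
    beta = 1.0 / (KB * temperature)
    v = np.asarray(bias_on_grid, dtype=float)
    free_energy = -gamma / (gamma - 1.0) * v  # assumed well-tempered relation
    # log of sum_s exp{-beta F(s)} and of sum_s exp{-beta [F(s) + V(s, t)]}
    log_num = np.logaddexp.reduce(-beta * free_energy)
    log_den = np.logaddexp.reduce(-beta * (free_energy + v))
    return float((log_num - log_den) / beta)
```

As a sanity check of the sign convention, a spatially constant bias V(s, t) = a yields c(t) = a, so that V − c vanishes and no frame is reweighted.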

With their “balanced exponential” (BE) protocol, Schäfer and Settanni³⁷ suggested that a better estimate of c(t) [eqn (15)] may be obtained from the readily calculated average value of V(s,t) over the CVs:³⁷


	(17)

Metadynamics simulations utilizing reweighting by the BE procedure compared favorably³⁷ with that of Tiwary and Parrinello,³⁶ as well as with an earlier option introduced by Bonomi et al.¹² (the latter is not considered further herein). Incidentally, for VES implementations with V(s,t) expressed according to eqn (14), eqn (17) evaluates to


c(t) = α₀ = 0.	(18)

Hence, the bias-correction function from the metadynamics-stemming BE reweighting coincides with that of VES. Consequently, we will in the following refer to this reweighting method as M^VES_Σ but we will cite both ref. 17 and 37 to emphasize that once the BE reweighting is applied within the VES context, it reduces to the “standard” VES reweighting with a time-independent bias-correction function.
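The vanishing CV average can be checked numerically with a short Python sketch. It assumes that eqn (14) has the generic form V(s) = α₀ + Σ_k [a_k cos(ks) + b_k sin(ks)] on a periodic CV s ∈ [−π, π); the coefficient names `a` and `b`, the helper names, and the random coefficient values are purely illustrative.

```python
import numpy as np


def fourier_bias(s, alpha0, a, b):
    """V(s) = alpha0 + sum_k [a_k cos(k s) + b_k sin(k s)], k = 1..len(a)."""
    s = np.asarray(s, dtype=float)
    k = np.arange(1, len(a) + 1)[:, None]
    terms = (np.asarray(a)[:, None] * np.cos(k * s)
             + np.asarray(b)[:, None] * np.sin(k * s))
    return alpha0 + np.sum(terms, axis=0)


def cv_average(values):
    """Average over a uniform periodic grid (endpoint excluded)."""
    return float(np.mean(values))


grid = np.linspace(-np.pi, np.pi, 720, endpoint=False)
rng = np.random.default_rng(0)
a, b = rng.normal(size=6), rng.normal(size=6)  # arbitrary coefficients, N_F = 6
c_be = cv_average(fourier_bias(grid, 0.0, a, b))  # vanishes to rounding error
```

Every cosine and sine term integrates to zero over a full period, so only the constant term α₀ survives the average, whatever the values of the remaining coefficients.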

2.3.2 Integration over time. Giberti et al.³⁸ recently introduced the ITRE reweighting procedure for calculating c(t) and F(s) from metadynamics simulations, which they argued offers significant benefits over s-integration-based reweighting counterparts, such as those of ref. 12, 19 and 36. The ITRE protocol estimates F(s) from the unbiased distribution by integrating eqn (9) up to time-point t (Fig. 1a),


	(19)

whereupon combination of eqn (15) and (19) yields the expression³⁸


	(20)

We refer to ref. 38 for details on practical ITRE implementations within the metadynamics scope, whereas Section 3.1 presents a modified procedure implemented herein within VES, along with extensions to multiple-walker simulations.


	Fig. 1 (a, b) The time-integration limits of eqn (21) with (a) Γ = t and (b) Γ spanning the entire simulation period, depicted for the specific case of estimating c(t₄) for n = 8 time-points {t_j} [eqn (23)]. (c) Graphical illustration of the estimation of c^t_w(t₄) for one walker by using the analytical solution of eqn (27), which involves the parameters A [eqn (25a)], C [eqn (25c)], and D [eqn (25d)]. The bias-potential terms for parameter C, V(s_w(τ_k), τ_k), and for parameter D, V(s_w(t₄), t₄), are depicted by green and black dots, respectively, whereas parameter A involves both V(s_w(τ_k), t₄) (red dots) and V(s_w(τ_k), τ_k). Note that the index 0 ≤ k ≤ 3 implies that τ is integrated up to t₄, as in (a).

3. Enhanced reweighting procedures

3.1 New reweighting protocols

The metadynamics-associated ITRE strategy by Giberti et al.³⁸ for estimating c(t) by integrating eqn (20) over the history of V(s, t) is readily generalized to multiple-walker simulations within the VES framework, which may involve N_W independent or cooperative walkers. Note that for a given simulation, all walkers (independent and/or cooperative) share the same bias potential and only differ in their bias-correction functions (vide infra). Hence, “independent walkers” should not be confused with “independent simulations”. Moreover, the time-integration of eqn (20) may for each case be evaluated by using either t (as in ref. 38) or the end of the entire simulation period as the upper integration limit (Fig. 1a,b), which lead to time-dependent [P(s, t)] and time-independent distribution functions, respectively. Consequently, combining both pairs of “walker” and “time-integration” options furnishes four new reweighting methods. Each one is denoted by the symbol M with a superscript Γ and a subscript: the superscript gives the time-integration limit (Γ = t or the entire simulation period), while the subscript identifies the scenario of either cooperative (Σ) or independent (w) walkers, where “w” is an index w = {1, 2, …, N_W}.

Each independent walker w associates with its “own/unique” function, c^Γ_w(t), calculated by a generalized form of eqn (20), according to


	(21)

where s_w(τ) is the CV value of walker w at time-point τ. All cooperative walkers, on the other hand, share the same bias-correction function, c^Γ_Σ(t), which is obtained from


	(22)

Eqn (21) and (22) were in practice implemented by sampling each s_w(τ) function at n discrete time-points


t_j = jΔτ, 0 ≤ j ≤ n − 1,	(23)

each separated by Δτ, yielding a self-consistent system of equations with the solution {c(t₀), c(t₁), …, c(t_{n−1})}, as depicted schematically in Fig. 1a,b.

Because the ITRE protocol³⁸ and its generalized multiple-walker expressions given herein [eqn (21) and (22)] compute c(t) by time integration, either up to t (ref. 38) or over the entire simulation period, they do not explicitly assume ergodicity and are thereby less prone to acquire systematic errors that may decelerate the metadynamics/VES convergence. Notably, that contrasts with the M^TP_Σ and M^VES_Σ approaches of ref. 17, 36, and 37 that utilize s-integration and rely on the validity of eqn (16); see Section 2.3.1. Although their reweighting accuracy may improve by restricting the evaluation of eqn (15) to the data with t ≥ t_min, the precise value of t_min is a priori unknown and must be deduced empirically.^19,36

The reweighting evaluations herein included the entire time domain to enable continuous “convergence curves” for all methods and simulation periods (Section 5), thereby offering practical assessments of M^TP_Σ and M^VES_Σ against the new time-integration-based reweighting protocols, none of which requires any t < t_min truncation of the data set and consequently also no assumptions about the unknown limit t_min. This feature is a decisive advantage of reweighting by time-integration.³⁸

3.2 Analytical solutions for the bias-correction function c(t)

Albeit the computational efforts of the reweighting stage remain truly marginal as compared with those of its underlying MD simulations and eqn (22) may be solved in ≈3 iterations,³⁸ here we provide analytical solutions for the bias-correction functions c^t_Σ(t) and c^t_w(t), whose computational costs match that of one sole numerical iteration cycle, thereby offering significant advantages both in terms of speed and accuracy.

To solve eqn (22) analytically, we express it according to


	(24)

where each parameter A, B, C, and D is given by a summation over exponentiated functions evaluated at time-points {t_j} [eqn (23)], each separated by Δτ:


	(25a)


B = ΔτN_W,	(25b)


	(25c)


	(25d)

Fig. 1c illustrates the various time-dependent bias-potential components [V(s_w(τ_k), t_j)] of eqn (25) for one walker and t_j value. Here, the C and D parameters depend only on the values V(s_w(τ_k), τ_k) (green dots) and V(s_w(t_j), t_j) (black dot), respectively, whereas A involves both V(s_w(τ_k), t_j) (red dots) and V(s_w(τ_k), τ_k).

By identifying x_j ≡ exp{−βc(t_j)}, eqn (24) may be represented as


Dx_j² + (C − B)x_j − A = 0, 0 ≤ j ≤ n − 1,	(26)

which may be solved analytically:


x_j = {(B − C) + [(C − B)² + 4AD]^{1/2}}/(2D), with c(t_j) = −β⁻¹ln x_j.	(27)

Note that A = C = 0 at t₀ = 0, which in the absence of a bias potential [V(s_w(t₀), t₀) = 0] implies that B = D and c(t₀) = 0. For each consecutive time-point t_j (j = 0, 1, …, n − 1), eqn (25) are evaluated, whereupon c(t_j) is determined from eqn (27).
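As a hedged sketch, one discretization of eqn (22) that is consistent with eqn (26), with B = ΔτN_W, and with the description of Fig. 1c is written out below. The summation ranges and the placement of c(τ_k) inside A and C are assumptions rather than the authors' printed definitions; the starting point is the convention Z_V/Z = exp{−βc(t)} of Section 2.1.

```latex
% Assumed discretized form for cooperative walkers, with x_j = exp{-beta c(t_j)}.
% The terms with k < j use the already known c(tau_k); the k = j term holds x_j.
\begin{align}
x_j &= \frac{A + B\,x_j}{C + D\,x_j}, \\
A &= \Delta\tau \sum_{w=1}^{N_W} \sum_{k=0}^{j-1}
     \exp\{\beta[V(s_w(\tau_k),\tau_k) - c(\tau_k) - V(s_w(\tau_k),t_j)]\}, \\
B &= \Delta\tau\, N_W, \\
C &= \Delta\tau \sum_{w=1}^{N_W} \sum_{k=0}^{j-1}
     \exp\{\beta[V(s_w(\tau_k),\tau_k) - c(\tau_k)]\}, \\
D &= \Delta\tau \sum_{w=1}^{N_W} \exp\{\beta V(s_w(t_j),t_j)\}.
\end{align}
% Multiplying out gives D x_j^2 + (C - B) x_j - A = 0, i.e., eqn (26).
```

Under these assumptions the product of the two roots equals −A/D ≤ 0, so at most one root is positive; because x_j is an exponential, the positive root is the physically meaningful one. At t₀ the sums in A and C are empty, which reproduces A = C = 0.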

Notably, eqn (27) applies to calculations of c(t) for the practically most relevant scenario of cooperative walkers, whereas that for independent walkers follows trivially because each of the sums A–D in eqn (25) collapses into one sole term for each walker w. These analytical solutions were employed in all computations presented below.
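The sequential evaluation for cooperative walkers can be sketched in Python as follows. This is not the authors' implementation: the forms of A, C, and D are assumptions chosen to be consistent with eqn (26), with B = ΔτN_W, and with Fig. 1c, and the function name `c_cooperative` and the array layout `v_hist[j, k, w]` are hypothetical.

```python
import numpy as np

KB = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1


def c_cooperative(v_hist, temperature=310.15, dtau=1.0):
    """Solve D x^2 + (C - B) x - A = 0 for x_j = exp{-beta c(t_j)}, j = 0..n-1.

    v_hist[j, k, w] = V(s_w(tau_k), t_j): the bias at time-point t_j evaluated
    at the CV value that walker w visited at time-point tau_k (only k <= j is
    used). Energies are in kJ/mol.
    """
    v_hist = np.asarray(v_hist, dtype=float)
    beta = 1.0 / (KB * temperature)
    n, _, n_walkers = v_hist.shape
    # V(s_w(tau_k), tau_k) for every time-point and walker, shape (n, N_W)
    diag = np.array([v_hist[k, k, :] for k in range(n)])
    c = np.zeros(n)
    for j in range(n):
        # weights of earlier frames: exp{beta [V(s_w(tau_k), tau_k) - c(tau_k)]}
        w_prev = np.exp(beta * (diag[:j] - c[:j, None]))
        A = dtau * np.sum(w_prev * np.exp(-beta * v_hist[j, :j, :]))
        B = dtau * n_walkers
        C = dtau * np.sum(w_prev)
        D = dtau * np.sum(np.exp(beta * diag[j]))
        x = ((B - C) + np.sqrt((C - B) ** 2 + 4.0 * A * D)) / (2.0 * D)
        c[j] = -np.log(x) / beta
    return c
```

For production-size trajectories the full (n, n, N_W) array would be replaced by on-the-fly evaluation of the stored Fourier coefficients, and large values of βV would call for a log-domain formulation; both refinements are left out of this sketch.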

4. Computational methods

4.1 General simulation conditions

All atomistic MD simulations involved NVT ensembles at T = 37 °C, utilizing the GROMACS v2018.1 platform.⁵⁹ The equations of motion were integrated in steps of 0.9 fs by using the velocity Verlet integrator.⁵⁵ The Coulomb interactions were calculated with a smoothed particle–mesh Ewald summation⁶⁰ of order four and a tolerance of 10⁻⁵, using a Fourier spacing of 0.12 and a switch distance of 1.2 nm, while the van der Waals interactions were truncated at 1.2 nm. The temperature was controlled by the velocity rescale thermostat⁶¹ with a 1.0 ps time constant.

The well-tempered VES simulations^17–19 employed the PLUMED2.4 software.⁶² To enhance the configurational sampling, a well-tempered target distribution with bias factor γ = 5 was employed along with the CV. The time-dependent bias potential, V(s, t), was expanded out to order N_F = 6 in the CV (eqn (14)). The well-tempered target distribution P_V(s, t) = exp{βV(s, t)/(γ − 1)} and the {α_k(t)} coefficients were calculated iteratively during the simulation by using the averaged-stochastic-gradient descent algorithm⁶³ with a step size of μ = 1.0 to minimize the variational functional eqn (12) by the procedures described in ref. 17 and 63. The time integration spans the interval Δτ between each bias-potential update.¹⁷ To minimize numerical errors, the Fourier coefficients {α_k(t)} were updated every Δτ = 0.9 ps and then stored at each time-point, while P_V(s, t) was updated every 0.45 ns.

The accuracy of the bias-potential evaluation was improved by sampling different CV domains by employing N_W walkers, each operating within an independently generated system and starting from different configurations to ensure that they sample different MD trajectories.

4.2 Alanine dipeptide simulations

One N-acetyl-L-alanine methylamide molecule, henceforth referred to as “alanine dipeptide” (Fig. 2), was simulated in vacuum by using the CHARMM36/CMAP all-atom force field (July 2017).⁶⁴ The volume of the cubic cell was kept constant at V = 36.7 nm³ to avoid undesirable boundary-condition effects. The simulations were performed with two distinct CVs represented by either torsion angle s = ϕ or s = ψ (Fig. 2), as well as with ensembles of N_W = {4, 8, 16, 64} cooperative and independent walkers. As described further in Section 5.1, every simulated {s, N_W} combination was reweighted by each novel time-integration protocol along with M^TP_Σ (ref. 36) and M^VES_Σ (ref. 17 and 37), by evaluating ΔF between two metastable molecular conformations for increasing simulation time.


	Fig. 2 Illustration of the two “C7eq” and “C7ax” molecular conformations of N-acetyl-L-alanine methylamide (“alanine dipeptide”) with the two torsion angles ϕ and ψ indicated, along with the free-energy surface.

The convergence of each reweighting method was assessed from N_sim = {16, 16, 8, 6} independent but nominally identical simulations for the respective walker ensembles with N_W = {4, 8, 16, 64} by calculating the root-mean-square (rms) deviation of ΔF to the fully converged reference value ΔF_ref = 6.80 kJ mol⁻¹, i.e., we evaluated the entity rms(ΔF − ΔF_ref). ΔF_ref was determined from three independent VES/MD simulations that utilized both CVs, s = {ϕ, ψ}, for a long simulation period (using one walker). Moreover, because the bias potential V(s, t) converged well within 20 ns, employing the standard VES reweighting protocol^17–19 for t > 20 ns yielded the same value of ΔF_ref = 6.80 kJ mol⁻¹ for all three simulations. Note that all reweighting methods converge to the same result (see Section 5.1). The initial configuration of each walker of the subsequent N_sim simulations constituted a randomly selected frame for t > 20 ns of the fully converged 100 ns MD simulations.

The variance among the rms results of the N_sim simulations was determined by the Jackknife method⁶⁵ according to


	(28)


	(29)
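For orientation, a generic delete-one jackknife estimate of the variance of rms(ΔF − ΔF_ref) is sketched below in Python. It follows the textbook jackknife recipe and is only assumed to correspond to eqn (28) and (29); the function names are hypothetical.

```python
import numpy as np


def rms_deviation(delta_f, ref):
    """Root-mean-square deviation of the per-simulation estimates from `ref`."""
    return float(np.sqrt(np.mean((np.asarray(delta_f, dtype=float) - ref) ** 2)))


def jackknife_variance(delta_f, ref):
    """Delete-one jackknife variance of the rms deviation over N_sim values."""
    delta_f = np.asarray(delta_f, dtype=float)
    n = delta_f.size
    # rms recomputed with each simulation left out in turn
    leave_one_out = np.array(
        [rms_deviation(np.delete(delta_f, i), ref) for i in range(n)]
    )
    spread = leave_one_out - leave_one_out.mean()
    return float((n - 1) / n * np.sum(spread ** 2))
```

Here `delta_f` would hold one ΔF estimate per independent simulation at a fixed simulation time, and `ref` the reference value ΔF_ref.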

4.3 Analytical potential-model simulations

We simulated an NVT ensemble of 1000 water molecules with the force field of ref. 66 in a cubic box of equal axis lengths l_x = l_y = l_z = 3.1 nm. An internal reference point was obtained by restricting the position of one molecule at the origin by a harmonic potential with a large force constant (κ = 250 kJ mol⁻¹). The center-of-mass of one other water molecule, referred to as “A”, was restricted at y = 1.3 nm. The intermolecular separation was around l_y/2, which ensured that the two molecules are further apart than the cutoff distance of 1.2 nm for all Coulomb and van der Waals interactions. The position of molecule A along the x direction was subjected to the periodic free-energy function


F(x)/(kJ mol⁻¹) = 5cos{6x2π/l_x},	(30)

which possesses six local energy minima, all separated by a barrier of 10 kJ mol⁻¹.

The system was simulated with the collective variable s = 2πx/l_x (which is the optimal choice), N_W = 6, and γ = 5, which ensures the asymptotic relationship V(x) = −(1 − γ⁻¹)F(x) = −(4/5)F(x):


V(x)/(kJ mol⁻¹) = −4cos{6x2π/l_x}.	(31)

The bias potential was applied to molecule A along the x direction, whereas no other direction or molecule was biased. Hence, all walkers shared the same energy minimum at x ≈ 0.8 nm (see Section 5.2), yet at slightly different positions to ensure that each follows a unique trajectory. The results presented below are averages over 32 independent but nominally identical simulations.
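The properties of the model potential quoted above (six minima, a 10 kJ mol⁻¹ barrier, and the asymptotic bias of eqn (31)) can be verified with a short Python sketch. The grid resolution and all function and variable names are illustrative choices.

```python
import numpy as np

L_X = 3.1    # box length along x in nm
GAMMA = 5.0  # bias factor


def free_energy(x):
    """F(x) of eqn (30) in kJ/mol."""
    return 5.0 * np.cos(6.0 * x * 2.0 * np.pi / L_X)


def asymptotic_bias(x, gamma=GAMMA):
    """Well-tempered limit V(x) = -(1 - 1/gamma) F(x), cf. eqn (10)."""
    return -(1.0 - 1.0 / gamma) * free_energy(x)


x = np.linspace(0.0, L_X, 6000, endpoint=False)
f = free_energy(x)
# a grid point is a local minimum if it lies below both periodic neighbours
is_min = (f < np.roll(f, 1)) & (f < np.roll(f, -1))
n_minima = int(np.count_nonzero(is_min))
barrier = float(f.max() - f.min())
```

The minima sit at odd multiples of l_x/12; the second one, at x = 0.775 nm, is consistent with the shared starting minimum at x ≈ 0.8 nm.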

5. Results and discussion

5.1 Alanine dipeptide conformations

Owing to its extensive use in developments of metadynamics and other ES methods,^{12,14,17,19,26,27,36,37,54} the alanine dipeptide molecule in vacuum was selected for benchmarking the new time-integration-based reweighting options, which are contrasted with the already established state-of-the-art M^TP_Σ (ref. 36) and M^VES_Σ (ref. 17 and 37) schemes. The convergence offered by each protocol was evaluated for (i) increasing simulation time, when (ii) using either of the torsion angles ϕ or ψ as CV, and (iii) variable-sized walker ensembles with N_W = {4, 8, 16, 64}.

Here, we assessed the convergence of the free-energy difference between two metastable molecular conformations that form a seven-atom membered cyclic structure, labeled “C7”, and stabilized by an internal hydrogen bond. These two conformations are shown in Fig. 2, along with the F(s) contours plotted against ϕ and ψ. The three methyl groups of the molecule may assume either equatorial (C7eq) or axial (C7ax) orientations relative to the ring, respectively. We assessed the convergence performance of each reweighting procedure via the free-energy difference


	(32)

between two torsion-angle domains, centered at Ω(C7eq⁰) = {−81°, 71°} and Ω(C7ax⁰) = {74°, −67°}, respectively. The probability of each domain is given by


	(33)

where the indicator function is unity throughout the domain and zero otherwise.
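The domain-population estimate can be illustrated with the Python sketch below. Several ingredients are assumptions: the sign convention ΔF = −β⁻¹ln[P(C7ax)/P(C7eq)], the square shape of each domain, and its hypothetical half-width of 20°. The per-frame weights would be the reweighting factors exp{β[V − c]}.

```python
import numpy as np

KB = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1


def in_domain(phi, psi, center, half_width):
    """Indicator: 1 inside a square window around `center` (degrees), else 0."""
    # periodic angular differences mapped to [-180, 180)
    dphi = (np.asarray(phi, dtype=float) - center[0] + 180.0) % 360.0 - 180.0
    dpsi = (np.asarray(psi, dtype=float) - center[1] + 180.0) % 360.0 - 180.0
    inside = (np.abs(dphi) <= half_width) & (np.abs(dpsi) <= half_width)
    return inside.astype(float)


def delta_f(phi, psi, weights, half_width=20.0, temperature=310.15):
    """F(C7ax) - F(C7eq) in kJ/mol from weighted domain populations."""
    w = np.asarray(weights, dtype=float)
    p_eq = np.sum(w * in_domain(phi, psi, (-81.0, 71.0), half_width)) / np.sum(w)
    p_ax = np.sum(w * in_domain(phi, psi, (74.0, -67.0), half_width)) / np.sum(w)
    return float(-KB * temperature * np.log(p_ax / p_eq))
```

A domain that is never visited yields a zero population and hence an undefined ΔF, so the estimate is only meaningful once both conformations have been sampled.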

5.1.1 Role of walkers and choice of collective variable for convergence. For each collective variable s = ϕ and s = ψ, Fig. 3 plots the convergence function, rms(ΔF − ΔF_ref), for increasing MD simulation intervals and walker ensembles with 4 to 64 members. All reweighting schemes converge markedly more rapidly for the simulations with s = ϕ than those for s = ψ. Hence, s = ϕ is a “good” choice of CV because it manifests pronounced transition-state barriers between the energy minima (Fig. 2), which are readily compensated for by the bias potential V(ϕ, t), whereas the selection s = ψ leads to a “hysteresis” behavior that results in slow convergence.^14,19


	Fig. 3 Plots of rms(ΔF − ΔF_ref) [eqn (32)] against the simulation interval (log scale) for the four herein proposed reweighting protocols (M^t_Σ and M^t_w, along with their counterparts that integrate over the entire simulation period), along with those by Tiwary and Parrinello³⁶ (M^TP_Σ) and that of Schäfer and Settanni,³⁷ which is identical to the original VES reweighting^17,18 (M^VES_Σ). The VES/MD simulations employed ensembles of (a, b) 4, (c, d) 8, (e, f) 16, and (g, h) 64 walkers along with collective variables of s = ϕ (left panel) and s = ψ (right panel). Each dotted rectangle and horizontal red dotted line in (a)–(d) marks the respective regions of “near” and “sufficient” convergence to the correct reference energy value ΔF_ref = 6.80 kJ mol⁻¹. The relative performances of the various reweighting protocols in the near-convergence regimes are more transparent in the zoomed plots shown in Fig. 4. Note that the vertical plot ranges vary among the rows of graphs and that the convergence accelerates consistently for increasing walker ensembles. Each rms(ΔF − ΔF_ref) curve resulted from (a)–(d) 16, (e, f) 8, and (g, h) 6 independent simulations; Fig. S1 and S2 (ESI†) plot the data uncertainties.

These features are more transparent in the corresponding plots of Fig. 4, which are zoomed around the “near-convergence” domain (dotted rectangles in Fig. 3). Indeed, Fig. 4b and Table 1 reveal that for simulations with s = ψ and N_W = 4, only the data reweighted by M^t_Σ, M^TP_Σ and M^VES_Σ reach below the convergence threshold (horizontal red dotted lines in Fig. 3 and 4) within our longest evaluated simulation period, requiring simulation periods of 13.7 ns, 14.0 ns, and 19.4 ns, respectively. However, while these rms results over N_sim = 16 independent simulations meet the convergence criterion, only the M^t_Σ and M^TP_Σ reweighting schemes offer convergence for all simulations (upon omission of obvious outliers; see Table 1). Fig. S1 and S2 (ESI†) show the convergence curves with ±σ spreads among the N_sim simulations of each reweighting scheme.


	Fig. 4 Zoomed convergence plots of the near-convergence region shown by dotted rectangles in Fig. 3.

Table 1 Simulation time (in ns) to reach convergence^a

Alanine dipeptide
s − N_W^b	N_sim	M^t_Σ	M_Σ (Γ = full period)	M^t_w	M_w (Γ = full period)	M^TP_Σ	M^VES_Σ
ϕ − 4	16	5.78(10.7¹)	20.0(−)	6.93(14.5²)	−(−)	5.89(11.8¹)	3.78(11.5¹)
ϕ − 8	16	1.45(4.13)	1.45(9.36)	9.87(13.2²)	1.45(7.65)	1.10(7.98)	1.57(9.72)
ϕ − 16	8	1.47(1.67¹)	1.46(2.27¹)	9.59(12.0¹)	1.89(2.17¹)	1.97(2.31¹)	1.01(1.75¹)
ϕ − 64	6	0.22(0.43)	0.20(0.530)	6.30(7.95)	0.66(1.08)	0.71(1.08)	0.81(1.03)
ψ − 4	16	13.7(18.6³)	−(−)	−(−)	−(−)	14.0(19.2³)	19.4(−)
ψ − 8	16	17.0(17.6²)	17.1(18.6²)	−(−)	15.2(18.6²)	17.0(18.6²)	17.4(18.6²)
ψ − 16	8	9.8(10.0¹)	9.8(10.4¹)	−(−)	10.9(11.7²)	9.64(10.4²)	9.96(10.7²)
ψ − 64	6	0.32(2.66)	0.29(2.82)	−(−)	7.34(9.53¹)	0.45(9.21)	0.90(9.81)

Analytical potential
N_s^c	N_sim	M^t_Σ	M_Σ (Γ = full period)	M^TP_Σ	M^VES_Σ
6	32	0.150(0.237)	0.170(0.269)	0.240(0.560)	0.700(1.29)
48	32	0.210(0.285)	0.220(0.503)	0.270(0.578)	0.770(1.38)

a The VES/MD simulation time required for reaching convergence of either rms(ΔF − ΔF_ref) (alanine dipeptide; Fig. 3 and 4) or D_KL (analytical potential model; Fig. 6b,c) when employing the given reweighting protocol. The values within parentheses represent the corresponding values required for all of the N_sim independent simulations to converge; each superscript marks the number of outlier data-curves that were omitted to give the as-stated result.
b Collective variable and size of walker ensemble.
c Number of bins employed to evaluate eqn (35) with N_s = 6 (Fig. 6b) or N_s = 48 (Fig. 6c).

The overall most rapid rms(ΔF − ΔF_ref) convergence resulted when employing (moderately) large walker ensembles (Fig. 3c–h and 4c–h). This property is most evident for the simulations with the “unfavorable” CV ψ, where all walker ensembles with N_W ≥ 8 secured proper convergence within 20 ns for all reweighting schemes but M^t_w. Moreover, regardless of the precise choice of CV, Fig. 3 and 4 reveal that cooperative walkers (i.e., M^t_Σ and its counterpart with integration over the entire simulation period) are markedly more favorable than independent ones (i.e., M^t_w and its corresponding counterpart). Their advantage grows progressively for increasing N_W, which reflects the enhanced statistics provided by sets of cooperative walkers for improved c(t) estimates.

Recall that Fig. 3 and 4 employ a logarithmic time scale and that the differences in convergence merits are substantial between the favorable {M^t_Σ, } and worse {M^t_w, } pairs of reweighting protocols. For the herein most favorable {s = ϕ, N_W = 64} simulation scenario, the reweighting method with cooperative walkers () required to attain “sufficient” convergence of rms(ΔF − ΔF_ref), whereas its counterpart with independent walkers demanded 670 ps (Table 1). For the N_W = 64 walker ensemble with the less favorable CV s = ψ, the scheme required 290 ps for convergence, whereas necessitated 7.34 ns (Fig. 4h). The differences in convergence properties among the M^t_Σ and M^t_w schemes are even larger (Table 1): M^t_Σ offers very similar convergence as for N_W = 64 regardless of the choice of s. In contrast, the M^t_w counterpart does not converge within 20 ns for the “difficult” s = ψ case, irrespective of the walker-ensemble size, while it takes 6–29 times longer to converge (relative to M^t_Σ) for all s = ϕ scenarios with N_W ≥ 8. Besides the expected finding that larger walker ensembles accelerate the convergence relative to smaller ones, we conclude that cooperative walkers are preferred to independent ones.
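The quoted 6–29-fold slowdown can be checked against the s = ϕ rows of Table 1. The short Python sketch below does only that arithmetic. It assumes that the first and third data columns of Table 1 hold the M^t_Σ and M^t_w convergence times, respectively; this column assignment is inferred from the values cited in the text and is not stated by the table rows themselves.

```python
# Convergence times in ns for s = phi, copied from Table 1 (mean-curve values).
# Assumption: first data column = cooperative walkers with limit t (M^t_Sigma),
# third data column = independent walkers with limit t (M^t_w).
t_cooperative = {8: 1.45, 16: 1.47, 64: 0.22}
t_independent = {8: 9.87, 16: 9.59, 64: 6.30}

# Ratio of independent-walker to cooperative-walker convergence time.
ratios = {n_w: t_independent[n_w] / t_cooperative[n_w] for n_w in t_cooperative}

for n_w, ratio in ratios.items():
    print(f"N_W = {n_w:2d}: independent/cooperative = {ratio:.1f}")
```

Under that column assignment, all three ratios fall inside the 6–29 range given in the text.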

5.1.2 Role of time-integration limit for convergence. Because the M^TP_Σ and M^VES_Σ reweighting schemes utilize CV integration, we here only contrast the novel {M^t_Σ, , M^t_w, } protocols generalized from the ITRE procedure. For the smallest walker ensembles with N_W = 4 of Fig. 3a,b and 4a,b, significant differences are observed for the rms(ΔF − ΔF_ref) results between the protocols employing Γ = t and those integrating over the entire simulation period, for both cooperative and independent walker ensembles, where the integration limit t is favorable throughout. Here, the M^t_Σ reweighting scheme outperforms any other time-integration-based reweighting option.

For larger ensembles with at least 8 independent walkers, the results of Fig. 3c–h and 4c–h reveal that the protocol with integration limit offers a substantially faster reweighting convergence than its M^t_w sister scheme. Nonetheless, , and (in particular) M^t_w, remain overall inferior to their cooperative-walker based and M^t_Σ counterparts (Section 5.1.1). The latter procedures exhibit nearly equal reweighting performances both far from (Fig. 3) and near/at (Fig. 4) convergence, regardless of the precise simulation period and the number of walkers. We attribute the property of a largely immaterial choice of integration limit Γ = t (M^t_Σ) or for the ensembles of cooperative walkers to their accompanying enhanced sampling of the CV domain, thereby also naturally reducing the differences between averages over time or over CVs.

When all results of Fig. 3 and 4 are taken together for the {M^t_Σ, , M^t_w, } reweighting procedures evaluated for different CVs, walker-ensemble types and sizes, the overall best and near-equal performances are observed for the cooperative-walkers based M^t_Σ and protocols. Yet we recommend using the M^t_Σ scheme due to its markedly faster convergence also for very small walker ensembles (Fig. 3a,b and 4a,b), along with much more rapid reweighting calculations (Table S1, ESI†).

5.1.3 Relative reweighting merits. Here, we focus on contrasting the two best time-integration protocols, M^t_Σ and , with the s-integration based M^TP_Σ and M^VES_Σ schemes. Relative to the M^TP_Σ protocol introduced by Tiwary and Parrinello³⁶, Schäfer and Settanni³⁷ highlighted primarily two advantages with their balanced-exponential (BE) reweighting method: (i) a faster convergence for “short” simulation periods, and (ii) lower reweighted-observable uncertainties/variabilities. At least within the scope of VES, for which the BE and VES reweighting procedures become identical (M^VES_Σ), the second claim is generally borne out neither by our assessments for the alanine dipeptide (vide infra) nor by the case examined in Section 5.2. The perhaps more important claim (i) of better reweighting convergence properties for short MD simulation periods, however, appears to depend significantly on the particular choice of CV(s) and walker-ensemble size: for a relatively large number of walkers, N_W = {16, 64}, the M^VES_Σ protocol indeed offers more rapid convergence regardless of s = {ϕ, ψ} (Fig. 3). Yet, these improvements are typically only pronounced in regimes too far from a reasonable convergence demand. Throughout both regimes of “near” and “sufficient” convergence, the VES/BE reweighting consistently only outperformed all other methods for the cases of s = ϕ with N_W = {4, 16}; see Fig. 3a,e and 4a,e.

A lack of reliability appears to be the main deficiency of the original VES reweighting (Fig. 3–5, Fig. S1, and S2, ESI†): for the largest-walker simulations close to convergence (Fig. 4g,h), the highly oscillatory rms(ΔF − ΔF_ref) curve of M^VES_Σ renders it inferior relative to its primary M^t_Σ, , and M^TP_Σ competitors. The required simulation periods to attain convergence among the protocols for {s = ϕ, N_W = 64 } increase according to (Table 1), while the case of s = ψ only differs in that . The 8-walker ensemble evaluations for s = ϕ shown in Fig. 4c reveal a similar trend, except that now the M^TP_Σ protocol performs overall best, both to reach convergence and within the “near convergence” regime (<1 ns). Hence, as for the VES/BE reweighting, the Tiwary and Parrinello scheme manifests an uneven performance among the simulations in Fig. 3 and 4, in contrast to the two best time-integration-based protocols (i.e., M^t_Σ and ), whose convergence improves monotonically with increasing walker-ensemble size (Table 1). The uneven performance of both s-integration-based reweighting protocols, which is most pronounced for VES/BE, presumably originates from the herein strict evaluations that involved reweighting based on the entire simulated time domain; this is examined further in Section 5.2.

Concerning the simulation periods for reaching convergence of rms(ΔF − ΔF_ref) for the altogether eight {s, N_W} combinations (Table 1), the TP and VES/BE protocols offer the most rapid convergence in two cases each, whereas the M^t_Σ/ counterparts accomplish that in three cases; recall that the and M^t_Σ schemes reveal essentially equal values throughout, except for the smallest walker ensembles of each s = {ϕ, ψ} angle (Section 5.1.1). Moreover, whenever the M^t_Σ/ methods do not offer the most rapid convergence, their performances remain close to that of the best method. Notably, for the much less forgiving criterion that all N_sim simulations of each method must converge, the M^t_Σ procedure performs best throughout the eight evaluated {s, N_W} scenarios (Table 1); yet, the difference to the second-best reweighting scheme is often marginal.
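The tallies in the preceding paragraph can be reproduced from the alanine dipeptide rows of Table 1. The following Python sketch is a bookkeeping aid under stated assumptions: the column labels are hypothetical names (with "Mfull" standing for the variants that integrate over the entire simulation period), the column order is inferred from the values cited in the text, a "−" entry is read as "no convergence within the simulated period", and the outlier superscripts are dropped.

```python
import math
from collections import Counter

NC = math.inf  # "−" in Table 1, read here as "not converged" (assumption)

# Hypothetical column labels; the column order is inferred from values cited in the text.
PROTOCOLS = ("Mt_Sigma", "Mfull_Sigma", "Mt_w", "Mfull_w", "MTP_Sigma", "MVES_Sigma")

# (CV, N_W) -> one (mean-curve time, all-simulations time) pair in ns per protocol.
TABLE_1 = {
    ("phi", 4):  [(5.78, 10.7), (20.0, NC), (6.93, 14.5), (NC, NC), (5.89, 11.8), (3.78, 11.5)],
    ("phi", 8):  [(1.45, 4.13), (1.45, 9.36), (9.87, 13.2), (1.45, 7.65), (1.10, 7.98), (1.57, 9.72)],
    ("phi", 16): [(1.47, 1.67), (1.46, 2.27), (9.59, 12.0), (1.89, 2.17), (1.97, 2.31), (1.01, 1.75)],
    ("phi", 64): [(0.22, 0.43), (0.20, 0.530), (6.30, 7.95), (0.66, 1.08), (0.71, 1.08), (0.81, 1.03)],
    ("psi", 4):  [(13.7, 18.6), (NC, NC), (NC, NC), (NC, NC), (14.0, 19.2), (19.4, NC)],
    ("psi", 8):  [(17.0, 17.6), (17.1, 18.6), (NC, NC), (15.2, 18.6), (17.0, 18.6), (17.4, 18.6)],
    ("psi", 16): [(9.8, 10.0), (9.8, 10.4), (NC, NC), (10.9, 11.7), (9.64, 10.4), (9.96, 10.7)],
    ("psi", 64): [(0.32, 2.66), (0.29, 2.82), (NC, NC), (7.34, 9.53), (0.45, 9.21), (0.90, 9.81)],
}


def fastest(column):
    """Count how often each protocol has the shortest time in the given column.

    column = 0 selects the mean-curve times, column = 1 the times required
    for all N_sim simulations to converge.
    """
    wins = Counter()
    for pairs in TABLE_1.values():
        values = [pair[column] for pair in pairs]
        wins[PROTOCOLS[values.index(min(values))]] += 1
    return wins


mean_wins = fastest(0)
worst_case_wins = fastest(1)
print("mean-curve criterion:     ", dict(mean_wins))
print("all-simulations criterion:", dict(worst_case_wins))
```

With these assumptions, the two CV-integration schemes each win two of the eight cases on the mean-curve criterion, the cooperative time-integration pair wins three, and the first column wins every case on the all-simulations criterion.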

The high precision and reliability of the M^t_Σ protocol are gratifying when considering the linear scaling of the total simulation time against N_sim, thereby in practice requiring a reasonably small number of independent simulations to accomplish an accurate average/rms value of 〈O(R)〉. Hence, it is desirable that the reweighting method yields the lowest possible spread (variance) around the a priori unknown average/rms result, such that one sole simulation and its subsequent observable-reweighting may be expected to approximate well the fully converged value obtained from a (very) large number of N_sim independent simulations.

Fig. 5 plots the variance, σ²[rms(ΔF − ΔF_ref)] calculated from eqn (28), observed from each reweighting protocol of Fig. 3 for increasing among the N_sim simulations, employing N_sim = {16, 16, 8, 6} for the respective walker ensembles with N_W = {4, 8, 16, 64}. The M^t_Σ, and M^TP_Σ reweighting schemes offer the overall lowest variances, all of which are similar. As for the higher convergence rate of the M^t_Σ method relative to for both s = {ϕ, ψ} with N_W = 4 (Fig. 3a,b and 4a,b), it also features lower data spreads. The VES/BE protocol manifests irregular variances, which typically remain larger than those of the three best reweighting schemes for the evaluated {s, N_W} cases (Fig. 5). Also in line with the observations for the rms(ΔF − ΔF_ref) convergence in Section 5.1.1, the variances of the two M^t_w and methods with independent walkers are inferior relative to their cooperative-walker counterparts. In particular, surprisingly high variances are observed at long simulation periods for the scheme (Fig. 5), for which we have no satisfactory explanation.


	Fig. 5 The variance, σ²[rms(ΔF − ΔF_ref)]; eqn (28), plotted against the simulation period of each evaluated reweighting protocol of Fig. 3. Note the different vertical scales in the (a and b) plots relative to those of (c, e, g), and (d, h, f).

5.2 Entropy assessments of an analytical free-energy model

The evaluations of the four M^t_Σ, , M^t_w, and protocols for the alanine dipeptide suggested that simulations with reasonably large ensembles (N_W > 4) of cooperative walkers are preferable for obtaining the most accurate results, with the M^t_Σ reweighting scheme offering the overall most favorable results (Fig. 3–5 and Table 1). Our second benchmarking scenario of liquid water with an artificial free-energy perturbation [eqn (30)] applied to one molecule employs an optimal CV, s = 2πx/l_x (Section 4.3), along with a modest walker ensemble of N_W = 6. Consequently, we focussed on comparing the cooperative-walker based M^t_Σ and protocols (which are expected to consistently gain merits relative to their M^t_w, and counterparts for increasing N_W) with the CV-integration associated M^TP_Σ (ref. 36) and M^VES_Σ (ref. 17 and 37) reweighting procedures.

The present problem-design with an a priori known F(x) function acting on one water molecule along x, implies that a converged VES/MD simulation should reproduce the periodic free-energy and bias-potential functions of eqn (30) and (31), respectively. Fig. 6a plots F(x) with its six local energy minima indicated, along with the corrected bias-potential functions estimated from the M^t_Σ protocol for the simulation periods of and 1.0 ns. Because all walkers were initially confined to the domain centered at x = 0.8 nm [i.e., ], this population remains, as expected, strongly favored for the very short simulation interval. In the limits of very short and long simulation periods , all examined reweighting protocols {M^t_Σ, , M^TP_Σ, M^VES_Σ } yield identical results to that shown for M^t_Σ with in Fig. 6a.


	Fig. 6 (a) Corrected bias potential, V^corr(x, t) = V(x, t) − c(t), obtained from the M^t_Σ protocol for short (green trace) and long (red trace) simulation intervals and , respectively. The dotted curve represents the applied free-energy function F(x) (eqn (30)), whose six minima are indicated beneath. (b) Convergence of the Kullback–Leibler divergence [D_KL; eqn (35)] with N_s = 6 for increasing (log scale) for the M^t_Σ and schemes proposed herein, as well as those of M^TP_Σ (ref. 36) and M^VES_Σ (ref. 37). All simulations employed N_W = 6 and the collective variable s = 2πx/l_x. (c) As in (b), but using a finer grid of N_s = 48 bins of the CV. Each curve is an average over 32 independent MD/VES simulations, whose variations (data spread) are shown in Fig. S3 (ESI†). The horizontal red dotted lines in (b and c) mark the threshold of “sufficient” convergence, while the region around “near” convergence (dotted rectangle) is zoomed in the inset graphs (using a linear scale). (d) Bias-correction function c(t) associated with each reweighting protocol evaluated in (b) and shown for one representative simulation (see Section 5.4).

The convergence of each reweighting protocol was monitored via the estimated relative entropy, which was assessed by calculating the Kullback–Leibler divergence (D_KL)^26,67 between the reweighted distribution and :


	(34)

Eqn (34) was in practice evaluated by a discretization into N_s bins according to


	(35)

A properly converged VES/MD simulation should reproduce the known reference distribution, which for the most straightforward choice of N_s = 6 becomes uniform over the six bins, meaning that all walkers distribute evenly among the six F(x) minima (Fig. 6a).
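As an illustration of the binned divergence, the Python sketch below evaluates a discrete Kullback–Leibler divergence for N_s = 6 bins. The function name and the direction of the divergence (reweighted distribution relative to the reference) are assumptions; that direction was chosen because it gives a finite value of ln 6 when all population sits in a single bin, whereas the opposite direction would diverge for empty bins.

```python
import math

import numpy as np


def kl_divergence(p, p_ref):
    """Discrete D_KL(p || p_ref) = sum_i p_i ln(p_i / p_ref_i).

    Both inputs are normalized first; empty bins of p contribute zero.
    """
    p = np.asarray(p, dtype=float)
    p_ref = np.asarray(p_ref, dtype=float)
    p = p / p.sum()
    p_ref = p_ref / p_ref.sum()
    filled = p > 0
    return float(np.sum(p[filled] * np.log(p[filled] / p_ref[filled])))


n_bins = 6
uniform_ref = np.full(n_bins, 1.0 / n_bins)              # evenly populated minima
one_basin = np.array([0, 0, 0, 0, 1, 0], dtype=float)    # all population in one bin

d_start = kl_divergence(one_basin, uniform_ref)    # compare with ln 6
d_end = kl_divergence(uniform_ref, uniform_ref)    # identical distributions
print(d_start, math.log(6), d_end)
```

The two limiting cases bracket the range that a reweighted distribution can traverse as it relaxes from a single occupied basin towards the uniform reference.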

Fig. 6b plots the average D_KL response obtained from 32 independent simulations for increasing . We define D_KL ≤ 0.12 as “sufficient” convergence, which is indicated by the dotted red line in Fig. 6b. As expected, D_KL evolves from its initial value ln{6} (i.e., with all walkers confined to ) towards D_KL = 0 for equally populated energy minima, as in Fig. 6a for . The D_KL evolution varies significantly among the reweighting protocols for increasing , with the value required to reach convergence increasing according to M^t_Σ < < M^TP_Σ < M^VES_Σ, and translating into 150 ps, 170 ps, 240 ps, and 700 ps, respectively (Table 1). This corresponds to ≈4.6 times faster convergence of the “best” (M^t_Σ) scheme relative to the “worst” (M^VES_Σ). The relative convergence order remains essentially strict throughout all regions from “far” to “near”, and “sufficient” convergence, except for M^VES_Σ (vide infra). Besides an overall slightly decelerated convergence of all methods, all findings concerning their relative merits also hold for the finer discretization with N_s = 48 shown in Fig. 6c. Moreover, for the more stringent (“worst-case”) convergence criterion that all 32 simulations from each method must reach convergence, Table 1 confirms a shorter period required for M^t_Σ than for any other reweighting scheme, thereby fully corroborating the inferences made from the alanine dipeptide evaluations.
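One way to turn a D_KL curve into a convergence time is sketched below in Python. The text fixes only the threshold D_KL ≤ 0.12; the rule used here (the earliest time from which the curve stays at or below the threshold) is an assumption, as are the function name and the synthetic exponential curves, which are stand-in data and not results from the paper.

```python
import numpy as np

THRESHOLD = 0.12  # "sufficient" convergence level stated in the text


def convergence_time(times, d_kl, threshold=THRESHOLD):
    """Earliest time from which d_kl stays at or below threshold; None if it never does."""
    below = np.asarray(d_kl) <= threshold
    if not below[-1]:
        return None
    above = np.flatnonzero(~below)                      # indices still above threshold
    first = 0 if above.size == 0 else int(above[-1]) + 1
    return float(times[first])


# Synthetic stand-in data (not from the paper): exponential decays starting at ln 6.
times = np.linspace(0.0, 2.0, 2001)                     # ns
taus = np.array([0.05, 0.06, 0.08])                     # ns, hypothetical decay constants
curves = np.log(6.0) * np.exp(-times[None, :] / taus[:, None])

t_mean = convergence_time(times, curves.mean(axis=0))   # averaged-curve criterion
t_each = [convergence_time(times, curve) for curve in curves]
t_all = max(t_each)    # worst case; valid here because every synthetic curve converges
print(t_mean, t_each, t_all)
```

The averaged-curve value and the worst-case value play the roles of the two numbers reported per entry in Table 1, under the assumed crossing rule.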

Hence, the results of Fig. 6b,c and Table 1 suggest that the M^t_Σ protocol introduced herein outperforms its counterpart, as well as both the Tiwary–Parrinello³⁶ and VES/BE^17,37 options. Notably, the performance of M^VES_Σ is remarkably poor for the present simple model system with a small walker ensemble, except for the domain far from any acceptable convergence threshold: in this -regime, the claim³⁷ of a more accurate reweighting than M^TP_Σ is indeed borne out (yet, see Section 5.1.3). Fig. S3 (ESI†) presents the spread of {D_KL} values observed among the N_sim = 32 simulations of each reweighting method of Fig. 6. The results are commensurate with those discussed for the alanine dipeptide (Fig. 5, and Fig. S1 and S2[ESI†]): the M^t_Σ, , and M^TP_Σ protocols reveal overall similar spreads, whereas that of VES/BE is typically wider.

Given that all our evaluations included the entire simulated trajectories in the reweighting, which may deteriorate the convergence of the methods based on CV integration (Sections 2.3.1 and 3.1), we also examined their “optimal” performance by locating t_min for each of M^TP_Σ and M^VES_Σ, whereupon the D_KL curves were re-evaluated, only retaining the simulated data beyond t_min for each N_s = {6, 48} scenario. Fig. S4 (ESI†) presents the results. Whereas essentially no improvement resulted for the M^TP_Σ method, a substantially enhanced performance is observed for M^VES_Σ, with the M^TP_Σ and M^VES_Σ methods now revealing the same convergence at for N_s = 6, and for N_s = 48. Although a significant time-span t < 100 ps of the simulated data was discarded, the M^TP_Σ and M^VES_Σ reweighting schemes remain inferior to M^t_Σ (Fig. S4 (ESI†) and Table 1). As expected, the data truncation gave no convergence improvements for any of the t-integration methods M^t_Σ and (not shown). These results underscore the benefits of reweighting by time integration: no further efforts of locating the optimal t_min value are required, thereby also eliminating any potential systematic errors introduced by the data truncation³⁸ (Section 3.1). Yet, once undertaken for the CV-integration-based schemes, no dramatic differences are expected in the reweighting accuracy between s/t-integration-based methods.
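The effect of discarding data before t_min can be sketched in Python as follows. The sketch assumes the usual reweighting factor exp{β[V(s, t) − c(t)]} for each sample, consistent with the corrected bias potential V(x, t) − c(t) of Fig. 6a, and a 2π-periodic CV; the function name, argument layout, and the toy trajectory are hypothetical and serve only to show how the truncation enters.

```python
import numpy as np


def reweighted_populations(times, s, v_bias, c_t, beta, n_bins, t_min=0.0):
    """Bin populations of a 2*pi-periodic CV using only samples with times >= t_min.

    Each retained sample carries the weight exp{beta [V(s, t) - c(t)]}
    (assumed standard form of the reweighting factor).
    """
    times, s, v_bias, c_t = (np.asarray(a, dtype=float) for a in (times, s, v_bias, c_t))
    keep = times >= t_min
    log_w = beta * (v_bias[keep] - c_t[keep])
    weights = np.exp(log_w - log_w.max())       # constant shift for numerical stability
    hist, _ = np.histogram(np.mod(s[keep], 2 * np.pi), bins=n_bins,
                           range=(0.0, 2 * np.pi), weights=weights)
    return hist / hist.sum()


# Toy trajectory (not from the paper): six early samples stuck in the first bin,
# followed by six samples that visit each of the six bins once; zero bias throughout.
n_bins = 6
times = np.arange(12.0)
s = np.concatenate([np.full(6, 0.1), (np.arange(6) + 0.5) * 2 * np.pi / n_bins])
zeros = np.zeros(12)

p_full = reweighted_populations(times, s, zeros, zeros, beta=1.0, n_bins=n_bins)
p_trunc = reweighted_populations(times, s, zeros, zeros, beta=1.0, n_bins=n_bins, t_min=6.0)
print(p_full, p_trunc)
```

In this toy case the truncated estimate is uniform, while the full-trajectory estimate retains the memory of the initial confinement; locating a suitable t_min is the extra step that the time-integration schemes avoid.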

5.3 Summary and further considerations

No single reweighting procedure is ever likely to outperform all others for MD simulations of any conceivable system and evaluation criterion. Yet the findings of Fig. 3–6 and Fig. S1–S4 (ESI†) altogether consolidate the M^t_Σ protocol as the method of choice for both large and small walker ensembles, regardless of (sub)optimal or “bad” choices of the CV: at the worst for a given modeled system, M^t_Σ is expected to deliver a comparable (or slightly inferior) reweighting accuracy compared with the globally best reweighting scheme. Worth underscoring is that VES/MD implementations employing time-integration-based estimates of the bias-correction function typically offer better accuracy and precision of the reweighted observable than the hitherto utilized standard VES reweighting^17–22 with c(t) = 0.

A (markedly) longer computational time for performing the reweighting constitutes a minor practical disadvantage of the novel time-integration reweighting protocols relative to those utilizing CV averaging.^17,36,37 This feature, inherited from the ITRE protocol (see Giberti et al.³⁸), is particularly pronounced for the schemes which estimate c(t) by integration over the entire MD-simulation time-span. Table S1 (ESI†) contrasts the scaling of the number of floating-point operations required for each reweighting procedure evaluated herein, along with concrete CPU clock-timings for the simulations of Fig. 3a. Notably, however, these deficiencies of ITRE-derived protocols are in practice immaterial because the MD simulation (even for one walker) is orders-of-magnitude more time consuming than the subsequent reweighting stage, thereby rendering the time spent for the latter a largely irrelevant priority for selecting a reweighting protocol.

5.4 Analytical c(t) expressions in the high-temperature limit

To gain further insight into the nature of the bias-correction functions, we consider a limiting high-temperature scenario of βF(s) ≪ 1 and βV(s, t) ≪ 1 with β = (k_BT)⁻¹. Then, approximate analytical expressions may be obtained for the bias-correction function associated with each herein introduced reweighting method, as well as that of Tiwary and Parrinello.³⁶

For simulations with independent walkers, a Taylor expansion of the exponential functions of eqn (21) to first order yields


	(36)

which applies to one independent walker. Hence, c^Γ_w(t) is obtained as the time-average of the bias-potential taken up to either Γ = t or the end of the entire simulation period. The same procedure applied to cooperative walkers [eqn (22)] yields a readily calculated average over the set of N_W correction functions {c^Γ_w(t)}:


	(37)
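The two averaging steps described above (a time-average per walker, then a mean over the N_W walkers) can be sketched in Python. Eqn (36) and (37) are not reproduced in this text, so the sketch rests on an assumption: the quantity being time-averaged is taken to be V(s_w(t′), t), the bias at time t evaluated along the trajectory of walker w, sampled on a uniform time grid. The array layout and the function name are likewise hypothetical.

```python
import numpy as np


def c_high_temperature(v, cooperative=True, full_period=False):
    """High-temperature estimate of c(t) as a time-average of the bias potential.

    v[w, k, j] is assumed to hold V(s_w(t_j), t_k) on a uniform time grid.
    full_period=False averages over j <= k (upper limit t);
    full_period=True averages over all j (entire simulation period).
    cooperative=True additionally averages over the walkers.
    """
    v = np.asarray(v, dtype=float)
    n_walkers, n_t, _ = v.shape
    if full_period:
        c_w = v.mean(axis=2)
    else:
        mask = np.tril(np.ones((n_t, n_t)))            # mask[k, j] = 1 for j <= k
        c_w = (v * mask).sum(axis=2) / mask.sum(axis=1)
    return c_w.mean(axis=0) if cooperative else c_w


# Toy input (not from the paper): the "bias" simply equals the sample index j.
n_walkers, n_t = 3, 5
v_toy = np.broadcast_to(np.arange(float(n_t)), (n_walkers, n_t, n_t)).copy()

c_running = c_high_temperature(v_toy, cooperative=True, full_period=False)
c_full = c_high_temperature(v_toy, cooperative=True, full_period=True)
c_per_walker = c_high_temperature(v_toy, cooperative=False)
print(c_running, c_full, c_per_walker.shape)
```

The cooperative estimate has one value per time point, whereas the independent-walker variant keeps one correction function per walker, mirroring the distinction drawn in the text.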

We next consider the Tiwary–Parrinello reweighting protocol,³⁶ for which application of the high-temperature approximation to eqn (15) gives the following result:


	(38)

When using well-tempered VES^17–19 with the bias potential of eqn (14), the integral in eqn (38) evaluates to α₀ = 0, whereupon the bias-correction function may be expressed


	(39)

in the high-temperature limit. Likewise, the BE reweighting³⁷ and its VES equivalent¹⁷ implies that c^VES_Σ(t) = 0 throughout (Section 2).

Albeit approximate, eqn (36)–(39) offer reasonably tractable analytical expressions of c(t), as estimated by integration over either time³⁸ or over CVs.^36,37 For instance, Bonomi et al.¹² derived an equality between the time-derivatives of c(t) and 〈V(s, t)〉_s, which follows trivially by applying either of eqn (36) or (39). We next consider the converged results of Fig. 6d, all of which provided . (Because the function depends on the precise simulation period [eqn (22)], we employed for the results plotted in Fig. 6d). Here, eqn (14), (31), and (39) with α₁₁(t) = − 4.00 kJ mol⁻¹ predict that . Likewise, eqn (36) and (37) yield . Hence, with a minimum of computational effort, the value of c(t) for t → ∞ is predicted with a relative error of 14% as compared with the accurate reference value of Fig. 6d.

6. Conclusions

We have generalized the recently proposed metadynamics ITRE reweighting protocol³⁸ to multiple-walker ensembles implemented within VES, moreover introducing and examining the usage of “independent” and “cooperative” walkers. For well-tempered VES simulations of two model cases, viz. the molecular conformations of N-acetyl-L-alanine methylamide, and a water molecule in the liquid phase subjected to a periodic free-energy function, we examined the relative merits of current state-of-the-art reweighting methods introduced in the VES or metadynamics contexts^17,36,37 against four new options: the latter resulted by combining either independent or cooperative walkers with the bias-correction function c(t) estimated by time integration up to either t or across the entire simulation period; see eqn (21) and (22).

The use of multiple-walker ES/MD simulations accelerates the convergence of the reweighted observables. For all but very small walker ensembles, the M^t_Σ and methods that utilize cooperative walkers are superior to those with independent walkers (M^t_w and ), with the performance-differences growing for increasing N_W. The precise upper time-integration limit of t (M^t_Σ) or is not critical for c(t) estimates for MD simulations with (moderately) large cooperative walker ensembles. For small walker ensembles (N_W < 8), on the other hand, the advantages of cooperative walkers are minor compared to independent ones. Here, the choice of time-integration limit becomes much more critical—strongly favoring the M^t_Σ/M^t_w protocols relative to their counterparts—while the precise selection of “good” or “bad” collective variable(s) crucially underpins the convergence of all reweighting protocols.

Although no single reweighting protocol is expected to significantly outperform all others for any conceivable simulation scenario, out of the herein contrasted reweighting methods, the M^t_Σ scheme with cooperative walkers and the bias-correction function determined by time-integration up to t appears to be the overall most dependable option: it offers a higher accuracy for small walker ensembles than its primary and otherwise equivalent competitor , along with much more rapid reweighting calculations. Notably, both multiple-walker M^t_Σ and protocols are readily implemented in other ES methods, such as metadynamics. Moreover, we demonstrated that reweighting of VES-derived observables by the M^t_Σ procedure may be accelerated further by exploiting an analytical solution of its bias-correction function, and that qualitative insight into c(t) may be gained by approximative analytical expressions in the “high-temperature” regime for all six reweighting protocols that were considered. Computer code for implementing the new reweighting procedures is available at https://www.su.se/profiles/baltzar-1.187342 or may be obtained from the authors on request.

The herein recommended M^t_Σ scheme provides a reweighting accuracy that is better than, or at worst comparable to, that of the currently best collective-variable-integration methods of Tiwary–Parrinello³⁶ (M^TP_Σ) and the “balanced exponential” (BE) of Schäfer and Settanni.³⁷ We demonstrated that when the BE reweighting is implemented within the VES framework, it becomes identical to the original VES implementation^17–20 (M^VES_Σ) with a constant bias-correction function. However, VES/BE reweighting often converged more slowly than the M^t_Σ, , and M^TP_Σ options, while typically giving a larger spread of reweighted observable-values between independent simulations. Although that deficiency may be alleviated by following the standard procedure of omitting the initial part of the simulated trajectory in the reweighting to improve accuracy, time-integration-based reweighting schemes offer decisive advantages by not requiring any such additional efforts/precautions along with their accompanying possible introduction of systematic errors. We conclude that enhanced reweighting of the VES/MD-derived observables is expected by embracing time-dependent c(t) ≠ 0 options, such as the M^TP_Σ reweighting³⁶ and its time-integration M^t_Σ, counterparts introduced herein.

Author contributions

BS – conceptualization, investigation, formal analysis and software; ME – funding acquisition and supervision; ME and BS – writing, original draft and revisions.

Conflicts of interest

There are no conflicts of interest to declare.

Acknowledgements

This work was supported by the Swedish Foundation for Strategic Research (funder ID 501100001729; project RMA15–0110), and in part by the Swedish Research Council (project VR 2022-03652). The computations were enabled by resources provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS) and the Swedish National Infrastructure for Computing (SNIC) at NSC, partially funded by the Swedish Research Council through grant agreements no. 2022-06725 and no. 2018-05973. We thank two anonymous reviewers for helpful suggestions that improved the manuscript.

References

G. M. Torrie and J. P. Valleau, Monte Carlo free energy estimates using non-Boltzmann sampling: Application to the sub-critical Lennard-Jones fluid, Chem. Phys. Lett., 1974, 28, 578–581 CrossRef CAS.
G. M. Torrie and J. P. Valleau, Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling, J. Comput. Phys., 1977, 23, 187–199 CrossRef.
U. H. E. Hansmann, Parallel tempering algorithm for conformational studies of biological molecules, Chem. Phys. Lett., 1997, 281, 140–150 CrossRef CAS.
Y. Sugita and Y. Okamoto, Replica-exchange molecular dynamics method for protein folding, Chem. Phys. Lett., 1999, 314, 141–151 CrossRef CAS.
S. Park and K. Schulten, Calculating potentials of mean force from steered molecular dynamics simulations, J. Chem. Phys., 2004, 120, 5946 CrossRef CAS PubMed.
K. M. Bal, Reweighted Jarzynski sampling: Acceleration of rare events and free energy calculation with a bias potential learned from nonequilibrium work, J. Chem. Theory Comput., 2021, 17, 6766–6774 CrossRef CAS PubMed.
A. Laio and M. Parrinello, Escaping free-energy minima, Proc. Natl. Acad. Sci. U. S. A., 2002, 99, 12562–12566 CrossRef CAS PubMed.
A. Laio, A. Rodriguez-Fortea, F. L. Gervasio, M. Ceccarelli and M. Parrinello, Assessing the accuracy of metadynamics, J. Phys. Chem. B, 2005, 109, 6714–6721 CrossRef CAS.
G. Bussi, F. L. Gervasio, A. Laio and M. Parrinello, Free-energy landscape for β hairpin folding from combined parallel tempering and metadynamics, J. Am. Chem. Soc., 2006, 128, 13435–13441 CrossRef CAS PubMed.
A. Laio and F. L. Gervasio, Metadynamics: a method to simulate rare events and reconstruct the free energy in biophysics, chemistry and material science, Rep. Prog. Phys., 2008, 71, 126601 CrossRef.
A. Barducci, G. Bussi and M. Parrinello, Well-tempered metadynamics: A smoothly converging and tunable free-energy method, Phys. Rev. Lett., 2008, 100, 020603 CrossRef PubMed.
M. Bonomi, A. Barducci and M. Parrinello, Reconstructing the equilibrium Boltzmann distribution from well-tempered metadynamics, J. Comp. Chem., 2009, 30, 1615–1621 CrossRef CAS.
M. Bonomi and M. Parrinello, Enhanced sampling in the well-tempered ensemble, Phys. Rev. Lett., 2010, 104, 190601 CrossRef CAS PubMed.
A. Barducci, M. Bonomi and M. Parrinello, Metadynamics, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2011, 1, 826–843 CAS.
J. F. Dama, M. Parrinello and G. A. Voth, Well-tempered metadynamics converges asymptotically, Phys. Rev. Lett., 2014, 112, 240602 CrossRef PubMed.
G. Bussi and A. Laio, Using metadynamics to explore complex free-energy landscapes, Nat. Rev. Phys., 2020, 2, 200–212 CrossRef.
O. Valsson and M. Parrinello, Variational approach to enhanced sampling and free energy calculations, Phys. Rev. Lett., 2014, 113, 090601 CrossRef CAS PubMed.
O. Valsson and M. Parrinello, Well-tempered variational approach to enhanced sampling, J. Chem. Theory Comput., 2015, 11, 1996–2002 CrossRef CAS PubMed.
O. Valsson, P. Tiwary and M. Parrinello, Enhancing important fluctuations: Rare events and metadynamics from a conceptual viewpoint, Annu. Rev. Phys. Chem., 2016, 67, 159–184 CrossRef CAS PubMed.
O. Valsson and M. Parrinello, Variationally enhanced sampling, in Handbook of Materials Modeling, ed. W. Andreoni and S. Yip, Springer, Cham, 2020, pp. 621–634 Search PubMed.
P. M. Piaggi, O. Valsson and M. Parrinello, A variational approach to nucleation simulation, Faraday Discuss., 2016, 195, 557–568 RSC.
P. Shaffer, O. Valsson and M. Parrinello, Enhanced, targeted sampling of high-dimensional free-energy landscapes using variationally enhanced sampling, with an application to chignolin, Proc. Natl. Acad. Sci. U. S. A., 2016, 113, 1150–1155 CrossRef CAS PubMed.
Y. I. Yang and M. Parrinello, Refining collective coordinates and improving free energy representation in variational enhanced sampling, J. Chem. Theory Comput., 2018, 14, 2889–2894 CrossRef CAS PubMed.
L. Mones, N. Bernstein and G. Csányi, Exploration, sampling, and reconstruction of free energy surfaces with Gaussian process regression, J. Chem. Theory Comput., 2016, 12, 5100–5110 CrossRef CAS.
L. Bonati, V. Rizzi and M. Parrinello, Data-driven collective variables for enhanced sampling, J. Phys. Chem. Lett., 2020, 11, 2998–3004 CrossRef CAS.
J. Rydzewski and O. Valsson, Multiscale reweighted stochastic embedding: Deep learning of collective variables for enhanced sampling, J. Phys. Chem. A, 2021, 125, 6286–6302 CrossRef CAS PubMed.
M. Invernizzi, P. M. Piaggi and M. Parrinello, Unified approach to enhanced sampling, Phys. Rev. X, 2020, 10, 041034 CAS.
Q. Liao, Chapter four-enhanced sampling and free energy calculations for protein simulations, Prog. Mol. Biol. Trans. Sci., 2020, 170, 177–213 CAS.
R. Demuynck, S. M. J. Rogge, L. Vanduyfhuys, J. Wieme, M. Waroquier and V. Van Speybroeck, Efficient construction of free energy profiles of breathing metal-organic frameworks using advanced molecular dynamics simulations, J. Chem. Theory C, 2017, 13, 5861–5873 CrossRef CAS.
B. Pampel and O. Valsson, Improving the efficiency of variationally enhanced sampling with wavelet-based bias potentials, J. Chem. Theory Comput., 2022, 18, 4127–4141 CrossRef CAS PubMed.
K. M. Bal and E. C. Neyts, Merging metadynamics into hyperdynamics: Accelerated molecular simulations reaching time scales from microseconds to seconds, J. Chem. Theory Comput., 2015, 11, 4545–4554 CrossRef CAS PubMed.
P. Kříž, Z. Šućur and V. Spiwok, Free-energy surface prediction by flying Gaussian method: Multisystem representation, J. Phys. Chem. B, 2017, 121, 10479–10483 CrossRef PubMed.
S. Kumar, D. Bouzida, R. H. Swendsen, P. A. Kollman and J. M. Rosenberg, The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method, J. Comput. Chem., 1992, 13, 1011–1021 CrossRef CAS.
B. Roux, The calculation of the potential of mean force using computer simulations, Comput. Phys. Comm., 1995, 91, 275–282 CrossRef CAS.
L. Donati and B. G. Keller, Girsanov reweighting for metadynamics simulations, J. Chem. Phys., 2018, 149, 072335 CrossRef PubMed.
P. Tiwary and M. Parrinello, A time-independent free energy estimator for metadynamics, J. Phys. Chem. B, 2015, 119, 736–742 CrossRef CAS PubMed.
T. M. Schäfer and G. Settanni, Data reweighting in metadynamics simulations, J. Chem. Theory Comput., 2020, 16, 2042–2052 CrossRef.
F. Giberti, B. Cheng, G. A. Tribello and M. Ceriotti, Iterative unbiasing of quasi-equilibrium sampling, J. Chem. Theory Comput., 2020, 16, 100–107 CrossRef CAS PubMed.
F. Pietrucci, Strategies for the exploration of free energy landscapes: Unity in diversity and challenges ahead, Rev. Phys., 2017, 2, 32–45 CrossRef.
S. K. Veesam, S. Ravipati and S. N. Punnathanam, Recent advances in thermodynamics and nucleation of gas hydrates using molecular modeling, Curr. Opin. Chem. Eng., 2019, 23, 14–20 CrossRef.
S. Fukuhara, K. M. Bal, E. C. Neyts and Y. Shibuta, Accelerated molecular dynamics simulation of large systems with parallel collective variable-driven hyperdynamics, Comp. Mat. Sci., 2020, 177, 109581 CrossRef CAS.
J. J. Varghese and S. H. Mushrif, Origins of complex solvent effects on chemical reactivity and computational tools to investigate them: a review, React. Chem. Eng., 2019, 4, 165 RSC.
S. Xu and E. A. Carter, Theoretical insights into heterogeneous (photo)electrochemical CO₂ reduction, Chem. Rev., 2019, 119, 6631–6669.
P. Liu and D. Mei, Identifying free energy landscapes of proton-transfer processes between Brønsted acid sites and water clusters inside the zeolite pores, J. Phys. Chem. C, 2020, 124, 22568–22576.
J. Coines, L. Raich and C. Rovira, Modeling catalytic reaction mechanisms in glycoside hydrolases, Curr. Opin. Chem. Biol., 2019, 53, 183–191.
P. Ibrahim and T. Clark, Metadynamics simulations of ligand binding to GPCRs, Curr. Opin. Struct. Biol., 2019, 55, 129–137.
A. V. Dongre, S. Das, A. Bellur, S. Kumar, A. Chandrashekarmath, T. Karmakar, P. Balaram, S. Balasubramanian and H. Balaram, Structural basis for the hyperthermostability of an archaeal enzyme induced by succinimide formation, Biophys. J., 2021, 120, 3732–3746.
Z. Xu, Y. Yang, Z. Wang, D. Mkhonto, C. Shang, Z.-P. Liu, Q. Cui and N. Sahai, Small molecule-mediated control of hydroxyapatite growth: Free energy calculations benchmarked to density functional theory, J. Comput. Chem., 2014, 35, 70–81.
Q. Wang, M. Wang, K. Wang, Y. Liu, H. Zhang, X. Lu and X. Zhang, Computer simulation of biomolecule-biomaterial interactions at surfaces and interfaces, Biomed. Mater., 2015, 10, 032001.
H. Heinz and H. Ramezani-Dakhel, Simulations of inorganic-bioorganic interfaces to discover new materials: insights, comparisons to experiment, challenges, and opportunities, Chem. Soc. Rev., 2016, 45, 412–448.
J. McCarty, O. Valsson, P. Tiwary and M. Parrinello, Variationally optimized free-energy flooding for rate calculation, Phys. Rev. Lett., 2015, 115, 070601.
P. Shaffer, O. Valsson and M. Parrinello, Hierarchical protein free energy landscapes from variationally enhanced sampling, J. Chem. Theory Comput., 2016, 12, 5751–5757.
P. Raiteri, A. Laio, F. L. Gervasio, C. Micheletti and M. Parrinello, Efficient reconstruction of complex free energy landscapes by multiple walkers metadynamics, J. Phys. Chem. B, 2006, 110, 3533–3539.
F. Giberti, G. A. Tribello and M. Ceriotti, Global free-energy landscapes as a smoothly joined collection of local maps, J. Chem. Theory Comput., 2021, 17, 3292–3308.
M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Clarendon Press, Oxford, 1987.
L. Molgedey and H. G. Schuster, Separation of a mixture of independent signals using time delayed correlations, Phys. Rev. Lett., 1994, 72, 3634–3637.
B. Stevensson and M. Edén, Metadynamics simulations of the pH-dependent adsorption of phosphoserine and citrate on disordered apatite surfaces: What interactions govern the molecular binding?, J. Phys. Chem. B, 2021, 125, 11987–12003.
R. Mathew, B. Stevensson, M. Pujari-Palmer, C. S. Wood, P. R. A. Chivers, C. D. Spicer, H. Autefage, M. M. Stevens, H. Engqvist and M. Edén, Nuclear magnetic resonance and metadynamics simulations reveal the atomistic binding of L-serine and O-phospho-L-serine at disordered calcium phosphate surfaces of biocements, Chem. Mater., 2022, 34, 8815–8830.
M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess and E. Lindahl, GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers, SoftwareX, 2015, 1–2, 19–25.
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen, A smooth particle mesh Ewald method, J. Chem. Phys., 1995, 103, 8577–8593.
G. Bussi, D. Donadio and M. Parrinello, Canonical sampling through velocity rescaling, J. Chem. Phys., 2007, 126, 014101.
G. A. Tribello, M. Bonomi, D. Branduardi, C. Camilloni and G. Bussi, PLUMED 2: New feathers for an old bird, Comput. Phys. Comm., 2014, 185, 604–613.
F. Bach and E. Moulines, Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n), in Advances in neural information processing systems, ed. C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K. Q. Weinberger, Curran Associates, Inc., Red Hook, NY, 2013, vol. 26, pp. 773–781.
P. Bjelkmar, P. Larsson, M. A. Cuendet, B. Hess and E. Lindahl, Implementation of the CHARMM force field in GROMACS: Analysis of protein stability effects from correction maps, virtual interaction sites, and water models, J. Chem. Theory Comput., 2010, 6, 459–466.
B. Efron and C. Stein, The jackknife estimate of variance, Ann. Statist., 1981, 9, 586–596.
W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey and M. L. Klein, Comparison of simple potential functions for simulating liquid water, J. Chem. Phys., 1983, 79, 926–935.
S. Kullback and R. A. Leibler, On information and sufficiency, Annals Math. Stat., 1951, 22, 79–86.

Footnote

† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2cp04009c
