Assessing numerical methods for molecular and particle simulation

We discuss the design of state-of-the-art numerical methods for molecular dynamics, focusing on the demands of soft matter simulation, where the purposes include sampling and dynamics calculations both in and out of equilibrium. We discuss the characteristics of diﬀerent algorithms, including their essential conservation properties, the convergence of averages, and the accuracy of numerical discretizations. Formulations of the equations of motion which are suited to both equilibrium and nonequilibrium simulation include Langevin dynamics, dissipative particle dynamics (DPD), and the more recently proposed ‘‘pairwise adaptive Langevin’’ (PAdL) method, which, like DPD but unlike Langevin dynamics, conserves momentum and better matches the relaxation rate of orientational degrees of freedom. PAdL is easy to code and suitable for a variety of problems in nonequilibrium soft matter modeling; our simulations of polymer melts indicate that this method can also provide dramatic improvements in computational eﬃciency. Moreover we show that PAdL gives excellent control of the relaxation rate to equilibrium. In the nonequilibrium setting, we further demonstrate that while PAdL allows the recovery of accurate shear viscosities at higher shear rates than are possible using the DPD method at identical timestep, it also outperforms Langevin dynamics in terms of stability and accuracy at higher shear rates.


Introduction
In this article, we provide a current and detailed perspective on the design of stochastic methods for simulating molecular and particle systems. Most of our discussion is general and equally applicable to simple and complex molecular fluids and polymer solutions, and to both equilibrium and nonequilibrium modeling. Modern software packages such as LAMMPS [73] offer a bewildering array of options for particle simulation, including choices regarding the model ensemble, equations of motion, discretization method, and parameter selection. In this article we contrast a number of the different schemes available, drawing on recent advances in the literature and focussing on the practical needs of the simulation community. All of the existing methods are convergent in the sense that, for suitable choice of parameters and in equilibrium state with probability density ρ β (q 1 , q 2 , . . . , q Nt , p 1 , p 2 , . . . , p Nt ) = Z −1 exp (−βH) where Z is a suitable normalizing constant (i.e., the partition function), β = (k B T ) −1 , with k B being the Boltzmann constant, T the system temperature, and m the assumed uniform mass of the N t particles. (Generalizing all results of this article to nonuniform particle masses would be straightforward but the uniform mass simplifies presentation of some formulas.) Temperature control is crucial for N V T simulation, but also plays a fundamental role in barostat methods as these typically fix both temperature and pressure.
A wide range of different approaches can be designed to sample from the canonical distribution; each such formulation has certain desirable properties, but also certain limitations. For example it is well known that the dynamical response of a system simulated using Langevin dynamics will strongly depend on the friction coefficient. However, since Langevin dynamics relies on a strong assumption of scale separation, it is possible that there is no choice of the friction that gives a satisfactory dynamical approximation [15,30,65,72] (see also discussions on time scales associated with Langevin dynamics [72]). In some cases a "gentle" form of Langevin dynamics is used where internal relaxation modes within the polymer are the dominant feature of interest, or else one uses Nosé-Hoover [40,69] or stochastic velocity rescaling [14,16,17], both of which are in some sense "gentle" alternatives to Langevin dynamics [58]. Other approaches include the pairwise Nosé-Hoover thermostat [2] and multiparticle collision dynamics [92]. However, we restrict our attention to methods that are derived directly as discretizations of stochastic differential equations. In case the goal is only to sample the equilibrium structures of the system under study (as often needed in protein modeling [66,70]), one may use the coefficient of friction as a free parameter and optimize the choice to enhance the rate of convergence to equilibrium or increase the effective sample size [82] associated with a certain family of observables. Whereas standard molecular dynamics methods such as Langevin dynamics and its overdamped limit (Brownian dynamics) are appropriate for modeling systems in or near thermodynamic equilibrium, these methods do not take into account the possibility of an underlying flow, and are thus, unless modified, inappropriate for situations where the underlying flow of the system cannot be predicted beforehand (e.g., when dealing with interfaces or nonuniform flow).
More generally, as we coarse-grain the system, the hydrodynamic transport properties become increasingly important, which calls for a formulation that preserves momentum and Galilean invariance. A method which addresses these issues is the dissipative particle dynamics (DPD) method of Hoogerbrugge and Koelman [39] which conserves momentum "exactly" (i.e., up to rounding errors) at each step. We find in our studies that the recently proposed pairwise adaptive Langevin (PAdL) method [61], which mimics DPD in the stationary setting, is preferable to DPD in nonequilibrium applications, e.g., for shear flows. The use of DPD or PAdL addresses a further problem with Langevin dynamics, namely the nonphysical screening of hydrodynamic interactions observed for Langevin dynamics [25].
Once the formulation is chosen, typically a set of stochastic differential equations (SDEs), it is necessary to replace it for computation with a discretized formulation, by applying a specific numerical method. Particularly for large, computationally demanding applications, the simulation must proceed at the largest possible timestep in order to provide meaningful answers to the questions of interest. Since we must work within a fixed computational budget, one should ultimately compare different numerical methods in terms of accuracy of observables for fixed computational work. On the other hand, if, as here, we wish to separate the numerical error (due to discrete approximation) from the statistical error (due to collecting finite numbers of samples) it is better to analyze these two types of error separately. This is especially true in a metastable system, i.e., one whose dynamics are limited by rare transitions, as for example protein folding or glassy system modeling [5,23], but generally speaking virtually any particle simulation will be subject to the timestep issue when pushed to deliver results on laboratory-relevant timescales. Thus we are led to seek methods that are optimized to deliver the desired properties in the most cost-effective manner, and the large timestep needed means that discretization bias becomes relevant. For molecular and mesoscale modeling, the most fundamental type of error incurred is the error in averages. We say that a method is accurate for long term averaging if the distribution generated by the numerical method converges in the limit of large time and small timestep to a stationary distribution which is close, in the sense of distributions, to the corresponding stationary distribution of the SDE [55,57]. The other major source of error in large scale simulations can be viewed as the error due to insufficient sampling of a stochastic quantity. Such error is always present but may be difficult to estimate, since in practice we do not necessarily know the underpinning distribution a priori nor the normalization constant needed to compute a robust average. To put this another way, if we remain for our entire simulation time exploring certain states of the molecule, it might be the consequence of a strong local confinement rather than an indication that these are always the most important group of states. Analyzing and comparing numerical methods for sampling thus requires balancing the issues of accuracy and sampling efficiency and it is crucial to realize that different methods may have very different levels of bias and rates of convergence which are highly dependent on the choice of timestep. In this article we explore these issues in detail for model systems, calculating first the bias and using this to select timestep to control the attainable accuracy, then (for the specific choice of timestep) assessing the rate of convergence to equilibrium average.
The key findings of this article are as follows. First, we compare the performance of Langevin dynamics, DPD, and PAdL for several benchmark calculations, in particular showing that the PAdL method provides higher accuracy (for given computational budget) than Langevin dynamics and DPD, in both equilibrium and nonequilibrium applications, with careful study of the trade-off between numerical bias and convergence rate in simulation. Second, we compare our numerical schemes with exact results regarding the relaxation behavior for a benchmark model, thus clarifying the performance of the methods in dynamics-oriented modeling applications. Third, we develop a careful procedure to quantify the sampling efficiency of various methods by comparing the effective sample size. A fourth advance in the current article is the demonstration that PAdL allows the recovery of accurate shear viscosity using larger shear rates than otherwise are possible using DPD (at identical timestep) while PAdL outperforms Langevin dynamics in terms of stability and accuracy at higher shear rates. Finally, we emphasize that this article provides specific details regarding implementation of all the various methods which are often lacking in the literature.
The rest of the article is organized as follows. In Section 2, we review a variety of numerical methods in polymer melts simulation. We describe, in Section 3, various physical quantities that are used to evaluate the simulation performance of each method. Section 4 presents numerical experiments in both equilibrium and nonequilibrium cases, comparing the performance of numerous popular numerical methods in practical examples. Our findings are summarized in Section 5.

Numerical methods
In this section, we describe various numerical methods used to simulate many-particle systems. We are interested both in the choice of formulation of the equations of motion and in the consequent secondary choice of discretization method. In the literature one observes that virtually all the popular methods are of a relatively simple design and require typically a single evaluation of the forces of interaction at each timestep, the computational cost of the force evaluation being normally the unit of computational effort.

Langevin thermostat
Following the seminal work of Grest and Kremer [35,49], Langevin dynamics has been widely used in simulating Lennard-Jones systems including polymer chains and their melts and can be written as where q i and p i are d-dimensional vectors and respectively represent positions and absolute momenta of bead i with d being the underlying dimensionality of the physical space (typically d = 3), m denotes the mass of a particle, the force on particle i, F i (q, t), could in principle be both positions and time dependent, however, in equilibrium, F i = −∇ q i U is the conservative force given in terms of a potential energy function U = U (q), dW i represents a dimensionless vector of d independent increments of Wiener processes with stochastic properties dW i (t) = 0 and dW i (t)dW j (t ′ ) = δ ij δ(t − t ′ )1dt, γ is the bead friction coefficient, which couples the beads weakly to a heat bath, u i = u(q i ) denotes a macroscopic streaming velocity at position q i , and σ represents the strength of the random forces, satisfying the following fluctuationdissipation relation: It should be emphasized here that the damping term in (1) depends on the peculiar velocity, which is the difference between the absolute velocity v i = p i /m and the streaming velocity field u. That is, the thermostat acts only on the peculiar velocity, which is essential in nonequilibrium, for instance, when modeling shear flow. The traditional formulation of Langevin dynamics (i.e., when u = 0) does not take into account the underlying streaming velocity and thus is expected to fail when there exists a nonzero underlying streaming velocity, however, this feature is not always clearly stated [81].

Stochastic velocity Verlet (SVV)
Due to its ease of implementation and its natural construction based on the popular Verlet method of molecular dynamics, the stochastic velocity Verlet (SVV) method [67] is one of the most popular methods for Langevin dynamics. The equations are: where h is the integration timestep, R n i and R n+1/2 i are vectors of uncorrelated Gaussian white noise with zero mean and unit variance, resampled at each step.

The BAOAB method
Numerical integration methods for Langevin dynamics have been studied systematically in terms of the long term sampling performance in recent works of Leimkuhler and Matthews [55,56]. Of note is the observation that a particular choice of splitting method, "BAOAB", based on a Trotter factorization of the stochastic vector field of the system into exactly solvable subsystems, is far superior to alternative methods in terms of sampling configurational quantities. The BAOAB method relies on separating the vector field of the system: in such a way that each piece can be solved "exactly". It is straightforward to solve the "A" and "B" pieces, respectively. Moreover, it is possible to derive the exact solution to the Ornstein-Uhlenbeck ("O") part, as where R i is a vector of independent and identically distributed (i.i.d.) standard normal random variables. The BAOAB method then can be defined as where exp (hL f ) denotes the phase space propagator associated with the corresponding vector field f . The integration steps of the BAOAB method, modified to include the streaming velocity, reads: It should be noted that only one force calculation is required at each step for BAOAB (i.e., the force computed at the end of each step will be reused at the beginning of the next step), the same as for alternative schemes, including the SVV method.

Dissipative particle dynamics (DPD) thermostat
Momentum conservation is an essential property required to correctly capture hydrodynamic interactions. However, the momentum is not conserved in Langevin dynamics due to the fact that the thermostat (i.e., the dissipative and random forces) is not pairwise. Analogous to Langevin dynamics, the dissipative particle dynamics (DPD) method [26,36,39] is a momentum-conserving thermostat which has been proposed to simulate complex hydrodynamic behavior. Unlike Langevin dynamics, the dissipative force in DPD is dependent of relative velocities and both the dissipative and random forces are pairwise, ensuring the momentum conservation. It should be noted that DPD has been used primarily as a mesoscale coarse-graining technique, where each DPD particle represents a blob of molecules, however, it has also been used in simulating polymer melts, in which case each DPD particle corresponds to one bead (e.g., see Refs. 19,29,71). The equations of motion of the DPD system can be written as where r ij = q i − q j is the distance between particles i and j with e ij = (q i − q j )/r ij being the unit vector in the associated direction, v ij = v i − v j is the relative velocity, F C ij (r ij ) denotes the conservative force derived from the corresponding pair potential energy U (r ij ), and dW ij = dW ji are independent increments of Wiener processes with mean zero and variance dt. In addition to the relation in (2), the two weight functions have to be related by ω D (r ij ) = ω R (r ij ) 2 in order for the system to sample the canonical ensemble.
We have observed [59] that standard DPD methods perform similarly in all the quantities that we have tested. Therefore, following Ref 61, Shardlow's S1 splitting method (i.e., the DPD-S1 scheme) [80] was used to represent the standard DPD formulation. As in Langevin dynamics, we can similarly define the DPD-S1 (OBAB) method as where one should note that the "O" part is further split into interacting pairs and then each pair is solved by using the method of Brünger, Brooks, and Karplus (BBK) [13] (the detailed integration steps of the DPD-S1 scheme can be found in Appendix A). Due to the fact that the dissipative force depends on relative velocities, DPD is Galileaninvariant, which makes it a profile-unbiased thermostat (PUT) [27,28] by construction and an ideal thermostat for nonequilibrium molecular dynamics (NEMD) [81]. The PUT allows the simulation itself to define the local streaming velocity (for more details, see Refs. 27,28,91) and thus there is no need to additionally subtract the underlying streaming velocity in nonequilibrium applications.

Pairwise adaptive Langevin (PAdL) thermostat
Inspired by recent developments in adaptive thermostats [45,60,79], the pairwise adaptive Langevin (PAdL) thermostat, which can be viewed as "adaptive DPD", has been proposed by Leimkuhler and Shang [61], see also more discussions therein. It has been observed that PAdL is able to correct for thermodynamic observables while mimicking the dynamical properties of DPD.
The equations of motion of the momentum-conserving PAdL thermostat are given by where ξ is an auxiliary dynamical friction variable, σ a constant amplitude as in Langevin dynamics, and G(q, p) denotes the accumulated deviation of the instantaneous temperature away from the target temperature where µ is a coupling parameter (an inverse surface mass density) which is referred to as the "thermal mass". It can be shown that, in equilibrium, the PAdL system preserves the momentum-constrained canonical ensemble with a modified densitỹ where γ is the friction coefficient as it satisfies the fluctuation-dissipation relation (2), and π = (π x , π y , π z ) is the linear momentum vector. Additional modifications should be included if the angular momentum is also conserved. According to the invariant distribution (9), the auxiliary variable ξ is Gaussian distributed with mean γ and variance (βµ) −1 . That is, the auxiliary variable will fluctuate around its mean value during simulation and moreover we can vary the value of the friction in order to recover the dynamics of DPD in a wide range of friction regimes. Therefore, the PAdL thermostat can be viewed as the standard DPD system with an adaptive friction coefficient (i.e., an adaptive DPD thermostat). Furthermore, we point that the PAdL thermostat inherits key properties of DPD (such as Galilean invariance and momentum conservation) required for consistent hydrodynamics. Note also that the PAdL thermostat would effectively reduce to the standard DPD formulation in the large thermal mass limit (i.e., µ → ∞).
The splitting method of PAdL proposed in Ref. 61 has been adopted in this article: where the "O" part is again further split into interacting pairs but each pair is solved exactly, and "D" represents the additional Nosé-Hoover part (the detailed integration steps of the PAdL method can also be found in Appendix A). It is worth mentioning that the computational costs per timestep of all the methods examined in this article are very similar. With the help of computation-saving devices such as Verlet neighbor lists [87], the computational effort of all methods under study scales with the number of particles, and both DPD and PAdL are only slightly more expensive than Langevin integrators.

Quantifiers for simulation performance
In this section, we briefly outline the quantities we will compute in simulation and use to compare the performance of different simulation schemes. These divide into observables for equilibrium and nonequilibrium sampling and the rate of convergence to equilibrium.

Summed autocorrelation count (SAC)
As in Markov chain Monte Carlo methods, we are interested in accurately and efficiently estimating the expected value of some physical observable of interest f (x), i.e., by averaging a time series of (typically correlated) N s samples for large N s , where x i denotes the phase space configuration at time t i . Denoting the variance of f with respect to the probability density function ρ by σ 2 ρ , which is independent of particular sampling methods, it can be shown [6] that the variance of the estimator f of the mean is where τ indicates a quantity that we refer to as the "summed autocorrelation count" (SAC) * , for correlated samples, typically estimated as τ ≡ τ (k max ) for some finite k max < N s by the "running" autocorrelation count estimator † where * Note that the SAC is often referred to as the "integrated autocorrelation time" [6,32,82] in the computational statistics literature. The problem with using such a term here relates to the fact that the time is a well-defined physical quantity whereas the formula quantifies a number of steps of an iterative procedure. Since we are mostly interested in how quickly the samples decorrelate in terms of the number of steps, we use the word "count" in order to avoid confusion with "physical time". † It should be noted that MATLAB's "autocorr" function, despite its name, does not calculate the autocorrelation function (14), however, 1 + τ (kmax) is just two times the first 1 + kmax "autocorr" lags in MATLAB.
is the unnormalized autocorrelation function (or auto co-variance) of f . If the samples are uncorrelated, τ = 1, in which case the variance of the estimator f would simply be σ 2 ρ /N s . Note that the variance of f (i.e., σ 2 ρ ) is a special case of the autocorrelation (14), i.e., σ 2 ρ = C(0). The running τ (k) starts at unity for k = 0, vanishes exactly for k = N s − 1, and its value and variance go through a maximum for intermediate k. In our simulations, we found that it was unclear how to properly determine the k max . Therefore, we instead suggest to approximate the SAC based on a weighted sum fitting [4] of the normalized autocorrelation function, whose argument is scaled to physical time, t, for convenience: where c ∈ [0, 1] is dimensionless, λ 1 , λ 2 > 0 are time constants, and w is a frequency. Integrating (15) from zero to infinity yields based on which the SAC can be approximated as where h is the stepsize associated with the numerical method. The choice of the functional form (15) is somewhat arbitrary, as it simply corresponds to the c-weighted superposition of solutions of a damped harmonic oscillator and a monoexponential relaxation process, but we observed good agreement with the actual autocorrelation function behavior in our simulation experiments (see the next section).
It should be noted that the SAC is closely related to the statistical error bar, i.e., the statistical error bar of the estimated mean is the standard deviation of the time series divided by the "effective sample size" defined as N s /τ . Thus, the SAC is an estimate of the number of iterations, on average, for an independent sample to be drawn, given a correlated chain. Therefore, the SAC directly measures the efficiency of the sampling-a lower value of SAC corresponds to a larger effective sample size, i.e., a more efficient sampling.

Configurational temperature
Since we are mostly interested in configurational sampling, in calculating the SAC we choose for f the configurational temperature [1,12,18,77,86], an observable function solely depending on positions whose average in the canonical ensemble, as the kinetic temperature, is precisely the target temperature: where ∇U and ∇ 2 U respectively denote the gradient and Laplacian of the potential energy U in the configurational phase space (further discussions in Ref. 59). The corresponding unnormalized correlation function and running SAC are denoted by C T (k) and τ T (k), respectively.

Polymer conformation
A multibead nonlinear spring model was employed to simulate a polymer melt as described in detail in Section 4.1. Denote the coordinate of the j-th bead in chain α as q (α) j . The endto-end vector of chain α is then given by R 1 . It should be emphasized here that in taking differences one has to respect the periodic boundary conditions (or to simply unfold all polymer contours before applying the above definitions if they are not already kept unfolded within the code).
Correlation functions, which characterize the relevant dynamical properties, are often studied in molecular dynamics. Of particular interest in polymer melts is the orientational autocorrelation function (OAF) of the end-to-end vector of polymer chains, which characterizes the relaxation of the polymer chains and is evaluated by choosing for f the end-to-end vector R

Shear viscosity
A common approach to generate a simple shear flow in nonequilibrium molecular dynamics is to apply the well-known Lees-Edwards boundary conditions (LEBC) [54], where, as in normal periodic boundary conditions (PBC), the primary cubic box remains centered at the origin, however, a uniform shear velocity profile is expected [27] where e x and e y respectively denote the unit vector in the x-and y-direction, κ is the transposed velocity gradient tensor, ⊗ represents the dyadic product of two vectors, andγ is the shear rate defined asγ = du x /dy, where u x is the macroscopic velocity in the x-direction. It is worth mentioning that while LEBC is typically applied only in the x-direction, the other directions (y and z) remain with PBC. It is nontrivial to implement LEBC in pairwise thermostats due to the position-dependence on both dissipative and random forces, this issue has been discussed in Ref. 61. The Irving-Kirkwood stress tensor [44] subject to LEBC can be written as where V is the volume of the simulation box, and u i is the streaming velocity (20) corresponding to the location of particle i. Only the conservative force should be included for F ij in Langevin dynamics since both the dissipative and random forces are averaged out, whereas all three components of the force should be accounted for pairwise thermostats. The generally non-Newtonian shear viscosity is extracted at finite rates as where σ xy denotes the shear stress, which is the off-diagonal xy-component of the symmetric stress tensor σ (21). While employing (22) the zero shear viscosity η 0 = limγ →0 η can be obtained by extrapolation, it is worth mentioning that η 0 can be alternatively calculated by integrating the stress-stress autocorrelation function (i.e., the Green-Kubo formulas [34,52]). However, it is well documented that those equilibrium approaches are subject to significant statistical error and thus not preferred in practice (see a detailed discussion on extracting transport coefficients by various approaches in Ref. 78).

Flow alignment angle
As a nontrivial application to demonstrate the performance of sampling schemes in the nonequilibrium context, we study the flow alignment of the polymer segments to the imposed flow as a function of the shear rate imposed using LEBC in manner described in Section 3.4, thus we calculate the flow alignment angle [51] as follows: where b x and b y respectively represent the x-and y-component of a normalized bond vector b = b x e x + b y e y + b z e z , and the average is taken over all bonds.

Numerical experiments
In this section, we conduct systematic numerical experiments to compare the performance of various methods introduced in Section 2 in polymer melts simulations.

Simulation details
A popular bead-spring model originally proposed by Kremer and Grest [35,49] is used in our simulations. The system is composed of M identical linear chains with N beads each in a cubic box with periodic boundary conditions [3]. The total number of beads is N t = M N in that case. Excluded volume interactions between all N t beads are included via a truncated Lennard-Jones potential: where r ij = q i − q j denotes the distance between two beads i and j, ǫ and r 0 are two constants that set the energy and length scales of the beads, respectively, in reduced Lennard-Jones units, and r c is the cutoff radius typically chosen as r c = 2 1/6 r 0 such that only the repulsive part of the potential is considered. This potential is also known as the Weeks-Chandler-Andersen potential [90]. Adjacent N beads along the same polymer interact, in addition, via the finitely extensible nonlinear elastic (FENE) potential:   18) and the preset temperature T , against stepsize by using various numerical methods introduced in Section 2 with a variety of friction coefficients in a standard setting of polymer melts as described in Section 4.1. Note that the relative error of DPD and PAdL appears to show little dependence on the friction coefficients, thus the result of γ = 0.5 only is shown. The system was simulated for 1000 reduced time units in each case but only the last 80% of the snapshots were collected to calculate the static quantity f T . Five different runs were averaged to further reduce the sampling errors. The stepsizes tested began at h = 0.005 and were increased incrementally by 10% until all methods became unstable. The horizontal solid black line indicates 1% relative error in sample mean (11) accuracy of configurational temperature, based on which the stepsizes for each method were chosen in equilibrium simulations, unless otherwise stated. Dashed black lines represent the second and fourth order convergence to the invariant distribution.
where k = 30ǫ/r 2 0 represents the spring coefficient and R max = 1.5r 0 determines the maximum length of a bond. This choice of parameters ensures that chains do not cross each other and it allows for a reasonable large integration timestep [49]. The system is thermostatted as described in Section 2.
Overall, the total potential energy of the system is defined as and the total potential of a bond, U LJ (r ij ) + U FENE (r ij ) gives rise to a mean bond length r ij ≈ 0.97 r 0 between adjacent beads i and j for ǫ = k B T = 1. A simple and popular choice of the weight function as in Ref. 36 was adopted in this article: A system (bead number density ρ d = 0.84) consisting of M = 30 identical linear chains with N = 20 beads (unit mass m) on each chain was simulated, where the following parameter set was used: k B = ǫ = r 0 = m = 1 (defining reduced units) at T = 1 in reduced units. The thermal mass in PAdL was initially chosen as µ = 10. It should be noted that the simulated system is usually referred to as "unentangled" since N ≤ 85 [51]. Pre-equilibrated initial  (14) by using various methods with fixed stepsizes shown in Table 1 in the low friction regime of γ = 0.5 with sample mean accuracy of ∼ 1% relative error in configurational temperature. Note that the horizontal axis is scaled to the reduced physical time, rather than the number of steps, for convenience.
configurations were obtained using an existing hybrid approach [50]: Initially, a mixture of phantom and excluded volume FENE chains were placed randomly into the simulation box at a density that exceeds the target density. A subsequent molecular dynamics algorithm with integration time step, force shape and force strength control was used to achieve a prescribed minimum distance (here 0.9) between all pairs of particles, while attempting to maintain local and global characteristics (such as the form factor) of the chain conformations. During this process the most inefficient chains were removed from the system, until the target density was reached. The initial momenta were independent and identically distributed (i.i.d.) normal random variables with mean zero and variance k B T . Unless otherwise stated, the system was simulated for 1000 reduced time units in each case but only the last 80% of the snapshots were collected to calculate various quantities described in the preceding section.

Equilibrium
As a verification of our equilibrium simulations, we first investigate the structural quantities obtained by using the SVV, BAOAB, DPD, and PAdL methods introduced in Section 2. Among all the methods, at a stepsize of h = 0.01, we obtain an (time and chain) averaged squared end-to-end distance of R 2 ee = 29.46 and an averaged squared radius of gyration [49] of R 2 g = 4.87 in the low friction regime of γ = 0.5, in perfect agreement with the results of Kremer and Grest [49]. Figure 1 compares the configurational temperature (19) control for a variety of methods with a range of friction coefficients. Note that SVV and BAOAB are two different splitting methods of Langevin dynamics. However, it can be clearly seen that, while maintaining a similar accuracy control of the configurational temperature, the BAOAB method allows the use of much larger (at least doubled) stepsizes compared to the SVV method, especially in Table 1: Comparisons of the sampling efficiency of various numerical methods quantified by the "effective sample size", N s /τ T , in the low friction regime of γ = 0.5 with similar sample mean f T accuracy of ∼ 1% relative error in configurational temperature. For all entries the total simulation time was N s h ≈ 800, where N s and h respectively represent the number of samples and the stepsize, and I denotes the integrated normalized autocorrelation function (16) of f T . The DPD method was computed by using Shardlow's splitting method (i.e., the DPD-S1 scheme) [80]. The simulation details of the table are the same as in Figure 1.  55,60). All the other methods tested show second order convergence according to the dashed order line. This again illustrates the importance of optimal design of numerical methods. It is also interesting to note that the relative error slightly rises as we increase the friction coefficient for the SVV method whereas, in the BAOAB method, the relative error decreases. While the relative error of both SVV and BAOAB methods depends on the friction coefficients in Langevin dynamics, the two pairwise thermostats (i.e., DPD and PAdL) appear to show little dependence on the friction coefficients. Although it seems that the DPD method is as accurate as SVV, the PAdL method is superior to both of them (even slightly better than BAOAB at low friction, i.e., γ = 0.5).
The accuracy and rates of convergence for each method (and for each observable) depend in a nontrivial way on stepsize and so we cannot expect to use the same stepsize for different numerical integrators. In performing comparisons, it is crucial to develop a careful procedure to quantify sampling convergence in relation to the accuracy desired. In our studies we use a fixed accuracy threshold to select the stepsize for each method (in equilibrium) and then, for this choice of stepsize, which will be different for each method, we use the configurational temperature SAC (see Section 3.1) to estimate the convergence rate. The detailed protocol is as follows: 1. Choose a suitable observable (for instance, the configurational temperature throughout this article);   (0), of the end-to-end vector of polymer chains in a melt between Langevin dynamics (left) and the PAdL method (right) with three different values of the friction coefficient. Note that PAdL and DPD exhibit indistinguishable behavior and thus only the result of the former was shown. The same stepsize of h = 0.01 was used for all methods as the focus here is to study the autocorrelation decays (in fact, reducing the stepsize leaves the autocorrelation decays indistinguishable). 100 different runs were averaged to reduce the sampling errors after the system was well equilibrated. The solid black line is the reference decay obtained by using Hamiltonian dynamics (i.e., switching off the thermostat, γ = 0), which used exactly the same initial conformations and velocities as alternative stochastic dynamics. normalized autocorrelation function. 5. Calculate the effective sample size, N s /τ T , for each method, which characterize the sampling efficiency.
In order to measure the sampling efficiency of the various methods (three different values of the thermal mass, µ, of PAdL are included), we plot the normalized configurational temperature autocorrelation function (CTAF) and its corresponding fitted curve based on the weighted sum (15) in Figure 2. The stepsize, corresponding to the ∼ 1% relative error case, for each method was determined with the help of the horizontal solid black line in Figure 1. As can be seen from the normalized autocorrelation functions in Figure 2, the samples of various methods decorrelate very differently. It is interesting to note that while both Langevin dynamics and DPD exhibit "monotonic-like" decays, PAdL (with a wide range of the thermal mass) oscillates, which we believe is due to the presence of the additional Nosé-Hoover control.
As can be seen from Table 1, the SAC (τ T ) of DPD is significantly larger than alternatives. Moreover, the SAC of BAOAB is about half of that of SVV, which is due to the fact that BAOAB can use about double the stepsize of SVV and thus only about half as many samples are needed to achieve a similar sample mean accuracy of the configurational temperature. Although the SAC of PAdL with µ = 10 is already smaller than either Langevin dynamics or DPD, the SAC of PAdL could be decreased further by lowering the value of the thermal mass µ. This should not come as a great surprise since µ determines how strongly the negative feedback loops (8) couple with the physical system. Therefore, the PAdL method has the ability of controlling the sampling efficiency while others not.  Figure 1 except the same stepsize of h = 0.01 was used for all methods. Note that the error bars were included but could be seen with relatively small shear rates only. Note also that we did short runs to highlight the deviations, while the errors decrease further and the viscosity reaches the Newtonian plateau at small shear rates upon increasing the length of the runs, as is well known from previous studies.
Since different stepsizes, which result in different sample counts (N s ) when the total simulation time is fixed, were used for different methods in order to achieve a similar sample mean accuracy, one should further compare the sampling efficiency by computing the "effective sample size" (N s /τ T ) instead of just the SAC (τ T ), which corresponds to cases where N s is identical in each method. One can see from Table 1 that the effective sample sizes of SVV and BAOAB are very close to each other since they are just two different splitting methods of the same stochastic (Langevin) dynamics. While the effective sample size of DPD is roughly half that of Langevin dynamics (either SVV or BAOAB), PAdL with µ = 10 is over 50% larger than the latter. Further reducing the value of the thermal mass in PAdL leads to an even more efficient sampling than alternatives: the effective sample size of PAdL with µ = 1 is more than four times that of DPD and Langevin dynamics; with µ = 0.1, this increases to a factor of 20. There is a limit to how much µ can be reduced, however, without introducing numerical instability and thus requiring a smaller timestep. Note that although the thermal mass µ in PAdL has a strong influence on the sampling efficiency, it appears that the sampling accuracy depends little on it (for instance, the long term behavior of PAdL with a wide range of the thermal mass µ is almost indistinguishable in Figure 1). Therefore, unless otherwise stated, µ = 0.1 will be used in subsequent comparisons.
The characterization of the relaxation of polymer chains in a melt corresponding to different dynamics was compared and plotted in Figure 3. In particular, we measured the orientational autocorrelation function (OAF) of the end-to-end vector of polymer chains defined in Section 3.3. It is believed that, for such a dense system, the long-time diffusion depends only on the interactions between beads and does not arise from the associated thermostat [49]. Therefore, the reference decay was calculated by using Hamiltonian dynamics (i.e., switching off the thermostat, γ = 0). Since the OAFs obtained by SVV and BAOAB are almost indistinguishable, the SVV method was used for Langevin dynamics. (Note that in what follows, unless otherwise stated, Langevin dynamics was calculated by using the benchmark SVV method).
It can be seen from Figure 3 (left) that the OAF of Langevin dynamics depends strongly on the friction coefficient. To be more precise, the OAF starts to (significantly) deviate from the reference decay as we increase the friction coefficient. Although a relative small friction (i.e., γ = 0.5 in this parameter setting) was suggested in Ref. 49 to not only minimize the effects of the Langevin thermostat but also to be large enough to stabilize the system in the long time limit, visible discrepancies were still observed in the case of γ = 0.5 in our numerical experiments. In stark contrast, the OAFs of the PAdL method in a wide range of the friction coefficients are almost indistinguishable from the reference decay as shown in Figure 3 (right). Very similar behavior was also observed in the DPD method, which implies that the projection of the interactions of both the dissipative and random forces (i.e., the thermostat) on to the line of centers (and thus the conservation of the momentum) may have played a role in preserving the correct relaxation behavior in the case of the pairwise thermostats. Since a relatively small friction has been widely used in the literature of polymer melts, in what follows we restrict our attention to comparing various methods in the low friction regime of γ = 0.5.

Nonequilibrium
The stepsize for each method was chosen according to certain sample mean accuracy threshold (e.g., ≈ 1% relative error in configurational temperature) when examining the sampling efficiency in the previous subsection. However, we are more interested in investigating the stability issues in nonequilibrium simulations. Thus in what follows we fix the stepsize of h = 0.01, which is close to the stability threshold, for all methods.
The shear viscosity was extracted by using the formula (22) outlined in Section 3.4 and plotted in Figure 4. All methods appear to show similar behavior except that the range of shear rates in DPD is greatly limited (that is, the largest usable shear rate in DPD isγ = 0.08,  compared withγ = 1.2 in both Langevin and PAdL). The error bars were indeed included, however they were relatively very small (particularly in the large shear rate regime) and thus not visible.
In Figure 5, we plot the evolution of the shear stress against the shear strain for PAdL and Langevin. While PAdL was perfectly stable for all runs performed, Langevin dynamics occasionally becomes unstable with a shear rate ofγ = 1, therefore, a smaller shear rate oḟ γ = 0.5 was used here instead in order to provide comparisons. As can be seen from the figure, for both methods, as the shear strain increases, the shear stress rises rapidly from zero to its maximum at shear strain of around 2.5 before relaxing to its steady state, consistent with the results in Figure 4. At startup the system tends to transform affinely and builds up stress before relaxation takes over. The faster the shear rate, the more likely the shear stress maximum occurs, as affine shear deformation. During this phase, particles are forced into close proximity resulting in strong Lennard-Jones repulsion and quickly producing enormous shear stress. However, the maximum is not expected, and was indeed not observed by us, at small rates. Overall, the behavior is in good agreement with previous studies [10,20,41,89]. We also investigated the evolution (including DPD whenever possible) with a wide range of shear rates and did not observe significant differences between the methods. Figure 6 plots the standard deviation in the computed shear viscosity over a wide range of shear rates by using various methods subject to Lees-Edwards boundary conditions. Once again, all methods behave very similarly. As we increase the shear rate, the standard deviation decreases, which is consistent with the observations in Figure 4. Note that the DPD method appears only in the smallest shear rate case. In the regimes of small and moderate shear rates, the standard deviations of all/both methods are very similar. However, in the high shear rate (γ = 1) case, Langevin dynamics has not only a visibly (≈ 60%) larger standard deviation but also a smaller range of stepsizes usable than the PAdL method.
We further investigate in Figure 7 the stepsize effects on the alignment angle of the polymer chains subject to LEBC at a relatively high shear rate ofγ = 1. It appears that while the absolute error of the alignment angle of PAdL remains below around 0.04 degrees, the cor- responding error of Langevin dynamics is around six times larger. This clearly demonstrates the superiority of PAdL over Langevin dynamics in nonequilibrium simulations especially at relatively high shear rates.

Conclusions
We have reviewed a variety of numerical methods (SVV and BAOAB of Langevin dynamics, DPD, and PAdL) that can be used to simulate polymeric systems. We have systematically compared those methods in terms of accuracy, efficiency, and stability both in equilibrium and nonequilibrium settings.
In terms of sampling accuracy in equilibrium simulations, we have observed that the BAOAB and PAdL methods outperform the SVV and DPD methods in a wide range of friction coefficients. We have also discovered that while perfectly matching the reference decay, the OAF, which characterizes the orientational relaxation of the polymer chains, of both pairwise (momentum-conserving) thermostats (DPD and PAdL) has little dependence on the friction coefficient in a wide range. On the other hand, the OAF of Langevin dynamics strongly depends on the friction coefficient, moreover, a clear discrepancy was observed even for the commonly used low friction of γ = 0.5. We have further developed a careful procedure to quantify the sampling efficiency of various methods. By comparing the effective sample size, we found that PAdL substantially outperforms alternatives, particularly with a relatively small thermal mass of µ = 0.1, for which remarkably about twenty times increase in the effective sample size was achieved in comparison to alternative approaches (see Table 1).
We are more focused on investigating the stability issues in nonequilibrium simulations. We have demonstrated that, with a stepsize of h = 0.01, the largest usable shear rate was aroundγ = 0.08 for DPD, compared withγ = 1.2 for both Langevin and PAdL, in a stan-dard setting of polymer melts as described in Section 4.1. Thus, in agreement with previous studies [29], DPD is not recommended for nonequilibrium simulations, when the mean flow dissipation rates begin to overwhelm the thermostat, limiting its use in practice for relatively large shear rates. Between Langevin dynamics and PAdL, we have found that they perform rather similarly with relatively low shear rates. For the investigation of even smaller rates, where the FENE polymer exhibits a Newtonian plateau in the shear viscosity, thermodynamically guided methods are more suitable [42,43]. Nevertheless, we have illustrated that while both methods share a similar relation of shear stress and shear strain, Langevin dynamics performed unreliably with a relatively high shear rate ofγ = 1 at a stepsize of at least h = 0.01-it not only produced a larger (by about 60%) standard deviation of the computed shear viscosity than PAdL, but also resulted in a significantly larger (about six times) absolute error of the flow alignment angle.
are the total conservative forces acting on particle i with configuration q.
Pairwise adaptive Langevin thermostat: PAdL  An extension of the PAdL algorithm for systems with beads of different masses and identical friction coefficients can be found in Ref. 61.