Hierarchical modeling of polystyrene melts: From soft blobs to atomistic resolution

We demonstrate that hierarchical backmapping strategies incorporating generic blob-based models can equilibrate melts of high-molecular-weight polymers, described with chemically specific, atomistic, models. The central idea behind these strategies, is first to represent polymers by chains of large soft blobs (spheres) and efficiently equilibrate the melt on mesoscopic scale. Then, the degrees of freedom of more detailed models are reinserted step by step. The procedure terminates when the atomistic description is reached. Reinsertions are feasible computationally because the fine-grained melt must be re-equilibrated only locally. To develop the method, we choose a polymer with sufficient complexity. We consider polystyrene (PS), characterized by stereochemistry and bulky side groups. Our backmapping strategy bridges mesoscopic and atomistic scales by incorporating a blob-based, a moderately CG, and a united-atom model of PS. We demonstrate that the generic blob-based model can be parameterized to reproduce the mesoscale properties of a specific polymer -- here PS. The moderately CG model captures stereochemistry. To perform backmapping we improve and adjust several fine-graining techniques. We prove equilibration of backmapped PS melts by comparing their structural and conformational properties with reference data from smaller systems, equilibrated with less efficient methods.


I. INTRODUCTION
A molecular structure based understanding of properties of technically or experimentally relevant polymer melts or dense solutions still poses significant scientific challenges. Computational studies of structure-process-property relationships in polymer liquids often require the consideration of atomistic details, while at the same time many fundamental scientific questions and important technological applications can be addressed only for high molecularweight (MW) polymers. These simultaneous requirements create major challenges at high polymer concentrations, e.g. concentrated solutions and melts. For typical polymerization degrees in industrial applications, the average spatial extension of random-walk-like chains lies [1] between 10 and 100 nm. In concentrated polymer liquids these fractal "threads" strongly interdigitate. Therefore, samples with dimensions only a few times larger than the average chain size contain hundreds or even thousands of polymers. This easily corresponds to many millions of atomistic degrees of freedom. The overlapping polymers are strongly entangled [2] and their relaxation times are prohibitively long. Their equilibration with atomistic molecular dynamics (MD) is unfeasible even when using massively parallel simulations. Therefore approaches, circumventing the slow relaxation path of physical dynamics are of significant interest. They are an indispensable starting part to simulation studies of atomistic long-chain polymer melts. Various strategies are available, including advanced connectivity altering Monte Carlo (MC) algorithms [3,4] and methods based on hierarchical multiscale modeling.
Hierarchical multiscale modeling involves multiple scales of description [5][6][7][8][9][10][11][12] and offers a powerful concept for tackling large system sizes and protracted equilibration times. The central idea is to benefit from scale separation: in polymers long-wavelength behavior often follows universal laws [13] that incorporate chemistry-specific information into a few parameters. Therefore, one can initially eliminate microscopic features hampering equilibration and prepare samples reproducing only mesoscopic conformational and structural properties. These samples serve as "blueprints" for reinserting the missing microscopic features without disturbing long-wavelength properties. The reinsertion is computationally feasible because only local re-equilibration is required. Two major strategies realize this idea: configuration assembly and hierarchical backmapping.
Configuration assembly [10,[14][15][16][17] algorithms construct the initial sample by putting together polymer chains under fixed average density. Implementations vary in details, but in all cases chains are generated to reproduce prescribed distributions of conformations. If the ensemble is generated as an ideal gas of chains, important longwavelength structural properties, such as the correlation hole effect [13], are missing. Therefore, one must reduce the large density fluctuations of the ideal gas of chains and introduce some interchain correlations through chain packing schemes [14,15]. Subsequently local conformations and liquid packing are recovered through a "push off" procedure, which gradually [14] reinserts the microscopic excluded volume into the ensemble. The strategy has been successful with generic [14,15] and chemistry specific [10,16,17] microscopic models. However, the postulative construction of starting configurations is a drawback. The assumption that polymers in melts have ideal randomwalk-like conformations is only approximately true [18,19]. Examples of deviations from ideal random-walk statistics are found in the power-law decay of bond-bond correlations [18] and statistics of polymer knots [20]. Predicting conformations for polymers with complex architecture (star-like or branched) and systems that are inhomogeneous or multicomponent is even less straightforward. Furthermore, the computational costs at the stage of chain packing increase with molecular weight.
Hierarchical backmapping is a more general approach, taking advantage of a general concept originating from renormalization group theory in critical phenomena. The material is described at several "nested" length scales, introducing a sequence of coarse-grained (CG) models. The sequence is terminated by the microscopic model. Because the sample is equilibrated at the largest length scale with standard simulations (handling the crudest CG model), long-wavelength properties are not postulated but follow from a rigorous statistical-mechanical framework. The molecular details resolved by the next model are reinserted through local sampling of configuration space. Repeating the procedure, one descends the hierarchy of models, step by step, until the microscopic description is reached. The efficiency increases significantly [17,[21][22][23][24][25] when the hierarchy includes soft models, i.e. models where the strength of non-bonded interactions is comparable to the thermal energy.
Hierarchies of CG models can be constructed taking advantage of universalities in polymer behavior, using the classical concept [13] of blobs. The decimation of the microscopic degrees of freedom is what resembles renormalization group theory: [13,26] N b monomers of a microscopically-resolved subchain are lumped into a single soft blob (sphere), so that polymers are represented by chains of blobs. Varying N b generates a family of models with different resolutions. In a polymer melt, a key quantity controlling [18,19,27,28] conformations and liquid structure on the scale of blobs is given by is the invariant degree of polymerization of subchains, R b is the root meansquare end-to-end distance of subchains, and ρ is the number density of microscopic monomers. When the entire chain is considered, instead of a subchain,N b reduces to the invariant degree of polymerization of the melt,N . Melts with the sameN form a single class of materials which can be described [29] (in renormalized space) by a single blob-based model. This common model can be used to inter-convert [29] chemically different materials within the sameN -class. Increasing N b has two important consequences: a) the conformations of subchains approach the Gaussian statistics of ideal random-walks and b) the correlation hole of blobs becomes more shallow, i.e. the intermolecular correlation function of blobs approaches unity. For both cases the small parameter controlling convergence is 1/ N b . Therefore, for large N b the interactions in blob-based models can be approximated by generic expressions [29][30][31][32] inspired by the ideal random-walk limit. Deviations from the asymptotic Gaussian behavior are taken implicitly into account by "renormalizing" the parameters in the generic expressions.
Hierarchical backmapping using blob-based models has been successful in equilibrating high-molecular-weight polymer melts [22,29] and blends [33] described with generic (bead-spring like) microscopic models. The equilibrated samples typically contained 10 3 chains, comprised of a few thousands of beads. The method is not limited to these examples -the computational costs of the procedure are not affected by chain length and are, roughly, proportional only to the volume of the system. [22] Yet, the question whether hierarchical strategies with blob-based models can equilibrate melts described with chemically-specific atomistic models remains, so far, open. Concerns regarding this point have been expressed in the literature [17].
Here we address this question and demonstrate that hierarchical strategies with generic blob-based models can equilibrate melts of actual polymers, described with chemically-specific atomistic models. To illustrate this we develop a hierarchical backmapping method which equilibrates large atomistically-resolved samples of polystyrene (PS) melts. PS is a basic commodity material and is well-suited for method development: it is well studied experimentally and theoretically, and it is a sufficiently complex polymer, where molecules have tacticity and bulky units (benzene rings). To descent the hierarchy of scales, from coarse to atomistic, our backmapping strategy incorporates a blob-based, [30,31] a moderately [34,35] CG, and a united-atom [34,36] (UA), model of PS. We demonstrate that a generic blob-based model [29][30][31] can describe the rather complex PS melt, accurately enough to allow backmapping. So far, chemistry-specific blob-based models have been developed for simpler polymers (e.g. polyethylene) using the machinery of Integral Equation theory. [37,38] To perform backmapping within the hierarchy of chemically-specific models, we improve and adjust several reinsertion techniques. [10,14,15] These methodological issues are also discussed in the paper.

II. HIERARCHY OF MODELS
Our hierarchical strategy for equilibrating large samples of high-molecular-weight polymer melts, described with atomistic detail, requires several models covering a broad range of length scales: from atomistic, to moderately CG up to mesoscopic one. In this section we present the models used for the hierarchical description of PS.

A. Atomistic model
On atomistic level we describe PS through a united atom (UA) model based on the TraPPE-UA force field [39]. Details can be found in previous studies [34,36], so we mention here briefly that in this UA model each PS monomer is represented by groups of eight UAs (see Fig. 1(a)). The full interaction potential of atomistic PS involves bonded and non-bonded terms. Angular and torsional potentials are introduced for the aliphatic backbone, while keeping the lengths of the bonds fixed. Improper dihedral potentials are used to keep the phenyl ring planar and to maintain the tetrahedral configuration around the sp3-hybridized carbon connecting the phenyl ring. Non-bonded interactions between UAs are captured by pairwise Lennard-Jones (LJ) potentials.
Despite the fact that we are using a UA model it is rather straightforward to obtain and simulate all-atom systems by adding hydrogens in the UA PS configurations. However, since the UA PS model has been extensively used and examined before, we choose to represent the atomistic PS chains in the UA description.

B. Moderately coarse-grained model
We employ a moderately (quantitative) CG model that has been derived [10,40] from the UA description presented in Sec II A. The model and the procedure used to obtain the CG force field have been elaborated elsewhere. [10,34] In summary, each PS monomer is mapped on two effective beads, "superatoms", A and B (see Fig. 1(b)). Bead A represents the CH 2 unit of a PS monomer and the half of each of the two neighboring CH groups along the chain backbone. Bead B stands for the phenyl ring. The sizes and masses of A and B superatoms are σ A = 4.1Å, σ B = 5.2 A, m A = 27 amu, m B = 77 amu, respectively. Such a model is capable of providing quantitative predictions about both static and dynamic properties of PS systems. In addition, for the moderately CG model we define: (a) a characteristic length scale as the averaged bead size, i.e., σ CG = (σ A + σ B )/2 = 4.65Å, and (b) a characteristic time scale, which is defined by τ = m A σ 2 CG /k B T (k B T is the thermal energy). This CG scheme can describe tacticity of PS chains without introducing side groups, by classifying the B beads into four subtypes. In total, the bonded part of the CG force field includes one bond, but four "alternative" angular and dihedral interaction potentials. The specific sequence of potentials chosen for angular and dihedral interactions along the backbone of a moderately CG chain depends on the tacticity of the atomistic PS molecule it represents.
Nonbonded interactions are described through pair LJ-like, n-m potentials U α,β (r), where r is the distance between two interacting superatoms. The powers and parameters used in U α,β (r) depend on the type α and β of the interacting superatoms, i.e. a bead can be A type or belong to one of the four B types. Non-bonded interactions are deactivated for those beads that belong to the same chain and are closer neighbours than (1,5). The details and the parameters of the force-field can be found elsewhere. [10,40] The accuracy of the moderately CG model in predicting quantitatively the properties of PS melts has been verified through extensive investigations. [10,40] For the purposes of our backmapping scheme, the fine structure of the CG model presents significant advantages because it facilitates the subsequent reinsertion of the degrees of freedom of the UA description.
C. Blob-based coarse-grained model

Model description
Earlier studies [22,[29][30][31]33] introduced blob-based models for polymer melts described on microscopic level through generic (bead-spring) models. [41] Mapping the chemistry-specific moderately CG model of PS on a blobbased description is similar to these cases.
A moderately CG polystyrene molecule with N CG superatoms is represented by a sequence of N BL = N CG /N b spheres (blobs). As illustrated in Fig. 1(c), each of them stands for a PS subchain with N b superatoms, so that the choice of N b determines the resolution of the drastically CG description. The radius σ i and the coordinates of the center of the i-th sphere, r i , match (respectively) the instantaneous gyration radius, R g,i , and position of the center-of-mass (COM), R cm,i , of the underlying PS subchain.
The connectivity of soft-sphere chains is described [30,31] using bond, V b , and angular, V θ , potentials, defined as: where d and θ are the distance and angle between consecutive spheres and bonds in a soft-sphere chain, respectively. The parameters b BL and k BL control the strength of the potentials; their energy scale is expressed in units of k B T (β = 1/k B T ). The fluctuations of the radius, σ, of a sphere are controlled by a "self-interaction" potential, V s (σ), similar to the Flory free energy [42]. This potential is defined as: The first term in Eq. (3) is of entropic origin and balances the swelling induced by the second term, which implicitly accounts for binary intramolecular interactions between the N b superatoms underlying the blob. The reader may notice that in previous studies [22,[29][30][31] the interactions controlling the fluctuations of σ were augmented by a term, proportional to 1/σ 6 (following Lhuillier [43]). In the first implementation of the soft-sphere model [30] the 1/σ 6 contribution enabled the description of a broad range of concentration regimes, e.g. dilute solutions. However, in melts the 1/σ 6 term is significantly smaller than the 1/σ 3 term [44] and thus can be omitted.
The effective non-bonded interactions between two spheres, i and j, are described by a Gaussian potential: where r ij is the distance between the centers of the spheres,σ 2 = σ 2 i + σ 2 j , and ε controls the strength of repulsion [45]. βV nb (r ij ) is obtained by approximating the interactions of superatoms underlying different spheres by binary repulsive collisions and is proportional to the overlap of two Gaussian clouds. Each of these clouds describes the average distribution in space of superatoms with respect to the COM of the subchain they belong to.
The soft-sphere model proposed for the PS is motivated by arguments [22,29,30,32] based on general polymer physics. The model aims to capture long-wavelength conformational and structural properties of PS melts, accurately enough for performing backmapping at one state point. In this work, we define state points using the density of the blobs nN BL /V (n and V are, respectively, the number of chains and volume of the sample) and temperature T . Due to the simple Gaussian-potential approximation used for the non-bonded interactions, the soft-sphere model is not expected to be thermodynamically consistent, e.g. it will not reproduce the equation-of-state of the PS. For this purpose, more elaborated blob-based models can be developed using techniques such as the integral equation theory [37,38,46] and others. [47,48] To sample the configurational space of PS melts described with the soft-sphere model, we benefit from an efficient Monte Carlo (MC) algorithm based on three types of moves: i) random change of sphere size, ii) random sphere displacement, and iii) slithering snake (reptation). A detailed presentation of the method can be found elsewhere. [31] Here, we summarize that the computational efficiency stems from a special particle-to-mesh calculation of nonbonded interactions, which avoids neighbor lists. [22,31] 2. Parameterization strategy The first step is to set the resolution of the soft-sphere model by choosing N b . Currently there are no rigorous rules for choosing N b , apart from that N b must be "large enough" for the soft-sphere model to be valid but smaller than the entanglement length, N e . The last requirement is important for backmapping. When soft-sphere chains are substituted by the moderately CG polymers, the liquid structure and chain conformations in the melt must be relaxed on scales smaller or comparable to the average blob diameter. The condition N b < N e warrants that this relaxation is achieved through a fast Rouse-like dynamics of short subchains and is not affected by surrounding topological constraints. [22] Had we employed a hierarchy with several blob-based models [22], the condition N b < N e would apply only to the last, highest resolution, blob-based model (where the reinsertion of the moderately CG description is performed). Depending on the method used to extract the entanglement length, simulations of PS have reported N e in the range of 100 − 200 monomers, which is equivalent to 200 − 400 superatoms. [34] These results agree with experiments. [49][50][51] Therefore, we explore different resolutions satisfying the condition N b ≤ 200. The entanglement time in the moderately CG model is τ e 4 × 10 4 τ .
For each trial N b , the parameters c 1 , c 2 , b BL , k BL , and ε are determined using typical structural-based (or Inverse Boltzmann) coarse-graining. [5,9,52] The parameters are chosen so that the probability distributions P BL (σ), P BL (d), and P BL (θ), as well as the pair-correlation function, g BL (r), of the centers of the spheres in blob-based PS melts, reproduce closely their counterparts in reference samples described by the moderately CG model. These reference samples contain 50 chains with N CG = 960 superatoms and were equilibrated through a variant [10] of the configurationassembly method proposed by Auhl et al [14] and long CG molecular dynamics simulations. The density of chains in the reference samples is n/V = 1.17 × 10 −2 chains/nm 3 and the temperature is set to T = 463 K. In the moderately CG reference samples, the polymers are partitioned into subchains with N b superatoms. The length of the chains in the reference samples restricts the N b that can be considered here to a limited set of values, to have an integer number of subchains, i.e. under the condition N b ≤ 200, we try N b = 192, 160, 120, and so on. After the subchains are identified, we calculate the distributions of: i) gyration radii of subchains, P ref,BL (R g ); ii) distance between the COM's of subchains, sequential in the same PS molecule, P ref,BL (d); and iii) angles between two vectors joining the COM of a subchain with the COMs of the preceding and succeeding subchain, P ref,BL (θ) (cf. Fig. 1(c)). The pair distribution function of the COMs of subchains, g ref,BL (r), serves as the counterpart of g BL (r).
With the reference distributions in hand, the parameters of the soft-sphere model are optimized through an iterative procedure which simultaneously minimizes four merit functions, defined [53] as: where f (x) = P BL (σ), P BL (d), P BL (θ) and g BL (r); f ref (x) is the corresponding reference function. In this study, we set the weighting function w(x) in Eq.(5) to a constant, w(x) = 1/x cutoff , so that it has no effect on ∆ f (x) . For P BL (θ) no cutoff is required, since θ ∈ [0, π]. For f (x) = P BL (σ) and P BL (d), the x cutoff is chosen such that f (x cutoff ) 10 −4 .
We use x cutoff = 12 σ CG for f (x) = g BL (r), because g ref,BL (r = 12 σ CG ) saturates, within the noise of the data, to unity.
To start the iterative procedure, the initial values of b BL and k BL are obtained by fitting the Boltzmann distributions of the potentials V b (d) and V θ (θ) from Eq. (2)  parameters c 1 and c 2 are obtained in a similar way by fitting the Boltzmann distribution of the potential V s (σ), Eq. (3), to P ref,BL (R g ). We emphasize that this first estimate of the parameters is approximate, because it assumes that the distributions are not correlated. A first-guess value for ε can be obtained from c 2 . To first order, the two parameters can be related to each other by equating the potential (3)), which captures the effect of the repulsion between intramolecular superatoms, to the "self-interaction" of a Gaussian density distribution. The latter is given by βV nb (0)/2; the prefactor 1/2 avoids double counting. This approximation leads to To perform the first iteration step, cf. Fig. 2, we consider the soft-sphere model with the initial values of parameters. The number of blobs in the soft-sphere chains equals the number of subchains used to partition the moderately CG molecules in the reference samples. The melt has n = 100 soft-sphere chains and the volume is chosen such that n/V matches the chain density in the reference systems. The temperature does not appear explicitly in the soft-sphere model, because all interactions in Eqs. (2) and (3) are scaled by the thermal energy. The melt is equilibrated using the particle-to-mesh MC and the distribution function P BL (σ) is extracted to calculate ∆ PBL(σ) . The parameters c 1 and c 2 are increased or decreased by 5% and a new simulation is performed. The procedure is repeated until ∆ PBL(σ) reaches a minimum value. The configurations from the simulation with the last optimized c 1 and c 2 are used to calculate ∆ PBL(d) . The parameter b BL is modified until ∆ PBL(d) reaches minimum. The same procedure is repeated for k BL and ε; the corresponding merit functions are ∆ PBL(θ) and ∆ gBL(r) , cf. Fig. 2. After updating all parameters, we commence a new iteration step. We recalculate ∆ PBL(σ) and improve c 1 and c 2 by minimizing again ∆ PBL(σ) . Subsequently, the remaining parameters are improved, as in the first iteration step. Iteration steps are repeated until the minimum values of all merit functions, ∆ f (x) , saturate.
Through exploratory studies we find that the simple soft-sphere model can be parameterized to reproduce static properties of the reference PS melt when N b 120. Backmapping a soft-sphere model with the smallest possible N b is preferable, because this procedure requires shorter relaxation times of reinserted microscopic details. Therefore, in this study we work with N b = 120.

D. Validation of the soft-sphere model
We illustrate that the melts equilibrated with the soft-sphere model indeed describe structual and conformational properties of PS melts on length scales comparable or larger than the size of the blobs. For N b = 120, the four panels of Fig. 3 present the distributions P BL (σ), P BL (d), and P BL (θ), as well as the pair distribution function, g BL (r), in a melt of soft-sphere chains (lines) and reference samples (symbols). The agreement of the curves is remarkable. As will be demonstrated in the following, the small deviations observed in the plots do not propagate into the properties of backmapped PS samples. chain. Fig. 4(a) presents for the soft-sphere model (lines) and reference samples (symbols) the intermolecular part g inter,BL (r) of the pair distribution functions plotted in Fig. 3(d). In both curves, the correlation holes of subchains (evident depletion at small distances) and of entire PS chains (shallow depletion towards tail) can be identified. The two g inter,BL (r) match each other closely, even in their long tails, demonstrating that the blob-based model captures the melt structure on large scales. Fig. 4(b), compares polymer conformations in melts described by the soft-sphere (lines) and moderately CG (symbols) models. We employ the internal distance plot, R 2 BL (s)/s, which constitutes a very sensitive quantifier of equilibration. [14] For the soft-sphere model, R 2 BL (s) is the mean-square distance between the centers of spheres in the same chain. In moderately CG melts, R 2 BL (s) is the mean-square distance between the COMs of subchains with N b = 120 superatoms, belonging to the same molecule. In both cases, s is the difference of ranking numbers of spheres (subchains) along chain contour. For large s the plots follow each other closely; for blobs, located near each other along the chain, the relative deviation is at most 3.7%. The internal distance plot demonstrates the accuracy of the soft-sphere model in describing conformational properties on the scale of entire chains.

III. HIERARCHICAL BACKMAPPING STRATEGY
A. Systems studied The hierarchical scheme described in the previous section can be applied to systems of any molecular weight for given density. In practice, for fixed pressure and temperature, the density of the polystyrene becomes chain-length independent for polymerization degrees larger than about 50-100 monomers. [34] Therefore we expect that the softsphere model, developed in the previous section, is transferable to PS melts with arbitrary long chains. However, for the purposes of method development, we set here as a goal the equilibration of atomistic PS melts comprised of n = 100 chains with N = 480 repeat units, that corresponds to a molecular weight of M = 50kDa. These molecules are equivalent to moderately CG PS molecules with N CG = 960 superatoms and soft-sphere chains with N BL = 8 blobs each. These are the chain lengths for which the soft-sphere model is parameterized and for which reference samples are available [10] (cf. Sec II C 2). Having the same chain lengths in backmapped and reference samples simplifies significantly the validation of the equilibration. The temperature in the backmapped and reference melts is T = 463 K, which is a typical processing temperature for PS. The reinsertion of the degrees of freedom of the chemically specific CG model into equilibrated soft-sphere PS melts is conceptually similar to the strategy developed earlier [22] for generic microscopic (bead-spring) models. Technically, however, the backmapping of the chemically specific CG model is more involved, due to the more complex CG force field. The technicalities of the different backmapping steps are presented below.
Introducing moderately CG PS molecules: Every soft-sphere chain in the melt is replaced by a moderately CG PS molecule in a matching conformation. Specifically, the COM and the radius of gyration (squared) of each subschain with N b superatoms in the reinserted molecule must match the COM and the radius (squared) of the blob which has the same ranking number in the soft-sphere chain. We generate each moderately CG PS molecule in the vicinity of the blob-based chain it must replace (the exact location is not critical). Setting the first bead of a generated molecule to A type, the remaining beads are added stepwise. A and B beads alternate. In this work, we assign randomly to each B bead one of the four subtypes (cf. Sec. II B) to make PS chains atactic. The length, bond angle, and torsional angle of the bond connecting the added superatoms to the part of the molecule that has been already constructed, are randomly drawn from Boltzmann distributions. Each distribution depends on the appropriate bonded potential. Once the PS molecules are generated, we associate [22] with each subchain two external potentials: βV cm,i = k cm (r i − R cm,i ) 2 and βV g,i = k g (σ 2 i − R 2 g,i ) 2 . Here R cm,i and R 2 g,i are the COM and squared radius of gyration of i-th reinserted subchain, respectively. These potentials affect all superatoms in the i-th subchain, because R cm,i and R g,i depend on the coordinates of all these superatoms. The parameters controlling the strength of the external potentials are empirically set to k cm = 100 k B T /σ 2 CG and k g = 100 k B T /σ 4 CG . With the external and bonded potentials simultaneously activated (non-bonded potentials are turned off), the configuration of generated moderately CG molecules is subjected to MD simulation. This simulation "drives" every molecule into the corresponding softsphere chain. At this stage the molecules do not interact with each other, therefore the simulation takes negligible time.
Recovering microscopic excluded volume: After the soft-sphere chains are replaced by the moderately CG PS molecules, we remove the external potentials and gradually activate [22] the non-bonded interactions between superatoms. For this purpose we use a "push-off" MD procedure [14,15] which removes overlaps between reiserted superatoms and recovers the original excluded volume characterizing the potentials U α,β (r). We summarize here the main features of the push-off protocol; details are provided in the Appendix.
The potentials are "force-capped" according to the rule: Here r c(n) is the force-capping radius and the original interactions are recovered when r c(n) → 0. The push-off is accomplished in circles, each of them comprises a change in r c(n) and local relaxation through a short MD simulation. For those beads that are not intramolecular (1, 5) neighbours, we decrease r c(n) by a small step at the beginning of each cicle. For intramolecular (1, 5) neighbours, r c(1,5) is adjusted in a special way in order to avoid significant distortions of polymer conformations during the push-off. To modify r c (1,5) in the beginning of each cicle, we use an approach similar to the one developed for generic microscopic models [15]. Namely, we quantify conformational distortions using the descriptor: where R 2 CG (l)/l is the internal distance plot calculated in the PS melt after the previous circle of push-off is accomplished. R 2 CG (l) is the mean-square distance between superatoms in the same chain and l is the difference of ranking numbers of these superatoms along the chain backbone. The reference internal distance plot, R 2 CG,ref (l)/l, is obtained from the reference samples of moderately CG PS melts. The boundaries in t are empirically set to n 1 = 50 and n 2 = 250. We increase or decrease r c (1,5) , depending on whether I is positive or negative.
The MD push-off procedure lasts about 20 τ which is a neglible fraction of τ e ; about 0.05%. In practice, this run takes less than a day on 32 processors of a typical supercomputer.
Local re-equilibration: Once the excluded-volume interactions are recovered, the moderately CG PS melt is locally reequilibrated through a standard MD simulation, which lasts about one τ e . This is the most demanding computationally part of our procedure. This is the most demanding computationally part of our procedure. To give a feeling for the CPU resources that are typically required for re-equilibration, we mention that it takes about 32 days using the ESPResSo++ package [54] (version 1.9.5) on 256 Xeon cores with frequency of 3.0 GHz. We observe that the local re-equilibration of PS melts for ∼ τ e , is about an order of magnitude slower -in terms of the required CPU timecomparing to melts described with generic microscopic bead-spring models. [22] The protracted computations are not related to the backmapping strategy but are due to the complexity of the chemically-specific force-field. Because the largest relaxation time is determined by τ e , the CPU time required by our backmapping procedure does not depend on chain length and is proportional only to system volume. [22] In contrast, the relaxation time in brute-force MD simulations is proportional to the reptation time, given by τ rep τ e (N/N e ) 3 ; to make a simple scaling argument we use the approximate cubic power law of the initial reptation theory [2,55]. The moderately CG PS melts considered here are only weakly entangled: based on the largest reported value N e = 400 (cf. Sec. II C 2) we obtain N/N e 2.4. The estimate for τ rep demonstrates that to equilibrate even this system, brute-force MD simulations would require about a year of continuous run (using the same amount of processors). Melts with slighly longer PS chains are entirely out of reach of brute-force MD simulations.

C. Validating backmapping of the moderately coarse-grained PS melts
We demonstrate the equilibration of backmapped moderately CG melts of PS by monitoring characteristic conformational and structural properties on local and global length scales. These properties are compared with their counterparts calculated in the reference samples. As a first simple test, we present in Figs. 5(a) and (b) the probability distributions of the bond angle and the torsional angle, P CG (θ) and P CG (φ). The data obtained for the backmapped melts (lines) are on top of the distributions (symbols) calculated in reference samples. The structure of the polymer liquid is quantified in Figs. 6(a) and (b), presenting the total, g CG (r), and the interchain, g inter,CG (r), pair-distribution functions of superatoms in the backmapped (lines) and reference (symbols) samples. The plots obtained for backmapped and reference systems are indistinguishable from each other. The agreement between g inter,CG (r) is particularly important -thanks to the correlation-hole effect [13] -this quantity is a sensitive quantifier of the mesoscopic liquid structure on length scales comparable to the average size of the entire chain. The characteristic size of the correlation hole, defined by the average radius of gyration of the chains, is marked in Fig. 6(b) by the arrow. To illustrate more clearly that the backmapped melts reproduce correctly the long-wavelength density fluctuations of PS, we compare in Fig. 6(c) their static structure factor S(q) (line) with its counterpart (symbols) in reference samples. The two plots agree with each other within the statistical noise of the data. The latter is quantified via the error bars, corresponding to the standard deviation calculated from four independent backmapped CG melts of PS. From the data in Fig. 6, we obtain the isothermal compressibility, κ T , of atactic polystyrene at 463 K: κ T = S(q → 0)/ρk B T ≈ 8.7 × 10 −10 Pa −1 . Here ρ denotes the number density of superatoms in the CG PS configurations. The κ T characterizing the backmapped samples is remarkably close to κ T = 8.4 × 10 −10 Pa −1 that has been experimentally reported for PS at 473 K. [56] One of the most stringent tests of equilibration is performed in Fig. 7, where we present the internal distance plot R 2 CG (l)/l calculated in the backmapped (line) and reference (symbols) CG PS melts. The error-bars correspond to the 95% confidence interval and, for clarity, are shown only for the plot obtained from the backmapped melts. The error-bars grow at the tail of the plot due to fewer data available for calculating R 2 CG (l)/l. Within the noise of the data, the two plots match each other well and confirm the equilibration of the PS melts. In the following we describe the methodology for the backmapping from the moderately (chemically specific) CG scale to the atomistic one. Note that several different methods for re-introducing atomistic detail in CG polymer chain conformations have been appeared in the literature. [9,12,16,40,57] Here, inspired from the above works, we propose a generic approach that is based on a "lego-like" construction of consecutive monomers along a macromolecular chain. The backmapping algorithm consists of the following stages: Development of a CG "lego"-particle database: Given a specific CG mapping scheme for PS chains, atomistic information about each CG particle is required. To obtain such information atomistic configurations of short PS chains are analyzed. More specifically, a particle database (class), based on the chosen CG mapping scheme, is constructed including two types of information: First, each member of the class represents a CG particle type and holds the atoms that define it, along with their relative positions, their masses and their contributing weights on the CG particle. In the specific CG PS model five different CG types are defined: One that describes the CG bead type "A" and four for the CG bead type "B" due to different tacticities. The constructed A and B "legos" that correspond to the specific CG PS model are shown in Fig. 8(a). Note that the lego A consists of a CH 2 and two half CH groups, whereas lego B of the phenyl ring (five CH and one C groups).
Second, from the analysis of the atomistic configurations, additional information is gathered about: (a) the average, and the distribution of distances between two consecutive CG particles ("AB" and "BA" CG bonds) and (b) the average and the distribution of angles between three ("ABA" and "BAB" CG angles).
Re-introduction of atomistic detail using a geometric algorithm: In the next stage, given the CG PS configurations, obtained from the backmapping of the blob-based to the chemically specific CG representation, the atomistic ones are created. This is achieved via a geometric algorithm, by putting each "lego" piece at the given center-of-mass of its corresponding CG particle in a consequent way. To avoid high overlaps the "legos" are rotated around their center-of-masses, to minimize distances between neighboring atoms; see Fig. 8 In more detail, for a specific PS monomer i we define as D hh,i the distance between the two CH halves, which correspond to the same CH united-atom group, and D har1,i and D har2,i the distances between the two halves and the "C" carbon of the aromatic ring, in the ith monomer.
In addition we define as: the three (Eulerian) rotation angles, of the A and B CG particle-"lego" of the ith monomer, respectively. Based on the above we propose, along two consequent PS monomers, the following minimization problem: where the cost function is defined as: where D har = 0.151 nm is the (average) atomistic "C-backbone" -"C-aromatic" bond length. w 1 and w 2 are weights with typical values w 1 =w 2 =0.7. Then, we minimize F cost , over all possible angles θ A,i , θ B,i , and θ A,i+1 via a BFGS method. This is a very fast minimization procedure, since it involves information of only two consequent monomers. The above procedure is repeated for all pairs of consequent monomers along the PS chain, in order to obtain a first realistic PS atomistic configuration.
Energy minimization: The re-insertion of the atomistic detail on the CG PS configurations has been extensively tested and no strong overlaps between neighboring atoms were observed. However, due to possible correlations among the CG intramolecular (bonds, angles, dihedrals) degrees of freedom, as well as due to the fact that the geometric method does not involve thermal fluctuations, overlaps between atoms cannot be fully avoided. To treat such overlaps, in the third stage an energy minimization procedure for all atoms is required.
We perform such a "global", with respect the number of atoms that are involved, minimization scheme in two steps: First, the non-bonded interaction potential, between all UA groups, is switched off and the bonded (involving all atomistic bonds, angles and dihedrals) potential is minimized through a steepest descent method, while the centerof-masses corresponding to the CG beads A and B are restrained in their original positions. Second, the non-bonded potential is switched on and the full atomistic force field (all bonded and non-bonded interactions) is minimized via a conjugated gradient algorithm.
Short MD runs: Finally, in order to relax remaining stresses in the systems and to obtain a realistic atomistic trajectory at the appropriate temperature, a short NVT MD run, of about 100 ps, is executed.

Validating backmapping of atomistic, low MW, PS systems
The quantitative validation of the backmapping from the chemically specific CG model to the atomistic one is a challenging issue. Indeed, a direct comparison of the atomistic configurations of high MW derived from the CG model with "reference" data is not possible, since in principle there are no accurate atomistic data for such polymer configurations.
Therefore, in order to validate the backmapping procedure we decide to use a low MW (1kDa, 10mer) PS melt, at T=463 K. First, we perform long atomistic MD simulations of atactic 10mer PS melts, using the model described in section II. These simulations are performed in the NPT ensemble, whereas tail corrections for the energy and pressure were applied. The integration time step was 1 fs, whereas the overall atomistic simulation time of the production runs was 100ns.
Second, we apply the backmapping methodology, described in the previous section, to a CG 10 mer PS system. [10] Thus, we obtain atomistic configurations of 10mer PS without performing atomistic MD simulations.
Therefore, for the low MW PS melt atomistic data are gathered both from long atomistic MD simulations ("reference" data), and from CG systems after employing the CG-to-atomistic backmapping procedure. As a direct "intramolecular" check of the backmapping process we examine the internal distance distribution, analyzed in the united-atom level, of a typical short-chain system (MW = 1 kDa) from long atomistic MD simulations and after backmapping of the CG configurations. In more detail, the R 2 AT (n)/n, is calculated, where R 2 AT (n) is the average mean square distance between n consequent backbone carbon atoms of the atomistic PS chains. Data are shown in Fig. 9. As we can see the agreement between the two sets of data is excellent for all internal distances. Note that a similar agreement has been found before for short PS chains following a different back mapping approach. [58]  The short chain PS atomistic configurations obtained from the CG ones through the backmapping procedure can be further examined by calculating the atomistic pair distribution functions, g AT (r), and comparing them against the data derived directly from the long atomistic runs. Results are presented in Figs. 10(a)(b). In both cases all correlations are included. First, in Fig. 10(a) data for the intramolecular distribution function, g intra,AT (r), are shown. The agreement between the two curves is excellent for all length scales providing thus a direct evidence of the applicability of the coarse-grained model to preserve the long length internal chain structure. Note, that these data include all intramolecular correlations, in contrast to the internal distance distribution data shown in Fig. 9, where only correlations along the backbone atoms are considered. Thus the agreement between the two sets of data shown in Fig. 10(a) proves the ability of the backmapping methodology to reproduce the full intramolecular structure.
The intermolecular pair distribution function, g inter,AT (r), is shown in Fig. 10(b). The excellent agreement between the two different sets of data, clearly demonstrates that the local packing in the backmapped atactic PS melts is also well reproduced. Note that the agreement between the intermolecular distribution functions based on different correlations, e.g. of "CH 2 -CH 2 " and of "phenyl-phenyl" groups (data not shown here) is of similar quality.

High MW atomistic PS melts
After validating the "CG-to-atomistic" backmapping methodology we can proceed to applying this approach to the high MW (50kDa) CG PS melts, which have been obtained through the "blobs-to-CG" backmapping procedure described in Section III B. By doing this, atomistic conformations of 480mer (MW=50 kDa) PS melts are directly obtained. Such well equilibrated high MW atomistic PS configurations cannot be generated through brute force MD simulations even with the most powerful computing resources. A typical well-equilibrated atomistic configuration (snapshot) of 50 kDa PS melts, with 100 chains is shown in Fig. 11. The structure of polymeric chains, and its dependence on the molecular weight, is of special importance. Thus, in the following we compare the pair distribution functions of the long (entangled) PS chains with the data from the short (oligomeric) PS discussed above. The structure of the derived atomistic PS 480mer configurations is examined in Figs. 12(a) and (b). In Fig. 12(a) data for the atomistic intramolecular distribution function, g intra,AT (r) of both low (10mer) and high (480mer) molecular weight PS melts are shown. The agreement between the two curves at very short distances of about 0.5 nm is expected. Indeed, correlations in such length scales involve neighboring atoms along the polymer chain (i.e. atoms belonging only in one-two consequent monomers), thus being very similar for the low and high molecular weight chains. For longer chains intramolecular distribution functions approach zero, as expected, however, g intra,AT (r) for 480mers PS chains are extended to much longer lengths.
Different is the case for the overall chain packing, which is directly related with the correlation hole of the polymer chains. [13] Data about the intermolecular pair distribution function, g inter,AT (r), are shown in Fig. 12(b). It is obvious that the correlation hole extends over a distance of the order of the average radius of gyration of the chains; the latter is shown, for both systems, in Fig. 12(b) with arrows. In addition, for the high molecular weight chains the correlation hole becomes less deep, as has been also discussed before. [34] We present in Fig. 13 the internal distance plot R 2 AT (n)/n calculated in one representative backmapped sample of an atomistic 50kDa PS melt (solid black line). R 2 AT (n) stands for the average mean square distance between n consequent backbone carbon atoms of the atomistic PS chains. For comparison, the internal distance plot calculated for the sample of the moderately CG melt used to backmap this atomistic sample is also shown (dashed red line). For the atomistic sample, the internal distances start from a small value and then smoothly reach a plateau for n around 80-100, which corresponds to 40-50 PS monomers. This behavior at small n differs from the internal distance plot in the moderately CG representation, where R 2 AT (n)/n appears non-monotonous and more structured. Such differences are expected because of the averaging of the monomeric degrees of freedom performed in the CG groups. In contrast, the two plots match each other at large n, demonstrating that our "CG-to-atomistic" backmapping procedure conserves the global conformational properties that have been already equilibrated during the "blob-to-CG" backmapping stage.

IV. CONCLUSIONS AND OUTLOOK
Preparing equilibrated samples of highly entangled polymer melts serves as a first, but by no means trivial, step for studying their dynamical and rheological properties in and out of equilibrium using computer simulations. In this work, we propose a method for efficiently generating well-equilibrated configurations of high molecular-weight polymer melts, described with chemically-specific atomistic models. The method is build on a hierarchy of models, which describe the same material with different resolutions: from mesoscopic to atomistic. The modeling hierarchy includes a soft blob-based, a moderately coarse-grained, and an atomistic model. If required, more soft blob-based models can be incorporated, to accelerate equilibration of larger scales. Each model in the hierarchy is parameterized to reproduce key conformational and structural properties characterizing the melt, when described with the next finer-resolution model. First, the melt is efficiently equilibrated using the soft blob-based model. The details of the next level are reinserted, the system is re-equilibrated, and we proceed to the next reinsertion step. During each step of backmapping only local re-equilibration is required. Therefore, the computation time is independent of the molecular weight of the polymer that is considered.
Here, as a proof-of-concept, we use the method to equilibrate atomistically-resolved samples of melts of long attactic polystyrene chains. The preparation of such systems using brute force MD simulations is unfeasible, even with modern supercomputers. We emphasize that the method is not restricted to polystyrene and can be straightforwardly applied to equilibrate melts of other polymeric materials, where moderately CG models exist. [11,[59][60][61][62][63] Appendix: "push-off " procedure for backmapping the blob-based model to the chemically specific coarse-grained model The whole "push-off" procedure is accomplished in 100 cycles of a molecular dynamics simulation, where each cycle is comprised of 2 × 10 4 steps, and time step ∆t = 0.0001τ (τ is time unit in the CG PS model). (a) For all particle pairs ij other than the intramolecular (1,5) pairs r c(n) is linearly decreased from 0.9 r cutoff ij (r cutoff ij is the cutoff of the LJ-like potential between i and j particles) to 0.5 r cutoff ij within the first 80 cycles, after which particle overlaps for those pairs can be successfully removed. (b) For the intramolecular (1,5) pairs, the corresponding force-capped radius, r c (1,5) , is allowed to fluctuate (instead of decreasing linearly as in case (a)) to prevent large distortions of chain conformations. For this purpose, after each cycle, we quantify via I (see Eq.(7 in Sec III A) the deviation of the internal distance plot from the reference one. Before starting the next cycle we decrease (or increase) r c(1,5) by 0.01σ 1−5 if the calculated I is negative (or positive). In this way, r c (1,5) can fluctuate within the first 80 simulation cycles. In the last 20 cicles it is reduced linearly to 0.5 r cutoff (1,5) .