Hima Bindu
Kolli
*a,
Guadalupe
Jiménez-Serratos
b,
James
Doutch
a,
Tristan G. A.
Youngs
a and
Thomas F.
Headen
*a
aISIS Neutron and Muon Source, Rutherford Appleton Laboratory, Harwell Campus, Oxon, OX11 0QX, UK. E-mail: hima-bindu.kolli@stfc.ac.uk; tom.headen@stfc.ac.uk
bSTFC Hartree Centre, Scitech Daresbury, Warrington WA4 4AD, UK
First published on 4th August 2025
The combined use of neutron scattering experiments with molecular simulation is increasingly being used to study multi-scale structures in molecular biology and soft matter physics. Small-angle neutron scattering (SANS) can provide experimental data at the length scale from 1 to 100's of nm, an order of magnitude larger than the typical atomistic simulations. In this context, coarse-grained (CG) simulation can be used to reduce computational costs, explore system polydispersity, and overcome slow dynamics. The mathematical expression to calculate SANS curves from molecular models is well defined for atomistic systems, but further approximations are needed to analyse CG models, where the atomistic resolution is lost. Here, we present the MuSSIC tool, which is a code to compute the neutron-weighted total structure factor, FCG(Q), directly from CG simulation trajectories, based on the methodology proposed by Soper and Edler [Biochim. Biophys. Acta, 2017, 6, 1861]. We validate the approximations by comparing the results against the atomistic pseudo-CG data to decouple force-field effects. We demonstrate the scientific usefulness and understanding provided by the code, by comparison of CG simulations to the experimental scattering data for archetypal soft matter systems, SDS and CTAB solutions. We were able to use the marked differences with experimental SANS data to give a detailed understanding of the appropriateness of the CG simulation methodologies used for predicting structure. This forms a first step towards new approaches in SANS data analysis, particularly in allowing refinement of models against one or more experimental data sources.
It is well known that one cannot uniquely assign a 3-dimensional atomistic structure to a particular scattering pattern (in all but the most trivial cases), however, total and small-angle neutron scattering intensity can be calculated from known atomic positions over the course of a simulation trajectory. In the most limited interpretation, this provides a strong structural benchmark that the derived force-field parameters and simulation methodologies do reflect experimental reality. In the cases where there is good agreement between simulation and scattering, the resulting simulation ensemble can be interpreted as a reasonable structure representation of the system in question. While there may be several possibilities of atomic arrangements which give a close match to the same neutron scattering data, the additional constraints on known chemical physics in molecular simulations (density, known bonded molecular structure, force-fields that have reasonable agreement with thermodynamic properties) help reduce the possibilities, and constrain the search for a reasonable structure. Using this approach, the combined use of neutron scattering and molecular simulations has allowed the study of disordered systems like molecular liquids, novel solvents, confined fluids, surfactants, biomolecules and polymers in unprecedented detail.5–11 For example, methods like empirical potential structure refinement (EPSR), as implemented in the EPSR code12–14 and now Dissolve,15 have been developed to drive the simulation towards matching experimentally measured structure, by comparing the simulations with the experimental scattering data (usually several isotopically distinct datasets), then by means of the feedback of an additional potential based on the difference, the simulation is driven towards matching all available scattering data. The EPSR method has been widely used for interpreting the atomistic structure of liquids and glasses of a wide variety of systems including simple molecular liquids,16,17 ionic liquids,18 deep eutectic solvents,19 multicomponent glasses,20 biomolecules in solution,21 fluids in confinement22 and, at the largest length scale, small micellar systems23 and biomolecular aggregates,24 by refining against X-ray and/or neutron total scattering data.
Modern neutron instrumentation, such as the Near and InterMediate Range Order Diffractometer (NIMROD) at the ISIS facility,25 have the ability to study systems across length scales ranging from the interatomic (<1 Å) through to the mesoscopic (>300 Å) simultaneously. There is therefore a motivation to increase the length scale of the EPSR technique to match the capability of NIMROD to measure atomistic and mesoscale correlations simultaneously. Furthermore, atomistic molecular simulation is increasingly being used to interpret SAS data. Modern SANS instruments, for example SANS2d26 at the ISIS second target station, offer a wide Q-range and low background, giving experimental data which is more conducive to analysis by simulation based methods.27
Computational approaches have been particularly successful for structural studies of large biomolecules, such as proteins in dilute solution, where tools such as SASSIE,28 have established simulation as a viable analysis method for the lay user, who is not an expert in molecular dynamics. This is a modular framework which contains various elements and operations required to undertake atomistic simulations on biomolecules, including structure building, a number of simulation and minimization modes, and calculation of small-angle scattering curves. This approach has a successful track record in the analysis and interpretation of scattering data for biomolecular systems, for example understanding the conformational space of different components in large complexes such as antibodies, and suggesting plausible conformations for highly flexible biological assemblies.29–31 However there is an increasing demand to use these tools for understanding more generic soft matter systems, often at high concentration, and where the solvent plays an important role and cannot be ignored in the scattering calculation – therefore system size becomes even more critical. This is particularly the case for concentrated solutions, or in cases where the effect of the hydrating water cannot be ignored where fully atomistic molecular dynamics simulations of soft matter systems and large bio-molecular systems are required. For example, investigating the structural and dynamical features of polymer melts at length scales covering both intermolecular range and local short-range is challenging due to the different relaxation times involved.32
Coarse-grained (CG) simulation provides a means for simulating the assembly and interactions of such macromolecular complexes at a reduced level of representation, thereby allowing both longer timescale and larger-sized simulations. In CG simulation, a small number of atoms (typically 3 non-H atoms) are grouped together to form a single CG bead. However, there are certain limitations on increasing the level of coarse-graining due to unphysical bond crossing caused by the softening of the coarse potentials, inaccurate dynamics at interfaces and non-transferability of force field (FF) parameters.33–35 By obtaining the neutron scattering data directly from CG simulations and comparing it to the experiments, one can perform pair structural analysis and also check the accuracy of the CG potential. But computing neutron scattering from CG simulations is not straightforward as there is a loss in atomic resolution with the definition of the CG beads.
Soper and Edler have developed a preliminary coarse-grained version of empirical potential structure refinement, CG-EPSR, that is potentially applicable to a variety of mesoscale and nanoscale structures.36 The method closely follows EPSR and involves deriving an empirical interaction CG potential from the scattering data and is applied on reverse aqueous micelle of sodium-dioctyl sulfosuccinate (AOT) and iso-octane, with average CG bead containing ∼200 atoms and radius ∼0.9 nm which is considerably larger than typical CG bead sizes. In this work, the total structure factor FCG(Q) is computed for the CG simulations by combining the neutron scattering due to the atoms within the single CG bead and scattering due to the bead pairs. The method has not been tested on typical CG bead sizes (3 or 4 carbon atoms per bead) and the intra-molecular bead scattering coming due to the beads that are connected in the molecule is not estimated.
The aim of the current paper is to generalise and verify the neutron scattering calculation method for CG simulations proposed by Soper and Edler in ref. 36 so that it can be used to calculate neutron scattering from any typical CG molecular simulation. The method has been modified to consider much smaller and typical CG bead sizes which are used in standard CG simulations like dissipative particle dynamics or simulations based on MARTINI force fields (≤6 carbon atoms per bead) and are bonded. This is done by considering intra- and inter-molecular bead scattering separately, as shown in Section 2.2, allowing efficient calculation of scattering from different isotopically labelled samples.
The key questions we aim to answer with this study are: (i) How well does a CG calculation of scattering work for “typical” CG simulation bead sizes? (ii) What is the effect of bead size and form factor on the calculation accuracy? And (iii) how widely applicable is the calculation method to different soft matter systems? As a final objective, we demonstrate the utility of the code by using it to compare large CG simulations of micellar systems against experimental data. The first version of the ‘MuSSIC: Multiscale Simulation Scattering Intensity Calculator’37 software has been made available on GitHub (https://github.com/disorderedmaterials/MuSSIC). All the example files and test systems are included in the user guide and documentation which is provided along with the code.
The paper is organised as follows: first, we include an introduction of the relevant scattering theory and the equations used for the calculation of scattering from CG simulation. This is followed by outlining the simulation and verification methodology used and the results of those validation tests. Finally, we include a demonstration of the utility of the MuSSIC code by calculating the scattering from two coarse-grained surfactant solution simulations (CTAB and SDS) and compare the results to experimental SANS data.
![]() | (1) |
The total structure factor for a system containing N distinct atom types can be written as the weighted sum of all possible partial pair structure factors Sij(Q).
![]() | (2) |
S ij (Q) is the partial structure factor obtained from the spatial correlation between the atom types i and j. The Faber and Ziman definition of partial structure factor Sij(Q) is given as
![]() | (3) |
The weights matrix W is calculated for all possible pairs of atom types as given below
Wij = (2 − δij)cicjbibj. | (4) |
For multicomponent systems, a single total structure factor obtained in experiments contains all the partials that one would like to extract individually. The isotopic substitution technique helps to identify the contributions from different partial structure factors (usually hydrogen is replaced by deuterium). Different ratios of isotope give different isotopologues (i.e. chemically identical samples where the only change is substituting one or more isotope for another).
The isotopic substitution can be implemented for the calculation of F(Q) from simulations by simply changing the neutron scattering length bi of the atom type i on the r.h.s of eqn (2) with the scattering length of its isotope. For systems with exchangeable hydrogens (e.g. water), the r.h.s of eqn (2) becomes the product of concentration and scattering length of the isotope and its isotope ratio λ. The effective scattering length of hydrogen when substituted for deuterium becomes
beffH = λbD + (1 − λ)bH | (5) |
Care needs to be taken when calculating the weights matrix (Eq. 4) to understand if the isotopically labelled hydrogens are chemically exchangeable (e.g. –OH or –NH) or not (e.g. –CH). For example in water a 1:
1 mix of H2O and D2O results in a statistical mixture of H and D across all molecules, however in a 1
:
1 mixture of benzene and d6-benzene, there is no exchange, and so molecules are either fully deuterated or fully hydrogenated. Therefore, the scattering weight coefficient in eqn (2) is different in each case. To account for this difference, the total F(Q) can be separated into two parts. The structure factor is computed for atoms within the same molecule Fintra(Q) and a separate structure factor is computed for unbound atoms Finter(Q). Full details of this calculation, with examples, are given in the SI Section S1.
![]() | (6) |
![]() | ||
Fig. 1 Schematic showing the uniform and Gaussian density distributions used for the definition of form factor fs(Q) for bead type ‘s’. |
By analogy with the atomistic case (eqn (1) and (2)) the total differential scattering cross-section for a CG system containing M distinct CG bead types can be written as36
![]() | (7) |
FCG(Q) = Fsingle-beadCG(Q) + Fcross-beadCG(Q). | (8) |
![]() | (9) |
![]() | (10) |
The extra terms in Fsingle-beadCG(Q), with respect to the atomistic case, take care of the fact that each bead may contain more than one atom of any given type. Since the atoms in each bead are assumed to be distributed statistically through the bead according to the same ρ(rs) for all atom types their relative positions within the bead are uncorrelated. The density distribution and the form factor used for the current study are discussed in Section S3 of the SI.
The atoms within the bead are treated similarly as in intra-molecular scattering for atomistic simulations. Therefore for non-exchangeable hydrogens, and given an isotope ratio λ, the above equation changes to
![]() | (11) |
F cross-beadCG(Q) is the scattering from all possible pairs of bead types, which is obtained by analogy to the atomistic case (see eqn (2)), where this time the “bead scattering length” is obtained by multiplying the sum of the scattering lengths of atoms in the bead s with its form factor fs(Q)
![]() | (12) |
Fcross-beadCG(Q) = FintraCG(Q) + FinterCG(Q) | (13) |
We also need to consider how to distribute the scattering length density over a bead, as a function of its radius. We describe this in detail in SI Section S3. In summary we consider both a uniform and Gaussian distribution (see Fig. 1, which result in different bead form factors f(Q)).
Finally, it is important to note that the lower limit of the Q-range, Qmin, for obtaining reliable scattering data depends on the simulation box size L. The maximum distance between the beads in the g(r) calculation is limited to half of the box due to the periodic boundary conditions. It is therefore reasonable to set Qmin as equivalent to the maximum correlation in r-space, i.e. Qmin = 2π/(L/2).
The MuSSIC code is used to compute FCG(Q) from the pseudo-CG trajectory following the equations shown in Section 2.2. FCG(Q) is then compared against the atomistic benchmark data F(Q) to test the accuracy of the calculation. Fig. 3 shows the workflow followed for the validation tests. This comparison test is repeated for different pseudo-CG trajectories generated using different CG mapping models, but not by performing a CG simulation. This allows us to compare FCG(Q) with F(Q) for the same underlying structure, just with different levels of spatial resolution. The accuracy of the method is tested by comparing the FCG(Q) with reference data F(Q) from the atomistic trajectory. We are therefore able to remove any difference that occurs between atomistic and CG representations due to differences in the simulation force-fields, affording a direct comparison and assessment of the quality of the CG calculation method against and atomistic one. Validation tests have been performed following the workflow shown in Fig. 3 on a polyamide-66 melt and a solution of C10 TAB surfactant in water for different CG mapping models and the results are discussed in Section 4.1. Three different pseudo-CG mapping schemes were used for each system as shown in Fig. 4 and 5. Details for the setup of the atomistic simulation are given in Section S4 of the SI. Fig. 6a shows the atomistic representation of a configuration of polyamide-66 alongside the same with ∼4 carbon atoms per bead (Fig. 6b) representation.
![]() | ||
Fig. 6 Snapshots showing the (a) atomistic and (b) ∼4 carbon atoms per bead representation of polyamide-66 system. |
Secondly dissipative particle dynamics (DPD) simulations47–49 were performed using LAMMPS v18Mar201850 to study the formation of C16TAB micelles in water. The coarse-grained model is based on the CTAC model presented in ref. 51, where the surfactant is a flexible chain of 9 beads representing C16TA+ as [(CH3)3N+CH2][CH2CH2]7[CH3], and the counter ion Br− is contained in a [Br−(H2O)2] bead as shown in the Fig. 8. For the water model, two molecules are implicit per DPD segment, (H2O)2. The non-bonded interactions are modelled via soft-repulsive potentials with parameters taken from ref. 51–53. Full details of the simulation methodology and setup are provided in the SI Sections S5.3 and S5.4.
Fig. 9 shows that the difference between the Gaussian and uniform scattering length density distributions is very small, and negligible at low Q values. However, the Gaussian distribution shows slightly better agreement for both polyamide-66 and C10TAB surfactant in water cases shown in Fig. 9a and b. Therefore, the Gaussian distribution has been used throughout the rest of the study with the width of the Gaussian, γs = 0.51Rs (where Rs is the uniform sphere radius) for all the test cases as explained in SI Section S3.
The radius of the CG bead (Rs), for the bead type s is a variable in the calculation which affects the form factor of the bead type fs and thus affects the total scattering. For the test cases shown in Fig. 9, the radius of the CG bead is estimated from the distribution of atoms defining the bead by using the root mean square deviation (rmsd) of the distances of individual atoms (composing the bead) from the geometric center of the bead. In addition, we also estimated the radius of the beads from bead connectivity (beads that are bonded within a molecule) in the pseudo-CG trajectory. We found that the radius calculated from an rmsd calculation showed a clearer intermolecular structure peak closer to the atomistic calculation, although with a higher background, making the overall residual greater. Given this, we choose to use the RMSD calculation as the standard method for calculating the bead size as it is more universal and has the benefit of simplicity of implementation. Further details of these tests, allowing with further tests of the affect of bead size on the calculation are given in the SI Section S6.
![]() | ||
Fig. 11 F CG(Q) computed for polyamide-66 and compared with F(Q) using the CG mappings in Fig. 5 (a) ∼2 carbon atoms per bead (b) ∼4 carbon atoms per bead and (c) ∼6 carbon atoms per bead. |
Overall Fig. 11 shows a good agreement in the low Q region for all bead sizes. As expected, with an increase in the bead size, there is an increasing divergence between the atomistic and CG scattering at higher Q, due to the loss of spatial resolution. Of particular note is the abrupt change in the position of the high Q scattering peak as the CG representation moves from 4 to 6 beads, which is explored in more detail in the next section. A more detailed breakdown of scattering calculation into FintraCG(Q) and FinterCG(Q) is given in Section S2 of the SI showing that the differences with bead size primarily come from the intermolecular scattering in the high Q region.
Exploring the ∼6C case of PA66 shown in the Fig. 11 in more detail, we see a shift in the peak at (∼1 Å−1) to lower Q value. This change in the peak shape and position is not seen in ∼2C and ∼4C cases, so what is it about moving up to ∼6C that causes this discontinuous change? To understand this further, we plot partial pair correlation functions gxy(r) for the CG beads of the polyamide chain. Here x and y indicate CG bead types. Fig. 12 shows the gxy(r) plotted for C2–C2, C4–C4 and C6–C6 bead type pairs for ∼2C, ∼4C and ∼6C cases respectively. For comparison, gCC(r) is plotted from atomistic simulations. Obviously, there is a loss of the low r peaks as we move to larger beads, however up to ∼4C beads, the larger r peaks overlap, indicating a faithful representation of the overall structure. However, there is a clear change in the structure on moving to the 6C case with a new peak appearing at ∼7 Å. Furthermore, there is a clear trend of the longer distance peaks shifting to higher r (shown in the inset of the Fig. 12). We can further understand this difference by separating the g(r) into intra and inter-molecular components, as shown in Fig. S16 of the SI. The peak positions remain less affected in inter-molecular correlation shown in Fig. S16b of the SI, while the intra-molecular correlation (Fig. S16a of the SI) shows a clear shift in the peak for ∼6C case. This analysis clearly indicates that the disagreement for the 6C bead case stems from the reduced detail and correspondence of that CG representation with the underlying structure, resulting in a clear difference in the high Q scattering data.
To further understand and exemplify the calculation method, the MuSSIC code has been used on two CG simulations and the outputs compared to experimental neutron scattering (SANS) data, allowing new physical insight into the ability of CG force-fields and methods to predict structure. SANS data were collected on the SANS2d instrument,26 ISIS Neutron and Muon Source, using source to sample and sample to detector distance of 4 m and circular final (A2) aperture size of 8 mm. The time of flight method was used, utilising wavelengths of 1.75–16.5 Å. Samples were placed in quartz cuvettes in a temperature controlled sample changer at 25 °C. Data were normalised and corrected to 1-D curves using Mantid software.54,55
To further understand the differences between the experiment and CG simulation, we calculated the aggregation properties in this system, such as shape, size, and distribution of micelles, directly from SANS data. This was performed by using model fits in SasView,56 with the best fit obtained for a spherical model of micelle with radius ∼20 Å and a net charge of ∼14e. The Hayter–Penfold Rescaled Mean Spherical Approximation (RMSA) structure factor57,58 was used to obtain the inter-particle structure factor S(Q). This structure factor was chosen, as it can be used to describe interparticle effects between charged particles. This is a well established method for surfactant systems at the concentrations studied and without counter-ion. A simpler hard sphere interaction would not fit well in these cases. The effective radius of the micelle is calculated by fitting the form factor P(Q) combined with S(Q) as [P(Q)·S(Q)]. The average aggregation number 〈Nexpagg〉 is obtained from the effective radius of the micelle using (volumemicelle/volumesurfactant) which is found to be ∼60 surfactants per micelle in SDS case.
The average aggregation number 〈Nsimagg〉 was obtained from CG simulations by counting the number of micelles and the molecules in each micelle for each configuration.59 For this, molecules forming a micelle are identified using a cutoff distance of 8 Å (1.7 times the bead diameter) using the linked list algorithm. The cutoff distance corresponds to the first minimum of the radial distribution function between the head bead of the surfactant and Na+ ion. The geometric center of the micelle is obtained from the molecules forming it. The radius of the micelle is then obtained from the root mean square deviation of the beads from the center of the micelle.
The average radius of the micelle is found to be ∼23 Å and the average aggregation number 〈Nsimagg〉 is found to be ∼28 surfactants per micelle. Though the average radius of the micelle from CG simulations matches closely with the value obtained from SasView analysis of the experimental SANS data, the average aggregation number 〈Nsimagg〉 is found to be lower than 〈Nexpagg〉. Using the CG simulation we are able to get a much more detailed analysis of the distribution of aggregate sizes for the system. Fig. 14 shows the probability distribution of micelle size Nsimagg which was obtained after performing 5 μs simulations. The average aggregation number plotted against simulation time is shown in SI Fig. S17, showing that the simulation is run long enough (6 μs) to see a stable value of Nagg and no further fusion of micelles happening. We can see a wide distribution of sizes from monomers to micelles of size Nsimagg = ∼50 surfactants, having a peak around Nsimagg ∼ 10. The presence of the few bigger micelles in the simulation box make the average aggregation number Nsimagg less meaningful in such a case. A similar variation in the size distribution of the micelles is seen at low concentrations of SDS (50 mM) even after 5 μs long hPF-MD simulations (reported in Fig. 8 of the ref. 59) confirming sufficient equilibration time in this study. However, hPF-MD simulations are able to show a more stable large number of monodisperse micelles at higher concentrations of SDS.40,59
![]() | ||
Fig. 14 Probability distribution of the number of SDS micelles versus micelle size at 60 mM concentration. |
The differences found in the scattering curves between experiments and the simulations are useful in helping our understanding of exactly how the simulation does not match the experiment. In this case, it can be attributed to the fast dynamics in hybrid particle-field potentials resulting in the formation of a few bigger micelles and more smaller aggregates compared to experiments at such low SDS concentrations, resulting in a shift in the micelle structure factor peak to higher Q.
A similar SasView analysis of the aggregates has been performed on experimental SANS data as for CTAB, using the Hayter–Penfold Rescaled Mean Spherical Approximation (RMSA) structure factor to obtain the interparticle structure factor. The best fit is obtained for a spherical model of monodisperse micelles with a radius ∼27 Å. The average aggregation number from SANS data 〈Nexpagg〉 is found to be ∼135 surfactants per micelle. Simulations give an average radius of the micelle ∼17.8 and an aggregation number of 〈Nsimagg〉 ∼42 surfactants per micelle showing the presence of considerably smaller CTAB micelles compared to experiments. The probability distribution of aggregate size Nsimagg, shown in Fig. 16, shows a major proportion of more stable, similar-sized aggregates (∼42 surfactants per micelle) in the simulation box and by calculating the average cluster size as a function of time shown in Fig. S18 in the SI, we found that the equilibrium was reached after 25 × 106 time steps, using a time step of 0.01 in DPD units. We ran 107 equilibrium steps more, from which the frames for analysis were taken.
![]() | ||
Fig. 16 Probability distribution of number of CTAB micelles versus micelle size at 100 mM concentration. |
Unlike the SDS case, DPD simulations of CTAB show less polydispersity in the micellar size distribution with no second peak formation. The smaller and shifted peak reflects the smaller aggregates compared to experiments. This has been observed in other DPD studies of CTAB micelles.60 The direct comparison of the simulation to the experimental data provides for a much richer, more detailed analysis than comparing outputs of fitted SAS models, and does not rely on the inherent assumptions in those models (micelle shape, polydispersity models etc.).
Fig. 17a shows the scattering data at 0 ns, 5 ns, 90 ns and at the end of 1.75 μs long DPD simulation of CTAB surfactants in water. A random mix of surfactants in water is used as the initial configuration (0 ns). The low Q peak emerges as micelles start growing and stabilise at around Q ∼ 0.07 Å−1 from 90 ns onwards. As discussed above the high Q liquid structure peak appears at around Q ∼ 1.5 Å−1 due to the molecular structure factor peak observed in neutron diffraction measurements of neat D2O. The difference in the low Q peak position with experimental data can be related to the micellar size and distribution observed as previously explained in the Section 4.2.2.
![]() | ||
Fig. 17 (a) FCG(Q) plotted as the simulation progressed with time for (a) 100 mM of CTAB in water (b) 60 mM of SDS in water. |
We can now understand the “intermediate Q” peak in the case of hybrid particle field simulations of SDS surfactants, as is seen developing in Fig. 17b at Q ∼ 0.6 Å−1, by analogy to the molecular structure factor peak for CTAB, noting that for the hybrid particle field simulations the potentials used are much more diffuse. This leads to a significant overlap of the beads and very little in terms of structure, as shown in the much flatter water–water g(r) (Fig. S20 in SI). At t = 0 both simulations show similar scattering patterns that quickly plateau, the only difference being the Q-scale at which this happens due to the different size of the water beads (4 waters in SDS case and 2 waters in CTAB case). This scattering is due to density fluctuations in the randomly placed beads in the box. In the DPD simulation for CTAB this plateau almost instantly disappears due to the local intermolecular structure factor of water beads (see structured g(r) in Fig. S20 of SI). Conversely, in the case of SDS the combination of a larger water bead, and high bead overlap (as shown through an unstructured g(r) that does not go to zero at low r), means this plateau largely remains, forming an intermediate Q shoulder on the lower Q micelle structure factor peak. Scattering calculated from pure water hpF-MD simulations and the presence of the shoulder between 0.3–1 Å−1 supports this argument (details are given in Section S10 of SI).
To further explore the reasons for the differences between the calculated scattering from CG simulations and the SANS experiments we return to the validation approach shown in Section 4.1, to confirm that differences between the simulation and experiment are due to the CG simulation methodology, rather than the calculation method. To achieve this, we performed necessarily short atomistic simulations of the systems and then generated pseudo-CG trajectories, following the methods used in the validation tests. The SASview analysis of the SANS-2d data has given an estimate of nearly 60 surfactants per micelle in 60 mM SDS case and 135 surfactants per micelle in 100 mM CTAB case. We therefore performed GROMACS simulations on an atomistic system with micelles of size suggested by SASview, built using the Shapespyer tool.61 A single micelle surrounded by water has been generated and the structure has been well equilibrated for 100 ns using GROMACS. The single micelle was replicated in all directions generating a large system of 8 micelles in water. This was then equilibrated again for another 10 ns using GROMACS. Fig. 18 and 19 show the F(Q) obtained from the atomistic trajectory of 8 preformed micelles and FCG(Q) from pseudo CG-trajectories compared with SANS-2d data for SDS and CTAB cases respectively.
Fig. 18 and 19 show better match with experiments when compared to CG simulations shown in Sections 4.2.1 and 4.2.2. Although ripples are present in the data due to the shorter simulation time, and therefore reduced exploration of the ensemble as necessitated by the more expensive atomistic calculation. The intermediate peak is absent in the case of SDS in water. This test confirms that the differences are due to the coarse-grained simulation methodology but not from the scattering calculation. The error in the low Q region shown in the Fig. 18 and 19 highlights the differences in micellar properties like aggregate size, shape and polydispersity. The calculation of scattering from the atomistic based pseudo-CG simulations also show how atomistic simulations are insufficient to predict scattering, due to their slow dynamics resulting in scattering data from an insufficiently large ensemble of structures.
The purpose of these comparison studies is to show the ability of MuSSIC to obtain such quantitative information on micellar properties of surfactant systems from CG simulations which allow direct comparison to experiments. This study not only assesses the accuracy of the CG potentials and parameters in the force fields from the differences with the experiments but also, potentially allows us to tune those potentials to match with experiments, allowing the determination of a data-refined structure,36 and/or avenues for development of new CG force-fields based on experimental structural data.
The scientific insight afforded by use of the code has been demonstrated on large DPD and hPF-MD simulations of C16TAB and SDS solutions respectively. In both cases, the structural differences between the simulation and experiment were clearly described. This level of information is vital in testing the appropriateness and transferability of CG force-fields, where results can be highly dependent on what data force-field parameters have been derived from. It is clear, from these two examples at least, that typical CG simulation force-fields may not provide the best possible representation of structure as measured by SANS. In this sense the MuSSIC code also provides a well-verified scattering calculation tool as a first step towards developments of CG refinement methods, that are designed to “push” a simulation towards matching the experimentally measured structure – this could be a CG version of EPSR/Dissolve, or other novel algorithms that may be better suited to the longer length scales involved e.g. conformation searching for larger molecules. Ultimately, for wide Q-range neutron scattering instruments such as NIMROD, refinement of a CG simulation against the lower Q data could be combined with the refinement of an atomistic simulation, that is kept structurally coherent with the CG simulation, providing a structure refinement across multiple lengthscales. In addition to refinement methods, future plans to further develop the MuSSIC code include improved definition of bead form factors, support for non-cubic simulation boxes, convolution of the result with a instrumental resolution function and calculation of X-ray scattering.
A parallel question is how can CG simulation help with SANS experiments and data analysis? In the first instance, CG models could be used to test if structural differences will be sufficient to provide different scattering signals, helping to plan experiments. Furthermore, if the experiment does match the simulated scattering (or can be made to match through some form of refinement) it can provide a more robust structural model than standard fitting-based approaches, as the result is known to be consistent with underlying molecular geometries and (assuming a reasonable force-field) some sensible description of the intermolecular forces. Finally, the use of a molecular scale model can potentially form the vital link between different experimental techniques (e.g. NMR cryo-EM, coherent diffraction imaging) to co-refine to a structure, this may become increasingly important as the complexity of the systems studied increases for example in determining structures such as lipid nanoparticles and vesicles.62,63
A description of the scattering calculation broken down into intra- and inter-molecular parts, a detailed description of the bead form factors used, further details of the atomistic and CG simulations, further details on the effect of bead size on the scattering calculation, a presentation of the residuals and R-factor for the scattering calculation (Cg vs. atomistic), plots of the intra- and intermolecular radial distribution functions for the CG-polymer beads, plots of the average aggregation numbers over time for the CG simulations, and details of the hybrid particle field simulation of pure 4-bead water. See DOI: https://doi.org/10.1039/d5cp01390a
This journal is © the Owner Societies 2025 |