Ritvind Suketana,
Andrew Golembeski and
Joshua Lequieu*
Department of Chemical and Biological Engineering, Drexel University, Philadelphia, Pennsylvania 19104, USA. E-mail: lequieu@drexel.edu
First published on 8th August 2025
Field-based simulations can be challenging in multi-component polymer systems and are highly sensitive to the choice of relaxation coefficients (λ) used in the field update algorithms. Judiciously chosen relaxation coefficients are critical for both the stability and convergence of field-based simulations, yet their selection is challenging when the number of unique chemical species in the system is large. In this work, we develop a new method to automatically and efficiently locate optimal relaxation coefficients in systems with large numbers of species. We begin by analyzing the effects of relaxation coefficients in two- and three-species systems and demonstrate that regions of high-performance are both narrow and system-specific. Based on these findings, we next develop a method based on Bayesian optimization that automatically locates relaxation coefficients that are stable and exhibit good performance. We demonstrate that our method is considerably faster than naive search methods and becomes particularly efficient as the system complexity increases. This work demonstrates that Bayesian optimization can be used to stabilize and accelerate field-based simulations that contain many different chemical species.
Design, System, ApplicationField-theoretic simulations (FTS) and self-consistent field theory (SCFT) are powerful tools for probing the phase behavior of multi-component polymer systems, but their efficiency and stability is highly sensitive to the choice of numerical relaxation coefficients used to evolve the auxiliary fields. In this work, we introduce a Bayesian optimization framework to systematically identify optimal relaxation coefficients that accelerate convergence and improve numerical stability. By combining field-based polymer models with surrogate modeling and adaptive parameter tuning, our approach dramatically reduces simulation cost, even in high-dimensional systems with ten or more distinct components. We demonstrate broad utility across both SCFT and FTS, including systems with explicit solvents. Our workflow is broadly applicable to polymer physics, biomolecular design, and coarse-grained modeling efforts where field-based methods are used. Looking forward, this approach enables high-throughput exploration of chemically specific systems and introduces a general framework for intelligent parameter tuning in field-based simulations of systems that are otherwise challenging or impossible to simulate. |
Most prior studies using field-theoretic simulations have employed models that contain only two species or bead types, typically denoted by A and B. In the majority of these past models, these two species interact through a Flory–Huggins parameter χ and require two auxiliary fields to decouple the pairwise interactions: a pressure-like field w+ and an exchange field w−. Field-theoretic simulations of these models have been extensively used to examine the self-assembly of homopolymers, block polymers and blends thereof.5,9–17 Another commonly employed model instead permits these two monomers to carry charges and to interact through coulombic interactions.6,18–20 Field-theoretic simulations of this model have been used to examine a variety of polyelectrolytes and polyampholytes and also require two fields: an auxiliary field w to decouple the excluded volume interactions and an electrostatic field φ to decouple the electrostatic interactions.
In recent years, there has been a considerable effort to extend field-theoretic simulations to systems that contain more than two species. One of the primary motivations for these efforts is that multi-species FTS can incorporate more chemical specificity than traditional two-species FTS. For example, whereas each species in traditional FTS typically represents a monomer or collection of monomers, each species in a multi-species FTS can now represent a collection of atoms or distinct chemical groups. This allows multi-species FTS to incorporate more chemical information and can result in models that are increasingly predictive. Fredrickson, Shell and co-workers have recently developed a powerful approach to parameterize molecularly informed field theories using atomistic particle-based simulations.21,22 Multi-species field-based simulations parameterized using this approach have been used to examine surfactant phase behavior,23 the critical micelle concentration of intrinsically disordered proteins,24 cellulose acetate solubility,25 and block copolymer solution self-assembly.26 Independent work by Pert et al. have also used multi-species field-based simulations to examine the morphologies formed by mRNA encapsulating nanoparticles.27
One of the major challenges with multi-species field-based simulations is that they can be difficult to numerically stabilize. Most existing multi-species field-based simulations rely on the framework developed by Düchs, Delaney and Fredrickson28 where the pairwise interaction matrix is diagonalized to determine the fields that will decouple the interactions within the model. A challenge with this approach is that most polymeric models result in a spectrum of eigenvalues that is very broad, which can lead to numerical challenges in the resulting simulations. The most common solution to this problem is to empirically tune the rate at which different fields are updated so that the fields associated with different eigenvalues are updated more or less quickly.28,29 While this empirical tuning can be tolerated if a small number of species are present, it becomes increasingly burdensome as the number of species within the system grows larger. Other strategies for handling these numerical difficulties are to ignore the effects of fluctuations by only focusing on SCFT,23–26 or to avoid numerical simulations altogether and to analyze multi-species field theories analytically.21,22 As a consequence of these numerical challenges, past work on fully-fluctuating FTS with many different species is still relatively rare.8,27
In this paper, our first objective is to explore why multi-species field-based simulations are difficult to stabilize numerically. To explore this question, we build on past work28,29 that focused on tuning the relaxation coefficients of the different auxiliary fields in order to enhance simulation performance. We first comprehensively examine a two-species system to examine the effects of polymer connectivity (e.g. diblocks vs. homopolymer blends), morphology (e.g. lamellar vs. gyroid phases), and fluctuations (e.g. SCFT vs. FTS). Our analysis shows that the qualitative effects of relaxation coefficients are generally conserved across these different systems. We next extend our analysis to multi-species systems and show that the effect of relaxation coefficients depends strongly on the interactions between the different species. Notably, we show that even if the eigenvalues of the interaction matrix are identical, the underlying effect of relaxation coefficients can still be quite different.
From these findings, we then turn to our second objective where we seek an automated strategy that can optimize the stability and performance of multi-species SCFT and FTS. We demonstrate that Bayesian optimization (BayesOpt) is particularly well-suited for this task and can be used to efficiently locate both stable and high-performing relaxation coefficients for SCFT and FTS. A key finding from our analysis is that tuning our BayesOpt implementation is critical for achieving good performance: the choices of surrogate model initialization, kernel, acquisition function and objective function all have a significant impact on overall performance. Nonetheless if these subtleties are accounted for, BayesOpt can lead to orders of magnitude improvements in multi-species SCFT and FTS performance. In summary, the strategy that we have presented here can both stabilize and accelerate multi-species SCFT and FTS and is envisioned to be useful as field-based methods are extended to systems with increased chemical specificity.
One key difference between our model and the multi-species exchange model of Düchs, Delaney and Fredrickson28 is that we regularize our model using a Gaussian smearing function.30,31 Formally, this approach involves a convolution of the density of each species K with a Gaussian distribution function, ΓK(r) = (2πa2K)−3/2exp[−r2/(2a2K)], in order to convert the microscopic species density K(r) into a smeared microscopic species density
, where K = {1, …, S} and * denotes a spatial convolution. Once these smeared densities have been defined, the non-bonded energy within the system becomes
![]() | (1) |
![]() | (2) |
By following the approach described in ref. 28, this model can exactly be converted into a field theory,
![]() | (3) |
![]() | (4) |
The final term to be specified in eqn (4) is Qm, the single-chain partition function of molecule m. Qm is a functional of the species fields Ω = {Ω1, Ω2, …, ΩS}
![]() | (5) |
![]() | (6) |
To sample the field-based partition function in FTS, field configurations for each exchange field μi(r) are evolved in fictitious time t using complex Langevin32,33 (CL) dynamics,
![]() | (7) |
An important aspect of eqn (7) to emphasize is that the CL dynamics are not physical and simply correspond to a numeric scheme to sample field configurations μ according to the weights exp(−H[μ]). As a consequence, the relaxation coefficients λ = {λ1, λ2, …, λS} are not constrained to any specific physical values and can instead be tuned to obtain optimal sampling performance. One of the central objectives of this work is to develop tools that can automatically tune these relaxation coefficients for systems with many components. This approach will be discussed extensively in section 3.
In order to numerically integrate eqn (7) we use a Euler–Maruyama predictor–corrector (EMPEC) algorithm which has second order accuracy in fictitious time. The EMPEC algorithm first conducts a predictor step which calculates a preliminary guess for the update field at time . This is followed by a corrector step which refines the field update by averaging the forces evaluated at the current step and the predicted step. The same noise ηi(t) is used in both steps28,29
![]() | (8) |
![]() | (9) |
For all simulation in this work we use ζN = 100, smearing length a = 0.15Rg, reference chain length N = 100, chain number density C = 10, and statistical segment length b = 1. The timestep Δt is held fixed at 1.0 for SCFT and at 0.02 for FTS. All SCFT and FTS were performed using an in-house software called OpenFTS.
In FTS, the objective of the simulation is to instead stochastically sample field configurations so that time averages of field-based operators converge to their equilibrium values. Since field-based operators at subsequent timesteps will typically be correlated, optimal performance in FTS is achieved by minimizing the number of timesteps required to obtain decorrelated samples of different field-based operators. Throughout this work, we quantify the performance of FTS by measuring the autocorrelation time of the excess chemical potential operator (Fig. S2). Shorter values of the correlation time in FTS correspond to better simulation performance. In the discussion of our results below, we denote both the performance of SCFT (i.e. timesteps to converge to a saddle point) and the performance of FTS (i.e. autocorrelation time) as P(λ), where the dependence of simulation performance on the relaxation coefficients λ is explicit.
In both SCFT and FTS, the performance will depend on the field configurations used to initialize the simulation. For SCFT, we use initial field configurations from a converged simulation at slightly different interaction parameters χ. For consistency throughout the many systems considered in this work, we evaluate SCFT performance using interactions parameters χN that are multiples of five using initial field configurations obtained using interaction parameters that are multiples of four. For example, when evaluating the performance of a simulation at χN = {5, 15, 20} we initialize these simulation using converged fields for χN = {4, 12, 16}. When initializing FTS, we first converge fields using SCFT for the given interaction parameters and then perform FTS on those fields until all operators reach their equilibrium values. Once equilibrium is reached, we then calculate the correlation time.
All SCFT calculations, excluding those for the double gyroid, are performed in a one-dimensional (1D) box with a length of 6Rg and 64 plane waves (M). The double gyroid simulations are run in a three-dimensional (3D) cubic box with length 9.6Rg and M = 64.3 SCFT is run until the summation of field forces over all fields is below 10−5 or for a maximum of 1500
000 time steps—1.5 times the steps required for the slowest relaxation coefficients to converge. FTS is run in a 3D cubic box with a length of 6Rg and M = 48.3 Only macrophase separating homopolymer blend systems are explored using FTS. When evaluating different relaxation coefficients, each λi is selected from 30 logarithmically spaced points ranging from 0.001 to 100. Further details are provided in the SI.
While BayesOpt is a widely applied method, our specific application of BayesOpt led to some challenges that warrant additional discussion. The first challenge is that many values of the relaxation coefficients, λ, result in simulations that are divergent or that do not converge (see Fig. 1 and 2). Since a nonconvergent simulation corresponds to performance P(λ) ≈ ∞, the objective function is rugged and contains many sharp features. We find that it can be challenging to approximate these sharp features with a Gaussian process surrogate model. A related challenge is that the optimal relaxation coefficients λ* are often in the immediate vicinity of these non-convergent regions and so the incorporation of these sharp features into the surrogate model are critical for good BayesOpt performance. By empirically varying many aspects of our BayesOpt implementation, we find that both these challenges could be addressed by using a suitably tuned Matern kernel,36 an acquisition function that favored exploitation,37 and a dynamically adjusted penalty associated with nonconvergent trajectories (see SI).
We also faced challenges with our BayesOpt implementation when we extended it to systems containing large numbers of species. In our BayesOpt implementation, the memory required to evaluate the acquisition function scales exponentially with the number of species in the system. As the number of species becomes large, this scaling can result in memory requirements exceeding 100 GB that preclude the use of our typical computational hardware. In order to overcome the memory requirements of these systems, we use an adaptive BayesOpt scheme where the range of the surrogate model is dynamically updated as the optimization proceeds (see SI). We find that our adaptive BayesOpt performs similarly to our original BayesOpt implementation despite using a fraction of the required memory (Fig. S7). Throughout this work, we use our adaptive BayesOpt for systems containing more than five species.
We also find that the relaxation coefficients λ used to initialize the surrogate model are extremely important for BayesOpt performance. Notably, the performance of BayesOpt suffered when it was initialized with conventional schemes such as randomly selected or space-filling relaxation coefficients. After trying numerous initialization schemes, we found that choosing relaxation coefficients that were proportional to the square root of the absolute value of that field's eigenvalue tended to give the best BayesOpt performance. Specifically, the initial relaxation coefficients are λi = α|Λi|1/2, where Λi is the eigenvalue defined in eqn (4) and α is a randomly selected proportionality constant that is the same for all fields i. One of the primary advantages of this approach is that the initial relaxation coefficients are chosen along a one-dimensional line, even if the overall dimension of the search space for λ is much larger. By initializing our surrogate model in this way, we are able to efficiently locate convergent relaxation coefficients that can be subsequently refined using BayesOpt. Additional details of our BayesOpt implementation are provided in the SI.
We begin by examining the number of timesteps required by SCFT to converge a diblock copolymer double gyroid phase for different relaxation coefficients λ = {λ1, λ2} (Fig. 1A). In these calculations, the A block fraction fA = 0.34, the segregation strength χN = 30, number of plane waves M = 643 and the initial fields are obtained from a converged unit cell at χN = 20 and fA = 0.37. From these calculations, we observe a wide variation in timesteps to convergence with respect to the relaxation coefficients λ (Fig. 1A). The performance of SCFT ranges from poorly performing relaxation coefficients that require greater than 106 timesteps to the best performing relaxation coefficients λ* = {14.2, 14.2} that converge in 140 timesteps. Moreover, many relaxation coefficients never yield a converged SCFT solution, either due to numeric instabilities (i.e. divergent trajectories) or the inability to reduce field errors beneath the prescribed threshold of 10−5. It is also noteworthy that the optimal relaxation coefficients λ* that give the best SCFT performance are immediately adjacent to these non-convergent relaxation coefficients.
In order to understand whether our findings in Fig. 1A generalize to other systems and conditions, we also examine the influence of relaxation coefficients on SCFT performance for a lamellar-forming diblock copolymer melt (fA = 0.5, χN = 30) and a symmetric two component homopolymer blend with = 0.5 and χN = 30 (Fig. 1B). These calculations are initialized from converged SCFT simulations at χN = 20 and fA = 0.5 for the diblock, and at χN = 20 and
A = 0.5 for the homopolymer blend.
In general, our observations for the gyroid phase are very similar to our observations for the lamellar phase and the homopolymer blend: the optimal relaxation coefficients λ* are many orders of magnitude faster than the poorly performing relaxation coefficients and the optimal relaxation coefficients λ* are adjacent to non-convergent relaxation coefficients. Furthermore, the general shape of the region containing convergent relaxation coefficients is essentially unchanged across these three different systems (Fig. 1B). It is also noteworthy that the general shape of these convergent regions is minimally affected by the convergence tolerance used in SCFT (Fig. S9).
We are also interested in how the choice of relaxation coefficients affect the performance of FTS (Fig. 1C). In our analysis, we consider a homopolymer blend with A = 0.5, χN = 30 and quantify the performance of FTS using the correlation time of the chemical potential operator (see Methods). In order to obtain accurate correlation time measurements when the relaxation coefficients λi are small, all simulations are run for 320
000 timesteps. Additionally, to account for the stochastic nature of FTS, three independent replicas are performed for every combination of relaxation coefficients and the mean correlation time is reported. If one or more replicas diverge before completing the total number of timesteps, then the corresponding relaxation coefficients are classified as non-convergent.
The performance of FTS is qualitatively similar to our results for SCFT in Fig. 1A and B. A notable difference between SCFT and FTS is that FTS requires values of λΔt that are approximately 50 times smaller than for SCFT (Δt = 0.02 for FTS, 1.0 for SCFT). Another difference is the less pronounced optimal region for FTS whereas SCFT has a sharply peaked region of optimal performance. Both of these differences associated with FTS are not surprising given the numeric challenges associated with integrating the stochastic CL dynamics in FTS, which require smaller timesteps and lead to broader optimal parameter regions due to the increased sensitivity to stochastic fluctuations as the relaxation coefficients increase. Nonetheless, if the different timesteps required by SCFT and FTS are accounted for, then the effects of λ on the stability of SCFT and FTS are largely in agreement (Fig. 1D).
A more subtle difference between SCFT and FTS is that equal-valued relaxation coefficients λ1 = λ2 are generally convergent in SCFT (Fig. 1A) but are unstable in FTS (Fig. 1C). It is also noteworthy that the optimal relaxation coefficients generally occur when λ1 ≈ λ2 in SCFT, but for λ1 ≠ λ2 in FTS. This observation underscores the difficulty of running efficient FTS compared to SCFT and is reasonable given the presence of fluctuations in FTS.
Taken together, our results in Fig. 1 demonstrate that tuning the relaxation coefficients λ is critical for both SCFT and FTS performance and that the choice of optimal relaxation coefficients λ* can lead to simulations that are orders of magnitude faster. We also demonstrate that the general effects of relaxation coefficients on simulation performance are qualitatively similar across different systems and simulation techniques. This overarching similarity suggests that an approach to optimize the relaxation coefficients in one context will likely be applicable to other contexts as well. In particular, it motivates a general-purpose strategy for optimizing the relaxation coefficients in SCFT/FTS. The goal of this paper is to describe such a strategy that can efficiently locate these optimal relaxation coefficients λ*. However, before we do this, it is useful to consider how the optimal relaxation coefficients λ* change as the number of species in the system increase beyond two.
One of the most obvious complexities that emerges from a three-species system is that three distinct exchange fields are now necessary to decouple the non-bonded interactions in the model. This now results in three relaxation coefficients λ = {λ1, λ2, λ3} and thus a higher dimensional space to search over in order to locate the optimal relaxation coefficients λ*. A more subtle, but nonetheless important, consequence of moving to three-species systems is that the mapping from the exchange fields μ to species fields Ω now depends on the interaction matrix χ. In order to illustrate this dependence, it is helpful to first consider a two species system where χ = χ12. For this two species system, the X matrix becomes
![]() | (10) |
![]() | (11) |
For systems with three or more species, this independence is lost and the mapping between the exchange fields μ and the species fields Ω will now depend on the interaction parameters χ. As a direct consequence of this interdependence, the relaxation coefficients λ will now depend on the interaction parameters χ. This means that it is generally more challenging to identify stable relaxation coefficients λ in systems with three or more species than in systems that contain only two.
In order to illustrate these challenges, we first examine a three component homopolymer blend consisting of polymers A, B and C, with volume fractions A =
B = 0.33,
C = 0.34, and interaction parameters χABN = 15, χBCN = 20 and χACN = 15. To keep calculations computationally tractable, our analysis will focus on SCFT, however, we expect qualitatively similar results if the analysis were instead performed using FTS (cf. Fig. 1D). In order to visualize the impact of relaxation coefficients λ for this system, we plot two different orientations of the complete three-dimensional data (Fig. 2A) and several two-dimensional slices at fixed values of λ2Δt (Fig. 2B). At least qualitatively, the effect of relaxation coefficients λ for this three species system is similar to our results for two species systems discussed above (Fig. 1). As with the two species systems, the performance of SCFT varies widely with respect to λ, with poor performing λ requiring close to 106 timesteps for convergence, while the optimal λ* = {9.2, 13.7, 32.3} converges in just 80 timesteps. The region of stable λ also narrows as λi values increase, resulting in a small region of optimal performance with the optimal relaxation coefficients λ* located immediately adjacent to non-convergent regions. It is also noteworthy that most λ fail to yield a converged SCFT solution.
Yet beyond these qualitative similarities to the two species systems, this three species system also contains additional complexities that make it more challenging to identify efficient relaxation coefficients λ. For example, in contrast to the two species system discussed in Fig. 1, this three species system does not converge if all relaxation coefficients are equal such that λ1 = λ2 = λ3 (Fig. 2A, dotted black line). This suggests that the tuning of relaxation coefficients for different exchange fields is essential for achieving stable SCFT/FTS of systems with more than two species. Another complexity is that the region of optimal relaxation coefficients is much narrower (Fig. 2B) and only exists for a small range of relaxation coefficients λ. This narrow region of optimal performance can result in over an order of magnitude increase in performance (e.g. Fig. 2B, λ2Δt = 0.08 versus λ2Δt = 13.7) and so there is a considerable computational benefit if this narrow region can be found.
Lastly, and perhaps most significantly, the performance of different relaxation coefficients in this three species system now depends on the interaction parameters χ. To illustrate this dependence, we have recomputed the performance of SCFT for this same three species system with slightly different interaction parameters χABN = 5, χACN = 10 and χBCN = 15 (Fig. 2C). This change in interaction parameters has a significant effect on the role of different relaxation coefficients and where the optimal relaxation coefficients might be found.
These results and the others presented in Fig. 2 collectively demonstrate the challenges of locating stable and high-performing relaxation coefficients in systems that contain three (or more) unique chemical species. In our treatment so far, we were able to locate these optimal relaxation coefficients through brute force sampling: we discretized the range of relaxation coefficients, explicitly performed simulations at each of these different relaxation coefficient values and then quantified the simulation performance. Since the overall search space increases exponentially with the number of species, this brute force approach quickly becomes intractable as the number of species increases beyond three. It is therefore of great interest to develop methods that can locate these optimal relaxation coefficients without brute force sampling.
One particularly attractive strategy would be to use a closed-form expression to determine the optimal relaxation coefficients from the interaction parameters χ. For example, Düchs, Delaney and Fredrickson28 previously suggested that the eigenvalue magnitude |Λi| might be related to the optimal relaxation coefficient of that field. Yet, Düchs, Delaney and Fredrickson28 also demonstrated that non-linear couplings between different fields through the ln
Qm term in eqn (4) makes this scheme too simplistic to yield optimal relaxation coefficients in practice. We also examined the possibility of closed-form predictions of the optimal relaxation coefficients but were unable to identify any general purpose approach that reliably worked for multi-species systems. In our analysis, we found that the closed-form prediction of relaxation coefficients is complicated by the fact that the optimal relaxation coefficients can vary for systems with different interaction parameters χ but equivalent eigenvalues Λ (Fig. S5), and for systems with equivalent interaction parameters χ, but different molecular architectures (Fig. 1B). Our conclusion is thus consistent with Düchs, Delaney and Fredrickson28 that numeric schemes are needed to determine the optimal relaxation coefficients λ*. In the following section, we demonstrate that Bayesian optimization is particularly well-suited for locating optimal relaxation coefficients in complex polymeric systems containing many chemical species.
In order to explore whether BayesOpt could be used to optimize the performance of SCFT and FTS, we now return to the simple two species homopolymer blend previously considered in section 3.1. In particular, we are interested in whether BayesOpt can automatically locate the optimal relaxation coefficients λ* that we previously located using brute force sampling. Fig. 3A shows a representative evolution of a BayesOpt-driven search. We find that BayesOpt quickly locates the region of optimal performance in approximately 10 iterations and then samples this region until the globally optimal relaxation coefficients λ* are found. In order to quantify the performance of BayesOpt, we compare the best performing λ found by BayesOpt to those obtained using random sampling (Fig. 3B). We see that BayesOpt reliably locates high-performing relaxation coefficients λ* in 20 evaluations and results in SCFT simulations that are 30 times faster than λ chosen at random.
While the performance achieved using BayesOpt is quite impressive, it is important to emphasize that the choice of hyperparameters can have a significant effect on BayesOpt performance. In general, we found that the performance of naive BayesOpt implementations were quite poor and that good performance required the careful tuning of the surrogate model kernel, the acquisition function and the objective function. The details of our specific BayesOpt implementation are described in section 2.3 and in the SI. It is also important to note that there are many strategies that could be used to generate relaxation coefficients in addition to BayesOpt and random sampling. We have chosen to compare these two methods in order to give a baseline measure of performance and not an exhaustive comparison of different methods.
Another metric that we use to quantify BayesOpt's performance is the fraction of total SCFT simulations that are non-convergent. As demonstrated in Fig. 1 and 2, it is common for the vast majority of relaxation coefficients λ to not yield a convergent SCFT solution. Thus in addition to finding λ that have good SCFT performance (Fig. 4A–D), BayesOpt quickly finds λ that are able to converge. While random sampling tends to find λ of which >75% are non-convergent, BayesOpt is able to focus its sampling on λ of which <30% are non-convergent on average (Fig. 4E–H). It is also noteworthy that the fraction of nonconvergent simulations tends to decrease throughout the course of a BayesOpt-driven search, resulting in simulations where only ≈10% are non-convergent for a five-species homopolymer blend. Thus, in addition to finding relaxation coefficients with good performance, BayesOpt is also able to focus its sampling on those relaxation coefficients that yield convergent SCFT solutions.
We next examine whether BayesOpt can be used to optimize the relaxation coefficients λ in FTS for systems with many species (Fig. 5). As we observed for SCFT, BayesOpt performs well for two- to five-species systems and results in 7- to 21-fold performance improvements relative to random sampling (Fig. 5A–D). BayesOpt also similarly locates convergent relaxation coefficients much more efficiently than random, with non-convergent fractions of ≈40% in BayesOpt on average versus >85% for random (Fig. 5E–H).
It is also illustrative to compare the performance of BayesOpt between SCFT (Fig. 4) and FTS (Fig. 5). In general, BayesOpt leads to more significant performance gains relative to random in SCFT than in FTS. One reason for this difference is a superficial artifact from how we have defined the correlation time of non-convergent trajectories in FTS. Since these three-dimensional FTS are considerably more expensive than these one-dimensional SCFT, we set the maximum number of timesteps to be considerably lower in FTS than in SCFT (5 × 104 versus 1.5 × 106). One direct consequence of this choice is that the maximum stable correlation time possible in our FTS is 1250 timesteps and non-convergent FTS trajectories are therefore assigned a correlation time of 1250 timesteps (see section 2 and SI). Furthermore, for computational efficiency, relaxation coefficients with correlation times exceeding this threshold are labeled as non-convergent. This choice limits the amount of information that BayesOpt can gain from each evaluation of FTS, and also sets a ceiling for the maximum correlation time which leads to more modest improvements in FTS (e.g. 7–21×) than in SCFT (e.g. 30–190×) relative to random. While we could increase the maximum number of timesteps in FTS, we feel that our choice is computationally prudent because it avoids wasting resources on relaxation coefficients with exceedingly large correlation times. In summary, our choice to truncate inefficient relaxation coefficients improves the absolute performance (i.e. walltime) of BayesOpt FTS at the expense of penalizing the relative performance of BayesOpt FTS compared to random.
Another important difference between BayesOpt in SCFT and FTS is the role of stochasticity in the evaluation of simulation performance. In SCFT, simulation performance is quantified by the number of timesteps to convergence, which is a deterministic quantity. In contrast, simulation performance in FTS is quantified using correlation time which is a stochastic quantity that will depend on the realization of the random noise described in eqn (7). We anticipate that accounting for this noise in our BayesOpt implementation could improve performance,34,35 but for simplicity we do not pursue these improvements here.
As an additional test for our BayesOpt implementation we consider two different systems. These systems represent the most complex systems we have examined thus far and serve to rigorously test whether BayesOpt can optimize relaxation coefficients under increased system complexity. For the first system, we choose a five-component mixture containing homopolymer chains with 100 or 150 discrete beads in an explicit solvent (details in SI). The asymmetric chain lengths coupled with the explicit solvent make it very difficult to obtain convergent FTS relaxation coefficients.
As before, we compare BayesOpt to random sampling over many independent replicas (Fig. 6A). Whereas the performance of randomly sampled relaxation coefficients is relatively constant near the maximum correlation time of 1250 timesteps, BayesOpt identifies relaxation coefficients with correlation times <100 timesteps after only 10 FTS evaluations. On average, this corresponds to a 33-fold performance improvement relative to random. It is also noteworthy that convergent relaxation coefficients are very difficult to locate for this system as >95% of relaxation coefficients sampled randomly are non-convergent (Fig. 6C). In contrast, we find that BayesOpt is able to focus its sampling towards more stable relaxation coefficients, typically achieving nonconvergent rates of ≈50% after its first 10 evaluations. Thus, whereas randomly sampled relaxation coefficients are completely ineffective at generating stable FTS for this system, BayesOpt is able to robustly locate stable relaxation coefficients with good performance.
For the second system, we examine a ten-component SCFT homopolymer blend in which each species is present at equal volume fraction and each homopolymer consists of 100 discrete beads. This system represents the highest dimensional search space considered in this study and poses a significant challenge for our optimization method due to the rapid increase in possible λ as the number of species increases. As the number of species increases beyond five, the computational and memory demands of BayesOpt become prohibitive due to our Gaussian process regression surrogate model. This is because the number of potential relaxation coefficients λ that BayesOpt needs to estimate grows exponentially with the number of species. Without a more efficient strategy for navigating the increasingly large search space, BayesOpt is computationally intractable for this system due to both time and memory constraints. To address these constraints, we develop a modified BayesOpt workflow, referred to as adaptive BayesOpt, where the resolution is lowered but the range of the search space is dynamically updated as BayesOpt samples λ (see section 2 and SI).
We apply adaptive BayesOpt to this system and compare its performance to random sampling (Fig. 6B). As in the previous examples, we track the best performing λ found after each evaluation. For random sampling, performance remains relatively poor as new λ are evaluated, with the best performing λ yielding only modest improvements in SCFT performance. In contrast, adaptive BayesOpt quickly identifies λ with substantially better performance. Adaptive BayesOpt then refines and improves performance over subsequent evaluations and ultimately locates λ with a 125-fold improvement in simulation performance relative to random. As with the five-component system, this improvement is not only due to BayesOpt finding better performing relaxation coefficients, but also due to a reduction in the fraction of non-convergent evaluations. As shown in Fig. 6D, random sampling continues to produce a high proportion of non-convergent λ (>80%), whereas adaptive BayesOpt instead prioritizes sampling in convergent regions of the search space (<10%). Taken together, these results demonstrate that even in multispecies settings where the search space becomes prohibitively large, BayesOpt can still efficiently identify high-performing and stable relaxation coefficients.
To investigate the transferability of our BayesOpt relaxation coefficients, we focus on the five-species homopolymer blend examined previously in Fig. 4D, H, 5D and H. The original interaction parameters for this system used with BayesOpt are denoted by χ0N and are given in Table S5. In order to calculate the transferability of the optimal λ* found by BayesOpt, we first define a perturbation magnitude ε that parameterizes the deviation of the new interaction parameters N, relative to their original values χ0N. Specifically,
N is obtained by multiplying each element of χ0N by a random factor between 1 − ε and 1 + ε. When examining a range of different ε, we select values of
N that are unique to a specific value of ε and that cannot be generated using other values. In order to obtain statistics for different values of ε, we choose five different values of λ* obtained from independent BayesOpt searches and generate 40 random realizations of
N per ε.
We first examine whether the optimal λ* yield convergent SCFT simulations for different values of ε (Fig. 7A). The fraction of non-convergent trajectories is observed to increase with ε and suggests a rather poor transferability of λ*, especially when ε is large. We hypothesize that this low transferability emerges because the optimal relaxation coefficients located by BayesOpt are deep inside the narrow optimal performance region discussed previously for two- and three-species systems in Fig. 2A and 3A, respectively. Though this optimal performance region results in exceptional performance for χ0N (Fig. 4D), it leads to poor transferability to the new values of N.
One natural solution to this problem is to instead modify λ* to slightly decrease performance in order to improve transferability. A simple approach to achieve this goal is to project λ* onto the vector given by the square root of eigenvalue magnitudes |Λi|1/2 for the unperturbed χ0N, compute the norm of this projection, and multiply the unit vector corresponding to |Λi|1/2 for the perturbed N by half this norm. This procedure to obtain a projected λ* effectively uses the original λ* to obtain a proportionality constant that is then used to set the relaxation coefficients for the new
N. We observe that these projected λ* exhibit excellent transferability and result in stable SCFT simulations even for ε = 1. The performance of the projected λ* are also comparable to the performance obtained if the original λ* are used (Fig. 7B). These results indicate that our simple projection strategy is very effective at producing relaxation coefficients for SCFT that are both high-performing and convergent for a wide range of different interaction parameters.
We next examine how the FTS performance of the λ* obtained using BayesOpt transfer to different values of N. In contrast to SCFT, we observe that the optimal λ* result in a low fraction of non-convergent FTS across all values of ε (Fig. 7C). Moreover, the correlation time in FTS only increases slightly with respect to ε and still yields relatively low correlation times even when ε is large (Fig. 7D). We attribute the excellent transferability of λ* in FTS to the relatively broad region of optimal performance in FTS, especially relative to SCFT (cf. Fig. 1A and C). Thus, even though BayesOpt leads to more modest performance gains in FTS relative to SCFT (Fig. 4 and 5), the transferability of λ* to different interaction parameters is quite good.
Taken together, the results presented in Fig. 7 demonstrate that the relaxation coefficients obtained using BayesOpt can result in convergent and high-performing SCFT and FTS calculations at different values of interaction parameters. We have also examined the transferability of relaxation coefficients to different polymer volume fractions and observe similar results (Fig. S8). These results demonstrate that a single BayesOpt calculation can be used to obtain relaxation coefficients that are broadly useful across a range of different systems and conditions.
To address these challenges, we develop a Bayesian optimization (BayesOpt) workflow tailored to optimize SCFT and FTS. Our results demonstrate that BayesOpt consistently outperforms random sampling and rapidly identifies high-performing relaxation coefficients while avoiding non-convergent regions of the search space. We show that our approach can optimize over high-dimensional systems with ten or more species that would otherwise be intractable. We also show that the relaxation coefficients obtained using BayesOpt transfer reasonably well to different systems and conditions. Furthermore, our approach is easily extended to other field update algorithms,29 alternative theoretical frameworks for performing multi-species FTS38 and is compatible with other recent numerical strategies for stabilizing FTS.39
Though our study has focused on BayesOpt, there are numerous other optimization algorithms that could also be used to identify relaxation coefficients. While we have not examined alternatives to BayesOpt in this work, there are several general criteria that should be satisfied in order for an algorithm to be suitable for relaxation coefficient optimization. The first and most important criteria is for the algorithm to minimize the number of required evaluations, especially for FTS where each evaluation can be very computationally expensive. Another criteria is for the algorithm to be able to locate the optimal relaxation coefficients despite their proximity to large non-convergent regions. These two criteria suggest that any successful algorithm should exhibit a multi-scale character: it should first make large changes to the relaxation coefficients in order to locate the optimal region and then it should focus on sampling this region despite the occurrence of non-convergent simulations. Our BayesOpt implementation is able to achieve this (cf. Fig. 3A) and we expect that other successful algorithms for relaxation coefficient optimization will do the same.
It is also important to note that there are numerous methods to locate the SCFT field configurations other than solving the fictitious dynamics (i.e. eqn (7) with ηi = 0) used throughout out this work. One prominent and powerful example is Anderson mixing40–42 which is implemented in the open-source PSCF code.43,44 Vigil, Delaney and Fredrickson recently compared the performance of SCFT using either fictitious dynamics or Anderson mixing and observed that both methods had similar performance so long as each method was properly tuned.29 Since this past work only considered two-species systems, it would be very interesting to compare how the performance of our optimal relaxation coefficients compare to Anderson mixing when the number of species becomes large.
Another interesting extension of our work would be to perform BayesOpt using a relatively loose convergence tolerance in SCFT. Since our initial results suggest that the relative performance of different relaxation coefficients is largely independent of convergence tolerance (Fig. S9), it could be possible to identify λ* more quickly by first performing BayesOpt using a loose tolerance and then performing production SCFT simulations with a tolerance that is tighter. This could dramatically accelerate the 10–100 SCFT evaluations typically required for BayesOpt to find the optimal relaxation coefficients, especially if three-dimensional SCFT simulations are required.
Taken together, our work represents an advance in the automation and acceleration of field-based simulations, especially when many species are present. By reducing the need for manual tuning, our BayesOpt method broadens the practical scope of SCFT and FTS and paves the way for high-throughput and chemically specific simulations in soft matter and biomolecular physics.
Data for this article, including raw data used to generate all figures, are available on Zenodo at DOI: 10.5281/zenodo.16817972.
This journal is © The Royal Society of Chemistry 2025 |