Open Access Article
Fanjin Wang,*ab Maryam Parhizkar,b Anthony Harker c and Mohan Edirisinghe a
aDepartment of Mechanical Engineering, University College London, London, WC1E 7JE, UK. E-mail: fanjin.wang.20@ucl.ac.uk
bSchool of Pharmacy, University College London, London, WC1N 1AX, UK
cDepartment of Physics and Astronomy, University College London, London, WC1E 6BT, UK
First published on 4th August 2025
Polymeric nanoparticles, by virtue of their miniature dimensions, play critical roles in tackling healthcare and energy challenges. However, tailoring synthesis processes to meet design targets has traditionally depended on domain expertise and trial-and-error. Modeling strategies, particularly Bayesian optimization (BO), facilitate the discovery of materials with maximized or minimized properties. Motivated by practical demands, this study introduces constrained composite Bayesian optimization (CCBO) to perform target-value optimization under black-box feasibility constraints for by-design nanoparticle production. In a synthetic problem simulating electrospraying, a representative nanomanufacturing process, CCBO avoided infeasible conditions and efficiently optimized towards predefined size targets, surpassing baseline methods and state-of-the-art optimization pipelines. CCBO also provided decisions comparable to those of experienced experts in a human vs. BO campaign. Furthermore, laboratory experiments validated CCBO guidance for synthesizing poly(lactic-co-glycolic acid) particles with diameters of 300 nm and 3.0 μm via electrospraying from minimal initial data. Overall, the CCBO approach represents a versatile and holistic optimization paradigm for next-generation, target-driven particle synthesis empowered by artificial intelligence (AI).
More recently, BO has been investigated for materials and drug discovery to assist in the identification of optimal properties.15–17 However, the application of BO in the targeted synthesis of materials presents two critical challenges. First, conventional BO was developed to seek a global maximum or minimum rather than to match a pre-defined target.18,19 The latter is a common requirement in materials development tasks, e.g., matching physiological mechanical properties for hydrogels and tailoring release profiles for drug delivery agents. Despite its relevance, the target-matching problem remains surprisingly underexplored in BO applications for materials discovery. This may be attributed to the prevailing emphasis on discovering materials with extreme or superior properties rather than materials that meet specific design criteria. The second challenge concerns experimental feasibility constraints. The majority of current applications of BO in materials discovery and development do not incorporate feasibility.17,20,21 Nevertheless, BO recommendations can raise a myriad of practical concerns in laboratory experiments, such as impossible combinations of material compositions, incompatible processing parameters, and apparatus limitations. Shrinking the variable boundaries to a practical region is a direct solution, at the cost of a reduced search space. In special cases in which known constraints on the input variables are available (e.g., as inequalities), optimization can be performed subject to these constraints.22,23 For example, Li et al. nested an active learning loop for constraint modelling to restrict the candidate space selectable by BO.24 Low et al. suggested evolution-guided Bayesian optimization, which imposes known constraints on multi-objective optimization problems, for nanoparticle synthesis in microfluidics.25 However, these strategies become inapplicable when the constraints are unknown a priori and must be evaluated through laboratory experiments.
Several prior works on constrained or composite BO have explored applications in hyperparameter tuning. In the area of constrained BO, Gramacy and Lee proposed weighting the expected improvement (EI) acquisition function with a modelled probability to enforce a preference for feasible candidates.26 Gardner et al. extended this approach to inequality constraints, assuming that feasibility could be derived from a continuous-valued constraint function.27 More recently, Tian et al. proposed a boundary exploration method that relaxes the acquisition function weights to encourage exploration near the constraint boundaries.28 In the area of composite BO, Uhrenholt and Jensen investigated target-value optimization, specifically minimizing a 2-norm, by warping the GP to a noncentral chi-squared distribution.29 As an improvement, Astudillo and Frazier addressed the more general problem of composite BO for an arbitrary composite function over the objective, transforming the Gaussian posterior in the acquisition function directly with the composite function.30 Although these strategies have been rigorously tested on synthetic benchmarks and hyperparameter optimization tasks, they have yet to be integrated into a combined framework to facilitate guided laboratory experiments.
Here, we implement a constrained composite Bayesian optimization (CCBO) pipeline that efficiently identifies suitable processing parameters for the rational synthesis of polymeric particles. By introducing a variational-inference GP component, the black-box experimental feasibility is modeled and incorporated into the BO acquisition function, while the composite BO component models the experimental parameters and targets the particle size through a composite objective function. Amongst the various particle fabrication techniques, electrospraying was selected as the model technique based on its simplicity, versatility, and precision as a popular manufacturing method in drug delivery research.31 This technique utilizes electric fields to deform the meniscus of a polymer solution into fine jets, which eventually disintegrate into fine droplets. As these droplets travel towards a collector, they further shrink and solidify due to solvent evaporation. Various parameters in the electrospraying process, such as the flow rate, voltage, polymer concentration, and solvent, can be adjusted to tailor the product characteristics, although the intertwined impact of these factors can lead to prolonged, if not infeasible, trial-and-error experiments.32 We demonstrate the superior performance of CCBO in target parameter optimization compared to a random baseline and conventional BO strategies through both synthetic data and wet-lab experiments for poly(lactic-co-glycolic acid) (PLGA) particle synthesis at multiple size targets.
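To make the target-matching optimization loop concrete, the sketch below runs a toy one-variable BO round trip in plain NumPy: fit a GP surrogate to observed sizes, score candidates with a crude Monte Carlo composite-improvement criterion, run the best candidate, and repeat. The kernel length-scale, the `experiment` stand-in, and all numerical settings are illustrative assumptions, not the paper's implementation (which uses BoTorch with mixed inputs and feasibility weighting):

```python
import numpy as np

def rbf(a, b, ls=0.2):
    """Squared-exponential kernel on 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(x_train, y_train, x_grid, jitter=1e-6):
    """Zero-mean GP posterior mean and variance on a grid of candidates."""
    k = rbf(x_train, x_train) + jitter * np.eye(len(x_train))
    k_star = rbf(x_train, x_grid)
    mean = k_star.T @ np.linalg.solve(k, y_train)
    var = 1.0 - np.sum(k_star * np.linalg.solve(k, k_star), axis=0)
    return mean, np.clip(var, 1e-12, None)

def experiment(x):
    """Toy stand-in for electrospraying: particle size from one process knob."""
    return 10.0 * x ** 2  # size in micrometers

target = 3.0                          # design target (um)
x_grid = np.linspace(0.0, 1.0, 201)   # candidate settings
x_obs = np.array([0.1, 0.9])          # initial experiments
s_obs = experiment(x_obs)
rng = np.random.default_rng(0)

for _ in range(6):
    mean, var = gp_posterior(x_obs, s_obs, x_grid)
    # Composite improvement: apply g(s) = -(s - target)^2 to posterior draws
    # of the size, then average the improvement over the best distance so far.
    draws = mean + np.sqrt(var) * rng.standard_normal((256, len(x_grid)))
    g_best = -np.min(np.abs(s_obs - target)) ** 2
    acq = np.maximum(-(draws - target) ** 2 - g_best, 0.0).mean(axis=0)
    x_next = x_grid[int(acq.argmax())]
    x_obs = np.append(x_obs, x_next)
    s_obs = np.append(s_obs, experiment(x_next))

best_size = s_obs[np.argmin(np.abs(s_obs - target))]
```

The same loop structure extends to batched proposals (q = 2), mixed categorical inputs such as the solvent, and the feasibility weighting that distinguishes CCBO.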
Poly(lactic-co-glycolic acid) (PLGA, 50:50 ratio) was purchased from Corbion (Amsterdam, The Netherlands). Chloroform and N,N-dimethylacetamide (DMAc) were purchased from Sigma-Aldrich (Gillingham, UK).
For a single candidate x0, the EI acquisition function has the closed form α_EI(x0) = E[max(f(x0) − f*, 0)] = (μ(x0) − f*)Φ(z) + σ(x0)φ(z), where z = (μ(x0) − f*)/σ(x0), Φ and φ denote the standard normal CDF and PDF, μ(x0) and σ²(x0) are the posterior mean and variance from the Gaussian process at x0, and f* is the current best observation. As the calculation of the expectation requires integrating over the posterior, it becomes analytically intractable under a batched scenario where q > 1. We followed the strategy in BoTorch in which Monte Carlo sampling was used to approximate the expectation as:

α_qEI(x) ≈ (1/N) Σ_{t=1}^{N} max_{j=1,…,q} max(ξ_t^{(j)} − f*, 0), ξ_t ∼ p(f(x) | D) (1)
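As a minimal illustration of this Monte Carlo estimator, assuming joint posterior draws are already available as an (N, q) array; the helper name and toy numbers below are ours, not the paper's:

```python
import numpy as np

def qei_monte_carlo(posterior_samples, f_best):
    """MC estimate of batched expected improvement (qEI).

    posterior_samples: (N, q) joint draws from the GP posterior at the
    q candidate points; f_best: current best observed objective value.
    """
    improvement = np.maximum(posterior_samples - f_best, 0.0)
    # Within each joint draw, only the best of the q candidates counts.
    return float(improvement.max(axis=1).mean())

rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=0.5, size=(512, 2))  # toy posterior draws
alpha = qei_monte_carlo(samples, f_best=1.2)
```

Averaging the per-draw best improvement over many joint samples is exactly what makes the q > 1 case tractable without an analytic integral.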
For the distance objective, the incumbent best observation is g*, which represents the current closest distance (with respect to the target) achieved. Notably, the data D consisted of {(x_i, y_{o,i})}_{i=1}^{n}, where y_{o,i} = g(s_i) = −(s_i − s_o)² and s_o represents the target value (predefined as a constant). Under such a configuration, this vanilla BO pipeline could help to identify suitable experiment variables X that maximize the negative distance measure y_o.
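The objective transform used by the vanilla pipeline is a one-liner; a sketch with a hypothetical 3.0 μm target:

```python
import numpy as np

def negative_squared_distance(sizes, target):
    """y_o = g(s) = -(s - s_o)^2, the quantity vanilla BO maximizes."""
    return -(np.asarray(sizes, dtype=float) - target) ** 2

# Sizes closer to the target score higher (less negative).
y = negative_squared_distance([1.0, 2.5, 3.0, 6.0], target=3.0)
```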
Furthermore, the feasibility component was introduced to learn the black-box constraints in the experiments. Here, a variational Gaussian process was implemented for the binary classification of experimental success or failure.34 The details of variational inference in Gaussian process classification have been described in previous publications.35 Briefly, the latent Gaussian process is warped with a probit link to limit the output to between 0 and 1, approximating a Bernoulli posterior. The latent Gaussian process used the same constant-mean prior and kernel functions to incorporate mixed inputs. To incorporate feasibility modelling in the Bayesian optimization process, we followed the strategy proposed earlier26 to extract the posterior probability of feasibility as a scaling factor in the acquisition function: α_cEI(x) = Pr(c(x) = feasible) · α_qEI(x).
Incorporating this factor in the acquisition function allowed the suppression of the values of experiments that are potentially infeasible, creating our constrained BO pipeline.
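A sketch of this feasibility weighting, with a probit warp applied to latent classifier outputs; the latent values and candidate scores are invented for illustration:

```python
import math
import numpy as np

def probit(latent):
    """Probit link: squash a latent GP value into a probability in (0, 1)."""
    return 0.5 * (1.0 + math.erf(latent / math.sqrt(2.0)))

def constrained_acquisition(acq_values, latent_feasibility):
    """Weight acquisition values by the modelled feasibility probability,
    suppressing candidates the classifier deems likely to fail."""
    p_feasible = np.array([probit(f) for f in latent_feasibility])
    return np.asarray(acq_values) * p_feasible

acq = np.array([0.8, 0.9, 0.3])      # unconstrained acquisition values
latent = np.array([2.0, -2.0, 1.0])  # latent feasibility GP outputs
weighted = constrained_acquisition(acq, latent)
best = int(weighted.argmax())        # candidate 1 is suppressed despite high EI
```

Note how the second candidate, despite the highest unconstrained score, is effectively vetoed by its low feasibility probability.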
In both the vanilla and constrained BO pipelines, the Gaussian process modeled yo and attempted to maximize this negative distance. As a different strategy, the composite BO used a Gaussian process to directly model the particle size s. The composite part, namely, the negative squared distance function g, was separated from the input data. Instead, the distance function was directly applied to the Gaussian posterior in the acquisition function:36
α_qEICF(x) ≈ (1/N) Σ_{t=1}^{N} max_{j=1,…,q} max(g(ξ_t^{(j)}) − g*, 0), ξ_t ∼ p(s(x) | D) (2)
where the draws ξ_t are now posterior samples of the particle size s. By coupling the composite acquisition function α_qEICF with the constraint probability, we obtain the acquisition function for CCBO: α_qCCBO(x) = Pr(c(x) = feasible) · α_qEICF(x).
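Combining the two ideas, a sketch of the CCBO ranking rule under the same Monte Carlo scheme (toy posterior draws and a hypothetical 3.0 μm target; not the BoTorch implementation):

```python
import numpy as np

def qei_cf(size_draws, target, g_best):
    """Composite MC-EI: apply g(s) = -(s - target)^2 to posterior draws of
    the particle size, then average improvement over the best distance."""
    g = -(size_draws - target) ** 2
    return float(np.maximum(g - g_best, 0.0).max(axis=1).mean())

def ccbo_acquisition(size_draws, target, g_best, p_feasible):
    """Feasibility-weighted composite acquisition for ranking candidates."""
    return p_feasible * qei_cf(size_draws, target, g_best)

rng = np.random.default_rng(1)
near = rng.normal(3.0, 0.1, size=(512, 2))  # draws near the 3.0 um target
far = rng.normal(8.0, 0.1, size=(512, 2))   # draws far from the target
a_near = ccbo_acquisition(near, target=3.0, g_best=-1.0, p_feasible=0.9)
a_far = ccbo_acquisition(far, target=3.0, g_best=-1.0, p_feasible=0.9)
```

Because the distance function is applied inside the acquisition rather than to the training data, the GP only ever models the raw particle size.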
In the present work, the Monte Carlo sampling number N was 512 and the batch size q was fixed to 2 throughout all BO pipelines. All inputs X were normalized to the unit cube, and the flow rate variable was log-transformed before normalization. The outcomes of the objective component, including the distance variable y_o in vanilla BO and constrained BO, as well as the particle size variable s in CCBO, were standardized to zero mean and unit variance. The outcomes of the feasibility component y_c were rescaled to {−1, 1}.
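These input and outcome transforms can be sketched as follows; the function names are ours, and the flow-rate bounds are taken from the search-space table below:

```python
import numpy as np

def scale_log_flow_rate(flow_rate, bounds=(0.01, 60.0)):
    """Log-transform the flow rate (uL/min), then min-max scale to [0, 1]."""
    lo, hi = np.log(bounds[0]), np.log(bounds[1])
    return (np.log(np.asarray(flow_rate, dtype=float)) - lo) / (hi - lo)

def standardize(y):
    """Zero-mean, unit-variance scaling for objective outcomes."""
    y = np.asarray(y, dtype=float)
    return (y - y.mean()) / y.std()

def rescale_feasibility(yc):
    """Map {0, 1} feasibility labels to {-1, 1}."""
    return 2 * np.asarray(yc, dtype=float) - 1

x = scale_log_flow_rate([0.01, 60.0, 1.0])  # endpoints map to 0 and 1
y = standardize([0.56, 15.0, 6.26, 0.15])
c = rescale_feasibility([1, 0, 1])
```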
| Label | Polymer concentration (% w/v) | Flow rate (μL min⁻¹) | Voltage (kV) | Solvent |
|---|---|---|---|---|
| Bounds | [0.05–5.00] | [0.01–60.00] | [10.0–18.0] | {CHCl3, DMAc} |
| S-1 | 0.50 | 15.00 | 10.0 | DMAc |
| S-2 | 0.50 | 0.10 | 10.0 | CHCl3 |
| S-3 | 3.00 | 20.00 | 15.0 | DMAc |
| S-4 | 1.00 | 20.00 | 10.0 | CHCl3 |
| S-5 | 0.20 | 0.02 | 10.0 | CHCl3 |
The regret, defined as the closest distance to the target particle size, was plotted at each iteration. The experimental variables proposed in a typical run were visualized on 3D plots with symbols representing solvent and feasibility, and colors encoding the iteration. The area under the curve (AUC) for each strategy and human participant was calculated using the trapezoidal rule for quantitative comparison. One-tailed Mann–Whitney U-tests were performed with the alternative hypothesis being that CCBO had a smaller AUC/regret than the BO baseline or human groups, respectively.
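The AUC comparison reduces to a trapezoidal sum over each regret curve; a sketch with invented regret traces (unit iteration step):

```python
import numpy as np

def regret_auc(regret_curve):
    """Trapezoidal-rule area under a regret-vs-iteration curve (unit step);
    a lower AUC means the strategy approached the target size faster."""
    r = np.asarray(regret_curve, dtype=float)
    return float(np.sum((r[:-1] + r[1:]) / 2.0))

fast = regret_auc([5.0, 1.0, 0.1, 0.0])  # rapid convergence -> small AUC
slow = regret_auc([5.0, 4.0, 3.5, 3.0])  # slow convergence -> large AUC
```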
As a benchmark, CCBO was compared with a random baseline, vanilla BO, and constrained BO over 10 iterations. Five initial experiments were included, accounting for successful and failed cases for both solvents. The optimization target was set to 18 μm. Results for other target sizes, including 0.6, 3 and 6 μm, can be found in SI Fig. 1. In each iteration, two sets of processing parameters were proposed and subjected to simulation functions to retrieve the synthetic experimental result as well as the feasibility. The regret, defined as the difference between the target and the closest candidate, was recorded after each iteration as a measure of performance (Fig. 1c). After 10 iterations, the random baseline reached 0.8 μm regret, while vanilla BO and constrained BO both achieved around 0.4 μm regret. In contrast, the CCBO algorithm rapidly converged to the targeted diameter after only two iterations with minimal regret. Moreover, the AUC of each strategy was calculated using the trapezoidal rule to quantify optimization efficiency. CCBO achieved a minimal AUC of 2.47 ± 0.85, which was significantly lower than that of the random baseline (19.48 ± 8.12, p < 0.0001), vanilla BO (18.35 ± 3.86, p < 0.0001) and constrained BO (16.26 ± 3.73, p < 0.0001) under the one-tailed Mann–Whitney U-test.
The benchmark for synthetic electrospray was extended to compare CCBO with state-of-the-art optimization methods such as Summit,38 Dragonfly,39 EDBO+,40 and Atlas.41 Notably, the implementations in Summit and Dragonfly did not support optimization under unknown constraints; therefore, their performance was similar to that of the vanilla BO baseline. EDBO+ was developed as a pool-based active learning optimization platform. The EDBO+ algorithm did not provide improvement in regret, potentially due to its lack of support for constrained optimization and the limitation of a pool-based search space compared to other strategies. Finally, the most recently developed approach, Atlas, a framework library for self-driving laboratories by Hickman et al., utilized a variational GP to model unknown constraints for experimental feasibility.42 The optimization by Atlas with a priori unknown constraints showed better performance than the other existing strategies in the benchmark. However, as none of the state-of-the-art libraries natively support target-value optimization, none of these strategies outperformed the CCBO algorithm proposed herein for electrospray optimization. More detailed results can be found in SI Fig. 1. These results highlight the importance of incorporating both a constrained and a composite optimization scheme for target-driven design problems in experiments.
To understand the recommendation process, the proposed experiments were visualized in Fig. 1d. The random baseline sampled uniformly across the experiment space with both solvents, resulting in many failed DMAc experiments due to the flow-rate feasibility constraints. Vanilla BO started by exploring the boundary conditions in the first few rounds. With an additional model to account for feasibility, the constrained BO algorithm managed to learn the feasible region for DMAc, as reflected by most DMAc experiments being recommended at lower flow rates. This corresponded well to the initial feasible zone visualized in Fig. 1b. In addition, the numbers of failed and successful attempts of each algorithm were plotted in Fig. 1e, highlighting the reduction in infeasible experimental conditions with the help of the additional constraint model.
Furthermore, the CCBO strategy was observed to search highly efficiently in a localized experiment space (Fig. 1d). The performance of CCBO can be explained by its design. Vanilla BO and constrained BO directly minimized the distance, forcing the surrogate GP to model the more complicated composition of the experimental response and the superimposed distance function. In CCBO, by contrast, the GP was solely used to model the black-box experimental results. Our observations with CCBO echo the findings in the composite BO literature: extracting the analytically tractable part from the black-box function can drastically improve optimization efficiency.30 In standard BO, the EI acquisition function assumes a Gaussian posterior distribution. However, the posterior of the composite function becomes non-Gaussian after transformation with a non-linear function. To address this, Astudillo and Frazier suggested leaving the GP to model the black-box function; the composite part was instead incorporated into the acquisition function to transform the Gaussian posterior of the black-box function. This allows more efficient optimization through a closer approximation of the posterior distribution in a composite scenario.36 In our implementation, the composite acquisition function was optimized in the CCBO pipeline with Monte Carlo sampling. Through the benchmark validation, we have shown that vanilla BO or constrained BO alone could not efficiently optimize our design problem, highlighting the importance of their integration in CCBO.
Finally, we compared CCBO to human electrospray users with varying levels of expertise in this synthetic campaign. More experienced users were expected to approach the target more efficiently, as they were equipped with prior knowledge of the influence of the parameters and the selection of solvent. All participants (N = 14) evaluated the same initial experimental data and suggested experiments to achieve a target particle size of 3 μm. The comparative results are plotted in Fig. 2a. More detailed human performance results are available in SI Table 1 and Fig. 2. In the first iteration, the CCBO strategy was behind intermediate (1–3 years of experience, N = 4) and advanced users (≥3 years of experience, N = 4), and performed similarly to beginners (<1 year of experience, N = 6). However, CCBO soon overtook intermediate users from the second iteration onwards and surpassed advanced users in later iterations. Quantitatively, the AUC was calculated and plotted (Fig. 2b) with respect to each strategy or human group, and CCBO (1.40 ± 0.10) exhibited a significantly smaller (p = 0.01) AUC than beginners (2.62 ± 1.19) under the one-tailed Mann–Whitney U-test. There were no significant reductions in AUC for CCBO compared to intermediate (1.60 ± 0.41, p = 0.34) or advanced (1.03 ± 0.42, p = 0.95) users. When focusing on overall performance (regret at the final iteration), the regret of the CCBO strategy was significantly lower than that of intermediate (p = 0.02) and beginner (p < 0.0001) users. Further analysis of parameter selection strategies revealed that advanced users predominantly followed a one-factor-at-a-time (OFAT) approach, resulting in linear adjustment patterns (Fig. 2c). Most beginner and intermediate users attempted to adjust multiple parameters simultaneously. Unlike human participants, CCBO employed more strategic exploration and exploitation, effectively reducing experimental regret by targeting promising regions in the parameter space.
Taken together, these findings demonstrated that CCBO could achieve performance comparable to highly experienced participants and navigate complex experimental spaces more effectively than human users. In addition, the performance differences among users with various levels of expertise reflected the successful development of the synthetic problem simulating electrospraying, consolidating our confidence in proceeding to laboratory validation.
| Label | Polymer concentration (% w/v) | Flow rate (μL min⁻¹) | Voltage (kV) | Solvent | Mean size (μm) | Feasible? |
|---|---|---|---|---|---|---|
| 0-1 | 2.40 | 1.73 | 14.0 | DMAc | 0.56 | 1 |
| 0-2 | 4.06 | 0.44 | 15.7 | CHCl3 | 1.00 | 0 |
| 0-3 | 2.88 | 49.11 | 11.8 | DMAc | 15.00 | 0 |
| 0-4 | 0.76 | 0.01 | 17.6 | CHCl3 | 1.20 | 0 |
| 0-5 | 0.11 | 10.43 | 14.5 | CHCl3 | 6.26 | 1 |
| 0-6 | 3.55 | 0.06 | 12.8 | DMAc | 0.15 | 1 |
| 0-7 | 4.55 | 2.39 | 16.7 | CHCl3 | 5.24 | 1 |
| 0-8 | 1.88 | 0.21 | 11.0 | DMAc | 1.12 | 1 |
Two particle sizes, 300 nm and 3.0 μm, were set as the design targets based on pharmaceutical interest in drug carriers for intravenous injection and pulmonary delivery.4 Based on previous reports, the production of PLGA particles at these two sizes requires distinct processing parameters involving different solvents and flow rates.32,43 Thus, these targets simulate distinct experimental scenarios to challenge BO pipelines. The workflow of targeted particle production under CCBO guidance is illustrated in Fig. 3a. With the initial data gathered, a CCBO pipeline was implemented to propose two experiments in parallel for laboratory investigation. Proposing two experiments per iteration matched the laboratory capacity and avoided wasting materials and preparation time. After sample collection and characterization, the results from triplicate experiments were evaluated and compared with the target. The next iteration of BO was then performed with the new data added.
The parameters proposed by CCBO are visualized using heatmaps in Fig. 3b. The heatmap of the initial experiments reflects the selection of diverse parameters in the Sobol sequence. In total, three iterations of BO were performed for the target of 300 nm and four iterations for the 3.0 μm target. The selection of solvent was the most obvious difference for these two targets. Indeed, in previous reports of PLGA particle synthesis, DMAc is a popular solvent due to its high boiling point.44 From a mechanistic viewpoint, droplets will experience fission due to the competition between coulombic repulsion and liquid surface tension in an electrospraying process.45 At the same time, the evaporation of solvents increases the concentration and viscosity of the droplet. As a non-volatile solvent, DMAc allows this fission process to fully develop and thus generates sub-micrometer particles.32 Chloroform, on the contrary, is preferred in the literature to produce larger particles within the tens of micrometers range.46 These practical considerations, which are normally accumulated through experience and trial-and-error, were also picked up by the BO pipeline. The recommendations provided by CCBO clearly showed a trend of adopting DMAc for the 300 nm target and chloroform for the 3.0 μm target.
Linking the recommendations to the experimental results (Fig. 3c) provides a more holistic view of the selection strategy of CCBO. For the 300 nm target, the best candidate in the initial experiments (0-8 in Table 2) used DMAc with a low polymer concentration, flow rate and voltage to obtain 0.15 μm particles. The recommendations from the CCBO pipeline showed exploration of higher concentrations and fine-tuning of the flow rate parameter (SI Table 2). Interestingly, the 3-1 and 3-2 experiments both achieved a 300 nm particle size with distinct processing parameters, suggesting that the impact of the less-concentrated polymer solution was compensated by the higher flow rate used for 3-1. Furthermore, the exploration–exploitation balance of the EI acquisition function was further demonstrated through the experiment series for the 3.0 μm target. In the first iteration, CCBO attempted the use of both DMAc and chloroform as the solvent (SI Table 3). The second iteration tested the lowest polymer concentration (0.05% w/v), shown as the lightest green in the heatmap (Fig. 3b). Finally, the recommendations settled at higher concentrations with reduced flow rates to approach the target through exploitative fine-tuning. It was also observed from the SEM images (Fig. 3d) that experiment 1-2 for the 3.0 μm target managed to produce 2.69 μm particles with rough and polydisperse characteristics using a low polymer concentration (0.36% w/v) sprayed at a high flow rate of 3.65 μL min⁻¹. The final experiment 4-2 suggested a 4.02% w/v solution sprayed at 1.08 μL min⁻¹ (SI Table 3) to obtain 3.29 μm diameter particles. This result again highlighted the ability to achieve similar particle sizes by balancing polymer concentration and flow rate, together with adjusting other parameters. The SEM images of the final iteration experiments show satisfactory particle production at the targeted sizes.
Overall, we have verified the performance of CCBO in the automatic identification of the experiment feasibility region and rapid convergence to design targets through synthetic data validation. Comparison with human experts demonstrated the competitive performance of CCBO. The rational exploration of the experiment space outperformed the instinct-driven OFAT trial-and-error approach of humans. As a further step, wet-lab experiments consolidated the potential of CCBO in real-world applications for guided particle synthesis within a few iterations.
In addition, the innate exploration–exploitation trade-off of BO made possible the identification of multiple possible experimental parameters that can achieve the same design target. This is especially helpful when other design considerations coexist. For example, in the validation with the synthetic problem (Fig. 1f), CCBO attempted to use both DMAc and chloroform and paired them with a wide range of other processing parameters to hit the design target in iterations 6 to 10. From the perspective of production rate, a higher flow rate and polymer concentration might be preferred. Similarly, if the sustainability of the solvent is considered, DMAc would be selected over chloroform as a less harsh solvent. Besides the synthetic data, laboratory experiments also managed to find multiple parameters to produce particles with 300 nm or 3.0 μm diameter. These particles exhibited distinctive morphology and polydispersity, demonstrating varying characteristics for their applications. Although not explicitly coded as a multiple-objective optimization problem, these sets of experimental parameters could be presented to the user as alternative choices. In practice, such flexibility allows the researcher to consider product properties, manufacturing metrics, or other aspects in production without changing the main design target.
Since only two solvents were investigated in the present work, categorical representations of the solvent variable were used instead of molecular featurization. Many modern BO libraries designed for chemistry and materials research, such as Atlas and GAUCHE, support molecular featurization.41,49 Featurizing molecules with their physicochemical properties could incorporate chemical knowledge into the optimization process and benefit molecular structure optimization and discovery tasks. For example, Griffiths et al. managed to leverage BO to optimize molecular design in a latent space generated from variational autoencoders.50 Although optimizing the solvent molecule per se was not necessarily a focus in particle synthesis applications, leveraging molecular fingerprints to represent solvents would equip the optimization process with chemically meaningful knowledge (by representing similar solvents with close descriptors).51 In addition, extending the present single-objective optimization paradigm to multiple objectives could benefit more complicated particle design tasks, including the control of both particle size and size distribution, or morphological features. Our implementation of constrained optimization was through feasibility-weighting of the acquisition function; this should extend seamlessly to multi-objective optimization, since the feasibility modelling is independent of the choice of acquisition function. Notably, Li et al. recently proposed a new method to balance (unknown) constraint modelling and multi-objective optimization by unifying constraint violation with hypervolume regret.52 They demonstrated improved efficiency compared to baseline scalarization-based methods such as qParEGO.53 Composite optimization, likewise, is expected to transfer to multi-objective scenarios in which the objectives are scalarized.
However, the implementation of these extensions is beyond the scope of the current manuscript and thus left as a future direction.
Finally, we highlight that CCBO could potentially be extended to other particle synthesis systems, such as batch methods and microfluidics, to facilitate the guided design and production of particles. In the past, the resource-demanding nature of experimentation and scarcity of data have posed significant challenges and prolonged the workflow of particle synthesis. We expect CCBO to empower nanotechnology with a smarter and more efficient paradigm for target-driven design.
Supplementary information is available. See DOI: https://doi.org/10.1039/d5dd00243e.
| This journal is © The Royal Society of Chemistry 2025 |