Mohammad Haddadnia ab, Leonie Grashoff c and Felix Strieth-Kalthoff *cd
aHarvard University, Department of Biological Chemistry & Molecular Pharmacology, Boston, MA, USA
bHarvard University, Dana-Farber Cancer Institute, Department of Cancer Biology, Boston, MA, USA
cUniversity of Wuppertal, School of Mathematics and Natural Sciences, Wuppertal, Germany. E-mail: strieth-kalthoff@uni-wuppertal.de
dUniversity of Wuppertal, Interdisciplinary Center for Machine Learning and Data Analytics, Wuppertal, Germany
First published on 7th May 2025
Scientific optimization problems are usually concerned with balancing multiple competing objectives that express preferences over both the outcomes of an experiment (e.g. maximize reaction yield) and the corresponding input parameters (e.g. minimize the use of an expensive reagent). In practice, operational and economic considerations often establish a hierarchy of these objectives, which must be reflected in algorithms for sample-efficient experiment planning. Herein, we introduce BoTier, a software library that can flexibly represent a hierarchy of preferences over experiment outcomes and input parameters. We provide systematic benchmarks on synthetic and real-life surfaces, demonstrating the robust applicability of BoTier across a number of use cases. Importantly, BoTier is implemented in an auto-differentiable fashion, enabling seamless integration with the BoTorch library, thereby facilitating adoption by the scientific community.
In scientific optimization problems, the primary objective(s) are generally derived from the outcome of an experiment. In reaction optimization (Fig. 1a), for example, this could be the yield of the desired product, or the quantity of an undesired side product. At the same time, secondary optimization objectives can include preferences over input parameters, such as minimizing the loading of an expensive catalyst, or minimizing the reaction temperature to lower energy consumption.11–13 It is worth noting that such considerations imply that certain objectives are prioritized over others, establishing a known hierarchy.14–16
In MOO, a solution in which further improving one objective is detrimental to at least one other objective is called a Pareto optimum,17 and the set of all Pareto optima is referred to as the Pareto front (Fig. 1b). In an ideal scenario, knowing the entire Pareto front would enable optimal post-hoc decisions, accounting for all inter-objective trade-offs. Accordingly, the past decades have seen significant advances in hypervolume-based approaches to map the Pareto front.18,19 However, Pareto-oriented optimization may spend significant experimental resources on mapping regions of the Pareto front that are not of interest to the researcher (Fig. 1b). Therefore, when relative objective importances are known, scalarizing multiple objectives into a single score can help guide the optimization to desired regions of the Pareto front.1,13,20
In practice, such scalar scores are often used in a manner that can be described as implicit objective modeling (Fig. 1c left). Here, for each observation, the multiple objective values are first combined into a single scalar score, and standard single-objective BO is then employed to optimize this score over the search space.1,20 While straightforward, this aggregate-then-predict approach has two main drawbacks: (a) when input-based objectives are included, the relationship between score and input parameters is provided only implicitly to the surrogate, and must therefore be “re-learned”. For example, in the reaction optimization scenario shown in Fig. 1a, the surrogate would have to learn how catalyst loading and temperature influence the final score, even though this relation is known a priori in analytical form. This redundancy is likely to reduce optimization efficiency. (b) The scalar score itself is artificial and may lack physical meaning, which can hinder the design of effective priors.21,22
Therefore, scalarization would ideally be employed in a predict-then-aggregate manner (Fig. 1c), which requires manipulating multiple posterior distributions and thereby complicates practical implementation. In this context, Frazier and co-workers introduced the concept of composite objective functions,23 where a real-valued function is applied only after building surrogate models, and demonstrated that their posteriors can be approximated by Monte-Carlo integration. When applied to scalarization in MOO, we refer to this strategy as explicit objective modeling.
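To make this predict-then-aggregate idea concrete, the sketch below shows one generic way of realizing it with Monte-Carlo composite objectives in BoTorch. It is an illustration only, not BoTier's implementation: the toy data, the composite function and its 0.5 weighting are placeholder assumptions.

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition.monte_carlo import qExpectedImprovement
from botorch.acquisition.objective import GenericMCObjective
from gpytorch.mlls import ExactMarginalLogLikelihood

# Toy data: one modeled experiment outcome over two input parameters.
train_X = torch.rand(8, 2, dtype=torch.double)
train_Y = train_X.sum(dim=-1, keepdim=True)  # placeholder "measured" outcome

gp = SingleTaskGP(train_X, train_Y)
fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))

# Predict-then-aggregate: the scalarization acts on posterior samples of the
# modeled outcome together with the (known) inputs; here, a simple
# outcome-minus-input-penalty serves as a stand-in composite.
def composite(samples, X=None):
    return samples[..., 0] - 0.5 * X[..., 0]

acqf = qExpectedImprovement(
    model=gp,
    best_f=composite(train_Y.unsqueeze(0), train_X).max(),  # best composite value so far
    objective=GenericMCObjective(composite),
)
acq_values = acqf(torch.rand(3, 1, 2, dtype=torch.double))  # 3 candidates, q = 1
```

BoTier's explicit objective modeling follows the same principle, with the hierarchical score Ξ (eqn (2)) taking the role of the simple composite above.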
Combining the principle of hierarchical MOO with the idea of explicit objective modeling, we herein introduce BoTier as a flexible framework for MOO which enables tiered preferences over both experiment inputs and outputs. The main contributions of this paper include (1) the formulation of an improved, auto-differentiable hierarchical composite score; (2) its open-source implementation as an extension of the BoTorch library; and (3) systematic benchmarks on analytical surfaces and real-world chemistry examples, showcasing how BoTier can efficiently navigate MOO problems in the context of scientific optimization.
[Eqn (1): Chimera's hierarchical scalarization χ(x) of the objectives ψ1(x), …, ψN(x); equation image not reproduced.]
Although Chimera is widely used for MOO, its current formulation is limited to implicit objective modeling scenarios, where χ is computed for all K observations, {(ψ1(xk), ψ2(xk),…,ψN(xk))}k=1K. Since the value of χ for a single observation depends on all other observations considered at the same time, per-sample evaluation of χ is not possible in its current form. Moreover, its implementation is not auto-differentiable, limiting its usefulness as a composite score for BO.
Ξ(x) = ∑i=1N [∏j<i H(ψj(x) − tj)]·min(ψi(x), ti), (2)
where H denotes the Heaviside step function and ti the satisfaction threshold of objective ψi.
As in Chimera, the product ensures that an objective ψi contributes to Ξ only after all superordinate objectives {ψj}j<i have met their satisfaction thresholds. When ψi is below its threshold ti, it becomes the limiting objective in that region of parameter space, and min (ψi, ti) returns ψi. Otherwise, ti is added to the score, preserving continuity of Ξ. Empirically, we confirm that this formulation is consistent with the ranking behavior of Chimera (ESI,† section 4). In addition, all ψi can be normalized to the range [0, 1] based on expert knowledge. This normalization, albeit optional, places gradients of Ξ on a consistent scale for all x ∈ X.
Our implementation of Ξ employs continuously differentiable approximations for both min (x1, x2) and H(x) (see ESI,† section 1), enabling the automatic propagation of gradients through Ξ(x) using the PyTorch framework. This approach supports gradient-based techniques for optimizing any acquisition function computed on top of Ξ, ensuring robust optimization even in high-dimensional spaces. Therefore, BoTier integrates seamlessly with the widespread BoTorch ecosystem for BO, and can be flexibly combined with different single- or multi-task surrogate models, and acquisition functions. When applied in the context of explicit objective modeling, Ξ can be evaluated over both experiment inputs and model outputs (i.e., posterior distributions) using Monte-Carlo integration (see ESI,† section 1.3 for details).25,26 We provide BoTier as a lightweight Python library, which can be installed from the Python Package Index (PyPI).
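For illustration, a minimal PyTorch sketch of eqn (2) with smooth stand-ins for H and min is shown below. This is not the library's own code: the sigmoid and log-sum-exp approximations, the sharpness parameter beta and the function name are assumptions made for this example.

```python
import torch

def smooth_hierarchy_score(objectives, thresholds, beta=50.0):
    """Differentiable sketch of the tiered score in eqn (2).

    objectives: tensor of shape (..., N) with objective values psi_i,
                normalized so that larger is better.
    thresholds: tensor of shape (N,) with satisfaction thresholds t_i.
    beta:       sharpness of the smooth approximations (illustrative choice).
    """
    # Smooth Heaviside H(psi_j - t_j): ~1 once objective j meets its threshold.
    satisfied = torch.sigmoid(beta * (objectives - thresholds))

    # Smooth min(psi_i, t_i): ~psi_i below the threshold, ~t_i above it.
    pair = torch.stack([objectives, thresholds.expand_as(objectives)], dim=-1)
    soft_min = -torch.logsumexp(-beta * pair, dim=-1) / beta

    # Gate_i = product over superordinate objectives j < i (empty product = 1).
    gate = torch.cumprod(satisfied, dim=-1)
    gate = torch.cat([torch.ones_like(gate[..., :1]), gate[..., :-1]], dim=-1)

    # Tiered sum: objective i contributes only once all higher tiers are satisfied.
    return (gate * soft_min).sum(dim=-1)
```

Because only differentiable torch operations are used, gradients of the score propagate automatically via PyTorch's autograd, in line with the behaviour described above.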
All empirical optimization runs were performed using BO workflows implemented in BoTorch. Unless otherwise noted, we employed a Gaussian Process (GP) surrogate model with the Expected Improvement (EI) acquisition function and a batch size of 1. Each run was repeated 50 times from different random seed points to obtain statistically robust results. Sobol sampling was used as a model-free baseline. The complete code for reproducing all experiments is available on our GitHub repository.
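As a rough sketch of such a workflow (a toy quadratic objective and placeholder settings, not the actual benchmark code), a sequential GP/EI loop with a Sobol baseline might look as follows:

```python
import torch
from torch.quasirandom import SobolEngine
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

# Toy 2D problem on [0, 1]^2; `score` stands in for the scalarized objective.
bounds = torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.double)
score = lambda x: -(x - 0.3).pow(2).sum(dim=-1, keepdim=True)

sobol = SobolEngine(dimension=2, scramble=True)
train_X = sobol.draw(5).double()          # quasi-random initial design
train_Y = score(train_X)

for _ in range(20):                       # sequential BO loop, batch size 1
    gp = SingleTaskGP(train_X, train_Y)
    fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
    acqf = ExpectedImprovement(model=gp, best_f=train_Y.max())
    candidate, _ = optimize_acqf(acqf, bounds=bounds, q=1, num_restarts=10, raw_samples=128)
    train_X = torch.cat([train_X, candidate])
    train_Y = torch.cat([train_Y, score(candidate)])

# Model-free Sobol baseline: simply continue drawing quasi-random points instead.
baseline_X = sobol.draw(20).double()
```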
First, we evaluated several MOO strategies on four analytical multi-objective surfaces from the BoTorch library. These multidimensional (2–10D) benchmark functions typically exhibit non-linear, non-convex behavior within a bounded search space. To simulate a scenario with both input- and output-dependent objectives, each of these two-objective problems was augmented by a third objective that depends solely on the function inputs (see ESI†). Fig. 2 summarizes the general trends observed across all surfaces; a detailed, problem-specific comparison between the algorithms is provided in the ESI.† Across all tasks, we found that BoTier, when used in an implicit objective modeling scenario, already led to faster convergence toward the optimum value of Ξ compared to Chimera (Fig. 2, top row). Likewise, the number of experiments needed to find conditions that satisfy the first objective, the first two objectives, or all three objectives was consistently lower (Fig. 2, bottom row). In every case, using BoTier under explicit objective modeling further accelerated optimization. We initially attributed this improvement to the surrogate model no longer needing to “re-discover” known correlations between inputs and objectives. Surprisingly, this finding persisted for the original two-objective problems in which all objectives depend solely on experiment outputs (see ESI,† Section 2.2). Although a systematic analysis is beyond the scope of this study, these findings suggest that learning two independent distributions is simpler than capturing a more complex joint distribution, as required in the case of implicit objective modeling. In fact, multi-output GP surrogates, which attempt to capture correlations between objectives, did not improve optimization performance compared to single-task GP models in most cases (Fig. S12 and S17†).
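To illustrate the benchmark construction described at the beginning of this paragraph, a two-objective BoTorch test problem can be augmented with an input-dependent objective as sketched below; the choice of BraninCurrin and of the added objective is purely illustrative, and the objectives actually used are specified in the ESI.†

```python
import torch
from botorch.test_functions.multi_objective import BraninCurrin

# Two output-based objectives from a standard BoTorch test problem.
problem = BraninCurrin(negate=True)

def augmented_objectives(x):
    f = problem(x)                            # shape (..., 2): output-based objectives
    input_obj = 1.0 - x[..., 0:1]             # illustrative input-based objective
    return torch.cat([f, input_obj], dim=-1)  # shape (..., 3)

objs = augmented_objectives(torch.rand(4, 2))  # 4 points, 3 objectives each
```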
Fig. 2 Benchmarks of MOO strategies on four analytical surfaces, each extended by an input-dependent objective (see ESI† for details). Top panel: best observed value of Ξ as a function of the number of experimental evaluations. All statistics were calculated over 50 independent campaigns on each surface; intervals correspond to the standard error. Bottom panel: number of experiments required to satisfy the first objective (n = 1, green); the first two objectives (n = 2, dark green); or all three objectives (n = 3, blue).
Moreover, we evaluated BoTier against a threshold-based, non-hierarchical composite score: a penalty-based scalarization introduced by deMello and co-workers.27 The widely used Pareto-oriented Expected Hypervolume Improvement (EHVI) acquisition function was tested as a reference.28 While optimization behavior varied by problem (see ESI,† Sections 2.2 and 2.3), several trends emerged: compared to BoTier, the penalty-based scoring often required more evaluations to satisfy the early objectives in the hierarchy, whereas points satisfying all objective criteria were identified within a comparable experimental budget, highlighting the general efficiency of explicit objective modeling. As expected, EHVI, which expresses no preference for any “region” of the Pareto front, required substantially more experiments to identify Pareto-optimal points that satisfy all criteria (see Fig. 2 and ESI†). These benchmarks confirm the feasibility of BoTier's formulation and its effectiveness as a composite score in explicit objective modeling.
Encouraged by these results, we tested BoTier on chemical reaction optimization scenarios that are highly relevant to self-driving laboratories. Following the emulator strategy described by Häse et al.,29 a supervised ML model was first trained on an external, labeled dataset. The resulting model was then used as an emulator, substituting for actual wet-lab experiments, and could be queried with any set of input parameters proposed during a BO campaign. Specifically, we investigated the following problems: condition optimization in a heterocyclic Suzuki–Miyaura coupling,29 an enzymatic alkoxylation reaction,29 a synthesis of silver nanoparticles monitored via spectrophotometry,30 and an amine monoalkylation.31 All optimization runs were performed using a GP surrogate with the EI acquisition function, with 50 independent repetitions from different random seed points, as described above.
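The emulator pattern itself is straightforward; the hypothetical sketch below (random placeholder data and an arbitrary regressor choice) illustrates how a model trained on an external dataset stands in for the wet-lab experiment during a BO campaign:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Placeholder stand-ins for an external, labeled reaction dataset
# (conditions -> measured outcome); names and model choice are illustrative.
X_lit = np.random.rand(200, 4)
y_lit = np.random.rand(200)

emulator = RandomForestRegressor(n_estimators=200).fit(X_lit, y_lit)

def run_experiment(conditions):
    """Stand-in for a wet-lab experiment: the BO loop queries the emulator instead."""
    return emulator.predict(np.atleast_2d(conditions))
```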
Fig. 3b illustrates the Suzuki–Miyaura coupling example, which is optimized over reagent stoichiometry, catalyst loading, base loading and reaction temperature. Process chemistry considerations define a three-tier objective hierarchy, consisting of (1) maximized product yield, (2) minimized cost of all reactants and reagents, and (3) minimized reaction temperature. Benchmarking different MOO algorithms on this problem shows that BoTier is the only strategy capable of identifying reaction conditions that simultaneously meet all thresholds. By contrast, a quasi-random, model-free Sobol sampling baseline (see ESI† for further details) rapidly found high-yielding conditions, but failed to keep cost and temperature low. Similarly, neither Chimera with implicit objective modeling nor the evaluation of the full Pareto front (EHVI) identified satisfactory conditions within the given budget.
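To connect this case study to eqn (2), the sketch below shows one hypothetical way of encoding the three-tier hierarchy on top of the smooth_hierarchy_score sketch given earlier; the [0, 1] normalization and the threshold values are placeholders, not those used in the benchmark.

```python
import torch

def tiered_score(yield_frac, norm_cost, norm_temp):
    # All objectives expressed as "larger is better" on [0, 1]: yield directly,
    # cost and temperature as (1 - normalized value). Thresholds are illustrative.
    objectives = torch.stack([yield_frac, 1.0 - norm_cost, 1.0 - norm_temp], dim=-1)
    thresholds = torch.tensor([0.9, 0.8, 0.7])  # placeholder satisfaction levels
    return smooth_hierarchy_score(objectives, thresholds)

score = tiered_score(torch.tensor(0.92), torch.tensor(0.35), torch.tensor(0.60))
```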
Fig. 3 Evaluation of different MOO strategies for chemical reaction optimization. (A) Optimization performance on different emulated reaction optimization problems. Number of experiments required to satisfy the first n objectives. (B) Case study of a Suzuki–Miyaura coupling. Plots show the median objective values across 50 independent runs, with shaded areas indicating the 20th and 80th percentiles. See ESI† for further details. |
Similar trends were observed for the other emulated problems (Fig. 3a). Notably, we observed cases in which BoTier and EHVI satisfied the criteria at similar rates (Fig. 3a, panels 2 and 4), which occurred when the objectives did not strongly compete (see Fig. S21 and S23† for further details). Overall, if a hierarchy between objectives exists, BoTier proved to be a robust scalarization function which, across all cases investigated, never performed worse than, and often performed notably better than, existing MOO methods – particularly when used as a composite score in explicit objective modeling.
(1) Use hierarchical objectives when a hierarchy exists. If the objectives, whether input- or output-dependent, are subject to a well-defined priority structure, BoTier offers a robust objective to encode and optimize these preferences. Our benchmarks show that, in these cases, hierarchical methods can identify desirable optima more rapidly than approaches that seek to map the entire Pareto front.
(2) Favor explicit over implicit objective modeling whenever possible. Across all problems studied, explicit objective modeling consistently outperformed implicit objective modeling approaches, often yielding substantial speedups. In no case did an implicit modeling approach prove superior. We foresee that this effect will be even more pronounced when incorporating priors over physically meaningful quantities.
To encourage broader adoption, BoTier is provided as a lightweight, open-source extension to the BoTorch library. Looking ahead, we are exploring its applications in self-driving laboratories, where hierarchical optimization can be especially valuable for balancing complex objectives including materials properties, synthetic feasibility, cost, and sustainability. We anticipate that BoTier will be a valuable addition to the optimization toolbox for autonomous research systems.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5dd00039d |
This journal is © The Royal Society of Chemistry 2025 |