Annie M. Westerlund*, Lakshidaa Saigiridharan and Samuel Genheden
Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183, Gothenburg, Sweden
First published on 14th July 2025
Contemporary multistep retrosynthesis tools such as AiZynthFinder, which are frequently used by chemists, generate solved routes for the majority of target molecules, but do not consider the prior knowledge of the chemist, including specific bonds that should disconnect or remain connected throughout the routes. Such knowledge is for example integral when planning a joint synthesis route for a set of similar molecules where common disconnection sites can be identified across the molecules. Here, we present a novel strategy in AiZynthFinder for human-guided multistep retrosynthesis via prompting. This includes a filter for discarding reactions that violate bonds to freeze constraints. Furthermore, we benchmark four possible strategies for breaking selected bonds in the search for synthetic routes, and show that a combination of a disconnection-aware transformer and a multi-objective search generates routes which satisfy bond constraints for more targets in the PaRoutes dataset compared to the standard search (75.57% vs. 54.80%). Finally, we apply the strategy on a set of drug molecules to exemplify a real-world scenario. Our novel approach enables building a short joint synthesis route that satisfies the given bond constraints and covers eight of the ten molecules, demonstrating the added value of incorporating human prior knowledge in synthesis planning.
One of the challenges commonly occurring in drug discovery is the prediction of synthesis routes for a set of target molecules with shared intermediates. However, off-the-shelf AI-driven synthesis planning tools, such as AiZynthFinder, are typically unaware of concepts such as “multiple targets” or “shared intermediates”. The challenge has previously been approached either through a recursive search on a complete reaction network to generate cost-effective and diverse synthesis routes,19,20 or as a postprocessing step of retrosynthesis searches by carrying out multi-objective optimization on the generated reaction networks.21,22 The latter is naturally dependent on the quality of the set of routes generated by retrosynthesis. Moreover, multistep retrosynthesis which considers the prior knowledge of the chemist who carries out the experiments in the lab is an important topic. The chemist could, for example, be interested in breaking a specific bond, or in keeping certain bonds or moieties frozen in the generated synthesis route. This would be especially beneficial when planning a joint synthesis route for a set of similar compounds where common disconnection sites can be identified across the compounds.
Here, we present a novel strategy in AiZynthFinder for human-guided multistep retrosynthesis via prompting. The chemists provide input on what bonds should be disconnected (bonds to break) or remain connected (bonds to freeze) in the synthesis route as prompts to the tool. In its standard configuration, AiZynthFinder uses a so-called template-based model to suggest chemical disconnections and MCTS guided by a simple objective to break down the target molecule. To keep bonds frozen, any single step predictions violating the bonds to freeze constraints are filtered out. This functionality is referred to as the frozen bonds filter. For breaking user-specified bonds, we investigate two possible approaches to disconnection-aware multistep retrosynthesis via prompting. First, a novel score that favors routes satisfying the bonds to break constraints early in the search tree (see Methods; Broken bonds score). We integrate this score in the tree search using a recently developed multi-objective (MO) MCTS algorithm.23,24 Second, we introduce a framework where a disconnection-aware transformer, which is a prompt-based language model for disconnecting tagged bonds to break,25,26 is combined with the template-based model to allow for human-guided multistep retrosynthesis. The reader is referred to Methods; Disconnection-aware Chemformer in multistep retrosynthesis for details on how the two models are integrated in the multistep retrosynthesis framework.
We benchmark the novel approach on the PaRoutes set-n1 (ref. 27) and Reaxys-JMC (Journal of Medicinal Chemistry)28 datasets, which consist of known synthesis routes either from patents or the literature. Finally, we showcase a project-specific application with ten similar target compounds. Overall, the work presented here provides a novel approach for human-driven multistep retrosynthesis via prompting.
First, recent approaches have been reported for finding common intermediates and joint synthesis routes. One approach leveraged synthesis routes from AI-driven synthesis planning tools and carried out multi-objective optimization on the generated reaction networks.21,22 A recent alternative approach constructed a search tree for multiple targets at once using only the single step model scores.29 These tools are promising for automatically constructing a joint synthesis route for multiple targets. Although such methods do not allow the user to make suggestions of bonds to break or freeze on their own, they could in principle be used in conjunction with our human-guided retrosynthesis search in a workflow for creating joint synthesis routes for multiple targets given bond constraints.
Second, advances have been made for the purpose of prompting single step predictions to disconnect bonds to break. Recently, a disconnection-aware transformer model was proposed to guide single step predictions by tagging the disconnection site in the SMILES string prior to feeding it to the model.25,26 The first introduction of a disconnection-aware transformer demonstrated the ability of the model to learn to recognize the tags and disconnect the corresponding bonds.26 It was extensively benchmarked on single step predictions compared to a baseline retrosynthesis transformer in order to manifest its robustness and accuracy. In addition, the authors introduced a model for automatic tagging of reaction centers. The automatic tagging aims to step away from human-guided retrosynthesis and aids the disconnection-aware model to automatically generate more diverse predictions.26 In contrast to our work, the first disconnection-aware transformer was only used and evaluated for the purpose of single step retrosynthesis. Later, an alternative disconnection-aware transformer was used in a multistep search by pairing it with automatic tagging in each single-step iteration.25 Unlike the work presented here, automatic tagging of reaction centers was used to boost diversity. Hence, human-directed prompting of bonds to break in a multistep framework was not explored.
In contrast to earlier work on disconnection-aware transformers which only considered isolated single step predictions, we have developed two approaches for human-guided multistep retrosynthesis searches via prompting. We thus focus on generating synthesis routes, rather than single reaction predictions. The first approach is the novel broken bonds score which is used in a multi-objective search. The score enables guiding synthesis planning with prompted bond constraints without altering the single step model. Our second approach relies on the previously proposed disconnection-aware transformer. To this end, we introduce novel extensions to the single step model. For instance, several steps may be required to break the bonds, and hence the disconnection site tagging should be reliably propagated to the next steps in the synthesis route. We therefore constructed a custom expansion strategy (Table 1) which combines the predictions of the disconnection-aware model with predictions of the template-based model in order to allow further disconnections to reach solved routes. In addition to our two approaches, we introduce the frozen bonds filter. To our knowledge, approaches similar to the frozen bonds filter or the broken bonds score have not been reported prior to this work.
MCTS | Monte-Carlo Tree Search. Search algorithm used for multistep retrosynthesis. Iteratively predicting reactants given the current product molecule. A (single) objective, or score, is used to guide the search |
MO-MCTS | Multi-objective Monte-Carlo Tree Search. MCTS with multiple objectives, or scores, to guide the search |
Single step model | In each step of an AI-driven retrosynthesis search, predictions are carried out with a single step (machine learning) model which predicts reactants for a given product, either implicitly via a reaction template or explicitly via SMILES or molecule graphs |
Expansion strategy | Used within the multistep retrosynthesis search to generate predictions with a single step model. It outputs model predictions which are ranked and, possibly, processed |
AiZynthFinder | Software tool which uses MCTS, or MO-MCTS in combination with a single step model to generate multistep retrosynthesis predictions |
Template-based model | The single-step model used in the standard AiZynthFinder search. Takes a molecule fingerprint as input and predicts reaction templates which can be applied to the product to obtain reactants |
Chemformer | Transformer model which can be used as single step model. Retrosynthesis transformer models12,25,26,30,31 are trained on SMILES strings (reactant SMILES are predicted directly based on the product SMILES as input). This allows for non-chemical modifications to the input, including tagging atoms in bonds to break |
Pareto front | An optimal balance of multiple objectives or scores, such that none of the scores can be improved without weakening the other score(s). The Pareto front is the set of solutions that fulfill this trade-off |
Pareto rank | A ranking of Pareto fronts which are obtained by iteratively removing the primary Pareto front from the data. For example, the set of samples in the new Pareto front obtained after removing the data on the first Pareto front corresponds to the second Pareto rank |
Table 1 provides a glossary with key concepts and abbreviations used throughout this manuscript.
For the bonds to break constraints, we investigated two main approaches: (1) a novel broken bonds score to bias the tree search and route building steps and (2) a disconnection-aware Chemformer for boosting single step predictions. The reader is referred to the Methods section for technical details on the broken bonds score and disconnection-aware Chemformer.
Standard AiZynthFinder uses MCTS (Table 1) to combine single-step predictions. In standard MCTS, a single search objective is used (the state score). However, for retrosynthesis, there can be multiple objectives that describe route quality. For example, when prompting bonds to break to a retrosynthesis search, one objective is to favor routes which disconnect the prompted bonds (broken bonds score) and another objective is to favor routes which are solved and short (state score). Therefore, both the state score and broken bonds score are necessary to rank the routes in disconnection-aware retrosynthesis. In contrast to standard MCTS, multi-objective MCTS (MO-MCTS, Table 1) enables taking multiple objectives, or scores, into account when carrying out the retrosynthesis search. It has previously been explored for robotics environmental monitoring24 and was recently implemented for multistep retrosynthesis in AiZynthFinder.23 The extracted routes are part of the Pareto front (Table 1).
The reader is referred to the Methods section as well as the original publication of MO-MCTS in AiZynthFinder23 for technical details on how the MO-MCTS algorithm is implemented.
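The notions of Pareto front and Pareto rank from Table 1 can be illustrated with a minimal sketch. It assumes each route is summarized by a tuple of scores where higher is better; the function names and the example scores are hypothetical and are not taken from the AiZynthFinder implementation.

```python
from typing import List, Sequence, Tuple

Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every score and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_ranks(points: Sequence[Scores]) -> List[int]:
    """Rank 1 is the Pareto front; rank 2 is the front of what remains, and so on."""
    ranks = [0] * len(points)
    remaining = set(range(len(points)))
    rank = 1
    while remaining:
        # A point is on the current front if no other remaining point dominates it.
        front = {
            i
            for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        }
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks


# Hypothetical (state score, broken bonds score) pairs for four routes.
routes = [(0.95, 0.0), (0.80, 1.0), (0.70, 0.5), (0.60, 1.0)]
print(pareto_ranks(routes))  # [1, 1, 2, 2]
```

In this toy example the first two routes form the Pareto front, while the last two only appear once the first front has been removed.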
Building on the two approaches, we implemented four different disconnection-aware strategies in multistep retrosynthesis:
(1) Route ranking: in AiZynthFinder, the tree search generates a large number of synthesis routes which are ranked in a postprocessing step. After ranking the routes, AiZynthFinder extracts the top 10–15 routes which are returned to the user. The number of routes generated by MCTS is much larger (on average ∼125 with the state score), and therefore the score used to rank routes greatly influences which routes are returned to the user. In the route ranking strategy, routes are ranked prior to extraction using a linear combination (with equal weights) of the broken bonds score (Methods; eqn (1)) and the state score. The score was thus not used in the MCTS, but in the subsequent route extraction phase by AiZynthFinder to gather the output routes. This is in contrast to the standard strategy, where only the state score is used for route ranking. Note that the template-based model and standard single-objective (state score) MCTS were used in this strategy.
(2) MO search: multi-objective MCTS23,24 (MO-MCTS) with the broken bonds score as an additional objective together with the state score. The template-based model was used as single-step model. The template-based model generated 50 predictions. The broken bonds score ensured that routes with desired disconnections were favored. The two objectives used in the MO-MCTS were used to rank and select routes, such that the extracted routes belonged to the Pareto front.
(3) Chemformer: disconnection-aware Chemformer paired with the template-based model in a multi-expansion framework to generate single step predictions. The standard MCTS with the state score was used to conduct the search. The state score was used to rank and extract routes.
(4) Chemformer-MO: MO-MCTS with the state score and broken bonds score to conduct the search, and the disconnection-aware Chemformer paired with the template-based model in a multi-expansion framework to generate single step predictions. The two scores used in the MO-MCTS were used to rank and select routes, such that the extracted routes belonged to the Pareto front.
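As an illustration of the route ranking strategy (1), the following sketch ranks routes by an equally weighted linear combination of the two scores. The dictionary keys, the route identifiers and the score values are hypothetical; the sketch only assumes that both scores lie between zero and one and that higher is better.

```python
from typing import Dict, List, Union

Route = Dict[str, Union[str, float]]


def rank_routes(routes: List[Route], weight: float = 0.5) -> List[Route]:
    """Sort routes by a linear combination of state score and broken bonds score."""

    def combined(route: Route) -> float:
        return (
            weight * float(route["state_score"])
            + (1.0 - weight) * float(route["broken_bonds_score"])
        )

    return sorted(routes, key=combined, reverse=True)


# Hypothetical routes: "a" is short and solved but ignores the prompted bond.
candidates = [
    {"id": "a", "state_score": 0.99, "broken_bonds_score": 0.0},
    {"id": "b", "state_score": 0.90, "broken_bonds_score": 1.0},
    {"id": "c", "state_score": 0.95, "broken_bonds_score": 0.5},
]
print([r["id"] for r in rank_routes(candidates)])  # ['b', 'c', 'a']
```

Ranking the same candidates by the state score alone would return the order a, c, b, which shows how the choice of ranking score changes which routes are extracted.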
The frozen bonds filter was used in all four strategies. The usefulness of the frozen bonds filter is specifically analyzed in the section Application to a target-specific set of compounds. The four disconnection-aware strategies were compared to the standard AiZynthFinder search that uses the template-based model from the production platform and MCTS with the state score as the single search and route ranking objective.18
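The idea behind the frozen bonds filter can be sketched as follows. The sketch assumes that atom-mapping is available, so that each bond can be written as a pair of atom-map numbers, and that each predicted reactant is given as its set of bonds. These data structures and names are hypothetical simplifications and do not reflect the actual filter code in AiZynthFinder.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

Bond = FrozenSet[int]


def bond(a: int, b: int) -> Bond:
    """An undirected bond between two atom-map numbers."""
    return frozenset((a, b))


def keeps_frozen_bonds(reactants: List[Set[Bond]], frozen: Iterable[Bond]) -> bool:
    """True if every frozen bond is still present in one of the reactants."""
    present: Set[Bond] = set().union(*reactants)
    return all(b in present for b in frozen)


def filter_predictions(
    predictions: Dict[str, List[Set[Bond]]], frozen: Iterable[Bond]
) -> List[str]:
    """Return the names of predictions that do not violate the frozen bonds."""
    frozen = list(frozen)
    return [name for name, rs in predictions.items() if keeps_frozen_bonds(rs, frozen)]


# Hypothetical product with bonds 1-2, 2-3, 3-4 and 4-5.
predictions = {
    "A": [{bond(1, 2)}, {bond(3, 4), bond(4, 5)}],  # disconnects 2-3
    "B": [{bond(1, 2), bond(2, 3)}, {bond(4, 5)}],  # disconnects 3-4
}
print(filter_predictions(predictions, [bond(2, 3)]))  # ['B']
```

Prediction A is discarded because it disconnects the frozen bond 2-3, whereas prediction B is passed on to the tree search.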
We could draw several interesting conclusions from the results in Fig. 2. Although the standard search solved most targets (84.91%) in a short time (Fig. 2a and b), it only solved and satisfied the bond constraints for 54.80% of the targets. The Chemformer strategy was substantially slower than the standard strategy, while the route-ranking and multi-objective search strategies were comparable in search time (Fig. 2b). The route-ranking strategy led to fewer solved targets than the standard search (Fig. 2a: 74.62% vs. 84.91%) but resulted in more solved targets which fulfill the bond constraints (67.66% vs. 54.80%).
The Chemformer strategy yielded almost as many solved targets as the standard strategy (Fig. 2a; 83.18%), while more of the solved targets actually complied with the bond constraints (Fig. 2c; 62.00% vs. 54.80%). Although the Chemformer strategy improved bond constraint performance compared to standard search, it remained less efficient than the route-ranking strategy. Hence, the route scoring in the AiZynthFinder postprocessing step is an important, but easily overlooked, component. In contrast to Chemformer, which approaches the problem with single step predictions, strategies that rely on the broken bonds score consider the full composition of the routes. In other words, Chemformer operates locally while the broken bonds score operates globally. The global approach appeared especially successful. In particular, the MO search strategy generated routes which satisfied the constraints for even more targets than the route-ranking strategy (Fig. 2c). Moreover, it generated a larger number of solved routes which fulfilled bond constraints compared to the route-ranking strategy, while a smaller average number of routes was generated compared to the Chemformer strategy (Fig. 2d). The same was observed for the Reaxys-JMC targets (Fig. S3d†). These results demonstrate a clear advantage of using the broken bonds score for human-guided synthesis planning via prompting. Because the local and global approaches treat the problem in fundamentally different ways, one could imagine that the combined approach (Chemformer-MO, which combines the MO-MCTS with the disconnection-aware Chemformer) is superior. Indeed, for the PaRoutes set-n1 targets, the Chemformer-MO strategy reached the highest percentage of targets fulfilling bond constraints and yielded the largest number of routes which satisfy the bond constraints (Fig. 2).
For the Reaxys-JMC targets, Chemformer-MO performed similarly to the MO search strategy in terms of solved targets satisfying constraints, but generated substantially more routes which satisfy the constraints (Fig. S3†). We noticed that 6.0% of the targets were solved with fulfilled constraints using MO search (Table 2: Case 1), but not by Chemformer-MO. Conversely, 4.2% of the targets were solved with fulfilled constraints by Chemformer-MO (Table 2: Case 2), but not by MO search. Because the bonds to break constraints were handled differently by these two strategies, we investigated this constraint type. The number of bonds to break constraints was typically lower for targets in Case 2 than for targets in Case 1. Moreover, Chemformer-MO often solved routes with fewer steps than the number of constraints in Case 1, as opposed to the MO search in Case 2 (Table 2). While Chemformer-MO did not fulfill all constraints in Case 1, it solved 85% of the targets and satisfied at least one of the bonds to break constraints for all targets. This was in stark contrast to the MO search strategy in Case 2 (Table 2: 45%). The premature convergence, which was more common for the Chemformer-based strategy, is a symptom of the bonds to break constraints being treated as soft constraints. Because the state score favors routes which are short and solved, the longer routes from Chemformer-MO which satisfied all bond constraints were not included in the Pareto fronts of these targets. With the current implementation, one can include routes from lower Pareto ranks (Table 1) to extract the longer routes that satisfy all bond constraints. Altogether, the results presented here suggest that the combined approach of the Chemformer-MO strategy successfully generated routes which fulfill the given bond constraints, with the largest performance boost originating from the broken bonds score in MO-MCTS.
Case | Fulfilled bond constraints | Distribution of number of bonds to break constraints | Solved in fewer (fewer or equal) steps than number of bonds to break constraints | Solved targets | Satisfied any bonds to break constraints |
---|---|---|---|---|---|
1: Fulfilled by MO search (but not Chemformer-MO) | 6.0% | 1 constr.: 5%; 2 constr.: 48%; 3 constr.: 47% | [Chemformer-MO]: 43% (80%) | [Chemformer-MO]: 85% | [Chemformer-MO]: solved – 85%, all – 100% |
2: Fulfilled by Chemformer-MO (but not MO search) | 4.2% | 1 constr.: 33%; 2 constr.: 62%; 3 constr.: 8% | [MO search]: 24% (36%) | [MO search]: 66% | [MO search]: solved – 45%, all – 78% |
Fig. 3a (Reaxys-JMC: Fig. S4a†) shows that the disconnection-aware strategies in general suggested routes dissimilar to those of the standard search. The route-ranking strategy yielded the most dissimilar sets of routes, followed by the Chemformer-MO strategy (Fig. 3a and S4a†). Notably, comparing the standard search to itself resulted in slightly non-zero dissimilarity values. While this might appear unexpected, it is inherent to the similarity metric used.37 Specifically, the route similarity consists of two parts: atom-similarity and bond-similarity. The atom-similarity is normalized based on the total number of molecules in the two routes being compared. However, the atom-similarity calculation is only based on atoms that are identified in the final target molecule. This leads to some molecules being completely left out from the similarity calculations while still being part of the normalization. This is sensible when comparing two different routes, but may sometimes lead to noise when comparing two identical routes.
When considering route diversity in terms of average maximum dissimilarity between the set of routes generated by each strategy, the disconnection-aware strategies scored lower than the standard strategy (Fig. 3b and S4b†). In addition to these metrics, we computed the round-trip accuracy of predicted reactions. The round-trip accuracy was similar across the different strategies (Fig. S5†), indicating that any strategy can be employed without excessively compromising route feasibility.
In order to understand how, and to what extent, the disconnection-aware Chemformer was used in the routes, we analyzed the frequency of predictions in the extracted routes which had been generated by Chemformer and Chemformer-MO strategies (Fig. S6 and S7†). The Chemformer expansion policy was used more often in the Chemformer-MO strategy than in the single-objective Chemformer strategy (Fig. S6 and S7†), demonstrating the role of the broken bonds score in steering the search and ranking routes. Regardless of whether single- or multi-objective MCTS was used, Chemformer predictions were found only in the first three steps of the extracted routes (Fig. S6b and S7b†). The depths of trees generated by Chemformer and Chemformer-MO on average only included three steps (Tables S1 and S2†). Thus, the prioritization of disconnecting bonds to break left little room for the standard template-based predictions, which were mainly used as a complement to yield properly solved routes. In line with this, we noticed that the template-based model was not utilized in all trees, and was found in fewer trees generated by the Chemformer-MO strategy (Fig. S6 and S7†).
In summary, the disconnection-aware Chemformer produced reliable single step predictions, but the MO-MCTS with the broken bonds score (MO search) had a greater impact on the final constraint satisfaction of generated routes. Furthermore, the combined Chemformer-MO approach exploited features from both strategies and generated several synthesis routes in the feasible region while lowering route diversity compared to the standard search.
Table 3 reports the number of solved targets and the number of targets with routes that were solved with satisfied bond constraints. The standard search reached purchasable precursors for all 10 targets. However, for one of the targets, none of the three strategies relying solely on the template-based model was able to break the specified bonds (Table 3). In contrast, both Chemformer-based strategies satisfied bond constraints for all targets. Although all strategies performed reasonably well in satisfying bond constraints for this selected set of molecules, the bulk experiments (Fig. 2 and 3) demonstrate clear differences in the overall performance of the disconnection-aware strategies.
Strategy | Solved targets | Fulfilled bond constraints |
---|---|---|
Standard | 10/10 | 9/10 |
Route ranking | 9/10 | 9/10 |
MO search | 9/10 | 9/10 |
Chemformer | 10/10 | 10/10 |
Chemformer-MO | 10/10 | 10/10 |
Fig. 4 depicts the top-1 routes generated by the standard search and Chemformer-MO strategy for the target for which the template-based strategies did not yield solved routes with fulfilled constraints. Specifically, the route generated with the standard search reached purchasable precursors before breaking the bond [N:3]–[C:4] (Fig. 4a). In practice, one could envision a scenario where the chemist wishes to break a specific bond in order to avoid a certain starting material, for example due to cost or environmental impact. Then, it is important that the prompts made by the chemist are respected by the synthesis planning tool. For this reason, the route generated by the standard strategy in Fig. 4a can be considered less successful compared to the route in Fig. 4b generated by the Chemformer-MO strategy.
To test the effect of the frozen bonds filter, we carried out a search with bonds to freeze set to the bond between thiazole and benzene that was disconnected in the first Suzuki coupling reaction (Fig. 4a). With this filter, the bonds to break constraints were satisfied for all targets (Table S3†). However, the routes were not ranked to favor constraint satisfaction, leading to only 7/10 top-1 routes fulfilling the constraints. Together with the route-ranking strategy, the frozen bonds strategy yielded lower route diversity than any of the other strategies. Moreover, the frozen bonds strategy led to routes more dissimilar to the standard search, while the disconnection-aware strategies had higher top-1 constraint satisfaction (Table S3†). As mentioned, the ability to prioritize and rank routes based on constraint satisfaction is an important aspect when constructing routes with common intermediates. In contrast to using a frozen bonds filter, the Chemformer-MO strategy avoided the premature convergence issue by breaking the bond [N:3]–[C:4] in the first step with an N-acylation to amide reaction and then the bond [N:1]–[c:2] in the second step with nucleophilic aromatic substitution (SNAr)‡§ (Fig. 4b). This particular approach was observed in routes suggested by both Chemformer strategies for 7 of the 9 targets which included both bonds.
For the seven targets which shared a common retrosynthesis approach where bond [N:3]–[C:4] was disconnected in the first step and [N:1]–[c:2] in the second, we constructed a joint reaction tree (Fig. 5). The routes presented in this figure correspond to the top route of each target molecule. Five of these targets shared a common intermediate at the second step, and six shared a common intermediate at the third step. Interestingly, the shared intermediate at the second step was the target molecule which only contains the bond [N:1]–[c:2] (Fig. S8†). Hence, 8 of the 10 targets are actually represented in the joint reaction tree. One target did not share any intermediate molecules with the other targets, but it did share a building block molecule with the five targets that have a common second-step intermediate.3 ¶ Altogether, the Chemformer-MO strategy generated routes with common intermediates and shared building block molecules, greatly simplifying the synthesis of these compounds. For this set of compounds, the local approach used by the disconnection-aware Chemformer thus proved successful when the template-based model did not generate routes where [N:3]–[C:4] was disconnected in the first step.
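Identifying molecules that are shared between the top routes of several targets, as done when building the joint reaction tree, can be sketched in a few lines. The sketch assumes that every molecule in a route is represented by a canonical identifier (for example a canonical SMILES or InChI key), so that identical molecules compare equal as strings. The identifiers below are placeholders and not the molecules from the application study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def shared_molecules(
    routes: Dict[str, Iterable[str]], min_targets: int = 2
) -> Dict[str, Set[str]]:
    """Map each molecule to the targets whose route contains it, keeping shared ones."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for target, molecules in routes.items():
        for mol in set(molecules):
            seen[mol].add(target)
    return {mol: targets for mol, targets in seen.items() if len(targets) >= min_targets}


# Hypothetical top-1 routes, each given as the molecules it contains.
top_routes = {
    "target_1": ["int_A", "int_B", "bb_1"],
    "target_2": ["int_A", "bb_2"],
    "target_3": ["int_C", "bb_1"],
}
print(shared_molecules(top_routes))
```

Here `int_A` is an intermediate common to two targets, and `bb_1` is a building block shared by two targets that have no intermediate in common.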
We demonstrated that the strategies relying on a broken bonds score (route-ranking and MO search) overall outperformed the basic Chemformer strategy. Because the template-based model returned the 50 highest ranked template predictions, the bond to break was often disconnected in one of these. The broken bonds score could then rank the reaction trees or steer the search to generate routes which fulfilled the given bond constraints. In the Chemformer strategy, the boosted single step predictions were likely down-prioritized because the default search objective favored short and solved routes, which may be suboptimal for this task. When combining the disconnection-aware Chemformer with the broken bonds score in a multi-objective search, features from both local and global approaches were exploited, leading to a higher percentage of solved routes with satisfied constraints compared to the other three strategies. Notably, the Chemformer and Chemformer-MO strategies were significantly slower than standard search, while the MO search and route-ranking performed on par with the standard search concerning search time. Therefore, the strategies relying only on the broken bonds score and the template-based model are suitable in industrial application settings where multiple searches might be conducted and time is a limiting factor. Conversely, the Chemformer-MO strategy may be better suited for few-target searches where the project can afford to generate more route suggestions which comply with the bond constraints at the cost of computational time.
The work presented here can be exploited in several practical application scenarios. For example, given the knowledge and experience of a chemist, human-guided retrosynthesis can refine the synthesis routes provided by a standard search where undesirable disconnections have been proposed. Another example is synthesis planning for a set of similar molecules, for instance those obtained from generative molecular design with reinforcement learning.39 Adopting our approach may in these cases improve the routes, thus simplifying the synthesis process and consequently accelerating drug discovery.
In conclusion, the disconnection-aware Chemformer with the multi-objective search strategy used together with a frozen bonds filter is a viable strategy to generate routes that comply with user-specified bond constraints, and enables creating joint synthesis plans for a compound series using human knowledge. Future work includes creating a complete workflow for multi-target synthesis which integrates our approach into existing multi-target planning framework tools.21,22,29
The score takes a set of bonds to break as the input and returns a value between zero and one:
[Eqn (1): broken bonds score; equation image not available]
The Chemformer expansion policy received an atom-mapped SMILES as input together with the list of bonds to break. In an atom-mapped SMILES, each atom is tagged by a number. Atom-mapping is used to pair the corresponding atoms in product and reactants SMILES in a reaction. By exploiting the atom-mapping predicted by RXN-mapper,33 the original atom-mapping could be propagated to the retro-reactions predicted by Chemformer. The Chemformer expansion policy returned atom-mapped predictions which break the bonds specified by the list of bonds to break. The technical details of data processing, including how the atom-mapping is propagated in each step, and filtering carried out in the Chemformer expansion policy are given in ESI Methods:† Chemformer expansion policy.
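To make the prompting step concrete, the sketch below converts an atom-mapped SMILES and a list of bonds to break into a tagged SMILES. The tagging convention used here (atoms of the bonds to break carry the label `:1`, all other labels are removed) is an assumption made for illustration; the data processing actually used in the Chemformer expansion policy is described in the ESI. The function name is hypothetical, and only atoms written in brackets with a map number are handled.

```python
import re
from typing import Iterable, Tuple

# Matches a bracket atom with an atom-map number, e.g. "[NH:4]".
_MAPPED_ATOM = re.compile(r"\[([^\]:]+):(\d+)\]")


def tag_bonds_to_break(
    mapped_smiles: str, bonds_to_break: Iterable[Tuple[int, int]]
) -> str:
    """Keep a ':1' tag on atoms in the bonds to break and drop all other map numbers."""
    tagged = {idx for pair in bonds_to_break for idx in pair}

    def retag(match: "re.Match[str]") -> str:
        symbol, idx = match.group(1), int(match.group(2))
        # Brackets are kept so that explicit hydrogen counts stay unchanged.
        return f"[{symbol}:1]" if idx in tagged else f"[{symbol}]"

    return _MAPPED_ATOM.sub(retag, mapped_smiles)


# N-methylacetamide with the amide C-N bond (atoms 2 and 4) prompted for disconnection.
print(tag_bonds_to_break("[CH3:1][C:2](=[O:3])[NH:4][CH3:5]", [(2, 4)]))
# [CH3][C:1](=[O])[NH:1][CH3]
```

A string-based approach is used only to keep the example free of dependencies; a cheminformatics toolkit would normally be used to manipulate atom-map numbers.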
As when the template-based model was used alone, the multi-expansion strategy returned the 50 top-ranked predictions of the two combined models to the tree search.
To create synthetic benchmarking datasets for the disconnection-aware multistep retrosynthesis, we first extracted all reactions from the reference routes (PaRoutes and Reaxys-JMC). From these reactions, we extracted bonds to break and bonds to freeze with a workflow which included propagating atom-mapping in the tree and selecting bonds to break by maximizing the convergent disconnection score.36 The convergent disconnection score ensures that we select bonds that split the product molecule into roughly equal-sized reactants. The workflow for obtaining bonds to break and bonds to freeze for a target molecule from an example route is visualized in Fig. S10,† and described in detail in ESI Methods:† Synthetic benchmarking datasets. The bonds to freeze were selected from the unchanged bonds to ensure that there is at least one possible solved route (the reference route) for each target molecule. We note that freezing some of the changed bonds in the reference routes would force the search to find solved routes different from the reference, but would not guarantee that a solution exists. In addition, we chose to freeze fewer bonds than bonds to break to focus on the effect of the different disconnection strategies, and to lower the probability of freezing bonds that are necessary for alternative routes. Applying this bond extraction workflow resulted in 1748 and 1367 targets from PaRoutes set-n1 and Reaxys-JMC, respectively.
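The separation of changed and unchanged bonds in a reference reaction can be sketched as below, assuming that atom-mapping lets each bond be written as a pair of atom-map numbers. The `size_balance` function is only a simple stand-in that rewards equally sized reactants; it is not the convergent disconnection score of ref. 36, whose definition is not reproduced here. All names and the example reaction are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

Bond = FrozenSet[int]


def bond(a: int, b: int) -> Bond:
    """An undirected bond between two atom-map numbers."""
    return frozenset((a, b))


def split_bonds(
    product_bonds: Set[Bond], reactant_bonds: List[Set[Bond]]
) -> Tuple[Set[Bond], Set[Bond]]:
    """Return (changed, unchanged) product bonds for one atom-mapped reaction."""
    kept: Set[Bond] = set().union(*reactant_bonds)
    changed = product_bonds - kept    # candidates for bonds to break
    unchanged = product_bonds & kept  # candidates for bonds to freeze
    return changed, unchanged


def size_balance(reactant_atoms: List[Set[int]]) -> float:
    """Stand-in balance measure: 1.0 when all reactants have the same number of atoms."""
    sizes = [len(atoms) for atoms in reactant_atoms]
    return min(sizes) / max(sizes)


# Hypothetical five-atom product split into a three-atom and a two-atom reactant.
product = {bond(1, 2), bond(2, 3), bond(3, 4), bond(4, 5)}
reactants = [{bond(1, 2), bond(2, 3)}, {bond(4, 5)}]
changed, unchanged = split_bonds(product, reactants)
print(sorted(sorted(b) for b in changed))  # [[3, 4]]
print(size_balance([{1, 2, 3}, {4, 5}]))
```

Choosing bonds to freeze only from the `unchanged` set mirrors the reasoning above: the reference route then remains a valid solution under the constraints.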
Footnotes |
† Electronic supplementary information (ESI) available: Supporting methods, as well as supporting figures of single step model validation, Reaxys-JMC results, round-trip accuracy, expansion policy statistics, and target molecules in the application study, and also a table for supporting the Reaxys-JMC Chemformer-MO vs. MO search analysis. See DOI: https://doi.org/10.1039/d5sc00927h |
‡ A triflate or tosylate would be needed to activate the oxygen and make a viable leaving group. |
§ The thiazole cyclization reaction would benefit from a protecting group on the nitrogen, for example a butyloxycarbonyl (BOC) group. |
¶ The route is missing a step of reducing the nitroarene before transforming into an amide. |
This journal is © The Royal Society of Chemistry 2025 |