Quantifying the Failure Modes of Current One-step Retrosynthesis Models

Abstract

Computer-aided synthesis planning (CASP) automates retrosynthetic analysis, generally by recursively applying one-step retrosynthesis models within multistep search algorithms to simplify a target molecule into commercially available starting materials. Despite their utility, these tools often fail to recover literature-reported pathways. Such failures arise from two causes: either (i) the literature-reported precursor is not proposed at all or (ii) it is proposed but ranked too low to be discovered during multistep search. In this work, we quantify the challenges that data-driven one-step retrosynthesis models face in reproducing the reported precursors. We first evaluate model performance using standard top-k exact-match accuracy and stratify this accuracy by product and reaction complexity, demonstrating a decrease in performance with increasing complexity. This decline is accompanied by a systematic underprediction of the number of reacting atoms and changing rings, indicating a bias toward simpler transformations, even when complex examples are included in the training data. To gain deeper insights into failure modes, we evaluate models with complementary metrics that account for incorrect stereochemistry, leaving groups, and multi-stage reactions. Overall, our work provides a quantitative analysis of how one-step retrosynthesis models fail to capture literature-reported reactions, highlighting opportunities for improving future models and providing guidance on using model predictions more effectively in prospective synthesis planning.
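As an illustration of the top-k exact-match accuracy mentioned in the abstract, the following is a minimal sketch and not the authors' evaluation code. It assumes that each prediction is a dot-separated precursor string whose molecules are already canonicalized, so that plain string comparison is meaningful; the function names and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def normalize(precursors: str) -> Tuple[str, ...]:
    """Turn a dot-separated precursor string into an order-independent key.

    Assumption: every molecule string is already canonicalized upstream.
    """
    return tuple(sorted(part for part in precursors.split(".") if part))


def top_k_exact_match(
    predictions: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int,
) -> float:
    """Fraction of products whose reported precursors appear in the top k."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = 0
    for ranked, reference in zip(predictions, references):
        target = normalize(reference)
        # A hit requires an exact match anywhere in the first k suggestions.
        if any(normalize(candidate) == target for candidate in ranked[:k]):
            hits += 1
    return hits / len(references)


# Hypothetical toy data: one ranked suggestion list per product.
predictions = [
    ["CCO.CC(=O)O", "CC(=O)OCC"],   # reported precursors found at rank 1
    ["Brc1ccccc1", "Ic1ccccc1"],    # reported precursor found at rank 2
    ["CCN"],                        # reported precursor never proposed
]
references = ["CC(=O)O.CCO", "Ic1ccccc1", "CCO"]

top1 = top_k_exact_match(predictions, references, k=1)
top2 = top_k_exact_match(predictions, references, k=2)
```

In this toy case the second product corresponds to failure cause (ii), a correct precursor ranked below the top position, and the third to cause (i), a precursor that is not proposed at all.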

Article information

Article type: Edge Article
Submitted: 14 Feb 2026
Accepted: 03 Jun 2026
First published: 04 Jun 2026
This article is Open Access

All publication charges for this article have been paid for by the Royal Society of Chemistry

Chem. Sci., 2026, Accepted Manuscript

S. B. A. Tran, J. Roh and C. W. Coley, Chem. Sci., 2026, Accepted Manuscript, DOI: 10.1039/D6SC01323F

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
