Automated synthesis and fragment descriptor-based machine learning for retention time prediction in supercritical fluid chromatography

Abstract

The integration of automated synthesis and machine learning (ML) is transforming analytical chemistry by enabling data-driven approaches to method development. Chromatographic column selection, a critical yet time-consuming step in separation science, stands to benefit substantially from such advances. Here, we report a workflow that combines automated synthesis of a structurally diverse amide library with fragment descriptor-based ML for retention time prediction in supercritical fluid chromatography (SFC). Retention data were systematically acquired on the recently developed DCpak® PBT column, providing one of the first structured datasets for this stationary phase. Benchmarking revealed that fragment-count descriptors (ChyLine and CircuS) substantially outperformed conventional molecular fingerprints, delivering higher predictive accuracy and more interpretable relationships between substructures and retention behavior. External validation underscored the role of chemical space coverage, while visualization techniques such as ColorAtom analysis offered mechanistic insight into model decisions. By uniting automated synthesis with chemoinformatics-driven ML, this study demonstrates a scalable approach to generating high-quality training data and predictive models for chromatography. Beyond retention prediction, the framework exemplifies how data-centric strategies can accelerate column characterization, reduce reliance on trial-and-error experimentation, and advance the development of autonomous, high-throughput analytical workflows.
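The contrast between fragment-count descriptors and binary fingerprints can be illustrated with a small, self-contained sketch. This is not the ChyLine or CircuS implementation used in the study. It is a toy enumeration of linear atom paths on hand-written molecular graphs, and the graphs, function names, and `max_len` parameter are all assumptions made for illustration (bond orders and hydrogens are ignored).

```python
from collections import Counter

# Hypothetical toy graphs (heavy atoms only, bond orders ignored).
# Acetamide: CH3-C(=O)-NH2
ACETAMIDE_ATOMS = {0: "C", 1: "C", 2: "O", 3: "N"}
ACETAMIDE_BONDS = [(0, 1), (1, 2), (1, 3)]
# N-methylacetamide: CH3-C(=O)-NH-CH3
NMA_ATOMS = {0: "C", 1: "C", 2: "O", 3: "N", 4: "C"}
NMA_BONDS = [(0, 1), (1, 2), (1, 3), (3, 4)]


def build_adjacency(atoms, bonds):
    """Return an undirected adjacency list keyed by atom index."""
    adj = {i: [] for i in atoms}
    for a, b in bonds:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def linear_fragment_counts(atoms, bonds, max_len=3):
    """Count linear atom paths containing 1..max_len atoms.

    Each undirected path is counted once; its label is the
    lexicographically smaller of the forward and reversed symbol strings.
    """
    adj = build_adjacency(atoms, bonds)
    seen = set()
    counts = Counter()

    def extend(path):
        # Direction-independent key so a path and its reverse count once.
        key = min(tuple(path), tuple(reversed(path)))
        if key not in seen:
            seen.add(key)
            symbols = [atoms[i] for i in path]
            label = min("-".join(symbols), "-".join(reversed(symbols)))
            counts[label] += 1
        if len(path) < max_len:
            for nxt in adj[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for start in atoms:
        extend([start])
    return counts


def to_vectors(counts, vocabulary):
    """Return (count vector, binary presence vector) over a fixed vocabulary."""
    count_vec = [counts.get(label, 0) for label in vocabulary]
    binary_vec = [1 if c > 0 else 0 for c in count_vec]
    return count_vec, binary_vec


acetamide = linear_fragment_counts(ACETAMIDE_ATOMS, ACETAMIDE_BONDS)
nma = linear_fragment_counts(NMA_ATOMS, NMA_BONDS)

# A shared vocabulary keeps the two vectors aligned column by column.
vocabulary = sorted(set(acetamide) | set(nma))
acetamide_counts, acetamide_bits = to_vectors(acetamide, vocabulary)
nma_counts, nma_bits = to_vectors(nma, vocabulary)

for label, a, b in zip(vocabulary, acetamide_counts, nma_counts):
    print(f"{label:8s} acetamide={a} N-methylacetamide={b}")
```

In this toy case the `C-N` fragment occurs once in acetamide and twice in N-methylacetamide. The count vectors keep that difference, whereas the binary presence vectors record the same value for both molecules in that column, which is one plausible reason count-based descriptors can carry more information for a regression target such as retention time.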

Article information

Article type: Paper
Submitted: 29 Sep 2025
Accepted: 24 Nov 2025
First published: 26 Nov 2025
This article is Open Access
Creative Commons BY license

Digital Discovery, 2026, Advance Article

S. Sartyoungkul, B. Sakthivel, P. Sidorov and Y. Nagata, Digital Discovery, 2026, Advance Article, DOI: 10.1039/D5DD00437C

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
