Issue 43, 2025

High-throughput electronic property prediction of cyclic molecules with 3D-enhanced machine learning

Abstract

Complex organic molecules play a pivotal role in bioactive compounds and organic functional materials, yet existing molecular datasets lack structural diversity for such systems, limiting the generalizability of machine learning (ML) models. This study introduces a high-quality dataset, Ring Vault, comprising 201 546 cyclic molecules, including monocyclic, bicyclic, and tricyclic systems, spanning 11 non-metallic elements. This dataset covers a wide chemical space and provides a robust foundation for molecular property prediction. Leveraging quantum mechanical (QM) calculations on a subset (36 000 molecules), we trained three ML models (Graph Attention Network, Chemprop, and AIMNet2) to predict five key electronic properties: HOMO–LUMO gap, ionization potential (IP), electron affinity (EA), and redox potentials (Eox, Ered). The fine-tuned AIMNet2 model, incorporating 3D conformational information, outperformed 2D-based models, achieving R² values exceeding 0.95 and reducing mean absolute errors (MAEs) by over 30%. Principal component analysis (PCA) of AIMNet2 embeddings revealed intrinsic correlations between electronic properties and structural features, such as conjugation extent and functional group effects. This work establishes a robust framework for high-throughput screening and rational design of cyclic molecules, with applications spanning drug discovery, organic electronics, and energy materials. The dataset and methodology provide a foundation for exploring complex structure–property relationships and accelerating functional molecule discovery.
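The two figures of merit quoted in the abstract, R² and MAE, can be illustrated with a minimal sketch. The function names and the toy HOMO–LUMO gap values below are hypothetical and are not taken from the article or its dataset; they only show how a relative MAE reduction between a 2D-based and a 3D-based model would be computed.

```python
from statistics import fmean


def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between reference and predicted values."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_mae_reduction(mae_baseline, mae_model):
    """Fraction by which the model lowers the baseline MAE."""
    return (mae_baseline - mae_model) / mae_baseline


# Hypothetical HOMO-LUMO gaps in eV (illustrative only, not data from the paper).
gap_reference = [4.0, 5.0, 6.0, 7.0]   # stand-in for QM reference values
gap_2d_model = [4.4, 4.6, 6.4, 6.6]    # stand-in for a 2D graph model
gap_3d_model = [4.2, 4.8, 6.2, 6.8]    # stand-in for a 3D-aware model

mae_2d = mean_absolute_error(gap_reference, gap_2d_model)
mae_3d = mean_absolute_error(gap_reference, gap_3d_model)
r2_3d = r_squared(gap_reference, gap_3d_model)
reduction = relative_mae_reduction(mae_2d, mae_3d)

print(f"MAE (2D): {mae_2d:.3f} eV, MAE (3D): {mae_3d:.3f} eV")
print(f"R2 (3D): {r2_3d:.3f}, relative MAE reduction: {reduction:.0%}")
```

A relative reduction is computed against the baseline MAE, so a claim such as "over 30%" corresponds to `relative_mae_reduction(...) > 0.30` under this definition.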

Article information

Article type: Edge Article
Submitted: 04 Jun 2025
Accepted: 20 Sep 2025
First published: 02 Oct 2025
This article is Open Access

All publication charges for this article have been paid for by the Royal Society of Chemistry
Creative Commons BY license

Chem. Sci., 2025, 16, 20553–20563

P. Zheng and O. Isayev, Chem. Sci., 2025, 16, 20553, DOI: 10.1039/D5SC04079E

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
