Zixiao Yang ab, Hanyu Gao c and Xian Kong *ab
aSouth China Advanced Institute for Soft Matter Science and Technology, School of Emergent Soft Matter, South China University of Technology, Guangzhou, China. E-mail: xk@scut.edu.cn
bGuangdong Provincial Key Laboratory of Functional and Intelligent Hybrid Materials and Devices, South China University of Technology, Guangzhou, China
cDepartment of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, SAR, China
First published on 24th December 2025
Understanding and predicting molecular properties remains a central challenge in scientific machine learning, especially when training data are limited or task-specific supervision is scarce. We introduce the molecular equivariant transformer (MET), a symmetry-aware pretraining framework that leverages quantum-derived atomic charge distributions to guide molecular representation learning. MET combines an equivariant graph neural network (EGNN) with a transformer architecture to extract physically meaningful features from three-dimensional molecular geometries. Unlike previous models that rely purely on structural inputs or handcrafted descriptors, MET is pretrained to predict atomic partial charges, which are quantities grounded in quantum chemistry. This enables MET to capture essential electronic information without requiring downstream labels. We show that this pretraining scheme improves performance across diverse molecular property prediction tasks, particularly in low-data regimes. Analyses of the learned representations reveal chemically interpretable structure–property relationships, including the emergence of functional group patterns and smooth alignment with molecular dipoles. Ablation studies confirm that the EGNN encoder plays a crucial role in capturing transferable spatial features, while the transformer layers adapt these features to specific prediction tasks. This architecture draws direct analogies to quantum mechanical basis transformations, where MET learns to transition from coordinate-based to electron-based representations in a symmetry-preserving manner. By integrating domain knowledge with modern deep learning techniques, MET offers a unified and interpretable framework for data-efficient molecular modeling, with broad applications in computational chemistry, drug discovery, and materials science.
Design, System, Application
We propose a molecular design/optimization strategy that learns symmetry-preserving 3D representations by pretraining an equivariant GNN + transformer to predict quantum-derived atomic partial charges. Aligning the latent space with electronic distributions yields physically grounded, transferable descriptors that require minimal labeled data across downstream tasks. The desired system functionality is an interpretable, data-efficient property predictor that (i) preserves rotation/translation equivariance, (ii) captures long-range interactions through attention, and (iii) remains lightweight for routine screening. Key design constraints include access to reasonable 3D coordinates/conformers and charge labels for pretraining, with current validation focused on small organic molecules; the architecture is modular to incorporate multi-conformer inputs, alternative charge schemes, and task-specific heads. Immediate applications include rapid screening of orbital energies, dipoles, total energies, and reactive sites, as well as uncertainty-aware active learning in low-data regimes. Longer-term, the charge-aware latent space can seed closed-loop generative design, accelerate lead optimization and materials discovery, and integrate with physics-guided fine-tuning to extend to larger molecules and condensed-phase systems.
Recent progress in deep learning has opened new avenues for molecular property prediction, whose success hinges on the quality of molecular representations. Traditional encodings—such as SMILES strings and 2D molecular graphs—fail to capture the full complexity of molecular behavior, particularly when three-dimensional (3D) structure plays a critical role. Properties such as dipole moments, reactivity, and orbital energies depend sensitively on the spatial arrangement of atoms and electrons, motivating the development of 3D-aware representations. Graph neural networks (GNNs), including SchNet2 and MPNN,3 have advanced this goal by learning structural features from molecular graphs. However, many models neglect a key physical principle: equivariance under rotation and translation.4,5 Without this symmetry, predictions may vary under rigid transformations of the input, limiting generalizability in tasks involving stereochemistry or conformational flexibility.6 Moreover, most molecular machine learning models rely on unsupervised or self-supervised training due to the scarcity of labeled data. Models such as MolCLR7 and ChemBERTa8 extract general-purpose embeddings from large unlabeled datasets. However, these approaches often bypass a fundamental determinant of molecular properties: the electronic density. Direct supervision using physically meaningful quantities could yield more informative and compact representations, especially in data-limited scenarios. Recent work has injected chemical knowledge into contrastive learning frameworks—such as constructing element- and functional-group-based knowledge graphs—achieving consistent gains across multiple datasets and offering a promising direction for parallel enhancements. Although this method achieves high performance on downstream tasks, it suffers from a lack of accurate data, which makes large-scale pretraining difficult.9
In many practical molecular design and screening problems, the target properties are experimentally measured quantities such as reaction yields, catalytic activities, and solubilities in complex environments. These properties often have limited and noisy datasets—typically only a few hundred to a few thousand reliable labels—and no large corpus exists for property-specific pretraining. As a result, self-supervised approaches alone may not provide sufficiently informative representations, and property-specific models pretrained on massive quantum datasets are not available for most practical observables. In such cases, it is advantageous to pretrain a single encoder on a tractable quantum-derived quantity that captures essential electronic structure, and then fine-tune that encoder across diverse downstream tasks. Recent studies have shown that this cross-task transfer within the same small-molecule domain can substantially improve stability and accuracy in low-data regimes.10
In this work, we propose a streamlined and physically motivated framework that integrates two complementary strategies: symmetry-aware message passing via an equivariant graph neural network (EGNN)11,12 and global representation learning through a transformer architecture. Crucially, we introduce atomic partial charges—readily obtainable proxies for electron density—as supervised targets during pretraining. This charge-guided supervision grounds representation learning in quantum mechanics, resulting in descriptors that are compact, transferable, and data-efficient. Compared to state-of-the-art models that are architecturally complex and data-intensive, our method is intentionally lightweight and accessible, making it feasible for academic laboratories with modest computational resources. By embedding physical priors directly into the model design and training strategy, we offer a more interpretable and scalable approach to accurate molecular property prediction. Compared with deep wavefunction approaches that directly solve the electronic Schrödinger equation,13 our work focuses on data-driven supervised property prediction, offering a more practical trade-off between accuracy and computational cost for large-scale screening.
In summary, our approach bridges the gap between deep learning and first-principles chemistry by incorporating symmetry, geometry, and electron density into a unified molecular representation. This enables robust performance with limited data and presents a promising direction for the next generation of molecular machine learning models.
The initial inputs to the network are the indices of atoms in each molecule, which are embedded into a latent space as atomic embeddings, as shown in Fig. 1(a). EGNN then combines atomic embeddings with positional information and the embeddings themselves, whereas the transformer combines embeddings only with themselves. After the transformer, the atomic embeddings in the latent space are pooled to a single dimension, corresponding to the charges on each atom. Once the pretraining phase is complete, the charge pooling module is discarded, and an additional transformer module is introduced to process the molecular representations in the latent space for subsequent tasks. Below, we describe the network's operation and information processing in detail.
| z_i = σ(W_0 x_i^(l)), | (1) |
[Equations (2) and (3), which define the two message branches m_i^(1) and m_i^(2), were rendered as images in the original and are not recoverable here.]
The outputs of the two branches are then concatenated and fused via a linear mapping, with a residual connection added from the initial transformed node feature:
| h_i = W_cat[m_i^(1) ∥ m_i^(2)] + z_i. |
The fused feature is then normalized, ĥ_i = GraphNorm(h_i), and the updated node embedding for the next layer is obtained by

| x_i^(l+1) = W_final ĥ_i. | (4) |
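A minimal PyTorch sketch of one such message-passing layer may clarify the data flow. Since the branch equations (2) and (3) survive only as image placeholders, the two branch MLPs below (`phi1`, distance-aware; `phi2`, feature-only) are assumptions in the spirit of standard EGNN messages, and all module names (`W0`, `W_cat`, `W_final`) are illustrative:

```python
import torch
import torch.nn as nn

class EGNNLayer(nn.Module):
    """Sketch of one layer following eqn (1)-(4); the two branch MLPs are
    assumed forms standing in for the unrecoverable eqns (2) and (3)."""
    def __init__(self, dim):
        super().__init__()
        self.W0 = nn.Linear(dim, dim)                                      # eqn (1)
        self.phi1 = nn.Sequential(nn.Linear(2 * dim + 1, dim), nn.SiLU())  # distance-aware branch
        self.phi2 = nn.Sequential(nn.Linear(2 * dim, dim), nn.SiLU())      # feature-only branch
        self.W_cat = nn.Linear(2 * dim, dim)                               # fuse the two branches
        self.norm = nn.LayerNorm(dim)                                      # stand-in for GraphNorm
        self.W_final = nn.Linear(dim, dim)                                 # eqn (4)

    def forward(self, x, pos, edge_index):
        src, dst = edge_index                                  # messages flow j (src) -> i (dst)
        z = torch.nn.functional.silu(self.W0(x))               # sigma taken as SiLU here
        d2 = ((pos[src] - pos[dst]) ** 2).sum(-1, keepdim=True)
        m1 = self.phi1(torch.cat([z[dst], z[src], d2], dim=-1))
        m2 = self.phi2(torch.cat([z[dst], z[src]], dim=-1))
        agg1 = torch.zeros_like(z).index_add_(0, dst, m1)      # sum messages at each receiver
        agg2 = torch.zeros_like(z).index_add_(0, dst, m2)
        h = self.W_cat(torch.cat([agg1, agg2], dim=-1)) + z    # concat, fuse, residual from z_i
        return self.W_final(self.norm(h))
```

Note that only squared interatomic distances enter the update, which keeps the node features invariant under rigid rotations and translations of `pos`.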
After a series of such message-passing layers, the final node features are encoded into a latent representation via an additional linear transformation. These enriched features, which encapsulate the molecular geometric and chemical information, are then further processed by a transformer module.
In the transformer module, the query (Q), key (K), and value (V) matrices are computed by applying linear transformations to the input latent representations z: Q^(l) = W_Q^(l) z^(l), K^(l) = W_K^(l) z^(l), and V^(l) = W_V^(l) z^(l), where W_Q^(l), W_K^(l), and W_V^(l) are learnable weight matrices specific to the query, key, and value transformations in layer l, each of dimension d × d (where d is the dimension of the atomic latent space).19 These transformations project the input embeddings into different subspaces, enabling the model to compute attention scores that determine the relevance of each atom's information with respect to others in the molecular graph.
The self-attention mechanism computes a weighted sum of the value vectors V based on the compatibility between query Q and key K vectors,

| Attention(Q, K, V) = softmax(QK^T / √d) V, | (5) |

where the division by √d scales the dot products to prevent excessively large values. This mechanism allows the transformer to focus on relevant parts of the molecular graph, effectively capturing global interactions and dependencies among atoms.
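The attention step above can be sketched directly; the function and variable names are illustrative, and the weight matrices are d × d as in the text:

```python
import torch

def self_attention(z, W_Q, W_K, W_V):
    """Scaled dot-product attention over atom embeddings z (n_atoms x d),
    matching eqn (5)."""
    Q, K, V = z @ W_Q, z @ W_K, z @ W_V
    d = Q.shape[-1]
    weights = torch.softmax(Q @ K.T / d ** 0.5, dim=-1)  # each row sums to 1
    return weights @ V                                   # weighted sum of values
```

Because every atom attends to every other atom, this step captures global dependencies that local message passing alone would need many layers to propagate.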
For downstream applications predicting other chemical properties, the output architecture is adapted to match the task granularity. The pretrained model's charge-pooling module is removed, decoupling it from the core representation. A task-specific output module is then introduced to process the latent-space representations. For downstream tasks such as HOMO/LUMO energy prediction, the pretrained model was adapted by replacing its pooling layers with task-specific MLP heads. Specifically, the output z from the EGNN backbone was processed as ŷ = Pooling(MLP_task(z)), where MLP_task transforms the node-level representations and the pooling step aggregates them into a molecule-level embedding. Fine-tuning was performed with a frozen EGNN backbone using the same optimization procedure as pretraining; the rationale for freezing the EGNN backbone is discussed in section 3.3.
This module flexibly generates either: 1) atomic-level properties (e.g., reactive site identification) by transforming embeddings through dedicated decoders, or 2) molecular-level properties (e.g., HOMO/LUMO energies, reaction propensity) by aggregating atomic features (via pooling or attention) followed by prediction heads. This adaptive design maintains the hierarchical feature synergy from the EGNN and transformer while enabling diverse chemical predictions.
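A minimal sketch of such a molecule-level head, assuming mean pooling over atoms grouped by a per-atom molecule index (`batch`); the class and variable names are hypothetical:

```python
import torch
import torch.nn as nn

class PropertyHead(nn.Module):
    """Task head implementing y = Pooling(MLP_task(z)) with mean pooling."""
    def __init__(self, dim, out_dim=1):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, out_dim))

    def forward(self, z, batch):
        y = self.mlp(z)                                         # node-level transform
        n_mol = int(batch.max()) + 1
        total = torch.zeros(n_mol, y.shape[-1]).index_add_(0, batch, y)
        count = torch.zeros(n_mol, 1).index_add_(0, batch, torch.ones(len(batch), 1))
        return total / count                                    # mean pool per molecule

# Freezing the pretrained backbone before fine-tuning (backbone name illustrative):
# for p in egnn_backbone.parameters():
#     p.requires_grad = False
```

For atomic-level targets (e.g. reactive sites), the pooling step is simply omitted and the MLP output is read per atom.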
The QM9 dataset comprises 133 885 small organic molecules with up to 9 heavy atoms. We selected molecules with complete electronic structure data, including atomic Mulliken population partial charges21 (B3LYP/6-31G(d) level), HOMO/LUMO energies, and molecular dipole moments. A more detailed summary of the chemical composition of QM9 and QM7 is provided in Fig. S1, which shows representative structures, elemental distributions, molecular-weight histograms, and functional-group statistics. These analyses highlight that both datasets consist of small neutral organic molecules with similar elemental makeup and a broad variety of common functional motifs, making QM9 a chemically well-matched pretraining source for downstream evaluation on QM7. The dataset was randomly split into training and test sets using an 80% : 20% ratio.
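The random 80% : 20% split can be sketched as follows (the seed and function name are illustrative, not from the paper):

```python
import numpy as np

def train_test_split_indices(n, train_frac=0.8, seed=0):
    """Random index split into train/test subsets, as used for QM9 here."""
    idx = np.random.default_rng(seed).permutation(n)  # shuffle molecule indices
    n_train = int(train_frac * n)
    return idx[:n_train], idx[n_train:]
```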
777 molecules), where the average number of neighbours per atom saturates beyond 8 Å (Table 1). From a chemical perspective, covalent bond lengths typically range between 0.7 and 2.5 Å. Thus, a cutoff of 8 Å captures interactions up to three times longer than a typical bond, encompassing both direct and medium-range structural correlations, while avoiding an excessive number of long-range edges that contribute little to local message passing. Geometries were taken from the QM9 and QM7 datasets and processed using a custom PyTorch-based data loader.
| r cut (Å) | Avg. neighbours/atom | Avg. neighbours/molecule |
|---|---|---|
| 2.0 | 2.63 | 47.32 |
| 3.0 | 8.21 | 147.81 |
| 4.0 | 12.49 | 224.83 |
| 6.0 | 16.86 | 303.39 |
| 8.0 | 17.44 | 313.87 |
| 10.0 | 17.48 | 314.52 |
| 12.0 | 17.48 | 314.55 |
| 15.0 | 17.48 | 314.55 |
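The neighbour statistics in Table 1 come down to counting atom pairs within the cutoff; a minimal sketch for a single molecule (function name illustrative):

```python
import numpy as np

def avg_neighbours_per_atom(coords, r_cut):
    """Average number of neighbours within r_cut (Å) for one molecule;
    coords is an (n_atoms x 3) array of Cartesian positions."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    neighbours = (d > 0) & (d < r_cut)   # exclude self-pairs
    return neighbours.sum(axis=1).mean()
```

Averaging this quantity over all molecules at each cutoff reproduces the saturation behaviour seen beyond 8 Å.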
…_i and the reference values q_i:

| Dataset | Unit | D-MPNN | Attentive FP | N-Gram_RF | N-Gram_XGB | Pretrain GNN | GEM | Uni-Mol | MET |
|---|---|---|---|---|---|---|---|---|---|
| QM9 | Hartree | 0.00814 | 0.00812 | 0.01037 | 0.00964 | 0.00922 | 0.00746 | 0.00467 | 0.00344 ± 0.00006 |
| QM7 | kcal mol−1 | 103.5 | 72.0 | 92.8 | 81.9 | 113.2 | 58.9 | 41.8 | 35.3 ± 5.8 |

Note: error bars represent the standard deviation across three runs with different random seeds. The evaluation metric for QM7 is total energy, and for QM9 it covers HOMO, LUMO, and energy gap. MET outperforms the baseline models on all tasks.
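The metric reported in the table is the root-mean-square error, computed in the usual way:

```python
import numpy as np

def rmse(pred, ref):
    """Root-mean-square error between predicted and reference values."""
    pred, ref = np.asarray(pred, float), np.asarray(ref, float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))
```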
On QM9, our primary dataset comprising approximately 134 000 small organic molecules, MET was pretrained using atomic partial charges as supervised labels. This physically motivated pretraining strategy enabled efficient transfer learning for downstream property prediction tasks, notably HOMO, LUMO, and energy gap estimation. As shown in Table 2, MET achieved a remarkably low RMSE of 0.00344 hartree across the three tasks, significantly surpassing the baseline models. This highlights MET's effective utilization of charge-based representations, even for tasks beyond its direct pretraining objective.
Despite the differences in both molecular composition and quantum-chemical protocol compared with QM9, MET achieved an RMSE of 35.3 ± 5.8 kcal mol−1 on QM7 (Table 1), improving upon the best Uni-Mol baseline (41.8 kcal mol−1) by about 16%. To put this error into context, the reference energies in QM7 span a range from −2192.0 to −404.9 kcal mol−1 with a standard deviation of 223.9 kcal mol−1; the MET RMSE therefore corresponds to roughly 2% of the total energy range and about 0.16 times the dataset standard deviation. This accuracy is not yet sufficient to replace high-level DFT calculations when chemically precise absolute energies are required, but it is adequate for coarse ranking and pre-screening and demonstrates that the charge-supervised representation transfers reliably to a distinct dataset computed at a different level of theory.
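The two quoted ratios follow directly from the numbers in the text:

```python
rmse_qm7 = 35.3                       # kcal/mol, from Table 2
e_min, e_max, sigma = -2192.0, -404.9, 223.9
energy_range = e_max - e_min          # 1787.1 kcal/mol
print(round(rmse_qm7 / energy_range, 3))  # 0.02  -> ~2% of the range
print(round(rmse_qm7 / sigma, 2))         # 0.16  -> ~0.16 sigma
```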
Collectively, these benchmarking results underscore MET's exceptional capability for predicting molecular properties by leveraging physically meaningful atomic charges and symmetry-aware architectures. MET effectively reduces data demands and computational overhead through transfer learning and fine-tuning strategies, establishing a powerful yet computationally accessible framework for molecular property prediction in computational chemistry.
000 molecules). This superior performance in data-scarce regimes highlights that pretraining effectively incorporates physically informed inductive biases, enhancing robustness and mitigating overfitting. However, when dataset sizes surpassed roughly 20 000 molecules, the benefits of pretraining diminished as models trained from scratch gradually achieved comparable performance. This shift underscores that sufficiently large datasets enable end-to-end trained models to directly capture task-specific molecular features without explicit inductive priors.
To further probe the generalization capability and data efficiency imparted by MET's pretraining, we conducted an additional evaluation using the QM7 dataset, predicting molecular conformational energies computed at the B3LYP/6-31G* level. To intentionally simulate a data-limited scenario, we randomly selected only 500 molecules from QM7 for training. Remarkably, the pretrained MET model attained a predictive performance of R2 = 0.548, surpassing the R2 = 0.336 achieved by the model trained from scratch (Fig. 2(c)). These results convincingly demonstrate that MET's pretraining strategy significantly enhances predictive reliability, even when transferring across distinct molecular datasets and properties, thus affirming the practical advantages of leveraging physically motivated pretraining in resource-constrained scenarios.
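The coefficient of determination used in this comparison is the standard R², computed as:

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = ((y_true - y_pred) ** 2).sum()
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot
```

An R² of 0 corresponds to predicting the dataset mean, so the pretrained model's 0.548 versus 0.336 from scratch is a substantial gain at 500 training molecules.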
For the pretraining task of atomic charge prediction, predictive performance remained robust and exhibited minimal sensitivity to changes in embedding dimensionality. This insensitivity likely arises because atomic charges are predominantly influenced by local electronic environments and fundamental chemical features, which can be effectively captured by relatively compact representations even at lower dimensions. Consequently, additional embedding dimensions offer negligible incremental information for this inherently local and chemically constrained prediction task.
In contrast, downstream dipole moment predictions exhibited substantial sensitivity to embedding dimensionality. Predictive performance deteriorated sharply when the embedding dimension dropped below 64 and improved progressively with increasing dimensionality, reaching a plateau at 128 dimensions. Notably, similar trends were consistently observed across datasets of different sizes (1000, 5000, and 100 000 molecules). Although absolute predictive performance improved with increasing dataset size, the saturation at 128 dimensions remained consistent, indicating a fundamental representational capacity limit independent of dataset size. This suggests that the complexity of global molecular properties such as dipole moments requires richer representations capable of capturing nuanced interatomic interactions and spatial correlations, which are effectively encoded within higher-dimensional embeddings.
Importantly, these dimensional analyses provide deeper insights into MET's representational capabilities and limitations. While local chemical properties (atomic charges) benefit less from extensive embedding dimensionality, global electronic properties require more expressive embeddings to fully capture the complex, long-range correlations within molecular structures. The saturation observed at 128 dimensions suggests an optimal trade-off, balancing model expressivity with computational efficiency and preventing parameter redundancy. Accordingly, 128-dimensional embeddings were employed throughout subsequent studies, ensuring robust and computationally efficient model performance.
These results emphasize the complementary roles of pretraining, symmetry-aware EGNN backbones, and embedding dimensionality in determining MET's performance. Pretraining on quantum-derived partial charges provides inductive priors that mitigate overfitting in low-data regimes, while EGNN layers deliver symmetry-preserving representations that generalize across molecular conformations and computational conditions. Appropriate control of embedding dimensionality further balances capacity and stability, ensuring that MET performs robustly across diverse molecular property prediction tasks. These insights facilitate informed architectural decisions and highlight key theoretical considerations underpinning successful molecular representation learning strategies.
Fig. 3 illustrates the evolution of the model's predictive capability, quantified by the validation R2 scores, alongside the corresponding number of trainable parameters as additional layers are progressively unfrozen. Starting from the left, only the final transformer layers were trainable, representing minimal adaptation to the downstream task. This configuration exhibited the lowest predictive performance, highlighting that fine-tuning exclusively at the transformer level is insufficient to achieve high-quality predictions. Predictive performance significantly improved as more layers became trainable, initially encompassing the transformer, encoder, and linear layers. This observed improvement clearly indicates that while the pretrained layers encode useful generic molecular features, their representations still require refinement and adaptation through task-specific fine-tuning to achieve optimal predictions.
Notably, the largest gains in performance emerged upon unfreezing layers up to, but not including, the equivariant graph neural network (EGNN) layers. Once the EGNN layers became trainable, the incremental improvement in prediction accuracy plateaued, suggesting that EGNN layers inherently capture core molecular features crucial for accurate property prediction. This result confirms that EGNNs effectively encode geometric and spatial invariances that generalize across molecular systems, making them particularly suitable for transfer learning scenarios involving small-scale fine-tuning datasets. In other words, EGNN layers, by virtue of their symmetry-aware inductive biases, substantially reduce the amount of data required for effective downstream fine-tuning.
Furthermore, fully unfreezing the embedding layers—effectively transforming the model into a scenario analogous to training from scratch—led to a notable degradation in predictive performance, with accuracy becoming even poorer than scenarios with only the transformer layers unfrozen. This performance decline is attributable to the substantial increase in trainable parameters overwhelming the limited dataset, resulting in severe overfitting. Consequently, these findings emphasize the necessity of pretraining, demonstrating that pretrained embeddings and EGNN features are vital for stabilizing model training and maintaining robust predictive capabilities, especially when fine-tuning on small datasets.
This ablation analysis reveals that the EGNN component serves as a fundamental encoder of chemically meaningful and transferable geometric representations, conferring data efficiency and generalization advantages in fine-tuning scenarios. Additionally, the detrimental effect of fully trainable embeddings underscores the critical role of pretraining in providing stable initializations, especially in data-scarce regimes. These insights provide valuable guidelines for future model development and practical deployment, underscoring how thoughtful selection of trainable parameters during fine-tuning can balance representational flexibility with model generalizability.
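The progressive-unfreezing protocol behind this ablation can be sketched with a small helper that freezes everything and then re-enables gradients only for named sub-modules; the prefix names are illustrative of how the MET sub-modules might be registered:

```python
import torch.nn as nn

def set_trainable(model, prefixes):
    """Freeze all parameters, then unfreeze those whose qualified names start
    with any of the given prefixes (e.g. ('transformer',) or
    ('transformer', 'encoder', 'egnn'))."""
    for name, p in model.named_parameters():
        p.requires_grad = any(name.startswith(pre) for pre in prefixes)
```

Sweeping the prefix set from the output side toward the embeddings, and recording validation R² at each step, reproduces the layer-wise analysis of Fig. 3.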
We first applied the dimensionality reduction technique t-distributed Stochastic Neighbor Embedding (t-SNE)29 to visualize MET-derived latent vectors from the QM9 dataset. As shown in Fig. 4(a), we plot the two t-SNE components, coloring each molecule by its dipole moment. Notably, the embeddings occupy the latent space uniformly, forming a roughly spherical distribution that maximizes representation efficiency. Importantly, molecules with different dipole magnitudes exhibit systematic spatial arrangements; specifically, decreases in the second t-SNE component correspond closely to reductions in molecular dipole values. This systematic relationship strongly suggests that the pretrained latent vectors inherently encode electronic distribution features, reflecting our strategy of using partial charges as supervised labels during pretraining.
To further probe the chemical interpretability of the embeddings, we visualized the presence or absence of specific functional groups in the latent space (Fig. 4(b) and (c)). As illustrative examples, we chose fluoro- and carboxy-containing molecules from the QM9 dataset. In both cases, clear separations between molecules bearing these functional groups and those without are apparent. It is noteworthy that similar clustering behaviors were consistently observed for other representative functional groups (e.g., hydroxy, cyano). These distinct and chemically meaningful clusters demonstrate a strong alignment between latent-space structure and functional-group chemistry, confirming that MET's embedding space accurately reflects structural and functional chemical information.
To systematically evaluate MET's ability to distinguish among chemically similar functional groups, we constructed a targeted test set of molecules with controlled functional variations. Specifically, we fixed an alkane backbone (C8H18) and systematically replaced hydrogen atoms with representative functional groups (fluoro, hydroxy, cyano, and carboxy), yielding 50 distinct molecules per group. The resulting latent space t-SNE analysis (Fig. 4(d)) reveals clear separation among these functional classes. Particularly, MET embeddings group fluoro- and hydroxy-substituted molecules closely together, aligning with established chemical knowledge that both groups exhibit strong electron-withdrawing inductive effects. Similarly, molecules bearing cyano and carboxy functionalities cluster adjacent to each other, consistent with their shared resonance-driven electron-withdrawing properties. This chemically meaningful embedding arrangement further validates MET's effective utilization of partial charge pretraining, reinforcing its robustness and interpretability.
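The 2D projections in Fig. 4 can be reproduced with scikit-learn's t-SNE; here `latents` is a hypothetical (n_molecules × 128) array of MET embeddings and the function name is illustrative:

```python
import numpy as np
from sklearn.manifold import TSNE

def project_2d(latents, seed=0):
    """Project high-dimensional embeddings to 2D for visualization."""
    perplexity = min(30, len(latents) - 1)  # perplexity must stay below n_samples
    return TSNE(n_components=2, perplexity=perplexity,
                random_state=seed).fit_transform(np.asarray(latents))
```

Coloring the resulting 2D points by dipole moment or by functional-group membership then yields plots analogous to Fig. 4(a)-(d).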
The alignment analysis highlights MET's capability to generate chemically informative and systematically organized molecular representations.30 By employing physically meaningful pretraining targets, MET achieves superior alignment performance, thus facilitating interpretability and enhancing its predictive utility in molecular property predictions.
The effectiveness of the model can be understood from the perspective of quantum chemistry. Molecular properties are governed by both atomic coordinates and the associated electronic distribution. In our approach, the EGNN acts as a geometric encoder that learns symmetry-consistent features from the spatial configuration of atoms. The transformer component further refines these features into representations that reflect underlying electronic properties. This two-step process resembles the transformation of a molecular wavefunction from real space into a basis that encodes chemically meaningful quantities, such as partial charges. Through this design, the model acquires a latent representation that simultaneously preserves spatial symmetries and captures essential electronic information, enabling robust transfer across diverse molecular tasks.
We have shown that the pretrained model performs well across a variety of downstream prediction tasks, including dipole moments and frontier orbital energies. Analyses of the embedding space revealed that it encodes both global molecular characteristics and local functional group information. The benefit of pretraining is particularly pronounced in data-limited settings, where fine-tuning models initialized from pretrained representations leads to substantially better performance compared to training from scratch. We also demonstrated successful generalization to the QM7 dataset, even with a training set comprising only 500 molecules, further highlighting the versatility of the proposed method.
Beyond strong performance in small-data settings, MET offers distinct advantages for tasks that require a unified latent representation across molecular properties. While property-specific models are optimized for a single target, MET learns a chemically meaningful embedding space through charge-supervised pretraining. This embedding reflects both global electronic features and local structural motifs, enabling consistent organization of molecules by dipole moment and functional groups. Such a representation can be reused across diverse objectives—property prediction, optimization, or generative design—without retraining separate models. In this way, MET serves as a general-purpose molecular encoder that supports cross-task transfer, interpretable learning, and multi-objective workflows, even when property-specific data is scarce or unavailable.
Despite these strengths, the current framework has certain limitations. It has not yet been tested on larger molecules or materials systems, and its performance on experimental datasets such as those involving solubility or pharmacokinetics requires further investigation. Future efforts should focus on extending the model to incorporate multi-conformational inputs, improving atomic coordinate generation, and integrating hybrid learning strategies that combine supervised learning with self-supervised or physically constrained objectives.
In conclusion, this study provides a physically informed machine learning strategy for molecular property prediction. By grounding the learning process in atomic charges and enforcing spatial symmetries, the proposed model achieves both accuracy and generalizability. This work establishes a foundation for future developments in physics-aware molecular modeling, with promising implications for applications in drug discovery, materials design, and chemical informatics.
Supplementary information (SI): the SI (Fig. S1) summarizes the chemical composition and coverage of the QM9 (pretraining) and QM7 (evaluation) datasets, including representative molecules, heavy-atom counts by element, molecular-weight distributions, and functional-group frequency charts, showing that both datasets comprise small neutral organic molecules with similar elemental makeup and a broad variety of common functional motifs. See DOI: https://doi.org/10.1039/d5me00173k.
This journal is © The Royal Society of Chemistry 2026