Integrating Equivariant Architectures and Charge Supervision for Data-Efficient Molecular Property Prediction

Abstract

Predicting molecular properties remains a central challenge in scientific machine learning, especially when training data are limited or task-specific supervision is scarce. We introduce the Molecular Equivariant Transformer (MET), a symmetry-aware pretraining framework that uses quantum-derived atomic charge distributions to guide molecular representation learning. MET combines an Equivariant Graph Neural Network (EGNN) with a Transformer architecture to extract physically meaningful features from three-dimensional molecular geometries. Unlike previous models that rely purely on structural inputs or handcrafted descriptors, MET is pretrained to predict atomic partial charges, quantities grounded in quantum chemistry. This lets MET capture essential electronic information without requiring downstream labels. We show that this pretraining scheme improves performance across diverse molecular property prediction tasks, particularly in low-data regimes. Analyses of the learned representations reveal chemically interpretable structure-property relationships, including the emergence of functional group patterns and smooth alignment with molecular dipoles. Ablation studies confirm that the EGNN encoder plays a crucial role in capturing transferable spatial features, while the Transformer layers adapt these features to specific prediction tasks. The architecture is directly analogous to a quantum mechanical basis transformation: MET learns to transition from coordinate-based to electron-based representations in a symmetry-preserving manner. By integrating domain knowledge with modern deep learning techniques, MET offers a unified and interpretable framework for data-efficient molecular modeling, with broad applications in computational chemistry, drug discovery, and materials science.
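As a rough illustration of the symmetry-aware encoding and charge supervision described above, the following is a minimal NumPy sketch of one generic EGNN-style message-passing step with a per-atom charge readout. It is not the authors' implementation: the function names, weight shapes, `tanh` nonlinearities, fully connected graph, and mean-squared-error objective are all assumptions made for illustration, and the Transformer stage is omitted entirely.

```python
import numpy as np


def init_params(rng, d, k):
    """Random weights for the sketch (hypothetical shapes, not taken from the paper)."""
    return {
        "W_e": rng.normal(scale=0.1, size=(2 * d + 1, k)),  # message weights
        "W_x": rng.normal(scale=0.1, size=(k, 1)),          # coordinate-update weights
        "W_h": rng.normal(scale=0.1, size=(d + k, d)),      # node-update weights
        "w_q": rng.normal(scale=0.1, size=(d,)),            # per-atom charge readout
    }


def egnn_layer(h, x, p):
    """One EGNN-style step on a fully connected graph (an assumed topology).

    h: (n, d) rotation-invariant node features; x: (n, 3) atomic coordinates.
    Returns updated (h, x): h stays invariant, x transforms with the input frame.
    """
    n = h.shape[0]
    diff = x[:, None, :] - x[None, :, :]                # (n, n, 3) relative positions
    dist2 = np.sum(diff ** 2, axis=-1, keepdims=True)   # (n, n, 1) invariant distances
    h_i = np.repeat(h[:, None, :], n, axis=1)           # (n, n, d) receiver features
    h_j = np.repeat(h[None, :, :], n, axis=0)           # (n, n, d) sender features
    # Messages depend only on invariant inputs, so they are invariant too.
    m = np.tanh(np.concatenate([h_i, h_j, dist2], axis=-1) @ p["W_e"])
    m = m * (1.0 - np.eye(n)[:, :, None])               # drop self-messages
    # Coordinates move along relative position vectors, scaled by invariant gates.
    x_new = x + np.sum(diff * (m @ p["W_x"]), axis=1) / max(n - 1, 1)
    # Node features are updated from aggregated invariant messages.
    h_new = h + np.tanh(np.concatenate([h, m.sum(axis=1)], axis=-1) @ p["W_h"])
    return h_new, x_new


def predict_charges(h, p):
    """Linear readout of one scalar (partial charge) per atom."""
    return h @ p["w_q"]


def charge_loss(pred, ref):
    """Mean squared error against reference charges (assumed pretraining loss)."""
    return float(np.mean((pred - ref) ** 2))


# Usage: compare outputs before and after a random rigid transformation.
rng = np.random.default_rng(0)
n, d, k = 5, 8, 16
h = rng.normal(size=(n, d))
x = rng.normal(size=(n, 3))
p = init_params(rng, d, k)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
t = rng.normal(size=(1, 3))                   # random translation
h1, x1 = egnn_layer(h, x, p)
h2, x2 = egnn_layer(h, x @ Q.T + t, p)
print(np.allclose(h1, h2), np.allclose(x1 @ Q.T + t, x2))
```

Because the charge readout uses only the invariant features `h`, the predicted charges in this sketch do not change when the molecule is rotated or translated, which is the property a symmetry-aware encoder is meant to guarantee.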

Article information

Article type: Paper
Submitted: 15 Sep 2025
Accepted: 21 Dec 2025
First published: 24 Dec 2025

Mol. Syst. Des. Eng., 2026, Accepted Manuscript

Z. Yang, X. Kong and H. Gao, Mol. Syst. Des. Eng., 2026, Accepted Manuscript, DOI: 10.1039/D5ME00173K
