Moment of inertia as a simple shape descriptor for diffusion-based shape-constrained molecular generation

Abstract

This article introduces MLConformerGenerator, a machine-learning framework for shape-constrained molecular generation. It combines an Equivariant Diffusion Model (EDM), guided by a compact shape descriptor based on the principal components of the moment of inertia tensor, with a Graph Convolutional Network (GCN) for bond prediction. The descriptor gives a concise yet informative representation of molecular shape, enabling scalable learning from large datasets and from synthetic conformers generated from 2D molecular inputs. GCN-based bond prediction is evaluated against deterministic methods. The model can be fine-tuned to generate datasets whose chemical-feature distributions closely match those of target datasets of real conformers. It supports generation conditioned on both explicit conformers and arbitrary shapes, offering flexibility for applications such as dataset augmentation and structure-based molecule design. Trained on over 1.6 million molecules, the model generates chemically valid, structurally diverse molecules that conform to target shape constraints. It achieves an average shape similarity of 0.53 to a reference conformer, with peak similarity exceeding 0.9, a performance comparable to that of analogous models relying on more complex descriptors. The results show that integrating physically grounded descriptors with modern generative architectures provides a robust and effective strategy for shape-constrained molecular design.
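To make the descriptor concrete, the following is a minimal sketch of how the principal moments of inertia of a conformer could be computed from atomic coordinates. It is an illustration under stated assumptions, not the authors' implementation: the function names `inertia_tensor` and `principal_moments` are hypothetical, the choice of masses (atomic or unit) is left to the caller, and any normalisation or ordering used in MLConformerGenerator is not reproduced here. The eigenvalues are obtained with the standard closed-form trigonometric method for symmetric 3x3 matrices, so only the Python standard library is needed.

```python
import math


def inertia_tensor(coords, masses):
    """Return the 3x3 inertia tensor about the centre of mass.

    coords: sequence of (x, y, z) tuples; masses: matching sequence of weights.
    Whether to use atomic masses or unit masses is an assumption left to the caller.
    """
    total = sum(masses)
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, c in zip(masses, coords):
        r = [c[k] - com[k] for k in range(3)]
        r2 = sum(x * x for x in r)
        for i in range(3):
            for j in range(3):
                # I_ij = sum_k m_k * (|r_k|^2 * delta_ij - r_ki * r_kj)
                delta = r2 if i == j else 0.0
                tensor[i][j] += m * (delta - r[i] * r[j])
    return tensor


def principal_moments(tensor):
    """Return the eigenvalues of a symmetric 3x3 matrix in ascending order."""
    a = tensor
    off = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if off == 0.0:
        # Already diagonal: the principal moments are the diagonal entries.
        return sorted(a[i][i] for i in range(3))
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p = math.sqrt((sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * off) / 6.0)
    # Shift and scale so that the eigenvalues of b lie in [-2, 2].
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - largest - smallest  # the trace is preserved
    return [smallest, middle, largest]
```

For a two-atom rod with unit masses at x = -1 and x = +1, the three principal moments are 0, 2 and 2: zero about the rod's own axis and equal about the two perpendicular axes. Such a triple is invariant to rotation and translation, which is what makes it usable as a compact conditioning signal for shape.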

Article information

Article type: Paper
Submitted: 18 Jul 2025
Accepted: 14 Aug 2025
First published: 21 Aug 2025
Licence: Open Access, Creative Commons BY-NC

Digital Discovery, 2025, Accepted Manuscript

D. A. Sapegin, F. Bakharev, D. Krupenya, A. Gafurov, K. Pildish and J. C. Bear, Digital Discovery, 2025, Accepted Manuscript, DOI: 10.1039/D5DD00318K

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.
