Mechanism-driven interpretable modeling of hydrogen solubility in organic compounds via a hierarchical transformer

Abstract

Accelerating global warming driven by fossil fuel combustion has spurred a strategic transition to hydrogen as a zero-carbon energy carrier. However, accurately predicting hydrogen dissolution behavior across the hydrogen energy industry chain remains a key bottleneck for the development of hydrogen storage materials and the optimization of chemical processes. To address the high cost of traditional experimental methods and the limited generality of thermodynamic models, this study developed a machine learning framework that integrates feature engineering and interpretability analysis to predict the solubility of H₂ in organic compounds. A high-dimensional dataset of organic compounds was constructed by combining critical property parameters, molecular descriptors, and functional group fingerprint features, and the Boruta algorithm was used to select 14 key features to eliminate multicollinearity. Four machine learning models were systematically evaluated: a convolutional neural network (CNN), a cascade forward neural network (CFNN), an adaptive neuro-fuzzy inference system (ANFIS), and a novel hierarchical regression transformer (HiRegFormer). HiRegFormer performed best, with an R² of 0.9855 and an RMSE of 0.0069. Its hybrid encoder architecture simultaneously captures local molecular interactions and global thermodynamic patterns. SHAP interpretability analysis quantified the dominant influence of pressure and temperature on solubility. This study provides an intelligent tool, driven by both data and mechanism, for predicting hydrogen dissolution equilibrium, with significant engineering value for the rational design of hydrogen storage materials and the optimization of hydrogenation reactors.
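The abstract reports model quality as a coefficient of determination and a root-mean-square error. As a minimal sketch of how these two metrics are conventionally defined (the abstract does not state the exact evaluation protocol, so the function names `rmse` and `r2_score` and the sample values below are illustrative assumptions, not the authors' code or data):

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical solubility values (illustrative only, not data from the paper).
observed = [0.010, 0.020, 0.030, 0.040]
predicted = [0.012, 0.018, 0.031, 0.039]

print(f"RMSE = {rmse(observed, predicted):.4f}")
print(f"R2   = {r2_score(observed, predicted):.4f}")
```

RMSE carries the units of the target variable, so a value such as 0.0069 is only meaningful relative to the range of solubilities in the dataset, whereas the coefficient of determination is dimensionless.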

Article information

Article type: Paper
Submitted: 08 Jul 2025
Accepted: 27 Oct 2025
First published: 08 Dec 2025

J. Mater. Chem. A, 2026, Advance Article

X. Sun, S. Liu, Z. Wang, J. Wang and B. Sun, J. Mater. Chem. A, 2026, Advance Article, DOI: 10.1039/D5TA05515F
