Latent thermodynamic flows: unified representation learning and generative modeling of temperature-dependent behaviors from limited data

Abstract

Accurate characterization of equilibrium distributions in complex molecular systems, and of their dependence on environmental factors such as temperature, is crucial for understanding thermodynamic properties and transition mechanisms. However, obtaining converged sampling of these high-dimensional distributions with approaches such as molecular dynamics simulations often incurs prohibitive computational costs. In addition, the absence of informative low-dimensional representations of these distributions hampers interpretability and many downstream analyses. Recent advances in generative AI, particularly flow-based models, show promise for efficiently modeling molecular equilibrium distributions; yet, without tailored representation learning, their generative performance on high-dimensional distributions remains limited and difficult to interpret. In this work, we present Latent Thermodynamic Flows (LaTF), an end-to-end framework that integrates representation learning with generative modeling. LaTF unifies the State Predictive Information Bottleneck with Normalizing Flows to simultaneously learn low-dimensional representations (i.e., collective variables), classify metastable states, and generate equilibrium distributions at temperatures beyond those in the training data. Joint optimization allows the two components to enhance each other, making efficient use of costly simulation data to accurately reproduce the system's equilibrium behavior over a meaningful latent representation that captures its slow, essential degrees of freedom. We demonstrate LaTF's effectiveness across diverse systems, including a model potential, the Chignolin protein, and a cluster of Lennard-Jones particles, with thorough evaluations and benchmarking using multiple metrics and extensive simulations.
Moreover, we apply LaTF to an RNA tetraloop system, where, despite using simulation data from only two temperatures, LaTF reconstructs the temperature-dependent structural ensemble and melting behavior, consistent with experimental results and prior extensive computational studies.
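
As general background on the normalizing flows mentioned above, the sketch below illustrates the change-of-variables rule that any flow relies on, using a one-dimensional affine map. It is not the LaTF architecture, which the abstract does not specify; the function names and the `scale` and `shift` parameters are hypothetical and chosen only for illustration.

```python
import math


def standard_normal_logpdf(z):
    """Log-density of the base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def affine_flow_forward(z, scale, shift):
    """Map a base sample z to data space: x = scale * z + shift."""
    return scale * z + shift


def affine_flow_logpdf(x, scale, shift):
    """Log-density of x under the flow via the change-of-variables rule.

    log p_x(x) = log p_z(z) - log|dx/dz|, with z = (x - shift) / scale.
    """
    z = (x - shift) / scale
    return standard_normal_logpdf(z) - math.log(abs(scale))


def gaussian_logpdf(x, mean, std):
    """Closed-form Gaussian log-density, used only as a reference check."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std)
            - 0.5 * math.log(2.0 * math.pi))


if __name__ == "__main__":
    # Illustrative values only: an affine flow of N(0, 1) is N(shift, scale^2).
    x, scale, shift = 1.3, 2.0, 0.5
    print(affine_flow_logpdf(x, scale, shift))
    print(gaussian_logpdf(x, mean=shift, std=abs(scale)))
```

The two printed values agree because an affine transformation of a standard normal is itself Gaussian. Practical flows stack many learnable invertible maps, but each layer contributes a Jacobian term to the log-density in the same way.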

Article information

Article type: Edge Article
Submitted: 21 Aug 2025
Accepted: 12 Dec 2025
First published: 02 Jan 2026
This article is Open Access

All publication charges for this article have been paid for by the Royal Society of Chemistry
Creative Commons BY license

Chem. Sci., 2026, Advance Article

Y. Qiu, R. John, L. Herron and P. Tiwary, Chem. Sci., 2026, Advance Article, DOI: 10.1039/D5SC06402C

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
