Prediction rigidities for data-driven chemistry

Abstract

The widespread application of machine learning (ML) to the chemical sciences is making it increasingly important to understand how ML models learn to correlate chemical structures with their properties, and what can be done to improve training efficiency whilst guaranteeing interpretability and transferability. In this work, we demonstrate the wide utility of prediction rigidities, a family of metrics derived from the loss function, in understanding the robustness of ML model predictions. We show that the prediction rigidities allow the assessment of the model not only at the global level, but also at the local or component-wise level at which the intermediate (e.g. atomic, body-ordered, or range-separated) predictions are made. We leverage these metrics to understand the learning behavior of different ML models, and to guide efficient dataset construction for model training. We finally implement the formalism for an ML model targeting a coarse-grained system to demonstrate the applicability of the prediction rigidities to an even broader class of atomistic modeling problems.
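As a rough illustration of the idea (a sketch, not the paper's implementation), the snippet below computes a prediction rigidity for the simplest case of a linear model y = Xw fitted with a quadratic ridge loss, where the rigidity of a new prediction reduces to PR = [x*^T (X^T X + s I)^(-1) x*]^(-1), with the inverse loss Hessian in the middle. The array shapes, variable names, and regularization strength are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of a prediction rigidity (PR) computation, assuming a
# linear (or last-layer linear) model y = X @ w trained with a quadratic
# ridge loss. Data and regularization strength here are placeholders.
rng = np.random.default_rng(0)

n_train, n_feat = 200, 16
X = rng.normal(size=(n_train, n_feat))   # training feature matrix
sigma2 = 1e-2                            # ridge regularization strength

# Hessian of the ridge loss with respect to the weights: H = X^T X + s*I
H = X.T @ X + sigma2 * np.eye(n_feat)
H_inv = np.linalg.inv(H)

def prediction_rigidity(x_star: np.ndarray) -> float:
    """PR of a new prediction y* = x* . w.

    PR = [x*^T H^{-1} x*]^{-1}; larger values indicate predictions that
    are more tightly constrained by the training set.
    """
    return 1.0 / float(x_star @ H_inv @ x_star)

x_star = rng.normal(size=n_feat)         # features of a test sample
print(f"PR of the test prediction: {prediction_rigidity(x_star):.3f}")
```

In the same spirit, replacing x_star with the feature vector of a single intermediate contribution (e.g. one atomic environment) gives a component-wise rigidity of the kind the abstract refers to; for nonlinear models the Hessian would have to be evaluated around the trained weights rather than written in closed form.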

Article information

Article type
Paper
Submitted
14 May 2024
Accepted
22 Aug 2024
First published
23 Thg8 2024
This article is Open Access
Creative Commons BY license

Faraday Discuss., 2024, Accepted Manuscript

S. Chong, F. Bigi, F. Grasselli, P. Loche, M. Kellner and M. Ceriotti, Faraday Discuss., 2024, Accepted Manuscript, DOI: 10.1039/D4FD00101J

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
