Generalizable classification of crystal structure error types using graph attention networks

Marco Gibaldi; Jun Luo; Andrew J. White; R. Alex Mayo; Cécile Pereira; Tom K. Woo

doi:10.1039/D5TA05426E

Marco Gibaldi,^a Jun Luo,^a Andrew J. White,^a R. Alex Mayo,^a Cécile Pereira^b and Tom K. Woo*^a

Author affiliations

* Corresponding authors

^a Department of Chemistry and Biomolecular Sciences, University of Ottawa, 10 Marie Curie Private, Ottawa K1N 6N5, Canada
E-mail: twoo@uottawa.ca

^b TotalEnergies OneTech SE, Palaiseau, France

Abstract

Modern chemical applications of machine learning rely on massive training datasets collected through computational simulations or data mining. The quality of such datasets is increasingly called into question by the discovery of errors in the most popular crystal structure databases. While methods exist to detect that an error is present, determining its cause is not straightforward. We propose a graph neural network-based approach to classify crystal structure errors by type, including proton omissions, charge balancing errors, and crystallographic disorder. A training dataset comprising >11k metal–organic frameworks (MOFs) labelled by error type was generated through domain expert inspection. Models using chemically intuitive features, such as atomic number and oxidation state, achieved high classification accuracies ranging from 85 to 95%. Despite training only on MOFs, the classification generalized to unseen databases of molecules and metal complexes, with accuracies exceeding 96% for proton and disorder error classification in random samples of drug molecules and metal complexes. Further, graph explainability analysis indicated that these models frequently identify chemically problematic subgraph structures, analogous to those a chemist would flag, as important to the error label prediction.
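To make the idea of attention over an atom graph concrete, the following is a minimal sketch of a single-head graph attention layer followed by mean pooling and one sigmoid output per error type. It is not the authors' model: the feature scaling, the hidden size, the tanh activation, the multi-label output head, and every function name here (`gat_layer`, `classify`, `random_params`) are assumptions made for illustration, and the weights are random rather than trained, so the printed probabilities carry no chemical meaning.

```python
import math
import random

# Error types named in the abstract; treating them as three independent
# binary outputs is an assumption of this sketch.
ERROR_LABELS = ("proton", "charge", "disorder")


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gat_layer(features, neighbors, W, a):
    """One single-head graph attention layer.

    features:  per-atom feature vectors
    neighbors: neighbors[i] lists the atoms bonded to atom i
    W:         (out_dim x in_dim) shared linear transform
    a:         attention vector of length 2 * out_dim
    """
    h = [matvec(W, x) for x in features]
    out = []
    for i, hi in enumerate(h):
        nbrs = [i] + list(neighbors[i])  # self-loop plus bonded atoms
        # Unnormalized score for each (i, j) pair from the concatenated embeddings.
        scores = [
            leaky_relu(sum(ak * v for ak, v in zip(a, hi + h[j]))) for j in nbrs
        ]
        # Numerically stable softmax over the neighborhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of neighbor embeddings, then a nonlinearity.
        agg = [
            sum(al * h[j][k] for al, j in zip(alphas, nbrs)) for k in range(len(hi))
        ]
        out.append([math.tanh(v) for v in agg])
    return out


def classify(features, neighbors, params):
    """Mean-pool atom embeddings and emit one probability per error label."""
    h = gat_layer(features, neighbors, params["W"], params["a"])
    pooled = [sum(row[k] for row in h) / len(h) for k in range(len(h[0]))]
    logits = matvec(params["W_out"], pooled)
    return {lab: 1.0 / (1.0 + math.exp(-z)) for lab, z in zip(ERROR_LABELS, logits)}


def random_params(in_dim, hidden, seed=0):
    """Untrained placeholder weights (hypothetical; a real model would be fitted)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W": mat(hidden, in_dim),
        "a": mat(1, 2 * hidden)[0],
        "W_out": mat(len(ERROR_LABELS), hidden),
    }


# Toy graph: a water molecule. Features are (atomic number / 100, oxidation state).
features = [[0.08, -2.0], [0.01, 1.0], [0.01, 1.0]]
neighbors = [[1, 2], [0], [0]]  # O bonded to both H atoms
params = random_params(in_dim=2, hidden=4)

embeddings = gat_layer(features, neighbors, params["W"], params["a"])
probs = classify(features, neighbors, params)
print(probs)
```

In a sketch like this, the per-neighbor attention weights (`alphas`) are the quantities that explainability methods can inspect to see which atoms and bonds contributed most to a prediction.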

This article is part of the themed collection: Journal of Materials Chemistry A HOT Papers

Article information

DOI: https://doi.org/10.1039/D5TA05426E
Article type: Paper
Submitted: 05 Jul 2025
Accepted: 26 Aug 2025
First published: 03 Sep 2025
This article is Open Access

J. Mater. Chem. A, 2025, 13, 32255-32270

M. Gibaldi, J. Luo, A. J. White, R. A. Mayo, C. Pereira and T. K. Woo, J. Mater. Chem. A, 2025, 13, 32255 DOI: 10.1039/D5TA05426E

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.

