Issue 5, 2024

MLstructureMining: a machine learning tool for structure identification from X-ray pair distribution functions

Abstract

Synchrotron X-ray techniques are essential for studies of the intrinsic relationship between synthesis, structure, and properties of materials. Modern synchrotrons can produce up to 1 petabyte of data per day. Such amounts of data can speed up materials development, but they also come with a staggering growth in workload, as the data generated must be stored and analyzed. We present an approach for quickly identifying an atomic structure model from pair distribution function (PDF) data from (nano)crystalline materials. Our model, MLstructureMining, uses a tree-based machine learning (ML) classifier. MLstructureMining has been trained to classify chemical structures from a PDF and gives a top-3 accuracy of 99% on simulated PDFs not seen during training, with a total of 6062 possible classes. We also demonstrate that MLstructureMining can identify the chemical structure from experimental PDFs from nanoparticles of CoFe2O4 and CeO2, and we show how it can be used to treat an in situ PDF series collected during Bi2Fe4O9 formation. Additionally, we show how MLstructureMining can be used in combination with the well-known methods principal component analysis (PCA) and non-negative matrix factorization (NMF) to analyze data from in situ experiments. MLstructureMining thus allows for real-time structure characterization by screening vast quantities of crystallographic information files in seconds.
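The top-3 accuracy quoted in the abstract counts a prediction as correct when the true structure class is among the three classes the classifier scores highest. The following is a minimal sketch of that metric only. It is not the MLstructureMining code or API: the function name `top_k_accuracy`, the variables `scores` and `labels`, and the toy numbers are all hypothetical and chosen for illustration.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 3,
) -> float:
    """Fraction of samples whose true class is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    hits = 0
    for row, true_class in zip(scores, labels):
        # Order the class indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if true_class in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical toy data: 4 samples, 5 classes (the paper uses 6062 classes).
scores = [
    [0.60, 0.20, 0.10, 0.05, 0.05],  # true class 0 is ranked first
    [0.10, 0.50, 0.30, 0.05, 0.05],  # true class 2 is ranked second
    [0.40, 0.30, 0.20, 0.06, 0.04],  # true class 3 is ranked fourth
    [0.05, 0.05, 0.20, 0.30, 0.40],  # true class 2 is ranked third
]
labels = [0, 2, 3, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top3 = top_k_accuracy(scores, labels, k=3)
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

In this toy case only one sample has its true class ranked first, while three of the four have it within the top three, which shows why a top-3 figure can be much higher than a top-1 figure when several candidate structures score similarly.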

Article information

Article type: Paper
Submitted: 02 Jan 2024
Accepted: 27 Mar 2024
First published: 27 Mar 2024
This article is Open Access
Creative Commons BY-NC license

Digital Discovery, 2024, 3, 908-918

E. T. S. Kjær, A. S. Anker, A. Kirsch, J. Lajer, O. Aalling-Frederiksen, S. J. L. Billinge and K. M. Ø. Jensen, Digital Discovery, 2024, 3, 908, DOI: 10.1039/D4DD00001C

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.
