Interpretable machine-learning prediction of DFT energies per atom and identification of magic numbers in coinage-metal nanoclusters (N ≤ 55) from the open quantum cluster database

Abstract

Atomically precise coinage-metal nanoclusters (Cu, Ag, and Au) exhibit size-dependent stability that is critical for catalysis, plasmonics, and photocatalysis, yet first-principles screening becomes prohibitive beyond ∼55 atoms. We use the open quantum cluster database (QCD; 4381 Cu/Ag/Au clusters, N ≤ 55, PBE/PAW) to build an interpretable machine-learning framework for the DFT energy per atom (E_DFT/N). Nine geometric descriptors (radius of gyration, asphericity, compactness, and bounding-box dimensions) are combined with cluster size N, metal identity, and three QCD-derived electronic features. On a stratified 70/15/15 split, LightGBM attains a test MAE of 0.0144 eV per atom and R² = 0.996, with stable 5-fold cross-validation (0.0142 ± 0.0004 eV per atom); ExtraTrees yields near-equivalent accuracy. A geometry-only variant trained without electronic inputs retains an MAE of 0.0148 eV per atom (only 3% above the full model), demonstrating that energies can be ranked from coordinates alone. Per-metal second-difference (Δ₂E) analysis with adaptive thresholds identifies universal peaks at N = 8 and 34 across all three metals. Peaks at N = 32 and 38 are resolved only for Au, which is consistent with relativistic stabilization; Au also shows an anomalous Δ₂E at N = 20 that is absent in Cu and Ag. SHAP analysis reveals that metal identity and cluster size dominate the predictions, whereas geometric descriptors govern the ∼10 meV per atom differences that determine magic-number locations. Size-grouped cross-validation shows that interpolation within the QCD is highly accurate, but extrapolation across size domains is substantially harder (MAE ≈ 0.10 eV per atom), which bounds the model's scope. The complete open-source pipeline is released to support FAIR data practices.
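The second-difference analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes the conventional definition Δ₂E(N) = E(N+1) + E(N−1) − 2E(N) on total energies of the lowest-energy isomers, and it uses a simple "mean plus k standard deviations" cutoff as a stand-in for the adaptive threshold; the abstract does not specify either choice, so the function names, the parameter k, and the toy energies below are all hypothetical and are not the authors' pipeline.

```python
from statistics import mean, stdev


def second_difference(energies):
    """Return {N: E(N+1) + E(N-1) - 2*E(N)} for every N with both neighbours.

    `energies` maps cluster size N to a total energy in eV (assumed input format).
    A large positive value marks N as stable relative to its neighbours.
    """
    d2 = {}
    for n, e in energies.items():
        if n - 1 in energies and n + 1 in energies:
            d2[n] = energies[n + 1] + energies[n - 1] - 2.0 * e
    return d2


def magic_candidates(d2, k=1.0):
    """Sizes whose second difference exceeds mean + k * std (assumed threshold rule)."""
    values = list(d2.values())
    if len(values) < 2:
        return []
    threshold = mean(values) + k * stdev(values)
    return sorted(n for n, v in d2.items() if v > threshold)


# Toy energies (not QCD data): a smooth size trend with N = 8 artificially stabilized.
toy = {n: -2.0 * n + 3.0 * n ** (2.0 / 3.0) for n in range(2, 13)}
toy[8] -= 0.5

d2 = second_difference(toy)
candidates = magic_candidates(d2)
print(candidates)
```

In practice the same two functions would be applied separately to each metal, since the abstract describes the analysis as per-metal; the value of k would need tuning against the actual energy spread.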

Article information

Article type: Paper
Submitted: 21 Apr 2026
Accepted: 13 May 2026
First published: 29 May 2026

Phys. Chem. Chem. Phys., 2026, Advance Article

A. A. Khairbek, M. I. Al-Zaben, A. Y. A. Alzahrani and R. Thomas, Phys. Chem. Chem. Phys., 2026, Advance Article, DOI: 10.1039/D6CP01474G
