This Open Access Article is licensed under a Creative Commons Attribution-Non Commercial 3.0 Unported Licence

High-throughput design of bimetallic materials via multimodal machine learning and the accessibility index

Yuming Gu^a, Yating Gu^a, Maochen Yang^b, Shisi Tang^a, Jiawei Chen^a, Xinyi Liang^a, Dong Zheng^cd, Zekun Li^b, Fengqi Song^cd, Yang Gao^b, Yan Zhu*^a, Yinghuan Shi*^b and Jing Ma*^a
^a State Key Laboratory of Coordination Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, 210023, P. R. China. E-mail: majing@nju.edu.cn; zhuyan@nju.edu.cn
^b State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, P. R. China. E-mail: syh@nju.edu.cn
^c National Laboratory of Solid State Microstructures, Collaborative Innovation Center of Advanced Microstructures, School of Physics, Nanjing University, Nanjing 210093, P. R. China
^d Atom Manufacturing Institute (AMI), Nanjing 211805, P. R. China

Received 15th June 2025, Accepted 10th September 2025

First published on 11th September 2025


Abstract

It is a great challenge to efficiently explore bimetallic systems containing miscible or immiscible elements (e.g., Au/Ni and Au/Rh), because candidates with favorable formation energy (Eform) must be screened from the vast combination space of metal pairs and ligands or coordination environments. The importance of the coordination environment is highlighted by the multilevel attention mechanism within a graph convolutional neural network (GCNN) and by Shapley additive explanation (SHAP) analysis of an 8-feature scheme for Eform prediction. To further reduce the prediction error of the formation energy on the test set, multimodal machine learning (MML) is applied to 11,186 bimetallic nanocluster configurations by integrating the molecular graph of the metal core with physical property features such as the mixing enthalpy (Hmix) of the bimetallic pair and the SMILES strings and solubility (log P) of the ligand. The present MML model can rapidly predict the formation energies of nanoclusters with more than one thousand atoms. To evaluate the experimental accessibility of bimetallic porous materials, alloys, and 2D materials in a general way, an accessibility index, φ, is defined as a combination of the environmental electronegativity (χenv) and the reduced atomic distance index (D̃), without the need for density functional theory (DFT) calculations. Larger values of φ indicate that the bimetallic materials are more accessible, owing to energetically favorable interatomic charge transfer and an optimal reduced distance of around 0.3 (∼3.5 Å metal–metal distance) for nanoclusters and 0.1 (∼2.5 Å) for zeolites. Among the 100 external test samples, three nanoclusters (Au36Ag38((CF3)2PhC≡C)30Cl10, Au38Ag33((CF3)2PhC≡C)30Cl8, and Au9AgRh(PPh3)8Cl) and three 2D materials (Au/Ni@NC, Ni/Pt@NC, and Cu/Gd@NC) were synthesized in this work, in good agreement with their accessibility indices lying in the favorable range (φ ≥ 0.30) and their low formation energies (below −1 eV per atom). The proposed MML scheme and accessibility index hold promise for facilitating the high-throughput discovery and design of bimetallic materials.


Introduction

The combination of two different metal elements brings unique adsorption and luminescence properties to various functional bimetallic systems, such as ligand-protected nanoclusters, nanoparticles, alloys, and two-dimensional (2D) materials, with potential applications in catalysis, sensors, and drug delivery.1–6 It is interesting to study the accessibility of bimetallic materials with miscible or even immiscible metal pairs, whose miscibility is reflected by the heat of mixing (Hmix) in binary phase diagrams: Hmix < 0 for solid solutions and Hmix > 0 for phase-separated systems. The mixing of immiscible elements such as Ag/Ni, Ag/Cu, Au/Rh, and Au/Pt has been realized in nanomaterials via various methods, including nonequilibrium synthesis,3 colloidal co-reduction,7–10 physical co-sputtering,11 quenching,12,13 and size reduction.14 For nanoclusters, which are composed of a definite number of metal atoms and ligands, some heteroatom-doped gold nanoclusters with miscible (Au/Cd and Au/Cu)15,16 and immiscible (Au/Ru, Au/Rh, and Au/Pt)17,18 elements have aroused experimental and theoretical interest. Several challenges arise when exploring novel bimetallic systems with miscible or immiscible pairs. The vast compositional and structural space, involving diverse element types, stoichiometries, and coordination motifs, exceeds what manual exploration can cover. Screening for suitable formation energies (Eform) among the numerous possible metal–ligand combinations is particularly challenging. The strong influence of coordination environments further increases the complexity of stability prediction. Density functional theory (DFT) calculations for large systems are computationally expensive, while the experimental synthesis of such bimetallic systems often involves high uncertainty and cost. A predictive machine learning (ML) framework that effectively evaluates the thermodynamic stability and accessibility of bimetallic materials is therefore desirable to reduce computational and experimental costs.

Recent advances in ML have enabled stability prediction for materials using diverse data modalities. For instance, the graph modality was applied to predict the formation energies of Au nanoclusters, guiding the experimental synthesis of new structures (Au10(PPh3)7Cl3 and Au38OT24).19 A quantitative metric of synthesizability for inorganic materials, denoted as the crystal-likeness score (CLscore), was predicted from the graph modality via positive-unlabeled learning.20 In the context of bimetallic nanoclusters, several numeric descriptors, including cohesive energy differences, atomic radius mismatch (ΔR), coordination asymmetry, and magnetism, have been related to the core–shell preference of the nanoclusters.21 Despite this progress, relying on a single modality is in some cases insufficient for machine learning models to predict target properties accurately. Recently, multimodal machine learning (MML), which integrates heterogeneous data streams such as graphs and text within a shared latent space, was applied to improve the efficiency of energy predictions for adsorption configurations and thereby accelerate catalyst design.22 MML was also applied to investigate the key role of chemical structure in governing per- and polyfluoroalkyl substance removal; adding the simplified molecular input line entry system (SMILES) string modality to the numeric modality of the experimental data made it possible to visualize the contributions of individual chemical elements.23 In addition, by fusing information from the modalities of chemical composition (text) and crystal structure (graph), several novel materials, such as Li1.5NbO0.5F0.5 and Li15TaN7O2, were recommended as promising candidates for Li-ion conductors.24

In this study, we develop an MML model, trained on 11,186 bimetallic nanocluster configurations from DFT calculations, that quickly predicts the formation energies of bimetallic nanoclusters with more than a thousand atoms, as shown in Fig. 1. The MML model integrates information from three modalities: (1) graph-based representations of core motifs, (2) SMILES strings encoding the ligands, and (3) digital descriptors capturing key thermodynamic and environmental features. An easily available accessibility index, φ, is further introduced to quantify the synthetic accessibility of bimetallic materials; it integrates the environmental electronegativity (χenv) and the reduced metal–metal distance index (D̃) without the need for DFT calculations. This descriptor shows good transferability when evaluating 100 bimetallic materials reported in the literature, including porous materials, alloys, and 2D materials. By considering φ and Eform together, three nanoclusters (Au36Ag38((CF3)2PhC≡C)30Cl10, Au38Ag33((CF3)2PhC≡C)30Cl8, and Au9AgRh(PPh3)8Cl) and three 2D materials (Au/Ni@NC, Ni/Pt@NC, and Cu/Gd@NC) were synthesized successfully. The proposed multimodal machine learning scheme is expected to accelerate the experimental discovery and synthesis of bimetallic systems from a huge chemical space.


Fig. 1 The flow chart of the stability and accessibility index prediction of bimetallic materials using multimodal vs. single-modal machine learning methods.

Results and discussion

Construction of bimetallic material datasets

Several datasets comprising various nanoclusters have been developed to facilitate the prediction of the stability, bioactivity, and nanohydrophobicity of nanoparticles.25–27 Herein, we focus on bimetallic combinations in material design. As shown in Fig. 1, a dataset of bimetallic materials was compiled with a total of 11,412 unique structures, spanning 75 representative bimetallic element pairs (44 miscible and 31 immiscible combinations). Among them, 11,186 nanocluster configurations sampled from the potential energy surfaces of 126 nanoclusters, together with 120 zeolites, were used for training and internal testing. To evaluate model generalizability, an external test set of 100 experimentally reported bimetallic materials was curated, comprising 14 ligand-protected nanoclusters, 4 zeolites/metal–organic frameworks (MOFs), 35 alloys or nanoparticles, 5 oxide/nitride-supported metal systems, and 42 2D materials. Six bimetallic structures newly synthesized in this work were also incorporated to update the dataset.
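The counts quoted above can be checked with a few lines of arithmetic. The sketch below is only a consistency check under one reading of the text, namely that the nanocluster configurations, the zeolites, the external test set, and the six newly synthesized structures are disjoint groups; the variable names are illustrative and not taken from the paper.

```python
# Counts quoted in the text; treating the four groups as disjoint is an assumption.
nanocluster_configs = 11_186   # DFT-sampled configurations of 126 nanocluster prototypes
zeolites = 120                 # bimetallic zeolite structures
external_test = {
    "ligand-protected nanoclusters": 14,
    "zeolites/MOFs": 4,
    "alloys or nanoparticles": 35,
    "oxide/nitride-supported metals": 5,
    "2D materials": 42,
}
newly_synthesized = 6

n_external = sum(external_test.values())
total = nanocluster_configs + zeolites + n_external + newly_synthesized
print(n_external, total)
```

Under this reading the external set sums to 100 and the four groups together reproduce the stated total of 11,412 structures.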

Specifically, the bimetallic nanocluster training set is mainly based on five topology categories: Au4M2(PET)8, Au4M2(PET)8(PPh3)2, Au5M2((CH3)5C5)2(PPh3)3Cl2, Au8M(PPh3)8, and Au9M(FPh3P)7Br3 (PET = 2-phenylethanethiol, PPh3 = triphenylphosphine, FPh3P = tris(4-fluorophenyl)phosphine, and (CH3)5C5 = 1,2,3,4,5-pentamethylcyclopentadienyl). The diversity of the bimetallic nanoclusters was achieved by doping gold nanoclusters with heteroatoms (Mn, Fe, Co, Ni, Cu, Zn, Ru, Rh, Pd, Ag, Cd, Re, Os, Ir, Pt, and Hg), which generated 126 types of nanoclusters. The configurational flexibility of the ligands further complicates the configuration space, since different ligand orientations and binding modes can lead to multiple stable or metastable structures, so the possible structures of each nanocluster need to be searched. For instance, Au4Ru2(PET)8, which contains an Au4Ru2 core and PET ligands, was found to have 95 distinct structures owing to the flexibility of the ligands and the fluidity of the core structure, which vary through the distance between Au and S atoms (dAu–S) and the rotation of the C–S bond. To systematically capture this structural diversity, we sampled nanocluster geometries from the potential energy surfaces of the 126 nanocluster prototypes with different metal pairs and ligand types, covering both miscible and immiscible combinations. For each prototype, multiple configurations were generated by varying the arrangement of metal atoms and the coordination of ligands, followed by DFT calculations to obtain their formation energies. In this way, 11,186 nanocluster configurations were compiled into the present bimetallic dataset, which encompasses a wide range of stoichiometries, coordination environments, and ligands. This dataset was then divided into training, validation, and test sets (8:1:1 ratio) to construct and evaluate the multimodal machine learning (MML) model in the following section.
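The 8:1:1 partition can be illustrated with a minimal sketch. The shuffling scheme, the seed, and the function name `split_indices` are assumptions made for illustration; the text does not state how the split was drawn.

```python
import random


def split_indices(n_samples, ratios=(8, 1, 1), seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded shuffle, for reproducibility only
    total = sum(ratios)
    n_train = round(n_samples * ratios[0] / total)
    n_val = round(n_samples * ratios[1] / total)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # the remainder forms the test set
    return train, val, test


train, val, test = split_indices(11_186)
print(len(train), len(val), len(test))
```

With 11,186 configurations this rounding gives 8949 training and 1119 validation samples, leaving 1118 for the test set.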

Metal atoms can also be anchored within the channels of zeolites through coordination interactions with oxygen atoms, forming metal–zeolites. The distance between the two metal atoms (D) can be adjusted by the adjacent rings. These bimetallic zeolites were derived from Cu@zeolites with different topologies and pore sizes (PLD), such as MFI (PLD = 6.36 Å), MOR (PLD = 6.70 Å), and FAU (PLD = 11.24 Å). The doped metals in the bimetallic zeolites were mainly 3d–5d transition metal elements, e.g., Fe/Cu@MFI (Fig. 1).

The 100 external test set structures, all of which have been synthesized experimentally, were collected from the published literature and include porous materials,28 2D materials,29–31 metal-loaded metal oxides/nitrides,32–34 alloys,35,36 and nanoclusters.37 Three newly reported nanoclusters, Au36Ag38((CF3)2PhC≡C)30Cl10, Au38Ag33((CF3)2PhC≡C)30Cl8, and Au9AgRh(PPh3)8Cl, were added to the external test set. Additionally, three bimetallic pairs supported on nitrogen-doped carbon (NC), namely Au/Ni@NC, Ni/Pt@NC, and Cu/Gd@NC, were newly synthesized in this work and incorporated into the dataset to test the predictive capabilities of the constructed models. The prediction of accessibility was based on two target properties, the formation energy (Eform) and the accessibility index (φ), both of which are defined in detail in the following sections.

Formation energy prediction: single-modal vs. multimodal machine learning

To quantitatively evaluate the thermodynamic stability of the bimetallic materials, the formation energy (Eform) was calculated according to the following definition.

[Equation (1), image file: d5sc04386g-t1.tif]

In eqn (1), Etotal is the total energy of the bimetallic nanocluster or zeolite, Nmetali is the number of atoms of metal i in the bimetallic material, and Emetali is the energy of an atom of metal i (i = 1, 2). Eenv is the energy of the surrounding environment, for example, the ligands for nanoclusters and the framework for metal–zeolites. A negative formation energy implies that the bimetallic material is thermodynamically favorable for experimental synthesis under the given conditions. The computational details are given in the SI. The formation energies of the bimetallic nanoclusters and zeolites mainly fall between −5.86 and 0.85 eV per atom. As shown by the distribution in Fig. S1, most of the formation energies lie in the range of −4 to −1 eV per atom. The Eform values follow a DWeibull-like distribution, with a D statistic of 0.04 and a p-value of 0.71 in the Kolmogorov–Smirnov (KS) test. The two peaks may be ascribed to the bimetallic nanoclusters and zeolites (Fig. S1a and b). All structures of the bimetallic nanoclusters were collected into the Bimetallic Materials Dataset to train the multimodal machine learning model (Fig. S2).
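As a rough illustration of how the quantities named for eqn (1) combine, the sketch below subtracts the metal and environment reference energies from the total energy and normalizes the result per atom. The exact form of eqn (1) is not reproduced here, so the normalization count `n_norm` is left as an explicit argument, and all numerical values are made up for the example rather than taken from DFT.

```python
def formation_energy(e_total, metal_counts, metal_energies, e_env, n_norm):
    """Sketch of E_form = (E_total - sum_i N_i * E_i - E_env) / n_norm.

    n_norm is the atom count used for the per-atom normalisation; which
    count the paper uses is an assumption left to the caller.
    """
    if len(metal_counts) != len(metal_energies):
        raise ValueError("one reference energy is needed per metal species")
    e_metals = sum(n * e for n, e in zip(metal_counts, metal_energies))
    return (e_total - e_metals - e_env) / n_norm


# Made-up energies (eV) for a hypothetical Au4M2 core; not DFT results.
e_form = formation_energy(
    e_total=-120.0,
    metal_counts=[4, 2],
    metal_energies=[-3.0, -5.0],
    e_env=-80.0,
    n_norm=6,
)
print(e_form)
```

A negative return value corresponds to the thermodynamically favorable case described in the text.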

Formation energy prediction based on the GCNN model with graph modality. Before exploring multimodal approaches, it is useful to first compare several commonly used single modalities, such as graphs and digital descriptors, that can be applied to predict the formation energy of bimetallic materials. When only the graph modality was used to represent the bimetallic nanoclusters in a graph convolutional neural network (GCNN) model built with the home-made DeepMoleNet,38,39 the input molecular information of the Au nanocluster was represented automatically by nodes (atoms) and edges (atom pairs), as shown in Fig. 2. In this work, the prediction ability was assessed by the mean absolute error (MAE), root mean squared error (RMSE), and coefficient of determination (R2), which are defined as
 
MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| (2)

RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} (3)

R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(\hat{y}_i - \bar{y}\right)^2} (4)
where yi is the machine learning prediction, ŷi is the actual DFT value, ȳ is the average of the DFT values, and n is the number of samples. The GCNN model predicts the formation energies of the bimetallic nanoclusters with MAE = 0.15 eV per atom and R2 = 0.93; its hyper-parameters are listed in Table S1. Four different learning rates were tested to obtain the optimal GCNN model, as shown in Fig. S3.
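The three metrics of eqns (2)–(4) can be computed with a short, dependency-free sketch. The function name and the toy values are illustrative; the convention follows the text, where y is the prediction and ŷ the DFT reference.

```python
import math


def regression_metrics(y_pred, y_true):
    """Return (MAE, RMSE, R2) for predictions y_pred against references y_true."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - t for p, t in zip(y_pred, y_true)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all references are equal")
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Toy formation energies in eV per atom (illustrative values only).
y_true = [-3.0, -2.0, -1.0, -4.0]
y_pred = [-2.9, -2.2, -1.1, -3.8]
mae, rmse, r2 = regression_metrics(y_pred, y_true)
print(mae, rmse, r2)
```

Because RMSE squares the residuals before averaging, it is never smaller than MAE, which is why a few large outliers can raise RMSE well above MAE.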

Fig. 2 The GCNN model based on graph modality for formation energy prediction.

The significance of the coordination environment in stability prediction is revealed by the multilevel attention mechanism. Local attention operations are performed at T steps following each aggregation of distinct node feature levels. The influence of adjacent atomic environments on the atomic attention results in shifts in importance at each step. High weight values highlight the crucial roles of specific atoms in the energy prediction process. As shown in Fig. 2, taking Au4Pd2(PET)8 as an example, the normalized attention values (Att) on the metal atoms are the highest in the bimetallic nanoclusters, indicating that the metal core plays an important role in Eform prediction. PPh3, PET, and (CH3)5C5 are the three most frequently occurring ligands, with log P values of 5.69, 2.87, and 0.40, respectively. The log P of the ligand has been reported to strongly influence the solubility of Au clusters and cell activity.19 The coordination atoms, i.e., sulfur, also show relatively high values (AttS = 0.50). When PPh3 ligands are added to form Au4Pd2(PET)8(PPh3)3, the bimetallic nanocluster becomes more stable (Eform = −3.63 eV per atom), with an AttP value of 0.42. The attention values of the C atoms in coordinated (CH3)5C5 (AttC = 0.51) are higher than those in PPh3 (AttC = 0.1–0.2) for the Au5Pd2((CH3)5C5)2(PPh3)3Cl2 nanocluster. Interestingly, the average attention values of the ligands (Attligand) correlate with the extent of charge transfer between the two metals (ΔQCT). It has been shown that a large charge difference (ΔQCT) between two metal atoms may lead to a more stable structure.19 To sum up, the coordination environment, charge transfer, and log P of the ligands are modulating factors in the stabilization of metal cores. In the following subsection, these features are used as descriptors in building Eform prediction models.

Formation energy prediction based on an 8-feature scheme with digital descriptor modality. In addition to the log P and ΔQCT mentioned above, another 8 digital descriptors are crucial in the prediction of Eform (Fig. 3). The component descriptors include the size of the metal core (Nmetal), the ratio of the number of doped metal atoms (ratio), and the difference between the molar masses of the metals (ΔM). Using binary metallic phase diagrams as feature descriptors can capture the thermodynamic matching between two metals; Hmix is selected as an easily accessible feature to describe the miscibility of two metal elements. The interaction between the two types of metal elements is described by their electronegativity difference (ΔχM), which is a readily available parameter. In addition, geometric descriptors, namely the average metal–metal distance (D) and the difference between atomic radii (ΔR), were selected for the formation energy prediction.
Fig. 3 The 8-feature scheme model based on digital descriptor modality for the formation energy prediction.

Coordination atoms in nanoclusters are usually sulfur (S) and phosphorus (P), while oxygen (O) atoms are commonly found in zeolites. The environmental electronegativity (χenv) can be used to represent the coordination interaction between the metal core and its surrounding environment, which can be expressed in eqn (5).

 
\chi_{env} = \bar{\chi}_{metal} - \bar{\chi}_{sub} (5)
where χ̄metal is the average electronegativity of all metal atoms and χ̄sub is the average electronegativity of the substrate. The ΔQCT values of the bimetallic nanoclusters and zeolites show a volcano-like correlation with χenv, as shown in Fig. 3. For bimetallic zeolites, ΔQCT becomes larger as χenv becomes more negative, indicating that a larger electronegativity difference enhances the charge transfer by strengthening the interaction between the metal atoms and the O atoms around the channel. In contrast, bimetallic nanoclusters tend to exhibit larger ΔQCT values even with smaller electronegativity differences, likely due to their distinct coordination environments dominated by flexible organic ligands. The Pearson correlation coefficient matrix of these features is shown in Fig. S4, from which the three most important features are found to be the ratio, χenv, and Nmetal. Shapley additive explanations (SHAP), derived from the concept of Shapley values in game theory, provide a fair attribution of the contribution of each feature to the prediction. In this framework, a feature's contribution is determined by averaging its marginal impact across all possible feature combinations, ensuring that the distribution of contributions is both fair and consistent. According to the SHAP analysis, ratio and χenv emerge as the most influential features, which is consistent with the results obtained from the Pearson correlation coefficient matrix. The analysis indicates that medium-to-low values of ratio and higher values of χenv are associated with more stable structures. A larger electronegativity difference enhances the interaction between the metal core and its surrounding environment, ultimately leading to greater structural stability in bimetallic systems.
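A minimal sketch of how χenv could be evaluated from tabulated data follows. It assumes that eqn (5) is the difference between the mean electronegativity of the metal atoms and that of the substrate, that Pauling electronegativities are used, and that the "substrate" is represented by the coordinating atoms (S or P for ligands, O for zeolite frameworks); the paper may define these sets differently, and the function names are hypothetical.

```python
# Pauling electronegativities for the elements used in this example.
PAULING = {
    "Au": 2.54, "Pd": 2.20, "Cu": 1.90, "Fe": 1.83,
    "S": 2.58, "P": 2.19, "O": 3.44,
}


def mean_electronegativity(symbols):
    """Average Pauling electronegativity over a list of element symbols."""
    if not symbols:
        raise ValueError("at least one atom is required")
    return sum(PAULING[s] for s in symbols) / len(symbols)


def chi_env(metal_atoms, substrate_atoms):
    """Assumed reading of eqn (5): mean(metal) minus mean(substrate)."""
    return mean_electronegativity(metal_atoms) - mean_electronegativity(substrate_atoms)


# Au4Pd2 core coordinated by eight thiolate S atoms (as in Au4Pd2(PET)8).
chi_cluster = chi_env(["Au"] * 4 + ["Pd"] * 2, ["S"] * 8)
# Fe/Cu pair coordinated by framework O atoms (as in Fe/Cu@MFI).
chi_zeolite = chi_env(["Fe", "Cu"], ["O"] * 4)
print(chi_cluster, chi_zeolite)
```

Under these assumptions the zeolite example gives a much more negative χenv than the thiolate-protected cluster, in line with the zeolites occupying the more negative side of the volcano plot.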

The random forest regression (RFR) model gives good predictions of the formation energies on the bimetallic nanocluster dataset among the twelve models tested, as shown in Fig. S5 and S6. To test the generalization of these features, the bimetallic zeolite data were added to construct a combined nanocluster and zeolite dataset. The RFR model retained its good prediction performance, with an MAE of 0.18 eV per atom (Fig. S7 and S8); the parameters are listed in Table S2.

However, a single data modality may not be robust enough to capture the complicated interplay of these factors and may suffer from overfitting, which limits its ability to give reasonable predictions of the stability of bimetallic materials. Graph-based models tend to lack interpretability, making it difficult to unravel the chemical factors behind predictions. On the other hand, models that rely solely on digital descriptors are prone to overfitting, primarily due to their inability to capture the full structural and environmental complexity of chemical systems. Multimodal machine learning (MML), a promising approach that integrates diverse data modalities, is applied in the next subsection to predict the formation energy.

Formation energy prediction based on an MML model. We have designed an MML model with contextual awareness, including graph (for the metal core), SMILES (for the ligand), and some digital descriptors (for physical properties), to predict the Eform of the bimetallic nanoclusters, as shown in Fig. 4. The purpose of integrating multiple modalities, such as graph representations of the metal core, SMILES strings of the ligands, and digital descriptors such as mixing enthalpy (Hmix) and solubility (log P), is to capture complementary information about bimetallic nanoclusters that a single modality alone cannot fully represent. The graph modality encodes structural topology, SMILES provides the chemical composition and ligand environment, and digital descriptors supply the key thermodynamic and physicochemical properties. By combining these heterogeneous data streams in a shared latent space, the MML model could achieve contextual awareness and avoid the limitations of single-modality models, such as lack of interpretability or overfitting.
Fig. 4 (a) The flowchart of formation energy prediction on bimetallic nanoclusters by multimodal machine learning; (b) comparison of the performance of the MML models; (c) prediction of formation energy (Eform) by the MML with log P and Hmix; (d) prediction of the synthesized bimetallic nanoclusters.

All the structures were collected into the nanocluster dataset to train the MML model, in which 8949 data (80%) were chosen as the training set, 1119 (10%) as the validation set, and the remaining 1118 as the test set, as shown in Table 1. The framework of the MML model is illustrated in Fig. 4a. The bimetallic nanoclusters are initially separated into two parts: the core and the ligands. The metal core is represented by a molecular graph that captures its topological structure, Mcore, which is subsequently encoded into graph embeddings. As mentioned above, the miscibility of the two metals in the core is reflected by the Hmix value, which is used as the descriptor dHmix. The chemical composition of the ligands is represented using the SMILES notation, Sligands, which is encoded via MolT5,40 a pre-trained model for natural language text and molecule strings. The important feature of ligand solubility is described by the digital value of log P, denoted dlog P. The log P values were taken from XLOGP3 (ref. 41) and our PoLog P,39 which gave similar performance in log P prediction. We encode these features using distinct encoders and project them into a shared latent space. The descriptor features are encoded by a single linear layer.

Table 1 Coefficient of determination (R2) and mean absolute error (MAE, eV per atom) of ML models with different modalities

| Algorithm | R2, training set | R2, test set | MAE, training set | MAE, test set |
| --- | --- | --- | --- | --- |
| GCNN with graph modality; 11,186 data (training:validation:test = 8:1:1) | | | | |
| GCNN | 0.934 | 0.927 | 0.121 | 0.152 |
| 8-Feature scheme (χenv, ΔχM, Nmetal, ratio, ΔM, Hmix, D, and ΔR); 246 data (training:validation:test = 8:1:1) | | | | |
| RFR | 0.996 | 0.969 | 0.072 | 0.182 |
| GDB | 0.995 | 0.967 | 0.077 | 0.186 |
| CATBoost | 0.999 | 0.967 | 0.016 | 0.184 |
| DT | 0.993 | 0.966 | 0.093 | 0.198 |
| EXTREE | 0.999 | 0.959 | 0.001 | 0.203 |
| XGB | 0.997 | 0.959 | 0.067 | 0.215 |
| ADAB | 0.955 | 0.909 | 0.261 | 0.325 |
| SVR | 0.979 | 0.855 | 0.100 | 0.375 |
| KNN | 0.934 | 0.810 | 0.261 | 0.324 |
| RIDGE | 0.699 | 0.654 | 0.680 | 0.616 |
| LINEAR | 0.699 | 0.653 | 0.679 | 0.614 |
| LASSO | 0.686 | 0.640 | 0.700 | 0.649 |
| MML (graph + SMILES strings + Hmix & log P); 11,186 data (training:validation:test = 8:1:1) | | | | |
| MML w/o Hmix & log P | 0.943 | 0.933 | 0.070 | 0.073 |
| MML with Hmix | 0.938 | 0.951 | 0.061 | 0.057 |
| MML with log P | 0.952 | 0.950 | 0.058 | 0.072 |
| MML with Hmix & log P | 0.963 | 0.952 | 0.050 | 0.055 |


To encode the contextual information of the nanoclusters, we designed a pre-fusion stage before the fusion stage. In the pre-fusion stage, the solubility properties of the ligands were added to the front and back ends of the core's graph features, and the enthalpy of mixing of the two metal elements was added to the front and back ends of the ligand SMILES features. The pre-fusion stage was designed to enhance the contextual awareness of the model by providing additional information about the ligands and the core. The augmented core and ligand features, along with the descriptor features, were then concatenated and fed into the fusion stage. Four layers of Mamba2 (ref. 42) were employed to facilitate the exchange of information between the multimodal features. The output of the fusion stage was pooled and passed into a regression head, which consisted of linear layers and PReLU activation function units.

The model takes the core Mcore, the ligands' SMILES strings Sligands, and the descriptor features D = [dlog P, dHmix] as input. We denote the features produced by the core encoder φc, the ligand encoder φl, and the descriptor encoder φd as fc, fl, and fd, respectively.

In the pre-fusion stage, the core and ligand features are augmented as follows:

 
f_c^{aug} = \varphi_c([d_{\log P}, f_c, d_{\log P}]) (6)

f_l^{aug} = \varphi_l([d_{H_{mix}}, f_l, d_{H_{mix}}]) (7)
where dlog P and dHmix are the descriptor features of solubility and enthalpy of mixing, respectively, and φc and φl are linear projection layers.

The augmented core and ligand features were concatenated with the descriptor features and fed into the fusion stage:

 
f_s = \mathrm{SSM}([f_c^{aug}, f_l^{aug}, f_d]) (8)
where SSM denotes the four consecutive Mamba2 blocks and fs is the multimodal feature vector.

The output of the fusion stage was pooled and passed through the regression head to predict the formation energy of the bimetallic nanoclusters:

 
E_{form}^{MML} = \psi(\Theta(f_s)) (9)
where Θ and ψ are the pooling operation and the regression head, respectively.
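The data flow of eqns (6)–(9) can be traced with a shape-level sketch. It is not the authors' implementation: the learned projections and the regression head are omitted, the four Mamba2 blocks are replaced by an identity placeholder, mean pooling is assumed for Θ, and the descriptors are assumed to be embedded as tokens of the same width as the core and ligand tokens. All names and values are hypothetical.

```python
def pre_fuse(tokens, descriptor_token):
    """Eqns (6)/(7) without the projection: [d, f, d] along the token axis."""
    return [descriptor_token] + list(tokens) + [descriptor_token]


def fuse(core_aug, ligand_aug, descriptor_tokens, ssm=lambda seq: seq):
    """Eqn (8): concatenate the three streams and pass them through `ssm`.

    `ssm` defaults to the identity; it is only a stand-in for the four
    Mamba2 blocks, which are not reimplemented here.
    """
    return ssm(list(core_aug) + list(ligand_aug) + list(descriptor_tokens))


def mean_pool(sequence):
    """One possible pooling choice for eqn (9): average over the token axis."""
    dim = len(sequence[0])
    return [sum(tok[k] for tok in sequence) / len(sequence) for k in range(dim)]


# Toy 2-dimensional token embeddings (hypothetical values).
f_c = [[1.0, 0.0], [0.0, 1.0]]             # core graph tokens
f_l = [[2.0, 2.0]]                          # ligand SMILES tokens
d_logp, d_hmix = [0.5, 0.5], [-1.0, -1.0]   # embedded descriptors

core_aug = pre_fuse(f_c, d_logp)      # log P wraps the core features
ligand_aug = pre_fuse(f_l, d_hmix)    # Hmix wraps the ligand features
f_s = fuse(core_aug, ligand_aug, [d_logp, d_hmix])
pooled = mean_pool(f_s)               # would be passed to the regression head
print(len(f_s), pooled)
```

The point of the sketch is the ordering: the ligand descriptor (log P) brackets the core features and the metal-pair descriptor (Hmix) brackets the ligand features, so each stream carries context from the other part of the cluster before fusion.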

The loss function comprised a contrastive loss and a regression loss. To enable the model to learn the connection between the original nanoclusters and their decomposition into core and ligands, we employed the contrastive loss, which was calculated using the cross-entropy loss between the graph features of the original nanoclusters and the combination of the graph features of the core and the SMILES features of the ligands. The regression loss was calculated using the mean squared error between the predicted formation energy and the actual formation energy. The total loss was computed as the sum of the contrastive loss and the regression loss, with the details shown in the SI.
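One common way to realize a cross-entropy-based contrastive term is to treat the other samples in a batch as negatives. The sketch below follows that pattern and adds a mean squared error term; the dot-product similarity, the temperature parameter, the use of in-batch negatives, and the equal weighting of the two terms are assumptions, since the actual details are given only in the SI.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def contrastive_loss(whole, parts, temperature=1.0):
    """Cross-entropy over similarities; whole[i] should match parts[i]."""
    losses = []
    for i, w in enumerate(whole):
        logits = [dot(w, p) / temperature for p in parts]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)


def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def total_loss(whole, parts, e_pred, e_true):
    """Unweighted sum of the contrastive and regression terms."""
    return contrastive_loss(whole, parts) + mse(e_pred, e_true)


# Toy features: whole-cluster embeddings and combined core+ligand embeddings.
whole = [[1.0, 0.0], [0.0, 1.0]]
parts = [[1.0, 0.0], [0.0, 1.0]]
loss = total_loss(whole, parts, e_pred=[-3.1, -2.2], e_true=[-3.0, -2.0])
print(loss)
```

In this toy batch the matched pairing yields a lower contrastive term than a shuffled pairing, which is the behavior the contrastive objective is meant to reward.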

MML can discover inherent relationships between the modalities, improving the predictive ability and generalization of the models beyond the limits of a single data modality. Compared with the single-modal models using only the graph or numeric modality in Table 1, the smaller MAE values on the test set indicate the better performance of the MML models. We further investigated the impact of the descriptors on model performance, as shown in Fig. 4b. The MML model incorporating both log P and Hmix achieved the best results, with an MAE of 0.055 eV per atom, an RMSE of 0.209 eV per atom, and an R2 of 0.952. These results highlight the significance of contextual information from the nanoclusters for accurately predicting the formation energy of bimetallic nanoclusters. Additionally, we examined the individual contributions of solubility and the enthalpy of mixing. When either solubility or enthalpy of mixing was used as the sole descriptor, the model achieved MAEs of 0.072 and 0.057 eV per atom, respectively. Both surpassed the model without any descriptors (0.073 eV per atom), suggesting that the ligand environment and the miscibility of the metal pair are very important features in nanocluster design. As a result, the present MML model significantly improved the prediction accuracy of formation energies, reaching a lower MAE on the test set (0.055 eV per atom) than the graph-only (0.152 eV per atom) or descriptor-only (0.182 eV per atom) models, indicating good generalization to larger and experimentally synthesized nanoclusters.

The MML model enables rapid and reliable evaluation of large nanoclusters containing hundreds to over a thousand atoms, which are computationally challenging for conventional DFT methods. This advantage was used to predict the formation energies of large structures. As shown in Fig. 4d, the selected bimetallic nanoclusters have total atom counts (Natom) ranging from 277 to 1317, much larger than those in the training set. The predicted formation energies are all below −2.2 eV per atom, suggesting favorable thermodynamic stability, consistent with their experimental realization. We also chose three nanoclusters, Au23Pd(CHT)17, Au24Cd(nBuS)18, and Au24Hg(nBuS)18, that are affordable for DFT calculations. Their DFT formation energies were −3.56, −3.58, and −3.71 eV per atom, respectively, which the MML model reproduced with an MAE of 0.44 eV per atom. For example, a nanocluster in the external test set, Au47Cd2(TBBT)31 (Natom = 793),43 was predicted to have a formation energy of −3.34 eV per atom. Furthermore, a recently synthesized nanocluster, Ag135Cu60(PET)60Cl42 (Natom = 1317),44 featuring a buckminsterfullerene-like silver kernel, was predicted to be thermodynamically stable with a formation energy of −2.28 eV per atom. This is in agreement with its successful one-pot synthesis via the reduction of a solution containing 4-CH3C6H4SO3Ag, CuCl2·2H2O, and 2-phenylethanethiol in a mixed solvent system of dichloromethane and methanol using NaBH4 as a reductant.

Accessibility prediction: accessibility index vs. formation energy

To provide useful information for guiding experimental synthesis, accessibility prediction was carried out by considering geometric and electronic factors in bimetallic materials. Analogous to the concept of the reduced mass of a two-body system, the reduced atomic distance (D̃) is defined in eqn (10) as follows:
 
image file: d5sc04386g-t5.tif(10)
where D represents the average interatomic distance between the two metals, ΔR denotes the difference in their atomic radii, and c is an empirical scaling factor (set to 100) that ensures the ΔR term is appropriately weighted relative to D. When the two metals are similar in size, such as Au (R = 144 pm) and Ag (R = 144 pm) in Au4Ag2(PET)8, ΔR is 0 and D̃ equals 1. When the average metal–metal distance (e.g., D = 3.92 Å for Cu/Y@MOR) is much smaller than c × ΔR, D̃ approaches 0. Our analysis indicates that the optimal reduced distances are typically around 0.3 for nanoclusters (e.g., Au4Ru2(PET)8(PPh3)2, Au4Re2(PET)8(PPh3)2, and Au4Os2(PET)8(PPh3)2) and 0.1 for zeolites (e.g., Cu/Sc@MOR, Cu/Sc@MFI, and Cu/Y@MFI), with optimal metal–metal distances (D) of 3.5 Å for nanoclusters and 2.5 Å for zeolites, respectively.

On the other hand, the electronegativity difference between the metal core and its coordination environment (χenv) serves as an effective indicator of charge transfer (ΔQCT) between two metal atoms (Fig. 3). A volcano-like correlation between χenv and stability is observed in Fig. 5, where the left and right branches correspond to Cu/M@zeolites and Au/M nanoclusters, respectively. Such a consistent trend drawn from different types of materials suggests that χenv, without relying on DFT calculations, could be used in the high-throughput screening of synthetically accessible bimetallic materials. Motivated by these observations, we propose a new descriptor, φ, to quantitatively evaluate the synthetic accessibility of bimetallic structures. The definition of φ is presented as follows:

 
φ = D̃ × χenv (11)
where D̃ is the reduced distance parameter describing the geometric difference, and χenv captures the charge transfer between the two metals.


image file: d5sc04386g-f5.tif
Fig. 5 The definition of the terms of the accessibility index and their relationship with formation energy.

The most stable bimetallic structures are typically found within a characteristic energy basin where φ ≥ −0.3, as shown in Fig. 5. Thus, a larger φ value indicates better geometric and electronic matching, both of which contribute to the enhanced stability and accessibility of the bimetallic materials.

The combined evaluation of φ and Eform can serve as a guideline for the efficient screening of bimetallic materials that are likely to be synthesizable. As shown in Fig. 6a, the dependence of Eform on the descriptors Hmix and φ reveals that, in general, bimetallic nanoclusters exhibit greater thermodynamic stability than metal–zeolite systems. For bimetallic nanoclusters, the formation energies become more negative as Hmix increases, which can be attributed to the protection provided by the ligands, as shown in Fig. 6b. Ligands can cover the surface of bimetallic nanoclusters, reducing the surface energy and passivating the ‘active’ low-coordinated metal atoms to enhance the thermodynamic stability. Some ligands (e.g., carbonyl and thiol) form strong bonds with the metal atoms, creating stable coordination environments that inhibit oxidation and other unfavorable chemical reactions and thereby prevent excessive aggregation or dissociation. Bimetallic nanoclusters doped with metals that have high Hmix values have been synthesized experimentally, e.g., [Au12Ru(Ph2PCH2PPh2)6]2+ (Hmix(Au/Ru) = 15 kJ mol−1), [Au12Ir(Ph2PCH2PPh2)6]3+ (Hmix(Au/Ir) = 13 kJ mol−1), and [Au12Rh(Ph2PCH2PPh2)6]3+ (Hmix(Au/Rh) = 7 kJ mol−1).18 Seemingly immiscible bimetallic combinations can thus be realized through the strategy of protecting ligands. Gold nanoclusters containing immiscible metal pairs, specifically Au/Ru, Au/Re, and Au/Os, are identified as thermodynamically stable, with formation energies below −5 eV per atom, as shown in Fig. 6c. Representative examples include Au4Ru2(PET)8(PPh3)2, Au4Re2(PET)8(PPh3)2, and Au4Os2(PET)8(PPh3)2, all of which exhibit φ values greater than −0.3, suggesting favorable synthetic accessibility.
Among them, Au4Ru2(PET)8(PPh3)2 has already been successfully synthesized and shown to enable light-driven N2 fixation.45 In contrast, bimetallic Cu/M@zeolites (where M = Ir, Re, and Os) are predicted to be thermodynamically unstable, with corresponding φ values of −0.53, −0.55, and −0.54, respectively, indicating that experimental synthesis of these Cu/M@zeolite structures may be challenging.
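The use of φ as a screening criterion can be illustrated with a short sketch. It assumes that D̃ and χenv have already been evaluated (eqn (10) is not reproduced here, so D̃ is taken as an input), applies eqn (11), and compares the result with the −0.3 threshold quoted in the text. The function names and the example components (D̃ = 0.3, χenv = −0.5) are hypothetical; only the three Cu/M@zeolite φ values are taken from the paragraph above.

```python
def accessibility_index(d_tilde, chi_env):
    """Accessibility index phi = D~ x chi_env, following eqn (11)."""
    return d_tilde * chi_env


def is_accessible(phi, threshold=-0.3):
    """True if phi lies at or above the threshold used in the text."""
    return phi >= threshold


# Hypothetical components, for illustration only.
phi_example = accessibility_index(d_tilde=0.3, chi_env=-0.5)
print(round(phi_example, 2), is_accessible(phi_example))

# phi values quoted in the text for Cu/M@zeolites (M = Ir, Re, Os).
zeolite_phi = {"Cu/Ir": -0.53, "Cu/Re": -0.55, "Cu/Os": -0.54}
for pair, phi in zeolite_phi.items():
    print(pair, is_accessible(phi))
```

In this sketch the threshold is a keyword argument, since the cut-off is an empirical observation from Fig. 5 and may need adjusting for other material classes.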


image file: d5sc04386g-f6.tif
Fig. 6 (a) The formation energy profiles of bimetallic materials with φ and Hmix; (b) the pairwise relationship between the formation energy and Hmix; (c) the structures of inaccessible Cu/M@zeolites and accessible Au/M nanoclusters.

The training dataset, while primarily focused on ligand-protected nanoclusters and zeolites, contains a variety of metal combinations, stoichiometries, and coordination environments that capture essential patterns of metal–ligand interactions and stability. This diversity within the training data ensures that the model is not overfitted to a specific material type but rather learns generalizable features of bimetallic systems. The generalization of the proposed multimodal machine learning model beyond ligand-protected nanoclusters and zeolites originates from the intrinsic bimetallic pairing in the selected descriptors. The environmental electronegativity (χenv) describes the interaction between the metal core and its surrounding coordination environment, regardless of whether it is in nanoclusters, zeolites, alloys, or 2D materials. Similarly, the mixing enthalpy (Hmix) reflects the miscibility between two metals, a property that is not restricted to a specific family but is applicable to all bimetallic systems. By incorporating such transferable descriptors that capture physicochemical principles, the model achieves predictive capability across different classes of bimetallic materials.

The machine learning model was further applied to 100 different materials, including metal–zeolites,28 nanoclusters,37 metal-loaded oxides/nitrides32–34 at the interfaces, alloys,35,36 and 2D materials,29–31 spanning three-dimensional to two-dimensional configurations. As summarized in Table S4, the synthesized configurations are predicted to be thermodynamically stable under operational conditions. The accessibility index, φ, correlates with the formation energies with a Pearson correlation coefficient (r) of −0.63, as shown in Fig. 7, indicating that a higher value of φ tends to correspond to a more stable structure. The predictive accuracy of the model is expected to increase further by incorporating material-specific descriptors, which will enhance its ability to predict the stability of a broader range of bimetallic systems in future work.
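As a reminder of how such a correlation is quantified, the sketch below computes the Pearson correlation coefficient between φ and Eform from its textbook definition. The function name `pearson_r` and the five data points are hypothetical and are not taken from Table S4; they only illustrate that a negative r means Eform becomes more negative as φ increases.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical (phi, E_form in eV per atom) pairs, for illustration only.
phi_values = [-0.5, -0.4, -0.3, -0.2, -0.1]
e_form = [-1.0, -1.5, -1.4, -2.0, -2.6]
r = pearson_r(phi_values, e_form)
print(f"r = {r:.2f}")
```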


image file: d5sc04386g-f7.tif
Fig. 7 Accessibility prediction via Eform and φ for the external test set of bimetallic materials, including porous materials, nanoclusters, alloys, nanoparticles, metal oxides/nitrides, and 2D materials.

Bimetallic pairs of noble metals (Au/M and Pt/M) and non-noble metals (Cu/M and Ni/M), encompassing nanoclusters, metal–zeolites, 2D materials, etc., were evaluated with the RFR algorithm, as illustrated in Fig. 8 with standard deviation error bars. The color bar indicates the average value of the accessibility index, φ, for each metal pair. The Au/M pairs appear to be more stable than the Cu/M pairs, which can be attributed to the higher electronegativity of Au (χAu = 2.54 vs. χCu = 1.90), allowing Au to accept electrons when bonded to metals of lower electronegativity.
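The per-pair averages and error bars described above amount to grouping the predictions for all configurations of a given metal pair and reporting their mean and standard deviation. A minimal sketch of this aggregation is given below; the function name `summarize_by_pair`, the records, and the choice of the sample (rather than population) standard deviation are assumptions made for illustration, since the text does not specify them.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_pair(records):
    """Map each metal pair to (mean, sample standard deviation) of E_form."""
    grouped = defaultdict(list)
    for pair, e_form in records:
        grouped[pair].append(e_form)
    summary = {}
    for pair, values in grouped.items():
        # stdev needs at least two points; report 0.0 for a single entry.
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[pair] = (mean(values), spread)
    return summary


# Hypothetical predictions in eV per atom, for illustration only.
records = [("Au/Ag", -2.4), ("Au/Ag", -2.8), ("Cu/Zn", -1.0)]
summary = summarize_by_pair(records)
for pair, (avg, spread) in summary.items():
    print(f"{pair}: {avg:.2f} ± {spread:.2f} eV per atom")
```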


image file: d5sc04386g-f8.tif
Fig. 8 The accessibility prediction via Eform, φ, and Hmix for the bimetallic pairs comprising noble metals (Au/M and Pt/M) and base metals (Cu/M and Ni/M) predicted by machine learning.

Among the noble metal pairs, the top three Au-based pairs identified are Au/Os, Au/Re, and Au/Ru, all immiscible metal combinations, which are promising for future experimental realization. The formation energy of the Au/Ag pairs is −2.59 ± 0.22 eV per atom, suggesting the potential for experimental synthesis. By tuning the ligand environment, PPh3, (CF3)2PhC≡CH, and Cl were selected to protect the cores of the Au/Ag nanoclusters in the following section, with φ in the range of −0.16 to −0.10. Due to the high cost and scarcity of noble metals, introducing non-noble metals into noble metal systems to form bimetallic pairs is essential to achieve a balance between performance and cost. For non-noble metals, d1 (Y and Sc), d2 (Ti and Zr), and d10 (Au, Cd, and Zn) metals are predicted to form more stable structures in Cu/M pairs, while Ni favorably forms metal pairs with d10 metals (Au and Ag).

Another three miscible/immiscible bimetallic pairs, Ni/Pt, Cu/Gd, and Au/Ni, exhibit formation energies lower than −1 eV per atom according to the wind rose diagrams, indicating their thermodynamic stability. The nitrogen-doped carbon surface provides a favorable support for anchoring these metal species. The corresponding φ values for Ni/Pt@NC, Cu/Gd@NC, and Au/Ni@NC were calculated to be −0.30, −0.21, and −0.16, respectively, suggesting that these structures are experimentally accessible. Taking Eform, φ, and Hmix together into account, these systems are selected in the following section to validate the applicability and predictive power of the constructed machine learning model.

Newly synthesized bimetallic materials as accessibility tests

Among the vast combinations of metal pairs, prediction of accessibility could accelerate the design and discovery of promising bimetallic candidates. Two nanocluster crystals were successfully synthesized in the experiments presented in Fig. S9–S11 (for Experimental details, see the SI). The structures of Au36Ag38((CF3)2PhC≡C)30Cl10 (Au36Ag38 for short) and Au38Ag33((CF3)2PhC≡C)30Cl8 (Au38Ag33 for short) were solved by single-crystal X-ray diffraction, as shown in Fig. 9. Au36Ag38, with a predicted formation energy of −1.24 eV per atom, can be viewed as a core–shell structure. The Au21Ag3 core is enclosed by an Ag35 shell coordinated with the 10 Cl ligands, and the outermost layer consists of 15 monomeric RC≡C–Au–C≡CR staples. Au38Ag33, with the same formation energy of −1.24 eV per atom, shares a similar structure and ligand arrangement and co-crystallizes with Au36Ag38, which made it difficult to separate single crystals of the two products via crystallization. Like Au36Ag38, Au38Ag33 is also stabilized by 15 monomeric RC≡C–Au–C≡CR staples. In addition to the surface staple structure, Au38Ag33 contains an Au23 core and an Ag33 shell coordinated with the 8 Cl ligands. Interestingly, the constructed ML model can be further employed to predict the formation energy of trimetallic nanoclusters. As shown in Fig. 9b, Au9AgRh(PPh3)8Cl2 (denoted as Au9AgRh) was obtained by doping Ag and Rh into [Au11(PPh3)8Cl2], with a formation energy of −2.23 eV per atom. It has a similar structure to Au11, but the doping of Ag and Rh causes a slight structural distortion. The two positions coordinated with the Cl ligands are co-occupied by Au/Ag/Rh.
image file: d5sc04386g-f9.tif
Fig. 9 The crystal structures and ML predicted formation energies, EMLform, of (a) Au36Ag38((CF3)2PhC≡C)30Cl10 and Au38Ag33((CF3)2PhC≡C)30Cl8 and (b) Au9AgRh(PPh3)8Cl2.

Three bimetallic pairs supported on nitrogen-doped carbon (NC) were predicted to be stable with EMLform of −1.79, −1.41, and −1.28 eV per atom for Au/Ni@NC, Cu/Gd@NC, and Ni/Pt@NC systems, respectively, as shown in Fig. 10a. The DFT-calculated formation energies were −2.45, −1.81, and −1.75 eV per atom for these systems, respectively, demonstrating qualitative alignment with the predicted stability trend. The projected density of states (PDOS) also reveals an interaction between the d orbitals of bimetallic centers (Au and Ni) and the orbitals of the coordination N atoms near the Fermi level. The charge transfer between the bimetallic centers and surrounding N atoms could strengthen the stability of the Au/Ni@NC.
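The qualitative agreement stated above can be checked directly from the six quoted energies. The short sketch below computes the mean absolute deviation between the ML and DFT values and tests whether both methods rank the three systems in the same order of stability; the variable and function names are illustrative, while the numbers are those given in the preceding paragraph.

```python
systems = ["Au/Ni@NC", "Cu/Gd@NC", "Ni/Pt@NC"]
e_ml = [-1.79, -1.41, -1.28]    # ML-predicted formation energies, eV per atom
e_dft = [-2.45, -1.81, -1.75]   # DFT formation energies, eV per atom

# Mean absolute deviation between the two sets of values.
abs_errors = [abs(m - d) for m, d in zip(e_ml, e_dft)]
mae = sum(abs_errors) / len(abs_errors)


def stability_order(names, energies):
    """Names sorted from most to least negative formation energy."""
    return [name for _, name in sorted(zip(energies, names))]


same_order = stability_order(systems, e_ml) == stability_order(systems, e_dft)
print(f"MAE = {mae:.2f} eV per atom, same ranking: {same_order}")
```

For these values the ML energies are consistently less negative than the DFT ones, yet both methods give the same ordering, Au/Ni@NC, then Cu/Gd@NC, then Ni/Pt@NC, which is what the term qualitative alignment refers to here.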


image file: d5sc04386g-f10.tif
Fig. 10 Plots of DFT vs. ML predicted formation energies and density of states vs. charge density differences; SEM images; and XPS spectra of (a) Au/Ni@NC, (b) Cu/Gd@NC, and (c) Ni/Pt@NC.

The Au/Ni@NC, Cu/Gd@NC, and Ni/Pt@NC materials were synthesized experimentally; their scanning electron microscopy (SEM) images are shown in Fig. 10b (for Experimental details, see the SI). Powder X-ray diffraction (XRD) provided crystal structure information on a macroscale (Fig. S12). The structures of Au/Ni@NC, Cu/Gd@NC, and Ni/Pt@NC were further characterized by X-ray photoelectron spectroscopy (XPS), which reveals the bimetallic chemical composition and surface chemical state, as shown in Fig. 10c and Fig. S13–S15. For Au/Ni@NC, a typical Ni 2p3/2 and 2p1/2 doublet accompanied by two satellite peaks indicated the positive oxidation state of the Ni atoms.46 The Au 4f spectrum was characterized by two sets of doublets; the peaks located at 87.50 and 83.80 eV correspond to Au 4f5/2 and Au 4f7/2 of Au(0), respectively. The peaks at 88.50 and 84.80 eV could be attributed to the Au(III) species.47 For Cu/Gd@NC, two characteristic peaks at 931.20 eV and 951.42 eV could be attributed to Cu(0) species, and the other two peaks at 934.12 eV and 954.36 eV were assigned to Cu(II) species.48 Two peaks appeared at 151.44 and 142.20 eV, which were attributed to Gd 4d3/2 and Gd 4d5/2, respectively, indicating the existence of the Gd element in the synthesized material.49 For Ni/Pt@NC, the Pt 4f spectrum was deconvoluted into two spin–orbit doublets, indicating the existence of the Pt element in the synthesized material.50 These results demonstrate that the ML-predicted formation energies are qualitatively consistent with DFT calculations and experimental observations, indicating the capability of the model to design stable bimetallic materials.

Conclusions

In this study, we have presented the MML model and the accessibility index to identify stable bimetallic materials that can be experimentally synthesized. By integrating the molecule graph of the metal core and the SMILES notation of the ligand with the addition of Hmix and log P, the MML model could predict the stability of bimetallic nanoclusters with more than a thousand atoms. The stability of bimetallic materials could be further adjusted by the coordination environment, in which the protecting ligands prevent the aggregation or dissociation of the metal cores. The accessibility index, based on the combination of geometric and electronic factors, has been successfully extended to external test sets of experimentally obtained bimetallic materials, including metal–zeolites, nanoclusters, alloys, metal oxides/nitrides, and 2D materials. When φ is greater than −0.3, the corresponding bimetallic material is considered potentially synthesizable in experiments. Notably, our machine learning model has effectively guided the synthesis of six new structures, including nanoclusters and 2D materials, which were anticipated to be stable. The proposed machine learning approach holds significant promise for the discovery and synthesis of bimetallic materials in experimental settings.

Author contributions

Jing Ma and Yuming Gu initiated the project. Yuming Gu carried out the DFT calculations and built the GCNN model and 8-feature scheme model. Yating Gu, Jiawei Chen, Xinyi Liang, Dong Zheng, and Fengqi Song constructed the datasets. Shisi Tang and Yan Zhu performed the synthesis of new nanoclusters. Maochen Yang, Zekun Li, Yang Gao, and Yinghuan Shi designed the multimodal machine learning frameworks. All authors contributed to the discussion of the results as well as the writing and revision of the manuscript.

Conflicts of interest

There are no conflicts to declare.

Data availability

The data are available from the corresponding author on reasonable request.

CCDC 2440719 and 2440720 contain the supplementary crystallographic data for this paper.51a,b

Supplementary information: details of DFT computations, illustration of dataset and machine learning models, and experimental details. See DOI: https://doi.org/10.1039/d5sc04386g.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (2023ZD0120700), the National Natural Science Foundation of China (grant no. 22033004, 22373049, 62222604, 62192783, 22125202, and 92461312), the Natural Science Foundation of Jiangsu Province (BK20232012), and the Jiangsu Funding Program for Excellent Postdoctoral Talent (grant no. 2023ZB655). We are grateful to the High Performance Computing Centre of Nanjing University for providing the IBM Blade cluster system.

Notes and references

  1. X. Liu, X. Cai and Y. Zhu, Acc. Chem. Res., 2023, 56(12), 1528–1538.
  2. L. Liu and A. Corma, Chem. Rev., 2023, 123(8), 4855–4933.
  3. C. Yang, B. H. Ko, S. Hwang, Z. Liu, Y. Yao, W. Luc, M. Cui, A. S. Malkani, T. Li, X. Wang, J. Dai, B. Xu, G. Wang, D. Su, F. Jiao and L. Hu, Sci. Adv., 2020, 6, eaaz6844.
  4. L. Zhang, Z. Xie and J. Gong, Chem. Soc. Rev., 2016, 45(14), 3916–3934.
  5. K. D. Gilroy, A. Ruditskiy, H.-C. Peng, D. Qin and Y. Xia, Chem. Rev., 2016, 116(18), 10414–10472.
  6. J. L. Durham, A. S. Poyraz, E. S. Takeuchi, A. C. Marschilok and K. J. Takeuchi, Acc. Chem. Res., 2016, 49(9), 1864–1872.
  7. K. Kusada, M. Yamauchi, H. Kobayashi, H. Kitagawa and Y. Kubota, J. Am. Chem. Soc., 2010, 132, 15896–15898.
  8. C. Srivastava, S. Chithra, K. D. Malviya, S. K. Sinha and K. Chattopadhyay, Acta Mater., 2011, 59(16), 6501–6509.
  9. Q. Zhang, K. Kusada, D. Wu, T. Yamamoto, T. Toriyama, S. Matsumura, S. Kawaguchi, Y. Kubota and H. Kitagawa, Nat. Commun., 2018, 9(1), 510.
  10. K. Kusada, T. Yamamoto, T. Toriyama, S. Matsumura, K. Sato, K. Nagaoka, K. Terada, Y. Ikeda, Y. Hirai and H. Kitagawa, J. Phys. Chem. C, 2020, 125(1), 458–463.
  11. M. Meischein, A. Garzón-Manjón, T. Hammerschmidt, B. Xiao, S. Zhang, L. Abdellaoui, C. Scheu and A. Ludwig, Nanoscale Adv., 2022, 4(18), 3855–3869.
  12. B. B. Rajeeva, P. Kunal, P. S. Kollipara, P. V. Acharya, M. Joe, M. S. Ide, K. Jarvis, Y. Liu, V. Bahadur, S. M. Humphrey and Y. Zheng, Matter, 2019, 1(6), 1606–1617.
  13. J. Feng, D. Chen, P. V. Pikhitsa, Y.-h. Jung, J. Yang and M. Choi, Matter, 2020, 3(5), 1646–1663.
  14. P.-C. Chen, M. Gao, C. A. McCandler, C. Song, J. Jin, Y. Yang, A. L. Maulana, K. A. Persson and P. Yang, Nat. Nanotechnol., 2024, 19, 775–781.
  15. S. Yang, S. Chen, L. Xiong, C. Liu, H. Yu, S. Wang, N. L. Rosi, Y. Pei and M. Zhu, J. Am. Chem. Soc., 2018, 140(35), 10988–10994.
  16. C. Zhou, H. Li, Y. Song, F. Ke, W. W. Xu and M. Zhu, Nanoscale, 2019, 11(41), 19393–19397.
  17. C. Yao, N. Guo, S. Xi, C.-Q. Xu, W. Liu, X. Zhao, J. Li, H. Fang, J. Su, Z. Chen, H. Yan, Z. Qiu, P. Lyu, C. Chen, H. Xu, X. Peng, X. Li, B. Liu, C. Su, S. J. Pennycook, C.-J. Sun, J. Li, C. Zhang, Y. Du and J. Lu, Nat. Commun., 2020, 11(1), 4389.
  18. S. Takano, H. Hirai, T. Nakashima, T. Iwasa, T. Taketsugu and T. Tsukuda, J. Am. Chem. Soc., 2021, 143(28), 10560–10564.
  19. Y. Gu, S. Tang, X. Liu, X. Liang, Q. Zhu, H. Wu, X. Yang, W. Jin, H. Chen, C. Liu, Y. Zhu and J. Ma, J. Mater. Chem. A, 2024, 12(8), 4460–4472.
  20. J. Jang, G. H. Gu, J. Noh, J. Kim and Y. Jung, J. Am. Chem. Soc., 2020, 142(44), 18836–18843.
  21. A. Ghosh, S. Datta and T. Saha-Dasgupta, J. Phys. Chem. C, 2022, 126(15), 6847–6853.
  22. J. Ock, S. Badrinarayanan, R. Magar and A. Antony, Nat. Mach. Intell., 2024, 6(12), 1501–1511.
  23. N. Jeong, S. Park, S. Mahajan, J. Zhou, J. Blotevogel, Y. Li, T. Tong and Y. Chen, Nat. Commun., 2024, 15(1), 10918.
  24. S. Wang, S. Gong, T. Böger, J. A. Newnham, D. Vivona, M. Sokseiha, K. Gordiz, A. Aggarwal, T. Zhu, W. G. Zeier, J. C. Grossman and Y. Shao-Horn, Chem. Mater., 2024, 36(23), 11541–11550.
  25. X. Yan, A. Sedykh, W. Wang, B. Yan and H. Zhu, Nat. Commun., 2020, 11, 2519.
  26. X. Yan, A. Sedykh, W. Wang, X. Zhao, B. Yan and H. Zhu, Nanoscale, 2019, 11, 8352–8362.
  27. W. Wang, X. Yan, L. Zhao, D. P. Russo, S. Wang, Y. Liu, A. Sedykh, X. Zhao, B. Yan and H. Zhu, J. Cheminf., 2019, 11, 6.
  28. T. Chen, W. Yu, C. K. T. Wun, T.-S. Wu, M. Sun, S. J. Day, Z. Li, B. Yuan, Y. Wang, M. Li, Z. Wang, Y.-K. Peng, W.-Y. Yu, K.-Y. Wong, B. Huang, T. Liang and T. W. B. Lo, J. Am. Chem. Soc., 2023, 145(15), 8464–8473.
  29. J. Hao, H. Zhu, Q. Zhao, J. Hao, S. Lu, X. Wang, F. Duan and M. Du, Nano Res., 2023, 16(7), 8863–8870.
  30. Z. Li, S. Ji, C. Wang, H. Liu, L. Leng, L. Du, J. Gao, M. Qiao, J. H. Horton and Y. Wang, Adv. Mater., 2023, 35(25), 2300905.
  31. L. Zhang, J. Feng, S. Liu, X. Tan, L. Wu, S. Jia, L. Xu, X. Ma, X. Song, J. Ma, X. Sun and B. Han, Adv. Mater., 2023, 35(13), 2209590.
  32. J. Fu, J. Dong, R. Si, K. Sun, J. Zhang, M. Li, N. Yu, B. Zhang, M. G. Humphrey, Q. Fu and J. Huang, ACS Catal., 2021, 11(4), 1952–1961.
  33. Y. Lou, F. Jiang, W. Zhu, L. Wang, T. Yao, S. Wang, B. Yang, B. Yang, Y. Zhu and X. Liu, Appl. Catal., B, 2021, 291, 120122.
  34. Z. Zhang, S. Chen, J. Zhu, C. Ye, Y. Mao, B. Wang, G. Zhou, L. Mai, Z. Wang, X. Liu and D. Wang, Nano Lett., 2023, 23(6), 2312–2320.
  35. K. Qi, Y. Zhang, N. Onofrio, E. Petit, X. Cui, J. Ma, J. Fan, H. Wu, W. Wang, J. Li, J. Liu, Y. Zhang, Y. Wang, G. Jia, J. Wu, L. Lajaunie, C. Salameh and D. Voiry, Nat. Catal., 2023, 6(4), 319–331.
  36. D. Wei, Y. Wang, C.-L. Dong, Z. Zhang, X. Wang, Y.-C. Huang, Y. Shi, X. Zhao, J. Wang, R. Long, Y. Xiong, F. Dong, M. Li and S. Shen, Angew. Chem., Int. Ed., 2023, 62, e202217369.
  37. X. Liu, E. Wang, M. Zhou, Y. Wan, Y. Zhang, H. Liu, Y. Zhao, J. Li, Y. Gao and Y. Zhu, Angew. Chem., Int. Ed., 2022, 61, e202207685.
  38. Z. Liu, L. Lin, Q. Jia, Z. Cheng, Y. Jiang, Y. Guo and J. Ma, J. Chem. Inf. Model., 2021, 61(3), 1066–1082.
  39. Q. Jia, Y. Ni, Z. Liu, X. Gu, Z. Cui, M. Fan, Q. Zhu, Y. Wang and J. Ma, J. Chem. Inf. Model., 2022, 62(20), 4928–4936.
  40. C. Edwards, T. Lai, K. Ros, G. Honke, K. Cho and H. Ji, arXiv, 2022, preprint, arXiv:2204.11817, DOI:10.48550/arXiv.2204.11817.
  41. T. Cheng, Y. Zhao, X. Li, F. Lin, Y. Xu, X. Zhang, Y. Li, R. Wang and L. Lai, J. Chem. Inf. Model., 2007, 47(6), 2140–2148.
  42. T. Dao and A. Gu, arXiv, 2024, preprint, arXiv:2405.21060, DOI:10.48550/arXiv.2405.21060.
  43. S. Zhuang, D. Chen, L. Liao, Y. Zhao, N. Xia, W. Zhang, C. Wang, J. Yang and Z. Wu, Angew. Chem., Int. Ed., 2020, 59, 3073–3077.
  44. L. Tang, W. Dong, Q. Han, B. Wang, Z. Wu and S. Wang, Nat. Synth., 2025, 4, 506–513.
  45. Y. Sun, W. Pei, M. Xie, S. Xu, S. Zhou, J. Zhao, K. Xiao and Y. Zhu, Chem. Sci., 2020, 11(9), 2440–2447.
  46. X. Zhang, H. Su, P. Cui, Y. Cao, Z. Teng, Q. Zhang, Y. Wang, Y. Feng, R. Feng, J. Hou, X. Zhou, P. Ma, H. Hu, K. Wang, C. Wang, L. Gan, Y. Zhao, Q. Liu, T. Zhang and K. Zheng, Nat. Commun., 2023, 14(1), 7115.
  47. J. Zhao, H. Wang, H. Geng, Q. Yang, Y. Tong and W. He, ACS Appl. Nano Mater., 2021, 4(7), 7253–7263.
  48. X. Zhou, M. Wang, J. Chen and X. Su, Talanta, 2022, 245, 123451.
  49. S. Ning, M. Li, X. Wang, D. Zhang, B. Zhang, C. Wang, D. Sun, Y. Tang, H. Li, K. Sun and G. Fu, Angew. Chem., Int. Ed., 2023, 62, e202314565.
  50. F. Zhou, X. Ke, Y. Chen, M. Zhao, Y. Yang, Y. Dong, C. Zou, X.-A. Chen, H. Jin, L. Zhang and S. Wang, J. Energy Chem., 2024, 88, 513–520.
  51. (a) Y. Gu, Y. Gu, M. Yang, S. Tang, J. Chen, X. Liang, D. Zheng, Z. Li, F. Song, Y. Gao, Y. Zhu, Y. Shi and J. Ma, CCDC 2440719: Experimental Crystal Determination, 2025, DOI:10.5517/ccdc.csd.cc2mxrw6; (b) Y. Gu, Y. Gu, M. Yang, S. Tang, J. Chen, X. Liang, D. Zheng, Z. Li, F. Song, Y. Gao, Y. Zhu, Y. Shi and J. Ma, CCDC 2440720: Experimental Crystal Determination, 2025, DOI:10.5517/ccdc.csd.cc2mxrx7.

Footnote

Y. Gu, Y. Gu, M. Yang and S. Tang contributed equally to this work.

This journal is © The Royal Society of Chemistry 2025