Open Access Article
Lane E. Schultz,a Nickolas Gantzler,b N. Scott Bobbitt,b Dorina F. Sava Gallis,c and Rémi Dingreville*a
aCenter for Integrated Nanotechnologies (CINT), Sandia National Laboratories, Albuquerque, NM 87185, USA. E-mail: rdingre@sandia.gov
bMaterial, Physical, and Chemical Sciences Center, Sandia National Laboratories, Albuquerque, NM 87185, USA
cNanoscale Sciences Department, Sandia National Laboratories, Albuquerque, NM 87185, USA
First published on 9th February 2026
Metal–organic frameworks (MOFs) are porous crystalline materials with applications in gas capture, drug delivery, and molecular separations. While high-throughput computational screening has traditionally identified promising MOFs, recent advances increasingly harness machine learning to accelerate discovery and screening. Existing optimization methods such as Bayesian optimization (BO) and genetic algorithms often overlook the detailed structure–property relationships critical to MOF performance. Here, we present an optimization workflow that couples graph neural networks (GNNs) with multi-objective BO to enhance MOF discovery and screening. By representing MOFs as graphs embedding atomic- and structural-level features, GNNs capture intricate structure–property correlations, enabling more accurate property predictions than traditional methods relying solely on macroscopic descriptors. Our integrated framework efficiently identifies Pareto-optimal MOF candidates tailored for improved separation of alkanes, alkenes, alcohols, and aromatics, demonstrating the significant advantage of graph-based models in materials optimization workflows.
Due to their versatility, MOFs have garnered significant attention, with researchers actively seeking MOFs with enhanced properties for targeted applications. Given their modular construction and the vast chemical space of different nodes, linkers, and topologies, there are a large number of possible MOFs, meaning they can be chosen or carefully designed for specific applications. Specialized databases containing thousands of MOFs have emerged, including the hMOF database,18 ToBaCCo,19 ab initio REPEAT charge MOF (ARC-MOF),20,21 computation-ready experimental MOF (CoRE MOF),22 and MOSAEC MOF database (MOSAEC-DB)23 to name a few. ARC-MOF20 features MOFs with density functional theory (DFT) calculated charges and some experimentally synthesized samples. CoRE MOF22 stores experimentally synthesized MOFs along with some material properties. MOSAEC-DB23 includes curated MOFs that were verified by the physical plausibility of metal oxidation states. These extensive databases provide researchers with opportunities for MOF property optimization on more physically feasible structures. While databases like CoRE MOF and ARC-MOF have aggregated tens of thousands of structures, they represent a vast and noisy chemical space. Brute-force high-throughput computational screening of every candidate for every possible application is computationally intractable. The generation of new MOFs has evolved dramatically from basic combinatorial approaches where ligands and secondary building units are interchanged between known structures,18 to sophisticated machine-learning (ML) techniques such as variational autoencoders that explore latent spaces to produce innovative structures.24 The primary bottleneck in materials discovery is no longer generation but rather the efficient screening and rapid identification of the highest-performing, Pareto-optimal candidates from these massive existing databases. 
Therefore, we utilize the well-established CoRE MOF22 2019 database for much of our work, which provides experimentally validated structures as a reliable foundation. We also use the ARC-MOF database.20,21
Various computational approaches have been employed to search for MOFs with optimal properties,25 including hierarchical screening with ML predictions26 and genetic algorithms.27 Bayesian optimization (BO) has emerged as a particularly effective method, utilizing surrogate models and acquisition functions that balance exploitation (maximizing target properties if greater is better) and exploration (investigating uncertain areas) to efficiently identify promising candidates. Researchers have demonstrated that BO can rapidly screen optimal materials from large search spaces, sometimes requiring only a few proposed materials to find optimal candidates.28 Advanced implementations incorporate multi-fidelity approaches that balance computational cost with accuracy for a similar class of material called covalent organic frameworks,29 and diversity promoting mechanisms like the Vendi score to ensure broad exploration of a material space.30 Others optimize multiple properties simultaneously like in Comlek et al.,31 where MOFs with favorable CO2 working capacity and CO2 over N2 selectivity were selected. Many approaches to MOF discovery employ BO with MOFs represented in a tabular format (i.e., representations where each row corresponds to a material and each column to a feature or property). Tabular formats emphasize macroscopic properties such as pore size and surface area.32–34 While useful, these simplified descriptors overlook critical structure–property relationships at the atomic scale inherent in graph-based representations. They fail to capture the local chemical environment of the metal node, the precise geometry of the organic linker, or the material's topology. Even strategies utilizing Simplified Molecular Input Line Entry System (SMILES) strings encoded as vectors, such as those discussed in Liu et al.,30 may not inherently capture atomic properties like local chemistry or geometry. 
In contrast, graph neural networks (GNNs) that integrate both structural and chemical information have demonstrated superior performance in property prediction compared to their tabular counterparts, as highlighted by Luo et al.35 and Li et al.36 Both sources indicate that graph-based models achieve better accuracy in predicting properties than models trained on tabular data. Luo et al.35 focused on atomic charge prediction while Li et al.36 focused on prediction of CH4 over N2 selectivity. The integration of property optimization via BO with a GNN serving as the surrogate model holds significant promise for MOFs, offering a way to exploit complex structural and chemical information.
MOFs have been widely studied for chemical separations.5,6 One of the most difficult challenges in adsorptive separations is to distinguish between molecules with similar chemical or geometric characteristics. When a MOF exhibits strong affinity for a specific molecule, it often shows comparable affinity for structurally or chemically related species, making selective adsorption difficult. Consequently, a tradeoff in selectivity inevitably arises, and this tradeoff becomes increasingly pronounced as molecular similarity increases. In this work, we integrate a GNN-based structural representation of MOFs with a BO workflow to identify and screen MOFs capable of separating chemically similar organic molecules (e.g., alcohols, butenes, benzene/toluene). This integrated approach enables the discovery of Pareto-optimal MOFs across multiple challenging separation tasks, effectively prioritizing candidates that optimally balance selectivity between closely related molecules.
We utilized structures from the ARC-MOF20,21 v6 and CoRE MOF22 2019 Version 1.1.3 databases to search for Pareto-optimal MOFs. Each MOF was uniquely characterized using Voronoi polyhedra (VP), which define the connectivity employed in a graph representation of the framework. These graphs incorporated atomic-level chemical information, local atomic neighborhood environments, and global MOF properties. Using these graph representations, surrogate GNNs were trained on subsets of MOFs and subsequently applied within a BO framework to explore the remaining search space, assessing whether the GNNs effectively identified Pareto-front materials. We validated this approach by comparing multiple model types with our GNN BO workflow to identify MOFs with optimal CH4/N2 selectivity using the data of Li et al.36 Separations were quantified by maximizing the Henry constant of the target species while minimizing those of competing molecules. The Henry coefficient, which represents the proportionality between gas adsorption and pressure in the low-pressure limit, was used as an indicator of affinity toward specific adsorbates. Optimization tasks included one, two, three, and four simultaneous chemical objectives. The GNN-based candidate selection consistently outperformed random selection across most test cases. Feature ablation analyses provided interpretability, revealing the respective contributions of edge-, node-, and global-level features to model predictions. The present framework not only captures structure-to-property mapping but also enables automated discovery of materials with optimized performance. The current search space, however, remains limited by the availability of structures within existing databases.
482 and 18,522 MOF structures from the ARC-MOF20,21 v6 and CoRE MOF 2019 databases,22,37 respectively. The ARC-MOF collection includes both experimentally synthesized and computationally generated frameworks, along with DFT-computed atomic charges and several macroscopic MOF properties. The CoRE MOF database comprises experimentally realized MOFs that have been curated for atomistic modeling by addressing issues such as partial occupancies, symmetry inconsistencies, and the presence of solvent molecules. The structural information for each MOF in both datasets is encoded in a crystallographic information file (CIF) containing the element type, site indices, and three-dimensional atomic spatial coordinates. From CoRE MOF, subsets were further filtered to only retain structures with a pore limiting diameter greater than the largest kinetic diameter of adsorbed volatile organic compounds (VOCs). See the SI for tabulated kinetic diameters. We note that these datasets primarily comprise ‘computationally ready’ versions of known structures. While such readiness increases the likelihood that our ‘optimal’ materials can be synthesized, it also means that the model may have lower predictive confidence in ‘dark’ chemical spaces, such as those involving highly reactive open-metal sites or non-traditional coordination geometries that are underrepresented in the training data.
$$q_i = K_{H,i}\, p_i \qquad (p_i \to 0) \tag{1}$$

$$S_{i/j} = \frac{K_{H,i}}{K_{H,j}} \tag{2}$$
Finding a single material that optimizes multiple selectivities is often infeasible due to inherent trade-offs in material properties. Instead, a set of non-dominated solutions, known as the Pareto front, can be identified for multi-objective optimization. The Pareto front represents solutions that cannot be improved in one objective without degrading another.39 Mathematically, a solution yi belongs to the Pareto front if no other solution yj dominates it:
$$\nexists\, \mathbf{y}_j \;:\; \forall k,\; y_{j,k} \ge y_{i,k} \;\;\wedge\;\; \exists k,\; y_{j,k} > y_{i,k} \tag{3}$$
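The non-domination condition above can be checked directly. Below is a minimal pure-Python sketch (the function names `dominates` and `pareto_front` are ours), assuming all objectives have been sign-flipped so that larger is always better:

```python
def dominates(a, b):
    """True if objective vector a dominates b: a >= b in every
    objective and a > b in at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset of a collection of vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]
```

For example, `pareto_front([(1, 2), (2, 1), (1, 1)])` keeps only `(1, 2)` and `(2, 1)`, since `(1, 1)` is dominated by both.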
| Dataset | Structures | Objective | Family | Min/Max |
|---|---|---|---|---|
| {1-propanol} | 3634 | 1-Propanol | Alcohol | Min |
| {butane} | 3653 | Butane | Alkane | Max |
| {benzene, toluene} | 2125 | Benzene | Aromatic | Max |
| | | Toluene | Aromatic | Min |
| {1-propanol, butane} | 3634 | 1-Propanol | Alcohol | Min |
| | | Butane | Alkane | Max |
| {1-propanol, 2-propanol} | 3634 | 1-Propanol | Alcohol | Max |
| | | 2-Propanol | Alcohol | Min |
| {1-propanol, propane} | 3634 | 1-Propanol | Alcohol | Min |
| | | Propane | Alkane | Max |
| {2-cis-butene, 2-trans-butene} | 3260 | 2-cis-Butene | Alkene | Max |
| | | 2-trans-Butene | Alkene | Min |
| {butane, propane} | 3653 | Butane | Alkane | Max |
| | | Propane | Alkane | Min |
| {1-propanol, 2-propanol, butane} | 3634 | 1-Propanol | Alcohol | Min |
| | | 2-Propanol | Alcohol | Min |
| | | Butane | Alkane | Max |
| {1-propanol, 2-propanol, propane} | 3634 | 1-Propanol | Alcohol | Min |
| | | 2-Propanol | Alcohol | Min |
| | | Propane | Alkane | Max |
| {1-propanol, 2-propanol, butane, propane} | 3634 | 1-Propanol | Alcohol | Min |
| | | 2-Propanol | Alcohol | Min |
| | | Butane | Alkane | Max |
| | | Propane | Alkane | Max |
000 insertion cycles per adsorbate. MOF structures were held rigid at their crystallographic minimum positions. Non-bonded interactions were treated using 12–6 Lennard-Jones potentials with parameters taken from the universal force field (UFF)42 for MOFs and TraPPE43–46 for the adsorbates. For most adsorbates, the united atom model was applied, while for benzene and toluene we employed rigid 12-site models with explicit charges. The toluene model was adapted from benzene by replacing a hydrogen atom with an sp3 methyl group. Interaction cross terms were determined using Lorentz–Berthelot mixing rules. Partial charges on the MOFs were computed using the EQeq47 method as implemented in RASPA and treated using Ewald summations. All simulations used a 12.0 Å cutoff radius and 300 K temperature, with simulation boxes duplicated as needed to ensure dimensions at least twice this value in all directions. Ideal gas Rosenbluth weights were pre-computed at 300 K. Given the wide range of values in Henry coefficients, we took their natural logarithms for subsequent analysis. All Lennard-Jones parameters and molecule structures are provided as RASPA definition files in the repository mentioned in the Data availability section at the end of this article.
Regarding the representativeness of Henry coefficients as indicators of separation performance, we note that Henry's law strictly applies only at low adsorbate concentrations. Nevertheless, Henry coefficients remain informative for low-concentration applications such as VOC capture or chemical sensing.48,49 They are also valuable descriptors for dilute gas separations, consistent with prior studies.50 Moreover, Henry coefficients computed via Widom insertions capture a comprehensive average of adsorbate–framework interactions by sampling the entire pore environment. Accounting for higher-concentration behavior would require explicitly modeling adsorbate–adsorbate interactions, which become non-negligible for species such as alcohols. Incorporating these effects would entail substantial methodological extensions that we identify as an important avenue for future work.
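For intuition, the Widom-insertion estimate of the Henry coefficient reduces to a Boltzmann average over test-particle insertion energies. The sketch below is schematic only: the proportionality constant (which involves the framework density and temperature) is deliberately omitted, and the function name and unit choices are our own assumptions, not RASPA's API:

```python
import numpy as np

R = 8.31446e-3  # gas constant in kJ/(mol K)

def henry_proxy(insertion_energies, T=300.0):
    """Boltzmann-averaged Widom insertion weight <exp(-dU/RT)>,
    proportional to the Henry coefficient; the framework-density
    prefactor is omitted in this sketch. `insertion_energies` are
    test-particle interaction energies in kJ/mol."""
    beta = 1.0 / (R * T)
    return float(np.mean(np.exp(-beta * np.asarray(insertion_energies, dtype=float))))

# As in the text, the wide dynamic range of Henry coefficients is
# tamed with a natural log, e.g. ln_kh = np.log(henry_proxy(energies))
```

Deeper (more negative) insertion energies yield exponentially larger weights, which is why the natural-log transform is applied before analysis.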
Fig. 1 Conceptual framework for graph-based representation of metal–organic frameworks (MOFs). A MOF from CoRE MOF22,37 is shown in panel (a) along with a subset of calculated properties. Panel (b) presents an abstraction of a material structure. Panel (c) illustrates the corresponding graph representation, where atoms are represented as nodes and connections to neighbors as edges. Panel (d) depicts the global features that characterize the entire graph, the node features that encode atomic properties, and the edge features that capture connectivity.
The connections between nodes forming the graph structure were defined using VP, a geometric construction that partitions space around each atom by defining bisecting planes between the central atom and its neighboring atoms. Each MOF structure provided as a CIF was loaded into pymatgen51 as a structural object. The nearest-neighbor network was constructed using pymatgen's IsayevNN class, applying a bond distance tolerance of 0.5 Å without imposing restrictions on the number of Voronoi neighbors.51,52 This strategy, inspired by Luo et al.,35 defines an edge between two atoms if (i) their distance is less than or equal to the sum of their Cordero covalent radii plus a distance tolerance and (ii) they share a common Voronoi facet. The 0.5 Å tolerance was specifically chosen to prevent isolated nodes, as noted in Luo et al.35 Thus each MOF was represented as a graph in which atomic connectivity is encoded through VP-based spatial relationships.
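Criterion (i) of this edge definition can be expressed as a one-line check. The helper below is illustrative only (the actual pipeline uses pymatgen's `IsayevNN`), and the radii table is a small subset of the Cordero compilation:

```python
# Cordero covalent radii (angstrom) for a small illustrative subset of
# elements; values from the Cordero et al. compilation.
CORDERO_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "Zn": 1.22}

def passes_distance_criterion(elem_i, elem_j, dist, tol=0.5):
    """Criterion (i) of the edge definition: the interatomic distance
    must not exceed the sum of the two Cordero covalent radii plus the
    0.5 A tolerance. Criterion (ii), the shared Voronoi facet, is a
    separate geometric test handled by the Voronoi construction."""
    return dist <= CORDERO_RADII[elem_i] + CORDERO_RADII[elem_j] + tol
```

For example, a C–H contact at 1.09 Å passes (0.76 + 0.31 + 0.5 = 1.57 Å), while two carbon atoms 3.0 Å apart do not.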
Other featurizations exist. We acknowledge that the smooth overlap of atomic positions (SOAP)53 approach is a powerful method for encoding local atomic information in structures. SOAP representations are differentiable with respect to atomic positions (crucial for molecular dynamics simulations) while maintaining the invariance properties characteristic of graph representations. However, the original SOAP formulation incorporates only positional and element type information and does not incorporate additional chemical descriptors such as electronegativity. In contrast, graph representations offer greater flexibility, allowing the seamless inclusion of diverse chemical and structural features within a unified framework.
Chemical and structural properties of each atomic site were also incorporated as node features. These properties were obtained from two sources: descriptors extracted through pymatgen51 and features adopted from Luo et al.35 From pymatgen, each node was encoded with elemental attributes including Mendeleev number, atomic mass, number of electrons, electron affinity, and the number of electrons in each orbital of the atom up to the 7p orbital (with unoccupied orbitals assigned a value of zero). The VP geometry of each atom was represented using Schläfli notation, which records the specific number of enclosing faces and the number of vertices per face. For example, an atom with nine enclosing faces, where three faces have four vertices and the remaining six have five vertices, is represented as (0, 3, 6), where the index indicates the vertex count and the value the number of faces. Faces with up to 10 vertices were considered, following the hard-coded constraint in pymatgen's VoronoiAnalyzer. In total, 31 node features were obtained from the properties described above, while additional features were included using modified code and parameterizations provided by Luo et al.35
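The face-counting signature described above can be sketched as follows (a hypothetical helper, not pymatgen's implementation), reproducing the (0, 3, 6) example from the text:

```python
def face_vertex_signature(face_vertex_counts, min_v=3, max_v=10):
    """Count Voronoi-polyhedron faces by their number of vertices.
    Index 0 corresponds to triangular (3-vertex) faces; faces with more
    than max_v vertices are ignored, mirroring pymatgen's cap at 10."""
    sig = [0] * (max_v - min_v + 1)
    for v in face_vertex_counts:
        if min_v <= v <= max_v:
            sig[v - min_v] += 1
    return tuple(sig)
```

Passing three 4-vertex faces and six 5-vertex faces yields a signature beginning (0, 3, 6), matching the worked example in the text.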
As in Luo et al.,35 each node was further enriched with atomic properties from Mendeleev,54 including atomic number, atomic weight, van der Waals radius, first ionization energy, Cordero covalent radius, Rahm atomic radius, static dipole polarizability, and Ghosh electronegativity (8 features). Beyond these direct attributes, a set of local environment descriptors were computed to capture geometric and chemical interactions between neighboring atoms.
The property weighted radial distribution functions (pwRDFs)55 quantify how a given atomic property P varies radially around each atom i:
$$\mathrm{pwRDF}_i^P(r) = \sum_{j \neq i} P_j\, e^{-B\,(r_{ij} - r)^2} \tag{4}$$

where $P_j$ is the property value of neighboring atom $j$, $r_{ij}$ the interatomic distance, and $B$ a Gaussian smoothing parameter.
The property weighted angular distribution functions (pwADFs) were computed analogously to describe three-body correlations between atom i connected to atoms j and k:
$$\mathrm{pwADF}_i^P(\theta) = \sum_{j,k \neq i} P_j P_k\, e^{-B\,(\theta_{jik} - \theta)^2} \tag{5}$$

where $\theta_{jik}$ is the angle subtended at atom $i$ by atoms $j$ and $k$.
![]() | (6) |
![]() | (7) |
The weighted radial atom-centered symmetry function (wRACSF)35,56,57 was also included,
$$\mathrm{wRACSF}_i^P = \sum_{j \neq i} P_j\, e^{-\eta\,(r_{ij} - r_s)^2}\, f_c(r_{ij}) \tag{8}$$
$$f_c(r) = \begin{cases} \tfrac{1}{2}\left[\cos\!\left(\pi r / r_c\right) + 1\right], & r \le r_c \\ 0, & r > r_c \end{cases} \tag{9}$$
![]() | (10) |
Finally, shell features35,58 characterized property distributions over coordination shells up to the fourth order. We first define what a coordination shell for an atom i represents. The first coordination shell consists of the immediate neighbors surrounding an atom. The second coordination shell encompasses all atoms connected to the first neighbors of atom i. The third coordination shell includes all atoms connected to the second coordination shell atoms, and this pattern continues for higher-order shells. For each atom in a MOF and for each property P and shell n, we computed the average, absolute difference, and standard deviation:
$$\mathrm{avg}_i^{P,n} = \frac{1}{|S_i^n|} \sum_{j \in S_i^n} P_j \tag{11}$$

$$\mathrm{diff}_i^{P,n} = \frac{1}{|S_i^n|} \sum_{j \in S_i^n} \left| P_j - P_i \right| \tag{12}$$

$$\mathrm{std}_i^{P,n} = \sqrt{ \frac{1}{|S_i^n|} \sum_{j \in S_i^n} \left( P_j - \mathrm{avg}_i^{P,n} \right)^2 } \tag{13}$$

where $S_i^n$ denotes the set of atoms in the $n$th coordination shell of atom $i$.
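A minimal sketch of shell construction and the three shell statistics described above, using a plain adjacency-list graph (the helper names are ours):

```python
from statistics import mean, pstdev

def coordination_shells(adj, i, max_order=4):
    """Breadth-first coordination shells around atom i: shell 1 holds
    direct neighbours, shell n the atoms first reached at distance n."""
    seen, shells, frontier = {i}, [], [i]
    for _ in range(max_order):
        nxt = []
        for a in frontier:
            for j in adj[a]:
                if j not in seen:
                    seen.add(j)
                    nxt.append(j)
        shells.append(nxt)
        frontier = nxt
    return shells

def shell_stats(props, shell, p_i):
    """Average, mean absolute difference from P_i, and standard
    deviation of property P over one coordination shell."""
    vals = [props[j] for j in shell]
    return (mean(vals), mean(abs(v - p_i) for v in vals), pstdev(vals))
```

On a linear chain 0–1–2–3, the shells around atom 0 are [1], [2], [3], and an empty fourth shell, illustrating how higher-order shells exhaust a small graph.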
Global features were classified into two categories: material-level and graph-based descriptors. Material-level features were computed from structural and pore geometry data, while graph-based features were derived exclusively from the topological representation of each MOF. Material-level properties, calculated using pymatgen,51 include volume, density, the number of atoms, the number of each element type in a structure normalized by the total number of atoms, the average VP volume for all atoms, and the average number of atom neighbors. Additional geometrical descriptors, evaluated with Zeo++,59 include the diameter of the largest included sphere, the diameter of the largest free sphere, the diameter of the largest included sphere along the free-sphere path, both available and unavailable surface area, and both the available and unavailable void fraction. A spherical probe with the kinetic radius of N2 was used. Graph-based descriptors, computed using NetworkX,60 describe structural connectivity and topology. These include undirected graph density, diameter, radius, degree assortativity coefficient, and average shortest path length. The graph density is defined as
$$d = \frac{2e}{n(n-1)}$$

where e and n are the number of edges and nodes respectively, and describes the ratio of actual connections to possible connections in the graph. The graph diameter and radius correspond to the maximum and minimum eccentricities across all nodes, representing the farthest and nearest network spans, respectively. Unconnected nodes were excluded from distance calculations. The degree assortativity coefficient is the Pearson correlation coefficient of the degree between pairs of linked nodes, while the average shortest path length is given by
$$a = \sum_{\substack{s, t \in V \\ s \neq t}} \frac{d(s, t)}{n(n-1)} \tag{14}$$

where $V$ is the set of nodes, $d(s, t)$ the shortest-path distance between nodes $s$ and $t$, and $n$ the number of nodes.
The GNN adopted here follows the formulation introduced by Battaglia et al.62 and implemented through the MetaLayer class in PyG. In this architecture, edge, node, and global sub-models learn (from features scaled with scikit-learn's RobustScaler63) complementary hierarchical representations that capture multi-scale structural information within graphs. The latent features extracted from these components are subsequently processed through additional neural layers to infer material properties and their uncertainties. Fig. 2 provides an abstraction of our GNN, showing how multi-level graph features (global, node, and edge) are encoded into latent representations for predicting target properties with uncertainties.
The edge sub-model updates the latent representation of each edge $\mathbf{e}_{ij}$, connecting nodes $\mathbf{v}_i$ and $\mathbf{v}_j$; given a global feature vector $\mathbf{u}$, the update is defined as

$$\mathbf{e}'_{ij} = \phi^e\!\left(\mathbf{e}_{ij}, \mathbf{v}_i, \mathbf{v}_j, \mathbf{u}\right) \tag{15}$$

where $\mathbf{e}'_{ij}$ denotes the new representation of edges. The input layer dimensionality of ϕe is equal to twice the number of node features (comprising features from source and destination nodes), plus the number of edge features and global features. All hidden layers and the output layer maintain consistent width, with rectified linear unit (ReLU) activations and batch normalization between layers. The number of layers and neurons were later optimized, as described in Section 2.5.3.

The node sub-model builds upon latent edge representations to encode neighborhood effects. For each node, latent edge vectors directed toward it are aggregated using the PyG scatter function, combined with its intrinsic node and global features, and passed through a neural network ϕn:
$$\mathbf{v}'_i = \phi^n\!\left(\rho^{e \to n}\!\left(\{\mathbf{e}'_{ij}\}\right), \mathbf{v}_i, \mathbf{u}\right) \tag{16}$$

where $\mathbf{v}'_i$ is the latent representation of the nodes. This produces latent node features that reflect both local geometry and broader contextual information from neighboring atoms. The architecture of ϕn follows the same design as ϕe.
The global sub-model integrates information across the entire structure, combining aggregated node and edge embeddings with global descriptors $\mathbf{u}$. Its update function is given by

$$\mathbf{u}' = \phi^g\!\left(\rho^{n \to g}\!\left(\{\mathbf{v}'_i\}\right), \rho^{e \to g}\!\left(\{\mathbf{e}'_{ij}\}\right), \mathbf{u}\right) \tag{17}$$

where $\rho^{n \to g}$ and $\rho^{e \to g}$ perform mean aggregations across all nodes and edges, respectively. The network ϕg adheres to the same architectural conventions as the ϕe and ϕn sub-models.
After the latent edge, node, and global representations were established, their outputs were concatenated and passed through a fully connected NN that produces final material-level embeddings (the s neurons in Fig. 2(a)). These embeddings were then connected to separate layers corresponding to each target property (the p neurons in Fig. 2(a)). Each output head contains twice the number of neurons as target properties—one neuron for the predicted value and another for the associated uncertainty—allowing the model to learn both quantities simultaneously.
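The edge → node → global update sequence can be sketched in a few lines of numpy. This toy forward pass uses single random linear layers with ReLU in place of the tuned ϕ sub-models and mean aggregation for the ρ operators; all names and dimensions are illustrative, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(in_dim, out_dim):
    """Stand-in for the phi sub-models: one random linear layer with a
    ReLU (the real networks are deeper and tuned with Optuna)."""
    W = rng.standard_normal((in_dim, out_dim)) * 0.1
    return lambda x: np.maximum(x @ W, 0.0)

def graph_net_step(V, E, u, edges, phi_e, phi_n, phi_g):
    """One Battaglia-style block: update edges from (source node,
    destination node, edge, global) features, nodes from the mean of
    incoming latent edges, then the global vector from mean node and
    edge embeddings."""
    src, dst = edges
    E2 = phi_e(np.hstack([V[src], V[dst], E, np.tile(u, (len(E), 1))]))
    agg = np.zeros((len(V), E2.shape[1]))
    np.add.at(agg, dst, E2)  # sum latent edges directed at each node
    counts = np.maximum(np.bincount(dst, minlength=len(V)), 1)
    agg /= counts[:, None]   # mean aggregation (rho e -> n)
    V2 = phi_n(np.hstack([agg, V, np.tile(u, (len(V), 1))]))
    u2 = phi_g(np.concatenate([V2.mean(0), E2.mean(0), u]))
    return V2, E2, u2
```

Note how the input widths mirror the text: ϕe consumes two node vectors plus edge and global features, ϕn consumes the aggregated latent edges plus node and global features, and ϕg consumes mean node and edge embeddings plus the global vector.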
Model training employed the negative log-likelihood (NLLH) loss function. We assumed Gaussian-distributed prediction errors. For each sample i and target property y, the model generated a mean prediction µi and an associated standard deviation σi. The loss, later averaged over all training instances and target properties, is expressed as
$$\mathcal{L}_i = \frac{1}{2}\ln\!\left(2\pi\sigma_i^2\right) + \frac{\left(y_i - \mu_i\right)^2}{2\sigma_i^2} \tag{18}$$
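A direct transcription of the per-sample Gaussian NLLH term (the function name is ours); the training loss averages this over all samples and target properties:

```python
import math

def gaussian_nllh(y, mu, sigma):
    """Per-sample negative log-likelihood of observation y under
    N(mu, sigma^2). Larger predicted sigma softens the squared-error
    penalty at the cost of the log-variance term, which is how the
    model learns calibrated uncertainties."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2.0 * sigma ** 2)
```

At a perfect prediction with unit uncertainty the loss reduces to ½ln(2π), not zero, reflecting the log-variance floor.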
Data were split into training:validation:test sets with an 80:20:0 ratio for BO and an 80:10:10 ratio for parity plots, respectively. Fixed training parameters included a learning rate of 0.001, a batch size of 32, an early stopping criterion of 10 epochs (i.e., training terminates after 10 epochs without improvement), and an exponential learning rate scheduler with γ of 0.99. Each of the 100 Optuna trials involved a short 10-epoch training run. Model performance was assessed using the area under the average NLLH loss function versus epoch curve, normalized by the number of epochs to mitigate the effects of early stopping. This approach provided a smoother and more reliable optimization metric than using the minimum or final epoch loss. We selected the optimal architectural parameters from these trials. The final model parameters were taken from the epoch with the lowest observed loss. For BO, this full optimization and training workflow was repeated at each iteration to refine the surrogate model continuously.

We applied BO to explore the space of MOFs with optimal Henry coefficients. Each surrogate was initially trained with 25 data points. For single-objective Bayesian optimization (SOBO), we used the expected improvement (EI) acquisition function. For multi-objective Bayesian optimization (MOBO), we used the expected hypervolume improvement (EHI). Both implementations were from BoTorch.65
The hypervolume improvement identifies the current Pareto front from measured objectives, then quantifies the additional hypervolume contributed by a new candidate relative to a reference point, measuring the improvement across all objectives at each iteration. A detailed formulation is provided by Xu et al.39 For SOBO, where only one objective is optimized, the improvement is computed as the difference between the best achieved value and the reference point, rather than a hypervolume. All reference points were chosen via the infer_reference_point function from BoTorch.65
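For two maximization objectives, the dominated hypervolume underlying this calculation can be computed with a simple sweep; the hypervolume improvement of a candidate is then the difference in this quantity before and after adding it. This is an illustrative stand-alone routine, not BoTorch's implementation:

```python
def hypervolume_2d(points, ref):
    """Hypervolume dominated by a set of 2D objective vectors (both
    maximized) relative to a reference point ref, via a sweep over
    points in order of decreasing first objective."""
    hv, y_cover = 0.0, ref[1]
    for x, y in sorted(points, key=lambda p: -p[0]):
        if x > ref[0] and y > y_cover:
            # This point extends the dominated region upward by a
            # rectangle of width (x - ref_x) and height (y - y_cover).
            hv += (x - ref[0]) * (y - y_cover)
            y_cover = y
    return hv
```

The improvement from a candidate `c` is then `hypervolume_2d(front + [c], ref) - hypervolume_2d(front, ref)`; dominated candidates contribute zero, mirroring the behavior described above.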
We chose random selection of candidates from the available search space as a baseline for comparison. This approach ignores prior information and mimics unguided exploration, offering a conservative measure against which BO performance can be evaluated.
To quantitatively compare SOBO and MOBO results across different optimization runs, we introduced a normalized performance metric defined as follows. For each iteration, we computed the difference between GNN performance (e.g., hypervolume improvement) and random selection performance as a function of the number of proposed candidates, normalized by the performance gap between the best possible outcome and random selection. The area under the curve (AUC) of this normalized performance curve provides a single score representing how closely the GNN approaches optimal performance relative to random selection. Positive values indicate that the GNN outperforms random selection, negative values the opposite, and zero corresponds to equivalent performance. This metric was calculated for both hypervolume improvement and Pareto-front coverage to ensure consistency across optimization scenarios.
$$\mathrm{AUC} = \sum_{t=1}^{T} \frac{f_{\mathrm{GNN}}(t) - f_{\mathrm{rand}}(t)}{f_{\mathrm{best}} - f_{\mathrm{rand}}(t)} \tag{19}$$
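One plausible transcription of this metric (our reading of the text; the magnitudes reported in Table 2 suggest the per-iteration normalized gaps are summed rather than averaged):

```python
def normalized_auc(gnn_curve, rand_curve, best_value):
    """Sum over iterations of (GNN - random) / (best - random):
    positive when the GNN surrogate beats random selection, zero when
    equivalent, and negative when worse. Iterations where random
    selection already matches the best value contribute nothing."""
    total = 0.0
    for g, r in zip(gnn_curve, rand_curve):
        gap = best_value - r
        total += (g - r) / gap if gap > 0 else 0.0
    return total
```

Each term is at most 1, so over 1000 proposed candidates the maximum attainable score is 1000, consistent with the scale of the values reported in Table 2.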
482 structures. GNN models experienced a similar temporary stagnation between 200 and 350 iterations but recovered to find superior candidates.
We next compared the two-objective case (Fig. 3(b)) with the single-objective case in Fig. 3(a). Specifically, surrogates were programmed to maximize adsorbed CH4 and minimize adsorbed N2. Three independent runs were averaged, instead of the ten used for the single objective. It is clear that GNNs outperform GPs and do so faster than in the previous comparison. For a randomly chosen GNN surrogate, we show how the model explored both the input and objective space of our MOFs. We only include the first 200 samples to avoid visual clutter. The global-level features of MOFs were gathered, transformed using scikit-learn's RobustScaler,63 and projected onto a two-dimensional (2D) space using Uniform Manifold Approximation and Projection (UMAP)66 (Fig. 3(c)). The min_dist parameter of UMAP was set to 0.01. As seen in Fig. 3(c), there are sparse and dense regions in the low-dimensional 2D UMAP space. The surrogate samples points from both sparse and dense regions, indicating exploration. Fig. 3(d) shows the adsorption objectives on each axis and the same sampling of MOFs from Fig. 3(c). The GNN surrogate sampled MOFs with objectives in the direction of higher CH4 adsorption and lower N2 adsorption (i.e., toward the Pareto front). Identifying MOFs that maximize CH4 uptake while minimizing N2 adsorption directly translates to materials that enhance methane recovery and reduce energy costs in pressure-swing adsorption or membrane-based separation processes. Values of 0 on the color maps indicate MOFs that have not been sampled by the GNN. The superior performance of GNNs over GPs is attributed to their ability to capture complex structure–property relationships beyond the global features used by GPs. Consequently, GNN surrogates were utilized for all subsequent optimization tasks in this work.
Strong correlations between Henry coefficients indicate that the adsorption physics are dominated by a scaling relationship where increased binding site density enhances both metrics simultaneously. Successful separation of alkanes from alcohols, for instance, requires minimizing the Henry coefficients of 1-propanol and 2-propanol while maximizing those of butane and propane in the four-objective case, representing competing objectives that define a multi-dimensional Pareto front. The Pareto-optimal MOFs identified by our GNN model are those that maximize the ‘residual’ performance, achieving higher selectivity than would be predicted by the adsorption capacity alone. From an industrial perspective, even a small deviation from the dominant trend, one yielding as little as a 2% increase in selectivity, can result in significantly lower regeneration costs and higher product purity. Our aim is to develop a surrogate model for BO that can accurately identify the Pareto-optimal set and distinguish it from suboptimal solutions.
To evaluate the surrogate's ability to capture meaningful structure–property connections, we measured EHI score for all points. Among the top five scoring candidates, the three highest-scoring were not on the Pareto front, while the remaining two were Pareto-optimal. We selected the highest-scoring of the two Pareto front points (indicated by the star shape in Fig. 4) for the feature importance ablation study. This analysis revealed that node-level features predominantly drive model predictions, with eight of the top ten most important features arising from node descriptors and only two from global features for both 1-propanol and butane. The absence of edge features among the top contributors highlights the dominant role of local atomic environments captured at the node level. This underscores the limitation of surrogate models restricted to tabular global features, which would inherently miss significant predictive information embedded in the graph topology and atomic features. These results validate the surrogate's reliability and justify its use within the BO framework to effectively navigate strongly correlated multi-objective MOF design spaces.
Fig. 5 illustrates results from a representative MOBO run optimizing butane selectivity over 1-propanol and 2-propanol. Panel (a) shows hypervolume improvement versus the number of proposed candidates, where the GNN-based surrogate significantly outperforms random selection, achieving near-maximal hypervolume after about 200 candidates, while random selection falls short even after 1000. The random proposals were averaged across 10 runs. Panel (b) shows the fraction of identified Pareto front points instead of hypervolume improvement, with the GNN identifying nearly 60% of points on the Pareto front after 1000 proposed candidates compared to 26% for random search.
Panels (c) and (d) compare the AUC for normalized hypervolume improvement and Pareto-front coverage across runs with one to four objectives. In panel (c), the GNN's AUC of normalized hypervolume improvement, relative to random selection, remains largely consistent from one to four objectives. By contrast, panel (d) shows that the AUC of the normalized fraction of Pareto-front points identified is highest for single-objective optimization, similar for the two- and three-objective cases, and drops sharply in the four-objective scenario. While normalized hypervolume improvement remains stable across objectives, coverage declines for four-objective optimizations because the Pareto front grows rapidly with dimensionality and becomes intrinsically difficult to cover extensively. For example, the single-objective front contains only one optimal point, the three-objective front (shown in panels (a) and (b)) has 211 points, and the four-objective front comprises 1440 points (approximately 40% of the dataset). In higher dimensions the GNN identified Pareto-front points at rates similar to random selection, but the specific points it selected yielded substantially greater hypervolume improvement, maintaining its optimization advantage.
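The AUC comparisons in panels (c) and (d) amount to integrating a normalized optimization trace over the proposal index and subtracting the random-search baseline. A plausible sketch of that bookkeeping is below; the exact normalization and integration rule used in the paper are assumptions here.

```python
import numpy as np

def normalized_auc(trace, max_value):
    """Trapezoid-rule area under a normalized optimization trace, where
    `trace[i]` is the hypervolume (or Pareto fraction) after i+1 proposals
    and `max_value` is the value attained once the full front is known."""
    norm = np.asarray(trace, dtype=float) / max_value
    return float(np.sum(0.5 * (norm[1:] + norm[:-1])))

def auc_gain(model_trace, random_trace, max_value):
    """Model AUC relative to a random-search baseline; positive values mean
    the surrogate accumulates hypervolume (or coverage) faster than random."""
    return normalized_auc(model_trace, max_value) - normalized_auc(random_trace, max_value)
```

Under this convention, a negative entry (such as the {benzene, toluene} fraction metric in Table 2) means the surrogate's trace stayed below the random baseline on average.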
Finally, Table 2 reports GNN BO performance across various VOC separations within the same functional-group family. AUC values are generally positive, except for the {benzene, toluene} dataset on the Pareto-fraction metric, where performance falls to the random baseline, likely due to the dataset size and the high chemical similarity of these molecules. Overall, the GNN surrogate consistently enables effective optimization for challenging VOC separations in MOFs.
| Dataset | # Objectives | Metric | AUC |
|---|---|---|---|
| {1-propanol} | 1 | Hypervolume | 942.5 |
| | | Fraction | 876.5 |
| {butane} | 1 | Hypervolume | 776.4 |
| | | Fraction | 681.5 |
| {1-propanol, 2-propanol} | 2 | Hypervolume | 801.3 |
| | | Fraction | 140.0 |
| {1-propanol, butane} | 2 | Hypervolume | 772.0 |
| | | Fraction | 236.1 |
| {1-propanol, propane} | 2 | Hypervolume | 893.1 |
| | | Fraction | 64.8 |
| {2-cis-butene, 2-trans-butene} | 2 | Hypervolume | 824.6 |
| | | Fraction | 107.4 |
| {benzene, toluene} | 2 | Hypervolume | 766.8 |
| | | Fraction | −7.1 |
| {butane, propane} | 2 | Hypervolume | 852.5 |
| | | Fraction | 333.8 |
| {1-propanol, 2-propanol, butane} | 3 | Hypervolume | 867.8 |
| | | Fraction | 278.7 |
| {1-propanol, 2-propanol, propane} | 3 | Hypervolume | 771.5 |
| | | Fraction | 112.6 |
| {1-propanol, 2-propanol, butane, propane} | 4 | Hypervolume | 817.4 |
| | | Fraction | 8.4 |
Fig. 6 Selected MOF properties, their span, and values for Pareto-optimal metal–organic frameworks (MOFs). Panel (a) shows a stacked histogram of properties for CoRE MOF, as indicated in the legend. The horizontal axis is the base-10 logarithm of values (with values of zero kept as zero) while the vertical axis represents the counts. Panel (b) shows the property values for MOFs selected by a graph neural network (GNN). Panels (c) and (d) show the same type of information as panels (a) and (b), respectively, but for ARC-MOF. Panels (e)–(g) show OVITO67 visualizations of MOFs from panel (d).
Among the three optimal candidates in Fig. 6(d), DB12-IHUTIQ_clean_repeat.cif appears to be the most synthetically feasible. This material likely originates from the CCDC database and has been previously reported in the literature.68 Its organic linker, 1,4-bis(4-pyridyl)-2,3-diaza-1,3-butadiene, is commercially available, overcoming one of the most common limitations in MOF synthesis—linker accessibility. The second candidate, DB0-m3_o490_o15_f0_fsc.sym.2_repeat.cif, also appears synthetically feasible. It consists of a bridging tetratopic carboxylate linker and a pillared N-based linker, both of which are available commercially. Although this structure has not been previously reported, its composition suggests it could be readily synthesized. In contrast, DB1-ZnN4-SiF6-irmof6_No2_repeat.cif is the least feasible of the three. While conceptually synthesizable, the linker has not been previously synthetically realized. Nevertheless, this structure is isoreticular with the SIFSIX family of materials, implying that MOF synthesis would be straightforward if the linker were accessible.69
Despite the inherent strong correlations among Henry coefficients of VOCs studied, GNNs identified Pareto-optimal MOFs with superior performance compared to random search. Ablation analyses underscored the critical role of node-level features in capturing local atomic environments, which are largely overlooked by models restricted to macroscopic descriptors. These results highlight the fundamental advantage of graph-based representations in encoding physicochemically relevant information essential for accurate surrogate modeling and effective candidate selection in materials optimization tasks.
Moving forward, integrating dynamic or flexible MOF representations that capture framework flexibility and adsorption site heterogeneity could improve realism and predictive accuracy. Incorporating multi-fidelity simulation data or experimental feedback into the BO loop may accelerate convergence and robustness of predictions. Enhancing uncertainty quantification methods within GNN architectures will better guide exploration–exploitation trade-offs. Lastly, extending this workflow to optimize additional properties such as adsorption kinetics or stability metrics would broaden its applicability to real-world materials design challenges.
Supplementary information (SI) is available. See DOI: https://doi.org/10.1039/d5ta09133k.
This journal is © The Royal Society of Chemistry 2026