Panagiotis Krokidas *a, Stelios Karozis b, Salvador Moncho c, George Giannakopoulos df, Edward N. Brothers c, Michael E. Kainourgiakis b, Ioannis G. Economou *ae and Theodore A. Steriotis a
(a) Institute of Nanoscience and Nanotechnology, National Center for Scientific Research “Demokritos”, 15341 Aghia Paraskevi Attikis, Greece. E-mail: p.krokidas@inn.demokritos.gr; ioannis.economou@qatar.tamu.edu
(b) Institute of Nuclear & Radiological Sciences and Technology, Energy & Safety, National Center for Scientific Research “Demokritos”, 15341 Aghia Paraskevi Attikis, Athens, Greece
(c) Science Program, Texas A&M University at Qatar, Education City, P.O. Box 23874, Doha, Qatar
(d) Institute of Informatics and Telecommunications, National Center for Scientific Research “Demokritos”, 15341 Aghia Paraskevi Attikis, Athens, Greece
(e) Chemical Engineering Program, Texas A&M University at Qatar, Education City, P.O. Box 23874, Doha, Qatar
(f) SciFY PNPC, TEPA Lefkippos – NCSR Demokritos, 27, Neapoleos Str., 153 41 Ag. Paraskevi, Greece
First published on 16th June 2022
Molecular sieving is based on mobility differences of species under extreme confinement, i.e. within pores of molecular dimensions. The pore properties of a material determine its separation efficiency, while pore network engineering provides a way to optimize the sieving performance. Unlike rigid and structurally limited carbon and zeolite molecular sieves, metal organic frameworks (MOFs) offer flexible networks with unlimited pore tailoring possibilities, by using different linkers, functional groups and metals/clusters. Nevertheless, knowledge-based pore optimization towards highly selective materials is hampered by the complex relationship between structural modifications and molecular diffusivity. Machine learning (ML) approaches can elucidate this correlation, but pertinent research in MOFs has so far focused solely on sorption properties. Herein, we report the first ML-assisted work towards understanding how the replacement of basic MOF building units affects the pore structure and consequently the molecular diffusivity. The ML approach developed is general; the work is however focused on zeolitic-imidazolate frameworks (ZIFs) with SOD topology. Since there is no database of relevant ZIF variations, we constructed a new ensemble of 72 existing and new ZIFs through systematic sub-unit replacement, developed a force-field for each of these structures and performed molecular dynamics (MD) simulations on fully flexible systems to calculate framework properties and the diffusivity of different molecules (ranging from helium to n-butane). Based on this new database, a predictive multi-step ML model was developed and trained. The model can rapidly and efficiently estimate the diffusivity of molecules in any possible ZIF structure with SOD topology by using readily accessible input information.
In this work, we present for the first time a data mining strategy assisted by ML for predicting the diffusivity of species in flexible crystalline nanoporous materials. Unlike conventional ML approaches, in which databases of existing MOFs (e.g., CoRE) are screened in order to identify the best performing structures (e.g., in terms of adsorption capacity), our aim is to develop a tool that can predict the diffusivity of a molecule upon any possible structural alteration. This tool may then be used for the design and synthesis of new MOFs by directing the synthesis efforts towards meaningful structures. Zeolitic-imidazolate frameworks (ZIFs),28,29 a sub-family of MOFs, and in particular ZIF-8 analogues, were chosen as a case study for three reasons: firstly, due to their unique flexible structure and swinging/gate opening effect they have been widely investigated for several membrane-based gas separation processes;30 secondly, there are indications that some modifications may indeed lead to highly selective materials, but a clear modification–diffusivity correlation has yet to be understood;1,30–34 finally, ZIF-8 analogues exhibit a distinct structural characteristic, the aperture connecting the framework cages, that seems to control diffusivity33,35–38 and subsequently the material performance. However, unlike the case of MOFs in general, an adequately extended database of ZIF-8 type structures that can support a big data analysis is unavailable. A new structural dataset was therefore created. The set comprises 72 chemically robust ZIF variants with SOD topology (ZIF-8 type) that were developed through systematic replacement of the linker, metal and/or functional group. It should be mentioned that, in order to isolate and study the impact of building unit modifications on diffusivity, we chose to work on one topology only.
In fact, topology variations would severely obscure our findings, as topology is an additional factor20,39 which can greatly affect the aperture size as well as the network connectivity and therefore the diffusion of penetrants.33,40 The design of new variants was guided by the insight gained from our recent works41–45 regarding the impact of modifications on the aperture which bridges the cages (Fig. 1). In brief, bulkier linkers decrease the aperture,43,44 while its size and rigidity depend on the ionic radius of the metal.42 On the other hand, the functional group's impact on the aperture is not fully understood. Information on the building units of each ZIF of the dataset is reported in the ESI (ESI_1‡), along with references to existing ZIFs that incorporate some of these units.
Fig. 1 (a) Profile view of the aperture bridging two cages in ZIF-8 and (b) front view of the aperture. The multicolor sketch depicts the ZIF-8 building unit and its sub-units.
As a first step, force field parameters (charges and bond length, angle and torsional parameters) for each of the 72 ZIF structures were developed through elaborate density functional theory (DFT) calculations. In a second step, the force fields were employed in fully flexible molecular dynamics (MD) simulations in order to calculate, for each framework, the sizes of the aperture and of the “stretched” aperture, i.e., the aperture when a penetrant molecule (He, H2, O2, CO2, N2, CH4, C2H4, C2H6, C3H6, C3H8, i-C4H10 or n-C4H10) lies at its center (results are provided as ESI_2‡). The third step was the calculation of the diffusivity of each of these penetrants in all ZIFs by means of dynamically corrected transition state theory (dc-TST), through simulations that accounted for the flexibility of the frameworks. DFT, MD and dc-TST simulation details are given in ESI_1,‡ while all force field parameters are tabulated in ESI_3.‡ Fig. 2 summarizes the workflow of our computations. At this point, it is worth noting that five of the 72 ZIFs of our dataset have been synthesized and were used to validate our computational approach, by comparing simulations with experiments for ZIF structural characteristics and for properties such as gas sorption, diffusivity, and permeability in these ZIFs.41,42,44,46,47
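For orientation, the textbook relations behind a dc-TST estimate can be written as below. This is a generic sketch and not necessarily the exact formulation used here (that is given in ESI_1‡); the symbols are assumptions introduced for illustration: F(q) is the free energy profile of the penetrant along a cage-to-cage coordinate q, q* is the barrier position at the aperture, κ is the dynamical correction factor, λ is the hop length, n is the number of apertures through which a molecule can leave a cage, and d is the dimensionality of the pore network.

```latex
% One-dimensional TST hop rate out of a cage, through one aperture
k^{\mathrm{TST}} = \sqrt{\frac{k_{\mathrm{B}} T}{2\pi m}}\;
  \frac{e^{-\beta F(q^{*})}}{\int_{\mathrm{cage}} e^{-\beta F(q)}\,\mathrm{d}q},
\qquad \beta = \frac{1}{k_{\mathrm{B}} T}

% Dynamical correction: only a fraction \kappa of barrier crossings succeed
k = \kappa\, k^{\mathrm{TST}}, \qquad 0 < \kappa \le 1

% Uncorrelated random walk with total hop rate n k and hop length \lambda
D = \frac{n\,k\,\lambda^{2}}{2d}
```

Because the barrier term enters exponentially, a small change in aperture size that shifts F(q*) by a few k_B T changes D by orders of magnitude, which is consistent with the sensitivity reported for this dataset.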
Overall, the new ZIF database, which can be found in ESI_2,‡ comprises 712 entries (ZIF, penetrant, aperture, stretched aperture and diffusivity). Fig. S1‡ shows the resulting distribution of aperture sizes and diffusivities in our dataset, indicating that the modification scheme adopted spans the range of interest. In general, more than 2000 ZIFs based on SOD topology can be designed (see Table S3‡) by using different metals, linkers, and functional groups, and this number can increase dramatically if combinations (e.g., two different metals) are employed. The 72 ZIFs of our study cover only a part of the available structural landscape; the structures considered, however, sufficiently cover the most important range of apertures and diffusivities. Indeed, diffusion through apertures beyond the largest one employed (CdIF-1: 3.92 Å) is quite unrestricted and thus the separation potential for all gas pairs under investigation is poor, while apertures below the smallest one (Be-ZIF-7-8-I: 1.96 Å) will result in diffusivities that are too low for any practical application.
Based on the dc-TST results (details provided in ESI_2‡), it is evident that diffusivities and their ratios can vary by several orders of magnitude between ZIFs after minute aperture changes. For example, the ratio D_CO2/D_CH4 for ZIF-8 is approx. 9 and increases to 3 × 10^3 when one out of three mIm linkers is replaced with bIm (ZIF-7-8), which causes a 15% decrease of the aperture size. An additional modification of the framework, replacing the Zn2+ metal with Be2+, decreases the aperture by 13% and enhances D_CO2/D_CH4 substantially, from 3 × 10^3 to 6.43 × 10^9. Likewise, the replacement of Zn2+ with Co2+ or Cu2+ in ZIF-8 reduces the aperture diameter by just 0.03% or 0.06%, but D_propylene/D_propane increases by 30% or 85%, respectively, shifting the structure from just satisfactory to top performing.48 Beyond this aperture size – diffusivity relationship, a closer look at the dataset (see ESI_2‡) reveals considerable complexity at several levels. For instance, bulkier linkers decrease the aperture diameter while simultaneously increasing its flexibility, resulting in higher diffusivities than expected in several cases (e.g., bIm vs. mIm). In addition, some combinations of metals and linkers give unexpected results: Zn2+ replacement by Co2+ is expected to decrease the aperture, but in some cases the reverse happens (e.g., F-ZIF-7-8 vs. Co-F-ZIF-7-8; Br-ZIF-7-8 vs. Co-Br-ZIF-7-8). Likewise, Cd has the largest ionic radius and is thus expected to produce the largest apertures, but this is not always the case (e.g., Cd-I-ZIF-7-8 vs. Mn-I-ZIF-7-8). Moreover, the impact of the functional group on the aperture size/flexibility is largely unclear. Overall, it seems that a rather obscure “dynamic” penetrant/framework interaction controls molecular mobility. These observations show that correlating structural changes with gas mobility is a multivariate and extremely complex problem that cannot be deconvoluted by simplistic structure–diffusion relationships.48
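The size of these jumps is easier to appreciate when the arithmetic is spelled out. The short sketch below uses only the numbers quoted above for the CO2/CH4 diffusivity ratio; it assumes that the 13% decrease is measured relative to the ZIF-7-8 aperture, which the text does not state explicitly.

```python
# Worked arithmetic for the CO2/CH4 example quoted in the text.
# All numbers come from the paragraph above; no new data are introduced.

ratio_zif8 = 9.0          # D_CO2/D_CH4 in ZIF-8 (approximate)
ratio_zif78 = 3.0e3       # after replacing one of three mIm linkers with bIm
ratio_be_zif78 = 6.43e9   # after additionally replacing Zn2+ with Be2+

shrink_linker = 0.15      # 15% aperture decrease (linker replacement)
shrink_metal = 0.13       # 13% decrease (metal replacement); ASSUMED to be
                          # relative to the already modified ZIF-7-8 aperture

# Multiplicative gain in the diffusivity ratio at each modification step
gain_linker = ratio_zif78 / ratio_zif8
gain_metal = ratio_be_zif78 / ratio_zif78

# Successive relative decreases multiply; they do not simply add.
remaining = (1.0 - shrink_linker) * (1.0 - shrink_metal)
total_shrink = 1.0 - remaining

print(f"linker step: ratio x {gain_linker:.0f}")
print(f"metal step:  ratio x {gain_metal:.2e}")
print(f"cumulative aperture decrease: {total_shrink:.3f}")
```

Under that assumption, a cumulative aperture decrease of roughly 26% corresponds to a gain of close to nine orders of magnitude in the diffusivity ratio, which is why a simple linear structure–diffusion relationship is unlikely to hold.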
A possible way to address such highly complicated relationships is to employ artificial intelligence (AI) approaches. In this respect, an ML model was trained to predict the logarithm of the diffusivity of the various penetrants in ZIFs (details in the Computational Methodology of ESI_1‡). The training set is based on the newly developed ZIF database and comprises simulation-derived values (aperture sizes, stretched aperture sizes and penetrant diffusivities), as well as physically meaningful numerical descriptors of the ZIF building units and the gases (Tables 1 and S8‡). The predictive performance of the ML model is shown in Fig. 3(a).
Fig. 3 (a) Performance of the ML model for diffusivity prediction and (b) dominant descriptors on the model.
The trained model performs well over the whole diffusivity range, with an average performance of R^2 = 0.96 and explained variance EV = 0.96. Additionally, feature importance analysis was performed and revealed the descriptors that govern our model (Fig. 3(b)). Among the various gas features, the van der Waals diameter (gas_vdW) is the most prominent, while aperture size proves to be the dominant ZIF descriptor. Surprisingly, the stretched aperture does not affect the model performance. It should be emphasized at this point that, except for the aperture sizes (normal and stretched), the descriptors are readily available physical properties. The stretched aperture seems to be unimportant and could thus be omitted, but the very important aperture size requires elaborate DFT and MD calculations on fully flexible crystals, hampering the generic, facile use of the ML model (e.g., for ZIF sub-units and penetrants not included in our dataset).
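The two scores quoted above are closely related but not identical. The sketch below implements their standard definitions with the Python standard library; the log10(D) values are hypothetical and serve only to show how the metrics behave, they are not taken from the dataset.

```python
from statistics import fmean, pvariance


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def explained_variance(y_true, y_pred):
    """1 - Var(residuals) / Var(y_true); ignores a constant offset."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - pvariance(residuals) / pvariance(y_true)


# Hypothetical log10(D) values, for illustration only.
log_d_true = [-8.0, -10.0, -12.0, -14.0, -16.0]
log_d_pred = [-7.5, -9.5, -11.5, -13.5, -15.5]  # every prediction 0.5 too high

r2 = r2_score(log_d_true, log_d_pred)
ev = explained_variance(log_d_true, log_d_pred)
print(r2, ev)
```

In this toy case the systematic offset lowers R^2 but leaves EV at 1, so reporting equal values for both, as done here, suggests that the model predictions carry no appreciable systematic bias.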
Based on the above, a two-step ML strategy could prove applicable: a first model, M1, may be trained on ZIF physical descriptors (Table 1) in order to predict the aperture sizes of ZIF variants. Then, a second model (M2) can be trained to predict the diffusivity of a penetrant in a ZIF, by using the penetrant descriptors (Table 1) and the predicted ZIF aperture as well as the rest of the ZIF descriptors of M1. The overall procedure is summarized schematically in Fig. 4.
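The chaining of the two models can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the data are synthetic, a single ZIF descriptor and a single gas descriptor stand in for the full descriptor set of Table 1, and one-variable least squares replaces the actual learning algorithm (described in ESI_1‡). The point is only the data flow: M2 is trained on apertures predicted by M1, not on the simulated ones.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least squares for y = a*x + b with a single input variable."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


# --- Synthetic data, invented for this sketch (not from ESI_2) ---
zif_descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical "linker size"
aperture = [3.9, 3.4, 3.0, 2.4, 2.0]         # stand-in for MD apertures, in Å
gas_diameter = [2.6, 3.3, 3.8]               # hypothetical penetrant sizes, Å


def toy_log_d(ap, dia):
    """Made-up ground truth: log10(D) drops as the gas outgrows the aperture."""
    return -8.0 + 4.0 * (ap - dia)


# --- Step 1: M1 maps the ZIF descriptor to an aperture size ---
a1, b1 = fit_line(zif_descriptor, aperture)


def m1(desc):
    return a1 * desc + b1


# --- Step 2: M2 maps (M1-predicted aperture, gas size) to log10(D) ---
# The two inputs are collapsed into one feature, aperture minus diameter,
# so that a one-variable fit is enough for this toy example.
rows = [(m1(d) - g, toy_log_d(ap, g))
        for d, ap in zip(zif_descriptor, aperture)
        for g in gas_diameter]
a2, b2 = fit_line([x for x, _ in rows], [y for _, y in rows])


def predict_log_d(desc, dia):
    """Full chain: descriptors only, no simulated aperture needed."""
    return a2 * (m1(desc) - dia) + b2


print(predict_log_d(2.5, 3.3))  # unseen hypothetical ZIF, mid-sized gas
```

Once both models are fitted, a prediction for a new variant needs only the descriptor values, which mirrors the motivation given above for replacing the simulated aperture with a predicted one.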
The comparison of the aperture sizes predicted by M1 with the actual (simulation) ones (Fig. 4) yields R^2 = 0.90 and EV = 0.91, highlighting a highly satisfactory predictive potential. M2 predicts the log(D) of any gas with R^2 = 0.93 and EV = 0.93, which is a surprisingly high performance, especially when considering that it is based only on readily available physical properties and the M1 predicted aperture size. In fact, by comparing the performances of the first model (Fig. 3) and the two-step model, it becomes evident that only a small amount of information is lost by using the M1 predicted aperture sizes (instead of the MD calculated ones). Additionally, based on the importance ranking of descriptors, a much simpler M2 model (M2_simple) was trained only on the most prominent descriptors for the ZIFs (predicted aperture size from M1) and the gases (vdW diameter). The M2_simple model (Fig. 4) has R^2 = 0.86 and EV = 0.85, which is considered quite impressive given the simplicity of the input required versus the complexity of the task.
In conclusion, since an adequate ensemble of pre-synthesized or pre-designed ZIF-8 variants was not available, we have built 72 new SOD structures by varying both the organic and the metal sections of the framework. Instead of adopting the conventional approach of using a common force field across all the ZIF variants, we performed accurate DFT calculations specifically for each structure. The force fields developed were used for MD and dc-TST simulations in fully flexible crystals. The results (apertures and diffusivities) were for the first time considered collectively for the training of predictive ML models that account for the interplay between the structure modifications and the diffusion of penetrants.
Most ML-related works employ descriptors (such as pore diameters, free volume and gravimetric/volumetric surface areas) that need prior knowledge of the full unit cell representation of the frameworks under study, as well as descriptors (e.g., potential energy surfaces) that are extracted by computations on the unit cell.49–52 In our case, the use of similar descriptors or even elaborate DFT-based descriptors (e.g., charges, bond length/bond angle parameters) is certainly possible and actually leads to improved predictive power (data not presented). Nevertheless, simple and mostly readily available input information, such as the mass and size of the basic building units, was chosen in order to make the ML routines easy to employ: a researcher can consider a new functionalization of ZIF-8, bypass the multiple computational steps (ZIF cluster construction, DFT calculations, unit cell construction and MD, dc-TST simulations) and directly obtain a good estimate of the diffusivity of different penetrants (and thus the structure's selectivity potential) through a simple two-step ML model.
Footnotes
† The data set and force field terms for each ZIF structure are also available in the Zenodo repository (https://doi.org/10.5281/zenodo.6342588 and https://doi.org/10.5281/zenodo.6389957).
‡ Electronic supplementary information (ESI) available: Information on the building units used to produce the ZIF structures of our database and relevant tables, force field description and DFT details, simulation and machine learning methods, estimation of the linkers and functional groups size, computational tools, and software data set and force field terms for each ZIF structure. See https://doi.org/10.1039/d2ta02624d
This journal is © The Royal Society of Chemistry 2022