Maryam
Nurhuda
,
Carole C.
Perry
and
Matthew A.
Addicoat
*
School of Science and Technology, Nottingham Trent University, Clifton Lane, Nottingham NG11 8NS, UK. E-mail: matthew.addicoat@ntu.ac.uk
First published on 14th April 2022
Tight-binding approaches bridge the gap between force field methods and Density Functional Theory (DFT). Density Functional Tight Binding (DFTB) has been employed for a wide range of systems including proteins, clays and 2D and 3D materials. DFTB is 2–3 orders of magnitude faster than DFT, allowing calculations containing up to ca. 5000 atoms. The efficiency of DFTB comes via pre-computed integrals, which are parameterized for each pair of atoms, and the requirement for this parameterization has previously prevented widespread use of DFTB for Metal–Organic Frameworks. The GFN-xTB (Geometries, Frequencies, and Non-covalent interactions Tight Binding) method provides parameters for elements up to Z ≤ 86. We have therefore employed GFN-xTB to periodic optimizations of the Computation Ready Experimental (CoRE) database of MOF structures. We find that 75% of all cell parameters remain within 5% of the reference (experimental) value and that bonds containing metal atoms are typically well conserved with a mean average deviation of 0.187 Å. Therefore GFN-xTB provides the ability to calculate MOF structures more accurately than force fields, and ca. 2 orders of magnitude faster than DFT. We therefore propose that GFN-xTB is a suitable method for screening of hypothetical MOFs (Z ≤ 86), with the advantage of accurate binding energies for adsorption applications.
Calculation methods that are used in molecular simulation can be grouped into quantum mechanics (QM) or classical simulation methods (molecular mechanics, MM). The level of theory/computational method required depends on the property of interest. For properties that require electronic structure and information about the exact binding sites, quantum mechanical methods are used. However, due to the high computational cost of quantum mechanical methods, the size of the system to model needs to be carefully chosen. Quantum mechanics based on wave function theory can only be applied to small cluster models of MOFs capturing the region of interest and limited to a few tens of atoms.13 Consequently, small cluster models are unable to describe long range dispersion interactions. In addition, cutting the MOF to create a feasible cluster model requires capping the cut points with e.g. hydrogen atoms or methyl groups, which introduces electronic effects not present in the parent periodic structure. For full unit cell MOF structures, periodic DFT can be used to simulate systems with as many as 100–1000 atoms.14–19 While possible on handfuls of chosen MOFs, the expense of these quantum mechanical approaches means they cannot support screening large subsets of MOFs.
Classical mechanics such as Force Fields (FFs) can be a computationally efficient ‘shortcut’. Force fields are employed for properties that don’t need information about electronic structure and can be defined by the relevant conditions of temperature or pressure. In these cases, force fields offer a welcome reduction in cost. Existing force fields which have been routinely applied in MOF simulations are either the general force fields – such as UFF,20 TraPPE21 and DREIDING22 – or specific force fields designed for a particular group of MOFs,23,24 such as MOF-FF25 and QuickFF.26–29 Specific force fields are typically more accurate than general force fields, but are unsuitable for screening studies outside their parameterization. The general force field approaches are good at handling the chemical diversity of MOFs for reproducing experimental results such as crystal structures, bulk moduli, sublimation energies and fluid properties.30 However, they are not ideal for explaining structural transformation over external stimuli and for explaining donor–acceptor interactions in open metal site materials.31 Another problem is general force fields fail to model adsorption isotherms at the lower pressure region.32
Since the diverse inorganic fragments that comprise MOFs cannot be replicated by general force fields, a common approach is to reparametrize these force fields. The initial development of force fields for MOFs were force fields designed to accurately describe individual popular MOFs, such as extended MM3 for MOF-5,26 Cu paddlewheel-based MOFs27 and for the Flexible Metal–Organic Framework MIL-53(Al).23 Subsequently, force fields have been developed to accommodate a group of MOFs, BTW-FF23 for 6 popular MOFs associated with Cu, Zn, and Zr metal nodes; Becker et al.29 optimized a polarizable force field for CO2 and CH4 adsorption in M-MOF-74 (M = Co, Cr, Cu, Fe, Mg, Mn, Ni, Ti, V, and Zn). QuickFF28 is a program that automates the derivation of force fields for MOFs. UFF4MOF33,34 took a different approach, aiming for universality rather than transferability and reached over 99% coverage of the 2014 CoRE database.3
Another approach frequently used is combining potentials from multiple generic force fields. Hamon et al. modelled adsorption of H2S into MIL MOFs by combining UFF for the inorganic part of the framework and the DREIDING inter-atomic potential for the organic part.35 The model matches the experimental adsorption isotherms quite well in the whole pressure range, confirming structural transitions occur in some pressure regions. Vandenbrande et al.24 reviewed five different force fields, three combined generic force-field (UFF/TraPPE, Drieding- UFF/TraPPE, MM3-MBIS) and two ab initio derived force fields (SAPTFF and MEDFF), for methane adsorption in Zr-based MOFs (UiO-66, UiO-67, DUT-52, NU-1000, and MOF808),24 found that UFF/TraPPE gives an acceptable agreement with the experiment in the UiO-66 framework for pressure between 30 and 80 bar. Meanwhile, other combined generic force fields do not accurately reproduce single molecule adsorption energies. The two ab initio derived force fields gave a remarkable accuracy of the individual adsorption energies.
Due to the motivation of reproducing experimental data and the requirement of having enough experimental data to parameterize and test, most force field development to date has mostly focused on fitting to a limited number of popular MOFs. The lack of transferability and the uncertain level of accuracy is the main drawback of force fields. This especially hampers screening efforts, particularly of hypothetical framework materials, and especially for MOFs using less frequently employed metals such as the lanthanides and actinides.
To achieve a compromise between high accuracy DFT or ab initio calculations and rapid, but less reliably accurate force field calculations, there are two possible strategies: One strategy is to combine Quantum Mechanical (QM) and Molecular Mechanics (MM) methods in a single QM/MM36–38 calculation. Various forms of QM/MM have been applied to several MOFs including Fe-MOF-74,39 NU-100040 and ZIFs.41 QM/MM is appealing for adsorption studies in MOFs because it allows for the key adsorption region to be treated accurately (QM) while the rest of the framework is treated efficiently (MM). It also allows for reaction mechanisms to be determined.42 However, the need to carefully specify the QM and MM regions makes it generally difficult to apply QM/MM to a wide variety of MOFs, including hypothetical structures, as might be encountered in a screening study. The second strategy to achieve a balance between QM accuracy and MM efficiency is to choose a semi-empirical method that approximates more expensive ab initio methods. One such method, Density Functional Tight Binding (DFTB) represents the desired compromise between accuracy and efficiency. DFTB approximates DFT by using pre-calculated parameters for integrals and a minimal basis. DFTB has been extensively used on large biomolecules with accuracy comparable to DFT43 but is at least 2 orders of magnitude faster. DFTB has been used for some MOFs, notably those containing Zinc44–49 and Copper.50 However, with these few exceptions, the availability of parameters for metal atoms has been problematic. The 3ob-3-1 parameter set51 contains parameters for Mg and Zn, the matsci-0-3 set52 includes Cu and Al, while the pbc-0-3 set53 contains parameters for Fe, but these parameters are designed to reproduce bulk Fe only and are not recommended for other systems such as MOFs. The QUASINANO parameters54,55 include all element–element pairs, but only Z ≤ 20 + Br.
Analysis56 of structures contained in the 2014 Computation-Ready Experimental (CoRE) MOF database,3 showed MOFs containing 42 elements with Z ≥ 20, including, second and third row transition metals, lanthanide and actinide elements. A broad application of DFTB to MOFs, especially screening studies of either real or hypothetical structures, necessarily requires parameters for all of these elements. DFTB parameters are generally for pairs of atoms (e.g. Zn–O, Zn–C, C–O), meaning that the number of parameters required scales with the square of the number of elements. Therefore, to apply DFTB to the entire periodic table, would require ∼1002 parameters, and significant amounts of source data and effort in parameterisation.54,55 Even automated methods55,57 for deriving parameters require significant time and effort and the parameters derived are generally limited by the source data (i.e. parameters derived for metal surfaces would be expected to be only poorly applicable to MOFs). Consequently, while DFTB has been used extensively for Covalent Organic Frameworks, comprised of (C, N, O, B, F, H elements),58–61 application of DFTB to MOFs has been largely restricted to Zn and Cu-containing MOFs.
Addressing the parametrization problem, Grimme et al. developed a semiempirical TB method, GFN-xTB,62 designed to produce reasonable Geometries, Frequencies, and Non-covalent interactions for diverse chemical system consisting of elements from the periodic table, Z ≤ 86. GFN-xTB is targeted to facilitate systems more than 1000 atoms; it has been successfully applied to proteins with 3000 atoms.
The GFN-xTB,62 method follows DFTB, approximating DFT by expanding the Kohn–Sham equation in terms of density fluctuations. The energy expression consists of terms describing electronic, (atom pairwise) repulsion, dispersion and halogen bonding.
E = Eel + Erep + Edisp + EXB |
The electronic energy term (Eel), includes the effects of self-consistent charges and a contribution of electron smearing, which accounts for partial orbital occupations. Molecular orbitals are constructed from linear combinations of atom centred orbitals (LCAO), with the basis set consist of a minimal basis set of atom centred functions. To improve the treatment of hydrogen bonding, the 2 s function is included for hydrogen atoms.
The repulsive energy term, (Erep) uses an atom pairwise potential, employing the effective nuclear charges of both atoms, ZeffA and ZeffB, but crucially avoiding pair related parameters. The dispersion energy, (Edisp) is computed using the D3 method63 and BJ damping scheme.64 Finally, the halogen bonding term (EXB) employs a modified Lennard-Jones form with a correction for pairwise repulsion of halogen bond acceptor (Br, I, At) – donor (N or O) atoms.
In total, GFN-xTB has only 16 global and roughly 1000 element specific parameters, and atom pair related parameters are avoided. The parameterization is with respect to molecular structures obtained at the PBEh-3c hybrid DFT level.
The open-source release of GFN-xTB65 only recently added the capability of undertaking periodic optimizations, and previous benchmarking of GFN-xTB for adsorption of small molecules in MOFs has employed capped cut-out structures.66 The implementation of GFN-xTB in the AMS package67 and DFTB+68 does permit periodic optimizations. In this paper, we evaluated the performance of GFN-xTB as implemented in AMS to optimized geometry structures of MOF from the 2014 and 2019 CoRE Databases.
Most of the 9895 CoRE structures are amenable to geometry optimization in this way, there are some exceptions to the simple protocol above, as follows:
1. Z ≤ 86. GFN-xTB is parameterised only for Z ≤ 86. Therefore, all structures with atoms of elements with Z > 86 were excluded. (191 structures)
2. Very large structures. Structures with total number of atoms greater than 1200 were considered too computationally expensive for a full optimization, especially given the number of “soft” deformations expected in such structures. In this case, we limit the optimization to 100 geometry optimization steps. (334 structures)
3. Partial occupancy CIF files. Structures with partial occupancies: Calculations can only be undertaken on full atoms. Structures where the total number of atoms is greater than defined by the stoichiometry are excluded from the evaluation in this work. (1407 structures)
4. Structures that collapse during geometry optimization. There are several possible causes of structural collapse: The MOF structure may be fundamentally not stable upon solvent removal – in this case, structural collapse in the calculation mirrors the structural collapse in experiment; In addition, structural collapse may be observed as an artefact of the structure curation process.2,3 In typical MOF syntheses, metal salt and ligand solutions are mixed together in polar, high boiling point solvents. Experimentally reported metal organic frameworks may contain solvent molecules in their pores, missing charge balancing ions (CBIs) or missing hydrogen atoms and overlapping atoms. Before being imported into the CoRE database, curation is performed, however the solvent is removed with an imperfect method. As a consequence, some of the solvent still remains in the pore and/or an essential part of the MOF structure itself is accidently removed. This has been observed by other authors employing the CoRE database.69 These structures are excluded from the analysis. (478 structures)
5. Structures that partially converged. Some structures converged (change in energy <−1 × 10−2 Hartree), but still had residual gradients larger than the default criterion of 1 × 10−5 Hartree. These structures were manually examined to confirm convergence and extract the structure where the energy had converged. (173 structures)
Refcodes for all structures in each case above are provided in the ESI.†
The textural properties, such as gravimetric surface area and volumetric surface area, are calculated before and after optimization. They are calculated using zeo++,70 using the high accuracy (ha) settings and using probe molecules with diameter 1.86Å representing N2 and with a trial number of 2000.
Fig. 1 Error in cell parameters, calculated as (XGFN − XExp) (a–c). Right hand side shows the 95.13% of cell parameters within ±20% of the experimental value. |
Another source of large negative errors is structures which were starting to collapse, when the atoms move closer to other atoms they might bind to other atoms and form a different connectivity. While breathing MOFs is indicated by a narrowing in one side of cell parameter, collapsing structures shows a large negative deviation in all cell parameters. As an example, refcode ACUBAB_clean81 which begins with a = 8.88 Å, b = 8.88 Å, c = 23.78 Å and optimized to a = 7.43 Å, b = 7.68 Å, c = 15.50 Å. This happens mostly to structures with linkers that are relatively small compared to the size of metal node (example refcode ACUBAB_clean and C6DT02320G_c6dt02320g2_clean).
The large positive deviations arise from structures with partial occupancy cif files. Automatic conversion of cif files (which permit fractional occupancy) to input files which do not permit fractional occupancy, sometimes produces erroneous output, where parts of structures are duplicated. While a lot of partial occupancy cif files could be detected and have been removed from the evaluation, some still remained. The identification of partial occupancy files is done by calculating the atom–atom distances with threshold closer than 1 Å with neither atom being a hydrogen atom. However, for some structures, multiple possible atom positions with partial occupancy are located at a distance greater than 1 Å. One example is refcode WUXYUL_clean.82
The average RMSD for all atoms are 0.489 Å when atom positions are optimized with the lattice fixed at its experimental value and 0.617 Å when the lattice is simultaneously optimized as shown in Table 1. Oxygens attached to metal atoms have slightly higher RMSD in both cases, but the positions of the metal atoms themselves are slightly better conserved with RMSDs of 0.439 Å and 0.560 Å for fixed and lattice-optimized structures respectively. Atoms in the organic linker have a higher deviation than the atoms in metal node since the organic linkers have more flexibility and degrees of rotational freedom around the inorganic building unit. These RMSD values compare favourably with those obtained of a set of 72 MOFs calculated with the PM783 semi-empirical method.84
Mean RMSD with fixed lattice (Å) | Mean RMSD with lattice optimization (Å) | |
---|---|---|
All atoms | 0.489 | 0.617 |
M–O | 0.507 | 0.650 |
Metal atoms | 0.439 | 0.560 |
Non M–O and non Metal | 0.490 | 0.619 |
Mean RMSD with fixed lattice(Å) | Mean RMSD with lattice optimization(Å) | |
---|---|---|
All Bonds | 0.120 | 0.175 |
Without metal atom | 0.093 | 0.150 |
M–X | 0.187 | 0.236 |
The sharp peaks centred around 0.063 Å in Fig. 3, show that typical organic bonds are very well replicated. Metal containing bonds are also well replicated, though to a lesser extent, with 76.63% of M–M bonds being within 0.250 Å of the reference crystal structure value. The cases where optimized M–M bonds are outside this threshold involve a change of atomic connectivity during the optimization, typically caused by the removal of structural ligands. For example, the structure AGARUW85 is a La3+ containing MOF, where 1D chains of La3+ ions are deca-coordinated by a mixture of coordinating and chelating carbonato ligands and water. In the cleaned structure in the CoRE database (refcode AGARUW_clean85), the water ligands are removed, resulting in a significant rearrangement of the coordination environment of each La3+ ion and a consequent reduction in one of the La–La distances from 4.923 Å to 2.874 Å.
Most force fields developed specially for MOFs have placed special attention on ensuring the correct geometries of common sets of inorganic building units. We checked the metal clusters used in popular MOFs such as M2O paddlewheels found in HKUST-1, trimeric oxo-centered M3O in MIL-100 series, M4O octahedron in MOF-5, and finally the 12-connected M6O cuboctahedron in UiO-66. Very good results are shown for all metal nodes, the deviation of optimized structure is presented in Fig. 4. The two key distances in building blocks that affect the lattice parameters of the optimized structure are the M–O and MM bond lengths, which in a layer-pillar MOF, affect the a, b and the c lattice dimensions respectively. Cu-based paddlewheel is highly conserved, with Cu–Cu bond length deviating only −0.030 Å from the experimental value. Other M2O paddlewheels, Ni2O and Zn2O exhibit a higher deviation, contracting 0.3 Å and 0.15 Å in bond length between the two metals. For all paddlewheels, the M–O bonds are very well conserved with all deviations less than ±0.1 Å. Structures for Ni2O and Zn2O paddlewheels are supplied in the ESI.†
Fig. 4 (a) Cu-based Paddlewheel M2O of refcode FIQCEN86 (b) In-based Trimeric Oxo-centred M3O refcode FIFGIM87 (c) Zn-based octahedron M4O of refcode EDUSIF88 (d) Zr-based cubohedral M6O of recode RUBTAK03.89 Crystal structure is represented by opaque and optimized structures are shown partly transparent. |
Both M3O and M4O experimental structures are also reproduced very accurately shown in Fig. 4b and c respectively. We examined In, Fe, and Al-based trimeric oxo-centered metal nodes, In–In have a deviation of −0.087 Å, Fe–Fe −0.163 Å (refcode HAKSIY90) and Al–Al −0.173 Å (refcode JALCAD91). Again, the M–O bonds are very conserved with all deviations less than ±0.1 Å. Zn-based octahedron unit found in MOF-5 (e.g. refcode EDUSIF88) exhibits a Zn–Zn deviation of −0.0699 Å and Zn–O error of −0.04 Å.
The zirconium oxide building block, from the well-known UiO-66 MOF (e.g. refcode RUBTAK0389) is similarly well-conserved. GFN-xTB has corrected the inner oxygen positions from being symmetric, to having two types of oxygen atoms – one capped with hydrogen and another one without hydrogen. The distinct zirconium–oxygen bonds optimized to a longer bond for Zr–OH (refer to O2 in Fig. 4d) and a shorter bond for Zr–O (refer to O1 in Fig. 4d). This correction is justified as the crystal structure represents an average of oxygen positions, where 50% of the oxygen atoms bear a hydrogen atom, but which oxygen atoms (i.e. the orientation of the building block) is not pre-defined.
We also confirm that GFN-xTB could also reproduce the detailed atomic structures shown by the RMSD of atomic position being 0.489 Å for all atoms and 0.439 Å for metal atoms when the lattice was fixed, while for the optimized lattice, the RMSD values are 0.617 Å for all atoms and 0.560 Å for metal atoms. Bond lengths are better conserved with an RMSD of 0.120 Å for all bonds and 0.187 Å for metal containing bonds for fixed lattice optimization. While for relaxed lattice optimization 0.175 Å for all bonds and 0.236 Å for metal containing bonds.
Popular metal-containing building blocks, e.g. Cu2O paddlewheels in HKUST-1, trimeric oxo-centered In3O in MIL-100, Zn4O octahedron in MOF-5, and finally the 12 connected Zr6O cuboctahedron in UiO-66 found to be especially well replicated with M-M errors of −0.030 Å, −0.087 Å, −0.070 Å, and 0.114 Å respectively.
The accuracy of optimized structures, combined with computational efficiency and the wide coverage of the periodic table (Z ≤ 86) allow GFN-xTB to be applied for screening studies where MOFs from the entire periodic table may be encountered. The accurate non-covalent interactions in the method,66 could allow GFN-xTB to be employed in screening for adsorption energies, an area in which force fields often have difficulty.
Footnote |
† Electronic supplementary information (ESI) available: Index and errors associated with all optimised CoRE structures. Complete set of GFN-xTB optimised structures. See DOI: https://doi.org/10.1039/d2cp00184e |
This journal is © the Owner Societies 2022 |