Harveen Kaur,^a
Flaviano Della Pia,^a
Ilyes Batatia,^b
Xavier R. Advincula,^a,c
Benjamin X. Shi,^a
Jinggang Lan,^d,e
Gábor Csányi,^b
Angelos Michaelides^a and
Venkat Kapil*^a,f,g
aYusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK. E-mail: v.kapil@ucl.ac.uk
bEngineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, UK
cCavendish Laboratory, Department of Physics, University of Cambridge, Cambridge, CB3 0HE, UK
dDepartment of Chemistry, New York University, New York, NY 10003, USA
eSimons Center for Computational Physical Chemistry at New York University, New York, New York 10003, USA
fDepartment of Physics and Astronomy, University College London, London WC1E 6BT, UK
gThomas Young Centre and London Centre for Nanotechnology, London WC1E 6BT, UK. E-mail: v.kapil@ucl.ac.uk
First published on 9th August 2024
Calculating sublimation enthalpies of molecular crystal polymorphs is relevant to a wide range of technological applications. However, predicting these quantities at first-principles accuracy – even with the aid of machine learning potentials – is a challenge that requires sub-kJ mol−1 accuracy in the potential energy surface and finite-temperature sampling. We present an accurate and data-efficient protocol for training machine learning interatomic potentials by fine-tuning the foundational MACE-MP-0 model and showcase its capabilities on sublimation enthalpies and physical properties of ice polymorphs. Our approach requires only a few tens of training structures to achieve sub-kJ mol−1 accuracy in the sublimation enthalpies and sub-1% error in densities at finite temperature and pressure. Exploiting this data efficiency, we perform preliminary NPT simulations of hexagonal ice at the random phase approximation level and demonstrate a good agreement with experiments. Our results show promise for finite-temperature modelling of molecular crystals with the accuracy of correlated electronic structure theory methods.
Although possible in principle, predicting sublimation enthalpies with first-principles methods is challenging due to the need for high accuracy. Reliable predictions require a tolerance of nearly 4.2 kJ mol−1 for absolute sublimation enthalpies and a tighter tolerance of less than 1 kJ mol−1 for relative sublimation enthalpies.4 As shown by Zen et al.,6 predicting absolute sublimation enthalpies of common molecular solids, such as ice, ammonia, carbon dioxide and aromatic hydrocarbons, consistently to 1 kcal mol−1 requires computationally demanding “correlated” electronic structure techniques, including fixed-node diffusion Monte Carlo,7 periodic coupled cluster8 and the random phase approximation (RPA) with singles excitations.9,10 Similarly, achieving correct relative stabilities, accurate to 1 kJ mol−1, of prototypical polymorphs of oxalic acid, glycine, paracetamol, and benzene requires statistical mechanics incorporating dynamical disorder via thermal effects,11,12 thermal expansion,13 and anharmonic quantum nuclear motion.11–13
Unfortunately, the computational cost associated with correlated electronic structure theory6 or rigorous quantum statistical mechanics,14 individually or in tandem, remains high. Hence, sublimation enthalpies are commonly approximated inexpensively with dispersion-corrected density functional theory (DFT) for a static geometry optimized lattice at zero kelvin.15 As a consequence of their static description, these enthalpies are compared indirectly with experiments requiring a careful extrapolation of measured enthalpies to a static lattice at zero kelvin.16 Unfortunately, this is typically associated with an error-prone ad hoc subtraction of zero-point energy corrections calculated using DFT.17 Furthermore, considering the inherent uncertainties in measured sublimation enthalpies,5 there is a need for relative stabilities that can be unambiguously compared with experiments at their respective thermodynamic conditions.13
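For orientation, the static-lattice back-correction mentioned above is often made with the rigid-molecule “minus 2RT” rule, E_latt ≈ −[ΔH_sub(T) + 2RT], which assumes an ideal gas and classical harmonic vibrations and still neglects zero-point energy. A minimal sketch with purely illustrative numbers (not values from this work):

```python
# Common textbook back-correction of a measured sublimation enthalpy to a
# static lattice energy via the "2RT" rule (rigid molecules, ideal gas,
# classical harmonic crystal; zero-point energy is NOT included).
R = 8.314462618e-3  # gas constant, kJ mol^-1 K^-1

def static_lattice_energy(dH_sub, T):
    """E_latt ~ -(dH_sub + 2*R*T), with dH_sub in kJ mol^-1 and T in K."""
    return -(dH_sub + 2.0 * R * T)

# hypothetical: a 50 kJ mol^-1 sublimation enthalpy measured at 250 K
E_latt = static_lattice_energy(50.0, 250.0)
```

The rule illustrates why such extrapolations are ad hoc: the 2RT term and the neglected zero-point energy each carry uncertainties comparable to the kJ mol−1 tolerances discussed above.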
In this context, machine learning potentials (MLPs)18–22 provide an avenue for first-principles-accuracy modelling of molecular crystals at finite temperature. MLPs have been used as computationally inexpensive surrogates for first-principles potential energy surfaces (PES) for ranking putative polymorphs in increasing order of lattice energies.23–25 MLPs have also facilitated finite-temperature modelling of polymorphs of simple compounds like hydrogen26 and water,27,28 with converged system sizes and simulation times. More recently, Kapil and Engel13 developed an MLP-based framework for predicting polymorph relative stabilities for paradigmatic molecular crystals containing up to four chemical species, such as benzene, glycine, and succinic acid. While their approach enables rigorous predictions of relative and absolute stabilities at finite temperatures, it presents a number of limitations arising from those of conventional MLPs. These include a >kJ mol−1 error in out-of-distribution prediction, the need for large volumes of training data (>1000 structures per compound), and declining accuracy and data efficiency with an increasing number of chemical species. These deficiencies make finite-temperature stability calculations for generic molecular compounds costly and prohibitive when using chemically-accurate electronic structure theory.6
In this work, we present an accurate and data-efficient MLP-based approach for finite-temperature modelling and sublimation enthalpy prediction of given polymorphs of a compound. Using ice polymorphs as a test bed, we show that the multi-atomic cluster expansion (MACE) architecture,22 supplemented with fine-tuned training of the foundational MACE-MP-0 model, is sufficient to reach sub-kJ mol−1 accuracy with as few as 50 training structures. In Section 2, we discuss the shortcomings of conventional MLPs and the capabilities of the newer methods, with a focus on MACE and MACE-MP-0, followed by the details of our protocol, including the dataset generation, training and validation steps. In Section 3, we apply our approach to crystalline ice, a prototypical system exhibiting a high degree of polymorphism with good-quality experimental data on densities and sublimation enthalpies.16 We demonstrate the accuracy and generality of our approach through the excellent agreement of finite-temperature densities and sublimation enthalpies of ice polymorphs directly against the DFT reference. Finally, we explore simulations of ice polymorphs directly at the RPA level by training on a few tens of periodic total energy and force calculations. While the results of RPA simulations are sensitive to the orbitals used to expand the independent-particle response function,29,30 using hybrid-functional orbitals yields a good description of the density of ice. In Section 4, we discuss the limitations of this work and future efforts towards direct finite-temperature simulations for characterizing molecular crystal polymorphs at the accuracy of correlated electronic structure theory.
Standard architectures, such as the Behler–Parrinello neural network (BPNN) framework,35 Gaussian Approximation Potential,19 SchNet,37 DeepMD,33 and Moment Tensor Potential,38 can be constructed by mixing and matching various flavours of two- or three-body atomic representations and regression models. Despite their success, standard models have two main limitations. First, the truncation of body order leads to the incompleteness of the atomic representations, limiting their accuracy and smoothness.39 The accuracy can be systematically improved, but including higher body order representations involves a much higher computational cost and labour.38,40,41 Second, the number of representations scales combinatorially with the number of chemical species as n-body representations are defined for every n-tuple of atomic species. Hence, for a given accuracy cutoff, standard MLPs exhibit an exponentially increasing cost with an increasing number of chemical species. Similarly, for a fixed cost, these models have a steeply worsening accuracy with increasing chemical species.
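The combinatorial scaling can be made concrete. An n-body descriptor is indexed by the central species plus an unordered multiset of n − 1 neighbour species, so the number of descriptor channels grows rapidly with the species count S. A back-of-the-envelope count (our illustration, not tied to any particular architecture):

```python
from math import comb

def n_body_channels(S, n):
    """Central species times multisets of (n - 1) neighbour species:
    S * C(S + n - 2, n - 1). Counts descriptor channels for one n-body
    term with S chemical species."""
    return S * comb(S + n - 2, n - 1)

# 3-body channel counts for 1, 2 and 4 chemical species
counts = [n_body_channels(S, 3) for S in (1, 2, 4)]  # grows as ~S^3 / 2
```

Going from one to four species already multiplies the 3-body channel count forty-fold, which is why standard MLPs lose accuracy or gain cost so steeply with chemical diversity.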
Newer MLPs such as NequIP,21 MACE,22 and PET,42 implemented as Euclidean graph neural networks,43 address these issues by incorporating (near) complete atomic representations.20 In addition, they exploit a learnable latent chemical space21,44,45 for smoothly interpolating or extrapolating representations across chemical species at a reduced cost without compromising accuracy.46 Specifically, in the MACE architecture, the initial node representations of the graph neural network are based on the atomic cluster expansion20 up to a selected body order (typically n = 4). MACE systematically constructs higher body-order representations in terms of the output of the previous layer and an equivariant, high body-order message passing scheme.45 Hence, increasing the number of layers or the body order of the message passing enables learning correlation functions of arbitrary order. Moreover, thanks to its message passing scheme, MACE can efficiently construct high body-order representations with a simple architecture (e.g., the default parameters of MACE with just two layers give access to a body order of 13).47 Finally, by embedding chemical information in a learnable latent space,48 MACE displays a computational cost that is independent of the number of chemical species.
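The quoted body order of 13 can be rationalized by a simple recursion (our own counting, under the assumption that each layer forms products of ν features of the previous body order, with the central atom contributing one): with correlation order ν = 3, one layer yields body order 4, and a second layer yields 3 × 4 + 1 = 13.

```python
def receptive_body_order(layers, nu):
    """Hypothetical counting of the effective body order after stacking
    message-passing layers: each layer takes products of `nu` features of
    the previous body order; the central atom adds one."""
    b = 1  # a bare atom
    for _ in range(layers):
        b = nu * b + 1
    return b

one_layer = receptive_body_order(1, 3)  # correlation order 3
two_layers = receptive_body_order(2, 3)
```

This recursion grows geometrically with depth, which is the sense in which message passing buys high body order cheaply.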
Exploiting these capabilities for datasets with a large number of elements, Batatia et al.49 have recently developed the so-called MACE-MP-0 foundational MLP, trained on a diverse Materials Project dataset. Specifically, MACE-MP-0 is trained on the MPTrj dataset, comprising 1.5 million small periodic unit cells of inorganic (non-molecular) crystals with elements from across the periodic table. The training set includes total energies, forces and stress tensors estimated at the PBE(+U) level. Trained for elements across the periodic table, the MACE-MP-0 model is capable of out-of-the-box usage for general materials with qualitative (and sometimes quantitative) PBE accuracy. Other classes of foundational MLPs exist, such as CHGNet50 and M3GNet,51 based on Materials Project datasets, and ALIGNN-FF,50 based on the JARVIS-DFT dataset. Unlike MACE-MP-0, these models are based on three-body atomic representations.
While MACE-MP-0’s accuracy is insufficient for studying molecular crystal polymorphs, its parameters may provide a starting point for training system-specific models at a different level of theory. Considering that the pre-trained n-body atomic representations are valid for generic materials, using the MACE-MP-0 parameters as a starting point for fine-tuning may require less data and computational time compared to training a new model from scratch.
The models from step 4 can be used for production simulations in the NPT ensemble with larger simulation sizes and long simulation times. The same procedure is used for training the model at the RPA level, replacing DFT with RPA level total energy and force calculations.
We also perform total energy and force calculations at the RPA level. We use the EXX + RPA@PBE0(ADMM) approach, which computes exact exchange (EXX) and the RPA correlation energy based on PBE0 orbitals estimated within the auxiliary density matrix method (ADMM).61 We employ the sparse tensor-based nuclear gradients of RPA62 as implemented in CP2K. All calculations use the triple-zeta cc-TZ and RI_TZ basis sets alongside GTH-PBE0 pseudopotentials. The Hartree–Fock exchange contribution to the SCF and the Z-vector equation is calculated within the ADMM approximation.
As shown in Fig. 1, we report RMSEs of the energy per molecule and the force components on H and O atoms as a function of the size of the training set. A MACE model trained from scratch reports an energy RMSE of nearly 0.01 kJ mol−1 with the smallest training set comprising 50 structures. On an absolute scale, this error is extremely low. However, it corresponds to 10% of the spread of energies in the validation set, i.e., the standard deviation of the potential energy per molecule at 100 K. With 400 structures, i.e., the entire dataset, we report an energy RMSE of around 0.001 kJ mol−1, corresponding to around 1% of the validation set standard deviation.
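Reporting an RMSE as a fraction of the validation-set standard deviation, as done above, can be sketched as follows (an illustrative numpy snippet with synthetic data, not the authors' evaluation code):

```python
import numpy as np

def rmse_relative(pred, ref):
    """Return the RMSE of predictions and the same RMSE expressed as a
    fraction of the spread (standard deviation) of the reference values."""
    err = np.asarray(pred) - np.asarray(ref)
    rmse = np.sqrt(np.mean(err ** 2))
    return rmse, rmse / np.std(ref)

# toy data: reference "energies" with spread ~1, predictions off by ~0.1
rng = np.random.default_rng(0)
ref = rng.normal(0.0, 1.0, 1000)
pred = ref + rng.normal(0.0, 0.1, 1000)
rmse, frac = rmse_relative(pred, ref)  # frac ~ 0.1, i.e. ~10% of the spread
```

Normalizing by the spread makes RMSEs comparable across phases and thermodynamic conditions, where the absolute energy scale differs.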
We next study the effect of fine-tuning the parameters of the MACE-MP-0 model, which implies starting the training from the last checkpoint of the foundational model. As seen in Fig. 1(a), fine-tuning improves the accuracy of the models, compared to training from scratch, for small datasets containing fewer than 160 structures. With just 50 training structures, a fine-tuned MACE model reports an energy RMSE that corresponds to around 2% of the validation set standard deviation. For training set sizes beyond 160 structures, we see nearly identical performance of the from-scratch and fine-tuned models on the energy RMSE.
The RMSEs of the force components on H and O atoms paint a similar picture. The from-scratch MACE models report a small RMSE of nearly 2 meV Å−1 with just 50 structures, nearly 1% of the validation set standard deviation, with a systematic reduction in error with increasing training data. On the other hand, fine-tuning the MACE-MP-0 model results in improved accuracy and data efficiency, with nearly 1 meV Å−1 RMSE with 50 structures and sub-meV Å−1 RMSE with over 200 structures. We observe improvements in force RMSEs across the full range of training set sizes.
To contextualize these RMSEs, we also report RMSEs obtained with a standard 2- or 3-body atomic-representation-based model. We select the BPNN35 architecture, which has been widely used for simulating the bulk,67 interfacial68 and confined phases of water.69 We observe nearly order-of-magnitude higher energy and force RMSEs with the BPNN scheme, with indications of saturating errors at 400 training structures. Although the BPNN model reports a small energy RMSE on an absolute scale, it saturates at around 10% relative error. The saturation in the RMSEs is likely due to a ceiling on learning capacity imposed by incomplete atomic representations. We note the much higher data efficiency of the MACE models, which exhibit higher accuracy even with an order of magnitude fewer training data.
To assess model performance at finite thermodynamic conditions, we perform fully flexible NPT simulations at 100 K and 1 bar for each model. We found the BPNN NPT simulations to be unstable due to overfitting on the training ensemble; hence, we only present results for the MACE models. We report the thermodynamic average of the potential energy and the density as a function of the size of the training set in Fig. 3. Exploiting MACE’s high fidelity, we perform statistical reweighting to calculate the DFT reference values of the average potential energy and the density. For this purpose, we use the trajectory sampled by the fine-tuned MACE model trained on 100 structures. The DFT references allow us to test the MLPs directly against DFT in the corresponding thermodynamic ensembles.
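The reweighting step can be sketched as follows (a minimal numpy sketch under our own conventions, not the authors' code). Each MLP-sampled snapshot is re-evaluated with DFT and weighted by exp[−β(E_DFT − E_MLP)]; for snapshots shared between the two NPT ensembles the pV terms cancel, so only the potential-energy difference enters. The Kish effective sample size is a useful diagnostic of overlap between the ensembles:

```python
import numpy as np

kB = 0.0083144626  # Boltzmann constant, kJ mol^-1 K^-1

def reweight(obs, e_mlp, e_dft, T):
    """Reweight the MLP-sampled average of `obs` to the DFT ensemble.
    Weights w_i ~ exp(-beta * (E_DFT_i - E_MLP_i)); the maximum exponent
    is subtracted for numerical stability. Returns the reweighted average
    and the Kish effective sample size 1 / sum(w_i^2)."""
    beta = 1.0 / (kB * T)
    x = -beta * (np.asarray(e_dft) - np.asarray(e_mlp))
    w = np.exp(x - x.max())
    w /= w.sum()
    return np.dot(w, obs), 1.0 / np.sum(w ** 2)

# sanity check: identical energies give uniform weights, i.e. a plain mean
avg, n_eff = reweight([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], 100.0)
```

When the effective sample size collapses towards one, the MLP and DFT ensembles overlap poorly and the reweighted averages become unreliable; the high fidelity of the MACE models is what keeps this overlap large here.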
As shown in Fig. 3, the from-scratch MACE models generalize well to the true thermodynamic ensemble, displaying sub-kJ mol−1 errors for the average potential energy at 100 K and 1 bar. Despite these small errors, we note that the from-scratch MACE thermodynamic averages deviate from the DFT reference at 50 structures, reaching agreement within statistical error only for models trained with more than 200 structures. These small errors lead to significant disagreements in the density (a quantity that, as per empirical evidence,70 is much harder to converge than energies or forces). We require training on 400 structures to converge the density to the DFT reference within statistical error.
With the fine-tuned models, we observe a remarkable performance against the DFT ensemble. Even for the smallest training set comprising 50 structures, we observe a quantitative agreement in the density and the average energy. Inspired by this result, we trained new fine-tuned models with 1, 2, 10, and 20 structures and studied their performance on density and the average energy in the NPT ensemble. With just 10 structures, the properties obtained with the fine-tuned models are within 1% (0.009 g cm−3) of the DFT density and 0.5 kJ mol−1 of the DFT energy. These tests suggest the potential of our approach for finite-temperature modelling of molecular polymorphs with a few tens of training structures. This is a marked improvement over our previous work in ref. 13, which required hundreds to thousands of structures and differential learning for stable NPT simulations.
For each polymorph, we train a fine-tuned MACE model on 100 structures and perform NPT simulations at 100 K and 1 bar to estimate the density and the average potential energy. To estimate the sublimation enthalpy, we further train a fine-tuned MACE model on 200 structures of a water molecule to provide a gas-phase reference enthalpy at 100 K and 1 bar. The training set was developed in the same way as for the ice polymorphs, with initial NVT sampling using CP2K at the revPBE-D3 level and converged DFT calculations on randomly sampled structures using VASP. Finally, for an apples-to-apples comparison, we compare with DFT-level densities and sublimation enthalpies calculated using statistical reweighting.
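Given the two NPT averages, the sublimation enthalpy per molecule follows from ΔH_sub = H_gas − H_crystal/N, with the gas phase treated as ideal so that its pV term is k_BT per molecule; classical kinetic-energy contributions cancel between the phases. A sketch with hypothetical numbers (sign conventions vary in the literature; here a positive value means sublimation is endothermic):

```python
kB = 0.0083144626  # Boltzmann constant, kJ mol^-1 K^-1

def sublimation_enthalpy(u_gas, u_cry, p_v_cry, n_mol, T):
    """dH_sub = H_gas - H_crystal per molecule.
    u_gas:   average potential energy of one gas-phase molecule [kJ mol^-1]
    u_cry:   average potential energy of the crystal cell [kJ mol^-1]
    p_v_cry: average pV term of the cell [kJ mol^-1]
    n_mol:   molecules per cell; ideal-gas pV contributes kB*T per molecule."""
    h_gas = u_gas + kB * T
    h_cry = (u_cry + p_v_cry) / n_mol
    return h_gas - h_cry

# hypothetical: a 96-molecule cell bound by ~58 kJ mol^-1 per molecule at 100 K
dH = sublimation_enthalpy(0.0, -96 * 58.0, 0.01, 96, 100.0)
```

At 1 bar the crystal pV term is negligible on this scale, so the result is dominated by the potential-energy difference plus the small k_BT gas-phase term.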
As shown in Table 1, the densities and sublimation enthalpies estimated with MACE at 100 K and 1 bar agree remarkably well with the reference DFT estimates. In most cases, the discrepancies between MACE and DFT are within the 1σ statistical error of the DFT estimates. In all cases, the agreement for the sublimation enthalpy and the density is within 0.5 kJ mol−1 and 1%, respectively. These results demonstrate the data efficiency and generalizability of the fine-tuned MACE models to the full NPT ensemble of energy and volume despite being trained on a skewed ensemble.
| Polymorph | MACE density [g cm−3] | DFT density [g cm−3] | % error | MACE sublimation enthalpy [kJ mol−1] | DFT sublimation enthalpy [kJ mol−1] | Error [kJ mol−1] |
|---|---|---|---|---|---|---|
| Ih | 0.922 ± 0.001 | 0.922 ± 0.001 | 0.012 ± 0.161 | −58.208 ± 0.089 | −58.214 ± 0.089 | 0.006 ± 0.126 |
| II | 1.200 ± 0.002 | 1.190 ± 0.009 | 0.896 ± 0.764 | −57.055 ± 0.091 | −56.814 ± 0.145 | 0.241 ± 0.172 |
| VI | 1.329 ± 0.001 | 1.330 ± 0.006 | 0.083 ± 0.461 | −55.225 ± 0.091 | −55.215 ± 0.114 | 0.010 ± 0.146 |
| VIII | 1.569 ± 0.001 | 1.565 ± 0.003 | 0.262 ± 0.196 | −56.063 ± 0.107 | −55.719 ± 0.143 | 0.349 ± 0.179 |
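As a consistency check, the uncertainties quoted for the MACE − DFT differences in Table 1 follow from standard propagation for independent errors, σ_diff = (σ_MACE² + σ_DFT²)^1/2; for example, for ice Ih and ice VI:

```python
from math import sqrt

def propagated_sigma(s1, s2):
    """Uncertainty of the difference of two independent estimates."""
    return sqrt(s1 ** 2 + s2 ** 2)

# sublimation-enthalpy sigmas from Table 1, in kJ mol^-1
sigma_ih = propagated_sigma(0.089, 0.089)  # ice Ih
sigma_vi = propagated_sigma(0.091, 0.114)  # ice VI
```

Both reproduce the tabulated error bars (0.126 and 0.146 kJ mol−1) to the quoted precision.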
To explore this possibility, we consider modelling the properties of ice Ih at the RPA level, for which an efficient implementation exists in CP2K.62,71 Additionally, zero-kelvin benchmarks on ice Ih using Gaussian-type orbitals in CP2K compare well with plane-wave results from VASP.29 It is to be noted that the RPA correlation energy estimated with PBE orbitals leads to underbinding and, thus, an underprediction of the density, while Hartree–Fock orbitals lead to overbinding and, thus, an overprediction of the density.29 Following the suggestion of Macher et al.,29 we used hybrid-functional orbitals from PBE0 estimated with the ADMM approximation, which we refer to as EXX + RPA@PBE0(ADMM). For brevity, we will refer to this level of theory as the RPA level unless stated otherwise. We trained a fine-tuned MACE model at this level for ice Ih using 75 periodic total energy and force calculations.
We found that training at the RPA level was more challenging than at the DFT level. Our training set resulted in an energy mean absolute error (MAE) of 0.03 kJ mol−1 and a force MAE of 24.8 meV Å−1, significantly higher than the energy MAE of 0.001 kJ mol−1 and force MAE of 1 meV Å−1 on the DFT training set. This is due to the noise in the forces resulting from the ADMM approximation, which reduces the computational cost associated with RPA-level calculations. ADMM also leads to a spurious offset in the total energy, which prevents us from estimating sublimation energies but, fortunately, does not significantly affect structural properties.30 A full study of water at the RPA level, with comprehensive convergence tests and without the auxiliary density matrix method, will be presented elsewhere.
Despite the noisy training set, we obtained stable NPT simulations at the RPA level at 100 K and 1 bar, as shown in Fig. 4. However, due to the high cost of single-point calculations, we were unable to perform statistical reweighting for direct validation. Nonetheless, the accuracy afforded by our simulations allows a meaningful comparison with experiments.
Fig. 4 RPA-level simulations of ice Ih. Panel (a) shows the time series of the density of ice Ih from a classical NPT simulation at 100 K and 1 bar with a MACE model trained at the EXX + RPA@PBE0(ADMM) level. Panel (b) shows the same quantity but with classical and path-integral NPT simulations using the MACE model trained at the revPBE-D3 level. The black dashed lines correspond to the experimental density72 at 100 K and 1 bar. Panel (c) shows the oxygen–oxygen pair correlation function of ice Ih at 220 K estimated from an RPA-level MACE NVT simulation. Panel (d) reports the same quantity but with classical and path-integral NPT simulations at 220 K and 1 bar using the MACE model trained at the revPBE-D3 level. The black lines correspond to the experimental pair correlation function73 measured at 220 K and 1 bar.
We first study the density of ice at 100 K, which is 0.934 g cm−3 as observed in experiments.72 Macher et al.’s29 calculations show that exact exchange and RPA correlation using (delocalized) PBE orbitals lead to an underpredicted density of around 0.910 g cm−3, while exact exchange computed using (localized) Hartree–Fock orbitals with RPA correlation using PBE orbitals overestimates the density to around 0.955 g cm−3. Consistent with these results, our RPA calculations, based on exact exchange and RPA correlation evaluated with PBE0(ADMM) orbitals, result in a density of 0.939 g cm−3, which is only marginally higher than the experimental density. Our RPA results also improve on the revPBE-D3 density of 0.922 g cm−3. The missing quantum nuclear effects in our RPA-level simulations likely explain the remaining (small) discrepancy between the NPT and experimental densities. We were unable to confirm this directly, as our path-integral simulations were not stable. However, we were able to link this instability to the noisy forces by performing path-integral simulations using our revPBE-D3-level fine-tuned model. As can be seen in Fig. 4(b), these simulations are stable with quantum nuclear effects despite the model not being trained on configurations generated using path-integral sampling. Quantum nuclear effects marginally reduce the density of ice Ih, nearly to the same extent as the overestimation of the classical RPA density relative to experiment.
Finally, with access to stable trajectories, we compared the structure of ice Ih with radiation total scattering experiments at 220 K (ref. 73) by calculating the oxygen–oxygen pair correlation function. Although our potential is trained on configurations corresponding to a 100 K and 1 bar ensemble, our models generalize to higher temperatures. Unfortunately, due to the noisy RPA-level forces (as diagnosed by the stable classical and path-integral simulations in the revPBE-D3 NPT ensemble at 220 K in Fig. 4(d)), we report unstable simulations in the NPT ensemble at 220 K and 1 bar. On the other hand, we can perform stable simulations in the NVT ensemble at 220 K and compare the predicted pair correlation function with experiment in Fig. 4(c). We report an overall good agreement with the experimental pair correlation function and with revPBE-D3, modulo the over-structuring of the first and second peaks. Quantum nuclear motion at 220 K is expected to broaden the first and second peaks, but not enough to explain the extent of static disorder in the experiments. The non-zero probability in the 3–4 Å range could arise from defect migration at the grain boundaries in the powder sample. Alternatively, the disagreement at short distances (or large reciprocal-space vectors) could be an artefact of the empirical potential structure refinement74 used to analyze the experimental data.
Training a MACE model by fine-tuning the parameters of the pre-trained MACE-MP-0 model, as opposed to training it from scratch, results in improved accuracy and data efficiency. Only 50 to 100 training structures sampled at a given (T, P) condition are needed to achieve sub-kJ mol−1 and sub-1% agreement on the average energy and density, respectively, against the reference DFT NPT ensemble. Exploiting the accuracy and low data requirement of our approach, we develop an RPA-quality machine learning model for simulating ice Ih in the NPT ensemble. Our RPA simulations demonstrate an overall good agreement with the experimental density and pair correlation functions and an improvement beyond DFT. At the same time, the noise in the RPA training forces compromises the MLP’s data efficiency and robustness compared to the DFT level. Our work highlights the importance of tightly converged electronic structure training data, particularly at correlated levels of theory.
We conclude by discussing some limitations that need to be addressed to make our approach applicable to more complex molecular crystals, such as the CSP blind test candidates.75 First, our initial data generation step, which uses first-principles MD, took around 24 hours on 8 nodes of an Intel Xeon E5-2690 v3 @ 2.60 GHz (12 cores); this step could prove expensive for larger or more complex molecular crystals. Hence, future work will explore less expensive sampling protocols, ranging from MD using the MACE-MP-0 model to random structure search sampling.76 Second, our approach could require a significantly larger volume of training data for systems with a large number of polymorphs, as we generate training data and models separately for each polymorph. Hence, we will study the data efficiency associated with pooling training configurations to develop a single model for all polymorphs. Finally, our approach currently does not include quantum nuclear effects in sublimation enthalpies, which could prove important for obtaining quantitative agreement with experiments. With these developments, we foresee predictive sublimation enthalpy and physical property calculations for molecular crystals at the accuracy of correlated electronic structure theory and path-integral molecular dynamics.
This journal is © The Royal Society of Chemistry 2024