Open Access Article
Wen-Qing
Li
,
Gang
Wu
,
Juan Manuel
Arce-Ramos
,
Yang Hao
Lau
and
Man-Fai
Ng
*
Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), 1 Fusionopolis Way, #16–16 Connexis, Singapore 138632, Republic of Singapore. E-mail: ngmf@a-star.edu.sg
First published on 30th August 2025
Accurate modelling of the structural and dynamic properties of the solid electrolyte interphase (SEI) in lithium-ion batteries remains a longstanding challenge due to the high complexity of the SEI structure and the lack of structural information. Atomistic simulations using molecular dynamics (MD) can provide insights into the structure of the SEI but require large models and accurate interatomic potentials; however, existing computational tools struggle to evaluate these potentials in mixed-material systems efficiently and reliably. Here, we demonstrate the effectiveness of machine learning interatomic potentials (MLIPs) defined using amorphous structures as reference data, specifically the moment tensor potential (MTP), combined with density functional theory (DFT) calculations and active learning loops that enable rapid sampling of MD trajectories. For SEI relevant materials (e.g., Li2CO3, bulk Li, LiPF6, and Li2EDC), our trained MTP models accurately capture the key structural properties (e.g., lattice parameters, elastic constants, or phonon spectra). For the dynamical properties of monoclinic Li2CO3 and amorphous Li2EDC, the models are validated against previous theoretical predictions in the literature. Particularly, we illustrate the finite temperature effects on computing energy barriers. The determined mechanism of dominant diffusion carriers (Li vacancy, interstitial Li, and Li Frenkel pair) in Li2CO3 is highly consistent with DFT calculations. Furthermore, we show that the generated training datasets can be applied to train graph-neural-network (GNN)-based interatomic potentials that can further improve accuracy. The developed machine learning workflow provides a scalable approach for SEI modelling, enabling simulations at larger time and length scales to tackle the limitations of conventional DFT methods.
New conceptsThe present study introduces a machine learning-driven framework for modelling the complex structure and dynamics of the solid electrolyte interphase (SEI) in lithium-ion batteries. By leveraging moment tensor potentials (MTPs) trained on density functional theory (DFT) data, the approach enables accurate and scalable molecular dynamics simulations. A key innovation lies in the integration of active learning loops and amorphous configurations, which iteratively refine the training set to capture a broader range of atomic environments efficiently. The workflow also demonstrates the transferability of generated datasets to graph neural network (GNN)-based force fields, offering a pathway to further improve predictive accuracy for SEI materials. |
In battery materials research, molecular dynamics (MD) modelling has become a crucial tool for developing next-generation battery technologies.7,8Ab initio molecular dynamics (AIMD) based on density functional theory (DFT) has been widely used to model battery materials at the atomic scale, typically involving a few hundred atoms and picosecond time scales.9 However, AIMD struggles to model the SEI realistically due to its high computational cost and limited simulation scale.10 Empirically fitted force fields provide a complementary approach that can enable large-scale MD simulations by using analytic formulas to describe the interatomic interactions.11–14 These models significantly reduce the computational cost as compared with DFT. However, the development of traditional force fields, which requires reparameterization of data points (experimental and/or quantum chemical data) with significant human involvement, is a tedious process for complex chemistries (particularly those involving bond breaking and formation), leading to a low transferability.
Machine learning-based force fields (MLFFs) are transforming the way we model complex material systems, offering an efficient yet highly accurate alternative to traditional interatomic potentials.15–17 Recent progress in MLFF models for solid electrolyte materials in lithium-ion batteries has demonstrated their utility in probing structural stability, dopant effects, diffusivity, and activation energies across a wide range of chemistries.18,19 By learning energy landscapes directly from first-principles calculations, MLFFs can bridge the gap between computational efficiency and quantum-level precision, making them particularly well-suited for studying intricate environments. Unlike classical force fields, which rely on predefined functional forms, MLFFs use flexible, data-driven models to capture a broad spectrum of atomic interactions. This adaptability is crucial for SEI materials, where transport properties depend on subtle variations in the chemical composition, structural disorder, and interfacial effects. Despite their promise, applying MLFFs to SEI systems remains an open challenge. One of the primary hurdles is the need for extensive, high-quality training datasets that comprehensively represent the diverse atomic configurations found in SEI materials. The SEI consists of a combination of organic and inorganic phases, often exhibiting both crystalline and amorphous regions, as well as dynamically evolving interfacial structures. A robust MLFF must accurately describe interactions among key elements such as Li, O, C, H, and F, while also capturing the complex behavior of molecular species like CO32− and EDC2− anions.
A major challenge in MLFF development is the computational cost associated with generating training data. First-principles calculations, particularly for large systems, are resource-intensive, limiting the feasible size of individual training structures to a few hundred atoms. Consequently, MLFF training datasets must be carefully curated to maximize coverage of relevant atomic interactions within these constraints. In practical applications, the architecture and technical setup of MLFFs—including the choice of reference data, the method of sampling, and the level of theory used—play a crucial role in determining the reliability and range of applicability of the final model. Several strategies have been proposed to assemble reference datasets from DFT or ab initio calculations, each with distinct advantages and limitations. For example, (i) AIMD sampling: running long molecular dynamics trajectories at finite temperatures to systematically explore relevant atomic configurations;20,21 (ii) multiscale training: using lower-accuracy methods to generate an extensive dataset, followed by selective high-accuracy DFT refinements to enhance model precision;20,22 (iii) active learning: dynamically improving the force field by incorporating novel configurations encountered during simulations;23–26 (iv) enhanced sampling techniques: using biasing methods to accelerate the exploration of rare but important configurational states;21 and (v) perturbation-based methods: generating a diverse training set by displacing atomistic structures to capture high-energy states relevant for diffusion and reaction pathways.17,27 Although these techniques have been successfully applied to various material systems, their optimal combination for SEI modelling remains unclear. The lack of standardized methodologies for constructing SEI-specific training datasets has hindered the development of MLFFs capable of capturing the full range of SEI phenomena. Additionally, validating MLFF predictions against DFT/experimental data remains an essential but underexplored aspect of this field. As computational resources continue to advance, MLFFs are poised to play a pivotal role in accelerating SEI research. However, achieving their full potential requires further innovation in dataset construction, model transferability, and validation frameworks. Addressing these challenges will be essential for leveraging MLFFs to gain deeper insights into the fundamental processes governing SEI formation, evolution, and ionic transport.
In this work, we demonstrate the utility of machine learning interatomic potentials (MLIPs) developed through systematically sampling amorphous structures via a batch of short MD simulations, coupled with an active learning scheme and DFT calculations, to efficiently and reliably model the major components of the SEI in LIBs. Direct comparison and validation of MLFF predictions against DFT benchmarks for key properties such as Li+ transport barriers, lattice stability, phonon spectra, and elastic moduli establish trust in the MLFF's predictive power for SEI chemistry. While the study builds upon existing machine learning frameworks such as MTP and MACE, its novelty lies in developing a domain-specific MLFF workflow tailored for the SEI—a chemically complex, structurally diverse, and dynamically evolving region that is poorly described by classical force fields.
The SEI exhibits a highly complex chemical environment, encompassing diverse compositions, interfaces, and stoichiometries. Consequently, its structural properties cannot be adequately represented by well-defined crystalline phases alone. In many cases, the interfacial regions of the SEI are inherently disordered at the atomic scale. To address this, we adopted a strategy that captures interactions between different compound types and overcomes the limitations of training machine learning potentials solely on near-crystalline configurations. Specifically, the initial structures were generated by randomly blending and packing varying amounts of Li2EDC, Li2CO3, LiPF6 and Li species into a supercell using an initialisation scheme proposed by Vilhelmsen and Hammer, Fig. S1.29 The supercell involves atoms with a maximum number of 8 Li2EDC, 16 Li2CO3, 64 Li molecules, etc. Initially, each system was set up at low density with a large supercell (Fig. S1). Then it was condensed over 5 ps at 10 K with a high pressure of 50 kbar in the NPT ensemble, and an optimization was conducted to relax the atomic positions and internal stress. These calculations were performed using a universal machine learning force field implemented in MACE.30 In this work, 120 starting structures were generated this way to evaluate the training set via the loops described in the following section (Scheme 1). The results and discussion section demonstrates that the potential exhibits strong predictive capability for various properties near equilibrium states (Fig. 1).
![]() | ||
| Fig. 1 Typical structure of the SEI with inorganic and organic components, e.g., Li2CO3 and Li2EDC. Brown – carbon, red – oxygen, white – hydrogen, silver – fluorine, and light purple – phosphorus. | ||
In the MTP framework, the potential energy of an atomic structure is approximated as the sum of site energies for individual atoms, where each site's energy is expanded linearly using a set of basis functions. MTP introduces degree-like measures, referred to as “levels,” to determine which basis functions are included in the interatomic potentials. In our implementation, we employ MTP at “level 16,” corresponding to a mid-sized basis set, which enables large-scale MD simulations with sufficient efficiency. This is particularly advantageous for investigating SEI materials, where computational speed is critical. We have observed that this strategy can result in sufficient sampling for intended applications of the final MTP model involving MD simulations for equilibrium or close to equilibrium properties (e.g., radial distribution functions and lithium-ion diffusion coefficients).
Starting with 120 initial configurations generated via the scheme discussed in Section 2.1, the active learning process (see Section 2.2) ultimately produced the final training set, comprising 8686 structures, which enabled the generation of 50-ps NVT and NPT trajectories at various temperatures below 1000 K and pressures of 1 bar and 100 kbar. Training at MTP “level 16” yields a root-mean-squared error (RMSE) of 8 meV per atom for energy and a RMSE of 0.34 eV Å−1 for forces.
All of the MD simulations were performed using the MTP force field31 implemented as a calculator in the LAMMPS package38 using a constant time step of 0.5 fs. All simulations were performed using NPT or NVT ensembles, as appropriate, using a Nosé/Hoover temperature thermostat and a Nosé/Hoover pressure barostat, including an ensemble of velocities from a random number generator with the specified seed at the specified temperature. The relaxation time constants were chosen to be 100 fs for the thermostat and 500 fs for the barostat. In the production stage of the MD simulations, the initial geometries of the amorphous Li2EDC were created using the disordered system builder as implemented in ASE proposed by Vilhelmsen and Hammer29 (see Fig. S1 and S6 for more details).
000 sampled structures from those trajectories to evaluate the model performance. We assess the model's accuracy by comparing the absolute energies (Fig. 2a) and force components (Fig. 2b) against the reference level from theory (DFT with the PBE functional). The test root-mean-squared errors (RMSEs) for energy per atom and force components are 9 meV and 0.21 eV Å−1, respectively—closely matching the RMSEs of the training set. This indicates an accurate and robust model is achieved. More importantly, nevertheless, the more stringent and useful tests for our model are quantitatively accurate predictions of various properties (e.g., lattice parameters, elastic constants, phonon spectra, radial distribution functions and lithium-ion diffusion coefficients) of materials compared to DFT results and previous theoretical predictions in the literature. These tests are discussed in Section 3.
![]() | ||
| Fig. 2 Correlation of test set energies (a) and atomic force x components (b) of the DFT level with the MTP predictions. | ||
![]() | ||
| Fig. 5 Energy profile of the pathway following (a) the generation of a neutral Li Frenkel pair, (b) the neutral Li vacancy diffusion, and (c) the neutral Li diffusion via a knock-off mechanism. Black – DFT, blue – MTP, and yellow – MACE. The nudged elastic band method43 as implemented in ASE44 was used to locate the energy barrier of Li diffusion in a 1 × 2 × 2 supercell of Li2CO3. The atomistic structures of initial, transition, and final states are shown in Fig. S10. | ||
Finite-temperature dynamics offer critical insights into lithium-ion transport within the SEI materials, which is an important method for understanding batteries based on electrode–electrolyte systems. The experimentally determined activation energy (∼0.60 eV) near the anode is expected to be an average value over the multiphase SEI including different interfaces.42,45 Accurately predicting lithium-ion transport across different phases requires well-trained force fields. As an example, we consider models consisting of 1440 atoms from a 3 × 5 × 4 supercell of monoclinic Li2CO3, as well as an amorphous model containing 72 Li2EDC molecules (Fig. S6 in the SI) with 1152 atoms. MD simulations were performed at elevated temperatures (600–900 K for Li2CO3 and 400–600 K for Li2EDC) to ensure that the system reached a diffusive regime, allowing the analysis of temperature-dependent transport behaviours. The self-diffusion coefficient (D) was calculated using the Einstein relation at temperatures where diffusion was observed in NPT simulations at a pressure of 1 bar:
The diffusive behaviour of Li ions is evident from the linear increase of MSD with time (Fig. S7), and the temperature dependence follows an Arrhenius-like relationship, appearing linear in the plot of log(D) vs. 1/T (Fig. 6). While the MSD is higher at elevated temperatures, Li hopping events are significantly reduced at room temperature, leading to much lower diffusivity. At 300 K, our simulations yield lithium-ion diffusivities of 1.25 × 10−15 m2 s−1 for Li2EDC and 1.32 × 10−23 m2 s−1 for Li2CO3 (Fig. 6). The diffusion coefficient in Li2EDC is consistent with the value (∼10−16 m2 s−1) obtained using the revised many-body polarizable APPLE&P force field.12 To the best of our knowledge, only Tasaki et al.46 have previously reported Li diffusion in crystalline Li2CO3 at room temperature, using the COMPASS force field with parameters by defined Accelrys, Inc. In detail, they used a Li2CO3 crystal model with a simulation cell of approximately 20 Å per side and performed 1 ns NPT ensemble simulations. In our study, we employed a larger supercell (∼25 Å) and extended the simulation time to 20 ns under the NPT ensemble (Fig. S7) in order to evaluate the Li+ diffusion coefficient in Li2CO3. However, our calculated Li diffusion coefficient for Li2CO3 is approximately eight orders of magnitude lower than the value, 9 × 10−15 m2 s−1, reported by Tasaki et al.46 This discrepancy may arise from differences in computational models or the methodologies used to estimate diffusivity. Our calculated value (1.32 × 10−23 m2 s−1 for Li2CO3) was used to derive the activation energy barrier (Fig. S8a), which was further validated via DFT calculations presented in Fig. 5. Moreover, a comprehensive study of Li transport in Li2EDC, combining experimental measurements with the APPLE&P force field,12 has indicated that the diffusion coefficient of 8 × 10−12 m2 s−1 reported by Tasaki et al. is significantly overestimated compared to our more consistent value of ∼10−15 m2 s−1.
The obtained activation energy barrier for Li-ion diffusion is 0.64 eV for Li2EDC (Fig. S8), which closely aligns with the previously reported value of 0.66 eV using the classical force field.12 For crystalline Li2CO3, the barrier is 1.19 eV, which is highly consistent with our DFT prediction and falls within the range of 0.78–1.34 eV measured using solid-state NMR spectroscopy.47 Within the voltage range of lithium-ion batteries, excess interstitial Li is expected to dominate the diffusion process due to its lower formation energy compared to other defects (e.g., Li vacancy and Li Frenkel pair),9 making it a key factor in estimating energy barriers.9,39,40 Our simulations reveal a more rapid linear increase in the MSD curve (Fig. S7 in the SI) in the presence of an extra interstitial Li in the Li2CO3 model with 1440 atoms, confirming its impact on diffusivity. The concentration of interstitial Li-doped Li2CO3 is about 7.1 × 1025 m−3, which is close to the concentration used in SEI growth modelling.48Fig. 6b further illustrates the Arrhenius-like relationship. The diffusivity of the interstitial Li model at 300 K is 4.25 × 10−13 m2 s−1, comparable to the values obtained using AIMD (7.6 × 10−15 m2 s−1)42 and the standard 6–12 Lennard-Jones (LJ) potential (3.3 × 10−16 m2 s−1).41 The activation energy for interstitial Li diffusion in Li2CO3 is found to be 0.18 eV. Additionally, Li vacancies are frequently reported as point defects that facilitate Li transport in Li2CO3.49 At 300 K, the Li vacancy diffusivity is 8.0 × 10−14 m2 s−1 with an activation energy of 0.24 eV, with both values comparable to those of interstitial Li. The close agreement between energy barriers estimated from DFT-based NEB calculations and those obtained from MLFF-based MD simulations further validates the high accuracy of our trained model.
The diffusivities of interstitial Li and Li vacancies in Li2EDC at 300 K are 1.65 × 10−14 m2 s−1 and 6.11 × 10−15 m2 s−1, respectively, with the corresponding energy barriers of 0.49 eV and 0.56 eV. These values suggest that the organic Li2EDC exhibits lower ionic mobility than the inorganic Li2CO3, aligning with experimental findings where the stripping of the Li+ solvation sheath50 and the diffusion of Li+ from the organic to the inorganic layer9 are identified as key steps in anode processes.
We explicitly calculated the ionic conductivities of both Li2CO3 and Li2EDC using the Nernst–Einstein relation: σ = nz2q2D/kBT, where σ is the ionic conductivity, n is the charge carrier concentration, D is the diffusion coefficient, T is the temperature, z = 1 is the charge number for Li+, q = 1.602 × 10−19 C is the elementary charge, and kB = 1.381 × 10−23 J K−1 is the Boltzmann constant. We adopt a larger supercell containing more atoms for the ionic conductivity calculations so that the system size is more comparable to other theoretical work. Previous studies51–56 have shown that ionic conduction in inorganic Li2CO3 primarily occurs via interstitial Li defects.
The charge carrier concentration n of Li interstitials in Li2CO3 computed using a supercell including 3457 atoms is 2.98 × 1025 m−3. The calculated ionic conductivity based on the computed diffusion coefficients (1.76 × 10−13 m2 s−1 at 300 K) is 3.25 × 10−5 S m−1. Available experimental studies reported that the ion conductivity of Li2CO3 is ∼10−8 S m−1.51–53 Nevertheless, the theoretical reported values based on DFT and the continuum approach typically range from 10−6 to 10−8 S m−1.54–56 For Li2EDC, unlike Li2CO3, the dominated ion conduction mechanism remains experimentally unclear. Therefore, both intrinsic Li and Li interstitials might contribute to the conduction. With this assumption, the estimated charge carrier concentrations for Li2EDC using a supercell including 2688 atoms range from 3.50 × 1025 to 1.18 × 1028 m−3. The calculated ionic conductivities range from 7.60 × 10−4 to 2.26 × 10−6 S m−1 based on the computed diffusion coefficients (1.04 × 10−14 m2 s−1 at 300 K). The experimental ionic conductivity is ∼10−7 S m−1, while a forcefield based modelling gives values ranging from 10−6 to 10−8 S m−1.12
Despite the calculated value of Li2CO3 being higher than the experimental value, our result is in good agreement with other theoretical work, whereas the Li2EDC result agrees with both experimental and other theoretical works. We note that the ionic conductivity calculated from modelling in the literature tends to be higher. This discrepancy could primarily stem from the fact that it is computationally challenging to model the experimental charge carrier's concentration which requires a very large supercell. Therefore, a higher concentration of charge carrier is assumed in modelling (i.e., adopt a relatively smaller supercell), which could result in higher ionic conductivity. Nevertheless, the trend obtained from the present work remains consistent with other modelling and experimental work.
In addition, previous DFT studies suggest that the Li(001)/Li2CO3(001) interface exhibits relatively high mechanical strength with a reported interface energy of 0.49 J m−2,58 yet Li+ migration across it has not been thoroughly investigated. Our MD simulation for Li(001)/Li2CO3(001) interface shows that Li(001) changes to Li(111) after 5 ns at 300 K, indicating that Li(111) is the lowest energy surface. Therefore, we have investigated Li-ion transport across the Li(111)/Li2CO3(001) interface (Fig. 7). MLFF-MD simulations performed under an applied electric field show that Li transport takes place from Li(111) to Li2CO3(001), as shown in Fig. 7. With the application of the electric field (right to left), the migration process takes place along the atomic chain (the five Li atoms labelled by numbers 1 to 5 in Fig. 7). Like the diffusion pathway in the “knock-off mechanism”, these Li atoms at sites from 5 to 1 take turns as the diffusing interstitials, move in four knock-off steps, and finally push the Li atom to the left side of Li2CO3.
![]() | ||
| Fig. 7 MLFF-MD simulations performed at 300 K under an applied electric field (0.03 eV e−1 Å−1) using the QEq method57 to reveal lithium transport from Li(111) to Li2CO3(001). (a) Atomic structure of the Li(111)/Li2CO3(001) interface. Snapshots of the MD trajectory at (b) 1.4 ps and (c) 1.5 ps illustrate the migration process. | ||
We have therefore further developed and validated an equivariant neural network model named MACE,30 to assess whether the training dataset generated through iterative loops based on MTP models can be leveraged to capture subtle interatomic interactions with higher fidelity. The goal is to improve transferability across diverse material environments while enhancing accuracy. Training at MACE using about one million parameters yields a RMSE of 2 meV per atom for energy and a RMSE of 0.07 eV Å−1 for forces (Fig. S2). Compared to the MTP model, the MACE model also achieves significantly lower RMSEs for both energy and forces on test sets (Fig. S2 and S3). Like the MTP model, the MACE model demonstrates excellent agreement with DFT calculations for lattice parameters (Table S1), elastic constants (Table S2), and phonon spectra (Fig. S3). Additionally, the MACE model improves predictions of energy barriers for Li transport via three mechanisms in Li2CO3 (Fig. 5) compared to the MTP model. However, despite these improvements, the computational cost of MACE remains a limiting factor. In large supercells (∼1440 atoms), our current implementation achieves only ∼180 picoseconds of MD simulation per day using one GPU CAR on a machine (Nvidia A100 40 GB). In contrast, MTP models indicate that equilibrating the amorphous Li2EDC structure requires at least 60 nanoseconds of MD simulations (Fig. S6), highlighting a major challenge in applying MACE to SEI systems. This computational bottleneck restricts its feasibility for long-timescale simulations, such as studying ion transport across grain boundaries in SEI's organic or inorganic phases, which evolve due to dissolution, redeposition, and densification of different phases. Collectively, both methods highlight the importance of developing a diverse suite of MLFFs tailored to the specific accuracy-efficiency requirements of different battery components. Future extensions may benefit from hybrid strategies that combine the strengths of local-descriptor models like MTP with other more expressive approaches such as Sparse Gaussian Process Regression (SGPR). SGPR, as demonstrated in recent studies59,60 on solid electrolyte materials, provides not only reliable accuracy but also intrinsic uncertainty quantification.
While our study focuses on Li2CO3 and Li2EDC as representative SEI components due to their well-characterized roles in the inorganic and organic phases of the SEI, we acknowledge that the real SEI is a heterogeneous and dynamic mixture of various species, including LiF, Li2O, organic polymers, and decomposition products including battery additives.4–6,61 Our current framework is designed to be extensible, and the active learning workflow demonstrated here can be applied to incorporate additional SEI components in future studies. This work represents an important first step toward building more comprehensive machine learning-based models for the complex multicomponent nature of the SEI.
Supplementary information: Tables and figures illustrating the structures and properties of various materials and systems. See DOI: https://doi.org/10.1039/d5mh01343g.
| This journal is © The Royal Society of Chemistry 2025 |