Open Access Article
Srikant Sagireddy†
*abc,
Nikhil Rampal†
*ab,
Stephen E. Weitzner
ab and
Liwen F. Wan
*ab
aMaterials Science Division, Lawrence Livermore National Laboratory, Livermore, California 94550, USA. E-mail: rampal1@llnl.gov; wan6@llnl.gov
bLaboratory for Energy Applications for the Future (LEAF), Lawrence Livermore National Laboratory, Livermore, California 94550, USA
cDepartment of Chemical Engineering, Stanford University, Stanford, California 94305, USA. E-mail: srikant@stanford.edu
First published on 11th March 2026
Electrochemical stability windows determine the operating range of battery electrolytes, yet accurate prediction remains challenging because stability emerges from statistical ensembles of local solvation environments rather than single ground-state molecular structures. Traditional density functional theory calculations on energy-minimized clusters cannot capture the thermal variations in local coordination environments and geometries that govern decomposition, while SMILES-based machine learning methods lack explicit representation of three-dimensional solvation structure and ion pairing. Here, we introduce a structure-aware machine learning framework that predicts frontier orbital energies (HOMO and LUMO) directly from molecular dynamics-sampled solvation configurations, achieving sub-0.6 eV accuracy at computational costs 3–4 orders of magnitude lower than first-principles methods. Across twelve representative battery electrolytes, we demonstrate that solvent-separated and contact ion pairs exhibit strong size- and local chemistry dependent electronic stability, with variations in coordination shifts of HOMO or LUMO level by 2–3 eV, and that extended solvation structure and partially desolvated environment further modulate stability by up to 3 eV. By encoding the statistical nature of electrochemical failure through ensemble sampling of explicit solvation geometries, our approach enables high-throughput screening and rational design of next-generation battery electrolytes with mechanistic understanding of structure–property relationships.
Broader contextThe rational design of battery electrolytes remains a critical bottleneck in developing next-generation energy storage systems, as electrolyte stability fundamentally limits achievable voltage windows, cycle life, and safety. Traditional computational screening faces an inherent trade-off: density functional theory provides mechanistic insight but is computationally prohibitive, while machine learning achieves speed but relies on simplified representations (e.g., SMILES) that cannot capture explicit solvation geometries. Moreover, conventional methods evaluate single ground-state configurations, yet experimental electrochemistry involves dynamic ensembles of coordination states (contact ion pairs, solvent-separated pairs, partially desolvated species) with distinct frontier orbital energies. Our work bridges these limitations by integrating molecular dynamics sampling with structure-aware machine learning to predict how statistical ensembles of solvation configurations control electrochemical properties, enabling mechanistically informed, high-throughput screening at computational costs orders of magnitude lower than quantum chemistry while providing fundamental insights into structure–property relationships governing interfacial degradation. |
Experimental electrolyte screening is labor-intensive and economically challenging, and thus limits its practicality to explore broad chemical diversity. First-principles methods such as density functional theory (DFT) and ab initio molecular dynamics (AIMD) offer accurate predictions of electronic properties, but the computational expense becomes prohibitive for evaluating large molecular libraries or mapping the vast chemical design spaces.1 These limitations have motivated the exploration of machine-learning (ML) approaches that can accelerate property prediction by several orders of magnitude compared to the conventional quantum mechanical methods.
Recent ML-driven studies have demonstrated the feasibility of screening electrolyte solvents and additives using learned models trained on DFT-derived electronic descriptors.2,3 Supervised and hybrid ML workflows have been used to identify candidate molecules with improved electrochemical stability and transport properties, often combining molecular descriptors with solvation-aware features.1,4
Notably, Zheng et al.5 demonstrated that frontier-orbital descriptors derived from ML models correlate with experimentally measured coulombic efficiency and interfacial stability against lithium metal, enabling data-driven electrolyte screening guided by electronic structure metrics. However, current ML models face fundamental challenges in generalization, as these models are typically trained on datasets that represent limited families of electrolyte chemistries, leading to poor predictive performance on unseen solvents, salts, or coordination environments.6,7 Designing electrolytes that span carbonates, ethers, sulfones, phosphates, ionic liquids, fluorinated solvents, and reactive intermediates motivates the need for models that learn transferable chemical representations rather than correlations confined to narrow chemical domains.
Recent progress in ML suggests that generalizable predictive models arise from the integration of three key elements: dataset, ML modeling framework, and prediction task. The first element is a dataset that captures chemical diversity and structural realism. Electrolyte properties depend not only on molecular identity but also on solvation environments, coordination structure, ion pairing, and dynamic speciation.8,9 Datasets must therefore provide broad chemical coverage and include high-quality labels such as HOMO and LUMO energies, ionization potentials, solvation energies, and stability metrics that are computed with consistent and reproducible protocols. Foundation models provide a promising framework for addressing this challenge by learning transferable molecular and structural representations from large and chemically diverse datasets.10–12 Their power arises from self-supervised representation learning, which enables the extraction of general chemical features such as orbital patterns, functional group behavior, intermolecular interaction motifs, and solvation structure without relying solely on expensive DFT labels.13–15 By learning transferable molecular representations, these models provide a scalable route to transferability across wide regions of chemical space. In this context, foundation models act as representation learners rather than replacements for first-principles methods, enabling scalable screening while retaining a connection to underlying chemical physics.
The second element is a ML modeling framework capable of leveraging such datasets. This requires architectures that can represent both the compositional and structural complexity of electrolyte systems. These architectures span a broad range, from message passing neural networks and transformer-based encoders operating on molecular graphs or sequences, as in SMI-TED,16 to equivariant neural networks processing full three-dimensional atomic configurations, as in POS-EGNN.17 Graph-based architectures specialized for electrolyte systems have also demonstrated the value of geometry-aware representations; for example, BAMBOO-MIXER18 leverages a forward predictive model with an inverse generative approach, enabling prediction of a broad range of molecular properties given appropriate training data. The choice of representation governs the ability of a model to capture electronic structure, noncovalent interactions, and the influence of local environments. Scalable training workflows that support multi-task and multi-fidelity learning are essential because electrolyte datasets often vary in size, accuracy, and computational cost. A robust architectural framework therefore provides the foundation for extracting transferable chemical information and for adapting models to diverse predictive tasks.
The third element is the prediction task itself. Properties that govern electrolyte performance include ionic conductivity, viscosity, solvation structure,19 and the electronic descriptors that define oxidative and reductive stability.13 The HOMO, LUMO and gap have become practical surrogates for the electrochemical stability window (ESW) in many screening studies. Yet these properties do not originate solely from isolated molecular structure. They are strongly modulated by local chemical environments, including ion pairing, transient complexes, and fluctuating solvation environments. Such speciation can shift frontier orbital energies and modify redox limits in ways that gas-phase calculations do not capture. Accurate prediction therefore demands models that incorporate structural context and environment-dependent effects. These advances motivate a framework that moves beyond isolated-molecule descriptors toward environment-aware, transferable prediction of electrolyte electronic properties.
In this work, we introduce a structurally informed integrated ML workflow, coupling molecular dynamics sampling with a structure-aware equivariant graph neural network, that predicts electrolyte frontier orbital energies from thermally sampled, solvated molecular environments, enabling transferable electronic property prediction across chemically diverse electrolyte systems. The novelty lies in the design of this application workflow: we first filter chemically diverse datasets that capture realistic solvation environments, ion pairing behavior, and dynamic speciation, leveraging the foundational OMol25 database11 to learn transferable molecular and structural representations. We then incorporate physics-grounded structural information directly into the predictive framework by extracting thermalized molecular configurations from molecular dynamics (MD) simulations and using these structures as inputs to the17 equivariant graph-based neural network. By embedding local chemical context into the representation learning process, our approach bridges the gap between the efficiency of ML-based screening and the physical fidelity required for rational electrolyte design, thereby enabling rapid, transferable, and structurally informed screening of diverse electrolyte formulations for desired oxidative and reductive stabilities.
To create a fine-tuning dataset tailored specifically to lithium-ion batteries, we sequentially filtered the pre-training dataset. We first identified structures containing lithium atoms, reducing the 20 million electrolyte structures to over 600
000 lithium-containing molecules, followed by removing all aqueous samples and under-coordinated lithium samples.
We further retained only samples in which lithium coordination was between 3 and 10. These filtering steps produced a final fine-tuning dataset of approximately 24
000 structures representing chemically relevant Li-ion battery electrolytes (Fig. 1).
![]() | ||
| Fig. 1 ML training protocol. (A) Overview of workflow from data engineering to model training and verification. (B) Data processing. (C) Model training and inference. | ||
Given a molecular system with N atoms, the model takes atomic numbers
and three-dimensional coordinates
as inputs, where
denotes the maximum atomic number, and then constructs a molecular graph by connecting atoms within a cutoff distance rcut. Two types of node representations are maintained throughout its layers: invariant scalar features
and equivariant steerable features
for degree l, where dne represents the node embedding dimension. The edges are represented by invariant scalars
and tensors
ij = {
(0),
(1),…,
(Lmax)}, where ded is the edge embedding dimension and each
(l) encodes l-degree spherical harmonic projections of relative atomic positions, up to maximum degree Lmax. Information flows through the network via attention-driven message passing that explicitly incorporates spatial geometric information in each layer. The model outputs dne-dimensional feature vectors for each atom, capable of capturing both local atomic environments and global molecular structure.
For molecular-level property prediction, the node embeddings are aggregated through max pooling to obtain a holistic graph-level representation:
| g = MaxPool(h1,h2,…,hN). | (1) |
This global molecular representation is then passed through a multi-layer perceptron prediction head to predict the HOMO/LUMO energies and HOMO–LUMO gaps. For atomic-level property prediction, like the site charges, the node embeddings hi are directly passed through a separate prediction head.
| Parameter | Small | Medium |
|---|---|---|
| Layers | 4 | 6 |
| lmax | 2 | 2 |
| dne | 256 | 384 |
| ded | 384 | 384 |
| rcut (Å) | 6.0 | 6.0 |
| Attention heads | 8 | 8 |
We then finetuned the model with identical architectural parameters but a reduced learning rate, 3.0 × 10−5, for 10 epochs over the 24
000 selected Li-ion battery-relevant structures. The resulting fine-tuned model achieves enhanced performance on predicting relevant battery electrolyte solvation structures while retaining the broad generalization capabilities developed during pre-training.
At last, we implemented a multi-component loss function that balances prediction accuracy across all four targets. The total loss was defined as
![]() | (2) |
,
,
, and
represent the losses for HOMO energy, LUMO energy, HOMO–LUMO gap, and site charge, respectively. Each component loss was computed using the Huber loss function, which provides robustness to identify outliers. Since the HOMO–LUMO gap is mathematically derived as εgap = εLUMO − εHOMO, accurate predictions of the individual orbital energies inherently ensure accurate gap predictions. We therefore assigned higher weights to the HOMO and LUMO losses to emphasize learning these fundamental electronic properties directly. The HOMO–LUMO gap loss, with a lower weight, was considered as a physical constraint that encourages the model to maintain meaningful relationship between HOMO and LUMO energies.
| Electrolyte | Speciation | HOMO [eV] | LUMO [eV] | ||||
|---|---|---|---|---|---|---|---|
| Small | DFT | Error | Small | DFT | Error | ||
| LiFSI-G2 | SSIP | −12.7 | −12.7 | 0.0 | −1.82 | −1.70 | 0.12 |
| CIP | −9.70 | −9.31 | 0.39 | 0.07 | 0.26 | 0.19 | |
| LiTFSI-G2 | SSIP | −12.8 | −12.8 | 0.0 | −1.61 | −1.64 | −0.03 |
| LiPF6-EC | SSIP | −13.8 | −13.2 | 0.6 | −2.28 | −1.56 | 0.72 |
| CIP | −11.1 | −10.8 | 0.3 | −0.50 | 0.11 | 0.61 | |
| LiPF6-EMC | SSIP | −13.5 | −13.3 | 0.2 | −2.10 | −1.75 | 0.35 |
| CIP | −11.8 | −11.2 | 0.6 | −0.72 | 0.19 | 0.91 | |
| LiPF6-EC-EMC | SSIP | −10.8 | −6.89 | 3.91 | −4.10 | −5.49 | −1.39 |
| CIP | −9.54 | −4.52 | 5.02 | −3.45 | −1.96 | 1.49 | |
| LiTFSI-EC-PC | SSIP | −13.1 | −12.9 | 0.2 | −1.55 | −1.38 | 0.17 |
| CIP | −10.3 | −9.79 | 0.51 | 0.01 | 0.20 | 0.19 | |
| LiFSI-DEE | SSIP | −13.8 | −13.5 | 0.3 | −2.20 | −1.83 | 0.37 |
| CIP | −9.19 | −8.97 | 0.22 | −0.7 | −0.09 | 0.61 | |
| LiTFSI-DEE | SSIP | −13.8 | −13.5 | 0.3 | −2.14 | −1.82 | 0.32 |
| CIP | −10.8 | −10.6 | 0.2 | 0.13 | 0.26 | 0.13 | |
| LiFSI-DPE | CIP | −9.99 | −10.0 | −0.01 | 0.55 | 0.44 | −0.11 |
| LiTFSI-DEE-DPE | SSIP | −13.2 | −13.0 | 0.2 | −1.96 | −1.80 | 0.16 |
| CIP | −9.47 | −8.86 | 0.61 | 0.03 | 0.10 | 0.07 | |
| LiTFSI-DOL-DME | SSIP | −7.36 | −11.1 | −3.74 | −1.69 | −4.08 | −2.39 |
| LiFSI-BFE | SSIP | −14.4 | −14.2 | 0.2 | −2.10 | −1.87 | 0.23 |
| CIP | −9.05 | −8.81 | 0.24 | −0.15 | −0.14 | 0.01 | |
| LiPF6-BFE | SSIP | −14.1 | −14.0 | 0.1 | −1.94 | −1.82 | 0.12 |
| CIP | −11.4 | −11.0 | 0.4 | 0.23 | 0.22 | −0.01 | |
| LiFSI-BTFE | CIP | −10.0 | −10.1 | −0.1 | 0.51 | 0.17 | −0.34 |
| LiPF6-BTFE | SSIP | −13.9 | −14.1 | −0.2 | −1.63 | −1.88 | −0.25 |
| CIP | −12.1 | −12.3 | −0.2 | −0.12 | −0.10 | 0.02 | |
To assess whether the model learns chemically meaningful representations of local solvation environment beyond reproducing numerical trends, we analyze the internal representations learned by the model. Specifically, we extract node embeddings for lithium atoms from structures sampled during MD simulations (section 1 of the SI) across a range of electrolyte systems to probe how local solvation environments are encoded. Fig. 2B and C show t-SNE projections of these Li-ion node embeddings. When color-coded by bulk electrolyte composition (Fig. 2B), the embeddings cluster according to solvent class, e.g. carbonate- or ether-based system, and anion identity, such as PF6−, FSI−, or TFSI−. When color-coded by local speciation (Fig. 2C), the embeddings separate into well-defined regions corresponding to freely solvated Li-ions, referred to as solvent-separated ion pairs (SSIPs, blue), in which the Li-ion is coordinated exclusively by solvent molecules with no anion in the first solvation shell, contact ion pairs (CIPs, red), in which one anion is present in the first solvation shell, and larger aggregate structures (grey). All solvation structures in this work are Li-ion-centered; anion-centered clusters independent of Li-ion are not considered, as our focus is on solvation environments directly relevant to Li-ion coordination and the implication of desolvation at electrode interfaces.22,23 This organization of the latent space indicates that the model encodes local coordination environments without explicit supervision on structural labels, capturing the fundamental and diverse chemical interactions that define local solvation environments.
While the latent space analysis demonstrates that the model effectively captures local coordination physics, predictive accuracy must be quantitatively benchmarked against first-principles energies. To this end, we perform DFT single-point energy calculations at the ωB97M-V/def2-TZVPD level of theory on representative solvation structures extracted from the MD simulations (section 3 of the SI). Table 2 compares the predictions from the small model against these DFT reference values across various electrolyte combinations, spanning carbonates, glymes, ethers, and fluorinated solvents with FSI−, TFSI−, and PF6− anions. For most systems, the model achieves excellent agreement with DFT, with HOMO energy errors typically below 0.6 eV and LUMO errors below 0.4 eV. However, two systems exhibit notably larger deviations: LiPF6-EC-EMC and LiTFSI-DOL-DME. These outliers share a common characteristic: they involve electrolyte mixtures in which multiple solvent species compete for interactions with lithium ions (SI, section 5). We speculate that these errors arise from the model's limited exposure to mixed-solvent coordination environments during pre-training, which may hinder its ability to fully capture complex solvent interactions in these specific systems. Predictions for structurally dissimilar solvent mixtures should therefore be interpreted cautiously and rigorously validated by DFT calculations. Still, promisingly, the model maintains high predictive accuracy for most of the mixed or functionalized systems where the chemical components are structurally similar. For example, systems like LiTFSI-DEE-DPE (with errors below 0.61 eV) and LiTFSI-EC-PC (with errors below 0.51 eV) demonstrate excellent performance. Overall, these quantitative benchmark tests suggest that the model excels at interpolating within similar chemical spaces, but may struggle to fully capture environments that combine structurally distinct solvent classes, particularly those that were underrepresented in the pre-training data. Improving predictions for heterogeneous mixed-solvent systems would likely require augmenting the training dataset with more diverse multi-component coordination environments. Despite these current limitations, we believe the model remains robust and capable of providing quantitatively accurate frontier orbital predictions, which makes it suitable for comparative screening and rational electrolyte design. We leverage these capabilities in the following sections to analyze structure–property relationships, highlighting the model's potential for guiding future electrolyte development.
![]() | (3) |
![]() | (4) |
Based on the speciation, e.g., freely solvated ions, CIPs, SSIPs, etc. extracted from MD simulations of twelve different electrolyte systems, we predict their ESW using our benchmarked POS-EGNN model.
The ensemble-resolved cathodic and anodic limits, defined as
| VelectrolyteCL = max(VCL,SSIP, VCL,CIP) | (5) |
| VelectrolyteAL = min(VAL,SSIP, VAL,CIP), | (6) |
As evident in Fig. 3A, substantial differences in ESWs across electrolyte chemistries, both in the absolute width of the window and in the relative positions of the cathodic and anodic limits. Even within the same solvent class, changes in anion identity or solvent mixing lead to systematic shifts in stability limits, underscoring strong sensitivity of frontier orbital energies to local solvation structures. In the following sections, we discuss the role of individual attributes of solvation environments to the predicted ESWs.
The strong solvent effects are further highlighted in the distribution analysis of HOMO, LUMO, and HOMO–LUMO gap values for all solvation structures extracted from MD sampling. As shown in Fig. 3B, the distributions of HOMO and LUMO energies span a wide range, with clear separation based on solvent class. Glyme-based electrolytes exhibit the most stable HOMO and LUMO levels, with narrow distributions, suggesting they are less sensitive to geometrical variations in local solvation structures. In contrast, ether and fluorinated ether systems show broader distributions, indicating a stronger dependence of electrochemical stability on speciation and local solvation environments. Interestingly, carbonate mixtures exhibit the lowest-lying LUMO states, extending down to about −6 eV, which suggests increased susceptibility to reduction. The combined shifts in HOMO and LUMO levels are reflected in the HOMO–LUMO gap distributions, which generally cluster between ∼8 and ∼12 eV for most solvent systems. Overall, we observe the following trend for stability: carbonate mixtures < ethers, fluorinated ethers < carbonate < ether mixtures < glymes. The wide distribution of HOMO–LUMO gap values in fluorinated ethers suggests that these systems offer significant tunability through specific functionalization. This tunability presents a practical opportunity to design electrolyte systems with enhanced electrochemical stability, further expanding the range of viable electrolyte formulations for next-generation batteries.
Systematic comparison of LiPF6, LiFSI and LiTFSI across different solvent systems reveals striking anion-dependent variations in both oxidation and reduction limits. In fluorinated ether systems, this pattern is particularly pronounced, where LiPF6 in BFE exhibits an anodic limit of 11.04 ± 0.30 V compared to LiFSI in BFE at 9.53 ± 0.36 V, demonstrating a 1.5 V improvement in oxidative stability. Similarly, in BTFE solvent, LiPF6 achieves 11.49 ± 0.39 V versus LiFSI's 9.88 ± 0.33 V anodic limit. This consistent trend across different fluorinated ethers confirms LiPF6− based systems provide superior oxidative resistance compared to sulfonyl-based salts. However, cathodic stability shows the opposite behavior. LiFSI systems consistently demonstrate enhanced reduction resistance compared to the LiPF6 counterparts. In BFE solvent, both salts exhibit similar cathodic limits (1.83 V), but in BTFE, LiFSI achieves remarkable cathodic stability with a limit of −0.28 ± 0.24 V compared to LiPF6's 1.63 V. This 1.9 V difference represents one of the largest anion-induced cathodic shifts observed across all systems studied, highlighting the profound impact of anion identity on reduction processes.
The most dramatic anion effects appears in glyme-based electrolytes, where LiTFSI-G2 achieves an extraordinary anodic limit of 12.85 ± 0.13 V, nearly 4 V higher than LiFSI-G2's 8.81 ± 0.86 V. This significant difference translates to stability windows of 11.11 V and 7.06 V, respectively, leading to a 57% increase in ESW. Remarkably, both systems exhibit nearly identical cathodic limits (1.74–1.75 V), meaning the extended ESW for LiTFSI-G2 is entirely due to enhanced oxidative resistance from the TFSI− anion. The structural flexibility of TFSI− introduces additional complexity through conformational effects. Unlike rigid anions, TFSI− can adopt multiple geometries around lithium, creating dynamic equilibria between different CIP structures. This conformational diversity is reflected in the broader error ranges observed for some TFSI− systems (e.g., LiTFSI-DOL-DME shows ±1.55 V anodic uncertainty), indicating greater structural inhomogeneity compared to more rigid anions such as PF6−.
These anion-specific effects reveal fundamental principles governing electrochemical stability limits. Across all twelve electrolytes studied, the cathodic limit is set by solvent-separated ion pairs (SSIP), whereas the anodic limit is governed by contact ion pairs (CIP), as shown in Fig. S2. Beyond pure anion effects, the interplay between anion identity and solvent chemistry leads additional complexity in mixed and functionalized systems.
Carbonate mixtures like LiPF6 in EC-EMC exhibit dramatically different stability windows (5.10 V difference) compared to single-component systems (LiPF6-EC: 8.71 V, LiPF6-EMC: 9.29 V), with the mixture showing a substantially increased cathodic limit (5.05 V) that reflects cooperative solvation effects. Similarly, fluorine-functionalized electrolytes exhibit systematic shifts in both HOMO and LUMO distributions relative to their parent solvents, with fluorination generally improving cathodic stability but showing variable effects on anodic limits depending on the specific anion–solvent combination. Notably, the model resolves these differences without explicit supervision, capturing the impact of additives on ESWs and demonstrating its ability to distinguish subtle but chemically meaningful variations between single component electrolytes and their mixtures.
For SSIPs (Fig. 3F (left panel)), the frontier orbital energies exhibit a systematic dependence on solvation shell size. As the environment expands from primary to secondary coordination, the HOMO energy increases substantially, by 2 to 3 eV, indicating reduced oxidative stability. This trend becomes more pronounced when anions enter the secondary solvation shell, despite the absence of direct Li-anion contact. The upward shift in the HOMO reflects long-range electronic coupling mediated through the solvent network, demonstrating that even non-contact ion pairing can strongly perturb the occupied electronic states. In contrast, the LUMO energy in SSIP configurations shows a weaker but still consistent response to the extension of the solvation environment. As shown in Fig. 3F (middle panel), secondary solvation induces only modest shifts in LUMO levels, whereas the inclusion of anions produces a more pronounced effect. Taken together, these effects result in a monotonic reduction of the HOMO–LUMO gap with increasing solvation shell complexity. Similarly trends are also observed for CIPs (Fig. 3G), where the HOMO and LUMO energies show a strong dependence on the solvation environment, particularly on the inclusion of extended solvation shells (second solvation shell) that contain anion species.
For the freely solvated Li-ions (Fig. 4A), partial desolvation induces substantial system-specific shifts in the predicted frontier orbital energies. The systems most sensitive to desolvation are those with ether-based solvents. For example, LiFSI-DEE shows the most dramatic response, with LUMO shift of 2.53 eV (from −4.71 eV at 2 molecules to −2.18 eV at 4 molecules) and gap modulation of 3.23 eV (from 8.20 eV at 1 molecule to 11.43 eV at 4 molecules). Similarly, LiPF6-BFE exhibits a 1.90 eV HOMO shift from −14.33 eV (2 molecules) to −12.43 eV (4 molecules). In contrast, carbonate-based systems display more modest variations, with LiPF6-EC and LiPF6-EMC showing gap changes of only 1.08 eV and 1.18 eV, respectively, over the same desolvation range.
CIPs (Fig. 4B) display similarly strong desolvation effects, with ether-based solvents again exhibiting the greatest sensitivity to coordination number changes. LiFSI-DEE shows the most pronounced LUMO variation of 2.02 eV, shifting from −2.04 eV (1 molecule) to −0.02 eV (4 molecules), with corresponding gap modulation of 1.78 eV (7.71 eV to 9.49 eV). LiPF6-BTFE demonstrates significant HOMO sensitivity with a 1.60 eV destabilization from −12.06 eV (1 molecule) to −10.46 eV (4 molecules), although its gap value remains relatively constant with only a 0.71 eV variation (11.71 eV to 11.00 eV). In contrast, systems like LiTFSI-DEE show only modest impact from desolvation, with maximal gap variations of 1.33 eV.
These observed differential sensitivities likely arise from variations in local solvation geometry, which lead to greater variations in Li+ coordination geometry and, consequently, larger fluctuations in frontier orbital energies. The strong dependence of electronic properties on both solvation number and solvent identity underscores the value of high-throughput screening approaches. In this context, foundation models enable rapid evaluation of numerous electrolyte candidates across diverse solvation states, and can identify promising systems for subsequent investigation with more computationally demanding yet more accurate physics-based methods. Moreover, the substantial orbital energy variations observed during desolvation highlight that accurate prediction of interfacial electrochemical stability requires explicit modeling of partially desolvated states, suggesting that future predictive simulations would benefit from incorporating explicit electrode proximity effects, e.g., interfacial electric fields and surface adsorption interactions, beyond what bulk coordination-number sampling alone can capture.
This statistical perspective underscores fundamental limitations of first-principles modeling of electrolyte stability. Static DFT calculations typically operate on energy-minimized clusters and, by construction, cannot access the distribution tails that ultimately determine electrochemical failure, leading to systematic underestimation of both reduction susceptibility (by missing low-lying LUMO configurations) and oxidation susceptibility (by missing high-lying HOMO configurations). The strong dependence of frontier orbital energies on solvation-shell size and ion-pairing structure further indicates that small, isolated molecular clusters are insufficient to capture the electronic environments experienced by solvated species under practical conditions. Accurate prediction of HOMO and LUMO energies therefore requires inclusion of extended solvation shells, explicit representation of ion pairing, and dynamical sampling of coordination environments. Although DFT can, in principle, describe these effects, electronic-structure calculations on large, thermally sampled solvation clusters rapidly become computationally prohibitive. As a result, many existing studies rely on static or undersolvated cluster models that underestimate the impact of collective solvation and speciation. In this regard, ensemble-based ML approaches offer a unique opportunity to overcome these limitations by efficiently sampling the full distribution of solvation configurations that governs electrochemical stability under realistic operating conditions.
For HOMO energies, the SMI-TED model exhibits deviations ranging from −5.83 to +2.14 eV relative to POS-EGNN across the solvent systems examined. The largest discrepancies occur for G2, with a −5.83 eV deviation (POS-EGNN: −12.64 eV; SMI-TED: −6.81 eV), DOL-DME with −5.50 eV (POS-EGNN: −12.39 eV; SMI-TED: −6.89 eV), and BFE with −4.56 eV (POS-EGNN: −11.80 eV; SMI-TED: −7.24 eV). Carbonate solvents show moderately large errors of −3.21 to −3.51 eV. Only BTFE deviates in the opposite direction (+2.41 eV), indicating that the one-dimensional representation fails to capture stabilizing effects associated with specific solvation geometries.
LUMO energy predictions show similarly large discrepancies, spanning −5.41 to +2.41 eV. The EC–EMC mixture exhibits the largest deviation at −5.41 eV (POS-EGNN: −4.17 eV; SMI-TED: +1.24 eV), followed by DOL-DME at −4.01 eV and G2 at −3.86 eV. Ether systems consistently show deviations exceeding −2.7 eV (DEE: −2.83 eV; DEE-DPE: −2.85 eV; BFE: −2.71 eV), whereas carbonate solvents exhibit more modest errors of −1.81 to −1.85 eV. Again, BTFE represents an outlier with a +2.41 eV deviation.
Although HOMO–LUMO gap predictions benefit from partial error cancellation, systematic deviations persist, ranging from −2.51 to +2.08 eV. EC-EMC shows the largest underestimation at −2.51 eV (POS-EGNN: 6.52 eV; SMI-TED: 9.03 eV), while EMC exhibits the largest overestimation at +2.08 eV (POS-EGNN: 11.09 eV; SMI-TED: 9.01 eV). Even for systems with relatively small gap errors, such as BTFE (+0.20 eV) and DPE (+0.23 eV), the underlying HOMO and LUMO predictions contain large compensating errors exceeding 2 eV.
The fundamental limitation of SMILES-based models arises from their one-dimensional representation of chemical information. A SMILES string specifies the molecular graph of the solvent but contains no information about how it coordinates the Li ion or how it orients relative to coordinating anions. Yet, it is precisely these geometric details that ultimately determines frontier orbital energies through effects such as orbital hybridization and electrostatic screening.
Through systematic analysis of twelve battery-relevant electrolyte systems spanning carbonates, glymes, ethers, and fluorinated solvents with FSI−, TFSI−, and PF6− anions, we quantify how local coordination environment and species determines frontier orbital energies. Across all systems examined, it is found that SSIPs control the cathodic limit whereas CIPs determine the anodic limit, with coordination-dependent energy variations spanning 2 to 3 eV. Beyond the primary solvation effect, our analyses also highlight the critical role of extended solvation environment, which can module the orbital energies by up to 3 eV through long-range electronic coupling mediated by the connected solvent network. Analysis of partial desolvated states, which naturally occur during battery's operation at the electrode–electrolyte interfaces, further reveals system-specific sensitivities to coordination number changes, with fluorinated ethers exhibiting particularly strong responses. These results indicate that bulk-phase sampling alone is insufficient to fully capture the electrochemical stability of relevant solvated species and to reliably predict propensities for interfacial degradation.
Our framework addresses a fundamental limitation in both traditional quantum chemistry and previous ML approaches for electrolyte screening. It is shown that the frontier orbital energies are not determined by a single molecular representation of the solvated species, but rather, by the collective interplay of ion-pair interactions, coordination geometry, and the extended solvation environment. Directly encoding these molecular details in the ML model is therefore essential for accurate downstream predictions; correspondingly, ML models based solely on SMILES representations are fundamentally unsuitable for electronic-structure prediction. By training equivariant graph neural networks on the OMol25 database filtered for battery-relevant chemistries, our model explicitly incorporates geometric features while learning transferable representations of solvation environments from large and diverse datasets. From a practical perspective, this work establishes a scalable pipeline for high-throughput screening of electrochemically promising electrolyte candidates at computational costs that are 3–4 orders of magnitude lower than DFT, while maintaining sub-0.6 eV accuracy in frontier orbital assessment. Nevertheless, we acknowledge several limitations of the current model that warrant further development in future work. First, expanding the training dataset to include a greater diversity of mixed-solvent coordination environments would improve transferability for screening multicomponent electrolytes used in commercial batteries; in the interim, we recommend validating mixed-solvent predictions against targeted DFT calculations, particularly for systems combining structurally dissimilar solvents such as cyclic and linear carbonates. Second, incorporating temporal information from direct MD simulations of activated desolvation processes may be necessary to capture transient coordination events that govern electrolyte decomposition pathways and interfacial ion-transport kinetics. Third, extending the predictive scope beyond frontier orbital energies to include reaction barriers, decomposition products, and solid–electrolyte interphase properties would enable comprehensive computational screening of electrolyte stability and performance across the full operating window of next-generation batteries.
Footnote |
| † These authors are co-first authors. |
| This journal is © The Royal Society of Chemistry 2026 |