Bruis van Vlijmen ‡ab, Vivek N. Lam ‡ab, Patrick A. Asinger ‡c, Xiao Cui ‡ab, Joachim Schaeffer cd, Alexis Geslin abf, Devi Ganapathi ab, Shijing Sun e, Patrick K. Herring e, Chirranjeevi Balaji Gopal e, Natalie Geise b, Haitao D. Deng ab, Henry L. Thaman ab, Stephen Dongmin Kang a, Steven B. Torrisi e, Amalie Trewartha e, Abraham Anapolsky e, Brian D. Storey e, William E. Gent ab, Richard D. Braatz *c and William C. Chueh *abf
a Department of Materials Science and Engineering, Stanford University, Stanford, CA, USA. E-mail: wchueh@stanford.edu
b Applied Energy Division, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
c Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA. E-mail: braatz@mit.edu
d Control and Cyber-Physical Systems Laboratory, Technical University of Darmstadt, 64289, Germany
e Toyota Research Institute, Los Altos, CA, USA
f Department of Energy Science and Engineering, Stanford University, Stanford, CA, USA
First published on 30th May 2025
To reliably deploy lithium-ion batteries, a fundamental understanding of cycling aging behavior is critical. Battery aging consists of complex and highly coupled phenomena, making it challenging to develop a holistic interpretation. In this work, we generate a diverse battery cycling dataset with a broad range of degradation trajectories, consisting of 359 high energy density commercial Li(Ni,Co,Al)O2/graphite + SiOx cylindrical 21700 cells cycled across 207 unique cycling protocols. We consolidate aging via 16 mechanistic state-of-health (SOH) metrics, including cell-level performance metrics, electrode-specific capacities/states of charge (SOCs), and aging trajectory metrics. We develop a framework using interpretable machine learning and explainable features to generate an aging matrix that visually deconvolutes the complex battery degradation behavior. This generalizable data-driven mechanistic framework simplifies the complex interplay between cycling conditions, degradation modes, and SOH, acting as a hypothesis-generation tool to aid battery users in identifying key degradation regimes for further study and experimentation.
Broader context
The growing demand for energy storage solutions in electrifying transportation and decarbonizing the electricity grid underscores the need to accelerate advancements in battery technology to meet various performance and cost requirements. Understanding battery aging is one of the most resource-consuming tasks in developing new battery technologies due to complex and intercoupled degradation pathways. To expedite battery technology development, methodologies summarizing degradation across diverse operating conditions are necessary. In this work, we use data-driven methodologies in combination with interpretable metrics to summarize degradation across hundreds of cycling conditions visually with an aging matrix. This aging matrix highlights how 16 state-of-health indicators depend on both operating conditions and other mechanistic state-of-health metrics. This aging matrix and the corresponding methodology allow battery designers to quickly visualize their battery degradation space, and researchers to identify areas of interest for further experimentation or analysis.
While characterization at the materials and cell level can offer a mechanistic understanding of battery aging,19–22 it is slow and low throughput.23 Instead, modeling offers a more accessible route to understanding battery aging. Physics-based models, such as the Doyle–Fuller–Newman model,24–26 utilize fundamental electrode parameters, capture physical principles such as conservation laws, and explicitly model degradation mechanisms leading to more interpretable behavior. Nonetheless, predicting battery lifetime under unseen conditions remains challenging due to the complexity of interconnected aging phenomena14 and challenges of model parameter identifiability.27 Another approach uses mechanistic models to estimate electrode-specific capacities and lithium inventory.28–30 These models capture aggregate physical mechanisms with fewer model parameters than physics-based simulations.31–35 Tracking electrode-specific capacities independently provides a clearer picture of what types of degradation occur under various operating conditions.23,36
In recent years, machine learning (ML) techniques have been developed to analyze battery aging through a data-driven lens.37–56 While ML techniques are high throughput and predictive, a purely data-driven approach with complex black-box models (e.g. deep learning) obscures relationships between cycling conditions and battery aging mechanisms, overlooking key scientific and engineering insights.57 Data-driven models with interpretable characteristics could build trust by reproducing known trends and inform engineering decisions with physical insights, all while retaining flexibility and ease of use.
Existing battery aging datasets are typically collected with specific applications in mind.58–66 To give application examples, Attia, Severson, and colleagues67,68 focused on optimizing electric vehicle fast charging protocols. Diao et al.69 examined different temperatures to understand how temperature accelerates battery aging. Paulson, Ward, and colleagues48,70 tested various cell chemistries to understand the differences in their aging and build transferable ML models. Wildfeuer et al.71 examined different state-of-charge (SOC) ranges and temperatures in both cycling and calendar aging tests to investigate different experimental factors, but did not use multi-variate analyses, such as ML techniques, to deconvolute the multiple aging factors at play. To capture degradation across a variety of use cases, data must span a wide range of SOC, charging, and discharging protocols (see ESI,† Table S5 for an overview of cycling datasets).72 Many of these studies first analyze the data with a traditional methodology that holds all cycling parameters constant and varies one cycling parameter at a time to isolate its influence.69,71,73–77 However, as the dimensionality of cycling parameters increases and one probes an increasingly broad degradation space, the cycling parameters' complex and intercoupled influence on battery degradation can make conventional analysis intractable. Additionally, a traditional univariate approach does not generalize well when one wants to relate battery degradation to measured inputs that an operator cannot hold constant, such as capacity or resistance. Instead, interpretable data-driven models comprehensively applied to large datasets can be leveraged as a critical tool to bridge the gap between pure data-driven approaches and traditional methodologies, allowing researchers to better understand their battery systems.
In this work, we propose and develop a physically interpretable, data-driven understanding of lithium-ion battery aging. We first generate a large dataset consisting of 359 cells under 207 unique cycling conditions spanning diverse use cases and aging trajectories in-house. We then develop a comprehensive understanding of degradation, within the bounds of the gathered dataset, by calculating 16 mechanistic SOH metrics across the degradation aging trajectory. By combining interpretable ML with explainable features, we extract complex correlations to these SOH metrics to identify factors contributing significantly to degradation. In doing this, we demonstrate that physically meaningful features should be used in combination with methods that robustly extract feature importance to gather insights from large datasets.78–82 This approach addresses the challenges of analyzing a comprehensive set of SOH metrics across diverse operating conditions using traditional univariate methods, and the limited system-level insight offered by standalone black-box models or early prediction models with features that are not easily interpretable, such as the features employed in Severson et al.67 By using an explainable data-driven model in combination with a diverse battery aging dataset, we generate an aging matrix to analyze and visually summarize battery degradation across a comprehensive set of SOH metrics. More generally, constraining ML models to use features that have clear physical meaning dramatically enhances explainability, complementing pure data-driven featurization approaches.
Fig. 1 Overview of the dataset. (a) We highlight the diverse cycling parameters varied in our dataset. All batteries in this dataset are cycled in a chamber with a temperature set point of 25 °C. The nominal cycling experiment structure is shown schematically with a loop where cells go through a diagnostic “checkup” cycle, followed by 100 aging cycles repeating until end of life (EOL). (b) The diagnostic cycle consisting of a reset cycle, a hybrid pulse power characterization (HPPC),85 and three rate reference performance tests (RPTs) at 0.2C, 1C, and 2C discharge currents (see ESI,† Table S2 for full conditions). Mechanistic SOH metrics are extracted from various parts of this diagnostic cycle data (see ESI,† Section S.4 for further details). Additionally, the aging cycle voltage and current vs. time traces are shown with cycling parameter names overlaid on the areas they would affect. (c) The distribution of rate-dependent capacities at the beginning of life (BOL). Means and coefficients of variation are included in the plot, showcasing the tight distribution at BOL. (d) The distribution of rate-dependent capacities at EOL (defined by 0.2C RPT capacity reaching 80% of the nominal capacity, 4.84 A h). For further information on BOL to EOL variability see ESI,† Section S.8.86
We first quantify six cell-level performance metrics: (1) total EFCs at EOL, (2) 1C rate-specific capacity: QRPT,1C, (3) 2C rate-specific capacity: QRPT,2C, (4) ohmic resistance: Rohm, (5) charge transfer resistance: Rct, and (6) polarization resistance: Rp. We calculate resistances through pulse measurements performed during the hybrid pulse power characterization (HPPC) sequence of the diagnostic cycle at various SOCs (see ESI,† Section S.4.2 for definitions and calculation details for resistance metrics). Unless otherwise specified, the resistances reported are at 50% SOC.
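As an illustration of the first metric, the sketch below estimates the EFC count at EOL from checkup data, using the EOL definition given in the Fig. 1 caption (0.2C RPT capacity reaching 80% of the 4.84 A h nominal capacity). The function name `efc_at_eol`, the linear interpolation between checkups, and the input numbers in the usage note are assumptions made for this example, not necessarily the procedure used in this work.

```python
NOMINAL_CAPACITY_AH = 4.84  # nominal capacity quoted in the Fig. 1 caption
EOL_FRACTION = 0.80         # EOL threshold: 80% of nominal capacity


def efc_at_eol(efcs, capacities_ah):
    """Estimate the EFC count at which the 0.2C RPT capacity first reaches EOL.

    efcs and capacities_ah are per-checkup values in chronological order.
    Linear interpolation between checkups is an assumption of this sketch.
    Returns None if the cell never reaches the threshold in the supplied data.
    """
    threshold = EOL_FRACTION * NOMINAL_CAPACITY_AH
    points = list(zip(efcs, capacities_ah))
    if points and points[0][1] <= threshold:
        return float(points[0][0])
    for (e0, q0), (e1, q1) in zip(points, points[1:]):
        if q0 > threshold >= q1:
            # Interpolate between the two checkups that bracket the threshold.
            return e0 + (q0 - threshold) / (q0 - q1) * (e1 - e0)
    return None
```

For example, with hypothetical checkups at 0, 100, 200, and 300 EFCs and capacities of 4.84, 4.40, 4.00, and 3.70 A h, the threshold of 3.872 A h is crossed between the third and fourth checkups.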
Second, to determine electrode-specific capacities/SOCs, we implement differential voltage fitting (DVF), a mechanistic model algorithm. DVF is a non-invasive degradation probe that fits a measured full cell voltage profile with an emulated profile created from the underlying cathode and anode voltage profiles (see Sections S.4.3 and S.6 (ESI†) for more details, where we also justify the validity and emphasize the limitations of our results obtained with moderate-current (0.2C) data). Similar methodologies have been implemented by other groups.23,32,34,87–89 Using DVF we extract electrode capacities (QPE and QNE) and lithium inventory (QLi). Additional information, such as the SOC of either electrode at a specified full cell voltage, is also calculated. We specifically select SOCPE,2.7V, SOCNE,2.7V, SOCPE,4.0V, and SOCNE,4.0V because the electrode-specific SOCs near the fully discharged and fully charged states heavily influence aging.90
These cell-level and electrode-specific metrics are calculated at every diagnostic cycle for each cell and tracked from beginning of life (BOL) to EOL. As would be expected for commercial cells, these metrics have low variability at BOL (coefficients of variation <1%, Fig. 1c). At EOL, however, there is high variation in the rate capability (approximately 5% and 8% for RPT1.0C and RPT2.0C, respectively), resistance, and electrode-specific capacities/SOCs (Fig. 1d and ESI,† Section S.8). This observation underscores the importance of using a comprehensive set of SOH metrics and confirms that the cycling conditions in this work induce a wide range of degradation trajectories.
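The coefficient of variation used to compare BOL and EOL spread is the standard deviation divided by the mean. The minimal sketch below assumes the sample standard deviation; whether the sample or population form was used here is not stated in this section, and the function name is illustrative.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation over the mean, in percent."""
    return 100.0 * stdev(values) / mean(values)
```

Applied to a hypothetical set of BOL capacities such as 4.80, 4.84, and 4.88 A h, this gives a value below 1%, the kind of tight distribution described for BOL.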
In addition to probing cell-level and electrode-specific metrics with each diagnostic cycle, we also quantify the aging trajectory over the entire battery lifetime using metrics we refer to as trajectory metrics.9,91 We define three trajectory metrics: (1) knee indicator: knee, (2) resistance growth factor: R′′, and (3) negative/positive capacity (N/P) ratio: NP Ratio. The knee indicator describes a sudden and accelerated capacity-based degradation (i.e., a knee in the capacity vs. EFC curve) with knee indicator >0 if a knee exists at any point in the cell lifetime. The resistance growth factor captures the curvature of resistance with respect to EFCs, indicating whether resistance grows at an accelerating or decelerating rate during cycling. NP ratio is commonly used in battery manufacturing to determine how much active mass of cathode and anode material to coat onto an electrode sheet. In this case we dynamically calculate an NP ratio with the ratio of the estimated QNE and QPE at EOL. While the exact ratio depends on the half cell voltage bounds, this metric gives a measure of cell balancing between positive and negative electrode remaining capacities.92 Section S.7 (ESI†) details the calculations of these trajectory metrics.
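To make two of these trajectory metrics concrete, the sketch below computes an NP ratio from estimated electrode capacities and a curvature of resistance with respect to EFCs. The exact definitions are given in Section S.7 (ESI†) and are not reproduced here; taking the second derivative of a least-squares quadratic fit as the curvature is an assumption of this example, and all function names are hypothetical.

```python
def np_ratio(q_ne_ah, q_pe_ah):
    """Ratio of negative to positive electrode capacity (e.g. at EOL)."""
    return q_ne_ah / q_pe_ah


def _solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def resistance_curvature(efcs, resistances):
    """Second derivative of a least-squares quadratic fit R(EFC).

    Positive values indicate accelerating resistance growth, negative values
    decelerating growth. Requires at least three distinct EFC values.
    """
    scale = max(efcs)
    xs = [e / scale for e in efcs]  # rescale to [0, 1] for numerical stability
    # Normal equations for R = a + b*x + c*x^2.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(r * x ** k for x, r in zip(xs, resistances)) for k in range(3)]
    normal = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    _, _, c = _solve3(normal, t)
    return 2.0 * c / scale ** 2  # undo the rescaling of the EFC axis
```

A quadratic fit is only one way to summarize curvature; it smooths checkup-to-checkup noise at the cost of assuming a single curvature over the whole lifetime.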
We combine these 16 total cell-level performance metrics, electrode-specific capacities/SOCs, and trajectory metrics, collectively referred to as mechanistic SOH metrics, to comprehensively quantify battery aging. By concurrently assessing these metrics, we reveal their relationships to 207 cycling conditions to develop a comprehensive summary of aging. Fig. 2 visualizes selected metrics calculated on all cells in the dataset. The lifetime of these cells ranges from less than 50 EFCs to nearly 1000 EFCs, forming a broad set of trajectories. Similar trends are seen across the cell-level performance metrics and electrode-specific capacities.
Fig. 3 Understanding feature importance through ML models and SHAP. (a–c) A traditional approach to understanding a feature's importance varies one parameter at a time, such as Vcharge, while holding the others constant and analyzing the impact on mechanistic SOH metrics such as EFC. Because of the highly convoluted degradation space, the impact of a parameter can drastically change depending on the values of the other controlled parameters. This is shown with three differently colored curves representing different parameters held constant while (a) Vcharge, (b) CCdischarge, and (c) CC1 are allowed to vary (see ESI,† Section S.3 for the aging conditions and further discussion). (d) ML model structure, with cycling conditions as inputs and mechanistic SOH metrics (here EFC) as output. (e) Leveraging SHAP analysis, feature importance is revealed. Each row represents a feature, and its spread correlates with its impact. (f) Each feature impact in (e) is aggregated for all data points by taking the mean absolute SHAP value and the resulting feature strengths are summarized in one line for each SOH metric model (here the EFC model). The stronger the color, the more impactful the feature. The relative absolute error (RAE) is additionally plotted alongside feature importance as model performance is critical in extracting sensible feature importances.
To account for the complex interdependence and convoluted degradation space, we utilize machine learning in combination with Shapley additive explanations (SHAP)93 analysis to determine the features that are the most important. A nonlinear machine learning model, in this case a random forest model, is first used to learn the complex correlations of cycling parameters to a SOH metric of interest (Fig. 3d). We then utilize SHAP analysis to attribute each model prediction to the individual features by calculating each feature's average marginal contribution to the prediction over all possible feature combinations.94 As we are using these models to capture complex correlations, well-performing models and features are critical to extract sensible feature importances (parity plots in ESI,† Section S.10.7). The SHAP values of the input features for all cells present in the training dataset are reported in Fig. 3e. Finally, we collapse this plot by taking the mean absolute value of the SHAP values for each of the features on each cell in the dataset to generate a heatmap plot (Fig. 3f). This heatmap visualizes which features on average have the greatest importance for a SOH metric of interest. This information directs the battery pack designer or researcher to the most impactful cycling parameters for further analysis. In the case of EFC prediction, we see that CC1 is the most important feature and CC2 is the second most important feature. Additionally, as SHAP values describe the generated model, the error of the model is critical in qualifying whether the model and the corresponding SHAP values have captured meaningful trends. Because of this, the relative absolute error (RAE) is plotted alongside the feature importances in Fig. 3f.
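The attribution and aggregation steps described above can be sketched with a brute-force Shapley computation. This is not the SHAP library or the random forest used in this work: the toy model `toy_efc_model`, its coefficients, and the four hypothetical cells are invented for illustration, and the value of a feature coalition is assumed to be the average model output with the remaining features drawn from the dataset itself.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, background):
    """Exact Shapley values of model(x) relative to a background dataset."""
    n = len(x)

    def value(subset):
        # Average output with features in `subset` fixed to x and the
        # remaining features taken from each background row in turn.
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = set(coalition)
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def mean_abs_importance(model, rows):
    """Mean absolute Shapley value per feature over all rows (one heatmap line)."""
    n = len(rows[0])
    sums = [0.0] * n
    for x in rows:
        for i, p in enumerate(shapley_values(model, x, rows)):
            sums[i] += abs(p)
    return [s / len(rows) for s in sums]


def toy_efc_model(row):
    """Hypothetical stand-in for a trained model; Vcharge is deliberately unused."""
    cc1, cc2, _v_charge = row
    return 1000.0 - 200.0 * cc1 - 50.0 * cc2 - 20.0 * cc1 * cc2


# Hypothetical cells with columns CC1, CC2, Vcharge.
cells = [[0.5, 1.0, 4.0], [1.0, 2.0, 4.1], [2.0, 1.0, 4.2], [3.0, 3.0, 4.0]]
importance = mean_abs_importance(toy_efc_model, cells)
```

The enumeration grows exponentially with the number of features, which is why practical analyses rely on efficient tree-specific or sampling-based estimators rather than this direct form. In the toy case the unused feature receives zero importance and the per-cell attributions sum to the difference between the prediction and the average prediction.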
To see how this approach compares to a state-of-the-art univariate comparison where one parameter is varied while others are held constant to assess the importance of each feature individually, see ESI,† Section S.3. Another key advantage of this approach is that it works in cases where the model inputs are not control parameters that can be held constant, such as calculated features (resistance, capacity, etc.) that are influenced by control parameters. This use case is explored in Sections 3.2 and 3.3.
Fig. 4 Impact of cycling conditions. (a) ML model structure, with cycling conditions as inputs and EOL mechanistic SOH metrics as outputs. This is the full expanded version of what was shown in Fig. 3d–f. (b) SHAP feature importances for each protocol model, one line representing a model for one specific mechanistic SOH metric. A darker hue indicates higher feature importance. This degradation matrix representation visualizes the impact of cycling conditions on degradation in a high-dimensional space and the corresponding model training errors.
No single cycling parameter dominates all mechanistic SOH metrics. For example, the cell-level performance metrics QRPT,1C and QRPT,2C are dominated by CC2, while Rct is dominated by Vcharge. Conversely, EFC is more convoluted and is impacted by both charging currents CC1 and CC2, but also by Vdischarge. On the one hand, higher charging currents cause lithium plating and side reactions. On the other hand, lower Vdischarge utilizes more silicon, which degrades faster than graphite.95 These hypotheses can be validated by further analysis and experimentation. For example, one can disassemble cells cycled with higher charging currents that show low EFC and use scanning electron microscopy (SEM) to check whether the anode shows comparatively more lithium plating, which would indicate that lithium plating is the dominant degradation mode for this cycling condition. Similarly, for cells with low Vdischarge and low EFC, one can again observe the anode through SEM to see whether there is comparatively more cracking due to cycling the silicon region of the anode. Within the bounds of this dataset, Vdischarge and tCV are rarely the dominant feature for the mechanistic SOH metrics (except Vdischarge for EFC), despite previous reports stating their importance.96,97
Additionally, our visual aging matrix representation makes it easy to identify mechanistic SOH metrics governed by the same features by reading the matrix column by column. For example, one can notice that EFC and Rp, the polarization resistance, are both dominated by CC1. The correlation between EFC and Rp is verified in ESI,† Fig. S24, with a Spearman coefficient of −0.81. This suggests that long-term battery degradation is governed by long time-scale effects such as diffusion limitations in the electrodes rather than by the growth of resistive films or sluggish reactions. These latter two phenomena, represented by the ohmic and charge transfer resistances Rohm and Rct, are both dominated by Vcharge. Additionally, Fig. 4 underlines the detrimental role of Vcharge on SOCPE,4.0V, as cathodes are known to suffer from structural instabilities and side reactions at high SOC.98–100 To test these hypotheses, one can disassemble the batteries that have high Vcharge and large ohmic and charge transfer resistance, and perform EIS on the cathode electrode sheet. However, this phenomenon does not seem to affect the battery cycle life, likely because of the conservative maximum charge cutoff voltage of 4.2 V. Moreover, Fig. S17 (ESI†) shows that degradation at the anode (QNE) is more pronounced than that on the cathode (QPE), suggesting that EOL is anode-limited (within the bounds of this dataset). This is also reflected in the dynamic NP ratio at EOL. One could experimentally verify these findings by disassembling the batteries and performing a low-rate capacity checkup in a half cell.
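The Spearman coefficient quoted above is the Pearson correlation of the ranked values, so it measures monotonic rather than strictly linear association. A minimal standard-library sketch follows; the function names are illustrative, and ties are handled with average ranks.

```python
def _ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks of x and y."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

A strongly negative coefficient, as reported for EFC and Rp, means that cells with shorter lifetimes tend to be the ones with larger polarization resistance, without implying which of the two drives the other.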
We note that many of the electrode-specific capacities/SOCs have higher RAE error and low SHAP feature importance across all features, so the results should be taken more cautiously. With that being said, QLi has a strong CC1 dependence alongside EFC and Rp potentially indicating similar degradation modes. The electrode-specific SOCs, calculated from electrode-specific capacities, depend most strongly on Vcharge and CCdischarge, potentially sharing degradation modes with resistance metrics such as Rct and Rohm.
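Because the reliability of each row in the aging matrix hinges on model error, it helps to state how RAE can be computed. The sketch below assumes the common definition, the total absolute error normalized by the total absolute error of always predicting the mean; the definition used in this work is not spelled out in this section.

```python
def relative_absolute_error(y_true, y_pred):
    """Sum of |error| divided by the sum of |deviation from the mean of y_true|.

    Under this assumed definition, 0 is a perfect model and 1 matches a
    model that always predicts the mean.
    """
    mean_true = sum(y_true) / len(y_true)
    numerator = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    denominator = sum(abs(t - mean_true) for t in y_true)
    return numerator / denominator
```

Under this definition, an RAE close to 1 indicates that the model explains little beyond the dataset mean, so its feature importances carry correspondingly little weight.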
Finally, for the trajectory metrics, the knee indicator depends most strongly on CC1 and CCdischarge, the resistance growth factor (R′′) on Vcharge and CCdischarge, and the convoluted degradation metric NP ratio has a high RAE and low feature importance across parameters. For detailed information on the influence of cycling conditions on mechanistic SOH metrics, as well as model performance, see ESI,† Section S.10.7.
To summarize, this aging matrix representation directly visualizes important trends to design cycling limits, not just to maximize cycle life but also to optimize a wide range of other performance metrics. For example, if it is important to prevent capacity knees, this analysis shows that modifying CC1 and CCdischarge will have the greatest impact, whereas modifying Vdischarge would be less effective. Generating this aging matrix allows us to visually summarize battery degradation across numerous conditions and mechanistic SOH metrics, and the aging matrix serves as a hypothesis-generating tool by highlighting features and responses that should be further investigated. This aging matrix allows a battery pack designer or researcher to design hypotheses and focus their efforts on the components of degradation that are dominant in their dataset. While this analysis highlights regions of interest, further study and experimentation, such as the cell teardown studies proposed in this work, can be performed to confirm hypotheses. Additionally, we emphasize that while we chose a representative set of 16 mechanistic SOH metrics, this framework can be extended to additional design-specific SOH metrics and operating conditions depending on the use case. If the design space of the operating conditions or the battery chemistry differs significantly, the models used here would need to be fine-tuned or retrained on newly collected data to remain representative.
We built an “explanatory model” by adding electrode-specific capacities/SOCs metrics as input features to understand their impact on the output Rp (Fig. 5). Specifically, these features are the changes in mechanistic SOH metrics from BOL to EOL (represented by Δ). Low SOC resistance is of particular interest here as it is the most influential for determining when a battery reaches a voltage cutoff. Fig. 5 shows the analysis results for Rp at 30% SOC.
Fig. 5b lists the most dominant features contributing to the observed polarization resistance growth Rp. From the SHAP analysis, we observe that two electrode-specific features, ΔSOCPE,2.7V and ΔSOCNE,2.7V, are dominant features impacting the total resistance but show opposite relationships with resistance growth (Fig. 5b and c). Surprisingly, negative electrode over-discharging (ΔSOCNE,2.7V decreasing) leads to lower resistance increase. This is unexpected because electrode kinetics are typically most sluggish at the SOC extremes; therefore, at low SOC, we expect that resistance should increase in the direction of deeper discharge for an electrode.97
To understand the origin of this effect, we recall how ΔSOCPE,2.7V and ΔSOCNE,2.7V are calculated. These quantities are calculated at a specified full cell voltage (2.7 V for this example) and, as a result, are highly correlated (Fig. 5d and ESI,† Section S.9.1). This correlation arises because when one electrode's SOC shifts, regardless of the aging mechanism, the other electrode's SOC must inversely shift to produce the same measured full cell voltage (ESI,† Fig. S23 explores this in further detail). In general, SHAP is unable to differentiate between highly correlated features, and repeating the SHAP analysis multiple times reveals that either ΔSOCPE,2.7V or ΔSOCNE,2.7V can emerge as the most dominant feature (ESI,† Fig. S21). However, if ΔSOCNE,2.7V is removed from this explanatory model, for example, ΔSOCPE,2.7V appears as the sole dominant feature (ESI,† Fig. S22).
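The ambiguity between the two correlated SOC features can be reproduced in a toy setting. In the sketch below, two hypothetical models fit the same data equally well, yet a brute-force Shapley attribution gives all the credit to a different electrode in each case. The data, coefficients, and model forms are invented for illustration, and the attribution is a simplified stand-in for the SHAP analysis used in this work.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, background):
    """Exact Shapley values of model(x) relative to a background dataset."""
    n = len(x)

    def value(subset):
        # Average output with features in `subset` fixed to x and the rest
        # taken from each background row.
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = set(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical data: the two SOC shifts are perfectly anti-correlated.
d_soc_pe = [-0.04, -0.02, 0.0, 0.02, 0.04]
cells = [[pe, -pe] for pe in d_soc_pe]  # columns: dSOC_PE, dSOC_NE


def model_a(row):
    """Hypothetical model that uses only the positive electrode SOC shift."""
    return 5.0 - 100.0 * row[0]


def model_b(row):
    """Hypothetical model that uses only the negative electrode SOC shift."""
    return 5.0 + 100.0 * row[1]


phi_a = shapley_values(model_a, cells[0], cells)  # credit lands on dSOC_PE
phi_b = shapley_values(model_b, cells[0], cells)  # credit lands on dSOC_NE
```

Both models produce identical predictions on every cell, so the data alone cannot favor one attribution over the other; separating the two electrodes requires physical reasoning or additional experiments.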
From Fig. 5d, we show that, at low SOC, while negative electrode over-discharging (decreasing ΔSOCNE,2.7V) correlates with lower resistance increase, over-discharged positive electrodes (decreasing ΔSOCPE,2.7V) correlate with higher resistance increase. As resistance increasing with over-discharging of an electrode is in line with the understanding that electrode kinetics are most sluggish at SOC extremes, we rationalize that low SOC resistance rise is dominated by a more lithiated positive electrode.
Lastly, the charging currents (CC1 and CC2) and the discharging current (CCdischarge) all impact Rp at 30% SOC, but with opposite correlations (CC1 and CC2 have a positive correlation while CCdischarge has a negative correlation, indicated by the color scheme in Fig. 5b). It has previously been reported that fast delithiation current (high CC1 and CC2) expedites cathode capacity fade and impedance rise, while fast lithiation current (high CCdischarge) can effectively avoid the kinetically limited region of cathodes and enhance the cyclability of cathodes.103 Our results point to these potential hypotheses.
While our framework based on a random forest model and SHAP explainability does not differentiate between the contributions from two highly correlated electrodes, explainable features together with scientific knowledge help to hypothesize causation. Although we highlight and analyze low SOC resistance as one example in this section, we emphasize that this approach generalizes to any mechanistic aging feature of interest (ESI,† Section S.9.3).
We then apply the framework demonstrated in the previous sections on our diagnostic-aided model and present the results in an aging matrix plot in Fig. 6b (see ESI,† Section S.10.7 for parity plots and full SHAP analysis). For the mechanistic SOH metrics, the diagonal entries of the aging matrix are highlighted and correspond to self-prediction (i.e., predicting the EOL value of a given metric using its early value). Interestingly, while the features on this diagonal might be expected to consistently be the most predictive, this is not always the case. For example, the early prediction of Rct is dominated by Vcharge. Additionally, while QRPT0.2C is used to define the EOL cutoff, and thus EFC at EOL, the early prediction of EFC is dominated by Rp and the higher rate capacities, rather than by QRPT0.2C. These results highlight the importance of detailed tracking of battery SOH across multiple metrics. While a given degradation mode might dominate the EOL values of certain mechanistic SOH metrics, the best early indicators for the onset of that mode may be a different metric or set of metrics.
Since SHAP analysis cannot differentiate between correlated input features, in order to draw robust conclusions about the importance of early cycle features, it is insightful to also consider a “diagnostic-only” model, excluding cycling parameters as input features (ESI,† Section S.10.6). In principle, this may affect the relative feature importance of the early cycle features which correlate with specific cycling parameters. In addition, this type of model is preferred in cases where a user does not have direct access to cycling conditions, such as battery second-life repurposing or online EV health estimation; where the cycling conditions are kept constant, such as when probing production quality control; or where the relationship to cycling conditions is not the focus, such as in process or battery design optimization.104
Through our interpretable ML framework, we deepen our physical understanding of battery degradation within the high dimensionality of a diverse dataset. While interpretable ML tools can be used to generate hypotheses and summaries of the dataset, the findings must be further validated with physical characterization to gain confidence. This model framework is extensible to include various other use-case specific SOH metrics or cycling conditions. We encourage the field to use this methodology in the analysis of large datasets that span other chemistries and operating conditions such as those spanning wide temperature ranges, dynamic operational usage profiles for grid storage or EVs,105 battery manufacturing optimization,106 and other tasks. In these cases where the design space is sufficiently different, the models will need to be retrained or fine-tuned on these new datasets. We urge the field to use the dataset presented here to expand upon this work while keeping interpretability in mind to enrich our understanding of battery degradation.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4ee05609d
‡ These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2025