From the journal Digital Discovery: Peer review history

Uncertainty-aware and explainable machine learning for early prediction of battery degradation trajectory

Round 1

Manuscript submitted on 27 Jun 2022
 

23-Aug-2022

Dear Dr Rieger:

Manuscript ID: DD-ART-06-2022-000067
TITLE: Uncertainty-aware and explainable model for early prediction of battery degradation trajectory

Thank you for your submission to Digital Discovery, published by the Royal Society of Chemistry. I sent your manuscript to reviewers and I have now received their reports which are copied below.

I have carefully evaluated your manuscript and the reviewers’ reports, and the reports indicate that major revisions are necessary.

Please submit a revised manuscript which addresses all of the reviewers’ comments. Further peer review of your revised manuscript may be needed. When you submit your revised manuscript please include a point by point response to the reviewers’ comments and highlight the changes you have made. Full details of the files you need to submit are listed at the end of this email.

Digital Discovery strongly encourages authors of research articles to include an ‘Author contributions’ section in their manuscript, for publication in the final article. This should appear immediately above the ‘Conflict of interest’ and ‘Acknowledgement’ sections. I strongly recommend you use CRediT (the Contributor Roles Taxonomy, https://credit.niso.org/) for standardised contribution descriptions. All authors should have agreed to their individual contributions ahead of submission and these should accurately reflect contributions to the work. Please refer to our general author guidelines https://www.rsc.org/journals-books-databases/author-and-reviewer-hub/authors-information/responsibilities/ for more information.

Please submit your revised manuscript as soon as possible using this link:

*** PLEASE NOTE: This is a two-step process. After clicking on the link, you will be directed to a webpage to confirm. ***

https://mc.manuscriptcentral.com/dd?link_removed

(This link goes straight to your account, without the need to log on to the system. For your account security you should not share this link with others.)

Alternatively, you can login to your account (https://mc.manuscriptcentral.com/dd) where you will need your case-sensitive USER ID and password.

You should submit your revised manuscript as soon as possible; please note you will receive a series of automatic reminders. If your revisions will take a significant length of time, please contact me. If I do not hear from you, I may withdraw your manuscript from consideration and you will have to resubmit. Any resubmission will receive a new submission date.

The Royal Society of Chemistry requires all submitting authors to provide their ORCID iD when they submit a revised manuscript. This is quick and easy to do as part of the revised manuscript submission process.   We will publish this information with the article, and you may choose to have your ORCID record updated automatically with details of the publication.

Please also encourage your co-authors to sign up for their own ORCID account and associate it with their account on our manuscript submission system. For further information see: https://www.rsc.org/journals-books-databases/journal-authors-reviewers/processes-policies/#attribution-id

I look forward to receiving your revised manuscript.

Yours sincerely,
Dr Kedar Hippalgaonkar
Associate Editor, Digital Discovery
Royal Society of Chemistry

************


 
Reviewer 1

The authors have presented an explainable model for prediction of battery degradation and end of life. It is a detailed study with good results, though the reference list is too selective and can be improved. Some suggestions are given below.

Can the authors do a benchmarking table to compare their model with the performance in the literature? You can refer to Nat. Mach. Intell. 2020, 2, 161-170 and Adv. Mater. 2022, 34, 2101474.

The authors also mention in the cover letter and text that "we foresee our model to generalize to all battery chemistries and formats." However, they have only used some limited NCA and NMC data. Please provide more evidence to support this generalization.

Reviewer 2

This paper presents a deep learning approach to predict the degradation of Li-ion batteries from preliminary cycling information with quantified uncertainty. Furthermore, the analysis is performed for multiple battery chemistries and an explainability analysis is performed to connect input features to predictive performance. This study provides a contribution to the state of the art in data-driven battery modeling, but needs significant modifications in terms of presentation to be acceptable for publication.
1) There are a considerable number of typographical errors in the manuscript. For example, “insidence” on page 1, “amont” on page 6, and “criterium” on page 10.
2) The abstract makes certain claims that are not fully backed-up in the text. For example, the text contains no comparison with previous results to justify the claim that the model outperforms previous ones. Furthermore, justification of the explainability analysis results with existing chemical insights is highly limited, and perhaps should be reconsidered.
3) In page 2 the paper by Strange and Dos Reis (https://doi.org/10.1016/j.egyai.2021.100097) should be discussed as they also do capacity degradation predictions based on limited preliminary cycling information with quantified uncertainty.
4) Consider citing Saxena et al. 2022 (https://doi.org/10.1016/j.jpowsour.2022.231736) as another example of capacity fade degradation prediction.
5) On page 3, did the authors consider splitting cycling experiments into two where there is some discontinuity in the capacity degradation?
6) On page 3, “covariates” should be defined at this first mention. It should not wait until Sec. 2.5.
7) On page 3, please describe what motivated the lower bound of 20 cycles for trajectory prediction.
8) On page 3, please explain why the charging schedule is necessary to include as a feature. How would the performance be if only these features were used? The reviewer fears that this might be sufficient for accurate predictions with the Severson dataset.
9) On page 4 the authors describe their uncertainty quantification approach. It seems quite similar to others available in the literature. Please provide adequate citations.
10) Please also consider including some more principled verification/validation of the quantified uncertainties. Otherwise, it is impossible to judge their quality.
11) On page 8, please consider evaluating the overall accuracy of the capacity degradation trend prediction, not just the EOL.
12) The paper claims improved performance of the described model over others available in the literature. Please include these performance comparisons in a tabular format.
13) On page 9 please use “RMSE” to describe error instead of “within 200 cycles of accuracy.” This is not specific enough.
14) On page 10, what is meant by a “proxy criterium (criterion)?”
15) Figure 5B is difficult to read. Please include an insert plotting predicted capacity uncertainty versus cycles.
16) Figure labels need to be switched between Fig. 6 and Fig. 7.
17) Consider splitting Fig. 7 into two subfigures: one for the coulombic efficiency comparison, and one for everything else.


Reviewer 3

Thanks for submitting the research work for the review process. Attached you will find the feedback.


 

We thank the reviewers for their detailed and helpful feedback. We have revised the manuscript to address their comments and have included a point-by-point list detailing the changes made in response to specific comments. In the accompanying pdf, our answers are in italics and blue. We have also submitted a version of the manuscript in which major changes to the text are marked in italics (small changes such as corrected typos are not marked; added figures are mentioned in the response).


General comments
A shared concern was the overselling of the paper’s achievements (R1P2 (Reviewer 1, Point 2), R2P2, R2P13, R3P8) along with grammatical errors. We have toned down and specified the claims throughout the paper and proofread the paper again, correcting all the errors we could find.
Reviewers 1 and 2 also raised concerns about the comparisons to other algorithms. We agree with those concerns. Unfortunately, many manuscripts do not make their code and data split available. For example, among the papers covered by the review suggested by Reviewer 1 (Adv. Mater. 2022, 34, 2101474), we did not find any with open code, illustrating the problem. This makes a direct comparison effectively impossible. We have compared the performance of the algorithm as described by Severson et al. to our model in Table 1.
Furthermore, we have now added Table 2 to our paper, comparing recent approaches while emphasizing that the comparison is not necessarily meaningful, as the approaches use different data splits (for example, Strange et al. (Ref 20) use an 80:20 split while Ma et al. (Ref 40) use a 50:50 split).
Finally, we have made our code completely open-source and freely available to facilitate future comparisons.

Reviewer 1
Can the authors do a benchmarking table to compare their model with the performance in the literature? You can refer to Nat. Mach. Intell. 2020, 2, 161-170 and Adv. Mater. 2022, 34, 2101474.
As described in the general comments, we have added a table, Table 2, to the paper. We summarize the performance metrics of three relevant papers and of our approach on their respective datasets, while clearly stating in the caption that the metrics are not strictly comparable since they use different data splits.
We refer the reviewer to the two cited reviews for a more extensive overview of algorithms and features used for prediction.
The authors also mention in the cover letter and text that "we foresee our model to generalize to all battery chemistries and formats." However, they have only used some limited NCA and NMC data. Please provide more evidence to support this generalization.
We have rephrased this claim and instead specifically reference the results in the supplements.
Reviewer 2
1) There are a considerable number of typographical errors in the manuscript. For example, “insidence” on page 1, “amont” on page 6, and “criterium” on page 10.
We have proofread the paper again and corrected typos.
2) The abstract makes certain claims that are not fully backed-up in the text. For example, the text contains no comparison with previous results to justify the claim that the model outperforms previous ones. Furthermore, justification of the explainability analysis results with existing chemical insights is highly limited, and perhaps should be reconsidered.
We have toned down claims throughout the text. We address the shared concern about comparison between approaches in the general comments.
3) In page 2 the paper by Strange and Dos Reis (https://doi.org/10.1016/j.egyai.2021.100097) should be discussed as they also do capacity degradation predictions based on limited preliminary cycling information with quantified uncertainty.
We have added the paper on page 2 and to the model performance comparison in Table 2.
4) Consider citing Saxena et al. 2022 (https://doi.org/10.1016/j.jpowsour.2022.231736) as another example of capacity fade degradation prediction
We have added the paper on page 2.
5) On page 3, did the authors consider splitting cycling experiments into two where there is some discontinuity in the capacity degradation?
We added a short remark on this. Due to the frequency of the outliers, splitting was not possible.
6) On page 3, “covariates” should be defined at this first mention. It should not wait until Sec. 2.5.
We have added the definition on page 3.
7) On page 3, please describe what motivated the lower bound of 20 cycles for trajectory prediction.
The focus was on showing results for a wide range of initial cycles to display the impact of the initial cycle number on the RMSE. Twenty was chosen ad hoc as a relatively low number. We have replaced the number of cycles used in the example, as it may have been confusing, and have extended the explanation of the range chosen.
8) On page 3, please explain why the charging schedule is necessary to include as a feature. How would the performance be if only these features were used? The reviewer fears that this might be sufficient for accurate predictions with the Severson dataset.
In Section 2.5 we go into detail about the feature selection, and we have added a reference to Section 2.5 on page 3. For the comparison between linear regression as described in Severson et al. and our approach, we do include the charging schedule in the input variables for all algorithms, ensuring a fair comparison. The charging schedule is included to allow future use with varying charging schemes. For many applications the charging schedule is not constant throughout the lifetime and therefore needs to be included on a cycle-to-cycle basis for accurate prediction. We therefore chose to include it here.
9) On page 4 the authors describe their uncertainty quantification approach. It seems quite similar to others available in the literature. Please provide adequate citations.
Using the negative log-likelihood (NLL) as a loss function is indeed a very common approach in machine learning: the loss function is implemented as a default in PyTorch, and we were not able to find an original citation. We have added recent citations using the same approach.
However, there are relatively few works considering uncertainty for regression with neural networks, and to our knowledge, ours is the first to use LSTM ensembles trained with the NLL to capture both aleatoric and epistemic uncertainty for time series.
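The ensemble scheme can be sketched as follows. This is a minimal illustration (not the code used in our implementation): each member predicts a mean and variance per cycle, the Gaussian NLL trains the variance head, and the ensemble mixes aleatoric (predicted noise) with epistemic (member disagreement) variance. All names and values here are hypothetical.

```python
import numpy as np

def gaussian_nll(y, mu, var):
    """Per-point negative log-likelihood of y under N(mu, var), up to a constant."""
    return 0.5 * (np.log(var) + (y - mu) ** 2 / var)

def ensemble_predict(mus, variances):
    """Combine per-member predictive means/variances (deep-ensemble style).

    mus, variances: arrays of shape (n_members, n_cycles).
    Returns the ensemble mean plus aleatoric and epistemic variance components.
    """
    mu = mus.mean(axis=0)               # ensemble mean trajectory
    aleatoric = variances.mean(axis=0)  # average predicted data noise
    epistemic = mus.var(axis=0)         # disagreement between members
    return mu, aleatoric, epistemic

# toy example: 3 ensemble members predicting 4 cycles
mus = np.array([[1.0, 0.9, 0.8, 0.7],
                [1.1, 1.0, 0.9, 0.8],
                [0.9, 0.8, 0.7, 0.6]])
vars_ = np.full_like(mus, 0.01)
mu, alea, epi = ensemble_predict(mus, vars_)
total_std = np.sqrt(alea + epi)  # one-sigma band around the mean trajectory
```

The split into the two variance terms is what lets the model distinguish noisy data (aleatoric) from trajectories far outside the training distribution (epistemic).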
10) Please also consider including some more principled verification/validation of the quantified uncertainties. Otherwise, it is impossible to judge their quality.
In addition to Figure 2, which shows the uncertainty over the prediction, and Figure 3, which shows the uncertainty over trajectories, we have added Figure 4 showing the calibration: the percentage of trajectories predicted to lie within a certain range of uncertainty versus the percentage of observed trajectories actually included.
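This kind of calibration check can be sketched as follows (a minimal version with hard-coded Gaussian quantiles, not the exact procedure behind Figure 4): for each nominal coverage level, count how many observations fall inside the predicted central interval.

```python
import numpy as np

# standard-normal quantiles for a few central coverage levels (pre-computed)
Z = {0.50: 0.674, 0.80: 1.282, 0.90: 1.645, 0.95: 1.960}

def empirical_coverage(y, mu, sigma, level):
    """Fraction of observations inside the predicted central interval at `level`."""
    z = Z[level]
    return float(np.mean(np.abs(y - mu) <= z * sigma))

# synthetic, perfectly calibrated predictions: empirical coverage ~ nominal level
rng = np.random.default_rng(0)
mu = np.zeros(20_000)
sigma = np.ones(20_000)
y = rng.normal(mu, sigma)

curve = {level: empirical_coverage(y, mu, sigma, level) for level in Z}
# an overconfident model would show curve[level] < level
```

Plotting `curve[level]` against `level` gives the calibration curve; points below the diagonal indicate overconfident uncertainty estimates.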
11) On page 8, please consider evaluating the overall accuracy of the capacity degradation trend prediction, not just the EOL.
We have added the R2 value for the model to the evaluation section. To prevent confusion, please note that the R2 value in the paper referenced in R2P3 (Ref 20 in our paper) does not describe the accuracy of their model, but rather how well a smoothed approximation of the capacity fits its parameterization. To evaluate the curve fitting of their prediction model, they provide model predictions in their Figure 3, as we do in our paper.
12) The paper claims improved performance of the described model over others available in the literature. Please include these performance comparisons in a tabular format.
We have removed the claim of superiority. As described in the general comments, the majority of papers do not make the code or information essential to recreating their results available. We have added Table 2, showing an overview of recent approaches, including performance metrics.
13) On page 9 please use “RMSE” to describe error instead of “within 200 cycles of accuracy.” This is not specific enough.
We have changed the section and now specify that the RMSE at 40 cycles is 173 cycles, as is also shown in Figure 5A.
14) On page 10, what is meant by a “proxy criterium (criterion)?”
We have rephrased the wording of that section to make clear that cycling a battery for longer will reduce uncertainty and increase accuracy over the prediction.
15) Figure 5B is difficult to read. Please include an insert plotting predicted capacity uncertainty versus cycles.
We have added an insert plotting the uncertainty for both trajectories.
16) Figure labels need to be switched between Fig. 6 and Fig. 7.
We have swapped the figures so that they are referenced in order.
17) Consider splitting Fig. 7 into two subfigures: one for the coulombic efficiency comparison, and one for everything else.
We have split up the figure into two figures.
Reviewer 3
The authors are encouraged to put an indication of the research outcome already in the abstract.
We have added the model performance metrics to the abstract and clearly mention the most significant impact of this work: “Our model will enable accelerated battery development via uncertainty-guided truncation of cell cycling experiments once predictions are reliable.”
The authors are suggested to put a reference on statements like the first sentence in Section 1 and/or statements claiming 80% SoH is the EoL in the automotive sector etc.
We have removed the unnecessary specification to the automotive sector and added a citation stating that batteries are typically considered at their EOL at 80% capacity.
Regular typos and grammar should be thoroughly checked in the document. Words like “insidence”, “f.e.”, “amont”, etc. do not seem to exist. Such errors give a poor impression of the presentation quality.
We have proofread the manuscript again and removed errors.
Questions on the novel contributions – Why did the authors choose LSTM as the preferred methodology? It is not well-explained.
We have extended this part of Section 3.1. Briefly, LSTMs are very well suited to the prediction of sequential data, can model complex functions, and allow us to examine how the capacity degrades over time.
The authors have not generated any dataset but used open access information for their study. Thus, it is suggested to rather concentrate on how to tackle the noise/outliers than pointing them out (referring to the last paragraph of Section 2.1). In a laboratory environment, it is common to have inconsistency in the measurement when conducting large test campaigns.
We go into detail on the data processing in Section 2.2, including strategies to tackle noise such as the use of a moving average filter (MAF) and the removal of outliers. To motivate their use, especially for readers with domain knowledge in ML rather than battery development, we think it is useful to give this background.
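A moving average filter of this kind can be sketched as follows (a minimal illustration, not our actual preprocessing code; the synthetic capacity curve and all variable names are hypothetical):

```python
import numpy as np

def moving_average(x, window=5):
    """Centered moving-average filter; edges are padded by repeating boundary values."""
    pad = window // 2
    xp = np.pad(x, pad, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(xp, kernel, mode="valid")

# synthetic capacity curve with deterministic "measurement noise"
cycles = np.arange(100)
capacity = 1.1 - 0.002 * cycles          # linear fade, arbitrary units
noisy = capacity + 0.01 * np.sin(cycles) # per-cycle noise on top
smoothed = moving_average(noisy, window=9)
```

A centered window preserves linear trends exactly in the interior, so the smoothed curve tracks the underlying degradation while attenuating cycle-to-cycle noise.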
Excluding a feature as important as temperature is bold for an electrochemical system and thus requires more motivation and clarification. How can the authors justify that the temperature exerts a common stress on the selected features, i.e., that its contribution is already integrated into them?
We have rephrased Section 2.5, where we explain why temperature was not included as a feature. Briefly, while temperature is an important factor, it is also indirectly represented in the features that we do include. Furthermore, it does not generalize well to other datasets (i.e., it does not have the same correlation with cell degradation), as the temperature varies greatly with the experimental environment and the sensor placement.
The explanations of NMC/NCA/LFP training, validation, and testing are a bit unclear throughout the main manuscript.
We have rephrased and clarified Section 2.1, describing the datasets and the train/val/test split of the data. Additionally, we have added Figure S8 in the supplements to further clarify the data split.
The authors claim to consider electrochemical signals relating to physical degradation mechanisms, but the outcome is not explicitly shown.
We have clarified these claims and now specify that the observed gradients are in line with previously known effects, opening up the possibility of future work classifying degradation mechanisms solely from the data at hand.
The authors have shown analysis for non-aged and calendar-aged batteries separately; however in reality, the stress factors of cycling and calendar life aging are interlinked.
This is correct. For any ML algorithm, performance on data outside of the training manifold is of interest. The training data that we use contains only non-aged batteries. We therefore analyze how the ML algorithms perform on calendar-aged batteries that they were not explicitly trained for. In contrast to training on aged and non-aged batteries together, this presents a harder challenge.




Round 2

Revised manuscript submitted on 20 Sep 2022
 

02-Nov-2022

Dear Dr Rieger:

Manuscript ID: DD-ART-06-2022-000067.R1
TITLE: Uncertainty-aware and explainable model for early prediction of battery degradation trajectory

Thank you for your submission to Digital Discovery, published by the Royal Society of Chemistry. I sent your manuscript to reviewers and I have now received their reports which are copied below.

I have carefully evaluated your manuscript and the reviewers’ reports, and the reports indicate that major revisions are necessary.

Please submit a revised manuscript which addresses all of the reviewers’ comments. Further peer review of your revised manuscript may be needed. When you submit your revised manuscript please include a point by point response to the reviewers’ comments and highlight the changes you have made. Full details of the files you need to submit are listed at the end of this email.

Digital Discovery strongly encourages authors of research articles to include an ‘Author contributions’ section in their manuscript, for publication in the final article. This should appear immediately above the ‘Conflict of interest’ and ‘Acknowledgement’ sections. I strongly recommend you use CRediT (the Contributor Roles Taxonomy, https://credit.niso.org/) for standardised contribution descriptions. All authors should have agreed to their individual contributions ahead of submission and these should accurately reflect contributions to the work. Please refer to our general author guidelines https://www.rsc.org/journals-books-databases/author-and-reviewer-hub/authors-information/responsibilities/ for more information.

Please submit your revised manuscript as soon as possible using this link:

*** PLEASE NOTE: This is a two-step process. After clicking on the link, you will be directed to a webpage to confirm. ***

https://mc.manuscriptcentral.com/dd?link_removed

(This link goes straight to your account, without the need to log on to the system. For your account security you should not share this link with others.)

Alternatively, you can login to your account (https://mc.manuscriptcentral.com/dd) where you will need your case-sensitive USER ID and password.

You should submit your revised manuscript as soon as possible; please note you will receive a series of automatic reminders. If your revisions will take a significant length of time, please contact me. If I do not hear from you, I may withdraw your manuscript from consideration and you will have to resubmit. Any resubmission will receive a new submission date.

The Royal Society of Chemistry requires all submitting authors to provide their ORCID iD when they submit a revised manuscript. This is quick and easy to do as part of the revised manuscript submission process.   We will publish this information with the article, and you may choose to have your ORCID record updated automatically with details of the publication.

Please also encourage your co-authors to sign up for their own ORCID account and associate it with their account on our manuscript submission system. For further information see: https://www.rsc.org/journals-books-databases/journal-authors-reviewers/processes-policies/#attribution-id

I look forward to receiving your revised manuscript.

Yours sincerely,
Dr Kedar Hippalgaonkar
Associate Editor, Digital Discovery
Royal Society of Chemistry

************


 
Reviewer 3

Dear authors,
Thanks for addressing the feedback and concerns provided in the first round. Unfortunately, I couldn’t find the response to a few questions mentioned again below.
- The comparison of the achieved results to existing works is missing. (already replied corresponding to other responses)
- Figure S2 shows a poor estimation accuracy referring to the chemistry-neutral claim. How can the authors justify the reported error in predicting aging with early cycles?
- What are the limitations and future work for this work? The authors are encouraged to try the same features in another dataset.
- Are the selected features in this research test methodology dependent? What if shallow cycling data are available instead of a complete charge-discharge cycle that is more realistic? Will the model still be good enough to predict the lifetime trajectory?

Moreover, based on the previous replies, I have further questions/suggestions to make.
- It is not clear to this reviewer if his question 9 is addressed in the revised manuscript. If yes, then where? Could the authors clarify?
- In the new Table 2, it is suggested to remove MAE if not available for all the compared studies.
- Can the authors comment on testing the prediction accuracy for dynamically aged data or on field data? Is this already realistic? If not, then how far are we from implementing such algorithms in a BMS?

Reviewer 2

The authors have addressed most of the reviewer comments, but some remain unaddressed. Furthermore, typographical errors remain. An additional major review is required prior to publication.

1) The work still contains numerous typographical and grammatical errors
2) In the second paragraph of Sec. 2.1 it is unclear whether the authors are referring to training data splits in the Severson paper or in the present work.
3) Are we sure that the third Severson batch was calendar aged? Were all cells for the study purchased simultaneously? Please clarify in the text.
4) In the first paragraph of page 4 the authors are advised to avoid gendered language. “person-hours” would be more appropriate than “man-hours.”
5) In the first paragraph of the second column of page 4, the term “a lot of” is used. This language is a bit too informal for a scientific publication.
6) Regarding the first paragraph of Sec. 2.5, the reviewer has no problem with the authors excluding temperature in their study, but the provided justification doesn’t make much sense. Of course, the cell temperature depends on the environmental temperature – it is unclear why this is relevant to mention. If this is due to an observation during training and validation, please indicate so in the text.
7) Please explain the meaning of ‘n’ in the equation on page 6.
8) The authors’ response in the following exchange is inadequate:

Reviewer comment: On page 3, please explain why the charging schedule is necessary to include as a feature. How would the performance be if only these features were used. The reviewer fears that this might be sufficient for accurate predictions with the Severson dataset.

Author response: In section 2.5 we go into detail about the feature selection. We have added a reference to Section 2.5 on page 3. For the comparison between Linear Regression as described in Severson et. al. and our approach, we do include the charging schedule in the input variables for all algorithms, ensuring a fair comparison. The charging schedule is included to allow future use with varying charging schemes.

For many applications the charging schedule is not constant throughout the lifetime and therefore needs to be included on a cycle-to-cycle basis for accurate prediction. We therefore choose to include it here.
Severson charging schedules can be succinctly described as two different C rates, i.e., only two variables. This would be a way for the algorithm to "cheat" and find an easy way to correlate with degradation. Real charging schedules might be far more complex and less likely to be easily taken advantage of by a learning algorithm.


 

We thank the reviewers for their thoughtful and detailed feedback. We have addressed their points below. Additionally, we have uploaded a new version of the manuscript, as well as a version in which changes are marked in italics. The response may be easier to read in the response pdf, where our answers are clearly marked in blue italics.

Referee: 3

Comments to the Author
Dear authors,
Thanks for addressing the feedback and concerns provided in the first round. Unfortunately, I couldn’t find the response to a few questions mentioned again below.
- The comparison of the achieved results to existing works is missing. (already replied corresponding to other responses)
In Table 2 we compare the achieved results to existing work on the same dataset. Following your recommendation, we have removed the MAE.
- Figure S2 shows a poor estimation accuracy referring to the chemistry-neutral claim. How can the authors justify the reported error in predicting aging with early cycles?
We write in the supplements
“Compared to the results on LFP batteries, we see in Figure S2 that the model learned to predict the rate of degradation but is less accurate in prediction. This is likely due to the noise, including lasting upwards jumps of the capacity, present in the data that can readily be seen from the true degradation trajectories in Figure S2 as well as the two different chemistries in the dataset. [...]
We note that in two cases, the prediction deviates significantly from the actual trajectory from the start. In both cases, the battery experienced significant degradation in the first 100 cycles (down to 70% of the initial capacity) and based on visual inspection seems to differ from the rest of the distribution.”
There are two outliers with poor accuracy in Figure S2. We note in Section 3 of the supplements that these two cells differ from the rest, characterized by reaching their EOL (80% of the nominal capacity) after just 30-40 cycles.
In response to earlier comments, we had already toned down claims, particularly regarding the currently trained model achieving chemistry-neutral performance.
- What are the limitations and future work for this work? The authors are encouraged to try the same features in another dataset.
We have extended this part of the conclusion:
“A limitation of our work is that the main dataset consists only of cells with a single cell chemistry that are discharged with a uniform discharge rate across cells and lifetime. As opposed to the more realistic use case of varying charge/discharge rates, this limits the complexity of the prediction task. In subsequent work, we intend to apply the model to newly created datasets containing a wider variety of usage parameters, opening up the possibility of incorporating the model into a BMS (Battery Management System) for more targeted usage.
Additionally, as is visible in Figure 4, the model is still slightly overconfident in its predictions, particularly for data points with large errors.
In future work, we plan to predict the driving degradation mechanism directly from the LSTM model.”
Briefly, an important current limitation is the lack of variation in the discharge rate and DoD in the currently available datasets.
We have also observed that the model is currently overconfident in its predictions, resulting in too-tight uncertainty boundaries, as seen in Figure 4.
In addition to applying our model to datasets with a wider variety of usage conditions, we plan to extend the explainability aspect of the work by predicting the driving degradation mechanisms in situ.

- Are the selected features in this research test methodology dependent? What if shallow cycling data are available instead of a complete charge-discharge cycle that is more realistic? Will the model still be good enough to predict the lifetime trajectory?
The currently selected features are commonly available for most cycling experiments and thus not test methodology dependent.
The model can be used with shallow cycling data and varying DoD for each cycle. Due to the sequential nature of the model, variable cycling conditions can be easily accommodated by extending the expected input to incorporate the start and end charge for each cycle. We expect the model to predict accurate lifetime trajectories in this use case.
Moreover, based on the previous replies, I have further questions and suggestions.
It is not clear to this reviewer if his question 9 is addressed in the revised manuscript. If yes, then where? Could the authors clarify?
Question 9 for clarification: “The authors have shown analysis for non-aged and calendar-aged batteries separately; however, in reality, the stress factors of cycling and calendar life aging are interlinked. How the authors would tackle a such problem with the developed methodology?”
We agree that the stress factors of cycling and calendar-life aging are interlinked. However, there are nuanced differences in exactly how battery behavior changes under these two types of aging. As aging information is often missing, our model does not use direct information about calendar aging and instead uses the same input variables to predict outcomes from both cycling and calendar aging. To probe the model's capability, we trained it only on non-aged data and checked how well it performs on aged batteries with somewhat different aging trajectories.
An advantage of purely data-driven algorithms such as ours is that the compounding effects of calendar aging and cycling do not have to be explicitly incorporated; they are learned implicitly when the model is trained on data that contains both aged and non-aged batteries. The problem of interlinked effects is therefore addressed by the use of a data-driven algorithm.
In our manuscript we write “Since calendar aging influences the capacity of a battery, ML algorithms that were trained only on data from non-calendar-aged batteries will perform worse when predicting the future capacity of calendar-aged batteries (and vice versa).” We also stress that the calendar-aged results are obtained with a model trained only on non-calendar-aged batteries and therefore already show how the model performs when encountering interlinked aging effects. If the model were trained with a combination of calendar-aged and non-calendar-aged data, it would perform even better.
We hope we have made clear that the model will perform better, not worse, when trained on a mixed dataset.
In the new Table 2, it is suggested to remove MAE if not available for all the compared studies.
We have removed the MAE from the table. As the MAE is of interest to parts of the battery community, we have added the MAE of our model to the text.
Can the authors comment on testing the prediction accuracy on dynamically aged data or on field data? Is this already realistic? If not, how far are we from implementing such algorithms in a BMS?
As the reviewer has noted, the datasets used do not reflect realistic field conditions; for example, they use a uniform discharge rate for all batteries and complete charge/discharge cycles. The main focus of the current work is indeed to predict future degradation under controlled conditions and to reduce testing times.
To employ ML models in a BMS, a complex dataset with varying charge and discharge policies is needed. Once the model is trained and validated with such a dataset, we are hopeful that it can be employed in a BMS to provide next-generation optimization and maintenance capabilities. We are currently planning an industrial collaboration to incorporate a model based on this algorithm into a BMS, and we expect the model to handle the added complexity of field data well. This is a complex multi-year endeavor for us, and we hope to see similar efforts from other academic and industrial organizations, given the critical impact of energy storage systems. We have added a ‘teaser’ about this effort to the conclusion section.


Referee: 2

Comments to the Author
The authors have addressed most of the reviewer comments, but some remain unaddressed. Furthermore, typographical errors remain. An additional major revision is required prior to publication.

1) The work still contains numerous typographical and grammatical errors
We have carefully proofread the manuscript and fixed errors with professional grammar-checking software.
2) In the second paragraph of Sec. 2.1 it is unclear whether the authors are referring to training data splits in the Severson paper or in the present work.
We thank the reviewer for pointing this out.
The batteries were cycled in three batches. We clarified that the first two batches were used for training/validation in the original paper while the third batch was used for testing.
3) Are we sure that the third Severson batch was calendar aged? Were all cells for the study purchased simultaneously? Please clarify in the text.
Yes; we refer to the supplementary information of the Severson paper: “Second, the calendar aging of the secondary test set was about one year greater than the primary testing and training sets.”
We have clarified this in the manuscript.
4) In the first paragraph of page 4 the authors are advised to avoid gendered language. “person-hours” would be more appropriate than “man-hours.”
We changed the phrasing to person-hours.
5) In the first paragraph of the second column of page 4, the term “a lot of,” is used. This is a bit too informal language for a scientific publication.
We have changed the expression to “a substantial amount of”.
6) Regarding the first paragraph of Sec. 2.5, the reviewer has no problem with the authors excluding temperature in their study, but the provided justification doesn’t make much sense. Of course, the cell temperature depends on the environmental temperature – it is unclear why this is relevant to mention. If this is due to an observation during training and validation, please indicate so in the text.
Machine learning models use training data to learn correlations between the input and the output. A necessary condition is that the learned correlations generalize, i.e. hold in the test dataset as well as in the training dataset.
The cell temperature depends, among other factors, on the environmental temperature as well as the cycling conditions, making it a less reliable indicator. Additionally, temperature is not always available in cycling datasets. Given that we reach good predictive performance without using temperature, it is preferable not to include more input parameters than necessary, to avoid unnecessary sources of complexity.
While we have not tested using temperature, we note that the only model in the original Severson paper that includes temperature (the ‘Full’ model) has the worst test RMSE, compared to models using only discharge features or even a single feature. This further illustrates that using non-generalizable features such as temperature degrades test performance even in the ideal setting of a temperature-controlled chamber with an individual temperature sensor for each cell.
We hope that we have adequately explained why we did not use temperature as an input feature.
7) Please explain the meaning of ‘n’ in the equation on page 6.
We have added the definitions of N, y_n and y_n,pred after the equation. With this change, the definition of n follows directly from the formula as the index over which the sum runs.
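For reference, and assuming the equation on page 6 is the standard root-mean-square error (the exact equation is not reproduced in this exchange), the added definitions would correspond to:

```latex
% Sketch only, assuming the equation in question is the standard RMSE;
% symbols follow the definitions named in the response above.
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(y_n - y_{n,\mathrm{pred}}\right)^{2}}
```

where N is the number of predictions, y_n the measured value, y_{n,pred} the predicted value, and n the index over which the sum is taken.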
8) The reviewer’s response in the following exchange is inadequate:
Reviewer comment: On page 3, please explain why the charging schedule is necessary to include as a feature. How would the performance be if only these features were used? The reviewer fears that this might be sufficient for accurate predictions with the Severson dataset.
As Reviewer 3 noted, a constant charge/discharge policy is an unrealistic condition for real-world battery usage. For many applications, the charging schedule is not constant throughout the lifetime and therefore needs to be included on a cycle-to-cycle basis for accurate prediction of the future trajectory. We, therefore, include it as an input feature.
Severson charging schedules can be succinctly described as two different C rates, i.e. only two variables. This would be a way for the algorithm to "cheat" and find an easy way to correlate with degradation. Real charging schedules might be far more complex and less likely to be easily exploited by a learning algorithm.
We agree that real charging schedules might be far more complex.
The charging schedule of the Severson dataset cannot be described with only two variables, since the point at which the C rate changes is itself variable and depends on the SoC. We use the minimum, maximum and mean C rates as inputs. These values can be easily extracted for most charging schedules. Together with the remaining covariates, we believe they can adequately represent more realistic charging schedules for the purpose of predicting future SOH, provided training data is available.
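The reduction of an arbitrary charging schedule to these three covariates can be sketched as follows; the function name, units and example profile are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch (not the authors' implementation): summarising an
# arbitrary charging schedule by its minimum, maximum and mean C rate,
# the three covariates named in the response above.

def c_rate_summary(current_profile_amps, nominal_capacity_ah):
    """Reduce a per-sample charging-current profile to three covariates."""
    c_rates = [abs(i) / nominal_capacity_ah for i in current_profile_amps]
    return min(c_rates), max(c_rates), sum(c_rates) / len(c_rates)

# Example: a two-step CC schedule (4C until a SoC-dependent switch, then 1C)
# for a hypothetical 1.1 Ah cell, sampled at 100 equally spaced points.
profile = [4.4] * 30 + [1.1] * 70  # charging current in amps
lo, hi, mean = c_rate_summary(profile, 1.1)
```

Because the summary ignores where the C-rate switch occurs, the SoC-dependent switch point enters the model only through the remaining covariates, which is why those are needed alongside the three C-rate statistics.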






Round 3

Revised manuscript submitted on 15 Nov 2022
 

28-Nov-2022

Dear Dr Rieger:

Manuscript ID: DD-ART-06-2022-000067.R2
TITLE: Uncertainty-aware and explainable model for early prediction of battery degradation trajectory

Thank you for submitting your revised manuscript to Digital Discovery. I am pleased to accept your manuscript for publication in its current form. I have copied any final comments from the reviewer(s) below.

You will shortly receive a separate email from us requesting you to submit a licence to publish for your article, so that we can proceed with the preparation and publication of your manuscript.

You can highlight your article and the work of your group on the back cover of Digital Discovery. If you are interested in this opportunity please contact the editorial office for more information.

Promote your research, accelerate its impact – find out more about our article promotion services here: https://rsc.li/promoteyourresearch.

If you would like us to promote your article on our Twitter account @digital_rsc please fill out this form: https://form.jotform.com/213544038469056.

By publishing your article in Digital Discovery, you are supporting the Royal Society of Chemistry to help the chemical science community make the world a better place.

With best wishes,

Dr Kedar Hippalgaonkar
Associate Editor, Digital Discovery
Royal Society of Chemistry


 
Reviewer 3

Thanks for addressing all the concerns. This reviewer has no further comment to make.

Reviewer 2

I thank the authors for responding to the review admirably.




Transparent peer review

To support increased transparency, we offer authors the option to publish the peer review history alongside their article. Reviewers are anonymous unless they choose to sign their report.

We are currently unable to show comments or responses that were provided as attachments. If the peer review history indicates that attachments are available, or if you find there is review content missing, you can request the full review record from our Publishing customer services team at RSC1@rsc.org.

Find out more about our transparent peer review policy.

Content on this page is licensed under a Creative Commons Attribution 4.0 International license.