Open Access Article
André Nogueira a, Filipe H. B. Sosa a, Ana C. Dias b, João A. P. Coutinho a and Nicolas Schaeffer *a
aCICECO – Aveiro Institute of Materials, Department of Chemistry, University of Aveiro, 3810-193 Aveiro, Portugal. E-mail: nicolas.schaeffer@ua.pt
bCentre for Environmental and Marine Studies (CESAM), Department of Environment and Planning, University of Aveiro, 3810-193 Aveiro, Portugal
First published on 20th November 2025
Technological innovation has led to the widespread adoption of lithium-ion batteries (LIBs) for portable energy storage. Correspondingly, sustainable solutions to end-of-life battery disposal are crucial to manage their growing volume. Beyond their potentially hazardous nature, waste LIBs contain several economically relevant critical raw materials such as lithium, manganese and cobalt. However, their recovery by hydrometallurgical approaches often relies on the excessive use of corrosive solutions during the leaching steps, negatively impacting the atomic efficiency, effluent treatment, and cost of the process. Despite numerous studies on the topic, identifying optimal leaching conditions is challenging given the variety of available battery chemistries and leaching agents, compounded by economic and environmental concerns. This work presents a methodical, data-driven approach to model the leaching of key metals from oxide-based LIB cathode active materials using machine learning algorithms, implementing pairwise difference algorithms for data augmentation. The developed model underwent thorough evaluation and screening, and its output is integrated to compute a simplified economic and environmental assessment, accounting for key performance indicators such as heating requirements, solvent cost, and environmental impact, thereby enabling an agile screening of potential preliminary leaching conditions. The methodology described herein is an important step in integrating emerging computational tools in the development of novel, greener metal recycling processes.
Green foundation
1. This work introduces a machine learning (ML) framework to optimize the recycling of lithium-ion batteries (LIBs). This data-driven approach aims to reduce the excessive use of harmful chemicals in the leaching step of hydrometallurgical battery recycling. The key advance is an integrated toolkit that combines leaching yield prediction with preliminary economic and environmental assessment tools, enabling more agile screening and development of greener and more efficient metal recycling processes.
2. The contribution to green chemistry takes the form of the “LIB Leaching Toolkit”, a software tool that integrates ML-powered leaching yield prediction with qualitative economic and environmental assessment tools for a more holistic approach to process design. The underlying model showed strong predictive performance, with a coefficient of determination of 0.933 on unseen test data.
3. This work can be advanced by expanding the dataset to include more varied cathode chemistry data and, crucially, data on the contaminants present in real waste battery streams. This would improve the applicability of the toolkit to real-world scenarios, highlighting its potential for use in a digital twin setting for more informed process design and control.
The recycling of valuable and critical materials like lithium, nickel, manganese, cobalt and graphite present in LIBs is a key step in reducing the environmental footprint of battery production by introducing a circular economy for these materials. Different avenues for LIB recycling have been reported, including pyrometallurgy, hydrometallurgy, biometallurgy, direct recycling, or a combination of these options.3 Hydrometallurgy is projected to remain the most common battery cathode recycling process, accounting for 90% of global capacity in 2030, due to its high yields, favorable economics and flexibility in handling different cathode chemistries and waste streams.4 Hydrometallurgy hinges on using leaching agents, usually aqueous solutions containing strong chemical oxidizers, to leach metals from the active material. Metals are then recovered from the pregnant solution through precipitation, deposition, solvent extraction or other separation methods. Hydrometallurgical processes can offer high efficiency and purity in metal recovery from battery waste.5 However, careful process optimization is essential to minimize the drawbacks of this approach. The leaching step directly affects any downstream separation processes and is typically associated with the largest chemical consumption, incurring significant environmental impact. Leaching inherently generates large amounts of wastewater and carbon dioxide emissions, as much as 6.6 kg of wastewater and 42 kg of carbon dioxide per electric vehicle cell.6 Generated wastewater may contain heavy metals, sodium, phosphate, and sulphates, all of which carry significant environmental impact.6 From an economic perspective, hydrometallurgical processes typically involve the use of excess quantities of reagents, implying potentially high operating costs.
Furthermore, high-purity recovery of critical materials from waste remains a challenge due to the costly processes needed to separate the transition metals after leaching. These aspects make careful consideration of the leaching step even more important to facilitate downstream recovery.7 The combined heterogeneity in feedstock and processing pathways calls for flexible recycling operations in which optimal leaching conditions can be quickly screened and adapted as new needs arise, bypassing experimental re-optimization and excessive chemical requirements.
Machine-learning (ML) techniques offer a way to overcome the limitations of first-principles chemical models, leveraging data-driven approaches to model complex patterns and predict leaching performance of real-world waste streams. This transition from first principles to a data-driven philosophy is especially relevant for application to variable real-world waste, where a complete mechanistic understanding of the chemical processes at play is often impractical. Additionally, flexible and robust computational modeling of leaching processes enables real-time tuning and optimization, maximizing yield while minimizing environmental impact and cost. ML is a useful tool for the optimization of processes, covering the LIB value-chain from manufacturing to recycling.8,9 The manufacture of LIB is highly sensitive to variations in process parameters, which will directly influence cell performance. Duquesnoy et al.10,11 used physics-based and ML models to predict ideal manufacturing parameters to produce electrodes for specific battery applications. ML approaches were also utilized to improve the performance and longevity of LIB.12 Additionally, significant efforts were directed towards the development of ML-based battery life models to optimize battery charging and use.13
More recently, ML techniques were identified as a means to improve e-waste recycling processes, extending beyond the manufacture to the disposal of LIB. Zhou14 developed an ML model that pairs machine vision with convolutional neural networks to achieve high-accuracy classification of mixed e-waste into different categories, helping to overcome one of the major challenges of e-waste recycling. Continuing the recycling process, works by Ebrahimzade et al.15 and Niu et al.16 used different ML approaches to successfully model the leaching of LIB, addressing another aspect of the battery recycling problem by predicting leaching efficiency from experimental conditions. However, both approaches have limitations in how they model the chemical systems. The first is highly specific, supporting exclusively sulfuric acid and hydrogen peroxide. This narrow focus means its predictions are not applicable to other leaching agents. While the model developed by Niu et al.16 is broader, its chemical descriptors for the acids are limited to the first dissociation constant of each one. This simplification tries to represent a complex chemical system using a single value, which cannot fully capture acid structure, multiple deprotonations, and other properties. Recent work by Zhou et al.17 showcases the thorough development of a methodology for the efficient development of deep eutectic solvents (DES) for LIB cathode leaching. Their approach allows for the rapid identification of new solvents using ML models, enabling significantly more targeted screening of leaching agents than traditional trial-and-error methods.
Looking beyond LIB leaching to other hydrometallurgical leaching applications, ML models were successfully applied to capture the leaching performance of copper from various minerals.
Table 1 summarizes previous studies that have used ML algorithms for leaching modelling, highlighting their specific applications. However, existing research focuses on predictive accuracy for the metal leaching step in isolation, neglecting to showcase its potential when integrated into broader operational frameworks. Embedded within systems that enable simulation, optimization and technical-economic-environmental analysis, ML models can act as predictive engines, simulating process outcomes under different feedstock and operating conditions, thereby enabling more informed process design and control.
| Ref. | Models | Use | Leaching agent | Best model | No. of data points |
|---|---|---|---|---|---|
| Ebrahimzade et al.15 | ANN | Leaching from waste LIBs | H2SO4 + H2O2 | ANN | 685 |
| Niu et al.16 | XGB, RF, SVM, AdaBoost | Leaching from waste LIBs | 25 different acids | XGB | 17 588 |
| Zhou et al.17 | CGAN | DES discovery for LIB leaching | DES | CGAN | 791 |
| Zhang et al.18 | MLR, SVM, DT, RF, ANN, GBR | Pyrolusite leaching | H2SO4 + FeSO4 | SVM | 304 |
| Mathaba and Banza19 | ANN | Co and Cu leaching | H2SO4 | ANN | 32 |
| Flores and Leiva20 | RF, SVM, ANN | Cu leaching | H2SO4 | ANN | 15 581 |
| Daware et al.21 | RF, GBR, SVM, XGB, AdaBoost | Cu leaching from PCBs | Acids with pKa between 6 and 3 | GBR | 1320 |
A significant hurdle to the broader integration of ML tools is the scarcity of readily available, high-quality data. A smaller dataset limits the kinds of models that can be used, with support vector machines, gradient boosting algorithms and random forests being reported as particularly well suited to small datasets.22 However, the use of more data-hungry methods may still be possible by using data augmentation techniques. One example of this approach is the pairwise difference (PD) algorithm described by Tynes et al.23 to improve prediction and uncertainty quantification in chemical search. This meta-algorithm operates on pairs of data points. During training, the model learns to predict differences between all possible pairs of input points. For prediction, test points are paired with all training set points, generating a set of predictions that can be treated as a distribution where the mean is the final prediction, and the dispersion serves as an uncertainty measure. This approach has been shown to reliably improve the performance of the random forest algorithm across various chemical ML tasks.23
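The pairing logic described above can be illustrated with a minimal sketch. It is not the implementation of Tynes et al. or of this work: the pair representation (both feature vectors plus their difference), the class names and the small k-nearest-neighbour base learner (standing in for the RF, GBR or ANN models) are all assumptions chosen to keep the example dependency-free.

```python
from itertools import product
from statistics import mean, pstdev


def pair_features(a, b):
    """Represent a pair of points as (a, b, a - b); this layout is an assumption."""
    return list(a) + list(b) + [ai - bi for ai, bi in zip(a, b)]


class KNNRegressor:
    """Tiny k-nearest-neighbour regressor, a stand-in for the GBR/ANN base models."""

    def __init__(self, k=2):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        predictions = []
        for x in X:
            # Rank stored rows by squared Euclidean distance to the query row.
            ranked = sorted(
                range(len(self.X)),
                key=lambda i: sum((p - q) ** 2 for p, q in zip(self.X[i], x)),
            )
            predictions.append(mean(self.y[i] for i in ranked[: self.k]))
        return predictions


class PairwiseDifferenceRegressor:
    """Hypothetical wrapper: learn y_i - y_j on pairs, predict via all training anchors."""

    def __init__(self, base_model):
        self.base_model = base_model

    def fit(self, X, y):
        self.X_train, self.y_train = list(X), list(y)
        # Augmentation: n points become n**2 ordered pairs with target y_i - y_j.
        pairs = [pair_features(a, b) for a, b in product(self.X_train, repeat=2)]
        diffs = [ya - yb for ya, yb in product(self.y_train, repeat=2)]
        self.base_model.fit(pairs, diffs)
        return self

    def predict(self, X):
        results = []
        for x in X:
            # Pair the query with every training point and add back each anchor's target.
            pairs = [pair_features(x, b) for b in self.X_train]
            diffs = self.base_model.predict(pairs)
            estimates = [yb + d for yb, d in zip(self.y_train, diffs)]
            # Mean = final prediction, population standard deviation = uncertainty.
            results.append((mean(estimates), pstdev(estimates)))
        return results


# Toy data: a target that depends linearly on a single feature.
X_train = [[float(i)] for i in range(10)]
y_train = [2.0 * x[0] for x in X_train]

model = PairwiseDifferenceRegressor(KNNRegressor(k=2)).fit(X_train, y_train)
prediction, spread = model.predict([[4.5]])[0]
```

Because every training point becomes an anchor, the training set grows quadratically, which is the source of both the augmentation benefit and the memory cost of the method.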
In this work, ML approaches were used to model the leaching of Li, Mn, Ni and Co from oxide-based LIB cathode material using a range of organic and inorganic acids. Data was collected by surveying published articles on LIB recycling and/or available data from our laboratory. Experimental conditions and yields were compiled and formatted for consistency. Previously reported approaches were compared with models that exploit pairwise differences (PD) algorithms for data augmentation. Then, we demonstrate how these models can be integrated into a simplified techno-economic and environmental impact assessment tool, thus allowing for rapid screening and optimization of leaching conditions. The best-performing model was integrated into a Python application, the “LIB Leaching Toolkit” and performance was assessed using two case studies. This framework enables not only predicting leaching efficiency, but also considering economic and environmental implications, showcasing more flexible, data-driven models potentially deployable within a digital twin or process development context.
The composition of the LIB cathodes was standardized using the Ni:Mn:Co molar ratio as it is defined in the common general formula for ternary LIB cathodes, LiNixMnyCo1−x−yO2. These values were normalized such that the sum of the Ni, Mn and Co content was equal to 1. Hence, the LIB composition may be characterized by three variables, inputNi, inputMn, and inputCo. For example, NMC111, containing equimolar amounts of all three metals, is represented as inputNi = inputMn = inputCo = 0.33. The reaction conditions themselves were described using five parameters: leaching temperature in °C, acid concentration in mol L−1, hydrogen peroxide concentration in wt%, solid-to-liquid ratio in g L−1 and sampling time in minutes.
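As a small illustration of the normalization described above, the following sketch converts a Ni:Mn:Co molar ratio into the three composition variables. The function name is hypothetical; only the variable names inputNi, inputMn and inputCo come from the text.

```python
def normalize_nmc(ni, mn, co):
    """Scale a Ni:Mn:Co molar ratio so that the three fractions sum to 1."""
    total = ni + mn + co
    if total <= 0:
        raise ValueError("At least one of Ni, Mn or Co must be present.")
    return {"inputNi": ni / total, "inputMn": mn / total, "inputCo": co / total}


nmc111 = normalize_nmc(1, 1, 1)  # equimolar cathode, each fraction is one third
nmc811 = normalize_nmc(8, 1, 1)  # nickel-rich cathode
```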
A comprehensive set of molecular descriptors was employed to represent the different acids in the dataset. These included properties such as the acid dissociation constants (pKa) and the number of available protons. Whenever available, the solubility product (KSP) of the corresponding lithium, nickel, manganese and cobalt salts in water at 20 °C were also included as direct inputs. Additionally, other molecular descriptors were dynamically retrieved from the SMILES string of each acid using the RDKit package.46 A complete overview of the features is available in Table S2 of the SI. This expanded representation, although increasing complexity, provides a more comprehensive description of each acid's behavior compared to previous approaches, which have primarily only used the acid concentration and first dissociation constant.16 This improved representation aims to improve the modelling performance of lesser-represented acids and the overall generalizability of the models.
The compiled leaching data contains 12 different organic and inorganic acids. To allow for appropriate modelling, it was necessary to encode the acid type into a numeric value. The first dissociation constant was chosen for its availability for all the acids of interest. However, picking a single number to differentiate between acids is a simplified approach, limiting the generalization potential of the models, particularly as it fails to differentiate between mono-, di- and tri-protic acids. This limitation was overcome by using the first three dissociation constants, when available, together with other molecular descriptors. Fig. S1 displays the distribution of acids in the working dataset, revealing a pronounced bias towards hydrochloric and sulfuric acids, which collectively account for over 65% of the data points.
000 rpm for 2 minutes. The leachate was decanted, and the solid residue washed with deionized water several times before drying in a 50 °C oven overnight.
A complete digestion of the cathode active material was done using 4 M HCl to enable yield computation. The concentration of lithium was measured using a Mettler Toledo DX207-Li ISE half-cell electrode. Quantification of Ni, Mn and Co in solution was performed using a Bruker S2 Picofox Total Reflection X-ray Fluorescence (TXRF) spectrometer, equipped with a molybdenum X-ray source. The analysis was performed at a voltage of 50 kV and a current of 600 μA. Quartz sample carriers were pre-coated with 10 μL of a solution of silicon in isopropanol (SERVA) and dried at 323 K. Samples were diluted in a 1 wt% polyvinyl alcohol solution and spiked with a known concentration of yttrium standard, adjusted to the metal content of the samples. 10 μL of the diluted and spiked samples were transferred to the preheated carriers and dried at 353 K. TXRF measurement was carried out for 300 s.
Random forest (RF), gradient boosting regression (GBR) and multilayer perceptron (ANN) models were implemented in Python 3.12.3 using the scikit-learn package version 1.5.1.47 The optuna package was used to perform Bayesian optimization of the model hyperparameters instead of a conventional grid search methodology. By sampling within the ranges available in Table S3 of the SI, the optimization algorithm is able to estimate the impact of varying each of the hyperparameters to enable a more targeted and efficient optimization.48 A detailed description of the hyperparameter optimization is provided in the “Hyperparameter optimization” section of the SI. Model selection was performed using the entire set of features under consideration, detailed in Table S2. The models were first compared using performance metrics such as the coefficient of determination (R2), mean absolute error (MAE), median absolute error (MedAE), mean squared error (MSE) and root mean squared error (RMSE). Equations for these metrics are available in Table S4 of the SI.
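For reference, the five metrics can be computed as in the sketch below. It uses the standard textbook definitions, which are assumed (not verified) to match the equations in Table S4; the function name and the example values are illustrative only.

```python
from math import sqrt
from statistics import mean, median


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, MedAE, MSE and RMSE using their standard definitions."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]
    mse = mean(e * e for e in errors)
    y_bar = mean(y_true)
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)  # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": mean(abs_errors),
        "MedAE": median(abs_errors),
        "MSE": mse,
        "RMSE": sqrt(mse),
    }


# Illustrative yields (fractions between 0 and 1), not taken from the dataset.
metrics = regression_metrics([0.2, 0.5, 0.9, 1.0], [0.25, 0.45, 0.8, 1.0])
```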
Further analysis and selection of the models involved performing several statistical tests. An independent-samples Kruskal–Wallis test was used to identify significant differences in the mean absolute error (MAE) between groups (different acids, cathode chemistries, yield ranges). When significant differences were found, pairwise Dunn's tests were then performed to identify the source of these differences. Additionally, Wilcoxon signed-rank tests were conducted to assess if more complete feature sets were significantly better performing than a base set of features.
The optimized scikit-learn models were compressed into .gz files using joblib 1.4.2 for easier distribution and deployment. The code for this project is archived on Zenodo at https://doi.org/10.5281/zenodo.16096943.
The general flow of data within the LIB Leaching Toolkit application is depicted in Fig. 2.
The first step of the LIB Leaching Toolkit is loading the pre-trained PD-GBR model, the training data (required for applying the PD method) and the ‘sample data’, containing the leaching scenarios for which yields will be calculated. The sample data can be generated using the toolkit's built-in tool (as illustrated in Fig. 3), which allows for varying a single parameter across a range of custom, evenly spaced values while holding others constant. Alternatively, users can create sample data by editing a provided template spreadsheet.
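A single-parameter sweep of this kind could look like the following sketch. The function and the dictionary keys are hypothetical and do not reflect the toolkit's actual interface or template column names.

```python
def sweep_parameter(base_conditions, name, start, stop, n_points):
    """Vary one parameter over evenly spaced values while holding the others constant."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2.")
    if name not in base_conditions:
        raise KeyError(f"Unknown parameter: {name}")
    step = (stop - start) / (n_points - 1)
    return [{**base_conditions, name: start + i * step} for i in range(n_points)]


# Hypothetical baseline scenario; the keys mirror the five reaction parameters.
base_conditions = {
    "temperature_C": 60.0,
    "acid_conc_mol_L": 2.0,
    "h2o2_wt_pct": 1.0,
    "solid_liquid_g_L": 50.0,
    "time_min": 60.0,
}
scenarios = sweep_parameter(base_conditions, "temperature_C", 25.0, 85.0, 4)
```

Each returned dictionary is one leaching scenario, so the list maps directly onto the rows of a sample-data spreadsheet.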
The feature set is then expanded by applying the PD algorithm to generate all pairwise feature combinations between the training and sample data. The toolkit then computes predictions for the average and standard deviation of extraction yields for Li, Ni, Mn and Co. The results are stored as a Pandas DataFrame object and exported as a spreadsheet.
Having estimated the extraction yields, the toolkit computes a mass balance to determine the required reagents for the leaching process. All calculations are based on 1000 kg of LIB cathode material. The process begins by calculating the total volume of the diluted acid solution required for a given solid-to-liquid ratio. From this volume and the specified acid concentration (mol L−1), the total moles of acid are determined. The mass of pure acid is then calculated using its molar mass, which is subsequently adjusted to find the required mass of the concentrated stock solution by accounting for its purity. Finally, the volume of water needed for dilution is calculated by subtracting the volume of the concentrated acid from the total volume of the prepared leaching solution. The results are stored in a Pandas DataFrame for further analysis and manipulation.
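The sequence of calculations above can be sketched as follows. The function name, argument names and the use of a stock-solution density to convert mass to volume are assumptions; volumes are treated as additive, which is a simplification. The example values for concentrated sulfuric acid (about 98 wt%, about 1.84 kg L−1) are typical figures, not values taken from the toolkit.

```python
def mass_balance(solid_liquid_g_L, acid_conc_mol_L, molar_mass_g_mol,
                 stock_purity, stock_density_kg_L, cathode_mass_kg=1000.0):
    """Reagent requirements for leaching a fixed mass of cathode material."""
    # Total leaching solution volume from the solid-to-liquid ratio (g of solid per L).
    total_volume_L = cathode_mass_kg * 1000.0 / solid_liquid_g_L
    acid_mol = total_volume_L * acid_conc_mol_L
    pure_acid_kg = acid_mol * molar_mass_g_mol / 1000.0
    # Correct for the purity (mass fraction) of the concentrated stock solution.
    stock_mass_kg = pure_acid_kg / stock_purity
    stock_volume_L = stock_mass_kg / stock_density_kg_L
    # Dilution water, assuming additive volumes.
    water_volume_L = total_volume_L - stock_volume_L
    return {
        "total_volume_L": total_volume_L,
        "acid_mol": acid_mol,
        "pure_acid_kg": pure_acid_kg,
        "stock_mass_kg": stock_mass_kg,
        "stock_volume_L": stock_volume_L,
        "water_volume_L": water_volume_L,
    }


# Example: 2 mol/L sulfuric acid at a solid-to-liquid ratio of 50 g/L.
balance = mass_balance(
    solid_liquid_g_L=50.0,
    acid_conc_mol_L=2.0,
    molar_mass_g_mol=98.08,
    stock_purity=0.98,
    stock_density_kg_L=1.84,
)
```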
Following mass balance calculations, the toolkit performs a simplified economic assessment to estimate the costs associated with each leaching scenario. This includes estimates of the reagent costs as well as the energy needs for heating and mixing, detailed in Tables S6 and S7 of the SI. Other factors like equipment sizing and cost, solvent recovery, and wastewater treatment were omitted to enable a more flexible yet simplified approach that can be more broadly applicable. The reagent cost is calculated directly from the mass of concentrated acid determined in the mass balance, multiplied by its bulk price. The heating cost is estimated by calculating the energy (in kWh) required to raise the temperature of the calculated volume of water from an assumed ambient temperature of 25 °C to the specified reaction temperature, using the specific heat capacity of water. The mixing energy is calculated based on the total slurry volume (liquid + solids), a defined mixing power requirement (in kW m−3), and the total reaction time. Both heating and mixing energy requirements are converted to a monetary cost using a fixed price per kWh. To enable a comparison between different leaching strategies, the costs are normalized by the total mass of Li, Ni, Mn, and Co recovered (calculated using the predicted yields), providing a final metric in euros per kg of metal leached. Economic assessment is essential for evaluating the financial viability of different approaches. Even a simplified analysis allows for preliminary comparison and selection of leaching conditions.
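A minimal sketch of this cost estimate is given below. The function names, prices, mixing power and metal contents are hypothetical placeholders rather than the values in Tables S6 and S7, water density is assumed to be 1 kg L−1, and the yields simply reuse dataset-average figures for illustration.

```python
WATER_HEAT_CAPACITY_KJ_PER_KG_K = 4.186  # specific heat capacity of liquid water
WATER_DENSITY_KG_PER_L = 1.0             # assumed constant over the temperature range


def heating_energy_kwh(water_volume_L, reaction_temp_C, ambient_temp_C=25.0):
    """Energy to heat the water from ambient to the reaction temperature."""
    delta_t = max(reaction_temp_C - ambient_temp_C, 0.0)
    energy_kJ = (water_volume_L * WATER_DENSITY_KG_PER_L
                 * WATER_HEAT_CAPACITY_KJ_PER_KG_K * delta_t)
    return energy_kJ / 3600.0  # 1 kWh = 3600 kJ


def mixing_energy_kwh(slurry_volume_m3, mixing_power_kW_per_m3, reaction_time_min):
    """Energy to stir the slurry for the full reaction time."""
    return slurry_volume_m3 * mixing_power_kW_per_m3 * reaction_time_min / 60.0


def cost_per_kg_metal(stock_acid_kg, acid_price_eur_per_kg, energy_kwh,
                      electricity_price_eur_per_kwh, metal_content_kg, predicted_yields):
    """Total reagent and energy cost divided by the mass of metal leached."""
    recovered_kg = sum(metal_content_kg[m] * predicted_yields[m] for m in metal_content_kg)
    if recovered_kg <= 0:
        raise ValueError("No metal is recovered under these conditions.")
    total_cost = (stock_acid_kg * acid_price_eur_per_kg
                  + energy_kwh * electricity_price_eur_per_kwh)
    return total_cost / recovered_kg


# Placeholder scenario for 1000 kg of cathode material (all numbers illustrative).
energy = heating_energy_kwh(17800.0, 60.0) + mixing_energy_kwh(20.5, 0.5, 60.0)
cost = cost_per_kg_metal(
    stock_acid_kg=4000.0,
    acid_price_eur_per_kg=0.10,            # hypothetical bulk price
    energy_kwh=energy,
    electricity_price_eur_per_kwh=0.15,    # hypothetical fixed price
    metal_content_kg={"Li": 72.0, "Ni": 203.0, "Mn": 190.0, "Co": 204.0},  # approx. NMC111
    predicted_yields={"Li": 0.78, "Ni": 0.64, "Mn": 0.58, "Co": 0.57},
)
```

Normalizing by recovered metal rather than by treated cathode mass means that a cheap but low-yield condition is not automatically favoured.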
Using the yields and mass balance results, the toolkit can perform a simplified environmental impact assessment. The amount of acid used in each scenario is multiplied by a set of environmental impact assessment parameters, commonly available in life-cycle assessment (LCA) databases such as Ecoinvent. These parameters quantify the potential environmental impacts associated with the production of acids considered in this study: acetic acid, ascorbic acid, citric acid, formic acid, hydrochloric acid, lactic acid, nitric acid, oxalic acid, and sulfuric acid. The impact metrics aim to capture the impacts on climate change, broader ecosystem and human health effects, and resource use. It is important to note that the parameters provided are only illustrative examples and should be updated by the user with relevant LCA data for their specific context. Environmental impact assessment provides valuable insights into the sustainability of leaching processes. By integrating this assessment, the toolkit places environmental considerations at the core of process development.
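The multiplication step can be sketched in a few lines. The category names and factor values below are placeholders invented for illustration; they are not Ecoinvent data and would need to be replaced with real characterization factors.

```python
def environmental_impact(acid_mass_kg, impact_factors_per_kg):
    """Scale per-kilogram impact factors by the mass of acid consumed."""
    return {category: acid_mass_kg * factor
            for category, factor in impact_factors_per_kg.items()}


# Placeholder factors per kg of acid (hypothetical, for illustration only).
PLACEHOLDER_FACTORS = {
    "climate_change_kg_CO2_eq": 0.1,
    "resource_use_MJ": 2.0,
}
impacts = environmental_impact(4000.0, PLACEHOLDER_FACTORS)
```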
Finally, the toolkit facilitates a comparative analysis of leaching scenarios. This is achieved by performing pairwise t-tests to determine statistical differences between leaching conditions. These enable the user to identify which variations in process parameters lead to significant changes in leaching performance. Paired with environmental and cost data, this becomes an invaluable tool to minimize cost and environmental impact. The toolkit also calculates selectivities and enrichment factors to aid process tuning and selection. The toolkit provides visualization tools in the form of scatter and bar plots and heatmaps and allows exporting results as spreadsheets and image files, exemplified in Fig. 4. These visualizations allow intuitive interpretation of the data in formats that researchers and engineers are familiar with, once again bridging the gap between computer and chemical engineering and process development.
1:1:1 ratio, characteristic of NMC111, and the modal acid type corresponds to a pKa1 of −8.0 (hydrochloric acid, HCl). This suggests a strong bias towards NMC111 leaching with HCl within the dataset. Additionally, the close agreement between the average and mode of inputNi, inputMn and inputCo reinforces the potential overrepresentation of NMC111 compared to other NMC types. The overrepresentation could lead to models biased towards NMC111 leaching behavior and limit the potential generalizability to other cathode chemistries. This hypothetical limitation is compounded by the simplified representation of the chemical system within the data. While these parameters are the conditions one would expect to control in a leaching experiment, they offer a simplistic view of the chemical environment and of the solid substrate. It means that the leaching model might not be able to effectively differentiate the performance of different acids, namely the di- and tri-protic acids included in the data set. These potential issues were addressed by using RDKit to retrieve a comprehensive set of chemical descriptors, expanding the number of features used to characterize each leaching environment, as detailed in section 3.2. Additionally, it would be possible to employ selective resampling techniques such as SMOTE to combine oversampling of less represented leaching conditions and under-sampling of NMC111/HCl data points to build a more evenly distributed data set.50
Analyzing the leaching yield statistics, presented in Table S9, reveals that while complete extraction is possible for all metals of interest (mode and maximum yields of 1.00), the average extraction within the dataset varies significantly. Lithium exhibits the highest average extraction (xLi = 0.78), suggesting it leaches more readily under the acidic test conditions in the set than the transition metals (xNi = 0.64, xMn = 0.58, xCo = 0.57). The disparity in the average leaching efficiencies highlights the need to optimize recycling methodologies, particularly to maximize manganese and cobalt extraction.
The potential of linear relationships between the selected features and metal leaching yields was assessed by computing the Pearson correlation coefficients. The coefficients, ranging from −1 to +1, quantify the strength and direction of the linear relation between each feature and target. Values near the extremes indicate strong linear correlation, while values close to zero suggest a lack of linear relationship. As depicted in Fig. 5, the coefficients reveal predominantly weak linear correlations between the features and targets. The absence of strong correlations showcases the necessity of employing more sophisticated, non-linear modeling techniques to accurately capture the complex interplay of factors governing the leaching process.
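The sketch below computes the Pearson coefficient from its definition and illustrates why a value near zero does not rule out a relationship: a perfectly quadratic dependence yields no linear correlation. The data are synthetic and the function name is illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


feature = [-2.0, -1.0, 0.0, 1.0, 2.0]
linear_target = [2.0 * v + 1.0 for v in feature]  # perfectly linear dependence
quadratic_target = [v ** 2 for v in feature]      # strong but non-linear dependence

r_linear = pearson_r(feature, linear_target)
r_quadratic = pearson_r(feature, quadratic_target)
```

Here r_linear is essentially 1 while r_quadratic is 0, even though the second target is fully determined by the feature.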
Having analyzed the input data, the subsequent phase focused on the preparation of the dataset for the machine learning modeling process. Fig. 6 displays a comparison between the entire dataset and the data that was set aside for testing, confirming that the latter is indeed representative of the dataset. The distributions of the features in both the full dataset and the test set exhibit similar medians, interquartile ranges and overall spread. The test split preserves the underlying data characteristics, ensuring that the model evaluation is conducted on a sample that reflects the original data distribution.
Model performance was assessed using the coefficient of determination (R2), mean squared error (MSE) and mean absolute error (MAE). The training statistics, available in Table S10 of the SI, show an adequate fit of the RF and GBR models, with R2 values of 0.888 and 0.958, respectively. The ANN model was less effectively trained, with an R2 of 0.711. While ANNs are powerful, versatile models, they typically require a large amount of data to be trained effectively. With a limited dataset of 776 points, the model likely did not have enough information to reliably learn its parameters and identify the underlying patterns in the data. However, these results highlight the advantages of data augmentation using the PD approach: the PD-ANN models were more effectively trained than their more basic counterparts (training R2 increased from 0.711 to over 0.946). The PD-GBR model achieved the best fit to the training data, with an R2 of 0.997. It was not possible to assess the performance of the PD-RF model, as the model training exhausted the resources of the machines available for training, illustrating the potential inadequacy of RF to work with large datasets (776² points). It is noteworthy that applying the PD algorithm to augment the dataset allowed a better fit of both the ANN and GBR models.
Analyzing the test statistics in Table 2 to compare the models, it is clear that the GBR and PD-GBR models are the best suited to the modelling task, with a testing R2 of 0.914 and 0.933 respectively and comparatively low error statistics. The RF and PD-ANN models perform similarly, with R2 of around 0.803 in both cases. The poorly trained ANN model falls behind in testing, with an R2 of 0.684. Considering the similar performance of the GBR and PD-GBR models, the PD version was chosen to continue this work. The increased complexity and model size are outweighed by the slightly increased training and testing performance and the ability to provide error estimates. A comparison of experimental and PD-GBR predicted results in Fig. 7 confirms the goodness of fit, with points evenly distributed around the diagonal line, indicating a lack of systematic bias in the predictions across the range of leaching yields and elements. The experimental vs. predicted plots for the remaining models are available in Fig. S2 of the SI.
Fig. 7 Plot comparing the experimental and predicted leaching yields of the various LiNixMnyCo1−x−yO2 materials using the PD-GBR model for all tested acids.
| Model | R2 | MAE | MedAE | MSE | RMSE |
|---|---|---|---|---|---|
| RF_full | 0.803 | 0.096 | 0.074 | 0.018 | 0.134 |
| GBR_full | 0.914 | 0.062 | 0.041 | 0.008 | 0.090 |
| ANN_full | 0.684 | 0.132 | 0.104 | 0.031 | 0.174 |
| PD-GBR_full | 0.933 | 0.051 | 0.028 | 0.006 | 0.079 |
| PD-ANN_full | 0.804 | 0.093 | 0.064 | 0.018 | 0.134 |
In addition to the features that were explicitly included in the dataset, several properties were retrieved based on the acid SMILES string using the RDKit package version 2024.09.06 for Python.51 The goal was to improve the description of the leaching environment and potentially extend the predictive capabilities of the models to acids beyond those in the original dataset. Due to the substantial number of potential features available through this package, a descriptor selection and grouping were performed:
• Basic molecular properties, including the reaction conditions and number of atoms.
• Properties related to Lipinski's rule of five for drug discovery: octanol–water partition coefficient (log P), topological polar surface area (TPSA), the number of hydrogen bond donors and acceptors and the molar mass.
• Charge properties, represented by the maximum and minimum partial charge on any atom of the molecule.
• Polarizability of the molecule, represented by its molar refractivity.
• Molecular structure, quantifying the presence of specific groups such as carboxylic acid, C–O bonds, primary amine, tertiary amine, halogen atoms, aliphatic hydroxyl groups, and the number of sulfur atoms.
A series of tests were performed, training PD-GBR models with different feature sets to assess the impact of different features on model performance. Table S11 provides an overview of the model-feature pairs evaluated, starting with the inclusion of all available features and progressively dropping groups. Model training was performed as described above.
The training statistics, detailed in Table S12, demonstrate a consistently strong fit for the PD-GBR models to the training data. All models achieved R2 values exceeding 0.97 and low error metrics, indicating their ability to capture the underlying patterns in the data. The PD-GBR models maintain good performance on the unseen test data (Table S13). The PD-GBR-full model, containing all features, yielded the best test metrics with an R2 of 0.9332 and MAE of 0.0513, showing very good predictive performance. Notably, the other feature sets closely follow this model performance-wise. Highlights include the feature set that excludes log P, TPSA, number of hydrogen bond donors and acceptors and molecular weight, as well as the set that excludes the maximum and minimum partial charge and molar refractivity of the acid molecules. These two sets achieve an R2 of 0.9296 and 0.9297, respectively. Interestingly, comparing the simplest set, containing only basic experimental parameters, with the full set revealed only a moderate penalty in model performance. The simpler set yielded an R2 of 0.9222 and MAE of 0.0566, compared to the full set's R2 of 0.9332 and MAE of 0.0513.
While the remarkably similar performance between sets suggests that a simpler feature set performs equally well on the current dataset, a strategic choice was made to use the full feature set. This more comprehensive description of the chemical environment is aimed at enhancing the model's potential generalizability to acid chemistries outside of the training set, at the cost of additional model complexity and training time. Consequently, the remainder of this work employs the PD-GBR-full model, using the features listed in Table S2.
Beyond acid type, the influence of the feature set on the predictive accuracy was also examined. Wilcoxon signed-rank tests were performed to assess if the observed performance variations could be attributed to the more comprehensive feature set. These tests compared the MAE of prediction for each acid group using the full feature set against a base set, which included only basic experimental parameters and acid descriptors. Crucially, Fig. 8 and the test statistics presented in Table S15 show that the only group where the base set exhibited a significantly higher MAE was the “other” acids category (p = 0.016). No significant differences were found for the other acid groups. While the expanded feature set does not offer an advantage for the most common inorganic acids in the dataset, its inclusion proves helpful in describing less represented acids (primarily more complex organic acids).
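The paired comparison described above can be illustrated with a minimal, standard-library sketch of a two-sided Wilcoxon signed-rank test. It assumes that the pairing is done on per-sample absolute errors from the two feature sets, which the text does not state explicitly. It uses a normal approximation without tie or continuity correction, so it is only indicative for small samples. The function names and the error values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ascending ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def wilcoxon_signed_rank(errors_full, errors_base):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Zero differences are dropped; no tie or continuity correction is applied.
    Returns (W+, W-, p-value).
    """
    diffs = [a - b for a, b in zip(errors_full, errors_base) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w_plus, w_minus, p_value


# Hypothetical absolute errors for one acid group (not data from this work).
errors_full = [0.02, 0.03, 0.05, 0.04, 0.01, 0.06, 0.02, 0.03]
errors_base = [0.05, 0.07, 0.06, 0.09, 0.03, 0.08, 0.08, 0.04]
print(wilcoxon_signed_rank(errors_full, errors_base))
```

For real analyses, an established statistics package with exact small-sample p-values would be preferable to this approximation.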
To better assess the model's robustness, its predictive capability was evaluated across different experimental yield ranges. The data was divided into three groups according to the experimental yields: the first one for yields under 0.3 (MAE = 0.043), the second one for yields between 0.3 and 0.7 (MAE = 0.058), the third one for yields above 0.7 (MAE = 0.050). A Kruskal–Wallis test, performed on the absolute error of each estimation, indicated very significant differences between groups (p < 0.001). A post hoc pairwise Dunn's test was performed, whose results are available in Table S16, demonstrating that there are significant differences between the group with low yields (y < 0.3) and the other two. Surprisingly, the PD-GBR model appears to estimate lower yields slightly more accurately than higher yields. Another aspect of the model's generalizability is its performance across different target metals. An independent-samples Kruskal–Wallis test was performed to assess whether the predictive performance differs between the target metals (Li, Mn, Co, and Ni) under consideration. A p-value of 0.668 was obtained, indicating no significant differences between the MAE for the target metals.
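As an illustration of the yield-range analysis, the sketch below splits absolute errors into the three yield groups, computes the MAE of each group, and evaluates the Kruskal–Wallis H statistic. It is a simplified version under stated assumptions: the boundary values 0.3 and 0.7 are assigned to the middle group, no tie correction is applied, and the p-value uses the chi-square approximation, which for three groups (two degrees of freedom) reduces to exp(−H/2). All function names and numbers are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ascending ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def split_errors_by_yield(y_true, y_pred, low=0.3, high=0.7):
    """Group absolute errors by experimental yield range."""
    groups = {"low": [], "mid": [], "high": []}
    for yt, yp in zip(y_true, y_pred):
        key = "low" if yt < low else ("mid" if yt <= high else "high")
        groups[key].append(abs(yt - yp))
    return groups


def kruskal_wallis_three_groups(groups):
    """H statistic (no tie correction) and chi-square p-value for 3 groups."""
    if len(groups) != 3:
        raise ValueError("closed-form p-value below assumes exactly 3 groups")
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    return h, math.exp(-h / 2)  # survival function of chi-square, 2 d.o.f.


# Hypothetical experimental and predicted yields (not data from this work).
y_true = [0.10, 0.20, 0.25, 0.40, 0.50, 0.60, 0.80, 0.90, 0.95]
y_pred = [0.11, 0.22, 0.28, 0.48, 0.40, 0.72, 0.85, 0.96, 0.88]
groups = split_errors_by_yield(y_true, y_pred)
for name, errs in groups.items():
    print(name, "MAE =", round(sum(errs) / len(errs), 3))
print(kruskal_wallis_three_groups(list(groups.values())))
```

A significant H only indicates that at least one group differs; identifying which pairs differ requires a follow-up pairwise test such as Dunn's test, which is not implemented here.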
To complete the analysis, the model's performance was examined across different cathode chemistries. The predictions for NMC333 (MAE = 0.046), NMC622 (MAE = 0.102), NMC811 (MAE = 0.028) and remaining chemistries (MAE = 0.051) were compared. A p-value of under 0.001 in the independent-samples Kruskal–Wallis indicates there are differences between cathode chemistries, which were probed using pairwise Dunn's tests. The pairwise comparisons, detailed in Table S17, reveal several significant differences. NMC811, which exhibited the lowest MAE, showed significantly better predictions compared to the “other” chemistries (p-value = 0.044) and NMC622 (p-value < 0.001), which had the highest MAE. The difference between NMC811 and NMC333 was not found to be significant. Similarly, the predictions for NMC333 were significantly better than NMC622 (p-value < 0.001) but not significantly different from the “other” chemistries. The “other” chemistries also showed better predictions than NMC622 (p-value < 0.001). Collectively, these results indicate that the model's predictive performance varies significantly across cathode chemistries. NMC811 and NMC333, representing over 55% of the testing data, generally perform better or similarly to each other and the “other” group, while NMC622, around 11% of the testing data, consistently obtains significantly higher prediction errors compared to the remaining categories.
The leaching kinetics were fitted to four different kinetic models using a least squares methodology: a linear model ($x = kt$), the shrinking core model (SCM) with chemical reaction control ($1 - (1 - x)^{1/3} = kt$), the SCM with product layer diffusion control ($1 - 3(1 - x)^{2/3} + 2(1 - x) = kt$) and the SCM with a combination of film diffusion and chemical reaction control.52 A comparative study, detailed in Table S18, revealed that the mixed model provided the best fit for the data. This model was used to determine the rate constant ($k$) for each metal at each temperature from the slope of the linearized plot. The Arrhenius plot in Fig. 9 was constructed by plotting the natural logarithm of the rate constant against the reciprocal of the absolute temperature. From the slope of this plot, equal to $-E_a/R$, the activation energies were calculated, as presented in Table 3. The calculated activation energies are consistent with reported values for similar leaching systems, further validating the predictive capabilities of the PD-GBR model that was developed.53,54 A direct comparison with literature values is provided in Table 3.
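The linearized fitting step can be sketched as follows. Each kinetic model is written as g(x) = kt, so k is the slope of a least-squares line through the origin and the models can be compared by their coefficient of determination. Only the three models whose expressions are given explicitly above are included; the mixed-control model is omitted because its expression is not reproduced here. The function names, the through-origin fit, the R² convention and the synthetic data are assumptions made for illustration.

```python
def g_linear(x):
    return x


def g_reaction(x):
    # SCM, chemical reaction control: 1 - (1 - x)^(1/3) = kt
    return 1 - (1 - x) ** (1 / 3)


def g_diffusion(x):
    # SCM, product layer diffusion control: 1 - 3(1 - x)^(2/3) + 2(1 - x) = kt
    return 1 - 3 * (1 - x) ** (2 / 3) + 2 * (1 - x)


MODELS = {
    "linear": g_linear,
    "chemical reaction control": g_reaction,
    "product layer diffusion control": g_diffusion,
}


def fit_through_origin(times, conversions, g):
    """Least-squares slope k of g(x) versus t (no intercept) and its R^2."""
    y = [g(x) for x in conversions]
    k = sum(t * yi for t, yi in zip(times, y)) / sum(t * t for t in times)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - k * t) ** 2 for t, yi in zip(times, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return k, 1 - ss_res / ss_tot


def best_model(times, conversions):
    """Return (model name, k, R^2) for the model with the highest R^2."""
    fits = {name: fit_through_origin(times, conversions, g)
            for name, g in MODELS.items()}
    name = max(fits, key=lambda m: fits[m][1])
    return name, fits[name][0], fits[name][1]


# Synthetic conversions generated from reaction control with k = 0.005 min^-1.
times = [10, 20, 40, 60, 90]  # min
conversions = [1 - (1 - 0.005 * t) ** 3 for t in times]
print(best_model(times, conversions))
```

Because the data here are generated from the reaction-control expression, that model is recovered; with real or simulated leaching curves the ranking depends on the system.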
Fig. 9 Arrhenius plot for the leaching of Li, Ni, Mn, and Co from NMC111 cathodes between 25–80 °C. The slopes of the linear regression are used to calculate the activation energies (Ea) presented in Table 3. The simulated kinetics data used to compute the kinetic parameters are available in Fig. S3 of the SI.
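The Arrhenius analysis amounts to an ordinary least-squares fit of ln k against 1/T, with the activation energy obtained from the slope as Ea = −slope × R. The sketch below shows this calculation; the function name and the rate constants are hypothetical and are generated from an assumed Ea of 50 kJ mol−1 purely to check that the fit recovers it.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def arrhenius_fit(temps_celsius, rate_constants):
    """Fit ln k = ln A - Ea/(R T); return (Ea in kJ/mol, pre-exponential A)."""
    inv_t = [1.0 / (t + 273.15) for t in temps_celsius]
    ln_k = [math.log(k) for k in rate_constants]
    n = len(inv_t)
    mean_x = sum(inv_t) / n
    mean_y = sum(ln_k) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inv_t, ln_k))
    sxx = sum((x - mean_x) ** 2 for x in inv_t)
    slope = sxy / sxx                  # equals -Ea/R
    intercept = mean_y - slope * mean_x  # equals ln A
    return -slope * R / 1000.0, math.exp(intercept)


# Synthetic rate constants from an assumed Ea = 50 kJ/mol and A = 1e5.
temps = [25, 40, 60, 80]  # degrees Celsius
ks = [1e5 * math.exp(-50_000 / (R * (t + 273.15))) for t in temps]
ea_kj_mol, pre_exp = arrhenius_fit(temps, ks)
print(round(ea_kj_mol, 3), pre_exp)
```

With only a handful of temperatures, the fitted Ea is sensitive to each rate constant, which is one reason reported activation energies can differ between studies.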
However, reported activation energies can vary significantly between studies. This variation may stem from differences in the methodologies used to determine the rate constant and whether and how distinct dissolution stages are considered. Furthermore, the morphology of the cathode material plays a key role in determining if the process is limited by diffusion or chemical reaction constraints.38,55 It is also crucial to consider that the dataset at the basis of the model contains data from pristine cathode material, battery cathode, and isolated oxides without binders. Consequently, the model's prediction of activation energy likely represents a generalized kinetic behavior, rather than the kinetics of a single, idealized material, reflecting the mixed nature of the source data.
The Toolkit was used to predict yields and calculate impact metrics for a set of six different acids. The tests were conducted under fixed conditions: NMC111, 40 °C, 50 g L−1, 60 min leaching, 1 mol L−1 of acid. Environmental impacts and costs were calculated for 1000 kg of cathode material. Keeping the experimental conditions fixed isolates the effect of the different acids on leaching and on the environmental and economic impact. Fig. 10 displays the predicted yields and standard deviation for the leaching scenarios outlined. The results indicate that hydrochloric and ascorbic acid are predicted to be the most effective leaching agents under these conditions. Despite the common perception of organic acids as greener than inorganic ones, it is crucial to also take economic and environmental considerations into account. Fig. 11 ranks each acid according to the environmental impacts of its production, including upstream impacts, calculated for the quantity needed in each leaching scenario using the Environmental Footprint 3.1 impact assessment method.56 Taking this into account, the organic acids perform worse than hydrochloric and sulfuric acid across all impact categories for the leaching scenarios considered. This can be attributed to the production of organic acids often relying on inorganic acids. Citric and lactic acid, for example, are produced first as calcium citrate and calcium lactate through fermentation processes. One pathway to recover the acids is the addition of sulfuric acid to precipitate calcium sulfate, leaving the organic acid in solution.57,58
Fig. 10 Metal leaching yields from NMC111 predicted by the developed PD-GBR model for fixed conditions: 40 °C, 50 g L−1, 60 min leaching, 1 mol L−1 of acid.
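To make the scale of these fixed conditions concrete, the following sketch works through the arithmetic for 1000 kg of cathode material at 50 g L−1 and 1 mol L−1 of acid. It is not the Toolkit's implementation: it assumes that the solid-to-liquid ratio refers to grams of cathode per litre of solution, that the solution has the density and heat capacity of water, and that heating starts from 25 °C. Only three of the acids are shown, with standard molar masses.

```python
CATHODE_KG = 1000.0
SOLID_TO_LIQUID_G_PER_L = 50.0
ACID_MOL_PER_L = 1.0

# Standard molar masses in g/mol.
MOLAR_MASS_G_PER_MOL = {
    "hydrochloric acid": 36.46,
    "sulfuric acid": 98.08,
    "citric acid": 192.12,
}


def leach_volume_l(cathode_kg, solid_to_liquid_g_per_l):
    """Solution volume needed to treat the cathode mass at the given ratio."""
    return cathode_kg * 1000.0 / solid_to_liquid_g_per_l


def acid_mass_kg(volume_l, acid_mol_per_l, molar_mass_g_per_mol):
    """Mass of pure acid contained in the leaching solution."""
    return volume_l * acid_mol_per_l * molar_mass_g_per_mol / 1000.0


def heating_energy_kwh(volume_l, t_start_c, t_end_c,
                       density_kg_per_l=1.0, cp_kj_per_kg_k=4.18):
    """Sensible heat to warm the solution, assuming water-like properties."""
    energy_kj = volume_l * density_kg_per_l * cp_kj_per_kg_k * (t_end_c - t_start_c)
    return energy_kj / 3600.0


volume = leach_volume_l(CATHODE_KG, SOLID_TO_LIQUID_G_PER_L)
print("solution volume (L):", volume)
for acid, molar_mass in MOLAR_MASS_G_PER_MOL.items():
    print(acid, "(kg):", round(acid_mass_kg(volume, ACID_MOL_PER_L, molar_mass), 1))
print("heating 25 -> 40 C (kWh):", round(heating_energy_kwh(volume, 25.0, 40.0), 1))
```

At equal molar concentration, the mass of acid required scales with its molar mass, so a heavier organic acid such as citric acid is consumed in a much larger mass than hydrochloric acid for the same scenario.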
The estimated reagent, mixing and heating costs, available in Fig. 12, are consistent with the results of Fig. 11, showing that inorganic hydrochloric and sulfuric acids have significantly lower costs when compared to the selected organic acids. This suggests that, for the purposes of NMC leaching, inorganic acids like hydrochloric and sulfuric acid might be advantageous from both an economic and an environmental perspective, in apparent contradiction with the greener perception of organic leaching agents. Despite the limitations of this simplified comparative approach, the relative economic and environmental impact assessment is strongly supported by recent LCA studies, revealing that inorganic acids can remain more environmentally viable options for LIB cathode recycling compared to organic acids, primarily due to the lower quantity of reagents needed.59,60 These works also identified sulfuric acid as having a lower environmental impact than lactic, ascorbic, or citric acid, whilst acetic acid was also evaluated as a better alternative to these within organic acids.
Fig. 12 Costs per kg of metal leached from NMC in thousands of euros (all price inputs are summarized in Tables S6 and S7 of the SI).
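One plausible way to normalize a scenario cost per kg of metal leached is sketched below. It assumes ideal NMC111 stoichiometry (LiNi1/3Mn1/3Co1/3O2), computes the metal mass fractions from standard atomic masses, and divides a total cost by the mass of Li, Ni, Mn and Co brought into solution. The exact normalization used for Fig. 12 may differ, and the cost and yield values are placeholders rather than inputs from Tables S6 and S7.

```python
ATOMIC_MASS = {"Li": 6.94, "Ni": 58.693, "Mn": 54.938, "Co": 58.933, "O": 15.999}

# Ideal NMC111 stoichiometry, assumed for illustration.
NMC111 = {"Li": 1.0, "Ni": 1 / 3, "Mn": 1 / 3, "Co": 1 / 3, "O": 2.0}


def metal_mass_fractions(stoich):
    """Mass fraction of each metal (oxygen excluded) in the cathode oxide."""
    molar_mass = sum(ATOMIC_MASS[el] * n for el, n in stoich.items())
    return {el: ATOMIC_MASS[el] * n / molar_mass
            for el, n in stoich.items() if el != "O"}


def cost_per_kg_metal(total_cost_eur, cathode_kg, yields, stoich=NMC111):
    """Total scenario cost divided by the mass of metal leached."""
    fractions = metal_mass_fractions(stoich)
    leached_kg = cathode_kg * sum(fractions[m] * yields[m] for m in fractions)
    return total_cost_eur / leached_kg


# Placeholder inputs: a 5000 EUR scenario and hypothetical predicted yields.
yields = {"Li": 0.90, "Ni": 0.80, "Mn": 0.60, "Co": 0.80}
print(metal_mass_fractions(NMC111))
print(round(cost_per_kg_metal(5000.0, 1000.0, yields), 2), "EUR per kg metal")
```

Normalizing by metal leached rather than by cathode treated penalizes acids with low predicted yields, since the same reagent and heating cost is spread over less recovered metal.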
However, it is important to stress that the “LIB Leaching Toolkit” should only serve as a preliminary screening tool and does not substitute for a comprehensive LCA or detailed economic analysis. The environmental impact estimates are calculated based on quite narrow process boundaries, considering only the production of the acid needed for each leaching scenario. Additional environmental impacts during leaching, such as gaseous emissions of Cl2 when using HCl, and the recovery and/or neutralization of the acids downstream of the leaching step are not considered in the calculations. Finally, the conclusions may not extrapolate to an overall hydrometallurgical process, as the choice of lixiviant and process integration influences subsequent separation units and the effluent volumes generated.
As shown in Fig. 13, the model generally presents good agreement with experimental results for lithium across most acid systems. Predictions for MSA and PHA, two acids with functional groups not included in the training set, and for OXA, a poorly represented acid in the training set whose corresponding transition metal complexes are poorly soluble, are remarkably close to the experimental values. Leaching results for the HCl system are underestimated but within the error margin, whilst those for the GLU system are overestimated. The zwitterionic GLY system was notably poorly characterized by the model, being significantly different in nature from the acids included in the training set. This is somewhat expected given the more complex pH-dependent speciation of amino acids and their self-buffering effect. For example, whilst the first dissociation constant input of GLY in the model was pKa = 2.3, the experimentally measured pH of the leach solution was significantly higher at 6.2. Such a discrepancy suggests that additional parameters might be required to properly capture leaching using zwitterions. Importantly, the experimental results for GLY are in accordance with previous works that report poor leaching yields in the absence of an additional reducing agent.61
The prediction of lithium, nickel, and cobalt yield for almost all acids is in reasonable to excellent agreement with experimental results. The two notable exceptions are GLY and OXA, which are both problematically overestimated. As discussed in the case of OXA, the poorly soluble nature of the corresponding transition metal oxalate complexes is likely contributing to the observed overestimation. Whilst the solubility products of the respective acid salts were included as model inputs when available, their weighting needs to be reconsidered to better separate “leaching performance” from the final metal concentration in solution for acids that are likely to leach the metals and then provoke their precipitation.
Unfortunately, manganese leaching yield predictions were considerably overestimated for all acids except those included in the training set (HCl). This consistent overestimation could indicate systematic bias in the model. These results appear to contradict the analysis discussed earlier, which showed that the PD-GBR model was not significantly worse at predicting manganese yields than those of the other metals for acids in the training set. In fact, those results show a lower MAE in lower yield systems, which should lead to improved manganese predictions, considering that Mn leaching yields are on average lower than those of the other metals. In addition to the unique redox behavior of manganese (discussed below), which complicates the prediction, a further contribution to the systematic overestimation of the leaching yield may be the available data used during training. For Mn, 40% of the data pertains to a leaching yield of 0.75 or higher, whereas 80% of the data shows a yield of over 0.25. As the model developed herein is entirely data-driven, this bias in the data could explain the difficulty of predicting low manganese yields, especially in completely untested systems such as the ones selected for this case study. This problem is exacerbated for less manganese-rich cathode chemistries, as data points with 0.2 lithium equivalent amounts of manganese or less show a statistically significantly higher (p < 0.001) average manganese yield (0.71) than the general population of data points (0.58). Future work on the model refinement will seek to address this gap in the data.
Another possible explanation lies in the distinctive behavior of manganese during the leaching of battery cathodes. After an initial “self-regulating” step where the transition metals in the LIB cathode exhibit similar leaching behavior, the dissolution enters a second stage where manganese in solution decreases, resulting in an atypical leaching behavior when compared with the other transition metals. The underlying cause is the occurrence of side reactions, such as the disproportionation or oxidation of Mn2+ ions and the precipitation of higher-valence MnxOy species. These reactions lead to surface reorganization and the formation of new manganese phases, including metastable birnessite and subsequently γ-type manganese oxide. As a result, a manganese-rich core–shell structure forms, driven by the presence of divalent manganese in the solution.55 A possible approach to capture the irregular solubility of Mn with leaching time is to include more chemical descriptors, such as the oxidation reduction potential of the solutions, to help establish more extrapolation points for the machine learning model. Unfortunately, there is currently a lack of such data that precludes its inclusion.
Results from Fig. 13 suggest that while the model and feature set offer some generalizability to acids outside the training data (as demonstrated by the MSA and PHA results), care must be taken to validate the predictions, especially for manganese-rich chemistries. Nevertheless, the PD-GBR model developed herein proved a worthwhile tool for preliminary studies of acids outside the training set, if not for a definitive, accurate prediction of leaching experiments.
Beyond modeling and prediction, this work bridges the gap between data science and chemical engineering, by integrating the ML models into a user-friendly toolkit. This allows for evaluation and optimization of reaction conditions, by offering both yield predictions and preliminary economic and environmental impact assessments. For these assessments, the cost metrics focused on reagent, mixing and heating expenses. Similarly, the environmental impact assessment is limited to the impact of acid production and does not extend to downstream wastewater treatment, which would be crucial in a full LCA. Albeit simplified, an integrated approach such as this one is invaluable for streamlining the development of sustainable LIB processes. Additionally, the ability to quickly simulate process outcomes under varying conditions, as demonstrated in the case studies, highlights the potential of this kind of approach for faster iteration and process optimization, enabling more informed process design and control for recycling plants.
While the models presented herein generally exhibit good agreement with experimental data, limitations in the generalization to novel conditions were observed and highlight the need to validate all modelling tools for a specific purpose. For example, for acids not included in the training set, a good predictive performance was observed for MSA and PHA, but the model struggled with other more diverse leaching agents like zwitterionic glycine. These findings underscore the need for more comprehensive and diverse datasets to improve model performance, robustness and generalizability prior to its broader application in process development and digital twin settings. Future efforts will focus on addressing these biases as well as capturing more relevant industrial conditions. This includes the influence of copper and aluminum ion impurities, known to impact the redox leaching of black mass, as well as leaching yields in mixed acid solutions.62
Although preliminary, this work emphasizes the importance of integrating emerging computational tools into the development of greener, better metal recycling processes. The LIB Leaching Toolkit serves as an important preliminary study tool for agile screening and optimization of leaching conditions, paving the way for more efficient and environmentally conscious battery recycling. The natural continuation of this work would be to expand the dataset to include a greater range of cathode materials and acids, particularly focusing on real waste streams and possible contaminants, which were disregarded for this work.
Supplementary information (SI) is available. See DOI: https://doi.org/10.1039/d5gc04752h.
This journal is © The Royal Society of Chemistry 2026