Open Access Article
Shayan Mousavi Masouleh ab, Corey A. Sanz c, Ryan P. Jansonius c, Cara Cronin c, Jason E. Hein c and Jason Hattrick-Simpers *a
aCanmet MATERIALS, Natural Resources Canada, 183 Longwood Rd S, Hamilton, ON, Canada. E-mail: jason.hattrick-simpers@nrcan-rncan.gc.ca
bClean Energy Innovation, National Research Council of Canada, 2620 Speakman Dr, Mississauga, ON, Canada
cTelescope Innovations, 301-2386 E Mall, Vancouver, BC, Canada
First published on 1st October 2025
As the demand for high-purity lithium surges, primarily fueled by the adoption of electric vehicles (EVs), the need for cost-effective extraction and purification technologies intensifies. The Smackover Formation in southern Arkansas, recently identified as one of the world's largest lithium resources, offers vast potential. This formation is part of a broader array of lithium resources across North America, many of which are lower in lithium grade than renowned sources such as South American brines. These alternative formations, while presenting significant opportunities, require innovative purification techniques to make their exploitation economically viable. Continuous crystallization is a promising method to produce battery-grade lithium carbonate from these lower-grade sources. Yet the optimization of this process is challenging due to its complex parameter space, often constrained by scarce data. This study introduces a Human-in-the-Loop (HITL) assisted active learning framework aimed at adapting and optimizing the continuous crystallization process of lithium carbonate. By integrating human expertise with data-driven insights, this approach significantly accelerates the optimization of lithium extraction from challenging sources. Our results demonstrate the framework's ability to rapidly adapt to new data, raising the process's tolerance to critical impurities such as magnesium from the few hundred ppm accepted in industry practice to contamination levels as high as 6000 ppm. This makes the use of low-grade lithium resources contaminated with such impurities feasible, potentially reducing overhead processes. By leveraging artificial intelligence, we not only refined the operational parameters but also demonstrated a potentially reduced need for extensive pre-refinement, promoting the use of lower-grade materials without sacrificing product quality.
This advancement marks a significant step towards economically harnessing North America's lithium reserves, particularly those in the Smackover Formation, thereby contributing to the sustainability of the lithium supply.
Among these low-grade sources, the Smackover Formation in southern Arkansas stands out as a significant resource. Identified as potentially one of the world's largest lithium reserves, the Smackover Formation features high concentrations of lithium in brines—over 400 milligrams per liter—which are currently brought to the surface as waste streams from the oil, gas, and bromine industries.6–10
However, exploiting these resources is not straightforward. The Smackover's lithium-bearing brines are characterized by a high ratio of impurities to lithium, approximately 1000 atoms of impurity for every atom of lithium. These impurities include elements such as sodium (Na), potassium (K), magnesium (Mg), and calcium (Ca), closely related to lithium on the periodic table, which complicates their separation due to their similar solubility, charge, and mass properties.6 This challenge necessitates a shift from traditional methods to innovative extraction techniques that are both economically viable and environmentally sustainable.
Traditional methods for producing battery-grade Li2CO3 are costly and involve multiple steps, including lengthy evaporation processes with high water usage.11–14 Exploiting lower-grade brines would be even more expensive due to their dilution, necessitating process improvements to reduce reagent, solvent, and water usage.2,15 In contrast, continuous crystallization, a technique commonly used in the pharmaceutical industry but rarely in mining, is well suited for this purpose.16,17 This method achieves metal salt purities of 90–99.9% and eliminates the need for evaporative concentration,18–24 thus reducing water and land use as well as production time.15 According to a United States Department of Energy assessment, direct lithium extraction using continuous crystallization can cut production costs by 24% compared to traditional evaporative methods, enabling cost-effective purification of Li2CO3 from low-grade brines in a single step.2,15
Nonetheless, the continuous crystallization process is not without its challenges, particularly when applied to low-grade lithium brines such as those found in the Smackover Formation.6,7,19 The primary challenge is not merely concentrating the lithium, which can be achieved through methods like evaporation, reverse osmosis, or solar concentration, but effectively rejecting closely related impurities (Na, K, Ca, Mg). To achieve high-purity lithium carbonate (Li2CO3), the crystallization process must be finely tuned: a dilute solution might lead to sparse crystal formation and low mass recovery, while an overly concentrated solution can cause rapid nucleation, trapping impurities within the crystals. Adding to the complexity, Li2CO3 exhibits inverse solubility, decreasing from 13 g L−1 at 20 °C to 8.6 g L−1 at 80 °C. Thus, continuous crystallization needs to be adapted to be either impurity-tolerant or highly selective for lithium, minimizing the need to remove every challenging impurity. Optimizing such intricate chemical operations is a formidable task due to the high dimensionality of the control space and the sophisticated chemistry involved.
Traditional optimization strategies, such as conventional design of experiment (DOE) methodologies, often involve exploring a vast experimental space. In our study, we initially identified 10 critical variables, which, under a full factorial DOE with even two levels per variable, would necessitate at least 1024 (2^10) experiments. Given our experimental throughput, limited to about four experiments per week, conducting such a large number of experiments was impractical. This limitation highlighted the need for an alternative approach that could achieve optimization with fewer experiments.
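The scale of this constraint can be checked with a few lines of arithmetic. The sketch below assumes a two-level (low/high) design, which is the setting in which 10 variables give 1024 runs; the variable names are illustrative only.

```python
# Run count for a full factorial design and the time it would take
# at the throughput quoted in the text.
n_variables = 10          # critical variables identified in the study
levels_per_variable = 2   # assumption: two levels (low/high) per variable
runs = levels_per_variable ** n_variables

experiments_per_week = 4  # approximate throughput stated in the text
weeks = runs / experiments_per_week
years = weeks / 52

print(f"{runs} runs, about {weeks:.0f} weeks ({years:.1f} years)")
```

Adding a third level per variable would raise the count to 3^10 = 59 049 runs, which is why a full factorial design is not a practical starting point here.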
Over the past decade, various AI-driven active learning solutions have emerged, treating the optimization task as a search through a “black box.” Among these, Bayesian optimization methods have been widely used as a principled framework for guiding experimental selection in data-efficient ways. These methods have led to more efficient exploration and optimization of complex systems.25–31 Recent contributions, such as M. Lazin et al.32 and K. Yang et al.,33 have extended Bayesian optimization approaches to better handle cases where high dimensionality is the dominant challenge under minimal constraints. However, our process involves moderate-to-high dimensionality coupled with strong practical constraints, interdependent variables, and nonlinear process responses, which present different challenges that can still limit optimization speed and efficiency in practice. These limitations reflect a broader challenge: AI-driven active learning approaches rely heavily on machine learning models, which build correlations without necessarily providing causal understanding or integrating well-established, yet hard-to-quantify, heuristics of the physical and chemical nature of the system under study. This reliance can result in a process that remains time-consuming and resource-intensive. Furthermore, design biases may arise from a limited understanding of the system's complexities, leading to skewed outcomes and compounding errors. These biases can hinder the model's ability to generalize, impacting its overall performance and reliability.
To mitigate biases and address complexity challenges, the field of human-in-the-loop AI has emerged as a promising solution.34–36 This approach leverages the collaboration between human intelligence and artificial intelligence.34 Human cognitive abilities and domain expertise play a crucial role in enhancing AI's predictive capabilities by interpreting data-driven correlations and offering intuition-driven insights. These insights help in refining the evaluation process of AI models, ensuring that the models align more closely with real-world complexities and expectations. Human-in-the-loop AI involves human input in tasks such as data collection, algorithm selection, and model tuning, creating a feedback cycle that helps reduce biases in both human decision-making and AI predictions.35 This collaborative framework allows for quick adjustments and a deeper understanding of the machine learning workflow, ensuring that AI-driven systems become more adaptive, efficient, and effective in managing complex optimization tasks. This integration not only addresses the limitations of each approach but also combines their strengths to improve overall outcomes in high-dimensional optimization environments.34–36
In this paper, we present a human-in-the-loop-assisted active learning (HITL-AL) AI framework specifically designed to optimize a continuous crystallization technique for producing high-quality, battery-grade lithium from diverse low-grade brines. This approach directly addresses the challenges posed by feedstocks like the Smackover Formation's complex geochemistry, where high impurity levels closely related to lithium complicate traditional extraction methods. Our objective extends beyond merely adjusting control parameters such as reactor temperatures and flow rates. It also involves investigating the interactions between brine compositions and system controls, particularly the interplay of major contaminants like Na, Ca, Mg, and K, to determine how much contamination and dilution can be tolerated while still producing battery-grade outcomes. By doing so, we aim to unlock the potential of these abundant but challenging resources, reducing reliance on high-quality sources and advancing cost-effective, sustainable lithium extraction methods.
In our HITL-AL process, human experts play a central role in refining machine learning-suggested experiments, using their judgment to focus on those most likely to yield meaningful results. This strategic selection is crucial for conducting experiments within practical throughput constraints while exploring promising pathways that models might overlook. Moreover, human experts are integral to evaluating outcomes and adjusting both hypotheses and workflows. This involves developing new hypotheses from emerging data and rigorously testing these ideas, helping to identify and correct biases in design and chemical assumptions, such as the difficulty of impurity removal and the ranges of control parameters. The methods section provides detailed insights into these processes and how different hypotheses are tested.
This iterative approach not only reduces the number of experiments required but also uncovers critical insights that drive innovation. For instance, through this collaborative and adaptive process, we discovered that adjusting cold reactor temperatures significantly reduces magnesium impurities. This counterintuitive breakthrough was achieved with minimal observations, demonstrating the effective synergy of human intuition and AI analysis. This finding significantly enhances the production of battery-grade lithium by expanding the acceptable range of magnesium contamination levels. We explore these discoveries further, illustrating the framework's effectiveness in optimizing lithium production. Additionally, we highlight the broader implications of our findings, showing how the integration of human expertise and AI not only improves experimental efficiency but also provides a sustainable and cost-effective solution to challenges posed by low concentration and polluted brines, such as those found in the Smackover Formation.
Fig. 1 Overview of the integration between the experimentation cycle and the human-in-the-loop active learning cycle, highlighting the different steps or modules within each phase.
Upon initiating the HITL-AL cycle, as illustrated in Fig. 1, the “data and results inspection module” acts as the initial juncture. This module integrates cumulative experimental observations into our workflow. It features a statistical dashboard that compiles and analyzes incoming data. Detailed information on these statistical analyses can be found in Section 2.2.1. This dashboard provides human experts with insights into parameter interdependencies and correlations, crucial for real-time monitoring and timely response to experimental findings.
As outlined in Fig. 1, the “human interpretation” phase follows the inspection step. In this phase, experts apply their scientific and intuitive insights to the results of the statistical analysis. This step emphasizes the integration of human expertise into the cycle, thus avoiding the need to fully automate complex decision-making in R&D setups.
Following directly from the interpretation step is the “informed design and adjustment” phase. In this phase, the insights obtained from human interpretation are used to adjust and define the configurations, ranges, and constraints within the surrogate space, a multidimensional representation of experimental conditions. Each point within this space specifies operational controls such as temperature and flow rate, as well as reactant concentrations of lithium, sodium, calcium, and magnesium. Experts selectively refine input features and output targets based on these insights, ensuring that both machine learning model settings and experimental parameters are aligned with our objectives to maximize lithium yield and purity.
Building on the “informed design and adjustment” phase, the ML model prediction phase leverages these refined inputs to train machine learning models. These models predict crystallization outcomes, ensuring the configurations within the surrogate space are optimized for maximum lithium yield and purity. Details about the model configurations, training, and operational parameters are provided in Section 2.2.4.
In the “data acquisition” phase, predictions from the machine learning models guide the generation of a list of potential experiments, as detailed in Section 2.2.5. These suggestions, coupled with model predictions, are rigorously reevaluated within the “data and results inspection and interpretation modules.” Here, statistical analyses help human experts assess the performance of the models and the efficacy of the data acquisition process. During the interpretation phase, the viability of the experiments is evaluated. If the experiments are judged feasible, the process advances to the experimentation phase. If not, the HITL-AL cycle proceeds with further iterations, adjusting the surrogate space, data acquisition strategies, and model parameters until the experimental plan is finalized.
Once experiments are authorized for “experimentation,” experts review the suggestions, select the most feasible experiments, and conduct them along with reproducibility tests to verify data quality.
This continuous, iterative cycle of review and refinement, depicted in Fig. 1, enhances the capacity to efficiently produce battery-grade lithium from low-grade brines. Subsequent sections will delve deeper into the experimentation phase and detail the integration of the HITL-AL cycle, illustrating how each step contributes to achieving optimal outcomes.
Fig. 2 (a) Simplified process flow diagram of the continuous hot-cold crystallization setup. (b) Image of benchtop setup used for processing LiCl brines into battery grade Li2CO3.
At the start of the process, the cold reactor was loaded with the initial solution (e.g., low-grade Li2CO3 brine). The concentration of various elements in the input solution was monitored before loading the cold reactor. At the end of the crystallization process, the solution was unloaded and filtered from the hot reactor, where the crystallized purified lithium was collected. After unloading the product from the hot reactor, the concentration of different elements in the resulting crystals was measured using Inductively Coupled Plasma Optical Emission Spectroscopy (ICP-OES).
The continuous crystallization process was designed to isolate battery-grade Li2CO3 from an impure Li2CO3 feedstock; see Fig. 2a for a schematic overview. The initial crude material for this study was primarily synthesized to simulate various real-world lithium carbonate crudes with typical contaminant levels, including Mg, Ca, K, and Na. To ensure reproducibility, we also used industry-grade feedstock contaminated with a similar set of elements. Control parameters managed within the system included the temperatures of the cold and hot reactors, which define the temperature differential, as well as the pH, stirring rate, slurry concentration, flow rate between reactors, and retention time.
All experiments conducted were systematically recorded in tables and passed to the HITL cycle. Each experiment was assigned a quality score (1, 2, or 3) based on the process expert's observations. A score of 1 indicated no anomalies, such as sedimentation in containers. A score of 2 was given for minor occurrences of such issues, while a score of 3 was assigned to conditions where noticeable anomalies, like sedimentation on container walls, were observed. Additionally, specialists provided comments describing any anomalies noticed during the experimentation.
Throughout various cycles of active learning and experimentation, specialists conducted reproducibility tests. These tests primarily focused on experiments that showed anomalies or unusual results, including a random selection of previous experiments. If contradictory results were observed, the scores could be adjusted. Experiments with significantly divergent outcomes were deemed failed and excluded from further analysis.
Statistical analyses were systematically performed to explore correlations and identify key parameters influencing process outcomes. Initially, observed experimental data were analyzed to establish baseline relationships. After the first active learning iteration, analyses were extended to incorporate simulated surrogate space conditions and machine learning predictions, thereby elucidating model predictions and highlighting important signals identified by the models.
Pearson correlation analyses quantified linear relationships between input parameters and final impurity levels. SHAP (SHapley Additive exPlanations) values clarified the relative contributions of each feature to model predictions, while sensitivity analyses evaluated the robustness of these predictions by perturbing parameters individually by one standard deviation. The SHAP and sensitivity analyses utilized Random Forest Regressors (RFR) due to their interpretability and straightforward hyperparameter tuning. Hyperparameters—including the number of estimators, maximum depth, minimum samples split, and minimum samples per leaf—were optimized using the Tree-structured Parzen Estimator (TPE) algorithm via Optuna. Collectively, these analyses provided insights into the underlying data relationships and clarified model behaviors.
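Two of the analyses described above, Pearson correlation and the one-standard-deviation sensitivity test, can be sketched in plain Python. This is a minimal illustration under stated assumptions: the four data rows, the feature names `cold_temp` and `flow_rate`, and the linear `toy_model` are all hypothetical stand-ins, the toy model takes the place of the tuned Random Forest Regressor, and the SHAP and Optuna steps are not reproduced.

```python
import math
import statistics

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def one_sd_sensitivity(predict, rows, feature_names):
    """Mean change in prediction when one feature at a time is raised by one SD."""
    base = [predict(r) for r in rows]
    result = {}
    for name in feature_names:
        sd = statistics.pstdev([r[name] for r in rows])
        shifted = [predict({**r, name: r[name] + sd}) for r in rows]
        result[name] = statistics.fmean([s - b for s, b in zip(shifted, base)])
    return result

# Hypothetical observations (not data from the study).
rows = [
    {"cold_temp": 40.0, "flow_rate": 5.0},
    {"cold_temp": 50.0, "flow_rate": 8.0},
    {"cold_temp": 60.0, "flow_rate": 6.0},
    {"cold_temp": 70.0, "flow_rate": 7.0},
]

def toy_model(row):
    """Hypothetical surrogate: final Mg (ppm) falls linearly with cold_temp."""
    return 300.0 - 2.0 * row["cold_temp"]

mg_final = [toy_model(r) for r in rows]
corr = pearson([r["cold_temp"] for r in rows], mg_final)
sens = one_sd_sensitivity(toy_model, rows, ["cold_temp", "flow_rate"])
print(corr, sens)
```

In this toy setting the correlation is perfectly negative and the flow rate has zero sensitivity by construction; with a fitted regressor the same functions would report how strongly each control shifts the predicted impurity level.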
In the initial active learning iteration, experts proceeded directly to the informed design and adjustment step. In subsequent iterations, experts moved on to that step only when unsatisfactory outcomes or unexpected ML behaviors were identified. The resulting actions included retraining ML models with adjusted hyperparameters, refining surrogate space parameter boundaries, or conducting targeted verification experiments, such as deploying random walkers (described further in Section 2.2.3), to clarify anomalies and potential biases. These iterative refinements, guided by human insights, ensured continuous alignment among model predictions, experimental design, and chemical process understanding.
Experts evaluated feasible parameter ranges by identifying biases, model inconsistencies, and persistent trends suggesting areas for further investigation. If surrogate space coverage was deemed inadequate, a random walk algorithm was activated for targeted verification, deploying 100 000 random walkers. Walkers were initialized near flagged regions and took randomized steps within ±25% of identified Pareto frontier boundaries, systematically probing overlooked parameter ranges.
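One way to picture the random-walk verification step is the sketch below. It is not the authors' implementation: the interpretation of "±25%" as widening each parameter interval by a quarter of its width, the step size, the number of steps, and the parameter bounds are all assumptions made for illustration.

```python
import random

def expand_bounds(bounds, fraction=0.25):
    """Widen each (low, high) interval by `fraction` of its width on both sides."""
    return {
        name: (lo - fraction * (hi - lo), hi + fraction * (hi - lo))
        for name, (lo, hi) in bounds.items()
    }

def random_walk(bounds, n_walkers, n_steps, step_fraction=0.05, seed=0):
    """Start walkers inside `bounds` and let them wander within the widened limits."""
    rng = random.Random(seed)
    limits = expand_bounds(bounds)
    points = []
    for _ in range(n_walkers):
        pos = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        for _ in range(n_steps):
            for name, (lo, hi) in limits.items():
                step = rng.uniform(-1.0, 1.0) * step_fraction * (hi - lo)
                # Clip so a walker never leaves the widened region.
                pos[name] = min(hi, max(lo, pos[name] + step))
        points.append(dict(pos))
    return points

# Hypothetical parameter ranges spanned by a Pareto frontier.
frontier_bounds = {"cold_temp": (40.0, 60.0), "flow_rate": (5.0, 10.0)}
limits = expand_bounds(frontier_bounds)
walkers = random_walk(frontier_bounds, n_walkers=1000, n_steps=20)
print(limits, len(walkers))
```

The final walker positions form a candidate set that covers both the original frontier region and its immediate surroundings, which can then be scored by the surrogate models.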
Decisions regarding machine learning model strategies—whether to continue exploratory analyses or transition toward exploitation—were informed by empirical results and experimental observations rather than purely computational predictions. The exploratory analysis phase primarily identified key parameters and their critical ranges necessary for achieving battery-grade outcomes. Once sufficient understanding was reached, the strategy shifted to exploitation, focusing on defining decision boundaries between battery-grade and non-battery-grade conditions, particularly in relation to Mg impurity tolerance. The decision to transition between strategies or initiate targeted verification was systematically informed by human interpretation, ensuring continuous alignment between the experimental workflow and evolving process insights.
Separate GPR models were developed for each target outcome, employing a differentiable Matern kernel, tuned using the Tree-structured Parzen Estimator (TPE) algorithm. Expert evaluations of predictive accuracy informed further hyperparameter refinements and model configurations.
During the initial exploration, GPR models predicted post-crystallization elemental concentrations, aiding experts in identifying key parameters that influence impurity reduction. Battery-grade label predictions from Gaussian Process Classification (GPC) were subsequently incorporated to delineate boundaries between battery-grade and non-battery-grade outcomes, providing insights into how impurity levels impact lithium carbonate purity. Standard scaling was applied to both the training dataset (features only) and surrogate simulation datasets to ensure consistent model performance by maintaining uniform feature magnitudes and preventing numerical inconsistencies in predictions.
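The scaling step can be illustrated as follows. The sketch assumes that the scaling statistics are computed on the training features and then reused unchanged for the surrogate points, which is the usual convention but is not stated explicitly in the text; the numbers and column meanings are hypothetical.

```python
import statistics

def fit_scaler(rows):
    """Per-column mean and population standard deviation of the training features."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    sds = [statistics.pstdev(c) or 1.0 for c in columns]
    return means, sds

def transform(rows, means, sds):
    """Apply (value - mean) / sd column by column."""
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

# Hypothetical features: [cold reactor temperature (°C), initial Mg (ppm)].
train = [[40.0, 100.0], [50.0, 300.0], [60.0, 500.0]]
surrogate = [[70.0, 300.0]]

means, sds = fit_scaler(train)
train_scaled = transform(train, means, sds)
surrogate_scaled = transform(surrogate, means, sds)  # same statistics reused
print(train_scaled, surrogate_scaled)
```

Reusing the training statistics keeps a given physical condition at the same scaled coordinates in both datasets, so the model sees surrogate points on the scale it was trained on.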
Data acquisition in this study followed two strategies. The first emphasized exploratory analysis during the initial active learning cycles. Once sufficient insight had been gained, the second strategy, decision boundary exploitation, was adopted.
Post-refinement analyses showed that Na impurities consistently fell below the project's battery-grade threshold of 500 ppm and, for most tests, even below the more restrictive limit of 250 ppm introduced in Table S1, without additional purification treatments. K concentrations, although occasionally exceeding the strict battery-grade threshold of 10 ppm, were deemed manageable due to the high water solubility of potassium salts. Literature indicates typical removal efficiencies of 97–99% for K through simple multi-stage water washes, allowing reduction from initial concentrations around 500 ppm to within battery-grade specifications (Table S1).1,2 Similarly, Ca impurities, despite exceeding the battery-grade threshold of 70 ppm in some preliminary outcomes, were also considered manageable. Literature supports mild supplementary treatments such as dilute acid washes or carbonation–decarbonation recrystallization, which reliably reduce calcium from initial concentrations of a few hundred to a thousand ppm to below battery-grade limits, typically achieving removal efficiencies of 70–85%.1,3,4
Given the satisfactory refinement consistently achieved for Na and Li impurities, further optimization efforts for these elements were deprioritized. Although post-refinement K concentrations occasionally exceeded battery-grade thresholds, optimization for potassium was similarly deprioritized due to the ease and effectiveness of post-process water washing. This decision was corroborated by statistical analyses conducted within the HITL framework. Specifically, SHAP analysis demonstrated that post-refinement K concentrations exhibited lower feature importance than a randomly generated control variable (Fig. S1). Additionally, Pearson correlation matrices indicated no significant correlations between K concentrations and process control parameters after refinement (Fig. S2). Although sensitivity analysis suggested a slight influence from flow rate on K reduction, this was statistically insignificant, exhibiting comparable magnitude to random variations (Fig. S1). Together, these analyses justified assigning K a lower priority for subsequent experimental optimization.
Conversely, Mg impurity removal posed significant challenges. In scenarios where initial Mg concentrations exceeded approximately 80 ppm—the conventional battery-grade threshold—none of the preliminary experiments successfully reduced Mg below this desired limit (Table S1). Furthermore, elevated initial Mg concentrations corresponded with increased post-refinement Ca concentrations, occasionally surpassing the battery-grade threshold, though still manageable through previously described post-processing methods. Consequently, informed by initial observations and HITL-driven analyses, subsequent experimental designs specifically prioritized optimizing Mg and Ca removal.
Exploratory analysis began by employing Pareto frontier extraction, aiming to reduce post-refinement Mg and Ca concentrations. Gaussian Process Regression (GPR) predictions were generated across a surrogate experimental space consisting of 10 000 data points, as detailed in Table S3. At each exploratory iteration, approximately 30 candidate conditions identified by Pareto frontier analysis were presented to experts, from which a subset was selected for experimental execution. After four active learning iterations, a cumulative total of 36 successful, reproducible experiments (including the initial 16 experiments) were collected for further analysis (Table S4). The iterative refinement and optimization achieved through these cycles are illustrated in Fig. 3.
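Pareto frontier extraction for two minimized objectives can be sketched as below. The predicted (Mg, Ca) pairs are hypothetical values standing in for GPR predictions over the surrogate space, and the quadratic-time scan is chosen for clarity rather than speed.

```python
def pareto_front(candidates):
    """Return the (mg, ca) pairs not dominated by any other pair.

    Both objectives are minimized: a pair is dominated if another pair is
    no worse on both objectives and strictly better on at least one.
    """
    front = []
    for i, (mg_i, ca_i) in enumerate(candidates):
        dominated = any(
            mg_j <= mg_i and ca_j <= ca_i and (mg_j < mg_i or ca_j < ca_i)
            for j, (mg_j, ca_j) in enumerate(candidates)
            if j != i
        )
        if not dominated:
            front.append((mg_i, ca_i))
    return front

# Hypothetical predicted post-refinement concentrations in ppm: (Mg, Ca).
predicted = [(120, 40), (90, 60), (150, 30), (100, 70), (90, 65)]
front = pareto_front(predicted)
print(front)  # [(120, 40), (90, 60), (150, 30)]
```

Each point on the frontier represents a different trade-off between Mg and Ca removal, which is why the shortlist is handed to experts for selection rather than reduced automatically to a single optimum.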
As depicted in Fig. 3, despite inherent complexity causing occasional sub-optimal model predictions—including non-physical outcomes such as negative concentrations—GPR predictions remained valuable for identifying meaningful trends and signals.
Throughout the initial exploratory active learning phase, informed by insights from the first 16 expert-designed experiments, it was observed that the continuous crystallization process consistently met battery-grade lithium carbonate specifications for Na (Tables S1 and S4). Potassium and calcium impurities were effectively managed with straightforward, cost-efficient secondary treatments, ensuring they did not obstruct overall lithium purification. However, across the first 36 experiments conducted, when initial Mg concentrations exceeded 200 ppm, reducing Mg below the battery-grade threshold of 80 ppm was consistently unattainable (Tables S1 and S4). This persistent challenge led to a strategic shift in optimization efforts, concentrating specifically on improving magnesium impurity removal to reliably achieve battery-grade lithium.
To address the persistent challenge of effectively reducing Mg impurities, human experts examined potential biases in the experimental design—specifically concerning the selected parameter ranges. A targeted random walk algorithm was deployed along the boundaries defined by the last identified Pareto frontier, systematically exploring adjacent regions. By allowing deviations up to ±25% from these Pareto frontier boundaries, this approach generated a new surrogate space containing 5000 additional experimental conditions. GPR models, trained on all experimentally observed data available up to that point (the first 36 data points, Table S4), were subsequently employed to predict outcomes across this newly created surrogate space, enabling the detection of overlooked parameter spaces or latent biases.
Analysis of GPR-predicted data from this random walk surrogate space revealed an unexpected inverse correlation between the cold reactor's temperature and predicted post-refinement Mg concentration. This surprising observation, identified through Pearson correlation, SHAP feature importance, and sensitivity analyses performed on the model-predicted dataset (Fig. 4), contrasted starkly with prevailing assumptions derived from prior literature and scientific heuristics. The statistical analyses indicated that higher cold reactor temperatures could significantly enhance Mg impurity removal efficiency, directly challenging established recommendations of maintaining cold reactor temperatures below 60 °C with a 20 °C differential between reactors. Prompted by these model-derived insights, experimental trials were subsequently proposed and conducted to validate this observation.
To experimentally validate this correlation, human experts conducted targeted trials, systematically increasing the cold reactor temperature while holding other operational parameters constant. These validation experiments confirmed the initial GPR-derived predictions, substantiating the inverse relationship between cold reactor temperature and post-refinement Mg concentration. Fig. 5 presents one such validation experiment, highlighting the direct impact of cold reactor temperature on Mg impurity reduction (detailed experimental conditions provided in Tables S4 and S5).
Following the experimental validation of the inverse relationship between cold reactor temperature and Mg impurity concentration, the study proceeded to the exploitation phase. The objective in this phase was to clearly delineate the decision boundary between battery-grade and non-battery-grade lithium outcomes, specifically quantifying the maximum permissible initial Mg concentration that could yield battery-grade lithium under optimized temperature conditions. Fig. 6 visualizes this boundary by plotting experimentally observed outcomes—classified as battery-grade or non-battery-grade—against initial Mg concentration and cold reactor temperature. The resulting visualization reveals a distinct and actionable decision boundary.
Subsequently, a GPC model was trained to formalize the experimentally identified decision boundary (Fig. 6). Utilizing a ray tracing algorithm, experimental candidates with predicted probabilities closest to the critical 0.5 threshold—indicating equal likelihood of achieving battery-grade or non-battery-grade lithium—were selected for further verification experiments.
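The selection criterion, picking candidates whose predicted probability lies closest to 0.5, can be illustrated with a simple ranking. This sketch deliberately swaps in a plain sort for the ray tracing algorithm mentioned above, whose details are not given here; the candidate conditions and probabilities are hypothetical.

```python
def closest_to_boundary(candidates, k):
    """Return the k candidates whose predicted probability is nearest to 0.5."""
    return sorted(candidates, key=lambda c: abs(c["p_battery_grade"] - 0.5))[:k]

# Hypothetical classifier outputs for candidate conditions.
candidates = [
    {"initial_mg_ppm": 500, "cold_temp": 80, "p_battery_grade": 0.95},
    {"initial_mg_ppm": 2000, "cold_temp": 75, "p_battery_grade": 0.55},
    {"initial_mg_ppm": 3000, "cold_temp": 78, "p_battery_grade": 0.48},
    {"initial_mg_ppm": 6000, "cold_temp": 50, "p_battery_grade": 0.10},
    {"initial_mg_ppm": 1500, "cold_temp": 70, "p_battery_grade": 0.62},
]

selected = closest_to_boundary(candidates, k=2)
print([c["p_battery_grade"] for c in selected])  # [0.48, 0.55]
```

Candidates near 0.5 are those where the classifier is least certain, so running them experimentally is the most direct way to sharpen the boundary between battery-grade and non-battery-grade conditions.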
Fig. 7 presents these GPC-derived predictions, distinguishing battery-grade from non-battery-grade outcomes, overlaid with all observed experimental data (totaling 80 experiments). This visualization highlights a substantial improvement in permissible initial Mg contamination levels. Historically, industry standards limited acceptable Mg contamination to approximately 80 ppm. However, our HITL-driven optimization demonstrates that increasing the cold reactor temperature beyond the initially recommended limit of 60 °C can raise the tolerance of initial Mg contamination to several thousand ppm. This finding implies a potential reduction in the extent of pre-refinement processes and lessens reliance on higher-purity lithium sources to achieve battery-grade products.
Fig. 8 provides additional analytical evidence supporting the significant influence of cold reactor temperature and initial Mg concentration on process outcomes. Specifically, Fig. 8a shows the distribution of cold reactor temperatures for experiments with initial Mg concentrations above 200 ppm, categorized by battery-grade or non-battery-grade outcomes. This distribution clearly demonstrates that higher reactor temperatures strongly correlate with successful Mg impurity reduction. Further supporting these observations, Fig. 8b presents SHAP analysis results, highlighting the prominent impact of cold reactor temperature on achieving battery-grade lithium. Lastly, Fig. 8c illustrates sensitivity analysis outcomes, reinforcing that both initial Mg concentration and cold reactor temperature are key parameters determining the purity of the final lithium carbonate product.
Including the initial 16 historic experiments, a total of 38 experiments (22 additional experiments) were required to obtain clear experimental evidence of the influence of cold reactor temperature on achieving battery-grade lithium carbonate. In total, 80 experiments were conducted throughout the study to validate these insights and precisely identify the decision boundary between battery-grade and non-battery-grade outcomes (as represented in Fig. 6–8). Given the complexity of the crystallization process and the extensive dimensionality of the parameter space, identifying critical parameters within just 38 experiments demonstrates the effectiveness of integrating human expertise, statistical analyses, and machine learning-driven active learning methods. Conducted at an accelerated pace of approximately one experiment per day, this human-in-the-loop approach substantially reduced the timeline traditionally required for such process optimization, facilitating rapid decision-making and iterative refinement despite the challenges inherent in low-data machine learning scenarios.
To further illustrate how the HITL framework accelerated the identification of experimental conditions that yield battery-grade lithium carbonate at initial Mg concentrations exceeding 200 ppm, we compared its efficacy against two purely computational active learning frameworks without expert guidance. Our HITL-assisted active learning (HITL-AL) approach required 38 experiments to identify the key conditions leading to battery-grade outcomes. The two alternatives were a random-sampling approach and a simplified Bayesian optimization approach, each conducting one experiment per active learning cycle.
For this comparative evaluation, a surrogate GPC model, trained on data and insights obtained from the HITL approach, was used to predict battery-grade outcomes. Two distinct experimental datasets, each comprising 10 000 000 simulated experimental scenarios, were generated for this comparison. The first dataset, termed “uninformed,” strictly adhered to initial parameter ranges without incorporating insights from the HITL process. The second dataset, termed “informed,” was explicitly constrained by parameter ranges informed by the HITL-identified optimal temperature settings and impurity conditions.
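As a rough illustration of how two such candidate pools differ, the sketch below draws conditions from an "uninformed" and an "informed" set of ranges. The parameter names, the numerical bounds, the uniform sampling, and the reduced pool size are all assumptions made for this example and do not reproduce the study's actual settings.

```python
import random

# Hypothetical ranges for illustration only; the study's actual bounds are not reproduced here.
UNINFORMED_RANGES = {
    "cold_reactor_temp_c": (40.0, 60.0),
    "initial_mg_ppm": (0.0, 6000.0),
}
INFORMED_RANGES = {
    "cold_reactor_temp_c": (60.0, 90.0),
    "initial_mg_ppm": (200.0, 6000.0),
}

def sample_pool(ranges, n, seed):
    """Draw n candidate conditions uniformly within the given parameter ranges."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(n)
    ]

# A much smaller pool than the 10 000 000 scenarios used in the study.
uninformed_pool = sample_pool(UNINFORMED_RANGES, n=10_000, seed=0)
informed_pool = sample_pool(INFORMED_RANGES, n=10_000, seed=0)
print(len(uninformed_pool), len(informed_pool))
```

The only difference between the two pools is the set of bounds, which is how expert-derived constraints enter an otherwise identical sampling procedure.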
Each active learning method underwent 100 computational simulations (instances) employing different random seeds to statistically evaluate the frequency at which battery-grade conditions were identified. The first active learning approach applied a random sampling data acquisition policy. In contrast, the second method used a simplified Bayesian approach explicitly adapted for classification tasks, predicting experimental outcomes as either battery-grade or non-battery-grade. The classification approach was adopted due to challenges in reliably simulating detailed regression predictions of impurity concentrations with the surrogate GPC model. For condition selection, the Bayesian method relied on the upper confidence bound (UCB) metric to target experimental conditions with high uncertainty.
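The structure of this comparison can be sketched in a scaled-down form. The example below is not the study's implementation: the `oracle` function is a hypothetical rule standing in for the surrogate GPC model, the "simplified Bayesian" learner is replaced by a kernel-weighted Beta(1, 1) update chosen only because it yields a mean and a variance without external libraries, and the ranges, budget, pool size, and number of seeds are assumptions.

```python
import math
import random

TEMP_RANGE = (40.0, 90.0)    # hypothetical cold reactor temperature range, deg C
MG_RANGE = (200.0, 6000.0)   # hypothetical initial Mg range, ppm

def oracle(x):
    """Hypothetical ground truth standing in for the surrogate model's label."""
    temp_c, mg_ppm = x
    return temp_c >= 80.0 + mg_ppm / 600.0

def scaled_distance_sq(a, b):
    """Squared distance after scaling each parameter by its range."""
    dt = (a[0] - b[0]) / (TEMP_RANGE[1] - TEMP_RANGE[0])
    dm = (a[1] - b[1]) / (MG_RANGE[1] - MG_RANGE[0])
    return dt * dt + dm * dm

def ucb(x, observed, kappa=2.0, length_scale=0.2):
    """Posterior mean + kappa * std from a kernel-weighted Beta(1, 1) update."""
    alpha, beta = 1.0, 1.0
    for xi, yi in observed:
        w = math.exp(-scaled_distance_sq(x, xi) / (2.0 * length_scale ** 2))
        if yi:
            alpha += w
        else:
            beta += w
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total * total * (total + 1.0))
    return mean + kappa * math.sqrt(var)

def run_campaign(policy, seed, pool_size=100, budget=20):
    """Return True if a battery-grade condition is found within the budget."""
    rng = random.Random(seed)
    pool = [(rng.uniform(*TEMP_RANGE), rng.uniform(*MG_RANGE)) for _ in range(pool_size)]
    observed = []
    for _ in range(budget):  # one simulated experiment per cycle
        if policy == "random":
            choice = rng.choice(pool)
        else:
            choice = max(pool, key=lambda x: ucb(x, observed))
        pool.remove(choice)
        label = oracle(choice)
        observed.append((choice, label))
        if label:
            return True
    return False

def success_rate(policy, seeds):
    """Fraction of seeded campaigns that found a battery-grade condition."""
    return sum(run_campaign(policy, s) for s in seeds) / len(seeds)

seeds = range(20)
rates = {policy: success_rate(policy, seeds) for policy in ("random", "ucb")}
print(rates)
```

The rates printed by this toy setup depend entirely on the assumed oracle and ranges, so they should not be compared with the values reported for the study.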
Fig. 9 summarizes the comparative results of the simplified Bayesian and random active learning methods using both informed and uninformed datasets. The simplified Bayesian approach leveraging the informed dataset (integrating optimized parameter ranges derived from prior HITL insights) identified battery-grade conditions within 40 experiments at a success rate of approximately 67%, significantly outperforming the uninformed dataset scenario, which achieved only 14% success. Similarly, the random sampling approach achieved success rates of 14% with the informed dataset and only 1% with the uninformed dataset. These results clearly illustrate the substantial advantage provided by incorporating human-derived knowledge and insights into active learning optimization strategies.
Collectively, these findings underscore the practical effectiveness and importance of integrating human expertise with machine learning analyses. Targeted human interventions, guided by nuanced insights from ML-driven data exploration, notably accelerated the discovery of optimal process parameters, demonstrating a highly effective synergy for refining complex chemical systems.
Through iterative experimentation and informed interpretation of machine learning model predictions, our approach notably accelerated the identification of optimal process conditions. A critical insight emerged from this collaborative optimization: contrary to traditional assumptions recommending lower cold reactor temperatures (below 60 °C with a 20 °C temperature differential), our framework identified and experimentally validated that elevated cold reactor temperatures significantly enhanced magnesium impurity removal efficiency. This finding permitted an unprecedented increase in magnesium impurity tolerance—from conventional limits of approximately 80 ppm up to several thousand ppm—markedly reducing the need for intensive pre-refinement stages.
Overall, this research demonstrates the pivotal role human expertise can play in refining AI-driven optimization processes, particularly in high-dimensional chemical systems constrained by limited experimental throughput. By effectively balancing human intuition, domain knowledge, and machine learning analysis, our approach not only expedited critical discoveries but also facilitated agile responses to emergent insights, thereby optimizing experimental efficiency and accuracy. Consequently, the methodological and practical advancements presented herein represent significant progress toward the sustainable, economically viable extraction of lithium carbonate from complex, impurity-rich brines. This advancement contributes directly to the broader objective of responsibly harnessing North America's abundant but challenging lithium resources, supporting the growth and sustainability of lithium-based technologies vital for the global energy transition.
The experimental data supporting this article are provided in Table S4 of the SI, which also includes additional information on experimental procedures and complementary analyses. Supplementary information is available. See DOI: https://doi.org/10.1039/d5dd00285k.
This journal is © The Royal Society of Chemistry 2025