Qingya Wang abc, Shubin Lyu ad, Haoyu Zou ad and Fusheng Li *ad
aSchool of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, PR China. E-mail: qqwqy@163.com
bNational Key Laboratory of Uranium Resource Exploration-Mining and Nuclear Remote Sensing, East China University of Technology, Nanchang, Jiangxi, PR China
cCollege of Information Engineering, Jiangxi Polytechnic University, Jiujiang, Jiangxi, PR China
dYangtze Delta Region Institute(Huzhou), University of Electronic Science and Technology of China, Huzhou, Zhejiang, PR China
First published on 3rd December 2025
A novel analytical framework is presented that combines a physics-informed broad learning system network (PI-BLS-Net) with multimodal spectral fusion to enable rapid, low-cost, and interpretable chromium speciation in industrial tailings. The PI-BLS-Net integrates visible–near-infrared (vis-NIR) and X-ray fluorescence (XRF) spectral data, transforming one-dimensional spectra into two-dimensional images via Gramian Angular Field (GAF) techniques to enhance feature extraction. Discriminative features are learned from the fused spectral images using a Principal Component Analysis Network (PCANet), with key geochemical parameters (measured pH and Eh) and physicochemical constraints—such as redox equilibrium and mass balance based on the Nernst equation—explicitly embedded into both the model architecture and its loss function. This approach enables accurate quantification of Cr(III) and Cr(VI) concentrations in rare earth tailings, supporting scientific waste classification and risk assessment. Validation on 218 tailings samples demonstrates that the PI-BLS-Net achieves excellent performance in classifying tailings hazard based on chromium speciation, with an accuracy of 89.5%, F1-score of 89.0%, and area under the receiver operating characteristic curve (AUC) of 0.95 on an independent test set. Ablation studies further confirm the significant contributions of spectral-to-image transformation, multimodal fusion, PCANet feature extraction, and especially the physics-informed module to overall model performance. This work provides a rapid, interpretable, and robust approach for pollutant speciation analysis in complex matrices, offering valuable technical support for the scientific management and environmental risk assessment of Cr-bearing tailings.
Traditional laboratory-based chromium speciation analyses, such as selective extraction coupled with inductively coupled plasma mass spectrometry (ICP-MS) or atomic absorption spectroscopy (AAS), provide reliable results but are time-consuming, labor-intensive, and require complex sample preparation.1,2 These limitations hinder their applicability for high-throughput or on-site screening of large-scale tailings repositories, especially where rapid hazard classification is demanded. The development of rapid, cost-effective, and environmentally friendly analytical technologies capable of providing chromium speciation information is thus of significant practical importance.
Spectroscopic techniques, notably visible–near-infrared (vis-NIR) and X-ray fluorescence (XRF), have emerged as powerful tools for rapid and non-destructive analysis of environmental samples. Vis-NIR spectroscopy is sensitive to molecular vibrations and can reflect information on organic and certain inorganic chemical bonds, whereas XRF spectroscopy provides elemental composition data. However, single-modality spectroscopy often lacks sufficient selectivity for reliable quantification of specific species such as Cr(VI) in complex matrices due to background interference and overlapping signals. Integrating multi-modal spectral information has shown promise for improving analytical accuracy and robustness. Recent studies, including those from our research group, have demonstrated that the fusion of vis-NIR and XRF spectroscopy significantly enhances the quantitative analysis of heavy metals and geochemical parameters in soils, coal, and mineral matrices.3–6 For instance, our previous work established multi-modal fusion frameworks for quantitative soil cadmium analysis and in situ coal geochemistry, and developed Gramian Angular Summation (GAS) transformation and PCANet-based deep learning methods for rapid heavy metal screening.3–5
Beyond spectral fusion, critical preprocessing steps such as spectral baseline correction,7 spectral band selection, and super-resolution image reconstruction have been shown to further improve the classification and quantification of mineral samples.8,9 Nevertheless, challenges remain in handling spectral complexity, background interference, and feature selection for reliable chromium speciation.
Machine learning (ML) and deep learning (DL) approaches have been increasingly applied to extract meaningful patterns from high-dimensional spectral data, thus enhancing the performance of spectral analysis models. However, conventional “black-box” models often suffer from poor interpretability and require large training datasets and extensive parameter tuning. Broad Learning System (BLS), a recently developed shallow neural architecture, offers efficient and scalable modeling by expanding network width rather than depth, and is well-suited for heterogeneous data fusion tasks.6 Despite these technical advances, many current approaches lack the incorporation of essential physicochemical constraints, limiting their generalizability and scientific interpretability.
With the increasing complexity of environmental datasets, embedding physical and chemical knowledge into machine learning models, an approach known as physics-informed machine learning (PIML), has become an emerging trend. Physics-informed (PI) models can enhance prediction reliability and chemical interpretability by enforcing constraints such as mass balance and redox equilibrium. For example, our recent work introduced a BLS-Net optimized with dual PCA and improved PSO-CARS variable selection for XRF-based heavy metal determination, which incorporated domain-relevant knowledge at both data fusion and model optimization stages.6 However, few studies have fully integrated redox chemistry constraints, such as the Nernst equation, into multi-modal spectral models for chromium speciation.
Taken together, while progress has been made in rapid screening and multi-modal spectral analysis of environmental contaminants, significant challenges remain in integrating physical chemistry constraints, handling spectral complexity, and achieving accurate, interpretable chromium speciation in real-world tailings samples.
Fig. 1 presents a schematic diagram of the overall research strategy and workflow. The flowchart summarizes the key components and logical structure of the study: spectral data acquisition, multi-modal data fusion, spectral-image transformation, hierarchical feature extraction, and physics-informed modeling for chromium speciation.
Fig. 1 Overview of the proposed analytical workflow for rapid chromium speciation in rare earth tailings.
Based on these considerations, this study aims to establish a novel, rapid and robust chromium speciation methodology for rare earth tailings in Ganzhou, China. Our approach leverages the fusion of vis-NIR and XRF spectral data, advanced spectral-image transformation techniques, and physics-informed broad learning for accurate and explainable prediction of Cr(III) and Cr(VI) concentrations. This framework builds on and extends the recent advances and methodologies developed by our research group, including multi-modal data fusion, innovative spectral baseline correction, feature selection, and interpretable machine learning for geochemical analysis.3–10
The primary purpose of this study is to address the critical need for rapid, accurate, and interpretable chromium speciation in complex industrial tailings, which is essential for effective environmental risk assessment and scientific waste management. By embedding physicochemical domain knowledge into advanced data-driven models, the proposed framework not only enhances analytical performance but also improves model transparency and reliability. The significance of this research lies in its potential to provide a robust technical foundation for the scientific classification, management, and environmental risk assessment of Cr-bearing tailings, thereby contributing to safer waste disposal and the sustainable utilization of mineral resources.
The study area was divided into three regions. Regions A1 (25.80°N–25.86°N, 114.10°E–114.22°E) and A2 (25.72°N–25.78°N, 114.15°E–114.28°E) together provided 154 samples for model development (training and internal validation). Region B1 (25.87°N–25.90°N, 114.30°E–114.40°E; 64 samples), located in a distant area with distinct geochemical characteristics, was reserved as an independent external validation set and was not involved in model training.
At each sampling site, five sub-samples from the 0–20 cm soil layer were collected using a small shovel, thoroughly mixed to form a single composite sample (approximately 2 kg), and placed in anti-static polyethylene bags. All samples were transported to the laboratory, freeze-dried, ground, and sieved sequentially through 20-mesh and 100-mesh screens to ensure homogeneity, and then stored in specialized sample bottles for subsequent XRF and vis-NIR analysis. For quality control, one sample from every batch of 30 was randomly selected for triplicate analysis.
This spatial allocation strategy—using regions A1 and A2 for model development and reserving region B1 for independent external validation—was designed to avoid overfitting and overly optimistic performance estimation due to spatial autocorrelation. It also better simulates real-world scenarios where the model must perform reliably on new, previously unmeasured areas, providing a robust and objective evaluation of practical applicability. Furthermore, within the dataset from A1 and A2, 70% of the samples were randomly assigned to the training set and 30% to an internal test set for model development.
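The allocation described above can be sketched as a two-stage split: first by region, then a random 70/30 split of the development pool. This is a minimal illustration only. The sample records, the per-region counts for A1 and A2 (80 and 74), the seed, and the function name `split_samples` are assumptions made for the example and are not taken from the study.

```python
import random

def split_samples(samples, dev_regions=("A1", "A2"), external_region="B1",
                  train_fraction=0.7, seed=0):
    """Split by region first, then randomly split the development pool."""
    dev = [s for s in samples if s["region"] in dev_regions]
    external = [s for s in samples if s["region"] == external_region]
    shuffled = dev[:]
    random.Random(seed).shuffle(shuffled)   # reproducible shuffle
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:], external

# Hypothetical records: only the region label matters for the split.
# The 80/74 breakdown of the 154 development samples is assumed.
samples = ([{"id": i, "region": "A1"} for i in range(80)]
           + [{"id": 80 + i, "region": "A2"} for i in range(74)]
           + [{"id": 154 + i, "region": "B1"} for i in range(64)])

train, internal_test, external = split_samples(samples)
print(len(train), len(internal_test), len(external))
```

Keeping region B1 out of the shuffle entirely is what prevents spatially correlated samples from leaking between development and external validation.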
Quality assurance and control (QA/QC) were maintained throughout the analyses. The accuracy of the Cr-total measurement was verified using the certified reference material NIST SRM 2710a, with results falling within the certified range. For the Cr(VI) analysis, method performance was validated using procedural blanks and matrix-spiked samples, with spike recoveries consistently between 95% and 105%. Method precision was assessed by analyzing one random sample in triplicate for every 30 samples, yielding a relative standard deviation (RSD) consistently below 10%. The method detection limits in the measurement solution were 0.01 µg L−1 for Cr(VI) and 0.02 µg L−1 for Cr(III).
:1) for 18 ± 2 hours at 20–25 °C on a rotary tumbler (30 ± 2 rpm). The extraction fluid pH (either 4.93 ± 0.05 or 2.88 ± 0.05) was selected based on the sample's alkalinity. After agitation, the leachate was filtered (0.6–0.8 µm glass fiber) for subsequent analysis.
The concentrations of total chromium and Cr(VI) in the leachate were quantified by ICP-MS (Agilent 7900, USA), following appropriate sample preservation and filtration steps as described in the standard. Quality assurance for the TCLP analysis followed the same protocol as the speciation analysis, including the use of certified standard solutions for calibration, NIST SRM 2710a for validating total Cr measurements, and procedural blanks, spiked samples, and triplicate analyses for ensuring the reliability of Cr(VI) data.
:2.5 (w/v) ratio, following standard soil analysis protocols. After equilibration, pH was recorded using a calibrated glass electrode pH meter (Mettler Toledo, Switzerland). Redox potential (Eh) was determined with a platinum combination electrode (Ag/AgCl reference) connected to a millivoltmeter; readings were converted to values relative to the standard hydrogen electrode (SHE), accounting for reference potential and temperature. All measurements were performed in accordance with internationally recognized methods to ensure data reliability and comparability.
Specifically, for the assessment of chromium speciation in TCLP leachates, PHREEQC simulations were performed for each tailings sample. The input parameters for these simulations included the measured initial total chromium concentration, the designated TCLP extraction pH (either 2.88 or 4.93, as selected based on sample alkalinity), a liquid-to-solid ratio of 20:1, and a temperature of 25 °C. The phreeqc.dat database was utilized for all calculations. The primary outputs from PHREEQC were the equilibrium concentrations of Cr(VI) and total Cr in the simulated leachate. These predicted concentrations were then converted into a binary classification (“hazardous” or “general”) using the same national hazardous waste classification standards applied to the measured TCLP data (i.e., “hazardous” if TCLP Cr(VI) >5 mg L−1 or TCLP total Cr >15 mg L−1). This PHREEQC-derived theoretical classification served as a critical geochemical reference for evaluating the physical interpretability and consistency of our PI-BLS-Net predictions.
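The threshold rule stated above is simple enough to write down directly. The sketch below applies it to a leachate concentration pair and also computes the fraction of samples on which two sets of labels agree, which is one way a model-versus-PHREEQC comparison could be tallied. The function names and the toy concentrations are illustrative assumptions; only the two thresholds come from the text.

```python
CR6_LIMIT_MG_L = 5.0        # TCLP Cr(VI) threshold stated in the text
CR_TOTAL_LIMIT_MG_L = 15.0  # TCLP total Cr threshold stated in the text

def classify_leachate(cr6_mg_l, cr_total_mg_l):
    """Return 'hazardous' if either threshold is strictly exceeded."""
    exceeded = cr6_mg_l > CR6_LIMIT_MG_L or cr_total_mg_l > CR_TOTAL_LIMIT_MG_L
    return "hazardous" if exceeded else "general"

def agreement(labels_a, labels_b):
    """Fraction of samples on which two label lists agree."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)

# Toy (Cr(VI), total Cr) pairs in mg/L, not measured data.
simulated = [(0.4, 3.0), (6.2, 9.0), (1.0, 18.5), (5.0, 15.0)]
measured = [(0.3, 2.5), (5.5, 8.0), (0.9, 14.0), (4.8, 12.0)]
sim_labels = [classify_leachate(*pair) for pair in simulated]
meas_labels = [classify_leachate(*pair) for pair in measured]
print(sim_labels, agreement(sim_labels, meas_labels))
```

Note that the comparison is strict: a sample exactly at 5 mg L−1 Cr(VI) and 15 mg L−1 total Cr is labeled "general".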
For XRF spectral data, the optimal energy range was determined through iterative trial-and-error, with energies below 22.4 keV ultimately selected. The reported XRF intensities from the instrument were normalized by dividing by the live time to adjust for count rate (cps). The first step in XRF preprocessing was baseline correction to remove slope artifacts present in the spectra. The Compton peak at 20.17 keV was normalized. Both the raw and Compton-normalized spectra were smoothed using a Savitzky–Golay filter with a window size of 5 and a polynomial order of 2.
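A rough sketch of the XRF preprocessing chain may help make the order of operations concrete. It covers energy cropping, live-time normalization, Compton normalization, and Savitzky–Golay smoothing with window 5 and order 2. Several points are assumptions: baseline correction is omitted, Compton normalization is interpreted here as dividing by the intensity of the channel nearest 20.17 keV, and the smoothing leaves the two points at each end unchanged. The weights (−3, 12, 17, 12, −3)/35 are the standard coefficients for this window and order.

```python
SG_5_2 = (-3, 12, 17, 12, -3)  # Savitzky-Golay weights, window 5, order 2 (sum 35)

def savgol_5_2(y):
    """Smooth interior points; the two points at each end are left as-is."""
    out = list(y)
    for i in range(2, len(y) - 2):
        out[i] = sum(w * y[i + k - 2] for k, w in enumerate(SG_5_2)) / 35
    return out

def preprocess_xrf(counts, energies_kev, live_time_s,
                   e_max_kev=22.4, compton_kev=20.17):
    """Crop, convert to counts per second, Compton-normalize, then smooth."""
    kept = [(e, c / live_time_s)
            for e, c in zip(energies_kev, counts) if e < e_max_kev]
    # Channel closest to the Compton peak position (assumed interpretation).
    i_c = min(range(len(kept)), key=lambda i: abs(kept[i][0] - compton_kev))
    compton_cps = kept[i_c][1]
    normalized = [cps / compton_cps for _, cps in kept]
    return savgol_5_2(normalized)

# Toy flat spectrum on a 0.5 keV grid, not instrument data.
energies = [0.5 * i for i in range(60)]
counts = [100.0] * 60
processed = preprocess_xrf(counts, energies, live_time_s=10.0)
print(len(processed))
```

A useful sanity check on the filter is that it reproduces any quadratic exactly at interior points, since the fit order is 2.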
First, each preprocessed spectral vector X = [x1, x2, …, xn] was normalized to the interval [−1, 1]. A common method for this normalization is min–max scaling:
$$\tilde{x}_i = \frac{(x_i - \max(X)) + (x_i - \min(X))}{\max(X) - \min(X)} \tag{1}$$

where $\tilde{x}_i$ is the normalized value of $x_i$.

Next, the normalized values $\tilde{x}_i$ are converted into angular coordinates $\theta_i$ using the arccosine function. The index i (representing the position or time step in the spectral vector) can also be mapped, often linearly scaled to cover a range such as [0, π] or [0, 2π] if needed for visualization, but the standard matrix definition focuses on the angles derived from the values:

$$\theta_i = \arccos(\tilde{x}_i) \tag{2}$$

where $-1 \le \tilde{x}_i \le 1$ and $0 \le \theta_i \le \pi$. $\theta_i$ uniquely represents the normalized spectral value $\tilde{x}_i$. The Gramian Angular Field matrices are constructed by considering the sum or difference of the angles for each pair of points ($\tilde{x}_i$, $\tilde{x}_j$). The resulting matrices, Gramian Angular Summation Field (GASF) and Gramian Angular Difference Field (GADF), are n × n matrices defined as follows:

$$\mathrm{GASF}_{i,j} = \cos(\theta_i + \theta_j) \tag{3}$$

$$\mathrm{GADF}_{i,j} = \sin(\theta_i - \theta_j) \tag{4}$$
The element at position (i, j) in the GASF matrix represents the cosine of the sum of the angles corresponding to the i-th and j-th points in the spectral vector, while the GADF matrix represents the sine of the difference of these angles. These matrices effectively preserve the temporal correlation and relative relationships within the original spectral sequence in a 2D image format. The conversion process between GADF and GASF is shown in Fig. S1.
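The transformation in eqn (1) to (4) can be sketched in a few lines. The example below uses only the standard library and a toy five-point spectrum, so it is an illustration of the definitions and not the authors' implementation; the function names are hypothetical.

```python
import math

def rescale(x):
    """Min-max scale a sequence to [-1, 1], as in eqn (1)."""
    lo, hi = min(x), max(x)
    if hi == lo:
        raise ValueError("a constant spectrum cannot be rescaled")
    return [((v - hi) + (v - lo)) / (hi - lo) for v in x]

def gramian_angular_fields(x):
    """Return (GASF, GADF) as nested lists, following eqn (2) to (4)."""
    x_tilde = rescale(x)
    # Clipping guards against tiny floating-point overshoot outside [-1, 1].
    theta = [math.acos(max(-1.0, min(1.0, v))) for v in x_tilde]
    gasf = [[math.cos(a + b) for b in theta] for a in theta]
    gadf = [[math.sin(a - b) for b in theta] for a in theta]
    return gasf, gadf

spectrum = [0.12, 0.35, 0.80, 0.42, 0.10]  # toy intensities, not real data
gasf, gadf = gramian_angular_fields(spectrum)
print(len(gasf), len(gasf[0]))
```

By construction GASF is symmetric and GADF is antisymmetric with a zero diagonal, which is a quick way to check an implementation.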
The specific steps are as follows:
(1) Spectral curve plotting: for each preprocessed spectral vector (wavelength–intensity pairs), the data are plotted as a line chart without axes. A linear scale is used for vis-NIR spectra, while a logarithmic scale can be selected for XRF spectra depending on their characteristics. This process generates a clean spectral line image, removing axes, labels, and other distractions.
(2) Image acquisition and horizontal mirroring: the spectral plot is converted into a 2D grayscale pixel matrix, and a horizontally mirrored version of the image is generated.
(3) Concatenation and padding: the original and mirrored images are concatenated horizontally to form a wider composite image. A uniform background color is then used to pad the image into a square matrix, providing sufficient space for subsequent rotations.
(4) Multi-angle rotation and pixel aggregation: the square composite image is rotated in 30° increments for a total of 12 rotations (covering 360°). Each rotated image is generated, and all rotated images are then aggregated by taking the minimum pixel value at each position, strengthening the multi-directional representation of the main spectral contour.
(5) Size normalization: the final aggregated image is resized to 128 × 128 pixels to standardize the input for subsequent convolutional neural network models.
This CSM transformation effectively preserves the overall shape, symmetry, and multi-directional spatial features of the spectrum, providing deep learning models with rich 2D structural information for feature extraction and discrimination.
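The five steps above can be approximated without any plotting library, as sketched below. This is a simplified stand-in under stated assumptions: the "plot" is a coarse raster with one column per spectral point on a linear scale, rotation and resizing use nearest-neighbour sampling, and the raster height of 32 rows is arbitrary. None of these choices are specified in the text.

```python
import math

BG, FG = 255, 0  # white background, black spectral line

def rasterize(y, height=32):
    """Step 1: draw the spectrum as an axis-free line image."""
    lo, hi = min(y), max(y)
    span = (hi - lo) or 1.0
    rows = [round((hi - v) / span * (height - 1)) for v in y]
    img = [[BG] * len(y) for _ in range(height)]
    for c, r in enumerate(rows):
        prev = rows[c - 1] if c else r
        for rr in range(min(prev, r), max(prev, r) + 1):  # join neighbours
            img[rr][c] = FG
    return img

def mirror_concat_pad(img):
    """Steps 2-3: mirror horizontally, concatenate, pad to a square."""
    wide = [row + row[::-1] for row in img]
    h, w = len(wide), len(wide[0])
    side = math.ceil(math.hypot(h, w))  # diagonal length, so rotations fit
    top, left = (side - h) // 2, (side - w) // 2
    square = [[BG] * side for _ in range(side)]
    for r in range(h):
        square[top + r][left:left + w] = wide[r]
    return square

def rotate(img, degrees):
    """Nearest-neighbour rotation about the image centre."""
    n, c = len(img), (len(img) - 1) / 2
    cos_a, sin_a = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    out = [[BG] * n for _ in range(n)]
    for r in range(n):
        for col in range(n):
            x, y = col - c, r - c  # inverse mapping: output pixel -> source pixel
            sc = round(cos_a * x + sin_a * y + c)
            sr = round(-sin_a * x + cos_a * y + c)
            if 0 <= sr < n and 0 <= sc < n:
                out[r][col] = img[sr][sc]
    return out

def resize(img, size):
    """Step 5: nearest-neighbour resize to size x size."""
    n = len(img)
    return [[img[r * n // size][c * n // size] for c in range(size)]
            for r in range(size)]

def csm(y, size=128):
    square = mirror_concat_pad(rasterize(y))
    rotations = [rotate(square, 30 * k) for k in range(12)]  # step 4
    n = len(square)
    merged = [[min(rot[r][c] for rot in rotations) for c in range(n)]
              for r in range(n)]  # minimum keeps the dark line from every angle
    return resize(merged, size)

spectrum = [math.exp(-((i - 20) / 4.0) ** 2) for i in range(40)]  # toy peak
image = csm(spectrum)
print(len(image), len(image[0]))  # 128 128
```

Padding to the diagonal length is one way to provide the "sufficient space for subsequent rotations" mentioned in step (3); a different padding rule would also work as long as no rotated content is clipped.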
The PCANet pipeline consists of three main stages:
(1) Filter learning: in each stage, local patches are extracted from the training images, and a set of orthogonal PCA filters is learned. The input images are convolved with these filters to produce feature maps, which serve as the input for the next stage. This cascade continues for a predefined number of stages.
(2) Nonlinear binarization and block-wise histogramming: the output feature maps from the final stage are binarized (using a thresholding or hashing operation) to generate binary maps. These are then partitioned into non-overlapping or overlapping blocks, and a histogram of binary patterns is computed for each block.
(3) Feature vector construction: block-wise histograms are concatenated to form a high-dimensional feature vector that characterizes the input image's local and global structures. This vector is then used as the input for downstream regression or classification models.
PCANet's data-driven filter learning and hierarchical structure enable it to capture rich, discriminative features from the fused spectral images, facilitating accurate quantitative or qualitative analysis in subsequent modeling steps.
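Stages (2) and (3) of the pipeline are easy to illustrate in isolation. The sketch below assumes the filter responses from stage (1) are already available as a list of equally sized feature maps, binarizes them with a Heaviside step, packs the bits into one integer code per pixel, and concatenates non-overlapping block histograms. The PCA filter learning itself is not shown, and the toy maps, block size, and function names are illustrative assumptions.

```python
def binary_hash(feature_maps):
    """Binarize each map (value > 0 -> 1) and pack the bits into integer codes."""
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    coded = [[0] * cols for _ in range(rows)]
    for bit, fmap in enumerate(feature_maps):
        for r in range(rows):
            for c in range(cols):
                if fmap[r][c] > 0:
                    coded[r][c] += 1 << bit  # weight 2**bit for this map
    return coded

def block_histograms(coded, n_maps, block=2):
    """Histogram the codes in non-overlapping block x block tiles, then concatenate."""
    n_bins = 1 << n_maps
    features = []
    for r0 in range(0, len(coded) - block + 1, block):
        for c0 in range(0, len(coded[0]) - block + 1, block):
            hist = [0] * n_bins
            for r in range(r0, r0 + block):
                for c in range(c0, c0 + block):
                    hist[coded[r][c]] += 1
            features.extend(hist)
    return features

# Two toy 2 x 4 filter-response maps, not real PCANet outputs.
maps = [
    [[0.5, -0.2, 0.1, -0.3], [-0.1, 0.4, -0.6, 0.2]],
    [[-0.3, 0.7, 0.2, -0.1], [0.6, -0.5, 0.3, 0.9]],
]
coded = binary_hash(maps)
features = block_histograms(coded, n_maps=len(maps))
print(coded)     # [[1, 2, 3, 0], [2, 1, 2, 3]]
print(features)  # [0, 2, 2, 0, 1, 0, 1, 2]
```

With L maps per pixel the histogram has 2^L bins per block, which is why the final feature vector grows quickly with the number of filters.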
(1) Input layer: the model accepts a concatenated feature vector comprising: (i) spectral features extracted from preprocessed vis-NIR and XRF spectra; (ii) fundamental physicochemical parameters, including pH and redox potential (Eh); and (iii) optional physicochemical transformations, such as proton activity [H+] = 10−pH or other nonlinear terms reflecting the electrochemical environment.
(2) Feature mapping layer (FML): this layer generates primary feature nodes via two parallel mechanisms:
(a) Spectral feature nodes: nodes produced by applying random linear projections to the spectral features, consistent with standard BLS, to capture complex spectral patterns.
(b) Physicochemically sensitive nodes: nodes generated by applying nonlinear basis functions (sigmoid, tanh, polynomial combinations such as pH × Eh) to physicochemical inputs, explicitly modeling anticipated nonlinear responses associated with chemical equilibria and kinetics governed by pH and Eh.
(3) Enhancement layer (EL): to capture higher-order interactions and complex nonlinearities, the enhancement layer generates additional nodes based on the feature nodes. The activation functions within this layer are modulated by physicochemical context—for example, incorporating gating functions dependent on the difference between measured Eh and the theoretical pH-dependent redox threshold (fRedox(pH)), thus approximating the redox transition boundaries of the Cr(III)/Cr(VI) system under varying pH conditions.
(4) Output layer: the final layer consists of output neurons directly corresponding to the predicted concentrations of Cr(III) and Cr(VI).
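The layer structure above can be sketched as a single forward pass that produces the node vector fed to the output layer. Everything numeric here is an assumption made for illustration: the node counts, the centring of the sigmoid at pH 7, the gate sharpness, and in particular the linear form of fRedox(pH) = E0 − slope × pH with E0 = 1.35 V and slope = 0.138 V per pH unit, which corresponds to an HCrO4−/Cr3+ half-reaction that the text does not name. The output-layer weights (typically solved by ridge regression in BLS) are not shown.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_projection(n_in, n_out, rng):
    """Fixed random weights and biases, drawn once as in standard BLS."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

def project(x, weights, biases):
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def physicochemical_nodes(ph, eh_v):
    """(b) Nonlinear basis functions of pH and Eh (Eh in volts vs SHE)."""
    return [sigmoid(ph - 7.0), math.tanh(eh_v), ph * eh_v, 10.0 ** (-ph)]

def redox_threshold(ph, e0=1.35, slope=0.138):
    """Assumed linear fRedox(pH); the study's actual form is not given."""
    return e0 - slope * ph

def redox_gate(ph, eh_v, sharpness=10.0):
    """Soft gate near 1 above the threshold and near 0 below it."""
    return sigmoid(sharpness * (eh_v - redox_threshold(ph)))

def forward(spectral, ph, eh_v, fm, en):
    feature_nodes = project(spectral, *fm) + physicochemical_nodes(ph, eh_v)
    gate = redox_gate(ph, eh_v)
    enhancement = [gate * math.tanh(z) for z in project(feature_nodes, *en)]
    return feature_nodes + enhancement  # input to the output layer

rng = random.Random(0)
fm = make_projection(n_in=6, n_out=4, rng=rng)      # spectral feature nodes
en = make_projection(n_in=4 + 4, n_out=3, rng=rng)  # enhancement nodes
nodes = forward([0.2, 0.5, 0.1, 0.9, 0.3, 0.7], ph=6.5, eh_v=0.45, fm=fm, en=en)
print(len(nodes))  # 11
```

Multiplying the enhancement activations by the gate is only one way to "modulate" them by physicochemical context; additive biasing or separate node groups per redox regime would be equally consistent with the description.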
Physics-constrained loss function: model training is guided by a composite loss function that integrates data fidelity with adherence to fundamental chemical laws:
$$L_{\mathrm{Loss}} = \alpha \times L_{\mathrm{MSE}} + \beta \times L_{\mathrm{Redox}} + \gamma \times L_{\mathrm{Mass}} \tag{5}$$
(1) LMSE is the mean squared error between the predicted and experimentally measured concentrations:
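$$L_{\mathrm{MSE}} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i^{\mathrm{pred}} - y_i^{\mathrm{meas}}\right)^2 \tag{6}$$

where N is the number of samples and $y_i$ denotes the Cr(III) or Cr(VI) concentration of sample i.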
(2) LRedox is a penalty term promoting consistency between model predictions and the Nernst equation for the Cr(VI)/Cr(III) redox couple. The Nernst equation relates the measured Eh to the standard electrode potential E0Cr, the activities (approximated by concentrations) of Cr(III) and Cr(VI), and the solution pH:
![]() | (7) |
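(3) LMass is a penalty enforcing mass conservation, applicable when the total chromium concentration Crtotal is known:

$$L_{\mathrm{Mass}} = \left[\max\left(0,\ \mathrm{Cr(III)}_{\mathrm{pred}} + \mathrm{Cr(VI)}_{\mathrm{pred}} - \mathrm{Cr}_{\mathrm{total}}\right)\right]^{p} \tag{8}$$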
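To make the composite loss concrete, the sketch below evaluates the three terms for a small batch. The mass term follows eqn (8) directly. The redox term is an assumed form: it penalizes the squared difference between the measured Eh and the Eh implied by the predicted concentrations through a Nernst expression for the half-reaction HCrO4− + 7H+ + 3e− ⇌ Cr3+ + 4H2O with E0 = 1.35 V. The half-reaction, E0, the weights α, β, γ, the exponent p = 2, and averaging over the batch are all assumptions, since the text does not fix them.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1
F = 96485.33212  # Faraday constant, C mol^-1

def nernst_eh(cr3, cr6, ph, e0=1.35, n=3, m=7, temp_k=298.15):
    """Eh in V vs SHE for the assumed couple; concentrations stand in for activities."""
    k = R * temp_k * math.log(10) / (n * F)  # about 0.0197 V per decade for n = 3
    return e0 - k * math.log10(cr3 / cr6) - m * k * ph

def composite_loss(pred, meas, eh_meas, ph, cr_total,
                   alpha=1.0, beta=0.1, gamma=0.1, p=2, eps=1e-12):
    """pred and meas are lists of (Cr(III), Cr(VI)) pairs in the same units."""
    n = len(pred)
    l_mse = sum((p3 - m3) ** 2 + (p6 - m6) ** 2
                for (p3, p6), (m3, m6) in zip(pred, meas)) / n
    # eps keeps the logarithm finite if a prediction reaches zero.
    l_redox = sum((nernst_eh(max(p3, eps), max(p6, eps), h) - e) ** 2
                  for (p3, p6), e, h in zip(pred, eh_meas, ph)) / n
    l_mass = sum(max(0.0, p3 + p6 - t) ** p
                 for (p3, p6), t in zip(pred, cr_total)) / n
    return alpha * l_mse + beta * l_redox + gamma * l_mass

# Toy batch of one sample; values are illustrative, not measurements.
pred = [(1.0, 0.1)]
eh_consistent = [nernst_eh(1.0, 0.1, 5.0)]
loss_ok = composite_loss(pred, pred, eh_consistent, [5.0], [1.1])
loss_violation = composite_loss(pred, pred, eh_consistent, [5.0], [1.0])
print(loss_ok, loss_violation)
```

Because the assumed half-reaction has one Cr on each side, the Cr(III)/Cr(VI) ratio is dimensionless and the concentration units cancel in the redox term. Only predictions that exceed the known total are penalized by the mass term, so the second call adds γ × 0.1² to an otherwise zero loss.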
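With TP, TN, FP, and FN denoting true positives, true negatives, false positives, and false negatives (positive class: hazardous waste), the metrics are defined as follows.

(1) Accuracy:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{9}$$

(2) Precision:

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{10}$$

(3) Recall:

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{11}$$

(4) F1-score:

$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{12}$$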
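The four metrics can be computed from label lists as sketched below, treating "hazardous" as the positive class. The function name and the toy labels are illustrative assumptions. AUC is not covered because it requires predicted scores and not just labels.

```python
def classification_metrics(y_true, y_pred, positive="hazardous"):
    """Accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

H, G = "hazardous", "general"
y_true = [H, H, H, G, G, G, G, H]  # toy labels, not study data
y_pred = [H, H, G, G, G, H, G, H]
metrics = classification_metrics(y_true, y_pred)
print(metrics)

# Harmonic mean of a reported precision/recall pair (in percent).
f1_check = 2 * 87.1 * 91.0 / (87.1 + 91.0)
print(round(f1_check, 1))  # 89.0
```

The last two lines use the precision (87.1%) and recall (91.0%) reported for the best configuration in this work and recover the reported F1-score of 89.0%.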
In this study, “hazardous waste” is defined as samples with TCLP Cr(VI) concentrations exceeding 5 mg L−1, or TCLP total Cr concentrations exceeding 15 mg L−1; all other samples were classified as “general waste” in accordance with the national hazardous waste classification standard. All evaluation metrics were calculated based on the independent validation set. Statistical analysis and model comparisons were performed using Python, and the significance of differences in performance metrics among models was assessed using paired t-tests and Tukey's Honestly Significant Difference (HSD) post hoc tests.
In addition to the above evaluation metrics, we further compared the Cr(III)/Cr(VI) ratios predicted by our machine learning models with those calculated by the PHREEQC geochemical model under identical pH, Eh, and Cr concentration conditions. PHREEQC simulations were performed for both the idealized Cr–H2O system and for multi-component systems representative of the studied tailings. This comparison serves to validate the consistency and reliability of the proposed method against well-established theoretical models.
A Redundancy Analysis (RDA) was performed to statistically investigate the relationships between these physicochemical parameters and chromium speciation (see Fig. S4, SI). The analysis confirmed strong positive correlations between Eh and the concentrations of both Cr(III) and Cr(VI), and a notable inverse relationship between pH and Cr(VI) prevalence, which aligns with established electrochemical principles governing Cr(III)/Cr(VI) redox equilibria. The pronounced variability of Cr(VI), coupled with its clear statistical dependence on fluctuations in pH and Eh, highlights the inherent limitations of predictive models that rely solely on spectroscopic data. These findings strongly support the necessity of incorporating key physicochemical parameters into the modeling framework.
To visually validate this core hypothesis, Fig. 5 presents a comprehensive Eh–pH diagram that contextualizes our measured field data against the theoretical predictions from the established geochemical model, PHREEQC. The primary purpose of this diagram is to demonstrate that the chromium speciation in our field samples is governed by the same fundamental physicochemical principles described by thermodynamic theory.
The analysis reveals a strong consistency between theory and observation. The spatial distribution of our measured samples (circular markers) aligns remarkably well with the thermodynamic landscape predicted by PHREEQC (background contours). This is evidenced by two key observations: first, the measured Cr(VI) percentages (marker colors) closely mirror the theoretical trends, with higher values predominantly appearing in the predicted Cr(VI)-dominant region (upper-left). Second, the empirically derived 50% boundary from our data (dashed line) exhibits a negative slope that is strikingly similar to the theoretical 50% equilibrium line (solid line).
This convergence of evidence from three distinct sources—the first-principles geochemical model, the distribution of field measurements, and the data-driven statistical fit—provides a robust demonstration that Eh and pH are the primary physicochemical controls in this system. This finding is not merely descriptive; it forms the foundational rationale for our physics-informed modeling strategy. By confirming that these physical laws are active and observable in our dataset, we justify the explicit inclusion of these parameters and their governing relationships as constraints within our PI-BLS-Net framework, a central theme evaluated in the subsequent sections.
| Model configuration (a) | Accuracy (%) | F1-score (%) | AUC |
|---|---|---|---|
| Base BLS (1D spectra) | 71.9 | 71.1 | 0.77 |
| + GAF-vis-NIR | 75.0 | 74.5 | 0.81 |
| + GAF-fusion (vis-NIR & XRF) | 81.3 | 80.8 | 0.87 |
| + PCANet | 85.9 | 85.3 | 0.92 |
| + physics-informed (PI) module | 89.5 | 89.0 | 0.95 |

(a) Each row builds upon the previous one. The final row represents the complete proposed model (GAF-fusion + PCANet + PI). Metrics are for binary classification on the independent validation set.
The ablation study, as detailed in Table 1, clearly demonstrates the stepwise and essential contributions of each component. Starting from a baseline BLS model using 1D spectra, converting the data to 2D GAF images and subsequently fusing both vis-NIR and XRF modalities led to a significant performance boost, confirming the synergistic advantage of multimodal information. Introducing the PCANet for hierarchical feature learning, with hyperparameters optimized via Bayesian optimization as detailed in Appendix A.2 in the SI, provided a further substantial gain. Crucially, the final integration of the physics-informed (PI) module achieved the highest performance, contributing a final increase of 3.6 percentage points in accuracy (from 85.9% to 89.5%). This incremental analysis validates the effectiveness of our proposed composite framework, with the PI component being particularly vital for achieving the top performance.
Building on these findings, a comprehensive comparison of various model configurations was performed, with results detailed in Table 2. These results were obtained under identical conditions on the independent validation set and corroborate the findings from the ablation study.
| Config. (a) | Model description | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) | AUC |
|---|---|---|---|---|---|---|
| 1 | Raw spectra + BLS | 70.3 | 71.0 | 69.8 | 70.4 | 0.76 |
| 2 | CSM-NIR + BLS | 73.4 | 74.1 | 72.5 | 73.3 | 0.79 |
| 3 | GAF-NIR + BLS | 75.0 | 75.8 | 74.2 | 75.0 | 0.81 |
| 4 | CSM-fusion + BLS | 79.7 | 80.2 | 78.9 | 79.5 | 0.85 |
| 5 | GAF-fusion + BLS | 81.3 | 81.9 | 80.5 | 81.2 | 0.87 |
| 6 | CSM-NIR + XRF + PCANet + BLS | 84.4 | 85.0 | 83.5 | 84.2 | 0.90 |
| 7 | GAF-fusion + PCANet + BLS | 85.9 | 86.3 | 85.1 | 85.7 | 0.92 |
| 8 | GAF-fusion + PCANet + PI-BLS | 89.5 | 87.1 | 91.0 | 89.0 | 0.95 |
| 9 | CSM-fusion + PCANet + PI-BLS | 88.1 | 86.5 | 89.2 | 87.8 | 0.94 |

(a) Config. 1 is the baseline model. Configs. 2 and 3 are unimodal spectra-image transformations. Configs. 4 and 5 are multimodal fusion. Configs. 6 and 7 add PCANet feature extraction. Configs. 8 and 9 represent the complete PI-BLS framework proposed in this study. Precision, recall, and F1-score are calculated for the binary classification task, treating ‘hazardous waste’ as the positive class.
The data in Table 2 confirm that models incorporating advanced feature representations consistently outperform the baseline. While both GAF and CSM transformations proved effective, GAF-based models (e.g., Config. 3 vs. 2) generally yielded slightly better results in this study. The most significant performance leap is observed with the fully-integrated frameworks (Configs. 8 and 9), which combine multimodal fusion, PCANet feature extraction, and the PI module. The premier model, GAF-fusion + PCANet + PI-BLS (Config. 8), achieved the best results across all metrics, with an accuracy of 89.5%, a recall of 91.0%, and an AUC of 0.95. The high recall is particularly important for this application, as it signifies the model's strong capability to correctly identify hazardous waste samples, thereby minimizing the critical risk of false negatives in environmental risk assessment. The robust AUC value further demonstrates the excellent discriminative power of the proposed PI-BLS model for prediction on unseen data.
First, to assess the chemical rationality of our model, we compared the PI-BLS-Net predictions directly against the theoretical classifications derived from PHREEQC simulations (Fig. S5a). A high degree of consistency was observed: the data-driven model and the theoretical model assigned the same class to 96.8% of samples. This strong alignment indicates that our physics-informed framework successfully learned to make classification decisions that are consistent with fundamental thermodynamic principles governing chromium leaching, thereby enhancing its interpretability and trustworthiness. It provides strong evidence that the PI module actively guides the model towards physically plausible solutions and does not behave as a pure black box.
Next, to establish a performance baseline for the theoretical model itself, we evaluated the PHREEQC classifications against the laboratory-measured actual classifications (Fig. S5b). The PHREEQC model achieved a standalone accuracy of 86.2%, demonstrating its inherent capability to predict the hazard class based on thermodynamic equilibrium. However, the 25 false positives (predicting “hazardous” for samples that were measured as “general”) suggest that an ideal equilibrium model may overestimate the leaching potential in some real-world scenarios. This discrepancy is likely due to factors such as kinetic limitations, complex mineral–water interactions, or matrix effects not fully captured by the simplified thermodynamic simulation.
Finally, the practical predictive performance of our PI-BLS-Net is visualized in Fig. S5c, which compares its predictions against the measured actual classifications. Our model achieved an accuracy of 85.8% on the entire dataset, which is highly competitive with the theoretical PHREEQC model. Crucially, our PI-BLS-Net model exhibits a more balanced error profile. The ability of the PI-BLS-Net to achieve a predictive accuracy comparable to a first-principles geochemical model, while maintaining strong chemical consistency (as shown in Fig. S5a), validates our hybrid physics-informed machine learning approach. It suggests that by integrating rich, multi-modal spectral data, the PI-BLS-Net effectively captures nuances from the sample matrix that complement the explicit physical constraints, leading to a robust and practical classification tool for environmental risk assessment.
As shown in Table 3, our PI-BLS-Net demonstrates superior performance across all evaluation metrics while maintaining reasonable computational efficiency. Compared to classical machine learning approaches such as Random Forest and XGBoost, our model achieves a significantly higher accuracy and F1-score, highlighting the benefits of our advanced feature extraction pipeline and physics-informed architecture.
When compared with deep learning models, the advantages of our approach become even more apparent. The PI-BLS-Net surpasses the standard CNN-LSTM hybrid model by a substantial margin, likely due to the latter's inability to effectively leverage domain knowledge. More importantly, our model also outperforms a traditional PINN framework by 7.2% in accuracy. This improvement can be attributed to the novel broad learning strategy, which avoids the vanishing/exploding gradient problems common in deep PINNs, and our domain-specific implementation of physical constraints within the efficient BLS architecture.
In terms of computational resources, the PI-BLS-Net offers a compelling balance. It is significantly faster and requires less memory than deep learning architectures such as PINN and CNN-LSTM, while being only moderately more demanding than the lower-performing classical models. These results collectively position our PI-BLS-Net as a highly effective and efficient solution for rapid and accurate chromium speciation analysis in complex environmental samples.
Looking forward, future research could address these limitations by employing larger and more diverse datasets to validate the robustness and scalability of the methodology. Furthermore, integrating more advanced optimization algorithms or parallel computing techniques may help to reduce computational time. The exploration of model interpretability, possibly through explainable AI methods, also represents an important direction for subsequent work. Finally, application of the proposed framework to other domains or tasks could further demonstrate its versatility and practical value.
By employing GAF and CSM methods, one-dimensional spectral data were effectively transformed into two-dimensional image representations, from which hierarchical features were extracted using the PCANet. Experimental results demonstrated that the PI-BLS-Net achieves outstanding performance in predicting Cr(III) and Cr(VI), particularly in the presence of complex nonlinear relationships.
This approach not only improves prediction accuracy but also enhances the physical interpretability of the model, offering a novel solution for environmental monitoring and pollution management.
This journal is © The Royal Society of Chemistry 2026