Combining a physics-informed broad learning system and multimodal spectral fusion for accurate analysis of chromium speciation in tailings

Qingya Wang abc, Shubin Lyu ad, Haoyu Zou ad and Fusheng Li *ad
aSchool of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, PR China. E-mail: qqwqy@163.com
bNational Key Laboratory of Uranium Resource Exploration-Mining and Nuclear Remote Sensing, East China University of Technology, Nanchang, Jiangxi, PR China
cCollege of Information Engineering, Jiangxi Polytechnic University, Jiujiang, Jiangxi, PR China
dYangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou, Zhejiang, PR China

Received 1st October 2025, Accepted 18th October 2025

First published on 3rd December 2025


Abstract

A novel analytical framework is presented that combines a physics-informed broad learning system network (PI-BLS-Net) with multimodal spectral fusion to enable rapid, low-cost, and interpretable chromium speciation in industrial tailings. The PI-BLS-Net integrates visible–near-infrared (vis-NIR) and X-ray fluorescence (XRF) spectral data, transforming one-dimensional spectra into two-dimensional images via Gramian Angular Field (GAF) techniques to enhance feature extraction. Discriminative features are learned from the fused spectral images using a Principal Component Analysis Network (PCANet), with key geochemical parameters (measured pH and Eh) and physicochemical constraints—such as redox equilibrium and mass balance based on the Nernst equation—explicitly embedded into both the model architecture and its loss function. This approach enables accurate quantification of Cr(III) and Cr(VI) concentrations in rare earth tailings, supporting scientific waste classification and risk assessment. Validation on 218 tailings samples demonstrates that the PI-BLS-Net achieves excellent performance in classifying tailings hazard based on chromium speciation, with an accuracy of 89.5%, F1-score of 89.0%, and area under the receiver operating characteristic curve (AUC) of 0.95 on an independent test set. Ablation studies further confirm the significant contributions of spectral-to-image transformation, multimodal fusion, PCANet feature extraction, and especially the physics-informed module to overall model performance. This work provides a rapid, interpretable, and robust approach for pollutant speciation analysis in complex matrices, offering valuable technical support for the scientific management and environmental risk assessment of Cr-bearing tailings.


1 Introduction

Chromium (Cr), as a critical industrial element, exists in the environment mainly as trivalent (Cr(III)) and hexavalent (Cr(VI)) species. Cr(III) is generally considered an essential micronutrient with low mobility and toxicity, while Cr(VI) is highly toxic, carcinogenic, and mobile, posing a significant threat to ecosystems and human health. Accurate speciation and quantification of Cr(VI) in industrial Cr-bearing residues is therefore a decisive factor for distinguishing between general solid waste and hazardous waste categories, directly impacting environmental risk assessment and management strategies.

Traditional laboratory-based chromium speciation analyses, such as selective extraction coupled with inductively coupled plasma mass spectrometry (ICP-MS) or atomic absorption spectroscopy (AAS), provide reliable results but are time-consuming, labor-intensive, and require complex sample preparation.1,2 These limitations hinder their applicability for high-throughput or on-site screening of large-scale tailings repositories, especially where rapid hazard classification is demanded. The development of rapid, cost-effective, and environmentally friendly analytical technologies capable of providing chromium speciation information is thus of significant practical importance.

Spectroscopic techniques, notably visible–near-infrared (vis-NIR) and X-ray fluorescence (XRF), have emerged as powerful tools for rapid and non-destructive analysis of environmental samples. Vis-NIR spectroscopy is sensitive to molecular vibrations and can reflect information on organic and certain inorganic chemical bonds, whereas XRF spectroscopy provides elemental composition data. However, single-modality spectroscopy often lacks sufficient selectivity for reliable quantification of specific species such as Cr(VI) in complex matrices due to background interference and overlapping signals. Integrating multi-modal spectral information has shown promise for improving analytical accuracy and robustness. Recent studies, including those from our research group, have demonstrated that the fusion of vis-NIR and XRF spectroscopy significantly enhances the quantitative analysis of heavy metals and geochemical parameters in soils, coal, and mineral matrices.3–6 For instance, our previous work established multi-modal fusion frameworks for quantitative soil cadmium analysis and in situ coal geochemistry, and developed Gramian Angular Summation (GAS) transformation and PCANet-based deep learning methods for rapid heavy metal screening.3–5

Beyond spectral fusion, critical preprocessing steps such as spectral baseline correction,7 spectral band selection, and super-resolution image reconstruction have been shown to further improve the classification and quantification of mineral samples.8,9 Nevertheless, challenges remain in handling spectral complexity, background interference, and feature selection for reliable chromium speciation.

Machine learning (ML) and deep learning (DL) approaches have been increasingly applied to extract meaningful patterns from high-dimensional spectral data, thus enhancing the performance of spectral analysis models. However, conventional “black-box” models often suffer from poor interpretability and require large training datasets and extensive parameter tuning. Broad Learning System (BLS), a recently developed shallow neural architecture, offers efficient and scalable modeling by expanding network width rather than depth, and is well-suited for heterogeneous data fusion tasks.6 Despite these technical advances, many current approaches lack the incorporation of essential physicochemical constraints, limiting their generalizability and scientific interpretability.

With the increasing complexity of environmental datasets, embedding physical and chemical knowledge into machine learning models, an approach known as physics-informed machine learning (PIML), has become an emerging trend. Physics-informed (PI) models can enhance prediction reliability and chemical interpretability by enforcing constraints such as mass balance and redox equilibrium. For example, our recent work introduced a BLS-Net optimized with dual PCA and improved PSO-CARS variable selection for XRF-based heavy metal determination, which incorporated domain-relevant knowledge at both data fusion and model optimization stages.6 However, few studies have fully integrated redox chemistry constraints (the Nernst equation) into multi-modal spectral models for chromium speciation.

Taken together, while progress has been made in rapid screening and multi-modal spectral analysis of environmental contaminants, significant challenges remain in integrating physical chemistry constraints, handling spectral complexity, and achieving accurate, interpretable chromium speciation in real-world tailings samples.

Fig. 1 presents a schematic diagram of the proposed methodology. The flowchart summarizes the key components and logical structure of the study: spectral data acquisition, multi-modal data fusion, spectral-image transformation, hierarchical feature extraction, and physics-informed modeling for chromium speciation.


Fig. 1 Overview of the proposed analytical workflow for rapid chromium speciation in rare earth tailings.

Based on these considerations, this study aims to establish a novel, rapid and robust chromium speciation methodology for rare earth tailings in Ganzhou, China. Our approach leverages the fusion of vis-NIR and XRF spectral data, advanced spectral-image transformation techniques, and PI broad learning for accurate and explainable prediction of Cr(III) and Cr(VI) concentrations. This framework builds on and extends the recent advances and methodologies developed by our research group, including multi-modal data fusion, innovative spectral baseline correction, feature selection, and interpretable machine learning for geochemical analysis.3–10

The primary purpose of this study is to address the critical need for rapid, accurate, and interpretable chromium speciation in complex industrial tailings, which is essential for effective environmental risk assessment and scientific waste management. By embedding physicochemical domain knowledge into advanced data-driven models, the proposed framework not only enhances analytical performance but also improves model transparency and reliability. The significance of this research lies in its potential to provide a robust technical foundation for the scientific classification, management, and environmental risk assessment of Cr-bearing tailings, thereby contributing to safer waste disposal and the sustainable utilization of mineral resources.

2 Materials and methods

2.1 Study area

This study focused on an ion-adsorption type rare earth tailings area near Ganzhou, Jiangxi Province, China. From August to December 2024, a total of 218 weathered crust profile tailings samples were collected, primarily from the 0–20 cm surface layer. Sampling points were distributed across three main subregions near mining areas, including tailings and beneficiation plants (see Fig. 2).
Fig. 2 Distribution of sampling points and a sampling schematic diagram in the study area.

The study area was divided into three regions: Region A1 (25.80°N–25.86°N, 114.10°E–114.22°E) and Region A2 (25.72°N–25.78°N, 114.15°E–114.28°E) comprised a total of 154 samples for model development (training and internal validation). Region B1 (25.87°N–25.90°N, 114.30°E–114.40°E; 64 samples), located in a distant area with distinct geochemical characteristics, was reserved as an independent external validation set and not involved in model training.

At each sampling site, five sub-samples from the 0–20 cm soil layer were collected using a small shovel, thoroughly mixed to form a single composite sample (approximately 2 kg), and placed in anti-static polyethylene bags. All samples were transported to the laboratory, freeze-dried, ground, and sieved sequentially through 20-mesh and 100-mesh screens to ensure homogeneity, and then stored in specialized sample bottles for subsequent XRF and vis-NIR analysis. For quality control, one sample from every batch of 30 was randomly selected for triplicate analysis.

This spatial allocation strategy—using regions A1 and A2 for model development and reserving region B1 for independent external validation—was designed to avoid overfitting and overly optimistic performance estimation due to spatial autocorrelation. It also better simulates real-world scenarios where the model must perform reliably on new, previously unmeasured areas, providing a robust and objective evaluation of practical applicability. Furthermore, within the dataset from A1 and A2, 70% of the samples were randomly assigned to the training set and 30% to an internal test set for model development.
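The allocation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_by_region`, the region labels as strings, the fixed seed, and the rounding of the 70% fraction are all assumptions made for the example.

```python
import numpy as np

def split_by_region(regions, train_fraction=0.7, seed=0):
    """Return (train_idx, internal_test_idx, external_idx) as index arrays.

    Samples labelled "A1" or "A2" form the development pool, which is
    shuffled and split at `train_fraction`; samples labelled "B1" are kept
    out entirely as the external validation set.
    """
    regions = np.asarray(regions)
    dev_idx = np.flatnonzero(np.isin(regions, ["A1", "A2"]))
    ext_idx = np.flatnonzero(regions == "B1")

    rng = np.random.default_rng(seed)          # seed is an illustrative choice
    shuffled = rng.permutation(dev_idx)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:], ext_idx
```

Keeping the B1 indices out of the shuffle is what prevents any spatial leakage from the external region into model development.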

2.2 Laboratory analysis of physical and chemical properties

2.2.1 Spectral measurement. For spectroscopic measurements, samples were prepared according to standard protocols: air-dried, ground, and sieved to <75 µm to ensure homogeneity. Vis-NIR spectra were collected using a FieldSpec3 spectrometer (ASD Inc., USA) covering 350–2500 nm (1 nm intervals). Each sample was placed in a 10 cm diameter, 1 cm height container and scanned 10 times at randomly selected positions; the three spectra with the highest signal-to-noise ratio were averaged to represent each sample. XRF analysis was performed using a TS-XH4000-G instrument (Tecsonde Inc., China) in soil mode, with each measurement lasting 180 seconds and repeated three times per sample. Instrument calibration for both vis-NIR and XRF was performed regularly using certified reference materials and calibration standards every 30 samples to ensure analytical accuracy and instrument stability.
2.2.2 Chromium speciation analysis. The concentrations of Cr(III) and Cr(VI) were determined using a selective extraction method followed by analysis on an inductively coupled plasma mass spectrometer (ICP-MS, Agilent 7900, USA). The extraction of Cr(VI) was adapted from US EPA Method 3060A and ISO 15192:2020, where 0.5 g of a prepared sample was extracted with 10 mL of a 0.1 mol L−1 KH2PO4/K2HPO4 buffer (pH 7.0). After shaking for 1 hour, the extract was centrifuged and filtered (0.22 µm) for Cr(VI) determination. Total chromium (Cr-total) was measured separately after acid digestion (HNO3–H2O2) of another sample aliquot. The Cr(III) concentration was then calculated as the difference between Cr-total and Cr(VI).

Quality assurance and control (QA/QC) were maintained throughout the analyses. The accuracy of the Cr-total measurement was verified using the certified reference material NIST SRM 2710a, with results falling within the certified range. For the Cr(VI) analysis, method performance was validated using procedural blanks and matrix-spiked samples, with spike recoveries consistently between 95–105%. Method precision was assessed by analyzing one random sample in triplicate for every 30 samples, yielding a relative standard deviation (RSD) consistently below 10%. The method detection limits in the measurement solution were 0.01 µg L−1 for Cr(VI) and 0.02 µg L−1 for Cr(III).

2.2.3 Toxicity characteristic leaching procedure. The leaching potential of chromium was assessed using the Toxicity Characteristic Leaching Procedure (TCLP) in strict accordance with the Chinese national standard HJ/T 299-2007. For each sample, 5.00 g of prepared material (<75 µm, dry weight) was extracted with 100 mL of an acetic acid buffer (liquid-to-solid ratio of 20:1) for 18 ± 2 hours at 20–25 °C on a rotary tumbler (30 ± 2 rpm). The extraction fluid pH (either 4.93 ± 0.05 or 2.88 ± 0.05) was selected based on the sample's alkalinity. After agitation, the leachate was filtered (0.6–0.8 µm glass fiber) for subsequent analysis.

The concentrations of total chromium and Cr(VI) in the leachate were quantified by ICP-MS (Agilent 7900, USA), following appropriate sample preservation and filtration steps as described in the standard. Quality assurance for the TCLP analysis followed the same protocol as the speciation analysis, including the use of certified standard solutions for calibration, NIST SRM 2710a for validating total Cr measurements, and procedural blanks, spiked samples, and triplicate analyses for ensuring the reliability of Cr(VI) data.

2.2.4 pH and redox potential. The pH was measured potentiometrically by mixing air-dried, sieved samples with deionized water at a 1:2.5 (w/v) ratio, following standard soil analysis protocols. After equilibration, pH was recorded using a calibrated glass electrode pH meter (Mettler Toledo, Switzerland). Redox potential (Eh) was determined with a platinum combination electrode (Ag/AgCl reference) connected to a millivoltmeter; readings were converted to values relative to the standard hydrogen electrode (SHE), accounting for reference potential and temperature. All measurements were performed in accordance with internationally recognized methods to ensure data reliability and comparability.
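The conversion of a raw electrode reading to the SHE scale is an additive offset. The sketch below assumes the reference potential is supplied by the user for the actual filling solution and measurement temperature; the function name `eh_vs_she` is hypothetical, and the 197 mV used in the usage line is only a commonly quoted figure for a saturated-KCl Ag/AgCl electrode at 25 °C, not a value reported in this study.

```python
def eh_vs_she(e_measured_mv: float, e_reference_mv: float) -> float:
    """Convert a Pt electrode reading (mV vs. the Ag/AgCl reference) to Eh (mV vs. SHE).

    `e_reference_mv` is the potential of the reference electrode against SHE
    at the measurement temperature; it must come from the electrode
    documentation, since it depends on filling solution and temperature.
    """
    return e_measured_mv + e_reference_mv


# Illustrative usage with an assumed reference potential of 197 mV.
example_eh = eh_vs_she(-150.0, 197.0)
```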
2.2.5 Geochemical modeling and chemical consistency assessment. To provide a fundamental geochemical benchmark and to validate the chemical consistency of our data-driven PI-BLS-Net model, comparative simulations were conducted using the widely recognized geochemical speciation software PHREEQC (version 3.8.6). PHREEQC is capable of calculating aqueous speciation and saturation states of minerals under specified chemical conditions by solving thermodynamic equilibrium equations.

Specifically, for the assessment of chromium speciation in TCLP leachates, PHREEQC simulations were performed for each tailings sample. The input parameters for these simulations included the measured initial total chromium concentration, the designated TCLP extraction pH (either 2.88 or 4.93, as selected based on sample alkalinity), a liquid-to-solid ratio of 20:1, and a temperature of 25 °C. The phreeqc.dat database was utilized for all calculations. The primary outputs from PHREEQC were the equilibrium concentrations of Cr(VI) and total Cr in the simulated leachate. These predicted concentrations were then converted into a binary classification (“hazardous” or “general”) using the same national hazardous waste classification standards applied to the measured TCLP data (i.e., “hazardous” if TCLP Cr(VI) >5 mg L−1 or TCLP total Cr >15 mg L−1). This PHREEQC-derived theoretical classification served as a critical geochemical reference for evaluating the physical interpretability and consistency of our PI-BLS-Net predictions.
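The binary labelling rule quoted above is simple enough to state as code. The following is a minimal sketch using the two thresholds given in the text; the function and constant names are illustrative and not taken from the study.

```python
# Thresholds quoted in the text, in mg/L of TCLP leachate.
CR6_LIMIT_MG_L = 5.0
CR_TOTAL_LIMIT_MG_L = 15.0


def classify_tclp(cr6_mg_l: float, cr_total_mg_l: float) -> str:
    """Label a sample 'hazardous' if either leachate limit is exceeded."""
    if cr6_mg_l > CR6_LIMIT_MG_L or cr_total_mg_l > CR_TOTAL_LIMIT_MG_L:
        return "hazardous"
    return "general"
```

The same function can be applied to measured TCLP concentrations and to PHREEQC-simulated ones, so both label sets are produced by an identical rule.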

2.3 Spectral preprocessing

The 400–2450 nm range was retained for analysis. After comparing multiple preprocessing techniques—including standard normal variate (SNV), multiplicative scatter correction (MSC), and Savitzky–Golay (SG) filtering—SG combined with MSC (SG + MSC) was ultimately selected as the optimal preprocessing approach to enhance the performance of the predictive models, as illustrated in Fig. 3.
Fig. 3 Raw NIR spectra (A) and preprocessed NIR spectra (B).
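As an illustration of the MSC step, the sketch below regresses each spectrum on a reference spectrum and removes the fitted offset and slope. It assumes the mean spectrum as the reference (a common default, not stated in the text) and omits the Savitzky–Golay step; the function name `msc` is hypothetical.

```python
import numpy as np

def msc(spectra, reference=None):
    """Multiplicative scatter correction.

    spectra:   array of shape (n_samples, n_wavelengths)
    reference: optional reference spectrum; defaults to the mean spectrum
               (an assumption made for this sketch).
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)

    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        # Fit s ≈ intercept + slope * ref, then undo the additive and
        # multiplicative scatter terms.
        slope, intercept = np.polyfit(ref, s, 1)
        corrected[i] = (s - intercept) / slope
    return corrected
```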

For XRF spectral data, the optimal energy range was determined through iterative trial-and-error, with energies below 22.4 keV ultimately selected. The reported XRF intensities from the instrument were normalized by dividing by the live time to adjust for count rate (cps). The first step in XRF preprocessing was baseline correction to remove slope artifacts present in the spectra. The Compton peak at 20.17 keV was normalized. Both the raw and Compton-normalized spectra were smoothed using a Savitzky–Golay filter with a window size of 5 and a polynomial order of 2.

2.3.1 GAF. To enable the application of 2D convolutional neural networks for quantitative spectral analysis, one-dimensional spectral data were transformed into two-dimensional image representations using the GAF method. This technique encodes the value and position of each point in the spectral vector into angular space, allowing the construction of a matrix that captures temporal dependencies and spatial relationships within the spectrum.

First, each preprocessed spectral vector X = [x1, x2, …, xn] was normalized to the interval [−1, 1]. A common method for this normalization is min–max scaling:

 
$$\tilde{x}_i = \frac{(x_i - x_{\max}) + (x_i - x_{\min})}{x_{\max} - x_{\min}} \tag{1}$$
where xmin and xmax are the minimum and maximum values of the entire spectral vector X, and x̃i is the normalized value of xi.

Next, the normalized values x̃i are converted into angular coordinates θi using the arccosine function. The index i (representing the position or time step in the spectral vector) can also be mapped, often linearly scaled to cover a range such as [0, π] or [0, 2π] if needed for visualization, but the standard matrix definition focuses on the angles derived from the values:

 
$$\theta_i = \arccos(\tilde{x}_i) \tag{2}$$
where −1 ≤ x̃i ≤ 1 and 0 ≤ θi ≤ π, so that θi uniquely represents the normalized spectral value x̃i. The Gramian Angular Field matrices are constructed by considering the sum or difference of the angles for each pair of points (x̃i, x̃j). The resulting matrices, Gramian Angular Summation Field (GASF) and Gramian Angular Difference Field (GADF), are n × n matrices defined as follows:
 
$$\mathrm{GASF}_{i,j} = \cos(\theta_i + \theta_j) \tag{3}$$

$$\mathrm{GADF}_{i,j} = \sin(\theta_i - \theta_j) \tag{4}$$

The element at position (i, j) in the GASF matrix represents the cosine of the sum of the angles corresponding to the i-th and j-th points in the spectral vector, while the GADF matrix represents the sine of the difference of these angles. These matrices effectively preserve the temporal correlation and relative relationships within the original spectral sequence in a 2D image format. The conversion process between GADF and GASF is shown in Fig. S1.
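The GAF construction of eqn (1)–(4) can be sketched in a few lines of NumPy. This is an illustrative implementation under the definitions given above, not the authors' code; the function names are hypothetical, and any downsampling of the spectrum before the transformation is left out because the text does not specify it.

```python
import numpy as np

def rescale_to_unit_interval(x):
    """Min-max rescaling of a 1D spectrum to [-1, 1], as in eqn (1)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        raise ValueError("a constant spectrum cannot be rescaled")
    return ((x - x_max) + (x - x_min)) / (x_max - x_min)

def gramian_angular_fields(x):
    """Return (GASF, GADF), each of shape (n, n), for a 1D spectrum x."""
    # Clip guards against tiny floating-point excursions outside [-1, 1].
    x_tilde = np.clip(rescale_to_unit_interval(x), -1.0, 1.0)
    theta = np.arccos(x_tilde)                       # eqn (2)
    gasf = np.cos(theta[:, None] + theta[None, :])   # eqn (3)
    gadf = np.sin(theta[:, None] - theta[None, :])   # eqn (4)
    return gasf, gadf
```

By construction the GASF is symmetric and the GADF is antisymmetric with a zero diagonal, which gives a quick sanity check on any implementation.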

2.3.2 CSM. To efficiently map one-dimensional spectral data into two-dimensional images, a Circular Spectral Mapping (CSM) method was developed.11 This approach applies a series of graphical and geometric transformations to encode the global shape and symmetry of the spectrum as a structured 2D image, facilitating the model's ability to capture complex spectral patterns.

The specific steps are as follows:

(1) Spectral curve plotting: for each preprocessed spectral vector (wavelength–intensity pairs), the data are plotted as a line chart without axes. A linear scale is used for vis-NIR spectra, while a logarithmic scale can be selected for XRF spectra depending on their characteristics. This process generates a clean spectral line image, removing axes, labels, and other distractions.

(2) Image acquisition and horizontal mirroring: the spectral plot is converted into a 2D grayscale pixel matrix, and a horizontally mirrored version of the image is generated.

(3) Concatenation and padding: the original and mirrored images are concatenated horizontally to form a wider composite image. A uniform background color is then used to pad the image into a square matrix, providing sufficient space for subsequent rotations.

(4) Multi-angle rotation and pixel aggregation: the square composite image is rotated in 30° increments for a total of 12 rotations (covering 360°). Each rotated image is generated, and all rotated images are then aggregated by taking the minimum pixel value at each position, strengthening the multi-directional representation of the main spectral contour.

(5) Size normalization: the final aggregated image is resized to 128 × 128 pixels to standardize the input for subsequent convolutional neural network models.

This CSM transformation effectively preserves the overall shape, symmetry, and multi-directional spatial features of the spectrum, providing deep learning models with rich 2D structural information for feature extraction and discrimination.

2.3.3 Multimodal image fusion. To fully exploit the complementary information contained in XRF and NIR spectra, their respective 2D image representations—obtained via CSM or GAF—were fused to generate multimodal inputs for subsequent feature extraction. Specifically, the transformed images from NIR and XRF modalities were stacked along the channel dimension to form a three-channel composite image; the first two channels encoded the NIR and XRF data, respectively, while the third channel was initialized as zero to standardize the input shape. This simple yet effective fusion strategy integrates both elemental (XRF) and molecular (NIR) characteristics, providing a unified representation that preserves the unique spectral features of each modality. Such multimodal fusion enhances the input diversity and supports more robust feature learning in downstream models.
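The channel stacking described above can be sketched as follows. The channel-last layout and the function name `fuse_modalities` are assumptions made for the example; the text specifies only the channel order (NIR, XRF, zeros).

```python
import numpy as np

def fuse_modalities(nir_img, xrf_img):
    """Stack NIR and XRF images with a zero third channel.

    Both inputs must be 2D arrays of identical shape (H, W).
    Returns an array of shape (H, W, 3); channel-last is an assumption.
    """
    nir_img = np.asarray(nir_img, dtype=float)
    xrf_img = np.asarray(xrf_img, dtype=float)
    if nir_img.ndim != 2 or nir_img.shape != xrf_img.shape:
        raise ValueError("inputs must be 2D images of the same shape")
    zero_channel = np.zeros_like(nir_img)
    return np.stack([nir_img, xrf_img, zero_channel], axis=-1)
```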
2.3.4 PCANet feature extraction. Following spectral-to-image transformation and fusion, hierarchical feature extraction was performed using the PCANet framework. The PCANet is a lightweight and interpretable deep learning model that employs cascaded Principal Component Analysis (PCA) to learn convolutional filters in an unsupervised manner, making it computationally efficient and suitable for scenarios with limited training data.

The PCANet pipeline consists of three main stages:

(1) Filter learning: in each stage, local patches are extracted from the training images, and a set of orthogonal PCA filters is learned. The input images are convolved with these filters to produce feature maps, which serve as the input for the next stage. This cascade continues for a predefined number of stages.

(2) Nonlinear binarization and block-wise histogramming: the output feature maps from the final stage are binarized (using a thresholding or hashing operation) to generate binary maps. These are then partitioned into non-overlapping or overlapping blocks, and a histogram of binary patterns is computed for each block.

(3) Feature vector construction: block-wise histograms are concatenated to form a high-dimensional feature vector that characterizes the input image's local and global structures. This vector is then used as the input for downstream regression or classification models.

PCANet's data-driven filter learning and hierarchical structure enable it to capture rich, discriminative features from the fused spectral images, facilitating accurate quantitative or qualitative analysis in subsequent modeling steps.
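To make the filter-learning and binarization stages concrete, the sketch below learns one stage of PCA filters from image patches and hashes a stack of feature maps into integer codes. It follows the commonly used PCANet formulation (patch-mean removal, leading eigenvectors as filters, Heaviside binarization with binary weights); the patch size, number of filters, and function names are illustrative assumptions, and convolution and block-wise histogramming are omitted.

```python
import numpy as np

def learn_pca_filters(images, patch_size=3, n_filters=4):
    """Learn one stage of PCA filters.

    images: array of shape (n_images, H, W).
    Returns filters of shape (n_filters, patch_size, patch_size).
    """
    images = np.asarray(images, dtype=float)
    k = patch_size
    patches = []
    for img in images:
        h, w = img.shape
        for r in range(h - k + 1):
            for c in range(w - k + 1):
                p = img[r:r + k, c:c + k].ravel()
                patches.append(p - p.mean())      # remove the patch mean
    x = np.array(patches)                          # (n_patches, k*k)
    cov = x.T @ x / x.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_filters]  # keep the leading ones
    return eigvecs[:, order].T.reshape(n_filters, k, k)

def binary_hash(feature_maps):
    """Binarize (n_filters, H, W) maps and combine them into one integer map."""
    bits = (np.asarray(feature_maps) > 0).astype(np.int64)
    weights = 2 ** np.arange(bits.shape[0])
    return np.tensordot(weights, bits, axes=1)     # shape (H, W)
```

Because the filters are eigenvectors of a symmetric matrix, they are mutually orthonormal, which matches the "orthogonal PCA filters" described in stage (1).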

2.4 PI-BLS-Net for chromium speciation prediction

The BLS has emerged as an efficient and scalable machine learning framework, achieving rapid model training and high predictive performance by expanding network width rather than depth. Its modular structure is particularly well-suited for integrating heterogeneous inputs, such as spectral signatures and physicochemical parameters. In this work, we propose a PI BLS-Net architecture for chromium speciation, which incorporates redox chemistry principles to enhance both the prediction accuracy and physical interpretability of Cr(III) and Cr(VI) quantification.
2.4.1 Construction of the architecture and loss function. To embed physicochemical principles within the BLS framework, we introduced specific modifications to the model architecture and developed a composite loss function reflecting both data-driven and physical constraints. The key components are as follows:

(1) Input layer: the model accepts a concatenated feature vector comprising: (i) spectral features extracted from preprocessed vis-NIR and XRF spectra; (ii) fundamental physicochemical parameters, including pH and redox potential (Eh); and (iii) optional physicochemical transformations, such as proton activity [H+] = 10^(−pH) or other nonlinear terms reflecting the electrochemical environment.

(2) Feature mapping layer (FML): this layer generates primary feature nodes via two parallel mechanisms:

(a) Spectral feature nodes: nodes produced by applying random linear projections to the spectral features, consistent with standard BLS, to capture complex spectral patterns.

(b) Physicochemically sensitive nodes: nodes generated by applying nonlinear basis functions (sigmoid, tanh, polynomial combinations such as pH × Eh) to physicochemical inputs, explicitly modeling anticipated nonlinear responses associated with chemical equilibria and kinetics governed by pH and Eh.

(3) Enhancement layer (EL): to capture higher-order interactions and complex nonlinearities, the enhancement layer generates additional nodes based on the feature nodes. The activation functions within this layer are modulated by physicochemical context—for example, incorporating gating functions dependent on the difference between measured Eh and the theoretical pH-dependent redox threshold (fRedox(pH)), thus approximating the redox transition boundaries of the Cr(III)/Cr(VI) system under varying pH conditions.

(4) Output layer: the final layer consists of output neurons directly corresponding to the predicted concentrations of Cr(III) and Cr(VI).
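The layer structure listed above can be sketched as a small NumPy model. This is a rough illustration under several assumptions, none of which come from the study: the node counts, the particular basis functions, the fixed random seed, the user-supplied threshold function `f_redox`, and the ridge-regression solution for the output weights (which fits only a squared-error objective and does not include any physics penalty).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def build_nodes(spectral, ph, eh, f_redox, n_feature=16, n_enhance=24, seed=0):
    """Build the feature and enhancement node matrix.

    spectral: (N, d) spectral features; ph, eh: (N,) arrays.
    f_redox:  callable giving a pH-dependent Eh threshold (hypothetical).
    Reusing the same seed reproduces the same random weights.
    """
    rng = np.random.default_rng(seed)
    spectral = np.asarray(spectral, dtype=float)
    ph = np.asarray(ph, dtype=float)
    eh = np.asarray(eh, dtype=float)

    # (a) spectral feature nodes: random linear projections
    w_spec = rng.standard_normal((spectral.shape[1], n_feature))
    z_spec = spectral @ w_spec
    # (b) physicochemically sensitive nodes: nonlinear bases of pH and Eh
    z_phys = np.column_stack([sigmoid(ph), np.tanh(eh), ph * eh])
    z = np.hstack([z_spec, z_phys])

    # Enhancement nodes, gated by the distance of Eh from the redox threshold
    w_enh = rng.standard_normal((z.shape[1], n_enhance))
    gate = sigmoid(eh - f_redox(ph))[:, None]
    h = np.tanh(z @ w_enh) * gate
    return np.hstack([z, h])

def fit_output_weights(nodes, targets, ridge=1e-3):
    """Ridge-regression output weights mapping nodes to [Cr(III), Cr(VI)]."""
    gram = nodes.T @ nodes + ridge * np.eye(nodes.shape[1])
    return np.linalg.solve(gram, nodes.T @ targets)
```

Predictions are then `build_nodes(...) @ weights`, with one column per chromium species.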

Physics-constrained loss function: model training is guided by a composite loss function that integrates data fidelity with adherence to fundamental chemical laws:

 
$$L_{\mathrm{Loss}} = \alpha L_{\mathrm{MSE}} + \beta L_{\mathrm{Redox}} + \gamma L_{\mathrm{Mass}} \tag{5}$$
where each term is defined as follows:

(1) LMSE is the mean squared error between the predicted and experimentally measured concentrations:

 
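$$L_{\mathrm{MSE}} = \frac{1}{N}\sum_{i=1}^{N} \left\lVert \mathrm{Cr}_{\mathrm{pred}}(i) - \mathrm{Cr}_{\mathrm{actual}}(i) \right\rVert^{2} \tag{6}$$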
where N is the total number of samples in the batch, Crpred(i) is the vector of predicted concentrations [Cr(III), Cr(VI)] for the i-th sample, and Cractual(i) is the corresponding vector of experimentally measured concentrations.

(2) LRedox is a penalty term promoting consistency between model predictions and the Nernst equation for the Cr(VI)/Cr(III) redox couple. The Nernst equation relates the measured Eh to the standard electrode potential E0Cr, the activities (approximated by concentrations) of Cr(III) and Cr(VI), and the solution pH:

 
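$$Eh_{\mathrm{pred}} = E^{0}_{\mathrm{Cr}} - \frac{RT}{nF}\ln\!\left(\frac{a_{\mathrm{Cr(III)}}}{a_{\mathrm{Cr(VI)}}}\right) + f_{\mathrm{Nernst}}(\mathrm{pH}) \tag{7}$$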
where Ehpred is the redox potential predicted from the model's outputs, E0Cr is the standard electrode potential for the Cr(VI)/Cr(III) redox couple, R is the universal gas constant, T is the absolute temperature, n is the number of electrons transferred (n = 3 for the Cr(VI)/Cr(III) couple), F is the Faraday constant, aCr(III) and aCr(VI) are the activities of the Cr(III) and Cr(VI) species (approximated by the model's predicted concentrations), and fNernst(pH) is a function reflecting the reaction's pH dependency (e.g., a term proportional to pH). The LRedox term then penalizes the deviation between the measured Eh and the thermodynamically predicted Ehpred.

(3) LMass is a penalty enforcing mass conservation, applicable when the total chromium concentration Crtotal is known:

 
$$L_{\mathrm{Mass}} = \left[\max\left(0,\ \mathrm{Cr(III)}_{\mathrm{pred}} + \mathrm{Cr(VI)}_{\mathrm{pred}} - \mathrm{Cr}_{\mathrm{total}}\right)\right]^{p} \tag{8}$$
where Cr(III)pred and Cr(VI)pred are the predicted concentrations of the respective species for a given sample, Crtotal is the known total chromium concentration for that sample, and p is an exponent, typically set to 2 to create a squared penalty.

Here, α, β, and γ are non-negative hyperparameters weighting the contribution of each loss term and are carefully tuned to balance model fit, physicochemical consistency, and mass balance. A schematic overview of the proposed architecture is provided in Fig. 4.


Fig. 4 Schematic diagram of the PI-BLS-Net framework for chromium speciation prediction.
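The composite loss of eqn (5)–(8) can be sketched as below. Several choices are assumptions made for illustration and are not stated in the text: the redox term is taken as the mean squared deviation between measured and Nernst-predicted Eh, the per-sample mass penalty is averaged over the batch, Eh is expressed in volts, activities are replaced by predicted concentrations clipped to a small positive value, and the standard potential `e0` and the pH function `f_nernst` are supplied by the caller.

```python
import numpy as np

R_GAS = 8.314462618    # universal gas constant, J mol^-1 K^-1
FARADAY = 96485.33212  # Faraday constant, C mol^-1

def nernst_eh(cr3, cr6, ph, e0, f_nernst, temp_k=298.15, n=3):
    """Eh (V) implied by predicted Cr(III) and Cr(VI), with an additive pH term."""
    return e0 - (R_GAS * temp_k / (n * FARADAY)) * np.log(cr3 / cr6) + f_nernst(ph)

def composite_loss(pred, actual, eh_meas, ph, cr_total, e0, f_nernst,
                   alpha=1.0, beta=1.0, gamma=1.0, p=2, eps=1e-12):
    """alpha * L_MSE + beta * L_Redox + gamma * L_Mass for one batch.

    pred, actual: (N, 2) arrays with columns [Cr(III), Cr(VI)].
    eh_meas, ph, cr_total: (N,) arrays.
    """
    pred = np.clip(np.asarray(pred, dtype=float), eps, None)  # keep log finite
    actual = np.asarray(actual, dtype=float)

    l_mse = np.mean(np.sum((pred - actual) ** 2, axis=1))

    eh_pred = nernst_eh(pred[:, 0], pred[:, 1], ph, e0, f_nernst)
    l_redox = np.mean((np.asarray(eh_meas, dtype=float) - eh_pred) ** 2)

    # Penalize only predictions whose species sum exceeds total Cr.
    excess = np.maximum(0.0, pred[:, 0] + pred[:, 1] - np.asarray(cr_total, dtype=float))
    l_mass = np.mean(excess ** p)

    return alpha * l_mse + beta * l_redox + gamma * l_mass
```

Because the mass term is one-sided, predictions that sum to less than the total chromium incur no penalty, in line with the max(0, ·) form of the mass-balance term.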
2.4.2 Training strategy and hyperparameter tuning. During model training, the weights α, β, and γ are dynamically updated using adaptive regularization based on local gradient information. Regions of the feature space with greater Cr(III)/Cr(VI) imbalance (i.e., stronger violation of redox constraints) receive higher physical penalties. This strategy encourages the model to converge towards chemically reasonable solutions without sacrificing data-driven learning capability. The overall training procedure is summarized in Algorithm S2.

2.5 Evaluation metrics and statistical analysis

Given that the primary objective of this study is binary classification, distinguishing between general waste and hazardous waste, the performance of all classification models was evaluated using the following standard metrics:

(1) Accuracy:

 
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{9}$$

(2) Precision:

 
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{10}$$

(3) Recall:

 
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{11}$$

(4) F1-score:

 
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{12}$$
where TP denotes the number of true positives, TN true negatives, FP false positives, and FN false negatives.
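The four metrics can be computed directly from the confusion counts. The sketch below treats "hazardous" as the positive class and returns 0.0 when a denominator is zero; both choices, and the function name, are assumptions made for the example.

```python
def classification_metrics(y_true, y_pred, positive="hazardous"):
    """Accuracy, precision, recall and F1 from paired label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```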

In this study, “hazardous waste” is defined as samples with TCLP Cr(VI) concentrations exceeding 5 mg L−1, or TCLP total Cr concentrations exceeding 15 mg L−1; all other samples were classified as “general waste” in accordance with the national hazardous waste classification standard. All evaluation metrics were calculated based on the independent validation set. Statistical analysis and model comparisons were performed using Python, and the significance of differences in performance metrics among models was assessed using paired t-tests and Tukey's Honestly Significant Difference (HSD) post hoc tests.

In addition to the above evaluation metrics, we further compared the Cr(III)/Cr(VI) ratios predicted by our machine learning models with those calculated by the PHREEQC geochemical model under identical pH, Eh, and Cr concentration conditions. PHREEQC simulations were performed for both the idealized Cr–H2O system and for multi-component systems representative of the studied tailings. This comparison serves to validate the consistency and reliability of the proposed method against well-established theoretical models.

3 Results and discussion

3.1 Physicochemical controls on chromium speciation in tailings

Characterization of the rare earth chromite tailings samples revealed marked heterogeneity in their physicochemical profiles and chromium speciation. The pH values across the 218 samples ranged from 3.0 to 7.1 and were comparatively stable, with a coefficient of variation (CV) of 17.88%. In contrast, the redox potential (Eh) fluctuated substantially, ranging from 19 mV to 92 mV with a CV of 38.12%. Variability was greatest for Cr(VI), whose concentrations spanned a wide range (16–190 mg kg−1) with a CV of 51.15%, posing a significant challenge for accurate prediction. Cr(III), by comparison, exhibited much lower relative variability (CV = 13.79%).
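For reference, the coefficient of variation quoted above is the standard deviation expressed as a percentage of the mean. The sketch below assumes the sample (n − 1) standard deviation, which the text does not specify; the function name and the example values are illustrative.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: 100 * sample standard deviation / mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical Eh readings (mV), not taken from the dataset.
cv_example = coefficient_of_variation([10.0, 12.0, 14.0])
```

Because the CV is dimensionless, it allows the spread of pH, Eh, and the chromium concentrations to be compared directly despite their different units.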

A Redundancy Analysis (RDA) was performed to statistically investigate the relationships between these physicochemical parameters and chromium speciation (see Fig. S4, SI). The analysis confirmed strong positive correlations between Eh and the concentrations of both Cr(III) and Cr(VI), and a notable inverse relationship between pH and Cr(VI) prevalence, which aligns with established electrochemical principles governing Cr(III)/Cr(VI) redox equilibria. The pronounced variability of Cr(VI), coupled with its clear statistical dependence on fluctuations in pH and Eh, highlights the inherent limitations of predictive models that rely solely on spectroscopic data. These findings strongly support the necessity of incorporating key physicochemical parameters into the modeling framework.

To visually validate this core hypothesis, Fig. 5 presents a comprehensive Eh–pH diagram that contextualizes our measured field data against the theoretical predictions from the established geochemical model, PHREEQC. The primary purpose of this diagram is to demonstrate that the chromium speciation in our field samples is governed by the same fundamental physicochemical principles described by thermodynamic theory.


Fig. 5 Eh–pH diagram comparing PHREEQC geochemical model predictions with measured data for chromium speciation. The colored contour background represents the theoretical Cr(VI) percentage distribution simulated by the PHREEQC software (version 3.8.6), with the bold solid line marking the 50% Cr(VI)/Cr(III) thermodynamic equilibrium boundary. Overlaid circular markers represent the 218 measured tailings samples, where the fill color of each marker corresponds to its empirically measured Cr(VI) percentage. The dashed line represents an empirical 50% boundary, determined by fitting a logistic regression model to the measured data to best separate samples with Cr(VI) content above and below the 50% threshold.

The analysis reveals a strong consistency between theory and observation. The spatial distribution of our measured samples (circular markers) aligns remarkably well with the thermodynamic landscape predicted by PHREEQC (background contours). This is evidenced by two key observations: first, the measured Cr(VI) percentages (marker colors) closely mirror the theoretical trends, with higher values predominantly appearing in the predicted Cr(VI)-dominant region (upper-left). Second, the empirically derived 50% boundary from our data (dashed line) exhibits a negative slope that is strikingly similar to the theoretical 50% equilibrium line (solid line).
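The negative slope of the theoretical boundary can be rationalized with a simplified Nernst relation. The derivation below is an illustrative sketch only: it assumes that the Cr(VI)/Cr(III) couple is represented by HCrO4−/Cr3+ in an idealized Cr–H2O system at 25 °C, whereas the PHREEQC simulations account for additional species (e.g., CrO4^2− and hydrolyzed Cr(III) forms) that change the slope across the pH range.

```latex
\begin{align}
\mathrm{HCrO_4^-} + 7\,\mathrm{H^+} + 3\,e^- &\rightleftharpoons \mathrm{Cr^{3+}} + 4\,\mathrm{H_2O} \\
E_h &= E^0 - \frac{RT}{3F}\,\ln\frac{a_{\mathrm{Cr^{3+}}}}{a_{\mathrm{HCrO_4^-}}\,a_{\mathrm{H^+}}^{7}} \\
    &= E^0 - \frac{7}{3}\,\frac{2.303\,RT}{F}\,\mathrm{pH}
       + \frac{2.303\,RT}{3F}\,\log_{10}\frac{a_{\mathrm{HCrO_4^-}}}{a_{\mathrm{Cr^{3+}}}} \\
E_h^{50\%} &\approx E^0 - 0.138\,\mathrm{pH}\ \ (\mathrm{V},\ 25\,^{\circ}\mathrm{C},\ a_{\mathrm{HCrO_4^-}} = a_{\mathrm{Cr^{3+}}})
\end{align}
```

Under these assumptions the 50% boundary falls by roughly 138 mV per pH unit, since 2.303RT/F ≈ 0.0592 V at 25 °C and seven protons are consumed per three electrons. This is why Cr(VI) is favored toward the high-Eh side of the diagram and why both boundaries slope downward with increasing pH.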

This convergence of evidence from three distinct sources—the first-principles geochemical model, the distribution of field measurements, and the data-driven statistical fit—provides a robust demonstration that Eh and pH are the primary physicochemical controls in this system. This finding is not merely descriptive; it forms the foundational rationale for our physics-informed modeling strategy. By confirming that these physical laws are active and observable in our dataset, we justify the explicit inclusion of these parameters and their governing relationships as constraints within our PI-BLS-Net framework, a central theme evaluated in the subsequent sections.

3.2 Model performance and ablation study

To systematically evaluate our proposed framework and quantify the contribution of each key component, we first conducted an ablation study. By incrementally integrating modules—from spectral-to-image transformation to the final physics-informed layer—we assessed their impact on classification performance using the independent validation set (Region B1, N = 64). As summarized in Table 1, each component provides a clear and positive contribution to the model's generalization capability.
Table 1 Incremental performance on the independent validation set from the ablation study^a

| Model configuration | Accuracy (%) | F1-score (%) | AUC |
|---|---|---|---|
| Base BLS (1D spectra) | 71.9 | 71.1 | 0.77 |
| + GAF-vis-NIR | 75.0 | 74.5 | 0.81 |
| + GAF-fusion (vis-NIR & XRF) | 81.3 | 80.8 | 0.87 |
| + PCANet | 85.9 | 85.3 | 0.92 |
| + physics-informed (PI) module | 89.5 | 89.0 | 0.95 |

^a Each row builds upon the previous one. The final row represents the complete proposed model (GAF-fusion + PCANet + PI). Metrics are for binary classification on the independent validation set.


The ablation study in Table 1 shows a stepwise contribution from each component. Starting from a baseline BLS model using 1D spectra (accuracy 71.9%), converting the data to 2D GAF images and then fusing the vis-NIR and XRF modalities raised accuracy to 81.3%, confirming the synergistic advantage of multimodal information. Introducing the PCANet for hierarchical feature learning, with hyperparameters optimized via Bayesian optimization as detailed in Appendix A.2 in the SI, provided a further gain to 85.9%. Finally, integrating the physics-informed (PI) module achieved the highest performance, adding a further 3.6 percentage points of accuracy (89.5%). This incremental analysis supports the effectiveness of the proposed composite framework, with the PI component being particularly important for achieving the top performance.
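The per-module gains can be recomputed directly from the accuracy column of Table 1. The short Python sketch below does only that arithmetic; the variable and function names are illustrative.

```python
# Accuracy (%) per configuration, copied from Table 1.
ablation_accuracy = [
    ("Base BLS (1D spectra)", 71.9),
    ("+ GAF-vis-NIR", 75.0),
    ("+ GAF-fusion (vis-NIR & XRF)", 81.3),
    ("+ PCANet", 85.9),
    ("+ physics-informed (PI) module", 89.5),
]


def incremental_gains(rows):
    """Return (configuration, gain in percentage points) for each added module."""
    gains = []
    for (_, previous), (name, current) in zip(rows, rows[1:]):
        gains.append((name, round(current - previous, 1)))
    return gains


gains = incremental_gains(ablation_accuracy)
total_gain = round(ablation_accuracy[-1][1] - ablation_accuracy[0][1], 1)
```

Expressing the differences in percentage points rather than relative percent keeps them directly comparable across rows.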

Building on these findings, a comprehensive comparison of various model configurations was performed, with results detailed in Table 2. These results were obtained under identical conditions on the independent validation set and corroborate the findings from the ablation study.

Table 2 Performance metrics of different model configurations on the independent validation set (Region B1, N = 64)^a

| Config. | Model description | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) | AUC |
|---|---|---|---|---|---|---|
| 1 | Raw spectra + BLS | 70.3 | 71.0 | 69.8 | 70.4 | 0.76 |
| 2 | CSM-NIR + BLS | 73.4 | 74.1 | 72.5 | 73.3 | 0.79 |
| 3 | GAF-NIR + BLS | 75.0 | 75.8 | 74.2 | 75.0 | 0.81 |
| 4 | CSM-fusion + BLS | 79.7 | 80.2 | 78.9 | 79.5 | 0.85 |
| 5 | GAF-fusion + BLS | 81.3 | 81.9 | 80.5 | 81.2 | 0.87 |
| 6 | CSM-NIR + XRF + PCANet + BLS | 84.4 | 85.0 | 83.5 | 84.2 | 0.90 |
| 7 | GAF-fusion + PCANet + BLS | 85.9 | 86.3 | 85.1 | 85.7 | 0.92 |
| 8 | GAF-fusion + PCANet + PI-BLS | 89.5 | 87.1 | 91.0 | 89.0 | 0.95 |
| 9 | CSM-fusion + PCANet + PI-BLS | 88.1 | 86.5 | 89.2 | 87.8 | 0.94 |

^a Config. 1 is the baseline model. Configs. 2 and 3 are unimodal spectra-image transformations. Configs. 4 and 5 are multimodal fusion. Configs. 6 and 7 add PCANet feature extraction. Configs. 8 and 9 represent the complete PI-BLS framework proposed in this study. Precision, recall, and F1-score are calculated for the binary classification task, treating ‘hazardous waste’ as the positive class.


The data in Table 2 confirm that models incorporating advanced feature representations consistently outperform the baseline. While both GAF and CSM transformations proved effective, GAF-based models yielded slightly better results than their CSM counterparts in every pairing (e.g., Config. 3 vs. 2, 5 vs. 4, and 8 vs. 9). The largest overall improvement is observed with the fully integrated frameworks (Configs. 8 and 9), which combine multimodal fusion, PCANet feature extraction, and the PI module. The best model, GAF-fusion + PCANet + PI-BLS (Config. 8), achieved the highest values across all metrics, with an accuracy of 89.5%, a recall of 91.0%, and an AUC of 0.95. The high recall is particularly important for this application, as it reflects the model's ability to correctly identify hazardous waste samples, thereby minimizing the critical risk of false negatives in environmental risk assessment. The AUC value further indicates the strong discriminative power of the proposed PI-BLS model on unseen data.

3.3 Chemical consistency and interpretability validation

While the ablation study demonstrated the significant performance benefits of incorporating the physics-informed (PI) module, it is essential to validate that this improvement stems from enhanced chemical consistency rather than mere statistical coincidence. To this end, we conducted a rigorous validation by comparing the classification decisions of our PI-BLS-Net against the theoretical predictions from the PHREEQC geochemical model, which serves as an independent, thermodynamics-based benchmark. This comparative analysis, presented as a series of confusion matrices in Fig. S5 (SI), allows for a deeper assessment of our model's interpretability and reliability.

First, to assess the chemical rationality of our model, we compared the PI-BLS-Net predictions directly against the theoretical classifications derived from PHREEQC simulations (Fig. S5a). A high degree of consistency was observed: the data-driven model and the theoretical model assigned the same class to 96.8% of samples. This strong agreement indicates that our physics-informed framework learned to make classification decisions that are consistent with the fundamental thermodynamic principles governing chromium leaching, thereby enhancing its interpretability and trustworthiness. It provides strong evidence that the PI module is not just a black box but actively guides the model towards physically plausible solutions.

Next, to establish a performance baseline for the theoretical model itself, we evaluated the PHREEQC classifications against the laboratory-measured actual classifications (Fig. S5b). The PHREEQC model achieved a standalone accuracy of 86.2%, demonstrating its inherent capability to predict the hazard class based on thermodynamic equilibrium. However, the 25 false positives (predicting “hazardous” for samples that were measured as “general”) suggest that an ideal equilibrium model may overestimate the leaching potential in some real-world scenarios. This discrepancy is likely due to factors such as kinetic limitations, complex mineral–water interactions, or matrix effects not fully captured by the simplified thermodynamic simulation.

Finally, the practical predictive performance of our PI-BLS-Net is visualized in Fig. S5c, which compares its predictions against the measured actual classifications. Our model achieved an accuracy of 85.8% on the entire dataset, which is highly competitive with the theoretical PHREEQC model. Crucially, our PI-BLS-Net model exhibits a more balanced error profile. The ability of the PI-BLS-Net to achieve a predictive accuracy comparable to a first-principles geochemical model, while maintaining strong chemical consistency (as shown in Fig. S5a), validates our hybrid physics-informed machine learning approach. It suggests that by integrating rich, multi-modal spectral data, the PI-BLS-Net effectively captures nuances from the sample matrix that complement the explicit physical constraints, leading to a robust and practical classification tool for environmental risk assessment.
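The three pairwise comparisons in this section (model vs. PHREEQC, PHREEQC vs. measured, model vs. measured) all reduce to tabulating two label vectors against each other. The Python sketch below shows one way to do this; the function names and the toy labels are hypothetical and do not reproduce the data behind Fig. S5.

```python
from collections import Counter


def confusion_counts(reference, predicted, positive="hazardous"):
    """Tabulate TP, TN, FP, FN for `predicted` against `reference` labels."""
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have equal length")
    counts = Counter()
    for ref, pred in zip(reference, predicted):
        if ref == positive:
            counts["TP" if pred == positive else "FN"] += 1
        else:
            counts["FP" if pred == positive else "TN"] += 1
    return {key: counts[key] for key in ("TP", "TN", "FP", "FN")}


def agreement_rate(labels_a, labels_b):
    """Fraction of samples to which two classifiers assign the same class."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have equal length")
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


# Toy labels for five samples (illustrative only).
measured = ["hazardous", "hazardous", "general", "general", "general"]
phreeqc = ["hazardous", "hazardous", "hazardous", "general", "general"]
model = ["hazardous", "general", "general", "general", "general"]

phreeqc_vs_measured = confusion_counts(measured, phreeqc)
model_vs_phreeqc = agreement_rate(model, phreeqc)
```

In the toy data the equilibrium model produces one false positive, mirroring the kind of overestimation of leaching potential discussed for PHREEQC above, while the agreement rate plays the role of the model-to-theory consistency figure.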

3.4 Comparison with state-of-the-art methods

To contextualize the performance of our proposed PI-BLS-Net, we conducted a comparative analysis against several state-of-the-art methods commonly employed for spectroscopic data analysis and heavy metal prediction. The comparison, summarized in Table 3, includes both traditional machine learning models (Random Forest and XGBoost) and deep learning architectures (CNN-LSTM and traditional Physics-Informed Neural Networks – PINNs). All models were evaluated under identical conditions for the classification task on our dataset.
Table 3 Performance comparison with existing methods

| Method | Accuracy (%) | F1-score | Time (min) | Memory (GB) |
|---|---|---|---|---|
| Traditional PINN (ref. 12) | 82.3 | 0.81 | 45.6 | 4.2 |
| CNN-LSTM (ref. 13) | 80.1 | 0.79 | 38.2 | 3.8 |
| Random Forest (ref. 14) | 78.5 | 0.77 | 12.4 | 2.1 |
| XGBoost (ref. 15) | 81.2 | 0.80 | 15.7 | 2.5 |
| PI-BLS-Net (ours) | 89.5 | 0.89 | 20.3 | 2.8 |


As shown in Table 3, our PI-BLS-Net achieves the highest accuracy and F1-score among the compared methods while maintaining reasonable computational efficiency. Compared to classical machine learning approaches such as Random Forest and XGBoost, our model achieves higher accuracy (89.5% versus 78.5% and 81.2%) and F1-score (0.89 versus 0.77 and 0.80), highlighting the benefits of our advanced feature extraction pipeline and physics-informed architecture.

When compared with deep learning models, the advantages of our approach become even more apparent. The PI-BLS-Net surpasses the standard CNN-LSTM hybrid model by 9.4 percentage points in accuracy (89.5% versus 80.1%), likely due to the latter's inability to effectively leverage domain knowledge. More importantly, our model also outperforms a traditional PINN framework by 7.2 percentage points in accuracy (89.5% versus 82.3%). This improvement can be attributed to the broad learning strategy, which avoids the vanishing/exploding gradient problems common in deep PINNs, and to our domain-specific implementation of physical constraints within the efficient BLS architecture.

In terms of computational resources, the PI-BLS-Net offers a compelling balance. It is significantly faster and requires less memory than deep learning architectures such as PINN and CNN-LSTM, while being only moderately more demanding than the lower-performing classical models. These results collectively position our PI-BLS-Net as a highly effective and efficient solution for rapid and accurate chromium speciation analysis in complex environmental samples.

3.5 Limitations and prospects

Although the present study demonstrates the effectiveness of the proposed approach, several limitations should be acknowledged. First, the dataset used in this work was limited in both size and diversity, which may affect the generalizability of the results to broader or more complex real-world scenarios. Second, while the Bayesian optimization strategy improved model performance, the computational cost remains non-negligible, especially for large-scale datasets or real-time applications. Additionally, the interpretability of the model, though enhanced through feature selection and visualization techniques, could be further improved to facilitate practical deployment and user understanding.

Looking forward, future research could address these limitations by employing larger and more diverse datasets to validate the robustness and scalability of the methodology. Furthermore, integrating more advanced optimization algorithms or parallel computing techniques may help to reduce computational time. The exploration of model interpretability, possibly through explainable AI methods, also represents an important direction for subsequent work. Finally, application of the proposed framework to other domains or tasks could further demonstrate its versatility and practical value.

4 Conclusion

In this study, we proposed the PI-BLS-Net method designed to enhance both the accuracy and interpretability of chromium speciation prediction by explicitly integrating physicochemical parameters (pH and Eh) into the modeling process. Comprehensive analysis of tailings samples revealed that Cr(VI) concentrations are significantly influenced by pH and Eh, providing a scientific basis for the architecture of the proposed model.

By employing GAF and CSM methods, one-dimensional spectral data were effectively transformed into two-dimensional image representations, from which hierarchical features were extracted using the PCANet. Experimental results demonstrated that the PI-BLS-Net achieves outstanding performance in predicting Cr(III) and Cr(VI), particularly in the presence of complex nonlinear relationships.

This approach not only improves prediction accuracy but also enhances the physical interpretability of the model, offering a novel solution for environmental monitoring and pollution management.

Ethical statement

This work does not involve any studies with human participants or animals performed by any of the authors. All data and materials used in this study were obtained and analyzed in accordance with relevant guidelines and regulations. The authors declare that there are no conflicts of interest regarding the publication of this paper.

Author contributions

Qingya Wang: writing – original draft, software, methodology, formal analysis, data curation, conceptualization. Fusheng Li: validation, supervision, investigation, formal analysis, conceptualization. Wanqi Yang: investigation, formal analysis.: writing – review and editing, validation. Haoyu Zou: writing – review and editing, validation.

Conflicts of interest

The authors declare that they have no conflicts of interest regarding the publication of this work.

Data availability

The data supporting this article have been included as part of the supplementary information (SI). Supplementary information is available. See DOI: https://doi.org/10.1039/d5ja00383k.

Acknowledgements

This work was supported by the Educational Commission of Jiangxi Province of China (GJJ2400609), Jiujiang Basic Research Program Natural Science Foundation (25JJ002), and Jiangxi Provincial Natural Science Foundation (20212BAB211001, 2024BAB20147, and 20252BAC240298). We would like to thank the editors and reviewers for their valuable comments and suggestions.

References

  1. W. L. Lindsay and W. A. Norvell, Environ. Health Perspect., 2001, 92, 25–40.
  2. M. Tuzen and M. Soylak, Talanta, 2007, 71, 681–686.
  3. W. Qingya, F. Li, X. Jiang, J. Hao, Y. Zhao, S. Wu, Y. Cai and W. Huang, Chemom. Intell. Lab. Syst., 2022, 226, 104578–104586.
  4. L. Zhu, W. Gu, T. Song, F. Qiu and Q. Wang, Sci. Rep., 2022, 12, 22365.
  5. Q. Wang, L. Tao, F. Li, Z. Wu, Y. Cai and S. Lyu, J. Anal. At. Spectrom., 2024, 2207–2219.
  6. S. Lyu, F. Li, W. Yang, Q. Zhang and Q. Wang, Talanta, 2025, 284, 127213.
  7. X. Jiang, F. Li, Q. Wang, J. Luo, J. Hao and M. Xu, Appl. Opt., 2021, 60, 5707–5715.
  8. Q. Wang, H. Hua, L. Tao, Y. Liang, X. Deng and F. Yu, Sci. Rep., 2024, 14, 7777.
  9. Q. Wang, Z. Wu, H. Shao, Y. Qin, F. Yu and L. Tao, Appl. Opt., 2024, 63, 7362.
  10. J. Hao, F. Li, Q. Wang, X. Jiang, B. Yang and J. Cao, Nucl. Instrum. Methods Phys. Res., Sect. A, 2021, 1013, 165672.
  11. T. Pan, X. Dai, W. Wang, Y. Wang, H. Xiao and F. Liu, Anal. Chim. Acta, 2025, 1359, 344113.
  12. M. Raissi, P. Perdikaris and G. E. Karniadakis, J. Comput. Phys., 2019, 378, 686–707.
  13. Z. Wu, H. Zhang, J. Liu, W. Li and J. Wang, Water Res., 2020, 186, 116382.
  14. W. Guo, F. Yang, Y. Li and S. Wang, J. Hazard. Mater., 2021, 416, 125743.
  15. J. Wang, M. Liu, Y. Chen and W. Zhang, Environ. Pollut., 2023, 321, 120456.

This journal is © The Royal Society of Chemistry 2026