Semi-supervised graph learning for spatial mapping of heavy metal concentrations in smelter-adjacent soils using a mobile LIBS device
Abstract
Heavy metal concentrations in soils near smelters are critical indicators for assessing soil quality, ecological risks, and potential health threats. However, accurate monitoring remains challenging because of soil matrix complexity and the scarcity of labeled spectral data. This study presents a semi-supervised learning framework that combines a teacher–student model with GraphSAGE and uses intra-group consistency constraints to map laser-induced breakdown spectroscopy (LIBS) spectra to the concentrations of Cr, Cu, Cd, Pb, and Zn. Spectral data were preprocessed using Savitzky–Golay filtering, normalization, feature selection, and principal component analysis (PCA), and the resulting components served as inputs to the model. On the fully labeled dataset, the GraphSAGE-based framework outperformed conventional sequence models (LSTM and Transformer), achieving lower mean absolute percentage error (MAPE), generally reduced root mean squared error of prediction (RMSEP), and improved precision, reflected by a lower mean relative standard deviation (mean RSD) on the labeled test set. Integrating unlabeled samples through the semi-supervised strategy further improved model robustness and predictive stability, lowering MAPE to 5.23% (Cu), 2.73% (Cr), 5.82% (Pb), 4.90% (Zn), and 6.29% (Cd), with corresponding reductions in RMSEP and mean RSD. Finally, the optimized model was used to map heavy metal distributions across the study area. Concentrations were high near the smelter and peaked in downwind zones. These patterns are consistent with the atmospheric transport of smelter-derived particulates and indicate that it plays the dominant role in dispersion. The proposed method offers a practical tool for environmental monitoring and supports precision remediation strategies.
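For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how MAPE, RMSEP, and mean RSD are commonly computed. It is not the authors' code. The function names are hypothetical, and the mean RSD definition (the relative standard deviation of predictions over replicate spectra of one sample, averaged over samples) is an assumption, since the abstract does not give the exact formula.

```python
import math
from statistics import mean, stdev


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (reference values must be non-zero)."""
    return 100.0 * mean(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred))


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction, in the units of the concentrations."""
    return math.sqrt(mean((p - t) ** 2 for t, p in zip(y_true, y_pred)))


def mean_rsd(replicate_preds):
    """Mean relative standard deviation, in percent.

    Assumed definition: each inner list holds predictions for replicate
    spectra of one soil sample (at least two values); the per-sample RSD
    is the sample standard deviation divided by the mean, and the result
    is averaged over samples.
    """
    return mean(100.0 * stdev(group) / mean(group) for group in replicate_preds)


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    reference = [100.0, 200.0]
    predicted = [110.0, 190.0]
    print(mape(reference, predicted))
    print(rmsep(reference, predicted))
    print(mean_rsd([[9.0, 11.0], [20.0, 20.0]]))
```

Lower values are better for all three: MAPE and RMSEP measure accuracy against reference concentrations, while mean RSD measures the repeatability of predictions for the same sample.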
