Multimodal Prediction of Sludge Volume Index for Monitoring Sludge Settling Performance
Abstract
The sludge volume index (SVI) is a key parameter for evaluating sludge settling performance in wastewater treatment plants. Existing prediction methods rely on either process parameters or visual features alone and do not jointly model the macroscopic settling characteristics, which limits the accuracy of SVI prediction. To address this issue, we propose a multimodal framework for SVI prediction that integrates process data (e.g., chemical oxygen demand, pH) with macroscopic visual features of sludge settling (e.g., floc color, 30-minute sludge volume), both of which are critical for SVI estimation. First, a sludge settling visual dataset and a process-visual fusion database are constructed for SVI prediction scenarios. Second, an improved YOLO11 model is designed to reliably detect and localize the floc state and supernatant state in sludge settling images. Third, the output of the improved YOLO11 model is combined with K-means clustering and geometric feature analysis to quantify the color and settling ratio of the flocs. Finally, a stochastic configuration network is employed to fuse the multi-source data for SVI prediction. Experiments show that the improved YOLO11 achieves a mAP@0.5 of 94.8% on the sludge settling visual dataset with a 42.1% reduction in the number of parameters. The proposed multimodal model achieves lower prediction errors than single-modal models, with a root mean square error (RMSE) of 7.31 and a mean absolute error (MAE) of 4.40, and the contribution of visual features to SVI prediction accuracy reaches 63.7%. This study provides an efficient solution for real-time monitoring of SVI in wastewater treatment plants.
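As an illustration of why the 30-minute settled volume matters for SVI, the sketch below derives a settling ratio from two detected regions and applies the standard textbook definition of SVI (settled volume in mL/L divided by mixed liquor suspended solids in g/L). It is a minimal sketch under stated assumptions, not the model described in the abstract: the `Box` type, the function names, the pixel coordinates, and the MLSS value are all hypothetical, and the ratio assumes the floc and supernatant boxes stack vertically to cover the whole liquid column.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical axis-aligned bounding box in pixels (y grows downward)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def height(self) -> float:
        return self.y2 - self.y1


def settling_ratio(floc: Box, supernatant: Box) -> float:
    """Fraction of the liquid column occupied by settled sludge.

    Assumes the two boxes are stacked vertically and together span
    the full liquid column of the settling cylinder.
    """
    total = floc.height + supernatant.height
    if total <= 0:
        raise ValueError("boxes must have a positive combined height")
    return floc.height / total


def svi_ml_per_g(sv30_fraction: float, mlss_g_per_l: float) -> float:
    """Textbook SVI: settled volume after 30 min (mL/L) over MLSS (g/L)."""
    if mlss_g_per_l <= 0:
        raise ValueError("MLSS must be positive")
    return sv30_fraction * 1000.0 / mlss_g_per_l


# Illustrative values only (not taken from the paper's dataset).
supernatant_box = Box(x1=50, y1=100, x2=150, y2=400)  # clear layer on top
floc_box = Box(x1=50, y1=400, x2=150, y2=600)         # settled sludge below

ratio = settling_ratio(floc_box, supernatant_box)
svi = svi_ml_per_g(ratio, mlss_g_per_l=3.0)
print(f"settling ratio = {ratio:.2f}, SVI = {svi:.1f} mL/g")
```

The direct formula requires a laboratory MLSS measurement, which is the kind of step a data-driven predictor would aim to avoid by learning SVI from process and visual features instead.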