Enhancing source water quality predictions to improve treatment by integrating watershed data on water quality, river flow and rainfall into interpretable machine learning algorithms

Abstract

The quality of raw water used for municipal water treatment is affected by rainfall events, and by the resulting changes in river flow, both during and after those events. Recently, raw water quality parameters such as turbidity have been modeled and predicted with machine learning algorithms that use environmental, hydrological, and meteorological information as input variables. Our research aims to integrate upstream water quality with river flow and watershed rainfall data into interpretable machine learning algorithms to enhance raw water turbidity predictions. Such predictions would allow water utility operators to anticipate the adjustments required in water treatment processes. First, we estimated lag times between the upstream input variables (watershed rainfall, river flow and raw water turbidity) and the target output variable, downstream raw water turbidity. Then, we used an XGBoost technique to predict raw water turbidity from upstream water quality along with river flow and watershed rainfall data. Finally, the overall importance of every input variable was estimated using a SHAP (SHapley Additive exPlanations) strategy. Results showed that upstream raw water turbidity is the most important input variable, followed by river flow. The best performance metrics, together with visual inspection of the modeled time series, showed that integrating upstream raw water quality data enhances raw water turbidity predictions. These results could open possibilities for developing and implementing regional raw water quality modeling that can feed Weather-Event-Water-Treatment–Early-Warning-Systems (WEWT-EWS). Future research could extend raw water quality prediction horizons and include interannual data.
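The abstract does not state how the lag times were estimated. As a minimal sketch of one common approach, and purely as an assumption, the example below searches for the time shift that maximizes the Pearson correlation between an upstream series and a downstream series. The function names (`pearson`, `estimate_lag`), the `max_lag` parameter and the synthetic turbidity values are hypothetical and are not taken from the article.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)


def estimate_lag(upstream, downstream, max_lag):
    """Return (lag, correlation) for the shift, in time steps, at which
    the upstream series correlates best with the downstream series."""
    best_lag, best_r = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Pair upstream[t] with downstream[t + lag].
        x = upstream[: len(upstream) - lag]
        y = downstream[lag:]
        r = pearson(x, y)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


# Synthetic illustration only (not data from the study): the downstream
# series is the upstream series delayed by 3 time steps.
upstream_turbidity = [2, 2, 3, 9, 15, 8, 4, 3, 2, 2, 2, 6, 11, 5, 3, 2, 2, 2, 2, 2]
downstream_turbidity = [2, 2, 2] + upstream_turbidity[:-3]
lag, r = estimate_lag(upstream_turbidity, downstream_turbidity, max_lag=6)
```

In a workflow of this kind, each upstream input would be shifted by its estimated lag before being passed to the predictive model, so that every row pairs a downstream turbidity value with the upstream conditions that plausibly preceded it.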


Article information

Article type: Paper
Submitted: 03 Mar 2026
Accepted: 15 Jun 2026
First published: 15 Jun 2026
This article is Open Access
Creative Commons BY license

Environ. Sci.: Water Res. Technol., 2026, Accepted Manuscript

C. Ortiz-Lopez, C. Bouchard and M. J. Rodriguez, Environ. Sci.: Water Res. Technol., 2026, Accepted Manuscript, DOI: 10.1039/D6EW00235H

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
