Prediction of total organic carbon and E. coli in rivers within the Milwaukee River basin using machine learning methods
Abstract
Urban water undergoes physical and chemical changes due to contaminants from point and non-point sources, including organic matter pollution and fecal bacterial contamination. Machine learning (ML) algorithms are potential tools for surface water quality monitoring because they can find underlying patterns and non-linear relationships among water quality parameters that traditional or process-based water quality analysis cannot capture. In this study, several standalone ML models, namely artificial neural network (ANN), support vector machine (SVM), gradient boosting machine (GBM), and random forest (RF), and ensemble-hybrid models, namely RF-SVM, ANN-SVM, GBM-SVM, RF-ANN, GBM-ANN, and RF-GBM, were developed for predicting total organic carbon (TOC) and E. coli in the Milwaukee River system. The significance of the study is the first application of ensemble-hybrid models to TOC and bacterial contamination prediction, which provides a reliable and direct approach to complement existing monitoring techniques in the Milwaukee River system with satisfactory prediction accuracies. The ensemble-hybrid models for TOC prediction achieved R² values in the range 0.95–0.97. For E. coli, however, the physicochemical water quality parameters left much of the variation in the bacterial data unexplained, giving R² values in the range 0.29–0.42. The GBM-ANN hybrid model outperformed the others for both TOC and E. coli, with R² values of 0.97 and 0.42, respectively. By developing prediction models for E. coli, an attempt was made to explain the variability in the behavior of living microorganisms based on specific physicochemical parameters.
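As a rough illustration of the two ideas the abstract relies on, the sketch below computes the coefficient of determination (R²) and combines two models' predictions into one hybrid prediction. The abstract does not say how the ensemble-hybrid models were constructed, so simple averaging is an assumption made here for illustration only. The function names `r_squared` and `average_ensemble`, and all numeric values, are hypothetical and are not data or results from the study.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def average_ensemble(pred_a, pred_b):
    """Combine two models' predictions by simple averaging.

    Averaging is an assumed combination rule, not the study's method.
    """
    return [(a + b) / 2.0 for a, b in zip(pred_a, pred_b)]


# Toy values invented for illustration (not measurements from the study).
observed = [4.0, 6.0, 8.0, 10.0]   # e.g. a measured water quality target
model_a = [5.0, 6.0, 7.0, 11.0]    # hypothetical first model's predictions
model_b = [3.5, 6.5, 8.5, 9.5]     # hypothetical second model's predictions

hybrid = average_ensemble(model_a, model_b)

print("R^2 model A:", r_squared(observed, model_a))
print("R^2 model B:", r_squared(observed, model_b))
print("R^2 hybrid :", r_squared(observed, hybrid))
```

In this toy case the two models err in partly opposite directions, so the averaged prediction scores a higher R² than either model alone. That is one reason an ensemble can outperform its members, although it is not guaranteed in general.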
- This article is part of the themed collections: Machine learning and artificial neural networks: Celebrating the 2024 Nobel Prize in Physics, Topic Collection: Water Bodies and Topic Collection: Sensors, Detection and Monitoring