Prediction of disinfection by-product formation in drinking water via fluorescence spectroscopy†
Fluorescence spectroscopy shows promise as a tool for monitoring regulated disinfection by-products (DBPs) online in water treatment applications. Prediction of DBP formation via fluorescence spectroscopy was investigated using drinking water treatment plant (WTP) samples and experimental data from bench-scale advanced oxidation processes applied to a natural water matrix. L1-Regularized linear regression (lasso), boosted regression tree ensembles, principal components regression, supervised principal components, and fluorescence regional integration models were applied to data comprising instantaneous haloacetic acid (HAA) and trihalomethane (THM) concentrations and DBP formation potentials (HAAfp and THMfp) paired with fluorescence excitation–emission matrices. L1-Regularized linear regression yielded the lowest mean absolute error (MAE), assessed by cross-validation, on HAA and HAAfp data collected at the WTP (7.7 μg L−1, N = 22). Boosted regression tree ensemble predictions had the lowest MAE on WTP THM and THMfp data (13.5 μg L−1, N = 37). For HAAfp and THMfp data generated via bench-scale advanced oxidation processes (N = 60), L1-regularized linear regression and supervised principal components, respectively, exhibited the greatest prediction accuracy (MAE 14.9 and 9.5 μg L−1). Linear models based on either fluorescence regional integration or (unsupervised) principal components were consistently less accurate than the highest-performing methods for DBP prediction.
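As an illustration of the evaluation metric named above, the following is a minimal sketch of how a cross-validated MAE might be computed for an L1-regularized linear regression. It is not the authors' code or data: the predictors are synthetic random numbers standing in for fluorescence intensities, and the penalty `alpha`, the fold count `k`, and all function names (`fit_lasso`, `cv_mae`, and so on) are assumptions made for this example.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_lasso(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + alpha*||b||_1."""
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    intercept = sum(y) / n
    resid = [y[i] - intercept for i in range(n)]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        # Re-centre the (unpenalized) intercept on the current residuals.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with its partial residual.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * coef[j]
            new = soft_threshold(rho, alpha) / col_sq[j]
            delta = new - coef[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                coef[j] = new
    return intercept, coef

def predict(intercept, coef, row):
    return intercept + sum(c * v for c, v in zip(coef, row))

def cv_mae(X, y, alpha, k=5, seed=0):
    """Mean absolute error over held-out samples in k-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    abs_err = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        b0, b = fit_lasso([X[i] for i in train], [y[i] for i in train], alpha)
        abs_err += [abs(y[i] - predict(b0, b, X[i])) for i in fold]
    return sum(abs_err) / len(abs_err)

# Synthetic stand-in data (NOT measurements): 40 samples, 8 predictors,
# with the response depending on only two of them plus noise.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(40)]
y = [50 + 10 * r[0] - 6 * r[3] + rng.gauss(0, 1) for r in X]

mae = cv_mae(X, y, alpha=0.5)
# Naive reference: always predict the overall mean of y.
baseline = sum(abs(v - sum(y) / len(y)) for v in y) / len(y)
print(f"lasso CV MAE: {mae:.2f}; mean-only MAE: {baseline:.2f}")
```

Each sample is predicted only by a model fitted without it, so the reported MAE estimates out-of-sample error rather than goodness of fit. In practice the penalty would typically be tuned as well, for example by an inner cross-validation loop, which this sketch omits.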