Open Access Article
Luyu Guo a, Jing Zhang a, Yahan Cao a, Jiayu Zhang a, Zhengyang Li a, Xiaofei Chen b, Lei Ma *a and Xiaowei Liu *c
a Beijing Key Laboratory of Fuels Cleaning and Advanced Catalytic Emission Reduction Technology, College of New Materials and Chemical Engineering, Beijing Institute of Petrochemical Technology, Beijing 102617, China. E-mail: malei@bipt.edu.cn
b Chen Ping Laboratory of TIANS Engineering Technology Group Co., Ltd., Shijiazhuang, Hebei 050000, China
c Division of Physical Sciences and Engineering, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia. E-mail: xiaowei.liu@kaust.edu.sa
First published on 24th September 2025
The effective removal of toxic pollutants like m-cresol from wastewater remains challenging despite technological advancements. This study optimized total organic carbon (TOC) removal from m-cresol-contaminated wastewater using sodium percarbonate (SPC) oxidation through artificial neural network (ANN) and response surface methodology (RSM) modeling. TOC was selected as the optimization target due to its comprehensive representation of organic pollution levels. Six operational parameters were evaluated: initial pH, reaction time, SPC dosage, temperature, catalyst dosage, and initial m-cresol concentration. The ANN model demonstrated superior performance over RSM, achieving near-perfect R2 values with significant improvement in predictive accuracy. Under optimal ANN-derived conditions (pH 2.3, 35.7 min, 2.9 g L−1 SPC, 45.7 °C, 12.9 g L−1 catalyst, 75 mg L−1 m-cresol), maximum experimental TOC removal reached 67.8%, significantly exceeding RSM's 38.2%. These findings demonstrate ANN's superior capability to model complex, nonlinear relationships in advanced oxidation processes, providing a robust optimization framework for enhancing wastewater treatment efficiency.
Water impact: AI-optimized sodium percarbonate oxidation offers enhanced removal of toxic organic pollutants from industrial wastewater, addressing critical environmental contamination challenges. This method provides treatment facilities with significantly improved efficiency for protecting water quality, reducing chemical discharge into aquatic ecosystems, and advancing sustainable wastewater management practices.
In recent years, sodium percarbonate (SPC) has emerged as an effective oxidizing agent in water treatment. When dissolved in water, SPC decomposes to produce percarbonate and carbonate ions, the latter further breaking down to form oxygen and hydroxyl radicals (·OH), which enhance the oxidative process. Unlike traditional oxidizing agents, SPC overcomes the limitation of operating within a narrow acidic pH range due to the buffering effect of coexisting carbonate ions. For example, Guo et al.4 investigated the use of SPC in an ozone (O3)/SPC system to accelerate the degradation of sulfamethoxazole (SMX) in water. Their study showed that after 30 min of treatment, the O3/SPC system achieved substantial enhancement in eliminating total organic carbon (TOC) and chemical oxygen demand (COD) compared to ozone-only treatment. At the optimal SPC concentration of 0.2 g L−1, the SMX degradation rate improved by 16.4% relative to the O3-only system, with the kinetic constant increasing by 1.7-fold.
In parallel, artificial intelligence (AI) technologies have introduced new opportunities for in-depth exploration and intelligent analysis in water environment management, addressing complex problems in water treatment.5 Traditional methods, e.g., adsorption, membrane separation/filtration, precipitation, flotation, coagulation/flocculation, and aerobic and anaerobic processes, are often fitted using physical models. However, the inherent complexity of real-world water treatment reactions makes them difficult to capture accurately with simple physical models. Artificial neural networks (ANNs) have emerged as a powerful tool for resolving this issue, exhibiting improved predictive performance across a wide range of complex operational scenarios.6 By utilizing ANN modelling, the applicability of alternative models can be extended, enabling more effective removal of a variety of environmental pollutants.7 With the computational power of modern central processing units (CPUs), ANNs have made significant progress in chemical and environmental applications.5 These advancements range from aiding robots in discovering enhanced photocatalysts8 to serving as predictive tools in water resource management and environmental toxicology,9–11 effectively modelling and optimizing pollutant elimination processes to improve treatment efficiency and cost-effectiveness.12,13 Response surface methodology (RSM) analysis represents a hybrid framework that integrates experimental design, mathematical statistics, and parameter optimization.14 Its core concept involves approximating implicit functions by constructing explicit polynomial expressions.
Through the use of multidimensional quadratic regression equations, RSM quantifies interdependencies between factors and system responses in multifactor tests, effectively addressing multivariable problems.15–17 Compared with other conventional methods, RSM offers the advantage of requiring fewer experimental runs while maintaining strong interpretability, making it especially suitable in certain AOP optimization studies.
In this paper, we employ the RSM and the ANN models to evaluate the predictive accuracy of TOC elimination during SPC oxidation treatment of m-cresol polluted wastewater. SPC was selected as the oxidant, with six key experimental parameters: the initial pH of solution, reaction time, dosage of SPC, reaction temperature, dosage of catalyst and the initial m-cresol concentration. These parameters were selected because they represent the core operational factors governing SPC activation, ·OH generation, and pollutant degradation efficiency, while providing a balanced and practical basis for model optimization with strong literature precedent. The removal rate of TOC was taken as the primary performance indicator. An advanced SPC oxidation system was developed to study the treatment of m-cresol wastewater. To determine the optimized reaction conditions for wastewater treatment, the prediction accuracies of the two studied models were compared, aiming to improve the system's treatment efficiency.
| Experimental factor | Unit | Symbol | Coded level −1.565 | Coded level −1 | Coded level 0 | Coded level 1 | Coded level 1.565 |
|---|---|---|---|---|---|---|---|
| Initial pH of solution | | P | 1.0 | 3 | 6.5 | 10 | 12 |
| Reaction time | min | t | 4.3 | 10 | 20 | 30 | 35.7 |
| SPC dosage | g L−1 | S | 1.1 | 1.5 | 2.3 | 3.1 | 3.6 |
| Reaction temperature | °C | T | 14.4 | 20 | 30 | 40 | 45.7 |
| Catalyst dosage | g L−1 | c | 8.2 | 9 | 10.5 | 12 | 12.9 |
| m-Cresol concentration | mg L−1 | M | 35.9 | 50 | 75 | 100 | 114.1 |
![]() | (1) |
The variables in the experimental matrix were encoded using formula (2):
![]() | (2) |
A multivariate regression analysis was performed on the experimental dataset and design matrix associated with CCD, developing an encoded quadratic polynomial model aimed at predicting the response variable values:
![]() | (3) |
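For orientation, the coding step and the quadratic model described here are conventionally written as below. This is a sketch of the standard central composite design forms, not a reproduction of the article's own eqn (2) and (3), whose notation may differ. The symbols $x_i$, $X_i$, $X_{i,0}$, $\Delta X_i$, $\beta$ and $\varepsilon$ are assumed names.

```latex
% Coded variable: actual value X_i, centre-point value X_{i,0}, step size \Delta X_i
x_i = \frac{X_i - X_{i,0}}{\Delta X_i}

% Second-order polynomial in the k coded variables (k = 6 here)
Y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^{2}
    + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon
```

The coding form agrees with the factor levels tabulated above: for reaction time, $X_{i,0} = 20$ min and $\Delta X_i = 10$ min, so 35.7 min maps to $x \approx 1.57$, the axial level listed as 1.565.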
[…]:20 and 90:10 training-test ratios, depending on the size and characteristics of the dataset.22 Also, the implementation of cross-validation protocols during model training enhances reliability. This method, widely recognized as a robust way to select parameters in AI-based algorithms, helps build high-quality networks by dividing the dataset into multiple folds. The model is then trained multiple times, with each fold serving as the validation set in turn. The final performance evaluation is based on the average results across all folds, providing an overall assessment of the model's effectiveness.
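The fold rotation described above can be sketched as follows. This is a minimal illustration, not the authors' training code: a least-squares linear model stands in for the ANN, the data are synthetic, and the function names `k_fold_indices` and `cross_validate` are hypothetical.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def cross_validate(X, y, k=5, seed=0):
    """Return the validation RMSE of each fold for a linear stand-in model."""
    folds = k_fold_indices(len(y), k, seed)
    scores = []
    for i, val_idx in enumerate(folds):
        # Train on every fold except the i-th, which is held out for validation.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        A_val = np.column_stack([np.ones(len(val_idx)), X[val_idx]])
        residual = y[val_idx] - A_val @ coef
        scores.append(float(np.sqrt(np.mean(residual ** 2))))
    return scores

# Synthetic data only: 60 runs, 6 coded factors, arbitrary weights (assumptions).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(60, 6))
y = 36.5 + X @ np.array([1.0, -2.0, 0.5, 3.0, -1.5, 2.0]) + rng.normal(0.0, 0.5, 60)

scores = cross_validate(X, y, k=5)
mean_rmse = float(np.mean(scores))  # averaged across folds, as in the text
```

Averaging the per-fold scores gives a single figure that depends less on any one split than a single 80:20 or 90:10 partition would.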
![]() | (4) |
![]() | (5) |
![]() | (6) |
representing their arithmetic mean. Generally, R2 serves as a quantitative metric bounded between 0 and 1, where values approaching unity (R2 ≥ 0.7) demonstrate stronger agreement between theoretical predictions and empirical observations.23 A smaller RMSE indicates a better model fit,24 while an AARE% closer to 0 reflects a smaller discrepancy between model predictions and observed values,25 implying higher model prediction accuracy.
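The three metrics can be computed as in the sketch below. It assumes the conventional definitions of R2, RMSE and AARE%, which may differ in detail from the article's eqn (4) to (6). The function names are hypothetical.

```python
import numpy as np

def r_squared(y_exp, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_exp - y_pred) ** 2)
    ss_tot = np.sum((y_exp - y_exp.mean()) ** 2)  # spread around the arithmetic mean
    return float(1.0 - ss_res / ss_tot)

def rmse(y_exp, y_pred):
    """Root mean square error, in the same units as the response."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_exp - y_pred) ** 2)))

def aare_percent(y_exp, y_pred):
    """Average absolute relative error, expressed as a percentage."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(100.0 * np.mean(np.abs((y_exp - y_pred) / y_exp)))
```

Note that AARE% divides by the experimental value, so it is undefined for runs with zero measured removal.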
To compare the model prediction accuracies of ANN against RSM, the following formula is applied:
![]() | (7) |
ANOVA was carried out to statistically quantify variation in the response model and to validate the significance of the regression coefficients (Table S2). The adeq precision metric, a signal-to-noise ratio indicator, evaluates the model's ability to provide reliable signals relative to background noise. Herein, adeq precision was calculated as 16.62, indicating that the model provides sufficient signal for reliably predicting the TOC removal rate under specific conditions. An F-value of 32.8 suggests that the model makes a highly significant contribution to explaining the variance in the data, with only a 0.01% chance that an F-value this large would arise from random fluctuations. However, the F-value for the lack-of-fit term is 1.9, implying a 15.5% probability that it is inflated by random fluctuations and suggesting that non-significant terms are present in the model. Terms with a P-value < 0.1 are treated as statistically important, while highly significant factors (P < 0.0001) strongly influence the response variable. These findings indicate the need for optimization to refine the model and enhance predictive performance.
Model optimization typically involves excluding non-significant terms identified through ANOVA. In this process, variables with P-values greater than 0.05 are iteratively removed, while those with P < 0.0001 are regarded as key factors due to their strong influence on the response variable. Additionally, the adeq precision metric, with values greater than 4 considered acceptable, is used as a criterion for evaluating model reliability. In this context, single-variable terms represent the effects of individual factors, two-variable terms capture interactions between factors, and quadratic terms account for the nonlinear effects of individual factors. Streamlining the model by removing non-significant terms improves its focus on key influencing factors and enhances predictive accuracy. As shown in Table 2, the optimization retained the significant terms: the single-factor terms, the interaction term BE, and the quadratic terms A², B², and F². After optimization, the gap between R2 and Rpred2 fell from 35% to 16%, while adeq precision increased from 16.6 to 27. The improved signal-to-noise ratio and F-value confirmed the optimization's effectiveness, demonstrating enhanced model reliability and predictive accuracy. Based on the transformation of eqn (2), the final optimized formula for TOC removal is expressed as follows:
![]() | (8) |
| Source of variance (TOC removal rate) | Coefficient | F-value | P-value |
|---|---|---|---|
| Model | | 37.3 | <0.0001 |
| Intercept | 36.5 | | |
| A-P | −5.6 | 73.0 | <0.0001 |
| B-t | 8.6 | 170.8 | <0.0001 |
| C-M | −3.0 | 20.2 | <0.0001 |
| D-S | −1.6 | 6.2 | 0.0153 |
| E-T | 5.1 | 59.0 | 0.0003 |
| F-c | 2.5 | 14.1 | <0.0001 |
| BE | 1.4 | 4.2 | <0.0001 |
| A² | −4.3 | 9.5 | 0.0028 |
| B² | −3.9 | 7.8 | 0.0068 |
| F² | 4.6 | 11.1 | 0.0014 |
| Lack of fit | — | 1.8 | 0.1761 |
| R² | 0.8 | — | — |
| R²adj | 0.8 | — | — |
| R²pred | 0.7 | — | — |
| Adeq precision | 27.0 | — | — |
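The coefficients in Table 2 can be assembled into a coded-variable quadratic model, as in the sketch below. It makes several assumptions: the 36.5 entry is taken as the constant term, the coefficients are used as rounded in the table, and each factor is coded as (value − centre)/step using the centre and ±1 levels from the factor-level table. It is not the article's eqn (8), which is stated to be the form obtained after transforming back through eqn (2). The names `CENTRE`, `STEP`, `encode` and `toc_removal_coded` are hypothetical.

```python
# Centre-point value and step (level +1 minus centre) for each factor,
# read from the factor-level table. Keys follow the Table 2 letters:
# A = pH, B = time, C = m-cresol, D = SPC, E = temperature, F = catalyst.
CENTRE = {"A": 6.5, "B": 20.0, "C": 75.0, "D": 2.3, "E": 30.0, "F": 10.5}
STEP = {"A": 3.5, "B": 10.0, "C": 25.0, "D": 0.8, "E": 10.0, "F": 1.5}

def encode(factor, value):
    """Convert an actual factor value into its coded level."""
    return (value - CENTRE[factor]) / STEP[factor]

def toc_removal_coded(A, B, C, D, E, F):
    """Predicted TOC removal (%) from coded levels, using Table 2 coefficients."""
    return (36.5
            - 5.6 * A + 8.6 * B - 3.0 * C - 1.6 * D + 5.1 * E + 2.5 * F
            + 1.4 * B * E                                # time x temperature
            - 4.3 * A ** 2 - 3.9 * B ** 2 + 4.6 * F ** 2)  # curvature terms

# RSM-predicted optimum reported in the text (actual units).
rsm_optimum = {"A": 4.4, "B": 25.4, "C": 56.5, "D": 1.8, "E": 21.6, "F": 11.3}
coded = {name: encode(name, value) for name, value in rsm_optimum.items()}
predicted = toc_removal_coded(**coded)
```

With these rounded coefficients the model returns about 42.8% at the reported RSM optimum, close to the 43.3% predicted value quoted in the text. The small gap is consistent with coefficient rounding.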
Three-dimensional (3D) response surface plots with associated contour plots are generated to elucidate the correlations between independent variables and their corresponding responses.26,27 Table 2 identifies reaction time (B) and reaction temperature (E) as key interactive factors influencing the optimized response surface. As shown in Fig. 4a, the curved contour lines highlight the nonlinear interactions between B and E affecting the TOC removal rate.28 The response surface within the designated ranges captures the complete trend of these interactions, demonstrating the complex relationship between these two factors. Fig. 4b further illustrates that the TOC removal rate increases progressively with extended reaction times and elevated temperatures. This trend likely results from higher temperatures accelerating the SPC oxidation reaction and longer reaction times enabling more complete interactions between pollutants and oxidants, collectively enhancing degradation efficiency.
Next, the RSM model's predictive performance was further evaluated by comparing experimental and predicted values. Fig. 5a shows a discrepancy between these values, with an AARE% of 15%. Under the predicted optimal conditions – an initial solution pH of 4.4, a reaction time of 25.4 minutes, an SPC dosage of 1.8 g L−1, a reaction temperature of 21.6 °C, a catalyst amount of 11.3 g L−1, and an m-cresol concentration of 56.5 mg L−1 – the experimental TOC removal rate was merely 38.2%. This value is lower than many experimental test results, and also shows an absolute error of 5.1% compared to the predicted value of 43.3%. These deviations suggest that although the RSM model offers general predictive capability, its accuracy is limited, implying the need for a more robust predictive model.
Fig. 5 Comparison of predicted TOC removal rates with experimental results using (a) the RSM model and (b) the ANN model.
After training, the ANN model demonstrates excellent performance, achieving an R2 of 1 for the training set and 0.98 for the test set (Table 3). The AARE% is just 0.4%, highlighting the model's exceptional generalization ability and accuracy. Fig. 5b presents the prediction results for all data in the training and test sets, revealing robust performance across both datasets. Compared with RSM (Fig. 5a), the ANN model clearly shows superior accuracy in predicting TOC removal rates. The model predicted the optimal reaction conditions as follows: an initial solution pH of 2.3, a reaction time of 35.7 min, an SPC dosage of 2.9 g L−1, a reaction temperature of 45.7 °C, a catalyst dosage of 12.9 g L−1, and a pollutant concentration of 75 mg L−1. Under these conditions, the predicted TOC removal rate was 63.1%, whilst the experimental value was 67.8%, leading to a 4.8% absolute error. This high level of predictive accuracy, coupled with an experimental TOC removal rate 77% higher than that obtained with RSM, indicates that the current model effectively fits the datasets and captures the relationship between input parameters and TOC removal rate. Although a minor discrepancy still exists between the experimental and predicted values, the difference is within an acceptable range. Experimental values are often subject to errors arising from experimental operation difficulties, equipment accuracy or environmental factors, which may contribute to deviations.
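The 77% figure is a relative gain. It follows from the two experimental removal rates quoted in the text, as the worked step below shows; the symbols $\eta_{\mathrm{ANN}}$ and $\eta_{\mathrm{RSM}}$ are assumed names for the experimental TOC removal under the ANN-derived and RSM-derived conditions.

```latex
\frac{\eta_{\mathrm{ANN}} - \eta_{\mathrm{RSM}}}{\eta_{\mathrm{RSM}}} \times 100\%
  = \frac{67.8 - 38.2}{38.2} \times 100\% \approx 77\%
```

In absolute terms the gain is 67.8 − 38.2 = 29.6 percentage points of TOC removal.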
| Data | R² | RMSE | AARE% |
|---|---|---|---|
| Training set | 1 | 1.78 | 0.4% |
| Test set | 0.98 | 0.01 | |
ANN sensitivity analysis was then performed to examine the relative importance of each input parameter for TOC removal during m-cresol degradation. As shown in Fig. 6, the ranking of variable significance is as follows: reaction time > SPC dosage > catalyst dosage > initial solution pH > initial m-cresol concentration > reaction temperature. This order aligns well with the established mechanistic understanding of advanced oxidation processes. As reaction time increases, the substrate has more opportunities to interact with sodium percarbonate and the catalyst, leading to the generation of more ·OH, which enhances pollutant removal.18 At suitable pH and reaction temperatures, the collision frequency of reactant molecules increases, along with the energy transferred during collisions, making it easier for the reactant molecules to overcome the reaction activation energy, thus boosting TOC removal rates. However, excessive SPC can result in the formation of oxidative by-products, which may negatively affect the TOC removal rate. Similarly, higher initial concentrations of m-cresol can inhibit TOC removal, possibly due to competition for reactive radicals or the depletion of oxidizing agents.
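One common way to obtain such a ranking from a trained model is permutation importance, sketched below. The text here does not state which sensitivity method was used, so this is an illustrative assumption rather than the authors' procedure. The toy linear predictor, its weights and the function name `permutation_importance` are all hypothetical.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Relative importance of each input: mean rise in MSE when it is shuffled."""
    rng = np.random.default_rng(seed)
    base_mse = np.mean((y - predict(X)) ** 2)
    increases = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling column j breaks its link to y while keeping its distribution.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases[j] += np.mean((y - predict(X_perm)) ** 2) - base_mse
    increases /= n_repeats
    return increases / increases.sum()  # normalize so the shares sum to 1

# Toy stand-in for a trained model: three inputs with decreasing influence.
weights = np.array([3.0, 1.0, 0.1])
predict = lambda X: X @ weights

rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = predict(X)

importance = permutation_importance(predict, X, y)
ranking = np.argsort(importance)[::-1]  # most influential input first
```

Because the method only needs a `predict` function, the same routine would apply to a fitted ANN without access to its internal weights.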
Fig. 7 Prediction of TOC removal rate using RSM and ANN models. (a) Training set, (b) test set, (c) R2, and (d) RMSE comparisons.
In conclusion, the ANN model exhibited superior predictive capability and reliability, capturing complex nonlinear relationships and achieving a ca. 100% improvement in prediction accuracy over the RSM model. Notably, the experimental TOC removal rate obtained under ANN-derived conditions was 77% higher than that obtained under RSM-derived conditions. These findings underscore the potential of ANN models to optimize AOPs for wastewater treatment and offer a foundation for enhancing the efficiency of SPC-based systems in addressing environmental challenges.
In future work, this framework may be extended to complex wastewater matrices, integrated with global optimization algorithms, and advanced toward hybrid mechanism–data-driven architectures to enhance both adaptability and interpretability.
Supplementary information: the SI mainly includes catalyst characterization conditions, N2 adsorption–desorption (Fig. S1), response surface coding and corresponding experimental results (Table S1), analysis of variance and statistical parameters of RSM (Table S2), as well as discussions on the principles and methodological considerations underlying the present work. See DOI: https://doi.org/10.1039/D5EW00689A.
This journal is © The Royal Society of Chemistry 2025