 Open Access Article
Maryam Foroughiab, 
Hassan Zolghadr Nasabc, 
Reza Shokoohi *c, 
Mohammad Hossein Ahmadi Azqhandid, 
Azam Nadalic and 
Ashraf Mazaheric
aDepartment of Environmental Health, School of Health, Torbat Heydariyeh University of Medical Sciences, Torbat Heydariyeh, Iran
bHealth Sciences Research Center, Torbat Heydariyeh University of Medical Sciences, Torbat Heydariyeh, Iran
cDepartment of Environmental Health Engineering & Research Centre for Health Sciences, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran. E-mail: reza.shokohi@umsha.ac.ir; shokoohia@yahoo.com;   Tel: +98 38380026;  Fax: +98 8138380509
dApplied Chemistry Department, Faculty of Gas and Petroleum (Gachsaran), Yasouj University, Gachsaran 75813-56001, Iran
First published on 24th May 2019
In real-scale applications, where nanoparticles (NPs) are injected into the aqueous environment for remediation, they may interact with natural organic matter (NOM). This interaction can alter the NPs' physicochemical properties, sorption behavior, and even ecological effects. This study aimed to investigate the sorption of Pb(II) onto multi-walled carbon nanotubes (MWCNTs) in the presence of NOM. The predominant behavior of the process was examined comparatively using response surface methodology (RSM) and boosted regression tree (BRT)-based models. The influence of four main effective parameters, namely Pb(II) and humic acid (HA) concentrations (mg L−1), pH, and time (min), on Pb removal (%) was evaluated by contributing factor importance rankings (BRT) and analysis of variance (RSM). The applicability of the BRT and RSM models for description of the predominant behavior in the design space was checked and compared using the statistics of absolute average deviation (AAD), mean absolute error (MAE), root mean square error (RMSE), and multiple correlation coefficient (R²). The results showed that although both approaches exhibited good performance, the BRT model was more precise, indicating that it could be a powerful method for the modeling of NOM-presence studies. The importance rankings of BRT showed that the effectiveness order of the studied parameters is pH > time > Pb(II) concentration > HA concentration. Although HA concentration theoretically showed the least effect in comparison with the three other studied parameters, the experimental results revealed that Pb(II) removal is enhanced in the presence of HA (73% vs. 81.77%), which was confirmed by SEM/EDX analyses. The maximum removal (R% = 81.77) was attained at an initial Pb(II) concentration of 9.91 mg L−1, HA concentration of 0.3 mg L−1, pH of 4.9, and time of 55.2 min.
Thus, efficient removal of heavy metals from water bodies is still a challenging task facing environmental engineers. Among developed remediation technologies for heavy metals, including chemical precipitation, ion exchange, adsorption, membrane separation, and electrochemical processes,6 adsorption is still known as one of the most efficient approaches and many adsorbents have been studied in recent decades.7–9 Among introduced adsorbents, carbon nanotubes (CNTs) have attracted considerable research attention due to their highly porous and hollow structure, large specific surface area, light mass density, and capability to establish strong electrostatic interaction with various kinds of pollutant molecules. These features have led to CNTs seeming a very promising candidate for adsorption of various kinds of pollutants from wastewater, including heavy metal ions.10,11 Depending on both CNT and solution chemistry, the apparent adsorption capacity for Pb(II) has been reported from several mg g−1 to about 100 mg g−1.12
However, in real-scale applications where the nanoparticles (NPs) are injected into the aqueous environment for remediation, interaction with natural organic matter (NOM) may occur. Hence, it is crucial to study the adsorption behavior of heavy metals by NPs in the absence or presence of NOM. NOM is one of the most abundant materials on earth and ubiquitously present in natural water bodies at concentrations ranging from a few mg L−1 to a few hundred mg L−1.13 Interaction between NOM and NPs can alter the physicochemical properties, sorption behavior, and even ecological effects of the adsorbents.14 For this reason, in many studies, NOM has been introduced into the process to investigate its effect on NPs performance in sorption of a target pollutant. Therefore, in recent years, growing numbers of studies have reported the effects of NOM on heavy metals removal by CNTs.15
However, all have focused on the investigation using the one-variable-at-a-time (OVAT) approach, in which the impacts of the main effective parameters are presented individually. This strategy cannot show the interactions between the contributing variables. Multivariate statistical strategies have been preferred to the OVAT approach to identify the optimal combination of parameters and interactions between variables, improve cost- and time-effectiveness, develop a mathematical model, forecast the response, assess the model adequacy, and determine the optimal conditions for a given response.16,17 Response surface methodology (RSM) is known as an efficient procedure applicable not only to experimental design but also to development of a mathematical model (linear, square polynomial functions, etc.) for each response based on the obtained results.18 The boosted regression tree (BRT) model is a recently developed procedure for either multivariate classification or regression. This approach offers the benefits of both classical regression models and machine learning techniques. BRT fits complex linear and nonlinear responses to multiple categorical and continuous parameters even where the data suffer from collinearity-based challenges.19 Such tree-based methods were generally developed to optimize predictive performance by combining a large number of simple trees into a powerful model instead of relying on a single conventional regression tree.20 These advantages led to the application of BRT for the present study's modelling and optimization, alongside the Box–Behnken design (BBD) strategy.
The objectives of this work were (i) to investigate the effect of NOM on MWCNTs in Pb(II) sorption by considering the parameters of Pb(II) and HA concentrations (mg L−1), pH, and time (min), (ii) to model the process and compute the impacts in terms of main effects and interactions using both RSM and BRT strategies and compare the results in terms of absolute average deviation (AAD), mean absolute error (MAE), root mean square error (RMSE), and multiple correlation coefficient (R2), and (iii) to introduce the optimal conditions of the process and the expected efficiency at such point. It should be emphasized that, although an increasing number of studies have been conducted in recent years to evaluate the effect of NOM on sorption-based processes of NPs, to the best of our knowledge none of them have investigated the process from modeling and/or interactions point of view. Therefore, this study highlighted the application and comparison of the RSM and BRT models on the process behavior.
| $N = 2k(k - 1) + C_0$ | (1) | 
|  | ||
| Fig. 1 Design space at a three-level Box–Behnken approach. The yellow and red circles in the left scheme lie on the factorial and center points, respectively. | ||
According to eqn (1), a total of 29 experiments, comprising 24 factorial points (Stds 1–24) and five replicates at the center point (Stds 25–29), were defined. Their experimentally obtained results (summarized in Table 1) were used to describe the behavior governing the process by fitting them to the quadratic polynomial model presented in eqn (2).
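The run count given by eqn (1) can be checked with a short sketch. It assumes the usual Box–Behnken construction for four factors (a 2 × 2 factorial on each pair of factors with the remaining factors held at their middle level); the function names `box_behnken` and `decode` are illustrative, and the row order produced here is not claimed to match the standard order of Table 1.

```python
from itertools import combinations, product


def box_behnken(k, center_points):
    """Return coded (-1, 0, +1) runs of a Box-Behnken design (sketch for k = 4)."""
    runs = []
    # For every pair of factors, run a 2x2 factorial on that pair
    # while all other factors stay at their middle level (0).
    for i, j in combinations(range(k), 2):
        for a, b in product((-1, 1), repeat=2):
            point = [0] * k
            point[i], point[j] = a, b
            runs.append(tuple(point))
    # Replicated center points.
    runs.extend([(0,) * k] * center_points)
    return runs


def decode(coded, low, high):
    """Map a coded level (-1, 0, +1) back to the actual factor value."""
    return (low + high) / 2 + coded * (high - low) / 2


design = box_behnken(k=4, center_points=5)
n_expected = 2 * 4 * (4 - 1) + 5  # eqn (1): N = 2k(k - 1) + C0
print(len(design), n_expected)
```

With k = 4 and C0 = 5 both counts come out at 29, of which 24 are non-center points.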
|  | (2) | 
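The equation image for eqn (2) is not reproduced in this text. As a hedged sketch, the generic second-order polynomial commonly fitted in RSM is given below; the notation is an assumption and not necessarily the authors' own ($Y$ is the predicted response, $X_i$ the coded factors, $\beta_0$ the intercept, $\beta_i$, $\beta_{ii}$, and $\beta_{ij}$ the linear, quadratic, and interaction coefficients, $\varepsilon$ the residual, and $k = 4$ here). Eqn (3) has this form.

```latex
Y = \beta_0 + \sum_{i=1}^{k} \beta_i X_i + \sum_{i=1}^{k} \beta_{ii} X_i^{2}
    + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \beta_{ij} X_i X_j + \varepsilon
```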
| Factor | Name | Units | Lower level (−1) | Middle (0) | Upper level (+1) |
|---|---|---|---|---|---|
| A | Pb concentration | mg L−1 | 2 | 6 | 10 | 
| B | HA concentration | mg L−1 | 0 | 10 | 20 | 
| C | pH | — | 3 | 5 | 7 | 
| D | Time | min | 10 | 35 | 60 | 
| Standard order | Run order | Leverage | Pb concentration (mg L−1) | HA concentration (mg L−1) | pH | Time (min) | Pb removal efficiency (%), actual | Pb removal efficiency (%), predicted | Pb removal efficiency (%), residual |
|---|---|---|---|---|---|---|---|---|---|
| 13 | 1 | 0.786 | 6 | 0 | 3 | 35 | 10.423 | 11.76 | −1.33 | 
| 21 | 2 | 0.635 | 6 | 0 | 5 | 10 | 20.5018 | 16.56 | 3.94 | 
| 6 | 3 | 0.647 | 6 | 10 | 7 | 10 | −8 | −9.42 | 1.42 | 
| 9 | 4 | 0.619 | 2 | 10 | 5 | 10 | 58.23 | 60.02 | −1.79 | 
| 28 | 5 | 0.2 | 6 | 10 | 5 | 35 | 23.8281 | 23.83 | −0.0052 | 
| 8 | 7 | 0.704 | 6 | 10 | 7 | 60 | 23.5944 | 20.81 | 2.79 | 
| 25 | 8 | 0.2 | 6 | 10 | 5 | 35 | 25.2748 | 23.83 | 1.44 | 
| 11 | 9 | 0.645 | 2 | 10 | 5 | 60 | 47.645 | 46.86 | 0.7843 | 
| 18 | 10 | 0.612 | 10 | 10 | 3 | 35 | 64.3901 | 60.34 | 4.05 | 
| 4 | 12 | 0.648 | 10 | 20 | 5 | 35 | 61.362 | 60.51 | 0.8522 | 
| 1 | 14 | 0.714 | 2 | 0 | 5 | 35 | 56.5768 | 57.77 | −1.19 | 
| 22 | 15 | 0.758 | 6 | 20 | 5 | 10 | 39.394 | 37.72 | 1.68 | 
| 24 | 16 | 0.67 | 6 | 20 | 5 | 60 | 21.7446 | 24.01 | −2.27 | 
| 27 | 17 | 0.2 | 6 | 10 | 5 | 35 | 22.5867 | 23.83 | −1.25 | 
| 19 | 18 | 0.656 | 2 | 10 | 7 | 35 | 38.7484 | 41.13 | −2.38 | 
| 10 | 19 | 0.619 | 10 | 10 | 5 | 10 | 40.0267 | 42.15 | −2.12 | 
| 5 | 20 | 0.597 | 6 | 10 | 3 | 10 | 26.1443 | 29.27 | −3.13 | 
| 17 | 21 | 0.612 | 2 | 10 | 3 | 35 | 45.2172 | 41.71 | 3.5 | 
| 29 | 22 | 0.2 | 6 | 10 | 5 | 35 | 22.0096 | 23.83 | −1.82 | 
| 26 | 23 | 0.2 | 6 | 10 | 5 | 35 | 25.4672 | 23.83 | 1.63 | 
| 14 | 24 | 0.786 | 6 | 20 | 3 | 35 | 41.0664 | 42.4 | −1.33 | 
| 3 | 25 | 0.648 | 2 | 20 | 5 | 35 | 60.5635 | 59.49 | 1.07 | 
| 2 | 26 | 0.714 | 10 | 0 | 5 | 35 | 61.362 | 62.78 | −1.42 | 
| 12 | 27 | 0.645 | 10 | 10 | 5 | 60 | 71.2188 | 70.76 | 0.4561 | 
| 20 | 28 | 0.656 | 10 | 10 | 7 | 35 | 26.7023 | 28.53 | −1.83 | 
| 7 | 29 | 0.628 | 6 | 10 | 3 | 60 | 12.745 | 14.51 | −1.76 | 
Random forest (RF) is an ensemble learning method for regression that consists of many decision trees and was first introduced by Tin Kam Ho of Bell Labs in 1995. The RF technique combines Breiman's “bagging” idea with the random selection of features. Several advantages have been reported for RF-derived models over other statistical approaches: they can handle missing values and high-dimensional data, recognize complex interactions among factors, identify the most important parameters, and predict with high accuracy (low bias together with low variance) even for large databases.33 However, RF suffers from inherent limitations, such as overfitting for some datasets and unreliable variable importance scores, especially for categorical factors with different numbers of levels. These disadvantages can be overcome by employing boosting methods such as BRT.34 The process of BRT application includes fitting the model using random independent bootstrap replicates which are then combined by averaging the output for regression. In fact, BRTs are an ensemble strategy wherein many simple models are combined to improve the model performance (“boosting”) by means of recursive binary splits that relate the response to the independent factors (regression trees). These approaches robustly handle collinearity, outliers, and missing data and can take both categorical and continuous parameters.35
So far, BRT approaches have been successfully used in different fields of chemistry with large data volumes, including reflectance spectroscopy,36 blood–brain barrier modelling,37 and cancer diagnostics.38 However, our literature survey shows that there is no evidence for use of BRT approach in the adsorption process from RSM data. It has been shown that the BRT model is one of the most powerful statistical approaches reported in science since the 1990s; the efficiency of BRT regression usually depends on three parameters: the number of trees (nt), tree complexity (tc), and learning rate (lr).39 However, the success of a BRT model relies on optimal sets of these regularization parameters. Hence, BRT models with various nt (1 to 100), tc (1, 4, 16) and lr (0.1, 0.25, 0.50, and 1.00) values were considered in the training to select the best BRT model with maximum R2 and minimum error.
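To make the roles of nt and lr concrete, the sketch below boosts single-split trees (tc = 1) on squared error and scans a small grid of both parameters. It is a simplified illustration, not the software used in this study: the helper names are hypothetical, only tc = 1 is covered, no test set is held out, and the eight data rows are a subset of Table 1 (Pb, HA, pH, time, removal %) chosen purely for demonstration.

```python
def fit_stump(X, residuals):
    """Find the single split (tc = 1) that minimizes squared error."""
    mean = sum(residuals) / len(residuals)
    best = (float("inf"), 0, float("inf"), mean, mean)  # fallback: no split
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[f] <= thr]
            right = [r for row, r in zip(X, residuals) if row[f] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best[0]:
                best = (sse, f, thr, lm, rm)
    return best[1:]


def boost(X, y, nt, lr):
    """Stagewise boosting: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(nt):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        f, thr, lm, rm = fit_stump(X, residuals)
        stumps.append((f, thr, lm, rm))
        # Shrink each tree's contribution by the learning rate.
        pred = [p + lr * (lm if row[f] <= thr else rm) for row, p in zip(X, pred)]
    return base, lr, stumps


def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(lm if row[f] <= thr else rm for f, thr, lm, rm in stumps)


def mse(model, X, y):
    return sum((yi - predict(model, row)) ** 2 for row, yi in zip(X, y)) / len(y)


# Illustrative subset of Table 1: (Pb, HA, pH, time) -> Pb removal (%)
X = [(6, 0, 3, 35), (2, 10, 5, 10), (6, 10, 5, 35), (10, 10, 3, 35),
     (10, 20, 5, 35), (2, 10, 7, 35), (10, 10, 5, 60), (6, 10, 3, 60)]
y = [10.423, 58.23, 23.8281, 64.3901, 61.362, 38.7484, 71.2188, 12.745]

# Training MSE over a small grid of learning rates and tree counts.
grid = {(lr, nt): mse(boost(X, y, nt, lr), X, y)
        for lr in (0.1, 0.25, 0.5, 1.0) for nt in (1, 10, 100)}
for key in sorted(grid):
    print(key, round(grid[key], 4))
```

On training data the error can only fall as trees are added, which is why a held-out set or cross-validation is needed to detect the overfitting discussed for large lr values.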
The SEM images of MWCNTs and MWCNTs/HA before and after Pb(II) sorption are shown in Fig. 2. As can be seen, the MWCNTs were smooth and free from impurities (Fig. 2a). The bulk morphology of the long particles is filament-like and oriented with uniform diameters, which indicates homogeneous MWCNTs. It can be seen from Fig. 2b (MWCNTs/HA before adsorption of Pb(II)) that the extent of aggregation between MWCNTs in MWCNTs/HA was clearly reduced compared to raw MWCNTs, which can be attributed to hydrophobic and π–π attractions of HA with MWCNTs.42 The uniformly distributed MWCNTs/HA nanohybrid can greatly increase the surface-to-volume ratio and the efficiency of Pb(II) ion capture, thus greatly improving the removal properties of the prepared adsorbent.42 However, after the adsorption process, the tubes displayed swelling from the open ends of the MWCNTs (Fig. 2c). The functional groups (e.g. hydroxyl or carboxyl groups) created during the adsorption process will attach to these or to any other available defect sites. Therefore, the surfaces of MWCNTs after adsorption were less smooth in comparison with pristine MWCNTs, mainly due to the surface modification induced by adsorption.43
|  | ||
| Fig. 2 SEM images of pristine optimum MWCNTs (a) and MWCNTs/HA before (b) and after (c) adsorption of Pb(II). | ||
The sorption of Pb(II) was also confirmed by comparing the EDX spectra of the MWCNTs before and after exposure to Pb(II)- and HA-containing solutions (Fig. S2†). No lead was detected in the nanoparticles before sorption or in those exposed to HA solution, as can be seen from Fig. S2a and b,† whereas a sharp lead peak appeared in the EDX spectrum of the nanoparticles after the process (Fig. S2c†).
| $\text{Pb removal (\%)}_{\text{coded}} = 23.8333 + 1.50675A - 0.137164B - 8.09783C + 3.86432D - 0.996687AB - 7.80475AC + 10.4443AD - 15.4589BC - 10.7171BD + 11.2484CD + 30.1244A^2 + 6.17927B^2 - 11.0312C^2 + 0.989705D^2$ | (3) | 
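Eqn (3) can be evaluated directly once the actual factor values are converted to coded units. The sketch below assumes the factor ranges listed in the levels table (Pb 2–10 mg L−1, HA 0–20 mg L−1, pH 3–7, time 10–60 min); the function names `code` and `predict_removal` are illustrative.

```python
def code(value, low, high):
    """Convert an actual factor value to coded units (-1 at low, +1 at high)."""
    return (value - (low + high) / 2) / ((high - low) / 2)


def predict_removal(pb, ha, ph, time):
    """Predicted Pb removal (%) from eqn (3), given actual factor values."""
    A = code(pb, 2, 10)     # Pb concentration, mg/L
    B = code(ha, 0, 20)     # HA concentration, mg/L
    C = code(ph, 3, 7)      # pH
    D = code(time, 10, 60)  # contact time, min
    return (23.8333
            + 1.50675 * A - 0.137164 * B - 8.09783 * C + 3.86432 * D
            - 0.996687 * A * B - 7.80475 * A * C + 10.4443 * A * D
            - 15.4589 * B * C - 10.7171 * B * D + 11.2484 * C * D
            + 30.1244 * A ** 2 + 6.17927 * B ** 2
            - 11.0312 * C ** 2 + 0.989705 * D ** 2)


# Center point and two design points from Table 1.
print(round(predict_removal(6, 10, 5, 35), 2))   # center point
print(round(predict_removal(10, 10, 5, 60), 2))  # Std 12
print(round(predict_removal(6, 0, 3, 35), 2))    # Std 13
```

The three calls reproduce the predicted values listed in Table 1 for these runs (23.83, 70.76, and 11.76) to two decimal places.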
| Source | Sum of squares (SS) | df | Mean square | F-Value | p-Value | Status | 
|---|---|---|---|---|---|---|
| a R² = 0.9887, adjusted R² = 0.9743, predicted R² = 0.9106, AP = 33.2441, and CV = 8.79. | ||||||
| Model | 9685.71 | 14 | 691.84 | 68.61 | <0.0001 | Significant | 
| A-Pb | 27.24 | 1 | 27.24 | 2.70 | 0.1285 | |
| B-HA | 0.11 | 1 | 0.11 | 0.01 | 0.9178 | |
| C-pH | 587.40 | 1 | 587.40 | 58.26 | <0.0001 | |
| D-Time | 146.11 | 1 | 146.11 | 14.49 | 0.0029 | |
| AB | 3.97 | 1 | 3.97 | 0.39 | 0.5430 | |
| AC | 243.66 | 1 | 243.66 | 24.16 | 0.0005 | |
| AD | 436.33 | 1 | 436.33 | 43.27 | <0.0001 | |
| BC | 358.04 | 1 | 358.04 | 35.51 | <0.0001 | |
| BD | 273.59 | 1 | 273.59 | 27.13 | 0.0003 | |
| CD | 506.11 | 1 | 506.11 | 50.19 | <0.0001 | |
| A² | 5416.76 | 1 | 5416.76 | 537.20 | <0.0001 | |
| B² | 183.90 | 1 | 183.90 | 18.24 | 0.0013 | |
| C² | 661.20 | 1 | 661.20 | 65.57 | <0.0001 | |
| D² | 5.54 | 1 | 5.54 | 0.55 | 0.4743 | |
| Residual | 110.92 | 11 | 10.08 | |||
| Lack of fit | 101.29 | 7 | 14.47 | 6.01 | 0.0512 | Not significant | 
| Pure error | 9.63 | 4 | 2.41 | |||
| Cor total | 9796.63 | 25 | ||||
The large R² values (R² = 0.9887, adjusted R² = 0.9743, and predicted R² = 0.9106) indicate high correlation and agreement between the predicted and experimentally obtained results. Adequate precision (AP) evaluates whether the model discriminates adequately, i.e. the signal-to-noise ratio. In this study, the AP ratio of 33.24 indicates that the signal is sufficient for the model to be used.
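The model F-value, R², and adjusted R² follow from the sums of squares and degrees of freedom in the ANOVA table. The sketch below reproduces that arithmetic with the standard textbook formulas; the variable and function names are illustrative.

```python
# Sums of squares and degrees of freedom taken from the ANOVA table.
ss_model, df_model = 9685.71, 14
ss_resid, df_resid = 110.92, 11
ss_total, df_total = 9796.63, 25


def f_value(ss_effect, df_effect, ss_error, df_error):
    """F = mean square of the effect / mean square of the residual."""
    return (ss_effect / df_effect) / (ss_error / df_error)


def r_squared(ss_error, ss_tot):
    """Fraction of total variation explained by the model."""
    return 1 - ss_error / ss_tot


def adjusted_r_squared(ss_error, df_error, ss_tot, df_tot):
    """R^2 penalized for the number of model terms."""
    return 1 - (ss_error / df_error) / (ss_tot / df_tot)


f_model = f_value(ss_model, df_model, ss_resid, df_resid)
r2 = r_squared(ss_resid, ss_total)
r2_adj = adjusted_r_squared(ss_resid, df_resid, ss_total, df_total)
print(round(f_model, 2), round(r2, 4), round(r2_adj, 4))
```

The same `f_value` function applies to any single term, for example pH: `f_value(587.40, 1, ss_resid, df_resid)`.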
|  | ||
| Fig. 3 Response contour plots: effects of (a) initial Pb concentration and pH, (b) initial HA concentration and pH and (c) initial Pb concentration and time. | ||
Removal of Pb(II) was critically dependent on the solution pH value, which influences not only the surface charge of the MWCNTs but also the degree of ionization and speciation of the adsorbate. Fig. 3a shows that the removal efficiency increased as pH rose from 3.0 to 5.0. The effect of pH can be related to the following reasons: at pH = 3.0, adsorption is very weak as a result of the competition of H+ with Pb(II) for the adsorption sites; at pH = 5.0, the adsorption capability increases due to the role of functional groups on the MWCNT surfaces; and at pH = 7.0, the adsorption capacity increases remarkably. The higher adsorption capacity of the NPs at pH 7.0 may also result from the cooperating roles of adsorption and precipitation. Since the pHpzc of the NPs was found to be 6.32 (Fig. S3†), the removal efficiency would be expected to decrease because of the positive charges of MWCNTs at pH < pHpzc. However, in real experiments, we found that the efficiency increased at lower pH values. This may correspond to interactions of HA on the nano surface that prevent the expected phenomena. As pH increases, the weakly acidic HA with carboxylic and phenolic moieties becomes a more negatively charged species. Therefore, at higher pH values, repulsion between HA and MWCNTs increases, hindering further sorption of HA to MWCNTs. This results in a decreasing trend of removal at high pH. Fig. 3b illustrates the sorption of HA on MWCNTs in terms of pH values. In fact, the improvement of HA sorption on the adsorbent, and therefore of Pb(II) removal, decreased with increasing pH. The competition between Pb(II) and HA for the active sites made their interaction insignificant, as outlined in red in Fig. S4† (p-value = 0.54), while the interactions of Pb(II) or HA with the other parameters are all significant.
An interaction plot such as Fig. S4† visualizes the interaction between a pair of variables through the difference in the slopes of their lines in relation to the response. When the lines of two variables are parallel, it is assumed that there is no interaction between the corresponding variables.40 As can be seen from Fig. S4,† all the lines are non-parallel except those for Pb(II) and HA concentrations, indicating interaction between the other pairs of variables.
In this study, the nt (0–100) and lr in BRT method were obtained by a trial and error procedure for the datasets obtained from the BBD-introduced matrix. The optimal values were selected based on minimization of MSE at the tc of 1, 4, and 16 (Fig. 4).
|  | ||
| Fig. 4 The association between the nt and the predictive deviance with four lr and three levels of tc. | ||
Herein, the aim was to find the combination of tree parameters, i.e. lr, tc, and nt, that gives a minimum MSE for the estimations of the response. Values of lr larger than 1.00 were not investigated because learning would be too fast and the derived minimum MSE would most probably be due to BRT overfitting. A similar phenomenon was observed for lr = 1.00 and 0.5, namely that overfitting occurred, although after relatively more trees (nt < 10). On the other hand, the smallest values for lr (i.e. 0.01 and 0.001) resulted in the best performance but needed thousands of trees to reach the minimum MSE (results for lr = 0.01 and 0.001 not shown). Elith et al. showed there was only slight improvement in the prediction power on a large number (N = 500) of trees.53 Nevertheless, researchers suggest that the optimum lr should be selected to give a minimum MSE value within nt < 100 for different tc. The MSE as a function of tree complexity for lr = 0.1 for Pb removal is shown in Fig. 8. It can be seen from this figure that the optimum tc for both the training and testing datasets is 4. The relative importance of each contributing factor is shown in Fig. 5. The relative importance of factors can be evaluated by averaging the number of times that a parameter is selected for splitting and the squared improvement resulting from these splits.54 As is evident in Fig. 5, the maximum importance in Pb(II) removal by MWCNTs is assigned to pH.
As expected from the sums of squares (SS; ANOVA results in Table 2), pH and time were found to be the most effective parameters, with relative contributions of 75.00% and 19.35%, respectively, according to the BRT model for Pb(II) adsorption (Table 3). The concentrations of Pb(II) and HA had contributions of 4.00% and 1.65%, respectively, showing their minor influence on Pb(II) removal. The relative importance obtained from SS is in good agreement with that obtained by BRT.
| Model | Pb concentration (mg L−1) | HA concentration (mg L−1) | pH | Time (min) |
|---|---|---|---|---|
| BRT | 4.00 | 1.65 | 75.00 | 19.35 | 
| RSM | 3.95 | 1.50 | 75.35 | 19.20 | 
|  | (4) | 
|  | (5) | 
|  | (6) | 
|  | (7) | 
R2 represents how well the developed equation truly fits to experimental data and is described by least-squares regression. It can be used for determining the degree of linear correlation of parameters in a regression calculation and a higher value implies more reliable prediction of the model.30,55,56 AAD is the average absolute deviation from a middle point and is considered a direct way to measure deviations between predicted and obtained results and, in contrast with R2, a smaller value is better.57 Mean square error (MSE) and RMSE are other statistics to check the quality of a model which are positive values and are preferred to be smaller and closer to zero. For a best-fitted model, sum of squared residuals, and therefore MSE and RMSE, should be minimum. The values for the mentioned statistics are listed in Table 4.
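Since the equation images for eqns (4)–(7) are not reproduced here, the sketch below uses commonly cited definitions of the four statistics. These are assumptions rather than the authors' exact expressions: in particular, AAD is taken as the mean absolute relative deviation in percent, and the scale on which the values in Table 4 were computed is not known. The three-point data set is invented purely to exercise the functions.

```python
from math import sqrt


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def aad_percent(observed, predicted):
    """Absolute average deviation (%), relative to each observed value (assumed form)."""
    return 100 * sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted)) / len(observed)


# Invented toy values, used only to show the calls.
obs = [10.0, 20.0, 30.0]
pred = [12.0, 18.0, 30.0]
print(r_squared(obs, pred), rmse(obs, pred), mae(obs, pred), aad_percent(obs, pred))
```

A higher R² and lower RMSE, MAE, and AAD indicate a better fit, which is the basis for the comparison in Table 4.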
| Model | R² (for TCS) | RMSE (for TCS) | MAE (for TCS) | AAD% (for TCS) |
|---|---|---|---|---|
| BRT | 0.999889 | 0.006464 | 0.005755 | 1.217286 | 
| RSM | 0.9887 | 0.022771 | 0.007659 | 1.679938 | 
The plot of observed responses versus predicted ones can be informative respecting model fitting to a data set.58 The goodness-of-fit between the mentioned responses given by the RSM and BRT models are presented in Fig. 6. As is clear from Fig. 6, there is good agreement between the obtained responses and the predicted values in both models, especially in the case of BRT (R2 = 0.999889). In an adequate model, in addition, no major trend would be seen for the residuals against time or any other parameters.59 The plot of the internally studentized residuals versus the experimental runs is depicted in Fig. 7 and the residuals appear to behave randomly, suggesting their independence from experimental runs.
Although both models presented appropriate statistics, the BRT model is superior to the RSM model developed from the BBD in terms of goodness of fit and predictive capability. However, RSM still has an advantage over BRT in that it shows the influence of the experimental factors as main effects or interactions and gives a regression equation for the process behavior in the studied design space.59
Moreover, the importance ranking from BRT showed that pH and time are the most effective factors, with relative contributions of 75.00% and 19.35%, followed by Pb(II) and HA concentrations at 4.00% and 1.65%, respectively. Although HA concentration showed the least effect in comparison with the three other studied parameters, the experimental results revealed that Pb(II) removal is enhanced in the presence of HA (73% vs. 81.77%), which was confirmed by SEM/EDX analyses. Therefore, it seems that even though RSM is the most widely applied technique for the optimization of adsorption-based studies, the BRT approach can give more accurate and reliable results even with a smaller data set.
The findings of this work are potentially significant for evaluation of a treatment method along with the modeling capability. The BRT strategy is more appropriate due to taking much less computational time and handling a smaller number of contributing factors. However, as the bagging and boosting approaches are meta-algorithms, they can be employed with different kinds of trees or other regression models. The optimal situation and relative importance of each parameter for Pb(II) adsorption were determined and presented.
| Footnote | 
| † Electronic supplementary information (ESI) available. See DOI: 10.1039/c9ra02881a | 
| This journal is © The Royal Society of Chemistry 2019 |