Prediction of nanoparticle transport behavior from physicochemical properties: machine learning provides insights to guide the next generation of transport models
In the last 15 years, the development of advection–dispersion particle transport models (PTMs) for the transport of nanoparticles in porous media has focused on improving the fit of model results to experimental data by inclusion of empirical parameters. However, the use of these PTMs has done little to elucidate the complex behavior of nanoparticles in porous media and has failed to provide the mechanistic insights necessary to predictively model nanoparticle transport. The most prominent weakness of current PTMs stems from their inability to consider the influence of the physicochemical conditions of the experiments on the transport of nanoparticles in porous media. The qualitative influences of physicochemical conditions on particle transport have been well studied and, in some cases, provide plausible explanations for some aspects of nanoparticle transport behavior. However, quantitative models that consider these influences have not yet been developed. With the current work, we intend to support the development of future mechanistic models by relating the physicochemical conditions of the experiments to the experimental outcome using ensemble machine learning (random forest) regression and classification. Regression results demonstrate that the fraction of nanoparticle mass retained over the column length (retained fraction, RF; a measure of nanoparticle transport) can be predicted with an expected mean squared error between 0.025 and 0.033. Additionally, we find that RF prediction is insensitive to nanomaterial type and that features such as the concentration of natural organic matter, the ζ potential of nanoparticles and collectors, and the ionic strength and pH of the dispersion are strongly associated with the prediction of RF and should be targets for incorporation into mechanistic models. Classification results demonstrate that the shape of the retention profile (RP), such as hyperexponential or linearly decreasing, can be predicted with an expected F1-score between 60 and 70%. This relatively low performance in the prediction of the RP shape is most likely caused by the limited data on retention profile shapes that are currently available.
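The two performance measures quoted above, mean squared error for the RF regression and F1-score for the RP shape classification, can be illustrated with a minimal sketch. The function names and all numbers below are hypothetical and are not taken from the study. The macro-averaging of per-class F1-scores is an assumption, since the abstract does not state how the multi-class F1-score was aggregated.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1-score of one class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores (assumed averaging scheme)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical retained fractions (0 = no retention, 1 = full retention).
rf_observed = [0.10, 0.50, 0.90, 0.30]
rf_predicted = [0.20, 0.40, 0.70, 0.30]

# Hypothetical retention profile shape labels.
rp_observed = ["hyperexponential", "hyperexponential", "linear", "linear", "hyperexponential"]
rp_predicted = ["hyperexponential", "linear", "linear", "linear", "hyperexponential"]

if __name__ == "__main__":
    print("MSE of RF prediction:", round(mean_squared_error(rf_observed, rf_predicted), 4))
    print("Macro F1 of RP shape prediction:", round(macro_f1(rp_observed, rp_predicted), 4))
```

Because RF is bounded between 0 and 1, a mean squared error in the reported range corresponds to a typical prediction error of roughly 0.16 to 0.18 in retained fraction (the square root of the MSE).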
- This article is part of the themed collection: Modelling in Environmental Nanotechnology