Emerging investigator series: predicted losses of sulfur and selenium in European soils using machine learning: a call for prudent model interrogation and selection
Abstract
Sulfur (S) deficiencies in crops have been attributed to reductions in atmospheric S deposition in recent decades. Similarly, global soil selenium (Se) concentrations have been predicted to drop, particularly in Europe, because of increased leaching attributed to increasing aridity. Given Europe's international importance in agriculture, reductions of essential elements, including S and Se, in its soils could have important impacts on nutrition and human health. Our objectives were to model current soil S and Se levels in Europe and to predict concentration changes over the 21st century. We interrogated four machine-learning (ML) techniques, but after critical evaluation, only the outputs of the linear support vector regression (Lin-SVR) models for S and Se and of the multilayer perceptron (MLP) model for Se were consistent with known mechanisms reported in the literature. Other models exhibited overfitting even when differences between training and testing performance were low or non-existent. Furthermore, our results show that models performing similarly in terms of RMSE or R² can lead to drastically different predictions and conclusions, highlighting the need to interrogate machine learning models and to ensure they are consistent with known mechanisms reported in the literature. Both elements exhibited similar spatial patterns, with predicted gains in Scandinavia and losses in the central and Mediterranean regions of Europe by the end of the 21st century under an extreme climate scenario. The median predicted change was −5.5% for S (Lin-SVR), and −3.5% (MLP) and −4.0% (Lin-SVR) for Se. For both elements, modeled losses were driven by decreases in soil organic carbon and in S and Se atmospheric deposition, whereas gains were driven by increases in evapotranspiration.
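The point that similar RMSE or R² scores do not guarantee similar predictions can be illustrated with a deliberately simple toy sketch. Everything in it is hypothetical and is not taken from the study: the data, the predictor range, and the two hand-written functions `model_a` and `model_b` stand in for fitted models. The two functions agree on every test point inside the observed predictor range, so their scores are identical, yet they diverge as soon as the predictor moves outside that range, as it would under a future climate scenario.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between observations and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination (1 - SS_res / SS_tot)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical predictor scaled to the observed range [0, 1].
x_test = [i / 10 for i in range(11)]
# Hypothetical observations: a linear trend with a fixed alternating offset.
y_test = [2.0 * x + (0.1 if i % 2 == 0 else -0.1) for i, x in enumerate(x_test)]

def model_a(x):
    # Stand-in for a simple model: linear everywhere.
    return 2.0 * x

def model_b(x):
    # Stand-in for a flexible model: identical to model_a on [0, 1],
    # but with a curvature term that only acts beyond the observed range.
    return 2.0 * x - 5.0 * max(0.0, x - 1.0) ** 2

pred_a = [model_a(x) for x in x_test]
pred_b = [model_b(x) for x in x_test]

rmse_a, rmse_b = rmse(y_test, pred_a), rmse(y_test, pred_b)
r2_a, r2_b = r2(y_test, pred_a), r2(y_test, pred_b)

# Hypothetical future predictor value outside the observed range.
x_future = 1.5
future_a, future_b = model_a(x_future), model_b(x_future)

print(f"RMSE: A={rmse_a:.3f}, B={rmse_b:.3f}; R2: A={r2_a:.3f}, B={r2_b:.3f}")
print(f"Prediction at x={x_future}: A={future_a:.2f}, B={future_b:.2f}")
```

In this constructed case the test metrics cannot distinguish the two functions at all, so only an inspection of how each one responds to its predictors (for example, against known mechanisms) would reveal which extrapolation is plausible.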
- This article is part of the themed collection: Emerging Investigator Series