Big-data and machine learning to revamp computational toxicology and its use in risk assessment
The creation of large toxicological databases and advances in machine-learning techniques have empowered computational approaches in toxicology. Work with these large databases, which are built on regulatory data, has allowed the reproducibility of animal models to be assessed, and this assessment highlights weaknesses in traditional in vivo methods. This should lower the bar for the introduction of new approaches, and the measured reproducibility represents an achievable benchmark for any alternative method validated against these animal tests. Quantitative structure-activity relationship (QSAR) models for skin sensitization, eye irritation, and other human health hazards based on these big databases have, however, also made apparent some of the challenges facing computational modeling, including validation, model interpretation, and model selection. A first implementation of machine learning-based predictions, termed REACHacross, achieved unprecedented sensitivities of >80% with specificities of >70% in predicting the six most common acute and topical hazards, covering about two-thirds of the chemical universe. While this is awaiting formal validation, it demonstrates the new quality introduced by big data and modern data-mining technologies. The rapid increase in the diversity and number of computational models, as well as in the data they are based on, creates challenges and opportunities for the use of computational methods.
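For readers less familiar with the two performance figures quoted above, the following minimal Python sketch shows how sensitivity and specificity are conventionally computed for a binary hazard classifier. The function name `sensitivity_specificity` and the ten example labels are hypothetical and chosen only for illustration; they are not REACHacross data and do not reproduce its method.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Labels are assumed to be 1 = hazardous (positive), 0 = non-hazardous.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # hazards correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # hazards missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-hazards correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-hazards wrongly flagged

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # fraction of true hazards detected
    specificity = tn / (tn + fp)  # fraction of non-hazards correctly cleared
    return sensitivity, specificity


# Hypothetical example: 5 hazardous and 5 non-hazardous chemicals (made-up labels).
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this made-up example one hazardous chemical is missed and one non-hazardous chemical is wrongly flagged, so both measures come out at 4 of 5.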
- This article is part of the themed collection: Recent Review Articles