QSARs and computational chemistry methods in environmental chemical sciences

Kathrin Fenner*(a,b) and Paul G. Tratnyek(c)
(a) Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600 Dübendorf, Switzerland. E-mail: kathrin.fenner@eawag.ch
(b) Department of Chemistry, University of Zürich, 8057 Zürich, Switzerland
(c) Institute of Environmental Health, Oregon Health & Science University, 3181 SW Sam Jackson Park Road, Portland, OR 97239, USA

It is my great pleasure to introduce to you this Themed Issue on “QSARs and computational chemistry methods in environmental chemical sciences” in Environmental Science: Processes & Impacts, for which I served as guest editor together with ESPI associate editor Paul Tratnyek. At the outset, we recognized that quantitative structure–activity relationships (QSARs) have long been a pervasive part of environmental chemical sciences, so much so that there are whole journals and multiple monographs dedicated to environmental applications of QSARs. We also noted that the relatively recent proliferation of advanced molecular modeling studies in environmental chemistry has set the stage for an era of convergence between these disciplines. For this issue, it was our ambition to encourage this convergence, while also showcasing some new examples of the more purely statistical (QSAR) and theoretical (molecular modeling) approaches to environmental chemistry.

During 2016, we recruited contributions from many active, influential, and/or emerging researchers working near the intersection of molecular modeling and chemometric analysis of environmental chemistry data. The resulting collection of invited research articles, perspectives, and critical reviews spans a range of topics, from detailed studies of specific processes using advanced molecular methods to more general applications of statistical and “big data” approaches, and as such gives a wide and balanced perspective on the whole range of “in silico” methods in environmental chemical sciences at this time. The applications covered in this issue range from developing predictive models for important fate and toxicity endpoints, to improving our mechanistic understanding of how chemical structure and environmental conditions affect the underlying processes.

The three perspectives at the beginning of this issue were arranged to be complementary in that they address the three main motivations for QSAR and molecular modeling in environmental sciences, i.e., contaminant fate, toxicity, and regulation. Paul Tratnyek, Eric Bylaska and Eric Weber (DOI: 10.1039/C7EM00053G) discuss the prediction of properties that are fundamental determinants of chemical fate and effects, and highlight the increasing importance of computational methods, not only for quantum-chemical property prediction but also for integration of databases and prediction tools into comprehensive web services. Marcy Card and her co-authors from the U.S. Environmental Protection Agency (DOI: 10.1039/C7EM00064B) also discuss the prediction of properties that determine chemical fate, but emphasize the early development of QSAR-based software tools to support the USEPA's efforts to develop science-based regulations of chemical contaminants. Mark Cronin's perspective (DOI: 10.1039/C6EM00687F) tackles the larger scale (i.e., less molecular and more cellular) challenge of predicting (eco-)toxicity endpoints and their use in the regulation of contaminants.

The two reviews in this issue turned out to be nicely complementary, but in a way that was not part of our original design. Tom Nolte and Ad Ragas (DOI: 10.1039/C7EM00034K) provide a review in classical format: starting with a very comprehensive summary of currently used QSAR models for sorption, abiotic and biotic transformation, and bioconcentration of ionizable compounds, and ending with an analysis of needs and opportunities for future developments. In contrast, Syed Hashsham et al. (DOI: 10.1039/C6EM00689B) present a review that is almost entirely forward-looking, by necessity, because their topic, virulence factor activity relationships (VFARs), is a bold extension of the QSAR concept that has not yet proven to be feasible. This is one of two novel applications of QSAR modeling for which we recruited contributions to this issue. The other was the application to properties of nanomaterials, which is also somewhat controversial because these models treat materials essentially like conventional chemicals. Sadly, in the end, we received no papers in that area.

The prediction of partitioning of compounds between water or air and various environmental and engineered phases is the focus of a number of papers, with methods ranging from polyparameter linear free energy relationships (pp-LFERs) to quantum mechanical-based approaches, mostly based on COSMO-RS theory. It was particularly interesting to see predictive models being developed for phases with new, challenging properties. For example, Satoshi Endo, Kai Goss and PhD student Lukas Linden show that 3D-QSARs are able to capture the steric effects influencing the sorption of ionic compounds to blood proteins (DOI: 10.1039/C6EM00555A). Mark Parnis and Don Mackay compare different oligomeric models and model formulations to identify those that best predict partitioning to polymers used as passive sampling materials (DOI: 10.1039/C6EM00355A). A pp-LFER study on the sorption of volatile organic compounds to carbon nanotubes (CNTs) from the laboratories of Dave Kuo and Yang-hsin Shih demonstrates that CNTs provide more and better sorption sites than clays or activated carbon (DOI: 10.1039/C6EM00567E). Frank Wania, PhD student Tife Awonaike and co-authors lay the basis for rational and efficient modeling of partitioning to mixed phases such as aerosols and rain droplets by demonstrating which subcompartments contribute most and which can be safely neglected (DOI: 10.1039/C6EM00636A).

Several studies also aim to extend the applicability domain of partitioning models to a wider and more diverse range of chemicals and to changing environmental conditions. Jingwen Chen and his team develop a pp-LFER with a temperature term included that is able to capture octanol–air partitioning of a diverse set of chemicals over a 60 °C temperature range (DOI: 10.1039/C6EM00626D). The study of Steven Droge et al. on the partitioning of amines to phospholipid bilayers demonstrates that quantum-mechanical software such as COSMOmic allows going beyond previously used, oversimplistic correction factors for cationic compounds, but they also conclude that predicting partitioning of partially charged compounds to heterogeneous phases remains challenging (DOI: 10.1039/C6EM00615A).

For me, it was a pleasure to see that submissions on the rather more mature field of predicting partitioning properties were matched by about an equal number of submissions that address the prediction of chemical reactions. Two separate contributions address the prediction of oxidation rate constants with different natural oxidants based on correlation analyses with one-electron oxidation potentials. Bill Arnold and Doug Latch teamed up to develop QSARs for reaction rate constants of phenols with singlet oxygen, carbonate radicals and triplet state organic matter based on one-electron oxidation potentials calculated with density functional theory (DFT) (DOI: 10.1039/C6EM00580B). Tratnyek and collaborators studied the same family of reactions (oxidation, including aromatic amines) but focused on reconciling the differences between descriptor data (oxidation potentials) obtained by experimental measurements vs. quantum-mechanical predictions (DOI: 10.1039/C6EM00694A). Both Arnold's and Tratnyek's studies include QSARs based on descriptor data that were obtained by a combination of experimental and theoretical methods (the former used to calibrate the latter), which results in property prediction models where statistical, theoretical, and experimental elements are deeply entangled. In a third paper on oxidation kinetics, Jingwen Chen and collaborators present a QSAR-type model with improved applicability domain for S-, N- and P-containing compounds for the prediction of aqueous reaction rate constants with hydroxyl radicals (DOI: 10.1039/C6EM00707D).

Another group of papers addresses the reaction pathways, rates, and products of contaminant transformations through detailed molecular modeling calculations, mainly at the level of density functional theory. Goran Kovacevic and Aleksandar Sabljic carried out a comparative study of the reaction pathways and products of halogenated benzenes with OH radicals, demonstrating how the understanding of the exact reaction trajectory enables the prediction of reliable rate constants at environmentally relevant temperatures (DOI: 10.1039/C6EM00577B). The combined groups of Yi Luo and Jingwen Chen contributed two papers where they used DFT calculations to elucidate the oxidation rate and pathways of the human antibiotic sulfamethoxazole with ferrate(VI) (DOI: 10.1039/C6EM00521G) and ozone (DOI: 10.1039/C6EM00698A). Jerzy Leszczynski and collaborators present a study addressing alkaline hydrolysis—the only reactivity paper in this issue not focused on oxidation—and use the results to demonstrate the critical importance of accounting for the specific hydration of hydroxide to create an accurate kinetic model of RDX hydrolysis (DOI: 10.1039/C6EM00565A). Only one paper makes extensive use of molecular modeling levels of higher accuracy than DFT (DOI: 10.1039/C7EM00009J). In this study, Bryan Wong's lab showcases the importance of corroborating the accuracy of computationally-efficient DFT methods with additional high-level wave function-based methods, particularly when it comes to assessing reactivity with inorganic oxidants such as sulfate radicals.

Only three research papers addressed various aspects of predicting ecotoxicological endpoints. This modest turnout relative to the amount of prior work in this area might reflect the relative maturity of the field of QSARs for acute toxicity endpoints. Yet significant challenges remain, particularly with respect to predicting chronic toxicity endpoints, as noted in the perspective by Mark Cronin. This said, the contribution of Beate Escher and collaborators convincingly closes a major gap that has existed for a long time in that they demonstrate that baseline toxicity prediction for nonpolar, polar and particularly ionizable chemicals all obey a common theoretical relationship if based on speciation-corrected lipid–water distribution coefficients (DOI: 10.1039/C6EM00692B). Monika Nendza and collaborators address the question of how QSARs for ecotoxicity endpoints can gain more regulatory acceptance. Based on an unprecedentedly large set of fish acute toxicity data they develop and test a baseline toxicity classification scheme that is sufficiently conservative to be accepted by regulatory authorities, yet could still reduce fish tests by 40% (DOI: 10.1039/C6EM00600K). Finally, Nikos Thomaidis, PhD student Reza Aalizadeh and Peter von der Ohe use the case of acute toxicity prediction for D. magna to demonstrate the importance of applicability domain considerations in the development of cheminformatics-type QSAR models (DOI: 10.1039/C6EM00679E).

The two papers that I co-authored reflect my personal conviction that we also need to invest in enabling the use and development of models. The paper led by my postdoctoral researcher Diogo Latino introduces a new database containing pesticide soil biodegradation data (Eawag-Soil) and describes how these data can be used to develop improved QSBR models (DOI: 10.1039/C6EM00697C). Urs von Gunten and graduate student Minju Lee introduce O3-PPS, a modeling platform for predicting rates and products of O3 reaction with diverse compounds, which is simple enough for practitioners to use, yet also allows scientists to carry out more in-depth explorations, e.g., of pH dependence (DOI: 10.1039/C6EM00584E).

Finally, there is one contribution that stands out, both because it uses high-level computational chemistry to predict an endpoint that is of high practical relevance for biogeochemistry research and because it gives us a glimpse of where the use of ab initio calculations in environmental chemistry might be heading as computing power increases. In their work, postdoctoral researcher Halua Pinto de Magalhães, Matthias Brennwald and Rolf Kipfer give a fascinating example of how ab initio molecular dynamics simulations can be used to study the different diffusion regimes of noble gases in water and thus to explain their observed size-dependent isotope fractionation behavior (DOI: 10.1039/C6EM00614K).

I hope that you enjoy reading these studies as much as I enjoyed helping to bring them together. What struck me most when reading through the final collection of research articles and critical reviews is the spectrum of methods applied and how they fall into place. While ab initio predictions are becoming more and more fit to be applied directly to environmental chemistry, they are currently used mostly to predict partitioning and reactivity in homogeneous phases. When more complex, heterogeneous systems are involved, the models shift to statistical correlation analysis between the endpoint of interest and ab initio calculated electronic properties of the chemicals of interest. Finally, as soon as biological systems come into play, current predictive models are still based mainly on heuristic rules and/or data-driven chemometric-type analysis. Unsurprisingly, the error of the predictions follows the same hierarchy. Yet, most relevant fate processes in the environment take place in complex matrices with multiple reaction and sorption partners. In that sense, we hope that the current collection of high-quality papers will also inspire some new ideas for how different levels of models can be integrated in the future to predict environmentally relevant fate and effect endpoints with increased accuracy.


This journal is © The Royal Society of Chemistry 2017