Issue 12, 2020

A novel unambiguous strategy of molecular feature extraction in machine learning assisted predictive models for environmental properties

Abstract

Environmental properties of compounds provide essential information for treating organic pollutants, driving chemical processes and environmental science toward eco-friendly technology. Traditional group contribution methods play an important role in property estimation, but they have several drawbacks in application, such as scattered predicted values for certain groups of compounds. To address these issues, this research proposes a strategy for extracting molecular features that is characterized by interpretability and by the power to discriminate between isomers. Using Henry's law constant data for organic compounds in water, we developed a hybrid predictive model that integrates the proposed strategy with a neural network framework. The structure of the predictive model is optimized using cross-validation and grid search to improve its robustness. The model is further improved by introducing the plane of best fit descriptor as an input and by adopting k-means clustering in sampling. Compared with models reported in the literature, the developed predictive model demonstrates improved generality and higher accuracy, and it uses fewer molecular features in its development.
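The abstract mentions adopting k-means clustering in sampling but does not spell out the procedure. One plausible reading is that compounds are clustered in descriptor space and the held-out set is drawn from every cluster, so that training and test data cover the same regions of chemical space. The following is a minimal sketch under that assumption, not the authors' implementation: the function names `kmeans` and `cluster_stratified_split`, the 2-D synthetic "descriptor" vectors, and the choices `k=3` and `test_fraction=0.2` are all hypothetical.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's k-means on a list of equal-length numeric tuples; returns cluster labels."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from k randomly chosen data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centers are stable
        labels = new_labels
        # Update step: move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_stratified_split(points, k=3, test_fraction=0.2, seed=0):
    """Hypothetical helper: hold out roughly `test_fraction` of every k-means cluster."""
    labels = kmeans(points, k, seed=seed)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic 2-D "descriptor" vectors in three loose groups (illustrative only,
# not data from the paper).
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 10), (20, 0)]
          for dx in range(5) for dy in range(2)]
train_idx, test_idx = cluster_stratified_split(points, k=3, test_fraction=0.2)
```

Sampling per cluster rather than uniformly at random is the design choice being illustrated: a purely random split can leave a sparsely populated structural class entirely in the training set or entirely in the test set, which distorts the measured accuracy.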

Article information

Article type: Paper
Submitted: 30 Mar 2020
Accepted: 21 May 2020
First published: 27 May 2020

Green Chem., 2020, 22, 3867-3876

Z. Wang, Y. Su, S. Jin, W. Shen, J. Ren, X. Zhang and J. H. Clark, Green Chem., 2020, 22, 3867 DOI: 10.1039/D0GC01122C
