Issue 106, 2014

Classification study of solvation free energies of organic molecules using machine learning techniques

Abstract

In this work, we developed a set of classification models that categorise organic molecules by their solvation free energies, using three machine learning approaches: decision tree, random forest and support vector machine. The solvation free energies of the molecules (experimental values obtained from the literature) were split into highly favourable (<−3 kcal mol⁻¹) and less favourable (>−3 kcal mol⁻¹) classes, with −3 kcal mol⁻¹ set as the threshold value for classification model development. The MACCS fingerprint, along with a set of physicochemical descriptors such as atom count, topology, vdW surface area (volsurf) and subdivided surface area, contributed to the classification models. Validation using a test set and 10-fold cross-validation gave accuracy, sensitivity and specificity values above 90%. The sum of ranking differences (SRD) analysis reveals that the support vector machine models are comparatively significant, while the models containing MACCS fingerprints are ranked as good models in all approaches. The MACCS fingerprints indicate that the presence of halogen atoms leads to less favourable solvation free energies, whereas the presence of polar atoms/groups and some functional groups (heteroatoms, double-bonded branched aliphatic chains, C=N, N–C–C–O, NCO, >1 heterocyclic atoms, OCO, etc.) leads to highly favourable solvation free energies. These results can be used along with quantitative models to predict the solvation free energies of organic molecules and to design novel molecules with acceptable solvation free energies.
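The two-class split and the reported validation statistics can be illustrated with a minimal Python sketch. It is not the authors' workflow: the function names, the sample values and the predicted labels are hypothetical, the "highly favourable" class is assumed to be the positive class, and a value exactly equal to −3 kcal mol⁻¹ (which the abstract does not assign to either class) is assumed here to count as less favourable.

```python
THRESHOLD = -3.0  # kcal/mol, the cut-off stated in the abstract


def label(dg_solv):
    """Return 1 (highly favourable) if dg_solv < -3 kcal/mol, else 0.

    Assumption: a value exactly at the threshold is treated as less favourable.
    """
    return 1 if dg_solv < THRESHOLD else 0


def classification_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity and specificity from binary labels.

    Assumption: class 1 (highly favourable) is the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Guard against an empty class; NaN signals "undefined".
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


# Hypothetical solvation free energies (kcal/mol), not data from the paper.
experimental = [-5.2, -0.8, -3.6, 1.1, -4.4, -2.9]
# Hypothetical output of some classifier for the same six molecules.
predicted_labels = [1, 0, 1, 0, 0, 0]

true_labels = [label(v) for v in experimental]
metrics = classification_metrics(true_labels, predicted_labels)
print(true_labels)
print(metrics)
```

In a 10-fold cross-validation setting, the same three statistics would be computed on each held-out fold and then averaged; the sketch above only shows the per-set calculation.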

Article information

Article type: Paper
Submitted: 01 Aug 2014
Accepted: 03 Nov 2014
First published: 03 Nov 2014

RSC Adv., 2014, 4, 61624-61630

N. S. H. Narayana Moorthy, S. A. Martins, S. F. Sousa, M. J. Ramos and P. A. Fernandes, RSC Adv., 2014, 4, 61624 DOI: 10.1039/C4RA07961B
