Molecular partition coefficient from machine learning with polarization and entropy embedded atom-centered symmetry functions
Abstract
Efficient prediction of the partition coefficient (log P) between polar and non-polar phases could shorten the cycle of drug and materials design. In this work, a descriptor named 〈q − ACSFs〉conf is proposed that accounts for explicit polarization effects in the polar phase and for the energetically and entropically significant conformation ensemble in the non-polar phase. The polarization effects are incorporated by embedding partial charges, derived directly from force fields or quantum chemistry calculations, into the atom-centered symmetry functions (ACSFs). The entropy effects are incorporated by averaging according to the Boltzmann distribution of different conformations taken from the similarity matrix. The model was trained with high-dimensional neural networks (HDNNs) on the public dataset PhysProp (41 039 samples). Satisfactory log P prediction performance was achieved on three other datasets, namely Martel (707 molecules), Star & Non-Star (266) and Huuskonen (1870). The present 〈q − ACSFs〉conf model was also applicable to n-carboxylic acids with 2 to 14 carbon atoms and to 54 organic solvents. The method is easily applied to systems of arbitrary size and gives a transferable atom-based partition coefficient.
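The two ingredients of the descriptor, charge embedding and Boltzmann averaging over conformations, can be illustrated with a minimal sketch. The abstract does not give the functional form, so the code below assumes a standard radial symmetry function with a cosine cutoff in which each neighbour's contribution is multiplied by its partial charge. All function names and the parameters `eta`, `r_s` and `r_c` are hypothetical, energies are assumed to be in kcal/mol, and the selection of conformations from the similarity matrix is not shown.

```python
import math

# Boltzmann constant in kcal/(mol K); the energy unit is an assumption of this sketch.
KB_KCAL_PER_MOL_K = 0.0019872041


def cosine_cutoff(r, r_c):
    """Cosine cutoff that decays smoothly to zero at r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)


def charge_weighted_radial(coords, charges, eta=1.0, r_s=0.0, r_c=6.0):
    """One radial symmetry-function value per atom, with each neighbour
    term weighted by that neighbour's partial charge (assumed form)."""
    values = []
    for i, r_i in enumerate(coords):
        total = 0.0
        for j, r_j in enumerate(coords):
            if i == j:
                continue  # an atom is not its own neighbour
            r = math.dist(r_i, r_j)
            total += charges[j] * math.exp(-eta * (r - r_s) ** 2) * cosine_cutoff(r, r_c)
        values.append(total)
    return values


def boltzmann_weights(energies, temperature=298.15):
    """Normalised Boltzmann weights; energies are shifted by the minimum
    so that the exponentials stay numerically well behaved."""
    e_min = min(energies)
    beta = 1.0 / (KB_KCAL_PER_MOL_K * temperature)
    factors = [math.exp(-(e - e_min) * beta) for e in energies]
    z = sum(factors)
    return [f / z for f in factors]


def conformer_average(descriptors, energies, temperature=298.15):
    """Boltzmann-weighted average of per-conformer descriptor vectors."""
    weights = boltzmann_weights(energies, temperature)
    n = len(descriptors[0])
    return [sum(w * d[k] for w, d in zip(weights, descriptors)) for k in range(n)]


if __name__ == "__main__":
    # Two hypothetical conformers of a three-atom fragment with made-up charges.
    charges = [-0.4, 0.2, 0.2]
    conf_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    conf_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
    per_conf = [charge_weighted_radial(c, charges) for c in (conf_a, conf_b)]
    print(conformer_average(per_conf, energies=[0.0, 0.5]))
```

Averaging the descriptor rather than the predicted log P keeps a single input vector per atom, which is one plausible reading of the abstract; the published model may organise this step differently.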
- This article is part of the themed collection: 2022 PCCP HOT Articles