Issue 16, 2020

Machine learning methods to predict the crystallization propensity of small organic molecules

Abstract

Machine learning (ML) algorithms were explored for predicting crystallization propensity from molecular descriptors and fingerprints generated from 2D chemical structures, together with 3D molecular descriptors computed from 3D chemical structures optimized with empirical methods. In total, 57 815 molecules were retrieved from the Reaxys® database, of which 53 998 are recorded as crystalline (class A), 3097 as polymorphic (class B), and 720 as amorphous (class C). A training set of 40 462 organic molecules was used to build the models, which were validated with an external test set of 17 353 organic molecules. Several ML algorithms were screened, including random forest (RF), support vector machines (SVM), and deep learning multilayer perceptron networks (MLP). The best performance was achieved with a consensus classification model combining the RF, SVM, and MLP models, which predicted the external test set with an overall predictive accuracy (Q) of up to 80%.
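The abstract does not state how the three models are combined into a consensus, so the following is only a minimal sketch that assumes a simple majority vote over the class labels A, B, and C, with ties resolved in favour of the first model listed. The function names `consensus_vote` and `overall_accuracy`, the tie-breaking rule, and the toy labels are all illustrative assumptions and are not taken from the paper; Q is computed here as the plain fraction of correctly classified molecules.

```python
from collections import Counter


def consensus_vote(predictions_per_model):
    """Combine per-model class labels by majority vote, one molecule at a time.

    Assumption: a tie is resolved by taking the first model's vote.
    """
    consensus = []
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top_label, top_count = counts.most_common(1)[0]
        # More than one label sharing the top count means a tie.
        if list(counts.values()).count(top_count) > 1:
            top_label = votes[0]
        consensus.append(top_label)
    return consensus


def overall_accuracy(y_true, y_pred):
    """Fraction of molecules whose predicted class matches the recorded class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels for five hypothetical molecules (not data from the paper).
rf = ["A", "A", "B", "C", "A"]
svm = ["A", "B", "B", "A", "A"]
mlp = ["A", "A", "C", "B", "C"]
truth = ["A", "A", "B", "A", "A"]

consensus = consensus_vote([rf, svm, mlp])
q = overall_accuracy(truth, consensus)
print(consensus, q)
```

In this toy case the fourth molecule receives three different votes, so the tie-breaking assumption decides its label; a different rule (for example, weighting by each model's validation performance) could change the outcome.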

Article information

Article type: Paper
Submitted: 17 Jan 2020
Accepted: 26 Mar 2020
First published: 26 Mar 2020

CrystEngComm, 2020, 22, 2817-2826

F. Pereira, CrystEngComm, 2020, 22, 2817 DOI: 10.1039/D0CE00070A
