Rationalising crystal nucleation of organic molecules in solution using artificial neural networks
In this study, artificial neural networks (ANNs) are applied to analyse the effect of various solute, solvent, and solution properties on the difficulty of primary nucleation, without bias towards any particular nucleation theory. Sets of ANN models are developed and fitted to data for 36 binary systems of 9 organic solutes in 11 solvents, using Bayesian regularisation without early stopping and 6-fold cross-validation. An initial model set with 21 input parameters is developed and analysed. A refined model set with 10 input parameters is then evaluated, which improves overall accuracy. The results indicate partial qualitative consistency between the ANN models and classical nucleation theory (CNT): the nucleation difficulty increases with increasing mass transport resistance and with decreasing solubility. Notably, some parameters not included in CNT, namely the bond rotational flexibility of the solute molecule, the entropy of melting of the solute, and intermolecular interactions, also exhibit explanatory importance and significant qualitative relationships with the nucleation difficulty. A high entropy of melting and a high solute bond rotational flexibility both increase the nucleation difficulty. Stronger solute–solute or solvent–solvent interactions are correlated with easier nucleation, which is reasonable in the context of desolvation. A dissimilarity between solute and solvent hydrophobicities is associated with easier nucleation.
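As an illustration of the 6-fold cross-validation mentioned above, the following minimal Python sketch partitions 36 items into six folds of six, so that each fold serves once as the validation set while the remaining 30 items are used for fitting. It is not the authors' implementation: the function name `k_fold_splits`, the random seed, and the choice to split at the level of the 36 solute–solvent systems (rather than individual data points) are assumptions made here for illustration only, and no model fitting is shown.

```python
import random


def k_fold_splits(n_samples, k=6, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation.

    Hypothetical helper for illustration; the seed and shuffling scheme
    are assumptions, not taken from the study.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (near-)equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = sorted(folds[i])
        train_idx = sorted(
            j for fold_no, fold in enumerate(folds) if fold_no != i for j in fold
        )
        yield train_idx, val_idx


# Assumption: one index per binary solute-solvent system (36 in total).
splits = list(k_fold_splits(36, k=6))
for fold_no, (train_idx, val_idx) in enumerate(splits, start=1):
    print(f"fold {fold_no}: {len(train_idx)} train, {len(val_idx)} validation")
```

In such a scheme, a model would be fitted once per fold on the training indices and scored on the held-out indices, so that every system contributes to validation exactly once.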