A publicly available crystallisation data set and its application in machine learning

Abstract

We present the crystallisation outcomes for 319 publicly available compounds in up to 18 different solvents, spread over 5710 individual single-solvent evaporation trials. The recorded data is part of a much larger in-house database and includes both positive and negative crystallisation outcomes. Such data can be used for statistical analyses of solvent performance, for machine learning approaches, or for investigating the crystallisation behaviour of structurally similar compound classes. The presented data suggests that crystallisation behaviour in different solvents is not correlated with chemical similarity among clusters of highly similar compounds. Further, our machine learning models can be used to guide the choice of solvent when crystallising a compound. In a retrospective evaluation, these models reduced the workload to a third of our initial protocol while still maintaining crystallisation success rates >92%.
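To illustrate the kind of statistical analysis of solvent performance that such positive and negative outcomes allow, the following is a minimal Python sketch. It is not the authors' machine learning method: it only tallies per-solvent success rates and then picks a small solvent subset greedily. The record layout, the compound and solvent names, the outcomes, and both function names are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy records, NOT taken from the published data set:
# (compound id, solvent, did the trial yield crystals?)
trials = [
    ("cmpd_A", "ethanol", True),
    ("cmpd_A", "acetone", False),
    ("cmpd_B", "ethanol", False),
    ("cmpd_B", "acetone", True),
    ("cmpd_C", "ethanol", True),
    ("cmpd_C", "toluene", False),
]


def solvent_success_rates(records):
    """Return the fraction of successful trials for each solvent."""
    counts = defaultdict(lambda: [0, 0])  # solvent -> [successes, trials]
    for _, solvent, ok in records:
        counts[solvent][0] += int(ok)
        counts[solvent][1] += 1
    return {s: succ / n for s, (succ, n) in counts.items()}


def greedy_solvent_subset(records, k):
    """Greedily pick up to k solvents, each adding the most newly
    crystallised compounds; stop early if no solvent adds any."""
    hits = defaultdict(set)  # solvent -> compounds that crystallised in it
    for compound, solvent, ok in records:
        if ok:
            hits[solvent].add(compound)
    chosen, covered = [], set()
    for _ in range(k):
        # Ties are broken by solvent name so the result is deterministic.
        best = max(hits, key=lambda s: (len(hits[s] - covered), s), default=None)
        if best is None or not (hits[best] - covered):
            break
        chosen.append(best)
        covered |= hits[best]
    return chosen, covered


rates = solvent_success_rates(trials)
chosen, covered = greedy_solvent_subset(trials, k=2)
print(rates)
print(chosen, sorted(covered))
```

A retrospective check of a reduced protocol could then compare the number of covered compounds against the number that crystallised in any solvent. A trained model would replace the greedy tally with per-compound predictions.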

Publication details

The article was received on 18 Apr 2017, accepted on 30 May 2017 and first published on 31 May 2017.


Article type: Paper
DOI: 10.1039/C7CE00738H
Citation: CrystEngComm, 2017, Advance Article

M. Pillong, C. Marx, P. Piechon, J. G. P. Wicker, R. I. Cooper and T. Wagner, CrystEngComm, 2017, Advance Article, DOI: 10.1039/C7CE00738H
