Issue 43, 2020

Data mining the Cambridge Structural Database for hydrate–anhydrate pairs with SMILES strings

Abstract

Many organic molecules can crystallize in either hydrated or anhydrous forms. Predicting whether a hydrate will form, and how stable it is relative to water-free alternative phases, remains a significant challenge. Here we use the Cambridge Structural Database (CSD) and data informatics to identify and analyze hydrate–anhydrate structure pairs. A search method based on matching Simplified Molecular-Input Line-Entry System (SMILES) strings was developed and implemented through the CSD Python Application Programming Interface. Of the >23 000 molecular hydrates containing no metal ions, ∼1400 were found to have at least one corresponding anhydrous form, yielding just over 2000 unique pairs in the CSD. Hydrates with and without a reported anhydrate showed a similar distribution in their water stoichiometries. Lattice symmetry and packing fraction comparisons are reported for the paired hydrates and anhydrates. Structure pairs with one organic component and those with multiple organic components showed some subtle differences. The details and limitations of the method are outlined to encourage and guide other types of CSD searches using SMILES.
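The idea of pairing hydrates and anhydrates by SMILES matching can be illustrated with a minimal sketch. This is not the authors' implementation and it does not call the CSD Python API: the refcodes and SMILES strings below are invented toy data, the function names are hypothetical, and the sketch assumes that every entry's SMILES is already canonical, with components separated by "." and water written as "O". Under those assumptions, a hydrate and an anhydrate form a pair when they share the same set of non-water components.

```python
from collections import defaultdict

WATER = "O"  # SMILES for water (implicit hydrogens); an assumption of this sketch


def split_components(smiles):
    """Split a multi-component SMILES string on '.' into its components."""
    return [part for part in smiles.split(".") if part]


def organic_key(smiles):
    """Return the sorted tuple of non-water components, used as a match key."""
    return tuple(sorted(c for c in split_components(smiles) if c != WATER))


def is_hydrate(smiles):
    """True if the entry contains water plus at least one other component."""
    parts = split_components(smiles)
    return WATER in parts and any(c != WATER for c in parts)


def find_pairs(entries):
    """Pair each hydrate with every water-free entry sharing its organic key.

    `entries` maps a (hypothetical) refcode to a canonical SMILES string.
    """
    anhydrates = defaultdict(list)
    for refcode, smiles in entries.items():
        if WATER not in split_components(smiles):
            anhydrates[organic_key(smiles)].append(refcode)

    pairs = []
    for refcode, smiles in sorted(entries.items()):
        if is_hydrate(smiles):
            for partner in sorted(anhydrates.get(organic_key(smiles), [])):
                pairs.append((refcode, partner))
    return pairs


# Invented toy data; these are not real CSD refcodes.
toy_entries = {
    "HYD001": "CC(=O)O.O",                  # one organic component + 1 water
    "ANH001": "CC(=O)O",                    # matching water-free entry
    "HYD002": "c1ccncc1.OC(=O)C(=O)O.O.O",  # two organic components + 2 waters
    "ANH002": "OC(=O)C(=O)O.c1ccncc1",      # same components, different order
    "HYD003": "NC(N)=O.O",                  # hydrate with no reported anhydrate
}

pairs = find_pairs(toy_entries)
```

Because the key is built by plain string comparison, two entries match only if their SMILES strings are written identically, and entries that differ in organic stoichiometry (for example a 2:1 versus a 1:1 multi-component crystal) are treated as different. A real search would need to handle both points explicitly.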



Article information

Article type: Paper
Submitted: 24 Feb 2020
Accepted: 24 Mar 2020
First published: 24 Mar 2020

CrystEngComm, 2020, 22, 7290-7297



J. E. Werner and J. A. Swift, CrystEngComm, 2020, 22, 7290, DOI: 10.1039/D0CE00273A
