The CSD and knowledge databases: from answers to questions†
We discuss a general scheme for extracting new knowledge on crystal structures from crystallographic databases. We illustrate the scheme by creating a knowledge database of structural descriptors that reflect the geometrical and topological properties of coordination compounds. The initial crystallographic information on 7690 crystal structures was retrieved mainly from the Cambridge Structural Database and processed with the ToposPro program package. We applied a number of machine learning methods to develop a predictive scheme and found that the Random Forest method provides the best prediction of the overall topological properties (dimensionality and underlying topology) of coordination networks. We show how the knowledge database and predictive scheme can serve as a prototype of an artificial intelligence system that answers typical questions arising in the design of coordination compounds.