Comparing molecules and solids across structural and alchemical space†
Evaluating the (dis)similarity of crystalline, disordered and molecular compounds is a critical step in the development of algorithms to navigate automatically the configuration space of complex materials. For instance, a structural similarity metric is crucial for classifying structures, searching chemical space for better compounds and materials, and driving the next generation of machine-learning techniques for predicting the stability and properties of molecules and materials. In the last few years several strategies have been designed to compare atomic coordination environments. In particular, the smooth overlap of atomic positions (SOAPs) has emerged as an elegant framework to obtain translation, rotation and permutation-invariant descriptors of groups of atoms, underlying the development of various classes of machine-learned inter-atomic potentials. Here we discuss how one can combine such local descriptors using a regularized entropy match (REMatch) approach to describe the similarity of both whole molecular and bulk periodic structures, introducing powerful metrics that enable the navigation of alchemical and structural complexities within a unified framework. Furthermore, using this kernel and a ridge regression method we can predict atomization energies for a database of small organic molecules with a mean absolute error below 1 kcal mol−1, reaching an important milestone in the application of machine-learning techniques for the evaluation of molecular properties.