Embedding human knowledge in material screening pipeline as filters to identify novel synthesizable inorganic materials
Abstract
How might one embed a chemist's knowledge into an automated materials-discovery pipeline? In generative design of inorganic crystalline materials, generating candidate compounds is no longer a bottleneck: synthetic datasets now contain millions of compounds. However, weeding out unsynthesizable or difficult-to-synthesize compounds remains an outstanding challenge. Post-generation “filters” have been proposed as a means of embedding human domain knowledge, in the form of either scientific laws or rules of thumb. Examples include charge neutrality, electronegativity balance, and energy above hull. Some filters are “hard” and some are “soft”: for example, it is difficult to envision creating a stable compound that violates charge neutrality, whereas several compounds break the Hume-Rothery rules. It is therefore natural to ask: can one compile a comprehensive list of filters that embed domain knowledge, adopt a principled approach to classifying them as either non-conditional or conditional, and envision a software environment that implements combinations of them systematically? In this commentary we explore these questions, focusing on filters for screening novel inorganic compounds for synthesizability.
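To make the distinction between hard and soft filters concrete, the following is a minimal Python sketch of how such filters might be combined. It is not the software environment discussed in the commentary. The names `Candidate`, `Filter`, and `screen`, the small oxidation-state table, the 0.1 eV/atom threshold, and the example energies are all hypothetical and chosen only for illustration. A hard filter rejects a candidate outright, while a soft filter only attaches a warning flag.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Tuple

Composition = Dict[str, int]  # element symbol -> atoms per formula unit

# Illustrative oxidation states only (assumption); a real pipeline
# would draw on a curated table.
OXIDATION_STATES: Dict[str, Tuple[int, ...]] = {
    "Na": (1,), "Cl": (-1,), "Fe": (2, 3), "O": (-2,),
}


@dataclass
class Candidate:
    formula: str
    composition: Composition
    e_above_hull: float  # eV/atom, assumed to be precomputed elsewhere


@dataclass
class Filter:
    name: str
    check: Callable[[Candidate], bool]  # True means the candidate passes
    hard: bool  # hard filters reject; soft filters only flag


def charge_neutral(c: Candidate) -> bool:
    """Pass if one oxidation state per element can sum to zero.

    Simplification: mixed-valence compounds (one element in two
    oxidation states) are not handled by this sketch.
    """
    elements = list(c.composition)
    choices = [OXIDATION_STATES.get(el, ()) for el in elements]
    return any(
        sum(state * c.composition[el] for state, el in zip(states, elements)) == 0
        for states in product(*choices)
    )


def near_hull(c: Candidate, threshold: float = 0.1) -> bool:
    """Pass if energy above hull is within a hypothetical threshold."""
    return c.e_above_hull <= threshold


def screen(
    candidates: List[Candidate], filters: List[Filter]
) -> List[Tuple[Candidate, List[str]]]:
    """Drop candidates failing any hard filter; flag soft-filter failures."""
    kept = []
    for c in candidates:
        if any(f.hard and not f.check(c) for f in filters):
            continue
        flags = [f.name for f in filters if not f.hard and not f.check(c)]
        kept.append((c, flags))
    return kept


FILTERS = [
    Filter("charge_neutrality", charge_neutral, hard=True),
    Filter("energy_above_hull", near_hull, hard=False),
]

# Energies below are made-up values for illustration, not computed data.
CANDIDATES = [
    Candidate("NaCl", {"Na": 1, "Cl": 1}, 0.0),
    Candidate("Fe2O3", {"Fe": 2, "O": 3}, 0.25),
    Candidate("NaCl2", {"Na": 1, "Cl": 2}, 0.0),
]

summary = [(c.formula, flags) for c, flags in screen(CANDIDATES, FILTERS)]
```

In this sketch, a candidate that cannot be charge balanced is removed, whereas one that exceeds the energy threshold is kept but flagged, so a chemist can still inspect it rather than lose it silently.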
- This article is part of the themed collection: Data-driven discovery in the chemical sciences