Highly transferable atomistic machine-learning potentials from curated and compact datasets across the periodic table†
Abstract
Machine-learning atomistic potentials trained on density functional theory (DFT) datasets allow complex material properties to be modeled with near-DFT accuracy at a fraction of the computational cost. However, these DFT datasets can be extensive in size and time-consuming to curate, and the resulting potentials time-consuming to train and refine. In this study, we address these barriers by developing minimalistic and flexible datasets for many elements in the periodic table, regardless of their mass, electronic configuration, and ground-state lattice. These DFT datasets contain, on average, ∼4000 different structures with 27 atoms per structure, which we found sufficient to maintain the predictive accuracy of DFT properties and, notably, to provide high transferability. We envision these highly curated training sets as starting points for the community to expand, modify, or use with other machine-learning atomistic potential models, as suits individual needs, further accelerating the use of machine learning as a tool for material design and discovery.