Leveraging domain knowledge for optimal initialization in Bayesian materials optimization
Abstract
Bayesian optimization (BO) has emerged as an effective strategy for accelerating the discovery of new materials by efficiently exploring complex, high-dimensional design spaces. However, the success of BO depends strongly on how the optimization campaign is initialized, that is, on the selection of the initial data points from which the optimization starts. In this study, we focus on improving these initial datasets by incorporating materials science expertise into the selection process. We identify common challenges and sources of uncertainty in choosing starting points and propose practical guidelines for using expert-defined criteria to create more informative initial datasets. Evaluating these methods on simulations and real-world alloy design problems, we demonstrate that domain-informed criteria yield initial datasets that are more diverse and representative. These improved starting points significantly increase the efficiency and effectiveness of the subsequent optimization. We also introduce clear metrics for assessing the quality and diversity of initial datasets, providing a straightforward way to compare initialization strategies. Our approach offers a robust and widely applicable framework for enhancing Bayesian optimization across materials discovery scenarios.
