Efficient first principles based modeling via machine learning: from simple representations to high entropy materials

Kangming Li; Kamal Choudhary; Brian DeCost; Michael Greenwood; Jason Hattrick-Simpers

doi:10.1039/D4TA00982G

Kangming Li,*^a Kamal Choudhary,^b Brian DeCost,^b Michael Greenwood^c and Jason Hattrick-Simpers*^adef

Author affiliations

* Corresponding authors

^a Department of Materials Science and Engineering, University of Toronto, 27 King's College Cir, Toronto, ON, Canada
E-mail: kangming.li@utoronto.ca, jason.hattrick.simpers@utoronto.ca

^b Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, Gaithersburg, MD, USA

^c CanmetMATERIALS, Natural Resources Canada, 183 Longwood Road South, Hamilton, ON, Canada

^d Acceleration Consortium, University of Toronto, 80 St George St, Toronto, ON M5S 3H6, Canada

^e Vector Institute for Artificial Intelligence, 661 University Ave, Toronto, ON, Canada

^f Schwartz Reisman Institute for Technology and Society, 101 College St, Toronto, ON, Canada

Abstract

High-entropy materials (HEMs) have recently emerged as a significant category of materials, offering highly tunable properties. However, the scarcity of HEM data in existing density functional theory (DFT) databases, primarily due to computational expense, hinders the development of effective modeling strategies for computational materials discovery. In this study, we introduce an open DFT dataset of alloys and employ machine learning (ML) methods to investigate the material representations needed for HEM modeling. Utilizing high-throughput DFT calculations, we generate a comprehensive dataset of 84k structures, encompassing both ordered and disordered alloys across a spectrum of up to seven components and the entire concentration range. We apply descriptor-based models and graph neural networks to assess how material information is captured across diverse chemical-structural representations. We first evaluate the in-distribution performance of ML models to confirm their predictive accuracy. Subsequently, we demonstrate the capability of ML models to generalize between ordered and disordered structures, between low-order and high-order alloy systems, and between equimolar and non-equimolar compositions. Our findings suggest that ML models can generalize from cost-effective calculations of simpler systems to more complex scenarios. Additionally, we discuss the influence of dataset size and reveal that the information loss associated with the use of unrelaxed structures could significantly degrade the generalization performance. Overall, this research sheds light on several critical aspects of HEM modeling and offers insights for data-driven atomistic modeling of HEMs.

This article is part of the themed collections: Journal of Materials Chemistry A HOT Papers and Advancing energy-materials through high-throughput experiments and computation

Journal of Materials Chemistry A
