Data-driven discovery of novel chalcogenide semiconductors for solar absorption
Abstract
The remarkable tunability of chalcogenide semiconductors offers exciting opportunities for applications in solar cells, while also posing significant challenges in exploring their vast compositional space. Although many ternary and quaternary chalcogenides exhibit excellent optoelectronic properties, their performance is often limited by poor defect tolerance and unfavorable doping behavior. Composition engineering is a promising strategy to simultaneously optimize bulk stability, electronic structure, and defect physics in these materials. In this work, we applied a data-driven framework integrating high-throughput density functional theory (DFT) computations with descriptor-based and structure-based machine learning models to design novel multinary chalcogenide semiconductor alloys suited to solar absorption and other optoelectronic applications. Within a pre-defined chemical space of zincblende-derived A2BCX4 and ABX2 compounds, we performed hybrid HSE06 calculations with spin–orbit coupling to generate a large dataset of optimized lattice parameters, decomposition energies, band gaps, theoretical maximum PV efficiencies, and point defect formation energies for thousands of compounds. Random forest regression models utilizing composition-weighted elemental features were trained and deployed to predict properties for nearly half a million possible compositions, leading to the identification of ∼1200 stable compounds with desired optoelectronic properties. Crystal graph-based machine learning force field (MLFF) models were additionally trained on the DFT dataset to enable rapid energy prediction and geometry optimization of new bulk and defect-containing configurations. This workflow led to the identification of several promising compounds, with a notable example being Cu2Ca0.5Cd0.5SnS4, which satisfies conditions of thermodynamic stability, photovoltaic-suitable band gap, and intrinsic defect tolerance.
The entire computational workflow and dataset have been packaged and released as ChalcoDB, an online tool publicly available via the nanoHUB platform, giving the community ready access to simulations and predictions for chalcogenide compounds.
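
As a rough illustration of what "composition-weighted elemental features" can mean, the sketch below averages two elemental properties over the atomic fractions of a formula. It is a minimal sketch under stated assumptions, not the descriptor set used in this work: the choice of properties (atomic number and Pauling electronegativity), the simple formula parser, and the names `ELEMENT_DATA`, `parse_formula`, and `composition_weighted_features` are all hypothetical and introduced here only for illustration.

```python
import re

# Illustrative elemental data (assumed feature choice, not the paper's):
# atomic number Z and Pauling electronegativity chi.
ELEMENT_DATA = {
    "Cu": {"Z": 29, "chi": 1.90},
    "Ca": {"Z": 20, "chi": 1.00},
    "Cd": {"Z": 48, "chi": 1.69},
    "Sn": {"Z": 50, "chi": 1.96},
    "S": {"Z": 16, "chi": 2.58},
}


def parse_formula(formula):
    """Split a flat formula such as 'Cu2Ca0.5Cd0.5SnS4' into {element: amount}.

    Hypothetical helper: handles element symbols followed by an optional
    integer or decimal amount, with no parentheses or charges.
    """
    amounts = {}
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(count) if count else 1.0)
    return amounts


def composition_weighted_features(formula):
    """Return the atomic-fraction-weighted mean of each elemental property."""
    amounts = parse_formula(formula)
    total = sum(amounts.values())
    features = {}
    for prop in ("Z", "chi"):
        features["mean_" + prop] = sum(
            (n / total) * ELEMENT_DATA[el][prop] for el, n in amounts.items()
        )
    return features


features = composition_weighted_features("Cu2Ca0.5Cd0.5SnS4")
print(features)
```

For Cu2Ca0.5Cd0.5SnS4 the atomic fractions are 0.25 (Cu), 0.0625 (Ca), 0.0625 (Cd), 0.125 (Sn), and 0.5 (S), so the weighted mean atomic number is 25.75. A vector of such averages, one entry per elemental property, is the kind of fixed-length input a random forest regressor can be trained on.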
