Data-driven discovery of novel chalcogenide semiconductors for solar absorption

Abstract

The remarkable tunability of chalcogenide semiconductors offers exciting opportunities for solar cell applications, but their vast compositional space is challenging to explore. Although many ternary and quaternary chalcogenides exhibit excellent optoelectronic properties, their performance is often limited by poor defect tolerance and unfavorable doping behavior. Composition engineering is a promising strategy to simultaneously optimize bulk stability, electronic structure, and defect physics in these materials. In this work, we applied a data-driven framework that integrates high-throughput density functional theory (DFT) computations with descriptor-based and structure-based machine learning models to design novel multinary chalcogenide semiconductor alloys suited to solar absorption and other optoelectronic applications. Within a pre-defined chemical space of zincblende-derived A2BCX4 and ABX2 compounds, we performed hybrid HSE06 calculations with spin–orbit coupling to generate a large dataset of optimized lattice parameters, decomposition energies, band gaps, theoretical maximum PV efficiencies, and point defect formation energies for thousands of compounds. Random forest regression models using composition-weighted elemental features were trained and deployed to predict properties for nearly half a million possible compositions, leading to the identification of ∼1200 stable compounds with desired optoelectronic properties. Crystal graph-based machine learning force field (MLFF) models were additionally trained on the DFT dataset to enable rapid energy prediction and geometry optimization of new bulk and defect-containing configurations. This workflow identified several promising compounds, a notable example being Cu2Ca0.5Cd0.5SnS4, which satisfies the conditions of thermodynamic stability, a photovoltaic-suitable band gap, and intrinsic defect tolerance. The entire computational workflow and dataset have been packaged and released as ChalcoDB, an online tool publicly available via the nanoHUB platform, giving the community direct access to simulations and predictions for chalcogenide compounds.
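To make the idea of composition-weighted elemental features concrete, the following is a minimal sketch rather than the authors' implementation. It assumes two illustrative elemental properties (atomic number and Pauling electronegativity) and a hypothetical helper named `composition_weighted_features`; the feature set actually used in the paper is not given in the abstract. Each property is averaged over the elements of a compound, weighted by atomic fraction, and a weighted standard deviation is added as a simple measure of spread.

```python
# Illustrative elemental properties (assumed feature set, not the paper's):
# atomic number Z and Pauling electronegativity chi.
ELEMENT_DATA = {
    "Cu": {"Z": 29, "chi": 1.90},
    "Ca": {"Z": 20, "chi": 1.00},
    "Cd": {"Z": 48, "chi": 1.69},
    "Sn": {"Z": 50, "chi": 1.96},
    "S": {"Z": 16, "chi": 2.58},
}


def composition_weighted_features(composition):
    """Return fraction-weighted mean and standard deviation of each property.

    `composition` maps element symbols to stoichiometric amounts,
    e.g. {"Cu": 2, "Ca": 0.5, "Cd": 0.5, "Sn": 1, "S": 4}.
    """
    total = sum(composition.values())
    fractions = {el: amount / total for el, amount in composition.items()}

    features = {}
    for prop in ("Z", "chi"):
        values = {el: ELEMENT_DATA[el][prop] for el in composition}
        mean = sum(fractions[el] * values[el] for el in composition)
        variance = sum(
            fractions[el] * (values[el] - mean) ** 2 for el in composition
        )
        features[f"mean_{prop}"] = mean
        features[f"std_{prop}"] = variance ** 0.5
    return features


# Example: the compound highlighted in the abstract, Cu2Ca0.5Cd0.5SnS4.
features = composition_weighted_features(
    {"Cu": 2, "Ca": 0.5, "Cd": 0.5, "Sn": 1, "S": 4}
)
print(features)
```

A feature vector of this kind could then be passed to a regressor such as scikit-learn's `RandomForestRegressor`; whether the authors used that particular library is an assumption here.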

Article information

Article type: Paper
Submitted: 11 Feb 2026
Accepted: 22 Apr 2026
First published: 19 May 2026
This article is Open Access (Creative Commons BY-NC license)

EES Sol., 2026, Advance Article

M. H. Rahman and A. Mannodi-Kanakkithodi, EES Sol., 2026, Advance Article, DOI: 10.1039/D6EL00026F

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.
