Predicting Bandgap of ABX2 Materials: Supervised Machine Learning on Small Datasets

Abstract

Predicting the electronic bandgap of ABX2-type materials (where A = {Al, Ga, Zn, Cd}, B = {Ga, In, Ge, Sn, Si}, and X = {N, P, As, Sb}) is essential for photovoltaic applications, but the task becomes significantly more challenging when only a limited dataset is available. Using a machine learning (ML) approach, we classify and predict the electronic bandgaps in a dataset of 99 ABX2-type materials with good accuracy. Logistic regression achieves 97% classification accuracy. For regression, the Sure Independence Screening and Sparsifying Operator (SISSO) method is used for feature selection, followed by various ML models, among which the Least Absolute Shrinkage and Selection Operator (LASSO) yields the best test R² (0.92) and RMSE (0.30 eV). We also develop an ML model to predict volume, a structural feature used for bandgap prediction, with R² = 0.99 and RMSE = 8.65 Å³. K-means clustering, used to find underlying patterns in the data, reveals two distinct material families. Notably, when crystallographic phase information is included, the prediction accuracy improves further, to R² = 0.95 and RMSE = 0.28 eV. Our results therefore show that ML can efficiently replace computationally expensive DFT methods and can perform well on small, costly datasets, making it suitable for broader and more specialized problem spaces than previously assumed.
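As an illustration of the regression step summarized above, the following is a minimal sketch of LASSO fitted by coordinate descent and scored with R² and RMSE on a held-out split. It is not the authors' pipeline: the 99 samples are synthetic, the six features, the coefficients, the 79/20 split, and the penalty `alpha = 0.05` are all assumptions chosen for illustration, and the SISSO feature-selection stage is omitted. The helper names (`fit_lasso`, `soft_threshold`, `r2_score`, `rmse`) are hypothetical.

```python
import math
import random


def soft_threshold(rho, alpha):
    """Soft-thresholding operator; this is what lets LASSO set weights to zero."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def fit_lasso(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xw - b||^2 + alpha*||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_wo_j = b + sum(w[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_wo_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
        # The intercept is not penalized, so it is the mean residual.
        b = sum(y[i] - sum(w[k] * X[i][k] for k in range(p)) for i in range(n)) / n
    return w, b


def predict(X, w, b):
    return [b + sum(wk * xk for wk, xk in zip(w, row)) for row in X]


def r2_score(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Synthetic stand-in data (assumption): 99 samples, 6 features,
# of which only features 0 and 2 actually influence the target.
rng = random.Random(0)
n_samples, n_features = 99, 6
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
y = [1.5 * row[0] - 0.8 * row[2] + rng.gauss(0.0, 0.2) for row in X]

# Simple hold-out split (assumption): 79 training and 20 test samples.
X_train, y_train = X[:79], y[:79]
X_test, y_test = X[79:], y[79:]

w, b = fit_lasso(X_train, y_train, alpha=0.05)
y_pred = predict(X_test, w, b)
test_r2 = r2_score(y_test, y_pred)
test_rmse = rmse(y_test, y_pred)

print("weights:", [round(v, 3) for v in w])
print("test R2:", round(test_r2, 3), "test RMSE:", round(test_rmse, 3))
```

On a dataset this small, the test scores depend noticeably on which samples land in the hold-out set, so in practice repeated splits or cross-validation would give a more stable estimate than the single split used here.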

Article information

Article type: Paper
Submitted: 21 Dec 2025
Accepted: 27 Apr 2026
First published: 28 Apr 2026

J. Mater. Chem. C, 2026, Accepted Manuscript

U. Kumar, M. W. Ullah, V. Kumar, J. M. Muir and F. Zhang, J. Mater. Chem. C, 2026, Accepted Manuscript, DOI: 10.1039/D5TC04449A
