The generation of a database of promising short-circuit current organic solar cell hole transport layers with machine learning and density functional theory

Abstract

In contrast to traditional quantitative structure–property relationship (QSPR) studies that primarily screen existing libraries, this work integrates machine learning (ML) with generative design to discover novel hole transport layers (HTLs) for organic solar cells (OSCs). We developed an ML model to predict the short-circuit current (JSC) of benzodithiophene-based molecules; the best-performing model, an Extra Trees regressor, achieves an R² of 0.803. Beyond prediction, we used the breaking of retrosynthetically interesting chemical substructures (BRICS) algorithm to generate a de novo library of 9278 molecules, significantly expanding the chemical space beyond the initial training set of 515 experimentally validated compounds. To provide deeper chemical insight than standard automated pipelines, we calculated the structure–activity landscape index (SALI), which revealed significant “activity cliffs” and correlations between molecular diversity and JSC values, which range from 0.005 to 25.93 mA cm⁻². Clustering with t-distributed stochastic neighbor embedding (t-SNE) and k-means identified high-performing candidates with low SALI scores and distinct structural features. The top candidates were validated by density functional theory (DFT), which confirmed their suitable electronic properties and revealed that extended π-conjugation and favorable charge distribution underlie their high predicted performance. This study thus demonstrates the synergistic integration of ML, DFT, and generative algorithms: ML guides the exploration, while DFT provides validation and mechanistic insight, accelerating the targeted discovery of efficient HTLs.
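To make the SALI step concrete, the sketch below computes the index in its commonly used form, SALI(i, j) = |A_i − A_j| / (1 − sim(i, j)), for every pair in a small set of molecules. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which fingerprint or similarity metric was used, so Tanimoto similarity over sets of "on" fingerprint bits is assumed here, and the molecule names, bit sets, and JSC values are hypothetical toy data.

```python
from itertools import combinations


def tanimoto(bits_a: frozenset, bits_b: frozenset) -> float:
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 1.0


def sali(act_i: float, act_j: float, sim: float, eps: float = 1e-6) -> float:
    """SALI = |A_i - A_j| / (1 - sim); eps avoids division by zero when sim == 1."""
    return abs(act_i - act_j) / max(1.0 - sim, eps)


def sali_pairs(mols: dict) -> dict:
    """Return {(name_i, name_j): SALI} for every unordered pair of molecules."""
    scores = {}
    for (name_i, (fp_i, act_i)), (name_j, (fp_j, act_j)) in combinations(mols.items(), 2):
        scores[(name_i, name_j)] = sali(act_i, act_j, tanimoto(fp_i, fp_j))
    return scores


# Hypothetical toy data (NOT from the paper): fingerprint bits and JSC in mA cm^-2.
molecules = {
    "mol_a": (frozenset({1, 2, 3, 4}), 20.0),
    "mol_b": (frozenset({1, 2, 3, 5}), 5.0),
    "mol_c": (frozenset({7, 8, 9}), 19.0),
}

pairs = sali_pairs(molecules)
cliff = max(pairs, key=pairs.get)  # pair with the steepest activity cliff
print(cliff, round(pairs[cliff], 2))
```

In this toy set, mol_a and mol_b share most of their bits but differ strongly in JSC, so they form the highest-SALI pair, which is the "activity cliff" pattern the abstract refers to. Structurally dissimilar pairs score low even when their JSC values differ.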

Article information

Article type: Paper
Submitted: 24 Oct 2025
Accepted: 11 Mar 2026
First published: 27 Apr 2026

New J. Chem., 2026, Advance Article

The generation of a database of promising short-circuit current organic solar cell hole transport layers with machine learning and density functional theory

H. A. K. Kyhoiesh, S. H. Jawad, S. H. Alwan, I. H. El Azab and H. E. Abd Elsalam, New J. Chem., 2026, Advance Article, DOI: 10.1039/D5NJ04184H
