Issue 28, 2025, Issue in Progress

Exploring the degree of long-range order/disorder in indaceno-based photovoltaic small molecules using data-driven machine learning analysis

Abstract

Long-range order and disorder in small molecules significantly affect their physical and chemical properties and, in turn, their performance in photovoltaic devices. In this study, a data-driven machine learning (ML) approach was applied to explore the relationship between molecular structure and crystallinity in 480 indaceno-based small molecules. Three ML models, including support vector machine and random forest models, were trained to predict crystal propensity. A heatmap analysis revealed that 72.71% of the small molecules exhibit crystalline behavior, while the remaining 27.29% are non-crystalline. The ML models achieved near-perfect performance (AUC: SVM-RBF = 0.999, RF = 0.998; MSE: RF = 0.00, SVM-RBF = 0.01). The predicted crystal propensity values showed high accuracy, with a mean squared error ranging from 0.0 to 0.64. Feature importance analysis using SHAP values identified Chi0v, kappa1, Chi1n, and NumRotatableBonds as the descriptors contributing most to crystal propensity. The synthetic accessibility score of the small molecules ranged from 0.02 to 0.12, providing insights for designing and optimizing indaceno-based small molecules with tailored crystallinity and photovoltaic properties. This study demonstrates the potential of ML approaches in guiding the development of high-performance small molecules for solar energy applications.
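The abstract reports model quality as AUC and mean squared error (MSE). As a minimal sketch of how these two metrics are commonly computed for a binary crystalline/non-crystalline label, the following uses only the Python standard library. The function names and the toy labels and scores are hypothetical illustrations; they are not taken from the article, and this is not presented as the authors' implementation.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference between true values and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one.
    Tied scores count as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = crystalline, 0 = non-crystalline.
# The scores stand in for a model's predicted crystal propensity.
labels = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.92, 0.85, 0.40, 0.30, 0.55, 0.77, 0.10, 0.66]

if __name__ == "__main__":
    print(f"AUC = {roc_auc(labels, scores):.3f}")
    print(f"MSE = {mean_squared_error(labels, scores):.3f}")
```

An AUC close to 1 means the model ranks nearly every crystalline molecule above every non-crystalline one, while MSE measures how far the predicted propensity values lie from the target values on average.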



Article information

Article type: Paper
Submitted: 19 Apr 2025
Accepted: 22 Jun 2025
First published: 01 Jul 2025
This article is Open Access
Creative Commons BY-NC license

RSC Adv., 2025, 15, 22449–22459

H. A. K. Kyhoiesh, K. H. Salem, A. S. Waheeb, R. A. Hasan, H. R. Salman, A. A. Al-Kubaisi, A. Y. Elnaggar, I. H. El Azab and M. H. H. Mahmoud, RSC Adv., 2025, 15, 22449 DOI: 10.1039/D5RA02748A

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.

