Smart design of phenanthrene-based organic photovoltaics using machine learning†
Abstract
Machine learning (ML) is used to analyze phenanthrene-based organic dyes with the aim of optimizing organic photovoltaic (OPV) performance. A set of 968 phenanthrene-based dyes was collected from the literature, and molecular descriptors were computed using RDKit. Several ML models were evaluated for predicting the PV-related parameters; gradient boosting regression achieved a coefficient of determination (R²) of 0.87 and a root mean square error (RMSE) of 0.002. SHapley Additive exPlanations (SHAP) values indicated that MinPartialCharge is the most influential descriptor for predicting the exciton binding energy (Eb). Based on this descriptor analysis, new organic dyes with minimal predicted Eb values are proposed. Structural similarity analysis using the synthetic accessibility likelihood index (SSA–SALI) gave scores of 0.92–0.98 and grouped the designs into distinct structure-based clusters. A convex hull diagram was constructed, from which the formation enthalpy (ΔHf) was predicted to be 5–35 eV per atom and the decomposition enthalpy (ΔHd) to be up to 3.0 eV per atom. The light harvesting efficiency reached up to 94%, and favorable open-circuit voltages were obtained. This study clarifies the relationships between molecular structure and OPV performance in phenanthrene-based organic dyes and should facilitate the rational design of high-efficiency organic dyes.
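For readers unfamiliar with the two regression metrics quoted above, the following minimal sketch shows how R² and RMSE are conventionally computed from measured and predicted values. It is illustrative only: the function names and the five Eb values are hypothetical and are not taken from the study's dataset or code.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    squared_residuals = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_residuals) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and predicted Eb values (eV); NOT data from the paper.
y_true = [0.30, 0.35, 0.40, 0.45, 0.50]
y_pred = [0.31, 0.34, 0.41, 0.44, 0.50]

print(f"R^2  = {r_squared(y_true, y_pred):.3f}")
print(f"RMSE = {rmse(y_true, y_pred):.4f}")
```

An R² close to 1 means the model explains most of the variance in the target, while RMSE is expressed in the same units as the target, so the two are usually read together.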