Structure-Guided Machine Learning for Efficiency Prediction of Organic Photovoltaics Using Experimentally Informed Molecular Descriptors
Abstract
We estimated the efficiency of organic photovoltaics using a machine learning (ML) approach. The data came from the organic photovoltaics database built in-house by the Korea Research Institute of Chemical Technology. We used a representative set of 1,010 donor-acceptor combinations with reliable experimental data obtained through repeated measurements. The data covered 67 donors and 24 non-fullerene acceptors and included device structures (normal, inverted, bulk heterojunction, and bilayer), donor/acceptor structures, donor-to-acceptor ratios, active-layer thicknesses, experimental conditions, and local symmetry. The donors and acceptors were fragmented using a method developed in-house. Descriptors generated from the resulting fragments formed the dataset used to train several ML algorithms: random forest, XGBoost, LightGBM, support vector regression, and multilayer perceptron. Model performance was evaluated using the coefficient of determination (R²), and XGBoost achieved the highest value, 0.833. The contributions of key features were interpreted using SHAP analysis. This paper presents an ML framework that combines molecular fragmentation with data-driven modeling.
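For readers less familiar with the evaluation metric, the following is a minimal sketch of how the coefficient of determination can be computed from measured and predicted efficiencies, using the standard definition R² = 1 - SS_res / SS_tot. The function name `r_squared` and the sample values are hypothetical and chosen only for illustration; they are not taken from the database or from the models described above.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measurements around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted efficiencies (%), for illustration only.
measured = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
score = r_squared(measured, predicted)
print(f"R^2 = {score:.3f}")
```

An R² of 1 corresponds to perfect prediction, while a model that always predicts the mean of the measured values scores 0, so a value of 0.833 indicates that most of the variance in the measured efficiencies is captured by the model.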