Issue 15, 2023

Deep transfer learning for predicting frontier orbital energies of organic materials using small data and its application to porphyrin photocatalysts

Abstract

Machine learning (ML) models have received increasing attention as a new approach for the virtual screening of organic materials. Although some ML models trained on large databases have achieved high prediction accuracy, the application of ML to certain types of organic materials is limited by the small amount of available data. Meanwhile, metalloporphyrins and porphyrins (MpPs) have attracted growing interest as potential photocatalysts, and recent studies have found that both the HOMO/LUMO energy levels and the energy gaps are important factors controlling the performance of MpP photocatalysts. Because the training data for MpPs are insufficient and limited to porphyrin-based dyes, in this study we proposed a deep transfer learning approach to rapidly predict the HOMO/LUMO energy levels and energy gaps of MpPs. To complement the open-source Porphyrin-based Dyes Database (PBDD), we curated a new database, the Metalloporphyrins and Porphyrins Database (MpPD), in which MpPs were specifically designed as potential photocatalysts and the HOMO/LUMO energies were calculated with advanced DFT functionals. We proposed PorphyBERT, a BERT-based regression model that was pre-trained on PBDD and fine-tuned on MpPD. The model performed satisfactorily in predicting the HOMO energy, LUMO energy, and energy gap, with RMSEs of 0.0955, 0.0988, and 0.0787 eV and MAEs of 0.0774, 0.0824, and 0.0549 eV, respectively. Furthermore, owing to its unsupervised pre-training phase, the model is not affected by the difference in computational functionals between the pre-training and fine-tuning databases. Finally, we recommended 12 MpPs as potential photocatalysts for CO2 reduction, whose out-of-sample predicted energy gaps were close to the values calculated by DFT.
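The error metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the energy gap is taken as the LUMO energy minus the HOMO energy and that RMSE and MAE follow their standard definitions. The function names and all numerical values below are hypothetical placeholders for illustration; they are not data or code from the paper.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, reference):
    """Mean absolute error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def energy_gap(homo, lumo):
    """HOMO-LUMO gap in the same unit as the inputs (assumed: LUMO - HOMO)."""
    return lumo - homo


# Made-up illustrative energies in eV; NOT taken from the paper or its databases.
dft_homo = [-5.20, -5.05, -5.40]
dft_lumo = [-2.90, -2.80, -3.10]
pred_homo = [-5.10, -5.15, -5.40]
pred_lumo = [-2.95, -2.70, -3.05]

dft_gap = [energy_gap(h, l) for h, l in zip(dft_homo, dft_lumo)]
pred_gap = [energy_gap(h, l) for h, l in zip(pred_homo, pred_lumo)]

for name, pred, ref in [
    ("HOMO", pred_homo, dft_homo),
    ("LUMO", pred_lumo, dft_lumo),
    ("gap", pred_gap, dft_gap),
]:
    print(f"{name}: RMSE = {rmse(pred, ref):.4f} eV, MAE = {mae(pred, ref):.4f} eV")
```

RMSE is never smaller than MAE on the same data, which is consistent with the pairs of values reported in the abstract; a large difference between the two would indicate a few large outlier errors.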

Article information

Article type: Paper
Submitted: 27 Feb 2023
Accepted: 20 Mar 2023
First published: 21 Mar 2023

Phys. Chem. Chem. Phys., 2023, 25, 10536-10549

A. Su, X. Zhang, C. Zhang, D. Ding, Y. Yang, K. Wang and Y. She, Phys. Chem. Chem. Phys., 2023, 25, 10536 DOI: 10.1039/D3CP00917C
