Using machine learning methods to predict the diabatic bond dissociation energy of non-heme iron complexes†
Abstract
Bond dissociation energy (BDE) is an important property in chemical research. In catalytic reactions of non-heme iron complexes, the diabatic BDE has a significant impact on the selectivity between halogenation and hydroxylation. Because measuring or calculating BDEs with traditional experimental or theoretical methods is often expensive and complex, we propose the first application of machine learning to non-heme iron complexes to predict and rationalize the diabatic BDEs of Fe–X and Fe–OH bonds, with the aim of assisting the study of selectivity in these catalytic reactions. We built a reliable and representative dataset containing over 600 non-heme iron complexes and used density functional theory (DFT) to calculate nearly 900 diabatic BDEs for machine learning. For model training, we used 2D molecular fingerprints and 3D descriptors as inputs to regression models. The results indicate that ensemble algorithms combined with Morgan fingerprints can effectively predict the diabatic BDEs of non-heme iron complexes: the Gradient Boosting Regressor (GBR) model with Morgan fingerprints achieves R² = 0.791 and a mean absolute error (MAE) of 10.23 kcal mol⁻¹. Incorporating 3D descriptors significantly improves the predictive performance of molecular fingerprints other than Morgan fingerprints. Notably, the SOAP descriptor effectively captures key 3D molecular information, which makes it particularly advantageous for predicting isomers with large ΔBDE. However, when the ΔBDE between isomers in the dataset is small, Morgan fingerprints remain the more efficient choice.
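The two figures of merit quoted above, R² and MAE, can be illustrated with a minimal sketch. It assumes the standard definitions (coefficient of determination and mean absolute error); the function names and the four BDE values are hypothetical placeholders chosen for easy arithmetic, not data from this work.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between reference and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical values in kcal/mol, for illustration only (not from the dataset).
dft_bde = [60.0, 80.0, 100.0, 120.0]        # reference (e.g. DFT) diabatic BDEs
predicted_bde = [65.0, 75.0, 110.0, 110.0]  # model predictions

mae = mean_absolute_error(dft_bde, predicted_bde)
r2 = r_squared(dft_bde, predicted_bde)
print(f"MAE = {mae:.2f} kcal/mol, R2 = {r2:.3f}")  # MAE = 7.50 kcal/mol, R2 = 0.875
```

MAE is expressed in the same units as the BDE itself, whereas R² is dimensionless and measures the fraction of variance in the reference values that the model explains; the two are therefore usually reported together.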