Machine learning for predicting product distributions in catalytic regioselective reactions†
Gaining predictable control over various forms of selectivity, such as enantio- and/or regioselectivity, has been a long-standing goal in chemical catalysis. Although many factors can influence the outcome of a reaction, including the molecular features of the reactants and catalysts and the reaction conditions, it is not obvious which combinations of these parameters will deliver a desired form of selectivity. We use machine learning tools, namely the neural network (NN), decision tree (DT), logistic regression (LR) and random forest algorithms, to (a) analyze the outcome of an important catalytic regioselective difluorination reaction of alkenes, and (b) decipher the complex interplay of various molecular parameters and their non-linear dependencies. We describe which features of alkenes lead to 1,1-difluorination and how subtle changes steer the reaction toward 1,2-difluorination under identical conditions. The NN was able to accurately predict whether a given alkene would yield a 1,1- or 1,2-difluorinated product. A combination of the DT and the random forest classifier offered important chemical insights, which could be used to make a more rational choice of the reactant alkene for the desired regioisomeric product. The results could have far-reaching implications for predicting which regioisomer is likely to form under a given set of conditions, and the technique could therefore expedite the development of catalytic transformations.
- This article is part of the themed collection: 2018 PCCP HOT Articles
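To make the classification task described in the abstract concrete, the following is a minimal sketch of one of the named methods, logistic regression, applied to a two-class regioselectivity problem. Everything specific here is an assumption for illustration: the two descriptors, the synthetic data, the labeling rule, and the function names (`make_synthetic_alkenes`, `train_logistic_regression`) are hypothetical and are not taken from the article, whose actual features, dataset, and model settings are not given in the abstract.

```python
import math
import random


def make_synthetic_alkenes(n, seed=0):
    """Generate synthetic (descriptors, label) pairs.

    ASSUMPTION: x1 and x2 stand in for two generic molecular descriptors
    (for example, one electronic and one steric). They are NOT the
    features used in the article.
    Label 1 = "1,1-difluorination", label 0 = "1,2-difluorination".
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1 = rng.uniform(-1.0, 1.0)
        x2 = rng.uniform(-1.0, 1.0)
        # Arbitrary linear rule, chosen only so the toy problem is learnable.
        label = 1 if (1.5 * x1 - 1.0 * x2) > 0 else 0
        data.append(((x1, x2), label))
    return data


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(data, lr=0.5, epochs=500):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    n_features = len(data[0][0])
    w = [0.0] * n_features
    b = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # derivative of the log loss w.r.t. the logit
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (1,1-product) or 0 (1,2-product) for one descriptor vector."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return 1 if p >= 0.5 else 0


def accuracy(w, b, data):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(w, b, x) == y for x, y in data) / len(data)


# Train on one synthetic set and evaluate on a separately seeded one.
train_set = make_synthetic_alkenes(200, seed=0)
test_set = make_synthetic_alkenes(100, seed=1)
w, b = train_logistic_regression(train_set)
test_accuracy = accuracy(w, b, test_set)
print("weights:", w, "bias:", b, "held-out accuracy:", test_accuracy)
```

The signs of the fitted weights indicate which descriptor pushes the prediction toward each regioisomer, which is the kind of interpretability a linear model offers. A linear decision boundary cannot capture the non-linear dependencies the abstract mentions, which is presumably why the study also relies on the NN, DT and random forest models.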