Machine learning with new functional structure descriptors for design and screening of ionic liquids in CO2 efficient capture†
Abstract
Reducing, converting and utilizing carbon dioxide emissions is a pressing and difficult global challenge. Ionic liquids (ILs), a new class of green solvents, are widely used in CO2 capture and conversion, but the number of possible ILs is enormous (more than 10¹⁸), so selecting and screening suitable ILs for CO2 capture is an urgent problem. It is therefore of great significance to establish a quantitative structure–property relationship (QSPR) for CO2 capture by ILs. From the practical standpoint of IL design and synthesis, a new functional structure descriptor (FSD) based on the group contribution (GC) method was constructed. In addition, rather than following the traditional machine-learning practice of increasing dimensionality to improve accuracy, we examine the feasibility of reducing dimensionality while maintaining accuracy, and construct a dimensionless molecular descriptor, CORE. Using these two new molecular descriptors, we evaluate six common ensemble learning models (CatBoost, LightGBM, XGBoost, GBDT, RF and AdaBoost) for predicting CO2 solubility in ILs. All of the ensemble models perform well, with CatBoost performing best: the CatBoost-FSD model achieves an R² of 0.9945 and an MAE of 0.0108, while the CatBoost-CORE model achieves an R² of 0.9925 and an MAE of 0.0120. The interpretability of the CatBoost-FSD model is analyzed and its key features are identified. Based on the CORE descriptor, the best experimental conditions are obtained, and nine ILs with superior performance are recommended.
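For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how the coefficient of determination and the mean absolute error (MAE) are conventionally computed. The helper names `mae` and `r2` and the toy numbers are illustrative assumptions and are not taken from this work; the standard textbook definitions are assumed.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average of |observed - predicted|."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Made-up solubility values for illustration only (not data from this study).
observed = [0.10, 0.25, 0.40, 0.55]
predicted = [0.12, 0.24, 0.41, 0.52]

print(f"MAE = {mae(observed, predicted):.4f}")
print(f"R2  = {r2(observed, predicted):.4f}")
```

An R² close to 1 together with a small MAE, as reported for the CatBoost-FSD and CatBoost-CORE models, indicates that predicted solubilities track the observed values closely over the evaluated data.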