Prediction and evaluation of multiple output machine learning methods for ethylene oligomerization and aromatization kinetics modeling
Abstract
With the growth of industrial automation, data-driven machine learning models are becoming increasingly popular because they are simple to build and require less modeling effort. Datasets calculated by a single-event kinetic model are used to train three algorithms, K-nearest neighbor (KNN), artificial neural network (ANN), and random forest (RF) regression, and the optimal machine learning model is identified by comparing its predictions with those of the kinetic model. The RF algorithm performs best, and the RF model is interpreted with the SHapley Additive exPlanations (SHAP) method to derive the effect of each input feature variable on product yields. The relative contribution of each input variable calculated from SHAP indicates that for light olefin (O2–O4) yields, space time > temperature > Si/Al ratio > pressure; for long-chain olefin (O5–O7) yields, temperature > space time > Si/Al ratio > pressure; and for aromatic (A6–A8) yields, temperature > Si/Al ratio > space time > pressure. Combined with kinetic rules, the RF model can be used as an alternative to the kinetic model. The input feature trends obtained from the SHAP calculations are consistent with the single-event kinetic analysis with respect to the acid strength of the zeolite, and they can be extended to propane aromatization.
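The ranking of input variables by relative SHAP contribution can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_yield` is a hypothetical stand-in for a trained regressor, the operating points are made-up normalized values, and a single fixed baseline is used instead of the background dataset that the SHAP library averages over. Only the four input names (temperature, pressure, space time, Si/Al ratio) come from the abstract.

```python
from itertools import combinations
from math import factorial

# Input features named in the abstract; the order is an arbitrary choice here.
FEATURES = ["temperature", "pressure", "space_time", "si_al_ratio"]


def toy_yield(x):
    """Hypothetical yield model (assumption, NOT the paper's RF model)."""
    t, p, st, sa = x
    return 2.0 * t + 0.1 * p + 1.0 * st + 0.5 * sa + 0.3 * t * st


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    All 2^(n-1) coalitions are enumerated per feature, so this is only
    practical for a handful of inputs.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


def relative_contributions(model, samples, baseline):
    """Mean |Shapley value| per feature, normalized to sum to 1."""
    totals = [0.0] * len(baseline)
    for x in samples:
        for i, value in enumerate(shapley_values(model, x, baseline)):
            totals[i] += abs(value)
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def rank_features(contributions, names=FEATURES):
    """Feature names sorted from largest to smallest contribution."""
    order = sorted(range(len(names)), key=lambda i: contributions[i], reverse=True)
    return [names[i] for i in order]


# Made-up normalized operating points, for illustration only.
samples = [[1.0, 1.0, 1.0, 1.0], [0.8, 0.2, 0.6, 0.4], [0.3, 0.9, 0.7, 0.5]]
baseline = [0.0, 0.0, 0.0, 0.0]
contrib = relative_contributions(toy_yield, samples, baseline)
print(" > ".join(rank_features(contrib)))
```

For a real RF model with one output per product lump, the same mean-absolute-value ranking would be computed separately for each output, which is how a different ordering can arise for O2–O4, O5–O7, and A6–A8 yields.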