Data-Augmented Response Surface Methodology-Machine Learning Hybrid Model for Predicting Polyvinyl Butyral Synthesis
Abstract
Polyvinyl butyral (PVB) is an indispensable material in daily life. Its synthesis involves a multiphasic physicochemical process that is difficult to capture with conventional mathematical models. This study explores the fusion of response surface methodology (RSM) and machine learning (ML) to optimize PVB synthesis parameters, enabling robust model development and accurate prediction of key PVB performance indicators. First, we employed a central composite design to plan and optimize the PVB synthesis experiments, ensuring systematic coverage of the critical process variables. Next, we applied Latin hypercube sampling (LHS) to the fitted second-order RSM response surface equation to generate an expanded, high-fidelity data set, which served as the basis for training optimized ML models (such as support vector regression, SVR) and for SHAP analysis. The resulting RSM-SVR hybrid model demonstrates good predictive accuracy: the coefficients of determination (R²) for the degree of oxidation and the particle size are 0.867 and 0.917, respectively. Model fit is assessed in relative terms with R² and in absolute terms with the root mean square error (RMSE) and the mean absolute error (MAE). The reliability of the model was further verified on independent test sets, confirming consistent performance on unseen data. Compared with a model based on RSM alone, the RSM-SVR hybrid model provides quantitative and practically meaningful insights into how process parameters affect key quality indicators such as acetylation degree and particle size, and it facilitates the development of interpretable artificial intelligence in materials science.
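The data-augmentation step described above can be illustrated with a minimal sketch, assuming three process factors in coded units from -1 to 1. The coefficient values, the factor count, the sample size and the function names (latin_hypercube, second_order_rsm) are hypothetical placeholders chosen for illustration; they are not the fitted values or the implementation used in this study. The sketch draws a Latin hypercube sample and evaluates a second-order response surface at each sampled point.

```python
import random


def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples points so that, for every factor, each of the
    n_samples equal-width strata of its range is sampled exactly once."""
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum of the unit interval.
        strata = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata are paired at random across factors.
        rng.shuffle(strata)
        columns.append([low + u * (high - low) for u in strata])
    return [list(point) for point in zip(*columns)]


def second_order_rsm(x, b0, linear, quadratic, interaction):
    """Evaluate y = b0 + sum_i b_i x_i + sum_i b_ii x_i^2
    + sum_{i<j} b_ij x_i x_j at the point x."""
    y = b0
    for i, xi in enumerate(x):
        y += linear[i] * xi + quadratic[i] * xi * xi
    for (i, j), bij in interaction.items():
        y += bij * x[i] * x[j]
    return y


# Hypothetical coded factor ranges and coefficients (NOT from the study).
BOUNDS = [(-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0)]
B0 = 70.0
LINEAR = [2.0, -1.5, 0.8]
QUADRATIC = [-1.2, -0.6, -0.9]
INTERACTION = {(0, 1): 0.5, (0, 2): -0.3, (1, 2): 0.2}

rng = random.Random(0)  # fixed seed so the sample is reproducible
X_aug = latin_hypercube(200, BOUNDS, rng)
y_aug = [second_order_rsm(x, B0, LINEAR, QUADRATIC, INTERACTION) for x in X_aug]
```

In a workflow of this kind, the coefficients would come from the regression fit to the central composite design runs, and the pairs (X_aug, y_aug) would then be passed to an ML regressor such as SVR for training, with a separate experimental test set held out for validation.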