A machine-learning framework for interpretable prediction of cellulose degree of polymerization retention in green solvents
Abstract
Cellulose is one of the most abundant natural renewable polymers. Its wide availability, biocompatibility, and biodegradability enable broad applications in papermaking, textiles, biomedical materials, and biofuels. However, the strong inherent network of intermolecular hydrogen bonds in cellulose impedes its dissolution in water and most organic solvents, thereby limiting its high-value utilization. In recent years, green solvents such as ionic liquids (ILs), deep eutectic solvents (DESs), and molten salt hydrates (MSHs) have emerged as effective dissolution media, yet they invariably induce cellulose depolymerization to some extent. Moreover, experimental determination of the degree of polymerization (DP) is laborious, and conventional prediction methods lack accuracy. Here, we propose an interpretable machine learning (ML) framework that integrates raw material properties, process parameters, and microscale physicochemical descriptors of solvent components. Each solvent was decomposed into cations, anions, and auxiliary components, and the descriptors of each component were calculated via computational chemistry. Seven feature combinations were designed to train six ML models, and model interpretability was enhanced using SHapley Additive exPlanations (SHAP) and partial dependence plots (PDP). The optimal model (group 5-RF) achieved a test set R² of 0.9555, and leave-one-solvent-family-out cross-validation confirmed its generalization and robustness. The analysis revealed the key influential factors (temperature, time, Cation_ESPmax, Cation_BalabanJ, and Other_NBO(H+)) and their nonlinear synergistic and antagonistic effects. A user-friendly graphical user interface (GUI) was developed to facilitate rational solvent design and process optimization.
This study establishes a new approach to intelligent prediction of cross-solvent material processing and provides a theoretical and technical basis for targeted preparation of high-performance cellulose materials.
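
As an illustration of the leave-one-solvent-family-out validation mentioned in the abstract, the following minimal sketch shows how such a split can be built: each solvent family is held out in turn and the model is fitted on the remaining families. The function names, the toy DP-retention values, and the mean-value baseline predictor are all hypothetical placeholders and do not reproduce the data or the models used in this study.

```python
from collections import defaultdict


def leave_one_family_out(families):
    """Yield (held_out_family, train_idx, test_idx), one fold per solvent family."""
    by_family = defaultdict(list)
    for i, fam in enumerate(families):
        by_family[fam].append(i)
    for fam in sorted(by_family):
        test_idx = by_family[fam]
        train_idx = [i for i, f in enumerate(families) if f != fam]
        yield fam, train_idx, test_idx


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical toy data: one solvent family label and one DP retention per sample.
families = ["IL", "IL", "DES", "DES", "MSH", "MSH"]
dp_retention = [0.90, 0.80, 0.70, 0.75, 0.50, 0.60]

splits = list(leave_one_family_out(families))
predictions = [0.0] * len(families)
for fam, train_idx, test_idx in splits:
    # Placeholder model (assumption): predict the mean of the training targets.
    # A real workflow would fit a regressor on the training features here.
    baseline = sum(dp_retention[i] for i in train_idx) / len(train_idx)
    for i in test_idx:
        predictions[i] = baseline

# Pooled score over all held-out predictions.
pooled_r2 = r2_score(dp_retention, predictions)
print("held-out families:", [fam for fam, _, _ in splits])
print("pooled R2 of the baseline:", pooled_r2)
```

Because every test fold contains only solvents from a family the model has never seen, the pooled score from such a split probes cross-family transfer rather than interpolation within a family, which a random train/test split cannot separate.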
