Machine learning assisted preparation of highly crystalline cellulose nanocrystals: lessons from cellulose sources and reaction conditions†
Abstract
Cellulosic materials contain varying proportions of crystalline and amorphous domains, depending on their source and processing history. The degree of crystallinity significantly affects the properties and behaviour of these materials, so understanding and controlling it is vital for optimising their performance. However, measuring the crystallinity of synthesized cellulose nanocrystals (CNCs) is challenging because analytical techniques such as XRD, NMR and FTIR give inconsistent results. A reliable method for predicting the crystalline nature of CNCs is therefore desirable. Herein, a machine learning model that predicts the crystalline nature of CNCs is developed using a dataset compiled from the published literature, with cellulose sources and reaction conditions as input descriptors. K-Nearest Neighbors (KNN), Support Vector, Decision Tree, Random Forest and HistGradientBoosting classifiers were trained on the dataset, and KNN was identified as the best model for predicting crystalline nature (accuracy = 95%). A crystallinity index predictor based on a KNN regressor was also developed (R² score = 0.82, RMSE = 1.59). The cellulose source is identified as the major factor influencing the crystalline nature of CNCs. The developed model can bypass the need for trial-and-error synthesis when targeting highly crystalline CNCs.
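The nearest-neighbour idea behind both predictors can be illustrated with a minimal, from-scratch sketch. It is not the authors' implementation. The records, the three reaction-condition columns (acid concentration, temperature, time), the 75% cut-off for the "high" class, the choice of k = 3 and all function names are hypothetical and chosen only for illustration. The categorical cellulose source is one-hot encoded, the numeric conditions are min-max scaled, and a query is assigned the majority class (classifier) or the mean crystallinity index (regressor) of its k nearest neighbours. The accuracy, RMSE and R² functions follow the standard definitions of the metrics quoted in the abstract.

```python
import math
from collections import Counter

# Hypothetical toy records, invented for illustration only (not the paper's data):
# (cellulose source, acid conc. wt%, temperature degC, time min, crystallinity index %)
RECORDS = [
    ("cotton", 64.0, 45.0, 60.0, 88.0),
    ("cotton", 60.0, 50.0, 45.0, 85.0),
    ("wood_pulp", 64.0, 45.0, 60.0, 76.0),
    ("wood_pulp", 58.0, 55.0, 90.0, 72.0),
    ("bagasse", 60.0, 50.0, 75.0, 68.0),
    ("bagasse", 64.0, 45.0, 30.0, 65.0),
]
HIGH_CI_THRESHOLD = 75.0  # assumed cut-off for the "high" class, not from the paper

SOURCES = sorted({r[0] for r in RECORDS})
CONDITIONS = [r[1:4] for r in RECORDS]
LO = [min(col) for col in zip(*CONDITIONS)]  # per-column minimum
HI = [max(col) for col in zip(*CONDITIONS)]  # per-column maximum


def encode(source, conditions):
    """One-hot encode the source and min-max scale the reaction conditions."""
    one_hot = [1.0 if source == s else 0.0 for s in SOURCES]
    scaled = [(v - a) / (b - a) if b > a else 0.0
              for v, a, b in zip(conditions, LO, HI)]
    return one_hot + scaled


X = [encode(r[0], r[1:4]) for r in RECORDS]
CI = [r[4] for r in RECORDS]
LABELS = ["high" if ci >= HIGH_CI_THRESHOLD else "low" for ci in CI]


def k_nearest(x, k):
    """Indices of the k training rows closest to x (Euclidean distance)."""
    return sorted(range(len(X)), key=lambda i: math.dist(x, X[i]))[:k]


def predict(source, conditions, k=3):
    """Return (majority-vote class, mean crystallinity index) of the k neighbours."""
    idx = k_nearest(encode(source, conditions), k)
    label = Counter(LABELS[i] for i in idx).most_common(1)[0][0]
    ci = sum(CI[i] for i in idx) / k
    return label, ci


def accuracy(y_true, y_pred):
    """Fraction of class labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error of the predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A call such as `predict("cotton", (62.0, 48.0, 50.0))` returns a class label together with an estimated crystallinity index. Because the one-hot source columns are unscaled 0/1 values, a mismatch in source alone puts two samples at least √2 apart, whereas the three scaled condition columns together contribute at most √3. In this toy encoding the source therefore weighs heavily in neighbour selection, which is a property of the illustrative encoding and not a reproduction of the paper's feature-importance analysis.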