Data-scientific validation of prediction models for the controlled syntheses of exfoliated nanosheets†
Abstract
Exfoliated nanosheets have attracted considerable interest as two-dimensional (2D) building blocks. In general, the yield, size, and size distribution of exfoliated nanosheets cannot be easily controlled or predicted because of the complexity of the exfoliation processes. Our group has studied prediction models for the yield, size, and size distribution based on the small experimental datasets available. Sparse modeling for small data (SpM-S), which combines machine learning (ML) with chemical insight, was used to construct the predictors. In SpM-S, the weight diagram, which visualizes the significance of the explanatory variables, plays an important role in selecting the variables used to construct the models. However, the variable-selection processes had not been validated in a data-scientific manner. In the present work, the significance of data size, visualization method, and chemical insight for variable selection was studied to validate the model-construction processes. A minimum data size was required to extract appropriate descriptors. The weight diagram had an appropriate visualization range for variable selection. Chemical insight, as domain knowledge, compensated for the limitation caused by the data size. These results indicate that SpM-S can be applied to construct predictors in the form of straightforward linear regression models for the controlled syntheses of other 2D materials, even when only small data are available.
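The general workflow described in the abstract (rank explanatory variables by weight, select a few descriptors, then build a linear regression predictor) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic rather than experimental, the sparse step is an L1-penalized (lasso) regression chosen as a stand-in because the abstract does not specify the algorithm, and the names `lasso_weights`, `ols`, the penalty `lam=0.1`, and the selection threshold of 0.5 are hypothetical. The selection by threshold only mimics, in a crude automated way, what reading a weight diagram with chemical insight would do.

```python
import random

random.seed(0)
N_SAMPLES, N_DESCRIPTORS = 30, 6  # small data: few samples, several candidates


def standardize(col):
    """Scale one descriptor column to zero mean and unit variance."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]


def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_weights(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + lam*||w||_1.

    X is a list of standardized columns, so each column has mean square 1.
    """
    n, p = len(y), len(X)
    w = [0.0] * p
    resid = list(y)
    for _ in range(n_iter):
        for j in range(p):
            col = X[j]
            # Correlation of descriptor j with the residual, with its own
            # current contribution added back.
            rho = sum(col[i] * resid[i] for i in range(n)) / n + w[j]
            new_w = soft_threshold(rho, lam)
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [resid[i] - delta * col[i] for i in range(n)]
                w[j] = new_w
    return w


def ols(cols, y):
    """Least-squares coefficients via the normal equations (no intercept)."""
    p, n = len(cols), len(y)
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(cols[a][i] * cols[b][i] for i in range(n)) for b in range(p)]
         + [sum(cols[a][i] * y[i] for i in range(n))] for a in range(p)]
    for k in range(p):  # Gaussian elimination with partial pivoting
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(k + 1, p):
            f = A[r][k] / A[k][k]
            A[r] = [A[r][c] - f * A[k][c] for c in range(p + 1)]
    coef = [0.0] * p
    for k in reversed(range(p)):  # back substitution
        tail = sum(A[k][c] * coef[c] for c in range(k + 1, p))
        coef[k] = (A[k][p] - tail) / A[k][k]
    return coef


# Synthetic data (assumption): only descriptors 0 and 3 influence the target.
raw = [[random.gauss(0.0, 1.0) for _ in range(N_SAMPLES)]
       for _ in range(N_DESCRIPTORS)]
X = [standardize(col) for col in raw]
y = [2.0 * raw[0][i] - 1.5 * raw[3][i] + random.gauss(0.0, 0.1)
     for i in range(N_SAMPLES)]
y_mean = sum(y) / N_SAMPLES
y = [v - y_mean for v in y]  # center the target so no intercept is needed

# Step 1: weights of all candidate descriptors (the data behind a weight plot).
weights = lasso_weights(X, y, lam=0.1)
# Step 2: keep descriptors whose weight magnitude exceeds a chosen threshold.
selected = [j for j, wj in enumerate(weights) if abs(wj) > 0.5]
# Step 3: refit a plain linear regression on the selected descriptors only.
coef = ols([X[j] for j in selected], y)

for j, wj in enumerate(weights):
    print(f"descriptor {j}: weight {wj:+.3f}")
print("selected descriptors:", selected)
print("refit coefficients:", [round(c, 3) for c in coef])
```

In practice the threshold step would be replaced by inspection of the weights together with domain knowledge, which is the role the abstract assigns to chemical insight when the data are too few for the weights alone to be reliable.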
