Application of classification models to identify solvents for single-walled carbon nanotubes dispersion†
Abstract
In this study, a set of classification models was developed to categorise organic solvents according to their ability to disperse single-walled carbon nanotubes (SWNTs). On this basis, the organic solvents were classified as solvents or nonsolvents. Various feature selection techniques were combined with different classifier algorithms, namely linear and quadratic discriminant analysis (LDA and QDA), decision trees (random forest and J48), neural networks and support vector machines (SVMs), and explored on a data set of structurally diverse organic solvents. Physicochemical descriptors such as partial charges, volsurf (the volumes and surfaces of grid points at different energy levels), subdivided surface area and some shape descriptors contributed to the classification models. Validation using a test set, leave-one-out and 10-fold cross-validation provided statistical parameters such as specificity, sensitivity, accuracy, the Matthews correlation coefficient and the kappa index for evaluating the developed classification models. The sum of ranking differences (SRD) procedure reveals that the random forest classifier built on descriptors selected by the wrapper feature selection method is the best classification model, while the models based on SVM, multilayer perceptron (MLP) and QDA are ranked as good models. The structural features of the solvent molecules, along with their electrostatic interactions, play a significant role in discriminating good solvents from nonsolvents in SWNT dispersion.