The prediction of pKa continues to attract attention, with ongoing investigations into ways of predicting pKa accurately, that is, with predicted values that deviate by less than 0.50 log units from experiment. We show that a single descriptor, namely an ab initio bond length, can predict pKa. The emphasis is on model simplicity and on demonstrating that more accurate predictions emerge from single-bond-length models. A data set of 171 phenols was studied. The carbon-oxygen bond length, which connects the OH group to the phenyl ring, consistently provided accurate predictions. The pKa of meta- and para-substituted phenols is predicted to within 0.50 log units by a single-bond-length model. Accurate prediction of the pKa of ortho-substituted phenols, however, required splitting them into groups called high-correlation subsets, in which the pKa of the compounds correlates strongly with a single bond length. These highly compound-specific single-bond-length models produced better predictions than models constructed with more compounds and more bond lengths. Outliers were easily identified using single-bond-length models, and in most cases we were able to determine the reason for the discrepancy. Furthermore, the single-bond-length models showed better cross-validation statistics than the PLS models constructed using more than one bond length. For all of the single-bond-length models, the root-mean-square error of estimation (RMSEE) was less than 0.50. For the majority of the models, the root-mean-square error of prediction (RMSEP) was less than 0.50. The results support the use of multiple high-correlation subsets, each with a single bond length, to predict pKa. Six one-term linear equations are listed as a starting point for a more comprehensive list covering a larger variety of compound classes.
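A one-term linear equation of the kind described here has the form pKa = slope × (bond length) + intercept, fitted separately for each high-correlation subset. The following is a minimal sketch of such a fit by ordinary least squares, together with a plain root-mean-square error. The function names `fit_one_term`, `predict_pka`, and `rmse` are hypothetical, no coefficients from the study are reproduced, and the RMSEE and RMSEP reported in the study may follow a different degrees-of-freedom convention than the simple mean used below.

```python
from math import sqrt
from typing import Sequence, Tuple


def fit_one_term(bond_lengths: Sequence[float],
                 pka_values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of pKa = slope * bond_length + intercept.

    Hypothetical helper: one call per subset of compounds.
    """
    n = len(bond_lengths)
    if n < 2 or n != len(pka_values):
        raise ValueError("need at least two (bond length, pKa) pairs")
    mean_r = sum(bond_lengths) / n
    mean_p = sum(pka_values) / n
    # Sum of squared deviations of the descriptor, and the cross term.
    sxx = sum((r - mean_r) ** 2 for r in bond_lengths)
    if sxx == 0.0:
        raise ValueError("all bond lengths are identical; slope is undefined")
    sxy = sum((r - mean_r) * (p - mean_p)
              for r, p in zip(bond_lengths, pka_values))
    slope = sxy / sxx
    intercept = mean_p - slope * mean_r
    return slope, intercept


def predict_pka(bond_length: float, slope: float, intercept: float) -> float:
    """Apply a fitted one-term equation to a single bond length."""
    return slope * bond_length + intercept


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square error, dividing by the number of compounds.

    Assumption: no degrees-of-freedom correction is applied.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))
```

Calling `rmse` on the compounds used for fitting gives an estimation-type error, and calling it on held-out compounds gives a prediction-type error; either can then be compared against the 0.50 log unit threshold.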