Simultaneous description of the influence of solvent, reaction type, and substituent on equilibrium constants by means of three-mode factor analysis
Abstract
The equations proposed by Hammett, by Taft, and by Nieuwdorp et al. for the simultaneous description of the influence of reaction type and substituent on equilibrium and reaction rate constants are discussed. The last of these is an instance of factor analysis. This mathematical–statistical technique has also been applied to describe simultaneously the influence of solvent and substituent, and the influence of solvent and reaction type. It is thus a logical step to classify equilibrium and reaction rate constants with respect to three modes (solvent, reaction type, and substituent) and to try to describe the influence of these three variables by three-mode factor analysis. Two examples of the application of this technique to literature data are given. The first concerns ionization constants for 15 series of substituted compounds in three solvents. The second concerns phase equilibrium constants for six series of substituted compounds in nine two-phase systems, which comprise gas–liquid, liquid–liquid, and solid–liquid systems. The precision of the fit to the observations and the precision of the prediction of the missing data are discussed. In the first example 237 data points are missing. Of these, 90 cannot be predicted at all by the Hammett, Taft, or Nieuwdorp equations, because they belong to reactions for which no measurements at all are available in a particular solvent. The standard deviation of the prediction of these 90 data points by three-mode factor analysis ranges from 0.1 to 0.2. In the second example nearly all missing data points belong to reactions for which no measurements at all are available in a particular solvent. They can be predicted by three-mode factor analysis with a standard deviation that ranges from 0.09 to 0.13.
Further, it is shown that far fewer parameters are required to fit the observations by three-mode factor analysis than by the corresponding regression analysis model, viz., the Taft model.
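For orientation, the step from a two-mode to a three-mode description can be written out as follows. This is a sketch only: the Hammett equation is given in its standard form, while the three-mode expression assumes a Tucker-type model, which may differ in detail from the formulation used in the paper. The index and symbol names (i for substituent, j for reaction type, k for solvent; a, b, c, g for the loadings and core elements; P, Q, R for the numbers of factors) are illustrative choices, not the authors' notation.

```latex
% Hammett equation: a one-factor, two-mode (substituent i, reaction j) model
\log \frac{K_{ij}}{K_{0j}} = \rho_j \, \sigma_i

% Two-mode factor analysis with P factors (generalises the single product above)
y_{ij} = \sum_{p=1}^{P} a_{ip} \, b_{jp} + e_{ij}

% Three-mode factor analysis (assumed Tucker-type form), adding solvent k
y_{ijk} = \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R}
          a_{ip} \, b_{jq} \, c_{kr} \, g_{pqr} + e_{ijk}
```

Under this assumed form, a data array with I substituents, J reaction types, and K solvents is described by at most IP + JQ + KR + PQR parameters, before allowing for rotational freedom. Once the loadings for every level of each mode have been estimated, a value can be computed for any combination (i, j, k), including combinations for which no measurement exists.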