Fabio di Lena* and Christina L. L. Chai
Institute of Chemical and Engineering Sciences, Agency of Science, Technology and Research, 1 Pesek Road, Jurong Island, 627833, Singapore. E-mail: fabio_di_lena@ices.a-star.edu.sg
First published on 13th April 2010
In this work, we present the first successful application of in silico modeling to the construction of quantitative and predictive relationships between the set of constants kact, kdeact and KATRP and the structures and properties of various ATRP catalysts and initiators. The results not only are consistent with the generally accepted ATRP mechanistic picture but also provide valuable insights into this complex polymerization reaction. The models, built using the genetic function approximation algorithm, highlight and quantify the pivotal roles played in the ATRP process by energetic and steric factors of both catalysts and initiators as well as by the reaction medium. Moreover, the models suggest the existence of long-range interactions in catalyst–initiator recognition and subsequent binding. We believe that the approach will prove to be a powerful tool for the discovery of improved catalysts for ATRP.
Atom transfer radical polymerization (ATRP)48–51 is one of the methods for controlled radical polymerization that have revolutionized the field of polymer chemistry in the last decade, the others being nitroxide mediated (NMP)52,53 and reversible addition–fragmentation chain transfer (RAFT)53,54 radical polymerizations. Macromolecules with precisely controlled architecture and functionality have been synthesized from a wide range of monomers under conditions that are much less stringent than those previously required for ionic living polymerizations.50–52,54 Mechanistically, ATRP is a redox-based process in which a transition-metal complex acts as a mediator for the intermittent generation of propagating radicals from alkyl halides (Scheme 1).48,49 Control over the polymerization depends largely on an appropriate equilibrium between the activation (generation of radicals, kact) and the deactivation (formation of alkyl halides, kdeact) processes, which determines the concentration of radicals and thus the rates of polymerization and termination as well as the polydispersities. In particular, KATRP = kact/kdeact should be sufficiently small to maintain a low concentration of radicals and to minimize termination reactions, whereas both kact and kdeact should be large enough (though kact ≪ kdeact) to provide good control while maintaining a reasonable polymerization rate. The values of KATRP, kact and kdeact depend on the catalyst, initiator and monomer structure, as well as on the type of solvent and the reaction conditions. Understanding how these parameters affect the three reaction constants is crucial for the rational development of more efficient ATRP catalysts.
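For reference, the relationship between the three constants and the radical concentration can be written compactly. The notation is an assumption introduced here for illustration only: R–X denotes the alkyl halide (dormant species), R• the propagating radical, and Cu(I)L and X–Cu(II)L the activator and deactivator complexes of Scheme 1, with charges omitted.

```latex
K_{\mathrm{ATRP}} = \frac{k_{\mathrm{act}}}{k_{\mathrm{deact}}}
  = \frac{[\mathrm{R}^{\bullet}]\,[\mathrm{X{-}Cu^{II}L}]}{[\mathrm{R{-}X}]\,[\mathrm{Cu^{I}L}]}
\qquad\Longrightarrow\qquad
[\mathrm{R}^{\bullet}] = K_{\mathrm{ATRP}}\,
  \frac{[\mathrm{R{-}X}]\,[\mathrm{Cu^{I}L}]}{[\mathrm{X{-}Cu^{II}L}]}
```

Written this way, a small KATRP translates directly into a low radical concentration, and log KATRP = log kact − log kdeact.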
Scheme 1 Atom transfer radical polymerization equilibrium.
Recently, large datasets of KATRP, kact and kdeact, either experimentally determined55,56 or extrapolated by using the constant selectivity principle, were reported for a number of copper catalysts and alkyl halide initiators.57 Semi-quantitative correlations between the structures of these compounds and KATRP57 or kact57–59 were also described. In contrast, owing to the difficulty of measuring the relatively fast deactivation process, no such correlations are presently available for kdeact.
In this work, we present the first successful application of in silico modeling to the construction of fully quantitative relationships between the set of constants kact, kdeact and KATRP and the structures and properties of ATRP catalysts and initiators.
Chart 1 Ligands for copper-catalyzed atom transfer radical polymerization used in this study.
The complexes with CuIX and either bidentate (bpy, dNbpy, NPPMI, NOPMI) or tetradentate (HMTETA, [2.2.3], [2.3.2], [3.2.3], BPED, Me6TREN, Et6TREN, MA6TREN, BA6TREN, AN6TREN, TPMA, Me4Cyclam, DMCBCy) ligands have been reported to exist in solution preferentially as [CuI(κ2-Lig)2]+ and [CuI(κ4-Lig)]+, respectively, whereas those with CuIIX2 exist as [CuII(κ2-Lig)2X]+ and [CuII(κ4-Lig)X]+.74,75 On the other hand, in the presence of tridentate ligands such as PMDETA and BPMPA, CuIX and CuIIX2 are likely to form neutral complexes of the type [CuI(κ3-Lig)X] and [CuII(κ3-Lig)X2], although the displacement of a halogen atom by the solvent or monomer is feasible in both species.74,75 Finally, the complex with CuIX and TPEN may exist both as a binuclear tricoordinated complex, [CuI(κ3-TPEN)X]2, and as a mononuclear pentacoordinated complex, [CuI(κ5-TPEN)]+, the latter likely being the active species, while CuIIX2 with TPEN can only exist as a mononuclear, hexacoordinated compound, [CuII(κ5-TPEN)X]+.76
After optimizing the geometries, molecular descriptors were calculated for each of the copper complexes above, and PCA was performed on the Cu(I) and Cu(II) species separately. The first 6 principal components (PCs) of each of the two groups of complexes were subjected to hierarchical cluster analysis (Fig. S1 and S2, ESI†). Test sets of diverse molecules were obtained by selecting one complex from each of the 4 most distant clusters such that the corresponding points in the figure-of-merit space were homogeneously spaced out. Accordingly, [CuI(NPPMI)Br], [CuI(Me6TREN)]+, [CuI(MA6TREN)]+ and [CuI(TPEN)]+ were chosen as the test set for the modeling of kact, whereas [CuII(BPMPA)Br2], [CuII(MA6TREN)Br]+, [CuII(DMCBCy)Br]+ and [CuII(TPEN)Br]+ were chosen as the test set for kdeact. Furthermore, in order to ensure the inclusion of the most relevant molecular information, the initial group of descriptors used for modeling the remaining 16 activators and 16 deactivators (training sets) was chosen among the peripheral points in the loading plots PC1 vs. PC2 vs. PC3 (Fig. S3 and S4, ESI†). Generally accepted mechanistic considerations of the ATRP process were also taken into account. The number of descriptors was progressively reduced by iteratively subjecting them to the GFA algorithm and discarding, after each regression cycle, those with the lowest occurrences in the populations.
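The test-set selection described above can be sketched in a few lines. This is a minimal illustration, not the authors' workflow. It assumes a descriptor matrix with one row per complex (filled here with random placeholder numbers), a plain NumPy PCA on autoscaled descriptors, a naive complete-linkage agglomerative clustering, and a rule that picks the member nearest each cluster centroid. The linkage criterion, the centroid rule and all function names are assumptions. The original selection also considered how the points are spread in figure-of-merit space, which is not modeled here.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project autoscaled descriptors onto the leading principal components."""
    Xc = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def agglomerate(points, n_clusters):
    """Naive complete-linkage clustering; returns a list of index lists."""
    clusters = [[i] for i in range(len(points))]
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # complete linkage: distance between the two farthest members
                d = dist[np.ix_(clusters[a], clusters[b])].max()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def pick_test_set(scores, clusters):
    """From each cluster, pick the member closest to the cluster centroid (assumed rule)."""
    chosen = []
    for members in clusters:
        centroid = scores[members].mean(axis=0)
        d = np.linalg.norm(scores[members] - centroid, axis=1)
        chosen.append(members[int(np.argmin(d))])
    return sorted(chosen)

rng = np.random.default_rng(0)
descriptors = rng.normal(size=(20, 30))   # placeholder: 20 complexes x 30 descriptors
scores = pca_scores(descriptors, n_components=6)
clusters = agglomerate(scores, n_clusters=4)
test_idx = pick_test_set(scores, clusters)
train_idx = [i for i in range(len(descriptors)) if i not in test_idx]
```

With 20 complexes and 4 clusters, this leaves 16 training and 4 test complexes, matching the split sizes quoted above.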
Of all the molecular descriptors calculated for Cu(I) complexes, HOMO energy (HOMO), cavity volume (CV), molecular area (MA), moment of inertia Z (IZ), van der Waals surface area (vSA), solvent surface area (SSA), 5χpath, 5χchain, 5χvchain, 6χpath, and 6χvpath survived the selection process for kact. These were used to build quantitative structure–reactivity relationships, and the GFA algorithm with linear polynomials returned the top 10 models in Table 1. The values of Friedman's LOF, R2, adjusted R2, cross-validated R2, significance-of-regression (SOR) F-value, and critical SOR F-value (95%) indicate that all the regressions are statistically significant. Interestingly, model 10, rather than the one with the lowest Friedman's LOF value, yields the best prediction for the test set (Fig. 1). A close look at the models reveals a direct proportionality between kact and HOMO, which mirrors the experimental observation that the higher the reducing power of a Cu(I) complex, the higher its ATRP activity.57 The regressions also highlight the pivotal role played in the reactivity of Cu(I) compounds by topological descriptors, which differentiate the molecules according to their size, degree of branching, flexibility and overall shape. It is worth noting that only a few of the tens of topological indices computed in the present study ended up in the models. This should minimize the risk of “fortuitous explanation” of the figure-of-merit that is sometimes observed with relationships containing a large number of such parameters.77 Surface area-based descriptors, which measure the extent to which molecules expose themselves to the external environment, have generally been related to molecular factors such as transport and solubility.
It is therefore interesting to find that MA and vSA, although positively correlated, have opposite effects on kact and that, as shown by control GFA analyses, both parameters need to be present in the model in order to obtain a reliable regression. According to the role of CV in the equations, metal accessibility is another parameter that should be taken into consideration when designing an ATRP catalyst (vide infra). The moments of inertia depend on both the mass of each atom in the molecule and the overall molecular geometry. The presence of IZ in five out of ten models can therefore be taken as further evidence of the importance of the geometry of the activator in the activation process. In contrast, the absence of the average N–Cu bond length from all the models may be symptomatic of a secondary role, with respect to the other parameters, played by the stability of Cu(I) complexes in the activation process. The relatively small preference of Cu(I) for specific donor atoms as well as the much smaller variation, compared to Cu(II) compounds, of the stabilities of its complexes as the ligand is altered have been recently pointed out.78
Model | Equation | Friedman's LOF | R² | Adjusted R² | Cross-validated R² | SOR F-value | Critical SOR F-value (95%) |
---|---|---|---|---|---|---|---|
1 | log(kact) = 47(HOMO) + 6.7 × 10⁻³(CV) − 3.1(5χpath) + 5.1(6χpath) + 42(5χvchain) − 4.3(6χvpath) + 0.46(MA) + 1.6 × 10⁻³(IZ) − 0.54(vSA) + 11 | 0.450 | 0.990 | 0.974 | 0.937 | 63.6 | 4.15 |
2 | log(kact) = 58(HOMO) + 8.1 × 10⁻³(CV) − 2.5(5χpath) + 30(5χchain) + 4.0(6χpath) − 3.2(6χvpath) + 0.37(MA) + 1.8 × 10⁻³(IZ) − 0.47(vSA) + 12 | 0.497 | 0.988 | 0.971 | 0.807 | 57.6 | 4.15 |
3 | log(kact) = 59(HOMO) + 7.9 × 10⁻³(CV) − 3.5(5χpath) + 5.9(6χpath) + 45(5χvchain) − 5.3(6χvpath) + 0.54(MA) − 0.64(vSA) + 12 | 0.510 | 0.978 | 0.952 | 0.874 | 38.3 | 3.75 |
4 | log(kact) = 58(HOMO) + 11 × 10⁻³(CV) − 3.4(5χpath) + 5.6(6χpath) + 44(5χvchain) − 5.2(6χvpath) + 0.55(MA) − 0.67(vSA) + 7.3 × 10⁻³(SSA) + 11 | 0.588 | 0.986 | 0.966 | 0.780 | 48.5 | 4.15 |
5 | log(kact) = 71(HOMO) + 13 × 10⁻³(CV) − 2.8(5χpath) + 32(5χchain) + 4.5(6χpath) − 4.2(6χvpath) + 0.47(MA) − 0.61(vSA) + 8.7 × 10⁻³(SSA) + 11 | 0.611 | 0.986 | 0.964 | 0.742 | 46.7 | 4.15 |
6 | log(kact) = 73(HOMO) + 9.5 × 10⁻³(CV) − 2.9(5χpath) + 32(5χchain) + 4.8(6χpath) − 4.3(6χvpath) + 0.47(MA) − 0.56(vSA) + 12 | 0.612 | 0.973 | 0.942 | 0.553 | 31.7 | 3.75 |
7 | log(kact) = 36(HOMO) − 2.3 × 10⁻³(CV) − 1.6(5χpath) + 26(5χchain) + 2.2(6χpath) + 5.4 × 10⁻³(IZ) − 14 × 10⁻³(SSA) + 12 | 0.906 | 0.935 | 0.879 | 0.735 | 16.6 | 3.51 |
8 | log(kact) = 63(HOMO) + 8.3 × 10⁻³(CV) − 3.3(5χpath) + 9.9(5χchain) + 5.6(6χpath) + 31(5χvchain) − 5.0(6χvpath) + 0.52(MA) − 0.61(vSA) + 12 | 0.927 | 0.979 | 0.946 | 0.554 | 30.6 | 4.15 |
9 | log(kact) = 38(HOMO) − 1.6(5χpath) + 26(5χchain) + 2.1(6χpath) + 5.1 × 10⁻³(IZ) − 20 × 10⁻²(vSA) − 8.9 × 10⁻³(SSA) + 12 | 0.967 | 0.931 | 0.871 | 0.518 | 15.4 | 3.51 |
10 | log(kact) = 35(HOMO) − 2.8(5χpath) + 4.6(6χpath) + 41(5χvchain) − 3.0(6χvpath) + 0.32(MA) + 3.5 × 10⁻³(IZ) − 0.34(vSA) + 11 × 10⁻³(SSA) + 12 | 0.986 | 0.977 | 0.943 | 0.762 | 28.7 | 4.15 |
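The regression statistics in Table 1 are linked by simple formulas, which makes a quick consistency check possible. The sketch below assumes the standard ordinary-least-squares definitions of adjusted R² and of the regression F statistic for a model with an intercept. The software used in the study may apply slightly different conventions, and the function names are illustrative.

```python
def adjusted_r2(r2, n_samples, n_descriptors):
    """Adjusted R^2 for a linear model with an intercept (standard OLS definition)."""
    dof = n_samples - n_descriptors - 1
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof

def regression_f(r2, n_samples, n_descriptors):
    """Significance-of-regression F statistic computed from R^2."""
    dof = n_samples - n_descriptors - 1
    return (r2 / n_descriptors) / ((1.0 - r2) / dof)

# Model 3 of Table 1: R^2 = 0.978, 8 descriptors, 16 training activators
adj = adjusted_r2(0.978, n_samples=16, n_descriptors=8)
f_value = regression_f(0.978, n_samples=16, n_descriptors=8)
print(f"adjusted R2 = {adj:.3f}, F = {f_value:.1f}")
```

For model 3 this gives roughly 0.953 and 38.9, close to the tabulated 0.952 and 38.3. Differences of this size are what one would expect from starting with an R² rounded to three figures.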
Fig. 1 Calculated vs. experimental log(kact) values for Cu(I) activators obtained by genetic function approximation analysis with linear polynomials. Training and test sets are represented by different symbols.
As to kdeact, which is responsible for the control in ATRP reactions, the following descriptors for Cu(II) complexes survived the selection process: 4χvpath, 4χvpath/cluster, 5χvpath/cluster, electrostatic energy (EE), valence energy (VE), binding energy (BE), LUMO–HOMO energy (L–H), dipole Y (δY), surface area (SA), average N–Cu bond length (N–Cu), and Cu–Br bond length (Cu–Br). The comparison of these parameters with those obtained for the activation reaction highlights a greater weight of energetic terms in the deactivation reaction. The GFA algorithm with linear polynomials returned the statistically significant top 10 models in Table 2. To the best of our knowledge, this is the first time that correlations between kdeact and the structure of ATRP catalysts have been reported. Similar to what was found for kact, model 14, rather than the one with the lowest Friedman's LOF value, exhibits the highest predictive power (Fig. 2), despite the negative value of its cross-validated R2. Like the topological descriptors, whose importance has already been discussed above, L–H and SA appear in all 10 equations. In particular, the fact that the rate of deactivation is roughly proportional to the width of the L–H gap is consistent with the known, high kinetic reactivity of compounds having a narrow band gap.79 On the other hand, the direct proportionality between SA and kdeact can be rationalized on the basis of the different affinities of Cu(II) complexes for the reaction medium. The Cu–Br bond length, which turns out to be inversely proportional to the deactivation rate constant, is another important parameter that emerges from the modeling. Intuitively, the shorter the Cu–Br bond, the stronger the bond is expected to be and hence the more favored the deactivation process. This also relates to the concept of metal “halidophilicity”.
It is experimentally observed that the degree of control in ATRP is lower in the presence of metal complexes with low halide affinity and/or of solvents that, like water, favor the dissociation of halides by effectively solvating them, thus reducing the amount of deactivator in solution.80,81 Interestingly, albeit not included in model 14, N–Cu is present in four out of the ten regressions, suggesting that, unlike for Cu(I), the stabilities of Cu(II) complexes should be taken into consideration when designing an ATRP catalyst.82,83
Model | Equation | Friedman's LOF | R² | Adjusted R² | Cross-validated R² | SOR F-value | Critical SOR F-value (95%) |
---|---|---|---|---|---|---|---|
11 | log(kdeact) = −2.3(4χvpath) − 1.8(4χvpath/cluster) + 1.4(5χvpath/cluster) − 9.1 × 10⁻³(VE) + 1.3(BE) + 4.1 × 10²(L–H) − 0.25(δY) + 19 × 10⁻³(SA) − 0.25 | 1.30 | 0.906 | 0.800 | −1.37 | 8.48 | 3.75 |
12 | log(kdeact) = −2.0(4χvpath) − 1.0(4χvpath/cluster) + 0.90(5χvpath/cluster) − 20 × 10⁻³(VE) + 14(N–Cu) − 6.1(Cu–Br) + 4.3 × 10²(L–H) + 8.2 × 10⁻³(SA) − 10 | 1.40 | 0.899 | 0.784 | 0.0716 | 7.81 | 3.75 |
13 | log(kdeact) = −2.1(4χvpath) − 1.2(4χvpath/cluster) + 1.1(5χvpath/cluster) − 9.6 × 10⁻³(VE) − 5.0(Cu–Br) + 0.82(BE) + 4.3 × 10²(L–H) + 14 × 10⁻³(SA) + 13 | 1.54 | 0.889 | 0.762 | 0.0711 | 7.02 | 3.75 |
14 | log(kdeact) = −2.0(4χvpath) − 1.0(4χvpath/cluster) + 0.79(5χvpath/cluster) + 4.3 × 10⁻³(EE) − 5.9(Cu–Br) + 4.0 × 10²(L–H) − 0.18(δY) + 8.0 × 10⁻³(SA) + 17 | 1.78 | 0.872 | 0.725 | −3.40 | 5.94 | 3.75 |
15 | log(kdeact) = −2.0(4χvpath) − 1.4(4χvpath/cluster) + 1.0(5χvpath/cluster) + 2.5 × 10⁻³(EE) − 0.87(BE) + 3.6 × 10²(L–H) − 0.25(δY) + 14 × 10⁻³(SA) + 0.59 | 1.81 | 0.869 | 0.720 | −4.52 | 5.82 | 3.75 |
16 | log(kdeact) = −1.9(4χvpath) − 1.1(4χvpath/cluster) + 0.97(5χvpath/cluster) − 12 × 10⁻³(VE) + 9.3(N–Cu) + 0.76(BE) + 4.2 × 10²(L–H) + 13 × 10⁻³(SA) − 18 | 1.95 | 0.859 | 0.699 | −3.68 | 5.35 | 3.75 |
17 | log(kdeact) = −2.0(4χvpath) − 0.91(4χvpath/cluster) + 0.81(5χvpath/cluster) + 2.6 × 10⁻³(EE) − 5.1 × 10⁻³(VE) − 7.6(Cu–Br) + 4.4 × 10²(L–H) + 7.7 × 10⁻³(SA) + 21 | 1.98 | 0.857 | 0.693 | −3.88 | 5.24 | 3.75 |
18 | log(kdeact) = −1.8(4χvpath) − 0.78(4χvpath/cluster) + 0.69(5χvpath/cluster) + 2.7 × 10⁻³(EE) − 5.1(Cu–Br) + 0.40(BE) + 3.8 × 10²(L–H) + 9.8 × 10⁻³(SA) + 14 | 2.06 | 0.852 | 0.682 | −6.90 | 5.03 | 3.75 |
19 | log(kdeact) = −1.8(4χvpath) − 0.73(4χvpath/cluster) + 0.65(5χvpath/cluster) + 3.5 × 10⁻³(EE) − 0.99(N–Cu) − 7.0(Cu–Br) + 4.1 × 10²(L–H) + 7.0 × 10⁻³(SA) + 22 | 2.23 | 0.839 | 0.655 | −3.89 | 4.56 | 3.75 |
20 | log(kdeact) = −2.0(4χvpath) − 1.2(4χvpath/cluster) + 0.94(5χvpath/cluster) − 19 × 10⁻³(VE) + 18(N–Cu) + 3.9 × 10²(L–H) − 0.17(δY) + 8.2 × 10⁻³(SA) − 32 | 2.32 | 0.833 | 0.643 | −2.52 | 4.37 | 3.75 |
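Several models in Table 2 combine a high fitted R² with a negative cross-validated R². The sketch below shows how that can happen when many descriptors are fitted to few data points. It assumes leave-one-out cross-validation and the common definition 1 − PRESS/SS_tot. The study's exact cross-validation scheme is not stated in this section, and the data are random placeholders rather than the published descriptors.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ [1, X]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Predictions of the intercept-plus-linear model."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r2_and_q2(X, y):
    """Return (fitted R^2, leave-one-out cross-validated R^2)."""
    ss_tot = np.sum((y - y.mean()) ** 2)
    sse = np.sum((y - predict(fit_ols(X, y), X)) ** 2)
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # leave sample i out
        coef = fit_ols(X[keep], y[keep])
        press += float((y[i] - predict(coef, X[i:i + 1])[0]) ** 2)
    return 1.0 - sse / ss_tot, 1.0 - press / ss_tot

rng = np.random.default_rng(1)
X = rng.normal(size=(16, 8))   # placeholder: 16 complexes, 8 descriptors
y = rng.normal(size=16)        # placeholder response carrying no real signal
r2, q2 = r2_and_q2(X, y)
print(f"R2 = {r2:.3f}, cross-validated R2 = {q2:.3f}")
```

Under these definitions the cross-validated value can never exceed the fitted R². It turns negative whenever the left-out predictions are worse than simply predicting the mean of the responses.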
Fig. 2 Calculated vs. experimental log(kdeact) values for Cu(II) deactivators obtained by genetic function approximation analysis with linear polynomials. Training and test sets are represented by different symbols.
When linear splines were employed as basis functions in the GFA algorithm, the resulting models not only contained a smaller number of molecular descriptors (Tables S1 and S2, ESI†) but also provided better predictions for the training set (Fig. S5 and S6, ESI†). None of them, however, was able to produce reliable predictions for the test set.
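To make the term "linear spline" concrete, the sketch below assumes the truncated power form ⟨x − a⟩ = max(0, x − a) that is commonly used for spline terms in GFA. The exact basis functions used by the authors' software are not given in this section, and the coefficients and knots are invented for illustration.

```python
def truncated_linear_spline(x, knot):
    """Truncated power spline <x - knot>: zero below the knot, linear above it."""
    return max(0.0, x - knot)

def spline_model(x, intercept, terms):
    """Sum of spline terms; `terms` is a list of (coefficient, knot) pairs."""
    return intercept + sum(c * truncated_linear_spline(x, k) for c, k in terms)

# Hypothetical one-descriptor model with two knots
terms = [(2.0, 1.0), (-3.0, 2.0)]
values = [spline_model(x, 0.5, terms) for x in (0.0, 1.5, 3.0)]
```

Each knot adds an adjustable break point on top of the coefficient. That extra flexibility is one plausible reason why such models fit the training set more closely yet transfer less well to the test set.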
Chart 2 Alkyl halide initiators for atom transfer radical polymerization used in this study.
The descriptors selected for kact, namely nitrile fragment counts (CN), total molecular mass (Mw), molecular volume (MV), molecular refractivity (MR), Wiener index (W), 1κ, 0χv, 3χpath, 3χcluster, principal moment of inertia X (IX), radius of gyration (RoG), ellipsoidal volume (EV), solvation energy (SE) and LUMO energy (LUMO), were used in the GFA algorithm, which returned the top 10 linear polynomials in Table 3. Interestingly, regressions 22 and 30, which have the second lowest and the highest Friedman's LOF value, respectively, show the highest predictive power for the test set (Fig. 3). The models suggest that shape- and size-based descriptors also play a crucial role in the reactivity of the initiators. This can be rationalized as follows. Since ATRP is believed to occur through an inner sphere electron transfer (ISET) mechanism,84 the halogen of the alkyl halide needs to be in the first coordination sphere of the metal center in order for the atom transfer to take place. It is therefore reasonable to imagine that the more compatible the shape of the initiator is with the steric neighborhood of the metal center in the activator, the more effective the activation reaction is (vide supra). The fact that the solvation energy appears in a number of regressions further underlines the importance of the solvent in the ATRP equilibrium.81 Another remarkable finding is the inverse proportionality between kact and LUMO energy. This is consistent with the fact that the alkyl halide is the electron acceptor in the activation reaction and hence the lower the energy of the LUMO, the faster the reduction. Moreover, the beneficial effect on the activation process resulting from the presence of a nitrile on the initiator is due to the ability of this group to stabilize radicals by resonance, a property exploited in azo initiators for conventional free radical polymerization such as 2,2′-azobisisobutyronitrile (AIBN).
Model | Equation | Friedman's LOF | R² | Adjusted R² | Cross-validated R² | SOR F-value | Critical SOR F-value (95%) |
---|---|---|---|---|---|---|---|
21 | −log(kact) = −1.9(CN) − 22 × 10⁻³(W) − 1.6(0χv) + 0.16(MV) − 5.0(RoG) + 41(LUMO) + 9.1 | 0.312 | 0.972 | 0.955 | 0.0595 | 57.5 | 3.23 |
22 | −log(kact) = −2.8(CN) − 25 × 10⁻³(W) − 1.8(0χv) + 0.19(MV) − 7.4(RoG) − 1.9 × 10²(SE) + 45(LUMO) + 9.9 | 0.379 | 0.978 | 0.961 | 0.353 | 57.5 | 3.30 |
23 | −log(kact) = −33 × 10⁻³(W) − 1.8(0χv) + 0.13(MV) + 43(LUMO) + 2.2 | 0.455 | 0.919 | 0.893 | 0.675 | 34.3 | 3.31 |
24 | −log(kact) = −25 × 10⁻³(W) − 1.4(0χv) + 0.10(MV) + 1.9 × 10²(SE) + 39(LUMO) + 4.7 | 0.477 | 0.938 | 0.910 | 0.629 | 33.3 | 3.22 |
25 | −log(kact) = −1.9(CN) − 22 × 10⁻³(W) + 12 × 10⁻³(3χpath) − 1.5(0χv) + 0.16(MV) − 5.1(RoG) + 42(LUMO) + 9.1 | 0.488 | 0.972 | 0.950 | −0.0681 | 44.3 | 3.30 |
26 | −log(kact) = −0.85(CN) − 30 × 10⁻³(W) − 1.8(0χv) + 0.12(MV) + 34(LUMO) + 2.7 | 0.498 | 0.935 | 0.906 | −1.08 | 31.8 | 3.22 |
27 | −log(kact) = −6.8 × 10⁻³(W) + 3.9 × 10²(SE) + 46(LUMO) + 9.8 | 0.546 | 0.874 | 0.845 | 0.684 | 30.0 | 3.57 |
28 | −log(kact) = −15 × 10⁻³(W) + 0.60(3χpath) + 3.8 × 10²(SE) + 45(LUMO) + 9.1 | 0.580 | 0.897 | 0.863 | 0.752 | 26.2 | 3.31 |
29 | −log(kact) = −31 × 10⁻³(W) − 1.7(0χv) + 0.14(MV) − 1.4(RoG) + 49(LUMO) + 3.8 | 0.582 | 0.924 | 0.890 | 0.696 | 26.9 | 3.22 |
30 | −log(kact) = −3.3(CN) − 26 × 10⁻³(W) − 0.48(3χpath) − 2.2(0χv) + 0.22(MV) − 7.6(RoG) − 2.8 × 10²(SE) + 40(LUMO) + 8.5 | 0.582 | 0.981 | 0.962 | 0.386 | 52.0 | 3.45 |
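As an illustration of how a tabulated regression is used, the sketch below evaluates model 27 of Table 3, the shortest of the ten. The coefficients are copied from the table. The descriptor values are hypothetical and chosen only to show the arithmetic, and the units are assumed to be those of the software used in the study, which this section does not specify.

```python
def neg_log_kact_model27(W, SE, LUMO):
    """Model 27 of Table 3: -log(kact) from Wiener index, solvation energy and LUMO energy."""
    return -6.8e-3 * W + 3.9e2 * SE + 46.0 * LUMO + 9.8

# Hypothetical descriptor values, for illustration only
base = neg_log_kact_model27(W=50.0, SE=-0.01, LUMO=-0.02)
lower_lumo = neg_log_kact_model27(W=50.0, SE=-0.01, LUMO=-0.03)
print(f"log(kact) = {-base:.2f} -> {-lower_lumo:.2f} when the LUMO energy is lowered")
```

Because the LUMO coefficient in −log(kact) is positive, lowering the LUMO energy raises the predicted log(kact), which is the inverse relationship noted in the text.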
Fig. 3 Calculated vs. experimental log(kact) values for alkyl halide initiators obtained by genetic function approximation analysis with linear polynomials. Training and test sets are represented by different symbols.
As far as KATRP is concerned, the following descriptors were found to be important for the modeling: nitrile fragment counts (CN), element count (EC), molecular density (MD), molecular refractivity (MR), Balaban index X (JX), Balaban index Y (JY), 2χ, principal moment of inertia X (IX), dipole Z (δZ), shadow area ZX plane (ShAZX), solvent surface occupied volume (SSOV), solvation energy (SE), total energy (TE), binding energy (BE), HOMO energy (HOMO), LUMO–HOMO energy (L–H) and C–X bond distance (C–X). When these parameters were used with the GFA algorithm, the linear polynomials in Table 4 were obtained, among which models 38 and 40 stood out for the reliability of their predictions (Fig. 4). The presence in the models of the highly discriminating Balaban index, whose value is substantially independent of the size of the molecule, as well as of the solvent surface occupied volume and solvation energy, adds to what has already been said about the relevance of, respectively, topological descriptors and the reaction medium in ATRP. By the same token, the fact that most equations contain the moment of inertia IX goes back to the previously discussed importance of the geometry of the species involved in the ATRP equilibrium. Notably, since dipolar properties have been correlated to long-range electrostatic interactions,69 which are believed to strongly influence the mutual orientation of ligands and receptors prior to Brownian collision and subsequent binding,85 the inclusion of δZ in the regressions may indicate the existence of analogous long-range recognition processes between ATRP catalysts and initiators. Last but not least, a thorough analysis of the models points to the marked influence of the total energy of the initiator on KATRP. TE is a measure of the overall reactivity of a molecule and can be related, in the present case, to both the reactivity of the carbon–halogen bond and the presence of radical stabilizing groups in the initiators.
Model | Equation | Friedman's LOF | R² | Adjusted R² | Cross-validated R² | SOR F-value | Critical SOR F-value (95%) |
---|---|---|---|---|---|---|---|
31 | −log(KATRP) = 5.9(EC) − 2.5(MR) + 2.7(JY) − 4.8(2χ) − 6.6 × 10⁻³(IX) − 1.5 × 10²(SE) − 3.5 × 10⁻³(TE) − 15(BE) + 13 | 86.7 × 10⁻³ | 0.996 | 0.992 | 0.956 | 259 | 3.45 |
32 | −log(KATRP) = −6.1(JX) + 7.9(JY) − 3.4(0χv) − 8.8 × 10⁻³(IX) + 49 × 10⁻³(SSOV) + 45(HOMO) − 1.2(δZ) + 9.9(C–X) − 8.0 | 88.5 × 10⁻³ | 0.996 | 0.992 | 0.977 | 254 | 3.45 |
33 | −log(KATRP) = −19(JX) + 21(JY) − 1.5(0χv) + 3.1(MD) − 9.7 × 10⁻³(IX) − 39 × 10⁻³(SSOV) + 1.1 × 10³(SE) − 69(LUMO) − 3.3(δZ) − 1.1 | 92.5 × 10⁻³ | 0.998 | 0.995 | 0.983 | 342 | 3.70 |
34 | −log(KATRP) = 4.1(EC) − 1.9(MR) + 2.4(JY) − 4.0(2χ) − 7.2 × 10⁻³(IX) − 2.7 × 10⁻³(TE) − 13(BE) − 0.58(δZ) + 12 | 93.5 × 10⁻³ | 0.996 | 0.992 | 0.984 | 240 | 3.45 |
35 | −log(KATRP) = 4.6(EC) − 2.3(MR) − 2.2(2χ) − 7.6 × 10⁻³(IX) − 0.14(ShAZX) − 3.3 × 10⁻³(TE) − 16(BE) + 22 | 98.1 × 10⁻³ | 0.993 | 0.988 | 0.966 | 186 | 3.30 |
36 | −log(KATRP) = −11(JX) + 12(JY) − 1.0(0χv) − 9.3 × 10⁻³(IX) + 21 × 10⁻³(SSOV) + 4.7 × 10²(SE) − 2.6 × 10⁻⁴(TE) − 31(L–H) − 2.0(δZ) + 11 | 99.7 × 10⁻³ | 0.998 | 0.994 | 0.970 | 320 | 3.70 |
37 | −log(KATRP) = −2.1(CN) − 0.59(EC) + 2.5(JY) − 3.5(0χv) − 9.8 × 10⁻³(IX) + 66 × 10⁻³(SSOV) − 1.8(δZ) + 9.2(C–X) − 21 | 0.107 | 0.995 | 0.990 | 0.965 | 207 | 3.45 |
38 | −log(KATRP) = −9.5(JX) + 10(JY) − 9.8 × 10⁻³(IX) + 4.8 × 10²(SE) − 3.4 × 10⁻⁴(TE) − 0.89(BE) − 1.7(δZ) + 9.1 | 0.109 | 0.992 | 0.986 | 0.974 | 167 | 3.30 |
39 | −log(KATRP) = 6.1(EC) − 2.7(MR) + 2.9(JX) − 5.5(2χ) − 6.4 × 10⁻³(IX) − 2.1 × 10⁻³(SE) − 3.8 × 10⁻³(TE) − 18(BE) + 14 | 0.111 | 0.995 | 0.990 | 0.960 | 202 | 3.45 |
40 | −log(KATRP) = −9.1(JX) + 10(JY) − 10 × 10⁻³(IX) + 10 × 10⁻³(SSOV) + 51 × 10²(SE) − 2.8 × 10⁻⁴(TE) − 1.8(δZ) + 6.6 | 0.111 | 0.992 | 0.986 | 0.961 | 163 | 3.30 |
Fig. 4 Calculated vs. experimental log(KATRP) values for alkyl halide initiators obtained by genetic function approximation analysis with linear polynomials. Training and test sets are represented by different symbols.
As seen with the catalysts, when linear splines were used as basis functions in the GFA algorithm, the resulting models, while containing fewer molecular descriptors (Tables S3 and S4, ESI†) and providing better predictions for the training set, failed to return reliable values for the test set (Fig. S8 and S9, ESI†).
Footnote
† Electronic supplementary information (ESI) available: cluster analysis dendrograms, loading plots, linear spline models and best fits. See DOI: 10.1039/c0py00058b
This journal is © The Royal Society of Chemistry 2010