Zhenzhuo Lan (a) and Shaama Mallikarjun Sharada* (a, b)
(a) Mork Family Department of Chemical Engineering and Materials Science, University of Southern California, Los Angeles, CA, USA. E-mail: ssharada@usc.edu
(b) Department of Chemistry, University of Southern California, Los Angeles, CA, USA
First published on 6th July 2021
A computational framework for ligand-driven design of transition metal complexes is presented in this work. We propose a general procedure for the construction of active site-specific linear free energy relationships (LFERs), which are inspired by Hammett and Taft correlations in organic chemistry and grounded in the activation strain model (ASM). Ligand effects are isolated and quantified in terms of their contribution to interaction and strain energy components of ASM. Scalar descriptors that are easily obtainable are then employed to construct the complete LFER. We successfully demonstrate proof-of-concept by constructing and applying an LFER to CH activation with enzyme-inspired [Cu2O2]2+ complexes. The key benefit of using ASM is a built-in compensation or error cancellation between LFER prediction of interaction and strain terms, resulting in accurate barrier predictions for 37 of the 47 catalysts examined in this study. The LFER is also transferable with respect to level of theory and flexible towards the choice of reference system. The absence of interaction-strain compensation or poor model performance for the remaining systems is a consequence of the approximate nature of the chosen interaction energy descriptor and LFER construction of the strain term, which focuses largely on trends in substrate and not catalyst strain.
Quantitative molecular catalyst design strategies can be categorized broadly into scaling relationships or volcano plots relying on energetic descriptors and valence bond approaches,9–13 those based on artificial intelligence and machine learning methods,14 and linear free energy relationships (LFERs) that rely on capturing ligand electronic and steric effects using a set of scalar, structural descriptors.8,15 LFERs describe reaction rate or equilibrium dependence on structural and electronic attributes of the system.16 This dependence is linear when the reaction mechanism is not altered upon variation of substituents. LFERs were originally developed as Hammett relationships in organic aromatic chemistry to describe the influence of (meta- or para-) substituents on the dissociation rate of benzoic acid. In Hammett relationships, the electrophilicity descriptor is a Hammett parameter (σm or σp) and the slope of the Hammett plot describes the sensitivity of the mechanism to substituent electrophilicity.17 Taft extended this framework to ortho-substituted groups via the inclusion of an additive term that captures the impact of substituent steric bulk.18 The resulting free energy relationship has two additive terms, one describing the electronic effect and another describing the steric effect of substituents on reaction rates or equilibria.
Such frameworks can offer systematic means to probe and predict the impact of ligands bound to a transition metal active site on catalyst performance metrics such as activity, selectivity, and stability.4 The construction of an LFER framework requires quantitative information regarding various ligand effects as well as the identification of appropriate scalar descriptors for these effects. It may seem intuitive to categorize ligand effects into their ability to tune the electronic character of the active site and their spatial extent, or steric bulk, that can impact substrate approach to the active site. These effects, however, are often interrelated and difficult to isolate.19,20 As a result, the Taft equation describing reaction rate dependence on substituent electronic and steric effects cannot be generally applied across reaction chemistries with transition metal complexes. Our goal is instead to avoid explicit classification of ligand effects into electronic and steric terms, and to develop an approach that relies on describing ligand effects in terms of directly measurable impacts on the activated catalyst–substrate system.
This study aims to establish a framework for constructing LFERs for CH activation with metal–oxygen active sites by quantifying trends in barriers obtained via systematic catalyst perturbations. Instead of breaking down ligand contributions into electronic and steric terms, we examine their impact on interaction and strain energy differences between transition and initial states of the catalyst–substrate system described by the activation strain model (ASM), also known as the distortion-interaction model.21,22 We utilize mechanistic findings from our prior study of CH activation with dioxo–dicopper complexes to calculate activation barriers using density functional theory (DFT).23 Our previous studies also identify descriptors for capturing trends in interaction23 and strain energies24 based on systematic ligand variations. An active site-specific LFER that combines both these effects is constructed in this work, as developing a single set of design rules that is valid across several active sites is a formidable task on account of the diversity in oxidation and spin states achievable with transition metal centers.25 We assess LFER accuracy in predicting activation barriers for several model [Cu2O2]2+ complexes constructed using imidazole N-donor ligands. The LFER is highly reliable, and predicts most barriers to within 10 kJ mol−1 of their DFT values. This is the consequence of a natural compensation or cancellation of errors observed between LFER predictions of constituent interaction and strain energies. Our functional and basis set tests show that the LFER is also transferable and can be used alongside multiple levels of theory. Analyses of outliers and sensitivity reveal that the Hammett parameter, used as a descriptor for ligand interaction effects, is the most likely cause for model deviations.
In addition, as the model largely captures strain arising from substrate deformation (C–H stretch) in the transition state, LFER predictions are poorer for systems where catalyst strain deviates significantly from the reference system. Therefore, while we successfully demonstrate proof-of-concept for ASM-driven construction of LFERs, there is a need for descriptors that are more appropriate for transition metal complexes. Future work, in addition to expansion to other active sites, will include the search for interaction descriptors that are more suitable for metal–ligand bonds (such as metal–ligand electronic parameters, MLEPs),26 which will enable construction of LFERs that are generalizable across several classes of ligands.
$$\Delta E^{\ddagger} = \Delta E_{\mathrm{INT}} + \Delta E_{\mathrm{STR}} \qquad (1)$$
To develop an LFER expression using ASM, we invoke the original equation put forth by Taft for ortho-substituted aromatics to quantify electronic and steric substituent effects:18
(2) |
By employing simplifying approximations, we can rewrite the Taft LFER using ASM. First, we assume that zero-point energies, enthalpy, entropy, and pre-exponential terms in the rate coefficient expression are less sensitive to the choice of ligand compared to electronic energies. Therefore, the LHS of eqn (2) can be rewritten as:
(3) |
(4) |
The proposed mechanism of CH activation has been debated extensively in the literature.46–52 By contrasting with experimental substrate variation-based curves,23 we demonstrate in our previous work that (in the absence of spin-crossing) the one-step oxo-insertion mechanism is preferred to the two-step radical recombination mechanism on the singlet potential energy surface. Fig. 1 represents the potential energy surface for the singlet oxo-insertion mechanism along with the geometries of initial, transition, and final states, labeled IS, TSoxo, and FS, respectively. The mechanism is concerted, and proceeds via simultaneous C–H and Cu–O bond breaking and C–O and O–H bond formation.
(5) |
While this analysis indicates that the Hammett parameter may be a reliable descriptor for interaction energies, we find poor fit to linear models for amine N-donors, which is expected as these amines are not aromatic.23 Furthermore, we are limited to using the Hammett parameter for the substituent only (e.g. CH3) rather than the entire N-donor (NH2CH3) as values (experimental or otherwise) for the latter are not readily available. Alternative scalar descriptors such as the charge of the active site oxygen atom or catalyst HOMO levels do not easily lend themselves to LFER construction. As a consequence of these limitations, we proceed with the Hammett parameter, σp, for LFER development but demonstrate proof-of-concept for complexes constructed using only imidazole N-donors. Going forward, our goal is to identify interaction descriptors that are more suitable for metal–ligand bonds than organic aromatic systems to develop LFERs that are transferable across various ligand backbones, which in this case consist of amines, diamines, imidazoles, and pyridines.
The resulting strain-only LFER, where ΔEINT is invariant, is determined to be:24
(6) |
(7) |
To apply these types of LFER models, one needs a DFT barrier calculation for a single reference system (ΔEref) and descriptors for the reference and the system of interest. The model (eqn (7)) can then predict the barrier (ΔE‡) for the system of interest without the need for several tedious transition structure and barrier calculations.
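The workflow described above can be sketched in a few lines of Python. This is only an illustration under a stated assumption: the LFER is taken to be linear in the descriptor differences between the system of interest and the reference. The function name `predict_barrier`, the descriptor keys, and the coefficient values are hypothetical placeholders and are not the fitted parameters of eqn (7). The reference barrier and descriptors are the Xim = H (S7) values reported in Table 3.

```python
from typing import Dict


def predict_barrier(ref_barrier: float,
                    coeffs: Dict[str, float],
                    descriptors: Dict[str, float],
                    ref_descriptors: Dict[str, float]) -> float:
    """Hypothetical linear LFER: the reference barrier plus a weighted sum of
    descriptor differences (system of interest minus reference)."""
    shift = sum(c * (descriptors[name] - ref_descriptors[name])
                for name, c in coeffs.items())
    return ref_barrier + shift


# Placeholder coefficients, chosen only to make the sketch runnable.
# They are NOT the values fitted in this work.
coeffs = {"sum_sigma_p": -1.0, "theta_1": 0.1, "theta_2": 0.1,
          "B1_1": 1.0, "B1_2": 1.0}

# Reference system S7 (Xim = H): descriptors and DFT barrier from Table 3.
ref_descriptors = {"sum_sigma_p": 0.00, "theta_1": 95.678, "theta_2": 95.803,
                   "B1_1": 1.65, "B1_2": 1.62}
ref_barrier = 82.2  # kJ/mol

# By construction, the reference system is reproduced exactly.
print(predict_barrier(ref_barrier, coeffs, ref_descriptors, ref_descriptors))
```

Only the single reference barrier comes from a transition-state calculation; every other input is a scalar descriptor, which is what makes this kind of model inexpensive to apply once its coefficients are known.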
Fig. 3 (a) Geometry of eleven imidazole groups substituted with Xim groups (S1–S11) and (b) geometry of monodentate [Cu2O2]2+ complexes with four imidazole groups.
Index | Xim,1 | Xim,2 | Xim,3 | Xim,4 |
---|---|---|---|---|
A1 | OCH3 | H | OCH3 | H |
A2 | CH3 | H | CH3 | H |
A3 | CF3 | H | CF3 | H |
A4 | NO2 | H | NO2 | H |
A5 | CH3 | OCH3 | CH3 | OCH3 |
A6 | OCH3 | CF3 | OCH3 | CF3 |
A7 | OCH3 | NO2 | OCH3 | NO2 |
A8 | CH3 | CF3 | CH3 | CF3 |
A9 | CH3 | NO2 | CH3 | NO2 |
A10 | CF3 | NO2 | CF3 | NO2 |
A11 | H | H | CF3 | H |
A12 | H | H | CH3 | H |
A13 | H | H | NO2 | H |
A14 | H | H | OCH3 | H |
A15 | H | H | CF3 | CF3 |
A16 | H | H | CH3 | CH3 |
A17 | H | H | NO2 | NO2 |
A18 | H | H | OCH3 | OCH3 |
A19 | H | CF3 | CF3 | H |
A20 | CH3 | H | H | CH3 |
A21 | H | NO2 | NO2 | H |
A22 | OCH3 | H | H | OCH3 |
A23 | CF3 | H | CF3 | CF3 |
A24 | CH3 | H | CH3 | CH3 |
A25 | H | NO2 | NO2 | NO2 |
A26 | OCH3 | H | OCH3 | OCH3 |
A27 | NO2 | CF3 | CF3 | NO2 |
A28 | CH3 | CF3 | CF3 | CH3 |
A29 | CH3 | NO2 | NO2 | CH3 |
A30 | OCH3 | CF3 | CF3 | OCH3 |
A31 | OCH3 | CH3 | CH3 | OCH3 |
A32 | OCH3 | NO2 | NO2 | OCH3 |
A33 | H | CH3 | H | OCH3 |
A34 | H | CF3 | H | OCH3 |
A35 | H | NO2 | H | OCH3 |
A36 | H | CF3 | H | CH3 |
A37 | H | NO2 | H | CH3 |
A38 | NO2 | H | CF3 | H |
We use the ab initio quantum chemistry software, Q-Chem (Version 5.1 and above)60 to carry out all DFT calculations. Barriers are calculated for CH4 activation using the range-separated hybrid functional, ωB97X-D,61,62 and the Stuttgart Relativistic Small Core (srsc) effective core potential for Cu63 and triple-ζ 6-311+G* basis set with diffuse and polarization functions for the remaining atoms. The level of theory is identical to that employed for constructing the strain LFER. For the interaction LFER, the same functional was employed alongside a smaller basis set (6-31G*). Wavefunction stability analysis is used to determine stable, broken-symmetry (BS) wavefunctions.64,65 The extent of spin contamination is found to be similar across all systems in both initial (〈S2〉average = 0.098) and transition structures (〈S2〉average = 1.284). Consistent with our LFER construction studies, the second-generation absolutely-localized molecular orbital-EDA, or ALMO-EDA, is employed to calculate the ΔEINT term in ASM.66–71 The LFER model prediction is deemed accurate if the predicted barrier is within 10 kJ mol−1 of the ‘true’ DFT value, in line with the rule of thumb typically applied to energy differences that are meaningfully resolvable by DFT.
An LFER model, once constructed, must be usable alongside any level of theory. In other words, a truly generalizable LFER requires DFT calculations at the desired level of theory for only a single reference system. The reference calculation and descriptors can then be plugged into eqn (7) to predict barriers for arbitrary ligands coordinated to the [Cu2O2]2+ center. To determine whether this is the case with our ASM-based LFER model, we identify density functional approximations that benchmarking studies demonstrate to be reliable predictors of activation barriers.74,75 We also examine sensitivity to size and choice of basis set. The chosen levels of theory are described in Table 2, where the ‘baseline’ refers to ωB97X-D/6-311+G* (srsc for Cu) utilized to assess LFER model performance. Strain descriptors and IS and TSoxo structures are calculated at these levels of theory to contrast LFER predictions with DFT results. Owing to difficulties converging optimization cycles for SCAN0, single point IS and TSoxo calculations are performed with SCAN0 on structures optimized at the baseline level of theory.
Purpose | Class | Functional | Basis set (effective core potential) |
---|---|---|---|
Baseline | Range-separated hybrid functional | ωB97X-D (ref. 76) | 6-311+G* (Cu: srsc) |
Functional test | Global hybrid functional | PBE0 (ref. 77) | 6-311+G* (Cu: srsc) |
Functional test | Hybrid meta-generalized gradient approximation | SCAN0 (ref. 78) | 6-311+G* (Cu: srsc) |
Functional test | Range-separated hybrid functional | ωB97X-V (ref. 79) | 6-311+G* (Cu: srsc) |
Basis set test | Range-separated hybrid functional | ωB97X-D | 6-31G* |
Basis set test | Range-separated hybrid functional | ωB97X-D | def2-SVP |
Basis set test | Range-separated hybrid functional | ωB97X-D | def2-SVPD |
Index | Xim | ∑σp | θ1 (°) | θ2 (°) | B1,1 (Å) | B1,2 (Å) | ΔE‡DFT (kJ mol−1) | ΔE‡LFER (kJ mol−1) |
---|---|---|---|---|---|---|---|---|
S1 | N(CH3)2 | −3.32 | 94.585 | 98.370 | 1.95 | 1.92 | 94.0 | 116.4 |
S2 | NHCH3 | −2.80 | 96.874 | 96.742 | 1.59 | 1.57 | 97.3 | 101.0 |
S3 | NH2 | −2.64 | 97.006 | 97.099 | 1.57 | 1.59 | 88.4 | 100.5 |
S4 | OH | −1.48 | 98.724 | 98.938 | 1.66 | 1.68 | 87.5 | 99.8 |
S5 | OCH3 | −1.08 | 98.754 | 98.896 | 1.67 | 1.70 | 102.5 | 97.7 |
S6 | CH3 | −0.68 | 92.672 | 92.937 | 1.72 | 1.71 | 85.0 | 83.2 |
S7 | H | 0.00 | 95.678 | 95.803 | 1.65 | 1.62 | 82.2 | 82.2 |
S8 | CHO | 1.68 | 95.278 | 95.264 | 1.62 | 1.63 | 81.4 | 69.7 |
S9 | COOH | 1.80 | 97.469 | 97.316 | 1.62 | 1.61 | 79.2 | 73.1 |
S10 | CF3 | 2.20 | 95.536 | 97.808 | 1.79 | 1.88 | 75.2 | 76.8 |
S11 | NO2 | 3.12 | 97.179 | 96.610 | 1.65 | 1.59 | 63.1 | 63.5 |
The parity plot in Fig. 4 depicts LFER predictions (ΔELFER) relative to DFT barriers (ΔEDFT) for both symmetric (S1–S11) and asymmetric (A1–A38) systems. We find that LFER predictions are reliable for nearly 80% of all the systems examined in this study. Out of 47 systems (excluding reference S7), 37 LFER predictions are within 10 kJ mol−1 of the DFT value, shown by the shaded region about the x = y parity line in Fig. 4. Of these, 23 predictions are within 5 kJ mol−1 of the DFT value.
Fig. 4 LFER predictions (ΔE‡LFER) vs. BS DFT barriers (ΔE‡DFT) at the ωB97X-D/srsc,6-311+G* level of theory (kJ mol−1) for 11 symmetrically substituted (S1–S11, blue) and 38 asymmetrically substituted catalysts (A1–A38, green). Points lying in the shaded region indicate LFER prediction is within 10 kJ mol−1 of the DFT value. All model outliers are marked in red and labeled “A”/“S” to indicate asymmetric/symmetrically substituted catalysts (structures listed in Tables 1 and 3, respectively).
For S1–S11, model predictions are within 10 kJ mol−1 of DFT results for 6 systems (excluding the S7 reference), with a mean absolute error (MAE) of 3.1 kJ mol−1. The overall MAE is higher (7.7 kJ mol−1) on account of 4 outliers, with substituents (Xim) N(CH3)2, NH2, OH and CHO. For the asymmetrically substituted A1–A38, LFER predictions are within 10 kJ mol−1 of DFT results for 32 complexes, with MAE = 4.1 kJ mol−1. The overall MAE is slightly higher at 5.7 kJ mol−1. It is worth noting that with only a few exceptions (A6, A10, A37), LFER accurately predicts barriers for mixed ligand systems that contain both electron-donating and withdrawing N-donors relative to reference Xim = H, namely A7, A8, A9, A28, A29, A30, A32, A34, A35, and A36, for which MAE = 4.8 kJ mol−1.
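The error statistics quoted for the symmetric systems can be checked directly from the DFT and LFER barriers in Table 3. The short Python sketch below does this; the variable names are ours, and the 10 kJ mol−1 threshold is the accuracy criterion stated in the computational details.

```python
# Symmetric systems from Table 3 (reference S7 excluded):
# index -> (DFT barrier, LFER barrier), both in kJ/mol.
barriers = {
    "S1": (94.0, 116.4), "S2": (97.3, 101.0), "S3": (88.4, 100.5),
    "S4": (87.5, 99.8), "S5": (102.5, 97.7), "S6": (85.0, 83.2),
    "S8": (81.4, 69.7), "S9": (79.2, 73.1), "S10": (75.2, 76.8),
    "S11": (63.1, 63.5),
}
THRESHOLD = 10.0  # kJ/mol, accuracy criterion used in this work

# Absolute LFER error for each system.
abs_err = {k: abs(lfer - dft) for k, (dft, lfer) in barriers.items()}

# Outliers are systems whose error exceeds the threshold.
outliers = sorted(k for k, e in abs_err.items() if e > THRESHOLD)
inliers = [k for k in abs_err if k not in outliers]

mae_all = sum(abs_err.values()) / len(abs_err)
mae_inliers = sum(abs_err[k] for k in inliers) / len(inliers)

print(outliers)               # ['S1', 'S3', 'S4', 'S8']
print(round(mae_all, 1))      # 7.7
print(round(mae_inliers, 1))  # 3.1
```

The four outliers correspond to Xim = N(CH3)2, NH2, OH, and CHO, in agreement with the text.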
Transferability tests using levels of theory described in Table 2 are carried out for S1–S11. Before contrasting LFER performance, we determine whether trends in DFT barriers across N-donor substitutions are consistent across various levels of theory. In Fig. 5, we report BS barrier difference (ΔΔE‡) between a substituted imidazole and the reference (Xim = H, S7). The black curve in Fig. 5 corresponds to our baseline level of theory. In general, ΔΔE‡ across various functional and basis set combinations are within 10 kJ mol−1 of the baseline. The sole exception is S10 (Xim = CF3), for which PBE0 and ωB97X-V barriers are 12.0 kJ mol−1 and 13.7 kJ mol−1 lower than the baseline, respectively.
Based on the general agreement in trends across various levels of theory, we contrast LFER predictions calculated using eqn (7) with these levels of theory. Descriptors for LFERs are obtained at the same level of theory as the reference calculation. These descriptors as well as the resulting DFT and LFER barriers are listed in Table S2 of ESI.† Fig. 6 depicts LFER performance for each level of theory. In all cases, LFER performance resembles the baseline scenario, with similar MAEs and outliers. Functional tests result in MAEs of 9.9, 8.8, and 10.0 kJ mol−1 for PBE0, SCAN0, and ωB97X-V, respectively. Basis set tests yield MAEs of 9.0, 8.4, and 7.6 kJ mol−1 for 6-31G*, def2-SVP, and def2-SVPD, respectively. The LFER is therefore transferable across basis sets and density functional approximations. We also identify between 3 and 5 outliers at every level of theory. While S1 is an outlier in all cases, small variations are observed in the remaining outliers compared to the baseline scenario.
Fig. 6 LFER predicted BS barriers vs. DFT barriers (kJ mol−1) for S1–S11 at levels of theory listed in Table 2. Points lying in the shaded region indicate LFER prediction is within 10 kJ mol−1 of the DFT value. All model outliers are marked in red and labeled “S” to indicate symmetrically substituted catalysts.
We also calculate the first-order sensitivity of the LFER model to the chosen descriptors for interaction (σp) and strain (B1, θ). Fig. S1 of ESI† depicts the first-order sensitivities with respect to each of the four σp values, and each of the two B1 and two θ values. The four σp values combined constitute the greatest determinant of LFER performance, with a total sensitivity of 70.3%. Model sensitivities to the two B1 and two θ descriptors are 12.8% and 16.9%, respectively. Therefore, LFER performance is least sensitive to one of the strain descriptors, B1, and most sensitive to the interaction descriptor, σp.
Index | ΔEINT,LFER − ΔEINT,DFT | ΔESTR,LFER − ΔESTR,DFT | ΔE‡LFER − ΔE‡DFT |
---|---|---|---|
S1 | 8.4 | 14.0 | 22.3 |
S2 | 13.0 | −9.3 | 3.7 |
S3 | 15.9 | −3.8 | 12.1 |
S4 | 5.8 | 6.6 | 12.4 |
S5 | 5.4 | −10.1 | −4.8 |
S6 | 5.0 | −6.8 | −1.8 |
S7 | 0.0 | 0.0 | 0.0 |
S8 | −1.3 | −10.4 | −11.7 |
S9 | −21.9 | 15.8 | −6.1 |
S10 | −9.9 | 11.4 | 1.6 |
S11 | −26.6 | 27.0 | 0.4 |
To construct the interaction component of the LFER (eqn (1)), we vary N-donor electrophilicity and capture the effect on ΔEINT by forcing ΔESTR to be constant. Constant strain is achieved by freezing the active site and substrate. The active site distances as well as the C–H stretch in TSoxo are therefore identical to the reference in all systems. Consider the cases wherein electron-withdrawing substituents are bound to imidazole (S8–S11). While the decrease in barrier relative to the reference S7 manifests purely in ΔEINT during model construction, the barrier is also lowered due to reduced C–H stretch in the fully relaxed system examined in this study. In the case of S11, the C–H stretch is 1.415 Å in TSoxo, reduced from 1.447 Å in the reference S7 system. This leads to a corresponding decrease in ΔESTR. In other words, in a completely relaxed system the barrier is redistributed between the ΔEINT and ΔESTR components of ASM compared to LFER construction wherein ΔESTR is invariant. For an electron-withdrawing substituent, therefore, the LFER model is expected to predict more negative ΔEINT than what is observed with DFT, or ΔEINT,LFER − ΔEINT,DFT ≤ 0. Correspondingly, the observed DFT strain energy is lower than the LFER strain energy, or ΔESTR,LFER − ΔESTR,DFT ≥ 0. The converse is true for electron-donating groups, for which LFER is expected to predict higher interaction and lower strain energies. This compensation enables cancellation of LFER errors in ASM components, leading to reliable barrier predictions for both the symmetric and asymmetrically substituted complexes. LFER transferability tests in Fig. 6 show that this compensation occurs irrespective of the choice of underlying level of theory.
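The sign argument above can be illustrated with the deviations listed in Table 4. The following Python sketch (variable and function names are ours) flags the systems in which the interaction and strain deviations share a sign and therefore add up instead of cancelling.

```python
# Table 4 deviations (LFER minus DFT, kJ/mol), reference S7 excluded:
# index -> (interaction deviation, strain deviation).
deviations = {
    "S1": (8.4, 14.0), "S2": (13.0, -9.3), "S3": (15.9, -3.8),
    "S4": (5.8, 6.6), "S5": (5.4, -10.1), "S6": (5.0, -6.8),
    "S8": (-1.3, -10.4), "S9": (-21.9, 15.8), "S10": (-9.9, 11.4),
    "S11": (-26.6, 27.0),
}


def compensates(d_int: float, d_str: float) -> bool:
    """True when the two deviations have opposite signs and partly cancel."""
    return d_int * d_str < 0


for name, (d_int, d_str) in deviations.items():
    total = d_int + d_str  # net barrier deviation implied by the two terms
    print(f"{name}: net = {total:+.1f} kJ/mol, "
          f"opposite signs = {compensates(d_int, d_str)}")

# Systems with same-sign deviations, where no cancellation is possible.
same_sign = [n for n, (a, b) in deviations.items() if not compensates(a, b)]
print(same_sign)  # ['S1', 'S4', 'S8']
```

Three of the four symmetric outliers (S1, S4, S8) have same-sign deviations. The fourth, S3, does show opposite signs, but the strain deviation (−3.8 kJ mol−1) is too small to offset the interaction deviation (15.9 kJ mol−1).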
We also believe that this compensation between interaction and strain energies is built into the activation strain model. We observe this phenomenon in the course of constructing the strain-only component of the LFER with bidentate diamine N-donors, wherein ΔEINT is invariant and all structures are fully relaxed.24 The strain model described by eqn (6) captures variations both in substrate strain (C–H stretch) induced by the spatial extent (or steric bulk) of the catalyst and strain arising from deformation of the catalyst. Substrate strain is typically the dominant contributor to ΔESTR and also exhibits higher sensitivity to N-donors. The outliers for the strain model consist of systems in which the catalyst ceases to be planar and exhibits bending in the TSoxo structure. We find that ΔEINT is no longer invariant for these outliers. While the likelihood of catalyst deformation in the transition structure may be challenging to determine a priori to aid model construction, we note an interesting trend, shown in Table S6 of ESI.† Model deviations in ΔESTR are directly correlated with the deviation of ΔEINT with a slope of unity (R2 = 0.93). In other words, concurrent with the decrease in overall observed strain, driven by a larger decrease in substrate strain and a small increase in catalyst strain, is an increase in interaction energy towards more repulsive catalyst–substrate interactions. While our limiting choice of an aromatic interaction descriptor precludes extending the compensation analysis to these diamine N-donors, this analysis highlights an important consequence of employing the ASM. The built-in compensation between interaction and strain components is leveraged effectively in the complete LFER despite the fact that individual interaction and strain components are poorly predicted by our model.
The choice of scalar descriptors and model construction to enable capture of all ligand interaction and strain effects using these descriptors are both critical to LFER performance. The analysis of model sensitivity shows that LFER model performance is determined to the greatest extent (70.3%) by the interaction descriptor, σp, with each of the four σp values contributing equally (17.6%). Our choice of para-substituted Hammett parameters is based on excellent agreement of Hammett slopes with experiment.23 Their use as descriptors is justified as long as we limit the scope of LFER applicability to aromatic N-donors (e.g. imidazoles). However, we make a critical assumption that the descriptor refers only to the property of the substituent (say, OCH3) and not the entire N-donor ligand (C3N2H3OCH3). The interaction of the complete N-donor with the active site is therefore only partially captured by our choice of σp, leading to possible LFER deviations from true barriers. The second consequence of choosing substituent and not N-donor σp is the fact that the impact of physical separation of the electron-donating or withdrawing group from the active site is incorporated into the slope of the interaction term, not unlike traditional Hammett analysis.17 This further limits the scope of ligands for which the LFER model is valid.
To overcome these issues associated with Hammett parameters, our objective going forward is to identify alternative descriptors that can directly capture the impact of the complete N-donor ligand on the active site. The metal–ligand electronic parameter (MLEP) and the associated bond-strength order (BSO), both measures of metal–ligand bond strength in transition metal complexes, show promise as descriptors because they yield the most direct description of metal–ligand coordination and are easily calculated from vibrational analysis.26,80,81 Unlike Hammett parameters, however, there is, to the best of our knowledge, little prior evidence of the use of MLEPs, BSOs, or related quantities in linear models similar to Hammett or Taft relationships. Examination of these parameters and construction of structure–function relationships therefore constitute future work towards constructing a generalizable LFER.
Although sensitivity analysis shows that the LFER model is less sensitive to strain descriptors (12.8% and 16.9% to the two B1 and two θ parameters, respectively), we find that these parameters overestimate strain contributions in some cases, leading to barrier deviations. For instance, S1 is the most extreme outlier among all the systems examined and for all levels of theory, as seen in Fig. 6. While we cannot eliminate the possibility that σp may be an inadequate descriptor of ΔEINT, Table 4 shows that, contrary to the expected compensation effect, the strain difference (ΔESTR,LFER − ΔESTR,DFT) is large and positive. The large LFER strain estimate can be traced back to the strain descriptor, Sterimol B1, for S1 in Table 3, which is much larger than the values obtained for the remaining symmetrically substituted systems. Although our prior study24 develops the strain LFER based on catalysts that possess a wide range of B1 (1.55–2.82 Å) and θ (71.848–100.811°), it is likely that the absence of compensation in S1 stems in part from B1 overestimating the importance of steric bulk.
The sources of error are more challenging to deconvolute for the remaining outliers among the symmetrically substituted catalysts – S3, S4, and S8. With S3, for instance, the absence of compensation could arise from a combination of σp overestimating N-donor electrophilicity and the fact that ΔESTR,LFER is very close to, instead of being significantly lower than, ΔESTR,DFT. Although the strain descriptors for S8 are very close to those of reference S7, the LFER strain is lower than the true value. To probe the origins of such deviations, we turn to EDA results for further decomposing strain energy into substrate and catalyst components.
Catalyst and substrate strain terms are obtained from isolated fragment energy differences between TSoxo and IS calculated using EDA. These components are listed in Table S4 of the ESI.† The catalyst component of strain is the consequence of two phenomena – (1) deformation of the catalyst in TSoxo, reflected in N-donor rotations relative to IS, and (2) the change in intrafragment interactions among N-donors on going from IS to TSoxo. Although the two are difficult to isolate, the latter can be approximately quantified by carrying out EDA between two N-donors bound to each Cu at their respective geometries in TSoxo and IS. For the S1–S11 systems, these intramolecular interactions (Table S5 of ESI†) have a non-negligible impact on lowering catalyst strain energy. The average catalyst strain energy is 63.1 kJ mol−1 for S1–S11, to which these intramolecular interactions contribute between −5.3 kJ mol−1 and −20.9 kJ mol−1, with an average of −12.4 kJ mol−1. While these quantities are significantly smaller than substrate strain (125.9 kJ mol−1 on average for S1–S11), we find that their trends across N-donors are non-negligible.
Applying this analysis to S8, we find that the catalyst strain energy is 12.0 kJ mol−1 in excess of the reference S7. It is therefore possible that failure to capture this component leads to lower LFER strain even though the compensation effect predicts positive ΔESTR,LFER − ΔESTR,DFT. We emphasize, however, that this deviation is not unique to S8 and that even S5 shows catalyst strain that is 12.2 kJ mol−1 higher than S7. The absence of adequate compensation or error cancellation between the two terms constituting the ASM framework is ultimately what leads to large errors in LFER barrier predictions.
A6 and A10 exhibit the largest deviations among asymmetrically substituted systems, with barriers overestimated by 15.4 and 17.7 kJ mol−1, respectively (Table S3 of ESI†). In both cases, the magnitude of the strain deviation is higher than the compensation offered by the interaction deviation. LFER overestimates strain energy because true catalyst strain is lower than the reference by 13.9 kJ mol−1 and 9.4 kJ mol−1 for A6 and A10, respectively. On the other hand, LFER underestimates barriers by 13.4 kJ mol−1 and 14.7 kJ mol−1 for A12 and A37, respectively. In A12, LFER underestimates strain by 14.5 kJ mol−1 but overestimates interaction by only 1.1 kJ mol−1, with the former originating, in part, from higher catalyst strain relative to reference S7. In A37, LFER underestimates both strain and interaction terms by 7.1 kJ mol−1 and 7.7 kJ mol−1, respectively. It is therefore possible that a combination of the factors described previously contributes to these deviations. For A16 and A17, although the deviations have opposite signs as expected, inadequate compensation between interaction and strain terms leads to poor barrier predictions, with errors of −10.8 kJ mol−1 and 11.5 kJ mol−1, respectively.
In summary, we find that the LFER model errors can exceed 10 kJ mol−1 in scenarios where individual interaction and strain deviations do not cancel each other. The absence of cancellation of errors can arise from the underlying choice of interaction and strain descriptors and/or because the strain component of the LFER is designed to primarily capture substrate strain.
We test the assumption of equal weighting of the interaction term by analyzing the dependence of barriers on the location of the N-donor relative to the Cu–O bond being broken in the course of CH activation. We select two monodentate amine N-donors (Xam = CH3, CF3) and substitute one of them in one of the four positions as shown in Fig. 3(b), with NH3 (Xam = H) groups as the remaining three N-donors. The procedure for calculating barriers is similar to the LFER construction method for the interaction term to minimize strain effects. The active site and substrate are constrained to the reference [Cu2(NH3)4O2]2+ relaxed geometry. Barriers are calculated at the ωB97X-D/srsc, 6-311+G* level of theory.
Table 5 shows that these barriers lie in a narrow range (within 10 kJ mol−1), spanning 4.7 kJ mol−1 for CH3 and 9.7 kJ mol−1 for CF3. Therefore, the assumption of equal N-donor contributions to the interaction term is reasonable and equal weights can be assigned to the interaction descriptors for N-donors coordinated to the active site. We note that this is valid exclusively in situations where the active site and the connecting N-atoms are planar. The relative contributions of various N-donors in non-planar complexes, which can occur when more than 2 N-donors are bound to each Cu, are yet to be examined.
N-Donor (Xam) | Position 1 | Position 2 | Position 3 | Position 4 |
---|---|---|---|---|
CH3 | 76.0 | 73.6 | 71.3 | 72.1 |
CF3 | 56.0 | 63.4 | 53.7 | 61.8 |
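The spans quoted for Table 5 follow from the maximum and minimum barrier in each row. A minimal Python check (variable names are ours; barriers in kJ mol−1, listed in order of positions 1 to 4) is:

```python
# Table 5 barriers (kJ/mol) for one substituted amine N-donor placed at
# positions 1-4, with NH3 at the remaining three positions.
barriers = {
    "CH3": [76.0, 73.6, 71.3, 72.1],
    "CF3": [56.0, 63.4, 53.7, 61.8],
}

# Span = highest minus lowest barrier across the four positions.
spans = {xam: round(max(vals) - min(vals), 1) for xam, vals in barriers.items()}
print(spans)  # {'CH3': 4.7, 'CF3': 9.7}
```

Both spans fall below the 10 kJ mol−1 accuracy threshold, although the CF3 span comes close to it.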
While the ranges of barriers in the analysis presented in Table 5 are within 10 kJ mol−1, the sensitivity of barriers to the position of the Xam = CF3 group may not be negligible. A subset of asymmetric catalysts described in Table 1 is employed to further examine these position effects on LFER predictions for fully relaxed systems. For example, consider A3, A15, and A19, each consisting of two Xim = CF3 and two Xim = H N-donors in distinct positions. In A3 and A15, each CF3-bound imidazole group is coordinated to a different Cu in trans- and cis-like configuration, respectively. In A19, both CF3-bound imidazole groups are coordinated to the same Cu. The DFT barriers lie in a range similar to the CF3 systems in Table 5 – 73.8 kJ mol−1 for A3, 84.0 kJ mol−1 for A15, and 82.4 kJ mol−1 for A19. The LFER interaction term is identical for all three systems. LFER predicts higher barriers for A3 (81.2 kJ mol−1) and A19 (81.5 kJ mol−1) compared to A15 (78.1 kJ mol−1), on account of proximity of the CF3 group to the substrate, reflected in one of the two B1 descriptors. The predicted barrier is very close to the DFT value for A19 and higher for A3, although still within the allowable margin of error. In the case of A15, the LFER prediction is lower than DFT by 5.9 kJ mol−1 on account of a modest increase in catalyst and substrate strain in the latter that is either not captured or not adequately compensated for by the model. A3 does not exhibit the added strain predicted by the model despite N-donor proximity to the substrate being similar to A19. This is because, unlike in the case of A19, the Xim,4 N-donor does not undergo twisting relative to the active site plane in A3. The remaining three N-donors show similar deformation in the two systems. Similar conclusions can be drawn from examining other asymmetric systems that vary only in positions, such as A4, A17, and A21, with two Xim = NO2 and two Xim = H N-donors.
Taken together with the results in Table 5, we conclude that N-donor positions do impact interaction and strain components of the ASM, although the resulting range in barriers is relatively narrow. This range may not be resolvable with high fidelity by the LFER model or even be usable in model construction to assign position-dependent weights to N-donor descriptors.
The LFER framework, constructed using the activation strain model by isolating and assigning descriptors for each contribution, constitutes a reliable predictor of activation energies for the N-donor class examined in this study. We find that the model relies on compensation or error cancellation between the interaction and strain components. The analysis presented in this work highlights key strengths of the model, including transferability with respect to level of theory and robustness with respect to choice of reference. We also identify critical model limitations that affect performance and transferability across N-donor classes, including those imposed by the choice of interaction descriptor and the incomplete description of catalyst strain effects.
The LFER construction procedure and resulting performance are illustrated for CH4 hydroxylation with enzyme-inspired [Cu2O2]2+ complexes. We combine the outcomes of our previous two studies that isolate interaction and strain effects via systematic N-donor ligand variations and determine that the former effect can be described by the Hammett parameter (σp) and the latter with a combination of bite angle (θ) and Sterimol parameter (B1). The resulting LFER can accurately predict barriers for most catalysts where the [Cu2O2]2+ is bound to arbitrary imidazole N-donors and is transferable with respect to level of theory. This is the consequence of a natural compensation observed between LFER predictions of interaction and strain energies.
Analysis of model sensitivity and outliers reveals that the choice of interaction descriptor is the most probable cause of model deviations in systems where this compensation is not observed. Use of the Hammett parameter as the descriptor severely limits the ligand space that can be explored with the current LFER and may not adequately describe the interaction effect of the entire ligand. To expand LFER catalyst scope and enable a more complete description of metal–ligand effects, our future work will explore alternative descriptors, specifically MLEPs. Another limitation of the LFER model is that it captures primarily trends in substrate strain and not catalyst strain, which can also lead to barrier deviations when the catalyst strain is far from the reference value. Going forward, we aim to construct a more complete LFER model by identifying a suitable set of N-donor variations to isolate and quantify trends in catalyst strain.
This work constitutes the first step towards developing a systematic, generalizable framework for active site-specific design of transition metal complex catalysts. We successfully demonstrate proof-of-concept that an LFER grounded in the activation strain model can predict activation barriers for catalysts constructed with imidazole N-donor ligands. Refining the choice of descriptors and quantifying more complex, intramolecular effects are essential next steps towards developing a truly generalizable LFER that can accurately predict barriers for a broader spectrum of ligands.
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d1cp02278d
This journal is © the Owner Societies 2021