A framework for constructing linear free energy relationships to design molecular transition metal catalysts

Zhenzhuo Lan; Shaama Mallikarjun Sharada

doi:10.1039/D1CP02278D

View PDF VersionPrevious ArticleNext Article

DOI: 10.1039/D1CP02278D (Paper) Phys. Chem. Chem. Phys., 2021, 23, 15543-15556

A framework for constructing linear free energy relationships to design molecular transition metal catalysts†

Zhenzhuo Lan ^a and Shaama Mallikarjun Sharada *^ab
^aMork Family Department of Chemical Engineering and Materials Science, University of Southern California, Los Angeles, CA, USA. E-mail: ssharada@usc.edu
^bDepartment of Chemistry, University of Southern California, Los Angeles, CA, USA

Received 23rd May 2021 , Accepted 6th July 2021

First published on 6th July 2021

Abstract

A computational framework for ligand-driven design of transition metal complexes is presented in this work. We propose a general procedure for the construction of active site-specific linear free energy relationships (LFERs), which are inspired from Hammett and Taft correlations in organic chemistry and grounded in the activation strain model (ASM). Ligand effects are isolated and quantified in terms of their contribution to interaction and strain energy components of ASM. Scalar descriptors that are easily obtainable are then employed to construct the complete LFER. We successfully demonstrate proof-of-concept by constructing and applying an LFER to CH activation with enzyme-inspired [Cu₂O₂]²⁺ complexes. The key benefit of using ASM is a built-in compensation or error cancellation between LFER prediction of interaction and strain terms, resulting in accurate barrier predictions for 37 of the 47 catalysts examined in this study. The LFER is also transferable with respect to level of theory and flexible towards the choice of reference system. The absence of interaction-strain compensation or poor model performance for the remaining systems is a consequence of the approximate nature of the chosen interaction energy descriptor and LFER construction of the strain term, which focuses largely on trends in substrate and not catalyst strain.

Zhenzhuo Lan

Zhenzhuo is a PhD candidate in the Mork Family Department of Chemical Engineering and Materials Science at the University of Southern California and a recipient of the Graduate School Merit Fellowship and the TLARGI Fellowship at USC. She is studying mechanisms of CH activation and ligand sensitivities with oxygen-activated mono- and dicopper catalysts, and developing generalizable ligand design frameworks for molecular catalysts. She completed her BS in Chemical Engineering at Tianjin University in 2017.

Shaama Mallikarjun Sharada

Shaama is the WiSE Gabilan Assistant Professor in the Mork Family Department of Chemical Engineering and Materials Science since 2017 and Assistant Professor (by Courtesy) in the Department of Chemistry at the University of Southern California since 2020. Her research interests span the development and application of quantum chemistry and dynamics methods to design molecular, heterogeneous, and photo-catalysts for sustainable chemistry. She received her Bachelors and Masters in Chemical Engineering from the Indian Institute of Technology, Bombay (India) where she was awarded the Institute Gold Medal. She received her PhD in Chemical Engineering from UC Berkeley (2015) and completed her postdoctoral training in Chemical Engineering at Stanford University. She is a recipient of the 2020 ACS Petroleum Research Fund Doctoral New Investigator Award and is a 2020 Scialog Fellow for the Negative Emissions Science initiative.

1 Introduction

In the field of homogeneous catalysis with molecular transition metal complexes, there is a growing need for systematic approaches to probe the influence of active site-bound ligands on catalytic activity, as this is a necessary step towards rational catalyst screening and design. Significant progress has been made in screening and design of heterogeneous catalysts in recent decades.^1,2 Experiment-^3,4 and theory-based^5–8 screening methods for molecular catalysts are still emerging. This is largely a consequence of complex interplay between metal oxidation and spin states, ligands, substrates, solvent, counter-ions, and other factors, whose individual and combined contributions to catalytic activity can be challenging to capture.

Quantitative molecular catalyst design strategies can be categorized broadly into scaling relationships or volcano plots relying on energetic descriptors and valence bond approaches,^9–13 those based on artificial intelligence and machine learning methods,¹⁴ and linear free energy relationships (LFERs) that rely on capturing ligand electronic and steric effects using a set of scalar, structural descriptors.^8,15 LFERs describe reaction rate or equilibrium dependence on structural and electronic attributes of the system.¹⁶ This dependence is linear when the reaction mechanism is not altered upon variation of substituents. LFERs were originally developed as Hammett relationships in organic aromatic chemistry to describe the influence of (meta- or para-) substituents on the dissociation rate of benzoic acid. In Hammett relationships, the electrophilicity descriptor is a Hammett parameter (σ_m or σ_p) and the slope of the Hammett plot describes the sensitivity of the mechanism to substituent electrophilicity.¹⁷ Taft extended this framework to ortho-substituted groups via the inclusion of an additive term that captures the impact of substituent steric bulk.¹⁸ The resulting free energy relationship has two additive terms, one describing the electronic effect and another describing the steric effect of substituents on reaction rates or equilibria.

Such frameworks can offer systematic means to probe and predict the impact of ligands bound to a transition metal active site on catalyst performance metrics such as activity, selectivity, and stability.⁴ The construction of an LFER framework requires quantitative information regarding various ligand effects as well as the identification of appropriate scalar descriptors for these effects. It may seem intuitive to categorize ligand effects into their ability to tune the electronic character of the active site and their spatial extent, or steric bulk, that can impact substrate approach to the active site. These effects however, are often interrelated and difficult to isolate.^19,20 As a result, the Taft equation describing reaction rate dependence on substituent electronic and steric effects cannot be generally applied across reaction chemistries with transition metal complexes. Our goal is to instead avoid explicit classification of ligand effects into electronic and steric terms, and develop an approach that relies on describing ligand effects in terms of directly measurable impacts on the activated catalyst–substrate system.

This study aims to establish a framework for constructing LFERs for CH activation with metal–oxygen active sites by quantifying trends in barriers obtained via systematic catalyst perturbations. Instead of breaking down ligand contributions into electronic and steric terms, we examine their impact on interaction and strain energy differences between transition and initial states of the catalyst–substrate system described by the activation strain model (ASM), also known as the distortion-interaction model.^21,22 We utilize mechanistic findings from our prior study of CH activation with dioxo–dicopper complexes to calculate activation barriers using density functional theory (DFT).²³ Our previous studies also identify descriptors for capturing trends in interaction²³ and strain energies²⁴ based on systematic ligand variations. An active site-specific LFER that combines both these effects is constructed in this work, as developing a single set of design rules that is valid across several active sites is a formidable task on account of the diversity in oxidation and spin states achievable with transition metal centers.²⁵ We examine LFER accuracies in predicting activation barriers by examining several model [Cu₂O₂]²⁺ complexes constructed using imidazole N-donor ligands. The LFER is highly reliable, and predicts most barriers to within 10 kJ mol⁻¹ of their DFT values. This is the consequence of a natural compensation or cancellation of errors observed between LFER predictions of constituent interaction and strain energies. Our functional and basis set tests show that the LFER is also transferable and can be used alongside multiple levels of theory. Analyses of outliers and sensitivity reveal that the Hammett parameter, used as a descriptor for ligand interaction effects, is the most likely cause for model deviations. In addition, as the model largely captures strain arising from substrate deformation (C-H stretch) in the transition state, LFER predictions are poorer for systems where catalyst strain deviates significantly from the reference system. Therefore, while we successfully demonstrate proof-of-concept for ASM-driven construction of LFERs, there is a need for descriptors that are more appropriate for transition metal complexes. Future work, in addition to expansion to other active sites, will include the search for interaction descriptors that are more suitable for metal–ligand bonds (such as metal ligand electronic parameters, MLEPs)²⁶ which will enable construction of LFERs that are generalizable across several classes of ligands.

2 Methods

In this section, we outline the procedure for constructing a generalizable LFER model and illustrate its application to CH activation with model [Cu₂O₂]²⁺ complexes. We choose the dioxo–dicopper active site as it is found to selectively oxidize various types of CH bonds in several enzymes and their synthetic mimics.^27,28

2.1 Activation strain model & LFER

The guiding hypothesis for this work is that ligand effects on reaction kinetics can be completely characterized by their impact on changes in interaction between catalyst and substrate and strain induced in the process of bond breakage/formation. The LFER model can be constructed by identifying suitable descriptors for capturing these interaction and strain energies. To this end, we employ the activation strain model or the distortion-interaction model. ASM proposes that the activation energy for a reaction step (ΔE^‡) comprises destabilizing strain (ΔE_STR) induced by geometric deformation from the initial to transition structure and stabilizing interaction between catalyst and substrate (ΔE_INT).^22,29


ΔE^‡ = ΔE_INT + ΔE_STR	(1)

where Δ stands for the difference between transition and initial states. ΔE^‡ is computed using DFT coupled with transition structure search methods^30,31 and ΔE_INT can be determined using energy decomposition analysis (EDA).³² ASM has been extensively employed to better understand reaction mechanisms in terms of ligand contributions to transition state stabilization vis-á-vis geometric deformation, as well as to analyze catalyst–substrate interactions along the entire reaction path to uncover unusual reaction characteristics.^22,33 However, to the best of our knowledge, ASM has not been previously conceived of as a framework upon which LFERs can be constructed.

To develop an LFER expression using ASM, we invoke the original equation put forth by Taft for ortho-substituted aromatics to quantify electronic and steric substituent effects:¹⁸


	(2)

The left-hand side (LHS) represents the logarithm of the ratio of reaction rate coefficients for a substituted system to a reference system. Substituent effects on reaction rates are mapped using reactants’ property descriptors – σ for electronic and E_s for steric effects. Specifically, these are descriptor differences calculated with respect to a reference system. In the case of benzoic acid dissociation, for example, the reference is the unsubstituted benzoic acid (or substituent = H), for which σ = 0. The slopes, ρ and δ, quantify the sensitivity of the reaction to electronic and steric substituent effects, respectively. The relationship is valid (linear) when the mechanism of the reaction is not altered relative to the reference by the substituents.

By employing simplifying approximations, we can rewrite the Taft LFER using ASM. First, we assume that zero-point energies, enthalpy, entropy, and pre-exponential terms in the rate coefficient expression are less sensitive to the choice of ligand compared to electronic energies. Therefore, the LHS of eqn (2) can be rewritten as:


	(3)

where ΔE^‡ refers to the broken symmetry electronic energy difference between transition and initial states. Second, we isolate interaction and strain effects rather than the original Taft formalism of separating electronic and steric effects. This is because the former are directly computed using ASM while the latter can be difficult to quantify for ligands bound to transition metal active sites. In other words,


	(4)

The equivalence is shown only to illustrate parallels between our LFER and the Taft formalism. It does not imply a one-to-one correlation between interaction energy (strain energy) and electronic (steric) effects. However, we utilize these parallels as useful guides for the selection of descriptors. The subsequent steps of LFER construction are illustrated using CH activation with the goal of designing N-donor ligands for enzyme-inspired dioxo–dicopper ([Cu₂O₂]²⁺) catalysts.

2.2 LFER construction: CH activation with dioxo–dicopper complexes

Mechanism. In an earlier study of CH₄ activation, we compute Hammett curves via systematic variation of N-donors coordinated to the [Cu₂O₂]²⁺ center with the goal of identifying the preferred reaction mechanism.²³ [Cu₂O₂]²⁺ exhibits two isomeric forms – μ–η²:η² peroxo (P) and bis(μ-oxo) (O),³⁴ with relative prevalence and equilibrium between the two determined by various factors including ligand field, solvent, and counterions.^35–37 Although the P isomer is a quintessential multi-reference system that cannot be adequately treated with DFT,^38–45 our previous mechanistic study finds that P → O isomerization precedes CH activation.²³ Therefore, without loss of generality, the O isomer is utilized as the single-reference ground state in our previous works and the current study.

The proposed mechanism of CH activation has been debated extensively in the literature.^46–52 By contrasting with experimental substrate variation-based curves,²³ we demonstrate in our previous work that (in the absence of spin-crossing) the one-step oxo-insertion mechanism is preferred to the two-step radical recombination mechanism on the singlet potential energy surface. Fig. 1 represents the potential energy surface for the singlet oxo-insertion mechanism along with the geometries of initial, transition, and final states, labeled IS, TSoxo, and FS, respectively. The mechanism is concerted, and proceeds via simultaneous C–H and Cu–O bond breaking and C–O and O–H bond formation.


	Fig. 1 Potential energy diagram for the singlet oxo-insertion mechanism of CH activation with [Cu₂(C₃N₂H₄)₄O₂]²⁺. X-Axis represents reaction progress. IS, TSoxo, and FS correspond to initial, transition, and final states, respectively. Y-Axis is the broken symmetry electronic energy (Cu: ochre, O: red, N: blue, C: grey).

Interaction term. With oxo-insertion determined to be the preferred mechanism, we leverage the ASM to quantify N-donor effects. To probe trends in the ΔE_INT term with ligand variation, we choose the [Cu₂(NH₃)₄O₂]²⁺ model complex, where each Cu is bound to two NH₃ ligands, as the reference system. Two sets of N-donors with amine and imidazole backbones are chosen. For the amine N-donors, the active site and substrate are then frozen in the relaxed configuration obtained for the reference system and NH₃ is systematically replaced with substituted amine (NH₂X_am) groups of varying electrophilicity (X_am = OCH₃, CH₃, H, CF₃, NO₂). The procedure is repeated for the imidazole N-donors (X_im = OCH₃, CH₃, H, CF₃, NO₂). As a consequence of freezing the active site and substrate, ΔE_STR is invariant to substitutions. For the imidazole N-donors, we then obtain,


	(5)

where i (i = 1,2,3,4) refers to each of the four N-donor bound to the active site (all identical in this case,

) and the non-zero intercept is a consequence of choosing the amine reference (X_am = H for ΔE_ref and ΔE_INT). Using the Hammett parameter associated with para-substituted ligands, σ_p, as the electronic descriptor, we find that the slope ρ = −2.28 is in excellent agreement with experiment (−2.2) at 148 K (−125 °C).⁵³ Therefore, the mechanism is sensitive to choice of ligand electrophilicity and TSoxo is more stabilized when the active site is coordinated to electron-withdrawing ligands compared to electron-donating ligands. These variations in stabilization are captured in the ΔE_INT component of ASM.

While this analysis indicates that the Hammett parameter may be a reliable descriptor for interaction energies, we find poor fit to linear models for amine N-donors, which is expected as these amines are not aromatic.²³ Furthermore, we are limited to using the Hammett parameter for the substituent only (e.g. CH₃) rather than the entire N-donor (NH₂CH₃) as values (experimental or otherwise) for the latter are not readily available. Alternative scalar descriptors such as the charge of the active site oxygen atom or catalyst HOMO levels do not easily lend themselves to LFER construction. As a consequence of these limitations, we proceed with the Hammett parameter, σ_p, for LFER development but demonstrate proof-of-concept for complexes constructed using only imidazole N-donors. Going forward, our goal is to identify interaction descriptors that are more suitable for metal–ligand bonds than organic aromatic systems to develop LFERs that are transferable across various ligand backbones, which in this case consist of amines, diamines, imidazoles, and pyridines.

Strain term. Following the development of the LFER interaction component,²³ we isolate trends in ΔE_STR while keeping ΔE_INT constant by varying bidentate diamine N-donors coordinated to the [Cu₂O₂]²⁺ active center.²⁴ The strain model is designed to capture changes in CH stretch and catalyst deformation in the transition structure (TSoxo), induced predominantly by the size of the bidentate N-donor. We find that, unlike the Taft equation that uses a single scalar descriptor E_s to describe steric effects for ortho-substitutions to aromatic substrates, we need more than one descriptor to quantify the impact of varying N-donor backbone, branching, and substitution on ΔE_STR. A combination of two parameters – the N–Cu–N bite angle, θ, and Sterimol parameter,^54–57B1 – is necessary to describe N-donor strain effects. The Sterimol parameter B1 is defined as the minimum width of N-donor, which is a projected width when looking down a Cu–N bond. Since only two out of the total four N-donors are usually close to and interact with the substrate in a transition state, we identify the two B1 values of N-donors closest to the substrate as one set of strain descriptors. Sterimol parameters are determined with Bondi radii⁵⁸ using the Python package developed by Brethomé A. V. et al.⁵⁹ One pair of bite angles and one pair of Sterimol parameters for an example system (X_imi = CH₃) are illustrated in Fig. 2. All steric descriptors are calculated using optimized IS geometries as shown in Fig. 2(b).


	Fig. 2 (a) An illustration of Sterimol parameters for imidazole N-donor substituted with CH₃, where L is length of the group along a user-defined axis (Cu–N) and B1 is the minimum width when looking down from the axis. The third Sterimol parameter, which is the corresponding maximum width, B5, is not shown here. (b) Representation of strain descriptors θ and B1 in IS. All strain descriptors are calculated using optimized IS geometries. (Cu: ochre, O: red, N: blue, C: grey).

The resulting strain-only LFER, where ΔE_INT is invariant, is determined to be:²⁴


	(6)

As strain energies are dominated by substrate C–H stretch, which also shows higher sensitivity to ligand variations compared to catalyst strain, we believe that the LFER model mainly captures ligand impact on substrate strain. We find that bulkier ligands with larger bite angles or spatial extent tend to yield higher barriers. This is because the CH bond stretch in TSoxo increases with increasing B1 and/or θ, leading to larger ΔE_STR terms.

Complete LFER. In this work, our objective is to combine the two LFER components determined in previous studies and examine its accuracy in predicting barriers for a broad range of [Cu₂O₂]²⁺ complexes coordinated to monodentate imidazole N-donors. We identify model outliers to analyze sources of error, and determine LFER transferability with respect to level of theory. Assuming that the previously determined LFER components are additive as each of them captures only one of the two ligand effects, we obtain a complete LFER by adding eqn (1) and (2). For CH activation with [Cu₂O₂]²⁺ complexes where Cu is coordinated in a planar fashion with oxygen and N-donors, the LFER is given by:


	(7)

where T is set to 148 K (−125 °C) to match the experimental study used to contrast Hammett slopes.⁵³ Note that the slope of the interaction component of the LFER is slightly more negative than reported in our previous work²³ and eqn (1) on account of setting the intercept of the Hammett curve to zero by employing X_im = H as the reference.

To apply these types of LFER models, one needs a DFT barrier calculations for a single reference system (ΔE_ref) and descriptors for the reference and system of interest. The model (eqn (7)) can then predict the barrier (ΔE^‡) for the system of interest without the need for several tedious transition structure and barrier calculations.

2.3 LFER model performance tests

We examine the viability of the proposed framework for LFER construction by analyzing the accuracy of eqn (7). This work probes ΔE predictions for fully relaxed systems where active sites are bound to monodentate imidazole N-donors (C₃N₂H₃X_im). X_im is substituted at the C atom between the two N atoms as shown in Fig. 3(a). Fig. 3(b) represents the geometry of [Cu₂O₂]²⁺ complexes with four imidazole groups. Eleven symmetric complexes are constructed using four identical X_im groups, denoted S1–S11. In addition, a diverse set of 38 [Cu₂O₂]²⁺ complexes are obtained by mixed, asymmetric substitution of X_im groups as shown in Table 1, named A1–A38, where both the types of N-donors as well as their positions (illustrated in Fig. 3(b)) are varied.


	Fig. 3 (a) Geometry of eleven imidazole groups substituted with X_im groups (S1–S11) and (b) geometry of monodentate [Cu₂O₂]²⁺ complexes with four imidazole groups.

Table 1 Imidazole N-donors for asymmetrically substituted [Cu₂O₂]²⁺ complexes, where X_im,i (i = 1, 2, 3, 4) are four substitution groups, with the 1–4 positions described in Fig. 3(b)

Index	X_im,₁	X_im,₂	X_im,₃	X_im,₄
A1	OCH₃	H	OCH₃	H
A2	CH₃	H	CH₃	H
A3	CF₃	H	CF₃	H
A4	NO₂	H	NO₂	H
A5	CH₃	OCH₃	CH₃	OCH₃
A6	OCH₃	CF₃	OCH₃	CF₃
A7	OCH₃	NO₂	OCH₃	NO₂
A8	CH₃	CF₃	CH₃	CF₃
A9	CH₃	NO₂	CH₃	NO₂
A10	CF₃	NO₂	CF₃	NO₂
A11	H	H	CF₃	H
A12	H	H	CH₃	H
A13	H	H	NO₂	H
A14	H	H	OCH₃	H
A15	H	H	CF₃	CF₃
A16	H	H	CH₃	CH₃
A17	H	H	NO₂	NO₂
A18	H	H	OCH₃	OCH₃
A19	H	CF₃	CF₃	H
A20	CH₃	H	H	CH₃
A21	H	NO₂	NO₂	H
A22	OCH₃	H	H	OCH₃
A23	CF₃	H	CF₃	CF₃
A24	CH₃	H	CH₃	CH₃
A25	H	NO₂	NO₂	NO₂
A26	OCH₃	H	OCH₃	OCH₃
A27	NO₂	CF₃	CF₃	NO₂
A28	CH₃	CF₃	CF₃	CH₃
A29	CH₃	NO₂	NO₂	CH₃
A30	OCH₃	CF₃	CF₃	OCH₃
A31	OCH₃	CH₃	CH₃	OCH₃
A32	OCH₃	NO₂	NO₂	OCH₃
A33	H	CH₃	H	OCH₃
A34	H	CF₃	H	OCH₃
A35	H	NO₂	H	OCH₃
A36	H	CF₃	H	CH₃
A37	H	NO₂	H	CH₃
A38	NO₂	H	CF₃	H

We use the ab initio quantum chemistry software, Q-Chem (Version 5.1 and above)⁶⁰ to carry out all DFT calculations. Barriers are calculated for CH₄ activation using the range-separated hybrid functional, ωB97X-D,^61,62 and the Stuttgart Relativistic Small Core (srsc) effective core potential for Cu⁶³ and triple-ζ 6-311+G* basis set with diffuse and polarization functions for the remaining atoms. The level of theory is identical to that employed for constructing the strain LFER. For the interaction LFER, the same functional was employed alongside a smaller basis set (6-31G*). Wavefunction stability analysis is used to determine stable, broken-symmetry (BS) wavefunctions.^64,65 The extent of spin contamination is found to be similar across all systems in both initial (〈S²〉_average = 0.098) and transition structures (〈S²〉_average = 1.284). Consistent with our LFER construction studies, the second-generation absolutely-localized molecular orbital-EDA, or ALMO-EDA, is employed to calculate the ΔE_INT term in ASM.^66–71 The LFER model prediction is deemed accurate if the predicted barrier is within 10 kJ mol⁻¹ of the ‘true’ DFT value, in line with the rule of thumb typically applied to energy differences that are meaningfully resolvable by DFT.

2.4 Model sensitivity & transferability

Sensitivity analysis refers to the examination of uncertainty in model output. The analysis determines the source of uncertainty, or the degree to which the model output varies with variation of each input parameter. This analysis is aimed at identifying model sensitivity to the three sets of LFER descriptors – σ_p, Sterimol B1, and bite angle θ. We select a global variance-based sensitivity analysis, known as the Sobol method⁷² implemented in the SALib Python package.⁷³ A first-order Sobol sensitivity analysis calculates the percentage influence of the variance of each descriptor on the barrier variance and thus determines the relative significance of descriptors.

An LFER model, once constructed, must be usable alongside any level of theory. In other words, a truly generalizable LFER requires DFT calculations at the desired level of theory for only a single reference system. The reference calculation and descriptors can then be plugged into eqn (7) to predict barriers for arbitrary ligands coordinated to the [Cu₂O₂]²⁺ center. To determine whether this is the case with our ASM-based LFER model, we identify density functional approximations that benchmarking studies demonstrate to be reliable predictors of activation barriers.^74,75 We also examine sensitivity to size and choice of basis set. The chosen levels of theory are described in Table 2, where the ‘baseline’ refers to ωB97X-D/6-311+G* (srsc for Cu) utilized to assess LFER model performance. Strain descriptors and IS and TSoxo structures are calculated at these levels of theory to contrast LFER predictions with DFT results. Owing to difficulties converging optimization cycles for SCAN0, single point IS and TSoxo calculations are performed with SCAN0 on structures optimized at the baseline level of theory.

Table 2 Levels of theory chosen for examining model transferability

Purpose	Class	Functional	Basis set (effective core potential)
Baseline	Range-separated hybrid functional	ωB97X-D⁷⁶	6-311+G* (Cu: srsc)
Functional test	Global hybrid functional	PBE0⁷⁷	6-311+G* (Cu: srsc)
Functional test	Hybrid meta-generalized gradient approximation	SCAN0⁷⁸	6-311+G* (Cu: srsc)
Functional test	Range-separated hybrid functional	ωB97X-V⁷⁹	6-311+G* (Cu: srsc)
Basis set test	Range-separated hybrid functional	ωB97X-D	6-31G*
Basis set test	Range-separated hybrid functional	ωB97X-D	def-SVP
Basis set test	Range-separated hybrid functional	ωB97X-D	def-SVPD

3 Results

Table 3 shows the interaction and strain descriptors and CH activation barriers for the symmetrically substituted (S1–S11) complexes. Note that even though the N-donor variation is based on electrophilicity of the substituent and S1–S11 are ordered with respect to increasing electron-withdrawing character of the substituent, DFT barriers do not necessarily follow the same trend. The corresponding data for asymmetrically substituted A1–A38 systems is reported in the ESI† (Table S1).

Table 3 Sum of four Hammett parameters σ_p, bite angles θ (°) and Sterimol parameters B1 (Å), BS barriers ΔE^‡_DFT calculated at the baseline level of theory and ΔE^‡_LFER predicted by LFER model for eleven symmetric substituted catalysts (barriers in kJ mol⁻¹)

Index	X_im	∑σ_p	θ ₁	θ ₂	B1₁	B1₂	ΔE^‡_DFT	ΔE^‡_LFER
S1	N(CH₃)₂	−3.32	94.585	98.370	1.95	1.92	94.0	116.4
S2	NHCH₃	−2.80	96.874	96.742	1.59	1.57	97.3	101.0
S3	NH₂	−2.64	97.006	97.099	1.57	1.59	88.4	100.5
S4	OH	−1.48	98.724	98.938	1.66	1.68	87.5	99.8
S5	OCH₃	−1.08	98.754	98.896	1.67	1.70	102.5	97.7
S6	CH₃	−0.68	92.672	92.937	1.72	1.71	85.0	83.2
S7	H	0.00	95.678	95.803	1.65	1.62	82.2	82.2
S8	CHO	1.68	95.278	95.264	1.62	1.63	81.4	69.7
S9	COOH	1.80	97.469	97.316	1.62	1.61	79.2	73.1
S10	CF₃	2.20	95.536	97.808	1.79	1.88	75.2	76.8
S11	NO₂	3.12	97.179	96.610	1.65	1.59	63.1	63.5

The parity plot in Fig. 4 depicts LFER predictions (ΔE_LFER) relative to DFT barriers (ΔE_DFT) for both symmetric (S1–S11) and asymmetric (A1–A38) systems. We find that LFER predictions are reliable for nearly 80% of all the systems examined in this study. Out of 47 systems (excluding reference S7), 37 LFER predictions are within 10 kJ mol⁻¹ of the DFT value, shown by the shaded region about the x = y parity line in Fig. 4. Of these, 23 predictions are within 5 kJ mol⁻¹ of the DFT value.


	Fig. 4 LFER predictions (ΔE^‡_LFER) vs. BS DFT barriers (ΔE^‡_DFT) at the ωB97X-D/srsc,6-311+G* level of theory (kJ mol⁻¹) for 11 symmetrically substituted (S1–S11, blue) and 38 asymmetrically substituted catalysts (A1–A38, green). Points lying in the shaded region indicate LFER prediction is within 10 kJ mol⁻¹ of the DFT value. All model outliers are marked in red and labeled “A”/“S” to indicate asymmetric/symmetrically substituted catalysts (structures listed in Tables 1 and 3, respectively).

For S1–S11, model predictions are within 10 kJ mol⁻¹ of DFT results for 6 systems (excluding the S7 reference), with a mean absolute error (MAE) of 3.1 kJ mol⁻¹. The overall MAE is higher (7.7 kJ mol⁻¹) on account of 4 outliers, with substituents (X_im) N(CH₃)₂, NH₂, OH and CHO. For the asymmetrically substituted A1–A38, LFER predictions are within 10 kJ mol⁻¹ of DFT results for 32 complexes, with MAE = 4.1 kJ mol⁻¹. The overall MAE is slightly higher at 5.7 kJ mol⁻¹. It is worth noting that with only a few exceptions (A6, A10, A37), LFER accurately predicts barriers for mixed ligand systems that contain both electron-donating and withdrawing N-donors relative to reference X_im = H, namely A7, A8, A9, A28, A29, A30, A32, A34, A35, and A36, for which MAE = 4.8 kJ mol⁻¹.

Transferability tests using levels of theory described in Table 2 are carried out for S1–S11. Before contrasting LFER performance, we determine whether trends in DFT barriers across N-donor substitutions are consistent across various levels of theory. In Fig. 5, we report BS barrier difference (ΔΔE^‡) between a substituted imidazole and the reference (X_im = H, S7). The black curve in Fig. 5 corresponds to our baseline level of theory. In general, ΔΔE^‡ across various functional and basis set combinations are within 10 kJ mol⁻¹ of the baseline. The sole exception is S10 (X_im = CF₃), for which PBE0 and ωB97X-V barriers are 12.0 kJ mol⁻¹ and 13.7 kJ mol⁻¹ lower than the baseline, respectively.


	Fig. 5 BS DFT barrier differences for CH₄ activation (kJ mol⁻¹) with respect to the reference imidazole system (X_im = H) calculated using various levels of theory for S1–S11. ΔΔE^‡ = ΔE^‡ − ΔE^‡_ref,H. X_im's are ordered with respect to decreasing Hammett parameter, σ_p. BS barriers are obtained with optimized IS and TSoxo at the chosen level of theory, except in the case of SCAN0. Single point calculations are carried out for SCAN0 using geometries optimized at the baseline level of theory.

Based on the general agreement in trends across various levels of theory, we contrast LFER predictions calculated using eqn (7) with these levels of theory. Descriptors for LFERs are obtained at the same level of theory as the reference calculation. These descriptors as well as the resulting DFT and LFER barriers are listed in Table S2 of ESI.†Fig. 6 depicts LFER performance for each level of theory. In all cases, LFER performance resembles the baseline scenario, with similar MAEs and outliers. Functional tests result in MAEs of 9.9, 8.8, and 10.0 kJ mol⁻¹ for PBE0, SCAN0, and ωB97X-V, respectively. Basis set tests yield MAEs of 9.0, 8.4, and 7.6 kJ mol⁻¹ for 6-31G*, def2-SVP, and def2-SVPD, respectively. The LFER is therefore transferable across basis sets and density functional approximations. We also identify between 3 and 5 outliers at every level of theory. While S1 is an outlier in all cases, small variations are observed in the remaining outliers compared to the baseline scenario.


	Fig. 6 LFER predicted BS barriers vs. DFT barriers (kJ mol⁻¹) for S1–S11 at levels of theory listed in Table 2. Points lying in the shaded region indicate LFER prediction is within 10 kJ mol⁻¹ of the DFT value. All model outliers are marked in red and labeled “S” to indicate symmetrically substituted catalysts.

We also calculate the first-order sensitivity of the LFER model to the chosen descriptors for interaction (σ_p) and strain (B1,θ). Fig. S1 of ESI,† depicts the first order sensitivities with respect to each of the 4 σ_p's, and each of 2B1 and θ values. The 4 σ_p's combined constitute the greatest determinant of LFER performance, with a total sensitivity of 70.3%. Model sensitivities to the 2B1 and 2θ descriptors are 12.8% and 16.9%, respectively. Therefore, LFER performance is least sensitive to one of the strain descriptors, B1, and most sensitive to the interaction descriptor, σ_p.

4 Discussion

The cornerstone of our catalyst design philosophy is that the influence of active site-bound ligands on kinetics can be isolated and captured using the activation strain model. The LFERs constructed using descriptor-based estimation of ASM components are then employed to predict the activity of catalysts bound to arbitrary ligands. Accordingly, for CH activation with catalysts consisting of the [Cu₂O₂]²⁺ active site, we isolate and quantify (1) ΔE_INT with σ_p as the descriptor, and (2) ΔE_STR with B1 and θ as descriptors, based on careful, systematic N-donor variations in each case. The resulting complete LFER is given by eqn (7). In this section, we lay out the reasoning behind accurate barrier predictions for the vast majority of catalysts examined. By assessing model sensitivities, descriptors, and by further decomposing strain energies into substrate and catalyst components, we also outline the possible reasons behind model failures, shown in red in Fig. 4. In short, our analysis shows that LFER accuracies rely on error cancellation between individual interaction and strain terms. However, the extent of absence of these cancellations arising from the choice of descriptors or model inadequacies in capturing ASM terms is difficult to deconvolute. Therefore, we emphasize that the analysis presented in this section is only semi-quantitative, aimed at identifying the most probable causes for model failure.

4.1 LFER performance: activation-strain compensation

First, we analyze the rationale behind accurate LFER barrier predictions for nearly 80% of the systems examined in this study. We do so by breaking down the LFER prediction into its component ΔE_INT and ΔE_STR values. Table 4 shows this decomposition for the symmetrically substituted S1–S11 catalysts. The differences between model predictions and DFT results for A1–A38 are reported in Table 3 of the ESI.† In general, LFER is a poor predictor of individual interaction and strain components. However, the sum of these deviations in interaction and strain energies cancel out in most cases. To determine whether these deviations and their mutual cancellation are grounded in underlying model properties, we revisit the procedure for LFER construction.

Table 4 Differences between LFER and DFT-based (ωB97X-D/srsc/6-311+G*) ΔE_INT, ΔE_STR, and ΔE^‡ (kJ mol⁻¹) for S1–S11. ASM components for the outliers (Fig. 4) for which the magnitudes of LFER deviations exceed 10 kJ mol⁻¹ are highlighted in boldface

Index	ΔE_INT,LFER − ΔE_INT,DFT	ΔE_STR,LFER − ΔE_STR,DFT	ΔE^‡_LFER − ΔE^‡_DFT
S1	8.4	14.0	22.3
S2	13.0	−9.3	3.7
S3	15.9	−3.8	12.1
S4	5.8	6.6	12.4
S5	5.4	−10.1	−4.8
S6	5.0	−6.8	−1.8
S7	0.0	0.0	0.0
S8	−1.3	−10.4	−11.7
S9	−21.9	15.8	−6.1
S10	−9.9	11.4	1.6
S11	−26.6	27.0	0.4

To construct the interaction component of the LFER (eqn (1)), we vary N-donor electrophilicity and capture the effect on ΔE_INT by forcing ΔE_STR to be constant. Constant strain is achieved by freezing the active site and substrate. The active site distances as well as the C–H stretch in TSoxo are therefore identical to the reference in all systems. Consider the cases wherein electron-withdrawing substituents are bound to imidazole (S8–S11). While the decrease in barrier relative to the reference S7 manifests purely in ΔE_INT during model construction, the barrier is also lowered due to reduced CH stretch in the fully relaxed system examined in this study. In the case of S11, the CH stretch is 1.415 Å in TSoxo, reduced from 1.447 Å in the reference S7 system. Therefore, this leads to a corresponding decrease in ΔE_STR. In other words, in a completely relaxed system the barrier is redistributed between the ΔE_INT and ΔE_STR components of ASM compared to LFER construction wherein ΔE_STR is invariant. For an electron-withdrawing substituent therefore, the LFER model is expected to predict more negative ΔE_INT than what is observed with DFT, or ΔE_INT,LFER − ΔE_INT,DFT ≤ 0. Correspondingly, the observed DFT strain energy is lower than LFER, or ΔE_STR,LFER − ΔE_STR,DFT≥0. The converse is true for electron-donating groups, for which LFER is expected to predict higher interaction and lower strain energies. This compensation enables cancellation of LFER errors in ASM components, leading to reliable barrier predictions for both the symmetric and asymmetrically substituted complexes. LFER transferability tests in Fig. 6 show that this compensation occurs irrespective of the choice of underlying level of theory.

We also believe that this compensation between interaction and strain energies are built into the activation strain model. We observe this phenomena in the course of constructing the strain-only component of the LFER with bidentate diamine N-donors, wherein ΔE_INT is invariant and all structures are fully relaxed.²⁴ The strain model described by eqn (6) captures variations both in substrate strain (C–H stretch) induced by the spatial extent (or steric bulk) of the catalyst and strain arising from deformation of the catalyst. Substrate strain is typically the dominant contributor to ΔE_STR and also exhibits higher sensitivity to N-donors. The outliers for the strain model consist of systems in which the catalyst ceases to be planar and exhibits bending in the TSoxo structure. We find that ΔE_INT is no longer invariant for these outliers. While the likelihood of catalyst deformation in the transition structure may be challenging to determine a priori to aid model construction, we note an interesting trend, shown in Table S6 of ESI.† Model deviations in ΔE_STR are directly correlated with the deviation of ΔE_INT with a slope of unity (R² = 0.93). In other words, concurrent with the decrease in overall observed strain, driven by larger decrease in substrate strain and small increase in catalyst strain, is an increase in interaction energy towards more repulsive catalyst–substrate interactions. While our limiting choice of an aromatic interaction descriptor precludes extending the compensation analysis to these diamine N-donors, this analysis highlights an important consequence of employing the ASM. The built-in compensation between interaction and strain components is leveraged effectively in the complete LFER despite the fact that individual activation and strain components are poorly predicted by our model.

4.2 Analysis of model descriptors & outliers

The LFER effectively leverages compensation between interaction and strain energies defined by ASM to accurately predict barriers. The model described in eqn (7), however, still has limitations that lead to incorrect barrier predictions for 10 systems examined in this study. The LFER outliers for symmetric substitutions identified in Fig. 4 – S1, S3, S4, and S8 – either do not exhibit the interaction-strain compensation described above (S1, S4, S8) or the compensation does not lead to adequate error cancellation (S3), as shown in Table 4. Each of these outliers is also observed in at least 4 out of the 6 transferability tests shown in Fig. 6.

The choice of scalar descriptors and model construction to enable capture of all ligand interaction and strain effects using these descriptors are both critical to LFER performance. The analysis of model sensitivity shows that LFER model performance is determined to the greatest extent (70.3%) by the interaction descriptor, σ_p, with each of the four σ_p values contributing equally (17.6%). Our choice of para-substituted Hammett parameters is based on excellent agreement of Hammett slopes with experiment.²³ Their use as descriptors is justified as long as we limit the scope of LFER applicability to aromatic N-donors (e.g. imidazoles). However, we make a critical assumption that the descriptor refers only to the property of the substituent (say, OCH₃) and not the entire N-donor ligand (C₃N₂H₃OCH₃). The interaction of the complete N-donor with the active site is therefore only partially captured by our choice of σ_p, leading to possible LFER deviations from true barriers. The second consequence of choosing substituent and not N-donor σ_p is the fact that the impact of physical separation of the electron-donating or withdrawing group from the active site is incorporated into the slope of the interaction term, not unlike traditional Hammett analysis.¹⁷ This further limits the scope of ligands for which the LFER model is valid.

To overcome these issues associated with Hammett parameters, our objective going forward is to identify alternative descriptors that can directly capture the impact of the complete N-donor ligand on the active site. The metal–ligand electronic parameter (MLEP) and the associated bond-strength order (BSO), both measures of metal–ligand bond strength in transition metal complexes, show promise as descriptors because they yield the most direct description of metal–ligand coordination and are easily calculated from vibrational analysis.^26,80,81 Unlike Hammett parameters however, there is little prior evidence to the best of our knowledge of the use of MLEP's, BSO's, or related quantities in linear models similar to Hammett or Taft relationships. Examination of these parameters and construction of structure–function relationships therefore constitute future work towards constructing a generalizable LFER.

Although sensitivity analysis shows that the LFER model is less sensitive to strain descriptors (12.8% and 16.9% to the 2B1 and 2θ parameters, respectively), we find that these parameters overestimate strain contributions in some cases, leading to barrier deviations. For instance, S1 is the most extreme outlier among all the systems examined and for all levels of theory, seen in Fig. 6. While we cannot eliminate the possibility that σ_p may be an inadequate descriptor of ΔE_INT, Table 4 shows that, contrary to the expected compensation effect, the strain difference (ΔE_STR,LFER − ΔE_STR,DFT) is large and positive. The large LFER strain estimate can be traced back to the strain descriptor, Sterimol B1, for S1 in Table 3, which is much larger than the values obtained for the remaining symmetrically substituted systems. Although our prior study²⁴ develops the strain LFER based on catalysts that possess a wide range of B1 (1.55–2.82 Å) and θ (71.848–100.811°), it is likely that the absence of compensation in S1 stems in part from B1 overestimating the importance of steric bulk.

The sources of error are more challenging to deconvolute for the remaining outliers in symmetrically substituted catalysts – S3, S4, and S8. With S3, for instance, the absence of compensation could arise from a combination of σ_p overestimating N-donor electrophilicity and the fact that ΔE_STR,LFER is very close to instead of being significantly lower than ΔE_STR,DFT. Although the strain descriptors for S8 are very close to reference S7, the LFER strain is lower than the true value. To probe the origins of such deviations, we turn to EDA results for further decomposing strain energy into substrate and catalyst components.

Catalyst and substrate strain terms are obtained from isolated fragment energy differences between TSoxo and IS calculated using EDA. These components are listed in Table S4 of the ESI.† The catalyst component of strain is the consequence of two phenomena – (1) deformation of the catalyst in TSoxo, reflected in N-donor rotations relative to IS, and (2) change in intrafragment interactions between N-donors between TSoxo and IS. Although the two are difficult to isolate, the latter can be approximately quantified by carrying out EDA between two N-donors bound to each Cu at their respective geometries in TSoxo and IS. For the S1–S11 systems, these intramolecular interactions (Table S5 of ESI†) have non-negligible impact on lowering catalyst strain energy. The average catalyst strain energy is 63.1 kJ mol⁻¹ for S1–S11, of which these intramolecular interactions can range between −5.3 kJ mol⁻¹ and −20.9 kJ mol⁻¹ with an average of −12.4 kJ mol⁻¹. While these quantities are significantly smaller than substrate strain (125.9 kJ mol⁻¹ on average for S1–S11), we find that trends are non-negligible.

Applying this analysis to S8, we find that the catalyst strain energy is 12.0 kJ mol⁻¹ in excess of the reference S7. It is therefore possible that failure to capture this component leads to lower LFER strain even though the compensation effect predicts positive ΔE_STR,LFER − ΔE_STR,DFT. We emphasize however that this deviation is not unique to S8 and that even S5 shows catalyst strain that is 12.2 kJ mol⁻¹ higher than S7. The absence of adequate compensation or error cancellation between the two terms constituting the ASM framework is ultimately what leads to large errors in LFER barrier predictions.

A6 and A10 exhibit the largest deviations among asymmetrically substituted systems, with barriers overestimated by 15.4 and 17.7 kJ mol⁻¹, respectively (Table S3 of ESI†). In both cases, the magnitude of the strain deviations are higher than the compensation offered by interaction deviations. LFER overestimates strain energy because true catalyst strain is lower than the reference by 13.9 kJ mol⁻¹ and 9.4 kJ mol⁻¹ for A6 and A10, respectively. On the other hand, LFER underestimates barriers by 13.4 kJ mol⁻¹ and 14.7 kJ mol⁻¹ for A12 and A37, respectively. In A12, LFER underestimates strain by 14.5 kJ mol⁻¹ but overestimates interaction by only 1.1 kJ mol⁻¹, with the former originating in part, from higher catalyst strain relative to reference S7. In A37, LFER underestimates both strain and interaction terms by 7.1 kJ mol⁻¹ and 7.7 kJ mol⁻¹, respectively. It is therefore possible that a combination of factors described previously are contributing to these deviations. Although the deviations have opposite signs as is expected for A16 and A17, inadequate compensation between interaction and strain terms lead to poor barrier predictions, with errors of −10.8 kJ mol⁻¹ and 11.5 kJ mol⁻¹, respectively.

In summary, we find that the LFER model errors can exceed 10 kJ mol⁻¹ in scenarios where individual interaction and strain deviations do not cancel each other. The absence of cancellation of errors can arise from the underlying choice of interaction and strain descriptors and/or because the strain component of the LFER is designed to primarily capture substrate strain.

4.3 Ligand weighting & position effects

An implicit assumption of our LFER framework is that all N-donors contribute equally to the interaction term, as a result of which the weight of each of the four σ_p's in eqn (7) is unity. We also assume that the 2 N-donors closest to the substrate contribute equally to the strain term. This is reasonable because most TSoxo structures exhibit some degree of symmetry, where the activated substrate is positioned close to the middle of the active site, seen in Fig. 1.

We test the assumption of equal weighting of the interaction term by analyzing the dependence of barriers on the location of the N-donor relative to the Cu–O bond being broken in the course of CH activation. We select two monodentate amine N-donors (X_am = CH₃, CF₃) and substitute one of them in one of the four positions as shown in Fig. 3(b), with NH₃ (X_am = H) groups as the remaining three N-donors. The procedure for calculating barriers is similar to the LFER construction method for the interaction term to minimize strain effects. The active site and substrate are constrained to the reference [Cu₂(NH₃)₄O₂]²⁺ relaxed geometry. Barriers are calculated at the ωB97X-D/srsc, 6-311+G* level of theory.

Table 5 shows that these barriers lie in a narrow range (within 10 kJ mol⁻¹), spanning 4.7 kJ mol⁻¹ for CH₃ and 9.7 kJ mol⁻¹ for CF₃. Therefore, the assumption of equal N-donor contributions to the interaction term is reasonable and equal weights can be assigned to the interaction descriptors for N-donors coordinated to the active site. We note that this is valid exclusively in situations where the active site and the connecting N-atoms are planar. The relative contributions of various N-donors in non-planar complexes, which can occur when more than 2 N-donors are bound to each Cu, is yet to be examined.

Table 5 BS barriers (kJ mol⁻¹) for [Cu₂(NH₃)₄O₂]²⁺ complex with one H atom replaced by a CH₃ or CF₃ group. N-Donor position 1–4 are shown in Fig. 3(b)

N-Donor	Position
N-Donor	1	2	3	4
CH₃	76.0	73.6	71.3	72.1
CF₃	56.0	63.4	53.7	61.8

While the ranges of barriers in the analysis present in Table 5 are within 10 kJ mol⁻¹, the sensitivity of barriers to the position of the X_am = CF₃ group may not be negligible. A subset of asymmetric catalysts described in Table 1 are employed to further examine these position effects on LFER predictions for fully relaxed systems. For example, consider A3, A15, and A19, each consisting of two X_im = CF₃ and X_im = H N-donors in distinct positions. In A3 and A15, each CF₃-bound imidazole group is coordinated to a different Cu in trans- and cis-like configuration, respectively. In A19, both CF₃-bound imidazole groups are coordinated to the same Cu. The DFT barriers lie in a range similar to the CF₃ systems in Table 5 – 73.8 kJ mol⁻¹ for A3, 84.0 kJ mol⁻¹ for A15, and 82.4 kJ mol⁻¹ for A19. The LFER interaction term is identical for all three systems. LFER predicts higher barriers for A3 (81.2 kJ mol⁻¹) and A19 (81.5 kJ mol⁻¹) compared to A15 (78.1 kJ mol⁻¹), on account of proximity of the CF₃ group to the substrate, reflected in one of the two B1 descriptors. The predicted barrier is very close to the DFT value for A19 and higher for A3 although still within the allowable margin of error. In the case of A15, the LFER prediction is lower than DFT by 5.9 kJ mol⁻¹ on account of modest increase in catalyst and substrate strain in the latter that is either not captured or adequately compensated for by the model. A3 does not exhibit the added strain predicted by the model despite N-donor proximity to the substrate being similar to A19. This is because, unlike in the case of A19, the X_im,₄ does not undergo twisting relative to the active site plane in A3. The remaining three N-donors show similar deformation in the two systems. Similar conclusions can be drawn from examining other asymmetric systems which only vary in positions, such as A4, A17, and A21, with two X_im = NO₂ and two X_im = H N-donors.

Taken together with the results in Table 5, we conclude that N-donor positions do impact interaction and strain components of the ASM, although the resulting range in barriers is relatively narrow. This range may not be resolvable with high fidelity by the LFER model or even be usable in model construction to assign position-dependent weights to N-donor descriptors.

4.4 Choice of reference for LFER construction & application

A comment on the choice of reference system to compute ΔE_ref and descriptor reference values in the LFER model is necessary. In principle, the choice of reference system must not affect model performance for a truly generalizable LFER. In this work, each additive term is constructed separately using a different reference system – imidazole (X_im,ref = H) for the interaction term and bidentate diamine H₂NCH₂NH₂ for the strain term.^23,24 The resulting complete LFER model, with the imidazole (X_im,ref = H) as a reference, can accurately predict most barriers for the systems examined in this study. Therefore, the model is reasonably flexible to the underlying choice of reference. The choice of linear fit with or without an intercept during model construction does not affect model performance, with the former only requiring the use of additional terms in the LFER. In our framework, we set the intercept to zero to match the Taft formalism. The construction of interaction and strain terms themselves, on the other hand, requires careful selection of reference and N-donor variations. For instance, our previous study calculates Hammett (interaction) slopes for both amine (X_am,ref = H) and imidazole N-donors. As the amine N-donors are not aromatic, σ_p is not a suitable descriptor for interaction terms, and the resulting model fit is poor (R² = 0.63). Using the amine N-donor-based interaction model with either the ammonia or imidazole reference, instead of the imidazole-based model employed in this work, leads to poor model performance. The quantitative outcomes of all these scenarios are summarized in Table S7 of the ESI.†

4.5 Transferability

The LFER constructed using the baseline level of theory shows similar performance – MAEs and outliers – when the reference is calculated using a different basis set and/or functional approximation. This is a promising indicator that the LFER framework may be transferable and therefore serves as an easily generalizable platform for ligand design. It is worth noting that our choice of functionals for transferability tests are based on conventional wisdom that a fraction of Hartree–Fock (HF) exchange improves barrier estimates. Recent studies show exceptions to this rule both in the gas phase and in surface chemistry.^82,83 In the gas phase, rehybridization of the atom undergoing transformation in the TS leads to anomalous behavior of hybrid functionals.⁸² Therefore, while the carbon atom in our study is not rehybridized in the course of hydroxylation, a broader test of functional transferability is necessary for LFER construction if the reactions involve transition state rehybridization. An additional challenge with transferability can arise when attempting to construct LFERs for key steps of a catalytic cycle instead of the single stoichiometric step examined in this study. For an iron-oxo complex for CH activation, Gani and Kulik show that the ΔE_INT and ΔE_STR terms exhibit distinct functional sensitivity towards the hydrogen atom transfer and oxo-formation steps in the catalytic cycle, leading to different trends in barriers for the two steps with varying extent of HF exchange.⁸⁴ Future work into the expansion of the framework to study catalytic cycles must therefore include additional functional transferability tests of ASM (and therefore LFER) components.

The LFER framework, constructed using the activation strain model by isolating and assigning descriptors for each contribution, constitutes a reliable predictor of activation energies for the N-donor class examined in this study. We find that the model relies on compensation or error cancellation between the interaction and strain components. The analysis presented in this work highlights key strengths of the model, including transferability with respect to level of theory and robustness with respect to choice of reference. We also identify critical model limitations that affect performance and transferability across N-donor classes, including that imposed by the choice of interaction descriptor and incomplete description of catalyst strain effects.

5 Conclusions

Inspired from the Hammett and Taft equations in organic aromatic chemistry that quantify substituent effects on reaction kinetics, this work is aimed at developing a simple yet comprehensive design framework for transition metal complex catalysts. Using the activation strain model, we present a procedure for constructing active site-specific linear free energy relationships that is easily generalizable across a wide range of catalytic reactions. The procedure allows us to understand ligand effects in terms of measurable quantities – their impact on catalyst–substrate interaction and strain induced upon deformation to the transition state – rather than by assigning ‘electronic’ and ‘steric’ effects, which can difficult to isolate and quantify in a systematic manner. We propose a piecewise construction method for the LFER, in which the impact of ligands on the interaction and strain components of the activation barrier are captured separately via careful selection of ligands and simulation constraints. This involves activation barrier and EDA calculations along with the identification of a suitable set of descriptors for each term in the ASM. To subsequently predict the catalytic activity for the desired set of ligands, the LFER needs DFT barrier calculations for only a single reference system and inexpensively obtained descriptors for the reference and the desired system.

The LFER construction procedure and resulting performance are illustrated for CH₄ hydroxylation with enzyme-inspired [Cu₂O₂]²⁺ complexes. We combine the outcomes of our previous two studies that isolate interaction and strain effects via systematic N-donor ligand variations and determine that the former effect can be described by the Hammett parameter (σ_p) and the latter with a combination of bite angle (θ) and Sterimol parameter (B1). The resulting LFER can accurately predict barriers for most catalysts where the [Cu₂O₂]²⁺ is bound to arbitrary imidazole N-donors and is transferable with respect to level of theory. This is the consequence of a natural compensation observed between LFER predictions of interaction and strain energies.

Analysis of model sensitivity and outliers reveals that the choice of interaction descriptor is the most probable cause of model deviations in systems where this compensation is not observed. Use of the Hammett parameter as descriptor severely limits the ligand space that can be explored with the current LFER and may not adequately describe the interaction effect of the entire ligand. To expand LFER catalyst scope and enable a more complete description of metal–ligand effects, our future work involves exploration of alternative descriptors, specifically MLEPs. Another limitation of the LFER model is that it captures primarily trends in substrate strain and not catalyst strain, which can also lead to barrier deviations when the catalyst strain is far from the reference value. Going forward, we aim to construct a more complete LFER model by identifying a suitable set of N-donor variations to isolate and quantify trends in catalyst strain.

This work constitutes the first step towards developing a systematic, generalizable framework for active site-specific design of transition metal complex catalysts. We successfully demonstrate proof-of-concept that an LFER grounded in the activation strain model has the ability to predict activation barriers for catalyst constructed with imidazole N-donor ligands. The choice of descriptors and quantification of more complex, intramolecular effects are essential next steps towards developing a truly generalizable LFER that can accurately predict barriers for a broader spectrum of ligands.

Conflicts of interest

There are no conflicts to declare.

Acknowledgements

The authors acknowledge startup resources and graduate fellowship from University of Southern California, the TLARGI fellowship, and USC's Center for Advanced Research Computing for supporting this work.

Notes and references

J. K. Nørskov, F. Abild-Pedersen, F. Studt and T. Bligaard, Proc. Natl. Acad. Sci. U. S. A., 2011, 108, 937–943 CrossRef PubMed.
J. Greeley, Annu. Rev. Chem. Biomol. Eng., 2016, 7, 605–635 CrossRef PubMed.
E. N. Bess, D. M. Guptill, H. M. Davies and M. S. Sigman, Chem. Sci., 2015, 6, 3057–3062 RSC.
M. S. Sigman, K. C. Harper, E. N. Bess and A. Milo, Acc. Chem. Res., 2016, 49, 1292–1301 CrossRef CAS PubMed.
T. Z. Gani and H. J. Kulik, ACS Catal., 2018, 8, 975–986 CrossRef CAS.
P. C. Andrikopoulos, C. Michel, S. Chouzier and P. Sautet, ACS Catal., 2015, 5, 2490–2499 CrossRef CAS.
D. H. Ess, W. A. Goddard III and R. A. Periana, Organometallics, 2010, 29, 6459–6472 CrossRef CAS.
D. J. Durand and N. Fey, Chem. Rev., 2019, 119, 6561–6594 CrossRef CAS PubMed.
S. Shaik, D. Kumar and S. P. De Visser, J. Am. Chem. Soc., 2008, 130, 10128–10140 CrossRef CAS PubMed.
S. Shaik, W. Lai, H. Chen and Y. Wang, Acc. Chem. Res., 2010, 43, 1154–1165 CrossRef CAS PubMed.
M. D. Wodrich, B. Sawatlon, M. Busch and C. Corminboeuf, Acc. Chem. Res., 2021, 54, 1107–1117 CrossRef CAS PubMed.
M. Anand, B. Rohr, M. J. Statt and J. K. Nørskov, J. Phys. Chem. Lett., 2020, 11, 8518–8526 CrossRef CAS PubMed.
D. J. Martin, C. F. Wise, M. L. Pegis and J. M. Mayer, Acc. Chem. Res., 2020, 53, 1056–1065 CrossRef CAS PubMed.
J. P. Janet, C. Duan, A. Nandy, F. Liu and H. J. Kulik, Acc. Chem. Res., 2021, 54, 532–545 CrossRef CAS PubMed.
E. Burello and G. Rothenberg, Int. J. Mol. Sci., 2006, 7, 375–404 CrossRef CAS.
J. Shorter and N. Chapman, Adv. Linear Free Energy Relat., 1972, pp. 321, 368 Search PubMed.
L. P. Hammett, J. Am. Chem. Soc., 1937, 59, 96–103 CrossRef CAS.
R. W. Taft Jr, J. Am. Chem. Soc., 1952, 74, 2729–2732 CrossRef.
Z. Freixa and P. W. Van Leeuwen, Dalton Trans., 2003, 1890–1901 RSC.
T. L. Brown and K. J. Lee, Coord. Chem. Rev., 1993, 128, 89–116 CrossRef CAS.
F. M. Bickelhaupt, J. Comput. Chem., 1999, 20, 114–128 CrossRef CAS.
F. M. Bickelhaupt and K. N. Houk, Angew. Chem., Int. Ed., 2017, 56, 10070–10086 CrossRef CAS PubMed.
Z. Lan and S. M. Sharada, Phys. Chem. Chem. Phys., 2018, 20, 25602–25614 RSC.
Z. Lan and S. M. Sharada, Phys. Chem. Chem. Phys., 2020, 22, 7155–7159 RSC.
A. Nandy and H. J. Kulik, ACS Catal., 2020, 10, 15033–15047 CrossRef CAS.
D. Setiawan, R. Kalescky, E. Kraka and D. Cremer, Inorg. Chem., 2016, 55, 2332–2344 CrossRef CAS PubMed.
C. E. Elwell, N. L. Gagnon, B. D. Neisen, D. Dhar, A. D. Spaeth, G. M. Yee and W. B. Tolman, Chem. Rev., 2017, 117, 2059–2107 CrossRef CAS PubMed.
E. I. Solomon, D. E. Heppner, E. M. Johnston, J. W. Ginsbach, J. Cirera, M. Qayyum, M. T. Kieber-Emmons, C. H. Kjaergaard, R. G. Hadt and L. Tian, Chem. Rev., 2014, 114, 3659–3853 CrossRef CAS PubMed.
F. M. Bickelhaupt, J. Comput. Chem., 1999, 20, 114–128 CrossRef CAS.
A. Behn, P. M. Zimmerman, A. T. Bell and M. Head-Gordon, J. Chem. Phys., 2011, 135, 224108 CrossRef PubMed.
S. Mallikarjun Sharada, P. M. Zimmerman, A. T. Bell and M. Head-Gordon, J. Chem. Theory Comput., 2012, 8, 5166–5174 CrossRef CAS PubMed.
M. v. Hopffgarten and G. Frenking, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2012, 2, 43–62 Search PubMed.
G. T. de Jong and F. M. Bickelhaupt, ChemPhysChem, 2007, 8, 1170–1181 CrossRef CAS PubMed.
L. M. Mirica, X. Ottenwaelder and T. D. P. Stack, Chem. Rev., 2004, 104, 1013–1045 CrossRef CAS PubMed.
S. Mahapatra, J. A. Halfen, E. C. Wilkinson, G. Pan, C. J. Cramer, L. J. Que and W. B. Tolman, J. Am. Chem. Soc., 1995, 117, 8865–8866 CrossRef CAS.
J. A. Halfen, S. Mahapatra, E. C. Wilkinson, S. Kaderli, V. G. Young, L. Que, A. D. Zuberbühler and W. B. Tolman, Science, 1996, 271, 1397–1400 CrossRef CAS PubMed.
X. Ottenwaelder, D. J. Rudd, M. C. Corbett, K. O. Hodgson, B. Hedman and T. D. P. Stack, J. Am. Chem. Soc., 2006, 128, 9268–9269 CrossRef CAS PubMed.
D. G. Liakos and F. Neese, J. Chem. Theory Comput., 2011, 7, 1511–1523 CrossRef CAS PubMed.
M. Flock and K. Pierloot, J. Phys. Chem. A, 1999, 103, 95–102 CrossRef CAS.
T. Yanai, Y. Kurashige, E. Neuscamman and G. K. L. Chan, J. Chem. Phys., 2010, 132, 024105 CrossRef PubMed.
C. J. Cramer, B. A. Smith and W. B. Tolman, J. Am. Chem. Soc., 1996, 118, 11283–11287 CrossRef CAS.
P. Å. Malmqvist, K. Pierloot, A. R. M. Shahi, C. J. Cramer and L. Gagliardi, J. Chem. Phys., 2008, 128, 204109 CrossRef PubMed.
C. J. Cramer, A. Kinal, M. Włoch, P. Piecuch and L. Gagliardi, J. Phys. Chem. A, 2006, 110, 11557–11568 CrossRef CAS PubMed.
C. J. Cramer, M. Włoch, P. Piecuch, C. Puzzarini and L. Gagliardi, J. Phys. Chem. A, 2006, 110, 1991–2004 CrossRef CAS PubMed.
B. F. Gherman and C. J. Cramer, Coord. Chem. Rev., 2009, 253, 723–753 CrossRef CAS.
M. F. Qayyum, R. Sarangi, K. Fujisawa, T. D. P. Stack, K. D. Karlin, K. O. Hodgson, B. Hedman and E. I. Solomon, J. Am. Chem. Soc., 2013, 135, 17417–17431 CrossRef CAS PubMed.
P. P.-Y. Chen and S. I. Chan, J. Inorg. Biochem., 2006, 100, 801–809 CrossRef CAS PubMed.
K. Yoshizawa and Y. Shiota, J. Am. Chem. Soc., 2006, 128, 9873–9881 CrossRef CAS PubMed.
Y. Shiota and K. Yoshizawa, Inorg. Chem., 2009, 48, 838–845 CrossRef CAS PubMed.
Y. Shiota, G. Juhász and K. Yoshizawa, Inorg. Chem., 2013, 52, 7907–7917 CrossRef CAS PubMed.
J. C. S. Da Silva, R. C. R. Pennifold, J. N. Harvey and W. R. Rocha, Dalton Trans., 2016, 45, 2492–2504 RSC.
C. J. Cramer and Y. Pak, Theor. Chem. Acc., 2001, 105, 477–480 Search PubMed.
C. Citek, C. T. Lyons, E. C. Wasinger and T. D. P. Stack, Nat. Chem., 2012, 4, 317–322 CrossRef CAS PubMed.
A. Verloop, W. Hoogenstraaten and J. Tipker, Drug Design, Elsevier, 1976, pp. 165–207 Search PubMed.
A. Verloop, Pesticide Chemistry: Human Welfare and Environment, Elsevier, 1983, pp. 339–344 Search PubMed.
K. C. Harper, E. N. Bess and M. S. Sigman, Nat. Chem., 2012, 4, 366 CrossRef CAS PubMed.
C. B. Santiago, A. Milo and M. S. Sigman, J. Am. Chem. Soc., 2016, 138, 13424–13430 CrossRef CAS PubMed.
A. Bondi, J. Phys. Chem., 1964, 68, 441–451 CrossRef CAS.
A. V. Brethomé, S. P. Fletcher and R. S. Paton, ACS Catal., 2019, 9, 2313–2323 CrossRef.
Y. Shao, Z. Gan, E. Epifanovsky, A. T. Gilbert, M. Wormit, J. Kussmann, A. W. Lange, A. Behn, J. Deng, X. Feng, D. Ghosh, M. Goldey, P. R. Horn, L. D. Jacobson, I. Kaliman, R. Z. Khaliullin, T. Kuś, A. Landau, J. Liu, E. I. Proynov, Y. M. Rhee, R. M. Richard, M. A. Rohrdanz, R. P. Steele, E. J. Sundstrom, H. L. Woodcock III, P. M. Zimmerman, D. Zuev, B. Albrecht, E. Alguire, B. Austin, G. J. O. Beran, Y. A. Bernard, E. Berquist, K. Brandhorst, K. B. Bravaya, S. T. Brown, D. Casanova, C.-M. Chang, Y. Chen, S. H. Chien, K. D. Closser, D. L. Crittenden, M. Diedenhofen, R. A. Distasio Jr, H. Do, A. D. Dutoi, R. G. Edgar, S. Fatehi, L. Fusti-Molnar, A. Ghysels, A. Golubeva-Zadorozhnaya, J. Gomes, M. W. Hanson-Heine, P. H. Harbach, A. W. Hauser, E. G. Hohenstein, Z. C. Holden, T.-C. Jagau, H. Ji, B. Kaduk, K. Khistyaev, J. Kim, J. Kim, R. A. King, P. Klunzinger, D. Kosenkov, T. Kowalczyk, C. M. Krauter, K. U. Lao, A. D. Laurent, K. V. Lawler, S. V. Levchenko, C. Y. Lin, F. Liu, E. Livshits, R. C. Lochan, A. Luenser, P. Manohar, S. F. Manzer, S.-P. Mao, N. Mardirossian, A. V. Marenich, S. A. Maurer, N. J. Mayhall, E. Neuscamman, C. M. Oana, R. Olivares-Amaya, D. P. O-Neill, J. A. Parkhill, T. M. Perrine, R. Peverati, A. Prociuk, D. R. Rehn, E. Rosta, N. J. Russ, S. M. Sharada, S. Sharma, D. W. Small, A. Sodt, T. Stein, D. Stück, Y.-C. Su, A. J. Thom, T. Tsuchimochi, V. Vanovschi, L. Vogt, O. Vydrov, T. Wang, M. A. Watson, J. Wenzel, A. White, C. F. Williams, J. Yang, S. Yeganeh, S. R. Yost, Z.-Q. You, I. Y. Zhang, X. Zhang, Y. Zhao, B. R. Brooks, G. K. Chan, D. M. Chipman, C. J. Cramer, W. A. Goddard III, M. S. Gordon, W. J. Hehre, A. Klamt, H. F. Schaefer III, M. W. Schmidt, C. D. Sherrill, D. G. Truhlar, A. Warshel, X. Xu, A. Aspuru-Guzik, R. Baer, A. T. Bell, N. A. Besley, J.-D. Chai, A. Dreuw, B. D. Dunietz, T. R. Furlani, S. R. Gwaltney, C.-P. Hsu, Y. Jung, J. Kong, D. S. Lambrecht, W. Liang, C. Ochsenfeld, V. A. Rassolov, L. V. Slipchenko, J. E. Subotnik, T. V. Voorhis, J. M. Herbert, A. I. Krylov, P. M. Gill and M. Head-Gordon, Mol. Phys., 2015, 113, 184–215 CrossRef CAS.
J.-D. Chai and M. Head-Gordon, Phys. Chem. Chem. Phys., 2008, 10, 6615–6620 RSC.
J.-D. Chai and M. Head-Gordon, J. Chem. Phys., 2008, 128, 084106 CrossRef PubMed.
M. Dolg, U. Wedig, H. Stoll and H. Preuss, J. Chem. Phys., 1987, 86, 866–872 CrossRef CAS.
R. Bauernschmitt and R. Ahlrichs, J. Chem. Phys., 1996, 104, 9047–9052 CrossRef CAS.
S. Mallikarjun Sharada, D. Stuck, E. J. Sundstrom, A. T. Bell and M. Head-Gordon, Mol. Phys., 2015, 113, 1802–1808 CrossRef.
R. Z. Khaliullin, E. A. Cobar, R. C. Lochan, A. T. Bell and M. Head-Gordon, J. Phys. Chem. A, 2007, 111, 8753–8765 CrossRef CAS PubMed.
P. R. Horn, E. J. Sundstrom, T. A. Baker and M. Head-Gordon, J. Chem. Phys., 2013, 138, 134119 CrossRef PubMed.
P. R. Horn and M. Head-Gordon, J. Chem. Phys., 2015, 143, 114111 CrossRef PubMed.
P. R. Horn and M. Head-Gordon, J. Chem. Phys., 2016, 144, 084118 CrossRef PubMed.
P. R. Horn, Y. Mao and M. Head-Gordon, J. Chem. Phys., 2016, 144, 114107 CrossRef PubMed.
P. R. Horn, Y. Mao and M. Head-Gordon, Phys. Chem. Chem. Phys., 2016, 18, 23067–23079 RSC.
I. M. Sobol, Math. Comput. Simul., 2001, 55, 271–280 CrossRef.
J. Herman and W. Usher, J. Open Source Software, 2017, 2, 97 CrossRef.
A. D. Becke, J. Chem. Phys., 1993, 98, 5648–5652 CrossRef CAS.
B. J. Lynch and D. G. Truhlar, J. Phys. Chem. A, 2001, 105, 2936–2941 CrossRef CAS.
J.-D. Chai and M. Head-Gordon, Phys. Chem. Chem. Phys., 2008, 10, 6615–6620 RSC.
C. Adamo and V. Barone, J. Chem. Phys., 1999, 110, 6158–6170 CrossRef CAS.
K. Hui and J.-D. Chai, J. Chem. Phys., 2016, 144, 044114 CrossRef PubMed.
N. Mardirossian and M. Head-Gordon, Phys. Chem. Chem. Phys., 2014, 16, 9904–9924 RSC.
D. Cremer and E. Kraka, Dalton Trans., 2017, 46, 8323–8338 RSC.
E. Kraka and M. Freindorf, in Characterizing the Metal–Ligand Bond Strength via Vibrational Spectroscopy: The Metal–Ligand Electronic Parameter (MLEP), ed. A. Lledós and G. Ujaque, Springer International Publishing, Cham, 2020, pp. 227–269 Search PubMed.
A. Mahler, B. G. Janesko, S. Moncho and E. N. Brothers, J. Chem. Phys., 2018, 148, 244106 CrossRef PubMed.
S. Mallikarjun Sharada, T. Bligaard, A. C. Luntz, G.-J. Kroes and J. K. Nørskov, J. Phys. Chem. C, 2017, 121, 19807–19815 CrossRef CAS.
T. Z. Gani and H. J. Kulik, J. Chem. Theory Comput., 2017, 13, 5443–5457 CrossRef CAS PubMed.

Footnote

† Electronic supplementary information (ESI) available. See DOI: 10.1039/d1cp02278d

Click here to see how this site uses Cookies. View our privacy policy here.