Exploration of H2 binding to the [NiFe]-hydrogenase active site with multiconfigurational density functional theory

The combination of density functional theory (DFT) with a multiconfigurational wave function is an efficient way to include dynamical correlation in calculations with multiconfiguration self-consistent field wave functions. These methods can potentially be employed to elucidate reaction mechanisms in bio-inorganic chemistry, where many other methods become either too computationally expensive or too inaccurate. In this paper, a complete active space (CAS) short-range DFT (CAS-srDFT) hybrid was employed to investigate a bio-inorganic system, namely H2 binding to the active site of [NiFe] hydrogenase. This system was previously investigated with coupled-cluster (CC) and multiconfigurational methods in form of cumulant-approximated second-order perturbation theory, based on the density matrix renormalization group (DMRG). We find that it is more favorable for H2 to bind to Ni than to Fe, in agreement with previous CC and DMRG calculations. The accuracy of CAS-srDFT is comparable to both CC and DMRG, despite that much smaller active spaces were employed. This enhanced efficiency at smaller active spaces shows that CAS-srDFT can become a useful method for bio-inorganic chemistry.

neglect a major part of the dynamical correlation. To recover the missing dynamical correlation, perturbation theory is normally employed after a CASSCF or DMRG-SCF calculation, as done in CASPT2 16,[21][22][23][24] or NEVPT2. 20,25,26 However, the perturbation correction comes with additional high computational cost.
An efficient method to recover the dynamical correlation in multiconfigurational methods is to merge DFT with a multiconfigurational wave function, thereby capitalizing on the efficient treatment of semi-local dynamical electron correlation within DFT methods. Simultaneously, such a hybrid method has the advantage that the multiconfigurational wave function can include static correlation. [27][28][29][30][31] In this paper, we explore the multiconfigurational short-range DFT (MC-srDFT) method. It exploits the concept of range separation of the two-electron repulsion operator to merge DFT with a multiconfigurational wave function. With a recent extension of the MC-srDFT method to a polarizable embedding framework, [32][33][34] the method can also be employed on biological systems, and the method may be a promising approach to use for metalloenzymes. However, the MC-srDFT method has mostly been benchmarked for s-and p-block atoms, diatomic molecules, 30,[35][36][37][38][39][40] and organic systems. [41][42][43][44] Studies of transition metals are more rare. 40,45 Before addressing full enzymes, we first need to ensure that the results of MC-srDFT are in agreement with previous accurate calculations for biologically relevant cases and this is the purpose of the present paper.
We investigate the binding of H 2 to the active site of [NiFe] hydrogenase, for which previous studies have given ambiguous results. On the one hand, experimental studies with CO or Xe gas-diffusion have predicted that H 2 binds to Ni. [46][47][48] On the other hand, Fe is the expected binding site from the organometallic perspective. 49,50 Various DFT studies have predicted that H 2 binds to Ni or to Fe with the active site in the Ni(II) singlet, or even to Fe in the triplet state. [51][52][53][54][55][56][57] We have recently investigated the H 2 binding site by using CCSD(T) and cumulant approximated DMRG-CASPT2 methods, 58 as well as DFTbased calculations with the big-QM approach, 59 using 819 atoms in the QM region. In this study, we compare results obtained with the MC-srPBE method with the previous CCSD(T) and DMRG-CASPT2 results, and show that MC-srDFT comes to the same conclusions. We furthermore study the method's dependence on the size of the active space and the employed basis set.
Several forms of the range-separated operators have been suggested. 30,35,61 We use in this work a range-separation operator based on the error function 31,38,62,63 where µ is the range-separation parameter, measured in bohr −1 in this article. This parameter is to some degree adjustable and slightly different values have been employed in the literature (we discuss this point further below). In limiting cases, a value of µ = ∞ reduces MC-srDFT to multiconfigurational SCF (MCSCF), a pure wave function method, whereas µ = 0 reduces MC-srDFT to a pure Kohn-Sham DFT method. Both g lr ee (1, 2) and g sr ee (1,2) depend on the choice of µ, but this dependence has been left out in all equations for brevity, because µ is selected a priori and then kept fixed.
whereĥ contains the usual one-electron operators (kinetic energy and nuclear-electron attraction),ĝ lr ee was defined in Eq. (2), and the short-range DFT potential operator is defined It should be stressed that special exchange-correlation functionals are a prerequisite for range-separated wave function DFT hybrids (this point is explained thoroughly in ref. 64).
We use in this work the short-range PBE-based srPBE functional by Goll et al. 39,65 In all cases, the applied multiconfigurational wave function ansatz was of the CASSCF type.
Further, in a few trial calculations (reported in the SI), we also employed a wave function ansatz based on Møller-Plesset second order perturbation theory (MP2). Since the applied multiconfigurational wave function ansatz was of the CASSCF type, we will henceforth refer to MC-srDFT with respect to the choices of multiconfigurational wave function and functional, i.e., CAS-srPBE for the method employed in this paper, and MC-srDFT for the general method.
For the range-separation parameter, most studies on range-separated DFT hybrids 66,67 employ values between 0.33-0. All calculations were carried out with a development version of the DALTON program. 68,69 Further details about the MC-srDFT method, as well as the implementation, can be found elsewhere. 70

Model systems and basis sets
As the name indicates, the active site of [NiFe] hydrogenase consists of a Ni ion and a Fe ion. The former is coordinated to four Cys residues, two of which are also bridging to the Fe ion. The latter also coordinates one CO and two CNligands. In this paper, we compare the stability of two binding modes of H 2 to this site, viz. binding side-on to Ni or to Fe. The two binding modes will be called H 2 -Ni and H 2 -Fe, and they are shown in Figure 1 (note that H 2 actually bridges the two metal ions in the H 2 -Fe binding mode). In analogy with our previous study, 58 we used for each state three models of increasing size, also shown in Figure 1. In the smallest model 1, the four cysteine ligands were modeled by HSgroups, whereas in the other two models they were modeled by CH 3 S -. In the largest model 3, two second-sphere residues were included, Glu34 and His88 (residue numbering according to the crystal structure with PDB entry 1H2R 71 ), modeled by acetic acid and imidazole, respectively. The structures were taken from our previous study 58  hydrogenase. 76 The calculations presented here were carried out with three basis sets of increasing size, denoted B1-B3. For the smallest one (named B1), the cc-pVTZ 77,78 basis set was employed for the Ni and Fe ions, and the cc-pVDZ 77 basis set was used for the other atoms. The effect Figure 1: The models we used in this work.
of increasing the basis set was investigated by using the cc-pVTZ basis set on all atoms except H, for which the cc-pVDZ basis set was used. This basis set resembles the ANO-type basis set employed in Ref. 58

Selection of active spaces
The selection of active space is of highest importance for an MCSCF calculation and different strategies have been proposed for this selection. One strategy has been to rely on identifying orbitals from the chemical context. 82 For transition metal complexes, this has typically led to the suggestion that all 3d orbitals and a few ligand orbitals should be included, and preferably also an additional (double-shell) of 3d orbitals. A different strategy relies on selecting orbitals based on natural occupation numbers from methods where the predicted occupation numbers are qualitatively correct. This could be either MP2 83 or a computationally cheap CI method.
Typically, one would select orbitals with occupation numbers significantly different from 2 or 0. Rules for selection of active spaces are not as well established for short-range DFT methods. Occupation numbers based on MP2-srDFT have previously been discussed 84 and it was noted these natural occupation numbers are much closer to 2.0 and 0.0 than their MP2 counterparts. This is expected because the short-range density functional effectively Considering that the MP2-srPBE occupation numbers might not reflect the "true" occupation number (i.e. occupation numbers obtained with a full-CI-srPBE approach), we initially investigated CAS(10,10), CAS (14,14) and CAS (16,16) for both species. The corresponding CAS-srPBE occupation numbers are also shown in Tables S2 and S3. For CAS (16,16), we start to include orbitals with either very high or very low occupation numbers (above 1.99 or below 0.01), which affects the convergence. The CAS(16,16)-srPBE calculation for H 2 -Fe also shows that the occupation numbers for the last two orbitals in what would correspond to a CAS(12,12) become even closer than for the MP2-srPBE calculation. Hence, CAS(12,12) will become unstable and prone to get stuck in local minima, which was confirmed by a trial calculation with this active space. The orbitals causing these difficulties are involved in the Fe−CN and Fe−CO bonds, and care must be taken to include these orbitals uniformly in the two states. This is done in CAS (14,14), which is the largest active space that can be considered balanced (and it is also feasible for the larger models 2 and 3).
Visual inspection of the CAS (14,14) orbitals in Figure 2 shows that this active space includes the Ni 3d-orbitals, the H 2 and metal-ligand (CO π-type) orbitals, although the orbitals are more delocalized than the pure DMRG-SCF (or CASSCF) orbitals in refs. 58,85.
Further reduction of the active space to CAS(10,10), leads to exclusion of orbitals that are partly on hydrogen and the Ni ion, and we therefore prefer to include these two orbitals (i.e. orbitals 4, 5, 13 and 14 are included for H 2 -Ni in Figure 2, compared to Figure S2).
Furthermore, the occupation number of the Ni orbital in H 2 -Ni (orbital 4 in Figure 2) is around 1.98 in the CAS (14,14) calculations and thus rather close to two of the other orbitals in the active space. This indicates that this orbital should be included.
Expanding the calculations to CAS (16,16), introduces orbitals that are mainly on bridging sulfur atoms and can be considered less important. For instance, for H 2 -Ni, the additional orbitals compared to CAS (14,14) are orbitals 8 and 14 in Figure S5. Although we here focus on CAS (14,14), it should be noted that the effect on the calculated (relative) energies is in fact small (2 kJ/mol and below), as will be discussed in next section. For models 2 and 3, we also focus on CAS (14,14), but we have employed both CAS(10,10) and CAS (14,14) active spaces to probe the effect of the active spaces for these larger models as well. he corresponding active space orbitals are shown in the SI. Finally, we note that we also attempted to select orbitals based on calculations with larger µ values, but this procedure was less satisfactory (shortly discussed in the SI).

Results and Discussion
In this study, we have compared the results of CAS-srPBE with previously published CCSD(T) and cumulant approximated DMRG-CASPT2 calculations for the two binding modes of H 2 to the active site of [NiFe] hydrogenase. 58 We discuss first the smallest model (model 1) and then the two larger models (models 2 and 3) in separate sections.

Calculations with model 1
The energy difference between the H 2 -Ni and H 2 -Fe states (∆E H 2 ) calculated with the CAS-srPBE method is compared to previous CCSD(T) and DMRG-CASPT2 results in Table 1. We report CAS-srPBE results with the CAS(10,10), CAS (14,14) and CAS (16,16) active spaces. In all cases, CAS-srPBE predicts that the H 2 -Ni state is most stable, in agree-  At this point we emphasize that recent studies have noted that multireference perturbation theory to second order does not always recover the 3s,3p correlation well. 86 Table 1

Calculations with models 2 and 3
Next, we carried out CAS-srPBE calculations also for the two larger models 2 and 3 in Figure 1. The results are shown in Table 2 and are compared to the corresponding results obtained with DMRG-CASPT2. It can be seen that the two approaches give similar trends: The energy differences increase in model 2 (24-39 kJ/mol), whereas inclusion of models of two nearby amino-acids counteracts this increase, so that in model 3, the energy difference decreases again to 8-19 kJ/mol. For all three models, H 2 -Ni is thus consistently predicted to be the most stable state and the CAS (14,14)-srPBE results are quite close to that of DMRG(22,22)-CASPT2. From Table 2, it can be seen that the differences to CAS(14,14)-srPBE are 5 and 10 kJ/mol for model 2, depending on whether the DMRG-CASPT2 included def2-QZVPD 34.0 18.5 a With 3s,3p correlation obtained from CCSD(T). b Extrapolated with the energy difference between CCSD(T) and DMRG (22,22)-CASPT2 for model 1, the latter with 3s,3p correlation obtained from CCSD(T). Although the performance of CAS-srPBE is encouraging for applications in metalloenzymes, further improvements are possible: For instance, the accuracy of the srDFT functional can be improved. This could be achieved by either including exact (short-range) DFT exchange or by including kinetic energy dependence in the same way as in meta-GGA functionals. 87

Conclusions
In this paper, CAS-srPBE calculations were performed on three models of H 2 bound to [NiFe]-hydrogenase. Our results indicate that H 2 binding to Ni is more stable than binding to Fe, which is consistent with previous calculations with the CCSD(T) and DMRG-CASPT2 methods. 58 Our CAS-srPBE calculations with reduced active spaces (CAS(10,10), CAS (14,14) and CAS (16,16)) gave results close to the CAS (22,22) Table S1. One notes that ∆E H 2 decreases steadily with increasing µ. Recalling that increasing µ means moving correlation from the short-range DFT part to the long-range WFT part, a possible reason for these results could be that MP2 is inadequate to describe the energy difference in wave function theory. We therefore decided to calculate ∆E H 2 also with CAS (14,14)-srPBE for the most relevant µ values, and these results are also reported in Table S1. These values clearly support our contention that MP2 is inadequate, while CAS (14,14) is sufficiently flexible and accurate to describe the transfer of correlation effects from the DFT part to the WFT part of the MC-srDFT model when µ changes from 0.4 to 0.7. As a side remark, the variation in the CAS (14,14) natural orbital occupation numbers in Tables S2 and S3 as a function of the µ value shows explicitly that the long-range correlation effects increase when µ is increased.
From these results, µ =0.4-0.7 all give reasonable ∆E H 2 energy differences compared to DMRG (22,22)-CASPT2 and CCSD(T) values (cf. Table 1 in the paper). Previous studies (see paper for references) on smaller systems have found that 0.4 is optimal, our results here provide no reason for advocating another value here, and we therefore follow our earlier recommendation and use a value of µ = 0.4 in this study.