Manuel
Sparta
and
Frank
Neese
*
Max Planck Institute for Chemical Energy Conversion, Stiftstr. 32–34, D-45470 Mülheim an der Ruhr, Germany. E-mail: Frank.Neese@cec.mpg.de
First published on 27th March 2014
The scope of this review is to provide a brief overview of the chemical applications carried out by local pair natural orbital coupled-electron pair and coupled-cluster methods. Benchmark tests reveal that these methods reproduce, with excellent accuracy, their canonical counterparts. At the same time, the speed up achieved by exploiting the locality of the electron correlation permits us to tackle chemical systems that, due to their size, would normally only be addressable with density functional theory. This review covers a broad variety of the chemical applications e.g. simulation of transition metal catalyzed reactions, estimation of weak interactions, and calculation of lattice properties in molecular crystals. This demonstrates that modern implementations of wavefunction-based correlated methods are playing an increasingly important role in applied computational chemistry.
On the other hand, wavefunction-based ab initio methods can be systematically improved and converged toward the exact solution of the Schrödinger equation. Among the ab initio quantum mechanical methods the coupled-cluster model with single and double excitations corrected by perturbative triples (CCSD(T)) has become the “gold standard” of computational chemistry.5–8 However, the very high computational expense and the high order scaling (O(N7)) of the canonical procedure render CCSD(T) unsuitable for routine applications. Hence its use is usually restricted to benchmark studies on very small systems.
One of the challenges in modern theoretical chemistry method developments has been to devise and implement approximations that expedite correlated ab initio methods without loss of accuracy. The strategies adopted involve the partition of the system of interest into fragments9–21 or the construction of correlation domains while the whole system is treated at once.22–41 In recent years, our group has been active in the field of local correlation methods and developed a way to take advantage of the locality of the electron correlation by means of Pair Natural Orbitals (PNOs, see Theoretical background). The framework for the development of the correlated method was named the Local Pair Natural Orbital (LPNO) where the term “Local” alludes to the fact that the internal space is spanned by localized internal orbitals.42,43 Further improvements gave rise to the Domain-based Local Pair Natural Orbital (DLPNO).44,45 Several correlated methods have been implemented in the (D)LPNO foundations and, acknowledging the efficiency and accuracy demonstrated, the use of PNOs in local correlation calculations has rapidly regained popularity. Tew, Helmich and Hättig have used pair natural orbitals in the implementation of explicitly correlated Møller–Plesset perturbation theory (MP2, MP3).38,39 Helmich and Hättig, focusing on excitation energies, explored how to reduce the number of double excitation amplitudes in response calculations with PNOs and localized occupied orbitals46 and subsequently implemented the iterative coupled-cluster method CC2.40 Krause and Werner systematically compared the use of projected atomic orbitals, pair natural orbitals and orbital specific virtual orbitals (OSVs) concluding that, for a given accuracy, the PNO correlation domains are, in average, four time smaller than the one obtained with PAOs or OSVs.47 Kallay and coworkers presented a local coupled-cluster method that combines the cluster-in-molecule approach with virtual orbitals constructed including approximate MP2 natural orbitals.48,49
In this review, in spite of the early age of the (D)LPNO-based methods, we provide an overview of chemical applications where their robustness, accuracy and affordability were exploited, in the hope that this motivates more researchers to apply ab initio correlated methods to their work. Furthermore, the wide-ranging scope of the studies covered in this survey demonstrates how modern implementations of highly accurate wavefunction-based correlated methods have a broad range of applicability and are becoming more common in many branches of computational chemistry.
Recently, our group attempted to revive the use of PNOs in correlated calculations by applying density-fitting approximations and by taking advantage of modern computational architectures.42,43 The first method implemented in the LPNO foundations was the coupled-electron pair approximation (CEPA)42 initially developed by Meyer.60–62 CEPA has shown to provide results that are of intermediate quality between CCSD and CCSD(T).64,65 The coupled-cluster with single and double excitations model (CCSD)43 as well as the parameterized coupled-cluster singles and doubles model (pCCSD)66 were made available within the LPNO framework. Subsequently, the LPNO-CCSD method was completely redesigned to address its inherent fifth order scaling. The new implementation combines the concepts of PNOs and Projected Atomic Orbitals (PAOs).44 PAOs were introduced by Pulay and Saebo,22–26 and extensively used in the development of various correlation methods by Werner and Schütz.27–33 The resulting DLPNO-CCSD was found to be near linear scaling.44 Finally, the addition of perturbative treatment of the triple excitations led to the DLPNO-CCSD(T) model.45
The key aspects of the DLPNO coupled-cluster methods can be briefly summarized. The initial step consists of the localization of the occupied orbitals obtained from a single determinant reference wavefunction calculation. The electron correlation for the whole system is given by the sum of over electron pair correlation energies εij (i and j refer to localized occupied orbitals). Using a local MP2 estimate of the pair correlation energy, the electron pairs are partitioned into “strong” (pair correlation larger than the TCutPairs threshold) and “weak” (see the left panel in Fig. 1). Only strong pairs are explicitly treated in the coupled-cluster procedure, whereas the MP2 estimates for the weak pairs are added a posteriori to the total correlation energy. For each strong pair, the virtual space for the correlation consists of PNOs constructed from the MP2 pair densities. PNOs with occupation numbers larger than the TCutPNO threshold (default value: 3.33 × 10−7) are kept and a MP2 correction, accounting for the truncation of the virtual space, is computed (see the right panel in Fig. 1). PNOs and integrals needed are expanded in terms of PAO domains, whose size is controlled by the TCutMKN threshold (molecular structure inserted in the right panel of Fig. 1). For a more comprehensive description of the LPNO and DLPNO methods, we refer to the original literature.42–45
All the (D)LPNO methods were implemented and distributed free of charge to the quantum chemistry community via the ORCA67 suite of programs.
In this respect, the asymmetric hydrogenation of prochiral olefins using Rh catalysts is the prototype of an enantioselective transition metal-catalyzed reaction. The reaction mechanism in the presence of the ligand [(R,R)-MeDuPHOS] was studied for two substrates: α-formamidoacrylonitrile and N(1-tertbutylvinyl)formamide by Anoop et al.69 (see Scheme 1).
Scheme 1 Asymmetric hydrogenation of enamides. Reprinted with permission from A. Anoop, W. Thiel and F. Neese, J. Chem. Theory Comput., 2010, 6, 3137–3144. Copyright 2010 American Chemical Society. |
The method adopted in this work is the local pair natural orbital coupled-cluster theory with single and double excitations (LPNO-CCSD). The method was carefully calibrated with respect to canonical CCSD and CCSD(T) results on a truncated model system. For this specific system, it was observed that the lack of the perturbative triple correction and the errors due to basis set incompleteness do not affect substantially the reaction energy profiles calculated with LPNO-CCSD/def2-TZVP. In conclusion, the energies computed for the full system are believed to be within 1–2 kcal mol−1 of the CCSD(T)/CBS limit.69
Several pathways for the hydrogenation process were investigated for both substrates, and it was concluded, in agreement with previous theoretical studies, that the enantioselectivity of the process is rooted in the different reactivity of the catalyst-substrate adduct. In practice, the presence of the polar amide group enables bidentate coordination of the substrate to the metal center. Subsequently, the substituents (cyano vs. tert-butyl) at the CC influence the reactivity of the adduct determining the less energetically demanding site for the attack of H2. In turn, this leads to an excess of the (S) enantiomeric product for the butyl system whereas the R-product is obtained for the cyano system. For both substrates, the computational prediction regarding the stereochemistry of the major product was found to be in agreement with the experiment.69
As shown, the applicability of the Rh based catalyst is restricted by the requirements of a polar functionalization of the substrates. To overcome this limitation Pfaltz and coworkers70 developed iridium complexes capable of promoting the enantioselective hydrogenation of unfunctionalized alkenes. The key feature of these complexes is the presence of a chiral chelating ligand that couples a heterocycle containing a sp2 hybridized nitrogen atom with a trisubstituted phosphorus (or N-heterocyclic carbene) as shown in the PHOX ligand.71 Recently, the catalytic cycle for the hydrogenation of ethylene and five trisubstituted prochiral olefin substrates promoted by the Ir–PHOX complex was investigated in our group72 by employing the DLPNO-CCSD(T) method. In this case, ab initio electronic energies were combined with solvation and thermochemical corrections computed with DFT to access Gibbs free energies of the species in solution. For this system, it was shown that the fine balance between steric repulsion and Van der Waals interaction between the substituted oxazoline terminus of the ligand and the substituents on the olefin substrates determines whether the Si-face or Re-face coordination of the prochiral substrate is more reactive. For all the substrates, the predicted enantiomeric excesses were found in good agreement with the available experimental data.72
The possibility of obtaining high quality ab initio data for systems containing more than one metal atom was explored in a detailed study on the equilibrium between the peroxo and bis-(μ-oxo) isomers of [Cu2(en)2(O)2]2+. In this case, the scan of the potential energy surface (PES) connecting the two structures was conducted using LPNO-CCSD energies in the complete basis set limit complemented with canonical perturbative triple corrections obtained with a small basis set. In all calculations, relativistic effects as well as solvation correction were accounted for.73 The systematic survey determined that the bis-(μ-oxo) isomer is more stable than the peroxo counterpart, in agreement with experimental results. Furthermore, it was concluded that the inclusion of relativistic corrections is important for a proper estimation of the relative stabilities. Relativistic corrections computed with three different approaches (second-order Douglas–Kroll–Hess (DKH) transformation, zeroth-order approximation for relativistic effects (ZORA) or effective core potentials (ECPs)) concur that the net effect of relativity is the stabilization of the bis-(μ-oxo) isomer. Solvation effects (accounted for with a dielectric continuum model) were found to give a similar contribution.
In Fig. 2, the best ab initio estimate (scalar-relativistic LPNO-CCSD/CBS supplemented with solvent and triple excitation corrections) is taken as reference and compared with the most accurate of the DFT functionals (B3LYP-D). Although B3LYP-D correctly predicts that the bis-(μ-oxo) isomer is the most stable, the energy difference is overestimated by 9.3 kcal mol−1 and the position of the minimum for peroxo species is calculated to be 0.1 Å too short. Interestingly, although this system is often regarded as a prototypical multireference case, the analysis of the wavefunction across the PES suggested that the multireference character in this system is very limited.73
While in the previous examples, coupled-cluster based methods were adopted, there are cases where CEPA types of approaches were preferred. For example, LPNO-CEPA/1 was applied to investigate the stability and reactivity of azaphosphiridine P-pentacarbonylchromium(0) complexes.74 The chemistry covered in this contribution is summarized in Scheme 2.74
This computational study suggested that in the presence of additional carbon monoxide, CO insertion into the P–N bond may occur yielding the 1,3-azaphosphetidin-2-one complex. Furthermore, it was shown that the opening of the ring promoted by the moderate ring strain can be controlled through appropriate tuning of the electronic properties and steric bulk of the P-substituent.74
One further example where high level LPNO-CEPA/1 calculations were performed to obtain reliable reaction barriers and binding energies is due to Kubas et al.75 In their study, the addition of dialkylzinc to a,b-unsaturated aldehydes (an important class of reactions for C–C bond formation) was investigated. In specific, the asymmetric additions of dimethyl- and diethyl-Zn, catalyzed by [2.2]paracyclophane-based N,O-ligands, was considered (see Scheme 3).75
LPNO-CEPA/1 results were used to assess the performances of a set of popular theoretical methods. Based on their benchmark study, the authors concluded that the stereoselectivity of the reaction as well as binding energies were properly addressed by DFT when empirical dispersion corrections were accounted for. However, the use of the highly correlated wavefunction-based method was necessary for a correct prediction of the regioselectivity of the reaction.75
Fig. 3 Hydroxylation of the substrate p-hydroxybenzoate by the cofactor flavin hydroperoxide in the active site of PHBH. Average activation barriers and the associated standard deviation (as error bars) are shown (kcal mol−1) for 10 snapshots representative of the molecular dynamics simulation of the enzyme. LCCSD, LCCSD(T0) and B3LYP (basis set: (aug)-cc-pvTZ) values are taken from ref. 76. Basis set for the DLPNO calculations: def2-TZVP. |
Finally, in the validation of the DLPNO-CCSD(T) method, a benchmark dataset consisting of 51 reaction energies, for which accurate results have been recently published by Friedrich and Hänchen,78 was considered. The database covers a large set of chemically interesting processes (e.g. isomerizations, hydrogenations, allylic shifts and oxidations) and it shows a broad spectrum of reaction energies (0.1 to 150 kcal mol−1). With the default setting for the thresholds controlling the DLPNO procedure, all the errors in the energy reaction dataset with respect to semicanonical reference calculations are found to be smaller than 1 kcal mol−1 (Mean Absolute Deviation, MAD = 0.31 kcal mol−1).79 A similar result was obtained by Schwabe in his benchmark study of LPNO-CEPA and LPNO-pCCSD, for a database designed to validate electronic structure methods for isomerization reactions of large organic molecules.80
In 2006, Hobza and coworkers developed a widely used benchmark set for studying noncovalent intermolecular interactions (the S22 set).81 Liakos et al. investigated the accuracy of the LPNO-CEPA/1 method for the S22 dataset.82 When compared with the most accurate ab initio results available, LPNO-CEPA/1 was found to deliver a MAD of 0.24 kcal mol−1 and this accuracy exceeds that of most purpose specific density functionals (see Fig. 4).
In 2011, the S22 dataset was extended to account for a larger variety of interactions such as hydrogen bonds, aliphatic–aliphatic and π-aliphatic interactions, giving rise to the S66 dataset.83 The accuracy of the DLPNO-CCSD(T) method was recently assessed by probing the S66 interaction energies. Based on this investigation, three different sets of the thresholds that control the DLPNO procedure were selected (namely LoosePNO, NormalPNO and TightPNO) to allow users to optimally balance performance and accuracy. In agreement with the previous results for the S22 dataset, it was found that a MAD of 0.24 kcal mol−1 is obtained with the default settings (see NormalPNO in Fig. 5) for the DLPNO-CCSD(T) calculations when compared to their semicanonical CCSD(T) counterparts.79
Antony et al. conducted a survey on protein–ligand interaction energies with dispersion corrected DFT and high-level wavefunction-based methods.84 The systems investigated are truncated models (from 50 to 300 atoms) of structures deposited in the Protein Data Bank and are considered to be representative of noncovalent interactions occurring in drug-target adducts. For this investigation, LPNO-CEPA/1(CBS) was used as non-empirical reference to evaluate the performances of DFT-D. In agreement with earlier studies, it was concluded that DFT-D generally overestimates the binding energy by ca. 10% (2–4 kcal mol−1 for large interactions).
The S66 dataset was used to investigate basis set extrapolation schemes for total energies as well as for weak intermolecular interactions.85 A common approach to handle basis set incompleteness consists of complementing high-level calculations with a basis set limit estimate computed with a lower level of theory, typically second-order Möller–Plesset. Alternative schemes, where the MP2 is replaced by LPNO-CEPA/1, were investigated. It was shown that, owing to the highly systematic nature of the deviations between canonical and LPNO methods, more accurate results can be obtained employing LPNO-CEPA/1 in the extrapolation procedure.85
Similarly, two datasets based on the relative energies of the conformers of melatonin and butane-1,4-diol were used in the benchmark of the DLPNO-CCSD(T) method.79 When compared to the semicanonical counterpart, the DLPNO-based approach delivered MAD equal to 0.04 and 0.22 kcal mol−1 for butane-1,4-diol and melatonin, respectively.79
In addition to the aforementioned benchmark surveys, the accuracy of the LPNO methods has been exploited in application studies. For example, Dhaked and Bharatam investigated, with LPNO-CCSD and LPNO-CEPA/1, the tautomeric (enamine, imine and nitronic acid, see Fig. 7) forms of N-ethyl-N′-methyl-2-nitro-1,1-ethenediamine.87 This compound served as a model to address the nitroethenediamine moiety present in several medicinally important molecules such as ranitidine. Data obtained at the MP2 level of theory suggested that the latter was more stable by ca. 1 kcal mol−1 whereas various DFT functionals predicted the former to be more stable by 6–9 kcal mol−1. The LPNO results indicate that the imine tautomers are only about 1–2 kcal mol−1 less stable than the enamine global minimum.87
Fig. 7 Possible tautomers of ranitidine. Reproduced from ref. 87 with permission from The Royal Society of Chemistry. |
Harvey and coworkers reported on the accuracy of interproton distances derived from Nuclear Overhauser Effect data (NOE).88,89 After observing that, contrary to the common perception, NOE measurements are accurate enough to establish interproton distances for rigid molecules88 the authors investigated whether the approach can be extended to molecular systems that exhibit multiple configurations in solutions. To tackle this problem, the relative energies of the different conformers of a flexible molecule (4-propylaniline) were computed with LPNO-CEPA/1 to derive the populations at equilibrium.89 The good agreement between the theoretically- and NOE-derived average interproton distances supports the accuracy of NOE based data.89
Ashtari and Cann employed LPNO-CCSD to investigate, from a theoretical point of view, poly-proline chains and derivatives for chiral high performance liquid chromatography (HPLC).90,91 In chiral HPLC, the selective moiety (in this case proline chains) is immobilized on the surface so that its chirality confers selectivity on the stationary phase. In order to characterize the stationary phase based on the selector on the surface, the authors initially investigated the major conformers for each proline chain assessing their relative energy with LPNO-CCSD/6-311G(d,p). Subsequently, force fields were developed and selected based on their ability to reproduce the relative energies of the conformers as predicted by LPNO-CCSD. Molecular dynamics simulations, employing the derived force fields, were carried out to characterize each chiral interface.90,91
During the development of analytic potential to be used in the study of the interaction between formaldehyde and carbon based nanostructures (graphene sheets and carbon nanotubes), Dodda and Lourderaj used a formaldehyde–pyrene model system.93 Various ab initio and DFT methods were tested against CCSD(T) references and it was concluded that the most accurate results were obtained with LPNO-CEPA/1 in the complete basis set limit. The LPNO-CEPA/1 data were then fitted to an analytical potential energy capable of correctly characterizing minima structures not included in the fitting procedure confirming its global nature.93
Sancho-García and coworkers used reference calculations at the LPNO-pCCSD/1a level of theory to construct a database of anthracene dimer interactions representative of those found within the molecular crystal.94,95 The dataset was used to assess the accuracy of DFT-D based methods that, in turn, gave fairly accurate estimation of the cohesive energy of the anthracene crystal.94 In addition, the LPNO-pCCSD/1a results were compared against the experimental sublimation energy (electronic component), demonstrating that the method accurately reproduces values in the 1–2 kJ mol−1 range from its experimental counterpart.95
In a combined experimental and theoretical study, the transport properties of four commercial cationic membranes toward two counterions (namely H+ and Na+) were investigated.96 In their study, the authors attempted to correlate ab initio calculations on membrane models with experimental conductometric measurements to explain the difference in the mobility of the two ions. In this case LPNO-CEPA was the method of choice in the computation of binding energies.
In passing, we note that larger models can now be investigated thanks to the near linear scaling obtained within the DLPNO framework. For example the first calculation at the CCSD(T) level on an entire protein (Crambin, 644 atoms, 6100+ basis functions) was recently reported.45
The broad variety of the chemical applications covered in this review demonstrate that there is a rapid increase in the use of the (D)LPNO methods in conjunction with or in favor of DFT. In our view it is likely that these methods will play an increasingly important role in computational chemistry for large molecules or even extended systems. Obviously much further development work is necessary in order to proceed beyond the closed-shell systems described in this review. However, such further developments are now being pursued in a number of research groups and hence there is every reason to be optimistic about the future of these methods.
This journal is © The Royal Society of Chemistry 2014 |