Excited states in large molecular systems through polarizable embedding

In this perspective, we provide an overview of recent work within the polarizable embedding scheme to describe properties of molecules in realistic environments of increasing complexity. After an outline of the theoretical basis for the polarizable embedding model, we discuss the importance of using an accurate embedding potential, and how this may be used to significantly reduce the size of the part of the system treated using quantum mechanics without compromising the accuracy of the final results. Furthermore, we discuss the calculation of local electronic excited states based on response theory. We finally discuss aspects related to two recent extensions of the model (i) eﬀective external field and (ii) polarizable density embedding emphasizing their importance for eﬃcient yet accurate description of excited-state properties in complex environments.


Introduction
Photoresponsive systems are ubiquitous in Nature, and biological functions such as photosynthesis, phototropism, phototaxis and vision are all triggered and regulated by electronic excitation processes. 1Excitation phenomena in molecular systems also find applications in many areas within physics and chemistry such as organic solar cells, 2 light-driven molecular rotors 3 and optical probes 4,5 and are thus critical to our understanding of physical, chemical and biological processes.To gain insight into such fundamental processes at the molecular level use of computational methods is highly beneficial.The great virtue of computational science is that simulations can provide information directly at the atomistic scale and classical simulation methods, like classical molecular dynamics (MD), have for a long time been established methods within the physical, chemical and biological sciences where they are used to complement theory and experiment.However, the main shortcoming of classical MD is its lack of flexibility in regards to processes that can be studied, and the fact that it relies on a predefined force field. 6uantum chemistry and dynamics provide a powerful tool to elucidate key aspects of the fundamental mechanisms underlying light-triggered processes that are imperative to achieve rational molecular design.The large-scale nature of molecular materials and biomolecular systems, however, greatly challenges the use of conventional quantum-chemical methods because of their high computational cost which rapidly increases with system size.][9][10] Such models rely on the assumption that the total system can be partitioned into two systems in which the quantum-mechanical description is restricted to only a smaller part of the system (the chromophore), whereas the remaining part (the environment) is treated classically.This effectively extends the range of applicability of quantummechanical methods to large molecular systems.Focusing on properties related to the part of the system treated using quantum mechanics, embedding methods may be introduced as models in which the environment is represented by an embedding potential.Such potentials may be of purely electrostatic nature or contain non-electrostatic components as well.One usually distinguishes between embedding methods that treat the environment as an unstructured medium or in atomic detail.In the continuum models, belonging to the first class of these methods, the environment is represented by a dielectric medium that surrounds a cavity hosting the quantum region. 11he most advanced continuum models belong to the class of polarizable continuum models (PCM) 12,13 in which the environment polarization is represented by an apparent surface-charge distribution spread on the cavity surface.The great advantage of the continuum models is the implicit description of the environment dynamics meaning that there is no need for explicit configurational sampling.A physically more appealing strategy is atomistic embedding in which the environment is represented by its discrete charge distribution.The most common embedding strategy is electrostatic embedding, where the environment is represented by a static charge distribution (described by charges or more generally by multipoles).The impact of the environment then enters into the Hamiltonian for the quantum region as a one-electron operator mimicking the nucleus-electron interaction operator.While this scheme allows the quantum region to be polarized by the environment, the latter cannot respond to changes in the quantum region.This effect, however, can be very important and is crucial for a physically correct coupling, and mutual polarization between the quantum region and the environment-so-called polarized embedding-was in fact already considered in the pioneering work on QM/MM by Warshel and Levitt. 7An overview of selected features of various embedding methods are summarized in Fig. 1.Within the context of embedding strategies designed for describing excited states, polarization is usually accounted for either on the basis of induced dipoles, [14][15][16][17] or by means of a fluctuating charge (FQ) model. 18Fig. 1 also includes QM/QM embedding methods in which a quantum-mechanical description of the environment is retained.Examples are embedding fragment ab initio model potentials (FAIMP) 19,20 and frozen density embedding (FDE). 21,22In such QM/QM-based embedding methods, polarization can be included through so-called freeze-and-thaw iterations making these models computationally more costly but provides on the other hand a more complete description of the electrostatics, and, in addition, they also include non-electrostatic interactions.A detailed comparison of various embedding methods can be found in ref. 23 and 24 together with general discussions of the relationship between QM/QM embedding approaches and fragmentation-based many-body computational schemes such as the fragment molecular orbital (FMO) method. 25ith the aim of addressing general molecular properties and excited states of embedded systems, we have in recent years formulated and applied the polarizable embedding (PE) model 26,27 -as detailed in the following sections-to various molecular systems.The strength of the PE model lies in its formulation within quantum-mechanical response theory 28 which makes it possible to calculate a variety of molecular properties for complex systems in an efficient yet accurate manner.The PE model is a focused computational scheme meaning that only a part of the total system is treated using quantum mechanics whereas the major part of the system is described through an embedding potential.This makes the PE model very efficient and thus ideal for combination with, e.g., MD in order to include effects of configurational sampling -an issue which is very important for both homogeneous and heterogeneous molecular systems.In addition, we have recently introduced the focused polarizable density embedding (PDE) 29 model which, as detailed later in this perspective, captures the main physics of QM/QM embedding but at a reduced computational cost.
In this perspective, a description of the PE model will be presented.Particular emphasis will be given to the description of electronic transition properties associated with local electronic excitations and the physical nature of their coupling to the environment.The presentation will be organized in five parts: first, we define the model in a qualitative fashion highlighting the approximations made in going to an effective environment description followed by a more rigorous definition of the basic elements of the model.We also address the construction and accuracy of the central embedding potential.We proceed by extending the model to a quantum-mechanical response theory framework and discuss an important problem caused by the nonlinear embedding operator.Then, we illustrate the capabilities of the model by its application to the calculation of absorption properties of two chromophore-protein complexes, including a discussion of the importance of the explicit polarization component.We further briefly outline the PDE model, which is a recent extension that addresses two issues of the PE model.Finally, we conclude with some reflections on future developments.

Fundamentals of the PE model
The PE model relies on a physical division of the system into subsystems: a quantum region and its environment, which are This journal is © the Owner Societies 2016 treated at different levels of theory.We start with a brief outline of the series of approximations made in the transition from a full quantum-mechanical description to this differentiated treatment.To this end, consider a system that can be decomposed into subsystems A and B. In the end, the goal is to describe only subsystem A (defining the quantum region) explicitly, while incorporating the effect of subsystem B (the environment) through an effective embedding operator.Fig. 2 provides an overview of the approximations that lead to the embedding operator defining the PE model.
The key assumption in the PE model and other quantumclassical approaches is that the subsystems are strictly nonoverlapping (step 1).This implies that exchange-repulsion between subsystems vanishes and there is no need for intersubsystem antisymmetrization. 30Within this approximation, the exact wave function of the system can be written as a linear combination of the states in the direct-product space of the complete subsystem spaces. 31The next step is to invoke a mean-field approximation of the inter-subsystem interactions by restricting the wave function of the composite system to a single generalized Hartree product of subsystem wave functions (step 2).The variational condition then leads to a set of coupled subsystem equations, each involving the electrostatic potential of the other subsystem. 32At this point, no distinction is made between the description of subsystem A and B.
To achieve a differentiated treatment of the subsystems, a perturbation approach is taken for subsystem B (the environment), allowing it to be only linearly responsive.This corresponds to keeping terms in the electrostatic potential through second order in the interaction (step 3). 32,33Hereby, we obtain an effective optimization equation for subsystem A (the quantum region), involving the electrostatic potential of the permanent and induced charge distributions of B. This potential depends on the density of subsystem A itself through the induced charge distribution of B and thus leads to a non-linear effective Hamiltonian.Finally, we make the approximation to represent the permanent and first-order induced electrostatic potentials of subsystem B by multipole expansions to eliminate the explicit reference to its wave function (step 4).The use of multipole expansions is computationally advantageous because the intersubsystem two-electron Coulomb integrals involved in steps 2 and 3 are replaced by one-electron integrals.
In summary, this series of approximations leads to an effective Schro ¨dinger equation for the quantum region in which the electrostatic effect of the environment enters through an embedding operator.In the PE model, the permanent and first-order induced charge distribution of the environment is described by distributed multipole expansions taking the isolated environment molecules as expansion points: each site in the environment is assigned a set of permanent multipole moments as well as an anisotropic dipole-dipole polarizability.5][36][37][38] A comparison of different schemes can be found in ref. 39.In the following, we detail the working equations for the PE model.While |0i will denote a generic variational wave function and j 0i its time-dependent counterpart, we note that the PE model has been implemented at several quantum-mechanical levels, encompassing both variational and non-variational electronic structure methods.Starting with the original formulation and implementation of the PE model at the level of Hartree-Fock (HF) and Kohn-Sham density functional theory (KS-DFT) for linear and up to cubic response, 26,27,40 the PE model has been implemented within the framework of multiconfigurational self-consistent-field (MCSCF) wave function theory and linear response, 41,42 including the multiconfiguration short-range DFT (MC-srDFT) approach, 43 and the density matrix renormalization group (DMRG) method, 44 coupled-cluster (CC) linear and quadratic response theory 45,46 including the very efficient RI-CC2 model, 47,48 as well as a linear response implementation within the secondorder polarization propagator approximation (SOPPA) formalism. 49urthermore, the use of London atomic orbitals has made it possible to calculate gauge-origin independent magnetic properties. 50We also note that the PE model has been combined with continuum solvation descriptions for inclusion of bulk solvation effects. 51,521 The PE-QM energy functional In the PE model, the energy functional for a composite system consisting of a quantum region embedded in a polarizable environment is partitioned into three contributions where E QM is the energy of the quantum region (including wave function polarization), and E env is the internal energy of all fragments in the environment, including the electrostatic interaction between these fragments, but excluding the energy associated with the creation of the dipole polarization in the environment.The latter contributes to the variational determination of the wave function/density of the quantum region, and it is therefore expedient to include the polarization energy of the environment in E PE together with the contributions arising from the direct interactions of the quantum region with the environment.The polarizable embedding energy can be decomposed as where E es and E ind are the electrostatic contributions arising from interactions between the quantum region and the permanent and induced charge distributions of the environment, respectively.Terms modeling non-electrostatic interactions such as dispersion, non-electrostatic repulsion and charge transfer can in principle be added as energy corrections, but they are not part of the PE model because such energy corrections do not enter into the optimization of the wave function/density of the quantum region.The electrostatic interaction energy can be written succinctly by introducing a multi-index notation. 53A multi-index k = (k x , k y , k z ) is a 3-tuple, i.e., an ordered list of non-negative integers, where each integer is associated to the indicated component of a Cartesian coordinate.The norm and factorial of a multi-index are defined as |k| = k x + k y + k z and k! = k x !Ák y !Ák z !, respectively.Using this notation, the electrostatic interaction energy takes the form where the |k| summation runs over the (|k| + 1)(|k| + 2)/2 multiindices for a given |k|, e.g., for |k| = 1, the summation is over (1,0,0), (0,1,0) and (0,0,1), which corresponds to x, y and z components.M (k) s is the k-component of the |k|'th-order Cartesian multipole moment located at R s in the environment, and Z n is the nuclear charge of the n'th nucleus in the quantum region.
K s is the truncation level of the multipole expansion assigned to site s.S is the number of sites in the environment.
Within a second-quantization formalism, the electrostatic v ˆes operator is defined as where E ˆpq is a one-electron orbital excitation operator 54 and p and q label general molecular orbital indices belonging to the quantum region.The t (k) pq (R s ) integrals are given by and involve the k'th component of the interaction tensor defined as a partial derivative of the potential between two sites i and j, i.e., For a linearly responsive environment, the variationally optimized induction energy associated with the polarization of the environment both internally and by the electrons and nuclei in the quantum region amounts to half the interaction energy with the induced dipole moments 32,55,56 where the vector l s is the induced dipole moment at site s in the environment.F(R s ) is the electric field vector acting on site s that contains the fields from the nuclei and electrons in the quantum region as well as the fields from the permanent multipole moments in the environment The induced dipole moments are obtained from a variational optimization of the classical energy of a collection of induced dipoles interacting with an electric field.This can be cast into a set of classical linear response equations given as 57 lðFÞ ¼ BF; where the 3S-dimensional vector l contains all the induced dipole moments at the polarizable sites in the environment and F the corresponding electric fields.B is the classical linear response matrix (also called the relay matrix) of dimension 3S Â 3S.It is defined as 57 where the inverse of the electronic dipole-dipole polarizabilities are placed on the diagonal, while the dipole-dipole interaction tensors are found as the off-diagonal elements.
This journal is © the Owner Societies 2016 The wave function/density describing the quantum region is determined by requiring E tot to be stationary with respect to variations of the wave function/density of the quantum region.As discussed in the previous section, the only difference in the resulting wave function optimization equations compared to the vacuum counterparts is the presence of the embedding operator where the electrostatic operator is defined in eqn (4) and the induction part is given by A Cartesian component of the electric-field operator is given by with electric-field integrals defined as Ra;sÀra RsÀr j j 3 f q ðrÞdr: ( The induced dipole moments, and in turn the embedding operator, depend on the electronic wave function/density of the quantum region through the electric fields.In other words, the presence of the embedding operator leads to a non-linear effective Hamiltonian.In Section 3.1, we discuss a complication that arises from this non-linearity upon addressing excited electronic states.As apparent from eqn (4), ( 10) and ( 12), an essential step in setting up a PE-QM calculation is the embedding potential, i.e., the distributed multipoles moments and polarizabilities in the environment representing its electrostatic effects on the quantum region.The next section will be concerned with the construction of the embedding potential.

The embedding potential
The embedding potential parameters (multipole moments and polarizabilities) that are used in the PE model are derived from quantum-mechanical calculations in order to achieve an accurate description of the environment.Special considerations are therefore required in the generation of the embedding potential of large environments involving covalently bonded units (e.g., proteins and DNA), where a full quantum-mechanical treatment cannot be afforded.To overcome this, we employ the fragment-based molecular fractionation with conjugate caps (MFCC) scheme, originally developed by Zhang and Zhang 58 for the calculation of interaction energies but later extended to distributed properties by So ¨derhjelm and Ryde. 59Fig. 3 provides an overview of the steps involved in the generation of the embedding potential, as exemplified by a protein.
In the MFCC approach, the environment is divided into chemically-sound fragments (here, amino acids: A 1 , A 2 ) that are capped with groups built from the neighboring fragments (step 1).The capping groups (c 1 Ã ; c 2 ) serve to saturate the valencies of the broken bonds as well as to mimic the effects from the covalently bonded fragments.Moreover, capping groups from neighboring fragments are merged to form a so-called concap to remove the double counting introduced by the capping.Independent distributed property calculations can now be performed on the individual capped fragments and concaps (step 2).The final set of embedding parameters {P s } for a given site s in the environment are finally obtained by subtracting the distributed properties associated with this site in concaps from those in the capped fragment (step 3).The MFCC method leaves the open question of an optimal choice of fragmentation site and size of capping groups.We recently investigated suitable choices for fragmentation of environments involving proteins, 60 nucleic acids 52 and phospholipids. 61his journal is © the Owner Societies 2016 Phys.Chem.Chem.Phys., 2016, 18, 20234--20250 | 20239 An obvious drawback of this approach is the significant computational cost associated with the preparation of the embedding potential compared to standard QM/MM embedding using parametrized force fields.Indeed, when dealing with large environment fragments and relatively small quantum regions, this step can take more time than the molecular property calculation of the embedded quantum region itself, and this is worsened further by the need to consider dynamic effects due to finite temperature.We have shown, for a set of electric and optical molecular properties, that a combination of geometry-specific embedding parameters close to the quantum region and averaged isotropic embedding parameters further away constitutes a cost-effective embedding potential without compromising the accuracy significantly. 62At present, such averaged isotropic embedding parameters have been derived and validated for a range of common solvents, 62 nucleic acids 52 and phospholipids. 61.2.1 Performance of the embedding potential.The accuracy of the final PE-QM calculation relies to a high degree on the quality of the underlying structure, the quantum-chemical method, the embedding potential and the coupling between the quantum region and the environment.In this section, we will focus on the quality of the embedding potential, the importance of the explicit polarization component and how it affects molecular properties.
Within the limitations of the quantum-classical embedding strategy, the performance of the embedding potential is restricted by the introduction of a multipole representation of the electrostatic potential as well as by the use of the fragmentation procedure for environments involving covalently bonded units.The most direct way of evaluating the quality of the embedding potential is to compare the resulting electrostatic potential (ESP) to a quantum-mechanical reference, since the ESP is the quantity entering the effective Hamiltonian [eqn (11)].Here we will use the LoProp procedure 63 to obtain the distributed multipole moments and polarizabilities that define the embedding potential.Previous analyses, based on LoProp-derived parameters of solvents and amino acids, have shown that the embedding potential is essentially converged with respect to the order of multipole expansion at the quadrupole level. 60,64The typical strategy, also followed in the examples in this work, is therefore to include permanent multipole moments up to quadrupoles and anisotropic dipoledipole polarizabilities, giving the so-called M2P2 potential.All PE-QM calculations presented in this work were performed in a development version of the Dalton program.The PE contributions were handled by the PE library, 40,65 which has been interfaced with Dalton.The LoProp-based embedding potentials were computed using the Molcas program 66,67 or Dalton combined with the Loprop-for-Dalton Python script. 68ll steps in the generation of the potentials were handled using the PE Assistant Script. 40n Fig. 4, the ESP of insulin generated by the M2P2 potential (B3LYP [69][70][71][72] /cc-pVDZ 73,74 ) derived using the MFCC scheme is benchmarked against a full-structure DEC-MP2 75 /cc-pVDZ reference ESP 76 (see Fig. 4b).The ESP difference is probed on a molecular surface defined by twice the van der Waals (vdW) atomic radii, and its distance dependence, given as root-meansquare-deviations (RMSDs), is shown in Fig. 5. From Fig. 4c, it is clearly seen that the fragment-based embedding potential reproduces the quantum-mechanical reference quite well (RMSDs of B5 kJ mol À1 ) except at short distances (as seen in Fig. 5) where the deviation expectedly increases as a result of charge penetration errors and divergence of the multipole expansions. 30In fact, the M2P2 potential completely outperforms the corresponding full-structure B3LYP ESP, which has substantial errors (Fig. 5).As discussed in ref. 76, the reason for these large errors is that B3LYP and other common exchange-correlation functionals suffer from charge-transfer overstabilization due to incomplete cancellation of self-interactions.This issue does not appear in the B3LYP-based embedding potential because the fragmentation procedure does not allow charge transfer between nonneighboring amino acids.The errors can be partly removed by using the long-range corrected CAM-B3LYP functional but the embedding potential still rivals the full-structure CAM-B3LYP ESP even relatively close to insulin.For comparison, we include in Fig. 4d and 5 the ESP derived from the electrostatic component of the AMBER ff94 force field 78 as a representative of an embedding potential that is often used in electrostatic embedding QM/MM calculations.The RMSD at twice the vdW radii increases by a factor of B7 with respect to the M2P2 potential, illustrating that the implicit polarization included in ff94 through enhanced but fixed charges is not flexible enough to yield an accurate ESP.As shown in ref. 79, the inclusion of explicit polarization is particularly important for heterogeneous environments with many charged residues, which is indeed the case for insulin (see Fig. 4a).
We have seen that high-quality potentials can be obtained from fragment-based distributed property calculations and that explicit inclusion of polarization effects is a necessity for an accurate ESP.An important reason for using high-quality embedding potentials in actual calculations is that the size of the quantum region can be reduced without deteriorating the accuracy, as exemplified below.This is particularly important when considering higher-level quantum-chemical methods, such as CC theory, where larger quantum regions cannot be afforded.
To illustrate the importance of a good quality embedding potential on the molecular properties of the quantum region, we consider as an example the convergence of the excitation energy of the lowest pp* transition in acrolein in aqueous solution with respect to the size of the quantum region (total system size defined by radius 12 Å).More specifically, we compare the behavior when the classically-treated water molecules are represented by a polarizable M2P2 potential (based on CAM-B3LYP/6-31+G* [80][81][82][83][84] ) to that obtained using the nonpolarizable TIP3P water potential. 85Fig. 6 shows the convergence obtained using the M2P2 and TIP3P potentials and a center-of-mass-based distance criterion, R QM , for selecting solvent molecules to be included in the quantum region.
The M2P2-based excitation energy shows a faster convergence with respect to the size of the quantum region than the TIP3P counterpart.Without any waters in the quantum region, the M2P2 result is already within 0.08 eV of the converged value, which is an almost 0.2 eV improvement over the TIP3P potential.As shown by the open markers in Fig. 6 at R QM = 2 Å, the inclusion of two water molecules closest to the CQC bond midpoint, and thereby to the p-conjugated system involved in the transition, brings the excitation energy within 0.03 eV of the converged result using the M2P2 potential.This clearly shows that care should be taken when selecting solvent molecules to be included in the quantum region and if chosen wisely a faster convergence can indeed be obtained.It is, however, important to note that this good agreement to some degree relies on fortuitous error cancellation that hides the effects of the intersubsystem quantum interactions still neglected (as seen from the initial convergence behavior of the M2P2 results).We will return to this point in Section 5. Nevertheless, the different performances of the M2P2 and TIP3P potentials can also be attributed to the much more accurate ESP provided by the M2P2 potential.
For electronic transitions the explicit environment polarization not only affects the electronic ground state but also the excitation process, as will be discussed in more detail in Section 3. To distinguish these two polarization contributions, we have in Fig. 6 also included M2P2-based results obtained by neglecting the environmental response upon excitation (labeled PE(GS), see Table 1).In the present case, the account of polarization in the ground state provides a modest improvement for the smallest quantum regions, but it is similar to TIP3P in terms of convergence behavior with respect to the size of the quantum region.This illustrates that the response of the environment to the electronic excitation in the quantum region extends far in this environment (more than 6 Å), and that its inclusion is essential to obtain a realistic description of the excitation process.
In summary of this section, we have seen that classical embedding potentials constitute an efficient yet realistic way of describing the long-range electrostatic effects of an environment provided that the embedding potential is formulated accurately and includes explicit environmental polarization.

Molecular properties
Optical spectroscopy plays an important role in experimental characterization of molecular properties and the electronic structure of molecules.In such experiments the sample is subjected to electromagnetic radiation and the electronic and/ or geometric structure is probed by observing the molecular response to the applied perturbation.7][88] In the standard formulation, the theory gives access-by analogy to exact-state theory-to the determination of transition properties associated with stationary states through pole and residue analyses of resonant-divergent molecular response functions.0][91] This allows addressing resonance spectroscopies without having to resolve individual states, a feature that can be of particular importance in frequency regions with a high density of states.
In this section, we will briefly outline the formulation of PE within a quantum-mechanical response framework, which allows for the calculation of response and transition properties of molecules embedded in large environments.Two new concepts need to be introduced to describe the physical aspects of the interactions between the quantum region, the environment and the external perturbing fields: (i) the non-equilibrium regime due to the existence of different relaxation times for various degrees of freedom and (ii) the local field.
As mentioned in Section 2, the PE model has been developed and implemented for both linear and non-linear response functions at several quantum-mechanical levels, including variational and non-variational theories.Recently, we considered resonant-convergent response theory within the PE formalism.The extension to resonance-convergent linear response theory parallels the PE modifications in the standard formulation and has been presented in ref. 92 and 93 at the timedependent DFT and CC levels, respectively.In the following, we restrict our attention to the coupling of PE to the conventional response theory approach and specifically consider the linear response to a uniform oscillating electric field.For simplicity of interpretation, we work within a configuration interaction (CI) framework, which is parametrized via the state-transfer operators: Ry n ¼ jnih0j and Rn ¼ j0i nj h , where {|ni} is a set of orthonormalized states that spans the orthogonal complement of the reference state |0i.
In the unperturbed ground state (no external field), the quantum region and the environment are fully equilibrated.However, upon applying a time-dependent field, the nuclear frameworks cannot follow the dynamics of the field, and the molecular response to the external field therefore only involves the electronic degrees of freedom.This is characteristic for a vertical electronic (de-)excitation where the electronic degrees of freedom of the environment respond immediately to the electronic transition in the quantum region with no appreciable changes in the intra-or inter-subsystem nuclear configuration.For an isolated molecule, this is equivalent to the standard Franck-Condon approximation and corresponds, in the terminology of embedding models, to the so-called non-equilibrium regime. 94,95The non-equilibrium response can be properly described with polarized embedding models.In the PE model, the electronic polarization of the environment due to the external perturbation is included through the induced dipole moments while the permanent multipole moments and induced dipole moments from the ground-state calculation represent the fixed orientational polarization of the environment.
The second important aspect to be taken into account in the definition of molecular properties of embedded systems is the distinction between the field actually acting on the quantum region-the local field-and the externally applied electric field, which would be experienced by the quantum region in isolation (see ref. 11, 96-100 and references therein).The local field at the embedded molecule differs from the externally applied field due to the interactions involving the quantum region, its environment and the external field: it is the superposition of the external field and the fields generated by the permanent and induced charge distributions of the environment.In the PE model, a Cartesian component of the local electric field probed at location R m within the quantum region is given by 101 sm;ab m s;b ð FÞ; (15)   where F ˜is the time-dependent version of eqn (7) augmented with the external electric field: The local field is the central quantity for deriving response and transition properties of embedded molecules that can be related to those measured in experiments, i.e., defined in terms The environment is allowed to respond to the perturbation-induced changes in the quantum region such that both static and dynamic contributions to the reaction field are included.In a response picture, this effect is incorporated through E [2]d in terms of transition electric fields [eqn (22)].

PE(+EEF)
Full reaction field and effective external field All contributions to the local field [eqn (15)] are taken into account in this model.In particular, the environment polarization is modified directly by the external field resulting in an effective external field.
of the applied external field.As will be discussed further below, the local field can be divided into two contributions: (i) the reaction field originating from the dipole polarization in eqn (15)  induced by the first two terms of eqn (16), and (ii) the effective external field, which is the superposition of the external field and the field from the dipole polarization caused by the external field itself.An additional term to the local field arises in the PE model due to the permanent multipole moments of the environment [second term in eqn (15)].In Section 4.1, the importance of these contributions in describing transition properties of embedded molecules will be illustrated.
The incorporation of the local-field effects in a response framework can be accomplished using the quasi-energy derivative formulation, 86 where the ground-state energy is replaced by the time-averaged quasi-energy.Within this formalism, the molecular response functions are determined by expanding the time-averaged quasi-energy in orders of the periodic per- Response functions can then be identified from the timeaveraged Hellmann-Feynman theorem as derivatives of the time-averaged quasi-energy with respect to suitable external field strengths. 86he time-dependent quasi-energy in the PE framework is obtained by augmenting the quasi-energy of the isolated quantum region with the time-dependent analogue of E PE and E env in eqn (1).The resulting expression becomes where H ˆ0 denotes the vacuum Hamiltonian, and an apostrophe signifies inclusion of interaction terms with the external field.j 0ðtÞi is the phase-isolated wave function built from the unperturbed reference state.The response functions are then defined as derivatives of the time average of eqn ( 17), evaluated at zero field strength.Applying the above procedure, the linear response function for the composite system becomes where As seen from eqn (18), the quasi-energy formulation results in a symmetric form of the response function involving two terms: a env is an explicit contribution from the isolated environment to the linear polarizability of the composite system, while the second term in eqn (18) encompasses the interactions between the embedded quantum region and the environment.This expression for the linear response function is suited for determining transition properties of the embedded system.Note that an alternative formulation of response theory follows from the Ehrenfest approach in which properties are defined from a perturbation expansion of the expectation value of a one-electron operator. 28pplied in the PE context, the latter approach leads to a single asymmetric term which corresponds to the so-called effective linear response property of the embedded quantum region, i.e., defined with respect to the external field. 96The choice of symmetric [eqn (19)] or asymmetric [eqn (20)] response functions corresponds to different divisions of the property of the total system, where the former is appropriate for transition strengths and the latter if focus is on properties of the quantum region.Within a variational framework, the overall form of the linear response function in eqn (19) [or eqn (20)] is preserved as in vacuum but with environmental contributions entering through the electronic Hessian E [2] and property gradient % V o a .The situation is not this simple in a non-variational framework, where additional purely environment-related terms appear (see e.g.ref. 102  for a comparison of PE-QM linear response theory in the two frameworks).The metric matrix S [2] is unchanged and its definition can be found, e.g., in ref. 86.The electronic Hessian can be partitioned into static and dynamic contributions that read as The PE contributions to the electronic Hessian model the effects of the reaction-field contribution to the local field, i.e., the impact of the direct coupling between the quantum region and the environment.The static term reflects the situation in which the environment is not allowed to respond to the changes in the electronic density of the quantum region induced by the applied perturbation.This picture can be viewed as a special case of electrostatic embedding, where the environment is frozen at the polarization induced by the ground state of the quantum region.In standard electrostatic embedding schemes only the term from v ˆes of the embedding operator in eqn ( 11) is retained in E [2](0) .The inclusion of explicit environment polarization introduces the additional E [2]d contribution.This term accounts for the indirect linear response of the environment to the applied field.In other words, it describes the environmental response to the oscillating density of the quantum region (at frequency o) induced by the time-dependent external field, and thus it gives rise to a dynamic contribution to the reaction field.It should be noted that the polarizabilities entering the classical response matrix B in eqn ( 22) are usually approximated with their static limit irrespective of the frequency o of the external field.This is a reasonable approximation for frequencies in the UV/ Vis region that are far from resonances in the environment. 103,104he remaining part of the local field describes the implications of the direct environmental response to the applied external field.The additional environment polarization gives rise to an effective external field in the void of the quantum region that differs from the applied field: 101 Such difference introduces an explicit environment contribution to the property gradient where the first term is the usual unmodified property gradient V o a , and e a is a column vector of length 3S holding the a'th Cartesian unit vector for each site in the environment.This effect is similar in origin to the so-called cavity field effect in polarizable continuum models, 97,100 although the latter is relative to the macroscopic (Maxwell) field inside the dielectric rather than the external field.The modification of the applied field due to its interactions with the environment can conveniently be expressed in terms of effective external field tensors that are defined as the linear response of the effective external field with respect to the externally applied field: 101,105 By recasting the transition electric field in the second term of eqn ( 24) to a dipolar form, the direct influence of the environment can be related to the effective external field tensors as where ¼ ðIÞ signifies that the expression on the right-hand side is correct through first order.Insertion of this into eqn (19) gives an (approximate) expression for the response function in terms of the unmodified property gradient and effective external field tensors.The latter gives the connection between the measurable molecular properties, describing the molecular response to the external field, and those in terms of the effective external field (unmodified property gradients).
To distinguish the significance of the different local-field contributions, we introduce a hierarchy of polarization models based on the extent to which environment polarization is incorporated in the response properties of an embedded molecule.These models are summarized in Table 1.In Section 4.1, we will consider the importance of the different contributions as given by this hierarchy of polarization models.Having discussed the key equations in PE-QM linear response theory, we will continue to consider the transition properties resulting from these expressions and how they compare to those obtained from the alternative state-specific formalism.

Electronic excitations: energies and transition strengths
Within the response framework, excitation energies of electronic transitions in the quantum region are identified as the poles of the linear response function in eqn (19).This involves the solution of a generalized eigenvalue equation for the electronic Hessian with the metric S [2] : where x n represents the n'th excited state with excitation energy ho n = DE n0 .State-specific approaches provide an alternative strategy for determining excitation energies involving explicit optimization of the initial and final states of interest and simple subtraction of the associated energies.For an isolated molecule the two formalisms give the same results in the limit of exact theory or for linearly parameterized approximate (CI-like) wave functions.][108] The difference between the linear-response and state-specific excitation energies in the framework of PE can be illustrated within a first-order perturbation analysis using the eigenvectors {|0 (0) i,|n (0) i} and -energies {E (0) 0 , E (0) n } of E [2](0) in eqn ( 21) as zerothorder basis. 108The linear response and state-specific transition frequencies for an excitation from the initial state |0i to the final state |ni can through first order be written as n À X S s;s 0 ¼1 hn ð0Þ j Fe R s ð Þj0 ð0Þ i T B ss 0 h0 ð0Þ j Fe R s 0 ð Þjn ð0Þ i; X S s;s 0 ¼1 ðhn ð0Þ j Fe R s ð Þjn ð0Þ i À h0 ð0Þ j Fe R s ð Þj0 ð0Þ iÞ T Â B ss 0 ðhn ð0Þ j Fe R s 0 ð Þjn ð0Þ i À h0 ð0Þ j Fe R s 0 ð Þj0 ð0Þ iÞ: The first (zeroth-order) term is equivalent in the two formulations and can be interpreted as the excitation energy associated with an artificial transition with the environment frozen at the polarization of the ground state of the quantum region.Accordingly, the difference between the two approaches concerns only the second term describing the electronic response of the environment to the excitation process, i.e., the dynamic reaction field contribution.In a linear-response formulation, the environmental response reflects the transition electric field of the quantum region rather than the change in the electric field upon excitation as in state-specific models.Recasting these terms into dipolar forms translates into the environment response being induced either by the transition dipole moment of the quantum region or the difference dipole upon the electronic excitation.The behavior of these two terms will therefore depend crucially on the nature of the specific excitation under consideration.In particular, the dynamical contribution in linear response theory is only significant for bright singlet excitations that have non-vanishing transition dipole moments.
The difference between the linear-response and state-specific approaches common to polarized embedding models has been analyzed in ref. 108 on the basis of an exact four-state-model (within the zero-overlap assumption) for two interacting molecules.Within this picture, the dynamical PE contribution retrieved in the linear-response framework has been interpreted as inter-subsystem correlation effects beyond the pure Casimir-Polder dispersion term.On the other hand, the state-specific term is purely electrostatic in nature, taking into account the differential environment response upon excitation and is therefore consistent with the classical description of the environment taken in the PE model. 108rom the residues of the linear response function in eqn (19), we can identify the one-photon transition strengths associated with the electronic transitions in the quantum region.We obtain 101 From this it is clear that the environment affects the transition strengths in two ways: (i) indirectly through modified eigenvectors, as discussed above, and (ii) directly through the property gradients in terms of the effective external field.As seen in eqn ( 26), the impact of the latter can be approximated by means of effective external field tensors.As a consequence, the influence of the effective external field will depend on the orientation of the transition dipole moment for a given excitation, and it may therefore change the shape of the spectrum.The effect becomes even more pronounced when going to higher-order transition properties, which involve longer strings of external effective field tensor components.We will consider an example in Section 4.1.
The strength of combining the PE model with a responsetheoretical framework is a direct generalization to higher-order (transition) properties.In the alternative state-specific approach, one-photon transition strengths can be computed in analogy to eqn (30), but using individually optimized electronic states.However, the need to perform separate calculations of all (relevant) electronic states and account for the fact that their mutual orthogonality is, in general, no longer ensured means that the generalization of the state-specific model to higherorder transition properties is not straightforward.Finally, we recall that the inequivalence between the response and statespecific approaches is not restricted to polarized embedding approaches, but is an issue that already enters for molecules in vacuo when using non-linearly parameterized wave functions.

Importance of local electric field contributions
Having described how local-field effects enter into the definitions of the transition properties of an embedded molecule in the PE-QM response framework, we turn to examine the importance of the various field contributions by comparing to a full quantum-mechanical reference calculation, keeping in mind inherent differences due to, e.g., basis-set incompleteness.As an example, we consider the one-and two-photon absorption (1PA and 2PA, respectively) properties of the lowest pp* transition in a 242-atoms cluster model of the red fluorescent protein DsRed (Fig. 7) obtained using the hierarchy of polarization models introduced in Table 1.
The CAM-B3LYP functional was used for the quantum region, together with the 6-31+G* basis set, and the embedding potential was generated at the B3LYP/6-31+G* level of theory.For more details on the preparation of the system and the computational setup, we refer to the literature. 110All calculations are based on a response treatment.
The transition energy (DE), associated oscillator strength ( f ) and 2PA cross section (s 2PA ) are listed in Table 2 for the different polarization models alongside with the quantum-mechanical reference results.In addition, we include the results for the isolated DsRed chromophore adopting the in-protein geometry.The static reaction field, represented by the incremental contribution when going from vacuum (vac) to a static reaction field (PE(GS)), gives rise to a substantial blue shift of the excitation energy that overshoots the reference shift by 0.15 eV.This is largely remedied by allowing the environment to respond to the excitation process in terms of the dynamic reaction field (PE(ÀEEF)), which brings the PE-TDDFT excitation energy within 0.04 eV of the reference (full QM).The 1PA and 2PA intensities, however, reflect the effective external field at the location of the chromophore rather than the applied external field and are thus not comparable to the full quantum-mechanical reference.Incorporating the effective external field effects (PE(+EEF)), i.e., defining the properties with respect to the external field, leads to absolute intensities that are in good agreement with the full quantum-mechanical counterparts.
The results obtained with the hierarchy of models clearly demonstrate the efficacy of the PE model to simulate environmental effects in systems where the inter-subsystem interactions are dominated by long-range electrostatics.In particular, it provides physical insight into the relative importance of the various contributions to the local electric field, which is not easily accessible from full quantum-mechanical calculations.

In silico mutations in channelrhodopsin
There is great potential in the use of computational models and tools for the rationalization and optimization of the properties of biomolecular systems.The PE model can provide valuable information that can be used to investigate the molecular mechanisms underlying experimental measurements.Furthermore, since we have an atomistic description of the environment, this allows us to manipulate the composition of the environment and to estimate the effects it has on the properties of the quantum region.
Here we present a proof-of-principle example to demonstrate the usefulness of the PE model for in silico mutation studies.Specifically, we investigate the 2PA activity of a channelrhodopsin, which belongs to a subfamily of 7-transmembrane retinylidene proteins that act as light-gated ion channels.The channel-opening is triggered by a light-induced isomerization in the all-transretinylidene chromophore, a vitamin A derivative, which is bound covalently to a lysine residue through a protonated Schiff base (see Fig. 8).Channelrhodopsins are often used as actuators in optogenetics, which is a biological technique that allows in vitro or in vivo optical control of cellular processes. 111heir use in living tissue is complicated, however, because of the relatively high absorption and scattering coefficients in biological tissue.The issues can potentially be overcome by using red-shifted variants, or, alternatively, activating the channel using 2PA techniques.
In Fig. 9, we present 2PA results for a Channelrhodopsin-1 (ChR1) and Channelrhodopsin-2 (ChR2) chimera called C1C2 112 (see Fig. 8).The 2PA transition probabilities were calculated at PE-CAM-B3LYP/6-31+G* level using an embedding potential derived at the B3LYP/6-31+G* level.The structures were taken from ref. 113.By mutating residues to glycine, the effect of each residue on the 2PA properties can be mapped, and this gives valuable information that can be used in a rational design process.For example, the largest effect comes from the two Schiff base counter-ions, D292 and E162, that cause a substantial decrease in the 2PA.It has been shown that a D292A mutant does not produce a photocurrent whereas a E162A mutation only has a moderate effect. 112The E162 residue would therefore be a good candidate for further more elaborate computational mutation studies.

The next generation: polarizable density embedding
We have so far demonstrated the reliability and efficiency of the PE model to study localized properties in large molecular systems.The reliability, however, is only guaranteed if the interactions between the quantum region and its environment are dominated by long-range electrostatics.The next generation of our embedding model, polarizable density embedding (PDE), 29 addresses two potential issues that can arise when the distance between the quantum region and the molecules in its immediate environment is short.
As seen in Fig. 5, the error in the embedding potential increases rapidly at short distances.This is mainly because of an incomplete description of the short-range electrostatics partly due to the use of point multipoles, i.e., charge-penetration effects are missing.Therefore, to improve the short-range electrostatics in the PDE model, we use the explicit electron density of the fragments in the environment, instead of a multipole representation of the permanent charge distribution.This increases the computational cost compared to the PE model because the interfragment two-electron Coulomb integrals between the a Vacuum results are taken from ref. 110 and correspond to the conformation adopted by the chromophore inside the protein.
Fig. 9 The gain or loss of 2PA in the C1C2 chimera as a result of point mutations to glycine.Calculated as the relative difference between 2PA transition probabilities.The distance to the chromophore is increasing from left to right.
This journal is © the Owner Societies 2016 quantum region and the permanent charge distribution of each fragment in the environment are reintroduced.Compared to other QM/QM-embedding methods, PDE is much more efficient because the classical description of environment polarization from the PE model is retained, thus avoiding the costly iterations that require multiple calculations of the electron density of each fragment in the environment.PDE is designed as a focused model in contrast to e.g.FMO which aims for a description of the total molecular system at a consistent level of quantum mechanics through the use of many-body expansions. 25Since the distributed multipoles in the PE model describe the long-range electrostatics quite well, a three-layer model may be introduced where the environment is split into two regions: an inner region containing fragment densities, and an outer region consisting of distributed multipoles.Correspondingly, we introduce a splitting of the electrostatic part of the embedding operator where v ˆmul is defined as in eqn (4) but only includes fragments that are outside a defined threshold distance from the quantum region.The v ˆfd operator describes the electrostatic interactions between the quantum region and the fragment densities in the inner region, and is thus defined as Here, the summation runs over N fd fragments, D rs is an element of the density matrix belonging to the f'th fragment, and Z m is the charge of the m'th nucleus in the f'th fragment.The one-electron integrals in eqn (32) are given by and the interfragment two-electron integrals by where molecular orbital indices r and s belong to fragments in the inner region as indicated in eqn (32).
The other potential issue that can arise in the PE model is artificial stabilization of the electron density at classical sites in the environment -the so-called electron-spill effect.It is related to the lack of non-electrostatic repulsion, i.e., exchange-repulsion or Pauli repulsion, which introduces substantial errors at short distances when there is significant wave-function overlap. 114,115n the PDE model, this issue is remedied by introducing a projection operator, based on the work by Huzinaga and Cantu, 116 that models wave-function orthogonality between the quantum region and the fragments in its environment.This non-electrostatic repulsion operator is defined as where e i is the energy of the i'th occupied orbital of the a'th fragment in the environment, and S pi and S iq are overlap integrals between molecular orbitals in the quantum region and molecular orbitals belonging to fragments in the environment.As seen from eqn (35), the operator associates an energy penalty that scales with the square of the overlap and is weighted by the energy of the occupied molecular orbitals belonging to the fragments in the environment.This effectively prevents the electrons in the quantum region from penetrating too far into the environment.However, since we use the orbitals calculated for each fragment in isolation, the repulsive effect can be too strong, as shown in ref. 29.It may therefore be necessary to introduce a scaling of the non-electrostatic repulsion.In the preliminary analysis in ref. 29, we found that a simple linear scaling gave good results.Further investigations along this line are currently in progress.

Concluding remarks and perspectives
In this perspective, we have given an overview of the status as well as recent developments of the polarizable embedding method.
The high-quality embedding potential is a key ingredient for an accurate description.The examples in this perspective illustrate that the electrostatic component of the embedding potential is well-described by the combination of a fragmentation-based approach with quantum-mechanical distributed property calculations as used in the PE and PDE models.However, there remain some directions for further improvements.First of all the environment is assumed to be linearly responsive.Yet, strong electric fields are not unusual in e.g.proteins, [117][118][119][120][121][122][123] which can be expected to require a treatment of non-linear environmental polarization.The generalization of the LoProp scheme to first hyperpolarizabilities was recently presented in the context of a second-order Applequist model. 124However, it has not been explored in the context of embedding calculations and the second hyperpolarizability is naturally expected to be of greater importance in isotropic systems.Furthermore, although the use of distributed properties to some extent account for anisotropies in the induced charge distribution, improvements can be made by including higher-order terms in the multipole expansion. 125,126ew computational methodologies for the calculation of distributed properties are currently being explored and initial investigations of distributed multipoles are promising. 39Current efforts are focused on the extension to distributed any-order multipole polarizabilities and hyperpolarizabilities, as well as the inclusion of these in the PE model.
At present, the PE and PDE models do not include dispersion in the embedding operator but this component might become very important especially when studying apolar species.The simplest approach would be to add the leading-order Casimir-Polder like term as an energy-correction as done in e.g. the EFP model. 127owever, such an approach does not lead to an effective dispersion operator entering into the expression for the embedding potential.Ongoing work addresses the construction of such effective methods for inclusion of dispersion directly into the embedding potential for a discrete environment.Several new features are in progress to broaden the use of the PE and PDE models.One of the strengths of the PE model is its formulation within a quantum-mechanical response framework.The PE model has recently been combined with an openended response theory framework 128,129 at the HF and KS-DFT levels.The details of the implementation together with pilot calculations will be published shortly.This will enable the calculation of electric response properties to any order including also multiphoton transition strengths via the recent single residue functionality. 130In addition, efforts are presently focused on an extension of the capabilities of our previous molecular gradient implementation 131 to any-order geometric and mixed electricgeometric derivatives.This will open the door to the calculation of vibrational spectroscopies and vibrational contributions to molecular properties as well as the optimization of excited electronic states.Based on such novel developments, we expect that the PE and PDE models will play an important role in future advanced spectroscopies and materials design.
The main concern of this work has been on molecular systems consisting of a single chromophore embedded in a complex environment.However, in many cases molecular systems contain a number of chromophores and it is thereby necessary to consider collective excitation and energy-transfer processes, i.e., the excitations become non-local.In such cases it is important to include environment effects directly in the description of the excitation energy and transfer processes.The environment not only tunes the excitation energies of the specific chromophores, it also contributes both directly and indirectly to the calculation of the electronic couplings. 15,132he PE model has recently been extended to multichromophoric systems paying special attention to the role played by the environment. 133In the future, the PE model as well as extensions hereof especially in the direction of energy transfer processes for multiconfigurational descriptions of molecular systems will be used to predict and rationalize energy-transfer mechanisms in complex biological systems.
Finally, an important point that has not been touched upon in this work is that of configurational sampling and finite temperature effects, which is crucial when relying on an atomistic representation of the environment.Typically, we have addressed such sampling based on a sequential approach in which the sampling and embedding calculations are treated in an uncoupled manner. 134Usually the sampling is based on classical MD simulations and the quality of both the sampling and the produced molecular structures thus relies on the quality of the underlying force field.In the future, an interesting approach will be to sample the structures at the same level as the embedding, i.e., use the embedding potential directly in the classical MD, or as an intermediate alternative to use the previously mentioned PE-QM molecular gradient implementation to optimize the geometries of snapshots extracted from a classical MD run using a conventional force field.The latter approach will, however, remove finite temperature effects from the central part and we are therefore investigating the possibility to use ab initio MD to fully address the issue of configurational sampling and finite temperature effects.

Fig. 1
Fig. 1 Schematic representation of different classes of embedding models and some selected features.By sampling is meant configurational sampling and by 2 Â SCF is meant double SCF, i.e., a dual SCF solution for the quantum region and environment polarization.

Fig. 2
Fig.2Schematic representation of the approximations made in going from a full quantum mechanical description of a composite system, exemplified by an acetophenone-dimethylsulfoxide system, to the differentiated treatment defining the PE model, where the solute is the quantum region and the solvent is the environment.See description in the text.

Fig. 4
Fig.4(a) Structure of an insulin monomer (PDB ID: 1MSO77 ) with five cationic residues (blue) and six anionic residues (red).(b) The ESP of insulin computed at the DEC-MP2/cc-pVDZ level of theory.The difference between the ESP of insulin generated by (c) the M2P2 potential (B3LYP/ cc-pVDZ) based on the fragmentation procedure described in Fig.3 and (d) the embedding potential based on the electrostatic component of the AMBER ff94 force field, both relative to the MP2 reference.The surfaces are defined at twice the vdW atomic radii and the error is reported as interaction energies with an elementary charge.Results from ref.60.

Fig. 5
Fig. 5 RMSD of the ESP at different distances relative to the DEC-MP2/ cc-pVDZ reference.The vertical dashed line indicates the vdW radius factor used for the ESPs in Fig. 4. The RMSDs are calculated at a molecular surface defined by spheres of the vdW atomic radii times a factor.Results from ref. 60.

Fig. 6
Fig. 6 Convergence of the excitation energy of the lowest pp* transition in acrolein in aqueous solution with respect to the size of the quantum region (defined by R QM ) for different embedding potentials.Open markers at R QM = 2 Å indicate results obtained by including the two water molecules closest to the CQC bond midpoint.The results are for an arbitrary MD configuration.Calculations were performed at the CAM-B3LYP/6-31+G* level of theory.

Table 1
Hierarchy of models for inclusion of polarization in response-theoretical calculations in the PE framework

Table 2
Excitation energies DE (eV), oscillator strengths f, and 2PA cross section s 2PA (GM) of the lowest pp* transition in a DsRed cluster model computed using the various polarization models defined in Table1.Results from ref.101 This journal is © the Owner Societies 2016 Phys.Chem.Chem.Phys., 2016, 18, 20234--20250 | 20247