Wavelet formulation of the polarizable continuum model . II . Use of piecewise bilinear boundary elements †

The simplicity of dielectric continuum models has made them a standard tool in almost any Quantum Chemistry (QC) package. Despite being intuitive from a physical point of view, the actual electrostatic problem at the cavity boundary is challenging: the underlying boundary integral equations depend on singular, long-range operators. The parametrization of the cavity boundary should be molecular-shaped, smooth and differentiable. Even the most advanced implementations, based on the integral equation formulation (IEF) of the polarizable continuum model (PCM), generally lead to working equations which do not guarantee convergence to the exact solution and/or might become numerically unstable in the limit of large refinement of the molecular cavity (small tesserae). This is because they generally make use of a surface parametrization with cusps (interlocking spheres) and employ collocation methods for the discretization (point charges). Wavelets on a smooth cavity are an attractive alternative to consider: for the operators involved, they lead to highly sparse matrices and precise error control. Moreover, by making use of a bilinear basis for the representation of operators and functions on the cavity boundary, all equations can be differentiated to enable the computation of geometrical derivatives. In this contribution, we present our implementation of the IEFPCM with bilinear wavelets on a smooth cavity boundary. The implementation has been carried out in our module PCMSolver and interfaced with LSDalton, demonstrating the accuracy of the method both for the electrostatic solvation energy and for linear response properties. In addition, the implementation in a module makes our framework readily available to any QC software with minimal effort.


Introduction
One of the grand challenges of quantum chemistry is the ability to describe molecular behavior in complex realistic environments, far from the ideal picture of an isolated molecule: the overall system is far too large to allow for a full quantum chemical treatment, but the inclusion of the environment is unavoidable to achieve a realistic picture of the molecular processes under investigation. Overcoming such a challenge will most likely never be fully accomplished but several strategies are being pursued in that direction. One important consideration about chemical processes which guides such a development is their localized nature: only a small fraction of the system must be modelled with quantum chemistry, whereas the remainder is required only to provide a realistic environmental effect. This consideration is the basis for the so-called focused models. Within focused models, two strategies are the most widespread: on the one hand are models which retain the atomistic description of the environment, such as molecular mechanics (MM) or polarizable embedding (PE); 1-3 on the other hand are the dielectric continuum (DC) models 4,5 where the environment is described as a structureless medium with well defined properties (dielectric permittivity and refractive index are among such properties).
Both strategies date back to the 70's with the pioneering work of Rivail et al. 6 on the one hand and Warshel et al. 7 on the other. They have since known a steady development and most quantum chemistry softwares feature at least one of them among their methods. MM and PE models are appealing because they retain the atomistic description of the system, but on the other hand their parametrization and application require dedicated skills, preventing their inclusion in any black-box approach. Continuum models are by nature more approximate, disregarding the atomistic structure of the environment altogether, but they are more suited for a black-box approach, requiring only a handful of parameters, defining the molecular cavity and the properties of the continuum. The simplicity of the model is attractive and has spurred several developments in order to make the model more accurate, keeping at the same time the underlying simplicity in the parametrization. One milestone in such a development has been the introduction of a molecular-shaped cavity instead of a simpler but less accurate analytic form (sphere or ellipsoid), pioneered by Miertuš et al., 8 with the PCM. PCM has made it possible to employ continuum models for molecules of arbitrary shape. The development of efficient and accurate methods to represent a molecular cavity has since become a research topic in its own right. 9 Most algorithms construct the cavity surface from interlocking atom-centered spheres, although a formulation based on molecular isodensity surfaces has been proposed. 10 When atomcentered spheres are used, the radii are chosen among suitable sets, for example, the Bondi set. 11,12 The radii are most of the times scaled by a factor greater than one, to account for the solvent-accessible surface (SAS). The most appropriate description of the molecular surface is however given by the solvent-excluded surface (SES), that is the surface described by the contact point of a tiny probe sphere, mimicking a solvent molecule, rolling over the SAS. An algorithm for the generation of the SES was first devised by Connolly 13,14 in the context of the modelling of the molecular surface and volume of large proteins and nucleic acids. Unlike the SAS, the Connolly surface features toroidal and concave spherical sections for which it is difficult to formulate a numerically stable and efficient discretization procedure into small patches. Most of the currently used algorithms thus approximate the Connolly surface by employing only convex spherical sections. This is done for example in GEPOL with the addition of extra, not atom-centered spheres. [15][16][17] Such an algorithm can be symmetryadapted. 18,19 Another strategy to generate the solvent-excluded surface (SES) has been provided by the DEFPOL algorithm. 20,21 Not until recently the exact Connolly parametrization of the surface has been used in quantum chemistry. 22 In addition to a proper description of the molecular cavity, it is necessary to provide a corresponding description of the solute-solvent interactions. The lack of atomic structure on the solvent side is here challenging. In practice, the solutesolvent interaction is separated in different contributions, which are connected to the underlying intermolecular interactions: electrostatics, polarization, dispersion, repulsion, exchange, and cavitation are the contributions which are generally considered. 23 The first two are classical electromagnetic interactions, the last one is the energy involved in the creation of the molecular cavity, and the remaining three are contributions stemming from quantistic interactions between the solute and the solvent. [24][25][26] Due to a fortuitous cancellation, the non-electromagnetic interactions (all besides the first two) often have a much smaller impact on the total solvation energy than the electromagnetic ones (electrostatics and polarization). Therefore, a large body of work has focused on electrostatics and polarization. This amounts to the solution of the Poisson equation in the presence of a dielectric medium. 27 The problem can then be recast as a boundary problem at the cavity surface: the boundary integral equations arising from the Poisson problem are discretized and the resulting system of linear equations can be solved by making use of an appropriate linear algebra technique. 28 The most critical point of such a procedure is the discretization method that is employed, because its choice affects the final solution in terms of accuracy, stability and efficiency. 29 The original procedure, which is still employed in most implementations, is a simple collocation method: the cavity is discretized in elements called tesserae and functions on the cavity surface are represented by their values on the collocation points, selected as the tesserae centroids. The representation of integral operators is straightforward, except for the diagonal elements where special care must be taken by either using some form of analytic integration or a numerical quadrature. 9 Increased accuracy in the electrostatic energy can be achieved by a careful selection of the procedure by which the matrix representation of the integral operators is obtained. Purisima et al. 30 showed that the diagonal matrix entries play a crucial role in this respect. More recently, Bardhan et al. have thoroughly investigated the discretization procedure and the solution method for the resulting linear system. [31][32][33] In particular, a simple interchange of the integration order in the centroid collocation quadrature formulas was found to lead to substantial increases in accuracy. This method was first proposed by Tausch et al. 34 and is termed qualocation. Very recently an altogether different approach has been proposed by Lipparini et al. 35 A domain decomposition method was used to achieve a formally exact solution for a cavity made of interlocking spheres within the COSMO-PCM approximation. 36 However, in order to guarantee accuracy of the numerical solution and to provide the necessary stability, for the general problem (IEFPCM for an arbitrary cavity), a Galerkin approach shall be employed. 37 In a previous work, we presented the first wavelet-based implementation of the IEFPCM, which is making use of piecewise constant wavelets as basis functions. 38 In this work, we have extended the approach to the use of a piecewise bilinear wavelet basis. The additional flexibility provided by piecewise bilinear functions has two main advantages: on the one hand, the convergence towards the limiting result is much faster; on the other hand, it allows to compute the shape gradient. 39 The first point makes the approach more efficient, for any given target precision, whereas the second point will allow to compute the solvent contribution to the molecular gradient, which is required both to optimize molecular geometries and to compute molecular properties requiring geometrical derivatives, such as ROA, CARS and SFG.

IEFPCM
When describing solvent effects by a continuum model, the solvent degrees of freedom are replaced by a structureless continuum characterized by the dielectric permittivity of the bulk solvent. The solute is then placed in a cavity inside this continuum. The mutual polarization between the solute charge density r and the infinite continuum dielectric is taken into account in a classical fashion, by solving the Poisson equation for the electrostatic potential u(r) with the appropriate boundary conditions: Here C C R 3 is the cavity with boundary qC. The last two equations above represent the boundary conditions. The former is the usual requirement of continuity for the electrostatic potential at the boundary, while the latter is the jump condition on the normal derivative of the potential. 27 The subscripts + and À refer to the direction, relative to the cavity boundary, in which the limits are taken: from the outside or the inside, respectively. SI-based atomic units have been used and will be used throughout the text. The solution of the Poisson problem can be achieved by its reformulation in terms of a boundary integral equation. 40 The apparent surface charge (ASC) s(s) is introduced to represent the reaction potential: where we have implicitly defined the Newton potential N r and the solvent reaction potential x as the first and second integral, respectively. Notice that the polarization in the continuum is now represented by a surface charge, a scalar function of the surface coordinate s. As shown by Cancés and Mennucci, 28 the ASC is the unique solution to the following integral equation where the integral operators are the components of the Calderón projector: Here, the subscript * A {i,e} differentiates between the internal and external Calderón projector. As apparent, knowledge of the Green's function for the differential operators is necessary in setting up the proper integral operators. Despite the fact that the Poisson problem has been formulated for a uniform, homogeneous dielectric, the boundary integral equation approach is rather general and can be exploited on a more vast class of physical problems, such as ionic liquids, liquid crystals 28 and dielectric interfaces. 41 For a uniform, homogeneous dielectric the Green's functions are given as and the boundary integral eqn (4) is simplified to where only the single layer S i and double layer D i operators are involved. Since S i is a symmetric operator, 40 the solution of eqn (6) could be achieved by use of the conjugate gradient (CG) method, 42 whilst the right hand side is obtained by applying a generalized minimal residual (GMRES) method. 43 In a more general setting one can apply the GMRES method directly to eqn (3).

Wavelet IEFPCM
Boundary integral equations, such as the IEFPCM eqn (6), can conveniently be solved numerically by the application of the boundary element method (BEM). Both the integral operators and the functions on which these act are defined solely on the boundary of the cavity. The application of the boundary element method requires that the boundary of the molecular cavity is discretized by a number of suitable finite elements. The discretization of the boundary leads to a discretization of the integral operators. This discretization can be carried out by using either the collocation or the Galerkin approach. 29 In both cases, the integral equation is transformed to a system of linear equations whose dimension is related to the number of finite elements used in the discretization of the boundary. The resulting system matrix is, in general, a dense matrix. The boundary element method thus suffers from limitations imposed by the number of matrix elements to be stored and the memory and time requirements of solving the resulting linear system.
The use of a wavelet basis in the Galerkin approach has been proven beneficial in this respect. [44][45][46] The resulting system matrices are quasi-sparse and can be further reduced to a sparse form by discarding negligible entries without considerable loss of precision.
The wavelet boundary element method starts by defining a sequence of hierarchical trial spaces, spanned by standard finite element ansatz functions: Here, D j is an index set for the single-scale basis of the space V j . In the wavelet method, the trial space V j is split into the direct sum V j = V jÀ1 " W j . The resulting complementary space W j is called the wavelet space and is not necessarily orthogonal to V jÀ1 . Recursive splitting of the trial spaces leads to the wavelet decomposition V j = " j l=0 W l . The complementary space is spanned by the wavelet basis: The choice of this basis turns out to be very convenient, since we can exploit the compression technique described in ref. 45 to build up the sparse system matrix in the wavelet basis directly, avoiding the computation of non-relevant matrix entries. The a priori compression is carried out in two steps. In the first step, contributions from basis functions sufficiently far away from each other are discarded. In the following, elements of the matrix where one wavelet is in the smooth part of the other one are ignored. The number of relevant matrix entries scales proportional to O(N J ) with N J the number of degrees of freedom for a refinement of the geometry up to level J.
In order to understand the compression steps, let us introduce some notation. Let the convex hull of the support of the wavelet c j,k be: Moreover, let the singular support of the wavelet c j,k be: It contains all the points in the support where the wavelet is not smooth. The compression steps are then governed by the following set of equations where dist (Á,Á) denotes the distance, either between the bounding boxes of the wavelets or between the singular support and the bounding box. The first and second conditions represent the first and second compression step, respectively. The parameters j, j 0 are the levels of the wavelets under consideration and the level-dependent cut-off parameters B j, j 0 and B 0 j; j 0 are given by In the particular implementation, d = 3 for the piecewise constant wavelet basis, while d = 4 for the piecewise bilinear wavelet basis. 47 The compression can thus be adjusted by the parameters a and d 0 : where d is the approximation order of the trial spaces V j : d = 1, 2 for piecewise constant and piecewise bilinear ansatz functions, respectively. Note that, if the interaction between two wavelets, c j,k and c j 0 ,k 0 , on level j and j 0 is neglected in the system matrix, all other interactions between wavelets resulting from the refinement of c j,k and c j 0 ,k 0 can also be ignored. Thus, the compression pattern of the system matrix is calculated hierarchically starting from the coarsest level. Once the compressed system matrix is assembled, we arrive at a sparse system matrix in the wavelet basis which can be compressed further by leaving out sufficiently small elements. This post-processing step is governed by the rule where the coefficients e j,j 0 are given by with the a posteriori compression parameter b o 1.

The quantum mechanical problem
Modelling the solvent as a classical continuum requires that the quantum mechanical Hamiltonian be modified. The mutual polarization of the quantum mechanical molecular charge density distribution and the dielectric continuum can be accounted for by introducing a suitable operator in the Hamiltonian: The PCM operator V sr [r] depends linearly on the solute charge density, thus introducing a scalar nonlinearity into the quantum mechanical problem. It can be shown that variational minimization of the functional leads to the ground state of the nonlinear Hamiltonian. 48 The physical quantity associated with the functional is a free energy, as it also takes into account the irreversible work spent in the process of polarizing the solvent. The derivation of the quantum mechanical equations is beyond the scope of this work and the reader is thus referred to the existing literature. 4,5 The expressions reported are in the molecular orbital (MO) basis. The usual notation for the indices is adopted: i, j, k,. . . are used for occupied orbitals and a, b, c,. . . for virtual orbitals, while p, q, r,. . . are reserved for general orbitals. It is here necessary to remark that the use of a different strategy for the solution of the integral equation arising from the classical electrostatic problem does not affect the derivations. In the self-consistent field approximation (SCF) for the electronic wave function (the Born-Oppenheimer approximation is assumed), the Fock matrix has the form where the usual vacuum terms are augmented by a solvent term: The notation used here is general. We avoid making any reference to the discretization scheme for the integral equation, thus keeping the derivations transparent with respect to the choices made in the solution of the classical problem. The vacuum-like term is given as (20) and encompasses also the case of Kohn-Sham DFT with possibly hybrid functionals. The reader is referred to the existing literature for the explicit form of the above terms. 49 The integrals j pq (s) are called charge-attraction integrals: In our notation, the polarization energy contribution can be rewritten as: The ASC and MEP have here been separated into their electronicinduced -e-and nuclear-induced -N-components. The electronic and nuclear electrostatic potential are expressed as where D pq is the density matrix and Z A , R A are the charge and position of nucleus A, respectively. Turning our attention to the formulation of the linear response function, introduction of the coupling with the continuum leads to response equations of the usual form: [E [2] À oS [2] ]X B = ÀE [1] B .
Limiting ourselves to electric properties, only the electronic Hessian has additional contributions from the continuum solvent. The matrices A and B are now defined as: A ai,bj = d ij f ab À d ab f ji + g aijb À gg abji + w xc;ai,jb + (s ai ,j jb ) qC , B ai,bj = g aibj À gg ajbi + w xc;ai,bj + (s ai ,j bj ) qC .
Since explicit formation and inversion of the electronic Hessian is too costly, the solution of the response equations is achieved by means of subspace iteration methods. 50 The solution vector is expanded in terms of n trial vectors chosen in a proper subspace. The reduced response equations are solved iteratively by repeated calculation of the r vector E [2] X B , i.e. the linear transformation of the given subspace by the electronic Hessian which assumes the form of a generalized Fock matrix. 51 From eqn (26), one can see that the solvent contributions are now included implicitly, via the unperturbed Fock matrix term, and explicitly, via the last term. When a nonequilibrium response formalism for the PCM is adopted, formation of the explicit term in eqn (26) requires the use of the dynamic apparent surface charge: the optical permittivity e N is used in the PCM matrix, instead of the static one e 0 . 52

Implementation
As apparent from Section 2.3, the solution of the PCM problem is independent of the particular strategy employed to tackle the quantum mechanical problem. The PCM functionality can be then abstracted into a module, fully agnostic of the details of the quantum mechanical problem at hand. Our current implementation of the PCM makes use of a recently developed application programming interface (API) called PCMSolver. 53 The API implements all the functionality needed to set up a PCM calculation: cavity, Green's functions and solver. The implementation is completely independent of the details in the quantum mechanical host program. This is in line with the idea of a modular programming paradigm, described already in the early 70's by Dijkstra 54 and Parnas. 55 The low coupling between the host QM code and the API effectively allows to quickly introduce a PCM implementation into codes that are intrinsically different in their internal structure. For this paper, we introduced the PCM functionality into the LSDalton program, 56 in much the same way as described in ref. 57 for the DIRAC program. 58 Fig. 1 shows a schematic view of the internal structure of the API. PCMSolver is developed in C++, but legacy C and Fortran codes coexist within the main object-oriented infrastructure. Thanks to the polymorphism, available as a language mechanism, the API is modular in itself: the cavity, Green's functions and solver are independent of each other. Coupling between the three is achieved by means of abstract interfaces. [59][60][61] The module has some external dependencies. The Boost C++ libraries 62 are used for a number of tasks, unit testing and metaprogramming, among others. Boost libraries are a highly reliable framework: many of their functions are reference implementations.

View Article Online
The Eigen linear algebra library 63 is extensively used for the manipulation of vectors and matrices. The GetKw library provides the input parsing facility. 64 Finally, the header-only Taylor library is used to implement automatic differentiation (AD) of the Green's function objects. 65 Notice that AD relies on template programming, i.e. static polymorphism, which is coupled to the dynamic polymorphism, implemented by inheritance, through the metaprogramming algorithm of Langr et al. 66 Despite the internal complexity of the API, only a handful of functions (around 10) are exposed to the QM code programmer. Finally, let us remark that the API is publicly available as an open-source project, licensed under the terms of the GNU Lesser General Public License (LGPL).

The Wavelet Solver
The wavelet solver can be used to tackle any problem where the Green's function and the surface are known. The parameters that play a role in the solution are: the number J of levels of refinement given by the cavity generator, the a priori and a posteriori compression parameters a, b, d 0 , and the number d of vanishing moments of the wavelets.
The wavelet solver flow chart is depicted in Fig. 2. We start by building an interpolation structure of the points on the quadrangular mesh as computed by the cavity generator for the selected molecular surface, cf. Fig. 5. The interpolation of these points will then be used to calculate the quadrature points needed in the integration and the computation of the normal derivatives.
The interpolation class is based on a tensorized Newton interpolation which assumes that each patch is refined uniformly. The number of polynomials used in the interpolation is determined by the degree of the polynomials and the level of refinement of each patch. It is assumed that the patch can be divided in disjunct polynomials, yielding a relation between the refinement levels and the degree of the polynomials, 2 J mod grade = 0. A simple picture showing the situation in case of J = 2 and grade = 2 for one patch is found in Fig. 3. Having determined the coefficients of the Newton polynomial, we can then easily compute the derivatives with respect to x, y or the normal derivative by the Horner scheme as described in ref. 42.
After constructing the element list and the wavelet list, the element-wise computation of the system matrix is carried out. The elements of the matrix computed are determined by the compression rule described in eqn (11). The integration is done by using tensor product Gaussian quadrature rules and the Green's function definition in PCMSolver. The last step in the computation of the sparse system matrix is the application of the a posteriori compression described in eqn (14). Having computed the system matrix, the right hand side is assembled according to the boundary integral eqn (6). To that end, the Generalized Minimal Residual (GMRES) solver 43 is used with a possible restart after 100 iterations. Finally, the linear system of equations is solved by employing a CG solver. 67 The precision used for the solver is e = 10 À6 .
The class diagram of the wavelet solver can be seen in Fig. 4. It is based on the implementation of an abstract class, the GenericAnsatzFunction class, which implements only common aspects of the code, for example the construction of the element list and the a priori and a posteriori compression. The derived classes ConAnsatzFunction and LinAnsatzFunction initialize the implementation specific constants, that are a, b, d, and d 0 , for the piecewise constant discretization and the piecewise bilinear discretization. Furthermore the specific functions, for example the integration functions, are found in the derived classes as well.

Applications
The current implementation of the IEFPCM wavelet code within PCMSolver has been interfaced with a development version of LSDalton. The wavelet code reimplements the piecewise constant discretization presented in 38 together with the piecewise bilinear discretization, 44 as presented in Section 3.
To keep consistency with Weijo et al., we used benzene as our test molecule. All calculations have been carried out at the Hartree-Fock level of theory. Two different Gaussian basis sets were employed: 6-31G and 6-311++G**, the latter to analyze how the wavelet solver performs when a more realistic description of the electronic charge distribution is sought.
The PCM calculations employ water as solvent (e = 78.39) and the solvent-accessible surface (SAS). The radii used to generate the SAS are the ones reported by Bondi, 11 unscaled: 1.70 Å for carbon and 1.20 Å for hydrogen.
All LSDalton calculations employed the augmented Roothaan-Hall algorithm in combination with the ATOMS starting guess 68 for density optimizations, and the linear-response solver of Coriani et al. 51 with the atomic orbital (AO) basis preconditioner. For efficient integral evaluation, LSDalton combines J-engine [69][70][71] acceleration for the Coulomb and LinK 72 for the exchange contributions.
Being the first such implementation, we have devoted our attention to the following aspects: testing the intrinsic accuracy of the bilinear wavelet solver with respect to the choice of the compression parameters (a and d 0 ); comparing to our previous piecewise constant implementation; assessing the overall performance of the method. In addition, we have for the first time calculated static electric dipole polarizabilities with a wavelet based PCM implementation.
All calculations were carried out on a single Intel E5-2670 processor, compiled with the Intel compiler suite version 14.0.2 in combination with OpenMPI version 1.6.5. For the standalone version used for timing and convergence results found in the Section 4.3 the GNU g++ compiler version 4.6.3 was employed.

Accuracy and compression parameters
The accuracy and memory requirements of the wavelet solver depend on the chosen compression parameters. It is thus important to determine the best set of compression parameters triplet a, d 0 and b that, for fixed patch level (PL), limits the memory requirements, while retaining the highest accuracy. We will first look at the behaviour of the wavelet Galerkin BEM for the Laplace equation on 6 patches, in order to explore the impact of the compression on the sparsity pattern and draw some conclusions on the relative importance of the various compression steps. A more thorough assessment of the accuracy will then be given based on quantum mechanical calculations of benzene. Finally, the convergence behavior with increasing PL will be discussed for the C 32 H 66 polyalkane system. Table 1 contains a summary of the number of non-zero elements retained in the system matrix for different choices of the a priori compression parameters. The impact of the first and second a priori compressions is also summarized. In all cases, the first a priori compression already discards most of the negligible entries (80% on average), achieving the desired sparsity in the system matrix. The combination with the second a priori compression discards additional negligible entries (8% on average) but does not significantly affect the sparsity pattern.  This is reflected in Fig. 6 and 7, which show the patterns obtained by applying only the first or the complete a priori compression, respectively. Table 2 shows the number of non-zero elements for different values of the a posteriori parameters b and d 0 . The main conclusion is that choice of the parameters for the a posteriori compression is not as critical as for the a priori compression.
The comparison of the sparsity pattern for the piecewise constant and piecewise bilinear wavelets in Fig. 8 shows that indeed sparsity and linear memory requirements are a general feature of the wavelet Galerkin scheme, although the piecewise bilinear wavelets features a larger number of non-zero matrix entries.
The effect of the matrix compression on the accuracy was evaluated by performing quantum mechanical calculations on benzene. The calculations were repeated at PL-2 and PL-3, varying the compression parameters. The nuclear and electronic components of the polarization energy U NN and U ee together with the static isotropic polarizability a iso were considered.
Given that the a posteriori compression has a rather limited influence on the memory requirements (see Table 2 The most important conclusion from Tables 3 and 5 is that a higher accuracy can always be achieved by a further refinement of the Galerkin discretization of the cavity surface. In passing from PL-2 to PL-3, there is a significant gain in accuracy, regardless of the compression parameter triplet chosen. This is a general feature of Galerkin BEM schemes: mesh refinement ensures convergence to the exact solution. A similar conclusion was reached by Weijo et al. 38 in their analysis of the wavelet solver based on piecewise constant wavelets. The calculations reported there were repeated in this study, but with our wider parameter set. Results using piecewise constant wavelets are though only reported for the calculations on a iso . See ESI, † for the complete set of results.
A comparison of the results obtained with the different a posteriori compression parameters shows that the more   conservative choice b = 0.001 leads to a PL-2 reference result deviating by at most 0.001% from the PL-3 values for both energies and isotropic polarizabilities. It is however to be noted that the relative accuracy, i.e. the absolute difference with respect to the reference value at the same patch level and choice of b, is worsened for the polarization energy components analyzed here. From the results in Tables 1 and 2, it is easy to see that the number of matrix elements that are discarded by the a posteriori compression is largely inferior to the compression that is achieved a priori. Based on the relative accuracy results in Table 3, we can however advocate for the use of the less conservative setting b = 0.01. This choice is further justified by the relative accuracy results for the isotropic polarizability, reported in Table 4 for piecewise constant wavelets and in Table 5 for piecewise bilinear wavelets.
Comparing the results for a iso obtained with the two different discretizations (piecewise constant wavelets in Table 4 and piecewise bilinear wavelets in Table 5) it is evident that a higher accuracy can be achieved already at PL-2 by use of the latter   wavelet basis. However, the high accuracy comes at the cost of roughly twice as many points where the electrostatic potential has to be evaluated: 10 240 vs. 23 040 at PL-2, 40 960 vs. 92 160 at PL-3. As a result of the accuracy analysis, we conclude that a sensible choice of default compression parameters is a = 2.5, d 0 = 3.0, b = 0.01 for piecewise bilinear wavelets and a = 2.5, d 0 = 2.0, b = 0.01 for piecewise constant wavelets. This choice has been adopted in the rest of the present work.

Piecewise bilinear wavelets vs. piecewise constant wavelets and standard PCM
In order to compare our piecewise bilinear wavelet implementation with previous ones, we have considered both our previous piecewise constant wavelet implementation 38 and a standard IEFPCM implementation, 28 which makes use of a collocation method and the GEPOL algorithm for the cavity construction. This comparison is illustrated in Fig. 9 for the nuclear part of the solvation energy, in Fig. 10 for the electronic part, and in Fig. 11 for the total solvation energy. Finally, the isotropic polarizability results are displayed in Fig. 12.
The nuclear and electronic components of the solvation energy show very similar trends. In particular, the piecewise bilinear basis shows a faster convergence to the limiting value, although it uses twice as many function evaluations as the corresponding piecewise constant wavelet solver at the same PL. Concerning the comparison with standard IEFPCM, it is more difficult to compare calculations vis-á-vis, because the cavity discretization is here significantly different and because a fitted parametrization for the diagonal elements is employed. Table 3 Effect of the a priori and the a posteriori matrix compression parameters a, d 0 and b on the nuclear and electronic polarization energies for benzene. Results were obtained at the Hartree-Fock level of theory either using a 6-31G or a large 6-311++G** basis set. The piecewise bilinear wavelet basis was used in the Galerkin discretization of the PCM integral operators. All the energies reported are differences, expressed in Hartrees, with respect to the case where a = 5.0, d 0 = 3.0, which is the upper limit in (13). Only selected compression parameter triplets are shown. The number of points on the cavity where the electrostatic potential has to be evaluated are 230 40 and 921 60 at PL-2 and PL-3, respectively  Table 4 Effect of the a priori and a posteriori matrix compression parameters a, d 0 and b on the static isotropic polarizability of benzene. Results were obtained at the Hartree-Fock level of theory either using a 6-31G or a large 6-311++G** basis set. The piecewise constant wavelet basis was used in the Galerkin discretization of the PCM integral operators. All the values reported are differences, expressed in a 3 0 , with respect to the case where a = 5.0, d 0 = 2.0, which is the upper limit in (13). Only selected compression parameter triplets are shown. The number of points on the cavity where the electrostatic potential has to be evaluated are 102 40 and 409 60 at PL-2 and PL-3, respectively Therefore, a standard PCM calculation is able to achieve a good accuracy even with much coarser cavity parametrization. On the other hand, the most accurate result is still different from the wavelet one. Further refinement of the cavity description would expose the instabilities of the collocation method. The values for the electronic part are very similar and identical considerations apply. Turning our attention to the total solvation energy, it is evident that the piecewise bilinear wavelets are much more accurate: already at PL-2 the result is practically converged. A similar result requires PL-4 with piecewise constants wavelets and almost 8 times as many elements. The comparison with the collocation method shows that there is going to be a gap between the converged IEFPCM results and the wavelet results.
For the electrostatic solvation energy of benzene the gap is not large (0.02 kcal mol À1 ), but more polar substrates might show wider gaps. Concerning the isotropic polarization, similar findings to the total solvation energies are obtained: the piecewise bilinear wavelets yield a practically converged result at PL-2, whereas the piecewise constant wavelets are converged at PL-3 and the ''best'' standard IEFPCM displays again a small gap with respect to the converged wavelet results.
In conclusion, both wavelet methods converge to the same result: the piecewise bilinear wavelets converge faster and require a lower PL to attain a given accuracy, both for the overall solvation energy and the polarizability. The traditional collocation method Table 5 Effect of the a priori and a posteriori matrix compression parameters a, d 0 and b on the static isotropic polarizability of benzene. Results were obtained at the Hartree-Fock level of theory either using a 6-31G or a large 6-311++G** basis set. The piecewise bilinear wavelet basis was used in the Galerkin discretization of the PCM integral operators. All the values reported are differences, expressed in a 3 0 , with respect to the case where a = 5.0, d 0 = 3.0, which is the upper limit in (13). Only selected compression parameter triplets are shown. The number of points on the cavity where the electrostatic potential has to be evaluated are 230 40 and 921 60 at PL-2 and PL-3, respectively   is able to achieve a reasonable accuracy with fewer points. However, the limit of large refinement is slightly different than the wavelet one, indicating a limitation of the collocation method. One possible explanation for such a discrepancy could be the fitting of the diagonal elements, 74 introducing a bias at large refinements.

Performance of the wavelet solver
The convergence properties and memory requirements of the wavelet solver were assessed by performing calculations on the four linear polyalkane chains C n H 2n+2 (n = 8, 16, 32, 64), using a standalone version of the wavelet solver. The interpolation parameter grade is set to 4 for levels = 2, 3, 4 and 8 for level 5. Only the nuclear charge distribution was considered. The total apparent surface charge (ASC) can thus be compared to the exact analytical value as obtained by the Gauss' theorem where e is the permittivity and Q is the total charge enclosed by the cavity. A similar analysis for the piecewise constant parametrization was presented by Weijo et al. 38 The influence of the PL on the ASC convergence is shown in Fig. 13, for three different sets of a priori compression parameters. Only the graph for C 32 H 64 is shown because similar trends were observed also for the other polyalkanes. See ESI, † for the complete set.
From Fig. 13 we see that by increasing the PL leads to convergence towards s exact , a general feature of any Galerkin BEM method. Such convergence is achieved somewhat faster when a less aggressive a priori compression is applied to the system matrix. As discussed in Section 2.2, higher values of the a priori compression parameters lead to a higher accuracy in the calculated ASC since a larger number of matrix elements is retained.
In the wavelet PCM formalism, both the construction of the system matrix and the solution of the linear equations scale linearly with the mesh size, given the same initial set of patches. In Fig. 14, we see a summary of the convergence analysis for the polyalkane chains C n H 2n+2 (n = 8, 16,32,64), which also Fig. 11 Convergence of U pol with respect to the number of molecular electrostatic potential (MEP) evaluation points on the cavity surface. The values reported are in kcal mol À1 and refer to benzene. The Gaussian basis 6-31G was used. The upper axis reports the average area for the collocation tesselation, while the lower axis refers to the patch level in the wavelet Galerkin discretization. The annotation report the number of MEP evaluation points. The compression parameter triplet was set to a = 2.5, d 0 = 2.0, b = 0.01 for the piecewise constant wavelets and to a = 2.5, d 0 = 3.0, b = 0.01 for piecewise bilinear wavelets. The limit value is extrapolated from the results obtained from the piecewise bilinear wavelet Galerkin scheme.   contains the total computational time. In all cases, the compression parameters were kept fixed: a = 1.0, d 0 = 2.25 and b = 0.01. Since we are still in the preasymptotic regime, the observed scaling is N 1.5 J instead of the proven linear behaviour N J , where J is the refinement level as described in Section 2.2.
The time spent for assembling the system matrix, discarding unnecessary elements by compression and solving the linear system of equations is shown in Fig. 15 for different choices of the compression parameters. Only the results obtained in the case of C 16 H 34 are shown, as similar trends are exhibited by the other molecules. For the other molecules the reader is referred to the ESI. † Clearly, assembling the system matrix is the most time consuming portion of the currently implemented version of the wavelet algorithm. It is also evident that this takes longer when the compression of the matrix is less aggressive and more elements are to be retained. For example, up to 93% of the time is spent in assembling the system matrix when a = 2.0, d 0 = 3.0 and b = 0.01. The solution of the system of linear equations is much less demanding and also less affected by the requested accuracy. On the other hand, a finer mesh (higher PL) implies an increased number of integral evaluations to obtain the electrostatic potential.
Another crucial aspect to be considered is the construction of the initial set of patches. The system matrix is indeed dense for the scaling part and the algorithm scales quadratically with number of original patches. In order to achieve linear scaling  with respect to the molecular size, it will be necessary to devise a cavity generator which scales sub-linearly (ideally O(N 1/2 ) where N is the number of atoms) with the molecular size. This is however not the case for the current implementation, as already shown in our previous work 38 and therefore the scaling with the molecular size is almost quadratic.

Conclusions
We have presented the first implementation of the Polarizable Continuum Model which combines the Integral Equation Formalism with a wavelet solver with piecewise bilinear wavelets for the solution of the underlying boundary integral equation. This is a further development of a previous work, 38 which made use of piecewise constant wavelets. Thanks to the construction of the system matrix within the wavelet formalism, the solution of the boundary integral equation exhibits fast and guaranteed convergence to the exact limiting values of the problem, which cannot be achieved with a collocation method. Due to the high modularity of PCMSolver, linear response was immediately available (both with picewise constant wavelets and piecewise linear wavelets) and we have demonstrated that the accuracy attained for energy calculations is reflected in the response calculations as well.
The robustness of the wavelet formalism makes our implementation an important reference benchmark for accurate calculations, which has so far been missing. In order to make the implementation competitive also on the performance side the two most important bottlenecks will have to be addressed. The accuracy achieved with wavelets depends on the evaluation of electrostatic potentials in a large number of mesh points. Speeding up this part will require the use of interpolation techniques (integrals are calculated at a coarser mesh and interpolated at a finer one) and fast multipole methods. The second bottleneck is constituted by the construction of the initial parameterization of the cavity into patches. The current implementation does not guarantee linear scaling with the molecular size, which could only be achieved if the number of patches scales sub-linearly with the molecule size.