Accurate and Interpretable Representation of Correlated Electronic Structure via Tensor Product Selected CI

The task of computing wavefunctions that are accurate, yet simple enough mathematical objects to use for reasoning has long been a challenge in quantum chemistry. The difficulty in drawing physical conclusions from a wavefunction is often related to the generally large number of configurations with similar weights. In Tensor Product Selected CI, we use a locally correlated tensor product state basis, which has the effect of concentrating the weight of a state onto a smaller number of physically interpretable degrees of freedom. In this paper, we apply TPSCI to a series of three molecular systems ranging in separability, one of which is the first application of TPSCI to an open-shell bimetallic system. For each of these systems, we obtain accurate solutions to large active spaces, and analyze the resulting wavefunctions through a series of different approaches including (i) direct inspection of the TPS basis coefficients, (ii) construction of Bloch effective Hamiltonians, and (iii) computation of cluster correlation functions.


I. INTRODUCTION
Computational electronic structure theory has developed into an indisputably powerful tool for understanding the quantum mechanical origins of molecular structure and chemical transformations.Progress over the past several decades (in both hardware and algorithmic improvements) has advanced quantum chemistry to the point where the accuracy can often rival that of experiments, particularly for low-energy molecules near equilibrium.However, as the accuracy of a computation increases, so to does the numerical complexity of the solution, making interpretation more challenging.
The need for achieving both quantitative accuracy and qualitative richness was recognized early on, as computers were first becoming increasingly powerful [1][2][3].Gaining access to the underlying driving forces of reactions or structure has proven to be one of the most valuable aspects of quantum chemistry.As such, the ability to extract qualitative insight is perhaps more important than simply arriving at a quantitatively accurate result.
Many approaches to extracting conceptual insight from ab initio calculations involves some sort of "localization".This is because much of our chemical vocabulary is inherently local (oxidation states, bond order, partial charges, hybridization, etc).The abundance of local chemical constructs is not an accident, molecular structure is generally highly local.For example, the alcohol group in 1hexanol behaves very similarly to 1-heptanol.As such, understanding the local structure of a functional group in one system extends significant reasoning power to other systems.Consequently, much of the effort spent toward extracting qualitative insight involve the localization of orbitals, such as with NBOs, [4][5][6][7] ALMOs, [8][9][10] localization of the density as in AIM, [11] or even many-electron states [12][13][14][15][16][17][18][19] (though this list is necessarily far from comprehensive).Localization has also been leveraged exten-sively for reducing computational complexity.Underlying many of these developments is the fact that the density matrix asymptotically approaches linearly scaling for gapped systems in a localized basis [20].
All (most) of the methods discussed above ultimately leverage the fact that a Slater determinant wavefunction (or MP2 or CCSD) is invariant with respect to orbital rotations within the occupied or virtual spaces.Orbitals can be mixed to maximize some localizing objective function, and the resulting wavefunctions can then be analyzed in terms of local or non-local contributions.
In contrast, a tensor product space permits a much more explicit notion of locality, one that naturally exposes the ability to factorize into local quantities (entanglement) and allows clear labeling of the entire Hilbert space in terms of unambiguously local quantities.Recently, we have explored the ability to leverage features of tensor product spaces to decrease computational cost of large active space calculations [21][22][23][24][25].In Refs.23, 24, we introduced a method called Tensor Product Selected Configuration Interaction (TPSCI) which uses a selected CI algorithm to assemble a basis of tensor products of locally correlated wavefunctions, which can provide accurate approximations to FCI.In this paper, we demonstrate the ability of these (still expansive) TPSCI wavefunctions to be meaningfully analyzed and interpreted across a rather wide range of physical systems, including non-bonded chromophores, dichromium spin coupling, and the fully delocalized π system of a graphene nanoflake.

II. THEORY
We will start by expressing the electronic Hamiltonian in a basis of active orbitals (p, q, r, s), Ĥ = h pq p † q + 1 2 ⟨pq|rs⟩ p † q † ŝr, (1) assuming that the chosen active space is large enough to capture the necessary physics.This is generally the most limiting assumption in this paper, and work to include the dynamical correlation arising from external orbitals is currently underway in our lab [26].

A. Orbital Clustering
To make progress toward a compact and interpretable representation, we will assume that the active orbitals can be partitioned into disjoint clusters, or groups of orbitals.We will generally use capital letters, I, to index clusters.This orbital partitioning (or "clustering") is chosen to maximize the interactions within a cluster while minimizing the interactions between clusters.For example, if one had a bimetallic compound (as we consider later in the paper) then one might define each cluster to include all the orbitals centered on a given metal such that all local dynamical correlation is included as intra-cluster correlation, and weak spin-coupling is considered as an inter-cluster correlation.
Physically, we will assume that the interactions within a cluster are stronger than the interactions between clusters.This is not a formal requirement, but rather one that affects the convergence of the calculations.Each cluster is effectively a new smaller active space, and thus we can construct correlated many-body wavefunctions, |α I ⟩, that are completely localized to each local active space (cluster), I.We will refer to these locally correlated states as cluster states, using them to form an orthonormal basis for the full Fock space on each cluster.Likewise, the list of all tensor products of cluster states forms an orthonormal basis for the global Fock space on the full orbital active space.This allows us to represent an arbitrary wavefunction with s states as a linear combination of cluster state tensor products: In this representation, the basis vectors can potentially contain a significant amount of electron correlation folded into the local many-body cluster states.This means that the coefficient tensor, c s α,β,...,γ only needs to describe inter -cluster correlation, with all the intra-cluster correlation being folded into the basis vectors.This is essentially the same basis used in the Active Space Decomposition (ASD) approach of Shiozaki and coworkers [27][28][29].

Cluster Mean-Field (cMF) theory
In order to make the most use out of the representation defined above, it is important that the cluster states are defined carefully, so that they incorporate as much relevant electron correlation as possible.Diagonalizing the Hamiltonian projected onto a single cluster (simply keeping only the terms where all creation/annihilation operators act on orbitals within the cluster), yields a set of correlated many-body cluster states that include an exact description of the intra-cluster correlation.Taking a product of the local FCI ground states provides a reasonable approximation for the global ground state, one that becomes exact in the "clusterable" limit.
However, interesting molecular systems generally have non-trivial interactions between clusters, and so this becomes a rather poor approximation in practice.Fortunately, one can easily obtain a much improved ground state estimate by including a mean-field description of the inter-cluster interactions when defining the local cluster Hamiltonian, instead of simply projecting out the inter-cluster terms.An approach, called Cluster Mean-Field (cMF) theory, was introduced by Scuseria and coworkers [30][31][32][33], and used by Gagliardi and coworkers under the name variational localized active space selfconsistent field (vLASSCF) [34].In this work, we construct correlated cluster states by diagonalizing the local cMF effective Hamiltonian, ĤcMF I , for each cluster: where The cMF Hamiltonian for cluster I, depends on the 1RDM of all the other clusters, requiring the cMF solution to be obtained self-consistently.
Bearing a strong resemblance to traditional Hartree-Fock theory, the self-consistent solution corresponds to the variational minimization of an unentangled (product state) wavefunction ansatz, the difference from HF being that in cMF the ansatz only enforces the absence of entanglement between clusters.
In further analogy to Hartree-Fock theory, a "generalized Brillouin condition" holds, that rigorously uncouples the (converged) cMF wavefunction (Eq.4) from tensor product states (TPS's) with a single cluster excited:

Orbital optimization
Once converged, the cMF energy is stationary with respect to the local cluster state wavefunction coefficients (local FCI coefficients).This means that the cMF energy is invariant to intra-cluster orbital rotations (assuming each cluster is solved exactly), but variant with respect to inter-cluster orbital rotations.In order to obtain a further improved product state wavefunction, we can make the cMF energy stationary with respect to all orbital rotations.This is essentially a CASSCF calculation with multiple disjoint active spaces [30].While this clearly has the benefit of providing a lower energy variational solution, perhaps more importantly, is that it removes most of the arbitrariness of the orbital clustering.Assuming each cluster has a Hilbert space dimension greater than one, orbital optimization with a single tensor product state wavefunction will naturally tend to localize the orbitals, so as to maximize electron correlation.Consequently, methods that use the cMF wavefunction as a reference state will be well-defined, not dependent on a particular heuristic for orbital localization [22,31].

B. Tensor Product Selected CI (TPSCI)
While cMF provides a qualitatively attractive approximation for the ground state of a clustered molecular system, quantitative accuracy is clearly missing due to the neglect of all inter-cluster entanglement.Analogous to the common approach of using substituted Slater determinants for a basis, we will use substituted tensor product states as a basis for the full Hilbert space, where each TPS is typically taken to be an eigenstate of a cluster's cMF Hamiltonian.This is similar to a CI analogue of the Block-Correlated Coupled Cluster (BCCC) approach of Li [35].
The Hartree-Fock based Slater determinant basis and the TPS basis are equivalent, in that both span the full space.However, as soon as truncations are made, the two bases span different spaces.One benefit of working in a TPS basis where the local Hamiltonians are diagonal, is that due to the fact that local correlation is folded into the basis states themselves, the low-energy solutions become more heavily concentrated on a smaller number of basis states.Consequently, computational methods that exploit sparsity (e.g., selected CI) might be expected to be more performant in the correlated TPS basis than in a Slater determinant basis.In Refs.[23,24] we demonstrated that this is often true, and can sometimes be leveraged for computational benefit.
In the Tensor Product Selected CI (TPSCI) method [23], we use the general CIPSI [36] algorithm to discover and exploit, in a bottom-up fashion, the sparsity of the exact wavefunction in a TPS basis.This uses perturbation theory to iteratively discover the non-negligible TPS's that are needed to accurately approximate the exact solution.This is done by the following steps: 1. Diagonalize Ĥ in the current variational space (this being a list of TPS basis states that are expected to have large amplitude in the exact solution).
2. Apply the Ĥ to the current variational space eigenvector.This couples the variational space to the external space.
3. Compute the first order wavefunction in the external space.
4. Move the external configurations with large firstorder coefficients from the external space to the variational space.

Variational Space: Diagonalization
External Space: Perturbation theory

Apply Hamiltonian
Expand Variational Space FIG. 1. Schematic illustration of the selected CI algorithm used in TPSCI to build a basis of tensor product states.This is iterated until the dimension of the variational space stops growing.

If the variational dimension increases, go back
This overall iterative loop is shown in Fig. 1.In principle, all local FCI cluster states would be computed and used to form the basis for state space.While this is tractable for small clusters, for larger clusters, this becomes computationally prohibitive, and high-energy states are generally discarded prior to running TPSCI.We generally use M to refer to the maximum number of cluster states kept in a particular fock sector of a cluster.For each particle number subspace included, the corresponding lowest M eigenstates are computed.Then the Ŝ+ and Ŝ− operators are applied to those cluster states to generate the basis for the higher M s sectors.More details can be found in Ref. [24].The computational limitations of conventional (determinant basis) CIPSI is generally determined by the size of the dimension of the variational space.Because TP-SCI uses correlated TPS basis vectors, more correlation energy is typically recovered with smaller variational spaces, addressing the most significant bottleneck.This comes at a cost, however, during the matrix element evaluation.Whereas evaluating matrix elements in the Slater determinant basis is extremely efficient, matrix elements in the TPS basis are significantly more expensive.For a 3 cluster example, consider the following Hamiltonian contribution that contains three operators (p † q † r) on cluster 1, and one operator (s) on cluster 2: Computing the matrix element of this particular Hamiltonian contribution between two arbitrary TPS basis vectors will require the contraction of an integral sub-block with tensors of local quantities: where χ is a sign determined by the number of electrons in state |α 1 ⟩, and the Γ tensors are the precomputed local operator matrices in the cluster basis, e.g.: Because the matrix elements require a series of tensor contractions, instead of just a single access from an array, the construction of matrix elements becomes the key bottleneck in TPSCI.However, in Ref. [23], we compared TPSCI to Heat Bath CI [37], and found that we were able to obtain significantly lower variational energies using TPSCI than with heat bath CI.Once the TPSCI variational space has converged, we have also found that it is sometimes beneficial (especially for ground state problems) to perform a higher-order singular value decomposition (HOSVD) of the resulting wavefunction tensor, to rotate the cluster states into a form that diagonalizes the local cluster reduced density matrices (within subspaces that preserve local particle number and Ŝz ).More details about the implementation and matrix element construction can be found in Refs.[23,24].

Bloch Effective Hamiltonian
Once an accurate TPSCI wavefunction is obtained, one is often interested in more than just the associated energy.Being able to extract qualitative information to aid in communicating and reasoning about the underlying electronic structure, is extremely valuable, and TPS wavefunctions are uniquely interpretable.Since the basis states in TPSCI are essentially diabatic states, it is a very natural extension to use the concept of Bloch effective Hamiltonians to extract quantitative relationships between qualitatively meaningful degrees of freedom [38][39][40][41][42][43][44][45][46][47] for analysis.
We start by defining a "model space", {|ϕ i ⟩}, which is taken to be the set of physically meaningful TPS's that qualitatively define the structure or process.To ensure that the model space is actually relevant to the physics computed, the exact low-energy states of the system, |Ψ s ⟩, should have relatively large projections onto the model space, i.e., where, PM = i |ϕ i ⟩⟨ϕ i |.Next, we seek a hermitian effective Hamiltonian which exists only in the model space, but that yields the exact energy spectrum.While this is often done in a bottom-up fashion through approaches like quasi-degenerate perturbation theories or Schrieffer-Wolff transformations, if one already has access to the exact target eigenstates, an effective Hamiltonian (specifically, a Bloch effective Hamiltonian) can be obtained simply in a top-down fashion by direct projection, where, will always exist when Eq. 9 holds.The individual matrix elements of ĤBloch then contain quantitative relationships between qualitatively meaningful states.

Cluster correlation functions
In addition to the Bloch effective Hamiltonian, which gives us a state-universal description of the interactions between physically intuitive degrees of freedom, we also often want to characterize specific states in terms of physically intuitive variables.
Following the recent work of Luzanov, Krylov, and Casanova [48,49], the local TPS representation makes it simple to compute cluster correlation functions of various local cluster operators to characterize states in terms of observables.A two-cluster correlation function for operator Ô is given as the covariance between the operator localized onto each individual cluster: where the covariance of an operator with itself is just the variance, which will be used to measure the local fluctuations in a given cluster.Depending on the system we will consider correlations between the following cluster operators: local charge (particle number), NI , local spin projection, Ŝz I , local spin Ŝ2 I , and local excitation QI = 1 − |0 I ⟩⟨0 I | which indicates that cluster I is excited out of its cMF ground cluster state.

III. RESULTS
In the following sections, we explore the ability to simultaneously obtain quantitative yet interpretable approximations to large active spaces.Because our representation is ideal for separable clusters, to obtain insight into the transferability of the formalism, we explore systems which span a broad spectrum in terms of clusterability, going from a non-bonded tetracene tetramer (Sec.III A), to an anti-ferromagnetically coupled dichromium complex (Sec.III B), to a completely delocalized graphene-flake model, hexabenzocoronene (Sec.III C).
We use PySCF software for performing any necessary geometry optimizations, Hartree-Fock calculations, and integral generation [50].All TPSCI calculations are performed using our open-source FermiCG software [51].

A. Singlet-Fission: Tetracene Tetramer
Singlet fission is the photophysical process by which a bright singly excited state, |S 1 ⟩, is converted into two lower energy triplet states, |T 1 ⟩ + |T 1 ⟩, by way of a multiexcitonic intermediate state 1 TT [52].Because this process converts a single photon into two excitons (each of which can split into charge carriers), materials that exhibit singlet fission have promising applications in solar cells due to the possibility of overcoming the Shockley-Queisser limit for efficiency [53].
A dimer model description of the singlet-fission process requires a total of 8 states: the ground state |S 0 S 0 ⟩, two local bright states, |S 1 S 0 ⟩ and |S 0 S 1 ⟩, two local triplet states, |S 0 T 1 ⟩ and|T 1 S 0 ⟩, and the three biexcitonic states arising from the product of two triplet states |T 1 T 1 ⟩.The biexcitonic state is characterized as an entangled pair of local triplets, which can be spin-coupled into either a singlet 1 TT , triplet 3 TT , or quintet 5 TT , represented in the diabatic basis via their Clebsch-Gordan coefficients, where the first triplet in each state refers to chromophore A and second triplet to chromophore B. Although 1 TT is the main intermediate, as it is spin-allowed, it has been shown that the triplet and quintet states can play a role in the separation process [54].While a dimer model captures the key intermediates, it is too small to describe additional physical effects that occur in bulk systems.For example, the initial bright state is generally understood to delocalize over several monomers, increasing the number of localized biexcitons to which it can couple.Further, it has been seen that singlet fission rates are increased by the involvement of a non-nearest neighbor biexciton 1 T • • • T , which is absent from the dimer model by construction [55,56].Recent work has further demonstrated the importance of beyond dimer effects [57,58].
Because methods that rely on single or even double excitations struggle to accurately describe two-electron excitations, multireference methods, such as CASPT2, are often required to capture the wide range of electronic character.While this is suitable for a couple chomophores, active space methods typically grow exponentially with the number of chromophores, making it difficult to extend to larger systems.TPSCI (similar to ASD which preceded it [27][28][29]) is well suited for treating collections of chromophores because the physical system efficiently maps onto the tensor product basis.In a recent paper [24], we demonstrated that TPSCI was able to provide accurate approximations to CASCI for a large (40e, 40o) active space, which incorporated 10 active orbitals for each of the four tetracene chromophores.In this paper, we explore this further, using the TPS structure to facilitate further analysis and characterization of the resulting wavefunctions.
In this subsection, we use TPSCI to go beyond the minimal dimer example for singlet fission and explore dressed Hamiltonains and local cluster operator correlations for tetracene tetramer (shown in Fig. 2(a)) using a large (40e, 40o) active space.

Active Space Selection and Clustering
In order to construct an active space which contains the relevant orbitals for describing both the local S 1 and T 1 states, we used the CIS-NO [59] approach to select our active space in the 6-31G* [60] basis.We first performed a CIS calculation for the first singlet and triplet on each chromophore and averaged these states into the one particle reduced density matrix (1RDM).By diagonalizing the 1RDM, we obtained a set of natural orbitals from which we extracted the 40 most correlated orbitals (i.e.those that have the most fractional occupancy) as our active space.We then localized these 40 orbitals using Pipek-Mezey [61] and then grouped the orbitals into four (10e, 10o) clusters on each chromophore for an overall active space of 40 orbitals and 40 electrons (40e, 40o).We are currently developing approximate solvers (such as RAS-CI) for obtaining the local cluster states.This will allow us to consider clusters larger than the relatively small ten orbital clusters used here.

Bloch Effective Hamiltonian
To analyze our TPSCI results, we start by computing a Bloch effective Hamiltonian by projecting the TPSCI eigenvectors onto our diabatic basis (i.e.model space).The diabatic basis for tetracene tetramer includes the biexciton diabatic states for each pair in addition to the singly excited states on each chromophore.In the tetramer, there are six possible dimer configurations and each generates three M s = 0 spin components |T + T − ⟩, T 0 T 0 , and |T − T + ⟩ which leads to a total of 18 diabatic biexciton states.We also observe in the tetracene monomer that both T 1 and T 2 are lower in energy than the first singlet excited state S 1 , thus all three of these states must be represented for each chromophore in our model space.In total, our model space includes 31 diabatic states.However, to simplify the picture, we focus on the singlet model space, where the biexcitonic states have been mixed using their Clebsch-Gordon coefficients to form proper 1 TT diabatic states.This reduces our model space from 31 states to 11 states.
In Fig. 2(b) and 2(c), we plot the bare Hamiltonian and effective Hamiltonian in the model space as described above, with the columns arranged to correspond with the cluster labels in 2(a).The diagonal energies are subtracted to better reveal the off diagonal activity in meV.In both plots, the Hamiltonian is blocked by state type: ground state, four singlet excited states, and the 6 singlet biexcitons, where the non-singlet states have been omitted for clarity.As expected the singlet excitons couple very strongly to each other, both in the bare and effective Hamiltonians, which ultimately gives rise to bright state delocalization.There is only negligible coupling between the S 1 and biexcitons in the bare plot, but after including external space correlations, we see a significant growth in the strength of the effective coupling.These are listed explicitly in Table I.Whereas the 1 TT states on the 3 herringbone dimers develop significant S 1 coupling after the inclusion of the external space, the planar dimers remain uncoupled from the bright spectrum.In addition to strengthening the coupling between the bright states and the biexcitonic states, the inclusion of higher energy states also induces couplings between the biexcitons themselves.

Correlation Analysis
As mentioned in SectionII B 2, correlation functions of various local cluster operators can be used to characterize the adiabatic TPSCI wavefunction in terms of physically meaningful relationships between clusters.In Table II, we list the expectation values of the cluster excitation operators, which measures the amount of excited state character in each state, and by summing over clusters, the total excitation rank of each excited state.This allows us to unambiguously identify the biexcitons, which we use to label the states accordingly in Fig. 3, where we compute the inter-cluster cumulants for cluster particle numbers ( NI ), cluster spin projections ( Ŝz I ), and cluster excitations ( QI ) for each of the six singlet biexcitons 1 (TT) and four bright states.
Looking first at the particle number correlations, NI , we see that overall, the "dark" |TT⟩ states are relatively quiet compared to the charge correlations present in the bright states.This is entirely expected based on the physical characteristics of the bright vs dark states.Bright states have relatively large amounts of charge transfer character mixed in.The presence of charge transfer makes local particle number less well defined, which increases a cluster's charge variance, and similarly increases the anti-correlation between two clusters' charge states (when the donor is cationic, the accepter has a high probability of being anionic).Although weaker than the lowest S 1 state, we see clear signatures of charge correlation present in a couple of the biexcitons (State 6 and 10).This is consistent with an analysis of the wavefunction itself.If we compute the amount of charge transfer present in each state, we find that out of all of the biexcitons, states 6 and 10 have the highest percentage of CT character, 4.6% and 4.5%, respectively (see the Supplementary Information for all state CT compositions).Charge correlations between clusters entangled into a biexciton also indicates significant superexchange, which stabilizes the low spin biexciton [62,63].
Considering next the Ŝz I correlations, we see the opposite trend, where the biexciton states have significantly more pronounced correlations, and the bright states are featureless (as would be expected from the lack of local triplet character).This is also consistent with the nature of the different sets of states.A 1 TT biexciton is characterized as two entangled triplet states coupled into a singlet state.Because the total M s is zero, when the first monomer is in the M s = 1 microstate, the entangled partner is very likely to be found in the M s = −1 microstate.This entanglement leads to a very strong Ŝz I covariance.
Using the spin correlation as a way to label the biexcitons [48], we can identify that state 6 is a (1,2) biexciton (meaning that it primarily exists on chromophores 1 and 2), states 7 and 12 are superpositions of (1,3) and (2,3) biexcitons, and state 21 is a (2,4) biexciton.Looking at state 19, we see what resembles a (3,4) biexciton, although the overall magnitude is much smaller.In order to understand this, we can look at the Ŝ2 expectation value of the state (shown in the Supplementary Information).We find that in this case, the (what we labeled to be ) 1 TT state has significant spin contamination of about 1.1.This is a consequence of the fact that the , 1 TT and 3 TT states are approximately degenerate, meaning that any arbitrary mixing of the two states is also an eigenstate.This mixing of the two spin states essentially creates a "broken-symmetry" state, where one chromophore is M s = 1 and the other is M s = −1.By locking the local spin vectors, the local Ŝz I fluctuations are diminished, and hence the ability to have significant covariance with any other cluster.This could be corrected by tightening our TPSCI convergence, or by rediagonalizing Ŝ2 in nearly-degenerate subspaces.
Looking more closely at state 12, we see that clusters 1, 2, and 3 all have significant Ŝz I fluctuations, but only the 1,3 and 2,3 pairs are spin correlated.Considering the cluster excitation, QI covariance plot, we see that while cluster three has zero fluctuations (it is consistently excited), clusters 1 and 2 are strongly correlated.This means that cluster 3 is always excited, but when cluster 1 is excited, cluster 2 is most likely in the ground state, and vice versa.This suggests a situation where a triplet state on cluster 3 forms a biexciton with a triplet delocalized between clusters 1 and 2. This is consistent with an analysis of the individual TPS state coefficients, were we find that 98% of the wavefunction is characterized as a superposition of the 1,3 and 2,3 biexcitons, |Ψ 12 ⟩ ≈ 0.67 1 (T 1 S 0 T 1 S 0 ) + 0.73 1 (S 0 T 1 T 1 S 0 ) .

B. Cr2 Complex Effective Hamiltonian
Multi-center organometallic complexes often exhibit interesting physics, such as single molecule magnetism [64,65], or valuable catalytic capabilities, such as water oxidation [66], or nitrogen fixation [67].While having a computational method that could efficiently compute the low-energy structure of organometallic compounds would be highly valuable, several physical features of these systems make this challenging.When multiple weakly interacting metal centers possess unpaired electrons, the resulting low-energy states are highly multiconfigurational, making conventional approaches like perturbation theory or coupled cluster theory inappropriate, as they require a qualitatively correct single determinant wavefunction as a reference.
Because the product of multiple high-spin centers leads to a large number of spin states, it is not always possible to predict, a priori, the spin multiplicity of a multicenter organometallic complex.Consider a simple biradical Hubbard model, represented in a local minimal orbital basis, as shown in Fig. 4(a).Here, the low-energy configurations are both open-shell broken symmetry states, and proper eigenstates must be superpositions of these two configurations, one being a singlet, and the other the M s = 0 component of the triplet.Because the ionic (or charge transfer) configurations are both singlets, they can only mix with the singlet combination of neutral configurations, which is ultimately the origin of the antiferromagnetic coupling in biradicals.
This picture can often be simplified significantly.For systems where the electron repulsion is sufficiently large such that hopping of an electron from one center to another incurs a high energy barrier (i.e., U ≫ t), the influence of charge fluctuations can be approximately downfolded using quasi-degenerate perturbation theory into an effective spin Hamiltonian, called the Heisenberg-Dirac-van Vleck Hamiltonian: [68] where J = − 4t 2 U .For this model, the low-energy spectrum is completely determined by the value of J.
Starting from the ab initio Hamiltonian instead of the Hubbard model, one finds that the zeroth-order term also contains the non-local direct exchange integral.Since bare exchange stabilizes the high-spin states, and the second order super-exchange term stabilizes the low-spin states, even getting the sign correct for the exchange coupling constant can be difficult, as the value of J is determined by the subtle interactions coupling the metals with each other and with the ligands.However, once known, the relative ordering of the spin states can be directly written down in terms of J.While this is indeed an approximate description of the low-energy electronic structure, it is profoundly useful as the prediction of J is also often what connects experiment to theory, where J is commonly fit to experimental magnetic susceptibility measurements.
The most common approach to computing J from ab initio quantum chemistry is to use DFT, where one of the degenerate broken symmetry configurations is optimized, followed by a spin projection formula, originally proposed by Noodleman [69], and then improved by Yamaguchi [70,71].While this approach has been widely used due to its conceptual simplicity and computational efficiency, there are downsides.First, the formalism doesn't actually ever compute the proper low-spin wavefunction, and so only the energy is generally able to be extracted.Second, the results become highly functional dependent.While all systems demonstrate some density functional dependence, spin-coupled complexes are intrinsically more sensitive because the percentage of exact exchange directly affects the relative energies of the high spin and broken symmetry states [72,73].Finally, DFT doesn't offer a path toward systematic improvements, making it difficult to compare results.
Because of these reasons, multireference methods like CASSCF and CASPT2 are often used to model exchange coupled systems.However, the associated computational cost limits the active space size, making it difficult to converge results to the point where quantitative comparison to experiment is possible.Often one finds that dynamical correlation (involving interactions with non-magnetic orbitals) has a significant impact on the value of J, generally strengthening the antiferromagnetic interactions.As such, DMRG has emerged as the standard benchmark method for computing exchange coupling constants in organometallic compounds [74][75][76][77][78][79][80][81][82][83], although if only the value of J is needed, efficient approaches that combine spin-flip methods have also been useful [44,45,47,84].

Active Space Selection
For our calculations, we extend the size of the active space by including the orbitals which overlap most strongly with the 3d and 4d orbitals on each Cr center, as well as the 2p and 3p oxygen orbitals on each bridging OH −1 ligand, leading to an overall orbital active space of 38 orbitals.Our active space was obtained by first optimizing the ROHF wavefunction for the high-spin heptet state.We then define a set of atomic orbitals for which we would like to span as closely as possible without destroying the ROHF reference.For this system, we take as our projection space (µ A ), the Cr 3d and 4d atomic orbitals, and the bridging O 2p and 3p orbitals, leading to a total of 38 orbitals.We then separately project the doubly occupied (i), singly occupied (s), and virtual (a) orbital spaces onto this AO subspace, providing matrices C µ A ,i , C µ A ,s , C µ A ,a ,.We then perform separate SVD's on each projected subspace, and keep the largest singular vectors from each orbital space.As such, we start with 38 atomic orbitals and end up with 38 molecular orbitals.Following this automated procedure produced an active space consisting of 13 doubly occupied, 6 singly occupied (the full ROHF open shell space), and 19 virtual orbitals, leading to a (32e, 38o) active space.While this is not the only way to yield an active space, it was convenient for our purposes as the resulting orbital active spaces are already localized to our target system.In the future, more extensive tests will be performed for automating the construction of localized active spaces.The active orbitals are shown in the supplementary information.

Clustering
The 38 active orbitals described above were then organized into 5 clusters, as depicted in Fig. 5(b).Here, each Cr atom cluster defined a local (7e, 10o) active space, and each oxygen a (6e, 6o) local active space.For each cluster, a local many-body basis was constructed from the M lowest energy states for each sector of Fock space which contained up to N I ± δ elec number of electrons, where we set δ elec = 3 for these calculations.For example, each Cr cluster has a basis of up to M states obtained by diagonalizing the CMF Hamiltonian for each of the following active spaces: (4e,10o), (5e,10o), (6e,10o), (7e,10o), (8e,10o), (9e,10o), and (10e,10o).Similarly, each bridging OH −1 ligand had 7 different active spaces centered at (6e, 6o).[86] As is commonly done with selected CI approaches, significantly improved approximations to the energy can obtained by performing a series of selected CI calculations with varying thresholds and extrapolating to the zero error limit, which is taken as an approximation to full CI.While more sophisticated extrapolation schemes have been proposed [87], we use the common approach of assuming a linear relationship between the variational energies and the PT2 correction [88].In Fig. 5(a), we plot  III.Exchange coupling constants (cm −1 ) for Cr2 compound with def2-SVP basis, and (32e, 38o) active space.J refers to H = −2J Ŝ1 • Ŝ2.J(S0,S1) denotes which spin states are used to computed J via Landé rule."TPSCI" refers to the best variational energy obtained, using ϵcipsi = 2e−4."TPSCI+PT2" is the best variational energy plus the statespecific PT2 correction."Extrapolated" uses differences between the extrapolated energies.M is the maximum number of cluster states computed for each cluster Fock sector.The dimension of the sparse variational TPSCI subspace is 97357 for M=100 and 127493 for M=200.
the variational TPSCI energy of the 4 lowest eigenstates as a function of the PT2 correction to each state.By extrapolating this linear relationship to zero PT2 correction, we can obtain an estimate of the exact eigenvalues of the Hilbert space defined by M .In Fig. 5(a), we show the extrapolation for the M = 100 calculations.
In order to compute J, we can use the Landé inter-val rule derived from the energy spectrum of a two-site Heisenberg model, J = (E(S−1)−E(S))/2S.After computing the lowest energy S=0, 1, 2, and 3 spin states, we could use any of the gaps to compute J.If the ab initio system were to be perfectly described by the Heisenberg model, then the computed J value would be independent of the particular energy gaps we were to choose.However, the Heisenberg model is rarely, exact, and so we can partially quantify how approximate the model is by comparing J values computed with different energy gaps.We list the various J values in Table III, using either the best variational TPSCI energies, the TPSCI+PT2 corrected energies, or the extrapolated energies.Results for both M = 100 and M = 200 are included.[89] Inspecting first the effect of energy extrapolation, we find that the extrapolated J values are larger than the TPSCI+PT2 values by only around 1 cm −1 , and that doubling the size of M from 100 to 200 only increases the J value by another 1 cm −1 , despite the fact that this also increases the dimension of the total accessible Hilbert space significantly from 2.7e14 to 6.1e15.Compared to the recent (30e, 22o) DMRG-SCF calculations which yielded a J value of −23.9 cm −1 [74], our computed values are slightly larger, in good agreement with the reported CASSCF(6e,10o)-NEVPT2 values of −31.8 cm −1 [74].

Bloch Effective Hamiltonian
In order to further analyze the results listed in Table III, we compute a Bloch effective Hamiltonian, providing access to the individual effective (dressed) interactions between the various spin microstates that lead to the low-energy spectrum.A qualitative description of this complex assigns each Cr center a well-defined oxidation state (III) and spin state (S= 3  2 ).As such, the low-energy spectrum is expected to be dominated by the 16 spin configurations that form a basis for the S = 3, 2, 1, and 0 states, providing a clear definition for our model space.However, because the Hamiltonian preserves spin, we can restrict our focus to only the global M s = 0 subspace, and take our model space to be the corresponding 4 dimensional subspace.In Fig. 6, we plot both the bare Hamiltonian (Fig. 6(a)) and the Bloch-effective Hamiltonian (Fig. 6(d)) in the model space.The corresponding low-energy spectra are provided in Figs.6(b) and 6(c), respectively.
There are two main features of the effective Hamiltonian that emerge from the implicit inclusion of the external space correlation: the spin-coupling interactions have their signs flipped, and their magnitudes increased.The result of this is that the system changes from ferromagnetic to antiferromagnetic coupling and the gaps between the spin states are approximately doubled.
This qualitative result is consistent with a recent study (Ref.[90]) where J values for a similar dichromium (III) complex were computed using the vLASSCF-SI method.Because the vLASSCF method works in a similar TPS basis, they were able to evaluate the impact of explicitly including some charge transfer configurations.They too found that this affected a qualitative change in the sign of J, switching from ferromagnetic to anti-ferromagnetic.
In order to further analyze this Cr 2 (III) system, we can compute cumulants of local observables, as mentioned above [48,49].In Table IV, we list the expectation values, the variances, and the covariances of a few operators local to the Cr 2 (III) centers, including local particle number, Ŝz Cr , and Ŝ2 Cr .We also include the global Ŝ2 because our basis is not spin-adapted, so there is potential for spincontamination, although our calculations are converged tightly enough to reduce spin-contamination to the 4th decimal place.
Considering first the local particle number values, we find that the average number of electrons is quite consistent across the different spin states, staying just barely above 7 electrons (which corresponds to a Cr(III) oxidation state, with 2 doubly occupied ligand orbitals).However, the fluctuations in the oxidation state noticeably depend on spin state, increasing as the global spin is decreased.This is easily understood as a consequence of the super-exchange mechanism, whereby coupling to electron transfer states stabilizes the low-spin states relative to the high spin states.Again consistent with that seen in Ref. [90], and the more general treatment of formal magnetic interactions from Malrieu and coworkers [43].For the global singlet state, the statistical correlation between the oxidation state fluctuations on the two Cr centers is only 21.1%, indicating that the majority of the oxidation state fluctuations are due to electron exchanges with the bridging ligands.

Because both Cr centers are re-coupled into global eigenvectors of Ŝ2 , local Ŝz
Cr is no longer a good quantum number, and becomes maximally uncertain.In

Inspecting the local Ŝ2
Cr values, we see a complementary picture to that provided by the particle number fluctuations.As the global spin is decreased, the local S 2 values also tend to decrease, while the variance increases.This is consistent with the superexchange mechanism stabilizing global low-spin states by coupling to charge transfer states.When an electron transfers from one Cr to the another, the number of unpaired electrons decreases.As a result, the local S 2 Cr values decrease, and develop a positive covariance between the centers.

C. Conjugation in 2D
In the earlier sections, the tetracene tetramer (Sec.III A) served as an example of a completely non-bonded systems, which is clearly quite easily clusterable.In Sec.III B, we demonstrated that the concepts of oxidation state and local spin allowed us to treat the Cr 2 (III) complex in a clustered representation.In contrast, conjugated π-systems are characterized primarily by the highly delocalized nature of the electronic structure.In this section, we investigate the ability to compute and analyze the full π active space for a large delocalized π system.
Hexabenzocoronene (C 42 H 18 ) is a polycyclic aromatic hydrocarbon (shown in Fig. 7(a)) where an additional benzene ring is fused to the outside of a central coronene ring.The π electrons are delocalized across the entire molecule which contributes to its unique electronic properties.As we have seen previously [23], the delocalized hexabenzocoronene system serves as a nice edge case for evaluating the ability of TPSCI to provide both accurate and insightful results for systems that are not obviously clusterable.

Active Space selection and clustering
For these results, we have considered the full π-system orbital active space (42 orbitals consisting of the 2p z orbitals on each carbon) using cc-pVDZ basis set [91].Viewing hexabenzocoronene as a collection of seven benzene rings, we partition the 42 orbitals into 7 clusters of 6 orbitals.A depiction of this clustering is shown in Fig. 7(a).The geometry is optimized at the B3LYP/cc-pVDZ level of theory.The cMF optimization (including orbital rotations between clusters) is performed using our open-source Julia package ClusterMeanField.jl[92].For each cluster, a local many-body basis was constructed using the embedded Schmidt truncation (EST) approach, where we define the cluster basis as the singular vectors of FCI ground state on the cluster plus an orbital bath.We discarded Schmidt vectors with singular values smaller than a threshold value of 1e −4 .Detailed analysis of EST approach has been done in our recent TPSCI paper [23].

Convergence of TPSCI ground state
In Fig. 7(b), we plot the extrapolation of the TPSCI variational energy as a function of the PT2 correction.We use ϵ FOIS = 1×10 −6 (threshold on the external space couplings), and tightest ϵ CIPSI = 1.5 × 10 −4 through the bootstrapping HOSVD approach for this calculation [24].
Here we see that our ground state TPSCI+PT2 energy is only about 5 mH away from the extrapolated result.Further, this was with a variational dimension of only around 114k.The extrapolated total energy of the molecule is −1601.6971±0.0001au.It's important to emphasize that the uncertainty here is due to the linear fit and does not imply a variational guarantee.If we knew that the relationship was indeed linear over the full range, then our energy would be correct to 0.1 mH.We note that the performance of TPSCI on such delocalized π systems depends significantly on the topology of the molecule.For the complex considered here, there is a well defined Clar's structure that suggests a unique clustering.We expect this to be key to achieving accurate solutions.In contrast, our recent work [23] revealed that π systems without a well-defined clustering into a Clar's structure (such as coronene) is significantly slower to converge.We plan to explore this topic in the future for a more extensive set of polyaromatic hydrocarbons.

Correlations in between clusters
In Fig. (7)(c), the charge covariances between the clusters are depicted as a heatmap, with each row/column corresponding to a given cluster labeled by the number on the diagonal.Looking first at the diagonal of the matrix (the charge variances), we see that the outside clusters (1-6) all have equivalent charge fluctuations, while the central benzene unit has significantly larger fluctuations in the ground state.Because the variance quantifies the uncertainty in the number of electrons in a given cluster, each of the outer and inner clusters have 6.0 ± 0.5 and 6.0 ± 0.7, numbers of electrons, respectively (using a 3σ confidence interval).Considering next the off-diagonal matrix elements, we see that all nearest neighbor cluster couplings are approximately the same.Assuming twobody correlations dominate, this means that the central carbon should have a variance that is about twice that of the outer clusters, just based on the fact that it has twice as many nearest neighbors, which is consistent with the observed results.Because small differences are difficult to see in the heatmap, we have listed the unique covariance quantities in Table V. Very similar results exist for the Ŝz I correlations, as can be seen in Fig. S2 in the supplementary information.
By considering the wavefunction directly, we notice that about 89.6% of the wavefunction is attributable to TPS's that have 6α and 6β in each cluster, whereas 9.1% is due to charge transfer configurations, and 1.3% is due to local spin-flip configurations.
We also note, that in Table V, we see stronger particle number and spin correlations between benzenes connected in the para position than between those with meta connections, despite being further in distance, indicating a slight directing effect of the central benzene.Although the opposite is seen with the cluster excitation, QI , correlations.
Although the particle number covariance between each pair of neighboring clusters is negative (indicating charge transfer), the Ŝ2 I correlations are positive between neighboring clusters.This is consistent with the description of charge correlation.When an electron from a cluster hops into another cluster, then a doublet state will be formed in both of the clusters.One cluster will be one electron deficient giving rise to a cationic doublet state and the extra electron forms an anionic doublet state in the neighboring cluster.Consequently, when one cluster is in a doublet state, its neighbors have a higher probability of also being found in a doublet state, making the

IV. CONCLUSIONS
In this paper, we have explored the ability of TPSCI to provide accurate yet interpretable approximations to FCI on relatively large orbital active spaces.Because TPSCI works in a basis consisting of products of local FCI states, the more separable a system is, the easier it should be to simulate.As such, in this paper, we consider three example systems, which range in the degree of separability: (i) a completely non-bonded tetramer of tetracene molecules, (ii) a more strongly interacting dichromium organometallic complex which, while bonded, is still characterized with local quantities like oxidation state, and (iii) a completely delocalized π system of hexabenzocoronene.
For the dichromium example, this was the first TP-SCI calculation applied to open-shell biradical systems.We found that TPSCI was able to compute exchange coupling constants that are larger in magnitude (presumably more accurate), than recent DMRG calculations.
For each of these systems we characterized the resulting wavefunctions using quantities that are easily accessible from the TPS basis.By leveraging the natural diabatic character of the TPS basis, we are able to easily construct Bloch effective Hamiltonians, which provide quantitative relationships between physically relevant degrees of freedom.This provided access to quantities such as the effective coupling between the bright states and the multiexcitonic states, ⟨S 1 | Ĥeff 1 TT , which implicitly includes the downfolded effects from charge transfer couplings, which are substantially enhanced compared to the direct coupling.
We additionally used correlation functions of quantities like particle number, spin, and cluster excitation to provide a more detailed analysis of the various variational TPSCI eigenstates, and ultimately compare the results to inspection of the wavefunction itself which is particularly interpretable due to the diabatic nature of the basis.This work, helps lay out approaches for extracting more insight from TPSCI wavefunctions (and other TPS based wavefunctions) in the future.

V. ACKNOWLEDGMENTS
This work was supported by the National Science Foundation (Award No. 1752612).NMB also acknowledges the support provided by the Graduate School Doctoral Assistantship from the Department of Chemistry at Virginia Tech.
From Fig. (S1), we can see how nbody interactions affect the correlation energy.Most of the correlation energy is recovered from 2-body interactions.If we include 3-body interactions, it can be seen that there is 0.39-0.43eV contribution from 3-cluster interactions, and 4-body interactions further add only 0.02 eV for δ elec = 2 and nearly 0.002 eV for δ elec = 3 which is smaller than chemical accuracy.From these observations, it can be predicted that the hexabenzocoronene system contains interactions mostly between two benzene rings although the electron density is delocalized all over the systems.However, a significant amount of contributions is achieved through the 3-body interactions.We guess that most of the 3-body interactions are available among two benzene rings present on the outer ring and the central fused benzene ring distributed in a line (para-like) or three vertices of a triangle (ortho-like).This can be confirmed via the computation of 3 rd order cumulant of particle number.Whereas 3 benzene rings in the outer ring along with the central benzene contribute negligibly to 4-body correlation energy.
Now, if we focus on δ elec parameter, there is a very small change (less than chemical accuracy) if we shift δ elec value from 1 to 2, and 3. δ elec parameter restricts the Fock sectors (n α , n β ) based on the allowable electron transitions (δ elec = ±δ electron transfer).From this observation, it can be concluded that configurations created through single electron transition inside each cluster interact mostly with singly excited configurations and ground state configuration from the rest of the systems.Inter-cluster parameter, nbody determines how many inter-cluster interactions are  significant for intra-cluster parameter, δ elec .As δ elec = 1 recovers most of the correlation energy for a particular nbody value, the most important physical process that happens in hexabenzocoronene must be electron hopping (charge-transfer) which is further supported by the correlation matrix of particle number operator, N and spin projection, Ŝz .⟨ N ⟩ 6.00009508 6.00006097 6.00008872 6.00009172 6.00006299 6.00008641 5.9995141 ⟨ Ŝz ⟩ 0.00001386 -0.0000144 0.00000359 0.0000157 -0.00001414 0.00000181 -0.00000642 ⟨Q⟩ 0.03424627 0.03424847 0.03424389 0.03424593 0.03424957 0.03424989 0.07416684 ⟨ Ŝ2 ⟩ 0.02767142 0.02767881 0.02766855 0.02767127 0.02768138 0.02767573 0.06288520

FIG. 2 .
FIG. 2. Bare and Bloch Effective Hamiltonians for tetracene tetramer with diagonal entries subtracted to show off-diagonal couplings in meV.Active space of (40e, 40o) with system and clusters labeled.(a) Tetracene tetramer with the associated clusters labeled (b) Hamiltonian matrix for the model space (diabatic basis) (c) Bloch Effective Hamiltonian obtained by projecting TPSCI wavefunctions onto the model space.The non-singlet states have been removed for clarity.

FIG. 3 .
FIG. 3. Tetracene tetramer correlation functions.(top row) particle number, NI .(middle row) spin projection, Ŝz I .(bottom row) cluster excitation, QI .The first 6 plots from the left are for the 1 TT states.The last 4 plots from the left are for the |S1⟩ states.Each matrix column/row corresponds to cluster 1 to 4, as labeled in Fig. 2. The color scale for each correlation function is shown on the far right.

FIG. 4 .
FIG. 4. Cartoon illustration of the electronic structure of an S = 1 2 biradical approximately mapped onto an isotropic Heisenberg model at the large U limit, both restricted to the relevant Ms = 0 subspace.(a) the 4 possible Slater determinants.Assuming a localized orbital basis, the bottom two correspond to neutral excitations and the top two are ionic or charge transfer configurations.(b): the 2 possible spin configurations after being mapped onto a Heisenberg model with quasi-degenerate perturbation theory (QDPT).

FIG. 5 .
FIG. 5. Convergence and extrapolation of Cr2 low-energy spectra for the def2-SVP basis and a (32e, 38o) active space.(a) Plot of the TPSCI variational energy as a function of the computed PT2 correction.Solid line is the linear fit.Units in milliHartree.M =100.(b) Clustering of the 38 orbital active space.(c) Molecular structure

FIG. 6 . 3 2
FIG. 6. Bare and Bloch Effective Hamiltonians for Cr2(III) complex.(32e, 38o) active space.Units of J in cm −1 .(a) Hamiltonian matrix in the basis of Ms = 0 tensor products of local states where all the bridging ligands are in singlet states, and the Cr centers are in S = 3 2 , Ms = ± 3 2 (b) Energy spectrum of bare Hamiltonian in local S = 3 2 basis.(c) Energy spectrum of extrapolated TPSCI results.(e) Bloch Effective Hamiltonian obtained by projecting TPSCI wavefunctions onto local S = 3 2 basis.

TABLE I .
TPSCI Effective Hamiltonian to show coupling strengths between S1 and 1 (TT) in meV.

TABLE II .
Tetracene tetramer local excitation strengths.Columns correspond to the expectation values of the local cluster excitation operator, QI = 1 − |0I ⟩⟨0I | .A value of 0 indicates the cluster is in its local ground state.A value of 1 indicates that the cluster is always in a local excited state.Summing all the local excitation rank of the state.

TABLE V .
Unique cluster correlations in hexabenzocoronene.NI is the number operator for cluster I. Ŝz I is the spin magnetization operator for cluster I. QI is the projector onto the orthogonal complement of the cMF ground state for cluster I. Cluster pair indices correspond to the labeling in Fig.7(a)with the description of interaction type in parenthesises.

TABLE S2 .
Bare Hamiltonian to show singlet coupling strengths between |S1⟩ and 1 TT in meV.

TABLE S4 .
Average value of particle number, N , spin projection, Ŝz , Q, Ŝ2 for each cluster in cc-pVDZ basis.