Formally exact simulations of mesoscale exciton dynamics in molecular materials†

Excited state carriers, such as excitons, can diffuse on the 100 nm to micron length scale in molecular materials but only delocalize over short length scales due to coupling between electronic and vibrational degrees-of-freedom. Here, we leverage the locality of excitons to adaptively solve the hierarchy of pure states equations (HOPS). We demonstrate that our adaptive HOPS (adHOPS) methodology provides a formally exact and size-invariant (i.e., ) scaling algorithm for simulating mesoscale quantum dynamics. Finally, we provide proof-of-principle calculations for exciton diffusion on linear chains containing up to 1000 molecules.


Introduction
New molecular materials, particularly organic semiconductors, offer remarkable and tunable functionality for photonic, optoelectronic, and light harvesting applications. The photophysical properties of molecular materials arise from diffusion of excited-state carriers (e.g., electronic excitations, called 'excitons') across the 10 nm to 1 mm length scale. These mesoscale exciton dynamics are sensitive to both the molecular properties of the material building blocks and structural heterogeneities, which include everything from point defects to grain boundaries. Traditional bulk spectroscopies provide only indirect evidence for the essential role of structural heterogeneity in exciton transport. The recent development of spatiallyresolved non-linear spectroscopy provides a remarkable new lens by which to study exciton dynamics in heterogeneous materials. 1,2 Interpreting spatially-resolved spectroscopic signals, however, remains challenging due to the absence of corresponding simulations.
Simulating exciton transport dynamics in heterogeneous materials on the 10 nm to 1 mm length scale (i.e., the mesoscale) remains an outstanding theoretical challenge. Organic semiconductors oen combine close intermolecular packing with correspondingly large coupling between electronic states (V) on adjacent molecules and large intramolecular electronvibrational coupling (l). 3 Perturbative equations-of-motion, such as Förster theory, can be convenient for simulating large aggregates, but are not applicable when V and l are comparable in magnitude. Similarly, in the absence of a clear separation of timescales between vibrational and electronic degrees-offreedom, Markovian equations-of-motion, such as Redeld theory, 4 struggle to capture the rich dynamics of excitation transport. There are a variety of non-perturbative, non-Markovian equations-of-motion, such as multi-layer multi-conguration time-dependent Hartree (ML-MCTDH), 5,6 timeevolving density operator with orthogonal polynomials (TEDOPA), 7 hierarchically-coupled equations-of-motion (HEOM), 8 and quasi-adiabatic path integrals (QUAPI). 9 All of these techniques, however, share an exponential scaling of computational complexity with the number of molecules. While efficient and parallelized implementations of formally exact methods have been developedfor example, distributed memory HEOM 10,11the exponential scaling severely limits even high-performance simulations of molecular aggregates.
Recently, there have been a few notable developments towards highly-scalable equations-of-motion for exciton dynamics. Modular path integrals 12,13 provide a dramatic reduction in computational cost of QUAPI, but retain an overall linear scaling with the number of molecules and are most efficient when molecules exhibit only nearest-neighbor coupling. Dissipation-assisted matrix product factorization (DAMPF) 14 extends TEDOPA to efficiently describe large numbers of vibrational degrees-of-freedom (>10) on each molecule, but it maintains between a quadratic and cubic scaling with the number of molecules. For both modular path integrals and DAMPF, the residual scaling makes it challenging to apply these methods to mesoscale calculations containing thousands to millions of molecules. Indeed, any density matrix approach will suffer from residual scaling with system size at long times due to the spread of ensemble population density across molecules.
Stochastic simulations, which decompose the ensemble into a collection of excited trajectories, can enable calculations on arbitrarily large molecular aggregates, even at long time.
Delocalized kinetic Monte Carlo 15 and the kinetic Monte Carlo version of generalized Förster theory 16 are stochastic approaches that calculate the rate of transport between clusters of strongly interacting molecules and can be readily extended to mesoscale calculations. Both of these methods, however, use a perturbative approximation to partition state space and calculate rates between adjacent spatial regions. The development of a non-perturbative, non-Markovian approach for mesoscale simulations would provide an important benchmark for new equations-of-motion and could offer insight into processes with debated mechanisms, such as charge separation in organic photovoltaic materials. 15,[17][18][19] Here, we present a non-perturbative, non-Markovian, and arbitrarily scalable stochastic method for simulating exciton transport. First, we introduce the Hamiltonian considered and our base equation-of-motion, the hierarchy of pure states (HOPS). 20 Next, we discuss locality in HOPS calculations and present an algorithm for constructing an adaptive basis. Finally, we present proof-of-concept calculations using the adaptive HOPS (adHOPS) equation-of-motion that demonstrate both its accuracy and size-invariant (i.e., Oð1Þ) scaling for large molecular aggregates.

Hamiltonian
We divide the exciton Hamiltonian into three partŝ whereĤ S ¼ P n jniE n hnj þ P nsm jniV n;m hmj describes the electronic system and H B ¼ P n;q ħu qn Àâ † qnâ qn þ 1=2 Á represents the thermal environment arising from molecular vibrations. The inuence of coupling between the electronic system and vibrational 'bath' can be described in terms of the systembath coupling operators ðL n Þ and the two-point correlation functions a n ðtÞ ¼ ħ p where T is the temperature and J n ðuÞ ¼ p P qn k qn 2 dðu À u qn Þ is the spectral density. In the following, we assume that each pigment has an independent thermal environment that drives uctuations in excitation energy. In other words, we assume that the system-bath coupling operator is a site-projection operator ÀL n ¼ jnihnj Á . We describe the thermal environment of each pigment by a Drude-Lorentz spectral density J n ðuÞ ¼ 2l n uðg n =ħÞ which, at high temperature (g n /k B T < 1), allows for a convenient exponential decomposition of the correlation function a n (t) ¼ g n e Àgnt/ħ (4) where g n ¼ 2l n k B T À il n g n . In the following we use l n ¼ g n ¼ 50 cm À1 , V ¼ 25-250 cm À1 and T ¼ 295 K, which are comparable to the parameters used for many simulations of photosynthetic pigment protein complexes and fall into the broad intermediate regime where perturbative approximations break down. 21 Hierarchy of pure states (HOPS) The non-Markovian quantum state diffusion (NMQSD) equation 22 decomposes the time-evolution of the reduced density matrix for the system degrees-of-freedom into an ensemble average over stochastic pure states indexed by a complex stochastic processes z n,t where E½z n;t ¼ 0; E½z n;t z n;s ¼ 0 and; E½z * n;t z n;s ¼ a n ðt À sÞ. The equation-of-motion for the independent stochastic trajectories is The NMQSD equation is formally exact and is equivalent to solving Feynman path integrals with the Feynman-Vernon inuence functional, 22 but the functional derivative in the last term makes direct solution of the stochastic trajectories impractical except in special cases.
The hierarchy of pure states (HOPS) equations provide a numerically tractable version of NMQSD by rewriting the functional derivative as a set of coupled differential equations. 20 Briey, the sum of integrals over a functional derivative in the nal term of the NMQSD equation is dened as a sum of rst order auxiliary wave functions: giving ħv t j ð0Þ ðt; z n;t Þ where we have now introduced a vector label into the equations to index the different components. The physical wave function is given by j0 ð Þ ðt; z n;t Þ E . The rst order auxiliaries are indexed by unit vectors with non-zero index at their n th element ðẽ n Þ.
When the correlation function a n (t) is written as an exponential (or sum of exponentials), the time-evolution of the rst order auxiliary wave functions j ðẽnÞ ðt; z n;t Þ introduces the secondorder auxiliary wave functions ðẽ n þẽ m Þ, and so on, ad innitum. The resulting general expression, called the 'linear HOPS equation,' is ħv t j ðkÞ ðt; z n;t Þ where we have introduced a general vectork to index auxiliary wave functions,k½n is the n th element of the index vector, g / is the vector of correlation function exponents (g n ), and terms involving any auxiliary wave function with an indexing vector containing a negative element are always zero. The linear HOPS equation maintains the normalization of the system reduced density matrix within the ensemble average, but the physical wave function is not normalized in individual trajectories.
Instead, for long trajectories, most realizations have j ð0Þ /0 and an innitesimal subset have physical wave functions with diverging norms. 20 As a result, linear HOPS calculations exhibit slow convergence with respect to the size of the ensemble. We can improve convergence with ensemble size by using the non-linear HOPS equation, which describes the timeevolution of a normalizable stochastic wave function. We rewrite the reduced system density matrix (eqn (5)) in terms of a normalized wave function and the norm contribution where j ðt; z n;t Þ ¼ jðt; z n;t Þ =kjðt; z n;t Þk. The norm in the rst expression can be interpreted as a weighting factor for a new ensemble average. Using a Girsanov transform, we can solve for the corresponding equation-of-motion for jj(t;z n,t )i, 23 which gives the non-linear HOPS equation 20 ħv t j ðkÞ ðt; z n;t Þ is a memory term that causes a dri in the effective noise, and ðt; z n;t Þ j ð0Þ ðt; z n;t Þ E : We note that the non-linear HOPS equation ensures that the contribution of each wave function is normalized in the reduced density matrix, but it time-evolves the non-normalized physical wave function. In the following, we will drop the explicit z n,t dependence from the wave function for simplicity jk ð Þ ðt; z n;t Þ The HOPS equations are a numerically convenient, formally exact expression for exciton dynamics in small molecular aggregates. Moreover, the calculations are 'embarrassingly' (also called 'perfectly') parallel 24 due to the independence of individual trajectories, and, as a result, HOPS ensembles can be computed using thousands of CPUs simultaneously without loss of efficiency. The application of HOPS to large molecular aggregates, however, is limited by the scaling of the HOPS basis with the number of molecules. It is convenient to think of HOPS calculations as depending on two basis sets: the state basis ðSÞ and the auxiliary basis ðAÞ. The complete state basis is a nite set of vectors that span the Hilbert space of the system, while the complete auxiliary basis is composed of an innite set of auxiliary wave functions indexed by vectorsk. To construct a nite auxiliary basis, the innite hierarchy must be truncated. Here, we employ the common triangular truncation condition which limits the auxiliary basis to those wave functions with index vectors Àk Á that have a sum of elements less than a preselected bound k max &k˛A : . If we assume one independent environment per state, then the number of auxiliary wave functions included in the triangular truncation scales as N state þ k max k max which gives an overall O N kmax state scaling for large aggregates. While convergence as a function of k max is guaranteed, the requisite number of auxiliary wave functions is oen prohibitive.

Short-time correction and Markovian modes
The Drude-Lorentz correlation function given in eqn (4) has a discontinuity at t ¼ 0 arising from the symmetry condition a n ðtÞ ¼ a * n ðÀtÞ; and can be more completely written as 25 The discontinuity in the correlation function introduces a numerically inconvenient innitely high-frequency component to the stochastic noise trajectories z n,t .
We ameliorate this problem by redening the positive-time correlation function in terms of two continuous exponential functions a(t) ¼ a 0,n (t) + a mark,n (t) where a 0,n (t) ¼ g n e Àg n jtj/ħ (16) and a mark,n (t) ¼ ÀiIm[g n ]e Àg mark jtj/ħ .
The denition of a mark,n (t) ensures the imaginary component of the total correlation is 0 when t ¼ 0, and it also provides for a smooth transition back to the naive correlation function a 0,n (t) on a nite timescale given by ħ/g mark . Except where otherwise noted, g mark ¼ 500 cm À1 for all calculations presented in the main text, because this was sufficiently fast to ensure the Markovian timescale had no inuence on the calculated dynamics.
Due to the extremely rapid timescale on which a mark,n (t) decays, this mode is Markovian, and high-lying contributions to the hierarchy can be neglected. In the following, we will only include the rst order terms associated with these Markovian modes and neglect these terms in our discussion of the auxiliary wave functions forming the hierarchy. This can be viewed as a smoothing of the noise trajectories (z n,t ) on timescales fast compared to all other dynamics.
This problem can be avoided entirely by using a different spectral density which more naturally accounts for the shorttime imaginary component of the correlation function: for example, the recently reported alternative to the Drude-Lorentz oscillator with improved low-temperature behavior. 26 Adaptive HOPS (adHOPS) Within the quantum state diffusion formalism, stochastic wave functions localize in the presence of thermal environments. [27][28][29] At a single time point, the pure state can be interpreted as a system wave function conditioned on a measurement of all the environmental degrees-of-freedom. 30 The delocalization extent of exciton wave functions within the ensemble of pure states is suppressed by the dynamic localization induced by coupling to the vibrational degrees-of-freedom. 31,32 Previously, Markovian quantum state diffusion calculations have leveraged the locality of the exciton to reduce computational complexity using both a moving basis 27,33 and an adaptive basis. 34 Both of these approaches, however, require the conservation of probability, which is violated in the HOPS equations because amplitude in the auxiliary wave functions can be created and destroyed.
Here, we develop an adaptive solution to the HOPS equationof-motion that achieves size-invariant computational scaling (i.e., Oð1Þ scaling) for calculations of large molecular aggregates. We rst establish a normalized non-linear HOPS equation, which ensures that the magnitude of derivative terms does not diverge with increasing depth of the hierarchy. We then illustrate how locality appears within the hierarchy of auxiliary wave functions, with a particular emphasis on the connection between locality and the ux between neighboring auxiliary wave functions. Finally, we present an adaptive algorithm for the normalized non-linear HOPS equation that satises a userselected bound on the absolute derivative error.

Normalization of HOPS
To ensure that the magnitude of the derivative elements for auxiliary wave functions have a consistent absolute scale across the hierarchy, we: (1) enforce normalization of the physical wave function in the time-evolution equation and (2) redene the auxiliary wave function coefficients.
To enforce the normalization of the physical wave function, we rewrite the non-linear HOPS equation in terms of a normalized physical wave function. Starting with eqn (11), dividing all wave functions by the norm of the physical wave function, taking the derivative, and expanding terms gives where is the normalization correction factor. In the non-linear HOPS equation, the magnitude of the auxiliary wave functions grows with increasing auxiliary index. The basic HOPS terminator for a hierarchy with a single thermal environment j t can be derived from the integral form of the HOPS equation by considering the limit where the auxiliary damping is much faster than any system timescale (kg/ħ [ u sys ). 20 When this terminator is used for the rst-order auxiliaries the resulting HOPS equation is equivalent to the standard Markovian quantum state diffusion equation. Fig. 1a shows the norm of the rst three auxiliary wave functions for a single trajectory with a hierarchy consisting of one Drude-Lorentz oscillator.
The magnitude j t ðkÞ increases with increasing auxiliary index which, given g/g > 1, is consistent with the terminator condition. For a single mode hierarchy, we can ensure the norm of the auxiliary wave functions does not diverge by introducing a new k-dependent prefactor for each wave function (g/g) k , as shown in Fig. 1b. In the multimode case, we extend the denition of the prefactor to Q n ðg n =g n Þk ½n which ensures that the auxiliary wave functions that dene the edges of the hierarchy (only one non-zero mode) do not diverge with increasing hierarchy depth. Rewriting the non-linear HOPS equation to account for this additional prefactor leads to the normalized non-linear HOPS equation where ensures normalization of the physical wave function.

Locality of HOPS
To construct an adaptive approach to solving the HOPS equations, we must rst address the question: how and to what extent does the locality expected in the quantum state diffusion formalism appear in HOPS? Fig. 2 shows that in HOPS calculations localization in the physical wave function induces localization in the hierarchy. By 'localization in the hierarchy,' we specically refer to clustering of amplitude in a small set of auxiliary wave functions in a way that depends on the position of the excitation in the physical wave function. In Fig. 2a, an excitation begins on the middle site of a ve pigment chain and then jumps between site 3 and site 2. Fig. 2b plots the norm of auxiliary wave functions associated with site 2 and site 3; each plot is labeled by an auxiliary vector indexk. For example, [0,1,0,0,0] (Fig. 2b, rst column) is the rst order auxiliary wave function associated with the thermal environment on the second site. The occupation of the auxiliary wave functions (black lines, Fig. 2b) track with the population of the physical wave functioni.e., the auxiliary wave functions associated with the thermal environment of site 2 (rst column Fig. 2b) are only occupied when site 2 is occupied in the physical wave function (shaded region).
The locality in the HOPS hierarchy can be understood in terms of the balance of ux terms in the normalized non-linear HOPS equation. First, every auxiliary wave function is damped (rst line of eqn (21), Àk$ g ) and, therefore, has zero amplitude without a continuous source term. The fundamental source term is the physical wave function which is the lowest order of the hierarchy. The ux of amplitude towards higherlying auxiliary wave functions arises from the second line of eqn (21) k ½ng nLn j tk Àẽn ð Þ E . For system-bath coupling operators that are site projection operators ÀL n ¼ jnihnj Á , the ux towards higher-lying auxiliaries only arises when there is amplitude on the associated site of the lower auxiliary wave function. Moreover, the auxiliary wave functions are localized by the same dynamics that localize the physical wave function. As a result, the localized auxiliary wave functions only contribute amplitude to higher-lying auxiliary wave functions with an index that differs by þẽ n in a site (n) with non-zero amplitude. Thus, the locality of the physical wave function results in preferential population of specic auxiliary wave functions.

Adaptive algorithm
We have developed an adaptive algorithm for time-evolving the HOPS equations (adaptive HOPS, adHOPS) that leverages locality by constructing a reduced basis set at each time point. We establish the essential basis set elements at each time point (t) by ensuring that the error in the time-derivative introduced by the truncated auxiliary ðA t Þ and state ðS t Þ basis is below a given threshold (d). We dene the derivative error in terms of Euclidean distance between the true derivative vector and the effective derivative vector constructed using the adaptive basis. The key equations (given in the ESI †) provide an upper bound on the derivative error squared and are derived by considering all possible ux contributions in the normalized non-linear HOPS equation (eqn (21)), excluding higher order effects introduced through the normalization correction (G t ). Because auxiliary wave functions share only nearest neighbor connections Àk AEẽ n )k Á and the Hamiltonian for a molecular aggregate supports electronic couplings over a nite spatial extent, the adaptive basis can be constructed with Oð1Þ scaling. The result is a calculation where, in addition to a trajectory of the wave functions, we construct an adaptive basis-set trajectory fB 0 ; B dt ; . Àk ¼0 Á shows the populations of site 2 and 3 as green and blue lines, respectively. Parameters: V ¼ 10 cm À1 , l ¼ g ¼ 50 cm À1 , T ¼ 295 K, and k max ¼ 10.
Our adHOPS algorithm neither assumes nor imposes locality. Rather, the adaptive basis takes advantage of whatever locality arises during the simulation. If the full hierarchy is required to satisfy the derivative error bound, adHOPS smoothly reverts to a HOPS calculation. As a result, adHOPS remains formally exactthe adaptive basis represents a time-dependent truncation of hierarchy elements, and d, like k max , is a convergence parameter.
We note that the current adHOPS algorithm makes use of two approximations: rst, the spectral density is assumed to be over-damped (e.g., Drude-Lorentz), which allows for a consistent normalization of the hierarchy elements. Second, the system-bath coupling operator is assumed to be a siteprojection operator L n ¼ jnihnj ; in other words, we assume that each molecule has an independent vibrational environment.

Results and discussion
For a ve-site linear chain, adHOPS calculations converge rapidly with respect to the derivative error bound and require only a small fraction of the full HOPS basis. Fig. 3a shows the comparison between full (black line) and adaptive (green line) HOPS population dynamics of the initially excited pigment (site 3). For d ¼ 10 À1 , the adaptive basis set is so small that the calculation shows no excitation transport. Smaller values of d improve the description, and by d ¼ 10 À3 the mean error is less than 10 À2 . Fig. 3b shows the mean adaptive error as a function of d. In the grey region the adaptive error is smaller than the statistical error associated with the 10 4 trajectory ensemble. We measure the size of the auxiliary basis for a single trajectory by the average number of auxiliary wave functions required across time points. Fig. 3c plots the ensemble distribution of the auxiliary basis size as a function of d. For d ¼ 10 À3 , most adHOPS trajectories require 10 2 auxiliaries on average, or approximately 1% of the 9 Â 10 3 auxiliaries required for the corresponding HOPS calculation. Improving the accuracy of the calculation by decreasing d two orders of magnitude only requires about four times as many auxiliaries. The other kinds of error that arise in HOPS simulations, including statistical error from a nite number of trajectories and hierarchy error from the nite k max value, are reported in the ESI. † One persistent challenge for numerical implementations of formally exact methods is demonstrating the calculations are converged to the exact answer. In hierarchical methods, calculations must be converged with respect to the auxiliary basis which is dened in the triangular truncation condition by the maximum hierarchy level considered (k max ). In HOPS, the criterion for convergence is that k max g/ħ [ u s , where u s is the characteristic frequency of the system. 20 Because the full auxiliary basis scales as N pig þ k max k max , it is oen impractical to systematically check convergence for sufficiently large values of k max . Though our adHOPS method was inspired by localization, we nd that it naturally incorporates a dynamic ltering scheme that dramatically improves the scaling of the auxiliary basis with k max even when the exciton is fully delocalized. Fig. 4a compares the full (black) and adaptive (green) HOPS dynamics with increasing coupling (V). By V ¼ 5l the oscillations in the site 3 population report a wave function that is coherently oscillating across 5 sites. Fig. 4b shows the corresponding size of the auxiliary basis as a function of k max . In all cases the adaptive auxiliary basis (green line) increases much more slowly than the full auxiliary basis (black line).  Other parameters: l ¼ g ¼ 50 cm À1 , T ¼ 295 K, d ¼ 10 À3 , and N traj ¼ 10 4 . For V ¼ 250 cm À1 , g mark ¼ 1000 cm À1 , all others used g mark ¼ 500 cm À1 .
Another perpetual challenge for formally exact methods is their intractable computational scaling with the number of molecules. In HOPS calculations this arises from the scaling of the auxiliary basis. Fig. 5a compares the full (black line) size of the state (top) and auxiliary (bottom) basis to the average size of the adaptive basis (colored lines) as a function of the number of molecules in a linear chain. Compared to the full auxiliary basis, the size of the adaptive auxiliary basis scales favorably with respect to the number of molecules in the aggregate. Moreover, both the auxiliary and state bases in adHOPS calculations show a plateau beyond a threshold size of the linear chain (N pig > N*), indicating the onset of size invariant scaling. In the ESI, † we compare the CPU time required for full and adaptive HOPS calculations (V ¼ 50 cm À1 ). We nd that adaptive calculations are faster than full calculations starting around N pig ¼ 10, and we also demonstrate the onset of size invariance (i.e., Oð1Þ) scaling of CPU time for large aggregates. In other words, increasing the number of pigments beyond a threshold size does not increase the computational expense of an adHOPS calculation. Thus, for localized excitons, the size invariance of adHOPS allows for calculations on scales that were previously unachievable for formally exact methods.
Our adaptive HOPS algorithm offers a computationally tractable approach for formally exact calculations of mesoscale quantum dynamics. As a proof-of-concept, we demonstrate the ability to simulate exciton diffusion on a linear chain of 10 3 molecules within the formally exact framework of adHOPS (Fig. 5b). Exciton diffusion is a common experimental observable extracted from non-linear microscopies 2 but is challenging to simulate on long length scales. [35][36][37] Using adHOPS, simulating exciton diffusion in a linear chain of 10 3 pigments is computationally tractable because for V ¼ 100 cm À1 it requires, on average, less than 2 Â 10 3 auxiliary wave functions and 20 pigment states. The corresponding HOPS simulation would require an auxiliary basis containing more than 10 23 auxiliary wave functions.