Alan E. Raska,
Lee Huntington
a,
SungYeon Kim
a,
David Walker
a,
Andrew Wildman
a,
Rodrigo Wang
a,
Nicole Hazel
a,
Alan Judib,
James T. Pegg
a,
Punit K. Jha
a,
Zara Mayimforb,
Carl Dukatzc,
Hassan Naseric,
Ilan Gleiserd,
Maxime R. Huguesd,
Paul M. Zimmerman
ae,
Arman Zaribafiyan
a,
Rudi Plesch
*a and
Takeshi Yamazaki
*a
aSandboxAQ, 780 High Street, Palo Alto, CA 94301, USA. E-mail: rudi.plesch@sandboxaq.com; takeshi.yamazaki@sandboxaq.com
bGood Chemistry Company, 200-1285 West Pender Street, Vancouver, BC V6E 4B1, Canada
cAccenture, 1 Grand Canal Quay, Grand Canal Dock, Dublin, D02 P820, Ireland
dAmazon Web Services, 410 Terry Ave N, Seattle, WA 98109, USA
eDepartment of Chemistry, University of Michigan, 930 N. University Ave, Ann Arbor, 48109, USA
First published on 21st August 2025
A tractable approach to solving the exact many-body electronic wavefunction has long remained an elusive goal in quantum chemistry. Accurate computation of the electronic structure and related properties of molecules would unlock a trove of precise information, but the exponential scaling of exact methods has impeded this possibility. In this work, we report that the combination of an incremental expansion of electronic correlation with effective utilization of modern cloud compute environments can overcome the long-standing barrier of intractability. As a demonstration of viability, we investigate the bond breaking of per- and polyfluoroalkyl substances (PFAS), a class of strongly correlated systems with pressing environmental relevance. Using the incremental Full Configuration Interaction (iFCI) method, we decompose the many-body wavefunction into independently computable units and distribute them across a record-breaking one million simultaneous cloud vCPUs. Calculations for perfluorooctanoic acid, the largest PFAS studied, involve correlating 150 electrons in 330 orbitals, corresponding to a wavefunction dimension of ∼10151 configurations and producing the most accurate correlation energies and electron densities to date. iFCI reveals an electron localization transition in the electronic state during bond dissociation, which standard quantum chemical methods (e.g., DFT) fail to capture due to insufficient treatment of correlation. The union of a highly parallel cloud-enabled infrastructure with a polynomial-scaling method that treats static and dynamical correlation on equal footing lays the foundation for further development of novel protocols for the degradation of PFAS, alongside potential for characterization and screening of advanced molecules and materials in a wide range of chemistries.
Per- and polyfluoroalkyl substances (PFAS) are one such example of relevant chemical systems, persistent pollutants that have been dubbed “forever chemicals” because of their resilience and accumulation in humans and the environment.5–7 One of the most common PFAS, perfluorooctanoic acid (PFOA), would require approximately 10151 electron configurations to capture the exact energy with full configuration interaction (FCI). This benchmark-level many-body wavefunction would correlate all 150 valence electrons in 330 orbitals (assuming a polarized double-zeta basis), while the largest fully variational CI to date has considered a maximum of ∼1015 determinants.8,9
Typical approximations with reduced computational costs lose the ability to reliably treat electron correlation, or introduce significant uncontrolled approximations (i.e., non-black-box input).10 For example, conventional single-reference methods, such as coupled-cluster theory,11 are not expected to properly characterize bond breaking. Non-black-box, multireference (MR) approaches can correctly break bonds, but with quantitative results that depend on user-selected inputs. In the case of PFAS, simulating bond dissociation is further challenged by their complex electronic structure: non-dynamical correlation arises when a single-reference wavefunction fails during bond breaking, and dynamical correlation remains significant due to non-perturbative interactions among breaking bonds, lone pairs, and virtual orbitals. Accurately capturing these correlations requires simultaneous treatment of many electron configurations – involving 18 electrons per CF2 unit – beyond what is amenable to CC or MR approaches. Furthermore, fluorine's electronegativity produces strong C–F bonds, setting PFAS apart from more-commonly studied systems, and necessitating a thorough investigation of the dissociation manifold. In recent years, progress has been made in inventing new chemistries to break down PFAS,12–14 showing that they are not truly immortal chemicals, and motivating the development of computational tools that can guide such efforts. In order to address these problems, a method that is both scalable and highly accurate must be identified.
The incremental full configuration interaction (iFCI) method is a novel approach to tackle large-scale electronic structure problems with high accuracy.15,16 For example, iFCI has previously been shown to generate smooth dissociation profiles for both single- and multi-bond breaking scenarios and can also be used to target excited states in transition metal complexes.17–19 iFCI uses a many-body decomposition of the energy of a molecular system within a localized orbital basis to reproduce the FCI limit of energies and electron densities. This decomposition reduces the problem from exponential to polynomial scaling, in principle enabling simulation of much larger systems, and requires no specialized input such as the choice of an active space. Even with the polynomial cost of iFCI, however, a system such as PFOA would require order one million (106) many-body terms with an average of 108 dimensions per term. PFOA is therefore intractable to study with iFCI in its original implementation.15–17 Here, we will show that a combination of highly parallel computation and correlation screening can dramatically reduce the costs and runtime of iFCI computations on larger systems. Within one day of walltime and distributed across over 1 million cloud vCPUs, the new iFCI algorithm approaches the FCI limit for PFOA (10151 dimensions).
In this work, we describe a massively parallel implementation of iFCI that can simulate the wavefunctions of molecules the size of PFAS within realistic time frames. Three representative PFAS – trifluoroacetic acid (TFA), perfluorobutanoic acid (PFBA), and perfluorooctanoic acid (PFOA) (Fig. 1) – were selected based on their environmental relevance and the difficulty of studying these systems with traditional electronic structure theory.20 In particular, describing reactions involving the C–F bond is important for understanding natural and induced PFAS degradation pathways, as well as designing potential catalysts for PFAS remediation. This is the first quantum chemistry calculation to use the cloud to scale an accurate many-body wavefunction to systems of this size, opening up a wide scope of future possibilities for precise insight into strongly correlated molecules and materials.
![]() | ||
Fig. 1 PFAS species under consideration in this work. Color scheme: green (fluorine), red (oxygen), gray (carbon), white (hydrogen). |
Despite the relative compactness of the iFCI wavefunction compared to the astronomically sized FCI wavefunction, an efficient parallel algorithm is required to scale iFCI up to compute the large PFAS molecules of this work. To enable this study, we developed a massively parallel implementation of iFCI within QEMIST Cloud (https://www.sandboxaq.com/solutions/quantum-simulation). QEMIST Cloud provides accurate methods to predict the electronic structure and properties of chemical systems via traditional ab initio quantum chemistry and machine learning methods, and now includes the iFCI method for analysis of strongly correlated systems. For this work, our software was implemented using Amazon Web Services (AWS), using Amazon Elastic Compute Cloud (Amazon EC2) Instances with Intel Xeon Scalable processors. Further details are available in the Computational methods section of this article. This new implementation allows iFCI to be performed on an almost arbitrary number of processors, up to and beyond the 1 million vCPUs employed in this work.
Multiconfigurational methods like MR-CI or CASSCF offer more flexibility for describing non-dynamical correlation, but they rely on predefined active spaces.26,27 In systems like PFAS, where bond breaking induces collective correlation effects across many orbitals, any practical active space becomes either too small to be accurate or too large to be tractable. For example, a CASPT2(2,2) treatment might focus only on the breaking σCF bond, while neglecting the extended impact that the breaking bond may have throughout the molecule. Expanding the active space incrementally leads to an open-ended problem with no clear convergence, as new interactions continue to emerge with each addition.
In contrast, the iFCI approach includes all electrons and systematically incorporates correlation across the full virtual space, without requiring an arbitrary choice of the active space. This allows iFCI to capture the many-body, delocalized nature of PFAS bond dissociation with high fidelity. Enabled by a massively parallel implementation, iFCI offers a tractable yet fully correlated description of these chemically and computationally complex systems.
Direct experimental measurements of BDEs for the specific PFAS studied here are not available in the literature, which limits direct comparison. However, data from related systems offer useful reference points. Typical C–F bonds fall in the range of 110 to 116 kcal mol−1, with fluorocarbons reported to have a C–F bond strength of up to 127 kcal mol−1.28–31 These values are in good agreement with our iFCI results for all three PFAS and confirm that our results reflect expected bond strengths of comparable systems.
Among the PFAS tested, TFA expresses the largest BDE, while PFBA and PFOA have similar BDEs at all n. The similarity of BDE between the longer-chain PFAS indicates that additional CF2 units have relatively small effects on the dissociation of a distant C–F bond. This observation is reasonable given that the local geometry in these two molecules is similar for the chosen dissociated bond, and the electronic environment is therefore similar as well. However, a small effect of the chain length on the BDE arises at higher n, with PFOA's BDE at n = 4 being almost 1 kcal mol−1 above PFBA's. This may be evidence of a slight stabilization of the radical within longer-chain PFAS.
The BDE of various functionals using unrestricted DFT are compared to iFCI in Fig. 2. Overall, DFT agrees with the general trend of the longer-chain PFAS possessing lower BDEs, but all functionals consistently overestimate the BDE to varying degrees compared to iFCI. The lower BDEs of longer-chain PFAS is consistent with experimental studies, where longer-chain PFAS have been found to exhibit faster decomposition rates.32–34 However, the difference between the BDE of short- versus long-chain PFAS estimated by DFT is much smaller than iFCI, probably due to the semi-local treatment of exchange-correlation in DFT, which does not treat long-range correlation effects. In a similar fashion, the correlation retrieved in iFCI n > 2 results in a slight widening of the difference between PFBA and PFOA, while DFT resembles n = 2 with PFBA and PFOA having nearly the same BDE.
The overestimation of the BDE by DFT can be contextualized in terms of half-life using the Arrhenius equation,35–37 under the simplifying assumption that the BDE and activation energy are equivalent. For example, the 10.6 kcal mol−1 difference between DFT (ωB97x-D) and iFCI (n = 4) for PFOA equates to a predicted half-life of bond breaking to be 108.5 times longer for DFT compared to iFCI (at 273 K). The difference in relative strength of the C–F bond and overall overestimation of the BDE by DFT, along with the varying range of BDE across functionals, may impact the study and design of PFAS remediation strategies.
TFA's iFCI potential energy curve for C–F dissociation shows physically correct behavior for a stretched bond for all n (Fig. 2), a feature usually found only in multireference methods with careful selection of active space. The n = 3 and 4 curves agree well with CCSD(T) and UHF-CCSD(T) around the equilibrium region. Since the CCSD(T) curve becomes unphysical upon dissociation, a meaningful comparison of the BDE cannot be made. As reported in literature,10,24 the UHF-CCSD(T) method provides marked improvement over RHF-CCSD(T). However, UHF-CCSD(T) still exhibits a large difference compared with iFCI (and CCSD(T)) in the intermediate region as the bond begins to dissociate. The accuracy of the UHF-CCSD(T) energy improves as the bond is stretched further toward dissociation, but it was not possible to converge the UHF-CCSD equations – and therefore also not possible to obtain UHF-CCSD(T) energies – for geometries beyond RC–F = 3.0 Å. Additionally, we were not able to obtain UHF-CCSD(T) results at any geometry for PFOA due to the high computational demands of a system of this size. For PFBA and PFOA, iFCI calculations of the equilibrium and dissociated geometries were used to determine the BDE, though we did not compute the full potential energy profiles. These species are expected to express similar behavior in the dissociation region when compared to TFA (see Fig. 2). Comparing each method, iFCI n = 4, CCSD(T), and UHF-CCSD(T) exhibit similar energies at equilibrium (all within 0.2 kcal mol−1 of each other), but both CC methods deviate from iFCI by >1 kcal mol−1 at 1.7 Å and higher. Further comparison between iFCI, CCSD(T), and UHF-CCSD(T) can be found in the SI.
A metric to indicate the accuracy of the iFCI calculations is the non-parallelity error (NPE), which measures differences (maximum vs. minimum error) between potential energy curves. As shown in Table 1, the NPE systematically decreases with increasing order n of the n-body expansion, indicating good convergence of iFCI. The NPE between n = 3 and n = 4 is smaller than a few kcal mol−1, suggesting the 4-body expansion approaches the FCI result. For TFA, the maximum and minimum errors were located near the equilibrium and dissociated geometries, and therefore the NPE for PFBA and PFOA may be estimated given the error at their respective equilibrium and dissociated geometries. The longer-chain PFAS also follow a similar trend to that of TFA with increasing n, indicating a comparable convergence of iFCI for the larger species.
The NPEs of other methods when compared to the FCI solution are typically on the order of tens of kcal mol−1, with methods using a restricted reference experiencing large errors at dissociation.10,24,38 For example, the NPE of CCSD(T) is 95 kcal mol−1 compared to iFCI n = 4 (see Table 1) due to its failure to produce a qualitatively correct electronic state at dissociation. Restricted DFT expresses similar problems. Unrestricted methods handle dissociation more accurately – UHF-CCSD(T) and UHF-DFT demonstrate NPEs of <20 kcal mol−1 – but are nowhere near chemical accuracy. The maximum errors in UHF-CCSD(T) arise from the unphysical behavior in the intermediate region, and the maxima for UHF-DFT are in the dissociation region (see SI, Fig. S2). Only iFCI treats all regions on close to equal footing.
Each n-body term is uniquely identified by a combination of occupied orbitals (X), and the correlation energy for each term (ϵX in eqn (1)) is a measure of the interaction of electrons within these orbitals. The localized orbital basis allows for a direct comparison between terms of the equilibrium and dissociated geometries. For example, the correlation from a 1-body term defined by the C–F σ bond on the α-carbon can be represented as ϵequi(σCF). Although dissociation of this bond changes its appearance, the orbital retains the same σCF character, and therefore the contribution to the BDE from this bond can be calculated through simple subtraction, ϵdiss(σCF) − ϵequi(σCF).18,19
Orbital interaction analysis was performed by grouping the localized orbitals of PFOA into general characterizations, or classes, with chemically relevant features: σCF, σCC, and lpF (orbitals present on the carboxyl group are excluded from this analysis for brevity, as they do not provide a substantial difference to the total BDE). Contributions to the bond dissociation energy of PFOA from these orbital classes are shown in Fig. 3, where each class (or combination of classes for n > 1) represents the summed contribution to the BDE for all terms that fit that description. For example, σCF represents the contribution from 1-body terms that correspond to all σ bonds between C and F. Likewise, lpF/σCF represents the contribution from all 2-body terms that correspond to the interactions between one of the lone pairs of a fluorine atom and a C–F σ bond.
As expected, the σCF class shows the most significant contribution to the BDE by far (−197 kcal mol−1), a natural result of the dissociating C–F bond. However, in n-body terms with combinations of σCF and other orbitals, the BDE is increased by some and decreased by others. For example, the sum of lpF/σCF 2-body interactions widens the BDE by ∼9 kcal mol−1, while the sum of lpF/σCF/σCF narrows the BDE by ∼2 kcal mol−1. Overall, the BDE is determined by a subtle balance between orbital interactions, where the presence of notable contributions to the BDE (>1 kcal mol−1) at n > 2 indicates that highly-excited configurations – which are not present in other common computational methods, such as CCSD(T) – significantly influence the electronic state. Further analysis of orbital-specific contributions from the many-body expansion may provide future insights in rational design of PFAS remediation processes, such as identifying relative strengths of bonds to calculate the energetics of PFAS degradation.22,39,40
According to valence bond theory, a single covalent bond comes as a pair of one bonding orbital with a corresponding anti-bonding orbital. The NOs of TFA and PFOA's equilibrium geometry fit this description for iFCI at n = 1 (Fig. 4), i.e., as a σ/σ* pair. At n > 1, correlation between this bond and all other bonds causes the bonding picture to become more delocalized, representing the close coupling between many pairs of the n = 1 localized orbitals. At least at the equilibrium geometry, individual and distinct σCF bonds are representative of simple theories (like the GVB-PP reference state) that are relatively uncorrelated. iFCI recovers the correct picture of orbitals with mixed, noninteger occupancy of delocalized orbitals across the molecule.
The electronic state for a molecule with a dissociated bond, i.e., a diradical, has a combination of the delocalized, weakly correlated states and localized, strongly correlated portion. Computational methods such as DFT have difficulty spanning the two regimes, as they prefer to have the same level of delocalization regardless of electronic state (see SI, Fig. S3). The iFCI n = 1 NOs also do not show the correct behavior (two localized 1e− states) because they are not correlated with electrons outside of the σ/σ* pair, and their portrayal is therefore a holdover from the reference orbitals. However, once correlation across all orbitals is introduced at n > 1, the NOs spanning the dissociated bond become localized with one unpaired electron on each fragment, which is the qualitatively correct description of the correlated state. (Their occupation numbers deviate slightly from 1 due to weak correlation with neighboring lone pairs and bonds.) As shown in Fig. 4, the covalently bonded orbitals of the dissociated PFAS fragment remain delocalized at n = 2, as expected. In all n = 2 NOs, iFCI provides the correct description of the two regimes – and together with the smooth potential energy profile of Fig. 2 – a smooth transition from one distribution of the electronic localization to the other. (Note, orbital spaces are further correlated at n > 2 but the degree of delocalization remains constant at higher n, and we therefore limit our discussion to just n = 1 and 2.)
These calculations constitute the first quantum chemical method to make effective use of the incredible scaling capabilities of the cloud, and are the most accurate electronic structure calculations on PFAS species to date. Calculations of this kind lay the foundation for further development of novel protocols for the degradation of PFAS, with the potential next step being the integration of this method with other pivotal methodologies in the study of PFAS bond-breaking chemistry (e.g., reaction path prediction, machine learning based screening, surface reactions, solvation, etc.).
With the emerging availability of scalable iFCI simulations, it will be possible to develop highly accurate benchmark datasets that include weakly and strongly correlated molecular systems. These datasets will provide valuable training data for the development of universal machine-learning force fields41–44 that could cover a wide range of molecular systems and chemical behaviors that conventional methods are not able to accurately describe. This difference is expected to be dramatic, as most ML force fields to date are trained on DFT data. In the long term, the availability of this high-quality training data will mean the fast and accurate force fields for strongly correlated systems could become commonplace, exponentially increasing the impact of the output from iFCI simulations.
![]() | (1) |
ϵi = Ec(i) | (2) |
ϵij = Ec(ij) − ϵi − ϵj | (3) |
ϵijk = Ec(ijk) − ϵij − ϵik − ϵjk − ϵi − ϵj − ϵk | (4) |
ϵijkl = Ec(ijkl) − ϵijk − ϵijl − ϵikl − ϵjkl − ϵij − ϵik − ϵil − ϵjk − ϵjl − ϵkl − ϵi − ϵj − ϵk − ϵl. | (5) |
Here, Ec(X) are the correlation energies obtained from the space of X occupied orbitals correlated with the full set of virtual orbitals. For example, Ec(ij) indicates solving CISDTQ in a space of i, j, and all virtual orbitals. By extension, four-body incremental energies provide correlations for up to four occupied orbitals (eight electrons) at a time. This notation is simplified throughout this work with n representing the maximum number of occupied orbitals that are correlated in a computation. For example, n = 3 refers to an energy that includes the summation of all combinations of three occupied orbitals that are correlated with the full set of virtual orbitals (∑iϵi + ∑i>jϵij + ∑i>j>kϵijk). We also discuss contributions from individual orbitals, combinations of orbitals, or classes of orbitals (e.g., ϵijk), and refer to them as n-body terms.
In order to make calculations tractable for larger systems, the incremental correlation energies Ec(i), Ec(ij), Ec(ijk),… are computed using the Heat-Bath Configuration Interaction (HBCI) approach,45 which is an efficient selected configuration interaction method that constitutes as an approximation to full configuration interaction. In this method, inspired by heat bath sampling, a number of the most important determinants are selected for a variational configuration interaction treatment. The determinant space is then expanded and second-order Epstein–Nesbet perturbation theory is performed to correct for the missing correlation effects. Two parameters are used to control the selection of determinants for the variational and perturbative calculations, ε1 and ε2 respectively. For the calculations reported herein, we used values of ε1 = 1 mHa and ε2 = 1 μHa.
To further increase the efficiency of calculations without sacrificing accuracy, we make use of Summation Natural Orbitals (SNO),17 which facilitate a selection of the virtual orbitals for each n-body term and discard a number of them that have little effect on the correlation energy. Namely, for each 1-body term, the virtual space is spanned by natural orbitals of the exact CISD 1-body RDM (i.e., CISD is exact for 2-electron systems such as a 1-body term) and the natural orbitals for which the occupation numbers fall below a threshold 10−η are discarded. For higher n, a density is formed as the sum of the one-particle CISD density matrices for each occupied orbital included in the given n-body term. For example, a 3-body term would form a density as the sum of the 1-body densities with the same set of occupied orbitals in the form
ρijk = ρi + ρj + ρk. | (6) |
Then in analogous fashion to the 1-body case, the virtual space for the 3-body term is spanned by natural orbitals of this SNO density and the natural orbitals with occupation numbers falling below a threshold 10−η are discarded. The final HBCI calculation is then performed in the truncated virtual space. Herein, we used η = 8.5 as a middle ground of accuracy and efficiency, as 9.5 did not show significantly different energies for TFA (e.g., correlation retrieved at n = 3 differed by <3 mHa between η = 8.5 and 9.5).
An additional way to reduce the computational cost of calculations without having a significant effect on accuracy is to screen certain higher order terms entering the incremental expansion of eqn (1). As the size of the system increases, the number of terms entering eqn (1) at n ≤ 3 starts to become large and many of the incremental energies have negligible contributions to the total correlation energy, especially as n increases. Hence, we can use the connectivity arguments discussed in detail in ref. 18 and 19 to discard the less important contributions to the incremental expansion for n ≥ 3. We chose a value of 10−4.6 Ha for the screening parameter (i.e., the connected n-body screening procedure is applied to n = 3 and 4 with a value of ) as a good middle-ground because correlation remains to be captured at looser cutoffs, while tighter cutoffs introduce significantly more terms with negligible change to the total energy. Further discussions on this connected n-body (CnB) truncation procedure are provided in the SI.
Our implementation of iFCI makes use of a zeroth order reference wavefunction from Generalized Valence Bond, Perfect Pairing (GVB-PP) theory.46–48 This constitutes one of the simplest multi-configurational wavefunctions to include static (non-dynamical) correlation and has been shown to be a good reference wavefunction for iFCI calculations on strongly correlated systems15 and transition metal complexes.18 As mentioned in ref. 15, GVB-PP provides an excellent localized basis for the incremental expansion of eqn (1), and since the PP ansatz includes correlation beyond Hartree–Fock (i.e., EGVB-PP < EHF), it also provides a better starting point for the incremental expansion in iFCI.
In our initial GVB-PP calculations on TFA, we found it difficult to converge to the same state for different geometries at larger C–F separation due to a discontinuity in the potential energy surface. This was mostly due to the presence of many degenerate electronic states in this region, evidenced by a change in orientation of the bonding and lone-pair orbitals on the dissociating fluorine as the bond is stretched. Therefore, we introduced a small perturbation in the form of a point charge (charge of −1 a.u.) along the dissociating C–F bond direction at a constant distance of 5 Å from the F atom, which successfully corrected the orientation of the lone-pair orbital, provided a smooth potential energy surface across the full range of geometries, and changed the GVB-PP energy by less than a few mHa. The perturbation was removed in the calculation of the reference energy for the iFCI expansion – the energy of a single determinant computed in the basis of the converged GVB-PP MOs – and the one-electron operator used in the calculation of the single determinantal energy was unperturbed, though the converged GVB-PP MOs used for this calculation were still perturbed to retain the proper orientation of the dissociating fluorine orbitals. Similarly, the perturbation was not included in the final iFCI calculation, even though the GVB-PP orbitals used in the calculation were perturbed to retain the proper orientation of orbitals of the dissociating fluorine atom. We confirmed that this procedure had negligible effect on the final iFCI (n = 3 and 4) energy for the equilibrium region of the potential energy surface. This is not surprising, as for a given starting set of localized MOs (e.g., GVB, localized HF MOs, etc.), the iFCI expansion will converge to the HBCI result as n is increased. This procedure was utilized for all three molecules considered in this work.
Before describing the technologies used to implement this software architecture at scale, we will first introduce some terminology. In modern cloud computing environments, physical hardware is transformed into “virtual machines”, where a vCPU (virtual central processing unit) is a software-defined representation of a physical CPU core. An availability zone is a set of one or more data centers in a particular region. Nodes and instances refer to discrete allocations of hardware resources, and workers refer to processes that use some or all of these resources. A cluster is a collection of nodes, and a spot instance is some spare capacity on AWS that is available at up to a 90% discount compared to on-demand prices, but may be reclaimed by AWS before the calculation is finished (with a warning issued two minutes prior to reclamation).49 The total cost of running iFCI depends on the number of n-body terms in the calculation and computational complexity of each term, as dictated by the number of virtual orbitals and the HBCI convergence threshold (more details are in the Methods section). As described in the last paragraph of this section, it is beneficial to optimize the usage between the on-demand and spot instances. Individual compute tasks were performed with standardized units of software, referred to as containers. Kubernetes performed distribution and execution of the compute tasks and containers.50 Kubernetes was deployed using the AWS managed service, Amazon Elastic Kubernetes Service.
The architecture supporting the iFCI implementation is structured around the independence and natural parallelism of the subproblems generated by the method of increments.17 In iFCI, the n-body correlation energy is composed of a summation of independent calculations, each correlating unique sets of n occupied orbitals. Due to the natural independence of each calculation and very low communication between calculations, traditional message passing interface parallelism is not required for these calculations. Instead, when our platform receives an iFCI calculation request, a dedicated decomposition worker is assigned and generates all required subproblems for the current n. The decomposition worker then submits subproblems to solver queues, where solver workers pick up and complete each subproblem, writing their results into a central database once finished. The number of solver workers scales dynamically so that larger calculations have more subproblems being solved simultaneously. Once all subproblems for a given n are completed, the decomposition worker sums the total n-body correlation energy and moves to the next subsequent n until the desired highest order of n is computed (usually not higher than n = 4). Each n is completed sequentially before moving on to n + 1 to reduce the computational cost via a screening procedure (see the Methods section).
To reach the scale required for the largest PFAS in this study (1 million simultaneous vCPUs, or 62500 workers with 16 vCPUs each), this infrastructure required numerous enhancements. In general, these improvements fell into three categories: more efficient node recruitment, enhanced database connections, and widening of the network to accommodate a large number of nodes. A summary of the final architecture is given in Fig. 5.
In order to reach a given scale, enough nodes must be available. While cloud infrastructure providers generally have more than enough to reach the scale aimed for in this work, no one availability zone is guaranteed to have the required amount. To ensure adequate node availability, the solver queues and solver workers were moved from a single cluster into multiple clusters, which distributed the work of provisioning nodes and enabled node recruitment on a larger scale. All clusters were allowed to provision nodes from multiple availability zones on the infrastructure provider to ensure a deep pool of spot instances. Distributing across availability zones does not degrade the performance of the algorithm because there is low communication between nodes during the calculation. While this multi-cluster design allows for recruitment on a larger scale, it adds complexity to the queuing and monitoring systems, where each cluster's queue must be periodically re-balanced so that all clusters remain active.
While the scale of node recruitment was addressed by the previous modifications, the speed of recruitment also needed to be improved in order to reach large scale queuing in a short time. For this, the Karpenter open-source autoscaler (https://karpenter.sh) was used in lieu of the standard Kubernetes cluster autoscaler. Karpenter removes the need for an autoscaling group, allowing for node provisioning and removal to occur at a rapid rate. An additional benefit of changing to Karpenter was the ease of scheduling multiple workers on a variety of nodes, which improved spot instance availability across all clusters due to diversification of instance type and size.
In order to efficiently store the outputs from a massive number of worker nodes, Amazon Aurora was used in a “serverless” fashion with a highly scalable proxy in between (RDS proxy) that allows pooling and sharing of the database connections between nodes. This allowed us to handle large numbers of simultaneous connections (e.g., when 62500 or more workers are active). We optimized communication between nodes and the database by decreasing their size and frequency, preventing data transfer from becoming a hindrance to computation. Additionally, leveraging 1 million simultaneous vCPUs required a widened IP address range, achieved by deploying IPv6 networking for all components, which is novel for some of the components used.
All subproblems of order n > 1 were run on spot instances and were occasionally evicted prior to completing. We therefore implemented a mechanism to automatically re-queue interrupted subproblems, where each is maintained by a decomposition worker that reschedules them in the event that a compute node stops before completing.
The supplementary information contains details of the CnB truncation procedure, total number of n-body terms computed, tables of bond dissociation energies, analysis of DFT and CC results, convergence of iFCI, computational performance on one million vCPUs, total energies, and molecular geometries. See DOI: https://doi.org/10.1039/d5sc03019f.
This journal is © The Royal Society of Chemistry 2025 |