Samuel M.
Blau
a,
Hetal D.
Patel
bc,
Evan Walter Clark
Spotte-Smith
bc,
Xiaowei
Xie
cd,
Shyam
Dwaraknath
c and
Kristin A.
Persson
*be
aEnergy Technologies Area, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
bDepartment of Materials Science and Engineering, University of California, Berkeley, CA 94720, USA
cMaterials Science Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
dCollege of Chemistry, University of California, Berkeley, CA 94720, USA
eMolecular Foundry, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA. E-mail: kapersson@lbl.gov
First published on 24th February 2021
Modeling reactivity with chemical reaction networks could yield fundamental mechanistic understanding that would expedite the development of processes and technologies for energy storage, medicine, catalysis, and more. Thus far, reaction networks have been limited in size by chemically inconsistent graph representations of multi-reactant reactions (e.g. A + B → C) that cannot enforce stoichiometric constraints, precluding the use of optimized shortest-path algorithms. Here, we report a chemically consistent graph architecture that overcomes these limitations using a novel multi-reactant representation and iterative cost-solving procedure. Our approach enables the identification of all low-cost pathways to desired products in massive reaction networks containing reactions of any stoichiometry, allowing for the investigation of vastly more complex systems than previously possible. Leveraging our architecture, we construct the first ever electrochemical reaction network from first-principles thermodynamic calculations to describe the formation of the Li-ion solid electrolyte interphase (SEI), which is critical for passivation of the negative electrode. Using this network comprised of nearly 6000 species and 4.5 million reactions, we interrogate the formation of a key SEI component, lithium ethylene dicarbonate. We automatically identify previously proposed mechanisms as well as multiple novel pathways containing counter-intuitive reactions that have not, to our knowledge, been reported in the literature. We envision that our framework and data-driven methodology will facilitate efforts to engineer the composition-related properties of the SEI – or of any complex chemical process – through selective control of reactivity.
Reaction networks have the capacity to interrogate competing reaction mechanisms in complex chemical systems on long timescales. By simplifying chemical space into a set of connected molecule nodes and reaction nodes,8 reaction networks abstract away the spatial interactions of three dimensional molecular structures while preserving a realistic description of the underlying interconnected chemical mechanisms. Reaction networks can be used to compare viable reaction pathways that transform a set of starting molecules into final products or to identify optimal pathways using pathfinding algorithms.9 The cost to traverse a given reaction can capture the relevant thermodynamics and/or kinetics, ensuring that the “shortest” pathway (the pathway with the lowest cost) is also the one most likely to occur. Such pathway analysis inherently accounts for long-time behavior without requiring an arduously long propagation as long as important rare reactions are present in the network. Substantial work has been conducted to apply reaction networks to a wide range of scientific applications, including organic chemistry,9–12 retrosynthesis,13–15 combustion,16–18 catalysis,19–21 and sugar formation,22,23 as well as metabolic7,24 and prebiotic chemistry.23 However, network scale has remained fairly small; no reaction network reported in the literature built from first principles has included more than 1000 species or more than 10000 reactions.8
The size of networks reported thus far has been limited in part due to the use of graph representations that cannot capture the chemical reality of multi-reactant reactions (e.g. A + B → C). Such representations prevent the use of optimized pathfinding algorithms. Instead, researchers employ custom approaches that can calculate path costs while respecting reaction stoichiometry.25 These algorithms suffer from performance and scaling limitations11,25 that have historically made it necessary to significantly restrict network size. As a result, reaction networks have not previously been applied to complex electrochemical or photochemical systems which may include thousands of species and millions of reactions.
The development of reaction networks to study electrochemical processes is both highly desirable and very challenging due to the extreme inherent complexity of such systems. For example, the exposure of an electrolyte to highly oxidative as well as reductive electrode materials necessitates the consideration of redox reactions coupled with irreversible chemical decomposition in a rapidly evolving local environment. These reactions form ions, fragments, and radicals, all of which may be more stable in solvent than they would be in the gas phase or in vacuum. Such factors mean that a much wider range of stable and metastable species must be included in the network than would be otherwise. As a further complication, ions and radicals are often much more reactive than neutral, closed-shell organic molecules,26 and their reaction mechanisms are not as well understood,27 precluding heuristic pruning and necessitating consideration of a vast number of reactions for each specie.
Because a reaction network capable of describing electrochemical processes must include massive quantities of species and reactions, the use of optimized pathfinding algorithms is essential. Thus, we need a graph representation that supports general network construction, including any reaction stoichiometry in a chemically consistent manner. Such an approach will ensure that all participating reaction pathways in general reaction networks of even massive size can be tractably identified.
In this article, we will first identify underlying problems in existing reaction network architectures that can potentially produce unphysical results and that prevent the use of optimized pathfinding algorithms. We then resolve this issue through the adoption of both a novel graph representation for reactions involving multiple reactant species and an iterative algorithm for the calculation of such reaction costs. To demonstrate the power of our chemically consistent graph architecture, we apply it to a network describing the electrochemical reaction cascade that forms the lithium-ion solid electrolyte interphase (SEI), a nanoscale layer that is largely responsible for battery health and capacity retention. Our approach facilitates analysis of the resulting massive reaction network containing nearly 6000 species and 4.5 million reactions, and we identify both previously accepted pathways and novel pathways to a key SEI component.
We will start by considering the graph representations of three fundamental reaction types with different stoichiometries (Fig. 1b). Type I reactions are of the form A → B; type II reactions are of the form A → B + C; and type III reactions are of the form A + B → C. Each reaction implicitly defines a change in the state of the reacting system that can be addressed by simple accounting, where a system's state is specified by the amount of each species present. As a result of a reaction, the reactants must be subtracted from the state, and the products must be added to the state. Summing the initial system state and the system state changes along a reaction path yields the final system state after the reactions in the path have taken place. We note that for CRNs, which do not consider the history of the system, the system state can only be determined by post hoc reconstruction.
A chemically intuitive graph representation of a reaction with multiple reactants (e.g. type III) fundamentally conflicts with an underlying assumption of graph theory. Graph theory assumes that cost alone governs connected traversal; if node A is connected to node C, then it is always possible to traverse the graph from A to C, provided the cost is paid. Meanwhile, chemically, a reaction must obey its stoichiometry (i.e. all reactants must be present for a reaction to occur), and there must be a non-negative amount of each species in the system. A system state with a negative quantity of a species is thus chemically inconsistent. In a type III reaction, connectivity allows traversal from one reactant node to the product node while ignoring the second reactant, contradicting chemical laws. Such contradiction can cause chemically inconsistent reaction paths to be selected during conventional pathfinding.
When only single-reactant reactions like type I and type II are used to construct a model network, an example path from node A to node G proceeds without issue (Fig. 1c). In contrast, when a type III reaction is included in the model network, an example path from node A to node G yields a negative quantity of species E in the final system state, violating chemical assumptions and making the path chemically inconsistent (Fig. 1d). This is a fundamental problem with any current multi-reactant reaction representation. To resolve this problem, it is necessary to know the cost to create the other participating reactant during pathfinding.
Current multi-reactant representations significantly limit the utility of CRNs. Reaction networks are commonly used to identify the “best” or lowest-cost reaction paths from a set of starting molecules to any target molecule(s) of interest. The ideal tools for this task are efficient pathfinding algorithms such as Dijkstra's algorithm28 for single-shortest paths and Yen's algorithm29 for the N-shortest paths. However, as described above, these algorithms will not necessarily produce chemically valid pathways in networks containing multi-reactant reactions. It has historically been necessary to use custom methods that have suboptimal performance and scale poorly with network size based on tree traversal algorithms,22 breadth-first-search,25 or depth-first-search.11,25 Performant stochastic sampling approaches can instead be employed,30 but they are not guaranteed to find the true best solution(s). While these custom approaches may be sufficient for relatively small and simple networks, analyzing larger networks describing more complex systems will necessarily require the optimal performance of Dijkstra's and Yen's algorithms. A multi-reactant graph representation that respects both graph theoretical and chemical principles is clearly needed.
Our type III representation enables the use of standard pathfinding algorithms. Consider the previous pathfinding example in which a type III reaction yielded a negative species quantity in the system state. Employing our fixed type III representation, the same example path proceeds without issue (Fig. 2b). There are two critical changes: the final system state no longer includes a negative quantity of species E, and the total path cost additionally includes the cost of producing prerequisite E (ΦE) as part of the last reaction. Thus, our novel type III representation resolves the multi-reactant reaction inconsistency and enables pathfinding via optimized Dijkstra's and Yen's algorithms for any chemical reaction network. However, our representation additionally requires that prerequisite costs be solved and included in reaction node costs before pathfinding is performed.
We have developed an iterative algorithm to simultaneously solve all prerequisite costs before pathfinding. Naively attempting to solve PR costs on the fly often results in unbound recursion. Instead, we iteratively solve all PR costs with the following procedure outlined in Fig. 3:
(1) We calculate the shortest path from each starting molecule to every possible molecule in the network that may act as a prerequisite using Dijkstra's algorithm.
(2) We then identify solved PRs in which the shortest path to that molecule node itself has no unsolved PRs.
(3) If any PRs remain unsolved, we update all PR costs in the network and continue iterating until all PRs are solved.
In this manner, PR costs at any iteration are an effective lower bound on the true PR costs. The cost of each unsolved PR will rise from one iteration to the next until that PR is solved, at which point the cost remains fixed through all further iterations. Additionally, by only updating the network at the end of each iteration, PR costs are uniquely defined and do not depend on the implementation details of Dijkstra's algorithm such as the order that the nodes are searched. We show an example of PR solving for a model network in Section S1 of the ESI.† While this example and the concept of PR solving are quite simple, we note that the PR solving process is the key to allowing optimized pathfinding in reaction networks and is the most computationally intensive step of our chemically consistent graph architecture.
One subtlety of PR costs is the temptation to think that a species created incidentally as a byproduct of a type II reaction (for example, molecule B in Fig. 2b) could then be used as a prerequisite for a later reaction, thus avoiding the need to pay ΦPR and reducing the cost of the path overall. However, a CRN-based approach cannot accommodate such a cost reduction since system history is ignored and reaction costs must be independent, which is further enforced by the greedy nature of Dijkstra's algorithm. This is a limitation of our methodology, since real chemical processes may contain paths where byproducts from one step are consumed as reactants multiple steps later. However, we note that in the chemical application considered in this work, this synergy is not relevant to the paths identified.
Understanding the mechanisms of SEI formation is critical to the development of battery technologies. Such understanding could allow researchers to modify the initial battery conditions or electrolyte composition to suppress or promote targeted reaction pathways, thereby engineering and controlling the properties and composition of the SEI. With the capability to predict the formation of SEI components and the properties of the resultant SEI based solely on the initial conditions of the electrolyte, it would be possible to computationally screen the vast chemical space of potential electrolyte salts, solvents, and additives to guide experimental investigations and expedite the development of novel next-generation battery chemistries beyond lithium-ion.
Over roughly two decades of dedicated research into SEI formation, it has become accepted that lithium ethylene dicarbonate, also known as Li2EDC or LEDC, is a major, early-formed organic component of the interphase in Li-ion batteries, based on both experimental and computational results.38–41 A number of pathways from the ethylene carbonate (EC) solvent molecule and Li+ to LEDC have been proposed in the literature. From these, two prominent paths have emerged. In the “one-electron” path, two reduced EC molecules, coordinated with lithium (Li+EC−), ring-open and combine to form LEDC and ethylene.39 In the “two-electron” path, after ring-opening, one LiEC molecule is reduced again and decomposes to form ethylene and LiCO3−. This LiCO3− then attacks an EC molecule to form LiEDC−, which then coordinates with a Li+ to form LEDC.38,42,43 Here, we employ a reaction network to determine if these proposed pathways are indeed the most thermodynamically competitive, or if there are other pathways that may contribute to LEDC formation.
Our reaction network methodology, based on a novel graph representation for multi-reactant type III reactions and an iterative PR cost solving algorithm, is uniquely suited to provide insights into SEI formation processes. Because the SEI forms spontaneously on an electrified interface in the presence of multiple species and local environments, there is a vast number of possible reactive fragments which in turn gives rise to millions of plausible reactions. As a result, a reaction network describing SEI formation will necessarily be of massive scale to ensure that key pathways are not missed, making such a network an ideal test case for our approach. We emphasize that attempting to analyze such a reaction network with millions of reactions is completely unprecedented and is only feasible with optimized pathfinding algorithms.
We have constructed a reaction network to describe SEI formation using our chemically consistent graph architecture that contains 6000 species and nearly 4.5 million reactions. We employ a thermodynamic cost function Φ = C + eΔG/kT, where ΔG is the free energy of the reaction and C is a constant that provides a lower bound on reaction cost and encourages shorter paths. We use C = 1 in this work based on empirical investigation and the physically reasonable result that, with this choice, one reaction with ΔG = 0 costs an equivalent amount as two reactions with ΔG ≪ 0. Further discussion of the cost function and our procedure for network construction is presented in Section S2 of the ESI.†
After solving the prerequisites of our final network, pathfinding autonomously identifies both previously reported and novel formation mechanisms for LEDC (Fig. 4). We define Li+ and ethylene carbonate (EC) as our starting molecules and are able to solve all prerequisite costs in 19 iterations over the course of 14 hours on a laptop. Prior to the reactions shown in Fig. 4, all paths include the bidentate coordination of Li+ with EC followed by reduction.
A key validation of our approach is that both previous prominent mechanisms, proposed through manual investigations – the two-electron path (Fig. 4, purple) and the one-electron path (Fig. 4, green) – are recovered as our 2nd shortest and 3rd shortest paths, respectively. The shortest and 4th shortest paths (Fig. 4, blue and red) are nearly equivalent to the 2nd and 3rd shortest paths, respectively, with the blue path being another two-electron mechanism and the red path being another one-electron mechanism. However, both include a counterintuitive ring-opening step that has not been previously considered because it is slightly less thermodynamically favorable than the conventional ring-opening mechanism (−1.08 eV vs. −1.21 eV). Intuitively, a chemist would select the more favorable route at a given step, but in this case non-intuitive reactions could meaningfully contribute.
The 5th shortest path (Fig. 4, gold) is entirely novel and leverages a transient lithium ion to decompose ring-opened Li+EC− exergonically. Note that decomposing ring-opened Li+EC− in isolation is slightly endergonic according to our calculations (+0.25 eV), which may be why it has not previously been reported. However, our procedure autonomously identified that the coordination of an additional lithium ion simultaneous with the decomposition yields a much more thermodynamically favorable reaction (−0.42 eV), perhaps making it competitive with the other mechanisms emerging from the network. The additional lithium ion can then dissociate simultaneously with the addition of a ring-opened Li+EC− to form LEDC (−4.44 eV), making it a transient participant in the gold pathway.
Since our network contains only thermodynamic information, reaction costs are a function of free energy changes. While a more accurate cost would be based on reaction kinetics, calculating reaction barriers is extremely computationally intensive and challenging to automate. In the case of LEDC formation, the electron transfer rate has further been shown to be important in determining pathway fitness;44 in principle, electron transfer should also be accounted for. While we currently do not have any kinetic information in our network, including such information would not require any modification to the reaction representation or pathfinding algorithm.
The ability to identify the best reaction paths in highly complex reaction networks is the first step towards reverse engineering SEI composition and complex reactivity more generally. The best paths identified by our approach are the natural targets for reaction engineering. Paths that yield desirable products provide a set of reactions that can be strategically promoted, through additives and concentration tuning, while paths that yield undesirable products instead identify reactions that may be selectively hindered. Such reaction engineering leveraging knowledge of desirable and undesirable reactions has already been put into practice in the field of heterogeneous catalysis,45–48 where substrates are chosen to minimize parasitic reactions and maximize output of the desired chemical.
Using our chemically consistent architecture, we have generated a thermodynamic network that contains nearly 6000 species and roughly 4.5 million reactions, the largest chemical reaction network ever built from first principles. Pathfinding on the resulting network to the early-SEI component LEDC identifies both previously-proposed mechanisms and three new possible mechanisms, all in an automated fashion. Identification of optimal paths informs which reactions could be promoted or hindered to control the abundance of beneficial or detrimental species, a key step towards the goal of engineering SEI composition and controlling complex reactive systems in general.
Future work will include pathfinding to other important SEI components and investigating the impact of additives on SEI formation pathways. We are also developing an automated procedure to calculate and incorporate reaction barriers into the network as well as an automated approach to adding reactions with more than two bonds changing or with one or more bonds changing simultaneously during redox processes. With this highly general, robust, and scalable approach, it will be possible to understand and control complex chemical or electrochemical reactive systems, expediting the development of novel materials and chemical technologies in domains such as energy storage and medicine and improving upon synthetic pathways for industrial applications.
Footnote |
† Electronic supplementary information (ESI) available: Simple prerequisite cost solving example and further details of network construction. See DOI: 10.1039/d0sc05647b |
This journal is © The Royal Society of Chemistry 2021 |