Open Access Article
Jingjing Zhangab,
Lily A. Robertson
*ab,
Ilya A. Shkrob*ab,
Rajeev S. Assary
ac and
Lu Zhangab
aJoint Center for Energy Storage Research, Argonne National Laboratory, 9700 S. Cass Avenue, Lemont, IL 60439, USA
bChemical Sciences and Engineering Division, Argonne National Laboratory, 9700 S. Cass Avenue, Lemont, IL 60439-4837, USA. E-mail: robertla@anl.gov; shkrob@anl.gov
cMaterials Science Division, Argonne National Laboratory, 9700 S. Cass Avenue, Lemont, IL 60439-4837, USA
First published on 8th June 2026
In chemistry, rationalizing and predicting reaction behavior in complex environments is challenging, but this knowledge is required for practical applications and materials advancement. Here, we focus on one such example that uses organic molecules as redox-active materials in electrochemical energy storage. In flow batteries, these molecules (also known as redoxmers) serve as charge carriers, and exceptional chemical stability in all states of charge is required for long-term use. Here, we show how machine learning combined with chemist's knowledge can be used to reveal patterns in the reactivity of charged redoxmers by providing mechanistically tractable clues.
Redox flow batteries (RFBs) are scalable electrochemical devices in which charge separation and charge storage are compartmentalized.1 Such devices can be used in the electric grid to integrate intermittent power sources2 or provide backup and peak-shaving for energy-intensive operations such as data centers.3 During battery charging, the positive and negative charge carriers are separated in electrochemical stacks and dispatched to storage tanks. During battery discharging, the stored charges are flowed back to the stacks, and the charge is collected. The positive and negative charge carriers are referred to as catholytes (posolytes) and anolytes (negolytes), respectively. Redox-active organic molecules (redoxmers) can serve as anolytes and catholytes for nonaqueous flow cells (NRFBs). Redoxmers and metal complexes involving abundant elements are among the few economic charge-storage materials for grid-scale devices.4
The chief attraction of such flow cells is an extended electrochemical window compared with aqueous cells, which potentially allows operation at a higher voltage. To translate this advertised advantage into higher energy density, one needs both higher redox potentials and a high concentration of the carriers. Importantly, the charge carrier must have high solubility in the battery electrolyte; i.e., the high concentration of supporting salts used to balance the charge often affects redoxmer solubility. Thus, catholyte molecules need to have high oxidation potentials, possess high solubility in all states of charge, and be exceptionally stable both during storage of charged fluids in a reservoir (calendar life stability) and electrochemical cycling (cycle life stability). As these properties can be at odds with each other, only a few families of redoxmers have been identified as attractive flow cell catholyte candidates, most notably cyclopropenium cations,5 phenothiazines,6 and dialkoxyarenes.7
In recent years, numerous derivatives of these redoxmers have been synthesized and tested to optimize electrochemical performance. Navigating the complex trade-offs inherent to this optimization would be improved by the ability to predict properties of interest. Unfortunately, many of these properties are difficult or impossible to compute from first principles.
Machine learning (ML) modeling is used across different branches of chemistry for statistical discovery of trends underlying the interplay of multiple factors.8 There have been successful examples of using this methodology for flow cells, e.g., for optimization of reactivity9 and solubility within a redoxmer family.10,11 The properties of interest for this study are the ones that “make or break” active materials as potential candidates: the electrochemical potentials (that define the energy density), the calendar life, and the cycling stability. The latter two properties can show high variability, appearing erratic and complicating modeling. Another roadblock is data scarcity: the existing training sets for homologous redoxmers are small (10–50 compounds). For such small sets, using multiple structural descriptors and fingerprints routinely used in ML modeling cannot be justified; on the other hand, using too few descriptors can be insufficient to capture variability in the data. Compounding this difficulty is the fact that molecular descriptors are typically designed for closed-shell molecules, whereas singly charged redoxmers are open-shell species, whose properties only somewhat correlate with the properties of neutral precursors. While there are many known approaches to constructing descriptors for neutral molecules,12 there is a dearth of such methods for reactive, charged molecules such as radical ions.13,14 Here, we demonstrate how more intuitive descriptors to characterize such species can be constructed and used in ML models using the strategy illustrated in Fig. 1.
![]() | ||
| Fig. 2 The upper row: non-symmetric (type 1) and symmetric (type 2) dialkoxyarenes and volatile products 3 to 6 observed after electrolysis of these redoxmers in acetonitrile-based electrolytes. The lower row: bar plots of the calendar and cycle lives for redoxmers in Table 1. | ||
| Catholyte | Typea | Rb | E1/2, V vs. Ag/Ag+c | t1/2, hd | N90e |
|---|---|---|---|---|---|
| a Symmetry type.b Substitution groups in Fig. 2.c Measured half-wave potential.d 50% decay lifetime for the radical cation.e Cycle number corresponding to 10% capacity loss at 50% SOC for 3C rate cycling at 20 mM. | |||||
| DiMe | 2 | Methyl | 0.690 | 26.5 | 573 |
| MeEt | 1 | Ethyl | 0.673 | 365.2 | 300 |
| DiEt | 2 | Ethyl | 0.657 | 208 | 337 |
| MePr | 1 | n-Propyl | 0.702 | 86.3 | 255 |
| DiPr | 1 | n-Propyl | 0.661 | 248.6 | 140 |
| MeiPr | 1 | iso-Propyl | 0.644 | 213.5 | 9 |
| DiiPr | 2 | iso-Propyl | 0.598 | 312.9 | 210 |
| MesBu | 1 | sec-Butyl | 0.648 | 19.5 | 160 |
| DisBu | 2 | sec-Butyl | 0.594 | 23.7 | 341 |
| MeBn | 1 | Benzyl | 0.710 | 5.6 | 27 |
| DiBn | 2 | Benzyl | 0.731 | 19.5 | 3 |
In liquid electrolyte (here, 0.5 M lithium bistriflimide, LiTFSI, in acetonitrile at room temperature in a glovebox), these molecules undergo reversible one-electron oxidation (see cyclic voltammograms and tables in Section S3), and the half-wave potentials E1/2 of this electrochemical reaction vs. an Ag/Ag+ quasi-reference electrode are given in Table 1.
Another useful metric is the half-life t1/2 of the radical cation (the oxidation product) that was determined in dilute redoxmer solutions (5–20 mM) to minimize cross reactions. Electron paramagnetic resonance (EPR) spectroscopy was used to observe the decay of the dialkoxyarene radical cations over time, and these kinetics were numerically analyzed to estimate reaction times (Section S5). In this way, the calendar lifetime was obtained to characterize the chemical stability of charged redoxmer molecules in bulk electrolyte during charge storage in a tank.
The cycling stability was determined in symmetric stationary H-cells where the dialkoxyarene is present in both compartments but at opposite states of charge (Section S4). This method allows study of the effects of parasitic reactions: in one compartment, the neutral molecule was oxidized to the radical cation, while in the other compartment, the radical cation was reduced to the precursor. As the electrochemically active material is converted to products via parasitic reactions, the capacity decreases, and the cycling stability can be characterized by this capacity fade. In such diagnostic experiments, the cell is charged and discharged at a constant rate until a set capacity is reached or the potential limits are exceeded. These experiments were limited to 50% theoretical capacity with a 3C rate (1C means full discharge in 1 h) and a potential range of 0–1.8 V vs. Ag/Ag+. The number of full cycles N90 that lead to 10% decay of the discharge capacity, calculated from initial discharge capacity (see Fig. 1 in ref. 16), was used as a metric for chemical stability during charge or discharge; thus, the two metrics reflect the two basic regimes of operation in a flow cell. As shown in Fig. 2 and Table 1, these two stability metrics show high variability and do not correlate in an obvious way.
In these cycling experiments, except for the benzyl derivatives, for which N90 was very low (Table 1), >10% of the capacity faded when >90% of the redoxmer still remained in the cell, as suggested by gas-chromatography mass spectrometry (GCMS), proton nuclear magnetic resonance spectroscopy (1H NMR) and high-performance liquid chromatography (HPLC) post-mortem analyses of the cycled electrolyte solutions (Section S7); thus, the loss of cyclability was not a simple matter of loss of the redox-active material to parasitic reactions. Rather, the cycling was terminated due to a rapid increase in the potential of the working electrode as illustrated for MeEt in Fig. 3, which suggests fouling of the electrodes by insoluble reaction products. Initially, the charging half-cycle terminated at a charging plateau at ∼1.0 V vs. Ag/Ag+ (Fig. 3a), but over time the end-cycle potentials increased, and a new plateau started to form at ∼1.5 V vs. Ag/Ag+ (Fig. 3b and c). In Fig. 3d, we plot the maximum potential at the end of each cycle. The initial slow growth that was linear with the cycle number was replaced with a rapid increase in the potential, at which time the potential control was lost, and the discharge capacity rapidly decreased. In some systems, the potential control was recovered after 1–3 h of rest at open circuit; the cycling resumed to once again halt in the same manner.
Such behavior would be consistent with the formation of poorly soluble insulating deposits on the electrode at the end of the cycle that slowly dissolved during rest.
Second, as observed above, for many dialkoxyarenes, the cyclability is not limited by redoxmer decomposition per se but rather by a rapid increase in the working electrode potential, as illustrated in Fig. 3. This overshoot occurs even when the redoxmer is still present in the electrolyte at the end of the cycle. We attribute this increase to the formation of a resistive film consisting of insoluble products whose overall yield is small. For this reason, their formation has little effect on the decay kinetics of a radical cation in the bulk, yet it greatly affects the cycle life.
We note that the molecules in Table 1 all share a 2,5-di-tert-butyl substitution.17 These protective groups play an important role in the stabilization of the radical cation by preventing proton loss coupled with radical addition to the arene ring of a neutral molecule.17–20 The bulky groups in the ring provide steric hindrance to this reaction without inducing excessive strain in the radical cation. When the arene ring is tri- or tetra-substituted, the stability decreases as the resulting strain eases O-dealkylation of the radical cation.17,21
The problem with the 2,5-di-tert-butyl substitution is that the tert-butyl groups can become eliminated as carbocations, leaving the unprotected fragment 3 (Fig. 2).22 The typical gas chromatogram showing product 3 generated in electrolysis of MeBn can be seen in Fig. S7.1 in Section S7. Adducts of deprotected 3 and precursor molecules were found by chromatography among the products of electrolysis in ref. 23; it is likely that still larger oligomers are formed as there is no impediment to ring addition and polymerization once the protecting groups become lost. The eliminated carbocation can subsequently undergo the Ritter reaction24,25 with acetonitrile solvent (Fig. 4a and S7.1 and S7.2 in Section S7).23 While the deprotection is a low-yield reaction (<2–5%), it can have dramatic consequences for the cycling lifetime as it leads to the formation of macromolecules near the electrodes, which can form deposits on the electrode surface.
The above applies to the radical cations whose lifetimes exceed the duration of a half-cycle. When the opposite is true, the short lifetime affects both stability metrics. For benzyl derivatives, the radical cations undergo rapid O-dealkylation by eliminating the benzyl carbocation and leaving the aroxyl radical (Fig. 4b). Our product analyses suggest that these aroxyl radicals abstract hydrogen (yielding phenols 4) or disproportionate (yielding quinones 5), while the eliminated carbocation (R+) undergoes the Ritter reaction yielding the corresponding nitrilium ion (RN
C+Me or RN+
CMe). During wet analysis, this ion hydrolyzes to N-benzylacetamide, which is observed chromatographically among the volatile products along with the phenol and quinone (see Fig. S7.1 and S7.2). A less expected reaction is the formation of a benzoxazole derivative 6, which is a product of the addition of the solvent molecule (acetonitrile) concerted with the removal of the tert-butyl group. This product can be identified through mass spectrometry and 1H NMR spectroscopy (Fig. S7.1 and S7.3). The multiplicity of reaction pathways and their different impacts on bulk and electrode reactions explains the need for ML modeling in balancing different structural factors on these reactions.
In this approach (Fig. 1), multiple subsets χ of m < Nx variables (“chromosomes”, with each explanatory variable representing a “gene”) are generated at random, and the MMLR with the optional ridge regularization is performed for each set. The root mean square deviation (rms) for each “chromosome” serves as the measure of its “fitness”. At each iteration, 50 “chromosomes” are sorted by rms, and the 10 most fit “chromosomes” are selected at random according to the precedence of their fitness. Two “mutations” (random substitutions of the “genes”) are introduced per “chromosome”, and random fragments of these chromosomes are interchanged (Fig. 1). These variations are introduced at each generation as the population evolves, searching for the global minimum on the fitness landscape. Typically, 100–2000 iterations were sufficient to locate the optimal subset.
When these generic descriptors were used to fit well-defined redoxmer properties closely associated with their electronic structure, such as redox potential (E1/2) or hyperfine coupling constants (spin densities) for radical cations, they performed quite well (Section S6). For the E1/2 of dialkoxyarenes, even for an extended set as in ref. 16 (which includes the molecules examined in this paper, see Fig. S6.1), it was always possible to find two descriptors modeling the data with a coefficient of determination r2 > 0.97. These trends are illustrated in Fig. S6.1, where E1/2 is correlated with a linear combination of Mollog
P (an estimator of lipophilicity)31 and ATSC7c (lag-7 centered Moreau–Broto autocorrelator weighted by Gasteiger charge)32 with r2 ≈ 0.98. The generic descriptors were also suitable for rationalizing more complex properties directly related to molecular structure. For example, the MMLR model for the proton hyperfine coupling constants in the radical cations that quantify the unpaired electron spread across the molecule (see Table S5.1) performed nearly as well as the model for redox potentials. For dialkoxyarene molecules with unbranched R-groups (16 molecules, 4 hyperfine coupling constants for each), a set of 64 measurements can be fit using just six generic descriptors (r2 > 0.95, see Fig. S6.2). This number of descriptors is justified given the number of measurements. The accuracy of this model was better than the accuracy of our density functional theory (DFT) estimates. Thus, certain properties of radical cations readily lend themselves to ML analysis. However, many of the descriptors not only are not intuitive but also lack an obvious connection to the properties of charged redoxmers.
The approach was less successful in capturing trends for the redoxmer stability metrics in Table 1 despite the large number of generic descriptors available for such analyses (>1200 were used). Fitting each metric with r2 > 0.9 required 5–6 generic descriptors for only 11 measurements (see, for example, Fig. S6.3). We sought to ameliorate this deficiency by using a logistic regression (Section S6) to separate charged dialkoxyarenes into short- and long-lived by their half-life times t1/2 (setting 30 h as the stability threshold). In this way, a quadratic 2-descriptor model was found with a pseudo-r2 of 0.75. Thus, even such a low-fidelity classifier proved challenging to construct. We hypothesized that more chemically intuitive and relevant descriptors representing the reactivity of radical ions are required.
As stipulated above, the known fragmentation reactions of a dialkoxyarene radical cation involve the quinoid core. As the reactivity is defined by distribution of charge and unpaired electron density through this fragment, Mulliken partial atomic charges and spin densities on carbon atoms, oxygens, and α-hydrogens obtained in DFT calculations were chosen as 1D descriptors (the entire list of these descriptors is given in the Excel spreadsheet in the SI). To these quantities, the average charge and spin density and their variance over the arene rings and the quinoid fragment were added. These variances served as heuristic metrics for charge and spin delocalization. Other descriptors included all-molecule H and C counts, Hα atom counts, bond lengths and bond angles for H, C, and O atoms in the quinoid fragment, and dihedral angles involving the Cα–O and C(arene)–C(tert-butyl) bonds. The minimal-approach distances between the aromatic protons and Cα carbons were also included to quantify steric strain.
Another insight from previously recognized trends was a possible role of symmetry in the radical cation stabilization.21,33 Several descriptors were computed to quantify this symmetry, where the electric dipole was the most straightforward metric and other parameters are described in Section S6. In total, 100 descriptors were used as “genes” in the GA-optimized MMLR models (Fig. 1). As observed above, the redox potential E1/2 is a relatively simple property that can be fit with high fidelity using only two descriptors. The computed E1/2 (which was one of the descriptors) already correlates well with the experimental estimates, as seen from Fig. S6.4a. The goodness-of-fit can be increased to r2 = 0.96 by adding a second descriptor (C6–HAr bond length); still greater accuracy (r2 = 0.98) is obtained by using the spin density in the oxygen and the C(arene)–O bond length. These examples suggest that there is sufficient variability in our descriptor set to account for variations in the redox potentials.
![]() | ||
Fig. 5 Correlations of the experimental and estimated (a) cycling life (N90) and (b) calendar life (log t1/2) stabilities for dialkoxyarenes in Table 1. The coefficients of determination are shown in the plots. (c) A dialkoxyarene radical cation at the crossroads. Partial charges in tert-butyl and α-carbons are good predictors for elimination of the corresponding carbocations. While rapid O-dealkylation limits both of the stability metrics, if this reaction is slow, the cyclability is limited mainly by deprotection of the radical cation leading to the formation of macromolecules, causing the eventual loss of potential control, while the calendar stability is limited by slow neutralization in the bulk. | ||
| Descriptors | Fit variable & coeff. | Descriptor type |
|---|---|---|
| a Reduced values are shifted by their mean and normalized by their standard deviation over the data set. The atoms are numbered as shown in the upper right panel of Fig. 1. A positive coefficient means positive correlation, while a negative coefficient means negative correlation. | ||
| Redox potential | E1/2 | |
| Spin density, O16 | −0.92 | Spin density in the oxygen atom |
| r(C3–O10) | +0.78 | C(arene)–O bond length |
![]() |
||
| Cycling life stability | N90 | |
| Charge, C12 | −0.73 | Mulliken charge on tert-butyl carbon |
| Charge, C17 | −0.84 | Mulliken charge on α-carbon |
| r(C3–O10) | −1.00 | C(arene)–O bond length |
![]() |
||
| Calendar life stability | log(t1/2) | |
| I2 | +0.83 | Median semiaxis of the mass weighted gyration ellipsoid |
| Charge, C11 | +0.73 | Mulliken charge on α-carbon |
| Spin_ring_stdev | +1.23 | Standard deviation of spin density in the arene ring |
As suggested by these analyses, the cycling stability for short-lived radical cations is mainly determined by the efficiency of the O-dealkylation, whereas for long-lived radical cations, the cycling stability is mainly determined by the efficiency of deprotection. As both reactions are carbocation elimination, the charges in the leaving carbon are good quantifiers for the corresponding reaction barriers. The cycling life of dialkoxyarene molecules is predicated on (i) the energy stored in the radical cation and (ii) its proclivity towards carbocation elimination.
For the calendar life stability, log(t1/2) was chosen as a variable. The best three-descriptor fit we found (with r2 = 0.98) is shown in Fig. 5b. One of the selected descriptors is the Cα carbon charge (which was also implicated in our fit for N90): the inclusion of this descriptor can be rationalized through its correlation with the O-dealkylation rate (for short-lived radical cations) as was the case with N90. Another descriptor is the standard deviation for the spin densities in the arene ring, which is an ad hoc metric of spin delocalization in the ring.
The most surprising descriptor is the third one (I2), which corresponds to the intermediate principal moment of inertia of the molecule. Its principal axis lies in the arene plane and points perpendicular to the main axis of the molecule. It is unlikely that this descriptor reflects steric hindrance, as steric effects primarily influence I3, whose principal axis is perpendicular to the arene plane. As charge neutralization is the main reaction of long-lived radical cations in the bulk, solvent access to a reactive site plays an important role. For every radical cation that is converted to a neutral molecule, there should be an oxidation. It is unlikely that the oxidized species is a salt ion in the electrolyte. This suggests that the acetonitrile solvent itself becomes oxidized during the reaction. In this respect, our observation of product 6 can be relevant mechanistically as it suggests that the neutralization could be initiated by the formation of a C–O bond between the solvent and the charged oxygen in the alkoxy group of the molecule (similar to the initial step of the Pinner reaction).34 We hypothesize that the descriptor I2 is a predictor of a reaction barrier for this reaction, controlling solvent access to the alkoxy oxygens.
Recently, this speculation has been boosted by a ML-driven high throughput study from Argonne's RAPID-200 autonomous discovery laboratory,35 in which the stability of N-substituted phenothiazinium radical cations was studied. It was shown that the slowest reactions of this radical cation involved homolysis of the solvents (including acetonitrile) with the formation of sulfonium carbocations, some of which eventually decayed back to the parent compound. It seems likely that an analogous reaction can occur with the formation of oxonium ions that also partially revert to the parent molecule. It is this multistep reaction that has been recognized as bulk neutralization of the radical cations.
Combining product analyses with chemically intuitive descriptors of radical cation reactivity and ML methodology, we obtained a model strongly suggesting that the stability of charged dialkoxyarene redoxmers is controlled by the interplay of three competing reactions: charge neutralization, O-dealkylation, and arene ring deprotection (Fig. 5c). Of these three reactions, the deprotection (which is a low-yield reaction) has an especially strong impact on electrochemical cyclability as it enables polymerization of deprotected molecules that causes fouling of the electrodes and the loss of potential control. This detrimental reaction, however, has little effect on the calendar life of charged dialkoxyarenes in the electrolyte bulk. Conversely, the charge neutralization has little effect on the cycle life, as it recovers the active material, while having a major effect on the calendar life. Expectedly, the O-dealkylation affects both stability metrics when it becomes sufficiently fast.
The “erratic” data shown in Fig. 2 and Table 1 reflect competition between these three reactions and their unequal effect on the two metrics of electrochemical performance. In this way, the seeming randomness has been reduced to the interplay of simpler causes, conforming to the Darwinian view on chance.
The broader impact of this study is that the chemical processes we discuss—fragmentation, polymerization, and neutralization—are very common for radical ions, so competing reactions of this kind are the norm. Different reaction pathways respond to molecular structure in distinct ways, making it generally difficult to discern clear patterns in the data. This variability and the challenge of interpreting seemingly random results have been a shared experience in the field for many years. The use of laboratory automation and AI/ML-directed methods may make disentangling these reactions increasingly feasible in the future, but this ideal has not yet been realized.
The data supporting this article have been included as part of the supplementary information (SI). Supplementary information: Sections S1 to S9 contain synthetic procedures, electrochemical, crystallographic, and EPR protocols, NMR spectra, crystallographic and EPR data, EPR kinetics, electrolysis and ML plots, and high-performance liquid chromatography (HPLC) spectra. An Excel spreadsheet is provided containing descriptor names, meanings, and values and ML fits. See DOI: https://doi.org/10.1039/d6ta00771f.
| This journal is © The Royal Society of Chemistry 2026 |