Navigating the Unknown With AI: Multiobjective Bayesian Optimization of Non-Noble Acidic OER Catalysts

Experimental catalyst optimization is plagued by slow and laborious efforts. Finding innovative materials is key to advancing research areas for sustainable energy conversion, such as electrocatalysis. Artificial intelligence (AI)-guided optimization bears great potential to autonomously learn from data and plan new experiments, identifying a global optimum significantly faster than traditional design of experiment approaches. Furthermore, it is vital to incorporate essential electrocatalyst features such as activity and stability into the optimization campaign to screen for a truly high-performing material. In this study, a multiobjective Bayesian optimization (MOBO) was used in conjunction with an experimental high-throughput (HT) pipeline to refine the composition of a non-noble Co-Mn-Sb-Sn-Ti oxide toward its activity and stability for the oxygen evolution reaction (OER) in acid. The viability of the MOBO algorithm was verified on a gathered data set, and an acceleration of 17x was achieved in subsequent experimental screening compared to a hypothetical grid search scenario. During the ML-driven assessment, Mn-rich compositions were critical to designing high-performing OER catalysts, while Ti incorporation into MnOx triggered an improved activity after short accelerated stress tests. To examine this finding further, an operando mass spectrometry technique was used to probe the evolution of activity, metal dissolution, and surface area over 3 h of operation. This work demonstrates the importance of respecting the multiobjective nature in electrocatalyst performance during HT campaigns. AI-based decision-making helps to bridge the gap between fast HT screening (limited property extraction) and slow fundamental research (rich property extraction) by avoiding less informative experiments.


Introduction
The task of optimization is prevalent in any scientific discipline. Material optimization and discovery, in particular, have propelled technological advancements, creating a materially different generation compared to previous ones. The current climate crisis poses a severe threat to the well-being of humanity, where innovation in energy materials is needed to tackle a monumental challenge of this scale. Establishing a renewable energy landscape is of utmost importance to secure a sustainable energy supply while maintaining current living standards in the near future. Electricity from wind and solar is intermittent, necessitating its storage into energy-dense molecules such as hydrogen to cover energy demands at any time, even during downtimes of renewable electricity production. Polymer electrolyte membrane water electrolysis (PEMWE) will be a cornerstone in the future energy transition, generating green hydrogen to end the reliance on current fossil-based energy carriers. 1,2 Here, the oxygen evolution reaction (OER) is profoundly more sluggish than the hydrogen evolution reaction (HER). Given that anodic and cathodic reactions proceed simultaneously, the slower reaction ultimately determines the total device efficiency, making the improvement of OER kinetics a main target of many research endeavors in the water splitting community. Electrocatalysts form the heart of electrochemical devices and act as reaction promoters. So far, IrOx has become the state-of-the-art OER catalyst for PEMWEs. Despite decades of research, no marketable alternative consisting of cheap earth-abundant elements has emerged to replace this scarce and expensive noble metal catalyst, necessitating impactful materials innovation. Catalysts in commercial PEMWE applications must not only be active but also durable over an extended time. The particular challenge lies in overcoming the inherent instability of non-noble metals at low pH, which is the operating environment of PEMWEs. Recently, some
promising demonstrations have proved the viability of Co- and Mn-based oxides. For instance, Mondschein et al. showed that Co3O4 can perform OER over several days in strongly acidic electrolytes, albeit at low current densities, using a three-electrode cell. 3 In terms of practical applicability, γ-MnO2 was used to operate a PEM setup for 400 h at 10 mA cm-2. 4 When 100 mA cm-2 was applied, the system shut down after 8 h, which was attributed to extensive Mn leaching and calls for further stabilization of the noble metal-free OER catalyst. One strategy to achieve higher stabilization is the expansion to multinary compositions. This method forms thermodynamically more stable alloys or bonds by adding elements that can alter electronic structures and act as stabilizing additives. 5 The incorporation of Mn into the spinel lattice of Co3O4 can extend the catalyst lifetime by two orders of magnitude without compromising any activity during acidic OER. This effect was attributed to the formation of a stable Mn-O bond, suppressing dissolution. 6 Chong et al. even demonstrated a La- and Mn-doped cobalt spinel OER catalyst that was able to operate at current densities of 200 mA cm-2 within a PEM setup for 100 h.
7 [10] So far, studies on multimetallic transition metal catalysts for acidic OER have primarily focused on binary oxide systems. A plethora of unexplored mixed metal systems still exist that need to be screened for their viability as non-noble alternatives to IrOx. Constructing multinary systems represents a combinatorial problem that scales exponentially with each new parameter (e.g., an element) added to the mix. Traditional material optimization is laborious and slow, partly due to the commonly utilized one-parameter-at-a-time approach, which requires many experiments to screen a grid of parameter combinations. Such a grid search becomes time-intensive and thus infeasible for large search spaces. High-throughput (HT) methods have contributed greatly to this challenge, where individual laboratory tasks can be automated (e.g., synthesis, measurement, data analysis) to accelerate the testing of thousands of samples. 11 Automation in HT methods not only expedites the workflow but also introduces less human bias and error into the optimization campaign. Such approaches have already proven useful in screening non-noble multinary oxides toward acidic OER. 12,13 The effectiveness of HT experiments in uncovering promising electrocatalysts lies in the proper definition of the main objective, i.e., which property to optimize. While many HT campaigns focus solely on the catalyst activity, screening the stability is crucial to adequately identify high-performing catalysts. 14,15 However, the evaluation becomes more time- and resource-intensive with each additional property being assessed.
16 Combining thorough testing of each sample with a grid-search-based strategy for vast search spaces prolongs the total measurement time, losing the essence of HT screening. Additionally, not all experiments are equally useful. Spending time-intensive measurements on unpromising candidates during grid search can be a waste of resources. As an alternative, iterative approaches are more favorable, where a small subset across the entire parameter space is explored in the first iteration. The obtained information serves as prior knowledge, informing which regions in the parameter space to investigate next and which ones to neglect in order to optimize the objective. In a way, this heuristic approach has been performed by human scientists for many centuries. However, with the recent emergence of predictive machine learning (ML) algorithms, such experimental planning can be performed autonomously without human intervention, excluding human bias from research. 17 ML techniques have catapulted the scientific community into a new paradigm, whereas their implementation in energy material research has just started. 18,19 ML algorithms are often discussed in conjunction with big data, with massive amounts of data providing the basis for training ML models. However, some ML approaches are specifically geared toward data-poor optimization campaigns, which are often prevalent in experimental material science, let alone electrocatalysis. 20 One active learning approach is Bayesian optimization (BO), an adaptive sampling strategy relying on an iterative optimization process to find the global optimum in a predefined parameter space.
21 The BO algorithm consists of the following steps: (i) initialization, in which some observations are collected, building the starting point for the optimization process; (ii) machine learning, in which a surrogate model (typically a Gaussian process) is fitted on the given observations; (iii) optimization policy, in which an acquisition function decides which parameter combinations are most informative to reach the global optimum; and (iv) experimental evaluation of the newly suggested candidates, after which the loop starts over at (ii). The feasibility of BO for material science has been demonstrated for photovoltaics, 22,23 thin films, 24 photocatalysts, 25 and organic compounds, 26 where it forms the core of fully autonomous self-driving laboratories (SDLs). 27 SDLs are still very rarely applied to electrocatalytic applications. So far, only Black et al. have demonstrated a use case in which they optimized a non-noble metal composition for acidic OER using an SDL. 28 [30][31][32] BO has also been used to guide traditional laboratory workflows with manual involvement in either the synthesis or characterization to refine catalysts for electrochemical reactions. Yamauchi et al. have demonstrated the viability of BO for effectively screening the Pt-Pd-Au composition space for methanol oxidation. 33 It was concluded that only 1% of the entire search space had to be screened to find the optimum. In another report, Arenz et al. used BO to explore a multidimensional high-entropy alloy composition space comprised of Pt-Ru-Pd-Rh-Au for H2/CO electrooxidation. 34 Zelenay et al. implemented adaptive learning into an automated synthesis workflow for electrocatalyst development, which guided the optimization of a Fe-N-C catalyst for the oxygen reduction reaction.
35 While these studies mostly focused on optimizing the catalyst activity alone, high-performing electrocatalysts must combine high activity and stability, as mentioned earlier, which represents a multiobjective optimization task. Therefore, multiobjective BO (MOBO) is crucial, in which both objectives (activity and stability) are simultaneously optimized. 16 In electrocatalysis, activity and stability tend to be inversely correlated, i.e., a catalyst with exceptionally high activity commonly shows poor stability and vice versa. 36 Thus, MOBO aims to identify catalysts that possess the best compromise between these two properties. Our previous report demonstrated an HT workflow that allows rapid synthesis and testing of electrocatalysts for their OER activity and stability using single-task automation. 37,38 However, the underlying optimization strategy always relied on a grid search without any feedback loop driven by ML. In this work, MOBO was implemented into a previously established HT platform to adaptively screen a Co-Mn-Sb-Sn-Ti oxide space to simultaneously maximize the activity and stability toward acidic OER. To test the viability of ML-driven experiments for the present optimization task, a data set was curated from a grid search of the Co-Mn-Sb-Sn-Ti oxide space. Based on this data set, a simulated MOBO was performed to evaluate whether the algorithm can identify the global optimum faster than a random sampler. After this validation, the algorithm was applied to suggest new compositions to be synthesized and screened in the laboratory to iteratively narrow down the optimal compositions. This strategy is illustrated in Figure 1. Once interesting compositions were identified, more detailed and time-consuming testing was carried out on selected candidates to study the observed behavior in greater depth, for example, by using operando mass spectrometry to probe the degradation behavior.
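The four BO steps listed in the Introduction can be sketched in a few lines. The block below is a deliberately simplified stand-in and not the method used in this study: it optimizes a single objective, uses a hand-written zero-mean Gaussian process with a squared-exponential kernel, and replaces the acquisition function with a lower confidence bound. The function names, the kernel length scale, the `kappa` value, and the toy objective are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    """Squared-exponential kernel between the rows of a and b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    var = 1.0 - np.sum(k_qt * np.linalg.solve(k_tt, k_qt.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def bayesian_optimization(candidates, measure, n_init=5, n_iter=15, kappa=2.0, seed=0):
    """Minimize `measure` over a discrete candidate pool."""
    rng = np.random.default_rng(seed)
    # (i) initialization: measure a few randomly chosen candidates
    sampled = [int(i) for i in rng.choice(len(candidates), size=n_init, replace=False)]
    values = [measure(candidates[i]) for i in sampled]
    for _ in range(n_iter):
        remaining = [i for i in range(len(candidates)) if i not in sampled]
        if not remaining:
            break
        # (ii) surrogate model: fit the GP on standardized observations
        y = np.asarray(values, dtype=float)
        y_scaled = (y - y.mean()) / (y.std() + 1e-12)
        mean, std = gp_posterior(candidates[sampled], y_scaled, candidates[remaining])
        # (iii) acquisition: lower confidence bound (assumed stand-in policy)
        pick = remaining[int(np.argmin(mean - kappa * std))]
        # (iv) evaluate the suggested candidate and loop back to (ii)
        sampled.append(pick)
        values.append(measure(candidates[pick]))
    return sampled, values

# Toy usage: a hypothetical 1D objective with its minimum at x = 0.7
pool = np.linspace(0.0, 1.0, 101).reshape(-1, 1)
toy_objective = lambda x: float((x[0] - 0.7) ** 2)
order, observed = bayesian_optimization(pool, toy_objective)
best = order[int(np.argmin(observed))]
print(pool[best, 0], min(observed))
```

In a laboratory setting, `measure` would correspond to synthesizing and testing a composition, which is why each call is made only once per candidate.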

Simulated optimization of composition through MOBO
An initial proof of concept is needed to validate the viability of MOBO for the OER activity-stability optimization of Co-Mn-Sb-Sn-Ti oxides. To avoid resynthesis and testing in each new iteration, a complete data set was constructed first using in-house HT capabilities. This step was taken because no open-source data set exists that resembles this multiobjective optimization task with 5 elements as features and activity and stability as properties. A grid search approach was chosen, where 70 samples (25 at.% difference between each composition) were synthesized using the pipetting robot with subsequent annealing. The samples were then assessed for their activity and stability in acidic OER in 0.1 M HNO3 using the automated scanning flow cell (SFC) developed in an earlier study. 37,38 The testing protocol applied to each sample is shown in Figure S1. The activity was assessed through the OER overpotential (ηOER) reached during the first 1 mA cm-2 hold, whereas the stability was determined from the ηOER change (ΔηOER) after the accelerated stress test (AST). Characterizing stability via ΔηOER is a classical and relatively facile approach in electrocatalysis, which was the main reason for its adoption for the MOBO-driven screening. This method captures the overall performance deterioration over time. For deeper insights into material degradation, a more involved operando technique based on mass spectrometry was used in subsequent investigations within this study, offering a detailed understanding of the catalyst dissolution during electrochemical testing.
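The size of the grid follows directly from the step width. As a minimal sketch, assuming that "25 at.% difference" means every element content is a multiple of 25 at.% and all five contents sum to 100 at.%, the compositions can be enumerated as follows (the function name `composition_grid` is hypothetical):

```python
from itertools import product

ELEMENTS = ("Co", "Mn", "Sb", "Sn", "Ti")

def composition_grid(step_percent, elements=ELEMENTS):
    """List all compositions whose at.% values are multiples of
    step_percent and add up to 100 at.%."""
    if 100 % step_percent != 0:
        raise ValueError("step_percent must divide 100")
    levels = 100 // step_percent
    grid = []
    # Choose the first n-1 contents freely; the last one closes the balance.
    for head in product(range(levels + 1), repeat=len(elements) - 1):
        last = levels - sum(head)
        if last >= 0:
            contents = [n * step_percent for n in head + (last,)]
            grid.append(dict(zip(elements, contents)))
    return grid

grid_25 = composition_grid(25)
print(len(grid_25))  # 70
```

The count of 70 reproduces the number of samples in the grid search, which supports the assumed reading of the step width.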
To ensure an accurate interpretation of the composition-performance results during this study, the discrepancy between the nominal and actual composition must be assessed. However, determining the actual composition of the samples on the FTO substrate with energy-dispersive X-ray (EDX) or X-ray fluorescence (XRF) is challenging, as Sn signals will be overshadowed by the Sn in the substrate and Sb signals overlap with those of Sn. Instead, the composition of the drop-casted solutions was evaluated using an inductively coupled plasma mass spectrometer (ICP-MS) as a proxy measurement. The final film composition should closely resemble the ink composition, as no metal evaporation is expected during the annealing step. Table S1 shows that the ink composition for four random mixtures is in close proximity to the nominal targeted composition, confirming the accuracy of the pipetting sequence.
It is important to note that features such as morphology, phases, or surface facets undoubtedly influence the electrocatalytic performance. 39,40 However, such features often come as a result of compositional tuning and can turn out to be secondary information for material discovery campaigns. Such information becomes indispensable when focusing on revealing a detailed structure-property relation to, e.g., derive new descriptors. Implementing X-ray diffraction (XRD) or X-ray photoelectron spectroscopy (XPS) into SDLs to extract such supportive information adds more engineering complexity. Hence, recording the electrocatalytic performance as a function of the composition remains a popular route for automated HT workflows in electrocatalysis and represents an intuitive research question in material science. Figure 2a depicts the obtained ηOER and ΔηOER for each composition. The data set can be found in Table S2 in the Supporting Information. It is noticeable that the lack of Co or Mn manifests in a massive activity drop, indicated by the increase in ηOER around experiment count 55. This result aligns with expectations, as Sb, Sn, and Ti oxides are known to be poor OER catalysts. 2 An obvious trend in the stability results was not observed. Surprisingly, some ΔηOER values are negative, indicating that the catalyst became more active after the AST. Overpotentials typically increase as a result of deactivation.
41 A so-called activation step, as encountered for noble metals 42,43 or MEAs, 44,45 was not observed in the majority of the tested samples. A follow-up investigation of this irregular phenomenon is presented toward the end of this study. For now, it seems that most samples exhibit a ΔηOER of around 0-20 mV.

https://doi.org/10.26434/chemrxiv-2023-0509z ORCID: https://orcid.org/0000-0001-8979-7252 Content not peer-reviewed by ChemRxiv. License: CC BY 4.0

The data shown in Figure 2a were used to simulate a MOBO process without running additional experiments to evaluate whether the algorithm can quickly identify the most active and stable compositions within the data set. qNoisyExpectedHypervolumeImprovement (qNEHVI) was chosen as the acquisition function to identify the Pareto front, as it can weigh trade-offs among multiple objectives. Additionally, it does not require a prior selection of a known trade-off between the objectives. 46 A FixedNoiseMultiTaskGP surrogate model was used as the ML model, which allows the experimental noise determined during the HT screening to be included in the model. The algorithm was initialized with five random entries. Based on the results in Figure 2a, an ηOER of 550 mV and a ΔηOER of 10 mV were determined as suitable reference points for the optimization, presenting values that are both desirable and realistically achievable. After initialization, the MOBO algorithm picked one new candidate from the remaining data set per iteration, rearranging the data as shown in Figure 2b. When a Pareto plot is constructed by mapping the activity over the stability, it becomes clear that the algorithm rapidly selected points within the optimal quarter (bottom-left corner) during the first 20 iterations. Once no better compositions are left within the data set, the points scatter to less active and stable regions. To evaluate the speed of optimization, the FixedNoiseMultiTaskGP model was benchmarked against a random sampler that does not learn from previous iterations. For
this comparison, the normalized hypervolume is plotted over each sampling iteration. The hypervolume is the area that spans between the Pareto optimal points and the reference point. When dealing with random sampling, multiple repetitions are required to probe the statistical significance. If the random sampler happened to sample the best compositions at the beginning and only this repetition were compared, MOBO would wrongly appear inferior. Figure 2d shows the comparison of the FixedNoiseMultiTaskGP surrogate model against a random sampler after 500 repetitions. The adaptive sampling strategy indeed outperforms an uninformed sampler and finds the optimum already after 20 iterations. The FixedNoiseMultiTaskGP model was also compared to a MultiTaskGP model that does not consider noise. The performance of the latter is just marginally worse (see Figure S2). However, as heteroscedasticity (different variance in each data point) is important information during ML-guided HT screening, the FixedNoiseMultiTaskGP was chosen as the surrogate model for the subsequent MOBO-driven experimental optimization of the Co-Mn-Sb-Sn-Ti oxide space.
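The hypervolume metric and the random-sampler baseline can be illustrated with a short sketch. It assumes that both objectives (ηOER and ΔηOER) are minimized, consistent with the optimal region being the bottom-left corner of the Pareto plot, and it uses the reference point of 550 mV and 10 mV stated above. The four data points are hypothetical and the function names are illustrative; normalization of the hypervolume is not shown.

```python
import random

def pareto_front(points):
    """Non-dominated points when both objectives are minimized."""
    front = [p for p in points
             if not any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)]
    return sorted(set(front))  # ascending in x, hence descending in y

def hypervolume_2d(points, reference):
    """Area dominated by the points and bounded by the reference point."""
    inside = [p for p in points if p[0] < reference[0] and p[1] < reference[1]]
    area, upper = 0.0, reference[1]
    for x, y in pareto_front(inside):
        area += (reference[0] - x) * (upper - y)  # add one rectangular strip
        upper = y
    return area

def hypervolume_trace(points, order, reference):
    """Hypervolume after each sample when visiting points in the given order."""
    return [hypervolume_2d([points[i] for i in order[:k + 1]], reference)
            for k in range(len(order))]

def random_baseline(points, reference, repetitions=500, seed=0):
    """Mean hypervolume trace of an uninformed sampler over many shuffles."""
    rng = random.Random(seed)
    traces = []
    for _ in range(repetitions):
        order = list(range(len(points)))
        rng.shuffle(order)
        traces.append(hypervolume_trace(points, order, reference))
    return [sum(step) / repetitions for step in zip(*traces)]

# Hypothetical (eta_OER / mV, delta_eta_OER / mV) pairs, not measured data
data = [(500.0, 5.0), (480.0, 8.0), (520.0, 2.0), (600.0, 1.0)]
reference = (550.0, 10.0)
print(hypervolume_2d(data, reference))  # 380.0
print(random_baseline(data, reference))
```

Averaging over many shuffles is what makes the comparison fair: a single lucky random ordering can reach the final hypervolume early, whereas the mean trace reflects the expected progress of an uninformed sampler.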

Experimental optimization of composition through MOBO
The previous results showed the validity of MOBO to optimize the OER catalyst activity and stability simultaneously in a 5-dimensional space and laid the groundwork for the subsequent ML-driven experiments. The algorithm now suggests new compositions that a scientist synthesizes and screens in the laboratory, realizing a human-in-the-loop ML-driven catalyst optimization. The campaign began with screening 15 homogeneously distributed samples over the entire quinary search space (each composition differing by 50 at.%, see Table S3). As a rule of thumb, BO campaigns should be initialized with around 2*(n+1) samples, where n is the dimensionality of the optimization campaign (5 in this case). 46 The samples were screened using the same protocol shown in Figure S1. The observations serve as the prior, whereupon the MOBO algorithm suggests 15 new candidates from a total composition space of 1001 candidates (each composition differing by 10 at.%) to be synthesized and screened. This loop is continued until convergence is reached (see Figure 1).
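The sample counts quoted for the different step widths are consistent with a simple counting argument. Assuming that a step width of s at.% means each of the k = 5 element contents is a multiple of s and the contents sum to 100 at.%, the number of compositions N(s) is the number of ways to distribute 100/s units over k elements:

```latex
N(s) = \binom{\tfrac{100}{s} + k - 1}{k - 1}, \qquad k = 5

N(50) = \binom{6}{4} = 15, \qquad
N(25) = \binom{8}{4} = 70, \qquad
N(10) = \binom{14}{4} = 1001
```

These values match the 15 initialization samples, the 70 samples of the earlier grid search, and the 1001 candidates of the full search space. The rule of thumb 2*(n+1) gives 12 for n = 5, so the 15 initialization samples lie slightly above it.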
Figure 3a shows a summary of the recorded composition-dependent ηOER and ΔηOER for each MOBO iteration. The data set can be found in Table S4 in the Supporting Information. After initialization, subsequent suggestions all focus on compositions yielding an ηOER below 550 mV and a ΔηOER below 15 mV. Multidimensional scaling (MDS) of the quinary parameter space in Figure 3b (for activity) and Figure 3c (for stability) indicates that Mn seems to play an important role in achieving the global optimum. The gray points in the MDS plots represent the total search space. Figure S3 illustrates how, after initialization, the MOBO algorithm quickly focuses on sampling Mn-rich compositions.
Similar to the previously simulated proof of concept, the MOBO algorithm improved the hypervolume significantly faster than random sampling (see Figure 4a). A summary of the random sampling can be found in Figure S4 and its corresponding data set in Table S5. The hypervolume increases sharply during the first iteration and plateaus during the third iteration, suggesting that most non-dominated compositions are found during the first two optimization cycles. From Table S6, one can observe that all suggested sampling points for a potential 4th MOBO iteration are very similar, implying that the algorithm is already trying to exploit a particular region. Figure 4b shows how almost every sampled candidate is better than the predefined reference point. Compared to the grid search performed earlier, the MOBO was able to significantly improve the hypervolume and explore more non-dominated compositions that reside on the Pareto front, as indicated by Figure S5.
Inspecting the Pareto compositions in Figure 4c more closely reveals that Mn90Co10Ox achieves the highest activity while suffering from a ΔηOER of around 10 mV after the AST. It is important to note that this result is highly dependent on the testing protocol chosen and does not necessarily mean that Mn90Co10Ox is universally the most active sample. Rather than finding the one and only optimal catalyst, it is equally intriguing to discover certain trends that MOBO was able to unravel. For instance, incorporating Ti into the catalyst triggers an activity improvement after the AST, shown by the negative ΔηOER. The more Ti is incorporated, the more pronounced this effect, but at a sacrifice of activity. This trend prevailed throughout the sampled compositions, which is evident from the analysis of the 2nd, 3rd, and 4th best Pareto fronts shown in Figure S6. To clarify, the 2nd best Pareto front is obtained when all points from the 1st Pareto front are deleted. The 3rd is obtained by deleting the 1st and 2nd, and so on. More elaborate follow-up investigations are needed to understand this behavior, which will be highlighted in the upcoming sections. Literature reports have demonstrated the beneficial role of Sb incorporation into mixed Mn oxides to improve electrocatalytic stability during acidic OER.
8,47,48 Slight improvements in the stability of Mn oxides through the addition of Sb could also be observed in this study, indicated by a lowered ΔηOER for a Mn90Sb10Ox sample compared to bare MnOx in Figure S7. To compare the performance of the mixed metal oxides against a state-of-the-art catalyst, an IrOx sample was synthesized with the same procedure used for the non-noble samples (see Figure S7). As expected, IrOx outperforms the Pareto compositions in terms of activity by 180 mV due to its superior OER kinetics. The ΔηOER after the AST is near 0, demonstrating good stability. Overall, it seems that binary and ternary compositions are more frequently sampled during the MOBO campaign (see Figure S8).
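The peeling procedure used to obtain the 2nd, 3rd, and 4th best Pareto fronts can be written down compactly. The following is a minimal sketch, assuming both objectives are minimized; the function names and the six (ηOER, ΔηOER) pairs are hypothetical and serve only to illustrate the procedure.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and strictly
    better in at least one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_ranks(points):
    """Peel off successive Pareto fronts; returns lists of point indices."""
    remaining = dict(enumerate(points))
    fronts = []
    while remaining:
        front = [i for i, p in remaining.items()
                 if not any(dominates(q, p) for q in remaining.values())]
        fronts.append(front)
        for i in front:  # delete the current front before finding the next
            del remaining[i]
    return fronts

# Hypothetical (eta_OER / mV, delta_eta_OER / mV) pairs, not measured data
samples = [(420, 10), (450, 2), (480, -5), (460, 12), (500, 0), (520, 15)]
for rank, front in enumerate(pareto_ranks(samples), start=1):
    print(rank, [samples[i] for i in front])
```

For this toy set, the first front contains the first three points, the second front the points (460, 12) and (500, 0), and the third front the remaining point.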

Extended OER testing of selected compositions
As outlined earlier, the addition of Ti into MnOx triggers an improvement in activity after ASTs, a recurring behavior observed across the entire optimization campaign. To investigate this particular behavior in more detail, the Mn70Ti30Ox composition, which showed this phenomenon the strongest within the Pareto compositions, was subjected to more extensive electrochemical testing (see Figure S10). The objective was to conduct a Tafel analysis before and after the same AST applied during the MOBO-guided screening. The Tafel analysis was performed by recording the ηOER at chronopotentiometric (CP) holds of 0.1, 0.2, 0.5, 1, 2, and 5 mA cm-2. 50 A Mn90Co10Ox sample was also subjected to this testing to probe the behavior of a candidate at the opposite end of the Pareto front (see Figure 5a). As observed before, the overall activity of Mn70Ti30Ox is lower than that of the Mn90Co10Ox sample. The ηOER values of both mixed metal oxides become similar as the current density approaches 5 mA cm-2, which is attributed to Mn being the main constituent performing OER in both Mn-rich samples.
The AST once more causes the ηOER to decrease for the Mn70Ti30Ox composition. However, the Tafel slope increased (see Figure 5b). This outcome suggests that the OER kinetics worsened after the AST, implying that the improvement in OER activity might stem from extrinsic factors, e.g., an increase in surface area. Surface roughening due to dissolution is one option that could cause an increase in the electrochemically active surface area, exposing more active sites to catalysis. 51 On the other hand, the kinetics improved slightly for the Mn90Co10Ox sample. Pinning down the exact mechanism is complicated and requires more elaborate in-situ testing. However, attention was devoted to the increase in activity for Ti-incorporated MnOx samples due to its irregular behavior compared to all other samples.
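A Tafel slope is obtained from a linear fit of the overpotential against the logarithm of the current density, η = a + b·log10(j). The sketch below applies this fit to the six current densities of the CP holds; the overpotential values are synthetic and were chosen only to mimic the qualitative observation of a lower ηOER combined with a higher Tafel slope after the AST. The function name `tafel_fit` is an assumption.

```python
import numpy as np

def tafel_fit(current_density_ma_cm2, overpotential_mv):
    """Least-squares fit of eta = a + b * log10(j).
    Returns (b in mV per decade, a in mV)."""
    log_j = np.log10(np.asarray(current_density_ma_cm2, dtype=float))
    slope, intercept = np.polyfit(log_j, np.asarray(overpotential_mv, dtype=float), 1)
    return slope, intercept

# Current densities of the CP holds (mA cm-2)
j = np.array([0.1, 0.2, 0.5, 1.0, 2.0, 5.0])

# Synthetic overpotentials (mV), NOT measured data
eta_before = 450.0 + 70.0 * np.log10(j)
eta_after = 430.0 + 85.0 * np.log10(j)

slope_before, _ = tafel_fit(j, eta_before)
slope_after, _ = tafel_fit(j, eta_after)
print(slope_before, slope_after)
```

In this synthetic case, the overpotential after the AST is lower at every current density although the slope is steeper, which is the combination that points toward an extrinsic origin such as a larger surface area.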
Hence, as a final follow-up, the Mn70Ti30Ox sample was subjected to even more rigorous testing using in-situ ICP-MS. 52 The coupling of a mass spectrometer to the SFC allows the real-time Mn and Ti dissolution during electrochemical operation to be studied with high sensitivity. This approach will help understand catalyst stability from the perspective of active site leaching, a key degradation pathway in electrocatalysis. 53

Operando dissolution study of Mn70Ti30Ox catalyst
The protocol used for the follow-up measurement is shown in Figure 6a. The objective was to record the activity, dissolution behavior, and change in surface area over a prolonged time span. The protocol was applied in a loop up to 19 times, starting with three cyclic voltammograms (CVs) between 1.1 and 1.3 VRHE. Scan rates of 25, 50, 100, and 200 mV s-1 were chosen to extract the capacitance from the capacitive current, which was treated as a proxy metric for the electrochemical surface area. 54 A final hold at 1 mA cm-2 and 80 AST cycles between 0 and 1 mA cm-2 conclude the protocol. Figure 6b and c show the dissolution traces for Mn and Ti, respectively, for each iteration of the protocol. The traces were overlaid for Mn, which helps visualize the change in their shape over time. Interestingly, the initial Mn dissolution during the AST shows a transient behavior. However, the transience gradually disappears toward the end, where the dissolution rate remains constant throughout the AST until a complete deactivation is reached, which in this case occurred around iteration 17. A tentative explanation is the increasing upper potential limit during the ASTs toward the end, which would destabilize Mn further through its transition to a soluble MnO4- phase according to the Pourbaix diagram. 55 Similar OER-triggered MnO2 dissolution was reported in alkaline media, where the main driver for destabilization was attributed to the MnO2/MnO4- redox transition. 56 It is worth noting that the highest Mn dissolution was actually recorded when switching the potentiostat from galvanostatic (CP hold) to potentiostatic (CV) mode between each iteration, which resulted in a sharp potential drop from around 1.4-1.5 VRHE to 1.1 VRHE (see Figure S11). This pronounced Mn leaching could result from the redox transition from MnO2 to an aqueous Mn2+ phase, which would thermodynamically occur at these potentials.
4,55 Based on this argumentation, it is logical that the Mn dissolution observed when initiating the CP hold at 1 mA cm-2 is comparatively lower, as the reverse redox transition is triggered, going from Mn2+ to the solid MnO2 phase. Due to the lower signal-to-noise ratio for Ti, a stacked representation was chosen. A slight peak is present during the hold for the first two to four iterations, after which no clear dissolution signal is recognizable. This behavior could imply that Ti stabilizes over time or dissolves at rates lower than the detection limit of the ICP-MS. Nevertheless, TiO2 is thermodynamically much more stable in acidic media than Mn oxides and should be resistant to dissolution up to 2.1 V at pH 0. 55 Hence, most of the Ti dissolution could originate from a cooperative dissolution mechanism, in which predominant Mn leaching rips Ti atoms off the surface. Similar behavior was observed for Fe-Ni oxide systems during neutral OER. 37 Figure 6d shows the total dissolved amount of Mn calculated as the integral of the dissolution rates for the hold and AST.
Calculating the integral for the Ti signals was more challenging due to the noisy signal, which impeded the baselining. Superimposing the Mn dissolution with the activity (as η at 1 mA cm-2) and the electrochemical surface area proxy (as capacitance), as shown in Figure 6e, reveals some intriguing trends.
Initially, η decreases with a concomitant increase in the surface area. Preferential leaching of unalloyed Mn species could trigger such behavior, which is supported by the increase in Mn dissolution during the hold until the 7th iteration. The initial leaching could cause an increase in surface roughness that would expose more active sites for catalytic processes.
Interestingly, the peak of the Mn dissolution during the holds around iteration 7 coincides with the peak in the activity and surface area, implying that the transformation of the surface comes to a halt at this point. After the initial surface composition change, more stable MnxTiyOz alloys reside at the interface to the electrolyte. Subsequently, the Mn dissolution decreases between iterations 7 and 14, which could be attributed to the stabilizing effect of Ti toward Mn, as demonstrated in previous reports. 9,57 It would be expected that the dissolved amount of Mn during the ASTs shows the same trend. Instead, it keeps rising as the iterations progress, which is ascribed to the harsher conditions applied during ASTs with longer exposure to fluctuating potentials. Past iteration 14, the sample deactivates fully, shown by the sudden drop in activity accompanied by a final rise in Mn dissolution caused by the high potentials before ceasing to near zero. This increase in Mn dissolution toward the end supports the hypothesis that the drop in Mn leaching between iterations 7 and 14 comes from the Ti-stabilized Mn rather than solely from a simple depletion of the sample.
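The capacitance used as a surface area proxy in this section follows from the proportionality between capacitive current and scan rate, i = C·ν, so C is the slope of a linear fit of current against scan rate. The sketch below uses the four scan rates of the protocol; the current values are hypothetical, and how the capacitive current was read from the CVs (potential, anodic or cathodic branch) is not specified here and is left open. The function name is an assumption.

```python
import numpy as np

def capacitance_from_scan_rates(scan_rates_mv_s, capacitive_currents_ua):
    """Slope of capacitive current versus scan rate.
    Currents in µA and scan rates in V/s give the capacitance in µF."""
    rates_v_s = np.asarray(scan_rates_mv_s, dtype=float) / 1000.0
    currents = np.asarray(capacitive_currents_ua, dtype=float)
    slope, _offset = np.polyfit(rates_v_s, currents, 1)
    return slope

scan_rates = [25, 50, 100, 200]          # mV s-1, as in the protocol
currents = [0.5, 1.0, 2.0, 4.0]          # µA, hypothetical values

print(capacitance_from_scan_rates(scan_rates, currents))  # approx. 20 µF
```

Tracking this slope over the protocol iterations gives a relative measure of surface area change; converting it into an absolute area would require a specific capacitance, which is not attempted here.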

Conclusions
Material optimization is needed to tackle highly relevant challenges of the 21st century. Electrocatalysis is not exempt from this urgency. Mixed metal oxides are a promising material class to be studied as an earth-abundant alternative for OER catalysis in PEMWEs. Screening the vast composition space efficiently is key to accelerating material optimization, where a brute-force approach (e.g., grid search) might be highly resource-intensive when subjected to millions of test samples. The typical countermeasure in HT studies is to reduce the testing time per candidate, compromising the amount of information extractable during the measurement. OER stability, in particular, requires extended testing to accurately probe destabilization. Electrocatalytic performance is highly impacted by the choice of the electrochemical protocol due to the dynamic electrode|electrolyte interface. In other words, the chosen protocol will decide the final outcome of an optimization campaign and its translatability toward real applications.
In an effort to implement these concepts into HT electrocatalysis research, we have relied on ML-driven decision-making based on multiobjective Bayesian optimization to simultaneously optimize the activity and stability within a Co-Mn-Sb-Sn-Ti oxide space for acidic OER using an in-house HT platform.
All chemicals were used as received without any further purification.

Sample preparation
The sample preparation using an automated pipetting robot is described in detail elsewhere. 37 In short, 12 mM inks of Co, Mn, Sb, and Sn were prepared in a 2 mL solution containing 70% v/v 1% HNO3 and 30% v/v glycerol. A 12 mM Ti ink was prepared similarly, except using 2 M instead of 1% HNO3 to retain the Ti in the solution. A 12 mM ink corresponds to 7 mg of Co(NO3)2·6H2O, 6 mg of Mn(NO3)2·4H2O, 0.292 mL of a 10 mg mL⁻¹ Sb stock solution in tartaric acid/HNO3, 8 mg of SnCl4·5H2O, and 8 μL of Ti-butoxide. An IrOx benchmark was prepared as well, for which 10 mg of H2Cl6Ir·xH2O was used to achieve a 12 mM ink. The FTO substrate was first cleaned by ultrasonicating for 5 min each in a 2% Hellmanex III solution, water, and IPA. The FTO was then air-dried and subjected to a silanization step by immersing the substrate in a 6% v/v dichlorodimethylsilane solution in hexane for 5 min. This step was used to render the FTO surface hydrophobic, which helped locally contain the deposited droplet on the substrate. 58 The FTO was subsequently rinsed with hexane to remove the residual silane solution and dried in air before final use. The drop-casting volume for the mixed inks was 0.3 μL. Subsequent annealing in air in a box furnace (KLC 10/14, Thermconcept), first at 300 °C for 10 min using a heating rate of 1 °C min⁻¹ and then at 500 °C for 4 h using a heating rate of 2.5 °C min⁻¹, converts the mixed transition metals into their oxide form.
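The precursor masses quoted above can be cross-checked with a short calculation. The sketch below assumes standard molar masses for the three solid salts (these values are not given in the text) and simply converts a 12 mM concentration in 2 mL into a mass; the function name `precursor_mass_mg` is illustrative only.

```python
# Molar masses in g/mol. These are standard literature values and an
# assumption of this sketch; they are not stated in the text.
MOLAR_MASS = {
    "Co(NO3)2·6H2O": 291.03,
    "Mn(NO3)2·4H2O": 251.01,
    "SnCl4·5H2O": 350.60,
}


def precursor_mass_mg(conc_mM, volume_mL, molar_mass):
    """Return the precursor mass in mg for a target concentration and volume."""
    amount_mmol = conc_mM * volume_mL / 1000.0  # mmol/L * L = mmol
    return amount_mmol * molar_mass  # mmol * g/mol = mg


# 12 mM ink in 2 mL, as described in the sample preparation
masses = {
    name: precursor_mass_mg(12.0, 2.0, molar_mass)
    for name, molar_mass in MOLAR_MASS.items()
}

for name, mass in masses.items():
    print(f"{name}: {mass:.1f} mg")
```

Under these assumptions the calculation gives roughly 7.0, 6.0, and 8.4 mg, in line with the rounded 7, 6, and 8 mg stated for the Co, Mn, and Sn salts.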

Electrochemical testing
The development of automated SFC measurements is described in detail elsewhere. 37,38 In short, a laser microscope (VK-X250, Keyence) was used in conjunction with an image detection algorithm to extract spot coordinates and geometric surface areas. The coordinates were used for the xy translation of the SFC during HT measurements. The geometric surface area was used to normalize obtained currents. Electrochemical measurements were controlled with a Gamry REF 600 potentiostat. The reference electrode was a double-junction Ag/AgCl electrode in 3 M KCl (Metrohm). The counter electrode was a glassy carbon rod (SIGRADUR G, HTW). Samples were typically contacted with copper tape at the FTO substrate. Measured potentials, EAg/AgCl, were all corrected to the reversible hydrogen electrode (RHE) scale. The electrolyte was constantly purged with 30 mL min⁻¹ of Ar. The electrolyte flow was regulated using a peristaltic pump (Reglo ICC, Ismatec) set to 15 RPM. The protocol utilized for the initial grid search and MOBO-guided experiments is shown in Figure S1. Each composition was subjected to a galvanostatic protocol starting with a 20 s hold at 1 mA cm⁻², where the activity of the sample is extracted as OER overpotential (ηOER). After performing a short accelerated stress test (AST) of 120 cycles between 0 and 1 mA cm⁻² with a 1 s hold each, the activity is assessed again at 1 mA cm⁻². The change in ηOER before and after the AST serves as a proxy metric for stability.
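The two objectives defined by this protocol can be written down compactly. The following sketch assumes that the overpotential is referenced to the thermodynamic OER potential of 1.229 V vs RHE and that the stability proxy is taken as the overpotential after the AST minus the overpotential before it; the text does not state the sign convention or how the potential is read from the hold, and the two potentials used here are hypothetical placeholders.

```python
E_OER_STANDARD_V = 1.229  # thermodynamic OER potential in V vs RHE


def overpotential_mV(potential_rhe_V):
    """OER overpotential in mV for a potential measured on the RHE scale."""
    return (potential_rhe_V - E_OER_STANDARD_V) * 1000.0


def stability_proxy_mV(potential_before_V, potential_after_V):
    """Change in overpotential across the AST (assumed: after minus before)."""
    return overpotential_mV(potential_after_V) - overpotential_mV(potential_before_V)


# Hypothetical potentials at 1 mA cm^-2 before and after the AST
eta_before = overpotential_mV(1.700)
eta_after = overpotential_mV(1.712)
delta_eta = stability_proxy_mV(1.700, 1.712)

print(f"eta before AST: {eta_before:.0f} mV")
print(f"eta after AST:  {eta_after:.0f} mV")
print(f"delta eta:      {delta_eta:.0f} mV")
```

With this convention, a positive ΔηOER indicates a loss of activity during the AST, and a negative value indicates an activation.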
https://doi.org/10.26434/chemrxiv-2023-0509z ORCID: https://orcid.org/0000-0001-8979-7252 Content not peer-reviewed by ChemRxiv. License: CC BY 4.0
The total protocol used for follow-up measurements on Mn70Ti30Ox and Mn90Co10Ox samples is shown in Figure S8. The activity is assessed through multiple 30-second chronopotentiometric (CP) steps at 0.1, 0.2, 0.5, 1, 2, and 5 mA cm⁻² to allow a Tafel analysis. As higher currents are reached during this protocol, the potential was iR corrected using the resistance measured through electrochemical impedance spectroscopy (EIS) between 100 and 100,000 Hz at open circuit potential. The CP holds are followed by the same AST outlined earlier (120 cycles between 0 and 1 mA cm⁻² for 1 s each) before concluding with another Tafel analysis with the same current steps. Finally, another electrochemical protocol is designed to study the operando dissolution behavior of Mn70Ti30Ox. The objective was to record the activity, dissolution behavior, and change in surface area over a prolonged time span, for which the protocol was applied in a loop up to 19 times. Each iteration starts with three cyclic voltammograms (CVs) between 1.1 and 1.3 VRHE. Scan rates of 25, 50, 100, and 200 mV s⁻¹ were chosen to extract the capacitance from the capacitive current, which serves as a proxy for the electrochemical surface area. 54 The obtained capacitance is purposely not converted to an area value as the specific capacitance for this system is unknown. However, as the change throughout the operation is more relevant, reporting the capacitance alone is thought to be sufficient. Afterward, a 30-second hold at 1 mA cm⁻² and 80 AST cycles between 0 and 1 mA cm⁻² are performed to roughly mimic testing conditions applied previously.
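One common way to obtain a capacitance from CVs recorded at several scan rates is to fit the capacitive current linearly against the scan rate and take the slope. The sketch below illustrates this under the assumption that such a linear fit was used; the text does not specify the fitting procedure, and the current values are hypothetical placeholders rather than measured data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Scan rates from the protocol, converted from mV/s to V/s
scan_rates_V_s = [0.025, 0.050, 0.100, 0.200]
# Hypothetical capacitive currents in mA read from the CVs
capacitive_current_mA = [0.0010, 0.0020, 0.0040, 0.0080]

slope, intercept = fit_line(scan_rates_V_s, capacitive_current_mA)
capacitance_mF = slope  # mA / (V/s) = mF

print(f"capacitance: {capacitance_mF * 1000:.1f} uF")
```

Tracking this slope from one protocol iteration to the next gives the relative change in capacitance without needing the unknown specific capacitance.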

Inductively coupled plasma mass spectrometry
An inductively coupled plasma mass spectrometer (ICP-MS, Nexion 350X, Perkin Elmer) was used in two ways during this study. First, it was used to determine the elemental composition of the drop-casting ink. Later, it was used to record the operando dissolution of selected compositions identified during the MOBO-guided experiments. For the latter, the ICP-MS was connected through Tygon tubing (Proliquid) with the outlet of the SFC. The ICP-MS was always calibrated before measurements by a four-point calibration (0, 1, 10, 50 μg L⁻¹) using Merck Certipur ICP standards. ⁵⁹Co, ⁵⁵Mn, ¹²¹Sb, ¹²⁰Sn, and ⁴⁷Ti were used as analytes. ⁴⁸Ti, which would be the more abundant isotope for Ti, was deliberately not chosen as it had higher background counts, increasing the detection limit. The calibration matrix was 0.1 M HNO3 to mimic the supporting electrolyte for subsequent operando measurements. Internal standards were prepared in 1-2% HNO3 at 5 μg L⁻¹, where ⁷⁴Ge was the internal standard for Co and Mn, ¹³⁸Ba for Sb, ¹⁰³Rh for Sn, and ⁴⁵Sc for Ti. Internal standards were used to ensure a stable and reliable system performance. A Y-connector was used to simultaneously ingest the analyte and internal standard during measurements.

Multidimensional scaling
Multidimensional scaling (MDS) is another dimensionality reduction tool that can help to visualize high-dimensional data in 2D scatter plots.This technique can be employed when the property of interest is the similarity/dissimilarity of compositions, which can be readily visualized by the distance of a compositional data point from all others.MDS is performed using the Scikit-learn 59 package in Python, employing two dimensions to represent the dissimilarities.
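A minimal sketch of such an embedding with Scikit-learn's `MDS` class is given below. To keep the example fast, it uses a coarser 20 at.% grid of five-element compositions (126 points) instead of the full search space; the grid spacing, `n_init=1`, and `random_state=0` are assumptions of this sketch, not settings reported in the text.

```python
import itertools

import numpy as np
from sklearn.manifold import MDS

# All five-element compositions on a 20 at.% grid (assumed, for speed).
# Each row sums to 1, one column per element (Co, Mn, Sb, Sn, Ti).
steps = 5
compositions = np.array(
    [c for c in itertools.product(range(steps + 1), repeat=5) if sum(c) == steps],
    dtype=float,
) / steps

# Embed the 5D compositions in 2D so that pairwise Euclidean
# dissimilarities are preserved as well as possible.
mds = MDS(n_components=2, n_init=1, random_state=0)
embedding = mds.fit_transform(compositions)

print(embedding.shape)
```

Each row of `embedding` can then be drawn as a point in a 2D scatter plot and colored by the property of interest, such as ηOER or ΔηOER.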

Multiobjective Bayesian optimization
BoTorch, 60 an open-source framework built on PyTorch, 61 was used to implement the multiobjective Bayesian optimization in Python. A FixedNoiseMultiTaskGP was used as the surrogate model to fit the observed data points. This model also permits feeding in the experimentally determined noise around each data point (error bar). As a comparison, a MultiTaskGP model without noise input was also tested during initial benchmarking on a data set. A qNoisyExpectedHypervolumeImprovement (qNEHVI) acquisition function was used as the decision-making policy with a reference point of 550 mV and 10 mV for activity (ηOER) and stability (ΔηOER), respectively. The reference point represents a compromise between being realistically attainable and not too far from the desired optimum. The qNEHVI acquisition function allows batched optimization in which multiple candidates can be suggested per iteration, which is needed to couple MOBO guidance with the developed HT pipeline most efficiently. All features (i.e., compositions) were normalized to 1. The ηOER, ΔηOER, and reference point had to be negated to depict a problem where higher values are more desired. This conversion is necessary, as the algorithm can only deal with maximization problems. Additionally, all values except the features were standardized using the StandardScaler from Scikit-learn, which removes the mean and scales the data to unit variance. During the initial benchmarking of MOBO on a data set, 5 randomly sampled compositions serve as the starting condition. Then, one new composition is sampled from the data set during each iteration with the objective of improving the hypervolume. The optimization finishes when all candidates within the data set have been sampled. To gain statistical significance, 500 repetitions of such optimization runs were performed. The seed for the random initialization went from 0 to 499. The FixedNoiseMultiTaskGP and MultiTaskGP models were compared against a random sampler with a random seed concomitantly going from 0 to 499. The final performance was evaluated by calculating the average normalized hypervolume per iteration within an upper percentile of 75% and a lower percentile of 25%.
During MOBO-driven experiments, the acquisition function value is calculated for a constrained parameter space. The sum of all elements must be 1 (i.e., 100%), and each element must be within 0 and 1 (i.e., between 0 and 100%). The acquisition function then suggests 15 new candidates, where each element is output as a value within the continuous 5-dimensional space (the output value has many decimal places). Such elemental fractions cannot feasibly be synthesized using the pipetting robot. Thus, the most similar composition within the total search space of 1001 compositions (each differing in 10 at.%) is chosen based on the smallest Euclidean distance to the suggested candidate. Every evaluated candidate is eliminated from the total search space to avoid re-sampling.
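Two steps of this procedure lend themselves to a short illustration: the hypervolume that the optimization tries to enlarge, and the mapping of a continuous suggestion onto the discrete search space. The sketch below is a plain-Python illustration and not the BoTorch implementation used in the study; the hypervolume is computed directly for two minimized objectives (without the negation step), and the observation values and the continuous suggestion are hypothetical.

```python
import itertools
import math

ELEMENTS = ("Co", "Mn", "Sb", "Sn", "Ti")
REFERENCE_POINT = (550.0, 10.0)  # (eta_OER, delta eta_OER) in mV, from the text


def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (both objectives minimized)."""
    inside = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area, best_y = 0.0, ref[1]
    for x, y in inside:  # sweep from the lowest first objective upward
        if y < best_y:  # point is on the Pareto front
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area


def build_grid(step_percent=10):
    """All compositions whose fractions are multiples of the step and sum to 1."""
    n = 100 // step_percent
    return [
        tuple(k / n for k in combo)
        for combo in itertools.product(range(n + 1), repeat=len(ELEMENTS))
        if sum(combo) == n
    ]


def snap_to_grid(candidate, available):
    """Return the available composition closest to the candidate (Euclidean)."""
    return min(available, key=lambda comp: math.dist(comp, candidate))


# Hypothetical observations (eta_OER, delta eta_OER) in mV
observations = [(450.0, 5.0), (480.0, 8.0), (500.0, 2.0)]
hv = hypervolume_2d(observations, REFERENCE_POINT)

search_space = build_grid()
n_total = len(search_space)
suggestion = (0.03, 0.68, 0.01, 0.02, 0.26)  # hypothetical continuous output
chosen = snap_to_grid(suggestion, search_space)
search_space.remove(chosen)  # avoid re-sampling the same composition

print(hv, n_total, dict(zip(ELEMENTS, chosen)))
```

Enumerating the 10 at.% grid reproduces the 1001 compositions of the search space, and the hypothetical suggestion is mapped to the Mn70Ti30 grid point before being removed from the pool of available candidates.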

Figure 1: Schematic workflow of the multiobjective Bayesian optimization for the experimental OER catalyst composition optimization.

Figure 2: Simulated MOBO on a data set from a grid search campaign. (a) Summary of grid search data showing overpotential, overpotential change, and corresponding composition. Standard deviation calculated from two duplicates. (b) Rearrangement of data shown in (a) as a result of MOBO sampling. Color code indicating MOBO iterations. (c) Pareto plot of ΔηOER against ηOER for each composition with a color map indicating MOBO iterations. (d) Normalized hypervolume plotted against MOBO iterations. MOBO is benchmarked against random sampling.

Figure 4: Experimental optimization of the Co-Mn-Sb-Sn-Ti oxide composition for OER in 0.1 M HNO3 with MOBO. (a) Measured ηOER and ΔηOER of initial and subsequent compositions suggested by MOBO. Color mapping indicating MOBO iteration. Standard deviation calculated from triplicates. (b) MDS plot of ηOER for compositions shown in panel (a) (colored points). The grey points indicate the total composition space available for MOBO sampling. (c) Similar to panel (b) but plotting ΔηOER.

Figure 3: Performance and Pareto front extracted during experimental optimization of the Co-Mn-Sb-Sn-Ti oxide composition. (a) Normalized hypervolume plotted against MOBO iterations. MOBO is benchmarked against random sampling. (b) Plotting ΔηOER against ηOER for each composition with a color map indicating MOBO iterations. (c) ηOER and ΔηOER for the observed Pareto compositions. Standard deviation calculated from triplicates.
Figure S8b illustrates once more how few samples were needed (6% of total space) to arrive at an optimum. Quinary compositions were not explored at all. Figure S9 depicts the predicted ηOER and ΔηOER values for the unexplored compositions after training the FixedNoiseMultiTaskGP model on the observations made during the MOBO campaign. Here, quinary catalysts are predicted to have no potential to come close to the Pareto front. However, making predictions into new composition spaces without being trained on them is not straightforward and requires more sophisticated transfer learning approaches. 49

Figure 5: Tafel analysis of Mn70Ti30Ox and Mn90Co10Ox. (a) Tafel plot for both samples before and after AST. Standard deviation calculated from duplicates. (b) Tafel slopes calculated from (a) before and after the AST.

Figure 6: Operando ICP-MS measurement of Mn70Ti30Ox in 0.1 M HNO3. (a) Applied electrochemical protocol. The protocol was repeated 19 times. (b) Dissolution rate of Mn during CP hold and AST. Dotted line highlighting full deactivation of the catalyst. (c) Dissolution rate of Ti during CP hold and AST. (d) Amount of dissolved Mn during the CP hold and AST. (e) Measured overpotential at the first 1 mA cm⁻² hold and the extracted capacitance from the initial CVs. Standard deviation calculated from duplicates.

Figure S1: Electrochemical protocol used for grid search study.

Figure S2: MOBO performance of different models based on the grid search data set.

Figure S6: 1st, 2nd, 3rd, and 4th best Pareto front and their corresponding compositions.

Figure S8: Compositional type sampled during MOBO-guided experiments. (a) Pareto plot of ηOER against ΔηOER. Color mapping indicating type of sampled composition. (b) Amount of total available (grey) and sampled (red) compositions per composition type.

Figure S10: Electrochemical protocol used for follow-up study on Mn70Ti30Ox and Mn90Co10Ox.

Figure S11: Operando dissolution of Mn70Ti30Ox. (a) Current density and potential. (b) Dissolution rate of Mn and Ti. (c) Exploded view of highlighted area in (a). (d) Exploded view of highlighted area in (b). Green arrow indicating the time where the potentiostat switches from galvanostatic to potentiostatic mode.

Table S1: Composition of ink mixture of two randomly selected samples determined by ICP-MS.

Table S6: Raw selection of new candidates by MOBO algorithm for a potential 4th optimization iteration.