Machine learning-driven breakthroughs in water electrolysis and supercapacitors

Diab Khalafallah *ab, Fuming Lai c, Hao Huang c, Jue Wang a, Xiaoqing Wang d, Shengfu Tong *c and Qinfang Zhang *a
aSchool of Materials Science and Engineering, Yancheng Institute of Technology, Yancheng, 224051, P. R. China. E-mail: qfangzhang@gmail.com
bMechanical Design and Materials Department, Faculty of Energy Engineering, Aswan University, P.O. Box 81521, Aswan, Egypt. E-mail: diab_khalaf@energy.aswu.edu.eg
cSchool of Sustainable Energy Materials and Science, Jinhua Advanced Research Institute, Jinhua, 321015, Zhejiang, China. E-mail: sftong@nju.edu.cn
dCollege of Materials and Chemical Engineering, Chuzhou University, Huifeng West Road 1, Chuzhou 239000, China

Received 28th April 2025 , Accepted 10th June 2025

First published on 12th June 2025


Abstract

Electrochemical energy conversion and storage have attracted widespread interest as green and sustainable technologies. In particular, research on water electrolysis and supercapacitors (SCs) has experienced significant growth, focusing on novel electrodes/electrocatalysts with prominent performances. Recently, computational frameworks employing machine learning (ML) algorithms have revitalized the targeted design of advanced nanomaterials as electrodes/electrocatalysts with tunable electronic configurations and superior reactivity. Descriptor-based analysis has proven efficient in elucidating the structure–property (e.g., activity, selectivity, and stability) relationships, addressing the complex interactions between the catalytic surface and reactant species and predicting enormous data sets. In this contribution, we present an overview of ML-driven electrode/electrocatalyst design, highlighting several novel algorithms and descriptors. The latest advancements in ML approaches are presented to efficiently screen a wide range of metal-based materials. Leveraging recent achievements, this review describes the application of ML for the discovery of active and durable nanomaterials, including identifying active sites, manipulating compositions at the atomic level, predicting the structure/performance, and optimizing thermodynamic properties as well as kinetic barriers. Moreover, recent milestones and state-of-the-art progress in ML integration strategies-materials informatics to stimulate the design of highly efficient electrode/electrocatalyst systems for the hydrogen evolution reaction (HER), oxygen evolution reaction (OER), and SCs are discussed. Finally, we highlight potential future directions for uncovering the revolutionary potential of ML in boosting sustainability and prediction efficiency in the electrochemical energy conversion and storage sector. 
This review intends to reinforce the junctions between industry and academia and merge endeavors from fundamental understanding to technological execution.



Diab Khalafallah

Diab Khalafallah obtained his PhD in 2017. He conducted his postdoctoral research at the School of Materials Science and Engineering, Zhejiang University (China, 2018–2023). Since 2023, he has been an Associate Professor at the Faculty of Energy Engineering, Aswan University, Egypt. His research interest focuses on the advancement of earth-abundant transition metals as economical, high-performance electrodes/electrocatalysts for supercapacitors and H2 production using water/seawater electrolysis systems.


Qinfang Zhang

Qinfang Zhang completed his PhD at Nanjing University in 2005. He conducted his postdoctoral research at Twente University (the Netherlands, 2005–2009) and Riken (Japan, 2009–2012). He is currently a Full Professor at Yancheng Institute of Technology (China). He was awarded the JSPS Invitation Fellowship Program for Research in Japan (FY2012). His current research focuses on nanostructured materials for photocatalysis.


1. Introduction

Sustainable electrochemical energy technologies have become a central subject of research and policy. The reckless use of conventional fossil fuel resources has led to growing environmental and industrial problems. Accordingly, an appealing option to tackle this substantial challenge is the development of renewable and eco-friendly energy systems.1–4 The sustainable energy sector is now progressing gradually with the implementation of molecular hydrogen (H2). Scientific, governmental, and industrial initiatives are essential to steer the trajectory of H2 production as the principal energy carrier in the near future.5,6 Numerous nations and policymakers have ambitious goals for distributing electricity produced from low-carbon sources. Electrochemical water splitting, an advanced energy conversion technology in which the HER produces H2 gas at the cathode and the OER produces O2 gas at the anode, presents a viable method for generating an inexhaustible carbon-free energy carrier.7–13 Consequently, the electricity-driven heterogeneous water catalysis technology can be readily adapted for large-scale industrial applications. The HER is a two-electron transfer process, whereas the OER requires four proton-coupled electron transfer steps that proceed through three reaction intermediates. The complex OER mechanism therefore results in a large activation energy and slower reaction kinetics.14–16 As a result, the overall efficiency of water splitting is significantly limited by the sluggish OER.
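
The step counts quoted above can be written out explicitly. The equations below are a sketch that assumes the commonly used adsorbate evolution mechanism in acidic medium, which the text does not specify; the asterisk denotes a surface active site, and *OH, *O, and *OOH are the three adsorbed intermediates.

```latex
% HER: overall two-electron process (acidic medium, assumed for illustration)
2\,\mathrm{H^+} + 2e^- \rightarrow \mathrm{H_2}

% OER: four proton-coupled electron transfer steps (adsorbate evolution mechanism, assumed)
\begin{align}
{*} + \mathrm{H_2O} &\rightarrow {*}\mathrm{OH} + \mathrm{H^+} + e^- \\
{*}\mathrm{OH} &\rightarrow {*}\mathrm{O} + \mathrm{H^+} + e^- \\
{*}\mathrm{O} + \mathrm{H_2O} &\rightarrow {*}\mathrm{OOH} + \mathrm{H^+} + e^- \\
{*}\mathrm{OOH} &\rightarrow {*} + \mathrm{O_2} + \mathrm{H^+} + e^-
\end{align}
```

Summing the four OER steps gives 2H2O → O2 + 4H+ + 4e−, which is why the anodic half-reaction is kinetically more demanding than the two-electron HER.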

Efficient electrocatalysts are key components in electrochemical water splitting, enhancing reaction kinetics and improving energy efficiency.17–20 Cost-effective, highly stable, and highly reactive compounds significantly contribute to electrocatalytic systems.21–25 Complexes based on precious metals remain the most effective electrocatalysts for numerous electrolysis processes, including the HER (e.g., platinum (Pt)-based catalysts), OER (e.g., ruthenium (Ru)-, rhodium (Rh)-, and iridium (Ir)-based catalysts), oxygen reduction reaction (ORR, e.g., Pt-based catalysts), and carbon dioxide reduction reaction (CO2RR, e.g., gold (Au)- and silver (Ag)-based catalysts).26–30 Nevertheless, the scarcity of resources and high cost of noble metal electrocatalysts severely hinder their widespread applications. Therefore, it is necessary to explore cost-effective and high-efficiency catalysts as alternatives to precious metal-based electrocatalysts. Electrochemical supercapacitors (ESCs) represent a category of energy storage technologies characterized by an extended cycle life and high power density.31–35 They can swiftly accumulate charges electrostatically or faradaically at the electrode/electrolyte interface. ESCs are classified into three primary categories: electric double-layer capacitors (EDLCs), which store charge through an electrostatic mechanism; electrochemical pseudocapacitors, which rely on faradaic charge transfer; and hybrid SCs. Carbon, transition metal oxide and hydroxide, and conducting polymer-based materials have been commonly introduced as electrodes for SC applications. However, when candidate electrodes/electrocatalysts are developed by traditional trial-and-error methodologies, notable problems include high preparation costs, low efficiency, and extended development times. Furthermore, rapidly identifying improved electrodes and electrocatalysts from a diverse range of materials remains challenging, hindering widespread breakthroughs in sustainable green energy technologies. The reactivity of metal- or alloy-based electrodes/electrocatalysts depends strongly on the exposed facets (Miller indices) and the catalytic centers they present.36 In addition, defects, vacancies, and localized dopants are often recognized as co-catalytic sites.23,37

Solid electrocatalysts and heterogeneous electrocatalysis significantly contribute to chemical engineering, renewable energy, and catalytic water decomposition. Understanding the reaction pathway in electrolysis is highly challenging due to the complex structure of the catalyst and the close interaction between the catalytic surface and molecules during the reaction. Accordingly, researchers have conducted fundamental studies to establish structure–activity relationships, which can provide hypotheses for the development of novel electrocatalysts. However, the in situ screening of the structure of catalysts under the reaction conditions is challenging due to the constraints of characterization approaches. In this case, atomistic simulations utilizing quantum mechanical (QM) computation tools have gained prominence in supplementing experimental analyses and generating invaluable information to strengthen catalyst research.38–40 Over the past decade, artificial intelligence (AI)-assisted computational methodologies and their core principles have attracted widespread attention in both heterogeneous electrolysis and solid electrocatalysts, elucidating the reactive catalytic surfaces during operation, describing the underlying reaction mechanisms and rate-determining steps, predicting the reaction kinetics, uncovering insights for enhanced electrode/electrocatalyst design, and presenting prospects for well-established water electrolyzers.41–46 Advancements in computational technology enable high-throughput computations that facilitate the rational design of advanced electrocatalysts via the inverse design of candidate material structures at low cost and with less reliance on expert input.

ML is a viable method for automating the development, processing, and interpretation of intricate electrode/electrocatalyst datasets, offering advantages over conventional statistical methods. Trained ML models can serve as fast surrogates for many density functional theory (DFT) calculations, substantially reducing expenses and shortening the development cycle.47–55 Moreover, ML algorithms build data-driven models to ascertain critical correlations between the characteristics of electrodes/electrocatalysts and the overall electrochemical performance (e.g., activity, specific capacitance, rate capability, stability, and selectivity).56–60 This progression has led to the successful establishment of effective design and screening guidelines for heterogeneous solid electrocatalysts with distinct properties. For instance, Singh et al. employed ML-driven high-throughput screening of metal atom (M = Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Zn)-intercalated g-C3N4/transition-metal dichalcogenide (TMD = MoS2, MoSe2, MoTe2, WS2, WSe2, WTe2) heterostructures to boost the HER activity.61 The ML models and comprehensive computations elucidated the correlation between catalyst characteristics and HER activity and identified the HER active sites. The strategically confined metal atoms within the heterostructure can significantly optimize the electrocatalytic activity. Qin and colleagues used a random forest model to predict the OER activity of hydroxide catalysts with extensive doping capacity. The resulting highly active hydroxide catalyst, Ni0.77Fe0.13La0.1, was synthesized and systematically optimized, achieving an ultralow overpotential of 226 mV at 10 mA cm−2.62 Additionally, Zhang et al. investigated the electronic structure and dynamic properties of the electric double layer (EDL) in microporous carbon via ML force field-accelerated molecular dynamics. The obtained findings indicated that the electrode potential for the Na+ intercalation process can be reduced by improving the solvation configuration of ions at the micropore/electrolyte interface.63 The field of sustainable energy conversion and storage is developing rapidly, and the number of published studies on water catalysis and electrochemical systems has increased dramatically. This area of research therefore warrants a comprehensive evaluation of its significant advancements and methodologies.

This review presents critical and thorough guidelines for materials scientists and engineers to enhance their understanding of ML, particularly in the development of high-efficiency nanomaterials for use as electrodes/electrocatalysts in HER, OER, and SCs. It emphasizes material-centric insights, illustrating the relation between material systems and their extensive applications and highlighting the essential material discoveries. Specifically, we delve into prevalent ML algorithms and elucidate significant descriptors obtained from theoretical simulations or experimental outcomes, which serve as inputs for modeling various nanomaterials. This will enable materials researchers, engineers, and chemists to determine the critical criteria for predicting the overall performance of nanomaterials. It will also support hands-on practice for electrode/electrocatalyst researchers lacking ML knowledge by guiding their dataset processing. In this context, ML applications will become more accessible, assisting scientists, chemists, and engineers in conducting and understanding ML techniques efficiently. Leveraging the recent literature, our in-depth discussions delineate current advancements in the ML-aided design of electrodes/electrocatalysts for HER, OER, and SCs for readers of varying proficiency levels. We also highlight future opportunities for the expansion of this rapidly evolving research domain, while identifying challenges that may point to innovative solutions. Ultimately, this article examines the transformative impact of ML on the design of electrodes/electrocatalysts for HER, OER, and SCs, while also providing practical guidance tailored for nanomaterials scientists. It aims to offer an accessible, material-based focus and to support the paradigm shift to data-driven research methodologies in this domain and beyond.

2. Concepts and domain knowledge of ML

2.1. ML algorithms

AI has advanced swiftly in recent decades, emerging as a formidable tool for tackling intricate challenges across multiple fields. AI denotes the ability of machines to execute tasks that ordinarily necessitate human intelligence, encompassing a diverse range of technologies, including ML64 and large language models (LLMs).65 ML, a subset of AI, focuses on creating algorithms that enable computers to learn from data autonomously without explicit programming (Fig. 1).
Fig. 1 Hierarchical AI framework incorporating ML, deep learning, and generative AI. LSTM: long short-term memory; GCNs: graph convolutional networks; CNNs: convolutional neural networks; DRL: deep reinforcement learning; RNNs: recurrent neural networks; LLM: large language model; GANs: generative adversarial networks; GPT: generative pre-trained transformer; AGI: artificial general intelligence; SL: supervised learning; WS: weak supervision; RL: reinforcement learning; MTL: multi-task learning; and UL: unsupervised learning.

In recent years, the emergence of LLMs, which employ advanced neural network architectures to analyze and produce human-like text, has elevated AI capabilities to unprecedented levels. ML algorithms can be classified into several categories based on the characteristics of the data they utilize and the types of problems they address. The key categories include supervised, unsupervised, semi-supervised, reinforcement, and deep learning.66 A summary of the most commonly used ML algorithms in electrocatalysis and their comparative performance is presented in Table 1.61,67–70 Random forest (RF) performs robustly on small to medium electrocatalyst datasets with minimal hyperparameter tuning, while CNNs learn complex non-linear relationships but require much larger data volumes and careful regularization to avoid overfitting. Support vector regression (SVR) remains highly effective in data-scarce settings, and Gaussian process regression (GPR) matches or exceeds the accuracy of SVR, while also providing principled uncertainty estimates. However, both SVR and GPR scale poorly with dataset size; exact GPR training, for example, scales cubically with the number of samples. Gradient-boosting decision trees (GBDT) strike a compromise by often delivering accuracy comparable to more complex models with built-in feature-importance measures for interpretability, although they demand longer training times and more extensive hyperparameter optimization.

Table 1 Comparison of the performance of ML algorithms for electrocatalysis prediction
| Algorithm | Typical performance (MAE/RMSE/R²) | Application scenario | Advantages | Disadvantages | Ref. |
|---|---|---|---|---|---|
| Random forest (RF) | MAE ≈ 0.118; R² ≈ 0.957 | ΔGH adsorption-energy prediction, catalyst screening | Simple to implement, naturally resistant to overfitting; stable on small–medium datasets | Poor at extrapolating beyond training distribution; limited interpretability of complex patterns | 61 |
| GBDT | MAE ≈ 0.25; R² ≈ 0.87 | Single-atom/carbon-based SAC catalysis HER activity and stability prediction | Balances predictive performance with interpretability; robust to heterogeneous feature sets | Long training times; requires extensive hyperparameter tuning | 67 |
| CNN | R² ≈ 0.93 | Overpotential prediction, multi-property modeling | Powerful at fitting highly non-linear relationships; deep layers extract complex features | Requires large, labeled datasets; prone to overfitting, needs careful hyperparameter tuning | 68 |
| SVR | RMSE ≈ 0.24 eV | Band-gap prediction on small datasets | Excellent in high-dimensional, small-sample settings; flexible via kernel choice | Runtime scales poorly with dataset size; sensitive to hyperparameters | 69 |
| GPR | RMSE ≈ 0.14 eV; MAE ≈ 0.11 eV; R² ≈ 0.83 | GW-level band-gap prediction of functionalized MXenes | Built-in uncertainty quantification; highest accuracy on limited data | Training cost ∝ O(N³); kernel hyperparameters are critical | 70 |
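
The error metrics reported in Table 1 (MAE, RMSE, and R²) can be computed as in the following minimal sketch. The function name `regression_metrics` and the numerical values are illustrative assumptions; they are not taken from the cited studies.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for paired lists of reference and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "R2": r2}

# Hypothetical adsorption free energies (eV): reference values vs. model predictions.
reference_values = [-0.30, -0.10, 0.05, 0.20, 0.45]
predicted_values = [-0.25, -0.15, 0.10, 0.15, 0.50]
metrics = regression_metrics(reference_values, predicted_values)
print(metrics)
```

MAE and RMSE carry the units of the target (e.g., eV), whereas R² is dimensionless, so values from different rows of the table are comparable only when the target property and units match.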


Supervised learning algorithms are designed to learn a mapping between input data and associated output labels.71 This learning procedure employs labeled training data to develop models capable of predicting outcomes on unseen data. Prominent supervised learning methods include linear regression, logistic regression, support vector machines (SVMs), decision trees, random forests, and k-nearest neighbors (KNN). Linear regression predicts a continuous value from independent variables by determining the line that minimizes the error between predicted and actual values. Logistic regression is used for binary classification tasks, calculating the probability that a given input belongs to a designated class using a sigmoid function. SVMs seek the hyperplane that best separates data points of different classes by maximizing the margin between them. Decision trees split data at nodes according to feature values, whereas random forests are ensembles of decision trees that enhance the predictive accuracy by mitigating overfitting. KNN is a non-parametric algorithm that determines the class of a data point by identifying the predominant class among its nearest neighbors.
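
Two of the supervised methods described above can be illustrated with a short standard-library sketch: one-feature least-squares linear regression and KNN classification by majority vote. The function names and the toy data (including the "active"/"inactive" labels) are hypothetical and serve only to show the mechanics.

```python
from collections import Counter

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (single feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(p, query)), label)
        for p, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy regression data lying exactly on y = 2x + 1.
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])

# Toy classification data: two hypothetical descriptor values per sample.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
labels = ["inactive", "inactive", "inactive", "active", "active", "active"]
label = knn_predict(points, labels, (0.95, 1.0), k=3)
print(slope, intercept, label)
```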

Unsupervised learning algorithms function on unlabeled data, discerning intrinsic patterns and structures. These techniques are frequently designed for clustering and dimensionality reduction tasks.72 Prominent unsupervised learning algorithms include k-means clustering, hierarchical clustering, principal component analysis (PCA), and autoencoders. K-means is a widely utilized clustering technique that divides the dataset into k clusters according to the distance between data points and cluster centroids. Hierarchical clustering constructs a hierarchy of clusters, which can be represented as a dendrogram, facilitating study at multiple degrees of granularity. PCA is a dimensionality reduction method employed to decrease the number of features in a dataset, while preserving the maximum variance, hence facilitating visualization and minimizing computational expenses. Autoencoders are a type of neural network employed for unsupervised representation learning; they compress data into a latent space, and subsequently rebuild it, allowing the acquisition of essential properties of the input data. Semi-supervised learning occupies a position between supervised and unsupervised learning.73 It leverages a minimal amount of labeled data in conjunction with a substantial volume of unlabeled data to construct more efficient models. This method is advantageous when acquiring labeled data is costly or labor-intensive. Graph-based approaches and self-training are often utilized in semi-supervised learning.
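
The k-means procedure described above alternates between assigning points to their nearest centroid and recomputing each centroid as the mean of its members. The following is a minimal sketch of that loop (Lloyd's algorithm) with fixed initial centroids; the data and starting centroids are illustrative assumptions.

```python
def kmeans(points, centroids, n_iter=10):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(n_iter):
        # Assignment step: index of the closest centroid for every point.
        assignment = [
            min(range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                new_centroids.append(
                    tuple(sum(m[d] for m in members) / len(members)
                          for d in range(len(old)))
                )
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, assignment

# Two well-separated groups of hypothetical 2-D descriptors.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
centroids, assignment = kmeans(points, [(0.0, 0.0), (6.0, 6.0)])
print(centroids, assignment)
```

In practice the result depends on the initial centroids, which is why library implementations typically run several random initializations and keep the best one.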

Reinforcement learning (RL) draws inspiration from behavioral psychology, wherein an agent acquires knowledge through interaction with the environment and receives rewards or penalties.74 The objective is to acquire a policy that maximizes the cumulative reward over time. Prominent algorithms in reinforcement learning include Q-learning, deep Q-network (DQN), and policy gradient techniques. Q-learning is an off-policy reinforcement learning method that estimates the value of executing a specific action in a given state based on the rewards obtained. DQN extends Q-learning by employing deep neural networks to estimate Q-values for various actions, enabling its application in intricate, high-dimensional situations. Policy gradient approaches such as REINFORCE and proximal policy optimization (PPO) directly optimize the policy by estimating gradients that increase the probability of actions that lead to high rewards.
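
The Q-learning update described above can be sketched in tabular form. The environment here is a hypothetical five-state chain in which the agent starts at state 0 and receives a reward of 1 on reaching the last state; the hyperparameters (`alpha`, `gamma`, `epsilon`) are illustrative assumptions.

```python
import random

def train_q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.2, seed=0):
    """Tabular Q-learning on a 1-D chain; reward 1 for reaching the last state."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]; 0 = left, 1 = right
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[state])       # exploit, ties broken at random
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Off-policy target: best action value in the next state (0 at the goal).
            future = 0.0 if next_state == goal else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])
            state = next_state
            if state == goal:
                break
    return q

q_table = train_q_learning()
# Greedy policy for the non-terminal states: 1 means "move right".
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(4)]
print(greedy_policy)
```

The update is off-policy because the target uses the maximum Q-value of the next state regardless of which action the epsilon-greedy behavior policy actually takes next.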

Deep learning, a branch of ML, emphasizes the utilization of multi-layered neural networks to acquire intricate data representations.75 Convolutional neural networks (CNNs),76 recurrent neural networks (RNNs),77 long short-term memory (LSTM) networks,78 and transformers79 are prominent architectures employed in many applications, including image classification, speech recognition, and natural language processing (NLP). LLMs have significantly transformed how machines comprehend and produce human-like text. These models are constructed using sophisticated deep learning architectures, particularly transformer-based neural networks. Below, we examine LLMs and their importance in the domain of AI. Transformers, introduced by Vaswani et al. in 2017,80 provide the backbone of modern LLMs. The transformer architecture is founded on the principle of self-attention, enabling the model to attend to all segments of an input sequence concurrently. This mechanism is very effective for managing long-range dependencies in text. Transformers have replaced RNNs and LSTMs in many NLP tasks due to their superior ability to capture contextual relationships.
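
The self-attention principle mentioned above can be illustrated with scaled dot-product attention, in which every output is a softmax-weighted average over all positions of the sequence. The sketch below is a simplification: it omits the learned query, key, and value projections and the multiple heads of a real transformer, and it uses the same toy vectors for queries, keys, and values.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) with weights = softmax(q . k / sqrt(d_k)) per query."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        all_weights.append(weights)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs, all_weights

# Three hypothetical token embeddings; queries = keys = values (no learned projections).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, attn = scaled_dot_product_attention(tokens, tokens, tokens)
print(attn[0])
```

Because every query is compared with every key in one step, distant positions interact directly, which is the property that makes the mechanism suitable for long-range dependencies.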

Generative pre-trained transformers (GPT)81 are a category of LLMs created by OpenAI. These models undergo pre-training on extensive text corpora by unsupervised learning, followed by fine-tuning for particular tasks. The GPT series, comprising GPT-2,82 GPT-3,83 GPT-4,84 and GPT-4o,85 has exhibited exceptional proficiency in text production, language translation, question answering, and other tasks. GPT-3, with 175 billion parameters, signifies a substantial progression in natural language processing. It is capable of zero-shot, one-shot, and few-shot learning, making it exceptionally adaptable for many applications. GPT-3 has been used in numerous real-world scenarios such as chatbots, content creation, and programming support. GPT-4o, the most recent model in the series cited here, is a multimodal model that accepts text, image, and audio inputs. Its parameter count has not been publicly disclosed, so comparisons of its scale with GPT-4 remain speculative. Such models are expected to demonstrate strong fluency in both natural and technical language, improving their capacity to aid in intricate research, innovative problem-solving, and advanced professional activities. Bidirectional encoder representations from transformers (BERT), created by Google, is an important LLM that has profoundly influenced NLP.86 Unlike GPT, which is autoregressive and processes text unidirectionally, BERT is bidirectional, allowing it to simultaneously examine both left and right contexts. This bidirectional methodology enables BERT to understand the complete context of a word within a sentence, making it exceptionally proficient for tasks such as sentiment analysis, named entity recognition, and question answering.
LLMs have been utilized across various domains, ranging from customer service and healthcare to creative writing and software development. Nonetheless, these models also come with challenges such as the need for extensive computational resources, the risk of producing biased or detrimental information, and the complexities associated with assuring interpretability and transparency.

2.2. Physics-inspired electroactive material design

The development of sophisticated electroactive materials for water electrolysis and SCs has progressively been informed by ideas from physics, especially those derived from quantum mechanics, thermodynamics, and solid-state physics.87–90 This physics-inspired technique fundamentally relies on the understanding of how atomic and electronic structures influence the macroscopic electrochemical performance. This understanding has been essential in progressing materials for energy conversion and storage devices, especially for HER and OER in electrolysis, as well as in improving the performance of SCs. Quantum mechanical models, notably DFT, have been crucial in forecasting and enhancing the electronic structure of materials employed as electrocatalysts.91 In electrocatalysis, materials must effectively promote electron transfer during HER and OER, processes reliant on the atomic-level interactions of the catalyst with the reaction intermediates, including hydrogen atoms and hydroxyl groups. For instance, an ideal catalyst for HER should bind hydrogen atoms with an appropriate strength, sufficient to facilitate adsorption without excessively hindering desorption.92 The equilibrium, fundamental to the Sabatier principle, is key for effective catalysis and is refined through quantum mechanical computations. These models have resulted in the creation of advanced electrocatalysts, such as platinum alloys and transition metal-based catalysts, which demonstrate excellent efficiency in water electrolysis.
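
For the HER, the Sabatier balance described above is commonly expressed through the hydrogen adsorption free energy, with values near 0 eV indicating binding that is neither too strong nor too weak. The sketch below ranks candidates by the magnitude of this descriptor. The candidate names, the free-energy values, and the 0.1 eV tolerance are hypothetical placeholders, not data from the cited works.

```python
def rank_by_sabatier(delta_g_h):
    """Sort candidates by |dG_H| (eV): smaller means closer to thermoneutral binding."""
    return sorted(delta_g_h.items(), key=lambda item: abs(item[1]))

def binding_regime(dg, tol=0.1):
    """Label a hydrogen adsorption free energy; `tol` (eV) is an assumed threshold."""
    if dg < -tol:
        return "binds too strongly (desorption-limited)"
    if dg > tol:
        return "binds too weakly (adsorption-limited)"
    return "near-optimal"

# Hypothetical DFT-style hydrogen adsorption free energies in eV.
candidates = {
    "candidate_A": -0.45,
    "candidate_B": 0.08,
    "candidate_C": 0.60,
    "candidate_D": -0.12,
}
ranking = rank_by_sabatier(candidates)
for name, dg in ranking:
    print(f"{name}: {dg:+.2f} eV, {binding_regime(dg)}")
```

A single descriptor of this kind is a screening heuristic; it does not account for kinetics, coverage effects, or stability under operating conditions.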

Alongside quantum mechanical insights, thermodynamic considerations are essential for improving materials used in SCs. SCs accumulate energy by electrostatic charge at the electrode–electrolyte interface, with their performance dictated by the thermodynamic relationship between electrode potential and ion concentration, as articulated by the Nernst equation.93 Recent advancements have focused on refining the porosity and surface chemistry of these materials to facilitate ion transport, therefore augmenting energy storage.94–96 Pseudocapacitive materials such as transition metal oxides and conducting polymers have garnered interest owing to their capacity to store charge through faradaic redox reactions, providing substantially greater capacitance than the conventional EDLC mechanism of typical carbon-based materials.97 The amalgamation of these pseudocapacitive materials with carbon-based electrodes has yielded hybrid SCs exhibiting enhanced energy and power densities, paving the way for developing more efficient energy storage solutions.
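
The Nernst relationship cited above, E = E0 − (RT/nF) ln Q, can be evaluated directly. The following sketch uses CODATA values for the gas and Faraday constants; the function name and the example inputs are illustrative assumptions.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1

def nernst_potential(e_standard, n_electrons, reaction_quotient, temperature=298.15):
    """Electrode potential in volts from E = E0 - (R*T / (n*F)) * ln(Q)."""
    return e_standard - (R * temperature) / (n_electrons * F) * math.log(reaction_quotient)

# Shift for a one-electron couple when the reaction quotient Q increases tenfold
# at 298.15 K (hypothetical E0 of 0 V, chosen so the result equals the shift itself).
shift = nernst_potential(0.0, 1, 10.0)
print(f"{shift * 1000:.2f} mV per decade")
```

The result, roughly −59 mV per tenfold change in Q for a one-electron process at room temperature, is the familiar Nernstian slope.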

Solid-state physics has greatly enhanced the development of materials for electrocatalysis and energy storage by manipulating their electrical conductivity and structural characteristics. Conductivity is a crucial determinant of electron transfer efficiency in electrochemical reactions, especially in water splitting, where efficient electron transport is vital for optimal catalyst performance. Metal oxides such as ruthenium oxide (RuO2) and iridium oxide (IrO2) are frequently employed as OER catalysts owing to their exceptional conductivity and stability in harsh electrochemical environments.98–101 However, at elevated anodic potentials, the dissolution of IrO2 may result in performance impairment (Fig. 2A).


Fig. 2 (A) Free-energy profile (ΔG) of Ir dissolution from the IrO2(110) surface under OER conditions. Reproduced with permission from ref. 98. Copyright 2022, the American Chemical Society. (B) Polarization curves, (C) Tafel plots, (D) Nyquist plots, and (E) chronopotentiometry measurements at 10 mA cm−2 of La-doped RuO2 and RuO2. Reproduced with permission from ref. 99. Copyright 2020, Elsevier.

Surface-bound intermediates, such as IrO2OH and IrO3, generated during the dissolution process markedly affect the catalytic activity, underscoring the intrinsic trade-offs between activity and stability. These observations highlight the significance of stability optimization in catalyst design. Doping procedures have been utilized to improve the conductivity and durability of catalysts in response to these problems. For instance, La-doping of RuO2 has demonstrated a reduction in charge transfer resistance, leading to enhanced reaction kinetics and prolonged performance under OER conditions (Fig. 2B–E). Likewise, the design of supercapacitor (SC) electrodes has benefited from solid-state principles, especially through the advancement of nanostructured materials, including nanowires, nanosheets, and nanorods. These structures enhance the surface area and ion accessibility, promoting accelerated charge/discharge rates and greater energy storage capacities. Furthermore, two-dimensional (2D) materials, such as graphene and molybdenum disulfide (MoS2), demonstrate distinctive electronic characteristics and high surface areas, making them versatile options for water electrolysis and SCs.102 Their structural flexibility enables functionalization, permitting the further optimization of their surface chemistry for improved electrochemical interactions. These integrated methodologies offer theoretical and empirical insights into the advancement of efficient and durable materials for energy applications.

3. ML-based electrode/electrocatalyst property analysis

3.1. Synergistic workflow integrating ML and DFT for accelerated high-throughput screening

DFT has been widely used to investigate the electronic structure, thermodynamic properties, and catalytic performance of electrodes and electrocatalysts. However, DFT calculations are often resource-intensive and time-consuming, particularly for large-scale material screening. This imposes significant limitations on the rapid discovery and optimization of novel electrocatalytic materials. In recent years, the integration of ML with DFT has emerged as a promising strategy, demonstrating excellent efficiency and predictive capability in applications such as HER, OER, and SCs.36,103

The primary advantage of combining ML with DFT lies in pairing the high accuracy of DFT computations with the fast inference and generalizability of ML models. The complete workflow for this strategy is illustrated in Fig. 3, which depicts the end-to-end process from DFT data generation and ML model training to iterative optimization through active learning. Researchers typically begin by generating reliable initial datasets using DFT, which include electronic structures, adsorption free energies, and reaction kinetic parameters.104,105 Based on this dataset, essential physical and chemical features are extracted, such as elemental composition, electronegativity, atomic radius, and coordination environment. These features are then used to train high-performance ML models, including random forest, support vector regression, and deep neural networks.106,107 These models capture complex nonlinear relationships between material descriptors and catalytic performance, enabling accurate predictions for a wide range of unexplored materials.2
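
The feature-extraction step described above can be sketched for one simple descriptor family: composition-weighted statistics of an elemental property. The example uses Pauling electronegativities and the Ni0.77Fe0.13La0.1 composition mentioned in the Introduction; the function name and the choice of statistics are assumptions for illustration, not the featurization used in the cited studies.

```python
# Pauling electronegativities for the elements used in this example.
ELECTRONEGATIVITY = {"Ni": 1.91, "Fe": 1.83, "La": 1.10}

def composition_features(composition, property_table):
    """Composition-weighted mean and range of an elemental property.

    `composition` maps element symbols to (not necessarily normalized) amounts.
    """
    total = sum(composition.values())
    fractions = {el: amount / total for el, amount in composition.items()}
    values = [property_table[el] for el in fractions]
    weighted_mean = sum(fractions[el] * property_table[el] for el in fractions)
    return {
        "mean": weighted_mean,
        "range": max(values) - min(values),
        "n_elements": len(fractions),
    }

features = composition_features({"Ni": 0.77, "Fe": 0.13, "La": 0.10}, ELECTRONEGATIVITY)
print(features)
```

Feature vectors of this kind, extended with further properties such as atomic radius or coordination number, form the input matrix on which regression models are trained.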


Fig. 3 Workflow for automating the discovery of theoretical materials using a combination of ML and DFT: (A) conventional experimental workflow based on testing and analysis; (B) automated DFT-based workflow incorporating scaling relationships and structure selection; (C) manual DFT screening process relying on expert intuition; and (D) ML-driven automated workflow including motif selection, structure generation, and candidate prediction within a closed-loop system. Reproduced with permission from ref. 36, copyright 2018, Springer Nature.

Subsequently, trained ML models are applied to predict the properties of large candidate material libraries. This enables the rapid identification of promising candidates, while significantly reducing the number of costly DFT computations required.108

The predicted top-performing candidates are further validated by DFT to ensure the reliability of the predictions. The new DFT data generated during this validation phase are used to update and retrain the ML models, establishing a closed-loop active learning framework that progressively enhances the model accuracy and generalization.109 In addition to computational improvements, experimental feedback is incorporated to validate ML predictions and synthesize promising candidates. Then, experimental results are used to refine both the DFT computations and ML models. This closed-loop interaction between theory and experiment substantially increases the success rate of materials discovery and accelerates advancements in electrocatalyst research.
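The closed loop described above (train on DFT labels, predict over a candidate pool, validate the best predictions, retrain) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `mock_dft_energy` is a hypothetical stand-in for a real DFT calculation, the two descriptors are arbitrary, and a simple k-nearest-neighbor average replaces the random forest or neural network models discussed in the text.

```python
import math
import random


def mock_dft_energy(descriptor):
    """Hypothetical stand-in for a DFT adsorption free energy (eV)."""
    d_band_center, electronegativity = descriptor
    return 0.6 * d_band_center + 0.3 * math.sin(3.0 * electronegativity)


def knn_predict(labelled, descriptor, k=3):
    """Toy surrogate model: mean energy of the k nearest labelled descriptors."""
    nearest = sorted(labelled, key=lambda item: math.dist(item[0], descriptor))[:k]
    return sum(energy for _, energy in nearest) / len(nearest)


def closed_loop_screening(pool, n_init=5, n_rounds=4, batch=3, seed=0):
    """Iteratively label the candidates predicted to lie closest to 0 eV."""
    rng = random.Random(seed)
    shuffled = list(pool)
    rng.shuffle(shuffled)
    # Initial "DFT" dataset.
    labelled = [(x, mock_dft_energy(x)) for x in shuffled[:n_init]]
    unlabelled = shuffled[n_init:]
    for _ in range(n_rounds):
        # Cheap surrogate prediction over the whole remaining pool.
        ranked = sorted(unlabelled, key=lambda x: abs(knn_predict(labelled, x)))
        chosen = ranked[:batch]
        # "Validate" the top predictions with the expensive oracle and retrain
        # (for k-NN, retraining is simply growing the labelled set).
        labelled.extend((x, mock_dft_energy(x)) for x in chosen)
        unlabelled = [x for x in unlabelled if x not in chosen]
    return labelled, unlabelled


# Hypothetical candidate pool: a grid over two descriptors.
pool = [(a / 10 - 1.0, b / 10) for a in range(0, 21, 2) for b in range(0, 11, 2)]
labelled, unlabelled = closed_loop_screening(pool)
print(len(labelled), "candidates evaluated;", len(unlabelled), "never sent to the oracle")
```

In this sketch only a small fraction of the pool is ever passed to the expensive oracle, which is the cost saving that the combined ML and DFT workflow is meant to provide.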

3.2. Structure/composition-oriented design and optimization

The design and optimization of the structure and composition of electrodes/electrocatalysts are crucial for improving the catalytic performance in water splitting reactions, specifically the OER and HER processes. The catalytic efficacy of NiFe-based electrocatalysts is significantly influenced by factors including thickness, surface dynamics, and atomic configuration. Research on NiFe catalysts indicates that an ideal thickness of approximately 3 nm creates a dynamic balance between the dissolution and redeposition of Fe species. This self-healing mechanism supports extended stability and improved OER performance in alkaline environments.110 Besides thickness, tuning structural defects such as oxygen vacancies markedly affects the electronic structure and adsorption energy of the intermediates, enhancing the catalytic pathways.111 Defect engineering solutions, such as incorporating multi-vacancy systems, are effective in enhancing the density of active sites and optimizing the conductivity, hence yielding higher catalytic performance. The integration of multiscale hollow structures into catalyst design has led to substantial progress in electrocatalysis. These materials, especially those featuring intricate hollow architectures, maximize the number of available active sites, while enhancing the mass transport and charge transfer efficiencies. Hollow structures with open channels and multi-shell configurations enhance the electrolyte transport and product desorption, leading to better reaction kinetics and stability.112 Additionally, defect-rich hollow structures derived from transition metal chalcogenides and oxides have demonstrated efficacy in reducing the activation barrier through the introduction of oxygen vacancies and strain-induced lattice distortions. The incorporation of heterojunctions and phase-engineered compositions in these hollow materials significantly improves their catalytic performance. For example, the addition of Mo to Ni-based heterostructures induces electron redistribution at the interface, thereby optimizing the free energy of the reaction intermediates and enhancing the HER and OER processes.113

ML methodologies have significantly contributed to revealing concealed relationships among structure, composition, and catalytic efficacy. By combining ML with DFT computations, researchers can efficiently evaluate an extensive array of candidate catalysts, predict their adsorption energies, and enhance their electronic architectures. ML algorithms have effectively discerned synergistic effects in multi-metal systems, wherein the optimum composition promotes the electronic conductivity and optimizes the intermediate adsorption energies.114,115 These studies expedite the discovery of high-performance catalysts and elucidate essential structural characteristics, including bond lengths, defect density, and ionization energies, that influence catalytic activity. The standard procedure for ML-guided catalyst tuning involves three main stages including data generation, model training, and feature importance analysis, as depicted in Fig. 4.114 The advancement of defect-engineered catalysts, especially those including oxygen and cation vacancies, has significantly enhanced catalyst optimization. Oxygen vacancies significantly reduce the reaction barrier by improving the adsorption of intermediates, as evidenced in systems such as CoFe oxides and NiFe-LDHs.116,117 These materials demonstrate enhanced electronic conductivity and tunable electron transport resulting from vacancy-induced modifications in their electronic structure. Multi-vacancy systems containing both cation and anion defects offer increased electrochemically active surface areas and diminished overpotentials, facilitating elevated current densities and prolonged stability under the reaction conditions. Analyses powered by ML have consistently predicted and validated these defect configurations, demonstrating their potential to further improve the catalytic performance.118


Fig. 4 ML approach for catalyst optimization: (A) workflow of the ML process including data generation, model training and testing, and feature analysis; (B) comparison between DFT-calculated Gmax values at η = 0.3 V and predictions from the best-performing GBR model after four-fold cross-validation; (C) feature importance analysis of gradient boosting regressor (GBR) model for Gmax at η = 0.3 V. Reproduced with permission from ref. 114 copyright 2024, Elsevier.

3.3. Screening of active sites

Contemporary investigations in sustainable electrochemical energy conversion and storage focus on elucidating the chemical reactivity, reaction selectivity, and charge storage capacity induced by electrochemical potentials. Owing to their superior computational efficiency and exceptional modeling capabilities, ML models can be employed for high-throughput material screening to enable the cost-effective evaluation of numerous candidate materials and identify those with optimal characteristics (e.g., predicting materials based on compositional and structural data).119,120 An inverse design approach is considered an effective alternative for managing the pool of possible materials, offering an appealing search framework for materials with desired properties. In this framework, the input and output of the ML model can be interchanged to predict the structural characteristics of materials with optimal attributes.121 Generative adversarial networks and autoencoder-based models are often utilized for inverse design strategies. In addition, ML can identify the catalytic reactive sites and describe the processes of the electrolysis reaction. Identifying the reactive sites of nanoparticles using traditional theoretical models is challenging due to the intricacies of atomic composition and the extensive quantum mechanical investigations needed. For example, a 10 nm particle may contain 200 000 bulk atoms and approximately 10 000 surface sites, requiring extensive theoretical calculations and identifications.

Appropriate ML algorithms and high-throughput screening of materials are essential for surmounting complexity hurdles and systematically exploring large search spaces. Lunger and colleagues introduced an ML approach to precisely predict the per-site characteristics of perovskite oxides for the OER, utilizing site-projected O 2p-band centers, Bader charges, and d-band centers, where faceting and elemental substitution were found to alter the local electronic structure.122 To achieve this objective, the authors developed a graph-based neural network model to investigate site-dependent characteristics, and subsequently predict the binding energies of the OER intermediates. They asserted that per-site descriptors throughout the dataset exhibited considerable variation based on surface coordination and composition, and could be inexpensively predicted from the structure. The site-specific features that exhibit a linear correlation with OER binding energies could be adopted to modulate OER energetics. Through a comparison of the OER energetics from the created dataset with prior studies, the authors demonstrated the potential to tailor per-site characteristics of the active site via their theoretical framework. A covalency competition model guided by a random forest algorithm was developed to predict highly efficient OER electrocatalysts. A dataset of more than 300 spinel oxides was evaluated for structural and elemental characteristics.123 Using this ML model, researchers identified a promising spinel, [Mn]T[Al0.5Mn1.5]OO4, as an exceptionally effective OER catalyst (Fig. 5).


Fig. 5 (A) Comparative experimental and calculated electrocatalytic activities of spinel oxides; (B) random forest algorithm predicted Max(DT, DO) against DFT-calculated Max(DT,DO). Inset B displays the deviation between both models. (C) random forest model screened Max(DT, DO) of M0.5 (M = Zn, Al, Li, and Cu)-substituted CoCo2O4, MnMn2O4, and FeFe2O4. Reproduced with permission from ref. 123, copyright 2020, Springer Nature.

Chen and coworkers employed three distinct ML methods to explore high-performing HER catalysts. Various parameters, including catalyst, additive, electrolyte type, and support, were analyzed to determine their impact on the overpotential. The data analysis revealed that Pt and Mo metals at a ratio of 0.5 were the most reactive constituents, whereas heteroatomic N, S and nickel foam served as the ideal non-metallic elements and support, respectively.124 Considering this, the ML model predicted the HER performance of N, S-doped Pt@Mo2S3 in alkaline electrolyte, demonstrating a minimal overpotential of 33 mV. A high-throughput screening of 2D transition-metal dichalcogenides (TMDs) was conducted to identify high-performance HER electrocatalysts, considering the thermodynamic stability between phases, vacancy formation energy, hydrogen adsorption energy, and zero bandgap. In comparison to Pt(111), monolayer VS2 and NiS2, transition metal ion vacancies in ZrTe2 and PdTe2, as well as chalcogenide ion vacancies in CrSe2, TiTe2, VSe2, and MnS2 exhibited superior HER catalytic activity.125 The ML model provided an attractive catalytic descriptor for quantitatively evaluating the HER activity of 2D TMDs based on their local electronegativity and valence electron number, indicating a novel approach for catalyst engineering and activity optimization.

Another strategy for creating high-performance catalysts is to optimize their active sites and geometric structures.48,53 However, catalytic material simulations and experiments generally rely on conjecture, which makes them expensive, time-consuming, and ineffective. In the investigation of boosting the catalytic performance through structural design, Ryan et al. constructed a deep neural network based on existing crystal structures, demonstrating its usefulness as an ML tool for examining huge crystallographic datasets.48 The model was applied to the Mn–Ge system after being trained on 51 723 binary and ternary crystal structure templates. As a result, 36 more compositions in the Li–Mn–Ge ternary system were discovered, together with four compositions with high-probability structures. This method facilitates the synthesis and discovery of novel materials, especially for complex multi-element systems.

Using only readily available intrinsic properties, a universal and interpretable descriptor model (called ARSC) was adopted to unify activity and selectivity prediction for multiple electrocatalytic reactions (i.e., O2/CO2/N2 reduction and O2 evolution reactions). This model decoupled the atomic property (A), reactant (R), synergistic (S), and coordination effects (C) on the d-band shape of dual-atom sites, verifying the importance of the d-orbital overlap degree in reactions at dual-atom sites. Instead of performing more than 50 000 high-throughput DFT computations, the authors could quickly identify highly active dual-atom sites for a variety of reactions and products owing to this descriptor. The universality of this model was supported by abundant documented research and subsequent experiments, and it predicted Co–Co/Ir–Qv3 to be the optimal bifunctional oxygen reduction reaction (ORR)/OER catalyst. As a result, the as-synthesized Co–Co/Ir–Qv3 experimentally achieved a low overpotential of 330/340 mV at 10 mA cm−2 for OER, in addition to a remarkable half-wave potential of 0.941/0.937 V for ORR.55

To date, the origins of the activity of single-atom catalysts (SACs) are still obscure, which makes their rational design problematic. In this case, Liu et al. applied supervised learning to examine synchrotron spectroscopic data and interpret the catalytic mechanism of graphene-supported Co SACs for HER.53 In line with DFT predictions, the ML model indicated that the active centers were the Co edge sites, which include zigzag and armchair topologies. Guided by these observations, the researchers created edge-rich Co SACs, which ultimately performed better than standard Pt/C catalysts at high current densities. This study highlights the potential of ML in catalyst characterization and structure–performance correlation analysis.

Additionally, Fe@MoS2 was proposed as a potential electrocatalyst candidate for nitric oxide reduction reaction by merging machine learning techniques with DFT, obtaining a low limiting potential of −0.52 V. Under local alloying circumstances, Fe@MoS2 exhibited high resistance to electrochemical corrosion with desirable thermodynamic stability. More significantly, by exploiting the inherent steric barrier of the defective MoS2 caused by the surrounding S atoms, competing complex electroreduction reaction pathways were directed in the anticipated direction. Using an effective random forest regression (RFR) model algorithm, the relationship between the atomic configurations and the macroscopic characteristics of the materials was evaluated.56

2D TM carbides and nitrides have emerged as potential HER electrocatalysts owing to their abundant active surface functional groups and tunable electrical conductivity. However, their single active site and poor reaction kinetics restrict their extensive usage in catalytic reactions. To tackle this issue, TM dopants can be introduced to change the catalytic characteristics and produce highly reactive compounds. Given this, a series of TM atom-doped Ti3CNO2 and Zr2HfCNO2 was subjected to high-throughput computations to examine the effects of the local structure and corresponding electronic structure modifications on their HER properties. Additionally, the site identification features were used to train a multisite prediction model that predicted the trend of the hydrogen adsorption Gibbs free energy (ΔGH*), reaching a final model accuracy of R2 = 0.97. The results demonstrated the role of Nb, Sc, Rh, W, Ti, and V dopant atoms in boosting the catalytic activity of M′2M′′CNO2, yielding ΔGH* < 0.2 eV for more than 38 M′2M′′CNO2 systems. Hence, this study proved the importance of dopant atoms in reinforcing the catalytic activity of M′2M′′CNO2.60
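A threshold screen of this kind is simple to express in code. The sketch below is illustrative only: the candidate names and ΔGH* values are hypothetical, and it assumes that the 0.2 eV criterion is applied to the absolute value of ΔGH*, as is common in HER screening where the ideal value is close to 0 eV.

```python
def screen_her_candidates(delta_g_h, threshold=0.2):
    """Return candidates with |ΔG_H*| below the threshold (eV).

    The result is sorted so that the value closest to 0 eV comes first.
    """
    hits = {name: g for name, g in delta_g_h.items() if abs(g) < threshold}
    return sorted(hits, key=lambda name: abs(hits[name]))


# Hypothetical ML-predicted ΔG_H* values (eV); not taken from the cited study.
predicted = {
    "dopant_A": -0.05,
    "dopant_B": 0.31,
    "dopant_C": 0.12,
    "dopant_D": -0.45,
}
shortlist = screen_her_candidates(predicted)
print(shortlist)
```

Sorting by the magnitude of ΔGH* places the candidates nearest the top of the HER volcano first, so the shortlist doubles as a priority order for follow-up DFT validation.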

Using computational algorithms, ML analysis for SCs estimates the performance of various compounds based on their inherent characteristics. The reversible redox reactions of pseudo-capacitive components enhance the specific capacitance and overall efficiency. Besides, heteroatomic dopants further boost the charge storage capability of carbonaceous framework electrodes by providing additional active sites through synergistic interactions. With the help of ML, Wang et al. identified oxygen (O)-rich active porous carbon electrodes for aqueous SCs using an artificial neural network (ANN) model. The overall percentages of nitrogen (N) and O dopants served as the compositional features in this investigation, while the surface areas of micropores and mesopores served as the structural parameters. The SC performance of N/O co-doped activated carbon-based electrodes in both 6 M KOH and 1 M H2SO4 electrolytes was compiled in the training database. The N/O co-doped activated carbon electrode in 1 M H2SO4 had a large capacitance, as predicted by the ANN. This capacitance was attributed to the combined contribution of a micropore surface area of 1502 m2 g−1, a mesopore surface area of 687 m2 g−1, 20 at% O doping, and 0.5 at% N doping. The high level of O doping in 1 M H2SO4 could considerably boost the specific capacitance, which was ascribed to the advantageous electrode surface wetting and regulated electronic conductivity.34

3.4. Data curation and challenges

In high-throughput electrocatalyst screening, machine learning models are pretrained on vast DFT-derived repositories such as Materials Project, AFLOW, OQMD, COD and OC22, whereas only a few hundred experimentally validated performance measurements are available. This imbalance results in systematic bias when translating predictions into real systems.126 Fig. 6 shows the closed-loop workflow, which collects and integrates raw data from DFT calculations and experiments and applies standardized preprocessing, including outlier removal, harmonization of exchange–correlation functionals such as PBE or RPBE and energy references, and normalization of experimental conditions such as temperature, electrolyte concentration, pH and scan rate, to map each sample into a unified feature space. After preprocessing, features are engineered to capture physical descriptors such as d-band center, adsorption energies and coordination numbers. Models are trained and validated using stratified k-fold cross-validation to ensure the splits represent material classes and performance ranges. Performance metrics and uncertainty estimates guide an iterative feedback loop. Regions with high uncertainty are identified, and new data points are targeted for additional DFT or experimental evaluation. This active-learning-driven augmentation reduces the overall prediction error by approximately 20% with only a 10% increase in data volume.127
Fig. 6 Closed-loop data-driven electrocatalyst design workflow. Reproduced with permission from ref. 127, copyright 2024, the Royal Society of Chemistry.
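The stratified k-fold splitting mentioned in this workflow can be sketched in a few lines. The implementation below is a simplified illustration written with the Python standard library rather than the routine used in the cited work, and the material-class labels are hypothetical. For a continuous target such as overpotential, the values would first have to be binned into ranges so that they can be treated as classes.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        # Deal the members of each class round-robin across the folds.
        for position, idx in enumerate(members):
            folds[position % k].append(idx)
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# Hypothetical, imbalanced set of material classes.
labels = ["oxide"] * 10 + ["sulfide"] * 5 + ["MXene"] * 5
splits = list(stratified_k_fold(labels, k=5))
for train, test in splits:
    print(len(train), [labels[i] for i in test])
```

Each test fold in this example contains two oxides, one sulfide and one MXene, so no fold is evaluated on a class mix that differs from the full dataset.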

Generative adversarial networks trained on the ICSD database expand the chemical search space by proposing new inorganic compositions; over 92% of generated structures satisfy validity checks, and more than 84% maintain charge neutrality, thereby enriching candidate libraries.128 Models pretrained on DFT data are refined with limited experimental measurements via transfer learning, while active learning algorithms identify the most uncertain predictions quantified by Monte Carlo Dropout or ensemble variance to guide additional experiments or high-level calculations. This targeted sampling strategy can reduce overall prediction error by approximately 20% with only a 10% increase in data volume.36 Robust model evaluation employs stratified k-fold cross-validation to preserve the proportions of material classes and performance ranges across folds. Class imbalance is addressed through SMOTE oversampling or cost-sensitive weighting, and descriptor reduction techniques such as LASSO regression or principal component analysis eliminate redundant or collinear features, thereby improving the interpretability and stability.129
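The ensemble-variance criterion for choosing which candidates to evaluate next can be illustrated as follows. This is a minimal sketch under stated assumptions: the candidate names and the per-member predictions are hypothetical, and in practice the predictions would come from independently trained models or from repeated Monte Carlo Dropout passes.

```python
from statistics import pvariance


def select_most_uncertain(ensemble_predictions, n_select=2):
    """Rank candidates by the variance of their ensemble predictions.

    ensemble_predictions maps a candidate name to the list of values
    predicted by the individual ensemble members.
    """
    variance = {name: pvariance(preds) for name, preds in ensemble_predictions.items()}
    ranked = sorted(variance, key=variance.get, reverse=True)
    return ranked[:n_select], variance


# Hypothetical adsorption-energy predictions (eV) from a three-member ensemble.
ensemble_predictions = {
    "cand_1": [0.10, 0.12, 0.11],
    "cand_2": [0.40, -0.10, 0.25],
    "cand_3": [-0.30, -0.28, -0.33],
    "cand_4": [0.05, 0.35, 0.20],
}
queried, variance = select_most_uncertain(ensemble_predictions)
print(queried)
```

The candidates on which the ensemble members disagree most are the ones sent for additional DFT or experimental evaluation, which is how a small increase in data volume can be concentrated where the model is least reliable.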

3.5. Study of stability

The growing interest in highly reactive nanostructured materials has prompted scientists to optimize their redox electrochemistry and surface properties. The differences between acidic and alkaline environments significantly affect the activity and stability of electrocatalysts. Under alkaline conditions, electrocatalysts demonstrate enhanced long-term durability, broadening the range of appropriate materials due to the reduced corrosiveness of alkaline media. This adaptability motivates the investigation of cost-effective electrocatalysts. ML models and structural prediction algorithms can rapidly evaluate forces and energies that may theoretically characterize the catalytic activity, durability, and structure of molecules.58 For example, O’Connor et al. described a statistical learning approach utilizing various shrinkage techniques (e.g., LASSO regression, ridge regression, and least-angle regression) to forecast the binding energy of a SAC, thereby identifying a subset of physical descriptors within a feature space and minimizing the discrepancy between predicted and actual values. The electronic structures of metal/oxide systems exhibiting both strong and weak metal–support interactions were investigated to elucidate the trends in these interactions. Enhanced precision in predicting binding energy was attained when metal–metal interactions were incorporated into descriptor matrices that involved features of both adatom and surface metal within the ≥2D descriptor sets. The results indicated that the physical characteristics of the metal and substrate, such as metal oxidation enthalpy, support reducibility, and metal–metal interaction enthalpy, were correlated with the interfacial bonding. The parameters affecting robust metal–support interactions were utilized alongside LASSO + ℓ0 to create a predictive model for evaluating metal/support combinations that yield strongly bound SACs.130 Wang et al. analyzed the HER electrocatalytic performance and stability of twenty-eight TM atoms on eight representative carbon frameworks (e.g., C2N, C3N4, N/C-coordination graphene, graphdiyne (GDY), phthalocyanine (Pc), covalent organic frameworks (COF), and metal–organic frameworks (MOF)), while studying the underlying structure–property relationship and the catalytic activity mechanisms of SACs (Fig. 7A). The ML technique indicated that the HER activities of these carbon-based SACs could be properly predicted by gradient boosting algorithms. Feature importance analysis showed that the HER activity was predominantly governed by the electronic configurations of the d orbitals and the geometric configuration surrounding the TM active center within the catalytic system.67 In this study, a gradient boosting decision tree (GBDT) framework was used to investigate the parameters influencing the HER activity, with the aim of improving catalyst stability. The curse of dimensionality caused by an excessive number of input features was mitigated using recursive feature elimination. The optimal GBDT regressor achieved a coefficient of determination (R2) of 0.87 and a mean absolute error (MAE) of 0.25 eV. MXene-related configurations are advantageous for electrochemical energy conversion and storage due to their extensive active surface area, adjustable electronic structure, and superior structural stability, which provide improved activity and utilization efficiency. The compositional flexibility of MXenes facilitates favorable property modification by surface functionalization, hybridization, and element substitution, giving ML a large design space to explore. Liang et al. documented the engineering of Ti-based 2D single-atom MXene catalysts (Fig. 7B).131 Artificial neural network, k-nearest neighbors (KNN), least absolute shrinkage and selection operator (LASSO), support vector regression (SVR), and Bayesian ML algorithms were trained on the data and reached comparable accuracy, with RMSE values of 0.01 and 0.22 eV for thermal stability and activity predictions, respectively. The authors were able to predict the ΔGH* and the cohesive energies per atom. Among the 21 most promising HER catalysts, the implemented algorithms identified 7 efficient electrocatalysts, Ti3C2I2–Ir, Ti3C2Br2–Cu, Ti3C2Br2–Pt, Ti3C2Cl2–Cu, Ti3C2Cl2–Pt, Ti3C2Se2–Au, and Ti3C2Te2–Nb, with HER activity superior to that of Pt, exhibiting both dynamic and thermal stability.
Fig. 7 (A) GBDT model. Reproduced with permission from ref. 67 copyright 2022, the Royal Society of Chemistry; (B) optimized atomic configuration of single-atom-anchored MXenes. Neither Cr nor Mn were investigated for single atom implanting while C was not studied for the surface termination. Reproduced with permission from ref. 131, copyright 2022, Wiley-VCH.
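The accuracy figures quoted for these models (R2, MAE, and RMSE) follow standard definitions, which the sketch below computes for a small set of hypothetical DFT reference values and ML predictions. The numbers are invented for illustration and are not taken from the cited studies.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and the coefficient of determination R2."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "R2": r2}


# Hypothetical ΔG_H* values (eV): DFT references and ML predictions.
dft = [-0.40, -0.10, 0.05, 0.30, 0.65]
ml = [-0.30, -0.15, 0.10, 0.20, 0.70]
metrics = regression_metrics(dft, ml)
print({name: round(value, 4) for name, value in metrics.items()})
```

MAE and RMSE carry the units of the target (here eV), whereas R2 is dimensionless, so an MAE of 0.25 eV should be judged against the 0.2 eV window typically used for HER screening rather than against R2 alone.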

Presently, ML approaches enable the construction of reliable surface Pourbaix diagrams for realistic nanoparticle sizes. Bang et al. developed a bond-type embedded crystal graph convolutional neural network trained on DFT adsorption-energy differences for O and OH on Pt nanoparticles up to 6525 atoms.132 Their BE-CGCNN model reproduced experimental surface-phase boundaries for Pt–O and Pt–OH coverages within 0.1 eV and delivered Pourbaix diagrams of real-scale nanoparticles at a fraction of the DFT cost. Data-driven kinetic prediction has also been applied in accelerated degradation testing. Wang et al. combined a mechanistic OER degradation model with Bayesian data assimilation of current–time data. By assimilating only the first 300 h of electrolysis, their framework accurately predicted the catalyst lifetime approaching 1000 h with less than 10% error, reducing a multi-thousand-hour test to a fraction of the time.133

Highly reactive oxide electrocatalysts typically demonstrate instability due to rapid ion exchange and defect formation. These materials may undergo dynamic compositional and structural alterations under severe operational conditions. Thus, balancing their performance and stability has emerged as a significant area of research. In this regard, Jeong et al. proposed the weighted mean of cation electronegativities to characterize the covalency of AB2O4 spinel oxides, confirming its role as a valuable shortcut for catalyst design and a descriptor for assessing stability, catalytic activity, and predicting reaction mechanisms (Fig. 8A–G).49 Compositions for which this descriptor exceeded 0.96 demonstrated good stability. Those with a value below this threshold showed partial breakdown, which severely impairs their performance. Conversely, compositions with a descriptor value close to 0.96 could fulfill ideal requirements, providing suitable structural flexibility to maintain their active sites and stability. The adsorbate evolution mechanism (AEM), which includes the reaction intermediates HO*, O*, and HOO*, was proposed for oxide electrocatalysts facilitating the OER in alkaline media through four sequential single-electron charge-transfer steps.
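The four single-electron steps of the AEM lend themselves to a short worked example. The sketch below follows the widely used computational hydrogen electrode convention, in which the theoretical overpotential is the largest step free energy divided by the elementary charge minus the 1.23 V equilibrium potential. The step free energies are hypothetical values chosen to sum to 4.92 eV and do not come from the cited work.

```python
# AEM steps in alkaline media (* denotes a surface site):
#   1: *    + OH-  ->  HO*              + e-
#   2: HO*  + OH-  ->  O*   + H2O       + e-
#   3: O*   + OH-  ->  HOO*             + e-
#   4: HOO* + OH-  ->  *    + O2 + H2O  + e-
EQUILIBRIUM_POTENTIAL_V = 1.23


def oer_overpotential(step_free_energies_ev):
    """Return (overpotential in V, index of the potential-limiting step).

    Each step transfers one electron, so a free energy in eV maps
    directly onto a potential in V.
    """
    if len(step_free_energies_ev) != 4:
        raise ValueError("the AEM has exactly four single-electron steps")
    limiting = max(step_free_energies_ev)
    step_index = step_free_energies_ev.index(limiting)
    return limiting - EQUILIBRIUM_POTENTIAL_V, step_index


ideal = [1.23, 1.23, 1.23, 1.23]    # thermodynamically ideal catalyst
example = [0.90, 1.75, 1.40, 0.87]  # hypothetical values, sum = 4.92 eV
eta_ideal, _ = oer_overpotential(ideal)
eta_example, limiting_step = oer_overpotential(example)
print(round(eta_example, 2), "V, limited by step", limiting_step + 1)
```

In the hypothetical example the second step (HO* to O*) is potential-limiting, giving an overpotential of 0.52 V, whereas the ideal catalyst with four equal steps has zero theoretical overpotential.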


Fig. 8 (A) Predicted stability and catalytic activity of spinel oxide electrocatalysts versus the weighted mean of cation electronegativities; (B)–(D) covalency degree expressed by the oxygen 2p-band center hybridization, band center of cation valence orbitals, and polyhedral distortion, illustrating the stability of spinel oxide catalysts with an increase in the weighted mean of cation electronegativities; (E)–(G) proposed OER mechanisms in alkaline medium with an increase in the weighted mean of cation electronegativities. Reproduced with permission from ref. 49, copyright 2024, Wiley-VCH.

3.6. Thermodynamic and reaction kinetics

The application of ML in electrocatalysis research has emerged as a revolutionary method, significantly tuning the design and optimization of catalysts. ML has shown considerable promise in predicting thermodynamic parameters and reaction kinetics, which are essential for improving the performance and efficiency of catalysts. Conventional techniques, chiefly DFT calculations, have underpinned electrocatalysis research for decades. However, although DFT calculations yield precise information on critical parameters, including adsorption energies, reaction enthalpies, and activation energies, their computational intensity and time requirements constrain the effective exploration of vast material domains. This constraint has propelled the utilization of ML techniques, which exploit extensive datasets to enhance predictive speed and achieve accuracy levels akin to DFT.134 Thermodynamic characteristics, including adsorption energies, are crucial in assessing the activity and stability of electrocatalysts. Conventionally, these characteristics are derived using DFT calculations, which, although dependable, are computationally intensive. ML provides an effective alternative by facilitating quick predictions without compromising precision. Researchers have successfully generalized across material spaces and predicted thermodynamic parameters in a fraction of the time by training ML models on datasets obtained from DFT computations.135 Zhang et al.136 established a thermodynamic-based framework for screening and designing HER catalysts by optimizing the critical thermodynamic parameter ΔG*H and utilizing the advanced computational capabilities of ML and DFT. This methodology offers a theoretical foundation and technological assistance for identifying efficient and economical HER catalysts. 
Similarly, Lu et al.68 suggested a data-driven approach (CROSST) to forecast the HER performance of TM-anchored bis-TM carbon nitride (TM-M′2M′′CNO2) electrocatalysts by integrating feature engineering-generated descriptors into a convolutional neural network (CNN). The authors established MXenes feature datasets comprised of TM-M′2M′′CNO2 and M′2M′′CNO2 and trained interpretable ML models. According to the obtained results, Mo and Ni displayed notable moderating effects, which enhanced ΔGH < −0.2 eV for M′2M′′CNO2-based catalysts. Stabilizing Ru and Rh species on the C and N sides of M′2M′′CNO2 platforms enabled more feasible HER catalysts. By clustering between the TM and the three M′s adjacent to the TM, the TM anchoring modified the M′2M′′CNO2 electron distribution. It boosted the HER performance by increasing the electrons in the O in the I site and decreasing the bonding orbital occupancy of O and H. The symbolic transfer CNN model adeptly managed intricate features, mitigated human influence, and transcended the constraints of the CNN algorithm on limited datasets through the integration of symbol transformation and feature engineering. The HER catalytic performance was predicted with high accuracy (train/test R2 = 0.9397/0.9320), offering a trustworthy tool for material design. Andersen and Reuter137 emphasized the efficacy of ensemble approaches in the examination of transition metal oxides and sulfides for catalyzing the OER process. These studies underscore the pivotal role of ML in predicting thermodynamic parameters, allowing researchers to effectively discover stable and active electrocatalysts.

Reaction kinetics, which include the reaction rates and rate constants, are essential for optimizing electrocatalytic processes. Conventional methods of kinetic modeling typically necessitate comprehensive experimental datasets and depend on microkinetic models that demand considerable manual intervention. These techniques are constrained in their capacity to consider the intricate interdependencies of environmental parameters, including surface charge distribution, pH, and electric fields, which influence catalyst activity.138 ML mitigates these issues by automating the prediction of the reaction kinetics via data-driven methodologies. Yue et al.139 conducted first-principles calculations to examine the influence of defect charges on the electrocatalytic performance of transition metal (TM = Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Ru, Rh, Pd, Ag) single atoms coupled with a PtSe2 monolayer (TM@PtSe2), identifying the rate-determining steps of OER and ORR (Fig. 9). The findings showed that Pt-rich environments could strengthen the confinement of TM atoms on PtSe2 and confirmed the stability of 29 types of TM@PtSe2 in various charge states. The Pd˙@PtSe2 (ηOER/ORR = 0.31/0.43 V) and Pd×@PtSe2 (ηOER/ORR = 0.36/0.74 V) configurations demonstrated superior electrocatalytic performance owing to their low overpotentials, low formation energies, and stable structures across a wide Fermi level range. The charge states of TM@PtSe2 significantly influenced the establishment of bifunctional OER/ORR catalysts with lower overpotentials by optimizing the interaction strength between the TM@PtSe2 catalysts and oxygenated intermediates. The ML models elucidated the origin of activity in OER/ORR processes. Primary descriptors such as ionization energy, electronegativity, number of TM d-electrons, d-band center, and electron affinity were utilized to assess the adsorption behavior. This highlights the impact of defect charges on electrochemical processes, offering a theoretical framework for exploring efficient bifunctional OER/ORR electrocatalysts.


image file: d5qm00326a-f9.tif
Fig. 9 Reaction mechanisms of OER and ORR on transition metal-based catalysts. Reproduced with permission from ref. 139, copyright 2024, the Royal Society of Chemistry.

Xu et al.140 highlighted the utilization of ML techniques to simulate reaction rate constants for HER and OER processes. Their research illustrated how ML algorithms can capture variations in catalyst performance across different environmental conditions, offering theoretical insights for catalyst improvement. Moreover, ML methods have been employed to automate the enumeration of reaction pathways, thereby considerably reducing the dependence on human intuition. The application of ML in reaction kinetics extends beyond the mere prediction of rate constants. Graph neural networks (GNNs) have been employed to model the interactions between reactants and the catalyst surface, considering structural and electronic aspects that affect the reaction kinetics.141 These models have demonstrated significant efficacy in HER and OER investigations, where surface morphology and electronic environments are critical for catalytic performance.

3.7. Electronic structure and bandgap

The electronic structure and bandgap of materials play a crucial role in determining their electrochemical capabilities in the domains of electrocatalysis and SCs. The bandgap and electronic structure of electrocatalysts employed in water electrolysis, namely for the HER and OER, directly affect the catalytic activity and reaction kinetics.142 Materials with a reduced bandgap typically promote electron transport, thereby improving the catalytic reactivity. The electronic structure in SCs dictates their charge storage capacity and the rate of charge/discharge cycles. Consequently, understanding and enhancing these electronic properties are critical for the development of high-performance electrocatalysts and SC materials. The examination of electronic structures has conventionally depended on quantum mechanical computations such as DFT, which offers comprehensive insights into material properties, including the bandgap, density of states (DOS), and orbital interactions. Nonetheless, although DFT is exceptionally precise, it is computationally intensive and time-consuming, particularly in the context of large-scale materials screening. Conversely, ML methods have emerged as potent alternatives, proficiently predicting the electronic structure and bandgap of unknown materials by using existing datasets, thereby overcoming some limitations of traditional techniques.

ML techniques such as neural networks, support vector machines (SVMs), random forests, and graph neural networks (GNNs) are powerful tools for predicting electronic properties from extensive datasets. Among them, GNNs are particularly adept at modeling intricate atomic interactions and electronic characteristics (e.g., bandgap and DOS). Recent findings indicate that oxygen functional groups (OGs) markedly improve the catalytic efficiency of M–N4–C SACs for HER and OER. Specifically, incorporating functional groups such as OH, COOH, and C–O–C into M–N4 structures (e.g., Fe or Co centers) modifies essential features such as the d-band center and electron affinity, which are vital for promoting the catalytic activity (Fig. 10A and B).143 The integration of ML with DFT allows an in-depth examination of these effects, yielding significant insights into HER/OER activity and aiding in the development of high-performance catalysts. Moreover, manipulating metal centers and substituents together with doping 5d transition metals into graphitic carbon nitride underscores the significance of the d-band center, M–O/M–H interactions, charge states, and defect formation energies in enhancing bifunctional OER/ORR catalysts (Fig. 10C and D).144 Likewise, combining DFT and ML for TM-doped C3B monolayers has proven effective in identifying key descriptors such as the number of d electrons, electronegativity, first ionization energy, and atomic radius, which are vital for enhancing the HER and OER activities (Fig. 10E and F).145 By doping 3d metals (Ti, V, Cr, Mn, Fe, Co, Ni, Cu, and Zn), 4d metals (Zr, Nb, Mo, Ru, Rh, Pd, and Ag), and 5d metals (Ta, W, Re, Os, Ir, Pt, and Au) with diverse structural and electronic properties in a C3B monolayer structure, numerous metal boride catalysts were created and their possible use for HER/OER further investigated. Fe-, Ag-, Re-, and Ir-doped C3B showed outstanding HER performances with their ΔG*H value approaching 0.00 eV. 
Meanwhile, the Ni- and Pt-doped C3B exhibited remarkable OER activities, with overpotentials of less than 0.44 V, among the TM atoms considered. Combined with their low overpotentials for the HER process, Ni/C3B and Pt/C3B were suggested as possible bifunctional electrocatalysts for water splitting. These achievements demonstrate the combined potential of ML and DFT in accelerating catalyst development and discovery.


image file: d5qm00326a-f10.tif
Fig. 10 Integration of ML and DFT in catalyst design. (A) and (B) Impact of oxygen functional groups (OH, COOH, and C–O–C) on the d-band center and catalytic performance of M–N4–C catalysts. Reproduced with permission from ref. 143 copyright 2024, the Royal Society of Chemistry. (C) and (D) Effect of 5d transition metal doping on OER and ORR overpotentials and key electronic properties. Reproduced with permission from ref. 144. Copyright 2024, Elsevier. (E) and (F) Key descriptors influencing HER/OER activities in TM-doped C3B monolayers identified through ML and DFT. Reproduced with permission from ref. 145 copyright 2023, the American Chemical Society.
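The d-band center that recurs as a descriptor in these studies is commonly defined as the first moment of the d-projected DOS, εd = ∫E ρd(E) dE / ∫ρd(E) dE. The following minimal sketch evaluates this expression numerically. The function name, the integration window (the whole supplied energy grid), and the Gaussian DOS are illustrative assumptions and are not taken from the cited works.

```python
import math

def d_band_center(energies, dos):
    """First moment of a d-projected DOS, integrated with the trapezoidal rule.

    energies: energies relative to the Fermi level (eV), in ascending order
    dos:      d-projected density of states sampled on the same grid
    """
    if len(energies) != len(dos) or len(energies) < 2:
        raise ValueError("need matching energy and DOS lists with at least two points")
    numerator = 0.0
    denominator = 0.0
    for i in range(len(energies) - 1):
        width = energies[i + 1] - energies[i]
        numerator += 0.5 * width * (energies[i] * dos[i] + energies[i + 1] * dos[i + 1])
        denominator += 0.5 * width * (dos[i] + dos[i + 1])
    return numerator / denominator

# Synthetic DOS (assumption): a Gaussian centred 2 eV below the Fermi level.
energies = [-10.0 + 0.01 * i for i in range(1601)]  # -10 eV to +6 eV
dos = [math.exp(-((e + 2.0) ** 2) / 2.0) for e in energies]

eps_d = d_band_center(energies, dos)
print(f"d-band center = {eps_d:.3f} eV relative to the Fermi level")
```

In practice the DOS would come from a DFT calculation, and the choice of integration window (entire d-band or occupied states only) should match the convention of the dataset used for training.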

Materials databases such as OQMD,146 Materials Project,147 and AFLOWlib148 are essential resources that offer extensive information on the electronic structures of thousands of materials. When combined with ML algorithms, these databases enable the identification of materials with optimal electronic structures for specific applications, such as strengthening the catalytic properties for water splitting and improving the performance of SCs. For example, studies on dual transition metal Janus-MXenes have shown that ML-assisted models can accurately determine key factors influencing their ORR and OER performance, such as cohesive energy, phonon dispersion, and electronic stability. This method not only reduces research costs but also accelerates the discovery of high-performance catalysts.149 The application of ML in materials discovery has advanced the boundaries of materials innovation, enabling the rapid development of more efficient and sustainable energy technologies. Bandgap prediction and optimization constitute a significant domain for ML in materials design. The bandgap not only dictates the conductivity of a material but is also intrinsically linked to its catalytic efficiency and energy storage capability. By employing ML models, researchers can rapidly predict the bandgap based on attributes such as atomic composition, crystal structure, and electron density. This approach has proven to be a robust alternative to conventional DFT-based techniques, which frequently encounter challenges such as high computational cost and underestimation of the bandgap. Priyanga et al.150 emphasized the capability of random forest models in predicting the nature of band gaps in perovskite oxides (ABO3) based on chemical composition, ionic radius, ionic character, and electronegativity. A Random Forest algorithm predicted the relationship between the bandgap characteristics and the composition of the perovskite oxide, achieving an accuracy of approximately 91%. 
The confusion matrices produced for various random states (Fig. 11A–E) demonstrated the reliability of this model in differentiating between direct and indirect band gaps, offering a robust framework for understanding the bandgap properties in complex oxide systems. ABO3 compounds containing alkali metals such as Li, Na, K, and Rb showed the greatest likelihood of possessing a direct bandgap, regardless of whether these elements occupy the A or B site. The incorporation of alkaline earth elements such as Be and Mg, or d metals such as Sc and Fe, also significantly increased the predicted likelihood of a direct bandgap. According to the presented ML approach, NaPuO3 and VPbO3 materials were identified as promising candidates for solar cells owing to their optimized band gaps.
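For readers unfamiliar with confusion matrices, the sketch below shows how one is tallied for a two-class direct/indirect bandgap problem and how the overall accuracy follows from it. The labels and predictions are invented for illustration and do not reproduce the data of ref. 150.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows index the true class, columns the predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, prediction in zip(y_true, y_pred):
        matrix[index[truth]][index[prediction]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

labels = ["direct", "indirect"]
# Hypothetical test-set labels and classifier outputs (assumption).
y_true = ["direct", "direct", "indirect", "indirect",
          "direct", "indirect", "indirect", "direct"]
y_pred = ["direct", "indirect", "indirect", "indirect",
          "direct", "indirect", "direct", "direct"]

cm = confusion_matrix(y_true, y_pred, labels)
print(cm, accuracy(cm))
```

Repeating this evaluation over several random train/test splits, as in Fig. 11A–E, indicates how sensitive the reported accuracy is to the particular split.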


image file: d5qm00326a-f11.tif
Fig. 11 Comprehensive analysis of ML models for bandgap prediction. (A)–(E) Confusion matrices of random forest for perovskite oxides. Reproduced with permission from ref. 150 copyright 2022, Elsevier. (F) and (G) SVR model performance on bandgap predictions of inorganic solids. Reproduced with permission from ref. 151 copyright 2018, the American Chemical Society. (H)–(O) GPR-based bandgap prediction of functionalized MXenes. Reproduced with permission from ref. 70 copyright 2023, the American Chemical Society.

Zhuo et al.151 demonstrated that SVR, applied after an initial classification of metals and nonmetals, can yield predictions markedly closer to experimental values than traditional DFT calculations. Their training set comprised 3896 experimentally reported band gaps covering 2458 distinct compositions, derived from diffuse reflectance, resistivity, surface photovoltage, photoconduction, and UV-vis measurements. Their research revealed a robust correlation between ML-predicted bandgaps and both experimental and DFT-calculated values, highlighting the effectiveness of ML in bandgap prediction (Fig. 11F and G). The SVR-based ML approach could forecast the bandgap utilizing a descriptor set based on the elemental properties of constituent components, which pertained to the relative location of the atom in the periodic table, its electronic structure, and its physical characteristics. These results underscore how the SVR model can overcome the conventional limitations of DFT, such as the underestimation of bandgap values. Building upon these advancements, Rajan et al.70 applied kernel ridge regression (KRR), SVR, GPR, and bootstrap aggregating regression algorithms to functionalized MXenes. A database with computed optimum structural and electronic attributes was established for this purpose. In addition, a metal-semiconductor classification model with 94% accuracy was designed. Readily accessible parameters of MXenes, including boiling and melting points, atomic radii, phases, and bond lengths, served as input features, attaining GW-level precision in bandgap predictions. In particular, the GPR model forecasted the bandgap within seconds with the lowest RMSE value of 0.14 eV. The models were more accessible given that they did not require prior knowledge of the Perdew–Burke–Ernzerhof bandgap and the positions of conduction band minima and valence band maxima as predictive variables. 
The mean boiling point and standard deviation of melting point demonstrated a robust positive correlation with the GW bandgap, revealing the influence of constituent elements. This work effectively leveraged primary and compound features, as illustrated in scatter plots comparing the ML-predicted bandgaps with the true GW values. Consequently, these results underscore the accuracy and reliability of GPR in tackling the challenges of bandgap prediction across diverse material systems (Fig. 11H–O).
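The error metrics quoted throughout this section (MAE, RMSE, and R2) can be computed from paired reference and predicted values as sketched below. The bandgap values are placeholders chosen for easy arithmetic, not results from refs. 70 or 151.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and the coefficient of determination R2."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {"mae": mae, "rmse": rmse, "r2": 1.0 - ss_res / ss_tot}

# Hypothetical reference and ML-predicted bandgaps in eV (assumption).
reference_gap = [0.5, 1.0, 1.5, 2.0, 2.5]
predicted_gap = [0.6, 0.9, 1.5, 2.2, 2.4]

metrics = regression_metrics(reference_gap, predicted_gap)
print(metrics)
```

Because RMSE weights large errors more heavily than MAE, reporting both gives a fuller picture of how a model behaves on outlying compositions.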

3.8. Adsorption energy

Adsorption energy is a vital parameter in assessing the activity and selectivity of electrocatalysts in HER and OER, as well as in several electrochemical processes related to SCs. The adsorption of key intermediates such as hydrogen or oxygen species is crucial in regulating the reaction kinetics and overall catalytic efficiency. However, precise calculation or experimental assessment of adsorption energies can be resource-intensive. ML approaches have emerged as potent tools to tackle these difficulties by delivering efficient and accurate predictions of adsorption energies for various materials and surface configurations. ML algorithms such as ANNs, GPR, and GNNs predict adsorption energies by discerning correlations between catalyst attributes and adsorption properties, thereby expediting the screening process and minimizing the need for exhaustive DFT calculations. The combination of DFT calculations with ML models has demonstrated efficacy in optimizing the adsorption energies for various catalytic materials. Research on transition metal borides (MBenes) has shown that ML can effectively predict their adsorption properties by leveraging large datasets derived from DFT calculations. The investigation of the d-band center combined with ML-derived scaling relations between adsorption energies has proven crucial in optimizing key intermediates such as *OH, *O, and *OOH, thereby enhancing their catalytic performance.152 In the case of MBenes, ΔGOH* has been identified as a reliable descriptor for overpotentials (ηOER and ηORR), with the optimal range for Ni-based MBenes found to be 1.5–1.7 eV (Fig. 12A and B). Strong linear scaling relationships were established, namely ΔGO* = 1.82ΔGOH* + 0.15 and ΔGOOH* = 0.82ΔGOH* + 2.8 (R2 = 0.98), which allow the overpotential to be minimized by reducing the competition between *O and *OOH adsorption (Fig. 12C). 
Additionally, ΔGOH* was found to correlate linearly with the number of transferred charges (Ne), described as ΔGOH* = 2.61Ne + 2.54 (R2 = 0.70) (Fig. 12D). The d-band centers of different MBenes were computed (Fig. 12E), revealing considerable variance with increased d-orbital occupancy, wherein deeper d-band centers correlated with diminished intermediate binding strengths. The analyzed d-band center of M2B2 and M3B4-type MBenes exhibited a linear correlation with ΔGOH* (R2 = 0.97), indicating that increased d-electron occupancy led to a decrease in the intermediate binding strengths and lower overpotentials (Fig. 12F). Moreover, the combination of ML and DFT in the study of single-atom catalysts (SACs) supported on phosphorus carbide (PC3) revealed essential descriptors such as the first ionization energy and bond length, highlighting the complex interplay between adsorption behavior and catalyst architecture.153
image file: d5qm00326a-f12.tif
Fig. 12 Relationship between ΔGOH* and overpotential for (A) OER and (B) ORR and (C) correlation among ΔGOH*, ΔGO*, and ΔGOOH*; (D) linear relationship between charge transfer (Ne) and ΔGOH*; (E) calculated d-band centers of different MBenes and (F) scaling relationship between εd and ΔGOH*. Reproduced with permission from ref. 152. Copyright 2023, Elsevier.
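Linear relations of the kind shown in Fig. 12D and F are typically obtained by ordinary least-squares fitting. The sketch below fits a straight line and reports R2. The data points are synthetic, placed close to the ΔGOH* versus Ne relation quoted in the text purely for illustration, and are not the values of ref. 152.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, with R2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Synthetic charge-transfer values and small scatter (assumptions).
charge_transfer = [-0.6, -0.5, -0.4, -0.3, -0.2]
scatter = [0.05, -0.05, 0.0, 0.05, -0.05]
dg_oh = [2.61 * ne + 2.54 + s for ne, s in zip(charge_transfer, scatter)]

slope, intercept, r2 = linear_fit(charge_transfer, dg_oh)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R2 = {r2:.3f}")
```

A fitted relation with a modest R2, such as the 0.70 reported for Ne, is better treated as a trend indicator than as a quantitative predictor.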

Interpretable ML models offer further insights into variables affecting adsorption processes. Techniques such as SHapley Additive exPlanations (SHAP) and feature importance analysis help identify key features such as the d-band center and number of d-orbital electrons, which are essential for comprehending and enhancing the catalytic activity. Liu et al.154 established that these descriptors are crucial for elucidating the adsorption behavior of bifunctional electrocatalysts, thereby offering a theoretical basis for the design of high-performance catalysts. This understanding has enabled the rational design of novel materials with tailored adsorption properties, hence improving the catalytic efficiency. The integration of ML models with high-throughput testing and autonomous laboratories has significantly accelerated catalyst discovery. Through the use of a closed-loop optimization strategy, ML predictions are validated and progressively refined via experimental feedback, thereby considerably decreasing development time. The synergy between ML, DFT, and experimental methods establishes a robust framework for optimizing the adsorption energies and enhancing the performance of electrocatalysts. The advances in ML-assisted DFT simulations for evaluating adsorption energies in bifunctional catalysts highlight the importance of adopting a hybrid methodology for catalyst improvement, bridging computational predictions with experimental implementations.155
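To make the idea behind SHAP concrete, the sketch below computes exact Shapley values for a toy three-descriptor model by averaging each feature's marginal contribution over all feature orderings, measured against a single baseline point. The model, descriptor names, and numbers are hypothetical, and practical SHAP implementations use approximations rather than this brute-force enumeration.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at point x relative to a baseline point.

    Features are switched from their baseline value to their value in x one at
    a time, and the resulting output changes are averaged over all orderings.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous_output = model(current)
        for name in order:
            current[name] = x[name]
            output = model(current)
            totals[name] += output - previous_output
            previous_output = output
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_model(f):
    """Hypothetical surrogate for an adsorption energy (assumption)."""
    return (0.5 * f["d_band_center"] + 0.1 * f["d_electrons"]
            + 0.2 * f["d_band_center"] * f["electronegativity"])

x = {"d_band_center": -1.0, "d_electrons": 8.0, "electronegativity": 2.0}
baseline = {"d_band_center": -2.0, "d_electrons": 5.0, "electronegativity": 1.5}

phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction at x and at the baseline, which is the property that makes Shapley-based attributions straightforward to interpret.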

4. ML applications in sustainable energy technologies

4.1. ML descriptors for HER electrocatalysts

The design and selection of descriptors are essential for forecasting the catalytic activity of HER electrocatalysts. Numerous descriptors, including hydrogen adsorption free energy (ΔGH*), Bader charge analysis, and electronic structural properties, have been extensively used in research integrating ML and DFT.156–158 These descriptors facilitate the understanding and prediction of the activity, selectivity, and stability of catalysts, which is essential for accelerating the advancement of innovative catalysts.159 The advancement of descriptors in single-atom catalysts (SAC) and 2D materials such as MXenes, MBenes, and graphene has greatly boosted the efficiency of material screening and performance optimization.160 Recent studies have underscored the essential role of electrostatic interactions in HER kinetics, which conventional descriptors such as ΔGH* do not adequately represent.161 The intricacy of coordination environments in nanocluster electrocatalysts complicates the selection and development of descriptors, prompting further research into novel descriptors.162 Descriptors are essential factors that describe the physicochemical properties of electrocatalysts, indicating their activity, stability, and selectivity. Descriptors may be categorized into structural, electronic, and surface properties. For example, ΔGH* is commonly employed to characterize the adsorption characteristics of hydrogen in the HER process. Ideally, ΔGH* should approach zero to ensure that the catalyst preserves an ideal equilibrium between hydrogen adsorption/desorption.157 Descriptors of electronic structure, including d-band center and Fermi level, are frequently employed to predict the catalytic activity. 
These characteristics elucidate the electron occupancy within the catalyst and its interaction with reactants.159 Recent investigations indicate that the strength of hydrogen bonding in bimetallic MXenes is highly correlated with the initial state of the outer metal–M′–O bond, hence broadening the range of descriptors for HER applications.163 ML models have demonstrated that in supported heteroatom-doped metal compounds, the type of electrolyte, catalyst shape, and the metal-to-nonmetal ratio are critical factors influencing the HER performance.164

ΔGH* is a crucial descriptor for evaluating the activity of HER electrocatalysts, given that it indicates the binding strength of hydrogen on the catalyst surface, a vital element for optimal catalytic performance. Research has consistently demonstrated that an ideal ΔGH* value close to zero balances hydrogen adsorption and desorption. The correlation between ΔGH* and HER activity in transition metal-doped metal phosphides (Fig. 13A and B) exemplifies this principle: catalysts with ΔGH* values approaching zero demonstrate an enhanced HER performance by optimizing the adsorption/desorption processes.165 The same trend is observed for TM-doped MXenes (Fig. 13C).158 In transition metal atom-intercalated g-C3N4/TMD heterostructures, DFT-calculated ΔGH* has been employed to identify excellent hydrogen adsorption sites. Jyothirmai et al.61 used RFR to predict ΔGH* with a low MAE of 0.118 eV and a high R2 of 0.957, markedly improving the accuracy and efficiency of HER activity prediction (Fig. 13D–F). By introducing ML-driven high-throughput screening of TM atom-intercalated g-C3N4/MX2 (M = Mo, W; X = S, Se, Te), this study demonstrated the substantial influence of chalcogenide choice and electron configurations on the stability of heterostructures and the interaction properties of substrates. The essential parameters affecting the HER activity were clarified by the SHAP technique, including hydrogen adsorption on the C site, MX layer, S site, and the intercalation of TM atoms at the N site. 
Important information for strategic catalyst design and optimization was provided by the ML model, which revealed that the hydrogen adsorption energies on the N site of the CN layer resulted in exceptional HER performances with high exchange current densities, especially in Sc and Ti-intercalated heterostructures.


image file: d5qm00326a-f13.tif
Fig. 13 Role of hydrogen adsorption free energy (ΔGH*) in HER electrocatalyst performance evaluation: (A) and (B) ΔGH* as a universal descriptor for HER activity in TM-doped metal phosphides. Reproduced with permission from ref. 165 copyright 2023, Elsevier; (C) trends of ΔGH* in TM-doped MXenes. Reproduced with permission from ref. 158 copyright 2023, Elsevier; (D)–(F) RFR-based ΔGH* predictions in TM-intercalated g-C3N4/TMD heterostructures. Reproduced with permission from ref. 61 copyright 2024, the American Chemical Society.
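A descriptor-based HER screen of this type can be reduced to a simple filter: convert each hydrogen adsorption energy to ΔGH* and keep the candidates closest to zero. The sketch below assumes the widely used approximation ΔGH* ≈ ΔEH + 0.24 eV for the zero-point and entropy corrections and a hypothetical acceptance window of ±0.2 eV. The candidate names and energies are placeholders, not data from the cited studies.

```python
# Common approximation for the zero-point-energy and entropy terms at room
# temperature; treated here as an assumption, since individual studies may
# compute these corrections explicitly.
ZPE_ENTROPY_CORRECTION_EV = 0.24

def delta_g_h(delta_e_h):
    """Hydrogen adsorption free energy (eV) from the adsorption energy (eV)."""
    return delta_e_h + ZPE_ENTROPY_CORRECTION_EV

def screen_candidates(adsorption_energies, window=0.2):
    """Keep candidates with |dG_H*| <= window, ranked by closeness to zero."""
    kept = []
    for name, delta_e in adsorption_energies.items():
        dg = delta_g_h(delta_e)
        if abs(dg) <= window:
            kept.append((name, dg))
    kept.sort(key=lambda item: abs(item[1]))
    return kept

# Hypothetical adsorption energies in eV (assumption).
candidates = {"cand_A": -0.30, "cand_B": -0.70, "cand_C": 0.10, "cand_D": -0.22}

ranked = screen_candidates(candidates)
for name, dg in ranked:
    print(f"{name}: dG_H* = {dg:+.2f} eV")
```

In an ML-assisted workflow, the adsorption energies fed to such a filter would be model predictions, with DFT reserved for confirming the shortlisted candidates.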

Alloy electrocatalysts can modulate the adsorption Gibbs free energy of hydrogen to efficiently lower the HER overpotential. The combination of various metallic elements modifies the density of d electronic states, thus attaining an almost ideal ΔGH* value, which is necessary for effective proton adsorption and hydrogen desorption. Catalytic activity improves and energy consumption diminishes when the HER overpotential is decreased. The active site availability and intrinsic activity are further enhanced by the synergistic effects (e.g., strain effects and electronic structure modulation) of alloy components. In this context, Zhou and coworkers established a high-throughput pathway to evaluate the adsorption energy of a catalyst surface and predict its final configuration, identifying 43 appropriate alloys as possible HER electrocatalysts. This can significantly expedite the identification of high-performance HER electrocatalysts (about 100 times faster than DFT). A promising AgPd candidate, beyond the ML dataset, was randomly chosen and rigorously examined using ab initio simulations in a realistic electrocatalytic environment, therefore validating the precision of the ML model and facilitating the identification of suitable structures for calculations.51

Electronic structural characteristics, including the d-band center and Fermi level, are essential for predicting the catalytic activity by describing the interaction between the catalyst and reactants. The electronic structural properties, such as local charge distribution and bonding states, are crucial in influencing the HER activity in SACs (Fig. 14A).160 The d-band center, a commonly utilized parameter for transition metal catalysts, elucidates the distribution of electronic states and affects the adsorption intensity of reactants on the catalyst surface (Fig. 14B).156 Charge transfer is a crucial element for understanding the electronic characteristics of catalysts during catalytic processes. In high-entropy alloys, the combined effects of charge transfer and mixing entropy enhance their stability and HER activity, underscoring their potential as effective HER catalysts (Fig. 14C).166 Additionally, in MBene materials, Bader charge analysis offers quantitative insight into their charge distribution and electron transfer during catalytic reactions. ML investigations have revealed a direct correlation between charge transfer and catalytic performance, as illustrated in the charge maps and activity volcano plots (Fig. 14D–F).167 In recent years, the combination and optimization of ML-based descriptors have emerged as an important direction in catalyst research, especially for the development and validation of descriptors utilizing DFT data. 
The integration of ML and DFT allows the efficient identification and optimization of descriptors.168 Research on doped MBenes and phosphides utilized structural and elemental characteristics as inputs, employing models such as support vector machine (SVM) and gradient boosting tree (GBT) to attain precise predictions of ΔGH*, thereby accelerating the screening of potential electrocatalysts.167 Recent research has established a multi-step ML workflow for forecasting the HER performance of 4500 varieties of MXenes, effectively selecting the most catalytically active materials.169 In Mo2C MXenes, electrostatic repulsion was recognized as a crucial element influencing HER kinetics, establishing a novel theoretical foundation for catalyst design.161 High-throughput screening employing random forest regression has demonstrated the superior HER performance of transition metal-intercalated g-C3N4/MX2 heterostructures.61 Furthermore, the influence of various metal compositions and cluster sizes on the HER performance in nanocluster electrocatalysts was elucidated by ML, helping to identify the optimal nanocluster combinations.162 Research through ML on supported heteroatom-doped metal compounds demonstrated the synergistic influence of electrolyte type, catalyst shape, and combinations of metals and nonmetals on their HER performance.164


image file: d5qm00326a-f14.tif
Fig. 14 Integrative analysis of electronic structure descriptors, charge transfer, and catalytic performance for HER over diverse catalyst systems. (A) Electronic structure characteristics and HER activity over SACs. Reproduced with permission from ref. 160, copyright 2021, the Royal Society of Chemistry; (B) free energy profile and active site analysis for HER on the Z2-βGyNR system. Reproduced with permission from ref. 156, copyright 2023, Elsevier; (C) impact of charge transfer and mixing entropy on HER activity of high-entropy alloys. Reproduced with permission from ref. 166, copyright 2023, Wiley-VCH; (D)–(F) charge transfer distribution and catalytic performance correlation of MBene materials. Reproduced with permission from ref. 167, copyright 2023, Wiley-VCH.

A multitude of effective case studies utilizing ML and DFT simulations demonstrate the significance of descriptors in the screening and optimization of HER catalysts. For example, in N-doped graphene-based dual-atom catalysts (NG-DACs), descriptors such as ΔGH* and local average electronegativity were employed to swiftly identify high-performance catalysts (Fig. 15A).170 Research on Ni-doped MXenes159 (Fig. 15B) and Pt-doped graphene systems160 demonstrated that the use of descriptors markedly improved the catalytic efficacy. Furthermore, research on bimetallic MXenes has demonstrated the considerable impact of the outer metal atoms on hydrogen binding strength, with Mo2NbC2O2 exhibiting the lowest overpotential, indicating substantial promise as an HER electrocatalyst (Fig. 15C).163 ML-assisted screening of nanocluster catalysts has demonstrated the influence of various metal atom combinations on their performance, particularly at the nanoscale, where the synergistic effects of size and metal composition markedly impact their catalytic activity.162 These investigations illustrate that ML-enhanced high-throughput screening techniques have substantial potential in catalyst design, markedly diminishing the utilization of experimental and computational resources, while expediting the advancement of effective HER catalysts.


image file: d5qm00326a-f15.tif
Fig. 15 Comprehensive visualization of descriptor-based screening and optimization strategies for HER catalysts. (A) ML and DFT-assisted workflow for descriptor-based screening of NG-DACs. Reproduced with permission from ref. 170, copyright 2025, Elsevier; (B) adsorption free energy diagram (ΔGH*) of Ni-doped MXenes for HER performance enhancement. Reproduced with permission from ref. 159, copyright 2023, Wiley-VCH; (C) hydrogen bonding strength and overpotential analysis of Mo2NbC2O2 and bimetallic MXene with excellent HER activity. Reproduced with permission from ref. 163, copyright 2024, the American Chemical Society.

4.2. ML descriptors for OER electrocatalysts

The OER process occurs through two potential pathways, the adsorbate evolution mechanism (AEM) and the lattice oxygen-mediated mechanism (LOM). The AEM route begins with the adsorption and dissociation of H2O molecules, resulting in the formation of HO* and H+ ions on the catalyst surface. In the second step, HO* dissociates to produce O* and an additional H+ ion. The interaction between another H2O molecule and the O* intermediate then leads to the formation of HOO* (Fig. 16A).171,172 Finally, the OER culminates in the liberation of oxygen and desorption of H+ ions. The LOM entails the binding of H2O/OH species at oxygen vacancies, establishing new sites for electron transfer and the regeneration of lattice oxygen. The lattice oxygen atoms of the catalyst material participate directly in the OER mechanism (Fig. 16B).173,174 Under acidic conditions, the LOM is initiated by the adsorption and dissociation of the H2O molecule on the surface lattice O, resulting in the formation of HO*, which is then transformed to O* and a proton (H+) during the electrochemical process (Fig. 16C).175
image file: d5qm00326a-f16.tif
Fig. 16 (A) Conventional AEM and (B) LOM OER electrolysis in alkaline solutions; (C) dissolution of surface cations represents another OER electrolysis pathway caused by thermodynamic instability under OER catalysis conditions. Reproduced with permission from ref. 171, copyright 2021, the Royal Society of Chemistry.
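Within the computational hydrogen electrode framework commonly used with DFT, the theoretical OER overpotential of the four-step AEM follows from the adsorption free energies of the three intermediates: the largest step free energy sets the limiting potential, and the overpotential is that value minus 1.23 V. The sketch below implements this bookkeeping. The function name and the example free energies are illustrative assumptions.

```python
EQUILIBRIUM_POTENTIAL_V = 1.23  # equilibrium potential of water oxidation
TOTAL_FREE_ENERGY_EV = 4.92     # 4 x 1.23 eV for 2H2O -> O2 + 4H+ + 4e-

def aem_overpotential(dg_oh, dg_o, dg_ooh):
    """Theoretical OER overpotential (V) from adsorption free energies (eV)."""
    steps = {
        "H2O -> HO*": dg_oh,
        "HO* -> O*": dg_o - dg_oh,
        "O* -> HOO*": dg_ooh - dg_o,
        "HOO* -> O2": TOTAL_FREE_ENERGY_EV - dg_ooh,
    }
    limiting_step = max(steps, key=steps.get)
    # One electron is transferred per step, so eV per step maps directly to V.
    eta = steps[limiting_step] - EQUILIBRIUM_POTENTIAL_V
    return eta, limiting_step

# Hypothetical adsorption free energies in eV (assumption).
eta, limiting_step = aem_overpotential(dg_oh=1.0, dg_o=2.7, dg_ooh=4.2)
print(f"overpotential = {eta:.2f} V, limited by {limiting_step}")

# An ideal catalyst has four equal steps of 1.23 eV and zero overpotential.
eta_ideal, _ = aem_overpotential(1.23, 2.46, 3.69)
```

This expression applies to the AEM only. Catalysts operating through the LOM are not bound by the same *OH/*OOH scaling constraint and require a separate treatment.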

Generally, electrocatalytic activities are linked to the catalyst composition and experimental conditions. The oxidation states and structures influence the catalytic efficacy, prompting the consideration of electrolysis as a multifaceted issue. Under these conditions, investigating electrocatalysts is a formidable challenge. The effective design of active catalysts through trial and error necessitates supplementary knowledge and intuition. Catalysis informatics offers AI-driven data-centric design to assist researchers in understanding information, implicit guidelines, and concealed patterns associated with electrocatalysts and electrolysis, hence expediting the design of electrocatalysts. This section emphasizes the utilization of the ML screening method to identify high-efficiency electrocatalysts for the OER process. Fig. 17A illustrates six essential factors to consider when applying X-ides (e.g., transition metal borides, carbides, pnictides, and chalcogenides) as electrocatalysts for OER electrolysis. Objectives must be specified before conducting the investigation. Materials databases and available literature can be utilized to select the most suitable composition of TM X-ides. Researchers should predict the stability of TM X-ide compositions using Pourbaix diagrams and appropriate computer assessments. Efficient physicochemical characterizations are essential for evaluating the composition of the material after conditioning and monitoring its reconstruction processes during long-term testing (Fig. 17B).17


image file: d5qm00326a-f17.tif
Fig. 17 (A) Proposed design considerations for evaluating TM X-ide-based electrocatalysts for OER catalysis; (B) analysis workflow for examining physicochemical transformations of TM X-ide electrocatalysts before, during, and after OER catalysis. Reprinted with permission from ref. 17, copyright 2023, the American Chemical Society.

The performance of 18 perovskites for OER catalysis was assessed at various current densities in an alkaline solution, yielding 1080 data points, with each measurement performed three times to ensure reproducibility. Symbolic regression revealed a linear relationship between the tolerance and octahedral factors of the perovskites (expressed as the ratio μ/t) and their OER activity (Fig. 18A and B).176 On this basis, around 3000 theoretical structures were evaluated. Thirteen promising perovskites were synthesized, of which five were produced in pure form, and four exhibited greater catalytic reactivity than previously identified perovskites, demonstrating no significant loss in activity over time. This demonstrated the practical value, as well as the data-intensive nature, of tightly coupled experiment/ML feedback loops.


Fig. 18 (A) Density profile and Pareto front of mean absolute error versus the complexity of 8640 mathematical formulas; (B) OER catalysis onset potential against the octahedral/tolerance factors ratio (μ/t). The red and black dots mark the discovered and previously known perovskites, respectively. Reprinted with permission from ref. 176, copyright 2023, Springer Nature.
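The ratio μ/t plotted in Fig. 18B combines two standard geometric descriptors of ABO3 perovskites. As a minimal sketch, assuming the textbook definitions (Goldschmidt tolerance factor t = (r_A + r_O)/[√2(r_B + r_O)] and octahedral factor μ = r_B/r_O) and illustrative ionic radii that are not taken from ref. 176, the descriptor could be evaluated as follows. The function names are hypothetical.

```python
import math


def tolerance_factor(r_a, r_b, r_x):
    """Goldschmidt tolerance factor t = (r_A + r_X) / (sqrt(2) * (r_B + r_X))."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


def octahedral_factor(r_b, r_x):
    """Octahedral factor mu = r_B / r_X."""
    return r_b / r_x


def mu_over_t(r_a, r_b, r_x):
    """Ratio mu/t used as a single structural descriptor."""
    return octahedral_factor(r_b, r_x) / tolerance_factor(r_a, r_b, r_x)


# Illustrative ionic radii in angstrom (assumed values for a SrTiO3-like oxide).
r_sr, r_ti, r_o = 1.44, 0.605, 1.40

t = tolerance_factor(r_sr, r_ti, r_o)
mu = octahedral_factor(r_ti, r_o)
print(f"t = {t:.4f}, mu = {mu:.4f}, mu/t = {mu_over_t(r_sr, r_ti, r_o):.4f}")
```

In a screening loop, the same three functions would simply be mapped over every candidate A/B-site combination before ranking by μ/t.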

Furthermore, Hong et al. examined the behavior of 14 descriptors to assess the strength of the metal–oxygen bond in perovskite materials through a literature review and additional analysis. Findings from statistical methodologies including linear regression and factor analysis validated the significance of employing multiple descriptors to enhance the predictive accuracy, highlighting the critical influence of factors such as d-electron density, charge transfer energy, and structural characteristics (e.g., M–O–M bond angle and tolerance factor) on the electrocatalytic activity for OER. The examined predictive models showed superior capability in probing electron occupancy and covalency as initial factors influencing the OER activity compared to traditional single-descriptor methods.177 The disruption of correlations among the adsorption energies of OH*, O*, and OOH* intermediates could enhance the sluggish OER process, aiding in the identification of new electrocatalysts. Recently, ML algorithms have explored various valuable and abundant metal oxides, including 2D compounds, perovskites, and metal oxides, yielding intriguing potential.178–181 Rohr et al. examined the efficacy of many ML models by utilizing a diverse range of pseudo-quaternary metal oxide datasets derived from high-throughput synthesis and electrochemical tests, revealing their OER electrocatalytic activities. The linear ensemble, random forest, and Gaussian process algorithms exhibited varying behaviors, relying on the research target rather than a particular model, as confirmed by exploring three distinct research objectives. 
The examination of various learning schemes, each comprising 2121 catalysts over four chemical windows, enabled a 20-fold research acceleration relative to random acquisition.182 Flores and colleagues tested the electrochemical stability of iridium oxide polymorphs in acidic OER via an active-learning-accelerated algorithm.183 After creating a dataset of 38 000 molecules, 196 polymorphs of IrO2 and 75 polymorphs of IrO3 were examined, with a focus on α–IrO3, suggesting a possible detection rate double that of random searching. The acquired structural outputs predicted octahedral local coordination environments as the low-energy configurations, with the Ir–H2O Pourbaix analysis confirming α–IrO3 as the most stable phase.

Additionally, diverse TMD (MoX2/WY2, where X/Y = S, Se, Te) heterostructure composite electrocatalysts were predicted through integrated AI, specifically utilizing the LASSO methodology in conjunction with quantum mechanics, thereby innovatively employing PL (λ, θ, d, and l) as a universal activity descriptor. The PL descriptor encompasses layer distance, rotational angle, bandgap ratio, and bond length to predict the catalytic efficacy of TMD heterostructures. Analyses of the free energy and binding energy can identify the more stable heterojunction schemes and elucidate the fundamental mechanics of water splitting electrocatalysis. Findings indicated that the unique MoTe2/WTe2 heterostructure, when rotated at 300°, had exceptional electrocatalytic performance for HER and OER, achieving overpotentials of 0.03 V and 0.17 V for HER and OER, respectively, surpassing existing water splitting electrocatalyst systems.91 The utilization of LASSO regression demonstrated the critical descriptors influencing the adsorption behavior, hence improving the computational efficiency and reducing the dependence on time-consuming DFT calculations.
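To illustrate how LASSO isolates the critical descriptors, the following minimal sketch implements the L1-penalized least-squares objective by cyclic coordinate descent in plain Python. It is not the workflow of ref. 91: the descriptor names merely echo the PL components, the eight-sample design matrix and the target values are synthetic, and the choice of which descriptors carry signal is arbitrary.

```python
def soft_threshold(rho, alpha):
    """Soft-thresholding operator; returns exactly 0.0 when |rho| <= alpha."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=50):
    """Minimize (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0
            z = 0.0
            for i in range(n):
                # Prediction from all descriptors except j.
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
    return w


# Hypothetical descriptor names (loosely following the PL components).
names = ["layer_distance", "rotation_angle", "bandgap_ratio", "bond_length"]

# Synthetic, standardized two-level design with mutually orthogonal columns.
X = []
for i in range(8):
    b0, b1, b2 = i & 1, (i >> 1) & 1, (i >> 2) & 1
    X.append([(-1.0) ** b0, (-1.0) ** b1, (-1.0) ** b2, (-1.0) ** (b0 + b1)])

# Synthetic target: two strong descriptors, one weak, one irrelevant.
y = [2.0 * row[0] + 0.05 * row[1] - 1.5 * row[2] for row in X]

w = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [name for name, coef in zip(names, w) if coef != 0.0]
print(selected)
```

With this penalty, the weak and irrelevant descriptors are driven exactly to zero, while the two strong ones survive with slightly shrunken coefficients, which is the behavior that makes LASSO useful for descriptor selection.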

The recent literature indicates that single- and dual-atomic catalysts are highly valued for enhancing next-generation anodes and cathodes, achieving performance levels comparable to noble-based electrocatalysts, while remaining cost-effective. Furthermore, the smart properties of single- and dual-atom catalysts in comparison to their bulk counterparts provide distinctly superior performances due to their quantum size effect and extensive electrochemically active surface area.184 Moreover, significant research endeavors have focused on enhancing atomic catalyst–support interactions utilizing both traditional carbonaceous and transition metal supports. A novel carbonaceous electrocatalyst utilizing graphdiyne was developed, exhibiting advantageous electrical and structural stability due to its abundant sp– and sp2–hybridized carbon framework derived from diacetylene and benzene motifs, respectively, which enhanced its mechanical and chemical robustness.185 Lin et al. presented an effective ML framework for 104 graphene-modified metal–nitrogen–carbon (M–N–C) SACs to analyze the intrinsic patterns of their physical properties and limiting potentials (UL), enabling accurate predictions of the HER/OER/ORR UL for 260 additional graphene-SACs containing metal-NxCy active centers.179 The DFT training data were prudently obtained via an open-source RF technique from a prior study.186 Six descriptors of the OER catalysts were established including the electron number of the d-orbital “d”, the Pauling electronegativity of the metal atom “Em”, the average pKa value of the surrounding atoms “pKa”, the formation energy of the oxide “Hof”, the hydride formation enthalpy “Hxf”, and the cumulative Pauling electronegativities of the metal atoms “Es”.130 Among them, descriptor d was the most significant characteristic for the OER, followed by the Hxf and Hof descriptors.187 The accuracy analysis based on the mean square error evaluation produced a value of 0.021 for the OER after 
confirming the available data points. The ML framework was then used to predict the performance by evaluating the UL values of 260 M–N–C SACs. Notably, the existing data for metal- and non-metal-based OER electrocatalysts, with multiple characteristics captured by different descriptors, can be utilized to systematically design fully optimized OER electrocatalysts. A universal descriptor can be pursued with AI algorithms ranging from heuristic to meta-heuristic methods. The primary task is then to establish a procedure for selecting an appropriate algorithm.

4.3. ML descriptors for SCs

ESCs are a significant energy storage technology in modern society; however, their performance in terms of energy density and cycle life is still unsatisfactory. Conventional “trial-and-error” approaches to boost the charge storage capability of ESCs require a vast number of tedious operations and experiments. Alternatively, ML algorithms and computational chemistry techniques can greatly facilitate the research and development of advanced SC schemes. Seeking cost-efficient and highly reactive electrode materials plays a crucial role in the development of ESCs. Generally, the properties of electrode materials (e.g., capacitance, working voltage, chemical/electrochemical stabilities, and electronic/ionic conductivities) should be considered during property prediction. Given that the intrinsic physicochemical properties of electrode materials can be evaluated from their corresponding crystal structures, it is desirable to preferentially predict these characteristics.

To boost the electrochemical behavior, it is essential to choose the proper electrode material and electrolyte. Novel electrode materials can be theoretically proposed by predicting their properties, but identifying these properties via standard DFT analysis or experiments is costly and difficult. ML algorithms may offer a feasible way to solve the materials exploration problem owing to their ability to capture complex patterns and correlations from existing results. Although screened electrode materials may exhibit high performance and optimized reaction kinetics, the safety of ESCs represents an additional concern, so predicting the degradation of SCs by ML algorithms is of primary importance for the entire system. Thus, the fundamental aim of ML models in ESCs is to construct structure–activity correlations via inexpensive and precise predictions. This section focuses on the recent applications of ML models for monitoring the properties of materials and the design of materials for ESCs. In the case of carbon-based SC electrodes, factors such as heteroatom doping and surface area should be considered when optimizing electrode materials. In addition, the ion diffusion efficiency affects the rate performance of SCs.

In this case, Zhao and coworkers introduced a data-driven project, employing ML tools to optimize the capacitive properties of carbon-based electrodes by well-directing material synthesis and electrolyte selection (Fig. 19).188 The authors utilized a tree-based pipeline optimization technique (TPOT) and GBR model, realizing that N dopants and specific surface area could positively impact the resultant specific capacitance of carbon network electrodes. Interestingly, the N motifs displayed the highest influential role in boosting the capacitive properties because of their tunable electric conductivity and regulated charge transfer dynamics at the electrode/electrolyte interface. Furthermore, N-doping reinforced the stability of the carbonaceous electrode in high-voltage electrolytes. Importantly, given that the voltage window revealed a relatively minor impact on the storage performance, the safety issue in electrolyte selection should be given high priority.


Fig. 19 (A) Schematic of data-driven construction of SCs; (B) dataset of carbon-based SCs adopted by the ML algorithm; (C) algorithm analysis by the optimized TPOT technique; (D) Shapley additive analysis of parameters, and (E) average of the absolute Shapley value of parameters. Reproduced with permission from ref. 188, copyright 2023, Elsevier.
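The Shapley analysis in Fig. 19D and E attributes a prediction to individual input parameters. A minimal sketch of the underlying idea is given below: exact Shapley values obtained by averaging marginal contributions over all feature orderings. The capacitance model, its coefficients, and the feature values are hypothetical and serve only to make the arithmetic checkable; they are not the TPOT/GBR model or the data of ref. 188, and practical tools approximate this enumeration rather than computing it exactly.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)      # start from the reference sample
        previous = model(current)
        for i in order:
            current[i] = x[i]         # switch feature i to its actual value
            value = model(current)
            phi[i] += value - previous
            previous = value
    return [total / len(orders) for total in phi]


def toy_capacitance(features):
    """Hypothetical specific capacitance (F/g) with an N-content/SSA interaction."""
    n_content, ssa, window = features
    return (50.0 + 10.0 * n_content + 0.04 * ssa
            + 0.002 * n_content * ssa + 3.0 * window)


feature_names = ["N content (at%)", "SSA (m2/g)", "voltage window (V)"]
x = [5.0, 2000.0, 1.0]          # sample to explain (assumed values)
baseline = [1.0, 1000.0, 0.8]   # reference sample (assumed values)

phi = shapley_values(toy_capacitance, x, baseline)
for name, value in zip(feature_names, phi):
    print(f"{name}: {value:+.2f} F/g")
```

The attributions sum to the difference between the explained and reference predictions, and the interaction term is shared equally between N content and surface area.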

The combination of transition metal oxides (TMOs) and graphene oxide (GO) has become important as hybrid electrode materials for SCs. The temperature-dependent capacitance of Co–rGO hybrid electrodes was illustrated using the ML RF model combined with X-ray photoelectron spectroscopy (XPS) findings for the Co ion ratio in a Co–rGO complex fabricated at various temperatures, with an excellent accuracy of 99.9% (Fig. 20A).189


Fig. 20 (A) Schematic of the project; (B)–(D) XPS spectra of Co 2p core level of the Co–rGO electrode fabricated at 200 °C, 400 °C, and 600 °C; (E) actual and predicted Co3+/Co2+ ratio, (F) capacitance at various synthesis temperatures and 16 wt%; (G) actual and predicted specific capacitance value, and (H) resistance measured at various temperatures and 16 wt%. Reproduced with permission from ref. 189, copyright 2023, Elsevier.

The outcomes demonstrated that the Co3+/Co2+ ratio of the Co-rGO hybrid increased consistently with an increase in temperature from 200 °C to 600 °C, verifying the ability of the RF model to predict the XPS results and shortening the time-consuming XPS analysis (Fig. 20B–E). The measured capacitance of the Co-rGO electrode followed the order of 600 °C (176.2 F g−1) > 200 °C (160.9 F g−1) > 400 °C (16 F g−1) (Fig. 20F and G). Moreover, the results elucidated that the capacitance was strongly correlated with the ion diffusion kinetics. The electrodes obtained at 200 °C and 600 °C exhibited a higher Warburg slope than that obtained at 400 °C, implying that the electrodes prepared at 200 °C and 600 °C possessed a greater ion transfer rate (Fig. 20H). It could be inferred that the lower ion transfer resistance of the electrodes prepared at 200 °C and 600 °C significantly tuned the overall capacitance compared with the electrode obtained at 400 °C with a large ion transfer resistance.

Transition metal carbides/nitrides (MXenes) are emerging 2D materials for electrochemical energy conversion and storage due to their novel structural/compositional characteristics, modified ionic/electronic conductivities, outstanding chemical stability, and surface functionalities. To achieve a superior performance, researchers are studying their structure–performance relationships by manipulating their microstructure and existing elements. Wang's group explored the structure–property relationship of 600 MXenes such as M2XT2 (T = bare, O, S) and their doped configurations. The authors screened the metallicity and stability via hydrogen ion adsorption using the DFT technique. The ML-derived sure independence screening and sparsifying operator (SISSO) model developed the pseudocapacitance formulas of 200 MXenes (e.g., M2X, M2X-m, M2XO2(-m), M2XS2 and M2XS2-m) based on their key features. Among the M2X systems, Sc2N and Ti2N revealed the best pseudocapacitance values, which were ascribed to their stronger hydrogen ion binding. Statistical analyses claimed that the elements that contributed significantly to the high pseudocapacitance of group-free, O-functionalized and S-functionalized MXenes were positioned in the upper left, lower left, and upper right of the periodic table, respectively.190

2D layered TMDs are potential candidates for nanoelectrochemical and flexible electronic systems. These materials offer new chances to achieve extraordinary functionalities owing to their quantum confinement and the occurrence of a direct bandgap within their monolayer scheme. Understanding the mechanical properties of 2D TMDs in different environments is crucial to ensure their prolonged operation in flexible electronics. In this context, ML algorithms (long short-term memory “LSTM” and feed-forward neural network “FFNN”) paired with molecular dynamics simulations were established to predict the mechanical behavior of MX2 (M = Mo, W and X = S, Se) TMDs with more than 95% accuracy. The LSTM model could predict the stress–strain response with an accuracy close to 1 for the training and validation samples. With a similar accuracy level, the FFNN model could predict the Young's modulus, fracture stress, and fracture strain. More importantly, both ML models estimated the mechanical properties of 2D TMDs under varying operating conditions.191
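The three quantities predicted by the FFNN (Young's modulus, fracture stress, and fracture strain) are scalar labels that must first be extracted from each simulated stress–strain curve. The sketch below shows one plausible way to do so, assuming that the modulus is the least-squares slope of the small-strain region and that fracture coincides with the stress maximum. The curve is synthetic, the 2% elastic cutoff is an arbitrary assumption, and ref. 191 may have used a different procedure.

```python
def linear_slope(xs, ys):
    """Ordinary least-squares slope of ys versus xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def mechanical_targets(strain, stress, elastic_limit=0.02):
    """Return (Young's modulus, fracture stress, fracture strain) from one curve."""
    elastic = [(e, s) for e, s in zip(strain, stress) if e <= elastic_limit]
    modulus = linear_slope([e for e, _ in elastic], [s for _, s in elastic])
    i_max = max(range(len(stress)), key=lambda i: stress[i])
    return modulus, stress[i_max], strain[i_max]


# Synthetic curve (stress in GPa): softening response followed by abrupt failure.
strain = [i / 100 for i in range(11)]
stress = [200.0 * e - 800.0 * e * e for e in strain]
strain.append(0.11)
stress.append(0.0)   # stress drops to zero after fracture

E, fracture_stress, fracture_strain = mechanical_targets(strain, stress)
print(f"E = {E:.1f} GPa, fracture stress = {fracture_stress:.2f} GPa, "
      f"fracture strain = {fracture_strain:.2f}")
```

Labels of this kind, collected over many simulations at different temperatures or strain rates, would form the regression targets for a feed-forward model.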

Metal oxynitrides with a composition ranging between pure oxides and nitrides have favorable physicochemical properties, including chemical inertness, elevated melting point, and high thermal stability. The presence of nitrogen in the oxide phase leads to strong faradaic redox reactions, yielding high rate capability and long cycle life. Thus, advanced metal oxynitride materials warrant further exploration. As a proof-of-concept study, the specific capacitance and cyclic stability of a cerium oxynitride electrode were predicted by ML models (multilayer perceptron (MLP), RF, APRF and APMLP) (Fig. 21A). A specific capacity of ∼26.6 mA h g−1 at a current density of 2 A g−1 with a capacity retention of >90% over 10 000 cycles could be obtained under specific material characteristics (morphology, composition, and surface area) and operational conditions (e.g., current density and applied potential window) (Fig. 21B–D). The experimental findings (∼26.6 mA h g−1 and ∼100% capacity retention) matched well with the predictions (Fig. 21E).192 Based on the above discussion, the prediction of electrode material properties requires merging proper descriptors with ML methods. Correlation strategies (e.g., the sequential backward selection algorithm and contribution analysis) and embedded analysis techniques have been employed to identify the key parameters that impact the electrode material properties. Consequently, advanced electrodes can be rationally engineered. However, the correlation between the targeted properties of electrode materials and the selected descriptors is complex in the majority of cases. Therefore, utilizing multiple ML algorithms for optimization and generating large datasets through virtual simulation may be helpful for optimal prediction.


Fig. 21 (A)–(D) Predicted against actual values of specific capacitance calculated by multiple value-prediction models; (E) plot of error values compared to actual values of specific capacitance obtained by RF, APRF, MLP and APMLP models. Reproduced with permission from ref. 192, copyright 2021, Elsevier.
4.3.1. ML descriptors for hybrid SCs. In the development of hybrid SCs, where EDLC and faradaic pseudocapacitance coexist and interact, effective ML models must employ descriptors that explicitly quantify each mechanism and their coupling. Therefore, several studies have introduced cyclic voltammetry-based metrics obtained via peak deconvolution, i.e., the ratio of the integrated currents under the EDLC and pseudocapacitive peaks and the peak potential separation (ΔEp) between them, extracted by fitting Gaussian or Lorentzian functions to CV scans at multiple rates.193 Complementary descriptors include redox-site density derived from X-ray photoelectron spectroscopy by taking the area ratio of trivalent to divalent metal species (M3+/M2+) and normalizing to the total metal content, as well as an interface synergy factor defined by the fraction of contact area between carbon scaffolds and pseudocapacitive phases, measured through threshold segmentation of TEM/SEM images.194 Fig. 22 illustrates the full high-throughput descriptor-to-ML workflow.

Fig. 22 ML workflow for SC capacitance prediction.
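The two CV-derived descriptors described above can be sketched in a few lines once the deconvolution has been performed. The example assumes that each contribution has already been fitted with a Gaussian of the form A·exp[−(E − E0)²/(2σ²)], whose integral is Aσ√(2π); the peak parameters are hypothetical, and the class and function names are illustrative rather than taken from refs. 193 and 194.

```python
import math
from dataclasses import dataclass


@dataclass
class GaussianPeak:
    """One fitted CV component: amplitude (mA), center (V), width sigma (V)."""
    amplitude: float
    center: float
    sigma: float

    def area(self):
        # Analytic integral of A * exp(-(E - E0)^2 / (2 * sigma^2)) over E.
        return self.amplitude * self.sigma * math.sqrt(2.0 * math.pi)


def hybrid_descriptors(edlc, pseudo):
    """Return the integrated-current ratio and the peak separation (delta Ep)."""
    return {
        "i_EDLC/i_pseudo": edlc.area() / pseudo.area(),
        "delta_Ep": abs(pseudo.center - edlc.center),
    }


# Hypothetical fit results for a single scan rate.
edlc_peak = GaussianPeak(amplitude=2.0, center=0.30, sigma=0.20)
pseudo_peak = GaussianPeak(amplitude=4.0, center=0.55, sigma=0.05)

descriptors = hybrid_descriptors(edlc_peak, pseudo_peak)
print(descriptors)
```

Repeating the calculation at each scan rate would yield a small descriptor table per electrode that can be passed directly to a regression model.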

The effectiveness of these descriptors has been demonstrated in multiple studies. Su et al. demonstrated that an artificial neural network trained on graphene-based electrodes with descriptors covering surface chemistry, pore structure and i_EDLC/i_pseudo could predict capacitance with R2 = 0.88 on unseen data.195 Zhu et al. applied a feed-forward neural network to a broad library of carbon materials and showed that including ΔEp and redox-site density improved the prediction accuracy by 15% over models using only textural features.196 In hybrid systems, Yogesh et al. used the TPOT AutoML framework on graphene-oxide nano-ring electrodes employing descriptors such as i_EDLC/i_pseudo, mesopore volume and interlayer spacing to discover formulations that raised the energy density by ≈25% under 1 A g−1.197 Moreover, Nanda et al. developed ML models correlating hybrid-device cyclic stability with descriptors spanning redox-site density, interface synergy and σ, achieving an R2 of 0.90 for cycle-life prediction.198 Together, these works illustrate how physically grounded, hybrid-specific descriptors and explainable ML can establish a closed-loop workflow for the discovery and optimization of next-generation hybrid SCs.
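For reference, the coefficient of determination quoted in these studies compares the residual error of a model with the variance of the measured values on held-out data. A minimal sketch with made-up capacitance values (not data from refs. 195–198) is given below.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted specific capacitances (F/g) on unseen samples.
measured = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]

r2 = r_squared(measured, predicted)
print(f"R2 = {r2:.3f}")
```

Because the score is only meaningful on samples excluded from training, it should always be reported together with the size and origin of the test set.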

5. Conclusions

This comprehensive review systematically outlined the utilization of ML in engineering electrodes/electrocatalysts for HER, OER and SCs. Our discussion spanned various aspects, underscoring the significant role of ML in identifying optimal options and deciphering sophisticated challenges. We highlighted the common inputs and pivotal characteristics reported by ML models for each system. ML model training has become well established in both computational analyses and experimental discoveries, motivating the exploration and optimization of novel electrodes/electrocatalysts. Moreover, the interpretative analysis of ML models provides profound insights into the physicochemical attributes of these electrodes/electrocatalysts, resulting in the identification of design features and key descriptors. Adopting ML represents an attractive paradigm shift toward data-centric strategies in electrode/electrocatalyst engineering, notably reinforcing electrode/electrocatalyst discovery and the understanding of electrochemical catalytic/faradaic processes. This transformation not only signifies the capability of ML to overcome the economic and sustainability drawbacks of electrochemical energy conversion and storage systems but also guides the prediction of electrochemical performance and the tailoring of more effective electrodes/electrocatalysts. Moving forward, the capability of ML to bridge the gap between experimental verification and computational prediction is poised to accelerate advances in electrode/electrocatalyst design for energy conversion and storage technologies, yielding more energy-efficient and sustainable options.

6. Outlook

Despite the significant progress in this field, many challenges still need to be addressed. The landscape of research on sustainable electrochemical energy conversion and storage is growing swiftly, with new opportunities on the horizon. This stage of research inspires scientists not only to report their breakthroughs and achievements but also to project future directions, in which the potential of ML to stimulate innovation remains largely untapped. Building on the foundation laid by scientists, engineers, and chemists in utilizing ML for sustainable electrochemical energy conversion and storage systems, researchers, specialists, and readers should shift their attention to the unresolved obstacles and notable opportunities that define the trajectory of this field. Considering the insights gathered in the present review, the following outlook addresses the key obstacles and potential trends in ML-aided electrode/electrocatalyst design. From the emergence of DFT simulations to the budding phase of experimental automation, the road to fully realizing the potential of ML in sustainable energy conversion and storage is full of intricate issues that span various levels of electrode/electrocatalyst systems and reaction pathways.

6.1. Autonomous tailoring of electrode/electrocatalyst systems

Experimental data are valued for their authenticity but are expensive to obtain, and they play a considerable role in advancing electrode/electrocatalyst research and optimization. As discussed, popular tactics involve collecting features from domain knowledge and previous reports or implementing handcrafted high-throughput synthesis, all of which are costly. Recently, automated alternatives have emerged. For example, ChatGPT has been applied to the natural language processing of scientific publications, a task that otherwise demands rich expertise in materials science and chemistry. Very recently, Yaghi et al. used ChatGPT to quickly extract findings from the MOF-related literature and gather synthesis-related information to support ML modeling and guide experiments.199 Sophisticated systems that integrate multiple ML models based on several sources of knowledge can also be adopted. An efficient A-Lab approach for oxide discovery was implemented for lithium-ion batteries by Ceder and colleagues.200 Using an innovative multi-decision platform, their contribution enabled high-throughput automated robotic tests. The authors combined DFT-assisted decision-making and text mining so that both took part in the learning cycle of robotic synthesis. This may open a path for the autonomous discovery of electrodes/electrocatalysts, benefiting from abundant data resources and expert-system decisions drawn from the literature, DFT, and local experimental findings. As emphasized, high-throughput synthesis apparatuses are still not broadly employed in electrochemical energy conversion and storage due to their high costs. Hence, more accessible alternatives can be adopted. In this context, easily programmable inkjet printers can be applied for the high-throughput fabrication of electrodes/electrocatalysts. Combining ML with these systems simplifies the construction of flexible electronics, and similar methodologies can be adopted to prepare efficient electrodes/electrocatalysts. By adopting cost-effective and readily accessible systems, scientists, engineers, and chemists can potentially achieve satisfactory data-driven discovery and optimization in electrode/electrocatalyst development.

6.2. Establishing interpretable and reliable ML models

To realize product development and real-world applications, the reliability and explainability of ML models are indispensable. However, the lack of transparency of black-box models makes their predictions difficult to understand and validate. This opacity raises reliability problems, where models might perform well on test data but fail in real-world scenarios owing to overfitting or unaccounted-for parameters. Thus, shifting from merely depending on black-box ML models to implementing more transparent models may address this stumbling block. Transparent models can offer strong interpretability and boost the reliability and acceptance of ML predictions in practical schemes, minimizing the hazards related to model deployment and ensuring deeper insights into electrochemical processes. Fundamental physicochemical principles, data science, and other domain knowledge resources can be linked with these models. Strengthening interpretability is not only a matter of diminishing risks but is also crucial for extracting insights into the underlying reaction mechanisms. Integrating domain knowledge and physical/chemical principles into ML models, for instance by using material properties or reaction mechanisms as model inputs, may make predictions more physically plausible. Simpler models reinforce interpretability, particularly when they efficiently capture the correlations within the data. Furthermore, composite approaches that combine ML and mechanistic modeling can harness the strengths of both.

6.3. Boosting knowledge transfer and bridging the fidelity gap

The narrow focus on particular material schemes and geometries is a recurring challenge in sustainable electrochemical energy conversion and storage. The present ML research on electrodes/electrocatalysts has not yet achieved seamless bridging between different material schemes or between experiments and DFT simulations. Accordingly, flexible techniques that merge data of different fidelity levels within a given system and transfer knowledge across multiple systems should be widely applied. Transfer learning and related approaches can help in sharing knowledge and insights, hence decreasing the costs related to training data. Using high-throughput DFT data together with natural language processing and computer vision to establish preliminary ML models, followed by fine-tuning these models with a small set of expensive experimental findings, is one potential approach. This strategy not only reflects real-world experimental environments but also ensures highly efficient use of resources. Furthermore, we should emphasize the ability to transfer knowledge between similar electrode/electrocatalyst schemes. The automated text-mining work reported by Ding et al. showed that models for the acidic and alkaline HER/OER literature can reach promising results when fine-tuned on a small set of neutral HER/OER reports.201 Another viable strategy is to adopt DFT data for quick screening and subsequently use targeted experimental findings to probe potential candidates, assisting scientists in the focused optimization of electrodes/electrocatalysts.
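The idea of pairing abundant low-fidelity data with a handful of costly measurements can be illustrated with a deliberately simple stand-in for fine-tuning: a linear calibration fitted on the few candidates for which both a DFT estimate and an experimental value exist, which is then applied to candidates screened by DFT only. All overpotentials and candidate names below are synthetic assumptions; real multi-fidelity or transfer-learning schemes are considerably richer.

```python
def fit_line(xs, ys):
    """Least-squares fit y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Paired data: DFT-estimated versus measured overpotentials (V), synthetic values.
dft_paired = [0.30, 0.40, 0.50]
exp_paired = [0.41, 0.52, 0.63]
a, b = fit_line(dft_paired, exp_paired)

# Candidates with DFT estimates only (hypothetical names and values).
dft_only = {"candidate_A": 0.35, "candidate_B": 0.45}
calibrated = {name: a * value + b for name, value in dft_only.items()}

for name, value in sorted(calibrated.items(), key=lambda item: item[1]):
    print(f"{name}: calibrated overpotential = {value:.3f} V")
```

Only the three paired measurements are expensive here; every further candidate is ranked at the cost of one low-fidelity calculation.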

6.4. Identifying economic and sustainability challenges and encouraging collaborative efforts

Economic and sustainability issues are major challenges in the advancement of ML approach-based electrochemical energy conversion and storage technologies.

Recent advances demonstrate that ML-guided strategies can substantially reduce both the number of experiments and material consumption in materials discovery workflows. For example, active learning reduces DFT simulations by over 70%, reducing the computational time and energy use.188 Building on this efficiency, Ceder's A-Lab platform combines robotic synthesis with ML heuristics and DFT energetics to achieve a success rate of 71%, synthesizing 41 of 58 predicted solids in 17 days and delivering more than two new materials per day.202 Moreover, adaptive Bayesian optimization further extends these gains to the laboratory by reducing the number of physical trials by about 70% when targeting specific properties.203
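The throughput figures quoted above for the A-Lab campaign follow directly from the reported counts, as the short check below shows; the variable names are illustrative.

```python
predicted_targets = 58   # solids proposed for synthesis
synthesized = 41         # solids successfully obtained
campaign_days = 17       # duration of the campaign

success_rate = synthesized / predicted_targets
materials_per_day = synthesized / campaign_days

print(f"success rate: {success_rate:.1%}")             # 70.7%, i.e. about 71%
print(f"materials per day: {materials_per_day:.2f}")   # 2.41
```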

However, although ML can streamline material usage and minimize waste, it is associated with certain economic and sustainability issues. Substantial resources and energy are necessary for running large-scale ML models, leading to significant costs. Although high-throughput synthesis is effective in generating massive datasets, material waste occurs if it is not properly handled. Thus, to overcome these drawbacks, scientists should explore more energy-efficient ML algorithms and adopt sustainable practices in experimental configurations. Reusing and recycling electrodes/electrocatalysts in high-throughput tests may reduce waste. The gap between the idealized environments usually represented by DFT simulations and the sophisticated operational conditions of electrodes/electrocatalysts is substantial. Future research should prioritize high-throughput experimental routes that investigate device-scale synthesis and analysis. Such experiments are realistic, enabling the practical deployment of ML-optimized electrode/electrocatalyst configurations and the creation of more advanced theoretical models for feasible operational systems. Collaboration between industry and academia can effectively address these challenges. Industrial partners can offer support and real-world requirements that steer researchers toward more applicable opportunities. Collaborative initiatives may involve shared datasets, industry-sponsored projects, and joint research enterprises. Resource sharing, elevated innovation, and practical utilization of research findings are a few of the benefits, and thus, teamwork between industry and academia can address the economic and sustainability challenges in electrochemical energy storage and conversion. Such collaboration will allow the rapid progress of ML applications in electrode/electrocatalyst design and optimization, resulting in the advancement of more sustainable and efficient energy conversion and storage technologies.

Author contributions

Diab Khalafallah: conceptualization, formal analysis, methodology, visualization, writing – original draft, writing – review & editing, and investigation. Fuming Lai: conceptualization, formal analysis, writing – original draft, writing – review & editing, and investigation. Hao Huang: validation and investigation. Jue Wang: formal analysis and validation. Xiaoqing Wang: data curation and formal analysis. Shengfu Tong: methodology, writing – review & editing, and visualization. Qinfang Zhang: resources, writing – review & editing, and funding acquisition.

Conflicts of interest

The authors have no conflicts of interest to declare.

Data availability

The data are available from the corresponding authors upon reasonable request.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 12474276 and 12274361), the Natural Science Foundation of Jiangsu Province (BK20211361), and the College Natural Science Research Project of Jiangsu Province (20KJA430004).

References

  1. P. E. Brockway, A. Owen, L. I. Brand-Correa and L. Hardt, Estimation of global final-stage energy-return-on-investment for fossil fuels with comparison to renewable energy sources, Nat. Energy, 2019, 4, 612–621.
  2. Z. W. Seh, J. Kibsgaard, C. F. Dickens, I. Chorkendorff, J. K. Nørskov and T. F. Jaramillo, Combining theory and experiment in electrocatalysis: Insights into materials design, Science, 2017, 355, eaad4998.
  3. D. Khalafallah, F. Qiao, C. Liu, J. Wang, Y. Zhang, J. Wang, Q. Zhang and P. H. L. Notten, Heterostructured transition metal chalcogenides with strategic heterointerfaces for electrochemical energy conversion/storage, Coord. Chem. Rev., 2023, 496, 215405.
  4. L. Ma, M. A. Schroeder, O. Borodin, T. P. Pollard, M. S. Ding, C. Wang and K. Xu, Realizing high zinc reversibility in rechargeable batteries, Nat. Energy, 2020, 5, 743–749.
  5. D. Khalafallah, Y. Zhang, H. Wang, J.-M. Lee and Q. Zhang, Energy-saving electrochemical hydrogen production via co-generative strategies in hybrid water electrolysis: Recent advances and perspectives, Chin. J. Catal., 2023, 55, 44–115.
  6. D. Castelvecchi, How the hydrogen revolution can help save the planet—and how it can’t, Nature, 2022, 611, 440–443.
  7. M. Chatenet, B. G. Pollet, D. R. Dekel, F. Dionigi, J. Deseure, P. Millet, R. D. Braatz, M. Z. Bazant, M. Eikerling, I. Staffell, P. Balcombe, Y. Shao-Horn and H. Schafer, Water electrolysis: from textbook knowledge to the latest scientific strategies and industrial developments, Chem. Soc. Rev., 2022, 51, 4583–4762.
  8. T. Terlouw, C. Bauer, R. McKenna and M. Mazzotti, Large-scale hydrogen production via water electrolysis: a technoeconomic and environmental assessment, Energy Environ. Sci., 2022, 15, 3583–3602.
  9. H. Li, J. Lai, Z. Li and L. Wang, Multi–sites electrocatalysis in high–entropy alloys, Adv. Funct. Mater., 2021, 31, 2106715.
  10. V. Artero, M. Chavarot-Kerlidou and M. Fontecave, Splitting water with cobalt, Angew. Chem., Int. Ed., 2011, 50, 7238–7266.
  11. A. Buttler and H. Spliethoff, Current status of water electrolysis for energy storage, grid balancing and sector coupling via power-to-gas and power-to-liquids: A review, Renewable Sustainable Energy Rev., 2018, 82, 2440–2454.
  12. Y. Zhou and H. J. Fan, Progress and challenge of amorphous catalysts for electrochemical water splitting, ACS Mater. Lett., 2021, 3, 136–147.
  13. K. Zeng and D. Zhang, Recent progress in alkaline water electrolysis for hydrogen production and applications, Prog. Energy Combust. Sci., 2010, 36, 307–326 CrossRef CAS .
  14. P. Li, M. Wang, X. Duan, L. Zheng, X. Cheng, Y. Zhang, Y. Kuang, Y. Li, Q. Ma, Z. Feng, W. Liu and X. Sun, Boosting oxygen evolution of single-atomic ruthenium through electronic coupling with cobalt-iron layered double hydroxides, Nat. Commun., 2019, 10, 1711 CrossRef PubMed .
  15. M. Qin, S. Li, Y. Zhao, C.-Y. Lao, Z. Zhang, L. Liu, F. Fang, H. Wu, B. Jia, Z. Liu, W. Wang, Y. Liu and X. Qu, Unprecedented synthesis of holey 2D layered double hydroxide nanomesh for enhanced oxygen evolution, Adv. Energy Mater., 2019, 9, 1970003 CrossRef .
  16. S. Li, B. B. Chen, Y. Wang, M. Y. Ye, P. A. van Aken, C. Cheng and A. Thomas, Oxygen-evolving catalytic atoms on metal carbides, Nat. Mater., 2021, 20, 1240 CrossRef CAS PubMed .
  17. K. Kawashima, R. A. Marquez, L. A. Smith, R. R. Vaidyula, O. A. Carrasco Jaim, Z. Wang, Y. J. Son, C. L. Cao and C. B. Mullins, A Review of transition metal boride, carbide, pnictide, and chalcogenide water oxidation electrocatalysts, Chem. Rev., 2023, 123, 12795–13208 CrossRef CAS PubMed .
  18. T. Kou, S. Wang and Y. Li, Perspective on high-rate alkaline water splitting, ACS Mater. Lett., 2021, 3, 224–234 CrossRef CAS .
  19. B. R. Wygant, K. Kawashima and C. B. Mullins, Catalyst or precatalyst? The effect of oxidation on transition metal carbide, pnictide, and chalcogenide oxygen evolution catalysts, ACS Energy Lett., 2018, 3, 2956–2966 CrossRef CAS .
  20. S. Jin, Are metal chalcogenides, nitrides, and phosphides oxygen evolution catalysts or bifunctional catalysts?, ACS Energy Lett., 2017, 2, 1937–1938 CrossRef CAS .
  21. K. N. Dinh, Q. Liang, C.-F. Du, J. Zhao, A. I. Y. Tok, H. Mao and Q. Yan, Nanostructured metallic transition metal carbides, nitrides, phosphides, and borides for energy storage and conversion, Nano Today, 2019, 25, 99–121 CrossRef CAS .
  22. J. Joo, T. Kim, J. Lee, S.-I. Choi and K. Lee, Morphology-controlled metal sulfides and phosphides for electrochemical water splitting, Adv. Mater., 2019, 31, 1806682 CrossRef PubMed .
  23. Y. Jia, K. Jiang, H. Wang and X. Yao, The role of defect sites in nanomaterials for electrocatalytic energy conversion, Chem, 2019, 5, 1371–1397 CAS .
  24. G. Wu, K. L. More, C. M. Johnston and P. Zelenay, High-performance electrocatalysts for oxygen reduction derived from polyaniline, iron, and cobalt, Science, 2011, 332, 443–447 CrossRef CAS PubMed .
  25. A. Chunduri, A. Bhide, S. Gupta, K. H. Mali, B. R. Bhagat, A. Dashora, M. Spreitzer, R. Fernandes, R. Patel and N. Patel, Exploring the role of multi-catalytic sites in an amorphous Co–W–B electrocatalyst for hydrogen and oxygen evolution reactions, ACS Appl. Energy Mater., 2023, 6, 4630–4641 CrossRef CAS .
  26. D. Chen, Z. Pu, P. Wang, R. Lu, W. Zeng, D. Wu, Y. Yao, J. Zhu, J. Yu, P. Ji and S. Mu, Mapping hydrogen evolution activity trends of intermetallic Pt-group silicides, ACS Catal., 2022, 12, 2623–2631 CrossRef CAS .
  27. Y. Liu, X. Liang, H. Chen, R. Gao, L. Shi, L. Yang and X. Zou, Iridium-containing water-oxidation catalysts in acidic electrolyte, Chin. J. Catal., 2021, 42, 1054–1077 CrossRef CAS .
  28. C. Wang, L. Jin, H. Shang, H. Xu, Y. Shiraishi and Y. Du, Advances in engineering RuO2 electrocatalysts towards oxygen evolution reaction, Chin. Chem. Lett., 2021, 32, 2108–2116 CrossRef CAS .
  29. Q. Q. Zhang and J. Q. Guan, Single-atom catalysts for electrocatalytic applications, Adv. Funct. Mater., 2020, 30, 2000768 CrossRef CAS .
  30. S. Banerjee, C. S. Gerke and V. S. Thoi, Guiding CO2RR selectivity by compositional tuning in the electrochemical double layer, Acc. Chem. Res., 2022, 55, 504–515 CrossRef CAS PubMed .
  31. D. Silva, M. Leonardo, R. Cesar, M. R. Moreira, H. M. Santos, S. De and G. Lindomar, Reviewing the fundamentals of supercapacitors and the difficulties involving the analysis of the electrochemical findings obtained for porous electrode materials, Energy Storage Mater., 2020, 27, 555–590 CrossRef .
  32. B. Yao, S. Chandrasekaran, J. Zhang, W. Xiao, F. Qian, C. Zhu, E. B. Duoss, C. M. Spadaccini, M. A. Worsley and Y. Li, Efficient 3D printed pseudocapacitive electrodes with ultrahigh MnO2 loading, Joule, 2019, 3, 459–470 CrossRef CAS .
  33. E. Pomerantseva, F. Bonaccorso, X. L. Feng, Y. Cui and Y. Gogotsi, Energy storage: The future enabled by nanomaterials, Science, 2019, 366, 969 CrossRef PubMed .
  34. T. Wang, R. Pan, M. L. Martins, J. Cui, Z. Huang, B. P. Thapaliya, C. L. Do-Thanh, M. Zhou, J. Fan, Z. Yang, M. Chi, T. Kobayashi, J. Wu, E. Mamontov and S. Dai, Machine-learning-assisted material discovery of oxygen-rich highly porous carbon active materials for aqueous supercapacitors, Nat. Commun., 2023, 14, 4607 CrossRef CAS PubMed .
  35. S. Zhang and N. Pan, Supercapacitors performance evaluation, Adv. Energy Mater., 2015, 5, 1401401 CrossRef .
  36. K. Tran and Z. W. Ulissi, Active learning across intermetallics to guide discovery of electrocatalysts for CO2 reduction and H2 evolution, Nat. Catal., 2018, 1, 696–703 CrossRef CAS .
  37. D. Khalafallah, A. A. Farghaly, C. Ouyang, W. Huang and Z. Hong, Atomically dispersed Pt single sites and nanoengineered structural defects enable a high electrocatalytic activity and durability for hydrogen evolution reaction and overall urea electrolysis, J. Power Sources, 2023, 558, 232563 CrossRef CAS .
  38. S. Kirklin, J. E. Saal, B. Meredig, A. Thompson, J. W. Doak, M. Aykol, S. Rühl and C. Wolverton, The Open Quantum Materials Database (OQMD): assessing the accuracy of DFT formation energies, npj Comput. Mater., 2015, 1, 15010 CrossRef CAS .
  39. Y. Dan, Y. Zhao, X. Li, S. Li, M. Hu and J. Hu, Multiscale computational understanding and growth of 2D materials: a review, npj Comput. Mater., 2020, 6, 84 CrossRef CAS .
  40. Z. W. Ulissi, A. R. Singh, C. Tsai and J. K. Nørskov, Automated discovery and construction of surface phase diagrams using machine learning, J. Phys. Chem. Lett., 2016, 7, 3931–3935 CrossRef CAS PubMed .
  41. X. Shan, Y. Pan, F. Cai, H. Gao, J. Xu, D. Liu, Q. Zhu, P. Li, Z. Jin, J. Jiang and M. Zhou, Accelerating the discovery of efficient high-entropy alloy electrocatalysts: High-throughput experimentation and data-driven strategies, Nano Lett., 2024, 24, 11632–11640 CrossRef CAS PubMed .
  42. Y. Wang, G. Brocks and S. Er, Data-driven discovery of intrinsic direct-gap 2D Materials as potential photocatalysts for efficient water splitting, ACS Catal., 2024, 14, 1336–1350 CrossRef CAS .
  43. J. A. Keith, V. Vassilev-Galindo, B. Cheng, S. Chmiela, M. Gastegger, K.-R. Müller and A. Tkatchenko, Combining machine learning and computational chemistry for predictive insights into chemical systems, Chem. Rev., 2021, 121, 9816–9872 CrossRef CAS PubMed .
  44. H. Chen, Y. Zheng, J. Li, L. Li and X. Wang, AI for nanomaterials development in clean energy and carbon capture, utilization and, storage (CCUS), ACS Nano, 2023, 17, 9763–9792 CrossRef CAS PubMed .
  45. L. Shen, J. Zhou, T. Yang, M. Yang and Y. P. Feng, High-throughput computational discovery and intelligent design of two-dimensional functional materials for various applications, Acc. Mater. Res., 2022, 3, 572–583 CrossRef CAS .
  46. A. Ihalage and Y. Hao, Formula graph self-attention network for representation-domain independent materials discovery, Adv. Sci., 2022, 9, 2200164 CrossRef PubMed .
  47. X. Fan, L. Chen, D. Huang, Y. Tian, X. Zhang, M. Jiao and Z. Zhou, From single metals to high-entropy alloys: How machine learning accelerates the development of metal electrocatalysts, Adv. Funct. Mater., 2024, 34, 2401887 CrossRef CAS .
  48. K. Ryan, J. Lengyel and M. Shatruk, Crystal structure prediction via deep learning, J. Am. Chem. Soc., 2018, 140, 10158–10168 CrossRef PubMed .
  49. I. Jeong, Y. Shim, S. Oh, J. M. Yuk, K.-M. Roh, C.-W. Lee and K. T. Lee, A machine learning-enhanced framework for the accelerated development of spinel oxide electrocatalysts, Adv. Energy Mater., 2024, 14, 2402342 CrossRef CAS .
  50. S. N. Steinmann, Q. Wang and Z. W. She, How machine learning can accelerate electrocatalysis discovery and optimization, Mater. Horiz., 2023, 10, 393–406 RSC .
  51. L. Chen, Y. Tian, X. Hu, S. Yao, Z. Lu, S. Chen, X. Zhang and Z. Zhou, A universal machine learning framework for electrocatalyst innovation: A case study of discovering alloys for hydrogen evolution reaction, Adv. Funct. Mater., 2022, 32, 2208418 CrossRef CAS .
  52. J. Liu, W. Luo, L. Wang, J. Zhang, X. Fu and J. Luo, Toward excellence of electrocatalyst design by emerging descriptor-oriented machine learning, Adv. Funct. Mater., 2022, 32, 2110748 CrossRef CAS .
  53. X. Liu, L. Zheng, C. Han, H. Zong, G. Yang, S. Lin, A. Kumar, A. R. Jadhav, N. Q. Tran, Y. Hwang, J. Lee, S. Vasimalla, Z. Chen, S. Kim and H. Lee, Identifying the activity origin of a cobalt single-atom catalyst for hydrogen evolution using supervised learning, Adv. Funct. Mater., 2021, 31, 2100547 CrossRef CAS .
  54. L. Qu, P. Wang, B. Motevalli, Q. Liang, K. Wang, W.-J. Jiang, J. Z. Liu and D. Li, New engineering science insights into the electrode materials pairing of electrochemical energy storage devices, Adv. Mater., 2024, 36, 2404232 CrossRef CAS PubMed .
  55. X. Lin, X. Du, S. Wu, S. Zhen, W. Liu, C. Pei, P. Zhang, Z.-J. Zhao and J. Gong, Machine learning-assisted dual-atom sites design with interpretable descriptors unifying electrocatalytic reactions, Nat. Commun., 2024, 15, 8169 CrossRef CAS PubMed .
  56. L. Yang, J. Fan and W. Zhu, Using ternary steric hindrance synergy of a defective MoS2 monolayer to manipulate the electrocatalytic mechanism toward nitric oxide reduction: a first-principles and machine learning study, J. Mater. Chem. A, 2023, 11, 7159–7169 RSC .
  57. M. Zhou, A. Gallegos, K. Liu, S. Dai and J. Wu, Insights from machine learning of carbon electrodes for electric double layer capacitors, Carbon, 2020, 157, 147–152 CrossRef CAS .
  58. L. Chen, X. Zhang, A. Chen, S. Yao, X. Hu and Z. Zhou, Targeted design of advanced electrocatalysts by machine learning, Chin. J. Catal., 2022, 43, 11–32 CrossRef .
  59. Y. Qiu, L. Chen, X. Zhang, D. Ping, Y. Tian and Z. Zhou, A universal machine learning framework to automatically identify high-performance covalent organic framework membranes for CH4/H2 separation, AIChE J., 2024, 70, e18575 CrossRef CAS .
  60. S. Lu, Z. Wang, Z. Gao, T. Peng, P. Song, Z. Jia, Y. Zhou, H. Cui, W. Tian, R. Feng, L. Jin and H. Yuan, Modulation of hydrogen evolution reaction performance of MXenes by doped transition metals: Comprehensive exploration of high-throughput computing and machine learning, ACS Appl. Mater. Interfaces, 2025, 17, 23795–23808 CrossRef CAS PubMed .
  61. M. V. Jyothirmai, R. Dantuluri, P. Sinha, B. M. Abraham and J. K. Singh, Machine-learning-driven high-throughput screening of transition-metal atom intercalated g-C3N4/MX2 (M = Mo, W; X = S, Se, Te) heterostructures for the hydrogen evolution reaction, ACS Appl. Mater. Interfaces, 2024, 16, 12437–12445 CrossRef CAS PubMed .
  62. X. Jiang, Y. Wang, B. Jia, X. Qu and M. Qin, Using machine learning to predict oxygen evolution activity for transition metal hydroxide electrocatalysts, ACS Appl. Mater. Interfaces, 2022, 14, 41141–41148 CrossRef CAS PubMed .
  63. Y. Zhang, H. Huang, J. Tian, C. Li, Y. Jiang, Z. Fan and L. Pan, Modelling electrified microporous carbon/electrolyte electrochemical interface and unravelling charge storage mechanism by machine learning accelerated molecular dynamics, Energy Storage Mater., 2023, 63, 103069 CrossRef .
  64. M. I. Jordan and T. M. Mitchell, ML: Trends, perspectives, and prospects, Science, 2015, 349, 255–260 CrossRef CAS PubMed .
  65. W. X. Zhao, K. Zhou, J. Li, T. Tang, X. Wang, Y. Hou, Y. Min, B. Zhang, J. Zhang, Z. Dong, Y. Du, C. Yang, Y. Chen, Z. Chen, J. Jiang, R. Ren, Y. Li, X. Tang, Z. Liu, P. Liu, J.-Y. Nie and J.-R. Wen, A survey of large language models, arXiv, 2023, preprint, arXiv:2303.18223 DOI:10.48550/arXiv.2303.18223.
  66. I. H. Sarker, Machine learning: Algorithms, real-world applications and research directions, SN Comput. Sci., 2021, 2, 160 CrossRef PubMed .
  67. Y. Wang, X. Huang, H. Fu and J. Shang, Theoretically revealing the activity origin of the hydrogen evolution reaction on carbon-based single-atom catalysts and finding ideal catalysts for water splitting, J. Mater. Chem. A, 2022, 10, 24362–24372 RSC .
  68. S. Lu, P. Song, Z. Jia, Z. Gao, Z. Wang, T. Peng, X. Bai, Q. Jiang, H. Cui, W. Tian, R. Feng, Z. Liang, Q. Kang, L. Jin and H. Yuan, Symbolic transform optimized convolutional neural network model for high-performance prediction and analysis of MXenes hydrogen evolution reaction catalysts, Int. J. Hydrogen Energy, 2024, 85, 200–209 CrossRef CAS .
  69. J. Lee, A. Seko, K. Shitara, K. Nakayama and I. Tanaka, Prediction model of band gap for inorganic compounds by combination of density functional theory calculations and machine learning techniques, Phys. Rev. B, 2016, 93, 115104 CrossRef .
  70. A. C. Rajan, A. Mishra, S. Satsangi, R. Vaish, H. Mizuseki, K.-R. Lee and A. K. Singh, Machine-learning-assisted accurate band gap predictions of functionalized MXene, Chem. Mater., 2018, 30, 4031–4038 CrossRef CAS .
  71. R. Caruana and A. Niculescu-Mizil, An empirical comparison of supervised learning algorithms. In Proceedings of the 23rd International Conference on ML, 2006, 161–168 Search PubMed.
  72. D. Ruppert, The elements of statistical learning: Data mining, inference, and prediction, J. Am. Stat. Assoc., 2009, 567–585 Search PubMed .
  73. J. E. Van Engelen and H. H. Hoos, A survey on semi-supervised learning, Mach. Learn., 2020, 109, 373–440 CrossRef .
  74. C. Szepesvári, Algorithms for reinforcement learning, Springer Nature, 2022 Search PubMed .
  75. Y. LeCun, Y. Bengio and G. Hinton, Deep learning, Nature, 2015, 521, 436–444 CrossRef CAS PubMed .
  76. Z. Li, F. Liu, W. Yang, S. Peng and J. Zhou, A survey of convolutional neural networks: analysis, applications, and prospects, Neural Netw. Learn. Syst., 2021, 33, 6999–7019 Search PubMed .
  77. L. R. Medsker and L. Jain, Recurrent neural networks, Des. Appl., 2001, 5, 64–67 Search PubMed .
  78. G. Van Houdt, C. Mosquera and G. Nápoles, A review on the long short-term memory model, Artif. Intell. Rev., 2020, 53, 5929–5955 CrossRef .
  79. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser and I. Polosukhin, Attention Is All You Need, in Advances in Neural Information Processing Systems, ed. I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan and R. Garnett, 2017, vol. 30 Search PubMed .
  80. A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser and I. Polosukhin, Attention Is All You Need, arXiv, 2017, preprint, arXiv:1706.03762v7 DOI:10.48550/arXiv.1706.03762.
  81. R. Luo, L. Sun, Y. Xia, T. Qin, S. Zhang, H. Poon and T. Y. Liu, BioGPT: generative pre-trained transformer for biomedical text generation and mining, Brief. Bioinform., 2022, 23, bbac409 CrossRef PubMed .
  82. A. Radford, J. Wu, R. Child, D. Luan, D. Amodei and I. Sutskever, Language models are unsupervised multitask learners, OpenAI blog, 2019, 1, 8 Search PubMed .
  83. L. Floridi and M. Chiriatti, Its nature, scope, limits, and consequences, Minds Mach., 2020, 30, 681–694 CrossRef .
  84. J. Achiam, et al., Gpt-4 technical report, arXiv, 2023, preprint, arXiv:2303.08774 DOI:10.48550/arXiv.2303.08774.
  85. R. Islam and O. M. Moushi, Gpt-4o: The cutting-edge advancement in multimodal llm. Authorea Preprints, 2024 Search PubMed.
  86. J. Devlin, M.-W. Chang, K. Lee and K. Toutanova, Bert: Pre-training of deep bidirectional transformers for language understanding, arXiv, 2018, preprint, arXiv:1810.04805 DOI:10.48550/arXiv.1810.04805.
  87. J. Zhang, H. B. Tao, M. Kuang, H. B. Yang, W. Cai, Q. Yan, Q. Mao and B. Liu, Advances in thermodynamic-kinetic model for analyzing the oxygen evolution reaction, ACS Catal., 2020, 10, 8597–8610 CrossRef CAS .
  88. S. M. Tan, C. K. Chua, D. Sedmidubsky, Z. C. Sofer and M. Pumera, Electrochemistry of layered GaSe and GeS: applications to ORR, OER and HER, Phys. Chem. Chem. Phys., 2016, 18, 1699–1711 RSC .
  89. L. Miao, W. Jia, X. Cao and L. Jiao, Computational chemistry for water-splitting electrocatalysis, Chem. Soc. Rev., 2024, 53, 2771–2807 RSC .
  90. F. Fasulo, A. Massaro, A. Pecoraro, A. B. Muñoz-García and M. Pavone, Role of defect-driven surface reconstructions in transition metal oxide electrocatalysis towards OER/ORR: A quantum-mechanical perspective, Curr. Opin. Electrochem., 2023, 42, 101412 CrossRef CAS .
  91. L. Ge, H. Yuan, Y. Min, L. Li, S. Chen, L. Xu and W. A. Goddard III, Predicted optimal bifunctional electrocatalysts for the hydrogen evolution reaction and the oxygen evolution reaction using chalcogenide heterostructures based on machine learning analysis of in silico quantum mechanics based high throughput screening, J. Phys. Chem. Lett., 2020, 11, 869–876 CrossRef CAS PubMed .
  92. Q. Zhang and J. Guan, Atomically dispersed catalysts for hydrogen/oxygen evolution reactions and overall water splitting, J. Power Sources, 2020, 471, 228446 CrossRef CAS .
  93. B. K. Kim, S. Sy, A. Yu and J. Zhang, Electrochemical Supercapacitors for Energy Storage and Conversion, Handbook of Clean Energy Systems, 2015, pp. 1–25 Search PubMed .
  94. H. Wang, X. Liu, P. Niu, S. Wang, J. Shi and L. Li, Porous two-dimensional materials for photocatalytic and electrocatalytic applications, Matter, 2020, 2, 1377–1413 CrossRef .
  95. Y. Zhao, Z. Song, X. Li, Q. Sun, N. Cheng, S. Lawes and X. Sun, Metal organic frameworks for energy storage and conversion, Energy Storage Mater., 2016, 2, 35–62 CrossRef .
  96. D. Liu, G. Xu, H. Yang, H. Wang and B. Y. Xia, Rational design of transition metal phosphide-based electrocatalysts for hydrogen evolution, Adv. Funct. Mater., 2022, 33, 2208358 CrossRef .
  97. F. Ahmad, A. Shahzad, M. Danish, M. Fatima, M. Adnan, S. Atiq, M. Asim, M. A. Khan, Q. U. Ain and R. Pervee, Recent developments in transition metal oxide-based electrode composites for supercapacitor applications, J. Energy Storage, 2024, 81, 110430 CrossRef .
  98. A. Zagalskaya and V. Alexandrov, Mechanistic study of IrO2 dissolution during the electrocatalytic oxygen evolution reaction, J. Phys. Chem. Lett., 2020, 11, 2695–2700 CrossRef CAS PubMed .
  99. Y. Wu, R. Yao, Q. Zhao, J. Li and G. Liu, La-RuO2 nanocrystals with efficient electrocatalytic activity for overall water splitting in acidic media: Synergistic effect of La doping and oxygen vacancy, J. Chem. Eng., 2022, 439, 135699 CrossRef CAS .
  100. L. She, G. Zhao, T. Ma, J. Chen, W. Sun and H. Pan, On the durability of iridium-based electrocatalysts toward the oxygen evolution reaction under acid environment, Adv. Funct. Mater., 2021, 32, 2108465 CrossRef .
  101. K. Du, L. Zhang, J. Shan, J. Guo, J. Mao, C.-C. Yang, C.-H. Wang, Z. Hu and T. Ling, Interface engineering breaks both stability and activity limits of RuO2 for sustainable water oxidation, Nat. Commun., 2022, 13, 5448 CrossRef CAS PubMed .
  102. H. B. M. Sidek, J. Lee, X. Jin and S. J. Hwang, Optimization of oxygen evolution electrocatalytic activity of metal oxide nanosheet via surface modification, Bull. Korean Chem. Soc., 2023, 44, 962–968 CrossRef CAS .
  103. K. T. Butler, D. W. Davies, H. Cartwright, O. Isayev and A. Walsh, Machine learning for molecular and materials science, Nature, 2018, 559, 547–555 CrossRef CAS PubMed .
  104. S. Back, K. Tran and Z. W. Ulissi, Toward a design of active oxygen evolution catalysts: insights from automated density functional theory calculations and machine learning, ACS Catal., 2019, 9, 7651–7659 CrossRef CAS .
  105. T. Xie and J. C. Grossman, Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties, Phys. Rev. Lett., 2018, 120, 145301 CrossRef CAS PubMed .
  106. Y. Liu, B. Guo, X. Zou, Y. Li and S. Shi, Machine learning assisted materials design and discovery for rechargeable batteries, Energy Storage Mater., 2020, 31, 434–450 CrossRef .
  107. L. Ward, R. Liu, A. Krishna, V. I. Hegde, A. Agrawal, A. Choudhary and C. Wolverton, Including crystal structure attributes in machine learning models of formation energies via Voronoi tessellations, Phys. Rev. B, 2017, 96, 024104 CrossRef .
  108. Z. W. Ulissi, M. T. Tang, J. Xiao, X. Liu, D. A. Torelli, M. Karamad, K. Cummins, C. Hahn, N. S. Lewis, T. F. Jaramillo, K. Chan and J. K. Nørskov, Machine-learning methods enable exhaustive searches for active bimetallic facets and reveal active site motifs for CO2 reduction, ACS Catal., 2017, 10, 6600–6608 CrossRef .
  109. G. R. Schleder, A. C. M. Padilha, C. M. Acosta, M. Costa and A. Fazzio, From DFT to machine learning: recent approaches to materials science–a review, J. Phys. Mater., 2019, 2, 032001 CrossRef CAS .
  110. L. Ciambriello, I. Alessandri, L. Gavioli and I. Vassalini, NiFe catalysts for oxygen evolution reaction: Is there an optimal thickness for generating a dynamically stable active interface, ChemCatChem, 2024, 16, e202400286 CrossRef CAS .
  111. Y. Zhang, X. Zheng, X. Guo, J. Zhang, A. Yuan, Y. Du and F. Gao, Design of modified MOFs electrocatalysts for water splitting: High current density operation and long-term stability, Appl. Catal., B, 2023, 336, 122891 CrossRef CAS .
  112. M. Yang, C. H. Zhang, N. W. Li, D. Luan, L. Yu and X. W. D. Lou, Design and synthesis of hollow nanostructures for electrochemical water splitting, Adv. Sci., 2022, 9, e2105135 CrossRef PubMed .
  113. W. Feng, M. Bu, S. Kan, X. Gao, A. Guo, H. Liu, L. Deng and W. Chen, Interfacial hetero-phase construction in nickel/molybdenum selenide hybrids to promote the water splitting performance, Appl. Mater. Today, 2021, 25, 101175 CrossRef .
  114. H. Park, Y. Kim, S. Choi and H. J. Kim, Data driven computational design of stable oxygen evolution catalysts by DFT and machine learning: Promising electrocatalysts, J. Energy Chem., 2024, 91, 645–655 CrossRef CAS .
  115. J. Li, N. Wu, J. Zhang, H.-H. Wu, K. Pan, Y. Wang, G. Liu, X. Liu, Z. Yao and Q. Zhang, Machine learning-assisted low-dimensional electrocatalysts design for hydrogen evolution reaction, Nano-Micro Lett., 2023, 15, 227 CrossRef CAS PubMed .
  116. M. Zubair, P. Kumar, M. Klingenhof, B. Subhash, J. A. Yuwono, S. Cheong, Y. Yao, L. Thomsen, P. Strasser, R. D. Tilley and N. M. Bedford, Vacancy promotion in layered double hydroxide electrocatalysts for improved oxygen evolution reaction performance, ACS Catal., 2023, 13, 4799–4810 CrossRef CAS .
  117. A. Badreldin, A. Nabeeh, E. Youssef, N. Mubarak, H. ElSayed, R. Mohsen, F. Ahmed, Y. Wubulikasimu, K. Elsaid and A. Abdel-Wahab, Adapting early transition metal and nonmetallic dopants on CoFe oxyhydroxides for enhanced alkaline and neutral pH saline water oxidation, ACS Appl. Energy Mater., 2021, 4, 6942–6956 CrossRef CAS .
  118. S. N. Steinmann, Q. Wang and Z. W. Seh, How machine learning can accelerate electrocatalysis discovery and optimization, Mater. Horiz., 2023, 10, 393–406 RSC .
  119. K. Takahashi and I. Miyazato, Rapid estimation of activation energy in heterogeneous catalytic reactions via machine learning, J. Comput. Chem., 2018, 39, 2405–2408 CrossRef CAS PubMed .
  120. R. B. Wexler, J. M. P. Martirez and A. M. Rappe, Chemical pressure-driven enhancement of the hydrogen evolving activity of Ni2p from nonmetal surface doping interpreted via machine learning, J. Am. Chem. Soc., 2018, 140, 4678–4683 CrossRef CAS PubMed .
  121. S. Wang, H. Lin, Y. Wakabayashi, L. Q. Zhou, C. A. Roberts, D. Banerjee, H. Jia and C. Ling, Transfer learning aided high-throughput computational design of oxygen evolution reaction catalysts in acid conditions, J. Energy Chem., 2023, 80, 744–757 CrossRef CAS .
  122. J. R. Lunger, J. Karaguesian, H. Chun, J. Peng, Y. Tseo, C. H. Shan, B. Han, Y. Shao-Horn and R. Gómez-Bombarelli, Towards atom-level understanding of metal oxide catalysts for the oxygen evolution reaction with machine learning, npj Comput. Mater., 2024, 10, 80 CrossRef CAS .
  123. Y. Sun, H. Liao, J. Wang, B. Chen, S. Sun, S. J. H. Ong, S. Xi, C. Diao, Y. Du, J.-O. Wang, M. B. Breese, S. Li, H. Zhang and Z. J. Xu, Covalency competition dominates the water oxidation structure−activity relationship on spinel oxides, Nat. Catal., 2020, 3, 554–563 CrossRef CAS .
  124. N. Zhou, Y. Zhao, Q. Lv and Y. Chen, Using machine learning to forecast the conductive substrate-supported heteroatom-doped metal compound electrocatalysts for hydrogen evolution reaction, J. Phys. Chem. C, 2024, 128, 17274–17281 CrossRef CAS .
  125. N. Ran, B. Sun, W. Qiu, E. Song, T. Chen and J. Liu, Identifying Metallic transition-metal dichalcogenides for hydrogen evolution through multilevel high-throughput calculations and machine learning, J. Phys. Chem. Lett., 2021, 12, 2102–2111 CrossRef CAS PubMed .
  126. R. Ramprasad, R. Batra, G. Pilania, A. Mannodi-Kanakkithodi and C. Kim, Machine learning in materials informatics: recent applications and prospects, npj Comput. Mater., 2017, 3, 54 CrossRef .
  127. R. Ding, J. Chen, Y. Chen, J. Liu, Y. Bando and X. Wang, Machine learning applications in electrocatalyst design for water splitting, Chem. Soc. Rev., 2024, 51, 1476–1500 Search PubMed .
  128. Y. Dan, Y. Zhao, X. Li, S. Li, M. Hu and J. Hu, Generative adversarial networks (GANs) for inverse design of inorganic materials, npj Comput. Mater., 2020, 6, 84 CrossRef CAS .
  129. Y. Liu, T. Zhao, W. Ju and S. Shi, Materials discovery and design using machine learning, J. Materiomics, 2017, 3, 159–177 CrossRef .
  130. N. J. O’Connor, A. S. M. Jonayat, M. J. Janik and T. P. Senftle, Interaction trends between single metal atoms and oxide supports identified with density functional theory and statistical learning, Nat. Catal., 2018, 1, 531–539 CrossRef .
  131. H. Liang, P. F. Liu, M. Xu, H. Li and E. Asselin, A study of two-dimensional single atom-supported MXenes as hydrogen evolution reaction catalysts using density functional theory and machine learning, Int. J. Quantum Chem., 2022, 123, e27055 CrossRef .
  132. K. Bang, D. Hong, Y. Park, D. Kim, S. S. Han and H. M. Lee, Machine learning-enabled exploration of the electrochemical stability of real-scale metallic nanoparticles, Nat. Commun., 2023, 14, 3004 CrossRef CAS PubMed .
  133. M. Wang, A. Ishii and K. Sakaushi, Accelerated electrocatalyst degradation testing by accurate and robust forecasting of multidimensional kinetic model with Bayesian data assimilation, ACS Energy Lett., 2025, 10, 22–29 CrossRef CAS .
  134. Y. Hu, J. Chen, Z. Wei, Q. He and Y. Zhao, Recent advances and applications of machine learning in electrocatalysis, J. Mater. Inf., 2023, 3, 18 CAS .
  135. B. R. Goldsmith, J. Esterhuizen, J. X. Liu, C. J. Bartel and C. Sutton, Machine learning for heterogeneous catalyst design and discovery, AIChE J., 2018, 64, 2311–2323 CrossRef CAS .
  136. Y. Zhang, X. Liu and W. Wang, Theoretical calculation assisted by machine learning accelerate optimal electrocatalyst finding for hydrogen evolution reaction, ChemElectroChem, 2024, 11, e202400084 CrossRef CAS .
  137. M. Andersen and K. Reuter, Adsorption enthalpies for catalysis modeling through machine-learned descriptors, Acc. Chem. Res., 2021, 54, 2741–2749 CrossRef CAS PubMed .
  138. R. Ding, J. Chen, Y. Chen, J. Liu, Y. Bando and X. Wang, Unlocking the potential: machine learning applications in electrocatalyst design for electrochemical hydrogen energy transformation, Chem. Soc. Rev., 2024, 53, 11390–11461 RSC .
  139. S. Yue, D. Li, A. Zhang, Y. Yan, H. Yan, Z. Feng and W. Wang, Rational design of single transition-metal atoms anchored on a PtSe2 monolayer as bifunctional OER/ORR electrocatalysts: a defect chemistry and machine learning study, J. Mater. Chem. A, 2024, 12, 5451–5463 RSC .
  140. J. Xu, X. M. Cao and P. Hu, Perspective on computational reaction prediction using machine learning methods in heterogeneous catalysis, Phys. Chem. Chem. Phys., 2021, 23, 11155–11179 RSC .
  141. X. Shi, G. Zhang, Y. Lu and H. Pang, Applications of machine learning in electrochemistry, Renewables, 2023, 1, 668–693 CrossRef .
  142. M. Guo, M. Ji and W. Cui, Theoretical investigation of HER/OER/ORR catalytic activity of single atom-decorated graphyne by DFT and comparative DOS analyses, Appl. Surf. Sci., 2022, 592, 153237 CrossRef CAS .
  143. L. Xie, W. Zhou, Y. Huang, Z. Qu, L. Li, C. Yang, Y. Ding, J. Li, X. Meng, F. Sun, J. Gao, G. Zhao and Y. Qin, Elucidating the impact of oxygen functional groups on the catalytic activity of M-N4-C catalysts for the oxygen reduction reaction: a density functional theory and machine learning approach, Mater. Horiz., 2024, 11, 1719–1731 RSC .
  144. W. Wentao, Y. Qu, D. Li, A. Zhang, H. Yan, Z. Feng and W. Yao, The defect chemistry and machine learning study 5d transition metal doped on graphitic carbon nitride for bifunctional oxygen electrocatalyst with low overpotential, Int. J. Hydrogen Energy, 2024, 79, 702–714 CrossRef .
  145. C. Chen, B. Xiao, Z. Qin, J. Zhao, W. Li, Q. Li and X. Yu, Metal-doped C3B monolayer as the promising electrocatalyst for hydrogen/oxygen evolution reaction: a combined density functional theory and machine learning study, ACS Appl. Mater. Interfaces, 2023, 15, 40538–40548 CrossRef CAS PubMed .
  146. D. Jain, S. Chaube, P. Khullar, S. G. Srinivasan and B. Rai, Bulk and surface DFT investigations of inorganic halide perovskites screened using machine learning and materials property databases, Phys. Chem. Chem. Phys., 2019, 21, 19423–19436 RSC .
  147. K. Choudhary and B. DeCost, Atomistic Line Graph Neural Network for improved materials property predictions, Npj Comput. Mater., 2021, 7(173), 2021 Search PubMed .
  148. H. Rossignol, M. Minotakis, M. Cobelli and S. Sanvito, Machine-Learning-Assisted Construction of Ternary Convex Hull Diagrams, J. Chem. Inf. Model., 2024, 64, 1828–1840.
  149. N. Ma, Y. Zhang, Y. Wang, C. Huang, J. Zhao, B. Liang and J. Fan, Machine learning-assisted exploration of the intrinsic factors affecting the catalytic activity of ORR/OER bifunctional catalysts, Appl. Surf. Sci., 2023, 628, 157225.
  150. S. G. Priyanga, M. N. Mattur, N. Nagappan, S. Rath and T. Thomas, Prediction of nature of band gap of perovskite oxides (ABO3) using a machine learning approach, J. Materiomics, 2022, 8, 937–948.
  151. Y. Zhuo, A. M. Tehrani and J. Brgoch, Predicting the Band Gaps of Inorganic Solids by Machine Learning, J. Phys. Chem. Lett., 2018, 9, 1668–1673.
  152. Y. Zhang, Y. Zhang, Z. Guo, Y. Fang, C. Tang, N. Miao, B. Sa, J. Zhou and Z. Sun, Establishing theoretical landscapes for identifying basal plane active sites in MBene toward multifunctional HER, OER, and ORR catalysts, J. Colloid Interface Sci., 2023, 652, 1954–1964.
  153. S. Lu, J. Cao, Y. Zhang, F. Lou and Z. Yu, Transition metal single-atom supported on PC3 monolayer for highly efficient hydrogen evolution reaction by combined density functional theory and machine learning study, Appl. Surf. Sci., 2022, 606, 154945.
  154. X. Liu, Y. Zhang, W. Wang, Y. Chen, W. Xiao, T. Liu, Z. Zhong, Z. Luo, Z. Ding and Z. Zhang, Transition metal and N doping on AlP monolayers for bifunctional oxygen electrocatalysts: density functional theory study assisted by machine learning description, ACS Appl. Mater. Interfaces, 2022, 14, 1249–1259.
  155. M. Zhang, Y. Hou, Y. Jiang, X. Ni, Y. Wang and X. Zou, Rational design of water splitting electrocatalysts through computational insights, Chem. Commun., 2024, 60, 14521–14536.
  156. Y. Lv, G. Chen, R. Ma, J. Y. Lee and B. Kang, Hybrid scheme of DFT and machine learning to accelerate the design of graphyne nanoribbons as electrocatalysts for the ORR and HER, Fuel, 2024, 357, 130017.
  157. X. Sun, J. Zheng, Y. Gao, C. Qiu, Y. Yan, Z. Yao, S. Deng and J. Wang, Machine-learning-accelerated screening of hydrogen evolution catalysts in MBenes materials, Appl. Surf. Sci., 2020, 526, 146522.
  158. J. Hu, J. Mo, C. Yu, D. Liu, R. Zhang, L. Miao, X. Ji and J. Jiang, Universal electronic descriptors for optimizing hydrogen evolution in transition metal-doped MXenes, Appl. Surf. Sci., 2024, 653, 159329.
  159. R. Anand, A. S. Nissimagoudar, M. Umer, M. Ha, M. Zafari, S. Umer, G. Lee and K. S. Kim, Late transition metal doped MXenes showing superb bifunctional electrocatalytic activities for water splitting via distinctive mechanistic pathways, Adv. Energy Mater., 2021, 11, 2102388.
  160. M. Ha, D. Y. Kim, M. Umer, V. Gladkikh, C. W. Myung and K. S. Kim, Tuning metal single atoms embedded in NxCy moieties toward high-performance electrocatalysis, Energy Environ. Sci., 2021, 14, 3455–3468.
  161. X. Huang, X. Hu, J. Wang and H. Xu, Overlooked role of electrostatic interactions in HER kinetics on MXenes: Beyond the conventional descriptor ΔG ≈ 0 to identify the real active site, J. Phys. Chem. Lett., 2024, 15, 11200–11208.
  162. R. K. Sharma, M. K. Jena, H. Minhas and B. Pathak, Machine-learning-assisted screening of nanocluster electrocatalysts: Mapping and reshaping the activity volcano for the oxygen reduction reaction, ACS Appl. Mater. Interfaces, 2024, 16, 63589–63601.
  163. D. Jin, L. R. Johnson, A. S. Raman, X. Ming, Y. Gao, F. Du, Y. Wei, G. Chen, A. Vojvodic, Y. Gogotsi and X. Meng, Computational screening of 2D ordered double transition-metal carbides (MXenes) as electrocatalysts for hydrogen evolution reaction, J. Phys. Chem. C, 2020, 124, 10584–10592.
  164. N. Zhou, Y. Zhao, Q. Lv and Y. Chen, Using machine learning to forecast the conductive substrate-supported heteroatom-doped metal compound electrocatalysts for hydrogen evolution reaction, J. Phys. Chem. C, 2024, 128, 17274–17281.
  165. S. Cao, Y. Luo, T. Li, J. Li, L. Wu and G. Liu, Machine learning assisted screening of doped metals phosphides electrocatalyst towards efficient hydrogen evolution reaction, Mol. Catal., 2023, 551, 113625.
  166. W. A. Saidi, T. Nandi and T. Yang, Designing multinary noble metal-free catalyst for hydrogen evolution reaction, Electrochem. Sci. Adv., 2022, 3, e2100224.
  167. X. Lin, Y. Wang, X. Chang, S. Zhen, Z. J. Zhao and J. Gong, High-throughput screening of electrocatalysts for nitrogen reduction reactions accelerated by interpretable intrinsic descriptor, Angew. Chem., Int. Ed., 2023, 62, e202300122.
  168. M. Yan, S. Dong, Y. Li, Z. Liu, H. Zhao, Z. Ma, F. Geng, Z. Li and C. Wu, Accelerating the design and optimization of catalysts for the hydrogen evolution reaction in transition metal phosphides using machine learning, Mol. Catal., 2023, 548, 113402.
  169. B. M. Abraham, P. Sinha, P. Halder and J. K. Singh, Fusing a machine learning strategy with density functional theory to hasten the discovery of 2D MXene-based catalysts for hydrogen generation, J. Mater. Chem. A, 2023, 11, 8091–8100.
  170. H. Zhang, Q. Wei, S. Wei, Y. Luo, W. Zhang and G. Liu, Machine learning assisted screening of nitrogen-doped graphene-based dual-atom hydrogen evolution electrocatalysts, Mol. Catal., 2025, 570, 114649.
  171. B. J. Kim, E. Fabbri, M. Borlaf, D. F. Abbott, I. E. Castelli, M. Nachtegaal, T. Graule and T. J. Schmidt, Oxygen evolution reaction activity and underlying mechanism of perovskite electrocatalysts at different pH, Mater. Adv., 2021, 2, 345–355.
  172. Q. Shi, C. Zhu, D. Du and Y. Lin, Robust noble metal-based electrocatalysts for oxygen evolution reaction, Chem. Soc. Rev., 2019, 48, 3181–3192.
  173. Y. Pi, Q. Shao, P. Wang, F. Lv, S. Guo, J. Guo and X. Huang, Trimetallic oxyhydroxide coralloids for efficient oxygen evolution electrocatalysis, Angew. Chem., Int. Ed., 2017, 56, 4502–4506.
  174. B. Han, A. Grimaud, L. Giordano, W. T. Hong, O. Diaz-Morales, Y.-L. Lee, J. Hwang, N. Charles, K. A. Stoerzinger, W. Yang, M. T. M. Koper and Y. Shao-Horn, Iron-based perovskites for catalyzing oxygen evolution reaction, J. Phys. Chem. C, 2018, 122, 8445–8454.
  175. J. Shan, Y. Zheng, B. Shi, K. Davey and S. Z. Qiao, Regulating electrocatalysts via surface and interface engineering for acidic water electrooxidation, ACS Energy Lett., 2019, 4, 2719–2730.
  176. B. Weng, Z. Song, R. Zhu, Q. Yan, Q. Sun, C. G. Grice, Y. Yan and W.-J. Yin, Simple descriptor derived from symbolic regression accelerating the discovery of new perovskite catalysts, Nat. Commun., 2020, 11, 3513.
  177. W. T. Hong, R. E. Welsch and Y. Shao-Horn, Descriptors of oxygen-evolution activity for oxides: a statistical evaluation, J. Phys. Chem. C, 2016, 120, 78–86.
  178. B. M. Abraham, M. V. Jyothirmai, P. Sinha, F. Viñes, J. K. Singh and F. Illas, Catalysis in the digital age: Unlocking the power of data with machine learning, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2024, 14, e1730.
  179. S. Lin, H. Xu, Y. Wang, X. C. Zeng and Z. Chen, Directly predicting limiting potentials from easily obtainable physical properties of graphene-supported single-atom electrocatalysts by machine learning, J. Mater. Chem. A, 2020, 8, 5663–5670.
  180. H. Niu, X. Wan, X. Wang, C. Shao, J. Robertson, Z. Zhang and Y. Guo, Single-atom rhodium on defective g-C3N4: A promising bifunctional oxygen electrocatalyst, ACS Sustainable Chem. Eng., 2021, 9, 3590–3599.
  181. Z. Li, L. E. K. Achenie and H. Xin, An adaptive machine learning strategy for accelerating discovery of perovskite electrocatalysts, ACS Catal., 2020, 10, 4377–4384.
  182. B. Rohr, H. S. Stein, D. Guevarra, Y. Wang, J. A. Haber, M. Aykol, S. K. Suram and J. M. Gregoire, Benchmarking the acceleration of materials discovery by sequential learning, Chem. Sci., 2020, 11, 2696–2706.
  183. R. A. Flores, C. Paolucci, K. T. Winther, A. Jain, J. A. G. Torres, M. Aykol, J. Montoya, J. K. Nørskov and M. Bajdich, Active learning accelerated discovery of stable iridium oxide polymorphs for the oxygen evolution reaction, Chem. Mater., 2020, 32, 5854–5863.
  184. C. Zhu, S. Fu, Q. Shi, D. Du and Y. Lin, Single-atom electrocatalysts, Angew. Chem., Int. Ed., 2017, 56, 13944.
  185. Y. Zhao, J. Wan, H. Yao, L. Zhang, K. Lin, L. Wang, N. Yang, D. Liu, L. Song, J. Zhu, L. Gu, L. Liu, H. Zhao, Y. Li and D. Wang, Few-layer graphdiyne doped with sp-hybridized nitrogen atoms at acetylenic sites for oxygen reduction electrocatalysis, Nat. Chem., 2018, 10, 924.
  186. F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel and J. Vanderplas, Scikit-learn: Machine learning in Python, J. Mach. Learn. Res., 2011, 12, 2825.
  187. M. Umer, S. Umer, M. Zafari, M. Ha, R. Anand, A. Hajibabaei, A. Abbas, G. Lee and K. S. Kim, Machine learning assisted high-throughput screening of transition metal single atom based superb hydrogen evolution electrocatalysts, J. Mater. Chem. A, 2022, 10, 6679.
  188. Y. Wang, J. Sha, S. Zhu, L. Ma, C. He, C. Zhong, W. Hu and N. Zhao, Data-driven design of carbon-based materials for high-performance flexible energy storage devices, J. Power Sources, 2023, 556, 232522.
  189. X. Liu, D. Ji, X. Jin, V. Quintano and R. Joshi, Machine learning assisted chemical characterization to investigate the temperature-dependent supercapacitance using Co-rGO electrodes, Carbon, 2023, 214, 118342.
  190. L. Wang, S. Gao, W. Li, A. Zhu, H. Li, C. Zhao, H. Zhang, W.-H. Wang and W. Wang, Machine learning assisted screening of MXenes pseudocapacitive materials, J. Power Sources, 2023, 564, 232834.
  191. P. Malakar, M. S. H. Thakur, S. M. Nahid and M. M. Islam, Monolayer transition-metal dichalcogenides for applications in flexible electronics, ACS Appl. Nano Mater., 2022, 5, 16489–16499.
  192. S. Ghosh, G. R. Rao and T. Thomas, Machine learning-based prediction of supercapacitor performance for a novel electrode material: Cerium oxynitride, Energy Storage Mater., 2021, 40, 426–438.
  193. M. Zhou, A. Gallegos, K. Liu, S. Dai and J. Wu, Insights from machine learning of carbon electrodes for electric double layer capacitors, Carbon, 2020, 157, 147–152.
  194. M. Alam and S. Husain, Hyperparameter tuned machine learning predictions of specific capacitance of conducting polymers and their composites for high performance advanced supercapacitors, Appl. Phys. A: Mater. Sci. Process., 2025, 131, 67.
  195. H. Su, Z. Yang, Y. Li and Q. Chen, Predicting the capacitance of carbon-based electric double layer capacitors by machine learning, Nanoscale Adv., 2019, 1, 2162–2166.
  196. S. Zhu, F. Chen, H. Wu and M. Zheng, Artificial neural network enabled capacitance prediction for carbon-based supercapacitors, Mater. Lett., 2018, 233, 294–297.
  197. G. K. Yogesh, D. Nandi, R. Yeetsorn, W. Wanchan, C. Devi, R. P. Singh, A. Vasistha, M. Kumar, P. Koinkar and K. Yadav, A machine learning approach for estimating supercapacitor performance of graphene oxide nano-ring based electrode materials, Energy Adv., 2025, 4, 119–139.
  198. S. Nanda, S. Ghosh and T. Thomas, Machine learning aided cyclic stability prediction for supercapacitors, J. Power Sources, 2022, 536, 231174.
  199. Z. Zheng, O. Zhang, C. Borgs, J. T. Chayes and O. M. Yaghi, ChatGPT chemistry assistant for text mining and the prediction of MOF synthesis, J. Am. Chem. Soc., 2023, 145, 18048–18062.
  200. N. J. Szymanski, B. Rendy, Y. Fei, R. E. Kumar, T. He, D. Milsted, M. J. McDermott, M. Gallant, E. D. Cubuk, A. Merchant, H. Kim, A. Jain, C. J. Bartel, K. Persson, Y. Zeng and G. Ceder, An autonomous laboratory for the accelerated synthesis of novel materials, Nature, 2023, 624, 86–91.
  201. R. Ding, X. Wang, A. Tan, J. Li and J. Liu, Unlocking new insights for electrocatalyst design: A unique data science workflow leveraging internet-sourced big data, ACS Catal., 2023, 13, 13267–13281.
  202. N. J. Szymanski, B. Rendy, Y. Fei, R. E. Kumar, T. He, D. Milsted, M. J. McDermott, M. Gallant, E. D. Cubuk, A. Merchant, H. Kim, A. Jain, C. J. Bartel, K. Persson, Y. Zeng and G. Ceder, An autonomous laboratory for the accelerated synthesis of novel materials, Nature, 2023, 617, 510–516.
  203. D. Xue, P. V. Balachandran, J. Hogden, J. Theiler and T. Lookman, Accelerated search for materials with targeted properties by adaptive design, Nat. Commun., 2016, 7, 11241.

Footnote

Diab Khalafallah and Fuming Lai contributed equally to this work.

This journal is © the Partner Organisations 2025