Machine learning (ML)-assisted development of 2D green catalysts to support sustainability

Manshu Dhillon; Soumya Mahapatra; Adreeja Basu; Shyam S. Pandey; Manpreet Singh Manna; Shantanu Bhattacharya; Basab Chakraborty; Ajeet Kaushik; Aviru Kumar Basu

doi:10.1039/D5MH01739D

DOI: 10.1039/D5MH01739D (Review Article) Mater. Horiz., 2026, 13, 619-640

Manshu Dhillon ^a, Soumya Mahapatra ^h, Adreeja Basu ^b, Shyam S. Pandey ^c, Manpreet Singh Manna ^d, Shantanu Bhattacharya ^e, Basab Chakraborty ^f, Ajeet Kaushik ^g and Aviru Kumar Basu *^a
^aQuantum Material and Device Unit, Institute of Nano Science and Technology, Mohali, Punjab 140306, India. E-mail: aviru.basu@inst.ac.in
^bUniversity Centre for Research and Development, Chandigarh University, Gharuan, Mohali 140431, India
^cGraduate School of Life Science and Systems Engineering, Kyushu Institute of Technology, Kitakyushu, Japan
^dElectrical & Instrumentation Engineering, Sant Longowal Institute of Engineering and Technology, Longowal, Punjab, India
^eDepartment of Mechanical Engineering, Indian Institute of Technology Kanpur, 208016, India
^fRajendra Mishra School of Engineering Entrepreneurship, Indian Institute of Technology Kharagpur, West Bengal 721302, India
^gNanoBioTech Laboratory, Department of Chemistry, Florida Polytechnic University, Lakeland, Florida 33805, USA
^hDepartment of Materials Science and Engineering, Indian Institute of Technology (IIT) Delhi, Hauz Khas, New Delhi, Delhi 110016, India

Received 12th September 2025, Accepted 24th September 2025

First published on 15th October 2025

Abstract

Advanced functional two-dimensional (2D) materials have emerged as efficient catalysts for promoting sustainability through the degradation of pollutants and gases. Their tailored features enable diverse catalytic applications, including photocatalysis, piezo-catalysis, and electrocatalysis; however, chemical synthesis of these materials remains a challenge. Therefore, green synthesis of these catalysts is an emerging focus wherein bio-derived and bio-acceptable bioactive catalysts can deal with environmental issues and overcome challenges associated with traditional routes. In this direction, the timely selection and optimization of green catalysts are key factors, requiring exploration through green chemistry and computational analysis. We believe that the involvement of machine learning (ML) in materials science can offer timely catalyst discovery through data-driven predictions and help in developing high-performance catalytic materials required for a sustainable environment. To cover such aspects, this article explores an ML-assisted investigation of efficient, green catalysts via adopting data-driven predictions, thereby assisting in the design and development of a catalyst with desired properties for piezo-catalysis, water splitting, and photocatalysis. This report explores the need for ML to forecast material properties, optimize reaction conditions, and find new catalysts by enhancing computational techniques, such as the density functional theory (DFT), that require a lot of resources. This is a new approach that faces some challenges, which are systematically discussed in this report. The outcomes of this report will serve as guidelines for scholars to explore ML-assisted development of green 2D catalysts, which are needed to achieve high-performance catalysis, thereby managing and maintaining a sustainable environment.

Manshu Dhillon

Manshu Dhillon is currently pursuing her Ph.D. under the supervision of Dr. Aviru Kumar Basu at the Institute of Nano Science and Technology (INST), DST, Govt. of India, Mohali, India. She is working on the piezo-photocatalytic degradation of pollutants in industrial wastewater, such as organic dyes, and on using the treated water as a renewable energy source for hydrogen and oxygen evolution reactions. She has completed her master's (MSc) in Physics. She has expertise in the fabrication and characterization of nanomaterials, green synthesis processes, 3D printing, piezo-catalysis, photocatalysis, and electrochemistry.

Aviru Kumar Basu

Aviru Kumar Basu is a Scientist at the Institute of Nano Science and Technology (INST), Mohali, DST, Govt. of India. His research focuses on 2D materials, MEMS/NEMS, biosensors, lab-on-chip, AI/ML, photocatalysis, and gas sensing. He earned his PhD from IIT Kanpur under Prof. Shantanu Bhattacharya and pursued postdoctoral research in Singapore. He has received the RSC Young Scientist Award (2024) for contributions to nanotechnology. His work has been highlighted by DST and featured in media. Dr. Basu has authored journal articles, conference papers, book chapters, and edited two books with AIP Publishing, USA.

Wider impact

This review highlights machine learning (ML) integration with the development of green-synthesized 2D nanomaterials to advance catalyst discovery, catalyst optimization, and sustainability. We systematically discuss ML-driven approaches for catalysis, emphasizing their role in high-throughput screening, reaction condition optimization, and structure–property relationship analysis. With increasing global energy demands and environmental concerns, 2D nanomaterials such as graphene, borophene, MoS₂, and MXenes have emerged as promising candidates for sustainable catalysis. Their high surface area and tunable electronic properties enable enhanced catalytic performance. However, conventional synthesis methods often involve hazardous chemicals and high energy consumption. Our review explores how green synthesis techniques, utilizing bioactive components and benign precursors, overcome these limitations while enhancing catalytic efficiency and aligning with sustainability goals. We further emphasize how ML transforms material discovery by enabling data-driven predictions and rapid screening, thereby reducing reliance on traditional empirical, trial-and-error methods. Unlike previous reviews, our review bridges ML methodologies, green chemistry principles, and sustainable nanomaterial design, offering a comprehensive roadmap for next-generation catalyst development. Additionally, we propose hybrid ML frameworks and cyber-physical approaches to advance catalytic research for hydrogen production and environmental remediation. Recently published catalysis reviews emphasize pioneering interdisciplinary research in catalytic science and engineering.

1. Introduction

Growing industrialization and world population present major difficulties in environmental preservation and in the development of sustainable energy systems.¹ Traditional non-renewable energy sources are the main contributors to greenhouse gas emissions and non-biodegradable environmental pollutants.² Eco-friendly and effective methods for pollutant degradation are therefore required, among which catalysis stands out because it enables environmentally sustainable chemical reactions.³ One of its most significant contributions is in the degradation of pollutants, including organic contaminants, heavy metals, and greenhouse gases. Catalytic processes such as photocatalysis, piezocatalysis, and electrocatalysis harness renewable energy sources like sunlight, mechanical stress, or electricity to drive reactions that break down harmful substances into less toxic or inert products.⁴ For example, photocatalysis converts the organic pollutants present in water into degradable or eco-friendly substances.⁵ Similarly, electrocatalysis facilitates the conversion of CO₂ and H₂O into hydrogen, oxygen, and other value-added chemicals.⁶ Catalysis can degrade pollutants efficiently without producing secondary waste, making it a valuable approach for a clean and green environment.⁷

Nanomaterial selection is very important in catalysis, where nanomaterials are emerging as key tools owing to their unique properties.⁸ 2D nanomaterials in particular have garnered immense interest due to their physicochemical properties.⁹ Some of them, such as graphene, borophene, MoS₂, and MXenes, have already been explored for catalysis and show remarkable performance, which makes them ideal candidates for enhancing catalytic efficiency.¹⁰ However, conventional synthesis methods for these 2D nanomaterials use harmful chemicals, consume a great deal of energy, and release toxic byproducts, which creates a further environmental problem. Green synthesis approaches that use eco-friendly precursors are therefore being adopted instead.² They minimize the use of hazardous chemicals, and their byproducts are environmentally friendly. Green synthesis approaches often utilize plant extracts, microorganisms, or benign precursors, aligning with green chemistry principles and sustainable development.¹¹ These methods offer a safer alternative to conventional synthesis techniques and often produce nanomaterials with superior catalytic properties due to the involvement of bioactive components.¹²

Theoretically, an enormous number of possible catalysts can be designed, far more than can be studied extensively in laboratory experiments. Therefore, theoretical techniques like DFT and molecular dynamics (MD), which were developed to examine the microstructures of materials, are commonly used for in silico catalyst screening to speed up the discovery of novel materials. High-throughput DFT calculations have advanced the exploration of undiscovered material spaces.^13,14 However, due to the inherently complicated nature of catalysts and their numerous applications, computational studies using DFT are expensive and time-consuming, especially as the amount of data grows continuously.¹⁵ Here, ML plays a complementary role. DFT remains indispensable for giving accurate, quantum-mechanical descriptions of catalytic processes, particularly when benchmark datasets or mechanistic insights are required. By learning from curated DFT datasets, ML models can be applied to unexplored catalysts to predict their properties rapidly, at a fraction of the computational cost and with minimal resources.^13,16
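As a rough illustration of the surrogate idea described above, the following sketch fits a k-nearest-neighbour regressor to a handful of descriptor/energy pairs and uses it to estimate the property of an unseen candidate. The descriptor values, the target energies, and the names `DFT_DATASET` and `knn_predict` are hypothetical placeholders, not data or methods from the cited studies; practical workflows typically rely on richer descriptors and on models such as kernel methods, ensembles, or neural networks.

```python
import math

# Hypothetical training set: (descriptor vector, DFT-computed target in eV).
# The numbers are illustrative placeholders, not values from any cited study.
DFT_DATASET = [
    ((0.10, 1.20), -0.45),
    ((0.35, 0.90), -0.20),
    ((0.60, 0.75), 0.05),
    ((0.85, 0.40), 0.30),
]


def knn_predict(query, dataset, k=2):
    """Estimate the target for `query` as the mean target of its k nearest neighbours."""
    if not 1 <= k <= len(dataset):
        raise ValueError("k must be between 1 and the dataset size")
    # Rank the training points by Euclidean distance to the query descriptor.
    ranked = sorted(dataset, key=lambda item: math.dist(item[0], query))
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / k


# Screen an unseen candidate without running a new DFT calculation.
candidate = (0.50, 0.80)
estimate = knn_predict(candidate, DFT_DATASET, k=2)
print(f"Estimated target for {candidate}: {estimate:.3f} eV")
```

The point of the sketch is the division of labour: the expensive calculations populate the training set once, and the inexpensive model is then queried for each new candidate.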

A wide range of mathematical techniques and model systems are included in ML, such as support vector machines (SVM), deep learning (DL), and deep neural networks (DNN). ML has emerged as an important instrument in catalysis research because it allows computers to learn from data, improve task performance, and make precise predictions or judgments without explicit programming. Its incorporation into this domain has complemented resource-intensive DFT calculations and improved material discovery and functional material optimization.^14,17

The benefits of ML over traditional methods, especially in recognition, search, and prediction tasks, have provided innovative solutions to long-standing issues in a wide range of fields, including the rapidly evolving 2D material sciences. ML algorithms are excellent at analyzing large, intricate datasets to predict the performance of new materials, optimize reaction conditions, and clarify underlying mechanisms.^18–20 ML offers a potent toolkit for improving the catalytic efficacy of photocatalysis, piezocatalysis, and electrocatalysis when paired with green-synthesized 2D nanomaterials, hence increasing their useful applications. Moreover, ML approaches play an important role in revealing complex structure–property connections and promoting the de novo discovery of novel materials.^21,22

In this work, we explore green-synthesized 2D nanomaterials like MXenes and transition metal dichalcogenides (TMDs) for catalytic applications and how ML is revolutionizing catalysis, with a focus on how it affects the development and optimization of electrocatalysts and photocatalysts (Fig. 1). We highlight the important ML algorithms, like ensemble methods and DL, and how they can be used to speed up high-throughput screening, optimize reaction conditions, and find structure–property relationships. We also demonstrate the potential of ML to improve catalyst design by combining it with methods like DFT.


	Fig. 1 Combining the green synthesis of 2D nanomaterials with ML for catalytic applications to support sustainability.

This review offers a thorough overview of green-synthesized 2D nanomaterials and state-of-the-art ML techniques, emphasizing how they can facilitate the de novo discovery of catalysts and customized materials. Our goal is to accelerate the development of high-performance catalysts and advance sustainable energy technologies by encouraging interdisciplinary collaboration.

2. State-of-the-art 2D materials as effective catalysts

Due to their distinct physical, chemical, and electronic characteristics, next-generation 2D materials have drawn much interest. These qualities make them well suited to various catalytic applications, including electrocatalysis, piezocatalysis, and photocatalysis. Usually only a few atoms thick, these materials have attributes that make them suitable candidates for solar energy conversion, pollutant degradation, and the production of hydrogen and oxygen. These attributes include their high surface area, electronic structures, and light absorption properties.^9,23 Conventional techniques for creating 2D materials frequently require hazardous chemicals, large energy consumption, and environmentally unfavorable conditions. As a result, green synthesis techniques that are more economical, environmentally benign, and sustainable, and that use natural and renewable resources like plant extracts, microorganisms, or safe reagents, are becoming increasingly popular. Green synthesis creates 2D materials without the hazardous byproducts that come with traditional methods.⁷ This strategy aligns with international initiatives to lessen adverse environmental effects and advance sustainable development.¹² The creation of next-generation green-synthesized 2D materials such as graphene, TMDs, and borophene is a significant step towards more environmentally friendly technology and sustainable industrial processes. By combining advanced materials science with green chemistry concepts, high-performance catalysts can be developed that work effectively and meet global sustainability targets.

2.1. Graphene spectrum materials

Graphene-based materials (graphene, graphene oxide (GO), and reduced graphene oxide (rGO)) are finding rapidly expanding use as catalysts in dye degradation and water splitting due to their unique chemical, electronic, and physical properties; high surface-area-to-volume ratio; ease of modification; and planar honeycomb nanostructures. Oxygen-containing functional groups (epoxy, carboxyl, and hydroxyl) are prevalent in GO, where they increase the accessible surface area and serve as active sites for catalytic processes. These functional groups increase the efficiency of electron transport and aid in the better dispersion of catalysts. Although pure graphene lacks a bandgap, functionalized versions such as GO and rGO can be tailored for photocatalysis. They are frequently combined with other semiconductors (such as ZnO or TiO₂) to enhance electron transport and limit the recombination of electron–hole pairs. In dye degradation, GO facilitates the generation of reactive oxygen species (ROS), such as hydroxyl radicals, which speeds up the breakdown of organic contaminants.⁵ For water splitting, GO and rGO can act as excellent conductive substrates, enhancing charge separation and transfer and thus improving the overall electrocatalytic performance. When GO is combined with other materials like metal oxides or sulfides, it acts synergistically with them by providing additional surface area, improved charge transport, and reduced recombination of charge carriers, leading to superior catalytic efficiency in dye degradation and water-splitting reactions.

S. Krishnan et al. synthesized G-ZnFe₂O₄/rGO nanohybrids via green chemistry using orange peels (Citrus sinensis L.) for the photocatalytic degradation of organic pollutants and demonstrated their photo-antibacterial and cytotoxic activities,²⁴ as shown in Fig. 2(a). A reduced graphene oxide/copper (RGO/Cu) composite has also been produced using spearmint extract. Fig. 2(b) illustrates this green and sustainable method for fabricating RGO/Cu nanocomposites, in which graphene oxide and Cu²⁺ ions are reduced using spearmint extract as both a capping agent and a reductant.¹


	Fig. 2 Green synthesis process for (a) G-ZnFe₂O₄/rGO nanohybrids (reproduced from ref. 24 with permission from Elsevier, copyright 2021) and (b) RGO/Cu nanocomposite using spearmint extract (reproduced from ref. 1 with permission from Elsevier, copyright 2021).

In another study, A. R. Malik et al. used a straightforward one-step approach to decorate RGO sheets with green-synthesized ZnO NPs, mediated by Ocimum basilicum extract.²⁵ Using the DPPH scavenging assay, the antioxidant activity of the green-synthesized ZnO NPs and RGO–ZnO NCs was found to be dose-dependent. In in vitro antidiabetic tests, RGO–ZnO efficiently inhibited α-amylase and α-glucosidase. Furthermore, RGO–ZnO NCs showed concentration-dependent antibacterial activity against both Gram-positive (Cocci) and Gram-negative (E. coli) bacterial strains. When used as photocatalysts, ZnO NPs and RGO–ZnO NCs degraded the Rh-B dye by 91.4% and 96.7%, respectively, under UV-visible light. Compared to pure ZnO NPs, RGO–ZnO NCs had superior antibacterial, antidiabetic, and photocatalytic activities, which makes RGO–ZnO nanocomposites promising materials for biological research and photocatalysis.
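Degradation percentages such as those quoted above are conventionally computed from the dye concentration (or its proportional absorbance) before and after irradiation. The following is a minimal sketch assuming that standard definition; the function name and the sample concentrations are illustrative and are not taken from ref. 25.

```python
def degradation_efficiency(c0, ct):
    """Return the percentage of dye removed, given initial (c0) and final (ct) concentration.

    Absorbance readings can be passed instead of concentrations as long as
    both are taken at the same wavelength and scale linearly with concentration.
    """
    if c0 <= 0:
        raise ValueError("initial concentration must be positive")
    if not 0 <= ct <= c0:
        raise ValueError("final concentration must lie between 0 and c0")
    return (c0 - ct) / c0 * 100.0


# Illustrative numbers only: 10 mg/L of dye reduced to 0.33 mg/L.
print(f"{degradation_efficiency(10.0, 0.33):.1f}% degraded")
```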

A light-stable and magnetically separable g-Fe₃O₄/RGO nanocomposite was synthesized by D. K. Padhi et al. via a single-step hydrothermal process, with Averrhoa carambola leaf extract as a natural surfactant, for a variety of water purification applications.²⁶ The leaf extract significantly altered the structural, optical, and electrical characteristics of the Fe₃O₄ nanoparticles. At room temperature, the g-Fe₃O₄/2RGO nanocomposite achieved 76% phenol degradation and 97% Cr(VI) reduction. The in situ loading of RGO was responsible for the increased activity of g-Fe₃O₄/2RGO. Photoluminescence and photocurrent measurements indicated that the synergism between RGO and the supermagnetic Fe₃O₄ nanoparticles leads to improved separation of photoexcited charge carriers (e⁻/h⁺). Furthermore, compared to GO and a conventional antibiotic (30 μg), the g-Fe₃O₄/2RGO nanocomposite showed superior antibacterial action against three bacterial pathogens: Staphylococcus aureus (MTCC-737), Bacillus subtilis (MTCC 736), and Escherichia coli (MTCC-443).

Despite facing challenges such as costly synthesis processes and limited thermal stability in oxygen-rich environments, graphene-based materials have considerable potential for a wide range of applications if issues related to production and integration can be addressed.^27,28 Some techniques, such as developing graphene-based hybrids or nanocomposites and incorporating nanoparticles onto graphene surfaces, should improve performance. These techniques can effectively enhance the electrical properties and surface-to-volume ratio, thus lowering detection limits.²⁹

2.2. MXenes

MXenes, a family of 2D transition metal carbides, nitrides, and carbonitrides, have drawn much interest in water splitting and photocatalysis because of their unique surface, optical, and electrical characteristics. Their layered structure, high conductivity, and adjustable surface chemistry make them strong candidates for increasing the effectiveness of photocatalytic processes. Surface terminations give these recently discovered 2D materials flexibility and a variable composition. Their general formula is M_{n+1}X_nT_x, where n = 1–3: n + 1 layers of an early transition metal (M) are interleaved with n layers of carbon and/or nitrogen (X), and the surfaces end with functional groups (denoted T_x or T_z). MXenes are exfoliated from MAX phase precursors, M_{n+1}AX_n (M = Mo, Ti, Zr, and Cr; A = Al, Ga, Ge, and Si; and X = C and N), which determine their structure and composition. According to the value of n (1 to 3), the MAX phases are typically divided into three categories, namely the 211, 312, and 413 phases. The first MXene, Ti₃C₂, was described by Naguib et al.,³⁰ and more than 30 different MXenes have since been fabricated experimentally using assorted approaches.^31,32 Because their synthesis typically involves etching with strong acids such as hydrofluoric acid, MXene surfaces are usually terminated with functional groups such as OH, O, and F.³³ Notably, the synthesis conditions, including temperature, pH, and etchant, significantly affect the distribution of these functional groups.³³ In general, MXenes have a hexagonal close-packed (HCP) crystal structure. However, the M atoms in M₃X₂ and M₄X₃ follow a face-centered cubic (FCC) stacking sequence (ABCABC), whereas the M atoms in M₂X follow an HCP sequence (ABABAB).^34,35
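The naming rules in the paragraph above can be sketched as a small helper that builds the MAX precursor formula, its family label (211, 312, or 413), and the corresponding MXene formula from the index n. The function names and the plain-text "Tx" termination label are illustrative choices, not an established API.

```python
# MAX-phase families named in the text, keyed by n in M(n+1)AX(n).
MAX_FAMILIES = {1: "211", 2: "312", 3: "413"}


def _count(k):
    """Format a stoichiometric count, omitting the implicit 1."""
    return "" if k == 1 else str(k)


def max_phase(metal, a_element, x_element, n):
    """Return (formula, family) of the MAX precursor M(n+1)AX(n)."""
    if n not in MAX_FAMILIES:
        raise ValueError("n must be 1, 2, or 3")
    formula = f"{metal}{_count(n + 1)}{a_element}{x_element}{_count(n)}"
    return formula, MAX_FAMILIES[n]


def mxene(metal, x_element, n, termination="Tx"):
    """Return the MXene formula M(n+1)X(n)Tx obtained once the A layer is removed."""
    if n not in MAX_FAMILIES:
        raise ValueError("n must be 1, 2, or 3")
    return f"{metal}{_count(n + 1)}{x_element}{_count(n)}{termination}"


print(max_phase("Ti", "Al", "C", 2))  # Ti3AlC2, a member of the 312 family
print(mxene("Ti", "C", 2))            # Ti3C2Tx
```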

In photocatalysis, MXenes serve as co-catalysts or support materials that enhance the separation of charge carriers, thus preventing the recombination of electrons and holes, a common issue in conventional photocatalysts. Moreover, their hydrophilic nature allows effective interaction with water molecules, which is particularly beneficial in water-splitting applications. MXenes can act as efficient catalysts for hydrogen and oxygen evolution reactions in water splitting. Their catalytic performance is further enhanced by their ability to be functionalized with various metal ions or doped with other elements. MXenes are photoactive, showing more efficient water splitting under sunlight or visible light irradiation by improving light absorption and providing active sites for catalytic reactions. These materials are promising for sustainable energy production, offering a pathway to more effective solar-to-hydrogen conversion systems. Ti₃C₂T_x is one of the most studied MXenes for photocatalysis due to its metallic conductivity and surface tunability. When coupled with semiconductors like TiO₂ or g-C₃N₄, it enhances photocatalytic hydrogen production by acting as an electron conductor.

Several research groups have recently made significant efforts toward the green synthesis of MXenes. For instance, Limbu et al. produced r-Ti₃C₂T_x MXene at room temperature by a green method using L-ascorbic acid.³⁶ Their findings showed that r-Ti₃C₂T_x is a more promising material than Ti₃C₂T_x for numerous applications, including photo-electro-catalysis, owing to its superior oxidation stability and six times higher electrical conductivity. Fluoride-free MXene processes have received particular attention in the past few years. While fluoride-containing terminal groups provide some essential benefits, other methods, such as electrochemical etching, have been suggested to address the issues caused by excess fluoride.³⁶ Another alternative selectively etches the MAX phase by oxidizing the A element with the cation of a fluorine-free molten Lewis acid salt that has a higher electrochemical redox potential. The redox coupling between A elements in MAX phases and cations in Lewis acid chloride melts can be estimated using Gibbs free energy mapping based on their redox potentials. Shen et al. first reported a molten salt-assisted electrochemical etching (MS-E-etching) technique for creating fluoride-free Ti₃C₂Cl₂.³⁷ A noteworthy finding of that study is that the salt can be reused and no acidic liquid waste is generated during the synthesis, making the approach sustainable and environmentally friendly. These recently developed methods, including electrochemical etching, molten salt-assisted etching, and other fluoride-free etching routes, hold promise for making MXene production more environmentally friendly.

Ti₃C₂T_x/Ti₃AlC₂@Ag-50 showed outstanding catalytic degradation capability towards methylene blue (MB), Rhodamine B (RhB), and methyl orange (MO), with degradation efficiencies of 99.7%, 98.9%, and 99.3%, respectively.³⁸ In another study, a novel nanocomposite of CdLa₂S₄ and Ti₃C₂ was created in which CdLa₂S₄ nanoparticles were well anchored on the surface of 2D Ti₃C₂ nanosheets. Over CdLa₂S₄/Ti₃C₂, a maximum H₂-evolution rate of 11,182.4 μmol h⁻¹ g⁻¹ was attained, and the apparent quantum efficiency at 420 nm was 15.6%, showing that Ti₃C₂ is an effective co-catalyst for photocatalytic H₂ production.³⁹ Zn₂In₂S₅/Ti₃C₂(O, OH)_x hybrids with a 1.5% (by mass) Ti₃C₂(O, OH)_x content delivered a hydrogen yield of 12,983.8 μmol g⁻¹ under visible light, noticeably higher than that of pure Zn₂In₂S₅, with an apparent quantum efficiency of 8.96% at 420 nm. Additionally, the rate of photocatalytic tetracycline removal was approximately 1.25 times greater than that of pure Zn₂In₂S₅, and it could be further enhanced by raising the temperature within the 35–55 °C range. The excellent photocatalytic activity was attributed to the combination of conductive Ti₃C₂(O, OH)_x and visible-light-active Zn₂In₂S₅, which promotes the spatial separation of charges. The transfer efficiency of photogenerated electrons from Zn₂In₂S₅ to Ti₃C₂(O, OH)_x was 33.0%. Based on spectroscopic, electrochemical, and DFT investigations, the authors suggested that an interfacial built-in quasi-alloying effect between Zn₂In₂S₅ and Ti₃C₂(O, OH)_x resulted in significant charge redistribution, which in turn enabled the spatial separation and transfer of photogenerated electron–hole pairs. That work thus revealed the underlying photo-excited charge transfer between the semiconductor and the metallic compound.⁴⁰
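Apparent quantum efficiencies like those reported above are usually obtained by comparing the number of electrons consumed in H₂ evolution (two per H₂ molecule) with the number of incident monochromatic photons. The sketch below assumes that common definition; the function names and the example inputs (lamp power, irradiation time, H₂ amount) are hypothetical and do not reproduce the measurements in refs. 39 or 40.

```python
# Physical constants (SI exact values).
PLANCK = 6.62607015e-34      # J s
LIGHT_SPEED = 2.99792458e8   # m / s
AVOGADRO = 6.02214076e23     # 1 / mol


def incident_photons(power_w, wavelength_nm, seconds):
    """Number of photons delivered by monochromatic light of the given power."""
    photon_energy = PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)  # J per photon
    return power_w * seconds / photon_energy


def apparent_quantum_efficiency(h2_mol, power_w, wavelength_nm, seconds):
    """AQE in percent, counting two reacted electrons per H2 molecule evolved."""
    reacted_electrons = 2.0 * h2_mol * AVOGADRO
    photons = incident_photons(power_w, wavelength_nm, seconds)
    return reacted_electrons / photons * 100.0


# Hypothetical run: 50 micromol of H2 in 1 h under 0.1 W of 420 nm light.
aqe = apparent_quantum_efficiency(50e-6, 0.1, 420.0, 3600.0)
print(f"AQE = {aqe:.2f}%")
```

Because the photon count grows with wavelength at fixed power, the same H₂ yield corresponds to a lower AQE under longer-wavelength light, which is one reason AQE values are always quoted together with the wavelength.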

2.3. Borophene

Borophene, a 2D material composed of boron atoms, has garnered attention because ultrathin 2D nanosheets possess unique and remarkable characteristics, including high surface areas, enhanced chemical and physical reactivity, and quantum confinement effects. Since nearly all their atoms are exposed on the surface, these nanosheets exhibit exceptional photonic, catalytic, electronic, and magnetic properties.⁴¹ In photocatalysis and water splitting, borophene shows immense promise, particularly when synthesized via green chemistry methods. Green synthesis is an environmentally friendly approach that avoids harmful chemicals and instead relies on plant extracts, microwave-assisted synthesis, or biological reagents. These techniques reduce the use of hazardous chemicals and the formation of secondary byproducts, aligning with the global push toward sustainable technologies. Borophene's unique electronic band structure and high surface area make it an excellent candidate for enhancing the efficiency of photocatalytic reactions. Owing to its high electron mobility and active surface sites, borophene can facilitate hydrogen and oxygen evolution reactions in water splitting, particularly when transition metals are added, as shown in Fig. 3. In photocatalysis, borophene absorbs sunlight and efficiently transfers the photogenerated charge carriers with a minimal recombination rate, thereby enhancing the degradation rate of pollutants or the splitting of water molecules into hydrogen and oxygen.


	Fig. 3 (a) Structural model showing transition metal doping in the α-borophene lattice. (b) Gibbs free energies for the HER and CO₂ ERR on metal-doped α-borophene. (c) and (d) Adsorption structure and projected density of states (PDOS) of CO₂-adsorbed Co-doped α-borophene. (e) Negatively charged β12 borophene nanosheet showing reversible CO₂ capture and release processes (reproduced from ref. 42 with permission from Elsevier, copyright 2024).

With green synthesis, borophene-based photocatalysts can contribute to more sustainable energy solutions while addressing environmental concerns such as water pollution. In the search for properties comparable or superior to those of graphene, 2D materials with a graphene-like honeycomb lattice and single-element 2D nanosheets from the same periodic group as carbon have gained significant attention. Notably, graphene-like materials, including hexagonal boron nitride (h-BN), TMDs, and mono-elemental 2D materials such as borophene, silicene, germanene, and stanene, have shown significant potential: they may not only compensate for graphene's zero band gap but also offer other unique qualities that open up new applications and opportunities. Borophene is typically synthesized by etching and evaporation processes. Compared to traditional 2D materials, mono-elemental nanomaterials offer three main advantages: (I) they are more compatible with current semiconductor technology, since, for example, silicon and germanium are the main components of traditional semiconductor materials; (II) they are easy to synthesize at high quality because they comprise only one element; and (III) they are easily degraded and metabolized by biological systems. Black phosphorus is another mono-elemental 2D nanomaterial with excellent biocompatibility. Because of their very large specific surface area and their varied responsiveness to light, pH, electricity, etc., mono-elemental 2D materials are also strong candidates for biological imaging, drug delivery, optical therapy, electronic devices, and other fields.⁴³

Borophene is one of the most intriguing mono-elemental 2D nanomaterials; its adaptability distinguishes it from the others.⁴⁴ The borophene allotropes generated under different conditions and procedures have different geometries and characteristics. Pmmn borophene, for example, is a Dirac material predicted to exhibit Dirac cones and unique electrical characteristics. The allotropes also vary in their properties, since some are isotropic and others are highly anisotropic. The processing parameters must therefore be controlled to produce borophene that meets application requirements.⁴³ Reports on borophene remain scarce, and the unique properties identified so far are just the beginning; many nuances remain to be explored.

The remarkable 2D nanomaterial borophene is emerging in diagnostic instruments, biological sensors, energy storage devices, high-performance medical equipment, and supercapacitors, where it is beginning to replace its predecessors. Interest in borophene is supported by its strong mechanical, optical, thermal, magnetic, and electrical properties, which set it apart from other 2D nanomaterials. Nonetheless, work is still ongoing to translate conceptual and empirical knowledge into workable systems, and research into the analytical and computational chemistry needed to optimize borophene with desired properties is required to close this knowledge gap. Water splitting for the production of hydrogen fuel has garnered a lot of attention lately. Fujishima and Honda first demonstrated photocatalytic water splitting in 1972, using TiO₂ as the photocatalyst.⁴⁵ However, the resulting activity was not very high because the TiO₂ band gap corresponds to the UV range,⁴⁶ so semiconductor materials with a band gap in the visible range are needed to solve this issue. Although numerous materials have already been found, 2D materials are currently in high demand because their adjustable band gap allows this first obstacle to be overcome.
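The link between band gap and usable light can be made concrete with the photon-energy relation λ (nm) ≈ 1239.84 / E_g (eV), which gives the longest wavelength a semiconductor can absorb. The sketch below assumes a 400 nm lower limit for visible light and uses a band gap of about 3.2 eV for TiO₂, a commonly quoted value for the anatase phase that is not stated in this article; the 2.0 eV material is purely hypothetical.

```python
HC_EV_NM = 1239.84       # Planck constant times speed of light, in eV nm
VISIBLE_MIN_NM = 400.0   # assumed short-wavelength limit of visible light


def absorption_edge_nm(band_gap_ev):
    """Longest wavelength (nm) whose photons can excite carriers across the gap."""
    if band_gap_ev <= 0:
        raise ValueError("band gap must be positive")
    return HC_EV_NM / band_gap_ev


def absorbs_visible(band_gap_ev):
    """True if the absorption edge extends beyond the assumed visible limit."""
    return absorption_edge_nm(band_gap_ev) > VISIBLE_MIN_NM


# 3.2 eV: commonly quoted for anatase TiO2 (assumption); 2.0 eV: hypothetical 2D material.
for label, gap in [("TiO2 (assumed 3.2 eV)", 3.2), ("hypothetical 2D material", 2.0)]:
    edge = absorption_edge_nm(gap)
    print(f"{label}: edge at {edge:.0f} nm, visible-active: {absorbs_visible(gap)}")
```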

Additionally, 2D materials offer abundant active sites for catalysis due to their large surface area.⁴⁵ Using first-principles calculations, Li Shi et al.⁴⁷ showed in 2016 that boron monolayers are electrocatalytically active for the hydrogen evolution reaction (HER), the hydrogen-producing half-reaction of water splitting. They concluded that, owing to their metallic character, boron monolayers exhibit a near-zero free energy of hydrogen adsorption (ΔG_H), similar to metals such as Pt. Furthermore, because a mismatch during borophene growth increases the electrocatalytic activity, silver is an appropriate and effective substrate for the HER. Borophene is also a lightweight candidate material for the oxygen evolution reaction (OER) and the HER. Showkat H. Mir and colleagues studied the effect of doping borophene with C, N, P, and Li and determined the adsorption free energies of hydrogen and oxygen. Their results point to a new opportunity for lightweight 2D materials in the HER and OER through improved catalytic activity.⁴⁸ As previously indicated, numerous researchers have demonstrated that borophene has essential features; the data indicate that it is a novel electrocatalyst with great potential to bring about a notable shift in the HER and OER fields.⁴⁹

2.4. Transition metal dichalcogenides

TMDs have recently gained significant attention as a prominent class of 2D nanomaterials. Structurally, TMDs feature a central layer of transition metal atoms flanked by two layers of chalcogen atoms, forming a hexagonal lattice held together by weak van der Waals forces. Common examples include MoS₂, NbS₂, WS₂, TiSe₂, VSe₂, and WTe₂. These materials exhibit tunable band gaps and pronounced light–matter interactions, which make them highly effective for applications like photocatalysis. Among them, MoS₂ stands out for the HER due to its efficient catalytic edge sites. Though TMDs share similarities with graphene, they are thinner and less mechanically robust. However, they excel in electrical conductivity owing to their direct bandgap. As semiconductors, their bandgap is thickness-dependent, increasing as the material approaches monolayer dimensions. TMDs have excellent electrical, optical, and photoluminescent properties, making them perfect candidates for diverse fields such as energy storage, catalysis, and electronics. Their photocatalytic ability is a hot topic for research due to their tunable band gaps, long-term stability, and abundant active reaction sites. Noteworthy examples include WO₃,⁵⁰ TiS₂,⁵¹ SnS₂,^52,53 WSe₂.^49,54 A particularly fascinating property of TMDs is their transition from an indirect to a direct bandgap from bulk to monolayer form, enhancing their performance in various applications.

Due to its superior performance, MoS₂ is the most commonly used TMD in microfluidic devices. To enhance performance further, MoS₂ nanosheets are decorated with metal nanoparticles. The resulting composite not only combines the properties of MoS₂ and the metal nanoparticles but also gains new functionalities from their combination. As a newly emerging class of 2D materials, TMDs offer remarkable characteristics that make them suitable for various applications, including optoelectronics, electronics, and catalysis. Their ease of exfoliation and potential for surface modification further enhance their versatility. While TMDs are already recognized as strong contenders in electronics, materials, and optoelectronics, advances in their use for photocatalysis and water splitting hold considerable promise, particularly for applications requiring sustainable and accessible solutions. The broad manufacturing potential of TMDs has positioned them as a central focus in developing cutting-edge nanomaterials. Using chemical vapor deposition (CVD), researchers have synthesized vertical van der Waals (vdW) heterostructures composed of MoS₂ and h-BN, notably enhancing light absorption. These TMD-based vdW heterostructures are highly effective in minimizing the recombination of photogenerated electron–hole pairs.

Shengda Luo et al. developed a straightforward and eco-friendly method to produce few-layer WS₂ suspensions on a large scale. This process involves directly exfoliating commercially available WS₂ powders in a water–ethanol mixture. Capitalizing on the unique properties of 2D WS₂, the team successfully synthesized a novel 2D WS₂/MoS₂ composite for the first time. This was achieved through an in situ hydrothermal process, which enabled the growth of MoS₂ nanoflakes on the basal planes of a few-layer WS₂. The resulting WS₂/MoS₂ heterostructure was then examined for its photocatalytic properties. Compared to individual WS₂, MoS₂, and their simple physical mixtures, the hybrid material displayed improved photocatalytic efficiency in breaking down organic dye pollutants. This enhancement is attributed to the synergistic interaction within the WS₂/MoS₂ heterostructure. Furthermore, this heterojunction demonstrated markedly superior photocatalytic performance when benchmarked against other recently developed 2D TMD-based composites.⁵⁵

R. Jha et al. demonstrated a novel approach to exfoliating WS₂ using deionized water without relying on chemical solvents or surfactants. This method leverages the thermal expansion mismatch between WS₂ and water, and water's unique properties, such as forming five-, six-, and seven-membered ring structures during freezing. The researchers introduced two eco-friendly synthesis techniques for exfoliation: the ‘hard’ and ‘soft’ quenching methods. Various characterization techniques confirmed the successful exfoliation of bulk WS₂ into nanosheets with fewer than four layers and an average lateral size of approximately 200 nm. This method stands out for producing pristine nanosheets free from residual organic solvents, which are often challenging to eliminate (Fig. 4).⁵⁶


	Fig. 4 Proposed exfoliation mechanism: (a) at 4 °C, water molecules initially cluster together and (b) upon cooling or (c) quenching, the H₂O molecules expand, forming six-membered rings (represented by blue spheres connected with black lines to signify hydrogen bonding). This transitional expansion is further confirmed by the FESEM images shown in (d)–(f). During heating, the agitation of some H₂O molecules initiates the exfoliation process, leading to the separation of hexagonal WS₂ platelets (reproduced from ref. 56 with permission from IOP, copyright 2016).

A. K. Mishra and colleagues presented an environmentally friendly approach for synthesizing few-layer WS₂ and MoS₂ nanosheets using a simple, dilute aqueous solution of household detergent. By briefly sonicating the bulk materials in soapy water, the team scaled up nanosheet production. The resulting WS₂ and MoS₂ nanosheets exhibited thermal stability, optical absorption, and Raman spectra comparable to those of nanosheets obtained by conventional synthesis methods. Their photocatalytic efficiency was demonstrated by degrading a green dye in aqueous solution under visible light. This research highlights the significant potential of TMD nanosheets for environmental remediation, particularly for the sunlight-driven degradation of harmful industrial pollutants in wastewater.⁵⁷

These next-generation 2D materials hold great promise for achieving advancements in photocatalysis, piezocatalysis, and electrocatalysis via the eco-friendly synthesis route using green chemistry for sustainable energy and environmental applications.

3. Machine learning framework in the discovery of green catalysts

ML algorithms can approximate and model any complex relationship in physical systems, provided they have enough data. This distinguishes them from traditional computational techniques, such as DFT, which require explicit coding of chemical and physical principles and quantum mechanics into the software. DFT has been a fundamental tool in computational catalysis, offering detailed insight into reaction mechanisms, adsorption processes, and the correlation between structure and catalytic behavior. With the emergence of extensive DFT-based datasets, the integration of ML techniques has become increasingly powerful, enabling the rapid prediction of catalyst activity, selectivity, and stability across wide chemical domains. This shift has moved the field from isolated, case-specific investigations toward large-scale, data-driven screening approaches that hold promise for sustainable catalysis. However, the direct translation of computational predictions into experimentally verified catalyst systems remains rare. While theoretical studies dominate the literature, only limited examples demonstrate the successful validation of DFT-guided designs under realistic operating conditions. Consequently, there is still no well-defined pipeline that effectively connects DFT datasets and experimental implementation in a streamlined manner. To overcome this gap, future progress will depend on the integration of computational modeling with high-throughput experimental platforms, automated synthesis technologies, and iterative feedback loops, allowing predictions to be continuously refined through experimental outcomes. Such convergence is anticipated to drive the development of robust, practical, and scalable catalyst design strategies in the context of green chemistry. 
Unlike DFT, ML algorithms do not depend on predefined rules; they identify patterns and relationships within a dataset and use them to build models that predict the behavior of new molecules or materials for catalysis applications.

The first step in training any ML algorithm is to assemble a database from pre-existing data.⁵⁹ A high-quality and diverse database is crucial for developing reliable models, designing catalysts, and optimizing processes in green catalyst discovery.⁶⁰ Data can be sourced from published articles, experimental measurements, computational studies, and material structure and property databases. Still, the challenge in materials informatics is to gather sufficient, dependable data that include both well-defined input variables, such as microscopic structures or synthesis conditions, and the macroscopic properties that the model will predict.⁶¹ Publications provide a great deal of data, but these data frequently require extensive preprocessing to handle uncertainties and maintain reliability. To overcome these difficulties, high-throughput experimentation (HTE) and automation technologies are now used to generate datasets quickly, reproducibly, and at scale. For example, Nguyen et al.⁶² built a high-throughput screening (HTS) instrument to generate a dataset for oxidative coupling of methane (OCM) catalysis. This system can automatically assess the performance of up to 20 catalysts under a range of predefined conditions using fixed-bed reactors. Through this approach, a comprehensive dataset of 12 708 experimental points, covering 59 different catalysts, was generated. Yanagiyama et al. also employed HTE platforms to evaluate photocatalytic water purification under practical conditions, showcasing the relevance of such methods for addressing real-world sustainability challenges.⁶³ Similarly, Foppa et al. 
combined HTE with artificial intelligence to extract design rules for selective oxidation catalysts, illustrating how data-centric workflows can accelerate the identification of structure–activity relationships.⁶⁴ Alongside HTE, automation systems and robotics have become integral for dataset generation in green catalysis. Automated platforms minimize experimental variability, reduce human bias, and enable adaptive learning cycles when integrated with machine learning models. Eyke et al. emphasized the promise of machine learning-enhanced HTE, in which automation facilitates closed-loop optimization for catalytic processes.⁶⁵ Fig. 5 graphically illustrates the fundamental architecture of ML in catalyst discovery.^66–68


	Fig. 5 Basic framework for the integration of ML in catalyst discovery (reproduced from ref. 58 with permission from Elsevier, copyright 2022).

The second stage is featurization, sometimes referred to as feature engineering; it transforms the data into an appropriate mathematical format for training ML models. Also, a sub-step, known as feature selection, is essential to remove unnecessary information and create explicit structure–property connections of catalysts. It uses numerical values, such as vectors or tensors known as features or descriptors, to express important characteristics of materials, such as molecules, compounds, and clusters.⁶⁹ Various parameters can be used as features for chemical and material structures, such as stoichiometric properties (e.g., element fractions), elemental properties (e.g., atomic radii), electronic properties (e.g., bandgap, work function), and crystal features (e.g., nuclear positions, radial distribution functions). Interestingly, this process can be automated, and more complex material representations are made possible by recent developments in automatic featurization employing DNNs, which also improves the process by reducing the need for manual intervention.^60,70–72 Physical characteristics like periodicity and invariance must be taken into consideration when choosing features for describing a material. Effective descriptors should distinguish dataset objects and capture key material attributes without redundancy. Descriptors should offer unique information about a material's structure, composition, and physicochemical properties, and the number of descriptors must be carefully limited to avoid overfitting, especially in small datasets that are standard in electrocatalyst and photocatalyst research. 
Some descriptors, such as electronic and atomic ones, may struggle to distinguish between complex forms like allotropes, isomers, or polymorphs, which calls for more advanced techniques.⁷³ Voronoi tessellations,⁷⁴ which divide the crystal lattice into regions surrounding the atoms, and radial distribution functions, which represent density variations with distance, are useful for capturing crystal structures. Hansen et al. predicted molecular energies using pairwise interatomic force fields that respect symmetry and invariance; nonetheless, performance deteriorated for non-equilibrium geometries. To increase predictive power, crystal structures have been encoded using simple structural descriptors, of which the structure graph, Coulomb matrix, topological descriptor, and diffraction fingerprint are four representative examples. Hansen's Bag of Bonds model, a variant of the Coulomb matrix, captures non-local interactions among atoms while maintaining rotational and translational invariance.⁷⁵ However, it lacks reverse mapping capabilities. Xie et al. addressed this by employing a crystal graph and a crystal diffusion variational autoencoder, allowing material properties to be learned directly from atomic connections without needing an invertible representation.⁷⁶ Table 1 lists the main types of material descriptors for ML models that are frequently used in catalysis, with examples.

Table 1 Material descriptors for ML models in catalysis discovery

Descriptor type	Description	Examples	Ref.
Structural	Encodes structural properties, often through models of electrostatic interactions or molecular representations.	Coulomb matrix (CM), atom-centered symmetry functions (ACSF), smooth overlap of atomic positions (SOAP), many-body tensor representation (MBTR).	77–81
Compositional	Describes material composition, often with elemental or molecular breakdowns. Also provides broad-scale material properties that are useful in applications like adsorption studies.	Atomic radii, atomic masses, simplified molecular-input line-entry system (SMILES), PaDEL, RDKit.	60, 82 and 83
Provenance	Relates to synthesis conditions and origins, including manufacturing settings.	Temperature, pH, reaction time.	84–86
Topographical	Details the spatial features of the material, especially for porous materials.	Pore size, surface area, volume	87–90
Electronic	Reflects the electronic properties essential for activity and is useful in applications like catalysis and semiconductors.	HOMO, LUMO, d-band, and p-band characteristics, electron density, and electronegativity.	91–94
Graph-based descriptors	Encodes molecular and crystalline structures using graph models for compatibility with conventional representations.	Property-labeled material fragments (PMFs)	95–97
Experimental	Derived from experimental methods, it is often costly but precise.	X-ray diffraction, extended X-ray absorption fine structure (EXAFS), X-ray absorption near-edge structure (XANES)	98–101
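
As a concrete illustration of the structural descriptors in Table 1, the Coulomb matrix can be sketched in a few lines of plain Python. The sketch uses the commonly quoted definition (diagonal 0.5·Z^2.4, off-diagonal Z_i·Z_j/|R_i − R_j|); the function names, the sorted row-norm fingerprint, and the three-atom geometry are illustrative assumptions, not taken from the cited studies.

```python
import math

def coulomb_matrix(charges, coords):
    """Coulomb matrix from nuclear charges Z and Cartesian positions R."""
    n = len(charges)
    cm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: 0.5 * Z_i ** 2.4
                cm[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: pairwise nuclear repulsion Z_i * Z_j / |R_i - R_j|
                cm[i][j] = charges[i] * charges[j] / math.dist(coords[i], coords[j])
    return cm

def sorted_row_norms(cm):
    """Fingerprint that does not depend on atom ordering: row norms, largest first."""
    return sorted((math.sqrt(sum(v * v for v in row)) for row in cm), reverse=True)

# Hypothetical water-like geometry (illustrative numbers, not a relaxed structure)
charges = [8, 1, 1]
coords = [(0.0, 0.0, 0.0), (1.8, 0.0, 0.0), (-0.45, 1.75, 0.0)]
cm = coulomb_matrix(charges, coords)
fingerprint = sorted_row_norms(cm)
```

Because the matrix depends only on interatomic distances, it is unchanged by rotation and translation; sorting the row norms additionally removes the dependence on atom ordering, at the cost of discarding some information.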

Selecting the right descriptors is essential for material modelling because using too many can complicate the model, resulting in overfitting and diminished predictive accuracy. Sparse, well-chosen subsets of descriptors enhance model predictivity and make the models more straightforward to interpret. Two primary methods for descriptor selection are down-selection and dimensional reduction, as summarized in Table 2.

Table 2 Methods for descriptor selection and dimensional reduction in ML applications

Method	Technique	Description	Advantages	Limitations	Applications	Ref.
Down-selection	Least absolute shrinkage and selection operator (LASSO) (L1 regularization)	Shrinks less relevant descriptors to zero, reducing feature space in regression models.	Identifies key features; reduces complexity	May discard valuable information; unstable with categorical descriptors or poor hyperparameter tuning.	Hydrogen evolution studies on Ni₂P.	22, 77 and 102
	Sure independence screening and sparsifying operator (SISSO)	Identifies optimal descriptors in large and correlated feature spaces.	Effective in large feature spaces	It may be complex to implement in specific datasets	Large-scale material property studies	22, 77 and 103
	Random forest (RF)	Uses a tree-based approach to rank descriptor importance post-training.	Robust and interpretable; handles high-dimensional data	May lose valuable information, particularly with small datasets	Descriptor ranking in catalytic and material	77 and 104
Dimensional reduction	Principal component analysis (PCA)	Projects data to lower dimensions, creating new descriptors from linear combinations of the original ones while retaining maximum variance.	Reduces dimensionality and computational load	Linear projection may result in information loss	Catalysts, photovoltaics, and supramolecular materials	105 and 106
	Kernel PCA	Applies PCA after a nonlinear kernel mapping to a higher-dimensional feature space, which lets it handle nonlinear relationships.	Captures nonlinear structures	Higher computational complexity	Nonlinear structure–property studies	106–109
	t-Distributed stochastic neighbor embedding (t-SNE)	Visualizes high-dimensional data by preserving neighborhood relationships in a lower-dimensional space.	Effective for data visualization	High computational cost; sensitive to hyperparameters	High-dimensional data visualization	105 and 110
	Uniform manifold approximation and projection (UMAP)	Provides efficient nonlinear dimensionality reduction, retaining the global structure of data.	Fast, accurate, and computationally efficient	Potential loss of local data granularity	Catalyst screening and structure–property analysis	111–113
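
The down-selection idea behind LASSO in Table 2 can be illustrated with a minimal coordinate-descent sketch. It assumes standardized descriptors and the objective (1/2n)·||y − Xw||² + α·||w||₁; the synthetic data, in which only the first two of five descriptors influence the target, and all function names are hypothetical.

```python
import random

def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values inside [-alpha, alpha] become exactly 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1/2n)||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho, z = 0.0, 0.0
            for i in range(n):
                # Prediction from all descriptors except j
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
    return w

random.seed(0)
# Hypothetical standardized descriptors; only descriptors 0 and 1 drive the target
X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(60)]
y = [2.0 * row[0] - 1.5 * row[1] + random.gauss(0, 0.05) for row in X]

weights = lasso_coordinate_descent(X, y, alpha=0.2)
selected = [j for j, wj in enumerate(weights) if abs(wj) > 1e-8]
```

The descriptors whose weights survive the L1 penalty form the down-selected subset. The surviving weights are shrunk relative to their unpenalized values, which is why LASSO is often used for selection and followed by a refit on the chosen descriptors.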

Once the optimal feature subset is selected, ML models are trained using linear and nonlinear methods. Data is typically divided into training sets to build the model and test sets to assess its predictive accuracy. As shown in Fig. 6, the overall machine learning workflow for catalyst discovery begins with data collection from literature, databases, and experimental sources, followed by cleaning, normalization, and preprocessing to ensure high-quality inputs. In the model development stage, feature selection and engineering focus on key physicochemical properties and reaction conditions, and appropriate algorithms (ranging from black-box neural networks to interpretable gray- and glass-box models) are selected and validated. The resulting models enable the accurate prediction of reaction pathways, catalytic activity, and optimal conditions, which, in turn, support the iterative refinement of catalyst design. Importantly, the framework underscores critical considerations such as ensuring generalizability across different catalyst systems, balancing interpretability with computational efficiency, and maintaining robust data quality.¹¹⁴ This workflow demonstrates how ML provides predictive power and practical guidance for developing efficient, selective, and sustainable green catalysts.¹¹⁵


	Fig. 6 Conceptual illustration showing possible ML algorithms employed in catalyst discovery (reproduced from ref. 115 with permission from Elsevier, copyright 2025).

Therefore, once these selections have been made, the ML model can be applied to reaction optimization, for example to minimize solvent usage and replace toxic reagents by predicting more sustainable alternatives. ML models trained on DFT data have also already identified non-precious metal catalysts while reducing the complexity of the search. Moreover, these data-driven strategies can guide reactor design, process intensification, and the scaling of these reactions in more resource-efficient ways.

3.1. ML-assisted green catalyst discovery and reaction prediction

Catalysis is a complex, dynamic, and multidimensional process. Traditionally, the development of catalytic materials has relied on empirical trial-and-error approaches, which are time-consuming and labor-intensive, especially when developing multicomponent catalysts under various reaction circumstances. However, recent advancements in ML techniques have changed how electrocatalytic and photocatalytic materials are discovered and designed. This section reviews the application of ML in advancing electrocatalyst and photocatalyst development, highlighting its role in graphene, MXene, and TMD-based photocatalytic and electrocatalytic materials. ML models help to accelerate material discovery and design through these methods, especially in catalysis, by giving accurate predictions and reducing experimental costs.¹¹⁶

3.1.1. Multiple linear regression (MLR). MLR is a simple algorithm that is widely used in materials science for the small datasets typical of green catalyst discovery because it is data-efficient and highly interpretable. It can be applied to tasks such as catalyst design and reaction yield prediction. It assumes a linear relationship between eco-relevant descriptors, such as bandgap, surface area, d-band center, or adsorption energy, and catalytic performance, and it fits the coefficients by minimizing a loss function (for example, via gradient descent), even when only limited experimental or computational data are available. MLR performs well when data points outnumber descriptors, but can overfit with fewer data points or highly correlated features. Regularization techniques like ridge regression (L2) and LASSO (L1) address these issues, while the elastic net combines both for stability. Ge et al. used LASSO to develop an ML model to predict the HER and OER overpotentials of MX₂ heterojunctions, identifying MoTe₂/WTe₂ with a 300° rotation as the best candidate for HER performance.¹⁰² Similarly, Fung et al. employed the SISSO method to analyze transition metal-doped N-doped graphene for the HER.¹⁰³ They identified key descriptors, namely covalent radius, d-state centers, and Bader charge, which were instrumental in designing improved catalysts by modifying graphene supports. These studies demonstrate how ML-driven descriptor analysis enhances catalyst discovery and optimization. Despite its simplicity, MLR is adequate for small datasets with linear relationships. In cases of nonlinearity, MLR can be extended by transforming descriptors into a higher-dimensional space using kernel functions. This makes MLR a reliable predictive tool and a mechanistic guide, helping to identify key factors that govern catalytic behavior. 
In practice, MLR enables the screening of catalyst families, provides insights into sustainability by guiding the selection of less toxic or more energy-efficient materials, and serves as a baseline model against which more complex algorithms can be benchmarked.
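
A minimal sketch of the procedure described above: a linear model fitted by gradient descent on the mean squared error with an optional ridge (L2) penalty, then checked on a held-out test set. The two standardized descriptors, the synthetic target, the 40/10 split, and all names are assumptions made for illustration only.

```python
import random

def predict(row, w, b):
    """Linear model: intercept plus weighted sum of descriptors."""
    return b + sum(wj * xj for wj, xj in zip(w, row))

def fit_linear_gd(X, y, l2=0.0, lr=0.05, n_steps=2000):
    """Gradient descent on mean squared error plus l2 * ||w||^2 (ridge penalty)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_steps):
        errors = [predict(row, w, b) - yi for row, yi in zip(X, y)]
        grad_w = [2 * sum(e * row[j] for e, row in zip(errors, X)) / n + 2 * l2 * w[j]
                  for j in range(p)]
        grad_b = 2 * sum(errors) / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def mean_squared_error(X, y, w, b):
    return sum((predict(row, w, b) - yi) ** 2 for row, yi in zip(X, y)) / len(y)

random.seed(1)
# Hypothetical standardized descriptors (e.g. a band gap and a d-band center)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(50)]
y = [0.3 + 0.8 * r[0] - 0.5 * r[1] + random.gauss(0, 0.02) for r in X]

# Hold out the last 10 points to estimate predictive accuracy
X_train, y_train, X_test, y_test = X[:40], y[:40], X[40:], y[40:]
w, b = fit_linear_gd(X_train, y_train, l2=0.01)
test_mse = mean_squared_error(X_test, y_test, w, b)
```

The fitted weights can be read directly as descriptor sensitivities, which is the interpretability advantage noted above; increasing `l2` trades a small bias for stability when descriptors are correlated or data are scarce.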

3.1.2. Support vector machines. SVMs are well suited to green catalyst discovery, particularly for the small to medium datasets that result from the high cost and time involved in catalytic experiments or simulations. These robust ML algorithms are used for both classification (SVC) and regression (SVR). They are especially useful when linear regression is ineffective, for example when catalytic activity and selectivity depend nonlinearly on descriptors such as electronic structure, adsorption energies, and surface properties. SVMs work by mapping data into a high-dimensional space where linear separation is possible. In SVC, the algorithm identifies the optimal hyperplane that maximizes the margin between two classes. For data that are not linearly separable, kernel functions map the data into a higher-dimensional space in which linear separation becomes possible. SVR differs slightly: it seeks a function that keeps most data points within a defined margin of error, with the tolerance margin (ε) set by the model's required accuracy.¹¹⁷ SVC and SVR are widely used in materials science for catalytic activity prediction and streamlining DFT calculations.^14,118–123 Highlighting the effectiveness of SVR, Sun et al. created a dataset of 180 MBenes and trained four ML algorithms to predict ΔG_H.¹²⁴ The SVR model outperformed the others, identifying Co₂B₂ and Mn-doped Co₂B₂ as promising HER catalysts. These studies underscore the power of ML in accelerating the discovery of novel 2D electrocatalysts. While SVMs generalize well and handle outliers well, they can be slow on large datasets, and there is no definitive rule for choosing the kernel function. Additionally, SVM models have limited transparency and are primarily suited to binary classification. 
This makes SVM a reliable tool for predicting catalytic performance trends, identifying the most influential descriptors, and narrowing down candidate materials for further testing. By accelerating the screening and rational design of catalysts, SVM saves resources and supports the broader goals of green chemistry by enabling the identification of sustainable, efficient, and environmentally benign catalytic systems.^119,120
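
The margin of error used by SVR can be made concrete with the standard ε-insensitive loss, in which residuals inside a tube of half-width ε cost nothing and larger residuals are penalized linearly. The sketch below is a general illustration of that loss, not the training procedure of any study cited here; the function name and the example ΔG_H values are hypothetical.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-point SVR loss: zero inside the epsilon tube, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]

# Hypothetical target and predicted hydrogen adsorption free energies (eV)
y_true = [0.0, 0.0, 0.0]
y_pred = [0.05, -0.25, 0.5]
losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
```

Only the points with nonzero loss influence the fitted function, which is why a wider tube yields a sparser and smoother model at the cost of accuracy.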

3.1.3. Ensemble learning. Ensemble learning methods, such as Random Forests and gradient boosting, are particularly effective in green catalyst discovery because they combine multiple weak or base learners to achieve higher predictive accuracy and robustness. In catalytic research, datasets are often noisy or incomplete, and the relationships between structural descriptors, reaction conditions, and performance outcomes are often nonlinear. Ensemble models handle this complexity by reducing variance and bias through aggregation, making them less prone to overfitting than single-model approaches.^125,126 Random Forests, for instance, provide insights into feature importance, helping researchers identify the most relevant physicochemical descriptors that govern catalytic efficiency or selectivity. Gradient boosting, on the other hand, builds models iteratively to minimize errors, making it especially powerful for predicting subtle trends in activity or stability across diverse catalyst families. (a) Bagging, as in Random Forest (RF), trains models on data subsets and averages their predictions, reducing overfitting and noise. RF effectively predicts catalyst properties and identifies key factors while saving resources.^125,126 For example, Zheng et al. utilized an RF model to screen 299 MXenes. They discovered that S-terminated Os₂B materials hold great promise as HER catalysts, with the S functional groups playing a crucial role in performance.¹²⁷ When boron replaces the X element in MXenes, the resulting MBenes show potential for the HER due to their metallic conductivity. Similarly, Lin et al. applied RF models to investigate the limiting potentials for the HER, ORR, and OER in graphene-supported catalysts. 
Their models, trained on data from 104 catalysts, predicted limiting potentials that matched DFT calculations and led to the discovery of promising new catalysts, including Ir- and Ni-based materials.¹²⁸ (b) Boosting assigns weights to emphasize poorly predicted data points, with algorithms like AdaBoost and gradient boosting improving precision and reducing residual errors.¹²⁶ For instance, Wang et al. employed Adaboost models to map the intrinsic features of MXene binary alloys to hydrogen adsorption energy, identifying bond lengths, ionization energy differences, and valence electrons as key descriptors for predicting HER activity.¹²⁹ (c) Gradient boosting, including gradient boosting tree (GBT) and gradient boosting regression (GBR), is widely used in materials science for property prediction and discovery due to its accuracy and optimization capabilities.¹²⁵ For example, Zhu et al. used XGBoost to predict reaction yields for the synthesis of 2-oxazolidones via Cu-catalyzed radical oxy-alkylation of allylamines and heteroarylmethylamines with CO₂, which is a complex three-component chemical reaction. Out of the 19 ML models that were tested, XGBoost performed noticeably better.¹³⁰ Even with limited reaction data and the complex variables typical of multicomponent reactions in traditional research frameworks, this study demonstrates the impressive potential of regression-based ML models in organic synthesis. By enabling accurate predictions from medium-sized datasets and offering interpretability of key drivers, ensemble learning supports rational catalyst design, optimizes experimental efforts, and advances the discovery of sustainable catalytic systems aligned with the goals of green chemistry.
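
The iterative error correction that underlies gradient boosting can be sketched with one-split regression trees (stumps) fitted to the residuals of the current ensemble. This is a simplified stand-in for libraries such as XGBoost, not their actual implementation; the single descriptor, the volcano-shaped target, and all names are hypothetical.

```python
def fit_stump(x, residuals):
    """Find the single threshold on x that best fits the residuals in least squares."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best[1:]

def gradient_boost(x, y, n_rounds=100, lr=0.3):
    """Squared-error boosting: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        t, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((t, left_mean, right_mean))
        pred = [pi + lr * (left_mean if xi <= t else right_mean)
                for xi, pi in zip(x, pred)]
    return base, lr, stumps

def boosted_predict(model, xi):
    base, lr, stumps = model
    return base + lr * sum(lm if xi <= t else rm for t, lm, rm in stumps)

# Hypothetical volcano-type relation between a descriptor and activity
x = [i / 20 for i in range(21)]
y = [-(xi - 0.5) ** 2 for xi in x]
model = gradient_boost(x, y)

mean_y = sum(y) / len(y)
baseline_mse = sum((yi - mean_y) ** 2 for yi in y) / len(y)
boosted_mse = sum((yi - boosted_predict(model, xi)) ** 2 for xi, yi in zip(x, y)) / len(y)
```

Bagging differs in that its base learners are trained independently on resampled data and then averaged, whereas here each stump depends on the errors left by all previous ones.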

3.1.4. Neural networks (NNs). NNs are bio-inspired algorithms in which interconnected units, similar to neurons in the brain, transmit weighted signals. During training, NNs adjust the weights and biases on each connection to minimize the error between predicted and target values. Their ability to model highly nonlinear and complex relationships, for both continuous and discrete data,¹³¹ allows them to capture subtle interactions between catalyst structures, reaction environments, and performance outcomes that simpler models may overlook. However, they require large datasets and can have long training times. NNs are widely used in electro- and photocatalysis.^132–134 The most common NN architecture is the backpropagation NN, which consists of input, hidden, and output layers. During training, errors are propagated back through the network to adjust the weights, enabling the model to learn from data. While backpropagation NNs are flexible, their design requires trial and error to determine the optimal number of hidden layers and neurons. Bayesian regularization can help optimize the network architecture automatically,¹¹⁴ yielding robust models with high predictive power. Liu et al. used a neural network model to analyze experimental EXAFS structural data for cobalt atoms in N-doped graphene.¹³⁵ Their study revealed that Co atoms at graphene's edge sites (armchair and zigzag) were highly active for the HER, outperforming commercial Pt/C catalysts at high current densities.
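
The backpropagation scheme described above can be sketched as a one-hidden-layer network trained by stochastic gradient descent. This is a minimal illustration under assumed settings (four tanh hidden units, a linear output, a loss of ½·error², and a synthetic volcano-shaped target); it is not the architecture used in the cited work.

```python
import math
import random

def train_mlp(X, y, n_hidden=4, lr=0.1, n_epochs=2000, seed=0):
    """Train input -> tanh hidden layer -> linear output; return mean squared error per epoch."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    history = []
    for _ in range(n_epochs):
        squared_error = 0.0
        for xi, yi in zip(X, y):
            # Forward pass
            h = [math.tanh(sum(w * v for w, v in zip(W1[k], xi)) + b1[k])
                 for k in range(n_hidden)]
            out = sum(w * hk for w, hk in zip(W2, h)) + b2
            err = out - yi
            squared_error += err * err
            # Backward pass: propagate the output error to each hidden unit
            for k in range(n_hidden):
                delta = err * W2[k] * (1.0 - h[k] ** 2)  # uses W2[k] before its update
                W2[k] -= lr * err * h[k]
                b1[k] -= lr * delta
                for j in range(n_in):
                    W1[k][j] -= lr * delta * xi[j]
            b2 -= lr * err
        history.append(squared_error / len(X))
    return history

# Hypothetical nonlinear target: activity peaks at an intermediate descriptor value
X = [[i / 10] for i in range(11)]
y = [1.0 - 4.0 * (row[0] - 0.5) ** 2 for row in X]
history = train_mlp(X, y)
```

Comparing the first and last entries of `history` shows how far training reduced the error. The number of hidden units, learning rate, and epoch count are exactly the kind of settings that require the trial and error mentioned above.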

DL, a subset of ML, primarily employs multi-layered neural network architectures to model complex nonlinear relationships. It uses several hidden layers to progressively abstract features from input data. These models are perfect for handling data like images, sounds, and videos since they can learn intricate patterns without manual feature engineering.¹³⁶ In materials science, DL has been applied to explore catalytic reactivity, optimize experimental conditions, and accelerate catalyst discovery; for example,¹³² Zarafi et al. identified highly selective nitrogen reduction catalysts by predicting nitrogen reduction and hydrogenation energies in B-doped graphene using a DNN.¹³⁷ This study demonstrates how ML can direct catalyst design and optimization for difficult reactions such as nitrogen reduction.^138,139

One type of DL model is the convolutional neural network (CNN), which is particularly useful for image analysis. CNNs consist of convolutional and pooling layers that extract features from images and pass them through fully connected layers for prediction. For example, Jiang et al. created an interpretable CNN-based ML model for predicting the photocatalytic degradation rates of organic pollutants using TiO₂. They combined structural information extracted from molecular images using EfficientNet with experimental data to train neural networks. The model outperformed earlier ML approaches and revealed active molecular sites that affect catalysis, assisting in understanding the connection between molecular structure and photocatalytic activity.¹⁴⁰ CNNs have been used in materials research to predict characteristics like band gaps and binding energies.^141,142 CNNs have also been combined with theory to predict adsorption energies, providing insights into the interactions between adsorbates and metal surfaces.¹⁴³ Despite their versatility, NNs require large datasets and significant computational resources. They are also prone to overfitting, especially with small training sets. To address these challenges, NNs can be combined with ensemble learning, active learning, and evolutionary methods. Additionally, newer techniques such as class activation maps can improve interpretability by providing insights into the internal workings of NN models,¹¹¹ mitigating their “black box” nature.¹⁴⁰
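
As a rough illustration of what convolutional and pooling layers compute, the following sketch applies one hand-written filter to a tiny made-up image. It is not the EfficientNet pipeline used in the cited work; the image, the kernel values, and the helper names are assumptions chosen so that the arithmetic is easy to follow.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: keep the strongest response in each window."""
    return [
        [
            max(feature_map[i + a][j + b]
                for a in range(size) for b in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# Hypothetical 5x5 "image" with a dark-to-bright vertical edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked kernel that responds to a left-to-right intensity increase.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = relu(conv2d_valid(image, kernel))  # 4x4 map, peaks at the edge
pooled = max_pool(feature_map)                   # 2x2 summary of the map
print(feature_map)
print(pooled)
```

In a trained CNN the kernel values are learned rather than chosen by hand, and many such filters are stacked before the fully connected layers.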

Dataset size is one of the most important factors in determining the suitability of different ML models for catalysis applications. Small datasets often originate from scarce experimental electrochemical measurements, such as those for oxygen reduction or alcohol oxidation catalysts, and call for simpler algorithms, such as SVM, to avoid overfitting. Medium datasets typically come from the computational screening of adsorption energies or perovskite stability, where ensemble methods balance predictive power and interpretability. Large datasets, generated through high-throughput DFT calculations or automation-assisted experimentation, enable the use of deep learning architectures and graph neural networks with improved predictive performance, which can be applied effectively in the closed-loop discovery of sustainable catalysts, such as those for CO₂ reduction or green ammonia synthesis. Best practices to mitigate overfitting across dataset scales are summarized in Table 3, based on the models discussed earlier.

Table 3 Dataset size-dependent selection of the machine learning model, overfitting risks, and mitigation strategies in catalysis research

Dataset size	Preferred models	Overfitting risk	Mitigation strategies	Examples in catalysis ML	Ref.
Small datasets (10²–10³ data points)	LASSO, SVM, Random forest, Gradient boosting	High, due to limited data diversity and noise sensitivity	k-fold cross-validation, regularization (L1/L2), and restricted model depth.	ML-guided prediction of ORR activity for non-precious metal catalysts and alcohol oxidation performance from small experimental datasets.	19, 144 and 145
Medium datasets (10³–10⁴ data points)	Ensemble models, Kernel methods	Moderate, due to model complexity; must be balanced with the data size.	Hyperparameter tuning with Bayesian optimization and transfer learning from larger materials datasets.	Prediction of adsorption energies for CO₂ reduction intermediates, stability trends in perovskite oxides, and catalytic activity from computational datasets	107, 115, 146 and 147
Large datasets (>10⁴–10⁵ data points)	Deep neural networks	Lower (if data is diverse and representative), but risk persists if biased	Dropout, batch normalization, and early stopping. Also, closed-loop experimentation with parallel training	High-throughput catalyst discovery for CO₂ reduction and green ammonia synthesis, graph neural networks for heterogeneous catalysis, automated discovery pipelines integrating ML with HTE for heterogeneous catalysis	148–151
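
Two of the small-dataset mitigation strategies listed in Table 3, k-fold cross-validation and L2 regularization, can be sketched together. The example assumes a deliberately simple model, a one-descriptor ridge fit without intercept, and synthetic data; the function names and the candidate penalty values are hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the sample indices and deal them into k validation folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_ridge_1d(xs, ys, lam):
    """Closed-form L2-regularized slope for y ~ w*x: w = sum(xy) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def cross_val_mse(xs, ys, lam, k=5, seed=0):
    """Average validation MSE over k folds; each fold is held out exactly once."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        w = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], lam)
        mse = sum((w * xs[i] - ys[i]) ** 2 for i in held_out) / len(held_out)
        fold_errors.append(mse)
    return sum(fold_errors) / k


# Synthetic small dataset (assumption): y = 2x plus Gaussian noise.
noise = random.Random(1)
xs = [0.5 * i for i in range(1, 21)]
ys = [2.0 * x + noise.gauss(0.0, 0.5) for x in xs]

# Choose the penalty with the lowest cross-validated error.
candidates = [0.0, 0.1, 1.0, 10.0]
scores = {lam: cross_val_mse(xs, ys, lam) for lam in candidates}
best_lam = min(scores, key=scores.get)
print(scores, best_lam)
```

The same loop applies to any model with a fit and a predict step; only `fit_ridge_1d` would change.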

3.1.5. Other algorithms. Several other algorithms are widely used in ML for materials science. K-Nearest Neighbor (KNN) is a simple, efficient algorithm often used for pattern recognition. It identifies the k closest neighbors to an unknown sample based on Euclidean distance.¹⁵² However, it can struggle with large datasets and high-dimensional spaces and is sensitive to noise and missing data. Kernel ridge regression (KRR), an extension of ridge regression, introduces kernel functions to handle nonlinear data. KRR is typically faster than SVR to fit on medium-sized datasets but becomes slower on larger ones, which makes it best suited to moderately sized tasks such as catalyst optimization and crystal structure analysis.¹⁵³ Decision tree (DT) models classify data by applying rules to input features, forming a tree structure with internal nodes (tests), branches (test outcomes), and leaf nodes (class labels or values). DTs are widely used in materials science for tasks like compound classification and catalyst optimization because they are easy to train, often accurate, and straightforward to interpret.¹⁵⁴ They are, however, prone to overfitting and instability, as minor changes in the data can cause the tree to change significantly. Even though pruning lessens this, DTs may still have trouble capturing intricate relationships. Inspired by evolution, genetic algorithms (GAs) use a fitness function to evolve material genomes over multiple iterations, optimizing properties such as catalytic activity. ML models can accelerate this process by serving as surrogate fitness functions, reducing the number of experiments required.¹⁵⁵ Active learning is an iterative method that improves model accuracy with minimal labeled data. A model is trained on a labeled subset, selects high-uncertainty samples for labelling, and is retrained in each cycle.¹⁵⁶ It is helpful for applications like catalyst discovery and property prediction, where data is scarce or expensive to obtain.
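
To make the kernel idea behind KRR concrete, the sketch below solves the standard dual problem, (K + λI)α = y, and predicts with f(x) = Σᵢ αᵢ k(x, xᵢ). It assumes a Gaussian (RBF) kernel, one-dimensional inputs, and toy data; the small linear solver and all names are illustrative, and a real study would use an established numerical library.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel k(a, b) = exp(-gamma * (a - b)^2) for scalar inputs."""
    return math.exp(-gamma * (a - b) ** 2)


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def krr_fit(xs, ys, lam=1e-3, gamma=1.0):
    """Return the dual coefficients alpha = (K + lam * I)^-1 y."""
    n = len(xs)
    k_reg = [
        [rbf_kernel(xs[i], xs[j], gamma) + (lam if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return solve_linear(k_reg, ys)


def krr_predict(x_new, xs, alpha, gamma=1.0):
    """Prediction is a kernel-weighted sum over the training points."""
    return sum(a * rbf_kernel(x_new, x, gamma) for a, x in zip(alpha, xs))


# Toy nonlinear data (assumption): five samples of sin(x).
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [math.sin(x) for x in train_x]
alpha = krr_fit(train_x, train_y)
print(krr_predict(1.5, train_x, alpha), math.sin(1.5))
```

The n-by-n linear solve is the reason fitting cost grows quickly with the number of training points.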

3.2. Limitations, challenges, and future prospects with ML application in green catalysis

Even though ML has been applied to electrocatalysis and photocatalysis with significant success, several limitations can impair model performance. Data quality is crucial because ML models rely on sizable, varied, high-quality datasets; the data must also be comparable, particularly for condition-sensitive experiments such as HER measurements, so training sets should only use data acquired under consistent conditions. Descriptors are essential to ML modelling and its performance. They must be minimally correlated with one another while providing sufficient information about the target property; LASSO and RF can assist with selecting relevant descriptors. Another typical concern is overfitting, specifically when there are more model parameters than data points; regularization, reducing the number of descriptors, or enlarging the dataset can all help prevent this. Cross-validation is also effective for small datasets since it repeatedly evaluates the model on held-out subsets that were not used for training. Furthermore, outliers should be examined carefully since they could reveal important information or point to experimental errors.
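
How LASSO discards uninformative descriptors can be shown with a small coordinate-descent sketch. It assumes the common objective (1/2n)‖y − Xw‖² + λ‖w‖₁ and a tiny, hand-built descriptor table; the data, the penalty value, and the function names are hypothetical and serve only to show the soft-thresholding step that sets weak coefficients exactly to zero.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 one coefficient at a time."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of descriptor j with the partial residual
            z = 0.0    # mean squared value of descriptor j
            for i in range(n):
                pred_i = sum(X[i][k] * w[k] for k in range(p))
                partial_residual = y[i] - pred_i + X[i][j] * w[j]
                rho += X[i][j] * partial_residual / n
                z += X[i][j] ** 2 / n
            w[j] = soft_threshold(rho, lam) / z
    return w


# Hypothetical table: three mutually uncorrelated descriptors, four samples.
X = [
    [1.0, 1.0, 1.0],
    [-1.0, 1.0, -1.0],
    [1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
# Made-up target: strong dependence on descriptor 0, weak on descriptor 2.
y = [3.0 * row[0] + 0.2 * row[2] for row in X]

weights = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print(weights, selected)
```

With this penalty only the strongly relevant descriptor keeps a non-zero weight; a smaller λ would retain the weak one as well.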

Despite the notable advancements, several challenges still exist in using ML for electrocatalysis and photocatalysis. Accurate models require high-quality, diversified datasets, yet obtaining such data can be challenging. Temperature and pressure can affect experimental outcomes, which makes data comparability challenging. Furthermore, published results frequently overlook low-performance materials in favor of preferred results, which can bias models. Enhancing metadata recording and implementing high-throughput synthesis methods will increase data reliability and should address these challenges. Another challenge is the lack of ML applications in kinetic studies of electrocatalysis or photocatalysis due to the complexity of probing excited states. Methods like time-resolved spectroscopy and Kelvin probe microscopy¹⁵⁷ could be useful. Although sophisticated techniques like reinforcement, active, and meta-learning show promise in handling small datasets, limited data remains a problem. Since descriptors directly impact model quality, and it is difficult to represent complex mechanisms using simple interpretable descriptors, it is imperative to build better and more efficient interpretable descriptors to address this issue. Finding novel materials with desirable qualities will be easier using techniques like generative models and DL.

Future research should concentrate on resolving the present drawbacks of ML-assisted catalysis to advance catalyst discovery and optimization. This involves bridging the gap between dynamic, complicated, real-world catalytic systems and simplified descriptor-based models. It is vital to create refined datasets that combine experiments with first-principles calculations performed under consistent reaction conditions (as illustrated in Fig. 7). It will also be necessary to address issues with data representation for time-dependent catalytic events and provide unbiased training sets for ML algorithms.¹⁵⁸ Integrating ML with theoretical and experimental approaches can yield new methodologies that improve the discovery of catalysts and catalytic processes. ML, being a versatile tool, can also be employed in the design and development of green or bio-inspired catalysts by leveraging green chemistry principles. There is considerable potential for ML applications in this relatively unexplored area of developing green catalytic systems with minimal environmental impact. The field should also concentrate on developing hybrid ML models and integrating cyber-physical systems to achieve sustainable and efficient hydrogen production. Pushing the limits of catalyst design requires a cooperative, interdisciplinary approach that combines the knowledge of data science, materials chemistry, and engineering. Furthermore, implementing ML technologies with ethical considerations will be essential for sustainable growth in the hydrogen production sector. Together, these efforts have the potential to accelerate the transition to a cleaner energy future.


	Fig. 7 Revolutionizing and improving the closed-loop catalyst discovery framework, integrating refined datasets, DFT theoretical calculations with HTE, and ATs with ML.

Looking ahead, integrating techniques such as DFT, HTE, and automation technologies (ATs) with supervised ML into a closed-loop framework is important for predicting catalysts and modifying them readily to meet real-world requirements. Advances in catalyst discovery also increasingly emphasize minimizing the environmental footprint of catalyst synthesis and application, as shown in Fig. 7. These advances will position ML both as a tool for faster catalyst discovery and as a driving force in achieving environmentally benign and industrially relevant catalytic solutions.
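
The propose, measure, and retrain cycle of such a closed loop can be sketched in a few lines. Everything in the example is a stand-in: `run_experiment` is a hypothetical placeholder for a DFT calculation or an HTE measurement with a made-up activity landscape, the surrogate is a simple nearest-neighbour lookup rather than a trained ML model, and the selection rule (predicted activity plus a distance bonus for unexplored candidates) is only one of many possible choices.

```python
def run_experiment(x):
    """Hypothetical stand-in for a DFT calculation or an HTE measurement."""
    return -(x - 0.7) ** 2  # made-up landscape: activity peaks at x = 0.7


def predict(x, measured):
    """Nearest-neighbour surrogate: (predicted activity, distance to nearest data)."""
    nearest_x, nearest_y = min(measured, key=lambda point: abs(point[0] - x))
    return nearest_y, abs(nearest_x - x)


def closed_loop(candidates, initial, rounds=5, kappa=1.0):
    """Repeat: score untested candidates, measure the best one, update the data."""
    measured = [(x, run_experiment(x)) for x in initial]
    for _ in range(rounds):
        tested = {x for x, _ in measured}
        pool = [x for x in candidates if x not in tested]
        if not pool:
            break

        def score(x):
            # Favour high predicted activity and poorly explored regions.
            y_hat, distance = predict(x, measured)
            return y_hat + kappa * distance

        next_x = max(pool, key=score)
        measured.append((next_x, run_experiment(next_x)))  # new data point
    return measured


# Hypothetical composition grid from 0.0 to 1.0 and two starting measurements.
candidates = [i / 10 for i in range(11)]
history = closed_loop(candidates, initial=[0.0, 1.0])
best_x, best_y = max(history, key=lambda point: point[1])
print(history)
print(best_x, best_y)
```

In a real pipeline the surrogate would be retrained on the growing dataset at every round and the measurement step would be carried out by automated equipment or simulation.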

4. Conclusion and viewpoint

Integrating advanced methodologies, such as the green synthesis of nanomaterials and ML, offers transformative solutions for addressing critical challenges in catalysis and sustainable energy systems. Green synthesis provides an environmentally friendly and sustainable pathway to produce 2D nanomaterials with superior catalytic properties. At the same time, ML accelerates the discovery and optimization of catalytic materials by unraveling complex structure–property relationships and predicting performance with high efficiency. Conventional approaches often rely on labor-intensive trial-and-error experiments; by combining ML with experimental and theoretical methodologies, catalyst discovery can be aligned with sustainability goals, leading to advances in eco-friendly hydrogen production and renewable energy sources. The development of hybrid ML models and cyber-physical systems allows for efficient, scalable operations, minimizing reliance on traditional energy sources. Using ML in catalysis can reduce reliance on hazardous solvents and limit the waste of chemicals consumed during optimization. ML can also guide the preparation of previously reported catalysts so that they are synthesized under mild conditions, through low-energy reaction pathways with minimal byproducts, and with benign alternatives in place of toxic precursors, thereby lowering the environmental footprint of catalyst preparation and deployment. Integrating ML with DFT, high-throughput experimentation (HTE), and automation technologies enables fast, reproducible, and scalable dataset generation, which can support a closed-loop discovery pipeline in which models continuously update the catalyst design to meet real-world demands.

Finally, initiatives towards open-access datasets, standardised benchmarking, and cooperation between computational scientists, experimental chemists, and process engineers will be necessary to fully realise the potential of ML for sustainable catalysis. The field can advance beyond discovery acceleration to a more transformational role by integrating ML capabilities with green chemistry goals. This will allow for the rational design of catalysts that are safe, scalable, ecologically benign, efficient, and selective.

Conflicts of interest

There are no conflicts to declare.

Data availability

Data for this review article are present in the cited papers mentioned in the figures and text.

Acknowledgements

Manshu Dhillon and Aviru Kumar Basu would like to acknowledge the Quantum Materials and Devices unit of the Institute of Nano Science and Technology, Mohali, for providing the facilities to prepare this review article.

References

H. Safajou, et al., Green synthesis and characterization of RGO/Cu nanocomposites as photocatalytic degradation of organic pollutants in waste-water, Int. J. Hydrogen Energy, 2021, 46(39), 20534–20546, DOI:10.1016/j.ijhydene.2021.03.175.
P. Upadhyay, S. K. Prajapati and A. Kumar, Impacts of riverine pollution on greenhouse gas emissions: A comprehensive review, Ecol Indic, 2023, 154, 110649, DOI:10.1016/j.ecolind.2023.110649.
M. Arunachalapandi and S. Mohana Roopan, Environment Friendly g-C3N4-Based Catalysts and Their Recent Strategy in Organic Transformations, High Energy Chem., 2022, 56(2), 73–90, DOI:10.1134/s0018143922020102.
S. Wang, et al., Photo-/electro-/piezo-catalytic elimination of environmental pollutants, J. Photochem. Photobiol., A, 2023, 437, 114435, DOI:10.1016/j.jphotochem.2022.114435.
M. Dhillon, A. Naskar, N. Kaushal, S. Bhansali, A. Saha and A. K. Basu, A novel GO hoisted SnO₂–BiOBr bifunctional catalyst for the remediation of organic dyes under illumination by visible light and electrocatalytic water splitting, Nanoscale, 2024, 16(26), 12445–12458, DOI:10.1039/D4NR01154F.
R. Palani, et al., Imidazolatic-Framework Bimetal Electrocatalysts with a Mixed-Valence Surface Anchored on an rGO Matrix for Oxygen Reduction, Water Splitting, and Dye Degradation, ACS Omega, 2021, 6(24), 16029–16042, DOI:10.1021/acsomega.1c01870.
P. C. Nagajyothi, S. V. Prabhakar Vattikuti, K. C. Devarayapalli, K. Yoo, J. Shim and T. V. M. Sreekanth, Green synthesis: Photocatalytic degradation of textile dyes using metal and metal oxide nanoparticles-latest trends and advancements, Crit. Rev. Environ. Sci. Technol., 2020, 50(24), 2617–2723, DOI:10.1080/10643389.2019.1705103.
S. Singla, S. Sharma, S. Basu, N. P. Shetti and T. M. Aminabhavi, Photocatalytic water splitting hydrogen production via environmental benign carbon based nanomaterials, Int. J. Hydrogen Energy, 2021, 46(68), 33696–33717, DOI:10.1016/j.ijhydene.2021.07.187.
V. Chugh, A. Basu, A. Kaushik, N. Manshu, S. Bhansali and A. K. Basu, Employing nano-enabled artificial intelligence (AI)-based smart technologies for prediction, screening, and detection of cancer, Nanoscale, 2024, 16(11), 5458–5486, DOI:10.1039/D3NR05648A.
J. Fatima, et al., Tunable 2D Nanomaterials; Their Key Roles and Mechanisms in Water Purification and Monitoring, Front. Environ. Sci., 2022, 10, 766743, DOI:10.3389/fenvs.2022.766743.
S. Patnaik, D. P. Sahoo and K. Parida, An overview on Ag modified g-C3N4 based nanostructured materials for energy and environmental applications, Renewable Sustainable Energy Rev., 2018, 82, 1297–1312, DOI:10.1016/j.rser.2017.09.026.
H. T. Ren, S. Y. Jia, Y. Wu, S. H. Wu, T. H. Zhang and X. Han, Improved photochemical reactivities of Ag2O/g-C3N4 in phenol degradation under UV and visible light, Ind. Eng. Chem. Res., 2014, 53(45), 17645–17653, DOI:10.1021/ie503312x.
S. N. Steinmann, A. Hermawan, M. Bin Jassar and Z. W. Seh, Autonomous high-throughput computations in catalysis, Chem. Catal., 2022, 2(5), 940–956, DOI:10.1016/j.checat.2022.02.009.
Z. Fang, et al., The DFT and Machine Learning Method Accelerated the Discovery of DMSCs with High ORR and OER Catalytic Activities, J. Phys. Chem. Lett., 2024, 15(1), 281–289, DOI:10.1021/acs.jpclett.3c02938.
B. R. Goldsmith, J. Esterhuizen, J. X. Liu, C. J. Bartel and C. Sutton, Machine learning for heterogeneous catalyst design and discovery, AIChE J., 2018, 64(7), 2311–2323, DOI:10.1002/AIC.16198.
B. Lu, et al., When Machine Learning Meets 2D Materials: A Review, Adv. Sci., 2024, 11(13), 2305277, DOI:10.1002/ADVS.202305277.
L. I. Ugwu, Y. Morgan and H. Ibrahim, Application of density functional theory and machine learning in heterogenous-based catalytic reactions for hydrogen production, Int. J. Hydrogen Energy, 2022, 47(4), 2245–2267, DOI:10.1016/j.ijhydene.2021.10.208.
K. T. Winther, M. J. Hoffmann, J. R. Boes, O. Mamun, M. Bajdich and T. Bligaard, Catalysis-Hub.org, an open electronic structure database for surface reactions, Sci. Data, 2019, 6(1), 1–10, DOI:10.1038/s41597-019-0081-y.
J. Fujima, Y. Tanaka, I. Miyazato, L. Takahashi and K. Takahashi, Catalyst Acquisition by Data Science (CADS): a web-based catalyst informatics platform for discovering catalysts, React. Chem. Eng., 2020, 5(5), 903–911, DOI:10.1039/D0RE00098A.
L. Chanussot, et al., Open Catalyst 2020 (OC20) Dataset and Community Challenges, ACS Catal., 2021, 11(10), 6059–6072, DOI:10.1021/acscatal.0c04525.
T. Le, V. C. Epa, F. R. Burden and D. A. Winkler, Quantitative structure-property relationship modeling of diverse materials properties, Chem. Rev., 2012, 112(5), 2889–2919, DOI:10.1021/cr200066h.
H. Mai, T. C. Le, D. Chen, D. A. Winkler and R. A. Caruso, Machine Learning for Electrocatalyst and Photocatalyst Design and Discovery, Chem. Rev., 2022, 122(16), 13478–13515, DOI:10.1021/acs.chemrev.2c00061.
T. Li, T. Jing, D. Rao, S. Mourdikoudis, Y. Zuo and M. Wang, Two-dimensional materials for electrocatalysis and energy storage applications, Inorg. Chem. Front., 2022, 9(23), 6008–6046, DOI:10.1039/D2QI01911F.
S. Krishnan, et al., Facile green synthesis of ZnFe2O4/rGO nanohybrids and evaluation of its photocatalytic degradation of organic pollutant, photo antibacterial and cytotoxicity activities, Colloids Surf., A, 2021, 611, 125835, DOI:10.1016/j.colsurfa.2020.125835.
A. R. Malik, et al., Green synthesis of RGO-ZnO mediated Ocimum basilicum leaves extract nanocomposite for antioxidant, antibacterial, antidiabetic and photocatalytic activity, J. Saudi Chem. Soc., 2022, 26(2), 101438, DOI:10.1016/j.jscs.2022.101438.
D. K. Padhi, T. K. Panigrahi, K. Parida, S. K. Singh and P. M. Mishra, Green Synthesis of Fe3O4/RGO Nanocomposite with Enhanced Photocatalytic Performance for Cr(VI) Reduction, Phenol Degradation, and Antibacterial Activity, ACS Sustainable Chem. Eng., 2017, 5(11), 10551–10562, DOI:10.1021/acssuschemeng.7b02548.
A. K. Basu, A. Basak and S. Bhattacharya, Geometry and thickness dependant anomalous mechanical behavior of fabricated SU-8 thin film micro-cantilevers, J. Micromanuf., 2020, 3(2), 113–120, DOI:10.1177/2516598420930988.
A. K. Basu, A. N. Sah, A. Pradhan and S. Bhattacharya, Poly-L-Lysine functionalised MWCNT-rGO nanosheets based 3-d hybrid structure for femtomolar level cholesterol detection using cantilever based sensing platform, Sci. Rep., 2019, 9(1), 1–13, DOI:10.1038/s41598-019-40259-5.
S. Das, V. Chugh, C. Das and M. Bhattacharjee, Non-enzymatic Glucose Sensing Employing a Patterned Substrate Miniaturized Device-on-Mask, IEEE Sens Lett, 2023, 7(9), 1–4, DOI:10.1109/LSENS.2023.3307089.
M. Naguib, et al., Two-dimensional nanocrystals produced by exfoliation of Ti₃AlC₂, MXenes, 2023, 15–29, DOI:10.1201/9781003306511-4.
L. Verger, C. Xu, V. Natu, H. M. Cheng, W. Ren and M. W. Barsoum, Overview of the synthesis of MXenes and other ultrathin 2D transition metal carbides and nitrides, Curr. Opin. Solid State Mater. Sci., 2019, 23(3), 149–163, DOI:10.1016/j.cossms.2019.02.001.
M. Malaki, A. Maleki and R. S. Varma, MXenes and ultrasonication, J. Mater. Chem. A, 2019, 7(18), 10843–10857, DOI:10.1039/C9TA01850F.
F. Wang, et al., Cluster-Based Multifunctional Copper(II) Organic Framework as a Photocatalyst in the Degradation of Organic Dye and as an Electrocatalyst for Overall Water Splitting, Cryst. Growth Des., 2021, 21(7), 4242–4248, DOI:10.1021/acs.cgd.1c00479.
B. Anasori, M. R. Lukatskaya and Y. Gogotsi, 2D metal carbides and nitrides (MXenes) for energy storage, MXenes, 2023, 677–722, DOI:10.1201/9781003306511-35.
O. Salim, K. A. Mahmoud, K. K. Pant and R. K. Joshi, Introduction to MXenes: synthesis and characteristics, Mater. Today Chem., 2019, 14, 100191, DOI:10.1016/j.mtchem.2019.08.010.
T. B. Limbu, et al., Green synthesis of reduced Ti₃C₂Tₓ MXene nanosheets with enhanced conductivity, oxidation stability, and SERS activity, J. Mater. Chem. C, 2020, 8(14), 4722–4731, DOI:10.1039/c9tc06984d.
Y. Shen, et al., Zincophilic Ti₃C₂Cl₂ MXene and anti-corrosive Cu NPs for synergistically regulated deposition of dendrite-free Zn metal anode, J. Mater. Sci. Technol., 2024, 169, 137–147, DOI:10.1016/j.jmst.2023.06.017.
Y. Lv, K. Wang, D. Li, P. Li, X. Chen and W. Han, Rare Ag nanoparticles loading induced surface-enhanced pollutant adsorption and photocatalytic degradation on Ti3C2Tx MXene-based nanosheets, Chem. Phys., 2022, 560, 111591, DOI:10.1016/j.chemphys.2022.111591.
L. Cheng, Q. Chen, J. Li and H. Liu, Boosting the photocatalytic activity of CdLa2S4 for hydrogen production using Ti3C2 MXene as a co-catalyst, Appl. Catal., B, 2020, 267, 118379, DOI:10.1016/j.apcatb.2019.118379.
H. Wang, et al., Electrical promotion of spatially photoinduced charge separation via interfacial-built-in quasi-alloying effect in hierarchical Zn₂In₂S₅/Ti₃C₂(O, OH)x hybrids toward efficient photocatalytic hydrogen evolution and environmental remediation, Appl. Catal., B, 2019, 245, 290–301, DOI:10.1016/j.apcatb.2018.12.051.
V. Chugh, A. Basu, A. K. Kaushik and A. K. Basu, Progression in Quantum Sensing/Bio-Sensing Technologies for Healthcare, ECS Sens. Plus, 2023, 2(1), 2754–2726, DOI:10.1149/2754-2726/ACC190.
P. Kumar, et al., The rise of borophene, Prog. Mater. Sci., 2024, 146, 101331, DOI:10.1016/j.pmatsci.2024.101331.
S. N. Nangare, Z. G. Khan, A. G. Patil and P. O. Patil, Design of monoelemental based two dimensional nanoarchitectures for therapeutic, chemical sensing and in vitro diagnosis applications: A case of borophene, J. Mol. Struct., 2022, 1265, 133387, DOI:10.1016/j.molstruc.2022.133387.
Z. Xie, et al., Two-Dimensional Borophene: Properties, Fabrication, and Promising Applications, Research, 2020, 2020, 2624617, DOI:10.34133/2020/2624617.
A. Fujishima and K. Honda, Electrochemical Photolysis of Water at a Semiconductor Electrode, Nature, 1972, 238(5358), 37–38, DOI:10.1038/238037a0.
R. Saha, S. Mahapatra, A. Dalal, A. Mondal and S. Chakrabarti, Investigation of dual-wavelength selective self-powered photo response of ZnO/Si heterojunction with insertion of thin TiO₂ layer, Appl. Phys. A, 2025, 131(1), 1–12, DOI:10.1007/S00339-024-08155-6.
L. Shi, C. Ling, Y. Ouyang and J. Wang, High intrinsic catalytic activity of two-dimensional boron monolayers for the hydrogen evolution reaction, Nanoscale, 2017, 9(2), 533–537, DOI:10.1039/C6NR06621F.
S. H. Mir, et al., Two-dimensional boron: Lightest catalyst for hydrogen and oxygen evolution reaction, Appl. Phys. Lett., 2016, 109(5), 0003–6951, DOI:10.1063/1.4960102.
A. K. Singh, K. Mathew, H. L. Zhuang and R. G. Hennig, Computational screening of 2D materials for photocatalysis, J. Phys. Chem. Lett., 2015, 6(6), 1087–1098, DOI:10.1021/jz502646d.
Y. R. Do, W. Lee, K. Dwight and A. Wold, The Effect of WO₃ on the Photocatalytic Activity of TiO₂, J. Solid State Chem., 1994, 108(1), 198–201, DOI:10.1006/JSSC.1994.1031.
U. Gupta, B. G. Rao, U. Maitra, B. E. Prasad and C. N. R. Rao, Visible-Light-Induced Generation of H₂ by Nanocomposites of Few-Layer TiS₂ and TaS₂ with CdS Nanoparticles, Chem. Asian J., 2014, 9(5), 1311–1315, DOI:10.1002/asia.201301537.
H. L. Zhuang and R. G. Hennig, Theoretical perspective of photocatalytic properties of single-layer SnS₂, Phys. Rev. B: Condens. Matter Mater. Phys., 2013, 88(11), 115314, DOI:10.1103/physrevb.88.115314.
J. Yu, C. Y. Xu, F. X. Ma, S. P. Hu, Y. W. Zhang and L. Zhen, Monodisperse SnS₂ nanosheets for high-performance photocatalytic hydrogen generation, ACS Appl. Mater. Interfaces, 2014, 6(24), 22370–22377, DOI:10.1021/am506396z.
Y. Wang, et al., Wafer-scale synthesis of monolayer WSe₂: A multi-functional photocatalyst for efficient overall pure water splitting, Nano Energy, 2018, 51, 54–60, DOI:10.1016/j.nanoen.2018.06.047.
S. Luo, et al., Rational and green synthesis of novel two-dimensional WS₂/MoS₂ heterojunction via direct exfoliation in ethanol-water targeting advanced visible-light-responsive photocatalytic performance, J. Colloid Interface Sci., 2018, 513, 389–399, DOI:10.1016/j.jcis.2017.11.044.
R. Jha, S. Santra and P. K. Guha, Green synthesis route for WS₂ nanosheets using water intercalation, Mater. Res. Express, 2016, 3, 95014, DOI:10.1088/2053-1591/3/9/095014.
A. K. Mishra, K. V. Lakshmi and L. Huang, Eco-friendly synthesis of metal dichalcogenides nanosheets and their environmental remediation potential driven by visible light, Sci. Rep., 2015, 5(1), 1–8, DOI:10.1038/srep15718.
L. Yan, S. Zhong, T. Igou, H. Gao, J. Li and Y. Chen, Development of machine learning models to enhance element-doped g-C₃N₄ photocatalyst for hydrogen production through splitting water, Int. J. Hydrogen Energy, 2022, 47(80), 34075–34089, DOI:10.1016/j.ijhydene.2022.08.013.
T. Taniike and K. Takahashi, The value of negative results in data-driven catalysis research, Nat. Catal., 2023, 6(2), 108–111, DOI:10.1038/s41929-023-00920-9.
N. Sanosa, D. Dalmau, D. Sampedro, J. V. Alegre-Requena and I. Funes-Ardoiz, Recent advances of machine learning applications in the development of experimental homogeneous catalysis, Artif. Intell. Chem., 2024, 2(1), 100068, DOI:10.1016/j.aichem.2024.100068.
Y. Liu, T. Zhao, W. Ju and S. Shi, Materials discovery and design using machine learning, J. Materiomics, 2017, 3(3), 159–177, DOI:10.1016/j.jmat.2017.08.002.
T. N. Nguyen, et al., High-Throughput Experimentation and Catalyst Informatics for Oxidative Coupling of Methane, ACS Catal., 2020, 10(2), 921–932, DOI:10.1021/acscatal.9b04293.
K. Yanagiyama, K. Takimoto, S. Dinh Le, N. Nu Thanh Ton and T. Taniike, High-throughput experimentation for photocatalytic water purification in practical environments, Environ. Pollut., 2024, 342, 122974, DOI:10.1016/j.envpol.2023.122974.
L. Foppa, et al., Learning Design Rules for Selective Oxidation Catalysts from High-Throughput Experimentation and Artificial Intelligence, ACS Catal., 2022, 12(4), 2223–2232, DOI:10.1021/acscatal.1c04793.
N. S. Eyke, B. A. Koscher and K. F. Jensen, Toward Machine Learning-Enhanced High-Throughput Experimentation, Trends Chem., 2021, 3(2), 120–132, DOI:10.1016/j.trechm.2020.12.001.
A. J. Medford, M. R. Kunz, S. M. Ewing, T. Borders and R. Fushimi, Extracting Knowledge from Data through Catalysis Informatics, ACS Catal., 2018, 8(8), 7403–7429, DOI:10.1021/acscatal.8b01708.
K. Takahashi, et al., The Rise of Catalyst Informatics: Towards Catalyst Genomics, ChemCatChem, 2019, 11(4), 1146–1152, DOI:10.1002/cctc.201801956.
J. R. Kitchin, Machine learning in catalysis, Nat. Catal., 2018, 1(4), 230–232, DOI:10.1038/s41929-018-0056-y.
J. P. Janet and H. J. Kulik, Resolving Transition Metal Chemical Space: Feature Selection for Machine Learning and Structure-Property Relationships, J. Phys. Chem. A, 2017, 121(46), 8939–8954, DOI:10.1021/acs.jpca.7b08750.
P. G. Ghanekar, S. Deshpande and J. Greeley, Adsorbate chemical environment-based machine learning framework for heterogeneous catalysis, Nat. Commun., 2022, 13(1), 1–12, DOI:10.1038/s41467-022-33256-2.
L. Schleider, E. L. Pasiliao, Z. Qiang and Q. P. Zheng, A study of feature representation via neural network feature extraction and weighted distance for clustering, J. Comb. Optim., 2022, 44(4), 3083–3105, DOI:10.1007/s10878-022-00849-y.
A. K. Jain, Artificial Neural Networks for Feature Extraction and Multivariate Data Projection, IEEE Trans Neural Netw, 1995, 6(2), 296–317, DOI:10.1109/72.363467.
K. T. Schütt, H. Glawe, F. Brockherde, A. Sanna, K. R. Müller and E. K. U. Gross, How to represent crystal structures for machine learning: Towards fast prediction of electronic properties, Phys. Rev. B: Condens. Matter Mater. Phys., 2014, 89, 20, DOI:10.1103/physrevb.89.205118.
A. N. Korovin, et al., Boosting heterogeneous catalyst discovery by structurally constrained deep learning models, Mater. Today Chem., 2023, 30, 101541, DOI:10.1016/j.mtchem.2023.101541.
K. Hansen, et al., Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space, J. Phys. Chem. Lett., 2015, 6(12), 2326–2331, DOI:10.1021/acs.jpclett.5b00831.
T. Xie and J. C. Grossman, Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties, Phys. Rev. Lett., 2018, 120, 14, DOI:10.1103/physrevlett.120.145301.
L. H. Mou, T. T. Han, P. E. S. Smith, E. Sharman and J. Jiang, Machine Learning Descriptors for Data-Driven Catalysis Study, Adv. Sci., 2023, 10(22), 2301020, DOI:10.1002/ADVS.202301020.
L. Himanen, et al., DScribe: Library of descriptors for machine learning in materials science, Comput. Phys. Commun., 2020, 247, 106949, DOI:10.1016/j.cpc.2019.106949.
H. Huo and M. Rupp, Unified representation of molecules and crystals for machine learning, Mach. Learn Sci. Technol., 2022, 3(4), 045017, DOI:10.1088/2632-2153/ACA005.
J. Behler, Atom-centered symmetry functions for constructing high-dimensional neural network potentials, J. Chem. Phys., 2011, 134, 7, DOI:10.1063/1.3553717.
A. P. Bartók, R. Kondor and G. Csányi, On representing chemical environments, Phys. Rev. B: Condens. Matter Mater. Phys., 2013, 87(18), 184115, DOI:10.1103/PhysRevB.87.184115.
Z. Sadiq, W. Yang, M. M. Meraz, W. Yang and W. H. Sun, Catalytic Activity of 2-Imino-1,10-phenthrolyl Fe/Co Complexes via Linear Machine Learning, Molecules, 2024, 29(10), 2313, DOI:10.3390/MOLECULES29102313.
M. Tamtaji, et al., Machine learning for design principles for single atom catalysts towards electrochemical reactions, J. Mater. Chem. A, 2022, 10(29), 15309–15331, DOI:10.1039/D2TA02039D.
S. Singh and R. B. Sunoj, Molecular Machine Learning for Chemical Catalysis: Prospects and Challenges, Acc. Chem. Res., 2023, 56(3), 402–412, DOI:10.1021/acs.accounts.2c00801.
Q. Yuan, et al., Machine Learning-Assisted Catalysts for Advanced Oxidation Processes: Progress, Challenges, and Prospects, Catalysts, 2025, 15(3), 282, DOI:10.3390/CATAL15030282.
M. Suvarna, P. Preikschas and J. Perez-Ramirez, Identifying Descriptors for Promoted Rhodium-Based Catalysts for Higher Alcohol Synthesis via Machine Learning, ACS Catal., 2022, 12(24), 15373–15385, DOI:10.1021/acscatal.2c04349.
G. Lo Dico, Á. P. Nuñez, V. Carcelén and M. Haranczyk, Machine-learning-accelerated multimodal characterization and multiobjective design optimization of natural porous materials, Chem. Sci., 2021, 12(27), 9309–9317, DOI:10.1039/D1SC00816A.
M. Delpisheh, et al., Leveraging machine learning in porous media, J. Mater. Chem. A, 2024, 12(32), 20717–20782, DOI:10.1039/D4TA00251B.
H. Mai, T. C. Le, D. Chen, D. A. Winkler and R. A. Caruso, Machine Learning in the Development of Adsorbents for Clean Energy Application and Greenhouse Gas Capture, Adv. Sci., 2022, 9(36), 2203899, DOI:10.1002/ADVS.202203899.
K. Mukherjee and Y. J. Colón, Machine learning and descriptor selection for the computational discovery of metal-organic frameworks, Mol. Simul., 2021, 47(10–11), 857–877, DOI:10.1080/08927022.2021.1916014.
J. Guo, et al., Rational Design of Earth-Abundant Catalysts toward Sustainability, Adv. Mater., 2024, 36(42), 2407102, DOI:10.1002/ADMA.202407102.
S. Wang and J. Jiang, Interpretable Catalysis Models Using Machine Learning with Spectroscopic Descriptors, ACS Catal., 2023, 13(11), 7428–7436, DOI:10.1021/acscatal.3c00611.
Y. Guan, et al., Machine learning in solid heterogeneous catalysis: Recent developments, challenges and perspectives, Chem. Eng. Sci., 2022, 248, 117224, DOI:10.1016/j.ces.2021.117224.
I. Takigawa, K. I. Shimizu, K. Tsuda and S. Takakusagi, Machine learning predictions of factors affecting the activity of heterogeneous metal catalysts, Nanoinformatics, 2018, 45–64, DOI:10.1007/978-981-10-7617-6_3.
O. Isayev, C. Oses, C. Toher, E. Gossett, S. Curtarolo and A. Tropsha, Universal fragment descriptors for predicting properties of inorganic crystals, Nat. Commun., 2017, 8(1), 1–12, DOI:10.1038/ncomms15679.
O. Isayev, C. Oses, C. Toher, E. Gossett, S. Curtarolo and A. Tropsha, Universal fragment descriptors for predicting properties of inorganic crystals, Nat. Commun., 2017, 8, 15679, DOI:10.1038/NCOMMS15679.
R. Zhang, F. Rong, G. Lai, G. Wu, Y. Ye and J. Zheng, Machine learning descriptors for crystal materials: applications in Ni-rich layered cathode and lithium anode materials for high-energy-density lithium batteries, J. Mater. Inf., 2024, 4(4), DOI:10.20517/JMI.2024.22.
S. Wang and J. Jiang, Interpretable Catalysis Models Using Machine Learning with Spectroscopic Descriptors, ACS Catal., 2023, 13(11), 7428–7436, DOI:10.1021/acscatal.3c00611.
J. Timoshenko, et al., Deciphering the Structural and Chemical Transformations of Oxide Catalysts during Oxygen Evolution Reaction Using Quick X-ray Absorption Spectroscopy and Machine Learning, J. Am. Chem. Soc., 2023, 145(7), 4065–4080, DOI:10.1021/jacs.2c11824.
D. H. Park, J. H. Yang, A. Vinu, A. Elzatahry and J. H. Choy, X-ray diffraction and X-ray absorption spectroscopic analyses for intercalative nanohybrids with low crystallinity, Arab. J. Chem., 2016, 9(2), 190–205, DOI:10.1016/j.arabjc.2015.07.007.
A. Iglesias-Juez, G. L. Chiarello, G. S. Patience and M. O. Guerrero-Pérez, Experimental methods in chemical engineering: X-ray absorption spectroscopy—XAS, XANES, EXAFS, Can. J. Chem. Eng., 2022, 100(1), 3–22, DOI:10.1002/CJCE.24291.
L. Ge, et al., Predicted Optimal Bifunctional Electrocatalysts for the Hydrogen Evolution Reaction and the Oxygen Evolution Reaction Using Chalcogenide Heterostructures Based on Machine Learning Analysis of in Silico Quantum Mechanics Based High Throughput Screening, J. Phys. Chem. Lett., 2020, 11(3), 869–876, DOI:10.1021/acs.jpclett.9b03875.
V. Fung, G. Hu, Z. Wu and D. E. Jiang, Descriptors for Hydrogen Evolution on Single Atom Catalysts in Nitrogen-Doped Graphene, J. Phys. Chem. C, 2020, 124(36), 19571–19578, DOI:10.1021/acs.jpcc.0c04432.
T. T. Nguyen, J. Z. Huang and T. T. Nguyen, Unbiased Feature Selection in Learning Random Forests for High-Dimensional Data, Sci. World J., 2015, 2015(1), 471371, DOI:10.1155/2015/471371.
S. Pablo-García, R. García-Muelas, A. Sabadell-Rendón and N. López, Dimensionality reduction of complex reaction networks in heterogeneous catalysis: From linear-scaling relationships to statistical learning techniques, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2021, 11(6), e1540, DOI:10.1002/WCMS.1540.
A. J. Chowdhury, W. Yang, E. Walker, O. Mamun, A. Heyden and G. A. Terejanu, Prediction of Adsorption Energies for Chemical Species on Metal Catalyst Surfaces Using Machine Learning, J. Phys. Chem. C, 2018, 122(49), 28142–28150, DOI:10.1021/acs.jpcc.8b09284.
Y. Çakır and M. Uzunca, Kernel Principal Component Analysis for Allen–Cahn Equations, Mathematics, 2024, 12(21), 3434, DOI:10.3390/MATH12213434.
X. Wang and P. Wu, Nonlinear Dynamic Process Monitoring Based on Ensemble Kernel Canonical Variate Analysis and Bayesian Inference, ACS Omega, 2022, 7(22), 18904–18921, DOI:10.1021/acsomega.2c01892.
A. Errachdi, S. Slama and M. Benrejeb, On the combination of kernel principal component analysis and neural networks for process indirect control, Math. Comput. Model. Dyn. Syst., 2020, 26(2), 144–168, DOI:10.1080/13873954.2019.1710715.
Introduction to t-SNE: Nonlinear Dimensionality Reduction and Data Visualization, DataCamp. Accessed: 27, 2025. [Online]. Available: https://www.datacamp.com/tutorial/introduction-t-sne.
A. F. Usuga, C. S. Praveen and A. Comas-Vives, Local descriptors-based machine learning model refined by cluster analysis for accurately predicting adsorption energies on bimetallic alloys, J. Mater. Chem. A, 2024, 12(5), 2708–2721, DOI:10.1039/D3TA06316J.
F. Trozzi, X. Wang and P. Tao, UMAP as a Dimensionality Reduction Tool for Molecular Dynamics Simulations of Biomacromolecules: A Comparison Study, J. Phys. Chem. B, 2021, 125(19), 5022–5034, DOI:10.1021/acs.jpcb.1c02081.
Q. Zhang, Y. Liu and H. Fang, Manifold learning-based UMAP method for geochemical anomaly identification, Geochemistry, 2024, 84(4), 126157, DOI:10.1016/j.chemer.2024.126157.
J. Wu, X. Y. Chen, H. Zhang, L. D. Xiong, H. Lei and S. H. Deng, Hyperparameter Optimization for Machine Learning Models Based on Bayesian Optimization, J. Electron. Sci. Technol., 2019, 17(1), 26–40, DOI:10.11989/jest.1674-862x.80904120.
L. G. de Araujo, L. Vilcocq, P. Fongarland and Y. Schuurman, Recent developments in the use of machine learning in catalysis: A broad perspective with applications in kinetics, Chem. Eng. J., 2025, 508, 160872, DOI:10.1016/j.cej.2025.160872.
K. T. Butler, D. W. Davies, H. Cartwright, O. Isayev and A. Walsh, Machine learning for molecular and materials science, Nature, 2018, 559(7715), 547–555, DOI:10.1038/s41586-018-0337-2.
N. Hao, L. Flagg, R. Jayawardhana, A. Jalili and A.-X. Chen, Prediction of ground state charge radius using support vector regression, New J. Phys., 2024, 26(10), 103017, DOI:10.1088/1367-2630/AD850E.
H. I. Akyildiz, E. Yigit, A. B. Arat and S. Islam, A machine learning approach for the estimation of photocatalytic activity of ALD ZnO thin films on fabric substrates, J. Photochem. Photobiol., A, 2024, 448, 115308, DOI:10.1016/j.jphotochem.2023.115308.
J. Wu, J. Zhang, G. Qian and T. Y. Zhang, The design and discovery of catalysts for simultaneous catalysis of chlorobenzene and nitrogen oxides via domain knowledge guided machine learning, Appl. Catal., A, 2023, 668, 119487, DOI:10.1016/j.apcata.2023.119487.
X. Zhai and M. Chen, A machine learning-based nano-photocatalyst module for accelerating the design of Bi2WO6/MIL-53(Al) nanocomposites with enhanced photocatalytic activity, Nanoscale Adv., 2023, 5(16), 4065–4073, DOI:10.1039/D3NA00122A.
Z. Yang, K. Song, X. Gu, Z. Wang and X. Liang, Predictor and optimizer system on selective catalytic reduction of NO in activated carbons based on experiment and computational intelligence technique, Eng. Comput., 2020, 37(5), 1737–1756, DOI:10.1108/ec-05-2019-0235.
M. Tamtaji, S. Chen, Z. Hu, W. A. Goddard and G. H. Chen, A Surrogate Machine Learning Model for the Design of Single-Atom Catalyst on Carbon and Porphyrin Supports towards Electrochemistry, J. Phys. Chem. C, 2023, 127(21), 9992–10000, DOI:10.1021/acs.jpcc.3c00765.
A. Baghban, S. Habibzadeh and F. Zokaee Ashtiani, Bandgaps of noble and transition metal/ZIF-8 electro/catalysts: a computational study, RSC Adv., 2020, 10(39), 22929–22938, DOI:10.1039/D0RA02943B.
X. Sun, et al., Machine-learning-accelerated screening of hydrogen evolution catalysts in MBenes materials, Appl. Surf. Sci., 2020, 526, 146522, DOI:10.1016/j.apsusc.2020.146522.
A. A. Khan, O. Chaudhari and R. Chandra, A review of ensemble learning and data augmentation models for class imbalanced problems: Combination, implementation and evaluation, Expert Syst. Appl., 2024, 244, 122778, DOI:10.1016/j.eswa.2023.122778.
Y. Yang, Ensemble Learning, Temporal Data Mining Via Unsupervised Ensemble Learning, 2017, pp. 35–56, DOI:10.1016/B978-0-12-811654-8.00004-X.
J. Zheng, et al., High-Throughput Screening of Hydrogen Evolution Reaction Catalysts in MXene Materials, J. Phys. Chem. C, 2020, 124(25), 13695–13705, DOI:10.1021/acs.jpcc.0c02265.
S. Lin, H. Xu, Y. Wang, X. C. Zeng and Z. Chen, Directly predicting limiting potentials from easily obtainable physical properties of graphene-supported single-atom electrocatalysts by machine learning, J. Mater. Chem. A, 2020, 8(11), 5663–5670, DOI:10.1039/C9TA13404B.
X. Wang, et al., Accelerating 2D MXene catalyst discovery for the hydrogen evolution reaction by computer-driven workflow and an ensemble learning strategy, J. Mater. Chem. A, 2020, 8(44), 23488–23497, DOI:10.1039/D0TA06583H.
X. Y. Zhu, et al., Prediction of Multicomponent Reaction Yields Using Machine Learning, Chin. J. Chem., 2021, 39(12), 3231–3237, DOI:10.1002/CJOC.202100434.
M. G. M. Abdolrasol, et al., Artificial Neural Networks Based Optimization Techniques: A Review, Electronics, 2021, 10(21), 2689, DOI:10.3390/electronics10212689.
S. Xu, et al., Developing new electrocatalysts for oxygen evolution reaction via high throughput experiments and artificial intelligence, npj Comput. Mater., 2024, 10(1), 1–8, DOI:10.1038/s41524-024-01386-4.
S. Wang, P. Mo, D. Li and A. Syed, Intelligent Algorithms Enable Photocatalyst Design and Performance Prediction, Catalysts, 2024, 14(4), 217, DOI:10.3390/CATAL14040217.
L. R. Oviedo, D. M. Druzian, L. D. D. Nora and W. L. da Silva, Study of machine learning on the photocatalytic activity of a novel nanozeolite for the application in the Rhodamine B dye degradation, Catal. Today, 2025, 443, 114986, DOI:10.1016/j.cattod.2024.114986.
X. Liu, et al., Identifying the Activity Origin of a Cobalt Single-Atom Catalyst for Hydrogen Evolution Using Supervised Learning, Adv. Funct. Mater., 2021, 31(18), 2100547, DOI:10.1002/ADFM.202100547.
S. Razavi, Deep learning, explained: Fundamentals, explainability, and bridgeability to process-based modelling, Environ. Model. Software, 2021, 144, 105159, DOI:10.1016/j.envsoft.2021.105159.
M. Zafari, D. Kumar, M. Umer and K. S. Kim, Machine learning-based high throughput screening for nitrogen fixation on boron-doped single atom catalysts, J. Mater. Chem. A, 2020, 8(10), 5209–5216, DOI:10.1039/C9TA12608B.
K. Chen, et al., Pt nanoparticles stabilized within MOF derivative on inverse opal ZnO for acetone prediction, Sens. Actuators, B, 2023, 396, 134570, DOI:10.1016/j.snb.2023.134570.
Q. Yang, R. Xu, P. Wu, J. He, C. Liu and W. Jiang, Three-step treatment of real complex, variable high-COD rolling wastewater by rational adjustment of acidification, adsorption, and photocatalysis using big data analysis, Sep. Purif. Technol., 2021, 270, 118865, DOI:10.1016/j.seppur.2021.118865.
Z. Jiang, J. Hu, A. Samia and X. Yu, Predicting Active Sites in Photocatalytic Degradation Process Using an Interpretable Molecular-Image Combined Convolutional Neural Network, Catalysts, 2022, 12(7), 746, DOI:10.3390/CATAL12070746.
S. Back, J. Yoon, N. Tian, W. Zhong, K. Tran and Z. W. Ulissi, Convolutional Neural Network of Atomic Surface Structures to Predict Binding Energies for High-Throughput Screening of Catalysts, J. Phys. Chem. Lett., 2019, 10(15), 4401–4408, DOI:10.1021/acs.jpclett.9b01428.
Z. Wang, et al., Deep learning for ultra-fast and high precision screening of energy materials, Energy Storage Mater., 2021, 39, 45–53, DOI:10.1016/j.ensm.2021.04.006.
S. H. Wang, H. S. Pillai, S. Wang, L. E. K. Achenie and H. Xin, Infusing theory into deep learning for interpretable reactivity prediction, Nat. Commun., 2021, 12(1), 1–9, DOI:10.1038/s41467-021-25639-8.
B. Dou, et al., Machine Learning Methods for Small Data Challenges in Molecular Science, Chem. Rev., 2023, 123(13), 8736, DOI:10.1021/acs.chemrev.3c00189.
K. Ding, et al., Machine learning-guided co-optimization of fitness and diversity facilitates combinatorial library design in enzyme engineering, Nat. Commun., 2024, 15(1), 1–13, DOI:10.1038/s41467-024-50698-y.
R. Tachibana, K. Zhang, Z. Zou, S. Burgener and T. R. Ward, A Customized Bayesian Algorithm to Optimize Enzyme-Catalyzed Reactions, ACS Sustainable Chem. Eng., 2023, 11(33), 12336–12344, DOI:10.1021/acssuschemeng.3c02402.
M. C. Ramos, S. S. Michtavy, M. D. Porosoff and A. D. White, Bayesian Optimization of Catalysis With In-Context Learning, 2025, Accessed: Aug. 28, 2025. [Online]. Available: https://arxiv.org/pdf/2304.05341v2.
Y. Xu, et al., AI-Empowered Catalyst Discovery: A Survey from Classical Machine Learning Approaches to Large Language Models, Proc. ACM Comput. Surveys, 2025, 1, DOI:10.48550/arXiv.2502.13626.
A. Kolluru, et al., Open Challenges in Developing Generalizable Large-Scale Machine-Learning Models for Catalyst Discovery, ACS Catal., 2022, 12(14), 8572–8581, DOI:10.1021/acscatal.2c02291.
A. N. Korovin, et al., Boosting heterogeneous catalyst discovery by structurally constrained deep learning models, Mater. Today Chem., 2023, 30, 101541, DOI:10.1016/j.mtchem.2023.101541.
A. Merchant, S. Batzner, S. S. Schoenholz, M. Aykol, G. Cheon and E. D. Cubuk, Scaling deep learning for materials discovery, Nature, 2023, 624(7990), 80–85, DOI:10.1038/S41586-023-06735-9.
H. Liu, K. Liu, H. Zhu, W. Guo and Y. Li, Explainable machine-learning predictions for catalysts in CO₂-assisted propane oxidative dehydrogenation, RSC Adv., 2024, 14(11), 7276–7282, DOI:10.1039/D4RA00406J.
J. Xu, X. Zhang and Y. Li, Kernel MSE algorithm: A unified framework for KFD, LS-SVM and KRR, Proc. Int. Jt. Conf. Neural Netw., 2001, 2, 1486–1491, DOI:10.1109/ijcnn.2001.939584.
B. Medasani, et al., Predicting defect behavior in B2 intermetallics by merging ab initio modeling and machine learning, npj Comput. Mater., 2016, 2(1), 1–10, DOI:10.1038/s41524-016-0001-z.
X. Cheng, C. Wu, J. Xu, Y. Han, W. Xie and P. Hu, Leveraging Machine Learning Potentials for In-Situ Searching of Active sites in Heterogeneous Catalysis, Prec. Chem., 2024, 2, 570–586, DOI:10.1021/prechem.4c00051.
X. Ge, et al., Atomic Design of Alkyne Semihydrogenation Catalysts via Active Learning, J. Am. Chem. Soc., 2024, 146(7), 4993–5004, DOI:10.1021/jacs.3c14495.
W. Melitz, J. Shen, A. C. Kummel and S. Lee, Kelvin probe force microscopy and its application, Surf. Sci. Rep., 2011, 66(1), 1–27, DOI:10.1016/j.surfrep.2010.10.001.
T. Toyao, Z. Maeno, S. Takakusagi, T. Kamachi, I. Takigawa and K. I. Shimizu, Machine Learning for Catalysis Informatics: Recent Applications and Prospects, ACS Catal., 2020, 10(3), 2260–2297, DOI:10.1021/acscatal.9b04186.
