Open Access Article
Jenny G. Vitillo,*a Alán Aspuru-Guzik,bcdefghi Eric Doskocil,j Omar K. Farha,kl Timur Islamoglu,m Heather J. Kulik,n Peter M. Margl,o Stuart Miller,m Jordan Reddel,o Aayush R. Singh p and Varinia Bernales*bcdm
aDepartment of Science and High Technology and INSTM, Università Degli Studi Dell’Insubria, Via Valleggio 9, Como 22100, Italy
bDepartment of Chemistry, University of Toronto, 80 St. George St., Toronto, ON M5S 3H6, Canada
cAcceleration Consortium, 700 University Ave., Toronto, ON M7A 2S4, Canada
dDepartment of Computer Science, University of Toronto, 40 St George St., Toronto, ON M5S 2E4, Canada
eVector Institute for Artificial Intelligence, W1140-108 College St., Schwartz Reisman Innovation Campus, Toronto, ON M5G 0C6, Canada
fDepartment of Materials Science & Engineering, University of Toronto, 184 College St., Toronto, ON M5S 3E4, Canada
gDepartment of Chemical Engineering & Applied Chemistry, University of Toronto, 200 College St., Toronto, ON M5S 3E5, Canada
hCanadian Institute for Advanced Research (CIFAR), 661 University Ave., Toronto, ON M5G 1M1, Canada
iNVIDIA, 431 King St W #6th, M5V 1K4, Toronto, Canada
jApplied Sciences & Technology, BP, 30 S Wacker Drive, Suite 900, Chicago 60606, IL, USA
kDepartment of Chemistry and International Institute for Nanotechnology, Northwestern University, Evanston 60208, IL, USA
lPaula M. Trienens Institute for Sustainability and Energy, Northwestern University, Evanston 60208, IL, USA
mMaterials Discovery Research Institute, UL Research Institutes, Skokie 60077, Illinois, USA
nDepartments of Chemical Engineering and Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
oDow, Inc., Core R&D, 1776 Building, Midland 48667, Michigan, USA
pSandboxAQ, Palo Alto, 94301, CA, USA
First published on 28th January 2026
The growing demand for energy-efficient processes to support a sustainable future drives the need for research that can rapidly explore chemical and material space through accelerated catalyst discovery initiatives. Recent breakthroughs in high-throughput experimental and computational methods are transforming the catalysis field, surpassing traditional approaches to manipulating variables in catalytic processes. Key advances include the integration of machine learning for efficient catalyst screening, high-throughput experimentation, data-driven methodologies employing comprehensive databases, and in situ and in operando techniques for realistic observations. This progress has been intertwined with a collaborative framework across disciplines, reshaping catalyst discovery methods in both industry and academia. This Opinion article presents a multifaceted perspective from coauthors with expertise spanning various stages of the Technology Readiness Level spectrum, highlighting both opportunities and persistent challenges in integrating computational and experimental approaches in catalysis. These challenges range from obtaining high-quality experimental data and scaling simulations to industrially relevant materials and process conditions, to navigating the complexity and predictive accuracy of computational models.
Building on the significance of HTE, it is essential to clarify key terminology, as some terms are often used interchangeably despite their nuanced differences. High-throughput synthesis (HTS) refers to systematically exploring chemical space to produce desired materials, utilizing tools that enable parallel reaction setups or simultaneous characterization processes.7 On the other hand, “accelerated materials discovery,” which includes the expanding field of self-driving laboratories,8 builds upon HTS by incorporating advanced data analysis techniques, ranging from simple methods such as Design of Experiments to modern approaches such as Bayesian optimization, to create a feedback loop that optimizes and streamlines the synthesis and characterization processes for targeted compounds.9–11 This distinction underscores the progression of HTE from a tool for rapid experimentation to an integrated framework driving innovation in materials science.
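The feedback loop described above can be sketched in a few lines of code. The toy example below is illustrative only: `simulate_yield` is a hypothetical stand-in for an automated synthesis-and-test platform, and a simple explore/exploit rule stands in for a true Bayesian-optimization surrogate, which a real self-driving laboratory would build with a dedicated library.

```python
import random

def simulate_yield(temp_c: float) -> float:
    """Hypothetical stand-in for one automated synthesis-and-test run:
    returns a noisy 'yield' (%) peaking near 180 degC."""
    return max(0.0, 100.0 - 0.02 * (temp_c - 180.0) ** 2 + random.gauss(0, 1.0))

def closed_loop(n_iter: int = 30, lo: float = 100.0, hi: float = 300.0):
    """Toy explore/exploit feedback loop: each 'experiment' updates the
    record of observations, which in turn proposes the next condition."""
    random.seed(0)
    observations = []  # list of (condition, measured yield)
    for i in range(n_iter):
        if i < 5 or random.random() < 0.2:
            # explore: sample a random condition across the full window
            temp = random.uniform(lo, hi)
        else:
            # exploit: perturb the best condition found so far
            best_t, _ = max(observations, key=lambda o: o[1])
            temp = min(hi, max(lo, best_t + random.gauss(0, 10.0)))
        observations.append((temp, simulate_yield(temp)))
    return max(observations, key=lambda o: o[1])

best_temp, best_yield = closed_loop()
print(f"best condition: {best_temp:.0f} degC, yield {best_yield:.1f}%")
```

The key design point is the loop structure itself: every measurement feeds back into the proposal of the next condition, which is what distinguishes accelerated discovery from open-loop parallel screening.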
Although HT techniques are becoming indispensable for remaining competitive in academia and industry, significant barriers prevent both high-throughput experimentation (HTE) and computation (HTC) in catalysis from reaching their full potential. Key areas for improvement include (i) enhancing data management, standardization, and integration, (ii) accelerating characterization processes, and (iii) fostering transparent and efficient interdisciplinary collaboration, each requiring ongoing focus and innovation. The large volumes of data produced by HT methods in short periods must be analyzed to extract useful information; without automation, this entails a substantial manual workload, diminishing the advantages of high-throughput methods and hindering their broader adoption.1,12
Lately, most solutions for automating the handling of these large datasets have come from artificial intelligence (AI) methods, which are particularly well suited to this task.1,12,13 Collaboration between chemists, materials scientists, engineers, and data scientists is therefore essential to leverage the full potential of these technologies. It goes without saying that maintaining depth in subject-matter expertise is essential for success in this field, as in any scientific endeavor. A frequent bottleneck in HTE is the inherent timescale mismatch between characterization and other critical processes, such as formulation, reaction, and synthesis. In many cases, HT characterization becomes the primary limiting factor in accelerating discovery workflows. In catalysis—the focus of this opinion piece—characterizing key catalytic intermediates is particularly important. However, current experimental methods are still not amenable to HT approaches. To overcome this limitation, it is important to complement experimental measurements with accurate and inexpensive predictions from first principles. Several HTC workflows have been designed to meet these needs,14–17 allowing the parallel acquisition of mechanistic data, including intermediate identification. Combined with theoretical predictions, such workflows help bridge the gap between detailed mechanistic understanding and scalable discovery. This convergence of HTE and computational chemistry, driven by advances in robotics and AI, has been further augmented with deep learning, generative models, and machine learning potentials to help address data gaps.18 The topic has been the subject of several recent reviews and perspectives,19–25 along with successful examples of predictions,26 demonstrating the potential of data science in catalysis to optimize reaction conditions and improve yields.27,28
This Opinion article condenses the ideas discussed at a symposium of the same title organized by the Catalysis Division at ACS Fall 2024, where the authors, experts in the field from academia and industry, participated in a panel discussion on this topic.29 The aim of this Opinion article is to present the topics explored during that event, covering the current state of high-throughput catalysis, highlighting recent breakthroughs, and discussing the opportunities and challenges in this rapidly evolving field. We specifically focus on three interconnected aspects critical to accelerating catalytic advancements: (I) high-throughput characterization for balancing data precision with experimental efficiency (Section 3); (II) standardization of data and metadata to improve data sharing and reproducibility (Section 4); and (III) integrated workflows connecting experimental and computational systems for autonomous discovery (Section 5). By concentrating on these key challenges and opportunities, this work aims to provide actionable recommendations for the catalysis community.
In this paper, catalytic discovery refers to the identification of new catalysts or reaction systems enabled by data-driven and high-throughput methods, whereas catalytic advancement encompasses the broader acceleration of catalyst understanding, optimization, and deployment. It is important to note that homogeneous (molecular) and heterogeneous (solid-based) catalysis differ substantially in synthesis control, reaction environment, and characterization complexity. While the data-driven and high-throughput strategies discussed here are broadly applicable, their implementation requires domain-specific adaptation to account for these intrinsic differences.
Most of the patents retrieved are dated from 2010 onwards (see Fig. 1a).31 This indicates that HT methods transitioned from niche approaches to widely adopted techniques only about 14 years ago, driven by the automation of the process. A key factor in this transition has been the development of robotic platforms capable of parallel synthesis and screening. Coupled with flow chemistry, these platforms enable wider process windows and help overcome challenges associated with traditional batch-wise screening, further accelerating high-throughput experimentation.32,33
Fig. 1 Patents extracted from the Google Patents website31 using the keywords (high throughput AND catalysis). (a) Histogram showing the distribution of the selected patents published between 1935 and 2024, grouped in 5-year intervals. Each bar represents the number of patents issued within the respective 5-year period. (b) Word cloud showing the fields of application of HT methods (word counts > 190).
Today, HT experiments and computational methods are widely adopted in the research and development headquarters of multinational chemical, pharmaceutical, and tech companies.2,29,34–36 HT use has significantly contributed to chemical process development, often leading to breakthrough patents in various chemistries. This demonstrates the role of HT methods as powerful drivers of innovation.37 Suzuki-Miyaura cross-coupling,26,38–40 hydrogen production,41 and olefin polymerization42 are only a few examples of the various applications of these methods (see Fig. 1b).
In academia, the high cost of fully automated HTE systems (>$200k USD)43 remains a significant barrier to broader adoption, particularly as access to such equipment typically depends on major funding initiatives. Unlike more flexible types of scientific instrumentation that can be readily shared across research groups and disciplines, HTE platforms tend to be inherently more specialized, which can limit their cross-disciplinary utility and reduce the feasibility of shared investment. Nowadays, HTC screening goes hand in hand with AI and, in particular, with machine learning (ML) methods. The rise of autonomous HTC is generally recognized as a critical enabler of the rapid exploration of catalytic materials, integrating atomic-scale simulations with machine learning to minimize human intervention.12 Such approaches allow the exploration of vast chemical spaces in silico, which is particularly valuable for challenging systems such as transition-metal catalysts.13 Additionally, ML methods can elucidate hidden trends and periodicities within data, revealing the importance of parameters not previously considered or complex dependencies among variables, and thereby enhancing the ability to predict molecular behaviors.44 A comprehensive consideration of catalyst properties and of the reaction conditions that influence reactivity, selectivity, and stability is not straightforward. Automation has become a ubiquitous solution to this problem.44 A recent example presented at the symposium is Microsoft's AutoRXN workflow.15 This automated framework removes human bias when analyzing large catalyst libraries created by in silico screening, allowing the examination of mechanistic pathways and structural modifications of catalysts. This approach was successfully applied to homogeneous catalysis for the asymmetric reduction of ketones.
The application of ML methods to uncover hidden structure–property relationships has enabled the concurrent optimization of core performance metrics such as activity and selectivity—hallmarks of traditional catalytic research.3 Beyond these, ML has also opened the door to rigorously interrogating and optimizing critical yet frequently neglected properties, such as material stability, thereby bridging a longstanding gap in computational studies.13,45 On the other hand, data-driven approaches also have the potential to uncover novel chemical behaviors, such as spin-state switching and redox activity,13 which are challenging to predict using traditional computational methods, especially for transition-metal reactivity.46 HTC screening using density functional theory (DFT) has traditionally been limited to models containing fewer than 200 atoms.34,47,48 This is a major limitation, since second and higher coordination shells around the active site often contribute significantly to catalyst reactivity.49 However, these constraints are gradually being overcome thanks to innovations such as GPU-accelerated DFT, e.g., with TeraChem, a computational chemistry code that replaces traditional CPU architecture with GPU-based calculations.50 This advancement enables a twenty-fold increase in computational speed compared to CPU-based programs, allowing systems of thousands of atoms to be analyzed within hours.34
While the above-mentioned advancements in HTC are highly encouraging and have definitively earned their place alongside HTE, key challenges remain. Real-life catalytic systems, especially those deployed in industrial environments, are often the product not only of decades of work to optimally develop a catalyst in relative isolation but also of painstaking formulation, scaling, and tuning of the catalytic process. For instance, in heterogeneous catalysis, carefully formulated, highly disordered systems are often key to sustained catalytic performance. These systems are difficult to model with current DFT methods because they require large length scales and long simulation times. Here, continued development of simulation methods and of machine learning approaches that augment traditional physical models will be required to bring HTC on par with HTE. In homogeneous catalysis, similar challenges pertain, for instance, to reactions involving ion pairs, systems with significant conformational flexibility, and systems where the solvent actively participates in catalysis, especially in combination with complex electronic structures. Nonetheless, strategies that leverage qualitative trends, such as energy descriptors, for systems with complex electronic structures have successfully guided the design of new iron-based molecular catalysts for methane oxidation.13 For solid-state simulations in heterogeneous catalysis, the choice of electronic structure method is often dictated by a trade-off between accuracy and computational feasibility, particularly in high-throughput or large-scale screening studies. While local DFT functionals remain widely used due to their favorable scaling and robustness, they are increasingly complemented by more advanced approaches.
State-of-the-art modeling now frequently incorporates dispersion-corrected functionals, meta-GGA formulations, hybrid functionals, or multilevel and embedding strategies to improve accuracy and transferability.51 Hybrid functionals and beyond-DFT methods can provide significant improvements for systems involving localized electronic states, such as transition metal oxides, non-zero oxidation states, or open-shell configurations, albeit at a substantially higher computational cost.52 Conversely, for metallic or gapless systems, the benefits of hybrid functionals are often limited, and carefully validated local approaches may still yield reliable trends. In all cases, systematic benchmarking and validation against experimental data or higher-level calculations on representative models are essential to ensure predictive reliability.52
The coupling of AI tools and automation for data analysis is also becoming increasingly common in HTE, with the development of mobile robotic chemists and self-driving laboratories (SDLs) bringing significant advantages in process efficiency and decision-making (see Section 5).12,53,54 The high investment required for HTE equipment remains an important discussion point because of its scientific and social repercussions, as it limits opportunities for innovation and development. Integrating these advanced technologies with sustainable practices and cost-effective solutions remains a critical goal. This has led to the development of “frugal twins”, i.e., low-cost, functionally simplified versions of sophisticated SDLs.55 Built using off-the-shelf components and open-source software, these systems require a capital investment up to three orders of magnitude lower than their high-end counterparts. Originally conceived to address the accessibility gap in HTE technology, frugal twins enable cost-effective automation of experimental workflows. They offer a pragmatic solution for academic labs with limited resources, supporting increased throughput and experimental reproducibility while preserving key features such as scalability and modularity.55
While HTE's parallel screening accelerates exploration across a broader chemical and material space, its effectiveness ultimately depends on carefully (1) selecting the initial parameters and (2) integrating characterization and testing techniques to ensure accuracy. For instance, at the symposium, Bozal-Ginesta et al. presented an excellent example of HT synthesis and characterization of perovskite thin-film libraries.56 This work characterized compositional maps using various HT techniques, such as XRF, XRD, Raman, and EIS. The authors stressed the importance of a sufficiently large and consistent experimental dataset, alongside appropriate data processing and analysis routines, for extracting relevant oxygen reduction reaction properties. Gagliardi and Del Ferro also showcased at the symposium an HTE campaign in which researchers evaluated the performance of decorated MOFs in propylene dimerization to hexadienes.4,57 Initial exploration using Bayesian optimization required over six months and 721 experiments, unexpectedly resulting in low-yield (<5%) reactions. Consequently, a reevaluation of the initial parameters led the authors to redesign the experiments, increasing the H2 concentration to 40 vol%, which allowed significant yield improvements.4,57 These improvements were attributed to the formation of larger nanoparticles under higher H2 levels. This example illustrates the relevance of a human-in-the-loop approach, i.e., using subject-matter expertise to guide parameter choices effectively, as screening all possible conditions is often impractical due to time and cost.4,54,57,58 Crucially, the success of this campaign depended on careful post-hoc characterization of the catalysts using high-angle annular dark-field scanning transmission electron microscopy (HAADF-STEM) and DFT modeling, which revealed that the larger copper nanoparticles formed under high H2 conditions were responsible for the improved catalytic activity.
This underscores that high-throughput experimentation alone is insufficient: effective HT catalysis requires integrating detailed characterization and mechanistic understanding to interpret results accurately and guide subsequent experimental design.
For accelerated catalyst discovery platforms relying on a high-throughput framework, a tight coupling between synthesis and characterization is fundamental. For instance, high-throughput PXRD is an invaluable tool for identifying materials or validating synthesis processes. Patterns can be matched against databases or analyzed directly, with parameters then compared to target materials. However, relying solely on statistical measures such as goodness-of-fit for analysis is insufficient. Poor fits may arise from unfitted diffraction features or slight inaccuracies in peak shapes.59,60 Trained experts should evaluate these results and consider alternative models, such as structural disorder or missing elements. Understanding the effects of disorder and defects is essential for differentiating between genuinely novel structures and those arising from imperfections. Incorporating these considerations into material characterization can provide critical guidance in defining exploration boundaries.
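As a minimal illustration of the database-matching step described above, the sketch below scores candidate phases by the fraction of reference peak positions reproduced within a tolerance, and flags near-ties for expert review rather than trusting the score alone. The peak lists and phase names are purely illustrative, and a production HT-PXRD pipeline would instead rely on full-profile (e.g., Pawley or Rietveld) fitting against curated reference databases.

```python
def match_score(observed, reference, tol=0.2):
    """Fraction of reference peaks (2-theta, degrees) matched by an
    observed peak within +/- tol degrees. A crude figure of merit: as
    noted in the text, a good score alone does not prove a phase match."""
    hits = sum(any(abs(o - r) <= tol for o in observed) for r in reference)
    return hits / len(reference)

def identify(observed, library, threshold=0.8):
    """Rank candidate phases from a reference library (name -> peak list).
    Returns (best phase or None, ambiguity flag); ambiguous near-ties
    should be escalated to a trained expert, not auto-accepted."""
    scored = sorted(((match_score(observed, pks), name)
                     for name, pks in library.items()), reverse=True)
    best_score, best_name = scored[0]
    ambiguous = len(scored) > 1 and best_score - scored[1][0] < 0.1
    return (best_name if best_score >= threshold else None), ambiguous

library = {   # illustrative peak positions, not real reference data
    "phase A": [10.2, 14.7, 20.5, 25.1],
    "phase B": [10.2, 16.3, 22.8, 29.4],
}
obs = [10.25, 14.65, 20.4, 25.2, 31.0]   # one unindexed extra peak at 31.0
print(identify(obs, library))            # -> ('phase A', False)
```

Note that the unindexed peak at 31.0 does not lower the score at all: exactly the kind of unfitted diffraction feature that, in real data, could signal an impurity, disorder, or a genuinely new phase, and that purely statistical measures can silently ignore.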
There is an essential balance between data collection speed and quality. When structural information is paramount, prioritizing quality over speed is essential to avoid misinterpretation, which could derail the discovery process. For example, in HTC, the balance between precision and speed is achieved by adequately selecting the computational method and limiting the dimension of the cluster models.34 Although it is evident that methods with prohibitive resource scaling are unsuitable for large-scale screening, it is critical in HT catalysis to prioritize data quality over computational speed. This ensures that the resulting models can capture the underlying chemical complexity and preserve the integrity and predictive power of the analysis.
Therefore, devoting HTE and HTC resources to investigate crystalline structures plays a relevant role in exploring regions with the potential for discovering new catalysts, even if these are model systems that may not fully capture the complexity of industrial-scale processes. During HT catalyst activity screening, it is worth noting that using small-scale testing and characterization rarely guarantees success in real-world conditions. Nonetheless, HTE provides a unique advantage by enabling the rapid pre-screening of vast experimental spaces using minimal material and time, thereby guiding more targeted optimization efforts at larger scales. While miniaturized setups may not capture all kinetic or thermodynamic aspects relevant for scale-up, they provide valuable early-stage insight that accelerates and informs subsequent development. However, relatively few studies report systematic follow-up even at the laboratory scale, reflecting both logistical constraints and the prevailing gap between screening platforms and scale-up infrastructure. Despite this limitation, the ability of HTE to rapidly triage large sets of possibilities enables more efficient use of experimental resources. Notable efforts toward bridging this gap include the work by Ruan et al.,61 where large language model (LLM)–powered agents were integrated with automated synthesis and testing platforms to support a complete workflow from literature mining through kinetic analysis and optimization to scale-up, for various catalytic processes. Such end-to-end approaches highlight the growing potential for routine translation of HTE findings into scalable technologies.62 To fully realize this potential, it is essential to establish consistent protocols for HT experiments, including characterization and standardized data reporting (as discussed in Section 4). 
These practices are crucial to enhancing reproducibility and minimizing duplication of effort across multiple and diverse research environments, significantly advancing the pace of catalyst discovery.63
Multiple databases and repositories, such as CatApp,65 Catalysis-hub,66 Open Catalyst,35 ioChem-BD,67 and NOMAD,68 are publicly available, aiming to supplement in-house data resources and support data management tasks. Although the goal of these databases is to ensure that data adhere to the FAIR principles for data management—being Findable, Accessible, Interoperable, and Reusable69—this is still not always attainable in practice.51 This limited practical adherence is reflected in the persistently low dataset-to-article ratio observed in chemistry, which remains below the scientific average and is even lower in catalysis, where approximately one dataset is available per one hundred articles, indicating that experimental and computational data are still rarely shared at scale.51 One root cause may be resistance to change on the part of researchers and institutions, resulting in longstanding inconsistencies in database design and repository standards. Additionally, HTC datasets frequently omit key experimental conditions such as reaction temperature, pH, solvent, or chemical environment for simplicity.36,70–72 This lack of uniformity across databases makes it difficult to apply AI tools to advance the discovery process through complex analysis.73,74 Thus, standardized datasets can not only improve the training of machine learning models but also deepen our understanding of how experimental variables influence material performance and catalytic activity.
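As a concrete, if simplified, illustration of what such standardization could look like in practice, the sketch below defines a hypothetical record format (the field names are ours, not a community standard) in which the conditions often omitted from HT datasets, such as temperature, solvent, and pH, are explicit and unit-annotated, and basic validation runs before any record is serialized for sharing.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import Optional

@dataclass
class CatalysisRecord:
    """Minimal, FAIR-oriented record for one catalytic test. Every
    quantity carries an explicit unit in its field name, and conditions
    frequently dropped 'for simplicity' are mandatory fields."""
    sample_id: str
    catalyst: str
    reaction: str
    temperature_K: float
    pressure_bar: float
    solvent: str
    ph: Optional[float]       # None only if genuinely not applicable
    conversion_pct: float
    selectivity_pct: float
    metadata: dict = field(default_factory=dict)  # instrument, operator, date...

    def validate(self) -> None:
        """Reject physically impossible values before the record is shared."""
        if not 0.0 <= self.conversion_pct <= 100.0:
            raise ValueError("conversion must be 0-100%")
        if not 0.0 <= self.selectivity_pct <= 100.0:
            raise ValueError("selectivity must be 0-100%")
        if self.temperature_K <= 0:
            raise ValueError("temperature must be positive (kelvin)")

    def to_json(self) -> str:
        self.validate()
        return json.dumps(asdict(self), sort_keys=True)

rec = CatalysisRecord(
    sample_id="MOF-001-run42", catalyst="Cu/MOF (illustrative)",
    reaction="propylene dimerization", temperature_K=453.0,
    pressure_bar=20.0, solvent="none (gas phase)", ph=None,
    conversion_pct=12.5, selectivity_pct=87.0,
    metadata={"instrument": "parallel reactor A", "date": "2024-08-19"},
)
print(rec.to_json())
```

Sorted-key JSON with explicit units keeps such records machine-readable and diff-friendly across institutions; a community schema would add controlled vocabularies and persistent identifiers on top of this skeleton.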
In the past decades, scientific research has also shifted towards projects that rely on multi-institutional efforts. This often requires scientists to share and combine datasets from multiple sources while ensuring consistency, so that research efforts translate into meaningful insights.75,76 However, the lack of standardized protocols across institutions and the complexity of diverse experimental setups introduce additional challenges. This becomes particularly complicated when data are obtained from various pieces of equipment, each operated by software developed by a different vendor, often resulting in fragmented and difficult-to-integrate data. Although several efforts have been made to overcome these barriers,77,78 it is crucial to reassess the data collection mechanism as a community. This represents an essential step towards making data machine- and human-readable to drive discovery, including for AI agents.79 Thus, the discussion around data integration and interpretation requires a focus on (1) technical approaches, such as new algorithms or methodologies for combining diverse data types, and (2) pragmatic organizational strategies to improve workflows and communication across disciplines and equipment. To this end, interdisciplinary collaborations that involve data scientists in research teams, together with enhanced data stewardship educational programs, are critical to advancing effective and collaborative data analysis.
There is no doubt that the conjunction of HTE and HTC provides enormous opportunities to transform the field of catalysis. The integration of HTC with HTE offers synergistic advantages: HTC can serve both as a pre-screening tool, guiding the design of focused HTE campaigns, and as a post-analysis aid, helping to rationalize trends and mechanistic insights from experimental data. At the symposium, this was exemplified by Ser and Hao et al., who demonstrated the ligand dependence of palladium-catalyzed protodeboronation.40 The HTE setup enabled the parallel reaction of 27 unique phosphine ligands in one workflow, with the respective deboronation yields subsequently analyzed via HPLC. HTC via DFT was used to investigate the proposed reaction mechanism, comprising 23 intermediates and transition states, for all 27 ligands, determining that bulkier ligands favor the formation of an unstable post-transmetalation product that subsequently undergoes facile protodeboronation with water, supporting the experimental observations. This example demonstrates how the synergy between HTE and HTC can facilitate the elucidation of more general trends. This interplay between simulation and experimentation is also central to the development of autonomous, closed-loop platforms for catalytic discovery.
HT research unifies aspects of many traditionally disparate fields, such as synthesis, analytical chemistry, statistics, machine learning, robotics, cheminformatics, and quantum chemistry, to name a few. Properly capitalizing on these opportunities requires scientists who have been trained to take advantage of them. It follows that, in educating both new generations of scientists and those with long experience, a certain amount of technical knowledge outside one's core competency domain is beneficial. This holds true whether one is working in comparative isolation or as part of a large, multidisciplinary team. For team leads, the ability to translate between experts in different knowledge domains becomes paramount to fostering good teamwork. Large organizations in particular, such as industrial research laboratories, will need to adjust hiring and internal upskilling efforts to facilitate cross-disciplinary communication, particularly accounting for a gradual shift towards a more significant share of digital skill sets. Both research managers and researchers must adopt an open-minded yet practical approach to the rapidly advancing fields within the digital space. Unrealistic expectations and the indiscriminate use of AI for the sake of following the latest trend can undermine trust, while also leading to a significant waste of resources, including time, funding, and energy, particularly in large-scale screening efforts.
Another critical point in the discussion of data integration arises when HT catalysis campaigns involving both theory and experiment are considered. Here, one key challenge is the lack of knowledge of the atomistic structure of a catalyst under catalytic conditions, in line with the discussion in Section 3. This often results in simplified and not necessarily representative catalyst models.57,80–82 The successful integration of HTC into catalysis research frequently requires multiple iterations and feedback from experimental results to further validate or refine existing hypotheses about plausible active sites and the corresponding catalytic behavior. Thus, planning how metadata should be stored in HTC campaigns is not always straightforward, as this requires prior knowledge and development time on both the computational and experimental sides.
A frequently overlooked opportunity in scientific research arises when model predictions and experimental data do not align. Rather than discarding such results or forcing models to fit, these discrepancies should be seen as valuable opportunities for new insights. The same applies to results that do not yield the expected material or catalytic activity; such data are often excluded from publications or patents because they may be perceived as a distraction. Yet such results can lead to a better understanding of what drives or inhibits a catalytic system. In this sense, learning from ‘negative’ results is crucial to advancing scientific research, and creating a culture of documenting negative results is essential. To this end, journals and funding agencies should promote a paradigm shift away from publication bias and incentivize proper documentation of comprehensive datasets, avoiding selective reporting that shapes findings to fit a preferred narrative.83 Novel publication forms, such as tutorials and articles that provide quick incremental updates like the Commits article type recently released by Digital Discovery,84 are much needed for the effective co-development of software-hardware workflows.
Similarly, leveraging historical data is essential to enhancing the precision and efficacy of HT workflows.76 Existing datasets, including those from less successful or “negative” experiments, are invaluable for elucidating material properties, optimizing reaction conditions, and refining methodologies. The preservation and conditioning of historical data and metadata can guide researchers in identifying patterns, preventing redundant experiments, and enhancing predictive models. Recent advances in literature mining, particularly those leveraging LLM agents,85 have shown that information from historical sources can be systematically extracted and structured, offering a cost-effective and scalable way to expand training datasets and reveal trends that would otherwise be overlooked. Innovation in HT research requires both adapting existing technologies and creating novel ones. For instance, emerging techniques such as AI-driven optimization algorithms and automated synthesis platforms can streamline discovery.15,86,87 Along these lines, integrating negative results is critical for diversifying AI datasets. As mentioned in Section 4, most databases and the published literature predominantly report successful experiments, while failed attempts or non-productive conditions are underrepresented due to longstanding publication biases. This lack of negative data leads to biased training sets for machine learning models, including LLMs,88–92 which are often fine-tuned on scientific texts. Although these models are not trained on nonsensical or erroneous data, their exposure is limited to what is reported, typically positive results. This can result in overly optimistic or incomplete suggestions when LLMs are used for hypothesis generation or retrosynthetic planning. Still, despite lacking fine-grained experimental details, historical datasets offer valuable patterns and trends that support both human and AI-driven discovery.
Adopting effective workflows and datasets must go hand-in-hand with developing new tools, such as next-generation HT spectroscopic techniques or reactors, which can provide more precise data and enable new insights into molecular and material behavior under different conditions. HT research should also be democratized by combining these principles with emerging technologies such as AI and robotics. This can only be accomplished by openly sharing data, methodologies, and tools, so that researchers worldwide, not only those at well-funded institutions, can access and build on them, bringing new perspectives, applications, and solutions to the table. The seamless integration of automated data analysis and human expertise ensures that each experimental run contributes to an incremental, continuous learning feedback loop.
Finally, a crucial element in HT optimization is the commitment to continuous, critical thinking to transition from high-throughput to smart-throughput experimentation.93 Artificial intelligence and machine learning approaches are increasingly integral to both HTE and HTC workflows: Bayesian optimization and active learning can efficiently guide experimental campaigns, ML interatomic potentials accelerate atomistic simulations at near-DFT accuracy, and large language models can mine the literature to reveal catalytic patterns. However, realizing their full potential requires improving data diversity and reliability, refining uncertainty quantification, and developing tighter integration between computational and experimental pipelines. Researchers must rigorously assess and refine experimental protocols, challenge assumptions, and embrace new tools and methodologies, including programming skills and statistical or machine-learning models, to improve data interpretation. Embracing feedback from both successes and failures allows for iterative growth in HT research, keeping in mind that unexpected results can spark new hypotheses, ideas, and discoveries.
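The sequential loop mentioned above, in which a surrogate model proposes the next experiment and each result then updates the model, can be sketched in a few lines. Below is a minimal, self-contained illustration of Bayesian optimization over a single reaction condition, using a small Gaussian-process surrogate and an upper-confidence-bound acquisition rule; the toy `run_experiment` function, length scale, and candidate grid are illustrative assumptions, not values from any real campaign.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale):
    # squared-exponential kernel between two 1D vectors of conditions
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / length_scale**2)

def gp_posterior(x_train, y_train, x_test, length_scale=0.2, noise=1e-4):
    # Gaussian-process posterior mean and standard deviation at x_test
    K = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test, length_scale)
    mu = Ks.T @ np.linalg.solve(K, y_train)
    v = np.linalg.solve(K, Ks)
    var = 1.0 - np.sum(Ks * v, axis=0)   # prior variance is 1 for this kernel
    return mu, np.sqrt(np.clip(var, 0.0, None))

def run_experiment(x):
    # stand-in for a real measurement: a yield-like curve peaking near x ~ 0.6-0.7
    return -(x - 0.6) ** 2 + 0.05 * np.sin(20 * x)

candidates = np.linspace(0.0, 1.0, 101)   # discretized condition space
X = np.array([0.1, 0.9])                  # two initial experiments
y = run_experiment(X)

for _ in range(8):                        # sequential campaign of 8 more runs
    mu, sigma = gp_posterior(X, y, candidates)
    ucb = mu + 2.0 * sigma                # upper-confidence-bound acquisition
    x_next = candidates[np.argmax(ucb)]   # most promising untried condition
    X = np.append(X, x_next)
    y = np.append(y, run_experiment(x_next))

best_condition = X[np.argmax(y)]          # best condition found so far
```

In a real HTE campaign, `run_experiment` would be replaced by a robotic synthesis-and-characterization step, and the acquisition rule (here a fixed exploration weight of 2.0) would be tuned to balance exploring uncertain regions against exploiting known-good conditions.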
Beyond these overarching directions, it is important to recognize that catalysis presents a unique set of challenges distinct from those in broader chemistry or materials science. Catalytic systems are inherently dynamic, operating under nonequilibrium conditions and involving complex reaction networks, multiscale transport phenomena, and evolving active sites that are difficult to capture experimentally or computationally. Furthermore, bridging the gap between model and real catalysts, across pressures, timescales, and reactor environments, remains a fundamental obstacle to predictive understanding. The interplay between catalyst structure, reaction mechanism, and process conditions also introduces feedback loops that complicate optimization efforts. Addressing these catalysis-specific complexities will be essential to realizing the transformative potential outlined above, ensuring that high-throughput, data-driven approaches deliver meaningful insights into real-world catalytic behavior.
This journal is © The Royal Society of Chemistry 2026