Renzhi Ma
Research Center for Materials Nanoarchitectonics (MANA), National Institute for Materials Science (NIMS), 1-1 Namiki, Tsukuba, Ibaraki 305-0044, Japan. E-mail: MA.Renzhi@nims.go.jp
To realize a productive human–AI partnership, AI systems must offer transparent, interpretable, and reliable outputs that complement human creativity and critical thinking, avoiding both over-reliance and uncritical acceptance. Emerging approaches, such as expert-guided collaboration and domain-specific reasoning, are beginning to address these challenges. Ultimately, aligning AI agents with diverse scientific values and contexts will foster a balanced and collaborative research environment.
AI agentic models, often referred to as AI agents to emphasize their autonomy, have become adept not only at coordinating cyber assets such as datasets and computational simulations and at generating ideas or hypotheses, but also at autonomously operating physical assets, including experimental equipment and advanced characterization and measurement systems. In particular, the capability for literature data mining has improved dramatically in recent years. With conversational AIs such as ChatGPT and DeepSeek, which exhibit high-level cognition and natural language understanding, it is now possible to autonomously search and compile representative studies, select those of interest, generate reusable structured knowledge or knowledge graphs, and predict research trends.
As the performance of LLMs is closely linked to their ability to comprehend and reason over complex, interconnected materials science knowledge, large-scale datasets capturing both the latest scientific advancements and valuable materials science information are vitally needed. The National Institute for Materials Science (NIMS) in Japan has been developing the Materials Data Platform (MDPF), designed to share data collected through the Advanced Research Infrastructure for Materials and Nanotechnology (ARIM) project.4 In 2026, a machine learning system named Playground for INspiring AI eXperiences (pinax) was launched to enable researchers to analyze and effectively utilize diverse datasets. In China, the Songshan Lake Materials Laboratory has developed a specialized AI model called MatChat to provide precise, knowledge-based answers for materials science researchers.5 Another model, MatMind, developed at the Shanghai Institute of Ceramics, integrates cross-scale materials data ranging from atomic simulations to industrial parameters. According to the report, it is trained on over 200 000 materials science data points, 10 million literature documents, and 1.5 million patent data points accumulated over many years at the institute.6 In these latest LLMs, retrieval-augmented generation (RAG) techniques have been employed to expand the models’ knowledge bases and generative capabilities beyond their training datasets, enabling logically rigorous inference chains and enhancing reasoning accuracy.
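As a rough illustration of the retrieval step behind RAG, the following Python sketch ranks a small in-memory corpus against a query by bag-of-words cosine similarity and assembles the retrieved passages into a prompt. It is not the pipeline used by MatChat or MatMind; the function names, the `CORPUS` entries, and the prompt wording are all assumptions made for illustration, and production systems typically rely on learned embeddings and a vector index instead of token counts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents with non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Prepend the retrieved passages to the question (hypothetical format)."""
    context = retrieve(query, documents, k)
    lines = ["Answer using only the context below."]
    lines += [f"[{i + 1}] {doc}" for i, doc in enumerate(context)]
    lines.append(f"Question: {query}")
    return "\n".join(lines)


# Illustrative stand-in for a curated materials knowledge base.
CORPUS = [
    "NMC811 is a layered oxide cathode material for lithium-ion batteries.",
    "Metal-organic frameworks are porous crystalline solids.",
    "Thermogravimetric analysis measures mass change on heating.",
]
```

The prompt returned by `build_prompt` would then be passed to whichever LLM is in use, so that the model answers from retrieved passages and not from its training data alone.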
At the experimental science level, several cutting-edge advances have enabled the automation of entire experimental processes, including synthesis, measurement, and characterization, through the control of diverse hardware devices such as robotic arms and spectrometers. In 2023, Ceder and colleagues developed an autonomous laboratory (A-Lab) that achieves full-process automation, from compound design to synthesis and characterization.7 This work drew significant attention for its successful synthesis of 41 novel compounds, primarily oxides and phosphates, over just 17 days. Another autonomous platform, Coscientist, integrates multiple AI agents capable of planning synthesis routes and writing code to directly drive laboratory hardware, thus realizing a true automated closed-loop “design-synthesis-test” process.8 Coscientist was reported to reproduce, within 4 minutes, a reaction recognized by the 2010 Nobel Prize in Chemistry, demonstrating significant application potential across various other reactions.
A profound challenge in materials research is to unravel the complex and highly nonlinear relationships among structure, property, processing, and performance. Existing LLMs and AI agents still struggle to thoroughly grasp the physical essence of sophisticated material design, which limits their predictive accuracy and generalization capabilities. Overcoming these limitations requires AI agents to deeply correlate structural evolution with property changes. One promising solution is to employ multiple AI agents that perform cross-checking or mutual verification, functioning as “virtual research teams”. These specialized agents, focused on literature retrieval, experimental design, and data validation, collaborate through multi-agent systems, mimicking the interactive dynamics of human scientific research. For example, in Agent Laboratory,9 a cluster of agents representing PhD students, postdocs, and engineers takes on tasks such as literature review, experimental design, and code implementation, respectively, to enhance collaborative efficiency. In a typical material synthesis process, the PhD agent analyzes literature data to construct a knowledge graph, while the postdoc agent uses this graph to design comparative experimental schemes. The engineer agent carries out validation via an automated experimental platform, reducing the overall process time to one-sixth of that required by traditional methods. Another agentic model, ChatBattery,10 deployed seven specialized agents to autonomously complete the entire workflow, from literature review and design to experimental validation. With minimal human intervention, the system successfully discovered and synthesized three novel lithium-ion battery cathode materials exhibiting a reversible capacity 28.8% higher than the baseline material (NMC811), while compressing the R&D cycle from years to months.
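The division of labor described above can be sketched, under heavy simplification, as a chain of role-specific functions that pass a shared record from literature review to design, execution, and validation. This is only a toy illustration and not the implementation of Agent Laboratory or ChatBattery: the `Notebook` class, the agent functions, and the acceptance threshold are hypothetical, and each function stands in for what would in practice be an LLM call or an automated experiment.

```python
from dataclasses import dataclass, field


@dataclass
class Notebook:
    """Shared state handed from one agent to the next (hypothetical)."""
    topic: str
    facts: list = field(default_factory=list)
    plan: list = field(default_factory=list)
    results: dict = field(default_factory=dict)
    log: list = field(default_factory=list)


def literature_agent(nb, corpus):
    """Keep corpus entries that mention the topic (stand-in for an LLM call)."""
    nb.facts = [e for e in corpus if nb.topic.lower() in e.lower()]
    nb.log.append(f"literature: {len(nb.facts)} relevant entries")


def design_agent(nb, candidates):
    """Propose one experiment per candidate, only if literature support exists."""
    nb.plan = [f"synthesize {c}" for c in candidates] if nb.facts else []
    nb.log.append(f"design: {len(nb.plan)} experiments planned")


def execution_agent(nb, run_experiment):
    """Run each planned step through a caller-supplied experiment function."""
    nb.results = {step: run_experiment(step) for step in nb.plan}
    nb.log.append(f"execution: {len(nb.results)} experiments run")


def validation_agent(nb, threshold):
    """Cross-check results: accept only scores at or above the threshold."""
    accepted = {s: v for s, v in nb.results.items() if v >= threshold}
    nb.log.append(f"validation: {len(accepted)} of {len(nb.results)} accepted")
    return accepted


def run_pipeline(topic, corpus, candidates, run_experiment, threshold):
    """Chain the four agents and return the accepted results plus the record."""
    nb = Notebook(topic=topic)
    literature_agent(nb, corpus)
    design_agent(nb, candidates)
    execution_agent(nb, run_experiment)
    accepted = validation_agent(nb, threshold)
    return accepted, nb
```

Keeping a log on the shared record is one simple way to make each agent's contribution inspectable by a human supervisor after the run.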
In addition, general-purpose LLMs commonly struggle to comprehend or may confuse similar concepts when processing multi-modal materials data. Highly specialized terminology unique to materials science, such as molecular descriptors, space group symbols, and crystallographic information file (CIF) formats, requires customized models to establish domain-specific cognitive frameworks. One promising approach is to create “domain experts” by balancing general language comprehension with specialized terminology recognition and fine-grained domain knowledge. Domain-specific fine-tuning enhances AI agents’ understanding of physical laws and chemical intuition, enabling simultaneous comprehension of data across different modalities and the establishment of semantic connections between them.
For instance, AI agents can be trained not only to directly comprehend lattice parameters and space group information from textual descriptions of crystal structures, but also to interpret graphical data such as diffraction patterns, microscopy images, and performance curves. In the metal–organic-framework (MOF) domain, Yaghi et al. leveraged the image analysis capabilities of multi-modal LLMs for data mining applications.11,12 By guiding GPT-4V with natural language instructions, non-textual data including X-ray diffraction patterns, nitrogen adsorption isotherms, and thermogravimetric curves can be digitally processed to efficiently extract key experimental data, such as porosity and crystallinity, from charts. This approach also identifies deviations between experimental results and theoretical predictions, providing intuitive guidance for MOF design and optimization. Another platform, ChatMOF, implements a three-tier architecture comprising planning, toolkit execution, and evaluator validation.13 The MOF design process is reconfigured into an iterative cycle of “goal setting-structure generation-data retrieval-property prediction-genetic optimization”, achieving multi-objective optimization while maintaining 87.5% generation accuracy. The versatility of domain-specialized agents in accelerating materials research has also been demonstrated in other case studies on chemical synthesis reaction development,14 energy materials,2 perovskite photovoltaic research,15,16 and beyond.
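A planner, toolkit, and evaluator arrangement of the kind mentioned above can be pictured with a minimal Python sketch. It does not reproduce ChatMOF's code or API: the rule-based `plan` function, the two toy tools, the `TABLE` values, and the acceptance rule in `evaluate` are hypothetical placeholders, whereas a real system would use an LLM planner, database queries, and trained property predictors.

```python
# Hypothetical lookup table; the value is a placeholder, not measured data.
TABLE = {"MOF-A": 1200.0}


def plan(goal):
    """Planner: map a goal to an ordered list of tool names (rule-based stand-in)."""
    if "surface area" in goal:
        return ["lookup", "predict"]
    return ["lookup"]


def lookup(name):
    """Toolkit entry 1: retrieve a stored value, or None if absent."""
    return TABLE.get(name)


def predict(name):
    """Toolkit entry 2: toy surrogate model used when no stored value exists."""
    return 100.0 * len(name)


TOOLS = {"lookup": lookup, "predict": predict}


def evaluate(value):
    """Evaluator: accept only positive floating-point results."""
    return isinstance(value, float) and value > 0


def answer(goal, name):
    """Try the planned tools in order and return the first accepted result."""
    for tool_name in plan(goal):
        value = TOOLS[tool_name](name)
        if evaluate(value):
            return tool_name, value
    return None, None
```

In this sketch, a query about a structure missing from `TABLE` falls through from retrieval to prediction, and a goal the planner cannot serve returns no answer instead of an unchecked one.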
By automating technical workflows, AI agents can greatly enhance the efficiency of research tasks, allowing human scientists to devote more attention to creative and interpretive aspects of scientific inquiry. This advantage is particularly evident in early-stage research, where idea generation and hypothesis verification are pivotal. For example, in the previously mentioned ChatBattery,10 an “expert-guided LLM reasoning” approach is utilized to achieve a synergistic effect between human and AI efforts. During AI-driven data exploration under expert supervision, scientists are increasingly positioned as strategic planners and final decision-makers, responsible for defining overarching scientific objectives and formulating high-level, creative concepts.2 Meanwhile, AI agents focus on systematic exploration, experimentation, and result validation. However, research fields with abundant, accessible data are generally more amenable to AI-driven studies.17 As a result, the adoption of AI tools has accelerated research in data-rich, highly visible topics within established disciplines. This underscores the importance of developing AI agents that not only optimize analysis of existing datasets but also incentivize human scientists to search for, select, and collect novel data from less explored or previously inaccessible domains.
A critical and cross-disciplinary challenge, extending beyond materials research and nanoscience, is the extent to which the scientific community is prepared to trust and empower AI agents.2,18 The “black box” nature of many AI models often impedes the building of genuine trust among scientists. For AI to become a true partner in scientific research, it must not only provide reliable, hallucination-free hypotheses and results, but also display transparent and interpretable reasoning, enabling scientists to rigorously validate its outputs. Conversely, uncritical acceptance of AI-generated suggestions risks undermining the critical thinking, creativity, and serendipity that characterize human-led research. Over-reliance on AI may also fundamentally impact the training of next-generation scientists, potentially reducing emphasis on the development of deep theoretical understanding and hands-on experimental skills.
Recently, approaches such as expert-guided multi-agent collaboration and transparent, domain-specific reasoning frameworks have begun to address these issues, fostering reliability, trustworthiness, and alignment with human scientific goals. Achieving this vision will require integrating balanced, multi-objective human values and a broader scientific context into AI agents’ reward functions, ultimately creating a genuinely collaborative partnership between human scientists and AI in an increasingly AI-integrated research landscape.
| This journal is © The Royal Society of Chemistry 2026 |