Natural Language Processing for Automated Workflow and Knowledge Graph Generation in Self-Driving Labs

Abstract

Natural language processing with large language models such as ChatGPT has become ubiquitous in software applications and lets users interact with even complex hardware or software in an intuitive way. Self-Driving Labs and Material Acceleration Platforms stand to benefit greatly from this: enhanced user-friendliness, or even fully automated generation of experimental workflows from user input or previously published procedures, would make these platforms and their complex hardware accessible to a broader scientific community. Here, two new datasets with over 1.5 million experimental procedures and their (semi)automatic annotations as action graphs, i.e., structured output, were created and used to train two different transformer-based large language models. These models strike a balance between performance, generality, and fitness for purpose, and they can be hosted and run on standard consumer-grade hardware. Furthermore, the generation of node graphs from these action graphs is explored as a user-friendly and intuitive way of visualizing and modifying synthesis workflows that can be run on the hardware of a Self-Driving Lab or Material Acceleration Platform. Lastly, it is discussed how knowledge graphs, which follow an ontology imposed by the underlying node setup and software architecture, can be generated from the node graphs. All resources, including the datasets, the fully trained large language models, the node editor, and scripts for querying and visualizing the knowledge graphs, are made publicly available.
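To make the chain from action graph to node graph to knowledge graph more concrete, the following minimal Python sketch may help. It is purely illustrative and does not reproduce the paper's models, node editor, or ontology: the `Action` schema, the functions `to_node_graph` and `to_triples`, the predicate names (`is_a`, `has_...`, `followed_by`), and the example procedure are all hypothetical assumptions chosen for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One step of an experimental procedure (hypothetical schema)."""
    name: str
    params: dict = field(default_factory=dict)


def to_node_graph(actions):
    """Turn an ordered list of actions into nodes joined by sequential edges."""
    nodes = [
        {"id": f"n{i}", "type": action.name, "params": dict(action.params)}
        for i, action in enumerate(actions)
    ]
    edges = [
        (nodes[i]["id"], nodes[i + 1]["id"]) for i in range(len(nodes) - 1)
    ]
    return {"nodes": nodes, "edges": edges}


def to_triples(graph):
    """Flatten a node graph into (subject, predicate, object) triples."""
    triples = []
    for node in graph["nodes"]:
        triples.append((node["id"], "is_a", node["type"]))
        for key, value in sorted(node["params"].items()):
            triples.append((node["id"], f"has_{key}", str(value)))
    for source, target in graph["edges"]:
        triples.append((source, "followed_by", target))
    return triples


# Hypothetical action graph, as a language model might emit for a procedure.
actions = [
    Action("Add", {"chemical": "ethanol", "amount": "10 mL"}),
    Action("Stir", {"duration": "30 min"}),
    Action("Centrifuge", {"duration": "10 min"}),
]

graph = to_node_graph(actions)
triples = to_triples(graph)

for triple in triples:
    print(triple)
```

In this sketch the ontology is implied by the node types and parameter names, which loosely mirrors the idea that the knowledge graph follows the structure imposed by the node setup. A real system would likely use typed ports, branching edges, and a formal vocabulary rather than plain strings.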

Article information

Article type: Paper
Submitted: 13 Feb 2025
Accepted: 02 May 2025
First published: 05 May 2025
This article is Open Access
Creative Commons BY license

Digital Discovery, 2025, Accepted Manuscript

B. Ruehle, Digital Discovery, 2025, Accepted Manuscript, DOI: 10.1039/D5DD00063G

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
