Context-aware Computer Vision for Chemical Reaction State Detection
Abstract
Real-time monitoring of laboratory experiments is essential for automating complex workflows and enhancing experimental efficiency. Accurate detection and classification of chemicals in varying forms and states support a range of techniques, including liquid-liquid extraction, distillation, and crystallization. However, detecting chemical forms is challenging: some classes appear visually similar, and the classification of a form is often context-dependent. In this study, we adapt the YOLO model into a multi-modal architecture that integrates scene images and task context for object detection. With the help of Large Language Models (LLMs), the developed method reasons about the experimental process and uses the reasoning result as context guidance for the detection model. Experimental results show that introducing context during training and inference improves the performance of the proposed model, YOLO-text, across all classes, and the model makes accurate predictions on visually similar regions. Compared to the baseline, our model improves overall mAP by 4.8% without context and by 7% with context. The proposed framework can classify and localize substances with and without contextual suggestions, thereby enhancing the adaptability and flexibility of the detection process.
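The abstract describes fusing task-context text with image features for detection. As a minimal illustrative sketch (not the paper's actual architecture): the context string is mapped to a fixed-size embedding, broadcast spatially, and concatenated with the image feature map channels before the detection head. The `embed_context` hashing embedding is a stand-in assumption for the LLM-derived context representation; all function names here are hypothetical.

```python
import numpy as np

def embed_context(text, dim=8):
    # Toy hashed bag-of-words embedding; a placeholder for an
    # LLM/text-encoder context vector (assumption, not the paper's method).
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[hash(tok) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def fuse_context(feat, ctx):
    # feat: (C, H, W) image feature map; ctx: (D,) context embedding.
    # Broadcast the context vector over all spatial positions and
    # concatenate along the channel axis -> (C + D, H, W).
    _, H, W = feat.shape
    ctx_map = np.broadcast_to(ctx[:, None, None], (ctx.shape[0], H, W))
    return np.concatenate([feat, ctx_map], axis=0)

# Example: a 16-channel feature map fused with an 8-dim context vector.
feat = np.random.rand(16, 4, 4)
fused = fuse_context(feat, embed_context("liquid-liquid extraction step"))
print(fused.shape)  # (24, 4, 4)
```

A detection head operating on the fused tensor can then condition its class predictions on the stated task, which is how context could disambiguate visually similar substance classes.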