Issue 30, 2024

Deductive machine learning models for product identification

Abstract

Deductive solution strategies are required in prediction scenarios that are underdetermined, when contradictory information is available, or more generally wherever one-to-many, non-functional mappings occur. In contrast, most contemporary machine learning (ML) in the chemical sciences is inductive learning from examples with a fixed set of features. Chemical workflows are replete with situations requiring deduction, including many aspects of lab automation and spectral interpretation. Here, a general strategy is described for designing and training machine learning models capable of deduction; it consists of combining individual inductive models into a larger deductive network. The training and testing of these models are demonstrated on the task of deducing reaction products from a mixture of spectral sources. The resulting models can distinguish between intended and unintended reaction outcomes and identify starting material based on a mixture of spectral sources. The models also perform well on tasks that they were not directly trained on, such as performing structural inference using real rather than simulated spectral inputs, predicting minor products from named organic chemistry reactions, identifying reagents and isomers as plausible impurities, and handling missing or conflicting information. A new dataset of 1 124 043 simulated spectra, generated to train these models, is also distributed with this work. These findings demonstrate that deductive bottlenecks for chemical problems are not fundamentally insuperable for ML models.

Article information

Article type: Edge Article
Submitted: 17 Sep 2023
Accepted: 09 Jun 2024
First published: 01 Jul 2024
This article is Open Access

All publication charges for this article have been paid for by the Royal Society of Chemistry

Chem. Sci., 2024, 15, 11995-12005

T. Jin, Q. Zhao, A. B. Schofield and B. M. Savoie, Chem. Sci., 2024, 15, 11995, DOI: 10.1039/D3SC04909D

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
