The effect of chemical representation on active machine learning towards closed-loop optimization

A. Pomberger; A. A. Pedrina McCarthy; A. Khan; S. Sung; C. J. Taylor; M. J. Gaunt; L. Colwell; D. Walz; A. A. Lapkin

doi:10.1039/D2RE00008C

A. Pomberger,^a A. A. Pedrina McCarthy,^b A. Khan,^a S. Sung,^c C. J. Taylor,^ad M. J. Gaunt,^b L. Colwell,^b D. Walz^e and A. A. Lapkin*^ac

Author affiliations

* Corresponding authors

^a Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge CB3 0AS, UK
E-mail: aal35@cam.ac.uk

^b Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, UK

^c Cambridge Centre for Advanced Research and Education in Singapore Ltd., CREATE Tower 05-05, 138602 Singapore

^d Astex Pharmaceuticals, 436 Cambridge Science Park, Milton, Cambridge CB4 0QA, UK

^e BASF SE Data Science for Materials, Carl-Bosch-Strasse 38, 67056 Ludwigshafen am Rhein, Germany

Abstract

Multivariate chemical reaction optimization involving catalytic systems is a non-trivial task due to the high number of tuneable parameters and discrete choices. Active machine learning (ML) represents a powerful strategy for automating reaction optimization. However, the translation of chemical reaction conditions into a machine-readable format requires the identification of highly informative features that accurately capture the factors that determine reaction success. Herein, we compare the efficacy of different calculated chemical descriptors for a dataset generated by high-throughput experimentation to determine their impact on a supervised ML model when predicting reaction yield. Then, the effects of featurization and of the size of the initial dataset within a closed-loop reaction optimization were examined. Finally, the balance between descriptor complexity and dataset size was considered. Ultimately, tailored descriptors did not outperform simple generic representations; however, a larger initial dataset accelerated reaction optimization.

Article information

DOI: https://doi.org/10.1039/D2RE00008C
Article type: Paper
Submitted: 06 Jan 2022
Accepted: 07 Feb 2022
First published: 11 Mar 2022
This article is Open Access

React. Chem. Eng., 2022, 7, 1368–1379

A. Pomberger, A. A. Pedrina McCarthy, A. Khan, S. Sung, C. J. Taylor, M. J. Gaunt, L. Colwell, D. Walz and A. A. Lapkin, React. Chem. Eng., 2022, 7, 1368, DOI: 10.1039/D2RE00008C

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.

Reaction Chemistry & Engineering
