Machine learning and molecular descriptors enable rational solvent selection in asymmetric catalysis

Yehia Amar; Artur M. Schweidtmann; Paul Deutsch; Liwei Cao; Alexei Lapkin

doi:10.1039/C9SC01844A


Yehia Amar,^a Artur M. Schweidtmann,^b Paul Deutsch,^c Liwei Cao^ad and Alexei Lapkin*^ad

Author affiliations

* Corresponding authors

^a Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge, UK
E-mail: aal35@cam.ac.uk

^b Aachener Verfahrenstechnik – Process Systems Engineering, RWTH Aachen University, Aachen, Germany

^c UCB Pharma S.A. Allée de la Recherche, Brussels, Belgium

^d Cambridge Centre for Advanced Research and Education in Singapore Ltd., 1 Create Way, CREATE Tower #05-05, Singapore

Abstract

Rational solvent selection remains a significant challenge in process development. Here we describe a hybrid mechanistic-machine learning approach geared towards automated process development workflows. A library of 459 solvents was used, for which 12 conventional molecular descriptors, two reaction-specific descriptors, and additional descriptors based on screening charge density were calculated. Gaussian process surrogate models were trained on experimental data from a Rh(CO)₂(acac)/Josiphos-catalysed asymmetric hydrogenation of a chiral α,β-unsaturated γ-lactam. With two simultaneous objectives – high conversion and high diastereomeric excess – the multi-objective algorithm, trained on the initial dataset of 25 solvents, identified solvents leading to better reaction outcomes. In addition to being a powerful design of experiments (DoE) methodology, the resulting Gaussian process surrogate model for conversion is, in statistical terms, predictive, with a cross-validation correlation coefficient of 0.84. After promising solvents had been identified, the composition of solvent mixtures and the optimal reaction temperature were found using black-box Bayesian optimisation. We then demonstrated the application of a new genetic programming approach to select an appropriate machine learning model for a specific physical system, which should allow the overall process development workflow to be transferred to future robotic laboratories.
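To make the "cross-validation correlation coefficient" mentioned in the abstract concrete, the following is a minimal sketch of a Gaussian process surrogate evaluated by leave-one-out cross-validation. It is not the authors' implementation: the data are synthetic (25 points with three made-up descriptors standing in for a solvent set), the kernel is a plain squared-exponential with fixed, hand-picked hyperparameters rather than fitted ones, and all function names (`rbf_kernel`, `gp_predict`, `loo_cv_correlation`) are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)


def gp_predict(X_train, y_train, X_test, noise=1e-2, length_scale=1.0):
    """Posterior mean of a constant-mean GP with an RBF kernel.

    Hyperparameters are fixed here (an assumption of this sketch);
    in practice they would be fitted, e.g. by maximising the marginal likelihood.
    """
    y_mean = y_train.mean()
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_test, X_train, length_scale)
    alpha = np.linalg.solve(K, y_train - y_mean)
    return y_mean + K_star @ alpha


def loo_cv_correlation(X, y, **gp_kwargs):
    """Leave-one-out predictions and their Pearson correlation with y."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i  # hold out point i
        preds[i] = gp_predict(X[mask], y[mask], X[i:i + 1], **gp_kwargs)[0]
    return np.corrcoef(y, preds)[0, 1], preds


# Synthetic stand-in data: 25 "solvents", 3 invented descriptors, one response.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(25, 3))
y = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.normal(size=25)

r, preds = loo_cv_correlation(X, y, noise=1e-2, length_scale=1.0)
print(f"LOO cross-validation correlation: {r:.2f}")
```

Each held-out point is predicted by a model that never saw it, so the correlation between `preds` and `y` measures out-of-sample predictive ability rather than goodness of fit. The abstract does not state which cross-validation scheme was used; leave-one-out is chosen here only because it is a common choice for small datasets.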


Article information

DOI: https://doi.org/10.1039/C9SC01844A
Article type: Edge Article
Submitted: 15 Apr 2019
Accepted: 28 May 2019
First published: 30 May 2019
This article is Open Access

All publication charges for this article have been paid for by the Royal Society of Chemistry


Chem. Sci., 2019, 10, 6697–6706



Y. Amar, A. M. Schweidtmann, P. Deutsch, L. Cao and A. Lapkin, Chem. Sci., 2019, 10, 6697, DOI: 10.1039/C9SC01844A

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.

Chemical Science
