High-throughput computational workflow for ligand discovery in catalysis with the CSD

Marc A. S. Short; Clare A. Tovee; Charlotte E. Willans; Bao N. Nguyen

doi:10.1039/D3CY00083D

Marc A. S. Short,^a Clare A. Tovee,^c Charlotte E. Willans^b and Bao N. Nguyen*^b

Author affiliations

* Corresponding authors

^a School of Chemical and Process Engineering, University of Leeds, Woodhouse Lane, Leeds LS2 9JT, UK

^b School of Chemistry, University of Leeds, Woodhouse Lane, Leeds LS2 9JT, UK
E-mail: b.nguyen@leeds.ac.uk

^c The Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, UK
E-mail: tovee@ccdc.cam.ac.uk

Abstract

A novel semi-automated, high-throughput computational workflow for ligand/catalyst discovery based on the Cambridge Structural Database (CSD) is reported. Two potential transition states of the Ullmann–Goldberg reaction were identified and used as templates for a ligand search within the CSD, leading to >32 000 potential ligands. The ΔG^‡ values for catalysts using these ligands were calculated using B97-3c//GFN2-xTB with high success rates and good correlation with DLPNO-CCSD(T)/def2-TZVPP. Furthermore, machine learning models were developed based on the generated data, leading to accurate predictions of ΔG^‡, with 70.6–81.5% of predictions falling within ±4 kcal mol⁻¹ of the calculated ΔG^‡, without the need for the costly calculation of the transition state. The accuracy of the machine learning models was improved to 75.4–87.8% using descriptors derived from TPSS/def2-TZVP//GFN2-xTB calculations, with a minimal increase in computational time. This new workflow offers significant advantages over currently used methods due to its faster speed and lower computational cost, coupled with excellent accuracy compared to higher-level methods.
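The accuracy figures quoted in the abstract are fractions of predictions that fall within a fixed tolerance of the calculated ΔG^‡. A minimal sketch of how such a metric could be computed is given below. The function name `fraction_within` and the numerical values are hypothetical and chosen for illustration only; they are not taken from the paper or its supplementary data, and the authors' own evaluation code may differ.

```python
from typing import Sequence


def fraction_within(
    predicted: Sequence[float],
    reference: Sequence[float],
    tol: float = 4.0,
) -> float:
    """Return the fraction of predictions with |predicted - reference| <= tol.

    Values are assumed to be barriers in kcal/mol; tol defaults to the
    4 kcal/mol window mentioned in the abstract.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one value is required")
    hits = sum(1 for p, r in zip(predicted, reference) if abs(p - r) <= tol)
    return hits / len(predicted)


# Made-up illustrative barriers in kcal/mol (NOT data from the paper).
calculated = [20.0, 25.5, 31.0, 18.2, 27.9]
ml_predicted = [22.1, 24.0, 36.5, 18.0, 33.0]

share = fraction_within(ml_predicted, calculated)
print(f"{100 * share:.1f}% of predictions within ±4 kcal/mol")
```

With these illustrative numbers, three of the five predictions lie inside the window. The tolerance is exposed as a parameter so the same helper can report other windows if needed.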

This article is part of the themed collections: Machine Learning and Artificial Intelligence: A cross-journal collection and Catalysis Science & Technology Most Popular 2023 Articles

Article information

DOI: https://doi.org/10.1039/D3CY00083D
Article type: Paper
Submitted: 16 Jan 2023
Accepted: 20 Mar 2023
First published: 22 Mar 2023
This article is Open Access

Catal. Sci. Technol., 2023, 13, 2407–2420

M. A. S. Short, C. A. Tovee, C. E. Willans and B. N. Nguyen, Catal. Sci. Technol., 2023, 13, 2407 DOI: 10.1039/D3CY00083D

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.

Catalysis Science & Technology
