Issue 17, 2022

Detection of multi-reference character imbalances enables a transfer learning approach for virtual high throughput screening with coupled cluster accuracy at DFT cost

Abstract

Appropriately identifying and treating molecules and materials with significant multi-reference (MR) character is crucial for achieving high data fidelity in virtual high-throughput screening (VHTS). Despite development of numerous MR diagnostics, the extent to which a single value of such a diagnostic indicates the MR effect on a chemical property prediction is not well established. We evaluate MR diagnostics for over 10 000 transition-metal complexes (TMCs) and compare to those for organic molecules. We observe that only some MR diagnostics are transferable from one chemical space to another. By studying the influence of MR character on chemical properties (i.e., MR effect) that involve multiple potential energy surfaces (i.e., adiabatic spin splitting, ΔEH–L, and ionization potential, IP), we show that differences in MR character are more important than the cumulative degree of MR character in predicting the magnitude of an MR effect. Motivated by this observation, we build transfer learning models to predict CCSD(T)-level adiabatic ΔEH–L and IP from lower levels of theory. By combining these models with uncertainty quantification and multi-level modeling, we introduce a multi-pronged strategy that accelerates data acquisition by at least a factor of three while achieving coupled cluster accuracy (i.e., to within 1 kcal mol−1 MAE) for robust VHTS.

Graphical abstract: Detection of multi-reference character imbalances enables a transfer learning approach for virtual high throughput screening with coupled cluster accuracy at DFT cost

Supplementary files

Article information

Article type
Edge Article
Submitted
21 Jan 2022
Accepted
04 Apr 2022
First published
05 Apr 2022
This article is Open Access

All publication charges for this article have been paid for by the Royal Society of Chemistry
Creative Commons BY-NC license

Chem. Sci., 2022,13, 4962-4971

Detection of multi-reference character imbalances enables a transfer learning approach for virtual high throughput screening with coupled cluster accuracy at DFT cost

C. Duan, D. B. K. Chu, A. Nandy and H. J. Kulik, Chem. Sci., 2022, 13, 4962 DOI: 10.1039/D2SC00393G

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.

To request permission to reproduce material from this article in a commercial publication, please go to the Copyright Clearance Center request page.

If you are an author contributing to an RSC publication, you do not need to request permission provided correct acknowledgement is given.

If you are the author of this article, you do not need to request permission to reproduce figures and diagrams provided correct acknowledgement is given. If you want to reproduce the whole article in a third-party commercial publication (excluding your thesis/dissertation for which permission is not required) please go to the Copyright Clearance Center request page.

Read more about how to correctly acknowledge RSC content.

Social activity

Spotlight

Advertisements