Issue 5, 2023

Multi-constraint molecular generation using sparsely labelled training data for localized high-concentration electrolyte diluent screening

Abstract

Recently, machine learning methods have been used to propose molecules with desired properties, which is especially useful for exploring large chemical spaces efficiently. However, these methods rely on fully labelled training data, and are not practical in situations where molecules with multiple property constraints are required. There is often insufficient training data for all those properties from publicly available databases, especially when ab initio simulation or experimental property data is also desired for training the conditional molecular generative model. In this work, we show how to modify a semi-supervised variational auto-encoder (SSVAE) model which only works with fully labelled and fully unlabelled molecular property training data into the ConGen model, which also works on training data that have sparsely populated labels. We evaluate ConGen's performance in generating molecules with multiple constraints when trained on a dataset combined from multiple publicly available molecule property databases, and demonstrate an example application of building the virtual chemical space for potential lithium-ion battery localized high-concentration electrolyte (LHCE) diluents.
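To make "sparsely populated labels" concrete, the following is a minimal sketch of how property tables from separate databases might be merged into one label matrix with missing entries, and how a regression loss can be restricted to the labels that are present. It is an illustration under stated assumptions, not the ConGen implementation: the dictionaries `db_a` and `db_b`, the property names, the numeric values, and the helper `masked_mse` are all hypothetical, and masking the loss is only one common way to handle missing labels.

```python
import numpy as np

# Hypothetical property tables from two databases, keyed by SMILES string.
# The numbers are toy placeholders, not real measurements.
db_a = {"CCO": {"prop_a": 1.0}, "CCOC": {"prop_a": 2.0}}
db_b = {"CCO": {"prop_b": 3.0}, "COC": {"prop_b": 4.0}}

properties = ["prop_a", "prop_b"]
smiles = sorted(set(db_a) | set(db_b))  # union of molecules across databases

# Label matrix with NaN wherever a property is unknown for a molecule.
labels = np.full((len(smiles), len(properties)), np.nan)
for table in (db_a, db_b):
    for s, props in table.items():
        for name, value in props.items():
            labels[smiles.index(s), properties.index(name)] = value

mask = ~np.isnan(labels)  # True where a label is available


def masked_mse(pred, target, mask):
    """Mean squared error computed over the available labels only."""
    diff = np.where(mask, pred - target, 0.0)  # missing entries contribute 0
    return float((diff ** 2).sum() / max(int(mask.sum()), 1))


# Example: score an all-zero prediction against the sparse labels.
loss = masked_mse(np.zeros_like(labels), labels, mask)
```

In this toy merge only one of the three molecules carries both labels, which is the situation a model restricted to fully labelled or fully unlabelled data cannot use directly.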


Article information

Article type: Paper
Submitted: 10 Apr 2023
Accepted: 14 Aug 2023
First published: 15 Aug 2023
This article is Open Access
Creative Commons BY license

Digital Discovery, 2023, 2, 1390-1403

J. P. Mailoa, X. Li, J. Qiu and S. Zhang, Digital Discovery, 2023, 2, 1390 DOI: 10.1039/D3DD00064H

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
