Bayesian semi-supervised learning for uncertainty-calibrated prediction of molecular properties and active learning

Abstract

Predicting the bioactivity and physical properties of small molecules is a central challenge in drug discovery. Deep learning is becoming the method of choice, but studies to date focus on mean accuracy as the main metric. However, to replace costly and mission-critical experiments with models, high mean accuracy is not enough. Outliers can derail a discovery campaign, so models need to reliably predict when they will fail, even when the training data is biased. Experiments are expensive, so models need to be data-efficient and to suggest informative training sets using active learning. We show that uncertainty quantification and active learning can be achieved by Bayesian semi-supervised graph convolutional neural networks. The Bayesian approach estimates uncertainty in a statistically principled way by sampling from the posterior distribution. Semi-supervised learning disentangles representation learning from regression, keeping uncertainty estimates accurate in the low-data limit and allowing the model to start active learning from a small initial pool of training data. Our study highlights the promise of Bayesian deep learning for chemistry.
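The idea of estimating uncertainty from posterior samples and using it to pick the next experiment can be illustrated with a minimal sketch. This is not the authors' graph convolutional model: it assumes a fixed feature representation and substitutes Bayesian linear regression, whose posterior is available in closed form. The function names and the parameters `alpha` and `noise_var` are hypothetical choices made for illustration only.

```python
import numpy as np

def posterior_samples(X, y, n_samples=200, alpha=1.0, noise_var=0.1, seed=0):
    """Sample weights from the exact posterior of Bayesian linear regression.

    Assumed model (illustrative, not from the article):
    prior w ~ N(0, I / alpha), likelihood y ~ N(X w, noise_var).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    precision = alpha * np.eye(d) + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    cov = (cov + cov.T) / 2.0  # remove tiny numerical asymmetry
    mean = cov @ X.T @ y / noise_var
    return rng.multivariate_normal(mean, cov, size=n_samples)

def predict_with_uncertainty(X, w_samples):
    """Return the predictive mean and standard deviation across posterior samples."""
    preds = X @ w_samples.T  # shape: (n_points, n_samples)
    return preds.mean(axis=1), preds.std(axis=1)

def select_most_uncertain(X_pool, w_samples):
    """Active-learning step: index of the pool point with the largest predictive std."""
    _, std = predict_with_uncertainty(X_pool, w_samples)
    return int(np.argmax(std))
```

In this sketch, a candidate far from the training data receives a wider spread of predictions across posterior samples, so an uncertainty-based acquisition rule would tend to select it for the next measurement.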

Publication details

The article was received on 03 Feb 2019, accepted on 04 Jul 2019 and first published on 10 Jul 2019


Article type: Edge Article
DOI: 10.1039/C9SC00616H
Chem. Sci., 2019, Accepted Manuscript
  • Open access: Creative Commons BY license
    All publication charges for this article have been paid for by the Royal Society of Chemistry

    Y. Zhang and A. Lee, Chem. Sci., 2019, Accepted Manuscript, DOI: 10.1039/C9SC00616H

    This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. Material from this article can be used in other publications provided that the correct acknowledgement is given with the reproduced material.

    Reproduced material should be attributed as follows:

    • For reproduction of material from NJC:
      [Original citation] - Published by The Royal Society of Chemistry (RSC) on behalf of the Centre National de la Recherche Scientifique (CNRS) and the RSC.
    • For reproduction of material from PCCP:
      [Original citation] - Published by the PCCP Owner Societies.
    • For reproduction of material from PPS:
      [Original citation] - Published by The Royal Society of Chemistry (RSC) on behalf of the European Society for Photobiology, the European Photochemistry Association, and RSC.
    • For reproduction of material from all other RSC journals:
      [Original citation] - Published by The Royal Society of Chemistry.

    Information about reproducing material from RSC articles with different licences is available on our Permission Requests page.
