Active Learning for Drug Discovery and Automated Data Curation
Active machine learning is an experimental design approach that puts machine learning models in the driver's seat of data acquisition and automated optimization. Since the approach was introduced to drug discovery approximately 15 years ago, a handful of impressive studies have revealed its potential to guide molecular discovery and optimization. Most recently, researchers have shown that active learning can also be applied to datasets retrospectively, enabling automated data curation so that powerful machine learning models can be trained on small datasets. This chapter reviews the key findings from these active learning studies and summarizes the remaining challenges and future opportunities of active learning as an adaptive learning technology in the context of drug discovery.