Deep Docking, Part 2: An Amplified DDU Platform for Ultra-Large Virtual Screening
Abstract
The exponential growth of accessible chemical space poses a significant computational challenge for structure-based virtual screening. Active-learning and machine-learning approaches, such as Deep Docking, have been introduced to speed up this process substantially; yet even these methods become computationally prohibitive as docking libraries expand to billions of entries and beyond. To address this challenge, we herein introduce the Deep Docking Ultra (DDU) approach, which integrates advanced acquisition functions with a pre-trained molecular large language model (MLLM). We demonstrate that this combination improves the accuracy of docking score emulation while significantly reducing its computational cost. Through 384 virtual screening experiments involving 12 proteins from all major target classes, we systematically benchmarked DDU performance to identify optimal configurations that reduce the required computations by up to 45-fold compared to the original Deep Docking method, and by up to 28,500-fold compared to brute-force docking, without compromising predictive accuracy. We further demonstrate that DDU can screen 10.1 billion ligands against the phosphoglycerate kinase 2 target in just 10 days using 50 Tesla V100 GPUs, yielding an overall docking enrichment factor of 12,000. The DDU code is available at https://github.com/diamondspark/DDU.