Pose Ensemble Graph Neural Networks to Improve Docking Performances
Abstract
Predicting the geometry and strength of small molecule–protein interactions remains a major challenge in drug discovery because of their complex and dynamic nature. Several machine learning (ML) methods have been proposed to complement and improve on physics-based tools such as molecular docking, usually by mapping three-dimensional features of poses to their closeness to experimental structures and/or to binding affinities. Here, we introduce Dockbox2 (DBX2), a novel approach that encodes ensembles of computational poses within a graph neural network framework via energy-based features derived from molecular docking. The model was jointly trained on the PDBbind dataset to predict binding pose likelihood as a node-level task and binding affinity as a graph-level task. In comprehensive retrospective docking and virtual screening experiments, it demonstrated significant performance improvements over state-of-the-art physics- and ML-based tools. Our results encourage further exploration of ML models that learn from conformational ensembles to accurately model small molecule–protein interactions and thermodynamics. The DBX2 code is available at https://github.com/jp43/DockBox2.
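To make the node-level/graph-level distinction concrete, the following is a minimal illustrative sketch and not the DBX2 implementation. It assumes that each node is one docked pose carrying a vector of energy-based scores, that two poses are connected when their pairwise RMSD falls below a cutoff, and that a single mean-aggregation step is followed by fixed linear readouts. The edge rule, the cutoff, the weights, and all function names (`build_edges`, `message_pass`, `predict`) are hypothetical choices made for illustration only.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def build_edges(rmsd, cutoff):
    """Connect poses i and j when their pairwise RMSD is below `cutoff`.

    Assumption: this RMSD-based edge rule is illustrative, not taken from the source.
    """
    n = len(rmsd)
    return {i: [j for j in range(n) if j != i and rmsd[i][j] < cutoff]
            for i in range(n)}


def message_pass(features, neighbors):
    """One mean-aggregation step: append the neighbors' mean features to each node."""
    updated = []
    for i, feats in enumerate(features):
        nbrs = neighbors[i]
        if nbrs:
            mean = [sum(features[j][k] for j in nbrs) / len(nbrs)
                    for k in range(len(feats))]
        else:
            mean = [0.0] * len(feats)  # isolated pose: no neighbor signal
        updated.append(feats + mean)
    return updated


def predict(features, rmsd, w_node, w_graph, cutoff=2.0):
    """Return per-pose likelihoods (node-level) and one affinity value (graph-level)."""
    hidden = message_pass(features, build_edges(rmsd, cutoff))
    # Node-level task: one likelihood-like score per pose.
    pose_scores = [sigmoid(sum(w * h for w, h in zip(w_node, row)))
                   for row in hidden]
    # Graph-level task: mean-pool node embeddings, then a linear readout.
    pooled = [sum(row[k] for row in hidden) / len(hidden)
              for k in range(len(hidden[0]))]
    affinity = sum(w * p for w, p in zip(w_graph, pooled))
    return pose_scores, affinity


# Toy ensemble: three poses, two hypothetical docking scores each (kcal/mol).
features = [[-7.2, -6.8], [-7.0, -6.5], [-4.1, -3.9]]
rmsd = [[0.0, 1.1, 6.3],
        [1.1, 0.0, 6.0],
        [6.3, 6.0, 0.0]]
w_node = [-0.1, -0.1, -0.05, -0.05]   # hypothetical, untrained weights
w_graph = [-0.5, -0.5, 0.0, 0.0]      # hypothetical, untrained weights

pose_scores, affinity = predict(features, rmsd, w_node, w_graph)
print(pose_scores, affinity)
```

In this toy ensemble, the two mutually close, low-energy poses reinforce each other through the aggregation step, whereas the isolated third pose receives no neighbor signal. A trained model would learn the readout weights jointly for both tasks rather than using fixed values.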