Modeling protein–ligand interactions for drug discovery in the era of deep learning
Abstract
Accurate modeling of protein–ligand interactions is a cornerstone of rational drug discovery, yet it remains challenging because of the complexity of molecular interactions and the limitations of conventional physics-based computational methods. Approaches such as molecular dynamics simulations, molecular docking, and free energy calculations provide theoretically rigorous insights grounded in physical principles, but their practical deployment is often constrained by high computational cost, limited scalability to large systems, and inconsistent predictive accuracy in real-world settings. Recent advances in deep learning (DL) have introduced powerful data-driven paradigms that complement and extend physics-based strategies across several dimensions, including (1) DL-augmented molecular dynamics, (2) DL-enhanced molecular docking and virtual screening, (3) end-to-end modeling of target proteins and protein–ligand complexes, (4) structure-based de novo drug design with deep generative models, and (5) sequence-based methods for interaction prediction and drug discovery. In this review, we provide a focused overview of these advances, highlight emerging strategies for integrating them with physics-based methods, examine ongoing challenges, and outline future directions. We argue that bridging physics-based and data-driven approaches not only improves predictive power and efficiency, but also enables exploration of the vast chemical and biological spaces central to modern drug discovery.