This Open Access Article is licensed under a Creative Commons Attribution 3.0 Unported Licence

Peptide–protein docking: from physics-based models to generative intelligence

Kai Ling a, Shu Li a, Zicong Zhang a, Woong-Hee Shin b and Daisuke Kihara *ac
aDepartment of Computer Science, Purdue University, West Lafayette, Indiana, 47907, USA. E-mail: dkihara@purdue.edu
bDepartment of Biomedical Informatics, Korea University College of Medicine, Seoul, 02708, Republic of Korea
cDepartment of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907, USA

Received 28th January 2026, Accepted 17th March 2026

First published on 18th March 2026


Abstract

Peptide–protein interactions (PepPIs) play a pivotal role in cellular signaling and regulation, and peptides represent a significant category of therapeutic agents. However, determining peptide–protein complex structures by experiment is costly and often challenging. Computational peptide–protein complex structure prediction, therefore, plays an important role in mapping binding modes and guiding design. Classical pipelines combine template-based, local, or global docking conformational search algorithms with physics-based or empirical scoring, but they often struggle with highly flexible peptides, induced fit at shallow interfaces, and non-canonical chemistries. In this review, we describe an ongoing shift from such conventional search-and-score workflows to deep learning-based pipelines. We categorize the modern methods into three modules: (i) approaches that predict likely peptide-binding regions on the protein surface and use these predictions to guide or filter docking models; (ii) AlphaFold-based protocols that use general structure prediction methods for peptide–protein co-folding and refinement; and (iii) deep generative models that sample peptide conformations given a target protein structure. We highlight that recent methods have substantially improved the accuracy and applicability of peptide–protein docking, while also identifying shared remaining challenges, including limited availability of training data and weak performance on long, disordered, or chemically modified peptides. We conclude by outlining directions for integrating richer biophysical constraints, better-curated peptide–protein datasets, and large-scale generative models to move toward robust, design-ready peptide docking.



Kai Ling

Kai Ling is a PhD candidate in Computer Science at Purdue University, where he previously received his MS degree in Computer Science. He obtained his BS degree in Computer Science from Huazhong University of Science and Technology. His research lies at the intersection of artificial intelligence and structural bioinformatics, focusing on deep learning approaches for protein and RNA complex prediction, peptide–protein docking, and multimer structure modeling. His work aims to advance data-driven and physics-aware computational frameworks for understanding biomolecular interactions.


Shu Li

Shu Li is a PhD student in the Department of Computer Science at Purdue University, where he is advised by Prof. Daisuke Kihara. His research operates at the intersection of machine learning and structural biology (AI4Science), with a specific focus on multimodal representation learning involving cryo-EM maps, biological sequences, and protein–ligand complexes. He holds a BSc in Information and Computational Science from Wuhan University and an MSc in Computer Science and Technology from Nanjing University.


Zicong Zhang

Zicong Zhang is a PhD student in Computer Science at Purdue University, where he conducts research in bioinformatics and biomolecular structure modeling. He received his MS and BS degrees in Computer Science from The Ohio State University. His research lies at the intersection of deep learning and structural biology, focusing on improving protein–peptide docking, macromolecular structure modeling from cryo-electron microscopy density maps, and protein folding pathway prediction.


Woong-Hee Shin

Woong-Hee Shin is an Associate Professor in Biomedical Informatics at Korea University College of Medicine. He obtained his BS and PhD in Chemistry from Seoul National University. After completing his doctoral studies, he joined the Department of Biological Sciences at Purdue University as a postdoctoral researcher. His research focuses on the interactions between proteins and small molecules, studied using physical chemistry and AI.


Daisuke Kihara

Daisuke Kihara is a full professor in the Department of Biological Sciences and the Department of Computer Science at Purdue University, West Lafayette, Indiana, USA. He received a BS degree from the University of Tokyo, Japan, and a PhD from Kyoto University, Japan. He has been working on algorithm and software development across a broad range of topics in protein bioinformatics, including molecular structure modeling from cryo-electron microscopy maps, protein–protein docking, protein tertiary structure prediction, protein function prediction, and computational drug design. He was elected as an AIMBE (The American Institute for Medical and Biological Engineering) Fellow. In 2025, he received a Faculty Research Award from the College of Science, Purdue University. Lab website: https://kiharalab.org.


Introduction

Protein–protein interactions (PPIs) regulate almost all cellular processes, including signal transduction, immune recognition, and metabolic control.1,2 Dysregulation of these interaction networks is associated with numerous diseases, such as cancer, metabolic disorders, and infectious diseases.3,4 Consequently, targeting PPIs has become an important strategy in modern drug discovery. However, conventional small-molecule drugs often struggle to modulate PPIs because many PPIs are large, flat, and lack well-defined binding pockets.4,5 Peptides may provide a promising alternative because their size and structural flexibility allow them to mimic natural interaction motifs and bind extended protein surfaces with high specificity.6,7 Many biological peptides function as signaling molecules or regulatory motifs that mediate transient peptide–protein interactions (PepPIs), which often control key cellular pathways.7–12 Understanding the structural and mechanistic principles of PepPIs is therefore essential both for elucidating cellular regulation and for developing peptide-based therapeutics.

In recent years, peptide therapeutics have emerged as an important drug modality for targeting PPIs and other biologically relevant protein interfaces.7,10 Approximately one hundred peptide drugs have been approved or are currently undergoing clinical trials worldwide.13 Compared with traditional small-molecule drugs, peptides offer the precise targeting required for protein complex interfaces, which are typically flat and difficult for small molecules to bind.14 The modular architecture of peptide therapeutics also allows for systematic engineering to optimize binding affinity and selectivity. In addition, peptides are comparatively easy to synthesize.

Another noteworthy application of PepPIs is masking peptides. A masking peptide is a short peptide engineered to sterically block a functional binding site of a therapeutic molecule, such as an antibody, which prevents the drug from interacting with its target while it circulates in normal tissues. The blocking peptide is removed upon reaching the target cell, enabling the therapeutic molecule to exert its activity at the target site.15,16 For instance, tumor-activated T-cell engager platforms, such as the TRACTr system developed by Janux Therapeutics, use peptide masks to block CD3-binding domains until protease-mediated activation occurs in the tumor microenvironment.17

To develop peptide drugs or investigate PepPIs, it is important to obtain the three-dimensional structure of the complex. As with monomer structure characterization, experimental techniques such as X-ray crystallography, nuclear magnetic resonance, and cryogenic electron microscopy have been extensively utilized.18 Nevertheless, experimental methods are resource-intensive and may struggle with transient complexes and membrane targets. Experiments alone are also unlikely to cover the combinatorial diversity created by length variants and non-canonical amino acids. Therefore, computational peptide–protein docking methods can complement experimental approaches by providing atomic-level models of PepPIs.

In a PepPI, the peptide is typically 2 to 50 residues long and binds to a larger protein. The problem can be cast as general protein–protein docking (PPD) by treating the peptide and the protein as the ligand and the receptor, respectively. The field of PPD has established robust methodological foundations, such as the fast Fourier transform as used in ZDOCK19 and ClusPro,20 and geometric hashing as used in LZerD,21 which focus on finding shape complementarity between the proteins. PPD programs have demonstrated a degree of success in predicting PepPIs in certain cases. For example, Bonvin and colleagues22 developed a protocol for cyclic peptide–protein complex structure prediction using their HADDOCK23 docking program. However, a limitation of these classical methods is that they treat the ligand as a rigid molecule to reduce computational complexity, which is often unsuitable for highly flexible peptides. To address this, classical peptide docking strategies integrate global rigid-body placement with conformational sampling and refinement, in which the peptide is handled as a flexible molecule.24–26

More recently, the field of structure prediction has been revolutionized by the advent of deep learning methods such as AlphaFold,27–29 which have demonstrated a capacity for modeling monomer structures with high accuracy. Since then, deep learning-based algorithms, such as AlphaFold-Multimer,27 have been rapidly applied to protein complex prediction. However, direct transfer of such protocols to peptide docking has proven insufficient. Peptides are characterized by their intrinsic flexibility and small size, properties that enable them to fold upon binding and to form intricate hydrogen-bonding and electrostatic networks. Moreover, the fact that many peptides remain disordered in solution severely complicates conformational sampling. Given this rapid methodological evolution and the growing importance of peptides in drug discovery, this review surveys the development of peptide docking approaches from early PPD-based methods to cutting-edge deep learning frameworks and outlines future directions for achieving accurate, efficient, and biologically meaningful predictions of PepPIs. The methods discussed in this article are summarized in Table 1.

Table 1 Peptide–protein docking tools discussed in this article
Method Year Comments Availability
Traditional docking methods
Rosetta FlexPepDock 2011 High-resolution refinement protocol; couples peptide ab initio folding with docking refinement; requires approximate binding site. https://flexpepdock.furmanlab.cs.huji.ac.il/
HADDOCK peptide docking 2013 Integrates conformational selection and induced fit. https://www.bonvinlab.org/software/bpg/peptides/
pepATTRACT 2015 Fully blind coarse-grained peptide docking with two-stage atomistic refinement. https://bioserv.rpbs.univ-paris-diderot.fr/services/pepATTRACT/
CABS-dock 2015 Fully flexible peptide docking without prior binding-site knowledge. Coarse-grained sampling plus all-atom rebuilding. https://biocomp.chem.uw.edu.pl/CABSdock
GalaxyPepDock 2015 Template-based protein–peptide docking using interaction similarity and energy-based optimization. https://galaxy.seoklab.org/cgi-bin/submit.cgi?type=PEPDOCK
MDockPeP 2016 All-atom blind global peptide docking from peptide sequence and receptor structure using knowledge-based scoring. https://zougrouptoolkit.missouri.edu/mdockpep/
PIPER-FlexPepDock 2017 Fragment-based global peptide docking with FFT sampling and high-resolution Rosetta refinement. https://piperfpd.furmanlab.cs.huji.ac.il/
HPEPDOCK 2018 Blind hierarchical peptide–protein docking using an ensemble of peptide conformations. https://huanglab.phys.hust.edu.cn/hpepdock/
AutoDock CrankPep (ADCP) 2019 Peptide-specific AutoDock engine. Folds and docks flexible peptides, including cyclic peptides, to a rigid receptor. https://ccsb.scripps.edu/adcp/
PatchMAN 2023 Surface patch-matching. Threads peptide sequences onto structural motifs matched to the receptor surface, followed by refinement. https://furmanlab.cs.huji.ac.il/patchman/
Site priors prediction methods
PepSite 2012 Predicts peptide-binding spots on protein surfaces using statistical preferences. Does not generate peptide–protein complex structures. https://pepsite2.russelllab.org
PepBind 2013 Sequence-based peptide-binding prediction framework that combines sequence and structure features. Does not generate peptide–protein docking structures. https://yanglab.qd.sdu.edu.cn/PepBind/
PeptiMap 2017 Identifies potential peptide-binding sites on receptor surfaces by solvent-based surface mapping. Does not generate peptide–protein docking structures. https://peptimap.cluspro.org
CAMP 2021 Sequence-based deep learning framework (CNN and self-attention) that jointly predicts peptide–protein interaction and peptide binding residues. https://github.com/twopin/CAMP
PepNN 2022 Parallel structure-based and sequence-based models using reciprocal attention and graph neural networks to predict peptide-binding residues. https://gitlab.com/oabdin/pepnn
PepCNN 2022 Deep learning predictor of peptide-binding residues that integrates sequence-based features and structural descriptors into a CNN. https://github.com/abelavit/PepCNN
PepBCL 2022 Deep learning protein–peptide prediction. Used as a screening stage: scores docked peptide–target complexes to prioritize high-affinity binders for specific targets. https://github.com/Ruheng-W/PepBCL
PepCA 2024 Sequence-based protein–peptide binding residue predictor built on protein language models and a cross-attention mechanism. https://github.com/cloudaner115/PepCA
TPepPro 2025 Transformer-based peptide-binding site predictor that uses protein sequence representations and attention to identify peptide-binding regions on proteins. https://github.com/wanglabhku/TPepPro
Pose reranking methods
InterPepRank 2021 Graph convolutional network that encodes each docked peptide–protein decoy as a residue-level interaction graph and learns to re-rank poses. https://bitbucket.org/isaakh94/interpeprank/src/master/
GraphPep 2025 Interaction-derived graph learning framework for scoring protein–peptide complexes. Uses graph neural networks/transformers to re-rank poses. https://huanglab.phys.hust.edu.cn/GraphPep/
AF-based methods
AF-Multimer w/Forced Sampling 2022 Peptide-specific protocol. Uses dropout/noise to generate diverse decoy ensembles for flexible peptides. N/A
AF-augmented (Tsaban et al.) 2022 Uses AF2 with a poly-glycine linker to model peptide–protein interactions. N/A
MHC specific AF (Motmaen et al.) 2023 Fine-tunes AF by adding a classifier head and jointly optimizing binding classification and structure prediction. Applied on peptide–MHC interactions. https://github.com/phbradley/alphafold_finetune
MHC-Fine 2024 Fine-tuned AF model trained exclusively on high-resolution MHC–peptide crystal structures. https://bitbucket.org/abc-group/mhc-fine/src/main/
DistPepFold 2025 Knowledge distillation method. Improves AFM by training teacher and student models. https://github.com/kiharalab/DistPepFold
Diffusion based methods
RAPiDock 2025 SE(3)-equivariant diffusion model. Uses bi-scale graphs and physical constraints for rapid, all-atom peptide docking. https://github.com/huifengzhao/RAPiDock
DiffPepDock 2025 Diffusion model trained on synthetic fragments as initial training. https://github.com/YuzheWangPKU/DiffPepBuilder


Conventional docking methods

Traditional peptide–protein docking evolved from geometric and energetic frameworks originally designed for protein–protein interactions, which were adapted to accommodate the greater flexibility and smaller size of peptides. In practice, these tools generally fall into three categories based on their scope:30 template-based methods, local refinement methods, and global docking methods. These three categories differ primarily in the amount of prior knowledge required. Template-based methods rely on previously determined structures of similar protein–peptide complexes, which are used as scaffolds for modeling. Local refinement methods assume partial knowledge of the target complex, such as a predefined binding site or known contact residues, to restrict the search space. In contrast, global docking methods perform peptide position and pose sampling over the entire receptor surface without prior specification of the binding region.

Template- and motif-based methods, such as GalaxyPepDock31 and PatchMAN,32 transfer interaction patterns or surface-bound peptide backbones from known structures to new targets. They achieve this either by identifying proteins with highly similar sequences or by matching local shapes on the receptor surface using large structural databases. GalaxyPepDock is highly accurate when structurally similar protein–peptide complexes are available in existing databases. Its main limitation is a strict dependence on the availability of suitable templates; consequently, it typically fails when predicting entirely novel binding modes or when targeting proteins with no known structurally similar references. PatchMAN, in contrast, does not require complete complex templates. Instead, it identifies small structural fragments from known proteins that are complementary to the receptor surface. While PatchMAN is capable of handling non-standard amino acids with post-translational modifications, a key limitation is that its initial search relies on rigid sampling. It may also struggle to place and refine peptides when a “closed” receptor pocket creates severe steric clashes with the peptide backbone.32

When the binding site is already defined, local refinement protocols are a standard choice. Methods such as Rosetta FlexPepDock,33 HADDOCK peptide docking,26 and even adapted small-molecule engines such as AutoDock Vina34 operate by restricting conformational sampling to a specified search box. Rosetta FlexPepDock is reported to generate accurate models by allowing full flexibility of the peptide and the receptor side chains.24,35 However, it cannot perform fully blind docking, it relies on an accurate starting position, and it keeps the receptor backbone largely rigid. HADDOCK offers a significant advantage when biochemical or biophysical experimental data are available, because such data are used as restraints.36 Conversely, its accuracy strongly depends on the availability of those restraints. AutoDock Vina is fast and uses an efficient empirical scoring function, but it cannot handle compounds with too many rotatable bonds, such as long peptides.37

Global or “blind” docking presents a different challenge: tools such as HPEPDOCK,38 PIPER-FlexPepDock,24 and CABS-dock25 navigate the entire receptor surface to identify the peptide binding site. To handle this large computational load, they typically rely on coarse-grained models or staged searches. MDockPeP39–41 is an efficient ab initio docking method that performs fully blind global docking using only the receptor structure and peptide sequence. It works in three steps: first, it creates multiple peptide backbone conformations; then, it docks each conformation independently on the whole receptor surface using a method based on AutoDock Vina; finally, it ranks the docked peptides using a scoring function specific to protein–peptide interactions. It combines template-based peptide modeling with global rigid sampling and local flexible refinement, making it faster than methods based on molecular dynamics. However, the receptor is treated as rigid, and accuracy decreases for long peptides. HPEPDOCK is a fast global docking method that represents peptide flexibility using multiple pre-generated conformations, avoiding costly sampling during docking. Because these conformations are generated without consideration of the receptor and are treated as rigid, the method is less accurate when binding requires large structural changes and for long peptides with a large conformational space. PIPER-FlexPepDock combines global docking with high-resolution refinement, enabling peptide flexibility and receptor side-chain adjustment. However, the method is computationally expensive, and the rigid-receptor assumption in the initial search limits performance when backbone rearrangement is required. CABS-dock uses a coarse-grained model for blind docking, enabling high flexibility of both partners without prior knowledge,23 but the simplified representation may reduce accuracy, particularly for interactions requiring detailed atomic contacts. IDP-LZerD tackled the docking of long disordered peptides by docking fragments of the peptide and assembling them into the full-length chain.42,43 Although the idea was novel, its accuracy for long peptides remains limited.

These classical approaches share a common architectural pipeline. The process begins by generating a large initial ensemble of peptide poses. While template-based methods derive backbone conformations from structurally similar complexes, local and global schemes typically employ rigid-body sampling, fragment assembly, or coarse-grained models to populate the receptor's surface with candidate orientations. Subsequently, scoring functions that evaluate coarse-grained features of the models, such as shape complementarity and physicochemical properties, are used to screen candidate conformations. Selected candidates often undergo a high-resolution refinement stage at all-atom resolution, allowing adjustments of the peptide backbone and limited flexibility of the receptor.

The ability of classical docking to recover near-native poses hinges largely on the efficacy of scoring and refinement. Primary scoring functions vary widely in their approaches: they range from knowledge-based statistical potentials (e.g., ITScorePeP in MDockPeP39) and hybrid energies (GalaxyPepDock31) to the coarse-grained interaction terms found in pepATTRACT44 or CABS-dock.25 At the other end of the spectrum lie physics-based functions, such as those used by Rosetta FlexPepDock33 and the AutoDock45 family, including ADCP,35,46 which evaluate interactions on pre-calculated affinity grids. All these functions face the difficult task of distinguishing native-like interfaces from a vast sea of decoys while remaining computationally efficient. To address the limitations of fast scoring, refinement stages apply high-resolution, all-atom energy functions to a narrowed candidate set. For instance, FlexPepDock33 employs intensive Monte Carlo moves and gradient minimization in an all-atom potential, whereas pepATTRACT44 and CABS-dock25 transition from simplified coarse-grained docking to high-resolution atomistic reconstruction. Earlier protein–protein docking frameworks, such as LZerD,21,47 which combine geometric shape descriptors with energy-based filtering, helped establish scoring and refinement ideas that modern peptide docking engines adapt to flexible ligands.

Classical geometric docking suffers from three main limitations. Combinatorial growth of the search space with peptide length and flexibility prevents exhaustive sampling, especially for long peptides. Rigid receptor backbones fail to capture loop motions or order–disorder transitions, making results highly dependent on the input conformer. Reliance on binding-site priors, such as experimental data, adds fragility, as inaccurate or missing site information produces models that cannot be reliably ranked. These challenges reflect the static-energy landscape assumption, which deep learning-based approaches aim to overcome.

Finally, we summarize the metrics used to evaluate the accuracy of docking models. Model evaluation commonly follows the criteria of the Critical Assessment of Predicted Interactions (CAPRI),48 the community-wide evaluation of protein docking. The CAPRI criteria evaluate models using three metrics: the root mean square deviation (RMSD) of the ligand (L-RMSD), the RMSD of interface atoms (iRMSD), and the fraction of native contacts (fnat). The DockQ49 score, which combines these three metrics into a single value, is also frequently used. Additional geometric and stereochemical evaluations, such as Ramachandran plot statistics,50 rotamer outlier analysis,51 MolProbity score,52 and buried surface area assessment,53 are also often used.
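To make the relationship between these metrics concrete, the following minimal sketch computes fnat from two contact sets and combines fnat, iRMSD, and L-RMSD into a DockQ value using the published DockQ formula (scaling constants 1.5 Å and 8.5 Å). The function names and the contact representation (pairs of residue indices) are illustrative assumptions, and the quality cutoffs are the standard DockQ cutoffs calibrated on protein–protein complexes, so they should be treated as a rough guide rather than as peptide-specific CAPRI classes.

```python
def fnat_from_contacts(native_contacts, model_contacts):
    """Fraction of native receptor-peptide contacts reproduced in the model.

    Contacts are given as iterables of (receptor_residue, peptide_residue)
    pairs; this representation is an assumption made for illustration.
    """
    native = set(native_contacts)
    if not native:
        raise ValueError("native contact set must not be empty")
    return len(native & set(model_contacts)) / len(native)


def dockq(fnat, irmsd, lrmsd):
    """DockQ score: mean of fnat and two scaled RMSD terms (RMSDs in angstroms)."""
    def scaled(rmsd, d):
        # Maps an RMSD in [0, inf) to a value in (0, 1]; 1 means a perfect match.
        return 1.0 / (1.0 + (rmsd / d) ** 2)

    return (fnat + scaled(irmsd, 1.5) + scaled(lrmsd, 8.5)) / 3.0


def dockq_class(score):
    """Standard DockQ quality bands (calibrated on protein-protein docking)."""
    if score >= 0.80:
        return "high"
    if score >= 0.49:
        return "medium"
    if score >= 0.23:
        return "acceptable"
    return "incorrect"
```

A model with fnat = 0.5, iRMSD = 1.5 Å, and L-RMSD = 8.5 Å yields three terms of 0.5 each, so `dockq(0.5, 1.5, 8.5)` evaluates to 0.5, which falls in the "medium" band.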

Recent evolution of peptide–protein docking methods

Peptide–protein docking has progressed substantially from its early approaches mentioned above. The evolution of method development is schematically summarized in Fig. 1. Deep learning began to be introduced into key components of this problem around 2012,54–57 with a marked increase in such studies after approximately 2020, coinciding with the introduction of AlphaFold228 into the field. In protein–peptide docking, two categories of auxiliary methods that complement complex structure modeling have been actively pursued: methods that predict peptide-binding sites on receptor proteins and methods that rank docking models. Since 2022, nearly all newly proposed approaches have incorporated deep learning components. Most recently, we have observed methods that use diffusion models, a generative deep learning approach.
Fig. 1 Timeline of peptide-docking methods from 2011 to 2025, grouped by conceptual class. Methods mentioned in this work are classified into five categories. Early work is dominated by traditional search-and-score docking engines (green). From 2021 onward, deep learning-based frameworks emerge, including AF-based models (yellow) and diffusion-based models (cyan). In parallel, two auxiliary components have been developed: site-prior (binding site) prediction methods (red) and learned pose re-ranking strategies (mint).

Peptide binding site prediction and docking pose re-ranking methods

There have been notable efforts to develop methods for predicting peptide binding sites. These methods predict where on the receptor surface a peptide is likely to bind, at residue or patch resolution. Predicted binding sites can be used in downstream docking, co-folding, or generative engines as masks, spatial constraints, or re-ranking signals.23,33,39,58–60 Early methods use hand-crafted physicochemical, geometric, or statistical features rather than learned representations. PepSite59 and PeptiMap60 identify peptide-binding patches on protein surfaces by detecting statistically meaningful residue–residue contact patterns or energetically favorable fragment placements. PepBind58 infers peptide-binding regions from sequence-derived features, including residue composition and evolutionary profiles, which are fed to a support vector machine, and combines these predictions with binding-site annotations transferred from structurally similar proteins in the BioLiP database.61

Deep learning-based methods began to appear around 2020. PepNN-Struct/Seq,62 PepCNN,63 and PepCA64 exploit receptor geometry, sequence context, and local physicochemical features to highlight peptide-binding hotspots. PepNN-Struct models the receptor as a residue-level graph derived from the protein structure and applies attention mechanisms to predict peptide-binding residues. PepCNN and PepCA primarily operate on residue-level representations derived from sequence and local physicochemical features. PepCNN employs convolutional neural networks on residue-wise feature maps combining evolutionary and structural descriptors, whereas PepCA uses protein language model embeddings together with co-attention mechanisms to infer residue-level peptide-binding sites.

There are also methods that predict peptide binding residues using sequence information. CAMP65 is a multi-level framework that takes sequence-derived feature profiles of both peptide and protein as input, predicting whether a given protein–peptide pair interacts and, additionally, identifying binding residues along the peptide sequence. TPepPro66 is a Transformer-based PepPI predictor that integrates local sequence representations of proteins with global structural context derived from protein contact maps, and outputs pair-level interaction likelihoods rather than explicit binding sites. PepBCL67 is built on the BERT language model. It takes only the sequences of a protein and a peptide as input and uses contrastive learning to predict peptide-binding residues on the protein.

In most computational pipelines, binding site prediction is not an endpoint; rather, it is a modular component that converts a blind global search into a constrained, quasi-local exploration. It also helps stabilize pose generation and ranking when experimental or template-derived site information is incomplete or noisy.
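As an illustration of how a predicted site can act as a filter on docking output, the sketch below keeps only poses whose receptor-side interface overlaps sufficiently with the predicted binding residues. It is a generic example rather than the procedure of any specific tool discussed here; the pose representation (a mapping from a pose name to the set of receptor residue indices it contacts), the function names, and the 0.5 overlap threshold are all assumptions.

```python
def interface_overlap(pose_contacts, predicted_site):
    """Fraction of the receptor residues contacted by a pose that lie
    inside the predicted binding site (0.0 if the pose has no contacts)."""
    pose_contacts = set(pose_contacts)
    if not pose_contacts:
        return 0.0
    return len(pose_contacts & set(predicted_site)) / len(pose_contacts)


def filter_poses(poses, predicted_site, min_overlap=0.5):
    """Keep poses whose overlap with the predicted site reaches min_overlap.

    poses: dict mapping pose name -> set of contacted receptor residue indices.
    Returns (name, overlap) tuples sorted by decreasing overlap.
    """
    scored = [(name, interface_overlap(contacts, predicted_site))
              for name, contacts in poses.items()]
    kept = [(name, score) for name, score in scored if score >= min_overlap]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

With a predicted site of residues {1, 2, 3, 4, 5}, a pose contacting {1, 2, 3, 4} is retained with overlap 1.0, whereas a pose contacting only {10, 11} is discarded. A softer alternative would be to keep all poses and use the overlap as one term in a combined ranking score.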

Peptide docking pose re-ranking is another important component in peptide docking. After generating an initial ensemble of peptide poses, a challenge is to reliably identify which decoys (generated models) are worth refining. Early work approached this by examining if predicted peptide poses overlap with a designated binding site. For example, decoys were re-ranked based on the overlap between predicted or restrained binding interfaces and the interfaces formed in docking models.23 Other approaches employed knowledge-based statistical potentials derived from known peptide–protein complexes (e.g., ITScorePeP in MDockPeP39). In addition, consensus- and clustering-based strategies prioritized decoys belonging to large, densely populated interface clusters under the assumption that near-native binding modes are sampled repeatedly, as implemented in peptide docking frameworks such as CABS-dock,25 pepATTRACT,44 and the PIPER–FlexPepDock24 pipeline.
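The clustering idea can be sketched as follows. This is a simplified greedy scheme written for illustration, not the clustering procedure of CABS-dock, pepATTRACT, or PIPER-FlexPepDock. It assumes that all poses share the receptor coordinate frame, so peptide RMSD is computed without superposition, and the 2.0 Å cutoff and function names are arbitrary assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) peptide atom coordinates.
    No superposition is applied because poses share the receptor frame."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def cluster_poses(poses, cutoff=2.0):
    """Greedy clustering: repeatedly pick the unassigned pose with the most
    unassigned neighbours within `cutoff` and remove that whole group.

    poses: dict mapping pose name -> list of (x, y, z) tuples.
    Returns a list of (center_name, sorted_member_names), largest cluster first.
    """
    names = sorted(poses)
    neighbours = {n: {m for m in names if rmsd(poses[n], poses[m]) <= cutoff}
                  for n in names}
    unassigned = set(names)
    clusters = []
    while unassigned:
        # Ties are broken alphabetically because candidates are sorted first.
        center = max(sorted(unassigned),
                     key=lambda n: len(neighbours[n] & unassigned))
        members = neighbours[center] & unassigned
        clusters.append((center, sorted(members)))
        unassigned -= members
    return clusters
```

The first cluster returned is the most populated one, and its center would be the top-ranked candidate under the assumption that near-native binding modes are sampled repeatedly.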

More recent methods learn re-ranking from decoy ensembles using deep learning. InterPepRank68 encodes each peptide–protein decoy as a residue-level graph with physical contacts as edges and trains a graph convolutional network to predict model quality, which substantially improves the selection of starting structures for FlexPepDock33 refinement. GraphPep69 takes a more interaction-centric view, representing decoys as graphs whose nodes correspond to protein–peptide contacts rather than individual residues, and learns to distinguish near-native from incorrect poses using a mixture of classical docking and AlphaFold-generated27–29 decoys. In both cases, supervision is provided by continuous or discretized quality labels such as DockQ49 docking score or CAPRI prediction accuracy classes,48 and models are typically optimized with regression or ranking losses.
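The difference between regression and ranking objectives can be made concrete with a small sketch. The two functions below are generic textbook losses, not the actual training objectives of InterPepRank or GraphPep; the names and the margin value are assumptions. The regression loss penalizes the error in the predicted quality of each decoy, whereas the pairwise ranking loss only penalizes pairs of decoys that are ordered incorrectly relative to their true quality labels (for example, DockQ values).

```python
def mse_loss(predicted, target):
    """Regression objective: mean squared error between predicted and true quality."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and equal in length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def pairwise_hinge_loss(predicted, target, margin=0.1):
    """Ranking objective: for every pair (i, j) where decoy i is truly better
    than decoy j, require predicted[i] to exceed predicted[j] by `margin`."""
    if len(predicted) != len(target):
        raise ValueError("inputs must be equal in length")
    total, n_pairs = 0.0, 0
    for i in range(len(predicted)):
        for j in range(len(predicted)):
            if target[i] > target[j]:
                total += max(0.0, margin - (predicted[i] - predicted[j]))
                n_pairs += 1
    return total / n_pairs if n_pairs else 0.0
```

A ranking loss is insensitive to the absolute scale of the predicted scores, which can be convenient when the goal is only to select the best decoys for refinement.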

AlphaFold-based peptide docking methods

After AlphaFold2 (AF2) was developed, Tsaban et al.70 extended the capabilities of AF2, which is fundamentally a protein monomer prediction framework, by introducing modified inputs that enable the modeling of protein–peptide complexes. This approach connects receptor sequence and peptide sequence with a poly-glycine linker. It was shown that without further training, AF2 is able to model protein–peptide complexes. Similarly, Motmaen et al.71 extended AF2 to jointly predict protein structure and major histocompatibility complex (MHC)-peptide-binding specificity, where MHC and peptide sequences were connected and used as input to AF2. However, caution is required when using AF2 for peptide docking, as peptide predictions by AF2 have been shown to be biased by the peptide's secondary structure.72
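The linker-based input construction can be illustrated with a short sketch. It builds a single-chain sequence by joining the receptor and peptide with a poly-glycine linker and records which index ranges belong to each part, so that the linker can be stripped from the predicted model afterwards. The function name, the returned dictionary layout, and the default linker length of 30 are assumptions for illustration; the linker length in particular should be taken from the original protocol.

```python
def build_linked_input(receptor_seq, peptide_seq, linker_len=30):
    """Join receptor and peptide into one chain with a poly-glycine linker.

    Returns the combined sequence and 0-based, half-open index spans for the
    receptor, linker, and peptide segments of that sequence.
    """
    receptor = receptor_seq.strip().upper()
    peptide = peptide_seq.strip().upper()
    if not receptor or not peptide:
        raise ValueError("receptor and peptide sequences must be non-empty")
    if linker_len < 1:
        raise ValueError("linker_len must be at least 1")

    linker = "G" * linker_len  # hypothetical default length, see lead-in
    receptor_end = len(receptor)
    linker_end = receptor_end + linker_len
    spans = {
        "receptor": (0, receptor_end),
        "linker": (receptor_end, linker_end),
        "peptide": (linker_end, linker_end + len(peptide)),
    }
    return receptor + linker + peptide, spans
```

For example, `build_linked_input("MKT", "AC", linker_len=4)` returns the sequence `"MKTGGGGAC"` with the peptide occupying indices 7 to 8. After prediction, residues in the `"linker"` span would be removed before the peptide pose is evaluated.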

Shortly after AF2, AlphaFold-Multimer (AFM)27 was developed for modeling protein complexes. Although AFM has demonstrated high accuracy in modeling general protein complexes, it has been shown that its accuracy in modeling protein–peptide complexes lags behind that for single-chain proteins and regular protein complexes.70,73–77 Johansson-Åkhe et al.74 reported that increasing the recycle number and generating more structures in AFM can lead to better modeling performance. The approaches by Tsaban et al.70 and Johansson-Åkhe et al.74 preserve the original AFM architecture, avoid additional training cost, and maintain broad applicability, but they are inherently limited by the capacity of AFM for peptide docking. MHC-Fine improved prediction accuracy for MHC–peptide complex structures by further training AFM with larger neural networks and adding protein–peptide interaction information to templates during training.78 Phospho-Tune improved structure modeling of interactions between phosphorylated peptides and proteins by further training AFM specifically on phosphopeptide–protein complexes, integrating phosphorylation information, such as the position of phosphorylated residues in the peptide sequence, during the training process.79

MHC-Fine and Phospho-Tune are examples of methods that target specific biological applications. A limitation of such methods is the size and diversity of the datasets available for these applications. The relatively small number of known peptide–protein complex structures constrains a comprehensive assessment of model robustness. Broader and more diverse benchmarks are needed to validate the stability and reliability of models fine-tuned for specific peptide–protein interactions, although this challenge is common to all types of current peptide-docking methods.

DistPepFold: improving peptide–protein docking using privileged knowledge distillation

DistPepFold73 improves protein–peptide docking by applying a privileged knowledge distillation approach. Fig. 2a shows the workflow of the algorithm. DistPepFold uses a teacher-student framework, in which two models, the teacher model and the student model, are trained. The teacher model is trained first with privileged knowledge, which in this case is the set of interacting residue pairs between the receptor and the peptide. With this critical information on interacting residues, the teacher model achieved high modeling accuracy. The idea of the teacher-student framework is to indirectly transfer the privileged knowledge to the student model through the training process. During the distillation process, the student model is instructed to mimic the behavior of the teacher model, or more concretely, to reproduce intermediate outputs of the teacher model, in addition to producing the correct complex model for the target. During the inference stage, only the student model is used for predicting structures. DistPepFold outperforms AFM and other existing methods on several metrics; in particular, it predicted more high-quality structures according to the peptide docking accuracy criteria used in the CAPRI docking competition.48 DistPepFold is available on GitHub at https://github.com/kiharalab/DistPepFold.git.
Fig. 2 Workflows of DistPepFold, RAPiDock, and DiffPepDock. The diagrams illustrate the workflows of the three methods. See text for explanation. (a) DistPepFold; (b) RAPiDock; (c) DiffPepDock.
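
The teacher-student training described above can be illustrated with a generic distillation objective. The sketch below is not the DistPepFold implementation: the mean-squared-error form of the mimicry term, the weighting factor `distill_weight`, and the flat feature lists are all assumptions chosen to show how a structure loss and a term that reproduces the teacher's intermediate outputs might be combined.

```python
from typing import Sequence


def mean_squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Average squared difference between two equally long feature vectors."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("feature vectors must be non-empty and equally long")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def student_objective(
    structure_loss: float,
    student_features: Sequence[float],
    teacher_features: Sequence[float],
    distill_weight: float = 1.0,
) -> float:
    """Structure loss plus a penalty for deviating from the teacher.

    teacher_features stands in for intermediate outputs of a teacher that
    was trained with privileged contact information (hypothetical layout).
    """
    mimicry = mean_squared_error(student_features, teacher_features)
    return structure_loss + distill_weight * mimicry
```

Because the privileged information enters only through the teacher's outputs during training, the student can be run at inference time without any knowledge of the true interacting residues.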

In Fig. 3, four examples of structure models predicted by DistPepFold are shown in comparison with AFM. The models shown here were generated by the student model, as the teacher model needs correct residue contact information, which is not available in an actual prediction scenario. Predicted receptor structures are not shown, as they were close to the reference structures in the PDB.80 For these examples, we report two metrics, the interface root-mean-square deviation (iRMSD) and the fraction of native contacts (Fnat). iRMSD quantifies the deviation of residue positions at the binding interface in the model relative to the reference structure in the PDB. Fnat measures the proportion of native residue–residue contacts that are correctly recovered in the model.
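
To make the Fnat definition concrete, the following sketch computes it from atomic coordinates. The data layout (a dictionary from residue identifier to a list of atom coordinates), the function names, and the 4.0 Å contact cutoff are assumptions for illustration; iRMSD additionally requires a structural superposition step, which is omitted here.

```python
import math
from typing import Dict, List, Set, Tuple

Atom = Tuple[float, float, float]
Residues = Dict[str, List[Atom]]


def interface_contacts(
    receptor: Residues, peptide: Residues, cutoff: float = 4.0
) -> Set[Tuple[str, str]]:
    """Residue pairs with any receptor-peptide atom pair within cutoff (Å)."""
    pairs = set()
    for r_id, r_atoms in receptor.items():
        for p_id, p_atoms in peptide.items():
            if any(math.dist(a, b) <= cutoff for a in r_atoms for b in p_atoms):
                pairs.add((r_id, p_id))
    return pairs


def fnat(native: Set[Tuple[str, str]], model: Set[Tuple[str, str]]) -> float:
    """Fraction of native residue-residue contacts recovered in the model."""
    if not native:
        raise ValueError("native contact set is empty")
    return len(native & model) / len(native)


# Toy coordinates (hypothetical): two receptor residues, two peptide residues.
receptor = {"R1": [(0.0, 0.0, 0.0)], "R2": [(10.0, 0.0, 0.0)]}
native_pep = {"P1": [(3.0, 0.0, 0.0)], "P2": [(10.0, 3.0, 0.0)]}
model_pep = {"P1": [(3.5, 0.0, 0.0)], "P2": [(10.0, 9.0, 0.0)]}

score = fnat(
    interface_contacts(receptor, native_pep),
    interface_contacts(receptor, model_pep),
)
```

In this toy case the model keeps the R1–P1 contact but loses R2–P2, so half of the native contacts are recovered.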

Fig. 3 Examples of predicted protein–peptide complex structures. AlphaFold-Multimer predictions (cyan) and DistPepFold predictions (red) are shown with the native peptide conformations (marine). The receptor structure is rendered in a grey cartoon. We did not show predicted receptor structures as they were close to the reference structure in the PDB database. (a) Procollagen-specific molecular chaperone complexed with Collagen model peptide 15-R8 (PDB ID: 4AU3). The complex consists of two receptors of 392 residues and three peptides of 20 residues. (b) Complex of the SHC SH2 domain and tyrosine-phosphorylated peptide (PDB ID: 1TCE). The protein–peptide complex consists of a receptor of 107 residues and a peptide of 20 residues. (c) Adaptor protein complex AP-2 with peptide from intersectin-1 (PDB ID: 3HS8). The protein–peptide complex consists of a receptor of 273 residues and a peptide of 12 residues. (d) ADP-ribosylation factor binding protein GGA1 with the 15-mer peptide fragment of p56 (PDB ID: 1OM9). The protein–peptide complex consists of a receptor of 154 residues and a peptide of 15 residues. For each method, interface RMSD (iRMSD, Å) and Fnat scores are reported. iRMSD quantifies the root-mean-square deviations of interface residues between predicted and native peptide conformations after receptor superposition. Fnat represents the fraction of native interfacial contacts recovered in the structure model, which ranges from 0 to 1 (the best). According to CAPRI criteria for protein–peptide docking, iRMSD cutoffs of >2.0/2.0–1.0/1.0–0.5/<0.5 Å and Fnat cutoffs of <0.2/0.2–0.5/0.5–0.8/≥0.8 correspond to incorrect/acceptable/medium/high-quality models, respectively.
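
The cutoffs quoted in the caption can be written as a small lookup. This is only a sketch: the function names are hypothetical, the assignment of values that fall exactly on a shared boundary to the better tier is an assumption, and combining the two metrics by taking the worse tier is a simplification rather than the official CAPRI procedure.

```python
TIERS = ["incorrect", "acceptable", "medium", "high"]


def irmsd_tier(irmsd: float) -> int:
    """Tier index from iRMSD (Å): <0.5 high, up to 1.0 medium, up to 2.0 acceptable."""
    if irmsd < 0.5:
        return 3
    if irmsd <= 1.0:
        return 2
    if irmsd <= 2.0:
        return 1
    return 0


def fnat_tier(fnat: float) -> int:
    """Tier index from Fnat: >=0.8 high, >=0.5 medium, >=0.2 acceptable."""
    if fnat >= 0.8:
        return 3
    if fnat >= 0.5:
        return 2
    if fnat >= 0.2:
        return 1
    return 0


def quality_class(irmsd: float, fnat: float) -> str:
    """Assumed combination rule: a model is only as good as its worse metric."""
    return TIERS[min(irmsd_tier(irmsd), fnat_tier(fnat))]
```

Under this simplified rule, a model with an iRMSD of 0.8 Å and an Fnat of 0.85 would be labeled medium, because the iRMSD tier limits the overall class.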

In the first example (Fig. 3a), AFM failed to identify the peptide binding site, causing the peptide to detach from the receptor structure. In contrast, DistPepFold successfully identified a near-native pose of the bound peptide, resulting in a substantial improvement in iRMSD from 19.5 Å to 4.8 Å. In the second example (Fig. 3b), AFM predicted the correct peptide binding site, but the peptide conformation was incorrect. DistPepFold correctly identified the binding site and produced a peptide conformation close to the reference PDB structure. In the third example (Fig. 3c), AFM docked the peptide at an entirely incorrect location (cyan), whereas DistPepFold accurately identified the correct binding site. In the final example (Fig. 3d), AFM predicted a slightly shifted peptide binding site, resulting in the loss of many native contacts. By contrast, DistPepFold predicted the correct binding pose at the correct site. This improvement is reflected in the iRMSD values (4.8 Å for AFM and 1.3 Å for DistPepFold).

Generative peptide docking using the diffusion model

Recent advances in peptide docking utilize diffusion models, adapting deep generative strategies that have revolutionized fields such as computer vision and structural biology. The diffusion model has been used as the core of the structure generation engine of recent protein and DNA/RNA structure prediction methods. A representative method in this category is AlphaFold 3,29 and there are other similar methods with a diffusion model, including RoseTTAFold All-Atom,81 HelixFold-Multimer,82 Boltz-1,83 Chai-1,84 and ProteniX.85

The diffusion model was also introduced in protein–peptide docking to overcome the physical complexity and data scarcity inherent in peptide docking. In RAPiDock86 (Fig. 2b), instead of moving atoms individually, which can lead to distorted structures, a step-by-step diffusion process is employed that manipulates the peptide only through physically valid degrees of freedom: global rotation, translation, and torsion angles. To achieve computational efficiency without sacrificing resolution, the model represents peptides using a graph with two types of nodes, coarse-grained residues for rapid global placement and fine-grained atoms for precise side-chain modeling. Furthermore, it uses Clebsch–Gordan tensor products87 to provide a rigorous mathematical framework for enforcing SE(3)-equivariance. This ensures that the model inherently respects rotational symmetry, allowing it to achieve high-throughput screening speeds while maintaining atomic-level precision. Another method, DiffPepDock88 (Fig. 2c), tackles the scarcity of data on peptide–protein complex structures. This method uses a two-stage training strategy. First, the network is pretrained using a dataset of protein-fragment complexes extracted from protein–protein complexes, which mimic protein–peptide interactions. Extracted fragment-protein complexes are approximately four times more abundant than available peptide–protein complexes. In the second stage of training, the network is fine-tuned on real peptide–protein complexes.
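
The benefit of restricting updates to rigid-body degrees of freedom can be seen in a small sketch. The code below is not taken from RAPiDock; the function names and toy coordinates are assumptions. It applies a rotation about the peptide centroid (using Rodrigues' rotation formula) followed by a translation, and such an update leaves all internal distances unchanged, unlike independent per-atom moves. Torsion-angle updates, the third type of move mentioned above, are omitted for brevity.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def _rotate(v: Vec, k: Vec, angle: float) -> Vec:
    """Rodrigues' formula: rotate v by angle (radians) about unit axis k."""
    c, s = math.cos(angle), math.sin(angle)
    cross = (
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    )
    dot = sum(ki * vi for ki, vi in zip(k, v))
    return tuple(v[i] * c + cross[i] * s + k[i] * dot * (1.0 - c) for i in range(3))


def rigid_update(
    coords: Sequence[Vec], axis: Vec, angle: float, shift: Vec
) -> List[Vec]:
    """Rotate coords about their centroid, then translate by shift."""
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    k = tuple(a / norm for a in axis)
    n = len(coords)
    center = tuple(sum(p[i] for p in coords) / n for i in range(3))
    moved = []
    for p in coords:
        local = tuple(p[i] - center[i] for i in range(3))
        rotated = _rotate(local, k, angle)
        moved.append(tuple(rotated[i] + center[i] + shift[i] for i in range(3)))
    return moved


# Toy three-atom "peptide" (hypothetical coordinates in Å).
peptide = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
moved = rigid_update(peptide, axis=(0.0, 0.0, 1.0), angle=math.pi / 2, shift=(1.0, 2.0, 3.0))
```

Comparing `math.dist(peptide[0], peptide[1])` with `math.dist(moved[0], moved[1])` confirms that bond-like distances are preserved by the update.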

Collectively, representative diffusion-based peptide docking methods and general-purpose structure predictors exhibit complementary strengths and characteristic limitations. Specialized docking frameworks such as RAPiDock and DiffPepDock provide high sampling efficiency and strong docking accuracy, particularly for short to medium-length peptides binding to relatively rigid receptors. Their physically constrained diffusion processes reduce structural distortions and improve pose generation. However, they may underperform when substantial receptor conformational changes occur, or when binding modes are insufficiently represented in the training data. In contrast, general-purpose structure predictors, including AlphaFold 3, ProteniX, RoseTTAFold All-Atom, HelixFold-Multimer, Boltz-1, Boltz-2, and Chai-1, provide broad applicability across diverse biomolecular complexes but are not specifically optimized for peptide docking. Moreover, these models rely heavily on multiple sequence alignments (MSAs) and evolutionary information to infer inter-chain contacts, and their performance may decrease for short peptides or orphan sequences with limited MSA depth. They may also misidentify binding sites or mis-rank alternative poses when evolutionary signals are weak.

Finally, it is worth mentioning that diffusion models are also used in the de novo design of peptides that bind at a target pocket. Methods in this category design the binding peptide sequence instead of docking a known peptide to a receptor structure. The pipeline of RFdiffusion89 and ProteinMPNN90 is a representative workflow, where the former designs a protein or peptide structure that meets the provided condition (e.g., binding to a pocket in a receptor protein) and the latter designs amino acid sequences that fold into the designed structure. This paradigm is rapidly expanding with other binder design frameworks such as ODesign,91 BoltzGen,92 and BindCraft.93 ODesign and BoltzGen generalize this process by performing simultaneous co-design of both sequence and structure within unified all-atom generative models, enabling multimodal and flexible conditioning. BindCraft leverages AlphaFold-based hallucination and gradient-based optimization to iteratively refine binder sequences and interfaces, emphasizing one-shot functional binder generation rather than diffusion-based backbone sampling. While the resulting protein–peptide complex is structurally indistinguishable from a docked pose, these methods fundamentally operate by solving the inverse folding problem through generative design, rather than by stochastically sampling poses of a pre-existing sequence.

Challenges and future directions of the field

Peptide–protein docking methods have evolved markedly over the past decades. Early approaches largely adapted rigid-body protein–protein docking frameworks, extending them with coarse-grained sampling, fragment-based assembly, and local refinement to accommodate peptide flexibility. These classical methods established practical baselines but were limited by the size of the conformational search space they could explore and the accuracy of heuristic scoring functions. The introduction of deep learning enabled data-driven modeling of peptide–protein interactions, improving binding-site identification and pose selection. More recently, generative models, including diffusion-based approaches, have advanced peptide docking by learning conditional distributions over peptide–protein complex conformations rather than relying on exhaustive search-and-score sampling. Recent methods report substantial improvements in modeling accuracy over earlier methods. For example, DiffPepDock,88 a diffusion-based peptide docking method, reported that the mean ligand RMSD (RMSD of a predicted ligand pose relative to the native when the receptor structure is superimposed) decreased from ∼11 Å for classical docking methods to ∼4.5 Å with DiffPepDock. Similarly, RAPiDock86 achieves over 50% success for high-quality predictions under CAPRI criteria among top-100 ranked models, compared with ∼20–30% for classical docking baselines.

Despite substantial progress, peptide–protein docking methods remain less accurate than single-chain protein structure prediction, reflecting several methodological limitations. A major challenge is the rapid growth of conformational space with peptide length and flexibility, which prevents exhaustive sampling of long or macrocyclic peptides within a realistic computational time. Deep learning-based approaches improve accuracy but depend heavily on patterns learned from existing structural datasets, performing well on common interface types but generalizing poorly to long, disordered, or chemically modified peptides underrepresented in training data. Many end-to-end models treat docking primarily as a structure prediction task rather than an explicit binding process, limiting their ability to model alternative binding modes or conformational ensembles. As a result, different methods succeed under specific assumptions but fail outside their intended regimes, highlighting the need for approaches that balance sampling, physical realism, and reliable scoring. A further unresolved issue is the reliability of confidence estimation.94,95 While many modern pipelines provide confidence scores or ranking metrics, these scores are often poorly calibrated with respect to true docking accuracy, particularly for chemically diverse peptides or alternative binding modes. Improving the interpretability and calibration of confidence estimates remains essential for translating peptide docking predictions into practical decision-making.

Looking forward, peptide–protein docking is likely to continue making incremental progress through the adoption of increasingly advanced deep learning techniques emerging from the machine learning community. However, achieving fundamental improvements will require the construction of new, high-quality datasets. In particular, the collection of experimentally determined peptide–receptor complex structures and associated energetic measurements, as a community-wide effort, would provide a critical foundation for next-generation methods. Expanding training and benchmark datasets to better represent long, cyclic, and chemically modified peptides will also be essential for improving performance beyond the current set of curated cases. In parallel, integrating physically motivated energetic models with deep learning may help compensate for limited training data and enable more principled generalization. Together, these advances will be necessary to move peptide–protein docking from incremental gains toward robust, broadly applicable predictive modeling.

Author contributions

DK conceived the study. KL, SL, and ZZ drafted the manuscript. WHS and DK critically edited it. All the authors read and approved the manuscript.

Conflicts of interest

There are no conflicts to declare.

Data availability

This article is a review, and no new data were generated.

Acknowledgements

This work was partly supported by the National Institutes of Health (R35GM158267, R21AI187928) and the National Science Foundation (IIS2211598, DBI2146026, and DBI2422620).

Notes and references

  1. M. R. Arkin and J. A. Wells, Small-molecule inhibitors of protein-protein interactions: progressing towards the dream, Nat. Rev. Drug Discovery, 2004, 3, 301–317,  DOI:10.1038/nrd1343.
  2. D. E. Scott, A. R. Bayly, C. Abell and J. Skidmore, Small molecules, big targets: drug discovery faces the protein-protein interaction challenge, Nat. Rev. Drug Discovery, 2016, 15, 533–550,  DOI:10.1038/nrd.2016.29.
  3. M. Skwarczynska and C. Ottmann, Protein-protein interactions as drug targets, Future Med. Chem., 2015, 7, 2195–2219 CrossRef CAS PubMed.
  4. J. A. Wells and C. L. McClendon, Reaching for high-hanging fruit in drug discovery at protein-protein interfaces, Nature, 2007, 450, 1001–1009 CrossRef CAS PubMed.
  5. M. R. Arkin, Y. Tang and J. A. Wells, Small-molecule inhibitors of protein-protein interactions: progressing toward the reality, Chem. Biol., 2014, 21, 1102–1114 CrossRef CAS PubMed.
  6. J. L. Lau and M. K. Dunn, Therapeutic peptides: Historical perspectives, current development trends, and future directions, Bioorg. Med. Chem., 2018, 26, 2700–2707 CrossRef CAS PubMed.
  7. K. Fosgerau and T. Hoffmann, Peptide therapeutics: current status and future directions, Drug Discovery Today, 2015, 20, 122–128 CrossRef CAS PubMed.
  8. E. Petsalaki and R. B. Russell, Peptide-mediated interactions in biological systems: new discoveries and applications, Curr. Opin. Biotechnol, 2008, 19, 344–350 CrossRef CAS PubMed.
  9. J. J. Ward, J. S. Sodhi, L. J. McGuffin, B. F. Buxton and D. T. Jones, Prediction and functional analysis of native disorder in proteins from the three kingdoms of life, J. Mol. Biol., 2004, 337, 635–645 CrossRef CAS PubMed.
  10. J. L. Lau and M. K. Dunn, Therapeutic peptides: Historical perspectives, current development trends, and future directions, Bioorg. Med. Chem., 2018, 26, 2700–2707 CrossRef CAS PubMed.
  11. A. S. Gokhale and S. Satyanarayanajois, Peptides and peptidomimetics as immunomodulators, Immunotherapy, 2014, 6, 755–774 CrossRef CAS PubMed.
  12. J. Vagner, H. Qu and V. J. Hruby, Peptidomimetics, a synthetic tool of drug discovery, Curr. Opin. Chem. Biol., 2008, 12, 292–296 CrossRef CAS PubMed.
  13. W. Xiao, W. Jiang, Z. Chen, Y. Huang, J. Mao, W. Zheng, Y. Hu and J. Shi, Advance in peptide-based drug development: delivery platforms, therapeutics and vaccines, Signal Transduction Targeted Ther., 2025, 10, 74 CrossRef CAS PubMed.
  14. W.-H. Shin, K. Kumazawa, K. Imai, T. Hirokawa and D. Kihara, Current Challenges and Opportunities in Designing Protein–Protein Interaction Targeted Drugs, Adv. Appl. Bioinf. Chem., 2020, 13, 11–25 Search PubMed.
  15. R. Lucchi, J. Bentanachs and B. Oller-Salvia, The Masking Game: Design of Activatable Antibodies and Mimetics for Selective Therapeutics and Cell Control, ACS Cent. Sci., 2021, 7, 724–738 CrossRef CAS PubMed.
  16. P. Silva-Pinheiro and M. Minczuk, The potential of mitochondrial genome engineering, Nat. Rev. Genet., 2022, 23, 199–214 CrossRef CAS PubMed.
  17. F. Cattaruzza, A. Nazeer, M. To, M. Hammond, C. Koski, L. Y. Liu, V. Pete Yeung, D. A. Rennerfeldt, A. Henkensiefken, M. Fox, S. Lam, K. M. Morrissey, Z. Lange, V. N. Podust, M. K. Derynck, B. A. Irving and V. Schellenberger, Precision-activated T-cell engagers targeting HER2 or EGFR and CD3 mitigate on-target, off-tumor toxicity for immunotherapy in solid tumors, Nature cancer, 2023, 4, 485–501 CrossRef CAS PubMed.
  18. N. London, D. Movshovitz-Attias and O. Schueler-Furman, The structural basis of peptide-protein binding strategies, Structure, 2010, 18, 188–199 CrossRef CAS PubMed.
  19. R. Chen, L. Li and Z. Weng, ZDOCK: an initial-stage protein-docking algorithm, Proteins, 2003, 52, 80–87 CrossRef CAS PubMed.
  20. S. R. Comeau, D. W. Gatchell, S. Vajda and C. J. Camacho, ClusPro: a fully automated algorithm for protein-protein docking, Nucleic Acids Res., 2004, 32, W96–99 CrossRef CAS PubMed.
  21. V. Venkatraman, Y. D. Yang, L. Sael and D. Kihara, Protein-protein docking using region-based 3D Zernike descriptors, BMC Bioinf., 2009, 10, 407 CrossRef PubMed.
  22. V. Charitou, S. C. van Keulen and A. M. J. J. Bonvin, Cyclization and Docking Protocol for Cyclic Peptide–Protein Modeling Using HADDOCK2.4, J. Chem. Theory Comput., 2022, 18, 4027–4040 CrossRef CAS PubMed.
  23. C. Dominguez, R. Boelens and A. M. Bonvin, HADDOCK: a protein–protein docking approach based on biochemical or biophysical information, J. Am. Chem. Soc., 2003, 125, 1731–1737 CrossRef CAS PubMed.
  24. N. Alam, O. Goldstein, B. Xia, K. A. Porter, D. Kozakov and O. Schueler-Furman, High-resolution global peptide-protein docking using fragments-based PIPER-FlexPepDock, PLoS Comput. Biol., 2017, 13, e1005905 CrossRef PubMed.
  25. M. Kurcinski, M. Jamroz, M. Blaszczyk, A. Kolinski and S. Kmiecik, CABS-dock web server for the flexible docking of peptides to proteins without prior knowledge of the binding site, Nucleic Acids Res., 2015, 43, W419–W424 CrossRef CAS PubMed.
  26. M. Trellet, A. S. J. Melquiond and A. M. J. J. Bonvin, A Unified Conformational Selection and Induced Fit Approach to Protein-Peptide Docking, PLoS One, 2013, 8, e58769 CrossRef CAS PubMed.
  27. R. Evans, M. O’Neill, A. Pritzel, N. Antropova, A. Senior, T. Green, A. Žídek, R. Bates, S. Blackwell, J. Yim, O. Ronneberger, S. Bodenstein, M. Zielinski, A. Bridgland, A. Potapenko, A. Cowie, K. Tunyasuvunakool, R. Jain, E. Clancy, P. Kohli, J. Jumper and D. Hassabis, Protein complex prediction with AlphaFold-Multimer, bioRxiv, preprint, 2022 DOI:10.1101/2021.10.04.463034.
  28. J. Jumper, R. Evans, A. Pritzel, T. Green, M. Figurnov, O. Ronneberger, K. Tunyasuvunakool, R. Bates, A. Zidek, A. Potapenko, A. Bridgland, C. Meyer, S. A. A. Kohl, A. J. Ballard, A. Cowie, B. Romera-Paredes, S. Nikolov, R. Jain, J. Adler, T. Back, S. Petersen, D. Reiman, E. Clancy, M. Zielinski, M. Steinegger, M. Pacholska, T. Berghammer, S. Bodenstein, D. Silver, O. Vinyals, A. W. Senior, K. Kavukcuoglu, P. Kohli and D. Hassabis, Highly accurate protein structure prediction with AlphaFold, Nature, 2021, 596, 583–589 CrossRef CAS PubMed.
  29. J. Abramson, J. Adler, J. Dunger, R. Evans, T. Green, A. Pritzel, O. Ronneberger, L. Willmore, A. J. Ballard, J. Bambrick, S. W. Bodenstein, D. A. Evans, C.-C. Hung, M. O’Neill, D. Reiman, K. Tunyasuvunakool, Z. Wu, A. Žemgulytė, E. Arvaniti, C. Beattie, O. Bertolli, A. Bridgland, A. Cherepanov, M. Congreve, A. I. Cowen-Rivers, A. Cowie, M. Figurnov, F. B. Fuchs, H. Gladman, R. Jain, Y. A. Khan, C. M. R. Low, K. Perlin, A. Potapenko, P. Savy, S. Singh, A. Stecula, A. Thillaisundaram, C. Tong, S. Yakneen, E. D. Zhong, M. Zielinski, A. Žídek, V. Bapst, P. Kohli, M. Jaderberg, D. Hassabis and J. M. Jumper, Accurate structure prediction of biomolecular interactions with AlphaFold 3, Nature, 2024, 630, 493–500 CrossRef CAS PubMed.
  30. M. Ciemny, M. Kurcinski, K. Kamel, A. Kolinski, N. Alam, O. Schueler-Furman and S. Kmiecik, Protein–peptide docking: opportunities and challenges, Drug Discovery Today, 2018, 23, 1530–1537 CrossRef CAS PubMed.
  31. H. Lee, L. Heo, M. S. Lee and C. Seok, GalaxyPepDock: a protein-peptide docking tool based on interaction similarity and energy optimization, Nucleic Acids Res., 2015, 43, W431–435 CrossRef CAS PubMed.
  32. A. Khramushin, Z. Ben-Aharon, T. Tsaban, J. K. Varga, O. Avraham and O. Schueler-Furman, Matching protein surface structural patches for high-resolution blind peptide docking, Proc. Natl. Acad. Sci. U. S. A., 2022, 119, e2121153119 CrossRef CAS PubMed.
  33. B. Raveh, N. London, L. Zimmerman and O. Schueler-Furman, Rosetta FlexPepDock ab-initio: Simultaneous Folding, Docking and Refinement of Peptides onto Their Receptors, PLoS One, 2011, 6, e18934 CrossRef CAS PubMed.
  34. O. Trott and A. J. Olson, AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading, J. Comput. Chem., 2010, 31, 455–461 CrossRef CAS PubMed.
  35. Y. Zhang and M. F. Sanner, AutoDock CrankPep: combining folding and docking to predict protein-peptide complexes, Bioinformatics, 2019, 35, 5121–5127 CrossRef CAS PubMed.
  36. F. G. Martins, H. A. Santos and S. F. Sousa, A Review of Current Computational Tools for Peptide–Protein Docking, J. Comput. Chem., 2026, 47, e70328 CrossRef CAS PubMed.
  37. A. Mondal, L. Chang and A. Perez, Modelling peptide–protein complexes: docking, simulations and machine learning, QRB Discovery, 2022, 3, e17 CrossRef PubMed.
  38. P. Zhou, B. Jin, H. Li and S. Y. Huang, HPEPDOCK: a web server for blind peptide-protein docking based on a hierarchical algorithm, Nucleic Acids Res., 2018, 46, W443–W450 CrossRef CAS PubMed.
  39. C. Yan, X. Xu and X. Zou, Fully Blind Docking at the Atomic Level for Protein-Peptide Complex Structure Prediction, Structure, 2016, 24, 1842–1853 CrossRef CAS PubMed.
  40. X. Xu and X. Zou, Predicting Protein–Peptide Complex Structures by Accounting for Peptide Flexibility and the Physicochemical Environment, J. Chem. Inf. Model., 2021, 62, 27–39 CrossRef PubMed.
  41. X. Xu, C. Yan and X. Zou, MDockPeP: An ab-initio protein-peptide docking server, J. Comput. Chem., 2018, 39, 2409–2413 CrossRef CAS PubMed.
  42. L. X. Peterson, A. Roy, C. Christoffer, G. Terashi and D. Kihara, Modeling disordered protein interactions from biophysical principles, PLoS Comput. Biol., 2017, 13, e1005485 CrossRef PubMed.
  43. C. Christoffer and D. Kihara, IDP-LZerD: Software for Modeling Disordered Protein Interactions, Methods Mol. Biol., 2020, 2165, 231–244 CrossRef CAS PubMed.
  44. C. E. M. Schindler, S. J. de Vries and M. Zacharias, Fully Blind Peptide-Protein Docking with pepATTRACT, Structure, 2015, 23, 1507–1515 CrossRef CAS PubMed.
  45. G. M. Morris, R. Huey, W. Lindstrom, M. F. Sanner, R. K. Belew, D. S. Goodsell and A. J. Olson, AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility, J. Comput. Chem., 2009, 30, 2785–2791 CrossRef CAS PubMed.
  46. Y. Zhang and M. F. Sanner, Docking Flexible Cyclic Peptides with AutoDock CrankPep, J. Chem. Theory Comput., 2019, 15, 5161–5168 CrossRef CAS PubMed.
  47. C. Christoffer, S. Chen, V. Bharadwaj, T. Aderinwale, V. Kumar, M. Hormati and D. Kihara, LZerD webserver for pairwise and multiple protein-protein docking, Nucleic Acids Res., 2021, 49, W359–W365 CrossRef CAS PubMed.
  48. M. F. Lensink, N. Nadzirin, S. Velankar and S. J. Wodak, Modeling protein-protein, protein-peptide, and protein-oligosaccharide complexes: CAPRI 7th edition, Proteins, 2020, 88, 916–938 CrossRef CAS PubMed.
  49. S. Basu and B. Wallner, DockQ: A Quality Measure for Protein-Protein Docking Models, PLoS One, 2016, 11, e0161879 CrossRef PubMed.
  50. S. C. Lovell, I. W. Davis, W. B. Arendall, P. I. W. de Bakker, J. M. Word, M. G. Prisant, J. S. Richardson and D. C. Richardson, Structure validation by Cα geometry: ϕ,ψ and Cβ deviation, Proteins: Struct., Funct., Bioinf., 2003, 50, 437–450 CrossRef CAS PubMed.
  51. M. V. Shapovalov and R. L. Dunbrack, A Smoothed Backbone-Dependent Rotamer Library for Proteins Derived from Adaptive Kernel Density Estimates and Regressions, Structure, 2011, 19, 844–858 CrossRef CAS PubMed.
  52. V. B. Chen, W. B. Arendall, 3rd, J. J. Headd, D. A. Keedy, R. M. Immormino, G. J. Kapral, L. W. Murray, J. S. Richardson and D. C. Richardson, MolProbity: all-atom structure validation for macromolecular crystallography, Acta Crystallogr., Sect. D: Biol. Crystallogr., 2010, 66, 12–21 CrossRef CAS PubMed.
  53. B. Lee and F. M. Richards, The interpretation of protein structures: Estimation of static accessibility, J. Mol. Biol., 1971, 55, 379–400 CrossRef CAS PubMed.
  54. S. Wang, S. Sun, Z. Li, R. Zhang and J. Xu, Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model, PLoS Comput. Biol., 2017, 13, e1005324 CrossRef PubMed.
  55. J. Eickholt and J. Cheng, Predicting protein residue–residue contacts using deep networks and boosting, Bioinformatics, 2012, 28, 3066–3072 CrossRef CAS PubMed.
  56. W. Ding, J. Xie, D. Dai, H. Zhang, H. Xie and W. Zhang, CNNcon: Improved Protein Contact Maps Prediction Using Cascaded Neural Networks, PLoS One, 2013, 8, e61533 CrossRef CAS PubMed.
  57. S. Wang, J. Peng, J. Ma and J. Xu, Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields, Sci. Rep., 2016, 6, 18962 CrossRef CAS PubMed.
  58. A. A. Das, O. P. Sharma, M. S. Kumar, R. Krishna and P. P. Mathur, PepBind: a comprehensive database and computational tool for analysis of protein-peptide interactions, Genomics, Proteomics Bioinf., 2013, 11, 241–246 CrossRef PubMed.
  59. L. G. Trabuco, S. Lise, E. Petsalaki and R. B. Russell, PepSite: prediction of peptide-binding sites from protein surfaces, Nucleic Acids Res., 2012, 40, W423–W427 Search PubMed.
  60. T. Bohnuud, G. Jones, O. Schueler-Furman and D. Kozakov, Detection of Peptide-Binding Sites on Protein Surfaces Using the Peptimap Server, Methods Mol. Biol., 2017, 1561, 11–20 CrossRef CAS PubMed.
  61. J. Yang, A. Roy and Y. Zhang, BioLiP: a semi-manually curated database for biologically relevant ligand–protein interactions, Nucleic Acids Res., 2013, 41, D1096–1103 CrossRef CAS PubMed.
  62. O. Abdin, S. Nim, H. Wen and P. M. Kim, PepNN: a deep attention model for the identification of peptide binding sites, Commun. Biol., 2022, 5, 503 CrossRef CAS PubMed.
  63. S. Zhang and X. Li, Pep-CNN: An improved convolutional neural network for predicting therapeutic peptides, Chemom. Intell. Lab. Syst., 2022, 221 Search PubMed.
  64. J. Huang, W. Li, B. Xiao, C. Zhao, H. Zheng, Y. Li and J. Wang, PepCA: Unveiling protein-peptide interaction sites with a multi-input neural network model, iScience, 2024, 27, 110850 CrossRef CAS PubMed.
  65. Y. Lei, S. Li, Z. Liu, F. Wan, T. Tian, S. Li, D. Zhao and J. Zeng, A deep-learning framework for multi-level peptide–protein interaction prediction, Nat. Commun., 2021, 12, 5465 CrossRef CAS PubMed.
  66. X. Jin, Z. Chen, D. Yu, Q. Jiang, Z. Chen, B. Yan, J. Qin, Y. Liu and J. Wang, TPepPro: a deep learning model for predicting peptide–protein interactions, Bioinformatics, 2024, 41 Search PubMed.
  67. R. Wang, J. Jin, Q. Zou, K. Nakai and L. Wei, Predicting protein–peptide binding residues via interpretable deep learning, Bioinformatics, 2022, 38, 3351–3360 CrossRef CAS PubMed.
  68. I. Johansson-Åkhe, C. Mirabello and B. Wallner, InterPepRank: assessment of docked peptide conformations by a deep graph network, Front. Bioinf., 2021, 1, 763102 CrossRef PubMed.
  69. H. Tao, X. Wang and S.-Y. Huang, An interaction-derived graph learning framework for scoring protein–peptide complexes, Nat. Mach. Intell., 2025, 7, 1–12 CrossRef.
  70. T. Tsaban, J. K. Varga, O. Avraham, Z. Ben-Aharon, A. Khramushin and O. Schueler-Furman, Harnessing protein folding neural networks for peptide–protein docking, Nat. Commun., 2022, 13, 176 CrossRef CAS PubMed.
  71. A. Motmaen, J. Dauparas, M. Baek, M. H. Abedi, D. Baker, P. Bradley, A. Motmaen, J. Dauparas, M. Baek, M. H. Abedi, D. Baker and P. Bradley, Peptide-binding specificity prediction using fine-tuned protein structure prediction networks, Proc. Natl. Acad. Sci. U. S. A., 2023, 120, e2216697120 CrossRef CAS PubMed.
  72. E. F. McDonald, T. Jones, L. Plate, J. Meiler and A. Gulsevin, Benchmarking AlphaFold2 on peptide structure prediction, Structure, 2023, 31, 111–119.
  73. Z. Zhang, J. Verburgt, Y. Kagaya, C. Christoffer and D. Kihara, Learning with Privileged Knowledge Distillation for Improved Peptide–Protein Docking, ACS Omega, 2025, 10, 26684–26693.
  74. I. Johansson-Åkhe and B. Wallner, Improving peptide-protein docking with AlphaFold-Multimer using forced sampling, Front. Bioinf., 2022, 2, 959160.
  75. J. Verburgt, Z. Zhang and D. Kihara, Multi-level analysis of intrinsically disordered protein docking methods, Methods, 2022, 204, 55–63.
  76. S. Shanker and M. F. Sanner, Predicting Protein–Peptide Interactions: Benchmarking Deep Learning Techniques and a Comparison with Focused Docking, J. Chem. Inf. Model., 2023, 63, 3158–3170.
  77. R. Yin, B. Y. Feng, A. Varshney and B. G. Pierce, Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants, Protein Sci., 2022, 31, e4379.
  78. E. Glukhov, D. Kalitin, D. Stepanenko, Y. Zhu, T. Nguyen, G. Jones, C. Simmerling, J. C. Mitchell, S. Vajda, K. A. Dill, D. Padhorny and D. Kozakov, MHC-Fine: Fine-tuned AlphaFold for Precise MHC-Peptide Complex Prediction, bioRxiv, preprint, 2023 DOI:10.1101/2023.11.29.569310.
  79. E. Glukhov, V. Averkava, S. Kotelnikov, D. Stepanenko, T. Nguyen, J. C. Mitchell, C. Simmerling, S. Vajda, A. Emili, D. Padhorny and D. Kozakov, Phospho-Tune: Enhanced Structural Modeling of Phosphorylated Protein Interactions, bioRxiv, preprint, 2024 DOI:10.1101/2024.02.29.582580.
  80. G.-J. Bekker, C. Nagao, M. Shirota, T. Nakamura, T. Katayama, D. Kihara, K. Kinoshita and G. Kurisu, Protein Data Bank Japan: Computational Resources for Analysis of Protein Structures, J. Mol. Biol., 2025, 437, 169013.
  81. R. Krishna, J. Wang, W. Ahern, P. Sturmfels, P. Venkatesh, I. Kalvet, G. R. Lee, F. S. Morey-Burrows, I. Anishchenko, I. R. Humphreys, R. McHugh, D. Vafeados, X. Li, G. A. Sutherland, A. Hitchcock, C. N. Hunter, A. Kang, E. Brackenbrough, A. K. Bera, M. Baek, F. DiMaio and D. Baker, Generalized biomolecular modeling and design with RoseTTAFold All-Atom, Science, 2024, 384, eadl2528.
  82. J. Gao, J. Hu, L. Liu, Y. Xue, K. Zhu, X. Zhang and X. Fang, Precise Antigen-Antibody Structure Predictions Enhance Antibody Development with HelixFold-Multimer, arXiv, preprint, arXiv.2412.09826, 2024 DOI:10.48550/arXiv.2412.09826.
  83. J. Wohlwend, G. Corso, S. Passaro, M. Reveiz, K. Leidal, W. Swiderski, T. Portnoi, I. Chinn, J. Silterra, T. Jaakkola and R. Barzilay, Boltz-1: Democratizing Biomolecular Interaction Modeling, bioRxiv, preprint, 2024 DOI:10.1101/2024.11.19.624167.
  84. Chai Discovery Team, J. Boitreaud, J. Dent, M. McPartlon, J. Meier, V. Reis, A. Rogozhnikov and K. Wu, Chai-1: Decoding the molecular interactions of life, bioRxiv, preprint, 2024 DOI:10.1101/2024.10.10.615955.
  85. ByteDance AML AI4Science Team, X. Chen, Y. Zhang, C. Lu, W. Ma, J. Guan, C. Gong, J. Yang, H. Zhang, K. Zhang, S. Wu, K. Zhou, Y. Yang, Z. Liu, L. Wang, B. Shi, S. Shi and W. Xiao, Protenix - Advancing Structure Prediction Through a Comprehensive AlphaFold3 Reproduction, bioRxiv, preprint, 2025 DOI:10.1101/2025.01.08.631967.
  86. H. Zhao, O. Zhang, D. Jiang, Z. Wu, H. Du, X. Wang, Y. Zhao, Y. Huang, J. Ge, T. Hou and Y. Kang, Protein–peptide docking with a rational and accurate diffusion generative model, Nat. Mach. Intell., 2025, 7, 1308–1321.
  87. R. Kondor, Z. Lin and S. Trivedi, Clebsch-Gordan Nets: a Fully Fourier Space Spherical Convolutional Neural Network, Adv. Neural Inf. Process. Syst., 2018, 31.
  88. Y. Wang, F. Wang, L. Feng, C. Zhang and L. Lai, DiffPepDock: Efficient protein-peptide docking and binder screening via SE(3)-equivariant diffusion, Protein Sci., 2025, 34, e70338.
  89. J. L. Watson, D. Juergens, N. R. Bennett, B. L. Trippe, J. Yim, H. E. Eisenach, W. Ahern, A. J. Borst, R. J. Ragotte, L. F. Milles, B. I. M. Wicky, N. Hanikel, S. J. Pellock, A. Courbet, W. Sheffler, J. Wang, P. Venkatesh, I. Sappington, S. V. Torres, A. Lauko, V. De Bortoli, E. Mathieu, S. Ovchinnikov, R. Barzilay, T. S. Jaakkola, F. DiMaio, M. Baek and D. Baker, De novo design of protein structure and function with RFdiffusion, Nature, 2023, 620, 1089–1100.
  90. J. Dauparas, I. Anishchenko, N. Bennett, H. Bai, R. J. Ragotte, L. F. Milles, B. I. M. Wicky, A. Courbet, R. J. de Haas, N. Bethel, P. J. Y. Leung, T. F. Huddy, S. Pellock, D. Tischer, F. Chan, B. Koepnick, H. Nguyen, A. Kang, B. Sankaran, A. K. Bera, N. P. King and D. Baker, Robust deep learning-based protein sequence design using ProteinMPNN, Science, 2022, 378, 49–56.
  91. O. Zhang, X. Zhang, H. Lin, C. Tan, Q. Wang, Y. Mo, Q. Feng, G. Du, Y. Yu, Z. Jin, Z. You, P. Lin, Y. Zhang, Y. Tao, S. Chen, J. X. Chen, C. Hua, W. Zhao, R. Ma, Y. Xia, K. Ying, J. Li, Y. Zeng, L. Lang, P. Pan, H. Cao, Z. Song, B. Qiang, J. Wang, P. Ji, L. Bai, J. Zhang, C.-Y. Hsieh, P. A. Heng, S. Sun, T. Hou and S. Zheng, ODesign: A World Model for Biomolecular Interaction Design, arXiv, preprint, arXiv.2510.22304, 2025 DOI:10.48550/arXiv.2510.22304.
  92. H. Stark, F. Faltings, M. Choi, Y. Xie, E. Hur, T. O’Donnell, A. Bushuiev, T. Uçar, S. Passaro, W. Mao, M. Reveiz, R. Bushuiev, T. Pluskal, J. Sivic, K. Kreis, A. Vahdat, S. Ray, J. T. Goldstein, A. Savinov, J. A. Hambalek, A. Gupta, D. A. Taquiri-Diaz, Y. Zhang, A. K. Hatstat, A. Arada, N. H. Kim, E. Tackie-Yarboi, D. Boselli, L. Schnaider, C. C. Liu, G.-W. Li, D. Hnisz, D. M. Sabatini, W. F. DeGrado, J. Wohlwend, G. Corso, R. Barzilay and T. Jaakkola, BoltzGen: Toward Universal Binder Design, bioRxiv, preprint, 2025 DOI:10.1101/2025.11.20.689494.
  93. M. Pacesa, L. Nickel, C. Schellhaas, J. Schmidt, E. Pyatova, L. Kissling, P. Barendse, J. Choudhury, S. Kapoor, A. Alcaraz-Serna, Y. Cho, K. H. Ghamary, L. Vinue, B. J. Yachnin, A. M. Wollacott, S. Buckley, A. H. Westphal, S. Lindhoud, S. Georgeon, C. A. Goverde, G. N. Hatzopoulos, P. Gonczy, Y. D. Muller, G. Schwank, D. C. Swarts, A. J. Vecchio, B. L. Schneider, S. Ovchinnikov and B. E. Correia, One-shot design of functional protein binders with BindCraft, Nature, 2025, 646, 483–492.
  94. X. Dai, R. Wang and Y. Zhang, Topological deep learning for enhancing peptide-protein complex prediction, Commun. Chem., 2025, 8, 347.
  95. I. Johansson-Åkhe and B. Wallner, Benchmarking Peptide-Protein Docking and Interaction Prediction with AlphaFold-Multimer, bioRxiv, preprint, 2021 DOI:10.1101/2021.11.16.468810.

This journal is © The Royal Society of Chemistry 2026