PpF: a density functional fine-tuned for noncovalent interactions of protein and peptide residues
Abstract
The essence of protein–peptide interactions lies in the noncovalent interactions between amino-acid pairs; accurately calculating the interaction energies of these pairs is crucial for modelling tertiary structure and protein–peptide interactions. However, available functionals for density functional theory have insufficient accuracy for many noncovalent interactions. To address this challenge, we developed a new functional called PpF by taking the broadly trained CF22D model as a foundation model and fine-tuning it on more specific data for interactions of capped amino acids. The PpF model is specifically designed for noncovalent interactions of pairs of amino acids. First, based on the LEADS-PEP dataset for assessing peptide docking performance, we constructed the amino-acid pair structures dataset (called AAPS260) containing 260 pairs of noncovalently interacting capped amino acids, from which we selected 36 representatives. We performed DLPNO-CCSD(T) calculations on these pairs to determine reference energies for a new training dataset with 12 interaction energies (AAIE12) and a new testing dataset with 24 interaction energies (AAIE24). We used an iterative supervised training strategy to optimize parameters for an exchange–correlation functional with a damped dispersion term; the loss function involves 89 previously defined datasets augmented by the new training dataset (AAIE12), whose weight is determined by a performance-triggered procedure. This produces the PpF functional. We find that the PpF functional outperforms other functionals on the training set (AAIE12), the test set (AAIE24), and the Side Chain Atlas of Interactions (SCAI) dataset. It also performs very well on the JSCH, GMTKN55, and MGCDB84_NC databases. The PpF functional is then used to establish the Amino-Acid Interaction Energy benchmark dataset, which is called AAIE260.
This work produces a new density functional, a structural dataset of pairs of capped amino acids, a benchmark dataset of the interaction energies of these pairs, and a reliable computational method for exploring protein–peptide binding mechanisms.
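
As an illustration of the quantities discussed above, the following minimal Python sketch computes a supermolecular interaction energy for an amino-acid pair and the mean unsigned error of a set of predicted interaction energies against reference values. It is not taken from the paper: the function names are hypothetical, the energies are assumed to be total electronic energies in hartree, and no counterpoise or other basis-set correction is applied, since the abstract does not specify one.

```python
# Illustrative sketch only; names and conventions are assumptions, not the authors' code.

HARTREE_TO_KCAL_PER_MOL = 627.509474  # standard conversion factor


def interaction_energy(e_dimer, e_monomer_a, e_monomer_b):
    """Supermolecular interaction energy in kcal/mol.

    All inputs are total energies in hartree. A negative result
    means the pair is bound relative to the separated monomers.
    """
    return (e_dimer - e_monomer_a - e_monomer_b) * HARTREE_TO_KCAL_PER_MOL


def mean_unsigned_error(predicted, reference):
    """Mean unsigned error between predicted and reference energies."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)
```

In a benchmark of this kind, `reference` would hold the high-level (for example DLPNO-CCSD(T)) interaction energies and `predicted` the values from the functional being assessed, with one entry per amino-acid pair.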
