Alice B. Nongonierma and Richard J. FitzGerald*
Department of Life Sciences and Food for Health Ireland (FHI), University of Limerick, Limerick, Ireland. E-mail: dick.fitzgerald@ul.ie; Fax: +353-61331490; Tel: +353-61202598
First published on 1st August 2016
The generation of bioactive peptides (BAPs) from dietary proteins has been widely studied. One of the main limitations of a broader application of BAPs in functional foods may arise from their low potency. Therefore, the search for more potent structures is crucial. Quantitative structure–activity relationship (QSAR) has been widely applied in drug discovery and some examples may also be found in the study of BAPs. The aim of this review was to assess the efficiency of QSAR for the discovery of novel and potent BAPs, derived from food protein sources. A wide range of bioactive properties including antioxidant, antimicrobial, angiotensin converting enzyme (ACE), renin and dipeptidyl peptidase IV (DPP-IV) inhibition as well as bitter peptides has been investigated with QSAR. Some studies have identified structural requirements for specific bioactivities, which generally confirmed findings from earlier studies carried out on those BAPs. However, discrepancies are found across analyses, possibly due to the quality of the peptide datasets as well as the descriptors used to build QSAR models. It appears to date that only a limited number of QSAR studies conducted with BAPs have subsequently carried out confirmatory studies and evaluated promising peptide sequences in vivo. This suggests that more research is needed in order to advance knowledge in the area of BAP discovery using QSAR.
An increasing number of peptide sequences identified in various food protein hydrolysates have been reported in the literature over the past few years. The technology used to identify these sequences within food protein hydrolysates or their fractions has greatly evolved. Earlier studies have mainly relied on immunoreactive techniques (e.g., enzyme-linked immunosorbent assay – ELISA) or Edman degradation of the N-terminal amino acids linked to peptide sequencers to identify positive peptide candidates.5–7 However, advances in mass spectrometric (MS) analysis have allowed greater capability in the identification of peptides within complex mixtures and also identification of a large number of peptide sequences in an automated manner, for reviews, see: ref. 8–11. Nevertheless, challenges such as the reliable identification of short peptide sequences (<5 amino acid residues) still limit a more comprehensive understanding of dietary BAP structures.12–14
Overall, an understanding of the structural requirements for peptides having specific bioactivities has increased. This has been made possible by (1) novel strategies to release and/or isolate BAPs from food protein hydrolysates, (2) the increasing number of BAP sequences which have been identified, (3) access to a higher number of bioinformatic, peptidomic and proteomic tools as well as (4) structure–activity relationship studies.15,16 Conventionally, structure–activity relationships for BAPs have been developed using empirical knowledge based on previously known BAP sequences. Recently, the structure–function relationship of peptides has been reported for a wide range of bioactive properties including mineral binding, angiotensin converting enzyme (ACE) and renin inhibition, antithrombotic, antidiabetic, antimicrobial, immunomodulatory, antioxidant and opioid activities.17
Quantitative structure–activity relationship (QSAR) modelling is a well-accepted approach for the study of active molecules, which is extensively used in drug discovery.18 QSAR has been developed to elucidate novel drug molecules which display higher activity, are produced at lower cost or cause fewer side effects. Furthermore, several examples of QSAR are found for the study of food protein-derived BAPs, for reviews, see: ref. 15 and 19. QSAR studies have been applied to bioactive properties such as ACE inhibition, antimicrobial, anticancer and antioxidant activities, for reviews, see: ref. 20–22. Additional studies are also found where QSAR approaches have been employed to try to understand the link between peptide structure and bitterness.23–25
A summary of the structural features of ACE inhibitory peptides, based on the QSAR outcomes, has been compiled in a recent review by Iwaniak, et al.15 However, to our knowledge, studies summarizing the structural requirements for other BAPs, as determined by QSAR analysis, are rare. A better understanding of peptide structural requirements for bioactivity may lead to the discovery of novel peptide sequences with enhanced bioactivities.26 Therefore, the aim of this review was to assess existing QSAR approaches in order to identify common peptide motifs in BAPs which may be used to design novel dietary protein-derived BAP sequences. The search period covered by the review was from 1990 to date. QSAR studies were classified by bioactive properties, i.e., antioxidant, antimicrobial, ACE, renin and dipeptidyl peptidase IV (DPP-IV) inhibitory as well as bitter peptides.
Fig. 1 Schematic of the general approach used in quantitative structure–activity relationship (QSAR) methodologies for the study of bioactive peptide (BAP) sequences (adapted from Hellberg et al.31 and Iwaniak et al.15).
The peptide library used to construct the QSAR model contains quantitative biological outputs, e.g., half maximal inhibitory concentration (IC50) values for enzyme inhibition, minimal inhibitory concentration (MIC) for microbial strains, half maximal effective concentrations (EC50) or scavenging activities for antioxidant peptides, etc.24,26,30,31 For comparative purposes, it is recommended that the bioactivity output should be obtained under similar experimental conditions.18,27,32 The peptide database may be restricted to sequences which originate from a single protein26 or a group of proteins found within certain species (e.g., Homo sapiens, Bos taurus, etc.).33 In addition to the peptide origin, the dataset can also be made of peptides having a defined amino acid length or incorporate peptides having various amino acid lengths.24 In other instances, QSAR strategies have focused on peptide analogs of a so-called “lead peptide”. The lead peptide may be used to design specific analogs which are then employed to construct minimum analogue peptide sets (MAPS).31,34 The sequence of the peptide candidates to be included in the peptide library and subsequently tested to generate biological activity data may be defined using a full or fractional factorial design. The MAPS are designed in such a way as to incorporate the minimum number of peptides in the dataset while covering a wide range of amino acid physicochemical properties and while simultaneously varying their positions within the peptide sequences. The number of peptide analogs generated depends on the number of amino acid positions varied within the peptide sequence.34 The inclusion of peptide analogs in datasets has been described, for example, for derivatives of lactoferricin (LFcin), a peptide with antimicrobial properties.35,36
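As a rough illustration of how the size of an analog set grows with the number of varied positions, the sketch below enumerates a full factorial design with Python's `itertools.product`. The lead sequence, the varied positions and the candidate residues are hypothetical placeholders and are not taken from any of the cited studies; a fractional factorial design would retain only a structured subset of these analogs.

```python
from itertools import product

def full_factorial_analogs(lead, choices):
    """Enumerate every analog of `lead` for the given substitutions.

    lead    -- lead peptide as a one-letter-code string
    choices -- dict mapping a 1-based position to the residues allowed there
    """
    positions = sorted(choices)
    analogs = []
    for combo in product(*(choices[p] for p in positions)):
        residues = list(lead)
        for pos, aa in zip(positions, combo):
            residues[pos - 1] = aa  # positions are 1-based, as in peptide numbering
        analogs.append("".join(residues))
    return analogs

# Hypothetical hexapeptide lead with two levels at each of three varied positions.
lead = "AKWLPG"
choices = {1: "AR", 3: "WF", 5: "PG"}
analogs = full_factorial_analogs(lead, choices)
print(len(analogs))  # 2 * 2 * 2 = 8 analogs
```

With k varied positions and two levels per position, the full design contains 2^k analogs, which is why reduced designs become attractive as k increases.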
It has been stated that selection/design of the peptide set is the most crucial step in conducting successful QSAR studies.31 Once the peptide library has been compiled, the peptides are generally divided into a “training set” and a “test set”. This generally consists of randomly excluding a certain number of peptides from the peptide library. These excluded peptides then form the test set, which is subsequently used for validation of the QSAR model. The peptides within the test set may also be chosen in order to cover a wide range of structural characteristics. For example, this may be achieved using a statistical molecular design approach which is based on a fractional design methodology.27
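The random exclusion step can be sketched as below. This is a minimal illustration using only the Python standard library; the function name, the 20% hold-out fraction and the example sequences are assumptions made for the sketch, and no activity data are attached to the peptides.

```python
import random

def split_library(peptides, test_fraction=0.2, seed=0):
    """Randomly hold out a fraction of the peptide library as a test set."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(peptides)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]   # (training set, test set)

# Hypothetical library of dipeptide sequences.
library = ["VW", "IW", "LW", "AW", "FW", "WW", "YW", "GP", "AP", "KP"]
training_set, test_set = split_library(library, test_fraction=0.2)
print(len(training_set), len(test_set))  # 8 2
```

A design-based selection of the test set, as mentioned above, would replace the shuffle with a rule that spreads the held-out peptides across the descriptor space.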
Amino acid (one-letter code) | 3-z scale (ref. 34): z1 | 3-z scale: z2 | 3-z scale: z3 | v-scale (ref. 39): v1 | v-scale: v2 | v-scale: v3 | 5-z scale (ref. 37): z1 | 5-z scale: z2 | 5-z scale: z3 | 5-z scale: z4 | 5-z scale: z5 |
---|---|---|---|---|---|---|---|---|---|---|---|
A | 0.07 | −1.73 | 0.09 | 0.05702 | 0.007187 | 0.42 | 0.24 | −2.32 | 0.60 | −0.14 | 1.30 |
R | 2.88 | 2.52 | −3.44 | 0.58946 | 0.043587 | −1.37 | 3.52 | 2.50 | −3.50 | 1.99 | −0.17 |
N | 0.71 | −0.97 | 4.13 | 0.22972 | 0.005392 | −0.82 | 3.05 | 1.62 | 1.04 | −1.15 | 1.61 |
D | −1.39 | 2.32 | 0.01 | 0.21051 | −0.02382 | −1.05 | 3.98 | 0.93 | 1.93 | −2.46 | 0.75 |
C | 0.92 | −2.09 | −1.4 | 0.14907 | −0.03661 | 1.34 | 0.84 | −1.67 | 3.71 | 0.18 | −2.65 |
Q | 3.64 | 1.13 | 2.36 | 0.34861 | 0.049211 | −0.30 | 1.75 | 0.50 | −1.44 | −1.34 | 0.66 |
E | 3.08 | 0.39 | −0.07 | 0.32837 | 0.006802 | −0.87 | 3.11 | 0.26 | −0.11 | −3.04 | −0.25 |
G | 2.23 | −5.36 | 0.30 | 0.00279 | 0.179052 | 0.00 | 2.05 | −4.06 | 0.36 | −0.82 | −0.38 |
H | 2.41 | 1.74 | 1.11 | 0.37694 | −0.01069 | 0.18 | 2.47 | 1.95 | 0.26 | 3.90 | 0.09 |
I | −4.44 | −1.68 | −1.03 | 0.37671 | 0.021631 | 2.46 | −3.89 | −1.73 | −1.71 | −0.84 | 0.26 |
L | −4.19 | −1.03 | −0.98 | 0.37876 | 0.051672 | 2.32 | −4.28 | −1.30 | −1.49 | −0.72 | 0.84 |
K | 2.84 | 1.41 | −3.14 | 0.45363 | 0.017708 | −1.35 | 2.29 | 0.89 | −2.49 | 1.49 | 0.31 |
M | −2.49 | −0.27 | −0.41 | 0.38872 | 0.002683 | 1.68 | −2.85 | −0.22 | 0.47 | 1.94 | −0.98 |
F | −4.92 | 1.30 | 0.45 | 0.55298 | 0.037552 | 2.44 | −4.22 | 1.94 | 1.06 | 0.54 | −0.62 |
P | −1.22 | 0.88 | 2.23 | 0.2279 | 0.239531 | 0.98 | −1.66 | 0.27 | 1.84 | 0.70 | 2.00 |
S | 1.96 | −1.63 | 0.57 | 0.09204 | 0.004627 | −0.05 | 2.39 | −1.07 | 1.15 | −1.39 | 0.67 |
T | 3.22 | 1.45 | 0.84 | 0.19341 | 0.003352 | 0.35 | 0.75 | −2.18 | −1.12 | −1.46 | −0.40 |
W | −4.75 | 3.65 | 0.85 | 0.79351 | 0.037977 | 3.07 | −4.36 | 3.94 | 0.59 | 3.44 | −1.59 |
Y | 2.18 | 0.53 | −1.14 | 0.6115 | 0.023599 | 1.31 | −2.54 | 2.44 | 0.43 | 0.04 | −1.47 |
V | −2.69 | −2.53 | −1.29 | 0.25674 | 0.057004 | 1.66 | −2.59 | −2.64 | −1.54 | −0.85 | −0.02 |
Amino acid (one-letter code) | Collantes scale (ref. 42): ISA | Collantes scale: ECI | DPPS (divided physicochemical property scores) scale (ref. 41): V1 | DPPS: V2 | DPPS: V3 | DPPS: V4 | DPPS: V5 | DPPS: V6 | DPPS: V7 | DPPS: V8 | DPPS: V9 | DPPS: V10 | T scale (ref. 40): T1 | T scale: T2 | T scale: T3 | T scale: T4 | T scale: T5 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
A | 62.90 | 0.05 | −1.02 | −2.88 | −0.56 | 0.36 | −6.15 | −1.68 | 0.04 | −2.51 | −1.94 | −0.01 | −9.11 | −1.63 | 0.63 | 1.04 | 2.26 |
R | 52.98 | 1.69 | 1.99 | 4.13 | −4.41 | −1.02 | 4.78 | 3.04 | −9.06 | 6.71 | 4.41 | 0.07 | 0.23 | 3.89 | −1.16 | −0.39 | −0.06 |
N | 17.87 | 1.31 | −2.19 | 1.86 | 0.38 | −0.13 | −2.30 | 1.41 | −5.71 | −1.11 | 1.73 | −0.19 | −4.62 | 0.66 | 1.16 | −0.22 | 0.93 |
D | 18.46 | 1.25 | −6.60 | 3.32 | 1.61 | 0.36 | −3.25 | 1.95 | −7.36 | 0.14 | 1.24 | −0.15 | −4.65 | 0.75 | 1.39 | −0.40 | 1.05 |
C | 78.51 | 0.15 | 0.21 | 1.12 | 3.42 | −0.68 | −2.27 | −1.22 | 3.11 | −2.98 | −1.70 | 1.57 | −7.35 | −0.86 | −0.33 | 0.80 | 0.98 |
Q | 19.53 | 1.36 | −0.47 | 1.16 | −0.57 | 0.69 | 0.39 | 1.93 | −5.46 | −0.84 | 1.93 | 0.85 | −3.00 | 1.72 | 0.28 | −0.39 | 0.33 |
E | 30.19 | 1.31 | −5.39 | 0.65 | −0.98 | 1.39 | −0.23 | 2.51 | −6.84 | −0.68 | 1.41 | 1.28 | −3.03 | 1.82 | 0.51 | −0.58 | 0.43 |
G | 19.93 | 0.02 | −2.86 | −5.00 | −2.97 | 0.53 | −11.45 | 1.89 | −2.11 | −3.99 | −2.16 | −0.76 | −10.61 | −1.21 | −0.12 | 0.75 | 3.25 |
H | 87.38 | 0.56 | 0.73 | 2.68 | −0.66 | −1.89 | 1.60 | 1.13 | −1.94 | −0.11 | 0.44 | 0.15 | −1.01 | −1.31 | 0.01 | −1.81 | −0.21 |
I | 149.77 | 0.09 | 1.91 | −3.13 | 0.01 | 1.14 | 2.70 | −4.55 | 8.93 | 0.18 | −1.10 | −0.76 | −4.25 | −0.28 | −0.15 | 1.40 | −0.21 |
L | 154.35 | 0.10 | 1.64 | −2.57 | 0.00 | 1.35 | 2.62 | −2.65 | 7.72 | 0.05 | −1.03 | −1.81 | −4.38 | 0.28 | −0.49 | 1.45 | 0.02 |
K | 102.78 | 0.53 | 2.47 | 1.54 | −4.28 | −0.86 | 2.77 | 2.06 | −6.18 | 2.05 | 2.19 | −1.65 | −2.59 | 2.34 | −1.69 | 0.41 | −0.21 |
M | 132.22 | 0.34 | 1.93 | −0.01 | 1.21 | 0.99 | 2.79 | −0.56 | 5.33 | −0.87 | −0.99 | −1.09 | −4.08 | 0.98 | −2.34 | 1.64 | −0.79 |
F | 189.42 | 0.14 | 2.68 | 0.84 | 2.22 | 0.71 | 5.02 | −0.30 | 8.60 | 1.13 | −1.40 | −0.28 | 0.49 | −0.94 | −0.63 | −1.27 | −0.44 |
P | 122.35 | 0.16 | 0.45 | −2.89 | 1.77 | −5.81 | −3.79 | −0.61 | 0.70 | 1.21 | −1.67 | 1.79 | −5.11 | −3.54 | −0.53 | −0.36 | −0.29 |
S | 19.75 | 0.56 | −1.76 | −0.19 | 1.06 | −0.69 | −5.72 | 0.14 | −4.14 | −2.42 | −0.13 | 0.69 | −7.44 | −0.65 | 0.68 | −0.17 | 1.58 |
T | 59.44 | 0.65 | −0.55 | −0.66 | 0.13 | −0.31 | −2.76 | −1.56 | −2.46 | −2.12 | 0.17 | 0.08 | −5.97 | −0.62 | 1.11 | 0.31 | 0.95 |
W | 179.16 | 1.08 | 3.88 | 1.78 | 1.68 | 2.00 | 9.31 | 0.89 | 7.53 | 4.27 | −0.23 | −1.42 | 5.73 | −2.67 | −0.07 | −1.96 | −0.54 |
Y | 132.16 | 0.72 | 2.10 | 1.26 | 1.15 | 0.91 | 5.90 | 0.74 | 3.71 | 3.32 | 0.25 | 1.33 | 2.08 | −0.47 | 0.07 | −1.67 | −0.35 |
V | 120.91 | 0.07 | 0.83 | −3.02 | −0.22 | 0.97 | 0.05 | −4.55 | 5.61 | −1.41 | −1.44 | 0.30 | −5.87 | −0.94 | 0.28 | 1.10 | 0.48 |
Other amino acid descriptors have been used to develop 3D scales. The T-scale was developed by Tian et al.40 from a PCA of 67 structural and topological variables of 135 amino acids. A PCA was carried out on the hydrogen bonding (5), electronic (23), steric (37) and hydrophobic (54) properties of the 20 conventional amino acids, yielding 10 descriptors termed divided physicochemical property scores (DPPS).41 The 3D scale developed by Collantes and Dunn42 combines amino acid side-chain descriptors, i.e., the isotropic surface area (ISA, hydrophobic character of the side chain substituent) and the electronic charge index (ECI, charge concentration of the amino acid).
The main limitation of those scalar descriptors may be the difficulty in relating biological activity to specific physicochemical characteristics of the peptides.27 To overcome this issue, QSAR modelling analyses combining a z-score approach and individual physicochemical characteristics of the peptides have been carried out.32
When the peptide descriptors are defined by the characteristics of the constituent amino acids of the peptide, a QSAR model may be described by eqn (1):
[Equation image not available] (1)
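To make the idea of amino-acid-based peptide descriptors concrete, the sketch below encodes a peptide as the concatenation of the 3-z scores of its residues, which is the kind of descriptor vector a regression model would then relate to bioactivity. The z-values are copied from the 3-z columns of the descriptor table above for a subset of residues; the function name and the flat-list layout are assumptions made for illustration and do not reproduce the authors' own implementation.

```python
# 3-z scores (z1, z2, z3) for a subset of amino acids, copied from the table above.
Z3 = {
    "A": (0.07, -1.73, 0.09),
    "G": (2.23, -5.36, 0.30),
    "I": (-4.44, -1.68, -1.03),
    "L": (-4.19, -1.03, -0.98),
    "V": (-2.69, -2.53, -1.29),
    "F": (-4.92, 1.30, 0.45),
    "W": (-4.75, 3.65, 0.85),
    "P": (-1.22, 0.88, 2.23),
    "K": (2.84, 1.41, -3.14),
    "R": (2.88, 2.52, -3.44),
}

def encode_peptide(sequence, scale=Z3):
    """Concatenate per-residue descriptors from the N- to the C-terminus."""
    descriptors = []
    for aa in sequence:
        if aa not in scale:
            raise KeyError(f"no descriptors available for residue {aa!r}")
        descriptors.extend(scale[aa])
    return descriptors

print(encode_peptide("VW"))
# [-2.69, -2.53, -1.29, -4.75, 3.65, 0.85]
```

A dipeptide therefore yields six descriptors and a tripeptide nine, so peptides of one fixed length share the same descriptor layout.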
Some peptide QSAR approaches have studied peptide sets made of one defined amino acid length (e.g., di- or tripeptides). It is however possible to develop QSAR models using peptide libraries composed of sequences having variable amino acid lengths. This has been addressed by Pripp, et al.32 and Li and Li43 in the development of QSAR approaches for ACE-inhibitory peptides (2–6 amino acids in length) and antioxidant peptides (3–20 amino acids in length), respectively. In such QSAR studies, the length (l) of the shortest peptide of the dataset is generally taken into consideration in order to build the model. The QSAR model includes peptide descriptors of the l amino acids located at the N- and C-terminus of each peptide (eqn (2)):
[Equation image not available] (2)
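The terminal-window idea described above can be sketched as follows: l is taken as the length of the shortest peptide in the dataset, and every peptide is then described by its l N-terminal and l C-terminal residues so that all descriptor vectors have the same length. This is a minimal sketch; the helper names and example sequences are hypothetical, and residues are returned as letters so that any amino acid scale can be applied afterwards.

```python
def terminal_windows(sequence, l):
    """Return the l N-terminal and l C-terminal residues of a peptide."""
    if l < 1 or l > len(sequence):
        raise ValueError("l must be between 1 and the peptide length")
    return sequence[:l], sequence[-l:]

def encode_library(sequences):
    """Describe each peptide by its terminal windows, with l set by the shortest one."""
    l = min(len(s) for s in sequences)
    return {s: terminal_windows(s, l) for s in sequences}

# Hypothetical dataset mixing di-, tri- and pentapeptides, so l = 2.
windows = encode_library(["VW", "IKP", "LPVPQ"])
print(windows["LPVPQ"])  # ('LP', 'PQ')
```

For peptides shorter than 2l residues the two windows overlap (for a dipeptide with l = 2 they are identical), which is a property of this encoding rather than an error.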
The QSAR model is subsequently statistically cross-validated using the peptides within the test set (originally excluded from the training set) to verify the ability of the QSAR model to be applied to unknown compounds.
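One common way to quantify how well a model predicts the held-out test set is the coefficient of determination computed on those peptides only. The sketch below is a generic implementation of that statistic and is not tied to any specific study cited here; the function name is an assumption, and the numbers in the usage line are arbitrary values chosen only to exercise the function, not measured activities.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1 - ss_res / ss_tot

# Arbitrary numbers, used only to show the call: a perfect prediction gives 1.0.
print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0
```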
To date, the highest number of QSAR studies which are relevant to food protein-derived peptides appear to have been conducted with ACE inhibitory or antimicrobial peptides (AMPs). The following sections review QSAR studies which have been classified according to the target in vitro bioactivity (i.e., antioxidant, antimicrobial, ACE, renin and DPP-IV inhibition) as well as bitterness of the peptides.
Bioactivitya | Peptide lengthb | Scale usedc | Favorable amino acidd at the peptide position (C1 is the C-terminal amino acid) | Confirmatory studiesd | Reference | |||||
---|---|---|---|---|---|---|---|---|---|---|
C6 | C5 | C4 | C3 | C2 | C1 | |||||
a ACE: angiotensin converting enzyme; DPP-IV: dipeptidyl peptidase IV. b The number of peptides used to build the QSAR model is given in parentheses (when two numbers are provided, the first and second correspond to the number of peptides in the training and test sets, respectively). c DPPS: divided physicochemical property scores; ECI: electronic charge index; HESH: hydrophobic, electronic, steric and hydrogen; ISA: isotropic surface area; MS-WHIM: molecular surfaces-weighted holistic invariant molecular; VHSE: vectors of hydrophobic, steric and electronic properties. d The amino acids are coded with their one-letter code. Xaa: amino acid. | ||||||||||
ACE inhibition | 2 (n = 58) | 3-z scale | Hydrophobic & positively charged Xaa | Hydrophobic & bulky Xaa | None | 31 | ||||
2–8 (n = 36) | 8 physicochemical characteristics | Small Xaa | Hydrophobic & non-positively charged Xaa | None | 32 | |||||
2–6 (n = 29) | 3-z scale | Small & non-positively charged Xaa | Non-positively charged Xaa | None | ||||||
2 (n = 168) | 3-z scale | F/Y/W | F/Y/W/P | FW, WW, YW | 30 | |||||
3 (n = 140) | 3-z scale | V/L/I | K/R | P/F/W | VRF, IKP, LRW, LRF | |||||
2 (n = 168) | 3- and 5-z scales | F/Y/W | F/Y/W | None | 46 | |||||
3 (n = 140) | V/L/I | R/K | P/F/W | None | ||||||
4 (n = 79) | V/I/V/M | R/H/W/F | F | Y/P/F | None | |||||
≥5 (n = 226) | W | I/L/V/M | H/W/M | Y/C | None | |||||
2 (n = 58) | 3-z, 5-z, 8-v, 38 physicochemical characteristics, ISA and ECI scales | Hydrophobic Xaa | Hydrophobic, small Xaa | None | 23 | |||||
3 (n = 55) | 17 different scales | Bulky, charged Xaa | G | None | 39 | |||||
2 (n = 58) | 10 different scales | G | G | None | 50 | |||||
3 (n = 55) | 3-z, 5-z and 8-G scales | G | G | G | None | |||||
9 (n = 19) | 3-z, 5-z and 8-G scales | P | None | |||||||
≥5 (n = 245/18) | 38 physicochemical characteristics | V/I | W/Y/C | D/N/K | R/V/T | G/L/A/V/I | None | 45 | ||
3 (n = 17) | 3-z scale | V/L/I | P/C | P/F/W | IVP, INP, IQP, VIP | 33 | ||||
3 (n = 38) | 3-z scale | IIP, IVP | 51 | |||||||
Renin inhibition | 2 (n = 11) | 3-z and 5-z scales | V/L/I/A | W/Y/F | LW, IW, AW, VW | 52 | ||||
Antioxidant | 3 (n = 143/71) | DPPS, HESH, ISA-ECI, MS-WHIM, VHSE and 3-z scales | A/G/V/L/E | R/K/H | 64 | |||||
3 (n = 143/71), 4 (n = 12) | 8-v scale | A/V/L | R/K/H/D/E/T/S/N/Q | W/E/L/I/M/V/Y | None | 65 | ||||
2 (n = 32) | DPPS, ISA-ECI, 3-z and 5-z scales | Y | Hydrophobic, small, low hydrogen bond and electronic Xaa | 66 | ||||||
W | Bulky, hydrophobic Xaa | |||||||||
Bitterness | 2 (n = 48) | Hydrophobic & bulky Xaa | Hydrophobic & bulky Xaa | None | 31 | |||||
2 (n = 77) | 3-z scale | Hydrophobic, polar/charged Xaa | Hydrophobic or bulky Xaa | 25 | ||||||
3 (n = 52) | Bulky Xaa | Hydrophobic, bulky Xaa | ||||||||
4 (n = 23) | Basic, bulky Xaa | Basic, bulky, hydrophobic Xaa | ||||||||
5 (n = 12) | Basic, bulky Xaa | Bulky, hydrophobic Xaa | ||||||||
6 (n = 20) | Basic, bulky, hydrophilic Xaa | Bulky, hydrophobic Xaa | ||||||||
7 (n = 16) | Bulky, hydrophobic Xaa | Bulky, basic Xaa | ||||||||
2 (n = 48) | 3-z, 5-z, 8-v, 38 physicochemical characteristics, ISA and ECI scales | Hydrophobic, small Xaa | Hydrophobic, small Xaa | None | 23 | |||||
2 (n = 53) | 3-z scale | Bulky Xaa | Hydrophobic Xaa | None | 24 | |||||
3 (n = 55) | 3-z scale | Bulky Xaa (W/R/Y) | Hydrophobic Xaa | |||||||
2 (n = 48) | 17 different scales | W/Y/F | W/Y/F | 39 |
Bioactivitya | Peptide lengthb | Scale usedc | Favorable amino acidd at the peptide position (N1 is the N-terminal amino acid) | Confirmatory studiesd | Reference | ||||
---|---|---|---|---|---|---|---|---|---|
N1 | N2 | N8 | N9 | N13 | |||||
DPP-IV inhibition | 2–5 (n = 21/5) | 3-z scale | W/I/F/L | W/I/F/L | VPGEIVE, LPQNIPPLT, LPLPLL, QPLPPT, LPVPQ | 63 | |||
3-v scale | W/I/F/L | ||||||||
Anti-microbial | 15 (n = 19) murine LFcin analogs | 12 physicochemical parameters, 3-z scale | W | C | C | V/L/I/M | None | 67 |
While different structural features for potent ACE inhibitory peptides have been reported, the C-terminal sequence of the peptide appears to have a major contribution to ACE inhibition while a minor contribution of the N-terminal sequence has been proposed in several of the QSAR studies conducted to date.30,32,33,45,46 In particular, the C1 (C-terminal) amino acid of peptides is thought to have a major effect on the in vitro ACE inhibitory properties of peptides. Overall, most QSAR studies have indicated that the presence of hydrophobic (aliphatic or aromatic) and small amino acids (Ala, Trp, Pro, Phe, Gly, Cys, Leu and Ile) at the C1 position of peptides was a good predictor for potent in vitro ACE inhibitory activity. Amino acids located at other positions also appear to play a role in modulating the overall ACE inhibitory properties of peptides. It was reported, for instance, that the nature of C2 to C4 amino acids in peptides had an effect on the ACE inhibitory potential.45,46 The importance of the C-terminal sequence in ACE inhibition may come from the specific mode of action of the enzyme. ACE is a dipeptidyl carboxypeptidase, meaning that peptide binding to its active site occurs through the C-terminal dipeptide sequence.47 More particularly, hydrophobic amino acids have been described to bind to the hydrophobic S′2 subsite of the ACE active site.45 Zhou, et al.48 have shown that there was a relationship between the binding energy of peptides to ACE and their inhibitory potency, with a positive contribution of hydrophobicity to peptide binding. In addition, they demonstrated using truncated versions of Gln–Pro–Leu–Ile–Tyr–Pro that the C-terminal dipeptide (Tyr–Pro) played a preponderant role in peptide binding to ACE and that the C5 and C6 amino acids hardly affected binding.48
To date, most QSAR studies on ACE inhibition appear to have been applied to relatively short peptides (≤8 amino acid residues, Table 3). A recent QSAR study on the ACE inhibitory activity of peptides used the largest peptide dataset employed to date (>1400 peptides).49 The originality of this study also lies in the fact that it took into account peptides with a broad size range (from 2 to >12 amino acid residues). The peptides were classified according to their amino acid length as “tiny” (≤3 amino acids), “small” (4–6 amino acids), “medium” (7–12 amino acids) and “large” (>12 amino acids). When analyzing the amino acid composition of ACE inhibitory peptides, Gly (≥13%) was found to be the most abundant amino acid in dipeptides while Pro (≥14%) was the most abundant in peptides having ≥3 amino acids. This result is in agreement with QSAR outcomes from the study of Wang, et al.50 also predicting that Gly was the preferred amino acid residue both at the N- and C-terminal position of ACE inhibitory dipeptides.
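The length classes quoted above translate directly into a small lookup. The sketch below uses the thresholds given in the text; the function name and the example sequences (taken from the peptides named elsewhere in this review) are only illustrative.

```python
def size_class(sequence):
    """Classify a peptide by length using the four classes quoted above."""
    n = len(sequence)
    if n < 2:
        raise ValueError("a peptide needs at least two residues")
    if n <= 3:
        return "tiny"
    if n <= 6:
        return "small"
    if n <= 12:
        return "medium"
    return "large"

for peptide in ["VW", "IKP", "LPVPQ", "LPQNIPPLT"]:
    print(peptide, size_class(peptide))
```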
Despite the very large number of QSAR studies focusing on ACE inhibitory peptides, only a few have subsequently been utilised as predictive tools to design peptides with potent ACE inhibitory properties (Table 3). In a few instances, QSAR studies have led to the identification of novel ACE inhibitory peptides following confirmatory in vitro studies with synthetic peptides.30,33,51 Novel ACE inhibitory peptides, Tyr–Phe and Leu–Arg–Phe, being 2–10 times more potent than the well-known ACE inhibitory lactotripeptide Ile–Pro–Pro, were reported.30 Similarly, Ile–Val–Pro (IC50 = 49.7 ± 4.2 μM) and Val–Ile–Pro (IC50 = 26.1 ± 0.8 μM) were found to be relatively potent ACE inhibitors.33 Huang, et al.51 specifically applied QSAR to C-terminal Pro containing tripeptides and successfully identified two novel ACE inhibitory peptides, Ile–Ile–Pro and Ile–Val–Pro having IC50 values of 1.39 and 1.58 μM, respectively.
To our knowledge, new ACE inhibitory peptides identified using QSAR approaches have been evaluated in only one study for their in vivo antihypertensive properties.33 Ile–Val–Pro and Val–Ile–Pro were administered to spontaneously hypertensive rats (SHRs) at a dose of 0.75 mg kg⁻¹. Interestingly, these treatments resulted in a significantly higher reduction (∼3 times, p < 0.05) in systolic blood pressure (SBP) than Ile–Pro–Pro, when evaluated at the same dose.33
The outcomes of the two QSAR models (developed with the 3-z and 3-v scales, Table 3) applied to DPP-IV inhibitory peptides revealed the importance of hydrophobic amino acids (Trp, Ile, Leu and Phe) at the N-terminal position of the peptide.63 In addition, the 3-v scale model also showed the importance of hydrophobic amino acids located at position 2 of the peptide. These findings were in agreement with earlier structural studies showing the importance of hydrophobic amino acids at the N-terminal side of peptides with DPP-IV inhibitory properties.61,62 The aim of this QSAR study was not to design peptides with more potent DPP-IV inhibitory properties. However, the QSAR models were used as a tool to predict the DPP-IV inhibitory properties of a large number of peptides which have previously been identified in the gastrointestinal tract of humans following ingestion of milk. Confirmatory studies allowed the identification of milk protein-derived peptides relevant to humans with relatively high in vitro DPP-IV inhibitory potency such as Leu–Pro–Val–Pro–Gln displaying an IC50 value of 43.8 ± 8.8 μM.
Several studies have utilised QSAR approaches to better understand the structural requirements of the antimicrobial lactoferrin (LF)-derived pentadecapeptide, LFcin, for reviews, see: ref. 18 and 70. In these studies, LFcin from different species was used as a lead peptide to construct peptide analogs which were then employed to build a QSAR model in order to define structural characteristics of potent AMPs. A QSAR study on bovine LFcin (LF f(14–41)) analogs (8 peptides with 12–19 amino acid residues) was conducted to study antimicrobial activity against Escherichia coli and Staphylococcus aureus.71 Large and negatively charged peptides or peptides with a high hydrophobic moment (μ) displayed high antimicrobial activity against E. coli and S. aureus, respectively. In another study, 19 murine LFcin (LF f(16–30)) analog peptides were used to build a QSAR model in relation to their antimicrobial activity against E. coli and S. aureus.35 The importance of the N-terminal amino acid was highlighted. It is thought that the N-terminal amino acid may establish electrostatic interactions with the negatively charged phospholipids of bacterial membranes. Thereafter, it was hypothesised that hydrophobic amino acids (Trp, Tyr) may bind to the interface through hydrogen bonding, causing the phospholipid membranes to leak. While this study concluded that good antibacterial activity was obtained by replacement of several amino acids within the peptide sequence of murine LFcin, it was shown that the parameters of importance were the net charge and micelle affinity of the peptide. The most effective murine LFcin analog was found to be LFcin Arg1, 9 Trp8 Tyr13 (with an Arg at position 1 and 9, a Trp at position 8 and a Tyr at position 13).
Most QSAR studies conducted with antimicrobial LFcin peptides appear to have been designed with descriptors consisting of physicochemical parameters. Therefore, they took into account the characteristics of the whole peptide as opposed to the properties of its individual amino acids, making them more complicated to interpret and to employ for the subsequent design of novel peptide sequences. Very few QSAR studies have attempted to elucidate the amino acid descriptors which correlated with the antimicrobial activity of LFcin peptide analogs.67 The 3-z scale was used for the amino acids located at the 4 varied positions (1, 8, 9 and 13) of 19 murine LFcin analogs. The preferred amino acids at each position were determined following QSAR analysis (Table 3). A larger peptide dataset of human, bovine, caprine and murine LFcin analogs (52 peptides) was employed in another QSAR study using the 3-z scale for peptide descriptors.72 The most important properties governing the antimicrobial activity (against E. coli and S. aureus) were z1 (hydrophilicity) for amino acids located at positions 1, 3, 4 and 14, z2 (size) for positions 10 and 14 and z3 (charge) for position 4.
The ability of human LFcin analog peptides to act as cell membrane permeabilising agents of Pseudomonas aeruginosa, as a means to subsequently enhance the action of a synthetic antibiotic (novobiocin) and to avoid antibiotic resistance mechanisms, was studied using a QSAR approach.73,74 There was no direct relationship between peptide antimicrobial activity and cell permeabilising activity. The QSAR analysis revealed a positive correlation between antimicrobial activity and peptide hydrophobicity, the number of Trp and aromatic residues as well as the percentage of hydrophobic plus basic residues. On the other hand, peptides with cell permeabilising activity generally possessed aromatic and positively charged residues and had an amphiphilic structure.
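Whole-peptide descriptors of the kind mentioned above (number of Trp and aromatic residues, percentage of hydrophobic plus basic residues) can be computed by simple counting. In the sketch below the residue groupings are assumptions, since several hydrophobicity classifications exist and the grouping used in the cited studies is not stated here; the example sequence is hypothetical and is not an LFcin analog from those studies.

```python
HYDROPHOBIC = set("AVLIMFWC")   # assumed grouping; other classifications exist
BASIC = set("KRH")              # assumed grouping of basic residues
AROMATIC = set("FWY")           # assumed grouping of aromatic residues

def global_descriptors(sequence):
    """Simple whole-peptide counts of the kind discussed above."""
    n = len(sequence)
    if n == 0:
        raise ValueError("empty sequence")
    hydrophobic_or_basic = sum(aa in HYDROPHOBIC or aa in BASIC for aa in sequence)
    return {
        "n_trp": sequence.count("W"),
        "n_aromatic": sum(aa in AROMATIC for aa in sequence),
        "pct_hydrophobic_plus_basic": 100.0 * hydrophobic_or_basic / n,
    }

# Hypothetical sequence used only to show the call.
print(global_descriptors("RWKAWG"))
```

Because these descriptors summarise the peptide as a whole, they carry no positional information, which is the interpretability limitation raised in the text for models built on physicochemical parameters.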
Antimicrobial peptides other than LFcin have also been studied by QSAR.75–77 For example, the search for non-hemolytic (to reduce the risk of endotoxic shock) cyclic cationic peptides having high antibacterial properties has been aided by QSAR studies.77 It was shown that charge and amphipathicity were responsible for increased antibacterial activity. On the other hand, the anti-hemolytic effect was linked to lower peptide lipophilicity, in particular that of the residues involved in the nonpolar face of the peptides, which are likely to form a β-hairpin-like structure.
While some QSAR studies have identified structural characteristics relevant to AMPs, clear structural requirements for AMPs still do not appear to be available, as stated earlier.18 This has been explained by the fact that the structural requirements of AMPs appear to involve a quite complex combination of specific physicochemical properties (hydrophobicity, cationic residues, amphipathicity). In addition, the mechanism of action and specific target of AMPs are still not fully understood.74
A recent QSAR study has utilised peptide descriptors linked with electron transfer properties, i.e., energy of highest occupied molecular orbital (EHOMO) and bond length of active sites (L(X–H)), to study the antioxidant activity (˙OH and O2˙− scavenging) of di- to heptapeptides.78 The O2˙− scavenging ability of peptides correlated with high EHOMO and long L(X–H). However, the predictive ability of the model designed for ˙OH scavenging was not very accurate, possibly due to interferences of the assay reagents (Cu2+). Overall, this study has highlighted the role of electron effects on the radical scavenging ability of peptides.
A systematic evaluation of the antioxidant capacity (ferric reducing antioxidant power – FRAP) of all possible tripeptides (172 unique sequences) from β-Lg has been carried out.26 The QSAR analysis showed that antioxidant activity was governed by the electronic and hydrogen-bonding properties of all amino acids within the peptide. For the N- and C-terminal amino acids, it was also shown that the steric properties of the amino acids were important. Cys- and Trp-containing tripeptides were associated with high antioxidant activity. The effect of both of these amino acid residues was explained by their ability to interact with free radicals through hydrogen (–SH and indole group), electron (S) or proton (aromatic ring of Trp) donation. It was found that 3 β-Lg tripeptides (Leu–Thr–Cys, Cys–Gln–Cys and Gly–Thr–Trp) had a higher antioxidant activity than the well-known physiological antioxidant glutathione (Glu–Cys–Gly).
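Assuming that "all possible tripeptides" refers to every contiguous three-residue window of the parent protein, such a library can be enumerated with a sliding window, as sketched below. The function name is an assumption and the input is a short hypothetical sequence, not the β-Lg sequence, so the count printed here says nothing about the 172 sequences reported in the study.

```python
def unique_kmers(protein, k=3):
    """Return the distinct contiguous peptides of length k, in order of first occurrence."""
    if k < 1:
        raise ValueError("k must be positive")
    seen = {}
    for start in range(len(protein) - k + 1):
        seen.setdefault(protein[start:start + k], start)
    return list(seen)

# Hypothetical 8-residue sequence: 6 windows, one of them repeated.
print(unique_kmers("LTCLTCQC"))  # ['LTC', 'TCL', 'CLT', 'TCQ', 'CQC']
```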
The antioxidant properties of peptides may be determined using a wide range of assays. These assays are targeted at quite different oxidative species, therefore leading to different outcomes in terms of the peptides' potential ability to reduce oxidation.66 These differences highlight the challenges in attempting to find a consensus on the structural properties of antioxidant peptides. As highlighted in the recent review of Li and Yu,22 no clear consensus on the relationship between peptide structure and antioxidant activity has been established using QSAR studies, possibly due to the lack of knowledge of the mechanism of action of such peptides.
In several studies, the R² determined with QSAR models developed for bitter peptides was shown to be quite low. This was related to the difficulty of accurately measuring bitterness as its threshold is subject to interindividual variation in humans.48 Asao, et al.81 conducted a QSAR study on short (di- and tri-) peptides and their derivatives and showed a positive link between lower bitterness threshold and the peptides' physicochemical properties (i.e., length of the carbon backbone and octanol/water partition coefficient). In line with structural studies conducted on bitter short (di- and tri-) peptides, other QSAR studies have shown that hydrophobicity and size of the amino acids at the C1 and C2 position of peptides, respectively, correlated positively with bitterness.23,24
For datasets incorporating larger peptides (2–14 amino acid residues, n = 224), it was shown using the 3-z scale that the bitterness was correlated with hydrophobic amino acids in the C1 position and bulky, basic and hydrophilic amino acids in N1 (N-terminal) position of the peptides.25 The molecular mass, hydrophobicity, number of amino acid residues in the peptide together with the amino acid descriptors of the 3-z scale were incorporated in a PLSR. The previous physicochemical parameters had a higher influence on the bitterness than the 3-z scale. This suggests that the bitterness depends on the overall properties of the peptide rather than on the properties of specific amino acids within the peptide sequence.25
A number of QSAR studies have attempted to identify peptides displaying both good bioactivity profiles and low bitterness. It has been shown that several potent ACE inhibitory or antioxidant short peptides (2 to 3 amino acid residues) were also bitter.23,40,48,82 In contrast, in another study,24 no direct correlation was seen between ACE inhibitory properties and bitterness of di- or tripeptides. The number of possible dipeptide combinations (20² = 400) is lower than that of larger peptides, which therefore offer greater structural diversity. The likelihood of identifying highly potent BAP candidates with low bitterness is thus increased with larger peptides. In addition, the C1 to C4 amino acid residues are important in peptide binding to ACE. Hence the interest in identifying peptides with 3–4 amino acid residues having high ACE inhibitory activity and a good sensory profile.23,48
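The growth in sequence space with peptide length can be made concrete with a short calculation. The following is a minimal sketch, assuming only the 20 proteinogenic amino acids in linear, unmodified peptides; the function names are illustrative and do not come from any of the cited studies.

```python
from itertools import product

# One-letter codes of the 20 proteinogenic amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_sequences(length: int) -> int:
    """Number of distinct linear peptides of a given length (20 ** length)."""
    return len(AMINO_ACIDS) ** length


def enumerate_peptides(length: int):
    """Yield every possible peptide of the given length as a one-letter string."""
    for residues in product(AMINO_ACIDS, repeat=length):
        yield "".join(residues)


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, count_sequences(n))
```

Moving from di- to tetrapeptides increases the number of candidate sequences from 400 to 160 000, which illustrates why larger peptides offer more scope for combining high potency with low bitterness.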
Some QSAR studies incorporating relatively large numbers of BAPs in the dataset have been conducted with bioactivity data obtained using different experimental conditions (enzyme:substrate ratio, source of enzymes, substrate, temperature, etc.), which makes the validity of the models developed questionable. Several studies have also not taken into account the mode of action of the peptides with certain biological receptors. This is particularly relevant for enzyme inhibition assays where peptides may act at the active site (competitive enzyme inhibition) or outside the active site (modes of inhibition other than competitive) of the target enzyme(s). Recently, Nongonierma and FitzGerald63 showed that a statistically significant QSAR correlation between the DPP-IV inhibitory activity of peptides and their descriptors could only be obtained after applying a series of filters, i.e., using IC50 data obtained under the same experimental conditions for inhibitors with the same (competitive) mode of DPP-IV inhibition. While many QSAR studies have yielded models with statistically significant correlations, other studies have not achieved this result, possibly due to the relatively low number of peptides in the training set, structural diversity and/or structural relevance of the peptides therein.31,51,66 This again highlights the importance of the quality/heterogeneity of the data which may be included in the QSAR model.18 In addition, when the mechanism of action is not fully understood, e.g., mode of enzyme inhibition, it can be challenging to select meaningful peptide descriptors to build QSAR models.
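The type of dataset filtering described above may be sketched as follows. This is an illustrative example only: the record structure, field names, peptide labels and IC50 values are hypothetical placeholders and are not taken from the cited study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ic50Record:
    peptide: str       # placeholder label, not a real sequence
    ic50_um: float     # invented value, for illustration only
    mode: str          # e.g. "competitive" or "non-competitive"
    conditions: str    # label identifying one assay protocol


def filter_for_qsar(records, mode="competitive", conditions=None):
    """Keep records sharing one mode of inhibition and, optionally, one protocol."""
    kept = [r for r in records if r.mode == mode]
    if conditions is not None:
        kept = [r for r in kept if r.conditions == conditions]
    return kept


# Hypothetical records combining two assay protocols and two inhibition modes.
records = [
    Ic50Record("peptide_A", 50.0, "competitive", "protocol_1"),
    Ic50Record("peptide_B", 120.0, "non-competitive", "protocol_1"),
    Ic50Record("peptide_C", 80.0, "competitive", "protocol_2"),
    Ic50Record("peptide_D", 35.0, "competitive", "protocol_1"),
]

training_set = filter_for_qsar(records, conditions="protocol_1")
print([r.peptide for r in training_set])
```

Only the records that share both the mode of inhibition and the assay protocol are retained for model building, at the cost of a smaller training set.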
In addition to selecting the appropriate peptide sequences to build QSAR models, a limited number of studies have attempted to train the models with so-called “negative datasets” which incorporate peptides which do not display any bioactivity. In their QSAR approach developed with ACE inhibitory peptides, Kumar, et al.49 have also included negative datasets to develop classification models in order to assign unknown peptides to a category of active or inactive peptides. These models have been incorporated in a freely available web resource (http://crdd.osdd.net/raghava/ahtpin).
Using QSAR models applied to ACE inhibitory peptides, an in silico approach was developed in order to predict dietary protein substrates which would act as good precursors of ACE inhibitory peptides.85 The ACE IC50 values of peptides predicted in silico to be released from the major proteins present in 15 food commodities by thermolysin alone, by thermolysin combined with pepsin, and by thermolysin combined with pepsin and trypsin, were determined by QSAR. This analysis allowed the identification of meat (pork, beef and chicken) proteins as rich sources of potent ACE inhibitory peptides (IC50 < 10 μM).
Validation of the in silico results with in vitro testing of the hydrolysates is needed to confirm their potential bioactivity. The prediction ability of QSAR combined with peptide cutters to release ACE inhibitory peptides from the major egg white proteins was studied by Majumder et al.86 They predicted that thermolysin followed by pepsin digestion of ovotransferrin would allow the release of potent ACE inhibitory peptides (Ile–Arg–Trp, Leu–Lys–Pro and Ile–Gln–Trp). The three target peptides could not be identified by liquid chromatography (LC)-MS/MS. However, precursors of the target peptides were found within the digest. The inability to release the three target peptides was attributed to the limited access of the enzymes to certain peptide bonds, possibly due to the globular structure of ovotransferrin. Based on the knowledge of the structure of ACE inhibitory peptides, which generally possess Pro residues at their C-terminal region, a post-Pro endoproteinase preparation from Aspergillus niger (An-PEP) was employed to hydrolyse a Pro-rich protein substrate, i.e. bovine β-casein.87 The hydrolysate generated after 24 h incubation of β-casein with An-PEP at pH 6.0 and an enzyme to substrate (E:S) ratio of 2.5% (w/w) was a particularly potent inhibitor of ACE, having an IC50 value of 16.41 ± 6.06 μg mL⁻¹. Subsequently, LC-MS/MS characterisation of the peptides within this hydrolysate followed by confirmatory studies with synthetic peptides revealed that the bioactivity was linked to the presence of several C-terminal Pro containing peptides, some of which had ACE IC50 values in the μM range.
Other examples illustrating the use of in silico predictions to guide hydrolysate generation may be found in the development of DPP-IV inhibitory peptides. Tulipano et al.62 predicted, using a peptide cutter approach, that gastrointestinal digestion of β-lactoglobulin (β-Lg) would yield a higher number of DPP-IV inhibitory peptides than that of α-lactalbumin (α-La). Their prediction was subsequently validated by the fact that a β-Lg gastrointestinal digest had a higher DPP-IV inhibitory potency than an α-La digest. Recently, the targeted release of known DPP-IV inhibitory peptides from α-La has been studied.88 Approximately 64% of the peptide sequences predicted to be released in silico by digestion of α-La with elastase were identified by LC-MS/MS in an α-La elastase digest. The differences between in silico predictions and in vitro peptide release were possibly due to the presence of disulphide bonds within the α-La sequence. All five DPP-IV inhibitory peptides predicted to be released in silico were identified in the α-La elastase digest. Currently, the number of studies translating in silico results to BAP release in vitro is limited. However, the above studies have demonstrated the benefit of employing in silico approaches as a means to select enzyme × substrate combinations which may result in the release of potent BAPs.
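The principle of a peptide cutter, and the comparison of predicted with experimentally identified peptides, may be illustrated with a minimal sketch. It assumes a simplified trypsin-like rule (cleavage C-terminal to Lys or Arg, except before Pro) and complete digestion; the protein and the "observed" peptides are hypothetical toy sequences. Actual peptide cutter tools rely on more detailed specificity tables, and the studies cited above used other enzymes (e.g., elastase, thermolysin and pepsin).

```python
def digest(sequence, cleave_after, blocked_by="P"):
    """Cleave C-terminal to residues in `cleave_after`, unless the next
    residue is in `blocked_by`. Assumes complete digestion."""
    peptides, start = [], 0
    for i in range(len(sequence) - 1):
        if sequence[i] in cleave_after and sequence[i + 1] not in blocked_by:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


def fraction_confirmed(predicted, observed):
    """Share of predicted peptides that were also identified experimentally."""
    predicted = set(predicted)
    return len(predicted & set(observed)) / len(predicted)


# Hypothetical toy protein and hypothetical LC-MS/MS identifications.
toy_protein = "AKRPGKLW"
predicted = digest(toy_protein, cleave_after="KR")
observed = ["AK", "LW", "GKLW"]

print(predicted)
print(fraction_confirmed(predicted, observed))
```

In this toy case, the bond after Arg is not cleaved because it precedes Pro, and one of the three predicted peptides is absent from the hypothetical experimental list, which mirrors the partial agreement between in silico and in vitro results reported above.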
In addition to enzyme specificity and protein sequence knowledge, a wide range of physicochemical parameters (pH, temperature, E:S ratio, protein concentration, etc.) can affect both enzyme activity and protein conformation. Therefore, peptide release is highly dependent on the conditions employed during food protein hydrolysis. The impact of hydrolysis parameters on hydrolysate bioactivity has been systematically studied using multifactorial experimental design and response surface methodology (RSM) approaches.89 In particular when a specific peptide sequence is being targeted for enzymatic release, RSM may be employed to determine the hydrolysis parameters giving an optimum yield of the peptide. For example, RSM has been applied to optimally release an ACE inhibitory peptide, His–Leu–Pro–Leu–Pro (β-casein f(134–138)), from casein using Corolase PP, an intestinal enzyme preparation.90 The optimum conditions were found to be 24 h hydrolysis using an E:S ratio of 6% (w/w).
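The logic of RSM may be illustrated with a deliberately simplified, single-factor sketch: a second-order model is fitted by least squares to the measured response and its stationary point is taken as the predicted optimum. The design points and responses below are invented for illustration and are unrelated to the cited studies; RSM applied to hydrolysis generally involves several factors (e.g., time, temperature and E:S ratio) and interaction terms.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear(xtx, xty)


def stationary_point(b1, b2):
    """Factor level at which the fitted quadratic has zero slope."""
    return -b1 / (2.0 * b2)


# Hypothetical factor levels (arbitrary units) and peptide yields (arbitrary units).
levels = [1.0, 2.0, 3.0, 4.0, 5.0]
yields = [6.0, 9.0, 10.0, 9.0, 6.0]

b0, b1, b2 = fit_quadratic(levels, yields)
optimum = stationary_point(b1, b2)
print(round(optimum, 3))
```

A negative quadratic coefficient (b2) indicates that the stationary point is a maximum; in practice, the predicted optimum would then be confirmed experimentally.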
The production of food protein hydrolysates enriched with these potent BAPs may be achieved through the utilisation of design of experiments combined with RSM approaches to predict the optimum hydrolysis parameters which would yield the enzymatic release of BAPs. Combination of QSAR with other in silico tools and peptide library approaches will allow the development of systematic methods for the discovery of novel and potent BAPs. For instance, QSAR models have been used to predict the bioactivity of very large sets of peptide sequences with unknown bioactivities.26,63 These large sets of peptides may correspond, for example, to all possible amino acid combinations to generate a specific peptide size, peptides which may be present within a specific proteome, novel peptide sequences as well as protein-derived peptides identified within humans.
Overall, many of the QSAR studies appear to have highlighted the importance of hydrophobic amino acids (Pro, Trp, Leu, Ile, Val and Ala) for a wide range of bioactive properties (ACE, DPP-IV and renin inhibition). Interestingly, the presence of some of these residues (i.e., Pro) within peptide sequences has been linked with gastrointestinal or serum stability in vivo, which potentially makes them interesting candidates for the development of functional foods targeted at human nutrition.
To date, it appears that certain bioactivities (e.g., renin and DPP-IV inhibition) have not been extensively studied with QSAR approaches. More research in this area will allow the development of more potent BAPs and ultimately the identification of enzymatic hydrolysis strategies to optimally release such peptide sequences. The main challenge in applying QSAR approaches to certain bioactive properties lies in the fact that the target and mode of action of the peptides are not known, making it virtually impossible to develop meaningful models. Therefore, more research is also needed in this area.
This journal is © The Royal Society of Chemistry 2016