Open Access Article
V. Erckes, A. Misiek, A. Streuli and C. Steuer*
ETH Zurich, Institute of Pharmaceutical Sciences, Laboratory of Pharmaceutical Analytics, Zurich, Switzerland. E-mail: christian.steuer@pharma.ethz.ch
First published on 20th May 2026
Peptidomimetics such as fatty-acid-modified, head-to-tail cyclic, and disulfide-constrained peptides challenge conventional MS/MS interpretation because their fragmentation pathways extend beyond canonical backbone cleavages or require multiple bond breakages to generate sequence-informative ions. We present a structure-aware algorithm that calculates and labels theoretical fragment ions across modified and cyclic peptides, including side-chain and disulfide-related fragments. To evaluate assignments, we integrate three numerical metrics (sequence coverage, intensity coverage, and signal coverage) and assess their behavior across m/z tolerances, intensity thresholds, and charge states. Using representative MS/MS data of angiotensin-related peptides, liraglutide, semaglutide, cyclosporine, oxytocin, and somatostatin, we demonstrate that the metrics reliably distinguish correct from incorrect assignments, including closely related sequences. Incorporating fatty-acid-specific fragments increased intensity coverage for liraglutide and semaglutide. MS3 improved sequence coverage for the cyclic peptide cyclosporine relative to MS2, but no additional benefit was seen at the MS4 level. For disulfide-bonded peptides, the combination of electron-transfer dissociation (ETD) and collision-induced dissociation (CID) shifted fragment distributions toward disulfide-cleavage products relative to CID alone, which were captured by the algorithm's dedicated disulfide labels. Together, these results demonstrate that structure-aware fragment calculation coupled to explicit assignment metrics enables more comprehensive and standardized interpretation of peptidomimetic MS/MS data. With this approach, we lay a foundation for reproducible benchmarking that can be extended to additional cyclization, modification, and fragmentation techniques.
As sequence and structural modifications directly influence peptide drug properties, reliable identification of peptide and peptidomimetic sequences is essential. Tandem mass spectrometry (MS/MS) is the central technique for peptide and protein sequencing because of its high sensitivity, compatibility with complex mixtures, and minimal sample requirements.8,9 Alternative approaches such as Edman degradation, nuclear magnetic resonance spectroscopy, and X-ray crystallography provide complementary insights but require large amounts of material and are less suited for high-throughput analysis.10–14 By comparison, MS/MS enables rapid and automated peptide identification. Peptides are ionized and fragmented, and the resulting fragment ions provide information that can be used to confirm or deduce peptide sequences.15 Fragmentation methods such as collision-induced dissociation (CID), higher-energy collisional dissociation (HCD), and electron transfer dissociation (ETD) generate complementary fragmentation patterns that can be exploited to improve sequence coverage.16–20 However, MS/MS experiments generate spectra with hundreds of fragment signals, making manual interpretation time-consuming and creating the need for automated analysis tools.21
For linear peptides composed of the 20 canonical amino acids, numerous computational approaches exist for MS/MS spectrum interpretation. Database search algorithms and tools such as SEQUEST,22 Mascot,23 X!Tandem,24 MaxQuant25 and MS-GF+26 have been widely adopted in proteomics, enabling rapid peptide identification by matching experimental spectra to theoretical spectra derived from sequence databases.27 In parallel, de novo sequencing tools including PEAKS,28 Novor,29 DeepNovo,30 and InstaNovo31 allow direct sequence interpretation without database dependence, facilitating discovery of novel peptides or unexpected sequence variants. However, these methods generally assume linear peptides with canonical residues and provide limited support for chemical modifications or structural constraints. Even when post-translational modifications are considered, they are typically modelled as static mass shifts rather than as distinct fragmentation behaviors. To our knowledge, no comprehensive public spectral database with annotated MS/MS data dedicated to cyclic and backbone-modified peptides is currently available.
Peptidomimetics pose additional challenges. Structural features such as fatty acid conjugation, head-to-tail cyclization, and disulfide bridges create alternative fragmentation pathways, resulting in complex spectra not captured by conventional algorithms. For example, fatty acid-modified peptides such as semaglutide contain additional amide linkages within the modification that might influence fragmentation and complicate spectrum interpretation. Cyclic peptides require multiple fragmentation events to generate sequence-informative ions, and the number of possible fragments increases substantially with peptide length. Multi-stage MSn has been described to provide this information, but it also greatly expands the number of fragment ions and spectra to evaluate. Recent work has demonstrated selective cleavage and linearization of thioether-macrocyclized peptides to facilitate MS/MS interpretation, although this requires prior derivatization of the thioether ring-closing moiety.32 Disulfide-bridged peptides additionally exhibit fragmentation patterns that depend on whether the disulfide bond remains intact or is cleaved, alongside backbone amide bond cleavage.33,34 While dedicated tools exist for specific tasks (e.g., MassMatrix for disulfide mapping,35 or the glycopeptide-focused software Byonic36 and pGlyco37 for glycans), there is no general framework that systematically accounts for the diverse fragmentation behaviour of modified and cyclic peptides. As a result, interpretation of MS/MS spectra of peptidomimetics is still largely manual, time-consuming, and not standardized.
Recently, we introduced PICKAPEP38 as a versatile platform for virtual generation and parameter calculation of libraries of modified and cyclic peptides, supporting user-defined residues and diverse cyclization strategies. While valuable for in silico peptide design, its utility for MS/MS experiments remained limited. In this work, we extend PICKAPEP with an algorithm for the calculation of fragment ions characteristic of peptidomimetics. The approach goes beyond canonical backbone fragmentation by considering structural features such as fatty acid conjugation, head-to-tail cyclization and disulfide linkages. We demonstrate its utility using representative MS/MS data from angiotensin peptides, liraglutide, semaglutide, cyclosporine, oxytocin, and somatostatin. Our analysis spans from overall performance across peptide classes to detailed fragmentation behaviour in selected case studies. By enabling structure-aware fragment calculation, the framework supports a more comprehensive and standardized interpretation of MS/MS spectra of peptidomimetics.
![]() | (1) |
![]() | (2) |
![]() | (3) |
Data processing and statistical evaluation were performed with NumPy41 (1.23.5), pandas42 (1.5.3) and SciPy43 (1.15.3). Data visualization was carried out with matplotlib44 (3.10.1) and seaborn45 (0.13.2). Chemical structures were drawn using ChemDraw (20.0.0.41). Graphical abstract was created in BioRender. Steuer, C. (2026) https://BioRender.com/qlu1rqo.
| Peptide | Type | Sequence | [M + H]+ | [M + 2H]2+ | [M + 3H]3+ | [M + 4H]4+ |
|---|---|---|---|---|---|---|
| AT1 | Linear | H2N-DRVYIHPFHL-COOH | 1296.69 | 648.85 | 432.90 | 324.93 |
| AT2 | Linear | H2N-DRVYIHPF-COOH | 1046.54 | 523.78 | 349.52 | 262.39 |
| AT3 | Linear | H2N-RVYIHPF-COOH | 931.52 | 466.26 | 311.18 | 233.63 |
| AT4 | Linear | H2N-VYIHPF-COOH | 775.41 | 388.21 | 259.14 | 194.61 |
| LGL | Linear + fatty acid | H2N-HAEGTFTSDVSSYLEGQAAK(*)EFIAWLVRGRG-COOH (* = γ-Glu-C16 fatty acid) | 3749.95 | 1874.98 | 1250.32 | 937.99 |
| SGL | Linear + fatty acid | H2N-HAibEGTFTSDVSSYLEGQAAK(*)EFIAWLVRGRG-COOH (* = 2 Ado-γ-Glu-C18 fatty diacid) | 4112.12 | 2056.06 | 1371.04 | 1028.53 |
| CSA | Cyclic (head-to-tail) | V(1)-Lnme-A-Ad-Lnme-Lnme-Vnme-Mebmt-Abu-Sar-Lnme(1) | 1202.85 | 601.42 | 401.29 | 301.22 |
| OXY | Cyclic (disulfide) | H2N-C(1)YIQNC(1)PLG-Am | 1007.44 | 503.72 | 336.15 | 252.37 |
| SMT | Cyclic (disulfide) | H2N-AGC(1)KNFFWKTFTSC(1) | 1637.72 | 818.86 | 546.24 | 409.94 |

Ado = 3,8-dioxaamino octanoic acid, Abu = aminobutyric acid, Sar = sarcosine, –nme = N-methylated, –d = D-amino acid.
Fig. 1 Fragmentation strategies used for theoretical fragment calculation. (A) Amide and disulfide bond fragmentation pathways. Amide bond cleavage predominantly produces b/y-type ions under CID and c/z-type ions under ETD.16 Disulfide bond cleavage is more common under ETD conditions.47 (B–E) Principles of fragment generation for the different peptide classes: (B) linear peptides, (C) fatty acid-modified peptides, (D) head-to-tail cyclized peptides, and (E) disulfide-constrained peptides. Fragmentation pathways yielding ions with the same mass as the precursor peptide, which require an additional fragmentation step for sequence information, are highlighted in grey.
Each amide bond can cleave at three different positions, producing fragment ion pairs of the a/x, b/y, or c/z type, corresponding to the N- and C-terminal sides of the peptide, respectively (Fig. 1A).15 Although peptide fragmentation is sequence- and method-dependent, certain method-specific patterns are well established: CID predominantly yields b/y ions, whereas ETD favors c/z ions.16 Since the final fragmentation step in this study was always performed with CID, calculation and subsequent analysis of linear peptides focused on b/y-type fragments. In line with established notation, linear peptide fragments were calculated and labelled by ion type and residue count, with the charge state given by the number of ‘+’ symbols (Fig. 1B). For instance, an N-terminal fragment of six amino acids detected with two charges is denoted b6++.
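The b/y calculation and the ‘+’ labelling described above can be illustrated with a minimal Python sketch. It is not the PICKAPEP implementation: the function names `b_ion`, `y_ion` and `label` are hypothetical, and the residue table only covers the amino acids occurring in AT1 to AT4. Standard monoisotopic masses are assumed.

```python
# Monoisotopic residue masses (Da) for the amino acids occurring in AT1-AT4.
RESIDUE_MASS = {
    "D": 115.026943, "R": 156.101111, "V": 99.068414, "Y": 163.063329,
    "I": 113.084064, "H": 137.058912, "P": 97.052764, "F": 147.068414,
    "L": 113.084064,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # mass of H2O (Da)


def b_ion(sequence, n, z=1):
    """m/z of the b ion containing the first n residues at charge z."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence[:n])
    return (mass + z * PROTON) / z


def y_ion(sequence, n, z=1):
    """m/z of the y ion containing the last n residues (n >= 1) at charge z."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence[-n:]) + WATER
    return (mass + z * PROTON) / z


def label(ion_type, n, z):
    """Fragment label with one '+' per charge, e.g. ('b', 6, 2) -> 'b6++'."""
    return f"{ion_type}{n}{'+' * z}"


AT1 = "DRVYIHPFHL"
print(label("b", 6, 2), round(b_ion(AT1, 6, 2), 4))
print(label("y", len(AT1), 1), round(y_ion(AT1, len(AT1), 1), 4))
```

The full-length y ion at z = 1 corresponds to [M + H]+, so the second line can be compared with the value listed for AT1 in the peptide table (agreement within 0.01 m/z).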
This notation was extended to peptides carrying fatty acid side-chain modifications. To our knowledge, no specific nomenclature has been reported for such cases. In these peptides, amide bonds occur both along the peptide backbone and within the fatty acid-linker connection. Backbone amide cleavages were annotated as in linear peptides, while side-chain cleavages were labelled according to the number of linker segments cleaved, analogous to amino acid numbering. The complementary ion was denoted as the intact molecular ion mass (M) minus the fragmented fatty acid portion (Fig. 1C). For example, in liraglutide, which contains a C16 fatty acid (fa1) linked via a γ-glutamate spacer (fa2) to a lysine side chain, cleavage between lysine and glutamate generates two fragments with the notation fa2 and (M − fa2).
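The side-chain labels can be illustrated by treating each linker segment like an amino acid residue and summing from the fatty acid inward. The sketch below is an illustration under stated assumptions, not the published algorithm: the helper names are hypothetical, the fa fragments are assumed to be b-type-like ions, the complementary ions are assumed to retain the peptide with a free lysine amine, and the neutral masses are derived from the [M + H]+ values in the peptide table.

```python
PROTON = 1.007276  # Da

# Monoisotopic "residue" masses (building block minus H2O), in Da.
SEGMENT_MASS = {
    "C16": 238.229666,         # palmitoyl, C16H30O
    "C18-diacid": 296.235145,  # octadecanedioyl, C18H32O3
    "gGlu": 129.042593,        # gamma-glutamate, C5H7NO3
    "Ado": 145.073893,         # C6H11NO3
}


def fa_ladder(segments):
    """Cumulative masses of fa1, fa2, ... counted from the fatty acid inward."""
    total, ladder = 0.0, []
    for seg in segments:
        total += SEGMENT_MASS[seg]
        ladder.append(total)
    return ladder


def fa_ion(segments, k, z=1):
    """m/z of side-chain fragment fa<k> at charge z (assumed b-type-like)."""
    return (fa_ladder(segments)[k - 1] + z * PROTON) / z


def m_minus_fa(neutral_mass, segments, k, z):
    """m/z of the complementary ion [M - fa<k>] at charge z."""
    return (neutral_mass - fa_ladder(segments)[k - 1] + z * PROTON) / z


# Side chains listed from the fatty acid (fa1) towards the lysine.
LGL_CHAIN = ["C16", "gGlu"]
SGL_CHAIN = ["C18-diacid", "gGlu", "Ado", "Ado"]
M_LGL = 3749.95 - PROTON   # neutral mass derived from the tabulated [M + H]+
M_SGL = 4112.12 - PROTON

print("LGL fa2+       ", round(fa_ion(LGL_CHAIN, 2), 2))
print("LGL [M - fa2]++", round(m_minus_fa(M_LGL, LGL_CHAIN, 2, 2), 2))
print("SGL fa4+       ", round(fa_ion(SGL_CHAIN, 4), 2))
```

Under these assumptions the calculated values agree, within 0.01 m/z, with the calculated masses reported for the fa fragments of LGL and SGL in the fragment table.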
Sequencing head-to-tail cyclized peptides requires at least two cleavage events, resulting in n × (n − 1) theoretical b-type fragments for a cyclic peptide with n amino acids and n amide bonds. As described previously, we restrict interpretation to b-type ions, while y-type fragments can be omitted. Although alternative ion types (e.g. c/z, a/x) cannot be distinguished based on mass alone in cyclic peptides, this reflects established CID fragmentation preferences.41 In PICKAPEP, cyclic peptides are created by cyclizing a linear precursor, with residues numbered according to the linear sequence. Fragments were therefore labelled by the first and last amino acid contained in the fragment (Fig. 1D). For instance, [5–8] represents cleavage between residues 4/5 and 8/9, corresponding to the sequence from residue 5 to 8. Numbering can extend across the cycle; for example, [9–4] denotes the complementary segment of [5–8].
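The n × (n − 1) count and the [first–last] labels can be made concrete with a short enumeration. This is a sketch with a hypothetical function name; it only generates labels and residue numbers, not masses, and it labels single-residue fragments as [i–i], which is an assumption about the notation.

```python
def cyclic_fragments(n):
    """Enumerate the n * (n - 1) contiguous fragments of an n-residue cycle.

    Returns (label, residue_numbers) pairs. Residues are numbered 1..n as in
    the linear precursor, and a fragment may wrap around the ring closure.
    """
    fragments = []
    for start in range(1, n + 1):
        for length in range(1, n):  # length n would be the intact ring
            residues = [(start - 1 + k) % n + 1 for k in range(length)]
            fragments.append((f"[{start}–{residues[-1]}]", residues))
    return fragments


# Example for an 11-residue cycle such as CSA.
frags = dict(cyclic_fragments(11))
print(len(frags))        # number of theoretical fragments
print(frags["[5–8]"])    # residues 5 to 8
print(frags["[9–4]"])    # complementary segment, wrapping across the closure
```

For n = 11 this gives 110 labelled fragments, which illustrates why the number of candidate ions grows quickly with ring size.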
For peptides containing side chain-to-side chain disulfide bridges, three types of fragmentation pathways were considered (Fig. 1E). Backbone cleavages of amide bonds outside the cyclic region were annotated as in linear peptides. Cleavages within the cyclic region produced either continuous fragments, labelled as in head-to-tail cycles (e.g. [5–6]), or discontinuous fragments, labelled by the two residues retained in the cycle (e.g. [4/7]). Disulfide bond cleavages were previously described to occur either at the S–S bond or at the C–S bond, yielding eight distinct fragmentation products per disulfide bridge. These fragments were annotated as linear ions, with additional labels denoting the disulfide modification state (e.g. =S, –SH, =CH2, –SSH), resulting in, for example, a fragment S(=S)b6. Opening of the disulfide bond in MS/MS analysis has been described especially for ETD.47
All spectra were assigned using the developed algorithm, and sequence coverage, intensity coverage and signal coverage were calculated for each assignment. An example output for AT1 [M + 3H]3+ is shown in Fig. 2. The assigned spectra of all MS/MS data can be found in the SI (S3).
Fig. 2 Example of the output for our algorithm-based MS/MS data assignment for the spectrum of AT1 [M + 3H]3+. All assigned fragment ions are shown in blue, with only a subset labelled for clarity.
We first assessed the applicability of the metrics to linear peptides, as their fragmentation behaviour is well characterized in the literature and directly aligns with the fragmentation pathways implemented in our algorithm.15,16 Using AT1 [M + 3H]3+ as an example (Fig. 2), we assessed the sensitivity of assignment metric values to m/z tolerance and intensity threshold. As shown in Fig. 3A, sequence coverage increased with broader m/z tolerances and lower thresholds. Intensity coverage increased with m/z tolerance until reaching a plateau and showed little dependence on the intensity threshold, reflecting the dominance of high-intensity peaks in this metric. Signal coverage behaved differently, increasing with both tolerance and threshold, suggesting that higher-intensity peaks are more likely to correspond to the known fragment patterns than lower-intensity peaks. Because coverage metrics remained stable at higher m/z tolerances, and considering the resolution of our instrument, a tolerance of ±0.5 m/z was selected for subsequent analyses unless stated otherwise. With low-resolution MS, some degree of unspecific fragment assignment cannot be fully excluded. However, we never observed signals above the precursor mass in any spectrum, even when scans extended beyond this range due to default charge state settings. This suggests that the large number of low-intensity signals does not represent random background noise. Instead, they may arise from sequence scrambling or other poorly understood fragmentation pathways, and we therefore interpret them as potential but unspecific fragment assignments.
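The interplay of m/z tolerance and intensity threshold can be illustrated with a small NumPy sketch. The metric definitions in the code are assumptions made for illustration only and are not taken from eqn (1)–(3): intensity coverage is taken as the matched share of summed intensity, and signal coverage as the matched share of peaks, both evaluated above a threshold relative to the base peak. Sequence coverage is left out because it additionally needs a mapping from fragments to cleavage sites. The function name and the toy spectrum are hypothetical.

```python
import numpy as np


def coverage_metrics(mz, intensity, theoretical_mz, tol=0.5, threshold_pct=0.1):
    """Match peaks to theoretical fragments and return two coverage values.

    ASSUMED definitions (not the paper's equations):
      intensity coverage = matched intensity / total intensity (in %)
      signal coverage    = matched peaks / total peaks (in %)
    Only peaks at or above threshold_pct of the base peak are considered.
    """
    mz = np.asarray(mz, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    theo = np.asarray(theoretical_mz, dtype=float)

    # Intensity threshold relative to the most intense peak (base peak = 100%).
    keep = 100.0 * intensity / intensity.max() >= threshold_pct
    mz, intensity = mz[keep], intensity[keep]

    # within[i, j] is True when peak i lies within +/- tol of fragment j.
    within = np.abs(mz[:, None] - theo[None, :]) <= tol
    peak_matched = within.any(axis=1)
    fragment_found = within.any(axis=0)

    intensity_cov = 100.0 * intensity[peak_matched].sum() / intensity.sum()
    signal_cov = 100.0 * peak_matched.sum() / mz.size
    return intensity_cov, signal_cov, fragment_found


# Toy spectrum: four peaks, three theoretical fragments.
peaks_mz = [100.0, 200.3, 300.0, 400.0]
peaks_int = [10.0, 100.0, 50.0, 0.05]
theo = [200.0, 300.2, 500.0]
i_cov, s_cov, found = coverage_metrics(peaks_mz, peaks_int, theo)
print(i_cov, s_cov, found)
```

In this toy case the two most intense peaks are matched, so the intensity-weighted value is much higher than the peak-count value, mirroring the dominance of high-intensity peaks noted above.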
Based on these considerations, we also evaluated the ability of the metrics to detect false assignments. For a first evaluation, the AT1 [M + 3H]3+ spectrum was compared with theoretical fragments from AT2 (related sequence with theoretical fragment overlap of 50%) and CSA (unrelated but with >100 theoretical fragments), as shown in Fig. 3B. Heatmaps and threshold-dependent plots of other charge states for the comparison of AT1, AT2 and CSA are provided in the SI (S4). As expected, AT1 values were closer to those of AT2 than to CSA, while overall the metrics showed lower values for incorrect assignments, particularly at higher intensity thresholds. The comparison with CSA also showed a measurable level of matching despite the absence of sequence relatedness. Given the large number of theoretical fragments for CSA, particularly when multiple charge states are considered, this can be attributed to an increased probability of incidental matches within the applied m/z tolerance. This therefore reflects a baseline level of unspecific matching that should be considered when interpreting assignment metrics.
To investigate this further, we assessed whether the assignment metrics can reliably distinguish correct from incorrect matches across the full set of peptides, including related and unrelated sequences. Using AT1 again as the reference peptide, we compared assignments in both directions: AT1 fragments assigned to spectra of related (AT2–4) or unrelated peptides (all other peptides in our set), and AT1 spectra assigned with fragments from related or unrelated peptides. Each charge-state spectrum was treated as an independent data point. To evaluate the influence of the intensity threshold, we varied it across a range of values. At a representative threshold of 0.1% (Fig. 3C), average sequence coverage and intensity coverage were significantly higher for correct assignments than for related or unrelated matches. This trend was robust across a range of thresholds (0.01%, 0.5%, 1%, and 5%; shown in the SI (S4)). Only at the lowest threshold (0.01%) did sequence coverage fail to reach significance for the comparison of related fragments to AT1 spectra, which may reflect unspecific assignment of low-intensity ions. Signal coverage was less discriminative than the other metrics, particularly for related peptides and at low thresholds. Additionally, mean values of the signal coverage were generally lower than for the other metrics, indicating that many signals in the fragment spectra remain unexplained. Overall, these observations indicate that the sequence and intensity coverage metrics in particular provide sufficient ability to distinguish correct from false assignments in our dataset, even when comparing closely related peptides. It should be noted that these results were obtained using a relatively broad tolerance of ±0.5 m/z. In addition, precursor mass was not used as a filtering criterion in this study, since low-resolution MS did not allow unambiguous molecular formula determination. With high-resolution instrumentation, narrower tolerances and precursor-based filtering could further improve the specificity of fragment assignments.
Finally, we assessed whether assignments for peptidomimetics were comparable to those of linear peptides. Fig. 3D compares the different metrics between peptide types, with each charge state treated as an independent data point. On average, sequence and intensity coverage were lower for fatty acid-modified, cyclic, and disulfide-constrained peptides than for linear peptides. This difference was particularly pronounced for sequence coverage at higher thresholds, as shown in the SI (S4). Intensity coverage remained more stable across thresholds, consistent with previous observations. Signal coverage showed no consistent trend. It might be used as a value reflecting the correlation between observed and assumed fragmentation pathways, but this requires further evaluation. Pairwise comparisons between related peptidomimetics, analogous to those described for AT1, further supported the discriminative power of the metrics. Correct and incorrect assignments were tested bidirectionally between LGL and SGL, between AT1 and CSA, and between SMT and OXY (shown in the SI (S4)). Although spectra and results varied across charge states, highlighting the importance of considering multiple metrics as well as charge states when available, correct assignments consistently produced higher sequence and intensity coverage across thresholds and charge states. When examining individual assignments, even correct assignments may yield low absolute metric values (e.g. OXY [M + H]+ and [M + 2H]2+). However, when the spectra were compared to related peptides, the correct assignments still resulted in higher values. This suggests that reaching an absolute threshold may not be necessary for correct identification; instead, aiming for a relative maximum in each case may be more appropriate. Nevertheless, larger datasets will need to be evaluated to draw more conclusive results.
Fig. 4 Assigned MS/MS spectra of (A) LGL [M + 2H]2+, (B) LGL [M + 3H]3+ and (C) SGL [M + 3H]3+. Side-chain fatty acid modifications are shown with the main fragmentation site [M − fa2] highlighted.
| Peptide | Spectrum | Intensity coverage without fa fragments (%) | Intensity coverage with fa fragments (%) | Fragments detected | Calculated mass (m/z) | Mass found (m/z) | Normalized intensity (%) |
|---|---|---|---|---|---|---|---|
| LGL | [M + 2H]2+ | 24.5 | 40.8 | [M − fa2]++ | 1691.84 | 1692.07 | 100.0 |
| | [M + 3H]3+ | 51.8 | 56.8 | [M − fa2]+++ | 1128.23 | 1128.43 | 44.04 |
| | [M + 4H]4+ | 7.9 | 8.0 | fa2+ | 368.28 | 368.22 | 0.34 |
| SGL | [M + 3H]3+ | 26.9 | 29.3 | fa2+ | 426.28 | 426.50 | 0.14 |
| | | | | fa4+ | 716.43 | 716.68 | 0.14 |
| | | | | [M − fa1]++ | 1908.45 | 1908.41 | 1.10 |
| | | | | [M − fa2]++ | 1843.92 | 1844.15 | 2.20 |
| | | | | [M − fa2]+++ | 1229.62 | 1229.78 | 28.61 |
| | | | | [M − fa3]++ | 1771.39 | 1771.28 | 0.27 |
| | | | | [M − fa4]++ | 1698.85 | 1698.86 | 0.23 |
| | [M + 4H]4+ | 11.3 | 11.3 | None | | | |
We next examined head-to-tail cyclized peptides in greater detail. To allow direct comparisons, MS/MS spectra for cyclic peptides were acquired under the same experimental parameters as for linear peptides, without specific adjustments. Since sequence-informative ions in cyclic peptides generally require two or more backbone cleavages, MSn experiments are often considered advantageous for structural elucidation. To test this assumption, we compared assignment metrics for CSA across MS/MS (MS2), MS3, and MS4 analysis. The structure of CSA with amino acid numbering is shown in Fig. 5A. For this comparison, the [M + 2H]2+ precursor was used and the resulting sequence, intensity, and signal coverage are shown in Fig. 5B. The obtained intensity coverage values were similar to those reported in the literature for cyclic peptides.20 MS3 improved sequence coverage by 8.7% compared to MS2, but intensity coverage decreased by 6.9%, likely because additional fragmentation distributed signal intensity over a larger number of ions. MS4 did not yield further improvement in sequence coverage over MS3. Two-dimensional fragment maps of CSA (Fig. 5C) illustrate the distribution of observed fragments and their relative intensities. The main fragment observed in MS2 corresponded to [8–6], matching the initial loss of Vnme as previously described for [CSA + Na]+ in MSn analyses.48 Notably, this visualization revealed differences in charge distribution: fragments detected with double charge were generally larger in size. This suggests that, in addition to intrinsic bond-cleavage probabilities, the ability of a fragment to retain charge influences whether it is ultimately observed in an MS/MS spectrum and thus available for structural interpretation. Comparable trends were observed in MS3 and MS4 spectra, but no systematic changes indicating improved sequence-level interpretation at higher MS order were identified. 
Therefore, MSn analysis may be more useful for the identification of individual fragments than for improving overall sequence coverage in cyclic peptides.
Finally, we investigated disulfide-containing peptides in more detail. Disulfide bonds are known to cleave inefficiently under CID but can be more effectively addressed with ETD. To assess this, we compared spectra obtained from CID alone (CID–CID) to spectra resulting from ETD followed by CID (ETD-CID). Since ETD efficiency increases with precursor charge, the highest available charge state was used for each peptide.49 For OXY, CID of [M + 2H]2+ yielded sequence, intensity, and signal coverages of 42.4%, 20.5%, and 24.8%, respectively. The ETD-CID spectrum showed lower coverage values (22.7%, 3.6%, and 12.5%). For SMT, CID of [M + 3H]3+ yielded sequence, intensity, and signal coverages of 74.3%, 51.2%, and 48.7%, respectively, while ETD-CID of [M + 2H]2+ showed values of 45.0%, 9.9%, and 26.2%. Thus, ETD-CID spectra displayed approximately half the sequence coverage and a roughly fivefold lower intensity coverage compared to CID spectra, likely reflecting less efficient ETD fragmentation and the dominance of lower-charged precursor ions that remained unfragmented. The corresponding assigned spectra can be found in the SI (S4). Fig. 6 shows the structures of OXY and SMT with annotated bond types and the distribution of fragment types from CID and ETD-CID. In OXY, the CID spectrum was dominated by b/y-type fragments, with additional cyclic fragments and low-intensity disulfide-related ions. By contrast, ETD-CID produced a higher relative proportion of disulfide cleavage products, consistent with ETD-mediated S–S bond opening. In SMT, CID spectra were dominated by cyclic-related fragments with some disulfide products, whereas ETD-CID retained cyclic fragmentation as the main contribution but showed an increased ratio of disulfide-related fragments compared to CID. Together, these results demonstrate that the algorithm is useful not only for sequence assignment but also for investigating how different fragmentation methods influence peptidomimetic fragmentation pathways.
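The "approximately half" and "fivefold" statements follow directly from the coverage values reported above. The short calculation below simply restates that arithmetic. The dictionary layout and function name are illustrative only; the numbers are those given in the text.

```python
# Coverage values (%) as reported in the text: (sequence, intensity, signal).
reported = {
    "OXY": {"CID": (42.4, 20.5, 24.8), "ETD-CID": (22.7, 3.6, 12.5)},
    "SMT": {"CID": (74.3, 51.2, 48.7), "ETD-CID": (45.0, 9.9, 26.2)},
}


def fold_changes(values):
    """Compare ETD-CID with CID for one peptide."""
    cid, etd = values["CID"], values["ETD-CID"]
    return {
        # Fraction of the CID sequence coverage retained under ETD-CID.
        "sequence_ratio": etd[0] / cid[0],
        # Factor by which intensity coverage drops from CID to ETD-CID.
        "intensity_fold_drop": cid[1] / etd[1],
    }


summary = {peptide: fold_changes(v) for peptide, v in reported.items()}
for peptide, result in summary.items():
    print(peptide, {k: round(x, 2) for k, x in result.items()})
```

Both peptides retain a little over half of the CID sequence coverage, while intensity coverage drops by a factor between five and six.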
Other data are available from the corresponding authors upon request.
This journal is © The Royal Society of Chemistry 2026