Framework for De novo Sequencing of Peptide Mixtures via Network Analysis and Two-Dimensional Tandem Mass Spectrometry
Abstract
Two-dimensional tandem mass spectrometry (2D MS/MS) provides in-depth biopolymer structural information that was previously not directly accessible with traditional one-dimensional MS/MS workflows, and does so in significantly less time (<1 second per sample). In this study, we enhance 2D MS/MS data analysis for greater applicability in omics workflows and address challenges in sequencing peptides in mixtures. We designed a graph-theory-based framework to efficiently manage, visualize, and maximize the structural information extractable from 2D MS/MS spectra. Graph analysis algorithms, including a PageRank-based method, are shown to deconvolve MS/MS signals and group together product ions from the same precursor peptide, enabling the reconstruction of peptide fragmentation trees. From these trees, MSn information can be extracted to improve sequencing accuracy relative to current MS/MS methods. We also introduce a computationally efficient de novo sequencing approach that leverages this structural information to reduce reliance on databases and sample separation, while also enabling the rapid sequencing of post-translationally modified peptides. Tests on simulated 2D MS/MS spectra, designed to mimic those from proteomic samples, achieved high precision in signal assignment. Proof-of-concept studies were conducted on real data from simple mixtures of short-chain peptides, showing the potential applicability of combining network analysis with de novo sequencing to analyze unknown peptide mixtures. We anticipate that this technique will complement proteomics workflows and facilitate direct biopolymer structural analysis.
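To make the idea of grouping product ions by precursor more concrete, the following is a minimal sketch of one way a PageRank-style score could be used on a precursor/product ion graph. It is an illustration under stated assumptions, not the method used in this study: the node labels ("P1", "a", ...), the edge weights (standing in for 2D MS/MS signal intensities), and the helper names personalized_pagerank and group_products are all hypothetical. The sketch runs a personalized PageRank seeded at each precursor and assigns every product ion to the precursor whose seeded walk scores it highest.

```python
from collections import defaultdict


def personalized_pagerank(edges, seed, damping=0.85, iters=100):
    """Power-iteration personalized PageRank on an undirected weighted graph.

    edges: dict mapping (node_a, node_b) -> positive weight
           (assumption: weight stands in for a 2D MS/MS signal intensity)
    seed:  node that receives all of the teleport probability
    """
    adj = defaultdict(dict)
    for (a, b), w in edges.items():
        adj[a][b] = w
        adj[b][a] = w
    nodes = list(adj)
    total_weight = {n: sum(adj[n].values()) for n in nodes}

    # Start with all probability mass on the seed node.
    rank = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iters):
        # Teleport step: only the seed receives the (1 - damping) mass.
        nxt = {n: (1.0 - damping) if n == seed else 0.0 for n in nodes}
        # Walk step: each node spreads its mass in proportion to edge weight.
        for n in nodes:
            for m, w in adj[n].items():
                nxt[m] += damping * rank[n] * w / total_weight[n]
        rank = nxt
    return rank


def group_products(edges, precursors):
    """Assign each product ion to the precursor with the highest seeded score."""
    scores = {p: personalized_pagerank(edges, p) for p in precursors}
    products = {n for pair in edges for n in pair} - set(precursors)
    groups = {p: [] for p in precursors}
    for ion in sorted(products):
        best = max(precursors, key=lambda p: scores[p][ion])
        groups[best].append(ion)
    return groups


# Hypothetical toy graph: two precursors, four product ions, and one weak
# interfering signal linking precursor P2 to product ion "b".
toy_edges = {
    ("P1", "a"): 10.0,
    ("P1", "b"): 8.0,
    ("P2", "c"): 9.0,
    ("P2", "d"): 7.0,
    ("P2", "b"): 1.0,
}
groups = group_products(toy_edges, ["P1", "P2"])
print(groups)
```

In this toy case the weak P2-to-"b" edge does not pull "b" away from P1, because the walk seeded at P1 places far more mass on "b" than the walk seeded at P2. Real spectra would need m/z tolerances, noise handling, and a principled way to identify precursor nodes, none of which are modeled here.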