Jeric Mun Chung Kwan,ab Yaquan Liang,a Evan Wei Long Ng,a Ekaterina Sviriaeva,b Chenyu Li,a Yilin Zhao,a Xiao-Lin Zhang,a Xue-Wei Liu,a Sunny H. Wong b and Yuan Qiao *a
a School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, 21 Nanyang Link, 637371, Singapore. E-mail: yuan.qiao@ntu.edu.sg
b Lee Kong Chian School of Medicine, Nanyang Technological University, 11 Mandalay Road, 308232, Singapore
First published on 5th January 2024
Peptidoglycan is an essential exoskeletal polymer across all bacteria. Gut microbiota-derived peptidoglycan fragments (PGNs) are increasingly recognized as key effector molecules that impact host biology. However, the current peptidoglycan analysis workflow relies on laborious manual identification from tandem mass spectrometry (MS/MS) data, impeding the discovery of novel bioactive PGNs in the gut microbiota. In this work, we built a computational tool PGN_MS2 that reliably simulates MS/MS spectra of PGNs and integrated it into the user-defined MS library of in silico PGN search space, facilitating automated PGN identification. Empowered by PGN_MS2, we comprehensively profiled gut bacterial peptidoglycan composition. Strikingly, the probiotic Bifidobacterium spp. manifests an abundant amount of the 1,6-anhydro-MurNAc moiety that is distinct from Gram-positive bacteria. In addition to biochemical characterization of three putative lytic transglycosylases (LTs) that are responsible for anhydro-PGN production in Bifidobacterium, we established that these 1,6-anhydro-PGNs exhibit potent anti-inflammatory activity in vitro, offering novel insights into Bifidobacterium-derived PGNs as molecular signals in gut microbiota-host crosstalk.
While the chemical makeup of peptidoglycan polymers is largely conserved, the exact compositions and structural modifications of peptidoglycan are highly variable across bacteria and under different environmental conditions (Fig. 1).1,12,13 In general, the ‘glycan’ component of peptidoglycan consists of alternating units of N-acetylglucosamine (GlcNAc, or herein NAG) and N-acetylmuramic acid (MurNAc, or herein NAM) linked via β-1,4-glycosidic bonds; the ‘peptido’ portion refers to the short stem pentapeptide connected onto the lactoyl group of each NAM, which has the common sequence L-Ala1-γ-D-Glu/isoGln2-AA3-D-Ala4-D-Ala5, with AA3 being either L-Lys attached to a lateral bridge peptide (that is specific to each bacterial species) or a non-proteinogenic diamino acid such as meso-diaminopimelic acid (mDAP) (Fig. 1A). These stem peptides on adjacent glycan strands can form 3–4 or 3–3 crosslinks through iso-peptide bonds, thereby strengthening the peptidoglycan layer (Fig. 1C). Furthermore, a great deal of structural diversity in peptidoglycan comes from the cell wall remodeling process, where bacterial enzymes catalyze specific reactions at distinct positions in peptidoglycan to generate new structural moieties, such as modifications of the glycan backbone, trimming of pentapeptides to shorter stems, and incorporation of non-canonical D-amino acids (NCDAA) into the stem peptide (Fig. 1A and B).14 While most insights on peptidoglycan structural diversity were gained from analyses of model bacterial organisms, our knowledge of the scope and variety of peptidoglycan in the gut microbiota is still in its infancy. Recognizing the biological significance of peptidoglycan modifications, we seek to develop a robust and automated workflow to characterize peptidoglycan compositions and structural features in any bacteria of interest, especially those in the gut microbiota.
There are significant gaps in the current workflow of bacterial peptidoglycan analysis, with the widely adopted experimental procedure developed >30 years ago.15 Briefly, the peptidoglycan polymer (i.e., sacculi) isolated from bacteria is digested with a muramidase (e.g., lysozyme) that hydrolyzes the NAM-β-1,4-NAG linkages along the peptidoglycan backbone, generating soluble PGNs that are disaccharide-containing muropeptides in nature.16 The collection of these soluble PGNs is then subjected to high-performance liquid chromatography-tandem mass spectrometry (HPLC-MS/MS) analysis for structural characterization and profiling (Fig. 1D, top row). Improvements in HPLC-MS/MS instrumentation such as higher resolution and faster scanning rate have improved the quality of acquired data; however, analyzing raw MS data to elucidate PGN structures remains a painstaking manual task, where one needs to come up with the potential structures of PGNs (i.e., search space, which can be as large as >6000 structures on ChemDraw)17,18 and look for matches of the expected m/z values in the acquired LC-MS dataset. Such manual annotations of MS data are considerably time-consuming, laborious, and inconsistent, remaining as an undesirable bottleneck for robust and comprehensive peptidoglycan analysis with higher throughput.19,20 This may deter the discovery of novel structural features of peptidoglycan, especially in the gut microbiota, where the scope of peptidoglycan diversity has not been much explored.
To address these challenges, we present a novel and customizable PGN database integrated with in silico MS/MS spectra to enable automated MS/MS deconvolution for PGN identification (Fig. 1D, bottom row). The spectral library (.msp format) encompasses the in silico predicted MS/MS fragmentation for each PGN in the dataset, which is compatible with open-access and vendor software for automated matching and scoring of the experimental MS/MS peaks, thus streamlining PGN analysis with unmatched confidence and throughput. Applying this automated PGN analysis pipeline, we profiled the peptidoglycan compositions of five different gut bacteria. Intriguingly, an unusually high abundance of anhydro-PGNs (i.e., PGNs containing a 1,6-anhydro-muramyl moiety, anNAM) (Fig. 1A, far right) was found in Bifidobacterium, the common probiotic bacteria that confer anti-inflammatory effects in hosts.21,22 We further demonstrated that MltG and RpfB homologs in Bifidobacterium possess robust lytic transglycosylase (LT) activity towards distinct peptidoglycan substrates to generate anhydro-PGN moieties. Importantly, we established that these anhydro-PGNs of Bifidobacterium exhibit novel anti-inflammatory effects in vitro, which opens up exciting opportunities for postbiotic development.
Apart from its descriptive name, the PGN database (.xlsx) also includes chemical descriptors for individual PGNs, e.g., chemical formula, adduct m/z values, clogP, InChIKey, SMILES, and PGN-specific descriptors, e.g., the degree of acetylation, degree of amidation, and stem peptide length, thereby facilitating subsequent PGN categorization and comparative analysis (Fig. 2, right). Accompanying the PGN database, an image output that summarizes user-defined parameters is automatically generated for convenient referencing (Fig. 2, left). For a typical database of 3000–10000 PGNs, generation takes ∼1 min per 1000 PGNs when run on a computer with a 2.60 GHz processor and 16 GB RAM. To reduce analysis time, PGN_MS2 includes various ways to skip illogical/unreasonable PGN polymers (Table S1†).
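As an illustration of how the formula and adduct m/z descriptors relate to each other, the following is a minimal sketch that derives a monoisotopic mass from a chemical formula and then the m/z of the protonated adducts. It is not taken from PGN_MS2; the function names, the small element table, and the GlcNAc test formula (C8H15NO6) are assumptions chosen for the example.

```python
import re

# Monoisotopic masses (Da) of the elements needed here; extend as required.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}
PROTON = 1.00727647  # mass of a proton (Da)


def monoisotopic_mass(formula):
    """Sum element masses for a simple formula such as 'C8H15NO6'.

    Hypothetical helper: handles only flat formulas without brackets.
    """
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def adduct_mz(neutral_mass, charge):
    """m/z of the [M + nH]n+ adduct for a neutral monoisotopic mass."""
    return (neutral_mass + charge * PROTON) / charge


# GlcNAc (C8H15NO6) is used purely as a small, well-known test molecule.
glcnac = monoisotopic_mass("C8H15NO6")
for z in (1, 2, 3):
    print(f"[M+{z}H]{z}+ m/z = {adduct_mz(glcnac, z):.4f}")
```

The same two steps apply to any PGN formula in the database; only the element table would need to cover the elements present.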
To derive in silico PGN MS/MS spectra, we first studied the ESI-MS/MS spectra of known PGNs. Recent studies by Tan et al. and Anderson et al. reported the experimental MS/MS spectra for selected PGNs from E. coli, S. aureus, and P. aeruginosa, providing a suitable starting point for our evaluation.17,29 In addition, we also acquired experimental LC-HRMS/MS data for several major PGNs with known structures from E. faecalis and L. plantarum. Notably, these spectra were acquired using different MS instruments, namely, Orbitrap Exploris 120 (our study), LCQ Fleet (Tan et al.), and Q-TOF (Anderson et al.), enabling us to derive common ESI-MS/MS fragmentation rules for most PGNs. We recognized that the PGN precursor ions frequently undergo B/Z-type glycan fragmentation (nomenclature according to Domon and Costello30) and b/y-type peptide fragmentation, with multiple b/y cleavages to yield lighter ions (Fig. 3A). Additionally, the lactoyl bond connecting the glycan and peptide in PGNs also fragments readily, with the peptide fragment ion henceforth named L (Fig. 3A). Furthermore, isomeric PGNs that contain stem peptides such as Aqm and Aem(NH2) with differing amidation positions can be easily distinguished by their MS/MS patterns (Fig. S3†). The y2 peptide fragments (i.e. qm or em(NH2), m/z: 319.1619) undergo further e1/e2 or q1/q2 fragmentations due to prominent neutral losses at the N-terminus.31 For instance, em(NH2) yields 301.1465 (e1) and 256.1280 (e2) fragments, whereas qm gives rise to signature MS/MS peaks of 302.1347 (q1) and 257.1103 (q2); with q2 fragments showing higher relative intensities (Fig. S3A and B†). These abundant MS/MS features are useful to distinguish PGNs that bear e or q in the stem peptides, as in the case of L. plantarum (Fig. S3†). Upon evaluating the experimental MS/MS spectra for ∼30 PGNs, we found that most of the fragmentation peaks can be explained by 19 fragmentation reactions or a combination thereof (Fig. 3A).
Based on these common fragmentation reactions, we developed PGN_MS2, an in silico MS/MS prediction tool for PGNs. As shown in Fig. 3B, we encoded each fragmentation as a chemical reaction in SMARTS and simulated it with RDKit.23 Each parental PGN ion (generation-0) is fragmented via all possible 19 reactions to form generation-1 product ions, which are further fragmented to yield generation-2 and generation-3 product ions sequentially. Fragmentation is discontinued after no new product ions are generated. For every fragmentation, the m/z value and relative intensity for each fragment are calculated. Relative intensity is estimated based on an empirically derived formula (that accounts for the number of peptide bonds or mass ratio of the precursor and product ions) together with a fragmentation-specific adjustment factor. Finally, the assembly of possible fragment ions affords the in silico predicted MS/MS spectra. To account for the different precursor adducts (i.e., [M + H]+, [M + 2H]2+, and [M + 3H]3+), separate MS/MS spectra are created for each adduct, whereby fragment ions with m/z greater than that of the precursor ion are removed. In sum, our PGN library integrates the predicted MS/MS spectra of all PGNs in the database as a NIST format text file (.msp, Fig. S2C†).
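The generation-wise expansion described above can be sketched as a breadth-first search. This is a toy illustration, not the PGN_MS2 code: ions are modelled as tuples of residue labels rather than RDKit molecules, the two reaction functions are hypothetical stand-ins for the 19 SMARTS-encoded reactions, and the cap of three generations is our reading of the text.

```python
from collections import deque


def expand_fragments(parent, reactions, max_generation=3):
    """Apply every reaction to every newly formed ion, generation by generation.

    Stops when no new product ions appear or max_generation is reached.
    Returns a dict mapping each ion to the generation in which it first arose.
    """
    seen = {parent: 0}
    queue = deque([parent])
    while queue:
        ion = queue.popleft()
        generation = seen[ion]
        if generation >= max_generation:
            continue
        for reaction in reactions:
            for product in reaction(ion):
                if product not in seen:  # keep only genuinely new ions
                    seen[product] = generation + 1
                    queue.append(product)
    return seen


def drop_above_precursor(fragment_mzs, precursor_mz):
    """Discard fragment m/z values greater than the precursor m/z."""
    return [mz for mz in fragment_mzs if mz <= precursor_mz]


# Hypothetical toy reactions: cleave one unit from either end of the chain.
def lose_last_unit(ion):
    return [ion[:-1]] if len(ion) > 1 else []


def lose_first_unit(ion):
    return [ion[1:]] if len(ion) > 1 else []


parent = ("NAG", "NAM", "A", "e")  # toy precursor, not a real PGN
fragments = expand_fragments(parent, [lose_last_unit, lose_first_unit])
for ion, generation in sorted(fragments.items(), key=lambda kv: kv[1]):
    print(generation, "-".join(ion))
```

Tracking the set of ions already seen is what terminates the search: a product reachable by several routes is recorded once, at the earliest generation that forms it.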
Next, we confirmed that the in silico predicted spectra by PGN_MS2 match well with the MS/MS spectra acquired using either an Orbitrap spectrometer via higher-energy C-trap dissociation (HCD)-based fragmentation or a Q-TOF instrument via collisional dissociation (CID)-based fragmentation (Fig. S5A–D†).33 In addition, to benchmark our PGN_MS2 with the available PGN dataset, we also evaluated the experimental data of P. aeruginosa PGNs deposited by Anderson et al., which was collected using a Q-TOF mass spectrometer.17 Consistently, using PGN_MS2 and MS-DIAL, we readily confirmed 54 of the 62 PGNs with MS/MS data that were manually identified in the previous work (Fig. S5E–F†). Taken together, our observations demonstrate the robustness and reliability of PGN_MS2 in simulating ESI-MS/MS spectra of PGNs for structural determination.
To further investigate if PGN_MS2 could indeed aid accurate assignment of PGNs among closely related structural isomers, we challenged it to identify the canonical E. coli or S. aureus PGN, (NAG)(NAM)-AemA and (NAG)(NAM)-AqKAA[3-NH2-GGGGG] respectively, from a set of four intentionally generated mock PGNs with identical molecular formulae (Fig. S6†). Satisfactorily, we correctly assigned the two PGN structures, since they both emerged as the top hits with the highest spectral similarity scores compared to other possible isomers, albeit by a small margin (Fig. S6†). Based on our analysis, we noted that although the top matched in silico PGN usually represents the accurate structure, other criteria such as the presence or absence of certain signature MS/MS fragments are particularly useful for PGN determination too. For instance, fragments containing the intact mDAP–mDAP bond (i.e., m/z: 617.2777, 746.3203, and 889.3785) are observed in the MS/MS spectra of the 3–3 but not 3–4 crosslinked PGNs in E. coli, allowing convenient distinction between the two isomers (Fig. 4C and S7†). Therefore, it is prudent to check for these signature fragments for PGN identification. To assist with this, PGN_MS2 also annotates the chemical structures of each fragment in the predicted MS/MS spectra as SMILES (Fig. S2D†).
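Checking a spectrum for signature fragments, as recommended above, amounts to a tolerance search over the peak list. The sketch below is illustrative only: the three target m/z values are the mDAP–mDAP fragments quoted in the text, whereas the function names, the 10 ppm tolerance, and the observed peak list are assumptions.

```python
def ppm_error(observed, theoretical):
    """Signed mass error of an observed m/z in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def find_signature_fragments(peaks_mz, signature_mz, tol_ppm=10.0):
    """Map each signature m/z to the closest observed peak within tol_ppm.

    A value of None means the signature fragment was not observed.
    """
    result = {}
    for target in signature_mz:
        hits = [mz for mz in peaks_mz if abs(ppm_error(mz, target)) <= tol_ppm]
        result[target] = min(hits, key=lambda mz: abs(mz - target)) if hits else None
    return result


# Fragments retaining the intact mDAP-mDAP bond (values from the text).
MDAP_MDAP_SIGNATURES = (617.2777, 746.3203, 889.3785)

# Hypothetical peak list, invented for this example.
observed = [204.0866, 617.2781, 746.3199, 889.3790, 1000.5]

matches = find_signature_fragments(observed, MDAP_MDAP_SIGNATURES)
is_3_3_candidate = all(hit is not None for hit in matches.values())
print(matches, is_3_3_candidate)
```

In practice the tolerance should follow the mass accuracy of the instrument, and the presence of signature fragments would be weighed alongside the spectral similarity score rather than replace it.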
Fig. 4 Summary of peptidoglycan compositions in the model (A) and gut bacteria (B) with the canonical makeup shown in bold and variable components not bolded. For instance, the canonical makeup in E. coli is (NAG)(NAM)-AemAA and the fifth amino acid, Ala, can be substituted with His, Gly, or Lys. The total number of PGNs identified in each species of bacteria is listed. Compositions for E. faecalis and F. nucleatum are shown in Fig. S8G† instead. PGN_MS2 enables distinctions between isomeric PGNs by matching experimental spectra against in silico predicted MS/MS patterns for: (C) tetrapeptide-tripeptide dimers with either 3–4 or 3–3 crosslinks in E. coli; (D) monomeric PGNs that incorporate Gly at either the 4th or 5th position in E. faecium. The key fragments that are essential for resolving the respective isomers are highlighted in yellow. Fig. S6, S7, S9, and S10† showcase additional examples of differentiating isomeric PGNs by PGN_MS2.
Amidation of stem peptides is a unique feature in PGNs of Gram-positive bacteria (Fig. 5B).1 For instance, the canonical monomeric PGNs in E. faecium and L. plantarum each contain two possible amidated residues in the stem peptides, q and isoAsn, q and m(NH2), respectively (Fig. 4A). Although most PGNs in both bacteria are amidated at both positions, substantial amounts of singly amidated PGNs are also observed, which require MS/MS analysis to determine the exact amidation position in the isomeric PGNs (Fig. S3†). In addition, some L. plantarum PGNs have D-lactate instead of D-Ala at the stem peptide's terminus,43 which further complicates identification. The three structural isomers, (NAG)(NAM)-Aem(NH2)AA, (NAG)(NAM)-AqmAA, and (NAG)(NAM)-Aqm(NH2)ALac have identical m/z values that are indistinguishable solely based on MS1 analysis and require in-depth MS/MS evaluation. With our approach, the in silico predicted MS/MS spectra by PGN_MS2 revealed signature fragments for each of the three PGN isomers, which significantly improved the confidence and throughput of MS/MS identification (Fig. S9†). For instance, the experimental spectra of (NAG)(NAM)-Aqm(NH2)ALac showed the best match to the in silico spectra for this particular isomer and contained all key fragments, allowing us to easily assign the correct structure (Fig. S9†). Moreover, with our MS/MS-integrated analysis pipeline, we also uncovered that amidation at the second residue (q) of the stem peptide is more prominent than that at the side chain (β-Asp) in E. faecium, whereas similar amidation rates were observed for both q and m(NH2) in PGNs of L. plantarum (Fig. S8A†).43 Recognizing that bacterial peptidoglycan amidations are associated with increased levels of crosslinking and also implicate antibiotic resistance,44–49 we anticipate that our workflow for the facile analysis of such amidated PGNs will facilitate the development of novel antimicrobials targeting bacterial peptidoglycan amidations.
Fig. 5 Summary of peptidoglycan features in model and gut bacteria. (A) Varying lengths of stem peptide in monomeric PGNs across bacteria. (B) Amidation rate in stem peptides across bacteria. (C) Frequency of NCDAA incorporation in stem peptides across bacteria. (D) Amount of anNAM termini in bacteria. Bifidobacterium spp. showcase a high abundance of anNAM that differs from that of typical Gram-positive bacteria. All statistics indicate the relative muropeptide composition (in %) except for (B), where the amidation rate is instead defined as the number of amidated residues (γ-D-isoGln/β-D-isoAsn/mDAP(NH2)) per muropeptide. L. plantarum, E. faecium, and B. adolescentis feature two amidated amino acids, and the values shown are the combined rates for both. The data represent the average of three to four biological replicates with error bars representing standard deviations. Additional profiling analysis can be found in Fig. S8A–F.†
Peptidoglycan crosslinking via stem peptides confers strength and resistance to certain antibiotics and stress conditions. For instance, E. coli typically manifests 3–4 crosslinking but significantly increases 3–3 crosslinking under stress conditions.50,51 The 3–4 and 3–3 crosslinked tripeptide-tetrapeptide dimeric PGNs are structural isomers that differ only in the isopeptide bond position, which were easily distinguished using our MS/MS-integrated PGN analysis workflow (Fig. 4C and S7†). Interestingly, across all bacteria, we also detected tetra-saccharide PGN dimers that are isomeric to the crosslinked dimers (Fig. S10†). Although such tetra-saccharide motifs are possible products of incomplete muramidase digestion during sample preparation, additional rounds of enzymatic digestion could not fully eliminate them.52 Compared to the crosslinked PGN dimers, these tetra-saccharide PGNs generally yielded fewer MS/MS fragments with lower relative intensity for B-type fragments and higher intensity for L-type fragments (Fig. S10†), which is consistent with the presence of only one terminal GlcNAc and two free-stem peptides in these structures. The ability to easily identify such tetra-saccharide PGNs in our workflow may provide the impetus to investigate their physiological relevance in bacteria.
We first elucidated the canonical PGN makeup in the respective gut bacteria (Fig. 4B). A. muciniphila possesses mDAP-type PGNs,56 similar to most other Gram-negative bacteria. However, F. nucleatum PGNs exclusively feature the non-proteinogenic lanthionine at the third position of the stem peptide, whose structure closely resembles that of mDAP.58,59 On the other hand, Gram-positive Bifidobacterium spp. possess either L-Lys or L-Orn as the third residue that is further appended with distinct bridge peptides (Fig. 4B).57 Surprisingly, we found that whereas the L-Lys containing PGNs are only minor constituents in B. bifidum and B. infantis (2.7 and 3.6% respectively, Fig. S8B†), they are the major constituents in B. adolescentis (62.3%, Fig. S8B†). This could imply that MurE, the ligase that incorporates the third amino acid residue in soluble peptidoglycan precursors, exhibits unique substrate tolerances amongst different species of Bifidobacterium. Furthermore, B. adolescentis PGNs also sport an identical bridge peptide (i.e., β-Asp/β-isoAsn) as those in E. faecium and L. lactis,42,61,62 which are constructed by the sequential enzymatic activities of the D-aspartate ligase, Aslfm, and the asparagine synthase, AsnH.61–63 Consistently, B. adolescentis encodes homologs of both enzymes (Table S14†).
Evaluating the lengths of stem peptides in PGNs across different bacteria, we found that PGNs in F. nucleatum, B. infantis, and S. aureus predominantly possess penta- and tetra-peptides, whereas B. adolescentis, L. plantarum, and E. faecium showcase variable PGNs with shorter stems ranging from one to four amino acids, which are likely products of enzymatic cleavages by DD-carboxypeptidases, LD-endopeptidases or DL-endopeptidases during PG maturation in bacteria (Fig. 5A).14 Recent studies have revealed that SagA-like DL-endopeptidases secreted by commensal gut bacteria such as E. faecium and Lactobacillus generate bioactive PGN motifs that regulate host gut homeostasis.42,64,65 Interestingly, both B. bifidum and B. adolescentis have a significant proportion of PGNs with dipeptide stems (∼15%) (Fig. 5A), suggesting the activities of SagA-like enzymes in these two Bifidobacterium that could be potentially relevant to their anti-inflammatory effects.
In all bacteria, NCDAAs are commonly found in the stem peptides of PGNs, substituting D-Ala in the fourth or fifth position (Fig. 4A and B and 5C).12 PGNs from A. muciniphila and F. nucleatum mostly contain basic NCDAAs such as His, Arg, Asn, or Lys at the fifth position of the stem peptides (Fig. 4B and S8G†), which could be incorporated by transpeptidases and/or Ddl in these bacteria.66 Notably, E. coli possesses the greatest diversity of NCDAAs in PGNs, including Phe, Tyr, Gly, Lys, Cys, Arg, etc., whereas other bacteria, namely B. adolescentis, B. infantis, L. plantarum, and S. aureus, appear to solely utilize Gly as the non-canonical amino acid in the PGN stem peptides (Fig. 4A and B). Empowered by in silico MS/MS spectral references, we readily distinguished PGN isomers with Gly at either the fourth or fifth position of the pentapeptide stem in E. faecium (Fig. 4D). NCDAAs in peptidoglycan confer bacterial resistance against hydrolases of rival bacterial species and are consistently found at elevated levels in bacteria under stress conditions.12,67 Our work reveals that NCDAAs are more widespread in bacterial PGNs under steady-state conditions than was previously appreciated.
Besides stem peptide motifs, we also profiled structural features on the (NAG)(NAM) backbone in PGNs across bacteria, including O-acetylation (i.e., DAG and DAM) or de-N-acetylation (i.e., G and MUR) (Fig. 4A and B).68 Modifications to acetylation in peptidoglycan may help bacteria evade lytic enzymes such as lysozyme.69 With our MS/MS-integrated analysis workflow, we could readily determine if acetylation/de-acetylation occurs on the NAG or NAM residue in disaccharide PGNs. Such alterations only account for a minor extent (<5%) in PGNs of B. bifidum and L. plantarum; hence no significant changes in the overall acetylation rate of PGNs were observed for most bacteria (Fig. S8E and F†). One remarkable exception is A. muciniphila that showcases 43% de-N-acetylation of NAG (Fig. S8E and F†), which is in good agreement with the recent analysis by Garcia-Vello et al. (∼40%).56 Notably, these de-N-acetylated PGNs are still potent agonists to both NOD1 and NOD2 immune sensors;56 thus, it remains to be determined if such de-N-acetylated motifs exhibit any distinct functions in the host.
Next, 1,6-anhydroMurNAc (anNAM) termini are unique features that mark the end of the peptidoglycan strands in Gram-negative bacteria.20 Correspondingly, anhydro-PGNs constituted 4–5% of total peptidoglycan composition in Gram-negative bacteria, E. coli and F. nucleatum, but are nearly undetectable in model Gram-positive bacteria and A. muciniphila (Fig. 5D).56 Surprisingly, we found that all three Bifidobacterium spp. contain a remarkably high abundance of anhydro-PGNs, which is unusual for Gram-positive bacteria (Fig. 5D and S11†). For instance, the anNAM-containing PGNs comprise nearly 40% of total PGNs in B. adolescentis (Fig. 5D). The exceedingly high amounts of anhydro-PGNs in Bifidobacterium suggest the presence of active lytic transglycosylases (LTs) in catalyzing the non-hydrolytic cleavage of the peptidoglycan backbone, which are elusive in Gram-positive bacteria.52,70,71 We next set out to establish putative LTs in Bifidobacterium responsible for anhydro-PGN formation.
Fig. 6 Bifidobacterium anhydro-PGNs from the cleavage of lytic transglycosylases (LTs) exhibit potent anti-inflammatory effects in vitro. (A) Biochemical reconstitution of recombinant Bifidobacterium MltG with nascent peptidoglycans as substrates. Lipid II was extracted from E. faecalis. (B) LC-MS chromatograms of the muropeptide products indicate the formation of anhydro-PGNs, II. Extracted ion chromatograms (EICs) for the [M + 2H]2+ adduct are shown. Additional control experiments are shown in Fig. S15.† (C) Pre-treatment of synthetic anhydro-PGN (ah-PGN), (NAG)(NAM)-AeKAA, significantly suppresses LPS-induced inflammatory responses in murine macrophage RAW264.7 cells. The synthetic ah-PGN mimics the natural anhydro-PGNs found in B. adolescentis.
Apart from the well-characterized LTs in Gram-negative bacteria, certain Gram-positive bacteria that undergo dormancy also encode a large family of cell wall lytic enzymes that are known as resuscitation-promoting factors (Rpfs), some of which are LTs.76,77 Since Bifidobacterium can also enter a viable but non-culturable (VBNC) state similar to dormancy,78,79 we explored if Bifidobacterium could possess any Rpfs with LT activity. Using sequence similarity searching by BLAST, we identified two candidate proteins (RpfB-FL and RpfB-Truncated) in Bifidobacterium containing the lysozyme-like domain (IPR023346) that show weak homology to M. tuberculosis and S. coelicolor RpfB (Fig. S12A and C†). Interestingly, both the full-length and truncated RpfB proteins of B. adolescentis display dual LT and amidase activities with bacterial sacculi in vitro (Fig. S16†), indicating their possible involvement in sacculi remodeling. Taken together, our results established three bona fide LTs (MltG, RpfB-FL, and RpfB-Truncated) in Bifidobacterium that may act in concert to contribute to the high abundance of anNAM in Bifidobacterium peptidoglycan.
Firstly, our in silico PGN MS1 database, which centers around the (NAG)(NAM)-containing disaccharide muropeptide as the core PGN structure, is customizable with user-defined parameters to accommodate diverse structural modifications and polymerizations/crosslinking in PGNs. Currently, our algorithms support most known PGN modifications as built-in selections, including O-acetylation, de-N-acetylation, NCDAA incorporation, 3–3/3–4 crosslinking, etc.; additional structural features can be conveniently incorporated to expand the search space for identification of novel PGNs in the gut microbiota.
Secondly, for each PGN molecule in the MS1 database, an in silico predicted MS/MS pattern is automatically generated by PGN_MS2. The collection of these simulated MS/MS spectra affords a comprehensive in silico PGN spectral library that enables automated analysis. In contrast to PGFinder,35 a PGN analysis pipeline based solely on MS1 values, our PGN MS library integrates in silico MS/MS spectral prediction, marking a significant advance for accurate and robust PGN identification. Similar to the iterative searching strategy in PGFinder,35 we also recommend users specify selective parameters to build the in silico PGN polymer pool focusing on the major canonical features in the PGN monomers, to reduce the number of possible polymers created. As a novel feature of our PGN library, the PGN_MS2 tool also outputs an image summarizing the diversity of PGNs with their respective nomenclatures and chemical- and PGN-specific properties.
Notably, our PGN_MS2 represents a dedicated in silico MS/MS spectral prediction algorithm for PGNs, whose unique sugar and non-proteinogenic amino acids defy reliable predictions by existing tools developed for small molecules or proteins. For instance, the proteomics analysis software Byonic has been previously used for PGN analysis,19 which takes a peptide-centric approach such that common PGN structures are viewed as variable modifications of the stem peptide. As a result, one needs to manually annotate the masses of various moieties, such as anhydro- and de-acetylation of the disaccharide backbone, and non-proteinogenic and amidated amino acids, for PGN search and analysis. In contrast, PGN_MS2 is specialized to predict MS/MS spectra for PGN chemotypes, where the user simply selects the desired structural features of PGNs without needing to calculate and input their respective masses, rendering the analysis process user-friendly and flexible to accommodate novel modifications.
To validate the reliability of our workflow, we compared the cosine similarity scores between the in silico predicted spectra of PGNs and the authentic spectra of several PGN motifs. Remarkably, PGN_MS2 consistently outperformed other spectral simulation software packages in metabolomics and proteomics. Moreover, the PGN_MS2 predicted spectra matched well with the fragmentation data acquired using different instruments (i.e., Orbitrap and Q-TOF), showcasing the congruity of the in silico PGN fragmentation rules. We further demonstrated the facile and accurate assignment of closely related PGN isomers via automated spectral matching and scoring. However, we also noted that the experimental MS/MS spectra of low abundant analytes tend to have lower quality, which led to the top predicted PGN structures having very close similarity scores. In these cases, manual inspections are needed to ensure accurate structural assignment. To facilitate such manual analysis, our PGN_MS2 records the precursor, fragmentation type, and chemical structure of all fragment peaks generated (Fig. S2D†).
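For readers unfamiliar with the metric, the following is a minimal sketch of a cosine similarity score between a predicted and an experimental spectrum. It is a generic illustration under stated assumptions (greedy one-to-one peak matching within a fixed 0.01 Da window, unweighted intensities) and not the scoring function of MS-DIAL or of any other specific package; the function name and example spectra are hypothetical.

```python
import math


def cosine_similarity(spec_a, spec_b, tol=0.01):
    """Cosine score between two spectra given as lists of (m/z, intensity).

    Each peak in spec_a is paired with the closest unused peak in spec_b
    lying within tol Da; unmatched peaks contribute only to the norms.
    """
    used = set()
    dot = 0.0
    for mz_a, int_a in sorted(spec_a):
        best, best_diff = None, tol
        for j, (mz_b, _) in enumerate(spec_b):
            diff = abs(mz_a - mz_b)
            if j not in used and diff <= best_diff:
                best, best_diff = j, diff
        if best is not None:
            used.add(best)
            dot += int_a * spec_b[best][1]
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical spectra: two of three peaks are shared.
predicted = [(204.087, 100.0), (319.162, 50.0), (617.278, 25.0)]
measured = [(204.086, 90.0), (319.163, 60.0), (450.200, 10.0)]
print(round(cosine_similarity(predicted, measured), 3))
```

When two candidate isomers share most fragments, their scores against the same experimental spectrum differ only through the few distinguishing peaks, which is why close scores call for the manual inspection described above.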
During the preparation of our manuscript, Hsu et al. reported a high-throughput automated muropeptide analysis (HAMA) framework that generates in silico MS/MS fragments for PGN analysis.85 However, we note several key distinctions between our workflow and HAMA. First, for in silico prediction of MS/MS patterns, HAMA focuses on fragmentation of the stem peptide, solely generating the b- and y-ions of stem peptides without any fragmentation of the sugar moieties in PGNs. Second, HAMA restricts the types of PGN modifications to <6 (including those on sugar motifs, peptide amidations, etc.) to avoid mass coincidences. On the other hand, our PGN_MS2 is developed especially for simulating MS/MS patterns of soluble muropeptide chemotypes, whose fragmentation rules were derived from empirical analysis that includes both sugar and peptide moieties in PGNs, showcasing superior matches to actual MS/MS data from HCD and CID fragmentations. As a result, our workflow accommodates much more diverse PGNs in the database and accurately distinguishes structural isomers by MS/MS matching. Notably, HAMA is reportedly unable to differentiate the 3–4 and 3–3 crosslinks in dimeric PGNs and hence can only consider 3–4 crosslinks currently. In contrast, with our PGN workflow, the 3–4 and 3–3 crosslinked PGN isomers can be readily identified with signature fragments from the in silico MS/MS patterns (Fig. 4C and S7†). Moreover, our PGN_MS2 also includes specific fragmentations pertinent to isoGln/Glu (q1/q2 and e1/e2), exhibiting its unique power in determining the amidation positions on isomeric PGNs (Fig. S3 and S9†). Furthermore, PGN_MS2 creates a PGN MS library that is compatible with various vendor or open-source MS analysis software, offering users flexibility in choosing their preferred platforms for data analysis.
To promote open-access research in the PGN field, PGN_MS2 itself is open source (https://github.com/jerickwan/PGN_MS2), where users can download directly to use or modify the code to increase the scope of fragmentations and the identities of amino acids and/or glycan motifs, etc. Recognizing the lack of PGNs in the existing metabolomics databank, we also uploaded the annotated MS/MS spectra of PGNs across different bacteria to the metabolomic data repository MoNA (https://mona.fiehnlab.ucdavis.edu/spectra/browse?query=exists(tags.text:%27QiaoLab_PGN%27)).
Aided by PGN_MS2, we uncovered that Bifidobacterium spp. feature a large abundance of anNAM termini in peptidoglycan, which are non-hydrolytic cleavage products of LT enzymes.72 By homology searching, we identified and biochemically characterized three enzymes as LTs in Bifidobacterium, namely, MltG, RpfB-FL, and RpfB-Truncated. Interestingly, MltG strictly requires nascent peptidoglycan strands as substrates for non-hydrolytic cleavage, whereas RpfBs robustly use mature sacculi to produce anhydro-NAM termini. The complementary substrate preferences of these LTs may account for the remarkably high amount of anhydro-PGNs. Importantly, Bifidobacterium spp. are well-known probiotics that confer beneficial effects on hosts such as reducing LPS-induced inflammation in vitro and in vivo.21,22 We demonstrated that pre-treatment with anhydro-PGN effectively suppressed LPS-induced proinflammatory cytokine expression in murine macrophages in vitro. As Bifidobacterium anhydro-PGNs are non-agonistic to canonical NOD1 and NOD2 immune receptors,81–84 the underlying mechanisms of their anti-inflammatory roles are yet to be elucidated. We are currently working to genetically manipulate putative LTs in Bifidobacterium spp. to evaluate the anti-inflammatory activities of the mutants in vivo, which may lead to improved probiotics.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3sc05819k
This journal is © The Royal Society of Chemistry 2024