Morten Thaysen-Andersen (a), Martin R. Larsen (b), Nicolle H. Packer (a) and Giuseppe Palmisano* (b, c)
(a) Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, Australia
(b) Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, DK-5230, Odense, Denmark. E-mail: giuseppe@bmb.sdu.dk; Fax: +45 6550 2467; Tel: +45 6550 2342
(c) Institute of Biomedical Sciences, Department of Parasitology – USP, São Paulo, Brazil
First published on 16th September 2013
Sialic acids are carried by glycoproteins, proteoglycans and glycolipids as terminal entities of larger glycan structures and form a heterogeneous group of important monosaccharides in a wide range of biological systems in nature. Spatial and temporal structural characterisation of sialoglycoconjugates is required to understand their function. In this first of two related reviews we outline the available strategies for the analysis of mammalian N- and O-linked glycoprotein sialylation and summarise the associated sample handling methodologies that are a prerequisite for successful experimental designs including methods for enrichment, isolation, derivatisation and metabolic labelling. The downstream liquid chromatography (LC) mass spectrometry (MS) based separation and detection of N- and O-linked glycoprotein sialylation is covered in the second review. Since glycoprotein sialylation can be studied on multiple analyte levels, the analytical strategies and pre-LC-MS methodologies are covered separately for sialoglycans, sialoglycopeptides and intact sialoglycoproteins. Workflows to analyse glycoprotein sialylation at the glycomics level are particularly mature and the analytical chemist has multiple tools and technologies to acquire structural information on released glycans even at the system-wide level. The availability of analytical tools to study site-specific glycoprotein sialylation in the form of sialoglycopeptides or intact sialoglycoproteins is increasing through the development of sialic acid specific enrichment and labelling tools. However, the glycoproteomics route remains comparatively more challenging even when relatively simple protein mixtures are analysed. Evidenced by the wealth of available literature reviewed here, the glycoscience community has invested significant efforts to improve the analysis of glycoprotein sialylation.
The expression of sialic acids was previously thought to be unique to organisms of the deuterostome lineage as well as fungi and pathogenic bacteria; however, the development of more sensitive and accurate analytical techniques has revealed that sialic acids are distributed across a wide range of species and that they may be quite ancient in their origin.3–6 As such, it is estimated that sialic acids have populated the Earth for around 500 million years, starting from the Cambrian explosion and colonising the entire phylogenetic spectrum from the primitive platyhelminth Polychoerus carmelensis to eukaryotes.7
The most common mammalian sialic acids are N-acetylneuraminic acid (Neu5Ac) and N-glycolylneuraminic acid (Neu5Gc).8,9 Neu5Gc nucleotide sugars, cytidine 5-monophosphate (CMP)-Neu5Gc, are not synthesised in humans due to an irreversible mutation in the gene encoding the CMP-Neu5Ac hydroxylase, the enzyme responsible for the conversion of CMP-Neu5Ac to CMP-Neu5Gc.10 Neu5Gc, which has been described to be antigenic for humans,11,12 has, nonetheless, been found in sialoglycoconjugates isolated from cultured human cells and tissues,12,13 in particular in fetal tissue and malignant tumours.12,14–19 However, it cannot be excluded that the Neu5Gc was introduced from exogenous sources in these studies.20 The synthesis and degradation of glycoconjugate sialylation is mediated directly by sialyltransferases and sialyl hydrolases (also known as neuraminidases and sialidases),21 but additional enzymes and transporters are responsible for the generation, regulation and availability of CMP-Neu5Ac substrates in the cell. The possibility of expression of sialoglycoconjugates in plants has been a topic of intense debate due to the importance of plants as expression systems for biotherapeutics.22,23 Recently, it was shown that plants contain trace amounts of Neu5Ac and Neu5Gc residues, which, however, may have originated from non-plant sources.23,24 Although these results indicate the absence of Neu5Ac and Neu5Gc in plants, they leave open the possibility for the presence of other sialic acid species.
The surface of any cell in nature comprises a variety of glycoconjugates including glycoproteins, proteoglycans and glycolipids forming the cellular glycocalyx layer. Sialic acid residues of these glycoconjugates are known as the “functional ornament” of the glycocalyx.25 This functional implication is facilitated, in part, by the external cellular localisation of the sialic acid residues allowing them to directly interact with the extracellular environment and, in part, by virtue of their unique physicochemical properties including their bulkiness, hydrophilicity and negative charge. Sialylated glycoconjugates are known to play key roles in several pathophysiological processes26 including viral infection,27,28 embryogenesis,29 inflammation,30,31 cardiovascular diseases,32–34 cancer35–37 and neural development.38,39 In addition, it has been reported that alterations in glycoprotein sialylation, as a consequence of various diseases, can change the downstream intracellular signalling.40 Furthermore, the analysis of glycoprotein sialylation in body fluids is gaining attention due to the fact that most FDA approved biomarkers used in cancer staging, prognosis and treatment selection, are glycoproteins.41–44 Finally, glycoprotein sialylation has been shown to be critical for regulating blood circulation half-life,45 bioavailability, stability,46 receptor interaction47 and immunogenicity48 of biotherapeutic products such as monoclonal antibodies and other N- and O-linked glycoproteins.
Essentially, each species expresses a unique “sialome” (alternatively called “sialiome”49), a term defined as the total array of sialic acids and their related glycoconjugates expressed by a defined system e.g. cell, tissue, organ, or organism at a specified time and condition.50 As such, the sialome represents a subset of the glycoconjugate unspecific “glycome” as well as a subset of the glycoconjugate specific “glycoproteome” and “glycolipidome”. However, it should be noted that the sialome was initially coined as a term to define the mRNA transcripts and proteins expressed in the salivary glands.51 Here we use the former definition. Unlike the genome sequence, which is virtually identical in every somatic cell type of an organism and undergoes relatively few changes during the lifetime of the organism, the sialome is highly cell-specific and varies markedly with regard to time, space, and environmental cues including the physiological status of the cell. These fluctuations are a result of the concerted action of sialyltransferases (EC 2.4.99),37 sialidases (EC 3.2.1.18)52 and other sialic acid specific enzymes involved in the biosynthesis and post-synthesis alterations. In addition, the levels of nucleotide sugars and the cellular transit times may directly or indirectly modulate and regulate the sialome. The qualitative and quantitative analytical characterisation of sialoglycoconjugates present in a cell is an important step towards enabling us to decipher the complex glycosylation code for specific pathophysiological conditions and to obtain a better understanding of disease mechanisms.
The two major sialic acid core structures are neuraminic acid (chemical name: 5-amino-3,5-dideoxy-D-glycero-D-galacto-2-nonulopyranos-1-onic acid) and 2-keto-3-deoxynonulosonic acid (Kdn) (chemical name: 3-deoxy-D-glycero-D-galacto-2-nonulopyranos-1-onic acid), which share a nine carbon nonulosonic acid structure from which the other sialic acids can be synthesised. Neu5Ac (chemical name: 5-acetamido-2,4-dihydroxy-6-(1,2,3-trihydroxypropyl)oxane-2-carboxylic acid) and Neu5Gc (chemical name: 2,4-dihydroxy-5-[(2-hydroxyacetyl)amino]-6-[1,2,3-trihydroxypropyl]oxane-2-carboxylic acid) are the most common sialic acid structures found in mammalian cells and, hence, are the focus of this review (see Fig. 1 for overview of common sialic acid structures and sialoconjugates). The sialic acid structures may be modified by, for example, acetylation, methylation and sulfation; these modifications are covalently linked through the various hydroxyl groups on the carbon atoms of the sialic acids and generate additional layers of structural heterogeneity. Similar to the unmodified sialic acid structures, the modified counterparts are known to be involved in pathophysiological phenomena including tumour development,53–55 immunity56,57 and hormone function.58 However, due to the lability and structural migration of these acid- and heat-sensitive sialic acid modifications during sample preparation and analysis, little is known about their function or tissue-specific expression.59,60
Fig. 1 Chemical structures and masses (in hydrolysed form) of the most common mammalian sialic acids (i.e. Neu5Ac, Neu5Gc and Kdn) and some examples of common mammalian sialylated N- and O-linked glycans and determinants. The core structures for the four main O-linked core types (core 1–4) have been highlighted (broken lines). The R1–9 groups in red indicate the modification position and modification moieties that form the more than 60 structurally different sialic acids.
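Since the figure itself is not reproduced here, the masses of the three sialic acids named in the caption can be recomputed from their elemental compositions. The following is a minimal sketch: the molecular formulae and isotope masses are standard values supplied for illustration (they are not taken from the text), and the function names are hypothetical.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental compositions of the free (hydrolysed) sialic acids.
# Assumption: standard literature formulae, not values quoted from the review.
SIALIC_ACIDS = {
    "Neu5Ac": {"C": 11, "H": 19, "N": 1, "O": 9},
    "Neu5Gc": {"C": 11, "H": 19, "N": 1, "O": 10},
    "Kdn": {"C": 9, "H": 16, "O": 9},
}

WATER = {"H": 2, "O": 1}


def monoisotopic_mass(composition):
    """Sum the monoisotopic masses of all atoms in an elemental composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def residue_mass(name):
    """Mass of a glycosidically linked sialic acid residue (free form minus water)."""
    return monoisotopic_mass(SIALIC_ACIDS[name]) - monoisotopic_mass(WATER)


if __name__ == "__main__":
    for name, composition in SIALIC_ACIDS.items():
        free = monoisotopic_mass(composition)
        print(f"{name}: free {free:.4f} Da, residue {residue_mass(name):.4f} Da")
```

The residue mass (free form minus one water) is the increment a sialic acid adds to a glycan composition, which is the value usually needed when matching glycan or glycopeptide masses.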
Another layer of structural diversity of sialic acids arises from the sialyl linkages to the adjacent monosaccharide by two main sialyl linkage configurations of N- and O-linked sialoglycans i.e. the α2,3- and α2,6-linkage. Both configurations are commonly used for the linkage of Neu5Ac and Neu5Gc to galactose (Gal) residues, whereas only α2,6 seems to be used for the sialyl linkage to N-acetylglucosamine (GlcNAc) or N-acetylgalactosamine (GalNAc) residues. Other configurations, such as the α2,8- and α2,9-sialyl linkages, which often are found in the polymeric form of sialic acids known as oligo- or polysialic acids have so far generally been limited to glycoconjugates expressed in neural tissue including the neural cell adhesion molecule.61 However, recent analytical developments have indicated their more wide-spread expression and localisation, for example in secreted epidermal growth factor receptor (EGFR) from an epidermoid carcinoma cell line.62
The structural characterisation of sialic acids and their glycoconjugates has together with functional glycobiology increased our understanding of sialylation over many decades (see timeline for significant sialic acid discoveries, ESI, Fig. S1†). Although a thorough historical review of the techniques for structural analysis of sialylated compounds is outside the scope of this review, it is well accepted that modern technology has improved the analytical depth, sensitivity and speed of such analyses. Several excellent reviews on protein glycosylation analysis in general have been published focusing on different analytical aspects.63–67 Moreover, a description of the sialome complexity from an experimental point of view has been reported68 and the recent advances in the biology and chemistry of sialic acids have been concisely reviewed69.
In the two related reviews presented here we will focus our attention on the analysis of sialic acids, in particular Neu5Ac and less on Neu5Gc, carried by N- and O-linked glycoproteins, which are the two most abundant mammalian classes of glycoproteins. In N-linked glycosylation, glycans are linked to the polypeptide chain through asparagine residues in conserved sequons (Asn-Xxx-Ser/Thr/Cys, where Xxx ≠ Pro).70–72 Lately, other non-canonical sequons have been proposed,73 but these are controversial and may turn out to be artefacts.74,75 However, recent studies have shown the presence of glutamine-linked protein glycosylation and other non-canonical asparagine-linked glycosylation motifs on recombinant human antibodies,76,77 opening up the possibility of other types of protein glycosylation. Three main N-glycan types exist, all sharing a common chitobiose core: high mannose, hybrid and complex types. Of these, only the two latter types have been reported to be sialylated. In contrast, eight different core types exist for mammalian O-linked glycosylation (alternatively called mucin-type glycosylation), in which a GalNAc residue connects the O-glycan structure to the polypeptide backbone through a serine or threonine residue. All the O-glycan core types may carry sialylation as terminal modifications.
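The canonical sequon rule stated above (Asn-Xxx-Ser/Thr/Cys, Xxx ≠ Pro) is easy to express as a pattern search. The sketch below is illustrative only: the function name and the example sequence are hypothetical, and a sequon match indicates only a potential N-glycosylation site, not actual occupancy or sialylation.

```python
import re

# Zero-width lookahead so that overlapping sequons (e.g. "NNTS") are all reported.
# N = Asn, [^P] = any residue except Pro, [STC] = Ser, Thr or Cys.
SEQUON = re.compile(r"(?=N[^P][STC])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues in canonical N-glycosylation sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical peptide sequence in one-letter code, used only for illustration.
    peptide = "MNGSANPTNNTSK"
    print(find_sequons(peptide))
```

In the example sequence, the Asn followed by Pro is skipped, while the two adjacent Asn residues are both reported because each starts its own valid sequon.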
In this first of two related reviews we briefly outline the common analytical strategies and summarise in more detail the methodologies and considerations for pre-liquid chromatography (LC) and mass spectrometry (MS) sample handling for the analysis of sialylated N- and O-linked glycoproteins. Aspects covered include analyte enrichment and prefractionation, chemical derivatisation and metabolic labelling as all of these are important components of the experimental design to achieve informative structural data. The second review covers the downstream separation and detection of sialylation on N- and O-linked glycoproteins using modern LC-MS based approaches.78 Glycoprotein sialylation can be studied based on the analysis of sialoglycans, sialoglycopeptides and intact sialoglycoproteins. These three analyte levels are consequently reviewed separately in these two reviews. Evidenced by the wealth of available literature, considerable efforts have been invested to improve the analytical strategies and pre-LC-MS tools for the analysis of glycoprotein sialylation.
The structural information of glycoprotein sialylation can be divided into the following categories: (i) identification and quantitation of the glycoprotein carrier of sialylation, (ii) localisation of the glycosylation site(s) and specification of the total glycan occupancy of the site(s), (iii) determination of the sialo:asialo ratio of the glycans occupying the glycosylation site(s), (iv) characterisation of the sialoglycan structures attached to the glycosylation site(s) and finally (v) determination of the molecular relationship between sialoglycans and neutral glycans and other PTMs occupying different sites on the polypeptide backbone. The characterisation of the sialoglycan structure itself (iv) can be divided into the following sub-levels: determination of (a) monosaccharide composition e.g. determine the number of Neu5Ac/Neu5Gc residues, (b) topology e.g. establish on which N-glycan arm (3- or 6-mannose arm) the sialic acid is located, (c) sialyl linkages e.g. determine α2,3- or α2,6-sialyl linkage and (d) relative abundance of the multiple sialoglycans that often occupy a specific glycosylation site. Although the methods for initial sample preparation vary significantly for different types of biological samples (e.g. blood serum, extracts from tissues and cultured cell lysates), the subsequent analytical routes usually fall into the category of either glycomics or glycoproteomics type approaches.79–83 Several excellent reviews partly or fully devoted to glycomics67,84–88 and glycoproteomics65,89–93 analysis in general have been published, and the reader is kindly referred to these resources for an introduction to glycosylation analysis. Herein, we describe the specific analytical strategies for the study of protein N- and O-linked glycosylation with a sialic acid-centric focus.
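Categories (iii) and (iv d) above reduce to simple arithmetic once the glycoforms at a site have been quantified. The following sketch assumes that each glycoform is described by a composition label, its number of sialic acid residues and a measured signal intensity; the data structure, function names and example values are hypothetical. It also assumes that signal intensity is an acceptable proxy for abundance, which may not hold when sialylated and neutral species ionise differently.

```python
from collections import namedtuple

# One observed glycoform at a single glycosylation site (illustrative structure).
Glycoform = namedtuple("Glycoform", ["composition", "n_sialic_acids", "intensity"])


def relative_abundances(glycoforms):
    """Fraction of the total site signal carried by each glycoform."""
    total = sum(g.intensity for g in glycoforms)
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return {g.composition: g.intensity / total for g in glycoforms}


def sialo_asialo_ratio(glycoforms):
    """Summed intensity of sialylated glycoforms divided by that of neutral ones."""
    sialo = sum(g.intensity for g in glycoforms if g.n_sialic_acids > 0)
    asialo = sum(g.intensity for g in glycoforms if g.n_sialic_acids == 0)
    if asialo == 0:
        return float("inf")  # no neutral glycoforms were observed
    return sialo / asialo


if __name__ == "__main__":
    # Hypothetical intensities for three biantennary glycoforms at one site.
    site = [
        Glycoform("Hex5HexNAc4NeuAc2", 2, 600.0),
        Glycoform("Hex5HexNAc4NeuAc1", 1, 300.0),
        Glycoform("Hex5HexNAc4", 0, 100.0),
    ]
    print(relative_abundances(site))
    print(sialo_asialo_ratio(site))
```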
Fig. 2 General overview of the analytical strategies for the analysis of glycoprotein sialylation and topics covered in Part I and II of these two related reviews. PI and PII refer to Part I and Part II,78 respectively, followed by paragraph numbers. Bioinformatic aspects of glycoprotein sialylation analysis are not covered by the two reviews.
In the glycoproteomics oriented approach, glycoprotein sialylation is studied in the form of sialoglycopeptides or alternatively, but much less commonly, in the form of intact sialoglycoproteins. Such strategies usually consist of initial enrichment of the sialylated glycopeptides from peptide mixtures obtained by proteolysis of the glycoproteins in the sample, followed by a one- or two-dimensional LC separation, detection using MS and tandem MS, and finally manual or software assisted data interpretation. Additional steps, such as metabolic and chemical labelling, may be introduced to aid the enrichment and/or detection. Desialylation and even complete deglycosylation prior to LC-MS is commonly performed in experimental designs of a more proteomics type character where the aim is large-scale glycosylation site identification of previously sialylated (or previously glycosylated) peptides.99,100 In the case of intact sialoglycoprotein analysis, thorough single protein isolation/purification and appropriate sample handling including desalting and concentration are typically required prior to on- or off-line separation and detection with LC- or capillary electrophoresis (CE)-MS. Glycoproteomics approaches benefit from providing site-specific information by yielding the identity of the protein carriers of the sialoglycans and specifying the occupied glycosylation site(s), in addition to obtaining some information on the sialoglycan structure. However, this additional structural information most often comes at a cost; the sialoglycopeptides (and sialoglycoproteins) are comparatively more difficult to study than released sialoglycans, in particular, in system-wide experiments where complex protein mixtures usually are the starting material. This is, in part, due to the higher molecular complexity of the sialoglycopeptides (analysis of two conjugated biomolecules i.e. sialoglycans and peptides/proteins) and, in part, due to the fact that sialoglycopeptides are only weakly detected in MS as a result of their microheterogeneity and their lower signal intensity65,101 compared to unglycosylated peptides as discussed in the second of these two reviews.78 Employing efficient sialoglycopeptide enrichment, separation and detection techniques in the analytical workflow reduces this limitation when performing glycoproteomic type experiments.
Crude glycoprotein isolation by subcellular fractionation is a method for reducing the protein complexity. Organelle purification can also be used to investigate the role of sialic acids and their biosynthetic enzymes in specific organelles or sub-compartments such as the endoplasmic reticulum or the individual compartments of the Golgi apparatus, under healthy and diseased conditions.109–111 Isolation steps may, however, lead to severe analyte losses or can introduce significant analyte biases. For example, trichloroacetic acid (TCA) precipitation, which is a commonly used protein isolation technique in many workflows, is seemingly biased towards disordered/unfolded proteins112 as well as sialylated glycoproteins107 affecting, in total, more than one-third of human proteins.113 Thus, TCA precipitation is potentially disturbing the qualitative and quantitative analysis of many glycoproteins when included in the sample work-up. Hence, detailed structural analysis of N- and O-linked glycoprotein sialylation on the glycoproteome or sub-glycoproteome-wide scale without previous fractionation/purification is desirable in order to avoid any bias resulting from the isolation steps. However, this is extremely challenging when starting with complex samples of biological origin. Efficient sialic acid specific enrichment coupled with high performance modern LC-MS detection has expanded the potential for sialoglycoprotein profiling from complex samples such as cells, tissues and body fluids. However, glycoprotein sialylation analysis, in particular when performed via the glycoproteomics route, clearly still needs further development of robust techniques and integrated workflows covering all aspects from sample preparation to data acquisition. 
In addition, the system-wide glycoproteomic type analysis of glycoprotein sialylation is hampered by the lack of dedicated computational tools, since the characterisation of branched sialoglycans and sialoglycopeptides creates unique challenges that are not encountered in conventional proteomics and other related research areas.
Sialylated N-glycans are most commonly released by N-glycosidase F (PNGase F) from Flavobacterium meningosepticum or almond N-glycosidase A (PNGase A). In contrast to PNGase F, PNGase A has the ability to cleave N-glycans containing an α1,3-linked fucose on the reducing end GlcNAc residue as found in insect and plant glycoproteins.116 The N-linked glycan release can be performed on purified sialoglycopeptides and sialoglycoproteins or on complex mixtures of such molecules on solid supports (e.g. PVDF membranes), in gels (e.g. 1D gel-electrophoresis) or in solution.82,117 The terminal sialic acid residues do not seem to alter the accessibility of the PNGase F/A to the substrates. This enables a quantitative release of sialylated N-glycans (personal unpublished observation). However, it should be noted that enzymatic release of sialoglycans from intact proteins is dependent on the sialoglycoprotein conformation since the sialoglycans need to be well exposed for complete release. Reduction and alkylation of the cysteine residues of the sialoglycoproteins and general protein denaturation will, thus, usually enhance the rate and extent of deglycosylation.
No enzymatic approach is available for the release of all O-linked sialoglycans that densely glycosylate mucins or mucin-type sialoglycoproteins. O-Glycosidase from Enterococcus faecalis removes core 1 and core 3 type O-linked glycans from glycoproteins, but only after previous desialylation. In contrast, O-linked sialoglycans can be released broadly by reductive beta-elimination,82 which promises to quantitatively release all sialoglycans while retaining the sialic acid residues on the released O-glycans. Reductive beta-elimination of sialoglycans can be performed either in-solution or from glycoproteins immobilised on solid supports. Alternatively, hydrazinolysis with anhydrous hydrazine (H2NNH2) can be used as a method for releasing sialoglycans. It was recently shown that specific hydrazinolysis reaction conditions including reaction time, temperature, chemical reactants and reaction vials are crucial to avoid significant losses of sialic acids from the sialoglycoproteins.118
The combination of high hydrophilicity and relatively small size is a unique physicochemical feature that allows relatively easy isolation of released N- or O-linked sialoglycans from their protein carriers and other large molecules present in samples of biological origin. For example, passing released sialoglycans over a hydrophobic reversed-phase column in a solid phase extraction (SPE) format will selectively allow the hydrophilic glycans to pass through whilst retaining the more hydrophobic protein/peptide components. Isolation methods for sialoglycans can, in addition, utilise the negative charge of the sialic acid residues for selective isolation by e.g. using TiO2 (discussed later for the enrichment of sialoglycopeptides in paragraph 3.2.2.f). Here, prior dephosphorylation is needed to avoid co-enrichment of phosphorylated peptides/proteins.49,119
In addition to permethylation, which methylates all hydroxyl and carboxyl groups of the sialoglycans, the derivatisation strategies for sialylated glycans can crudely be divided into those that specifically target the sialic acid residues at the non-reducing end and those which label the carrier glycan reducing end. The latter involves reductive amination and is a general approach in glycomics analysis.121,122 Thus, the common acid based reductive amination strategies including 2-aminobenzoic acid, 2-aminobenzamide or 2-aminopyridine labelling will only be mentioned in passing in these two reviews. Reducing end labelling under alkaline conditions e.g. with 1-(2-naphthyl)-3-methyl-5-pyrazolone is more favourable for sialoglycans than other common reagents used in reductive amination,123 due to the lower risk of loss of the acid-labile sialic acid residues. The non-reducing end derivatisation directly of the sialic acid residues aims at neutralising the carboxyl groups, thus leaving the sialoglycans without any permanent charge. This can be performed using a variety of methods e.g. methyl esterification124–128 or amidation.129–131 The neutralisation stabilises the sialoglycans and limits the level of ionisation based loss of sialic acid as discussed in the second of these two reviews.78 No reactivity difference between the α2,3- and α2,6-linked Neu5Ac residues was reported for methyl esterification or amidation either in the derivatisation reaction itself or in the subsequent tandem MS fragmentation. In contrast, perbenzoylation132 and other types of sialic acid-specific derivatisations133,134 have different reactivities towards the two sialyl linkages, which therefore may be used to differentiate the linkage types. 
Permethylation is another commonly used derivatisation method in glycomics, which gives methyl esters of the acidic carboxyl group in addition to generally transforming the hydroxyl groups of the sialoglycans into methyl ethers.135,136 The sialic acid residues seem to cope with the reaction conditions without detectable losses, but permethylation reactions can suffer from significant under-methylation if reaction conditions are not optimal. In addition, permethylation is unsuitable for glycans containing O-acetylated sialic acid residues because of the chemical degradation of these groups during the harsh reaction conditions.137,138 Permethylation using heavy and light isotope labelled reagents can be introduced in the derivatisation protocol to obtain glycoform quantitation between two samples.139 Finally, acid-catalysed lactonisation i.e. internal esterification of the carboxyl to the hydroxyl group of the sialic acid residues has been employed to neutralise and discriminate between α2,8- and α2,9-linked polysialic acids for efficient MALDI MS detection.140 Irrespective of the method, sialic acid esterification has, in addition to the increased analyte stability, the advantage of allowing the analysis of sialoglycans in positive ion MS mode, where MALDI mass analyzers generally perform better in terms of sensitivity and resolution and where neutral glycans appear with higher signal strength.78
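The carboxyl-neutralising derivatisations discussed above change the glycan mass by a fixed increment per sialic acid, which must be accounted for when matching observed masses. The sketch below is a simplified illustration under stated assumptions: complete reaction, exactly one carboxyl group per sialic acid, and amidation to the primary amide (derivatisation with other amines gives different shifts). The function and dictionary names are hypothetical.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Mass change per derivatised carboxyl group.
SHIFT_PER_CARBOXYL = {
    # -COOH -> -COOCH3: net gain of CH2.
    "methyl_ester": MONO["C"] + 2 * MONO["H"],
    # -COOH -> -CONH2: OH replaced by NH2, i.e. +N +H -O (assumed primary amide).
    "amide": MONO["N"] + MONO["H"] - MONO["O"],
}


def derivatised_mass(native_mass, n_sialic_acids, method):
    """Monoisotopic mass after neutralising every sialic acid carboxyl group."""
    if method not in SHIFT_PER_CARBOXYL:
        raise ValueError(f"unknown derivatisation method: {method}")
    return native_mass + n_sialic_acids * SHIFT_PER_CARBOXYL[method]


if __name__ == "__main__":
    for method, shift in SHIFT_PER_CARBOXYL.items():
        print(f"{method}: {shift:+.5f} Da per sialic acid")
```

Because the shift scales with the number of sialic acids, the mass difference between the native and derivatised glycan also reports the sialic acid count, provided the reaction has gone to completion.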
Method | Common applications | Limitations | Advantages | Examples of applications | ||
---|---|---|---|---|---|---|
Sialoglycans | Sialoglycopeptides | Sialoglycoproteins | ||||
Affinity-based methods | ||||||
Lectins (e.g. SNA, MAL, MAH, WGA) | x | x | Un-/low-specific binding of sialoglycan determinants limits enrichment of specific sialoglycoforms. | Prefractionation of crude sialoglycoprotein mixtures to lower complexity. Potential of sialoglycoprotein visualisation via lectin blotting and lectin histochemistry. | Enrichment of sialoglycopeptides from human serum proteins using SLAC.169 Enrichment of sialoglycoproteins from serum from pancreatic tumour patients using SNA, MAL and WGA lectins.171 | |
HILIC/ERLIC (e.g. ZIC, Amide-80) | (x) | x | Unspecific for sialylated species (affinity for all hydrophilic species). Multi-sialylated species difficult to elute. Relatively low analyte solubility in organic mobile phase. | Possibilities for both pre-fractionation and enrichment in a variety of formats. Binding mechanism fairly well understood and can be manipulated. | Separation of sialoglycopeptides from recombinant human interferon-gamma.278 Pre-fractionation of sialoglycopeptides from human platelets.279 | |
SCX | x | x | Unspecific for sialylated species (isolate other negatively charged species). | Can separate sialylated isomers as a prefractionation step. | Prefractionation of sialoglycopeptides from human platelet membranes.188 | |
Serotonin | x | x | Low enrichment efficiency. Binding mechanism not understood. | Can discriminate between Neu5Ac and Neu5Gc containing species by selective binding of the former. | Enrichment of sialylated N-glycans and N-glycopeptides from human serum transferrin in SPE format.186 | |
COFRADIC | x | Low throughput due to re-chromatography. Reproducible chromatography required. | Localise sialoglycopeptide fractions by retention shift | Prefractionation of sialoglycopeptides from human serum.190 | ||
TiO2 | x | Affinity for modified sialic acids species not well-defined (e.g. Neu5Gc and Kdn). Oligo and polysialylated species difficult to elute. | High enrichment efficiency of sialoglycopeptides (needs dephosphorylation to avoid co-enrichment of phosphopeptides) | Enrichment of sialoglycopeptides from human plasma, saliva and mouse brain.49 | ||
Chemistry-based | ||||||
Chemical labelling | ||||||
Periodate oxidation/hydrazide coupling/acid release | x | Reaction needs fine tuning to avoid oxidation of asialo-species. Loss of sialic acid structure information. | High enrichment affinity due to covalent capture of oxidised cis diols. | Capture and detection of sialoglycoproteins derived from human cerebrospinal fluid.99 | ||
Reverse glycoblotting | x | x | Reaction needs fine tuning to avoid oxidation of asialo-species. Multiple chemical reactions cause sample loss and unwanted by-products. | Covalent capture of cis diols and release of intact sialylated glycopeptides and glycans. | Capture and detection of human α-fetoprotein, bovine pancreas fibrinogen, human EPO and sialoglycopeptides232 from mouse serum.280 | |
PAL | x | Reaction needs fine tuning to avoid oxidation of asialo-species. Unknown compatibility with some cell lines. | Labelling of cell surface sialylated proteins. Fast reaction kinetics. | Capture and detection of cell surface sialoglycoproteins from B-JA-B K20 B cells.229 | ||
Metabolic labelling | ||||||
ManNAz labelling/Staudinger ligation with phosphine | x | Slow reaction kinetics. | Selective labelling of sialylated glycoproteins applied to living organisms. | Labelling and visualisation of sialoglycoproteins of Jurkat cells.252 | ||
ManNAz labelling/CuAAC with alkyne probes | x | Toxicity of Cu catalyst limits in vivo applications. | Efficient and selective labelling of sialylated glycoproteins. | Labelling and visualisation of sialoglycoproteins of metastatic prostate cancer cells255 and mesenchymal stem cells.264 | ||
Alkynyl ManNAc labelling/CuAAC with azide probes | x | Toxicity of Cu catalyst limits in vivo applications. | Efficient and selective labelling of sialylated glycoproteins. | Labelling and visualisation of sialoglycoproteins of prostate cancer cell lines266 and metastatic lung cancer cells.267 |
Lectin affinity chromatography strategies involve the binding and separation of glycoproteins or glycopeptides bearing specific glycan determinants to immobilised lectins. A detailed overview of the distribution, specificity and function of sialic acid-specific lectins focusing on those that occur in viruses, bacteria and non-vertebrate eukaryotes has been published.160 The most common plant lectins used for the isolation of sialylated glycopeptides/glycoproteins are SNA, MAL-I, Maackia amurensis hemagglutinin (MAH), and Triticum vulgaris (wheat germ) agglutinin (WGA). The specificities of these lectins have been described at different resolution: SNA binds glycoconjugates containing α2,6-linked sialic acid residues;161 MAL-I binds preferentially to terminal Neu5Acα2,3Galβ1,4GlcNAc entities found in N-linked glycans;162,163 MAH binds preferentially to α2,3-linked sialic acid residues of O-linked disialylated tetrasaccharides with the specific structure Neu5Acα2,3Galβ1,3(Neu5Acα2,6)GalNAc;164 MAL-I and MAH also show specificity towards nonsialylated structures such as SO4-3-Galβ1,3GalNAc;165 and, finally, WGA has a broader affinity for GlcNAc and sialic acid containing glycoconjugates.166 Molecules that are unspecifically bound to the immobilised lectins after analyte loading in appropriate solvents are subsequently removed by several washing steps, followed by elution of the bound sialoglycoproteins or sialoglycopeptides by low pH solvents or by using sialic acids or sialic acid analogues for competitive elution. 
Several different formats such as single,167 multiple168 (MLAC) and serial169 lectin affinity chromatography (SLAC) along with different immobilised supports have been used for prefractionation of sialoglycopeptides and sialoglycoproteins.170 The use of lectin combinations such as concanavalin A (ConA), WGA and jackfruit jacalin, which show different specificities for sialic acid glycoconjugates, was used to isolate the majority of glycoproteins present in human serum in an MLAC format.168 Furthermore, SLAC based on SNA and ConA was used to prefractionate sialylated N- and O-linked glycopeptides derived from proteolytic digests of human serum proteins based on their degree of branching.169 Here it was found that partly sialylated biantennary N-glycans are abundant in human serum. In order to gain deeper coverage of the sialylated glycoproteome, other combinations of sialic acid specific lectins have been used.169,171 For example, SNA, MAL-I and WGA were used to prefractionate sialylated N-linked glycoproteins in serum derived from pancreatic tumour patients (described further in paragraph 3.3.1).
One of the under-recognised disadvantages of lectins is their rather limited specificity towards glycoconjugates.172 Although the lectins selectively and reproducibly retain some glycoproteins, other glycoproteins carrying the same repertoire of glycans may not be retained.173 Detection of sialoglycoconjugates containing α2,3- and α2,6-linked sialic acids by SNA and MAL-I is furthermore affected by sialic acid modifications. In addition, it was reported that both SNA and MAL-I bind sialoglycoconjugates with Kdn and Kdn derivatives at the non-reducing end of the glycan. Together, the lack of glycan specificity and shared protein/glycan binding epitopes limit the capacity of lectins to perform unbiased sialoglycoprotein enrichment. Instead, lectins are more valuable in the prefractionation steps and for visualisation purposes using histochemical staining techniques and lectin blotting.
In one of the first reports using HILIC as a tool to enrich glycopeptides from peptide mixtures, a zwitterionic (ZIC) type of HILIC resin functionalised with sulfobetaines packed in SPE format microcolumns in GeLoader tips was used as the stationary phase.176 This enrichment setup was combined with a deglycosylation step using a mixture of endo-β-N-acetylglucosaminidases, which allowed the identification of 62 N-glycosylation sites from 37 N-linked glycoproteins derived from human plasma. Moreover, ZIC-HILIC SPE showed highly efficient and unbiased enrichment of sialylated and neutral N-glycopeptides from glycoproteins purified by gel electrophoresis, enabling the N-glycoprofiling of different sources of human tissue inhibitor of metalloproteinases-1.177–180 The enrichment efficiency for both neutral and sialoglycopeptides has been further improved by the addition of an ion-pairing reagent to the mobile phase i.e. trifluoroacetic acid (TFA), which reduces the hydrophilicity of non-glycosylated peptides comparably more than the glycosylated peptides in peptide mixtures by protonating all carboxyl groups at the low pH and by ion pairing with the protonated amino groups on the peptides.181 Several HILIC solid phase materials have been shown to all yield unbiased desalting/enrichment of glycosylated peptides in SPE formats without the loss of quantitative information.182 However, the study highlighted that column capacity is a critical parameter to consider. Hydrophilic and electrostatic interactions (attraction and repulsion) between the sialic acid residues and the sulfobetaine groups on the surface of the stationary phase influence the retention behaviour and are modulated by the degree of sialylation and the type of sialic acid linkage. It should be noted that the entire complement of sialylated glycopeptides may be difficult to elute from HILIC SPE columns in common acidic (e.g. TFA/formic acid) and alkaline (e.g. ammonium bicarbonate) solvents (Fig. 3). 
In order to evaluate the retention behaviour of sialylated glycopeptides on a HILIC stationary phase, bovine fetuin (Uniprot entry number: P12763) was digested with trypsin and the peptide map was analysed by positive ion MALDI-TOF-MS (Fig. 3A). The peptide map was dominated by signals corresponding to non-glycosylated peptides in the lower m/z region, which suppressed the glycopeptide ionisation. After loading onto a ZIC-HILIC microcolumn, the glycopeptides were selectively eluted under commonly used acidic conditions using aqueous 0.1% TFA (Fig. 3B). The MS signals in the m/z 3500–5000 range were assigned to the fetuin glycopeptides bearing the L145CPDCPLLAPLNDSR159 peptide moiety. Subsequently, remaining glycopeptides were eluted from the same ZIC-HILIC micro-column with 100 mM ammonium bicarbonate (ABC) in slightly basic conditions (Fig. 3C). The same MS signals were detected, indicating an electrostatic retention effect. Moreover, glycopeptides were eluted from the same ZIC-HILIC micro-column with 2,5-dihydroxybenzoic acid (DHB) matrix (Fig. 3D). Several new MS peaks appeared in the m/z 5000–7000 region; these were associated with the glycopeptides bearing the V160VHAVEVALATFNAESNGSYLQLVEISR187 and R72PTGEVYDIEIDTLETTCHVLDPTPLANCSVR103 peptide moieties. The length of the peptide moiety strongly influenced the binding affinity to the ZIC-HILIC sorbent. Furthermore, fetuin tryptic peptides were treated with sialidase A and mixed with untreated fetuin tryptic peptides in a 1:1 ratio. This mixture was loaded onto a ZIC-HILIC micro-column and glycopeptides eluted with 0.1% TFA (Fig. 3E). The elution of desialylated glycopeptides was detected. Subsequently, sialylated glycopeptides were eluted with DHB matrix from the same ZIC-HILIC micro-column (Fig. 3F). Taken together, these results show that pH conditions and solvent additives, such as DHB, influence the qualitative and quantitative glycopeptide elution profile after ZIC-HILIC enrichment. Moreover, sialylated glycopeptides bind the ZIC-HILIC sorbent more strongly than non-sialylated glycopeptides under commonly used acidic conditions.
Fig. 3 Positive ion MALDI-TOF-MS of tryptic peptide mixture of bovine alpha-2-HS-glycoprotein (fetuin, Uniprot entry number: P12763) in linear mode under the following conditions: (A) general peptide mass map, (B) peptide mass map of glycopeptides eluted from the ZIC-HILIC micro-column using 0.1% TFA, (C) peptide mass map of glycopeptides subsequently eluted from the same ZIC-HILIC micro-column with 100 mM ABC (pH 7.8), (D) peptide mass map of glycopeptides subsequently eluted from the same ZIC-HILIC micro-column with DHB matrix, (E) fetuin tryptic peptides treated with sialidase A and mixed with untreated fetuin tryptic peptides. This mixture was loaded onto a ZIC-HILIC micro-column and glycopeptides eluted with 0.1% TFA, (F) subsequent elution with DHB matrix from the same ZIC-HILIC micro-column.
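The three fetuin peptide moieties named in the ZIC-HILIC experiment each carry one N-glycosylation sequon. As a minimal sketch, the sequon positions can be located with a regular expression, assuming the conventional N-X-S/T motif with X not proline (the motif definition and the helper name `sequon_positions` are assumptions added here; the sequences and their first-residue numbers are taken from the text).

```python
import re

# N-X-S/T sequon with X != P; the lookahead permits overlapping matches.
SEQUON = re.compile(r"N(?=[^P][ST])")

# Peptide moieties and the residue number of their first amino acid,
# as quoted in the text (e.g. L145...R159).
FETUIN_PEPTIDES = {
    "LCPDCPLLAPLNDSR": 145,
    "VVHAVEVALATFNAESNGSYLQLVEISR": 160,
    "RPTGEVYDIEIDTLETTCHVLDPTPLANCSVR": 72,
}


def sequon_positions(peptide, start):
    """Return protein-level residue numbers of sequon asparagines."""
    return [start + m.start() for m in SEQUON.finditer(peptide)]


for peptide, start in FETUIN_PEPTIDES.items():
    print(peptide, sequon_positions(peptide, start))
```

With these inputs the sketch reports Asn156, Asn176 and Asn99, respectively. The second peptide also contains an asparagine in an N-A-E context, which the pattern correctly ignores.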
Electrostatic repulsion hydrophilic interaction chromatography (ERLIC) has been employed for the enrichment of tryptic phosphopeptides.183 For this application the secondary interaction between the negatively charged phosphate groups of the phosphopeptides and the weak anion exchange chromatography (WAX) stationary phase was used for enhanced retention in addition to the main hydrophilic interactions. ERLIC has also been used to enrich glycopeptides from human platelets using a polyWAX (PolyLC Inc.) column with a gradient of decreasing acetonitrile concentration from 70% to 60% (v/v).184 This method enabled the identification of 125 glycosylation sites from 66 glycoproteins. Possible isoform separation was proposed due to the elution of the same glycopeptide in different chromatographic fractions, however, this was not verified. The retention mechanism of ERLIC in the context of glycopeptides has not been thoroughly described; however, the combination of WAX and hydrophilic interactions in ERLIC may yield an unbiased enrichment of sialoglycopeptides and neutral glycopeptides from peptide mixtures due to their difference in size and charge. In a recent application, ERLIC was used to simultaneously enrich the mouse brain phospho- and glycoproteome.149 A total of 544 unique glycoproteins and 383 phosphoproteins were identified and this method was shown to be superior to a method based on hydrazide chemistry (see paragraph 3.2.2.g). Comparison of ERLIC and strong cation exchange chromatography (SCX) in the prefractionation of phospho- and glycopeptides from rat kidney tissue showed a higher identification rate of glycoproteins and phosphoproteins in ERLIC due to the more uniform distribution of the modified peptides in this fractionation technique.185
N-Linked glycoproteomics of myocardial ischemia and reperfusion injury was performed using three available enrichment strategies in parallel: hydrazide capture (see paragraph 3.2.2.h), TiO2 and ZIC-HILIC enrichment.199 The enriched glycopeptides were all prefractionated using off-line HILIC with UV detection prior to the downstream LC-MS detection. In total, 1556 N-linked glycosylation sites were identified. In another study, the glycoproteomic analysis of Chardonnay wine revealed five grape glycoproteins enriched by the TiO2 resin.200 This raises the possibility that acidic sugars are present in grape. However, no detailed glycan analysis has been performed in wine to date to document this further.
Recently, a Ti(IV)-IMAC resin201 designed with titanium ions immobilised on microspheres through a flexible linker terminated with phosphonate groups was used to enrich sialoglycopeptides from serum. The authors developed an optimised strategy based on filter digestion and enrichment, identifying 217 unique N-linked sialoglycopeptides from 1 μL of human serum following deglycosylation. In another recent study, the simultaneous enrichment, identification and quantification of phosphorylated and formerly sialylated peptides during mouse brain development were reported.100 Again, a multi-dimensional separation was used combining TiO2 SPE enrichment of phosphorylated and sialylated N-linked glycopeptides and HILIC prefractionation after PNGase F treatment. Each HILIC fraction was subsequently analysed by LC-MS and a total of 7682 unique phosphopeptides and 3246 unique formerly sialylated N-linked glycopeptides were identified from 400 μg of protein starting material. The hydrophilic phosphopeptides were eluted in the early HILIC fractions whereas the formerly sialylated glycopeptides eluted later in the HILIC prefractionation. This study provides the first system-wide analysis of the dynamic changes of phosphorylated and sialylated proteins together in a biological system.
To increase the identification of sialylated glycopeptides isolated from rat liver, it was suggested to include an immobilised pH gradient-isoelectric focusing (IPG-IEF) fractionation step prior to the TiO2 enrichment.202 IPG-IEF was found to concentrate the negatively charged sialoglycopeptides in the low-pH fractions. As a result, 582 sialoglycopeptides from 322 sialylated N-linked glycoproteins were identified from rat liver tissue.
The combination of hydrazide-based chemistry to enrich glycopeptides with subsequent LC-MS detection has been reported.204 The strategy involved the oxidation of cis diol containing N-glycans at room temperature before coupling to the hydrazide resin forming hydrazone bonds with the glycan portion of the glycoproteins. The non-glycosylated proteins were removed by stringent washing before performing tryptic digestion of the resin-bound glycoproteins. After removal of the non-glycosylated peptides, enzymatic N-deglycosylation was used to release the resin-bound formerly glycosylated peptides with the simultaneous conversion of the asparagine residues to aspartic acid residues. MS was used to determine this conversion as a diagnostic for the presence of a formerly occupied glycosylation site. The hydrazide strategy for enrichment of N-glycopeptides was subsequently optimised by using SPE formats in a so-called ‘SPEG setup’,205 increasing the enrichment efficiency of glycopeptides.206,207 The hydrazide chemistry strategy has been applied to identify glycoproteins isolated from several biological origins such as body fluids,208–211 the extracellular milieu,207,212 tissues108,199,213 and cell cultures.214,215 This strategy has been optimised to analyse sialylated glycoproteins as discussed below.
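The diagnostic described above rests on a small mass increment: enzymatic N-deglycosylation converts asparagine to aspartic acid, which adds about 0.984 Da per formerly occupied site. The sketch below shows the arithmetic using standard monoisotopic residue masses; the constant and function names are illustrative, and the calculation is not taken from any specific search engine.

```python
# Monoisotopic residue masses in Da (standard values).
ASN_RESIDUE = 114.042927
ASP_RESIDUE = 115.026943

# Mass gained per Asn -> Asp conversion, about +0.984 Da.
DEAMIDATION_SHIFT = ASP_RESIDUE - ASN_RESIDUE


def mz_shift(n_sites, charge):
    """Expected m/z increase of a peptide ion after n_sites conversions."""
    if n_sites < 0 or charge < 1:
        raise ValueError("n_sites must be >= 0 and charge must be >= 1")
    return n_sites * DEAMIDATION_SHIFT / charge


# One converted site observed on a doubly charged peptide ion.
print(round(DEAMIDATION_SHIFT, 4), round(mz_shift(1, 2), 4))
```

Because the shift is below 1 Da, it falls to roughly 0.49 m/z units for a doubly charged ion, so the mass accuracy of the instrument determines how confidently the conversion can be assigned.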
As compared to other monosaccharides occurring in N- and O-linked glycoproteins that are commonly oxidised by periodate at room temperature or 50 °C,216 sialic acids contain three linearly adjacent hydroxyl groups at the C7, C8 and C9 carbons, which are highly susceptible to periodate oxidation. Consequently, the periodate oxidation reaction can be carried out at much lower temperature i.e. 0–4 °C to selectively target sialic acid residues.217 This was shown in a study where 1–2 mM periodate concentration, pH 7.4, at 0 °C oxidised sialic acids within 10 min.218 The oxidation leaves an aldehyde group on the C7 carbon of the sialic acids, which has been used for subsequent radioactive218,219 and non-radioactive220,221 labelling of cell surface sialoglycoconjugates of intact cells.
A slightly different variant of the hydrazide-based chemistry approach described above is the cell surface capturing (CSC) technology,222 which enabled identification of hundreds of cell surface glycoproteins from a variety of cellular origins including Drosophila melanogaster cells, mouse myoblasts, embryonic stem cells and T and B cells.222–224 The CSC technology is based on the mild oxidation of glycans with 1.6 mM sodium meta periodate, pH 6.5, for 15 min at 4 °C to generate reactive aldehyde groups from cis diols. Subsequently, biocytin hydrazide is reacted with aldehyde groups to form hydrazone bonds. After cell lysis, the membrane glycoproteins are recovered by ultracentrifugation and proteolytically digested using trypsin. Glycopeptides are enriched with streptavidin beads and stringent washing conditions are employed to remove non-specifically bound peptides. Finally, N-glycosylated peptides are enzymatically released from streptavidin beads by deglycosylation using PNGase F and analysed using LC-MS. It should be noted that the CSC technology does not selectively target sialylated glycoproteins; this was shown by the positive labelling and capturing of proteins from neuraminidase-treated cells and from insect cells, which have no or negligible endogenous production of sialoglycoconjugates.222,225
The development of a highly efficient and selective labelling strategy of cell surface sialylated glycoproteins was reported.226 This strategy is based on mild periodate oxidation to generate an aldehyde on sialic acids, followed by aniline-catalysed oxime ligation (PAL) with a suitable tag. In this strategy, sialoglycoproteins residing in the cell surface were oxidised using 1 mM sodium meta periodate, pH 7.4, for 15 min at 4 °C. Subsequently, ligation was performed using aminooxy-biotin and 10 mM aniline at pH 7.4. Aniline acts as a nucleophilic catalyst for the oxime ligation,227,228 allowing a highly efficient ligation of a molecular tag onto sialoglycans of the membrane proteins under mild conditions. The application of the PAL technology was recently expanded to enrich and identify a larger selection of the cell membrane tethered glycoproteins.229 Here, two complementary strategies for the enrichment of sialic acid and galactose terminated glycoproteins on living cells were reported. In addition to the PAL strategy, the introduction of aminooxy-biotin onto terminal galactose and GalNAc residues by galactose oxidase and aniline-catalysed oxime ligation (GAL) was reported to enrich a different subset of the glycoproteome. The biotin tag functioned as an affinity tag to the streptavidin resin. The enriched glycoproteins were digested and the sialic acid and galactose/GalNAc-terminating glycopeptides were analysed by LC-MS after deglycosylation. As such, PAL and GAL represent two enrichment methods that can be used to investigate different subsets of the glycoproteome. It is important to stress the small differences in reaction conditions between CSC and PAL such as the periodate concentration, temperature and reaction time. This indicates the importance of fine tuning the reaction conditions to obtain a sialic acid-selective labelling.
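Since the selectivity of periodate oxidation depends on fine tuning, it can help to keep the quoted reaction conditions side by side. The following sketch collects only the values stated in this section (the early selective oxidation study, CSC and PAL); the class name, field names and the helper `differing_fields` are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PeriodateCondition:
    periodate_mM: tuple  # (low, high) periodate concentration in mM
    pH: float
    minutes: int         # reaction time
    temp_C: tuple        # (low, high) temperature in degrees Celsius


# Values as quoted in the text; single values are stored as (x, x).
CONDITIONS = {
    "selective sialic acid oxidation": PeriodateCondition((1.0, 2.0), 7.4, 10, (0, 0)),
    "CSC": PeriodateCondition((1.6, 1.6), 6.5, 15, (4, 4)),
    "PAL": PeriodateCondition((1.0, 1.0), 7.4, 15, (4, 4)),
}


def differing_fields(a, b):
    """Names of the parameters in which two protocols differ."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


print(differing_fields(CONDITIONS["CSC"], CONDITIONS["PAL"]))
```

With the values quoted here, the comparison of CSC and PAL returns the periodate concentration and the pH as the differing parameters.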
CSC and PAL strategies can be applied to primary cells, tissues and organs but require solubilisation of viable cells for cell surface oxidation and labelling, which is not always possible. Moreover, it should be noted that CSC and PAL technologies have been used mainly to map the cell surface N-linked glycoproteins. However, it is feasible to capture O-linked glycoproteins and release the deglycosylated peptides from the hydrazide resin by β-elimination, acid hydrolysis99,230 or transoximisation.231
The reaction between glycan aldehydes and aminooxy-derivatised polymers has been used for selective capturing of both oxidised glycans and glycopeptides followed by MS detection of the acid-released species in a method named glycoblotting.232 An alternative way of releasing glycans captured via the oxime bond is through transoximisation using excess O-substituted aminooxy derivatives under weakly acidic conditions.231 Since it is based on a similar yet opposite concept, it was coined a “reverse glycoblotting” strategy. The principle is the selective oxidation of sialic acid containing glycopeptides by sodium periodate under mild conditions.230 This strategy relies on the oxidation of sialic acids using 1 mM sodium periodate at 0 °C for 15 min, generating available aldehydes on the C7 of sialic acids, which are then coupled to aminooxy-functionalised polymers. After removal of non-specifically bound molecules, the sialoglycopeptides are released using 3% TFA in an aqueous solution at 100 °C for 1 h, which hydrolyses the α-glycoside bonds between the oxidised sialic acids and adjacent galactose residues and allows the identification of the desialylated glycopeptides. A similar method was later reported using hydrazide-functionalised polymers to capture sialylated glycoproteins oxidised using 2 mM sodium periodate at 0 °C for 10 min. The non-specifically bound proteins were removed with stringent washing before trypsin digestion. The captured sialoglycopeptides were released with 0.1 M formic acid at 80 °C for 1 h and the desialylated glycopeptides were analysed by LC-MSn using orthogonal fragmentation techniques.99 In total, 36 N-linked and 44 O-linked glycosylation sites formerly occupied by sialoglycans were detected from proteins derived from human cerebrospinal fluid.
Recently, the combination of glycoblotting and selected reaction monitoring (SRM) has been used to quantitate intact sialoglycopeptides derived from serum sialoglycoproteins. In the improved glycoblotting protocol, oxidised sialylated glycopeptides linked to hydrazide-resin were released in an intact form using ice-cold aqueous 1 M HCl. This treatment regenerates the aldehyde groups that initially reacted with 2-aminopyridine in the presence of 2-picoline borane and this reversible reaction caused the sialoglycopeptides to elute. Specifically, 26 sialylated glycopeptides were identified from LC-MS analysis of 50 μL mouse serum. The corresponding tandem MS spectra were then used to determine the optimal transitions for subsequent quantitative SRM experiments. The ability to elute intact sialoglycopeptides from the hydrazide resin is a highly interesting optimisation of the technique, enabling the use of hydrazide chemistry in more genuine glycoproteomics workflows. One of the limitations of hydrazide chemistry, however, is that it requires chemical oxidation, which, in addition to destroying the structure of the monosaccharide, affects other vicinal amino alcohols such as N-terminal serine and threonine residues. These residues will be oxidised by the periodate and will generate glyoxylyl derivatives that may also be captured by the hydrazide resin.233 Moreover, the selectivity of the hydrazide resin towards the sialylated species relies on temperature and time-dependent oxidation reactions, which will need to be described in more detail by thorough kinetic investigations before further optimisation of the workflows can be expected.
Taken together, there is an abundance of techniques available for the enrichment and prefractionation of sialoglycopeptides. This wide selection is beneficial since the presented methods are somewhat complementary in their capacity to isolate sialylated glycopeptides and are compatible with different downstream approaches. Indeed each method enriches a unique subset of the sialoglycoproteome due to their different binding mechanisms to the sialylated glycopeptides. This is supported by the limited overlap of the detected phosphoproteomes observed when different enrichment methods were compared.234 Furthermore, a comparison between affinity and chemistry-based enrichment of sialylated glycoproteins illustrated that the two methods are capable of identifying complementary subsets of the glycoproteome with some degree of overlap.235 Hence, combining these enrichment approaches is beneficial when the aim is to enhance the coverage of the sialoglycoproteome. The choice of enrichment method will, as such, largely be determined by the research question investigated, personal preference and in-house technology. Interestingly, with the exception of serotonin none of the enrichment methods developed so far has been able to discriminate between glycopeptides containing different types of sialic acids e.g. Neu5Ac and Neu5Gc.
Top-down analyses of glycoproteins, and sialoglycoproteins in particular, are significantly more challenging than the analysis of their non-modified counterparts due to the extensive microheterogeneity and unfavourable MS properties (discussed in the second of these two reviews78). As a result, the top down analysis of intact sialoglycoproteins has, until now, only been performed in a few cases using purified or semi-purified glycoproteins. Hence, top-down analysis of sialoglycoproteins requires the use of efficient protein purification methods, which not only purify the protein of interest to homogeneity, but ideally also isolate or partially separate the individual sialoglycoforms. The initial isolation steps can be achieved using conventional protein purification techniques including immunoprecipitation, liquid–liquid extraction, affinity chromatography and other types of LC by considering the specific structural and physicochemical properties of the glycoprotein of interest.107 In addition, enrichment tools targeting the sialoglycans can be used to broadly purify sialoglycoproteins although one has to be aware that only a subset of the entire spectrum of glycoforms of specific glycoproteins may be represented in the enriched fraction. As an example, lectin affinity chromatography has been used to enrich broadly for sialoglycoproteins.171 In this work, lectin affinity chromatography using three different sialic acid specific lectins (SNA, MAL and WGA) followed by RPC prefractionation, enabled the identification of a total of 130 sialylated glycoproteins by downstream LC-MS of tryptic glycopeptides and released sialoglycans.
The separation of sialoglycoforms of a protein can be pursued on- or off-line and is needed, first, to allow the mass spectrometer to establish an accurate molecular mass without mass and signal suppression interference from other sialoglycoforms and, second, to allow time for the mass spectrometer to isolate and fragment the individual sialoglycoforms for additional structural validation. CE can be coupled to MS and the combination has proven powerful in the separation and detection of sialoglycoforms at high sensitivity.236,237 In comparison, various LC techniques including RPC and anion exchange chromatography yield seemingly less separation of sialoglycoforms, but may be easier to couple directly to MS for on-line data acquisition.238,239 Other non-MS detection methods can establish important structural information about the sialoglycoproteins e.g. two dimensional gel electrophoresis.83,179 Aspects dealing with the separation and detection of intact sialoglycoproteins are covered in more detail in the second of the two related reviews.78
Another important bioorthogonal functional group for the incorporation of unnatural sialic acids is the azide, which has the advantage of being totally absent in biological systems in nature. In addition, azides do not perturb the conformation of their substrates. The physiological precursor of all sialic acids is N-acetyl D-mannosamine (ManNAc), which undergoes a series of enzymatic transformations to generate the activated donor CMP-sialic acid substrate that transfers sialic acids to glycoproteins or glycolipids.251 The incorporation of azide containing sialic acids into sialoglycoproteins residing in the cell surface was performed in Jurkat cells by incubation with the precursor derivative N-azidoacetylmannosamine (ManNAz).252 Using Staudinger ligation,252 the azide groups were subsequently ligated to phosphine groups bearing a wide array of probes such as biotin and various peptides e.g. FLAG, myc, and His6 for imaging and for enrichment by immunoprecipitation prior to glycoproteomics analysis.253 The phosphine-FLAG-His6 probe enabled two orthogonal purification steps before LC-MS based analysis of the glycopeptides.254 The metabolic incorporation of ManNAz was also utilised to label the cell surface sialoglycoproteins of syngeneic prostate cancer cell lines derived from non-metastatic (N2) and highly metastatic (ML2) prostate cancer cells.255 Affinity isolation of the modified sialoglycoproteins was performed using a biotinylated alkyne via click chemistry.256 This was followed by 1D gel electrophoresis and LC-MS analysis, which identified a total of 324 and 372 proteins from N2 and ML2, respectively. As expected, the glycoproteins uniquely identified in the highly metastatic cancer cell line were involved in cellular migration and invasion processes.
Finally, metabolic labelling of sialoglycoproteins with 9-aryl azide-substituted sialic acid (9AAzNeuAc) was used to identify the trans ligands of CD22 on opposing B cells using a protein-glycan cross-linking strategy.257 Cross-linking of CD22-Fc or CD22-expressing CHO cells to intact B cells allowed the identification of a subset of candidate trans ligands of CD22 including immunoglobulin M, CD45 and Basigin.
Staudinger ligation has been used for probing cell surface sialic acids in several biological systems ranging from cell lines258 to living animals.244,259 However, this reaction suffers from slow kinetics and phosphine oxidation in air.260 Besides their electrophilic behaviour, azides are 1,3-dipoles that can react with terminal alkynes to provide stable triazoles according to [3 + 2] copper-catalysed azide–alkyne cycloaddition (CuAAC).261 In its uncatalysed form this reaction requires high temperatures and pressures, but it was made biocompatible by adding catalytic amounts of Cu(I).262,263 This 1,3-dipolar CuAAC shows all the properties of a click reaction such as reaction efficiency, simplicity, and selectivity in addition to being biocompatible. Moreover, this reaction is 25 times faster than the Staudinger ligation and was consequently used to label cell surface sialic acids bearing azides with alkyne probes for imaging and glycoproteomic studies. The secreted and cell surface tethered N- and O-linked sialoglycoproteins of human mesenchymal stem cells were manipulated using a membrane permeable ManNAz, which was converted to N-azidoacetylsialic acid prior to incorporation. The labelled sialoglycoproteins were separated by 1D or 2D gel electrophoresis and detected by fluorescence before the appropriate gel bands/spots were excised and analysed by MS.264
The incorporation of unnatural sialic acids was further explored using the metabolic precursor alkynyl ManNAc to image the cell surface sialoglycoproteins in several cancer cell lines.265 It was demonstrated that click-activated fluorogenic probes are useful to efficiently and selectively label alkynyl-modified glycans. This metabolic precursor was further applied to the enrichment of sialylated glycoproteins of prostate cancer cells.266 Following metabolic labelling, the cells were lysed and reacted with an azido-biotin probe through CuAAC. The tagged glycoproteins were captured by streptavidin-conjugated beads, followed by on-bead digestion with trypsin and PNGase F in order to identify the formerly sialylated N-linked glycopeptides and map the previously occupied N-glycosylated sites. In total, 108 N-glycoproteins were identified from 1.5 mg of protein starting material. Another application of this strategy was to investigate the sialylation profile of isogenic lung cancer cells with low (CL1-0) and high (CL1-5) metastatic potential. MS based glycoproteomic analysis identified 157 N-linked glycoproteins and, amongst the identified sialoglycoproteins, EGFR exhibited a higher degree of sialylation and fucosylation in the lung cells with high metastatic potential. The sialylation turned out to be important for modulating the dimerisation and tyrosine phosphorylation, and hence the function, of EGFR.267 The sialic acid metabolic labelling efficiency with alkynyl ManNAc is superior to that of ManNAz. The successful labelling of sialic acids was shown in mice;268 however, the use of copper generally limits its use in living organisms. Although new, biocompatible copper ligands were developed to enable the use of CuAAC in living systems,269 there are still concerns regarding the toxicity of these ligands.
Hence, a strain-promoted [3 + 2] cycloaddition between cyclooctynes and azides was developed that proceeds without the need for a copper catalyst.270 The sensitivity of this reaction was further improved using difluorinated cyclooctynes (DIFO) probes.271 Using this copper-free click chemistry, the spatiotemporal dynamics of the sialome in living zebrafish embryos were monitored by reactions with fluorophore-conjugated DIFO reagents.272
Finally, other unnatural sialic acids have been described as precursors for metabolic labelling of cell surface sialoglycoproteins273 including precursors using thiol containing functional groups.274 The thiol-analogue of ManNAc, Ac5ManNTGc, allowed the incorporation of N-thiolglycolyl neuraminic acid as a sialic acid substitute into sialoglycoproteins on the cell surface of Jurkat T cells and human embryoid body-derived stem cells. This allowed cell–cell clustering and attachment to biomaterials and synthetic scaffolds for tissue engineering. Based on the concept of activity based protein profiling275 that uses chemical probes to target and profile specific enzymes, new membrane-permeable alkyne-hinged 3-fluorosialyl fluoride molecules were used to covalently bind to virus, bacteria, and human sialidases.276 The alkyne group was subsequently used to covalently attach, via click chemistry, an azide containing biotin for detection and profiling. Using MS it was shown that a tyrosine residue of the sialidases was involved in the binding with the fluoride derivative of sialic acid. Moreover, it was possible to map the sialidase activity in living cells under different conditions.
In conclusion, the metabolic labelling of sialoglycoproteins has proven to be an efficient tool for their enhanced detection and isolation in several biological systems. However, several challenges remain with respect to competition from the endogenous substrates, such as natural sialic acids, which results in incomplete metabolic labelling. Moreover, the overall physiology of the system investigated may be perturbed by the addition of unnatural sialic acids. Finally, biases may be introduced in the incorporation of unnatural sialic acids into the sialoglycoproteins. Further research into the specificity of the different sialyltransferases towards the artificial sialic acid variants may generate a better understanding of this.
The use of glycoproteomics-oriented approaches to study sialylation is an attractive, but challenging, analytical route to take, since it yields site-specific information on the sialoglycans. The popularity of such workflows is increasing, driven in part by the development of efficient and robust enrichment techniques for sialoglycopeptides from complex matrices that reduce the interference from non-modified peptides and neutral glycopeptides. The negative charge of the sialic acid residues has widely been used as a physicochemical property in enrichment and prefractionation strategies, and this review has emphasised their advantages and limitations. To date, the choice of analytical route, including the enrichment and derivatisation methods, has mainly been based on the specific expertise of individual laboratories and on personal preferences. As a consequence, the enrichment methods still lack thorough comparison in terms of reproducibility, robustness, specificity and sensitivity. In addition, an enrichment strategy to enable the characterisation of the entire sialome is still missing, since most of the methods have been directed mainly towards the Neu5Ac and Neu5Gc structures, limiting our knowledge about other sialic acids and their modified counterparts. It should be noted that the majority of the enrichment methods target N-linked sialoglycopeptides and glycoproteins due to the lack of efficient methods to release and analyse O-linked glycans and O-linked glycopeptides. We expect that many of these strategies will be further improved to enable deeper investigation of the structural diversity of sialoglycoproteins using large-scale glycoproteomics. Unbiased large-scale glycoproteomic approaches will drive the discovery of new cellular pathways and will elucidate how different post-translational modifications regulate protein function in a concerted way.
However, to date, no large-scale study has been reported targeting the entire sialoglycoproteome: system-wide sialoglycoproteomics approaches have predominantly been used to identify the sialoglycoproteins and their glycosylation sites. As such, a significant piece of structural information on the sialylated glycoproteins has not been addressed at the system-wide scale: the structure of the sialoglycans determined in a site-specific manner. We envision that the development of more dedicated bioinformatics tools and intelligent LC-MS acquisitions (discussed in the second of these two related reviews78) will allow more comprehensive investigation of the sialoglycoproteome on the site-specific level by the large-scale study of intact glycopeptides (Parker et al., submitted and Lendal S. et al., in preparation).277
In conclusion, the investigation of structural sialoglycobiology holds great promise for obtaining a better understanding of human health and disease. Indeed, sialoglycoconjugates are widely recognised targets for the diagnosis and treatment of various human diseases.
ABC | Ammonium bicarbonate |
CE | Capillary electrophoresis |
CSC | Cell surface capturing |
CMP | Cytidine 5′-monophosphate |
COFRADIC | Combined fractional diagonal chromatography |
ConA | Concanavalin A |
CuAAC | Copper-catalysed azide-alkyne cycloaddition |
DIFO | Difluorinated cyclooctynes |
EGFR | Epidermal growth factor receptor |
ERLIC | Electrostatic repulsion hydrophilic interaction chromatography |
GAL | Galactose oxidase and aniline-catalysed oxime ligation |
Gal | Galactose |
GalNAc | N-Acetylgalactosamine |
GlcNAc | N-Acetylglucosamine |
HCD | Higher-energy C-trap dissociation |
HILIC | Hydrophilic interaction liquid chromatography |
IPG-IEF | Immobilised pH gradient-isoelectric focusing |
Kdn | 2-Keto-3-deoxynonulosonic acid |
LC | Liquid chromatography |
MAL | Maackia amurensis leukoagglutinin |
MAH | Maackia amurensis hemagglutinin |
ManNAz | N-Azidoacetylmannosamine |
MLAC | Multiple lectin affinity chromatography |
MS | Mass spectrometry |
Neu5Ac | N-Acetylneuraminic acid |
Neu5Gc | N-Glycolylneuraminic acid |
NPC | Normal phase chromatography |
PAL | Periodate and aniline-catalysed oxime ligation |
PNGase A/F | N-Glycosidase A/F |
RPC | Reversed phase chromatography |
SCX | Strong cation exchange |
SLAC | Serial lectin affinity chromatography |
SNA | Sambucus nigra agglutinin |
SPE | Solid phase extraction |
SRM | Selected reaction monitoring |
TCA | Trichloroacetic acid |
TFA | Trifluoroacetic acid |
TiO2 | Titanium dioxide |
WAX | Weak anion exchange chromatography |
WGA | Wheat germ agglutinin |
ZIC | Zwitter-ionic |
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c3ra42960a
This journal is © The Royal Society of Chemistry 2013