Integration of RNAseq transcriptomics and N-glycomics reveal biosynthetic pathways and predict structure-specific N-glycan expression

Abstract

The processes involved in protein N-glycosylation represent new therapeutic targets for diseases but their stepwise and overlapping biosynthetic processes make it challenging to identify the specific glycogenes involved. In this work, we aimed to elucidate the interactions between glycogene expression and N-glycan abundance by constructing supervised machine-learning models for each N-glycan composition. Regression models were trained to predict N-glycan abundance (response variable) from glycogene expression (predictors) using paired LC-MS/MS N-glycomic and 3′-TagSeq transcriptomic datasets from cells derived from multiple tissue origins and treatment conditions. The datasets include cells from several tissue origins – B cell, brain, colon, lung, muscle, prostate – encompassing nearly 400 N-glycan compounds and over 160 glycogenes filtered from an 18 000-gene transcriptome. Accurate models (validation R2 > 0.8) predicted N-glycan abundance across cell types, including GLC01 (lung cancer), CCD19-Lu (lung fibroblast), and Tib-190 (B cell). Model importance scores ranked glycogene contributions to N-glycan predictions, revealing significant glycogene associations with specific N-glycan types. The predictions were consistent across input cell quantities, unlike LC-MS/MS glycomics which showed inconsistent results. This suggests that the models can reliably predict N-glycosylation even in samples with low cell amounts and by extension, single-cell samples. These findings can provide insights into cellular N-glycosylation machinery, offering potential therapeutic strategies for diseases linked to aberrant glycosylation, such as cancer, and neurodegenerative and autoimmune disorders.

Graphical abstract: Integration of RNAseq transcriptomics and N-glycomics reveal biosynthetic pathways and predict structure-specific N-glycan expression

Supplementary files

Article information

Article type
Edge Article
Submitted
18 Sun 2025
Accepted
20 Kul 2025
First published
04 Dzi 2025
This article is Open Access

All publication charges for this article have been paid for by the Royal Society of Chemistry
Creative Commons BY-NC license

Chem. Sci., 2025, Advance Article

Integration of RNAseq transcriptomics and N-glycomics reveal biosynthetic pathways and predict structure-specific N-glycan expression

M. R. S. Alvarez, X. A. Holmes, A. Oloumi, S. J. Grijaldo-Alvarez, R. Schindler, Q. Zhou, A. Yadlapati, A. Silsirivanit and C. B. Lebrilla, Chem. Sci., 2025, Advance Article , DOI: 10.1039/D5SC00467E

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.

To request permission to reproduce material from this article in a commercial publication, please go to the Copyright Clearance Center request page.

If you are an author contributing to an RSC publication, you do not need to request permission provided correct acknowledgement is given.

If you are the author of this article, you do not need to request permission to reproduce figures and diagrams provided correct acknowledgement is given. If you want to reproduce the whole article in a third-party commercial publication (excluding your thesis/dissertation for which permission is not required) please go to the Copyright Clearance Center request page.

Read more about how to correctly acknowledge RSC content.

Social activity

Spotlight

Advertisements