A new alternative tool to analyse glycosylation in pharmaceutical proteins based on infrared spectroscopy combined with nonlinear support vector regression
Abstract
Almost 60% of commercialised pharmaceutical proteins are glycosylated. Glycosylation is considered a critical quality attribute, as it affects the stability, bioactivity and safety of proteins. Hence, the development of analytical methods to characterise the composition and structure of glycoproteins is crucial. Existing methods are time-consuming and expensive, and they require extensive sample preparation, which can compromise the robustness of the analyses. In this work, we propose Fourier transform infrared (FT-IR) spectroscopy, a fast, direct, and simple technique, combined with a chemometric strategy to address this challenge. To this end, a database of FT-IR spectra of glycoproteins was built, and the glycoproteins were characterised by reference methods (MALDI-TOF, LC-ESI-QTOF and LC-FLR-MS) to estimate the carbohydrate-to-protein mass ratio and determine the monosaccharide composition. The FT-IR spectra were processed first by Partial Least Squares Regression (PLSR), one of the most widely used regression algorithms in spectroscopy, and then by Support Vector Regression (SVR). SVR has emerged in recent years and is now considered a powerful alternative to PLSR, thanks to its ability to model nonlinear relationships flexibly. The results provide clear evidence that FT-IR spectroscopy combined with SVR modelling can efficiently characterise glycosylation in therapeutic proteins. The SVR models showed better predictive performance than the PLSR models in terms of RMSECV, RMSEP, R²CV, R²Pred and RPD. This tool offers several potential applications, such as comparing the glycosylation of a biosimilar with that of the original molecule, monitoring batch-to-batch homogeneity, and in-process control.
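For readers less familiar with the figures of merit named above, the following minimal Python sketch shows how a root mean square error of prediction (RMSEP), a coefficient of determination (R²) and a ratio of performance to deviation (RPD) might be computed from reference and predicted values. The formulas follow common chemometric conventions and are an assumption here, not a description of the exact definitions used in this work; in particular, RPD is taken as the sample standard deviation of the reference values divided by the RMSE. The function names and the numerical values are hypothetical and serve only as an illustration.

```python
import math
from statistics import stdev


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rpd(y_true, y_pred):
    """Ratio of performance to deviation.

    Assumed convention: sample standard deviation (n - 1 denominator)
    of the reference values divided by the RMSE.
    """
    return stdev(y_true) / rmse(y_true, y_pred)


# Hypothetical reference values (e.g. a carbohydrate-to-protein mass ratio
# in arbitrary units) and hypothetical model predictions for a test set.
y_reference = [1.0, 2.0, 3.0, 4.0, 5.0]
y_predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

print("RMSEP:", rmse(y_reference, y_predicted))
print("R2   :", r2(y_reference, y_predicted))
print("RPD  :", rpd(y_reference, y_predicted))
```

Applying the same functions to cross-validation predictions instead of test-set predictions would give the corresponding cross-validated quantities (RMSECV and R²CV), so two models can be compared on identical terms.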