Process optimization using machine learning enhanced design of experiments (DOE): ranibizumab refolding as a case study
Abstract
Empirical approaches such as the design of experiments (DOE) are commonly used during bioprocess optimization and characterization. In this paper, we present an application of machine learning (ML) enhanced DOE, in which the statistical analysis step of the DOE is replaced by ML to capture process nonlinearity more effectively. The proposed approach has been applied to the modelling and optimization of the refolding of ranibizumab; refolding is a widely recognized bottleneck for the process yield and productivity of complex proteins. First, the identified critical process parameters (CPPs) were used to develop causative relationships between the CPPs and the critical quality attributes (CQAs). Using the generated database, the support vector regression algorithm with polynomial and Gaussian kernels was implemented to optimize the process conditions while maximizing the yield. The significance of the interaction terms was evaluated based on their t-statistics and p-values. The resulting model predicted the refolding yield with an R² of 0.99 and a root mean square error (RMSE) of 0.55 for the training dataset and 1.04 for the cross-validation dataset. Compared with DOE using conventional statistical analysis, the proposed approach gave a 3% improvement in prediction efficiency and improvements of 50% and 5% in the RMSE and R² values, respectively. We believe that ML enhanced DOE can significantly improve the efficiency of process development.
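The two goodness-of-fit metrics quoted above can be illustrated with a minimal sketch. It assumes the standard definitions of RMSE and of the coefficient of determination (R²); the function names and the yield values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    if ss_total == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_residual / ss_total


# Hypothetical refolding yields (%), for illustration only; not data from the study.
observed_yield = [20.0, 30.0, 40.0, 50.0]
predicted_yield = [21.0, 29.0, 41.0, 49.0]

print(f"RMSE = {rmse(observed_yield, predicted_yield):.2f}")
print(f"R^2  = {r_squared(observed_yield, predicted_yield):.3f}")
```

RMSE is expressed in the same units as the response (here, percentage yield), which is why separate values are reported for the training and cross-validation datasets, whereas R² is dimensionless.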
- This article is part of the themed collection: Biocatalysis & Bioprocessing