Open Access Article
Roshiya Nongmaithemᵃ, Raju Sasikumar*ᵃ, Irengbam Barun Mangangᵇ, Selva Kumar Tᵃᶜ, Kambhampati Vivekᵈ and Amit K. Jaiswal*ᵉᶠ
ᵃ Department of Agribusiness Management and Food Technology, North-Eastern Hill University, Tura Campus, Chasingre 794002, Tura, West Garo Hills, Meghalaya, India. E-mail: sashibiofoodster@gmail.com
ᵇ College of Food Technology, Lamphelpat, Central Agricultural University, Imphal-795004, India
ᶜ Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Chennai 600062, India
ᵈ Department of Food Process Engineering, National Institute of Technology, Rourkela, India
ᵉ School of Food Science and Environmental Health, Faculty of Sciences and Health, Technological University Dublin, City Campus, Central Quad, Grangegorman, Dublin, D07 ADY7, Ireland. E-mail: amit.jaiswal@tudublin.ie
ᶠ Centre for Sustainable Packaging and Bioproducts (CSPB), Technological University Dublin, City Campus, Grangegorman, Dublin D07 H6K8, Ireland
First published on 13th April 2026
In this study, a probiotic beverage from Sohiong (Prunus nepalensis), an underutilized wild edible fruit rich in phenolics and anthocyanins, was optimized using both Response Surface Methodology (RSM) and an Artificial Neural Network–Genetic Algorithm (ANN–GA) hybrid model. Plackett–Burman screening identified temperature, strain, and inoculum size as significant variables influencing antioxidant response. Subsequent optimization using a Box–Behnken design yielded maximum DPPH scavenging activity of 71.55%, with enhanced total phenolic content (TPC) of 160.00 mg GAE per g, total anthocyanin content (TAC) of 227.85 mg C3GE per 100 mL, and total flavonoid content (TFC) of 183.96 mg QE per 100 mL. The RSM model showed good predictive capacity (R² = 0.9467; RMSE = 1.98; AAD = 3.21%), but the ANN–GA model outperformed it with higher accuracy (R² = 0.9988; RMSE = 0.63; AAD = 1.46%). Validation under ANN–GA-optimized conditions closely matched the predicted values, with only a 0.04% deviation in DPPH. These results confirm the superior predictive fidelity of ANN–GA for nonlinear fermentation systems. The optimized Sohiong probiotic beverage demonstrates significant antioxidant activity and is a promising functional beverage. This study highlights the potential of integrating traditional RSM with modern AI tools like ANN–GA in functional food formulation and bioactive compound enrichment.
Sustainability spotlight
This study presents a sustainable bioprocessing strategy to convert Sohiong (Prunus nepalensis), an underutilised wild fruit rich in phenolics and anthocyanins, into a high-value probiotic-fermented functional beverage. By integrating response surface methodology with artificial neural network–genetic algorithm modelling, the process achieved superior antioxidant activity and bioactive retention while minimising trial-and-error experimentation, thereby reducing material, energy, and experimental resource use. The approach promotes the development of clean-label, functional products using locally available biodiversity and aligns with the principles of the circular bioeconomy. Valorisation of indigenous fruits through precision fermentation contributes to SDG 3 (Good Health and Well-being), SDG 9 (Industry, Innovation and Infrastructure), SDG 12 (Responsible Consumption and Production), and SDG 13 (Climate Action). This work offers a scalable, data-driven model for sustainable, eco-innovative food bioprocessing that supports climate-smart nutrition, rural enterprise, and functional-beverage innovation, advancing sustainable food systems and promoting regional agricultural resilience.
The biochemical composition of Sohiong (P. nepalensis) demonstrates remarkable therapeutic potential, containing high concentrations of anthocyanins (∼984 mg C3G per 100 g dry mass), essential micronutrients, and bioactive phytochemicals that contribute to its pronounced antioxidant capacity, as shown by DPPH and FRAP assays.3 These constituents have also been applied in product development, such as yoghurt, syrups, and hard candies, where anthocyanin stability was maintained over 14–90 days.4 However, commercial processing remains severely constrained due to an ephemeral harvesting season (August–November) and rapid post-harvest deterioration (shelf life of only 4–5 days under ambient conditions), limiting its accessibility to broader markets.5
The development of innovative processing technologies, particularly the formulation of probiotic-enriched products such as fermented beverages and dehydrated powders, presents a strategic approach to overcoming these inherent limitations. Such value-added initiatives preserve the functional bioactive components and extend the product's commercial viability, thereby addressing the growing consumer preference for nutraceutical products with demonstrated health benefits. Fermentative bioprocessing using probiotic microorganisms is a well-established biotechnological approach for enhancing the functional properties of food matrices. The strategic application of beneficial microorganisms, including Lactobacillus species and Bifidobacterium strains, facilitates the development of functional foods with demonstrated clinical efficacy in promoting intestinal health, modulating immunity, and optimizing nutrient bioavailability.6,7
Response surface methodology is a collection of statistical and mathematical methods designed to elucidate relationships among multiple independent variables and their corresponding response parameters,8 and it has become the predominant optimization approach in fermentation process development.9 Recent research has also seen the emergence of alternative algorithmic approaches for fermentation optimization. Extraction methodologies have been refined by integrating response surface analysis with back-propagation neural network–genetic algorithm frameworks. Dragoi et al.10 developed and validated an enhanced adaptive differential evolution algorithm incorporating neural modelling that is capable of optimizing aerobic fermentation processes through precise computational modelling.
ANNs are a machine-learning methodology capable of modelling intricate nonlinear interdependencies among process variables.9 GAs are computationally effective optimization techniques that emulate biological evolutionary mechanisms, including selection pressure, genetic crossover, and mutation operations. Over the preceding decade, the combined application of ANN–GA hybrid systems has gained prominence in fermentation process optimization and extraction parameter determination, demonstrating superior efficiency in identifying optimal operational conditions with reduced experimental burden. Representative applications include pectin recovery from sunflower seed matrices, polyphenolic compound extraction from green tea substrates, bioactive compound isolation from pitaya peel waste, and ellagitannin recovery from black raspberry seed materials.11–13 In modelling complex nonlinear variable relationships, ANN systems often surpass conventional fitting methodologies such as traditional RSM approaches.14
Although several studies highlight the effectiveness of statistical and AI-based optimization across various food systems, none have explored Sohiong, an underutilized, anthocyanin-rich wild fruit. For the first time, this research integrates RSM with ANN–GA hybrid modeling to optimize Sohiong probiotic beverage fermentation, thereby demonstrating both the functional potential and the superior predictive accuracy of ANN–GA in fermentation systems. The present investigation has the following research objectives: (I) implementation of the Plackett–Burman experimental design to identify statistically significant process variables, (II) development and validation of both ANN–GA and RSM predictive models, and (III) comparative evaluation of RSM and ANN–GA methodologies to determine the optimal modelling approach and derive optimal fermentation parameters. The research outcomes will provide an essential technical foundation for industrial production scaling while contributing valuable insights for fermentation optimization methodologies in similar bioprocessing applications.
The probiotic juice was prepared by inoculating a culture (either Lactobacillus plantarum MCC 2974 or Bifidobacterium BB-12) into the pasteurized juice at a concentration of 4 × 10⁶ to 8 × 10⁶ CFU mL⁻¹. The mixture was incubated at 35 to 37 °C for 24 to 72 hours to allow fermentation, thereby enhancing the probiotic content and flavor. The fermented juice was stored at 4 °C until further use.
Based on the findings of the PB experiment, the factors with a significant effect on the response were chosen. Six gradient levels, each run in triplicate, were established, with the step size and direction determined from the regression model and experimental experience. To determine the ranges and centre points of the relevant factors for the ensuing tests, the response indexes were measured under the single-factor optimal conditions.
Based on the significance ranking of the factors, the Box–Behnken (BB) experimental design was carried out in accordance with the findings of the PB experiment and the steepest-ascent experiment. The factor values of the group with the highest DPPH scavenging activity in the steepest-ascent experiment served as the centre point.
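As an illustration of how a three-factor Box–Behnken layout is built, the sketch below generates the coded runs (−1, 0, +1). The function name `box_behnken_3` and the choice of five centre points are assumptions made for this example; the text does not state how many centre points were used.

```python
from itertools import combinations, product

def box_behnken_3(n_center):
    """Coded Box-Behnken runs for three factors (levels -1, 0, +1)."""
    runs = []
    # For each pair of factors, run a 2x2 factorial while the third stays at 0.
    for i, j in combinations(range(3), 2):
        for a, b in product((-1, 1), repeat=2):
            row = [0, 0, 0]
            row[i], row[j] = a, b
            runs.append(tuple(row))
    # Replicated centre points allow pure error to be estimated.
    runs.extend([(0, 0, 0)] * n_center)
    return runs

design = box_behnken_3(n_center=5)  # n_center=5 is an assumption
for run in design:
    print(run)
```

The design never places all three factors at an extreme level simultaneously, which is what distinguishes it from a full three-level factorial.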
The dataset of 25 observations was split into training (70%), validation (15%), and testing (15%) subsets. The validation set was essential for implementing early stopping and ensuring that both ANN and ANN–GA models generalize beyond the training data. Unlike RSM, which estimates only a limited number of regression coefficients, ANN models contain many adjustable parameters and are therefore prone to overfitting, particularly when trained on small datasets. This risk is further amplified when GA is used to optimize weights and biases. Incorporating cross-validation helps prevent overfitting and ensures that the ANN and ANN–GA models generalize reliably. A logsig–logsig–purelin transfer function configuration was adopted for the input, hidden, and output layers, respectively. The Levenberg–Marquardt algorithm was used to minimize mean squared error (MSE).
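To make the logsig–logsig–purelin configuration concrete, the following sketch defines the two transfer functions and a single forward pass. It follows the layer assignment literally as described above; the network size and all weights in the usage lines are hypothetical placeholders, not values from the study.

```python
import math

def logsig(x):
    """Log-sigmoid transfer function: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def purelin(x):
    """Linear transfer function: returns the input unchanged."""
    return x

def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """One pass through a logsig-logsig-purelin network with a single output."""
    scaled = [logsig(v) for v in inputs]                      # input layer
    hidden = [logsig(sum(w * s for w, s in zip(row, scaled)) + b)
              for row, b in zip(hidden_w, hidden_b)]          # hidden layer
    return purelin(sum(w * h for w, h in zip(out_w, hidden)) + out_b)  # output

# Hypothetical 3-input, 2-hidden-neuron network evaluated at one coded point.
y = forward([-1.0, 0.0, 1.0],
            hidden_w=[[0.2, -0.1, 0.4], [-0.3, 0.5, 0.1]],
            hidden_b=[0.0, 0.1],
            out_w=[0.7, -0.2],
            out_b=0.05)
print(y)
```

Because the output layer is linear, the network can return values outside (0, 1), which is why a linear output is a common choice for regression targets such as DPPH percentage.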
Although ANN models are traditionally associated with large datasets, they are increasingly applied to small, well-designed experimental datasets such as those generated by Box–Behnken or central composite designs because these structured designs efficiently capture the underlying nonlinearity with a minimal number of experimental points. In this context, the ANN does not require thousands of data points; instead, it benefits from the high information content and orthogonality of the design matrix. Unlike RSM, which is constrained to quadratic polynomial relationships, the ANN can model complex, nonlinear interactions among variables even when the dataset contains only 15–30 observations. Therefore, the use of ANN is justified not by dataset size alone but by its superior ability to learn nonlinear response patterns inherent to fermentation processes.
To enhance prediction accuracy, a GA was employed to optimize the ANN's initial weights and biases. The GA was run for 100 generations with a population size of 50, using a crossover fraction of 0.8 and a mutation rate of 0.01. The fitness function minimized the squared difference between the network output and the target antioxidant values. Overall, the workflow comprised factor screening, optimization by RSM, modelling by ANN–GA, and validation of the obtained results.
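A minimal sketch of a GA searching network weights with the stated settings (100 generations, population 50, crossover fraction 0.8, mutation rate 0.01) is given below. Everything else is an assumption made for illustration: the 3-4-1 network size, tournament selection, uniform crossover, Gaussian mutation, elitism, and the synthetic `toy_data`. It covers only the GA stage, not the subsequent Levenberg–Marquardt training.

```python
import math
import random

POP_SIZE, GENERATIONS = 50, 100                 # values stated in the text
CROSSOVER_FRACTION, MUTATION_RATE = 0.8, 0.01   # values stated in the text
N_IN, N_HID = 3, 4                              # hypothetical network size
N_GENES = N_HID * N_IN + N_HID + N_HID + 1      # flattened weights and biases

def predict(genes, x):
    """Single-hidden-layer network: logsig hidden units, linear output."""
    out = genes[-1]                             # output bias
    for h in range(N_HID):
        w = genes[h * N_IN:(h + 1) * N_IN]
        b = genes[N_HID * N_IN + h]
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        out += genes[N_HID * N_IN + N_HID + h] / (1.0 + math.exp(-z))
    return out

def fitness(genes, data):
    """Sum of squared differences between network output and targets."""
    return sum((predict(genes, x) - y) ** 2 for x, y in data)

def tournament(pop, scores, rng):
    """Pick two distinct individuals at random and return the fitter one."""
    a, b = rng.sample(range(len(pop)), 2)
    return pop[a] if scores[a] < scores[b] else pop[b]

def run_ga(data, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(N_GENES)] for _ in range(POP_SIZE)]
    history = []
    for _ in range(GENERATIONS):
        scores = [fitness(g, data) for g in pop]
        best = min(range(POP_SIZE), key=scores.__getitem__)
        history.append(scores[best])
        children = [pop[best][:]]               # elitism: keep the best unchanged
        while len(children) < POP_SIZE:
            p1, p2 = tournament(pop, scores, rng), tournament(pop, scores, rng)
            if rng.random() < CROSSOVER_FRACTION:
                child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            else:
                child = p1[:]
            child = [g + rng.gauss(0, 0.5) if rng.random() < MUTATION_RATE else g
                     for g in child]            # per-gene Gaussian mutation
            children.append(child)
        pop = children
    scores = [fitness(g, data) for g in pop]
    best = min(range(POP_SIZE), key=scores.__getitem__)
    history.append(scores[best])
    return pop[best], history

# Synthetic, normalized toy data (not the study's measurements).
toy_data = [((a, b, c), 0.5 + 0.4 * c)
            for a in (-1, 1) for b in (-1, 1) for c in (-1, 1)]
best_genes, history = run_ga(toy_data)
print(f"initial best error: {history[0]:.4f}, final best error: {history[-1]:.4f}")
```

With elitism, the best error can never increase from one generation to the next, so the GA hands the gradient-based trainer a starting point at least as good as the best random initialization.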
Antioxidant activity increased as the inoculum size was raised to the optimal level. This is possibly because higher viable cell counts accelerate metabolite production. It was found that a 24-hour inoculation time was ideal. Longer fermentation times reduced antioxidant content by degrading phenolics and anthocyanins, whereas shorter fermentation times led to incomplete fermentation. Optimal probiotic growth and metabolic release were supported at the ideal temperature of 37 °C. Higher temperatures increased stress and reduced the stability of bioactive compounds, while lower temperatures hindered microbial activity.
Fermentation optimisation commonly begins with a Plackett–Burman (P–B) screening to identify key factors, followed by RSM for refined modelling. For example, Bajpai et al.19 used a P–B design to screen significant medium components for CoQ10 production, and then optimized their levels via RSM. This yielded a substantial increase in CoQ10 titer (from 10.8 to 18.57 mg L⁻¹) under the RSM-optimized conditions. Similarly, Hu et al.16 optimized coffee pulp wine fermentation by first pinpointing critical factors (material-to-liquid ratio, pH, sugar, yeast inoculum) through P–B screening, then applying RSM (central composite design) to model the fermentation response. In both cases, RSM proved effective in navigating multifactor interactions to improve fermentation metrics. Likewise, the sugarcane–papaya wine optimization by Patil et al.20 achieved significant enhancement in product yields and antioxidant levels using a comparable RSM-based approach. These findings align well with the current study, where a P–B design identified the influential variables (temperature, time, inoculum size) for subsequent RSM modelling. The use of RSM in the Sohiong study is in line with the literature. It provides a quadratic model capturing factor interactions, which is crucial since fermentation outcomes depend on a synergy of conditions. Notably, RSM optimization in fruit-based fermentations has led to improved product qualities. Deshaware et al.21 optimized bitter gourd fermentation to boost nutrients, and Yuan et al.22 optimized jujube wine fermentation for maximal antioxidant content. These precedents support the approach used in the Sohiong juice study and provide a benchmark for expected improvements. The main effects plot (Fig. 1) illustrates a steep upward trend in DPPH with increasing temperature, corroborating the strong coefficient observed in the model.
Strain L (Lactobacillus plantarum MCC 2974) outperformed strain B (Bifidobacterium BB-12), validating its selection for final formulation. Similarly, increased inoculum size positively influenced antioxidant activity.
The regression equation based on coded variables is:
$$Y_{\mathrm{DPPH}} = 4.90 + 1.745X_1 + 0.505X_2 + (1.69 \times 10^{-7})X_4 - 0.0018X_3$$
The results demonstrated a consistent rise in DPPH scavenging activity with increasing temperature (Fig. 2). The maximum DPPH of 71.11% was achieved at 37 °C, establishing this point as the centre point for the subsequent Box–Behnken design. Corresponding improvements in TPC (156.10 mg GAE per g), TAC (229.23 mg C3GE per 100 mL), and TFC (176.56 mg QE per 100 mL) were also observed at this optimal temperature, indicating co-enhancement of secondary antioxidant parameters.
The final regression equation in terms of coded factors was:
$$Y = 68.06 + 0.014A - 0.011B + 2.52C + 0.32A^2 + 0.15B^2 + 0.23C^2 + 0.16AB - 0.045AC - 0.065BC$$
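The coded-factor model can be evaluated directly to locate its maximum inside the design region. The sketch below does this with a coarse grid search over coded levels from −1 to +1; the function name `predict_dpph` and the grid resolution are choices made for this example, and no particular assignment of A, B, and C to the physical factors is assumed.

```python
from itertools import product

def predict_dpph(a, b, c):
    """Second-order model from the text; a, b, c are coded factor levels."""
    return (68.06 + 0.014 * a - 0.011 * b + 2.52 * c
            + 0.32 * a**2 + 0.15 * b**2 + 0.23 * c**2
            + 0.16 * a * b - 0.045 * a * c - 0.065 * b * c)

# Coded range -1 ... +1 in steps of 0.1 (grid resolution is an arbitrary choice).
levels = [i / 10 for i in range(-10, 11)]
best = max(product(levels, repeat=3), key=lambda p: predict_dpph(*p))
print(best, round(predict_dpph(*best), 2))  # -> (-1.0, -1.0, 1.0) 71.55
```

The maximum sits at the corner A = −1, B = −1, C = +1 and reproduces the predicted DPPH of 71.55% reported for this design. All squared-term coefficients are positive, so the fitted surface curves upward and its maximum within the design region necessarily lies on the boundary rather than at an interior stationary point.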
ANOVA results showed that the model was statistically significant (p < 0.001) with R² = 0.9467, adjusted R² = 0.8782, predicted R² = 0.5193, lack-of-fit F = 1.42 (p = 0.3597; not significant), and adequate precision = 10.698 (a detailed table is provided in the SI). A non-significant lack-of-fit implies that the model adequately fits the data, with no unexplained variation beyond random error. This supports the use of the model for process prediction and optimization.
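The relationship between R² and adjusted R² can be checked with the standard definition, sketched below. The run count of 17 is an assumption (a three-factor Box–Behnken design with 12 edge runs and 5 centre points); the text does not state it. The nine model terms are the three linear, three quadratic, and three interaction terms of the equation above.

```python
def adjusted_r2(r2, n_runs, n_terms):
    """Adjusted R^2 for a model with n_terms regressors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n_runs - 1) / (n_runs - n_terms - 1)

# n_runs=17 is an assumption; n_terms=9 follows from the quadratic model.
print(round(adjusted_r2(0.9467, n_runs=17, n_terms=9), 4))  # -> 0.8782
```

Under that assumption the reported adjusted R² is reproduced, which illustrates how strongly the adjustment penalizes a nine-term model fitted to a small design.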
The response surface plots showed elliptical and curved profiles, confirming the presence of significant interaction effects among the selected factors (Fig. 3). The plots revealed that increasing temperature had a consistent positive impact on DPPH activity, especially when combined with lower inoculum size and shorter fermentation periods.
Fig. 3 Response-surface and contour plots illustrating the relationships among different process variables for DPPH.
The predicted optimum conditions for DPPH maximization were inoculum size = 4.00 × 10⁶ CFU mL⁻¹, inoculation period = 24 hours, and temperature = 37 °C, with a predicted DPPH of 71.55%. These conditions were selected for validation and further modelling using ANN.
The optimized ANN–GA model predicted the maximum DPPH under the following conditions: inoculum size, 8.82 × 10⁶ CFU mL⁻¹; inoculation time, 24 h; and strain, L. plantarum MCC 2974. The predicted responses were DPPH, 71.22%; TPC, 160.00 mg GAE per g; TAC, 227.85 mg C3GE per 100 mL; and TFC, 183.96 mg QE per 100 mL (Fig. 5). These outputs confirm the ANN–GA's ability to achieve or slightly exceed the predicted maxima from the RSM approach, demonstrating its superior predictive fidelity and multi-response optimization capacity.
An important insight across recent studies is that ANN models coupled with GA (ANN–GA) often outperform RSM in capturing nonlinear fermentation dynamics. In the CoQ10 work of Bajpai et al.,19 an ANN–GA model trained on the same data achieved an R² of ∼0.999 with a negligible mean squared error (0.0059), compared to R² ≈ 0.99 for the RSM quadratic model.
The ANN thus explained variance slightly better, reflecting its ability to learn complex, non-linear relationships beyond the polynomial form of RSM. This trend is echoed by Hu et al.,16 who reported that the ANN–GA model for coffee pulp wine had a higher explained variance (R² = 0.914) than the RSM model (R² = 0.890) and a lower root-mean-square error (RMSE = 0.0896 vs. 0.0968). In other words, the ANN–GA provided a tighter fit to the fermentation data, improving predictive accuracy. Anastácio et al.23 observed the same pattern in a phenolic extraction study: the ANN model for DPPH antioxidant response showed higher R² and lower RMSE than the corresponding RSM model. Even when RSM models are statistically significant, ANNs can capture residual patterns or interactions that RSM misses. In this study, a similar outcome is evident: both modelling approaches achieved high R² (0.9467 for RSM and 0.9988 for ANN–GA), but the ANN optimized via GA showed lower prediction errors than RSM (RMSE 0.63 vs. 1.98; AAD 1.46% vs. 3.21%). This aligns with the previous studies by Muthusamy et al.11 and Mukherjee et al.,24 in which ANN-based models generalized better (with lower average absolute deviation, AAD) than RSM in bioprocess optimizations. In summary, the literature consistently shows (and the Sohiong juice results confirm) that ANN–GA modelling can provide more accurate predictions of fermentation outcomes, reflected in higher R² and smaller error metrics, even though RSM remains valuable for its statistical interpretability.
| Metric | RSM | ANN–GA |
|---|---|---|
| R² (unitless) | 0.9467 | 0.9988 |
| RMSE (% DPPH) | 1.98 | 0.63 |
| AAD (%) | 3.21 | 1.46 |
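For reference, the three comparison metrics can be computed as in the sketch below, using the commonly used definitions of R², RMSE, and AAD. The exact formulas applied in the study are not given in the text, so these definitions are an assumption, and the `observed` and `predicted` values are made-up numbers for illustration only.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root-mean-square error, in the units of the response."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def aad_percent(observed, predicted):
    """Average absolute deviation as a percentage of the observed values."""
    n = len(observed)
    return 100.0 / n * sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted))

# Made-up DPPH values (%), not data from the study.
observed = [65.0, 68.0, 71.0]
predicted = [66.0, 68.0, 70.0]
print(r_squared(observed, predicted), rmse(observed, predicted),
      aad_percent(observed, predicted))
```

R² is unitless, RMSE carries the units of the response (here % DPPH), and AAD is a relative error, which is why the three rows of the table are not directly comparable with one another.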
A crucial aspect of optimization studies is how well the models predict actual outcomes (i.e. the accuracy of optimization). The literature suggests that when models are properly trained, their optimum predictions are very reliable. In the study by Bajpai et al.,19 after ANN–GA optimization predicted an improved CoQ10 yield (≈27.9 mg L⁻¹), experimental validation under those conditions gave 27.04 mg L⁻¹, only about a 3% deviation from the prediction. This tight agreement exemplifies high optimization accuracy. In the coffee pulp wine study, Hu et al.16 report that the ANN–GA optimal conditions (material ratio ∼4.25%, initial pH 6.92, sugar ∼22.25%, yeast 1.98%) were validated experimentally, yielding 10.248 mg L⁻¹ of the target response versus a predicted 10.255 mg L⁻¹. The error here was virtually negligible (≪1%), again highlighting excellent predictive fidelity. By comparison, RSM in that study, while slightly less precise, also provided a solid prediction (the RSM-optimal outcome was only marginally off the ANN's, with R² ∼0.89). In the bitter gourd-grape fermentation, RSM models were validated by additional experiments and an ANN, which confirmed the RSM-optimised level as near-optimal (e.g., the chosen 35% bitter gourd juice level was deemed ideal and produced the intended low-alcohol, high sensory-acceptance profile). These examples are consistent with the current study, in which validation under the ANN–GA-optimized conditions deviated from the predicted DPPH by only 0.04%.
The model fitting parameters reported in the Sohiong manuscript (e.g. high R² values, low RMSE and AAD) reflect the strong correlation between predicted and observed values at the optimum. This is corroborated by other recent optimizations: for instance, Lau et al.25 optimized a lipase fermentation using RSM and ANN–GA and achieved ∼1.6-fold higher enzyme titer with both methods, with the ANN–GA slightly edging RSM in accuracy. They note that both approaches predicted similar optima, but the ANN–GA model exhibited an improved predictive match to the experimental results (owing to lower error metrics). Likewise, in an ultrasound-assisted extraction study,26 the ANN–GA optimum predictions for yield and antioxidant endpoints showed <5% deviation from experimental values, whereas the RSM optimum showed a somewhat larger gap. Across these cases, a common thread is that optimization accuracy is very high (often within a few percent) when adequate modelling (especially using ANN–GA or a hybrid approach) is employed. The Sohiong juice optimization appears to follow this trend, as the manuscript reports strong agreement between model-predicted and actual values for DPPH, TPC, etc., post-optimization. This high degree of accuracy is critical in validating the chosen approach. It demonstrates that the ANN–GA and RSM models were not only statistically sound but also practically reliable in guiding the formulation of the probiotic juice with maximal functional benefits.
One of the main goals in functional beverage fermentation is to enhance both the bioactive compounds and the antioxidant activity. Multiple studies over the past decade have documented significant improvements in DPPH scavenging activity, TPC, TFC, and TAC following optimization of fermentation conditions. Liao et al.27 provide a striking example: by optimizing the co-fermentation of blueberry juice with mixed probiotics (using a simplex mixture design and GA optimization), they achieved an 82.2% increase in TPC and dramatic rises in specific polyphenols (e.g., rutin up 79%) in the fermented juice compared to the unfermented control.
This coincided with a sharp increase in antioxidant activity, as evidenced by the higher DPPH scavenging capacity in the fermented blueberry juice. In a similar study, Yuan et al.28 observed that fermenting an apple–tomato pulp with lactic cultures led to TPC and TFC increases of ∼21–22%, accompanied by a 40.9% improvement in DPPH free-radical scavenging ability (and ∼22% increase in ABTS activity) relative to the unfermented pulp.
These enhancements are attributed to microbial biotransformation of phytochemicals; for instance, Lactobacillus strains can release bound phenolics from the fruit matrix or biosynthesize antioxidant metabolites. The Sohiong-based probiotic juice, rich in anthocyanins (Sohiong is a dark purple fruit), likely showed analogous trends: fermentation optimization would increase TAC and other phenolics, thereby boosting DPPH activity. Indeed, probiotic fermentation is known to elevate anthocyanin content and antioxidant power in fruit substrates. Sun et al.29 and others have reported enhanced anthocyanin stability and antioxidant profiles in fermented berry wines, which aligns with Section 3.5 of the Sohiong study. Additionally, Maselesele et al.30 demonstrated that optimizing a bitter gourd-grape fermentation not only controlled alcohol content but also retained functional components; their RSM-optimized low-alcohol beverage maintained high phenolic content and antioxidant activity. By comparison, the Sohiong juice optimization appears to have succeeded in maximizing these health-related metrics. Any reported gains in DPPH, TPC, TAC, and TFC in the Sohiong study are well within the spectrum of improvements seen in the aforementioned studies.
Supplementary information (SI) is available. See DOI: https://doi.org/10.1039/d5fb00940e.
This journal is © The Royal Society of Chemistry 2026