Mohamed A. Koranyᵃ,
Marwa A. A. Ragab*ᵃ,
Rasha M. Youssefᵃ and
Mostafa A. Afifyᵇ
ᵃ Faculty of Pharmacy, Department of Pharmaceutical Analytical Chemistry, University of Alexandria, El-Messalah, Alexandria 21521, Egypt. E-mail: marmed_2001@yahoo.com; Fax: +20 3 4873273; Tel: +20 3 4871317
ᵇ Borg Pharmaceutical Industries, Borg El-Arab new city – Industrial Zone 3 – Area 3 – district 17, Alexandria, Egypt
First published on 4th December 2014
An experimental design was adopted to attain the optimum reaction parameters for the chemical derivatization of anhydrous sodium alendronate in an oral solution formulation via the Hantzsch condensation reaction. All reaction-controlling variables, namely reaction time, temperature, reagent ratio and volume, and buffer type, pH and volume, were studied using the Plackett–Burman screening design to determine the significant variables. Reaction temperature and pH of the buffer solution were found to be significant. Optimization was performed using the central composite design to find the optimum levels of these variables. Moreover, a comparison was made with artificial neural networks and support vector machines; the same results were obtained with a low percentage relative error. For the spectrophotometric analysis, interferences from oral solution excipients were eliminated with a simple extraction procedure before measuring the absorbance at 340 nm. Satisfactory results of sample analysis were obtained and they were in good agreement with the label claim. A linear calibration graph of absorbance versus concentration was obtained with a very low intercept and a high correlation coefficient (0.9999) in the range of 2.44–34.10 μg mL⁻¹. The proposed spectrophotometric method was fully validated in accordance with ICH guidelines. Statistical comparison with a reported reference method showed similar results with respect to accuracy and precision.
Therefore, several methods have been reported for the assay of ALN, including spectrophotometric methods.4–14 However, the proposed method has limits of detection and quantitation that are considerably lower than those of many other published methods.5–7,12,13 Moreover, some of these spectrophotometric methods have a narrower linear dynamic range than the proposed method.4,13,14 In addition, spectrophotometric methods based on the reaction of the amino group of ALN with ninhydrin reagent are non-specific: only the amino group of the drug is incorporated into the reaction product, without the rest of the molecule, so the colored derivative is the same irrespective of the primary amine precursor.11,13 HPLC methods were also described for its analysis.2,3,6,15 ALN was also determined electrochemically.16 A fluorimetric method was reported for the quantification of alendronate and clodronate in aqueous samples and in serum.17 Some capillary electrophoresis methods were described for the determination of ALN.18 A literature review revealed some methods for the determination of ALN in tablets5,6,19–21 but no reported method for the assay of ALN in oral solution.
The Hantzsch condensation reaction depends on the formation of dihydrolutidine derivatives when a β-dicarbonyl compound condenses with an aldehyde and a primary amine or ammonia.22 This reaction is widely used by different analytical techniques, including spectrophotometric, spectrofluorimetric and chromatographic methods.23–27
The Hantzsch condensation reaction is considered to be a perfect example for investigating the effectiveness of experimental design and machine learning strategies in screening and optimizing the large number of parameters affecting this reaction. These parameters include reaction time, temperature, reagent ratio and volume, and buffer type, pH and volume.
Few examples were found in the literature on screening and optimizing the experimental parameters affecting dispersive liquid–liquid microextraction,28 solid-phase extraction29 and chromatographic separations.30 No literature was found on the effectiveness of experimental design31,32 and machine learning strategies in screening and optimizing the parameters affecting the Hantzsch condensation reaction.
The conditions of the condensation reaction were optimized using the design of experiments (DOE) approach in two stages. The first stage used a Plackett–Burman design (PBD) for variable screening. The second stage applied a circumscribed central composite design (CCCD) to optimize the significant variables. A comparison was then performed among CCCD, artificial neural networks (ANN) and support vector machines (SVM).
The theoretical background of these methods was extensively discussed in literature.33–37 Although the univariate procedures (one variable at a time; OVAT) are time and effort consuming, they are still being used in routine methods. In this work, the multivariate design of experiments (DOE or experimental design) is considered because it takes less time, effort and resources than the OVAT method. DOE and the response surface methodology (RSM) were useful for improving and optimizing processes. The RSM has been widely used in analytical and industrial applications.33
Screening designs are used to specify the most significant factors from those potentially affecting the considered responses. Most often, two-level screening designs, such as fractional factorial or Plackett–Burman designs, are used, which allow examining a relatively high number of factors f at L = 2 levels in a relatively small number of experiments (N ≥ f + 1). When f is small, two-level full factorial designs might also be applied for screening purposes.34
A Plackett–Burman design (PBD) is used here for screening, which allows examining maximally f = N − 1 factors in N experiments, where N is a multiple of four (N = 8, 12, 16, 20, …). When f exceeds the number of real factors to be examined, the remaining columns of the PBD are defined as dummy factor columns.34
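The run-count rule above can be sketched in a few lines. This is a minimal illustration, not the software used in this work: `pb_size` and `pb8_design` are hypothetical helper names, and the generator row for N = 8 is the standard published Plackett–Burman generator rather than something stated in this article. The run order it produces need not match the order of the trials tabulated in the article.

```python
import math

def pb_size(n_real_factors):
    """Return (N, n_dummy): the smallest multiple of four N with N - 1 >= f,
    and the number of unused (dummy) columns."""
    n_runs = 4 * math.ceil((n_real_factors + 1) / 4)
    n_dummy = (n_runs - 1) - n_real_factors
    return n_runs, n_dummy

# Assumption: standard first row of the 8-run Plackett-Burman design.
PB8_GENERATOR = [1, 1, 1, -1, 1, -1, -1]

def pb8_design():
    """Build the 8-run, 7-column design: seven cyclic shifts of the generator
    row followed by a final row with every factor at its low level."""
    rows = [PB8_GENERATOR[-i:] + PB8_GENERATOR[:-i] for i in range(7)]
    rows.append([-1] * 7)
    return rows

print(pb_size(7))   # seven real factors, as screened in this work
print(pb_size(9))   # a case that needs dummy columns
for row in pb8_design():
    print(row)
```

Each column of the resulting matrix is balanced (four high and four low settings) and orthogonal to every other column, which is what allows the main effects to be estimated independently.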
A central composite design (CCD) contains a two-level full factorial design ($2^f$ experiments), a star design ($2f$ experiments) and a centre point, requiring $N = 2^f + 2f + 1$ experiments to examine $f$ factors. As a result, 9 experiments are needed for two factors, while 15 are needed for three factors. The points of the full factorial design are situated at the factor levels −1 and +1, those of the star design at the factor levels 0, −α and +α (where α, the axial distance, is the distance from the centre point to the star points), and the centre point at the factor levels 0. Depending on the value of α, two CCDs exist, i.e. a face-centred CCD (FCCD) with |α| = 1 examining the factors at three levels, and a circumscribed CCD (CCCD) with |α| > 1 examining the factors at five levels. For a so-called rotatable CCCD, the level should be $|\alpha| = (2^f)^{1/4}$, i.e. 1.41 and 1.68 for 2 and 3 factors, respectively (Fig. 2).34,35 A CCCD in 2 blocks was used in this work for the optimization of the significant factors. If the number of experiments exceeds the number that can be performed in one day, the experiments should be performed in blocks.33
Fig. 2 Central composite designs for the optimization of: (a) two variables (α = 1.41) and (b) three variables (α = 1.68). (●) points of factorial design, (○) star points and (□) central point.
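As a rough illustration of the CCD structure described above, the following sketch generates the coded design points for f factors. The function name `ccd_points` is hypothetical, and a single centre point is assumed; the blocked design used in this work carries one centre point per block.

```python
import itertools

def ccd_points(f, alpha=None):
    """Coded points of a central composite design for f factors.
    If alpha is not given, the rotatable value (2**f) ** 0.25 is used."""
    if alpha is None:
        alpha = (2 ** f) ** 0.25
    # Full factorial part: every combination of -1 and +1 (2**f points).
    factorial = [list(p) for p in itertools.product((-1.0, 1.0), repeat=f)]
    # Star part: one factor at -alpha or +alpha, all others at 0 (2*f points).
    star = []
    for i in range(f):
        for sign in (-1.0, 1.0):
            point = [0.0] * f
            point[i] = sign * alpha
            star.append(point)
    centre = [[0.0] * f]
    return factorial + star + centre

for f in (2, 3):
    print(f, len(ccd_points(f)), round((2 ** f) ** 0.25, 2))
# prints:
# 2 9 1.41
# 3 15 1.68
```

Passing `alpha=1.0` gives the face-centred variant, in which each factor takes only three levels.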
The artificial neural network (ANN) methodology is an information-processing chemometric technique created specifically to model non-linear information, and it simulates some properties of the human brain. The so-called multilayer feed-forward networks, or multi-layer perceptron (MLP) networks, are often used for prediction as well as for classification.
It is important to stress that ANN have a notable advantage, as there is no need to know the exact form of the analytical function on which the model should be built. Furthermore, neither the functional type nor the number of model parameters need to be given. This is the main difference between modeling by LS regression and ANN.33
In a short period of time, support vector machines (SVM) found many applications in chemistry. For example in drug design, it was used for discriminating between ligands and non-ligands, inhibitors and non-inhibitors. Moreover, in quantitative structure-activity relationships (QSAR), SVM regression is used to predict various physical, chemical, or biological properties. Moreover, SVM was a very useful tool in chemometrics dealing with the optimization of chromatographic separation or compound concentration prediction from spectral data as examples, in sensors (for qualitative and quantitative prediction from sensor data), in chemical engineering (fault detection and modeling of industrial processes) and text mining (automatic recognition of scientific information).37
Support vector machines represent an extension to nonlinear models of the generalized portrait algorithm developed by Vapnik and Lerner. The SVM algorithm is based on the statistical learning theory and the Vapnik–Chervonenkis (VC) dimension. SVM models were originally defined for the classification of linearly separable classes of objects.37 Support vector regression (SVR) is known for handling nonlinear data through kernels; however, it can also handle linear data.38
In this work, we use a radial basis function support vector regression (RBF-SVR) model, because the data are nonlinear owing to interactions between the reaction variables.
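For orientation, the general textbook form of an RBF-SVR predictor is given below. It is not taken from this work, and the hyperparameter values used here are not stated in this section. The symbols are assumptions for illustration: $\mathbf{x}_i$ are the training points (here, coded or actual temperature and pH), $\alpha_i$ and $\alpha_i^{*}$ are Lagrange multipliers (unrelated to the axial distance α of the CCD), $b$ is the offset and $\gamma$ is the kernel width parameter.

$$
\hat{y}(\mathbf{x}) = \sum_{i=1}^{n} \left(\alpha_i - \alpha_i^{*}\right) k(\mathbf{x}_i, \mathbf{x}) + b,
\qquad
k(\mathbf{x}_i, \mathbf{x}) = \exp\!\left(-\gamma \,\lVert \mathbf{x}_i - \mathbf{x} \rVert^{2}\right)
$$

Because the kernel depends on the distance between points rather than on a fixed polynomial form, the fitted surface can follow interactions between variables without those terms being written out explicitly.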
The aim of this work is to investigate, for the first time, the effectiveness of experimental design and machine learning strategies for screening and optimizing the Hantzsch condensation reaction. This reaction was successfully applied for the indirect spectrophotometric determination of ALN, which contains no chromophore, in oral solution. Hantzsch condensation reaction was a suitable example for investigating the effectiveness of such methods as it is affected by a large number of parameters. Plackett–Burman screening design was used to determine significant variables. After that, optimization was performed using the central composite design to get the optimum levels of these variables. A comparison was performed with artificial neural networks and support vector machines. The same results were obtained with low percentage relative error. Moreover, a full validation of the spectrophotometric method was performed in accordance with ICH guidelines. The optimized method was then applied to the spectrophotometric determination of the drug in oral solutions.
Fig. 3 A proposed reaction mechanism for Hantzsch condensation reaction of ALN with acetylacetone and formaldehyde.
Trial no. | t (min) | T (°C) | RR (A:F)ᶜ | RV (mL) | pH | BTᵈ | BV (mL) | Maximum absorbanceᵉ |
---|---|---|---|---|---|---|---|---|
1 | 15.00 (−1) | 21.50 (−1) | 0.50 (−1) | 1.00 (1) | 5.80 (1) | Acetate (1) | 0.50 (−1) | 0.12941 |
2 | 15.00 (−1) | 21.50 (−1) | 2.00 (1) | 1.00 (1) | 3.80 (−1) | Citrate (−1) | 2.00 (1) | 0.01378 |
3 | 15.00 (−1) | 95.00 (1) | 0.50 (−1) | 0.25 (−1) | 5.80 (1) | Citrate (−1) | 2.00 (1) | 0.28700 |
4 | 15.00 (−1) | 95.00 (1) | 2.00 (1) | 0.25 (−1) | 3.80 (−1) | Acetate (1) | 0.50 (−1) | 0.47332 |
5 | 60.00 (1) | 21.50 (−1) | 0.50 (−1) | 0.25 (−1) | 3.80 (−1) | Acetate (1) | 2.00 (1) | 0.13053 |
6 | 60.00 (1) | 21.50 (−1) | 2.00 (1) | 0.25 (−1) | 5.80 (1) | Citrate (−1) | 0.50 (−1) | 0.0104 |
7 | 60.00 (1) | 95.00 (1) | 0.50 (−1) | 1.00 (1) | 3.80 (−1) | Citrate (−1) | 0.50 (−1) | 0.36474 |
8 | 60.00 (1) | 95.00 (1) | 2.00 (1) | 1.00 (1) | 5.80 (1) | Acetate (1) | 2.00 (1) | 0.52309 |
9 (C)ᵇ | 37.50 (0) | 58.25 (0) | 1.25 (0) | 0.63 (0) | 4.80 (0) | Citrate (−1) | 1.25 (0) | 0.19312 |
10 (C)ᵇ | 37.50 (0) | 58.25 (0) | 1.25 (0) | 0.63 (0) | 4.80 (0) | Acetate (1) | 1.25 (0) | 0.19876 |
11 (C)ᵇ | 37.50 (0) | 58.25 (0) | 1.25 (0) | 0.63 (0) | 4.80 (0) | Citrate (−1) | 1.25 (0) | 0.19791 |
12 (C)ᵇ | 37.50 (0) | 58.25 (0) | 1.25 (0) | 0.63 (0) | 4.80 (0) | Acetate (1) | 1.25 (0) | 0.19711 |

ᵃ t is the reaction time, T is the reaction temperature, RR is the reagent ratio, RV is the reagent volume, pH is the pH of the buffer used, BT is the buffer type, and BV is the buffer volume.
ᵇ “(C)” denotes center points.
ᶜ “A” stands for acetylacetone while “F” stands for 37% w/w formaldehyde solution. 0.5: 2.0 mL of A + 4.0 mL of F + 4.0 mL of water. 2: 4.0 mL of A + 2.0 mL of F + 4.0 mL of water. 1.25: 2.5 mL of A + 2.0 mL of F + 3.0 mL of water.
ᵈ Both buffer types are of 0.1 M strength.
ᵉ Ranging from λ = 330–340 nm.

The seven-factor PBD trials with four center points shown in Table 1 were performed using 1 mL of working standard solution (to give a final standard solution concentration of 24.36 μg mL⁻¹). For each trial, the standard was measured spectrophotometrically against the corresponding blank, and the maximum absorbance was recorded and entered in the design matrix using StatSoft STATISTICA 10 software for subsequent data analysis. The results are given in Table 1, and the Pareto chart (Fig. 4) shows that reaction temperature and pH of the buffer are the only significant factors at the 0.05 significance level. The ANOVA (Table 2) shows good agreement of the model with the experimental data (high values of R² and adjusted R²).
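To make the screening arithmetic concrete, the sketch below computes the main effect and sum of squares for two of the continuous factors (reaction time and temperature) from the eight factorial trials of Table 1. It is an illustration under stated assumptions, not the STATISTICA procedure itself: the helper names are hypothetical, and the sum of squares is taken as N × effect² / 4 for a balanced two-level column, with the centre points contributing nothing because their coded level is 0.

```python
# Coded levels and maximum absorbances of trials 1-8, copied from Table 1.
t_coded = [-1, -1, -1, -1, 1, 1, 1, 1]
T_coded = [-1, -1, 1, 1, -1, -1, 1, 1]
absorbance = [0.12941, 0.01378, 0.28700, 0.47332,
              0.13053, 0.0104, 0.36474, 0.52309]

def main_effect(levels, response):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for x, y in zip(levels, response) if x > 0]
    low = [y for x, y in zip(levels, response) if x < 0]
    return sum(high) / len(high) - sum(low) / len(low)

def effect_sum_of_squares(levels, response):
    """Sum of squares of a balanced two-level column: N * effect**2 / 4."""
    return len(response) * main_effect(levels, response) ** 2 / 4

for name, column in (("t", t_coded), ("T", T_coded)):
    print(name,
          round(main_effect(column, absorbance), 6),
          round(effect_sum_of_squares(column, absorbance), 6))
```

The temperature column gives a sum of squares of about 0.2326 against about 0.0020 for reaction time, in line with the corresponding SS entries of the ANOVA table and with temperature dominating the Pareto chart.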
Factor | SS | df | MS | F | p |
---|---|---|---|---|---|
(1) t (min) | 0.001961 | 1 | 0.001961 | 1.4416 | 0.296121 |
(2) T (°C) | 0.232572 | 1 | 0.232572 | 170.9758 | 0.000197 |
(3) RR (A:F)ᵇ | 0.001483 | 1 | 0.001483 | 1.0900 | 0.355413 |
(4) RV (mL) | 0.002105 | 1 | 0.002105 | 1.5475 | 0.281422 |
(5) BT | 0.000064 | 1 | 0.000064 | 0.0468 | 0.839365 |
(6) pH | 0.042112 | 1 | 0.042112 | 30.9590 | 0.005110 |
(7) BV (mL) | 0.000069 | 1 | 0.000069 | 0.0506 | 0.833016 |
Error | 0.005441 | 4 | 0.001360 | | |
Total SS | 0.285807 | 11 | | | |
R² | 0.98096 | | | | |
Adjusted R² | 0.94765 | | | | |

ᵃ t is the reaction time, T is the reaction temperature, RR is the reagent ratio, RV is the reagent volume, pH is the pH of the buffer used, BT is the buffer type, BV is the buffer volume, SS is sum of squares, df is degrees of freedom and MS is mean square. Significant factors (p-value = 0.05) appear in bold.
ᵇ “A” stands for acetylacetone while “F” stands for 37% (w/w) formaldehyde solution.

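The R² and adjusted R² entries follow directly from the error and total rows of the ANOVA table. A minimal sketch using the usual definitions (the function names are hypothetical; the inputs are the rounded values printed in the table):

```python
def r_squared(ss_error, ss_total):
    """Fraction of the total sum of squares explained by the model."""
    return 1.0 - ss_error / ss_total

def adjusted_r_squared(ss_error, df_error, ss_total, df_total):
    """R-squared corrected for the degrees of freedom used by the model."""
    return 1.0 - (ss_error / df_error) / (ss_total / df_total)

# Error and total rows of the PBD ANOVA table.
r2 = r_squared(0.005441, 0.285807)
adj_r2 = adjusted_r_squared(0.005441, 4, 0.285807, 11)
print(r2, adj_r2)
```

Both values agree with the tabulated 0.98096 and 0.94765 to within the rounding of the inputs.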
Trial no. | Block | T (°C) | pH | Absorbanceᶜ |
---|---|---|---|---|
1 | 1 | 75.00 (−1.00) | 3.50 (−1.00) | 0.08952 |
2 | 1 | 75.00 (−1.00) | 6.50 (1.00) | 0.25332 |
3 | 1 | 95.00 (1.00) | 3.50 (−1.00) | 0.35412 |
4 | 1 | 95.00 (1.00) | 6.50 (1.00) | 0.32158 |
5 (C)ᵇ | 1 | 85.00 (0.00) | 5.00 (0.00) | 0.39515 |
6 | 2 | 70.86 (−1.41) | 5.00 (0.00) | 0.29174 |
7 | 2 | 99.14 (1.41) | 5.00 (0.00) | 0.51417 |
8 | 2 | 85.00 (0.00) | 2.88 (−1.41) | 0.14871 |
9 | 2 | 85.00 (0.00) | 7.12 (1.41) | 0.32323 |
10 (C)ᵇ | 2 | 85.00 (0.00) | 5.00 (0.00) | 0.40513 |

ᵃ T is the reaction temperature and pH is the pH of the acetate buffer.
ᵇ “(C)” denotes center points.
ᶜ At λ = 340 nm.

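The actual settings in the table follow from the coded levels by a linear mapping, actual = centre + coded × step. The sketch below assumes the centre and step values implied by the 0 and ±1 rows (85 °C and 10 °C for temperature, 5.0 and 1.5 units for pH); `decode` is a hypothetical helper name.

```python
def decode(coded, centre, step):
    """Convert a coded factor level to its actual setting."""
    return centre + coded * step

alpha = (2 ** 2) ** 0.25  # rotatable axial distance for two factors
coded_levels = (-alpha, -1, 0, 1, alpha)

T_levels = [round(decode(c, 85.0, 10.0), 2) for c in coded_levels]
pH_levels = [round(decode(c, 5.0, 1.5), 2) for c in coded_levels]
print(T_levels)   # [70.86, 75.0, 85.0, 95.0, 99.14]
print(pH_levels)  # [2.88, 3.5, 5.0, 6.5, 7.12]
```

The five levels per factor are what distinguish the circumscribed design from a face-centred one, where the star points would coincide with the ±1 levels.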
The results are shown in Table 3. The subsequent statistical analysis using StatSoft STATISTICA 10 software is presented in the Pareto chart (Fig. 5) and the ANOVA table (Table 4), which shows good agreement with the experimental data (high values of R² and adjusted R²), while a response surface plot (Fig. 6) shows the relation between the two significant variables to be optimized and absorbance. For these variables, the relation of temperature with absorbance was linear rather than quadratic, whereas the relation of the buffer solution pH with absorbance was both linear and quadratic (Fig. 5, 7 and Table 4). The desirability function graph (Fig. 7) shows the optimum levels for both reaction temperature (99.14 °C) and pH of the buffer solution (5.0).
Percentage relative error (Er%) of predicted optimum and observed optimum = −0.44%.
Factor | SS | df | MS | F | p |
---|---|---|---|---|---|
Blocks | 0.007251 | 1 | 0.007251 | 7.46478 | 0.071809 |
(1) T (°C) (L) | 0.052394 | 1 | 0.052394 | 53.93539 | 0.005217 |
T (°C) (Q) | 0.001001 | 1 | 0.001001 | 1.03083 | 0.384725 |
(2) pH (L) | 0.017868 | 1 | 0.017868 | 18.39382 | 0.023301 |
pH (Q) | 0.044166 | 1 | 0.044166 | 45.46537 | 0.006662 |
1L by 2L | 0.009638 | 1 | 0.009638 | 9.92103 | 0.051275 |
Error | 0.002914 | 3 | 0.000971 | | |
Total SS | 0.138413 | 9 | | | |
R² | 0.97895 | | | | |
Adjusted R² | 0.93684 | | | | |

ᵃ T is the reaction temperature, pH is the pH of the acetate buffer, L is linear effect, Q is quadratic effect, SS is sum of squares, df is degrees of freedom and MS is mean square. Significant factors (p-value = 0.05) appear in bold.

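The F column of the ANOVA table is the ratio of each effect's mean square to the error mean square. The following sketch recomputes it from the printed sums of squares; since every effect has one degree of freedom, MS equals SS. The dictionary keys are informal labels, and small differences from the tabulated F values are expected because the inputs are rounded.

```python
# Sums of squares (each with df = 1) and the error row of the CCCD ANOVA table.
effect_ss = {
    "Blocks": 0.007251,
    "T (L)": 0.052394,
    "T (Q)": 0.001001,
    "pH (L)": 0.017868,
    "pH (Q)": 0.044166,
    "T x pH": 0.009638,
}
ms_error = 0.002914 / 3  # error SS divided by error df

f_ratios = {name: ss / ms_error for name, ss in effect_ss.items()}
for name, f_value in f_ratios.items():
    print(f"{name:8s} F = {f_value:.3f}")
```

The large F ratios for T (L), pH (L) and pH (Q) correspond to the three effects with p below 0.05 in the table.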
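The predictive equation is:

$$
z = -3.9974861300552 + 0.049615109658483\,x - 0.0001480033125\,x^{2} + 0.746510940046\,y - 0.043685358333334\,y^{2} - 0.0032723600000001\,xy + 0
$$

where x is the reaction temperature (°C), y is the pH of the buffer solution and z is the predicted absorbance.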
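The predictive equation can be evaluated numerically to locate its maximum inside the experimental region. The sketch below is an illustration, not the desirability-function procedure of STATISTICA: it assumes x is the temperature in °C and y the buffer pH, drops the trailing zero term, and runs a plain grid search over the range spanned by the star points (70.86–99.14 °C, pH 2.88–7.12). The function and variable names are hypothetical.

```python
def predicted_absorbance(x, y):
    """Quadratic response surface; x = temperature (deg C), y = buffer pH (assumed)."""
    return (-3.9974861300552
            + 0.049615109658483 * x
            - 0.0001480033125 * x ** 2
            + 0.746510940046 * y
            - 0.043685358333334 * y ** 2
            - 0.0032723600000001 * x * y)

# Grid over the experimental region spanned by the star points.
T_grid = [round(70.86 + 0.14 * i, 2) for i in range(203)]   # 70.86 ... 99.14
pH_grid = [round(2.88 + 0.01 * j, 2) for j in range(425)]   # 2.88 ... 7.12

best_z, best_T, best_pH = max(
    (predicted_absorbance(T, pH), T, pH) for T in T_grid for pH in pH_grid
)
print(best_T, best_pH, round(best_z, 4))
print(round(predicted_absorbance(85.0, 5.0), 4))  # centre of the design
```

The search lands on the upper temperature bound with a pH close to 5, consistent with the optimum reported from the desirability function graph. At the design centre the equation returns about 0.400, close to the two measured centre-point absorbances.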
A suitable method to find the optimal location is through the graphical representation of the model. Two types of graphs may provide helpful results: (a) the response surface in the three dimensional space and (b) the graph of contours that is the projection of the surface in a plane, represented as lines of constant response. Each contour corresponds to a specific height of the surface.33 Hence, a surface plot between independent variables (reaction temperature and pH of the buffer solution) and dependent variable (absorbance) was plotted and carefully examined as illustrated in Fig. 8.
Percentage relative error (Er%) of predicted optimum and observed optimum = −0.007%.
The overlay contour plot of both CCCD and ANN predictive models (Fig. 9) shows coincidence of the optimal location for both optimization techniques at the maximum temperature of the water bath and the pH of the buffer solution equal to 5.0.
In a similar manner to ANN, the optimal location for SVM was determined by plotting and carefully examining the surface plot.
Percentage relative error (Er%) of predicted optimum and observed optimum = −0.41%.
The overlay contour plot of both CCCD and SVM predictive models (Fig. 10) shows coincidence of the optimal location for both optimization techniques at the maximum temperature of the water bath and the pH of the buffer solution equal to 5.0.
Therefore, the three optimization techniques (CCCD, ANN and SVM) gave the same results with a low percentage relative error (Er%), which confirms that a buffer solution of pH 5.0 and a boiling water bath are the optimum conditions for the Hantzsch condensation reaction of sodium alendronate for its determination in the oral solution formulation.
Concentration (μg mL⁻¹) | Mean % recovery ± SD | RSD (%) | Er (%) |
---|---|---|---|
(a) Intra-day precision and accuracy | | | |
0.00 (placebo) | 0.68 ± 3.79 × 10⁻² | 5.56 | 0.68 |
19.47 | 100.73 ± 1.65 × 10⁻¹ | 0.20 | 0.73 |
24.34 | 98.47 ± 9.62 × 10⁻² | 0.10 | −1.53 |
29.21 | 100.31 ± 3.32 × 10⁻¹ | 0.28 | 0.31 |
(b) Inter-day precision and accuracy | | | |
0.00 (placebo) | −0.41 ± 2.27 × 10⁻¹ | −55.90 | −0.41 |
19.47 | 98.52 ± 2.60 × 10⁻¹ | 0.33 | −1.48 |
24.34 | 99.37 ± 2.61 × 10⁻¹ | 0.26 | −0.63 |
29.21 | 98.63 ± 2.12 × 10⁰ | 1.79 | −1.37 |

ᵃ Mean ± standard deviation of three determinations.

This journal is © The Royal Society of Chemistry 2015