Peter Sagmeister,ab Johannes Poms,c Jason D. Williams*ab and C. Oliver Kappe*ab
a Center for Continuous Flow Synthesis and Processing (CCFLOW), Research Center Pharmaceutical Engineering (RCPE), Inffeldgasse 13, 8010 Graz, Austria. E-mail: jason.williams@rcpe.at; oliver.kappe@uni-graz.at
b Institute of Chemistry, University of Graz, NAWI Graz, Heinrichstrasse 28, A-8010 Graz, Austria
c Research Center Pharmaceutical Engineering (RCPE), Inffeldgasse 13, 8010 Graz, Austria
First published on 21st February 2020
Inline benchtop NMR analysis is established as a powerful tool for reaction monitoring, but its capabilities are somewhat limited by low spectral resolution, often leading to overlapping peaks and difficulties in quantification. By processing the data with a multivariate analysis (MVA) statistical approach, these hurdles can be overcome, enabling accurate quantification of complex product mixtures. By employing rapid data acquisition (2.0 s recording time per spectrum), we demonstrate the use of inline benchtop NMR to guide the optimization of a complex nitration reaction in flow. Accurate quantification of four overlapping species was possible, enabling generation of a robust DoE model along with accurate evaluation of dynamic experiments.
The increased uptake of flow technology, particularly systems equipped with process analytical technology (PAT),4 has opened the field to reaction optimization using advanced techniques such as design of experiments (DoE),5 dynamic experimentation,6 automated self-optimization7 and feedback loops for process control.8 Driven by regulatory authorities, PAT has emerged as a key tool to support pharmaceutical development, manufacturing and quality by design (QbD).9 In this context, PAT can enable faster and more reliable process optimization and enhanced process control compared to cases using offline analysis only. The resulting workflows perfectly align with approaches towards Industry 4.0 – data driven automation for development and advanced process control for manufacturing.10
In terms of hardware, a PAT tool can be as simple as a temperature, pressure, pH or conductivity probe. More complex analytical instruments have also been integrated to continuous flow reactors for monitoring single or multistep syntheses.11 These include UV-vis,12 Raman13 or infrared (IR)14 spectroscopies, but also chromatographic techniques such as high/ultra performance liquid chromatography (HPLC/UPLC),15 and gas chromatography (GC).16
Low field (benchtop) NMR is also being increasingly utilized within academic and industrial labs.17 Due to their simplicity, low operating cost, compactness and ability to operate without deuterated solvents, these instruments have the potential to serve as an excellent addition to the arsenal of PAT tools.18 However, the resolution of the resulting spectra is poor compared to high field instruments and overlapping signals can be especially troublesome when attempting to quantify components in complex mixtures. In order to follow reaction progress, well-resolved peaks are monitored and a long acquisition time is generally utilized.19 The resulting analysis period is relatively long, when considering the collection of multiple data points for the purpose of reaction optimization.
Data analysis methods, such as multivariate analysis (MVA), are vital in the utilization of PAT, because they allow overlapping spectroscopic signals to be adequately deconvoluted. MVA is a statistical approach that quantifies components by their “fingerprint” signals in a measurement, which enables quantification of different components in a complex spectrum. Typically, an MVA model is built using a training set, consisting of mixtures of the analytes with known concentrations. It has been demonstrated in many instances that MVA, using a method such as partial least squares (PLS) regression, is an efficient approach to process NIR20 and fluorescence21 spectra. Surprisingly, examples in which NMR spectra are processed using MVA appear to consist almost exclusively of studies in food and biological analytics.22
A small number of examples exist that demonstrate the combination of NMR with MVA to monitor chemical transformations. Most notably, the group of Maiwald has demonstrated a process monitoring strategy for an SNAr reaction using inline NMR, quantified using a multivariate analysis PLS model.23 This was later expanded to simple optimization experiments to maximize plant throughput.24 In a related concept, it would be of substantial benefit to expand the usage of this technique to early-stage reaction optimization studies using common approaches, such as DoE or dynamic experimentation.25 Herein, we demonstrate the utility of an MVA approach to interpret rapidly recorded low field NMR spectra, which contain a complex mixture of species, for optimization and monitoring of a continuous flow nitration (Fig. 1).
This nitration is generally achieved using either sulfuric acid or acetic acid as solvent. Sulfuric acid provides an exceptionally fast rate of reaction,27 whereas acetic acid requires heating and prolonged reaction times.28 In view of capitalizing on the benefits of flow chemistry, in mixing and heat dissipation, the reaction procedure in sulfuric acid was selected. Due to incompatibilities of neat sulfuric acid with polyether ether ketone (PEEK) material fittings, a quench and membrane separation were introduced prior to NMR analysis. The reaction stream was quenched with water and the organic products were simultaneously extracted into an organic solvent, isopropyl acetate (iPrOAc). The phases were then separated using an inline phase separator (SEP-10, Zaiput). The organic phase was then passed through a benchtop NMR (Spinsolve Ultra 43 MHz, Magritek) flow-through cell for continuous inline analysis (Fig. 2b).
Initial attempts to carry out this nitration used a Lonza FlowPlate reactor (Ehrfeld), with a “liquid–liquid” plate design.30 Reaction streams were delivered using SyrDos 2 pumps (HiTec Zang) into a reactor constructed of Modular MikroReaction System (MMRS, Ehrfeld) parts, with temperature control supplied by 2× Ministat 240 thermostats (Huber). However, in this reactor, the reaction progress appeared to be limited depending upon the flow rates used (see ESI†).31 This was proposed to be due to mixing limitations caused by the high reaction mixture viscosity (dynamic viscosity of 95% H2SO4 = 17.6 mPa s at 25 °C).32 Due to the large volume of water required to quench the acid stream (5:1 flow rate ratio) and total flow rate restriction imposed by the phase separator, a relatively low flow rate of reaction mixture was necessitated (1.1 mL min−1 combined). Accordingly, it was found that switching to a mixer based on a split-and-recombine mixing principle (Cascade Mixer 06, Ehrfeld) provided effective mixing, allowing full reaction conversion independent of the flow rates used (see ESI†).33
With the finalized reactor setup, we sought to determine the residence time distribution (RTD) by taking advantage of NMR analysis with a short acquisition period. This type of RTD measurement is generally carried out by injecting a dye solution and observing its flow pattern using UV-vis. One key advantage of using NMR for this purpose instead, is that any organic molecule with sufficient separation from the solvent signals can be used. In this case, where sulfuric acid is used as the solvent, viscosity plays a key role, but the solvent is incompatible with the majority of organic dyes. The actual reaction solvent can be used in this case, providing more representative RTD data for the entire process from reaction, through separation, to NMR analysis (see ESI†).
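As an illustration of how a tracer trace recorded by NMR might be reduced to an RTD, the following minimal sketch normalizes a pulse response into E(t) and computes the mean residence time and variance. It assumes a pulse injection; the signal values are invented for illustration and the function names are hypothetical, not taken from the original work (see the ESI for the actual measurement).

```python
def trapezoid(y, x):
    """Integrate y over x with the trapezoidal rule."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))


def residence_time_distribution(times, signal):
    """Return (E, mean_time, variance) for a pulse-tracer response."""
    area = trapezoid(signal, times)
    # E(t) is the tracer signal normalized to unit area
    e_curve = [s / area for s in signal]
    # First moment of E(t) gives the mean residence time
    mean_time = trapezoid([t * e for t, e in zip(times, e_curve)], times)
    # Second central moment gives the spread of the distribution
    variance = trapezoid(
        [(t - mean_time) ** 2 * e for t, e in zip(times, e_curve)], times
    )
    return e_curve, mean_time, variance


# Invented tracer peak integrals, one data point every 6 s (assumption)
times = [0, 6, 12, 18, 24, 30, 36]
signal = [0.0, 1.0, 3.0, 4.0, 3.0, 1.0, 0.0]
e_curve, mean_time, variance = residence_time_distribution(times, signal)
print(f"mean residence time: {mean_time:.1f} s, variance: {variance:.1f} s^2")
```

A step-change experiment would instead require differentiating the normalized response before taking moments.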
It should be noted, however, that the total time taken to acquire each data point is limited by the speed at which the computer is able to write and process the data files. This time is variable depending upon the processor and hard drive write speed. For example, in our case, the installation of a solid-state drive (SSD) to the computer was observed to significantly shorten overall analysis and processing times from >10 s (using standard hard disk drive) to ∼6 s per data point.
To set up a PLS model for this system, mixtures of known concentrations were pumped through the flow-through cell for analysis (see ESI†). 10 different concentration levels were used for each component, representing the expected spread of data under the reaction conditions (0.25 M initial SA concentration). Since a linear response between concentration and peak area is expected, this relatively small number of concentration levels is thought to be sufficient. For each of these concentration levels 100 spectra were obtained, using the aforementioned 2.0 s measurement period.
In order to handle such a large quantity of data, initial processing (Fourier transform and baseline correction) was performed on the stacked spectra in Mestrenova v11 (Mestrelab Research). The resulting data were used to generate a PLS model for the simultaneous prediction of all reaction components, using Simca v16 (Sartorius Stedim Biotech). To perform cross-validation (CV) the training set data was divided into subgroups by concentration level. The CV algorithm generates reduced data sets to get a performance indicator of model error, the root-mean-square error of cross validation (RMSECV). The initial model, using the full NMR spectrum, had RMSECV values >10 mM for all components (Table 1, entry 1), so further effort was made to improve the model.
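The grouping of cross-validation subsets by concentration level can be sketched as follows. This is a simplified stand-in, not the Simca workflow: it swaps the PLS model for a univariate least-squares calibration (one peak area against concentration) so that the leave-one-level-out logic and the RMSECV calculation stay visible. All data values and function names are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmsecv_by_level(areas, concs):
    """Leave-one-concentration-level-out RMSECV (same units as concs)."""
    squared_errors = []
    for level in sorted(set(concs)):
        # All spectra recorded at one concentration level are held out together
        train = [(a, c) for a, c in zip(areas, concs) if c != level]
        test = [(a, c) for a, c in zip(areas, concs) if c == level]
        slope, intercept = fit_line([a for a, _ in train], [c for _, c in train])
        squared_errors += [(slope * a + intercept - c) ** 2 for a, c in test]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Invented example: 4 concentration levels (mM), 3 replicate spectra each
concs = [0, 0, 0, 50, 50, 50, 100, 100, 100, 150, 150, 150]
areas = [0.1, -0.1, 0.0, 5.1, 4.9, 5.0, 10.1, 9.9, 10.0, 15.1, 14.9, 15.0]
print(f"RMSECV: {rmsecv_by_level(areas, concs):.2f} mM")
```

Holding out whole concentration levels, rather than random spectra, avoids scoring the model on replicates of mixtures it has already seen.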
Entry | Model | SA (mM) | 3-NSA (mM) | 5-NSA (mM) | DNSA (mM) |
---|---|---|---|---|---|
1 | Combined PLS model | 14.1 | 13.9 | 10.9 | 27.2 |
2 | Individual PLS model | 8.7 | 8.4 | 9.4 | 7.8 |
3 | Individual OPLS model | 3.8 | 3.3 | 3.4 | 2.4 |
4 | After alignment OPLS model | 3.5 | 2.9 | 3.2 | 2.1 |
5 | Final model | 3.6 | 2.8 | 3.1 | 2.1 |

a Number of components for each model: SA = 1 + 4 + 0; 3-NSA = 1 + 5 + 0; 5-NSA = 1 + 6 + 0; DNSA = 1 + 4 + 0. See ESI† for further details.
It was found that the error could be reduced by building individual models for each reaction component and excluding sections of the spectrum without useful spectroscopic information for that component (entry 2). Additional improvement was achieved by using an OPLS (orthogonal projections to latent structures) model, which uses orthogonal signal correction to explain additional covariance (entry 3).34 To further refine the models a MATLAB script was then used to interpolate and align peaks, in order to negate the influence of small shifts in the data (entry 4, see ESI† for details). For the final models the zero-concentration calibration values for 3-NSA, 5-NSA and DNSA were excluded. On the other hand, it was found that including the zero-concentration values for SA provided better prediction power for lower concentrations.
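The idea of interpolating and aligning spectra to remove small shifts can be sketched as below. This is not the MATLAB script used in the work (its details are in the ESI); it is a Python illustration that assumes alignment by maximizing the overlap with a reference spectrum over a grid of fractional shifts, with linear interpolation between points. The function names, the shift grid, and the example data are all hypothetical.

```python
import math


def shift_spectrum(spectrum, shift):
    """Shift a spectrum by a (possibly fractional) number of points using
    linear interpolation; points shifted in from outside are set to 0."""
    n = len(spectrum)
    shifted = []
    for i in range(n):
        pos = i - shift
        lo = math.floor(pos)
        frac = pos - lo
        left = spectrum[lo] if 0 <= lo < n else 0.0
        right = spectrum[lo + 1] if 0 <= lo + 1 < n else 0.0
        shifted.append((1 - frac) * left + frac * right)
    return shifted


def align_to_reference(spectrum, reference, max_shift=3, step=0.25):
    """Return (best_shift, aligned_spectrum) maximizing overlap with reference."""
    n_steps = int(max_shift / step)
    candidates = [k * step for k in range(-n_steps, n_steps + 1)]

    def overlap(shift):
        return sum(a * b for a, b in zip(shift_spectrum(spectrum, shift), reference))

    best = max(candidates, key=overlap)
    return best, shift_spectrum(spectrum, best)


# Invented example: the same peak displaced by two points
reference = [0, 0, 0, 1, 4, 1, 0, 0, 0, 0]
drifted = [0, 0, 0, 0, 0, 1, 4, 1, 0, 0]
best_shift, aligned = align_to_reference(drifted, reference)
print(best_shift, aligned)
```

In practice the alignment would probably be applied per spectral region, since different peaks can drift by different amounts.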
The predictive capability of these finalized OPLS models was then tested, using 4 validation sets of mixtures with different concentrations. These were within the range of the original calibration mixtures and were pumped through the flow cell at 1.1 mL min−1. The root-mean-square error of validation (RMSEV) was calculated for each mixture and was found to be sufficiently low for accurate predictions (SA < 4.8 mM, 3-NSA < 5.9 mM, 5-NSA < 3.7 mM, DNSA < 6.7 mM). The predicted vs. actual concentration plots for the finalized models, including validation sets, are shown in Fig. 3, with relevant statistics appended.
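For reference, the validation error can be written out explicitly. The expression below is the standard root-mean-square error, assumed here because the formula is not stated in the text; the symbols are introduced for illustration only, with $\hat{c}_i$ the concentration predicted from spectrum $i$, $c_i$ the known concentration of the validation mixture, and $n$ the number of spectra in the set.

```latex
\mathrm{RMSEV} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{c}_i - c_i\right)^{2}}
```

The RMSECV values in Table 1 follow the same form, with $\hat{c}_i$ taken from models built without the subgroup containing spectrum $i$.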
Additionally, to explore the robustness of the model, experiments were performed with different flow rates (1.7 mL min−1 and 2.4 mL min−1) and different acquisition times (3.2 s and 6.4 s). In general, the RMSE increased for faster flow rates due to signal broadening.17b However, the models still remained within a useable error range (validation RMSE < 5 mM) for most cases, even at the significantly higher flow rate of 2.4 mL min−1. The change of acquisition times did not have a significant impact on the RMSE, which demonstrates robustness (see ESI†). This implies that developed MVA models can be applied across a range of different NMR acquisition parameters, so the user is not limited to using the initial calibration parameters.
A three-factor full factorial DoE design was applied (using Modde 12.1 software, Sartorius Stedim Data Analytics AB), considering the input parameters: temperature, residence time and nitric acid equivalents. This design resulted in 8 different experimental conditions, with an additional 3 center points to determine reproducibility (Fig. 4a).
Fig. 4 a) Cube showing the planned DoE experimental points, including center points, full factorial points (original model) and face centered points (additional experiments). b) Experimental results of DoE experiments using NMR analysis with the described OPLS model to determine relative ratios of each reaction component. Each set of experimental conditions is numbered in grey. The NMR average for each experiment is denoted by a black line, and offline UPLC measurements (3 for each set of conditions) are appended as black-bordered diamonds. c) Response contour plots for each reaction component, showing results at a flow rate of 1.1 mL min−1. See ESI† for full DoE details, analysis and results.
Data was acquired by NMR continuously, throughout the experimental run, and grouped for each DoE level (Fig. 4b, mean result displayed as a black line). To account for gas bubbles or other inconsistent results, quartiles (Q) and interquartile ranges (IQR) were calculated, then data points which exceeded the lower or upper boundary (Q1 − 1.5 IQR or Q3 + 1.5 IQR) were identified as outliers and removed (see ESI†). In order to confirm the NMR results, three samples were taken for offline UPLC analysis (black-bordered diamonds). In most cases, the difference between UPLC and mean NMR result was minimal and the discrepancy never exceeded 5%. One area in which the MVA model was less successful, however, was determining low (<5%) levels of starting material (SA). Under several of the experimental conditions, complete consumption of SA was observed by UPLC, but the value determined by NMR was around 2%, presumably due to noise in the spectra.
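The outlier rule described above can be sketched in a few lines. The quartile convention is an assumption (the text does not say how Q1 and Q3 were computed), and the function name and data are hypothetical.

```python
import statistics


def remove_outliers(values):
    """Drop points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    # "inclusive" treats the data as the full population; this choice of
    # quartile method is an assumption, not taken from the original work.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if lower <= v <= upper]


# Invented readings (% of one component); 12.0 mimics a gas-bubble artifact
readings = [48.0, 49.0, 50.0, 50.0, 51.0, 52.0, 12.0]
cleaned = remove_outliers(readings)
print(cleaned, statistics.mean(cleaned))
```

Filtering within each DoE level, rather than over the whole run, keeps genuine changes between conditions from being flagged as outliers.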
The model constructed from these results showed that substantial curvature was present, implying the presence of squared term interactions. To verify these squared terms, axial points were added (6 additional points, for a total of 17 experiments – a face centered design, Fig. 4a). The initially observed curvature was confirmed, providing a satisfactory model. Furthermore, the model constructed using NMR data aligns very well with its equivalent based on the confirmatory UPLC data (see ESI† for details).
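The experiment counts quoted above (8 factorial points, 3 center points, 6 axial points, 17 in total) can be reproduced with a short sketch of a three-factor face-centered design in coded units. The design was actually generated in Modde; the factor names below are labels chosen for illustration, and the real level values are given in the ESI rather than here.

```python
from itertools import product

# Hypothetical labels for the three input parameters named in the text
FACTORS = ("temperature", "residence_time", "hno3_equiv")


def face_centered_design(n_center=3):
    """Coded (-1, 0, +1) points of a three-factor face-centered design."""
    # Corners of the cube: every combination of low/high levels
    factorial = [dict(zip(FACTORS, levels)) for levels in product((-1, 1), repeat=3)]
    # Replicated center points, used to estimate reproducibility
    center = [dict.fromkeys(FACTORS, 0) for _ in range(n_center)]
    # Face centers: one factor at low/high, the others at their midpoint
    axial = []
    for name in FACTORS:
        for level in (-1, 1):
            point = dict.fromkeys(FACTORS, 0)
            point[name] = level
            axial.append(point)
    return factorial, center, axial


factorial, center, axial = face_centered_design()
print(len(factorial), len(center), len(axial))
```

The axial points add a third level for each factor, which is what allows the squared terms to be estimated separately from one another.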
In this specific model reaction the selectivity towards the desired product 5-NSA can, unfortunately, not be significantly improved through tuning of the examined parameters (Fig. 4c). It can be observed that 5-NSA overreacts to form DNSA preferentially versus 3-NSA. Accordingly, operating at low temperature and nitric acid equivalents is preferable, whereby a selectivity of roughly 1:1 between the two regioisomers is achievable (1 equiv. HNO3, 0 °C, 1.1 mL min−1 combined flow rate). The DoE output suggests that flow rate (and, by association, mixing speed) has minimal impact upon the reaction performance. For improved selectivity, it could be argued that the procedure using acetic acid may be preferable. It is envisioned that the workflow described here will be applied in future cases for rapid optimization of reaction conversion and selectivity.
The validation experiment was carried out using fixed residence time (18.7 s) and nitric acid loading (1.6 equiv.), and a temperature set point ramped from 0 °C to 35 °C over 60 min (Fig. 5). Experimental outcomes predicted by the constructed DoE model were taken as a range (with an error defined by the Modde output). Excellent agreement was observed between the predicted and measured results for all four species, further corroborating the validity of the developed DoE model. Only a slight deviation from the prediction was measured for 5-NSA, which demonstrated less than the expected level of curvature.
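To give a sense of the data density in such a ramp, the sketch below maps acquisition times onto the temperature set point. It assumes a linear ramp and the approximate 6 s per data point mentioned earlier for the SSD-equipped computer; the function name is hypothetical, and the actual reactor temperature may lag behind the set point.

```python
def ramp_setpoint(t_s, start_c=0.0, end_c=35.0, duration_s=3600.0):
    """Linear set point (deg C) at time t_s, clamped to the ramp limits."""
    fraction = min(max(t_s / duration_s, 0.0), 1.0)
    return start_c + fraction * (end_c - start_c)


seconds_per_point = 6.0  # approximate overall time per data point (assumption)
n_points = int(3600 / seconds_per_point)  # data points recorded during the ramp
temperatures = [ramp_setpoint(i * seconds_per_point) for i in range(n_points + 1)]

step_c = temperatures[1] - temperatures[0]
print(f"{n_points} data points, about {step_c:.3f} deg C between points")
```

Under these assumptions each spectrum corresponds to a set-point change of well under 0.1 °C, so the measured profile is effectively continuous in temperature.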
It is important to note that the ramp time for this experiment could be significantly shortened, as subsequently demonstrated in a “ramp-down” experiment, which reduced the temperature from 35 °C to 0 °C over ∼15 min (see ESI†). When paired with this fast inline analytical method, it is conceivable that these experiments could be carried out in very short times. This implies that more complex arrays of dynamic experiments (e.g. for kinetic analysis) may be achieved in a rapid and cost-effective manner.
The rapid acquisition of data using NMR with MVA provides 3 main advantages when used for dynamic experiments, compared to examples using chromatographic analysis: 1) data is acquired in a truly inline fashion, rather than (at best) online analysis of aliquots taken from the process stream; 2) far more data analysis points can be captured, since most chromatographic methods (even fast UPLC with an isocratic gradient) generally require >2 minutes of processing time; 3) NMR analysis is far more cost effective, requiring no solvent consumption or (serviceable) moving parts.
To demonstrate the utility of this inline analysis in reaction optimization, a DoE model was constructed, with accurate agreement between inline NMR and offline UPLC analysis. Additional axial points confirmed the curvature proposed from the initial experimental points. Confirmation of the DoE model was then successfully achieved in a dynamic experiment by ramping the reactor temperature. Using this statistical method, the power of benchtop NMR can be fully realized in automated optimization, mechanistic experiments and process control for flow chemistry.
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d0re00048e
This journal is © The Royal Society of Chemistry 2020 |