Fatemeh Ahmadi abc, Mohammad Simchi d, James M. Perry b, Stephane Frenette b, Habib Benali ab, Jean-Paul Soucy be, Gassan Massarweh e and Steve C. C. Shih *abc
a Department of Electrical and Computer Engineering, Concordia University, 1455 de Maisonneuve Blvd. West, Montréal, Québec H3G 1M8, Canada. E-mail: steve.shih@concordia.ca; Tel: +1 (514) 848 2424 x7579
b PERFORM Centre, Concordia University, 7200 Sherbrooke Street West, Montréal, Québec H4B 1R6, Canada
c Centre for Applied Synthetic Biology, Concordia University, 7141 Sherbrooke Street West, Montréal, Québec H4B 1R6, Canada
d Department of Mechanical & Industrial Engineering, University of Toronto, 5 King's College Rd, Toronto, Ontario M5S 3G8, Canada
e McConnell Brain Imaging Centre, Montreal Neurological Institute, McGill University, 3801 University Street, Montréal, Québec H3A 2B4, Canada
First published on 23rd November 2022
Digital microfluidics (DMF) has the signatures of an ideal liquid handling platform – as shown through almost two decades of automated biological and chemical assays. However, in the current state of DMF, we are still limited by the number of parallel biological or chemical assays that can be performed on DMF. Here, we report a new approach that leverages design-of-experiment and numerical methodologies to accelerate experimental optimization on DMF. The integration of the one-factor-at-a-time (OFAT) experimental technique with machine learning algorithms provides a set of recommended optimal conditions without the need to perform a large set of experiments. We applied our approach towards optimizing the radiochemistry synthesis yield given the large number of variables that affect the yield. We believe that this work is the first to combine such techniques which can be readily applied to any other assays that contain many parameters and levels on DMF.
While there are many types of biological and chemical assays that can be implemented on a DMF device, the number of experiments that can be performed in parallel is limited. The main reason for the limitation is the number of electrodes that can be accommodated on a single DMF device. The low electrode density only allows for tens of reactions to be performed in parallel. There are physical solutions to increase the electrode density (and to increase reactions in parallel): vertical addressing techniques,26–29 inkjet-printing techniques,30 and three dimensional stacking of substrates;31 however, there continues to be limits on the number of reactions or conditions that can be handled on the platform due to unreliable droplet movement, μL-volume requirements, and evaporation or biofouling challenges.
One way to optimize screening conditions without physically running every experiment is to use machine learning. Machine learning can predict the performance of an assay from the input parameters that affect it and model the output response without costly fabrication and design iterations. To date, the pairing of microfluidics and machine learning has been used for optimizing droplet generator designs,32 tumor cell classification,33 and image-based cell sorting techniques.34 By using these methods together, the authors were able to perform a set of experiments on the device and develop models to predict the output.35,36 Microfluidic-based optimization studies can also benefit from design-of-experiments (DoE), including full factorial37,38 and fractional factorial39,40 designs, for efficient screening of a multiparameter assay. These demonstrations, though few, show the power of coupling microfluidics with either machine learning or DoE.
Here, we report a new method for assay optimization on DMF, centered on the combination of digital microfluidics and design-of-experiments to optimize the output yield. We employed a method called OFAT (one-factor-at-a-time) to determine the significant parameters that affect the output. This method starts with automating the base experiment followed by ‘n’-parameter variable experiments. The variable experiments contain the base level parameters except that a change is made to the level (increasing or decreasing) for one of the parameters. Through statistical analysis, we determine the most significant factors by comparing the variable with the base case. Next, we obtain a database of output values to develop a model to predict the optimal output using OFAT and machine learning techniques. The new method was used to implement a seven-parameter optimization of fluorination labeling for [18F]FDG radiotracers. The resulting model provided an ideal and a practical optimized protocol, which reduced the synthesis time and improved the synthesis yield, relative to those from previous work.7,22,25 More importantly, these results establish the first report (to our knowledge) to combine digital microfluidics and machine learning algorithms enabling applications that require optimization of screening conditions. We propose that the new methods represent a useful development for users interested in using digital microfluidics for their biological and chemical screening applications.
Fig. 1B shows the use of one-factor-at-a-time (OFAT), a systematic approach towards finding the optimal levels for ‘n’ parameters without having to perform a large set of experiments. The method starts with designing a base level for each parameter (values are either obtained from the literature or randomly generated). Next, for each parameter, we change the level of the parameter (increasing or decreasing the base value) while keeping the other parameters at the base level. Together with the base set, this creates n + 1 experiments to be automated on the DMF device. After obtaining the output from each variable case, we compare it with the base level output to determine the impact (significant or non-significant) of each individual parameter on the assay. The use of OFAT provides several advantages compared to the traditional trial-and-error search process. First, it significantly reduces the number of experiments that need to be performed on the device. For example, a 3³ full factorial experiment requires 27 experiments (and 108 with replicates41), but with OFAT, only 4 experiments (1 base case with 3 variable cases; 12 experiments with 3 biological replicates) are needed. Second, the parameters that affect the assay can be determined after performing OFAT. This eliminates experiments with insignificant parameters, which saves time and costs in relation to device fabrication and assay preparation. Third, the method is compatible with any biological or chemical assay that requires many parameters and levels. Although we applied it to one type of assay (see below), the same method can be applied to the synthesis of other materials42,43 and to the metabolic production of pharmaceutical compounds44 that require optimization of multiple parameters and levels for optimal output production.
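The run counts quoted above can be reproduced with a minimal sketch of an OFAT design generator. The parameter names and levels below are hypothetical placeholders, not values from this work, and `ofat_design` and `full_factorial_size` are illustrative helper names.

```python
def ofat_design(base, alternatives):
    """Return the base run plus one run per parameter with only that level changed."""
    runs = [dict(base)]
    for name, new_level in alternatives.items():
        run = dict(base)
        run[name] = new_level  # change exactly one parameter
        runs.append(run)
    return runs


def full_factorial_size(n_parameters, n_levels):
    """Number of runs needed to cover every combination of levels."""
    return n_levels ** n_parameters


# Hypothetical three-parameter assay (placeholder names and values).
base = {"temperature_C": 85, "time_min": 5, "concentration_mM": 50}
alternatives = {"temperature_C": 100, "time_min": 10, "concentration_mM": 25}

runs = ofat_design(base, alternatives)
print(len(runs))                  # 1 base + 3 variable runs
print(full_factorial_size(3, 3))  # 3 parameters at 3 levels
print(full_factorial_size(7, 3))  # 7 parameters at 3 levels
```

The OFAT run count grows linearly (n + 1) with the number of parameters, whereas the full factorial count grows exponentially.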
We evaluated the OFAT method by employing radiosynthesis of [18F]FDG7,22,25 on DMF using mannose triflate as the substrate and performed fluorination of [18F] to produce [18F]FDG (Fig. S1†). We used this assay to showcase our method given that there are seven parameters to be optimized (more than the typical assay performed on DMF) and there are data presented in the literature such that we can verify improvements in prediction accuracy and in the overall yield. The assay started with delivering the [18F] via a syringe system to the reaction center on the device. Next, the [18F] was activated through evaporation at 120 °C, followed by dispensing droplets containing mannose triflate and mixing with the evaporated [18F] at 85 °C. Deprotection of the intermediate product was performed by hydrolysis with sodium hydroxide at room temperature, producing [18F]FDG. We chose to produce [18F]FDG because this reaction has several different parameters, and we believe that the yield can be improved by using OFAT. Specifically, as shown by previous authors, the synthesis is affected by seven parameters.7,22,25 With each parameter broken into 3 possible levels, we would be required to automate 3⁷ = 2187 reactions (without replicates) to cover the full space of experiments – a nearly impossible task with the current DMF devices and fabrication methods. Using the OFAT approach, we can significantly reduce the number of experiments that are performed on the device and identify the significant parameters that affect the yield. The results comparing the traditional method with the OFAT method for optimizing synthesis are shown in Fig. 2.
Fig. 2 Comparison of the fluorination efficiency using traditional optimization vs. OFAT approaches. (A) The traditional approach changes the level of each parameter based on literature values. Each experiment (row) shows constant (red) and variable (blue) values. The constant values represent the parameter values kept at levels found from the literature and the variables represent the parameter values changing towards favorable conditions. For the full set of experiments performed, see Table S1.† (B) In OFAT, we use a base case (yellow) and create variable experiments by changing the level of only one variable (blue) while keeping all other levels at the base case. The optimal fluorination efficiency (blue) is obtained after 1 + 7 experiments. The t-values comparing the variable to the base case are shown in brackets beside the fluorination efficiency.
Fig. 2 shows the [18F]FDG fluorination efficiency for the different synthesis parameters using the traditional and OFAT methods. In the traditional approach, a base case was obtained from Keng et al.22 and Chen et al.25 and the parameters were changed following recommendations from previous work – lowering material concentrations, reducing synthesis time (higher temperatures, faster reaction), and optimizing the volume ratio between the carrier molecules and [18F]fluoride. For OFAT, eight experiments were created: the same base case as in the traditional approach plus seven variable experiments. Two key results were obtained from OFAT. First, we observed significant changes in the fluorination efficiency for the variable output when compared to the base level output. A t-test revealed significant differences (at 95% confidence level; P < 0.05) for parameters such as mannose triflate and NaOH concentration, radiolabeling temperature, and deprotection time, but not for radiolabeling time and deprotection temperature. Second, OFAT resulted in a higher fluorination efficiency of 70.82 ± 1.54%, whereas the best efficiency that we could obtain with the traditional approach after ten experiments was 65.92% (see Table S1† for the complete list of conditions). The OFAT improvement is likely related to starting with a careful choice of a base level.7,22,25 Regardless, the data presented in Fig. 2 demonstrate that OFAT can help identify the significant parameters, which is useful both for building a predictive model of the response as a function of the input variables (see below) and for converging towards the maximum output.
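The base-versus-variable comparison can be sketched as a two-sample t-test. The text does not state which variant of the t-test was used, so the sketch below assumes a pooled-variance (Student's) test with three replicates per case, giving 4 degrees of freedom and a two-tailed critical value of 2.776 at P = 0.05. The replicate values are made-up placeholders, not measured efficiencies.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical t value for alpha = 0.05 and 4 degrees of freedom
# (3 + 3 replicates, pooled variance). Assumption: 3 replicates per case.
T_CRIT_DF4 = 2.776


def pooled_t(base, variable):
    """Student's two-sample t statistic with pooled variance."""
    n1, n2 = len(base), len(variable)
    pooled_var = ((n1 - 1) * variance(base) + (n2 - 1) * variance(variable)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(variable) - mean(base)) / standard_error


# Placeholder fluorination efficiencies (%), three replicates each.
base_case = [50.0, 52.0, 51.0]
changed = [70.0, 71.5, 69.0]     # parameter change with a large effect
unchanged = [51.5, 50.0, 52.5]   # parameter change with no real effect

for label, data in (("changed", changed), ("unchanged", unchanged)):
    t_value = pooled_t(base_case, data)
    verdict = "significant" if abs(t_value) > T_CRIT_DF4 else "not significant"
    print(f"{label}: t = {t_value:.2f} ({verdict})")
```

A parameter whose |t| exceeds the critical value would be carried forward as significant; the others would be held at their base level.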
The RMSE values generated from the training data and test data for the linear and non-linear models are shown in Fig. 4A. A linear regression was fitted to the experimental output in the single input linear model (Fig. S2†), which resulted in a range of RMSE values between 13.89 ± 1.1 and 15.53 ± 1.3 for the training data. In addition, the single model was evaluated with the test data, and as expected, the model showed relatively higher errors than the training data. The single model also falls short on accurately predicting the fluorination efficiency (see Fig. S3†), which suggests that using a single parameter model approach is not the ideal case to predict the fluorination efficiency output. However, a multiple interaction model that considers all the variables for fluorination efficiency shows decreases in the RMSE for the training (11.47 ± 1.0) and test (11.71) datasets. In fact, using our model against a test dataset (i.e., unseen parameter conditions), the accuracy was higher compared to the best-performing single model (Fig. 4B). We also hypothesized that using a non-linear regression model (higher-order factors) would improve the accuracy; however, as shown, the non-linear model showed similar accuracy (RMSE for the training 11.54 ± 1.2 and test 13.26) compared to the single model, which is likely due to the limited number of experiments available in our database.52
From OFAT, we identified significant factors via the t-test for base and variable cases. Using the multiple linear regression model, we verified the significant factors from OFAT by estimating the model's coefficients. As shown in Fig. 4C, we ordered the most significant factors by increasing t-values, with the most substantial effects determined from OFAT at the top. Generally, the highest coefficients from our multiple linear model match the significant factors determined by OFAT. For deprotection temperature, we calculated a relatively higher coefficient than expected from the OFAT t-values. There are a number of potential causes for the difference: (1) only a limited number of experiments and levels (25–50 °C) were studied for this parameter and (2) training an accurate model requires a relatively large dataset (>1000s).32,53 Nevertheless, we expect the accuracy to improve if more replicates or levels are automated. In these cases, we can still leverage the OFAT (only using the significant parameters) and machine learning approach that we developed here.
Fig. 5 Optimal fluorination efficiency. (A) Comparing the fluorination efficiency for [18F]FDG radiosynthesis through OFAT and predictive modeling based on the significant parameters. Model conditions that were not practically possible to automate on our DMF device are highlighted in red. The optimal conditions obtained from OFAT (yellow) and the model (blue) are shown. (B) Graphical illustration comparing the total crude radiochemical yield achieved by OFAT assay optimization and comparing it with previous publications.7,22,25 The total crude radiochemical yield is determined by the product of the fluorination efficiency and the radioactivity recovery (94.1%; Table S3†).
Surprisingly, the conditions derived at 85% were very similar to the OFAT optimal conditions (the only difference being the mannose triflate concentration 30 vs. 80 mM). This improvement is likely related to starting with a base case that is already close to the optimal conditions. A comparison of our method and previous work is presented in Fig. 5B. As shown, with the improved DMF design and the developed screening (OFAT) and optimization (ML) methods, the overall synthesis yield from OFAT resulted in an improved yield of 66.64% and a further increase to 79.13% with ML modeling. The overall synthesis time (∼ 19 ± 2 min; n = 6) also exhibited better performance compared to previous work. These improvements are critical, as increasing overall yield and reducing synthesis time provide the capability to supply more doses available for patients at the end of the synthesis.54
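As a quick arithmetic check, the OFAT yield quoted above follows from the definition in the Fig. 5 caption (fluorination efficiency multiplied by radioactivity recovery), using the OFAT fluorination efficiency of 70.82% and the recovery of 94.1%. The function name `crude_rcy` is an illustrative choice.

```python
def crude_rcy(fluorination_efficiency_pct, recovery_pct):
    """Total crude radiochemical yield (%) as the product of the two percentages."""
    return fluorination_efficiency_pct * recovery_pct / 100.0


ofat_yield = crude_rcy(70.82, 94.1)
print(round(ofat_yield, 2))  # 66.64
```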
A final goal of the work was to demonstrate that our new method can be combined with other protocols and techniques to optimize the radiochemical yield. Optimization of the [18F]FDG yield efficiency is a continuous challenge due to the short half-life of [18F]. There have been multiple efforts towards developing automation systems,7,22,25,55 improving reagent delivery mechanisms,54–56 and integrating interfaces to couple with chromatography systems for purification57–59 to speed up the delivery of the tracer to the patient. We designed an on-chip purification scheme that purifies the sample immediately after synthesis. Discs with 2 or 6 mm diameter were created with alumina beads mixed with the PDMS elastomer prior to curing and directly placed on the DMF device (Fig. S6A†). After synthesis, the droplet with the incorporation of [18F]FDG was brought to the disc for incubation and moved around the adjacent electrodes to enhance the purification process of removing both the Kryptofix and unreacted [18F] and F−. After incubation, the droplet was actuated away from the disc and manually pipetted for downstream processing. Using this technique, we observed high purity (∼93.05 ± 2.46%) with 6 mm discs incubated for over 40 minutes (Fig. S6B†). All the final optimized results related to radioactive recovery, incorporation, and purity are shown in Table S3 in the ESI.† To our knowledge, this work is the first to integrate radiotracer synthesis with on-chip purification and to use OFAT optimization and ML modelling.
The devices were coated with 15 g of parylene-C (7 μm) as a dielectric layer in an SCS Labcoater 2 PDS 2010 (Specialty Coating Systems, Indianapolis, IN, USA) and then coated with Teflon™ AF 1600 (150 nm) in a Laurell spin coater (North Wales, PA, USA) set to 1500 rpm for 60 s with 300 rpm s⁻¹ acceleration followed by 10 min baking at 165 °C. For the electrical ground plate, ITO-coated glass slides were cut into 1′′ × 3′′ pieces, coated with Teflon™ AF 1600 by spin coating, and then post-baked as described above. The ITO plate was then placed on top of the substrate, separated by two pieces of double-sided tape (3M), resulting in an inter-plate gap of ∼140 μm.
For automated radiotracer synthesis, we used our automation system containing optocoupler switches, a thermoelectric device, and a reagent delivery system (see Fig. S7–S9† for temperature and reagent delivery control setups). Briefly, the thermoelectric device was controlled using an Arduino microcontroller (Arduino Uno, Italy) and a driver motor consisting of a two half-bridge driver chip and a low resistance N-channel MOSFET (Amazon, Mississauga, ON, Canada) following Perry et al.1,60 The reagent delivery system was built using a stepper motor that was also connected and controlled using an Arduino microcontroller.1 Droplet operation was controlled using our open-source software (see: https://bitbucket.org/shihmicrolab/f_ahmadi_2022) that was used to apply high-voltage potentials to a stack of optocoupler switches (the design of the stack was described elsewhere61).
To prepare for radiotracer synthesis on the device, a DMF device (only the bottom plate) was loaded onto a pogo-pin holder consisting of 104 pogo pins with a spacing of 3 mm, which delivered 104 individual voltage inputs to the contact pads of the device to apply a sine wave potential of 160 V RMS at 15 kHz between the top and bottom plates. The 14.8 × 14.8 mm² thermoelectric device was placed directly below the bottom plate to provide heating and cooling to 10 central electrodes on the device, which we set as the ‘reaction site’ (Fig. S1†). A syringe tube was fixed with tape on the pogo-pin holder and directed to the central electrode in the reaction site to deliver the radioisotope. After evaporation of [18F]fluoride (∼10 min), an ITO top plate was placed on top of the bottom substrate such that the edges of the ITO were aligned to the edges of the reservoir electrodes, and it was affixed with two pieces of double-sided tape. For purification experiments, discs were sandwiched between the top and bottom plates and aligned with a set of five electrodes, which we refer to as the ‘purification site’. Once the bottom plate was secured by the pogo-pin holder and top plate, reagent reservoirs were loaded by pipetting the reagents at the edge of the reservoir with the same applied voltage and frequency as above to draw the liquids into the reservoirs.
To create a predictive model based on the OFAT experiments, we generated a database of 55 reactions. Each reaction set contained different levels for the significant parameters and the base level for the non-significant factors. We performed all reactions on the chip to determine the observed FDG fluorination efficiency value (see Table S2†). As shown, all input parameters except for deprotection temperature and radiolabeling temperature have an imbalanced distribution in the dataset. Data points outside the most frequent conditions are considered ‘rare’ data. For example, data points with a NaOH concentration lower than 1 M are considered a ‘rare’ condition since only one experiment was performed with this condition. Experimental data in the training dataset with at least one input parameter at a rare level were replicated to improve the balance of the data distribution (a method used in other studies50,51 that have only a few experimental data points). These collected experimental data points (experimental conditions and their resulting FDG fluorination efficiency) were used to fit a linear and a non-linear regression model to study the effect of the reaction parameters on the FDG incorporation.
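The replication of rare conditions can be illustrated with a short sketch. The rarity threshold (`min_count`), the number of copies, the parameter name, and the example values are all assumptions made for illustration; the text does not specify how many times rare points were replicated.

```python
from collections import Counter


def replicate_rare(records, min_count=2, copies=2):
    """Duplicate records that contain at least one rarely used parameter level.

    records: list of (parameters_dict, measured_efficiency) pairs.
    A level is treated as rare if it appears fewer than min_count times.
    """
    inputs = [params for params, _ in records]
    counts = {name: Counter(p[name] for p in inputs) for name in inputs[0]}
    balanced = []
    for params, efficiency in records:
        rare = any(counts[name][value] < min_count for name, value in params.items())
        repeats = copies if rare else 1
        balanced.extend((dict(params), efficiency) for _ in range(repeats))
    return balanced


# Placeholder records: one experiment uses a NaOH level that appears only once.
records = [
    ({"naoh_M": 1.0}, 60.0),
    ({"naoh_M": 1.0}, 62.0),
    ({"naoh_M": 1.0}, 61.0),
    ({"naoh_M": 0.5}, 40.0),
]
balanced = replicate_rare(records)
print(len(balanced))  # 3 common records + 2 copies of the rare one
```

Replication of this kind should be applied only to the training set, so that duplicated points do not leak into the test set.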
The database was randomly divided into two sets: (1) a training set (80% of data) to generate the model and (2) a test set (20% of data) to evaluate the model's performance. Input data were standardized by subtracting the mean value and scaling to unit standard deviation using eqn (1):
$$ z = \frac{x - \mu}{\sigma} \qquad (1) $$
where x is the raw value of an input parameter, μ is the mean of that parameter, and σ is its standard deviation.
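A minimal sketch of the 80/20 split and the standardization step is given below. It assumes that the mean and standard deviation are estimated on the training set only and that the population standard deviation is used (as scikit-learn's `StandardScaler` does); neither detail is stated in the text. The 55 input values are placeholders.

```python
import random
from statistics import mean, pstdev


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(column):
    """Return the mean and population standard deviation of one input column."""
    return mean(column), pstdev(column)


def standardize(column, mu, sigma):
    """Apply z = (x - mu) / sigma to every value."""
    return [(x - mu) / sigma for x in column]


# Placeholder column with 55 entries (e.g. a temperature at three levels).
temperatures = [25.0 + 12.5 * (i % 3) for i in range(55)]

train, test = train_test_split(temperatures)
mu, sigma = fit_scaler(train)          # statistics from training data only
z_train = standardize(train, mu, sigma)
z_test = standardize(test, mu, sigma)  # reuse the training statistics
print(len(train), len(test))
```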
The model was trained using a 10-fold cross validation technique,64 which partitions the training set into 10 different subsets and iteratively trains the model on 9 subsets while the remaining set was used as the validation set. The training goal was to minimize squared error between the measured fluorination efficiency and the predicted value. The model was tested with the independent, random test group. The mean absolute error and root mean squared error were calculated to measure the performance of the models.
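The cross-validation loop and the two error metrics can be sketched as follows. The fold assignment (every k-th sample of an already shuffled training set) and the mean-value predictor used as a stand-in model are assumptions for illustration; the efficiency values are placeholders.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, val_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


# Placeholder efficiencies for a 44-sample training set (80% of 55).
y = [40.0 + (i * 7) % 30 for i in range(44)]

fold_errors = []
for train_idx, val_idx in k_fold_indices(len(y), k=10):
    # Stand-in model: predict the mean of the training folds.
    prediction = sum(y[j] for j in train_idx) / len(train_idx)
    y_val = [y[j] for j in val_idx]
    fold_errors.append(rmse(y_val, [prediction] * len(y_val)))

print(sum(fold_errors) / len(fold_errors))  # mean validation RMSE over 10 folds
```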
Seven linear models were trained where only one parameter was used as the input. In comparison, a multiple linear regression model was trained with all seven variables as the input. The linear models followed eqn (2):
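$$ \text{FDG fluorination efficiency} = \mathbf{w} \cdot \mathbf{x} + b \qquad (2) $$
where x is the vector of standardized input parameters (a single value for the one-parameter models), w is the vector of fitted coefficients, and b is the intercept.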
The reported coefficients for the final linear models were obtained by using the whole database including training and test sets (55 data points presented in Table S2†). To prevent overfitting, the linear models were obtained using Ridge regression of the scikit-learn library.65 This method performed L2 regularization and the optimum regularization factor was found using the grid search cv algorithm from the scikit-learn library.65
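To make the regularized fit concrete, the sketch below implements ridge regression in closed form and a small cross-validated grid search in plain Python. It is a stand-in for the scikit-learn `Ridge` and `GridSearchCV` calls named in the text, not a reproduction of them; the synthetic data, the candidate regularization values, and the 5-fold setting are assumptions.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_fit(X, y, alpha):
    """Minimise ||y - Xw - b||^2 + alpha * ||w||^2 (intercept b is not penalised)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    gram = [[sum(r[i] * r[j] for r in Xc) + (alpha if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(p)]
    w = solve(gram, rhs)
    b = y_mean - sum(wj * mj for wj, mj in zip(w, x_mean))
    return w, b


def predict(X, w, b):
    return [sum(wj * xj for wj, xj in zip(w, row)) + b for row in X]


def cv_mse(X, y, alpha, k=5):
    """Mean squared validation error over k folds for one alpha."""
    errors = []
    for i in range(k):
        val = [j for j in range(len(y)) if j % k == i]
        train = [j for j in range(len(y)) if j % k != i]
        w, b = ridge_fit([X[j] for j in train], [y[j] for j in train], alpha)
        preds = predict([X[j] for j in val], w, b)
        errors.extend((p - y[j]) ** 2 for p, j in zip(preds, val))
    return sum(errors) / len(errors)


def grid_search(X, y, alphas, k=5):
    """Return the alpha with the lowest cross-validated error."""
    return min(alphas, key=lambda a: cv_mse(X, y, a, k))


# Synthetic, noise-free data: y = 2*x1 - 1*x2 + 5 (placeholder, not from Table S2).
X = [[float(i % 4), float((i * i) % 5)] for i in range(20)]
y = [2.0 * x1 - 1.0 * x2 + 5.0 for x1, x2 in X]

best_alpha = grid_search(X, y, [0.0, 0.1, 1.0, 10.0])
w, b = ridge_fit(X, y, best_alpha)
print(best_alpha, w, b)
```

On real, noisy data with few points, a non-zero regularization factor would typically be selected; the noise-free example here only checks that the fit recovers known coefficients.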
A multi-layer feed-forward neural network was implemented using the Keras library (https://github.com/fchollet/keras)66 for non-linear regression analysis. The artificial neural network consisted of an input layer of 7 neurons for the 7 experimental parameters and an output layer with a single node to represent the predicted FDG incorporation yield. A dense hidden layer of 4 neurons with a rectified linear unit (ReLU) activation function was used to add non-linearity to the regression model. Also, in the hidden layer, L2 regularization was used to prevent overfitting of the model to the training dataset.47 In L2 regularization, the squared magnitude of the layer weights is added to the loss function as a penalty term. In this way, it forces the weights to be small to prevent overfitting.47 The regularization factor and the learning rate of the Adam optimizer were optimized using the grid search cv algorithm from the scikit-learn library.65 The optimized values were 0.5 and 1 for the regularization factor and learning rate, respectively. 10-fold cross-validation (scikit-learn library) was used to measure the model performance.65 The final model (trained with all the training data) was further tested with the separate test data. The performance of the models was reported in terms of the root mean squared error as shown above.
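The 7-4-1 architecture and the L2-penalized loss described above can be sketched in plain Python. This is an illustration of the forward pass and loss only, not the Keras implementation: training with the Adam optimizer is omitted, the weight initialization range and the input values are arbitrary assumptions, and the mean squared error is assumed as the data term of the loss.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def init_network(n_in=7, n_hidden=4, seed=0):
    """Random small weights for a 7-4-1 network (initialisation is arbitrary)."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Hidden ReLU layer followed by a single linear output node."""
    hidden = relu([sum(w * xi for w, xi in zip(row, x)) + b
                   for row, b in zip(net["W1"], net["b1"])])
    return sum(w * h for w, h in zip(net["W2"], hidden)) + net["b2"]


def l2_penalty(net, factor):
    """factor * sum of squared hidden-layer weights (biases are not penalised)."""
    return factor * sum(w * w for row in net["W1"] for w in row)


def loss(net, X, y, factor):
    """Mean squared error plus the L2 penalty on the hidden layer."""
    mse = sum((forward(net, x) - t) ** 2 for x, t in zip(X, y)) / len(y)
    return mse + l2_penalty(net, factor)


def n_parameters(net):
    return sum(len(row) for row in net["W1"]) + len(net["b1"]) + len(net["W2"]) + 1


net = init_network()
X = [[0.1 * (i + j) for j in range(7)] for i in range(5)]  # placeholder inputs
y = [60.0, 62.0, 65.0, 70.0, 68.0]                         # placeholder outputs

print(n_parameters(net))          # 7*4 + 4 + 4 + 1 trainable parameters
print(loss(net, X, y, factor=0.5))
```

With only 37 trainable parameters the network is small, which is consistent with the limited size of the dataset.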
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2lc00764a
This journal is © The Royal Society of Chemistry 2023