Albert Chu,a Du Nguyen,a Sachin S. Talathi,a Aaron C. Wilson,ab Congwang Ye,a William L. Smith,a Alan D. Kaplan,a Eric B. Duoss,a Joshua K. Stolaroff a and Brian Giera*a
aLawrence Livermore National Laboratory, Livermore, California 94550, USA. E-mail: giera1@llnl.gov; Tel: +925 422 2518
bGoogle, Inc., Mountain View, California 94043, USA
First published on 15th April 2019
Microfluidic-based microencapsulation requires significant oversight to prevent material and quality loss due to sporadic disruptions in fluid flow that routinely arise. State-of-the-art microcapsule production is laborious and relies on experts to monitor the process, e.g. through a microscope. Unnoticed defects diminish the quality of collected material and/or may cause irreversible clogging. To address these issues, we developed an automated monitoring and sorting system that operates on consumer-grade hardware in real-time. Using human-labeled microscope images acquired during typical operation, we train a convolutional neural network that assesses microencapsulation. Based on output from the machine learning algorithm, an integrated valving system collects desirable microcapsules or diverts waste material accordingly. Although the system notifies operators to make necessary adjustments to restore microencapsulation, we can extend the system to automate corrections. Since microfluidic-based production platforms customarily collect image and sensor data, machine learning can help to scale up and improve microfluidic techniques beyond microencapsulation.
Microfluidic systems, such as inkjet print heads,16 DNA microarrays,17 biosensors,18 and lab-on-a-chip devices,19 rely on sub-millimeter plumbing components to precisely manipulate fluids. Due to the low Reynolds flow characteristics within microfluidic channels, these systems are often amenable to physics-based models that inform suitable geometric designs and/or operating conditions for a wide variety of microfluidic applications.20 However, the small scale of components leads to an inherent sensitivity of the system to perturbations. For instance, unexpected clogs, air bubbles, chemical impurities, particulates, pressure fluctuations in external pumps, fluid property changes, among other issues, can lead to disruptions in normal operation. These disruptions result in time and material loss, reduce production or detection quality, and, depending on the severity, can damage the microfluidic device. Microfluidic systems can be scaled up for industrial use by parallelization,21–23 but how to monitor and manage upsets in a massively parallel array is a central challenge of scale up. Laboratory-scale systems typically require monitoring and intervention by a human operator, a process which is time consuming and unlikely to scale.
Machine learning offers a route to enable automated monitoring of microfluidic systems by converting routinely collected sensor and image data into actionable information in real-time.24 An appealing characteristic of machine learning algorithms is that they dispense with the need for precisely modeling the environment. Instead they leverage direct experience gathered from real executions or “learn by example” in order to derive meaning from (possibly) complicated or imprecise data.25 Depending on the assessment output by the algorithm, the algorithm can then trigger pre-programmed adjustment(s) to modulate the operation of a given system without requiring human intervention. Furthermore, algorithms are agnostic to the data streams and/or the apparatus they monitor. Although re-training may be necessary, the coding framework and implementation strategies of these algorithms are often general to any device, sensor modalities, and target application. Thus, these algorithms could be deployed on a wide variety of microfluidic platforms to increase operational efficiency and address challenges inherent to scale up via automated error detection and rectification.
In this work, we demonstrate a machine learning based control system for microfluidic microencapsulation, which is a common technique that traditionally requires human oversight to produce microcapsules. We develop a convolutional neural network machine learning algorithm that assesses the state of the system via real-time microencapsulation images. Using assessments from the detection algorithm as input, a separate control algorithm triggers valves that sort “good” and “bad” microencapsulation events. The control algorithm accounts for the time between image capture and capsule sorting, while mitigating disruptions the sorting valves may introduce to the flow field at the site of microencapsulation. We describe the machine learning model development in three straightforward steps: data collection, data labeling, and neural network training. We assess the integrated detection and control algorithms in operation and demonstrate high-quality microcapsules after on-the-fly sorting. Although we demonstrate this approach using image detection of an assembly for microcapsule production, due to the inherent generalities of neural networks26 and the diverse sensors common to microfluidic setups, we envision our approach should benefit other microfluidic systems given sufficient training data.
During typical operation, the system may transition to non-dripping regimes in which droplet formation is inconsistent or does not occur at all. Non-idealities may arise from a variety of (sometimes combined) effects, including clogs, bubbles, pressure fluctuations, viscosity changes of the photo-curable middle and/or other fluids, or device irregularities such as acentricity of the nozzles or poor capillary alignment. For instance, at higher fluid flow rates characterized by the “jetting” regime, inertial forces exceed surface tension between the dispersed phase and outer carrier fluid, causing aspherical, polydisperse, and/or multi-core droplets to form farther into the outlet capillary, leading to suboptimal microcapsules and potential clogging. When the middle fluid poorly envelops the inner fluid, droplets may rupture or break. “Rupturing” events are uncommon and often self-correcting. Finally, the most severe failure is the “wetting” regime, in which the middle fluid wets the exit channel, making droplet formation unstable. Since wetting does not readily self-correct, significant material and time losses occur until the operator intervenes. Furthermore, if the middle fluid excessively fills the exit channel near the UV light source, clogs may destroy the device, thereby disrupting the entire production campaign.
In order to observe microencapsulation, an operator typically positions the microfluidic device under a microscope and manually focuses the field of view at the double-capillary junction. From this vantage, the state of the microencapsulation is readily apparent to a trained operator. Thus, microencapsulation devices require an operator to notice any deviations from desired behavior that may arise. Depending on the observed non-dripping state, an operator may attempt to adjust fluid flow rates to restore the device to dripping. Until dripping is restored, ideal and non-ideal microencapsulation products accumulate in a container downstream unless manually directed into collection and waste containers. Due to phase separation, uncollected non-ideal material can form a continuous mass instead of discrete microcapsules, ruining the entire production batch.
To move beyond limitations of existing microencapsulation devices, we automate (1) image collection, (2) microencapsulation classification, and (3) valve triggering to sort dripping from non-dripping events, as detailed in Fig. 1. A commercial-grade computer with a single NVIDIA TITAN X GPU executes these three functions at ∼40 Hz. A trained convolutional neural network (CNN)30 takes individual images as input and predicts the state of microencapsulation according to one of four possible classifications: dripping, jetting, wetting, or rupturing. Based on a series of predictions from the CNN, a separate control algorithm triggers a valving system that sorts acceptable from defective microencapsulation events. Microcapsules sent to the collection line pass under a UV lamp and photo-cure before accumulating in a collection jar. Material in the rejection line avoids UV light (so as to prevent clogging) and accumulates in a separate rejection jar. This two-part system relies on an image classifier and a valve controller working in concert to assess and sort microencapsulation events, respectively. We coded all software using open-source NumPy, OpenCV, pyduino, and ToupCam DCM1.3 Python libraries. The total cost of hardware external to the microencapsulation system is affordable within most research budgets.
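The capture-classify-actuate loop above can be sketched in a few lines. This is a minimal illustration, not the authors' code: `classify` stands in for the trained CNN and `set_valves` for the pinch-valve driver, and both names are hypothetical.

```python
# Hypothetical sketch of the ~40 Hz monitor-and-sort loop: pull a frame,
# predict its class, and route material to collection or rejection.
CLASSES = ("dripping", "jetting", "wetting", "rupturing")

def run_loop(frames, classify, set_valves):
    """For each frame: predict the class, then actuate the sorting valves."""
    states = []
    for frame in frames:
        label = classify(frame)  # CNN prediction for this frame
        valve = "collect" if label == "dripping" else "reject"
        set_valves(valve)        # drive the collection/rejection valves
        states.append((label, valve))
    return states
```

In the real system the valve decision is smoothed over time rather than made per frame, as discussed later in the text.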
In practice, an operator manually places the microfluidic device onto the microscope and adjusts the focus to observe microencapsulation. To ensure the CNN is sufficiently robust to images taken during routine operation, we perform standard data augmentation techniques to simulate anticipated variations in image sharpness and orientation using our original image set. By applying affine transformations in different combinations of shift, rotate, and blur via Gaussian filters – all while preserving image labels – we expand our image set by a factor of 91, from ∼74,000 to ∼6,700,000 samples. Experts used full 680 × 720 square-pixel resolution color images during the labeling process. The computational requirements for training typically shrink when images are represented by a square matrix of grayscale intensity values, as opposed to tuples of red, green, and blue, i.e. (R, G, B), intensity values that triple the file size. Further computational gains are possible by reducing image resolution, provided the compression does not render the classes indecipherable. Our experts could reliably distinguish the classes using grayscale images at resolutions of 32 × 32 square-pixels or higher. Thus, while developing the CNN architecture (see ESI†), we trained on downsized 32 × 32 square-pixel grayscale images, which considerably reduced training time without any loss in performance. During real-time operation, the computer pulls, pre-processes, and predicts the class of each image before executing the valve actuation routines.
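The grayscale conversion, shift-style augmentation, and 32 × 32 downsampling can be sketched with NumPy alone. This is an illustrative sketch under assumed parameter choices (wrap-around shifting stands in for whatever padding the authors used), not their pipeline.

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an (H, W, 3) RGB image to (H, W) mean-intensity values."""
    return rgb.mean(axis=2)

def shift(img, dy, dx):
    """Shift an image by whole pixels (wrap-around stands in for padding)."""
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

def downsample(img, size=32):
    """Block-mean an (H, W) grayscale image down to (size, size)."""
    h, w = img.shape
    img = img[:h - h % size, :w - w % size]  # crop to a multiple of size
    return img.reshape(size, img.shape[0] // size,
                       size, img.shape[1] // size).mean(axis=(1, 3))
```

Gaussian blurring and rotation would be added with OpenCV (`cv2.GaussianBlur`, `cv2.warpAffine`) in the full augmentation pipeline.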
Using our assembled labeled set of images, we train our CNN to classify images according to the four classes described above. In the process of training our deep CNN, we iteratively compute a cost function that essentially tallies all incorrectly classified images during training. The goal is to minimize this cost function by systematically varying all (in our case) 80,000 model weights via batch gradient descent. Considering non-dripping images comprise only ∼20% of the training set, a trivial image classification algorithm that always predicts dripping is guaranteed 80% accuracy. To compensate for the imbalance of class representation in our training set, we add to the standard cross-entropy cost function31 the following class weighting scheme to compute the cost for each class,
wc = N/(K·Nc) | (1) |
where N is the total number of training images, K = 4 is the number of classes, and Nc is the number of training images labeled as class c, so that underrepresented classes incur proportionally larger penalties.
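A class-weighted cross-entropy of this kind is straightforward to implement. The sketch below uses an inverse-frequency weighting, one plausible scheme consistent with the description above; it is illustrative, not the authors' exact cost function.

```python
import numpy as np

def class_weights(labels, n_classes=4):
    """w_c = N / (K * N_c): rare classes get proportionally larger weight."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    return len(labels) / (n_classes * counts)

def weighted_cross_entropy(probs, labels, weights):
    """Mean of -w_c * log p(correct class) over a batch of predictions."""
    p = probs[np.arange(len(labels)), labels]  # predicted prob. of true class
    return float(np.mean(-weights[labels] * np.log(p)))
```

With an 80/20 dripping/non-dripping split, a misclassified non-dripping image costs roughly four times as much as a misclassified dripping image under this scheme.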
Fig. 2 shows how the training set size influences both training time and the following model performance metrics: precision (or positive predictive value), recall (or sensitivity), and F1 score (the harmonic mean of precision and recall). Both model performance and computational time generally increase with the training set size. The non-monotonic form results from randomly culling the overall training set at each iteration. We could perform multiple training sessions at the smaller training sizes to smooth the curves and determine error bounds; however, we restrict our attention to the best performing models for this application. As expected when training data is limited, Fig. 2 demonstrates that training with larger data sets increases CNN model performance, in addition to the augmentation's originally intended purpose of making the model robust to device rotation, shifting, and image focus. Model performance begins to saturate above 90%. Nevertheless, we obtain an F1 score of 95.5% by training on all available data from the augmented dataset. Although the trend in Fig. 2 indicates we could continue collecting and labeling data to achieve better performance, it is evident this laborious process will yield diminishing returns with this CNN model architecture. As shown in the ESI,† other model configurations that use different model structures and/or input image resolutions can elevate prediction accuracy. However, deeper CNNs incur additional computational cost during training and operation and may require larger datasets to prevent overfitting. Although performance requirements vary by application,3,33–37 Fig. 2 is useful when estimating model performance relative to the effort of acquiring additional labeled data.
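For reference, the metrics reported in Fig. 2 can be computed per class and averaged as sketched below; this is a generic illustration of the definitions, not the authors' evaluation code.

```python
import numpy as np

def f1_score(y_true, y_pred, n_classes=4):
    """Class-averaged F1: harmonic mean of per-class precision and recall."""
    f1s = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))  # true positives
        fp = np.sum((y_pred == c) & (y_true != c))  # false positives
        fn = np.sum((y_pred != c) & (y_true == c))  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return float(np.mean(f1s))
```

Averaging per class (rather than pooling all predictions) keeps the rare non-dripping classes from being swamped by the dominant dripping class.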
For our sorting system, we select the highest performing CNN, which requires the largest training set and longest training time. We present a confusion matrix in Fig. 3 to visualize the per-class accuracy and limitations of this CNN using a series of example input images. The row, Ri, and column, Cj, pertain to one of four respective algorithm-predicted or human-labeled classes, arranged in order of increasing prevalence in the training set. The accompanying percentages, P(Ri, Cj), give the frequency with which the CNN predicts an image with ground truth class Cj to be class Ri, so each column sums to 100%. The class-averaged accuracy computed from the diagonal entries is high compared to a trivial model that only predicts the single most prevalent class in the training set. Furthermore, the prediction accuracy is lowest for the least-populated class (left column, rupturing) and increases for more commonly occurring microencapsulation events. This is due to class imbalance inherent to capillary-based microencapsulation and the resulting collected and labeled image set. Despite the large disparity in the quantity of dripping versus non-dripping images, the average misclassification frequency is low because of the imbalance correction. We suspect the most common misclassification, e.g. the 7.79% likelihood of confusing rupturing capsules with dripping, results from two concordant causes: extreme class imbalance (80% versus 2% representation) and apparent visual similarity between these classes. Images labeled as rupturing often contain a single ruptured capsule accompanied by as many as three dripping capsules. Since rupturing-labeled images are both infrequent and contain features common to dripping-labeled images, we hypothesize the distinction between these classes is difficult to tease out during training. Indeed, trained experts spent the most time distinguishing these classes from each other during the labeling process.
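A column-normalized confusion matrix of the kind shown in Fig. 3 can be built as follows; this is a generic sketch of the construction, not the paper's plotting code.

```python
import numpy as np

def confusion_percent(y_true, y_pred, n_classes=4):
    """Entry (i, j): percentage of ground-truth-j images predicted as i.

    Columns are normalized to sum to 100%, matching P(Ri, Cj) above.
    Assumes every class appears at least once in y_true.
    """
    m = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        m[p, t] += 1  # row = predicted class, column = labeled class
    return 100.0 * m / m.sum(axis=0, keepdims=True)
```

The class-averaged accuracy is then the mean of the diagonal entries, which weights rare classes such as rupturing equally with the dominant dripping class.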
The effectiveness of our sorting system depends on the interplay between the image classification algorithm and the valve control logic. For instance, even with a perfect microencapsulation classifier, undesirable sorting could result from the constraints we impose on the valve responses. To mitigate disruptive pressure fluctuations induced by our pinch valves, our system collects material only when dripping persists for ≥0.75τ* and rejects material only when non-dripping persists for ≥0.50τ*, where τ* = 2 seconds as described in Methods. Thus, the valves may falsely reject (or collect) material during continuous sub-second fluctuations between non-dripping and dripping. In practice, this constraint is not limiting because our capillary-based microencapsulation system has never exhibited this behavior over any appreciable duration of time. Nevertheless, other valving mechanisms38 and microfluidic devices might enable faster switching and/or require different control settings based on the allowable τ*.
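One way to realize such a controller is to track classifications over a sliding window of roughly τ* and switch valves only when the dripping fraction crosses the stated thresholds. The sketch below is an assumed implementation of that idea (window bookkeeping, default state, and the 80-frame window at ~40 Hz are all our assumptions), not the authors' controller.

```python
from collections import deque

class ValveController:
    """Switch to collect when >= 75% of the recent window is dripping,
    to reject when >= 50% is non-dripping; otherwise hold the state."""

    def __init__(self, window=80, state="reject"):  # ~2 s of frames at ~40 Hz
        self.history = deque(maxlen=window)
        self.state = state

    def update(self, label):
        self.history.append(label == "dripping")
        frac = sum(self.history) / len(self.history)
        if frac >= 0.75:
            self.state = "collect"
        elif 1.0 - frac >= 0.50:
            self.state = "reject"
        return self.state
```

Because the thresholds overlap asymmetrically, brief misclassifications cannot toggle the valves, which is what suppresses sub-second chatter.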
Irrespective of the microencapsulation detection method and/or valving systems used, optimal performance is difficult to intuit and assess before or during actual production runs. For this reason, we present in Fig. 4 a general simulation-based approach to evaluate sorting that relies on sequence(s) of human-labeled images. We curate a list of 4000 randomly-chosen microencapsulation images from the augmented image set to simulate ∼100 seconds of operation (see ESI†). The curated sequence of events contains uninterrupted non-dripping and dripping events, with periods of moderate and high frequency fluctuations. We feed this sequence to all the trained image detection algorithms in Fig. 2 that vary by training set size and F1 score, then trigger valves based on the predicted state of the system. We identify the algorithm-predicted class during false collection of non-dripping events or false rejection of dripping events. Fig. 4 illustrates how the quality of the collection and rejection containers improves with diminishing false sorting events. An ideal collection container will include only microcapsules and carrier fluid, without any inner fluid in the continuous phase. Conversely, the ideal rejection container eventually contains a mixture of the inner and outer phase with dispersed droplets of pure shell material, without any microcapsules present.
Using the 15 differently-trained CNNs in Fig. 2, we evaluate sorting error fractions, e.g. false collections over all collectable events, etc., as a function of F1 score and training set size in Fig. 4(b) and (c), respectively. Although the trend is non-monotonic, in general the fraction of false sorting events decreases with increasing training set size and F1 score. In all cases, false rejection rates are low because the system collects dripping events, which is the largest class in the training set. The operational model with the largest training set and F1 score exhibits the lowest fraction of both sorting error types. Additionally, systems that utilize more accurate CNNs signaled the operators more quickly, as shown in the ESI.† Training the CNN on only 1000 images produces a trivial detection algorithm that always predicts dripping. Consequently, the valving system always collects non-dripping microencapsulation events (false collection fraction ≡1) and never rejects dripping events (false rejection fraction ≡0). Notably, the quantity of sorting errors is considerably lower than the number of anticipated misclassifications based on estimations from F1 scores. Therefore, our valve triggering strategy, which mitigates disruptions to microencapsulation, has the added benefit of minimizing sorting errors. In other words, the confusion matrix analysis in Fig. 3 would be more useful for estimating the sorting accuracy of a more responsive valving system, e.g. one that could switch according to every classified microencapsulation event. Fig. 4 isolates and evaluates sorting errors that stem from image detection accuracy alone. However, it is straightforward to repeat simulations with different model architectures, image sequences, valve controller constraints, etc. to help estimate sorting accuracy before performing experiments during production.
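The simulation loop behind this analysis amounts to replaying a labeled sequence through a classifier and valve controller while tallying false sorts. The sketch below is illustrative; the `InstantController`, which switches on every classification, corresponds to the maximally responsive valving hypothetical mentioned above, and any object with an `update(label)` method could be substituted.

```python
class InstantController:
    """Idealized controller that switches valves on every classification."""
    def update(self, label):
        return "collect" if label == "dripping" else "reject"

def sorting_errors(true_labels, predicted_labels, controller):
    """Count false collections (non-dripping collected) and false
    rejections (dripping rejected) over a replayed labeled sequence."""
    false_collect = false_reject = 0
    for truth, pred in zip(true_labels, predicted_labels):
        state = controller.update(pred)
        if state == "collect" and truth != "dripping":
            false_collect += 1
        elif state == "reject" and truth == "dripping":
            false_reject += 1
    return false_collect, false_reject
```

Replacing `InstantController` with a windowed controller shows how the dwell-time constraints absorb isolated misclassifications, consistent with the lower-than-expected sorting error counts reported above.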
Having selected our CNN image detection algorithm, we implement and test our sorting system in an actual microencapsulation production run using a new microfluidic device that was not used to collect training data. As with the training data collection, during the test an operator adjusts the relative volumetric flow rates of the core:shell fluids to produce both “good” and “bad” events that the overall system detects and sorts. Fig. 5 compares material that accumulated in the collection and rejection containers. “Rupturing” and “wetting” events were the dominant non-dripping events, causing unencapsulated inner fluid to concentrate in the outer continuous phase. As a result, the structures primarily present in the rejection container are single emulsions of the silicone shell phase dispersed in a mixture of the inner and outer fluids. Due to the inclusion of a pH-indicating dye in the core phase, an apparent shift from blue to yellow provides qualitative visual confirmation of the mixing of inner and outer fluids. In contrast to the rejection container, pristine monodisperse microcapsules are found in the collection container and the fully encapsulated inner fluid retains the initial blue color, indicating proper microencapsulation.
In Fig. 5, we also compare the particles sorted into the collection and rejection containers through image analysis. Using ImageJ,39 we measure particle diameters, d, and compute the polydispersity index (PDI) as the squared ratio of the standard deviation of the diameter to the average diameter, PDI = (σd/d̄)2. The PDI measurement is more traditionally appropriate for the collection container than the rejection container: the rejection container held a variety of particle morphologies, including single emulsions, double emulsions, and multiple-core emulsions. Furthermore, larger aggregates of silicone (<2 mm) could also be found, which were difficult to include in the PDI measurement. However, the PDI of the particles that could be measured in the rejection container still shows the system's sorting capabilities. Collected particles have a diameter of 429.5 ± 11.5 μm, i.e. PDI = 0.001, whereas rejected particles have a diameter of 37.7 ± 30.6 μm, i.e. PDI = 0.658. The collected particles' PDI indicates monodispersity and highlights the system's ability to sort the “good” events from the “bad” events.
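The PDI defined above is a one-line computation on a list of measured diameters; the helper below is a straightforward sketch of that definition.

```python
import numpy as np

def pdi(diameters):
    """Polydispersity index: (sigma_d / mean_d)^2 over measured diameters."""
    d = np.asarray(diameters, dtype=float)
    return float((d.std() / d.mean()) ** 2)
```

For the collected particles, (11.5/429.5)^2 ≈ 0.001, and for the rejected material (30.6/37.7)^2 ≈ 0.66, matching the reported values.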
Considering the low hardware costs of our system and the generalizability inherent to machine learning, we expect our system can benefit other microfluidic systems beyond microencapsulation. Furthermore, the results here point the way toward massively parallel microfluidic arrays41 with active monitoring of individual channels, giving control over overall product quality. Considering the small 32 × 32 size of the images input to our classifier, a high resolution image of an entire parallelized microfluidic array could be subdivided into smaller images of individual droplet generation sites in preparation for a machine learning algorithm. In this way, automated process monitoring via machine learning may solve one of the central challenges to scaling up many promising microfluidic systems. However, automated process control, which triggers corrective action(s) based on real-time assessments, is increasingly complex in cases where the performance of a given droplet generator is confounded with others in the array.
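Subdividing one array-scale image into per-site tiles is a simple reshape operation, sketched below. This assumes droplet generation sites lie on a regular grid aligned with the image; a real device would need registration and cropping first.

```python
import numpy as np

def tile_image(img, tile=32):
    """Split an (H, W) grayscale image into (N, tile, tile) tiles,
    row-major, cropping any remainder at the right/bottom edges."""
    h, w = img.shape
    img = img[:h - h % tile, :w - w % tile]
    tiles = img.reshape(img.shape[0] // tile, tile,
                        img.shape[1] // tile, tile).swapaxes(1, 2)
    return tiles.reshape(-1, tile, tile)
```

Each tile could then be fed to the classifier independently, so one camera frame yields a per-channel assessment of the whole array.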
This work provides a general overview of how supervised machine learning approaches might operate for related microfluidic systems and applications. Irrespective of the type(s) of data collected by a microfluidic sensor, supervised machine learning may offer a route for real-time processing in cases where it is feasible to label the data. Generally, increasing model accuracy requires suitably large and informative datasets at increasing computational expense. We strongly emphasize that the “garbage in, garbage out” principle42 still applies to machine learning. Notwithstanding, the microfluidics field43 routinely produces rich and vibrant visual and/or other high signal-to-noise sensor data44 to infer the state of a given microfluidic system. Indeed, human-discernable images arguably represent the gold standard for assessments in microfluidic devices for commercialized point-of-care diagnostics like pregnancy45 and diabetes tests,19 visual chemotaxis assays,46 high throughput cell characterization,47 or miniaturized artistic media.48 Semi- or unsupervised machine learning may be necessary for other microfluidic applications where it is not possible to label all (or any) data. In such cases, transfer learning49 may help, in which a previously-trained model is used as a starting point to accelerate (and possibly improve) the training of the same model on a separate, new set of training data. The degree to which transfer learning will accelerate training and improve ultimate model performance depends on a variety of factors,50 including (at least) the similarity between the sparse and original training datasets, the size and quality of the training datasets, model architecture, and so on.
Complementary physics- and data-driven approaches can help to overcome limitations stemming from the variability inherent to our (and many other) microfluidic systems. At present, a deep investigation into a given machine learning model does not reveal the same physical insights that stem from conventional modeling with constitutive equations and conservation laws.20 However, machine learning can assist where physical insights fail to yield practical solutions. For example, it is well understood, theoretically and practically, how clogs disrupt flow, but real-time machine learning based monitoring can mitigate production losses caused by such spontaneous events. This by no means downplays the relevance of traditional models and simulation, especially since they are crucial to revealing operating conditions and flow restoration strategies of many microfluidic platforms. Indeed, we initialize our system using predictions from idealized models, but, since our devices are handmade and subject to variability, each requires slightly different process conditions. Thus, in the future, physical models can guide reinforcement machine learning51 approaches to rapidly converge on optimal processing conditions for reproducibility.
Although we demonstrate success with single image detection, other labeling schemes and/or recurrent neural network machine learning architectures may be able to exploit temporal relationships among collected images. In addition to instantaneous classifications, labels can incorporate information from future portions of image sequences within training sets. For instance, a “good” image can also be labeled with the time until a long period of uninterrupted “bad” events occurs. With temporal labels, it may be possible to train machine learning algorithms to predict the onset of production failures. The controller could then use these predictions to modulate the operating conditions to enhance sorting accuracy and minimize disruptions, leading to improved quality of microcapsules or output from other microfluidic production systems.
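Such temporal labels can be derived mechanically from an existing labeled sequence. The sketch below tags each dripping frame with the number of frames until the next sustained non-dripping run; the `min_run` threshold defining a "sustained" run is an assumed knob, not a value from the paper.

```python
def time_to_failure(labels, min_run=3):
    """For each dripping frame, frames until the next run of >= min_run
    consecutive non-dripping frames; None if no such run follows."""
    n = len(labels)
    out = [None] * n
    # Start indices of non-dripping runs lasting at least min_run frames.
    starts = [i for i in range(n - min_run + 1)
              if all(l != "dripping" for l in labels[i:i + min_run])]
    for i, label in enumerate(labels):
        if label == "dripping":
            future = [s - i for s in starts if s > i]
            out[i] = min(future) if future else None
    return out
```

A regression model trained on such targets would, in principle, let the controller act before a failure rather than after detecting one.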
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c8lc01394b
This journal is © The Royal Society of Chemistry 2019