André Colliard-Granero† ab, Keusra A. Gompou† a, Christian Rodenbücher c, Kourosh Malek ab, Michael H. Eikerling abd and Mohammad J. Eslamibidgoli* ab
aTheory and Computation of Energy Materials (IEK-13), Institute of Energy and Climate Research, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany. E-mail: m.eslamibidgoli@fz-juelich.de
bCentre for Advanced Simulation and Analytics (CASA), Simulation and Data Science Lab for Energy Materials (SDL-EM), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
cElectrochemical Process Engineering (IEK-14), Institute of Energy and Climate Research, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
dChair of Theory and Computation of Energy Materials, Faculty of Georesources and Materials Engineering, RWTH Aachen University, 52062 Aachen, Germany
First published on 5th March 2024
The ever-increasing utility of imaging technology in proton exchange membrane water electrolyzer research raises the demand for rapid and precise image analysis. In particular, for optical video recordings, the challenge primarily lies in the large number of frames that impede the delineation of bubble dynamics with standard methods. In order to address this problem, the present study supports the automation of data analysis to facilitate swift, comprehensive, and measurable insights from captured imagery. We present a deep learning-based framework to perform high-throughput analyses of bubble dynamics using optical images of proton exchange membrane water electrolyzers. Leveraging a relatively small annotated imaging dataset of just 35 images, various configurations of the U-Net architecture were trained to perform bubble segmentation tasks. The best model achieved a precision of 95%, a recall of 78%, and an F1-score of 86% on the validation set. Subsequent to segmentation, the methodology enabled the rapid extraction of parameters such as time-resolved bubble area, size distributions, bubble position probability density, and individual bubble shape analytics. The findings underscore the potential of deep learning to enhance the analysis of polymer electrolyte membrane water electrolyzer imaging, offering a path toward more efficient and informative evaluations in electrochemical research.
In particular, the generation of oxygen gas bubbles at the anode of the catalytic layer is a significant factor inhibiting the access of water to the reactive sites, particularly under conditions of high current densities.2,3 Consequently, several studies focus on optimizing the two-phase flow behavior as a means to increase PEMWE performance. For instance, several works have explored the effects of flow field channel geometry on cell efficiency by comparing parallel and serpentine flow fields. Majasan and coworkers found that under a range of relevant operating conditions, including variations in water inflow rate, fluid dynamics, and temperature, a parallel channel configuration is more efficient than a serpentine one.4
The relationship between flow regimes and cell current density is another critical area of study. Dedigama et al. provided evidence that the presence of larger bubbles can displace more water, facilitating the removal of smaller bubbles adhered to the electrode surface, and ultimately improving water transport.5 Aubras et al. demonstrated that the transition from effervescent flow of smaller bubbles to slug flow dominated by larger bubbles enhances overall mass transport, diminishes ohmic resistance, and contributes to improved cell efficiency.6 Generally, the dynamics of bubble coverage, along with the concurrent processes of bubble growth/detachment and electrochemical reactions, are recognized as crucial to cell performance. Su et al. proposed that a high bubble coverage can be indicative of a deficient water supply to the cell, leading to a higher voltage and decreased efficiency.7 These findings collectively underscore the complex interplay between flow design, bubble dynamics, and cell efficiency.
Understanding the processes of bubble formation, growth, and detachment is pivotal in the context of PEMWE, where their interplay determines the two-phase flow dynamics.6 A gap exists in terms of characterization techniques, necessitating advanced methodologies to elucidate the complex behavior of bubbles in PEMWE. Conventionally, the evaluation of bubble dynamics has relied on the analysis of extensive datasets acquired via indirect techniques such as acoustic emission (AE), electrochemical impedance spectroscopy (EIS), and pressure drop measurement. AE serves as a non-destructive, operando diagnostic tool utilizing a piezoelectric sensor to capture mechanical disturbances emanating from an object.8 This methodology facilitates the determination of bubble size distribution and the elucidation of diverse gas-flow patterns through the analysis of acoustic waves generated by bubbles.9,10
However, AE has limited spatial resolution and is susceptible to background noise, which can impede the accurate localization and size determination of bubbles. EIS offers insights into the individual contributions to an electrochemical system's total impedance, encompassing processes such as interfacial charge transfer and mass transport.11 Nonetheless, interpreting impedance data, especially in systems like PEM electrolyzers with evolving bubbles, is challenging, in particular when distinguishing between the various contributing factors in the impedance spectrum, e.g. charge transfer resistance, mass transport limitations, and bubble dynamics. Conversely, pressure drop measurement utilizes pressure sensor probes to detect signals that reflect bubble behavior, allowing for the correlation of these signals to dynamic bubble phenomena, as discussed by Zhang et al.12 Therefore, the integration of indirect techniques with direct observational methods can offer a more holistic understanding of bubble dynamics.
Optical photography facilitates the observation and quantification of bubble behavior.13 While this technique is advantageous for transparent cells, offering unobstructed visualization, the quantification of important parameters can still pose significant challenges. The manual processing of extensive video data sets is not only laborious but also prone to subjective interpretations that can vary with the analyst's bias. The advent of sophisticated image analysis coupled with the evolution of artificial intelligence (AI) offers a compelling solution to these limitations. These technological advances have laid the groundwork for automated, objective extraction of data from image sequences, effectively circumventing the issues of manual analysis.14–17
Recently, Sun et al. performed a study where a multi-task deep learning (DL) network was employed for instance segmentation to elucidate fission gas bubbles within nuclear fuel.18 Anderson et al. utilized DL to autonomously identify helium bubbles in irradiated micrographs and to extract their radii and cumulative volumes. The model exhibited a high accuracy, achieving a 93% success rate in the detection of bubbles on high-magnification micrographs, and maintained robust performance in analyzing lower magnification samples.19 Kim and Park introduced an instance segmentation model designed to autonomously discern and delineate the contours of bubbles across diverse flow conditions, attaining an average precision (AP50) of 98%.20 Nevertheless, DL for bubble analysis in PEMWE remains an under-explored field. The closest work found in the literature employs a state-of-the-art object detection framework, YOLOv7, for the identification of anodic oxygen bubbles within a transparent PEMWE system.2 While this method facilitates the extraction of bubble features, including the area, their count, and the extent of bubble coverage across the electrolyzer, it relies on bounding boxes to approximate bubble detection, a strategy that is prone to inaccuracies, especially when confronted with bubbles that have coalesced or present irregular shapes. Hence, there is a widely recognized need for methodologies capable of delineating the actual contours of individual bubbles during their formation, growth and release times, to enhance the precision of feature extraction and analysis.
To demonstrate the application of deep learning for the localization of gas bubbles in the flow field of a PEM electrolyzer, we employed a commercial reversible fuel cell (FCSU-023, Horizon Fuel Cell Europe, Czech Republic) operated in the electrolysis mode. The housing and the flow field of the cell consisted of green-blueish transparent plastic allowing for operando optical inspection of bubble evolution (Fig. 1a). The cell was driven by a power supply in constant voltage mode (2651A, Keithley, USA), which also served for measuring the current. Before starting each electrolysis run, the anode compartment of the cell was filled with deionized water, which was then consumed during the process as no water circulation was applied. Images of the anode, where the oxygen bubbles were released, were recorded by a digital camera (Lumix TZ 18, Panasonic, Japan) with a frame rate of 30 fps and a resolution of approximately 25 px mm⁻¹. The images were cropped to the active electrode area of approximately 25 × 25 mm² as shown in Fig. 1b, with the water inlet visible in the bottom right and the gas outlet in the top left part. The cyclic voltammogram of the cell obtained at a scan rate of 10 mV s⁻¹ is shown in Fig. 2a. The onset of the electrolysis process can be seen as a steep increase of the current at 1.5 V. Hence, the images were recorded at four different voltages of 1.5, 1.6, 1.7, and 1.8 V, with each voltage applied for approximately 400 s.
Fig. 1 (a) Schematic experimental setup for the recording of the dataset. (b) Exemplification of the obtained data from the experiment from the PEMWE transparent cell and the oxygen bubbles.
Fig. 2 (a) Cyclic voltammogram of the electrolyzer. (b) Current as function of time during the application of different constant voltages.
Our methodology for analyzing bubble dynamics in videos encompasses four principal steps, as illustrated in Fig. 3: (1) manual annotation of oxygen bubbles to prepare the training dataset; (2) employing supervised learning for the semantic segmentation of bubbles; (3) extracting features automatically from the regions of interest (ROI); and (4) statistical analysis and visualization of the findings. Initially, we meticulously annotate the oxygen bubbles, as the precision of segmentation relies heavily on the quality of this labeled data. To facilitate this process, we used Label Studio,21 a comprehensive and intuitive annotation tool. This software facilitates efficient annotation and the automated export of masks in the desired format.
For the training and validation of our model, we annotated thirty-five optical frames. Originally, each frame measured 640 × 630 pixels, but for the training process, we resized them to 512 × 512 pixels. This resizing was crucial to optimize the model's performance, as it specializes in analyzing square cells. Furthermore, maintaining a minimum resolution of 512 × 512 pixels is essential to prevent any potential resolution-related issues. These frames were carefully chosen from various parts of the experiment to represent a broad range of bubble scenarios. This deliberate choice is aimed at improving the model's generalization abilities. An expert, using visual cues, performed the detailed task of annotation, with a special focus on bubbles that were difficult to differentiate. To guarantee the reliability of the annotations, two additional reviewers examined them to reach a consensus on the bubble selection.
In this study, utilizing the provided images and their associated annotated masks as ground truth, we focused on the pixel-wise identification of oxygen bubbles through the implementation of several U-Net-based architectures. These architectures are based on fully convolutional neural networks, which conduct a series of convolutions and down-sampling operations to identify the key patterns present within the images in a latent space. The encoding process progresses iteratively until the information is consolidated into a compact latent representation. Subsequently, a reverse decoding process is initiated: up-sampling, coupled with the application of transpose convolutions, restores the data to its original size. A salient feature of the U-Net architecture is the concatenation of the up-sampled decoder feature map with the encoder feature maps of a matching resolution. This technique facilitates the discernment of object boundaries and edges, harnessing both low-level and high-level features, culminating in a refined segmentation output.22
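The encoder-decoder bookkeeping described above can be traced with a small shape calculation. This is a minimal sketch, not the configuration used in this study: the function name `unet_shapes`, the base width of 64 channels, the depth of 4, and the use of size-preserving (padded) convolutions are all illustrative assumptions.

```python
def unet_shapes(input_size=512, base_channels=64, depth=4):
    """Trace (spatial size, channels) through a generic U-Net.

    Assumptions (hypothetical, not taken from the paper): padded
    convolutions keep the spatial size, each down-sampling step halves
    the size and doubles the channels, and each transpose convolution
    doubles the size and halves the channels.
    """
    encoder = []
    size, channels = input_size, base_channels
    for _ in range(depth):
        encoder.append((size, channels))  # feature map kept for the skip connection
        size //= 2                        # down-sampling
        channels *= 2
    bottleneck = (size, channels)

    decoder = []
    for skip_size, skip_channels in reversed(encoder):
        size *= 2                         # up-sampling by transpose convolution
        channels //= 2
        assert size == skip_size          # resolutions must match to concatenate
        concat_channels = channels + skip_channels
        # (size, channels right after concatenation, channels after the convolutions)
        decoder.append((size, concat_channels, channels))
    return encoder, bottleneck, decoder


encoder, bottleneck, decoder = unet_shapes()
print(encoder)     # [(512, 64), (256, 128), (128, 256), (64, 512)]
print(bottleneck)  # (32, 1024)
print(decoder)     # [(64, 1024, 512), (128, 512, 256), (256, 256, 128), (512, 128, 64)]
```

The middle entry of each decoder tuple shows the effect of the skip connection: the concatenated tensor carries both the up-sampled decoder channels and the encoder channels of the same resolution.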
Within the scope of our research, we experimented with the canonical U-Net framework along with two advanced variations. The first is a U-Net model augmented with a ResNeXt101 network as its backbone, integrating residual connections. The second is the more recent attention U-Net model, which capitalizes on attention mechanisms to further enhance segmentation precision.
The ResNeXt101 U-Net architecture fuses the ResNeXt101 model's capabilities with the U-Net structure. ResNeXt101, a 101-layer variant of the ResNeXt series, utilizes the split-transform-merge strategy. Its unique feature is “cardinality” – parallel paths in a block. Instead of increasing depth or width, elevating cardinality enhances performance. By using ResNeXt101 as U-Net's backbone, the network captures intricate image features more effectively.23
Attention U-Net augments the traditional U-Net with an attention mechanism, enabling the model to concentrate on specific image regions. This is vital for detecting various object scales or minor yet crucial image sections. Attention gates, applied before each decoder concatenation step, weigh encoder features, intensifying key pixels and suppressing irrelevant ones. This heightens segmentation precision, especially when target regions are surrounded by noise.24
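The weighting of encoder features by an attention gate can be sketched in NumPy. The additive form below (1 × 1 convolutions, ReLU, sigmoid) is a common formulation of such gates; it is a simplified illustration under stated assumptions, not the implementation used here. In particular, biases are omitted, the gating signal `g` is assumed to be already at the same resolution as the encoder features `x`, and all weight names are hypothetical.

```python
import numpy as np


def attention_gate(x, g, w_x, w_g, psi):
    """Additive attention gate (simplified sketch).

    x   : encoder features, shape (C_x, H, W)
    g   : gating signal from the decoder, shape (C_g, H, W)
    w_x : 1x1 convolution weights, shape (C_int, C_x)
    w_g : 1x1 convolution weights, shape (C_int, C_g)
    psi : 1x1 convolution weights, shape (C_int,)
    Returns the re-weighted encoder features and the attention map.
    """
    theta_x = np.einsum("ic,chw->ihw", w_x, x)   # project encoder features
    phi_g = np.einsum("ic,chw->ihw", w_g, g)     # project gating signal
    q = np.maximum(theta_x + phi_g, 0.0)         # ReLU
    logits = np.einsum("i,ihw->hw", psi, q)      # collapse to one channel
    alpha = 1.0 / (1.0 + np.exp(-logits))        # sigmoid, values in (0, 1)
    return x * alpha, alpha                      # alpha broadcasts over channels


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16, 16))
g = rng.normal(size=(4, 16, 16))
gated, alpha = attention_gate(
    x, g,
    w_x=rng.normal(size=(6, 8)),
    w_g=rng.normal(size=(6, 4)),
    psi=rng.normal(size=6),
)
print(gated.shape, alpha.shape)  # (8, 16, 16) (16, 16)
```

Pixels with `alpha` close to 1 pass through almost unchanged, while pixels with `alpha` close to 0 are suppressed before the concatenation with the decoder feature map.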
In our study, we evaluated the performance of our trained model using standard evaluation metrics, notably the intersection over union (IoU) threshold. IoU is a commonly used metric in image segmentation, providing a quantitative assessment of the overlap between the model's predictions and the actual annotated masks. It is the ratio of the number of pixels shared by the predicted and true masks (their intersection) to the number of pixels contained in at least one of the two masks (their union) for the target class.
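As a minimal sketch of this metric, assuming binary masks in which 1 marks bubble pixels and 0 marks background, IoU can be computed with NumPy as follows. The function name `iou` and the convention for two empty masks are illustrative choices.

```python
import numpy as np


def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks (1 = bubble, 0 = background)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return float(np.logical_and(pred, true).sum() / union)


pred = np.array([[1, 1, 0],
                 [0, 1, 0],
                 [0, 0, 0]])
true = np.array([[1, 1, 0],
                 [0, 0, 0],
                 [0, 1, 0]])
print(iou(pred, true))  # 2 shared pixels / 4 pixels in the union = 0.5
```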
Typically, a prediction is classified as accurate if the IoU score surpasses 0.5. Under this paradigm, a true positive (TP) signifies a scenario where a pixel is accurately identified as a bubble, evident from an IoU score greater than 0.5. Conversely, a true negative (TN) marks a pixel correctly identified as part of the background. Furthermore, a false positive (FP) arises when a pixel, presumed by the model to represent a bubble, lacks any corresponding annotation in the ground truth. On the other hand, a false negative (FN) emerges when a pixel, indicative of a bubble in the ground truth, is either overlooked or misclassified by the model.
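Fig. 4 shows examples of TP, TN, FP, and FN classifications. Following the determination of these fundamental metrics, we can proceed towards more advanced evaluation metrics like precision, recall, and the F1 score, defined as:
$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \qquad F_1 = 2\,\frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$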
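The following sketch shows how these quantities could be obtained from a predicted and an annotated mask. It uses a plain pixel-by-pixel comparison of binary masks, which is a simplifying assumption and may differ from the IoU-thresholded counting described above; the function names are hypothetical.

```python
import numpy as np


def pixel_confusion(pred_mask, true_mask):
    """Count TP, TN, FP, FN by direct pixel-wise comparison of binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    tp = int(np.sum(pred & true))     # bubble predicted, bubble annotated
    tn = int(np.sum(~pred & ~true))   # background predicted, background annotated
    fp = int(np.sum(pred & ~true))    # bubble predicted, background annotated
    fn = int(np.sum(~pred & true))    # background predicted, bubble annotated
    return tp, tn, fp, fn


def precision_recall_f1(tp, fp, fn):
    """Standard definitions; each metric falls back to 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


pred_mask = np.array([[1, 1, 0, 0],
                      [0, 1, 0, 0]])
true_mask = np.array([[1, 1, 1, 0],
                      [0, 0, 0, 0]])
tp, tn, fp, fn = pixel_confusion(pred_mask, true_mask)
print(tp, tn, fp, fn)  # 2 4 1 1
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

In this toy example precision, recall, and F1 all equal 2/3, because one predicted pixel is spurious and one annotated pixel is missed.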
Utilizing the segmentation maps generated from model predictions, we performed further statistical analysis. By employing elementary mathematical procedures, we calculated the pixel count representing bubbles in each frame, offering an estimation of frame-wise bubble coverage. This analysis facilitated the creation of plots illustrating time-resolved visual coverage and bubble area distributions.
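A minimal sketch of the frame-wise coverage calculation, assuming the predicted masks are stacked into one binary array of shape (frames, height, width). The function name `bubble_coverage` is hypothetical, and the time axis uses the 30 fps frame rate stated for the recordings.

```python
import numpy as np


def bubble_coverage(masks):
    """Fraction of bubble pixels in each frame of a (n_frames, H, W) binary stack."""
    masks = np.asarray(masks, dtype=bool)
    return masks.reshape(masks.shape[0], -1).mean(axis=1)


# Tiny synthetic stack standing in for model predictions
masks = np.zeros((3, 4, 4), dtype=np.uint8)
masks[1, :2, :] = 1                        # half of the second frame is covered
coverage = bubble_coverage(masks)
time_s = np.arange(len(coverage)) / 30.0   # frame index -> seconds at 30 fps
print(coverage)  # [0.  0.5 0. ]
```

Plotting `coverage` against `time_s` gives a time-resolved coverage curve, and a histogram of `coverage` gives the corresponding distribution.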
Furthermore, by overlaying these masks, we derived density maps that provide insights into the spatial distribution of bubbles over the course of the experiment. Focusing on the distinct contours of bubbles in the masks, we formulated computer vision algorithms using the OpenCV library.25 These algorithms were adept at identifying individual bubbles and subsequently extracting a plethora of shape characteristics, enhancing our understanding of bubble dynamics and morphology.
In our research, we undertook a benchmarking study, comparing three distinct variations of the U-Net model. Firstly, we evaluated the conventional U-Net 2D model, which was trained using the ZeroCostDL4Mic approach.26 This was compared with a custom U-Net 2D model that integrated a ResNeXt101 pre-trained on the ImageNet dataset as its backbone. The third model was the more contemporary attention U-Net. The models were trained on the same dataset, consisting of 28 images for training and 7 images for validation, for 100 epochs with a batch size of 8. This training involved segmenting the images into patches and employing a dynamic learning rate which was adjusted in real time based on model performance across epochs.
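The data preparation described above can be illustrated with a short sketch. The 28/7 split and the 512 × 512 frame size come from the text; the random shuffling, the seed, the patch size of 256 pixels, and the function names are assumptions made purely for illustration.

```python
import numpy as np


def split_indices(n_images=35, n_val=7, seed=0):
    """Shuffle image indices and split them into training and validation sets."""
    order = np.random.default_rng(seed).permutation(n_images)
    return order[n_val:], order[:n_val]


def to_patches(image, patch=256):
    """Cut an image into non-overlapping square patches (patch size is hypothetical)."""
    height, width = image.shape[:2]
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    return [image[r:r + patch, c:c + patch]
            for r in range(0, height, patch)
            for c in range(0, width, patch)]


train_idx, val_idx = split_indices()
patches = to_patches(np.zeros((512, 512), dtype=np.uint8))
print(len(train_idx), len(val_idx), len(patches))  # 28 7 4
```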
As illustrated in Table 1, we present a comprehensive array of metrics for these models. In terms of F1 scores, the models showcased similar performance levels; however, the U-Net model augmented with the ResNeXt101 backbone slightly outperformed the others. It was observed that the standard U-Net architecture achieved lower precision metrics but with a compensatory spike in recall compared to the other two architectures. In the context of our bubble segmentation task, the model's precision is especially relevant. A diminished precision hints at the introduction of non-existent bubble pixels, which reduce the interpretability and accuracy when analyzing features extracted from videos.
Model | Precision [%] | Recall [%] | F1-score [%] |
---|---|---|---|
U-Net 2D | 81 | 89 | 85 |
U-Net with ResNeXt101 backbone | 95 | 78 | 86 |
Attention U-Net | 95 | 75 | 84 |
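As a quick consistency check, the F1 scores in Table 1 can be recomputed from the listed precision and recall values using the harmonic mean. The helper name `f1_from_percent` is hypothetical; the numbers are taken directly from the table.

```python
def f1_from_percent(precision_pct, recall_pct):
    """F1 score in percent from precision and recall given in percent."""
    p, r = precision_pct / 100.0, recall_pct / 100.0
    return 100.0 * 2.0 * p * r / (p + r)


table_1 = {
    "U-Net 2D": (81, 89),
    "U-Net with ResNeXt101 backbone": (95, 78),
    "Attention U-Net": (95, 75),
}
for model, (precision, recall) in table_1.items():
    print(f"{model}: F1 = {f1_from_percent(precision, recall):.1f}%")
# U-Net 2D: F1 = 84.8%
# U-Net with ResNeXt101 backbone: F1 = 85.7%
# Attention U-Net: F1 = 83.8%
```

Rounded to whole percentages, these values reproduce the 85, 86, and 84 reported in the table.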
In addition to the models' performances, the decisive step remains the visual evaluation of the segmentation outcomes. To this end, Fig. 5 offers a comparative visualization with sub-figure (a) depicting a frame randomly sourced from the validation set, and sub-figure (b) showcasing the segmentation results as generated by the different models under study. A close inspection of these frames bolsters our conviction in the superior performance of the U-Net model incorporating the custom ResNeXt101 backbone. Notably, this model yields the most seamless contours around the segmented bubbles and achieves good results in separating bubbles in close proximity—a feat not as pronounced in the other models in our dataset.
To provide a more granular view of our results, we have rendered videos of the segmentation outputs. These videos, available on our GitHub repository, allow for a detailed examination and facilitate a deeper appreciation of the differences in each model's performance.
The implementation of automatic video segmentation paves the way for an enhanced software utility capable of extracting intrinsic data from the bubble dynamics unfolding within the cell. These parameters are instrumental for parametrization in simulation sciences and hold the potential to expedite investigations within experimental cohorts. Further, model predictions reveal multifaceted insights pertaining to the bubble dynamics of the utilized PEM electrolyzer.
Fig. 6 delineates the time-resolved bubble ratio and presents its corresponding histogram for four distinct voltages under an otherwise identical experimental configuration. A comparative analysis reveals differences across the experiments. As anticipated, a reduced voltage corresponds to decelerated gas generation, marked by merely four significant gas releases and a diminished mean bubble coverage. Conversely, an increased voltage shows an augmented visual bubble coverage of the electrode, and the time-sequence plots show a higher periodicity in gas releases.
An interesting result obtained through our methodology is the spatiotemporal distribution of bubbles, visualized as a density map, showing bubble position probabilities over a time series. These maps are constructed using model predictions, where each frame's pixels are classified into two categories: ‘0’ representing the background and ‘1’ for the bubbles. Upon aggregating all these predictions, the regions most consistently occupied by bubbles across the time series attain a higher value, which is depicted in the map with warmer hues. Conversely, regions with infrequent bubble presence are represented with cooler colors. Intermediary zones, represented in green, elucidate regions within the cell where bubbles tend to coalesce and accumulate prior to release. Such maps offer a correlation with bubble nucleation points, shedding light on predominant bubble generation zones and their distribution across the cell.
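The aggregation step can be sketched as follows, assuming the per-frame predictions are available as a binary stack of shape (frames, height, width). Normalizing by the number of frames, so that each pixel holds the fraction of frames in which it was classified as bubble, is an illustrative choice; the function name `density_map` is hypothetical.

```python
import numpy as np


def density_map(masks):
    """Per-pixel fraction of frames classified as bubble (0 = never, 1 = always)."""
    masks = np.asarray(masks, dtype=np.float64)
    return masks.sum(axis=0) / masks.shape[0]


# Four synthetic 2x2 frames standing in for model predictions
masks = np.zeros((4, 2, 2), dtype=np.uint8)
masks[:, 0, 0] = 1    # pixel occupied by a bubble in every frame
masks[:2, 0, 1] = 1   # pixel occupied in half of the frames
density = density_map(masks)
print(density)
# [[1.  0.5]
#  [0.  0. ]]
```

Rendering `density` with a color map that runs from cool to warm hues yields a heat map of the kind described above.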
Fig. 7 exemplifies the utility of these density maps. Presented therein are four distinct maps corresponding to a single experiment across varied voltage levels. At lower voltage settings, the map reveals fewer warm-colored regions, indicative of bubble nucleation, and a relatively random distribution pattern across the cell. As the voltage is incremented, there's a palpable increase in these nucleation regions, complemented by a clearer depiction of bubble accumulation zones—the latter exhibiting a subtle proclivity towards the cell's right side. Another salient observation gained from these heat maps is the remarkable consistency in nucleation points across experiments, especially at higher voltage levels.
Our methodology facilitates an individual analysis of bubbles, focusing on their unique morphological attributes. Analyzing the predicted masks, bubbles fully encapsulated by background pixels are classified as distinct entities. Employing specialized computer vision techniques, we derive a series of attributes elucidating the structural properties of individual bubbles detected throughout the experimental phase. These commonly calculated properties in the realm of materials science include:27–29
• Area: denoted by the pixel count of an individual bubble.
• Diameter: inferred under the assumption that the bubble's area aligns with a perfect circle.
• Aspect ratio: derived by inscribing a bounding box around the bubble and calculating the ratio of its longest to shortest side.
• Solidity: a metric indicative of the bubble's convexity. It is defined for a given shape as the ratio of the area of the shape to the area of its convex hull: $\text{Solidity} = A_{\text{bubble}} / A_{\text{convex hull}}$.
• Orientation: determined by superimposing an ellipse on the bubble and calculating the tilt angle of this fitted ellipse.
• Perimeter: represents the contour length of a bubble.
• Extent: defines the bubble's squareness.
• Roundness: assesses the bubble's resemblance to an ideal circle.
This set of parameters, when used together, provides a detailed geometric characterization of bubbles, aiding in a more thorough understanding of their behavior and interaction with surrounding environments.
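Several of the listed descriptors reduce to simple formulas once basic contour measurements are available. The sketch below assumes these measurements (area, perimeter, bounding box, convex hull area) have already been obtained, for example from OpenCV contour functions such as `cv2.contourArea`, `cv2.arcLength`, `cv2.boundingRect`, and `cv2.convexHull`. The formulas for extent (area over bounding-box area) and roundness (4πA/P²) are common conventions and are assumptions here, since the text does not specify them; orientation is omitted because it requires an ellipse fit.

```python
import math


def shape_descriptors(area, perimeter, bbox_width, bbox_height, hull_area):
    """Derive shape descriptors from basic contour measurements (in pixels)."""
    long_side = max(bbox_width, bbox_height)
    short_side = min(bbox_width, bbox_height)
    return {
        "area": area,
        # diameter of the circle that has the same area as the bubble
        "diameter": 2.0 * math.sqrt(area / math.pi),
        "aspect_ratio": long_side / short_side,
        "solidity": area / hull_area,
        # assumed convention: share of the bounding box filled by the bubble
        "extent": area / (bbox_width * bbox_height),
        # assumed convention: 4*pi*A / P^2, equal to 1 for an ideal circle
        "roundness": 4.0 * math.pi * area / perimeter ** 2,
    }


# Ideal circle of radius 10 px as a sanity check
radius = 10.0
circle = shape_descriptors(
    area=math.pi * radius ** 2,
    perimeter=2.0 * math.pi * radius,
    bbox_width=2.0 * radius,
    bbox_height=2.0 * radius,
    hull_area=math.pi * radius ** 2,
)
print(circle)
```

For the ideal circle, the diameter is 20 px, aspect ratio, solidity, and roundness are 1, and the extent is π/4. Deviations from these reference values indicate elongated, concave, or irregular bubbles.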
Given that our data acquisition is frame-centric, it permits temporal analysis. As illustrated in Fig. 8, time-series data for a specific experiment conducted at four distinct voltage levels are plotted. Initially, bubbles are individually analyzed and subsequently averaged on a per-frame basis. Given the volatility in individual data points, which can make the visualization of emergent patterns difficult, we opted for data pre-treatment prior to visualization. We employed a moving average approach, wherein an average spanning 100 frames was taken, which then informed the central line of our plot. The accompanying standard deviation is depicted as a shaded region around this line.
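A minimal sketch of such a smoothing step, assuming one averaged value per frame as input. Whether the shaded band in Fig. 8 is computed over the same window is not stated, so taking the standard deviation within each window is an assumption, as is the function name `rolling_mean_std`. The example uses a window of 4 for readability; the text uses 100 frames.

```python
import numpy as np


def rolling_mean_std(values, window=100):
    """Moving average and standard deviation over full windows only.

    Returns two arrays of length len(values) - window + 1.
    """
    values = np.asarray(values, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(values, window)
    return windows.mean(axis=1), windows.std(axis=1)


per_frame_area = np.arange(10.0)   # synthetic stand-in for per-frame averages
mean, std = rolling_mean_std(per_frame_area, window=4)
print(mean)  # [1.5 2.5 3.5 4.5 5.5 6.5 7.5]
```

The central line of the plot would then follow `mean`, with the band between `mean - std` and `mean + std` drawn as the shaded region.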
This single bubble data provides a comprehensive perspective on the morphological trends of bubbles across temporal intervals. A particular revelation from this data is related to bubble uniformity. For example, smaller standard deviations in the area signify periods of systemic uniformity, indicating a preponderance of similar bubbles. Conversely, pronounced standard deviations, e.g. with respect to area, indicate a bifurcation between small and large bubbles, suggestive of system heterogeneity.
The results shown in this work establish a basis for a detailed, time-resolved characterization of bubble system dynamics within the cell, facilitated through automation.
Additionally, combining the various characteristics previously mentioned provides a robust way to characterize particular systems of interest. From the time-series bubble coverage data, we can discern vital information about cell cycling, the periodicity of prominent gas releases, average bubble coverage, and its inherent volatility. The density maps are instrumental in visually pinpointing nucleation sites and frequent bubble aggregation zones, thereby shedding light on the distribution of active sites and the efficacy of the gas flow field. The contours and characteristics of individual bubbles further provide insights into the bubble formation dynamics and how the environment influences specific bubble shapes. For instance, comparing the average bubble area and its associated standard deviation with the average coverage metrics from the histograms and the active zones from the density maps enables elucidation of the underlying bubble dynamics.
We envisage that our software will significantly enhance the depth and complexity of bubble dynamics analysis within a PEMWE cell. This is achieved by enabling the interpretation of data-rich yet easily conducted experimental recordings. Creating an accessible path to obtain such detailed data is essential for augmenting the sophistication of simulation models aimed at understanding these phenomena and for developing insightful diagnostic techniques. For experimentalists, this tool offers the dual advantage of rapid yet robust characterization, facilitating iterative testing of diverse cell geometries and setups. Furthermore, the workflow is able to analyze several thousands of frames in a matter of minutes, enabling an in-depth analysis that would be unfeasible to obtain manually within a reasonable time frame or cost.
Looking forward, this work will be employed to facilitate further research aimed at deciphering the complexities of bubble dynamics in relation to varying voltages and gas flow fields. Combining this approach with physical modelling, it is possible to unfold the full potential in terms of mechanistic interpretation and prediction of bubble dynamics.
Footnote
† Equal contribution.
This journal is © the Owner Societies 2024