Insights into infrared crystal phase characteristics based on deep learning holography with attention residual network

Haochong Huang*, Haichao Huang, Zhiyuan Zheng and Lu Gao
School of Science, China University of Geosciences (Beijing), Beijing, 100083, China. E-mail: hchhuang@cugb.edu.cn

Received 18th October 2024, Accepted 7th January 2025

First published on 27th January 2025


Abstract

This paper introduces the infrared crystal phase and provides unconventional mechanistic insights into what is commonly understood as the “crystal phase”. The critical challenge in obtaining a high-fidelity phase is the unwrapping process, which is addressed here using infrared-band holography combined with self-attention mechanism deep learning. This method strikes a balance between the theoretical rigor of physical models and the flexibility of data-driven approaches. Specifically, a short-wavelength infrared digital holographic system and its algorithm were used to acquire high-quality wrapped phases, and the network architecture was then applied for phase unwrapping. In demonstrative applications, static phase-type thickness variation was measured in samples. During moments of intense phase transitions, the microstructural evolution of Na2CO3 crystals was monitored, and the process of perovskite material film formation was observed. The results demonstrated that environmental detection noise and twin images were effectively suppressed, and that phase values continued to vary dramatically after the traditional amplitude signal had stabilized. These discoveries guide the characterization of novel materials and also provide insights into alterations of properties during crystal preparation and growth, which are crucial for the final outcome.


1. Introduction

The formation of crystal materials is complex, particularly as alterations in physical properties during fabrication determine the ultimate outcome. However, existing crystal characterization techniques primarily provide traditional amplitude information, and they significantly limit elucidation of the mechanisms of crystal growth.1–4 For instance, the data from X-ray diffraction methods fail to offer the details of refractive index,5 and while infrared spectroscopy can detect the evolution in chemical bonds, its sensitivity limits its ability to discover minute architectural variations.6 These challenges underscore the need for unconventional insights into the crystal phase to achieve real-time and high-resolution monitoring of physical property changes during crystal growth.

Digital holography (DH) plays a pivotal role in micro-crystal development and material nature analysis.7–9 Compared to the aforementioned approach, DH can provide much richer structural data. The core challenge of this technology lies in phase unwrapping (PU),10 which involves extracting phase information from the intensity measurements of the optical field. Traditional PU methods, such as path-tracking algorithms and minimum norm strategies, continue to be hampered by issues such as path discontinuity and twin-image artifacts.11

To overcome these problems, researchers have explored novel avenues by machine learning techniques.12–14 For instance, Galande et al. attempted to enhance phase recovery quality by combining deep networks with explicit denoisers, yet this approach is parameter-sensitive and poses challenges in dynamic imaging.15 Dyomin et al. studied ZnGeP2 crystal properties within a DH framework, although real-time reconstruction remains a question.16,17 Moreover, end-to-end hologram-to-phasemap recovery methods rely heavily on high-quality annotated datasets of a particular sample and are limited in their interpretability and flexibility.18–20 Similarly, the PU-generative adversarial network (GAN) proposed by Zhou et al. directly recovers continuous phases from wrapped phases (WP), which is inspiring for the crystal phase in PU of DH.21,22

Moreover, the term “phase” is traditionally taken to mean the crystal phase, whereas herein it refers specifically to the phase of the light field studied in the infrared band. Therefore, we present unconventional mechanistic insights into the crystal phase, namely the infrared crystal phase.23 This mode integrates a physical model and algorithm with data-driven methods for high-precision imaging in complex environments. Specifically, an infrared-band digital in-line holographic system is used to capture holograms.24–26 The recorded field is then numerically propagated in free space between the detector plane and the sample plane via a transform. The arctangent function is applied to the resulting complex field to obtain the WP, which is then unwrapped by the network. Experiments demonstrate that this mode excels in static and dynamic scenarios.27,28
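The reconstruction step described above can be illustrated with a minimal sketch. It assumes that the unspecified transform is the angular spectrum method and that the square root of the recorded intensity is back-propagated, which is a common choice for in-line holograms but is not stated in the text. The wavelength and pixel pitch are taken from the system description; the propagation distance and the synthetic sample are hypothetical.

```python
import numpy as np

def angular_spectrum(field, wavelength, pixel, distance):
    """Propagate a complex field over `distance` (metres) with the angular spectrum method."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pixel)          # spatial frequencies along x (1/m)
    fy = np.fft.fftfreq(ny, d=pixel)          # spatial frequencies along y (1/m)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    # Free-space transfer function; evanescent components (arg <= 0) are discarded.
    transfer = np.where(arg > 0, np.exp(1j * kz * distance), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

def wrapped_phase(field):
    """Four-quadrant arctangent of Im/Re, giving values in [-pi, pi]."""
    return np.arctan2(field.imag, field.real)

WAVELENGTH = 1550e-9   # 1550 nm laser (from the system description)
PIXEL = 3.75e-6        # 3.75 um detector pixels (from the system description)
DISTANCE = 5e-3        # hypothetical sample-to-detector distance

rng = np.random.default_rng(0)
sample = np.exp(1j * rng.uniform(0.0, 1.0, (64, 64)))     # synthetic phase object
at_detector = angular_spectrum(sample, WAVELENGTH, PIXEL, DISTANCE)
hologram = np.abs(at_detector) ** 2                       # recorded intensity
# Assumption: back-propagate the hologram amplitude to the sample plane.
back = angular_spectrum(np.sqrt(hologram), WAVELENGTH, PIXEL, -DISTANCE)
wp = wrapped_phase(back)                                  # input to the unwrapping network
```

Because only the intensity is recorded, the back-propagated field still contains a twin image; in the workflow described here, suppressing such artifacts is left to the subsequent network stage.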

2. Experimental section

2.1 Infrared digital holographic system

The hardware part uses vertical lensless DH to record precise holograms of fluctuating liquid samples, as shown in Fig. 1. Its horizontal setup eases target loading and enables real-time process recording. Key benefits include reduced light coherence needs, fewer interference fringes, wider spatial bandwidth, and increased system versatility. It employs a 20 mW, 1550 nm infrared (IR) laser for high-quality, compact illumination. The beam is guided to the probe via fiber coupling, and a reflective parallel light pipe and beam expander provide uniform sample illumination and optimal interference effects, with the holograms captured on the detector.
Fig. 1 In the upper left, a lensless infrared DH system is shown with a 1550 nm distributed feedback (DFB) laser (ROF-LD-2 kHz-M), a direct current power supply from the ETM-3010 series, FC/APC-type fiber optic connectors, a 1280 × 960 phosphor detector (3.75 μm pixels), and a laser, all by Conquer-OC Co., and a target area of 4.8 mm × 3.6 mm. A PU flowchart is shown on the right, and the lower section displays the BSRU-Net structure.

2.2 Network model

On the algorithm front, bilinear interpolation and Squeeze-and-Excitation (SE) attention mechanism Residual U-Net (BSRU-Net), inspired by No New U-Net (nnU-Net), is introduced to enhance PU performance, as shown in Fig. 1. First, in this study, residual blocks were incorporated into the base module of nnU-Net. In the design, the skip connection essentially retained the feature of the previous layer and appended it to the output of the current stage, which aids the network in capturing more local details while simultaneously maintaining global information. The residual strategy also improved cross-layer data flow, alleviating gradient diffusion, and consequently enabled the net to learn and optimize more proficiently during the training process. Then, as illustrated in Fig. 1(c), a dual-branch structure within the downsampling block was adopted for handling local spatial features. The left branch consisted of two 3 × 3 convolutional blocks connected in series. Through the serial connection, the representational ability of regional traits can be progressively enhanced. Then, the right branch only employs one. It can extract details in a relatively independent manner and adjust the number of channels. It simultaneously cooperates with the complex information refined by the two convolutions in the first branch, thereby processing the local spatial features in the entire data more comprehensively and meticulously. For instance, it can more accurately distinguish the minute differences in fine textures and the specific patterns of phase changes. These components together form a residual nnU-Net (RU-Net). However, the capacity of the residual blocks to capture features is limited, and PU is a task that is highly sensitive to details. Therefore, the SE attention mechanism is incorporated into the network subsequent to it to comprehensively acquire minute phase differences, thereby augmenting precision and robustness. 
In addition, the dual branch module behind the SE boosts information transfer and feature expression. The SE-integrated RU-Net (SRU-Net) is thus proposed.
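The dual-branch residual block followed by SE attention can be sketched in PyTorch as follows. This is an illustrative reading of the description above, not the authors' implementation: the class names, the instance normalization, the LeakyReLU activation, the summation of the two branches, and the reduction ratio of 16 are all assumptions.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: rescale each channel by a learned weight in (0, 1)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)                # squeeze: global average per channel
        self.fc = nn.Sequential(                           # excitation: bottleneck MLP
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        weights = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * weights

def conv_block(c_in, c_out):
    """3 x 3 convolution + normalization + activation (norm/activation are assumptions)."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
        nn.InstanceNorm2d(c_out, affine=True),
        nn.LeakyReLU(0.01, inplace=True),
    )

class DualBranchSE(nn.Module):
    """Left branch: two 3 x 3 blocks in series; right branch: one; the sum is passed to SE."""
    def __init__(self, c_in, c_out, reduction=16):
        super().__init__()
        self.left = nn.Sequential(conv_block(c_in, c_out), conv_block(c_out, c_out))
        self.right = conv_block(c_in, c_out)               # also adjusts the channel count
        self.se = SEBlock(c_out, reduction)

    def forward(self, x):
        return self.se(self.left(x) + self.right(x))

torch.manual_seed(0)
block = DualBranchSE(1, 32)
features = block(torch.randn(2, 1, 32, 32))                # two single-channel wrapped-phase patches
```

The right branch acts as the residual path: because it contains a convolution, it can match the channel count of the left branch, so the two outputs can be added before channel re-weighting.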

During upsampling via transposed convolution, a checkerboard effect issue may emerge,29 disrupting the spatial continuity and accuracy crucial for the PU task. In contrast, using bilinear interpolation, which is based on the weighted averaging of surrounding pixel values, can effectively avert the generation of such unnatural patterns. Moreover, in high-resolution PU assignments, transposed convolution entails intricate convolutional kernel operations with high computational complexity. Nevertheless, the modification can achieve upsampling through linear weighted calculations, thereby reducing the numerical burden of the model to some extent and enhancing operational efficiency.
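The difference in learnable parameters between the two upsampling options can be checked with a short PyTorch sketch. The channel counts (64 to 32), the kernel sizes, and the 1 × 1 convolution placed after the interpolation are hypothetical choices made only for this comparison.

```python
import torch
import torch.nn as nn

def n_params(module):
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())

# Option 1: learned upsampling by transposed convolution.
transposed = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)

# Option 2: parameter-free bilinear interpolation, then a 1 x 1 convolution
# that only adjusts the channel count (an assumption for this comparison).
bilinear = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
    nn.Conv2d(64, 32, kernel_size=1),
)

x = torch.randn(1, 64, 16, 16)
y_transposed = transposed(x)      # doubles the spatial size, halves the channels
y_bilinear = bilinear(x)          # same output shape with fewer parameters

print(n_params(transposed), n_params(bilinear))
```

Both paths double the spatial resolution; the interpolation itself has no weights, so every output pixel is a fixed weighted average of its input neighbours rather than the result of a learned kernel.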

Finally, the skip connections reintegrate the high-level data from the encoder into the decoder, which significantly boosts detail restoration and the precision of PU. Based on these, the BSRU-Net is introduced. Additionally, the SE is replaced with the Efficient Channel Attention, and the Convolutional Block Attention Module to form BERU-Net and BCRU-Net, respectively. These two variants are used for comparative tests to validate BSRU-Net's reliability and adaptability.30

2.3 Implementation and dataset information

BSRU-Net is implemented in PyTorch 1.10.1 with Python 3.6.1, and was trained and tested on a computer equipped with an NVIDIA GeForce RTX 3080 graphics processor and an AMD Ryzen 5950X central processor. The loss function employed is the smooth L1 loss. A batch size of 32 was used with the Adam optimizer, and the learning rate of 0.1 was dynamically adjusted (decayed while above 0.000001, and held stable otherwise). The dataset contained 22,000 phase map pairs, with 20,000 for training or validation and 2000 for testing.
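A minimal sketch of this training configuration is given below. The smooth L1 loss, Adam optimizer, batch size of 32, learning rate of 0.1, and floor of 0.000001 come from the text; the use of `ReduceLROnPlateau` as the decay rule, its `factor` and `patience` values, the single-convolution stand-in model, and the random tensors are assumptions for illustration only.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Conv2d(1, 1, kernel_size=3, padding=1)    # hypothetical stand-in for BSRU-Net
criterion = nn.SmoothL1Loss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
# Assumed decay rule: halve the rate when the monitored loss stalls, never below 1e-6.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=2, min_lr=1e-6)

def train_step(wrapped, target):
    """One optimization step on a batch of (wrapped phase, real phase) pairs."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(wrapped), target)
    loss.backward()
    optimizer.step()
    return loss.item()

# Random placeholder batch of 32 small maps (the real maps are 256 x 256).
wrapped = (torch.rand(32, 1, 32, 32) * 2.0 - 1.0) * math.pi
target = torch.rand(32, 1, 32, 32) * 80.0

losses = []
for epoch in range(5):
    loss = train_step(wrapped, target)
    scheduler.step(loss)          # in practice a validation loss would be monitored here
    losses.append(loss)
```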

The generation process involves: (1) creating a random initial matrix of 2 × 2 to 24 × 24 dimensions with random values, ensuring data diversity and model generalization to avoid overfitting; (2) upscaling and cropping to 256 × 256 to avoid low peripheral data; (3) scaling of matrix values to [0, 80], with 60% in [0, 60], and the rest evenly split between [60, 70] and [70, 80] to ensure balance within the data set.
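The three generation steps can be sketched with NumPy as follows. Several details are assumptions: linear interpolation is used for upscaling, a margin of 32 pixels is cropped from each side, and the 60%/20%/20% split is read as the distribution of the maximum phase height of each map. All function names are hypothetical.

```python
import numpy as np

def upscale_linear(mat, size):
    """Separable linear interpolation of a small square matrix to size x size."""
    n = mat.shape[0]
    src = np.linspace(0.0, 1.0, n)
    dst = np.linspace(0.0, 1.0, size)
    rows = np.stack([np.interp(dst, src, r) for r in mat])             # shape (n, size)
    return np.stack([np.interp(dst, src, c) for c in rows.T], axis=1)  # shape (size, size)

def sample_max_height(rng):
    """60% of maps peak in [0, 60]; the rest are split evenly over [60, 70] and [70, 80]."""
    u = rng.random()
    if u < 0.6:
        return rng.uniform(0.0, 60.0)
    if u < 0.8:
        return rng.uniform(60.0, 70.0)
    return rng.uniform(70.0, 80.0)

def make_pair(rng, size=256, margin=32):
    """Return one (wrapped phase, real phase) training pair."""
    n = int(rng.integers(2, 25))                          # (1) random 2 x 2 ... 24 x 24 seed
    seed = rng.random((n, n))
    big = upscale_linear(seed, size + 2 * margin)         # (2) upscale, then crop the centre
    phi = big[margin:margin + size, margin:margin + size]
    span = max(phi.max() - phi.min(), 1e-12)              # guard against a flat map
    phi = (phi - phi.min()) / span * sample_max_height(rng)   # (3) scale to [0, height]
    wrapped = np.angle(np.exp(1j * phi))                  # wrap the real phase
    return wrapped, phi

rng = np.random.default_rng(1)
wrapped, phi = make_pair(rng)
```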

2.4 Evaluation index

Mathematically, the relationship of real phase (RP) and WP is defined by eqn (1):
 
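$$\varphi(x, y) = \arctan\left[\frac{\mathrm{Im}\left(e^{j\phi(x, y)}\right)}{\mathrm{Re}\left(e^{j\phi(x, y)}\right)}\right] \tag{1}$$
where ϕ(x, y) represents the RP, φ(x, y) denotes the WP, Im(·) and Re(·) denote the imaginary and real parts, respectively, j denotes the imaginary unit, and (x, y) denotes the pixel position.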
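The relationship between the RP and the WP can be illustrated numerically. The sketch below wraps a smooth one-dimensional phase ramp and shows that the RP and WP differ by an integer multiple of 2π at every sample; `np.unwrap` is used only as a classical reference for this noise-free case and is not the method proposed here. The ramp itself is a hypothetical test signal.

```python
import numpy as np

def wrap(phi):
    """Wrapped phase as the four-quadrant arctangent of Im/Re of exp(j*phi)."""
    z = np.exp(1j * phi)
    return np.arctan2(z.imag, z.real)

phi = np.linspace(0.0, 20.0, 400)     # smooth real phase (RP), hypothetical
wp = wrap(phi)                        # wrapped phase (WP), confined to [-pi, pi]
k = (phi - wp) / (2.0 * np.pi)        # wrap count: an integer at every sample
recovered = np.unwrap(wp)             # classical 1D unwrapping of a clean signal
```

Phase unwrapping amounts to estimating the integer map k; this is straightforward for a clean, slowly varying signal and becomes ambiguous once noise or aliasing makes neighbouring samples differ by more than π.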

This study employs the peak signal-to-noise ratio (PSNR) to quantify the discrepancies between the reconstructed and original images. It is defined by eqn (2) and (3):

 
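$$\mathrm{MSE} = \frac{1}{mn}\sum_{x=1}^{m}\sum_{y=1}^{n}\left[I(x, y) - O(x, y)\right]^{2} \tag{2}$$

$$\mathrm{PSNR} = 10\log_{10}\left(\frac{\mathrm{MAX}_I^{2}}{\mathrm{MSE}}\right) \tag{3}$$
where m × n denotes the resolution, I and O represent the RP and the PU, respectively, MSE denotes the mean squared error, and $\mathrm{MAX}_I$ denotes the maximum pixel value.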
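A minimal NumPy sketch of the MSE and PSNR computation is shown below. It assumes that the maximum pixel value is taken from the reference (RP) image; the function names and the tiny example arrays are hypothetical.

```python
import numpy as np

def mse(reference, output):
    """Mean squared error between the real phase and the unwrapped phase."""
    reference = np.asarray(reference, dtype=float)
    output = np.asarray(output, dtype=float)
    return float(np.mean((reference - output) ** 2))

def psnr(reference, output):
    """Peak signal-to-noise ratio in dB; infinite when the two images are identical."""
    err = mse(reference, output)
    if err == 0.0:
        return float("inf")
    max_i = float(np.max(reference))      # assumption: peak value of the reference image
    return float(10.0 * np.log10(max_i**2 / err))

reference = np.array([[0.0, 10.0]])
output = np.array([[1.0, 10.0]])
value = psnr(reference, output)           # MSE = 0.5, so PSNR = 10*log10(100 / 0.5)
```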

The structural similarity index (SSIM) serves as a critical tool for evaluating PU image fidelity. It is defined as eqn (4):

 
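$$\mathrm{SSIM}(\tilde{\phi}, \phi) = \frac{\left(2\mu_{\tilde{\phi}}\mu_{\phi} + c_1\right)\left(2\sigma_{\tilde{\phi}\phi} + c_2\right)}{\left(\mu_{\tilde{\phi}}^{2} + \mu_{\phi}^{2} + c_1\right)\left(\sigma_{\tilde{\phi}}^{2} + \sigma_{\phi}^{2} + c_2\right)} \tag{4}$$
where ϕ̃ denotes the predicted result, $\mu_f$ and $\sigma_f^2$ denote the mean and variance of f, respectively, $\sigma_{\tilde{\phi}\phi}$ denotes the covariance between ϕ̃ and ϕ, and c1 and c2 denote constants.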
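The SSIM expression can be evaluated with the simplified sketch below, which uses statistics over the whole image instead of the sliding local windows used by standard SSIM implementations. The constants follow the common convention c1 = (0.01 L)² and c2 = (0.03 L)², with L the dynamic range of the reference; this choice is an assumption, since the constants are not specified here.

```python
import numpy as np

def global_ssim(pred, true, k1=0.01, k2=0.03):
    """Single-window SSIM computed from whole-image means, variances and covariance."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    data_range = true.max() - true.min()              # L: dynamic range of the reference
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_p, mu_t = pred.mean(), true.mean()
    var_p, var_t = pred.var(), true.var()
    cov = np.mean((pred - mu_p) * (true - mu_t))
    numerator = (2.0 * mu_p * mu_t + c1) * (2.0 * cov + c2)
    denominator = (mu_p**2 + mu_t**2 + c1) * (var_p + var_t + c2)
    return float(numerator / denominator)

rng = np.random.default_rng(0)
true_phase = rng.uniform(0.0, 10.0, (32, 32))         # hypothetical reference map
noisy_phase = true_phase + rng.normal(0.0, 1.0, (32, 32))
score_same = global_ssim(true_phase, true_phase)
score_noisy = global_ssim(noisy_phase, true_phase)
```

An identical prediction gives a score of 1, and any mismatch in mean, variance, or correlation lowers it.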

SSIM may not be able to fully capture the negative impacts brought about by aliasing, and its sensitivity to local high-frequency detail changes is relatively limited. However, binary error mapping (BEM)31 can quantify the performance of the model in handling these sensitive regions. Thus, BEM is applied according to eqn (5), but it contains defects in the fault tolerance regarding the accuracy of unwrapping (AU) and the precision of recovering high phase when the WP and the RP are very similar. In this study, bias factor α = max (phase value)/200 for low-phase error allowance and 1% exactness for high-phase are introduced. These visually show pixel differences and boost PU evaluation comprehensiveness.

 
image file: d4ta07450e-t5.tif(5)
where BEM assigns 1 to correctly unwrapped pixels (CUP), and 0 to others. To quantify the proportion of CUP, the AU is calculated according to eqn (6):

$$\mathrm{AU} = \frac{1}{mn}\sum_{x=1}^{m}\sum_{y=1}^{n}\mathrm{BEM}(x, y) \times 100\% \tag{6}$$
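One possible reading of the BEM and AU definitions is sketched below. The exact tolerance rule of eqn (5) is not reproduced here, so the rule used in the code (a pixel is correct when its absolute error is within the larger of the bias factor α = max(phase value)/200 and 1% of the true phase) is an assumption, as are the function names and the example values.

```python
import numpy as np

def binary_error_map(pred, true, divisor=200.0, rel_tol=0.01):
    """Return 1 for correctly unwrapped pixels and 0 otherwise (assumed tolerance rule)."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    alpha = np.max(np.abs(true)) / divisor               # bias factor for low phase values
    tol = np.maximum(alpha, rel_tol * np.abs(true))      # 1% exactness for high phase values
    return (np.abs(pred - true) <= tol).astype(np.uint8)

def unwrapping_accuracy(bem):
    """AU: percentage of correctly unwrapped pixels."""
    return 100.0 * float(np.mean(bem))

true_phase = np.array([[10.0, 20.0], [30.0, 40.0]])
pred_phase = true_phase.copy()
pred_phase[0, 0] += 2.0 * np.pi      # one full-cycle unwrapping error
pred_phase[1, 1] += 0.3              # small error, within 1% of 40
bem = binary_error_map(pred_phase, true_phase)
au = unwrapping_accuracy(bem)
```

In this example only the pixel with the 2π jump is marked as incorrect, so three of the four pixels count as correctly unwrapped.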

3. Results and discussion

To corroborate BSRU-Net's accuracy and feasibility, a comparative test was carried out and analyzed visually. Its PSNR reached 52.64, outperforming the least squares (LS) method, the phase unwrapping max-flow (PUMA) algorithm,32 deep learning phase unwrapping (DLPU), BERU-Net, and BCRU-Net by 46.32, 21.63, 14.9, 8.24, and 7.32, respectively. The SSIM was near perfect at 0.9997. Additionally, the selected line segment in Fig. 2(h3) shows uniform color, unlike the alternating red and blue colors observed in other approaches, indicating high consistency with true labels. BSRU-Net's MSE is the lowest at 0.02, with minimal differences in selected segments. Visually, Fig. 2(h2) has no black areas and has the highest AU value of 99.91%, unlike Fig. 2(c2)–(g2), particularly in phase regions bounded by the red rectangle. These findings demonstrate BSRU-Net's superior PU performance.
Fig. 2 (a) WP. (b) The ground truth (GT) phase. (c1–h1) PU from LS, PUMA, DLPU, BERU-Net, BCRU-Net, and BSRU-Net, top to bottom. (c2–h2) BEM diagrams for (c1–h1), with AU values. (c3–h3) Compare phase height plots for specified segments between (b) and (c1–h1), with MSE values.

Testing BSRU-Net's generalizability requires real phase values as a reference. These are difficult to obtain, yet verifying the network output is crucial. Therefore, four complicated phase samples were created: ‘C’, ‘NLP’, ‘RST’, and ‘CUGB’. First, ‘RST’ was more complex and challenging than ‘C’. DLPU's AU for ‘RST’ was 23.3% of that for ‘C’, and DLPU showed instability with high-phase data in Fig. 3(e1) and (e3). However, BERU-Net increased AU by 17.04%, BCRU-Net decreased by 7.32%, and BSRU-Net only decreased by 2.32%, thus achieving 96.56% accuracy. The results underscore BSRU-Net's robust reliability and precision in PU for sophisticated samples.


Fig. 3 (a) WP. (b1–b4) RP. (c1–c4) PUMA's output. (d1–h4) BEM diagrams for samples processed by PUMA, DLPU, BERU-Net, BCRU-Net, and BSRU-Net with AU scores, respectively. (i1–i4) 3D visualizations of outputs from BSRU-Net. (j1–j4) Metric values for the data (‘C’, ‘NLP’, ‘RST’, and ‘CUGB’) were analyzed by the methods.

Then, ‘NLP’ exhibited a higher AU and SSIM across methods, as shown in Fig. 3(d2)–(h2). However, PUMA's output demonstrated that the edge continuity was not satisfactory. When dealing with complex samples (except for ‘NLP’), all its indicator values were inferior to those of other approaches. In addition, large areas of black regions were found in the corresponding BEM diagram. Thus, PUMA is characterized by insufficient generalization ability and low stability.

‘RST’ caused significant negative fluctuations in all networks except BSRU-Net, as shown in Fig. 3(j3). Moreover, ‘CUGB’ exhibited aliasing, and possesses the highest phase height and maximum object distribution, as shown in Fig. 3(a4). However, BSRU-Net achieved a SSIM of 0.9908 and AU of 86.25%, outperforming the others. Concurrently, Fig. 3(j1)–(j4) also shows it undergoing the smallest fluctuation, with an average SSIM of 0.996 and AU stability at 95.35%. Thus, it is a promising tool for real PU applications, effectively mapping from WP to RP.

To enhance the persuasiveness of the BSRU-Net experimental results, several algorithms were selected for comparative analysis. The state of Natron (Na2CO3 and Na2CO3·10H2O) crystal at a certain moment in the crystallization process was selected for study. The presence of wire-drawing and plaque phenomena in Fig. 4(a7) indicates that PU cannot be effectively carried out.33 Fig. 4(a3) shows that there are aliasing regions. When processing such areas, the methods employed in Fig. 4(a5) and (a6) result in poor unwrapping performance, with both showing discontinuous phase sections. The computational efficiency of minimum cost flow (MCF) is the lowest,34 with the current output time reaching up to 88 seconds. Moreover, as the complexity of the WP increased, there was a further downward trend in its efficiency. In contrast, the unwrapping effect demonstrated in Fig. 4(a4) is superior to the former two. One possible reason for this is that the LS method achieves PU by minimizing the error of the entire image. The interrelationships among all pixel points in the image are fully considered with this method, which assists in maintaining the continuity of the phase and thereby effectively suppressing noise. However, the implementation process of the PUMA is relatively complex. Its optimization procedure based on graph cuts is more sensitive to noise, leading to a reduction in the accuracy of the unwrapping results.


Fig. 4 (a1) Hologram. (a2–a3) Amplitude and WP in turn. (a4–a8) The PU of LS, PUMA, MCF, Goldstein's branch cut, and BSRU-Net, respectively. (a9) Amplitude-phase select line comparisons. (b1–b5) The WP after wavelet transform denoising and the PU of LS, PUMA, MCF, and Goldstein's branch cut. (c1–c5) The WP after median filtering denoising, and the PU of the corresponding algorithms. (d1–d5) The WP after mean filtering denoising and the PU of the method.

In Fig. 4(a9), due to the influence of noise, compared with LS, the curve fluctuations of the PUMA and MCF algorithms are extremely significant. However, BSRU-Net is stable, and its trend is more consistent compared with other line segments, thereby indicating certain advantages in capturing the overall trend of phase changes in crystals. In addition, except for BSRU-Net, there was significant interference in the other algorithms by twin images and uneven light distribution noise outside the object region. The existence of a large amount of salt-and-pepper noise can also be observed from Fig. 4(a3).

To ensure the equity of the comparison, wavelet transform, median filtering, and mean filtering methods were adopted for denoising treatment, and then, the unwrapping operation was subsequently performed. However, the results in Fig. 4(b1)–(d5) show that there is a loss of WP information after denoising, and the unwrapping effects of the corresponding methods are not satisfactory. In subsequent studies, the LS method will be selected for comparison. To avoid the loss of phase information, no denoising treatment will be carried out, and the noise resistance performance of the algorithm will completely depend on the algorithm itself. Current research findings indicate that under such circumstances, BSRU-Net's capability demonstrates certain advantages among them.

Subsequently, the BSRU-Net was applied to measure the phase-type thickness of samples and to explore the variations in the physical properties of sodium carbonate crystals and perovskite crystals. In addition, although the network outperformed the traditional method in simulated data, LS served as a control to bolster the validity of the findings. The trials were as follows. First, the workpiece was laser-engraved with a ‘CUGB’ pattern to a depth of 316 nm, creating subtle height differences from the overall flatness, resulting in significant phase modulation effects. The selected depth of 316 nm was approximately one-fifth of the wavelength of 1550 nm, thereby indicating a very high precision. Comparing Fig. 5(a4) and (a5), the outcome illustrates that CNN effectively reduces background noise and illumination inconsistencies, thus enhancing the fidelity of the workpiece's surface representation. In Fig. 5(b2), the contours of ‘CUGB’ are precisely outlined, and no twin image is detected. Moreover, in the case of this small depth, the phase value was easily overwhelmed by noise. Instead, analysis of phase differences from Fig. 5(a5) revealed an average groove depth of 312.47 nm, and a low error rate of 1.12%, which closely matched the pre-experiment measurements. Furthermore, as evident in Fig. 5(b3), the network-generated result curve exhibited reduced fluctuations and a smoother profile compared to the LS outcome, clearly suppressing the noise impact. These data not only verify the high precision of the net in PU of real samples, but also highlight its outstanding capability to suppress noise and extract refined phase data. These advantages of BSRU-Net are of great significance for the real-time monitoring of crystals.
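The two ratios quoted in this paragraph can be checked directly, assuming that the error rate is taken relative to the nominal engraving depth of 316 nm:

```latex
\frac{316\ \text{nm}}{1550\ \text{nm}} \approx 0.204 \approx \frac{1}{5},
\qquad
\frac{\lvert 316 - 312.47 \rvert}{316} \times 100\%
  = \frac{3.53}{316} \times 100\% \approx 1.12\%
```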


Fig. 5 (a1) The ‘CUGB’ hologram; (a2–a3) Amplitude and WP. Reprinted and reused with permission from ref. 4 © American Chemical Society. (a4–a5) LS unwrapped phase and BSRU-Net output. (b1–b3) Microscopic photo, (a5)'s 3D view, and amplitude-phase select line comparisons for (a2), (a4), and (a5).

Additionally, 1.5 μL of a 20.6% Na2CO3 solution was utilized in the experiment to observe the crystal growth process, particularly during moments of intense phase transitions. By using this technique, the features of crystals can be detected and characterized, and stages of crystal growth at those times can be further determined, such as the nucleation period in Fig. 6(e1), the growth step in Fig. 6(e2)–(e4), and the maturation stage in Fig. 6(e5). Notably, regions outside the object showed stripe noise and uneven light, as seen in Fig. 6(d1)–(d5), suggesting that LS is notably susceptible to noise during crystallization. In contrast, BSRU-Net's output at corresponding time instances demonstrated substantial noise suppression. The LS results also show that, due to the pseudo-lens effect, a phase-free region appears at the edge of the water droplet. However, because the network adopts an end-to-end data-driven processing mechanism, this problem does not exist in the training set.


Fig. 6 (a) Natron holograms at varying times. (b1–b5) Corresponding amplitude maps. (c1–c5) WP. (d1–d5) and (e1–e5) LS, BSRU-Net unwrapped phases, respectively. (f1–f5) Amplitude-phase select line comparisons for (d1–e5).

While continuously learning and adapting to various changes and abnormal situations in the data, the model utilizes its internal parameter adjustment system and feature-processing logic to reasonably repair and reconstruct the phase information at the edge of the droplet, successfully suppressing the influence of this problem. During the crystallization process of Na2CO3, phase shifts arise due to physical changes such as varying crystal part growth rates, environmental factors such as temperature and pressure, and internal crystal stresses, which cause minute changes in the optical path. In DH, it can be leveraged to study crystal growth dynamics and material properties. For instance, the irregularities at the edges of the sample's main region in Fig. 6(e1)–(e5) reflect the actual physical changes, yet this genuine information is obscured by LS, and the edge information is weakened. This suggests that LS fails to capture detailed information on the complex surface of crystals.

As shown in the segments of Fig. 6(f1)–(f5), the middle part contains phase information, yet amplitude cannot represent it. The figure shows that the amplitude curve and the LS phase curve fluctuate extremely violently, indicating that it is susceptible to noise interference. Although the segment trends of BSRU-Net and LS are alike to a certain extent, the phase curve of the network is relatively smooth and exhibits spatial continuity. The experimental results demonstrate that BSRU-Net, harnessing SE attention, pinpoints crucial information, leveraging its architecture to mitigate interference. As a result, its curve distributions more accurately mirror the dynamics and authenticity of complex crystal material.

The crystallization of the CsPbBr3 solution and film formation were also monitored.35 Because the perovskite material may exhibit high photosensitivity in the infrared region, using high-intensity infrared light as much as possible can more effectively activate the photoresponse of the material, thereby achieving higher sensitivity during the hologram recording process. However, due to the light-gathering effect, the formation of this strong light area occurs, which will interfere with the phase information within the area.

Based on the collected data, the second and third rows of Fig. 7 show the results obtained by performing diffraction propagation on the corresponding holograms at specific moments. The root cause of this effect lies in the limited dynamic range of the infrared detector in the detection system, leading to the hologram acquisition reaching the threshold. However, studying the phase information during the perovskite film-forming process, especially during moments of intense phase transitions, is of great significance for the characterization of new materials. Therefore, regions of interest were selected for research.


Fig. 7 (a) CsPbBr3 perovskite crystal holograms at varying times. (b1–b8) Corresponding amplitude maps. (c1–c8) WP. (d1–d8) and (e1–e8) LS and BSRU-Net unwrapped phases, respectively. (f1–f8) Amplitude-phase select line comparisons for (d1–d8) and (e1–e8).

Fig. 7(d1)–(d8) shows that the crystal variations are dramatic, with phase shifts leading to uneven light field distribution and noise, such as interference fringe in areas other than the object. In response to this challenge, LS smoothed out the actual physical phenomena, blurring the edges and structural information of the object. In contrast, at the same moment, BSRU-Net effectively suppressed noise such as light spots and characterized the object's contour and structural data more clearly, as shown in Fig. 7(e1)–(e8). The net also successfully suppressed the influence of the pseudo-lens effect.

Moreover, a comparison of the selected line segments indicated that the amplitude values were stable and failed to effectively represent the object information, while the phase data from LS tended to be smooth. However, CsPbBr3 perovskite crystals exhibited satisfactory photoelectric effects, which indicated a complex crystal structure. The line segment changes presented by the BSRU-Net at different moments are more consistent with the actual crystal structure compared to LS in Fig. 7(f1)–(f8).

Finally, based on the line segment changes at different moments during the process of Na2CO3 crystal development and perovskite film formation, it was demonstrated that throughout the crystal growth process, even after the traditional amplitude signal ceased to change, the phase values continued to exhibit dramatic physical phenomena, offering a guide for the preparation and characterization of novel materials.

Ablation experiments assessed model modules' impact on performance, with ‘A’ using nnU-Net, ‘B’ employing RU-Net, ‘C’ utilizing SRU-Net, and ‘D’ applying BSRU-Net. There was more rapid convergence with ‘D’, and lower loss, as shown in Fig. 8(e), indicating the effectiveness of its refinements. On the validation set, ‘D’ displayed the highest stability by the 50th epoch. Table 1 exhibits performance metrics as averages from the test set. ‘B’ increased M-PSNR and M-SSIM by 1.6 and 0.0007 over Set ‘A’. ‘C’ enhanced M-PSNR and decreased M-MSE by 1.79 and 0.098 over ‘B’. Despite having fewer parameters, ‘D’ possessed the strongest expressive capability and superior metrics, reducing parameters by approximately 8% compared to ‘C’, and avoiding overfitting. Thus, BSRU-Net's balance of complexity and performance offers insights for future PU model design.


Fig. 8 (a) The WP with increasing GS α and SP β by 0.2, enhancing noise in phase images. (b1–b4) BEM maps for BSRU-Net under varying noise. (c1) RP. (c2–c5) Output images under different noise levels, with metrics in (d1–d4) for X = 55. (d1–d4) Metric changes for different phase heights with α and β at 0.2, 0.4, 0.6, 0.8; the graph's right axis is PSNR, left axis is SSIM and AU (as decimals), and the x-axis is max phase height, with each point corresponding to a phase result. (e) Loss function is altered during training and validation of ablation networks.

Table 1 Ablation experiment comparison metrics
Experiment  R    SE   Bilinear  M-PSNR  M-SSIM  M-MSE  Params (M)
A           no   no   no        44.72   0.9976  0.151  93.20
B           yes  no   no        46.32   0.9983  0.157  100.37
C           yes  yes  no        48.11   0.9984  0.059  100.38
D           yes  yes  yes       52.10   0.9993  0.049  92.35


Because noise impacts PU accuracy, testing BSRU-Net's noise resistance was essential. This study added Gaussian (GS) and salt-and-pepper (SP) noise to a noise-free dataset with increasing factor ratios (0.2:0.2, 0.4:0.4, 0.6:0.6, 0.8:0.8). The GS factor α (maximum standard deviation) and the SP factor β (noise density) control the noise levels. A random standard deviation matrix was used to enhance noise randomness. As α and β increased, noise intensity increased, testing BSRU-Net's performance under various noise conditions. This study also used mixed noise and increasing phase values to simulate real-world noise. Seven samples with different phase heights were tested, and only the result at a maximum phase of X = 55 is shown. Despite increasing noise, the results indicate that the output phase images closely match the true images in Fig. 8(c1)–(c5). The PSNR remained stable in Fig. 8(d1) and (d2). SSIM and AU values fluctuated minimally with noise intensity in Fig. 8(a1)–(a3), stabilizing after an initial dip, as shown in Fig. 8(d4) at X = 47.
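The mixed-noise procedure can be sketched with NumPy as follows. The random standard deviation matrix bounded by α and the density β follow the description above; re-wrapping after the Gaussian step and setting salt-and-pepper pixels to ±π are assumptions, as is the function name.

```python
import numpy as np

def add_mixed_noise(wrapped, alpha, beta, rng):
    """Add Gaussian noise (per-pixel std drawn from [0, alpha]) and salt-and-pepper noise."""
    sigma = rng.uniform(0.0, alpha, size=wrapped.shape)        # random std matrix, max alpha
    noisy = wrapped + sigma * rng.standard_normal(wrapped.shape)
    noisy = np.angle(np.exp(1j * noisy))                       # assumption: re-wrap to [-pi, pi]
    corrupted = rng.random(wrapped.shape) < beta               # beta: fraction of corrupted pixels
    salt = rng.random(wrapped.shape) < 0.5
    noisy[corrupted & salt] = np.pi                            # assumption: salt = +pi
    noisy[corrupted & ~salt] = -np.pi                          # assumption: pepper = -pi
    return noisy

rng = np.random.default_rng(0)
wp = rng.uniform(-3.0, 3.0, (200, 200))                        # hypothetical wrapped phase map
levels = [(0.2, 0.2), (0.4, 0.4), (0.6, 0.6), (0.8, 0.8)]      # factor ratios from the text
noisy_maps = [add_mixed_noise(wp, a, b, rng) for a, b in levels]
```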

Table 2 shows that Fig. 8(d1) displays the highest metric values, suggesting minimal noise impact at this level. As the noise grew stronger, the network's PU metrics (M-PSNR, M-SSIM, and M-AU) decreased only slightly, with the lowest values being 10.8%, 0.18%, and 2.9% less than the highest, respectively, maintaining high performance. These experiments effectively evaluated BSRU-Net's noise resistance, supporting its stability and reliability in real-time infrared crystal characterization.

Table 2 Comparison of the average metrics under varying noise intensities
Fig. 8 panel  M-PSNR  M-SSIM  M-AU
(d1)          49.52   0.9990  99.70%
(d2)          47.08   0.9985  99.04%
(d3)          45.57   0.9972  98.75%
(d4)          44.16   0.9985  96.79%
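The relative drop of each metric between its highest and lowest value in Table 2 can be verified with a few lines of Python. The dictionary simply transcribes the table; the helper name is hypothetical.

```python
# Values transcribed from Table 2, ordered (d1), (d2), (d3), (d4).
table2 = {
    "M-PSNR": [49.52, 47.08, 45.57, 44.16],
    "M-SSIM": [0.9990, 0.9985, 0.9972, 0.9985],
    "M-AU": [99.70, 99.04, 98.75, 96.79],
}

def relative_drop(values):
    """Percentage by which the lowest value falls below the highest."""
    highest, lowest = max(values), min(values)
    return 100.0 * (highest - lowest) / highest

drops = {name: relative_drop(values) for name, values in table2.items()}
for name, drop in drops.items():
    print(f"{name}: {drop:.2f}%")
```

The drops come out at roughly 10.8% for M-PSNR, 0.18% for M-SSIM, and 2.9% for M-AU.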


4. Conclusions

In summary, this paper introduced unconventional insights regarding infrared crystal phases by combining a short-wave infrared holographic system with BSRU-Net, effectively overcoming the challenge of PU. It significantly improved PU performance over traditional LS methods, with a 93.24% increase in AU, and 87.99% and 89.51% increases in PSNR and SSIM, respectively. The MSE was reduced by 99.99%, confirming BSRU-Net's high precision. Its generalization is shown by the average values of SSIM, AU, and MSE, which are 0.996, 95.35% and 0.22, respectively, surpassing other approaches and crucial for in-depth analysis of phase details in material physical properties. Furthermore, ablation experiments resulted in parameter reduction with a 71.52% MSE decrease, confirming the network structure's high adaptability.

A comparison was also made with several classic and commonly used algorithms, demonstrating the competitiveness of BSRU-Net. Noise resistance tests were conducted at various intensities, providing data support for accurate PU in real samples. Based on the system, BSRU-Net performed experimental studies on static ‘CUGB’ workpiece thickness measurement. The data (error rate of 1.12%) revealed its accuracy in real samples, and its adeptness at noise reduction and phase detail extraction, providing reliability for the characterization of infrared crystals. BSRU-Net also monitored the microstructural evolution during the growth of Na2CO3 crystals and CsPbBr3 perovskite crystals.

The trials demonstrated the successful untangling of phase information while suppressing environmental detection noise, twin images, and pseudo-lens effects. It is worth noting that the outcomes also illustrate that after the traditional amplitude signals stabilize, changes in phase values continue to dramatically occur. These discoveries show the applicability and practical value of the unconventional insights, and hold significant importance for research on new materials for sodium-ion batteries and advanced materials for perovskite photovoltaic cells.

Data availability

The data that support the findings of this study are available within the article.

Author contributions

Haochong Huang: conceptualization (lead); investigation (lead); funding acquisition (lead); supervision (lead); methodology (lead); data curation (lead); writing – review and editing (lead). Haichao Huang: conceptualization (lead); investigation (lead); software (lead); data curation (lead); methodology (lead); writing – original draft (lead); writing – review and editing (lead). Zhiyuan Zheng: methodology; writing – review and editing. Lu Gao: methodology; writing – review and editing.

Conflicts of interest

There are no conflicts to declare.

Acknowledgements

The authors would like to express their gratitude to the reviewers for their professional suggestions and explanatory discussions. The authors sincerely appreciate the invaluable assistance rendered by Spozmai Panezai (Department of Imaging Physics, Delft University of Technology) during the preparation of the paper. The authors wish to acknowledge the support of the National Natural Science Foundation of China (61805214, 42072087), Open Fund of State Key Laboratory of Infrared Physics (SITP-NLIST-YB-2024-12), Piesat Information Technology remote sensing interdisciplinary research project (HTHT202202), Fundamental Research Funds for the Central Universities (2-9-2022-203), Young Elite Scientists Sponsorship Program by BAST (BYESS2020037), and Frontiers Science Center for Deep-time Digital Earth (2652023001).

Notes and references

  1. G. M. Walkden and J. R. Berry, Nature, 1984, 308, 525–527.
  2. Y. Zhang, R. Wang and Z. Tan, J. Mater. Chem. A, 2023, 11, 11607–11636.
  3. S. M. Kronawitter and G. Kieslich, Chem. Commun., 2024, 60, 11673–11684.
  4. H. Huang, E. Yuan, D. Zhang, D. Sun, M. Yang, Z. Zheng, Z. Zhang, L. Gao, S. Panezai and K. Qiu, Cryst. Growth Des., 2023, 23, 7992–8008.
  5. M. Vassholz, H. P. Hoeppe, J. Hagemann, J. M. Rosselló, R. Mettin, T. Kurz, A. Schropp, F. Seiboth, C. G. Schroer, M. Scholz, J. Möller, J. Hallmann, U. Boesenberg, C. Kim, A. Zozulya, W. Lu, R. Shayduk, R. Schaffer, A. Madsen and T. Salditt, Nat. Commun., 2021, 12, 3468.
  6. J. Pan, Z. Chen, T. Zhang, B. Hu, H. Ning, Z. Meng, Z. Su, D. Nodari, W. Xu, G. Min, M. Chen, X. Liu, N. Gasparini, S. A. Haque, P. R. F. Barnes, F. Gao and A. A. Bakulin, Nat. Commun., 2023, 14, 8000.
  7. Y. Yang, J. Cui, H. J. Guo, X. Shen, Y. Yao, R. C. Yu and R. Wen, J. Mater. Chem. A, 2021, 9, 15038–15044.
  8. S. Zhao, Y. Kang, M. Liu, B. Wen, Q. Fang, Y. Tang, S. He, X. Ma, M. Liu and Y. Yan, J. Mater. Chem. A, 2021, 9, 18927–18946.
  9. H. Huang, Z. Li, Q. Zhang, J. Hui, Z. Zheng, D. Sun and S. Panezai, IEEE Trans. Instrum. Meas., 2024, 73, 4508337.
  10. K. Wang, L. Song, C. Wang, Z. Ren, G. Zhao, J. Dou, J. Di, G. Barbastathis, R. Zhou, J. Zhao and E. Y. Lam, Light: Sci. Appl., 2024, 13, 4.
  11. E. G. Tsiplakova, Y. V. Grachev and N. V. Petrov, Appl. Phys. Lett., 2024, 125, 091108.
  12. K. Song, G. Xu, A. N. M. Tanvir, K. Wang, M. O. Bappy, H. Yang, W. Shang, L. Zhou, A. W. Dowling, T. Luo and Y. Zhang, J. Mater. Chem. A, 2024, 12, 21243–21251.
  13. S. Zhu, K. Jiang, B. Chen and S. Zheng, J. Mater. Chem. A, 2023, 11, 3849–3870.
  14. J. Ojih, M. Al-Fahdi, Y. Yao, J. Hu and M. Hu, J. Mater. Chem. A, 2024, 12, 8502–8515.
  15. A. S. Galande, V. Thapa, H. P. R. Gurram and R. John, Appl. Phys. Lett., 2023, 122, 133701.
  16. V. V. Dyomin, A. I. Gribenyukov, A. Y. Davydova, A. S. Olshukov, I. G. Polovtsev, S. N. Podzyvalov, N. N. Yudin and M. M. Zinovev, Appl. Opt., 2021, 60, A296–A305.
  17. Z. Li, H. Huang, D. Sun, Z. Zheng, F. Wang, S. Panezai, J. Xing, Y. Yang and K. Qiu, Cryst. Growth Des., 2024, 24, 6851–6864.
  18. H. Ren, W. Shao, Y. Li, F. Salim and M. Gu, Sci. Adv., 2020, 6, 4261.
  19. X. Shi, Z. Wu, Z. Liu, J. Lv, Z. Zi and R. Che, J. Mater. Chem. A, 2022, 10, 8807–8816.
  20. Y. Zhai, H. Huang, D. Sun, S. Panezai, Z. Li, K. Qiu, M. Li, Z. Zheng and Z. Zhang, Opt. Lasers Eng., 2024, 178, 108201.
  21. L. Zhou, H. Yu, V. Pascazio and M. Xing, IEEE Trans. Geosci. Rem. Sens., 2022, 60, 1–10.
  22. Z. Wu, T. Wang, Y. Wang, R. Wang and D. Ge, IEEE Trans. Geosci. Rem. Sens., 2021, 60, 1–16.
  23. S. I. Shkuratov, J. Baird, V. G. Antipov, C. S. Lynch, S. Zhang, J. B. Chase and H. R. Jo, J. Mater. Chem. A, 2021, 9, 12307–12319.
  24. C. Martin, L. E. Altman, S. Rawat, A. Wang, D. G. Grier and V. N. Manoharan, Nat. Rev. Methods Primers, 2022, 2, 83.
  25. S. Gupta and S. Bhattacharyya, Chem. Commun., 2024, 60, 11685–11701.
  26. A. M. G. Muchlis and C. Lin, J. Mater. Chem. A, 2024, 12, 26471–26483.
  27. X. Ding, Q. Zhou, Z. Wang, L. Liu, Y. Wang, T. Song, F. Wu and H. Gao, J. Mater. Chem. A, 2024, 12, 27598–27609.
  28. F. Wang, F. Fan, D. Yuan and X. Wang, Int. J. Opt., 2022, 2022, 6577057.
  29. A. Odena, V. Dumoulin and C. Olah, Distill, 2016, 1, e3.
  30. K. Wang, Y. Li, Q. Kemao, J. Di and J. Zhao, Opt. Express, 2019, 27, 15100–15115.
  31. Y. Qin, S. Wan, Y. Wan, J. Weng, W. Liu and Q. Gong, Appl. Opt., 2020, 59, 7258–7267.
  32. J. M. Bioucas-Dias and G. Valadao, IEEE Trans. Image Process., 2007, 16, 698–709.
  33. W. Huang, X. Mei, Y. Wang, Z. Fan, C. Chen and G. Jiang, Measurement, 2022, 200, 111566.
  34. Z. Lin, Y. Duan, Y. Deng, W. Tian and Z. Zhao, Rem. Sens., 2022, 14, 2543.
  35. O. V. Minin, I. V. Minin and Y. Cao, Sci. Rep., 2023, 13, 7732.

Footnote

Haochong Huang and Haichao Huang contributed equally to this work and should be considered co-first authors.

This journal is © The Royal Society of Chemistry 2025