Attention-based multimodal fusion of event-reconstructed images and LIBS spectra using CNN and BiLSTM for metal classification
Abstract
Laser-induced breakdown spectroscopy (LIBS) has been widely employed for the detection and analysis of metal materials. However, most current methods, which primarily combine dimensionality reduction with machine learning, still show limited discriminative power when distinguishing metals with similar compositions. To improve the analytical accuracy of LIBS, this study introduces a dynamic vision sensor (DVS) into the LIBS system to capture the optical emissions from plasma and reconstruct plasma images using an event-frame method. By fusing spectral data and plasma images, we propose a metal classification model based on a temporal-spatial attention fusion network (TSAF Net). TSAF Net combines a one-dimensional convolutional neural network (1D-CNN) with a bidirectional long short-term memory (BiLSTM) network for spectral feature extraction, uses a 2D-CNN for image feature extraction, and incorporates a multi-head attention mechanism for deep cross-modal feature fusion. A fully connected layer then completes the final metal classification task. To better simulate on-site challenges, the experimental setup introduces disturbances such as laser energy fluctuations. The proposed TSAF Net achieves classification accuracies of 93.24% for carbon steel and 94.57% for copper alloys, along with strong macro-averaged precision, recall, and F1 scores. Compared with the best-performing conventional methods, TSAF Net increases classification accuracy by 46.21% for carbon steel and 33.86% for copper alloys. Additionally, TSAF Net exhibits high computational efficiency and maintains a compact model size. This study significantly improves the accuracy of LIBS in the identification of metallic materials and provides new insights for the further development and application of LIBS.
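The abstract describes fusing spectral-branch and image-branch features with multi-head attention before a fully connected classifier. The following is a minimal NumPy sketch of that fusion step only; it is not the authors' implementation. The token counts, feature dimension, number of heads, class count, and the use of random matrices in place of learned weights are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(q, kv, num_heads, rng):
    # Cross-modal attention: queries from one modality, keys/values from
    # the other. q: (Lq, d), kv: (Lk, d). Random projections stand in for
    # the learned weight matrices of a trained model.
    d = q.shape[-1]
    dh = d // num_heads
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))
    Q, K, V = q @ Wq, kv @ Wk, kv @ Wv
    heads = []
    for h in range(num_heads):
        s = slice(h * dh, (h + 1) * dh)
        scores = Q[:, s] @ K[:, s].T / np.sqrt(dh)   # scaled dot-product
        heads.append(softmax(scores) @ V[:, s])
    return np.concatenate(heads, axis=-1) @ Wo       # (Lq, d)

# Toy stand-ins for the two branch outputs (shapes are assumptions):
spec_feats = rng.standard_normal((32, 64))   # e.g. 1D-CNN + BiLSTM tokens
img_feats  = rng.standard_normal((49, 64))   # e.g. 2D-CNN patch features

# Spectral tokens attend to image tokens, then pooled features feed a
# linear classifier head (5 metal classes, chosen arbitrarily here).
fused  = multi_head_attention(spec_feats, img_feats, num_heads=4, rng=rng)
pooled = np.concatenate([fused.mean(0), spec_feats.mean(0)])       # (128,)
logits = pooled @ (rng.standard_normal((128, 5)) / np.sqrt(128))   # (5,)
print(fused.shape, logits.shape)
```

In a trained network the projection matrices and classifier weights would be learned end-to-end; the sketch only shows how attention lets each spectral token weight the image features before classification.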