Deep learning methods and applications in single-cell multimodal data integration
Abstract
The integration of multimodal single-cell omics data is a state-of-the-art strategy for deciphering cellular heterogeneity and gene regulatory mechanisms. Recent advances in single-cell technologies have enabled the comprehensive characterization of cellular states and their interactions. However, integrating these high-dimensional and heterogeneous datasets poses significant computational challenges, including batch effects, sparsity, and modality alignment. Deep learning has shown great promise in addressing these issues through neural network-based frameworks, including variational autoencoders (VAEs) and graph neural networks (GNNs). In this Review, we examine cutting-edge deep learning methodologies for integrating single-cell multimodal data, discussing their architectures, applications, and limitations. We highlight key tools such as sciCAN, scJoint, and scMaui, which use deep learning techniques to harmonize diverse omics layers, improve feature extraction, and enhance downstream biological analyses. Despite substantial progress, ensuring model interpretability, scalability, and generalizability across datasets remains challenging. Future research directions include the development of self-supervised learning strategies, transformer-based architectures, and federated learning frameworks to enhance the robustness and reproducibility of single-cell multi-omics integration.