Xiaolong Liu,ab Yifei Jiang,c Yutong Cui,ab Jinghe Yuan*a and Xiaohong Fang*abc
aKey Laboratory of Molecular Nanostructure and Nanotechnology, CAS Research/Education Center for Excellence in Molecular Sciences, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, China. E-mail: xfang@iccas.ac.cn; jhyuan@iccas.ac.cn
bUniversity of Chinese Academy of Sciences, Beijing 100049, P. R. China
cInstitute of Basic Medicine and Cancer, Chinese Academy of Sciences, Hangzhou 310022, Zhejiang, China
First published on 22nd September 2022
Single-molecule microscopy is advantageous in characterizing heterogeneous dynamics at the molecular level. However, several challenges currently hinder the wide application of single-molecule imaging in biochemical studies, including how to perform single-molecule measurements efficiently with minimal run-to-run variations, how to analyze weak single-molecule signals efficiently and accurately without the influence of human bias, and how to extract complete information about the dynamics of interest from single-molecule data. As a new class of computer algorithms that simulate the human brain to extract data features, deep learning networks excel in task parallelism and model generalization, are well-suited for handling nonlinear functions and extracting weak features, and thus provide a promising approach for single-molecule experiment automation and data processing. In this perspective, we will highlight recent advances in the application of deep learning to single-molecule studies, discuss how deep learning has been used to address the challenges in the field as well as the pitfalls of existing applications, and outline directions for future development.
However, there are several barriers that hinder the wide application of single-molecule imaging in biochemical studies. Firstly, single-molecule imaging is generally a delicate and time/labor-consuming process, which requires high stability of the instrument and extensive experience from the researcher. Run-to-run variation increases measurement errors and makes the results hard to interpret. Secondly, single-molecule signals are often weak and heterogeneous, with various types of dynamics. The event of interest is also convolved with noise and photophysical kinetics, as well as instrument fluctuations, which results in highly complex data.16 Traditional algorithms that assume the data follow a certain distribution might not work well with single-molecule data.17 Thirdly, single-molecule imaging typically generates a large amount of data. Its analysis requires much time and effort from experienced users, and the procedures are easily affected by human subjective factors, which affects the accuracy and consistency of the analysis.
Recently, as a new class of computer algorithms that simulate the human brain to extract data features, deep learning has been applied to a wide range of research fields with excellent performance.18 Deep learning networks excel in task parallelism and model generalization and are well-suited for handling nonlinear functions and extracting weak features, which provide a promising approach for single-molecule experiment automation and data processing.19 Recent published studies that apply deep learning to single-molecule imaging and analysis have shown that, compared to previous algorithms, deep learning provides superior performances in terms of sensitivity, accuracy, and processing speed.
In this perspective, we will first introduce the basic principles of single-molecule microscopy, in particular the current challenges in experiment automation and data processing. Then we will review recent advances in deep learning in single-molecule studies and highlight how deep learning has been used to address the challenges in the field. Finally, we will summarize the current state of deep learning in single-molecule imaging and data analysis, discuss the pitfalls of the existing applications, and outline directions for future development. It should be noted that deep-learning-assisted SMLM, including single-molecule localization methods,20–22 image reconstruction,23,24 background estimation,25 and point spread function (PSF) engineering,26–30 has received broad attention and been reviewed extensively.1,31–33 Equally important but often neglected areas are single-molecule imaging automation and single-molecule feature recognition, which will be the focus of this review.
In general, single-molecule imaging methods enhance the imaging SNR and detection sensitivity by reducing the excitation/detection volumes using different types of fluorescence microscopes. For example, total internal reflection fluorescence microscopy (TIRFM)34–38 exploits evanescent waves to selectively excite molecules near the interface; confocal microscopy uses pinholes to filter out non-focal fluorescence signals; light-sheet fluorescence microscopy (LSFM) uses a 2D light sheet to illuminate and image samples in thin slices; etc. Among these methods, TIRFM has a very shallow imaging depth and is most suitable for studying lateral structures/dynamics; confocal microscopy, as a point-scanning technique, offers 3D resolution but suffers from low imaging efficiency; LSFM,39,40 on the other hand, combines wide-field planar excitation with axial optical sectioning, which offers a balanced performance between axial resolution and imaging speed.41–43
In addition to the various excitation schemes, single-molecule detection schemes can also be modified to extend the imaging depth and obtain additional information. For example, PSF engineering methods use conventional epi excitation schemes and modify the shape of the PSF to reflect the axial position of the fluorophore. By introducing cylindrical optics or phase plates into the detection light path, the conventional Gaussian-like PSF can be transformed into elliptical, double-helix, and tetrapod shapes.44–47 PSF engineering methods offer very good temporal resolution and extend the imaging depth to as deep as 20 μm, which is particularly useful for 3D imaging and particle tracking.48 Hyperspectral imaging determines the spectra of individual molecules through dispersion of fluorescence photons, which can provide information about structure and dynamic heterogeneities.49–52 In addition, the emission polarization of fluorescent probes can be used to study the orientation and the rotational movements of biomolecules.53–58
The spatial resolution of conventional optical microscopy is limited by the diffraction of light.59 Depending on the numerical aperture of the objective and the imaging wavelength, the lateral and axial resolutions of fluorescence microscopy are typically 200–300 nm and 500–600 nm, respectively. Driven by the interest in studying biological structures/processes below the diffraction limit, a variety of methods have been developed to further improve the spatial resolution of fluorescence imaging, including single-molecule localization methods, such as stochastic optical reconstruction microscopy (STORM)60 and photoactivated localization microscopy (PALM),61 methods that exploit fluorophores' non-linear response to excitation, such as stimulated emission depletion microscopy (STED)62–64 and ground state depletion microscopy (GSD),65 and post-acquisition processing methods, such as super-resolution optical fluctuation imaging (SOFI),66 etc. Among these techniques, SMLM67 has attracted particular research interest, as it offers high spatial resolution while using relatively simple instrumentation.66,68,69 By combining SMLM with PSF engineering, 3D super-resolution imaging with lateral and axial resolutions of 5 nm and 10 nm, respectively, has been demonstrated, which greatly improves the level of detail provided by single-molecule imaging.70,71 SMLM and its deep learning applications have been extensively reviewed.1,31–33,72–77 Due to the limited space, we will focus on single-molecule imaging and only mention SMLM briefly in this review.
Overall, advances in imaging techniques have increased the detection sensitivity, imaging depth, and spatial and temporal resolution of single-molecule imaging. Due to the high sensitivity of the measurement, maintaining focus and minimizing sample drift are crucial for reducing measurement variations and obtaining reliable results. In addition, advanced applications, such as deep particle tracking and 3D single-molecule imaging, require careful optimization and calibration of the instruments. To address these challenges, we will discuss how deep learning has been used to set up single-molecule experiments, optimize imaging conditions, and improve the quality of the results in Sections 4.1 and 4.2.
Localization of single molecules in the images is the first step of single-molecule data processing. Single-molecule localization allows obtaining basic information such as location, intensity, and orientation of single molecules, which can be used to visualize the subcellular structure and to construct the fluorescence intensity and position traces. The conventional approach is to use a two-dimensional Gaussian distribution to fit the PSF. Multiple iterations are performed using maximum likelihood estimation (MLE)78 or nonlinear least squares (NLLS)79 until the best Gaussian model is found. Such an iterative approach is usually time-consuming. There is also the wavelet segmentation algorithm, which converts the raw data into wave maps and performs single-molecule localization using a wavefront to accelerate the process.80
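As an illustration of the localization task, the following is a minimal sketch (not any published implementation): it simulates a Gaussian-shaped PSF on a pixel grid and estimates the emitter position with an intensity-weighted centroid, the usual cheap starting point that iterative MLE or NLLS fitting would then refine. All function names are ours.

```python
import math

def gaussian_spot(cx, cy, sigma, amp, size):
    """Simulate a diffraction-limited spot as a 2D Gaussian PSF on a pixel grid."""
    return [[amp * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(size)] for y in range(size)]

def centroid_localize(img):
    """Estimate the emitter position as the intensity-weighted centroid.
    A full fitter would refine this initial guess with MLE or NLLS iterations."""
    total = sum(sum(row) for row in img)
    cx = sum(x * v for row in img for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(img) for v in row) / total
    return cx, cy

# Noise-free example: the centroid recovers the true sub-pixel position.
img = gaussian_spot(cx=7.3, cy=6.8, sigma=1.5, amp=100.0, size=15)
x_hat, y_hat = centroid_localize(img)
```

With noise and background, the centroid becomes biased, which is why the iterative Gaussian fitting described above remains the reference method.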
After initial localization, analysis of the fluorescence intensity traces can be used to obtain a variety of valuable information on biomolecule structures and functions (Fig. 1). For example, counting the number of steps in a photobleaching trajectory can be used to determine the single-molecule aggregation state (Fig. 1b), smFRET analysis can be used to study protein interactions (Fig. 1c), single-molecule recognition through equilibrium Poisson sampling (SiMREPS) can be used to characterize the binding dynamics of biomolecules (Fig. 1d), etc.5 For diffusing molecules, the trajectory is constructed by linking localized positions between sequential frames,81,82 which can be used to characterize the state of single molecules and their interactions with the microenvironment (Fig. 1a).83,84 Many physical parameters associated with biological processes can be extracted from the analysis of the trajectories, such as total displacement, furthest distance from the starting point, confinement ratio, local orientation, directional change, instantaneous velocity, mean curve rate, root mean square displacement (RMSD) and diffusion coefficient (D).85,86 These parameters reflect the state of single molecules and their interactions with the surroundings. For example, molecular diffusion models, such as Brownian motion, directional diffusion, confined diffusion, etc., are extensively used to analyze the interactions of proteins on the membrane.7
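As a minimal illustration of trajectory analysis, the sketch below (assuming free 2D Brownian motion; all names are ours) computes the time-averaged MSD of a synthetic track and reads the diffusion coefficient off the relation MSD(lag) = 4·D·lag·dt.

```python
import math
import random

def msd(track, max_lag):
    """Time-averaged mean squared displacement of a 2D trajectory."""
    curve = []
    for lag in range(1, max_lag + 1):
        sq = [(track[i + lag][0] - track[i][0]) ** 2 +
              (track[i + lag][1] - track[i][1]) ** 2
              for i in range(len(track) - lag)]
        curve.append(sum(sq) / len(sq))
    return curve

def diffusion_coefficient(track, dt, n_lags=4):
    """For free 2D Brownian motion MSD(lag) = 4*D*lag*dt, so D can be read off
    the first few MSD points. Real data would need a motion-model check first."""
    curve = msd(track, n_lags)
    return sum(curve[k] / (4 * dt * (k + 1)) for k in range(n_lags)) / n_lags

# Synthetic Brownian track with D = 0.5 (arbitrary units), dt = 0.1 s.
random.seed(0)
dt, d_true = 0.1, 0.5
sigma = math.sqrt(2 * d_true * dt)  # per-axis step standard deviation
track, x, y = [], 0.0, 0.0
for _ in range(2000):
    track.append((x, y))
    x += random.gauss(0.0, sigma)
    y += random.gauss(0.0, sigma)
d_hat = diffusion_coefficient(track, dt)
```

On this 2000-frame track the estimate recovers D to within a few percent; shorter experimental tracks carry much larger statistical error, which is part of why model classification is hard.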
Single-molecule data analysis is challenging for traditional methods. On one hand, single-molecule data often contain a variety of dynamics and do not follow a certain distribution. On the other hand, single-molecule imaging typically generates a large amount of data, whose analysis is easily influenced by human bias, with reduced accuracy and consistency. We will discuss how deep learning algorithms have been used to address these problems and facilitate single-molecule data analysis in Sections 4.3 and 4.4.
The basic units of DNNs are neurons.18 Each neuron is a simple operator that yields an output from multiple inputs. Multiple neurons in parallel form a layer, and the output of the neurons in one layer is used as the input of the neurons in the next layer, thus forming a neural network. The number of layers, the number of neurons in each layer, and the weights of the neurons are all adjustable parameters of the model. The parameters are determined by learning from a large amount of training data. Due to the advantages of task parallelism and model generalization, DNNs can be used to fit nonlinear functions and simulate the feature extraction functions of the human brain.
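The forward pass described above can be sketched in a few lines of plain Python (a toy 3-2-1 network for illustration, not a practical implementation; the weights here are arbitrary, not learned):

```python
def neuron(inputs, weights, bias):
    """One neuron: weighted sum of the inputs plus a bias, through a nonlinearity."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return max(0.0, z)  # ReLU activation

def layer(inputs, weight_matrix, biases):
    """A layer is many neurons in parallel; its output feeds the next layer."""
    return [neuron(inputs, w, b) for w, b in zip(weight_matrix, biases)]

# Toy network: 3 inputs -> 2 hidden neurons -> 1 output neuron.
x = [0.5, -1.0, 2.0]
h = layer(x, [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]], [0.0, 0.1])
y = layer(h, [[-1.0, 1.0]], [0.0])
```

Training would adjust the weight matrices and biases by backpropagation; only the forward computation is shown here.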
Deep neural networks can be divided into two main categories in terms of training methods:88 supervised learning networks and unsupervised learning networks. Supervised learning feeds the model with already-labeled data for training. The output targets of the training data are known in advance, and the model only needs to iterate continuously so that the objective function converges to a minimum error. The advantage of supervised learning is the high accuracy of the trained model. However, the data need to be labeled in advance, which is difficult for some applications due to the lack of a priori knowledge. In contrast, unsupervised learning is a type of learning in which the training data do not need to be labeled in advance and the model automatically finds features and classifies all the data. As a result, unsupervised learning performs well in cluster analysis and is able to find small classes that traditional methods cannot find.
The most widely used deep neural networks are convolutional neural networks (CNNs).89 CNNs are suitable for processing multidimensional data, such as images, audio signals, etc. CNNs generally consist of several types of network layers: an input layer, convolutional layers, activation layers, pooling layers, and fully connected layers. The input layer feeds the raw or pre-processed data into the network. As the core layer in CNNs, the convolutional layer performs a sliding-window operation using small convolutional kernels to detect different features, similar to the receptive field in a biological visual system. The activation layer converts a linear mapping into a nonlinear mapping using a nonlinear activation function, such as a rectified linear unit or sigmoid function. The pooling layer is a down-sampling layer, sandwiched between successive convolutional layers, used to compress the number of parameters and reduce overfitting. In a fully connected layer, all the neurons between two successive layers are interconnected with weights to map the learned features into the sample labeling space.
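A minimal sketch of these layers, using plain Python lists rather than a deep learning framework, shows how a convolution kernel, an activation and a pooling step interact (the kernel and image are illustrative):

```python
def conv2d(img, kernel):
    """Slide a small kernel over the image (valid padding), like a receptive field."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(kernel[i][j] * img[y + i][x + j]
                 for i in range(kh) for j in range(kw))
             for x in range(w)] for y in range(h)]

def relu(fmap):
    """Activation layer: element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Pooling layer: down-sample by keeping the maximum in each 2x2 window."""
    return [[max(fmap[y + i][x + j] for i in range(size) for j in range(size))
             for x in range(0, len(fmap[0]) - size + 1, size)]
            for y in range(0, len(fmap) - size + 1, size)]

# A vertical-edge kernel applied to a 6x6 image whose right half is bright:
# the feature map responds only at the edge, and pooling keeps that response.
img = [[0, 0, 0, 9, 9, 9]] * 6
edge = [[-1, 0, 1]] * 3
fmap = max_pool(relu(conv2d(img, edge)))
```

A real CNN stacks many such kernel/activation/pooling stages and learns the kernel values from data instead of hand-coding them.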
Recurrent neural networks (RNNs) are often used for time-series data, such as speech signals.89 An RNN applies a neural network recursively, taking as input both the current element and a hidden layer in which information about all the past processed elements is preserved.90 The recurrent weights are applied again with each propagation step, so the contribution of earlier information shrinks progressively (the vanishing gradient problem). To solve this problem, long short-term memory (LSTM) networks have been developed.91 By adding a forget gate, LSTMs choose which memory to remember or forget and can preserve long-term information. Long-term memory is propagated by linear summation operations, so that vanishing gradients do not occur in back-propagation. LSTMs have been shown to perform better than conventional RNNs in most problems.91
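The gating logic can be made concrete with a toy scalar LSTM cell (a didactic sketch with arbitrary weights; real implementations vectorize this and learn the weights):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W):
    """One LSTM step on scalar inputs. The forget gate f decides how much of
    the long-term cell state c to keep; the additive update to c is what
    avoids the vanishing gradients of a plain RNN."""
    f = sigmoid(W["f"][0] * x + W["f"][1] * h_prev + W["f"][2])    # forget gate
    i = sigmoid(W["i"][0] * x + W["i"][1] * h_prev + W["i"][2])    # input gate
    g = math.tanh(W["g"][0] * x + W["g"][1] * h_prev + W["g"][2])  # candidate
    o = sigmoid(W["o"][0] * x + W["o"][1] * h_prev + W["o"][2])    # output gate
    c = f * c_prev + i * g   # linear (additive) cell-state update
    h = o * math.tanh(c)     # new hidden state
    return h, c

# Run a short sequence through the cell with fixed illustrative weights.
W = {k: (0.5, 0.5, 0.0) for k in "figo"}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 1.0]:
    h, c = lstm_step(x, h, c, W)
```

A bidirectional LSTM, as used by the DGN discussed later, simply runs one such cell forward and a second one backward over the sequence and concatenates their hidden states.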
A generative adversarial network (GAN) contains two competing networks: a generative network and a discriminative network.92 The generative network is used to generate data based on a probability distribution, and the discriminative network is used to extract features from the generated data. The two models are trained to promote each other. As a type of GAN, the discriminator–generator network (DGN) uses two bi-directional long short-term memory networks (biLSTMs) as a generator and a discriminator, respectively.93,94 BiLSTMs can access both past and future contexts to improve the prediction results. The discriminator is used to map the input sequence to a hidden state vector, and then the generator recovers the input time sequence from this hidden state vector. The discriminator and generator are jointly trained to optimize the prediction accuracy, thus uncovering the hidden state behind a time series.
Considering that single-molecule imaging data are mainly images and time series, CNN- and RNN-based networks are well-suited for single-molecule data analysis. In addition, in the case of single-molecule data with unknown features or with features that cannot be labeled, a GAN-based unsupervised network is particularly useful.
The application of deep learning in single-molecule data analysis includes two stages: model training and experimental data analysis. Model training is time- and power-consuming; once the model is trained, the analysis of experimental data only takes seconds to minutes. For most deep learning tasks, a personal computer with an appropriate configuration is enough. Three components need to be considered: the central processing unit (CPU), the graphics processing unit (GPU), and random access memory (RAM). A deep learning model typically processes a large amount of data, and the CPU mainly limits the speed of data loading and pre-processing; most mainstream CPUs meet the requirements. Most deep learning models are trained on the GPU. A capable GPU with no less than 8 GB of memory can accelerate training, e.g., the NVIDIA GTX 1080 to RTX 3080 series and the Titan series. Insufficient RAM will limit the processing speed of the CPU and GPU; the RAM should be larger than the GPU memory, and we recommend more than 32 GB. There are also several cloud computing platforms providing free GPUs, which facilitate the use of deep learning and project sharing, e.g., Amazon Web Services (AWS), Microsoft Azure, and Google Colaboratory.
In Table 1, we have listed recent representative applications of deep learning in single-molecule imaging/analysis and summarized the key information, including network type, input/output of the model, training hardware and training time. In this part, we will review these applications in detail and compare the performances of the various deep learning algorithms.
Applications | Network type | Input | Output | GPU | Train time | Ref.
---|---|---|---|---|---|---
Autofocus for SMLM | CNN | Defocus image | Defocus degree | GeForce RTX 2080 SUPER VENTUS XS OC | 3 h | Lightley95
Offline autofocus for fluorescence microscopy | GAN | Defocus image | Focused image | GeForce RTX 2080Ti | 30 h | Luo96
Single-shot autofocus for fluorescence microscopy | Fully connected Fourier neural network (FCFNN) | Defocus image | Defocus degree | GTX 1080Ti | 30 min (GPU); 15 h (CPU) | Pinkard97
Automated single-molecule imaging | CNN | Image | Classified image based on expression level | NVIDIA Quadro 4000 | — | Yasui98,99
Protein stoichiometry analysis for epidermal growth factor receptors (EGFRs) | CNN, LSTM | Single-molecule intensity-time series | Aggregation state | GeForce 1080Ti | — | Xu16
Protein stoichiometry analysis for the transforming growth factor-β type II receptor (TβRII) | biLSTM | Single-molecule intensity-time series | Aggregation state and state-change dynamics | — | — | Yuan100
Protein stoichiometry analysis for the chemokine receptor CXCR4 | CNN | Single-molecule images | Aggregation state and state-change dynamics | GeForce 2080Ti | — | Wang101
Protein stoichiometry analysis for CXCR4 | CNN, LSTM | Single-molecule blinking intensity time series | Aggregation state | GeForce RTX 2080Ti | — | Wang102
FRET trace classification | LSTM | smFRET intensity time series | Classified intensity time series based on FRET activity | — | — | Thomsen103
DNA point mutation recognition by SiMREPS and FRET trace classification | LSTM | SiMREPS or smFRET intensity time series | Classified intensity time series based on binding activity | NVIDIA TESLA V100 | 1 h (GPU); 5–10 h (CPU) | Li104
Diffusion model classification | CNN | Single-molecule position traces | Diffusion type | NVIDIA GeForce Titan GTX | — | Granik105
Autofocus is a very useful function in microscopic imaging. It can quickly find the focal plane without human judgment and, in addition, prevent samples from defocusing during long-duration imaging. Traditional real-time autofocus includes two main types: hardware-based autofocus and image-based autofocus. Hardware-based autofocus relies on an additional sensor that detects the back-reflection signal from the coverslip to determine the focus drift and then performs refocusing. Lightley et al.95 recently improved the working distance of a hardware-based autofocus system by developing a CNN-based algorithm. A diode laser with a wavelength of 830 nm is focused onto a coverslip, and the detector is located on the conjugate plane of the coverslip. The reflected NIR laser is detected with a camera and the spatial distribution of intensity is recorded (Fig. 2a). The shape of this distribution depends on the focal condition. A CNN model is trained with images acquired at various out-of-focus depths, so that the off-focus distance can be quickly calculated and corrected by analyzing the distribution shape during the imaging process (Fig. 2b). This method has been applied in SMLM and works well over a range of ±100 μm. Image-based autofocus, in contrast, takes a series of images along the Z-axis and determines the off-focus distance by calculating the sharpness of feature edges.
Fig. 2 Different types of deep-learning-assisted autofocus systems. (a) Instrumentation for a CNN-assisted hardware-based online autofocus system. (b) The process of CNN training for the hardware-based autofocus application.95 Copyright © 2021, The Authors. (c) Overview of the integration of the deep-learning-assisted image-based autofocus method with a custom-built LSFM.106 Copyright © 2021, Optica Publishing Group.
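For the image-based route, a common sharpness score is the variance of the image Laplacian (our illustrative choice here, not necessarily the metric used in the cited works): in-focus images have strong edges and therefore a high-variance Laplacian response.

```python
def laplacian_variance(img):
    """Sharpness score: variance of the discrete 5-point Laplacian over the
    interior pixels. Sharper edges give a larger variance, so the in-focus
    frame in a Z-stack is the one that maximizes this score."""
    h, w = len(img), len(img[0])
    lap = [img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
           - 4 * img[y][x]
           for y in range(1, h - 1) for x in range(1, w - 1)]
    mean = sum(lap) / len(lap)
    return sum((v - mean) ** 2 for v in lap) / len(lap)

# A crisp intensity edge scores higher than a gradually ramped ("blurred") one.
sharp = [[0 if x < 4 else 9 for x in range(8)] for _ in range(8)]
blurred = [[0, 1, 2, 3, 5, 6, 7, 9] for _ in range(8)]
```

Scanning such a score over a Z-stack is exactly the slow search that the deep learning methods above replace with a single-shot or two-shot prediction.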
Pinkard et al.97 reported a single-shot focusing method based on deep learning, which relies on one or more off-axis illumination sources to find the correct focal plane. While the idea of single-shot focusing is compelling, the requirement for an extra illumination source could limit its application in single-molecule imaging. Li et al.106 developed a deep learning model for autofocusing in LSFM (Fig. 2c). Hundreds of defocused image stacks are acquired, each containing a series of images with various off-focus distances. For every image stack, two defocused images with a known off-focus distance are fed into the network for training, with the known defocus distance serving as the ground truth. After training, this model can determine the off-focus distance from two defocused images in LSFM. The model has been demonstrated in the imaging of mouse forebrain and pig cochlea samples.
More recently, an offline autofocus method has been developed. Luo et al.96 developed a post-imaging autofocus system called Deep-R based on a GAN. Images with different levels of defocus, together with the corresponding in-focus images, are used for training. The generative network takes the out-of-focus images as input and outputs in-focus images, and the discriminative network then tries to distinguish the generated in-focus images from the real ones. The two networks are trained jointly, and the in-focus image generated by the model is compared with the actual in-focus image to reject incorrect models. After training, with an out-of-focus image as the input, the generative network is able to generate the corresponding in-focus image quickly and accurately.
Baddeley et al. developed the Python-Microscopy Environment (PYME),107 an integrated platform for high-throughput SMLM. A Mask R-CNN deep neural network is trained to detect nuclei for automatic ROI selection. Mask R-CNN is a flexible framework for object instance segmentation, which has been applied in human pose estimation, tumor identification, artifact detection, etc. The BBBC038v1 dataset, which contains a large number of segmented nuclei images, is used as training data. The system exploits data compression, distributed storage, and distributed analysis for automatic real-time localization analysis, which massively increases the throughput of SMLM to 10,000 cells per day.
Cellular proteins generally function as multimers, aggregates or protein complexes. Single-molecule photobleaching step-counting analysis (smPSCA) has become a common method to count the number of fluorescent proteins within a diffraction-limited spot and determine the stoichiometry and aggregation state of the proteins. In photobleaching trajectories, the dynamics of interest are easily confused with various types of noise and photophysical kinetics, such as photoblinking, which are not accounted for in conventional analysis methods, such as the filter method,108 threshold method, multiscale product analysis, motion t-test method, and step-fitting method.109 Taking the temporal information into account, the hidden Markov model (HMM) can partially eliminate the interference of photoblinking.110 However, HMM methods show a weak ability to correlate long-term events and require users to preset parameters such as initial states, state numbers and a transition matrix. All the methods above require the input of parameters based on prior knowledge of the biological system as well as the algorithms, which could be challenging for users and might affect the accuracy of the analysis. Xu et al.16 reported the first deep learning model to solve these problems in smPSCA, referred to as the convolutional and long short-term memory deep learning neural network (CLDNN) (Fig. 3a). This model consists of both convolutional and LSTM layers. Single-molecule photobleaching traces are used as input data, and the output is the number of steps. The convolutional layer is introduced to accurately extract features of step-like photobleaching events, and the LSTM layer remembers the previous fluorescence intensity for photoblinking identification. Manually labeled experimental data and artificially synthesized data are used as the training sets. Once the model is trained, it can analyze a large amount of data quickly without setting any parameters.
The CLDNN model effectively removes the interference of photoblinking and noise in bleaching-step recognition. Compared to the previously reported algorithms for smPSCA, the CLDNN shows higher accuracy, exceeding 90% even at a low SNR (SNR = 1.9), and 2–3 orders of magnitude higher computational speed.
Fig. 3 Deep learning for single-molecule stoichiometry studies. (a) Architecture of the CLDNN for smPSCA.16 Copyright © 2019, American Chemical Society. (b) The training and performance of the DGN for both smPSCA and dynamics identification from fluorescence intensity traces.100 Copyright © 2020, The Author(s).
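To make the step-counting task concrete, the sketch below simulates an idealized photobleaching trace (no blinking) and counts steps with a hand-tuned window/threshold rule, exactly the kind of parameter-dependent heuristic that the CLDNN replaces with learned features. All names, window sizes, and thresholds here are illustrative.

```python
import random

random.seed(1)

def photobleach_trace(n_fluors, step=10.0, frames_per_state=50, noise=1.0):
    """Simulate a trace in which fluorophores bleach one by one."""
    trace = []
    for level in range(n_fluors, -1, -1):
        trace += [level * step + random.gauss(0.0, noise)
                  for _ in range(frames_per_state)]
    return trace

def count_steps(trace, window=10, threshold=5.0):
    """Count downward intensity steps by comparing the mean intensity in a
    window before and after each frame, then skipping past each detected
    transition so it is only counted once."""
    drops = [sum(trace[i - window:i]) / window - sum(trace[i:i + window]) / window
             for i in range(window, len(trace) - window)]
    steps, i = 0, 0
    while i < len(drops):
        if drops[i] > threshold:
            steps += 1
            i += 2 * window  # skip the rest of this transition
        else:
            i += 1
    return steps

n = count_steps(photobleach_trace(n_fluors=3))
```

The fixed `threshold` works here because the simulated steps are large and clean; with photoblinking, drifting baselines, or SNR near 1, exactly this kind of rule breaks down, which is the regime where the learned models above show their advantage.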
The CLDNN is a supervised-learning network: training data need to be labeled manually, which is often difficult to realize without human bias. Yuan et al.100 developed an unsupervised neural network, the DGN, which can be used not only for protein stoichiometry determination but also for characterizing the kinetics of protein aggregation-state changes in live cells (Fig. 3b). The DGN model consists of two biLSTMs, each composed of two inverse LSTMs. The LSTM is well suited to analyzing aggregation-state changes, in which the previous state affects the prediction of later data. In a traditional LSTM, the model predicts the current time point according to previous information; however, there is very little reference information for the first data points, which leads to less accurate predictions. Therefore, a bidirectional LSTM is used so that features can be extracted with reference to data in both directions. To achieve unsupervised learning, the two biLSTMs are used as a generator and a discriminator, respectively. The discriminator identifies the hidden state behind the input fluorescence intensity traces, and the generator then generates fluorescence intensity traces from the hidden state sequence provided by the discriminator. The generator and the discriminator are trained jointly. After training, the DGN exhibits excellent accuracy in counting photobleaching steps: at SNR = 1.40, the DGN achieves 79.6% accuracy, while conventional methods such as the HMM achieve only 30.9%. In addition, the DGN can recover the state path, which allows dynamic information to be obtained from the analysis of fluorescence intensity traces of live cells, including the durations of protein association, transition rates during protein interactions and state occupancies of different protein aggregation states. The authors used the model to investigate the state changes between TGF-β receptor monomers and dimers/oligomers under different conditions. They found that while the ligand TGF-β drives the equilibrium toward receptor oligomer formation, disruption of lipid rafts by nystatin makes TGF-β receptor association/dissociation more dynamic, so that the oligomers cannot exist stably.
Wang et al.101 developed a deep learning convolutional neural network (DLCNN) to recognize receptor monomers and complexes. When receptors form a complex, multiple fluorophores are confined within a diffraction-limited volume, creating an abnormally bright spot, which can be used to identify the complex state. The model was trained with images of single quantum dot (QD) particles and aggregates. After training, it can visualize the complex formation of the chemokine receptor CXCR4 in real time and reaches an accuracy of >98% for identifying monomers and complexes. The same group also developed deep-blinking fingerprint recognition (BFR) for the identification of oligomeric states.102 They labeled the CXCR4 receptor with carbon dots (CDs); depending on the aggregation state of the receptor, CD blinking creates a different intensity fingerprint, and deep learning models extract these fingerprints to classify the receptor aggregation states. They demonstrated that the heterogeneous organizations of CXCR4 can be regulated to different degrees by various stimuli. For the 42-residue amyloid-β peptide (Aβ42), it is difficult to probe individual aggregation pathways in a mixture because existing fibrils grow while new fibrils appear. A deep neural network (FNet)111 was developed to split highly overlapping fibrils into single fibrils, which enables the changes of individual fibrils to be tracked.
smFRET can be used to analyze protein interactions and achieve highly sensitive detection of targets. Performing smFRET requires preprocessing of images and extracting, classifying and segmenting smFRET traces.112 Traditionally, the selection of traces requires a lot of subjective judgment and is time-consuming. The two-color intensity trajectories of smFRET must show an anticorrelated relationship: as one falls, the other rises. One of the major advantages of deep learning lies in fast feature recognition; therefore, classification and analysis of smFRET trajectories using DNNs has been reported. Thomsen et al.103 developed DNN-based software for smFRET data analysis: DeepFRET. This model covers the whole pipeline, from image pre-processing through trajectory extraction and selection to data analysis. The LSTM is used to learn the temporality of the data and propagate the learned information to later frames, which eliminates the interference of noise. The model was pre-trained using 150,000 simulated traces that included all possible FRET states, inter-state transition probabilities, state dwell times, etc. On real data, the model was able to achieve an accuracy of over 95%, while using 1% of the time required by the traditional method.
SiMREPS uses fluorescence to detect the specific binding and dissociation of labeled molecules with fixed targets (Fig. 4a).113 The binding and dissociation of a molecule are reflected as an increase or a decrease in the fluorescence intensity (Fig. 4b). The rate constant of dissociation represents the strength of the binding, and the frequency of binding represents the concentration of the diffusing molecules. SiMREPS requires classification of intensity traces based on residence ("on") times, which is easily influenced by photobleaching, protein aggregation, and noise variation when analysed using traditional algorithms. Li et al.104 developed an LSTM-based single-molecule fluorescence trace classification model, the automatic SMFM trace selector (AutoSiM) (Fig. 4c). The authors applied this model to analyze DNA sequences with point mutations. A solution of DNA strands, a known proportion of which carried point mutations, was allowed to bind to complementary strands immobilized on the surface of a glass plate. DNA with a specific base mutation shows a shorter residence time in the fluorescent state due to the reduced binding between the mutated DNA and the target. When used for the analysis of experimental data, the recognition specificity increased by 4.25 times compared to that of the conventional HMM method. The number of LSTM layers is adjusted in this model, with 7 layers for classification and 8 layers for segmenting useful trajectories. The model was trained with real experimental and synthetic FRET data, and the FRET data classification network was able to achieve an accuracy of 90.2 ± 0.9%. To extend the applicability of the model, transfer learning was introduced: only 559 manually analyzed FRET trajectories of a Mn2+ sensor were needed to complete the model training, which took less than 15 min, and a classification accuracy of 91% was achieved for the experimental data using the transfer-learning model.
Fig. 4 AutoSiM for SiMREPS and smFRET data classification. (a) Schematic of an experimental system for detection of a mutant DNA sequence by SiMREPS. (b) Representative experimental fluorescence intensity traces showing repeated binding of mutant and wild-type DNAs to the complementary strands, as well as a typical trace showing non-repetitive nonspecific binding. (c) SiMREPS and smFRET data analysis steps bypassed by the deep learning methods in AutoSiM.104 Copyright © 2020, The Author(s).
Granik et al.105 used CNNs to classify diffusion trajectories. Three diffusion modes were analyzed: Brownian motion, continuous time random walk (CTRW) and fractional Brownian motion (FBM). FBM and CTRW have similar motion characteristics for shorter trajectories, but they obey different physical laws: FBM is associated with crowded cellular environments, while CTRW mainly occurs in trap-containing environments. The neural network was trained with 300,000 trajectories, and its accuracy was evaluated on real experimental trajectories. The diffusion of fluorescent beads follows FBM in gel networks of different densities, and pure Brownian motion in water and glycerol solutions, while the diffusion of proteins across the cell membrane is a combination of FBM and CTRW. Based on 100-step tracks of beads of two sizes, the network distinguished the two populations, with mean values close to the theoretically predicted diffusion coefficients; in contrast, time-averaged mean squared displacement (TAMSD) analysis could not resolve the two populations with 100 steps.
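The TAMSD baseline mentioned above is straightforward to compute. The sketch below is a minimal implementation (the function names are ours), with the anomalous exponent α estimated from a log-log fit: α ≈ 1 indicates Brownian motion, α < 1 subdiffusion (e.g. FBM in crowded media) and α > 1 superdiffusion.

```python
import numpy as np

def tamsd(track, max_lag):
    """Time-averaged mean squared displacement of a track of shape (N,) or (N, d)."""
    track = np.asarray(track, dtype=float)
    if track.ndim == 1:
        track = track[:, None]
    return np.array([np.mean(np.sum((track[lag:] - track[:-lag]) ** 2, axis=1))
                     for lag in range(1, max_lag + 1)])

def anomalous_exponent(track, max_lag):
    """Fit TAMSD(tau) ~ tau**alpha on a log-log scale and return alpha."""
    lags = np.arange(1, max_lag + 1)
    return np.polyfit(np.log(lags), np.log(tamsd(track, max_lag)), 1)[0]
```

For a deterministic ballistic track x(t) = t, TAMSD(τ) = τ² and hence α = 2; for short, noisy 100-step experimental tracks the fitted α fluctuates strongly, which is exactly the regime where the CNN outperforms this estimator.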
Due to the complexity of the cellular environment, single-molecule diffusion is often a combination of multiple models and varies over time, and so far no unified model can completely describe all the kinetic characteristics of single-molecule diffusion. To address this problem, deep learning has recently been used to construct diffusion fingerprints for single-molecule position traces. Pinholt et al.117 proposed a single-molecule diffusion fingerprinting method that integrates 17 single-molecule diffusion characteristics (Fig. 5). This approach creates an exclusive diffusion fingerprint for each type of single-molecule diffusion, allowing better classification of different diffusing entities. The 17 characteristics comprise eight features from HMM estimation (the diffusion coefficient D of each of the four states and the respective residence times), two features from classical RMSD analysis (the diffusion constants describing irregular diffusion), four features based on trajectory shape (kurtosis, dimension, efficiency and trappedness), and three features describing the general trend (the average speed, track duration and MSD parameters). These features partially overlap and can be used to distinguish subtle differences between trajectories. A logistic regression classifier is used to predict the experimental environment that generated the data, and linear discriminant analysis (LDA) is used to rank the most relevant features. Single-molecule diffusion fingerprinting was applied to distinguish Thermomyces lanuginosus lipase (TLL) from its L3 mutant. The mutant and wild type have almost the same catalytic rate, and the step-length distributions of their single-molecule trajectories were very similar and difficult to differentiate with conventional methods. Analysis of the diffusion fingerprints identified the feature that distinguishes the two enzymes: the residence time of an HMM diffusion state.
The L3 mutant diffuses away from the product-generating region in larger steps and spends more time in faster states, which subjects it to less end-product inhibition, in agreement with the available experimental results. This combination of diffusion fingerprinting with multiple traditional characterization methods provides a more comprehensive understanding of different diffusion patterns. However, when the feature selection is suboptimal, the classification accuracy drops; replacing the simple logistic regression model with a CNN or LSTM could potentially improve it.
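A few of the trajectory-shape descriptors listed above can be sketched as follows. The exact definitions in the fingerprinting paper differ in detail (and the full feature set also includes the HMM- and MSD-derived features omitted here), so treat these formulas as illustrative.

```python
import numpy as np

def shape_features(track):
    """Illustrative trajectory-shape descriptors for a 2D track of shape (N, 2)."""
    track = np.asarray(track, dtype=float)
    steps = np.diff(track, axis=0)              # frame-to-frame displacements
    step_len = np.linalg.norm(steps, axis=1)    # step lengths
    n = len(step_len)
    net = np.linalg.norm(track[-1] - track[0])  # net end-to-end displacement
    return {
        # efficiency: net squared displacement relative to summed squared steps
        "efficiency": net ** 2 / (n * np.sum(step_len ** 2)),
        "mean_speed": step_len.mean(),
        # straightness: net displacement relative to total path length
        "straightness": net / step_len.sum(),
    }
```

For a perfectly straight track all three descriptors equal 1, while confined or trapped motion pushes efficiency and straightness toward 0; stacking the per-track distributions of such features is what forms the "fingerprint" fed to the classifier.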
Fig. 5 The concept of diffusional fingerprinting for classifying molecular identity based on SPT data. (a) Analysis of the trajectory and extraction of 17 descriptive features. (b) The diffusional fingerprint is composed of the feature distributions for each particle type. (c) Diffusional fingerprinting of SPT data for two functionally similar TLL variants, L3 and native. (d) Confusion matrix for classifying two kinds of TLL. (e) Differential histograms of the five highest-ranked features.117 Copyright © 2021, National Academy of Sciences.
Overall, traditional feature-based methods of single-molecule diffusion analysis assume that particle diffusion obeys a few basic physical diffusion models, whereas real diffusion is often more complex, and analyzing a single diffusion feature cannot capture the full information of single-molecule motion. Compared with intensity trajectories, single-molecule diffusion trajectories have higher dimensionality and therefore contain more features, which may not obey existing physical laws or models. Deep learning can be used to discover these weak features, achieve a comprehensive description of single-molecule diffusion, and truly establish a single-molecule diffusion fingerprint.
There are still some pitfalls to be addressed in the application of deep learning to single-molecule studies. (1) Obtaining a good model requires a large amount of training data, and acquiring such data is time- and labor-intensive. (2) Deep learning methods often suffer from overfitting: a model that performs well on training data may fail on unfamiliar data. Methods have been developed to mitigate this problem, but it remains a challenge. (3) Deep learning is a black box; the distribution of learned features has no analytical form and individual steps of the algorithm cannot be correlated with specific features, which makes an exact interpretation of the model impossible. (4) Deep learning has a steep learning curve: it requires extensive knowledge of the relevant algorithms and programming skills, and even experienced scientists can struggle with tuning hyperparameters and fixing bugs, let alone newcomers. These problems limit the application of deep learning in single-molecule imaging and analysis.
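One common guard against the overfitting problem in point (2) is early stopping on a held-out validation set. The minimal, framework-agnostic sketch below (the class and the loss values are our own illustration, not taken from any cited work) stops training once the validation loss has failed to improve for a fixed number of epochs.

```python
class EarlyStopping:
    """Stop training when the validation loss has not improved by at
    least `min_delta` for `patience` consecutive epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")  # best validation loss seen so far
        self.counter = 0          # epochs since last improvement

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop training."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience

# Hypothetical per-epoch validation losses:
stopper = EarlyStopping(patience=3)
for epoch, loss in enumerate([1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.74]):
    if stopper.step(loss):
        break  # stops at epoch 5: no improvement over 0.7 for 3 epochs
```

In practice one restores the weights saved at the best-loss epoch; the same patience logic is available as a built-in callback in the major deep learning frameworks.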
In the future, the development of more advanced algorithms will reduce the required volume of training data and make deep learning more user-friendly. Construction of an authoritative single-molecule database would facilitate the generalization of deep learning methods: it would not only help scientists verify the accuracy of new methods, but also support building deep-learning models applicable to different instruments and experimental conditions. Standardization of instruments allows comparison across research studies; for home-built microscopy systems, scientists should report more imaging parameters, such as SNR, laser power and TIRFM angle. More convenient deep learning platforms should be developed, with different deep learning methods modularized so that even inexperienced users can invoke them with a mouse click. Cloud computation platforms have dramatically lowered the barrier to deep learning applications, but more effort is needed: scientists who use deep learning for single-molecule data processing should share their code and package their models into easy-to-use applications. In addition, adding more single-molecule parameters (polarization, spectrum, phase, etc.) to a deep learning model can help it extract less-obvious features with enhanced accuracy. With these advances, deep learning can further improve the performance of single-molecule microscopy in low-SNR environments, providing a truly powerful tool set for biochemical applications. Further development of deep learning-aided single-molecule imaging should also contribute to clinical studies, including disease diagnosis, pathological investigation, and drug discovery.
This journal is © The Royal Society of Chemistry 2022