Detection of the mesenchymal-to-epithelial transition of invasive non-small cell lung cancer cells by their membrane undulation spectra

A cancer cell changes its state from being epithelial- to mesenchymal-like in a dynamic manner during tumor progression. For example, it is well known that mesenchymal-to-epithelial transition (MET) is essential for cancer cells to regain the capability of seeding on and then invading secondary/tertiary regions. However, there is no fast yet reliable method for detecting this transition. Here, we showed that membrane undulation of invasive cancer cells could be used as a novel marker for MET detection, both in invasive model cell lines and repopulated circulating tumor cells (rCTCs) from non-small cell lung cancer (NSCLC) patients. Specifically, using atomic force microscopy (AFM), it was found that the surface oscillation spectra of different cancer cells, after undergoing MET, all exhibited two distinct peaks from 0.001 to 0.007 Hz that are absent in the spectra before MET. In addition, by adopting the long short-term memory (LSTM) based recurrent neural network learning algorithm, we showed that the positions of recorded membrane undulation peaks can be used to predict the occurrence of MET in invasive NSCLC cells with high accuracy (>90% for model cell lines and >80% for rCTCs when benchmarking against the conventional bio-marker vimentin). These findings demonstrate the potential of our approach in achieving rapid MET detection with a much reduced cell sample size as well as quantifying changes in the mesenchymal level of tumor cells.

buffer (Thermo Fisher) containing phosphatase inhibitor cocktail C (Sigma-Aldrich) for total protein extraction. The concentration of proteins in cell lysates was quantified by the Pierce BCA Protein Assay (Pierce Biotechnology, Inc., MA, USA), and 30 μg of protein was loaded in each lane. Proteins were then separated on 12% SDS-PAGE and transferred to PVDF membranes using the iBlot® Dry Blotting System. The membranes were blocked in 5% skim milk in TBST (25 mM Tris-HCl, pH 7.4, 125 mM NaCl, 0.1% Tween 20) for 1 h, and subsequently incubated with the appropriate primary antibodies (Abcam; against beta-catenin, vimentin, E-cadherin, N-cadherin, and GAPDH) at 4 °C overnight. After that, the membranes were washed three times (10 min each) with TBST and then incubated with anti-rabbit secondary antibodies for 2 h at room temperature. The immune complexes formed were detected by chemiluminescence (SuperSignal™). Band quantification via densitometry was performed using ImageJ.
To verify that cells had indeed changed from the epithelial to the mesenchymal state and maintained this state under the induction protocol, we performed a time-point measurement over 16

D. Boyden chamber assay

The inserts were incubated at 37 °C and 5% CO2 for 2 h to allow gelation.
A total of 1 × 10⁵ cells were seeded in the insert with 1.5 mL phenol-red-free RPMI1640 solution.
The bottom of the wells was filled with 2 mL phenol-red-free RPMI1640 solution with FBS (10%) as a chemoattractant. Cells were incubated at 37 °C and 5% CO2 for 24 h. After that, the Matrigel was removed by three washes with phosphate-buffered saline. Cells that migrated through the porous membrane (referred to as invasive cells here) were harvested with 1 mL PBS-EDTA (1 mM, Thermo Fisher Scientific) and permeabilized with 0.1% Triton X-100 (Sigma). Calcein AM (Sigma), dissolved in dimethylsulfoxide (DMSO), was used to stain the cells on the gel membrane, with gentle agitation performed every 15 min. Fluorescence was measured using a microplate fluorometer (Gemini). The excitation and emission wavelengths were set to 485 and 538 nm, respectively 7 .

B. Obtaining the frequency spectrum
The undulation spectra of the cell membrane were obtained by discrete Fourier transform, that is, H(ω) = Σ_t h(t) e^{−jωt}, where t is time, ω denotes the frequency, j = √(−1), and h(t) is the measured membrane height.
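As an illustration of this step, a one-sided undulation spectrum can be computed with NumPy's FFT routines. This is a minimal sketch on synthetic data, not the authors' code; the 0.004 Hz test frequency, the recording length, and all variable names are assumptions.

```python
import numpy as np

# Sketch: compute an undulation spectrum from a synthetic AFM height
# trace h(t) sampled at dt = 0.5 s (the sampling interval in the text).
rng = np.random.default_rng(0)
dt = 0.5                                  # sampling interval, seconds
t = np.arange(0, 3600, dt)                # a 1 h synthetic recording
h = np.sin(2 * np.pi * 0.004 * t) + 0.1 * rng.standard_normal(t.size)

H = np.fft.rfft(h)                        # discrete Fourier transform
freq = np.fft.rfftfreq(t.size, d=dt)      # frequency axis in Hz
power = np.abs(H) ** 2                    # power spectrum

# The dominant peak should sit near the 0.004 Hz undulation frequency,
# i.e. inside the 0.001-0.007 Hz band discussed in the abstract.
peak_freq = freq[np.argmax(power[1:]) + 1]  # skip the DC component
```

The frequency resolution here is 1/(recording duration), so longer recordings resolve the low-frequency band more finely.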

C. MET induction method and validations
Potential MET inducers, including 6.5 nM bufalin (Sigma-Aldrich) 2-4 and 10 ng/mL rapamycin (Sigma-Aldrich) 5     405 nm, 488 nm, and 633 nm lasers were used for the analysis. A measured batch was discarded if any one of the markers exhibited a negative result in more than 15% of the total population.

E. Long short-term memory (LSTM) based recurrent neural network learning model
LSTM was used in the neural network to learn the AFM frequency spectrum and predict the vimentin level. The LSTM block contains three computation networks: the Forget Network (Fig. E1), the Input Network (Fig. E2), and the Output Network (Fig. E3).
As described in Fig. 4, training data containing 1,000 vimentin levels (i.e., with a = 1, 2, …, 1000, each defined as a neuron) and the corresponding 1,000 frequency spectra, recorded with a sampling interval of 0.5 s, were input to the LSTM block. Each frequency spectrum was equally divided over its frequency range into divisions indexed by the subscript f, which varies from 1 to 14,440 (i.e., the number of frequency divisions).
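The shape of the resulting training set can be sketched as follows; this is a toy illustration with random numbers standing in for the measured data, and the array names are mine.

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons = 1000       # one neuron per training example a = 1..1000
n_divisions = 14_440   # equal frequency divisions per spectrum

# Toy training set: 1,000 spectra plus one vimentin level each.
spectra = rng.random((n_neurons, n_divisions))
vimentin = rng.random(n_neurons)
```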
For each set of training data (i.e., for each a), a Forget Network (Fig. E1) was used to compute the "Forget Array" f_t, comprising elements of 0 and 1, based on the training data x_t and the previous hidden state h_{t−1}. The Forget Array was utilized in the next step as a gate to forget (filter out) certain information in the previous cell state c_{t−1}. The process of computing the Forget Array is shown in Fig. E1. Specifically, the previous hidden state h_{t−1} and the input x_t were multiplied element-wise with a pair of weights, W_{h,f} and W_{x,f}, and then added with a constant bias b_f. A logistic function σ, defined as σ(y) = 1/(1 + e^{−y}), is then introduced to determine the value of each element. Note that, depending on the value of y, σ(y) varies between zero and unity. The bracket operator ⌊·⌉ here rounds its value to 0 or 1, i.e., f_t = ⌊σ(W_{x,f} x_t + W_{h,f} h_{t−1} + b_f)⌉, so the resulting Forget Array contains elements of 0 or 1 only. This array serves as an operator to keep or forget the next input information/data, with 1 referring to keeping the information and 0 corresponding to forgetting it.

Next, the Input Network (Fig. E2) computes the "Input Array" i_t, containing element values between 0 and 1, using a sigmoid function, i_t = σ(W_{x,i} x_t + W_{h,i} h_{t−1} + b_i), where W_{x,i} and W_{h,i} are a pair of weights and b_i is a constant bias. Then, to obtain the input cell values for long-term storage, a similar operation was carried out on the training data and the previous hidden state with two different weights, W_{x,k} and W_{h,k}, and another bias b_k. After that, a hyperbolic tangent (tanh) function was used to sustain the long-range information from earlier iterations before it decays to zero 11. The tanh output allows increases and decreases in the cell state, and can be positive or negative, ranging from −1 to +1: k_t = tanh(W_{x,k} x_t + W_{h,k} h_{t−1} + b_k). To output the long-range result for storage, a pointwise product of i_t and k_t was carried out, with the value denoted as g_t = i_t ⊙ k_t.
After getting the long-range result g_t as an input, the network will pass the frequency position to the next iteration if the vimentin level matches the neuron. Specifically, the previous cell state c_{t−1} is gated by the Forget Array and combined with the stored input to give the current cell state, c_t = f_t ⊙ c_{t−1} + g_t. Finally, to update the hidden state h_t, the current cell state is extended by a hyperbolic tangent function and pointwise multiplied with the output gate result o_t = σ(W_{x,o} x_t + W_{h,o} h_{t−1} + b_o), i.e., h_t = o_t ⊙ tanh(c_t).
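Putting the gate operations above together, one step of the LSTM cell can be sketched in NumPy. This is a minimal illustration, not the authors' implementation; the state size and weight layout are arbitrary, and the hard 0/1 rounding of the forget gate follows the text rather than a standard LSTM.

```python
import numpy as np

def sigmoid(y):
    return 1.0 / (1.0 + np.exp(-y))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step following the gates described in the text.
    W[g] = (W_x, W_h) and b[g] are the weights/bias of gate g."""
    # Forget Array: sigmoid rounded to 0 or 1, as described in the text.
    f = np.round(sigmoid(W["f"][0] @ x + W["f"][1] @ h_prev + b["f"]))
    # Input Array: element values between 0 and 1.
    i = sigmoid(W["i"][0] @ x + W["i"][1] @ h_prev + b["i"])
    # Candidate cell values in (-1, +1) via tanh.
    k = np.tanh(W["k"][0] @ x + W["k"][1] @ h_prev + b["k"])
    # Cell state update: keep/forget old state, add gated input g = i*k.
    c = f * c_prev + i * k
    # Output gate and new hidden state.
    o = sigmoid(W["o"][0] @ x + W["o"][1] @ h_prev + b["o"])
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
n = 8   # toy state size
W = {g: (rng.uniform(-0.1, 0.1, (n, n)), rng.uniform(-0.1, 0.1, (n, n)))
     for g in "fiko"}
b = {g: rng.uniform(-0.1, 0.1, n) for g in "fiko"}
h, c = lstm_step(rng.random(n), rng.random(n), rng.random(n), W, b)
```

Because o lies in (0, 1) and tanh(c) in (−1, 1), every element of the new hidden state is strictly bounded by 1 in magnitude.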
After training, the tested spectra were input one-by-one into the algorithm described above. In each case, the output spectrum was identified from the trained neurons, and the spectrum stored in the same neuron was updated, serving as a mechanism to learn the frequency content of each neuron. Here W_{x,i}, W_{h,i}, W_{x,f}, W_{h,f}, W_{x,k}, W_{h,k}, W_{x,o}, and W_{h,o} are the weights, and b_i, b_f, b_k, and b_o are the additive biases at the input gate, the forget gate, the cell state, and the output gate, respectively. These weights and biases were assigned random initial values in the range (−0.1, 0.1) and then updated after each iteration by the Adam optimization method with the Keras and TensorFlow libraries 12,13, using a learning rate of 0.1 in each iteration. The hidden and cell states were initialized with random noise (0 to 1) with the same dimension as the frequency spectrum.
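A single Adam update with the stated learning rate can be sketched as follows. This is a NumPy stand-in for what the Keras optimizer does internally, not the authors' code; the gradient here is a placeholder and the weight vector is a toy example.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update, as used to refresh the LSTM weights each iteration."""
    m = b1 * m + (1 - b1) * grad            # first-moment estimate
    v = b2 * v + (1 - b2) * grad ** 2       # second-moment estimate
    m_hat = m / (1 - b1 ** t)               # bias corrections
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

rng = np.random.default_rng(0)
w = rng.uniform(-0.1, 0.1, 5)               # initial weights, as in the text
w0 = w.copy()
m = np.zeros_like(w)
v = np.zeros_like(w)
grad = np.ones_like(w)                      # stand-in gradient
w, m, v = adam_step(w, grad, m, v, t=1)
```

On the very first step the bias corrections make the update magnitude approximately equal to the learning rate, here 0.1 per weight.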
To validate the training, k-fold cross-validation (k = 10) was performed 14. Specifically, the training data (1,000 AFM spectral peaks with 1,000 expected changes in vimentin level) were randomly divided into 10 bins of equal size. After training, 30 additional data sets were used to test the trained model. In each iteration, the predicted value was fed back to the cell state as the input, and the hidden state was updated for the next prediction. An accuracy of >90% was achieved for cell lines and >80% for rCTCs. The training efficiency (expressed as accuracy and loss against epochs, each epoch passing through all the blocks above), compared against other existing training methods such as feed-forward 15 and gated recurrent unit (GRU) 16 networks, is shown in Fig. E4. Our method is able to converge after only ~30 epochs, presumably because both long- and short-term information are captured during the recurrent learning/training in the algorithm proposed here. Loss here represents the percentage difference between the predicted value and the measured value between two consecutive epochs.
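The 10-fold split described above can be sketched as follows (indices only; the LSTM training itself is omitted, and the variable names are mine):

```python
import numpy as np

# Randomly assign 1,000 samples to 10 equal-size bins, then hold out
# one bin per round for validation, as in standard k-fold cross-validation.
rng = np.random.default_rng(0)
n_samples, k = 1000, 10
indices = rng.permutation(n_samples)          # random assignment to bins
folds = np.array_split(indices, k)            # 10 bins of 100 samples each

for held_out in range(k):
    val_idx = folds[held_out]                 # one bin for validation
    train_idx = np.concatenate(
        [folds[j] for j in range(k) if j != held_out])
    # ...train on train_idx, evaluate on val_idx...
```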