Non-invasive diagnostic test for lung cancer using biospectroscopy and variable selection techniques in saliva samples

Camilo L. M. Morais ab, Kássio M. G. Lima a, Andrew W. Dickinson c, Tarek Saba c, Thomas Bongers c, Maneesh N. Singh de, Francis L. Martin *cd and Danielle Bury *c
aBiological Chemistry and Chemometrics, Institute of Chemistry, Federal University of Rio Grande do Norte, Natal 59072-970, Brazil
bCenter for Education, Science and Technology of the Inhamuns Region, State University of Ceará, Tauá 63660-000, Brazil
cDepartment of Cellular Pathology, Blackpool Teaching Hospitals NHS Foundation Trust, Whinney Heys Road, Blackpool FY3 8NR, UK. E-mail: francis.martin2@nhs.net; danielle.bury@nhs.net
dBiocel UK Ltd, Hull HU10 6TS, UK
eChesterfield Royal Hospital, Chesterfield Road, Calow, Chesterfield S44 5BL, UK

Received 22nd May 2024, Accepted 31st July 2024

First published on 1st August 2024


Abstract

Lung cancer is one of the most commonly occurring malignant tumours worldwide. Although reference methods such as X-ray, computed tomography and bronchoscopy are widely used for the clinical diagnosis of lung cancer, there is still a need for new methods for early detection. Especially needed are approaches that are non-invasive, fast, analytically precise and statistically reliable. Herein, we developed a swab “dip” test in saliva whereby swabs were analysed using attenuated total reflection Fourier-transform infrared (ATR-FTIR) spectroscopy harnessed to principal component analysis–quadratic discriminant analysis (PCA-QDA) and to variable selection techniques, namely the successive projections algorithm (SPA) and genetic algorithm (GA), combined with QDA. A total of 1944 saliva samples (56 designated as lung-cancer positive and 1888 designated as controls) were obtained in a lung cancer-screening programme being undertaken in North-West England. GA-QDA models achieved, for the test set, sensitivity and specificity values of 100.0% and 99.1%, respectively. Three wavenumbers (1422 cm−1, 1546 cm−1 and 1578 cm−1) were identified using the GA-QDA model to distinguish between lung cancer and controls, with tentative assignments including ring C–C stretching and C=N (adenine), amide II [δ(NH), ν(CN)] and νs(COO−) (polysaccharides, pectin). These findings highlight the potential of using biospectroscopy associated with multivariate classification algorithms to discriminate between benign saliva samples and those with underlying lung cancer.


Introduction

A prominent cause of cancer-related mortality globally is lung cancer (LC). LC is a malignant neoplasm that develops from the epithelial cells of the lung tissue and represents the first and third most commonly diagnosed cancer among males and females in 2020, respectively. In 2020, an estimated 2.2 million new LC cases occurred worldwide, representing 14.3% (1.4 million) and 8.4% (0.8 million) of all new cancer diagnoses among males and females, respectively.1 In addition, the resurgence of interest in lung cancer screening and the application of new techniques for the management of early cancer have raised various issues regarding this global epidemic. In previous randomised clinical trials, the use of conventional chest radiographs and sputum cytology examinations for screening have been shown not to reduce lung cancer mortality.2

The use of biomolecular markers,3 autofluorescence bronchoscopy,4 low-dose spiral and high-resolution computed tomography,5,6 endobronchial ultrasonography,7 optical coherence tomography,8 confocal micro-endoscopy,9 positron emission tomography in combination with video-assisted thoracic surgery10 and intraluminal bronchoscopic treatments11 may provide new modalities with which to manage lung cancer at the earliest stage possible. However, conventional diagnostic methods for lung cancer are unsuitable for widespread screening because they are usually expensive and occasionally overlook tumours.

An alternative approach for early detection of lung cancer is to utilize “metabonomics” strategies for biomarker discovery or metabolic profiling based on attenuated total reflection Fourier-transform infrared (ATR-FTIR) spectroscopy.12 “Metabonomics” is defined as the quantitative measurement of multiparametric metabolic responses for living systems with particular emphasis on the elucidation of differences in population groups due to diseases, environmental stress or genetic modifications.13 The application of ATR-FTIR spectroscopy as a metabonomic tool is relevant and significant to the health sciences due to the non-destructive and sensitive nature of this technique, together with a relatively low-cost and high analytical frequency (quick measurement time). Additionally, this technique can be translatable to portable devices; thus, its use in investigating biological and clinical samples is of great interest. To obtain reliable and chemically meaningful results from the analysis of ATR-FTIR spectral data, it is fundamental to apply multivariate methods.14 Recently, Martin et al.12 developed a swab “dip” test in saliva obtained from consenting patients participating in a lung cancer-screening programme using ATR-FTIR spectroscopy coupled with multivariate statistics.

Minimally invasive biological fluids, such as saliva and urine, have emerged as a potential means for the diagnosis of LC, with some studies also reporting detection of early-stage disease.15–17 Saliva consists of water (99.5%), inorganic salts and enzymes (0.2%) and proteins (0.3%).18 It has several advantages as a clinical diagnostic biofluid and as a cost-effective basis for large-scale screening: sample collection is simple and non-invasive, and causes less anxiety for patients. Saliva has been successfully used as a diagnostic fluid for oral and systemic diseases19–23 as well as for oncological disorders;24–30 salivary proteomic and genomic approaches have been successfully developed for the detection of lung,12 oral,31 breast,32 prostate33 and pancreatic cancer,34 as well as endometriosis.35

Saliva is known to contain ultra-short circulating tumour DNA (usctDNA), which consists of small fragments of circulating tumour DNA (ctDNA) present in the saliva of LC patients.36 Moreover, the expression of exosomal microRNAs (miRNAs) in saliva samples has been associated with the presence of LC.37 Exosomes contain several stimulatory and inhibitory factors involved in mediating the immune response of tumours, and they are associated with several tumour-derived mechanisms, such as cell proliferation and metastasis.37 Their presence in saliva alongside usctDNA could indicate potential biomarkers for cancer detection. Additionally, the salivary microbiome38,39 and metabolome40 are highly associated with LC detection. For example, promotion and inhibition mechanisms of Aggregatibacter, Campylobacter, Fusobacterium, Streptococcus and TM7x have been associated with LC,38 whereas 16 metabolites in saliva have been associated with LC.40 In summary, saliva contains potential biomarkers present in proteins, peptides, carbohydrates, lipids and nucleic acids,41,42 and each of these types of salivary components can be detected by ATR-FTIR,43 making the diagnosis of diseases and oncological disorders quicker, less invasive and less costly to perform.

Herein, we will demonstrate the use of ATR-FTIR spectroscopy combined with chemometrics techniques to identify lung cancer patients in a large clinical screening program with the application of minimally invasive swab “dip” tests of saliva samples.

Materials and methods

Lung cancer screening programme and participant recruitment

This study was carried out in agreement with the Helsinki Declaration and full ethical approval was obtained (HRA IRAS ref: 276081; REC ref: 20/PR/0390; London Bridge REC). All procedures and possible risks were explained to participants before they provided written consent. The study was nested in a prospective study of people invited to attend the National Lung Cancer Screening Pilot in the Blackpool area of North-West England. These potential participants were pre-selected based on multiple factors, including age and smoking history, to be deemed ‘at risk’ of lung cancer. Once they had undergone health checks, those participants who triggered a CT scan for further investigation consented, if willing, to take part in this study. Consent was taken by the nurse undertaking the initial assessment and consent for involvement in the screening pilot. The rationale for this approach was to provide a mixture of both suspected cancer and non-cancer patients in a typical clinical setting. All participants had a CT scan, and those that exhibited no lung lesions were immediately assigned to the benign group. A visible lesion triggered an urgent oncology referral. Participants who underwent surgery were proven to have cancer following histopathology undertaken by a consultant histopathologist. A small number of participants had radiotherapy; these were also assigned as cancer. Additionally, some participants sent for oncology referral had benign lesions; these individuals were assigned to the benign group. All participants were followed for up to 2 years in order to validate these outcomes. A total of 1950 saliva samples (of which 56 were designated as lung-cancer positive and 1894 as benign) were taken in the order in which participants entered the clinic (i.e., participants were not selected, in order to avoid bias) during the course of this prospective lung cancer-screening programme.

Saliva collection and swab analysis

For all participants, demographic data (age, gender and comorbidities; see ESI, Table S1) were collected for NHS records; these were accessible as the study progressed and as outcomes became known. Once consent was given, participants were requested to provide saliva for testing by spitting into a sterile universal container. Approximately 2 mL of saliva was collected per patient in a universal polypropylene tube; some 80% of samples were collected within the hospital while the remaining approximately 20% were collected at satellite clinical settings. Samples were stored frozen at −20 °C prior to spectroscopic analysis. They were transported in Styrofoam boxes to the laboratory within 24 h of acquisition. The ATR-FTIR spectrometer was set up in a laboratory in the same building where the majority of samples were collected. For spectral analysis, a plain sterile rayon-tipped swab (Ref no.: 155C; Copan, Italy) was placed in the thawed (at room temperature) saliva for about 30 seconds (mixing) and then left to dry at room temperature for 10 minutes. The swab was applied directly to the ATR ZnSe crystal for spectral analysis; this was found to be an extremely convenient means of handling this biological material. The swab was measured 10 times from different locations and the average spectrum was recorded. Whilst there are contributing peaks from the swab, our objective was solely to develop a technique capable of giving a yes/no answer as to whether lung cancer is possibly present. After analysis, the swabs with saliva were disposed of in biohazardous waste containers. This approach has the capacity to readily analyse 10 samples per hour.

ATR-FTIR spectral analyses of swabs

FTIR spectral data (wavenumber range 4000–650 cm−1) for each swab were obtained by directly placing the saliva swab on a portable Agilent Cary 630 FTIR Spectrometer equipped with an ATR ZnSe crystal (Agilent, Santa Clara, CA, USA) and Microlab PC software running on a dedicated laptop computer. Each whole spectrum contains 1798 points (1.86 cm−1 spectral resolution). For every ATR-FTIR spectroscopic measurement, three spectra were obtained from each saliva swab. Each swab analysis was performed with 32 co-additions, interspersed with 32 background scans. After each analysis, the swab was removed from the crystal and the crystal was cleaned with Milli-Q water (Merck, Rahway, NJ, USA) and 70% alcohol, thus avoiding inter-sample contamination.

Computational analysis: preprocessing, models and validation

Computational analysis including import and pre-processing of data as well as the construction of multivariate classification models were performed within MATLAB® R2014b environment (MathWorks Inc., USA) by using the PLS Toolbox version 7.9.3 (Eigenvector Research, Inc., USA) and laboratory-made routines. Spectral pre-processing for data analysis consisted of the Savitzky–Golay (SG) 2nd derivative (window of 9 points, 2nd order polynomial fitting) followed by mean-centring.
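The pre-processing described above can be illustrated with a minimal Python sketch. This is not the MATLAB/PLS Toolbox code used in the study; the function names `savgol_second_derivative` and `mean_centre` are hypothetical, unit point spacing is assumed (`delta=1.0`), and the half-window edge points are simply dropped, which may differ from how the toolbox treats spectrum edges.

```python
import numpy as np

def savgol_second_derivative(spectra, window=9, polyorder=2, delta=1.0):
    """Savitzky-Golay 2nd derivative of each row of `spectra` (samples x points)."""
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Design matrix of the local polynomial fit: columns are offsets**0 ... offsets**polyorder
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 2 of the pseudo-inverse yields the fitted quadratic coefficient a2; f'' = 2 * a2
    coeffs = 2.0 * np.linalg.pinv(design)[2] / delta**2
    # Sliding dot product with the coefficients; 'valid' drops `half` points at each edge
    spectra = np.asarray(spectra, dtype=float)
    return np.array([np.correlate(row, coeffs, mode="valid") for row in spectra])

def mean_centre(X):
    """Subtract the column-wise (per-variable) mean across samples."""
    return X - X.mean(axis=0, keepdims=True)
```

In practice the mean used for centring would be computed on the training set only and then applied unchanged to validation and test spectra; the sketch centres whatever matrix it is given.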

Before modelling, the replicate spectra of each sample were averaged, so the study was performed on a per-sample basis, with one spectrum per sample. Thereafter, the dataset underwent outlier detection using the Hotelling T² vs. Q residuals test.14 From the complete dataset containing 1950 samples (1894 controls and 56 LC), 6 control samples were identified as outliers. Therefore, the final dataset contained 1944 samples (1888 controls and 56 LC). The samples were then divided into training (70%, 1322 control samples and 39 LC) and test (30%, 566 control samples and 17 LC) sets by the Morais–Lima–Martin (MLM) algorithm.44 Model construction and optimization (variable selection by successive projections algorithm (SPA) and genetic algorithm (GA)) were carried out using the training samples, and internally validated by Monte-Carlo cross-validation.45 The Monte-Carlo cross-validation was performed with 1000 iterations, with 20% of the training samples left out for validation in each iteration. The held-out test set was then used to evaluate the classification accuracy of the linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA) models.
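The Monte-Carlo cross-validation scheme (1000 iterations, 20% of the training samples left out each time) can be sketched as a simple index generator. This is an illustrative Python sketch rather than the routine used in the study; the function name `monte_carlo_splits` and the `seed` parameter are hypothetical, and the splits here are plain random draws, since the text does not state whether the splits were stratified by class.

```python
import random

def monte_carlo_splits(n_samples, n_iterations=1000, holdout_fraction=0.2, seed=0):
    """Yield (train_indices, validation_indices) for repeated random hold-out."""
    rng = random.Random(seed)
    n_holdout = round(n_samples * holdout_fraction)
    indices = list(range(n_samples))
    for _ in range(n_iterations):
        # Draw the validation subset without replacement; the rest is used for fitting
        holdout = set(rng.sample(indices, n_holdout))
        train = [i for i in indices if i not in holdout]
        yield train, sorted(holdout)
```

With the 1361 training samples reported above, each iteration would leave out 272 samples and fit on the remaining 1089; performance figures are then averaged over the iterations.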

For variable selection using GA or SPA, the optimal number of variables was achieved by minimizing the average risk of misclassification G, calculated in the training set as:

 
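$$G = \frac{1}{N}\sum_{n=1}^{N} g_n \qquad (1)$$
where N is the number of training samples; and g_n is defined as,

$$g_n = \frac{r^2\left(\mathbf{x}_n, \mathbf{m}_{I(n)}\right)}{\min_{I(m) \neq I(n)} r^2\left(\mathbf{x}_n, \mathbf{m}_{I(m)}\right)} \qquad (2)$$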
where I(n) is the index of the true class for the nth training object x_n; r²(x_n, m_I(n)) is the squared Mahalanobis distance between object x_n (of class index I(n)) and the sample mean m_I(n) of its true class; and r²(x_n, m_I(m)) is the squared Mahalanobis distance between object x_n and the centre of the closest wrong class.46 The minimum value of the cost function (maximum fitness) is achieved when the selected variables place each object as close as possible to its true class and as far as possible from the wrong class. GA calculations were performed through 100 generations having 200 chromosomes each. The risk of over-fitting with GA was reduced by setting the crossover probability to a relatively large number (60%) in order to increase the size of the offspring; and the mutation probability was set to a relatively large value (10%) so the model could adjust to a better fit through mutation. Further, the algorithm was repeated three times, starting from different random initial populations. The best solution (in terms of the fitness value) was employed.
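As an illustration of the cost function G, the following Python sketch computes the average risk of misclassification for a candidate subset of variables. It is a sketch under stated assumptions, not the study's implementation: the function name `misclassification_risk` is hypothetical, and the Mahalanobis distances are assumed to use a pooled within-class covariance matrix, which the text does not specify.

```python
import numpy as np

def misclassification_risk(X, y):
    """Average risk G for data X (samples x selected variables) and class labels y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    true_idx = np.searchsorted(classes, y)           # column index of each sample's true class
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Assumption: pooled within-class covariance for the Mahalanobis distance
    centred = X - means[true_idx]
    pooled = centred.T @ centred / (len(y) - len(classes))
    inv = np.linalg.pinv(pooled)
    # r2[n, k]: squared Mahalanobis distance from sample n to the mean of class k
    diffs = X[:, None, :] - means[None, :, :]
    r2 = np.einsum("nkj,jl,nkl->nk", diffs, inv, diffs)
    rows = np.arange(len(y))
    r2_true = r2[rows, true_idx]
    r2_wrong = r2.copy()
    r2_wrong[rows, true_idx] = np.inf                # exclude the true class from the minimum
    g = r2_true / r2_wrong.min(axis=1)               # distance to true class over closest wrong class
    return g.mean()
```

A variable subset that separates the classes well gives G much smaller than 1, whereas a subset carrying no class information gives G close to 1; a selection algorithm such as GA or SPA would compare candidate subsets by this value.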

The LDA classification score (L_ik) is calculated for a given class k by the following equation in order to obtain a discriminant profile:

$$L_{ik} = \left(\mathbf{x}_i - \bar{\mathbf{x}}_k\right)^{\mathrm{T}} \boldsymbol{\Sigma}_{\mathrm{pooled}}^{-1} \left(\mathbf{x}_i - \bar{\mathbf{x}}_k\right) - 2\log_e \pi_k \qquad (3)$$
where x_i is an unknown measurement vector for sample i; x̄_k is the mean measurement vector of class k; Σ_pooled is the pooled covariance matrix; and π_k is the prior probability of class k.47

The QDA classification score (Q_ik) is estimated using the variance–covariance matrix of each class k and an additional natural logarithm term, as follows:

$$Q_{ik} = \left(\mathbf{x}_i - \bar{\mathbf{x}}_k\right)^{\mathrm{T}} \boldsymbol{\Sigma}_k^{-1} \left(\mathbf{x}_i - \bar{\mathbf{x}}_k\right) + \log_e\left|\boldsymbol{\Sigma}_k\right| - 2\log_e \pi_k \qquad (4)$$
where Σ_k is the variance–covariance matrix of class k; and log_e|Σ_k| is the natural logarithm of the determinant of the variance–covariance matrix of class k. QDA forms a separate variance model for each class and, unlike LDA, does not assume that the classes have similar variance–covariance matrices.48
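The LDA and QDA scores described above can be illustrated with a short NumPy sketch, in which a sample is assigned to the class with the smallest score. This is not the study's code: the function name `discriminant_scores` is hypothetical, and the prior probabilities are assumed to be the class proportions in the training set.

```python
import numpy as np

def discriminant_scores(X_train, y_train, X_new, quadratic=True):
    """Return (classes, scores); scores[i, k] is the QDA (or LDA) score of sample i for class k."""
    X_train = np.asarray(X_train, dtype=float)
    X_new = np.asarray(X_new, dtype=float)
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    n = len(y_train)
    groups = [X_train[y_train == c] for c in classes]
    means = [g.mean(axis=0) for g in groups]
    covs = [np.atleast_2d(np.cov(g, rowvar=False)) for g in groups]
    priors = [len(g) / n for g in groups]            # assumption: priors from class sizes
    # Pooled covariance: class covariances weighted by their degrees of freedom
    pooled = sum((len(g) - 1) * S for g, S in zip(groups, covs)) / (n - len(classes))
    scores = np.empty((len(X_new), len(classes)))
    for k, (m, S, prior) in enumerate(zip(means, covs, priors)):
        cov = S if quadratic else pooled
        d = X_new - m
        maha = np.einsum("ij,jk,ik->i", d, np.linalg.inv(cov), d)
        scores[:, k] = maha - 2.0 * np.log(prior)
        if quadratic:
            # Extra QDA term: natural log of the determinant of the class covariance
            scores[:, k] += np.linalg.slogdet(cov)[1]
    return classes, scores
```

A prediction is obtained as `classes[scores.argmin(axis=1)]`. With only a few selected wavenumbers the class covariance matrices are small and well conditioned; with full spectra they would generally be singular, which is one reason why feature extraction or selection precedes the discriminant step.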

To statistically evaluate the classification models, calculations of accuracy, sensitivity, specificity, F-score and G-score were performed using the test samples as important quality measures of model accuracy. To clinically classify normal and LC samples, sensitivity can be understood as the probability that a test result will be positive when the disease is present, while specificity is the probability that a test result will be negative when the disease is not present. Both parameters have a maximum value of 1 (or 100%) and a minimum of 0, and are obtained as follows:

 
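$$\mathrm{Sensitivity} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \qquad (5)$$

$$\mathrm{Specificity} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}} \qquad (6)$$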
where FN is defined as a false negative and FP as a false positive; and TP and TN are defined as true positive and true negative, respectively.

The accuracy, F-score and G-score are measurements of the model general accuracy, and they are defined as:49

 
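$$\mathrm{Accuracy} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{FP} + \mathrm{TN} + \mathrm{FN}} \qquad (7)$$

$$F\text{-score} = \frac{2 \times \mathrm{Sensitivity} \times \mathrm{Specificity}}{\mathrm{Sensitivity} + \mathrm{Specificity}} \qquad (8)$$

$$G\text{-score} = \sqrt{\mathrm{Sensitivity} \times \mathrm{Specificity}} \qquad (9)$$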
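The quality measures of eqn (5)–(9) can be computed directly from confusion-matrix counts, as in the following Python sketch. The function name `classification_metrics` is hypothetical; the F-score and G-score are taken here as the harmonic and geometric means of sensitivity and specificity, respectively, which reproduces the values reported in Table 1.

```python
import math

def classification_metrics(tp, fn, tn, fp):
    """Quality measures from true/false positive and negative counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Harmonic mean of sensitivity and specificity
    f_score = 2 * sensitivity * specificity / (sensitivity + specificity)
    # Geometric mean of sensitivity and specificity
    g_score = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
        "g_score": g_score,
    }

# GA-QDA test set counts from Table 2: 17 LC correct, 0 LC missed,
# 561 controls correct, 5 controls predicted as LC
metrics = classification_metrics(tp=17, fn=0, tn=561, fp=5)
print({name: round(value, 3) for name, value in metrics.items()})
# accuracy 0.991, sensitivity 1.0, specificity 0.991, f_score 0.996, g_score 0.996
```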

Results

Representative raw spectra were obtained by ATR-FTIR spectroscopy in the fingerprint region (1800–900 cm−1) for the 56 lung cancer samples and all remaining patients (n = 1888, designated controls), with a slight difference in the glycogen intensities and the DNA/RNA region being observable, as shown in Fig. 1A. Both groups of samples share similar spectral profiles, with only small differences at approximately 1500–1650 cm−1 and 1000–1100 cm−1, as shown in the average profile per class (Fig. 1B). The complete pre-processed spectra and average pre-processed spectra for each group by SG 2nd derivative (window of 9 points, 2nd-order polynomial fitting) are shown in Fig. 1C and D, respectively. A power test based on a two-tailed t-test with post-hoc analysis of the IR absorbances in the fingerprint region (1800–900 cm−1; LC absorbance = 0.1107 ± 0.0101; control absorbance = 0.1178 ± 0.0137) indicates a discriminatory power of 99% between the spectral data at a 95% confidence level. Distinguishing these classes by spectral observation alone is difficult, so to identify important spectral features it is necessary to apply chemometric techniques, such as PCA-QDA, SPA-QDA and GA-QDA.
Fig. 1 Mid-infrared spectra in the fingerprint region (1800–900 cm−1) using ATR-FTIR spectroscopy for controls and lung cancer (LC) samples. (A) Raw spectra for each group; (B) average raw spectrum for each group; (C) pre-processed spectra (Savitzky–Golay 2nd derivative (window = 9 points, 2nd order polynomial fitting)) for each group; (D) average pre-processed spectrum for each group.

Following pre-processing, the data were split between training (70% of samples, 39 lung cancer and 1322 controls) and test (30% of samples, 17 lung cancer and 566 controls) sets. The training set was used for model construction, and the test set was used for final validation. Six classification algorithms were applied to classify the data (ESI, Tables S2 and S3); however, the best discrimination results were obtained by QDA-based algorithms. These were either principal component analysis with quadratic discriminant analysis (PCA-QDA), successive projections algorithm with quadratic discriminant analysis (SPA-QDA) or genetic algorithm with quadratic discriminant analysis (GA-QDA).

Classification of LC and control samples was developed by QDA using the pre-processed IR spectra. GA-QDA, using only 3 selected wavenumbers (1422, 1546 and 1578 cm−1), as shown in Fig. 2, was found to give the highest classification accuracy compared to the other methods (i.e., PCA-QDA and SPA-QDA). The classification by GA-QDA achieved 100% sensitivity and 97.4% specificity during the Monte-Carlo validation, and 100% sensitivity and 99.1% specificity for the test set, as shown in Table 1. SPA-QDA gave a sensitivity and specificity of 94.1% and 98.9% for the test set, respectively, using only 4 selected wavenumbers (1019, 1545, 1578 and 1735 cm−1) (Fig. 3). The PCA-QDA models achieved a sensitivity of 94.1% and specificity of 98.6% for the test set, using the scores on 10 principal components (PCs), which account for 97.6% of the explained variance for both classes. However, the Monte-Carlo cross-validation of PCA-QDA resulted in lower sensitivity (81.8%) and specificity (96.6%) compared to SPA-QDA and GA-QDA. The F-score (the test accuracy considering the imbalanced data) and G-score (the test accuracy not accounting for the class size), both 99.6% for the GA-QDA test set, indicate that the different class sizes did not interfere with the model accuracy.


Fig. 2 Average raw and pre-processed (Savitzky–Golay 2nd derivative (window = 9 points, 2nd order polynomial fitting)) spectra in the fingerprint region (1800–900 cm−1) for controls and lung cancer (LC) samples accompanied by the selected variables by the GA-QDA model (circled marks).

Fig. 3 Average raw and pre-processed (Savitzky–Golay 2nd derivative (window = 9 points, 2nd order polynomial fitting)) spectra in the fingerprint region (1800–900 cm−1) for controls and lung cancer (LC) samples accompanied by the selected variables by the SPA-QDA model (circled marks).
Table 1 Classification performance for the QDA-based models applied to classify lung cancer samples. The PCA-QDA model was built using 10 PCs, accounting for 97.64% of the explained variance. The SPA-QDA and GA-QDA models were built using 4 and 3 selected wavenumbers, respectively. Validation was performed using Monte-Carlo cross-validation with 1000 iterations (20% of samples were left out for validation in each iteration)

| Model | Set | Accuracy | Sensitivity | Specificity | F-score | G-score |
|---|---|---|---|---|---|---|
| PCA-QDA | Training | 0.964 | 0.692 | 0.972 | 0.809 | 0.820 |
| PCA-QDA | Validation | 0.961 | 0.818 | 0.966 | 0.886 | 0.889 |
| PCA-QDA | Test | 0.985 | 0.941 | 0.986 | 0.963 | 0.963 |
| SPA-QDA | Training | 0.979 | 0.923 | 0.980 | 0.951 | 0.951 |
| SPA-QDA | Validation | 0.992 | 1.000 | 0.992 | 0.996 | 0.996 |
| SPA-QDA | Test | 0.988 | 0.941 | 0.989 | 0.965 | 0.965 |
| GA-QDA | Training | 0.979 | 0.923 | 0.981 | 0.951 | 0.952 |
| GA-QDA | Validation | 0.974 | 1.000 | 0.974 | 0.987 | 0.987 |
| GA-QDA | Test | 0.991 | 1.000 | 0.991 | 0.996 | 0.996 |


To examine the performance of the three classification models (PCA-QDA, SPA-QDA and GA-QDA) in more detail, confusion matrices are given in Table 2. A confusion matrix tabulates the performance of a classification model by cross-counting measured and predicted classes. In Table 2, each row corresponds to the measured (true) class and each column to the class predicted by the model. Diagonal entries therefore give the number of correctly classified samples, and off-diagonal entries give the number of misclassifications. The confusion matrices show that GA-QDA performs exceptionally well for lung cancer detection using biospectroscopy in saliva samples: in the test set, all 17 LC samples were correctly classified and only 5 of the 566 controls were classified as LC.

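Table 2 Confusion tables for the QDA-based classification models for training and test sets showing the measured vs. predicted number of samples classified in each class (control and lung cancer (LC)). Rows give the measured class; columns give the predicted class

| Model | Measured | Training: pred. control | Training: pred. LC | Training: total | Test: pred. control | Test: pred. LC | Test: total |
|---|---|---|---|---|---|---|---|
| PCA-QDA | Control | 1285 | 37 | 1322 | 558 | 8 | 566 |
| PCA-QDA | LC | 12 | 27 | 39 | 1 | 16 | 17 |
| SPA-QDA | Control | 1296 | 26 | 1322 | 560 | 6 | 566 |
| SPA-QDA | LC | 3 | 36 | 39 | 1 | 16 | 17 |
| GA-QDA | Control | 1297 | 25 | 1322 | 561 | 5 | 566 |
| GA-QDA | LC | 3 | 36 | 39 | 0 | 17 | 17 |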


Table 3 lists the molecular functional groups associated with selected wavenumbers by the SPA-QDA and GA-QDA models with their tentative assignments. These regions contain the largest weights for class discrimination.

Table 3 Selected wavenumbers by the SPA-QDA and GA-QDA models with their tentative assignments50

| Model | Selected wavenumber (cm−1) | Tentative assignment |
|---|---|---|
| SPA-QDA | 1735 | C=O stretching (lipids) |
| SPA-QDA | 1578 | Ring C–C stretching, C=N adenine |
| SPA-QDA | 1545 | Protein band, amide II [δ(NH), ν(CN)] |
| SPA-QDA | 1019 | ν(CO), ν(CC), δ(OCH), ring (polysaccharides, pectin), DNA |
| GA-QDA | 1578 | Ring C–C stretching, C=N adenine |
| GA-QDA | 1546 | Protein band, amide II [δ(NH), ν(CN)] |
| GA-QDA | 1422 | νs(COO−) (polysaccharides, pectin) |

δ: angular deformation; ν: stretching; νs: symmetric stretching.


Discussion

Lung cancer detection normally requires CT scanning and skilled professionals, and usually takes place only after the patient presents symptoms; the disease is therefore not conventionally detected early in screening programmes. Late diagnosis presents several risks since the disease advances rapidly, so early treatment is paramount for a possible recovery. The development of quick, non-invasive, low-cost and convenient screening methods for early detection of lung cancer can guide further biopsies and avoid unnecessary costs from false-positive diagnoses, improving prognosis and reducing costs for the general population.51–53

The use of vibrational spectroscopy for cancer detection in biologically derived samples is not new,54–56 and several studies have been published on detecting breast,57,58 brain,59–61 cervical,62–65 colorectal,66,67 ovarian,68,69 skin70–72 and other cancers.56,73 For lung cancer, FT-IR74–77 and Raman78–83 spectroscopy have been used to detect this malignancy. However, the main limitation of studies using vibrational spectroscopy for cancer detection is the sample size, since the recruitment of patients to participate in such studies is difficult due to the cost and the great amount of time needed to recruit lung cancer patients given the disease prevalence in the general population.84 A small sample size makes validation of the proposed methodologies challenging, since the sample size is a fundamental parameter to ensure reliability of the detection model.85 Herein, approximately 2000 patients were investigated for the detection of lung cancer in a screening programme that involved 2 years of patient recruitment and sample analyses. Moreover, the analyses were based on saliva samples, whose collection is much less invasive than tissue or blood collection.

Saliva contains mainly water, proteins, inorganic minerals and trace substances; the main proteins comprise glycoproteins, enzymes such as α-amylase and carbonic anhydrase, immunoglobulins, and peptides such as cystatins, statherin and histatins.86 Additionally, saliva has some similarity with plasma (about 30%), with common traditional immunological factors.12 Saliva is an ideal biofluid in terms of sample collection due to its convenience, and it has been used together with vibrational spectroscopy to detect several diseases (oral and general diseases).87,88 For cancer diagnosis, saliva has been used to detect oral,89 breast,90 and head and neck cancers.91

Apart from proteins, enzymes, RNA and DNA, there are immunoglobulins coming from different sources in saliva, mainly from the salivary glands, but also from secretions from the nasal cavity and lower respiratory tract, gingival crevicular fluid, and ultrafiltrate blood plasma.21,92 Lung cancer is known to cause modifications in the composition of saliva biomarkers.92 Some mechanisms include: (1) releasing mediators into the blood that affect salivary gland function and therefore the quantity and composition of saliva; (2) releasing lung secretions containing microbes and biomolecules directly into saliva; and (3) releasing circulating biomolecules from blood directly into saliva.92 The former may happen due to ctDNA and tumour-derived exosomes released into the blood circulation that reach the salivary glands. Both are taken up by secretory cells (acinar cells) via endocytosis or membrane fusion and then released into the saliva being produced.93

The chemical constituents of saliva are infrared-active and thus detectable by FT-IR spectroscopy. ATR-FTIR spectroscopy has been used for the salivary diagnosis of several oncological and systemic diseases,94 including oral cancer,95 breast cancer,96 oral submucous fibrosis,97 diabetes,98–101 chronic kidney disease,102 Zika virus103 and COVID-19.104,105 We have demonstrated previously in a pilot study that saliva with FT-IR spectroscopy can detect lung and prostate cancer with accuracies of 91% and 93%, respectively.12 The number of cancer-positive samples in that study was limited, thus further validation was necessary. Herein, 56 lung cancer samples were detected amid a total of 1944 saliva samples collected, with test accuracies of approximately 99% for the different classification algorithms. For this, the pre-processed FT-IR spectral data underwent chemometric analysis by different QDA-based classifiers: PCA-QDA, SPA-QDA and GA-QDA. The best algorithm in terms of test accuracy was the GA-QDA algorithm (99.1% test accuracy) (Table 1). Furthermore, the similarity between the F-score and G-score (both around 99% in the test set) indicates that the difference in class sizes between lung cancer samples and the controls did not interfere with the classifier performance, as the QDA classifier is corrected by the class size in the Bayesian form.106 In addition, the best sensitivity for lung cancer detection was obtained by the GA-QDA algorithm, with 100% sensitivity in the validation and test sets (Table 1), thus correctly classifying all the lung cancer samples in the test set (Table 2).

The spectral markers associated with the presence of lung cancer according to the selected wavenumbers of the SPA-QDA and GA-QDA models (Table 3) were very similar between the two algorithms. Four wavenumbers were selected by the SPA-QDA model: 1735, 1578, 1545 and 1019 cm−1; while three wavenumbers were selected by the GA-QDA model: 1578, 1546 and 1422 cm−1. The main difference was the wavenumbers at 1735 cm−1 [C=O stretching (lipids)] and 1019 cm−1 [ν(CO), ν(CC), δ(OCH), ring (polysaccharides, pectin), DNA] for the SPA-QDA model, and at 1422 cm−1 [νs(COO−) (polysaccharides, pectin)] for the GA-QDA model. The wavenumbers at 1578 cm−1 (ring C–C stretching, C=N adenine) and ∼1545 cm−1 (protein band, amide II [δ(NH), ν(CN)]) were common to both algorithms. The increase in the absorption band of the C=O stretching of lipids/phospholipids has been associated with the presence of lung cancer, as well as the increase of the C–O stretching band at around 1919 cm−1 due to glycogen.12 The amide II band at around 1545 cm−1 has also been reported as a marker for lung cancer, with a decrease in its absorption.12 The absorption at 1019 cm−1 may be caused by different absorption bands, most likely a combination of CO and CC stretching vibrations with OCH deformation, which are present in bioactive compounds such as polysaccharides (pectin) or DNA. Changes in CO stretching absorptions related to DNA have been reported as a biomarker for lung cancer,107 whereas pectin is a polysaccharide with cancer prevention potential.108 The absorption at 1578 cm−1 may be related to adenine, which is known to inhibit tumour cell growth.109

Although the results are promising, the main limitation of this study is the small number of LC samples (n = 56) compared with controls (n = 1888). Although the spectral data resulted in a good discriminatory power of 99%, including more LC samples in the test cohort is fundamental to ensure model robustness, especially when the sample size is increased with real samples that may bring further variance to the dataset. Demographic data (ESI, Table S1) indicate significant differences (p < 0.001) between the ages of the LC (age = 69.8 ± 5.2) and control (age = 67.1 ± 5.1) groups, which is expected since cancer risk increases with age.110 Differences in the proportions of male and female patients between the groups were not found to be statistically significant (p > 0.05), whereas the presence of comorbidities in the control group was found to have some significance (p = 0.0141). The control group is composed of approximately 90.3% healthy individuals and 9.7% patients with some disease conditions. The most frequent conditions are prostate (1.6%) and breast (1.5%) cancers, as well as other systemic diseases (3.6%). Variance in this control group is expected since patients recruited in hospital conditions usually present some indication of disease, thus conferring variability to this group. Nevertheless, the multivariate nature of IR spectroscopy is able to target LC samples amid this variance and provide good discriminatory performance for this group, as demonstrated herein.

This study validates the potential of FT-IR spectroscopy as a detection tool for lung cancer based on saliva samples, as the conclusion of a protocol that started with a saliva dip test, infrared spectroscopy and chemometrics.12,111 This study was deliberately carried out in a real-world clinical setting where the number of cases is relatively small compared to other conditions that would appear during recruitment, thus challenging the construction of classifiers for the dataset. Nonetheless, this spectrochemical approach combined with computational tools has proven to be effective and accurate in identifying lung cancer patients and may now be further extended to multicentre validation for real-world implementation.

Author contributions

Conceptualization, F.L.M. and D.B.; methodology, F.L.M.; software, C.L.M.M.; validation, C.L.M.M. and K.M.G.L.; computational analysis, C.L.M.M. and K.M.G.L.; investigation, F.L.M. and D.B.; sample collection, analysis and resources, F.L.M., D.B., A.W.D., T.S., T.B. and M.N.S.; data curation, F.L.M. and D.B.; writing original draft preparation, C.L.M.M., K.M.G.L. and F.L.M.; reviewing and editing, F.L.M., D.B., K.M.G.L. and C.L.M.M.; funding acquisition, F.L.M. and D.B. All authors have read and agreed to the published version of the manuscript.

Institutional review board statement

This study was carried out in agreement with the Helsinki declaration and full ethical approval was obtained (HRA IRAS ref: 276081; REC ref: 20/PR/0390; London Bridge REC).

Informed consent statement

Informed consent was obtained from all subjects involved in the study.

Data availability

Data contributing to this manuscript will be made available upon reasonable request to the corresponding authors.

Conflicts of interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Acknowledgements

This research was primarily funded by North West Cancer Research (NWCR; ref no.: C12019.03 BURY), which is gratefully acknowledged. For the purchase of an ATR-FTIR spectrometer, funding support from the Pathological Society of Great Britain and Ireland is gratefully acknowledged (ref. no.: Bury1174). During the course of this research, F. L. M. received funding from the NIHR Manchester Biomedical Research Centre (NIHR203308). The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care. We thank the study participants for their participation in the project, as well as the research nurses who facilitated consent and sample collections. The company Copan is thanked for the provision of swabs. We also thank the targeted lung health check team and Alliance Medical for facilitating our access to the screening programme. Kássio M. G. Lima thanks the Pro-Rectory of post-graduation at UFRN and AUXPE No. 1153/2023 for his visiting researcher position at the Clinical Research Centre (CRC) in Blackpool Victoria Hospital, Blackpool, UK (Ref. 23077.016975/2024-47).

References

  1. H. Sung, J. Ferlay, R. L. Siegel, M. Laversanne, I. Soerjomataram, A. Jemal and F. Bray, Ca-Cancer J. Clin., 2021, 71, 209–249.
  2. G. Sutedja, Eur. Respir. J., 2003, 21, 57s–66s.
  3. M. P. Rangel, V. K. De Sá, T. Prieto, J. R. M. Martins, E. R. Olivieri, D. Carraro, T. Takagaki and V. L. Capelozzi, Glycoconjugate J., 2018, 35, 233–242.
  4. B. Zaric, B. Perin, V. Carapic, V. Stojsic, J. Matijasevic, I. Andrijevic and I. Kopitovic, Thorac. Cancer, 2013, 4, 1–8.
  5. R. Fang, Y. Yang, H. Han, X. Fu, L. Dong, B. Xie, W. Lu, C. Ma, F. Cui, J. Hu and J. Wang, Oncol. Lett., 2018, 16(2), 2483–2489.
  6. M. Infante, S. Cavuto, F. R. Lutman, G. Brambilla, G. Chiesa, G. Ceresoli, E. Passera, E. Angeli, M. Chiarenza, G. Aranzulla, U. Cariboni, V. Errico, F. Inzirillo, E. Bottoni, E. Voulaz, M. Alloisio, A. Destro, M. Roncalli, A. Santoro and G. Ravasi, Am. J. Respir. Crit. Care Med., 2009, 180, 445–453.
  7. J. Sanz-Santos, P. Serra, M. Torky, F. Andreo, C. Centeno, L. Mendiluce, C. Martínez-Barenys, P. López De Castro and J. Ruiz-Manzano, Ann. Thorac. Surg., 2018, 106, 398–403.
  8. L. Van Manen, J. Dijkstra, C. Boccara, E. Benoit, A. L. Vahrmeijer, M. J. Gora and J. S. D. Mieog, J. Cancer Res. Clin. Oncol., 2018, 144, 1967–1990.
  9. S. Shinohara, K. Funabiki, M. Kikuchi, S. Takebayashi, K. Hamaguchi, S. Hara, D. Yamashita, Y. Imai and A. Mizoguchi, Auris, Nasus, Larynx, 2020, 47, 668–675.
  10. S. Rouzé, B. De Latour, E. Flécher, J. Guihaire, M. Castro, R. Corre, P. Haigron and J.-P. Verhoye, Interact. CardioVasc. Thorac. Surg., 2016, 22, 705–711.
  11. N. Guibert, J. Mazieres, C.-H. Marquette, D. Rouviere, A. Didier and C. Hermant, Eur. Respir. Rev., 2015, 24, 378–391.
  12. F. L. Martin, C. L. M. Morais, A. W. Dickinson, T. Saba, T. Bongers, M. N. Singh and D. Bury, J. Pers. Med., 2023, 13, 1533.
  13. J. J. Ramsden, in Bioinformatics, Springer London, London, 2009, vol. 10, pp. 1–6.
  14. C. L. M. Morais, K. M. G. Lima, M. Singh and F. L. Martin, Nat. Protoc., 2020, 15, 2143–2162.
  15. L. Zhang, H. Xiao, H. Zhou, S. Santiago, J. M. Lee, E. B. Garon, J. Yang, O. Brinkmann, X. Yan, D. Akin, D. Chia, D. Elashoff, N.-H. Park and D. T. W. Wong, Cell. Mol. Life Sci., 2012, 69, 3341–3350.
  16. J. Kisluk, M. Ciborowski, M. Niemira, A. Kretowski and J. Niklinski, J. Pharm. Biomed. Anal., 2014, 101, 40–49.
  17. Y. Sun, S. Liu, Z. Qiao, Z. Shang, Z. Xia, X. Niu, L. Qian, Y. Zhang, L. Fan, C.-X. Cao and H. Xiao, Anal. Chim. Acta, 2017, 982, 84–95.
  18. E. Roblegg, A. Coughran and D. Sirjani, Eur. J. Pharm. Biopharm., 2019, 142, 133–141.
  19. A. Roi, L. C. Rusu, C. I. Roi, R. E. Luca, S. Boia and R. I. Munteanu, Dis. Markers, 2019, 2019, 1–11.
  20. M. A. Javaid, A. S. Ahmed, R. Durand and S. D. Tran, J. Oral Biol. Craniofac. Res., 2016, 6, 67–76.
  21. C.-Z. Zhang, X.-Q. Cheng, J.-Y. Li, P. Zhang, P. Yi, X. Xu and X.-D. Zhou, Int. J. Oral Sci., 2016, 8, 133–137.
  22. M. Meleti, D. Cassi, P. Vescovi, G. Setti, T. A. Pertinhez and M. E. Pezzi, Med. Oral. Patol. Oral Cir. Bucal, 2020, 25(2), e299–e310.
  23. A. Zalewska, N. Waszkiewicz and R. M. López-Pintor, Dis. Markers, 2019, 2019, 1–2.
  24. M. Robotti, F. Scebba and D. Angeloni, Biomedicines, 2023, 11, 652.
  25. C.-Z. Zhang, X.-Q. Cheng, J.-Y. Li, P. Zhang, P. Yi, X. Xu and X.-D. Zhou, Int. J. Oral Sci., 2016, 8, 133–137.
  26. P. Gopikrishna, A. Ramesh Kumar, K. Rajkumar, R. Ashwini and S. Venkatkumar, Oral Oncol. Rep., 2024, 10, 100508.
  27. Ó. Rapado-González, C. Martínez-Reglero, Á. Salgado-Barreira, B. Takkouche, R. López-López, M. M. Suárez-Cunqueiro and L. Muinelo-Romay, Ann. Med., 2020, 52, 131–144.
  28. D. Bastías, A. Maturana, C. Marín, R. Martínez and S. E. Niklander, Int. J. Mol. Sci., 2024, 25, 2634.
  29. M. Koopaie, S. Kolahdooz, M. Fatahzadeh and S. Manifar, Cancer Med., 2022, 11, 2644–2661.
  30. R. Swaathi, M. Narayan and R. Krishnan, Oral Oncol. Rep., 2024, 10, 100503.
  31. J. Kaur, R. Jacobs, Y. Huang, N. Salvo and C. Politis, Clin. Oral Invest., 2018, 22, 633–640.
  32. E. C. Porto-Mascarenhas, D. X. Assad, H. Chardin, D. Gozal, G. De Luca Canto, A. C. Acevedo and E. N. S. Guerra, Crit. Rev. Oncol. Hematol., 2017, 110, 62–73.
  33. H. Farahani, M. Alaee, J. Amri, M.-R. Baghinia and M. Rafiee, Lab. Med., 2020, 51, 243–251.
  34. Y. Asai, T. Itoi, M. Sugimoto, A. Sofuni, T. Tsuchiya, R. Tanaka, R. Tonozuka, M. Honjo, S. Mukai, M. Fujita, K. Yamamoto, Y. Matsunami, T. Kurosawa, Y. Nagakawa, M. Kaneko, S. Ota, S. Kawachi, M. Shimazu, T. Soga, M. Tomita and M. Sunamura, Cancers, 2018, 10, 43.
  35. S. Bendifallah, S. Suisse, A. Puchar, L. Delbos, M. Poilblanc, P. Descamps, F. Golfier, L. Jornea, D. Bouteiller, C. Touboul, Y. Dabi and E. Daraï, J. Clin. Med., 2022, 11, 612.
  36. F. Li, F. Wei, W.-L. Huang, C.-C. Lin, L. Li, M. M. Shen, Q. Yan, W. Liao, D. Chia, M. Tu, J. H. Tang, Z. Feng, Y. Kim, W.-C. Su and D. T. W. Wong, Cancers, 2020, 12, 2041.
  37. M. Liu, X. Yu, J. Bu, Q. Xiao, S. Ma, N. Chen and C. Qu, Front. Genet., 2023, 14, 1249678.
  38. K. Feng, F. Ren and X. Wang, Front. Mol. Biosci., 2023, 10, 1327893.
  39. J. Yang, X. Mu, Y. Wang, D. Zhu, J. Zhang, C. Liang, B. Chen, J. Wang, C. Zhao, Z. Zuo, X. Heng, C. Zhang and L. Zhang, Front. Oncol., 2018, 8, 520.
  40. N. Kajiwara, M. Kakihana, J. Maeda, M. Kaneko, S. Ota, A. Enomoto, N. Ikeda and M. Sugimoto, Cancer Sci., 2024, 115, 1695–1705.
  41. S. Kumari, M. Samara, R. Ampadi Ramachandran, S. Gosh, H. George, R. Wang, R. P. Pesavento and M. T. Mathew, Biomed. Mater. Devices, 2024, 2, 121–138.
  42. M. Song, H. Bai, P. Zhang, X. Zhou and B. Ying, Int. J. Oral Sci., 2023, 15, 2.
  43. M. J. Baker, J. Trevisan, P. Bassan, R. Bhargava, H. J. Butler, K. M. Dorling, P. R. Fielden, S. W. Fogarty, N. J. Fullwood, K. A. Heys, C. Hughes, P. Lasch, P. L. Martin-Hirsch, B. Obinaju, G. D. Sockalingum, J. Sulé-Suso, R. J. Strong, M. J. Walsh, B. R. Wood, P. Gardner and F. L. Martin, Nat. Protoc., 2014, 9, 1771–1791.
  44. C. L. M. Morais, M. C. D. Santos, K. M. G. Lima and F. L. Martin, Bioinformatics, 2019, 35, 5257–5263.
  45. Q.-S. Xu and Y.-Z. Liang, Chemom. Intell. Lab. Syst., 2001, 56, 1–11.
  46. K. M. G. Lima, K. Gajjar, G. Valasoulis, M. Nasioutziki, M. Kyrgiou, P. Karakitsos, E. Paraskevaidis, P. L. Martin Hirsch and F. L. Martin, Anal. Methods, 2014, 6, 9643–9652.
  47. W. Wu, Y. Mallet, B. Walczak, W. Penninckx, D. L. Massart, S. Heuerding and F. Erni, Anal. Chim. Acta, 1996, 329, 257–265.
  48. S. J. Dixon and R. G. Brereton, Chemom. Intell. Lab. Syst., 2009, 95, 1–17.
  49. C. L. M. Morais and K. M. G. Lima, Chemom. Intell. Lab. Syst., 2017, 170, 1–12.
  50. Z. Movasaghi, S. Rehman and I. Ur Rehman, Appl. Spectrosc. Rev., 2008, 43, 134–179.
  51. K. O'Rourke, Cancer, 2022, 128, 3011–3012.
  52. R. Gasparri, A. Guaglio and L. Spaggiari, J. Clin. Med., 2022, 11, 4398.
  53. K. Takahashi, S. Nakamura, K. Watanabe, M. Sakaguchi and H. Narimatsu, Int. J. Environ. Res. Public Health, 2022, 19, 11477.
  54. O. J. Old, L. M. Fullwood, R. Scott, G. R. Lloyd, L. M. Almond, N. A. Shepherd, N. Stone, H. Barr and C. Kendall, Anal. Methods, 2014, 6, 3901.
  55. C. Kendall, M. Isabelle, F. Bazant-Hegemark, J. Hutchings, L. Orr, J. Babrah, R. Baker and N. Stone, Analyst, 2009, 134, 1029.
  56. D. J. Anderson, R. G. Anderson, S. J. Moug and M. J. Baker, BJS Open, 2020, 4, 554–562.
  57. R. A. Faria, L. B. Leal, M. M. Thebit, S. W. A. Pereira, N. R. Serafim, V. G. Barauna, L. F. Da Chagas E Silva Carvalho, C. L. Sartório and S. A. Gouvea, Appl. Spectrosc., 2023, 77, 405–417.
  58. Y. Du, F. Xie, L. Yin, Y. Yang, H. Yang, G. Wu and S. Wang, Spectrochim. Acta, Part A, 2022, 283, 121715.
  59. H. J. Butler, P. M. Brennan, J. M. Cameron, D. Finlayson, M. G. Hegarty, M. D. Jenkinson, D. S. Palmer, B. R. Smith and M. J. Baker, Nat. Commun., 2019, 10, 4501.
  60. G. Steiner, R. Galli, G. Preusse, S. Michen, M. Meinhardt, A. Temme, S. B. Sobottka, T. A. Juratli, E. Koch, G. Schackert, M. Kirsch and O. Uckermann, J. Neurooncol., 2023, 161, 57–66.
  61. J. Desroches, M. Jermyn, M. Pinto, F. Picot, M.-A. Tremblay, S. Obaid, E. Marple, K. Urmey, D. Trudel, G. Soulez, M.-C. Guiot, B. C. Wilson, K. Petrecca and F. Leblond, Sci. Rep., 2018, 8, 1792.
  62. M. M. Félix, M. V. Tavares, I. P. Santos, A. L. M. Batista De Carvalho, L. A. E. Batista De Carvalho and M. P. M. Marques, Molecules, 2024, 29, 922.
  63. F. M. Lyng, E. Ó. Faoláin, J. Conroy, A. D. Meade, P. Knief, B. Duffy, M. B. Hunter, J. M. Byrne, P. Kelehan and H. J. Byrne, Exp. Mol. Pathol., 2007, 82, 121–129.
  64. C. A. Meza Ramirez, M. Greenop, Y. A. Almoshawah, P. L. Martin Hirsch and I. U. Rehman, Expert Rev. Mol. Diagn., 2023, 23, 375–390.
  65. R. Shaikh, A. Daniel and F. M. Lyng, Molecules, 2023, 28, 2502.
  66. D. Cameron, A. Talari, I. Rehman, P. Mitchell and E. Parkin, Br. J. Surg., 2022, 109, e61–e62.
  67. A. Synytsya, A. Vaňková, M. Miškovičová, J. Petrtýl and L. Petruželka, Diagnostics, 2021, 11, 2048.
  68. P. Giamougiannis, C. L. M. Morais, B. Rodriguez, N. J. Wood, P. L. Martin-Hirsch and F. L. Martin, Anal. Bioanal. Chem., 2021, 413, 5095–5107.
  69. P. Giamougiannis, C. L. M. Morais, R. Grabowska, K. M. Ashton, N. J. Wood, P. L. Martin-Hirsch and F. L. Martin, Anal. Bioanal. Chem., 2021, 413, 911–922.
  70. C. Delrue, R. Speeckaert, M. Oyaert, S. De Bruyne and M. M. Speeckaert, J. Clin. Med., 2023, 12, 7428.
  71. N. S. Eikje, K. Aizawa and Y. Ozaki, in Biotechnology Annual Review, Elsevier, 2005, vol. 11, pp. 191–225.
  72. Z. Hammody, R. K. Sahu, S. Mordechai, E. Cagnano and S. Argov, Sci. World J., 2005, 5, 173–182.
  73. S. Dong, D. He, Q. Zhang, C. Huang, Z. Hu, C. Zhang, L. Nie, K. Wang, W. Luo, J. Yu, B. Tian, W. Wu, X. Chen, F. Wang, J. Hu and X. Xiao, eLight, 2023, 3, 17.
  74. E. J. Lugtu, D. B. Ramos, A. J. Agpalza, E. A. Cabral, R. P. Carandang, J. E. Dee, A. Martinez, J. E. Jose, A. Santillan, R. Bangaoil, P. M. Albano and R. C. Tomas, PLoS One, 2022, 17, e0268329.
  75. P. D. Lewis, K. E. Lewis, R. Ghosal, S. Bayliss, A. J. Lloyd, J. Wills, R. Godfrey, P. Kloer and L. A. Mur, BMC Cancer, 2010, 10, 640.
  76. X. Yang, Z. Wu, Q. Ou, K. Qian, L. Jiang, W. Yang, Y. Shi and G. Liu, Front. Chem., 2022, 10, 810837.
  77. X. Yang, Q. Ou, K. Qian, J. Yang, Z. Bai, W. Yang, Y. Shi and G. Liu, Front. Oncol., 2021, 11, 753791.
  78. Z. Huang, A. McWilliams, H. Lui, D. I. McLean, S. Lam and H. Zeng, Int. J. Cancer, 2003, 107, 1047–1052.
  79. Y. Miao, L. Wu, J. Qiang, J. Qi, Y. Li, R. Li, X. Kong and Q. Zhang, Front. Bioeng. Biotechnol., 2024, 12, 1385552.
  80. Z.-Y. Ke, Y.-J. Ning, Z.-F. Jiang, Y. Zhu, J. Guo, X.-Y. Fan and Y.-B. Zhang, Lasers Med. Sci., 2022, 37, 425–434.
  81. Q. Zheng, J. Li, L. Yang, B. Zheng, J. Wang, N. Lv, J. Luo, F. L. Martin, D. Liu and J. He, Analyst, 2020, 145, 385–392.
  82. C. Chen, J. Hao, X. Hao, W. Xu, C. Xiao, J. Zhang, Q. Pu and L. Liu, Transl. Cancer Res., 2021, 10, 3680–3693.
  83. L. Shi, Y. Li and Z. Li, Light: Sci. Appl., 2023, 12, 234.
  84. K. C. Thandra, A. Barsouk, K. Saginala, J. S. Aluru and A. Barsouk, Contemp. Oncol. (Pozn.), 2021, 25, 45–52.
  85. J. Lever, M. Krzywinski and N. Altman, Nat. Methods, 2016, 13, 703–704.
  86. R. G. Schipper, E. Silletti and M. H. Vingerhoeds, Arch. Oral Biol., 2007, 52, 1114–1135.
  87. S. Derruau, J. Robinet, V. Untereiner, O. Piot, G. D. Sockalingum and S. Lorimier, Molecules, 2020, 25, 4142.
  88. H. J. Byrne, I. Behl, G. Calado, O. Ibrahim, M. Toner, S. Galvin, C. M. Healy, S. Flint and F. M. Lyng, Spectrochim. Acta, Part A, 2021, 252, 119470.
  89. G. Calado, I. Behl, A. Daniel, H. J. Byrne and F. M. Lyng, Transl. Biophotonics, 2019, 1, e201900001.
  90. I. C. C. Ferreira, E. M. G. Aguiar, A. T. F. Silva, L. L. D. Santos, L. Cardoso-Sousa, T. G. Araújo, D. W. Santos, L. R. Goulart, R. Sabino-Silva and Y. C. P. Maia, J. Oncol., 2020, 2020, 1–11.
  91. H. J. Koster, A. Guillen-Perez, J. S. Gomez-Diaz, M. Navas-Moreno, A. C. Birkeland and R. P. Carney, Sci. Rep., 2022, 12, 18464.
  92. H. E. Skallevold, E. M. Vallenari and D. Sapkota, Mediators Inflammation, 2021, 2021, 1–10.
  93. T. Nonaka and D. T. W. Wong, Annu. Rev. Anal. Chem., 2022, 15, 107–121.
  94. C. Delrue, S. De Bruyne and M. M. Speeckaert, J. Pers. Med., 2023, 13, 907.
  95. P. Shree, Y. Aggarwal, M. Kumar, L. Majhee, N. N. Singh, O. Prakash, A. Chandra, S. A. Mahuli, S. Shamsi and A. Rai, Indian J. Otolaryngol. Head Neck Surg., 2024, 76, 2282–2289.
  96. I. C. C. Ferreira, E. M. G. Aguiar, A. T. F. Silva, L. L. D. Santos, L. Cardoso-Sousa, T. G. Araújo, D. W. Santos, L. R. Goulart, R. Sabino-Silva and Y. C. P. Maia, J. Oncol., 2020, 2020, 1–11.
  97. S. Shaikh, D. K. Yadav and R. Rawal, J. Pharm. Biomed. Anal., 2021, 203, 114202.
  98. M. Sanchez-Brito, G. J. Vazquez-Zapien, F. J. Luna-Rosas, R. Mendoza-Gonzalez, J. C. Martinez-Romo and M. M. Mata-Miranda, Comput. Struct. Biotechnol. J., 2022, 20, 4542–4548.
  99. M. S. Nogueira, A. L. Barreto, M. Furukawa, E. S. Rovai, A. Bastos, G. Bertoncello and L. F. D. C. E. S. Carvalho, Photodiagn. Photodyn. Ther., 2022, 40, 103036.
  100. D. C. Caixeta, M. G. Carneiro, R. Rodrigues, D. C. T. Alves, L. R. Goulart, T. M. Cunha, F. S. Espindola, R. Vitorino and R. Sabino-Silva, Diagnostics, 2023, 13, 1396.
  101. D. C. Caixeta, E. M. G. Aguiar, L. Cardoso-Sousa, L. M. D. Coelho, S. W. Oliveira, F. S. Espindola, L. Raniero, K. T. B. Crosara, M. J. Baker, W. L. Siqueira and R. Sabino-Silva, PLoS One, 2020, 15, e0223461.
  102. R. P. Rodrigues, E. M. Aguiar, L. Cardoso-Sousa, D. C. Caixeta, C. C. Guedes, W. L. Siqueira, Y. C. P. Maia, S. V. Cardoso and R. Sabino-Silva, Braz. Dent. J., 2019, 30, 437–445.
  103. S. W. Oliveira, L. Cardoso-Sousa, R. P. Georjutti, J. F. Shimizu, S. Silva, D. C. Caixeta, M. Guevara-Vega, T. M. Cunha, M. G. Carneiro, L. R. Goulart, A. C. G. Jardim and R. Sabino-Silva, Diagnostics, 2023, 13, 1443.
  104. A. Martinez-Cuazitl, G. J. Vazquez-Zapien, M. Sanchez-Brito, J. H. Limon-Pacheco, M. Guerrero-Ruiz, F. Garibay-Gonzalez, R. J. Delgado-Macuil, M. G. G. De Jesus, M. A. Corona-Perezgrovas, A. Pereyra-Talamantes and M. M. Mata-Miranda, Sci. Rep., 2021, 11, 19980.
  105. V. G. Barauna, M. N. Singh, L. L. Barbosa, W. D. Marcarini, P. F. Vassallo, J. G. Mill, R. Ribeiro-Rodrigues, L. C. G. Campos, P. H. Warnke and F. L. Martin, Anal. Chem., 2021, 93, 2950–2958.
  106. C. L. M. Morais and K. M. G. Lima, J. Braz. Chem. Soc., 2018, 29(3), 472–481.
  107. H. Li, J. Wang, X. Li, X. Zhu, S. Guo, H. Wang, J. Yu, X. Ye and F. He, Spectrochim. Acta, Part A, 2024, 306, 123596.
  108. T. B. Emran, F. Islam, S. Mitra, S. Paul, N. Nath, Z. Khan, R. Das, D. Chandran, R. Sharma, C. M. G. Lima, A. A. A. Awadh, I. A. Almazni, A. H. Alhasaniah and R. P. F. Guiné, Molecules, 2022, 27, 7405.
  109. M. Han, X. Cheng, Z. Gao, R. Zhao and S. Zhang, Oncotarget, 2017, 8, 94286–94296.
  110. E. Laconi, F. Marongiu and J. DeGregori, Br. J. Cancer, 2020, 122, 943–952.
  111. F. L. Martin, A. W. Dickinson, T. Saba, T. Bongers, M. N. Singh and D. Bury, J. Pers. Med., 2023, 13, 1039.

Footnote

Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4an00726c

This journal is © The Royal Society of Chemistry 2024