Xinchen
Deng
a,
Kirsty
Milligan
a,
Alexandre
Brolo
b,
Julian J.
Lum
cd,
Jeffrey L.
Andrews
e and
Andrew
Jirasek
*a
aDepartment of Physics, The University of British Columbia – Okanagan campus, Kelowna, Canada V1V 1V7. E-mail: andrew.jirasek@ubc.ca
bDepartment of Chemistry, University of Victoria, Victoria, Canada V8P 5C2
cDepartment of Biochemistry and Microbiology, University of Victoria, Victoria, Canada V8P 5C2
dTrev and Joyce Deeley Research Centre, BC Cancer, Victoria, Canada V8R 6V5
eDepartment of Statistics, The University of British Columbia – Okanagan campus, Kelowna, Canada V1V 1V7
First published on 26th September 2022
Recent advancements in anatomical imaging of tumours as treatment targets have led to improvements in RT. However, it is unlikely that improved anatomical imaging alone will be the sole driver for new advances in personalised RT. Biochemically based radiobiological information is likely to be required for next-generation improvements in the personalisation of radiotherapy dose prescriptions to individual patients. In this paper, we use Raman spectroscopy (RS), an optical technique, to monitor individual biochemical response to radiation within a tumour microenvironment. We spatially correlate individual biochemical responses to augmentatively derived hypoxic maps within the tumour microenvironment. Furthermore, we pair RS with a data analytical framework combining (i) group and basis restricted non-negative matrix factorization (GBR-NMF), (ii) a random forest (RF) classifier, (iii) and a feature metric importance calculation method, Shapley Additive exPlanations (SHAP), in order to ascertain the relative importance of individual biochemicals in describing the overall biological response as observed with RS. The current study found that the GBR-NMF-RF-SHAP model helped identify a wide range of radiation response biomarkers and hypoxia indicators (e.g., glycogen, lipids, DNA, amino acids) in H460 human lung cancer cells and H460 xenografts. Correlations between the hypoxic regions and Raman chemical biomarkers (e.g., glycogen, alanine, and arginine) were also identified in H460 xenografts. To summarize, GBR-NMF-RF-SHAP combined with RS can be applied to monitor the RT-induced biochemical response within cellular and tissue environments. Individual biochemicals were identified that (i) contributed to overall biological response to radiation, and (ii) spatially correlated with hypoxic regions of the tumour. RS combined with our analytical pipeline shows promise for further understanding of individual biochemical dynamics in radiation response for use in cancer therapy.
Although advancements in anatomical imaging of tumours as treatment targets have led to improvements in RT, it is unlikely that the improved anatomical imaging will be the sole solution for personalised RT.2 The interest in applying personalised treatment plans has been growing as the knowledge of cancer biology and radio-biology deepens within the cancer research and treatment community. Resistance to RT often occurs in the hypoxic (low oxygen) regions of tumours, which leads to unsatisfactory clinical outcomes by reducing the efficacy of radiotherapy, resulting in lower tumour control and overall survival.2,5 Hypoxia is associated with complex metabolic pathways in the tumour microenvironment.6,7 However, an efficient technique to simultaneously monitor multiple biochemical changes within the tumour microenvironment during RT has not been developed.
Raman spectroscopy (RS) is a non-invasive optical technique that provides a detailed spectroscopic description of the molecular composition within a sample. RS can be attractive in the clinical setting as it can yield fingerprints of multiple chemical biomarkers simultaneously without destroying the biological sample.8 To analyze the spectral data acquired from RS, dimensionality reduction techniques are typically applied to parse the covariant features in the spectra.8 Unsupervised dimensionality reduction techniques such as principal component analysis (PCA) and non-negative matrix factorization (NMF) have helped discover glycogen and lipids as radiation response Raman biomarkers in cancer cells and tissue samples.9–17 However, unsupervised dimensionality reduction methods can also lead to interpretability problems when used to interpret the biochemical information contained within RS data, as biological components do not necessarily segregate along individual reduced dimensions (e.g. PCs).
To tackle this challenge, a group and basis restricted non-negative matrix factorization (GBR-NMF) algorithm has been previously developed by our group in order to decompose the spectral data with constrained chemical bases of interest.18 GBR-NMF is a semi-supervised dimension reduction method, which helps to improve the interpretability of the decomposed RS data, and hence aid in monitoring radiation response.19,20 Moreover, GBR-NMF can be combined with a classifier method such as random forest (RF) to form a data analytical framework (GBR-NMF-RF) to stratify the relative importance of each of the GBR-NMF decomposed biochemicals. RF ensembles a collection of decision classifiers and generates the overall classification result based on the individual voting results of tree classifiers within the forest.21
The aim of the current study is to apply GBR-NMF-RF data analytical framework and the feature importance metric (e.g., SHapley Additive exPlanations [SHAP]) to analyze Raman cellular and spectral tissue data. RF classifies the irradiated and non-irradiated cancer cellular and tissue samples based on the GBR-NMF decomposed chemical scores. In addition, RF can pair with the feature importance evaluation metrics (e.g., SHAP) to measure the contribution of chemicals during the classification tests. SHAP is a game theoretic approach to explaining the effects of features on the predictions.
Using the GBR-NMF-RF analytical framework, we first examines which biomarkers RS can track when monitoring radiation response in cancer under different micro-environments (cellular vs. tissue). The present study also investigated whether biochemical changes detected by RS can be associated with tumour biological features such as hypoxia in the tumour microenvironment. Hypoxia, as a standard feature of solid tumours can deprive the oxygen supply and impair blood flow due to abnormal angiogenesis (the formation of abnormal blood vessels supplying the tumour).22 Thus, not only cancer cells but also the tumour microenvironment is affected by hypoxia-induced changes.22,23 To investigate which biomarkers correlate with hypoxic regions in the tissue, three data sets (Table 1) that cover cells and xenografts derived from human lung (H460) were selected. The random forest classifier in GBR-NMF-RF is used to classify the cellular and tissue data sets into irradiated and non-irradiated classes. After RF classification, the feature importance ranked by SHAP uncovered the variability of the chemical scores before and after the radiation treatment and correlations between hypoxic regions and chemical spatial distribution.
Data names | Descriptions |
---|---|
Cellular data set (Matthews et al.11) | 360 H460 cellular spectra (irradiated at 0, 2, 4, 6, 8, and 10 Gy harvested at day 3) |
Tissue data set 1 (Harder et al.13) | 6280 H460 xenografts tissue spectra from 12 mice (irradiated at 0 Gy, 5 Gy and 15 Gy and harvested at day 3) |
Tissue data set 2 (Van Nest et al.15) | 2960 H460 xenografts tissue spectra from 6 mice (irradiated at 0 Gy and 15 Gy and harvested at day 3) |
The feature importance ranked by SHAP uncovered the biochemical variability in H460 cells and H460 tissue xenografts, pre and post-irradiation. In both the H460 cells and H460 tissue xenografts, radiation induced changes in the following biochemicals were observed; glycogen, lipids, DNA and amino acids. In addition, this study also discovered the co-expression of hypoxia and Raman chemical biomarkers (e.g., glycogen, alanine, and arginine) in the tissue tumour microenvironment. RS assisted with GBR-NMF-RF data analytical framework has demonstrated outstanding capability to monitor the RT-induced biochemical response within the tumour microenvironment.
The acquisition procedures of tissue data set 1 were summarised based on Harder et al.12 The mice were irradiated when tumour sizes reached 10–12 mm diameter. Each mouse was anesthetized using a single intraperitoneal (IP) injection of Ketamine/Dexdomitor mixture (50 mg kg−1 Ketamine and 0.5 mg kg−1 Dexdomitor based on body weight). After confirmation of anesthesia, individual mice were placed into a custom-designed restraining acrylic chamber and placed near the isocentre of a Varian Truebeam STx linear accelerator (LINAC) (Varian Medical Systems Inc., Palo Alto, CA, USA). Based on treatment plans made in Eclipse (version 11) treatment planning software (Varian Medical Systems Inc.), irradiations were delivered via a 6 MV photon beam with a dose rate of 6 Gy per minute at isocentre. Single fractions of 0, 5, and 15 Gy dose were delivered to the tumour tissues. Mice were euthanized three days post-irradiation. Removed tumour tissues from mice for Raman spectral analysis were embedded in mounting medium (Tissue-Tek O.C.T.™ Sakura Finetek Europe B.V., The Netherlands), flash-frozen in liquid nitrogen, and stored at −80 °C.
Tissue data set 2 were acquired by Van Nest et al.15 The animals were prepared for irradiation when the tumour volume was below 80 mm2 or above 130 mm2 with an average tumour volume of 97 ± 7mm2. The animals were anesthetized using isoflurane inhalation (2%, in oxygen). Treatment plans were made using Muriplan (Gulmay Medical Inc.) to deliver single fractions of 0 or 15 Gy dose the tumour at a dose rate of 4 Gy min−1. Mice were treated using a small animal radiation research platform (SARRP, Xstrahl, Gulmay Medical Inc., Suwanee, GA) and imaged using a single cone-beam computed tomography (CBCT) scan. For tumour tissues involving evaluation of carbonic anhydrase IX (CAIX) expression, tumours were irradiated with a Varian Truebeam STx linear accelerator (LINAC) (Varian Medical Systems Inc., Palo Alto, CA, USA) via a 6 MV photon beam with a dose rate of 6 Gy per minute. Mice were euthanized through isoflurane overdose (5%, in oxygen) and cervical dislocation at three days post-irradiation for tumour extraction. Tumours were extracted and embedded in the mounting medium (Tissue-Tek O.C.T. Sakura Tinetek Europe B.V., The Netherlands), flash-frozen in liquid nitrogen, and stored at –80 °C immediately after euthanasia.
Chemical names | Abbreviation |
---|---|
Alanine | Ala |
Arginine | Arg |
Asparagine | Asn |
Aspartic acid | Asp |
Cholesterol | Cho |
Citric acid | Cit |
Coenzyme A | CoA |
Collagen | Col |
Cysteine | Cys |
DNA | DNA |
Glucose | Glu |
Glutamic acid | GluA |
Glutamine | Gln |
Glutathione | GSH |
Glycerol | Glyc |
Glyceryl tripalmitoleate | GlyT |
Glycine | Gly |
Glycogen | Glg |
Histidine | His |
Isoleucine | Ile |
Lactic acid | Lac |
Leucine | Leu |
Lysine | Lys |
Mannose | Man |
Methionine | Met |
Oleic acid | Ole |
Palmitic acid | Pal |
Phenylalanine | Phe |
Phosphatidylcholine | PC |
Phosphatidylethanolamine | PE |
Phosphatidylinositol | PI |
Phosphatidylserine | PS |
Pyruvic acid | Pyr |
Serine | Ser |
Stearic acid | Ste |
Threonine | Thr |
Triglycerides | Tri |
Tryptophan | Trp |
Tyrosine | Tyr |
Valine | Val |
All the reference chemical spectra (liquid or solid) were acquired using a Renishaw inVia Raman microscope (Renishaw Inc.) with a 100× dry objective (NA = 0.9) (Leica Microsystems, Wetzlar, Germany), a 1200 lines per mm diffraction grating, a 10 s exposure time, and a 785 nm laser (Renishaw). All reference spectra were interpolated onto the same wavenumber axis (resolution) as the cellular or tissue xenografts spectra before the analysis.
For the cellular Raman data set acquisition, cells were washed with PBS, harvested with trypsin, and centrifuged into a pellet. Pellets were transferred to a 5 mm thick magnesium fluoride window (Janos Technology Inc., Keene, NH, USA) and allowed to air dry for 5 minutes. Raman spectra were acquired from 20 individual cells from each sample (20 spectra per sample at each radiation dose indicated) at three days post irradiation. Cells were randomly chosen from the top layer of the cell pellet for acquisition. To acquire the xenografts tissue Raman data set 1 and 2, tumours were sectioned into 20 μm using a rotary microtome (HM 550; MICROM International GmbH, Walldorf, Germany) and placed on magnesium fluoride slides. Before Raman acquisition, the frozen sections were air-dried for 10 minutes.
All RS biological sample acquisitions were conducted with the same instrument parameters. A Renishaw inVia Raman Microscope (Renishaw Inc., Illinois, IL, USA) equipped with a 785 nm diode laser (Renishaw) and dry objective (100×, NA = 0.9) (Leica Microsystems, Wetzlar, Germany) was used to collect Raman map spectra. Spectra were registered using a thermoelectrically cooled charge-coupled device (CCD) detector (Andor Technology, Connecticut, USA). The laser sampling dimensions were 2 × 5 × 10 μm3, and a laser power density was measured 0.5 mW μm−3 at the sampling volume. Spectra were acquired for 20 s per point, covering a spectral range of 460–1800 cm−1.
A total of twelve mice were studied for tissue data set 1, with four mice in each dose group (0, 5, and 15 Gy). The animal sample size was selected to follow similar population sizes from previously published Raman studies involving animals and radiation exposure. Five to eight unique mapping regions (map areas are between 100–220 μm2, step size 15 μm2) were analysed over three tissue sections per mouse, resulting in a total of 6648 spectra before spectral processing. After pre-processing, 6280 H460 xenografts tissue spectra were acquired in total. For tissue data set 2, two maps were collected from randomly selected regions within each tissue section, leading to a total of six maps collected per tumour (2 maps per section, 3 sections per tumour). This study included a total of 6 mice (6 mice at 3 days post irradiation; 3 per dose group). A total number of 2960 H460 xenograft tissue spectra resulted after spectra were processed.
Each spectrum was processed with a cosmic ray removal program from WiRE (Renishaw Inc.) to remove cosmic rays. Additionally, Matlab was used for other prepossessing steps, such as correcting for wavenumber calibration drifts, estimating and subtracting a baseline arising from the substrate and biological fluorescence, and normalised to a total area under the curve equal to 1.
Rather than decomposing the non-negative data matrix X into two lower rank non-negative matrices W such that
X ≈ WH |
GBR-NMF decomposed the non-negative data matrix X into W, A, and S such that
X ≈ WAS |
SHapley Additive exPlanations (SHAP) was used to calculate the feature importance after the classification tests. SHAP was developed to explain the output of machine learning models based on game theory by finding the optimal contribution allocation for feature variables.29 SHAP assigns each variable feature an importance value for a particular prediction.29
An explainer optimized for tree-based models (e.g., RF) implemented by Lundberg et al. was used to calculate the feature importance in the current study in Python 3.9.13.30,31
The random forest is used to classify the cellular and tissue data sets into irradiated and non-irradiated classes with a standard package in scikit-learn (version 1.0) in Python 3.9.13.31,32 Based on the feature importance calculated by SHAP, the top 20 chemicals were selected for further investigation on the variance of the chemical scores before and after the radiation treatment and correlations between hypoxic regions and chemical spatial distribution.
In Fig. 1, the average spectra (red) with ±1 standard deviation (grey shadow spectrum) for the three H460 Raman spectral data sets are shown. The greatest standard deviation of H460 cellular data set and tissue data set 1 occurs at 482 cm−1, which can be attributed to glycogen.11,14,19,33 For tissue data set 2, the greatest standard deviation is located at 1438 cm−1, which can be assigned to unsaturated fatty acids and triglycerides.27 Other spectral regions where high standard deviations occur in three data sets were also identified, such as 1440 cm−1 (CH2 scissoring vibrations and lipids), 1442 cm−1 (CH2 scissoring and lipids), 1447 cm−1 (CH2 bending mode of proteins and lipids), 850 cm−1 (single-bond stretching vibrations for amino acids), and 1658–1664 cm−1 (Amide I).
Fig. 1 Average Raman spectra of H460 cellular and tissue samples. The red spectrum is the average spectrum and the grey shadow spectrum is the average spectrum ±1 standard deviation. The greatest standard deviation of H460 cellular data set and tissue data set 1 is at 482 cm−1, which can be attributed to glycogen.11,14,19,33 For tissue data set 2, the greatest standard deviation is at 1438 cm−1, which can be assigned to unsaturated fatty acids and triglycerides.27 High standard deviations were also identified, in 1440 cm−1 (CH2 scissoring vibrations and lipids), 1442 cm−1 (CH2 scissoring and lipids), 1447 cm−1 (CH2 bending mode of proteins and lipids), 850 cm−1 (single-bond stretching vibrations for amino acids), and 1658–1664 cm−1 (Amide I). |
In the top 20 contributing chemicals, alanine, citric acid, glycogen, stearic acid, threonine, and valine appeared in all three data sets (cell and xenografts tissue). DNA, glucose, leucine, mannose, phosphatidylethanolamine, and triglycerides were only ranked as high contributing chemicals in the cellular data set. Although absent from the cellular data set, chemical bases such as arginine, phenylalanine, phosphatidylserine, tyrosine, and unconstrained bases show high contributions in both tissue data sets. Cysteine, glutathione, glycine, methionine only occur as the critical chemical in one of the tissue data sets. It is worth noting that DNA was ranked as the highest contributing chemical in the cellular data set classification tests while absent from the top 20 chemicals in both tissue data sets. Moreover, the unconstrained bases appeared as a top 20 contributing chemical to both tissue data set classifications. The unconstrained bases can be interpreted as the residue of unidentified chemical metabolites in the Raman spectra.
Glycogen accumulation after irradiation has been reported previously in H460 cancer cells and xenografts tissues.11,13,15 Increased glycogen levels can be attributed to impaired inhibition of glycogen synthase kinase-3 (GSK-3) which regulates glycogen synthesis. Western blot analysis was conducted previously and found increases in phosphorylation of GSK-3β and total amount of GSK-3β11 for post-irradiation cells. Glycogen accumulation in xenograft tissues after radiation treatment was also confirmed using periodic acid-Schiff staining.13
Other than glycogen, multiple amino acids can be identified as metabolites responding to radiation treatment. The increased level of multiple amino acids can indicate tumour radiation response to repair the damage induced by ionising radiation. Alanine is an amino acid that can be synthesized from pyruvate, which has a vital role in the TCA cycle of cancer cells.34 Altered alanine metabolism was found in studies across various types of cancer (e.g., lung cancer, and pancreatic cancer).35–37 Moreover, increased alanine production can be used as a tumour response indicator to effective RT evaluated by hyperpolarized 13C MRI.38 Asparagine has an essential role in promoting cancer cell proliferation as an amino acid exchange factor;39,40 this phenomenon can be associated with the increased level observed in Fig. 3. Asparagine is degraded by the enzyme asparaginase (ASNase) and combinatorial treatment with ASNase with chemotherapy or chemoradiation showed synergy in suppressing tumor growth.41–43
In Fig. 4, chemicals (e.g., arginine, citric acid, DNA, palmitic acid, pyruvic acid, and stearic acid) with a consistent decreasing trend (statistically significant with a p-value <0.05 using null-hypothesis significance testing) post-irradiation across three data sets are presented.
Arginine is a non-essential amino acid that can be synthesized by cells.44 Nonetheless, when cells encounter catabolic stress, arginine will be heavily consumed from the environment,44 which explains the decreased arginine level found by RS. DNA double-strand and single-strand breaks induced by ionising radiation have been established as one primary cellular damage mechanism,1 which corresponds to the observation in declined DNA scores. The depletion of citric acid and pyruvic acid can potentially lead to increased energy consumption in the TCA cycle.45 Altered lipid metabolism is a hallmark of metabolic alterations in cancer.46 Previously, a study conducted on blood serum samples acquired before and after radiation treatment in patients analyzed by gas chromatography has demonstrated decreased level of saturated fatty acids after treatment.47 The decreases in palmitic acid and stearic acids (which are two types of saturated fatty acids) found in the present study corresponds to the previously reported decreases in saturated fatty acids.
The average glycogen score inside the hypoxia mask is higher than the average outside of the hypoxia region. A positive correlation between the glycogen scores and the CAIX intensities in the map was calculated from Spearman's correlation (0.23, p-value <0.05). The similarities between glycogen and hypoxia distribution in Fig. 6 can be linked to glycogen synthesis induced in hypoxia by the hypoxia-inducible factor 1.48,49 The hypoxia-inducible factor 1 (HIF-1) is a transcription factor responsible for alterations in cell metabolism in hypoxic tumor cells.48,49 It promotes cell proliferation by inducing a metabolic shift from oxidative phosphorylation to glycolysis and lactic acid production.48,49 Furthermore, alanine distribution in the chemical map also resembles CAIX spatial distribution in Fig. 6. Alanine has a higher average score inside the hypoxia region than the region outside of hypoxia. Alanine scores are positively correlated with CAIX intensities shown by Spearman's correlation (0.204, p < 0.05). Previous studies have found connections between hypoxia and alanine synthesis indicated in Fig. 6.34,50 Alanine is an amino acid synthesized from pyruvate. Alanine is expected to lie mainly in the mitochondrial matrix, which is the site of the TCA cycle.34,50 The alanine aminotransferase (ALAT) competes for mitochondrial pyruvate with pyruvate dehydrogenase (PDH).34 Therefore, if the PDH activity is low (e.g., hypoxia), the increased alanine synthesis may be observed.34
Arginine displays high activities across the map except for regions inside the hypoxia mask (Fig. 6). Arginine scores are also negatively correlated with CAIX intensities calculated from Spearman's correlation (−0.232, p-value <0.05). Hypoxia induces angiogenesis.22 Studies have found that arginine plays an essential role in angiogenesis.51,52 Arginine can be converted to nitric oxide (NO), and NO within the tumour microenvironment is an essential mediator of tumour angiogenesis.52 NO can promote angiogenesis by producing more blood vessels near the solid hypoxic tumours to supply nutrients and oxygen.52,53
In addition to reducing pyruvate in the TCA cycle, the potential reduction in the overall TCA cycle would stall electron transport chain (ETC) activity and reduce the production of reactive oxygen species necessary for RT-induced DNA damage.57 Interestingly the increase in CAIX staining, a hypoxia marker, suggests that low oxygen may be the primary driver of these metabolite changes (Fig. 3–5). These data indicate that inhibition of GPT1/2 could shift the pyruvate reactions towards pyruvate dehydrogenase and conversion to acetyl-CoA.58 Pyruvate dehydrogenase and conversion to acetyl-CoA can help maintain TCA flux, ETC to allow for sufficient ROS production during RT and thereby restore or sensitize to RT-induced cell death.
In previous research, RS has demonstrated the potential to be incorporated into radiation therapy as a label-free technique to reveal the biochemical dynamics and assess the treatment response for various types of cancer.9–17,59–63 Nevertheless, conventional unsupervised dimension reduction techniques can provide limited resolution in distinguishing responses from different chemicals.9–17,59–63 In this manuscript, the GBR-NMF-RF-SHAP data analytical framework has demonstrated an outstanding capability to resolve signals such as radiation response and hypoxia indicators corresponding to constrained chemicals. Moreover, GBR-NMF-RF-SHAP also exhibited efficiency in analyses across various types of cancer data (cell and tissue). In the future, biochemicals indicating radiation response and hypoxia identified herein will be cross-validated using orthogonal techniques such as mass spectroscopy and staining.
This journal is © The Royal Society of Chemistry 2022 |