Kevin
Chappell‡
a,
Kanishka
Manna‡
a,
Charity L.
Washam
ab,
Stefan
Graw
abc,
Duah
Alkam
a,
Matthew D.
Thompson
a,
Maroof Khan
Zafar
a,
Lindsey
Hazeslip
a,
Christopher
Randolph
b,
Allen
Gies
a,
Jordan T.
Bird
a,
Alicia K
Byrd
ad,
Sayem
Miah
ad and
Stephanie D.
Byrum
*abd
aDepartment of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, 4301 West Markham Street (slot 516), Little Rock, AR 72205-7199, USA. E-mail: sbyrum@uams.edu; Fax: +(501) 526-7008; Tel: +(501) 686-5783
bArkansas Children's Research Institute, 13 Children's Way, Little Rock, AR 72202, USA
cEmory University, Atlanta, GA, USA
dWinthrop P. Rockefeller Cancer Institute, 449 Jack Stephens Dr, Little Rock, AR 72205, USA
First published on 18th June 2021
Triple negative breast cancer (TNBC) is an aggressive type of breast cancer with very little treatment options. TNBC is very heterogeneous with large alterations in the genomic, transcriptomic, and proteomic landscapes leading to various subtypes with differing responses to therapeutic treatments. We applied a multi-omics data integration method to evaluate the correlation of important regulatory features in TNBC BRCA1 wild-type MDA-MB-231 and TNBC BRCA1 5382insC mutated HCC1937 cells compared with non-tumorigenic epithelial breast MCF10A cells. The data includes DNA methylation, RNAseq, protein, phosphoproteomics, and histone post-translational modification. Data integration methods identified regulatory features from each omics method that had greater than 80% positive correlation within each TNBC subtype. Key regulatory features at each omics level were identified distinguishing the three cell lines and were involved in important cancer related pathways such as TGFβ signaling, PI3K/AKT/mTOR, and Wnt/beta-catenin signaling. We observed overexpression of PTEN, which antagonizes the PI3K/AKT/mTOR pathway, and MYC, which downregulates the same pathway in the HCC1937 cells relative to the MDA-MB-231 cells. The PI3K/AKT/mTOR and Wnt/beta-catenin pathways are both downregulated in HCC1937 cells relative to MDA-MB-231 cells, which likely explains the divergent sensitivities of these cell lines to inhibitors of downstream signaling pathways. The DNA methylation and RNAseq data is freely available via GEO GSE171958 and the proteomics data is available via the ProteomeXchange PXD025238.
TNBC is associated with worse prognoses than other types of breast cancer, which is attributed to its aggressiveness and heterogeneity.4 The various subtypes and heterogeneity of breast cancer are largely due to altered genomic, transcriptomic, and proteomic molecular signatures that impact various signaling pathways. About 5% of breast cancer tumors carry heritable gene mutations, such as the most common mutation in the breast cancer gene1 (BRCA1). Of these BRCA1 mutant tumors, the BRCA1 5382insC mutation is one of the most common.5 This study by Pogoda et al. showed that out of 124 TNBC patients that underwent genetic counselling with BRCA1/2 mutation tests, 30 had a mutation detected. Of those 30 patients, 18 patients had a BRCA1 5382insC mutation.5
Despite their established diversity, TNBC subtypes are all treated with chemotherapy, which too often results in recurrence.6 This treatment strategy is clearly inadequate as genetic alterations in key genes that seemingly define the TNBC aggressiveness (e.g. PIK3CA and AKT) have been characterized – rendering treatments which target these proteins ineffective in a subset of TNBC patients.7 A study by Gu et al. 2016 demonstrated TNBC tumors with a BRCA1 mutant are insensitive to mitogen-activated protein kinase 1 and 2 (MEK 1/2) inhibitors, limiting the therapeutic potential against these cancer subtypes.8 MEK inhibitors have been used to successfully treat other genomic mutated cancers, such as melanoma.8 Therefore, understanding the molecular complexity of various subtypes of breast cancer is of primary importance.
The advent of personalized medicine brings great promise to those suffering from this challenging disease as distinct, potentially druggable, molecular targets with unique alterations have been identified. Development of treatment strategies that would broadly target TNBC patients necessitates a comprehensive understanding of the underpinnings of this disease. To achieve such understanding, we characterized two subtypes of TNBC by examining and integrating information on their epigenetic, transcriptomic, proteomic and phospho-proteomic profiles. Specifically, we characterized the BRCA1-wild-type MDA-MB-231 TNBC, and the BRCA15382insC HCC1937 TNBC cell lines as well as the MCF10A cell line as a normal breast epithelial control. Our multi-omics approach highlights the diversity of the different TNBC subtypes, and sharpens our understanding of the molecular pathways that govern this challenging breast cancer.
Following an initial quality control step, the DNA methylation data was preprocessed and normalized using Bioconductor packages minfi and watermelon.15,16 Briefly, the function “preprocessNoob” (minfi) was used to correct for background fluorescence and dye biases within an array. Next, probes and samples with poor quality were identified and removed. Therefore, samples with more than 10% of probes that had detection p-values >1 × 10−5 or samples whose intensity distributions demonstrated irregularities were excluded.17 Furthermore, Illumina removed 1031 CpG probes when transitioning to their B1 version of the MethylationEPIC v1.0 manifest due to poor performance and additional probes in the transition from their B2 to their B3 version.18 In addition to probes flagged by Illumina, we also excluded probes with a median detection p-value > 0.05 from the subsequent statistical analysis. Next, we corrected the type II probe bias using the function “BMIQ” (wateRmelon) to achieve a comparable methylation distributions of type I and II probes.15 All single nucleotide polymorphism (SNP)-CpG interaction or cross-reactive probes were flagged and excluded from the downstream analyses in order to reduce the bias of the results. The DNA-methylation data were analyzed using the R statistical programming language (version 3.6).
The statistical analysis for probe-wise differential DNA methylation was performed following the limma workflow.19M-values were calculated by transforming β-values using the logit-transformation. Average M-values across promoter regions and associated genomic features were calculated for each “Promoter Associated” region annotated by the IlluminaHumanMethylationEPICanno.ilm10b4.hg19 R Package.20 The change in methylation at a promoter region was indicated by the difference between mean CpG methylation between treatments. A linear model was fitted with the “lmFit” and “eBayes” functions from the limma package. We individually adjusted the p-values for multiple testing across all CpG sites/promoter regions using the Benjamini–Hochberg method21 to control the False Discovery Rate (FDR). A CpG was considered differentially methylated if the FDR was <0.05.
RNA reads were checked for quality of sequencing using FastQC. The adaptors and low quality bases (Q < 20) were trimmed to a minimum of 36 base pairs using Trimmomatic. Reads that pass quality control were aligned to the GRCh38 Ensemble release 99 Homo sapiens reference genome using STAR sequence aligner. Raw counts were obtained from bam files using Subread's “featuresCount” and transformed to log2 counts per million (CPM).22 Low expressed genes were filtered out and libraries normalized by trimmed mean of M-values.23 Differential expression was performed using limma's “voom with quality weights” and p-values were corrected for multiple testing using the Benjamini–Hochberg procedure. Genes with a false discovery rate (FDR) p-value < 0.05 and fold change > 2 were considered significant.
Proteins and phosphosites were identified and reporter ions quantified by searching the UniprotKB Homo sapiens database (November 2018) using MaxQuant (version 1.6.10.43; Max Planck Institute) with a parent ion tolerance of 3 ppm, a fragment ion tolerance of 0.5 Da, a reporter ion tolerance of 0.001 Da, trypsin enzyme with 2 missed cleavages, variable modifications including oxidation on M, acetyl on protein N-term, and phosphorylation on STY, and fixed modification of Carbamidomethyl on C. Protein and peptide identifications are accepted if they could be established with less than 1.0% false discovery. TMT MS3 reporter ion intensity values were analyzed for changes in total protein using the unenriched lysate sample. Phospho (STY) modifications were identified using the samples enriched for phosphorylated peptides. The enriched and un-enriched samples are multiplexed using two TMT10-plex batches, one for the enriched and one for the un-enriched samples.
Following data acquisition and database search, the results were normalized using cyclic loess normalization for both the protein and the phosphopeptide data sets.25
The normalized protein and phosphorylated peptide data was analyzed for differential abundance using the limma package by applying “lmFit” and “eBayes” functions. Protein kinases and their substrates were identified using PHOXTRACK.26 A similar approach was used for differential analysis of the phosphopeptides, with the addition of a few steps. The phosphosites were filtered to retain only peptides with a localization probability > 75%, filter peptides with zero values, and cyclic loess transformed. Limma was also used for differential analysis of single phosphosite peptides.
Histones were resolved (20 μg of histones per lane) by SDS-PAGE using 4–20% Novex Tris-glycine gradient gels (Life Technologies, Inc.) and stained with Thermo Fisher Scientific Pierce GelCode Blue stain reagent. The region of each gel lane containing the core histones was excised as one piece, diced into small pieces, destained, treated with d6-acetic anhydride to chemically block unmodified lysines and monomethylated lysines with an isotopically heavy acetyl, and digested in-gel with trypsin. Tryptic peptides were separated by reverse phase Jupiter Proteo resin (Phenomenex) on a 100 0.075 mm column using a nanoAcquity UPLC system (Waters). Peptides were eluted using a 40 min gradient from 97:3 to 35:65 buffer A/B ratio. (Buffer A consists of 0.1% formic acid, 0.5% acetonitrile; buffer B consists of 0.1% formic acid, 75% acetonitrile). Eluted peptides were ionized by electrospray (1.9 kV) followed by MS/MS analysis using collision induced dissociation on an Orbitrap Fusion Lumos mass spectrometer (Thermo Fisher Scientific). MS data were acquired using the Fourier Transform Mass Spectrometry analyzer in profile mode at a resolution of 60000 over a range of 375 to 1500 m/z.
Proteins were identified by searching the UniProtKB Homo sapiens database (October 2019 database) using an in-house Mascot server (version 2.5.1; Matrix Science). Mascot search parameters were specified as follows: trypsin digestion with up to two missed cleavages; fixed carbamidomethyl modification of cysteine; variable modifications including methyl, dimethyl, trimethyl, acetyl, acetyl/2 H3, and methyl + acetyl/2 H3 modification of lysine; 2.0 ppm precursor ion tolerance; 0.50 Da fragment ion tolerance. A reverse sequence decoy search was also performed. Peptide and protein identifications were validated using Scaffold (version 4.8.7; Proteome Software). Peptide and protein identifications were accepted with a 1% FDR which was assigned by the Protein Prophet algorithm.27 The spectrum report was then exported for further analysis by PTMViz (https://github.com/ByrumLab/PTMViz). PTMViz was utilized to identify writers, erasers, and readers for methylation and acetylation (WERAM) modifying proteins for significant histone post-translational modifications. Briefly, PTMViz analyzes histone post-translational modification peptides (PTMs) using the relative abundances of PTMs and performs a limma moderated t-test to identify significant modifications.
Multi-omics data annotation was curated by mapping the transcript version of the Illumina EPIC manifest Regulatory Features, the “gencode.v12.annotation.gtf.gz”, and the “genecode.v12.metadata.SwissProt.gz” files in order to retrieve consistent annotation for the CpGs grouped by promoter regions, gene expression, protein expression, and phosphopeptide data sets. The annotation data was mapped based on the transcript ID from each source to provide a final table consisting of the GenecodeCompV12 accession, Illumina Regulatory Feature name, transcript ID, gene ID, gene name, UniProtKB AC, and UniProtKB ID. The DNA methylation and gene expression data as well as the gene and protein data sets were matched based on Ensembl IDs. Proteins and phosphorylated peptides data sets were matched based on UniProtKB IDs. Pearson correlation analysis was performed using the significant features from each data set, which corresponds to the four outer quadrants on Fig. 3, to identify the relationships between DNA methylated promoters and the genes they regulate, between gene and protein expression, and between protein expression and phosphorylated proteins.
The DNA methylation promoter regions, gene, protein, and phosphopeptide normalized expression data was analyzed using MixOmics, which integrates multi-omics data to identify correlated variables measured on heterogeneous data sets and explain the categorical outcome of interest (supervised analysis). MixOmics applies matrix factorization using the supervised projection to latent structures (PLS) models for data integration to reduce the dimension, capture and explain the variation in the data that discriminate between MCF10A, MDA-MB-231, and the HCC1937 cell lines. We used the Data Integration Analysis for Biomarker discovery (DIABLO) with the sparse partial least squares regression (sPLS) method with tuning parameter set to keep the top 15 and 10 features for each data set corresponding to principal component 1 and 2, respectively.
Multi-omics gene-set analysis (MOGSA) was applied to the same data set as MixOmics, except only features with gene symbols were included, in order to investigate data integration at the pathway/gene set level. MOGSA is a multivariate single sample gene-set analysis method that integrates multiple experimental and molecular data types measured over the same set of samples.28 The method learns a low dimensional representation of most variant correlated features (genes, proteins, etc.) across multiple omics data sets, transforms the features onto the same scale and calculates an integrated gene-set score from the most informative features in each data type. A gene-set with a high GSS are driven by features that explain a large proportion of the global correlated information among data matrices and can be from one or all data matrices.
There were large differences in expression among each of the three cell lines providing for valuable data for integration and pathway analyses. First, the DNA methylation status of the promoters that regulate gene expression from each cell line was evaluated using Illumina EPIC Beadchip array technology. A total of 865077 CpG probes were analyzed from the Beadchip array. We performed pair-wise comparisons of the three cell lines and found 96% of the significant CpGs were common among the pair-wise comparisons of the three cell lines. From these probes, 616053 CpGs were associated with a gene and after averaging the M-values of the CpGs by promoter regions, a total of 86168 regulatory features were analyzed for each group comparison. A total of 4175, 2992, 4050 significantly hypomethylated and 6393, 8528, 7663 significantly hypermethylated promoter regions were identified from the MDA-MB-231 vs. MCF10A, HCC1937 vs. MCF10A, and HCC1937 vs. MDA-MB-231 cell line comparisons, respectively. The M-values for the top significantly hypermethylated/hypomethylated promoter regions are shown as a heatmap that reveals methylation patterns across the three phenotypes (Fig. S1, ESI†). Certain genes are activated or repressed in TNBC compared to the control as well as some genes that are regulated differently in the HCC1937 BRCA1mu compared to both MDA-MB-231 BRCA1wt and the MCF10A control, which may be indicative of the more aggressive phenotype. One such gene is the SLFN11, in which the promoter is hypomethylated in HCC1937 compared with the other two cell lines, which are hypermethylated (Fig. S1, ESI†). SLFN11 serves as an S-phase checkpoint preventing the survival of cells after accumulation of DNA damage and replication stress.29 SLFN11 has shown a wide range of gene expression in TNBC; however, when studied together with BRCAness phenotype, it was shown that tumors with high SLFN11 expression and BRCA1 and BRCA2 germline mutations had increased response to irinotecan treatment, which is a Topoisomerase I inhibitor.29
RNA-sequencing and phosphoproteomic analysis was performed on the same cell lines and growth conditions as the DNA methylation data to investigate differential gene expression and protein abundance between the three cell lines. Out of a total of 16718 genes analyzed, 7226, 7506, and 7724 genes were differentially regulated with an FDR p-value < 0.05 and an absolute fold change > 2 between the MDA-MB-231 vs. MCF10A, HCC1937 vs. MCF10A, and HCC1937 vs. MDA-MB-231 cell line comparisons, respectively (Fig. S2, ESI†). A total of 7634 proteins were quantified with 1350, 1997, and 1968 identified as differentially abundant in the same pair-wise comparisons above with a FDR adjusted p-value < 0.05 and an absolute fold change > 2 (Fig. S2, ESI†). Interestingly, SLFN11 was slightly elevated in the RNAseq gene expression data in MDA-MB-231 compared to MCF10As but had significantly higher expression in the HCC1937 cells compared to both MCF10As and MDA-MB-231 cells (Supplemental file 2, ESI†).
Additionally, we quantified 11242 single phosphopeptide sites with 3243, 4490, and 4249 significantly differentiated between the MDA-MB-231 vs. MCF10A, HCC1937 vs. MCF10A, and HCC1937 vs. MDA-MB-231 cell line comparisons. Phosphorylated peptides were analyzed using PHOXTRACK (PHOsphosite-X-TRacing Analysis of Causal Kinases) to identify key regulating proteins by mapping phosphopeptides to putative kinases.26 Significant phosphopeptides from both MDA-MB-231 and HCC1937 TNBC cells compared to the control identified kinases such as CDK1, CDK2, and PAK4 (Fig. 2A and Supplemental file 4, ESI†). CDK1 kinase substrates include TP53-S315, KAT7-T88, STMN1-S25 and STMN1-S38, which were hyperphosphorylated in TNBC. PAK4 substrates PAK4-S181, S167, S104, S148, and S309 were also all hyperphosphorylated in both TNBC cell lines (Fig. 2B). PAK4 is an upstream regulator of JNK and is an important regulator of cytoskeleton remodeling. PAK4 is commonly overexpressed in human cancer30–32 and affects motility, invasion, metastasis and growth. Kinases regulating differentiating phosphopeptides between the subtypes of TNBC include LYN and LATS2. LYN is a non-receptor tyrosine kinases (nRTK) member of the SRC family and functions as an oncogene in breast cancer.4,6,33 The Lyn protein was found to be hyperphosphorylated at position Y397 in the BRCA1mu compared to BRCA1wt. In addition, a second SRC family member, HCK was also hyperphosphorylated at position Y411 by LYN kinase (Fig. 2C).
Other proteins of interest include those that write, read, and erase histone post-translational modifications. These proteins were identified from the Writers, Erasers, and Readers of Acetylation and Methylation (WERAM, version 1.0) database34 and are highlighted in Fig. S2 (ESI†). HDAC4 was significantly altered in HCC1937 compared to control, where it was reduced at both the transcription and protein level. This is not surprising since missense mutations were detected in HDAC4 specifically in HCC1937.35 In MDA-MB-231, only HDAC1 was significantly altered compared to control, where its transcription levels were elevated. Expression of HDACs is diverse and redundant across different TNBC subtypes. This redundancy is crucial especially for the BRCA1 mutant subtypes. Inhibition of HDACs has been shown to be lethal because this causes extensive DNA damage which cells deficient in BRCA1 cannot tolerate since their homologous recombination DNA repair mechanisms are impaired.36 HDAC4 has been shown to be involved in MTA1-mediated epigenetic regulation of ESR1 expression in breast cancer.35 This protein deacetylates HSPA1A and HSPA1B at Lys-77.
DNMT1 is overexpressed at the transcriptional level in HCC1937 compared to control, which is in line with the reported association between higher levels of DNMT1 and increased transformation.37,38 Interestingly, DNMT3A protein levels are increased in HCC1937 but decreased in MDA-MB-231 compared to control. DNMT3A is a histone H3K36me2 and H3K36me3 reader and acts as a transcriptional corepressor for ZBTB18.39 DNMTs function to methylate DNA and recruit NSD1 and NSD2 histone methyltransferases to specific genomic locations. NSD2 was found to be up-regulated in MDA-MB-231 compared to both HCC1937 and control cell lines and may be an indication of changes about to occur in the methylation patterns on H3K36 into H3K36me1/2. It is unclear the role of H3K36me and this modification was found to be increased greater than 2-fold in the HCC1937 cells compared to MCF10As.40 However, histone H3K36me2 has a clear role in double-strand break repair.41 Taken together, DNMTs work in conjunction with NSD2 to methylate H3K36 to H3K36me1/2, which plays a central role in transforming chromatin from heterochromatin into euchromatin; allowing for the upregulation of genes associated with cancer phenotypes.40 Histone H3K36me2 recruits DNMT3A and shapes the methylation landscape of intergenic regions.39 A previous study showed that DNMT protein levels, and not transcription levels, predict sensitivity to the demethylating agent, decitabine.42 Therefore, this discrepancy in DNMT levels among the TNBC cell lines may be of clinical significance, which warrants further research.
We also identified 2-fold enrichments of H3K9me2/3, H3K36me, H3K79me, and H4K20me modifications among the three cell lines (Table 1 and Supplemental file 5, ESI†). The log2 fold change values for each sample group comparison and each histone modification is listed in Table 1. We also included proteins identified as writers, readers, and erasers of the significant histone PTMs.
Histone PTM | log2 fold change (MDA-MB-231 vs. MCF10A) | log2 fold change (HCC1937 vs. MCF10A) | log2 fold change (HCC1937 vs. MDA-MB-231) | PTM modification proteins (WERAM) |
---|---|---|---|---|
H3k9me2 | 0.22 | −1.737 | −1.956 | MECOM, PRDM2,PRDM16, PRDM8, PRDM2, EHMT1, EHMT2, JHDM2A, JMJD2C, JMJD2B, TRIM28 |
H3k9me3 | −3.319 | −4.735 | −1.416 | MECOM, PRDM2, PRDM16, PRDM8, PRDM2, EHMT1, EHMT2, JHDM2A, JMJD2C, JMJD2B, TRIM28 |
H3k36me | 0.883 | 1.579 | 0.696 | WHSC1, NSD1, NSD2, SMYD2, SETMAR, SETD2, ASH1L, METNASE, SETD3 |
H3k79me | 1.601 | 1.905 | 0.304 | DOT1 |
H4k20me | −0.817 | −1.994 | −1.177 | WHSC1, SUV420H1, SUV420H2, SETD8, L3MBBTL1 |
MixOmics supervised analysis was used to identify key features from multi-omics data that can clearly separate normal epithelial cells from triple negative breast cancer with and without a BRCA1 mutation. We used the established DIABLO method to integrate the same biological samples from each data set. The first principal component clearly separated the HCC1937 cell line from both MDA-MB-231 and MCF10As while the second principal component distinguishes between MDA-MB-231 and the HCC1937 and MCF10As. Important features in component 1 include TGFB1, TGFBR2, KLF6, KLF12, PIK3R3 and VIM, while features important from component 2 include NES, RASL11B, HOXC9, LAMB3, PRKCD, PRKCE, and MELK to name a few (Fig. 4A and Fig. S3, ESI†). TGFB1 expression and its receptors, TGFBR1 and TGFBR2, lead to the induction of canonical and noncanonical TGFβ signaling pathways and is correlated with oncogenic activity.43 In early stage breast cancer, TGFβ signaling shows tumor suppressive effects; however, in late stages, TGFB1 is linked with increased tumor progression, metastasis, and epithelial to mesenchymal transition (EMT).43 TGFB1 and TGFBR2 had decreased gene expression in HCC1937 compared to the other cell lines. KLF6 and KLF12 (Kruppel-like factors) are DNA-binding transcriptional regulators which play essential roles in proliferation, differentiation, apoptosis, and migration.44 In breast cancer, KLFs function in the process of EMT, invasion, and angiopoiesis.44 These genes were up-regulated in MCF10As and MDA-MB-231 cell lines. VIM is a marker for mesenchymal cell types and was found to be hyperphosphorylated at S214 in the MCF10A and MDA-MB-231 cell lines.45 NES was hyperphosphorylated at S768, S1496, and S1498 in the MDA-MB-231 cell line. NES promotes the disassembly of phosphorylated vimentin intermediate filaments during mitosis and is associated with reduced survival rates in basal-like subtypes of breast cancer.46 PRKCD, PRKCE, and MELK are non-receptor serine/threonine protein kinases and part of the VEGF and Wnt signaling pathways.47–50 Maternal embryonic leucine zipper kinase (MELK) shows increased gene expression in MDA-MB-231 cells. Silencing of this regulator leads to programmed cell death in MDA-MB-231 cells as it functions as a modulator of intercellular signaling through the apoptosis signal-regulating kinase/Jun N-terminal kinase (JNK) pathway, p38 signaling, and NF-kB pathway.51
Multi-omics gene-set analysis (MOGSA) was used to integrate DNA methylation at regulatory elements, gene expression, protein abundance, and phosphorylated peptides from the same set of samples. The method uses a low dimensional representation of most variant correlated features across the different data types and transforms the features to the same scale. The integrated gene-set score is then calculated from the most informative features in each data type.28 Interestingly, significant hallmark pathways differentiating the three cell lines include Notch signaling, PI3K/AKT/MTOR signaling, TNFα signaling via NFκB as up-regulated in TNBC BRCA1wt; KRAS signaling, MYC targets, and DNA repair as up-regulated in TNBC BRCA1mu (Fig. 4B). Epithelia Mesenchymal transition (EMT) was found to be increased in MCF10A and MDA-MB-231, which is consistent with the increased features VIM and TGFB1 from the MixOmics feature extraction analysis. TGFβ signaling was found to be down-regulated in the TNBC BRCA1mu cell line compared to both normal and BRCA1wt cell lines (Fig. 4B).
We investigated the features from each multi-omics level of regulation to see which data had the most influence in identifying TGFβ signaling as an important pathway. Fig. 4C displays the gene influential score for the TGFβ signaling gene features from DNA methylation, mRNA, protein, and the phosphorylation data sets. The data-wise decompose gene set score for a subset of these pathways is shown in Fig. S4 (ESI†). The majority of the features included data from at least two levels of regulation and all included data from RNAseq. By using a multi-omics approach, we are able to identify the majority of features in the pathway analysis but also identify the activated SMAD5 phosphorylation. It is worth noting that TGFβ-stimulated TGF-Beta Receptor Type I phosphorylates of SMAD1/5 to promote cancer cell migration but not in normal cells.52 Interestingly, we also observed that TGFBR1 protein expression and SMAD5 phosphorylation are up-regulated in the TNBC MDA-MB-231 cells compared to MCF10A, a normal mammary epithelial cell (Fig. 3 and Supplemental files 3 and 4, ESI†). This interesting finding raises the possibility to target TGFβ–TGFBR1–SMAD1/5 axis to inhibit the pro-tumorigenic functions of TGF-β signal transduction pathway. The DNA methylation and RNAseq data is freely available via GEO GSE171958 and the proteomics data is available via the ProteomeXchange PXD025238.
The emergence of concepts such as BRCAness further emphasize the need for understanding the systems-level biology of breast cancer. BRCAness refers to the phenomenon where tumors with functional BRCA1/2 display phenotypes similar to those of BRCA1/2-deficient tumors.53 This has been attributed to deficiency in homologous repair pathways in the BRCA-functional tumors, making them potentially susceptible to treatments usually reserved to BRCA1/2-mutant tumors such as PARP inhibitors (PARPi).53–55 Thus, strictly relying on the genomic information of the functional status of BRCA1/2, as opposed to considering the molecular mechanisms of the cells as a whole, may limit potentially impactful treatments for a subset of patients.
Our multi-omics integrations emphasize a stark discrepancy in expression of key signaling pathways among the TNBC subtypes. Specifically, the PI3K/AKT/mTOR pathway is differentially important for the two TNBC cell lines studied here, a conclusion that can be clearly drawn upon examining the MOGSA heatmap (Fig. 4B).56 Given that this pathway is among those most frequently mutated in TNBC, one may conclude that it is targetable. However, this treatment strategy may be ineffective in a subset of TNBCs (i.e., BRCA1 mutants) since this pathway is downregulated in the HCC1937 cell line (Fig. 4B). Our data highlight the inadequacy of this treatment strategy as this diversity among the cell lines is the underlying reason behind failure of treatments which targeted this pathway.54,56
The PI3K/AKT/mTOR pathway is frequently aberrant in cancer. Specifically, PIK3CA is enriched in 10% of TNBC cases,57–59 and loss-of-function PTEN occurs in one-third of TNBCs.60 Despite the high occurrence of mutations in PIK3CA among different TNBC subtypes, we did not observe a significant difference in its gene or protein expression levels.61 This may be explained by the observed upregulation of MYC target genes (Fig. 4B), as overexpression of MYC has been shown to downregulate the PI3K/AKT pathway.62 Additionally, we find that PTEN is of reduced transcription, protein and phosphorylation levels, and its promoter is hyper-methylated, in HCC1937 compared to the other two cell lines. Whereas PTEN transcription and protein levels were unchanged and its phosphorylation levels were increased when comparing MDA-MB-231 to control. This is in alignment with reports of the BL1 TNBC subtype, a subtype of MDA-MB-231 cells, expressing a low level of PTEN.61 The PI3K/AKT/mTOR pathway is activated by EGFR. This pathway is constitutively activated in some TNBC subtypes. Further analysis of the molecular mechanisms leading to this phenotype revealed that loss of PTEN and activating mutations in PI3KCA lead to activation of mTOR. The HCC1937 cell line is a known PTEN-null mutant63 – this is corroborated by our studies as mentioned above (Fig. 3). The loss of PTEN would suggest that the PI3K/AKT/mTOR pathway is hyper-activated since PTEN is a negative regulator of mTOR. However, our multi-omic data integration analyses indicate that this pathway is unexpectedly downregulated in HCC1937. Indeed, this apparent paradox may be explained by a report that demonstrated the lack of therapeutic benefits when combing EGFR and mTOR inhibitors (gefitinib and everolimus, respectively) to treat the HCC1937 cell line, whereas these drugs showed a favorable synergistic effect when treating TNBC cell lines with an activating PI3KCA mutation.64 Our data indicate that the HCC1937 cell line did not respond to the mTOR inhibitor, despite the absence of PTEN, possibly because the PI3K/AKT/mTOR pathway as a whole is downregulated (Fig. 4B). Additionally, PIK3CA mutant breast cancer cells are selectively sensitive to mTOR inhibitors but not the cells with PTEN loss of function. Suggesting two quite distinct functional consequences of the PI3K/AKT/mTOR pathway in breast cancer.65 Conversely, our integrated analysis demonstrate that PI3K/AKT/mTOR pathway is upregulated in the MDA-MB-231 cell line (Fig. 4B). This is in agreement with previous findings that this cell line responds to co-inhibition of EGFR and mTOR with lapatinib and rapamycin, respectively.66,67
Further, we offer a potential molecular basis for the lack of therapeutic advantage to combining EGFR and mTOR inhibitors to treat HCC1937.64 It would have been reasonable to target this TNBC subtype with such inhibitors since EGFR protein levels are elevated and PTEN is deleted in the HCC1937 cell line (Fig. 3B). However, our multi-omic analysis of this cell line clearly predicts the failure of co-inhibiting EGFR and mTOR as the PI3K/AKT/mTOR pathway is downregulated. Interestingly, the DNA methylation and gene expression data point to an increase in SLFN11 in the HCC1937 cells, suggesting a role for TOP1 inhibitor therapy for a subset of TNBC tumors.29
In contrast to reports which showed a concomitant increase in Wnt signaling along with reduced PTEN levels, we find that the Wnt/β-catenin pathway is downregulated in HCC1937 compared to MDA-MB-231.68 This may be explained by a previous report that established an interaction between BRCA1 and nuclear β-catenin.69 This report demonstrates that functional BRCA1 enhances the levels of the nuclear active form of β-catenin, whereas mutant BRCA1, such as that in HCC1937, limits its expression. Thus, the Wnt pathway genes, whose transcription is activated via β-catenin, may not be expressed in HCC1937 leading to the observed downregulation of the pathway in these cells. This further accentuates the heterogeneity even within TNBC subtypes as we demonstrate diversity in Wnt/β-catenin expression within the BL1 subtype. We do find however that MYC targets are increased in our integrated dataset, in correlation with the aforementioned report, which shows that high MYC pathway activity correlates with increased risk in the BL1 subtype (Fig. 4B).70
Overall, downregulation of signaling pathways downstream to EGFR is evident in HCC1937 compared to MDA-MB-231. This is reflected in their divergent sensitivities to inhibitors of key signaling pathways. For instance, MDA-MB-231 is sensitive to the MAPK pathway MEK1/2 inhibitor, selumetinib, whereas HCC1937 is not.71 In addition, it is not surprising that our integrated analyses show downregulation of key signaling pathways in HCC1937 cells. It is of note that the EGFR-activated pathway, PI3K/AKT/mTOR, and the TGFβ pathway interact through the MAPK pathway.72
Protein tyrosine kinases (PTK) are important regulators of several signal transduction pathways, shaping biological processes like cell proliferation, differentiation, migration, and apoptosis in response to internal and external stimuli.73 In breast cancer, amplification and activation of ERBB2 results in ERBB2 (HER2)-positive tumors. Additionally kinase-based gene expression signatures (GESs) is also reported in luminal BCs74 and in basal breast cancers.75 Several PTKs have been implicated and explored as potential therapeutic targets in cancer, nevertheless, the aberrant PTK signaling, and functional redundancy make them a hard target for therapeutic regiment. Recently, we reported that breast tumor kinase (BRK), promotes metastatic potential by phosphorylating SMAD4, the major component of TGFβ/SMAD signaling in mammary epithelial cells.76 However, the inhibition of the kinase activity of BRK with small molecules did not inhibit cancer cell growth77 – suggesting functional redundancy and kinase-independent roles of BRK in breast cancer. In this study, we also observed that besides ptk6/BRK (Supplemental files 2 and 4, ESI†), non-receptor protein tyrosine kinase LYN is hyperphosphorylated (Fig. 2C). Unfortunately, we were not able to detect SMAD4 phosphorylation in this data set. Of note, we will further interrogate whether LYN kinase beside BRK also regulates TGFβ/SMAD signaling in addition to BRK in TNBC cells (Fig. 5).
For example, the value of conducting multi-omics analysis is demonstrated in Carbonic anhydrase 2 (CA2), where we observe a significant reduction in mRNA levels despite a significant increase in protein levels – this is true for both cancer cell lines compared to control (Fig. S2, ESI†). CA2 upregulation is associated with poor prognosis, enhanced tumor progression and metastasis of breast cancer.78 The reliance on mRNA levels alone may have missed the identification of this protein. The necessity of integrating the multi-omics profile of tumors is exemplified by the LOTUS trial. Patients with aberrant PI3K/AKT/mTOR pathways benefitted most and some with loss of PTEN did not benefit.66 These results solidify our observations that for some TNBC subtypes, aberrations of the PI3K/AKT/mTOR pathway do not necessarily arise with the loss of PTEN. This emphasizes the importance of carefully tailoring therapies to pathways activated in the specific tumor. This is best achieved by considering the genomic, transcriptomic and proteomic profile of TNBCs.
Footnotes |
† Electronic supplementary information (ESI) available: Supplemental file 1: Illumina EPIC beadChip array DNA methylation normalized values and statistical results. Supplemental file 2: RNA-seq normalized counts and statistical results. Supplemental file 3: Protein normalized MS3 intensities and statistical results. Supplemental file 4: Phosphopeptide normalized MS3 intensities and statistical results. Supplemental file 5: Histone post-translational modifications normalized MS1 intensities. Supplemental file 6: DNA methylation of promoter regions and RNAseq gene expression common features and correlation results from Fig. 3. Supplemental file 7: RNAseq and protein common features and correlation results. Supplemental file 8: Protein and phosphorylated peptide features and correlation results. See DOI: 10.1039/d1mo00117e |
‡ These authors have contributed equally to this work. |
This journal is © The Royal Society of Chemistry 2021 |