Defining the human copper proteome and analysis of its expression variation in cancers †

Copper (Cu) is essential for living organisms, and acts as a cofactor in many metabolic enzymes. To avoid the toxicity of free Cu, organisms have specific transport systems that ‘chaperone’ the metal to targets. Cancer progression is associated with increased cellular Cu concentrations, whereby proliferative immortality, angiogenesis and metastasis are cancer hallmarks with defined requirements for Cu. The aim of this study is to gather all known Cu-binding proteins and reveal their putative involvement in cancers using the available database resources of RNA transcript levels. Using the Uniprot.org database along with manual curation, we identified a total of 54 Cu-binding proteins (named the human Cu proteome). Next, we retrieved RNA expression levels in cancer versus normal tissues from the TCGA database for the human Cu proteome in 18 cancer types, and noted an intricate pattern of up- and downregulation of the genes in different cancers. Hierarchical clustering in combination with bioinformatics and functional genomics analyses allowed for the prediction of cancer-related Cu-binding proteins; these were specifically inspected for the breast cancer data. Finally, for the Cu chaperone ATOX1, which is the only Cu-binding protein proposed to have transcription factor activities, we validated its predicted over-expression in patient breast cancer tissue at the protein level. This collection of Cu-binding proteins, with RNA expression patterns in different cancers, will serve as an excellent resource for mechanistic–molecular studies of Cu-dependent processes in cancer.


Introduction
Cu is essential to living organisms and acts as a cofactor in key enzymes, but the redox properties that allow the metal to impart activities to proteins also imply the potential for toxicity. 1 To minimize the toxic effects and strictly regulate the temporal and spatial distributions of Cu, organisms have evolved elaborate systems for the uptake, intracellular transport, protein loading and storage of Cu. In human cells, after internalization of Cu by the Cu importer CTR1, 2 Cu is distributed in the cytoplasm via at least three pathways: 3 in the secretory path, the cytoplasmic Cu chaperone ATOX1 delivers Cu to the Cu-transporting P1B-type ATPases ATP7A and ATP7B, also called the Menkes and Wilson disease proteins, respectively, in the trans-Golgi network. Once transferred to ATP7A/B, Cu is channeled to the lumen and, upon ATP hydrolysis, loaded onto many target Cu-dependent enzymes, e.g., ceruloplasmin (CP) and lysyl oxidase (LOX). 4,7,8 LOX is also a secreted Cu-dependent enzyme and crosslinks extracellular matrix proteins. In addition to the secretory pathway, there are two other paths for Cu transport in the cytoplasm. In one, the Cu chaperone for superoxide dismutase 1 (CCS) directs Cu to cytoplasmic Cu/Zn superoxide dismutase 1 (SOD1); in the other, Cu is transported to mitochondria by COX17, and with help from additional proteins (i.e., SCOs and COX11), Cu is incorporated into cytochrome c oxidase (COX1 and COX2).
Since Cu is a key component in many essential enzymes, 1,9,10 it is not surprising that Cu is required for at least three characteristic phenomena involved in cancer progression: proliferative immortality, angiogenesis, and metastasis. 11,12 In support of increased Cu demand, cancer tissue and cancer patients' serum have increased Cu levels, whereas serum levels of other metals (i.e., iron, zinc) are often lower than normal. 11 Cancer progression involves uncontrolled growth of cells, followed by cancer cell invasion, dissemination and secondary tumor formation at local and distant sites. For a tumor to grow larger than a few mm, angiogenesis is needed. Cu influences several molecular pathways that induce a pro-angiogenic response, 13 including direct binding to, and promoting the expression of, angiogenic factors. 14,15 Cu also influences the ability of cancerous cells to metastasize through activation of metabolic and proliferative enzymes. 11 For example, LOX is secreted by cancer cells to create a pro-metastatic niche. 16,17 Recent unprecedented findings suggest that ATOX1 has transcription factor activities connected to cancer progression. 18 ATOX1 acts as a Cu-dependent transcription factor 19 that promotes the expression of cyclin D1, a key protein involved in the cell cycle and cell proliferation, 20 and the Cu-dependent enzyme SOD3, a major extracellular antioxidant protein and a protectant against hypertension. 21,22 ATOX1 also promotes inflammatory neovascularization by acting as a Cu-dependent transcription factor for NADPH oxidase, 23 and is essential for platelet-derived growth factor-induced Cu-dependent cell migration, thus potentially regulating malignant angiogenesis and vascular remodeling. 24
By analogy with ATOX1, CCS transports Cu to the nucleus and regulates the formation of the hypoxia-inducible factor 1 (HIF-1) transcriptional complex, which in turn promotes the expression of the vascular endothelial growth factor (VEGF) and, thereby, tumor growth. 25 Our research group recently confirmed the presence of ATOX1 in the nucleus of mammalian cells, but found no binding to the proposed DNA promoter sequence in vitro. 26 Nonetheless, from an extensive yeast two-hybrid screening, using ATOX1 as a bait, several new human protein partners were detected: 27 interestingly, among the confident hits, several proteins are related to cancer (e.g., CPEB4, DNMT1, and PPM1A that regulate RNA transcription, DNA methylation, and phosphorylation, respectively 28-31).
To get a global view, we established here the human copper proteome (i.e., the collection of all identified human Cu-binding proteins) and then analyzed their RNA transcript level changes in different cancer tissues using information taken from the publicly available TCGA database. The Cancer Genome Atlas, or TCGA, is a collaboration between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI) that has generated comprehensive, multi-dimensional maps of genomic changes in many different cancers. It was recently updated to include RNA transcript levels for 25 different cancer types (GDC data portal accessed on 22 Aug 2016). Based on functional genomics and bioinformatics analysis of the extracted expression level data, we identified Cu-binding proteins that appear important in cancer and, finally, in a proof-of-principle experiment, we confirmed the upregulation of the ATOX1 protein in patient breast cancer tissue.

Data mining
Cu-binding proteins were identified using Uniprot.org, literature review, and manual curation. As a first screening, genes were extracted from Uniprot.org using the search terms 'human + copper'. We then manually checked each gene in Uniprot and selected only the ones defined as Cu-binding or Cu-transport. SPARC, MEMO1 and MAP2K1 were added on the basis of literature reports. 32-34 This resulted in a total of 54 proteins that can be classified as Cu-binding. Subcellular localization of the 54 Cu-binding proteins was retrieved from the section 'Localization' in Genecards.org. Protein function was extracted from Uniprot.org and literature.

RNA-seq data analysis
The normalized gene expression quantification data for 54 copper-binding genes and 18 tumor types were downloaded from the GDC data portal. The values are upper quartile normalized FPKM count estimates for sequence reads mapping to the genes. The log2-fold change (lfc = log2(tumor/normal)) and p-values (t-test) were calculated from the normalized expression data for each gene and tumor type. To visualize the similarity in the gene expression patterns, a heat map was created with R version 3.3.0 (http://www.r-project.org) using the heatmap.2 function in R's 'gplots' package (version 3.0.1). The genes and tumors were classified based on the Pearson correlation coefficients as a similarity measure in gene expression and Ward2's hierarchical clustering method. The classifications for interpretation of the strengths of the Pearson's correlations were implemented according to Cohen. 35 The proteogenomics heat map in Fig. 4 is based on data presented by Mertins et al. 36 following PAM50-defined molecular subtypes. 37
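The per-gene statistics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fold change is taken here as the ratio of group means, the t statistic uses Welch's unequal-variance form (the text says only "t-test"), and the function names and expression values are hypothetical. A p-value would additionally require the t-distribution CDF, which is not implemented here.

```python
from math import log2, sqrt
from statistics import mean, variance

def log2_fold_change(tumor, normal):
    """lfc = log2(mean tumor expression / mean normal expression) (assumed form)."""
    return log2(mean(tumor) / mean(normal))

def welch_t(tumor, normal):
    """Welch's two-sample t statistic (unequal variances; an assumption)."""
    se = sqrt(variance(tumor) / len(tumor) + variance(normal) / len(normal))
    return (mean(tumor) - mean(normal)) / se

# Hypothetical normalized FPKM values for one gene in one tumor type.
tumor_fpkm = [8.0, 8.0, 9.0, 7.0]
normal_fpkm = [2.0, 2.0, 3.0, 1.0]

lfc = log2_fold_change(tumor_fpkm, normal_fpkm)  # means are 8 and 2
t_stat = welch_t(tumor_fpkm, normal_fpkm)
print(f"lfc = {lfc:.2f}, t = {t_stat:.2f}")
```

A positive lfc corresponds to upregulation in tumor relative to normal tissue, a negative value to downregulation.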

Cluster and GO (gene ontology) term enrichment analysis
The Cytoscape software (http://www.cytoscape.org) 38 was used together with the GeneMANIA app 39 to generate interaction networks for all clusters of Cu-binding proteins identified in the heat map. The global parameters were: (i) data source, selected considering the known co-expression, and physical (protein-protein) and genetic interactions; (ii) maximum resultant genes, set to double the size of each cluster; and (iii) an automatic weighting method to assign values to each node. Genes in the query get the maximum node size in the Cytoscape graph, whereby the size of a related gene is inversely proportional to the rank of that gene in the list, sorted using the score assigned by GeneMANIA. In a separate (top-down) approach, we analyzed the overall GO term enrichment by calculation of the weighted average log2-fold change of the expression of all genes in cancer, with a cut-off value of 0.6, followed by ranking of the genes by the absolute value of their log2-fold change (i.e., over- and underrepresented genes above the cut-off taken together). Then, we used this ranked and nonredundant list of gene identifiers as input for the GO enrichment analysis tool GOrilla (http://cbl-gorilla.cs.technion.ac.il/), 40 with the p-value threshold set to 10⁻³ and retrieving the most represented biological processes. The resulting table, ranked by the FDR q-value (i.e., a correction of the above p-value using the Benjamini and Hochberg method), was fed to the ReviGO platform (http://revigo.irb.hr/) 41 to visualize a semantic similarity-based scatterplot. The circles' color represents the log10 p-value.

This journal is © The Royal Society of Chemistry
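The two numerical steps of the top-down approach, ranking genes by absolute log2-fold change above the 0.6 cut-off and correcting p-values with the Benjamini and Hochberg method, can be sketched as follows. This is a minimal standard-library illustration and not the GOrilla implementation; the function names and example values are hypothetical.

```python
def rank_by_abs_lfc(lfc_by_gene, cutoff=0.6):
    """Return gene identifiers with |lfc| >= cutoff, largest |lfc| first."""
    kept = [g for g, v in lfc_by_gene.items() if abs(v) >= cutoff]
    return sorted(kept, key=lambda g: abs(lfc_by_gene[g]), reverse=True)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg FDR q-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals

# Hypothetical inputs, for illustration only.
ranked = rank_by_abs_lfc({"GENE_A": 1.2, "GENE_B": -2.0, "GENE_C": 0.3})
qvals = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(ranked, [round(q, 3) for q in qvals])
```

Up- and downregulated genes end up in the same ranked list because the sort key is the absolute value, which matches the "taken together" wording above.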

Patient material
The breast cancer material consists of tissue microarrays (TMAs) of biopsies originating from a cohort of post-menopausal breast cancer patients enrolled in a randomized clinical trial between 1976 and 1990. Briefly, breast cancer patients with a tumor diameter ≤30 mm and no lymph node involvement were included. The patients were randomized to receive tamoxifen for two years or no endocrine treatment. Nottingham histological grade and immunohistochemical staining of ER, PR and HER2 were performed retrospectively according to standardized methods used in clinical routine. 42 Retrospective studies of biomarkers were approved by the regional ethical review board at the Karolinska Institute, Stockholm, Sweden (KI 97-451, with amendment 030201). The four molecular breast cancer subtypes were defined as follows: luminal A, ER+/PR+/HER2−/low mitotic count; luminal B, ER+/(high mitotic count and/or PR− and/or HER2+); HER2-subtype (ER−/HER2+); and triple negative breast cancer (TNBC; ER−, PR− and HER2−).

Immunohistochemical (IHC) staining of ATOX1 in breast tissue
The paraffin-embedded sections were incubated at 65 °C for 2 h, followed by deparaffinization in Aqua de Par 10X Ancillary reagent (Biocare Medical, Concord, California) for 20 min. The sections were then subjected to treatment with Borg Decloaker RTU antigen retrieval solution (Biocare Medical, Concord, California) for heat induced epitope retrieval (HIER) using a Decloaking Chamber NxGen pressure cooker (Biocare Medical, Concord, California) programmed in a temperature cycle to reach a maximum of 110 °C for 5 min. After cooling, the slides were rinsed in tap water prior to washing in TBS (Tris-buffered saline). After blocking endogenous peroxidase by incubation in Peroxidazed 1 (Biocare Medical, Concord, California) for 5 min, the slides were washed in TBS and incubated with the mouse monoclonal anti-ATOX1 primary antibody (ab54865, Abcam) for 30 min at room temperature (RT) followed by incubation with the secondary antibody MACH4 Universal HRP (horseradish peroxidase)-Probe and the tertiary antibody MACH4 Universal HRP-Polymer (Biocare Medical, Concord, California) for 10 min each at RT. After HRP detection, the tissue section was rinsed in TBS. For visualization, one drop (32 µl) of DAB chromogen per 1.0 ml of DAB substrate buffer (Biocare Medical, Concord, California) was applied to the tissue sections followed by incubation for 5 min at RT. After rinsing in deionized water, the sections were counterstained with Mayer's haematoxylin staining solution (Histolab Products AB, Göteborg, Sweden) and mounted with Pertex mounting media (Histolab Products AB, Göteborg, Sweden). The immunostaining was scored by two independent observers based on the intensity in the epithelial cells without knowledge of clinicopathological and biological information. The staining intensity was scored according to the following criteria: negative (no brown), weak (light brown), moderate (brown), and strong (dark brown) staining. In total, 67 breast cancer tissue samples were reliably analyzed along with four normal breast tissue samples. (Due to the limited sample number, a statistical comparison between the scoring data for the subtypes and the control tissue, calculated using Fisher's exact test, resulted in no statistical significance.) Notably, matching samples of normal and cancerous breast tissue from the same patient are desired to obtain higher statistical power (not available for this dataset).
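For readers unfamiliar with Fisher's exact test, the following sketch computes a two-sided p-value for a 2×2 contingency table from the hypergeometric distribution. The layout of the table actually tested is not given in the text, so the collapse into two staining categories and two tissue groups is an assumption, and the counts are invented for illustration only.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Invented counts: rows = cancer / normal, columns = stained / unstained.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

With only four control samples, even a pronounced difference in staining frequency gives a large p-value, which is consistent with the lack of significance noted above.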

I. The human Cu proteome
To gather all human proteins known to bind Cu, we used Uniprot.org as a source and identified all proteins classified as Cu binding or Cu transporting (Table S1, ESI †, with proposed functions of each protein, if reported). To the list, we added SPARC, MEMO1 and MAP2K1 as Cu-binding proteins, based on literature reports, 32-34 although they are not yet described as such in UniProt. Next, we used stringent criteria in Genecards.org with assignment by both UniProtKB and COMPARTMENTS (confidence levels 4 or 5) to separate the proteins according to their subcellular localizations. Subcellular localizations for 46 of the 54 identified Cu-binding proteins were indicated following these stringent criteria. ATOX1, MEMO1, MT3, LTF and S100A5 were localized according to less stringent criteria with assignment by COMPARTMENTS (confidence level 4 or 5) only, and CUTA and MT4 according to even less stringent criteria with assignment by COMPARTMENTS (confidence level 2). MOXD2P was the only Cu protein that remained unassigned. As MOXD2P is a pseudogene of MOXD1, we speculated that it occupies the same subcellular site as MOXD1, i.e., the endoplasmic reticulum (Fig. 1 and Table S1, ESI †).
The list of 54 Cu-binding proteins (Fig. 1 and Table S1, ESI †) corresponds to about 0.5% or less of the whole proteome (which consists of at least 10 000 different proteins). This fraction is expected based on earlier estimates using a bioinformatics approach. 43 There are two important points: first, there may be more proteins to be discovered as Cu-binding proteins. Thus, the list will grow with time. Second, some of the proteins defined as Cu-binding proteins may not use Cu for their function. 46,47
We found that human cells have 12 proteins classified as Cu transporters. For import there are SLC31A1 and SLC31A2 (CTR1 and CTR2), for cytoplasmic transport there are CCS for Cu delivery to SOD1, ATOX1 for Cu delivery to ATP7A/B in the Golgi, and then COX11, COX17, SCO1, and SCO2 that provide Cu to COX1 and COX2 in the mitochondria. COMMD1 is an enigmatic protein that may be involved in the regulation of exocytosis of Cu-loaded vesicles, 48,49 perhaps via modulation of ATP7B protein stability. 50 Whether CUTC is truly a Cu transporter, as reported here, or a Cu-dependent enzyme instead, is an open question. 51
We observed that about half of the identified Cu-binding proteins are Cu-dependent enzymes (Fig. 1 and Table S1, ESI †). Many are extracellular or located in the plasma membrane, but they are also found in all intracellular compartments except the Golgi. In addition to many well-characterized Cu-dependent enzymes, PARK7 is classified as a Cu-dependent enzyme. It is a peptidase that may also function as a sensor for oxidative stress and, notably, mutations in this gene are the cause of early-onset Parkinson's disease. 52,53 In addition to Cu homeostasis and enzymatic activity, we functionally classified the remaining 15 proteins in the Cu proteome as 'other', which corresponds to either non-enzymatic or unknown functions (Table S1, ESI †). Interesting proteins in this group are four S100 proteins, typically EF-hand calcium-binding proteins, which have Cu-binding sites and Cu-dependent functions. 54 Notably, none (but ATOX1, see Introduction) of the identified Cu-binding proteins are transcription factors.
In addition to LOX (collagen cross-linking) and ATOX1/CCS (putative transcription factor roles), only a few other proteins in the Cu proteome have yet been shown to play roles in cancer. MAP2K1 (MEK1) is a kinase involved in the mitogen-activated protein kinase signaling pathway, and directly related to tumor growth, 18 invasion and metastasis. 55,56 MEMO1 is a Cu-dependent redox enzyme that facilitates tumor cell migration and in vivo metastasis 57 via, for example, production of reactive oxygen species, regulation of transcriptional pathways related to the epithelial to mesenchymal transition, 58 and interaction with the RhoA-mDia1 signaling complex. 59 SPARC, a collagen-binding glycoprotein, plays a role in tumor invasion and metastasis via modulation of cell-cell and cell-matrix interactions. 32,33 Interestingly, in certain types of cancer, SPARC is associated with highly aggressive tumor phenotypes, whilst in others SPARC may function as a tumor suppressor. 60 In addition to extracellular mechanisms already mentioned, LOX also regulates cancer progression within cells, here via actin polymerization promoting migratory phenotypes. 61 Whereas LOX obtains Cu from the secretory pathway (i.e., ATOX1-ATP7A), it is not known how SPARC, MEMO1 and MEK1 are loaded with Cu.

II. RNA transcript levels of Cu-binding proteins in cancer
To determine whether the assigned Cu-binding proteins are involved in cancer, which subsequently may be used to elucidate new (druggable) pathways, we used the TCGA data (http://cancergenome.nih.gov/) to extract RNA transcript levels of the Cu proteome. Of the data for 25 different cancers in TCGA, 18 contained information on all the 54 Cu-binding proteins and, thus, these 18 cancer types were used in the analysis. In each case, the RNA transcript level in cancer tissues was compared to the RNA transcript level in matching normal tissues. In Fig. 2, we show the resulting heat map with log2-fold changes for the 54 proteins (on the y axis) in 18 cancer types (on the x axis) (red, upregulation; blue, downregulation; and white, no change). In this plot, genes placed close to each other on the y axis have more similar transcript level patterns across the different cancer types than genes far apart in the list.

Fig. 1 Cellular localization of the 54 identified Cu-binding proteins. Subcellular localizations for 46 of the 54 copper-binding proteins are indicated following the stringent criteria in Genecards.org with assignment by both UniProtKB and COMPARTMENTS (confidence levels 4 or 5). ATOX1, MEMO1, MT3, LTF and S100A5 were localized according to the less stringent criteria with assignment by COMPARTMENTS (confidence level 4 or 5) only, and CUTA and MT4 according to even less stringent criteria with assignment by COMPARTMENTS (confidence level 2). MOXD2P was the only Cu protein that remained unassigned. As MOXD2P is a pseudogene of MOXD1, we speculated that it occupies the same subcellular site as MOXD1, i.e., the endoplasmic reticulum. Colors indicate a protein's function, with blue for the transporter; orange for the enzyme; and black for the protein with other or unknown function.

Visual inspection revealed that many of the proteins in the human Cu proteome are up- or downregulated in the different cancers in nontrivial patterns. For example, LOX, LOXL1-2, SPARC and ENOX2 are upregulated, and S100B, PRNP, ENOX1, SNCA, SOD3, HEPHL1 and AOC3 are downregulated in many cancers (≥6 out of 18). With respect to ATOX1, it is upregulated in breast, colorectal, uterus and liver tumors, but downregulated in bile duct and pancreatic tumors (Fig. S1, ESI †).
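The "≥6 out of 18" tally can also be made programmatically rather than by visual inspection. The sketch below counts, for each gene, the cancer types in which it is significantly up- or downregulated, using the significance thresholds stated in the Fig. 2 caption (absolute log2-fold change of at least 0.4 and p < 0.05). Whether the authors applied exactly these thresholds to the tally is an assumption, and the gene names and values are placeholders.

```python
def recurrent_genes(stats, min_cancers=6, lfc_cut=0.4, p_cut=0.05):
    """stats maps gene -> list of (lfc, p_value), one tuple per cancer type.

    Returns two sorted lists: genes significantly up and genes significantly
    down in at least `min_cancers` cancer types.
    """
    up, down = [], []
    for gene, per_cancer in stats.items():
        n_up = sum(1 for lfc, p in per_cancer if lfc >= lfc_cut and p < p_cut)
        n_down = sum(1 for lfc, p in per_cancer if lfc <= -lfc_cut and p < p_cut)
        if n_up >= min_cancers:
            up.append(gene)
        if n_down >= min_cancers:
            down.append(gene)
    return sorted(up), sorted(down)

# Placeholder data for three made-up genes across 18 cancer types.
example = {
    "GENE_UP": [(1.0, 0.01)] * 7 + [(0.1, 0.50)] * 11,
    "GENE_DOWN": [(-0.8, 0.02)] * 6 + [(0.0, 0.90)] * 12,
    "GENE_FLAT": [(0.9, 0.20)] * 18,  # large change but not significant
}
print(recurrent_genes(example))
```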

III. Clustering of Cu-binding proteins
Next, building on the hierarchical clustering of the RNA expression data, we divided the Cu-binding proteins on the y axis in Fig. 2 into 8 clusters, with each cluster containing the genes with the most similar expression patterns throughout the different cancers. Cluster 1 ranges from AOC1 to DBH, cluster 2 from MT4 to GPC1, cluster 3 from CUTA to AFP, cluster 4 from CP to ALB, cluster 5 from S100A5 to S100A13, cluster 6 from ENOX2 to SNCA, cluster 7 from LOX to TYR, and cluster 8 from LOXL1 to AOC3 (Fig. 2). The 18 tumor types can also be divided into 3 groups based on the hierarchical clustering results. Group 1 contains thymus, head and neck, esophagus, adrenal gland, bladder, and stomach cancers; group 2 contains soft tissue, kidney, pancreas, bile duct and liver cancers; and group 3 contains prostate, cervix, breast, uterus, thyroid, colorectal, and lung cancers (Fig. 2).
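To make the clustering step concrete, the following sketch groups expression profiles by agglomerative clustering with Ward's minimum-variance criterion, using the Lance-Williams update on squared dissimilarities, and stops when a requested number of clusters remains. It is not the R routine used for Fig. 2. The dissimilarity d = 1 − r is an assumed form of the Pearson-based distance, and the function names and profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ward_cut(profiles, k):
    """Merge profiles with Ward's criterion until k (>= 1) clusters remain."""
    n = len(profiles)
    members = {i: [i] for i in range(n)}
    # Squared dissimilarities between singletons, with d = 1 - r (assumed).
    d2 = {(i, j): (1.0 - pearson_r(profiles[i], profiles[j])) ** 2
          for i, j in combinations(range(n), 2)}
    next_id = n
    while len(members) > k:
        a, b = min(d2, key=d2.get)          # closest pair of clusters
        d_ab = d2[(a, b)]
        na, nb = len(members[a]), len(members[b])
        updated = {}
        for m, idx in members.items():
            if m in (a, b):
                continue
            nm = len(idx)
            d_am = d2[(min(a, m), max(a, m))]
            d_bm = d2[(min(b, m), max(b, m))]
            # Lance-Williams update for Ward's method on squared distances.
            updated[(m, next_id)] = (
                (na + nm) * d_am + (nb + nm) * d_bm - nm * d_ab
            ) / (na + nb + nm)
        d2 = {key: v for key, v in d2.items() if a not in key and b not in key}
        d2.update(updated)
        members[next_id] = members.pop(a) + members.pop(b)
        next_id += 1
    return sorted(sorted(c) for c in members.values())

# Hypothetical lfc profiles of four genes across four cancer types.
profiles = [[1, 2, 3, 4], [2, 4, 6, 8.5], [4, 3, 2, 1], [8, 6, 4, 1.5]]
print(ward_cut(profiles, 2))
```

Cutting at k = 8 for genes and k = 3 for tumor types would mirror the partition described above, provided the input is the 54 × 18 log2-fold change matrix.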
To reveal the functional aspects of the eight clusters of Cu-binding proteins derived from cancer expression patterns, the clusters were analyzed with the GeneMANIA tool using the Cytoscape software. This tool combines the available bioinformatics information regarding the co-expression pattern, and physical and genetic interactions of the genes within each cluster, including here one extra level of interaction partners, to derive what is known about functional relations between the genes. Based on the distance between the gene representations and the number of interactions between the genes within each cluster plot, we detected that the genes in clusters 1, 3 and 8 are closely related in terms of known functional parameters (many interactions and short distances between the gene representations), whereas those in clusters 2 and 4 seem to be more distant to each other with regard to known functions (few interactions and genes spread apart). In addition, Cu-binding proteins assigned to one cluster based on the heat map (blue/red/black circles) sometimes appear as partner proteins (grey circles) in other clusters (Fig. 3). This observation suggests that the clusters have a high level of inter-connectivity.

IV. Exploration of breast cancer RNA transcript data
To further analyze the roles of the Cu-binding proteins, we focused on the RNA transcript data for one cancer: breast cancer. For the breast cancer TCGA data, we observed the upregulation of F5, ATP7B and SLC31A1 in cluster 1, SCO2 and HEPHL1 in cluster 2, CUTA, ATOX1, and COX17 in cluster 3, TYRP1 in cluster 5, MT3 in cluster 7, and LOXL1-2, SPARC and MOXD1 in cluster 8 (red circles in Fig. 3). Because several Cu-binding proteins are upregulated in breast cancer (14 of 54, corresponding to 26% of the Cu proteome proteins), Cu-dependent processes appear to be of importance for breast cancer development. Moreover, we found downregulated CUTC and AFP in cluster 3, S100B in cluster 5, PRNP and SNCA in cluster 6, AOC2 and LOXL4 in cluster 7, and PAM, SOD3 and AOC3 in cluster 8 in breast cancer (blue circles in Fig. 3). One may imagine that even if cancer cells have a higher demand for Cu than normal cells (which should correlate with increased Cu uptake and/or decreased Cu export), some Cu-binding proteins are likely downregulated in order to facilitate increased Cu loading of upregulated (and thereby, cancer-promoting) Cu-binding proteins.
Of the upregulated Cu-binding proteins in breast cancer, five were Cu transporters: SLC31A1, ATOX1, ATP7B, COX17 and SCO2. This suggests that there is an increased Cu flow via SLC31A1-mediated cell uptake followed by loading onto Cu-dependent enzymes via the ATOX1-ATP7B path, and delivery of Cu to mitochondria via COX17 and SCO2 in breast cancer. Notably, the COX1 and COX2 genes, normally obtaining Cu via COX17/COX11/SCO1/SCO2, were not upregulated in the breast cancer data. This may hint at other purposes of increased Cu delivery to mitochondria. For example, Cu import into mitochondria can regulate the mitochondrial redox status as reversible changes in the redox state of cysteines in Cu-transport proteins occur in parallel with Cu transfer. 62,63 Five Cu-binding enzymes were upregulated in breast cancer, namely, HEPHL1, TYRP1, LOXL1, LOXL2, and MOXD1. Since ATP7B, but not ATP7A, was upregulated in breast cancer, it appears that these enzymes are loaded with Cu via ATP7B under these conditions.
The expression of the CUTA gene is also upregulated in breast cancer and found in the same gene cluster as ATOX1 and COX17 (cluster 3). Since CUTA is predicted to be a mitochondrial protein (Fig. 1), COX17 may be the chaperone that delivers Cu to CUTA in mitochondria. However, in the functional network analysis in Fig. 3, the nearest neighbors of CUTA are COMMD1 and SOD1, both of which are cytoplasmic proteins, which raises the question of which compartment CUTA actually resides in. Like LOXL1 and LOXL2, SPARC is an extracellular protein upregulated in breast cancer that is found in cluster 8. Therefore, we speculate that SPARC, like LOXL1 and LOXL2, is loaded with Cu via the secretory pathway, i.e., through the ATOX1-ATP7A/B path, where ATP7B is the likely ATPase as it is upregulated in breast cancer.
To test these hypotheses, molecular studies investigating the roles of COX17 and/or ATOX1 in providing Cu to CUTA, CUTA localization, and the ATOX1-ATP7B path for loading Cu onto SPARC are desired and, importantly, should be assessed as a function of breast cancer molecular subtype. Similar analyses of upregulated RNA transcripts of Cu-binding proteins in each of the remaining 17 cancer types will likely result in a number of unprecedented predictions of Cu-dependent processes important in different cancer types.

V. ATOX1 protein upregulation in breast cancer tissues
We note that changes in RNA transcript levels do not always reflect the corresponding changes in protein levels, as many Cu-binding proteins may be regulated at the translational and/or posttranslational level. To address this possible caveat, we turned to the recently reported proteogenomics data 36 on four PAM50-defined intrinsic breast cancer subtypes (29 luminal A, 33 luminal B, 18 HER2 (ERBB2)-enriched, and 25 basal-like) that contained RNA and protein data for 36 of the 54 Cu-binding proteins identified here. Analysis of the reported information revealed that 28 of the 36 Cu-binding proteins (78%) have high-to-moderate positive correlations between RNA transcript and protein levels (Fig. 4). ATOX1 is one of the proteins with the highest positive correlation (r = 0.67), indicating that if the RNA transcript level is up, the protein amount is up too. Thus, as a first approximation, RNA data may be used as a measure of protein levels in this set of genes. Nonetheless, proteomic studies are always desired to support RNA transcript analysis because of the sensitivity of high-resolution accurate-mass tandem mass spectrometry (MS/MS) for determining protein levels. 36
As a first step towards protein expression analysis, we focused on ATOX1 because of its proposed transcription factor role. 26,27,64 Using immunohistochemistry we investigated the amount and distribution of ATOX1 protein in 67 breast cancer sections in tissue microarrays (TMAs). In agreement with the RNA transcript data from TCGA (i.e., upregulation of ATOX1 in breast cancer; Fig. 2), we found ATOX1 protein levels to appear higher in cancerous than in normal breast tissues (examples in Fig. 5 and Fig. S2, ESI †). This observation parallels the data from the Human Protein Atlas that report ATOX1 protein in cancerous but not in normal breast tissues (http://www.proteinatlas.org; version 15, last accessed on 17 November 2016). ATOX1 scoring data for the 67 breast cancer samples (consisting of negative, weak and moderate ATOX1 staining intensities) were categorized into four molecular subtypes (i.e., luminal A, luminal B, HER2, and triple negative) (Fig. 6). Inspection reveals that the highest ATOX1 intensities were found in all cancer subtypes but HER2 and, interestingly, the least proliferative subtype (luminal A) was the one with the highest frequency of ATOX1 negatives. In order to draw molecular conclusions, additional samples are needed; more quantitative detection methods must be used (e.g., MS/MS, Western blot) and matching normal and cancer tissue samples from the same patient are desired. With respect to cellular distribution, ATOX1 was localized in both the cytoplasm and the nucleus, with apparent increased nuclear localization in tissue samples with the highest ATOX1 levels. This protein distribution parallels the reported data from cell line studies and emphasizes the prospect of ATOX1 activities in the nucleus. 26,27,64 In future studies, we will investigate the functional roles of ATOX1 in established breast cancer cell lines (work in progress).
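The RNA-protein comparison in this section rests on the Pearson correlation coefficient and on a verbal grading of its strength. The sketch below computes r for one gene across samples and assigns a label using Cohen's conventional benchmarks (0.1, 0.3 and 0.5). Mapping those benchmarks onto the words "low", "moderate" and "high" is an assumption made here for illustration, and the sample values are invented.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def cohen_label(r):
    """Grade |r| with Cohen's benchmarks (label wording is an assumption)."""
    strength = abs(r)
    if strength >= 0.5:
        return "high"
    if strength >= 0.3:
        return "moderate"
    if strength >= 0.1:
        return "low"
    return "negligible"

# Invented RNA and protein abundances for one gene in five samples.
rna = [1.0, 2.0, 3.0, 4.0, 5.0]
protein = [1.2, 1.9, 3.5, 3.8, 5.1]
r = pearson_r(rna, protein)
print(round(r, 2), cohen_label(r))
print(cohen_label(0.67))  # the r value reported above for ATOX1
```

Under these benchmarks the ATOX1 value of r = 0.67 falls in the highest band, consistent with its description above as one of the best-correlated proteins.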

Conclusions
Our study brings together all known Cu-binding proteins into what we term the Cu proteome, followed by extraction of their expression levels in different cancers using RNA transcript data available in the TCGA database. The hierarchical clustering of this data collection (shown as a heat map in Fig. 2) constitutes an excellent resource for researchers aiming to test new hypotheses around cancer progression. Cu is a necessary component of most cancer cells, 11,12 but the possible participation of Cu-binding proteins in various steps of cancer is an underexplored topic. For the breast cancer data, we performed an initial inspection of Cu-binding proteins that are upregulated at the RNA transcript level, and thereby we proposed mechanisms by which SPARC and CUTA obtain Cu. We also compared the RNA transcript and protein levels for most Cu-binding proteins as a function of molecular subtypes of breast cancer (Fig. 4). Mechanistic studies using a combination of animal, cell culture and biophysical experiments are desired to test the hypotheses based on the presented proteogenomic data.
As a top-down approach, we analyzed the expression patterns of all human genes in cancer versus normal tissues from the TCGA database using the GO enrichment tool GOrilla and the ReviGO software. This procedure condenses the extensive and complicated information provided by expression patterns of individual genes into an enrichment graph of biological attributes (GO terms) that can be visualized (Fig. S3, ESI †). Notably, only a few overrepresented GO terms in the plot include Cu-binding proteins. AOC3, MEMO1, LTF and SPARC are the four Cu-binding proteins found in the plot, and they are connected to the following enriched GO terms: response to metal ion (AOC3), cell division (SPARC), ERBB2-ERBB3 signaling (MEMO1), regeneration (AOC3), catabolism (AOC3) and regulation of cell proliferation (LTF). With more studies assessing the roles of Cu-binding proteins in cancer, the database of information will expand, and we predict that GO terms related to Cu-dependence will become of increased significance in the future.

Fig. 2 Heat map of expression patterns for the human Cu proteome in cancers. Expression patterns of 54 copper-binding genes across 18 tumor types are given. The color scale indicates the degree of change in expression (log2-fold change), with blue for downregulation and red for upregulation. Genes showing a log2-fold change of at least 0.4 and a p-value below 0.05 were considered to be significant and are marked with an asterisk (*). Both the row and column dendrograms indicate unsupervised hierarchical clustering of 54 copper-binding genes and 18 cancer types, respectively. The data were classified using the Ward2's hierarchical clustering method with a dissimilarity matrix based on the Pearson correlation as the distance metric.

Fig. 3 Network analysis of the eight clusters of Cu-binding protein genes. Networks of interactions for each of the gene clusters (red, blue and black circles) identified in the RNA expression heat map shown in Fig. 2. Red circles denote upregulation in breast cancer; blue circles, downregulation in breast cancer; and black circles, no change in breast cancer. Each network shows the first-level partners (grey circles) in physical interactions (red connectors), co-expression correlation (purple connectors) and genetic interactions (green connectors) for each gene in a cluster. The close proximity of the genes in the network indicates a high level of interaction.

Fig. 4 Proteogenomic data for breast cancer subtypes. Heat map of correlations between protein (high-resolution accurate-mass tandem mass spectrometry, MS/MS) and RNA-seq data for 36 Cu-binding proteins in breast cancer samples 36 clustered according to the PAM50-defined subtypes (labeled basal-like, HER2, luminal A and luminal B). Correlations between the protein and transcript levels were calculated using the Pearson correlation coefficient, r (from ref. 35). The genes were ordered according to r (shown in the left column) from high (top) to low (bottom) correlation. RNA/protein abundance is depicted in a color scale from blue (low) to red (high) scaled within each row according to ref. 36.

Fig. 6 ATOX1 expression in breast cancer tissue of different molecular subtypes. (A) Legend for the manual scoring of ATOX1 expression levels (i.e., staining intensity) in microarrays of breast cancer tissue sections. (B) Scoring results for ATOX1 expression levels in cancerous breast tissues (67 samples) of the following molecular subtypes: triple negative (TNBC), HER2, luminal A, and luminal B.