Background: Women with triple negative breast cancers (TNBCs) have a poor prognosis due to a lack of suitable targeted therapies. Changes in protein glycosylation are increasingly recognized as an important modification associated with cancer etiology. Methods: To identify TNBC biomarkers with greater diagnostic and prognostic capabilities, a hydrazide-based chemistry method combined with LC-MS/MS was used to purify and identify N-linked glycopeptides or glycoproteins in tissues from TNBC patients. Results: A total of 550 unique N-linked glycoproteins were identified. Among these, 72 were significantly regulated in tumor tissues, of which 56 were upregulated and 16 were downregulated. To assess the validity of the results, three proteins (Vascular endothelial growth factor receptor 1, Insulin receptor and Tissue factor pathway inhibitor) were selected for western blot analysis, and these proteins were identified as potential biomarkers of TNBC. The top three pathways in which the differentially expressed glycoproteins participated were caveolar-mediated endocytosis signaling, agrin interactions at neuromuscular junction and LXR/RXR activation. Conclusion: This work provides potential glycoprotein markers that may serve as novel tissue-based biomarkers for TNBC.
Breast cancer is a very heterogeneous disease with various morphological and molecular features, natural histories and responses to therapy. A significant proportion of primary breast cancers do not express estrogen receptor (ER), progesterone receptor (PgR) or human epidermal growth factor receptor 2 (Her2) (ER- PgR- Her2-) and are generally referred to as triple negative breast cancers (TNBCs). Women with TNBC have a poor prognosis due to a lack of suitable targeted therapies. Early detection and diagnosis of breast cancer are reported to significantly improve 5-year survival rates. Therefore, there is an urgent need to identify biomarkers for use in early breast cancer detection. In the past few years, several studies have discovered a few breast cancer associated tissue markers, such as estrogen and progesterone receptors, HER2/neu (erbB2), p53, Ki-67/MIB-1 and vascular endothelial growth factor (VEGF) [6,7]. In addition, Mage-A4 was identified as a potential therapeutic target in TNBC.
However, TNBC is such a heterogeneous disease that a study at the protein level without information about post-translational modifications is probably insufficient for disease diagnosis. Protein glycosylation is one of the most abundant and structurally diverse post-translational modifications (PTMs). N-Glycosylation is initiated in the secretory pathway, where a common precursor N-glycan can be attached to an asparagine residue in the Asn-X-Ser/Thr motif (X ≠ Pro). Aberrant glycosylation has long been recognized as a hallmark of cancer and is increasingly being exploited in biomarker discovery studies. Therefore, identification of a series of reliable prognostic glycosylation biomarkers and therapeutic targets for TNBC is still needed. A comprehensive characterization of the N-linked glycoproteome is very difficult because N-linked glycans comprise the most diversified structures among all known post-translational modifications [13,14]. To overcome these challenges, several techniques have been developed for the enrichment of N-linked glycoproteins, which is often necessary for sensitive analyses, including lectin affinity enrichment, boronic acid and hydrazide-based chemical enrichment, and other chemical or physical methods [15,16]. Some studies recently reported that the hydrazide method was effective in characterizing N-linked glycoproteins in various samples [17,18]. In addition, several stable isotope labeling-based quantification methods have been used in combination with N-glycoproteomic approaches, including chemical labeling such as iTRAQ and stable isotope dimethyl labeling, and metabolic labeling such as SILAC. The use of iTRAQ labeling and SILAC can be cost-prohibitive [20,21], whereas stable isotope dimethyl labeling is performed with inexpensive generic reagents and therefore does not impose financial restrictions on the amount of sample to be labeled.
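The Asn-X-Ser/Thr (X ≠ Pro) motif can be located in a protein sequence with a simple pattern search. The following is a minimal sketch, not part of the original workflow; the function name and the toy sequence are hypothetical. Note that the presence of a sequon is necessary but not sufficient for N-glycosylation, so the hits are candidate sites only.

```python
import re

# N-glycosylation sequon: Asn-X-Ser/Thr, where X is any residue except Pro.
# A lookahead is used so that overlapping sequons (e.g. "NNSS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that sit in an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


# Hypothetical toy sequence, used for illustration only.
toy = "MKNGSAANPTLLNNSSK"
print(find_sequons(toy))  # [3, 13, 14]; the Asn at position 8 is skipped because X is Pro
```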
In this study, the hydrazide method combined with stable isotope dimethyl labeling was used to screen glycoprotein biomarkers for TNBC. More than five hundred proteins were characterized and their relative abundances in TNBC were quantified. Several differentially expressed glycoproteins involved in various biological functions associated with TNBC were identified and inspected in silico for their sites of glycosylation. The differential expression profiles of the putative glycosylated proteins may provide new insight into the underlying mechanisms and the involvement of these components in TNBC.
Materials and Methods
Human tissue samples
Breast tumor samples and control samples from the same breast were obtained from 98 patients and subjected to routine pathological examination at the Maternal and Child Health Hospital of Nanjing. All of them were invasive ductal carcinomas. The patients' ages ranged from 34 to 90 years, and none of the patients received any treatment prior to surgery. Only 12 patients had TNBC and were available for further study. Tissue samples were frozen in liquid nitrogen and stored at −80 °C until analysis. Written informed consent was obtained from each patient before surgery. This study was approved by the Ethics Committee of Nanjing Medical University with an Institutional Review Board (IRB) number of 2012-NFLZ-32. The tumor and control samples were pooled separately and subjected to glycosylation proteomics analysis.
Glycoprotein digestion, dimethyl labelling, hydrazide method
The TNBC tissue was solubilized with 2% SDS, 8 M urea in 25 mM NH4HCO3 (pH 8.0). One milligram of protein from normal or TNBC tissue was reduced with 10 mM DTT at 60 °C for 1 h and alkylated with 55 mM IAA at 37 °C for 40 min. The tryptic peptides were desalted with a homemade C18 solid phase extraction column, dried in a SpeedVac (Eppendorf, USA) and resuspended in 100 µL of triethylammonium bicarbonate (100 mM). Subsequently, dimethyl labelling, glycopeptide enrichment and release of glycans by PNGase F were performed as described. Briefly, formaldehyde-H2 (573 µmol) was added to the above solution and vortexed for 2 min, followed by the addition of freshly prepared sodium cyanoborohydride (278 µmol). The resultant mixture was vortexed for 60 min at room temperature (RT), and then a total of 60 µL of ammonia (25%) was added to consume the excess formaldehyde. Finally, 50 µL formic acid (50%, Sigma) was added to acidify the solution. For heavy labeling, 13C-D2-formaldehyde (573 µmol) and freshly prepared cyanoborodeuteride (278 µmol) were used. The light and heavy dimethyl-labeled samples were mixed at a 1:1 ratio based on total peptide amount, which was determined by running an aliquot of the labeled samples on a regular LC-MS/MS run and comparing overall peptide signal intensities. The dried labeled peptides were reconstituted in CWR buffer (8 µL Concanavalin A, 15 µL wheat germ agglutinin, 9 µL Ricinus communis Agglutinin 120), centrifuged at 500 rpm for 1 min, then kept in the dark for 1 h at 4 °C. The mixture was then centrifuged and filtered with a molecular weight cut-off (MWCO) of 1000 Dalton at 14000×g for 25 min. After that, 200 µL BB (1 mM MnCl2 in 40 mM Tris/HCl, pH 7.6) was added and the mixture was centrifuged at 14000×g and 4 °C for 25 min. This procedure was repeated three times. The supernatant was then discarded, and 50 µL of 50 mM ammonium bicarbonate (ABC) was added to the pellet, followed by centrifugation at 14000×g and 4 °C for 15 min; this step was repeated twice.
The glycans were released from the glycopeptides by adding 500 units of PNGase F (New England Biolabs, USA) in 40 µL ABC to the resin and incubating at 37 °C for 3 hours with gentle shaking. The deglycosylated peptides were carefully collected after gentle centrifugation.
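The light and heavy labels described above produce peptide pairs separated by a fixed mass difference per labeled amine. The sketch below, offered as an illustration rather than as the authors' processing code, derives the two mass shifts from standard monoisotopic atomic masses. It assumes that every peptide N-terminus and every lysine side chain receives two methyl groups; the function names are hypothetical, and special cases such as N-terminal proline (which accepts only one methyl group) are not handled.

```python
# Monoisotopic atomic masses in Da (standard values).
H = 1.00782503
D = 2.01410178
C12 = 12.0
C13 = 13.00335484

# Each labeled amine loses two H and gains two methyl groups.
# Light: formaldehyde + cyanoborohydride gives two CH3 groups.
LIGHT_SHIFT = 2 * (C12 + 3 * H) - 2 * H
# Heavy: 13C-D2-formaldehyde + cyanoborodeuteride gives two 13CD3 groups.
HEAVY_SHIFT = 2 * (C13 + 3 * D) - 2 * H


def labeled_sites(peptide):
    """Count amines assumed to be dimethylated: the N-terminus plus each Lys."""
    return 1 + peptide.upper().count("K")


def pair_spacing_mz(peptide, charge):
    """Expected m/z gap between the heavy and light forms of one peptide."""
    return labeled_sites(peptide) * (HEAVY_SHIFT - LIGHT_SHIFT) / charge


print(round(LIGHT_SHIFT, 4), round(HEAVY_SHIFT, 4))
```

For a doubly charged peptide with one lysine, `pair_spacing_mz` returns a gap of roughly 8.04 m/z units, which is the spacing a quantification tool would look for when pairing light and heavy signals.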
The labeled deglycosylated peptides were analyzed on an LTQ-Orbitrap instrument (Thermo Fisher, USA) as described. The LC-MS/MS was operated in positive ion mode. The analytical condition was a linear gradient from 0 to 60% of buffer B (CH3CN) in 150 min at a flow rate of 200 nL/min. For analysis of normal and TNBC tissues, one full MS scan was followed by five MS/MS scans of the five most intense peaks. The MS/MS spectra acquired from precursor ions were submitted to MaxQuant (version 18.104.22.168) using the following search parameters: the database searched was the Uniprot proteome (version 20120418); the enzyme was trypsin (KR/P); the dynamic modifications were oxidized Met (+16) and Deamidation 18O (Asn); carbamidomethylation of cysteine was set as a static modification; MS/MS tolerance was set at 10 ppm; the minimum peptide length was 6; and the false discovery rates for peptides, proteins and deamidation 18O were all set below 0.01.
To further explore the significance of the glycoproteins, Ingenuity® Pathway Analysis (IPA; Ingenuity® Systems, www.ingenuity.com) was used to search for the relevant molecular functions, cellular processes and pathways of the identified proteins during the pathological changes of TNBC. Associated networks of differentially expressed glycoproteins were generated, along with a score representing the log probability of a particular network being found by random chance. Top canonical pathways associated with the uploaded data were presented, along with a p-value. The p-values were calculated using right-tailed Fisher's exact tests.
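A right-tailed Fisher's exact test asks how likely it is to see at least the observed number of pathway members in the submitted protein list by chance. The following is a generic sketch of that calculation using the hypergeometric distribution; it is not IPA's internal implementation, and the function name, the reference-set size and the pathway size in the example are hypothetical. Only the list size of 72 comes from this study.

```python
from math import comb


def fisher_right_tail(k, n, K, N):
    """Right-tailed Fisher's exact p-value, P(X >= k).

    k: proteins in the submitted list that belong to the pathway
    n: size of the submitted list
    K: pathway members in the reference set
    N: size of the reference set
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# 72 differentially expressed glycoproteins (from this study); the other
# three numbers are hypothetical placeholders for illustration.
p = fisher_right_tail(k=6, n=72, K=80, N=20000)
print(f"{p:.3e}")
```

A small p-value indicates that the overlap between the list and the pathway is larger than expected from random sampling of the reference set.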
Western blot analysis
Western blot was performed as described earlier using primary antibodies specifically recognizing Vascular endothelial growth factor receptor 1 (Abcam, ab9540), Insulin receptor (Abcam, ab131238), Tissue factor pathway inhibitor (Abcam, ab180619) and GAPDH (Abcam, ab9485). The secondary antibodies were alkaline phosphatase (AP)-conjugated anti-mouse or anti-rabbit IgG (Promega, S372B, WI, USA; 1:1000). Protein levels were evaluated by detecting alkaline phosphatase activity using a Lumi-Phos kit (Pierce Biotechnology, KJ1243353).
For dimethyl labeling, the following criteria were required to consider a protein for further statistical analysis: two or more high-confidence unique peptides had to be identified, and the p value had to be < 0.1. Student's t-test was then used to identify significantly changed proteins using SPSS software (version 18.0).
The visualized western blot bands were quantified with Bio-Rad QUANTITY ONE software. The volumes of target bands were normalized to GAPDH. The average absolute intensity and the standard deviation were determined. The protein ratio was determined using these averaged values. Student's t-test was used to generate p values. A p value of less than 0.05 was considered significant.
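The band quantification described above can be sketched as follows. This is an illustrative outline only: the function names and band volumes are hypothetical, and an unpaired, pooled-variance Student's t statistic is assumed because the text does not state whether a paired test was used. The p value itself would be read from the t distribution in a statistics package.

```python
from statistics import mean, stdev


def normalize(target_volumes, gapdh_volumes):
    """Divide each target band volume by the GAPDH volume from the same lane."""
    return [t / g for t, g in zip(target_volumes, gapdh_volumes)]


def tumor_to_normal_ratio(tumor_norm, normal_norm):
    """Protein ratio computed from the averaged normalized intensities."""
    return mean(tumor_norm) / mean(normal_norm)


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / df
    t = (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return t, df


# Hypothetical band volumes from three replicate blots.
tumor = normalize([820.0, 910.0, 870.0], [400.0, 420.0, 410.0])
normal = normalize([390.0, 450.0, 410.0], [405.0, 415.0, 400.0])
print(tumor_to_normal_ratio(tumor, normal), student_t(tumor, normal))
```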
Global glycoproteomic profiling of TNBC and adjacent normal tissue
Quantitative glycoproteomics was used to analyze the levels of glycosylated proteins in TNBC and adjacent normal tissue. From three measurements, we quantified 1050 unique glycopeptides from 550 proteins with high confidence (two or more unique peptides with an FDR less than 1%). In total, 298, 129 and 62 proteins had one, two and three glycosylation peptides, respectively, and 33 proteins had four. More interestingly, 10, 12 and 15 glycosylation peptides were identified for Laminin subunit gamma-1, Integrin alpha-1 and Prolow-density lipoprotein receptor-related protein 1, respectively. These results suggest that the methods used for glycoprotein enrichment and identification were effective.
All of the identified proteins were used for further bioinformatic analysis. The subcellular locations were cataloged according to the gene ontology (GO). The majority of glycoproteins were located in the membrane (331) and cytoplasm (150), with others in the extracellular region (89) and endoplasmic reticulum (31). According to the GO annotation reported in the UniProt database, 162 glycosylated proteins are involved in signal transduction, 144 in developmental process, 134 in response to external stimulus, 57 in cell adhesion, 24 in cell migration and so on. This subset of glycosylated proteins was enriched for functions associated with tumor invasion.
The N-linked glycoproteome alteration of TNBC tissue
We identified proteins and glycosylation sites significantly regulated in TNBC compared to adjacent normal tissue using dimethyl labeling. Of the 550 quantified proteins, 56 were upregulated in TNBC and 16 were downregulated. Of the 1050 glycosylation sites, 70 were upregulated in TNBC, while 17 were downregulated (Table 1). Among the differentially expressed glycoproteins, ANPEP, ITGA1, LGALS3BP, DCN, CD36, CD47, FLT1, HLA-B, TF, INSR, CD163, ECM1 and TFPI have been reported to be regulated in breast cancer. Representative mass spectra of FLT1, INSR and TFPI are shown in Figure 1. Our “re-discovery” of proteins previously demonstrated to be related to breast cancer provides a proof-of-principle for our methodology. Other proteins, such as NCLN, HYOU1, ENPP and SLC2A1, were found to be differentially expressed in TNBC for the first time, including some low-abundance proteins such as ANO6.
Functional classification and pathway analysis of the differentially expressed glycoproteins
To gain insight into the processes regulated in TNBC, the differentially expressed glycoproteins were subjected to IPA analysis to identify the important biological processes in which they were significantly involved. The over-represented biological processes, molecular functions and canonical pathways were generated based on information contained in the Ingenuity Pathways Knowledge Base. We found that the top three significant biological processes of the differentially expressed proteins in our study were networks describing 1) Post-Translational Modification, Protein Degradation, Protein Synthesis (Fig. 2A); 2) Hereditary Disorder, Respiratory Disease, Developmental Disorder (Fig. 2B); 3) Cell-To-Cell Signaling and Interaction, Hematological System Development and Function, Inflammatory Response (Fig. 2C). For molecular and cellular functions, the data indicated that many proteins were involved in cell-to-cell signaling and interaction, cellular movement, and cellular growth and proliferation. The top three canonical pathways in which the differentially expressed glycoproteins participated were caveolar-mediated endocytosis signaling, agrin interactions at neuromuscular junction and LXR/RXR activation (Fig. 3).
Validation of dynamic glycoproteomic profiles responding to TNBC
We applied western blot analysis to further verify the changes of the glycoproteins in Table 1. Three proteins of interest were selected for verification: Vascular endothelial growth factor receptor 1, Insulin receptor and Tissue factor pathway inhibitor. The down-regulation of Insulin receptor and Tissue factor pathway inhibitor, as well as the up-regulation of Vascular endothelial growth factor receptor 1, were confirmed (Fig. 4). Notably, these results were consistent with the dimethyl labelling quantitation results shown in Table 1.
Triple-negative breast cancers (TNBCs) are a heterogeneous set of tumors defined by an absence of actionable therapeutic targets (ER, PR, and HER-2) [36,37]. TNBCs are generally associated with an aggressive clinical course, and targeted therapies are currently limited. More recent technical advancements in identifying several cancer-related biomarkers have provided further opportunities to identify specific treatment options for TNBCs [3,39]. However, the prognostic significance of TNBC remains unclear because the group is heterogeneous and carries the worst prognosis. Continued investigation is still required to identify new prognostic markers for the prediction of clinical outcome.
Protein glycosylation is one of the most important post-translational modifications, with more than half of all human proteins estimated to be glycosylated. Aberrant glycosylation associated with abnormal biological function and protein folding has been implicated in cancer. Alterations in protein glycosylation can promote invasive behaviour of tumour cells, ultimately leading to the progression of cancer. While a majority of studies have investigated protein glycosylation changes in cancer cell lines and tumour tissue for individual cancers, it is envisioned that the understanding of these biologically relevant glycan alterations on cellular proteins will facilitate the discovery of novel glycan-based biomarkers which could potentially serve as diagnostic and prognostic indicators of TNBC.
In this study, we compared the N-glycoprotein profile in TNBC tissues with that of normal tissues. The current study explored the hydrazide method as an effective way to preselect N-linked glycoproteins before applying the resolving power of coupled microbore chromatography-mass spectrometry to identify the proteins. In total, 550 glycoproteins were identified, including 15 glycosylation peptides of Prolow-density lipoprotein receptor-related protein 1. Some identified proteins, such as Isoform 2 of Anoctamin-6 and G-protein coupled receptor, are known to be present at very low abundance, which demonstrates the utility of the approach for revealing low-abundance disease biomarkers.
It is suggested that a clinically useful positive marker for disease should be up-regulated in expression specifically in that disease but not in related diseases. Additionally, it would be of interest if the proposed marker were specific to the tissue in which the disease originated. Of the 550 quantified proteins, 56 proteins were upregulated in TNBC while 16 proteins were downregulated. A subset of the identified proteins has previously been proposed for use as breast cancer diagnostic markers, which supports the relevance of our analysis. This group of proteins includes HLA-B, HLA-C, CD47, CD163, INSR and so on. In addition, many of the differentially expressed glycoproteins carry annotations for cellular processes particularly relevant to oncogenesis and metastasis. For instance, integrins mediate cell adhesion to the extracellular matrix (ECM), an important cellular process that regulates malignant cell growth, metastasis and cancer-induced angiogenesis. One glycopeptide of ITGA6, three glycopeptides of ITGA1 and one glycopeptide of ITGAV were found to be upregulated in TNBC samples.
Of the 72 differentially expressed N-linked glycoproteins identified in TNBC, the particularly interesting proteins were Vascular endothelial growth factor receptor 1, Insulin receptor and Tissue factor pathway inhibitor. Vascular endothelial growth factor receptor-1 (VEGFR-1 or Flt-1) is a tyrosine kinase receptor which is highly expressed in breast cancer tissues but nearly absent in normal breast tissue. It has also been suggested that VEGFR-1 may be an independent predictor of poor prognosis in breast carcinoma patients. In this research, two glycopeptides of VEGFR-1 were upregulated. The insulin receptor (INSR) can be activated by insulin or IGF-II, which plays a key role in the regulation of growth and metabolism as well as in the initiation and maintenance of breast tumors. High IGF-II levels increase breast cancer susceptibility and stimulate tumor growth and progression by signaling through the IGF-I and insulin receptors. One glycopeptide of insulin receptor was downregulated. Tissue factor pathway inhibitor (TFPI) is a Kunitz-type serine proteinase inhibitor, which is produced and secreted into the ECM by endothelial cells, smooth muscle cells, fibroblasts, keratinocytes and urothelium. Recent studies showed that the expression of TFPI was down-regulated in breast cancer patients. Low or negative expression of TFPI was associated with breast cancer progression, recurrence and poor survival outcome after breast cancer surgery. Two glycopeptides of TFPI were downregulated in our research. The down-regulation of Insulin receptor and Tissue factor pathway inhibitor, as well as the up-regulation of Vascular endothelial growth factor receptor 1, were also confirmed by western blot. An important extension of this work will be to document the correlation between these three differentially expressed N-linked glycoproteins and important consequences in TNBC.
To our knowledge, this is the first N-glycoproteomics study to reveal the response to TNBC in vivo. In total, 72 unique N-linked glycoproteins were significantly regulated in tumor tissues, of which 56 proteins were upregulated and 16 proteins were downregulated. Proteins such as Vascular endothelial growth factor receptor 1, Insulin receptor and Tissue factor pathway inhibitor were identified as potential biomarkers of TNBC. The pathways in which the differentially expressed glycoproteins participated were associated with tumors. An improved understanding of these protein alterations and the signaling pathways in which they are involved will broaden our knowledge of TNBC and may provide useful information for the therapy of this disease.
This work was supported in part by the National Natural Science Foundation of China (81172502, 81202077, 81300363 and 81272916), the Natural Science Foundation of Jiangsu Province (BK2011853, BK20131023 and BK2011855), the Program for Development of Innovative Research Team in the First Affiliated Hospital of NJMU (IRT-008) and a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).
The authors declare that they have no competing interests.
X. Chen, J. Wu and H. Huang contributed equally to this work.