Abstract
Background/Aims: Long noncoding RNAs (lncRNAs) are pervasively transcribed and have been shown to regulate key biological processes that maintain normal cellular functions. Abnormal regulation of lncRNAs can promote tumorigenesis by disrupting these essential cellular functions. However, the roles that lncRNAs play in the development of gastric cardiac adenocarcinoma (GCA) remain unknown. In this work we aimed to characterize the expression profile of lncRNAs in GCA tissues compared with paired adjacent noncancerous tissues using microarray analysis, in order to explore the potential molecular mechanisms of GCA carcinogenesis at the lncRNA level. Methods: Total RNA was isolated from 15 pairs of GCA and adjacent non-cancerous tissues and hybridized to Arraystar lncRNA V2.0 chips containing probes representing 33,000 lncRNA genes. Quantitative real-time polymerase chain reaction (PCR) was used to validate 6 up-regulated and 6 down-regulated lncRNAs. Bioinformatic analysis, including gene ontology (GO) analysis, pathway analysis and network analysis, was performed for further investigation. Results: Pathway analysis indicated that 8 pathways corresponded to down-regulated transcripts and 20 pathways corresponded to up-regulated transcripts (p-value cut-off 0.05). GO analysis showed that the most highly enriched GO term targeted by up-regulated transcripts was tissue homeostasis and that the most highly enriched GO term targeted by down-regulated transcripts was also tissue homeostasis. Conclusion: Our study is the first to examine differentially expressed lncRNAs in human GCA tissues and indicates that lncRNAs may serve as novel candidate biomarkers for the clinical diagnosis of GCA and as potential therapeutic targets.
Introduction
The incidence of adenocarcinoma of the esophagus and the gastroesophageal junction has increased significantly over the past decades [1], and the rates of increase vary dramatically by gender and ethnicity. The proportion of gastric cardia adenocarcinoma (GCA) has also increased strikingly in Western countries since the 1970s (rates -5), although the incidence of gastric cancer as a whole has decreased, and since the 1990s in some parts of northern China, a high-risk region for upper gastrointestinal carcinomas [2].
Sequencing of the human genome showed that it is pervasively transcribed into RNA: up to 70% of the genome is estimated to be transcribed, but only up to 2% is transcribed into protein-coding RNA [3,4,5]. RNA molecules lacking protein-coding potential are known as non-coding RNAs. Long non-coding RNAs (lncRNAs) are non-coding RNAs longer than 200 nucleotides [6,7,8]. Accumulating evidence indicates that lncRNAs are important regulators of essential cellular functions at the transcriptional, post-transcriptional and translational levels [9,10,11,12,13]. Aberrant regulation of these essential functions, or the reactivation of lncRNAs that are abundant during embryogenesis, can promote tumor development [14,15]. Furthermore, lncRNAs are often expressed in a spatially and temporally specific manner; that is, lncRNA expression often follows a cell-, tissue- and development-specific pattern [16]. This specificity makes lncRNAs potential therapeutic targets for different diseases, from developmental disorders to cancer, and potential classification markers for various types of cancer [17]. Like protein-coding genes, lncRNAs may function as oncogenes or tumor suppressors by changing chromatin structure or the expression level of protein-coding genes [18]. Hence, lncRNAs are attractive candidate biomarkers for tumor diagnosis and prognosis [19]. However, the roles of lncRNAs in the development of GCA and the correlation between GCA and the expression levels of these lncRNAs remain unknown.
In this study, we examined the differential expression profiles of lncRNAs and mRNAs between fifteen GCA specimens and their paired adjacent noncancerous samples using microarray analysis. Our results show that the lncRNA and mRNA expression profiles differ significantly between normal gastric cardia and GCA tissues. This finding suggests that aberrant lncRNA expression may contribute to the development and progression of GCA, and that studying the differences in lncRNA expression profiles may provide new approaches for diagnosing and treating GCA.
Materials and Methods
Patient specimens
The procedures used in this study were approved by the Institutional Review Board of the Henan University of Science and Technology and conformed to the Helsinki Declaration and to local legislation. A total of fifteen human primary gastric cardiac adenocarcinoma (GCA) specimens and their paired adjacent non-cancerous tissue specimens were obtained with informed consent from patients of the First Affiliated Hospital, Henan University of Science and Technology. All of these patients were diagnosed with GCA based on histopathology and clinical history. The primary GCA tumors were located between the dentate line of the gastroesophageal junction and 2 cm below that line. Other clinical information recorded for each specimen included age, race, tumor grade, cancer stage, tissue dimensions and date (year) of resection. The pathologist assessed tumor content by microscopic examination, and the proportion of tumor in the cases examined was estimated to be ≥80%. None of the patients had received preoperative chemoradiation before the tissue samples were taken.
RNA isolation
Total RNA was isolated from fifteen primary GCA specimens and their paired adjacent noncancerous tissues using TRIzol reagent (Invitrogen, CA, USA). Total RNA from each specimen was quantified using a NanoDrop ND-1000 (OD 260 nm; NanoDrop, Wilmington, DE), RNA integrity was assessed using standard denaturing agarose gel electrophoresis, and purity was judged by the ratio of absorbance at 260 nm to 280 nm (A260/A280).
DNA microarray
The Human LncRNA microarray V2.0 (Arraystar, USA) covers lncRNAs and protein-coding mRNAs in the human genome. About 33,000 lncRNAs were collected from authoritative data sources including NCBI RefSeq, UCSC, RNAdb, lncRNAs reported in the literature and UCRs. Sequences were selected with proprietary strategies. Repeat sequences and ncRNAs shorter than 200 bp were removed. Each human lncRNA array was composed of 60,302 distinct probes (60-mers), and each transcript was represented by 1-5 probes to enhance statistical confidence. Each transcript was represented by a specific exon or splice junction probe so that individual transcripts could be identified accurately. Microarray hybridization and bioinformatic analysis were performed by KangChen Bio-tech, Shanghai, PR China.
RNA labeling and array hybridization
Double-stranded cDNA (ds-cDNA) was synthesized from 5 µg of total RNA using an Invitrogen SuperScript ds-cDNA synthesis kit in the presence of 100 pmol oligo dT primers. The ds-cDNA was cleaned and labeled in accordance with the NimbleGen Gene Expression Analysis protocol (NimbleGen Systems, Inc., USA). Briefly, ds-cDNA was incubated with 4 µg RNase A at 37°C for 10 min and cleaned using phenol:chloroform:isoamyl alcohol, followed by ice-cold absolute ethanol precipitation. The purified cDNA was quantified using a NanoDrop ND-1000 and labeled with Cy3 using a NimbleGen One-Color DNA labeling kit according to the manufacturer's guidelines detailed in the Gene Expression Analysis protocol (NimbleGen Systems, Inc., Madison, WI, USA). One microgram of ds-cDNA was incubated for 10 min at 98°C with 1 OD of Cy3-9mer primer. Then 100 pmol of deoxynucleoside triphosphates and 100 U of the Klenow fragment (New England Biolabs, USA) were added, and the mix was incubated at 37°C for 2 h. The reaction was stopped by adding 0.1 volume of 0.5 M EDTA, and the labeled ds-cDNA was purified by isopropanol/ethanol precipitation. Microarrays were hybridized with 4 µg of Cy3-labeled ds-cDNA at 42°C for 16-20 h in NimbleGen hybridization buffer/hybridization component A in a hybridization chamber (Hybridization System, NimbleGen Systems, Inc., Madison, WI, USA). Following hybridization, washing was performed using the NimbleGen Wash Buffer kit (NimbleGen Systems, Inc., Madison, WI, USA). After being washed in an ozone-free environment, the slides were scanned using an Axon GenePix 4000B microarray scanner. RNA labeling and microarray hybridization were performed by KangChen Bio-tech, Shanghai, PR China.
Data analysis
Data analysis was performed by KangChen Bio-tech (Shanghai, PR China). All of the slides were scanned at 5 µm/pixel resolution using an Axon GenePix 4000B scanner (Molecular Devices Corporation) driven by GenePix Pro 6.0 software (Axon). Scanned images (JPEG format) were then imported into NimbleScan software (version 2.5) for grid alignment and expression data analysis. Expression data were normalized through quantile normalization and the Robust Multichip Average (RMA) algorithm included in the NimbleScan software. The probe-level files and mRNA-level files were generated following normalization. All gene-level files were imported into Agilent GeneSpring GX software (version 11.5.1) and normalized by the quantile method; ComBat software was then used to adjust the normalized intensities to remove batch effects. Significantly differentially expressed lncRNAs and mRNAs were identified through volcano plot filtering. Hierarchical clustering was performed using Agilent GeneSpring GX software (version 11.5.1) [20].
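As an illustration of the volcano plot filtering step, the following Python sketch selects transcripts that pass both a fold-change and a p-value threshold. It is not the GeneSpring implementation used in this study; the function name, the input layout and the example values are hypothetical, and the default cut-offs (2.0-fold, P = 0.05) are taken from the thresholds drawn on the volcano plots in Fig. 1.

```python
import math

def volcano_filter(transcripts, fc_cutoff=2.0, p_cutoff=0.05):
    """Split transcripts into up- and down-regulated lists.

    transcripts: iterable of (name, tumor_mean, normal_mean, p_value),
    with means given as normalized intensities on a linear scale.
    The p-values are assumed to have been computed beforehand.
    """
    log_cut = math.log2(fc_cutoff)
    up, down = [], []
    for name, tumor, normal, p in transcripts:
        if p > p_cutoff:
            continue  # below the horizontal line of the volcano plot
        log_fc = math.log2(tumor / normal)
        if log_fc >= log_cut:
            up.append(name)    # right of the +2.0-fold vertical line
        elif log_fc <= -log_cut:
            down.append(name)  # left of the -2.0-fold vertical line
    return up, down

# Hypothetical example values, for illustration only.
example = [
    ("lnc_A", 800.0, 100.0, 0.001),  # 8-fold up, significant
    ("lnc_B", 50.0, 400.0, 0.010),   # 8-fold down, significant
    ("lnc_C", 300.0, 100.0, 0.200),  # 3-fold up, not significant
    ("lnc_D", 120.0, 100.0, 0.001),  # significant, but only 1.2-fold
]
up, down = volcano_filter(example)
```

Working on the log2 scale makes the up and down thresholds symmetric, which is why a volcano plot shows two vertical lines at equal distance from zero.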
Quantitative real-time PCR (qRT-PCR)
Total RNA was isolated using TRIzol reagent (Invitrogen, CA, USA) and then reverse transcribed using the PrimeScript RT reagent Kit with gDNA Eraser (Perfect Real Time) (TaKaRa, Dalian, China) according to the manufacturer's instructions. The expression of eight up-regulated lncRNAs and four down-regulated lncRNAs in the fifteen patients included in this study was measured by qRT-PCR using SYBR Green assays (TaKaRa, Dalian, China), and GAPDH was used as an internal control. The primers are listed at http://cancerresearch.faisco.cn/. The expression level of each lncRNA was represented as a fold change using the 2^-ΔΔCt method. Differences in lncRNA expression between human primary GCA specimens and their paired adjacent noncancerous specimens were analyzed using Student's t-test with SPSS (Version 17.0, SPSS Inc.). A value of p < 0.05 was considered significant.
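The 2^-ΔΔCt calculation can be sketched in Python as follows. This is a minimal illustration under the assumption that the reference gene is GAPDH and the calibrator is the paired noncancerous tissue, as described above; the function name and the Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_tumor, ct_ref_tumor,
                     ct_target_normal, ct_ref_normal):
    """Relative expression (tumor vs. normal) by the 2^-ddCt method."""
    # Normalize the target lncRNA to the reference gene in each tissue.
    d_ct_tumor = ct_target_tumor - ct_ref_tumor
    d_ct_normal = ct_target_normal - ct_ref_normal
    # Compare tumor against the paired noncancerous tissue.
    dd_ct = d_ct_tumor - d_ct_normal
    return 2.0 ** (-dd_ct)

# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# in the tumor, while the reference gene is unchanged.
fc_up = fold_change_ddct(24.0, 18.0, 27.0, 18.0)
fc_down = fold_change_ddct(27.0, 18.0, 24.0, 18.0)
```

A value above 1 indicates higher expression in the tumor, and a value below 1 indicates lower expression. The method assumes roughly 100% amplification efficiency for both the target and the reference gene.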
Results
Differentially expressed lncRNAs
The clinical parameters of all the patients are shown in Table 1. Microarray analysis detected a total of 24,842 lncRNAs expressed in gastric cardia adenocarcinoma (GCA) (Fig. 1A, and http://cancerresearch.faisco.cn/). Based on these data, lncRNA expression levels were compared between the fifteen human primary GCA specimens and their paired adjacent noncancerous tissues: an average of 3026 up-regulated lncRNAs (range 1880 to 4972) and 2113 down-regulated lncRNAs (range 976 to 5629) were significantly differentially expressed (fold change ≥2.0) (Fig. 1B). Among these lncRNAs, 1880 were consistently up-regulated and 293 were consistently down-regulated (http://cancerresearch.faisco.cn/). Hierarchical clustering analysis was used to arrange specimens into groups according to their expression levels, from which the relationships among specimens can be inferred. ASHG19A3A028863 (fold change: 146.0139123) was the most up-regulated lncRNA, and ASHG19A3A007184 (fold change: 58.71047078) was the most down-regulated lncRNA (Fig. 1C). The dendrogram presents the relationships among the lncRNA expression patterns of the specimens (Fig. 1D). Up-regulated lncRNAs were more common than down-regulated lncRNAs in our microarray data.
A, a, Volcano plots of the lncRNA expression profile. The vertical lines correspond to 2.0-fold up and down and the horizontal line represents a P-value of 0.05. The red points in the plot represent the differentially expressed lncRNAs with statistical significance; b, Scatterplot of the lncRNA expression profile, which is useful for assessing the variation (or reproducibility) between chips. B, The number of up-regulated and down-regulated lncRNAs differentially expressed between tumor and normal tissues. C, Tables showing the top 10 differentially expressed lncRNAs. D, Principal component analysis of lncRNA chips.
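One way to derive a set of consistently regulated transcripts is to intersect the per-pair lists of differentially expressed transcripts. The Python sketch below assumes that "consistently" means significant in every tumor/normal pair; this reading, the function name and the example identifiers are assumptions for illustration, not details stated by the analysis provider.

```python
def consistently_regulated(per_pair_lists):
    """Return the transcripts that appear in every per-pair list."""
    sets = [set(names) for names in per_pair_lists]
    if not sets:
        return set()
    return set.intersection(*sets)

# Hypothetical up-regulated lists for three tumor/normal pairs.
pairs_up = [
    ["lnc_1", "lnc_2", "lnc_3"],
    ["lnc_2", "lnc_3", "lnc_4"],
    ["lnc_3", "lnc_2"],
]
common_up = consistently_regulated(pairs_up)
```

Under this definition the consistent set can never be larger than the smallest per-pair list, which gives a quick sanity check on reported counts.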
Differentially expressed mRNAs
Microarray analysis identified a total of 17,384 mRNAs in the GCA tissues (Fig. 2A, http://cancerresearch.faisco.cn/). Among the fifteen human primary GCA specimens and their paired adjacent non-cancerous samples, an average of 2619 up-regulated mRNAs (range 1130 to 3452) and 1395 down-regulated mRNAs (range 782 to 3133) were significantly differentially expressed (fold change ≥2.0) (Fig. 2B). In the GCA tissues, 986 mRNAs were consistently up-regulated and 257 mRNAs were consistently down-regulated (http://cancerresearch.faisco.cn/). ASHG19A3A023341 (fold change: 118.7051639) was the most up-regulated mRNA, and ASHG19A3A046477 (fold change: 114.8326712) was the most down-regulated mRNA (Fig. 2C). Hierarchical clustering analysis presented the relationships among the mRNA expression patterns of the specimens (Fig. 2D).
A, a, Volcano plots of the mRNA expression profile; b, Scatterplot of the mRNA expression profile, which is useful for assessing the variation (or reproducibility) between chips. B, Number of up-regulated and down-regulated mRNAs differentially expressed between tumor and normal tissues in each clinical sample. C, Tables showing the top 10 differentially expressed mRNAs. D, Principal component analysis of mRNA chips.
GO analysis
Gene Ontology (GO) analysis was performed to determine the gene and gene product attributes in biological processes, cellular components and molecular functions. Fisher's exact test was used to determine whether there is more overlap between the differentially expressed (DE) list and the GO annotation list than would be expected by chance. The p-value denotes the significance of GO term enrichment in the DE genes: the lower the p-value, the more significant the GO term (p-value ≤0.05 is recommended). The most highly enriched GO terms targeted by up-regulated transcripts were tissue homeostasis (GO:0001894; ontology: biological process; p=0.0003) (Fig. 3A), cytosolic ribosome (GO:0022626; ontology: cellular component; p=5.13E-29) (Fig. 3B) and structural constituent of ribosome (GO:0003735; ontology: molecular function; p=2.80E-21) (Fig. 3C). The most highly enriched GO terms targeted by down-regulated transcripts were tissue homeostasis (GO:0001894; ontology: biological process) (Fig. 3D), anaphase-promoting complex (GO:0005680; ontology: cellular component; p=0.0018) (Fig. 3E) and metal ion binding (GO:0046872; ontology: molecular function; p=0.00057) (Fig. 3F) (http://cancerresearch.faisco.cn/).
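The enrichment test can be illustrated with a short Python sketch. It assumes a one-sided Fisher's exact test for over-representation, which is equivalent to the upper tail of the hypergeometric distribution; the text states only that Fisher's exact test was used, so the one-sided form, the function name and the example counts are assumptions.

```python
from math import comb

def enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    N: number of annotated genes in the background
    K: background genes annotated with the GO term
    n: number of DE genes
    k: DE genes annotated with the GO term
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of observing k or more annotated DE genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical counts: 20 background genes, 5 carry the term,
# 5 genes are DE, and 3 of those carry the term.
p = enrichment_p_value(k=3, n=5, K=5, N=20)
```

With these counts the tail sums to 1126/15504, about 0.073, so the term would not pass the recommended 0.05 cut-off. Real analyses test thousands of terms, so a multiple-testing correction is usually applied on top of the raw p-values.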
The Gene Ontology project provides a controlled vocabulary to describe gene and gene product attributes in any organism (http://www.geneontology.org). The ontology covers three domains: Biological Process, Cellular Component and Molecular Function. Fisher's exact test is used to find if there is more overlap between the DE list and the GO annotation list than would be expected by chance. The p-value denotes the significance of GO terms enrichment in the DE genes. The lower the p-value, the more significant the GO Term (p-value<=0.05 is recommended). A-C, The highest enriched GOs targeted by up-regulated transcripts. A, biological process (BP). B, cellular component (CC). C, molecular function (MF). D-F, The highest enriched GOs targeted by down-regulated transcripts. D, biological process (BP). E, cellular component (CC). F, molecular function (MF). A-F: The chart shows the top ten counts of the significant enrichment terms.
Pathway analysis
Pathway analysis indicated that 20 pathways corresponded to up-regulated transcripts (Fig. 4A) and that the most enriched network was "Ribosome - Homo sapiens (human)" (Fisher P-value=4.86E-25), composed of 49 targeted genes. Pathway analysis also showed that 8 pathways corresponded to down-regulated transcripts (Fig. 4B) and that the most enriched network was "Collecting duct acid secretion - Homo sapiens (human)" (Fisher P-value=0.0055), composed of 3 targeted genes (http://cancerresearch.faisco.cn/; the recommended p-value cut-off is 0.05). Among these pathways, the gene category "Epstein-Barr virus infection" (Fig. 4C) has been reported to be involved in the carcinogenesis of gastric cardia adenocarcinoma and gastric adenocarcinoma [21,22,23]; the gene category "Gastric acid secretion" has been investigated as a cause of adenocarcinoma at the gastro-oesophageal junction and the distal esophagus [24,25]; and the gene category "Mismatch repair" has been shown to be involved in the development of GCA [26].
Pathway analysis is a functional analysis that maps genes to KEGG pathways. The p-value (EASE-score, Fisher P-value or Hypergeometric P-value) denotes the significance of the pathway correlated to the conditions. The lower the p-value, the more significant the pathway (the recommended p-value cut-off is 0.05). A, Pathways corresponding to up-regulated transcripts. B, Pathways corresponding to down-regulated transcripts. A-B, The bar plot shows the top ten enrichment scores (-log10(P-value)) of the significantly enriched pathways. C, Schematic diagram of the gene category "Epstein-Barr virus infection". Yellow nodes are associated with down-regulated genes, orange nodes are associated with up-regulated or only whole-dataset genes, and green nodes have no significance.
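The enrichment score plotted in the bar charts is the negative base-10 logarithm of the p-value. The Python sketch below applies that transformation to the two Fisher P-values reported above; the helper name is hypothetical.

```python
import math

def enrichment_score(p_value):
    """Enrichment score as defined in the figure legend: -log10(p)."""
    return -math.log10(p_value)

# Fisher P-values reported for the top pathway in each direction.
score_ribosome = enrichment_score(4.86e-25)        # up-regulated set
score_collecting_duct = enrichment_score(0.0055)   # down-regulated set
cutoff_score = enrichment_score(0.05)              # recommended cut-off
```

On this scale the 0.05 cut-off corresponds to a score of about 1.30, the "Ribosome" pathway scores about 24.3 and "Collecting duct acid secretion" about 2.26, so larger bars indicate stronger enrichment.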
Real time quantitative PCR validation
We randomly selected eight up-regulated lncRNAs and four down-regulated lncRNAs from the differentially expressed lncRNAs. To confirm the expression levels of the altered lncRNAs in GCA patients, another 12 GCA tissues and paired non-cancerous tissues were examined for the expression of these lncRNAs using quantitative real-time PCR (qRT-PCR). The qRT-PCR results were consistent with the microarray data (p < 0.05, Fig. 5A and B).
A, Comparison of lncRNA expression levels between microarray and qRT-PCR results. Eight up-regulated and four down-regulated differentially expressed lncRNAs were validated by qRT-PCR. The Y-axis of the columns in the chart represents the log-transformed median fold changes (T/N) in expression across twelve samples (p < 0.05). The qRT-PCR results are consistent with the microarray data. B, Distributions of the lncRNA expression levels (p < 0.05). Twelve differentially expressed lncRNAs were validated by qRT-PCR in ten human GCA and patient-paired non-cancerous samples.
Discussion
In the past decade, it has been found that the genome is widely transcribed, whereas only about 1% of the genome consists of protein-coding genes [4]. Although the products of this pervasive transcription, termed "non-coding" RNAs, were once seen as transcriptional "noise", a growing body of research has shown that these non-coding RNAs, especially long non-coding RNAs (lncRNAs), are biologically functional [7,27]. Firstly, the abundance of lncRNAs in cancer suggests a potential role in tumorigenesis [15]; secondly, like protein-coding genes, lncRNAs may function as oncogenes or tumor suppressors by regulating the expression of related protein-coding genes or by directly controlling transcriptional alterations [9]. Hence, lncRNAs could potentially be used for cancer diagnosis and prognosis [7].
Our previous research found that miR-645 can promote proliferation and inhibit apoptosis of human gastric cardia adenocarcinoma (GCA) cells through down-regulation of a tumor suppressor, SOX11. However, the mechanisms of GCA tumorigenesis are far from elucidated, and studies of lncRNA dysregulation in GCA have not yet been reported.
This study is the first to show the differential lncRNA expression profile of GCA and non-cancerous samples using microarray analysis. We found that lncRNA expression levels in GCA samples were altered compared with adjacent non-cancerous tissues. We analyzed fifteen human primary GCA specimens and their paired adjacent noncancerous samples using microarray analysis and randomly selected several lncRNAs for validation by quantitative real-time PCR in 16 pairs of human primary GCA and paired adjacent noncancerous tissue samples. The microarray expression profiles showed that 24,842 lncRNAs were expressed; 1880 up-regulated lncRNAs and 293 down-regulated lncRNAs were significantly differentially expressed (fold change ≥2.0) in the fifteen GCA samples. Moreover, Gene Ontology (GO) analysis and pathway analysis were performed and a coexpression network was constructed for a preliminary study of the biological functions and potential mechanisms of these lncRNAs in the tumorigenesis of GCA.
To our knowledge, no research on the lncRNA and mRNA microarray expression profile of human GCA has been reported to date. In the past decade, most interest seems to have focused on single nucleotide polymorphisms (SNPs) of particular genes. More often than not, those SNPs were not associated with the risk of GCA. For example, MMP-7-181A/G [28] and the MMP-3 promoter polymorphism [29] might not be associated with the development and lymphatic metastatic potential of GCA; the XRCC1 Arg194Trp and Arg399Gln SNPs might not be associated with the risk of GCA [30]; the FAS-1377 G/A and FASL-844 T/C polymorphisms were not associated with the risk of GCA [31]; the Lys939Gln SNP in exon 15 may have no influence on the risk of ESCC and GCA [32]; and the IL-10-G1082A polymorphism might not be usable as a stratification marker to predict the risk of ESCC and GCA development. However, some SNPs might be associated with the risk of GCA: the FAS-670 A/G genotype might decrease the risk of GCA in smokers [31]; thymidylate synthase (TS) polymorphisms might be associated with susceptibility to GCA [33]; TNF-alpha-308G/A and TNF-beta+252G/A genotyping may be used as stratification markers to predict the risk of ESCC and GCA development in North China [34]; and a functional polymorphism in the matrix metalloproteinase-2 gene promoter (-1306C/T) is associated with the risk of development, but not metastasis, of GCA [35]. We examined the genes listed above in the microarray expression profiles of all 15 GCA samples, and none showed a significant difference in expression between GCA tumours and adjacent noncancerous tissues.
Because adenocarcinoma of the esophagocardia region is probably a different disease from cancer of the rest of the stomach, we then compared our findings with several previous studies of gene expression in gastric cancer. Yu et al. studied the differences in gene expression profiles among gastric cancer, pericancerous epithelium and normal gastric mucosa using gene chips [36]. We examined the genes published in that paper and found that none was significantly differentially expressed in the GCA tissues compared with paired normal tissues. Cytoplasmic expression of HER1, HER3 and HER4 was reported in 45, 62 and 24% of gastric cancers, respectively [37]; however, the expression of the EGFR family members showed no difference between GCA tissues and paired normal tissues.
It has been reported that there are epidemiological differences between oesophageal adenocarcinoma (OA) and GCA [38], suggesting that these two malignancies are separate entities with different risk factors. van Baal et al. studied the transcriptomes of Barrett's esophagus (BE), squamous epithelium and gastric cardia epithelium, finding that BE is an incompletely differentiated type of epithelium that shows similarities to both normal squamous and gastric cardia epithelia [39]. Zambon et al. found that BAGE or GAGE expression was significantly related to a poor prognosis, whereas the expression of MAGE genes (in the absence of BAGE and GAGE expression) was significantly related to a good prognosis in esophageal squamous cell carcinoma and adenocarcinoma of the gastric cardia [40]; however, BAGE, GAGE and MAGE were neither up-regulated nor down-regulated in GCA tissues compared with normal tissues.
The finding that genes which might be associated with the development of gastric cancer or esophageal cancer did not differ significantly between GCA and paired normal tissues may suggest that GCA is a unique entity, distinct from esophageal cancer and gastric cancer, with its own developmental mechanisms.
An enhancer-like role is one of the important mechanisms by which lncRNAs can regulate the transcription of protein-coding genes [41,42]. We therefore summarized enhancer-like lncRNAs according to their locations relative to protein-coding genes in the genome and identified 17 potential lncRNAs in total. Genes involved in chromatin structure and epigenetic processes are sometimes used as prognostic factors; for example, a four-gene classifier containing the non-coding gene H19, the histone gene HIST1H3F and the two small nucleolar RNAs SNORA16A and SNORD14C was developed that assigns cases of squamous cell carcinoma of the larynx to low- and high-risk classes. In the present study, an up-regulated lncRNA, NR_026790, was found to be located upstream of HIST1H3F, which encodes histone H3.1, a protein responsible for the nucleosome structure of the chromosomal fiber in eukaryotes [43]. Another up-regulated lncRNA, AL832916, was found to be located upstream of POLE4, a histone-fold protein that interacts with other histone-fold proteins to bind DNA in a sequence-independent manner. An up-regulated lncRNA, ENST00000430562, was found to be located upstream of ZBED1, which functions as a transcription factor by binding to DNA elements found in the promoter regions of several genes related to cell proliferation, such as histone H1. Hence, ZBED1 may have a role in regulating genes related to cell proliferation [44]. An up-regulated lncRNA, NR_027418, was found to be located upstream of KPNA2, a consistent marker of poor prognosis in multiple cancers, including esophageal squamous cancer, bladder cancer, gastric cancer, ovarian malignant germ cell tumor, non-small cell lung cancer and infiltrative astrocytoma [45].
To understand the functions of lncRNAs and the mechanisms by which they act, pathway analysis was performed on the differentially expressed lncRNAs; 20 pathways corresponded to up-regulated transcripts and 8 pathways corresponded to down-regulated transcripts. Among these pathways, the "Epstein-Barr virus infection" signaling pathway has been found to be involved in the carcinogenesis of gastric cardia adenocarcinoma and gastric adenocarcinoma; Epstein-Barr virus (EBV) is a ubiquitous human herpesvirus that is associated with oncogenesis [46]. Moreover, the "mismatch repair" signaling pathway is involved in the development of gastric cardia adenocarcinoma. DNA mismatch repair (MMR) is a highly conserved biological pathway that plays a key role in maintaining genomic stability. MMR corrects DNA mismatches generated during DNA replication, thereby preventing mutations from becoming permanent in dividing cells.
This microarray study has provided a preliminary understanding of the potential mechanisms of gastric cardia adenocarcinoma carcinogenesis through GO analysis, signaling pathway analysis and lncRNA classification analysis. Furthermore, this study is the first to focus on the molecular mechanisms of GCA development at the lncRNA level. In addition, our study showed that lncRNA NR_027418 is located near KPNA2 and that ENST00000430562 is associated with ZBED1, a transcription factor involved in cell proliferation. These findings suggest that lncRNAs may exert their functions in GCA by regulating the transcription of related protein-coding genes. Pathway analysis suggested a connection between signaling pathways and lncRNAs, which implies that lncRNAs may be used to modulate biological activities or serve as biomarkers for the diagnosis or prognosis of GCA. However, further work, including functional and mechanistic experiments, is needed to elucidate the potential roles of lncRNAs in GCA.
Abbreviations
lncRNA (long non-coding RNA); GCA (Gastric cardiac adenocarcinoma).
Acknowledgements
Supported by National Natural Science Foundation of China, No. 81301763.
Disclosure Statement
All authors have nothing to disclose.