Abstract
MicroRNAs (miRNAs) are promising biomarkers for the diagnosis and prognosis of various diseases. Quantitative PCR is the most frequently used method of measuring expression levels of miRNA. However, the lack of validated reference genes represents the main source of potential bias in results. It is normal practice to use small nuclear RNAs as reference genes; however, they often have variable expression. Researchers tend to prefer the most stable reference genes in each experiment. The review includes reference genes for the following tissue types: gliomas, lung cancer, melanoma, gastric cancer, liver cancer, prostate cancer, breast cancer, thyroid cancer, ovarian cancer, cervical cancer, endometrial cancer, rectal cancer, blood tumors, and placental tissues.
Highlights of the Study
Different studies use different normalization strategies to report miRNA expression data and there is no universal reference gene.
A complex normalizer, which is the geometric mean of miRNA-16-5p, miRNA-103a-3p, miRNA-191-5p, may be used as a universal reference gene.
Numerous studies have shown that small nuclear RNAs have variability in expression and that miRNAs are preferable for use as reference genes.
Introduction
MicroRNAs (miRNAs) are small noncoding RNAs. They regulate gene expression post-transcriptionally [1]. Ludwig et al. [2] measured miRNA profiles for tissues in various organs and showed that each tissue expresses a unique set of more than 1,000 miRNAs, of which 143 were found in all tissues studied. There is little doubt that aberrant miRNA expression may entail the initiation and progression of various diseases [3]; thus, miRNAs are viewed as promising diagnostic and prognostic biomarkers [4].
Basically, there are two strategies in use for exploring the role played by miRNAs in various diseases: analyzing circulating miRNAs and analyzing miRNAs in solid tumors [5, 6]. The material used includes fresh frozen (FF) tissue, formalin-fixed paraffin-embedded (FFPE) sections, biopsy samples on glass slides, and cell cultures [7-9]. Measured RNA expression depends on the stabilization methods used [10]. Expression levels of miRNA were shown to depend on the storage time of the FFPE block, and U6B, the most frequently used reference gene, was shown to have higher degradation rates than miRNA-141-3p or miRNA-221-3p [11]. Although fixation with formalin was reported to reduce the expression levels of miRNAs, their expression profiles in FF and FFPE samples from the same tumor were normally still strongly correlated, with miR-103a-3p being the most stable [12]. Accordingly, comparative analysis of miRNA expression levels should be performed on samples stabilized using an identical method.
Accurate measurement of variation in expression is essential for understanding the roles played by miRNAs in biological processes. Normalization of expression levels is an important step for ensuring accurate quantitative assessment of qPCR data. It is performed to distinguish true variation in the expression levels of the target from variation introduced experimentally. Factors affecting the results include sampling protocols, the quantity and quality of the input material, and RNA extraction methods, to name a few [13]. Apart from technical factors, one of the major problems in interpreting qPCR data is the proper choice of reference genes. A good reference gene should show minimal variation in expression between the subgroups being compared and stable expression regardless of RNA isolation methods or storage conditions [14]. Wong et al. [15] demonstrated that the reference gene used and the study target should belong to the same class of RNA.
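To illustrate what normalization to a reference gene does in practice, the following minimal sketch computes a relative expression value with the widely used 2^(-ΔΔCt) calculation. This calculation is not described in the text above; it is shown here as an assumption, and it further assumes roughly 100% amplification efficiency for both target and reference. All function names and Ct values are hypothetical.

```python
def delta_ct(ct_target, ct_reference):
    """Ct of the target minus Ct of the reference in the same sample."""
    return ct_target - ct_reference


def fold_change(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression (case vs. control) as 2^(-ddCt).

    Assumes amplification efficiency of about 2 per cycle for both assays.
    """
    ddct = delta_ct(ct_target_case, ct_ref_case) - delta_ct(ct_target_ctrl, ct_ref_ctrl)
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier in the case sample.
print(fold_change(24.0, 20.0, 26.0, 20.0))

# Loading less input material shifts target and reference Ct together,
# so the normalized result is unchanged.
print(fold_change(25.0, 21.0, 26.0, 20.0))
```

In this sketch a uniform shift in Ct caused by sample input cancels out, which is exactly why an unstable reference gene is a problem: any change in the reference Ct that is not due to input amount is passed directly into the result.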
Several mathematical approaches such as geNorm, NormFinder, and BestKeeper have been developed to help identify suitable reference genes (i.e., the genes with the least variation and high stability in biological samples) [16-18]. However, none of them is the gold standard, and researchers make their choices on a case-by-case basis.
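As an illustration of how such algorithms rank candidates, the sketch below computes a simplified geNorm-style stability measure: for each candidate, the average standard deviation of its pairwise Ct differences with every other candidate across samples (lower means more stable). This is a simplified reimplementation for illustration only, not the published software; it assumes an amplification efficiency of about 2, so that Ct differences stand in for log2 expression ratios. Gene names and Ct values are hypothetical.

```python
from statistics import mean, stdev


def stability_m(ct_by_gene):
    """Return {gene: M}, where M is the mean SD of pairwise Ct differences.

    ct_by_gene maps a gene name to a list of Ct values, one per sample,
    with samples in the same order for every gene.
    """
    genes = list(ct_by_gene)
    m_values = {}
    for gene_j in genes:
        pairwise_sds = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            diffs = [a - b for a, b in zip(ct_by_gene[gene_j], ct_by_gene[gene_k])]
            pairwise_sds.append(stdev(diffs))
        m_values[gene_j] = mean(pairwise_sds)
    return m_values


# Hypothetical data: two candidates that track each other and one that does not.
cts = {
    "miR-A": [20.0, 21.0, 22.0, 23.0],
    "miR-B": [25.0, 26.0, 27.0, 28.0],
    "snRNA-C": [18.0, 22.0, 17.0, 23.0],
}
m = stability_m(cts)
least_stable = max(m, key=m.get)
print(m, least_stable)
```

The pairwise design means a candidate is judged against the other candidates rather than against an absolute standard, so the ranking depends on which candidates are included in the panel.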
It should be noted that a number of studies have shown that U6, which is the most commonly used reference gene, has variable expression and therefore cannot be the optimal reference gene for miRNA analysis [19, 20]. Meanwhile, in 300 studies randomly selected from PubMed publications retrieved using the search criteria “microRNA” + “qPCR” + “cancer”, we observed the following preferences for reference genes: 84% used the small nuclear RNA (snRNA) U6, 10% used U48 or U58, and only 6% chose from miRNAs with the most stable expression in each particular experiment. Despite many thousands of publications exploring the effects of miRNAs on human diseases, there exists no single standard for normalizing qPCR data; hence, the expression levels of candidate reference genes need to be checked for stability in each experiment. However, in some studies, expression levels of many miRNAs cannot be analyzed due to limited sample amounts or funds for developing miRNA systems for analysis. In such cases, the only solution is to select reference genes based on an analysis of published data. Systematization of the data on reference genes in various human tissues, which has been performed in this review, will help researchers choose optimal reference genes for their studies. The aim of our work is to systematize the data on reference genes used for analyzing expression levels of miRNA in tumors of different origin by qPCR.
Search Strategy
We searched for studies exploring the expression levels of miRNAs in various tissues by qPCR to ascertain which reference genes are tissue-specific and which ones are universal. The tissues of interest were the malignancies most often referred to in the literature: gliomas, lung cancer (LC), melanoma, gastric cancer, liver cancer, prostate cancer (PC), breast cancer (BC), thyroid cancer, ovarian cancer (OC), cervical cancer (CC), endometrial cancer, rectal cancer (RC), and blood tumors. Additionally, we considered placental tissues. To exclude review articles, electronic searches were conducted using the following search criteria: “microRNA” and “qPCR” and “human” and “Journal Article”. We examined 2,950 articles in the PubMed database. Articles were reviewed in the third quarter of 2021. To exclude articles related to circulating miRNAs, the variable search criteria used were the name of the disease, the name of the tissue type, and the type of material. Next, at least 50% of the articles for each tissue type were randomly selected for analysis.
Tissue Types
The reference genes for the analysis of miRNAs in different tissues are summarized in Table 1.
Brain Tumors
snRNAs are most often used as reference genes for profiling miRNA expression in glioblastomas in both FF and FFPE samples [21-23].
Lung Cancer
Melanoma
miRNA-191-5p was found to be the most stable miRNA in melanoma cells, whereas U6 expression levels depend on hormone levels; thus, it is better not to regard U6 as a candidate reference gene [35].
Placenta
Today, snRNAs (predominantly U6) are most commonly used as reference genes when analyzing the miRNA expression levels in placental tissue.
Gastric Cancer
Anauate et al. [47] assessed the stability of potential reference genes (U6, U44, U48, let-7a-5p, miR-28-5p, miR-101-3p, miR-140-3p, miR-152-3p, and miR-374a-3p) profiling miRNA expression in gastric tissues and demonstrated that the best reference gene is a combination of miR-101-3p and miR-140-3p, while U6, U44, and U48 were not suitable options.
Liver Cancer
Jacobsen et al. [20] proposed the use of a combination of miR-151a-5p, miR-425-5p, and miR-24-3p as reference genes for hepatocytes, pointing out that U6, which is most often used as a reference gene, had the lowest stability among all the potential reference genes. Expression levels of miR-30c-5p, miR-30b-5p, and miR-126-3p were demonstrated to be stable in liver tissue, while U6, U48, and U44 had low stability and thus were not suitable for use as reference genes [53].
Breast Cancer
Davoren et al. [62] reported that let-7a-5p and miR-16-5p were the top two miRNAs most stably expressed in FF samples of BC tissue and Chen et al. [63] confirmed that let-7a-5p was the most stable reference gene for the analysis of BC samples using FFPE tissue. In addition, Rinnerthaler et al. [64] confirmed that miRNA-16-5p is expressed at invariable levels in FFPE samples of BC tissue, but noted that this reference gene is suitable for metastatic BC, while the primary tumor should be analyzed using miR-29a-3p.
Thyroid Cancer
miRNA expression profiling in thyroid samples often involves the use of snRNAs (U6, U48, and U44) as reference genes [71, 72, 75]. Titov et al. [74] proposed a fundamentally different approach regarding choosing reference genes: they selected their reference genes based on the NanoString outputs. For example, miR-151a-3p, -197-3p, -99a-5p, and -214-3p had a relatively low variation in expression among the 800 miRNAs analyzed.
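A selection of this kind can be sketched as ranking miRNAs by how little their counts vary across samples. The text does not state which variation metric was used; the coefficient of variation (standard deviation divided by mean) below is an assumption made for illustration, and the miRNA names and counts are hypothetical rather than NanoString data.

```python
from statistics import mean, stdev


def rank_by_cv(counts_by_mirna):
    """Return (miRNA, CV) pairs sorted from least to most variable.

    counts_by_mirna maps a miRNA name to its normalized counts across samples.
    CV = sample standard deviation / mean.
    """
    cv = {name: stdev(values) / mean(values) for name, values in counts_by_mirna.items()}
    return sorted(cv.items(), key=lambda item: item[1])


# Hypothetical normalized counts for three miRNAs across four samples.
counts = {
    "miR-X": [100.0, 102.0, 98.0, 100.0],
    "miR-Y": [50.0, 150.0, 20.0, 300.0],
    "miR-Z": [900.0, 1100.0, 950.0, 1050.0],
}
ranked = rank_by_cv(counts)
print(ranked)
```

Dividing by the mean makes miRNAs with very different abundance comparable, which matters when candidates are drawn from a panel of several hundred targets.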
Ovarian Cancer
Azzalini et al. [81] used the geometric mean of let-7e-5p and miR-423-3p as a reference gene for miRNA expression profiling in FFPE samples of OC tissue. U48 and miR-191-5p were noted as being among the most stable reference genes [82]. miRNA-27a-3p was found to be appropriate as a reference gene for both FF and FFPE samples of OC tissue [83, 84].
Cervical Cancer
Cervical cytology is an efficient approach for reducing the mortality from CC worldwide. However, the sensitivity of the standard cervical cytology test may be too low to detect cervical intraepithelial neoplasia; hence, more sensitive molecular diagnostic tests that could substantially improve detection rates and diagnostic accuracy need to be developed. miRNA expression profiling in cell smear samples obtained in a minimally invasive way by liquid-based cytology (LBC) has shown promise for early diagnosis of CC. U6 and U49 have been described as being the most stable reference genes for miRNA expression profiling in LBC samples from patients who had undergone CC screening [92].
Prostate Cancer
For miRNA expression profiling in PC tissue samples, data are usually normalized to snRNAs [95, 96]. However, Schaefer et al. [97] recommend the use of miR-130b-3p or the geometric mean of miR-130b-3p and U6-2 for the purposes of normalization in PC tissue samples because this combination is subject to much lower variation than U6-2 alone.
Endometrial Cancer
miRNA expression profiling in endometrial cancer tissue samples commonly involves the use of snRNAs, either alone or as combinations [107-109].
Bone Marrow
Complex reference genes are the most common choices for profiling miRNA expression in bone marrow samples [114-117].
Colorectal Cancer
Various combinations of miR-520d-3p, miR-1228-3p, and miR-345-5p have been shown to be suitable reference genes for miRNA expression profiling in FF samples of colorectal cancer tissue [121]. Chang et al. [122] identified a combination of miR-16-5p and miR-345-5p as the most stable reference gene for the analysis of FFPE samples of colorectal cancer tissue.
Rectal Cancer
The mean expression of miR-27a-3p, miR-193a-5p, and let-7g-5p has been reported to be best for qPCR-based miRNA expression profiling in RC tissue [123]. Thus, miRNA-16-5p, miRNA-103a-3p, and miRNA-191-5p as well as snRNAs U6, U44, and U48 are commonly used reference genes in 14 human tissue types (Fig. 1). However, numerous studies have demonstrated that snRNAs have variability in expression and that miRNAs are preferable for use as reference genes. Unfortunately, it is difficult to identify the trend of using individual miRNAs as reference genes for each tissue type. We have listed all miRNAs found in publications as reference genes in 14 tissues and decided that the conclusion about the possibility of using a universal reference gene is more important. Thus, we think that a complex reference gene, which is the geometric mean of threshold fluorescence cycles of miRNA-16-5p, miRNA-103a-3p, and miRNA-191-5p, may be used as a universal reference gene to study miRNA expression levels.
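The proposed complex normalizer can be sketched as follows: the geometric mean of the threshold cycle (Ct) values of the three reference miRNAs is computed per sample and subtracted from the Ct of the target. The function names, the target miRNA, and the Ct values are hypothetical and serve only to show the arithmetic; this is a minimal sketch, not a validated analysis pipeline.

```python
import math

# The three reference miRNAs named in the text.
REFERENCE_MIRNAS = ("miR-16-5p", "miR-103a-3p", "miR-191-5p")


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalized_delta_ct(sample_ct, target):
    """Ct of the target minus the geometric mean Ct of the reference miRNAs.

    sample_ct maps miRNA names to Ct values measured in one sample.
    """
    reference_ct = geometric_mean([sample_ct[name] for name in REFERENCE_MIRNAS])
    return sample_ct[target] - reference_ct


# Hypothetical Ct values for one sample; miR-21-5p is an arbitrary example target.
sample = {
    "miR-16-5p": 20.0,
    "miR-103a-3p": 22.0,
    "miR-191-5p": 24.0,
    "miR-21-5p": 18.0,
}
print(normalized_delta_ct(sample, "miR-21-5p"))
```

Averaging several references dampens the effect of any single reference that happens to vary in a given tissue, which is the rationale for preferring a combination over one gene.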
Discussion
In search of suitable reference genes, researchers normally follow three strategies: (1) searching through literature sources for the most frequently used reference gene, (2) using mathematical algorithms to choose the most stably expressed RNAs among the ones expressed in a particular experiment, and (3) choosing reference genes among the miRNAs identified by NanoString nCounter as occurring in the most stable quantities. Small noncoding RNAs, and snRNAs in particular, are most commonly used as reference genes. However, snRNAs are not miRNAs, and the efficiencies of their extraction, reverse transcription, and PCR amplification may therefore differ from those of miRNAs.
A large number of papers showing variation in snRNA expression levels in different tumor tissues have been published. Torres et al. [126] considered a selection of candidate reference genes (miR-16-5p, miR-26b-5p, miR-92a-3p, U44, U48, U75, U54, U6, U49, U6B, U38B, and U18A) and found U48, U75, and U44 to be most stable in FFPE endometrial carcinoma tissues. Their study confirmed the possibility of using snRNAs as reference genes; however, snRNAs clearly differ in their levels of stability. The use of combinations of snRNAs (in particular, a combination of U6 and U47 or a combination of U66, U44, and U48) is preferred for the analysis in RC [124, 125]. A complex reference gene composed of U44, U48, and U6 has been used for the analysis of FFPE samples of BC tissue and lymph nodes [69]. The geometric mean of as many as five snRNAs (U61, U68, U72, U95, and U96A) was used for the analysis of miRNA expression levels in human bone marrow-derived multipotent stromal cells [120]. The use of multiple reference genes was shown to improve quantification accuracy over that achievable with a single reference gene for miRNA expression profiling [16]. Although various mathematical algorithms for the identification of the most stable genes in individual experiments are becoming increasingly popular when reference genes need to be chosen, U6 remains the most commonly used reference gene.
A review of publications has shown that the same miRNA can be successfully used as a reference gene in one tissue and have aberrant expression in another one. Liang et al. [127] analyzed the expression levels of 345 miRNAs in 40 normal tissues and found miR-30e-5p, miR-92a-3p, and miR-423-3p to have the least variable expression. Although their study identified promising reference genes that can be universally used in different tissues, literature analysis has demonstrated that miRNA-16-5p, miRNA-103a-3p, and miRNA-191-5p are most commonly used as universal reference genes [29, 31, 35, 42, 89, 99, 110].
miRNA expression levels can be affected by the heterogeneity of tumor tissue, depending on the sampling site, which may account for the bias in the results obtained by different laboratory teams [65]. In addition, the results can be affected by the dimensions of the samples. The test samples being handled can often be very small, which can lead to biomarker misidentification [128].
Conclusions
We conclude that it is most appropriate to use a complex reference gene represented by several miRNAs with stable expression in each experiment. We suggest that a complex normalizer, which is the geometric mean of threshold fluorescence cycles of miRNA-16-5p, miRNA-103a-3p, and miRNA-191-5p, may be used as a universal reference gene to study miRNA expression levels.
Statement of Ethics
The authors have no ethical conflicts to disclose.
Conflict of Interest Statement
The authors have no competing interests.
Funding Sources
The work of S.E. Titov was financially supported by the Russian Science Foundation (project No. 20-14-00074). The work of Y.A. Veryaskina was supported by the Russian Foundation for Basic Research (project No. 19-34-60024).
Author Contributions
Y.A. Veryaskina contributed to the conception and design of the work and wrote the manuscript draft. S.E. Titov contributed to the conception and design of the work and edited the manuscript draft. I.F. Zhimulev supervised this study and reviewed and edited the manuscript. All authors have read and approved the final manuscript.