Abstract
Background/Aims: Embryo implantation is an essential process for eutherian pregnancy, but this process varies across eutherians. The genomic mechanisms that led to the emergence and diversification of embryo implantation are largely unknown. Methods: In this study, we analyzed transcriptomic changes during embryo implantation in mice and rats by using RNA-seq. Bioinformatics and evolutionary analyses were performed to characterize implantation-associated genes in these two species. Results: We identified a total of 518 differentially expressed genes in mouse uterus during implantation, of which 253 genes were up-regulated and 265 genes were down-regulated at the implantation sites compared with the inter-implantation sites. In rat uterus, there were 374 differentially expressed genes, of which 284 genes were up-regulated and 90 genes were down-regulated. A cross-species comparison revealed that 92 up-regulated genes and 20 down-regulated genes were shared. The differences and similarities between mice and rats were investigated further at the gene ontology, pathway, network, and causal transcription factor levels. Additionally, we found that embryo implantation might have evolved through the recruitment of ancient genes into uterine expression. The evolutionary rates of the differentially expressed genes in mouse and rat uterus were significantly lower than those of the non-changed genes, indicating that implantation-related genes are evolutionary conserved due to high selection pressure. Conclusion: Our study provides insights into the molecular mechanisms involved in the evolution of embryo implantation.
Introduction
Embryo implantation is essential for eutherian pregnancy, which is defined as the process by which the blastocyst establishes an intimate connection with the uterus [1]. Although there are generally three phases of implantation, namely, apposition, adhesion, and penetration, this process varies across eutherians [2]. On the basis of the different types of blastocyst-uterine cell interactions, implantation can be classified into three broad categories: centric, eccentric, and interstitial [3].
The genomic mechanisms that led to the emergence and diversification of eutherian implantation are largely unknown. Uterine prolactin (PRL) expression is an innovation of eutherian mammals to silence the expression of IL-6 and 20α-HSD, which are detrimental to the developing embryo [4]. Systemic PRL is produced by the anterior pituitary, whereas the promoter for uterine PRL is derived from the eutherian-specific transposable element MER20 located upstream of the pituitary promoter [5, 6]. In fact, thousands of cis-regulatory elements that mediate gene expression in the uterus during pregnancy are derived from ancient mammalian transposable elements [7, 8]. In addition to cis-regulatory mutations, direct changes in the amino acid sequence of a protein may also have contributed to the evolution of embryo implantation and pregnancy. It has been demonstrated that the transcription factor HOXA11 experienced a period of strong positive selection in the stem-lineage of eutherian mammals to obtain the ability to up-regulate PRL gene expression in endometrial cells [9]. Mechanically, these amino acid changes generated a new functional interaction between HOXA11 and FOXO1, which switched HOXA11 from a repressor to an activator [10, 11]. Moreover, gene family expansion by duplication is likely associated with the diversification of implantation types. In rodents, there is a family of PRL-related genes that regulates pregnancy-dependent adaptations to physiological stressors [12]. In the pig, the species-specific IL1B duplicate, designated as IL1B2, is required for conceptus elongation before implantation [13]. Conceptus elongation is believed to maximize contact between the embryo and the uterine surface for superficial implantation, as well as to inhibit the endometrial production of luteolytic PGF2α [14]. Interferon tau is produced by conceptus in ruminants, which is essential for the maternal recognition of pregnancy in ruminants. The ancestral interferon tau gene emerged approximately 36 million years ago from a duplication of the interferon omega gene [15].
Mice (Mus musculus) and rats (Rattus norvegicus) are closely related rodents that last shared a common ancestor 12–24 million years ago [16]. Both mice and rats demonstrate rapid eccentric implantation with apposition, attachment, and invagination of the uterine epithelium that occurs in 6 h [3]. In the present study, we performed comparative transcriptomic analysis of embryo implantation in mice and rats using RNA-seq. The differences and similarities between mice and rats were investigated further at the gene ontology (GO), pathway, network, and causal transcription factor levels. Additionally, we found that embryo implantation might have evolved through the recruitment of ancient genes into uterine expression. The evolutionary rates of the differentially expressed genes in mouse and rat uterus were significantly lower than those of the non-changed genes, indicating that implantation-related genes are evolutionary conserved due to high selection pressure. Our study provides insights into the molecular mechanisms involved in the evolution of embryo implantation.
Materials and Methods
Sample collection
Natural pregnancy was established by co-caging adult females with fertile males. The day of the appearance of the vaginal plug was designated as day 1 of pregnancy. Uterine segments were obtained on days 5 and 6 of pregnancy from mice and rats, respectively. Implantation sites (IS) and inter-implantation sites (IIS) were identified after an intravenous injection of Chicago Blue B dye solution. The embryos were removed by gently and repeatedly flushing the uterus with saline. The complete removal of embryos was ensured by counting the number of recovered embryos. The integrity of the uterine epithelium after flushing was confirmed by sequential frozen sections. Then, uterine fragments from IS and IIS were collected separately. All collected samples were flash-frozen in liquid nitrogen and stored at -80 °C until use. All animal procedures in this study were approved by the Institutional Animal Care and Use Committee of South China Agricultural University.
RNA-seq
Three pregnant animals were used to collect uterine samples (n = 3). In order to acquire sufficient tissues for RNA-seq analysis, 3 IS segments and 3 IIS segments were pooled for each animal. Total RNA was extracted with the TRIzol reagent (Invitrogen). The RNA quality control parameters were A260/A280 ratio > 1.8, A260/A230 ratio > 2.0, and RNA integrity number > 7.0. A TruSeq RNA Sample Preparation Kit (Illumina) was used to generate RNA-seq libraries. High-throughput sequencing was conducted on the Illumina HiSeq 2500 system. Raw data were aligned to the mouse genome (UCSC mm9) or rat genome (UCSC rn5) using Tophat v2.0.4 [17] with default options. The aligned reads were assembled using Cufflinks v2.2.1 [18]. Differentially expressed genes were chosen on the basis of the criteria of fold change > 2 and false discovery rate (FDR) < 0.05.
Validation by quantitative RT-PCR
Total RNA was extracted with the TRIzol reagent (Invitrogen). A PrimeScript Reverse Transcriptase Reagent Kit (TaKaRa) was used to synthesize cDNA after eliminating potential genomic DNA contamination by DNase I (Invitrogen). Quantitative PCR was performed using THUNDERBIRD SYBR qPCR Mix (Toyobo) on an Applied Biosystems 7500 system (Life Technologies). The Rpl7 gene was chosen as a reference gene for normalization. Primer sequences are listed in Supplementary Table 1(For all supplemental material see www.karger.com/10.1159/000494187/).
GO and pathway analysis
GO analysis was performed using the MGI GOslim database as described previously [19]. To test for enrichment, a hypergeometric test was conducted using in-house PERL scripts. Pathway analysis was performed by using the DAVID tool [20]. FDR < 0.05 was used as a significance threshold to identify enriched GO terms and pathways.
Gene network analysis
A gene network was generated using the STRING database v10.0 [21]. The minimum combined score was set to 0.4 (median confidence). Cytoscape software [22] was applied for visualization and analysis of the gene network. Degree distribution was analyzed using the Cytoscape plugin Network Analyzer [23]. The degree threshold value for hub genes was the mean plus two standard deviations.
Analysis of transcription factor-binding sites
Putative promoter regions (1 kb upstream of the transcription start site) were retrieved from the UCSC genome browser [24]. TESS v6.0 [25] was used to search for matches of position weight matrices available in the TRANSFAC database [26]. The cutoff value for the relative score was set at 0.9. Using all genes in the genome as the background, a hypergeometric test with the Benjamini-Hochberg multiple test correction was conducted using in-house PERL scripts. An adjusted P-value < 0.05 was used as the significance threshold to identify enriched transcription factors.
Estimation of evolutionary gene ages
Gene age datasets were extracted from the ProteinHistorian database [27]. ProteinHistorian estimates gene ages from phylogenetic patterns by integrating different ancestral family reconstruction algorithms and pre-existing databases. We classified the differentially expressed genes into different groups based on their phylogenetic ages.
Evolutionary rate calculation
Gene orthology relationships were batch downloaded from Ensembl release 80 [28], and only one-to-one orthologs were considered. Orthologous nucleotide coding sequences were downloaded and aligned using ClustalW [29]. Pairwise estimates of ω were performed using the maximum likelihood method implemented in codeml (PAML v4, runmode = -2) [30]. High values of ω (dN/dS = ω > 0.5) indicate elevated rates of amino acid sequence divergence and suggest that some sites are likely to be targets of positive selection [31].
Results
Transcriptomic analysis of embryo implantation in mice and rats
To assess the molecular bases of embryo implantation in mice and rats, we performed RNA-seq on IS and IIS from mouse and rat uteri on days 5 and 6 of pregnancy, respectively. Differentially expressed genes were identified using the following criteria: fold change > 2 and FDR < 0.05. There were 518 genes differentially expressed in mouse uterus, of which 253 genes were up-regulated and 265 genes were down-regulated at IS compared with IIS (Fig. 1A and Supplementary Table 2). In rat uterus, a total of 374 genes were differentially expressed, of which 284 genes were up-regulated and 90 genes were down-regulated at IS in comparison with IIS (Fig. 1B and Supplementary Table 3). A cross-species comparison revealed that 92 up-regulated genes and 20 down-regulated genes were shared by mice and rats (Fig. 1C). To validate the RNA-seq data, 12 mouse genes (Fig. 2A) and 12 rat genes (Fig. 2B) with various fold changes were selected and validated by quantitative RT-PCR (qRT-PCR). Although there was variation in fold changes between qRT-PCR and RNA-seq, the expression trend of the selected genes was consistent, indicating the high quality of our RNA-seq data.
Identification of differentially expressed genes at IS compared to IIS in mice and rats. (A) Volcano plot for the comparison between IS and IIS in mice. (B) Volcano plot for the comparison between IS and IIS in rats. The cutoff values of fold change > 2 and FDR < 0.05 were utilized to identify differentially expressed genes. Non-changed genes are shown in blue, up-regulated genes are shown in red, and down-regulated genes are shown in green. (C) Venn diagrams comparing the up-regulated and down-regulated genes between mice and rats.
Identification of differentially expressed genes at IS compared to IIS in mice and rats. (A) Volcano plot for the comparison between IS and IIS in mice. (B) Volcano plot for the comparison between IS and IIS in rats. The cutoff values of fold change > 2 and FDR < 0.05 were utilized to identify differentially expressed genes. Non-changed genes are shown in blue, up-regulated genes are shown in red, and down-regulated genes are shown in green. (C) Venn diagrams comparing the up-regulated and down-regulated genes between mice and rats.
Validation of the selected genes identified by RNA-seq using qRT-PCR. (A) Differentially expressed genes in mouse uterus. (B) Differentially expressed genes in rat uterus. Fold changes determined by RNAseq and qRT-PCR are presented as the mean ± standard error of the mean. Statistical significance was reached at P< 0.05 for all genes.
Validation of the selected genes identified by RNA-seq using qRT-PCR. (A) Differentially expressed genes in mouse uterus. (B) Differentially expressed genes in rat uterus. Fold changes determined by RNAseq and qRT-PCR are presented as the mean ± standard error of the mean. Statistical significance was reached at P< 0.05 for all genes.
Characteristics of the differentially expressed genes at the GO, pathway, and network levels
GO analysis was performed based on MGI GOslim. Enriched GO terms were classified into 3 categories: biological process, cellular component, and molecular function (Supplementary Fig. 1A and Fig. 1B). In the biological process category, 5 enriched GO terms, namely, DNA metabolism, cell organization & biogenesis, metabolic processes, cell cycle & proliferation, and developmental processes, were shared by mice and rats. The enriched GO terms unique to mice were stress response, death, and protein metabolism. The enriched GO term cell adhesion was unique to rats. In the cellular component category, there were 4 enriched GO terms shared by mice and rats: extracellular matrix, non-structural extracellular, cytoskeleton, and nucleus. The enriched GO term endoplasmic reticulum/Golgi was unique to rats. In the molecular function category, the enriched GO term extracellular structural activity was shared by mice and rats. The enriched GO terms kinase activity and enzyme regulator activity were unique to mice, whereas the enriched GO term translation activity was unique to rats.
In addition to GO analysis, we also performed pathway analysis by using DAVID tools. Two pathways, namely, cell cycle and DNA replication, were significantly enriched among the differentially expressed genes in both mice and rats (FDR < 0.05) (Fig. 3A and Fig. 3B). Additionally, 7 pathways (ECM-receptor interaction, protein processing in endoplasmic reticulum, biosynthesis of antibiotics, phagosome, gap junction, ribosome biogenesis in eukaryotes, and p53 signaling pathway) were significantly enriched among the differentially expressed genes in rats (Fig. 3B).
Pathway analysis of differentially expressed genes. (A) Bar plot showing significantly enriched pathways for mice (FDR < 0.05). (B) Bar plot showing significantly enriched pathways for rats (FDR < 0.05). The DAVID tool was used for an enrichment test and the genes are classified according to the KEGG pathway database.
Pathway analysis of differentially expressed genes. (A) Bar plot showing significantly enriched pathways for mice (FDR < 0.05). (B) Bar plot showing significantly enriched pathways for rats (FDR < 0.05). The DAVID tool was used for an enrichment test and the genes are classified according to the KEGG pathway database.
The reconstructed network for the differentially expressed genes in mouse uterus consisted of 353 nodes connected by 2558 edges (Supplementary Fig. 2A). Topological analysis showed that the network followed a power-law distribution (Supplementary Fig. 2B). Using a defined threshold of degree values exceeding the mean plus two standard deviations, we identified 15 hub genes: receptor-interacting serine-threonine kinase 4 (Ripk4), thymoma viral proto-oncogene 1 (Akt1), topoisomerase II alpha (Top2a), proliferating cell nuclear antigen (Pcna), uridine monophosphate synthetase (Umps), polo-like kinase 1 (Plk1), heat shock protein A1B (Hspa1b), heat shock protein A5 (Hspa5), cell division cycle 20 (Cdc20), 2′-5′ oligoadenylate synthetase-like 2 (Oasl2), 2′-5′ oligoadenylate synthetase 2 (Oas2), REV3 like DNA directed polymerase zeta catalytic subunit (Rev3l), polymerase DNA directed delta 1 catalytic subunit (Pold1), cyclin-dependent kinase inhibitor 1B (Cdkn1b), and cyclin A2 (Ccna2) (Supplementary Fig. 2C). The reconstructed network for the differentially expressed genes in rat uterus consisted of 239 nodes connected by 1992 edges (Supplementary Fig. 3A). This network followed a power-law distribution (Supplementary Fig. 3B) with 11 hub genes: topoisomerase II alpha (Top2a), carbamoyl-phosphate synthetase 2/aspartate transcarbamylase/dihydroorotase (Cad), heat shock protein 90 alpha A1 (Hsp90aa1), cyclin-dependent kinase 4 (Cdk4), histone cluster 2 H3c2 (Hist2h3c2), exportin 1 (Xpo1), heat shock protein A5 (Hspa5), MAD2 mitotic arrest deficient-like 1 (Mad2l1), FBJ osteosarcoma oncogene (Fos), actin alpha 2 (Acta2), and centromere protein E (Cenpe) (Supplementary Fig. 3C). Two hub genes, Top2a and Hspa5, were shared by both mice and rats. Due to their key positions in the network, these hub genes are likely more important than the other genes.
Screening for causal transcription factors
Up-regulated and down-regulated genes were tested separately for the over-representation of transcription factor binding sites. Binding sites for ETF (M00695), ZF5 (M00716), E2F-1 (M00428), E2F-4:DP-2 (M00739), MOVO-B (M01104), and Sp1 (M00008) were significantly over-represented among the up-regulated genes in mouse uterus, while binding sites for MOVO-B (M01104), Sp1 (M00008), and SREBP-1 (M00749) were significantly over-represented among the down-regulated genes in mouse uterus (Supplementary Fig. 4A). For the up-regulated genes in rat uterus, ETF (M00695), E2F-4:DP-2 (M00739), and MOVO-B (M01104) binding sites were significantly over-represented. No transcription factor binding sites were significantly over-represented among the down-regulated genes in rat uterus (Supplementary Fig. 4B). Notably, the over-representation of ETF (M00695), E2F-4:DP-2 (M00739), and MOVO-B (M01104) binding sites was shared for the up-regulated genes in both mice and rats. This analysis provided clues to the regulatory mechanisms underlying embryo implantation in mice and rats.
Analysis of gene age and evolutionary rate
We systematically evaluated the evolutionary histories of the differentially expressed genes using the ProteinHistorian database of gene ages (http://lighthouse.ucsf.edu/proteinhistorian/). We found that 79.4% of the differentially expressed genes in mouse uterus and 85.9% of the differentially expressed genes in rat uterus originated before the emergence of the eutherian lineage (Fig. 4A). Thus, embryo implantation in eutherians might have evolved through the recruitment of ancient genes into uterine expression. Additionally, we observed that the average dN/dS value of the differentially expressed genes in mouse and rat uterus was significantly lower than that of the non-changed genes (Mann-Whitney U test, P < 0.05) (Fig. 4B). This result indicated that implantation-related genes are evolutionary conserved due to high selection pressure.
Evolutionary analysis. (A) Evolutionary origins of the differentially expressed genes in mice and rats. The graphs were generated using ProteinHistorian datasets. (B) The rate of evolution (dN/dS = ω) for the differentially expressed genes in mice and rats. P-values were calculated using the Mann-Whitney U test.
Evolutionary analysis. (A) Evolutionary origins of the differentially expressed genes in mice and rats. The graphs were generated using ProteinHistorian datasets. (B) The rate of evolution (dN/dS = ω) for the differentially expressed genes in mice and rats. P-values were calculated using the Mann-Whitney U test.
Discussion
In this study, we confined our analysis to implantation-related genes in two rodents, mice and rats, which share the same type of eccentric implantation. By using RNA-seq, we identified 518 differentially expressed genes in mouse uterus during implantation, of which 253 genes were up-regulated and 265 genes were down-regulated at IS compared with IIS. In rat uterus, there were a total of 374 differentially expressed genes, of which 284 genes were up-regulated and 90 genes were down-regulated. qRT-PCR analysis demonstrated that the expression trend of the selected genes was consistent with RNA-seq, suggesting that our RNA-seq data were of high quality. Mice and rats are the most widely used experimental animal models for studying human reproduction. Mice are small in size (weighing 25–30 g) and easy to handle. Most importantly, the mouse genome supports sophisticated gene-targeting modifications by homologous recombination in embryonic stem cells [32]. Rats are larger (weighing 250–400 g) than mice, making it easy to perform surgical manipulations. Moreover, rats are more physiologically similar to humans than are mice [33]. Recently, the application of the CRISPR/Cas9 gene-editing system has greatly reduced the difficulty of genome editing in various species, including the rat [34]. We hypothesized that the mechanisms underlying the evolution of embryo implantation might be deduced by a comparative analysis of differentially expressed genes during implantation in mice and rats.
A cross-species comparison revealed that 92 up-regulated genes and 20 down-regulated genes were shared by mice and rats. We next examined the similarities and differences between these two species at the GO, pathway, and network levels. At the GO level, of the 18 GO terms enriched in either mice or rats, 10 GO terms were shared by both species, 5 GO terms were specific to mice, and 3 GO terms were specific to rats. Moreover, in both species, the top 2 most enriched GO terms were cell organization & biogenesis and cell cycle & proliferation. At the pathway level, we found that cell cycle and DNA replication were commonly enriched pathways. These results suggest that functional categories show more consensus than individual genes. We speculate that a similar set of functional categories may be required for embryo implantation in mice and rats, but each functional category can be implemented by alternative genes in these two species. At the network level, we identified 15 hub genes for mice (Ripk4, Akt1, Top2a, Pcna, Umps, Plk1, Hspa1b, Hspa5, Cdc20, Oasl2, Oas2, Rev3l, Pold1, Cdkn1b, and Ccna2) and 11 hub genes for rats (Top2a, Cad, Hsp90aa1, Cdk4, Hist2h3c2, Xpo1, Hspa5, Mad2l1, Fos, Acta2, and Cenpe), with an overlap of 2 hub genes (Top2a and Hspa5) shared by both species. The hub genes are likely more important than the other genes due to their key positions in the network. Strikingly, 9 out of 15 mouse hub genes (Akt1, Top2a, Pcna, Plk1, Cdc20, Rev3l, Pold1, Cdkn1b, and Ccna2) and 6 out of 11 rat hub gene (Top2a, Cdk4, Xpo1, Mad2l1, Fos, and Cenpe) are involved in the cell cycle. During implantation, proliferation (cell cycle initiation) and subsequent differentiation (cell cycle exit) are required for uterine epithelial cells to establish receptivity [1] and for uterine stromal cells to become decidualized [35], respectively. Therefore, the cell cycle plays a central role in embryo implantation.
Since transcription factors regulate gene expression, we also examined the similarities and differences between these two species at the causal transcription factor level. Binding sites for ETF, ZF5, E2F-1, E2F-4:DP-2, MOVO-B, Sp1, and SREBP-1 were significantly over-represented among the differentially expressed genes in mouse uterus, while binding sites for ETF, E2F-4:DP-2, and MOVO-B were significantly over-represented among the differentially expressed genes in rat uterus. ETF (TEA domain family member 2) recognizes GC-rich sequences and stimulates the expression of genes without a TATA box [36]. Although ETF is abundant in the uterus relative to other tissues [37], its role in embryo implantation is unknown. ZF5 contains five Krüppel-like zinc-fingers at its C-terminus [38]. E2F-1 (E2F transcription factor 1) is an upstream regulator of cell proliferation and decidualization in mouse uterus [39]. Moreover, E2F-1 was predicted to be a key regulator of endometrial receptivity in humans [40]. E2F-4 is another member of the E2F transcription factor family that forms a DNA-binding heterodimeric complex with DP-2 (E2F dimerization partner 2). MOVO-B (mouse homolog of Drosophila Ovo protein B) is a zinc-finger transcription factor [41]. Sp1 is a member of a family of transcription factors that bind to and act through GC boxes to regulate gene expression. Sp1 protein is expressed in a menstrual cycle stage-specific pattern in the endometrium [42]. SP1 has been demonstrated to be a key regulator of estrogen-mediated gene expression in human endometrium [43]. SREBP-1 belongs to a family of basic helix-loop-helix-leucine zipper transcription factors [44]. Known target genes of SREBP-1 are involved in cholesterol biosynthesis and transport [45]. Currently, little is known about the role of ETF, ZF5, E2F-4:DP-2, MOVO-B, and SREBP-1 in regulating uterine gene expression and implantation. These transcription factors deserve further investigation.
The uterus is a comparatively recently evolved organ. In therian mammals, the lower part of the oviduct differentiated into the uterus, where embryo implantation takes place [46]. The evolution of implantation occurred by recruiting a large number of ancient genes into uterine expression [7, 8]. Indeed, using the ProteinHistorian database of gene ages, we found that 79.4% of the differentially expressed genes in mouse uterus and 85.9% of the differentially expressed genes in rat uterus originated before eutherians. By comparing the non-synonymous substitution rate to the synonymous substitution rate, we estimated the rate of evolution for each gene (ω = dN/dS) in mice and rats. High values of ω (ω > 0.5) indicate elevated rates of amino acid sequence divergence and suggest that some sites are likely to be targets of positive selection [31]. We observed that the average dN/dS value of the differentially expressed genes in mouse and rat uterus was significantly lower than that of the non-changed genes, indicating that implantation-related genes are evolutionary conserved due to high selection pressure. Nevertheless, we did identify a small set of genes that exhibited elevated rates of evolution (Supplementary Table 4). For example, tissue inhibitor of metalloproteinase 1 (Timp1) was up-regulated during implantation in mice (fold = 3.98, P = 0.000377, FDR = 0.0307) and rats (fold = 3.19, P = 0.00839, FDR = 0.0735). In our analysis, Timp1 was among the most rapidly evolving differentially expressed genes (dN = 0.0807, dS = 0.156, ω =0.517). Timp1-deficient female mice are sub-fertile [47], exhibiting alterations in normal uterine morphology and physiology characterized by profound branching of the uterine lumen and altered adenogenesis [48]. The rapid evolution of Timp1 is potentially a consequence of an evolutionary conflict between uterine proteases and their inhibitors.
Conclusion
In this study, we compared implantation-related genes in two closely related rodents, mice and rats, using RNA-seq. Our study provides a valuable resource for an in-depth understanding of the molecular mechanisms underlying the evolution of embryo implantation.
Acknowledgements
This work was funded by the National Natural Science Foundation of China (grant numbers 31771665 and 31271602 to Ji-Long Liu). The funder had no role in study design or data collection, analysis, or interpretation.
Disclosure Statement
The authors declare that there is no conflict of interests that could be perceived as prejudicing the impartiality of the research reported.