Introduction: Sex chromosomes have evolved independently across various lineages, often showing convergent degradation of the sex-limited chromosome. While extensively studied in model organisms with ancient sex chromosomal systems, the evolution of early-stage sex chromosomes remains poorly understood. Eulimnadia texana, a freshwater crustacean with a unique androdioecious breeding system (ZZ, ZW, and viable WW genotypes), provides a rare opportunity to study early sex chromosome evolution. This study examines E. texana’s W chromosome for evidence of a small localized non-recombining region, characterized by a transposable element (TE) “hotspot,” low gene density, and low GC content. Methods: Sex-linked markers were mapped onto the W chromosome (scaffold 1). TEs in the WW genome were identified using RepeatModeler and RepeatMasker. Statistical analyses compared TE distribution between the genome and scaffold 1, which was then divided into 20 equal-sized “bins” for finer-scale statistical analyses. Gene density and GC content were analyzed across these bins. Results: While no significant TE accumulation was found across the entire W chromosome compared to the remaining genome, a specific region (6.6–8.8 Mb, fourth bin) showed significantly higher TE accumulation. This region also exhibited low gene density and low GC content, indicative of reduced recombination. Conclusion: Our findings suggest that E. texana’s W chromosome contains a smaller region of crossover suppression, supporting the hypothesis that it is a proto-sex chromosome in early evolutionary development. This study provides valuable insights into early sex chromosome evolution and establishes E. texana as an ideal model for further investigation of evolutionary processes driving proto-sex chromosome differentiation.

In many animals and plants, a specific region of the sex chromosomes (termed the “sex-determining region”) contains genes that code for an organism’s sex (male or a female) and sexual development [1, 2]. Sex chromosomal systems commonly occur in two forms: XY and ZW systems [3]. In XY systems, females are homogametic (XX) and males are heterogametic (XY). In ZW systems, males are homogametic (ZZ) and females are heterogametic (ZW).

Sex chromosomes have evolved independently numerous times in various animal and plant lineages [3‒6]. Of those lineages, a common characteristic of sex chromosomes is that the sex-limited (Y or W) chromosome is degraded. This represents an example of evolutionary convergence of sex chromosomes across distant lineages, which allows a unique opportunity to study the evolutionary forces that act on sex chromosomes [7].

Sex chromosomes begin their evolutionary trajectory as “proto-sex chromosomes” that have reduced recombination around a small sex-determining region to reduce the likelihood of crossing over between “sexually antagonistic” genes (genes that benefit one sex but are detrimental in the opposite sex), and by extension, reduce the likelihood of producing low-viability “intersexes” [7‒9]. Such crossover suppression should create a linkage unit near the sex-determining region of the proto-sex chromosomes [7, 10]. However, a side effect of recombination suppression is that transposable elements (TEs), a type of repeat sequence that moves from one location in the genome to another, are predicted to accumulate in the non-recombining region of the permanently heterozygous chromosome (Y or W) [11‒13]. This accumulation of TEs can cause gene mutations, deleterious effects in gene expression, and mutations in regulatory sequences due to their insertion into coding regions and replication within the genome [14‒16]. As a result, the non-recombining sex-determining region of sex chromosomes commonly contains high TE accumulation and low gene content compared to rest of the genome [17].

There are two major classes of TEs. Class I are retrotransposons that can move through copy-paste mechanisms using intermediate RNA and can be subdivided into long terminal repeat retrotransposons (termed “LTR”; e.g., gypsy) and non-LTR retrotransposons (e.g., L1 and ALU) [18, 19]. Class II retrotransposons are DNA transposons that can move via cut-and-paste mechanisms (e.g., hATs and Helitrons) [20]. Distribution patterns of TE types can be used as markers to identify the boundaries of the recombining and non-recombining regions of a sex chromosome. L1 and gypsy elements accumulate in heterochromatic regions of suppressed recombination, whereas Alu and DNA transposons elements accumulate in regions of high recombination [21‒24]. The accumulation of TEs can disrupt the functions of the genes present in the non-recombining region and as a result, the gene functions can be lost leading to functional and physical degradation of the Y(W) chromosome [13, 25‒27].

Studies of sex chromosomes in mammals, Drosophila, and Caenorhabditis elegans support theoretical predictions that the later stages in the evolution of permanently heterozygous sex chromosome (Y or W) accumulate mutations, have low GC content (indicative of low recombination rates), lose genetic information, and physically degrade [7, 28‒30]. However, because the sex chromosomes of these model organisms are already highly diverged, it is important to also explore new study systems with “young” sex chromosomes (i.e., at early evolutionary stages of sex chromosome evolution) [9, 31] to allow the delineation of the evolutionary processes involved in the earlier stages of sex chromosome divergence.

The freshwater crustacean Eulimnadia texana is an excellent candidate to serve as a new study system, as these “clam shrimps” have an androdioecious (males and hermaphrodites) breeding system [32, 33]. E. texana is a branchiopod crustacean known for its unique sex-determining system [34] having three common mating types formed by combining two “proto-sex chromosomes”: Z and W [34, 35]. Males are homozygous for the Z chromosome (ZZ), while hermaphrodites are ZW or WW. Unlike in most animals, E. texana expresses both the Z and W chromosomes in homogametic form, allowing the purging of deleterious recessive alleles from both chromosomal types, which should retard physical degradation of both the Z and W [35]. This unique feature of E. texana allows for the study of proto-sex chromosomes [36, 37], making it an ideal species for exploring early-stage sex chromosome evolution.

In a recent study, the E. texana WW genome was assembled producing a genome size of 120 Mb and 108 scaffolds [38]. Baldwin-Brown et al. [38] suggested that the putative sex chromosome was located in the largest scaffold (scaffold 1) of the genome. However, they were unable to identify the sex-determining region on scaffold 1. In a separate study, a low genome-wide recombination rate was observed in ZW hermaphrodites implying that recombination on the sex chromosome is suppressed in these hermaphrodites [39]. Baldwin-Brown and Long [39] proposed that only Z chromosomes undergo recombination in males [39], and by extension, that the W chromosome never experiences crossing over. However, a recent study of TE content in E. texana showed no signs of TE accumulation throughout the entire W chromosome relative to the autosomes [40], indicating that the W chromosome does not have crossover suppression throughout its length. Instead, Lang et al. [40] proposed that a smaller portion of the W had an accumulation of TEs, suggestive of a smaller region of crossover suppression.

The objective of our study is to assess whether E. texana’s scaffold 1 contains a smaller region of crossover suppression by testing for a TE “hotspot,” low gene density, and low GC content. All three indicators do indeed suggest that a smaller region (close to previously identified sex-linked markers) has low-to-no crossing over, indicating that E. texana’s W chromosome is indeed in a proto-sex chromosome stage of early evolutionary development.

Genomic Dataset

The genomic data for WW E. texana were collected from NCBI GenBank, under BioProject “PRJNA352082” and genome accession number “NKDA00000000.” Microsatellite sequences of sex-linked markers were collected from NCBI, accession number AY515024.1 for microsatellite CS8 sequence, accession number “AY515026.1” for microsatellite CS11 sequence, and accession number “AY515028.1” for microsatellite CS15 sequence.

Species-Specific Repeat Library Construction and TE Identification

To identify species-specific repeat sequences, we used RepeatModeler (https://www.repeatmasker.org/RepeatModeler/), a de novo TE family identification and modeling package, with default parameters. Using the LTRStruct option in RepeatModeler, LTRs present in the repeat sequences were verified. To annotate the TEs in the genome, RepeatMasker (https://www.repeatmasker.org/RepeatMasker/) was used against RMBlast using the species-specific repeat library.

Statistical Analysis

To annotate the TEs in the genome, RepeatMasker has built-in, statistically optimal scoring matrices to align the sequences and identify the hits with biological significance. For the analysis of the distribution of different types of TEs across different regions of the genome, a randomization test was used [41]. A window of 10,000 bp was used to break the genome into “segments.” Scaffold 1 contained 4,369 segments, whereas rest of the genome contained 7,839 segments. Here we considered scaffold 1 as sample 1 and rest of the genome as sample 2. For each segment in the two samples, the number of TEs present was identified, and the mean number of TEs present for each sample was calculated. The mean difference (“observed” mean difference) between sample 1 and sample 2 was calculated. Then we combined all the segments from the two samples into one pool and randomly drew 4,369 segments and assigned them to sample 1 and 7,839 segments to sample 2. The mean number of TEs presents for each randomized sample, and the mean difference (“expected” mean difference) between the two samples was then calculated. We repeated this expected difference calculation 10,000 times. The p value was then the percentage of times the observed mean difference value was equal to or greater than the expected mean difference value. If the p value is less than 0.05 (p < 0.05) then the TE accumulation on scaffold 1 was deemed significantly higher than the rest of the genome.

The above-mentioned randomization method was used a second time to compare regions within scaffold 1. Scaffold 1 was split into 20 equally sized “bins” to determine if the TEs were uniformly distributed across the scaffold or were instead concentrated into one or more “hotspot” region(s). Herein, each bin of scaffold 1 was treated as sample 1 and rest of the bins of scaffold 1 plus the rest of the genome was treated as sample 2. The likelihood that a bin had a higher observed mean difference than expected was calculated as noted above.

Sex-Linked Marker Mapping

Microsatellite sex-linked marker sequences were blast against the scaffold 1 sequence to identify the exact genomic positions of the sex-linked markers on scaffold 1. These sex-linked markers were then mapped onto scaffold 1 using R scripts.

TE Distribution Mapping

Known TEs ≥ 500 bp were mapped onto the genome. The information on TE type and their locations were obtained from the RepeatMasker output files and were then mapped onto the genome using R scripts.

Gene Density and GC Content of Scaffold 1

To generate information on gene density and GC content of scaffold 1, we used the genome FASTA and the gene annotation GTF files of the WW genome, which were available from WW genome assembly project [38]. First, the genome was indexed using SAMtools, and then we extracted the sequence lengths of each scaffold from the output.fai file. We created a 2,200 kb sliding window file using BEDTools in order to break scaffold 1 into 20 bins which allowed for the comparison of gene density and GC content in those bins. Second, we generated a BED file storing coordinates of coding regions in using a GTF file. Third, we calculated the fraction of coding sequences in the genomic regions using the coverage BEDTool. Finally, using the same 2,200 kb genome sliding window file and BEDTools “nuc” option, we calculated the GC content of the genomic regions. Then gene density and GC content values for scaffold 1 were extracted from files containing the fraction of coding sequences in the genomic regions and the GC content file, respectively. To visualize gene density and GC content present in the 20 bins of scaffold 1, R scripts were used.

To visualize the GC content of scaffold 1 in a boxplot and conduct pairwise Wilcoxon tests, we generated a GC file with a 50 kb sliding window using the above-mentioned method. From the 50 kb window GC content file, we extracted the information on window coordinates, percentage values of AT and GC and added a new column called “location.” Here, the coordinates which fall within the putative sex-determining region were labeled as “WSDR” (W sex-determining region), and rest of the coordinates were labeled “PAR” (pseudoautosomal region), under the location column. The boxplot generation and pairwise comparisons for genomic location versus GC content using Wilcoxon rank sum test were conducted using an R script.

Identification of Sex Determination Genes in Scaffold 1

To identify gene models on scaffold 1, we utilized the genome annotation file produced by the Maker pipeline [42]. The protein sequences derived from the MAKER pipeline for WW individuals were subjected to a protein BLAST search [43] against the Drosophila proteome database, which was obtained from UniProt database [44]. The BLAST results were filtered based on the following criteria: e-value <1e−10, bit score >50, and percentage identity ≥80%. The genes associated with sex determination, spermatogenesis, and oogenesis were extracted from UniProt and then identified on scaffold 1. Finally, we calculated the number of genes found in each bin of scaffold 1.

Repetitive Genomic Composition

The RepeatMasker scan of the WW genome of E. texana found that 30.8% of the genome was composed of TEs (online suppl. Fig. 1; for all online suppl. material, see https://doi.org/10.1159/000542284). The majority of the repeat content was composed of unknown/unclassified TEs, whereas DNA transposons and retrotransposons comprised ∼4% and ∼10% of the genome, respectively. LTRs were the dominant TE family compared to other known TE families (online suppl. Fig. 1).

Localization of Putative Sex-Determining Region

To localize the putative sex-determining region of the WW genome, the known TE accumulation was compared on scaffold 1 to the rest of the genome. Scaffold 1 was not significantly higher in TEs than the rest of the genome (observed mean difference = −0.418, p value = 1). Therefore, we split scaffold 1 into 20 equal sections (“bins”) and carried out another set of randomization tests, treating each bin as an individual sample to compare to the rest of the genome. The genomic region of 6.6 Mb through 8.8 Mb of scaffold 1 (fourth bin, Table 1) showed significantly higher accumulation of TEs (observed mean difference = 0.317, p value = 0.013), while other bins of the scaffold 1 showed no significant accumulations of TEs compared to the rest of the genome. Therefore, from this point onwards, we will refer to E. texana’s bin 4 as the putative WSDR.

Table 1.

Randomization test on TE accumulation of scaffold 1 bins versus rest of the genome

Bin#Observed mean difference between bin# and rest of the genomep value
−0.106 0.7734 
−0.342 0.9995 
0.148 0.1145 
4 0.317 0.0134 
−0.058 0.6685 
−0.298 0.9980 
−0.046 0.6692 
−0.268 0.9970 
−0.909 1.0000 
10 −0.220 0.9650 
11 −0.516 1.0000 
12 −0.266 0.9679 
13 −0.342 0.9998 
14 −0.595 1.0000 
15 −0.358 0.9993 
16 −0.308 0.9991 
17 −0.159 0.9345 
18 −0.481 1.0000 
19 −0.238 0.9809 
20 −0.442 0.9991 
Bin#Observed mean difference between bin# and rest of the genomep value
−0.106 0.7734 
−0.342 0.9995 
0.148 0.1145 
4 0.317 0.0134 
−0.058 0.6685 
−0.298 0.9980 
−0.046 0.6692 
−0.268 0.9970 
−0.909 1.0000 
10 −0.220 0.9650 
11 −0.516 1.0000 
12 −0.266 0.9679 
13 −0.342 0.9998 
14 −0.595 1.0000 
15 −0.358 0.9993 
16 −0.308 0.9991 
17 −0.159 0.9345 
18 −0.481 1.0000 
19 −0.238 0.9809 
20 −0.442 0.9991 

Bins with significantly high TE accumulation are in bold.

Localization of Sex-Linked Markers

All three sex-linked marker microsatellites were found on scaffold 1 (Fig. 1a): CS8 was located at ∼2.1 Mb, CS11 was located at ∼7.4 Mb (within the above-mentioned WSDR), and CS15 was located at the center of the scaffold (∼22 Mb).

Fig. 1.

a Genomic locations of sex-linked microsatellites and the putative sex-determining region on scaffold 1 of Eulimnadia texana. Blue shaded area is the putative sex-determining region (WSDR; bin 4). The genomic positions of three sex-linked markers are color-coded as: CS8 (yellow), CS11 (black), and CS15 (red). b Landscape of most abundant TEs along scaffold 1 of Eulimnadia texana. Histogram represents the density of repeats along scaffold 1 for three TE types: DNA (green), LINE (orange), LTR (purple). The distribution was calculated in a sliding window of 10 Mb. The X axis indicates the genomic position and the Y axis indicates the percentage of TEs per 10 Mb. The red double headed arrow indicates the TE-enriched region present on the first quarter of scaffold 1. c Gene density and GC content of scaffold 1 in E. texana. A 2,200 kb sliding window was used to calculate the gene density and GC content in each genomic region (“bin”) present on scaffold 1. Each stripe represents a bin in scaffold 1 and there are 20 bins in total. All the bins except for bin 20 contained 2,200 kb sequence length. Bin 20 only contained 884 kb sequence length as it housed the sequence end of scaffold 1. The left bar graph represents the gene density, with gene-rich regions represented by green color and gene-poor regions represented by red color. Bin 4 has the least gene density compared to rest of the bins in scaffold 1. The right bar graph represents the GC content, with GC-rich regions represented by green color and GC-poor regions represented by red color. Bin 4 and bin 7 have the least GC content compared to rest of the bins in scaffold 1.

Fig. 1.

a Genomic locations of sex-linked microsatellites and the putative sex-determining region on scaffold 1 of Eulimnadia texana. Blue shaded area is the putative sex-determining region (WSDR; bin 4). The genomic positions of three sex-linked markers are color-coded as: CS8 (yellow), CS11 (black), and CS15 (red). b Landscape of most abundant TEs along scaffold 1 of Eulimnadia texana. Histogram represents the density of repeats along scaffold 1 for three TE types: DNA (green), LINE (orange), LTR (purple). The distribution was calculated in a sliding window of 10 Mb. The X axis indicates the genomic position and the Y axis indicates the percentage of TEs per 10 Mb. The red double headed arrow indicates the TE-enriched region present on the first quarter of scaffold 1. c Gene density and GC content of scaffold 1 in E. texana. A 2,200 kb sliding window was used to calculate the gene density and GC content in each genomic region (“bin”) present on scaffold 1. Each stripe represents a bin in scaffold 1 and there are 20 bins in total. All the bins except for bin 20 contained 2,200 kb sequence length. Bin 20 only contained 884 kb sequence length as it housed the sequence end of scaffold 1. The left bar graph represents the gene density, with gene-rich regions represented by green color and gene-poor regions represented by red color. Bin 4 has the least gene density compared to rest of the bins in scaffold 1. The right bar graph represents the GC content, with GC-rich regions represented by green color and GC-poor regions represented by red color. Bin 4 and bin 7 have the least GC content compared to rest of the bins in scaffold 1.

Close modal

Distribution of Known TEs across Scaffold 1

On scaffold 1, TE classes were enriched (above 20% of TEs) in the first quarter portion, while TEs along the remainder of the scaffold were scattered (Fig. 1b). At the last quarter portion of the scaffold, DNA TEs had high peaks (above 20%) and LINE had medium peaks (10–20% of TEs) (Fig. 1b). Interestingly, there were two peaks of LTR (indicated by dense purple peaks above 10% of TEs) in the region of 5–10 Mb (Fig. 1b). A similar enrichment pattern was observed in LTR and LINE repeats when the TEs above 500 bp were mapped separately to scaffold 1 (online suppl. Fig. 2a). DNA repeats showed random distribution along scaffold 1; they were enriched even in the regions where LTR and LINE repeats were poor (online suppl. Fig. 2a). LINE/I and LTR/Gypsy repeats, the two most abundant TE superfamilies on scaffold 1, tended to occur in the same genomic regions displaying similar enrichment patterns (online suppl. Fig. 2b).

Gene Density and GC Content of Scaffold 1

The gene density along the length of scaffold 1 was found to be nonuniform, ranging from a low of 0.308 to a high of 0.452 (Fig. 1c). Notably, bin 4, which corresponds to the putative WSDR, displayed the lowest gene density (0.308), while bin 19 exhibited the highest gene density (0.452) (Fig. 1c). Analysis of sex determination-related genes across scaffold 1 revealed that the first four bins showed a higher concentration of genes associated with reproductive processes (drawn from Drosophila melanogaster database) compared to the remaining bins (online suppl. Table 1).

Similar to gene density patterns, the levels of GC content along scaffold 1 were found to be nonuniform, ranging from 0.382 to 0.412 (Fig. 1c), with bin 7 having the lowest GC content (0.382), bin 4 having second lowest GC content (0.384), and bin 20 having the highest GC content (0.412) (Fig. 1c). When comparing GC content of PAR (pseudoautosomal region) and WSDR (bin 4) genomic locations on scaffold 1, WSDR had a lower GC content than PAR region, and the Wilcoxon rank sum test revealed that the difference was significant (Fig. 2, p value = 0.00001).

Fig. 2.

Genomic location versus GC content of scaffold 1 in Eulimnadia texana. A 50 kb sliding window was used to calculate the GC content in each genomic region belonging to the PAR and WSDR (bin 4) locations. The black line indicates the average GC content value for each genomic location. WSDR (blue) had significantly lower GC content than PAR (red; Wilcoxon rank sum test, p value = 0.00001).

Fig. 2.

Genomic location versus GC content of scaffold 1 in Eulimnadia texana. A 50 kb sliding window was used to calculate the GC content in each genomic region belonging to the PAR and WSDR (bin 4) locations. The black line indicates the average GC content value for each genomic location. WSDR (blue) had significantly lower GC content than PAR (red; Wilcoxon rank sum test, p value = 0.00001).

Close modal

Many studies provide evidence that TEs play a pivotal role in sex chromosome evolution [45, 46]. Moreover, the non-recombining sex-determining regions are found to be rich in TEs along with having low gene density and low GC content [17, 47]. In the current study, we used information on TE distribution, sex-linked markers, gene density, and GC content to delineate the likely WSDR of E. texana.

In agreement with an earlier study of TE distribution in E. texana [40], three sex-linked markers were found on scaffold 1 reinforcing the notion that this is indeed the sex chromosome. Contrary to the expectation of a uniform TE enrichment across scaffold 1 [39], we found that scaffold 1 has sparse accumulation of TEs throughout most of its length (Fig. 1b, online suppl. Fig. 2a, b). Scaffold 1 does not have a significantly higher TE content throughout its length relative to the rest of the genome, suggesting that the entirety of scaffold 1 is unlikely to be non-recombining. These results agree with the findings of Lang et al. [40].

In addition to the TE predictions, if scaffold 1 is non-recombining across its entirety, we would expect a low gene density [17] and a low GC content across the entire length of scaffold 1 [47, 48]. Our results revealed that the gene density and GC content levels varied across the length of scaffold 1. While some regions of scaffold 1 had high gene densities, others had low gene densities (Fig. 1c). Similarly, some regions of scaffold 1 had high GC content, whereas other regions had low GC content (Fig. 1c). Both results indicate that the entire length of scaffold 1 is not subjected to recombination suppression, again reinforcing previous findings [40]. Therefore, although scaffold 1 harbors sex-linked markers, it is not likely to be a fully non-recombining chromosome, as previously predicted [39].

Lang et al. [40] suggested that a smaller region of E. texana’s scaffold 1 might harbor the sex-determining region, which has been found in other organisms [49‒51]. However, the number of TEs masked in the Lang et al. [40] study was too small to test this hypothesis. In the current study, the inclusion of species-specific TEs allowed us to test for evidence of a smaller region of scaffold 1 harboring a non-recombining region. Scaffold 1 was divided into 20 equal-sized section (“bins”) to investigate TE accumulation in each section against the rest of the genome. Our results showed that only bin 4 (∼5% of scaffold 1 ranging from 6.6 to 8.8 Mb) was found to be accumulating higher TE content compared to the rest of the genome. None of the remaining bins had higher TE accumulation compared to the rest of the genome (Table 1). Therefore, higher TE accumulation was limited to a small region of scaffold 1, implying that the non-recombining region of the W is concomitantly small, as predicted by Lang et al. [40]. Two of the three previously found [35] sex-linked markers also mapped close to this likely non-recombining region (blue in Fig. 1a). Additionally, the gene density analysis (Fig. 1c) revealed that bin 4 has the lowest gene density among the 20 bins of E. texanas scaffold 1, indicating this region has a high level of gene degradation. However, the WSDR (Bin 4) retained several genes associated with reproductive processes in Drosophila melanogaster (online suppl. Table 1), including the mts gene (involved in oogenesis [52]), the Rab6 gene (plays a role in germarium-derived egg chamber formation, oocyte microtubule cytoskeleton polarization, and pole plasm oskar mRNA localization [53]), and the dac gene (associated with genital disc sexually dimorphic development [54]). Similar to our results, the smaller sex-determining regions in young sex chromosomes of spinach, papaya, and Mercurialis annua were also found to exhibit patterns of high repeat content and lower gene density [55‒57]. Therefore, the observed patterns of TE distribution, linked genes, and gene content in E. texana all suggest that a smaller region of scaffold 1 is the likely sex-determining region.

GC content serves as a useful proxy for identifying non-recombining regions [29, 48]. Many species (e.g., mammals, Drosophila melanogaster, Caenorhabditis elegans, and yeast) have low GC content in non-recombining regions [29, 48]. Researchers have observed significantly lower GC content on degenerated sex-limited W(Y) chromosomes of chickens, zebra finches, humans, and mice when compared to autosomes and Z(X) chromosomes [58]. Our results for GC content revealed that bins 7 and 4 have the lowest GC content among the 20 bins of scaffold 1 (Fig. 1c), suggesting that these two bins might be non-recombining regions. Although the GC content is low in bin 7 (Fig. 1c), its gene density was relatively high compared to bin 4 (Fig. 1c) and its TE content was relatively low compared to the entire genome (Table 1), both of which are inconsistent with bin 7 being the sex-determining region of the W. However, the association between GC content and gene density is complex and not well understood [30]. A study of the rattlesnake W chromosome found that the youngest evolutionary strata were relatively gene-rich despite recombination suppression [58]. Therefore, bin 7 might be a young evolving stratum with recombination suppression and thus exhibits a lesser degree of degradation than the potentially older evolutionary stratum, bin 4. Moreover, our results on comparison of WSDR (bin 4) with PAR (the rest of the bins in scaffold 1) revealed that WSDR has significantly low GC content compared to PAR (p value = 0.00001, Fig. 2). Overall, our findings from analyses of TE accumulation, gene density, and GC content suggest that bin 4 likely harbors the putative sex-determining region of the proto-W sex chromosome.

As mentioned earlier, many studies on sex chromosome evolution have been done on model organisms with advanced sex chromosomal degradation, which provide much information on the later stages of degeneration of non-recombining chromosomes [7, 28]. However, there is little understanding of the evolutionary forces driving recombination suppression and proto-sex chromosome differentiation during the early stages of sex chromosome evolution [59]. To overcome these challenges, it is essential to explore species (e.g., E. texana) that have “young” sex chromosomes that are at the early stages of sex chromosome differentiation [9].

In conclusion, this study confirms previous findings [40] that challenge the prediction that Scaffold 1 represents a wholly non-recombining sex chromosome [39]. Instead, we have evidence of a smaller sex-determining region embedded within the W chromosome that exhibits low recombination, in agreement with the idea that E. texana’s W is a proto-sex chromosome [35, 40]. Hence, E. texana is an ideal candidate for studying evolutionary processes that drive early sex chromosome evolution. Future studies of E. texana should explore the genomic composition of the proto-sex-determining region and the evolution of recombination suppression, as well as whether LTR/Gypsy TEs contribute to structural changes in these sex chromosomes [60]. To explore this role, TE composition and sex-specific TE abundance on ZZ and WW sex chromosomes would need to be compared [60]. In the near future, we plan to assemble the genome of ZZ males so that we can compare the TE composition of the Z to the W chromosomes to understand how TEs may influence sex chromosome differentiation in proto-sex chromosomes.

We thank following people for their continuous support and guidance: Agnieszka Lipinska, Octavio Manuel Palacios Giménez, and Sean Mitchuson.

All work was conducted on nonregulated invertebrate animals; thus, neither IACUC reviews nor a review by the Ethical Review Board were necessary. An ethics statement was not required for this animal study as the species used does not require ethics approval according to local/national guidelines.

The authors have no conflicts of interest to declare.

No external funding was used in the conduct of these studies.

C.E. designed the study, performed data collection, analysis, and interpretation, and wrote the manuscript. S.C.W. helped conceive the study and edited the manuscript.

All data generated or analyzed during this study are included in this article and its online supplementary material. Further inquiries can be directed to the corresponding author.

1.
Bachtrog
D
,
Mank
JE
,
Peichel
CL
,
Kirkpatrick
M
,
Otto
SP
,
Ashman
T-L
, et al
.
Sex determination: why so many ways of doing it
.
PLoS Biol
.
2014
;
12
(
7
):
e1001899
.
2.
Hiort
O
.
Chapter 1: normal and variant sex development
. In:
Legato
MJ
, editor.
Principles of gender-specific medicine
. 3rd edn.
Cambridge, MA
:
Academic Press
;
2017
. p.
1
16
.
3.
Bull
J
.
Evolution of sex determining mechanisms
.
Menlo Park, CA
:
Benjamin/Cummings
;
1983
.
4.
Ohno
S
.
Sex chromosomes and sex linked genes
.
Berlin
:
Springer Verlag
;
1967
.
5.
Marshall Graves
JA
,
Peichel
CL
.
Are homologies in vertebrate sex determination due to shared ancestry or to limited options
.
Genome Biol
.
2010
;
11
(
4
):
205
.
6.
Fraser
JA
,
Heitman
J
.
Evolution of fungal sex chromosomes
.
Mol Microbiol
.
2004
;
51
(
2
):
299
306
.
7.
Abbott
JK
,
Nordén
AK
,
Hansson
B
.
Sex chromosome evolution: historical insights and future perspectives
.
Proc Biol Sci
.
2017
;
284
(
1854
):
20162806
.
8.
Bachtrog
D
,
Kirkpatrick
M
,
Mank
JE
,
McDaniel
SF
,
Pires
JC
,
Rice
W
, et al
.
Are all sex chromosomes created equal
.
Trends Genet
.
2011
;
27
(
9
):
350
7
.
9.
Charlesworth
D
,
Charlesworth
B
,
Marais
G
.
Steps in the evolution of heteromorphic sex chromosomes
.
Heredity
.
2005
;
95
(
2
):
118
28
.
10.
Beukeboom
L
,
Perrin
N
.
The evolution of sex determination
.
Oxford
:
Oxford University Press
;
2014
.
11.
Charlesworth
B
.
Model for evolution of Y chromosomes and dosage compensation
.
Proc Natl Acad Sci USA
.
1978
;
75
(
11
):
5618
22
.
12.
Nakamura
M
.
Sex determination in amphibians
.
Semin Cell Dev Biol
.
2009
;
20
(
3
):
271
82
.
13.
Charlesworth
B
,
Charlesworth
D
.
The degeneration of Y chromosomes
.
Philos T Roy Soc B
.
2000
;
355
(
1403
):
1563
72
.
14.
Biémont
C
,
Vieira
C
.
Genetics: junk DNA as an evolutionary force
.
Nature
.
2006
;
443
(
7111
):
521
4
.
15.
Slotkin
RK
,
Martienssen
R
.
Transposable elements and the epigenetic regulation of the genome
.
Nat Rev Genet
.
2007
;
8
(
4
):
272
85
.
16.
Dolgin
ES
,
Charlesworth
B
.
The effects of recombination rate on the distribution and abundance of transposable elements
.
Genetics
.
2008
;
178
(
4
):
2169
77
.
17.
Zhou
R
,
Macaya-Sanz
D
,
Rodgers-Melnick
E
,
Carlson
CH
,
Gouker
FE
,
Evans
LM
, et al
.
Characterization of a large sex determination region in Salix purpurea L (Salicaceae)
.
Mol Genet Genomics
.
2018
;
293
(
6
):
1437
52
.
18.
Sotero-Caio
CG
,
Platt
RN
,
Suh
A
,
Ray
DA
.
Evolution and diversity of transposable elements in vertebrate genomes
.
Genome Biol Evol
.
2017
;
9
(
1
):
161
77
.
19.
Eickbush
TH
,
Jamburuthugoda
VK
.
The diversity of retrotransposons and the properties of their reverse transcriptases
.
Virus Res
.
2008
;
134
(
1–2
):
221
34
.
20.
Jurka
J
,
Kapitonov
VV
,
Pavlicek
A
,
Klonowski
P
,
Kohany
O
,
Walichiewicz
J
.
Repbase update, a database of eukaryotic repetitive elements
.
Cytogenet Genome Res
.
2005
;
110
(
1–4
):
462
7
.
21.
Jensen-Seaman
MI
,
Furey
TS
,
Payseur
BA
,
Lu
Y
,
Roskin
KM
,
Chen
C-F
, et al
.
Comparative recombination rates in the rat, mouse, and human genomes
.
Genome Res
.
2004
;
14
(
4
):
528
38
.
22.
Lander
ES
,
Linton
LM
,
Birren
B
,
Nusbaum
C
,
Zody
MC
,
Baldwin
J
, et al
.
Initial sequencing and analysis of the human genome
.
Nature
.
2001
;
409
(
6822
):
860
921
.
23.
Witherspoon
DJ
,
Watkins
WS
,
Zhang
Y
,
Xing
J
,
Tolpinrud
WL
,
Hedges
DJ
, et al
.
Alu repeats increase local recombination rates
.
BMC Genomics
.
2009
;
10
:
530
.
24.
Daron
J
,
Glover
N
,
Pingault
L
,
Theil
S
,
Jamilloux
V
,
Paux
E
, et al
.
Organization and evolution of transposable elements along the bread wheat chromosome 3B
.
Genome Biol
.
2014
;
15
(
12
):
546
.
25.
Medstrand
P
,
van de Lagemaat
LN
,
Dunn
CA
,
Landry
JR
,
Svenback
D
,
Mager
DL
.
Impact of transposable elements on the evolution of mammalian gene regulation
.
Cytogenet Genome Res
.
2005
;
110
(
1–4
):
342
52
.
26.
Bull
JJ
.
Sex determining mechanisms: an evolutionary perspective
.
Experientia
.
1985
;
41
(
10
):
1285
96
.
27.
Rice
WR
.
Evolution of the Y sex chromosome in animals: Y chromosomes evolve through the degeneration of autosomes
.
Bioscience
.
1996
;
46
(
5
):
331
43
.
28.
Úbeda
F
,
Manus
M
,
Wild
G
.
On the origin of sex chromosomes from meiotic drive
.
Proc Biol Sci
.
2014
;
282
:
20141932
.
29.
Meunier
J
,
Duret
L
.
Recombination drives the evolution of gc-content in the human genome
.
Mol Biol Evol
.
2004
;
21
(
6
):
984
90
.
30.
Sémon
M
,
Mouchiroud
D
,
Duret
L
.
Relationship between gene expression and GC-content in mammals: statistical significance and biological relevance
.
Hum Mol Genet
.
2005
;
14
(
3
):
421
7
.
31.
Innocenti
P
,
Morrow
EH
.
The sexually antagonistic genes of Drosophila melanogaster
.
PLoS Biol
.
2010
;
8
(
3
):
e1000335
.
32.
Weeks
SC
,
Benvenuto
C
,
Reed
SK
.
When males and hermaphrodites coexist: a review of androdioecy in animals
.
Integr Comp Biol
.
2006
;
46
(
4
):
449
64
.
33.
Weeks
SC
,
Sanderson
TF
,
Zofkova
M
,
Knott
B
.
Breeding systems in the clam shrimp family Limnadiidae (Branchiopoda, Spinicaudata)
.
Invertebr Biol
.
2008
;
127
(
3
):
336
49
.
34.
Sassaman
C
,
Weeks
SC
.
The genetic mechanism of sex determination in the conchostracan shrimp Eulimnadia texana
.
Am Nat
.
1993
;
141
(
2
):
314
28
.
35.
Weeks
SC
,
Benvenuto
C
,
Sanderson
TF
,
Duff
RJ
.
Sex chromosome evolution in the clam shrimp, Eulimnadia texana
.
J Evol Biol
.
2010
;
23
(
5
):
1100
6
.
36.
Charlesworth
D
.
Androdioecy and the evolution of dioecy
.
Biol J Linn Soc
.
1984
;
22
(
4
):
333
48
.
37.
Pannell
J
.
The evolution and maintenance of androdioecy
.
Annu Rev Ecol Syst
.
2002
;
33
(
1
):
397
425
.
38.
Baldwin-Brown
JG
,
Weeks
SC
,
Long
AD
.
A new standard for crustacean genomes: the highly contiguous, annotated genome assembly of the clam shrimp Eulimnadia texana reveals hox gene order and identifies the sex chromosome
.
Genome Biol Evol
.
2018
;
10
(
1
):
143
56
.
39.
Baldwin-Brown
JG
,
Long
AD
.
Genomic signatures of local adaptation in clam shrimp (Eulimnadia texana) from natural vernal pools
.
Genome Biol Evol
.
2020
;
12
(
7
):
1194
206
.
40.
Lang
C
,
Ediriweera
C
,
Weeks
SC
.
Sex chromosome evolution in the clam shrimp Eulimnadia texana
.
Invertebr Biol
.
2024
;
143
(
2
):
e12426
.
41.
Lewis
J
.
What a randomization test is and how to run one in R. Measuring U
.
2020
. [cited 2022 Dec 16]. Available from: https://measuringu.com/randomization-test/
42.
Cantarel
BL
,
Korf
I
,
Robb
SM
,
Parra
G
,
Ross
E
,
Moore
B
, et al
.
MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes
.
Genome Res
.
2008
;
18
(
1
):
188
96
.
43.
Altschul
SF
,
Gish
W
,
Miller
W
,
Myers
EW
,
Lipman
DJ
.
Basic local alignment search tool
.
J Mol Biol
.
1990
;
215
(
3
):
403
10
.
44.
UniProt Consortium
.
UniProt: the universal protein knowledgebase in 2023
.
Nucleic Acids Res
.
2023
;
51
(
D1
):
D523
31
.
45.
Iwase
M
,
Satta
Y
,
Hirai
Y
,
Hirai
H
,
Imai
H
,
Takahata
N
.
The amelogenin loci span an ancient pseudoautosomal boundary in diverse mammalian species
.
Proc Natl Acad Sci USA
.
2003
;
100
(
9
):
5258
63
.
46.
Xu
L
,
Auer
G
,
Peona
V
,
Suh
A
,
Deng
Y
,
Feng
S
, et al
.
Dynamic evolutionary history and gene content of sex chromosomes across diverse songbirds
.
Nat Ecol Evol
.
2019
;
3
(
5
):
834
44
.
47.
Lynch
DB
,
Logue
ME
,
Butler
G
,
Wolfe
KH
,
Chromosomal
G
.
Chromosomal G + C content evolution in yeasts: systematic interspecies differences, and GC-poor troughs at centromeres
.
Genome Biol Evol
.
2010
;
2
:
572
83
.
48.
Marsolier-Kergoat
MC
,
Yeramian
E
.
GC content and recombination: reassessing the causal effects for the Saccharomyces cerevisiae genome
.
Genetics
.
2009
;
183
(
1
):
31
8
.
49.
Anderson
JL
,
Rodríguez Marí
A
,
Braasch
I
,
Amores
A
,
Hohenlohe
P
,
Batzel
P
, et al
.
Multiple sex-associated regions and a putative sex chromosome in zebrafish revealed by RAD mapping and population genomics
.
PLoS One
.
2012
;
7
(
7
):
e40701
.
50.
Ma
WJ
,
Rovatsos
M
.
Sex chromosome evolution: the remarkable diversity in the evolutionary rates and mechanisms
.
J Evol Biol
.
2022
;
35
(
12
):
1581
8
.
51.
Yue
J
,
Krasovec
M
,
Kazama
Y
,
Zhang
X
,
Xie
W
,
Zhang
S
, et al
.
The origin and evolution of sex chromosomes, revealed by sequencing of the Silene latifolia female genome
.
Curr Biol
.
2023
;
33
(
12
):
2504
14.e3
.
52.
Wassarman
DA
,
Solomon
NM
,
Chang
HC
,
Karim
FD
,
Therrien
M
,
Rubin
GM
.
Protein phosphatase 2A positively and negatively regulates Ras1-mediated photoreceptor development in Drosophila
.
Genes Dev
.
1996
;
10
(
3
):
272
8
.
53.
Coutelis
JB
,
Ephrussi
A
.
Rab6 mediates membrane organization and determinant localization during Drosophila oogenesis
.
Development
.
2007
;
134
(
7
):
1419
30
.
54.
Keisman
EL
,
Baker
BS
.
The Drosophila sex determination hierarchy modulates wingless and decapentaplegic signaling to deploy dachshund sex-specifically in the genital imaginal disc
.
Development
.
2001
;
128
(
9
):
1643
56
.
55.
Veltsos
P
,
Cossard
G
,
Beaudoing
E
,
Beydon
G
,
Savova Bianchi
D
,
Roux
C
, et al
.
Size and content of the sex-determining region of the Y chromosome in dioecious Mercurialis annua, a plant with homomorphic sex chromosomes
.
Genes
.
2018
;
9
(
6
):
277
.
56.
Yu
L
,
Ma
X
,
Deng
B
,
Yue
J
,
Ming
R
.
Construction of high-density genetic maps defined sex determination region of the Y chromosome in spinach
.
Mol Genet Genomics
.
2021
;
296
(
1
):
41
53
.
57.
Wang
J
,
Na
JK
,
Yu
Q
,
Gschwend
A
,
Han
J
,
Zeng
F
, et al
.
Sequencing papaya X and Yh chromosomes reveals molecular basis of incipient sex chromosome evolution
.
Proc Natl Acad Sci
.
2012
;
109
(
34
):
13710
5
.
58.
Schield
DR
,
Perry
BW
,
Card
DC
,
Pasquesi
GIM
,
Westfall
AK
,
Mackessy
SP
, et al
.
The rattlesnake W chromosome: a GC-rich retroelement refugium with retained gene function across ancient evolutionary strata
.
Genome Biol Evol
.
2022
;
14
(
9
):
116
.
59.
Wright
AE
,
Dean
R
,
Zimmer
F
,
Mank
JE
.
How to make a sex chromosome
.
Nat Commun
.
2016
;
7
:
12087
.
60.
Srikulnath
K
,
Ahmad
SF
,
Singchat
W
,
Panthum
T
.
Do Ty3/Gypsy transposable elements play preferential roles in sex chromosome differentiation
.
Life
.
2022
;
12
(
4
):
522
.