Impairment of male fertility is one of the major public health issues worldwide. Nevertheless, genetic causes of male sub- and infertility can often only be suspected due to the lack of reliable and easy-to-use routine tests. Yet, the development of a marker panel is complicated by the large quantity of potentially predictive markers. Actually, hundreds or even thousands of genes could have fertility relevance. Thus, a systematic method enabling a selection of the most predictive markers out of the many candidates is required. As a criterion for marker selection, we derived a gene-specific score, which we refer to as fertility relevance probability (FRP). For this purpose, we first categorized 2,753 testis-expressed genes as either candidate markers or non-candidates, according to phenotypes in male knockout mice. In a parallel approach, 2,502 genes were classified as candidate markers or non-candidates based on phenotypes in men. Subsequently, we conducted logistic regression analyses with evolutionary rates of genes (dN/dS), transcription levels in testis relative to other organs, and connectivity of the encoded proteins in a protein-protein interaction network as covariates. In confirmation of the procedure, FRP values showed the expected pattern, thus being overall higher in genes with known relevance for fertility than in their counterparts without corresponding evidence. In addition, higher FRP values corresponded with an increased dysregulation of protein abundance in spermatozoa of 37 men with normal and 38 men with impaired fertility. Present analyses resulted in a ranking of genes according to their probable predictive power as candidate markers for male fertility impairment. Thus, AKAP4, TNP1, DAZL, BRDT, DMRT1, SPO11, ZPBP, HORMAD1, and SMC1B are prime candidates toward a marker panel for male fertility impairment. Additional candidate markers are DDX4, SHCBP1L, CCDC155, ODF1, DMRTB1, ASZ1, BOLL, FKBP6, SLC25A31, PRSS21, and RNF17. FRP inference additionally provides clues for potential new markers, thereunder TEX37 and POU4F2. The results of our logistic regression analyses are freely available at the PreFer Genes website (https://prefer-genes.uni-mainz.de/).
Unintended childlessness affects an estimated 7–15% of couples during the reproductive phase, thus being a major public health issue [Sullivan, 2004; Ferlin et al., 2007; Zorrilla and Yatsenko, 2013; Song et al., 2016; Cariati et al., 2019]. Male-factor infertility is thereby assumed to contribute to about half of the cases, although the actual causes remain unexplained in 30–50% of the men seeking medical help [Poongothai et al., 2009; Nieschlag et al., 2010; Massart et al., 2012; Hamada et al., 2013; Krausz and Riera-Escamilla, 2018; Ferlin et al., 2019]. In fact, the main biomarker for male sub- and infertility, semen analysis, cannot reliably discriminate fertile from infertile individuals [Kovac et al., 2013; Bracke et al., 2018; Cariati et al., 2019; Panner Selvam et al., 2019], and genetic analysis continues to be ineffective to date [Aston, 2014; Coutton et al., 2015; Song et al., 2016; Mitchell et al., 2017; Cariati et al., 2019]. On the other hand, clarification of the genetic causes is a prerequisite for developing a tailored therapy of male fertility impairment [Sullivan, 2004; Bonache et al., 2012]. Yet, the long-term goal of a therapeutic offer is certainly desirable, regardless of alternatives such as in vitro fertilization and intracytoplasmic sperm injection [European IVF-Monitoring Consortium et al., 2016; Behre, 2019; Neuhaus and Schlatt, 2019]. After all, there are still many couples whose wish for a biological child is not being fulfilled despite medical attention.
One way toward an improved diagnosis of male fertility impairment is the definition of markers based on a comprehensive literature survey [Coutton et al., 2015; Bracke et al., 2018; Cariati et al., 2019]. However, it is generally difficult to determine the most predictive genetic markers across studies. A way out can be to evaluate the suitability of genes or proteins as markers on the basis of a standardized procedure [Smith et al., 2017; Oud et al., 2019]. The ideal marker panel would then enable a highly sensitive and accurate as well as minimally invasive assessment of the genetic causes of male fertility impairment for a large proportion of men [Kovac et al., 2013; Dipresa et al., 2018; Tüttelmann et al., 2018]. Based on such a marker panel, less infertile men should be exposed to invasive diagnostics. It should also be possible through improved tests to estimate the fertilization potential of sperm for in vitro fertilization and intracytoplasmic sperm injection more accurately than currently possible [Kovac et al., 2013]. Not least, an improved genetic diagnostic of (idiopathic) male fertility impairment holds out the prospect of fewer women going through diagnosis and treatment.
Unfortunately, the search for a panel that conveys the above-mentioned benefits is additionally complicated by the high number of potential biomarkers for male fertility deficiencies. Thus, NCBI’s OMIM (Online Mendelian Inheritance in Man) already lists more than 200 genetic conditions associated with male infertility, spanning from the most common manifestations to the rarest complex syndromes [Cariati et al., 2019]. However, the number of genes with potential fertility relevance is probably higher as illustrated by more than 6,200 proteins in sperm alone, which may at least partially influence male fertility [Shetty et al., 1999; Johnston et al., 2005; Schumacher et al., 2013; Amaral et al., 2014; Schumacher and Herlyn, 2018]. Beyond that, genes expressed in spermiogenesis stages and somatic testicular cells will play a role in fertility maintenance – and these are many. In fact, an estimated 74% of the roughly 20,000 human genes are expressed in human testis [Zorrilla and Yatsenko, 2013; Carrell et al., 2016; Bracke et al., 2018]. Additional processes such as sperm maturation and capacitation must also be considered in order to fully encompass the fundamentals of normal male fertility [Gatta et al., 2010; Fu-Jun and Xiao-Fang, 2012; Djureinovic et al., 2014; Intasqui et al., 2016; Khan et al., 2018; Ferlin et al., 2019]. It is thus unsurprising that the number of genes, which have been recognized for their potential as biomarkers for male sub- and infertility, is growing each year [Zorrilla and Yatsenko, 2013; Carrell et al., 2016; Dipresa et al., 2018; Oud et al., 2019].
With the present study, we aimed at providing a metric basis for the selection of the most predictive fertility markers out of the many possible candidates. For achieving a ranking of genes, we took into account expressional, evolutionary, and network parameters, which together should give an approximation of the functional importance of an individual gene or protein. Indeed, constant levels of expression in a broad range of tissues can be a hint for enhanced functional relevance of a gene [Zhang and He, 2005; Larracuente et al., 2008; Eisenberg and Levanon, 2013]. Greater evolutionary conservation of coding DNAs is usually taken as another indication of essentiality [Wilson et al., 1977; Jordan et al., 2002; Zhang and He, 2005]. According to this, increased importance for survival implies enhanced functional constraint and thus lowered substitution rates [Schumacher et al., 2017 and references therein]. This might reflect that the encoded proteins have larger proportions engaging in interactions with other proteins. Yet, most amino acid exchanges in interacting domains will have negative consequences for the interplay with other proteins, and the underlying mutations will experience negative selection [e.g., Wilke and Drummond, 2010; Yang et al., 2012]. Correspondingly, high connectivity in a protein-protein interaction (PPI) network is widely accepted as additional evidence for elevated relevance of a protein [Jeong et al., 2001; Fraser et al., 2002; Hahn and Kern, 2005; Schumacher and Herlyn, 2018].
The patterns described in the previous paragraph also apply to the majority of genes expressed in the male reproductive tract, especially when they are involved in basic cellular processes. In turn, genes functioning in greater proximity to fertilization quite regularly show an inverse pattern [Dean et al., 2009; Ramm et al., 2009; Schumacher et al., 2014, 2017; see also Fouchécourt et al., 2019]. In such cases, fewer interactants imply less functional constraint, thus allowing for more substitutions [e.g., Kwiatkowski et al., 2020] – a phenomenon that can be fostered by postcopulatory forms of sexual selection such as sperm competition, female choice, and immune evasion [Gasparini and Pilastro, 2011; Løvlie et al., 2013; Lüke et al., 2014a; Ramm et al., 2014; Sirot et al., 2015; Zhou et al., 2015]. However, whatever pattern a male reproductive gene may follow, evolutionary, expressional, and network parameters together should enable an assessment of how important it is for the maintenance of normal male fertility. We have taken advantage of this fact and combined corresponding parameters with a categorization of genes according to their non-association or association with fertility disorders. Conducting binary logistic regressions, we obtained fertility relevance probability (FRP) values for individual genes. The validity of the procedure was evaluated using quantitative proteomics of spermatozoa from normal-fertile and impaired-fertile men. Based on increased FRP values, we eventually selected genes that we consider to be particularly informative biomarkers for male fertility impairment (Fig. 1).
Materials and Methods
Network Reconstruction and Analyses
We used Cytoscape 3.4.0 for network reconstruction and analysis. First, we downloaded PPIs from IntAct, APID, MINT, DIP-IMEx, MatrixDB, InnateDB-IMEx (all integrated in PSICQUIC, October 2016) as well as BioGrid (version 3.4.141). Subsequently, we filtered for PPIs in humans, methods that allowed the detection of direct binary interactions, and participation in the largest connected component (LCC), i.e., the main network. The remaining interactants were mapped to UniProtKB and Ensembl identifiers (IDs) with the UniProt mapping tool, Ensembl BioMart, and manually. Subsequently, we analyzed the degree of identity between canonical amino acid sequences with CD-Hit (sequence identity threshold of 95%). After removal of unmapped interactants and manual resolution of redundancies, the network exclusively contained uniquely identifiable proteins. The corresponding 12,724 nodes (proteins) combined to a single coherent network with 99,113 edges (clustering coefficient = 0.083). For the sake of clarity, we always use regular writing of gene names when referring to proteins. Thus, we use CCDC155 for the protein encoded by CCDC155, instead of KASH5.
Datasets and Binary Categorization of Genes for Logistic Regression Analyses
Dataset M: The initial categorization of proteins used phenotype data of male knockout mice. We retrieved the corresponding information from Mouse Genome Informatics (MGI, http://www.informatics.jax.org/) by mapping Ensembl IDs of the genes encoding the proteins in our network to the IDs of murine 1-to-1 orthologues, using BioMart 86. Murine gene IDs were subsequently checked for entries in the phenotype database of MGI 6.10 (MGI_PhenotypicAllele.rpt; downloaded July 19, 2017). For further consideration, a phenotype had to manifest in male mice homozygous or hemizygous for a targeted knockout. We regarded genes as candidate markers for male fertility impairment (category 1), when their knockout associated with reduced sperm count, abnormal sperm morphology, and male sub- or infertility. However, we excluded genes if no knockout data existed at the time of the study or if a knockout was associated with pre-weaning mortality. We additionally eliminated genes if the ID for a reproductive knockout phenotype proved to be unstable during the study. Genes, for which knockouts associated with none of the included fertility phenotypes nor with any of the excluded ones were collected into the control group (non-candidates: category 0).
Dataset H: In a first step, we collected all fertility markers reported in the following overview articles: Ferlin et al. [2006, 2007], Poongothai et al. , Massart et al. , Hamada et al. , Kovac et al. , Zorrilla and Yatsenko , Aston , Ferlin and Foresta , Coutton et al. , De Braekeleer et al. , Krausz et al. , Carrell et al. , Dhanoa et al. , Song et al. , Mitchell et al. , and Ray et al. . Next, we matched the candidate list with the genes contained in dataset M. We kept the matching entries and subsequently verified the fertility relevance of a gene by consulting the aforementioned reviews and the original studies referenced therein as well as NCBI’s OMIM and PubMed (search item: gene symbol fertil*). We included various forms of fertility impairment in men, thereunder complex syndromes such as hypogonadism (with described fertility problems or lack of puberty), sex reversal (XY individuals with female phenotype), testicular maldescent with fertility restriction, sub-/infertility despite normal sperm parameters/idiopathic cases, and primary ciliary dyskinesia/Kartagener syndrome with fertility restriction. For acceptance of fertility relevance of a gene (category 1), corresponding evidence from a single study was deemed sufficient. It was further accepted if fertility restriction was reported for a single man. We also accepted if fertility relevance of a gene was inferred from comparison of disease and control cohorts, provided the statistical test was significant. However, a gene was excluded if an association with non-normal semen parameters was concluded without reference to a normal state. A gene was also eliminated if fertility relevance was exclusively derived from deviating expression or methylation patterns. We additionally deleted a gene when a causal role in fertility maintenance was not apparent from the sources consulted, e.g., when fertility relevance was only postulated due to a specific phenotype (e.g., testicular maldescent, priapism, hypogonadism or primary ciliary dyskinesia). Furthermore, we disregarded a gene if an indel or SNP could not be unambiguously mapped to it via dbSNP (NCBI). Reports of a positive effect or immune infertility additionally led to deletion of a gene. All in all, a gene was excluded as a precaution in case of uncertainty regarding the categorization. The candidate markers, which passed the filtering, constituted category 1 of dataset H. Category 0 of dataset H contained genes that were not sorted into category 1 and neither were filtered out.
Datasets M′ and H′: These datasets were partial samples of the genes in datasets M and H. Prerequisite for inclusion of a gene into datasets M′ and H′ was the detection of the encoded protein in the spermatozoa of at least six normal-fertile and at least six impaired-fertile men by label-free protein quantification (LFQ). In addition, protein abundances had to differ between both cohorts by at least a factor of 2. Lacking values in other probands were imputed with the aid of an own script assuming a beta distribution within 0.2 and 2.5 percentiles. For more details on LFQ, see below.
Covariates for Logistic Regression Analyses
We retrieved rate ratios of non-synonymous to synonymous substitutions (dN/dS) between 1-to-1 orthologues of humans (Homo sapiens) and mouse (Mus musculus) with the aid of BioMart (Ensembl Genes 86). As a second parameter, we estimated the level of testis-biased expression by comparing testicular transcript abundance with abundances in brain, heart, and ovary. Corresponding calculations used TMM-normalized RNA-Seq data downloaded from www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-2836/samples/. The human genome assembly GRCh38 was used as reference for read mapping, which was done in accordance with the script by Chen et al. . For assessing the centrality of a protein in our PPI network, we collected its node degree, closeness centrality, betweenness centrality, minimal shortest path to proteins encoded by candidate markers (category 1), and the number of category 1 proteins as direct neighbors in the PPI network (see above for network reconstruction and analysis).
Cohorts and Label-Free Quantification
The fertility of the probands was assessed on the basis of their parenting status and spermiograms. The latter were recorded by a qualified and experienced person in accordance with World Health Organization standards [World Health Organization, 2010]. Ejaculates were provided by 75 male Central Europeans, of which 37 were normal-fertile (father of at least 1 naturally conceived child in combination with normal spermiogram parameters) and 38 impaired-fertile (no pregnancy despite regular, unprotected sexual intercourse for at least 12 months, in combination with oligozoospermia, oligoasthenozoospermia, or oligoasthenoteratozoospermia). Both cohorts were similar in age structure, as demonstrated by median and mean values of 29 and 30 years (normal-fertile cohort) and 32 years each (impaired-fertile cohort).
Upon swim-up, spermatozoa were separated from seminal plasma by centrifugation for 5 min at 1,000 g. After discarding the plasma, cells were resuspended twice in PBS and re-pelleted by centrifugation at 2,000 g for 5 min, each time succeeded by discard of the supernatant. Washed spermatozoa were shock-frosted in liquid nitrogen and stored at −80°C. Following thawing on ice, the samples were boiled in lithium dodecyl sulfate buffer and purified by electrophoresis on a 4–12% gradient NOVEX gel (both Life Technologies). After in-gel digestion [Kappei et al., 2013], peptides were separated on a 30-cm reverse-phase capillary (75 μm inner diameter) packed with Reprosil C18 (1.9 μm; Dr. Maisch). Peptides were eluted with a 4 h gradient from 5 to 60% acetonitrile at 200 nL/min, using an Easy LC1000 HPLC system directly mounted to a Q Exactive Plus mass spectrometer (Thermo Scientific). Raw data analysis was performed with MaxQuant (v220.127.116.11) [Cox and Mann, 2008] with LFQ activated and match between runs with a match time window of 0.7 min and alignment time window of 20 min. For quantification, only unique peptides were used. For further consideration, proteins had to be detected with minimum two peptides (one unique, one razor) per sample.
For achieving a ranking according to the predictive power for male fertility impairment, we derived FRP scores for individual genes (datasets M and H). For doing so, we conducted binary logistic regression analyses using the forward likelihood ratio method in SPSS 23.0 v.5 (IBM). The above categorization of genes according to previous knowledge on their fertility relevance gave the dependent variable. Thus, genes with fertility relevance according to male phenotypes belonged to category 1 (candidate markers), while genes without corresponding evidence belonged to category 0 (non-candidates). Nine metric parameters giving sequence evolution, the level of testis-enriched transcription, and network centrality were defined as covariates. Whether or not inclusion of an individual covariate significantly improved the regression model, was tested with the Wald χ2 test implemented in SPSS. Effect size (f) was inferred from Nagelkerke’s R2,
An f value of 0.400 was taken as lower boundary for a large effect size [Cohen, 1988]. Levels of FRP were compared between category 0 and category 1 genes applying the Mann Whitney U (MWU) test in SPSS. This was done separately for datasets M, M′, H, and H′. We additionally tested for an association of FRP scores with absolute values of fold change of protein abundance as inferred from LFQ data (datasets M′ and H′), employing Spearman's rank correlation (SPSS).
All tests conducted were two-sided. We transformed p values into false discovery rates (FDR) applying the procedure of Benjamini and Hochberg . Thus, we multiplied each p value with the factor m/i, where i denotes the rank of an individual p value in the ascending order of all p values, and m is the number of tests performed. Thereby, we accounted for altogether 24 tests (2 logistic regressions with 9 χ2 tests each + 4 MWU tests + 2 Spearman's rank correlations). Notably, this is a conservative approach since, in logistic regression analyses, we were primarily interested in finding a summarizing model but not to identify the best predictor variables for gene categorization. Furthermore, the LFQ measurements were not intended to identify new biomarkers, but only to validate the procedure of FRP inference. Therefore, we did not conduct significance testing of protein expression levels.
Descriptive Approaches and Visualization
We applied arbitrary FRP threshold values of 0.600 and 0.350 to the genes in datasets M and H, for deriving proportions of true and false cases. Moreover, we matched our candidate marker set with the compilation of housekeeping genes by Eisenberg and Levanon . According to the underlying concept, in a housekeeping gene, more than half of the exons of at least one RefSeq transcript have to show largely constant expression in 16 human tissues. The matching was carried out with gene symbols and Ensembl gene IDs. These were retrieved for RefSeq mRNA IDs in the housekeeping gene compilation, using the data mining tool BioMart (Human genes: GRCh38.p13 in Ensembl Genes 99). For visualization of FRP distributions in datasets M, M′, H, and H′, we generated violin plots with the aid of BoxPlotR (http://shiny.chemgrid.org/boxplotr). Complementary network analysis aimed at a validation of the ranking of the 22 candidate markers in Table 5. Using STRING 11.0 (https://string-db.org/; visited May 29, 2020), we obtained confidence values for individual edges. The edges were collected from various sources, including text mining, experiments, databases and co-expression. Threshold values applied to the edges (0.150, 0.400, 0.700) corresponded to predefined increments for low, medium, and high confidence levels.
Genes Categorized according to Murine Knockout Phenotypes
Our initial dataset included 2,753 testis-expressed genes (dataset M) that were contained in the MGI phenotype database and additionally had 1-to-1 orthologues in our PPI network (Fig. 1). All evolutionary and expressional parameters considered were available for these genes as well. For 2,502 of these genes, no fertility restriction was observed in male mice homozygous or hemizygous for the respective knockout (category 0). For the other 251 genes, male knockout mice were reported to be sub- or infertile (category 1). Logistic regression analysis supported the inclusion of dN/dS, transcript abundance in testis relative to brain and heart, and the number of neighbors in the PPI network into the summarizing model (FDR ≤ 0.012, each; Table 1). After successive inclusion of the above variables, effect size culminated in a value of 0.434, which according to the definition applied corresponds to a large effect (online suppl. Table 1; see www.karger.com/doi/10.1159/000511117 for all online suppl. material). Across all proteins in dataset M, the values of FRP ranged from 0.005 to 0.915. Thereby, the FRP score of individual genes increased with stronger evolutionary conservation and more testis-enriched expression. The number of directly neighbored category 1 members in the PPI network was an additional positive determinant of FRP (Table 1). Highest FRP scores were derived for category 1 genes, while lowest ranks were occupied by category 0 genes. Accordingly, the FRP level was significantly higher in genes that were sorted into category 1 than in their category 0 counterparts (FDR = 0.000, MWU; Table 2). An arbitrary FRP threshold of 0.350 confirmed the initial categorization in 57% of the genes. When applying an FRP threshold of 0.600, true cases amounted to 54% still (Fig. 2A).
In dataset M, 149 genes coded for proteins with at least 2-fold differential protein abundances between spermatozoa of normal- and impaired-fertile men (Fig. 1; online suppl. Table 2). Thereby, proteins with markedly different abundances were encoded at higher rate by genes belonging to category 1 (13%) than category 0 (5%). We used the subsample of 149 proteins and their coding genes (dataset M′) to assess whether the FRP values were still grounded in empirical reality. In confirmation of this, the FRP score tightly correlated with the fold change of protein expression in the spermatozoa of our 2 study groups (FDR = 0.000, Spearman's rho = 0.328; Table 3). Thus, the more the expression of a protein was dysregulated between normal-fertile and impaired-fertile men, the higher was the FRP value of the coding gene. Furthermore, as in dataset M, the FRP level was significantly higher for category 1 than category 0 genes in dataset M′ (FDR = 0.000, MWU test; Table 2). Yet, the focus on dataset M′ led to a stronger increase in category 1 than category 0 for the FRP score, thus enhancing the discriminatory power. In detail, an FRP threshold of 0.350 confirmed the initial categorization in 61% of the genes in dataset M′, while a threshold value of 0.600 implicated 59% true cases (Fig. 2A, B).
Reproduced Findings in Genes Classified according to Phenotypes in Humans
We additionally inferred FRP values for a sample in which genes were grouped based on fertility phenotypes in men (Fig. 1). This dataset H contained 2,526 genes, whereby category 0 was larger (N = 2,474) than category 1 (N = 52) again. The fewer category 1 members in dataset H (relative to dataset M) reflected the deletion of genes in case of uncertainty (for criteria, see Materials and Methods). A reduced confidence in the categorization of genes by phenotypes of men could also be the reason for an overall lower FRP level in the analysis of dataset H (0.000–0.875), when compared to dataset M. Furthermore, betweenness centrality emerged as a new positive determinant of FRP in logistic regression analysis of dataset H (FDR = 0.040), while dN/dS was not amongst the predictors anymore. Also, the number of directly adjacent category 1 members in the PPI network was included into the regression model again (FDR = 0.012; Table 4). Moreover, transcription levels in testis relative to brain and heart were reproduced as positive covariates in dataset H (FDR ≤ 0.012, each; Table 4). In addition, the effect size (0.566) associated with the summarizing regression model was again large (online suppl. Table 1). Furthermore, in dataset H, the upper FRP ranks were occupied by category 1 genes once more, while the lowest ranks were exclusively taken by category 0 genes (Fig. 2C). Consequently, in dataset H, FRP levels were significantly higher in category 1 than in category 0 (FDR = 0.000, MWU; Table 2), just as in dataset M. Also, application of the above FRP threshold values to dataset H resulted in similar percentages of true cases (FRPH ≥ 0.350: 59%; FRPH ≥ 0.600: 54%) (Fig. 2C) as reported above for dataset M.
These findings were basically confirmed in the subset of 125 genes of dataset H′ (online suppl. Table 2), for which proteins showed at least 2-fold differential abundances in the spermatozoa of normal- and impaired-fertile men. Thus, in dataset H′, the level of predicted FRP was significantly higher in category 1 than category 0 genes (FDR = 0.000, MWU; Table 2), and true cases made up a much higher proportion of category 1 (12%) than category 0 genes (5%). In addition, the discriminatory power showed the expected raise in dataset H′ as evidenced by percentages of true cases (FRPH′ ≥ 0.350: 66%; FRPH′ ≥ 0.600: 58%) (Fig. 2D). Not least, in the genes included in dataset H′, the FRP score once more correlated significantly with the fold change of protein abundance in spermatozoa of men with different fertilities (FDR = 0.000; Spearman's rho = 0.318; Table 3).
Selecting Candidate Markers for Male Fertility Impairment
Genes and proteins with increased FRPs should have the highest predictive potential (Fig. 1). For their selection, we applied different FRP thresholds to datasets M (FRPM ≥ 0.600) and H (FRPH ≥ 0.350), thus accounting for different levels of the parameter in both datasets. If genes had to take the respective thresholds in datasets M and H, 9 candidates remained as the most predictive markers for male fertility impairment: AKAP4, TNP1, DAZL, BRDT, DMRT1, SPO11, ZPBP, HORMAD1, and SMC1B (class A candidate markers in Table 5). If elevated FRP values in analysis of dataset M were regarded as sufficient, 11 additional candidates emerged (class B candidate markers): DDX4, SHCBP1L, CCDC155, ODF1, DMRTB1, ASZ1, BOLL, FKBP6, SLC25A31, PRSS21, and RNF17. Besides such true positives, some category 0 genes might be rewarding candidates, considering their elevated FRP values. The highest-ranking examples for such formally false negatives are TEX37 and POU4F2 (class C candidate markers). Notably, there were no housekeeping genes among the 22 candidate markers in Table 5.
Complementary PPI network analysis with STRING widely confirmed the above ranking of genes as candidate markers for male fertility impairment. Thus, when accepting low confidence for edges, all the proteins encoded by class A and class B candidate markers combined to a single network (LCC). On the contrary, proteins encoded by class C candidates had no interaction partner (Fig. 3). Raising confidence threshold via medium to high level would additionally disconnect part of the class B proteins before first class A proteins would be expelled (see edge thickness in Fig. 3). In further support of their candidate status, out of the 22 genes listed in Table 5, five and thus 23% coded for proteins that showed more than 2-fold downregulated abundance in spermatozoa of impaired fertile men: AKAP4, ZPBP, ODF1, RNF17, and TEX37. This is above the random expectation of 5%, as given by 149 genes in dataset M′ out of a total of 2,753 genes in dataset M. Lastly, post-hoc matching of the listed genes with external evidence confirmed the candidate status for nine of them (see empty rhombs and filled circles in Table 5).
Low Potential Candidates and Non-Candidates
Values approaching the lower boundary of the FRP interval (0,1) questioned the marker potential of genes. For example, FRPM values <0.030 had been calculated for ADRA1B, GRIA3, and CCND2 (not shown). They had initially been sorted into category 1 and thus may formally be regarded as false positives. Such genes should have less predictive power as biomarkers for male fertility impairment than the candidates presented in the previous section. In any case, category 0 genes with lowered FRPs should have the least potential as fertility markers. These true negatives made up the majority of cases in all datasets, regardless of the FRP threshold applied (Fig. 2).
In the present study, we carried out binary logistic regression analyses to evaluate the potential of genes as markers for male fertility impairment. The FRP score we have derived in this way for individual genes can take any value in the interval (0,1). Depending on the number of decimal places, the method thus allows a finer gradation of genes according to their predictive power than is possible with alternative methods underlying 5 to 15 categories [Smith et al., 2017; Oud et al., 2019]. More importantly, the procedure can be modified by other combinations of putative predictors and is principally applicable to the selection of biomarkers for other clinical manifestations. Here, it should be an advantage that the method can be operated with basic parameters that can be retrieved from public databases. Hence, the comparability of processed data from different studies has not to be established in the first place, as is usually required in merging approaches. In addition, there is no need to compare the results from different studies, as is otherwise the case with meta-analyses [Zhang et al., 2018]. Thus, the present procedure is comparatively straightforward to use.
In support of the validity of the current approach, we found the FRP score being positively correlated with the extent to which protein abundances differed between spermatozoa of normal- and impaired-fertile men (Tables 3, 5; online suppl. Table 1). Differentially expressed proteins even amounted to 23% in our selection of candidate markers. The relevance of the candidate markers shortlisted in Table 5 is further underscored by consistencies with follow-up studies that were not included in our initial categorization of genes. For example, Oud et al. [2019; their Table SIV] found for altogether 5 of our class A candidate markers (AKAP4, BRDT, DMRT1, SPO11, ZPBP) variants causing male fertility impairment. The same authors recognized fertility relevance for CCDC155, which was ranked as a class B candidate marker in the present analysis of dataset M (Table 5). In addition, present class A member DMRT1 was evaluated as one of the most relevant candidate markers for azoo- and cryptozoospermia by Tüttelmann et al. . Beyond that, at least 4 of the above-mentioned class A and B candidate markers (AKAP4, TNP1, DAZL, DDX4) have already been discussed as male fertility markers in farm animals such as horse, cattle, and pig [Ma et al., 2013; Kim et al., 2015; Zhang et al., 2015; Blommaert et al., 2019; Li et al., 2019]. Not least, the frequency of confirmatory evidence increased toward genes with higher FRP scores (see graphic symbols in Table 5).
Taken together the results from regression analyses of datasets M and H, the genes received elevated FRP values, which combined testis-biased expression with higher connectivity in the PPI network. Although less stable, stronger sequence conservation seems to be another indication of higher FRP (Fig. 3; Tables 1, 4). This suggests that the candidate markers in Table 5 should primarily be involved in basic cellular processes of particular importance for reproduction [e.g., Jeong et al., 2001; Zhang and He, 2005]. However, they should be implicated to less extent in processes taking place close to fertilization [Schumacher et al., 2014, 2017]. Furthermore, no housekeeping genes were found among the candidate markers [Eisenberg and Levanon, 2013]. We consider this to be a favorable constellation for the definition of biomarkers, the targeting of which by new drugs should have minimal side effects [Kovac et al., 2013].
Functional Implications of Candidate Markers for Male Fertility Impairment
A closer look at the functions of the 9 class A candidate markers (Table 5) underlines that they should have greater fertility relevance. The majority of these genes are involved in precopulatory processes, while only two are better known for postcopulatory involvements. The latter pair includes the highest-ranked candidate marker, AKAP4, which codes for A-kinase anchoring protein 4. The protein is a major component of the fibrous sheath in the sperm tail and participates in the cAMP/PKA/AKAP4 pathway. As such, it is primarily involved in capacitation and sperm motility, but the protein has additionally been implicated in acrosome reaction [Turner et al., 1998; Miki et al., 2002; Hamada et al., 2013; Coutton et al., 2015; Rahamim Ben-Navi et al., 2016]. Yet, the significance of AKAP4 could go beyond the maintenance of normal fertilization ability of spermatozoa. In fact, mutations in the gene have been linked to phenotypes such as asthenozoospermia and teratozoospermia, suggesting a role in spermiogenesis [Massart et al., 2012; Coutton et al., 2015]. The second member of the mentioned pair with mainly postcopulatory significance is ZPBP. In fact, zona pellucida-binding protein engages in the eponymous binding of the spermatozoon to the zona pellucida upon acrosome reaction [McLeskey et al., 1998; Ito and Toshimori, 2016]. Nevertheless, ZPBP seems also to be required for the formation of normally shaped spermatozoa, as indicated by missense and splicing mutations in the coding gene in almost 4% of men with abnormal sperm head morphology [Yatsenko et al., 2012; see also Ray et al., 2017].
The proteins encoded by other class A candidate markers function not as close to fertilization as AKAP4 and ZPBP but nevertheless are essential for reproductive health. For example, DMRT1 codes for doublesex and mab-3-related transcription factor 1, which is a regulator of sex determination, testis formation, and spermiogenesis [Raymond et al., 2000; Matson et al., 2011; Tüttelmann et al., 2018]. In BRDT (bromodomain testis associated), importance for spermiogenesis is reflected in participation of the encoded protein in germ cell differentiation [Shang et al., 2007; Barda et al., 2012], probably by regulating mRNA splicing and transcriptional repression [Berkovits et al., 2012]. In the case of DAZL (deleted in azoospermia-like form), potential relevance for germ cell differentiation results from the presumed controlling influence, which the protein has on the cell cycle switch from mitotic to meiotic cell division. Beyond that, DAZL has been implicated in gamete formation and survival [Kee et al., 2009; Dhanoa et al., 2016; Zagore et al., 2018]. Importance for male fertility is plausible also for the second-placed candidate marker, TNP1. The encoded protein, spermatid nuclear transition protein 1, is one of the histone substitutes in spermatids [Zhao et al., 2004; Lüke et al., 2014b], which subsequently are replaced by other candidate markers, protamines [Kovac et al., 2013; Zorrilla and Yatsenko, 2013; Carrell et al., 2016].
Reproduction relevance of another class A candidate, SPO11, is reflected in the initiation of meiotic double-stranded breaks (DSBs) prior to recombination by the protein [Cole et al., 2010; Bloomfield, 2016]. As such, SPO11 acts in close linkage with proteins, which two other top-ranked candidates, HORMAD1 and SMC1B, code for. In fact, HORMAD1 (HORMA domain containing 1), participates in DSB repair [Carofiglio et al., 2018], and SMC1B (structural maintenance of chromosomes 1B) engages in chromosome segregation, chromatid cohesion, and the maintenance of genome stability [Revenkova et al., 2001; Mannini et al., 2015]. Thus, candidate markers SPO11, HORMAD1, and SMC1B corroborate that genome and chromosome integrity are essential for normal spermiogenesis and male fertility [Ferlin et al., 2007; Poongothai et al., 2009; Zorrilla and Yatsenko, 2013; Bracke et al., 2018; Cariati et al., 2019; Cheung et al., 2019]. Actually, this is also true for DAZL, through its regulatory effect on the gene coding for synaptonemal complex protein 3 (SYCP3) [Aarabi et al., 2006; Reynolds et al., 2007; see also Miyamoto et al., 2003]. The widely acknowledged role of subnetworks in the etiology of diseases and disorders [Hayashida and Akutsu, 2016] was additionally reflected in the following findings: First, the number of neighbored category 1 members in the PPI network was a positive predictor of FRP (Tables 1, 4). Second, all class A and class B candidate markers in Table 5 combined to an LCC (Fig. 3).
It is generally known that the transferability of observations made in the mouse model to humans has its limitations [Massart et al., 2012]. Nevertheless, an increased potential as fertility marker is also plausible for class B candidate markers, which were primarily selected based on high FRP scores in analysis of dataset M (Table 5). For example, DMRTB1 (DMRT like family B with proline-rich C-terminal 1), for which the coding gene is a paralog of the above-mentioned DMRT1, is considered a central coordinator of the transition from mitosis to meiosis [Hilbold et al., 2019]. In line with this, loss-of-function mutations have been found to associate with azoospermia in men [Zorrilla and Yatsenko, 2013]. Another example illustrating the relevance of class B members for male fertility is DDX4. Thus, the encoded DEAD box helicase 4 is presumed to promote meiotic progression and repression of transposon activity [Medrano et al., 2012]. In contrast to DMRTB1and DDX4, the functional importance of other class B members manifests less at the regulatory but on the structural level. In particular, SHCBP1L (SHC binding and spindle associated 1 like) stabilizes the meiotic spindle apparatus [Liu et al., 2014], and CCDC155 (coiled-coil domain containing 155) participates in the linkage of meiotic chromosomes to the cytoskeleton [Stewart and Burke, 2014]. In addition, the fibrous sheath component ODF1 (outer dense fiber 1) is assumed to enable the recoil of the bent sperm flagellum [Shao et al., 1999]. In accordance with the above examples, the remaining six class B candidate markers, ASZ1, BOLL, FKBP6, SLC25A31, PRSS21, and RNF17 (Table 5) also code for proteins, which engage in spermiogenesis, sperm maturation, and sperm functionality [Crackower et al., 2003; Pan et al., 2005; Brower et al., 2009; Lin et al., 2009; Netzel-Arnett et al., 2009; Wang et al., 2017].
Future investigations might substantiate if class C candidates such as TEX37 and POU4F2 are informative in respect to male fertility impairment (Table 5). Elevated FRP scores as well as functional and expressional data suggest their candidate status indeed. Thus, TEX37 (testis expressed 37) is primarily expressed in testis, where the protein assumedly plays a role in spermatogenesis [Yu et al., 2007; Khan et al., 2018]. Testicular expression is also known for POU4F2 (POU class 4 homeobox 2; also known as BRN3B), whereby the encoded transcription factor presumably regulates transcription of spermiogenesis genes [Budhram-Mahadeo et al., 2001]. But even if TEX37 and POU4F2 have marker potential, their predictive power should be lower than for class A and class B candidate markers (Table 5). In addition, their potential fertility relevance would unlikely manifest in closer association with present class A and class B candidates. After all, TEX37 and POU4F remained isolated in the network reconstruction (Fig. 3).
Less Predictive Candidate Markers and Non-Candidates
Low FRP values question the marker potential of most of the genes, which we had initially assigned to category 0 (Fig. 2; Table 2). However, lowered FRPs were also inferred for part of the category 1 candidate markers for which fertility relevance was expected. Extreme examples were the genes coding for ADRA1B (alpha-1-adrenergic receptor 1B), GRIA3 (glutamate ionotropic receptor AMPA type subunit 3), and CCND2 (cyclin D2), which all received FRP scores <0.03 according to our calculations. However, irrespective of their formal assignment to category 1, annotation data do not suggest special importance for male reproduction. In particular, CCND2 might rather be important for maintaining female fertility [e.g., Chermula et al., 2019]. Moreover, GRIA3 has mainly been implicated in mental capability and diverse mental disorders and diseases [e.g., Fang et al., 2015; Davies et al., 2017], and ADRA1B seems to be involved in the etiology of psoriasis [Fan et al., 2019]. In this respect, their low FRP values appear to be reasonable.
Logistic regression analyses enabled the inference of FRP values for more than 2,750 genes. The procedure was validated in quantitative proteome analysis of altogether 75 men with normal and impaired fertility. In particular, a higher FRP score of an individual gene corresponded to a greater fold change of protein abundance between both cohorts. In further support of the validity of the logistic regression models inferred, FRP distributions met the expectation. Thus, genes which before were known for fertility relevance received higher FRP values than their counterparts without such association. Not least, we found our post-hoc ranking of 22 candidate markers (classes A–C) confirmed in edge confidence levels in network analysis conducted on them (Fig. 3). This approach revealed an increasing probability of an individual candidate marker to participate in a functionally linked subnetwork the higher it was ranked.
The shortlist of candidate markers in Table 5 is intended as a contribution toward a marker panel of male fertility impairment [Kovac et al., 2013; Dipresa et al., 2018; Tüttelmann et al., 2018]. Obviously, class A candidates should be particularly suitable for such application, followed by class B members, while class C members still require validation. Some of these genes and proteins could also be useful targets for fertility treatment yet to be developed. In addition, the higher ranked ones might open up new avenues for the development of new contraceptive strategies [Kaur and Prabha, 2014]. Prime candidate for the new contraceptive strategies could be ZPBP [Lin et al., 2007]. It should be an advantage here that ZPBP is not a housekeeping gene (Table 5). Consequently, the risk of significant side effects by a blocking agent is likely to be low. Expanding the contraceptive offer would bring added value indeed, as this may help reducing the estimated rate of 44% unintentional pregnancies and more than one million abortions per year worldwide [Kaur and Prabha, 2014; Bearak et al., 2018].
The results of present logistic regression analyses and the FRPs derived for individual genes are freely available at www.prefer-genes.uni-mainz.de (Figs. 4, 5). The short name of the website, PreFer Genes, refers to Prediction of Fertility Genes by logistic regression analysis. The acronym of the website additionally reflects the goal of providing assistance, which genes should be preferred for the development of new applications in reproductive medicine. Thereby, the website reports FRP values for all investigated genes, so that the selection of candidates can be made against the background of non-candidates. However, we do not claim comprehensiveness of the gene lists presented on the PreFer Genes website, although we consulted a large set of review articles and original studies on the genetic causes of male fertility impairment. Thus, there will be more parameters and biomarkers for male sub- and infertility than reported herein and at www.prefer-genes.uni-mainz.de.
We thank Anja Freiwald (IMB Mainz) for technical support in proteome analysis. Furthermore, we gratefully acknowledge that the implementation of reviewer comments improved the manuscript.
Statement of Ethics
The study was approved by the ethics committee of the Medical Faculty of the Martin Luther University Halle/Saale (decision date: April 19, 2010, amendment date: March 14, 2013). All patients gave written informed consent.
Conflict of Interest Statement
The authors have no conflicts of interest to declare.
Bioinformatic analyses by J.S. were in part funded by the Johannes Gutenberg University of Mainz and in part by Deutsche Forschungsgemeinschaft, DFG, to the credit of H.H. (project HE 3487/3: Bioinformatic prediction of sperm protein relevance and validation of the parameters network centrality and substitution rate in men and bulls).
T.G., J.S., F.B., and H.H. conceived the study. T.G. and H.M.B. planned the sampling, obtained votes and permissions, diagnosed the fertility status of probands, and collected and processed samples for proteomic analyses. J.S. planned logistic regression analyses, collected all data on which regression analyses are based, conducted binary logistic regression analyses, computed FRP values, and built the PreFer Genes website. M.D. and F.B. further processed the samples, conceived preliminary experiments, conducted quantitative proteomics, analyzed mass spectra, imputed lacking values, and determined fold change values. H.Z. participated in manuscript organization. H.H. supervised the study. H.H., T.G., and F.B. wrote the manuscript. All authors read and approved the final manuscript.
T.G. and J.S. contributed equally to this work.