Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Comparative Study
. 2023 Jan;613(7943):308-316.
doi: 10.1038/s41586-022-05547-7. Epub 2022 Dec 21.

The molecular evolution of spermatogenesis across mammals

Affiliations
Comparative Study

The molecular evolution of spermatogenesis across mammals

Florent Murat et al. Nature. 2023 Jan.

Abstract

The testis produces gametes through spermatogenesis and evolves rapidly at both the morphological and molecular level in mammals1-6, probably owing to the evolutionary pressure on males to be reproductively successful7. However, the molecular evolution of individual spermatogenic cell types across mammals remains largely uncharacterized. Here we report evolutionary analyses of single-nucleus transcriptome data for testes from 11 species that cover the three main mammalian lineages (eutherians, marsupials and monotremes) and birds (the evolutionary outgroup), and include seven primates. We find that the rapid evolution of the testis was driven by accelerated fixation rates of gene expression changes, amino acid substitutions and new genes in late spermatogenic stages, probably facilitated by reduced pleiotropic constraints, haploid selection and transcriptionally permissive chromatin. We identify temporal expression changes of individual genes across species and conserved expression programs controlling ancestral spermatogenic processes. Genes predominantly expressed in spermatogonia (germ cells fuelling spermatogenesis) and Sertoli (somatic support) cells accumulated on X chromosomes during evolution, presumably owing to male-beneficial selective forces. Further work identified transcriptomal differences between X- and Y-bearing spermatids and uncovered that meiotic sex-chromosome inactivation (MSCI) also occurs in monotremes and hence is common to mammalian sex-chromosome systems. Thus, the mechanism of meiotic silencing of unsynapsed chromatin, which underlies MSCI, is an ancestral mammalian feature. Our study illuminates the molecular evolution of spermatogenesis and associated selective forces, and provides a resource for investigating the biology of the testis across mammals.

PubMed Disclaimer

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Fig. 1. snRNA profiling across ten mammals and a bird.
a, Species sampled and uniform manifold approximation and projection (UMAP) of snRNA-seq datasets. UMAP of the integrated primate dataset (dashed box), showing undifferentiated and differentiated SG (undif. SG and dif. SG, respectively), leptotene, zygotene, pachytene and diplotene SCs (lept. SC, zyg. SC, pach. SC and dipl. SC, respectively), spermatids (SD) and somatic cell types. Chimp., chimpanzee. b, Principal component (PC) analysis of cell-type pseudo-bulks. Species and lineages are encircled by a dashed line. Each symbol represents an individual. c, Gene expression phylogeny based on pseudo-bulk transcriptomes for whole testes. Bootstrap values (4,498 1:1 orthologous amniote genes were randomly sampled with replacement 1,000 times) are indicated by circles, ≥0.9 (white fill).
Fig. 2
Fig. 2. Gene expression divergence and evolutionary forces.
a, Total branch lengths of expression trees among testicular cell types for amniotes and primates. Box plots show the median (central value); upper and lower quartile (box limits) and 95% confidence intervals (whiskers) for 1,000 bootstrap replicates. b, Spearman’s correlations between humans and other species from 100 bootstrap replicates (dots). Lines correspond to linear regression trends (after log transformation of the time axis. Regression R2 values range from 0.86 to 0.97. Ma, millions of years ago. c, Mean pLI value of expressed genes in human (the pLI score reflects the tolerance of a gene to a loss-of-function mutation; lower values mean less tolerance). d, Percentage of expressed genes leading to a lethal phenotype when knocked out in mouse (out of 4,742 knockouts). e, Mean normalized ratio of nonsynonymous (dN) over synonymous (dS) nucleotide substitutions of expressed genes in macaque. f, Percentage of expressed genes under positive selection (out of 11,170 genes tested for positive selection) in chimpanzee. g, Mean phylogenetic age of expressed genes in mouse. h, Percentage of UMIs mapping to protein-coding genes (top) or intergenic elements (bottom) in gorilla. i, Translational efficiency values (data from ref. ) are plotted for all genes with predominant expression in a given cell type in human. j, Mean of tissue-specificity values (data from ref. ) in opossum. k, Percentage of expressed genes associated with infertility (out of 3,552 knockouts) in mouse. ch,j,k, Plotted is the mean value per cell. ck, Superimposed thick black dots indicate medians from biological replicates. Box plots depict the median (centre value); upper and lower quartile (box limits) with whiskers at 1.5 times the interquartile range. Red lines separate somatic (OS, other somatic; ST, Sertoli cells) and germ cells. Data for other studied species are shown in Extended Data Fig. 3.
Fig. 3
Fig. 3. Evolution of gene expression trajectories along spermatogenesis.
a,b, The numbers of changed trajectories (in purple) and conserved trajectories (in olive) are indicated. In ac, red asterisks indicate the branch for which a trajectory change has been called. For each replicate, mean expression levels across cells of a given cell type were calculated. Marks indicate values for the replicates with the highest and lowest mean expression levels, dots indicate the median of mean expression values of the replicates (three replicates for chimpanzee and opossum; two for human, bonobo, gorilla, macaque, marmoset, mouse, platypus and chicken; one for gibbon). In ac, dashed vertical lines separate SG, SCs, rSD and eSD. a, Primate trajectories (human, bonobo, chimpanzee, gorilla, gibbon, macaque and marmoset; from top to bottom) based on 4,459 1:1 orthologues. TEX11 and PI4KB are examples of conserved and changed trajectories, respectively. b, Amniote trajectories (human, mouse, opossum, platypus and chicken; from top to bottom) based on 2,927 1:1 orthologues. DMRT1 and IP6K1 are examples of conserved and changed trajectories, respectively. c, RUBCNL expression trajectory from snRNA-seq data along spermatogenesis for human, bonobo, chimpanzee and gorilla (from top to bottom). Available orangutan samples did not allow for the generation of high-quality snRNA-seq data. d, Detection of RUBCNL (each red dot reflects a single transcript) expression in human, chimpanzee and orangutan testis by smISH using RNAScope. Left, seminiferous tubule cross sections counterstained with haematoxylin, and closeups on areas containing SG, SC and spermatids (SD). Right, quantification of RUBCNL expression levels in ten tubules per section (n = 3 for human, n = 1 for chimpanzee and orangutan). Box plots represent the mean (central value) distribution of staining (dots); upper and lower quartile (box limits) with whiskers at 1.5 times the interquartile range.
Fig. 4
Fig. 4. Mammalian sex-chromosome evolution.
a, Phylogenetic tree for human, mouse, opossum and platypus (from top to bottom). Arrows show sex systems origination. Illustration of therian (human, top) and monotreme (platypus, bottom) sex chromosomes. Recombining (crosses) PARs in turquoise and SDRs in red. b, Percentages of X-linked genes among testis-specific genes with predominant expression in a given cell type for human, macaque, mouse, opossum and platypus (from left to right; Supplementary Table 10). The red dashed line represents the expected percentages of X-linked genes, if testis-specific genes with predominant expression in the different cell types were randomly distributed across the genome. Asterisks indicate significance after two-sided exact binomial testing (Benjamini–Hochberg corrected P values from left to right: 0.000353, 1.18 × 10−7, 0.000687, 0.0142, 1.08 × 10−7, 3.45 × 10−8, 0.000792, 0.0355 and 0.00043). The number of testis-specific X-linked genes enriched in each cell type is indicated above each bar. c, UMAP representation of germ cells (left). Progression of spermatogenesis (grey arrow) and meiotic divisions (black arrows) are indicated. Spermatids identified as X- and Y-bearing are coloured in orange and purple, respectively. Box plots show the median (central value) percentages of X and Y transcripts in X- and Y-bearing spermatids; upper and lower quartile (box limits) with whiskers at 1.5 times the interquartile range. Two-sided Wilcoxon rank-sum tests were performed for statistical comparisons (Benjamini–Hochberg corrected ****P < 2.2 × 10−16). d, X-to-autosome transcript ratios in individual germ cells across spermatogenesis (from left to right). For spermatids, only X-bearing cells are considered. Lines depict generative additive model trend. Red arrows indicate MSCI. c,d, For platypus, X transcripts are dissected according to their location on the X chromosomes (Methods).
Extended Data Fig. 1
Extended Data Fig. 1. Cellular composition and marker gene assessment.
a–e, Clustering and UMAP representations (left), marker gene-based cell type assignment (right), for a, primates, b, mouse, c, opossum, d, platypus and e, chicken. CLU marks Sertoli cells; TAGLN and ACTA2 peritubular and smooth muscle cells; CD34 and TM4SF1 endothelial cells; APOE and CD74 macrophages; STAR and CYP11A1 Leydig cells; GFRA1, PIWIL4 (undifferentiated), DMRT1(differentiated), and STRA8 spermatogonia; SYCE1 (leptotene), SYCP1 (zygotene), PIWIL1 (pachytene), SYCP2, TANK and AURKA spermatocytes; LRRIQ1 (early), ACRV1 (late) and SPACA1 (late) round spermatids; SPATA3, NRBP1, PRM1 and GABBR2 elongated spermatids. f, Cell-type composition. g, Scaled expression of specific genes (y-axis) for all species and cell types (x-axis).
Extended Data Fig. 2
Extended Data Fig. 2. Mammalian testicular gene expression divergence.
a, Gene expression phylogenies. Neighbor-joining trees based on pairwise expression level distances (1−rho, Spearman’s correlation coefficient) for the main testicular cell types, respectively (“Other somatic” refers to peritubular, smooth muscle, endothelial, macrophage, and Leydig cells as detailed in the legend of Extended Data Fig. 1 and Methods). Trees are drawn to the same scale (indicated by the scale bar). Bootstrap values (i.e., proportions of replicate trees that have the branching pattern as in the majority-rule consensus tree shown) are indicated by circles at the corresponding nodes: ≥ 0.9 (white fill). b, Human-macaque (left, n = 304) and human-mouse (right, n = 257) pairwise Spearman correlations after downsampling to the same number of nuclei and UMI across cell types for each species. Boxplots show the median (central value); upper and lower quartile (box limits) and 95% confidence intervals (whiskers) for 100 bootstrap replicates.
Extended Data Fig. 3
Extended Data Fig. 3. Evolutionary forces.
a, Tolerance to functional variants. b, Average normalized ratio of nonsynonymous over synonymous nucleotide substitutions (dN/dS) of expressed genes across primates. c, Percentage of positively selected genes along spermatogenesis. Opossum, platypus, and chicken were not used in the original studies, and so the 1:1 orthologues in these species were used. d, Phylogenetic age of cellular transcriptomes. Higher values reflect increased lineage-specific gene contributions (that is, younger transcriptomes). e, Percentages of UMIs that map to protein coding genes (top) or to intergenic elements (bottom). XLOC_030429 drives a different mouse pattern. f, Translational efficiency (TE; normalized log2-transformed values) of expressed genes. g, Tissue specificity. h, Time specificity (development). In a, b, and d, human middle temporal gyrus (MTG) snRNA-seq data was used for non-gonadal (brain) comparisons. See legend of main Fig. 2 for further details regarding the plotted measures. Box plots depict the median (center value); upper and lower quartile (box limits) with whiskers at 1.5 times the interquartile range.
Extended Data Fig. 4
Extended Data Fig. 4. Examples of gene expression trajectory changes along primate spermatogenesis validated by smISH.
Left: Median gene expression trajectory from snRNA-seq along spermatogenesis for human, bonobo, chimp., and gorilla (from top to bottom). For each replicate, mean expression levels across cells of a given cell type were calculated. Marks indicate values for the replicates with the highest and lowest mean expression levels, dots indicate the median of mean expression values of the replicates. There are two biological replicates for human, bonobo, and gorilla and three biological replicates for chimpanzee. Center: images of seminiferous tubule cross sections from human, chimp., and orangutan (from top to bottom) stained with smISH for MYO3B (a) and ADAMTS17 (b) using RNAScope. Red asterisks indicate on which branch a trajectory change was called. Examples of spermatogonia (SG) and round spermatids (rSD) closeups are provided for visualization purposes. Right: percentage of positive cells in SG and rSD from 10 tubules per section (n = 3 for human, n = 1 for chimp. and orangutan). Box plots represent the mean (central value) distribution of staining; upper and lower quartile (box limits) with whiskers at 1.5 times the interquartile range.
Extended Data Fig. 5
Extended Data Fig. 5. Conservation of gene expression trajectories along spermatogenesis.
a, Global trajectory conservation score across amniotes for changed (n = 237) and conserved (n = 389) trajectories (P < 2.2x10−16, two-sided Wilcoxon rank-sum test). b, Trajectory conservation score between human and mouse of genes that lead to either a fertile (n = 1077) or an infertile (n = 108) phenotype when knocked out in mouse (P = 0.007, two-sided Wilcoxon rank-sum test). a,b, Box plots depict the median (center value); upper and lower quartile (box limits) with whiskers at 1.5 times the interquartile range. c, Gene expression trajectories (DMRT1, top; IP6K1, bottom) along primate spermatogenesis (spermatogenesis, from left to right). For each replicate, mean expression levels across cells of a given cell type were calculated. Marks indicate values for the replicates with the highest and lowest mean expression levels, dots indicate the median of mean expression values of the replicates. There are three replicates for chimpanzee; two replicates for human, bonobo, gorilla, macaque, and marmoset; one replicate for gibbon. The dashed vertical lines separate spermatogonia (SG), spermatocytes (SC), round (rSD) and elongated spermatids (eSD).
Extended Data Fig. 6
Extended Data Fig. 6. Functions of conserved and changed trajectories.
a, Top 20 enriched biological process GO terms of genes showing conserved expression trajectories across primates and peak expression in the different cell types, respectively. b, Top 20 enriched biological process GO terms of genes showing conserved trajectories between human and mouse and peak expression in the different cell types, respectively. c, The top 20 enriched biological process GO terms of genes with trajectory changes in primates. d, The top 20 enriched biological process GO terms in genes showing trajectory changes in amniotes. e, Human tissue (left) and time (right) specificity for changed (n = 237) and conserved (n = 389) trajectories across amniotes, respectively (two-sided Wilcoxon rank-sum tests were performed for statistical comparisons). Box plots depict the median (center value); upper and lower quartile (box limits) with whiskers at 1.5 times the interquartile range.
Extended Data Fig. 7
Extended Data Fig. 7. New genes with crucial spermatogenic functions.
a, percentages of genes among new genes of different ages (i.e., emergence in last common vertebrate, tetrapod, mammalian, eutherian, or rodent ancestors) that show infertility phenotypes when knocked out in the mouse (numbers of genes with infertility phenotypes and total number of genes considered for an age category are provided above each bar). b, Expression level trajectories of the retrogenes H2al2a and D1Pas1 in spermatogenic and somatic cell types. For each of the two replicates, mean expression levels across cells of a given cell type were calculated. Marks indicate values for the replicates and dots indicate the median of the values of the replicates. c, Trees of H2al2a and D1Pas1 and paralogs from which they originated through the mechanism of RNA-based gene duplication (i.e., D1Pas1 stems from Ddx3y, whereas the precise ancestral parental gene of H2al2a is not easily discernable due to a complex history of a number of RNA- and DNA-based duplication events).
Extended Data Fig. 8
Extended Data Fig. 8. Sertoli-germ cell communications mediated by ligand-receptor interactions in testis across mammals.
a, Significant ligand-receptor interactions (as assessed by CellPhoneDB; see Methods) across species for Sertoli-spermatogonia, Sertoli-spermatocytes, Sertoli-round spermatids and Sertoli-elongated Spermatids communications (details in Supplementary Table 9). b, Overview of all distinct significant ligand-receptor interactions between Sertoli and the four principal germ cell types across species. The number of detected significant interactions for each species in both panels is indicated to the bottom left of each plot (Set size).
Extended Data Fig. 9
Extended Data Fig. 9. Per chromosome testis-specific genes enriched in spermatogonia and Sertoli cells.
a,c, Per chromosome percentage of expressed testis-specific genes (y-axis) with predominant expression in differentiated spermatogonia (human and macaque) or spermatogonia (mouse, opossum and platypus) (a) and Sertoli cells (c) versus the percentage of all genes in the genome located on a given chromosome (x-axis: “% genes genome-wide”; i.e., the number of genes on a given chromosome divided by the total number of genes in the genome). X chromosomes are colored in red. The diagonal shows the numbers expected if testis-specific genes were randomly distributed across the genome. b, Location of platypus testis-specific genes enriched in spermatogonia on chromosomes X1, X3 and X5.
Extended Data Fig. 10
Extended Data Fig. 10. Classification of X and Y bearing spermatids.
The scatter and bar plots show the percentages of X and Y transcripts, and the distribution of X transcripts (%), respectively, across nuclei or cells in our single-nucleus (left) and publicly available single-cell data in mouse (a) and human (b). The red and green lines depict the fitted curves for the bimodal distributions.
Extended Data Fig. 11
Extended Data Fig. 11. Transcriptome differences between X and Y bearing spermatids.
The heatmaps indicate transcript abundances of significantly enriched genes in X- and Y-bearing spermatids in the different species (a: human, b: mouse, c: opossum, d: platypus). The corresponding bar plots (genes in the same order) show the genes with significantly enriched expression in X- and Y-bearing cells, respectively using two-sided Wilcoxon rank-sum tests with Bonferroni correction (ordered according to increasing P-values) (see also lists in Supplementary Table 11). Genes marked with an asterisk are long non-coding RNAs. Genes in red are located on autosomes, those in black on sex chromosomes. “XLOC” denotes loci annotated in our previous work; notably, only few XLOC loci (opossum: XLOC-076551, XLOC-076500, XLOC-076152, XLOC-076546) were found to have coding potential (as assessed by analyses based on ribosome profiling data generated in our previous work), whereas all others likely represent lncRNAs.
Extended Data Fig. 12
Extended Data Fig. 12. Mammalian sex chromosome evolution.
a, Y-to-autosome transcript ratios in individual germ cells along spermatogenesis. For spermatids, only Y-bearing cells are considered. Whiskers depict the median ± 25th and 75th percentiles. b, Average expression along platypus X chromosomes in spermatogonia and pachytene spermatocytes. Genes are grouped into 1,000,000 bp bins along the chromosomes. c, To identify potential MSCI escapees, we screened for X-linked genes in SDRs with a significant increase in transcript abundance from SG to SC stages subject to MSCI, which ensures that potential escape genes are indeed actively transcribed in SC (i.e., they do not merely represent genes expressed in SG with stable transcripts still detectable in SC) (Methods). Expression differences (loge-fold change, x-axis) of X and autosomal genes between SC and SG were assessed using differential expression analysis (two-sided Wilcoxon rank-sum tests; p-values on the Y-axis are Bonferroni-corrected). Vertical red lines indicate a 0.25-fold expression change (loge-scale); horizontal red lines indicate corrected P < 0.05. Potential escapees (i.e., X-linked genes in SDRs with significantly higher expression in SC than SG) are indicated by red arrows.

Similar articles

Cited by

References

    1. Brawand D, et al. The evolution of gene expression levels in mammalian organs. Nature. 2011;478:343–348. doi: 10.1038/nature10532. - DOI - PubMed
    1. Cardoso-Moreira M, et al. Gene expression across mammalian organ development. Nature. 2019;571:505–509. doi: 10.1038/s41586-019-1338-5. - DOI - PMC - PubMed
    1. Necsulea A, Kaessmann H. Evolutionary dynamics of coding and non-coding transcriptomes. Nat. Rev. Genet. 2014;15:734–748. doi: 10.1038/nrg3802. - DOI - PubMed
    1. Necsulea A, et al. The evolution of lncRNA repertoires and expression patterns in tetrapods. Nature. 2014;505:635–640. doi: 10.1038/nature12943. - DOI - PubMed
    1. Sarropoulos i, Marin R, Cardoso-Moreira M, Kaessmann H. Developmental dynamics of lncRNAs across mammalian organs and species. Nature. 2019;571:510–514. doi: 10.1038/s41586-019-1341-x. - DOI - PMC - PubMed

Publication types