Remarkably little variation in proteins encoded by the Y chromosome's single-copy genes, implying effective purifying selection.
Y-linked single-nucleotide polymorphisms (SNPs) have served as powerful tools for reconstructing the worldwide genealogy of human Y chromosomes and for illuminating patrilineal relationships among modern human populations. However, there has been no systematic, worldwide survey of sequence variation within the protein-coding genes of the Y chromosome. Here we report and analyze coding sequence variation among the 16 single-copy "X-degenerate" genes of the Y chromosome. We examined variation in these genes in 105 men representing worldwide diversity, resequencing in each man an average of 27 kb of coding DNA, 40 kb of intronic DNA, and, for comparison, 15 kb of DNA in single-copy Y-chromosomal pseudogenes. There is remarkably little variation in X-degenerate protein sequences: two chromosomes drawn at random differ on average by a single amino acid, with half of these differences arising from a single, conservative Asp→Glu mutation that occurred approximately 50,000 years ago. Further analysis showed that nucleotide diversity and the proportion of variant sites are significantly lower for nonsynonymous sites than for synonymous sites, introns, or pseudogenes. These differences imply that natural selection has operated effectively in preserving the amino acid sequences of the Y chromosome's X-degenerate proteins during the last approximately 100,000 years of human history. Thus our findings are at odds with prominent accounts of the human Y chromosome's imminent demise.
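The statistic behind "two chromosomes drawn at random differ on average by ..." is the mean number of pairwise differences, and dividing it by the number of sites compared gives nucleotide diversity. The following is a minimal sketch of that standard calculation, not the authors' pipeline: the function names and the toy sequences are hypothetical and invented for illustration, and it assumes the sequences are already aligned and of equal length.

```python
from itertools import combinations


def mean_pairwise_differences(sequences):
    """Average number of differing positions over all unordered pairs
    of aligned, equal-length sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")

    pairs = list(combinations(sequences, 2))
    total = sum(
        sum(a != b for a, b in zip(first, second))
        for first, second in pairs
    )
    return total / len(pairs)


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site (often written as pi)."""
    return mean_pairwise_differences(sequences) / len(sequences[0])


# Toy data (invented): three aligned 4-site sequences, one variant site.
toy = ["ACGT", "ACGA", "ACGT"]
print(mean_pairwise_differences(toy))
print(nucleotide_diversity(toy))
```

Computing this separately for nonsynonymous sites, synonymous sites, introns, and pseudogenes would give the kind of per-class comparison the abstract describes; classifying sites into those categories is outside the scope of this sketch.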
Rozen, S; Marszalek, JD; Alagappan, RK; Skaletsky, H; Page, DC