Next-generation sequencing and the era of personal Y genomes

Next-generation sequencing technologies have produced many 'whole' human genome sequences, and the 1000 Genomes Project is poised to add a further two and a half thousand genomes. The 1000 Genomes Pilot 1 Project has sequenced Y chromosomes from 77 males (27 CEU, 10 CHB, 16 JPT, 24 YRI), albeit at low (1x-2x) sequence coverage, and identified a total of ~3000 variable sites, of which 75% are novel. Although many variable sites are missed, preliminary validation by Sanger dideoxy sequencing gave an initial estimate of a low (3.3%) false positive rate. In addition, we have developed a resource of Y-specific primer pairs that target single-copy regions of the human Y chromosome. These have been used to amplify overlapping 5-7 kb fragments by long PCR, which are subsequently pooled and sequenced.

This strategy was used to sequence, at high depth, 2-3 Mb of single-copy Y-specific DNA in a haplogroup A individual. The primer pairs have also been distributed to the wider community with the aim of re-sequencing comparable targeted regions in a diverse group of Y chromosomes. Using these primers, we have also established a tag indexing protocol that uses an 8-bp tag specific to each male, so that pooled samples from several males can be sequenced in one lane of an Illumina Genome Analyzer II, and sequence reads can be assigned back to each male sample using the tag. Using these approaches, we aim to generate individual Y-chromosomal sequences with adequate coverage that can be appropriately filtered to generate robust Y-SNP calls for refinement of the Y phylogenetic tree and further investigation of lineages of particular interest.
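
The tag-based assignment of pooled reads described above can be illustrated with a minimal sketch. It assumes that the 8-bp tag sits at the start of each read and that only exact tag matches are accepted; the abstract states neither. The tag sequences, sample names, and the function name `demultiplex` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

TAG_LENGTH = 8  # tag length given in the text


def demultiplex(reads, tag_to_sample, tag_length=TAG_LENGTH):
    """Assign each read to a sample by its leading tag and strip the tag.

    Assumption: the tag occupies the first `tag_length` bases of the read
    and must match a known tag exactly.
    """
    by_sample = defaultdict(list)
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        sample = tag_to_sample.get(tag)
        if sample is None:
            # Unknown or error-containing tag: keep the full read aside.
            unassigned.append(read)
        else:
            by_sample[sample].append(insert)
    return dict(by_sample), unassigned


# Hypothetical tags and sample names, for illustration only.
tags = {"ACGTACGT": "male_1", "TGCATGCA": "male_2"}
reads = [
    "ACGTACGT" + "GGATCCTTAG",
    "TGCATGCA" + "CCTAGGAATC",
    "ACGTACGT" + "TTTTGGGGCC",
    "NNNNNNNN" + "ACACACACAC",
]
assigned, unassigned = demultiplex(reads, tags)
```

Exact matching is the simplest choice; a real pipeline would likely also have to decide how to treat tags that contain sequencing errors, which this sketch simply sets aside as unassigned.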

Author information

Correspondence to Qasim Ayub.

About this article

Cite this article

Ayub, Q., Jostins, L., Xue, Y. et al. Next-generation sequencing and the era of personal Y genomes. Genome Biol 11, O2 (2010) doi:10.1186/gb-2010-11-s1-o2

Keywords

  • False Positive Rate
  • Human Genome Sequence
  • Male Sample
  • Adequate Coverage
  • Illumina Genome