Next-generation sequencing and the era of personal Y genomes
© Ayub et al; licensee BioMed Central Ltd. 2010
Published: 11 October 2010
Next-generation sequencing technologies have produced many 'whole' human genome sequences, and the 1000 Genomes Project is poised to add two and a half thousand more. The 1000 Genomes Pilot 1 Project has sequenced Y chromosomes from 77 males (27 CEU, 10 CHB, 16 JPT, 24 YRI), albeit at low (1x-2x) sequence coverage, and identified a total of ~3000 variable sites, of which 75% are novel. Although many sites are missed at this coverage, preliminary validation by Sanger dideoxy sequencing gave an initial estimate of a low (3.3%) false-positive rate. In addition, we have developed a resource of Y-specific primer pairs that target single-copy regions of the human Y chromosome. These have been used to produce overlapping 5-7 kb fragments by long PCR, which are subsequently pooled and sequenced.
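To illustrate why many sites are missed at 1x-2x coverage, the following minimal sketch assumes that read depth at a position in a single individual follows a Poisson distribution whose mean equals the average coverage. This is a common simplification, not a model stated in the abstract, and the function names are hypothetical.

```python
import math


def poisson_pmf(k, mean_depth):
    """Probability of observing exactly k reads at a position (Poisson assumption)."""
    return math.exp(-mean_depth) * mean_depth ** k / math.factorial(k)


def fraction_below(min_depth, mean_depth):
    """Expected fraction of positions covered by fewer than min_depth reads."""
    return sum(poisson_pmf(k, mean_depth) for k in range(min_depth))


# Coverage values taken from the 1x-2x range quoted in the text.
for coverage in (1.0, 2.0):
    no_reads = fraction_below(1, coverage)
    under_two = fraction_below(2, coverage)
    print(f"{coverage:.0f}x: no reads at {no_reads:.1%} of positions, "
          f"fewer than 2 reads at {under_two:.1%}")
```

Under this assumption, roughly 37% of positions in one individual receive no reads at 1x and roughly 14% at 2x, so a variant carried by only one or a few males can easily go undetected.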
This strategy was used to sequence 2-3 Mb of single-copy Y-specific DNA at high depth in a haplogroup A individual. The primer pairs have also been distributed to the wider community with the aim of re-sequencing comparable targeted regions in a diverse set of Y chromosomes. Using these primers, we have also established a tag-indexing protocol in which an 8-bp tag specific to each male allows samples from several males to be pooled and sequenced in one lane of an Illumina Genome Analyzer II, with sequence reads then assigned back to each sample by their tag. Using these approaches, we aim to generate individual Y-chromosomal sequences with adequate coverage that can be appropriately filtered to give robust Y-SNP calls, both for refining the Y phylogenetic tree and for further investigating lineages of particular interest.
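The assignment of pooled reads back to individual males by their 8-bp tag can be sketched as follows. This is an illustration only: it assumes the tag occupies the first 8 bases of each read and must match exactly, neither of which is specified in the abstract, and the tag sequences, sample names, and function name are hypothetical.

```python
from collections import defaultdict

TAG_LENGTH = 8  # tag length given in the text


def demultiplex(reads, tag_to_sample):
    """Group reads by sample using an exact match on the leading tag.

    Assumption: the tag is the first TAG_LENGTH bases of the read.
    Returns (reads per sample with the tag removed, reads with unknown tags).
    """
    assigned = defaultdict(list)
    unassigned = []
    for read in reads:
        tag, insert = read[:TAG_LENGTH], read[TAG_LENGTH:]
        sample = tag_to_sample.get(tag)
        if sample is None:
            unassigned.append(read)  # tag not recognised, keep for inspection
        else:
            assigned[sample].append(insert)
    return dict(assigned), unassigned


# Hypothetical tags and reads, for illustration only.
tags = {"ACGTACGT": "male_1", "TGCATGCA": "male_2"}
pooled = ["ACGTACGTGGGCCC", "TGCATGCAAATT", "NNNNNNNNTTTT"]
by_sample, unknown = demultiplex(pooled, tags)
```

A real pipeline would usually also tolerate sequencing errors within the tag, which is one reason to choose tags that differ from one another at several positions.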
This article is published under license to BioMed Central Ltd.