Sniper: improved SNP discovery by multiply mapping deep sequenced reads

Table 2 Data sets used in this study

Region name	Type^a	Loci^b	Difficulty^c	% deg.^d	APR^e	% reads ≥ 3 loci	% reads ≥ 5 loci	% reads ≥ 10 loci
Human 261 k	Exp.	261,475	Low	90.4	1.91	0.21	0.05	0.02
Yeast RPL	Sim.	94,678	Low	85.1	1.88	2.60	0.00	0.00
Yeast 2 × RPL +0%	Sim.	189,356	Extreme	97.1	3.01	96.90	3.20	0.00
Yeast 2 × RPL +2%	Sim.	189,356	High	98.6	2.18	29.90	0.52	0.00
Yeast 2 × RPL +5%	Sim.	189,356	Medium/high	97.9	2.05	6.83	0.06	0.00
Yeast 2 × RPL +10%	Sim.	189,356	Medium	86.8	2.01	1.86	0.57	0.00

Genomic templates used in this study: Human 261 k [9]; synthetic concatenation of yeast ribosomal proteins (RPL); perfect tandem duplication of RPL (2 × RPL +0%); tandem duplications of RPL with 2%, 5%, or 10% sequence divergence added to the second copy (2 × RPL +2%, 2 × RPL +5%, and 2 × RPL +10%, respectively). See main text for more details.

^aWhether the sequenced reads were obtained experimentally (Exp.) or by simulation (Sim.).
^bThe number of positions genotyped.
^cDetermined in the following manner: 100,000 36-nucleotide paired-end (PE) reads were sampled randomly from the template region with perfect fidelity. Sampled reads were aligned to the template with k = 3 mismatches.
^dPercentage of sampled reads with more than one alignment (percentage degeneracy).
^eAverage number of alignments per sampled read, taken across all reads (alignments per read).

The last three columns report the percentage of reads aligning to at least three, five, or ten different loci in the template.
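
The degeneracy metrics defined in the footnotes can be illustrated with a minimal Python sketch. It is not the procedure used to produce Table 2: it assumes single-end reads, exact matching on the forward strand only (instead of paired-end alignment with k = 3 mismatches), and a toy template, so it will not reproduce the values in the table. The function names `kmer_counts` and `degeneracy_stats` are hypothetical.

```python
import random
from collections import Counter


def kmer_counts(template, read_len):
    """Count how often each read_len-mer occurs in the template (forward strand, exact match)."""
    return Counter(
        template[i:i + read_len] for i in range(len(template) - read_len + 1)
    )


def degeneracy_stats(template, read_len=36, n_reads=100_000,
                     thresholds=(3, 5, 10), seed=0):
    """Sample error-free reads and summarise how many template loci each one matches.

    Assumption: an exact-match occurrence count stands in for the number
    of alignments a real read mapper would report.
    """
    rng = random.Random(seed)
    counts = kmer_counts(template, read_len)
    last_start = len(template) - read_len

    # Number of matching loci for each sampled read.
    hits = []
    for _ in range(n_reads):
        start = rng.randint(0, last_start)
        hits.append(counts[template[start:start + read_len]])

    n = len(hits)
    stats = {
        # % deg.: reads with more than one alignment.
        "pct_degenerate": 100.0 * sum(h > 1 for h in hits) / n,
        # APR: mean number of alignments per read.
        "alignments_per_read": sum(hits) / n,
    }
    for t in thresholds:
        # Percentage of reads matching at least t loci.
        stats[f"pct_ge_{t}"] = 100.0 * sum(h >= t for h in hits) / n
    return stats


if __name__ == "__main__":
    rng = random.Random(1)
    unit = "".join(rng.choice("ACGT") for _ in range(400))  # toy template
    print(degeneracy_stats(unit, n_reads=2000))
    print(degeneracy_stats(unit + unit, n_reads=2000))  # perfect tandem duplication
```

In this toy setting, duplicating the template makes most sampled reads match two loci, with the exception of reads that span the junction between the two copies.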

ISSN: 1474-760X