Sniper: improved SNP discovery by multiply mapping deep sequenced reads

Table 1 Extent of spurious alignment due to unique or best-guess mapping strategies

Condition	Read length (nt)	SNP reads	Total aligned	Discarded	Misaligned	Misaligned/Aligned
Best, no error	30	7,413,324	99.96%	0.04%	12.11%	12.11%
	60	14,826,840	99.81%	0.19%	3.02%	3.02%
	90	22,240,926	99.55%	0.45%	1.15%	1.15%
	120	29,654,088	99.21%	0.79%	0.66%	0.67%
Uni, no error	30	7,413,324	74.94%	25.06%	0.00%	0.00%
	60	14,826,840	89.20%	10.80%	0.00%	0.00%
	90	22,240,836	93.74%	6.26%	0.00%	0.00%
	120	29,654,088	94.98%	5.02%	0.00%	0.00%
Best, 1% error	30	4,388,175	96.40%	3.60%	12.53%	13.00%
	60	2,771,304	86.33%	13.67%	2.99%	3.46%
	90	2,252,588	74.20%	20.62%	1.02%	1.38%
	120	2,132,976	62.14%	37.86%	0.51%	0.82%
Uni, 1% error	30	4,388,175	72.98%	27.02%	0.07%	0.09%
	60	2,771,304	78.01%	21.99%	0.06%	0.08%
	90	2,252,588	70.34%	29.66%	0.03%	0.05%
	120	2,132,976	59.82%	40.18%	0.02%	0.03%

Reads overlapping synthetic SNPs on human chromosome 1 were mapped, allowing up to k = 2 mismatches per read, under four mapping conditions (best-guess without base-call error; unique without base-call error; best-guess with 1% base-call error; unique with 1% base-call error) and four read lengths (30, 60, 90, and 120 nucleotides). In the table, "Best" denotes best-guess mapping and "Uni" denotes unique mapping. For each experiment, the table shows the total number of SNP reads, the percentages of those reads that were aligned, discarded, and misaligned, and the ratio of misaligned to aligned reads. A read is considered misaligned if its alignment does not overlap the SNP locus used to generate the read. See Figure 1b for a description of the mapping strategies.
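The misalignment criterion and the last column of the table can be illustrated with a minimal Python sketch. The function names, the 0-based half-open coordinate convention, and the single-chromosome simplification are assumptions made for illustration; they are not taken from the Sniper implementation. The ratio is computed here from the rounded percentages printed in the table, so the result can differ from the published value in the last digit for some rows.

```python
def overlaps_snp(align_start: int, read_length: int, snp_pos: int) -> bool:
    """Return True if the aligned read covers the SNP position.

    Assumption: 0-based coordinates on a single chromosome, with the read
    occupying the half-open interval [align_start, align_start + read_length).
    """
    return align_start <= snp_pos < align_start + read_length


def is_misaligned(align_start: int, read_length: int, snp_pos: int) -> bool:
    """A read is misaligned if its alignment does not overlap the SNP locus
    that was used to generate it (the criterion stated in the caption)."""
    return not overlaps_snp(align_start, read_length, snp_pos)


def misaligned_to_aligned(misaligned_pct: float, aligned_pct: float) -> float:
    """Ratio of misaligned to aligned reads, as a percentage rounded to two
    decimals. Both inputs are percentages of the total SNP reads."""
    if aligned_pct <= 0:
        raise ValueError("aligned_pct must be positive")
    return round(100.0 * misaligned_pct / aligned_pct, 2)


if __name__ == "__main__":
    # Hypothetical read: generated around a SNP at position 1000,
    # but placed by the mapper at position 5000.
    print(is_misaligned(align_start=5000, read_length=30, snp_pos=1000))

    # Best-guess mapping, 1% error, 30 nt reads:
    # 12.53% misaligned, 96.40% aligned.
    print(misaligned_to_aligned(12.53, 96.40))
```

Because the ratio divides by the aligned fraction rather than by all SNP reads, it rises above the raw misaligned percentage whenever reads are discarded, which is why the last two columns diverge in the rows with base-call error.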

ISSN: 1474-760X