Fig. 3 | Genome Biology

Fig. 3

From: An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar

False positive intrahost variants caused by sequencing errors can be removed by technical replicates and frequency cutoffs. We sequenced a mixed Zika virus population containing 10% of virus #2 in triplicate, limited our analysis to regions covered only by perfect PCR primer matches, and removed sites with an intrahost single-nucleotide variant (iSNV) detected at > 1% frequency in either of the Zika virus isolates. This left us with 61 true positive (10% frequency) iSNV sites and 3940 sites not expected to be variable, which we used to investigate false positives (> 0.1% frequency). a The positions of false positives along the sequencing read were mapped and are shown as the percentage of sites with false positive iSNV calls within 25 nt bins. Each color represents data from an independent replicate. Inset: read positions > 150 nt had a significantly higher false positive rate than positions < 150 nt (*, Wilcoxon test, p < 0.05). b The iSNV frequency of each false positive was also plotted by position on the sequencing read. Each color represents data from an independent replicate. Inset: false positive iSNV frequencies were significantly higher at read positions > 150 nt than at positions < 150 nt (*, Mann-Whitney test, p < 0.05). c True and false iSNVs were plotted by frequency for each individual replicate (A, B, and C) and for replicates combined as technical duplicates and triplicates, showing the mean frequencies of only those iSNVs found in all replicates. Data are shown as means and standard deviations. The line indicates the proposed 3% cutoff, chosen because it removes false positives from the replicate data while remaining within the range of high accuracy (Fig. 1b).
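
The replicate-and-cutoff filtering described for panel c can be illustrated with a minimal Python sketch. This is not the authors' iVar implementation: the function name `consensus_isnvs`, the dictionary layout keyed by (position, alternate base), the example frequencies, and the choice to apply the 3% cutoff to the mean frequency rather than to each replicate are all assumptions made for illustration.

```python
from statistics import mean


def consensus_isnvs(replicates, min_freq=0.03):
    """Keep iSNVs called in every replicate whose mean frequency meets the cutoff.

    replicates: list of dicts mapping (position, alt_base) -> frequency (0..1).
    min_freq:   cutoff applied to the mean frequency (an assumption; 3% mirrors
                the cutoff proposed in the caption).
    """
    if not replicates:
        return {}

    # An iSNV must be present in all replicates to be retained.
    shared = set(replicates[0])
    for rep in replicates[1:]:
        shared &= set(rep)

    calls = {}
    for key in sorted(shared):
        avg = mean(rep[key] for rep in replicates)
        if avg >= min_freq:
            calls[key] = avg
    return calls


# Hypothetical frequencies, invented for this example only.
rep_a = {(1000, "T"): 0.101, (2500, "G"): 0.004, (3100, "A"): 0.012}
rep_b = {(1000, "T"): 0.098, (3100, "A"): 0.009}
rep_c = {(1000, "T"): 0.104, (2500, "G"): 0.006, (3100, "A"): 0.011}

calls = consensus_isnvs([rep_a, rep_b, rep_c])
```

In this toy input, the variant at position 2500 is dropped because it is missing from one replicate, and the variant at position 3100 is dropped because its mean frequency falls below the cutoff, leaving only the variant near 10% frequency.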