Tests on 15 breast cancer exome sequencing data sets. (A) Sensitivity results from four different tools (five different modes). The samples are ordered by our estimated α. (B, upper) False negative calls from Strelka (purple), MuTect (light blue) and JointSNVMix (orange). Each false negative resulted from a low allele frequency or low read depth. (B, lower) Many of the false negatives from Virmid (15/23) resulted from extremely low coverage in the normal sample, which left insufficient likelihood for calling the somatic mutation (LK, red dots). The remaining eight false negatives were explained as filtering errors (see Materials and methods for more details of filtering). (C) Total number of calls. (D) Specificity results from the virtual tumor analysis. The number of false positive calls was normalized by the total size of the exome region to give the number of false positives per million base pairs. Note that the y-axis has a log scale. LK, likelihood in calling the somatic mutation; Mbp, million base pairs; NM, number of mismatches; OFF, read offset; PROX, proximity to an indel; TRI, tri-allele.
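
The normalization described for panel D can be illustrated with a minimal sketch. The function name, parameter names, and input values below are hypothetical and are not taken from the study; the sketch only assumes that the false positive count is divided by the exome region size expressed in million base pairs.

```python
def false_positives_per_mbp(num_false_positives: int, region_size_bp: int) -> float:
    """Normalize a false positive count to false positives per million base pairs.

    Both arguments are illustrative: the count of false positive calls and the
    total size of the exome region in base pairs.
    """
    if num_false_positives < 0:
        raise ValueError("num_false_positives must be non-negative")
    if region_size_bp <= 0:
        raise ValueError("region_size_bp must be positive")
    region_size_mbp = region_size_bp / 1_000_000  # base pairs -> Mbp
    return num_false_positives / region_size_mbp


# Made-up example: 12 false positives over a 30 Mbp exome region.
rate = false_positives_per_mbp(12, 30_000_000)
print(rate)  # 0.4 false positives per Mbp for these made-up inputs
```

Because a rate of zero cannot be drawn on a log-scale axis, a sample with no false positives would need separate handling when plotted; how the original figure treats that case is not stated in the caption.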