Fig. 3 | Genome Biology

Fig. 3

From: MuSE: accounting for tumor heterogeneity using a sample-specific error model improves sensitivity and specificity in mutation calling from sequencing data

MuSE performance in two WES data sets and one WGS data set.

a Venn diagram of MuSE and Caller A calls from 91 pairs of ACC WES samples. The calls are overlaid with 550 positions that were selected for deep sequencing validation. The numbers of validated calls are shown in boldface. Among the selected calls, 35 of 139 MuSE unique calls, 268 of 290 calls shared by MuSE and Caller A, and 30 of 121 Caller A unique calls are validated.

b Venn diagram of calls from five different callers using the same ACC data. All calls except those of MuSE are extracted from the TCGA mutation annotation format (MAF) file. The circles label the numbers of calls missed by one caller but captured by the other four callers. The blue dotted circle denotes the number of calls missed by MuSE, and the red solid circle indicates the number of calls missed by Caller A. TPR, PPV, F1, and F2 scores are calculated and listed below the Venn diagram. The truth set is defined as calls shared by at least three callers.

c Mutation plot and summary table of MuSE and Caller A calls from 48 pairs of multi-region lung adenocarcinoma WES samples. Each gray column represents a sample. MuSE and Caller A share 33,035 calls and have 3750 and 7886 unique calls, respectively. Only calls from Caller A were further validated. MuSE confirms 16,907 of the validated Caller A calls and misses 248. Calls from chromosome 18 are shown in the mutation plot to illustrate how the artificial truth set and false positives are defined. The vertical gray lines separate the 11 patients, each of whom has samples from three to five regions of one tumor. The numbered shapes, combined with different call types, are examples of the two definitions. The artificial truth set consists of positions that fall into any of three categories: shared or validated (oval 1), called in all regions when Caller A unique calls are included (oval 2), or called in all regions when MuSE unique calls are included (oval 3). False positives are unique and single calls (star 4 and star 5). Correspondingly, the TPR∗, PPV∗, \(\mathrm{F}_{1}^{*}\), and \(\mathrm{F}_{2}^{*}\) scores are calculated and listed beside the mutation plot.

d Venn diagram of calls from five different callers using 56 pairs of ICGC Pilot-63 WGS samples on chromosome 1. The circles label the numbers of calls missed by one caller but captured by the other four callers. The blue dotted circle denotes the number of calls missed by MuSE, and the red solid circle indicates the number of calls missed by Caller A. TPR, PPV, F1, and F2 scores are calculated and listed below the Venn diagram.
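
The consensus-based scoring described for panels b and d can be sketched as follows. This is only an illustration, not the authors' code: it assumes the standard F-beta definition, \(F_\beta = (1+\beta^2)\,\mathrm{PPV}\cdot\mathrm{TPR} / (\beta^2\,\mathrm{PPV} + \mathrm{TPR})\), which the caption does not spell out. The function names, the caller labels other than MuSE and Caller A, and the toy positions are all hypothetical.

```python
from collections import Counter


def consensus_truth(calls_by_caller, min_callers=3):
    """Positions reported by at least `min_callers` callers (the truth set)."""
    counts = Counter(
        pos for calls in calls_by_caller.values() for pos in set(calls)
    )
    return {pos for pos, n in counts.items() if n >= min_callers}


def f_beta(tpr, ppv, beta):
    """Standard F-beta score (assumed definition); beta=2 weights TPR more."""
    if tpr == 0 and ppv == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * ppv * tpr / (b2 * ppv + tpr)


def score_caller(calls, truth):
    """TPR, PPV, F1, and F2 of one caller against a truth set."""
    calls = set(calls)
    tp = len(calls & truth)  # true positives: calls that are in the truth set
    tpr = tp / len(truth) if truth else 0.0
    ppv = tp / len(calls) if calls else 0.0
    return {
        "TPR": tpr,
        "PPV": ppv,
        "F1": f_beta(tpr, ppv, 1),
        "F2": f_beta(tpr, ppv, 2),
    }


# Hypothetical toy data: integers stand in for genomic positions.
toy = {
    "MuSE": {1, 2, 3, 4},
    "Caller A": {1, 2, 5},
    "Caller B": {1, 2, 3},
    "Caller C": {1, 3, 6},
    "Caller D": {2, 3, 7},
}
truth = consensus_truth(toy)
muse_scores = score_caller(toy["MuSE"], truth)
```

In this toy example the truth set is positions 1, 2, and 3, so the hypothetical MuSE call set has a TPR of 1.0 and a PPV of 0.75. Because F2 weights sensitivity more heavily than precision, it comes out higher than F1 for this call set.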
