Table 1 Long-read validation rates for each tool relative to randomly permuted data

From: LUMPY: a probabilistic framework for structural variant discovery

Method                      Total calls   Observed validations (fraction)   Expected validations ± 95% CI (fraction)
50X coverage
  LUMPY (pe + sr)           4,347         2,653 (0.61)                      37.9 ± 1.2 (0.009)
  LUMPY (pe + sr + prior)   4,809         2,706 (0.563)                     41.1 ± 1.3 (0.009)
  LUMPY trio (pe + sr)      5,108         2,660 (0.521)                     31.5 ± 1.1 (0.006)
  LUMPY (pe + sr&rd)        1,355         1,114 (0.822)                     5.4 ± 0.5 (0.001)
  GASVPro                   3,929         2,249 (0.572)                     61.1 ± 1.5 (0.016)
  DELLY                     12,272        3,127 (0.255)                     219.2 ± 2.9 (0.018)
  Pindel                    7,219         2,208 (0.306)                     0.7 ± 0.2 (~0)
5X coverage
  LUMPY (pe + sr)           643           619 (0.963)                       4.9 ± 0.4 (0.008)
  LUMPY (pe + sr + prior)   840           785 (0.935)                       4.3 ± 0.4 (0.005)
  LUMPY trio (pe + sr)      1,006         958 (0.952)                       4.1 ± 0.4 (0.004)
  LUMPY (pe + sr&rd)        73            66 (0.904)                        0.01 ± 0.02 (~0)
  GASVPro                   356           338 (0.949)                       10.2 ± 0.6 (0.029)
  DELLY                     798           698 (0.875)                       4.5 ± 0.4 (0.006)
  Pindel                    640           521 (0.814)                       0.04 ± 0.04 (~0)
  1. Monte Carlo simulations were performed to assess the rate at which false-positive SV calls are validated purely by chance using split-read mapping analysis of PacBio and Moleculo data. For each NA12878 deletion callset shown in Figures 5 and 6, deletion coordinates were shuffled 100 times (retaining the breakpoint interval sizes and total span of each deletion call), and validation experiments were conducted precisely as for the real data. For each callset, we show the total number of deletion calls, the number of validated calls (with the fraction validated in parentheses), and the number of validations expected by chance with its 95% confidence interval (with the expected fraction in parentheses), based on the Monte Carlo simulations. pe, paired-end; rd, read-depth; sr, split-read.
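
The permutation procedure described in the footnote can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The names `shuffle_deletions`, `expected_validations`, `chrom_length`, and `is_validated` are hypothetical. Deletions are simplified to `(start, end)` pairs on a single chromosome, so breakpoint interval sizes are not modeled. The 95% confidence interval uses a normal approximation for the mean over shuffles, which is an assumption, since the source does not state how the interval was computed.

```python
import random
import statistics


def shuffle_deletions(calls, chrom_length, rng):
    """Move each (start, end) deletion to a random position, keeping its span."""
    shuffled = []
    for start, end in calls:
        span = end - start
        # Assumes span <= chrom_length so the call still fits on the chromosome.
        new_start = rng.randrange(0, chrom_length - span + 1)
        shuffled.append((new_start, new_start + span))
    return shuffled


def expected_validations(calls, chrom_length, is_validated, n_shuffles=100, seed=0):
    """Estimate how many calls validate by chance.

    is_validated is a hypothetical callback standing in for the long-read
    validation experiment; it takes a (start, end) call and returns a bool.
    Returns (mean count, 95% CI half-width, mean fraction of calls).
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_shuffles):
        shuffled = shuffle_deletions(calls, chrom_length, rng)
        counts.append(sum(1 for call in shuffled if is_validated(call)))
    mean = statistics.mean(counts)
    # Normal-approximation interval for the mean (an assumption of this sketch).
    half_width = 1.96 * statistics.stdev(counts) / len(counts) ** 0.5
    return mean, half_width, mean / len(calls)
```

Under these assumptions, the returned triple corresponds to one cell of the "Expected validations" column: the mean count, the ± value, and the fraction in parentheses.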