Table 2 Summary of M. tuberculosis data sets used for joint genotyping. “Genome inside sites” is the total length of all reference alleles across all sites after clustering, reported as the total number of base pairs and, in parentheses, as a percentage of the 4.4 Mbp H37Rv reference genome. “SNP sites” is the number of sites where all alleles have length 1.

From: Minos: variant adjudication and joint genotyping of cohorts of bacterial genomes

| Data set | Number of samples | Unique variants | Excluded variants | Sites after clustering | Genome inside sites (bp (%)) | Total alleles | SNP sites |
|---|---|---|---|---|---|---|---|
| Walker 2013 | 385 | 31,548 | 231 | 30,621 | 41,437 (1%) | 62,690 | 27,639 |
| Mykrobe | 13,411 | 699,484 | 6,259 | 593,584 | 756,003 (17%) | 1,414,723 | 552,543 |
| CRyPTIC | 15,215 | 718,863 | 6,576 | 611,269 | 778,949 (18%) | 1,469,100 | 568,224 |
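
The percentages in the "Genome inside sites" column can be reproduced from the base-pair counts and the reference length quoted in the caption. The following is a minimal sketch, not code from the paper: it assumes the rounded 4.4 Mbp figure from the caption (the exact H37Rv length is slightly larger) and assumes rounding to the nearest whole percent. The names `REFERENCE_LENGTH_BP`, `genome_inside_sites_bp`, and `percent_of_reference` are hypothetical and chosen for illustration only.

```python
# Assumption: rounded H37Rv reference length as quoted in the table caption (4.4 Mbp).
REFERENCE_LENGTH_BP = 4_400_000

# "Genome inside sites" in base pairs, copied from Table 2.
genome_inside_sites_bp = {
    "Walker 2013": 41_437,
    "Mykrobe": 756_003,
    "CRyPTIC": 778_949,
}


def percent_of_reference(bp, reference_length=REFERENCE_LENGTH_BP):
    """Return bp as a percentage of the reference genome length."""
    return 100.0 * bp / reference_length


# Assumption: the table rounds to the nearest whole percent.
percentages = {
    name: round(percent_of_reference(bp))
    for name, bp in genome_inside_sites_bp.items()
}

for name, pct in percentages.items():
    print(f"{name}: {genome_inside_sites_bp[name]:,} bp ({pct}%)")
```

Under these assumptions the computed values agree with the 1%, 17%, and 18% reported in the table.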