Table 2. Summary of M. tuberculosis data sets used for joint genotyping. “Genome inside sites” is the total length of all reference alleles across all sites after clustering, reported in base pairs and, in parentheses, as a percentage of the 4.4 Mbp H37Rv reference genome. “SNP sites” is the number of sites where all alleles have length 1.

From: Minos: variant adjudication and joint genotyping of cohorts of bacterial genomes

| Data set | Number of samples | Unique variants | Excluded variants | Sites after clustering | Genome inside sites, bp (%) | Total alleles | SNP sites |
|---|---|---|---|---|---|---|---|
| Walker 2013 | 385 | 31,548 | 231 | 30,621 | 41,437 (1%) | 62,690 | 27,639 |
| Mykrobe | 13,411 | 699,484 | 6,259 | 593,584 | 756,003 (17%) | 1,414,723 | 552,543 |
| CRyPTIC | 15,215 | 718,863 | 6,576 | 611,269 | 778,949 (18%) | 1,469,100 | 568,224 |
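
The two derived columns defined in the caption can be illustrated with a minimal sketch. This is not the Minos implementation. The `(ref, alts)` tuple representation of a site and the function names are hypothetical, and the exact H37Rv length used here (4,411,532 bp, the length of the NC_000962.3 sequence) is an assumption, since the caption only gives the rounded figure of 4.4 Mbp.

```python
# Illustrative only: a site is modelled as (reference_allele, [alternate_alleles]).
# This representation is hypothetical and is not taken from Minos.

H37RV_LENGTH_BP = 4_411_532  # assumed exact length of the H37Rv reference (NC_000962.3)


def genome_inside_sites(sites):
    """Total length of the reference alleles across all sites, in bp."""
    return sum(len(ref) for ref, _alts in sites)


def count_snp_sites(sites):
    """Number of sites where every allele (reference and alternates) has length 1."""
    return sum(
        1
        for ref, alts in sites
        if len(ref) == 1 and all(len(alt) == 1 for alt in alts)
    )


def percent_of_genome(bp, genome_length=H37RV_LENGTH_BP):
    """Express a base-pair count as a whole-number percentage of the genome."""
    return round(100 * bp / genome_length)


# Toy input: one biallelic SNP, one deletion, one multiallelic SNP.
example_sites = [
    ("A", ["C"]),
    ("AT", ["A"]),
    ("G", ["T", "C"]),
]

inside_bp = genome_inside_sites(example_sites)
snp_count = count_snp_sites(example_sites)
```

Under the same assumed genome length, `percent_of_genome` applied to the base-pair totals in the table (41,437, 756,003 and 778,949) gives 1, 17 and 18, which matches the percentages shown in parentheses.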