Fig. 1 | Genome Biology

Fig. 1

From: VirStrain: a strain identification tool for RNA viruses

A The pairwise Hamming distance distributions of SARS-CoV-2, H1N1 (HA), and HIV (Gag). The Hamming distance here is measured only on the SNV sites chosen by the greedy covering algorithm. These distances can represent the overall similarity within each of the three reference sets. The dashed line marks a distance of 50 in all three panels for easy comparison. B Statistics of the reference databases of the three viruses. “# Genome Number”: the number of downloaded genomes. “# Cluster Number”: the number of final clusters generated by VirStrain. “# SNV Sites Number”: the number of SNV sites chosen by VirStrain for each virus. C The cluster size distributions of SARS-CoV-2, H1N1 (HA), and HIV (Gag). “Cluster size” is the number of strains contained in each cluster.
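To make the distance in panel A concrete, the following is a minimal sketch of a pairwise Hamming distance restricted to a set of selected sites. It is not VirStrain's code: the function names, the toy sequences, and the `snv_sites` list are all hypothetical, and the sketch assumes the sequences are already aligned to equal length and that the sites are given as 0-based column indices.

```python
from itertools import combinations


def hamming_on_sites(seq_a, seq_b, sites):
    """Count mismatches between two aligned sequences at the given 0-based columns."""
    return sum(1 for i in sites if seq_a[i] != seq_b[i])


def pairwise_distances(aligned, sites):
    """Return {(name_a, name_b): distance} for every unordered pair of sequences."""
    return {
        (a, b): hamming_on_sites(aligned[a], aligned[b], sites)
        for a, b in combinations(sorted(aligned), 2)
    }


# Toy aligned sequences (hypothetical, for illustration only).
aligned = {
    "strain1": "ACGTACGT",
    "strain2": "ACGAACGT",
    "strain3": "TCGAACGA",
}

# Hypothetical selected SNV columns; all other columns are ignored.
snv_sites = [0, 3, 7]

distances = pairwise_distances(aligned, snv_sites)
for pair, d in sorted(distances.items()):
    print(pair, d)
```

A histogram of the values in `distances` would correspond to one panel of the distribution described in A, under the assumptions stated above.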
