Fig. 2 | Genome Biology

Fig. 2

From: Pandora: nucleotide-resolution bacterial pan-genomics with reference graphs

The pandora workflow. A Reference panel of genomes; color signifies the locus (gene or intergenic region) identifier, and blobs are SNPs. B The multiple sequence alignment (MSA) for each locus is converted into a directed acyclic graph (termed a local graph). C Local graphs constructed from the loci in the reference panel. D Workflow: the collection of local graphs, termed the PanRG, is indexed. Reads from each sample under study are independently quasi-mapped to the graph, and the loci present in each sample are determined. In this process, for each locus, a mosaic approximation of that sample's sequence is inferred, and variants are genotyped. E Regions of low coverage are detected, and local de novo assembly is used to generate candidate novel alleles missing from the graph. Returning to D, the dotted line shows that the candidate alleles from all samples are then gathered and added to the PanRG. Reads are then quasi-mapped once more, this time to the augmented PanRG, generating new mosaic approximations for all samples and storing coverage across the graphs; no de novo assembly is done in this second pass. A pan-genome matrix showing which input loci are present in each sample is created. Finally, all samples are compared, and a VCF file is produced, with a per-locus reference that is inferred by pandora.
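To make panel B concrete, the following is a minimal Python sketch of how an MSA for one locus might be turned into a small directed acyclic graph. It is a toy illustration, not pandora's actual construction algorithm: it assumes an ungapped alignment, collapses runs of invariant columns into single nodes, and turns each run of variable columns into one node per observed allele. The function name `msa_to_local_graph` and the node/edge representation are hypothetical choices made for this example.

```python
from itertools import groupby


def msa_to_local_graph(msa):
    """Toy MSA-to-DAG conversion (illustrative only, not pandora's method).

    msa: list of equal-length, ungapped aligned sequences for one locus.
    Returns (nodes, edges), where nodes[i] is the sequence carried by node i
    and edges is a sorted list of (source_id, target_id) pairs.
    """
    if len({len(row) for row in msa}) != 1:
        raise ValueError("all aligned rows must have the same length")

    # A column is invariant if every sequence has the same base there.
    columns = list(zip(*msa))
    invariant = [len(set(col)) == 1 for col in columns]

    # Group consecutive columns of the same kind into blocks, and record
    # the distinct alleles seen in each block (sorted for determinism).
    blocks = []
    for _, run in groupby(range(len(columns)), key=lambda i: invariant[i]):
        idx = list(run)
        lo, hi = idx[0], idx[-1] + 1
        blocks.append(sorted({row[lo:hi] for row in msa}))

    # One node per allele; connect every node of a block to every node of
    # the next block. Blocks alternate invariant/variable, so edges always
    # point forward along the alignment and the result is acyclic.
    nodes, edges, prev_ids = [], set(), []
    for alleles in blocks:
        ids = []
        for allele in alleles:
            nodes.append(allele)
            ids.append(len(nodes) - 1)
        edges.update((p, c) for p in prev_ids for c in ids)
        prev_ids = ids
    return nodes, sorted(edges)


if __name__ == "__main__":
    nodes, edges = msa_to_local_graph(["ACGTA", "ACCTA", "ACGTA"])
    print(nodes)
    print(edges)
```

In this sketch a single SNP yields a "bubble": one shared node, two alternative single-base nodes, and one shared node after it. Any path through the graph spells out a possible allele for the locus, which is the sense in which a sample's sequence can be approximated as a mosaic of the reference panel.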
