Skip to main content
Figure 1 | Genome Biology

Figure 1

From: Current challenges in de novo plant genome sequencing and assembly

Figure 1

Schematic overview of genome assembly. (a) DNA is collected from the biological sample and sequenced. (b) The output from the sequencer consists of many billions of short, unordered DNA fragments from random positions in the genome. (c) The short fragments are compared with each other to discover how they overlap. (d) The overlap relationships are captured in a large assembly graph shown as nodes representing kmers or reads, with edges drawn between overlapping kmers or reads. (e) The assembly graph is refined to correct errors and simplify into the initial set of contigs, shown as large ovals connected by edges. (f) Finally, mates, markers and other long-range information are used to order and orient the initial contigs into large scaffolds, as shown as thin black lines connecting the initial contigs.

Back to article page