
Table 1 Hierarchy of assembly data types

From: Hawkeye: an interactive visual analytics tool for genome assemblies

| Data type | Typical size | Description |
| --- | --- | --- |
| Scaffold | 100 kb to 10 Mb | Layout of potentially nonoverlapping contigs based on mate-pair information, ideally spanning entire chromosomes or replicons |
| Contig | 5 kb to 500 kb | Layout of overlapping reads with a consensus sequence |
| Mate-pair | 2 kb to 100 kb | Pair of end-sequenced reads with a known orientation and separation |
| Read | 0.5 kb to 1.0 kb | Base-calls and quality scores assigned to a chromatogram |
| Chromatogram | 4 × 10,000 time points | Signal data from a sequencing reaction of a physical piece of DNA |
Each type is composed of the next lower-level type. Typical sizes are also listed. bp, base pairs; kb, kilobases; Mb, megabases.
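
The containment relationship in the table can be sketched as nested record types. The following Python sketch is only an illustration and is not taken from Hawkeye. All class names, field names, and the helper functions make_read and reads_in_scaffold are hypothetical. It also follows the footnote literally (a contig holds mate-pairs, a mate-pair holds reads), which is a simplification: real assemblers may place the two reads of a mate-pair in different contigs.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Chromatogram:
    # Signal data: four channels (one per base), each a series of time points.
    traces: List[List[float]]


@dataclass
class Read:
    bases: str            # base-calls assigned to the chromatogram
    qualities: List[int]  # one quality score per base-call
    chromatogram: Chromatogram


@dataclass
class MatePair:
    reads: Tuple[Read, Read]  # the two end-sequenced reads
    separation: int           # expected distance between the reads, in bp


@dataclass
class Contig:
    consensus: str
    mate_pairs: List[MatePair] = field(default_factory=list)


@dataclass
class Scaffold:
    contigs: List[Contig] = field(default_factory=list)


def make_read(bases: str) -> Read:
    """Build a toy read with flat signal and constant quality (hypothetical helper)."""
    chrom = Chromatogram(traces=[[0.0] * len(bases) for _ in range(4)])
    return Read(bases=bases, qualities=[30] * len(bases), chromatogram=chrom)


def reads_in_scaffold(scaffold: Scaffold) -> List[Read]:
    """Walk the hierarchy from the top level down and collect every read."""
    return [
        read
        for contig in scaffold.contigs
        for pair in contig.mate_pairs
        for read in pair.reads
    ]
```

Walking from a scaffold down to its reads then amounts to one nested traversal, which is the kind of drill-down the hierarchy is meant to support.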