
Table 1 Hierarchy of assembly data types

From: Hawkeye: an interactive visual analytics tool for genome assemblies

| Data type | Typical size | Description |
| --- | --- | --- |
| Scaffold | 100 kb to 10 Mb | Layout of potentially nonoverlapping contigs based on mate-pair information, ideally spanning entire chromosomes or replicons |
| Contig | 5 kb to 500 kb | Layout of overlapping reads with a consensus sequence |
| Mate-pair | 2 kb to 100 kb | Pair of end-sequenced reads with a known orientation and separation |
| Read | 0.5 kb to 1.0 kb | Base-calls and quality scores assigned to a chromatogram |
| Chromatogram | 4 × 10,000 time points | Signal data from a sequencing reaction of a physical piece of DNA |

  1. Each type is composed of the next lower-level type. Typical sizes are also listed. bp, base pairs; kb, kilobases; Mb, megabases.
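
The composition described in the table can be sketched as nested record types. This is only an illustration of the hierarchy, not the data model used by Hawkeye: every class, field, and helper name below (`separation_bp`, `count_reads`, `make_read`, and so on) is an assumption made for the example. It also takes the footnote literally, so each contig owns its mate-pairs, whereas real mate-pairs may link reads in different contigs and some reads may be unmated.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chromatogram:
    # One intensity trace per base channel (four channels, as in the table).
    traces: List[List[float]]


@dataclass
class Read:
    bases: str              # base-calls
    qualities: List[int]    # one quality score per base-call
    chromatogram: Chromatogram


@dataclass
class MatePair:
    forward: Read
    reverse: Read
    separation_bp: int      # expected distance between the two reads

    def reads(self) -> List[Read]:
        return [self.forward, self.reverse]


@dataclass
class Contig:
    consensus: str
    mate_pairs: List[MatePair] = field(default_factory=list)


@dataclass
class Scaffold:
    contigs: List[Contig] = field(default_factory=list)


def count_reads(scaffold: Scaffold) -> int:
    """Walk down the hierarchy and count every read in the scaffold."""
    return sum(
        len(pair.reads())
        for contig in scaffold.contigs
        for pair in contig.mate_pairs
    )


def make_read(bases: str) -> Read:
    # Hypothetical helper: builds a read with flat dummy traces and qualities.
    traces = [[0.0] * len(bases) for _ in range(4)]
    return Read(bases, [30] * len(bases), Chromatogram(traces))


pair = MatePair(make_read("ACGT"), make_read("TTGA"), separation_bp=2000)
scaffold = Scaffold([Contig("ACGTNNTTGA", [pair]), Contig("GGCC")])
print(count_reads(scaffold))
```

Each level holds only the level directly beneath it, so a traversal such as `count_reads` has to descend one type at a time, mirroring the order of the rows in the table.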