Cumulative distributions of assembled sequence as a function of scaffold and contig length. Each curve shows the total amount of assembled sequence in scaffolds or contigs longer than a given minimum length. The W7984 WGS scaffolds become progressively longer as libraries with larger paired-end insert sizes are included: short inserts (<500 bp) only (red); with the addition of medium inserts (700 bp to 1 kbp; dark blue); and finally with the inclusion of approximately 4 kbp insert mate pairs (green). For comparison, the International Wheat Genome Sequencing Consortium chromosome-sorted assembly of ‘Chinese Spring’ (CSS) is also shown (black dashed line). Cumulative contig distributions for W7984 (light blue) and CSS (gray dashed line) are also depicted. As predicted by assembly theory, these quantities are exponentially distributed, with decay lengths proportional to the N50 length scale of the assembly. The comparison shows that the excess length of the CSS assembly is confined to an abundance of very short sequences (less than 1 kbp in length) that fall outside the body of the main exponential decay curves.
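The plotted quantity can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `cumulative_length` and `n50` and the toy lengths are hypothetical, and "longer than a minimum length" is assumed here to mean "at least that length".

```python
from typing import Iterable, List


def cumulative_length(lengths: Iterable[int], min_lengths: Iterable[int]) -> List[int]:
    """For each threshold, sum the bases in sequences at least that long.

    Assumption: the threshold is inclusive (length >= minimum).
    """
    seqs = list(lengths)
    return [sum(n for n in seqs if n >= m) for m in min_lengths]


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L hold at least half the total."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for n in ordered:
        running += n
        if 2 * running >= total:  # integer test for running >= total / 2
            return n
    raise ValueError("no sequence lengths given")


# Toy scaffold lengths in bp (illustrative values only, not from the paper).
toy_lengths = [10, 20, 30, 40]
print(cumulative_length(toy_lengths, [0, 25, 41]))  # [100, 70, 0]
print(n50(toy_lengths))                             # 30
```

If the cumulative curve were a pure exponential, T·exp(−L/λ), the N50 would be the length at which the curve falls to half the total, that is λ·ln 2. This is consistent with the decay length being proportional to N50.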