Fig. 4 | Genome Biology

Fig. 4

From: False gene and chromosome losses in genome assemblies caused by GC content variation and repeats

Distribution of previously missing sequences and GC content within or near genes in VGP assemblies. a Average missing ratio and GC content of VGP RefSeq-annotated multi-exon protein-coding genes, separated by the presence or absence of upstream CpG islands (CGIs). Left and right panels show the 3-kbp sequences upstream and downstream of a gene, respectively, in consecutive 100-bp blocks. Middle panels show the gene body, with exon (top) and intron (bottom) positions. b GC profile of previously missing and present regions in various types of genes. Solid lines with shaded bands indicate the mean and S.D. of GC content, calculated from consecutive 100-bp blocks extracted from the 3-kbp regions upstream and downstream of genes. Blocks were classified as missing if their missing ratio was over 90%. "Missing" denotes the percentage of missing blocks among all blocks. Bars indicate the average GC content of exons (F: first exon, I: internal exon, L: last exon, E: exon without consideration of its order).
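
The block-level quantities named in the caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the per-base boolean mask marking bases absent from the previous assembly, and the handling of blocks without A/C/G/T bases are all assumptions made for illustration; only the 100-bp block size and the "over 90%" threshold come from the caption.

```python
from typing import List, Optional, Tuple

BLOCK_SIZE = 100          # consecutive 100-bp blocks, as stated in the caption
MISSING_THRESHOLD = 0.9   # a block is "missing" if its missing ratio is over 90%


def gc_content(seq: str) -> Optional[float]:
    """Fraction of G/C among A/C/G/T bases; None if no such bases (assumption)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) > 0 else None


def classify_blocks(
    seq: str, missing_mask: List[bool]
) -> List[Tuple[Optional[float], bool]]:
    """Return (GC content, is_missing) for each full consecutive block.

    missing_mask is a hypothetical input: True where a base was absent
    from the previous assembly. A trailing partial block is ignored.
    """
    if len(seq) != len(missing_mask):
        raise ValueError("sequence and mask must have the same length")
    results = []
    for start in range(0, len(seq) - BLOCK_SIZE + 1, BLOCK_SIZE):
        end = start + BLOCK_SIZE
        missing_ratio = sum(missing_mask[start:end]) / BLOCK_SIZE
        results.append((gc_content(seq[start:end]), missing_ratio > MISSING_THRESHOLD))
    return results


def missing_percentage(blocks: List[Tuple[Optional[float], bool]]) -> float:
    """Percentage of missing blocks among all blocks."""
    if not blocks:
        return 0.0
    return 100.0 * sum(1 for _, is_missing in blocks if is_missing) / len(blocks)


# Toy 300-bp region: a GC-only block, an AT-only block, and another GC-only block.
toy_seq = "GC" * 50 + "AT" * 50 + "GGCC" * 25
toy_mask = [True] * 95 + [False] * 5 + [True] * 90 + [False] * 10 + [False] * 100
toy_blocks = classify_blocks(toy_seq, toy_mask)
```

In this toy input, only the first block exceeds the 90% threshold; the second block sits exactly at 90% and is therefore counted as present under a strict "over 90%" reading.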