Fig. 1 | Genome Biology

Fig. 1

From: False gene and chromosome losses in genome assemblies caused by GC content variation and repeats

Proportion, GC content, and repeat content of regions that are present in the VGP assemblies but missing from the prior assemblies. a–d Logarithm of chromosome or scaffold size for those greater than 100 kbp in each of the VGP assemblies. Gray and red bars show the proportion of sequence present in or missing from the prior assemblies, respectively. Below each chromosome/scaffold is a heatmap of the GC and repeat content of the missing sequence. up, unplaced scaffolds; u, unlocalized within the chromosome named. * indicates scaffolds with over 30% of their sequence missing in the prior assembly. e Distributions of % GC content and % repeat content in consecutive 10 kbp blocks of missing or present sequence. Large dots indicate the average GC and repeat content, which were significantly higher in the missing regions (red) than in the previously present regions (gray), except for the GC content of climbing perch (p < 0.0001, Wilcoxon rank-sum test). f, g Missing rates in prior assemblies for CpG islands, repeats, and control non-CpG and non-repeated regions.
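
The per-block measure in panel e can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function name `gc_percent_blocks`, the exclusion of ambiguous bases such as N, and the inclusion of a shorter final block are all assumptions made for illustration; only the 10 kbp block size comes from the caption.

```python
def gc_percent_blocks(seq, block_size=10_000):
    """Return % GC for each consecutive, non-overlapping block of seq.

    Hypothetical helper: the caption specifies 10 kbp blocks, but the
    handling of N bases and of a short final block is assumed here.
    """
    seq = seq.upper()
    percents = []
    for start in range(0, len(seq), block_size):
        block = seq[start:start + block_size]
        # Count only unambiguous bases so that N gaps do not dilute the estimate.
        acgt = sum(block.count(base) for base in "ACGT")
        if acgt == 0:
            continue  # block is entirely ambiguous; skip it
        gc = block.count("G") + block.count("C")
        percents.append(100.0 * gc / acgt)
    return percents


# Toy example with a block size of 4 instead of 10,000.
print(gc_percent_blocks("GGCCATATGCNN", block_size=4))
```

Two lists of such values, one for missing blocks and one for present blocks, could then be compared with a rank-sum test, for example `scipy.stats.ranksums`, if SciPy is available.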
