Fig. 7 | Genome Biology

Fig. 7

From: False gene and chromosome losses in genome assemblies caused by GC content variation and repeats

Effect of false gene losses in the previous assemblies on annotations. a GC content peaks near TSSs and TTSs from VGP or prior annotations (blue: VGP annotation, yellow: VGP annotation projected on the prior assembly by CAT, green: prior annotation). b, c DRD1B and CADPS2 were missing 5′ UTRs, CpG islands of promoter regions, and some coding sequence in the prior assemblies, resulting in an incorrect understanding of the genes’ structures and in false annotations. In the zebra finch, the missing regions of both genes are inferred to be regulatory regions, based on open chromatin ATAC peaks unique to Area X (AX) and arcopallium (Arco) compared to striatum brain regions, respectively. d IPO4, REC8, and their immediate syntenic genes were present in the VGP zebra finch assembly but missing from the prior assembly. e KCTD15 was misassembled in the prior assembly, with the contig containing its first and second exons inverted. f ADAM7 was fragmented across two different scaffolds, and its six N-terminal exons were missing from the prior annotation. g PCDH17 contained frameshift-inducing indels in its coding region in the prior assembly, which resulted in the false prediction of 1 bp and 2 bp introns to compensate for the frameshift errors.
