Fig. 5 | Genome Biology

Fig. 5

From: Widespread false gene gains caused by duplication errors in genome assemblies

The genome landscape of false gene gains. 10X linked read pairs are shown above the PacBio CLR reads, along with the read-depth coverage of the respective read data. a The genome landscape of ZBTB11 and the false ZBTB11-like gene (LOC100218125) in the previous zebra finch assembly. Most of the ZBTB11 region was duplicated adjacent to itself in the previous assembly (highlighted in orange and blue) and showed typical characteristics of false heterotype duplications. Black and red arrows mark an assembly gap and a read-depth gap, respectively. b Corrected gene structure in the VGP assembly (gray). c The genome landscape of the ACAD10 and ALDH2 genes in the previous zebra finch assembly. Three exons of ALDH2 were inserted into ACAD10 by a false duplication (highlighted in orange and blue). Black arrows mark assembly gaps. d Corrected gene structure in the VGP assembly. The three extrinsic exons from ALDH2 (gray) were not found in ACAD10 of the VGP assembly. e The genome landscape of ADAMTS13 and the ADAMTS13-like gene (LOC105760960) in the previous zebra finch assembly. The 5′ and 3′ exons of ADAMTS13 (highlighted in blue) were falsely duplicated into two ADAMTS13-like genes: one assembled in the same scaffold (LOC105760960; highlighted in orange) and one in a different scaffold (LOC101232819; Additional file 2: Fig. S7). The region homologous to the LOC101232819 gene is marked by a red star. f Corrected gene structure in the VGP assembly (gray). The region homologous to the LOC101232819 gene in the previous assembly is marked by a red star. g The genome landscape of neurotrypsin (LOC100229828) and the neurotrypsin-like gene (LOC100217566) in the previous zebra finch assembly. The 3′ region of LOC100229828 (highlighted in blue) was falsely duplicated to a different, small scaffold (~3 kbp; highlighted in orange), NW_002201465. h Corrected region in the VGP assembly (highlighted in gray).
The different colors and their heights in the read depth rows represent the proportion of sites in reads with haplotype variants.
