
Table 1 False duplication statistics in previous and VGP assemblies

From: Widespread false gene gains caused by duplication errors in genome assemblies

 

| | Zebra finch, Pre. (Sanger) | Zebra finch, VGP (bTaeGut1) | Hummingbird, Pre. (Illumina) | Hummingbird, VGP (bCalAnn1) | Platypus, Pre. (Sanger) | Platypus, VGP (mOrnAna1) |
|---|---|---|---|---|---|---|
| Heterozygosity (%) | - | 0.95 | - | 0.34 | - | 0.22 |
| Total false duplication length (Mbp) | 195.6 (15.9 %) | 24.1 (2.3 %) | 40.9 (3.7 %) | 5.8 (0.5 %) | 126.0 (6.3 %) | 5.6 (0.3 %) |
| Type: Heterotype (Mbp) | 190.5 (15.5 %) | 23.1 (2.2 %) | 13.7 (1.2 %) | 5.4 (0.5 %) | 72.0 (3.6 %) | 5.0 (0.3 %) |
| Type: Homotype (Mbp) | 5.1 (0.4 %) | 0.9 (0.1 %) | 27.2 (2.5 %) | 0.4 (<0.1 %) | 54.0 (2.7 %) | 0.6 (<0.1 %) |
| Location: Same scaffold (Mbp) | 46.4 (3.8 %) | 22.2 (2.1 %) | 2.9 (0.3 %) | 4.8 (0.5 %) | 22.7 (1.1 %) | 3.4 (0.2 %) |
| Location: Different scaffold (Mbp) | 149.3 (12.1 %) | 1.9 (0.2 %) | 38.0 (3.4 %) | 1.0 (<0.1 %) | 103.3 (5.2 %) | 2.2 (0.1 %) |
| Total assembly length (Mbp) | 1232.1 (100 %) | 1058.0 (100 %) | 1105.7 (100 %) | 1059.7 (100 %) | 1995.6 (100 %) | 1858.5 (100 %) |

Heterozygosity (top row) was calculated as the mean number of variants in each assembly, based on the 10X linked reads produced for each VGP assembly. The second row gives the total length in Mbp that is falsely duplicated, with the percentage of total assembly length in brackets. The rows below break the false duplications down by type and by location.
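The bracketed percentages can be reproduced from the table values. The following is a minimal sketch, assuming each percentage is the length divided by the total assembly length in the last row, rounded to one decimal place. The dictionary layout and the function name `percent_of_assembly` are illustrative and not taken from the article.

```python
# Total false duplication length and total assembly length (Mbp),
# copied from Table 1. The data layout is an illustrative choice.
assemblies = {
    "Zebra finch, Pre. (Sanger)": (195.6, 1232.1),
    "Zebra finch, VGP (bTaeGut1)": (24.1, 1058.0),
    "Hummingbird, Pre. (Illumina)": (40.9, 1105.7),
    "Hummingbird, VGP (bCalAnn1)": (5.8, 1059.7),
    "Platypus, Pre. (Sanger)": (126.0, 1995.6),
    "Platypus, VGP (mOrnAna1)": (5.6, 1858.5),
}


def percent_of_assembly(length_mbp: float, total_mbp: float) -> float:
    """Return length_mbp as a percentage of total_mbp, rounded to 1 decimal.

    Assumption: the table's percentages are relative to total assembly length.
    """
    return round(100.0 * length_mbp / total_mbp, 1)


for name, (false_dup_mbp, total_mbp) in assemblies.items():
    pct = percent_of_assembly(false_dup_mbp, total_mbp)
    print(f"{name}: {false_dup_mbp} Mbp falsely duplicated = {pct} % of assembly")
```

The same function applies to the type and location rows, for example `percent_of_assembly(190.5, 1232.1)` for the zebra finch heterotype entry.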